Altinity Stable for ClickHouse 21.8.8

Details of the Altinity Stable Release 21.8.8

A few months ago we certified ClickHouse 21.3 as an Altinity Stable release. Since then, we have worked on newer releases and run them in-house. We completed several new features, and even more have been added by community contributors. We have been running ClickHouse 21.8 to power our public datasets instance at Altinity.Cloud and testing it in our environments. We have also performed a comprehensive QA of ClickHouse 21.8 to make sure upgrades go smoothly. As of version 21.8.8, we are confident in certifying 21.8 as an Altinity Stable release.

This release is a significant upgrade since the previous Altinity Stable release. It includes 1472 pull requests from 224 contributors. Please look below for the detailed release notes.

Major new features since the previous stable release 21.3

The new release introduces many changes and new functions; the full list is available in the Appendix below. The following new features are worth mentioning on the front page:

  • SQL features:
    • DISTINCT ON a subset of columns (see the sketch after this list)
    • Partial support of SQL/JSON standard
    • Arrays in dictionaries are now supported
    • Arrays and nested data types are now supported for Parquet and Arrow formats
    • DateTime64 extended range: dates from 1925 to 2283 are now supported a)
  • Security features:
    • Disk level encryption
    • Kerberos authentication for HTTP protocol a)
    • Active Directory groups mapping for LDAP user directory a)
  • Replication and Cluster improvements:
    • ClickHouse Keeper (experimental) – in-process ZooKeeper replacement
    • SYSTEM RESTORE REPLICA – a handy tool that makes life easier in the unfortunate event that ZooKeeper metadata is lost a)
    • Support for MySQL/PostgreSQL clusters when using mysql/postgresql table functions
    • Zero-copy replication for S3 tables (see Altinity blog post for details)
    • Parts movement between shards (experimental)
  • MergeTree features:
    • Projections (experimental) – see the projection-related settings and system tables in the Appendix
  • Integrations:
    • MaterializedPostgreSQL database engine for replication from PostgreSQL
    • HDFS disk support (experimental)
    • Kafka engine errors can now be captured in a separate stream (see the KB article on this)
  • Other:
    • YAML configuration format as an alternative to XML
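
For illustration, here is a minimal sketch of the new DISTINCT ON syntax; the table name is hypothetical:

-- Return one row for each distinct (site_id, user_id) pair,
-- picking the first row encountered.
SELECT DISTINCT ON (site_id, user_id) site_id, user_id, ts
FROM visits;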

As usual with ClickHouse, there are many performance and operational improvements in different server components.

a) - contributed by Altinity developers.

Backward Incompatible Changes

The following changes are backward incompatible and require user attention during an upgrade:

  • Users of LowCardinality(Nullable(...)) cannot safely downgrade to versions older than 21.4. Data in columns of those types that was inserted or merged by 21.8 will be lost after a downgrade.
  • Values of the UUID type can no longer be compared with integers. For example, instead of uuid != 0, write uuid != '00000000-0000-0000-0000-000000000000'.
  • The toStartOfInterval function now aligns hour intervals to midnight (in previous versions they were aligned to the start of the Unix epoch). For example, toStartOfInterval(x, INTERVAL 11 HOUR) will split every day into three intervals: 00:00:00..10:59:59, 11:00:00..21:59:59 and 22:00:00..23:59:59.
  • It’s not possible to roll back to an older ClickHouse version after executing an ALTER ... ATTACH query, as the old servers would fail to process the new ATTACH_PART command entry in the replicated log.
  • The behaviour of remote_url_allow_hosts has changed. In previous versions an empty section did nothing; in 21.8 it blocks access to all external hosts. Remove this section from the configuration files after an upgrade if you experience issues with the url() or s3() functions.
  • If you downgrade to a version before 21.1, ClickHouse will not be able to start automatically – you will need to remove the system.*_log tables manually first.
  • There is an issue with uniqState(UUID) in AggregatingMergeTree tables. It can be corrected by replacing uniqState(uuid) in MATERIALIZED VIEWs with uniqState(sipHash64(uuid)) and changing the data type of already saved data from AggregateFunction(uniq, UUID) to AggregateFunction(uniq, UInt64), as shown in the sketch below.
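
As an illustration, a minimal sketch of the materialized view change, with hypothetical table and column names:

-- Before: uniqState(user_uuid), producing AggregateFunction(uniq, UUID).
-- After: hash the UUID first, producing AggregateFunction(uniq, UInt64).
CREATE MATERIALIZED VIEW visits_mv TO visits_agg AS
SELECT
    site_id,
    uniqState(sipHash64(user_uuid)) AS uniq_users
FROM visits
GROUP BY site_id;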

Upgrade Notes

There were several changes between versions that may affect a rolling upgrade of big clusters. Upgrading only part of a cluster is not recommended.

  • Distributed queries with explicitly defined large sets are now executed differently. The compatibility setting legacy_column_name_of_tuple_literal may be enabled during the rolling upgrade of the cluster; otherwise, distributed queries with explicitly defined sets in the IN clause may fail during the upgrade (see the sketch after this list).
  • ATTACH PART[ITION] queries may not work during a cluster upgrade.
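
A minimal sketch of the compatibility setting; it can also go into the default user profile for the duration of the upgrade:

-- Keep distributed queries with large explicit IN sets working while the
-- cluster runs a mix of versions below and above 21.7; revert afterwards.
SET legacy_column_name_of_tuple_literal = 1;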

Other Important Changes

ClickHouse embedded monitoring has become a bit more aggressive. It now collects several system stats and stores them in the table system.asynchronous_metric_log. This can be visible as an increase in background writes, storage usage, and so on. To return to the old rate of metrics refresh / flush, adjust these settings in config.xml:

<asynchronous_metrics_update_period_s>60</asynchronous_metrics_update_period_s>
<asynchronous_metric_log>
    <flush_interval_milliseconds>60000</flush_interval_milliseconds>
</asynchronous_metric_log>

Alternatively, metric_log and asynchronous_metric_log tables can be completely disabled:

<yandex>
    <asynchronous_metric_log remove="1"/>
    <metric_log remove="1"/>
</yandex>

Some new ClickHouse features are now enabled by default. This may lead to a change in behaviour, so review those carefully and disable features that may affect your system (a sketch of how to switch them off follows the list):

  • async_socket_for_remote
  • compile_aggregate_expressions
  • compile_expressions
  • cross_to_inner_join_rewrite
  • insert_null_as_default
  • optimize_skip_unused_shards_rewrite_in
  • query_plan_enable_optimizations
  • query_plan_filter_push_down
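
As a sketch, any of these can be reverted per session or in a settings profile, for example:

-- Switch some of the newly enabled features back off.
SET compile_expressions = 0;
SET async_socket_for_remote = 0;
SET cross_to_inner_join_rewrite = 0;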

In previous releases we recommended disabling optimize_on_insert. This recommendation stands for 21.8 as well, since inserts into SummingMergeTree and AggregatingMergeTree tables can slow down.
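
A one-line sketch of that recommendation at the session level (it can equally go into the default profile):

-- Keep inserted blocks as-is instead of pre-aggregating them on INSERT.
SET optimize_on_insert = 0;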

Known issues in 21.8.8

The development team continues to improve the quality of the 21.8 release. The following issues still exist in the 21.8.8 version and may affect ClickHouse operation. Please inspect them carefully to decide if they are applicable to your applications:

  • Counts in system.events for event = 'Merge' are overstated; ClickHouse increments this counter incorrectly.
  • A Timeout exceeded: elapsed 18446744073.709553 seconds error might happen in extremely rare cases, presumably due to a bug in the kernel.

You may also look into GitHub issues using the special v21.8-affected label.

ClickHouse Altinity Stable Releases are based on the community versions. For more information on installing ClickHouse from either the Altinity Stable builds or the community builds, see the ClickHouse Altinity Stable Release Build Install Guide.

Please contact us at info@altinity.com if you experience any issues with the upgrade.

Appendix

New functions

  • DateTime functions:
    • dateName
    • timeZone, timeZoneOf, timeZoneOffset, timezoneOf, toTimezone
  • JSON processing functions:
    • JSON_EXISTS, JSON_QUERY, JSON_VALUE
    • simpleJSONExtractBool, simpleJSONExtractFloat, simpleJSONExtractInt, simpleJSONExtractRaw, simpleJSONExtractString, simpleJSONExtractUInt, simpleJSONHas
    • toJSONString
  • String functions:
    • bin/unbin
    • leftPad, lpad, leftPadUTF8
    • rightPad, rpad, rightPadUTF8
    • splitByRegexp
  • Array functions:
    • arrayProduct
    • bitPositionsToArray
    • validateNestedArraySizes
  • Dictionary:
    • dictGetChildren
    • dictGetDescendants
    • dictGetOrNull
  • Aggregate functions
    • deltaSumTimestamp
    • intervalLengthSum
    • lagInFrame/leadInFrame
    • sequenceNextNode
    • sumCount
    • uniqTheta
  • Geo:
    • polygonAreaCartesian, polygonAreaSpherical
    • polygonConvexHullCartesian
    • polygonPerimeterCartesian, polygonPerimeterSpherical
    • polygonsDistanceCartesian, polygonsDistanceSpherical
    • polygonsEqualsCartesian
    • polygonsIntersectionCartesian, polygonsIntersectionSpherical
    • polygonsSymDifferenceCartesian, polygonsSymDifferenceSpherical
    • polygonsUnionCartesian, polygonsUnionSpherical
    • polygonsWithinCartesian, polygonsWithinSpherical
    • readWktMultiPolygon, readWktPoint, readWktPolygon, readWktRing
    • wkt
  • Math/Statistics:
    • quantileBFloat16, quantilesBFloat16
  • Type related:
    • reinterpretAsUInt128
    • toUInt128, toUInt128OrNull, toUInt128OrZero
  • Other:
    • indexHint
    • isIPAddressInRange
    • partitionId – that corresponds to _partition_id virtual column added to MergeTree tables
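
A quick, hedged sample of a few of the new functions (expected results shown in comments):

SELECT leftPad('abc', 7, '*');                     -- '****abc'
SELECT arrayProduct([1, 2, 3, 4]);                 -- 24
SELECT bin(10);                                    -- '00001010'
SELECT dateName('weekday', toDate('2021-10-11'));  -- 'Monday'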

New table functions

  • dictionary
  • s3Cluster – see our blog article that highlights this function, and the sketch below
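
A hedged sketch of s3Cluster; the cluster name, bucket URL, and schema below are assumptions:

-- Fan reading of many S3 files out across the nodes of 'my_cluster',
-- so each shard processes its own subset of the files.
SELECT count()
FROM s3Cluster(
    'my_cluster',
    'https://my-bucket.s3.amazonaws.com/data/*.csv.gz',
    'CSV',
    'id UInt64, value String'
);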

New table engines

  • ExternalDistributed – allows querying MySQL or PostgreSQL clusters
  • MaterializedPostgreSQL

New metrics and events

system.asynchronous_metrics
  • AsynchronousMetricsCalculationTimeSpent

The table has also received a lot of new metrics for host monitoring rather than ClickHouse itself.

system.metrics
  • BrokenDistributedFilesToInsert
  • MMappedFileBytes
  • NetworkReceive
  • NetworkSend

system.events
  • MMappedFileCacheHits
  • MMappedFileCacheMisses
  • MergeTreeDataProjectionWriterBlocks
  • MergeTreeDataProjectionWriterBlocksAlreadySorted
  • MergeTreeDataProjectionWriterCompressedBytes
  • MergeTreeDataProjectionWriterRows
  • MergeTreeDataProjectionWriterUncompressedBytes
  • NetworkReceiveBytes
  • NetworkSendBytes
  • StorageBufferLayerLockReadersWaitMilliseconds
  • StorageBufferLayerLockWritersWaitMilliseconds
  • StorageBufferPassedBytesFlushThreshold
  • StorageBufferPassedRowsFlushThreshold
  • StorageBufferPassedTimeFlushThreshold

system.errors
  • CANNOT_CREATE_FILE
  • CANNOT_PARSE_YAML
  • CANNOT_SET_ROUNDING_MODE
  • CANNOT_SYSCONF
  • CONCURRENT_ACCESS_NOT_SUPPORTED
  • DISTRIBUTED_BROKEN_BATCH_FILES
  • DISTRIBUTED_BROKEN_BATCH_INFO
  • ILLEGAL_PROJECTION
  • INCORRECT_PART_TYPE
  • INVALID_FORMAT_INSERT_QUERY_WITH_DATA
  • KERBEROS_ERROR
  • NO_SUCH_PROJECTION_IN_TABLE
  • POSTGRESQL_CONNECTION_FAILURE
  • PROJECTION_NOT_USED
  • TOO_LARGE_DISTRIBUTED_DEPTH
  • UNKNOWN_SNAPSHOT

New system tables

  • data_skipping_indices:
    • database, table, name, type, expr, granularity
  • part_moves_between_shards:
    • database, table, task_name, task_uuid, create_time, part_name, part_uuid, to_shard, update_time, state, num_tries, last_exception
  • projection_parts – same as parts but with extra ‘parent_*’ columns
  • projection_parts_columns – same as parts_columns but with extra ‘parent_*’ columns

New columns in system tables

  • clusters:
    • shutdown_count
  • dictionaries:
    • key – removed
    • key.names, key.types, found_rate
  • distribution_queue:
    • broken_data_files, broken_data_compressed_bytes
  • errors:
    • last_error_time, last_error_message, last_error_trace, remote
  • metric_log:
    • multiple columns
  • parts_columns:
    • uuid, min_time, max_time
  • processes:
    • Settings (Map replaces Settings.Names/Values arrays), ProfileEvents (Map replaces ProfileEvents.Names/Values arrays), current_database
  • query_log:
    • projections, initial_query_start_time, initial_query_start_time_microseconds, Settings (Map replaces Settings.Names/Values arrays), ProfileEvents (Map replaces ProfileEvents.Names/Values arrays)
  • query_thread_log:
    • initial_query_start_time_microseconds, Settings (Map replaces Settings.Names/Values arrays), ProfileEvents (Map replaces ProfileEvents.Names/Values arrays)
  • stack_trace:
    • thread_name
  • table_engines:
    • supports_projections
  • tables:
    • comment
  • users:
    • grantees_any, grantees_list, grantees_except

system.merge_tree_settings added/changed

| Name | Old value | New value | Description |
|------|-----------|-----------|-------------|
| max_parts_to_merge_at_once | | 100 | Max amount of parts which can be merged at once (0 - disabled). Doesn’t affect the OPTIMIZE FINAL query. |
| max_replicated_fetches_network_bandwidth | | 0 | The maximum speed of data exchange over the network in bytes per second for replicated fetches. Zero means unlimited. |
| max_replicated_sends_network_bandwidth | | 0 | The maximum speed of data exchange over the network in bytes per second for replicated sends. Zero means unlimited. |
| min_bytes_to_rebalance_partition_over_jbod | | 0 | Minimal amount of bytes to enable part rebalancing over a JBOD array (0 - disabled). |
| non_replicated_deduplication_window | | 0 | How many last blocks of hashes should be kept on disk (0 - disabled). |
| part_moves_between_shards_delay_seconds | | 30 | Time to wait before/after moving parts between shards. |
| part_moves_between_shards_enable | | 0 | Experimental/incomplete feature to move parts between shards. Does not take sharding expressions into account. |
| replicated_fetches_http_connection_timeout | | 0 | HTTP connection timeout for part fetch requests. Inherited from the default profile http_connection_timeout if not set explicitly. |
| replicated_fetches_http_receive_timeout | | 0 | HTTP receive timeout for fetch part requests. Inherited from the default profile http_receive_timeout if not set explicitly. |
| replicated_fetches_http_send_timeout | | 0 | HTTP send timeout for part fetch requests. Inherited from the default profile http_send_timeout if not set explicitly. |
| remote_fs_execute_merges_on_single_replica_time_threshold | | 10800 | When greater than zero, only a single replica starts the merge immediately if the merged part is on shared storage and allow_remote_fs_zero_copy_replication is enabled. |
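
Most of these can be applied per table; a hedged example with a hypothetical table name:

-- Cap the number of parts merged in one step for an existing MergeTree table.
ALTER TABLE events MODIFY SETTING max_parts_to_merge_at_once = 50;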

system.settings added/changed

| Type | Name | Old value | New value | Description |
|------|------|-----------|-----------|-------------|
| settings | allow_experimental_bigint_types | 0 | 1 | Obsolete setting, does nothing. |
| settings | allow_experimental_codecs | | 0 | If it is set to true, allow specifying experimental compression codecs (but we don't have those yet and this option does nothing). |
| settings | allow_experimental_database_materialized_postgresql | | 0 | Allow to create database with Engine=MaterializedPostgreSQL(…). |
| settings | allow_experimental_funnel_functions | | 0 | Enable experimental functions for funnel analysis. |
| settings | allow_experimental_map_type | 0 | 1 | Obsolete setting, does nothing. |
| settings | allow_experimental_projection_optimization | | 0 | Enable projection optimization when processing SELECT queries. |
| settings | async_socket_for_remote | 0 | 1 | Asynchronously read from socket executing remote query. |
| settings | background_schedule_pool_size | 16 | 128 | Number of threads performing background tasks for replicated tables and DNS cache updates. Only has meaning at server startup. |
| settings | compile_aggregate_expressions | | 1 | Compile aggregate functions to native code. |
| settings | compile_expressions | 0 | 1 | Compile some scalar functions and operators to native code. |
| settings | cross_to_inner_join_rewrite | | 1 | Use inner join instead of comma/cross join if possible. |
| settings | database_replicated_always_detach_permanently | | 0 | Execute DETACH TABLE as DETACH TABLE PERMANENTLY if the database engine is Replicated. |
| settings | distributed_ddl_entry_format_version | | 1 | Version of DDL entry to write into ZooKeeper. |
| settings | distributed_ddl_output_mode | | throw | Format of distributed DDL query result. |
| settings | distributed_directory_monitor_split_batch_on_failure | | 0 | Should StorageDistributed DirectoryMonitors try to split a batch into smaller ones in case of failures. |
| settings | distributed_push_down_limit | | 0 | If 1, LIMIT will be applied on each shard separately. Usually you don't need to use it, since this will be done automatically if possible, i.e. for a simple SELECT FROM LIMIT query. |
| settings | experimental_query_deduplication_send_all_part_uuids | | 0 | If false, only part UUIDs for currently moving parts are sent. If true, all read part UUIDs are sent (useful only for testing). |
| settings | external_storage_max_read_bytes | | 0 | Limit maximum number of bytes when a table with an external engine should flush history data. Currently supported only for the MySQL table engine, database engine, dictionary and MaterializeMySQL. If equal to 0, this setting is disabled. |
| settings | external_storage_max_read_rows | | 0 | Limit maximum number of rows when a table with an external engine should flush history data. Currently supported only for the MySQL table engine, database engine, dictionary and MaterializeMySQL. If equal to 0, this setting is disabled. |
| settings | force_optimize_projection | | 0 | If projection optimization is enabled, SELECT queries need to use a projection. |
| settings | glob_expansion_max_elements | | 1000 | Maximum number of allowed addresses (for external storages, table functions, etc). |
| settings | group_by_two_level_threshold_bytes | 100000000 | 50000000 | From what size of the aggregation state in bytes a two-level aggregation begins to be used. 0 - the threshold is not set. Two-level aggregation is used when at least one of the thresholds is triggered. |
| settings | handle_kafka_error_mode | | default | Obsolete setting, does nothing. |
| settings | http_max_field_name_size | | 1048576 | Maximum length of a field name in an HTTP header. |
| settings | http_max_field_value_size | | 1048576 | Maximum length of a field value in an HTTP header. |
| settings | http_max_fields | | 1000000 | Maximum number of fields in an HTTP header. |
| settings | http_max_uri_size | | 1048576 | Maximum URI length of an HTTP request. |
| settings | insert_null_as_default | | 1 | Insert DEFAULT values instead of NULL in INSERT SELECT (UNION ALL). |
| settings | legacy_column_name_of_tuple_literal | | 0 | List all names of elements of large tuple literals in their column names instead of a hash. This setting exists only for compatibility reasons. It makes sense to set it to 'true' while doing a rolling upgrade of a cluster from a version lower than 21.7 to a higher one. |
| settings | max_distributed_depth | | 5 | Maximum distributed query depth. |
| settings | max_replicated_fetches_network_bandwidth_for_server | | 0 | The maximum speed of data exchange over the network in bytes per second for replicated fetches. Zero means unlimited. Only has meaning at server startup. |
| settings | max_replicated_sends_network_bandwidth_for_server | | 0 | The maximum speed of data exchange over the network in bytes per second for replicated sends. Zero means unlimited. Only has meaning at server startup. |
| settings | min_count_to_compile_aggregate_expression | | 3 | The number of identical aggregate expressions before they are JIT-compiled. |
| settings | normalize_function_names | 1 | 0 | Normalize function names to their canonical names. |
| settings | odbc_bridge_connection_pool_size | | 16 | Connection pool size for each connection settings string in ODBC bridge. |
| settings | optimize_functions_to_subcolumns | | 0 | Transform functions to subcolumns, if possible, to reduce the amount of read data. E.g. 'length(arr)' -> 'arr.size0', 'col IS NULL' -> 'col.null'. |
| settings | optimize_fuse_sum_count_avg | | 0 | Fuse aggregate functions sum(), avg(), count() with identical arguments into one sumCount() call if the query has at least two different functions. |
| settings | optimize_move_to_prewhere_if_final | | 0 | If the query has FINAL, the move_to_prewhere optimization is not always correct; it is enabled only if both optimize_move_to_prewhere and optimize_move_to_prewhere_if_final are turned on. |
| settings | optimize_skip_unused_shards_limit | | 1000 | Limit for the number of sharding key values; turns off optimize_skip_unused_shards if the limit is reached. |
| settings | optimize_skip_unused_shards_rewrite_in | | 1 | Rewrite IN in queries for remote shards to exclude values that do not belong to the shard (requires optimize_skip_unused_shards). |
| settings | output_format_arrow_low_cardinality_as_dictionary | | 0 | Enable output of the LowCardinality type as the Dictionary Arrow type. |
| settings | postgresql_connection_pool_size | | 16 | Connection pool size for PostgreSQL table engine and database engine. |
| settings | postgresql_connection_pool_wait_timeout | | 5000 | Connection pool push/pop timeout on an empty pool for PostgreSQL table engine and database engine. By default it will block on an empty pool. |
| settings | prefer_column_name_to_alias | | 0 | Prefer using column names instead of aliases if possible. |
| settings | prefer_global_in_and_join | | 0 | If enabled, all IN/JOIN operators will be rewritten as GLOBAL IN/JOIN. It's useful when the to-be-joined tables are only available on the initiator and we need to always scatter their data on-the-fly during distributed processing with the GLOBAL keyword. It's also useful to reduce the need to access external sources when joining external tables. |
| settings | query_plan_enable_optimizations | | 1 | Apply optimizations to the query plan. |
| settings | query_plan_filter_push_down | | 1 | Allow to push down filters by predicate in the query plan. |
| settings | s3_max_single_read_retries | | 4 | The maximum number of retries during a single S3 read. |
| settings | sleep_in_send_data | 0 | | |
| settings | sleep_in_send_data_ms | | 0 | Time to sleep when sending data in TCPHandler. |
| settings | sleep_in_send_tables_status | 0 | | |
| settings | sleep_in_send_tables_status_ms | | 0 | Time to sleep when sending a tables status response in TCPHandler. |
| settings | use_antlr_parser | 0 | | |
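
After an upgrade, a convenient way to review which settings differ from their defaults is to query system.settings directly:

-- List settings that were changed from their default values on this server.
SELECT name, value, description
FROM system.settings
WHERE changed;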

Also, please refer to the release notes from the development team, available in the official ClickHouse changelog.
