Altinity Documentation
- 1: Altinity.Cloud
- 1.1: Altinity.Cloud 101
- 1.2: Quick Start Guide
- 1.2.1: Account Creation and Login
- 1.2.2: Explore Clusters View
- 1.2.3: Create Your First Cluster
- 1.2.4: Your First Queries
- 1.2.5: Connect Remote Clients
- 1.2.6: Conclusion
- 1.2.7: FAQ
- 1.3: General User Guide
- 1.3.1: How to Create an Account
- 1.3.2: How to Login
- 1.3.3: How to Logout
- 1.3.4: Account Settings
- 1.3.5: Notifications
- 1.3.6: Billing
- 1.3.7: System Status
- 1.3.8: Clusters View
- 1.3.9: Cluster Explore Guide
- 1.3.9.1: Query Tool
- 1.3.9.2: Schema View
- 1.3.9.3: Processes
- 1.4: Administrator Guide
- 1.4.1: Clusters
- 1.4.1.1: View Cluster Details
- 1.4.1.2: Cluster Actions
- 1.4.1.2.1: Upgrade Cluster
- 1.4.1.2.2: Rescale Cluster
- 1.4.1.2.3: Stop and Start a Cluster
- 1.4.1.2.4: Export Cluster Settings
- 1.4.1.2.5: Replicate a Cluster
- 1.4.1.2.6: Destroy Cluster
- 1.4.1.3: Cluster Settings
- 1.4.1.4: Configure Cluster
- 1.4.1.4.1: How to Configure Cluster Settings
- 1.4.1.4.2: How to Configure Cluster Profiles
- 1.4.1.4.3: How to Configure Cluster Users
- 1.4.1.5: Launch New Cluster
- 1.4.1.6: Cluster Alerts
- 1.4.1.7: Cluster Health Check
- 1.4.1.8: Cluster Monitoring
- 1.4.1.9: Cluster Logs
- 1.4.2: Access Control
- 1.4.2.1: Role Based Access and Security Tiers
- 1.4.2.2: Account Management
- 1.5: Connectivity
- 1.5.1: Cluster Access Point
- 1.5.2: Configure Cluster Connections
- 1.5.3: Connecting with DBeaver
- 1.5.4: clickhouse-client
- 1.5.5: Amazon VPC Endpoint
- 1.5.6: Amazon VPC Endpoint for Amazon MSK
- 2: Altinity Stable Builds
- 2.1: Altinity Stable Builds Install Guide
- 2.1.1: Altinity Stable Builds Deb Install Guide
- 2.1.2: Altinity Stable Builds RPM Install Guide
- 2.1.3: Altinity Stable Builds Docker Install Guide
- 2.1.4: Altinity Stable Builds macOS Install Guide
- 2.1.5: Altinity Stable Build Guide for ClickHouse
- 2.1.6: Legacy ClickHouse Altinity Stable Releases Install Guide
- 2.2: Monitoring Considerations
- 3: ClickHouse on Kubernetes
- 3.1: Altinity Kubernetes Operator Quick Start Guide
- 3.1.1: Installation
- 3.1.2: First Clusters
- 3.1.3: Zookeeper and Replicas
- 3.1.4: Persistent Storage
- 3.1.5: Uninstall
- 3.2: Kubernetes Install Guide
- 3.3: Operator Guide
- 3.3.1: Installation Guide
- 3.3.1.1: Basic Installation Guide
- 3.3.1.2: Custom Installation Guide
- 3.3.1.3: Source Build Guide - 0.18 and Up
- 3.3.1.4: Specific Version Installation Guide
- 3.3.1.5: Upgrade Guide
- 3.3.2: Configuration Guide
- 3.3.2.1: ClickHouse Operator Settings
- 3.3.2.2: ClickHouse Cluster Settings
- 3.3.3: Resources
- 3.3.4: Networking Connection Guides
- 3.3.4.1: MiniKube Networking Connection Guide
- 3.3.5: Storage Guide
- 3.3.5.1: Persistent Storage Overview
- 4: Operations Guide
- 4.1: Security
- 4.1.1: Hardening Guide
- 4.1.1.1: User Hardening
- 4.1.1.2: Network Hardening
- 4.1.1.3: Storage Hardening
- 4.2: Care and Feeding of Zookeeper with ClickHouse
- 4.2.1: ZooKeeper Installation and Configuration
- 4.2.2: ZooKeeper Monitoring
- 4.2.3: ZooKeeper Recovery
- 5: Integrations
- 5.1: Integrating Superset with ClickHouse
- 5.1.1: Install Superset
- 5.1.2: Connect Superset to ClickHouse
- 5.1.3: Create Charts from ClickHouse Data
- 5.2: Integrating Tableau with ClickHouse
- 5.3: ClickHouse ODBC Driver
- 5.3.1: ClickHouse ODBC Driver Installation for Windows
- 5.3.2: ClickHouse ODBC Connection for Microsoft Excel
- 5.4: Integrating Grafana with ClickHouse
- 6: Altinity Software Releases
- 6.1: Altinity Backup for ClickHouse
- 6.1.1: Altinity Backup for ClickHouse 1.5.0
- 6.1.2: Altinity Backup for ClickHouse 1.4.8
- 6.1.3: Altinity Backup for ClickHouse 1.4.9
- 6.1.4: Altinity Backup for ClickHouse 1.4.7
- 6.1.5: Altinity Backup for ClickHouse 1.4.6
- 6.1.6: Altinity Backup for ClickHouse 1.4.5
- 6.1.7: Altinity Backup for ClickHouse 1.4.3
- 6.1.8: Altinity Backup for ClickHouse 1.4.4
- 6.1.9: Altinity Backup for ClickHouse 1.4.2
- 6.1.10: Altinity Backup for ClickHouse 1.4.1
- 6.1.11: Altinity Backup for ClickHouse 1.4.0
- 6.1.12: Altinity Backup for ClickHouse 1.3.2
- 6.1.13: Altinity Backup for ClickHouse 1.3.1
- 6.1.14: Altinity Backup for ClickHouse 1.2.3
- 6.1.15: Altinity Backup for ClickHouse 1.3.0
- 6.2: Altinity Stable for ClickHouse
- 6.2.1: Altinity Stable for ClickHouse 22.3
- 6.2.2: Altinity Stable for ClickHouse 21.8
- 6.2.2.1: Altinity Stable for ClickHouse 21.8.15
- 6.2.2.2: Altinity Stable for ClickHouse 21.8.13
- 6.2.2.3: Altinity Stable for ClickHouse 21.8.12
- 6.2.2.4: Altinity Stable for ClickHouse 21.8.11
- 6.2.2.5: Altinity Stable for ClickHouse 21.8.10
- 6.2.2.6: Altinity Stable for ClickHouse 21.8.8
- 6.2.3: Altinity Stable for ClickHouse 21.3
- 6.2.3.1: Altinity Stable for ClickHouse 21.3.20.2
- 6.2.3.2: Altinity Stable for ClickHouse 21.3.17.3
- 6.2.3.3: Altinity Stable for ClickHouse 21.3.13.9
- 6.2.4: Altinity Stable for ClickHouse 21.1
- 6.2.4.1: Altinity Stable for ClickHouse 21.1.11.3
- 6.2.4.2: Altinity Stable for ClickHouse 21.1.10.3
- 6.2.4.3: Altinity Stable for ClickHouse 21.1.9.41
- 6.2.4.4: Altinity Stable for ClickHouse 21.1.7.1
- 6.2.5: Altinity Stable for ClickHouse 20.8
- 6.2.5.1: Altinity Stable for ClickHouse 20.8.12.2
- 6.2.5.2: Altinity Stable for ClickHouse 20.8.11.17
- 6.2.5.3: Altinity Stable for ClickHouse 20.8.7.15
- 6.2.6: Altinity Stable for ClickHouse 20.3
- 6.2.6.1: Altinity Stable for ClickHouse 20.3.19.4
- 6.2.6.2: Altinity Stable for ClickHouse 20.3.12.112
- 6.2.7: Altinity Stable for ClickHouse 19.16
- 6.2.7.1: Altinity Stable for ClickHouse 19.16.19.85
- 6.2.7.2: Altinity Stable for ClickHouse 19.16.12.49
- 6.2.7.3: Altinity Stable for ClickHouse 19.16.10.44
- 6.2.8: Altinity Stable for ClickHouse 19.13
- 6.2.9: Altinity Stable for ClickHouse 19.11
- 6.2.10: Altinity Stable for ClickHouse 18.14
- 6.2.10.1: Altinity Stable for ClickHouse 18.14.19
- 6.3: Altinity Kubernetes Operator
- 6.3.1: Altinity Kubernetes Operator 0.19.0
- 6.3.2: Altinity Kubernetes Operator 0.18.5
- 6.3.3: Altinity Kubernetes Operator 0.18.4
- 6.3.4: Altinity Kubernetes Operator 0.18.3
- 6.3.5: Altinity Kubernetes Operator 0.18.2
- 6.3.6: Altinity Kubernetes Operator 0.18.1
- 6.3.7: Altinity Kubernetes Operator 0.18.0
- 6.4: Altinity.Cloud Release Notes
- 6.4.1: Altinity.Cloud 22.8.15
- 6.4.2: Altinity.Cloud 22.7.13
- 6.4.3: Altinity.Cloud 22.6.18
- 6.4.4: Altinity.Cloud 22.5.15
- 6.4.5: Altinity.Cloud 22.4.15
- 6.4.6: Altinity.Cloud 22.3.14
- 6.4.7: Altinity.Cloud 22.2.15
- 6.4.8: Altinity.Cloud 22.1.28
- 6.4.9: Altinity.Cloud 21.13.21
- 6.4.10: Altinity.Cloud 21.12.26
- 6.4.11: Altinity.Cloud 21.11.24
- 6.4.12: Altinity.Cloud 21.10.28
- 6.4.13: Altinity.Cloud 21.9.9
- 6.4.14: Altinity.Cloud 21.8.20
- 6.4.15: Altinity.Cloud 21.7.19
- 6.4.16: Altinity.Cloud 21.6.9
- 6.4.17: Altinity.Cloud 21.5.10
- 6.4.18: Altinity.Cloud 21.4.10
- 6.4.19: Altinity.Cloud 21.3.21
- 6.4.20: Altinity.Cloud 21.2.15
- 6.4.21: Altinity.Cloud 21.1.12
- 6.4.22: Altinity.Cloud 20.5.10
- 6.4.23: Altinity.Cloud 20.4.3
- 6.4.24: Altinity.Cloud 20.3.4
- 6.5: Altinity Dashboard for Kubernetes
- 6.5.1: Altinity Dashboard for Kubernetes 0.1.4 Release Notes
- 6.5.2: Altinity Dashboard for Kubernetes 0.1.3 Release Notes
- 6.5.3: Altinity Dashboard for Kubernetes 0.1.2 Release Notes
- 6.5.4: Altinity Dashboard for Kubernetes 0.1.1 Release Notes
- 6.5.5: Altinity Dashboard for Kubernetes 0.1.0 Release Notes
- 6.6: Altinity Plugin Grafana for ClickHouse
- 6.6.1: Altinity Grafana Plugin for ClickHouse 2.5.0
- 6.6.2: Altinity Grafana Plugin for ClickHouse 2.4.2
- 6.7: Altinity Tableau Connector for ClickHouse
- 6.7.1: Altinity Tableau Connector for ClickHouse 1.0.0 ODBC
- 6.7.2: Altinity Tableau Connector for ClickHouse 1.0.0 JDBC Release Notes
- 6.8: ODBC Driver for ClickHouse ReleaseNotes
1 - Altinity.Cloud
Altinity.Cloud provides the best experience in managing ClickHouse. Create new clusters with the version of ClickHouse you want, set your node configurations, and get right to work.
1.1 - Altinity.Cloud 101
Welcome to Altinity.Cloud. In this guide, we will be answering a few simple questions:
- What is Altinity.Cloud?
- Why should I use it?
- How does it work?
What is Altinity.Cloud?
Altinity.Cloud is a fully managed ClickHouse service provider. Altinity.Cloud is the easiest way to set up a ClickHouse cluster with different configurations of shards and replicas, with the version of ClickHouse or Altinity Stable for ClickHouse you want. From one spot you can monitor performance, run queries, upload data from S3 or other cloud stores, and perform other essential operations.
For more details on Altinity.Cloud capabilities, see the Administrator Guide. For a crash course on how to create your own ClickHouse clusters with Altinity.Cloud, we have the Altinity.Cloud Quick Start Guide.
What Can I Do with Altinity.Cloud?
Altinity.Cloud lets you create, manage, and monitor ClickHouse clusters with a few simple clicks. Here’s a brief look at the user interface:

- A: Cluster Creation: Clusters can be created from scratch with Launch Cluster.
- B: Clusters: Each cluster associated with your Altinity.Cloud account is listed in either tile format, or as a short list. They’ll display a short summary of their health and performance. By selecting a cluster, you can view the full details.
- C: User and Environment Management:
- Change to another environment.
- Manage environments and zookeepers.
- Update account settings.
Clusters can be spun up with the number of replicas and shards you need, the specific version of ClickHouse that you want to run on them, and the kind of virtual machines that power the nodes.
When your clusters are running you can connect to them with the ClickHouse client, or your favorite applications like Grafana, Kafka, Tableau, and more. See the Altinity.Cloud connectivity guide for more details.
Monitoring
Cluster performance can be monitored in real time through the Cluster Monitor system.

Some of the metrics displayed here include:
- DNS and Distributed Connection Errors: Displays the rate of any connection issues.
- Select Queries: The number of select queries submitted to the cluster.
- Zookeeper Transactions: The communications between the zookeeper nodes.
- ClickHouse Data Size on Disk: The total amount of data the ClickHouse database is using.
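If you want to cross-check the last metric yourself, a query like the following run from the Cluster Explore Query tool sums the size of all active data parts. This is a minimal sketch; the monitoring dashboard may aggregate the figure differently.
SELECT formatReadableSize(sum(bytes_on_disk)) AS data_size_on_disk
FROM system.parts
WHERE active;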
How is Altinity.Cloud organized?

Altinity.Cloud starts at the Organization level - that’s your company. When you and members of your team log into Altinity.Cloud, you start here. Depending on their access level, users can then access the different systems within the organization.
The next level down is the Environment. Each organization has at least one Environment, and these are used to give users access to one or more Clusters.
Clusters consist of one or more Nodes - individual containers that run the ClickHouse databases. These nodes are grouped into shards: sets of nodes that work together to improve performance and reliability. Shards can then be replicated, so that groups of nodes hold copies of the same data. If one replica goes down, the other replicas keep running and sync their data back when the failed replica is restored or when a new replica is added.
To recap in reverse order:
- Nodes are individual virtual machines or containers that run ClickHouse.
- Shards are groups of nodes that work together to improve performance and share data.
- Replicas are groups of shards that mirror data, so that when one replica goes down, the others can keep going.
- Clusters are sets of replicas that work together to replicate data and improve performance.
- Environments group clusters into sets to control access and resources.
- Organizations have one or more environments that service your company.
Altinity.Cloud Access
Altinity.Cloud keeps your users organized in the following roles:
Role | Description
---|---
orgadmin | These users can create environments and clusters, and assign users in their organization to them.
envadmin | These users have control over environments they are assigned to by the orgadmin. They can create clusters and control clusters within these environments.
envuser | These users can access the clusters they are specifically assigned to within specific environments.
More details are available in the Account Administration guide.
Where can I find out more?
Altinity provides the following resources to our customers and the Open Source community:
- Altinity Documentation Site: Official documentation on using Altinity.Cloud, Altinity Stable, and related products.
- The Altinity Knowledge Base: An Open Source and community driven place to learn about ClickHouse configurations and answers to questions.
- The Altinity Web Page where you can learn about other resources, meetups, training, conferences, and more.
- The Altinity Community Slack Channel to work with Altinity engineers and other ClickHouse users to get answers to your problems and share your own solutions.
- The ClickHouse Sub-Reddit where the community can discuss what’s going on with ClickHouse and find answers.
1.2 - Quick Start Guide
Welcome to Altinity.Cloud! Altinity.Cloud is the fastest, easiest way to set up, administer and use ClickHouse. Your ClickHouse is fully managed so you can focus on your work.
If this is your first time using Altinity.Cloud, this quick start guide will give you the minimum steps to become familiar with the system. When you’re ready to dig deeper and use the full power of ClickHouse in your Altinity.Cloud environment, check out our Administrator and Developer Guides for in depth knowledge and best practices.
A full PDF version of this document is available through this link: Quick Start Guide PDF
1.2.1 - Account Creation and Login
Create an Account
To start your Altinity.Cloud journey, the first thing you need is an account. New users can sign up for a test account on the Altinity.Cloud Test Drive page. Enter your contact information, and an Altinity.Cloud representative will get back to you.
Once finished, the Altinity.Cloud team will reach out to you with your login credentials. Altinity.Cloud uses your email address as your username, and you’ll be assigned an initial password.
Login to Altinity.Cloud
There are two methods to login:
Login with Username and Password
If you’ve used any web site, you’re likely familiar with this process.
To login to Altinity.Cloud with your username and password:
- Enter your username - in this case your email address - in the field marked Login.
- Enter your Password, then click Sign In.
Login with Auth0
Auth0 allows you to authenticate to Altinity.Cloud through a trusted authentication provider, in this case Google. Once set up, you can click Auth0 to authenticate through your Google account.
- Requirements: In order to use Auth0, you must have a Google account with the same email address that you use for Altinity.Cloud.
To setup authentication with Auth0 for the first time:
- Access the Altinity.Cloud page.
- Select Auth0.
- Select the Google account to use for authentication. IMPORTANT NOTE: The Google account must have the same email address as your Altinity.Cloud account.
- Select Continue with Google, and you’ll be in Altinity.Cloud.
After you’ve completed the Auth0 setup process, you can login to Altinity.Cloud by selecting Auth0 from the login page.
1.2.2 - Explore Clusters View
Explore Clusters View
Once you’ve logged in to Altinity.Cloud, let’s take a moment and familiarize ourselves with the environment. The default page is the Clusters View.
The Clusters View page is separated into the following sections:

- A: Cluster Creation: Clusters can be created from scratch with Launch Cluster.
- B: Clusters: Each cluster associated with your Altinity.Cloud account is listed in either tile format, or as a short list. They’ll display a short summary of their health and performance. By selecting a cluster, you can view the full details.
- C: User and Environment Management:
- Change to another environment.
- Manage environments and zookeepers.
- Update account settings.
1.2.3 - Create Your First Cluster
Time to make your first cluster! For this example, we’re creating a minimally sized cluster, but you can rescale your cluster later to make it the exact size you need for your ClickHouse needs.
As of October 21, 2021, Altinity.Cloud supports Google Cloud Platform (GCP) and Amazon Web Services (AWS). For more information, see the Altinity.Cloud Administrator Guide.
To create your first cluster:
-
From the Clusters View page, select Launch Cluster. This starts the Cluster Launch Wizard.
-
The first page is Resources Configuration, where we set the name, size and authentication for the new cluster. When finished, click Next. Use the following settings:
Setting | Value
---|---
Name | Cluster names are used to create the DNS name of the cluster, so they must follow DNS name restrictions (letters, numbers, and dashes allowed; periods and special characters are not). Cluster names must start with a letter and should be 15 characters at most.
Node Type | Select m5.large. This is the size of the node. This selection gives us a cluster with 2 CPUs and around 7 GB RAM. Recall that we can rescale this cluster later. For more information, see the Administrator Guide.
Node Storage | Set to 30 GB. The size of each cluster node in GB (gigabytes). Each node will have the same storage area.
Number of Volumes | Set to 1. Network storage can be split into separate volumes. Use more volumes to increase query performance.
Volume Type | Select gp2 (Not Encrypted). Volumes can be either encrypted or unencrypted, depending on your security requirements.
Number of Shards | Set to 1. A shard represents a set of nodes. Shards can then be replicated to provide increased availability and recovery.
ClickHouse Version | Select the most recent Altinity Stable Build. Your ClickHouse cluster can use the version that best meets your needs. Note that all nodes will run the same ClickHouse version.
ClickHouse User Name | Auto-set to admin. The default administrative user.
ClickHouse User Password and Confirm Password | Set to your security requirements. Both the ClickHouse User Password and Confirm Password must match.
-
The next page is High Availability Configuration. This is where you can set your replication, Zookeeper, and backup options. Use the following values for your first cluster, then click Next to continue:
Setting | Value
---|---
Data Replication | Set to Enabled. Data replication duplicates data across replica clusters for increased performance and availability.
Number of Replicas | Set to 2. Only required if Data Replication is Enabled. Sets the number of replicas for each cluster shard.
Zookeeper Configuration | The only option at this time is Dedicated. Apache Zookeeper manages synchronization between the clusters.
Zookeeper Node Type | Default is selected by default.
Enable Backups | Set to Enabled by default and cannot be disabled at this time.
Backup Schedule and Number of Backups to keep | Set to Daily and 5; these cannot be changed at this time.
-
The Connection Configuration page determines how to communicate with your new cluster. Set the following values, then select Next to continue:
Setting | Value
---|---
Endpoint | Automatically set based on your cluster name. This displays the final DNS name for your cluster endpoint.
Use TLS | Set to Enabled. When enabled, communications with your cluster are encrypted with TLS.
Load Balancer Type | Select Altinity Edge Ingress. IMPORTANT NOTE: This setting requires clients to support SNI (Server Name Indication), which requires the most current ClickHouse client and Python clickhouse-driver. This setting cannot be changed after the cluster is created.
Protocols | Restrict communications to the Altinity.Cloud cluster based on your organization’s needs. By default, Binary Protocol (port 9440) and HTTP Protocol (port 8443) are enabled.
Datadog integration | Not enabled at this time. Stay tuned for future developments.
IP restrictions | Restrict IP communications to the cluster to specific IP addresses. For more information, see the Administrator Guide. Leave blank for now.
-
Last page! Review & Launch lets you double check your settings and see the estimated cluster cost. When you’re ready, click Launch.
It will take a little while before your new cluster is ready, so grab your coffee or tea or other hydrating beverage. When it’s complete, you’ll see your new cluster with all nodes online and all health checks passed.
1.2.4 - Your First Queries
Once your cluster is created, it’s time to create some tables and run some queries.
For those experienced with ClickHouse, this will be very familiar. For people who are new to ClickHouse, creating tables is very similar to other SQL based databases, with some extra syntax that defines the type of table we’re making. This is part of what gives ClickHouse its speed. For complete details on ClickHouse commands, see the ClickHouse SQL Reference Guide.
Cluster Explore
The Cluster Explore page allows you to run queries, view the schema, and check on processes for your cluster. It’s a quick way of getting into your ClickHouse database to run commands and view your schema straight from your web browser. We’ll be using this to generate our first tables and input some data.
To access Cluster Explore for your cluster, just click Explore for the specific cluster to manage.
For our example, we’re going to create two tables:
- events_local: This table will use the ReplicatedMergeTree table engine. If you don’t know about table engines, don’t worry about that for now. See the ClickHouse Engines page for complete information.
- events: This table will be distributed on your cluster with the Distributed table engine.
In our examples, we’ll be using macro variables - these are placed between curly brackets and let us use the same SQL commands on different clusters and environments without having to fill in every detail. Any time you see an entry like {cluster} or {shard}, you should recognize it as a macro variable.
The commands below will create these tables into the default database on your cluster.
Create Tables
To create your first tables:
-
From the Clusters View select Explore for the cluster to manage.
-
The Query tab is selected by default. (This may change in future releases.)
-
For our first table, copy and paste the following into the Query window, then select Execute.
CREATE TABLE IF NOT EXISTS events_local ON CLUSTER '{cluster}' (
    event_date Date,
    event_type Int32,
    article_id Int32,
    title String
)
ENGINE = ReplicatedMergeTree('/clickhouse/{cluster}/tables/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, article_id);
You should see output like the following under Execute, confirming the command ran (with docdemo replaced by your cluster name):
docdemo.demo.beta.altinity.cloud:8443 (query time: 0.342s)
chi-docdemo-docdemo-0-0    9000    0    1    0
chi-docdemo-docdemo-0-1    9000    0    0    0
-
Now let’s create our second table. Back in the Query window, enter the following and select Execute:
CREATE TABLE events ON CLUSTER '{cluster}' AS events_local
ENGINE = Distributed('{cluster}', default, events_local, rand())
Once again, you should see confirmation under Execute:
docdemo.demo.beta.altinity.cloud:8443 (query time: 0.162s)
chi-docdemo-docdemo-0-0    9000    0    1    0
chi-docdemo-docdemo-0-1    9000    0    0    0
-
Now that we have some tables, let’s not leave them empty. Inserting data into a ClickHouse table is very similar to most SQL systems. Let’s Insert our data, then do a quick Select on it. Enter the following Insert command into Query, then select Execute:
INSERT INTO events VALUES(today(), 1, 13, 'Example');
You’ll see the results confirmed under Execute, just like before.
OK.
Then enter the following Select command and select Execute again:
SELECT * FROM events;
The results will look like the following:
docdemo.demo.beta.altinity.cloud:8443 (query time: 0.018s)
┌─event_date─┬─event_type─┬─article_id─┬─title───┐
│ 2020-11-19 │          1 │         13 │ Example │
└────────────┴────────────┴────────────┴─────────┘
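As a quick sanity check of the distributed setup, you can compare row counts between the two tables: events fans the query out across the cluster, while events_local only returns the rows stored on the node that answers the query. A minimal follow-up to run in the same Query window:
SELECT count() FROM events;
SELECT count() FROM events_local;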
View Schema
The Database Schema shows a graphical view of your cluster’s database, the tables in it, and their structure.
To view your Schema:
- From the Clusters View select Explore for the cluster to manage.
- Select Schema.
You can expand the databases to display the tables in each database, or select the table to view its details, schema, and some sample rows.
View Processes
To view current actions running on your cluster select Processes. This displays what processes are currently running, what user account they are running under, and allows you to view more details regarding the process.
1.2.5 - Connect Remote Clients
Now that we’ve shown how to create a cluster and run ClickHouse SQL queries against your new cluster, let’s connect to it remotely.
For the following, we’re going to be using the clickhouse-client program, but the same process will help you gain access from your favorite client.
Full instructions for installing ClickHouse can be found on the ClickHouse Installation page. We’ll keep this simple and assume you’re using a Linux environment like Ubuntu. For this example, we set up a virtual machine running Ubuntu 20.04.
First, we need to know our connection details for our Altinity.Cloud ClickHouse cluster. To view your connection details:
-
From the Clusters View, select Connection Details for the cluster to manage.
-
From here, you can copy and paste the settings for the ClickHouse client from your cluster’s Connection Details. For example:
clickhouse-client -h yourdataset.yourcluster.altinity.cloud --port 9440 -s --user=admin --password
ClickHouse Client
The ClickHouse Client is a command line based program that will be familiar to SQL based users.
Setting Up ClickHouse Client in Linux
If you’ve already set up ClickHouse client, then you can skip this step. These instructions are modified from the Altinity Stable Builds Quick Start Guides to quickly get a ClickHouse client running on your system.
Deb Linux Based Installs
For those who need quick instructions on how to install ClickHouse client in their deb based Linux environment (like Ubuntu), use the following:
-
Update the apt-get local repository:
sudo apt-get update
-
Install the Altinity package signing keys:
sudo sh -c 'mkdir -p /usr/share/keyrings && curl -s https://builds.altinity.cloud/apt-repo/pubkey.gpg | gpg --dearmor > /usr/share/keyrings/altinity-dev-archive-keyring.gpg'
-
Update the apt-get repository to include the Altinity Stable build repository with the following commands:
sudo sh -c 'echo "deb [signed-by=/usr/share/keyrings/altinity-dev-archive-keyring.gpg] https://builds.altinity.cloud/apt-repo stable main" > /etc/apt/sources.list.d/altinity-dev.list'
sudo apt-get update
-
Install the most current version of the Altinity Stable Build for ClickHouse client with the following:
sudo apt-get install clickhouse-client
macOS Based Installs
For macOS users, the Altinity Stable for ClickHouse client can be installed through Homebrew via the Altinity Homebrew Tap for ClickHouse with the following quick command:
brew install altinity/clickhouse/clickhouse
Connect With ClickHouse Client
If your ClickHouse client is ready, then you can copy and paste your connection settings into your favorite terminal program, and you’ll be connected.

ClickHouse Python Driver
Users who prefer Python can use the clickhouse-driver to connect through Python. These instructions are very minimal, and are intended just to get you working in Python with your new Altinity.Cloud cluster.
These instructions are in the bash shell and require the following be installed in your environment:
- Python 3.7 and above
- The Python module venv
- git
- IMPORTANT NOTE: Install clickhouse-driver 0.2.0 or above, which has support for Server Name Indication (SNI) when using TLS communications.
To connect with the Python clickhouse-driver:
-
(Optional) Setup your local environment. For example:
python3 -m venv test
. test/bin/activate
-
Install the driver with pip3:
pip3 install clickhouse-driver
-
Start Python.
-
Add the client and connection details. The Access Point provides the necessary information to link directly to your cluster.
Import the clickhouse_driver client and enter the connection settings:
from clickhouse_driver import Client

client = Client('<HOSTNAME>', user='admin', password='<PASSWORD>',
                port=9440, secure='y', verify=False)
-
Run client.execute and submit your query. Let’s just look at the tables from within Python:
>>> client.execute('SHOW TABLES in default')
[('events',), ('events_local',)]
-
You can perform selects and inserts as you need. For example, continuing from Your First Queries using Cluster Explore:
>>> result = client.execute('SELECT * FROM default.events')
>>> print(result)
[(datetime.date(2020, 11, 23), 1, 13, 'Example')]
For more information see the article ClickHouse And Python: Getting To Know The ClickHouse-driver Client.
1.2.6 - Conclusion
There are several ways of running ClickHouse to take advantage of the robust features and speed in your big data applications. Altinity.Cloud makes it easy to start up a cluster the way you need, manage it, and connect to it so you can go from concept to execution in the fastest way possible.
If you have any questions, please feel free to Contact Us at any time.
1.2.7 - FAQ
When using Launch Cluster, I can’t click Next or complete the process
Make sure that all of your settings are filled in. Some common gotchas are:
- Make sure that the ClickHouse User Password field has been entered and confirmed.
- Cluster names must follow DNS name restrictions (letters, numbers, and dashes allowed, periods and special characters are not).
- Cluster names must start with a letter, and should be 15 characters at most.
1.3 - General User Guide
Altinity.Cloud is made to be both convenient and powerful for ClickHouse users. Whether you’re a ClickHouse administrator or a developer, these are the concepts and procedures common to both.
1.3.1 - How to Create an Account
To create an Altinity.Cloud account, visit the Altinity.Cloud info page and select Free Trial. Fill in your contact information, and our staff will reach out to you to create a test account.
If you’re ready to upgrade to a full production account, talk to one of our consultants by filling out your contact information on our Consultation Request page.
1.3.2 - How to Login
Altinity.Cloud provides the following methods to login to your account:
- Username and Password
- Auth0
Login with Username and Password
To login to Altinity.Cloud with your Username and Password:
- Open the Altinity.Cloud website.
- Enter your Email Address registered to your Altinity.Cloud account.
- Enter your Password.
- Click Sign In.
Once authenticated, you will be logged into Altinity.Cloud.
Login with Auth0
Auth0 allows you to log into your existing Altinity.Cloud account using trusted authentication platforms such as Google to verify your identity.
- IMPORTANT NOTE: This requires that your Altinity.Cloud account matches the authentication platform you are using. For example, if your email address in Altinity.Cloud is listed as Nancy.Doe@gmail.com, your Gmail address must also be Nancy.Doe@gmail.com.
To login using Auth0:
- Open the Altinity.Cloud website.
- Select Auth0.
- Select which authentication platform to use from the list (for example: Google).
- If this is your first time using Auth0, select which account to use. You must already be logged into the authentication platform.
- You will be automatically logged into Altinity.Cloud.
1.3.3 - How to Logout
To logout:
- Select your profile icon in the upper right hand corner.
- Select Log out.
Your session will be ended, and you will have to authenticate again to log back into Altinity.Cloud.
1.3.4 - Account Settings
Access My Account
To access your account profile:
-
Select your user profile in the upper right hand corner.
-
Select My Account.
My Account Settings
From the My Account page the following settings can be viewed:
- Common Information. From here you can update or view the following:
- Email Address (View Only): Your email address or login.
- Password settings.
- Dark Mode: Set the user interface to either the usual or darker interface.
- API Access: The security access rights assigned to this account.
- Access Rights: What security related actions this account can perform.
Update Password
To update your account password:
-
Click your profile icon in the upper right hand corner.
-
Select My Account.
-
In the Common Information tab, enter the new password in the Password field.
-
Select Save.
API Access Settings
Accounts can make calls to Altinity.Cloud through the API address at https://acm.altinity.cloud/api, and the Swagger API definition file is available at https://acm.altinity.cloud/api/reference.json.
Access is controlled through API access keys and API Allow Domains.
API Access Keys
Accounts can use this page to generate one or more API keys that can be used without exposing the account’s username and password. API calls made with a key act as the same account the key was generated for.
When an Altinity.Cloud API key is generated, an expiration date is set for the key. By default, the expiration date is set 24 hours after the key is generated, with the date and time set to GMT. This date can be manually adjusted so the API key becomes invalid at the date of your choosing.
To generate a new API key:
- Click your profile icon in the upper right hand corner.
- Select My Account.
- In the API Access tab, select + Add Key. The key will be available for use with the Altinity.Cloud API.
To change the expiration date of an API key:
- Click your profile icon in the upper right hand corner.
- Select My Account.
- In the API Access tab, update the date and time for the API key being modified. Note that the date and time are in GMT (Greenwich Mean Time).
To remove an API key:
- Click your profile icon in the upper right hand corner.
- Select My Account.
- In the API Access tab, select the trashcan icon next to the API key to delete. The key will no longer be allowed to connect to the Altinity.Cloud API for this account.
API Allow Domains
API submissions can be restricted by the source domain address. This provides enhanced security by keeping API communications only between authorized sources.
To update the list of domains this account can submit API commands from:
- Click your profile icon in the upper right hand corner.
- Select My Account.
- In the API Access tab, list each URL this account can submit API commands from. Each URL goes on a separate line.
- Click Save to update the account settings.

Access Rights
The Access Rights page displays which permissions your account has. These are listed in three columns:
- Section: The area of access within Altinity.Cloud, such as Accounts, Environments, and Console.
- Action: What actions the access right rule allows within the section. Actions marked as * include all actions within the section.
- Rule: Whether the Action in the Section is Allow (marked with a check mark), or Deny (marked with an X).
1.3.5 - Notifications
Notifications allow you to see any messages related to your Altinity.Cloud account. For example: billing, service issues, etc.
To access your notifications:
-
From the upper right corner of the top navigation bar, select your user ID, then Notifications.
Notifications History
The Notifications History page shows the notifications for your account, including the following:
- Message: The notifications message.
- Level: The priority level which can be:
- Danger: Critical notifications that can affect your clusters or account.
- Warning: Notifications of possible issues that are less than critical.
- News: Notifications of general news and updates in Altinity.Cloud.
- Info: Updates for general information.
1.3.6 - Billing
Accounts with the role orgadmin are able to access the Billing page for their organizations.
To access the Billing page:
- Login to Altinity.Cloud with an account with the orgadmin role.
- From the upper right hand corner, select the Account icon, and select Billing.

From the Billing page, the Usage Summary and the Billing Summary are available for the environments connected to the account.

Usage Summary
The Usage Summary displays the following:
- Current Period: The current billing month displaying the following:
- Current Spend: The current total value of charges for Altinity.Cloud services.
- Avg. Daily Spend: The average cost of Altinity.Cloud services per day.
- Est. Monthly Bill: The total estimated value for the current period, based on the Current Spend if usage continues at the current rate.
- Usage for Period: Select the billing period to display.
- Environment: Select the environment or All environments to display billing costs for. Each environment, its usage, and cost will be displayed with the total cost.
Billing Summary
The Billing Summary section displays the payment method, service address, and email address used for billing purposes. Each of these settings can be changed as required.
1.3.7 - System Status
The System Status page provides a quick view of whether the Altinity.Cloud services are currently up or down, helping devops staff determine where any issues may be when communicating with their Altinity.Cloud clusters.
To access the System Status page:
-
Login to your Altinity.Cloud account.
-
From the upper right hand corner, select the Account icon, and select System Status.
System Status Page
The System Status page displays the status of the Altinity.Cloud services. To send a message to Altinity.Cloud support representatives, select Get in touch.
From the page the following information is displayed:

This sample is from a staging environment and cluster that was stopped and started to demonstrate how the uptime tracking system works.
- Whether all Altinity.Cloud services are online or if there are any issues.
- The status of services by product, with the uptime of the last 60 days shown as either green (the service was fully available that day), or red (the service suffered an issue). Hovering over a red bar will display how long the service was unavailable for the following services:
- ClickHouse clusters
- Ingress
- Management Console
Enter your email at the bottom of the page in the section marked Subscribe to status updates to receive notifications via email regarding any issues with Altinity.Cloud services.
1.3.8 - Clusters View
The Clusters View page allows you to view available clusters and access your profile settings.
To access the Clusters View page while logged in to Altinity.Cloud, click Altinity Cloud Manager.
The Clusters View page is separated into the following sections:
- A: Cluster Creation: For more information on how to create new clusters, see the Administrator Guide.
- B: Clusters: Each cluster associated with your Altinity.Cloud account is listed in either tile format, or as a short list.
- C: User Management:
- Change which environment’s clusters are on display.
- Access your Account Settings.

Organizational Admins have additional options in the left navigation panel that allow them to select the Accounts, Environments, and Clusters connected to the organization’s Altinity.Cloud account.
Change Environment
Accounts that are assigned to multiple Altinity.Cloud environments can select which environment’s clusters they are viewing. To change your current environment:
- Click the environment dropdown in the upper right hand corner, next to your user profile icon.
- Select the environment to use. You will automatically view that environment’s clusters.

Manage Environments
Accounts that have permission to manage environments access them through the following process:
- Select the Settings icon in the upper right hand corner.
- Select Environments.

For more information on managing environments, see the Administrator Guide.
Access Settings
For information about accessing your account profile and settings, see Account Settings.
Cluster Access
For details on how to launch and manage clusters, see the Administrator Guide for Clusters.
1.3.9 - Cluster Explore Guide
Altinity.Cloud offers users a range of actions they can take on existing clusters.
For a quick view on how to create a cluster, see the Altinity.Cloud Quick Start Guide. For more details on interacting with clusters, see the Administrative Clusters Guide.
1.3.9.1 - Query Tool
The Query Tool page allows users to submit ClickHouse SQL queries directly to the cluster or a specific cluster node.
To use the Query Tool:
-
Select Explore from either the Clusters View or the Clusters Detail Page.
-
Select Query from the top tab. This is the default view for the Explore page.
-
Select from the following:
-
Select which cluster to run a query against.
-
Select Run DDLs ON CLUSTER to run Distributed DDL Queries.
-
Select the following node options:
- Any: Any node selected from the Zookeeper parameters.
- All: Run the query against all nodes in the cluster.
- Node: Select a specific node to run the query against.
-
The Query History allows you to scroll through queries that have been executed.
-
Enter the query in the Query Textbox. For more information on ClickHouse SQL queries, see the SQL Reference page on ClickHouse.tech.
-
Select Execute to submit the query from the Query Textbox.
-
The results of the query will be displayed below the Execute button.
-
Additional tips and examples are listed on the Query page.
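A harmless first query to try here is one against ClickHouse’s system.clusters table, which lists the shards, replicas, and hosts that the Run DDLs ON CLUSTER option operates over. A minimal sketch:
SELECT cluster, shard_num, replica_num, host_name
FROM system.clusters;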
1.3.9.2 - Schema View
The Schema page allows you to view the databases, tables, and other details.
To access the Schema page:
-
Select Explore from either the Clusters View or the Clusters Detail Page.
-
Select Schema from the top tab.
-
Select the following node options:
- Any: Any node selected from the Zookeeper parameters.
- All: Run the query against all nodes in the cluster.
- Node: Select a specific node to run the query against.
To view details on a table, select the table name. The following details are displayed:
- Table Description: Details on the table’s database, engine, and other details.
- Table Schema: The CREATE TABLE command used to generate the table.
- Sample Rows: A display of 5 selected rows from the table to give an example of the data contents.
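The same details shown on the Schema page can also be pulled with a query from the Query tool, which is handy when you want the list in text form. A minimal sketch against ClickHouse’s system.tables:
SELECT database, name, engine, total_rows
FROM system.tables
WHERE database != 'system'
ORDER BY database, name;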
1.3.9.3 - Processes
The Processes page displays the currently running processes on a cluster or node.
To view the processes page:
-
Select Explore from either the Clusters View or the Clusters Detail Page.
-
Select Processes from the top tab.
-
Select the following node options:
- Any: Any node selected from the Zookeeper parameters.
- All: Run the query against all nodes in the cluster.
- Node: Select a specific node to run the query against.
The following information is displayed:
- Query ID: The ClickHouse ID of the query.
- Query: The ClickHouse query that the process is running.
- Time: The elapsed time of the process.
- User: The ClickHouse user running the process.
- Client Address: The address of the client submitting the process.
- Action: Stop or restart a process.
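The Processes page is backed by ClickHouse’s system.processes table, so the same information can be fetched with a query, and stopping a process corresponds to KILL QUERY. A minimal sketch; the query ID shown is a placeholder:
SELECT query_id, user, elapsed, query
FROM system.processes;
-- stop a long-running query by its ID (placeholder value)
KILL QUERY WHERE query_id = '<query-id>';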
1.4 - Administrator Guide
Altinity.Cloud allows administrators to manage clusters and users, and keep control of their ClickHouse environments with a few clicks. Monitoring tools are provided so you can keep track of everything in your environment to keep on top of your business.
1.4.1 - Clusters
ClickHouse databases are managed through clusters, which harness the power of distributed processing to quickly deliver results on even the most complex and data intensive queries.
Altinity.Cloud users can create their own ClickHouse Clusters tailored to their organization’s needs.
1.4.1.1 - View Cluster Details
Cluster Details Page
Once a cluster has been launched, its current operating details can be viewed by selecting the cluster from the Clusters View. This displays the Cluster Details page. From the Cluster Dashboard page, select Nodes to view the Nodes Summary page.
From the Cluster Details page, users can perform the following:

- A: Manage the cluster’s:
- Actions
- Configuration
- Tables and structure with Explore
- Alerts
- Logs
- B: Check Cluster Health.
- C: View the cluster’s Access Point.
- D: Monitor the Cluster and its Queries.
- E: View summary details for the Cluster or Node. Select Nodes to view details on the cluster’s Nodes.
Nodes Summary
The Nodes Summary Page displays all nodes that are part of the selected cluster. From this page the following options and information are available:
- The Node Summary that lists:
- Endpoint: The connection settings for this node. See Node Connection.
- Details and Node View: Links to the Node Dashboard and Node Metrics.
- Version: The ClickHouse version running on this node.
- Type: The processor setting for the node.
- Node Storage: Storage space in GB available.
- Memory: RAM memory allocated for the node.
- Availability Zone: Which AWS Availability Zone the node is hosted on.
- Select Node View or View for the specific node to access the Node Dashboard, Node Metrics, and Node Schema sections.

Node Connection
The Node Connection Details shows how to connect from various clients, including the clickhouse-client, JDBC drivers, HTTPS, and Python. Unlike the Cluster Access Point, this allows a connection directly to the specific node.

Node Dashboard
From the Node Dashboard Page users can:

- A: Manage the node’s:
- Tables and structure with Explore
- Logs
- B: Check the node’s health.
- C: View the node’s summary details, its Metrics, and its Schema.
- D: Perform Node Actions.
Node Metrics
Node Metrics provides a breakdown of the node’s performance, such as CPU data, active threads, etc.
Node Schema
The Node Schema provides a view of the databases’ schema and tables. For more information on how to interact with a Node by submitting queries, viewing the schema of its databases and tables, and viewing process, see the Cluster Explore Guide.
1.4.1.2 - Cluster Actions
Launched clusters can have different actions applied to them based on your needs.
1.4.1.2.1 - Upgrade Cluster
Clusters can be upgraded to versions of ClickHouse other than the one your cluster is running.
When upgrading to a ClickHouse Altinity Stable Build, review the release notes for the version that you are upgrading to.
How to Upgrade an Altinity Cloud Cluster
To upgrade a launched cluster:
-
Select Actions for the cluster to upgrade.
-
Select Upgrade.
-
Select the ClickHouse version to upgrade to.
-
Select Upgrade to start the process.
The upgrade process completion time varies with the size of the cluster, as each server is upgraded individually. This may cause downtime while the cluster is upgraded.
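Once the upgrade completes, one quick way to confirm that the nodes report the expected version is to run the following from the Cluster Explore Query tool with the node option set to All:
SELECT hostName() AS host, version() AS clickhouse_version;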
1.4.1.2.2 - Rescale Cluster
The size and structure of the cluster may need to be altered after launching based on your organization’s needs. The following settings can be rescaled:
- Number of Shards
- Number of Replicas
- Node Type
- Node Storage
- Number of Volumes
- Apply to new nodes only: This setting will only affect nodes created from this point forward.
See Cluster Settings for more information.
How to Rescale a Cluster
To rescale a cluster:
-
Select Actions for the cluster to rescale.
-
Select Rescale.
-
Set the new values of the cluster.
-
Click OK to begin rescaling.
Depending on the size of the cluster, this may take several minutes.
1.4.1.2.3 - Stop and Start a Cluster
To stop a launched cluster, or start a stopped cluster:
- From either the Clusters View or the Cluster Details Page, select Actions.
- If the cluster is currently running, select Stop to halt its operations.
- If the cluster has been stopped, select Start to restart it.
Depending on the size of your cluster, it may take a few minutes until it is fully stopped or restarted. To check the health and availability of the cluster, see Cluster Health or Cluster Availability.
1.4.1.2.4 - Export Cluster Settings
The structure of an Altinity Cloud cluster can be exported as JSON. For details on the cluster’s settings that are exported, see Cluster Settings.
To export a cluster’s settings to JSON:
- From either the Clusters View or the Cluster Details Page, select Actions, then select Export.
- A new browser window will open with the settings for the cluster in JSON.
1.4.1.2.5 - Replicate a Cluster
Clusters can be replicated with the same or different settings. The replica can include the same database schema as the source cluster, or be launched without the schema. This may be useful to create a test cluster, then launch the production cluster with different settings ready for production data.
For complete details on Altinity.Cloud clusters settings, see Cluster Settings.
To create a replica of an existing cluster:
- From either the Clusters View or the Cluster Details Page, select Actions, then select Launch a Replica Cluster.
- Enter the desired values for Resources Configuration.
- To replicate the schema of the source cluster, select Replicate Schema.
- Click Next to continue.
- Complete the High Availability Configuration and Connection Configuration sections. Each section must be completed in its entirety before moving on to the next one.
- In the module Review & Launch, verify the settings are correct. When finished, select Launch.
Depending on the size of the new cluster it will be available within a few minutes. To verify the health and availability of the new cluster, see Cluster Health or the Cluster Availability.
1.4.1.2.6 - Destroy Cluster
When a cluster is no longer required, the entire cluster and all of its data can be destroyed.
- IMPORTANT NOTE: Once destroyed, a cluster can not be recovered. It must be manually recreated.
To destroy a cluster:
-
From either the Clusters View or the Cluster Details Page, select Actions, then select Destroy.
-
Enter the cluster name, then select OK to confirm its deletion.
1.4.1.3 - Cluster Settings
ClickHouse Clusters hosted on Altinity.Cloud have the following structural attributes. These determine options such as the version of ClickHouse installed on them, how many replicas, and other important features.
Name | Description | Values
---|---|---
Cluster Name | The name for this cluster. It will be used for the hostname of the cluster. | Cluster names must be DNS compliant: letters, numbers, and dashes are allowed; periods and special characters are not. Names must start with a letter and should be 15 characters at most.
Node Type | Determines the number of CPUs and the amount of RAM used per node. | The Node Types offered (for example, m5.large) are sample values, and may be updated at any time.
Node Storage | The amount of storage space available to each node, in GB. |
Number of Volumes | Storage can be split across multiple volumes. The amount of data stored per node is the same as set in Node Storage, but it is split into multiple volumes. Separating storage into multiple volumes can increase query performance. |
Volume Type | Defines the Amazon Web Services volume class. Typically used to determine whether or not to encrypt the volumes. | gp2, encrypted or not encrypted.
Number of Shards | Shards represent a set of nodes. Shards can be replicated to provide increased availability and computational power. |
ClickHouse Version | The version of the ClickHouse database that will be used on each node. To run a custom ClickHouse container version, specify the Docker image to use. | Currently available options are shown in the Cluster Launch Wizard.
ClickHouse Admin Name | The name of the ClickHouse administrative user. | Set to admin by default. Can not be changed.
ClickHouse Admin Password | The password for the ClickHouse administrative user. |
Data Replication | Toggles whether shards will be replicated. When enabled, Zookeeper is required to manage the shard replication process. | Enabled or Disabled.
Number of Replicas | Sets the number of replicas per shard. Only enabled if Data Replication is enabled. |
Zookeeper Configuration | When Data Replication is set to Enabled, Zookeeper is required. This setting determines how Zookeeper will run and manage shard replication, mainly how many Zookeeper nodes are used to manage the shards. More Zookeeper nodes increase the availability of the cluster. | Dedicated (the only option at this time).
Zookeeper Node Type | Determines the type of Zookeeper node. | Defaults to default and can not be changed.
Node Placement | Sets how nodes are distributed via Kubernetes, depending on your situation and how robust you want your replicas and clusters. |
Enable Backups | Backs up the cluster. These can be restored in the event of data loss or to roll back to previous versions. | Enabled by default.
Backup Schedule | Determines how often the cluster will be backed up. | Defaults to Daily.
Number of Backups to keep | Sets how many backups will be stored before deleting the oldest one. | Defaults to 5.
Endpoint | The Access Point domain name. | This is hard set by the name of your cluster and your organization.
Use TLS | Sets whether or not to encrypt external communications with the cluster with TLS. | Defaults to Enabled and can not be changed.
Load Balancer Type | The load balancer manages communications between the various nodes to ensure that nodes are not overwhelmed. | Defaults to Altinity Edge Ingress.
Protocols | Sets the TCP ports used in external communications with the cluster. | Defaults to ClickHouse TCP port 9440 and HTTP port 8443.
1.4.1.4 - Configure Cluster
Once a cluster has been launched, its configuration can be updated to best match your needs.
1.4.1.4.1 - How to Configure Cluster Settings
Cluster settings can be updated from the Clusters View or from the Cluster Details by selecting Configure > Settings.
- IMPORTANT NOTE: Changing a cluster’s settings will require a restart of the entire cluster.
Note that some settings are locked - their values can not be changed from this screen.

How to Set Troubleshooting Mode
Troubleshooting mode prevents your cluster from auto-starting after a crash. To update this setting:
- Toggle Troubleshooting Mode either On or Off.
How to Edit an Existing Setting
To edit an existing setting:
- Select the menu on the left side of the setting to update.
- Select Edit.
- Set the following:
- Setting Type.
- Name
- Value
- Select OK to save the setting.

How to Add a New Setting
To add a new setting to your cluster:
- Select Add Setting.
- Set the following:
- Setting Type.
- Name
- Value
- Select OK to save the setting.

How to Delete an Existing Setting
To delete an existing setting:
- Select the menu on the left side of the setting to delete.
- Select Remove.
- Select OK to confirm removing the setting.

1.4.1.4.2 - How to Configure Cluster Profiles
Cluster profiles allow you to set the user permissions and settings based on their assigned profile.
The Cluster Profiles can be accessed from the Clusters View or from the Cluster Details by selecting Configure > Profiles.

Add a New Profile
To add a new cluster profile:
- From the Cluster Profile View page, select Add Profile.
- Provide the profile’s Name and Description, then click OK.
Edit an Existing Profile
To edit an existing profile:
- Select the menu to the left of the profile to update and select Edit, or select Edit Settings.
- To add a profile setting, select Add Setting and enter the Name and Value, then click OK to store your setting value.
- To edit an existing setting, select the menu to the left of the setting to update. Update the Name and Value, then click OK to store the new value.
Delete an Existing Profile
To delete an existing profile:
- Select the menu to the left of the profile to update and select Delete.
- Select OK to confirm the profile deletion.
1.4.1.4.3 - How to Configure Cluster Users
The cluster’s Users settings let you define one or more users who can access your cluster, based on their Cluster Profile.
Cluster users can be updated from the Clusters View or from the Cluster Details by selecting Configure > Users.

How to Add a New User
To add a new user to your cluster:
-
Select Add User
-
Enter the following:
- Login: the name of the new user.
- Password and Confirm Password: the authenticating credentials for the user; both fields must match.
- Networks: The networks that the user can connect from. Leave as 0.0.0.0/0 to allow access from all networks.
- Databases: Which databases the user can connect to. Leave empty to allow access to all databases.
- Profile: Which profile settings to apply to this user.
-
Select OK to save the new user.
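Behind the scenes the entry is created as a ClickHouse user, so once it is saved you can confirm it from the Query tool. This is a minimal sketch and assumes the admin account is allowed to read system.users:
SELECT name, host_ip, host_names
FROM system.users;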
How to Edit a User
To edit an existing user:
- Select the menu to the left of the user to edit, then select Edit.
- Enter the following:
- Login: the new name of the user.
- Password and Confirm Password: the authenticating credentials for the user; both fields must match.
- Networks: The networks that the user can connect from. Leave as 0.0.0.0/0 to allow access from all networks.
- Databases: Which databases the user can connect to. Leave empty to allow access to all databases.
- Profile: Which profile settings to apply to this user.
- Select OK to save the updated user.
How to Delete a User
- Select the menu to the left of the user to edit, then select Delete.
- Select OK to verify the user deletion.
1.4.1.5 - Launch New Cluster
Launching a new ClickHouse Cluster is incredibly easy, and only takes a few minutes. For those looking to create their first ClickHouse cluster with minimal steps, see the Quick Start Guide. For complete details on Altinity.Cloud cluster settings, see Cluster Settings.
To launch a new ClickHouse cluster:
- From the Clusters View page, select Launch Cluster. This starts the Cluster Launch Wizard.
- Enter the desired values for Resources Configuration, High Availability Configuration, and Connection Configuration. Each section must be completed in its entirety before moving on to the next one.
- In the Review & Launch module, verify the settings are correct. When finished, select Launch.
Within a few minutes, the new cluster will be ready for use and will show that all health checks have passed.
1.4.1.6 - Cluster Alerts
The Cluster Alerts module allows users to choose how they are notified about a set of events. Alerts can either be a popup, displayed while the user is logged into Altinity.Cloud, or an email, so they can receive an alert even when they are not logged into Altinity.Cloud.
To set which alerts you receive:
- From the Clusters view, select the cluster to configure alerts for.
- Select Alerts.
- Add the Email address to send alerts to.
- Select whether to receive a Popup or Email alert for the following events:
- ClickHouse Version Upgrade: Alert triggered when the version of ClickHouse that is installed in the cluster has a new update.
- Cluster Rescale: Alert triggered when the cluster is rescaled, such as new shards added.
- Cluster Stop: Alert triggered when some event has caused the cluster to stop running.
- Cluster Resume: Alert triggered when a cluster that was stopped has resumed operations.
1.4.1.7 - Cluster Health Check
From the Clusters View, you can see the health status of your cluster and its nodes at a glance.
How to Check Node Health
The quick health check of your cluster’s nodes is displayed from the Clusters View. Next to the cluster name is a summary of your nodes’ statuses, indicating the total number of nodes and how many nodes are available.

How to Check Cluster Health
The overall health of the cluster is shown in the Health row of the cluster summary, showing the number of health checks passed.

Click the checks passed count to view a detailed view of the cluster's health.
How to View a Cluster’s Health Checks
The cluster’s Health Check module displays the status of the following health checks:
- Access point availability check
- Distributed query check
- Zookeeper availability check
- Zookeeper contents check
- Readonly replica check
- Delayed inserts check
To view details on what queries are used to verify the health check, select the caret for each health check.
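The exact queries behind each check are shown in that detail view. As a rough illustration only (these are not necessarily the queries Altinity.Cloud runs), the readonly replica and delayed insert checks correspond to simple lookups against system.metrics that you can also run yourself:
# Illustrative sketch: the kind of queries behind the readonly replica and
# delayed inserts checks. A non-zero value usually warrants investigation.
clickhouse-client --query "SELECT value FROM system.metrics WHERE metric = 'ReadonlyReplica'"
clickhouse-client --query "SELECT value FROM system.metrics WHERE metric = 'DelayedInserts'"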

1.4.1.8 - Cluster Monitoring
Altinity.Cloud integrates Grafana into its monitoring tools. From a cluster, you can quickly access the following monitoring views:
- Cluster Metrics
- Queries
- Logs
How to Access Cluster Metrics
To access the metrics views for your cluster:
- From the Clusters view, select the cluster to monitor.
- From Monitoring, select the drop down View in Grafana and select from one of the following options:
- Cluster Metrics
- Queries
- Logs
- Each metric view opens in a separate tab.
Cluster Metrics
Cluster Metrics displays how the cluster is performing from a hardware and connection standpoint.

Some of the metrics displayed here include:
- DNS and Distributed Connection Errors: Displays the rate of any connection issues.
- Select Queries: The number of select queries submitted to the cluster.
- Zookeeper Transactions: The communications between the zookeeper nodes.
- ClickHouse Data Size on Disk: The total amount of data the ClickHouse database is using.
Queries
The Queries monitoring page displays the performance of clusters, including the top requests, queries that require the most memory, and other benchmarks. This can be useful in identifying queries that can cause performance issues and refactoring them to be more efficient.
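The same information can be pulled directly from ClickHouse if you prefer the command line. A minimal sketch, assuming clickhouse-client can reach the cluster and the query log is enabled (it is by default):
# Hedged example: list the most memory-hungry queries of the last hour,
# similar to what the Queries dashboard surfaces.
clickhouse-client --query "
  SELECT type, query_duration_ms, memory_usage, substring(query, 1, 80) AS query_text
  FROM system.query_log
  WHERE event_time > now() - INTERVAL 1 HOUR AND type = 'QueryFinish'
  ORDER BY memory_usage DESC
  LIMIT 10"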

Log Metrics
The Log monitoring page displays the logs for your clusters, and allows you to make queries directly on them. If there’s a specific detail you’re trying to iron out, the logs are the most granular way of tracking down those issues.

1.4.1.9 - Cluster Logs
Altinity.Cloud provides the cluster log details so users can track down specific issues or performance bottlenecks.
To access a cluster’s logs:
- From the Clusters view, select the cluster to view logs for.
- Select Logs.
- From the Log Page, you can display the number of rows to view, or filter logs by specific text.
- To download the logs, select the download icon in the upper right corner (A).
- To refresh the logs page, select the refresh icon (B).

The following logs are available:
- ACM Logs: These logs are specific to Altinity.Cloud issues and include the following:
- System Log: Details system actions such as starting a cluster, updating endpoints, and other details.
- API Log: Displays updates to the API and activities.
- ClickHouse Logs: Displays the Common Log that stores ClickHouse related events. From this view a specific host can be selected from the dropdown box.
- Backup Logs: Displays backup events from the clickhouse-backup service. Log details per cluster host can be selected from the dropdown box.
- Operator Logs: Displays logs from the Altinity Kubernetes Operator service, which is used to manage cluster replication and communications in the Kubernetes environment.
1.4.2 - Access Control
Altinity.Cloud provides role based access control. Depending on the role granted to an Altinity.Cloud Account, it can assign roles to other Altinity.Cloud accounts and grant permissions to access organizations, environments, or clusters.
1.4.2.1 - Role Based Access and Security Tiers
Access to ClickHouse data hosted in Altinity.Cloud is controlled through a combination of security tiers and account roles. This allows companies to tailor access to data in a way that maximizes security while still allowing ease of access.
Security Tiers
Altinity.Cloud groups sets of clusters together in ways that allow companies to give Accounts access only to the clusters or groups of clusters that they need.
Altinity.Cloud groups clusters into the following security related tiers:

- Nodes: The most basic level - an individual ClickHouse database and tables.
- Clusters: These contain one or more nodes that provide ClickHouse database access.
- Environments: Environments contain one or more clusters.
- Organizations: Organizations contain one or more environments.
Account access is controlled by assigning each account a single role and a security tier. Depending on its role, an account can be assigned to multiple organizations, environments, multiple clusters in an environment, or a single cluster.
Account Roles
The actions that can be taken by Altinity.Cloud accounts are based on the role they are assigned. The roles and their actions at each security tier are detailed in the table below:
Role | Environment | Cluster |
---|---|---|
orgadmin | Create, Edit, and Delete environments that they create, or are assigned to, within the assigned organizations. Administer Accounts associated with environments they are assigned to. | Create, Edit, and Delete clusters within environments they create or are assigned to in the organization. |
envadmin | Access assigned environments. | Create, Edit, and Delete clusters within environments they are assigned to in the organization. |
envuser | Access assigned environments. | Access one or more clusters the account is specifically assigned to. |
The account roles are tied into the security tiers, and allow an account to access multiple environments and clusters depending on the type of tier they are assigned to.
For example, we may have the following situation:
- Accounts peter, paul, mary, and jessica are all members of the organization HappyDragon.
- HappyDragon has the following environments: HappyDragon_Dev and HappyDragon_Prod, each with the clusters marketing, sales, and ops.
The accounts are assigned the following roles and security tiers:
Account | Role | Organization | Environments | Clusters |
---|---|---|---|---|
mary | orgadmin | HappyDragon | HappyDragon_Prod | * |
peter | envadmin | HappyDragon | HappyDragon_Dev | * |
jessica | envadmin | HappyDragon | HappyDragon_Prod, HappyDragon_Dev | * |
paul | envuser | HappyDragon | HappyDragon_Prod | marketing |
In this scenario, mary has the ability to access the environment HappyDragon_Prod, or can create new environments and manage them and any clusters within them. However, she is not able to edit or access HappyDragon_Dev or any of its clusters.
- Both peter and jessica have the ability to create and remove clusters within their assigned environments. peter is able to modify the clusters in the environment HappyDragon_Dev. jessica can modify clusters in both environments.
- paul can only access the cluster marketing in the environment HappyDragon_Prod.
1.4.2.2 - Account Management
Altinity.Cloud accounts with the role orgadmin are able to create new Altinity.Cloud accounts and associate them with organizations, environments, and one or more clusters depending on their role. For more information on roles, see Role Based Access and Security Tiers.
Account Page
The Account Page displays all accounts assigned to the same Organization and Environments as the logged in account.
For example: the accounts mario, luigi, peach, and todd are members of the organizations MushroomFactory and BeanFactory as follows:
Account | Role | Organization: MushroomFactory | Organization: BeanFactory |
---|---|---|---|
peach | orgadmin | * | |
mario | orgadmin | | * |
luigi | envuser | | * |
todd | envuser | * | |
peach will be able to see their account and todd in the Account Page, while accounts mario and luigi will be hidden from them. mario will be able to see their account and luigi.
Access Accounts
To access the accounts that are assigned to the same Organizations and Environments as the logged in user with the account role orgadmin:
- Login to Altinity.Cloud with an account granted the orgadmin role.
- From the left navigation panel, select Accounts.
- All accounts that are in the same Organizations and Environments as the logged in account will be displayed.
Account Details
Accounts have the following details that can be set by an account with the orgadmin role:
- Common Information:
- Name: The name of the account.
- Email (Required): The email address of the account. This is used for login, password resets, notifications, and other functions. It must be a working email address for these functions to work.
- Password: The password for the account. Once a user has authenticated to the account, they can change their password.
- Confirm Password: Confirm the password for the account.
- Role (Required): The role assigned to the account. For more information on roles, see Role Based Access and Security Tiers.
- Organization: The organization assigned to the account. Note that the orgadmin can only assign accounts to the same organizations that the orgadmin account also belongs to.
- Suspended: When enabled, this prevents the account from logging into Altinity.Cloud.
- Environment Access:
- Select the environments that the account will require access to. Note that the orgadmin can only assign accounts to the same environments that the orgadmin account also belongs to.
- Cluster Access:
- This is only visible if the Role is set to envuser. It allows the account to access one or more clusters in the environments assigned under Environment Access.
- API Access:
- Allows the new account to make API calls from the listed domain names.
Account Actions
Create Account
orgadmin accounts can create new accounts and assign them to the same organization and environments they are assigned to. For example, continuing the scenario from above, if account peach is assigned to the organization MushroomFactory and the environments MushroomProd and MushroomDev, they can assign new accounts to the organization MushroomFactory, and to the environments MushroomProd or MushroomDev or both.
To create a new account:
- Login to Altinity.Cloud with an account granted the orgadmin role.
- From the left navigation panel, select Accounts.
- Select Add Account.
- Set the fields as listed in the Account Details section.
- Once all settings are completed, select Save. The account will be able to login with the username and password, or through Auth0 if their email address is registered through Google.
Edit Account
- Login to Altinity.Cloud with an account granted the orgadmin role.
- From the left navigation panel, select Accounts.
- From the left hand side of the Accounts table, select the menu icon for the account to update and select Edit.
- Update the fields as listed in the Account Details section.
- When finished, select Save.
Suspend Account
Instead of deleting an account, setting it to Suspended may be preferable because it preserves the account's name and other settings. A suspended account is unable to login to Altinity.Cloud, whether directly through a browser or through API calls made under the account.
To suspend or activate an account:
- Login to Altinity.Cloud with an account granted the orgadmin role.
- From the left navigation panel, select Accounts.
- From the left hand side of the Accounts table, select the menu icon for the account to update and select Edit.
- To suspend an account, toggle Suspended to on.
- To activate a suspended account, toggle Suspended to off.
- When finished, select Save.
Delete Account
Accounts can be deleted, which removes all information about the account. Clusters and environments created by the account will remain.
To delete an existing account:
- Login to Altinity.Cloud with an account granted the orgadmin role.
- From the left navigation panel, select Accounts.
- From the left hand side of the Accounts table, select the menu icon for the account to update and select Delete.
- Verify the account is to be deleted by selecting OK.
1.5 - Connectivity
The following guides are designed to help organizations connect their existing services to Altinity.Cloud.
1.5.1 - Cluster Access Point
ClickHouse clusters created in Altinity.Cloud can be accessed through the Access Point. The Access Point is determined by the name of your cluster and the environment it is hosted in.
Information on the Access Point is displayed from the Clusters View. Clusters with TLS Enabled will display a green shield icon.
View Cluster Access Point
To view your cluster’s access point:
- From the Clusters View, select Access Point.
- The Access Point details will be displayed.

Access Point Details
The Access Point module displays the following details:
- Host: The DNS host name of the cluster, based on the name of the cluster and the environment the cluster is hosted in.
- TCP Port: The ClickHouse TCP port for the cluster.
- HTTP Port: The HTTP port used for the cluster.
- Client Connections: Quick connection commands that you can copy and paste into your terminal or use in your code, making it easy to connect to your cluster by copying the details right from the cluster's Access Point. Client examples (see the sample after this list) are provided for:
- clickhouse-client
- jdbc
- https
- python
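For example, the https snippet boils down to a curl call against the HTTP port, and the jdbc snippet is a connection string against the same host. A hedged sketch with placeholder values; copy the real host, user, and password from your cluster's Access Point:
# Placeholder host and credentials: substitute the values from your Access Point.
echo 'SELECT version()' | curl -sS --user admin:yourpassword \
  'https://yourcluster.yourenv.altinity.cloud:8443/' --data-binary @-

# A matching JDBC URL would look roughly like:
#   jdbc:clickhouse://yourcluster.yourenv.altinity.cloud:8443/default?ssl=true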
1.5.2 - Configure Cluster Connections
Altinity.Cloud gives accounts the ability to customize connections to their clusters. Organizations can enable or disable:
- The Binary Protocol: The native ClickHouse client secure port on port 9440.
- The HTTP Protocol: The HTTPS protocol on port 8443.
- IP Restrictions: Restricts ClickHouse client connections to the provided whitelist. The IP addresses must be listed in CIDR format, for example ip_address1,ip_address2,etc.
Notice
IP addresses required for Altinity.Cloud services to function with the cluster will be automatically included in the IP Restrictions list.
At this time, accounts can only update the IP Restrictions section. Binary Protocol and HTTP Protocol are enabled by default and cannot be disabled.
Update Connection Configuration
To update the cluster’s Connection Configuration:
- Log into Altinity.Cloud.
- Select the cluster to update.
- From the top menu, select Configure > Connections.
- To restrict IP communication only to a set of whitelisted IP addresses:
- Under IP Restrictions, select Enabled.
- Enter a list of IP addresses. These can be separated by commas, spaces, or new lines. The following examples are all equivalent:
192.168.1.1,192.168.1.2
192.168.1.1 192.168.1.2
192.168.1.1
192.168.1.2
- When finished, select Confirm to save the Connection Configuration settings.
1.5.3 - Connecting with DBeaver
Connecting to Altinity.Cloud from DBeaver is a quick, secure process thanks to the available JDBC driver plugin.
Required Settings
The following settings are required for the driver connection:
- hostname: The DNS name of the Altinity.Cloud cluster. This is typically based on the name of your cluster, environment, and organization. For example, if the organization name is CameraCo and the environment is prod with the cluster sales, then the URL may be https://sales.prod.cameraco.altinity.cloud. Check the cluster's Access Point to verify the DNS name of the cluster.
- port: The port to connect to. For Altinity.Cloud, it will be HTTPS on port 8443.
- Username: The ClickHouse user to authenticate to the ClickHouse server.
- Password: The ClickHouse user password used to authenticate to the ClickHouse server.
Example
The following example is based on connecting to the Altinity.Cloud public demo database, with the following settings:
- Server: github.demo.trial.altinity.cloud
- Port: 8443
- Database: default
- Username: demo
- Password: demo
- Secure: yes
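Before configuring DBeaver, you can sanity check these values from a terminal. A minimal sketch, assuming the public demo endpoint is reachable from your machine:
# Quick connectivity check against the demo cluster over HTTPS (port 8443).
echo 'SELECT version()' | curl -sS --user demo:demo \
  'https://github.demo.trial.altinity.cloud:8443/' --data-binary @-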
DBeaver Example
- Start DBeaver and select Database > New Database Connection.
- Select All, then in the search bar enter ClickHouse.
- Select the ClickHouse icon in the “Connect to a database” screen.
- Enter the following settings:
- Host: github.demo.trial.altinity.cloud
- Port: 8443
- Database: default
- User: demo
- Password: demo
- Select the Driver Properties tab. If prompted, download the ClickHouse JDBC driver.
- Scroll down to the ssl property. Change the value to true.
- Press the Test Connection button. You should see a successful connection message.
1.5.4 - clickhouse-client
The ClickHouse Client (clickhouse-client) is a command line based program that will be familiar to SQL based users. For more information on clickhouse-client, see the ClickHouse Documentation Command-Line Client page.
The access points for your Altinity.Cloud ClickHouse cluster can be viewed through the Cluster Access Point.
How to Setup clickhouse-client for Altinity.Cloud in Linux
As of this document’s publication, version 20.13 and above of the ClickHouse client is required to connect with the SNI enabled clusters. These instructions use the testing version of that client. An updated official stable build is expected to be released soon.
sudo apt-get install apt-transport-https ca-certificates dirmngr
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E0C56BD4
echo "deb https://repo.clickhouse.tech/deb/testing/ main/" | sudo tee \
/etc/apt/sources.list.d/clickhouse.list
sudo apt-get update
sudo apt-get install -y clickhouse-client
Connect With clickhouse-client to an Altinity.Cloud Cluster
If your ClickHouse client is ready, then you can copy and paste your connection settings into your favorite terminal program, and you’ll be connected.
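For reference, a typical connection command looks like the following. The host and user here are placeholders, so substitute the values shown in your cluster's Access Point:
# Placeholder values: use the host, user, and password from your Access Point.
clickhouse-client --host yourcluster.yourenv.altinity.cloud \
  --port 9440 --secure --user admin --ask-password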

1.5.5 - Amazon VPC Endpoint
Altinity.Cloud users can connect a VPC (Virtual Private Cloud) Endpoint from existing AWS environments to their Altinity.Cloud environment. The VPC Endpoint becomes a private connection between their existing Amazon services and Altinity.Cloud, without exposing the connection to the Internet.
The following instructions are based on using the AWS console. Examples of the Terraform equivalent settings are included.
Requirements
Altinity.Cloud requires the AWS ID for the account that will be linked to the Altinity.Cloud environment. This can be found when you login to your AWS Console, and select your username from the upper right hand corner:

Instructions
To create a VPC Endpoint, the following general steps are required:
- Retrieve Your Altinity.Cloud Environment URL.
- Request an Endpoint Service Name from Altinity.Cloud.
- Create a VPC Endpoint. This must be in the same region as the service to be connected to.
- Create a private Route 53 Hosted Zone to internal.{Altinity.Cloud environment name}.altinity.cloud.
- Create a CNAME that points to the VPC Endpoint.
Retrieve Your Altinity.Cloud Environment URL
Your AWS service will be connected to the URL for your Altinity.Cloud environment. Typically this will be internal.{Altinity.Cloud environment name}.altinity.cloud. For example, if your environment is named trafficanalysis, then your environment URL will be internal.trafficanalysis.altinity.cloud.
This may differ depending on your type of service. If you have any questions, please contact your Altinity Support representative.
Request an Endpoint Service Name
Before creating a VPC Endpoint, Altinity.Cloud will need to provide you an AWS Service Name that will be used for your Endpoint. To request the AWS Service Name used in the later steps of creating the VPC Endpoint to Altinity.Cloud:
- Login to your AWS console and retrieve your AWS ID.
- Contact your Altinity.Cloud support representative and inform them that you want to set up a VPC Endpoint to your Altinity.Cloud environment. They will require your AWS ID.
- Your Altinity.Cloud support representative will process your request and return your AWS Service Name to you. Store this in a secure location for your records.
Create a VPC Endpoint
The next step in connecting Altinity.Cloud to the existing AWS Service is to create an Endpoint.
- From the AWS Virtual Private Cloud console, select Endpoints > Create Endpoint.
- Set the following:
- Service Category: Set to Find service by name. (1)
- Service Name: Enter the Service Name (2) provided in the step Request an Endpoint Service Name, then select Verify. (3)
- Select the VPC from the dropdown.
- Select Create Endpoint.
Terraform VPC Endpoint Configuration
resource "aws_vpc_endpoint" "this" {
service_name = local.service_name,
vpc_endpoint_type = "Interface",
vpc_id = aws_vpc.this.id,
subnet_ids = [aws_subnet.this.id],
security_group_ids = [aws_vpc.this.default_security_group_id],
private_dns_enabled = false,
tags = local.tags
}
Create Route 53 Hosted Zone
To create the Route 53 Hosted Zone for the newly created endpoint:
- From the AWS Console, select Endpoints.
- Select the Endpoint to connect to Altinity.Cloud, then the Details tab. In the section marked DNS names, select the DNS entry created and copy it. Store this in a separate location until ready.
- Enter the Route 53 console, and select Hosted zones.
- Select Create hosted zone.
- On the Hosted zone configuration page, update the following:
- Domain name: Enter the URL of the Altinity.Cloud environment. Recall this will be internal.{Altinity.Cloud environment name}.altinity.cloud, where the environment name was determined in the step Retrieve Your Altinity.Cloud Environment URL.
- Description (optional): Enter a description of the hosted zone.
- Type: Set to Private hosted zone.
- In VPCs to associate with the hosted zone, set the following:
- Region: Select the region for the VPC to use.
- VPC ID: Enter the ID of the VPC that is being used.
- Verify the information is correct, then select Create hosted zone.
Terraform Route 53 Configuration
resource "aws_route53_zone" "this" {
name = "$internal.{environment_name}.altinity.cloud.",
vpc {
vpc_id = aws_vpc.this.id
}
tags = local.tags
}
Create CNAME for VPC Endpoint
Once the Hosted Zone that will be used to connect the VPC to Altinity.Cloud has been created, the CNAME for the VPC Endpoint can be configured through the following process:
- From the AWS Console, select Route 53 > Hosted Zones, then select Create record.
- Select the Hosted Zone that will be used for the VPC connection. This will be internal.{Altinity.Cloud environment name}.altinity.cloud.
- Select Create record.
- From Choose routing policy, select Simple routing, then select Next.
- From Configure records, select Define simple record.
- From Define simple record, update the following:
- Record name: set to *. (1)
- Value/Route traffic to: Select IP address or another value depending on the record type (3), then enter the DNS name for the Endpoint copied in Create Route 53 Hosted Zone.
- Record type: Select CNAME. (2)
- Verify the information is correct, then select Define simple record.
Terraform CNAME Configuration
resource "aws_route53_record" "this" {
zone_id = aws_route53_zone.this.zone_id,
name = "*.${aws_route53_zone.this.name}",
type = "CNAME",
ttl = 300,
records = [aws_vpc_endpoint.this.dns_entry[0]["dns_name"]]
}
Test
To verify the VPC Endpoint works, launch an EC2 instance and execute the following curl command, which will return OK if successful. Use your Altinity.Cloud environment's host name in place of {your environment name here}:
curl -sS https://statuscheck.{your environment name here}
OK
For example, if your environment is internal.trafficanalysis.altinity.cloud
, then use:
curl -sS https://statuscheck.internal.trafficanalysis.altinity.cloud
OK
References
1.5.6 - Amazon VPC Endpoint for Amazon MSK
Altinity.Cloud users can connect a VPC (Virtual Private Cloud) Endpoint service from their existing AWS (Amazon Web Services) MSK (Amazon Managed Streaming for Apache Kafka) environments to their Altinity.Cloud environment. The VPC Endpoint services become a private connection between their existing Amazon services and Altinity.Cloud, without exposing Amazon MSK to the Internet.
The following instructions are based on using the AWS console. Examples of the Terraform equivalent settings are included.
Requirements
- Amazon MSK
- Provision Broker mapping.
Instructions
To create a VPC Endpoint Service, the following general steps are required:
- Contact your Altinity Support representative to retrieve the Altinity.Cloud AWS Account ID.
- Create VPC Endpoint Services: For each broker in the Amazon MSK cluster, provision a VPC endpoint service in the same region as your Amazon MSK cluster. For more information, see the Amazon AWS service endpoints documentation.
- Configure each endpoint service to a Kafka broker. For example:
- Endpoint Service: com.amazonaws.vpce.us-east-1.vpce-svc-aaa
- Kafka broker: b-0.xxx.yyy.zzz.kafka.us-east-1.amazonaws.com
- Endpoint service provision settings: Set com.amazonaws.vpce.us-east-1.vpce-svc-aaa = b-0.xxx.yyy.zzz.kafka.us-east-1.amazonaws.com
- Provide Endpoint Services and MSK Broker mappings to your Altinity Support representative.
Create VPC Endpoint Services
To create the VPC Endpoint Service that connects your Altinity.Cloud environment to your Amazon MSK service:
- From the AWS Virtual Private Cloud console, select Endpoint Services > Create Endpoint Service.
- Set the following:
- Name: Enter a name of your own choice. (A)
- Load balancer type: Set to Network. (B)
- Available load balancers: Set to the load balancer you provisioned for this broker. (C)
- Additional settings: If you are required to manually accept the endpoint, set Acceptance Required to Enabled (D). Otherwise, leave Acceptance Required unchecked.
- Select Create.
Test
To verify the VPC Endpoint Service works, please contact your Altinity Support representative.
References
2 - Altinity Stable Builds
ClickHouse, as an open source project, has multiple methods of installation. Altinity recommends either using Altinity Stable builds for ClickHouse, or community builds.
The Altinity Stable builds are releases with extended service of ClickHouse that undergo rigorous testing to verify they are secure and ready for production use. Altinity Stable Builds provide a secure, pre-compiled binary release of ClickHouse server and client with the following features:
- The ClickHouse version release is ready for production use.
- 100% open source and 100% compatible with ClickHouse community builds.
- Up to 3 years of support.
- Validated against client libraries and visualization tools.
- Tested for cloud use including Kubernetes.
For more information regarding the Altinity Stable builds, see Altinity Stable Builds for ClickHouse.
Altinity Stable Builds Life-Cycle Table
The following table lists Altinity Stable builds and their current status. Community builds of ClickHouse are no longer available after Community Support EOL. Contact us for build support beyond the Altinity Extended Support EOL.
Release Notes | Build Status | Latest Version | Release Date | Latest Update | Support Duration | Community Support End-of-Life* | Altinity Extended Support End-of-Life** |
---|---|---|---|---|---|---|---|
22.3 | Available | 22.3.8.39 | 15 Jul 2022 | 15 Jul 2022 | 3 years | 15 Mar 2023 | 15 Jul 2025 |
21.8 | Available | 21.8.15.7 | 11 Oct 2021 | 15 Apr 2022 | 3 years | 31 Aug 2022 | 30 Aug 2024 |
21.3 | Available | 21.3.20.2 | 29 Jun 2021 | 10 Feb 2022 | 3 years | 30 Mar 2022 | 31 Mar 2024 |
21.1 | Available | 21.1.11.3 | 24 Mar 2021 | 01 Jun 2022 | 2 years | 30 Apr 2021 | 31 Jan 2023 |
20.8 | Available Upon Request | 20.8.12.2 | 02 Dec 2020 | 03 Feb 2021 | 2 years | 31 Aug 2021 | 02 Dec 2022 |
20.3 | Available Upon Request | 20.3.19.4 | 24 Jun 2020 | 23 Sep 2020 | 2 years | 31 Mar 2021 | 24 Jun 2022 |
- *During Community Support bug fixes are automatically backported to community builds and picked up in refreshes of Altinity Stable builds.
- **Altinity Extended Support covers P0-P1 bugs encountered by customers and critical security issues regardless of audience. Fixes are best effort and may not be possible in every circumstance. Altinity makes every effort to ensure a fix, workaround, or upgrade path for covered issues.
2.1 - Altinity Stable Builds Install Guide
Installing ClickHouse from the Altinity Stable Builds, available from https://builds.altinity.cloud, takes just a few minutes.
Notice
Organizations that have used the legacy Altinity Stable Release repository at packagecloud.io can upgrade to the Altinity Stable Build without any conflicts. For more information on using the legacy repository, see the Legacy ClickHouse Altinity Stable Releases Install Guide.
General Installation Instructions
When installing or upgrading from a previous version of ClickHouse from the Altinity Stable Builds, review the Release Notes for the ClickHouse version you are installing or upgrading to before starting. This will inform you of any additional steps or requirements for moving from one version to the next.
The installation procedures recommend that you specify the version to install. The Release Notes list the version numbers available for installation.
There are three main methods for installing Altinity Stable Builds:
- Deb Packages
- RPM Packages
- Docker images
The packages come from two sources:
- Altinity Stable Builds: These are built from a secure, internal build pipeline and available from https://builds.altinity.cloud. Altinity Stable Builds are distinguishable from community builds when displaying version information:
select version()

┌─version()────────────────┐
│ 21.8.11.1.altinitystable │
└──────────────────────────┘
- Community Builds: These are made by ClickHouse community members, and are available at repo.clickhouse.tech.
2.1.1 - Altinity Stable Builds Deb Install Guide
Installation Instructions: Deb packages
ClickHouse can be installed from the Altinity Stable builds, located at https://builds.altinity.cloud, or from the ClickHouse community repository.
IMPORTANT NOTE
We highly encourage organizations to use a specific version to maximize compatibility, rather than relying on the most recent version. Instructions for how to specify the specific version of ClickHouse are included below.
Deb Prerequisites
The following prerequisites must be installed before installing an Altinity Stable build of ClickHouse:
- curl
- gnupg2
- apt-transport-https
- ca-certificates
These can be installed prior to installing ClickHouse with the following command:
sudo apt-get update
sudo apt-get install curl gnupg2 apt-transport-https ca-certificates
Deb Packages for Altinity Stable Build
To install ClickHouse Altinity Stable build via Deb based packages from the Altinity Stable build repository:
-
Update the
apt-get
local repository:sudo apt-get update
-
Install the Altinity package signing keys:
sudo sh -c 'mkdir -p /usr/share/keyrings && curl -s https://builds.altinity.cloud/apt-repo/pubkey.gpg | gpg --dearmor > /usr/share/keyrings/altinity-dev-archive-keyring.gpg'
-
Update the
apt-get
repository to include the Altinity Stable build repository with the following commands:sudo sh -c 'echo "deb [signed-by=/usr/share/keyrings/altinity-dev-archive-keyring.gpg] https://builds.altinity.cloud/apt-repo stable main" > /etc/apt/sources.list.d/altinity-dev.list'
sudo apt-get update
-
Install either a specific version of ClickHouse, or the most current version.
- To install a specific version, include the version in the
apt-get install
command. The example below specifies the version21.8.10.1.altinitystable
:
version=21.8.10.1.altinitystable
sudo apt-get install clickhouse-common-static=$version clickhouse-client=$version clickhouse-server=$version
- To install the most current version of the ClickHouse Altinity Stable build without specifying a specific version, leave out the
version=
command.
sudo apt-get install clickhouse-client clickhouse-server
- To install a specific version, include the version in the
-
When prompted, provide the password for the default
clickhouse
user. -
Restart server.
Installed packages are not applied to an already running server. It makes it convenient to install the packages first and restart later when convenient.
sudo systemctl restart clickhouse-server
Remove Community Package Repository
For users upgrading to Altinity Stable builds from the community ClickHouse builds, we recommend removing the community builds from the local repository. See the instructions for your distribution of Linux for instructions on modifying your local package repository.
Community Builds
For instructions on how to install ClickHouse community, see the ClickHouse Documentation site.
2.1.2 - Altinity Stable Builds RPM Install Guide
Installation Instructions: RPM packages
ClickHouse can be installed from the Altinity Stable builds, located at https://builds.altinity.cloud, or from the ClickHouse community repository.
Depending on your Linux distribution, either dnf
or yum
will be used. See your particular distribution of Linux for specifics.
The instructions below uses the command $(type -p dnf || type -p yum)
to provide the correct command based on the distribution to be used.
IMPORTANT NOTE
We highly encourage organizations to use a specific version to maximize compatibility, rather than relying on the most recent version. Instructions for how to specify the specific version of ClickHouse are included below.
RPM Prerequisites
The following prerequisites must be installed before installing an Altinity Stable build:
- curl
- gnupg2
These can be installed prior to installing ClickHouse with the following:
sudo $(type -p dnf || type -p yum) install curl gnupg2
RPM Packages for Altinity Stable Build
To install ClickHouse from an Altinity Stable build via RPM based packages from the Altinity Stable build repository:
-
Update the local RPM repository to include the Altinity Stable build repository with the following command:
sudo curl https://builds.altinity.cloud/yum-repo/altinity.repo -o /etc/yum.repos.d/altinity.repo
-
Install ClickHouse server and client with either
yum
ordnf
. It is recommended to specify a version to maximize compatibly with other applications and clients.- To specify the version of ClickHouse to install, create a variable for the version and pass it to the installation instructions. The example below specifies the version
21.8.10.1.altinitystable
:
version=21.8.10.1.altinitystable sudo $(type -p dnf || type -p yum) install clickhouse-common-static-$version clickhouse-server-$version clickhouse-client-$version
- To install the most recent version of ClickHouse, leave off the
version-
command and variable:
sudo $(type -p dnf || type -p yum) install clickhouse-common-static clickhouse-server clickhouse-client
- To specify the version of ClickHouse to install, create a variable for the version and pass it to the installation instructions. The example below specifies the version
Remove Community Package Repository
For users upgrading to Altinity Stable builds from the community ClickHouse builds, we recommend removing the community builds from the local repository. See the instructions for your distribution of Linux for instructions on modifying your local package repository.
RPM Downgrading Altinity ClickHouse Stable to a Previous Release
To downgrade to a previous release, the current version must be installed, and the previous version installed with the --setup=obsoletes=0
option. Review the Release Notes before downgrading for any considerations or issues that may occur when downgrading between versions of ClickHouse.
For more information, see the Altinity Knowledge Base article Altinity packaging compatibility greater than 21.x and earlier.
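As a rough sketch only (the package version below is a placeholder, and the flag is assumed to be yum/dnf's --setopt=obsoletes=0 override; confirm both against the Release Notes and the Knowledge Base article above), a downgrade might look like this:
# Hedged sketch of a downgrade to an earlier Altinity Stable version.
version=21.8.10.1.altinitystable
sudo $(type -p dnf || type -p yum) downgrade --setopt=obsoletes=0 \
  clickhouse-common-static-$version clickhouse-server-$version clickhouse-client-$version
sudo systemctl restart clickhouse-server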
Community Builds
For instructions on how to install ClickHouse community, see the ClickHouse Documentation site.
2.1.3 - Altinity Stable Builds Docker Install Guide
Installation Instructions: Docker
These included instructions detail how to install a single Altinity Stable build of ClickHouse container through Docker. For details on setting up a cluster of Docker containers, see ClickHouse on Kubernetes.
Docker Images are available for Altinity Stable builds and Community builds. The instructions below focus on using the Altinity Stable builds for ClickHouse.
IMPORTANT NOTE
The Altinity Stable builds for ClickHouse do not use thelatest
tag. We highly encourage organizations to install a specific version of Altinity Stable builds to maximize compatibility. For information on the latest Altinity Stable Docker images, see the Altinity Stable for ClickHouse Docker page.
The Docker repositories are located at:
- Altinity Stable builds: altinity/clickhouse-server
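For example, pulling a pinned image tag rather than relying on latest (the tag below is the one used in the compose example later in this guide; substitute the version you need):
# Pull a specific Altinity Stable image tag instead of latest.
docker pull altinity/clickhouse-server:21.8.10.1.altinitystable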
To install a ClickHouse Altinity Stable build through Docker:
-
Create the directory for the docker-compose.yml file and the database storage and ClickHouse server storage.
mkdir clickhouse cd clickhouse mkdir clickhouse_database
-
Create the file
docker-compose.yml
and populate it with the following, updating theclickhouse-server
to the currentaltinity/clickhouse-server
version:version: '3' services: clickhouse_server: image: altinity/clickhouse-server:21.8.10.1.altinitystable ports: - "8123:8123" volumes: - ./clickhouse_database:/var/lib/clickhouse networks: - clickhouse_network networks: clickhouse_network: driver: bridge ipam: config: - subnet: 10.222.1.0/24
-
Launch the ClickHouse Server with
docker-compose
ordocker compose
depending on your version of Docker:docker compose up -d
-
Verify the installation by logging into the database from the Docker image directly, and make any other necessary updates with:
docker compose exec clickhouse_server clickhouse-client root@67c732d8dc6a:/# clickhouse-client ClickHouse client version 21.3.15.2.altinity+stable (altinity build). Connecting to localhost:9000 as user default. Connected to ClickHouse server version 21.1.10 revision 54443. 67c732d8dc6a :)
2.1.4 - Altinity Stable Builds macOS Install Guide
Altinity Stable for ClickHouse is available to macOS users through the Homebrew package manager. Users and developers who use macOS as their preferred environment can quickly install a production ready version of ClickHouse within minutes.
The following instructions are targeted for users of Altinity Stable for ClickHouse. For more information on running community or other versions of ClickHouse on macOS, see either the Homebrew Tap for ClickHouse project or the blog post Altinity Introduces macOS Homebrew Tap for ClickHouse.
Note
As of this time, the only versions of macOS with prepared binary bottles are the following:
- macOS Monterey (version 12) on Intel
- macOS Monterey (version 12) on Apple silicon
Other versions of macOS will compile from the source code rather than downloading pre-compiled binaries. This process can take anywhere from 30 minutes to several hours depending on your environment and internet connection.
macOS Prerequisites
- Installed the Homebrew package manager.
Brew Install for Altinity Stable Instructions
By default, installing ClickHouse through brew
will install the latest version of the community version of ClickHouse. Extra steps are required to install the Altinity Stable version of ClickHouse. Altinity Stable is installed as a keg-only version, which requires manually setting paths and other commands to run the Altinity Stable for ClickHouse through brew
.
To install Altinity Stable for ClickHouse in macOS through Brew:
-
Add the ClickHouse formula via
brew tap
:brew tap altinity/clickhouse
-
Install Altinity Stable for ClickHouse by specifying
clickhouse@altinity-stable
for the most recent Altinity Stable version, or specify the version withclickhouse@{Altinity Stable Version}
. For example, as of this writing the most current version of Altinity Stable is 21.8, therefore the command to install that version of altinity stable isclickhouse@21.8-altinity-stable
. To install the most recent version, use thebrew install
command as follows:brew install clickhouse@altinity-stable
-
Because Altinity Stable for ClickHouse is available as a keg only release, the path must be set manually. These instructions will be displayed as part of the installation procedure. Based on your version, executable directory will be different based on the pattern:
$(brew --prefix)/{clickhouse version}/bin
For our example,
clickhouse@altinity-stable
gives us the following path setting:export PATH="/opt/homebrew/opt/clickhouse@21.8-altinity-stable/bin:$PATH"
Using the
which
command after updating the path reveals the location of theclickhouse-server
executable:which clickhouse-server /opt/homebrew/opt/clickhouse@21.8-altinity-stable/bin/clickhouse-server
-
To start the Altinity Stable for ClickHouse server use the
brew services start
command. For example:brew services start clickhouse@altinity-stable
-
Connect to the new server with
clickhouse-client
:> clickhouse-client ClickHouse client version 21.8.13.1. Connecting to localhost:9000 as user default. Connected to ClickHouse server version 21.11.6 revision 54450. ClickHouse client version is older than ClickHouse server. It may lack support for new features. penny.home :) select version() SELECT version() Query id: 128a2cae-d0e2-4170-a771-83fb79429260 ┌─version()─┐ │ 21.11.6.1 │ └───────────┘ 1 rows in set. Elapsed: 0.004 sec. penny.home :) exit Bye.
-
To end the ClickHouse server, use
brew services stop
command:brew services stop clickhouse@altinity-stable
2.1.5 - Altinity Stable Build Guide for ClickHouse
Organizations that prefer to build ClickHouse manually can use the Altinity Stable versions of ClickHouse directly from the source code.
Clone the Repo
Before using either the Docker or Direct build process, the Altinity Stable for ClickHouse must be downloaded from the Altinity Stable of ClickHouse repository, located at https://github.com/Altinity/clickhouse. The following procedure is used to update the source code to the most current version. For more information on downloading a specific version of the source code, see the GitHub documentation.
Hardware Recommendations
ClickHouse can run on anything from minimal hardware to full clusters. The following hardware is recommended for building and running ClickHouse:
- 16GB of RAM (32 GB recommended)
- Multiple cores (4+)
- 20-50 GB disk storage
Downloading Altinity Stable for ClickHouse
Before building ClickHouse, specify the verified version to download and build by specifying the Altinity Stable for ClickHouse tags. The --recursive option will download all submodules that are part of the Altinity Stable project.
As of this writing, the most recent verified version is v21.8.10.19-altinitystable
, so the download command to download that version of Altinity Stable into the folder AltinityStableClickHouse
is:
git clone --recursive -b v21.8.10.19-altinitystable --single-branch https://github.com/Altinity/clickhouse.git AltinityStableClickHouse
.
Direct Build Instructions for Deb Based Linux
To build Altinity Stable for ClickHouse from the source code for Deb based Linux platforms:
-
Install the prerequisites:
sudo apt-get install git cmake python ninja-build
-
Install
clang-12
.sudo apt install clang-12
-
Create and enter the
build
directory within your AltinityStable directory.mkdir build && cd build
-
Set the compile variables to
clang-12
and initiate theninja
build.CC=clang-12 CXX=clang++-12 cmake .. -GNinja
-
Provide the
ninja
command to build your own Altinity Stable for ClickHouse:ninja clickhouse
-
Once complete, Altinity Stable for ClickHouse will be in the project’s
programs
folder, and can be run with the following commands:- ClickHouse Server:
clickhouse server
- ClickHouse Client:
clickhouse client
- ClickHouse Server:
2.1.6 - Legacy ClickHouse Altinity Stable Releases Install Guide
ClickHouse Altinity Stable Releases are specially vetted community builds of ClickHouse that Altinity certifies for production use. We track critical changes and verify against a series of tests to make sure they’re ready for your production environment. We take the steps to verify how to upgrade from previous versions, and what issues you might run into when transitioning your ClickHouse clusters to the next Stable Altinity ClickHouse release.
As of October 12, 2021, Altinity replaced the ClickHouse Altinity Stable Releases with the Altinity Stable Builds, providing longer support and validation. For more information, see Altinity Stable Builds.
Legacy versions of the ClickHouse Altinity Stable Releases are available from the Altinity ClickHouse Stable Release packagecloud.io repository, located at https://packagecloud.io/Altinity/altinity-stable.
The available Altinity ClickHouse Stable Releases from packagecloud.io for ClickHouse server, ClickHouse client and ClickHouse common versions are:
- Altinity ClickHouse Stable Release 21.1.10.3
- Altinity ClickHouse Stable Release 21.3.13.9
- Altinity ClickHouse Stable Release 21.3.15.2
- Altinity ClickHouse Stable Release 21.3.15.4
General Installation Instructions
When installing or upgrading from a previous version of legacy ClickHouse Altinity Stable Release, review the Release Notes for the version to install and upgrade to before starting. This will inform you of additional steps or requirements of moving from one version to the next.
Part of the installation procedures recommends you specify the version to install. The Release Notes lists the version numbers available for installation.
There are three main methods for installing the legacy ClickHouse Altinity Stable Releases:
Altinity ClickHouse Stable Releases are distinguishable from community builds when displaying version information. The suffix altinitystable
will be displayed after the version number:
select version()
┌─version()─────────────────┐
│ 21.3.15.2.altinitystable │
└───────────────────────────┘
Prerequisites
This guide assumes that the reader is familiar with Linux commands, permissions, and how to install software for their particular Linux distribution. The reader will have to verify they have the correct permissions to install the software in their target systems.
Installation Instructions
Legacy Altinity ClickHouse Stable Release DEB Builds
To install legacy ClickHouse Altinity Stable Release version DEB packages from packagecloud.io:
-
Update the
apt-get
repository with the following command:curl -s https://packagecloud.io/install/repositories/Altinity/altinity-stable/script.deb.sh | sudo bash
-
ClickHouse can be installed either by specifying a specific version, or automatically going to the most current version. It is recommended to specify a version for maximum compatibility with existing clients.
- To install a specific version, create a variable specifying the version to install and including it with the install command:
version=21.8.8.1.altinitystable sudo apt-get install clickhouse-client=$version clickhouse-server=$version clickhouse-common-static=$version
- To install the most current version of the legacy ClickHouse Altinity Stable release without specifying a specific version, leave out the
version=
command.
sudo apt-get install clickhouse-client clickhouse-server clickhouse-server-common
-
Restart server.
Installed packages are not applied to the already running server. It makes it convenient to install packages first and restart later when convenient.
sudo systemctl restart clickhouse-server
Legacy Altinity ClickHouse Stable Release RPM Builds
To install legacy ClickHouse Altinity Stable Release version RPM packages from packagecloud.io:
-
Update the
yum
package repository configuration with the following command:curl -s https://packagecloud.io/install/repositories/Altinity/altinity-stable/script.rpm.sh | sudo bash
-
ClickHouse can be installed either by specifying a specific version, or automatically going to the most current version. It is recommended to specify a version for maximum compatibility with existing clients.
- To install a specific version, create a variable specifying the version to install and including it with the install command:
version=21.8.8.1.altinitystable
sudo yum install clickhouse-client-${version} clickhouse-server-${version} clickhouse-server-common-${version}
- To install the most current version of the legacy ClickHouse Altinity Stable release without specifying a specific version, leave out the
version=
command.
sudo yum install clickhouse-client clickhouse-server clickhouse-server-common
-
Restart the ClickHouse server.
sudo systemctl restart clickhouse-server
2.2 - Monitoring Considerations
Monitoring helps to track potential issues in your cluster before they cause a critical error.
External Monitoring
External monitoring collects data from the ClickHouse cluster and uses it for analysis and review. Recommended external monitoring systems include:
- Prometheus: Use embedded exporter or clickhouse-exporter
- Graphite: Use the embedded exporter. See config.xml.
- InfluxDB: Use the embedded exporter, plus Telegraf. For more information, see Graphite protocol support in InfluxDB.
ClickHouse can collect the recording of metrics internally by enabling system.metric_log
in config.xml
.
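A minimal sketch of what that configuration can look like, assuming a standard config.d layout and a 21.x era server (newer releases also accept a clickhouse root element in place of yandex):
# Hedged sketch: enable system.metric_log via a drop-in configuration file.
sudo tee /etc/clickhouse-server/config.d/metric_log.xml <<'EOF'
<yandex>
    <metric_log>
        <database>system</database>
        <table>metric_log</table>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
        <collect_interval_milliseconds>1000</collect_interval_milliseconds>
    </metric_log>
</yandex>
EOF
sudo systemctl restart clickhouse-server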
For dashboard system:
- Grafana is recommended for graphs, reports, alerts, dashboard, etc.
- Other options are Nagios or Zabbix.
The following metrics should be collected:
- For Host Machine:
- CPU
- Memory
- Network (bytes/packets)
- Storage (iops)
- Disk Space (free / used)
- For ClickHouse:
- Connections (count)
- RWLocks
- Read / Write / Return (bytes)
- Read / Write / Return (rows)
- Zookeeper operations (count)
- Absolute delay
- Query duration (optional)
- Replication parts and queue (count)
- For Zookeeper:
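For ZooKeeper itself, basic liveness and metrics can be collected with the standard four letter word commands. A minimal sketch, assuming ZooKeeper listens on localhost:2181 and the commands are whitelisted (4lw.commands.whitelist):
# Hedged sketch: ZooKeeper liveness and monitoring metrics.
echo ruok | nc localhost 2181   # expect "imok"
echo mntr | nc localhost 2181   # dumps ZooKeeper state, latency, and connection counts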
The following queries are recommended to be included in monitoring:
- SELECT * FROM system.replicas
- For more information, see the ClickHouse guide on System Tables
- SELECT * FROM system.merges
- Checks on the speed and progress of currently executed merges.
- SELECT * FROM system.mutations
- This is the source of information on the speed and progress of currently executed mutations.
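These queries can be wired into any scheduled job or monitoring agent. A minimal sketch using the HTTP interface, assuming it is reachable on localhost:8123 with default credentials; adjust host, port, and columns as needed:
# Hedged examples of the recommended monitoring queries over the HTTP interface.
curl -sS 'http://localhost:8123/' --data-binary \
  "SELECT database, table, is_readonly, absolute_delay, queue_size FROM system.replicas FORMAT TabSeparatedWithNames"
curl -sS 'http://localhost:8123/' --data-binary \
  "SELECT database, table, elapsed, progress FROM system.merges FORMAT TabSeparatedWithNames"
curl -sS 'http://localhost:8123/' --data-binary \
  "SELECT database, table, mutation_id, is_done FROM system.mutations FORMAT TabSeparatedWithNames"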
Monitor and Alerts
Configure the notifications for events and thresholds based on the following table:
Health Checks
The following health checks should be monitored:
Check Name | Shell or SQL command | Severity |
---|---|---|
ClickHouse status | $ curl 'http://localhost:8123/' returns Ok. | Critical |
Too many simultaneous queries. Maximum: 100 | select value from system.metrics where metric='Query' | Critical |
Replication status | $ curl 'http://localhost:8123/replicas_status' returns Ok. | High |
Read only replicas (reflected by replicas_status as well) | select value from system.metrics where metric='ReadonlyReplica' | High |
ReplicaPartialShutdown (not reflected by replicas_status, but seems to correlate with ZooKeeperHardwareExceptions) | select value from system.events where event='ReplicaPartialShutdown' | High. This check almost always correlates with ZooKeeperHardwareExceptions; when it does not, there is usually nothing wrong, so it may be disabled if too noisy. |
Some replication tasks are stuck | select count() from system.replication_queue where num_tries > 100 | High |
ZooKeeper is available | select count() from system.zookeeper where path='/' | Critical for writes |
ZooKeeper exceptions | select value from system.events where event='ZooKeeperHardwareExceptions' | Medium |
Other CH nodes are available | $ for node in `echo "select distinct host_address from system.clusters where host_name !='localhost'" | curl 'http://localhost:8123/' --silent --data-binary @-`; do curl "http://$node:8123/" --silent ; done | |
All CH clusters are available (i.e. every configured cluster has enough replicas to serve queries) | for cluster in `echo "select distinct cluster from system.clusters where host_name !='localhost'" | curl 'http://localhost:8123/' --silent --data-binary @-` ; do clickhouse-client --query="select '$cluster', 'OK' from cluster('$cluster', system, one)" ; done | |
There are files in 'detached' folders | $ find /var/lib/clickhouse/data/*/*/detached/* -type d | wc -l; for 19.8+: select count() from system.detached_parts | |
Too many parts: Number of parts is growing; Inserts are being delayed; Inserts are being rejected | select value from system.asynchronous_metrics where metric='MaxPartCountForPartition'; select value from system.events/system.metrics where event/metric='DelayedInserts'; select value from system.events where event='RejectedInserts' | Critical |
Dictionaries: exception | select concat(name,': ',last_exception) from system.dictionaries where last_exception != '' | Medium |
ClickHouse has been restarted | select uptime(); select value from system.asynchronous_metrics where metric='Uptime' | |
DistributedFilesToInsert should not be always increasing | select value from system.metrics where metric='DistributedFilesToInsert' | Medium |
A data part was lost | select value from system.events where event='ReplicatedDataLoss' | High |
Data parts are not the same on different replicas | select value from system.events where event='DataAfterMergeDiffersFromReplica'; select value from system.events where event='DataAfterMutationDiffersFromReplica' | Medium |
Monitoring References
3 - ClickHouse on Kubernetes
Setting up a cluster of Altinity Stable for ClickHouse is made easy with Kubernetes, even if saying that takes some effort from the tongue. Organizations that want to setup their own distributed ClickHouse environments can do so with the Altinity Kubernetes Operator.
As of this time, the current version of the Altinity Kubernetes Operator is 0.18.5.
3.1 - Altinity Kubernetes Operator Quick Start Guide
If you’re running the Altinity Kubernetes Operator for the first time, or just want to get it up and running as quickly as possible, the Quick Start Guide is for you.
Requirements:
- An operating system running Kubernetes and Docker, or a service providing support for them such as AWS.
- A ClickHouse remote client such as
clickhouse-client
. Full instructions for installing ClickHouse can be found on the ClickHouse Installation page.
3.1.1 - Installation
The Altinity Kubernetes Operator can be installed in just a few minutes with a single command into your existing Kubernetes environment.
Those who need a more customized installation or want to build the Altinity Kubernetes Operator themselves can do so through the Operator Installation Guide.
Requirements
Before starting, make sure you have the following installed:
- Kubernetes 1.15.11+.
For instructions on how to install Kubernetes for your particular environment,
see the Kubernetes Install Tools
page. - Access to the clickhouse-operator-install-bundle.yaml file.
Quick Install
To install the Altinity Kubernetes Operator into your existing Kubernetes environment, run the following command, or download the Altinity Kubernetes Operator install file and modify it to best fit your needs. For more information on custom Altinity Kubernetes Operator settings that can be applied, see the Operator Guide.
We recommend that when installing the Altinity Kubernetes Operator, specify the version to be installed. This insures maximum compatibility with applications and established Kubernetes environments running your ClickHouse clusters. For more information on installing other versions of the Altinity Kubernetes Operator, see the specific Version Installation Guide.
The most current version is 0.18.3
:
kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
IMPORTANT NOTICE
Never delete the operator or run the following command while there are live ClickHouse clusters managed by the operator:
kubectl delete -f https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator/clickhouse-operator-install-bundle.yaml
The command will hang due to the live clusters. If you then re-install the operator, those clusters will be deleted and the operator will not work correctly.
See Altinity/clickhouse-operator#830 for more details.
Output similar to the following will be displayed on a successful installation. For more information on the resources created in the installation, see Altinity Kubernetes Operator Resources
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created
Quick Verify
To verify that the installation was successful, run the following. On a successful installation you’ll be able to see the clickhouse-operator
pod under the NAME column.
kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
clickhouse-operator-857c69ffc6-dq2sz 2/2 Running 0 5s
coredns-78fcd69978-nthp2 1/1 Running 4 (110s ago) 50d
etcd-minikube 1/1 Running 4 (115s ago) 50d
kube-apiserver-minikube 1/1 Running 4 (105s ago) 50d
kube-controller-manager-minikube 1/1 Running 4 (115s ago) 50d
kube-proxy-lsggn 1/1 Running 4 (115s ago) 50d
kube-scheduler-minikube 1/1 Running 4 (105s ago) 50d
storage-provisioner 1/1 Running 8 (115s ago) 50d
3.1.2 - First Clusters
If you followed the Quick Installation guide, then you have the
Altinity Kubernetes Operator for Kubernetes installed.
Let’s give it something to work with.
Create Your Namespace
For our examples, we’ll be setting up our own Kubernetes namespace test
.
All of the examples will be installed into that namespace so we can track
how the cluster is modified with new updates.
Create the namespace with the following kubectl
command:
kubectl create namespace test
namespace/test created
Just to make sure we’re in a clean environment,
let’s check for any resources in our namespace:
kubectl get all -n test
No resources found in test namespace.
The First Cluster
We’ll start with the simplest cluster: one shard, one replica.
This template and others are available on the
Altinity Kubernetes Operator Github example site,
and contains the following:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "demo-01"
spec:
configuration:
clusters:
- name: "demo-01"
layout:
shardsCount: 1
replicasCount: 1
Save this as sample01.yaml and launch it with the following:
kubectl apply -n test -f sample01.yaml
clickhouseinstallation.clickhouse.altinity.com/demo-01 created
Verify that the new cluster is running. When the status is
Running
then it’s complete.
kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT
demo-01 0.18.1 1 1 1 6d1d2c3d-90e5-4110-81ab-8863b0d1ac47 Completed 1 clickhouse-demo-01.test.svc.cluster.local
To retrieve the IP information use the get service
option:
kubectl get service -n test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chi-demo-01-demo-01-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 2s
clickhouse-demo-01 LoadBalancer 10.111.27.86 <pending> 8123:31126/TCP,9000:32460/TCP 19s
So we can see our pod is running, and that we have the load balancer for the cluster.
Connect To Your Cluster With Exec
Let’s talk to our cluster and run some simple ClickHouse queries.
We can hop in directly through Kubernetes and run the clickhouse-client
that’s part of the image with the following command:
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- clickhouse-client
ClickHouse client version 20.12.4.5 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 20.12.4 revision 54442.
chi-demo-01-demo-01-0-0-0.chi-demo-01-demo-01-0-0.test.svc.cluster.local :)
From within ClickHouse, we can check out the current clusters:
SELECT * FROM system.clusters
┌─cluster─────────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ all-replicated │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_one_shard_three_replicas_localhost │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_one_shard_three_replicas_localhost │ 1 │ 1 │ 2 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_one_shard_three_replicas_localhost │ 1 │ 1 │ 3 │ 127.0.0.3 │ 127.0.0.3 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost_secure │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9440 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 1 │ 0 │ default │ │ 0 │ 0 │ 0 │
└─────────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
Exit out of your cluster:
chi-demo-01-demo-01-0-0-0.chi-demo-01-demo-01-0-0.test.svc.cluster.local :) exit
Bye.
Connect to Your Cluster with Remote Client
You can also use a remote client such as clickhouse-client
to
connect to your cluster through the LoadBalancer.
-
The default username and password is set by the clickhouse-operator-install.yaml file. These values can be altered by changing the chUsername and chPassword values in the ClickHouse Credentials section:
- Default Username: clickhouse_operator
- Default Password: clickhouse_operator_password
# ClickHouse credentials (username, password and port) to be used
# by operator to connect to ClickHouse instances for:
# 1. Metrics requests
# 2. Schema maintenance
# 3. DROP DNS CACHE
# User with such credentials can be specified in additional ClickHouse
# .xml config files,
# located in `chUsersConfigsPath` folder
chUsername: clickhouse_operator
chPassword: clickhouse_operator_password
chPort: 8123
In either case, the command to connect to your new cluster will resemble the following, replacing {LoadBalancer hostname} with the name or IP address of your LoadBalancer, then providing the proper password. In our examples so far, that's been localhost.
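As a minimal sketch, using the default operator credentials listed above:
clickhouse-client --host {LoadBalancer hostname} --user=clickhouse_operator --password=clickhouse_operator_password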
From there, just make your ClickHouse SQL queries as you please - but
remember that this particular cluster has no persistent storage.
If the cluster is modified in any way, any databases or tables
created will be wiped clean.
Update Your First Cluster To 2 Shards
Well that’s great - we have a cluster running. Granted, it’s really small
and doesn’t do much, but what if we want to upgrade it?
Sure - we can do that any time we want.
Take your sample01.yaml
and save it as sample02.yaml
.
Let’s update it to give us two shards running with one replica:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "demo-01"
spec:
configuration:
clusters:
- name: "demo-01"
layout:
shardsCount: 2
replicasCount: 1
Save your YAML file and apply it. We've defined the name in the metadata, so the operator knows exactly which cluster to update.
kubectl apply -n test -f sample02.yaml
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured
Verify that the cluster is running - this may take a few
minutes depending on your system:
kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT
demo-01 0.18.1 1 2 2 80102179-4aa5-4e8f-826c-1ca7a1e0f7b9 Completed 1 clickhouse-demo-01.test.svc.cluster.local
kubectl get service -n test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chi-demo-01-demo-01-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 26s
chi-demo-01-demo-01-1-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 3s
clickhouse-demo-01 LoadBalancer 10.111.27.86 <pending> 8123:31126/TCP,9000:32460/TCP 43s
Once again, we can reach right into our cluster with
clickhouse-client
and look at the clusters.
clickhouse-client --host localhost --user=clickhouse_operator --password=clickhouse_operator_password
ClickHouse client version 20.12.4.5 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 20.12.4 revision 54442.
chi-demo-01-demo-01-1-0-0.chi-demo-01-demo-01-1-0.test.svc.cluster.local :)
SELECT * FROM system.clusters
┌─cluster─────────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ all-replicated │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_one_shard_three_replicas_localhost │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_one_shard_three_replicas_localhost │ 1 │ 1 │ 2 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_one_shard_three_replicas_localhost │ 1 │ 1 │ 3 │ 127.0.0.3 │ 127.0.0.3 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost_secure │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9440 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 1 │ 0 │ default │ │ 0 │ 0 │ 0 │
└─────────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
So far, so good. We can create some basic clusters.
If we want to do more, we’ll have to move ahead with replication
and zookeeper in the next section.
3.1.3 - Zookeeper and Replicas
kubectl create namespace test
namespace/test created
Now we’ve seen how to setup a basic cluster and upgrade it. Time to step
up our game and setup our cluster with Zookeeper, and then add
persistent storage to it.
The Altinity Kubernetes Operator does not install or manage Zookeeper. Zookeeper must be provided and managed externally. The samples below are examples of establishing Zookeeper to provide replication support. For more information on running and configuring Zookeeper, see the Apache Zookeeper site.
This step cannot be skipped - your Zookeeper instance must have been set up externally from your ClickHouse clusters. Whether your Zookeeper installation is hosted by other Docker images or separate servers is up to you.
Install Zookeeper
Kubernetes Zookeeper Deployment
A simple method of installing a single Zookeeper node is provided by the Altinity Kubernetes Operator deployment samples. These provide sample deployments of Grafana, Prometheus, Zookeeper and other applications.
See the Altinity Kubernetes Operator deployment directory
for a full list of sample scripts and Kubernetes deployment files.
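For example, once the repository has been cloned in the steps below, you can list the available Zookeeper sample manifests (a sketch; the directory contents vary by operator version):
ls clickhouse-operator/deploy/zookeeper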
The instructions below will create a new Kubernetes namespace zoo1ns
,
and create a Zookeeper node in that namespace.
Kubernetes nodes will refer to that Zookeeper node by the hostname
zookeeper.zoo1ns
within the created Kubernetes networks.
To deploy a single Zookeeper node in Kubernetes from the
Altinity Kubernetes Operator Github repository:
-
Download the Altinity Kubernetes Operator Github repository, either with
git clone https://github.com/Altinity/clickhouse-operator.git
or by selecting Code->Download Zip from the
Altinity Kubernetes Operator GitHub repository
. -
From a terminal, navigate to the
deploy/zookeeper
directory
and run the following:
cd clickhouse-operator/deploy/zookeeper
./quick-start-volume-emptyDir/zookeeper-1-node-create.sh
namespace/zoo1ns created
service/zookeeper created
service/zookeepers created
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
poddisruptionbudget.policy/zookeeper-pod-disruption-budget created
statefulset.apps/zookeeper created
- Verify the Zookeeper node is running in Kubernetes:
kubectl get all --namespace zoo1ns
NAME READY STATUS RESTARTS AGE
pod/zookeeper-0 0/1 Running 0 2s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/zookeeper ClusterIP 10.100.31.86 <none> 2181/TCP,7000/TCP 2s
service/zookeepers ClusterIP None <none> 2888/TCP,3888/TCP 2s
NAME READY AGE
statefulset.apps/zookeeper 0/1 2s
- Kubernetes nodes will be able to refer to the Zookeeper
node by the hostnamezookeeper.zoo1ns
.
Configure Kubernetes with Zookeeper
Once we start replicating clusters, we need Zookeeper to manage them.
Create a new file sample03.yaml
and populate it with the following:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "demo-01"
spec:
configuration:
zookeeper:
nodes:
- host: zookeeper.zoo1ns
port: 2181
clusters:
- name: "demo-01"
layout:
shardsCount: 2
replicasCount: 2
templates:
podTemplate: clickhouse-stable
templates:
podTemplates:
- name: clickhouse-stable
spec:
containers:
- name: clickhouse
image: altinity/clickhouse-server:21.8.10.1.altinitystable
Notice that we’re increasing the number of replicas from the
sample02.yaml
file in the
[First Clusters - No Storage]({<ref “quickcluster”>}) tutorial.
We’ll set up a minimal Zookeeper connecting cluster by applying
our new configuration file:
kubectl apply -f sample03.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created
Verify it with the following:
kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT AGE
demo-01 0.18.3 1 2 4 5ec69e86-7e4d-4b8b-877f-f298f26161b2 Completed 4 clickhouse-demo-01.test.svc.cluster.local 102s
kubectl get service -n test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chi-demo-01-demo-01-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 85s
chi-demo-01-demo-01-0-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 68s
chi-demo-01-demo-01-1-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 47s
chi-demo-01-demo-01-1-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 16s
clickhouse-demo-01 LoadBalancer 10.104.157.249 <pending> 8123:32543/TCP,9000:30797/TCP 101s
If we log into our cluster and show the clusters, we can see the updated results: demo-01 now has a total of 4 hosts - two shards, each with two replicas.
SELECT * FROM system.clusters
┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ all-replicated │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ all-replicated │ 1 │ 1 │ 2 │ chi-demo-01-demo-01-0-1 │ 172.17.0.6 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-replicated │ 1 │ 1 │ 3 │ chi-demo-01-demo-01-1-0 │ 172.17.0.7 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-replicated │ 1 │ 1 │ 4 │ chi-demo-01-demo-01-1-1 │ 172.17.0.8 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 2 │ 1 │ 1 │ chi-demo-01-demo-01-0-1 │ 172.17.0.6 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 3 │ 1 │ 1 │ chi-demo-01-demo-01-1-0 │ 172.17.0.7 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 4 │ 1 │ 1 │ chi-demo-01-demo-01-1-1 │ 172.17.0.8 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 1 │ 1 │ 2 │ chi-demo-01-demo-01-0-1 │ 172.17.0.6 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 2 │ 1 │ 1 │ chi-demo-01-demo-01-1-0 │ 172.17.0.7 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 2 │ 1 │ 2 │ chi-demo-01-demo-01-1-1 │ 172.17.0.8 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost_secure │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9440 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 1 │ 0 │ default │ │ 0 │ 0 │ 0 │
└──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
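As an optional sanity check (a sketch, assuming the Zookeeper node from the earlier step is reachable), you can confirm that ClickHouse sees Zookeeper by querying the system.zookeeper table from one of the pods:
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- clickhouse-client -q "SELECT name FROM system.zookeeper WHERE path = '/'"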
Distributed Tables
We have our clusters going - let’s test it out with some distributed
tables so we can see the replication in action.
Login to your ClickHouse cluster and enter the following SQL statement:
CREATE TABLE test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')
Once our table is created, run a SELECT * FROM test command. We haven't inserted any data of our own; the Distributed table simply reads from system.one on each shard, so each shard returns a single row containing 0.
SELECT * FROM test
┌─dummy─┐
│ 0 │
└───────┘
┌─dummy─┐
│ 0 │
└───────┘
Now let’s test out our results coming in.
Run the following command - this tells us just what shard is
returning the results. It may take a few times, but you’ll
start to notice the host name changes each time you run the
command SELECT hostName() FROM test
:
SELECT hostName() FROM test
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-0-0-0 │
└───────────────────────────┘
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-1-1-0 │
└───────────────────────────┘
SELECT hostName() FROM test
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-0-0-0 │
└───────────────────────────┘
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-1-0-0 │
└───────────────────────────┘
This shows us that the query is being distributed across different shards. The good news is you can change your configuration files to adjust the shards and replication to suit your needs.
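As another quick check (a sketch run from one of the pods; with two shards the Distributed table should count one row from system.one per shard):
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- clickhouse-client -q "SELECT count() FROM test"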
One issue though: there’s no persistent storage.
If these clusters stop running, your data vanishes.
Next instruction will be on how to add persistent storage
to your ClickHouse clusters running on Kubernetes.
In fact, we can test by creating a new configuration
file called sample04.yaml
:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "demo-01"
spec:
configuration:
zookeeper:
nodes:
- host: zookeeper.zoo1ns
port: 2181
clusters:
- name: "demo-01"
layout:
shardsCount: 1
replicasCount: 1
templates:
podTemplate: clickhouse-stable
templates:
podTemplates:
- name: clickhouse-stable
spec:
containers:
- name: clickhouse
image: altinity/clickhouse-server:21.8.10.1.altinitystable
Make sure you’re exited out of your ClickHouse cluster,
then install our configuration file:
kubectl apply -f sample04.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured
Notice that during the update four pods were deleted, and then two new ones were added.
When your cluster has settled back down to just 1 shard with 1 replica, log back into your ClickHouse database and select from the table test:
SELECT * FROM test
Received exception from server (version 21.8.10):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table default.test doesn't exist.
command terminated with exit code 60
No persistent storage means that any time your clusters are changed over, everything you've done is gone. The next section covers how to correct that by adding storage volumes to your cluster.
3.1.4 - Persistent Storage
kubectl create namespace test
namespace/test created
We’ve shown how to create ClickHouse clusters in Kubernetes, how to add zookeeper so we can create replicas of clusters. Now we’re going to show how to set persistent storage so you can change your cluster configurations without losing your hard work.
The examples here are built from the Altinity Kubernetes Operator examples, simplified down for our demonstrations.
IMPORTANT NOTE
The Altinity Stable builds for ClickHouse do not use the latest tag. We highly encourage organizations to install a specific version of Altinity Stable builds to maximize compatibility. For information on the latest Altinity Stable Docker images, see the Altinity Stable for ClickHouse Docker page.
Create a new file called sample05.yaml
with the following:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "demo-01"
spec:
configuration:
zookeeper:
nodes:
- host: zookeeper.zoo1ns
port: 2181
clusters:
- name: "demo-01"
layout:
shardsCount: 2
replicasCount: 2
templates:
podTemplate: clickhouse-stable
volumeClaimTemplate: storage-vc-template
templates:
podTemplates:
- name: clickhouse-stable
spec:
containers:
- name: clickhouse
image: altinity/clickhouse-server:21.8.10.1.altinitystable
volumeClaimTemplates:
- name: storage-vc-template
spec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
Those who have followed the previous examples will recognize the clusters being created, but there are some new additions:
- volumeClaimTemplate: This is setting up storage, and we're specifying the storage class as standard. For full details on the different storage classes, see the kubectl Storage Class documentation.
- storage: We're going to give our cluster 1 Gigabyte of storage, enough for our sample systems. If you need more space, that can be upgraded by changing these settings.
- podTemplate: Here we'll specify what our pod types are going to be. We'll use a pinned Altinity Stable version of the ClickHouse container image, but other versions can be specified to best fit your needs. For more information, see the ClickHouse on Kubernetes Operator Guide.
Save your new configuration file and install it.
If you’ve been following this guide and already have the
namespace test
operating, this will update it:
kubectl apply -f sample05.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created
Verify it completes with get all
for this namespace,
and you should have similar results:
kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT AGE
demo-01 0.18.3 1 2 4 57ec3f87-9950-4e5e-9b26-13680f66331d Completed 4 clickhouse-demo-01.test.svc.cluster.local 108s
kubectl get service -n test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chi-demo-01-demo-01-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 81s
chi-demo-01-demo-01-0-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 63s
chi-demo-01-demo-01-1-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 45s
chi-demo-01-demo-01-1-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 8s
clickhouse-demo-01 LoadBalancer 10.104.236.138 <pending> 8123:31281/TCP,9000:30052/TCP 98s
Testing Persistent Storage
Everything is running, let’s verify that our storage is working.
We’re going to exec into our cluster with a bash prompt on
one of the pods created:
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 32G 26G 4.0G 87% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/sda2 32G 26G 4.0G 87% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 7.7G 12K 7.7G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
And we can see we have about 1 Gigabyte of storage allocated to our cluster.
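You can also list the PersistentVolumeClaims the operator created from the volumeClaimTemplate (a sketch; the exact claim names depend on your installation):
kubectl get pvc -n test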
Let’s add some data to it. Nothing major, just to show that we can
store information, then change the configuration and the data stays.
Exit out of your cluster and launch clickhouse-client
on your LoadBalancer.
We’re going to create a database, then create a table in the database,
then show both.
SHOW DATABASES
┌─name────┐
│ default │
│ system │
└─────────┘
CREATE DATABASE teststorage
CREATE TABLE teststorage.test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')
SHOW DATABASES
┌─name────────┐
│ default │
│ system │
│ teststorage │
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
│ 0 │
└───────┘
┌─dummy─┐
│ 0 │
└───────┘
If you followed the instructions from Zookeeper and Replicas, recall that at the end, when we updated the configuration of our sample cluster, all of the tables and data we created were deleted. Let's recreate that experiment now with a new configuration.
Create a new file called sample06.yaml
. We’re going to reduce
the shards and replicas to 1:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "demo-01"
spec:
configuration:
zookeeper:
nodes:
- host: zookeeper.zoo1ns
port: 2181
clusters:
- name: "demo-01"
layout:
shardsCount: 1
replicasCount: 1
templates:
podTemplate: clickhouse-stable
volumeClaimTemplate: storage-vc-template
templates:
podTemplates:
- name: clickhouse-stable
spec:
containers:
- name: clickhouse
image: altinity/clickhouse-server:21.8.10.1.altinitystable
volumeClaimTemplates:
- name: storage-vc-template
spec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
Update the cluster with the following:
kubectl apply -f sample06.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured
Wait until the configuration is done and all of the pods are spun down,
then launch a bash prompt on one of the pods and check
the storage available:
kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT AGE
demo-01 0.18.3 1 1 1 776c1a82-44e1-4c2e-97a7-34cef629e698 Completed 4 clickhouse-demo-01.test.svc.cluster.local 2m56s
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 32G 26G 4.0G 87% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/sda2 32G 26G 4.0G 87% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 7.7G 12K 7.7G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
Storage is still there. We can test whether our databases are still available by logging into ClickHouse:
SHOW DATABASES
┌─name────────┐
│ default │
│ system │
│ teststorage │
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
│ 0 │
└───────┘
All of our databases and tables are there.
There are different ways of allocating storage - for data, for logging, or multiple data volumes for your cluster nodes - but this will get you started running your own ClickHouse cluster on Kubernetes in your favorite environment.
3.1.5 - Uninstall
To remove the Altinity Kubernetes Operator, both the Altinity Kubernetes Operator and the components in its installed namespace have to be removed. The proper command uses the same clickhouse-operator-install-bundle.yaml file that was used to install the Altinity Kubernetes Operator. For more details, see how to install and verify the Altinity Kubernetes Operator.
The following instructions are based on the standard installation instructions. For users who performed a custom installation, note that any custom namespaces they want to remove will have to be deleted separately from the Altinity Kubernetes Operator deletion.
For example, if the custom namespace operator-test is created, then it would be removed with the command kubectl delete namespaces operator-test.
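Before removing anything, it is worth confirming that no ClickHouse installations are still managed by the operator (a sketch using the chi resource shortname shown earlier in this guide):
kubectl get chi --all-namespaces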
IMPORTANT NOTICE 1
Never delete the operator or run the following command while there are live ClickHouse clusters managed by the operator:
kubectl delete --namespace "kube-system" -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
The command will hang due to the live clusters. If you then re-install the operator, those clusters will be deleted and the operator will not work correctly.
See Altinity/clickhouse-operator#830 for more details.
IMPORTANT NOTICE 2
Please follow the instructions below. The uninstall command is geared to properly remove the Altinity Kubernetes Operator and its namespace. Deleting the namespace without properly removing the Altinity Kubernetes Operator can cause it to hang.
Instructions
To remove the Altinity Kubernetes Operator from your Kubernetes environment from a standard install:
-
Verify the Altinity Kubernetes Operator is in the
kube-system
namespace. The Altinity Kubernetes Operator and other pods will be displayed:NAME READY STATUS RESTARTS AGE clickhouse-operator-857c69ffc6-2frgl 2/2 Running 0 5s coredns-78fcd69978-nthp2 1/1 Running 4 (23h ago) 51d etcd-minikube 1/1 Running 4 (23h ago) 51d kube-apiserver-minikube 1/1 Running 4 (23h ago) 51d kube-controller-manager-minikube 1/1 Running 4 (23h ago) 51d kube-proxy-lsggn 1/1 Running 4 (23h ago) 51d kube-scheduler-minikube 1/1 Running 4 (23h ago) 51d storage-provisioner 1/1 Running 9 (23h ago) 51d
-
Issue the kubectl delete command using the same YAML file used to install the Altinity Kubernetes Operator. By default the Altinity Kubernetes Operator is installed in the namespace kube-system. If it was installed into a custom namespace, verify that namespace is specified in the uninstall command. In this example, we specified an installation of the Altinity Kubernetes Operator version 0.18.3 into the default kube-system namespace. This produces output similar to the following:
kubectl delete -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
customresourcedefinition.apiextensions.k8s.io "clickhouseinstallations.clickhouse.altinity.com" deleted customresourcedefinition.apiextensions.k8s.io "clickhouseinstallationtemplates.clickhouse.altinity.com" deleted customresourcedefinition.apiextensions.k8s.io "clickhouseoperatorconfigurations.clickhouse.altinity.com" deleted serviceaccount "clickhouse-operator" deleted clusterrole.rbac.authorization.k8s.io "clickhouse-operator-kube-system" deleted clusterrolebinding.rbac.authorization.k8s.io "clickhouse-operator-kube-system" deleted configmap "etc-clickhouse-operator-files" deleted configmap "etc-clickhouse-operator-confd-files" deleted configmap "etc-clickhouse-operator-configd-files" deleted configmap "etc-clickhouse-operator-templatesd-files" deleted configmap "etc-clickhouse-operator-usersd-files" deleted deployment.apps "clickhouse-operator" deleted service "clickhouse-operator-metrics" deleted
-
To verify the Altinity Kubernetes Operator has been removed, use the kubectl get pods command:
kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE coredns-78fcd69978-nthp2 1/1 Running 4 (23h ago) 51d etcd-minikube 1/1 Running 4 (23h ago) 51d kube-apiserver-minikube 1/1 Running 4 (23h ago) 51d kube-controller-manager-minikube 1/1 Running 4 (23h ago) 51d kube-proxy-lsggn 1/1 Running 4 (23h ago) 51d kube-scheduler-minikube 1/1 Running 4 (23h ago) 51d storage-provisioner 1/1 Running 9 (23h ago) 51d
3.2 - Kubernetes Install Guide
Kubernetes and Zookeeper form the backbone of running the Altinity Kubernetes Operator in a cluster. The following guides detail how to set up Kubernetes in different environments.
3.2.1 - Install minikube for Linux
One popular option for installing Kubernetes is through minikube, which creates a local Kubernetes cluster for different environments. Test scripts and examples for the clickhouse-operator are based on using minikube to set up the Kubernetes environment.
The following guide demonstrates how to install minikube with support for the clickhouse-operator on the following operating systems:
- Linux (Deb based)
Minikube Installation for Deb Based Linux
The following instructions assume an installation for x86-64 based Linux that uses Deb packages. Please see the referenced documentation for instructions for other Linux distributions and platforms.
To install minikube
that supports running clickhouse-operator
:
kubectl Installation for Deb
The following instructions are based on Install and Set Up kubectl on Linux
-
Download the
kubectl
binary:curl -LO 'https://dl.k8s.io/release/v1.22.0/bin/linux/amd64/kubectl'
-
Verify the SHA-256 hash:
curl -LO "https://dl.k8s.io/v1.22.0/bin/linux/amd64/kubectl.sha256"
echo "$(<kubectl.sha256) kubectl" | sha256sum --check
-
Install kubectl into the /usr/local/bin directory (this assumes that your PATH includes /usr/local/bin):
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
-
Verify the installation and the version:
kubectl version
Install Docker for Deb
These instructions are based on Docker’s documentation Install Docker Engine on Ubuntu
-
Install the Docker repository links.
-
Update the
apt-get
repository:sudo apt-get update
-
-
Install the prerequisites
ca-certificates
,curl
,gnupg
, andlsb-release
:sudo apt-get install -y ca-certificates curl gnupg lsb-release
-
Add the Docker repository keys:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --yes --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
-
Add the Docker repository:
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
-
-
Install Docker:
-
Update the
apt-get
repository:sudo apt-get update
-
Install Docker and other libraries:
sudo apt install docker-ce docker-ce-cli containerd.io
-
-
Add non-root accounts to the
docker
group. This allows these users to run Docker commands without requiring root access.-
Add current user to the
docker
group and activate the changes to the group:
sudo usermod -aG docker $USER && newgrp docker
-
Install Minikube for Deb
The following instructions are taken from minikube start.
-
Update the
apt-get
repository:sudo apt-get update
-
Install the prerequisite
conntrack
:sudo apt install conntrack
-
Download minikube:
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
-
Install minikube:
sudo install minikube-linux-amd64 /usr/local/bin/minikube
-
To correct issues with the
kube-proxy
and thestorage-provisioner
, setnf_conntrack_max=524288
before startingminikube
:sudo sysctl net/netfilter/nf_conntrack_max=524288
-
Start minikube:
minikube start && echo "ok: started minikube successfully"
-
Once installation is complete, verify that the user owns the ~/.kube and ~/.minikube directories:
sudo chown -R $USER.$USER .kube
sudo chown -R $USER.$USER .minikube
3.2.2 - Altinity Kubernetes Operator on GKE
Organizations can host their Altinity Kubernetes Operator on the Google Kubernetes Engine (GKE). This can be done either through Altinity.Cloud or through a separate installation on GKE.
To set up a basic Altinity Kubernetes Operator environment, use the following steps. The steps below use the current free Google Cloud services to set up a minimally viable Kubernetes with ClickHouse environment.
Prerequisites
- Register a Google Cloud Account: https://cloud.google.com/.
- Create a Google Cloud project: https://cloud.google.com/resource-manager/docs/creating-managing-projects
- Install
gcloud
and rungcloud init
orgcloud init --console
to set up your environment: https://cloud.google.com/sdk/docs/install - Enable the Google Compute Engine: https://cloud.google.com/endpoints/docs/openapi/enable-api
- Enable GKE on your project: https://console.cloud.google.com/apis/enableflow?apiid=container.googleapis.com.
- Select a default Compute Engine zone.
- Select a default Compute Engine region.
- Install kubectl on your local system. For sample instructions, see the Minikube on Linux installation instructions.
Altinity Kubernetes Operator on GKE Installation instructions
Installing the Altinity Kubernetes Operator in GKE has the following main steps:
Create the Network
The first step in setting up the Altinity Kubernetes Operator in GKE is creating the network. The complete details can be found on the Google Cloud documentation site regarding the gcloud compute networks create command. The following command will create a network called kubernetes-1
that will work for our minimal Altinity Kubernetes Operator cluster. Note that this network will not be available to external networks unless additional steps are taken. Consult the Google Cloud documentation site for more details.
-
See a list of current networks available. In this example, there are no networks setup in this project:
gcloud compute networks list NAME SUBNET_MODE BGP_ROUTING_MODE IPV4_RANGE GATEWAY_IPV4 default AUTO REGIONAL
-
Create the network in your Google Cloud project:
gcloud compute networks create kubernetes-1 --bgp-routing-mode regional --subnet-mode custom Created [https://www.googleapis.com/compute/v1/projects/betadocumentation/global/networks/kubernetes-1]. NAME SUBNET_MODE BGP_ROUTING_MODE IPV4_RANGE GATEWAY_IPV4 kubernetes-1 CUSTOM REGIONAL Instances on this network will not be reachable until firewall rules are created. As an example, you can allow all internal traffic between instances as well as SSH, RDP, and ICMP by running: $ gcloud compute firewall-rules create <FIREWALL_NAME> --network kubernetes-1 --allow tcp,udp,icmp --source-ranges <IP_RANGE> $ gcloud compute firewall-rules create <FIREWALL_NAME> --network kubernetes-1 --allow tcp:22,tcp:3389,icmp
-
Verify its creation:
gcloud compute networks list NAME SUBNET_MODE BGP_ROUTING_MODE IPV4_RANGE GATEWAY_IPV4 default AUTO REGIONAL kubernetes-1 CUSTOM REGIONAL
Create the Cluster
Now that the network has been created, we can set up our cluster. The following cluster will use the e2-micro machine type - this is still in the free tier, and gives just enough power to run our basic cluster. The cluster will be called cluster-1, but you can replace that with whatever name you feel is appropriate. It uses the kubernetes-1 network specified earlier and creates a new subnet for the cluster under k-subnet-1.
To create and launch the cluster:
-
Verify the existing clusters with the
gcloud
command. For this example there are no pre-existing clusters.gcloud container clusters list
-
From the command line, issue the following
gcloud
command to create the cluster:gcloud container clusters create cluster-1 --region us-west1 --node-locations us-west1-a --machine-type e2-micro --network kubernetes-1 --create-subnetwork name=k-subnet-1 --enable-ip-alias &
-
Use the
clusters list
command to verify when the cluster is available for use:gcloud container clusters list Created [https://container.googleapis.com/v1/projects/betadocumentation/zones/us-west1/clusters/cluster-1]. To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-west1/cluster-1?project=betadocumentation kubeconfig entry generated for cluster-1. NAME LOCATION MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS cluster-1 us-west1 1.21.6-gke.1500 35.233.231.36 e2-micro 1.21.6-gke.1500 3 RUNNING NAME LOCATION MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS cluster-1 us-west1 1.21.6-gke.1500 35.233.231.36 e2-micro 1.21.6-gke.1500 3 RUNNING [1]+ Done gcloud container clusters create cluster-1 --region us-west1 --node-locations us-west1-a --machine-type e2-micro --network kubernetes-1 --create-subnetwork name=k-subnet-1 --enable-ip-alias
Get Cluster Credentials
Importing the cluster credentials into your kubectl
environment will allow you to issue commands directly to the cluster on Google Cloud. To import the cluster credentials:
-
Retrieve the credentials for the newly created cluster:
gcloud container clusters get-credentials cluster-1 --region us-west1 --project betadocumentation Fetching cluster endpoint and auth data. kubeconfig entry generated for cluster-1.
-
Verify the cluster information from the
kubectl
environment:kubectl cluster-info Kubernetes control plane is running at https://35.233.231.36 GLBCDefaultBackend is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy KubeDNS is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy Metrics-server is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Install the Altinity ClickHouse Operator
Our cluster is up and ready to go. Time to install the Altinity Kubernetes Operator through the following steps. Note that we are specifying the version of the Altinity Kubernetes Operator to install. This ensures maximum compatibility with your applications and other Kubernetes environments.
As of the time of this article, the most current version is 0.18.1
-
Apply the Altinity Kubernetes Operator manifest by either downloading it and applying it, or referring to the GitHub repository URL. For more information, see the Altinity Kubernetes Operator Installation Guides.
kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.1/deploy/operator/clickhouse-operator-install-bundle.yaml
-
Verify the installation by running:
kubectl get pods --namespace kube-system NAME READY STATUS RESTARTS AGE clickhouse-operator-77b54889b4-g98kk 2/2 Running 0 53s event-exporter-gke-5479fd58c8-7h6bn 2/2 Running 0 108s fluentbit-gke-b29c2 2/2 Running 0 79s fluentbit-gke-k8f2n 2/2 Running 0 80s fluentbit-gke-vjlqh 2/2 Running 0 80s gke-metrics-agent-4ttdt 1/1 Running 0 79s gke-metrics-agent-qf24p 1/1 Running 0 80s gke-metrics-agent-szktc 1/1 Running 0 80s konnectivity-agent-564f9f6c5f-59nls 1/1 Running 0 40s konnectivity-agent-564f9f6c5f-9nfnl 1/1 Running 0 40s konnectivity-agent-564f9f6c5f-vk7l8 1/1 Running 0 97s konnectivity-agent-autoscaler-5c49cb58bb-zxzlp 1/1 Running 0 97s kube-dns-697dc8fc8b-ddgrx 4/4 Running 0 98s kube-dns-697dc8fc8b-fpnps 4/4 Running 0 71s kube-dns-autoscaler-844c9d9448-pqvqr 1/1 Running 0 98s kube-proxy-gke-cluster-1-default-pool-fd104f22-8rx3 1/1 Running 0 36s kube-proxy-gke-cluster-1-default-pool-fd104f22-gnd0 1/1 Running 0 29s kube-proxy-gke-cluster-1-default-pool-fd104f22-k2sv 1/1 Running 0 12s l7-default-backend-69fb9fd9f9-hk7jq 1/1 Running 0 107s metrics-server-v0.4.4-857776bc9c-bs6sl 2/2 Running 0 44s pdcsi-node-5l9vf 2/2 Running 0 79s pdcsi-node-gfwln 2/2 Running 0 79s pdcsi-node-q6scz 2/2 Running 0 80s
Create a Simple ClickHouse Cluster
The Altinity Kubernetes Operator allows the easy creation and modification of ClickHouse clusters in whatever format works best for your organization. Now that the Google Cloud cluster is running and has the Altinity Kubernetes Operator installed, let's create a very simple ClickHouse cluster to test on.
The following example will create an Altinity Kubernetes Operator controlled cluster with 1 shard and 1 replica, 500 MB of persistent storage, and set the default Altinity Kubernetes Operator user's password to topsecret
. For more information on customizing the Altinity Kubernetes Operator, see the Altinity Kubernetes Operator Configuration Guides.
-
Create the following manifest and save it as
gcp-example01.yaml
.
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "gcp-example"
spec:
  configuration:
    # What does my cluster look like?
    clusters:
      - name: "gcp-example"
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-stable
          volumeClaimTemplate: pd-ssd
    # Where is Zookeeper?
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    # What are my users?
    users:
      # Password = topsecret
      demo/password_sha256_hex: 53336a676c64c1396553b2b7c92f38126768827c93b64d9142069c10eda7a721
      demo/profile: default
      demo/quota: default
      demo/networks/ip:
        - 0.0.0.0/0
        - ::/0
  templates:
    podTemplates:
      # What is the definition of my server?
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable
        # Keep servers on separate nodes!
        podDistribution:
          - scope: ClickHouseInstallation
            type: ClickHouseAntiAffinity
    volumeClaimTemplates:
      # How much storage and which type on each node?
      - name: pd-ssd
        # Do not delete PVC if installation is dropped.
        reclaimPolicy: Retain
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
-
Create a namespace in your GKE environment. For this example, we will be using
test
:kubectl create namespace test namespace/test created
-
Apply the manifest to the namespace:
kubectl -n test apply -f gcp-example01.yaml clickhouseinstallation.clickhouse.altinity.com/gcp-example created
-
Verify the installation is complete when all pods are in a
Running
state:kubectl -n test get chi -o wide NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT gcp-example 0.18.1 1 1 1 f859e396-e2de-47fd-8016-46ad6b0b8508 Completed 1 clickhouse-gcp-example.test.svc.cluster.local
Login to the Cluster
This example does not have any open external ports, but we can still access our ClickHouse database through kubectl exec. In this case, the specific pod we are connecting to is chi-gcp-example-gcp-example-0-0-0. Replace this with the designation of your pod.
Use the following procedure to verify the Altinity Stable build install in your GKE environment.
-
Login to the
clickhouse-client
in one of your existing pods:kubectl -n test exec -it chi-gcp-example-gcp-example-0-0-0 -- clickhouse-client
-
Verify the cluster configuration:
kubectl -n test exec -it chi-gcp-example-gcp-example-0-0-0 -- clickhouse-client -q "SELECT * FROM system.clusters FORMAT PrettyCompactNoEscapes" ┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐ │ all-replicated │ 1 │ 1 │ 1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ all-sharded │ 1 │ 1 │ 1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ gcp-example │ 1 │ 1 │ 1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ test_cluster_two_shards │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ test_cluster_two_shards │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │ │ test_cluster_two_shards_internal_replication │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ test_cluster_two_shards_internal_replication │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │ │ test_cluster_two_shards_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ test_cluster_two_shards_localhost │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ test_shard_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ test_shard_localhost_secure │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9440 │ 0 │ default │ │ 0 │ 0 │ 0 │ │ test_unavailable_shard │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │ │ test_unavailable_shard │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 1 │ 0 │ default │ │ 0 │ 0 │ 0 │ └──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
-
Exit out of your cluster:
chi-gcp-example-gcp-example-0-0-0.chi-gcp-example-gcp-example-0-0.test.svc.cluster.local :) exit Bye.
Further Steps
This simple example demonstrates how to build and manage an Altinity Kubernetes Operator run cluster for ClickHouse. Further steps would be to open the cluster to external network connections, set up replication schemes, and so on.
For more information, see the Altinity Kubernetes Operator guides and the Altinity Kubernetes Operator repository.
3.3 - Operator Guide
The Altinity Kubernetes Operator is an open source project managed and maintained by Altinity Inc. This Operator Guide was created to help users with installation, configuration, maintenance, and other important tasks.
3.3.1 - Installation Guide
Depending on your organization and its needs, there are different ways of installing the Kubernetes clickhouse-operator.
3.3.1.1 - Basic Installation Guide
Requirements
The Altinity Kubernetes Operator for Kubernetes has the following requirements:
- Kubernetes 1.15.11+. For instructions on how to install Kubernetes for your particular environment, see the Kubernetes Install Tools page.
- Access to the clickhouse-operator-install-bundle.yaml file.
Instructions
To install the Altinity Kubernetes Operator for Kubernetes:
-
Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during installation - this ensures maximum compatibility and that all replicated environments are working from the same version. For more information on installing other versions of the Altinity Kubernetes Operator, see the Specific Version Installation Guide.
The most current version is
0.18.3
:
kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
- The following will be displayed on a successful installation. For more information on the resources created in the installation, see Altinity Kubernetes Operator Resources.
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created
- Verify the installation by running:
kubectl get pods --namespace kube-system
The following will be displayed on a successful installation,
with your particular image:
NAME READY STATUS RESTARTS AGE
clickhouse-operator-857c69ffc6-ttnsj 2/2 Running 0 4s
coredns-78fcd69978-nthp2 1/1 Running 4 (23h ago) 51d
etcd-minikube 1/1 Running 4 (23h ago) 51d
kube-apiserver-minikube 1/1 Running 4 (23h ago) 51d
kube-controller-manager-minikube 1/1 Running 4 (23h ago) 51d
kube-proxy-lsggn 1/1 Running 4 (23h ago) 51d
kube-scheduler-minikube 1/1 Running 4 (23h ago) 51d
storage-provisioner 1/1 Running 9 (23h ago) 51d
3.3.1.2 - Custom Installation Guide
Users who need to customize their Altinity Kubernetes Operator namespace, or who cannot directly connect to GitHub from the installation environment, can perform a custom install.
Requirements
The Altinity Kubernetes Operator for Kubernetes has the following requirements:
- Kubernetes 1.15.11+. For instructions on how to install Kubernetes for your particular environment, see the Kubernetes Install Tools page.
- Access to the clickhouse-operator-install-bundle.yaml file.
Instructions
Script Install into Namespace
By default, the Altinity Kubernetes Operator is installed into the kube-system namespace when using the Basic Installation instructions.
To install into a different namespace use the following command replacing {custom namespace here}
with the namespace to use:
curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE={custom_namespace_here} bash
For example, to install into the namespace test-clickhouse-operator
namespace, use:
curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE=test-clickhouse-operator bash
Setup ClickHouse Operator into 'test-clickhouse-operator' namespace
No 'test-clickhouse-operator' namespace found. Going to create
namespace/test-clickhouse-operator created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created
If no OPERATOR_NAMESPACE value is set, then the Altinity Kubernetes Operator will be installed into kube-system.
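To confirm the operator started in the custom namespace, you can check its pods there (using the example namespace above):
kubectl get pods --namespace test-clickhouse-operator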
Manual Install into Namespace
Organizations that cannot access GitHub directly from the environment where the Altinity Kubernetes Operator is being installed can perform a manual install through the following steps:
-
Download the install template file: clickhouse-operator-install-template.yaml.
-
Edit the file and set OPERATOR_NAMESPACE value.
-
Use the following commands, replacing {your file name} with the name of your YAML file. For example, the OPERATOR_NAMESPACE placeholder in the template can be set with sed before applying it:
sed -i 's/${OPERATOR_NAMESPACE}/test-clickhouse-operator/' clickhouse-operator-install-template.yaml
kubectl apply -f {your file name}
For example:
kubectl apply -f customtemplate.yaml
Alternatively, instead of using the install template, enter the following into your console (bash is used below; modify depending on your particular shell). Change the OPERATOR_NAMESPACE value to match your namespace.
# Namespace to install operator into
OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE:-clickhouse-operator}"
# Namespace to install metrics-exporter into
METRICS_EXPORTER_NAMESPACE="${OPERATOR_NAMESPACE}"
# Operator's docker image
OPERATOR_IMAGE="${OPERATOR_IMAGE:-altinity/clickhouse-operator:latest}"
# Metrics exporter's docker image
METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE:-altinity/metrics-exporter:latest}"
# Setup Altinity Kubernetes Operator into specified namespace
kubectl apply --namespace="${OPERATOR_NAMESPACE}" -f <( \
curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator/clickhouse-operator-install-template.yaml | \
OPERATOR_IMAGE="${OPERATOR_IMAGE}" \
OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE}" \
METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE}" \
METRICS_EXPORTER_NAMESPACE="${METRICS_EXPORTER_NAMESPACE}" \
envsubst \
)
Verify Installation
To verify the Altinity Kubernetes Operator is running in your namespace, use the following command:
kubectl get pods -n clickhouse-operator
NAME READY STATUS RESTARTS AGE
clickhouse-operator-5d9496dd48-8jt8h 2/2 Running 0 16s
3.3.1.3 - Source Build Guide - 0.18 and Up
Organizations that prefer to build the software directly from source code can compile the Altinity Kubernetes Operator and install it into a Docker container through the following process. This procedure applies to versions of the Altinity Kubernetes Operator 0.18.0 and up.
Binary Build
Binary Build Requirements
- go-lang compiler: Go.
- Go mod Package Manager.
- The source code from the Altinity Kubernetes Operator repository. This can be downloaded using git clone https://github.com/altinity/clickhouse-operator.
Binary Build Instructions
-
Switch the working directory to clickhouse-operator.
-
Install the Go compiler if it is not already available. For example, on Ubuntu:
sudo apt install -y golang
-
Build the sources with:
go build -o ./clickhouse-operator cmd/operator/main.go
This creates the Altinity Kubernetes Operator binary. This binary is only used within a Kubernetes environment.
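As a quick check that the build step above produced the binary (an optional step, not part of the original procedure), list it:
ls -lh ./clickhouse-operator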
Docker Image Build and Usage
Docker Build Requirements
- kubernetes: https://kubernetes.io/
- docker: https://www.docker.com/
- Complete the Binary Build Instructions
Install Docker Buildx CLI plugin
-
Download the Docker Buildx binary from the releases page on GitHub.
-
Create folder structure for plugin
mkdir -p ~/.docker/cli-plugins/
-
Rename the relevant binary and copy it to the destination matching your OS
mv buildx-v0.7.1.linux-amd64 ~/.docker/cli-plugins/docker-buildx
-
On Unix environments, it may also be necessary to make it executable with chmod +x:
chmod +x ~/.docker/cli-plugins/docker-buildx
-
Set buildx as the default builder
docker buildx install
-
Create config.json file to enable the plugin
touch ~/.docker/config.json
-
Add the experimental setting to config.json to enable the plugin:
echo '{"experimental": "enabled"}' >> ~/.docker/config.json
Docker Build Instructions
-
Switch the working directory to clickhouse-operator.
-
Build the docker image with docker:
docker build -f dockerfile/operator/Dockerfile -t altinity/clickhouse-operator:dev .
-
Register the freshly built docker image inside the kubernetes environment with the following:
docker save altinity/clickhouse-operator | (eval $(minikube docker-env) && docker load)
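To confirm the image is now available to minikube's docker daemon, you can list it by repository name (an optional check, not part of the original procedure):
eval $(minikube docker-env)
docker images altinity/clickhouse-operator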
-
Install the Altinity Kubernetes Operator as described in either the Basic Installation Guide or the Custom Installation Guide.
3.3.1.4 - Specific Version Installation Guide
Users may want to install a specific version of the Altinity Kubernetes Operator for a variety of reasons, such as maintaining parity between different environments or preserving the version between replicas.
The following procedures detail how to install a specific version of the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.
Requirements
The Altinity Kubernetes Operator for Kubernetes has the following requirements:
- Kubernetes 1.15.11+. For instructions on how to install Kubernetes for your particular environment, see the Kubernetes Install Tools page.
- Access to the Altinity Kubernetes Operator GitHub repository.
Instructions
Altinity Kubernetes Operator Versions After 0.17.0
To install a specific version of the Altinity Kubernetes Operator after version 0.17.0:
-
Run kubectl and apply the manifest directly from the GitHub Altinity Kubernetes Operator repository, or download the manifest and apply it locally. The format for the URL is:
https://github.com/Altinity/clickhouse-operator/raw/{OPERATOR_VERSION}/deploy/operator/clickhouse-operator-install-bundle.yaml
Replace {OPERATOR_VERSION} with the version to install. For example, for Altinity Kubernetes Operator version 0.18.3, the command to apply the manifest through kubectl is:
kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created
-
Verify the installation is complete and the clickhouse-operator pod is running:
kubectl get pods --namespace kube-system
A similar result to the following will be displayed on a successful installation:
NAME                                   READY   STATUS    RESTARTS      AGE
clickhouse-operator-857c69ffc6-q8qrr   2/2     Running   0             5s
coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
etcd-minikube                          1/1     Running   4 (23h ago)   51d
kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
storage-provisioner                    1/1     Running   9 (23h ago)   51d
-
To verify the version of the Altinity Kubernetes Operator, use the following command:
kubectl get pods -l app=clickhouse-operator --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
3.3.1.5 - Upgrade Guide
The Altinity Kubernetes Operator can be upgraded at any time by applying the new manifest from the Altinity Kubernetes Operator GitHub repository.
The following procedures detail how to upgrade the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.
Requirements
The Altinity Kubernetes Operator for Kubernetes has the following requirements:
- Kubernetes 1.15.11+. For instructions on how to install Kubernetes for your particular environment, see the Kubernetes Install Tools page.
- Access to the Altinity Kubernetes Operator GitHub repository.
Instructions
The following instructions are based on installations of the Altinity Kubernetes Operator greater than version 0.16.0. In the following examples, Altinity Kubernetes Operator version 0.16.0 has been installed and will be upgraded to 0.18.3.
For instructions on installing specific versions of the Altinity Kubernetes Operator, see the Specific Version Installation Guide.
-
Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during the installation for maximum compatibility. In this example, the version being upgraded to is 0.18.3:
kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
-
The following will be displayed on a successful installation.
For more information on the resources created in the installation, see Altinity Kubernetes Operator Resources.
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
serviceaccount/clickhouse-operator configured
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
configmap/etc-clickhouse-operator-files configured
configmap/etc-clickhouse-operator-confd-files configured
configmap/etc-clickhouse-operator-configd-files configured
configmap/etc-clickhouse-operator-templatesd-files configured
configmap/etc-clickhouse-operator-usersd-files configured
deployment.apps/clickhouse-operator configured
service/clickhouse-operator-metrics configured
-
Verify the installation by running:
kubectl get pods --namespace kube-system
The following will be displayed on a successful installation, with your particular image:
NAME                                   READY   STATUS    RESTARTS       AGE
clickhouse-operator-857c69ffc6-dqt5l   2/2     Running   0              29s
coredns-78fcd69978-nthp2               1/1     Running   3 (14d ago)    50d
etcd-minikube                          1/1     Running   3 (14d ago)    50d
kube-apiserver-minikube                1/1     Running   3 (2m6s ago)   50d
kube-controller-manager-minikube       1/1     Running   3 (14d ago)    50d
kube-proxy-lsggn                       1/1     Running   3 (14d ago)    50d
kube-scheduler-minikube                1/1     Running   3 (2m6s ago)   50d
storage-provisioner                    1/1     Running   7 (48s ago)    50d
-
To verify the version of the Altinity Kubernetes Operator, use the following command:
kubectl get pods -l app=clickhouse-operator -n kube-system -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
3.3.2 - Configuration Guide
Depending on your organization’s needs and environment, you can modify your environment to best fit your needs with the Altinity Kubernetes Operator or your cluster settings.
3.3.2.1 - ClickHouse Operator Settings
Altinity Kubernetes Operator 0.18 and greater
For versions of the Altinity Kubernetes Operator 0.18 and later, the operator settings can be modified through the clickhouse-operator-install-bundle.yaml file in the section etc-clickhouse-operator-files. This provides the config.yaml settings that control the user configuration and other settings. For more information, see the sample config.yaml for the Altinity Kubernetes Operator.
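One way to change these settings in a running installation is to edit the configmap that holds config.yaml and then restart the operator deployment so it picks up the new values. This is a sketch that assumes the default kube-system namespace:
kubectl -n kube-system edit configmap etc-clickhouse-operator-files
kubectl -n kube-system rollout restart deployment clickhouse-operator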
Altinity Kubernetes Operator before 0.18
For versions before 0.18, the Altinity Kubernetes Operator settings can be modified through the clickhouse-operator-install-bundle.yaml file in the section marked ClickHouse Settings Section.
New User Settings
Setting | Default Value | Description |
---|---|---|
chConfigUserDefaultProfile | default | Sets the default profile used when creating new users. |
chConfigUserDefaultQuota | default | Sets the default quota used when creating new users. |
chConfigUserDefaultNetworksIP | ::1 127.0.0.1 0.0.0.0 | Specifies the networks that the user can connect from. Note that 0.0.0.0 allows access from all networks. |
chConfigUserDefaultPassword | default | The initial password for new users. |
ClickHouse Operator Settings
The ClickHouse Operator role can connect to the ClickHouse database to perform the following:
- Metrics requests
- Schema Maintenance
- Drop DNS Cache
Additional users can be created with this role by modifying the usersd XML files.
Setting | Default Value | Description |
---|---|---|
chUsername | clickhouse_operator | The username for the ClickHouse Operator user. |
chPassword | clickhouse_operator_password | The default password for the ClickHouse Operator user. |
chPort | 8123 | The IP port for the ClickHouse Operator user. |
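With the defaults above, the clickhouse_operator account connects over the ClickHouse HTTP interface on port 8123. As a rough connectivity check (not part of the original guide), the credentials can be tested with curl; replace {clickhouse-host} with an actual ClickHouse host:
curl -u clickhouse_operator:clickhouse_operator_password -d 'SELECT 1' http://{clickhouse-host}:8123/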
Log Parameters
The Log Parameters section sets the options for log outputs and levels.
Setting | Default Value | Description |
---|---|---|
logtostderr | true | If set to true, submits logs to stderr instead of log files. |
alsologtostderr | false | If true, submits logs to stderr as well as log files. |
v | 1 | Sets V-leveled logging level. |
stderrthreshold | "" | The error threshold. Errors at or above this level will be submitted to stderr. |
vmodule | "" | A comma separated list of modules and their verbose level with {module name} = {log level}. For example: "module1=2,module2=3". |
log_backtrace_at | "" | Location to store the stack backtrace. |
Runtime Parameters
The Runtime Parameters section sets the resources allocated for processes such as reconcile functions.
Setting | Default Value | Description |
---|---|---|
reconcileThreadsNumber | 10 | The number of threads allocated to manage reconcile requests. |
reconcileWaitExclude | false | ??? |
reconcileWaitInclude | false | ??? |
Template Parameters
Template Parameters sets the connection values, user default settings, and other values. These values are based on ClickHouse configurations. For full details, see the ClickHouse documentation page.
3.3.2.2 - ClickHouse Cluster Settings
ClickHouse clusters that are configured on Kubernetes have several options based on the Kubernetes Custom Resources settings. Your cluster may have particular requirements to best fit your organization's needs.
For an example of a configuration file using each of these settings, see the 99-clickhouseinstallation-max.yaml file as a template.
This assumes that you have installed the clickhouse-operator.
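If you are unsure whether the operator is present, its pods can be listed by label across all namespaces (the app=clickhouse-operator label is the same one used for version checks elsewhere in this guide):
kubectl get pods --all-namespaces -l app=clickhouse-operator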
Initial Settings
The first section sets the cluster kind and api.
Parent | Setting | Type | Description |
---|---|---|---|
None | kind | String | Specifies the type of cluster to install. In this case, ClickHouse. Valid value: ClickHouseInstallation |
None | metadata | Object | Assigns metadata values for the cluster |
metadata | name | String | The name of the resource. |
metadata | labels | Array | Labels applied to the resource. |
metadata | annotation | Array | Annotations applied to the resource. |
Initial Settings Example
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "clickhouse-installation-max"
labels:
label1: label1_value
label2: label2_value
annotations:
annotation1: annotation1_value
annotation2: annotation2_value
.spec.defaults
.spec.defaults
section represents the default values for the sections that follow.
Parent | Setting | Type | Description |
---|---|---|---|
defaults | replicasUseFQDN | [Yes | No] | Determines whether replicas are addressed by their fully qualified domain names (FQDN). |
defaults | distributedDDL | String | Sets the <yandex><distributed_ddl></distributed_ddl></yandex> configuration settings. For more information, see Distributed DDL Queries (ON CLUSTER Clause). |
defaults | templates | Array | Sets the pod template types. This is where the template is declared, then defined in the .spec.configuration later. |
.spec.defaults Example
defaults:
replicasUseFQDN: "no"
distributedDDL:
profile: default
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
serviceTemplate: chi-service-template
.spec.configuration
.spec.configuration
section represents sources for ClickHouse configuration files. For more information, see the ClickHouse Configuration Files page.
.spec.configuration Example
configuration:
users:
readonly/profile: readonly
# <users>
# <readonly>
# <profile>readonly</profile>
# </readonly>
# </users>
test/networks/ip:
- "127.0.0.1"
- "::/0"
# <users>
# <test>
# <networks>
# <ip>127.0.0.1</ip>
# <ip>::/0</ip>
# </networks>
# </test>
# </users>
test/profile: default
test/quotas: default
.spec.configuration.zookeeper
.spec.configuration.zookeeper
defines the zookeeper settings, and is expanded into the <yandex><zookeeper></zookeeper></yandex>
configuration section. For more information, see ClickHouse Zookeeper settings.
.spec.configuration.zookeeper Example
zookeeper:
nodes:
- host: zookeeper-0.zookeepers.zoo3ns.svc.cluster.local
port: 2181
- host: zookeeper-1.zookeepers.zoo3ns.svc.cluster.local
port: 2181
- host: zookeeper-2.zookeepers.zoo3ns.svc.cluster.local
port: 2181
session_timeout_ms: 30000
operation_timeout_ms: 10000
root: /path/to/zookeeper/node
identity: user:password
.spec.configuration.profiles
.spec.configuration.profiles
defines the ClickHouse profiles that are stored in <yandex><profiles></profiles></yandex>
. For more information, see the ClickHouse Server Settings page.
.spec.configuration.profiles Example
profiles:
readonly/readonly: 1
expands into
<profiles>
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
.spec.configuration.users
.spec.configuration.users
defines the users and is stored in <yandex><users></users></yandex>
. For more information, see the ClickHouse Server Settings page.
.spec.configuration.users Example
users:
test/networks/ip:
- "127.0.0.1"
- "::/0"
expands into
<users>
<test>
<networks>
<ip>127.0.0.1</ip>
<ip>::/0</ip>
</networks>
</test>
</users>
.spec.configuration.settings
.spec.configuration.settings
sets other ClickHouse settings such as compression, etc. For more information, see the ClickHouse Server Settings page.
.spec.configuration.settings Example
settings:
compression/case/method: "zstd"
# <compression>
# <case>
# <method>zstd</method>
# </case>
# </compression>
.spec.configuration.files
.spec.configuration.files
creates custom files used in the cluster. These are used for custom configurations, such as the ClickHouse External Dictionary.
.spec.configuration.files Example
files:
dict1.xml: |
<yandex>
<!-- ref to file /etc/clickhouse-data/config.d/source1.csv -->
</yandex>
source1.csv: |
a1,b1,c1,d1
a2,b2,c2,d2
spec:
configuration:
settings:
dictionaries_config: config.d/*.dict
files:
dict_one.dict: |
<yandex>
<dictionary>
<name>one</name>
<source>
<clickhouse>
<host>localhost</host>
<port>9000</port>
<user>default</user>
<password/>
<db>system</db>
<table>one</table>
</clickhouse>
</source>
<lifetime>60</lifetime>
<layout><flat/></layout>
<structure>
<id>
<name>dummy</name>
</id>
<attribute>
<name>one</name>
<expression>dummy</expression>
<type>UInt8</type>
<null_value>0</null_value>
</attribute>
</structure>
</dictionary>
</yandex>
.spec.configuration.clusters
.spec.configuration.clusters
defines the ClickHouse clusters to be installed.
clusters:
Clusters and Layouts
.clusters.layout
defines the ClickHouse layout of a cluster. This can be general, or very granular depending on your requirements. For full information, see Cluster Deployment.
Templates
podTemplate
is used to define the specific pods in the cluster, mainly the ones that will be running ClickHouse. The VolumeClaimTemplate
defines the storage volumes. Both of these settings are applied per replica.
Basic Dimensions
Basic dimensions are used to define the cluster definitions without specifying particular details of the shards or nodes.
Parent | Setting | Type | Description |
---|---|---|---|
.clusters.layout | shardsCount | Number | The number of shards for the cluster. |
.clusters.layout | replicasCount | Number | The number of replicas for the cluster. |
Basic Dimensions Example
In this example, the podTemplate defines ClickHouse containers in a cluster called all-counts with three shards and two replicas.
- name: all-counts
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
layout:
shardsCount: 3
replicasCount: 2
This is expanded into the following configuration. The IP addresses and DNS configuration are assigned by k8s and the operator.
<yandex>
<remote_servers>
<all-counts>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.1.1</host>
<port>9000</port>
</replica>
<replica>
<host>192.168.1.2</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.1.3</host>
<port>9000</port>
</replica>
<replica>
<host>192.168.1.4</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.1.5</host>
<port>9000</port>
</replica>
<replica>
<host>192.168.1.6</host>
<port>9000</port>
</replica>
</shard>
</all-counts>
</remote_servers>
</yandex>
Specified Dimensions
The templates
section can also be used to specify more than just the general layout. The exact definitions of the shards and replicas can be defined as well.
In this example, shard0 has replicasCount specified, while shard1 has 3 replicas explicitly specified, with the possibility to customize each replica.
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
layout:
shardsCount: 3
replicasCount: 2
- name: customized
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
layout:
shards:
- name: shard0
replicasCount: 3
weight: 1
internalReplication: Disabled
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
- name: shard1
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
replicas:
- name: replica0
- name: replica1
- name: replica2
Other examples are combinations, where some replicas are defined but only one is explicitly differentiated with a different podTemplate
.
- name: customized
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
layout:
shards:
- name: shard2
replicasCount: 3
templates:
podTemplate: clickhouse-v18.16.1
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
replicas:
- name: replica0
port: 9000
templates:
podTemplate: clickhouse-v19.11.3.11
dataVolumeClaimTemplate: default-volume-claim
logVolumeClaimTemplate: default-volume-claim
.spec.templates.serviceTemplates
.spec.templates.serviceTemplates
represents Kubernetes Service templates, with additional fields.
At the top level is generateName, which is used to explicitly specify the name of the service to be created. generateName understands macros for the service level of the object created. The service levels are defined as:
- CHI
- Cluster
- Shard
- Replica
The macro and service level where they apply are:
Setting | CHI | Cluster | Shard | Replica | Description |
---|---|---|---|---|---|
{chi} | X | X | X | X | ClickHouseInstallation name |
{chiID} | X | X | X | X | short hashed ClickHouseInstallation name (Experimental) |
{cluster} | | X | X | X | The cluster name |
{clusterID} | | X | X | X | short hashed cluster name (BEWARE, this is an experimental feature) |
{clusterIndex} | | X | X | X | 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature) |
{shard} | | | X | X | shard name |
{shardID} | | | X | X | short hashed shard name (BEWARE, this is an experimental feature) |
{shardIndex} | | | X | X | 0-based index of the shard in the cluster (BEWARE, this is an experimental feature) |
{replica} | | | | X | replica name |
{replicaID} | | | | X | short hashed replica name (BEWARE, this is an experimental feature) |
{replicaIndex} | | | | X | 0-based index of the replica in the shard (BEWARE, this is an experimental feature) |
.spec.templates.serviceTemplates Example
templates:
serviceTemplates:
- name: chi-service-template
# generateName understands different sets of macroses,
# depending on the level of the object, for which Service is being created:
#
# For CHI-level Service:
# 1. {chi} - ClickHouseInstallation name
# 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
#
# For Cluster-level Service:
# 1. {chi} - ClickHouseInstallation name
# 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
# 3. {cluster} - cluster name
# 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
# 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
#
# For Shard-level Service:
# 1. {chi} - ClickHouseInstallation name
# 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
# 3. {cluster} - cluster name
# 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
# 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
# 6. {shard} - shard name
# 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
# 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
#
# For Replica-level Service:
# 1. {chi} - ClickHouseInstallation name
# 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
# 3. {cluster} - cluster name
# 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
# 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
# 6. {shard} - shard name
# 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
# 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
# 9. {replica} - replica name
# 10. {replicaID} - short hashed replica name (BEWARE, this is an experimental feature)
# 11. {replicaIndex} - 0-based index of the replica in the shard (BEWARE, this is an experimental feature)
generateName: "service-{chi}"
# type ObjectMeta struct from k8s.io/meta/v1
metadata:
labels:
custom.label: "custom.value"
annotations:
cloud.google.com/load-balancer-type: "Internal"
service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
# type ServiceSpec struct from k8s.io/core/v1
spec:
ports:
- name: http
port: 8123
- name: client
port: 9000
type: LoadBalancer
.spec.templates.volumeClaimTemplates
.spec.templates.volumeClaimTemplates
defines the PersistentVolumeClaims
. For more information, see the Kubernetes PersistentVolumeClaim page.
.spec.templates.volumeClaimTemplates Example
templates:
volumeClaimTemplates:
- name: default-volume-claim
# type PersistentVolumeClaimSpec struct from k8s.io/core/v1
spec:
# 1. If storageClassName is not specified, default StorageClass
# (must be specified by cluster administrator) would be used for provisioning
# 2. If storageClassName is set to an empty string (‘’), no storage class will be used
# dynamic provisioning is disabled for this PVC. Existing, “Available”, PVs
# (that do not have a specified storageClassName) will be considered for binding to the PVC
#storageClassName: gold
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
.spec.templates.podTemplates
.spec.templates.podTemplates
defines the Pod Templates. For more information, see the Kubernetes Pod Templates.
The following additional sections have been defined for the ClickHouse cluster:
zone
distribution
zone
and distribution
together define zoned layout of ClickHouse instances over nodes. These ensure that the affinity.nodeAffinity
and affinity.podAntiAffinity
are set.
.spec.templates.podTemplates Example
To place ClickHouse instances in the AWS us-east-1a availability zone with one ClickHouse per host:
zone:
values:
- "us-east-1a"
distribution: "OnePerHost"
To place ClickHouse instances on nodes labeled as clickhouse=allow
with one ClickHouse per host:
zone:
key: "clickhouse"
values:
- "allow"
distribution: "OnePerHost"
Or the distribution
can be Unspecified
:
templates:
podTemplates:
# multiple pod templates makes possible to update version smoothly
# pod template for ClickHouse v18.16.1
- name: clickhouse-v18.16.1
# We may need to label nodes with clickhouse=allow label for this example to run
# See ./label_nodes.sh for this purpose
zone:
key: "clickhouse"
values:
- "allow"
# Shortcut version for AWS installations
#zone:
# values:
# - "us-east-1a"
# Possible values for distribution are:
# Unspecified
# OnePerHost
distribution: "Unspecified"
# type PodSpec struct {} from k8s.io/core/v1
spec:
containers:
- name: clickhouse
image: yandex/clickhouse-server:18.16.1
volumeMounts:
- name: default-volume-claim
mountPath: /var/lib/clickhouse
resources:
requests:
memory: "64Mi"
cpu: "100m"
limits:
memory: "64Mi"
cpu: "100m"
References
3.3.3 - Resources
The Altinity Kubernetes Operator creates the following resources on installation to support its functions:
- Custom Resource Definition
- Service account
- Cluster Role Binding
- Deployment
Custom Resource Definition
The Kubernetes API is extended with the new Custom Resource Definition kind: ClickHouseInstallation.
To check the Custom Resource Definition:
kubectl get customresourcedefinitions
Expected result:
NAME CREATED AT
clickhouseinstallations.clickhouse.altinity.com 2022-02-09T17:20:39Z
clickhouseinstallationtemplates.clickhouse.altinity.com 2022-02-09T17:20:39Z
clickhouseoperatorconfigurations.clickhouse.altinity.com 2022-02-09T17:20:39Z
Service Account
The new Service Account clickhouse-operator allows services running from within Pods to be authenticated with the apiserver as the clickhouse-operator Service Account.
To check the Service Account:
kubectl get serviceaccounts -n kube-system
Expected result
NAME SECRETS AGE
attachdetach-controller 1 23d
bootstrap-signer 1 23d
certificate-controller 1 23d
clickhouse-operator 1 5s
clusterrole-aggregation-controller 1 23d
coredns 1 23d
cronjob-controller 1 23d
daemon-set-controller 1 23d
default 1 23d
deployment-controller 1 23d
disruption-controller 1 23d
endpoint-controller 1 23d
endpointslice-controller 1 23d
endpointslicemirroring-controller 1 23d
ephemeral-volume-controller 1 23d
expand-controller 1 23d
generic-garbage-collector 1 23d
horizontal-pod-autoscaler 1 23d
job-controller 1 23d
kube-proxy 1 23d
namespace-controller 1 23d
node-controller 1 23d
persistent-volume-binder 1 23d
pod-garbage-collector 1 23d
pv-protection-controller 1 23d
pvc-protection-controller 1 23d
replicaset-controller 1 23d
replication-controller 1 23d
resourcequota-controller 1 23d
root-ca-cert-publisher 1 23d
service-account-controller 1 23d
service-controller 1 23d
statefulset-controller 1 23d
storage-provisioner 1 23d
token-cleaner 1 23d
ttl-after-finished-controller 1 23d
ttl-controller 1 23d
Cluster Role Binding
The Cluster Role Binding clickhouse-operator-kube-system grants permissions defined in a role to a set of users.
Roles are granted to users, groups, or service accounts. These permissions are granted cluster-wide with ClusterRoleBinding.
To check the Cluster Role Binding:
kubectl get clusterrolebinding
Expected result
NAME ROLE AGE
clickhouse-operator-kube-system ClusterRole/clickhouse-operator-kube-system 5s
cluster-admin ClusterRole/cluster-admin 23d
kubeadm:get-nodes ClusterRole/kubeadm:get-nodes 23d
kubeadm:kubelet-bootstrap ClusterRole/system:node-bootstrapper 23d
kubeadm:node-autoapprove-bootstrap ClusterRole/system:certificates.k8s.io:certificatesigningrequests:nodeclient 23d
kubeadm:node-autoapprove-certificate-rotation ClusterRole/system:certificates.k8s.io:certificatesigningrequests:selfnodeclient 23d
kubeadm:node-proxier ClusterRole/system:node-proxier 23d
minikube-rbac ClusterRole/cluster-admin 23d
storage-provisioner ClusterRole/system:persistent-volume-provisioner 23d
system:basic-user ClusterRole/system:basic-user 23d
system:controller:attachdetach-controller ClusterRole/system:controller:attachdetach-controller 23d
system:controller:certificate-controller ClusterRole/system:controller:certificate-controller 23d
system:controller:clusterrole-aggregation-controller ClusterRole/system:controller:clusterrole-aggregation-controller 23d
system:controller:cronjob-controller ClusterRole/system:controller:cronjob-controller 23d
system:controller:daemon-set-controller ClusterRole/system:controller:daemon-set-controller 23d
system:controller:deployment-controller ClusterRole/system:controller:deployment-controller 23d
system:controller:disruption-controller ClusterRole/system:controller:disruption-controller 23d
system:controller:endpoint-controller ClusterRole/system:controller:endpoint-controller 23d
system:controller:endpointslice-controller ClusterRole/system:controller:endpointslice-controller 23d
system:controller:endpointslicemirroring-controller ClusterRole/system:controller:endpointslicemirroring-controller 23d
system:controller:ephemeral-volume-controller ClusterRole/system:controller:ephemeral-volume-controller 23d
system:controller:expand-controller ClusterRole/system:controller:expand-controller 23d
system:controller:generic-garbage-collector ClusterRole/system:controller:generic-garbage-collector 23d
system:controller:horizontal-pod-autoscaler ClusterRole/system:controller:horizontal-pod-autoscaler 23d
system:controller:job-controller ClusterRole/system:controller:job-controller 23d
system:controller:namespace-controller ClusterRole/system:controller:namespace-controller 23d
system:controller:node-controller ClusterRole/system:controller:node-controller 23d
system:controller:persistent-volume-binder ClusterRole/system:controller:persistent-volume-binder 23d
system:controller:pod-garbage-collector ClusterRole/system:controller:pod-garbage-collector 23d
system:controller:pv-protection-controller ClusterRole/system:controller:pv-protection-controller 23d
system:controller:pvc-protection-controller ClusterRole/system:controller:pvc-protection-controller 23d
system:controller:replicaset-controller ClusterRole/system:controller:replicaset-controller 23d
system:controller:replication-controller ClusterRole/system:controller:replication-controller 23d
system:controller:resourcequota-controller ClusterRole/system:controller:resourcequota-controller 23d
system:controller:root-ca-cert-publisher ClusterRole/system:controller:root-ca-cert-publisher 23d
system:controller:route-controller ClusterRole/system:controller:route-controller 23d
system:controller:service-account-controller ClusterRole/system:controller:service-account-controller 23d
system:controller:service-controller ClusterRole/system:controller:service-controller 23d
system:controller:statefulset-controller ClusterRole/system:controller:statefulset-controller 23d
system:controller:ttl-after-finished-controller ClusterRole/system:controller:ttl-after-finished-controller 23d
system:controller:ttl-controller ClusterRole/system:controller:ttl-controller 23d
system:coredns ClusterRole/system:coredns 23d
system:discovery ClusterRole/system:discovery 23d
system:kube-controller-manager ClusterRole/system:kube-controller-manager 23d
system:kube-dns ClusterRole/system:kube-dns 23d
system:kube-scheduler ClusterRole/system:kube-scheduler 23d
system:monitoring ClusterRole/system:monitoring 23d
system:node ClusterRole/system:node 23d
system:node-proxier ClusterRole/system:node-proxier 23d
system:public-info-viewer ClusterRole/system:public-info-viewer 23d
system:service-account-issuer-discovery ClusterRole/system:service-account-issuer-discovery 23d
system:volume-scheduler ClusterRole/system:volume-scheduler 23d
Cluster Role Binding Example
As an example, the role cluster-admin
is granted to a service account clickhouse-operator
:
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: clickhouse-operator
namespace: kube-system
Deployment
The Deployment clickhouse-operator
runs in the kube-system
namespace.
To check the Deployment:
kubectl get deployments --namespace kube-system
Expected result
NAME READY UP-TO-DATE AVAILABLE AGE
clickhouse-operator 1/1 1 1 5s
coredns 1/1 1 1 23d
References
3.3.4 - Networking Connection Guides
Organizations can connect their clickhouse-operator based ClickHouse cluster to their network in different ways depending on their environment. The following guides assist users in setting up those connections.
3.3.4.1 - MiniKube Networking Connection Guide
Organizations that have set up the Altinity Kubernetes Operator using minikube can connect it to an external network through the following steps.
Prerequisites
The following guide is based on an installed Altinity Kubernetes Operator cluster using minikube
for an Ubuntu Linux operating system.
- For instructions on setting up minikube for Ubuntu 20.04, see the Install minikube for Linux guide.
- For instructions on installing the Altinity Kubernetes Operator, see the Altinity Kubernetes Operator Quick Start Guide or the Altinity Kubernetes Operator install guides.
Network Connection Guide
The proper way to connect to the ClickHouse cluster is through the LoadBalancer created during the ClickHouse cluster creation process. For example, the following ClickHouse cluster has 2 shards in one replica, applied to the namespace test:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "demo-01"
spec:
configuration:
clusters:
- name: "demo-01"
layout:
shardsCount: 2
replicasCount: 1
This generates the following services in the namespace test
:
kubectl get service -n test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chi-demo-01-demo-01-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 22s
chi-demo-01-demo-01-1-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 5s
clickhouse-demo-01 LoadBalancer 10.96.67.44 <pending> 8123:32766/TCP,9000:31368/TCP 38s
The LoadBalancer alternates which of the ClickHouse shards to connect to, and is where all ClickHouse clients should connect.
To open a connection from external networks to the LoadBalancer, use the kubectl port-forward
command in the following format:
kubectl port-forward service/{LoadBalancer Service} -n {NAMESPACE} --address={IP ADDRESS} {TARGET PORT}:{INTERNAL PORT}
Replacing the following:
- LoadBalancer Service: the LoadBalancer service to connect external ports to the Kubernetes environment.
- NAMESPACE: The namespace for the LoadBalancer.
- IP ADDRESS: The IP address to bind the service to on the machine running minikube, or 0.0.0.0 to bind all IP addresses on the minikube server to the specified port.
- TARGET PORT: The external port that users will connect to.
- INTERNAL PORT: The port within the Altinity Kubernetes Operator network.
The kubectl port-forward
command must be kept running in the terminal, or placed into the background with the &
operator.
In the example above, the following settings will be used to bind all IP addresses on the minikube
server to the service clickhouse-demo-01
for ports 9000
and 8123
in the background:
kubectl port-forward service/clickhouse-demo-01 -n test --address=0.0.0.0 9000:9000 8123:8123 &
To test the connection, connect to the external IP address via curl
. For ClickHouse HTTP, OK
will be returned, while for port 9000
a notice requesting use of port 8123
will be displayed:
curl http://localhost:9000
Handling connection for 9000
Port 9000 is for clickhouse-client program
You must use port 8123 for HTTP.
curl http://localhost:8123
Handling connection for 8123
Ok.
Once verified, connect to the ClickHouse cluster via either HTTP or ClickHouse TCP as needed.
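For example, with the port-forward above still running, the forwarded ClickHouse TCP port can be reached with clickhouse-client if it is installed on the local machine:
clickhouse-client --host localhost --port 9000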
3.3.5 - Storage Guide
Altinity Kubernetes Operator users have different options regarding persistent storage depending on their environment and situation. The following guides detail how to set up persistent storage for local and cloud storage environments.
3.3.5.1 - Persistent Storage Overview
Users setting up storage in their local environments can establish persistent volumes in different formats based on their requirements.
Allocating Space
Space is allocated through the Kubernetes PersistentVolume
object. ClickHouse clusters established with the Altinity Kubernetes Operator then use the PersistentVolumeClaim
to receive persistent storage.
The PersistentVolume
can be set in one of two ways:
- Manually: Manual allocations set the storage area before the ClickHouse cluster is created. Space is then requested through a PersistentVolumeClaim when the ClickHouse cluster is created.
- Dynamically: Space is allocated when the ClickHouse cluster is created through the PersistentVolumeClaim, and the Kubernetes controlling software manages the process for the user.
For more information on how persistent volumes are managed in Kubernetes, see the Kubernetes documentation Persistent Volumes.
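For the manual option, a PersistentVolume can be declared ahead of time and applied with kubectl apply. The following is only a minimal sketch: hostPath volumes are suitable for single-node test environments only, and the name, storage class, and path shown here are hypothetical:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: clickhouse-manual-pv
spec:
  capacity:
    storage: 500Mi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /mnt/clickhouse-data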
Storage Types
Data is stored for ClickHouse clusters in the following ways:
No Persistent Storage
If no persistent storage claim template is specified, then no persistent storage will be allocated. When Kubernetes is stopped or a new manifest applied, all previous data will be lost.
In this example, two shards are specified but have no persistent storage allocated:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "no-persistent"
spec:
configuration:
clusters:
- name: "no-persistent"
layout:
shardsCount: 2
replicasCount: 1
When applied to the namespace test
, no persistent storage is found:
kubectl -n test get pv
No resources found
Cluster Wide Storage
If neither the dataVolumeClaimTemplate nor the logVolumeClaimTemplate is specified (see below), then all data is stored under the requested volumeClaimTemplate. This includes all information stored in each pod.
In this example, two shards are specified with one volume of storage that is used by each entire pod:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "cluster-storage"
spec:
configuration:
clusters:
- name: "cluster-storage"
layout:
shardsCount: 2
replicasCount: 1
templates:
volumeClaimTemplate: cluster-storage-vc-template
templates:
volumeClaimTemplates:
- name: cluster-storage-vc-template
spec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 500Mi
When applied to the namespace test, the following persistent volumes are found. Note that each pod has 500Mi of storage:
kubectl -n test get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-6e70c36a-f170-47b5-93a6-88175c62b8fe 500Mi RWO Delete Bound test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-1-0-0 standard 21s
pvc-ca002bc4-0ad2-4358-9546-0298eb8b2152 500Mi RWO Delete Bound test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-0-0-0 standard 39s
Cluster Wide Split Storage
Applying the dataVolumeClaimTemplate
and logVolumeClaimTemplate
template types to the Altinity Kubernetes Operator controlled ClickHouse cluster allows for specific data from each ClickHouse pod to be stored in a particular persistent volume:
- dataVolumeClaimTemplate: Sets the storage volume for the ClickHouse node data. In a traditional ClickHouse server environment, this would be allocated to
/var/lib/clickhouse
. - logVolumeClaimTemplate: Sets the storage volume for ClickHouse node log files. In a traditional ClickHouse server environment, this would be allocated to
/var/log/clickhouse-server
.
This allows different storage capacities for log data versus ClickHouse database data, as well as only capturing specific data rather than the entire pod.
In this example, two shards have different storage capacity for dataVolumeClaimTemplate
and logVolumeClaimTemplate
:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "cluster-split-storage"
spec:
configuration:
clusters:
- name: "cluster-split"
layout:
shardsCount: 2
replicasCount: 1
templates:
dataVolumeClaimTemplate: data-volume-template
logVolumeClaimTemplate: log-volume-template
templates:
volumeClaimTemplates:
- name: data-volume-template
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 500Mi
- name: log-volume-template
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Mi
In this case, retrieving the PersistentVolume allocations shows two storage volumes per pod based on the specifications in the manifest:
kubectl -n test get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-0b02c5ba-7ca1-4578-b3d9-ff8bb67ad412 100Mi RWO Delete Bound test/log-volume-template-chi-cluster-split-storage-cluster-split-1-0-0 standard 21s
pvc-4095b3c0-f550-4213-aa53-a08bade7c62c 100Mi RWO Delete Bound test/log-volume-template-chi-cluster-split-storage-cluster-split-0-0-0 standard 40s
pvc-71384670-c9db-4249-ae7e-4c5f1c33e0fc 500Mi RWO Delete Bound test/data-volume-template-chi-cluster-split-storage-cluster-split-1-0-0 standard 21s
pvc-9e3fb3fa-faf3-4a0e-9465-8da556cb9eec 500Mi RWO Delete Bound test/data-volume-template-chi-cluster-split-storage-cluster-split-0-0-0 standard 40s
Pod Mount Based Storage
PersistentVolume
objects can be mounted directly into the pod’s mountPath
. Any other data is not stored when the container is stopped unless it is covered by another PersistentVolumeClaim
.
In the following example, each of the 2 shards in the ClickHouse cluster has the volumes tied to specific mount points:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "pod-split-storage"
spec:
configuration:
clusters:
- name: "pod-split"
# Templates are specified for this cluster explicitly
templates:
podTemplate: pod-template-with-volumes
layout:
shardsCount: 2
replicasCount: 1
templates:
podTemplates:
- name: pod-template-with-volumes
spec:
containers:
- name: clickhouse
image: yandex/clickhouse-server:21.8
volumeMounts:
- name: data-storage-vc-template
mountPath: /var/lib/clickhouse
- name: log-storage-vc-template
mountPath: /var/log/clickhouse-server
volumeClaimTemplates:
- name: data-storage-vc-template
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 500Mi
- name: log-storage-vc-template
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Mi
kubectl -n test get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-37be9f84-7ba5-404e-8299-e95a291014a8 500Mi RWO Delete Bound test/data-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0 standard 24s
pvc-5b2f8694-326d-41cb-94ec-559725947b45 100Mi RWO Delete Bound test/log-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0 standard 24s
pvc-84768e78-e44e-4295-8355-208b07330707 500Mi RWO Delete Bound test/data-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0 standard 43s
pvc-9e123af7-01ce-4ab8-9450-d8ca32b1e3a6 100Mi RWO Delete Bound test/log-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0 standard 43s
4 - Operations Guide
The methods to make your ClickHouse environment successful.
4.1 - Security
Keep your ClickHouse cluster and data safe and secure from intruders.
4.1.1 - Hardening Guide
ClickHouse is known for its ability to scale with clusters, handle terabytes to petabytes of data, and return query results fast. It also has a plethora of built-in security options and features that help keep that data safe from unauthorized users.
Hardening your individual ClickHouse system will depend on the situation, but the following processes are generally applicable in any environment. Each of these can be handled separately, and do not require being performed in any particular order.
4.1.1.1 - User Hardening
Increasing ClickHouse security at the user level involves the following major steps:
-
User Configuration: Setup secure default users, roles and permissions through configuration or SQL.
-
User Network Settings: Limit communications by hostname or IP address
-
Secure Password: Store user information as hashed values.
-
Set Quotas: Limit how many resources users can use in given intervals.
-
Use Profiles: Use profiles to set common security settings across multiple accounts.
-
Database Restrictions: Narrow the databases, tables and rows that a user can access.
-
Enable Remote Authentication: Enable LDAP authentication or Kerberos authentication to prevent storing hashed password information, and enforce password standards.
-
IMPORTANT NOTE: Configuration settings can be stored in the default
/etc/clickhouse-server/config.xml
file. However, this file can be overwritten during vendor upgrades. To preserve configuration settings it is recommended to store them in/etc/clickhouse-server/config.d
as separate XML files.
User Configuration
The hardening steps to apply to users are:
- Restrict user access only to the specific host names or IP addresses when possible.
- Store all passwords in SHA256 format.
- Set quotas on user resources when possible.
- Use profiles to set similar properties across multiple users, and restrict users to the lowest resources required.
- Offload user authentication through LDAP or Kerberos.
Users can be configured through the XML based settings files, or through SQL based commands.
Detailed information on ClickHouse user configurations can be found on the ClickHouse.Tech documentation site for User Settings.
User XML Settings
Users are listed in the users.xml file under the users element. Each element under users is created as a separate user.
It is recommended that when creating users, rather than lumping them all into the users.xml file, they are placed as separate XML files under the directory users.d, typically located in /etc/clickhouse-server/users.d/.
Note that if your ClickHouse environment is to be run as a cluster, then user configuration files must be replicated on each node with the relevant users information. We will discuss how to offload some settings into other systems such as LDAP later in the document.
Also note that ClickHouse user names are case sensitive: John
is different than john
. See the ClickHouse.tech documentation site for full details.
- IMPORTANT NOTE: If no user name is specified when a user attempts to login, then the account named
default
will be used.
For example, the following section will create two users:
- clickhouse_operator: This user has the password
clickhouse_operator_password
stored in a sha256 hash, is assigned the profileclickhouse_operator
, and can access the ClickHouse database from any network host. - John: This user can only access the database from
localhost
, has a basic password ofJohn
and is assigned to thedefault
profile.
<users>
<clickhouse_operator>
<networks>
<ip>127.0.0.1</ip>
<ip>0.0.0.0/0</ip>
<ip>::/0</ip>
</networks>
<password_sha256_hex>716b36073a90c6fe1d445ac1af85f4777c5b7a155cea359961826a030513e448</password_sha256_hex>
<profile>clickhouse_operator</profile>
<quota>default</quota>
</clickhouse_operator>
<John>
<networks>
<ip>127.0.0.1</ip>
</networks>
<password_sha256_hex>73d1b1b1bc1dabfb97f216d897b7968e44b06457920f00f2dc6c1ed3be25ad4c</password_sha256_hex>
<profile>default</profile>
</John>
</users>
User SQL Settings
ClickHouse users can be managed by SQL commands from within ClickHouse. For complete details, see the Clickhouse.tech User Account page.
Access management must be enabled at the user level with the access_management
setting. In this example, Access Management is enabled for the user John:
<users>
<John>
<access_management>1</access_management>
</John>
</users>
The typical process for DCL (Data Control Language) queries is to have one user enabled with access_management, then have the other accounts generated through queries. See the ClickHouse.tech Access Control and Account Management page for more details.
Once enabled, Access Management settings can be managed through SQL queries. For example, to create a new user called newJohn with their password set as a sha256 hash and restricted to a specific IP address subnet, the following SQL command can be used:
CREATE USER IF NOT EXISTS newJohn
IDENTIFIED WITH SHA256_PASSWORD BY 'secret'
HOST IP '192.168.128.1/24' SETTINGS readonly=1;
Access Management through SQL commands includes the ability to:
- Set roles
- Apply policies to users
- Set user quotas
- Restrict user access to databases, tables, or specific rows within tables.
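For example, once access management is enabled, database-level restrictions such as those above can be applied with standard GRANT statements. The database name mydb below is hypothetical, while newJohn is the user created in the earlier example:
GRANT SELECT ON mydb.* TO newJohn;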
User Network Settings
Users can have their access to the ClickHouse environment restricted by the network they are accessing the network from. Users can be restricted to only connect from:
- IP: IP address or netmask. For all IP addresses, use 0.0.0.0/0 for IPv4 and ::/0 for IPv6.
- Host: The DNS resolved hostname the user is connecting from.
- Host Regexp (Regular Expression): A regular expression of the hostname.
Accounts should be restricted to the networks that they connect from when possible.
User Network SQL Settings
User access from specific networks can be set through SQL commands. For complete details, see the Clickhouse.tech Create User page.
Network access is controlled through the HOST
option when creating or altering users. Host options include:
- ANY (default): Users can connect from any location
- LOCAL: Users can only connect locally.
- IP: A specific IP address or subnet.
- NAME: A specific FQDN (Fully Qualified Domain Name)
- REGEX: Filters hosts that match a regular expression.
- LIKE: Filters hosts by the LIKE operator.
For example, to restrict the user john to only connect from the local subnet of ‘192.168.0.0/16’:
ALTER USER john
HOST IP '192.168.0.0/16';
Or to restrict this user to only connecting from the specific host names awesomeplace1.com
, awesomeplace2.com
, etc:
ALTER USER john
HOST REGEXP 'awesomeplace[12345].com';
User Network XML Settings
User network settings are stored under the user configuration files /etc/clickhouse-server/config.d
with the <networks>
element controlling the sources that the user can connect from through the following settings:
<ip>
: IP Address or subnet mask.<host>
: Hostname.<host_regexp>
: Regular expression of the host name.
For example, the following will allow connections only from localhost:
<networks>
<ip>127.0.0.1</ip>
</networks>
The following will restrict the user to connecting only from the host example.com or from supercool1.com, supercool2.com, etc.:
<networks>
<host>example.com</host>
<host_regexp>supercool[1234].com</host_regexp>
</networks>
If there are hosts or other settings that are applied across multiple accounts, one option is to use the Substitution feature as detailed in the ClickHouse.tech Configuration Files page. For example, in the /etc/metrika.xml file used for substitutions, a local_networks element can be made:
<local_networks>
<ip>192.168.1.0/24</ip>
</local_networks>
This can then be applied to one or more users with the incl attribute when specifying their network access:
<networks incl="local_networks" replace="replace">
</networks>
Secure Password
Passwords can be stored in plaintext or SHA256 (hex format).
SHA256 format passwords are labeled with the <password_sha256_hex> element. A SHA256 password hash can be generated with either of the following commands:
echo -n "secret" | sha256sum | tr -d '-'
OR:
echo -n "secret" | shasum -a 256 | tr -d '-'
- IMPORTANT NOTE: The -n option prevents echo from appending a newline, which would otherwise change the hash.
For example:
echo -n "clickhouse_operator_password" | shasum -a 256 | tr -d '-'
716b36073a90c6fe1d445ac1af85f4777c5b7a155cea359961826a030513e448
Secure Password SQL Settings
Passwords can be set when using the CREATE USER
OR ALTER USER
with the IDENTIFIED WITH
option. For complete details, see the ClickHouse.tech Create User page. The following secure password options are available:
- sha256_password BY ‘STRING’: Converts the submitted STRING value to a SHA256 hash.
- sha256_hash BY ‘HASH’ (best option): Stores the submitted HASH directly as the SHA256 password hash.
- double_sha1_password BY ‘STRING’ (only used when allowing logins through mysql_port): Converts the submitted STRING value to a double SHA1 hash.
- double_sha1_hash BY ‘HASH’ (only used when allowing logins through mysql_port): Stores the submitted HASH directly as the double SHA1 password hash.
For example, to store the sha256 hashed value of “password” for the user John:
ALTER USER John IDENTIFIED WITH sha256_hash BY '5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8';
Secure Password XML Settings
Passwords can be set as part of the user’s settings in the user configuration files in /etc/clickhouse-server/config.d. For complete details, see the ClickHouse.tech User Settings page.
To set a user’s password with a sha256 hash, use the password_sha256_hex
branch for the user. For example, to set the sha256 hashed value of “password” for the user John:
<users>
<John>
<password_sha256_hex>5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8</password_sha256_hex>
</John>
</users>
Set Quotas
Quotas set how many resources can be accessed in a given time, limiting a user’s ability to tie up resources in the system. More details can be found on the ClickHouse.tech Quotas page.
Quota SQL Settings
Quotas can be created or altered through SQL queries, then applied to users.
For more information on ClickHouse quotas, see the ClickHouse.tech Access Control page on Quotas.
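For example, a quota equivalent to the limited quota shown in the XML section below could be created and assigned with SQL along these lines; the quota name and the target user John are illustrative:
CREATE QUOTA IF NOT EXISTS limited
FOR INTERVAL 1 hour MAX queries = 1000,
FOR INTERVAL 1 day MAX queries = 10000
TO John;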
Quota XML Settings
These are defined in the users.xml
file under the element quotas
. Each branch of the quota element is the name of the quota being defined.
Quotas are set by intervals, each of which can have different restrictions. For example, this quota named limited has one interval that allows a maximum of 1000 queries per hour, and another interval that allows a total of 10000 queries over a 24 hour period.
<quotas>
<limited>
<interval>
<duration>3600</duration>
<queries>1000</queries>
</interval>
<interval>
<duration>86400</duration>
<queries>10000</queries>
</interval>
</limited>
</quotas>
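To assign this quota to a user through the XML configuration, reference it by name with the quota element in that user’s settings. A minimal sketch, reusing the user John from earlier examples:
<users>
<John>
<quota>limited</quota>
</John>
</users>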
Use Profiles
Profiles are named collections of settings that can be applied to multiple users at once. More details on Settings Profiles are available on the ClickHouse.tech site.
Profile XML Settings
Profiles are applied to a user with the profile element. For example, this assigns the restricted
profile to the user John
:
<users>
<John>
<networks>
<ip>127.0.0.1</ip>
<ip>0.0.0.0/0</ip>
<ip>::/0</ip>
</networks>
<password_sha256_hex>716b36073a90c6fe1d445ac1af85f4777c5b7a155cea359961826a030513e448</password_sha256_hex>
<profile>restricted</profile>
</John>
</users>
Profiles are set in the users.xml file under the profiles element. Each branch of this element is the name of a profile. The profile restricted shown here only allows for eight threads to be used at a time for users with this profile:
<profiles>
<restricted>
<!-- The maximum number of threads when running a single query. -->
<max_threads>8</max_threads>
</restricted>
</profiles>
Recommended profile settings include the following:
- readonly: Sets the profile so it can be applied to users but not changed by them.
- max_execution_time: Limits the amount of time a process will run before being forced to time out.
- max_bytes_before_external_group_by: Maximum RAM allocated for a single GROUP BY sort.
- max_bytes_before_external_sort: Maximum RAM allocated for sort commands.
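A sketch of the restricted profile extended with these settings is shown below; the values (a 300 second execution limit and roughly 5 GB spill thresholds) are illustrative rather than recommendations:
<profiles>
<restricted>
<readonly>1</readonly>
<max_execution_time>300</max_execution_time>
<max_bytes_before_external_group_by>5000000000</max_bytes_before_external_group_by>
<max_bytes_before_external_sort>5000000000</max_bytes_before_external_sort>
</restricted>
</profiles>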
Database Restrictions
Restrict users to the databases they need, and when possible only the tables or rows within tables that they require access to.
Full details are found on the ClickHouse.tech User Settings documentation.
Database Restrictions XML Settings
To restrict a user’s access by data in the XML file:
- Update user configuration files in /etc/clickhouse-server/config.d or update their permissions through SQL queries.
- For each user to update, add the <databases> element with the following branches:
  - The name of the database to allow access to.
  - Within the database, the table names allowed to the user.
  - Within the table, add a <filter> to match rows that fit the filter.
Database Restrictions XML Settings Example
The following restricts the user John to only access the database sales, and from there only the table clients where salesman = 'John':
<John>
<databases>
<sales>
<clients>
<filter>salesman = 'John'</filter>
</clients>
</sales>
</databases>
</John>
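The same restriction can also be expressed through SQL-based access control when Access Management is enabled. A minimal sketch, in which the row policy name john_rows is hypothetical:
GRANT SELECT ON sales.clients TO John;
CREATE ROW POLICY IF NOT EXISTS john_rows ON sales.clients FOR SELECT USING salesman = 'John' TO John;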
Enable Remote Authentication
One issue with user settings is that in a cluster environment, each node requires a separate copy of the user configuration files, which includes a copy of the SHA256-hashed password.
One method of reducing the exposure of user passwords, even in a hashed format in a restricted section of the file system, is to use external authentication sources. This prevents password data from being stored in local file systems and allows changes to user authentication to be managed from one source.
Enable LDAP
LDAP servers are defined in the ClickHouse configuration settings such as /etc/clickhouse-server/config.d/ldap.xml
. For more details, see the ClickHouse.tech site on Server Configuration settings.
Enabling LDAP server support in ClickHouse allows you to have one authority on login credentials, set password policies, and other essential security considerations through your LDAP server. It also prevents password information being stored on your ClickHouse servers or cluster nodes, even in a SHA256 hashed form.
To add one or more LDAP servers to your ClickHouse environment, each node will require the ldap
settings:
<ldap>
<server>ldapserver_hostname</server>
<roles>
<my_local_role1 />
<my_local_role2 />
</roles>
</ldap>
When creating users, specify the LDAP server for the user:
CREATE USER IF NOT EXISTS newUser
IDENTIFIED WITH ldap BY 'ldapserver_hostname'
HOST ANY;
When the user attempts to authenticate to ClickHouse, their credentials will be verified against the LDAP server specified in the configuration files.
4.1.1.2 - Network Hardening
Hardening the network communications for your ClickHouse environment is about reducing the risk of someone listening in on traffic and using it against you. Network hardening falls under the following major steps: reducing exposure, enabling TLS, and encrypting cluster communications.
- IMPORTANT NOTE: Configuration settings can be stored in the default
/etc/clickhouse-server/config.xml
file. However, this file can be overwritten during vendor upgrades. To preserve configuration settings it is recommended to store them in/etc/clickhouse-server/config.d
as separate XML files with the same root element, typically<yandex>
. For this guide, we will only refer to the configuration files in/etc/clickhouse-server/config.d
for configuration settings.
Reduce Exposure
It’s easier to prevent entry into your system when there are fewer points of access, so unused ports should be disabled.
ClickHouse has native support for MySQL clients, PostgreSQL clients, and others. The enabled ports are set in the /etc/clickhouse-server/config.d files.
To reduce exposure to your ClickHouse environment:
-
Review which ports are required for communication. A complete list of the ports and configurations can be found on the ClickHouse documentation site for Server Settings.
-
Comment out any ports not required in the configuration files. For example, if there’s no need for the MySQL client port, then it can be commented out:
<!-- <mysql_port>9004</mysql_port> -->
Enable TLS
ClickHouse allows for both encrypted and unencrypted network communications. To harden network communications, unencrypted ports should be disabled and TLS enabled.
TLS encryption requires a certificate, and whether to use a public or private Certificate Authority (CA) depends on your needs.
- Public CA: Recommended for external services or connections where you cannot control where they will be connecting from.
- Private CA: Best used when the ClickHouse services are internal only and you can control where hosts are connecting from.
- Self-signed certificate: Only recommended for testing environments.
Whichever method is used, the following files will be required to enable TLS with ClickHouse:
- Server X509 Certificate: Default name
server.crt
- Private Key: Default name
server.key
- Diffie-Hellman parameters: Default name
dhparam.pem
Generate Files
No matter which approach is used, the Private Key and the Diffie-Hellman parameters file will be required. These instructions may need to be modified to match the requirements of the Certificate Authority used. The instructions below require the use of openssl and were tested against OpenSSL 1.1.1j.
-
Generate the private key, and enter the pass phrase when required:
openssl genrsa -aes256 -out server.key 2048
-
Generate dhparam.pem with 4096-bit Diffie-Hellman parameters. This will take some time but only has to be done once:
openssl dhparam -out dhparam.pem 4096
-
Create the Certificate Signing Request (CSR) from the generated private key. Complete the requested information such as Country, etc.
openssl req -new -key server.key -out server.csr
-
Store the files
server.key
,server.csr
, anddhparam.pem
in a secure location, typically/etc/clickhouse-server/
.
Public CA
Retrieving certificates from a Public CA or internal CA is performed by registering with a Public CA such as Let’s Encrypt or Verisign, or with an internal organizational CA service. This process involves:
- Submit the CSR to the CA. The CA will sign the certificate and return it, typically as the file
server.crt
. - Store the file
server.crt
in a secure location, typically/etc/clickhouse-server/
.
Create a Private CA
If you do not have an internal CA or do not need a Public CA, a private CA can be generated through the following process:
-
Create the Certificate Private Key:
openssl genrsa -aes256 -out internalCA.key 2048
-
Create the self-signed root certificate from the certificate key:
openssl req -new -x509 -days 3650 -key internalCA.key -sha256 -extensions v3_ca -out internalCA.crt
-
Store the Certificate Private Key and the self-signed root certificate in a secure location.
-
Sign the
server.csr
file with the self-signed root certificate:
openssl x509 -sha256 -req -in server.csr -CA internalCA.crt -CAkey internalCA.key -CAcreateserial -out server.crt -days 365
-
Store the file
server.crt
, typically/etc/clickhouse-server/
.
Self Signed Certificate
To skip right to making a self-signed certificate, follow these instructions.
- IMPORTANT NOTE: This is not recommended for production systems, only for testing environments.
-
With the
server.key
file from previous steps, create the self-signed certificate. Replace my.host.name
with the actual host name used:
openssl req -subj "/CN=my.host.name" -new -x509 -days 365 -key server.key -out server.crt
-
Store the file
server.crt
, typically/etc/clickhouse-server/
. -
Each
clickhouse-client
user that connects to the server with the self-signed certificate will have to allowinvalidCertificateHandler
by updating theirclickhouse-client
configuration files (for example, /etc/clickhouse-client/config.xml):<config> <openSSL> <client> ... <invalidCertificateHandler> <name>AcceptCertificateHandler</name> </invalidCertificateHandler> </client> </openSSL> </config>
Enable TLS in ClickHouse
Once the files server.crt, server.key, and dhparam.pem have been generated and stored appropriately, update the ClickHouse Server configuration files located at /etc/clickhouse-server/config.d.
To enable TLS and disable unencrypted ports:
-
Review the
/etc/clickhouse-server/config.d
files. Comment out unencrypted ports, includinghttp_port
andtcp_port
:<!-- <http_port>8123</http_port> --> <!-- <tcp_port>9000</tcp_port> -->
-
Enable encrypted ports. A complete list of ports and settings is available on the ClickHouse documentation site for Server Settings. For example:
<https_port>8443</https_port> <tcp_port_secure>9440</tcp_port_secure>
-
Specify the certificate files to use:
<openSSL> <server> <!-- Used for https server AND secure tcp port --> <certificateFile>/etc/clickhouse-server/server.crt</certificateFile> <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile> <dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile> ... </server> ... </openSSL>
Encrypt Cluster Communications
If your organization runs ClickHouse as a cluster, then cluster-to-cluster communications should be encrypted. This includes distributed queries and interservice replication. To harden cluster communications:
-
Create a user for distributed queries. This user should only be able to connect within the cluster, so restrict its IP access to only the subnet or host names used for the network. For example, if the cluster is entirely contained on hosts named logos1, logos2, etc., this internal user can be set with or without a password:
CREATE USER IF NOT EXISTS internal ON CLUSTER 'my_cluster' IDENTIFIED WITH NO_PASSWORD HOST REGEXP '^logos[1234]$'
-
Enable TLS for interservice replication and comment out the unencrypted interserver port by updating the
/etc/clickhouse-server/config.d
files:<!-- <interserver_http_port>9009</interserver_http_port> --> <interserver_https_port>9010</interserver_https_port>
-
Set the
interserver_http_credentials
in the/etc/clickhouse-server/config.d
files, and include the internal username and password:<interserver_http_credentials> <user>internal</user> <password></password> </interserver_http_credentials>
-
Enable TLS for distributed queries by editing the file
/etc/clickhouse-server/config.d/remote_servers.xml
- For ClickHouse 20.10 and later versions, set a shared secret text and set the port to secure for each shard:
<remote_servers> <my_cluster> <shard> <secret>shared secret text</secret> <!-- Update here --> <internal_replication>true</internal_replication> <replica> <host>logos1</host> <!-- Update here --> <port>9440</port> <!-- Secure Port --> <secure>1</secure> <!-- Update here, sets port to secure --> </replica> </shard> ...
- For previous versions of ClickHouse, set the internal user and enable secure communication:
<remote_servers> <my_cluster> <shard> <internal_replication>true</internal_replication> <replica> <host>logos1</host> <!-- Update here --> <port>9440</port> <!-- Secure Port --> <secure>1</secure> <!-- Update here --> <user>internal</user> <!-- Update here --> </replica> ... </shard> ...
4.1.1.3 - Storage Hardening
ClickHouse data is ultimately stored on file systems. Keeping that data protected when it is being used or “at rest” is necessary to prevent unauthorized entities from accessing your organization’s private information.
Hardening stored ClickHouse data is split into the following categories:
- Host-Level Security
- Volume Level Encryption
- Column Level Encryption
- Log File Protection
- IMPORTANT NOTE: Configuration settings can be stored in the default
/etc/clickhouse-server/config.xml
file. However, this file can be overwritten during vendor upgrades. To preserve configuration settings it is recommended to store them in/etc/clickhouse-server/config.d
as separate XML files with the same root element, typically<yandex>
. For this guide, we will only refer to the configuration files in/etc/clickhouse-server/config.d
for configuration settings.
Host-Level Security
File-level access to the files that ClickHouse uses to run should be restricted as much as possible.
- ClickHouse does not require
root
access to the file system, and runs by default as the userclickhouse
. - The following directories should be restricted to the minimum number of users:
  - /etc/clickhouse-server: Used for ClickHouse settings and account credentials created by default.
  - /var/lib/clickhouse: Used for ClickHouse data and new credentials.
  - /var/log/clickhouse-server: Log files that may display privileged information through queries. See Log File Protection for more information.
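As a rough sketch of what this can look like, the following commands tighten ownership and permissions; the clickhouse user and group and the specific modes are assumptions that should be checked against your distribution's packaging defaults:
# Give ownership of the ClickHouse directories to the service account (assumed: clickhouse)
chown -R clickhouse:clickhouse /etc/clickhouse-server /var/lib/clickhouse /var/log/clickhouse-server
# Restrict access to the data, configuration, and log directories (illustrative modes)
chmod 700 /var/lib/clickhouse
chmod 750 /etc/clickhouse-server /var/log/clickhouse-server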
Volume Level Encryption
Encrypting data on the file system prevents unauthorized users who may have gained access to the file system that your ClickHouse database is stored on from being able to access the data itself. Depending on your environment, different encryption options may be required.
Cloud Storage
If your ClickHouse database is stored in a cloud service such as AWS or Azure, verify that the cloud supports encrypting the volume. For example, Amazon AWS provides a method to encrypt new Amazon EBS volumes by default.
The Altinity.Cloud service provides the ability to set the Volume Type to gp2-encrypted. For more details, see the Altinity.Cloud Cluster Settings.
Local Storage
For organizations that host ClickHouse clusters on their own managed systems, LUKS is a recommended solution. Instructions for Linux distributions including Red Hat and Ubuntu are available. Check with your organization's distribution for instructions on how to encrypt those volumes.
Kubernetes Encryption
If your ClickHouse cluster is managed by Kubernetes, the StorageClass used may be encrypted. For more information, see the Kubernetes Storage Class documentation.
Column Level Encryption
Organizations running ClickHouse 20.11 or later can encrypt individual columns with AES functions. For full information, see the ClickHouse.tech Encryption functions documentation.
Applications are responsible for their own keys. Before enabling column level encryption, test to verify that encryption does not negatively impact performance.
The following functions are available:
Function | MySQL AES Compatible |
---|---|
encrypt(mode, plaintext, key, [iv, aad]) | |
decrypt(mode, ciphertext, key, [iv, aad]) | |
aes_encrypt_mysql(mode, plaintext, key, [iv]) | * |
aes_decrypt_mysql(mode, ciphertext, key, [iv]) | * |
Encryption function arguments:
Argument | Description | Type |
---|---|---|
mode | Encryption mode. | String |
plaintext | Text that needs to be encrypted. | String |
key | Encryption key. | String |
iv | Initialization vector. Required for -gcm modes, optional for others. | String |
aad | Additional authenticated data. It isn’t encrypted, but it affects decryption. Works only in -gcm modes; other modes throw an exception. | String |
Column Encryption Examples
This example displays how to encrypt information using a hashed key.
- Takes a hex value, unhexes it, and stores it as key.
- Selects the value, encrypts it with the key, and displays the encrypted value.
WITH unhex('658bb26de6f8a069a3520293a572078f') AS key
SELECT hex(encrypt('aes-128-cbc', 'Hello world', key)) AS encrypted
┌─encrypted────────────────────────┐
│ 46924AC12F4915F2EEF3170B81A1167E │
└──────────────────────────────────┘
This shows how to decrypt encrypted data:
- Takes a hex value, unhexes it, and stores it as key.
- Decrypts the selected value with the key and returns it as text.
WITH unhex('658bb26de6f8a069a3520293a572078f') AS key SELECT decrypt('aes-128-cbc',
unhex('46924AC12F4915F2EEF3170B81A1167E'), key) AS plaintext
┌─plaintext───┐
│ Hello world │
└─────────────┘
Log File Protection
Log files are useful because they show what happened. The problem is when what happened includes sensitive values, such as the encryption key used to encrypt or decrypt data:
2021.01.26 19:11:23.526691 [ 1652 ] {4e196dfa-dd65-4cba-983b-d6bb2c3df7c8}
<Debug> executeQuery: (from [::ffff:127.0.0.1]:54536, using production
parser) WITH unhex('658bb26de6f8a069a3520293a572078f') AS key SELECT
decrypt(???), key) AS plaintext
These queries can be hidden through query masking rules, applying regular expressions to replace commands as required. For more information, see the ClickHouse.tech Server Settings documentation.
To prevent certain queries from appearing in log files or to hide sensitive information:
- Update the configuration files, located by default in /etc/clickhouse-server/config.d.
- Add the element query_masking_rules.
- Set each rule with the following:
  - name: The name of the rule.
  - regexp: The regular expression to search for.
  - replace: The replacement value that matches the rule’s regular expression.
For example, the following will hide encryption and decryption functions in the log file:
<query_masking_rules>
<rule>
<name>hide encrypt/decrypt arguments</name>
<regexp>
((?:aes_)?(?:encrypt|decrypt)(?:_mysql)?)\s*\(\s*(?:'(?:\\'|.)+'|.*?)\s*\)
</regexp>
<!-- or more secure, but also more invasive:
(aes_\w+)\s*\(.*\)
-->
<replace>\1(???)</replace>
</rule>
</query_masking_rules>
4.2 - Care and Feeding of Zookeeper with ClickHouse
ZooKeeper is required for ClickHouse cluster replication. Keeping ZooKeeper properly maintained and fed provides the best performance and reduces the likelihood that your ZooKeeper nodes will become “sick”.
Elements of this guide can also be found on the ClickHouse on Kubernetes Quick Start guide, which details how to use Kubernetes and ZooKeeper with the clickhouse-operator
.
4.2.1 - ZooKeeper Installation and Configuration
Prepare and Start ZooKeeper
Preparation
Before beginning, determine whether ZooKeeper will run in standalone or replicated mode.
- Standalone mode: One zookeeper server to service the entire ClickHouse cluster. Best for evaluation, development, and testing.
- Should never be used for production environments.
- Replicated mode: Multiple zookeeper servers in a group called an ensemble. Replicated mode is recommended for production systems.
- A minimum of 3 zookeeper servers are required.
- 3 servers is the optimal setup; with proper tuning it functions well even on heavily loaded systems.
- 5 servers is less likely to lose quorum entirely, but also results in longer quorum acquisition times.
- Additional servers can be added, but the total should always be an odd number.
Precautions
The following practices should be avoided:
- Never deploy even numbers of ZooKeeper servers in an ensemble.
- Do not install ZooKeeper on ClickHouse nodes.
- Do not share ZooKeeper with other applications like Kafka.
- Place the ZooKeeper dataDir and dataLogDir on fast storage that will not be used for anything else.
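A sketch of the corresponding zoo.cfg entries; the paths are illustrative and should point at dedicated fast volumes:
# /etc/zookeeper/conf/zoo.cfg
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/datalog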
Applications to Install
Install the following applications in your servers:
zookeeper
(3.4.9 or later)netcat
Configure ZooKeeper
-
/etc/zookeeper/conf/myid
The
myid
file consists of a single line containing only the text of that machine’s id. Somyid
of server 1 would contain the text “1” and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255. -
/etc/zookeeper/conf/zoo.cfg
Every machine that is part of the ZooKeeper ensemble should know about every other machine in the ensemble. You accomplish this with a series of lines of the form server.id=host:port:port
# specify all zookeeper servers # The first port is used by followers to connect to the leader # The second one is used for leader election server.1=zookeeper1:2888:3888 server.2=zookeeper2:2888:3888 server.3=zookeeper3:2888:3888
These lines must be the same on every ZooKeeper node
-
/etc/zookeeper/conf/zoo.cfg
These settings MUST be added on every ZooKeeper node:
# The time interval in hours for which the purge task has to be triggered. # Set to a positive integer (1 and above) to enable the auto purging. Defaults to 0. autopurge.purgeInterval=1 autopurge.snapRetainCount=5
Install Zookeeper
Depending on your environment, follow the Apache Zookeeper Getting Started guide, or the Zookeeper Administrator's Guide.
Start ZooKeeper
Depending on your installation, start ZooKeeper with the following command:
sudo -u zookeeper /usr/share/zookeeper/bin/zkServer.sh start
Verify ZooKeeper is Running
Use the following commands to verify ZooKeeper is available:
echo ruok | nc localhost 2181
echo mntr | nc localhost 2181
echo stat | nc localhost 2181
Check the following files and directories to verify ZooKeeper is running and making updates:
- Logs:
/var/log/zookeeper/zookeeper.log
- Snapshots:
/var/lib/zookeeper/version-2/
Connect to ZooKeeper
From the localhost, connect to ZooKeeper with the following command to verify access (replace the IP address with your Zookeeper server):
bin/zkCli.sh -server 127.0.0.1:2181
Tune ZooKeeper
The following optional settings can be used depending on your requirements.
Improve Node Communication Reliability
The following settings can be used to improve node communication reliability:
/etc/zookeeper/conf/zoo.cfg
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between sending a request and getting an acknowledgement
syncLimit=5
Reduce Snapshots
The following settings will create fewer snapshots which may reduce system requirements.
/etc/zookeeper/conf/zoo.cfg
# To avoid seeks ZooKeeper allocates space in the transaction log file in blocks of preAllocSize kilobytes.
# The default block size is 64M. One reason for changing the size of the blocks is to reduce the block size
# if snapshots are taken more often. (Also, see snapCount).
preAllocSize=65536
# ZooKeeper logs transactions to a transaction log. After snapCount transactions are written to a log file a
# snapshot is started and a new transaction log file is started. The default snapCount is 10,000.
snapCount=10000
Documentation
- ZooKeeper Getting Started Guide
- ClickHouse Zookeeper Recommendations
- Running ZooKeeper in Production
Configuring ClickHouse to use ZooKeeper
Once ZooKeeper has been installed and configured, ClickHouse can be modified to use ZooKeeper. After the following steps are completed, a restart of ClickHouse will be required.
To configure ClickHouse to use ZooKeeper, follow the steps shown below. The recommended settings are located on ClickHouse.tech zookeeper server settings.
-
Create a configuration file with the list of ZooKeeper nodes. Best practice is to put the file in
/etc/clickhouse-server/config.d/zookeeper.xml
.<yandex> <zookeeper> <node> <host>example1</host> <port>2181</port> </node> <node> <host>example2</host> <port>2181</port> </node> <session_timeout_ms>30000</session_timeout_ms> <operation_timeout_ms>10000</operation_timeout_ms> <!-- Optional. Chroot suffix. Should exist. --> <root>/path/to/zookeeper/node</root> <!-- Optional. ZooKeeper digest ACL string. --> <identity>user:password</identity> </zookeeper> </yandex>
-
Check the
distributed_ddl
parameter inconfig.xml
. This parameter can be defined in another configuration file, and can change the path to any value that you like. If you have several ClickHouse clusters using the same zookeeper,distributed_ddl
path should be unique for every ClickHouse cluster setup.<!-- Allow to execute distributed DDL queries (CREATE, DROP, ALTER, RENAME) on cluster. --> <!-- Works only if ZooKeeper is enabled. Comment it out if such functionality isn't required. --> <distributed_ddl> <!-- Path in ZooKeeper to queue with DDL queries --> <path>/clickhouse/task_queue/ddl</path> <!-- Settings from this profile will be used to execute DDL queries --> <!-- <profile>default</profile> --> </distributed_ddl>
-
Check
/etc/clickhouse-server/preprocessed/config.xml
. You should see your changes there. -
Restart ClickHouse. Check ClickHouse connection to ZooKeeper detailed in ZooKeeper Monitoring.
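On a systemd-based installation, the restart step might look like the following; adjust to your init system:
sudo systemctl restart clickhouse-server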
Converting Tables to Replicated Tables
Creating a replicated table
Replicated tables use a replicated table engine, for example ReplicatedMergeTree
. The following example shows how to create a simple replicated table.
This example assumes that you have defined appropriate macro values for cluster, shard, and replica in macros.xml
to enable cluster replication using zookeeper. For details consult the ClickHouse.tech Data Replication guide.
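For reference, a minimal macros.xml sketch (for example, /etc/clickhouse-server/config.d/macros.xml) might look like the following; the cluster c1 and shard 0 match the path example shown later, while the replica name is illustrative:
<yandex>
<macros>
<cluster>c1</cluster>
<shard>0</shard>
<replica>replica1</replica>
</macros>
</yandex>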
CREATE TABLE test ON CLUSTER '{cluster}'
(
timestamp DateTime,
contractid UInt32,
userid UInt32
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/default/test', '{replica}')
PARTITION BY toYYYYMM(timestamp)
ORDER BY (contractid, toDate(timestamp), userid)
SAMPLE BY userid;
The ON CLUSTER
clause ensures the table will be created on the nodes of {cluster}
(a macro value). This example automatically creates a ZooKeeper path for each replica table that looks like the following:
/clickhouse/tables/{cluster}/{shard}/default/test
becomes:
/clickhouse/tables/c1/0/default/test
You can see ZooKeeper replication data for this node with the following query (updating the path based on your environment):
SELECT *
FROM system.zookeeper
WHERE path = '/clickhouse/tables/c1/0/default/test'
Removing a replicated table
To remove a replicated table, use DROP TABLE
as shown in the following example. The ON CLUSTER
clause ensures the table will be deleted on all nodes. Omit it to delete the table on only a single node.
DROP TABLE test ON CLUSTER '{cluster}';
As each table is deleted the node is removed from replication and the information for the replica is cleaned up. When no more replicas exist, all ZooKeeper data for the table will be cleared.
Cleaning up ZooKeeper data for replicated tables
- IMPORTANT NOTE: Cleaning up ZooKeeper data manually can corrupt replication if you make a mistake. Raise a support ticket and ask for help if you have any doubt concerning the procedure.
New ClickHouse versions now support SYSTEM DROP REPLICA which is an easier command.
For example:
SYSTEM DROP REPLICA 'replica_name' FROM ZKPATH '/path/to/table/in/zk';
ZooKeeper data for the table might not be cleared fully if there is an error when deleting the table, or the table becomes corrupted, or the replica is lost. You can clean up ZooKeeper data in this case manually using the ZooKeeper rmr command. Here is the procedure:
- Login to ZooKeeper server.
- Run
zkCli.sh
command to connect to the server. - Locate the path to be deleted, e.g.:
ls /clickhouse/tables/c1/0/default/test
- Remove the path recursively, e.g.,
rmr /clickhouse/tables/c1/0/default/test
4.2.2 - ZooKeeper Monitoring
ZooKeeper Monitoring
For organizations that already have Apache ZooKeeper configured either manually, or with a Kubernetes operator such as the clickhouse-operator for Kubernetes, monitoring your ZooKeeper nodes helps you catch issues before they turn into outages.
Checking ClickHouse connection to ZooKeeper
To check connectivity between ClickHouse and ZooKeeper:
-
Confirm that ClickHouse can connect to ZooKeeper. You should be able to query the
system.zookeeper
table, and see the path for distributed DDL created in ZooKeeper through that table. If something went wrong, check the ClickHouse logs.$ clickhouse-client -q "select * from system.zookeeper where path='/clickhouse/task_queue/'" ddl 17183334544 17183334544 2019-02-21 21:18:16 2019-02-21 21:18:16 0 8 0 0 0 8 17183370142 /clickhouse/task_queue/
-
Confirm ZooKeeper accepts connections from ClickHouse. On the ZooKeeper nodes, you can also check whether a connection was established and see the IP address of the ClickHouse server in the list of clients:
$ echo stat | nc localhost 2181 ZooKeeper version: 3.4.9-3--1, built on Wed, 23 May 2018 22:34:43 +0200 Clients: /10.25.171.52:37384