
Altinity Documentation

Your go-to technical source for all things ClickHouse.
Welcome to the Altinity documentation site. Here you will find technical reference documents, quick start guides, best practices, and everything you need to be productive with ClickHouse and Altinity.Cloud.

1 - Altinity.Cloud

Manuals, quick start guides, code samples and tutorials on how to use Altinity.Cloud to launch and get the most out of your ClickHouse clusters.

Altinity.Cloud provides the best experience in managing ClickHouse. Create new clusters with the version of ClickHouse you want, set your node configurations, and get right to work.

1.1 - Altinity.Cloud 101

What is Altinity.Cloud?

Welcome to Altinity.Cloud. In this guide, we will be answering a few simple questions:

  • What is Altinity.Cloud?
  • Why should I use it?
  • How does it work?

What is Altinity.Cloud?

Altinity.Cloud is a fully managed ClickHouse services provider. Altinity.Cloud is the easiest way to set up a ClickHouse cluster with different configurations of shards and replicas, with the version of ClickHouse or Altinity Stable for ClickHouse you want. From one spot you can monitor performance, run queries, upload data from S3 or other cloud stores, and perform other essential operations.

For more details on Altinity.Cloud capabilities, see the Administrator Guide. For a crash course on how to create your own ClickHouse clusters with Altinity.Cloud, see the Altinity.Cloud Quick Start Guide.

What Can I Do with Altinity.Cloud?

Altinity.Cloud lets you create, manage, and monitor ClickHouse clusters with a few simple clicks. Here’s a brief look at the user interface:

Clusters View
  • A: Cluster Creation: Clusters can be created from scratch with Launch Cluster.
  • B: Clusters: Each cluster associated with your Altinity.Cloud account is listed in either tile format, or as a short list. They’ll display a short summary of their health and performance. By selecting a cluster, you can view the full details.
  • C: User and Environment Management:
    • Change to another environment.
    • Manage environments and zookeepers.
    • Update account settings.

Clusters can be spun up and set with the number of replicas and shards, the specific version of ClickHouse that you want to run on them, and what kind of virtual machines to power the nodes.

When your clusters are running you can connect to them with the ClickHouse client, or your favorite applications like Grafana, Kafka, Tableau, and more. See the Altinity.Cloud connectivity guide for more details.

Monitoring

Cluster performance can be monitored in real time through the Cluster Monitor system.

Cluster Monitoring View

Some of the metrics displayed here include:

  • DNS and Distributed Connection Errors: Displays the rate of any connection issues.
  • Select Queries: The number of select queries submitted to the cluster.
  • Zookeeper Transactions: The communications between the zookeeper nodes.
  • ClickHouse Data Size on Disk: The total amount of data the ClickHouse database is using.

How is Altinity.Cloud organized?

Security Tiers

Altinity.Cloud starts at the Organization level - that's your company. When you and members of your team log into Altinity.Cloud, you start here. Depending on their access level, users can then access the different systems within the organization.

The next level down from there are the Environments. Each organization has at least one Environment, and these are used to allow users access to one or more Clusters.

Clusters consist of one or more Nodes - individual containers that run the ClickHouse databases. These nodes are grouped into shards, which are sets of nodes that all work together to improve performance and reliability. Shards can then be set as replicas, where groups of nodes are copied. If one replica goes down, the other replicas can keep running and copy their synced data when the replica is restored or when a new replica is added.

To recap in reverse order:

  • Nodes are individual virtual machines or containers that run ClickHouse.
  • Shards are groups of nodes that work together to improve performance and share data.
  • Replicas are copies of shards that mirror data, so if one replica goes down the others keep going.
  • Clusters are sets of shards and replicas that work together to replicate data and improve performance.
  • Environments group clusters into a set to control access and resources.
  • Organizations have one or more environments that service your company.
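
Once a cluster is running, you can see this layout from ClickHouse itself. As a quick check (run from the cluster's Query tool or any ClickHouse client), the system.clusters table lists which shard and replica each node belongs to; replace your_cluster_name with your own cluster name:

    -- Show each node with its shard and replica number
    SELECT cluster, shard_num, replica_num, host_name
    FROM system.clusters
    WHERE cluster = 'your_cluster_name'
    ORDER BY shard_num, replica_num;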

Altinity.Cloud Access

Altinity.Cloud keeps your users organized in the following roles:

  • orgadmin: These users can create environments and clusters, and assign users in their organization to them.
  • envadmin: These users have control over environments they are assigned to by the orgadmin. They can create clusters and control clusters within these environments.
  • envuser: These users can access the clusters they are specifically assigned to within specific environments.

More details are available in the Account Administration guide.

Where can I find out more?

Altinity provides the following resources to our customers and the Open Source community:

1.2 - Quick Start Guide

The minimal steps to get Altinity.Cloud running with your first cluster.

Welcome to Altinity.Cloud! Altinity.Cloud is the fastest, easiest way to set up, administer and use ClickHouse. Your ClickHouse is fully managed so you can focus on your work.

If this is your first time using Altinity.Cloud, this quick start guide will give you the minimum steps to become familiar with the system. When you’re ready to dig deeper and use the full power of ClickHouse in your Altinity.Cloud environment, check out our Administrator and Developer Guides for in depth knowledge and best practices.

A full PDF version of this document is available through this link: Quick Start Guide PDF

1.2.1 - Account Creation and Login

How to set up your Altinity.Cloud account, and login to the service.

Create an Account

To start your Altinity.Cloud journey, the first thing you need is an account. New users can sign up for a test account on the Altinity.Cloud Test Drive page. Enter your contact information, and an Altinity.Cloud representative will get back to you.

Once finished, the Altinity.Cloud team will reach out to you with your login credentials. Altinity.Cloud uses your email address as your username, and you’ll be assigned an initial password.

Login to Altinity.Cloud

There are two methods to login:

Login with Username and Password

If you’ve used any web site, you’re likely familiar with this process.

To login to Altinity.Cloud with your username and password:

  1. Enter your username - in this case your email address - in the field marked Login.
  2. Enter your Password, then click Sign In.

Login with Auth0

Auth0 allows you to authenticate to Altinity.Cloud through a trusted authentication provider, in this case Google. Once set up, you can click Auth0 to authenticate through your Google account.

  • Requirements: In order to use Auth0, you must have a Google account with the same email address that you use for Altinity.Cloud.

To set up authentication with Auth0 for the first time:

  1. Access the Altinity.Cloud page.
  2. Select Auth0.
  3. Select the Google account to use for authentication.
    IMPORTANT NOTE: The Google account must have the same email address as your Altinity.Cloud account.
  4. Select Continue with Google, and you’ll be in Altinity.Cloud.

After you’ve completed the Auth0 setup process, you can login to Altinity.Cloud by selecting Auth0 from the login page.

1.2.2 - Explore Clusters View

An overview of managing your ClickHouse clusters with Altinity.Cloud.

Explore Clusters View

Once you’ve logged in to Altinity.Cloud, let’s take a moment and familiarize ourselves with the environment. The default page is the Clusters View.

The Clusters View page is separated into the following sections:

Clusters View
  • A: Cluster Creation: Clusters can be created from scratch with Launch Cluster.
  • B: Clusters: Each cluster associated with your Altinity.Cloud account is listed in either tile format, or as a short list. They’ll display a short summary of their health and performance. By selecting a cluster, you can view the full details.
  • C: User and Environment Management:
    • Change to another environment.
    • Manage environments and zookeepers.
    • Update account settings.

1.2.3 - Create Your First Cluster

How to create your first ClickHouse cluster with Altinity.Cloud.

Time to make your first cluster! For this example, we’re creating a minimally sized cluster, but you can rescale it later to the exact size your ClickHouse workload needs.

As of October 21, 2021, Altinity.Cloud supports Google Cloud Platform (GCP) and Amazon Web Services (AWS). For more information, see the Altinity.Cloud Administrator Guide.

To create your first cluster:

  1. From the Clusters View page, select Launch Cluster. This starts the Cluster Launch Wizard.

  2. The first page is Resources Configuration, where we set the name, size and authentication for the new cluster. When finished, click Next. Use the following settings:

    • Name: Cluster names will be used to create the DNS name of the cluster. Therefore, cluster names must follow DNS name restrictions (letters, numbers, and dashes allowed, periods and special characters are not). Cluster names must start with a letter, and should be 15 characters at most.
    • Node Type: Select m5.large. This is the size of the node. This selection gives us a cluster with 2 CPUs and around 7 GB RAM. Recall that we can rescale this cluster later. For more information, see the Administrator Guide.
    • Node Storage: Set to 30 GB. The size of each cluster node in GB (gigabytes). Each node will have the same storage area.
    • Number of Volumes: Set to 1. Network storage can be split into separate volumes. Use more volumes to increase query performance.
    • Volume Type: Select gp2 (Not Encrypted). Volumes can be either encrypted or unencrypted, depending on your security requirements.
    • Number of Shards: Set to 1. A shard represents a set of nodes. Shards can then be replicated to provide increased availability and recovery.
    • ClickHouse Version: Select the most recent Altinity Stable Build. Your ClickHouse cluster can use the version that best meets your needs. Note that all nodes will run the same ClickHouse version.
    • ClickHouse User Name: Auto-set to admin. The default administrative user.
    • ClickHouse User Password and Confirm Password: Set to your security requirements. Both the ClickHouse User Password and Confirm Password must match.
  3. The next page is High Availability Configuration. This is where you can set your replication, Zookeeper, and backup options. Use the following values for your first cluster, then click Next to continue:

    • Data Replication: Set to Enabled. Data replication duplicates data across replica clusters for increased performance and availability.
    • Number of Replicas: Set to 2. Only required if Data Replication is Enabled. Sets the number of replicas for each cluster shard.
    • Zookeeper Configuration: The only option at this time is Dedicated. Apache Zookeeper manages synchronization between the clusters.
    • Zookeeper Node Type: Default is selected by default.
    • Enable Backups: Set to Enabled by default; backups cannot be disabled at this time.
    • Backup Schedule and Number of Backups to keep: Set to Daily and 5; these cannot be changed at this time.
  4. The Connection Configuration page determines how to communicate with your new cluster. Set the following values, then select Next to continue:

    • Endpoint: This is automatically set based on your cluster name. It displays the final DNS name for your cluster endpoint.
    • Use TLS: Set to Enabled. When enabled, communications with your cluster are encrypted with TLS.
    • Load Balancer Type: Select Altinity Edge Ingress.
      IMPORTANT NOTE: This setting requires clients to support SNI (Server Name Indication), which requires the most current ClickHouse client and Python clickhouse-driver. This setting cannot be changed after the cluster is created.
    • Protocols: Restrict communications to the Altinity.Cloud cluster based on your organization's needs. By default, Binary Protocol (port 9440) and HTTP Protocol (port 8443) are enabled.
    • Datadog integration: Not enabled at this time. Stay tuned for future developments.
    • IP restrictions: Restrict IP communications to the cluster to specific IP addresses. For more information, see the Administrator Guide. Leave blank for now.
  5. Last page! Review & Launch lets you double check your settings and see the estimated cluster cost. When you’re ready, click Launch.

It will take a little while before your new cluster is ready, so grab your coffee or tea or other hydrating beverage. When it’s complete, you’ll see your new cluster with all nodes online and all health checks passed.

1.2.4 - Your First Queries

How to use Cluster Explore to run queries, view table structures and available processes.

Once your cluster is created, time to create some tables and do some queries.

For those experienced with ClickHouse, this will be very familiar. For people who are new to ClickHouse, creating tables is very similar to other SQL based databases, with some extra syntax that defines the type of table we’re making. This is part of what gives ClickHouse its speed. For complete details on ClickHouse commands, see the ClickHouse SQL Reference Guide.

Cluster Explore

The Cluster Explore page allows you to run queries, view the schema, and check on processes for your cluster. It’s a quick way to get into your ClickHouse database, run commands, and view your schema straight from your web browser. We’ll be using this to generate our first tables and input some data.

To access Cluster Explore for your cluster, just click Explore for the specific cluster to manage.

For our example, we’re going to create two tables:

  • events_local: This table will use the ReplicatedMergeTree table engine. If you don’t know about table engines, don’t worry about that for now. See the ClickHouse Engines page for complete information.
  • events: This table will be distributed on your cluster with the Distributed table engine.

In our examples, we’ll be using macro variables - these are placed between curly brackets and let us use the same SQL commands on different clusters and environments without having to fill in every detail. Any time you see an entry like {cluster} or {shard} you should recognize those as a macro variable.
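
If you're curious what these macros expand to on your own cluster, you can check the system.macros table right from the Query tab; it lists each macro and its substitution on the node that answers the query:

    -- List the macro values defined on this node
    SELECT macro, substitution FROM system.macros;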

The commands below will create these tables into the default database on your cluster.

Create Tables

To create your first tables:

  1. From the Clusters View select Explore for the cluster to manage.

  2. The Query tab is selected by default. (This may change in future releases.)

  3. For our first table, copy and paste the following into the Query window, then select Execute.

    CREATE TABLE IF NOT EXISTS events_local ON CLUSTER '{cluster}' (
        event_date  Date,
        event_type  Int32,
        article_id  Int32,
        title       String
    ) ENGINE = ReplicatedMergeTree('/clickhouse/{cluster}/tables/{shard}/{database}/{table}', '{replica}')
        PARTITION BY toYYYYMM(event_date)
        ORDER BY (event_type, article_id);
    

    You should see output like the following under Execute confirming the command ran, with docdemo replaced by your cluster name:

    docdemo.demo.beta.altinity.cloud:8443 (query time: 0.342s)
    chi-docdemo-docdemo-0-0	9000 0	 	1	0
    chi-docdemo-docdemo-0-1	9000 0	 	0	0
    
  4. Now let’s create our second table. Back in the Query window, enter the following and select Execute:

    CREATE TABLE events ON CLUSTER '{cluster}' AS events_local
        ENGINE = Distributed('{cluster}', default, events_local, rand())
    

    Once again, you should see confirmation under Execute:

    docdemo.demo.beta.altinity.cloud:8443 (query time: 0.162s)
    chi-docdemo-docdemo-0-0	9000 0	 	1	0
    chi-docdemo-docdemo-0-1	9000 0	 	0	0
    
  5. Now that we have some tables, let’s not leave them empty. Inserting data into a ClickHouse table is very similar to most SQL systems. Let’s Insert our data, then do a quick Select on it. Enter the following Insert command into Query, then select Execute:

    INSERT INTO events VALUES(today(), 1, 13, 'Example');
    

    You’ll see the results confirmed under Execute, just like before.

    OK.
    

    Then enter the following Select command and select Execute again:

    SELECT * FROM events;
    

    The results will look like the following:

    docdemo.demo.beta.altinity.cloud:8443 (query time: 0.018s)
    ┌─event_date─┬─event_type─┬─article_id─┬─title───┐
    │ 2020-11-19 │          1 │         13 │ Example │
    └────────────┴────────────┴────────────┴─────────┘
    

View Schema

The Database Schema shows a graphical view of your cluster’s database, the tables in it, and their structure.

To view your Schema:

  1. From the Clusters View select Explore for the cluster to manage.
  2. Select Schema.

You can expand the databases to display the tables in each database, or select the table to view its details, schema, and some sample rows.

View Processes

To view current actions running on your cluster select Processes. This displays what processes are currently running, what user account they are running under, and allows you to view more details regarding the process.

1.2.5 - Connect Remote Clients

How to install the command line and Python clients, and connect to your Altinity.Cloud ClickHouse Cluster.

Now that we’ve shown how to create a cluster and use ClickHouse SQL queries into your new cluster, let’s connect to it remotely.

For the following, we’re going to be using the clickhouse-client program, but the same process will help you gain access from your favorite client.

Full instructions for installing ClickHouse can be found on the ClickHouse Installation page. We’ll keep this simple and assume you’re using a Linux environment like Ubuntu. For this example, we set up a virtual machine running Ubuntu 20.04.

First, we need to know our connection details for our Altinity.Cloud ClickHouse cluster. To view your connection details:

  1. From the Clusters View, select Connection Details for the cluster to manage.

  2. From here, you can copy and paste the settings for the ClickHouse client from your cluster’s Connection Details. For example:

    clickhouse-client -h yourdataset.yourcluster.altinity.cloud --port 9440 -s --user=admin --password
    

ClickHouse Client

The ClickHouse Client is a command line based program that will be familiar to SQL based users.

Setting Up ClickHouse Client in Linux

If you’ve already set up ClickHouse client, then you can skip this step. These instructions are modified from the Altinity Stable Builds Quick Start Guides to quickly get a ClickHouse client running on your system.

Deb Linux Based Installs

For those who need quick instructions on how to install ClickHouse client in their deb based Linux environment (like Ubuntu), use the following:

  1. Update the apt-get local repository:

    sudo apt-get update
    
  2. Install the Altinity package signing keys:

    sudo sh -c 'mkdir -p /usr/share/keyrings && curl -s https://builds.altinity.cloud/apt-repo/pubkey.gpg | gpg --dearmor > /usr/share/keyrings/altinity-dev-archive-keyring.gpg'
    
  3. Update the apt-get repository to include the Altinity Stable build repository with the following commands:

    sudo sh -c 'echo "deb [signed-by=/usr/share/keyrings/altinity-dev-archive-keyring.gpg] https://builds.altinity.cloud/apt-repo stable main" > /etc/apt/sources.list.d/altinity-dev.list'
    
    sudo apt-get update
    
  4. Install the most current version of the Altinity Stable Build for ClickHouse client with the following:

    sudo apt-get install clickhouse-client
    

macOS Based Installs

For macOS users, the Altinity Stable for ClickHouse client can be installed through Homebrew using the Altinity Homebrew Tap for ClickHouse with the following quick command:

brew install altinity/clickhouse/clickhouse

Connect With ClickHouse Client

If your ClickHouse client is ready, then you can copy and paste your connection settings into your favorite terminal program, and you’ll be connected.

clickhouse-client to Altinity.Cloud demo

ClickHouse Python Driver

Users who prefer Python can use the clickhouse-driver to connect through Python. These instructions are very minimal, and are intended just to get you working in Python with your new Altinity.Cloud cluster.

These instructions are in the bash shell and require the following be installed in your environment:

  • Python 3.7 and above

  • The Python module venv

  • git

  • IMPORTANT NOTE: Install the clickhouse-driver 0.2.0 or above which has support for Server Name Indication (SNI) when using TLS communications.

To connect with the Python clickhouse-driver:

  1. (Optional) Setup your local environment. For example:

    python3 -m venv test 
    . test/bin/activate
    
  2. Install the driver with pip3:

    pip3 install clickhouse-driver
    
  3. Start Python.

  4. Add the client and connection details. The Access Point provides the necessary information to link directly to your cluster.

    AltinityCloud Cluster Connection Details

    Import the clickhouse_driver client and enter the connection settings:

    from clickhouse_driver import Client
    client = Client('<HOSTNAME>', user='admin', password='<PASSWORD>', port=9440, secure='y', verify=False)
    
  5. Run client.execute and submit your query. Let’s just look at the tables from within Python:

    >>> client.execute('SHOW TABLES in default')
    [('events',), ('events_local',)]
    
  6. You can perform selects and inserts as you need. For example, continuing from the tables created in Your First Queries using Cluster Explore:

    >>> result = client.execute('SELECT * FROM default.events')
    >>> print(result)
    [(datetime.date(2020, 11, 23), 1, 13, 'Example')]
    

For more information see the article ClickHouse And Python: Getting To Know The ClickHouse-driver Client.

1.2.6 - Conclusion

Closing instructions for the Altinity.Cloud Quick Start Guide.

There are several ways of running ClickHouse to take advantage of the robust features and speed in your big data applications. Altinity.Cloud makes it easy to start up a cluster the way you need, manage it, and connect to it so you can go from concept to execution in the fastest way possible.

If you have any questions, please feel free to Contact Us at any time.

1.2.7 - FAQ

Frequently Asked Questions

When using Launch Cluster, I can’t click Next or complete the process

Make sure that all of your settings are filled in. Some common gotchas are:

  • Make sure that the ClickHouse User Password field has been entered and confirmed.
  • Cluster names must follow DNS name restrictions (letters, numbers, and dashes allowed, periods and special characters are not).
  • Cluster names must start with a letter, and should be 15 characters at most.

1.3 - General User Guide

Instructions on general use of Altinity.Cloud

Altinity.Cloud is made to be both convenient and powerful for ClickHouse users. Whether you’re a ClickHouse administrator or a developer, these are the concepts and procedures common to both.

1.3.1 - How to Create an Account

Creating your Altinity.Cloud account.

To create an Altinity.Cloud account, visit the Altinity.Cloud info page and select Free Trial. Fill in your contact information, and our staff will reach out to you to create a test account.

If you’re ready to upgrade to a full production account, talk to one of our consultants by filling out your contact information on our Consultation Request page.

1.3.2 - How to Login

Login to Altinity.Cloud

Altinity.Cloud provides the following methods to login to your account:

  • Username and Password
  • Auth0

Login with Username and Password

To login to Altinity.Cloud with your Username and Password:

  1. Open the Altinity.Cloud website.
  2. Enter your Email Address registered to your Altinity.Cloud account.
  3. Enter your Password.
  4. Click Sign In.

Once authenticated, you will be logged into Altinity.Cloud.

Login with Auth0

Auth0 allows you to log into your existing Altinity.Cloud account using trusted authentication platforms such as Google to verify your identity.

  • IMPORTANT NOTE: This requires that your Altinity.Cloud account matches the authentication platform you are using. For example, if your email address in Altinity.Cloud is listed as Nancy.Doe@gmail.com, your Gmail address must also be Nancy.Doe@gmail.com.

To login using Auth0:

  1. Open the Altinity.Cloud website.
  2. Select Auth0.
  3. Select which authentication platform to use from the list (for example: Google).
    1. If this is your first time using Auth0, select which account to use. You must already be logged into the authentication platform.
  4. You will be automatically logged into Altinity.Cloud.

1.3.3 - How to Logout

Logout of Altinity.Cloud

To logout:

  1. Select your profile icon in the upper right hand corner.
  2. Select Log out.

Your session will be ended, and you will have to authenticate again to log back into Altinity.Cloud.

1.3.4 - Account Settings

Account and profile settings.

Access My Account

To access your account profile:

  1. Select your user profile in the upper right hand corner.

  2. Select My Account.

    Access user account

My Account Settings

From the My Account page the following settings can be viewed:

  • Common Information. From here you can update or view the following:
    • Email Address (view only): Your email address or login.
    • Password settings.
    • Dark Mode: Set the user interface to either the usual or darker interface.
  • API Access: The security access rights assigned to this account.
  • Access Rights: What security related actions this account can perform.

Update Password

To update your account password:

  1. Click your profile icon in the upper right hand corner.

  2. Select My Account.

  3. In the Common Information tab, enter the new password in the Password field.

  4. Select Save.

    Altinity Cloud user common settings

API Access Settings

Accounts can make calls to Altinity.Cloud through the API address at https://acm.altinity.cloud/api, and the Swagger API definition file is available at https://acm.altinity.cloud/api/reference.json.

Access is controlled through API access keys and API Allow Domains.

API Access Keys

Accounts can use this page to generate one or more API keys that can be used without exposing the account’s username and password. API calls made to Altinity.Cloud with a key are executed as the account the key was generated for.

When an Altinity.Cloud API key is generated, an expiration date is set for the key. By default, the expiration date is set 24 hours after the key is generated, with the date and time set to GMT. This date can be manually adjusted so the API key expires at the date and time of your choosing.

Create Altinity.Cloud API Key

To generate a new API key:

  1. Click your profile icon in the upper right hand corner.
  2. Select My Account.
  3. In the API Access tab, select + Add Key. The key will be available for use with the Altinity.Cloud API.

To change the expiration date of an API key:

  1. Click your profile icon in the upper right hand corner.
  2. Select My Account.
  3. In the API Access tab, update the date and time for the API key being modified. Note that the date and time are in GMT (Greenwich Mean Time).

To remove an API key:

  1. Click your profile icon in the upper right hand corner.
  2. Select My Account.
  3. In the API Access tab, select the trashcan icon next to the API key to delete. The key will no longer be allowed to connect to the Altinity.Cloud API for this account.

API Allow Domains

API submissions can be restricted by the source domain address. This provides enhanced security by keeping API communications only between authorized sources.

To update the list of domains this account can submit API commands from:

  1. Click your profile icon in the upper right hand corner.
  2. Select My Account.
  3. In the API Access tab, list each URL this account can submit API commands from. Each URL is on a separate line.
  4. Click Save to update the account settings.
Altinity Cloud user common settings

Access Rights

The Access Rights page displays which permissions your account has. These are listed in three columns:

  • Section: The area of access within Altinity.Cloud, such as Accounts, Environments, and Console.
  • Action: What actions the access right rule allows within the section. Actions marked as * include all actions within the section.
  • Rule: Whether the Action in the Section is Allow (marked with a check mark), or Deny (marked with an X).

1.3.5 - Notifications

Notifications critical to your Altinity.Cloud account.

Notifications allow you to see any messages related to your Altinity.Cloud account. For example: billing, service issues, etc.

To access your notifications:

  1. From the upper right corner of the top navigation bar, select your user ID, then Notifications.

    Access notifications

Notifications History

The Notifications History page shows the notifications for your account, including the following:

  • Message: The notifications message.
  • Level: The priority level which can be:
    • Danger: Critical notifications that can affect your clusters or account.
    • Warning: Notifications of possible issues that are less than critical.
    • News: Notifications of general news and updates in Altinity.Cloud.
    • Info: Updates for general information.

1.3.6 - Billing

Managing billing for Altinity.Cloud.

Accounts with the role orgadmin are able to access the Billing page for their organizations.

To access the Billing page:

  1. Login to your Altinity.Cloud with an account with the orgadmin role.
  2. From the upper right hand corner, select the Account icon, and select Billing.
Access Billing

From the Billing page, the Usage Summary and the Billing Summary are available for the environments connected to the account.

Billing page

Usage Summary

The Usage Summary displays the following:

  • Current Period: The current billing month displaying the following:
    • Current Spend: The current total value of charges for Altinity.Cloud services.
    • Avg. Daily Spend: The average cost of Altinity.Cloud services per day.
    • Est. Monthly Bill: The total estimated value for the current period, based on Current Spend, if usage continues at the current rate.
  • Usage for Period: Select the billing period to display.
  • Environment: Select the environment or All environments to display billing costs for. Each environment, its usage, and cost will be displayed with the total cost.

Billing Summary

The Billing Summary section displays the payment method, service address, and email address used for billing purposes. Each of these settings can be changed as required.

1.3.7 - System Status

View the status of Altinity.Cloud services.

The System Status page provides a quick view of whether the Altinity.Cloud services are currently up or down. This provides a quick glance to help devops staff determine where any issues may be when communicating with their Altinity.Cloud clusters.

To access the System Status page:

  1. Login to your Altinity.Cloud account.

  2. From the upper right hand corner, select the Account icon, and select System Status.

    Access user account

System Status Page

The System Status page displays the status of the Altinity.Cloud services. To send a message to Altinity.Cloud support representatives, select Get in touch.

From the page the following information is displayed:

Altinity.Cloud system status page

This sample is from a staging environment and cluster that was stopped and started to demonstrate how the uptime tracking system works.

  • Whether all Altinity.Cloud services are online or if there are any issues.
  • The status of services by product, with the uptime of the last 60 days shown as either green (the service was fully available that day) or red (the service suffered an issue). Hovering over a red bar displays how long the service was unavailable. Status is shown for the following services:
    • ClickHouse clusters
    • Ingress
    • Management Console

Enter your email at the bottom of the page in the section marked Subscribe to status updates to receive notifications via email regarding any issues with Altinity.Cloud services.

1.3.8 - Clusters View

Overview of the Clusters View

The Clusters View page allows you to view available clusters and access your profile settings.

To access the Clusters View page while logged in to Altinity.Cloud, click Altinity Cloud Manager.

The Clusters View page is separated into the following sections:

  • A: Cluster Creation: For more information on how to create new clusters, see the Administrator Guide.
  • B: Clusters: Each cluster associated with your Altinity.Cloud account is listed in either tile format, or as a short list.
  • C: User Management:
    • Change which environment clusters are on display.
    • Access your Account Settings.
Clusters View

Organizational Admins have additional options in the left navigation panel that allow them to select the Accounts, Environments, and Clusters connected to the organization’s Altinity.Cloud account.

Change Environment

Accounts who are assigned to multiple Altinity.Cloud environments can select which environment’s clusters they are viewing. To change your current environment:

  1. Click the environment dropdown in the upper right hand corner, next to your user profile icon.
  2. Select the environment to use. You will automatically view that environment’s clusters.
Change Environment

Manage Environments

Accounts that have permission to manage environments access them through the following process:

  1. Select the Settings icon in the upper right hand corner.
  2. Select Environments.
Manage Environments

For more information on managing environments, see the Administrator Guide.

Access Settings

For information about access your account profile and settings, see Account Settings.

Cluster Access

For details on how to launch and manage clusters, see the Administrator Guide for Clusters.

1.3.9 - Cluster Explore Guide

How to explore a Cluster through queries, schema and processes

Altinity.Cloud offers users a range of actions they can take on existing clusters.

For a quick view on how to create a cluster, see the Altinity.Cloud Quick Start Guide. For more details on interacting with clusters, see the Administrative Clusters Guide.

1.3.9.1 - Query Tool

How to submit ClickHouse queries to a cluster or nodes of the cluster

The Query Tool page allows users to submit ClickHouse SQL queries directly to the cluster or a specific cluster node.

To use the Query Tool:

  1. Select Explore from either the Clusters View or the Clusters Detail Page.

  2. Select Query from the top tab. This is the default view for the Explore page.

  3. Select from the following:

    Query Page
    1. Select which cluster to run a query against.

    2. Select Run DDLs ON CLUSTER to run Distributed DDL Queries.

    3. Select the following node options:

      Select node for query.
      1. Any: Any node selected from the Zookeeper parameters.
      2. All: Run the query against all nodes in the cluster.
      3. Node: Select a specific node to run the query against.
    4. The Query History allows you to scroll through queries that have been executed.

    5. Enter the query in the Query Textbox. For more information on ClickHouse SQL queries, see the SQL Reference page on ClickHouse.tech.

    6. Select Execute to submit the query from the Query Textbox.

    7. The results of the query will be displayed below the Execute button.

Additional tips and examples are listed on the Query page.
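
For example, with Run DDLs ON CLUSTER enabled, a distributed DDL statement like the following (a sketch using a hypothetical analytics database name) is executed on every node of the cluster rather than on a single node:

    -- Create a database on every node of the cluster
    CREATE DATABASE IF NOT EXISTS analytics ON CLUSTER '{cluster}';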

1.3.9.2 - Schema View

Viewing the database schema for clusters and nodes.

The Schema page allows you to view the databases, tables, and other details.

To access the Schema page:

  1. Select Explore from either the Clusters View or the Clusters Detail Page.

  2. Select Schema from the top tab.

  3. Select the following node options:

    Select node for query.
    1. Any: Any node selected from the Zookeeper parameters.
    2. All: Run the query against all nodes in the cluster.
    3. Node: Select a specific node to run the query against.

To view details on a table, select the table name. The following details are displayed:

  • Table Description: Details on the table’s database, engine, and other details.
  • Table Schema: The CREATE TABLE command used to generate the table.
  • Sample Rows: A display of 5 selected rows from the table to give an example of the data contents.
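
The same details can also be pulled through SQL from the Query page. For example, using the events_local table from the Quick Start Guide:

    -- Show the CREATE TABLE statement behind the table
    SHOW CREATE TABLE default.events_local;

    -- Preview a handful of rows, similar to the Sample Rows view
    SELECT * FROM default.events_local LIMIT 5;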

1.3.9.3 - Processes

How to view the processes for a cluster or node.

The Processes page displays the currently running processes on a cluster or node.

To view the processes page:

  1. Select Explore from either the Clusters View or the Clusters Detail Page.

  2. Select Processes from the top tab.

  3. Select the following node options:

    Select node for query.
    1. Any: Any node selected from the Zookeeper parameters.
    2. All: Run the query against all nodes in the cluster.
    3. Node: Select a specific node to run the query against.

The following information is displayed:

  • Query ID: The ClickHouse ID of the query.
  • Query: The ClickHouse query that the process is running.
  • Time: The elapsed time of the process.
  • User: The ClickHouse user running the process.
  • Client Address: The address of the client submitting the process.
  • Action: Stop or restart a process.
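
The page shows the same kind of information ClickHouse exposes in the system.processes table; a roughly equivalent query you can run yourself from the Query page is:

    -- Currently running queries with their user, elapsed time, and client address
    SELECT query_id, user, elapsed, address, query
    FROM system.processes;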

1.4 - Administrator Guide

How to manage Altinity.Cloud.

Altinity.Cloud allows administrators to manage clusters, users, and keep control of their ClickHouse environments with a few clicks. Monitoring tools are provided so you can keep track of everything in your environment to keep on top of your business.

1.4.1 - Clusters

How to launch clusters and manage clusters.

ClickHouse databases are managed through clusters, which harness the power of distributed processing to quickly deliver results on even the most complex and data intensive queries.

Altinity.Cloud users can create their own ClickHouse Clusters tailored to their organization’s needs.

1.4.1.1 - View Cluster Details

How to view details of a running cluster and its nodes.

Cluster Details Page

Once a cluster has been launched, its current operating details can be viewed by selecting the cluster from the Clusters View. This displays the Cluster Details page. From the Cluster Dashboard page, select Nodes to view the Nodes Summary page.

From the Cluster Details page, users can perform the following:

Cluster Details Page
  • A: Manage the cluster’s:
    • Actions
    • Configuration
    • Tables and structure with Explore
    • Alerts
    • Logs
  • B: Check Cluster Health.
  • C: View the cluster’s Access Point.
  • D: Monitor the Cluster and its Queries.
  • E: View summary details for the Cluster or Node. Select Nodes to view details on the cluster’s Nodes.

Nodes Summary

The Nodes Summary Page displays all nodes that are part of the selected cluster. From this page the following options and information are available:

  1. The Node Summary that lists:
    1. Endpoint: The connection settings for this node. See Node Connection.
    2. Details and Node View: Links to the Node Dashboard and Node Metrics.
    3. Version: The ClickHouse version running on this node.
    4. Type: The processor setting for the node.
    5. Node Storage: Storage space in GB available.
    6. Memory: RAM memory allocated for the node.
    7. Availability Zone: Which AWS Availability Zone the node is hosted on.
Node Summary Page

Node Connection

The Node Connection Details shows how to connect from various clients, including the clickhouse-client, JDBC drivers, HTTPS, and Python. Unlike the Cluster Access Point, this allows a connection directly to the specific node.

Node Connection

Node Dashboard

From the Node Dashboard Page users can:

Node Dashboard
  • A: Manage the node’s:
    • Tables and structure with Explore
    • Logs
  • B: Check the node’s health.
  • C: View the node’s summary details, its Metrics, and its Schema.
  • D: Perform Node Actions.

Node Metrics

Node Metrics provides a breakdown of the node’s performance, such as CPU data, active threads, etc.

Node Schema

The Node Schema provides a view of the databases’ schema and tables. For more information on how to interact with a Node by submitting queries, viewing the schema of its databases and tables, and viewing process, see the Cluster Explore Guide.

1.4.1.2 - Cluster Actions

Actions that can be taken on launched clusters.

Launched clusters can have different actions applied to them based on your needs.

1.4.1.2.1 - Upgrade Cluster

How to upgrade an existing cluster.

Clusters can be upgraded to versions of ClickHouse other than the one your cluster is running.

When upgrading to a ClickHouse Altinity Stable Build, review the release notes for the version that you are upgrading to.

How to Upgrade an Altinity Cloud Cluster

To upgrade a launched cluster:

  1. Select Actions for the cluster to upgrade.

  2. Select Upgrade.

  3. Select the ClickHouse version to upgrade to.

  4. Select Upgrade to start the process.

    Cluster Upgrade

The upgrade process completion time varies with the size of the cluster, as each server is upgraded individually. This may cause downtime while the cluster is upgraded.
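
After the upgrade completes, you can confirm the version each node is running from the cluster's Query page (for example, with the All nodes option selected):

    -- Report the host name and ClickHouse version of the answering node
    SELECT hostName(), version();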

1.4.1.2.2 - Rescale Cluster

How to rescale an existing cluster.

The size and structure of the cluster may need to be altered after launching based on your organization’s needs. The following settings can be rescaled:

  • Number of Shards
  • Number of Replicas
  • Node Type
  • Node Storage
  • Number of Volumes
  • Apply to new nodes only: This setting will only affect nodes created from this point forward.

See Cluster Settings for more information.

How to Rescale a Cluster

To rescale a cluster:

  1. Select Actions for the cluster to rescale.

  2. Select Rescale.

  3. Set the new values of the cluster.

  4. Click OK to begin rescaling.

    Cluster Rescale

Depending on the size of the cluster, this may take several minutes.

1.4.1.2.3 - Stop and Start a Cluster

How to stop or start an existing cluster.

To stop a launched cluster, or start a stopped cluster:

  1. From either the Clusters View or the Cluster Details Page, select Actions.
    1. If the cluster is currently running, select Stop to halt its operations.
    2. If the cluster has been stopped, select Start to restart it.

Depending on the size of your cluster, it may take a few minutes until it is fully stopped or restarted. To check the health and availability of the cluster, see Cluster Health or Cluster Availability.

1.4.1.2.4 - Export Cluster Settings

How to export a cluster’s settings.

The structure of an Altinity Cloud cluster can be exported as JSON. For details on the cluster’s settings that are exported, see Cluster Settings.

To export a cluster’s settings to JSON:

  1. From either the Clusters View or the Cluster Details Page, select Actions, then select Export.
  2. A new browser window will open with the settings for the cluster in JSON.

1.4.1.2.5 - Replicate a Cluster

How to replicate an existing cluster.

Clusters can be replicated with the same or different settings. A replica cluster can include the same database schema as the source cluster, or be launched without the schema. This may be useful to create a test cluster, then launch the production cluster with different settings ready for production data.

For complete details on Altinity.Cloud clusters settings, see Cluster Settings.

To create a replica of an existing cluster:

  1. From either the Clusters View or the Cluster Details Page, select Actions, then select Launch a Replica Cluster.
  2. Enter the desired values for Resources Configuration.
    1. To replicate the schema of the source cluster, select Replicate Schema.

      Replicate Schema
    2. Click Next to continue.

  3. Enter the desired values for High Availability Configuration and Connection Configuration.
    1. Each section must be completed in its entirety before moving on to the next one.
  4. In the module Review & Launch, verify the settings are correct. When finished, select Launch.

Depending on the size of the new cluster it will be available within a few minutes. To verify the health and availability of the new cluster, see Cluster Health or the Cluster Availability.

1.4.1.2.6 - Destroy Cluster

How to destroy an existing cluster.

When a cluster is no longer required, the entire cluster and all of its data can be destroyed.

  • IMPORTANT NOTE: Once destroyed, a cluster can not be recovered. It must be manually recreated.

To destroy a cluster:

  1. From either the Clusters View or the Cluster Details Page, select Actions, then select Destroy.

  2. Enter the cluster name, then select OK to confirm its deletion.

    Destroy Cluster

1.4.1.3 - Cluster Settings

Settings and values used for Altinity.Cloud ClickHouse Clusters.

ClickHouse Clusters hosted on Altinity.Cloud have the following structural attributes. These determine options such as the version of ClickHouse installed on them, how many replicas, and other important features.

  • Cluster Name: The name for this cluster. It is used for the hostname of the cluster. Cluster names must be DNS compliant:
    • Letters, numbers, and dashes only.
    • No special characters such as periods, ?, #, etc.
    • Can not start with a number.
    • Example: Good: mycluster. Bad: my.cluster?
  • Node Type: Determines the number of CPUs and the amount of RAM used per node. The following Node Types are sample values, and may be updated at any time:
    • m5.large: CPU x2, RAM 6.5 GB
    • m5.xlarge: CPU x4, RAM 14 GB
    • m5.2xlarge: CPU x8, RAM 29 GB
    • m5.4xlarge: CPU x16, RAM 58 GB
    • m5.8xlarge: CPU x32, RAM 120 GB
  • Node Storage: The amount of storage space available to each node, in GB.
  • Number of Volumes: Storage can be split across multiple volumes. The amount of data stored per node is the same as set in Node Storage, but it is split into multiple volumes. Separating storage into multiple volumes can increase query performance.
  • Volume Type: Defines the Amazon Web Services volume class, typically used to determine whether or not to encrypt the volumes. Values:
    • gp2 (Not Encrypted)
    • gp2-encrypted (Encrypted)
  • Number of Shards: Shards represent a set of nodes. Shards can be replicated to provide increased availability and computational power.
  • ClickHouse Version: The version of the ClickHouse database that will be used on each node. To run a custom ClickHouse container version, specify the Docker image to use.
    • IMPORTANT NOTE: The nodes in the cluster will all be running the same version of ClickHouse. If you want to run multiple versions of ClickHouse, they will have to be set on different clusters.
    Currently available options:
    • Altinity Stable: 19.11.12.69, 19.16.19.85, 20.3.21.2, 20.8.7.15
    • Standard Release: 20.10.5.10, 20.11.4.13
    • Custom Identifier
  • ClickHouse Admin Name: The name of the ClickHouse administrative user. Set to admin by default. Can not be changed.
  • ClickHouse Admin Password: The password for the ClickHouse administrative user.
  • Data Replication: Toggles whether shards will be replicated. When enabled, Zookeeper is required to manage the shard replication process. Values:
    • Enabled (Default): Each cluster shard will be replicated to the value set in Number of Replicas.
    • Disabled: Shards will not be replicated.
  • Number of Replicas: Sets the number of replicas per shard. Only enabled if Data Replication is enabled.
  • Zookeeper Configuration: When Data Replication is set to Enabled, Zookeeper is required. This setting determines how Zookeeper will run and manage shard replication, mainly how many Zookeeper nodes are used to manage the shards. More Zookeeper nodes increase the availability of the cluster. Values:
    • Single Node (Default): Replication is managed by one Zookeeper node.
    • Three Nodes: Increases the Zookeeper nodes to an ensemble of 3.
  • Zookeeper Node Type: Determines the type of Zookeeper node. Defaults to default and can not be changed.
  • Node Placement: Sets how nodes are distributed via Kubernetes, depending on your situation and how robust you want your replicas and clusters to be. Values:
    • Separate Nodes (Default): ClickHouse containers are distributed across separate cluster nodes.
    • Separate Shards: ClickHouse containers for different shards are distributed across separate cluster nodes.
    • Separate Replicas: ClickHouse containers for different replicas are distributed across separate cluster nodes.
  • Enable Backups: Backs up the cluster. Backups can be restored in the event of data loss or to roll back to previous versions. Values:
    • Enabled (Default): The cluster will be backed up automatically.
    • Disabled: Automatic backups are disabled.
  • Backup Schedule: Determines how often the cluster will be backed up. Defaults to Daily.
  • Number of Backups to keep: Sets how many backups will be stored before deleting the oldest one. Defaults to 5.
  • Endpoint: The access point domain name. This is hard set by the name of your cluster and your organization.
  • Use TLS: Sets whether or not to encrypt external communications with the cluster with TLS. Defaults to Enabled and can not be changed.
  • Load Balancer Type: The load balancer manages communications between the various nodes to ensure that nodes are not overwhelmed. Defaults to Altinity Edge Ingress.
  • Protocols: Sets the TCP ports used in external communications with the cluster. Defaults to ClickHouse TCP port 9440 and HTTP port 8443.
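
To see how Node Storage and Number of Volumes appear from inside ClickHouse, you can query the system.disks table on a node; it lists each configured disk with its path and remaining space:

    -- Configured disks with free and total space
    SELECT name, path,
           formatReadableSize(free_space) AS free,
           formatReadableSize(total_space) AS total
    FROM system.disks;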

1.4.1.4 - Configure Cluster

How to configure launched clusters.

Once a cluster has been launched, its configuration can be updated to best match your needs.

1.4.1.4.1 - How to Configure Cluster Settings

How to update the cluster’s settings.

Cluster settings can be updated from the Clusters View or from the Cluster Details by selecting Configure > Settings.

  • IMPORTANT NOTE: Changing a cluster’s settings will require a restart of the entire cluster.

Note that some settings are locked - their values can not be changed from this screen.

Cluster Settings

How to Set Troubleshooting Mode

Troubleshooting mode prevents your cluster from auto-starting after a crash. To update this setting:

  1. Toggle Troubleshooting Mode either On or Off.

How to Edit an Existing Setting

To edit an existing setting:

  1. Select the menu on the left side of the setting to update.
  2. Select Edit.
  3. Set the following:
    1. Setting Type.
    2. Name
    3. Value
  4. Select OK to save the setting.
Edit Cluster Setting

How to Add a New Setting

To add a new setting to your cluster:

  1. Select Add Setting.
  2. Set the following:
    1. Setting Type.
    2. Name
    3. Value
  3. Select OK to save the setting.
Add a New Cluster Setting
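
One way to confirm that a user-level ClickHouse setting has taken effect is to query system.settings from the cluster's Query page; settings that differ from their defaults are flagged in the changed column:

    -- Settings that have been changed from their defaults for the current user
    SELECT name, value, changed
    FROM system.settings
    WHERE changed;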

How to Delete an Existing Setting

To delete an existing setting:

  1. Select the menu on the left side of the setting to update.
  2. Select Remove.
  3. Select OK to confirm removing the setting.
Delete a Cluster Setting

1.4.1.4.2 - How to Configure Cluster Profiles

How to update the cluster’s profiles.

Cluster profiles allow you to set the user permissions and settings based on their assigned profile.

The Cluster Profiles can be accessed from the Clusters View or from the Cluster Details by selecting Configure > Profiles.

Cluster Profile Settings

Add a New Profile

To add a new cluster profile:

  1. From the Cluster Profile View page, select Add Profile.
  2. Provide the profile’s Name and Description, then click OK.

Edit an Existing Profile

To edit an existing profile:

  1. Select the menu to the left of the profile to update and select Edit, or select Edit Settings.
  2. To add a profile setting, select Add Setting and enter the Name and Value, then click OK to store your setting value.
  3. To edit an existing setting, select the menu to the left of the setting to update. Update the Name and Value, then click OK to store the new value.

Delete an Existing Profile

To delete an existing profile:

  1. Select the menu to the left of the profile to update and select Delete.
  2. Select OK to confirm the profile deletion.

1.4.1.4.3 - How to Configure Cluster Users

How to update the cluster’s users.

The cluster’s Users allow you to set one or more entities who can access your cluster, based on their Cluster Profile.

Cluster users can be updated from the Clusters View or from the Cluster Details by selecting Configure > Users.

Cluster Users

How to Add a New User

To add a new user to your cluster:

  1. Select Add User

  2. Enter the following:

    Add New User
    1. Login: the name of the new user.
    2. Password and Confirm Password: the authenticating credentials for the user.
    3. Networks: The networks that the user can connect from. Leave as 0.0.0.0/0 to allow access from all networks.
    4. Databases: Which databases the user can connect to. Leave empty to allow access to all databases.
    5. Profile: Which profile settings to apply to this user.
  3. Select OK to save the new user.
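
On recent ClickHouse versions you can cross-check the cluster's users from the Query page. A minimal sketch, assuming the admin account is allowed to read system tables:

    -- List configured users and the networks they may connect from
    SELECT name, auth_type, host_ip
    FROM system.users;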

How to Edit a User

To edit an existing user:

  1. Select the menu to the left of the user to edit, then select Edit.
  2. Enter the following:
    1. Login: the new name of the user.
    2. Password and Confirm Password: the authenticating credentials for the user.
    3. Networks: The networks that the user can connect from. Leave as 0.0.0.0/0 to allow access from all networks.
    4. Databases: Which databases the user can connect to. Leave empty to allow access to all databases.
    5. Profile: Which profile settings to apply to this user.
  3. Select OK to save the updated user.

How to Delete a User

  1. Select the menu to the left of the user to edit, then select Delete.
  2. Select OK to verify the user deletion.

1.4.1.5 - Launch New Cluster

How to launch a new ClickHouse Cluster from Altinity.Cloud.

Launching a new ClickHouse Cluster is incredibly easy, and only takes a few minutes. For those looking to create their first ClickHouse cluster with the minimal steps, see the Quick Start Guide. For complete details on Altinity.Cloud clusters settings, see Cluster Settings.

To launch a new ClickHouse cluster:

  1. From the Clusters View page, select Launch Cluster. This starts the Cluster Launch Wizard.

    Launch New Cluster
  2. Enter the desired values for Resources Configuration, High Availability Configuration, and Connection Configuration.

    1. Each section must be completed in its entirety before moving on to the next one.
  3. In the module Review & Launch, verify the settings are correct. When finished, select Launch.

Within a few minutes, the new cluster will be ready for your use and display that all health checks have been passed.

1.4.1.6 - Cluster Alerts

How to be notified about cluster issues

The Cluster Alerts module allows users to set up when they are notified for a set of events. Alerts can either be a popup, displayed when the user is logged into Altinity.Cloud, or an email, so they can receive an alert even when they are not logged into Altinity.Cloud.

To set which alerts you receive:

  1. From the Clusters view, select the cluster to set alerts for.

  2. Select Alerts.

    Cluster Alerts
  3. Add the Email address to send alerts to.

  4. Select whether to receive a Popup or Email alert for the following events:

    1. ClickHouse Version Upgrade: Alert triggered when the version of ClickHouse that is installed in the cluster has a new update.
    2. Cluster Rescale: Alert triggered when the cluster is rescaled, such as new shards added.
    3. Cluster Stop: Alert triggered when some event has caused the cluster to stop running.
    4. Cluster Resume: Alert triggered when a cluster that was stopped has resumed operations.

1.4.1.7 - Cluster Health Check

How to quickly check your cluster’s health.

From the Clusters View, you can see the health status of your cluster and its nodes at a glance.

How to Check Node Health

The quick health check of your cluster’s nodes is displayed from the Clusters View. Next to the cluster name is a summary of your nodes’ statuses, indicating the total number of nodes and how many nodes are available.

View the Access Point

How to Check Cluster Health

The overall health of the cluster is shown in the Health row of the cluster summary, showing the number of health checks passed.

View the Access Point

Click the checks passed link to view a detailed view of the cluster’s health.

How to View a Cluster’s Health Checks

The cluster’s Health Check module displays the status of the following health checks:

  • Access point availability check
  • Distributed query check
  • Zookeeper availability check
  • Zookeeper contents check
  • Readonly replica check
  • Delayed inserts check

To view details on what queries are used to verify the health check, select the caret for each health check.

Cluster Health Details
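
For example, some of these conditions can be spot-checked manually from a terminal with clickhouse-client. This is only a sketch; the exact queries Altinity.Cloud runs for its health checks may differ.

# Readonly replica check: count replicas currently in read-only mode
clickhouse-client --query "SELECT count() FROM system.replicas WHERE is_readonly"

# Delayed inserts check: current number of delayed INSERT queries
clickhouse-client --query "SELECT value FROM system.metrics WHERE metric = 'DelayedInserts'"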

1.4.1.8 - Cluster Monitoring

How to monitor your cluster’s performance.

Altinity.Cloud integrates Grafana into its monitoring tools. From a cluster, you can quickly access the following monitoring views:

  • Cluster Metrics
  • Queries
  • Logs

How to Access Cluster Metrics

To access the metrics views for your cluster:

  1. From the Clusters view, select the cluster to monitor.
  2. From Monitoring, select the drop down View in Grafana and select from one of the following options:
    1. Cluster Metrics
    2. Queries
    3. Logs
  3. Each metric view opens in a separate tab.

Cluster Metrics

Cluster Metrics displays how the cluster is performing from a hardware and connection standpoint.

Cluster Monitoring View

Some of the metrics displayed here include:

  • DNS and Distributed Connection Errors: Displays the rate of any connection issues.
  • Select Queries: The number of select queries submitted to the cluster.
  • Zookeeper Transactions: The communications between the zookeeper nodes.
  • ClickHouse Data Size on Disk: The total amount of data the ClickHouse database is using.

Queries

The Queries monitoring page displays the performance of clusters, including the top requests, queries that require the most memory, and other benchmarks. This can be useful in identifying queries that can cause performance issues and refactoring them to be more efficient.
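
If you prefer to look at the raw data behind these dashboards, the same kind of information is available in the system.query_log table. The following sketch lists the most memory-hungry queries from the last hour, assuming query logging is enabled (it is by default):

clickhouse-client --query "
  SELECT query_duration_ms, memory_usage, substring(query, 1, 60) AS query_start
  FROM system.query_log
  WHERE type = 'QueryFinish' AND event_time > now() - INTERVAL 1 HOUR
  ORDER BY memory_usage DESC
  LIMIT 10"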

Query Monitoring View

Log Metrics

The Log monitoring page displays the logs for your clusters, and allows you to make queries directly on them. If there’s a specific detail you’re trying to iron out, the logs are the most granular way of tracking down those issues.

Log Monitoring View

1.4.1.9 - Cluster Logs

How to access your cluster’s logs

Altinity.Cloud provides the cluster log details so users can track down specific issues or performance bottlenecks.

To access a cluster’s logs:

  1. From the Clusters view, select the cluster to view logs for.
  2. Select Logs.
  3. From the Log Page, you can set the number of rows to display, or filter logs by specific text.
  4. To download the logs, select the download icon in the upper right corner (A).
  5. To refresh the logs page, select the refresh icon (B).
Cluster Logs Page

The following logs are available:

  • ACM Logs: These logs are specific to Altinity.Cloud issues and include the following:
    • System Log: Details the system actions such as starting a cluster, updating endpoints, and other details.
    • API Log: Displays updates to the API and activities.
  • ClickHouse Logs: Displays the Common Log that stores ClickHouse related events. From this view, a specific host can be selected from the dropdown box.
  • Backup Logs: Displays backup events from the clickhouse-backup service. Log details per cluster host can be selected from the dropdown box.
  • Operator Logs: Displays logs from the Altinity Kubernetes Operator service, which is used to manage cluster replication and communications in the Kubernetes environment.

1.4.2 - Access Control

How to control access to your organizations, environments, and clusters.

Altinity.Cloud provides role based access control. Depending on the role granted to an Altinity.Cloud account, that account can assign roles to other Altinity.Cloud accounts and grant permissions to access organizations, environments, or clusters.

1.4.2.1 - Role Based Access and Security Tiers

Altinity.Cloud hierarchy and role based access.

Access to ClickHouse data hosted in Altinity.Cloud is controlled through a combination of security tiers and account roles. This allows companies to tailor access to data in a way that maximizes security while still allowing ease of access.

Security Tiers

Altinity.Cloud groups sets of clusters together in ways that allow companies to give accounts access only to the clusters or groups of clusters that they need.

Altinity.Cloud groups clusters into the following security related tiers:

Security Tiers
  • Nodes: The most basic level - an individual ClickHouse database and tables.
  • Clusters: These contain one or more nodes and provide ClickHouse database access.
  • Environments: Environments contain one or more clusters.
  • Organizations: Organizations contain one or more environments.

Account access is controlled by assigning each account a single role and one or more security tiers appropriate to that role. Depending on its role, a single account can be assigned to multiple organizations, multiple environments, multiple clusters in an environment, or a single cluster.

Account Roles

The actions that can be taken by an Altinity.Cloud account are based on the role it is assigned. The roles and their actions at each security tier are detailed below:

  • orgadmin
    • Environment: Create, Edit, and Delete environments that they create, or are assigned to, within the assigned organizations. Administer Accounts associated with environments they are assigned to.
    • Cluster: Create, Edit, and Delete clusters within environments they create or are assigned to in the organization.
  • envadmin
    • Environment: Access assigned environments.
    • Cluster: Create, Edit, and Delete clusters within environments they are assigned to in the organization.
  • envuser
    • Environment: Access assigned environments.
    • Cluster: Access one or more clusters the account is specifically assigned to.

The account roles are tied to the security tiers, and allow an account to access multiple environments and clusters depending on the tier it is assigned to.

For example, we may have the following situation:

  • Accounts peter, paul, mary, and jessica are all members of the organization HappyDragon.
  • HappyDragon has the following environments: HappyDragon_Dev and HappyDragon_Prod, each with the clusters marketing, sales, and ops.

The accounts are assigned the following roles and security tiers:

Account   Role       Organization   Environments                        Clusters
mary      orgadmin   HappyDragon    HappyDragon_Prod                    *
peter     envadmin   HappyDragon    HappyDragon_Dev                     *
jessica   envadmin   HappyDragon    HappyDragon_Prod, HappyDragon_Dev   *
paul      envuser    HappyDragon    HappyDragon_Prod                    marketing

In this scenario, mary has the ability to access the environment HappyDragon_Prod, and can create new environments and manage them and any clusters within them. However, she is not able to edit or access HappyDragon_Dev or any of its clusters.

  • Both peter and jessica have the ability to create and remove clusters within their assigned environments.
    • peter is able to modify the clusters in the environment HappyDragon_Dev.
    • jessica can modify clusters in both environments.
  • paul can only access the cluster marketing in the environment HappyDragon_Prod.

1.4.2.2 - Account Management

How to manage Altinity.Cloud accounts.

Altinity.Cloud accounts with the role orgadmin are able to create new Altinity.Cloud accounts and associate them with organizations, environments, and one or more clusters depending on their role. For more information on roles, see Role Based Access and Security Tiers.

Account Page

The Account Page displays all accounts assigned to the same Organization and Environments as the logged in account.

For example: the accounts mario, luigi, peach, and todd are members of the organizations MushroomFactory and BeanFactory as follows:

Account   Role       Organization: MushroomFactory   Organization: BeanFactory
peach     orgadmin   *
mario     orgadmin                                   *
luigi     envuser                                    *
todd      envuser    *
  • peach will be able to see their account and todd in the Account Page, while accounts mario and luigi will be hidden from them.
  • mario will be able to see their account and luigi.

Access Accounts

To access the accounts that are assigned to the same Organizations and Environments as the logged in user with the account role orgadmin:

  1. Login to Altinity.Cloud with an account granted the orgadmin role.
  2. From the left navigation panel, select Accounts.
  3. All accounts that are in the same Organizations and Environments as the logged in account will be displayed.

Account Details

Accounts have the following details that can be set by an account with the orgadmin role:

  1. Common Information:
    1. Name: The name of the account.
    2. Email (Required): The email address of the account. This is used for logging in, password resets, notifications, and other functions, so it must be a working email address.
    3. Password: The password for the account. Once a user has authenticated to the account, they can change their password.
    4. Confirm Password: Confirm the password for the account.
    5. Role (Required): The role assigned to the account. For more information on roles, see Role Based Access and Security Tiers.
    6. Organization: The organization assigned to the account. Note that the orgadmin can only assign accounts the same organizations that the orgadmin account also belongs to.
    7. Suspended: When enabled, this prevents the account from logging into Altinity.Cloud.
  2. Environment Access:
    1. Select the environments that the account will require access to. Note that the orgadmin can only assign accounts the same environments that the orgadmin account also belongs to.
  3. Cluster Access:
    1. This is only visible if the Role is set to envuser. It grants the new account access to one or more clusters in the environments selected under Environment Access.
  4. API Access:
    1. Allows the new account to make API calls from the listed domain names.

Account Actions

Create Account

orgadmin accounts can create new accounts and assign them to the same organization and environments they are assigned to. For example, continuing the scenario from above, if account peach is assigned to the organization MushroomFactory and the environments MushroomProd and MushroomDev, they can assign new accounts to the organization MushroomFactory, and to the environments MushroomProd or MushroomDev or both.

To create a new account:

  1. Login to Altinity.Cloud with an account granted the orgadmin role.

  2. From the left navigation panel, select Accounts.

  3. Select Add Account.

  4. Set the fields as listed in the Account Details section.

    New User Settings
  5. Once all settings are completed, select Save. The account will be able to login with the username and password, or if their email address is registered through Google, Auth0.

Edit Account

  1. Login to Altinity.Cloud with an account granted the orgadmin role.
  2. From the left navigation panel, select Accounts.
  3. From the left hand side of the Accounts table, select the menu icon for the account to update and select Edit.
  4. Update the fields as listed in the Account Details section.
  5. When finished, select Save.

Suspend Account

Instead of deleting an account, setting it to Suspended may be preferable, since this preserves the account’s name and other settings. A suspended account is unable to log into Altinity.Cloud, whether directly through a browser or through API calls made under the account.

To suspend or activate an account:

  1. Login to Altinity.Cloud with an account granted the orgadmin role.
  2. From the left navigation panel, select Accounts.
  3. From the left hand side of the Accounts table, select the menu icon for the account to update and select Edit.
    1. To suspend an account, toggle Suspended to on.
    2. To activate a suspended account, toggle Suspended to off.
  4. When finished, select Save.

Delete Account

Accounts can be deleted, which removes all information about the account. Clusters and environments created by the account will remain.

To delete an existing account:

  1. Login to Altinity.Cloud with an account granted the orgadmin role.
  2. From the left navigation panel, select Accounts.
  3. From the left hand side of the Accounts table, select the menu icon for the account to update and select Delete.
  4. Verify the account is to be deleted by selecting OK.

1.5 - Connectivity

Connecting Altinity.Cloud with other services.

The following guides are designed to help organizations connect their existing services to Altinity.Cloud.

1.5.1 - Cluster Access Point

How to view your Cluster’s access information.

ClickHouse clusters created in Altinity.Cloud can be accessed through the Access Point. The Access Point is determined by the name of your cluster and the environment it is hosted in.

Information on the Access Point is displayed from the Clusters View. Clusters with TLS Enabled will display a green shield icon.

View Cluster Access Point

To view your cluster’s access point:

  1. From the Clusters View, select Access Point.
  2. The Access Point details will be displayed.
View the Access Point

Access Point Details

The Access Point module displays the following details:

  • Host: The DNS host name of the cluster, based on the name of the cluster and the environment the cluster is hosted in.

  • TCP Port: The ClickHouse TCP port for the cluster.

  • HTTP Port: The HTTP port used for the cluster.

  • Client Connections: The client connections are ready-made commands you can copy and paste into your terminal or use in your code. This makes it simple to connect your code to your cluster by copying the details directly from your cluster’s Access Point. Provided client examples include:

    • clickhouse-client
    • jdbc
    • https
    • python
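
For example, the https connection can be exercised straight from a terminal with curl. The sketch below reuses the public demo cluster credentials shown in the DBeaver example later in this guide; substitute the host, user, and password from your own Access Point:

curl --user demo:demo \
  'https://github.demo.trial.altinity.cloud:8443/?query=SELECT%20version()'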

1.5.2 - Configure Cluster Connections

Configure the connection protocols to your Altinity.Cloud cluster

Altinity.Cloud gives accounts the ability to customize the connections to their clusters. This allows organizations to enable or disable:

  • The Binary Protocol: The native ClickHouse client secure port on port 9440.
  • The HTTP Protocol: The HTTPS protocol on port 8443.
  • IP Restrictions: Restricts ClickHouse client connections to the provided whitelist. The IP addresses must be listed in CIDR format and can be separated by commas, spaces, or new lines.

As of this time, accounts can only update the IP Restrictions section. The Binary Protocol and HTTP Protocol are enabled by default and cannot be disabled.
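
A quick way to confirm that both protocols are reachable from a given network is to probe the ports directly. The host name below is a placeholder for your own cluster’s access point, and the credentials are whatever ClickHouse user you normally connect with:

# HTTP protocol: HTTPS on port 8443 (the /ping handler returns Ok.)
curl -sS https://your-cluster.your-env.altinity.cloud:8443/ping

# Binary protocol: secure native port 9440
clickhouse-client --host=your-cluster.your-env.altinity.cloud --port=9440 --secure \
  --user=default --password='your-password' --query "SELECT 1"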

Update Connection Configuration

To update the cluster’s Connection Configuration:

  1. Log into Altinity.Cloud with an account.

  2. Select the cluster to update.

  3. From the top menu, select Configure->Connections.

    Select Configure->Connections for the cluster.
  4. To restrict IP communication only to a set of whitelisted IP addresses:

    1. Under IP Restrictions, select Enabled.

    2. Enter a list of IP addresses. These can be separated by comma, spaces, or a new line. The following examples are all equivalent:

      192.168.1.1,192.168.1.2
      
      192.168.1.1
      192.168.1.2
      
      192.168.1.1 192.168.1.2
      
  5. When finished, select Confirm to save the Connection Configuration settings.

    Cluster Connection Configuration Settings

1.5.3 - Connecting with DBeaver

Creating a connection to Altinity.Cloud from DBeaver.

Connecting to Altinity.Cloud from DBeaver is a quick, secure process thanks to the available JDBC driver plugin.

Required Settings

The following settings are required for the driver connection:

  • hostname: The DNS name of the Altinity.Cloud cluster. This is typically based on the name of your cluster, environment, and organization. For example, if the organization name is CameraCo and the environment is prod with the cluster sales, then the URL may be https://sales.prod.cameraco.altinity.cloud. Check the cluster’s Access Point to verify the DNS name of the cluster.
  • port: The port to connect to. For Altinity.Cloud, it will be HTTPS on port 8443.
  • Username: The ClickHouse user to authenticate to the ClickHouse server.
  • Password: The ClickHouse user password used to authenticate to the ClickHouse server.

Example

The following example is based on connecting to the Altinity.Cloud public demo database, with the following settings:

  • Server: github.demo.trial.altinity.cloud
  • Port: 8443
  • Database: default
  • Username: demo
  • Password: demo
  • Secure: yes
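
For reference, other JDBC-based tools can reach the same demo cluster with a connection URL along these lines; this is a sketch, and parameter names can vary between driver versions:

jdbc:clickhouse://github.demo.trial.altinity.cloud:8443/default?ssl=true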

DBeaver Example

  1. Start DBeaver and select Database->New Database Connection.

    Create Database Connection
  2. Select All, then in the search bar enter ClickHouse.

  3. Select the ClickHouse icon in the “Connect to a database” screen.

    Select ClickHouse JDBC Driver
  4. Enter the following settings:

    1. Host: github.demo.trial.altinity.cloud
    2. Port: 8443
    3. Database: default
    4. User: demo
    5. Password: demo
    Connection details.
  5. Select the Driver Properties tab. If prompted, download the ClickHouse JDBC driver.

  6. Scroll down to the ssl property. Change the value to true.

    Set secure.
  7. Press the Test Connection button. You should see a successful connection message.

    Successful Test.

1.5.4 - clickhouse-client

How to install and connect to an Altinity.Cloud cluster with clickhouse-client.

The ClickHouse Client is a command line based program that will be familiar to SQL based users. For more information on clickhouse-client, see the ClickHouse Documentation Command-Line Client page.

The access points for your Altinity.Cloud ClickHouse cluster can be viewed through the Cluster Access Point.

How to Setup clickhouse-client for Altinity.Cloud in Linux

As of this document’s publication, version 20.13 and above of the ClickHouse client is required to connect with the SNI enabled clusters. These instructions use the testing version of that client. An updated official stable build is expected to be released soon.

sudo apt-get install apt-transport-https ca-certificates dirmngr
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E0C56BD4

echo "deb https://repo.clickhouse.tech/deb/testing/ main/" | sudo tee \
    /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update

sudo apt-get install -y clickhouse-client

Connect With clickhouse-client to an Altinity.Cloud Cluster

Once your ClickHouse client is ready, you can copy and paste your connection settings into your favorite terminal program and you’ll be connected.

clickhouse-client to Altinity.Cloud demo
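
As a concrete sketch, a connection to the public demo cluster used elsewhere in this documentation looks like the following; replace the host, user, and password with the values shown in your own cluster’s Access Point:

clickhouse-client --host=github.demo.trial.altinity.cloud --port=9440 --secure \
  --user=demo --password=demo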

1.5.5 - Amazon VPC Endpoint

How to create an Amazon VPC Endpoint for Altinity.Cloud Services

Altinity.Cloud users can connect a VPC (Virtual Private Cloud) Endpoint from existing AWS environments to their Altinity.Cloud environment. The VPC Endpoint becomes a private connection between their existing Amazon services and Altinity.Cloud, without exposing the connection to the Internet.

The following instructions are based on using the AWS console. Examples of the Terraform equivalent settings are included.

Requirements

Altinity.Cloud requires the AWS ID for the account that will be linked to the Altinity.Cloud environment. This can be found when you login to your AWS Console, and select your username from the upper right hand corner:

Create Endpoint Details

Instructions

To create a VPC Endpoint, the following general steps are required:

  • Retrieve Your Altinity.Cloud Environment URL.
  • Request an Endpoint Service Name from Altinity.Cloud.
  • Create a VPC Endpoint. This must be in the same region as the service to be connected to.
  • Create a private Route 53 Hosted Zone to internal.{Altinity.Cloud environment name}.altinity.cloud.
  • Create a CNAME that points to the VPC Endpoint.

Retrieve Your Altinity.Cloud Environment URL

Your AWS service will be connected to the URL for your Altinity.Cloud environment. Typically this will be internal.{Altinity.Cloud environment name}.altinity.cloud, based on the name of your environment. For example, if your environment is named trafficanalysis, then your environment URL will be internal.trafficanalysis.altinity.cloud.

This may differ depending on your type of service. If you have any questions, please contact your Altinity Support representative.

Request an Endpoint Service Name

Before creating a VPC Endpoint, Altinity.Cloud will need to provide you an AWS Service Name that will be used for your Endpoint. To request the AWS Service Name used in the later steps of creating the VPC Endpoint to Altinity.Cloud:

  1. Login to your AWS console and retrieve your AWS ID.

    Create Endpoint Details
  2. Contact your Altinity.Cloud support representative and inform them that you want to set up a VPC Endpoint to your Altinity.Cloud environment. They will require your AWS ID.

  3. Your Altinity.Cloud support representative will process your request, and return your AWS Service Name to you. Store this in a secure location for your records.

Create a VPC Endpoint

The next step in connecting Altinity.Cloud to the existing AWS Service is to create an Endpoint.

  1. From the AWS Virtual Private Cloud console, select Endpoints > Create Endpoint.

    Select Create Endpoint
  2. Set the following:

    1. Service Category: Set to Find service by name. (1)
    2. Service Name: Enter the Service Name (2) provided in the step Request an Endpoint Service Name, then select Verify. (3)
    Create Endpoint Details
  3. Select the VPC from the dropdown.

  4. Select Create Endpoint.

Terraform VPC Endpoint Configuration

resource "aws_vpc_endpoint" "this" {
    service_name = local.service_name,
    vpc_endpoint_type = "Interface",
    vpc_id = aws_vpc.this.id,
    subnet_ids = [aws_subnet.this.id],
    security_group_ids  = [aws_vpc.this.default_security_group_id],
    private_dns_enabled = false,
    tags = local.tags
}

Create Route 53 Hosted Zone

To create the Route 53 Hosted Zone for the newly created endpoint:

  1. From the AWS Console, select Endpoints.

  2. Select the Endpoint to connect to Altinity.Cloud, then the tab Details. In the section marked DNS names, select the DNS entry created and copy it. Store this in a separate location until ready.

    Copy Endpoint DNS Entry
  3. Enter the Route 53 console, and select Hosted zones.

    Select Create hosted zone
  4. Select Create hosted zone.

  5. On the Hosted zone configuration page, update the following:

    1. Domain name: Enter the URL of the Altinity.Cloud environment. Recall this will be internal.{Altinity.Cloud environment name}.altinity.cloud, where {your environment name} was determined in the step Retrieve Your Altinity.Cloud Environment URL.
    2. Description (optional): Enter a description of the hosted zone.
    3. Type: Set to Private hosted zone.
    Create hosted zone details
  6. In VPCs to associate with the hosted zone, set the following:

    1. Region: Select the region for the VPC to use.
    2. VPC ID: Enter the ID of the VPC that is being used.
  7. Verify the information is correct, then select Create hosted zone.

    Create hosted zone

Terraform Route 53 Configuration

resource "aws_route53_zone" "this" {
    name  = "$internal.{environment_name}.altinity.cloud.",
    vpc {
        vpc_id = aws_vpc.this.id
    }
    tags = local.tags
}

Create CNAME for VPC Endpoint

Once the Hosted Zone that will be used to connect the VPC to Altinity.Cloud has been created, the CNAME for the VPC Endpoint can be configured through the following process:

  1. From the AWS Console, select Route 53 > Hosted Zones, then select Create record.

    Create hosted zone
  2. Select the Hosted Zone that will be used for the VPC connection. This will be the internal.{Altinity.Cloud environment name}.altinity.cloud.

  3. Select Create record.

  4. From Choose routing policy select Simple routing, then select Next.

    Choose routing policy
  5. From Configure records, select Define simple record.

    Select Define simple record
  6. From Define simple record, update the following:

    1. Record name: set to *. (1)
    2. Value/Route traffic to:
      1. Select Ip address or another value depending on the record type. (3)
      2. Enter the DNS name for the Endpoint created in Create Route 53 Hosted Zone.
    3. Record type
      1. Select CNAME (2).
    Define simple record
  7. Verify the information is correct, then select Define simple record.

Terraform CNAME Configuration

resource "aws_route53_record" "this" {
    zone_id = aws_route53_zone.this.zone_id,
    name = "*.${aws_route53_zone.this.name}",
    type = "CNAME",
    ttl = 300,
    records = [aws_vpc_endpoint.this.dns_entry[0]["dns_name"]]
}

Test

To verify the VPC Endpoint works, launch an EC2 instance and execute the following curl command, which returns OK if successful. Use the host name of your Altinity.Cloud environment in place of {your environment name here}:

curl -sS https://statuscheck.{your environment name here}
OK

For example, if your environment is internal.trafficanalysis.altinity.cloud, then use:

curl -sS https://statuscheck.internal.trafficanalysis.altinity.cloud
OK

References

1.5.6 - Amazon VPC Endpoint for Amazon MSK

How to create Amazon VPC Endpoint Services to connect Altinity.Cloud to Amazon MSK within your VPC

Altinity.Cloud users can connect a VPC (Virtual Private Cloud) Endpoint service from their existing AWS (Amazon Web Services) MSK (Amazon Managed Streaming for Apache Kafka) environments to their Altinity.Cloud environment. The VPC Endpoint services become a private connection between their existing Amazon services and Altinity.Cloud, without exposing Amazon MSK to the Internet.

The following instructions are based on using the AWS console. Examples of the Terraform equivalent settings are included.

Requirements

  • Amazon MSK
  • Provision Broker mapping.

Instructions

To create a VPC Endpoint Service, the following general steps are required:

  1. Contact your Altinity Support representative to retrieve the Altinity.Cloud AWS Account ID.
  2. Create VPC Endpoint Services: For each broker in the Amazon MSK cluster, provision a VPC endpoint service in the same region as your Amazon MSK cluster. For more information, see the Amazon AWS service endpoints documentation.
  3. Configure each endpoint service to a Kafka broker. For example:
    1. Endpoint Service: com.amazonaws.vpce.us-east-1.vpce-svc-aaa
    2. Kafka broker: b-0.xxx.yyy.zzz.kafka.us-east-1.amazonaws.com
    3. Endpoint service provision settings: Set com.amazonaws.vpce.us-east-1.vpce-svc-aaa = b-0.xxx.yyy.zzz.kafka.us-east-1.amazonaws.com
  4. Provide Endpoint Services and MSK Broker mappings to your Altinity Support representative.

Create VPC Endpoint Services

To create the VPC Endpoint Service that connects your Altinity.Cloud environment to your Amazon MSK service:

  1. From the AWS Virtual Private Cloud console, select Endpoint Services > Create Endpoint Service.

    Select Create Endpoint
  2. Set the following:

    1. Name: Enter a name of your own choice (A).
    2. Load balancer type: Set to Network. (B)
    3. Available load balancers: Set to the load balancer you provisioned for this broker. (C)
    4. Additional settings:
      1. If you are required to manually accept the endpoint, set Acceptance Required to Enabled (D).
      2. Otherwise, leave Acceptance Required unchecked.
        Create Endpoint Details
  3. Select Create.

Test

To verify the VPC Endpoint Service works, please contact your Altinity Support representative.

References

2 - Altinity Stable Builds

ClickHouse tested and verified for production use with 3 years of support.

ClickHouse, as an open source project, has multiple methods of installation. Altinity recommends either using Altinity Stable builds for ClickHouse, or community builds.

The Altinity Stable builds are ClickHouse releases with extended service that undergo rigorous testing to verify they are secure and ready for production use. Altinity Stable Builds provide a secure, pre-compiled binary release of ClickHouse server and client with the following features:

  • The ClickHouse version release is ready for production use.
  • 100% open source and 100% compatible with ClickHouse community builds.
  • Provides up to 3 years of support.
  • Validated against client libraries and visualization tools.
  • Tested for cloud use including Kubernetes.

For more information regarding the Altinity Stable builds, see Altinity Stable Builds for ClickHouse.

Altinity Stable Builds Life-Cycle Table

The following table lists Altinity Stable builds and their current status. Community builds of ClickHouse are no longer available after the Community Support EOL. Contact us for build support beyond the Altinity Extended Support EOL.

Release Notes   Build Status             Latest Version   Release Date   Latest Update   Support Duration   Community Support End-of-Life*   Altinity Extended Support End-of-Life**
22.3            Available                22.3.8.39        15 Jul 2022    15 Jul 2022     3 years            15 Mar 2023                      15 Jul 2025
21.8            Available                21.8.15.7        11 Oct 2021    15 Apr 2022     3 years            31 Aug 2022                      30 Aug 2024
21.3            Available                21.3.20.2        29 Jun 2021    10 Feb 2022     3 years            30 Mar 2022                      31 Mar 2024
21.1            Available                21.1.11.3        24 Mar 2021    01 Jun 2022     2 years            30 Apr 2021                      31 Jan 2023
20.8            Available Upon Request   20.8.12.2        02 Dec 2020    03 Feb 2021     2 years            31 Aug 2021                      02 Dec 2022
20.3            Available Upon Request   20.3.19.4        24 Jun 2020    23 Sep 2020     2 years            31 Mar 2021                      24 Jun 2022
  • *During Community Support bug fixes are automatically backported to community builds and picked up in refreshes of Altinity Stable builds.
  • **Altinity Extended Support covers P0-P1 bugs encountered by customers and critical security issues regardless of audience. Fixes are best effort and may not be possible in every circumstance. Altinity makes every effort to ensure a fix, workaround, or upgrade path for covered issues.

2.1 - Altinity Stable Builds Install Guide

How to install the Altinity Stable Builds for ClickHouse

Installing ClickHouse from the Altinity Stable Builds, available from https://builds.altinity.cloud, takes just a few minutes.

General Installation Instructions

When installing or upgrading from a previous version of ClickHouse from the Altinity Stable Builds, review the Release Notes for the ClickHouse version to install and upgrade to before starting. This will inform you of additional steps or requirements of moving from one version to the next.

Part of the installation procedures recommends you specify the version to install. The Release Notes lists the version numbers available for installation.

There are three main methods for installing Altinity Stable Builds:

  • Deb Packages
  • RPM Packages
  • Docker images

The packages come from two sources:

  • Altinity Stable Builds: These are built from a secure, internal build pipeline and available from https://builds.altinity.cloud. Altinity Stable Builds are distinguishable from community builds when displaying version information:

    select version()
    
    ┌─version()────────────────┐
    │ 21.8.11.1.altinitystable │
    └──────────────────────────┘
    
  • Community Builds: These are made by ClickHouse community members, and are available at repo.clickhouse.tech.

2.1.1 - Altinity Stable Builds Deb Install Guide

How to install the Altinity Stable Builds for ClickHouse on Debian based systems.

Installation Instructions: Deb packages

ClickHouse can be installed from the Altinity Stable builds, located at https://builds.altinity.cloud, or from the ClickHouse community repository.

Deb Prerequisites

The following prerequisites must be installed before installing an Altinity Stable build of ClickHouse:

  • curl
  • gnupg2
  • apt-transport-https
  • ca-certificates

These can be installed prior to installing ClickHouse with the following command:

sudo apt-get update
sudo apt-get install curl gnupg2 apt-transport-https ca-certificates

Deb Packages for Altinity Stable Build

To install ClickHouse Altinity Stable build via Deb based packages from the Altinity Stable build repository:

  1. Update the apt-get local repository:

    sudo apt-get update
    
  2. Install the Altinity package signing keys:

    sudo sh -c 'mkdir -p /usr/share/keyrings && curl -s https://builds.altinity.cloud/apt-repo/pubkey.gpg | gpg --dearmor > /usr/share/keyrings/altinity-dev-archive-keyring.gpg'
    
  3. Update the apt-get repository to include the Altinity Stable build repository with the following commands:

    sudo sh -c 'echo "deb [signed-by=/usr/share/keyrings/altinity-dev-archive-keyring.gpg] https://builds.altinity.cloud/apt-repo stable main" > /etc/apt/sources.list.d/altinity-dev.list'
    
    sudo apt-get update
    
  4. Install either a specific version of ClickHouse, or the most current version.

    1. To install a specific version, include the version in the apt-get install command. The example below specifies the version 21.8.10.1.altinitystable:
    version=21.8.10.1.altinitystable
    
    sudo apt-get install clickhouse-common-static=$version clickhouse-client=$version clickhouse-server=$version
    
    1. To install the most current version of the ClickHouse Altinity Stable build without specifying a specific version, leave out the version= command.
    sudo apt-get install clickhouse-client clickhouse-server
    
  5. When prompted, provide the password for the default clickhouse user.

  6. Restart server.

    Installed packages are not applied to an already running server, which makes it convenient to install the packages first and restart the server later at a convenient time.

    sudo systemctl restart clickhouse-server
    

Remove Community Package Repository

For users upgrading to Altinity Stable builds from the community ClickHouse builds, we recommend removing the community builds from the local repository. See the instructions for your distribution of Linux for instructions on modifying your local package repository.
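
As a sketch, if the community repository was added as /etc/apt/sources.list.d/clickhouse.list, the file name used elsewhere in this documentation, it can be removed and the package index refreshed as follows; the file name may differ on your system:

sudo rm /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update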

Community Builds

For instructions on how to install ClickHouse community, see the ClickHouse Documentation site.

2.1.2 - Altinity Stable Builds RPM Install Guide

How to install the Altinity Stable Builds for ClickHouse on RPM based systems.

Installation Instructions: RPM packages

ClickHouse can be installed from the Altinity Stable builds, located at https://builds.altinity.cloud, or from the ClickHouse community repository.

Depending on your Linux distribution, either dnf or yum will be used. See your particular distribution of Linux for specifics.

The instructions below use the command $(type -p dnf || type -p yum) to select the correct package manager based on the distribution in use.

RPM Prerequisites

The following prerequisites must be installed before installing an Altinity Stable build:

  • curl
  • gnupg2

These can be installed prior to installing ClickHouse with the following:

sudo $(type -p dnf || type -p yum) install curl gnupg2

RPM Packages for Altinity Stable Build

To install ClickHouse from an Altinity Stable build via RPM based packages from the Altinity Stable build repository:

  1. Update the local RPM repository to include the Altinity Stable build repository with the following command:

    sudo curl https://builds.altinity.cloud/yum-repo/altinity.repo -o /etc/yum.repos.d/altinity.repo    
    
  2. Install ClickHouse server and client with either yum or dnf. It is recommended to specify a version to maximize compatibility with other applications and clients.

    1. To specify the version of ClickHouse to install, create a variable for the version and pass it to the installation instructions. The example below specifies the version 21.8.10.1.altinitystable:
    version=21.8.10.1.altinitystable
    sudo $(type -p dnf || type -p yum) install clickhouse-common-static-$version clickhouse-server-$version clickhouse-client-$version
    
    1. To install the most recent version of ClickHouse, leave off the version- command and variable:
    sudo $(type -p dnf || type -p yum) install clickhouse-common-static clickhouse-server clickhouse-client
    

Remove Community Package Repository

For users upgrading to Altinity Stable builds from the community ClickHouse builds, we recommend removing the community builds from the local repository. See the instructions for your distribution of Linux for instructions on modifying your local package repository.

RPM Downgrading Altinity ClickHouse Stable to a Previous Release

To downgrade to a previous release, the current version must be installed, and the previous version installed with the --setopt=obsoletes=0 option. Review the Release Notes before downgrading for any considerations or issues that may occur when downgrading between versions of ClickHouse.

For more information, see the Altinity Knowledge Base article Altinity packaging compatibility greater than 21.x and earlier.

Community Builds

For instructions on how to install ClickHouse community, see the ClickHouse Documentation site.

2.1.3 - Altinity Stable Builds Docker Install Guide

How to install the Altinity Stable Builds for ClickHouse with Docker.

Installation Instructions: Docker

These instructions detail how to install a single Altinity Stable build ClickHouse container through Docker. For details on setting up a cluster of Docker containers, see ClickHouse on Kubernetes.

Docker Images are available for Altinity Stable builds and Community builds. The instructions below focus on using the Altinity Stable builds for ClickHouse.

The Altinity Stable Docker images are available from Docker Hub under altinity/clickhouse-server.

To install a ClickHouse Altinity Stable build through Docker:

  1. Create the directory for the docker-compose.yml file and the database storage and ClickHouse server storage.

    mkdir clickhouse
    cd clickhouse
    mkdir clickhouse_database
    
  2. Create the file docker-compose.yml and populate it with the following, updating the clickhouse-server to the current altinity/clickhouse-server version:

    version: '3'
    
    services:
      clickhouse_server:
          image: altinity/clickhouse-server:21.8.10.1.altinitystable
          ports:
          - "8123:8123"
          volumes:
          - ./clickhouse_database:/var/lib/clickhouse
          networks:
              - clickhouse_network
    
    networks:
      clickhouse_network:
          driver: bridge
          ipam:
              config:
                  - subnet: 10.222.1.0/24
    
  3. Launch the ClickHouse Server with docker-compose or docker compose depending on your version of Docker:

    docker compose up -d
    
  4. Verify the installation by logging into the database from the Docker image directly, and make any other necessary updates with:

    docker compose exec clickhouse_server clickhouse-client
    root@67c732d8dc6a:/# clickhouse-client
    ClickHouse client version 21.3.15.2.altinity+stable (altinity build).
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 21.1.10 revision 54443.
    
    67c732d8dc6a :)
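
Because the docker-compose.yml file above publishes the HTTP port 8123 to the host, a quick way to confirm the server is reachable from outside the container is an HTTP ping. This assumes the default port mapping shown above:

curl http://localhost:8123/ping
Ok.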
    

2.1.4 - Altinity Stable Builds macOS Install Guide

How to install the Altinity Stable Builds for ClickHouse with macOS.

Altinity Stable for ClickHouse is available to macOS users through the Homebrew package manager. Users and developers who use macOS as their preferred environment can quickly install a production ready version of ClickHouse within minutes.

The following instructions are targeted for users of Altinity Stable for ClickHouse. For more information on running community or other versions of ClickHouse on macOS, see either the Homebrew Tap for ClickHouse project or the blog post Altinity Introduces macOS Homebrew Tap for ClickHouse.

macOS Prerequisites

Homebrew must be installed. For more information, see the Homebrew installation page at https://brew.sh.

Brew Install for Altinity Stable Instructions

By default, installing ClickHouse through brew will install the latest community version of ClickHouse. Extra steps are required to install the Altinity Stable version of ClickHouse. Altinity Stable is installed as a keg-only version, which requires manually setting paths and using additional commands to run Altinity Stable for ClickHouse through brew.

To install Altinity Stable for ClickHouse in macOS through Brew:

  1. Add the ClickHouse formula via brew tap:

    brew tap altinity/clickhouse
    
  2. Install Altinity Stable for ClickHouse by specifying clickhouse@altinity-stable for the most recent Altinity Stable version, or specify a particular version with clickhouse@{Altinity Stable Version}. For example, as of this writing the most current version of Altinity Stable is 21.8, so the command to install that version is brew install clickhouse@21.8-altinity-stable. To install the most recent version, use the brew install command as follows:

    brew install clickhouse@altinity-stable
    
  3. Because Altinity Stable for ClickHouse is available as a keg-only release, the path must be set manually. These instructions are displayed as part of the installation procedure. Based on your version, the executable directory follows the pattern:

    $(brew --prefix)/opt/{clickhouse version}/bin

    For our example, clickhouse@altinity-stable gives us the following path setting:

    export PATH="/opt/homebrew/opt/clickhouse@21.8-altinity-stable/bin:$PATH"

    Using the which command after updating the path reveals the location of the clickhouse-server executable:

    which clickhouse-server
    /opt/homebrew/opt/clickhouse@21.8-altinity-stable/bin/clickhouse-server
    
  4. To start the Altinity Stable for ClickHouse server use the brew services start command. For example:

    brew services start clickhouse@altinity-stable
    
  5. Connect to the new server with clickhouse-client:

    > clickhouse-client
    ClickHouse client version 21.8.13.1.
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 21.11.6 revision 54450.
    
    ClickHouse client version is older than ClickHouse server. It may lack support for new features.
    
    penny.home :) select version()
    
    SELECT version()
    
    Query id: 128a2cae-d0e2-4170-a771-83fb79429260
    
    ┌─version()─┐
    │ 21.11.6.1 │
    └───────────┘
    
    1 rows in set. Elapsed: 0.004 sec.
    
    penny.home :) exit
    Bye.
    
  6. To end the ClickHouse server, use brew services stop command:

    brew services stop clickhouse@altinity-stable
    

2.1.5 - Altinity Stable Build Guide for ClickHouse

How to build ClickHouse from Altinity Stable manually.

Organizations that prefer to build ClickHouse manually can use the Altinity Stable versions of ClickHouse directly from the source code.

Clone the Repo

Before using either the Docker or Direct build process, the Altinity Stable for ClickHouse source code must be downloaded from the Altinity Stable for ClickHouse repository, located at https://github.com/Altinity/clickhouse. The following procedure is used to update the source code to the most current version. For more information on downloading a specific version of the source code, see the GitHub documentation.

Hardware Recommendations

ClickHouse can run on anything from minimal hardware to full clusters. The following hardware is recommended for building and running ClickHouse:

  • 16 GB of RAM (32 GB recommended)
  • Multiple cores (4+)
  • 20-50 GB disk storage

Downloading Altinity Stable for ClickHouse

Before building ClickHouse, specify the verified version to download and build by using the Altinity Stable for ClickHouse tags. The --recursive option downloads all submodules that are part of the Altinity Stable project.

As of this writing, the most recent verified version is v21.8.10.19-altinitystable, so the command to download that version of Altinity Stable into the folder AltinityStableClickHouse is:

  1. git clone --recursive -b v21.8.10.19-altinitystable --single-branch https://github.com/Altinity/clickhouse.git AltinityStableClickHouse

Direct Build Instructions for Deb Based Linux

To build Altinity Stable for ClickHouse from the source code for Deb based Linux platforms:

  1. Install the prerequisites:

    sudo apt-get install git cmake python ninja-build
    
  2. Install clang-12.

    sudo apt install clang-12
    
  3. Create and enter the build directory within your AltinityStable directory.

    mkdir build && cd build
    
  4. Set the compile variables to clang-12 and initiate the ninja build.

    CC=clang-12 CXX=clang++-12 cmake .. -GNinja
    
  5. Provide the ninja command to build your own Altinity Stable for ClickHouse:

    ninja clickhouse
    
  6. Once complete, Altinity Stable for ClickHouse will be in the project’s programs folder, and can be run with the following commands:

    1. ClickHouse Server: clickhouse server
    2. ClickHouse Client: clickhouse client
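
For example, the freshly built binary can be started and queried from the build directory roughly as follows. This is a sketch: the configuration file path assumes the layout of the cloned source tree described above.

# start the server using the sample configuration shipped in the source tree
./programs/clickhouse server --config-file=../programs/server/config.xml

# in another terminal, connect with the freshly built client
./programs/clickhouse client --query "SELECT version()"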

2.1.6 - Legacy ClickHouse Altinity Stable Releases Install Guide

How to install the ClickHouse Altinity Stable Releases from packagecloud.io.

ClickHouse Altinity Stable Releases are specially vetted community builds of ClickHouse that Altinity certifies for production use. We track critical changes and verify against a series of tests to make sure they’re ready for your production environment. We take the steps to verify how to upgrade from previous versions, and what issues you might run into when transitioning your ClickHouse clusters to the next ClickHouse Altinity Stable release.

As of October 12, 2021, Altinity replaced the ClickHouse Altinity Stable Releases with the Altinity Stable Builds, providing longer support and validation. For more information, see Altinity Stable Builds.

Legacy versions of the ClickHouse Altinity Stable Releases are available from the Altinity ClickHouse Stable Release packagecloud.io repository, located at https://packagecloud.io/Altinity/altinity-stable.

The available Altinity ClickHouse Stable Releases from packagecloud.io for ClickHouse server, ClickHouse client and ClickHouse common versions are:

  • Altinity ClickHouse Stable Release 21.1.10.3
  • Altinity ClickHouse Stable Release 21.3.13.9
  • Altinity ClickHouse Stable Release 21.3.15.2
  • Altinity ClickHouse Stable Release 21.3.15.4

General Installation Instructions

When installing or upgrading from a previous version of legacy ClickHouse Altinity Stable Release, review the Release Notes for the version to install and upgrade to before starting. This will inform you of additional steps or requirements of moving from one version to the next.

Part of the installation procedures recommends you specify the version to install. The Release Notes lists the version numbers available for installation.

There are three main methods for installing the legacy ClickHouse Altinity Stable Releases:

Altinity ClickHouse Stable Releases are distinguishable from community builds when displaying version information. The suffix altinitystable will be displayed after the version number:

select version()

┌─version()────────────────┐
│ 21.3.15.2.altinitystable │
└──────────────────────────┘

Prerequisites

This guide assumes that the reader is familiar with Linux commands, permissions, and how to install software for their particular Linux distribution. The reader will have to verify they have the correct permissions to install the software in their target systems.

Installation Instructions

Legacy Altinity ClickHouse Stable Release DEB Builds

To install legacy ClickHouse Altinity Stable Release version DEB packages from packagecloud.io:

  1. Update the apt-get repository with the following command:

    curl -s https://packagecloud.io/install/repositories/Altinity/altinity-stable/script.deb.sh | sudo bash
    
  2. ClickHouse can be installed either by specifying a specific version, or automatically going to the most current version. It is recommended to specify a version for maximum compatibility with existing clients.

    1. To install a specific version, create a variable specifying the version to install and including it with the install command:
    version=21.8.8.1.altinitystable
    sudo apt-get install clickhouse-client=$version clickhouse-server=$version clickhouse-common-static=$version
    
    1. To install the most current version of the legacy ClickHouse Altinity Stable release without specifying a specific version, leave out the version= command.
    sudo apt-get install clickhouse-client clickhouse-server clickhouse-server-common
    
  3. Restart server.

    Installed packages are not applied to the already running server, which makes it convenient to install the packages first and restart the server later at a convenient time.

    sudo systemctl restart clickhouse-server
    

Legacy Altinity ClickHouse Stable Release RPM Builds

To install legacy ClickHouse Altinity Stable Release version RPM packages from packagecloud.io:

  1. Update the yum package repository configuration with the following command:

    curl -s https://packagecloud.io/install/repositories/Altinity/altinity-stable/script.rpm.sh | sudo bash
    
  2. ClickHouse can be installed either by specifying a specific version, or automatically going to the most current version. It is recommended to specify a version for maximum compatibility with existing clients.

    1. To install a specific version, create a variable specifying the version to install and including it with the install command:
    version=21.8.8.1.altinitystable
    sudo yum install clickhouse-client-${version} clickhouse-server-${version} clickhouse-server-common-${version}
    
    1. To install the most current version of the legacy ClickHouse Altinity Stable release without specifying a specific version, leave out the version= command.
    sudo yum install clickhouse-client clickhouse-server clickhouse-server-common
    
  3. Restart the ClickHouse server.

    sudo systemctl restart clickhouse-server
    

2.2 - Monitoring Considerations

Monitoring Considerations

Monitoring helps to track potential issues in your cluster before they cause a critical error.

External Monitoring

External monitoring collects data from the ClickHouse cluster and uses it for analysis and review. Recommended external monitoring systems include:

ClickHouse can record metrics internally by enabling system.metric_log in config.xml.
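
Once system.metric_log is enabled, a quick way to confirm that metrics are actually being recorded is to count the rows written during the last minute, for example:

clickhouse-client --query "SELECT count() FROM system.metric_log WHERE event_time > now() - 60"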

For the dashboard system:

  • Grafana is recommended for graphs, reports, alerts, dashboard, etc.
  • Other options are Nagios or Zabbix.

The following metrics should be collected:

  • For Host Machine:
    • CPU
    • Memory
    • Network (bytes/packets)
    • Storage (iops)
    • Disk Space (free / used)
  • For ClickHouse:
    • Connections (count)
    • RWLocks
    • Read / Write / Return (bytes)
    • Read / Write / Return (rows)
    • Zookeeper operations (count)
    • Absolute delay
    • Query duration (optional)
    • Replication parts and queue (count)
  • For Zookeeper:

The following queries are recommended to be included in monitoring:

  • SELECT * FROM system.replicas
    • For more information, see the ClickHouse guide on System Tables
  • SELECT * FROM system.merges
    • Checks on the speed and progress of currently executed merges.
  • SELECT * FROM system.mutations
    • This is the source of information on the speed and progress of currently executed mutations.
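
For example, the replica-related checks above can be folded into a single spot check for lagging or read-only replicas. This is a sketch; the 300 second delay threshold should be adjusted to your environment:

clickhouse-client --query "
  SELECT database, table, is_readonly, absolute_delay, queue_size
  FROM system.replicas
  WHERE is_readonly OR absolute_delay > 300"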

Monitor and Alerts

Configure the notifications for events and thresholds based on the following table:

Health Checks

The following health checks should be monitored:

Each check below lists the shell or SQL command to run and, where assigned, its severity in parentheses.

  • ClickHouse status (Critical)
    $ curl 'http://localhost:8123/' returns Ok.
  • Too many simultaneous queries; maximum: 100 (Critical)
    select value from system.metrics where metric='Query'
  • Replication status (High)
    $ curl 'http://localhost:8123/replicas_status' returns Ok.
  • Read only replicas, also reflected by replicas_status (High)
    select value from system.metrics where metric='ReadonlyReplica'
  • ReplicaPartialShutdown (High)
    select value from system.events where event='ReplicaPartialShutdown'
    Not reflected by replicas_status, but appears to correlate with ZooKeeperHardwareExceptions. Since it almost always correlates with ZooKeeperHardwareExceptions, and nothing bad is usually happening when it does not, this check can be turned off.
  • Some replication tasks are stuck (High)
    select count() from system.replication_queue where num_tries > 100
  • ZooKeeper is available (Critical for writes)
    select count() from system.zookeeper where path='/'
  • ZooKeeper exceptions (Medium)
    select value from system.events where event='ZooKeeperHardwareExceptions'
  • Other ClickHouse nodes are available
    $ for node in `echo "select distinct host_address from system.clusters where host_name !='localhost'" | curl 'http://localhost:8123/' --silent --data-binary @-`; do curl "http://$node:8123/" --silent ; done
  • All ClickHouse clusters are available (i.e. every configured cluster has enough replicas to serve queries)
    for cluster in `echo "select distinct cluster from system.clusters where host_name !='localhost'" | curl 'http://localhost:8123/' --silent --data-binary @-` ; do clickhouse-client --query="select '$cluster', 'OK' from cluster('$cluster', system, one)" ; done
  • There are files in 'detached' folders
    $ find /var/lib/clickhouse/data/*/*/detached/* -type d | wc -l
    On version 19.8 and later: select count() from system.detached_parts
  • Too many parts: the number of parts is growing, inserts are being delayed, or inserts are being rejected (Critical)
    select value from system.asynchronous_metrics where metric='MaxPartCountForPartition';
    select value from system.events/system.metrics where event/metric='DelayedInserts';
    select value from system.events where event='RejectedInserts'
  • Dictionaries: exception (Medium)
    select concat(name,': ',last_exception) from system.dictionaries where last_exception != ''
  • ClickHouse has been restarted
    select uptime(); select value from system.asynchronous_metrics where metric='Uptime'
  • DistributedFilesToInsert should not be always increasing (Medium)
    select value from system.metrics where metric='DistributedFilesToInsert'
  • A data part was lost (High)
    select value from system.events where event='ReplicatedDataLoss'
  • Data parts are not the same on different replicas (Medium)
    select value from system.events where event='DataAfterMergeDiffersFromReplica';
    select value from system.events where event='DataAfterMutationDiffersFromReplica'

Monitoring References

3 - ClickHouse on Kubernetes

Install and Manage ClickHouse Clusters on Kubernetes.

Setting up a cluster of Altinity Stable for ClickHouse is made easy with Kubernetes, even if saying that takes some effort from the tongue. Organizations that want to set up their own distributed ClickHouse environments can do so with the Altinity Kubernetes Operator.

As of this time, the current version of the Altinity Kubernetes Operator is 0.18.5.

3.1 - Altinity Kubernetes Operator Quick Start Guide

Become familiar with the Kubernetes Altinity Kubernetes Operator in the fewest steps.

If you’re running the Altinity Kubernetes Operator for the first time, or just want to get it up and running as quickly as possible, the Quick Start Guide is for you.

Requirements:

  • An operating system running Kubernetes and Docker, or a service providing support for them such as AWS.
  • A ClickHouse remote client such as clickhouse-client. Full instructions for installing ClickHouse can be found on the ClickHouse Installation page.

3.1.1 - Installation

Install and Verify the Altinity Kubernetes Operator

The Altinity Kubernetes Operator can be installed in just a few minutes with a single command into your existing Kubernetes environment.

Those who need a more customized installation or want to build the Altinity Kubernetes Operator themselves can do so through the Operator Installation Guide.

Requirements

Before starting, make sure you have the following installed:

Quick Install

To install the Altinity Kubernetes Operator into your existing Kubernetes environment, run the following command, or download the Altinity Kubernetes Operator install file and modify it to best fit your needs. For more information on custom Altinity Kubernetes Operator settings that can be applied, see the Operator Guide.

We recommend that when installing the Altinity Kubernetes Operator, you specify the version to be installed. This ensures maximum compatibility with applications and established Kubernetes environments running your ClickHouse clusters. For more information on installing other versions of the Altinity Kubernetes Operator, see the specific Version Installation Guide.

The most current version is 0.18.3:

kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml

Output similar to the following will be displayed on a successful installation. For more information on the resources created in the installation, see Altinity Kubernetes Operator Resources.

customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created

Quick Verify

To verify that the installation was successful, run the following. On a successful installation you’ll be able to see the clickhouse-operator pod under the NAME column.

kubectl get pods --namespace kube-system
NAME                                   READY   STATUS    RESTARTS       AGE
clickhouse-operator-857c69ffc6-dq2sz   2/2     Running   0              5s
coredns-78fcd69978-nthp2               1/1     Running   4 (110s ago)   50d
etcd-minikube                          1/1     Running   4 (115s ago)   50d
kube-apiserver-minikube                1/1     Running   4 (105s ago)   50d
kube-controller-manager-minikube       1/1     Running   4 (115s ago)   50d
kube-proxy-lsggn                       1/1     Running   4 (115s ago)   50d
kube-scheduler-minikube                1/1     Running   4 (105s ago)   50d
storage-provisioner                    1/1     Running   8 (115s ago)   50d
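
If the clickhouse-operator pod is not in a Running state, its log is the first place to check. A minimal sketch, assuming the default kube-system namespace from the bundle above (the container name clickhouse-operator is an assumption based on the standard install bundle):

kubectl logs -n kube-system deployment/clickhouse-operator -c clickhouse-operator --tail=20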

3.1.2 - First Clusters

Create your first ClickHouse Cluster

If you followed the Quick Installation guide, then you have the
Altinity Kubernetes Operator installed.
Let’s give it something to work with.

Create Your Namespace

For our examples, we’ll be setting up our own Kubernetes namespace test.
All of the examples will be installed into that namespace so we can track
how the cluster is modified with new updates.

Create the namespace with the following kubectl command:

kubectl create namespace test
namespace/test created

Just to make sure we’re in a clean environment,
let’s check for any resources in our namespace:

kubectl get all -n test
No resources found in test namespace.

The First Cluster

We’ll start with the simplest cluster: one shard, one replica.
This template and others are available on the
Altinity Kubernetes Operator GitHub example site,
and it contains the following:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1

Save this as sample01.yaml and launch it with the following:

kubectl apply -n test -f sample01.yaml
clickhouseinstallation.clickhouse.altinity.com/demo-01 created

Verify that the new cluster is running. When the status shows
Completed, the cluster is ready.

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT
demo-01   0.18.1    1          1        1       6d1d2c3d-90e5-4110-81ab-8863b0d1ac47   Completed             1                          clickhouse-demo-01.test.svc.cluster.local

To retrieve the IP information use the get service option:

kubectl get service -n test
NAME                      TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None           <none>        8123/TCP,9000/TCP,9009/TCP      2s
clickhouse-demo-01        LoadBalancer   10.111.27.86   <pending>     8123:31126/TCP,9000:32460/TCP   19s

So we can see our pod is running, and that we have the
load balancer for the cluster.

Connect To Your Cluster With Exec

Let’s talk to our cluster and run some simple ClickHouse queries.

We can hop in directly through Kubernetes and run the clickhouse-client
that’s part of the image with the following command:

kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- clickhouse-client
ClickHouse client version 20.12.4.5 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 20.12.4 revision 54442.

chi-demo-01-demo-01-0-0-0.chi-demo-01-demo-01-0-0.test.svc.cluster.local :)

From within ClickHouse, we can check out the current clusters:

SELECT * FROM system.clusters
┌─cluster─────────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
 all-replicated                                           1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-sharded                                              1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 demo-01                                                  1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            2  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            3  127.0.0.3                127.0.0.3     9000         0  default                               0                0                        0 
 test_cluster_two_shards                                  1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards                                  2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_localhost                        1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_localhost                        2             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost                                     1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost_secure                              1             1            1  localhost                127.0.0.1     9440         0  default                               0                0                        0 
 test_unavailable_shard                                   1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_unavailable_shard                                   2             1            1  localhost                127.0.0.1        1         0  default                               0                0                        0 
└─────────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

Exit out of your cluster:

chi-demo-01-demo-01-0-0-0.chi-demo-01-demo-01-0-0.test.svc.cluster.local :) exit
Bye.

Connect to Your Cluster with Remote Client

You can also use a remote client such as clickhouse-client to
connect to your cluster through the LoadBalancer.

  • The default username and password are set by the
    clickhouse-operator-install.yaml file. These values can be altered
    by changing the chUsername and chPassword values in the ClickHouse
    Credentials section:

    • Default Username: clickhouse_operator
    • Default Password: clickhouse_operator_password
# ClickHouse credentials (username, password and port) to be used
# by operator to connect to ClickHouse instances for:
# 1. Metrics requests
# 2. Schema maintenance
# 3. DROP DNS CACHE
# User with such credentials can be specified in additional ClickHouse
# .xml config files,
# located in `chUsersConfigsPath` folder
chUsername: clickhouse_operator
chPassword: clickhouse_operator_password
chPort: 8123

In either case, the command to connect to your new cluster will
resemble the following, replacing {LoadBalancer hostname} with
the name or IP address of your LoadBalancer, then providing
the proper password. In our examples so far, that’s been localhost.
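
A minimal sketch, assuming the default operator credentials shown above:

clickhouse-client --host {LoadBalancer hostname} --user=clickhouse_operator --password=clickhouse_operator_password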

From there, just make your ClickHouse SQL queries as you please - but
remember that this particular cluster has no persistent storage.
If the cluster is modified in any way, any databases or tables
created will be wiped clean.

Update Your First Cluster To 2 Shards

Well that’s great - we have a cluster running. Granted, it’s really small
and doesn’t do much, but what if we want to upgrade it?

Sure - we can do that any time we want.

Take your sample01.yaml and save it as sample02.yaml.

Let’s update it to give us two shards running with one replica:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 1

Save your YAML file, and apply it. We’ve defined the name
in the metadata, so the operator knows exactly which cluster to update.

kubectl apply -n test -f sample02.yaml
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured

Verify that the cluster is running - this may take a few
minutes depending on your system:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT
demo-01   0.18.1    1          2        2       80102179-4aa5-4e8f-826c-1ca7a1e0f7b9   Completed             1                          clickhouse-demo-01.test.svc.cluster.local
kubectl get service -n test
NAME                      TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None           <none>        8123/TCP,9000/TCP,9009/TCP      26s
chi-demo-01-demo-01-1-0   ClusterIP      None           <none>        8123/TCP,9000/TCP,9009/TCP      3s
clickhouse-demo-01        LoadBalancer   10.111.27.86   <pending>     8123:31126/TCP,9000:32460/TCP   43s

Once again, we can reach right into our cluster with
clickhouse-client and look at the clusters.

clickhouse-client --host localhost --user=clickhouse_operator --password=clickhouse_operator_password
ClickHouse client version 20.12.4.5 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 20.12.4 revision 54442.

chi-demo-01-demo-01-1-0-0.chi-demo-01-demo-01-1-0.test.svc.cluster.local :)
SELECT * FROM system.clusters
┌─cluster─────────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
 all-replicated                                           1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-sharded                                              1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 demo-01                                                  1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            2  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            3  127.0.0.3                127.0.0.3     9000         0  default                               0                0                        0 
 test_cluster_two_shards                                  1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards                                  2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_localhost                        1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_localhost                        2             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost                                     1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost_secure                              1             1            1  localhost                127.0.0.1     9440         0  default                               0                0                        0 
 test_unavailable_shard                                   1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_unavailable_shard                                   2             1            1  localhost                127.0.0.1        1         0  default                               0                0                        0 
└─────────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

So far, so good. We can create some basic clusters.
If we want to do more, we’ll have to move ahead with replication
and zookeeper in the next section.

3.1.3 - Zookeeper and Replicas

Install Zookeeper and Replicas

If you are starting from a fresh environment, create the test namespace used throughout these examples:

kubectl create namespace test
namespace/test created

Now we’ve seen how to set up a basic cluster and upgrade it. Time to step
up our game and set up our cluster with Zookeeper, and then add
persistent storage to it.

The Altinity Kubernetes Operator does not install or manage Zookeeper.
Zookeeper must be provided and managed externally. The samples below
are examples on establishing Zookeeper to provide replication support.
For more information on running and configuring Zookeeper,
see the Apache Zookeeper site.

This step cannot be skipped - your Zookeeper instance must
have been set up externally from your ClickHouse clusters.
Whether your Zookeeper installation is hosted by other
Docker Images or separate servers is up to you.

Install Zookeeper

Kubernetes Zookeeper Deployment

A simple method of installing a single Zookeeper node is provided in
the Altinity Kubernetes Operator
deployment samples. These provide sample deployments of Grafana, Prometheus, Zookeeper, and other applications.

See the Altinity Kubernetes Operator deployment directory
for a full list of sample scripts and Kubernetes deployment files.

The instructions below will create a new Kubernetes namespace zoo1ns,
and create a Zookeeper node in that namespace.
Kubernetes nodes will refer to that Zookeeper node by the hostname
zookeeper.zoo1ns within the created Kubernetes networks.

To deploy a single Zookeeper node in Kubernetes from the
Altinity Kubernetes Operator Github repository:

  1. Download the Altinity Kubernetes Operator GitHub repository, either with
    git clone https://github.com/Altinity/clickhouse-operator.git or by selecting Code->Download Zip from the
    Altinity Kubernetes Operator GitHub repository.

  2. From a terminal, navigate to the deploy/zookeeper directory
    and run the following:

cd clickhouse-operator/deploy/zookeeper
./quick-start-volume-emptyDir/zookeeper-1-node-create.sh
namespace/zoo1ns created
service/zookeeper created
service/zookeepers created
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
poddisruptionbudget.policy/zookeeper-pod-disruption-budget created
statefulset.apps/zookeeper created
  3. Verify the Zookeeper node is running in Kubernetes:
kubectl get all --namespace zoo1ns
NAME              READY   STATUS    RESTARTS   AGE
pod/zookeeper-0   0/1     Running   0          2s

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
service/zookeeper    ClusterIP   10.100.31.86   <none>        2181/TCP,7000/TCP   2s
service/zookeepers   ClusterIP   None           <none>        2888/TCP,3888/TCP   2s

NAME                         READY   AGE
statefulset.apps/zookeeper   0/1     2s
  4. Kubernetes nodes will be able to refer to the Zookeeper
    node by the hostname zookeeper.zoo1ns, as in the DNS check sketch below.

Configure Kubernetes with Zookeeper

Once we start replicating clusters, we need Zookeeper to manage them.
Create a new file sample03.yaml and populate it with the following:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 2
        templates:
          podTemplate: clickhouse-stable
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable

Notice that we’re increasing the number of replicas from the
sample02.yaml file in the
First Clusters tutorial.

We’ll set up a minimal Zookeeper connecting cluster by applying
our new configuration file:

kubectl apply -f sample03.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created

Verify it with the following:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT                                    AGE
demo-01   0.18.3    1          2        4       5ec69e86-7e4d-4b8b-877f-f298f26161b2   Completed             4                          clickhouse-demo-01.test.svc.cluster.local   102s
kubectl get service -n test
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      85s
chi-demo-01-demo-01-0-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      68s
chi-demo-01-demo-01-1-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      47s
chi-demo-01-demo-01-1-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      16s
clickhouse-demo-01        LoadBalancer   10.104.157.249   <pending>     8123:32543/TCP,9000:30797/TCP   101s

If we log into our cluster and show the clusters, we can see
the updated results: demo-01 now has a total of 4 hosts -
two shards, each with two replicas.

SELECT * FROM system.clusters
┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
 all-replicated                                        1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-replicated                                        1             1            2  chi-demo-01-demo-01-0-1  172.17.0.6    9000         0  default                               0                0                        0 
 all-replicated                                        1             1            3  chi-demo-01-demo-01-1-0  172.17.0.7    9000         0  default                               0                0                        0 
 all-replicated                                        1             1            4  chi-demo-01-demo-01-1-1  172.17.0.8    9000         0  default                               0                0                        0 
 all-sharded                                           1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-sharded                                           2             1            1  chi-demo-01-demo-01-0-1  172.17.0.6    9000         0  default                               0                0                        0 
 all-sharded                                           3             1            1  chi-demo-01-demo-01-1-0  172.17.0.7    9000         0  default                               0                0                        0 
 all-sharded                                           4             1            1  chi-demo-01-demo-01-1-1  172.17.0.8    9000         0  default                               0                0                        0 
 demo-01                                               1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 demo-01                                               1             1            2  chi-demo-01-demo-01-0-1  172.17.0.6    9000         0  default                               0                0                        0 
 demo-01                                               2             1            1  chi-demo-01-demo-01-1-0  172.17.0.7    9000         0  default                               0                0                        0 
 demo-01                                               2             1            2  chi-demo-01-demo-01-1-1  172.17.0.8    9000         0  default                               0                0                        0 
 test_cluster_two_shards                               1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards                               2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_internal_replication          1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_internal_replication          2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_localhost                     1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_localhost                     2             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost                                  1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost_secure                           1             1            1  localhost                127.0.0.1     9440         0  default                               0                0                        0 
 test_unavailable_shard                                1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_unavailable_shard                                2             1            1  localhost                127.0.0.1        1         0  default                               0                0                        0 
└──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
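
To confirm that ClickHouse can reach Zookeeper, one quick check (a sketch; the path '/' is simply the Zookeeper root) is to query the system.zookeeper table from the ClickHouse client:

SELECT name FROM system.zookeeper WHERE path = '/'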

Distributed Tables

We have our cluster going - let’s test it out with some distributed
tables so we can see the replication in action.

Login to your ClickHouse cluster and enter the following SQL statement:

CREATE TABLE test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')

Once our table is created, run a SELECT * FROM test command.
Since the table simply points at system.one on each shard, all we
get back is a dummy row from each shard - but that’s all right.

SELECT * FROM test
┌─dummy─┐
     0 
└───────┘
┌─dummy─┐
     0 
└───────┘

Now let’s see where the results are coming from.
Run the following command - it tells us which shard is
returning the results. It may take a few tries, but you’ll
start to notice the host name changes each time you run the
command SELECT hostName() FROM test:

SELECT hostName() FROM test
┌─hostName()────────────────┐
 chi-demo-01-demo-01-0-0-0 
└───────────────────────────┘
┌─hostName()────────────────┐
 chi-demo-01-demo-01-1-1-0 
└───────────────────────────┘
SELECT hostName() FROM test
┌─hostName()────────────────┐
 chi-demo-01-demo-01-0-0-0 
└───────────────────────────┘
┌─hostName()────────────────┐
 chi-demo-01-demo-01-1-0-0 
└───────────────────────────┘

This is showing us that the query is being distributed across
different shards. The good news is you can change your
configuration files to change the shards and replication
however suits your needs.
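
Replication itself follows the same pattern once Zookeeper is configured. The following is a sketch only - the table names and columns are illustrative, and the {shard} and {replica} macros are assumed to be the per-host macros provided by the operator:

-- Illustrative only: a replicated table on every host of demo-01,
-- plus a distributed table that queries all of its shards.
CREATE TABLE events_local ON CLUSTER 'demo-01'
(
    id UInt64,
    ts DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events_local', '{replica}')
ORDER BY id;

CREATE TABLE events ON CLUSTER 'demo-01' AS events_local
ENGINE = Distributed('demo-01', default, events_local, rand());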

One issue though: there’s no persistent storage.
If these clusters stop running, your data vanishes.
The next section shows how to add persistent storage
to your ClickHouse clusters running on Kubernetes.
In fact, we can demonstrate the problem by creating a new configuration
file called sample04.yaml:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-stable
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable

Make sure you’ve exited out of your ClickHouse client,
then apply the new configuration file:

kubectl apply -f sample04.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured

Notice that during the update four pods were deleted,
and then two new ones were added.

When your clusters have settled down and are back to just 1 shard
with 1 replica, log back into your ClickHouse database
and select from the table test:

SELECT * FROM test
Received exception from server (version 21.8.10):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table default.test doesn't exist. 
command terminated with exit code 60

No persistent storage means any time your clusters are changed over,
everything you’ve done is gone. The next article will cover
how to correct that by adding storage volumes to your cluster.

3.1.4 - Persistent Storage

How to set up persistent storage for your ClickHouse Kubernetes cluster.

If you are starting from a fresh environment, create the test namespace used in these examples:

kubectl create namespace test
namespace/test created

We’ve shown how to create ClickHouse clusters in Kubernetes and how to add Zookeeper so we can create replicas of clusters. Now we’re going to show how to set up persistent storage so you can change your cluster configurations without losing your hard work.

The examples here are built from the Altinity Kubernetes Operator examples, simplified down for our demonstrations.

Create a new file called sample05.yaml with the following:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
        nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 2
        templates:
          podTemplate: clickhouse-stable
          volumeClaimTemplate: storage-vc-template
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
          - name: clickhouse
            image: altinity/clickhouse-server:21.8.10.1.altinitystable
    volumeClaimTemplates:
      - name: storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

Those who have followed the previous examples will recognize the
clusters being created, but there are some new additions:

  • volumeClaimTemplate: This is setting up storage, and we’re
    specifying the storage class as standard. For full details on the
    different storage classes see the
    Kubernetes Storage Class documentation.
  • storage: We’re going to give our cluster 1 Gigabyte of storage,
    enough for our sample systems. If you need more space, it
    can be increased by changing these settings.
  • podTemplate: Here we’ll specify what our pod types are going to be.
    We’ll use the Altinity Stable ClickHouse image specified above,
    but other versions can be specified to best fit your needs.
    For more information, see the
    ClickHouse on Kubernetes Operator Guide.

Save your new configuration file and install it.
If you’ve been following this guide and already have the
namespace test operating, this will update it:

kubectl apply -f sample05.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created

Verify that it completes by checking the resources in this namespace;
you should have similar results:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT                                    AGE
demo-01   0.18.3    1          2        4       57ec3f87-9950-4e5e-9b26-13680f66331d   Completed             4                          clickhouse-demo-01.test.svc.cluster.local   108s
kubectl get service -n test
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      81s
chi-demo-01-demo-01-0-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      63s
chi-demo-01-demo-01-1-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      45s
chi-demo-01-demo-01-1-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      8s
clickhouse-demo-01        LoadBalancer   10.104.236.138   <pending>     8123:31281/TCP,9000:30052/TCP   98s

Testing Persistent Storage

Everything is running, so let’s verify that our storage is working.
We’re going to exec into one of the pods created and check
its filesystems:

kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          32G   26G  4.0G  87% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda2        32G   26G  4.0G  87% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           7.7G   12K  7.7G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           3.9G     0  3.9G   0% /proc/acpi
tmpfs           3.9G     0  3.9G   0% /proc/scsi
tmpfs           3.9G     0  3.9G   0% /sys/firmware

And we can see we have about 1 Gigabyte of storage
allocated to our cluster.
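
Another way to confirm the claims - a quick sketch; names and sizes will vary with your storage class - is to list the PersistentVolumeClaims the operator created in the namespace:

kubectl -n test get pvc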

Let’s add some data to it. Nothing major, just to show that we can
store information, then change the configuration and the data stays.

Exit out of your cluster and launch clickhouse-client on your LoadBalancer.
We’re going to create a database, then create a table in the database,
then show both.

SHOW DATABASES
┌─name────┐
 default 
 system  
└─────────┘
CREATE DATABASE teststorage
CREATE TABLE teststorage.test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')
SHOW DATABASES
┌─name────────┐
 default     
 system      
 teststorage 
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
     0 
└───────┘
┌─dummy─┐
     0 
└───────┘

If you followed the instructions from
Zookeeper and Replicas, recall that at the end, when we updated
the configuration of our sample cluster, all of the tables and
data we made were deleted.
Let’s recreate that experiment now with a new configuration.

Create a new file called sample06.yaml. We’re going to reduce
the shards and replicas to 1:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
        nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-stable
          volumeClaimTemplate: storage-vc-template
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
          - name: clickhouse
            image: altinity/clickhouse-server:21.8.10.1.altinitystable
    volumeClaimTemplates:
      - name: storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

Update the cluster with the following:

kubectl apply -f sample06.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured

Wait until the configuration is done and all of the pods are spun down,
then check the storage available on one of the pods:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT                                    AGE
demo-01   0.18.3    1          1        1       776c1a82-44e1-4c2e-97a7-34cef629e698   Completed                               4        clickhouse-demo-01.test.svc.cluster.local   2m56s
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          32G   26G  4.0G  87% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda2        32G   26G  4.0G  87% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           7.7G   12K  7.7G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           3.9G     0  3.9G   0% /proc/acpi
tmpfs           3.9G     0  3.9G   0% /proc/scsi
tmpfs           3.9G     0  3.9G   0% /sys/firmware

The storage is still there. We can check that our databases are still
available by logging into ClickHouse:

SHOW DATABASES
┌─name────────┐
 default     
 system      
 teststorage 
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
     0 
└───────┘

All of our databases and tables are there.

There are different ways of allocating storage - for data, for logging,
or multiple data volumes for your cluster nodes - but this will get
you started running your own ClickHouse cluster on Kubernetes
in your favorite environment. A sketch of separate data and
log volumes follows.
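
As a sketch of that idea only - the dataVolumeClaimTemplate and logVolumeClaimTemplate names follow the operator’s published examples, so verify them against the Operator Guide for your version - separate data and log volumes can be declared like this:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  defaults:
    templates:
      # Assumed template names - data and logs each get their own volume.
      dataVolumeClaimTemplate: data-volume-template
      logVolumeClaimTemplate: log-volume-template
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1
  templates:
    volumeClaimTemplates:
      - name: data-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
      - name: log-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi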

3.1.5 - Uninstall

How to uninstall the Altinity Kubernetes Operator and its namespace

To remove the Altinity Kubernetes Operator, both the Altinity Kubernetes Operator and the components in its installed namespace will have to be removed. The proper command uses the same clickhouse-operator-install-bundle.yaml file that was used to install the Altinity Kubernetes Operator. For more details, see how to install and verify the Altinity Kubernetes Operator.

The following instructions are based on the standard installation instructions. For users who performed a custom installation, note that any custom namespaces that the user wants to remove will have to be deleted separately from the Altinity Kubernetes Operator deletion.

For example, if the custom namespace operator-test is created, then it would be removed with the command kubectl delete namespaces operator-test.

Instructions

To remove the Altinity Kubernetes Operator from your Kubernetes environment from a standard install:

  1. Verify the Altinity Kubernetes Operator is in the kube-system namespace by listing its pods. The Altinity Kubernetes Operator and other pods will be displayed:

    kubectl get pods --namespace kube-system

    NAME                                   READY   STATUS    RESTARTS      AGE
    clickhouse-operator-857c69ffc6-2frgl   2/2     Running   0             5s
    coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
    etcd-minikube                          1/1     Running   4 (23h ago)   51d
    kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
    kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
    kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
    kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
    storage-provisioner                    1/1     Running   9 (23h ago)   51d
    
  2. Issue the kubectl delete command using the same YAML file used to install the Altinity Kubernetes Operator. By default the Altinity Kubernetes Operator is installed in the namespace kube-system. If it was installed into a custom namespace, make sure that namespace is specified in the uninstall command. In this example, we specified an installation of the Altinity Kubernetes Operator version 0.18.3 into the default kube-system namespace. This produces output similar to the following:

    kubectl delete -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
    customresourcedefinition.apiextensions.k8s.io "clickhouseinstallations.clickhouse.altinity.com" deleted
    customresourcedefinition.apiextensions.k8s.io "clickhouseinstallationtemplates.clickhouse.altinity.com" deleted
    customresourcedefinition.apiextensions.k8s.io "clickhouseoperatorconfigurations.clickhouse.altinity.com" deleted
    serviceaccount "clickhouse-operator" deleted
    clusterrole.rbac.authorization.k8s.io "clickhouse-operator-kube-system" deleted
    clusterrolebinding.rbac.authorization.k8s.io "clickhouse-operator-kube-system" deleted
    configmap "etc-clickhouse-operator-files" deleted
    configmap "etc-clickhouse-operator-confd-files" deleted
    configmap "etc-clickhouse-operator-configd-files" deleted
    configmap "etc-clickhouse-operator-templatesd-files" deleted
    configmap "etc-clickhouse-operator-usersd-files" deleted
    deployment.apps "clickhouse-operator" deleted
    service "clickhouse-operator-metrics" deleted
    
  3. To verify the Altinity Kubernetes Operator has been removed, use the kubectl get pods command:

    kubectl get pods --namespace kube-system
    
    NAME                               READY   STATUS    RESTARTS      AGE
    coredns-78fcd69978-nthp2           1/1     Running   4 (23h ago)   51d
    etcd-minikube                      1/1     Running   4 (23h ago)   51d
    kube-apiserver-minikube            1/1     Running   4 (23h ago)   51d
    kube-controller-manager-minikube   1/1     Running   4 (23h ago)   51d
    kube-proxy-lsggn                   1/1     Running   4 (23h ago)   51d
    kube-scheduler-minikube            1/1     Running   4 (23h ago)   51d
    storage-provisioner                1/1     Running   9 (23h ago)   51d
    

3.2 - Kubernetes Install Guide

How to install Kubernetes in different environments

Kubernetes and Zookeeper form the backbone for running the Altinity Kubernetes Operator in a cluster. The following guides detail how to set up Kubernetes in different environments.

3.2.1 - Install minikube for Linux

How to install Kubernetes through minikube

One popular option for installing Kubernetes is minikube, which creates a local Kubernetes cluster for different environments. Test scripts and examples for the clickhouse-operator are based on using minikube to set up the Kubernetes environment.

The following guide demonstrates how to install a minikube environment that supports the clickhouse-operator for the following operating systems:

  • Linux (Deb based)

Minikube Installation for Deb Based Linux

The following instructions assume an installation for x86-64 based Linux distributions that use Deb package installation. Please see the referenced documentation for instructions for other Linux distributions and platforms.

To install minikube that supports running clickhouse-operator:

kubectl Installation for Deb

The following instructions are based on Install and Set Up kubectl on Linux

  1. Download the kubectl binary:

    curl -LO 'https://dl.k8s.io/release/v1.22.0/bin/linux/amd64/kubectl'
    
  2. Verify the SHA-256 hash:

    curl -LO "https://dl.k8s.io/v1.22.0/bin/linux/amd64/kubectl.sha256"
    
    echo "$(<kubectl.sha256) kubectl" | sha256sum --check
    
  3. Install kubectl into the /usr/local/bin directory (this assumes that your PATH includes /usr/local/bin):

    sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
    
  4. Verify the installation and the version:

    kubectl version
    

Install Docker for Deb

These instructions are based on Docker’s documentation Install Docker Engine on Ubuntu

  1. Install the Docker repository links.

    1. Update the apt-get repository:

      sudo apt-get update
      
  2. Install the prerequisites ca-certificates, curl, gnupg, and lsb-release:

    sudo apt-get install -y ca-certificates curl gnupg lsb-release
    
  3. Add the Docker repository keys:

    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --yes --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
    
    1. Add the Docker repository:

      echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
      $(lsb_release -cs) stable" |sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
      
  4. Install Docker:

    1. Update the apt-get repository:

      sudo apt-get update
      
    2. Install Docker and other libraries:

    sudo apt install docker-ce docker-ce-cli containerd.io
    
  5. Add non-root accounts to the docker group. This allows these users to run Docker commands without requiring root access.

    1. Add the current user to the docker group and activate the change:

      sudo usermod -aG docker $USER && newgrp docker
      

Install Minikube for Deb

The following instructions are taken from minikube start.

  1. Update the apt-get repository:

    sudo apt-get update
    
  2. Install the prerequisite conntrack:

    sudo apt install conntrack
    
  3. Download minikube:

    curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
    
  4. Install minikube:

    sudo install minikube-linux-amd64 /usr/local/bin/minikube
    
  5. To correct issues with the kube-proxy and the storage-provisioner, set nf_conntrack_max=524288 before starting minikube:

    sudo sysctl net/netfilter/nf_conntrack_max=524288
    
  6. Start minikube:

    minikube start && echo "ok: started minikube successfully"
    
  7. Once installation is complete, verify that the user owns the ~/.kube and ~/.minikube directories:

    sudo chown -R $USER.$USER .kube
    
    sudo chown -R $USER.$USER .minikube
    
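
To confirm that the local cluster is ready to use - a quick sanity check, not part of the upstream instructions:

minikube status

kubectl get nodes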

3.2.2 - Altinity Kubernetes Operator on GKE

How to install the Altinity Kubernetes Operator using Google Kubernetes Engine

Organizations can host their Altinity Kubernetes Operator on the Google Kubernetes Engine (GKE). This can be done either through Altinity.Cloud or through a separate installation on GKE.

To set up a basic Altinity Kubernetes Operator environment, use the following steps. They use the current free Google Cloud services to set up a minimally viable Kubernetes environment running ClickHouse.

Prerequisites

  1. Register a Google Cloud Account: https://cloud.google.com/.
  2. Create a Google Cloud project: https://cloud.google.com/resource-manager/docs/creating-managing-projects
  3. Install gcloud and run gcloud init or gcloud init --console to set up your environment: https://cloud.google.com/sdk/docs/install
  4. Enable the Google Compute Engine: https://cloud.google.com/endpoints/docs/openapi/enable-api
  5. Enable GKE on your project: https://console.cloud.google.com/apis/enableflow?apiid=container.googleapis.com.
  6. Select a default Compute Engine zone.
  7. Select a default Compute Engine region.
  8. Install kubectl on your local system. For sample instructions, see the Minikube on Linux installation instructions.

Altinity Kubernetes Operator on GKE Installation instructions

Installing the Altinity Kubernetes Operator in GKE has the following main steps:

Create the Network

The first step in setting up the Altinity Kubernetes Operator in GKE is creating the network. The complete details can be found on the Google Cloud documentation site for the gcloud compute networks create command. The following command will create a network called kubernetes-1 that will work for our minimal Altinity Kubernetes Operator cluster. Note that this network will not be available to external networks unless additional steps are taken. Consult the Google Cloud documentation site for more details.

  1. See a list of current networks available. In this example, there are no networks setup in this project:

    gcloud compute networks list
    NAME     SUBNET_MODE  BGP_ROUTING_MODE  IPV4_RANGE  GATEWAY_IPV4
    default  AUTO         REGIONAL
    
  2. Create the network in your Google Cloud project:

    gcloud compute networks create kubernetes-1 --bgp-routing-mode regional --subnet-mode custom
    Created [https://www.googleapis.com/compute/v1/projects/betadocumentation/global/networks/kubernetes-1].
    NAME          SUBNET_MODE  BGP_ROUTING_MODE  IPV4_RANGE  GATEWAY_IPV4
    kubernetes-1  CUSTOM       REGIONAL
    
    Instances on this network will not be reachable until firewall rules
    are created. As an example, you can allow all internal traffic between
    instances as well as SSH, RDP, and ICMP by running:
    
    $ gcloud compute firewall-rules create <FIREWALL_NAME> --network kubernetes-1 --allow tcp,udp,icmp --source-ranges <IP_RANGE>
    $ gcloud compute firewall-rules create <FIREWALL_NAME> --network kubernetes-1 --allow tcp:22,tcp:3389,icmp
    
  3. Verify its creation:

    gcloud compute networks list
    NAME          SUBNET_MODE  BGP_ROUTING_MODE  IPV4_RANGE  GATEWAY_IPV4
    default       AUTO         REGIONAL
    kubernetes-1  CUSTOM       REGIONAL
    

Create the Cluster

Now that the network has been created, we can set up our cluster. The following cluster uses the e2-micro machine type - it is still in the free tier and gives just enough power to run our basic cluster. The cluster will be called cluster-1, but you can replace that with whatever name you feel is appropriate. It uses the kubernetes-1 network specified earlier and creates a new subnet for the cluster under k-subnet-1.

To create and launch the cluster:

  1. Verify the existing clusters with the gcloud command. For this example there are no pre-existing clusters.

    gcloud container clusters list
    
  2. From the command line, issue the following gcloud command to create the cluster:

    gcloud container clusters create cluster-1 --region us-west1 --node-locations us-west1-a --machine-type e2-micro --network kubernetes-1 --create-subnetwork name=k-subnet-1 --enable-ip-alias &
    
  3. Use the clusters list command to verify when the cluster is available for use:

    gcloud container clusters list
    Created [https://container.googleapis.com/v1/projects/betadocumentation/zones/us-west1/clusters/cluster-1].
    To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-west1/cluster-1?project=betadocumentation
    kubeconfig entry generated for cluster-1.
    NAME       LOCATION  MASTER_VERSION   MASTER_IP      MACHINE_TYPE  NODE_VERSION     NUM_NODES  STATUS
    cluster-1  us-west1  1.21.6-gke.1500  35.233.231.36  e2-micro      1.21.6-gke.1500  3          RUNNING
    NAME       LOCATION  MASTER_VERSION   MASTER_IP      MACHINE_TYPE  NODE_VERSION     NUM_NODES  STATUS
    cluster-1  us-west1  1.21.6-gke.1500  35.233.231.36  e2-micro      1.21.6-gke.1500  3          RUNNING
    [1]+  Done                    gcloud container clusters create cluster-1 --region us-west1 --node-locations us-west1-a --machine-type e2-micro --network kubernetes-1 --create-subnetwork name=k-subnet-1 --enable-ip-alias
    

Get Cluster Credentials

Importing the cluster credentials into your kubectl environment will allow you to issue commands directly to the cluster on Google Cloud. To import the cluster credentials:

  1. Retrieve the credentials for the newly created cluster:

    gcloud container clusters get-credentials cluster-1 --region us-west1 --project betadocumentation
    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for cluster-1.
    
  2. Verify the cluster information from the kubectl environment:

    kubectl cluster-info
    Kubernetes control plane is running at https://35.233.231.36
    GLBCDefaultBackend is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
    KubeDNS is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
    Metrics-server is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
    
    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    

Install the Altinity ClickHouse Operator

Our cluster is up and ready to go. Time to install the Altinity Kubernetes Operator through the following steps. Note that we are specifying the version of the Altinity Kubernetes Operator to install. This ensures maximum compatibility with your applications and other Kubernetes environments.

As of the time of this article, the most current version is 0.18.1.

  1. Apply the Altinity Kubernetes Operator manifest by either downloading it and applying it, or referring to the GitHub repository URL. For more information, see the Altinity Kubernetes Operator Installation Guides.

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.1/deploy/operator/clickhouse-operator-install-bundle.yaml
    
  2. Verify the installation by running:

    kubectl get pods --namespace kube-system
    NAME                                                  READY   STATUS    RESTARTS   AGE
    clickhouse-operator-77b54889b4-g98kk                  2/2     Running   0          53s
    event-exporter-gke-5479fd58c8-7h6bn                   2/2     Running   0          108s
    fluentbit-gke-b29c2                                   2/2     Running   0          79s
    fluentbit-gke-k8f2n                                   2/2     Running   0          80s
    fluentbit-gke-vjlqh                                   2/2     Running   0          80s
    gke-metrics-agent-4ttdt                               1/1     Running   0          79s
    gke-metrics-agent-qf24p                               1/1     Running   0          80s
    gke-metrics-agent-szktc                               1/1     Running   0          80s
    konnectivity-agent-564f9f6c5f-59nls                   1/1     Running   0          40s
    konnectivity-agent-564f9f6c5f-9nfnl                   1/1     Running   0          40s
    konnectivity-agent-564f9f6c5f-vk7l8                   1/1     Running   0          97s
    konnectivity-agent-autoscaler-5c49cb58bb-zxzlp        1/1     Running   0          97s
    kube-dns-697dc8fc8b-ddgrx                             4/4     Running   0          98s
    kube-dns-697dc8fc8b-fpnps                             4/4     Running   0          71s
    kube-dns-autoscaler-844c9d9448-pqvqr                  1/1     Running   0          98s
    kube-proxy-gke-cluster-1-default-pool-fd104f22-8rx3   1/1     Running   0          36s
    kube-proxy-gke-cluster-1-default-pool-fd104f22-gnd0   1/1     Running   0          29s
    kube-proxy-gke-cluster-1-default-pool-fd104f22-k2sv   1/1     Running   0          12s
    l7-default-backend-69fb9fd9f9-hk7jq                   1/1     Running   0          107s
    metrics-server-v0.4.4-857776bc9c-bs6sl                2/2     Running   0          44s
    pdcsi-node-5l9vf                                      2/2     Running   0          79s
    pdcsi-node-gfwln                                      2/2     Running   0          79s
    pdcsi-node-q6scz                                      2/2     Running   0          80s
    

Create a Simple ClickHouse Cluster

The Altinity Kubernetes Operator allows the easy creation and modification of ClickHouse clusters in whatever format works best for your organization. Now that the Google Cloud cluster is running and has the Altinity Kubernetes Operator installed, let’s create a very simple ClickHouse cluster to test on.

The following example will create an Altinity Kubernetes Operator controlled cluster with 1 shard and 1 replica, 500 MB of persistent storage, and a demo user whose password is set to topsecret. For more information on customizing the Altinity Kubernetes Operator, see the Altinity Kubernetes Operator Configuration Guides.

  1. Create the following manifest and save it as gcp-example01.yaml.

    
    apiVersion: "clickhouse.altinity.com/v1"
    kind: "ClickHouseInstallation"
    metadata:
      name: "gcp-example"
    spec:
      configuration:
        # What does my cluster look like?
        clusters:
          - name: "gcp-example"
            layout:
              shardsCount: 1
              replicasCount: 1
            templates:
              podTemplate: clickhouse-stable
              volumeClaimTemplate: pd-ssd
        # Where is Zookeeper?
        zookeeper:
          nodes:
            - host: zookeeper.zoo1ns
              port: 2181
        # What are my users?
        users:
          # Password = topsecret
          demo/password_sha256_hex: 53336a676c64c1396553b2b7c92f38126768827c93b64d9142069c10eda7a721
          demo/profile: default
          demo/quota: default
          demo/networks/ip:
            - 0.0.0.0/0
            - ::/0
      templates:
        podTemplates:
          # What is the definition of my server?
          - name: clickhouse-stable
            spec:
              containers:
                - name: clickhouse
                  image: altinity/clickhouse-server:21.8.10.1.altinitystable
            # Keep servers on separate nodes!
            podDistribution:
              - scope: ClickHouseInstallation
                type: ClickHouseAntiAffinity
        volumeClaimTemplates:
          # How much storage and which type on each node?
          - name: pd-ssd
            # Do not delete PVC if installation is dropped.
            reclaimPolicy: Retain
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 500Mi
    
  2. Create a namespace in your GKE environment. For this example, we will be using test:

    kubectl create namespace test
    namespace/test created
    
  3. Apply the manifest to the namespace:

    kubectl -n test apply -f gcp-example01.yaml
    clickhouseinstallation.clickhouse.altinity.com/gcp-example created
    
  4. Verify the installation is complete when all pods are in a Running state:

    kubectl -n test get chi -o wide
    NAME          VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT
    gcp-example   0.18.1    1          1        1       f859e396-e2de-47fd-8016-46ad6b0b8508   Completed             1                          clickhouse-gcp-example.test.svc.cluster.local
    

Login to the Cluster

This example does not have any open external ports, but we can still access our ClickHouse database through kubectl exec. In this case, the specific pod we are connecting to is chi-gcp-example-gcp-example-0-0-0. Replace this with the designation of your pods.

Use the following procedure to verify the Altinity Stable build install in your GKE environment.

  1. Login to the clickhouse-client in one of your existing pods:

    kubectl -n test exec -it chi-gcp-example-gcp-example-0-0-0 -- clickhouse-client
    
  2. Verify the cluster configuration:

    kubectl -n test exec -it chi-gcp-example-gcp-example-0-0-0  -- clickhouse-client -q "SELECT * FROM system.clusters  FORMAT PrettyCompactNoEscapes"
    ┌─cluster───────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
    │ all-replicated                                 │         1 │            1 │           1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ all-sharded                                    │         1 │            1 │           1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ gcp-example                                    │         1 │            1 │           1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards                        │         1 │            1 │           1 │ 127.0.0.1                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards                        │         2 │            1 │           1 │ 127.0.0.2                       │ 127.0.0.2    │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_internal_replication   │         1 │            1 │           1 │ 127.0.0.1                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_internal_replication   │         2 │            1 │           1 │ 127.0.0.2                       │ 127.0.0.2    │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_localhost              │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_localhost              │         2 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_shard_localhost                           │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_shard_localhost_secure                    │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9440 │        0 │ default │                  │            0 │               0 │                       0 │
    │ test_unavailable_shard                         │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_unavailable_shard                         │         2 │            1 │           1 │ localhost                       │ 127.0.0.1    │    1 │        0 │ default │                  │            0 │               0 │                       0 │
    └────────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
    
  3. Exit out of your cluster:

    chi-gcp-example-gcp-example-0-0-0.chi-gcp-example-gcp-example-0-0.test.svc.cluster.local :) exit
    Bye.
    

Further Steps

This simple example demonstrates how to build and manage an Altinity Kubernetes Operator run ClickHouse cluster. Further steps would be to open the cluster to external network connections (sketched below), set up replication schemes, and so on.
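As an illustration of the first of those steps, a serviceTemplate of type LoadBalancer could be added to the manifest used above. This is a minimal sketch under the assumption that a cloud load balancer is acceptable in your environment; the template name chi-lb-template is illustrative, and the ports match the default ClickHouse HTTP and native ports:

spec:
  defaults:
    templates:
      serviceTemplate: chi-lb-template
  templates:
    serviceTemplates:
      # Hypothetical template that exposes the installation through a cloud load balancer
      - name: chi-lb-template
        spec:
          type: LoadBalancer
          ports:
            - name: http
              port: 8123
            - name: client
              port: 9000

With a template like this in place, the Altinity Kubernetes Operator would create a LoadBalancer service in front of the installation; the exact annotations required depend on your cloud provider.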

For more information, see the Altinity Kubernetes Operator guides and the Altinity Kubernetes Operator repository.

3.3 - Operator Guide

Installation and Management of clickhouse-operator for Kubernetes

The Altinity Kubernetes Operator is an open source project managed and maintained by Altinity Inc. This Operator Guide is created to help users with installation, configuration, maintenance, and other important tasks.

3.3.1 - Installation Guide

Basic and custom installation instructions of the clickhouse-operator

Depending on your organization and its needs, there are different ways of installing the Kubernetes clickhouse-operator.

3.3.1.1 - Basic Installation Guide

The simple method of installing the Altinity Kubernetes Operator

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

To install the Altinity Kubernetes Operator for Kubernetes:

  1. Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during installation - this ensures maximum compatibility and that all replicated environments are working from the same version. For more information on installing other versions of the Altinity Kubernetes Operator, see the Specific Version Installation Guide.

    The most current version is 0.18.3:

kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
  2. The following will be displayed on a successful installation.
    For more information on the resources created in the installation,
    see the Altinity Kubernetes Operator Resources page.
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created
  3. Verify the installation by running:
kubectl get pods --namespace kube-system

The following will be displayed on a successful installation;
the exact pod names and ages will reflect your particular environment:

NAME                                   READY   STATUS    RESTARTS      AGE
clickhouse-operator-857c69ffc6-ttnsj   2/2     Running   0             4s
coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
etcd-minikube                          1/1     Running   4 (23h ago)   51d
kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
storage-provisioner                    1/1     Running   9 (23h ago)   51d

3.3.1.2 - Custom Installation Guide

How to install a customized Altinity Kubernetes Operator

Users who need to customize their Altinity Kubernetes Operator namespace or
cannot connect directly to GitHub from the installation environment
can perform a custom install.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

Script Install into Namespace

By default, the Altinity Kubernetes Operator is installed into the kube-system
namespace when using the Basic Installation instructions.
To install into a different namespace, use the following command, replacing {custom_namespace_here}
with the namespace to use:

curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE={custom_namespace_here} bash

For example, to install into the test-clickhouse-operator
namespace, use:

curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE=test-clickhouse-operator bash
Setup ClickHouse Operator into 'test-clickhouse-operator' namespace
No 'test-clickhouse-operator' namespace found. Going to create
namespace/test-clickhouse-operator created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created

If no OPERATOR_NAMESPACE value is set, then the Altinity Kubernetes Operator will
be installed into kube-system.

Manual Install into Namespace

Organizations that cannot access GitHub directly from the environment they are installing the Altinity Kubernetes Operator in can perform a manual install through the following steps:

  1. Download the install template file: clickhouse-operator-install-template.yaml.

  2. Edit the file and set the OPERATOR_NAMESPACE value.

  3. Use the following commands, replacing {your file name} with the name of your YAML file:

    kubectl apply -f {your file name}
    

    For example:

    kubectl apply -f customtemplate.yaml
    

Alternatively, instead of using the install template, enter the following into your console
(bash is used below, modify depending on your particular shell).
Change the OPERATOR_NAMESPACE value to match your namespace.

# Namespace to install operator into
OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE:-clickhouse-operator}"
# Namespace to install metrics-exporter into
METRICS_EXPORTER_NAMESPACE="${OPERATOR_NAMESPACE}"

# Operator's docker image
OPERATOR_IMAGE="${OPERATOR_IMAGE:-altinity/clickhouse-operator:latest}"
# Metrics exporter's docker image
METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE:-altinity/metrics-exporter:latest}"

# Setup Altinity Kubernetes Operator into specified namespace
kubectl apply --namespace="${OPERATOR_NAMESPACE}" -f <( \
    curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator/clickhouse-operator-install-template.yaml | \
        OPERATOR_IMAGE="${OPERATOR_IMAGE}" \
        OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE}" \
        METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE}" \
        METRICS_EXPORTER_NAMESPACE="${METRICS_EXPORTER_NAMESPACE}" \
        envsubst \
)

Verify Installation

To verify the Altinity Kubernetes Operator is running in your namespace, use the following command:

kubectl get pods -n clickhouse-operator
NAME                                   READY   STATUS    RESTARTS   AGE
clickhouse-operator-5d9496dd48-8jt8h   2/2     Running   0          16s

3.3.1.3 - Source Build Guide - 0.18 and Up

How to build the Altinity Kubernetes Operator from source code

Organizations that prefer to build the software directly from source code
can compile the Altinity Kubernetes Operator and install it into a Docker
container through the following process. The following procedure applies to versions of the Altinity Kubernetes Operator 0.18.0 and up.

Binary Build

Binary Build Requirements

  • go-lang compiler: Go.
  • Go mod Package Manager.
  • The source code from the Altinity Kubernetes Operator repository.
    This can be downloaded using git clone https://github.com/altinity/clickhouse-operator.

Binary Build Instructions

  1. Switch working dir to clickhouse-operator.

  2. Install the Go compiler and its packages with the command: echo {root_password} | sudo -S -k apt install -y golang.

  3. Build the sources with go build -o ./clickhouse-operator cmd/operator/main.go.

This creates the Altinity Kubernetes Operator binary. This binary is only used
within a Kubernetes environment.
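Putting those steps together, the build sequence looks roughly like the following sketch. It assumes Go is already installed and that you are starting from an empty working directory:

# Fetch the Altinity Kubernetes Operator source code
git clone https://github.com/altinity/clickhouse-operator
cd clickhouse-operator

# Build the operator binary from the main entry point
go build -o ./clickhouse-operator cmd/operator/main.go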

Docker Image Build and Usage

Docker Build Requirements

Install Docker Buildx CLI plugin

  1. Download the Docker Buildx binary file from the releases page on GitHub

  2. Create folder structure for plugin

    mkdir -p ~/.docker/cli-plugins/
    
  3. Rename the relevant binary and copy it to the destination matching your OS

    mv buildx-v0.7.1.linux-amd64  ~/.docker/cli-plugins/docker-buildx
    
  4. On Unix environments, it may also be necessary to make it executable with chmod +x:

    chmod +x ~/.docker/cli-plugins/docker-buildx
    
  5. Set buildx as the default builder

    docker buildx install
    
  6. Create config.json file to enable the plugin

    touch ~/.docker/config.json
    
  7. Add the experimental flag to config.json to enable the plugin

    echo '{"experimental": "enabled"}' >> ~/.docker/config.json
    

Docker Build Instructions

  1. Switch working dir to clickhouse-operator

  2. Build docker image with docker: docker build -f dockerfile/operator/Dockerfile -t altinity/clickhouse-operator:dev .

  3. Register the freshly built Docker image inside the Kubernetes environment with the following:

    docker save altinity/clickhouse-operator | (eval $(minikube docker-env) && docker load)
    
  4. Install the Altinity Kubernetes Operator as described in either the Basic Installation Guide
    or the Custom Installation Guide.

3.3.1.4 - Specific Version Installation Guide

How to install a specific version of the Altinity Kubernetes Operator

Users may want to install a specific version of the Altinity Kubernetes Operator for a variety of reasons: to maintain parity between different environments, to preserve the version between replicas, or other reasons.

The following procedures detail how to install a specific version of the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

Altinity Kubernetes Operator Versions After 0.17.0

To install a specific version of the Altinity Kubernetes Operator after version 0.17.0:

  1. Run kubectl and apply the manifest directly from the GitHub Altinity Kubernetes Operator repository, or by downloading the manifest and applying it directly. The format for the URL is:

    https://github.com/Altinity/clickhouse-operator/raw/{OPERATOR_VERSION}/deploy/operator/clickhouse-operator-install-bundle.yaml
    

    Replace the {OPERATOR_VERSION} with the version to install. For example, for the Altinity Kubernetes Operator version 0.18.3, the URL would be:

    https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml

    The command to apply the manifest through kubectl is:

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
    serviceaccount/clickhouse-operator created
    clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
    clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
    configmap/etc-clickhouse-operator-files created
    configmap/etc-clickhouse-operator-confd-files created
    configmap/etc-clickhouse-operator-configd-files created
    configmap/etc-clickhouse-operator-templatesd-files created
    configmap/etc-clickhouse-operator-usersd-files created
    deployment.apps/clickhouse-operator created
    service/clickhouse-operator-metrics created
    
  2. Verify the installation is complete and the clickhouse-operator pod is running:

    kubectl get pods --namespace kube-system
    

    A similar result to the following will be displayed on a successful installation:

    NAME                                   READY   STATUS    RESTARTS      AGE
    clickhouse-operator-857c69ffc6-q8qrr   2/2     Running   0             5s
    coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
    etcd-minikube                          1/1     Running   4 (23h ago)   51d
    kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
    kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
    kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
    kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
    storage-provisioner                    1/1     Running   9 (23h ago)   51d
    
  3. To verify the version of the Altinity Kubernetes Operator, use the following command:

    kubectl get pods -l app=clickhouse-operator --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
    
    1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
    

3.3.1.5 - Upgrade Guide

How to upgrade the Altinity Kubernetes Operator

The Altinity Kubernetes Operator can be upgraded at any time by applying the new manifest from the Altinity Kubernetes Operator GitHub repository.

The following procedures detail how to install a specific version of the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

The following instructions are based on installations of the Altinity Kubernetes Operator greater than version 0.16.0. In the following examples, Altinity Kubernetes Operator version 0.16.0 has been installed and will be upgraded to 0.18.3.

For instructions on installing specific versions of the Altinity Kubernetes Operator, see the Specific Version Installation Guide.

  1. Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during the installation for maximum compatibility. In this example, the version being upgraded to is 0.18.3:

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
  2. The following will be displayed on a successful installation.
    For more information on the resources created in the installation,
    see the Altinity Kubernetes Operator Resources page.

    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
    serviceaccount/clickhouse-operator configured
    clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
    clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
    configmap/etc-clickhouse-operator-files configured
    configmap/etc-clickhouse-operator-confd-files configured
    configmap/etc-clickhouse-operator-configd-files configured
    configmap/etc-clickhouse-operator-templatesd-files configured
    configmap/etc-clickhouse-operator-usersd-files configured
    deployment.apps/clickhouse-operator configured
    service/clickhouse-operator-metrics configured
    
  3. Verify the installation by running:

    The following will be displayed on a successful installation; the exact pod names and ages will reflect your particular environment:

    kubectl get pods --namespace kube-system
    
    NAME                                   READY   STATUS    RESTARTS       AGE
    clickhouse-operator-857c69ffc6-dqt5l   2/2     Running   0              29s
    coredns-78fcd69978-nthp2               1/1     Running   3 (14d ago)    50d
    etcd-minikube                          1/1     Running   3 (14d ago)    50d
    kube-apiserver-minikube                1/1     Running   3 (2m6s ago)   50d
    kube-controller-manager-minikube       1/1     Running   3 (14d ago)    50d
    kube-proxy-lsggn                       1/1     Running   3 (14d ago)    50d
    kube-scheduler-minikube                1/1     Running   3 (2m6s ago)   50d
    storage-provisioner                    1/1     Running   7 (48s ago)    50d
    
  4. To verify the version of the Altinity Kubernetes Operator, use the following command:

    kubectl get pods -l app=clickhouse-operator -n kube-system -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
    
    1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
    

3.3.2 - Configuration Guide

How to configure your Altinity Kubernetes Operator cluster.

Depending on your organization’s needs and environment, you can modify your environment to best fit your needs with the Altinity Kubernetes Operator or your cluster settings.

3.3.2.1 - ClickHouse Operator Settings

Settings and configurations for the Altinity Kubernetes Operator

Altinity Kubernetes Operator 0.18 and greater

For versions of the Altinity Kubernetes Operator 0.18 and later, the Altinity Kubernetes Operator settings can be modified through the clickhouse-operator-install-bundle.yaml file in the section etc-clickhouse-operator-files. This section supplies the config.yaml settings that control the user configuration and other operator settings, as sketched below. For more information, see the sample config.yaml for the Altinity Kubernetes Operator.
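For orientation only, the shape of that section looks roughly like the following abbreviated sketch. It is not the complete ConfigMap; the keys shown are simply the defaults documented in the tables below:

# Abbreviated sketch of the etc-clickhouse-operator-files ConfigMap
# from clickhouse-operator-install-bundle.yaml (not the full file)
apiVersion: v1
kind: ConfigMap
metadata:
  name: etc-clickhouse-operator-files
  namespace: kube-system
data:
  config.yaml: |
    # Defaults applied when the operator creates new users
    chConfigUserDefaultProfile: default
    chConfigUserDefaultQuota: default
    chConfigUserDefaultPassword: default
    # Credentials the operator uses to connect to ClickHouse
    chUsername: clickhouse_operator
    chPassword: clickhouse_operator_password
    chPort: 8123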

Altinity Kubernetes Operator before 0.18

For versions before 0.18, the Altinity Kubernetes Operator settings can be modified through clickhouse-operator-install-bundle.yaml file in the section marked ClickHouse Settings Section.

New User Settings

Setting                        Default Value              Description
chConfigUserDefaultProfile     default                    Sets the default profile used when creating new users.
chConfigUserDefaultQuota       default                    Sets the default quota used when creating new users.
chConfigUserDefaultNetworksIP  ::1, 127.0.0.1, 0.0.0.0    Specifies the networks that the user can connect from. Note that 0.0.0.0 allows access from all networks.
chConfigUserDefaultPassword    default                    The initial password for new users.

ClickHouse Operator Settings

The ClickHouse Operator role can connect to the ClickHouse database to perform the following:

  • Metrics requests
  • Schema Maintenance
  • Drop DNS Cache

Additional users can be created with this role by modifying the usersd XML files.
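As a rough illustration of the usersd mechanism, an additional user could be defined with an XML fragment along the following lines. This is a hypothetical sketch: the file name, user name, password, profile, and allowed networks are placeholders, not part of the operator's shipped configuration:

<!-- Hypothetical file: users.d/extra-operator-user.xml -->
<yandex>
    <users>
        <extra_operator>
            <!-- Placeholder password; use a hashed password in practice -->
            <password>change_me</password>
            <!-- Profile and allowed networks are assumptions; adjust to your setup -->
            <profile>default</profile>
            <networks>
                <ip>127.0.0.1</ip>
            </networks>
        </extra_operator>
    </users>
</yandex>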

Setting     Default Value                 Description
chUsername  clickhouse_operator           The username for the ClickHouse Operator user.
chPassword  clickhouse_operator_password  The default password for the ClickHouse Operator user.
chPort      8123                          The IP port for the ClickHouse Operator user.

Log Parameters

The Log Parameters section sets the options for log outputs and levels.

Setting           Default Value  Description
logtostderr       true           If set to true, submits logs to stderr instead of log files.
alsologtostderr   false          If true, submits logs to stderr as well as log files.
v                 1              Sets V-leveled logging level.
stderrthreshold   ""             The error threshold. Errors at or above this level will be submitted to stderr.
vmodule           ""             A comma separated list of modules and their verbose level with {module name} = {log level}. For example: "module1=2,module2=3".
log_backtrace_at  ""             Location to store the stack backtrace.

Runtime Parameters

The Runtime Parameters section sets the resources allocated for processes such as reconcile functions.

Setting                 Default Value  Description
reconcileThreadsNumber  10             The number of threads allocated to manage reconcile requests.
reconcileWaitExclude    false          ???
reconcileWaitInclude    false          ???

Template Parameters

The Template Parameters section sets connection values, user default settings, and other values. These values are based on ClickHouse configurations. For full details, see the ClickHouse documentation page.

3.3.2.2 - ClickHouse Cluster Settings

Settings and configurations for clusters and nodes

ClickHouse clusters that are configured on Kubernetes have several options based on the Kubernetes Custom Resources settings. Your cluster may have particular requirements to best fit your organization's needs.

For an example of a configuration file using each of these settings, see the 99-clickhouseinstallation-max.yaml file as a template.

This assumes that you have installed the clickhouse-operator.

Initial Settings

The first section sets the cluster kind and api.

Parent    Setting     Type    Description
None      kind        String  Specifies the type of cluster to install. In this case, ClickHouse. Valid value: ClickHouseInstallation
None      metadata    Object  Assigns metadata values for the cluster.
metadata  name        String  The name of the resource.
metadata  labels      Array   Labels applied to the resource.
metadata  annotation  Array   Annotations applied to the resource.

Initial Settings Example

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-installation-max"
  labels:
    label1: label1_value
    label2: label2_value
  annotations:
    annotation1: annotation1_value
    annotation2: annotation2_value

.spec.defaults

The .spec.defaults section represents default values for the sections that follow .spec.defaults.

Parent    Setting          Type        Description
defaults  replicasUseFQDN  [Yes | No]  Determines whether replicas are addressed by their fully qualified domain names.
defaults  distributedDDL   String      Sets the <yandex><distributed_ddl></distributed_ddl></yandex> configuration settings. For more information, see Distributed DDL Queries (ON CLUSTER Clause).
defaults  templates        Array       Sets the pod template types. This is where the template is declared, then defined in .spec.templates later.

.spec.defaults Example

  defaults:
    replicasUseFQDN: "no"
    distributedDDL:
      profile: default
    templates:
      podTemplate: clickhouse-v18.16.1
      dataVolumeClaimTemplate: default-volume-claim
      logVolumeClaimTemplate: default-volume-claim
      serviceTemplate: chi-service-template

.spec.configuration

.spec.configuration section represents sources for ClickHouse configuration files. For more information, see the ClickHouse Configuration Files page.

.spec.configuration Example

  configuration:
    users:
      readonly/profile: readonly
      #     <users>
      #        <readonly>
      #          <profile>readonly</profile>
      #        </readonly>
      #     </users>
      test/networks/ip:
        - "127.0.0.1"
        - "::/0"
      #     <users>
      #        <test>
      #          <networks>
      #            <ip>127.0.0.1</ip>
      #            <ip>::/0</ip>
      #          </networks>
      #        </test>
      #     </users>
      test/profile: default
      test/quotas: default

.spec.configuration.zookeeper

.spec.configuration.zookeeper defines the zookeeper settings, and is expanded into the <yandex><zookeeper></zookeeper></yandex> configuration section. For more information, see ClickHouse Zookeeper settings.

.spec.configuration.zookeeper Example

    zookeeper:
      nodes:
        - host: zookeeper-0.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-1.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-2.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
      session_timeout_ms: 30000
      operation_timeout_ms: 10000
      root: /path/to/zookeeper/node
      identity: user:password

.spec.configuration.profiles

.spec.configuration.profiles defines the ClickHouse profiles that are stored in <yandex><profiles></profiles></yandex>. For more information, see the ClickHouse Server Settings page.

.spec.configuration.profiles Example

    profiles:
      readonly/readonly: 1

expands into

      <profiles>
        <readonly>
          <readonly>1</readonly>
        </readonly>
      </profiles>

.spec.configuration.users

.spec.configuration.users defines the users and is stored in <yandex><users></users></yandex>. For more information, see the ClickHouse Server Settings page.

.spec.configuration.users Example

  users:
    test/networks/ip:
        - "127.0.0.1"
        - "::/0"

expands into

     <users>
        <test>
          <networks>
            <ip>127.0.0.1</ip>
            <ip>::/0</ip>
          </networks>
        </test>
     </users>

.spec.configuration.settings

.spec.configuration.settings sets other ClickHouse settings such as compression, etc. For more information, see the ClickHouse Server Settings page.

.spec.configuration.settings Example

    settings:
      compression/case/method: "zstd"
#      <compression>
#       <case>
#         <method>zstd</method>
#      </case>
#      </compression>

.spec.configuration.files

.spec.configuration.files creates custom files used in the cluster. These are used for custom configurations, such as the ClickHouse External Dictionary.

.spec.configuration.files Example

    files:
      dict1.xml: |
        <yandex>
            <!-- ref to file /etc/clickhouse-data/config.d/source1.csv -->
        </yandex>        
      source1.csv: |
        a1,b1,c1,d1
        a2,b2,c2,d2        
spec:
  configuration:
    settings:
      dictionaries_config: config.d/*.dict
    files:
      dict_one.dict: |
        <yandex>
          <dictionary>
        <name>one</name>
        <source>
            <clickhouse>
                <host>localhost</host>
                <port>9000</port>
                <user>default</user>
                <password/>
                <db>system</db>
                <table>one</table>
            </clickhouse>
        </source>
        <lifetime>60</lifetime>
        <layout><flat/></layout>
        <structure>
            <id>
                <name>dummy</name>
            </id>
            <attribute>
                <name>one</name>
                <expression>dummy</expression>
                <type>UInt8</type>
                <null_value>0</null_value>
            </attribute>
        </structure>
        </dictionary>
        </yandex>        

.spec.configuration.clusters

.spec.configuration.clusters defines the ClickHouse clusters to be installed.

    clusters:

Clusters and Layouts

.clusters.layout defines the ClickHouse layout of a cluster. This can be general, or very granular depending on your requirements. For full information, see Cluster Deployment.

Templates

podTemplate is used to define the specific pods in the cluster, mainly the ones that will be running ClickHouse. The VolumeClaimTemplate defines the storage volumes. Both of these settings are applied per replica.

Basic Dimensions

Basic dimensions are used to define the cluster definitions without specifying particular details of the shards or nodes.

Parent            Setting        Type    Description
.clusters.layout  shardsCount    Number  The number of shards for the cluster.
.clusters.layout  replicasCount  Number  The number of replicas for the cluster.

Basic Dimensions Example

In this example, the podTemplate defines the ClickHouse containers in a cluster called all-counts with three shards and two replicas.

      - name: all-counts
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2

This is expanded into the following configuration. The IP addresses and DNS configuration are assigned by k8s and the operator.

<yandex>
    <remote_servers>
        <all-counts>
        
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.2</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.3</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.4</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.5</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.6</host>
                    <port>9000</port>
                </replica>
            </shard>
            
        </all-counts>
    </remote_servers>
</yandex>

Specified Dimensions

The templates section can also be used to specify more than just the general layout. The exact definitions of the shards and replicas can be defined as well.

In this example, shard0 has replicasCount specified, while shard1 has 3 replicas explicitly specified, with the possibility of customizing each replica.

        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2
      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard0
              replicasCount: 3
              weight: 1
              internalReplication: Disabled
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim

            - name: shard1
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                - name: replica1
                - name: replica2

Other examples are combinations, where some replicas are defined but only one is explicitly differentiated with a different podTemplate.

      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard2
              replicasCount: 3
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                  port: 9000
                  templates:
                    podTemplate: clickhouse-v19.11.3.11
                    dataVolumeClaimTemplate: default-volume-claim
                    logVolumeClaimTemplate: default-volume-claim

.spec.templates.serviceTemplates

.spec.templates.serviceTemplates represents Kubernetes Service templates, with additional fields.

At the top level is generateName, which is used to explicitly specify the name of the service to be created. generateName is able to understand macros for the service level of the object created. The service levels are defined as:

  • CHI
  • Cluster
  • Shard
  • Replica

The macro and service level where they apply are:

Setting         CHI  Cluster  Shard  Replica  Description
{chi}           X    X        X      X        ClickHouseInstallation name
{chiID}         X    X        X      X        short hashed ClickHouseInstallation name (Experimental)
{cluster}            X        X      X        The cluster name
{clusterID}          X        X      X        short hashed cluster name (BEWARE, this is an experimental feature)
{clusterIndex}       X        X      X        0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
{shard}                       X      X        shard name
{shardID}                     X      X        short hashed shard name (BEWARE, this is an experimental feature)
{shardIndex}                  X      X        0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
{replica}                            X        replica name
{replicaID}                          X        short hashed replica name (BEWARE, this is an experimental feature)
{replicaIndex}                       X        0-based index of the replica in the shard (BEWARE, this is an experimental feature)

.spec.templates.serviceTemplates Example

  templates:
    serviceTemplates:
      - name: chi-service-template
        # generateName understands different sets of macroses,
        # depending on the level of the object, for which Service is being created:
        #
        # For CHI-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        #
        # For Cluster-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        #
        # For Shard-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        #
        # For Replica-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        # 9. {replica} - replica name
        # 10. {replicaID} - short hashed replica name (BEWARE, this is an experimental feature)
        # 11. {replicaIndex} - 0-based index of the replica in the shard (BEWARE, this is an experimental feature)
        generateName: "service-{chi}"
        # type ObjectMeta struct from k8s.io/meta/v1
        metadata:
          labels:
            custom.label: "custom.value"
          annotations:
            cloud.google.com/load-balancer-type: "Internal"
            service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
            service.beta.kubernetes.io/azure-load-balancer-internal: "true"
            service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
            service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
        # type ServiceSpec struct from k8s.io/core/v1
        spec:
          ports:
            - name: http
              port: 8123
            - name: client
              port: 9000
          type: LoadBalancer

.spec.templates.volumeClaimTemplates

.spec.templates.volumeClaimTemplates defines the PersistentVolumeClaims. For more information, see the Kubernetes PersistentVolumeClaim page.

.spec.templates.volumeClaimTemplates Example

  templates:
    volumeClaimTemplates:
      - name: default-volume-claim
        # type PersistentVolumeClaimSpec struct from k8s.io/core/v1
        spec:
          # 1. If storageClassName is not specified, default StorageClass
          # (must be specified by cluster administrator) would be used for provisioning
          # 2. If storageClassName is set to an empty string (‘’), no storage class will be used
          # dynamic provisioning is disabled for this PVC. Existing, “Available”, PVs
          # (that do not have a specified storageClassName) will be considered for binding to the PVC
          #storageClassName: gold
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

.spec.templates.podTemplates

.spec.templates.podTemplates defines the Pod Templates. For more information, see the Kubernetes Pod Templates.

The following additional sections have been defined for the ClickHouse cluster:

  1. zone
  2. distribution

zone and distribution together define the zoned layout of ClickHouse instances over nodes. They ensure that affinity.nodeAffinity and affinity.podAntiAffinity are set.

.spec.templates.podTemplates Example

To place ClickHouse instances in the AWS us-east-1a availability zone with one ClickHouse instance per host:

        zone:
          values:
            - "us-east-1a"
        distribution: "OnePerHost"

To place ClickHouse instances on nodes labeled as clickhouse=allow with one ClickHouse per host:

        zone:
          key: "clickhouse"
          values:
            - "allow"
        distribution: "OnePerHost"

Or the distribution can be Unspecified:

  templates:
    podTemplates:
      # multiple pod templates makes possible to update version smoothly
      # pod template for ClickHouse v18.16.1
      - name: clickhouse-v18.16.1
        # We may need to label nodes with clickhouse=allow label for this example to run
        # See ./label_nodes.sh for this purpose
        zone:
          key: "clickhouse"
          values:
            - "allow"
        # Shortcut version for AWS installations
        #zone:
        #  values:
        #    - "us-east-1a"

        # Possible values for distribution are:
        # Unspecified
        # OnePerHost
        distribution: "Unspecified"

        # type PodSpec struct {} from k8s.io/core/v1
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:18.16.1
              volumeMounts:
                - name: default-volume-claim
                  mountPath: /var/lib/clickhouse
              resources:
                requests:
                  memory: "64Mi"
                  cpu: "100m"
                limits:
                  memory: "64Mi"
                  cpu: "100m"

References

3.3.3 - Resources

Altinity Kubernetes Operator Resources Details

The Altinity Kubernetes Operator creates the following resources on installation to support its functions:

  • Custom Resource Definition
  • Service account
  • Cluster Role Binding
  • Deployment

Custom Resource Definition

The Kubernetes API is extended with the new Custom Resource Definition kind: ClickHouseInstallation.

To check the Custom Resource Definition:

kubectl get customresourcedefinitions

Expected result:

NAME                                                       CREATED AT
clickhouseinstallations.clickhouse.altinity.com            2022-02-09T17:20:39Z
clickhouseinstallationtemplates.clickhouse.altinity.com    2022-02-09T17:20:39Z
clickhouseoperatorconfigurations.clickhouse.altinity.com   2022-02-09T17:20:39Z

Service Account

The new Service Account clickhouse-operator allows services running from within Pods to authenticate against the apiserver as the clickhouse-operator Service Account.

To check the Service Account:

kubectl get serviceaccounts -n kube-system

Expected result

NAME                                 SECRETS   AGE
attachdetach-controller              1         23d
bootstrap-signer                     1         23d
certificate-controller               1         23d
clickhouse-operator                  1         5s
clusterrole-aggregation-controller   1         23d
coredns                              1         23d
cronjob-controller                   1         23d
daemon-set-controller                1         23d
default                              1         23d
deployment-controller                1         23d
disruption-controller                1         23d
endpoint-controller                  1         23d
endpointslice-controller             1         23d
endpointslicemirroring-controller    1         23d
ephemeral-volume-controller          1         23d
expand-controller                    1         23d
generic-garbage-collector            1         23d
horizontal-pod-autoscaler            1         23d
job-controller                       1         23d
kube-proxy                           1         23d
namespace-controller                 1         23d
node-controller                      1         23d
persistent-volume-binder             1         23d
pod-garbage-collector                1         23d
pv-protection-controller             1         23d
pvc-protection-controller            1         23d
replicaset-controller                1         23d
replication-controller               1         23d
resourcequota-controller             1         23d
root-ca-cert-publisher               1         23d
service-account-controller           1         23d
service-controller                   1         23d
statefulset-controller               1         23d
storage-provisioner                  1         23d
token-cleaner                        1         23d
ttl-after-finished-controller        1         23d
ttl-controller                       1         23d

Cluster Role Binding

The Cluster Role Binding clickhouse-operator-kube-system grants permissions defined in a role to a set of users.

Roles are granted to users, groups, or service accounts. These permissions are granted cluster-wide with a ClusterRoleBinding.

To check the Cluster Role Binding:

kubectl get clusterrolebinding

Expected result

NAME                                                   ROLE                                                                               AGE
clickhouse-operator-kube-system                        ClusterRole/clickhouse-operator-kube-system                                        5s
cluster-admin                                          ClusterRole/cluster-admin                                                          23d
kubeadm:get-nodes                                      ClusterRole/kubeadm:get-nodes                                                      23d
kubeadm:kubelet-bootstrap                              ClusterRole/system:node-bootstrapper                                               23d
kubeadm:node-autoapprove-bootstrap                     ClusterRole/system:certificates.k8s.io:certificatesigningrequests:nodeclient       23d
kubeadm:node-autoapprove-certificate-rotation          ClusterRole/system:certificates.k8s.io:certificatesigningrequests:selfnodeclient   23d
kubeadm:node-proxier                                   ClusterRole/system:node-proxier                                                    23d
minikube-rbac                                          ClusterRole/cluster-admin                                                          23d
storage-provisioner                                    ClusterRole/system:persistent-volume-provisioner                                   23d
system:basic-user                                      ClusterRole/system:basic-user                                                      23d
system:controller:attachdetach-controller              ClusterRole/system:controller:attachdetach-controller                              23d
system:controller:certificate-controller               ClusterRole/system:controller:certificate-controller                               23d
system:controller:clusterrole-aggregation-controller   ClusterRole/system:controller:clusterrole-aggregation-controller                   23d
system:controller:cronjob-controller                   ClusterRole/system:controller:cronjob-controller                                   23d
system:controller:daemon-set-controller                ClusterRole/system:controller:daemon-set-controller                                23d
system:controller:deployment-controller                ClusterRole/system:controller:deployment-controller                                23d
system:controller:disruption-controller                ClusterRole/system:controller:disruption-controller                                23d
system:controller:endpoint-controller                  ClusterRole/system:controller:endpoint-controller                                  23d
system:controller:endpointslice-controller             ClusterRole/system:controller:endpointslice-controller                             23d
system:controller:endpointslicemirroring-controller    ClusterRole/system:controller:endpointslicemirroring-controller                    23d
system:controller:ephemeral-volume-controller          ClusterRole/system:controller:ephemeral-volume-controller                          23d
system:controller:expand-controller                    ClusterRole/system:controller:expand-controller                                    23d
system:controller:generic-garbage-collector            ClusterRole/system:controller:generic-garbage-collector                            23d
system:controller:horizontal-pod-autoscaler            ClusterRole/system:controller:horizontal-pod-autoscaler                            23d
system:controller:job-controller                       ClusterRole/system:controller:job-controller                                       23d
system:controller:namespace-controller                 ClusterRole/system:controller:namespace-controller                                 23d
system:controller:node-controller                      ClusterRole/system:controller:node-controller                                      23d
system:controller:persistent-volume-binder             ClusterRole/system:controller:persistent-volume-binder                             23d
system:controller:pod-garbage-collector                ClusterRole/system:controller:pod-garbage-collector                                23d
system:controller:pv-protection-controller             ClusterRole/system:controller:pv-protection-controller                             23d
system:controller:pvc-protection-controller            ClusterRole/system:controller:pvc-protection-controller                            23d
system:controller:replicaset-controller                ClusterRole/system:controller:replicaset-controller                                23d
system:controller:replication-controller               ClusterRole/system:controller:replication-controller                               23d
system:controller:resourcequota-controller             ClusterRole/system:controller:resourcequota-controller                             23d
system:controller:root-ca-cert-publisher               ClusterRole/system:controller:root-ca-cert-publisher                               23d
system:controller:route-controller                     ClusterRole/system:controller:route-controller                                     23d
system:controller:service-account-controller           ClusterRole/system:controller:service-account-controller                           23d
system:controller:service-controller                   ClusterRole/system:controller:service-controller                                   23d
system:controller:statefulset-controller               ClusterRole/system:controller:statefulset-controller                               23d
system:controller:ttl-after-finished-controller        ClusterRole/system:controller:ttl-after-finished-controller                        23d
system:controller:ttl-controller                       ClusterRole/system:controller:ttl-controller                                       23d
system:coredns                                         ClusterRole/system:coredns                                                         23d
system:discovery                                       ClusterRole/system:discovery                                                       23d
system:kube-controller-manager                         ClusterRole/system:kube-controller-manager                                         23d
system:kube-dns                                        ClusterRole/system:kube-dns                                                        23d
system:kube-scheduler                                  ClusterRole/system:kube-scheduler                                                  23d
system:monitoring                                      ClusterRole/system:monitoring                                                      23d
system:node                                            ClusterRole/system:node                                                            23d
system:node-proxier                                    ClusterRole/system:node-proxier                                                    23d
system:public-info-viewer                              ClusterRole/system:public-info-viewer                                              23d
system:service-account-issuer-discovery                ClusterRole/system:service-account-issuer-discovery                                23d
system:volume-scheduler                                ClusterRole/system:volume-scheduler                                                23d

Cluster Role Binding Example

As an example, the following excerpt from a ClusterRoleBinding grants the role cluster-admin to the service account clickhouse-operator:

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: clickhouse-operator
    namespace: kube-system

Deployment

The Deployment clickhouse-operator runs in the kube-system namespace.

To check the Deployment:

kubectl get deployments --namespace kube-system

Expected result

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
clickhouse-operator   1/1     1            1           5s
coredns               1/1     1            1           23d

3.3.4 - Networking Connection Guides

How to connect your ClickHouse Kubernetes cluster network.

Organizations can connect their clickhouse-operator based ClickHouse cluster to their network in whatever way suits their environment. The following guides assist users in setting up those connections.

3.3.4.1 - MiniKube Networking Connection Guide

How to connect your ClickHouse Kubernetes cluster network.

Organizations that have set up the Altinity Kubernetes Operator using minikube can connect it to an external network through the following steps.

Prerequisites

The following guide is based on an Altinity Kubernetes Operator cluster installed with minikube on an Ubuntu Linux operating system.

Network Connection Guide

The proper way to connect to the ClickHouse cluster is through the LoadBalancer created during the ClickHouse cluster creation process. For example, the following ClickHouse cluster has 2 shards with one replica each, applied to the namespace test:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 1

This generates the following services in the namespace test:

kubectl get service -n test
NAME                      TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None          <none>        8123/TCP,9000/TCP,9009/TCP      22s
chi-demo-01-demo-01-1-0   ClusterIP      None          <none>        8123/TCP,9000/TCP,9009/TCP      5s
clickhouse-demo-01        LoadBalancer   10.96.67.44   <pending>     8123:32766/TCP,9000:31368/TCP   38s

The LoadBalancer alternates connections between the ClickHouse shards, and is where all ClickHouse clients should connect.

To open a connection from external networks to the LoadBalancer, use the kubectl port-forward command in the following format:

kubectl port-forward service/{LoadBalancer Service} -n {NAMESPACE} --address={IP ADDRESS} {TARGET PORT}:{INTERNAL PORT}

Replacing the following:

  • LoadBalancer Service: the LoadBalancer service to connect external ports to the Kubernetes environment.
  • NAMESPACE: The namespace for the LoadBalancer.
  • IP ADDRESS: The IP address to bind the service to on the machine running minikube, or 0.0.0.0 to bind all IP addresses on the minikube server to the specified port.
  • TARGET PORT: The external port that users will connect to.
  • INTERNAL PORT: The port within the Altinity Kubernetes Operator network.

The kubectl port-forward command must be kept running in the terminal, or placed into the background with the & operator.

In the example above, the following settings will be used to bind all IP addresses on the minikube server to the service clickhouse-demo-01 for ports 9000 and 8123 in the background:

kubectl port-forward service/clickhouse-demo-01 -n test --address=0.0.0.0 9000:9000 8123:8123 &

To test the connection, connect to the external IP address via curl. The ClickHouse HTTP port 8123 returns Ok., while port 9000 returns a notice requesting the use of port 8123:

curl http://localhost:9000
Handling connection for 9000
Port 9000 is for clickhouse-client program
You must use port 8123 for HTTP.
curl http://localhost:8123
Handling connection for 8123
Ok.

Once verified, connect to the ClickHouse cluster via either HTTP or ClickHouse TCP as needed.
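
For example, with the port-forward from the previous step still running, a connection from the minikube host over the native TCP port might look like the following (a sketch that assumes the clickhouse-client binary is installed on that machine):

clickhouse-client --host localhost --port 9000 --query "SELECT version()"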

3.3.5 - Storage Guide

How to configure storage options for the Altinity Kubernetes Operator

Altinity Kubernetes Operator users have different options regarding persistent storage depending on their environment and situation. The following guides detail how to set up persistent storage for local and cloud storage environments.

3.3.5.1 - Persistent Storage Overview

Allocate persistent storage for Altinity Kubernetes Operator clusters

Users setting up storage in their local environments can establish persistent volumes in different formats based on their requirements.

Allocating Space

Space is allocated through the Kubernetes PersistentVolume object. ClickHouse clusters established with the Altinity Kubernetes Operator then use the PersistentVolumeClaim to receive persistent storage.

The PersistentVolume can be set in one of two ways:

  • Manually: Manual allocations set the storage area before the ClickHouse cluster is created. Space is then requested through a PersistentVolumeClaim when the ClickHouse cluster is created.
  • Dynamically: Space is allocated when the ClickHouse cluster is created through the PersistentVolumeClaim, and the Kubernetes controlling software manages the process for the user.

For more information on how persistent volumes are managed in Kubernetes, see the Kubernetes documentation Persistent Volumes.
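
For reference, a manually allocated PersistentVolume is a standard Kubernetes object. The following is a minimal sketch only, assuming a hostPath volume and the standard storage class used elsewhere in this guide; adjust the name, capacity, path, and storage class to your environment:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: clickhouse-pv-example
spec:
  storageClassName: standard
  capacity:
    storage: 500Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/clickhouse-data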

Storage Types

Data for ClickHouse clusters is stored in the following ways:

No Persistent Storage

If no persistent storage claim template is specified, then no persistent storage will be allocated. When Kubernetes is stopped or a new manifest applied, all previous data will be lost.

In this example, two shards are specified but have no persistent storage allocated:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "no-persistent"
spec:
  configuration:
    clusters:
      - name: "no-persistent"
        layout:
          shardsCount: 2
          replicasCount: 1

When applied to the namespace test, no persistent storage is found:

kubectl -n test get pv
No resources found

Cluster Wide Storage

If neither the dataVolumeClaimTemplate nor the logVolumeClaimTemplate is specified (see below), then all data is stored under the requested volumeClaimTemplate. This includes all information stored in each pod.

In this example, two shards are specified with one volume of storage that is used by the entire pod:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-storage"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
            volumeClaimTemplate: cluster-storage-vc-template
  templates:
    volumeClaimTemplates:
      - name: cluster-storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi

When applied to the namespace test, the following persistent volumes are found. Note that each pod has 500Mi of storage:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                        STORAGECLASS   REASON   AGE
pvc-6e70c36a-f170-47b5-93a6-88175c62b8fe   500Mi      RWO            Delete           Bound    test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-1-0-0   standard                21s
pvc-ca002bc4-0ad2-4358-9546-0298eb8b2152   500Mi      RWO            Delete           Bound    test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-0-0-0   standard                39s

Cluster Wide Split Storage

Applying the dataVolumeClaimTemplate and logVolumeClaimTemplate template types to the Altinity Kubernetes Operator controlled ClickHouse cluster allows for specific data from each ClickHouse pod to be stored in a particular persistent volume:

  • dataVolumeClaimTemplate: Sets the storage volume for the ClickHouse node data. In a traditional ClickHouse server environment, this would be allocated to /var/lib/clickhouse.
  • logVolumeClaimTemplate: Sets the storage volume for ClickHouse node log files. In a traditional ClickHouse server environment, this would be allocated to /var/log/clickhouse-server.

This allows different storage capacities for log data versus ClickHouse database data, as well as only capturing specific data rather than the entire pod.

In this example, two shards have different storage capacity for dataVolumeClaimTemplate and logVolumeClaimTemplate:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-split-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-split"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
          dataVolumeClaimTemplate: data-volume-template
          logVolumeClaimTemplate: log-volume-template
  templates:
    volumeClaimTemplates:
      - name: data-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi

In this case, retrieving the PersistentVolume allocations shows two storage volumes per pod based on the specifications in the manifest:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                     STORAGECLASS   REASON   AGE
pvc-0b02c5ba-7ca1-4578-b3d9-ff8bb67ad412   100Mi      RWO            Delete           Bound    test/log-volume-template-chi-cluster-split-storage-cluster-split-1-0-0    standard                21s
pvc-4095b3c0-f550-4213-aa53-a08bade7c62c   100Mi      RWO            Delete           Bound    test/log-volume-template-chi-cluster-split-storage-cluster-split-0-0-0    standard                40s
pvc-71384670-c9db-4249-ae7e-4c5f1c33e0fc   500Mi      RWO            Delete           Bound    test/data-volume-template-chi-cluster-split-storage-cluster-split-1-0-0   standard                21s
pvc-9e3fb3fa-faf3-4a0e-9465-8da556cb9eec   500Mi      RWO            Delete           Bound    test/data-volume-template-chi-cluster-split-storage-cluster-split-0-0-0   standard                40s

Pod Mount Based Storage

PersistentVolume objects can be mounted directly into the pod’s mountPath. Any other data is not stored when the container is stopped unless it is covered by another PersistentVolumeClaim.

In the following example, each of the 2 shards in the ClickHouse cluster has the volumes tied to specific mount points:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "pod-split-storage"
spec:
  configuration:
    clusters:
      - name: "pod-split"
        # Templates are specified for this cluster explicitly
        templates:
          podTemplate: pod-template-with-volumes
        layout:
          shardsCount: 2
          replicasCount: 1

  templates:
    podTemplates:
      - name: pod-template-with-volumes
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:21.8
              volumeMounts:
                - name: data-storage-vc-template
                  mountPath: /var/lib/clickhouse
                - name: log-storage-vc-template
                  mountPath: /var/log/clickhouse-server

    volumeClaimTemplates:
      - name: data-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi

When applied to the namespace test, the following persistent volumes are bound to the specified mount points:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                 STORAGECLASS   REASON   AGE
pvc-37be9f84-7ba5-404e-8299-e95a291014a8   500Mi      RWO            Delete           Bound    test/data-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0   standard                24s
pvc-5b2f8694-326d-41cb-94ec-559725947b45   100Mi      RWO            Delete           Bound    test/log-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0    standard                24s
pvc-84768e78-e44e-4295-8355-208b07330707   500Mi      RWO            Delete           Bound    test/data-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0   standard                43s
pvc-9e123af7-01ce-4ab8-9450-d8ca32b1e3a6   100Mi      RWO            Delete           Bound    test/log-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0    standard                43s

4 - Operations Guide

Recommended practices and procedures for running ClickHouse in your Production environments.

The methods to make your ClickHouse environment successful.

4.1 - Security

Security settings and best practices for ClickHouse

Keep your ClickHouse cluster and data safe and secure from intruders.

4.1.1 - Hardening Guide

Hardening procedures for ClickHouse environments.

ClickHouse is known for its ability to scale with clusters, handle terabytes to petabytes of data, and return query results fast. It also has a plethora of built-in security options and features that help keep that data safe from unauthorized users.

Hardening your individual ClickHouse system will depend on the situation, but the following processes are generally applicable in any environment. Each of these can be handled separately, and do not require being performed in any particular order.

4.1.1.1 - User Hardening

User hardening security procedures.

Increasing ClickHouse security at the user level involves the following major steps:

  • User Configuration: Setup secure default users, roles and permissions through configuration or SQL.

  • User Network Settings: Limit communications by hostname or IP address

  • Secure Password: Store user information as hashed values.

  • Set Quotas: Limit how many resources users can use in given intervals.

  • Use Profiles: Use profiles to set common security settings across multiple accounts.

  • Database Restrictions: Narrow the databases, tables and rows that a user can access.

  • Enable Remote Authentication: Enable LDAP authentication or Kerberos authentication to prevent storing hashed password information, and enforce password standards.

  • IMPORTANT NOTE: Configuration settings can be stored in the default /etc/clickhouse-server/config.xml file. However, this file can be overwritten during vendor upgrades. To preserve configuration settings it is recommended to store them in /etc/clickhouse-server/config.d as separate XML files.

User Configuration

The hardening steps to apply to users are:

  • Restrict user access only to the specific host names or IP addresses when possible.
  • Store all passwords in SHA256 format.
  • Set quotas on user resources for users when possible.
  • Use profiles to set similar properties across multiple users, and restrict user to the lowest resources required.
  • Offload user authentication through LDAP or Kerberos.

Users can be configured through the XML based settings files, or through SQL based commands.

Detailed information on ClickHouse user configurations can be found on the ClickHouse.Tech documentation site for User Settings.

User XML Settings

Users are listed in the users.xml file under the users element. Each element under users is created as a separate user.

It is recommended that, rather than lumping all users into the users.xml file, each user is placed in a separate XML file under the directory users.d, typically located in /etc/clickhouse-server/users.d/.

Note that if your ClickHouse environment is to be run as a cluster, then user configuration files must be replicated on each node with the relevant users information. We will discuss how to offload some settings into other systems such as LDAP later in the document.

Also note that ClickHouse user names are case sensitive: John is different than john. See the ClickHouse.tech documentation site for full details.

  • IMPORTANT NOTE: If no user name is specified when a user attempts to login, then the account named default will be used.

For example, the following section will create two users:

  • clickhouse_operator: This user has the password clickhouse_operator_password stored in a sha256 hash, is assigned the profile clickhouse_operator, and can access the ClickHouse database from any network host.
  • John: This user can only access the database from localhost, has a basic password of John and is assigned to the default profile.
<users>
    <clickhouse_operator>
        <networks>
            <ip>127.0.0.1</ip>
            <ip>0.0.0.0/0</ip>
            <ip>::/0</ip>
        </networks>          
        <password_sha256_hex>716b36073a90c6fe1d445ac1af85f4777c5b7a155cea359961826a030513e448</password_sha256_hex>
        <profile>clickhouse_operator</profile>
        <quota>default</quota>
    </clickhouse_operator>
    <John>
        <networks>
            <ip>127.0.0.1</ip>
        </networks>
        <password_sha256_hex>73d1b1b1bc1dabfb97f216d897b7968e44b06457920f00f2dc6c1ed3be25ad4c</password_sha256_hex>
        <profile>default</profile>
    </John>
</users>

User SQL Settings

ClickHouse users can be managed by SQL commands from within ClickHouse. For complete details, see the Clickhouse.tech User Account page.

Access management must be enabled at the user level with the access_management setting. In this example, Access Management is enabled for the user John:

<users>
    <John>
       <access_management>1</access_management>
    </John>
</users>

The typical process for DCL (Data Control Language) queries is to have one user enabled with access_management, then generate the other accounts through queries. See the ClickHouse.tech Access Control and Account Management page for more details.

Once enabled, Access Management settings can be managed through SQL queries. For example, to create a new user called newJohn with their password set as a sha256 hash and restricted to a specific IP address subnet, the following SQL command can be used:

CREATE USER IF NOT EXISTS newJohn
  IDENTIFIED WITH SHA256_PASSWORD BY 'secret'
  HOST IP '192.168.128.1/24' SETTINGS readonly=1;

Access Management through SQL commands includes the ability to do the following; a combined example is sketched after this list:

  • Set roles
  • Apply policies to users
  • Set user quotas
  • Restrict user access to databases, tables, or specific rows within tables.
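
The following is a minimal sketch of those operations, using hypothetical names (a readers role and a sales database) together with the newJohn user created above:

CREATE ROLE IF NOT EXISTS readers;
GRANT SELECT ON sales.* TO readers;
GRANT readers TO newJohn;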

User Network Settings

Users can have their access to the ClickHouse environment restricted by the network they are accessing the network from. Users can be restricted to only connect from:

  • IP: IP address or netmask.
    • For all IP addresses, use 0.0.0.0/0 for IPv4, ::/0 for IPv6
  • Host: The DNS resolved hostname the user is connecting from.
  • Host Regexp (Regular Expression): A regular expression of the hostname.

Accounts should be restricted to the networks that they connect from when possible.

User Network SQL Settings

User access from specific networks can be set through SQL commands. For complete details, see the Clickhouse.tech Create User page.

Network access is controlled through the HOST option when creating or altering users. Host options include:

  • ANY (default): Users can connect from any location
  • LOCAL: Users can only connect locally.
  • IP: A specific IP address or subnet.
  • NAME: A specific FQDN (Fully Qualified Domain Name)
  • REGEX: Filters hosts that match a regular expression.
  • LIKE: Filters hosts by the LIKE operator.

For example, to restrict the user john to only connect from the local subnet of ‘192.168.0.0/16’:

ALTER USER john
  HOST IP '192.168.0.0/16';

Or to restrict this user to only connecting from the specific host names awesomeplace1.com, awesomeplace2.com, etc:

ALTER USER john
  HOST REGEXP 'awesomeplace[12345].com';

User Network XML Settings

User network settings are stored under the user configuration files /etc/clickhouse-server/config.d with the <networks> element controlling the sources that the user can connect from through the following settings:

  • <ip> : IP Address or subnet mask.
  • <host>: Hostname.
  • <host_regexp>: Regular expression of the host name.

For example, the following will allow connections only from localhost:

<networks>
    <ip>127.0.0.1</ip>
</networks> 

The following will restrict the user to connecting only from the host example.com, or from supercool1.com, supercool2.com, etc.:

<networks>
    <host>example.com</host>
    <host_regexp>supercool[1234].com</host_regexp>
</networks> 

If there are hosts or other settings that are applied across multiple accounts, one option is to use the Substitution feature as detailed in the ClickHouse.tech Configuration Files page. For example, in the /etc/metrika.xml file used for substitutions, a local_networks element can be defined:

<local_networks>
    <ip>192.168.1.0/24</ip>
</local_networks>

This can then be applied to one or more users with the incl attribute when specifying their network access:

<networks incl="local_networks" replace="replace">
</networks>

Secure Password

Passwords can be stored in plaintext or SHA256 (hex format).

SHA256 format passwords are labeled with the <password_sha256_hex> element. SHA256 password can be generated through the following command:

echo -n "secret" | sha256sum | tr -d '-'

OR:

echo -n "secret" | shasum -a 256 | tr -d '-'

  • IMPORTANT NOTE: The -n option prevents echo from appending a newline, so the hash is computed on the exact string.

For example:

echo -n "clickhouse_operator_password" | shasum -a 256 | tr -d '-'
716b36073a90c6fe1d445ac1af85f4777c5b7a155cea359961826a030513e448

Secure Password SQL Settings

Passwords can be set when using the CREATE USER OR ALTER USER with the IDENTIFIED WITH option. For complete details, see the ClickHouse.tech Create User page. The following secure password options are available:

  • sha256_password BY ‘STRING’: Converts the submitted STRING value to a sha256 hash.
  • sha256_hash BY ‘HASH’ (best option): Stores the submitted HASH directly as the sha256 hash password value.
  • double_sha1_password BY ‘STRING’ (only used when allowing logins through mysql_port): Converts the submitted STRING value to a double SHA1 hash.
  • double_sha1_hash BY ‘HASH’ (only used when allowing logins through mysql_port): Stores the submitted HASH directly as the double SHA1 hash password value.

For example, to store the sha256 hashed value of “password” for the user John:

ALTER USER John IDENTIFIED WITH sha256_hash BY '5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8';

Secure Password XML Settings

Passwords can be set as part of the user’s settings in the user configuration files in /etc/clickhouse-server/config.d. For complete details, see the Clickhouse.tech User Settings.

To set a user’s password with a sha256 hash, use the password_sha256_hex branch for the user. For example, to set the sha256 hashed value of “password” for the user John:

<users>
    <John>
        <password_sha256_hex>5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8</password_sha256_hex>
    </John>
</users>

Set Quotas

Quotas set how many resources can be accessed in a given time, limiting a user’s ability to tie up resources in the system. More details can be found on the ClickHouse.tech Quotas page.

Quota SQL Settings

Quotas can be created or altered through SQL queries, then applied to users.

For more information on ClickHouse quotas, see the ClickHouse.tech Access Control page on Quotas.
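
As a sketch, the following creates a quota equivalent to the XML example in the next section and assigns it to a hypothetical user John:

CREATE QUOTA IF NOT EXISTS limited
    FOR INTERVAL 1 hour MAX queries = 1000,
    FOR INTERVAL 24 hour MAX queries = 10000
    TO John;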

Quota XML Settings

These are defined in the users.xml file under the element quotas. Each branch of the quota element is the name of the quota being defined.

Quotas are set by intervals, which can each have different restrictions. For example, this quota named limited has one interval that allows a maximum of 1000 queries per hour, and another interval that allows a total of 10000 queries over a 24 hour period.

<quotas>
    <limited>
        <interval>
            <duration>3600</duration>
            <queries>1000</queries>
        </interval>
        <interval>
            <duration>86400</duration>
            <queries>10000</queries>
        </interval>
    </limited>
</quotas>

Use Profiles

Profiles allow a common set of settings to be applied to multiple users under the same profile name. More details on Settings Profiles are available on the ClickHouse.tech site.

Profile XML Settings

Profiles are applied to a user with the profile element. For example, this assigns the restricted profile to the user John:

<users>
    <John>
        <networks>
            <ip>127.0.0.1</ip>
            <ip>0.0.0.0/0</ip>
            <ip>::/0</ip>
        </networks>
        <password_sha256_hex>716b36073a90c6fe1d445ac1af85f4777c5b7a155cea359961826a030513e448</password_sha256_hex>
        <profile>restricted</profile>
    </John>
</users>

Profiles are set in the users.xml file under the profiles element. Each branch of this element is the name of a profile. The profile restricted shown here only allows for eight threads to be used at a time for users with this profile:

<profiles>
    <restricted>
        <!-- The maximum number of threads when running a single query. -->
        <max_threads>8</max_threads>
    </restricted>
</profiles>

Recommended profile settings include the following; a sample profile combining them is sketched after this list:

  • readonly: Restricts users assigned the profile to read-only queries and prevents them from changing settings at the session level.
  • max_execution_time: Limits the amount of time a process will run before being forced to time out.
  • max_bytes_before_external_group_by: Maximum RAM allocated for a single GROUP BY sort.
  • max_bytes_before_external_sort: Maximum RAM allocated for sort commands.
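
A minimal sketch of a profile that combines these settings is shown below; the profile name hardened and all values are illustrative only and should be tuned for your workload:

<profiles>
    <hardened>
        <readonly>1</readonly>
        <max_execution_time>300</max_execution_time>
        <max_bytes_before_external_group_by>10000000000</max_bytes_before_external_group_by>
        <max_bytes_before_external_sort>10000000000</max_bytes_before_external_sort>
    </hardened>
</profiles>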

Database Restrictions

Restrict users to the databases they need, and when possible only the tables or rows within tables that they require access to.

Full details are found on the ClickHouse.tech User Settings documentation.

Database Restrictions XML Settings

To restrict a user’s access by data in the XML file:

  1. Update user configuration files in /etc/clickhouse-server/config.d or update their permissions through SQL queries.
  2. For each user to update:
    1. Add the <databases> element with the following branches:
      1. The name of the database to allow access to.
      2. Within the database, the table names allowed to the user.
      3. Within the table, add a <filter> to match rows that fit the filter.

Database Restrictions XML Settings Example

The following restricts the user John to only access the database sales, and within it only the table named clients, and only rows where salesman = 'John':

<John>
    <databases>
        <sales>
            <clients>
                <filter>salesman = 'John'</filter>
            </clients>
        </sales>
    </databases>
</John>
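
When Access Management is enabled, similar restrictions can be expressed in SQL. The following is a sketch of an equivalent grant and row policy for the example above, using a hypothetical policy name:

GRANT SELECT ON sales.clients TO John;
CREATE ROW POLICY IF NOT EXISTS salesman_filter ON sales.clients
    FOR SELECT USING salesman = 'John' TO John;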

Enable Remote Authentication

One issue with user settings is that in a cluster environment, each node requires a separate copy of the user configuration files, which includes a copy of the SHA256 hashed password.

One method of reducing the exposure of user passwords, even in a hashed format in a restricted section of the file system, is to use external authentication sources. This prevents password data from being stored in local file systems and allows changes to user authentication to be managed from one source.

Enable LDAP

LDAP servers are defined in the ClickHouse configuration settings such as /etc/clickhouse-server/config.d/ldap.xml. For more details, see the ClickHouse.tech site on Server Configuration settings.

Enabling LDAP server support in ClickHouse allows you to have one authority on login credentials, set password policies, and other essential security considerations through your LDAP server. It also prevents password information being stored on your ClickHouse servers or cluster nodes, even in a SHA256 hashed form.

To add one or more LDAP servers to your ClickHouse environment, each node will require the ldap settings:

<ldap>
    <server>ldapserver_hostname</server>
        <roles>
            <my_local_role1 />
            <my_local_role2 />
        </roles>
</ldap>

When creating users, specify the ldap server for the user:

CREATE USER IF NOT EXISTS newUser
    IDENTIFIED WITH ldap BY 'ldapserver_hostname'
    HOST ANY;

When the user attempts to authenticate to ClickHouse, their credentials will be verified against the LDAP server specified from the configuration files.

4.1.1.2 - Network Hardening

Secure network communications with ClickHouse

Hardening the network communications for your ClickHouse environment is about reducing exposure of someone listening in on traffic and using that against you. Network hardening falls under the following major steps:

  • Reduce Exposure

  • Enable TLS

  • Encrypt Cluster Communications

  • IMPORTANT NOTE: Configuration settings can be stored in the default /etc/clickhouse-server/config.xml file. However, this file can be overwritten during vendor upgrades. To preserve configuration settings it is recommended to store them in /etc/clickhouse-server/config.d as separate XML files with the same root element, typically <yandex>. For this guide, we will only refer to the configuration files in /etc/clickhouse-server/config.d for configuration settings.

Reduce Exposure

It’s easier to prevent entry into your system when there are fewer points of access, so unused ports should be disabled.

ClickHouse has native support for MySQL clients, PostgreSQL clients, and others. The enabled ports are set in the /etc/clickhouse-server/config.d files.

To reduce exposure to your ClickHouse environment:

  1. Review which ports are required for communication. A complete list of the ports and configurations can be found on the ClickHouse documentation site for Server Settings.

  2. Comment out any ports not required in the configuration files. For example, if there’s no need for the MySQL client port, then it can be commented out:

    <!-- <mysql_port>9004</mysql_port> -->
    

Enable TLS

ClickHouse allows for both encrypted and unencrypted network communications. To harden network communications, unencrypted ports should be disabled and TLS enabled.

TLS encryption requires a certificate, and whether to use a public or private Certificate Authority (CA) is based on your needs.

  • Public CA: Recommended for external services or connections where you can not control where they will be connecting from.
  • Private CA: Best used when the ClickHouse services are internal only and you can control where hosts are connecting from.
  • Self-signed certificate: Only recommended for testing environments.

Whichever method is used, the following files will be required to enable TLS with ClickHouse:

  • Server X509 Certificate: Default name server.crt
  • Private Key: Default name server.key
  • Diffie-Hellman parameters: Default name dhparam.pem

Generate Files

No matter which approach is used, the Private Key and the Diffie-Hellman parameters file will be required. These instructions may need to be modified to match the requirements of the Certificate Authority used. The instructions below require the use of openssl, and were tested against OpenSSL version 1.1.1j.

  1. Generate the private key, and enter the pass phrase when required:

    openssl genrsa -aes256 -out server.key 2048
    
  2. Generate dhparam.pem with 4096-bit Diffie-Hellman parameters. This will take some time but only has to be done once:

    openssl dhparam -out dhparam.pem 4096
    
  3. Create the Certificate Signing Request (CSR) from the generated private key. Complete the requested information such as Country, etc.

    openssl req -new -key server.key -out server.csr
    
  4. Store the files server.key, server.csr, and dhparam.pem in a secure location, typically /etc/clickhouse-server/.

Public CA

Retrieving the certificate from a Public CA or internal CA is performed by registering with a Public CA such as Let's Encrypt or Verisign, or with an internal organizational CA service. This process involves:

  1. Submit the CSR to the CA. The CA will sign the certificate and return it, typically as the file server.crt.
  2. Store the file server.crt in a secure location, typically /etc/clickhouse-server/.

Create a Private CA

If you do not have an internal CA or do not need a Public CA, a private CA can be generated through the following process:

  1. Create the Certificate Private Key:

    openssl genrsa -aes256 -out internalCA.key 2048
    
  2. Create the self-signed root certificate from the certificate key:

    openssl req -new -x509 -days 3650 -key internalCA.key \
        -sha256 -extensions v3_ca -out internalCA.crt
    
  3. Store the Certificate Private Key and the self-signed root certificate in a secure location.

  4. Sign the server.csr file with the self-signed root certificate:

    openssl x509 -sha256 -req -in server.csr -CA internalCA.crt \
        -CAkey internalCA.key -CAcreateserial -out server.crt -days 365
    
  5. Store the file server.crt in a secure location, typically /etc/clickhouse-server/.

Self Signed Certificate

To skip right to making a self-signed certificate, follow these instructions.

  • IMPORTANT NOTE: This is not recommended for production systems, only for testing environments.
  1. With the server.key file from previous steps, create the self-signed certificate. Replace my.host.name with the actual host name used:

    openssl req -subj "/CN=my.host.name" -new -x509 -days 365 -key server.key -out server.crt
    
  2. Store the file server.crt in a secure location, typically /etc/clickhouse-server/.

  3. Each clickhouse-client user that connects to the server with the self-signed certificate will have to allow the invalidCertificateHandler by updating their clickhouse-client configuration files, such as /etc/clickhouse-client/config.xml:

    <config>
        <openSSL>
            <client>
                ...
                <invalidCertificateHandler>
                    <name>AcceptCertificateHandler</name>
                </invalidCertificateHandler>
            </client>
        </openSSL>
    </config>
    

Enable TLS in ClickHouse

Once the files server.crt, server.key, and dhparam.pem have been generated and stored appropriately, update the ClickHouse server configuration files located at /etc/clickhouse-server/config.d.

To enable TLS and disable unencrypted ports:

  1. Review the /etc/clickhouse-server/config.d files. Comment out unencrypted ports, including http_port and tcp_port:

    <!-- <http_port>8123</http_port> -->
    <!-- <tcp_port>9000</tcp_port> -->
    
  2. Enable encrypted ports. A complete list of ports and settings is available on the ClickHouse documentation site for Server Settings. For example:

    <https_port>8443</https_port>
    <tcp_port_secure>9440</tcp_port_secure>
    
  3. Specify the certificate files to use:

    <openSSL>
        <server>
            <!-- Used for https server AND secure tcp port -->
            <certificateFile>/etc/clickhouse-server/server.crt</certificateFile>
            <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>
            <dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile>
            ...
        </server>
    ...
    </openSSL>
    

Encrypt Cluster Communications

If your organization runs ClickHouse as a cluster, then cluster-to-cluster communications should be encrypted. This includes distributed queries and interservice replication. To harden cluster communications:

  1. Create a user for distributed queries. This user should only be able to connect within the cluster, so restrict its access to only the subnet or host names used for the network, for example a cluster entirely contained on hosts named logos1, logos2, etc. This internal user can be set with or without a password:

    CREATE USER IF NOT EXISTS internal ON CLUSTER 'my_cluster'
        IDENTIFIED WITH NO_PASSWORD
        HOST REGEXP '^logos[1234]$'
    
  2. Enable TLS for interservice replication and comment out the unencrypted interserver port by updating the /etc/clickhouse-server/config.d files:

    <!-- <interserver_http_port>9009</interserver_http_port> -->
    <interserver_https_port>9010</interserver_https_port>
    
  3. Set the interserver_http_credentials in the /etc/clickhouse-server/config.d files, and include the internal username and password:

    <interserver_http_credentials>
        <user>internal</user>
        <password></password>
    </interserver_http_credentials>
    
  4. Enable TLS for distributed queries by editing the file /etc/clickhouse-server/config.d/remote_servers.xml

    1. For ClickHouse 20.10 and later versions, set a shared secret text and set the port to secure for each shard:
    <remote_servers>
        <my_cluster>
        <shard>
            <secret>shared secret text</secret> <!-- Update here -->
            <internal_replication>true</internal_replication>
            <replica>
                <host>logos1</host> <!-- Update here -->
                <port>9440</port> <!-- Secure Port -->
                <secure>1</secure> <!-- Update here, sets port to secure -->
            </replica>
        </shard>
    ...
    
    2. For previous versions of ClickHouse, set the internal user and enable secure communication:
    <remote_servers>
        <my_cluster>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>logos1</host> <!-- Update here -->
                    <port>9440</port> <!-- Secure Port -->
                    <secure>1</secure> <!-- Update here -->
                    <user>internal</user> <!-- Update here -->
                </replica>
            ... 
            </shard>
    ...
    

4.1.1.3 - Storage Hardening

Secure data stored for ClickHouse

ClickHouse data is ultimately stored on file systems. Keeping that data protected when it is being used or “at rest” is necessary to prevent unauthorized entities from accessing your organization’s private information.

Hardening stored ClickHouse data is split into the following categories:

  • Host-Level Security

  • Volume Level Encryption

  • Column Level Encryption

  • Log File Protection

  • IMPORTANT NOTE: Configuration settings can be stored in the default /etc/clickhouse-server/config.xml file. However, this file can be overwritten during vendor upgrades. To preserve configuration settings it is recommended to store them in /etc/clickhouse-server/config.d as separate XML files with the same root element, typically <yandex>. For this guide, we will only refer to the configuration files in /etc/clickhouse-server/config.d for configuration settings.

Host-Level Security

File-level access to the files that ClickHouse uses to run should be restricted as much as possible; a sample ownership and permissions check is sketched after the following list.

  • ClickHouse does not require root access to the file system, and runs by default as the user clickhouse.
  • The following directories should be restricted to the minimum number of users:
    • /etc/clickhouse-server: Used for ClickHouse settings and account credentials created by default.
    • /var/lib/clickhouse: Used for ClickHouse data and new credentials.
    • /var/log/clickhouse-server: Log files that may display privileged information through queries. See Log File Protection for more information.
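
The following is a minimal sketch only, assuming the default clickhouse user and group created by the package installation; adjust the ownership and modes to your organization's policies:

# Confirm ownership of the ClickHouse directories
ls -ld /etc/clickhouse-server /var/lib/clickhouse /var/log/clickhouse-server
# Restrict the data and log directories to the clickhouse user and group
sudo chown -R clickhouse:clickhouse /var/lib/clickhouse /var/log/clickhouse-server
sudo chmod -R o-rwx /var/lib/clickhouse /var/log/clickhouse-server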

Volume Level Encryption

Encrypting data on the file system prevents unauthorized users who may have gained access to the file system that your ClickHouse database is stored on from being able to access the data itself. Depending on your environment, different encryption options may be required.

Cloud Storage

If your ClickHouse database is stored in a cloud service such as AWS or Azure, verify that the cloud supports encrypting the volume. For example, Amazon AWS provides a method to encrypt new Amazon EBS volumes by default.

The Altinity.Cloud service provides the ability to set the Volume Type to gp2-encrypted. For more details, see the Altinity.Cloud Cluster Settings.

Local Storage

For organizations that host ClickHouse clusters on their own managed systems, LUKS is a recommended solution. Instructions for Linux distributions including Red Hat and Ubuntu are available. Check with the distribution your organization uses for instructions on how to encrypt those volumes.
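
As a minimal sketch only, and assuming a dedicated, empty block device (/dev/sdb here is a placeholder), setting up a LUKS-encrypted volume for the ClickHouse data directory might look like the following:

# Initialize LUKS on the device (this destroys any existing data) and open it
sudo cryptsetup luksFormat /dev/sdb
sudo cryptsetup luksOpen /dev/sdb clickhouse_data
# Create a filesystem on the mapped device and mount it as the ClickHouse data directory
sudo mkfs.ext4 /dev/mapper/clickhouse_data
sudo mount /dev/mapper/clickhouse_data /var/lib/clickhouse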

Kubernetes Encryption

If your ClickHouse cluster is managed by Kubernetes, the StorageClass used may be encrypted. For more information, see the Kubernetes Storage Class documentation.
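
For example, on AWS a StorageClass similar to the following sketch (using the in-tree EBS provisioner; the name and parameters are illustrative and may differ in your cluster) provisions encrypted gp2 volumes that a volumeClaimTemplate can reference through storageClassName:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-encrypted
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true"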

Column Level Encryption

Organizations running ClickHouse 20.11 or later can encrypt individual columns with AES functions. For full information, see the ClickHouse.tech Encryption functions documentation.

Applications are responsible for their own keys. Before enabling column level encryption, test to verify that encryption does not negatively impact performance.

The following functions are available:

Function                                          MySQL AES Compatible
encrypt(mode, plaintext, key, [iv, aad])
decrypt(mode, ciphertext, key, [iv, aad])
aes_encrypt_mysql(mode, plaintext, key, [iv])     *
aes_decrypt_mysql(mode, ciphertext, key, [iv])    *

Encryption function arguments:

Argument   Description                                                                                Type
mode       Encryption mode.                                                                           String
plaintext  Text that needs to be encrypted.                                                           String
key        Encryption key.                                                                            String
iv         Initialization vector. Required for -gcm modes, optional for others.                       String
aad        Additional authenticated data. It isn't encrypted, but it affects decryption. Works only in -gcm modes; for others it throws an exception.   String

Column Encryption Examples

This example displays how to encrypt information using a hashed key.

  1. Takes a hex value, unhexes it and stores it as key.
  2. Selects the value and encrypts it with the key, then displays the encrypted value.

WITH unhex('658bb26de6f8a069a3520293a572078f') AS key
SELECT hex(encrypt('aes-128-cbc', 'Hello world', key)) AS encrypted
┌─encrypted────────────────────────┐
│ 46924AC12F4915F2EEF3170B81A1167E │
└──────────────────────────────────┘

This shows how to decrypt encrypted data:

  1. Takes a hex value, unhexes it and stores it as key.
  2. Decrypts the selected value with the key as text.

WITH unhex('658bb26de6f8a069a3520293a572078f') AS key SELECT decrypt('aes-128-cbc',
  unhex('46924AC12F4915F2EEF3170B81A1167E'), key) AS plaintext
┌─plaintext───┐
│ Hello world │
└─────────────┘

Log File Protection

The great thing about log files is that they show what happened. The problem is when they show too much, such as the encryption key used to encrypt or decrypt data:

2021.01.26 19:11:23.526691 [ 1652 ] {4e196dfa-dd65-4cba-983b-d6bb2c3df7c8}
<Debug> executeQuery: (from [::ffff:127.0.0.1]:54536, using production
parser) WITH unhex('658bb26de6f8a069a3520293a572078f') AS key SELECT
decrypt(???), key) AS plaintext

These queries can be hidden through query masking rules, applying regular expressions to replace commands as required. For more information, see the ClickHouse.tech Server Settings documentation.

To prevent certain queries from appearing in log files or to hide sensitive information:

  1. Update the configuration files, located by default in /etc/clickhouse-server/config.d.
  2. Add the element query_masking_rules.
  3. Set each rule with the following:
    1. name: The name of the rule.
    2. regexp: The regular expression to search for.
    3. replace: The replacement value that matches the rule’s regular expression.

For example, the following will hide encryption and decryption functions in the log file:

 <query_masking_rules>
    <rule>
        <name>hide encrypt/decrypt arguments</name>
        <regexp>
           ((?:aes_)?(?:encrypt|decrypt)(?:_mysql)?)\s*\(\s*(?:'(?:\\'|.)+'|.*?)\s*\)
        </regexp>
        <!-- or more secure, but also more invasive:
            (aes_\w+)\s*\(.*\)
        -->
        <replace>\1(???)</replace>
    </rule>
</query_masking_rules>

4.2 - Care and Feeding of Zookeeper with ClickHouse

Installing, configuring, and recovering Zookeeper

ZooKeeper is required for ClickHouse cluster replication. Keeping ZooKeeper properly maintained and fed provides the best performance and reduces the likelihood that your ZooKeeper nodes will become “sick”.

Elements of this guide can also be found on the ClickHouse on Kubernetes Quick Start guide, which details how to use Kubernetes and ZooKeeper with the clickhouse-operator.

4.2.1 - ZooKeeper Installation and Configuration

How to configure Zookeeper to work best with ClickHouse

Prepare and Start ZooKeeper

Preparation

Before beginning, determine whether ZooKeeper will run in standalone or replicated mode.

  • Standalone mode: One zookeeper server to service the entire ClickHouse cluster. Best for evaluation, development, and testing.
    • Should never be used for production environments.
  • Replicated mode: Multiple zookeeper servers in a group called an ensemble. Replicated mode is recommended for production systems.
    • A minimum of 3 zookeeper servers are required.
    • 3 servers is the optimal setup and, with proper tuning, functions well even on heavily loaded systems.
    • 5 servers is less likely to lose quorum entirely, but also results in longer quorum acquisition times.
    • Additional servers can be added, but should always be an odd number of servers.

Precautions

The following practices should be avoided:

  • Never deploy even numbers of ZooKeeper servers in an ensemble.
  • Do not install ZooKeeper on ClickHouse nodes.
  • Do not share ZooKeeper with other applications like Kafka.
  • Place the ZooKeeper dataDir and logDir on fast storage that will not be used for anything else.

Applications to Install

Install the following applications in your servers:

  1. zookeeper (3.4.9 or later)
  2. netcat

Configure ZooKeeper

  1. /etc/zookeeper/conf/myid

    The myid file consists of a single line containing only the text of that machine’s id. So myid of server 1 would contain the text “1” and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255.

  2. /etc/zookeeper/conf/zoo.cfg

    Every machine that is part of the ZooKeeper ensemble should know about every other machine in the ensemble. You accomplish this with a series of lines of the form server.id=host:port:port

    # specify all zookeeper servers
    # The first port is used by followers to connect to the leader
    # The second one is used for leader election
    server.1=zookeeper1:2888:3888
    server.2=zookeeper2:2888:3888
    server.3=zookeeper3:2888:3888
    

    These lines must be the same on every ZooKeeper node

  3. /etc/zookeeper/conf/zoo.cfg

    This setting MUST be added on every ZooKeeper node:

    # The time interval in hours for which the purge task has to be triggered.
    # Set to a positive integer (1 and above) to enable the auto purging. Defaults to 0.
    autopurge.purgeInterval=1
    autopurge.snapRetainCount=5
    

Install Zookeeper

Depending on your environment, follow the Apache Zookeeper Getting Started guide, or the Zookeeper Administrator's Guide.

Start ZooKeeper

Depending on your installation, start ZooKeeper with the following command:

sudo -u zookeeper /usr/share/zookeeper/bin/zkServer.sh

Verify ZooKeeper is Running

Use the following commands to verify ZooKeeper is available:

echo ruok | nc localhost 2181
echo mntr | nc localhost 2181
echo stat | nc localhost 2181
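
A healthy node typically replies imok to the ruok command. For example:

echo ruok | nc localhost 2181
imok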

Check the following files and directories to verify ZooKeeper is running and making updates:

  • Logs: /var/log/zookeeper/zookeeper.log
  • Snapshots: /var/lib/zookeeper/version-2/

Connect to ZooKeeper

From the localhost, connect to ZooKeeper with the following command to verify access (replace the IP address with your Zookeeper server):

bin/zkCli.sh -server 127.0.0.1:2181

Tune ZooKeeper

The following optional settings can be used depending on your requirements.

Improve Node Communication Reliability

The following settings can be used to improve node communication reliability:

/etc/zookeeper/conf/zoo.cfg
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between sending a request and getting an acknowledgement
syncLimit=5

Reduce Snapshots

The following settings will create fewer snapshots which may reduce system requirements.

/etc/zookeeper/conf/zoo.cfg
# To avoid seeks ZooKeeper allocates space in the transaction log file in blocks of preAllocSize kilobytes.
# The default block size is 64M. One reason for changing the size of the blocks is to reduce the block size
# if snapshots are taken more often. (Also, see snapCount).
preAllocSize=65536
# ZooKeeper logs transactions to a transaction log. After snapCount transactions are written to a log file a
# snapshot is started and a new transaction log file is started. The default snapCount is 10,000.
snapCount=10000

Configuring ClickHouse to use ZooKeeper

Once ZooKeeper has been installed and configured, ClickHouse can be modified to use ZooKeeper. After the following steps are completed, a restart of ClickHouse will be required.

To configure ClickHouse to use ZooKeeper, follow the steps shown below. The recommended settings are located on ClickHouse.tech zookeeper server settings.

  1. Create a configuration file with the list of ZooKeeper nodes. Best practice is to put the file in /etc/clickhouse-server/config.d/zookeeper.xml.

    <yandex>
        <zookeeper>
            <node>
                <host>example1</host>
                <port>2181</port>
            </node>
            <node>
                <host>example2</host>
                <port>2181</port>
            </node>
            <session_timeout_ms>30000</session_timeout_ms>
            <operation_timeout_ms>10000</operation_timeout_ms>
            <!-- Optional. Chroot suffix. Should exist. -->
            <root>/path/to/zookeeper/node</root>
            <!-- Optional. ZooKeeper digest ACL string. -->
            <identity>user:password</identity>
        </zookeeper>
    </yandex>
    
  2. Check the distributed_ddl parameter in config.xml. This parameter can be defined in another configuration file, and the path can be changed to any value that you like. If several ClickHouse clusters share the same ZooKeeper, the distributed_ddl path should be unique for every ClickHouse cluster setup.

    <!-- Allow to execute distributed DDL queries (CREATE, DROP, ALTER, RENAME) on cluster. -->
    <!-- Works only if ZooKeeper is enabled. Comment it out if such functionality isn't required. -->
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    
        <!-- Settings from this profile will be used to execute DDL queries -->
        <!-- <profile>default</profile> -->
    </distributed_ddl>
    
  3. Check /etc/clickhouse-server/preprocessed/config.xml. You should see your changes there.

  4. Restart ClickHouse. Check ClickHouse connection to ZooKeeper detailed in ZooKeeper Monitoring.

Converting Tables to Replicated Tables

Creating a replicated table

Replicated tables use a replicated table engine, for example ReplicatedMergeTree. The following example shows how to create a simple replicated table.

This example assumes that you have defined appropriate macro values for cluster, shard, and replica in macros.xml to enable cluster replication using zookeeper. For details consult the ClickHouse.tech Data Replication guide.
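
For reference, a minimal macros definition for one node might look like the following sketch, typically stored in /etc/clickhouse-server/config.d/macros.xml. The values are illustrative, and each node defines its own shard and replica values:

<yandex>
    <macros>
        <cluster>c1</cluster>
        <shard>0</shard>
        <replica>replica_1</replica>
    </macros>
</yandex>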

CREATE TABLE test ON CLUSTER '{cluster}'
(
    timestamp DateTime,
    contractid UInt32,
    userid UInt32
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/default/test', '{replica}')
PARTITION BY toYYYYMM(timestamp)
ORDER BY (contractid, toDate(timestamp), userid)
SAMPLE BY userid;

The ON CLUSTER clause ensures the table will be created on the nodes of {cluster} (a macro value). This example automatically creates a ZooKeeper path for each replica table that looks like the following:

/clickhouse/tables/{cluster}/{shard}/default/test

becomes:

/clickhouse/tables/c1/0/default/test

You can see ZooKeeper replication data for this node with the following query (updating the path based on your environment):

SELECT *
  FROM system.zookeeper
  WHERE path = '/clickhouse/tables/c1/0/default/test'

Removing a replicated table

To remove a replicated table, use DROP TABLE as shown in the following example. The ON CLUSTER clause ensures the table will be deleted on all nodes. Omit it to delete the table on only a single node.

DROP TABLE test ON CLUSTER '{cluster}';

As each table is deleted the node is removed from replication and the information for the replica is cleaned up. When no more replicas exist, all ZooKeeper data for the table will be cleared.

Cleaning up ZooKeeper data for replicated tables

  • IMPORTANT NOTE: Cleaning up ZooKeeper data manually can corrupt replication if you make a mistake. Raise a support ticket and ask for help if you have any doubt concerning the procedure.

New ClickHouse versions now support SYSTEM DROP REPLICA which is an easier command.

For example:

SYSTEM DROP REPLICA 'replica_name' FROM ZKPATH '/path/to/table/in/zk';

ZooKeeper data for the table might not be cleared fully if there is an error when deleting the table, or the table becomes corrupted, or the replica is lost. You can clean up ZooKeeper data in this case manually using the ZooKeeper rmr command. Here is the procedure:

  1. Login to ZooKeeper server.
  2. Run zkCli.sh command to connect to the server.
  3. Locate the path to be deleted, e.g.:
    ls /clickhouse/tables/c1/0/default/test
  4. Remove the path recursively, e.g.,
    rmr /clickhouse/tables/c1/0/default/test

4.2.2 - ZooKeeper Monitoring

Verifying Zookeeper and ClickHouse are working together.

ZooKeeper Monitoring

For organizations that already have Apache ZooKeeper configured, either manually or with a Kubernetes operator such as the clickhouse-operator for Kubernetes, monitoring your ZooKeeper nodes will help you catch issues before they become failures.

Checking ClickHouse connection to ZooKeeper

To check connectivity between ClickHouse and ZooKeeper:

  1. Confirm that ClickHouse can connect to ZooKeeper. You should be able to query the system.zookeeper table, and see the path for distributed DDL created in ZooKeeper through that table. If something went wrong, check the ClickHouse logs.

    $ clickhouse-client -q "select * from system.zookeeper where path='/clickhouse/task_queue/'"
    ddl 17183334544    17183334544    2019-02-21 21:18:16    2019-02-21 21:18:16    0    8    0    0    0    8    17183370142    /clickhouse/task_queue/
    
  2. Confirm ZooKeeper accepts connections from ClickHouse. On the ZooKeeper nodes you can also see whether a connection was established, and find the IP address of the ClickHouse server in the list of clients:

    $ echo stat | nc localhost 2181
    ZooKeeper version: 3.4.9-3--1, built on Wed, 23 May 2018 22:34:43 +0200
    Clients:
     /10.25.171.52:37384