ClickHouse on Kubernetes

Install and Manage ClickHouse Clusters on Kubernetes.

Setting up a cluster of Altinity Stable for ClickHouse is easy with Kubernetes, even if that sentence is a bit of a mouthful. Organizations that want to set up their own distributed ClickHouse environments can do so with the Altinity Kubernetes Operator.

At the time of this writing, the current version of the Altinity Kubernetes Operator is 0.18.5.

1 - Altinity Kubernetes Operator Quick Start Guide

Become familiar with the Kubernetes Altinity Kubernetes Operator in the fewest steps.

If you’re running the Altinity Kubernetes Operator for the first time, or just want to get it up and running as quickly as possible, the Quick Start Guide is for you.

Requirements:

  • An operating system running Kubernetes and Docker, or a service providing support for them such as AWS.
  • A ClickHouse remote client such as clickhouse-client. Full instructions for installing ClickHouse can be found on the ClickHouse Installation page.

1.1 - Installation

Install and Verify the Altinity Kubernetes Operator

The Altinity Kubernetes Operator can be installed in just a few minutes with a single command into your existing Kubernetes environment.

Those who need a more customized installation, or who want to build the Altinity Kubernetes Operator themselves, can do so through the Operator Installation Guide.

Requirements

Before starting, make sure you have the requirements from the Quick Start Guide installed.

Quick Install

To install the Altinity Kubernetes Operator into your existing Kubernetes environment, run the following command, or download the Altinity Kubernetes Operator install file and modify it to best fit your needs. For more information on custom Altinity Kubernetes Operator settings that can be applied, see the Operator Guide.

We recommend specifying the version of the Altinity Kubernetes Operator when installing it. This ensures maximum compatibility with applications and established Kubernetes environments running your ClickHouse clusters. For more information on installing other versions of the Altinity Kubernetes Operator, see the specific Version Installation Guide.

The most current version is 0.18.3:

kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml

Output similar to the following will be displayed on a successful installation. For more information on the resources created in the installation, see Altinity Kubernetes Operator Resources.

customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created

Quick Verify

To verify that the installation was successful, run the following. On a successful installation you’ll be able to see the clickhouse-operator pod under the NAME column.

kubectl get pods --namespace kube-system
NAME                                   READY   STATUS    RESTARTS       AGE
clickhouse-operator-857c69ffc6-dq2sz   2/2     Running   0              5s
coredns-78fcd69978-nthp2               1/1     Running   4 (110s ago)   50d
etcd-minikube                          1/1     Running   4 (115s ago)   50d
kube-apiserver-minikube                1/1     Running   4 (105s ago)   50d
kube-controller-manager-minikube       1/1     Running   4 (115s ago)   50d
kube-proxy-lsggn                       1/1     Running   4 (115s ago)   50d
kube-scheduler-minikube                1/1     Running   4 (105s ago)   50d
storage-provisioner                    1/1     Running   8 (115s ago)   50d

1.2 - First Clusters

Create your first ClickHouse Cluster

If you followed the Quick Installation guide, then you have the
Altinity Kubernetes Operator installed.
Let’s give it something to work with.

Create Your Namespace

For our examples, we’ll be setting up our own Kubernetes namespace test.
All of the examples will be installed into that namespace so we can track
how the cluster is modified with new updates.

Create the namespace with the following kubectl command:

kubectl create namespace test
namespace/test created

Just to make sure we’re in a clean environment,
let’s check for any resources in our namespace:

kubectl get all -n test
No resources found in test namespace.

The First Cluster

We’ll start with the simplest cluster: one shard, one replica.
This template and others are available on the
Altinity Kubernetes Operator GitHub example site.
It contains the following:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1

Save this as sample01.yaml and launch it with the following:

kubectl apply -n test -f sample01.yaml
clickhouseinstallation.clickhouse.altinity.com/demo-01 created

Verify that the new cluster is running. When the status shows
Completed, then it’s ready.

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT
demo-01   0.18.1    1          1        1       6d1d2c3d-90e5-4110-81ab-8863b0d1ac47   Completed             1                          clickhouse-demo-01.test.svc.cluster.local

To retrieve the IP information use the get service option:

kubectl get service -n test
NAME                      TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None           <none>        8123/TCP,9000/TCP,9009/TCP      2s
clickhouse-demo-01        LoadBalancer   10.111.27.86   <pending>     8123:31126/TCP,9000:32460/TCP   19s

So we can see our pod is running, and that we have the
load balancer for the cluster.

Connect To Your Cluster With Exec

Let’s talk to our cluster and run some simple ClickHouse queries.

We can hop in directly through Kubernetes and run the clickhouse-client
that’s part of the image with the following command:

kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- clickhouse-client
ClickHouse client version 20.12.4.5 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 20.12.4 revision 54442.

chi-demo-01-demo-01-0-0-0.chi-demo-01-demo-01-0-0.test.svc.cluster.local :)

From within ClickHouse, we can check out the current clusters:

SELECT * FROM system.clusters
┌─cluster─────────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
 all-replicated                                           1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-sharded                                              1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 demo-01                                                  1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            2  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            3  127.0.0.3                127.0.0.3     9000         0  default                               0                0                        0 
 test_cluster_two_shards                                  1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards                                  2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_localhost                        1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_localhost                        2             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost                                     1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost_secure                              1             1            1  localhost                127.0.0.1     9440         0  default                               0                0                        0 
 test_unavailable_shard                                   1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_unavailable_shard                                   2             1            1  localhost                127.0.0.1        1         0  default                               0                0                        0 
└─────────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

Exit out of your cluster:

chi-demo-01-demo-01-0-0-0.chi-demo-01-demo-01-0-0.test.svc.cluster.local :) exit
Bye.

Connect to Your Cluster with Remote Client

You can also use a remote client such as clickhouse-client to
connect to your cluster through the LoadBalancer.

  • The default username and password are set in the
    clickhouse-operator-install-bundle.yaml file. These values can be
    altered by changing chUsername and chPassword in the ClickHouse
    Credentials section:

    • Default Username: clickhouse_operator
    • Default Password: clickhouse_operator_password
# ClickHouse credentials (username, password and port) to be used
# by operator to connect to ClickHouse instances for:
# 1. Metrics requests
# 2. Schema maintenance
# 3. DROP DNS CACHE
# User with such credentials can be specified in additional ClickHouse
# .xml config files,
# located in `chUsersConfigsPath` folder
chUsername: clickhouse_operator
chPassword: clickhouse_operator_password
chPort: 8123

In either case, the command to connect to your new cluster will
resemble the following, replacing {LoadBalancer hostname} with
the name or IP address of your LoadBalancer, then providing
the proper password. In our examples so far, that’s been localhost.
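
For example, using the default operator credentials and the localhost
endpoint from our examples, the connection command might look like this
(substitute your LoadBalancer hostname or IP address for localhost):

clickhouse-client --host localhost --user=clickhouse_operator --password=clickhouse_operator_password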

From there, just make your ClickHouse SQL queries as you please - but
remember that this particular cluster has no persistent storage.
If the cluster is modified in any way, any databases or tables
created will be wiped clean.

Update Your First Cluster To 2 Shards

Well that’s great - we have a cluster running. Granted, it’s really small
and doesn’t do much, but what if we want to upgrade it?

Sure - we can do that any time we want.

Take your sample01.yaml and save it as sample02.yaml.

Let’s update it to give us two shards running with one replica:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 1

Save your YAML file and apply it. We’ve defined the name in the
metadata, so the Altinity Kubernetes Operator knows exactly which
cluster to update.

kubectl apply -n test -f sample02.yaml
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured

Verify that the cluster is running - this may take a few
minutes depending on your system:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT
demo-01   0.18.1    1          2        2       80102179-4aa5-4e8f-826c-1ca7a1e0f7b9   Completed             1                          clickhouse-demo-01.test.svc.cluster.local
kubectl get service -n test
NAME                      TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None           <none>        8123/TCP,9000/TCP,9009/TCP      26s
chi-demo-01-demo-01-1-0   ClusterIP      None           <none>        8123/TCP,9000/TCP,9009/TCP      3s
clickhouse-demo-01        LoadBalancer   10.111.27.86   <pending>     8123:31126/TCP,9000:32460/TCP   43s

Once again, we can reach right into our cluster with
clickhouse-client and look at the clusters.

clickhouse-client --host localhost --user=clickhouse_operator --password=clickhouse_operator_password
ClickHouse client version 20.12.4.5 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 20.12.4 revision 54442.

chi-demo-01-demo-01-1-0-0.chi-demo-01-demo-01-1-0.test.svc.cluster.local :)
SELECT * FROM system.clusters
┌─cluster─────────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
 all-replicated                                           1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-sharded                                              1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 demo-01                                                  1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            2  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_one_shard_three_replicas_localhost          1             1            3  127.0.0.3                127.0.0.3     9000         0  default                               0                0                        0 
 test_cluster_two_shards                                  1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards                                  2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_internal_replication             2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_localhost                        1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_localhost                        2             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost                                     1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost_secure                              1             1            1  localhost                127.0.0.1     9440         0  default                               0                0                        0 
 test_unavailable_shard                                   1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_unavailable_shard                                   2             1            1  localhost                127.0.0.1        1         0  default                               0                0                        0 
└─────────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

So far, so good. We can create some basic clusters.
If we want to do more, we’ll have to move ahead with replication
and Zookeeper in the next section.

1.3 - Zookeeper and Replicas

Install Zookeeper and Replicas
kubectl create namespace test
namespace/test created

Now we’ve seen how to set up a basic cluster and upgrade it. Time to step
up our game and set up our cluster with Zookeeper, and then add
persistent storage to it.

The Altinity Kubernetes Operator does not install or manage Zookeeper.
Zookeeper must be provided and managed externally. The samples below
are examples of establishing Zookeeper to provide replication support.
For more information on running and configuring Zookeeper,
see the Apache Zookeeper site.

This step cannot be skipped - your Zookeeper instance must
have been set up externally from your ClickHouse clusters.
Whether your Zookeeper installation is hosted by other
Docker images or separate servers is up to you.

Install Zookeeper

Kubernetes Zookeeper Deployment

A simple method of installing a single Zookeeper node is provided by
the Altinity Kubernetes Operator
deployment samples. These provide sample deployments of Grafana, Prometheus, Zookeeper, and other applications.

See the Altinity Kubernetes Operator deployment directory
for a full list of sample scripts and Kubernetes deployment files.

The instructions below will create a new Kubernetes namespace zoo1ns,
and create a Zookeeper node in that namespace.
Kubernetes nodes will refer to that Zookeeper node by the hostname
zookeeper.zoo1ns within the created Kubernetes networks.

To deploy a single Zookeeper node in Kubernetes from the
Altinity Kubernetes Operator Github repository:

  1. Download the Altinity Kubernetes Operator GitHub repository, either with
    git clone https://github.com/Altinity/clickhouse-operator.git or by selecting Code->Download Zip from the
    Altinity Kubernetes Operator GitHub repository.

  2. From a terminal, navigate to the deploy/zookeeper directory
    and run the following:

cd clickhouse-operator/deploy/zookeeper
./quick-start-volume-emptyDir/zookeeper-1-node-create.sh
namespace/zoo1ns created
service/zookeeper created
service/zookeepers created
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
poddisruptionbudget.policy/zookeeper-pod-disruption-budget created
statefulset.apps/zookeeper created
  3. Verify the Zookeeper node is running in Kubernetes:
kubectl get all --namespace zoo1ns
NAME              READY   STATUS    RESTARTS   AGE
pod/zookeeper-0   0/1     Running   0          2s

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
service/zookeeper    ClusterIP   10.100.31.86   <none>        2181/TCP,7000/TCP   2s
service/zookeepers   ClusterIP   None           <none>        2888/TCP,3888/TCP   2s

NAME                         READY   AGE
statefulset.apps/zookeeper   0/1     2s
  4. Kubernetes nodes will be able to refer to the Zookeeper
    node by the hostname zookeeper.zoo1ns.

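If you want to confirm that the Zookeeper service name resolves from
inside your Kubernetes cluster, one quick check (a sketch, assuming the
busybox image is available to your cluster) is to run a temporary pod
and query DNS; the pod is removed automatically when the command exits:

kubectl run zk-dns-test --rm -it --restart=Never --image=busybox:1.28 -- nslookup zookeeper.zoo1ns
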
Configure Kubernetes with Zookeeper

Once we start replicating clusters, we need Zookeeper to manage them.
Create a new file sample03.yaml and populate it with the following:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 2
        templates:
          podTemplate: clickhouse-stable
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable

Notice that we’re increasing the number of replicas from the
sample02.yaml file in the First Clusters - No Storage tutorial.

We’ll set up a minimal cluster connected to Zookeeper by applying
our new configuration file:

kubectl apply -f sample03.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created

Verify it with the following:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT                                    AGE
demo-01   0.18.3    1          2        4       5ec69e86-7e4d-4b8b-877f-f298f26161b2   Completed             4                          clickhouse-demo-01.test.svc.cluster.local   102s
kubectl get service -n test
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      85s
chi-demo-01-demo-01-0-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      68s
chi-demo-01-demo-01-1-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      47s
chi-demo-01-demo-01-1-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      16s
clickhouse-demo-01        LoadBalancer   10.104.157.249   <pending>     8123:32543/TCP,9000:30797/TCP   101s

If we log into our cluster and show the clusters, we can see
the updated results: a total of 4 hosts for demo-01 -
two shards, each with two replicas.

SELECT * FROM system.clusters
┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
 all-replicated                                        1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-replicated                                        1             1            2  chi-demo-01-demo-01-0-1  172.17.0.6    9000         0  default                               0                0                        0 
 all-replicated                                        1             1            3  chi-demo-01-demo-01-1-0  172.17.0.7    9000         0  default                               0                0                        0 
 all-replicated                                        1             1            4  chi-demo-01-demo-01-1-1  172.17.0.8    9000         0  default                               0                0                        0 
 all-sharded                                           1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 all-sharded                                           2             1            1  chi-demo-01-demo-01-0-1  172.17.0.6    9000         0  default                               0                0                        0 
 all-sharded                                           3             1            1  chi-demo-01-demo-01-1-0  172.17.0.7    9000         0  default                               0                0                        0 
 all-sharded                                           4             1            1  chi-demo-01-demo-01-1-1  172.17.0.8    9000         0  default                               0                0                        0 
 demo-01                                               1             1            1  chi-demo-01-demo-01-0-0  127.0.0.1     9000         1  default                               0                0                        0 
 demo-01                                               1             1            2  chi-demo-01-demo-01-0-1  172.17.0.6    9000         0  default                               0                0                        0 
 demo-01                                               2             1            1  chi-demo-01-demo-01-1-0  172.17.0.7    9000         0  default                               0                0                        0 
 demo-01                                               2             1            2  chi-demo-01-demo-01-1-1  172.17.0.8    9000         0  default                               0                0                        0 
 test_cluster_two_shards                               1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards                               2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_internal_replication          1             1            1  127.0.0.1                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_internal_replication          2             1            1  127.0.0.2                127.0.0.2     9000         0  default                               0                0                        0 
 test_cluster_two_shards_localhost                     1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_cluster_two_shards_localhost                     2             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost                                  1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_shard_localhost_secure                           1             1            1  localhost                127.0.0.1     9440         0  default                               0                0                        0 
 test_unavailable_shard                                1             1            1  localhost                127.0.0.1     9000         1  default                               0                0                        0 
 test_unavailable_shard                                2             1            1  localhost                127.0.0.1        1         0  default                               0                0                        0 
└──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

Distributed Tables

We have our clusters going - let’s test them out with some distributed
tables so we can see the replication in action.

Log in to your ClickHouse cluster and enter the following SQL statement:

CREATE TABLE test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')

Once our table is created, run a SELECT * FROM test command.
We haven’t given the table any data of its own - each shard simply
returns the dummy row from system.one - but that’s all right.

SELECT * FROM test
┌─dummy─┐
│     0 │
└───────┘
┌─dummy─┐
│     0 │
└───────┘

Now let’s see where our results are coming from.
Run the following command - it tells us just which shard is
returning the results. It may take a few tries, but you’ll
notice the host name changes each time you run the
command SELECT hostName() FROM test:

SELECT hostName() FROM test
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-0-0-0 │
└───────────────────────────┘
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-1-1-0 │
└───────────────────────────┘
SELECT hostName() FROM test
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-0-0-0 │
└───────────────────────────┘
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-1-0-0 │
└───────────────────────────┘

This is showing us that the query is being distributed across
different shards. The good news is you can change your
configuration files to change the shards and replication
however suits your needs.

One issue though: there’s no persistent storage.
If these clusters stop running, your data vanishes.
The next section shows how to add persistent storage
to your ClickHouse clusters running on Kubernetes.
In fact, we can demonstrate the problem by creating a new
configuration file called sample04.yaml:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-stable
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable

Make sure you’ve exited your ClickHouse client session,
then apply the new configuration file:

kubectl apply -f sample04.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured

Notice that during the update four pods were deleted,
and then two new ones added.

When your clusters have settled down to just 1 shard
with 1 replica, log back into your ClickHouse database
and select from table test:

SELECT * FROM test
Received exception from server (version 21.8.10):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table default.test doesn't exist. 
command terminated with exit code 60

No persistent storage means any time your clusters are changed over,
everything you’ve done is gone. The next article will cover
how to correct that by adding storage volumes to your cluster.

1.4 - Persistent Storage

How to set up persistent storage for your ClickHouse Kubernetes cluster.
kubectl create namespace test
namespace/test created

We’ve shown how to create ClickHouse clusters in Kubernetes and how to add Zookeeper so we can create replicas of clusters. Now we’re going to show how to set up persistent storage so you can change your cluster configurations without losing your hard work.

The examples here are built from the Altinity Kubernetes Operator examples, simplified down for our demonstrations.

Create a new file called sample05.yaml with the following:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
        nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 2
        templates:
          podTemplate: clickhouse-stable
          volumeClaimTemplate: storage-vc-template
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
          - name: clickhouse
            image: altinity/clickhouse-server:21.8.10.1.altinitystable
    volumeClaimTemplates:
      - name: storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

Those who have followed the previous examples will recognize the
clusters being created, but there are some new additions:

  • volumeClaimTemplate: This sets up storage, specifying the storage
    class as standard. For full details on the different storage classes,
    see the Kubernetes Storage Class documentation.
  • storage: We’re going to give our cluster 1 Gigabyte of storage,
    enough for our sample systems. If you need more space, it can be
    increased by changing these settings, as shown in the example
    after this list.
  • podTemplate: Here we’ll specify what our pod types are going to be.
    We’ll use the latest version of the ClickHouse containers, but other
    versions can be specified to best fit your needs. For more
    information, see the ClickHouse on Kubernetes Operator Guide.
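
For example, a volumeClaimTemplate requesting more space might look like the following fragment (the 10Gi value is only an illustration - choose a size and storage class that fit your environment):

  templates:
    volumeClaimTemplates:
      - name: storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi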

Save your new configuration file and install it.
If you’ve been following this guide and already have the
namespace test operating, this will update it:

kubectl apply -f sample05.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created

Verify that it completes, and check the resources for this namespace;
you should have similar results:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT                                    AGE
demo-01   0.18.3    1          2        4       57ec3f87-9950-4e5e-9b26-13680f66331d   Completed             4                          clickhouse-demo-01.test.svc.cluster.local   108s
kubectl get service -n test
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      81s
chi-demo-01-demo-01-0-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      63s
chi-demo-01-demo-01-1-0   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      45s
chi-demo-01-demo-01-1-1   ClusterIP      None             <none>        8123/TCP,9000/TCP,9009/TCP      8s
clickhouse-demo-01        LoadBalancer   10.104.236.138   <pending>     8123:31281/TCP,9000:30052/TCP   98s

Testing Persistent Storage

Everything is running, so let’s verify that our storage is working.
We’re going to exec into our cluster and check the available
disk space on one of the pods created:

kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          32G   26G  4.0G  87% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda2        32G   26G  4.0G  87% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           7.7G   12K  7.7G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           3.9G     0  3.9G   0% /proc/acpi
tmpfs           3.9G     0  3.9G   0% /proc/scsi
tmpfs           3.9G     0  3.9G   0% /sys/firmware

And we can see we have about 1 Gigabyte of storage
allocated to our cluster.
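
You can also list the persistent volume claims that the Altinity
Kubernetes Operator created from the volumeClaimTemplate (claim names
and sizes will vary with your configuration):

kubectl get pvc -n test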

Let’s add some data to it. Nothing major, just to show that we can
store information, then change the configuration and the data stays.

Exit out of your cluster and launch clickhouse-client on your LoadBalancer.
We’re going to create a database, then create a table in the database,
then show both.

SHOW DATABASES
┌─name────┐
│ default │
│ system  │
└─────────┘
CREATE DATABASE teststorage
CREATE TABLE teststorage.test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')
SHOW DATABASES
┌─name────────┐
│ default     │
│ system      │
│ teststorage │
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
│     0 │
└───────┘
┌─dummy─┐
│     0 │
└───────┘

If you followed the instructions from Zookeeper and Replicas,
recall that at the end, when we updated the configuration of our
sample cluster, all of the tables and data we made were deleted.
Let’s recreate that experiment now with a new configuration.

Create a new file called sample06.yaml. We’re going to reduce
the shards and replicas to 1:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
        nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-stable
          volumeClaimTemplate: storage-vc-template
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
          - name: clickhouse
            image: altinity/clickhouse-server:21.8.10.1.altinitystable
    volumeClaimTemplates:
      - name: storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

Update the cluster with the following:

kubectl apply -f sample06.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured

Wait until the configuration is done and the extra pods are spun down,
then check the storage available on one of the pods:

kubectl -n test get chi -o wide
NAME      VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT                                    AGE
demo-01   0.18.3    1          1        1       776c1a82-44e1-4c2e-97a7-34cef629e698   Completed                               4        clickhouse-demo-01.test.svc.cluster.local   2m56s
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          32G   26G  4.0G  87% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda2        32G   26G  4.0G  87% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           7.7G   12K  7.7G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           3.9G     0  3.9G   0% /proc/acpi
tmpfs           3.9G     0  3.9G   0% /proc/scsi
tmpfs           3.9G     0  3.9G   0% /sys/firmware

Storage is still there. We can test whether our databases are still
available by logging into ClickHouse:

SHOW DATABASES
┌─name────────┐
│ default     │
│ system      │
│ teststorage │
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
│     0 │
└───────┘

All of our databases and tables are there.

There are different ways of allocating storage - for data, for logging,
or multiple data volumes for your cluster nodes - but this will get you
started running your own ClickHouse cluster on Kubernetes
in your favorite environment.

1.5 - Uninstall

How to uninstall the Altinity Kubernetes Operator and its namespace

To remove the Altinity Kubernetes Operator, both the Altinity Kubernetes Operator and the components in its installed namespace have to be removed. The proper approach is to use the same clickhouse-operator-install-bundle.yaml file that was used to install the Altinity Kubernetes Operator. For more details, see how to install and verify the Altinity Kubernetes Operator.

The following instructions are based on the standard installation instructions. For users who performed a custom installation, note that any custom namespaces the user wants to remove will have to be deleted separately from the Altinity Kubernetes Operator deletion.

For example, if the custom namespace operator-test is created, then it would be removed with the command kubectl delete namespaces operator-test.

Instructions

To remove the Altinity Kubernetes Operator from your Kubernetes environment from a standard install:

  1. Verify the Altinity Kubernetes Operator is in the kube-system namespace by listing the pods there. The Altinity Kubernetes Operator and other pods will be displayed:

    kubectl get pods --namespace kube-system

    NAME                                   READY   STATUS    RESTARTS      AGE
    clickhouse-operator-857c69ffc6-2frgl   2/2     Running   0             5s
    coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
    etcd-minikube                          1/1     Running   4 (23h ago)   51d
    kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
    kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
    kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
    kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
    storage-provisioner                    1/1     Running   9 (23h ago)   51d
    
  2. Issue the kubectl delete command using the same YAML file used to install the Altinity Kubernetes Operator. By default the Altinity Kubernetes Operator is installed in the namespace kube-system. If it was installed into a custom namespace, verify that namespace is reflected in the uninstall command. In this example, we specified an installation of the Altinity Kubernetes Operator version 0.18.3 into the default kube-system namespace. This produces output similar to the following:

    kubectl delete -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
    customresourcedefinition.apiextensions.k8s.io "clickhouseinstallations.clickhouse.altinity.com" deleted
    customresourcedefinition.apiextensions.k8s.io "clickhouseinstallationtemplates.clickhouse.altinity.com" deleted
    customresourcedefinition.apiextensions.k8s.io "clickhouseoperatorconfigurations.clickhouse.altinity.com" deleted
    serviceaccount "clickhouse-operator" deleted
    clusterrole.rbac.authorization.k8s.io "clickhouse-operator-kube-system" deleted
    clusterrolebinding.rbac.authorization.k8s.io "clickhouse-operator-kube-system" deleted
    configmap "etc-clickhouse-operator-files" deleted
    configmap "etc-clickhouse-operator-confd-files" deleted
    configmap "etc-clickhouse-operator-configd-files" deleted
    configmap "etc-clickhouse-operator-templatesd-files" deleted
    configmap "etc-clickhouse-operator-usersd-files" deleted
    deployment.apps "clickhouse-operator" deleted
    service "clickhouse-operator-metrics" deleted
    
  3. To verify the Altinity Kubernetes Operator has been removed, use the kubectl get pods command:

    kubectl get pods --namespace kube-system
    
    NAME                               READY   STATUS    RESTARTS      AGE
    coredns-78fcd69978-nthp2           1/1     Running   4 (23h ago)   51d
    etcd-minikube                      1/1     Running   4 (23h ago)   51d
    kube-apiserver-minikube            1/1     Running   4 (23h ago)   51d
    kube-controller-manager-minikube   1/1     Running   4 (23h ago)   51d
    kube-proxy-lsggn                   1/1     Running   4 (23h ago)   51d
    kube-scheduler-minikube            1/1     Running   4 (23h ago)   51d
    storage-provisioner                1/1     Running   9 (23h ago)   51d
    

2 - Kubernetes Install Guide

How to install Kubernetes in different environments

Kubernetes and Zookeeper form a major backbone in running the Altinity Kubernetes Operator in a cluster. The following guides detail how to set up Kubernetes in different environments.

2.1 - Install minikube for Linux

How to install Kubernetes through minikube

One popular option for installing Kubernetes is through minikube, which creates a local Kubernetes cluster for different environments. Test scripts and examples for the clickhouse-operator are based on using minikube to set up the Kubernetes environment.

The following guide demonstrates how to install a minikube environment that supports the clickhouse-operator, for the following operating systems:

  • Linux (Deb based)

Minikube Installation for Deb Based Linux

The following instructions assume an installation for x86-64 based Linux distributions that use Deb packages. Please see the referenced documentation for instructions for other Linux distributions and platforms.

To install a minikube environment that supports running the clickhouse-operator:

kubectl Installation for Deb

The following instructions are based on Install and Set Up kubectl on Linux

  1. Download the kubectl binary:

    curl -LO 'https://dl.k8s.io/release/v1.22.0/bin/linux/amd64/kubectl'
    
  2. Verify the SHA-256 hash:

    curl -LO "https://dl.k8s.io/v1.22.0/bin/linux/amd64/kubectl.sha256"
    
    echo "$(<kubectl.sha256) kubectl" | sha256sum --check
    
  3. Install kubectl into the /usr/local/bin directory (this assumes that your PATH includes /usr/local/bin):

    sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
    
  4. Verify the installation and the version:

    kubectl version
    

Install Docker for Deb

These instructions are based on Docker’s documentation Install Docker Engine on Ubuntu

  1. Install the Docker repository links.

    1. Update the apt-get repository:

      sudo apt-get update
      
  2. Install the prerequisites ca-certificates, curl, gnupg, and lsb-release:

    sudo apt-get install -y ca-certificates curl gnupg lsb-release
    
  3. Add the Docker repository keys:

    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --yes --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
    
    1. Add the Docker repository:

      echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
      $(lsb_release -cs) stable" |sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
      
  4. Install Docker:

    1. Update the apt-get repository:

      sudo apt-get update
      
    2. Install Docker and other libraries:

    sudo apt install docker-ce docker-ce-cli containerd.io
    
  5. Add non-root accounts to the docker group. This allows these users to run Docker commands without requiring root access.

    1. Add the current user to the docker group and activate the group change:

      sudo usermod -aG docker $USER && newgrp docker
      
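    2. Optionally, verify that the group change took effect by running a Docker command without sudo (this pulls and runs a small test image):

      docker run hello-world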

Install Minikube for Deb

The following instructions are taken from minikube start.

  1. Update the apt-get repository:

    sudo apt-get update
    
  2. Install the prerequisite conntrack:

    sudo apt install conntrack
    
  3. Download minikube:

    curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
    
  4. Install minikube:

    sudo install minikube-linux-amd64 /usr/local/bin/minikube
    
  5. To correct issues with the kube-proxy and the storage-provisioner, set nf_conntrack_max=524288 before starting minikube:

    sudo sysctl net/netfilter/nf_conntrack_max=524288
    
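    This sysctl setting does not persist across reboots. If you want it to persist (optional; this assumes a distribution that reads /etc/sysctl.d, and the file name below is arbitrary), add it to a drop-in file and reload:

    echo "net.netfilter.nf_conntrack_max=524288" | sudo tee /etc/sysctl.d/99-conntrack.conf
    sudo sysctl --system
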
  6. Start minikube:

    minikube start && echo "ok: started minikube successfully"
    
  7. Once installation is complete, verify that the user owns the ~/.kube and ~/.minikube directories:

    sudo chown -R $USER.$USER .kube
    
    sudo chown -R $USER.$USER .minikube
    

2.2 - Altinity Kubernetes Operator on GKE

How to install the Altinity Kubernetes Operator using Google Kubernetes Engine

Organizations can host their Altinity Kubernetes Operator on the Google Kubernetes Engine (GKE). This can be done either through Altinity.Cloud or through a separate installation on GKE.

To set up a basic Altinity Kubernetes Operator environment, use the following steps. The steps below use the current free Google Cloud services to set up a minimally viable Kubernetes with ClickHouse environment.

Prerequisites

  1. Register a Google Cloud Account: https://cloud.google.com/.
  2. Create a Google Cloud project: https://cloud.google.com/resource-manager/docs/creating-managing-projects
  3. Install gcloud and run gcloud init or gcloud init --console to set up your environment: https://cloud.google.com/sdk/docs/install
  4. Enable the Google Compute Engine: https://cloud.google.com/endpoints/docs/openapi/enable-api
  5. Enable GKE on your project: https://console.cloud.google.com/apis/enableflow?apiid=container.googleapis.com.
  6. Select a default Compute Engine zone (see the example after this list).
  7. Select a default Compute Engine region.
  8. Install kubectl on your local system. For sample instructions, see the Minikube on Linux installation instructions.
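
For example, a default Compute Engine zone and region can be set with gcloud as follows (us-west1-a and us-west1 are placeholders - pick the zone and region appropriate for your account):

    gcloud config set compute/zone us-west1-a
    gcloud config set compute/region us-west1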

Altinity Kubernetes Operator on GKE Installation instructions

Installing the Altinity Kubernetes Operator in GKE involves the following main steps, covered in the sections below: creating the network, creating the cluster, getting the cluster credentials, installing the Altinity ClickHouse Operator, and creating a simple ClickHouse cluster.

Create the Network

The first step in setting up the Altinity Kubernetes Operator in GKE is creating the network. The complete details can be found on the Google Cloud documentation site regarding the gcloud compute networks create command. The following command will create a network called kubernetes-1 that will work for our minimal Altinity Kubernetes Operator cluster. Note that this network will not be available to external networks unless additional steps are taken. Consult the Google Cloud documentation site for more details.

  1. See a list of current networks available. In this example, there are no networks setup in this project:

    gcloud compute networks list
    NAME     SUBNET_MODE  BGP_ROUTING_MODE  IPV4_RANGE  GATEWAY_IPV4
    default  AUTO         REGIONAL
    
  2. Create the network in your Google Cloud project:

    gcloud compute networks create kubernetes-1 --bgp-routing-mode regional --subnet-mode custom
    Created [https://www.googleapis.com/compute/v1/projects/betadocumentation/global/networks/kubernetes-1].
    NAME          SUBNET_MODE  BGP_ROUTING_MODE  IPV4_RANGE  GATEWAY_IPV4
    kubernetes-1  CUSTOM       REGIONAL
    
    Instances on this network will not be reachable until firewall rules
    are created. As an example, you can allow all internal traffic between
    instances as well as SSH, RDP, and ICMP by running:
    
    $ gcloud compute firewall-rules create <FIREWALL_NAME> --network kubernetes-1 --allow tcp,udp,icmp --source-ranges <IP_RANGE>
    $ gcloud compute firewall-rules create <FIREWALL_NAME> --network kubernetes-1 --allow tcp:22,tcp:3389,icmp
    
  3. Verify its creation:

    gcloud compute networks list
    NAME          SUBNET_MODE  BGP_ROUTING_MODE  IPV4_RANGE  GATEWAY_IPV4
    default       AUTO         REGIONAL
    kubernetes-1  CUSTOM       REGIONAL
    

Create the Cluster

Now that the network has been created, we can set up our cluster. The following cluster uses the e2-micro machine type - this is still within the free tier, and gives just enough power to run our basic cluster. The cluster will be called cluster-1, but you can replace that with whatever name you feel is appropriate. It uses the kubernetes-1 network specified earlier and creates a new subnet for the cluster under k-subnet-1.

To create and launch the cluster:

  1. Verify the existing clusters with the gcloud command. For this example there are no pre-existing clusters.

    gcloud container clusters list
    
  2. From the command line, issue the following gcloud command to create the cluster:

    gcloud container clusters create cluster-1 --region us-west1 --node-locations us-west1-a --machine-type e2-micro --network kubernetes-1 --create-subnetwork name=k-subnet-1 --enable-ip-alias &
    
  3. Use the clusters list command to verify when the cluster is available for use:

    gcloud container clusters list
    Created [https://container.googleapis.com/v1/projects/betadocumentation/zones/us-west1/clusters/cluster-1].
    To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-west1/cluster-1?project=betadocumentation
    kubeconfig entry generated for cluster-1.
    NAME       LOCATION  MASTER_VERSION   MASTER_IP      MACHINE_TYPE  NODE_VERSION     NUM_NODES  STATUS
    cluster-1  us-west1  1.21.6-gke.1500  35.233.231.36  e2-micro      1.21.6-gke.1500  3          RUNNING
    NAME       LOCATION  MASTER_VERSION   MASTER_IP      MACHINE_TYPE  NODE_VERSION     NUM_NODES  STATUS
    cluster-1  us-west1  1.21.6-gke.1500  35.233.231.36  e2-micro      1.21.6-gke.1500  3          RUNNING
    [1]+  Done                    gcloud container clusters create cluster-1 --region us-west1 --node-locations us-west1-a --machine-type e2-micro --network kubernetes-1 --create-subnetwork name=k-subnet-1 --enable-ip-alias
    

Get Cluster Credentials

Importing the cluster credentials into your kubectl environment will allow you to issue commands directly to the cluster on Google Cloud. To import the cluster credentials:

  1. Retrieve the credentials for the newly created cluster:

    gcloud container clusters get-credentials cluster-1 --region us-west1 --project betadocumentation
    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for cluster-1.
    
  2. Verify the cluster information from the kubectl environment:

    kubectl cluster-info
    Kubernetes control plane is running at https://35.233.231.36
    GLBCDefaultBackend is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
    KubeDNS is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
    Metrics-server is running at https://35.233.231.36/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
    
    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    

Install the Altinity ClickHouse Operator

Our cluster is up and ready to go. Time to install the Altinity Kubernetes Operator through the following steps. Note that we are specifying the version of the Altinity Kubernetes Operator to install. This ensures maximum compatibility with your applications and other Kubernetes environments.

As of the time of this article, the most current version is 0.18.1.

  1. Apply the Altinity Kubernetes Operator manifest by either downloading it and applying it, or referring to the GitHub repository URL. For more information, see the Altinity Kubernetes Operator Installation Guides.

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.1/deploy/operator/clickhouse-operator-install-bundle.yaml
    
  2. Verify the installation by running:

    kubectl get pods --namespace kube-system
    NAME                                                  READY   STATUS    RESTARTS   AGE
    clickhouse-operator-77b54889b4-g98kk                  2/2     Running   0          53s
    event-exporter-gke-5479fd58c8-7h6bn                   2/2     Running   0          108s
    fluentbit-gke-b29c2                                   2/2     Running   0          79s
    fluentbit-gke-k8f2n                                   2/2     Running   0          80s
    fluentbit-gke-vjlqh                                   2/2     Running   0          80s
    gke-metrics-agent-4ttdt                               1/1     Running   0          79s
    gke-metrics-agent-qf24p                               1/1     Running   0          80s
    gke-metrics-agent-szktc                               1/1     Running   0          80s
    konnectivity-agent-564f9f6c5f-59nls                   1/1     Running   0          40s
    konnectivity-agent-564f9f6c5f-9nfnl                   1/1     Running   0          40s
    konnectivity-agent-564f9f6c5f-vk7l8                   1/1     Running   0          97s
    konnectivity-agent-autoscaler-5c49cb58bb-zxzlp        1/1     Running   0          97s
    kube-dns-697dc8fc8b-ddgrx                             4/4     Running   0          98s
    kube-dns-697dc8fc8b-fpnps                             4/4     Running   0          71s
    kube-dns-autoscaler-844c9d9448-pqvqr                  1/1     Running   0          98s
    kube-proxy-gke-cluster-1-default-pool-fd104f22-8rx3   1/1     Running   0          36s
    kube-proxy-gke-cluster-1-default-pool-fd104f22-gnd0   1/1     Running   0          29s
    kube-proxy-gke-cluster-1-default-pool-fd104f22-k2sv   1/1     Running   0          12s
    l7-default-backend-69fb9fd9f9-hk7jq                   1/1     Running   0          107s
    metrics-server-v0.4.4-857776bc9c-bs6sl                2/2     Running   0          44s
    pdcsi-node-5l9vf                                      2/2     Running   0          79s
    pdcsi-node-gfwln                                      2/2     Running   0          79s
    pdcsi-node-q6scz                                      2/2     Running   0          80s
    

Create a Simple ClickHouse Cluster

The Altinity Kubernetes Operator allows the easy creation and modification of ClickHouse clusters in whatever format works best for your organization. Now that the Google Cloud cluster is running and has the Altinity Kubernetes Operator installed, let’s create a very simple ClickHouse cluster to test on.

The following example will create an Altinity Kubernetes Operator controlled cluster with 1 shard and 1 replica, 500 MB of persistent storage, and a demo user whose password is set to topsecret. For more information on customizing the Altinity Kubernetes Operator, see the Altinity Kubernetes Operator Configuration Guides.
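
The demo user's password is stored in the manifest as a SHA256 hash rather than as plain text. If you want to use a password other than topsecret, one way to generate the hash is with sha256sum (shown here as a sketch; any SHA256 tool will do), then paste the resulting hex string into the demo/password_sha256_hex value in the manifest below:

echo -n "topsecret" | sha256sum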

  1. Create the following manifest and save it as gcp-example01.yaml.

    
    apiVersion: "clickhouse.altinity.com/v1"
    kind: "ClickHouseInstallation"
    metadata:
      name: "gcp-example"
    spec:
      configuration:
        # What does my cluster look like?
        clusters:
          - name: "gcp-example"
            layout:
              shardsCount: 1
              replicasCount: 1
            templates:
              podTemplate: clickhouse-stable
              volumeClaimTemplate: pd-ssd
        # Where is Zookeeper?
        zookeeper:
          nodes:
            - host: zookeeper.zoo1ns
              port: 2181
        # What are my users?
        users:
          # Password = topsecret
          demo/password_sha256_hex: 53336a676c64c1396553b2b7c92f38126768827c93b64d9142069c10eda7a721
          demo/profile: default
          demo/quota: default
          demo/networks/ip:
            - 0.0.0.0/0
            - ::/0
      templates:
        podTemplates:
          # What is the definition of my server?
          - name: clickhouse-stable
            spec:
              containers:
                - name: clickhouse
                  image: altinity/clickhouse-server:21.8.10.1.altinitystable
            # Keep servers on separate nodes!
            podDistribution:
              - scope: ClickHouseInstallation
                type: ClickHouseAntiAffinity
        volumeClaimTemplates:
          # How much storage and which type on each node?
          - name: pd-ssd
            # Do not delete PVC if installation is dropped.
            reclaimPolicy: Retain
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 500Mi
    
  2. Create a namespace in your GKE environment. For this example, we will be using test:

    kubectl create namespace test
    namespace/test created
    
  3. Apply the manifest to the namespace:

    kubectl -n test apply -f gcp-example01.yaml
    clickhouseinstallation.clickhouse.altinity.com/gcp-example created
    
  4. Verify the installation is complete when all pods are in a Running state:

    kubectl -n test get chi -o wide
    NAME          VERSION   CLUSTERS   SHARDS   HOSTS   TASKID                                 STATUS      UPDATED   ADDED   DELETED   DELETE   ENDPOINT
    gcp-example   0.18.1    1          1        1       f859e396-e2de-47fd-8016-46ad6b0b8508   Completed             1                          clickhouse-gcp-example.test.svc.cluster.local
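
    To check the pods themselves, you can also list them; the pod for this example will have a name similar to chi-gcp-example-gcp-example-0-0-0:

    kubectl -n test get pods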
    

Login to the Cluster

This example does not have any open external ports, but we can still access our ClickHouse database through kubectl exec. In this case, the specific pod we are connecting to is chi-gcp-example-gcp-example-0-0-0. Replace this with the designation of your pods.

Use the following procedure to verify the Altinity Stable build installed in your GKE environment.

  1. Login to the clickhouse-client in one of your existing pods:

    kubectl -n test exec -it chi-gcp-example-gcp-example-0-0-0 -- clickhouse-client
    
  2. Verify the cluster configuration:

    kubectl -n test exec -it chi-gcp-example-gcp-example-0-0-0  -- clickhouse-client -q "SELECT * FROM system.clusters  FORMAT PrettyCompactNoEscapes"
    ┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
    │ all-replicated                               │         1 │            1 │           1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ all-sharded                                  │         1 │            1 │           1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ gcp-example                                  │         1 │            1 │           1 │ chi-gcp-example-gcp-example-0-0 │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards                      │         1 │            1 │           1 │ 127.0.0.1                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards                      │         2 │            1 │           1 │ 127.0.0.2                       │ 127.0.0.2    │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_internal_replication │         1 │            1 │           1 │ 127.0.0.1                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_internal_replication │         2 │            1 │           1 │ 127.0.0.2                       │ 127.0.0.2    │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_localhost            │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_cluster_two_shards_localhost            │         2 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_shard_localhost                         │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_shard_localhost_secure                  │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9440 │        0 │ default │                  │            0 │               0 │                       0 │
    │ test_unavailable_shard                       │         1 │            1 │           1 │ localhost                       │ 127.0.0.1    │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
    │ test_unavailable_shard                       │         2 │            1 │           1 │ localhost                       │ 127.0.0.1    │    1 │        0 │ default │                  │            0 │               0 │                       0 │
    └──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
    
  3. Exit out of your cluster:

    chi-gcp-example-gcp-example-0-0-0.chi-gcp-example-gcp-example-0-0.test.svc.cluster.local :) exit
    Bye.
    

Further Steps

This simple example demonstrates how to build and manage an Altinity Kubernetes Operator-run cluster for ClickHouse. Further steps would be to open the cluster to external network connections, set up replication schemes, and so on.

For more information, see the Altinity Kubernetes Operator guides and the Altinity Kubernetes Operator repository.

3 - Operator Guide

Installation and Management of clickhouse-operator for Kubernetes

The Altinity Kubernetes Operator is an open source project managed and maintained by Altinity Inc. This Operator Guide is created to help users with installation, configuration, maintenance, and other important tasks.

3.1 - Installation Guide

Basic and custom installation instructions of the clickhouse-operator

Depending on your organization and its needs, there are different ways of installing the Kubernetes clickhouse-operator.

3.1.1 - Basic Installation Guide

The simple method of installing the Altinity Kubernetes Operator

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

To install the Altinity Kubernetes Operator for Kubernetes:

  1. Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during installation - this ensures maximum compatibility and that all replicated environments are working from the same version. For more information on installing other versions of the Altinity Kubernetes Operator, see the specific Version Installation Guide.

    The most current version is 0.18.3:

kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
  2. The following will be displayed on a successful installation.
    For more information on the resources created in the installation,
    see Altinity Kubernetes Operator Resources.
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created
  3. Verify the installation by running:
kubectl get pods --namespace kube-system

The following will be displayed on a successful installation,
with your particular image:

NAME                                   READY   STATUS    RESTARTS      AGE
clickhouse-operator-857c69ffc6-ttnsj   2/2     Running   0             4s
coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
etcd-minikube                          1/1     Running   4 (23h ago)   51d
kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
storage-provisioner                    1/1     Running   9 (23h ago)   51d

3.1.2 - Custom Installation Guide

How to install a customized Altinity Kubernetes Operator

Users who need to customize their Altinity Kubernetes Operator namespace or
can not directly connect to Github from the installation environment
can perform a custom install.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

Script Install into Namespace

By default, the Altinity Kubernetes Operator is installed into the kube-system
namespace when using the Basic Installation instructions.
To install into a different namespace use the following command replacing {custom namespace here}
with the namespace to use:

curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE={custom_namespace_here} bash

For example, to install into the namespace test-clickhouse-operator
namespace, use:

curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE=test-clickhouse-operator bash
Setup ClickHouse Operator into 'test-clickhouse-operator' namespace
No 'test-clickhouse-operator' namespace found. Going to create
namespace/test-clickhouse-operator created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created

If no OPERATOR_NAMESPACE value is set, then the Altinity Kubernetes Operator will
be installed into kube-system.

Manual Install into Namespace

For organizations that cannot access GitHub directly from the environment where the Altinity Kubernetes Operator is being installed, a manual install can be performed through the following steps:

  1. Download the install template file: clickhouse-operator-install-template.yaml.

  2. Edit the file and set the OPERATOR_NAMESPACE value, for example with sed as shown below.
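
    For example, assuming the target namespace is clickhouse (a placeholder; substitute your own), the ${OPERATOR_NAMESPACE} placeholder in the template can be set with sed before applying the file:

    sed -i "s/\${OPERATOR_NAMESPACE}/clickhouse/g" clickhouse-operator-install-template.yaml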

  3. Use the following commands, replacing {your file name} with the name of your YAML file:

    namespace = "custom-clickhouse-operator"
    bash("sed -i s/'${OPERATOR_NAMESPACE}'/test-clickhouse-operator/ clickhouse-operator-install-template.yaml", add_to_text=False)
    bash(f"kubectl apply -f clickhouse-operator-install-template.yaml", add_to_text=False)
    
    try:
    
        retry(bash, timeout=60, delay=1)("kubectl get pods --namespace test-clickhouse-operator "
            "-o=custom-columns=NAME:.metadata.name,STATUS:.status.phase",
            exitcode=0, message="Running", lines=slice(1, None),
            fail_message="not all pods in Running state", add_to_text=true)
    
    finally:
        bash(f"kubectl delete namespace test-clickhouse-operator', add_to_text=False)
    
    kubectl apply -f {your file name}
    

    For example:

    kubectl apply -f customtemplate.yaml
    

Alternatively, instead of using the install template, enter the following into your console
(bash is used below, modify depending on your particular shell).
Change the OPERATOR_NAMESPACE value to match your namespace.

# Namespace to install operator into
OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE:-clickhouse-operator}"
# Namespace to install metrics-exporter into
METRICS_EXPORTER_NAMESPACE="${OPERATOR_NAMESPACE}"

# Operator's docker image
OPERATOR_IMAGE="${OPERATOR_IMAGE:-altinity/clickhouse-operator:latest}"
# Metrics exporter's docker image
METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE:-altinity/metrics-exporter:latest}"

# Setup Altinity Kubernetes Operator into specified namespace
kubectl apply --namespace="${OPERATOR_NAMESPACE}" -f <( \
    curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator/clickhouse-operator-install-template.yaml | \
        OPERATOR_IMAGE="${OPERATOR_IMAGE}" \
        OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE}" \
        METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE}" \
        METRICS_EXPORTER_NAMESPACE="${METRICS_EXPORTER_NAMESPACE}" \
        envsubst \
)

Verify Installation

To verify the Altinity Kubernetes Operator is running in your namespace, use the following command:

kubectl get pods -n clickhouse-operator
NAME                                   READY   STATUS    RESTARTS   AGE
clickhouse-operator-5d9496dd48-8jt8h   2/2     Running   0          16s

3.1.3 - Source Build Guide - 0.18 and Up

How to build the Altinity Kubernetes Operator from source code

For organizations that prefer to build the software directly from source code,
the Altinity Kubernetes Operator can be compiled and installed into a Docker
container through the following process. The following procedure is available for versions of the Altinity Kubernetes Operator 0.18.0 and up.

Binary Build

Binary Build Requirements

  • go-lang compiler: Go.
  • Go mod Package Manager.
  • The source code from the Altinity Kubernetes Operator repository.
    This can be downloaded using git clone https://github.com/altinity/clickhouse-operator.

Binary Build Instructions

  1. Switch the working directory to clickhouse-operator.

  2. Install the go-lang compiler and link all packages with the command: echo {root_password} | sudo -S -k apt install -y golang.

  3. Build the sources with go build -o ./clickhouse-operator cmd/operator/main.go.

This creates the Altinity Kubernetes Operator binary. This binary is only used
within a Kubernetes environment.
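
Taken together, the steps above amount to a shell session along these lines (a sketch; the package manager and paths may differ on your system):

cd clickhouse-operator
sudo apt install -y golang
go build -o ./clickhouse-operator cmd/operator/main.go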

Docker Image Build and Usage

Docker Build Requirements

Install Docker Buildx CLI plugin

  1. Download the Docker Buildx binary file from the releases page on GitHub

  2. Create folder structure for plugin

    mkdir -p ~/.docker/cli-plugins/
    
  3. Rename the relevant binary and copy it to the destination matching your OS

    mv buildx-v0.7.1.linux-amd64  ~/.docker/cli-plugins/docker-buildx
    
  4. On Unix environments, it may also be necessary to make it executable with chmod +x:

    chmod +x ~/.docker/cli-plugins/docker-buildx
    
  5. Set buildx as the default builder

    docker buildx install
    
  6. Create config.json file to enable the plugin

    touch ~/.docker/config.json
    
  7. Add the experimental flag to config.json to enable the plugin

    echo '{"experimental": "enabled"}' >> ~/.docker/config.json
    

Docker Build Instructions

  1. Switch working dir to clickhouse-operator

  2. Build docker image with docker: docker build -f dockerfile/operator/Dockerfile -t altinity/clickhouse-operator:dev .

  3. Register the freshly built Docker image inside the Kubernetes environment with the following:

    docker save altinity/clickhouse-operator | (eval $(minikube docker-env) && docker load)
    
  4. Install the Altinity Kubernetes Operator as described in either the Basic Installation Guide
    or the Custom Installation Guide.

3.1.4 - Specific Version Installation Guide

How to install a specific version of the Altinity Kubernetes Operator

Users may want to install a specific version of the Altinity Kubernetes Operator for a variety of reasons: to maintain parity between different environments, to preserve the version between replicas, or other reasons.

The following procedures detail how to install a specific version of the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

Altinity Kubernetes Operator Versions After 0.17.0

To install a specific version of the Altinity Kubernetes Operator after version 0.17.0:

  1. Run kubectl and apply the manifest directly from the GitHub Altinity Kubernetes Operator repository, or by downloading the manifest and applying it directly. The format for the URL is:

    https://github.com/Altinity/clickhouse-operator/raw/{OPERATOR_VERSION}/deploy/operator/clickhouse-operator-install-bundle.yaml
    

    Replace the {OPERATOR_VERSION} with the version to install. For example, for the Altinity Kubernetes Operator version 0.18.3, the URL would be:

    https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml

    The command to apply the Docker manifest through kubectl is:

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
    serviceaccount/clickhouse-operator created
    clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
    clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
    configmap/etc-clickhouse-operator-files created
    configmap/etc-clickhouse-operator-confd-files created
    configmap/etc-clickhouse-operator-configd-files created
    configmap/etc-clickhouse-operator-templatesd-files created
    configmap/etc-clickhouse-operator-usersd-files created
    deployment.apps/clickhouse-operator created
    service/clickhouse-operator-metrics created
    
  2. Verify the installation is complete and the clickhouse-operator pod is running:

    kubectl get pods --namespace kube-system
    

    A similar result to the following will be displayed on a successful installation:

    NAME                                   READY   STATUS    RESTARTS      AGE
    clickhouse-operator-857c69ffc6-q8qrr   2/2     Running   0             5s
    coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
    etcd-minikube                          1/1     Running   4 (23h ago)   51d
    kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
    kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
    kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
    kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
    storage-provisioner                    1/1     Running   9 (23h ago)   51d
    
  3. To verify the version of the Altinity Kubernetes Operator, use the following command:

    kubectl get pods -l app=clickhouse-operator --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
    
    1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
    

3.1.5 - Upgrade Guide

How to upgrade the Altinity Kubernetes Operator

The Altinity Kubernetes Operator can be upgraded at any time by applying the new manifest from the Altinity Kubernetes Operator GitHub repository.

The following procedures detail how to install a specific version of the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

The following instructions are based on installations of the Altinity Kubernetes Operator greater than version 0.16.0. In the following examples, Altinity Kubernetes Operator version 0.16.0 has been installed and will be upgraded to 0.18.3.

For instructions on installing specific versions of the Altinity Kubernetes Operator, see the Specific Version Installation Guide.

  1. Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during the installation for maximum compatibility. In this example, the version being upgraded to is 0.18.3:

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
  2. The following will be displayed on a successful installation.
    For more information on the resources created in the installation,
    see Altinity Kubernetes Operator Resources.

    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
    serviceaccount/clickhouse-operator configured
    clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
    clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
    configmap/etc-clickhouse-operator-files configured
    configmap/etc-clickhouse-operator-confd-files configured
    configmap/etc-clickhouse-operator-configd-files configured
    configmap/etc-clickhouse-operator-templatesd-files configured
    configmap/etc-clickhouse-operator-usersd-files configured
    deployment.apps/clickhouse-operator configured
    service/clickhouse-operator-metrics configured
    
  3. Verify the installation by running:

    kubectl get pods --namespace kube-system

    The following will be displayed on a successful installation, with your particular image:

    NAME                                   READY   STATUS    RESTARTS       AGE
    clickhouse-operator-857c69ffc6-dqt5l   2/2     Running   0              29s
    coredns-78fcd69978-nthp2               1/1     Running   3 (14d ago)    50d
    etcd-minikube                          1/1     Running   3 (14d ago)    50d
    kube-apiserver-minikube                1/1     Running   3 (2m6s ago)   50d
    kube-controller-manager-minikube       1/1     Running   3 (14d ago)    50d
    kube-proxy-lsggn                       1/1     Running   3 (14d ago)    50d
    kube-scheduler-minikube                1/1     Running   3 (2m6s ago)   50d
    storage-provisioner                    1/1     Running   7 (48s ago)    50d
    
  4. To verify the version of the Altinity Kubernetes Operator, use the following command:

    kubectl get pods -l app=clickhouse-operator -n kube-system -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
    
    1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
    

3.2 - Configuration Guide

How to configure your Altinity Kubernetes Operator cluster.

Depending on your organization’s needs and environment, you can modify your environment to best fit your needs with the Altinity Kubernetes Operator or your cluster settings.

3.2.1 - ClickHouse Operator Settings

Settings and configurations for the Altinity Kubernetes Operator

Altinity Kubernetes Operator 0.18 and greater

For versions of the Altinity Kubernetes Operator 0.18 and later, the Altinity Kubernetes Operator settings can be modified through the clickhouse-operator-install-bundle.yaml file in the section etc-clickhouse-operator-files. This section contains the config.yaml settings that control the user configuration and other operator settings. For more information, see the sample config.yaml for the Altinity Kubernetes Operator.
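
If the operator is already installed, the same settings can also be reviewed at runtime in the etc-clickhouse-operator-files ConfigMap created by the install bundle. For example, assuming the default kube-system namespace:

kubectl -n kube-system get configmap etc-clickhouse-operator-files -o yaml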

Altinity Kubernetes Operator before 0.18

For versions before 0.18, the Altinity Kubernetes Operator settings can be modified through clickhouse-operator-install-bundle.yaml file in the section marked ClickHouse Settings Section.

New User Settings

Setting                         Default Value             Description
chConfigUserDefaultProfile      default                   Sets the default profile used when creating new users.
chConfigUserDefaultQuota        default                   Sets the default quota used when creating new users.
chConfigUserDefaultNetworksIP   ::1, 127.0.0.1, 0.0.0.0   Specifies the networks that the user can connect from. Note that 0.0.0.0 allows access from all networks.
chConfigUserDefaultPassword     default                   The initial password for new users.
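
As a sketch, these settings appear in the operator's config.yaml along the following lines (the values shown are the defaults listed above):

chConfigUserDefaultProfile: default
chConfigUserDefaultQuota: default
chConfigUserDefaultNetworksIP:
  - "::1"
  - "127.0.0.1"
  - "0.0.0.0"
chConfigUserDefaultPassword: "default"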

ClickHouse Operator Settings

The ClickHouse Operator role can connect to the ClickHouse database to perform the following:

  • Metrics requests
  • Schema Maintenance
  • Drop DNS Cache

Additional users can be created with this role by modifying the usersd XML files.

Setting      Default Value                  Description
chUsername   clickhouse_operator            The username for the ClickHouse Operator user.
chPassword   clickhouse_operator_password   The default password for the ClickHouse Operator user.
chPort       8123                           The IP port for the ClickHouse Operator user.
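
A minimal sketch of a usersd XML file that defines an additional user with this role, using a hypothetical second_operator user name and password, might look like the following:

<yandex>
    <users>
        <second_operator>
            <password>second_operator_password</password>
            <profile>default</profile>
            <quota>default</quota>
            <networks>
                <ip>127.0.0.1</ip>
            </networks>
        </second_operator>
    </users>
</yandex>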

Log Parameters

The Log Parameters sections sets the options for log outputs and levels.

Setting            Default Value   Description
logtostderr        true            If set to true, submits logs to stderr instead of log files.
alsologtostderr    false           If true, submits logs to stderr as well as log files.
v                  1               Sets V-leveled logging level.
stderrthreshold    ""              The error threshold. Errors at or above this level will be submitted to stderr.
vmodule            ""              A comma separated list of modules and their verbose level with {module name} = {log level}. For example: "module1=2,module2=3".
log_backtrace_at   ""              Location to store the stack backtrace.

Runtime Parameters

The Runtime Parameters section sets the resources allocated for processes such as reconcile functions.

Setting                  Default Value   Description
reconcileThreadsNumber   10              The number of threads allocated to manage reconcile requests.
reconcileWaitExclude     false           ???
reconcileWaitInclude     false           ???

Template Parameters

Template Parameters sets the values for connection values, user default settings, and other values. These values are based on ClickHouse configurations. For full details, see the ClickHouse documentation page.

3.2.2 - ClickHouse Cluster Settings

Settings and configurations for clusters and nodes

ClickHouse clusters that are configured on Kubernetes have several options based on the Kubernetes Custom Resources settings. Your cluster may have particular requirements to best fit your organization's needs.

For an example of a configuration file using each of these settings, see the 99-clickhouseinstallation-max.yaml file as a template.

This assumes that you have installed the clickhouse-operator.

Initial Settings

The first section sets the cluster kind and api.

Parent     Setting       Type     Description
None       kind          String   Specifies the type of cluster to install. In this case, ClickHouse. Valid value: ClickHouseInstallation
None       metadata      Object   Assigns metadata values for the cluster.
metadata   name          String   The name of the resource.
metadata   labels        Array    Labels applied to the resource.
metadata   annotations   Array    Annotations applied to the resource.

Initial Settings Example

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-installation-max"
  labels:
    label1: label1_value
    label2: label2_value
  annotations:
    annotation1: annotation1_value
    annotation2: annotation2_value

.spec.defaults

The .spec.defaults section represents default values for the sections that follow .spec.defaults.

Parent     Setting           Type         Description
defaults   replicasUseFQDN   [Yes | No]   Determines whether replicas are addressed by their fully qualified domain names.
defaults   distributedDDL    String       Sets the <yandex><distributed_ddl></distributed_ddl></yandex> configuration settings. For more information, see Distributed DDL Queries (ON CLUSTER Clause).
defaults   templates         Array        Sets the pod template types. This is where the template is declared, then defined in the .spec.configuration later.

.spec.defaults Example

  defaults:
    replicasUseFQDN: "no"
    distributedDDL:
      profile: default
    templates:
      podTemplate: clickhouse-v18.16.1
      dataVolumeClaimTemplate: default-volume-claim
      logVolumeClaimTemplate: default-volume-claim
      serviceTemplate: chi-service-template

.spec.configuration

.spec.configuration section represents sources for ClickHouse configuration files. For more information, see the ClickHouse Configuration Files page.

.spec.configuration Example

  configuration:
    users:
      readonly/profile: readonly
      #     <users>
      #        <readonly>
      #          <profile>readonly</profile>
      #        </readonly>
      #     </users>
      test/networks/ip:
        - "127.0.0.1"
        - "::/0"
      #     <users>
      #        <test>
      #          <networks>
      #            <ip>127.0.0.1</ip>
      #            <ip>::/0</ip>
      #          </networks>
      #        </test>
      #     </users>
      test/profile: default
      test/quotas: default

.spec.configuration.zookeeper

.spec.configuration.zookeeper defines the zookeeper settings, and is expanded into the <yandex><zookeeper></zookeeper></yandex> configuration section. For more information, see ClickHouse Zookeeper settings.

.spec.configuration.zookeeper Example

    zookeeper:
      nodes:
        - host: zookeeper-0.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-1.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-2.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
      session_timeout_ms: 30000
      operation_timeout_ms: 10000
      root: /path/to/zookeeper/node
      identity: user:password

.spec.configuration.profiles

.spec.configuration.profiles defines the ClickHouse profiles that are stored in <yandex><profiles></profiles></yandex>. For more information, see the ClickHouse Server Settings page.

.spec.configuration.profiles Example

    profiles:
      readonly/readonly: 1

expands into

      <profiles>
        <readonly>
          <readonly>1</readonly>
        </readonly>
      </profiles>

.spec.configuration.users

.spec.configuration.users defines the users and is stored in <yandex><users></users></yandex>. For more information, see the ClickHouse Server Settings page.

.spec.configuration.users Example

  users:
    test/networks/ip:
        - "127.0.0.1"
        - "::/0"

expands into

     <users>
        <test>
          <networks>
            <ip>127.0.0.1</ip>
            <ip>::/0</ip>
          </networks>
        </test>
     </users>

.spec.configuration.settings

.spec.configuration.settings sets other ClickHouse settings such as compression, etc. For more information, see the ClickHouse Server Settings page.

.spec.configuration.settings Example

    settings:
      compression/case/method: "zstd"
#      <compression>
#       <case>
#         <method>zstd</method>
#      </case>
#      </compression>

.spec.configuration.files

.spec.configuration.files creates custom files used in the cluster. These are used for custom configurations, such as the ClickHouse External Dictionary.

.spec.configuration.files Example

    files:
      dict1.xml: |
        <yandex>
            <!-- ref to file /etc/clickhouse-data/config.d/source1.csv -->
        </yandex>        
      source1.csv: |
        a1,b1,c1,d1
        a2,b2,c2,d2        
spec:
  configuration:
    settings:
      dictionaries_config: config.d/*.dict
    files:
      dict_one.dict: |
        <yandex>
          <dictionary>
        <name>one</name>
        <source>
            <clickhouse>
                <host>localhost</host>
                <port>9000</port>
                <user>default</user>
                <password/>
                <db>system</db>
                <table>one</table>
            </clickhouse>
        </source>
        <lifetime>60</lifetime>
        <layout><flat/></layout>
        <structure>
            <id>
                <name>dummy</name>
            </id>
            <attribute>
                <name>one</name>
                <expression>dummy</expression>
                <type>UInt8</type>
                <null_value>0</null_value>
            </attribute>
        </structure>
        </dictionary>
        </yandex>        

.spec.configuration.clusters

.spec.configuration.clusters defines the ClickHouse clusters to be installed.

    clusters:

Clusters and Layouts

.clusters.layout defines the ClickHouse layout of a cluster. This can be general, or very granular depending on your requirements. For full information, see Cluster Deployment.

Templates

podTemplate is used to define the specific pods in the cluster, mainly the ones that will be running ClickHouse. The VolumeClaimTemplate defines the storage volumes. Both of these settings are applied per replica.

Basic Dimensions

Basic dimensions are used to define the cluster definitions without specifying particular details of the shards or nodes.

Parent             Setting         Type     Description
.clusters.layout   shardsCount     Number   The number of shards for the cluster.
.clusters.layout   replicasCount   Number   The number of replicas for the cluster.

Basic Dimensions Example

In this example, the podTemplate defines ClickHouse containers in a cluster called all-counts with three shards and two replicas.

      - name: all-counts
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2

This is expanded into the following configuration. The IP addresses and DNS configuration are assigned by k8s and the operator.

<yandex>
    <remote_servers>
        <all-counts>
        
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.2</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.3</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.4</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.5</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.6</host>
                    <port>9000</port>
                </replica>
            </shard>
            
        </all-counts>
    </remote_servers>
</yandex>

Specified Dimensions

The templates section can also be used to specify more than just the general layout. The exact definitions of the shards and replicas can be defined as well.

In this example, shard0 has replicasCount specified, while shard1 has three replicas explicitly specified, with the possibility to customize each replica.

        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2
      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard0
              replicasCount: 3
              weight: 1
              internalReplication: Disabled
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim

            - name: shard1
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                - name: replica1
                - name: replica2

Other examples are combinations, where some replicas are defined but only one is explicitly differentiated with a different podTemplate.

      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard2
              replicasCount: 3
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                  port: 9000
                  templates:
                    podTemplate: clickhouse-v19.11.3.11
                    dataVolumeClaimTemplate: default-volume-claim
                    logVolumeClaimTemplate: default-volume-claim

.spec.templates.serviceTemplates

.spec.templates.serviceTemplates represents Kubernetes Service templates, with additional fields.

At the top level is generateName, which is used to explicitly specify the name of the service to be created. generateName understands macros for the service level of the object created. The service levels are defined as:

  • CHI
  • Cluster
  • Shard
  • Replica

The macro and service level where they apply are:

Setting          CHI   Cluster   Shard   Replica   Description
{chi}            X     X         X       X         ClickHouseInstallation name
{chiID}          X     X         X       X         short hashed ClickHouseInstallation name (Experimental)
{cluster}              X         X       X         The cluster name
{clusterID}            X         X       X         short hashed cluster name (BEWARE, this is an experimental feature)
{clusterIndex}         X         X       X         0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
{shard}                          X       X         shard name
{shardID}                        X       X         short hashed shard name (BEWARE, this is an experimental feature)
{shardIndex}                     X       X         0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
{replica}                                X         replica name
{replicaID}                              X         short hashed replica name (BEWARE, this is an experimental feature)
{replicaIndex}                           X         0-based index of the replica in the shard (BEWARE, this is an experimental feature)

.spec.templates.serviceTemplates Example

  templates:
    serviceTemplates:
      - name: chi-service-template
        # generateName understands different sets of macroses,
        # depending on the level of the object, for which Service is being created:
        #
        # For CHI-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        #
        # For Cluster-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        #
        # For Shard-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        #
        # For Replica-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        # 9. {replica} - replica name
        # 10. {replicaID} - short hashed replica name (BEWARE, this is an experimental feature)
        # 11. {replicaIndex} - 0-based index of the replica in the shard (BEWARE, this is an experimental feature)
        generateName: "service-{chi}"
        # type ObjectMeta struct from k8s.io/meta/v1
        metadata:
          labels:
            custom.label: "custom.value"
          annotations:
            cloud.google.com/load-balancer-type: "Internal"
            service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
            service.beta.kubernetes.io/azure-load-balancer-internal: "true"
            service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
            service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
        # type ServiceSpec struct from k8s.io/core/v1
        spec:
          ports:
            - name: http
              port: 8123
            - name: client
              port: 9000
          type: LoadBalancer

.spec.templates.volumeClaimTemplates

.spec.templates.volumeClaimTemplates defines the PersistentVolumeClaims. For more information, see the Kubernetes PersistentVolumeClaim page.

.spec.templates.volumeClaimTemplates Example

  templates:
    volumeClaimTemplates:
      - name: default-volume-claim
        # type PersistentVolumeClaimSpec struct from k8s.io/core/v1
        spec:
          # 1. If storageClassName is not specified, default StorageClass
          # (must be specified by cluster administrator) would be used for provisioning
          # 2. If storageClassName is set to an empty string (‘’), no storage class will be used
          # dynamic provisioning is disabled for this PVC. Existing, “Available”, PVs
          # (that do not have a specified storageClassName) will be considered for binding to the PVC
          #storageClassName: gold
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

.spec.templates.podTemplates

.spec.templates.podTemplates defines the Pod Templates. For more information, see the Kubernetes Pod Templates.

The following additional sections have been defined for the ClickHouse cluster:

  1. zone
  2. distribution

zone and distribution together define zoned layout of ClickHouse instances over nodes. These ensure that the affinity.nodeAffinity and affinity.podAntiAffinity are set.

.spec.templates.podTemplates Example

To place ClickHouse instances in the AWS us-east-1a availability zone with one ClickHouse instance per host:

        zone:
          values:
            - "us-east-1a"
        distribution: "OnePerHost"

To place ClickHouse instances on nodes labeled as clickhouse=allow with one ClickHouse per host:

        zone:
          key: "clickhouse"
          values:
            - "allow"
        distribution: "OnePerHost"

Or the distribution can be Unspecified:

  templates:
    podTemplates:
      # multiple pod templates makes possible to update version smoothly
      # pod template for ClickHouse v18.16.1
      - name: clickhouse-v18.16.1
        # We may need to label nodes with clickhouse=allow label for this example to run
        # See ./label_nodes.sh for this purpose
        zone:
          key: "clickhouse"
          values:
            - "allow"
        # Shortcut version for AWS installations
        #zone:
        #  values:
        #    - "us-east-1a"

        # Possible values for distribution are:
        # Unspecified
        # OnePerHost
        distribution: "Unspecified"

        # type PodSpec struct {} from k8s.io/core/v1
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:18.16.1
              volumeMounts:
                - name: default-volume-claim
                  mountPath: /var/lib/clickhouse
              resources:
                requests:
                  memory: "64Mi"
                  cpu: "100m"
                limits:
                  memory: "64Mi"
                  cpu: "100m"

References

3.3 - Resources

Altinity Kubernetes Operator Resources Details

The Altinity Kubernetes Operator creates the following resources on installation to support its functions:

  • Custom Resource Definition
  • Service account
  • Cluster Role Binding
  • Deployment

Custom Resource Definition

The Kubernetes API is extended with the new Custom Resource Definition kind: ClickHouseInstallation.

To check the Custom Resource Definition:

kubectl get customresourcedefinitions

Expected result:

NAME                                                       CREATED AT
clickhouseinstallations.clickhouse.altinity.com            2022-02-09T17:20:39Z
clickhouseinstallationtemplates.clickhouse.altinity.com    2022-02-09T17:20:39Z
clickhouseoperatorconfigurations.clickhouse.altinity.com   2022-02-09T17:20:39Z

Service Account

The new Service Account clickhouse-operator allows services running from within Pods to be authenticated against the Service Account clickhouse-operator through the apiserver.

To check the Service Account:

kubectl get serviceaccounts -n kube-system

Expected result

NAME                                 SECRETS   AGE
attachdetach-controller              1         23d
bootstrap-signer                     1         23d
certificate-controller               1         23d
clickhouse-operator                  1         5s
clusterrole-aggregation-controller   1         23d
coredns                              1         23d
cronjob-controller                   1         23d
daemon-set-controller                1         23d
default                              1         23d
deployment-controller                1         23d
disruption-controller                1         23d
endpoint-controller                  1         23d
endpointslice-controller             1         23d
endpointslicemirroring-controller    1         23d
ephemeral-volume-controller          1         23d
expand-controller                    1         23d
generic-garbage-collector            1         23d
horizontal-pod-autoscaler            1         23d
job-controller                       1         23d
kube-proxy                           1         23d
namespace-controller                 1         23d
node-controller                      1         23d
persistent-volume-binder             1         23d
pod-garbage-collector                1         23d
pv-protection-controller             1         23d
pvc-protection-controller            1         23d
replicaset-controller                1         23d
replication-controller               1         23d
resourcequota-controller             1         23d
root-ca-cert-publisher               1         23d
service-account-controller           1         23d
service-controller                   1         23d
statefulset-controller               1         23d
storage-provisioner                  1         23d
token-cleaner                        1         23d
ttl-after-finished-controller        1         23d
ttl-controller                       1         23d

Cluster Role Binding

The Cluster Role Binding clickhouse-operator grants permissions defined in a role to a set of users.

Roles are granted to users, groups, or service accounts. These permissions are granted cluster-wide with a ClusterRoleBinding.

To check the Cluster Role Binding:

kubectl get clusterrolebinding

Expected result

NAME                                                   ROLE                                                                               AGE
clickhouse-operator-kube-system                        ClusterRole/clickhouse-operator-kube-system                                        5s
cluster-admin                                          ClusterRole/cluster-admin                                                          23d
kubeadm:get-nodes                                      ClusterRole/kubeadm:get-nodes                                                      23d
kubeadm:kubelet-bootstrap                              ClusterRole/system:node-bootstrapper                                               23d
kubeadm:node-autoapprove-bootstrap                     ClusterRole/system:certificates.k8s.io:certificatesigningrequests:nodeclient       23d
kubeadm:node-autoapprove-certificate-rotation          ClusterRole/system:certificates.k8s.io:certificatesigningrequests:selfnodeclient   23d
kubeadm:node-proxier                                   ClusterRole/system:node-proxier                                                    23d
minikube-rbac                                          ClusterRole/cluster-admin                                                          23d
storage-provisioner                                    ClusterRole/system:persistent-volume-provisioner                                   23d
system:basic-user                                      ClusterRole/system:basic-user                                                      23d
system:controller:attachdetach-controller              ClusterRole/system:controller:attachdetach-controller                              23d
system:controller:certificate-controller               ClusterRole/system:controller:certificate-controller                               23d
system:controller:clusterrole-aggregation-controller   ClusterRole/system:controller:clusterrole-aggregation-controller                   23d
system:controller:cronjob-controller                   ClusterRole/system:controller:cronjob-controller                                   23d
system:controller:daemon-set-controller                ClusterRole/system:controller:daemon-set-controller                                23d
system:controller:deployment-controller                ClusterRole/system:controller:deployment-controller                                23d
system:controller:disruption-controller                ClusterRole/system:controller:disruption-controller                                23d
system:controller:endpoint-controller                  ClusterRole/system:controller:endpoint-controller                                  23d
system:controller:endpointslice-controller             ClusterRole/system:controller:endpointslice-controller                             23d
system:controller:endpointslicemirroring-controller    ClusterRole/system:controller:endpointslicemirroring-controller                    23d
system:controller:ephemeral-volume-controller          ClusterRole/system:controller:ephemeral-volume-controller                          23d
system:controller:expand-controller                    ClusterRole/system:controller:expand-controller                                    23d
system:controller:generic-garbage-collector            ClusterRole/system:controller:generic-garbage-collector                            23d
system:controller:horizontal-pod-autoscaler            ClusterRole/system:controller:horizontal-pod-autoscaler                            23d
system:controller:job-controller                       ClusterRole/system:controller:job-controller                                       23d
system:controller:namespace-controller                 ClusterRole/system:controller:namespace-controller                                 23d
system:controller:node-controller                      ClusterRole/system:controller:node-controller                                      23d
system:controller:persistent-volume-binder             ClusterRole/system:controller:persistent-volume-binder                             23d
system:controller:pod-garbage-collector                ClusterRole/system:controller:pod-garbage-collector                                23d
system:controller:pv-protection-controller             ClusterRole/system:controller:pv-protection-controller                             23d
system:controller:pvc-protection-controller            ClusterRole/system:controller:pvc-protection-controller                            23d
system:controller:replicaset-controller                ClusterRole/system:controller:replicaset-controller                                23d
system:controller:replication-controller               ClusterRole/system:controller:replication-controller                               23d
system:controller:resourcequota-controller             ClusterRole/system:controller:resourcequota-controller                             23d
system:controller:root-ca-cert-publisher               ClusterRole/system:controller:root-ca-cert-publisher                               23d
system:controller:route-controller                     ClusterRole/system:controller:route-controller                                     23d
system:controller:service-account-controller           ClusterRole/system:controller:service-account-controller                           23d
system:controller:service-controller                   ClusterRole/system:controller:service-controller                                   23d
system:controller:statefulset-controller               ClusterRole/system:controller:statefulset-controller                               23d
system:controller:ttl-after-finished-controller        ClusterRole/system:controller:ttl-after-finished-controller                        23d
system:controller:ttl-controller                       ClusterRole/system:controller:ttl-controller                                       23d
system:coredns                                         ClusterRole/system:coredns                                                         23d
system:discovery                                       ClusterRole/system:discovery                                                       23d
system:kube-controller-manager                         ClusterRole/system:kube-controller-manager                                         23d
system:kube-dns                                        ClusterRole/system:kube-dns                                                        23d
system:kube-scheduler                                  ClusterRole/system:kube-scheduler                                                  23d
system:monitoring                                      ClusterRole/system:monitoring                                                      23d
system:node                                            ClusterRole/system:node                                                            23d
system:node-proxier                                    ClusterRole/system:node-proxier                                                    23d
system:public-info-viewer                              ClusterRole/system:public-info-viewer                                              23d
system:service-account-issuer-discovery                ClusterRole/system:service-account-issuer-discovery                                23d
system:volume-scheduler                                ClusterRole/system:volume-scheduler                                                23d

Cluster Role Binding Example

As an example, the role cluster-admin is granted to a service account clickhouse-operator:

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: clickhouse-operator
    namespace: kube-system
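
The binding installed by the operator bundle can be reviewed directly with kubectl. The ClusterRoleBinding name below is an assumption based on a default bundle install; adjust it to match the name shown by kubectl get clusterrolebindings in your cluster:

kubectl get clusterrolebinding clickhouse-operator-kube-system -o yaml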

Deployment

The Deployment clickhouse-operator runs in the kube-system namespace.

To check the Deployment:

kubectl get deployments --namespace kube-system

Expected result

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
clickhouse-operator   1/1     1            1           5s
coredns               1/1     1            1           23d
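
The operator's logs can be reviewed by running kubectl logs against the Deployment. The container name clickhouse-operator is an assumption; recent bundles also ship a metrics exporter container in the same pod, so the container is specified explicitly here:

kubectl logs deployment/clickhouse-operator -n kube-system -c clickhouse-operator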

3.4 - Networking Connection Guides

How to connect your ClickHouse Kubernetes cluster to your network.

Organizations can connect their clickhouse-operator based ClickHouse cluster to their network in a way that depends on their environment. The following guides assist users in setting up those connections.

3.4.1 - MiniKube Networking Connection Guide

How to connect your ClickHouse Kubernetes cluster to your network.

Organizations that have set up the Altinity Kubernetes Operator using minikube can connect it to an external network through the following steps.

Prerequisites

The following guide is based on an Altinity Kubernetes Operator cluster installed with minikube on an Ubuntu Linux operating system.

Network Connection Guide

The proper way to connect to the ClickHouse cluster is through the LoadBalancer service created during the ClickHouse cluster creation process. For example, the following ClickHouse cluster has 2 shards with one replica each, applied to the namespace test:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 1
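
As a minimal sketch, the manifest above can be saved to a file and applied to the test namespace with kubectl. The file name demo-01.yaml is an assumption, and the namespace is created first if it does not already exist:

kubectl create namespace test
kubectl apply -f demo-01.yaml -n test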

This generates the following services in the namespace test:

kubectl get service -n test
NAME                      TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None          <none>        8123/TCP,9000/TCP,9009/TCP      22s
chi-demo-01-demo-01-1-0   ClusterIP      None          <none>        8123/TCP,9000/TCP,9009/TCP      5s
clickhouse-demo-01        LoadBalancer   10.96.67.44   <pending>     8123:32766/TCP,9000:31368/TCP   38s

The LoadBalancer alternates connections among the ClickHouse shards, and is the service that all ClickHouse clients should connect to.

To open a connection from external networks to the LoadBalancer, use the kubectl port-forward command in the following format:

kubectl port-forward service/{LoadBalancer Service} -n {NAMESPACE} --address={IP ADDRESS} {TARGET PORT}:{INTERNAL PORT}

Replacing the following:

  • LoadBalancer Service: the LoadBalancer service to connect external ports to the Kubernetes environment.
  • NAMESPACE: The namespace for the LoadBalancer.
  • IP ADDRESS: The IP address to bind the service to on the machine running minikube, or 0.0.0.0 to bind all IP addresses on the minikube server to the specified port.
  • TARGET PORT: The external port that users will connect to.
  • INTERNAL PORT: The port within the Altinity Kubernetes Operator network.

The kubectl port-forward command must be kept running in the terminal, or placed into the background with the & operator.

In the example above, the following settings will be used to bind all IP addresses on the minikube server to the service clickhouse-demo-01 for ports 9000 and 8123 in the background:

kubectl port-forward service/clickhouse-demo-01 -n test --address=0.0.0.0 9000:9000 8123:8123 &

To test the connection, connect to the external IP address with curl. For the ClickHouse HTTP port 8123, Ok will be returned, while for the native TCP port 9000 a notice directing clients to port 8123 will be displayed:

curl http://localhost:9000
Handling connection for 9000
Port 9000 is for clickhouse-client program
You must use port 8123 for HTTP.
curl http://localhost:8123
Handling connection for 8123
Ok.

Once verified, connect to the ClickHouse cluster via either HTTP or ClickHouse TCP as needed.
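
For example, a test query can be run over HTTP with curl or over the native TCP port with clickhouse-client, assuming the port-forward above is still running and clickhouse-client is installed locally:

curl 'http://localhost:8123/' --data-binary 'SELECT version()'
clickhouse-client --host 127.0.0.1 --port 9000 --query 'SELECT version()'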

3.5 - Storage Guide

How to configure storage options for the Altinity Kubernetes Operator

Altinity Kubernetes Operator users have different options regarding persistent storage depending on their environment and situation. The following guides detail how to set up persistent storage for local and cloud storage environments.

3.5.1 - Persistent Storage Overview

Allocate persistent storage for Altinity Kubernetes Operator clusters

Users setting up storage in their local environments can establish persistent volumes in different formats based on their requirements.

Allocating Space

Space is allocated through the Kubernetes PersistentVolume object. ClickHouse clusters established with the Altinity Kubernetes Operator then use the PersistentVolumeClaim to receive persistent storage.

The PersistentVolume can be set in one of two ways:

  • Manually: Manual allocations set the storage area before the ClickHouse cluster is created. Space is then requested through a PersistentVolumeClaim when the ClickHouse cluster is created. An example manifest is shown after this list.
  • Dynamically: Space is allocated when the ClickHouse cluster is created through the PersistentVolumeClaim, and the Kubernetes controlling software manages the process for the user.

For more information on how persistent volumes are managed in Kubernetes, see the Kubernetes documentation Persistent Volumes.
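
As a minimal sketch of the manual approach, a PersistentVolume can be defined before the ClickHouse cluster is created. The hostPath location, capacity, and storage class below are illustrative values only, not requirements of the Altinity Kubernetes Operator:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: manual-clickhouse-pv
spec:
  capacity:
    storage: 500Mi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  hostPath:
    # Node-local path used only for this illustration
    path: /mnt/clickhouse-data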

Storage Types

Data for ClickHouse clusters can be stored in the following ways:

No Persistent Storage

If no persistent storage claim template is specified, then no persistent storage will be allocated. When Kubernetes is stopped or a new manifest applied, all previous data will be lost.

In this example, two shards are specified but have no persistent storage allocated:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "no-persistent"
spec:
  configuration:
    clusters:
      - name: "no-persistent"
        layout:
          shardsCount: 2
          replicasCount: 1

When applied to the namespace test, no persistent storage is found:

kubectl -n test get pv
No resources found

Cluster Wide Storage

If neither the dataVolumeClaimTemplate nor the logVolumeClaimTemplate is specified (see below), then all data is stored under the requested volumeClaimTemplate. This includes all information stored in each pod.

In this example, two shards are specified, each with one volume of storage that is used by the entire pod:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-storage"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
            volumeClaimTemplate: cluster-storage-vc-template
  templates:
    volumeClaimTemplates:
      - name: cluster-storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi

When applied to the namespace test, the following persistent volumes are found. Note that each pod has 500Mi of storage:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                        STORAGECLASS   REASON   AGE
pvc-6e70c36a-f170-47b5-93a6-88175c62b8fe   500Mi      RWO            Delete           Bound    test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-1-0-0   standard                21s
pvc-ca002bc4-0ad2-4358-9546-0298eb8b2152   500Mi      RWO            Delete           Bound    test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-0-0-0   standard                39s
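
The PersistentVolumeClaims bound to these volumes can also be listed directly; the claim names follow the pattern shown in the CLAIM column above:

kubectl -n test get pvc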

Cluster Wide Split Storage

Applying the dataVolumeClaimTemplate and logVolumeClaimTemplate template types to the Altinity Kubernetes Operator controlled ClickHouse cluster allows for specific data from each ClickHouse pod to be stored in a particular persistent volume:

  • dataVolumeClaimTemplate: Sets the storage volume for the ClickHouse node data. In a traditional ClickHouse server environment, this would be allocated to /var/lib/clickhouse.
  • logVolumeClaimTemplate: Sets the storage volume for ClickHouse node log files. In a traditional ClickHouse server environment, this would be allocated to /var/log/clickhouse-server.

This allows different storage capacities for log data versus ClickHouse database data, as well as capturing only specific data rather than the entire pod.

In this example, two shards have different storage capacity for dataVolumeClaimTemplate and logVolumeClaimTemplate:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-split-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-split"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
          dataVolumeClaimTemplate: data-volume-template
          logVolumeClaimTemplate: log-volume-template
  templates:
    volumeClaimTemplates:
      - name: data-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi

In this case, retrieving the PersistentVolume allocations shows two storage volumes per pod based on the specifications in the manifest:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                     STORAGECLASS   REASON   AGE
pvc-0b02c5ba-7ca1-4578-b3d9-ff8bb67ad412   100Mi      RWO            Delete           Bound    test/log-volume-template-chi-cluster-split-storage-cluster-split-1-0-0    standard                21s
pvc-4095b3c0-f550-4213-aa53-a08bade7c62c   100Mi      RWO            Delete           Bound    test/log-volume-template-chi-cluster-split-storage-cluster-split-0-0-0    standard                40s
pvc-71384670-c9db-4249-ae7e-4c5f1c33e0fc   500Mi      RWO            Delete           Bound    test/data-volume-template-chi-cluster-split-storage-cluster-split-1-0-0   standard                21s
pvc-9e3fb3fa-faf3-4a0e-9465-8da556cb9eec   500Mi      RWO            Delete           Bound    test/data-volume-template-chi-cluster-split-storage-cluster-split-0-0-0   standard                40s

Pod Mount Based Storage

PersistentVolume objects can be mounted directly into a pod's mountPath. Any data outside the mounted paths is not retained when the container is stopped, unless it is covered by another PersistentVolumeClaim.

In the following example, each of the 2 shards in the ClickHouse cluster has its volumes tied to specific mount points:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "pod-split-storage"
spec:
  configuration:
    clusters:
      - name: "pod-split"
        # Templates are specified for this cluster explicitly
        templates:
          podTemplate: pod-template-with-volumes
        layout:
          shardsCount: 2
          replicasCount: 1

  templates:
    podTemplates:
      - name: pod-template-with-volumes
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:21.8
              volumeMounts:
                - name: data-storage-vc-template
                  mountPath: /var/lib/clickhouse
                - name: log-storage-vc-template
                  mountPath: /var/log/clickhouse-server

    volumeClaimTemplates:
      - name: data-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi

When applied to the namespace test, the following persistent volumes are found, with one data volume and one log volume per pod:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                 STORAGECLASS   REASON   AGE
pvc-37be9f84-7ba5-404e-8299-e95a291014a8   500Mi      RWO            Delete           Bound    test/data-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0   standard                24s
pvc-5b2f8694-326d-41cb-94ec-559725947b45   100Mi      RWO            Delete           Bound    test/log-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0    standard                24s
pvc-84768e78-e44e-4295-8355-208b07330707   500Mi      RWO            Delete           Bound    test/data-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0   standard                43s
pvc-9e123af7-01ce-4ab8-9450-d8ca32b1e3a6   100Mi      RWO            Delete           Bound    test/log-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0    standard                43s
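
To confirm where the volumes are mounted inside a running pod, df can be run through kubectl exec. The pod name below is an assumption derived from the claim names above and may differ in your environment:

kubectl -n test exec chi-pod-split-storage-pod-split-0-0-0 -- df -h /var/lib/clickhouse /var/log/clickhouse-server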