Deploy a simple multi-node ClickHouse cluster with docker-compose in minutes.

Overview

Simple Multi-Node ClickHouse Cluster

I hate those single-node ClickHouse "clusters" and manual installation. I mean, why should we:

this is just weird!

So this repo tries to solve these problems.

Note

  • This is a simplified model of a multi-node ClickHouse cluster. It lacks load balancer configuration, automated failover, and multi-shard config generation.
  • All ClickHouse data is persisted under event-data. To move ClickHouse to another directory, move the directory that contains docker-compose.yml and run docker-compose up -d to bring it up again.
  • Host network mode is used to simplify deployment, so you may need to create additional firewall rules if you run this on a publicly accessible machine.
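
Because host networking binds the services directly to the machine's interfaces, it can help to check which ports answer from outside before writing firewall rules. The sketch below is only an illustration: the port numbers are the usual ClickHouse and ZooKeeper defaults and are an assumption here, since the generated configs may use different ones.

```python
import socket

# Usual default ports (assumption: the generated configs may differ).
DEFAULT_PORTS = {
    "clickhouse native": 9000,
    "clickhouse http": 8123,
    "clickhouse interserver": 9009,
    "zookeeper client": 2181,
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exposed_ports(host, ports=DEFAULT_PORTS, timeout=1.0):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}


if __name__ == "__main__":
    # Run this from a machine outside the cluster network.
    for name, is_open in exposed_ports("192.168.33.101").items():
        print(f"{name}: {'reachable' if is_open else 'closed or filtered'}")
```

Any port reported as reachable from an untrusted network is a candidate for a firewall rule.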

Prerequisites

To use this, you need docker and docker-compose installed. The recommended OS is Ubuntu, and installing clickhouse-client on the machine is also recommended. On a typical Ubuntu server, the following should be sufficient:

apt update
curl -fsSL https://get.docker.com -o get-docker.sh && sh get-docker.sh && rm -f get-docker.sh
apt install docker-compose clickhouse-client -y
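
To confirm the prerequisites are in place before going further, a small check like the following may help. It is a minimal sketch that only looks for the three executables on PATH; the helper name `missing_tools` is made up for this example and is not part of the repo.

```python
import shutil

# Tools named in the prerequisites above.
REQUIRED_TOOLS = ("docker", "docker-compose", "clickhouse-client")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```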

Usage

  1. Clone this repo
  2. Edit the necessary server info in topo.yml
  3. Run python3 generate.py
  4. Your cluster info should be in the cluster directory now
  5. Sync those files to the corresponding nodes and run docker-compose up -d on each of them
  6. Your cluster is ready to go

If the steps above are unclear, see the example below.

Example Usage

Edit information

I've cloned the repo and would like to set up a 3-master ClickHouse cluster with the following specs:

  • 3 replicas (one replica on each node)
  • 1 shard only

So I need to edit the topo.yml as follows:

global:
  clickhouse_image: "yandex/clickhouse-server:21.3.2.5"
  zookeeper_image: "bitnami/zookeeper:3.6.1"

zookeeper_servers:
  - host: 192.168.33.101
  - host: 192.168.33.102
  - host: 192.168.33.103

clickhouse_servers:
  - host: 192.168.33.101
  - host: 192.168.33.102
  - host: 192.168.33.103

clickhouse_topology:
  - clusters:
      - name: "novakwok_cluster"
        shards:
          - name: "novakwok_shard"
            servers:
              - host: 192.168.33.101
              - host: 192.168.33.102
              - host: 192.168.33.103
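
A typo in topo.yml is easy to miss, so a quick consistency check before generating configs can be useful. The sketch below works on a plain dict that mirrors the file above (for example, what PyYAML's `yaml.safe_load` would return). The function `check_topology` and its rules are assumptions made for illustration, not something generate.py is known to do.

```python
# Plain-dict mirror of the topo.yml shown above.
TOPO = {
    "zookeeper_servers": [
        {"host": "192.168.33.101"},
        {"host": "192.168.33.102"},
        {"host": "192.168.33.103"},
    ],
    "clickhouse_servers": [
        {"host": "192.168.33.101"},
        {"host": "192.168.33.102"},
        {"host": "192.168.33.103"},
    ],
    "clickhouse_topology": [
        {"clusters": [
            {"name": "novakwok_cluster", "shards": [
                {"name": "novakwok_shard", "servers": [
                    {"host": "192.168.33.101"},
                    {"host": "192.168.33.102"},
                    {"host": "192.168.33.103"},
                ]},
            ]},
        ]},
    ],
}


def check_topology(topo):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    known = {server["host"] for server in topo.get("clickhouse_servers", [])}
    # Every host used in a shard should also be declared in clickhouse_servers.
    for entry in topo.get("clickhouse_topology", []):
        for cluster in entry.get("clusters", []):
            for shard in cluster.get("shards", []):
                for server in shard.get("servers", []):
                    if server["host"] not in known:
                        problems.append(
                            f"{cluster['name']}/{shard['name']}: "
                            f"{server['host']} is not in clickhouse_servers"
                        )
    # ZooKeeper needs a majority quorum, so an odd ensemble size is usual.
    zk_count = len(topo.get("zookeeper_servers", []))
    if zk_count % 2 == 0:
        problems.append(
            f"zookeeper_servers has {zk_count} entries; "
            "an odd number is usually recommended"
        )
    return problems


if __name__ == "__main__":
    for problem in check_topology(TOPO):
        print("WARNING:", problem)
```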

Generate Config

After running python3 generate.py, a directory structure is generated under the cluster directory. It looks like this:

➜  simple-multinode-clickhouse-cluster git:(master) ✗ python3 generate.py 
Write clickhouse-config.xml to cluster/192.168.33.101/clickhouse-config.xml
Write clickhouse-config.xml to cluster/192.168.33.102/clickhouse-config.xml
Write clickhouse-config.xml to cluster/192.168.33.103/clickhouse-config.xml

➜  simple-multinode-clickhouse-cluster git:(master) ✗ tree cluster 
cluster
├── 192.168.33.101
│   ├── clickhouse-config.xml
│   ├── clickhouse-user-config.xml
│   └── docker-compose.yml
├── 192.168.33.102
│   ├── clickhouse-config.xml
│   ├── clickhouse-user-config.xml
│   └── docker-compose.yml
└── 192.168.33.103
    ├── clickhouse-config.xml
    ├── clickhouse-user-config.xml
    └── docker-compose.yml

3 directories, 9 files
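
For readers curious about what ends up in each clickhouse-config.xml, the cluster definition lives in ClickHouse's `remote_servers` section. The sketch below shows how such a section could be built from the topology. It is not the repo's generate.py; the function name, the `internal_replication` value, and the port default are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def remote_servers_xml(cluster_name, shards, port=9000):
    """Build a <remote_servers> section.

    shards is a list of shards, each a list of replica hosts.
    """
    root = ET.Element("remote_servers")
    cluster = ET.SubElement(root, cluster_name)
    for hosts in shards:
        shard = ET.SubElement(cluster, "shard")
        # Assumption: replicated tables handle replication themselves.
        ET.SubElement(shard, "internal_replication").text = "true"
        for host in hosts:
            replica = ET.SubElement(shard, "replica")
            ET.SubElement(replica, "host").text = host
            ET.SubElement(replica, "port").text = str(port)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # One shard with three replicas, matching the example topology.
    print(remote_servers_xml(
        "novakwok_cluster",
        [["192.168.33.101", "192.168.33.102", "192.168.33.103"]],
    ))
```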

Sync Config

Now we need to sync those files to the corresponding hosts (of course, you can use Ansible here):

rsync -aP ./cluster/192.168.33.101/ root@192.168.33.101:/root/ch/
rsync -aP ./cluster/192.168.33.102/ root@192.168.33.102:/root/ch/
rsync -aP ./cluster/192.168.33.103/ root@192.168.33.103:/root/ch/
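
With more than a few nodes, typing one rsync line per host gets tedious. As a rough sketch, the commands can be derived from the generated directory names, since each subdirectory of cluster is named after its host. The SSH user (`root` here, guessed from the /root/ch/ target path) and the helper name `rsync_commands` are assumptions.

```python
import subprocess
from pathlib import Path


def rsync_commands(cluster_dir="cluster", user="root", remote_dir="/root/ch/"):
    """Return one rsync argument list per host directory under cluster_dir."""
    commands = []
    for host_dir in sorted(Path(cluster_dir).iterdir()):
        if host_dir.is_dir():
            commands.append([
                "rsync", "-aP",
                f"{host_dir}/",                        # trailing slash: copy contents
                f"{user}@{host_dir.name}:{remote_dir}",
            ])
    return commands


if __name__ == "__main__":
    for command in rsync_commands():
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)
```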

Start Cluster

Now run docker-compose up -d in the /root/ch/ directory on every host.

Validation

On 192.168.33.101, use clickhouse-client to connect to the local instance and check that the cluster is there.

[email protected]:~/ch# clickhouse-client 
ClickHouse client version 18.16.1.
Connecting to localhost:9000.
Connected to ClickHouse server version 21.3.2 revision 54447.

192-168-33-101 :) SELECT * FROM system.clusters;

SELECT *
FROM system.clusters 

┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name──────┬─host_address───┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ novakwok_cluster                             │         1 │            1 │           1 │ 192.168.33.101 │ 192.168.33.101 │ 9000 │        1 │ default │                  │            0 │                       0 │
│ novakwok_cluster                             │         1 │            1 │           2 │ 192.168.33.102 │ 192.168.33.102 │ 9000 │        0 │ default │                  │            0 │                       0 │
│ novakwok_cluster                             │         1 │            1 │           3 │ 192.168.33.103 │ 192.168.33.103 │ 9000 │        0 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards                      │         1 │            1 │           1 │ 127.0.0.1      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards                      │         2 │            1 │           1 │ 127.0.0.2      │ 127.0.0.2      │ 9000 │        0 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_internal_replication │         1 │            1 │           1 │ 127.0.0.1      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_internal_replication │         2 │            1 │           1 │ 127.0.0.2      │ 127.0.0.2      │ 9000 │        0 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_localhost            │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_localhost            │         2 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_shard_localhost                         │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_shard_localhost_secure                  │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9440 │        0 │ default │                  │            0 │                       0 │
│ test_unavailable_shard                       │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_unavailable_shard                       │         2 │            1 │           1 │ localhost      │ 127.0.0.1      │    1 │        0 │ default │                  │            0 │                       0 │
└──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴────────────────┴────────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘
↘ Progress: 13.00 rows, 1.58 KB (4.39 thousand rows/s., 532.47 KB/s.) 
13 rows in set. Elapsed: 0.003 sec. 

Let's create a database on every replica of the cluster:

192-168-33-101 :) create database novakwok_test on cluster novakwok_cluster;

CREATE DATABASE novakwok_test ON CLUSTER novakwok_cluster

┌─host───────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 192.168.33.103 │ 9000 │      0 │       │                   2 │                0 │
│ 192.168.33.101 │ 9000 │      0 │       │                   1 │                0 │
│ 192.168.33.102 │ 9000 │      0 │       │                   0 │                0 │
└────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
← Progress: 3.00 rows, 174.00 B (16.07 rows/s., 931.99 B/s.)  99%
3 rows in set. Elapsed: 0.187 sec. 

192-168-33-101 :) show databases;

SHOW DATABASES

┌─name──────────┐
│ default       │
│ novakwok_test │
│ system        │
└───────────────┘
↑ Progress: 3.00 rows, 479.00 B (855.61 rows/s., 136.61 KB/s.) 
3 rows in set. Elapsed: 0.004 sec. 

Connect to another host to see if it's really working.

[email protected]:~/ch# clickhouse-client -h 192.168.33.102
ClickHouse client version 18.16.1.
Connecting to 192.168.33.102:9000.
Connected to ClickHouse server version 21.3.2 revision 54447.

192-168-33-102 :) show databases;

SHOW DATABASES

┌─name──────────┐
│ default       │
│ novakwok_test │
│ system        │
└───────────────┘
↘ Progress: 3.00 rows, 479.00 B (623.17 rows/s., 99.50 KB/s.) 
3 rows in set. Elapsed: 0.005 sec. 
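
The manual check above can also be scripted. The following is a minimal sketch, assuming clickhouse-client is on PATH and can reach every node on its default port; it runs SHOW DATABASES against each host and reports the hosts where the database is missing. The helper names are made up for this example.

```python
import subprocess


def show_databases(host):
    """Return the database names reported by the ClickHouse server on host."""
    result = subprocess.run(
        ["clickhouse-client", "-h", host, "--query", "SHOW DATABASES"],
        capture_output=True, text=True, check=True,
    )
    # Non-interactive output is one database name per line.
    return result.stdout.split()


def hosts_missing_database(hosts, database, list_databases=show_databases):
    """Return the hosts on which database does not exist."""
    return [host for host in hosts if database not in list_databases(host)]


if __name__ == "__main__":
    hosts = ["192.168.33.101", "192.168.33.102", "192.168.33.103"]
    missing = hosts_missing_database(hosts, "novakwok_test")
    print("Missing on:", missing if missing else "no hosts")
```

Passing `list_databases` as a parameter keeps the comparison logic testable without a running cluster.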

License

GPL

Owner
Nova Kwok
43EC 6073 0BFF A16C 34BB 9EF2 8D42 A0E6 99E5 0639