# Fluent Bit -> Kafka -> ClickHouse
The Fluent Bit Kafka ClickHouse connector ingests logs from Kafka into ClickHouse. To write the logs from Fluent Bit into Kafka, the official Kafka output plugin can be used.
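A minimal sketch of such an output section, assuming a broker reachable at `kafka:9092` and the connector's default topic `fluent-bit` (see the configuration table below), could be appended to the Fluent Bit configuration like this:

```sh
# Sketch: append a Kafka output section to the Fluent Bit configuration.
# The broker address is an assumption; "fluent-bit" is the connector's
# default topic (see the configuration table below).
cat <<'EOF' >> fluent-bit.conf
[OUTPUT]
    Name    kafka
    Match   *
    Brokers kafka:9092
    Topics  fluent-bit
EOF
```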
## Configuration
An example Deployment for the Kafka ClickHouse connector can be found in the `fluent-bit-kafka-clickhouse.yaml` file. The following command-line flags and environment variables can be used to configure the connector:
Command-Line Flag | Environment Variable | Description | Default |
---|---|---|---|
`--clickhouse.address` | `CLICKHOUSE_ADDRESS` | ClickHouse address to connect to. | |
`--clickhouse.database` | `CLICKHOUSE_DATABASE` | ClickHouse database name. | `logs` |
`--clickhouse.username` | `CLICKHOUSE_USERNAME` | ClickHouse username for the connection. | |
`--clickhouse.password` | `CLICKHOUSE_PASSWORD` | ClickHouse password for the connection. | |
`--clickhouse.write-timeout` | `CLICKHOUSE_WRITE_TIMEOUT` | ClickHouse write timeout for the connection. | `10` |
`--clickhouse.read-timeout` | `CLICKHOUSE_READ_TIMEOUT` | ClickHouse read timeout for the connection. | `10` |
`--clickhouse.batch-size` | `CLICKHOUSE_BATCH_SIZE` | The number of log lines to buffer before they are written to ClickHouse. | `100000` |
`--clickhouse.flush-interval` | `CLICKHOUSE_FLUSH_INTERVAL` | The maximum amount of time to wait before logs are written to ClickHouse. | `60s` |
`--kafka.brokers` | `KAFKA_BROKERS` | Kafka bootstrap brokers to connect to, as a comma-separated list. | |
`--kafka.group` | `KAFKA_GROUP` | Kafka consumer group definition. | `kafka-clickhouse` |
`--kafka.version` | `KAFKA_VERSION` | Kafka cluster version. | `2.1.1` |
`--kafka.topics` | `KAFKA_TOPICS` | Kafka topics to be consumed, as a comma-separated list. | `fluent-bit` |
`--log.format` | `LOG_FORMAT` | The log format. Must be `plain` or `json`. | `plain` |
`--log.level` | `LOG_LEVEL` | The log level. Must be `trace`, `debug`, `info`, `warn`, `error`, `fatal` or `panic`. | `info` |
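For example, the connector image built below could be configured entirely through environment variables; the ClickHouse and Kafka addresses here are placeholders:

```sh
# Sketch: run the connector with assumed service addresses; the values
# are illustrative placeholders, the variable names match the table above.
docker run -it --rm \
  -e CLICKHOUSE_ADDRESS=clickhouse:9000 \
  -e KAFKA_BROKERS=kafka:9092 \
  -e KAFKA_TOPICS=fluent-bit \
  localhost:5000/fluent-bit-clickhouse:latest-kafka
```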
## Development
We are using kind for local development. To create a new Kubernetes cluster with kind, run the `cluster/cluster.sh` script, which creates the cluster together with a local Docker registry:

```sh
./cluster/cluster.sh
```
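Assuming the script uses the names that also appear in the cleanup commands at the end of this section, you can verify that the cluster and the registry are up:

```sh
# The new cluster should appear in the kind cluster list ...
kind get clusters
# ... and the local registry container should be running.
docker ps --filter name=kind-registry
```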
Once the cluster is running, we can build and push the Docker image for the Kafka to ClickHouse connector:

```sh
docker build -f cmd/fluent-bit-kafka-clickhouse/Dockerfile -t localhost:5000/fluent-bit-clickhouse:latest-kafka .
docker push localhost:5000/fluent-bit-clickhouse:latest-kafka

# To run the Docker image locally, the following command can be used:
docker run -it --rm localhost:5000/fluent-bit-clickhouse:latest-kafka
```
In the next step we have to create our ClickHouse cluster via the ClickHouse Operator. To do that, we can apply all the files from the `cluster/clickhouse-operator` and `cluster/clickhouse` folders (`k` is used as an alias for `kubectl` throughout):

```sh
k apply -f cluster/clickhouse-operator
k apply -f cluster/clickhouse
```
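Before creating the schema, it can help to wait until both ClickHouse pods are ready; the `clickhouse` namespace and the pod names are the ones used in the exec commands below:

```sh
# Watch the ClickHouse pods until both report Ready.
k get pods -n clickhouse -w
```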
Once ClickHouse is running, we have to connect to the two ClickHouse nodes to create our SQL schema. The schema can be found in the `schema.sql` file; execute each SQL command one by one on both ClickHouse nodes:

```sh
k exec -n clickhouse -it chi-clickhouse-sharded-0-0-0 -c clickhouse -- clickhouse-client
k exec -n clickhouse -it chi-clickhouse-sharded-1-0-0 -c clickhouse -- clickhouse-client
```
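You can then confirm non-interactively that the schema exists on each node; `--query` runs a single statement and exits:

```sh
# List the tables in the logs database on the first node; repeat with
# chi-clickhouse-sharded-1-0-0 to check the second node.
k exec -n clickhouse chi-clickhouse-sharded-0-0-0 -c clickhouse -- \
  clickhouse-client --query "SHOW TABLES FROM logs"
```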
Before we can deploy Fluent Bit and the Kafka to ClickHouse connector, we have to deploy Kafka using the following command:

```sh
k apply -f cluster/kafka
```
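Again, you can watch the pods until the brokers are ready; that the manifests create a `kafka` namespace is an assumption here:

```sh
# Watch the Kafka pods until they report Ready.
k get pods -n kafka -w
```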
Now we can deploy Fluent Bit to ingest all logs into Kafka, together with the Kafka to ClickHouse connector to write the logs from Kafka into ClickHouse:

```sh
k apply -f cluster/fluent-bit/kafka
# Follow the Fluent Bit logs to verify that records are being shipped:
k logs -n fluent-bit -l app=fluent-bit -f
```
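To confirm that records actually arrive in Kafka before they are picked up by the connector, a console consumer can be pointed at the topic; the pod name `kafka-0` is an assumption about the manifests in `cluster/kafka`:

```sh
# Read a few records from the fluent-bit topic (the connector's default).
k exec -n kafka -it kafka-0 -- kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 --topic fluent-bit --max-messages 5
```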
To check if the logs are arriving in ClickHouse, you can use the following SQL commands:

```sql
SELECT count(*) FROM logs.logs;
SELECT * FROM logs.logs LIMIT 10;

SELECT count(*) FROM logs.logs_local;
SELECT * FROM logs.logs_local LIMIT 10;
```
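The same checks can be run without an interactive client session, for example:

```sh
# Count the rows in logs.logs from the first node.
k exec -n clickhouse chi-clickhouse-sharded-0-0-0 -c clickhouse -- \
  clickhouse-client --query "SELECT count(*) FROM logs.logs"
```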
To clean up all the created resources, run the following commands:

```sh
kind delete cluster --name fluent-bit-clickhouse
docker stop kind-registry
docker rm kind-registry
```