Automating Kafka with Ansible
In all areas of life, there are problems we are not even aware of. To measure them, we use the so-called fulfillment rate. For example: at a fulfillment rate of 99%, a heartbeat would be skipped every 85 seconds, and an average A4 page of text would contain 30 typos. Even a fulfillment rate of 99.9% (which sounds like a lot) still means 22,000 wrong bank bookings per hour and a total of 32,000 missed heartbeats per year. The answer is automation and standardization! These approaches help solve problems we are often not aware of.
With automation and standardization, we can close the innovation gap, i.e., the discrepancy between the innovation businesses need and the innovative capability traditional IT can provide. Standardization means a common IT language, from development to production. To achieve this, Ansible is the best tool: it is "radically simple," has low requirements, needs no agent on the managed systems, and is human-readable.
Ansible
Ansible consists of a control node, a playbook, and an inventory containing servers. The inventory uses an INI or YAML syntax to define the infrastructure, combine servers into groups, and define variables.
[kafkaserver]
web1.server.com ansible_host=13.12.2.1
web2.server.com ansible_host=13.12.2.2
web3.server.com ansible_host=13.12.2.3
[appserver]
app1.server.com ansible_host=10.10.12.1
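Variables can also be set for an entire group rather than per host. A small sketch in the same INI syntax (the variable values here are hypothetical):
[kafkaserver:vars]
ansible_user=deploy
ansible_port=22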
The playbook is written in (human-readable) YAML and maps roles to hosts. It is used for standardization and should ideally be kept in a version control system (VCS).
---
- hosts: kafkaserver   # mapping roles to hosts
  become: true         # run as root
  roles:
    - preflight        # role 1
    - zookeeper        # role 2
    - kafkabroker      # role 3
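From the control node, this playbook is executed with a single command; no agent is required on the target servers (the file names are assumed to match the examples above):
ansible-playbook -i inventory playbook.yml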
A role is a combination of tasks. It is idempotent and should likewise be kept under version control.
---
# preflight confluent role
- name: Add OpenJDK repo
  apt_repository:
    repo: ppa:openjdk-r/ppa
- name: Install Java
  apt:
    name: "openjdk-8-jdk"
    update_cache: yes
...
Kafka
Ansible allows users to manage a Kafka deployment. You can automate different Kafka-related tasks, ensure consistent configurations across a cluster, and easily scale and control the Kafka infrastructure. The two tools also share a birthday of sorts: Ansible was first released in 2012, the same year Kafka graduated to a top-level Apache project. Kafka playbooks are a set of Ansible playbooks and roles that help deploy and manage Kafka clusters with Ansible. These playbooks are intended to simplify the process of setting up and configuring a Kafka cluster, including installing Kafka, ZooKeeper, and other related components. The following inventory could be used to deploy a Kafka cluster:
---
# inventory.yaml
zookeeper:
  hosts:
    host1:
      ansible_host: 13.12.2.1
    host2:
      ansible_host: 13.12.2.2
    host3:
      ansible_host: 13.12.2.3
broker:
  hosts:
    host1:
      ansible_host: 13.12.2.1
    host2:
      ansible_host: 13.12.2.2
    host3:
      ansible_host: 13.12.2.3
The Confluent Kafka playbooks provide a high-level abstraction for managing Kafka infrastructure. They encapsulate the best practices and configuration recommendations provided by Confluent and can be customized to meet specific deployment requirements. You can use the playbooks to assign roles to servers:
- hosts: preflight
  tasks:
    - import_role:
        name: confluent.preflight
- hosts: ssl_CA
  tasks:
    - import_role:
        name: confluent.ssl_CA
- hosts: zookeeper
  tasks:
    - import_role:
        name: confluent.zookeeper
- hosts: broker
  tasks:
    - import_role:
        name: confluent.kafka-broker
- hosts: schema-registry
  tasks:
    - import_role:
        name: confluent.schema-registry
Within one role, it is possible to separate variables (defaults/main.yml) and code (tasks/main.yml):
---
# defaults/main.yml
kafka:
  broker:
    user: "cp-kafka"
    group: "confluent"
    config_file: "/etc/kafka/server.properties"
    systemd_file: "/usr/lib/systemd/system/kafka.service"
    service_name: "kafka"
    datadir:
      - "/var/lib/kafka/data"
    systemd:
      enabled: yes
      state: "started"
    environment:
      KAFKA_HEAP_OPTS: "-Xmx1g"
    config:
      group.initial.rebalance.delay.ms: 0
      log.retention.check.interval.ms: 300000
      num.partitions: 1
      num.recovery.threads.per.data.dir: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.min.isr: 2
      zookeeper.connection.timeout.ms: 6000
      # [...] many more
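Because the data lives in defaults/main.yml, it can be overridden from the inventory without touching the role itself, for example in a group_vars file. A hypothetical sketch; note that Ansible replaces nested dictionaries by default (hash_behaviour=replace), so in practice you would either supply the complete kafka dictionary or enable merge behavior:
---
# group_vars/broker.yml (hypothetical override)
kafka:
  broker:
    config:
      num.partitions: 3   # only effective with hash_behaviour=merge or a full dict
The code side of the role then consumes these variables: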
---
# tasks/main.yml
# [...] tasks to create user, group, and dirs
- name: broker ssl config
  template:
    src: server_ssl.properties.j2
    dest: "{{ kafka.broker.config_file }}"
    mode: 0640
    owner: "{{ kafka.broker.user }}"
    group: "{{ kafka.broker.group }}"
  when: security_mode == "ssl"
  notify:
    - restart kafka
- name: create systemd override file
  file:
    path: "{{ kafka.broker.systemd_override }}"
    owner: "{{ kafka.broker.user }}"
    group: "{{ kafka.broker.group }}"
    state: directory
    mode: 0750   # directories need the execute bit
- name: broker configure service
  systemd:
    name: "{{ kafka.broker.service_name }}"
    enabled: "{{ kafka.broker.systemd.enabled }}"
    state: "{{ kafka.broker.systemd.state }}"
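The notify: restart kafka entry assumes a matching handler inside the role. A minimal sketch of what handlers/main.yml could look like, reusing the variables defined above:
---
# handlers/main.yml
- name: restart kafka
  systemd:
    name: "{{ kafka.broker.service_name }}"
    state: restarted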
Docker
In 2013, Docker entered the market. Docker offers a lot of benefits, such as portability, scalability, security, efficiency, and reproducibility. There is an Ansible module for managing Docker containers. This allows a generic role to be reused multiple times, each time with different variables.
- hosts: zookeeper
  tasks:
    - name: Deploy Zookeeper
      include_role:
        name: confluent_component
        vars_from: zookeeper
- hosts: kafka
  tasks:
    - name: Deploy Kafka
      include_role:
        name: confluent_component
        vars_from: kafka
- hosts: control-center
  tasks:
    - name: Deploy Control-Center
      include_role:
        name: confluent_component
        vars_from: control-center
Generic Docker role, data and code separated:
---
- name: "Start Docker-Container"
  docker_container:
    name: "{{ kafka_component_name }}"
    image: "{{ kafka_component_container_image }}"
    state: "{{ kafka_component_container_state }}"
    restart: "{{ config_changed.changed }}"
    published_ports: "{{ published_ports }}"
    restart_policy: "{{ container_restart_policy }}"
    env_file: "{{ kafka_component_env_file }}"
    volumes: "{{ kafka_component_volumes }}"
---
# vars/zookeeper.yml
kafka_component_name: "zookeeper"
kafka_component_container_image: "confluentinc/cp-zookeeper"
published_ports:
  - 12888:2888
  - 13888:3888
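To deploy the Kafka component with the same generic role, only the variables change. A hypothetical counterpart, with values assumed for illustration:
---
# vars/kafka.yml (hypothetical)
kafka_component_name: "kafka"
kafka_component_container_image: "confluentinc/cp-kafka"
published_ports:
  - 9092:9092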
Docker Compose followed a year later, allowing multiple containers to be deployed together. This is done via a Compose file:
---
version: '2'
services:
  zookeeper:
    image: "confluentinc/cp-zookeeper:latest"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: "confluentinc/cp-kafka:latest"
    depends_on:
      - "zookeeper"
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
      # [...] more kafka broker settings
This is how you can then use Docker Compose in an Ansible playbook:
- hosts: kafka-server
  tasks:
    - name: "Spin up Kafka cluster"
      docker_compose:
        project_src: "cp_kafka"
        state: present
      register: output
    - name: "Ensure stack is running"
      assert:
        that:
          - kafka.cp_kafka_kafka_1.state.running
          - zookeeper.cp_kafka_zookeeper_1.state.running
In the years that followed, many other platforms and vendors came along, such as Kubernetes, Red Hat OpenShift, Rancher ...
Kafka and Ansible
With Ansible, you can not only deploy Kafka, but also manage it. There is an Ansible module for topics and ACLs, and no SSH connection to a remote broker is required. Manage Kafka topics:
---
- name: "create topic"
  kafka_lib:
    resource: 'topic'
    name: 'test'
    partitions: 2
    replica_factor: 1
    options:
      retention.ms: 574930
      flush.ms: 12345
    state: 'present'
    zookeeper: >
      "{{ zookeeper_ip }}:2181"
    bootstrap_servers: >
      "{{ kafka_ip_1 }}:9092,{{ kafka_ip_2 }}:9092"
    security_protocol: 'SASL_SSL'
    sasl_plain_username: 'username'
    sasl_plain_password: 'password'
    ssl_cafile: '{{ content_of_ca_cert_file_or_path_to_ca_cert_file }}'
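The same module can also manage ACLs. A sketch assuming the parameter names from the module's documentation (verify them against the version you install; the principal is a made-up example):
- name: "create acl"
  kafka_lib:
    resource: 'acl'
    acl_resource_type: 'topic'
    name: 'test'
    acl_principal: 'User:alice'   # example principal, assumed
    acl_operation: 'write'
    acl_permission: 'allow'
    state: 'present'
    bootstrap_servers: >
      "{{ kafka_ip_1 }}:9092"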
Without such a module, you would have to fall back on the Confluent REST Proxy and the Kafka command-line tools, which has clear drawbacks: the REST Proxy cannot create topics, you need to implement idempotency manually, and access is limited to one Kafka broker.
# Definition of a topic
topic:
  name: "test"
  partitions: 2
  replica_factor: 1
  configuration:
    retention.ms: 574930
    flush.ms: 12345
---
- name: "Get topic information"
  uri:
    url: "{{ kafka_rest_proxy_url + ':8082/topics/' + topic.name }}"
    status_code: [200, 404]   # a 404 just means the topic does not exist yet
  register: result
- name: "Create new topic"
  command: >-
    kafka-topics --zookeeper {{ zookeeper }}
    --create
    --topic {{ topic.name }}
    --partitions {{ topic.partitions }}
    --replication-factor {{ topic.replica_factor }}
    {% for key, value in topic.configuration.items() %}--config {{ key }}={{ value }} {% endfor %}
  when: result.status == 404   # manual idempotency: create only if missing