Kafka and Ansible

Automating Kafka with Ansible

In all areas of life, there are many problems we are not even aware of. To quantify them, we can use the so-called fulfillment rate. For example: at a fulfillment rate of 90%, the heart would skip a beat every 85 seconds, and an average A4 page of text would contain 30 typos. Even a fulfillment rate of 99.9% (which sounds like a lot) still means 22,000 incorrect bank bookings per hour and a total of 32,000 missed heartbeats per year. The answer is automation and standardization! These approaches help solve problems we are often not even aware of.

With automation and standardization, we can close the Innovation Gap, i.e., the discrepancy between the innovation businesses need and the innovative capability traditional IT can provide. Standardization means a common IT language, from development to production. Ansible is the best tool to achieve this: it is "radically simple," has low requirements, needs no agent, and is human-readable.

Ansible

Ansible consists of a control node, a playbook, and an inventory containing servers. The inventory uses INI or YAML syntax to define the infrastructure, combine servers into groups, and define variables.

[kafkaserver]
web1.server.com ansible_host=13.12.2.1
web2.server.com ansible_host=13.12.2.2
web3.server.com ansible_host=13.12.2.3

[appserver]
app1.server.com ansible_host=10.10.12.1
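
Variables can also be attached to a group directly in the INI inventory. A minimal sketch (the variable names and values here are illustrative assumptions, not from the original setup):

[kafkaserver:vars]
ansible_user=deploy
kafka_heap_opts=-Xmx1g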


The playbook is written in (human-readable) YAML and maps roles to hosts. It is used for standardization and should ideally be kept in a version control system (VCS).

---
- hosts: kafkaserver   # mapping roles to hosts
  become: true         # run as root
  roles:
    - preflight        # role 1
    - zookeeper        # role 2
    - kafkabroker      # role 3


A role is a collection of tasks. It is idempotent and, like the playbook, should ideally be kept in a VCS.


---
# preflight confluent role
- name: Add open JDK repo
  apt_repository:
    repo: ppa:openjdk-r/ppa

- name: Install Java
  apt:
    name: "openjdk-8-jdk"
    update_cache: yes
...

Kafka

Ansible allows users to manage a Kafka deployment. You can automate various Kafka-related tasks, ensure consistent configurations across a cluster, and easily scale and control the Kafka infrastructure. Ansible itself was launched in 2012, around the same time Kafka was establishing itself. The Confluent Kafka playbooks are a set of Ansible playbooks and roles that help deploy and manage Kafka clusters with Ansible. These playbooks are intended to simplify the process of setting up and configuring a Kafka cluster, including installing Kafka, ZooKeeper, and other related components.

The following inventory could be used to deploy a Kafka cluster (here, the zookeeper and broker groups share the same three hosts):

---
# inventory.yaml
zookeeper:
  hosts:
    host1:
      ansible_host: 13.12.2.1
    host2:
      ansible_host: 13.12.2.2
    host3:
      ansible_host: 13.12.2.3
broker:
  hosts:
    host1:
      ansible_host: 13.12.2.1
    host2:
      ansible_host: 13.12.2.2
    host3:
      ansible_host: 13.12.2.3


The Confluent Kafka playbooks provide a high-level abstraction for managing Kafka infrastructure. They encapsulate the best practices and configuration recommendations provided by Confluent and can be customized to meet specific deployment requirements. You can use the playbooks to assign roles to servers:


- hosts: preflight
  tasks:
    - import_role:
        name: confluent.preflight

- hosts: ssl_CA
  tasks:
    - import_role:
        name: confluent.ssl_CA

- hosts: zookeeper
  tasks:
    - import_role:
        name: confluent.zookeeper

- hosts: broker
  tasks:
    - import_role:
        name: confluent.kafka-broker

- hosts: schema-registry
  tasks:
    - import_role:
        name: confluent.schema-registry
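
Assuming the inventory above is saved as inventory.yaml and this play as kafka.yml (both file names are illustrative assumptions), the whole stack could then be rolled out with: ansible-playbook -i inventory.yaml kafka.yml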


Within one role, it is possible to separate variables …


---
# defaults/main.yml
kafka:
  broker:
    user: "cp-kafka"
    group: "confluent"
    config_file: "/etc/kafka/server.properties"
    systemd_file: "/usr/lib/systemd/system/kafka.service"
    service_name: "kafka"
    datadir:
      - "/var/lib/kafka/data"
    systemd:
      enabled: yes
      state: "started"
    environment:
      KAFKA_HEAP_OPTS: "-Xmx1g"
    config:
      group.initial.rebalance.delay.ms: 0
      log.retention.check.interval.ms: 300000
      num.partitions: 1
      num.recovery.threads.per.data.dir: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.min.isr: 2
      zookeeper.connection.timeout.ms: 6000
      # [...] many more


… and code:


---
# tasks/main.yml
# [...] tasks to create user, group, and dirs

- name: broker ssl config
  template:
    src: server_ssl.properties.j2
    dest: "{{ kafka.broker.config_file }}"
    mode: 0640
    owner: "{{ kafka.broker.user }}"
    group: "{{ kafka.broker.group }}"
  when: security_mode == "ssl"
  notify:
    - restart kafka

- name: create systemd override directory
  file:
    path: "{{ kafka.broker.systemd_override }}"
    owner: "{{ kafka.broker.user }}"
    group: "{{ kafka.broker.group }}"
    state: directory
    mode: 0750 # directories need the execute bit to be traversable

- name: broker configure service
  systemd:
    name: "{{ kafka.broker.service_name }}"
    enabled: "{{ kafka.broker.systemd.enabled }}"
    state: "{{ kafka.broker.systemd.state }}"

Docker

In 2013, Docker entered the market. Docker offers many benefits, such as portability, scalability, security, efficiency, and reproducibility. There is an Ansible module for managing Docker containers, which allows a generic role to be reused multiple times with different variables each time: the vars_from parameter of include_role loads a variables file from the role's vars/ directory.

- hosts: zookeeper
  tasks:
    - name: Deploy Zookeeper
      include_role:
        name: confluent_component
        vars_from: zookeeper

- hosts: kafka
  tasks:
    - name: Deploy Kafka
      include_role:
        name: confluent_component
        vars_from: kafka

- hosts: control-center
  tasks:
    - name: Deploy Control-Center
      include_role:
        name: confluent_component
        vars_from: control-center


Generic Docker role, data and code separated:

---
- name: "Start Docker container"
  docker_container:
    name: "{{ kafka_component_name }}"
    image: "{{ kafka_component_container_image }}"
    state: "{{ kafka_component_container_state }}"
    restart: "{{ config_changed.changed }}"
    published_ports: "{{ published_ports }}"
    restart_policy: "{{ container_restart_policy }}"
    env_file: "{{ kafka_component_env_file }}"
    volumes: "{{ kafka_component_volumes }}"


---
# vars/zookeeper.yml
kafka_component_name: "zookeeper"
kafka_component_container_image: "confluentinc/cp-zookeeper"
published_ports:
  - 12888:2888
  - 13888:3888

Docker Compose followed a year later, allowing multiple containers to be deployed together. This is done via a compose file:

---
version: '2'

services:
  zookeeper:
    image: "confluentinc/cp-zookeeper:latest"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: "confluentinc/cp-kafka:latest"
    depends_on:
      - "zookeeper"
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
      # [...] more kafka broker settings


This is how you can then use Docker Compose in an Ansible playbook:


- hosts: kafka-server
  tasks:
    - name: "Spin up Kafka cluster"
      docker_compose:
        project_src: "cp_kafka"
        state: present # 'present' runs `docker-compose up`; 'absent' would tear the stack down
      register: output

    - name: "Ensure stack is running"
      assert:
        that:
          - kafka.cp_kafka_kafka_1.state.running
          - zookeeper.cp_kafka_zookeeper_1.state.running
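
The docker_compose module reports the state of each compose service back to Ansible as facts named after the services, which is what the assert task above checks (kafka and zookeeper within the cp_kafka project).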


In the years that followed, many other container platforms came along, such as Kubernetes, Red Hat OpenShift, and Rancher.

Kafka and Ansible

With Ansible, you can not only deploy Kafka but also manage it. There is an Ansible module for topics and ACLs, and no SSH connection to a remote broker is required.

---
- name: "create topic"
  kafka_lib:
    resource: 'topic'
    name: 'test'
    partitions: 2
    replica_factor: 1
    options:
      retention.ms: 574930
      flush.ms: 12345
    state: 'present'
    zookeeper: "{{ zookeeper_ip }}:2181"
    bootstrap_servers: "{{ kafka_ip_1 }}:9092,{{ kafka_ip_2 }}:9092"
    security_protocol: 'SASL_SSL'
    sasl_plain_username: 'username'
    sasl_plain_password: 'password'
    ssl_cafile: '{{ content_of_ca_cert_file_or_path_to_ca_cert_file }}'
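
The same module can also manage ACLs. A sketch following the module's documented ACL parameters (the principal, operation, and connection values here are illustrative):

---
- name: "create acl"
  kafka_lib:
    resource: 'acl'
    acl_resource_type: 'topic'
    name: 'test'
    acl_principal: 'User:alice'
    acl_operation: 'write'
    acl_permission: 'allow'
    state: 'present'
    bootstrap_servers: "{{ kafka_ip_1 }}:9092"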


Manage Kafka topics:

Topics can also be managed without this module, via the Kafka REST Proxy and the command-line tools, but with drawbacks: the REST Proxy cannot create topics, idempotency has to be handled manually, and access is limited to one Kafka broker.


# Definition of topic
topic:
  name: "test"
  partitions: 2
  replica_factor: 1
  configuration:
    retention.ms: 574930
    flush.ms: 12345


---
- name: "Get topic information"
  uri:
    url: "{{ kafka_rest_proxy_url }}:8082/topics/{{ topic.name }}"
    status_code: [200, 404] # 404 simply means the topic does not exist yet
  register: result

- name: "Create new topic" # manual idempotency: only create if the topic was not found
  command: >-
    kafka-topics --zookeeper {{ zookeeper }}
    --create
    --topic {{ topic.name }}
    --partitions {{ topic.partitions }}
    --replication-factor {{ topic.replica_factor }}
    {% for key, value in topic.configuration.items() %} --config {{ key }}={{ value }}{% endfor %}
  when: result.status == 404


Ansible Training

ATIX offers Ansible training courses for beginners and advanced users, where participants learn how to use Ansible as a configuration management tool. Participants find out how to manage their infrastructure more efficiently with Ansible and learn about the best ways to create, use, and maintain Ansible roles, inventories, and playbooks.


ATIX-Crew

The ATIX crew is made up of people working in a variety of areas: consulting, development/engineering, support, sales, and marketing.