Event-driven architectures with Kafka and Kafka Streams
Introducing event-driven architecture in your application
Event-driven architectures make applications more modular and add real-time responsiveness. This modularity makes applications easier to scale and promotes agility. Unlike batch processing, event-driven architectures bring timeliness and parallelism to workloads.
Developers face many challenges when designing event-driven applications:
Reliability (error handling or retries). Event-driven systems demand meticulous error handling and retries. As events flow through the system, errors are inevitable. Implementing robust error handling mechanisms and intelligent retries ensures resilience in the applications. Strategies like exponential backoff and dead-letter queues are vital safeguards, preventing data loss and ensuring event processing continuity.
Order of event processing. Because event-driven systems introduce parallelism and retries, developers need to ensure that the order of event processing is maintained where it matters.
Consistency of the system. What would have been a single transaction in a monolith now becomes a set of transactions across multiple systems, each with its own boundaries. As developers, we need to ensure the system is eventually consistent.
So, if you have a large monolithic application and you are looking to make it more modular by adopting an event-driven architecture, you might be wondering how to begin the transition. The task can be overwhelming, with a multitude of business objects and workflows that are typically intertwined. And if you identify specific workflows to carve out, which ones should you prioritize?
Transitioning to an event-driven architecture
In the journey towards an event-driven architecture, identifying functional areas that are hotspots in a monolithic codebase is pivotal. These hotspots, characterized by complex code and scalability challenges, serve as prime candidates for migrating to microservices. By isolating these areas and focusing on a specific business flow, organizations can move their monolithic application in phases and reap benefits early.
Identifying entities and workflows
After you have identified an area of concern that needs scalability and efficiency, you need to map out the entities and workflows:
- Outline the business objects and workflows around the capability. A crucial step involves dissecting the application's functionality into distinct business objects and workflows. Understanding the relationships between these entities is paramount.
- Identify any changes in the business objects. For example, changes in a core attribute of a business object that lead to a lifecycle change, or changes in a secondary attribute of the business object.
- Identify all workflows that can happen in the application. Each of these can be modeled as an event that, when it occurs, triggers a reaction.
Let's take an example: a ride booking system where the functionality you are looking to transform is a user requesting a ride.
- The business entities involved in this flow are the rider and the driver personas.
- Changes in these entities (the rider and the driver entities) can be core or secondary attribute changes:
- A driver being available to take a ride versus not being available to take a ride is a core attribute change.
- Changes in the rating of a driver can be a secondary attribute change.
- A rider requesting a ride and a driver accepting or denying a ride are all events. On the event of a rider requesting a ride, the system should locate nearby drivers and send them notifications. On the event of a driver accepting a ride, a booking should be created and notification sent to the rider who requested it.
A meticulous mapping of business objects in your application is the cornerstone of event-driven architecture. Recognizing primary attribute changes and secondary attribute changes enables precise event identification. Changes in these attributes, such as availability or ratings, serve as triggers for events. Typically, a primary attribute change event requires key actions to be taken. For example, if a driver becomes available to accept rides, the system should look for open requests based on the driver's location and send ride requests to the driver.
Designing events
Now that we have an overview of entities and workflows, let's look at how an event should be designed. Use these best practices to carve out events in your application (a short example follows the list):
- Decouple event definitions. Events should encapsulate historical facts, not prescribe future actions. Decoupling event definitions from future processing and avoiding future instructions in event details ensures adaptability. As requirements evolve, responses to the same event can evolve without altering fundamental events.
- Granular event chunks. Events should be broken into manageable, predictable units. The time you take to process an event should be finite and predictable, fostering efficiency and allowing for distributed, parallelized handling.
- Eliminate sequential dependencies. Events must be processed independently, devoid of sequential dependencies. This design principle maximizes concurrency and promotes responsive, scalable systems.
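As an illustration of these principles, an event can be modeled as an immutable record of a past fact. The sketch below is hypothetical (the class and field names are not taken from any specific system); it shows a past-tense, self-contained event with no instructions about future processing:

    // A minimal sketch of a fact-based event: it records what happened (past tense),
    // carries no instructions about what consumers should do next, and is small
    // enough to be processed in a predictable amount of time.
    public record DriverBecameAvailable(
            String eventId,          // unique ID, useful for correlation and tracing
            String driverId,
            double latitude,
            double longitude,
            long occurredAtEpochMs   // event time: when the fact happened in the real world
    ) { }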
Event-driven architecture design patterns
A few of the event-driven architecture design patterns are:
- Publish-subscribe. This is a fundamental pattern where events are published to specific topics or channels, and any number of subscribers can listen to these channels to receive and process the events. It enables decoupling between event producers and consumers. Read more about publish-subscribe messaging in this introductory Kafka article.
- Event sourcing. In this pattern, the state of a system is derived from a sequence of events. Every change to the system's state is captured as an immutable event, very similar to using version control for source code. This pattern is commonly used in systems where audit trails or a complete history of changes is necessary. This pattern also makes it possible to travel back in time and fix inconsistencies in the data. Learn more about the event sourcing pattern in this introduction to event sourcing article.
- Choreography and orchestration. Choreography involves a more decentralized approach where services or components communicate through events without a central controller. Orchestration, on the other hand, employs a central coordinator that manages and directs the flow of events and actions among various components. Read more about these event processing topologies in this event-driven architectures article.
- CQRS (Command Query Responsibility Segregation). This pattern segregates the read and write operations, where commands (changes to data) are separate from queries (retrieving data). This allows for different optimization strategies for reads and writes. Learn more about this architectural pattern in this introductory CQRS article.
- Event-driven workflow. Events trigger and control workflow steps. Each step is initiated by an event and might produce other events as outputs. This pattern is common in systems where a sequence of steps is required in response to specific events.
- Fan-out/Fan-in. Events are broadcast to multiple consumers (fan-out) and then aggregated or collected from these consumers (fan-in) to derive a result. This pattern is often seen in scenarios where a single event triggers multiple downstream actions.
- Event transformation and enrichment. Events might be transformed or enriched before being consumed by other services. For example, adding metadata, translating event formats, or aggregating multiple events into a single event for easier consumption.
Although there are many patterns for implementing an event-driven architecture, they can be categorized as either stateless EDA or stateful EDA.
- Stateless EDA. In a stateless EDA, events are processed without retaining any knowledge of past events or their order. Each event is processed in isolation, without relying on historical context or prior events. It doesn’t matter which consumer processes the message. Stateless architectures are simpler and more scalable as they don't maintain historical data. This approach is efficient for systems where events can be handled independently without considering past event information.
- Stateful EDA. On the other hand, a stateful EDA involves maintaining the history and context of past events. Events are processed in consideration of the system's state and the sequence in which they occurred. Stateful architectures enable complex workflows, long-running transactions, and decision-making based on the system's historical state. Stateful EDAs are suitable for systems that require a memory of past events and their impact on the system's state to make informed decisions. It is extremely important which consumer handles a specific event. In general, implementing a stateful EDA is more complex and needs a more careful design.
Event-driven architecture best practices using Kafka and Kafka Streams
In event-driven architectures, tools like Kafka facilitate seamless event communication. For example, in Kafka, event details indicate facts about what has already happened (the same goes for topic names), not what should be done with the event, because that is future information and can change.
General guidelines and best practices when designing EDA using Kafka technology include:
- Event processing systems allow parallelism while still offering ordering guarantees. How? You likely don't need a strict order across every activity in the system; you only need to maintain order for a subset of events, typically those sharing a key. Deciding what that subset is for your system is a key tenet of designing an EDA system with Kafka (see the producer sketch after this list).
- Breaking events into predictable chunks gives event processing a finite, predictable duration and allows for better optimization with Kafka configurations.
- Consider using a schema registry to manage data serialization formats like Avro, JSON Schema, and Protobuf. A schema registry is a central component in Kafka-based data processing systems that manages the schemas for data that is produced and consumed by Kafka topics. It provides a way to ensure compatibility and consistency of data formats across different producers and consumers within an organization.
- When we use a messaging platform like Kafka, challenges arise in debugging or understanding the journey of an event. Use a unique ID to correlate the journey of an event and log it across all workflows to enable better tracking and troubleshooting.
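To make the ordering and traceability guidance above concrete, here is a minimal sketch using the standard Kafka Java producer. The topic name, key, payload, and header name are illustrative assumptions, not prescribed by Kafka:

    import java.nio.charset.StandardCharsets;
    import java.util.Properties;
    import java.util.UUID;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
        // Keying by riderId means all events for one rider go to the same partition,
        // so Kafka preserves their relative order; events for different riders are
        // still processed in parallel.
        ProducerRecord<String, String> record =
            new ProducerRecord<>("ride-requested", "rider-42", "{\"pickup\":\"downtown\"}");
        // A correlation ID header lets every downstream service log the same ID
        // and trace the event's journey end to end.
        record.headers().add("correlation-id",
            UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
        producer.send(record);
    }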
Let's try to understand how to design a stateful EDA using Kafka Streams with a real-world example. Imagine that, in the banking domain, a credit bureau service needs a weekly transaction history of users, together with profile information, to update customers' credit scores. This involves a series of customer transactions, aggregation of the transactions for the entire week, and enrichment of the report with customer profile information before a unified report is sent downstream. The credit bureau, as a downstream consumer, receives these events, recalculates the credit score, and persists it.

Figure 1. Event transformation and enrichment pattern, showing a typical template of a Kafka Streams application that aggregates and enriches data before sending the events to output topics
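A condensed sketch of the topology in Figure 1 might look like the following. This is not the authors' production code; the topic names and the Profile, Transaction, WeeklySummary, and CreditReport types (and their serdes) are assumptions made for illustration, and TimeWindows.ofSizeWithNoGrace requires a recent Kafka Streams version:

    import java.time.Duration;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Produced;
    import org.apache.kafka.streams.kstream.TimeWindows;

    StreamsBuilder builder = new StreamsBuilder();

    // Customer profiles as a table: the latest profile per customer ID.
    KTable<String, Profile> profiles =
        builder.table("user-profiles", Consumed.with(Serdes.String(), profileSerde));

    builder.stream("transactions", Consumed.with(Serdes.String(), transactionSerde))
        .groupByKey()
        // Aggregate each customer's transactions over a one-week window.
        .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofDays(7)))
        .aggregate(WeeklySummary::new,
                   (customerId, txn, summary) -> summary.add(txn),   // add() returns the updated summary
                   Materialized.with(Serdes.String(), summarySerde))
        .toStream()
        // Drop the window from the key so we can join on the customer ID.
        .map((windowedKey, summary) -> KeyValue.pair(windowedKey.key(), summary))
        // Enrich the weekly summary with the customer's profile.
        .join(profiles, (summary, profile) -> new CreditReport(profile, summary))
        .to("credit-reports", Produced.with(Serdes.String(), reportSerde));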
Having provided an illustrative example of a Kafka Streams use case, let's delve deeper into the intricacies of this architecture and its underlying patterns. By zooming in on each component, we can uncover how Kafka Streams facilitates real-time data processing and streamlines complex workflows within distributed applications.
Kafka Streams is a client library that processes and analyzes high volumes of data in real time. Under the hood, a Kafka Streams application has Kafka producers and consumers, just like a typical Kafka client. However, Kafka Streams enables a stateful EDA with features like fault tolerance and scalability.
Kafka Streams primarily operates within a single Kafka cluster. It's designed to process data from topics within that specific Kafka cluster. The library allows you to consume data from topics, perform processing, and produce results back into topics within the same Kafka cluster. This single-cluster focus simplifies the architecture and operations, ensuring that all data processing occurs within the confines of one Kafka environment.
Achieving stateful EDA without Kafka Streams or another stream processing framework is hard. If a plain Kafka consumer needs to interact with the previous state of events, it would have to keep all the event history in heap memory, and if the key space of the events is large, it can quickly run out of memory. To avoid this, the state must be saved somewhere else, such as local disk or an external store like Redis. But writing to and reading from an external store carries a high risk of data loss and added network latency.
To address this, Kafka Streams persists the state to local disk and to an internal Kafka topic. The internal Kafka topics that store the state are called changelog topics. A changelog topic acts as a persistent log of state changes: it records all modifications made to the local state stores.
Changelog topics serve a crucial role in fault tolerance. If a Kafka Streams application restarts or if a stateful application instance fails, it can restore its state from these changelog topics. By replaying the changelog topic, the application can reconstruct its state and resume processing from where it left off.
Generally, Kafka topics are set up with the topic configuration cleanup.policy=delete and some retention period, but changelog topics use cleanup.policy=compact.
By default, Kafka Streams creates changelog topics with a cleanup policy of compact, which means that Kafka deletes older records with the same key while always retaining at least the latest value for each key. Retaining only the latest information is exactly what state needs, and it keeps reconstruction time short, enabling faster state recovery after failures. With this technique, the application itself holds the entire state (which can be partitioned across multiple application instances). We can think of this state store as a materialized view.
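For reference, a persistent state store with changelog logging can be declared as follows. The store name is an illustrative assumption; the compacted changelog topic that backs it is created automatically by Kafka Streams:

    import java.util.Map;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.state.KeyValueStore;
    import org.apache.kafka.streams.state.StoreBuilder;
    import org.apache.kafka.streams.state.Stores;

    // A persistent (RocksDB-backed) key-value store. Kafka Streams backs it with
    // a compacted changelog topic named <application.id>-customer-state-changelog,
    // which is what allows the state to be rebuilt after a failure or restart.
    StoreBuilder<KeyValueStore<String, String>> storeBuilder =
        Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("customer-state"),
                Serdes.String(),
                Serdes.String())
            .withLoggingEnabled(Map.of());   // logging is on by default; the map holds extra topic configs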
Time in event stream processing
When using Kafka Streams, it is important to understand the different notions of time in the lifecycle of an event:
- Event time is when the event actually happened in the real world. It is the current time (wall-clock time) of the producer's environment if you don't add your own at the source.
- Log append time (also called ingestion time) is when the event is written to a Kafka topic. For most EDA systems, log append time is a close approximation of the event time, assuming the event is written to the Kafka topic immediately after it happens.
- Processing time is when the event is picked up by the consumers for processing.
Kafka Streams uses event time for windowing and aggregations. To see why processing time would be a poor choice, say we have consumer-A and consumer-B, where consumer-B has double the CPU and memory of consumer-A. If both consume the same topic and window on processing time, consumer-B might fit more events into an hour window simply because of its better hardware, which produces inaccurate results and no meaningful insights for the company. If we use event time instead, the windowed aggregations open and close at the desired times regardless of consumer hardware, because the timestamps come from the real world (or its approximation). The flow of events determines when windows start and close, which is why Kafka Streams uses event time and not processing time.
To efficiently keep track of windows, Kafka Streams uses something called stream time, which is the largest event timestamp seen so far by the application. Stream time advances only when a new event is received, it only ever increases and never decreases, and out-of-order events do not alter it:
new_streamTime = max(current_streamTime, current_message_eventTime)
Log append time refers to the time when the event is appended to the Kafka topic. It provides reliability by processing events in the order in which they were appended. This approach simplifies event processing, but it might not reflect the time when the event actually occurred in the real world. Event time, on the other hand, reflects the actual time in the real world, but it can introduce complexity in handling out-of-order events or late-arriving records.
When windowed aggregations are performed on records configured with log append time, no records are lost, even late-arriving ones, because log append time increases monotonically for each record appended on a Kafka broker: no matter what the original record timestamp is, when the record is appended to the Kafka topic, the broker assigns its wall-clock time as the record's timestamp.
Similarly, when windowed aggregations are performed on records that are configured with event time (by default), you will have to account for late arrival records by configuring a grace period time. The records that arrive after the grace period are discarded forever.
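For example, with event time, the grace period for late-arriving records is configured on the window itself. A minimal sketch, with illustrative window and grace durations (the ofSizeAndGrace API is available in recent Kafka Streams versions):

    import java.time.Duration;
    import org.apache.kafka.streams.kstream.TimeWindows;

    // One-hour windows that stay open an extra five minutes for late-arriving
    // records; records arriving after the grace period are dropped.
    TimeWindows windows =
        TimeWindows.ofSizeAndGrace(Duration.ofHours(1), Duration.ofMinutes(5));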
Given these trade-offs, it's important to choose between the two timestamp options:
Accuracy and ordering:
- Log append time might not reflect the actual occurrence of the event, which can lead to potential inaccuracies in calculations in certain use cases.
- Event time allows for accurate windowed calculations based on event timestamps, but out-of-order events require additional handling.
Complexity and configurability:
- Log append time is simpler to configure and requires minimal setup, because the broker timestamps are used directly.
- Event time requires setting timestamps correctly at the producer level (or with a timestamp extractor, as sketched after this list).
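If producers cannot set timestamps reliably, the last point can also be handled on the consuming side with a custom timestamp extractor that pulls the event time out of the payload. A minimal sketch, assuming a hypothetical RideEvent payload with an occurredAt field:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.streams.processor.TimestampExtractor;

    public class RideEventTimestampExtractor implements TimestampExtractor {
        @Override
        public long extract(ConsumerRecord<Object, Object> record, long partitionTime) {
            // Use the event time embedded in the payload; fall back to the
            // previously observed partition time if the payload has none.
            if (record.value() instanceof RideEvent event && event.occurredAt() > 0) {
                return event.occurredAt();
            }
            return partitionTime;
        }
    }

The extractor can then be registered globally through the default.timestamp.extractor configuration property, or per source topic with Consumed.with(...).withTimestampExtractor(...).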
Storing states of real-time events
While building stateful EDA applications, we might need to enrich events with dynamically changing information in real time. As discussed earlier, to optimize performance, this enrichment should happen within the streams application without an explicit I/O call. But the state store that we use to enrich the events might not hold the initial state when we start the application. In such scenarios, we need to first initialize our state store with a snapshot of the original data source.
An initial state snapshot can be loaded from an external data source such as Oracle or MongoDB: write a batch process that extracts the initial state and produces it to a Kafka topic. The streams application first consumes that topic and populates the state store with the initial data, and then subscribes to the Kafka topics that contain the real-time events relevant to your application.
Now, to update the state store with real-time events, there are two approaches:
- Use a single state store that holds the initial snapshot (preload) and is updated with the relevant real-time events that you subscribe to.
  - Pros: Simpler implementation, because it requires managing only one store. Easier query logic, because there is no need to coordinate lookups across multiple stores.
  - Cons: Mixed data, with the snapshot and real-time events co-located, which can make understanding and managing the data more complex.
- Use two different stores: one state store for the initial snapshot (preload) and another for the real-time events. On each enrichment, you first perform a lookup in the real-time events state store; if the key is present you enrich the event, and if it is not present you fall back to the initial snapshot state store (a sketch follows).
  - Pros: Isolation. Keeping the initial snapshot and real-time updates in separate stores isolates historical and real-time data, which improves maintainability and understanding. Debugging is easier because events from the two sources are never merged, so there is no confusion about where an event came from.
  - Cons: Increased complexity, because you have to manage two state stores and coordinate lookups. Query lookups might involve both stores, potentially increasing query complexity.
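A hedged sketch of the two-store lookup order described above, using the Kafka Streams Processor API. The store names and the Transaction, Profile, and EnrichedTransaction types are illustrative assumptions:

    import org.apache.kafka.streams.processor.api.Processor;
    import org.apache.kafka.streams.processor.api.ProcessorContext;
    import org.apache.kafka.streams.processor.api.Record;
    import org.apache.kafka.streams.state.KeyValueStore;

    public class ProfileEnricher implements Processor<String, Transaction, String, EnrichedTransaction> {
        private ProcessorContext<String, EnrichedTransaction> context;
        private KeyValueStore<String, Profile> realtimeStore;
        private KeyValueStore<String, Profile> snapshotStore;

        @Override
        public void init(ProcessorContext<String, EnrichedTransaction> context) {
            this.context = context;
            this.realtimeStore = context.getStateStore("profiles-realtime");
            this.snapshotStore = context.getStateStore("profiles-snapshot");
        }

        @Override
        public void process(Record<String, Transaction> record) {
            // Prefer the real-time store; fall back to the preloaded snapshot.
            Profile profile = realtimeStore.get(record.key());
            if (profile == null) {
                profile = snapshotStore.get(record.key());
            }
            context.forward(record.withValue(new EnrichedTransaction(record.value(), profile)));
        }
    }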
Parallelism in Kafka Streams
In EDA, parallelism refers to the capability of handling multiple events simultaneously across different components or services within the architecture. This parallel processing allows for efficient and scalable event handling with high throughput and low latency. Kafka Streams uses a data flow programming paradigm that lets you express your application as a series of interconnected pipelines, each representing a different step in the processing of your data. This data flow is represented as a directed acyclic graph (DAG) of nodes that process streams of data, which is called a topology. Each node in the topology represents a different processing step: for example, one node might filter the data, another might aggregate it, and yet another might join it with data from another stream.
In stream processing, a few operations, such as grouping the stream by a new key, cause repartitioning, which means Kafka Streams creates a new internal topic for the regrouped stream. This topic is partitioned based on the new key, so the data is redistributed across the partitions.
Each repartitioning splits the topology into sub-topologies, and sub-topologies are processed in parallel for better performance. With a plain consumer API, by contrast, the consumer must complete the end-to-end processing of an event (or batch of events) before moving on to the next, which can be time-consuming and limits throughput.
Kafka Streams enables parallelism with its task allocation capabilities. A task is the smallest unit of work, or parallelism, that Kafka Streams performs. Tasks are the building blocks that execute the stream processing logic, and they are assigned to threads. For example, for a given topology with 80 tasks, if we configure 10 threads, each thread is assigned 8 tasks.
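The thread count is an application configuration property; a minimal sketch, with placeholder application ID and bootstrap server values:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "credit-report-app");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    // With 80 tasks and 10 threads, as in the example above, each thread runs 8 tasks.
    props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 10);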
The total number of tasks created in a streams application is the sum, across sub-topologies, of the number of partitions of each sub-topology's input topic (assuming each sub-topology has a single input topic).
The maximum parallelism can be calculated using the following formula, where s1 through sk are the sub-topologies and N(s) is the number of partitions of the input topic of sub-topology s:

    max_parallelism = total_tasks = N(s1) + N(s2) + ... + N(sk)
If there are multiple input topics, N(s) is equal to the number of partitions of the biggest input topic of sub-topology s. For example, if there are two input topics for the sub-topology s, one with 8 partitions and another with 16 partitions, N(s) would be 16.
There is an upper bound on the number of threads as well: threads beyond the number of tasks simply sit idle, and the number of available cores limits how many threads can actually run in parallel. It is not necessarily optimal to define the maximum number of threads, because context switching can introduce unwanted latency. To achieve the maximum parallelism for your environment, run experiments with realistic load and pick the optimal number of threads for your application.
This is the limit to which stream processing can be scaled. What if, at full capacity, you still experience backlogs and further scaling is required? The only option is to increase the number of partitions of the source topic, which automatically leads to a proportional increase in the number of tasks.
But this is not as simple as it sounds: your input data is partitioned by key, and if you add a partition, the key-to-partition mapping of your input topic changes and the partitioned state store breaks along with it. Therefore, you should over-partition topics from the beginning to avoid this issue. If you haven't done that, you can use the following method to increase the number of partitions:
- Create a new input topic v2 with the desired number of partitions.
- Start a copy of the application with a new application ID. (Every Kafka Streams application is identified by an application ID; consumers running across instances share that application ID, and it is also used as a prefix for the internal repartition and changelog topics.)
- Update the upstream producers to write to the new topic created in step 1.
- Finally, shut down the old application when the backlog on the old input topic reaches 0.
To summarize the best practices to achieve scale in a Kafka Streams application:
- If you have a high throughput application, keep a relatively high number of input topic partitions as that ultimately defines the parallelism.
- Pick an optimal number of threads.
- Avoid network calls. Build the required state within the application itself.
- Use windowing and time-based operations judiciously. Improper use of windows can lead to increased resource consumption.
- Design a streamlined topology that uses state stores judiciously and avoids unnecessary operations. This ensures efficient use of resources and better scalability.
- Do not alter the default compact configuration for changelog topics.
Two common pitfalls in Kafka Streams
Though Kafka Streams is a very powerful and robust tool for streaming applications, we often encounter a few hurdles in its implementation. Let's delve into two prevalent pitfalls and explore strategies to navigate them effectively.
Co-partitioning in Kafka Streams applications
In a Kafka Streams application, as we saw in the above examples, you might want to take information from multiple streams and find related events in these streams. In the banking example, you would have one stream of data containing the transaction history of users and another that has a change log of user profiles. To generate a transaction history with the user profile, you would need to join these streams. The two streams need to be partitioned in a similar fashion for the join to work. This is called co-partitioning.
Let's look a bit deeper at this example. You have a Kafka Streams application that takes in an input stream and, while processing it, enriches it with data from another stream. Let's say the input stream contains events keyed by a user ID, and the events need to be enriched with user profile data. When the Kafka Streams application picks up an event for user ID "X" from partition 0, it must be able to find the enrichment data for that user ID in partition 0 of the internal topic. So, we need to ensure that an event with a given key lands in the same-numbered partition across the two topics.
To achieve this, the following requirements must be met:
- The two topics must have the same number of partitions.
- The partitioning key must be the same.
- The partitioning strategy must be the same in the producers producing to those topics.
In Figure 2, the transactions topic has 16 partitions and the user profile topic has 8 partitions.

Figure 2: Co-partitioning requirements not satisfied
Let’s say there are 2 application instances running in the Kafka Streams consumer group. The input topic partitions are equally distributed between the two application instances.
In this example, event processing for the transaction topic is distributed to application-instance-1 with partitions 0 to 7 and to application-instance-2 with partitions 8 to 15.
Similarly, the state store for the user profile data is distributed across application instance-1 (partitions 0 to 3) and application instance-2 (partitions 4 to 7). Now suppose a new event with a key like 12262342 arrives at application instance-1 and the task associated with transaction partition 2 picks it up. The task performs a lookup only in the state store partition associated with partition 2; because the user profile topic has a different partition count, the key may hash to a different partition there, so the lookup fails to find the key and the enrichment fails.
As shown in Figure 3, we can either increase the number of partitions of the user profiles input topic from 8 to 16, or introduce a repartition construct that creates a new repartition topic with 16 partitions and use it to build the state store without touching the input topic.

Figure 3: Co-partitioning requirements satisfied
Once the topic with 16 partitions is used as the source to build the state store, its partitions are assigned to the application instances in the same way as the transaction topic's partitions, because the partition counts now match. The event being processed and the corresponding state store partition are co-located in the same application instance, so the enrichment succeeds.
Unfortunately, Kafka Streams doesn't error out in such scenarios, so it's easy to miss this detail. The solution is to repartition the topic that has fewer partitions so that both topics end up with the same number of partitions, which ensures that related events are always routed to the same instance; a sketch follows.
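If changing the input topic itself is not an option, the repartition construct from Figure 3 can be expressed in the Kafka Streams DSL. This is a sketch with illustrative topic, store, and type names, not the authors' exact code:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Repartitioned;

    StreamsBuilder builder = new StreamsBuilder();

    // Re-spread the 8-partition user profile topic across 16 partitions so it is
    // co-partitioned with the 16-partition transactions topic, then build the
    // state store (as a table) from the repartitioned stream.
    KTable<String, Profile> profiles =
        builder.stream("user-profiles", Consumed.with(Serdes.String(), profileSerde))
            .repartition(Repartitioned.with(Serdes.String(), profileSerde)
                .withNumberOfPartitions(16))
            .toTable(Materialized.as("user-profiles-store"));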
Windowing in Kafka Streams applications
The windowing (suppress) operator operates based on stream (event) time. Thus, a Kafka Streams topology detects the passage of time only when new events flow through it. The implication is that the aggregation window does not close if the flow of events stops: if there are aggregated results pending in the window, they are not forwarded until some other event flows into the topology and advances the stream time.
There are proposals to implement the windowing (suppress) operator based on absolute wall-clock time so that this does not happen. One approach we considered in our Kafka Streams application is to advance stream time even if there are no incoming events, to push messages out of the window. This is done by attaching a so-called PseudoEventForwarder operator (a custom transformer operator) before the windowing operator. The basic idea is to generate dummy or pseudo events into the topology at regular intervals so that stream time always advances.
In the actual implementation, we generate these dummy events only when needed; that is, when there are real events that have not been forwarded because the window has not closed despite the passage of wall-clock time.
We used the following code to achieve this:
    // If more wall-clock time than the window size plus the grace period has
    // elapsed since the latest record was received, forward a dummy event to
    // advance stream time and close the window.
    if (currentWallClockTime - latestRecordContext.getWallClockTimestamp()
            > timeWindows.sizeMs + timeWindows.gracePeriodMs()) {
        // Forward a dummy event
    }
- For each partition, save the latest record received so far (by comparing against the maximum timestamp seen so far), along with the record's timestamp and the wall-clock time at which the record was received.
- Define a punctuate method in the transformer class that is invoked periodically.
- In each scheduled invocation, check each partition and forward a dummy event if the difference between currentWallClockTime and the wallClockTime at which the latest record was received for that partition is more than windowDuration + gracePeriod.
Basically, this code checks whether the wall-clock time that has passed since the last received event is more than the configured window duration plus the grace period.
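Putting the pieces together, the periodic check can be registered as a wall-clock punctuator in the transformer's init method. This is a simplified sketch of the approach, not the authors' exact implementation; PartitionState, latestPerPartition, and DUMMY_EVENT are illustrative names:

    // Inside the PseudoEventForwarder transformer's init(ProcessorContext context);
    // imports: java.time.Duration, org.apache.kafka.streams.processor.PunctuationType.
    context.schedule(Duration.ofSeconds(30), PunctuationType.WALL_CLOCK_TIME, now -> {
        for (PartitionState state : latestPerPartition.values()) {
            // If the window plus grace period has elapsed since the latest record
            // was received on this partition, forward a dummy event so that stream
            // time advances and the pending window results are flushed.
            if (now - state.wallClockTimestamp() > timeWindows.sizeMs + timeWindows.gracePeriodMs()) {
                context.forward(state.key(), DUMMY_EVENT);
            }
        }
    });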
Summary and next steps
In this article, we explored the process of transitioning from monolithic applications to event-driven architectures, with a particular focus on the pivotal role of Kafka Streams in this transformation. Drawing from our experiences in real-world production systems, we dissected the anatomy of streaming applications, distinguished between stateful and stateless event-driven architectures, and harnessed Kafka Streams for real-time streaming applications. Throughout our exploration, we provided in-depth insights into crucial aspects such as reliability, scalability, common pitfalls, and the utilization of diverse patterns to address specific design challenges.
Please send us your thoughts: Rizwan Dudekula (rizwan.dudekula1@ibm.com), Anjali Tibrewal (anjtibre@in.ibm.com)