Archive for the ‘code’ Category

Using Kafka Streams for a Kafka Event Projection

Monday, December 2nd, 2024

In this post, I’ll walk through a sample implementation that uses Kafka Streams to maintain an Event Projection. I’ll use it to illustrate when this approach is a suitable choice.

I’ve written similar Event Projection posts about sample implementations that use an in-memory lookup table and a PostgreSQL database.

The objective for this demo

I introduced the pattern of Event Projections in Comparing approaches to maintaining an Event Projection from Kafka topics.

I also explained the scenario that I’ve been using for each of my Event Projections demos. If you haven’t seen my other posts, it may help to go back and see the scenario detail and motivation first.

In short, I’m showing how to maintain a projection of two Kafka topics (one based on the event key, the other based on an attribute in the event payload). And I’m showing how an application could make an HTTP/REST call to retrieve the data from the most recent event that matches some query.

At a high level, the goal for this demo is to:

  • use Kafka Streams to maintain a projection of the Kafka topics
  • provide an HTTP/REST API for querying the projection

(more…)

Using a database for a Kafka Event Projection

Friday, November 29th, 2024

In this post, I’ll walk through a sample implementation that uses a database to maintain an Event Projection. I’ll use it to illustrate when this approach is a suitable choice.

The objective for this demo

In Comparing approaches to maintaining an Event Projection from Kafka topics, I introduced the pattern of Event Projections.

I also introduced the scenario that I’ll be using in these demos. Please see that post for the detail and motivation, but to recap, I want to maintain a projection of two Kafka topics (one based on the event key, the other based on an attribute in the event payload).

In both cases, I want to be able to make an HTTP/REST call to retrieve the data from the most recent event that matches my query.

At a high level, the goal was to:

  • use Kafka Connect JDBC sink connectors to maintain a database projection of the Kafka topics
  • provide an HTTP/REST API for querying the projection
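
As a rough illustration of what "maintaining a database projection" means, here is a minimal sketch in Python. It is not the Kafka Connect JDBC sink connector and it is not PostgreSQL: it uses the standard library's sqlite3 module as a stand-in, and it assumes the projection is written upsert-style, so each lookup value only ever has one row holding the most recent event. The table name, column names, and sample events are all hypothetical.

```python
import json
import sqlite3

# In-memory SQLite database standing in for the real projection database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projection ("
    "  lookup_key TEXT PRIMARY KEY,"   # hypothetical column: the value queries search by
    "  payload    TEXT NOT NULL"       # hypothetical column: the latest event, as JSON
    ")"
)


def upsert(lookup_key, payload):
    """Store an event, replacing any earlier event with the same lookup key."""
    conn.execute(
        "INSERT OR REPLACE INTO projection (lookup_key, payload) VALUES (?, ?)",
        (lookup_key, json.dumps(payload)),
    )


def latest(lookup_key):
    """Return the most recent event stored for a lookup key, or None."""
    row = conn.execute(
        "SELECT payload FROM projection WHERE lookup_key = ?", (lookup_key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# Two events with the same key: only the second survives in the projection.
upsert("room-1", {"temperature": 20.5})
upsert("room-1", {"temperature": 21.0})
print(latest("room-1"))
```

The HTTP/REST API then only needs to run the SELECT in `latest` for whatever lookup value the caller provides.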

(more…)

Using an in-memory lookup table for a Kafka Event Projection

Thursday, November 28th, 2024

In this post, I’ll walk through a sample implementation of the simplest way to maintain an Event Projection: an in-memory lookup table. I’ll use this sample to illustrate when this approach is a suitable choice.

The objective for this demo

In Comparing approaches to maintaining an Event Projection from Kafka topics, I introduced the pattern of Event Projections.

I also introduced the scenario that I’ll be using in these demos. Please see that post for the detail and motivation, but to recap: I will maintain a projection of the data from two Kafka topics (one based on the event key, the other based on an attribute in the event payload).

In both cases, I want to be able to make an HTTP/REST call to retrieve the data that was in the most recent event to match my query.

At a high level, the goal was to create a single server that will:

  • subscribe to my Kafka topics
  • maintain an in-memory lookup of the relevant data
  • provide an HTTP/REST API for querying the projection
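
The lookup part of that list can be sketched in a few lines of Python. This is only an illustration of the idea, not the code from the demo: it leaves out the Kafka consumer and the HTTP server, and the class name, the `sensor` attribute, and the sample events are all hypothetical. One lookup is driven by the event key and the other by an attribute in the event payload, matching the two topics in the scenario.

```python
from typing import Any, Callable, Dict, Optional


class LatestEventLookup:
    """Keeps only the most recent event payload for each lookup value."""

    def __init__(self, lookup_value: Callable[[Optional[str], Dict[str, Any]], Any]):
        # Function that decides which value an event is filed under.
        self._lookup_value = lookup_value
        self._latest: Dict[Any, Dict[str, Any]] = {}

    def apply(self, key: Optional[str], payload: Dict[str, Any]) -> None:
        """Record one consumed event; later events overwrite earlier ones."""
        value = self._lookup_value(key, payload)
        if value is not None:
            self._latest[value] = payload

    def get(self, value: Any) -> Optional[Dict[str, Any]]:
        """Return the most recent event for a lookup value, or None."""
        return self._latest.get(value)


# First topic: projected by the event key.
by_key = LatestEventLookup(lambda key, payload: key)
# Second topic: projected by a payload attribute ("sensor" is a made-up name).
by_attribute = LatestEventLookup(lambda key, payload: payload.get("sensor"))

by_key.apply("room-1", {"temperature": 20.5})
by_key.apply("room-1", {"temperature": 21.0})
by_attribute.apply(None, {"sensor": "A", "temperature": 19.0})
by_attribute.apply(None, {"sensor": "A", "temperature": 18.5})

print(by_key.get("room-1"))
print(by_attribute.get("A"))
```

Because everything lives in a dictionary, the projection is lost when the process stops and has to be rebuilt by re-reading the topics.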

For demo purposes, my “application” will be curl, so I can illustrate querying the projection from the command line.

(more…)

Comparing approaches to maintaining an Event Projection from Kafka topics

Thursday, November 28th, 2024

This is the first in a series of posts exploring different approaches to implementing the Event Projections pattern with Apache Kafka.

In this first post, I’ll introduce what Event Projections are, and outline some of the benefits of the Event Projections pattern.

Finally, I’ll introduce the scenario that I’ll be using to illustrate the pros and cons of different approaches in later posts.

(more…)

Social media updates with Kafka Connect

Tuesday, November 19th, 2024

In this post, I’ll show how to bring posts from open social media networks (Bluesky and Mastodon) into Kafka using Kafka Connect source connectors.

My goal is to be able to populate a Kafka topic with status updates posted to social media.

Rather than trying to do this with the full firehose of all status updates, I do it with status updates that match a search term or hashtag.

For example, the screenshot above shows a Kafka topic with posts from Bluesky that mention the term “xbox”.

(more…)

Creating custom record builders for the Kafka Connect MQ Source Connector

Monday, October 28th, 2024

In this post, I want to share an example of handling bespoke structured messages with the Kafka Connect MQ Source Connector.

The MQ Source Connector gets data from MQ messages and produces it as events on Kafka topics. The default record builder makes a copy of the data as-is. For example, this can mean taking a JMS TextMessage from MQ and producing a string to Kafka. Or it can mean taking a JMS BytesMessage from MQ and producing a byte array to Kafka.

In my last post, I showed an example of using the XML record builder to read XML documents from MQ and turn them into structured Kafka Connect records. From this point, I could choose the format in which the data is produced to Kafka (e.g. JSON or Avro) by choosing an appropriate value converter (e.g. org.apache.kafka.connect.json.JsonConverter or io.apicurio.registry.utils.converter.AvroConverter).

But what if your MQ messages have a custom structure, but you still want Kafka Connect to be able to parse your messages and output them to Kafka in any format of your choice?

In that case, you need to use a record builder that can correctly parse your MQ messages. In this post, I’ll explain what that means, show you how to create one, and share a sample you can use to get started.
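
To make "correctly parse your MQ messages" a little more concrete, here is a minimal Python sketch of the parsing idea only. It is not a record builder: real record builders for the connector are Java classes, and this sketch does not use any connector API. The pipe-delimited message format, the field names, and the sample message are all invented for illustration.

```python
import json

# Hypothetical bespoke format: type|order id|product|quantity
FIELDS = (
    ("type", str),
    ("order_id", int),
    ("product", str),
    ("quantity", int),
)


def parse_message(text):
    """Turn one bespoke text message into a structured record (a dict)."""
    parts = text.strip().split("|")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    # Pair each field definition with its raw value and convert the type.
    return {name: convert(part) for (name, convert), part in zip(FIELDS, parts)}


record = parse_message("ORDER|1234|widget|3")

# Once the message is structured, the output format is a separate choice.
print(json.dumps(record))
```

The point of the split is the same as in Kafka Connect: one piece of code understands the custom structure, and a separate converter decides how the structured result is serialized.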

(more…)

Analysing IBM MQ messages in IBM Event Processing

Sunday, October 27th, 2024

In this post, I’ll walk through a demo of using IBM Event Processing to create an Apache Flink job that calculates summaries of messages from IBM MQ queues.

This is a high-level overview of the demo:

  • A JMS/Jakarta application puts XML messages onto an MQ queue
  • A JSON version of these messages is copied onto a Kafka topic
  • The messages are processed by a Flink job, which outputs JSON results onto a Kafka topic
  • An XML version of the results is copied onto an MQ queue
  • The results are received by a JMS/Jakarta application
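
The XML-to-JSON and JSON-to-XML steps in that list are just changes of shape. As a rough sketch, assuming flat documents with no nesting or attributes, the two conversions look like this in Python. This is not how the demo performs the copy; the element names and sample documents are hypothetical, and values read from XML stay as strings.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_json(xml_text):
    """Convert a flat XML document into a JSON object string."""
    root = ET.fromstring(xml_text)
    # Each child element becomes one JSON property (values remain strings).
    return json.dumps({child.tag: child.text for child in root})


def json_to_xml(json_text, root_tag="result"):
    """Convert a flat JSON object string into an XML document string."""
    root = ET.Element(root_tag)
    for name, value in json.loads(json_text).items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(xml_to_json("<order><id>42</id><product>widget</product></order>"))
print(json_to_xml('{"count": 3}'))
```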

I’ve added instructions for how you can create a demo like this for yourself to my demos repo on GitHub.

The rest of this post is a walkthrough and explanation of how it all works.

(more…)

Analysing Wikipedia edits with IBM Event Processing

Monday, October 14th, 2024

In this post, I’ll share a demo I gave today to explain some of the processing nodes in the palette of IBM Event Processing.

I’ve found that demonstrations of Event Processing are easier to understand when I don’t need to explain the stream of events I’m processing in the first place. This means I’m always looking for interesting real-world event streams that are widely understood, as they can make for the most effective demos.

With this in mind, today I tried explaining a few of the Event Processing nodes by using them with a live stream of events representing pages that are being created and edited in the English Wikipedia.


Each event contains:

  • title of the page
  • who made the edit (user ID if logged in, or IP address if anonymous)
  • whether this was the creation of a new page or an edit of an existing page

Every edit on Wikipedia results in an event on the Kafka topic, so there are typically a few events a second. It’s not a super-high-throughput topic in Kafka terms, but there are enough events to try out interesting ideas.


Here are a few of the demos I gave today.

This is by no means an exhaustive list of what you could do with this data, but it was enough to let me show what the most commonly-used tools in the palette can do.

(more…)