Posts Tagged ‘avro’

Processing Apache Avro-serialized Kafka messages with IBM App Connect Enterprise

Monday, October 25th, 2021

IBM App Connect Enterprise (ACE) is a broker for developing and hosting high-throughput, high-scale integrations between a large number of applications and systems, including Apache Kafka.

In this post, I’ll describe how to use App Connect Enterprise to process Kafka messages that were serialized to a stream of bytes using Apache Avro schemas.

Background

Best practice when using Apache Kafka is to define the structure of your Kafka messages using Apache Avro schemas.

(For more detail about this, see my last post, From bytes to objects: describing Kafka events, or the intro to Avro that I wrote a couple of years ago.)

In this post, I’m assuming that you have embraced Avro, and you have Kafka topics with messages that were serialized using Avro schemas.

Perhaps you used a Java producer with an Avro SerDe that handled the serialization automatically for you.

Or your messages are coming from a Kafka Connect source connector, with an Avro converter that is handling the serialization for you.

Or you are doing the serialization yourself, for example if you’re producing Avro-serialized messages from a Python app.

Now you want to use IBM App Connect Enterprise to develop and host integrations for processing those Kafka messages. But you need App Connect to know how to:

  • retrieve the Avro schemas it needs
  • use the schemas to turn the binary stream of bytes on your Kafka topics into structured objects that are easy for ACE to manipulate and process
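As a rough illustration of the second point, here is a minimal Python sketch of what "using a schema to turn bytes into a structured object" involves. It is not how App Connect Enterprise does it: the schema, its field names, and the sample bytes are all made up for this example, and the hand-written decoder only handles the Avro `int`, `long` and `string` types.

```python
import io

# Hypothetical schema, for illustration only (names are assumptions).
SCHEMA = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "customer", "type": "string"},
    ],
}


def read_long(stream):
    """Decode one Avro int/long: a variable-length, zig-zag encoded integer."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo the zig-zag encoding


def read_string(stream):
    """Decode one Avro string: a long length, then that many UTF-8 bytes."""
    length = read_long(stream)
    return stream.read(length).decode("utf-8")


# Only the primitive types needed for this sketch.
READERS = {"int": read_long, "long": read_long, "string": read_string}


def decode_record(schema, data):
    """Read the fields in schema order; the bytes carry no field names."""
    stream = io.BytesIO(data)
    return {f["name"]: READERS[f["type"]](stream) for f in schema["fields"]}


# Made-up message value: the long 42, then the string "alice".
message_bytes = b"\x54\x0aalice"
order = decode_record(SCHEMA, message_bytes)
print(order)
```

The bytes on the topic contain only values, so without the schema there is no way to know where one field ends and the next begins, or what the fields are called.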

(more…)

From bytes to objects: describing Kafka events

Saturday, October 23rd, 2021

The recording of the talk that Kate Stanley and I gave at Kafka Summit Americas is now available.

Events stored in Kafka are just bytes, which is one of the reasons Kafka is so flexible. But when developing a producer or consumer you want objects, not bytes. Documenting and defining events provides a common way to discuss and agree on an approach to using Kafka. It also informs developers how to consume events without needing access to the developers responsible for producing events.

In our talk, we introduced the most popular formats for documenting events that flow through Kafka, such as AsyncAPI, Avro, CloudEvents, JSON schemas, and Protobuf.

We discussed the differences between the approaches and how to decide on a documentation strategy. Alongside the formats, we also touched on the tooling available for the different approaches. Tools for testing and code generation can make a big difference to your day-to-day developer experience.

The talk was aimed at developers who maybe aren’t already documenting their Kafka events or who wanted to see other approaches.


watch the recording on the Kafka Summit website

(more…)

Describing Kafka with AsyncAPI

Friday, November 27th, 2020

In this post, I want to describe how to use AsyncAPI to document how you’re using Apache Kafka. There are already great AsyncAPI “Getting Started” guides, but AsyncAPI supports a variety of protocols, and I haven’t found an introduction written specifically from the perspective of a Kafka user.

I’ll start with a description of what AsyncAPI is.

“an open source initiative … goal is to make working with Event-Driven Architectures as easy as it is to work with REST APIs … from documentation to code generation, from discovery to event management”

asyncapi.com/docs

The most obvious aspect is that it is a way to document how you’re using Kafka topics, but the impact is broader than that: a consistent approach to documentation enables an ecosystem that includes things like automated code generation and discovery.

(more…)

Using Avro schemas from Python apps with IBM Event Streams

Thursday, October 17th, 2019

I’ve written before about how to write a schema for your developers using Kafka. The examples I used before were all in Java, but someone asked me yesterday if I could share some Python equivalents.

The principles are described in the Event Streams documentation, but in short, your Kafka producers use Apache Avro to serialize the message data that you send, and identify the schema that you’ve used in the Kafka message header. In your Kafka consumers, you look at the headers of the messages that you receive to know which schema to retrieve, and use that to deserialize message data.
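The flow described above can be sketched in plain Python, without a Kafka client or an Avro library. Everything here is an assumption made for illustration: the `schema-id` header name, the `greeting-v1` ID, and the dictionary standing in for a schema registry are hypothetical, and the hand-written encoding only covers records made up of string fields.

```python
# In-memory stand-in for a schema registry (hypothetical ID and schema).
REGISTRY = {
    "greeting-v1": {
        "type": "record",
        "name": "Greeting",
        "fields": [{"name": "message", "type": "string"}],
    },
}
SCHEMA_HEADER = "schema-id"  # hypothetical header name


def encode_strings(schema, record):
    """Avro-encode a record of string fields: zig-zag varint length + UTF-8."""
    out = bytearray()
    for field in schema["fields"]:
        raw = record[field["name"]].encode("utf-8")
        z = len(raw) << 1  # zig-zag of a non-negative number
        while z > 0x7F:
            out.append((z & 0x7F) | 0x80)
            z >>= 7
        out.append(z)
        out += raw
    return bytes(out)


def decode_strings(schema, data):
    """Reverse of encode_strings, reading fields in schema order."""
    record, pos = {}, 0
    for field in schema["fields"]:
        length, shift = 0, 0
        while True:
            byte = data[pos]
            pos += 1
            length |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        length >>= 1  # undo zig-zag (lengths are never negative)
        record[field["name"]] = data[pos:pos + length].decode("utf-8")
        pos += length
    return record


def produce(schema_id, record):
    """Serialize the record and name the schema used in a message header."""
    headers = [(SCHEMA_HEADER, schema_id.encode("utf-8"))]
    return headers, encode_strings(REGISTRY[schema_id], record)


def consume(headers, value):
    """Use the header to pick the schema, then deserialize the value."""
    schema_id = dict(headers)[SCHEMA_HEADER].decode("utf-8")
    return decode_strings(REGISTRY[schema_id], value)


headers, value = produce("greeting-v1", {"message": "hello"})
received = consume(headers, value)
print(headers, value, received)
```

In a real app, the registry lookup would be a call to the schema registry, and an Avro library would replace the hand-written encoding.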

(more…)

How to write your first Avro schema

Saturday, July 20th, 2019

Any time there is more than one developer using a Kafka topic, they will need a way to agree on the shape of the data that will go into messages. The most common way to document the schema of messages in Kafka is to use the Apache Avro serialization system.

This post is a beginner’s guide to writing your first Avro schema, with a few tips for how to use it in your Kafka apps.
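To give a flavour of what a schema looks like, here is a small sketch that builds one as a Python dictionary and prints it as the JSON you would save in an `.avsc` file. The record name, namespace and fields are hypothetical, chosen only for this example.

```python
import json

# Hypothetical schema: every name here is an assumption for illustration.
order_schema = {
    "type": "record",
    "name": "Order",
    "namespace": "com.example.orders",
    "doc": "An order placed by a customer.",
    "fields": [
        {"name": "id", "type": "long", "doc": "Unique order ID."},
        {"name": "customer", "type": "string"},
        # An optional field: a union with "null" listed first, defaulting to null.
        {"name": "note", "type": ["null", "string"], "default": None},
    ],
}

# Avro schemas are JSON documents, so this is what goes in the .avsc file.
schema_json = json.dumps(order_schema, indent=2)
print(schema_json)
```

Listing `"null"` first in the union matters, because the default value of a union field has to match the first type in the union.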

(more…)