Archive for the ‘code’ Category

You need two schemas to deserialize an Avro message… but which two?

Friday, November 17th, 2023

In this post, I want to talk about what happens when you use Avro to deserialize messages on a Kafka topic, why it actually needs two schemas, and what those schemas need to be.

I should start by pointing out that if you’re using a schema registry, you probably don’t need to worry about any of this. In fact, a TLDR for this whole post could be “You should be using a good schema registry and SerDes client”.

But there are times when this may be difficult to do, so knowing how to set up a deserializer correctly is helpful. (Even if you’re doing the right thing and using a schema registry, it is still interesting to poke at some of the details and know what is happening.)

The key thing to understand is that to deserialize binary-encoded Avro data, you need a copy of the schema that was used to serialize the data in the first place [1].

This gets interesting after your topic has been around for a while, and you have messages using a mixture of schema versions on the topic. Maybe over the lifetime of your app, you’ve needed to add new fields to your messages a couple of times.

If you want a consumer application to be able to consume all of the messages on this topic, what does that mean?
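To make that concrete, here is a minimal sketch in Python. It is not the real Avro library: it is a hand-rolled toy that only handles `long` and `string` fields, and the schemas, field names and helper functions (`encode`, `decode` and friends) are all hypothetical names made up for illustration. What it does borrow from Avro is the idea that the binary encoding carries no field names or types, so the bytes only make sense alongside the schema that wrote them (the "writer" schema), and a second "reader" schema describes what the consumer wants to end up with.

```python
import io

# Simplified stand-ins for Avro record schemas (hypothetical example data).
SCHEMA_V1 = {"fields": [
    {"name": "name", "type": "string"},
    {"name": "score", "type": "long"},
]}
# V2 adds a field with a default, so older messages can still be read.
SCHEMA_V2 = {"fields": SCHEMA_V1["fields"] + [
    {"name": "platform", "type": "string", "default": "unknown"},
]}


def write_long(n):
    """Encode an integer as a zigzag, variable-length value."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def read_long(buf):
    """Decode a zigzag, variable-length integer from a byte stream."""
    z, shift = 0, 0
    while True:
        b = buf.read(1)
        if not b:
            raise EOFError("ran out of bytes: wrong writer schema?")
        z |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return (z >> 1) ^ -(z & 1)
        shift += 7


def write_value(value, field_type):
    if field_type == "long":
        return write_long(value)
    raw = value.encode("utf-8")  # strings: length prefix, then UTF-8 bytes
    return write_long(len(raw)) + raw


def read_value(buf, field_type):
    if field_type == "long":
        return read_long(buf)
    length = read_long(buf)
    raw = buf.read(length)
    if len(raw) < length:
        raise EOFError("ran out of bytes: wrong writer schema?")
    return raw.decode("utf-8")


def encode(record, writer_schema):
    """Write field values back to back, in schema order, with no names."""
    return b"".join(write_value(record[f["name"]], f["type"])
                    for f in writer_schema["fields"])


def decode(data, writer_schema, reader_schema):
    """Parse bytes using the writer schema, then shape them for the reader."""
    buf = io.BytesIO(data)
    written = {f["name"]: read_value(buf, f["type"])
               for f in writer_schema["fields"]}
    result = {}
    for f in reader_schema["fields"]:
        if f["name"] in written:
            result[f["name"]] = written[f["name"]]
        elif "default" in f:
            result[f["name"]] = f["default"]  # field added after this message
        else:
            raise ValueError("no value or default for " + f["name"])
    return result


# An old message, written with V1, consumed by an app that expects V2.
msg_v1 = encode({"name": "alice", "score": 42}, SCHEMA_V1)
print(decode(msg_v1, SCHEMA_V1, SCHEMA_V2))
# {'name': 'alice', 'score': 42, 'platform': 'unknown'}
```

In this toy, passing `SCHEMA_V2` as the writer schema for `msg_v1` raises an `EOFError`, because the decoder goes looking for a third field that was never written. With other combinations of schemas the bytes could just as easily be misread without any error at all, which is why picking the right writer schema for each message matters.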


Machine Learning for Kids with EduBlocks

Saturday, July 8th, 2023

Students can now create Machine Learning for Kids projects using EduBlocks – letting them create machine learning Python projects in the browser by dragging and dropping blocks on a canvas.

This is all thanks to a fantastic new contribution from Joshua Lowe.

Here’s a quick run-through to show what this makes possible.


Using Xbox to get started with Kafka Connect & Kafka Streams

Wednesday, May 17th, 2023

It’s easy for developers who aren’t immersed in all-things-Kafka to assume that “Apache Kafka” just means an event backbone: something that hosts topics (and perhaps the client libraries to produce and consume messages using those topics). But Kafka is more than that. It is an ecosystem of tools that enables a complete event-streaming application.

That was the premise of this talk, recorded at Devoxx UK, which I gave to a room of Java developers. I introduced them to two other bits of Kafka: Kafka Connect (for getting data in and out of Kafka topics from external systems) and Kafka Streams (for developing stream processing applications).

Because they were Java developers, I thought the best way to give them a flavour of these tools was to show them the APIs, and walk through an example solution made using the APIs.

The example solution used Kafka tools to process data from Xbox – mostly because I’m a gamer and it made for a fun, if silly, demo.

recording of the talk at


Using Apache Kafka with IBM MQ using Kafka Connect

Thursday, April 20th, 2023

A recording of a demo walkthrough I did about using the Kafka Connect MQ connectors to flow messages between IBM MQ and Apache Kafka.

A few weeks ago, I presented a session at TechCon about IBM MQ and Apache Kafka with David Ware. I spent most of my time running through how to use Kafka Connect with IBM MQ, with a few demos showing different ways to set up and run the kafka-connect-mq-source Connector.

My demos start at around 20 minutes in, but you should listen to David give the context first!

Using weather data in Scratch

Friday, March 31st, 2023

In this post, I want to share an example of Scratch projects that use live weather data.

At the Raspberry Pi Clubs Conference last week, I talked about the idea of Scratch projects that use live data: projects that do something different every time you run them, based on when or where they are run.

I love this idea. It’s something I’ve talked about many times – like when I tried bringing NASA data into Scratch, or when I built Scratch extensions for different web APIs, such as Wikipedia, Twitter, and Spotify.

I think doing this brings a new perspective to Scratch. Live data can bring projects to life.

So I thought I’d share another example: this time, weather data from Open Meteo.


How to make your own Scratch extension

Thursday, March 23rd, 2023

A workshop I prepared for the Raspberry Pi Clubs Conference about how to create your own custom Scratch blocks.

This workshop is a step-by-step guide for how to create a Scratch extension.

I created it for educators and coding group volunteers who would like to customize Scratch for their students by giving them new and unique blocks to create with. In particular, I wanted to make this accessible to people who don’t necessarily think of themselves as developers and wouldn’t otherwise know how to clone the Scratch Team repos and start hacking on them.

I’ve wrapped all the complicated bits in scripts that set everything up, and prepared an online Scratch extension development environment – so everything can be done in a web browser without having to install or configure anything on your own computer.

I’ve included step-by-step instructions for building different types of Scratch extensions, including Scratch blocks based on web APIs, and Scratch blocks based on JavaScript modules from npm.

workshop video on YouTube

Using client quotas with IBM Event Streams

Sunday, February 26th, 2023

In this post, I want to highlight a feature that I often see under-used in IBM Event Streams, and show how you can easily give it a try.

Kafka can enforce quotas to limit the impact that client applications can have on your cluster. To quote the Kafka documentation:

It is possible for producers and consumers to produce/consume very high volumes of data or generate requests at a very high rate and thus monopolize broker resources, cause network saturation and generally DOS other clients and the brokers themselves.

Having quotas protects against these issues and is all the more important in large multi-tenant clusters where a small set of badly behaved clients can degrade user experience for the well behaved ones.

In fact, when running Kafka as a service this even makes it possible to enforce API limits according to an agreed upon contract.


Running IBM Event Streams on a laptop (sort of)

Friday, December 23rd, 2022

How to run a tiny local Kafka cluster using IBM Event Streams images

For local development on Kafka projects, I always run the public open source builds of ZooKeeper and Kafka as Java processes directly on my laptop (similar to steps described in the Apache Kafka Quickstart).

But for a project this week, I needed to verify something with the distribution of Kafka that comes with IBM Event Streams.

I used a simple Docker Compose setup for this. I’ll use this post to share how I did it.