In this post, I describe how event streams can be used as a source of training data for machine learning models.
I spoke at Current last week, giving a talk about how artificial intelligence and machine learning are most commonly used with Kafka topics. I had a lot to say, so I haven't managed to write up all of my slides in one go – this post covers the last section of the talk.
The full series covers:
- the building blocks used in AI/ML Kafka projects
- how AI/ML is used to augment event stream processing
- how agentic AI is used to respond autonomously to events
- how events can provide real-time context to agents
- how events can be used as a source of training data for models (this post)

The talk covered the four main patterns for using AI/ML with events. This post is about the fourth: using events as a source of training data for models. It is perhaps the simplest and longest-established approach – I've been writing about it for years, long pre-dating the current generative-AI-inspired interest.
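
To make the pattern concrete, here is a minimal sketch (not taken from the talk) of replaying a Kafka topic to build a training dataset. The topic name, broker address, and event schema are all assumptions for illustration, and it uses the confluent-kafka Python client:

```python
# A minimal sketch of the pattern: replay events from a Kafka topic and
# collect them into a training dataset. Assumes a hypothetical topic named
# "transactions" containing JSON-encoded events with "features" and
# "label" fields.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # placeholder broker address
    "group.id": "training-data-export",
    "auto.offset.reset": "earliest",         # replay the topic from the start
})
consumer.subscribe(["transactions"])

training_rows = []
try:
    while len(training_rows) < 10_000:       # stop after a sample of events
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            continue                          # skip malformed deliveries
        event = json.loads(msg.value())
        # Each event becomes one labelled training example; the field
        # names here stand in for whatever schema your events use.
        training_rows.append((event["features"], event["label"]))
finally:
    consumer.close()

# training_rows can now be written out (e.g. to CSV or Parquet) and used
# to train a model with any ML framework.
```

The key idea is that the topic itself acts as the system of record: because Kafka retains the event history, you can rebuild or extend the training set at any time simply by re-consuming from the earliest offset.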