Posts Tagged ‘ibmeventstreams’

Talking about IBM Event Streams

Wednesday, September 9th, 2020

We’ve been running a virtual event this week to explain the capabilities of IBM’s Cloud Pak for Integration.

One of these is Event Streams, so I gave an overview of the Event Streams Operator.

But what it really reminded me is that I miss going to conferences and tech events. I don’t want to sound ungrateful for what I’m sure has been a huge amount of work for event organisers in the pivot to online events. It’s great that we can still do events at all, and that organisers are still trying out ways to make it interactive, to enable panels and Q&A sessions.

(more…)

Supporting CI/CD with Kubernetes Operators

Thursday, August 20th, 2020

Operators bring a lot of benefits as a way of managing complex software systems in a Kubernetes cluster. In this post, I want to illustrate one in particular: the way that custom resources (and declarative approaches to managing systems in general) enable easy integration with source control and a CI/CD pipeline.

I’ll be using IBM Event Streams as my example here, but the same principles will be true for many Kubernetes Operators, in particular, the open-source Strimzi Kafka Operator that Event Streams is based on.

(more…)

Installing IBM Event Streams using the kubectl-operator plugin

Thursday, August 13th, 2020

Installing operators in Red Hat OpenShift from the CLI is much easier with the new kubectl-operator plugin. Here’s an example of how you can use it to install the Event Streams Operator.

Installing operators in OpenShift from the CLI is a little fiddly. It’s possible, but you have to create a bunch of custom resources that aren’t entirely intuitive, like Subscriptions and OperatorGroups.

It’s easy if you’re using the OpenShift Console web UI, as it does this all for you so you don’t need to worry about it. But sometimes you want to do things from the command line. And the new kubectl-operator plugin looks like it’ll make that much simpler.

I had a quick play with it this evening, and it let me get the Event Streams operator running with three commands. (Compare this with the OpenShift Console web UI equivalent in my Event Streams demo video).

(more…)

Using MirrorMaker 2

Wednesday, July 15th, 2020

I’ve been talking about MirrorMaker 2 this week – the Apache Kafka tool for replicating data across two Kafka clusters. You can use it to make a copy of messages on your Kafka cluster to a remote Kafka cluster running on a different data centre, and keep that copy up to date in the background.

For the discussion we had, I needed to give examples of how you might use MirrorMaker 2, which essentially meant I spent an afternoon drawing pictures. As some of them were a little pretty, I thought I’d tidy them up and share them here.

We went through several different use cases, but I’ll just describe two examples here.
(more…)

IBM Event Streams v10

Tuesday, June 30th, 2020

On Friday, we released the latest version of IBM Event Streams. This means I’ve been doing a variety of demo sessions to show people what we’ve made and how it works.

Here’s a recording of one of them:

In this session, I did a run-through of the new Event Streams Operator on Red Hat OpenShift, with a very quick intro to some of the features:

00m30s – installing the Operator
02m10s – creating custom Kafka clusters in the OpenShift console
05m10s – creating custom Kafka clusters in IBM Cloud Pak for Integration
08m00s – running the sample Kafka application
08m50s – creating topics
10m20s – creating credentials for client applications
11m45s – automating deployment of event-streaming infrastructure
12m30s – using schemas with the schema registry
13m10s – sending messages with HTTP POST requests
13m45s – viewing messages in the message browser
14m00s – command line administration
14m30s – running Kafka Connect
15m10s – geo-replication for disaster recovery
15m50s – monitoring Kafka clusters in the Event Streams UI
17m10s – monitoring with custom Grafana dashboards
17m30s – alerting using Prometheus

Why are Kafka messages still on the topic after the retention time has expired?

Sunday, February 9th, 2020

We had an interesting Kafka question from an Event Streams user. The answer isn’t immediately obvious unless you know a bit about Kafka internals, and after a little searching I couldn’t find an explanation online, so I thought I’d share the answer here (obviously anonymised and heavily simplified).

What is retention?

Retention is a Kafka feature to help you manage the amount of disk space your topics use.

It lets you specify how long you want Kafka to keep messages on a topic for. You can specify this by time (e.g. “I want messages on this topic to be preserved for at least X days”) or by disk usage (e.g. “I want at least the last X gb of messages on this topic to be preserved”).

After the retention time or disk threshold is exceeded, messages become eligible for being automatically deleted by Kafka.

What was wrong in this case?

screenshot

They had created a topic with a retention time of 7 days.

They had assumed that this meant messages older than 7 days would be deleted.

When they looked at the messages on their topic, they could see some messages older than 7 days were there, and were surprised.

They thought this might mean retention wasn’t working.

(more…)

A run-through of IBM Event Streams

Thursday, January 16th, 2020

I needed to quickly record a demo of what it looks like to get started with Event Streams yesterday.

It’s a little rough around the edges (it was only for an internal event, so the production values were essentially me-talking-at-my-laptop without a lot of planning or editing) but I thought I’d share it here in case I need to point anyone else at it.

Using TensorFlow with IBM Event Streams
(Kafka + Machine Learning = Awesome)

Thursday, October 31st, 2019

In this post, I want to explain how to get started creating machine learning applications using the data you have on Kafka topics.

I’ve written a sample app, with examples of how you can use Kafka topics as:

  • a source of training data for creating machine learning models
  • a source of test data for evaluating machine learning models
  • an ongoing stream of events to make predictions about using machine learning models

I’ll use this post to explain how it works, and how you can use it as the basis of writing your first ML pipeline using the data on your own Kafka topics.

(more…)