Intro
Imagine you run a city e-bike hire scheme.
Let’s say that you’ve instrumented your bikes so you can track their location and battery level.
When a bike is on the move, it emits periodic updates to a Kafka topic, and you use these events for a range of maintenance, logistics, and operations reasons.
You also have other Kafka topics, such as a stream of events with weather sensor readings covering the area of your bike scheme.
Do you know how to use predictive models to forecast the likely demand for bikes in the next few hours?
Could you compare these forecasts with the actual usage that follows, and use this to identify unusual demand?
Time series models
A time series is how a machine learning engineer or data scientist would describe a dataset that consists of data values, ordered sequentially over time, and labelled with timestamps.
A time series model is a type of machine learning model designed to analyze this kind of sequential, timestamped data. These models are used to predict future values and to identify anomalies.
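To make "predict future values" and "identify anomalies" concrete, here is a minimal sketch in plain Python. It is not a real time series model: it swaps in a naive moving average as the forecast, and flags a value as unusual when it lands far from the forecast made from the values just before it. The function names, the hourly hire counts, the window size, and the threshold are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def forecast_next(values, window=3):
    """Naive forecast: the mean of the last `window` observations."""
    return mean(values[-window:])


def find_anomalies(values, window=3, threshold=3.0):
    """Return the indices of values that sit far from the moving-average
    forecast built from the `window` values immediately before them.
    Assumes window >= 2, because stdev needs at least two points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        predicted = mean(history)
        # Fall back to 1.0 when the history is perfectly flat (stdev of 0).
        spread = stdev(history) or 1.0
        if abs(values[i] - predicted) > threshold * spread:
            anomalies.append(i)
    return anomalies


# Hypothetical hourly bike-hire counts, with one unusual spike.
hires = [20, 22, 21, 23, 22, 60, 21, 22]

next_hour = forecast_next(hires)
unusual = find_anomalies(hires)
print(next_hour, unusual)
```

A proper time series model would also account for trend and seasonality (rush hours, weekends, weather), but the shape of the task is the same: forecast from history, then compare the forecast with what actually happened.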
For those of us used to working with Kafka topics, the machine learning definition of a “time series” sounds exactly like our definition of a Kafka topic. A Kafka topic is a sequentially ordered set of data values, each labelled with a timestamp.
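As a rough illustration of that overlap, the sketch below treats a handful of topic events as a time series by grouping them into fixed one-minute intervals. It does not use a Kafka client: the `Record` class, the `battery_pct` field, and the sample events are hypothetical stand-ins for the messages a consumer would hand you, assuming each carries a timestamp in epoch milliseconds.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for one event consumed from a topic."""
    timestamp_ms: int   # event timestamp, in epoch milliseconds
    battery_pct: float  # illustrative payload field


def to_time_series(records, interval_ms=60_000):
    """Group records into fixed intervals and average each interval,
    returning (interval_start_ms, mean_value) pairs in time order."""
    buckets = defaultdict(list)
    for record in records:
        # Round the timestamp down to the start of its interval.
        start = record.timestamp_ms - (record.timestamp_ms % interval_ms)
        buckets[start].append(record.battery_pct)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Made-up events from a single bike.
events = [
    Record(0, 90.0),
    Record(30_000, 88.0),
    Record(60_000, 86.0),
    Record(150_000, 80.0),
]

series = to_time_series(events)
print(series)
```

Averaging within each interval is only one possible choice. Counting events per interval, or keeping the last value seen, may suit other kinds of topic better.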