Using OpenWhisk in Machine Learning for Kids

July 28th, 2019

I’ve moved a couple of bits of Machine Learning for Kids into OpenWhisk functions. In this post, I’ll describe what I’m trying to solve by doing this, and what I’ve done.

Background

I’ve talked before about how I implemented Machine Learning for Kids, but the short version is that most of it is a Node.js app, hosted in Cloud Foundry so I can easily run multiple instances of it.

The most computationally expensive thing the site has to do is train machine learning models to recognize images.

In particular, the expensive bit is when a student clicks on the “Train new machine learning model” button in an image recognition project.


The Scratch coordinate system

July 23rd, 2019

In Scratch 3, the stage in the top right where your sprites live is implemented as an HTML canvas. Unfortunately, the coordinate system that Scratch uses internally to maintain state and the coordinate system used by HTML canvases work very differently.

For some of the Scratch blocks I’ve written for Machine Learning for Kids, I need to be able to convert coordinates and sizes between the two coordinate systems.

For example, my ML blocks can let a student use an image classifier they’ve trained to recognise what is on the background behind a certain Sprite in their project. To do that, the backdrop image block needs to:

  1. get the location of the Sprite (which will be returned using the Scratch coordinate system)
  2. get the image data of what is rendered on the canvas at that location (using HTML canvas APIs, which work in the HTML canvas coordinate system)

I couldn’t find a way to convert between the two documented anywhere, and it was a tiny bit fiddly, so I’m documenting it here for the next time I need it!
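As a rough illustration of the kind of conversion involved (a minimal sketch, not necessarily the approach described in the full post): Scratch's logical stage is 480 units wide and 360 units high, with the origin in the centre and y increasing upwards, while an HTML canvas has its origin in the top-left corner with y increasing downwards. The function names below are hypothetical, and the sketch assumes the canvas shows the whole stage with no offset or letterboxing, so the only other difference is a scale factor.

```python
# Scratch's logical stage size: origin at the centre, y increases upwards.
STAGE_WIDTH = 480
STAGE_HEIGHT = 360


def scratch_to_canvas(x, y, canvas_width, canvas_height):
    """Convert a Scratch stage position to canvas pixel coordinates.

    Assumes the canvas displays the entire stage, so the canvas size
    only introduces a scale factor (an assumption for this sketch).
    """
    scale_x = canvas_width / STAGE_WIDTH
    scale_y = canvas_height / STAGE_HEIGHT
    canvas_x = (x + STAGE_WIDTH / 2) * scale_x   # shift origin to the left edge
    canvas_y = (STAGE_HEIGHT / 2 - y) * scale_y  # shift origin to the top and flip y
    return canvas_x, canvas_y


def canvas_to_scratch(canvas_x, canvas_y, canvas_width, canvas_height):
    """Inverse of scratch_to_canvas, under the same assumptions."""
    scale_x = canvas_width / STAGE_WIDTH
    scale_y = canvas_height / STAGE_HEIGHT
    x = canvas_x / scale_x - STAGE_WIDTH / 2
    y = STAGE_HEIGHT / 2 - canvas_y / scale_y
    return x, y
```

Sizes convert with the scale factors alone (no shift or flip), which is why the canvas dimensions are passed in explicitly: the canvas is often rendered at a different pixel size from the 480 x 360 logical stage.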


How to write your first Avro schema

July 20th, 2019

Any time there is more than one developer using a Kafka topic, they will need a way to agree on the shape of the data that will go into messages. The most common way to document the schema of messages in Kafka is to use the Apache Avro serialization system.

This post is a beginner’s guide to writing your first Avro schema, and a few tips for how to use it in your Kafka apps.
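To give a flavour of what such a schema looks like, here is a minimal sketch. Avro schemas are written in JSON, so the example builds one as a Python dictionary and serializes it. The record name, namespace, and fields are all hypothetical, invented purely for illustration; only the structure (`type`, `name`, `namespace`, `doc`, `fields`, union types, logical types) comes from the Avro specification.

```python
import json

# A hypothetical schema for messages on an imaginary sensor-readings topic.
SENSOR_READING_SCHEMA = {
    "type": "record",
    "name": "SensorReading",
    "namespace": "com.example.sensors",
    "doc": "A single temperature reading from a sensor.",
    "fields": [
        {"name": "sensor_id", "type": "string",
         "doc": "Unique ID of the sensor that took the reading."},
        {"name": "temperature", "type": "double",
         "doc": "Temperature in degrees Celsius."},
        # Logical types annotate a primitive type with extra meaning.
        {"name": "recorded_at",
         "type": {"type": "long", "logicalType": "timestamp-millis"},
         "doc": "When the reading was taken."},
        # A union with "null" listed first and a null default makes a field optional.
        {"name": "location", "type": ["null", "string"], "default": None,
         "doc": "Optional description of where the sensor is."},
    ],
}


def to_avsc(schema):
    """Serialize the schema to the JSON text you would store in a .avsc file."""
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(to_avsc(SENSOR_READING_SCHEMA))
```

Including a `doc` string on every field is optional, but it is a cheap way to make the schema double as documentation for the other developers using the topic.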


An introduction to serverless and OpenWhisk for Kafka users

July 13th, 2019

I gave a talk at Kafka Summit London this year about Apache OpenWhisk. It was aimed at Kafka users who want to know what the serverless hype is all about.

I covered:

  • a simple introduction to what serverless is for
  • an introduction to some of the serverless platforms available
  • a quick crash course in how to get started with Apache OpenWhisk

I also had a quick tangent looking into how Apache OpenWhisk itself uses Kafka internally, because I thought that was interesting!

My slides are on SlideShare if you’d like to see a higher-res version of any of them.

If this convinces you to give OpenWhisk a try, I have a post on how to get started with OpenWhisk that has all the commands you need to copy/paste to get yourself a working OpenWhisk environment connected to a Kafka source of events.


Getting started with OpenWhisk and Kafka

July 6th, 2019

Apache OpenWhisk (and serverless platforms in general) is a great way to host and manage code that you want to run in response to events.
Apache Kafka topics are a great source of events.

In this post, I’ll run through a super simple beginner’s guide to writing code for OpenWhisk that processes events on your Kafka topics.
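As a rough sketch of what that code can look like (not necessarily the example used in the full post): an OpenWhisk Python action is a `main` function that takes a dictionary of parameters and returns a dictionary. The sketch below assumes the action is fired by a trigger from the OpenWhisk Kafka feed package, and that the trigger delivers a batch of events under a `messages` key, each with `topic`, `partition`, `offset`, `key` and `value` fields. Treat that payload shape as an assumption to check against your own setup.

```python
def main(params):
    """Entry point for an OpenWhisk Python action.

    Assumes `params["messages"]` is a list of Kafka messages delivered by
    a Kafka feed trigger (an assumption about the payload shape).
    """
    messages = params.get("messages", [])

    processed = []
    for message in messages:
        # Record where each event came from; real logic would act on
        # message["value"] here instead.
        processed.append("{}/{}/{}".format(
            message.get("topic"),
            message.get("partition"),
            message.get("offset"),
        ))

    # OpenWhisk actions return a JSON-serializable dictionary.
    return {"count": len(processed), "processed": processed}
```

Because a trigger may deliver several Kafka messages in one invocation, the action loops over the batch rather than assuming one event per call.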


Using Node-RED with IBM Event Streams

June 28th, 2019



IBM Event Streams is IBM’s offering of Apache Kafka, the distributed real-time streaming data platform.

Node-RED is a visual flow-based development tool, with nodes that you drag and drop onto a canvas and wire together. It’s useful for loads of tasks, such as quick and flexible prototyping.

In this post, I’ll show how Event Streams and Node-RED work well together. You can use Node-RED to quickly and easily create flows that consume messages from Kafka topics, or that process events from different sources and produce the output to Kafka topics.


Curated sample training datasets for Machine Learning for Kids

June 26th, 2019

Machine Learning for Kids now includes support for a curated collection of training data sets, to enable children to create different types of machine learning projects.



The tool lets children make things using machine learning. The principle I’ve worked to is that children train their own machine learning models, as doing this is a great way to teach them about how this tech works.

Preparing their own training data is a useful exercise, but it is time-consuming. The project worksheets I’ve written so far all assume that the student will prepare the training data within a single lesson. This has been a limiting factor on the kinds of ML projects I’ve been able to include.


Are indie games better value than AAA games?

June 23rd, 2019

This post started life as a debate with friends about whether big triple-A games are better value than cheaper indie games. We didn’t have data; the debate was just opinions. But it stuck with me, so I decided to collect data to prove I was right. 🙂

The plan was to plot the time I spent playing games against how much money I spent on them, and use the clear correlation to prove my point. That didn’t work. I didn’t find much of a pattern, but it’s been a while since I’ve done this sort of quantified-self thing, and collecting the data was a pain, so I’m sharing it anyway!

To start with, this graph plots the cost of each game (x axis) against the number of hours I’ve spent playing it (y axis).


Cost against Hours played
