Looking back at Machine Learning for Kids in 2020

A review of what I did on Machine Learning for Kids in 2020.

Happy New Year!

At this time of year, it’s traditional to get a bit reflective, so I thought I’d look over the work I did on ML for Kids in 2020.

Excuses

I’ll get my excuses out of the way first. It was 2020, so like all of us I was a bit distracted and a bit depressed and a lot less productive than I’d normally want to be.

And as with all personal/side-projects, it gets put to one side when work is busy – and I was super busy with my day job in 2020, particularly in the first half of the year. We rewrote Event Streams from being Helm-based to being a Kubernetes Operator. That was a huge amount of work, even for a normal year – let alone while figuring out how to get the team working remotely.

So… those are my excuses for why ML for Kids didn’t move on as much in 2020 as I think I’d have otherwise liked.

I wrote new project worksheets

As well as developing the platform / tool itself, I try to regularly add to the collection of machine learning student project worksheets.

They take a huge amount of time to create: coming up with the idea, prototyping it, testing it, refining it, writing it up (both the student instructions and teacher notes)…

(Testing them is harder now than it was. Before 2020, I was fortunate to be able to test new projects with local schools, which was always eye-opening – I’d always end up reworking them after school tests. In 2020, I was only able to test them on my own kids, which isn’t the same thing!)

But making these worksheets is worth the effort. They help inspire teachers and students alike about what is possible, in a way that just creating the platform and tool wouldn’t. Some schools use the project worksheets as-is, but many use them as a jumping off point to invent their own creations.

I added ten new worksheets to the site in 2020:

In Describe the glass, students train an ML model to predict whether they’ll describe a glass as half-full or half-empty. It’s the simplest, easiest worksheet on the site, and I think it should be a fun starter project, particularly for younger primary school classes.

Projects based on training a computer to play a video game are always popular with students. In Shoot the bug, students train an ML model to predict what angle to shoot at, in a Breakout sort of game.

Laser eyes is a fun combination of ML techniques, using both a face detection model to position the lasers and a custom speech recognition model to fire them.

Fooled is a little complicated, but it’s a demonstration of bias in ML models by intentionally training a biased image classifier.

Phishing is a bit dry, a bit technical, and a bit geeky, but I really liked it. It’s about training a machine learning model to predict if a URL is likely to be a phishing link.

Emoji Mask is a fun simple project – using a face detection ML model to add an emoji mask to a webcam view.

Face Finder is a bit similar to emoji mask – another fun project based around using a pre-trained face detection model. It’s a bit silly, but it’s simple and has a lot of potential for students doing something creative with it.

Ink blots involves testing image classifiers using Rorschach ink blot images as a way of explaining an MIT AI research project called “Norman”. It’s a bit of a weird one to explain quickly, but I think it’ll be a good way to start a discussion about how AI is portrayed by the media.

Semaphore Quiz is another project that involves a combination of a couple of ML techniques – a pre-trained pose detection model to identify the location of your arms in the webcam, and a custom speech detection model to recognize spoken commands.

I revised and updated project worksheets

The older project worksheets I wrote were based on version 2 of Scratch (the older Adobe Flash version of Scratch) which isn’t used much nowadays, so I’ve been working through them to rewrite them to use the current version 3 of Scratch.

I rewrote the Snap, Pac Man, Top Trumps, and Noughts and Crosses projects in 2020.

There are only three left that I need to rewrite, but now that Flash has finally been killed off I should really hurry up and get through them!

I wrote a bunch of code

The GitHub stats for the site code do show what I expected: that I made fewer changes to the site in the first half of the year (when I was busier with work).


[Charts: commits and additions to the site code over 2020]

I created explanations of text ML models

One of the new features I added to the tool in 2020 was a new page that explains how the neural networks created in the site for text projects are trained.

Doing more to explain what is happening under the covers when students train their machine learning models is something I’ve been working on for a long time, and this new section was a big piece of work – partly because of the work to create the interactive neural network visualizations (I’m a terrible front-end developer) but mostly because of the time involved in coming up with the approach and design.

I added support for creating Python projects in the browser

I’ve had support for Python projects for a while now, but I did this by generating sample code (and leaving students to work out how and where to run it, including how they should set up the third-party dependency needed to make API calls).

In 2020 I added an integration with repl.it to allow students to create machine-learning Python projects in the browser.
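To give a sense of what the older generated-sample-code approach involved, here is a minimal sketch of a Python helper that sends a piece of text to a classification API and reads back the top label. It is an illustration only: the URL, the `data` field, and the response shape (`class_name`, `confidence`) are hypothetical placeholders rather than the site's real API, and it uses the standard library's `urllib` where the generated code relied on a third-party dependency.

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only (not the real site API).
API_URL = "https://example.invalid/api/{key}/classify"


def build_request(key, text):
    """Build a POST request that carries the text to classify as JSON."""
    body = json.dumps({"data": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL.format(key=key),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_label(response_body):
    """Pick the most confident label from an assumed JSON response.

    Assumes the response is a list of objects such as
    {"class_name": "...", "confidence": 87}.
    """
    results = json.loads(response_body)
    best = max(results, key=lambda item: item["confidence"])
    return best["class_name"], best["confidence"]


def classify(key, text):
    """Send the text to the (hypothetical) API and return the top label."""
    with urllib.request.urlopen(build_request(key, text)) as response:
        return top_label(response.read().decode("utf-8"))
```

Even in a sketch this small, a student running it outside the browser still has to install Python, find somewhere to save the file, and paste in a project key, which is the friction that an in-browser environment removes.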


I added support for sharing projects

A small, but surprisingly well-received, feature I added was to allow teachers to share the projects that they create with the students in their class.

I added Pre-trained models

Another new section I added to the site in 2020 was the Pretrained models page – a small library of pretrained machine learning models, with support for creating Scratch projects using them.

This opened up the potential for a variety of projects using machine learning models that are too complex for students to train themselves in a single lesson.

I continued to add new models to the collection after this: imagenet, toxicity, face detection, and pose detection.

And I’m on the lookout to add more ML models to the set in 2021.

I added support for training image models in the browser

A recurring complaint from teachers has been how long machine learning models take to train for image projects. This originally took three or four minutes, but by 2020 the wait had been growing. The actual training time hadn’t increased; the problem was the time that training requests spent queued at the Watson Visual Recognition service before training started. The total wait could sometimes be about ten minutes, which is hugely disruptive in lessons, so I spent a lot of time in 2020 investigating options for removing the dependency on the Watson Visual Recognition service and replacing it with my own implementation.

I ended up implementing support for training image ML models locally in the browser using TensorFlow.js, which trains much more quickly, even on slow or small computers like Raspberry Pis. There is definitely an accuracy cost, particularly for more complex tasks, but I think the models that it trains are good enough for my purposes.

I’ve only implemented support for this in Scratch, so I still need to do the equivalent work for Python and App Inventor (Java) projects.

I added TensorFlow model support

Doing this also helped lay the groundwork for another new feature: support for students using TensorFlow models in Scratch.

This means that code club leaders who know how to prepare a custom machine learning model have the flexibility to prepare any sort of model they can think of for their students (or they could always find an existing model from places like TensorFlow Hub).

I added Teachable Machine support

I also added support for creating Scratch projects using models trained in Google’s Teachable Machine. I started by implementing support for Teachable Machine’s image projects and pose recognition projects.

I added a bunch of other features

This post is already a lot longer than I’d planned, so I’ll race through a few other examples:

Added new notifications in Scratch to give students feedback when there are problems with the machine learning blocks in their script, such as if their model isn’t ready to use yet or if they’ve reached the limit for the number of examples they can add to their training data

Allowing teachers to review the training data that their students have collected in their projects

More feedback about Watson API keys, such as warning teachers about expiring API keys or checking the state of API keys when students start a new project

Allowing teachers to download the browser console log from the site menu without needing to go into browser developer tools (or know what browser developer tools are)

Support for saving machine learning models trained in the browser to IndexedDB storage so that they can still be used after page changes or refreshes

A new version of the summary animation for the site About page

I worried about performance a bit

I did some work to improve the experience for students using Raspberry Pis or other smaller devices, such as being a bit smarter about the way I load dependencies, making more use of caching and minifying, pre-generating theme CSS instead of doing it dynamically, removing a bunch of unused code, replacing some of the third-party dependencies with smaller equivalents, reducing image sizes, and so on.

And I worked on a bunch of annoyingly boring and time-consuming (but essential) stuff that no-one ever notices

The cloud database host I use removed support for MySQL, which is what I’d been using since I started the site. So I had to rewrite all the back-end code to use PostgreSQL. And I needed to implement support for putting the site in a read-only maintenance mode so I had time to migrate the contents of the site database.

I rewrote how I manage Watson API keys, sharing them across managed classes, to better fit the changes that Watson made to pricing plans, the new plans they added, and the new region they added. Essentially, every time the Watson services I use make changes, I have a bit more work to do to keep everything in ML for Kids working (and to update the instructions I give to teachers).

I removed some uses of CDN, hosting more of the essential dependencies within the site itself, after some schools reported things being broken by blocked third-party dependencies.

I increased the translation support, by adding NLS support to some of the last few pages that still used hard-coded English text, like the download screens for student worksheets and teacher notes. And added a couple of new languages – Welsh and Arabic.

(Arabic language support was also the long-overdue nudge to get me to remove the flag icons from the language selection menu)

And endless other changes and tweaks to how the site is deployed and managed to try and reduce the time I spend babysitting the site when it gets busy, updates to dependencies to address security issues, and so on and so on. A lot of the work on the site is frustratingly invisible to end users.

I answered many, many questions

I tried setting up a Google Groups forum in 2019 to try and collect all the questions that I get in one place.

It hasn’t really worked.

I still get questions from pretty much everywhere: emails to both of my personal email addresses, emails to my work email address, direct messages to both of my Twitter accounts, and questions in blog post comments and YouTube video comments.

I need to figure out a way of doing this more efficiently, because responding to all of them can take a huge amount of time, and they often bury other messages that I get.

Speaking of things that aren’t code, I gave presentations

A couple of them were recorded:

And I was invited to speak on a couple of podcasts:

I haven’t listened to either of them, because I can’t listen to my own voice without shuddering.

I finished writing a book

The project worksheets I write for the site are written with schools and code clubs in mind, and that comes with some constraints. They’re all intended to be self-contained, as I assume that schools will only have room in their timetable/curriculum plan to do one or maybe two AI projects. The projects are also generally short – so that they can be completed within a school lesson.

The idea of writing a book version of Machine Learning for Kids was about creating something for a child at home with their parents, which doesn’t need those constraints.

It’s still project-based, but there is a flow in there. There is an intentional order and a continuation between the projects – each one builds upon the projects that came before it. It’s okay if some of the projects take a bit longer as they don’t need to be done in one sitting. And I have more time and space to explain the ideas and to give the real-world context. I even get into some more advanced topics that none of the project worksheets on the site go near, such as accuracy, recall, and confusion matrices.
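For anyone who hasn’t met those terms, here is a minimal sketch of how they fit together. It uses made-up labels and is not an excerpt from the book: a confusion matrix counts how often each true label was given each predicted label, and accuracy and recall can both be read off those counts.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Fraction of all examples where the prediction matched the true label."""
    correct = sum(n for (truth, guess), n in matrix.items() if truth == guess)
    return correct / sum(matrix.values())


def recall(matrix, label):
    """Of the examples that really were `label`, the fraction predicted as it."""
    relevant = sum(n for (truth, _), n in matrix.items() if truth == label)
    return matrix[(label, label)] / relevant


# Made-up test results for a two-label project (illustrative data only).
actual    = ["cat", "cat", "cat", "cat", "dog", "dog", "dog", "dog"]
predicted = ["cat", "cat", "cat", "dog", "dog", "dog", "cat", "cat"]

matrix = confusion_matrix(actual, predicted)
print(matrix[("cat", "dog")])  # cats mistaken for dogs: 1
print(accuracy(matrix))        # 0.625
print(recall(matrix, "cat"))   # 0.75
print(recall(matrix, "dog"))   # 0.5
```

The overall accuracy of 0.625 hides the fact that this pretend model is noticeably worse at spotting dogs than cats, which is the kind of thing a per-label measure like recall makes visible.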

It’s sort of a different spin on what I’ve done with Machine Learning for Kids before.

But the book took so much longer than I thought it would. Orders of magnitude longer. I was starting to think it’d never see the light of day.

I’m really glad it’s finished. And I hope people will find it useful.

Not sure if that’s all of it, but this post is long enough now

That was the sort of stuff that I did with Machine Learning for Kids in 2020. It was a more incremental year than I’d have liked, but there were some improvements.

A post about what teachers and students did with the site in 2020 would probably have been more interesting, but that’s a post for another day. For now, you can see examples on Wakelet.
