aka “Who wants a Mario alarm clock?”
In this post, I want to share a quick demo of using Event Processing to process social media posts.
Background
A fun surprise from Nintendo today: they’ve introduced a new product! “Alarmo” is a game-themed alarm clock, with some interesting gesture recognition features.
I was (unsurprisingly!) tempted…
But that got me wondering how the rest of the Internet was reacting.
In this post, I want to share a (super-simple!) demo for how to look at this – using IBM Event Processing to create an Apache Flink job that looks at the sentiment of social media posts about this unusual new product.
This is what I set up. I’ll share a few details for each of the 4 steps in case this inspires you to try something similar.
Step 1 – Getting social media posts into Kafka
I’m using Mastodon for this, as it has a friendly and open API. I configured a Kafka Connect connector to bring Mastodon posts with the #nintendo hashtag into Kafka.
I added a Mastodon connector jar to the Kafka Connect runtime I use for demos.
```yaml
apiVersion: eventstreams.ibm.com/v1beta2
kind: KafkaConnect
metadata:
  annotations:
    eventstreams.ibm.com/use-connector-resources: 'true'
  name: kafka-connect-cluster
spec:
  ...
  build:
    ...
    plugins:
      - artifacts:
          - type: jar
            url: 'https://github.com/dalelane/kafka-connect-mastodon-source/releases/download/0.0.1/kafka-connect-mastodon-source-0.0.1-jar-with-dependencies.jar'
        name: mastodon
  ...
```
I created a KafkaConnector instance to start a new connector:
```yaml
apiVersion: eventstreams.ibm.com/v1beta2
kind: KafkaConnector
metadata:
  name: mastodon-nintendo
  namespace: event-automation
  labels:
    eventstreams.ibm.com/cluster: kafka-connect-cluster
spec:
  class: uk.co.dalelane.kafkaconnect.mastodon.source.MastodonSourceConnector
  config:
    key.converter: org.apache.kafka.connect.storage.StringConverter
    key.converter.schemas.enable: false
    value.converter: org.apache.kafka.connect.json.JsonConverter
    value.converter.schemas.enable: false
    mastodon.accesstoken: MySuperSecretAccessTokenForMastodonAPI
    mastodon.instance: mastodon.social
    mastodon.searchterm: nintendo
    mastodon.topic: mastodon
```
That gave me a Kafka topic containing Mastodon social media posts about Nintendo.
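As a rough sketch of what lands on the topic: each event is a JSON document built from a Mastodon status. The field names below follow the Mastodon status API (content, language, account.bot); the connector’s actual payload may differ, so treat this as an illustration rather than the exact schema.

```python
import json

# A simplified example of the kind of event the connector produces.
# Field names follow the Mastodon status API; the connector's actual
# payload may differ.
sample_event = json.loads("""
{
  "id": "113274001",
  "content": "<p>Nintendo just announced Alarmo!</p>",
  "language": "en",
  "account": {"username": "somebody", "bot": false}
}
""")

print(sample_event["content"])
print(sample_event["account"]["bot"])
```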
Step 2 – Processing the posts
I used IBM Event Processing to create an Apache Flink job to start processing these events.
I added filters to limit the posts to the ones I’m interested in.
For example, I filtered out posts by bots, as I was curious to see what people were organically saying about the product, not just auto-posted links.
And I added a filter to only keep posts that mentioned something to do with this new product.
That gave me a starting point with a stream of posts worth looking at.
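In Event Processing these are just filter nodes, but the same logic can be sketched in plain Python. The keyword list and field names here are my own illustration, not the exact expressions from my Flink job:

```python
# Sketch of the two filters: drop posts from bot accounts, and keep
# only posts that mention the new product. The keyword list is a guess
# at useful search terms.
PRODUCT_TERMS = ("alarmo", "alarm clock")

def keep_post(event: dict) -> bool:
    if event.get("account", {}).get("bot", False):
        return False  # ignore auto-posting bot accounts
    text = event.get("content", "").lower()
    return any(term in text for term in PRODUCT_TERMS)

posts = [
    {"content": "Alarmo looks fun!", "account": {"bot": False}},
    {"content": "Nintendo news feed", "account": {"bot": True}},
    {"content": "Mario Kart is great", "account": {"bot": False}},
]
kept = [p for p in posts if keep_post(p)]
print(len(kept))  # only the first post survives both filters
```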
Step 3 – Measure the sentiment of the comments
I used IBM Watson Natural Language Understanding for this.
I created an instance of the service using the Lite plan as that let me get started for free.
I downloaded the OpenAPI definition for the API from the Watson NLU documentation page (using the three-dot menu in the top-left).
Aside: There’s a tiny bug I needed to fix in the spec – replacing the incorrect “apikey” security scheme in the doc to match what the Watson API actually expects:
```json
...
"securitySchemes": {
  "IAM": {
    "type": "http",
    "scheme": "basic"
  }
},
...
```
I could then use this API in Event Processing to enrich each event with the Watson NLU service’s assessment of the sentiment:
(The API key can be found in the “Service credentials” tab of the Watson Natural Language Understanding service instance page.)
The input mapping page is pretty long as the Watson service has a lot of options, but following the API doc makes that easy.
I set the version of the service to use as shown in the examples in the API docs, and I enabled the sentiment analysis feature. (The service can return lots of other details about the text, such as emotion, entity extraction, relationships, etc. – which is why there are so many options here.)
The Mastodon API returns the HTML content for social media posts, so I put that into the html property.
Posts are written by people in a wide variety of languages, but the Watson NLU service can handle that. I included the language code from the Mastodon post in the request to the Watson service so it could analyse the text appropriately.
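Putting those pieces together, the request body that gets sent to the Watson NLU /v1/analyze endpoint looks roughly like this. The version date is only an example (use the one shown in the API docs), and the helper function is my own sketch of the input mapping, not something generated by Event Processing:

```python
# Sketch of the /v1/analyze request body: the post's HTML content,
# its language code from Mastodon, and the sentiment feature enabled.
API_VERSION = "2022-04-07"  # example value; use the version from the API docs

def build_nlu_request(post_html: str, language: str) -> dict:
    return {
        "html": post_html,       # Mastodon posts arrive as HTML
        "language": language,    # pass the post's language code through
        "features": {
            "sentiment": {}      # only sentiment analysis is enabled
        },
    }

body = build_nlu_request("<p>I love this alarm clock!</p>", "en")
print(body["features"])
```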
There is a lot of output you can get from the service, but I was most interested in the sentiment analysis score and label.
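Pulling those two fields out of the response is straightforward. The response shape below follows the NLU API documentation (sentiment.document.score and sentiment.document.label); the numeric value is made up for illustration:

```python
# Sketch of extracting the document-level sentiment score and label
# from a Watson NLU response. The score value here is illustrative.
sample_response = {
    "language": "en",
    "sentiment": {
        "document": {"score": 0.94, "label": "positive"}
    },
}

doc_sentiment = sample_response["sentiment"]["document"]
print(doc_sentiment["label"], doc_sentiment["score"])
```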
Step 4 – Outputting the results
At this point, there are lots of things you can do with the output from this.
You can just have the raw results, with the posts together with the analysis of their sentiment.
Or you can aggregate the posts – maybe try computing something with those sentiment scores, or even just keep it simple and count the number of posts with each sentiment label.
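The simple “count posts per label” option can be sketched in a few lines. In the Flink job this would be an aggregate over a time window; here it is just a batch over a made-up list of labels:

```python
from collections import Counter

# Count how many posts received each sentiment label. A Flink job
# would do this over a time window; this is just a one-off batch.
labels = ["positive", "negative", "positive", "neutral", "positive"]
counts = Counter(labels)
print(counts.most_common())
```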
Summary
I’m not sure I’m any closer to deciding if I need a Nintendo-themed gesture-controlled alarm clock.
But hopefully I’ve at least demonstrated how easy it is to take a stream of events from social media, and enrich them using IBM Watson services to gain insight into public discussions.
Tags: apachekafka, ibmeventstreams, kafka