In this post, I want to share an example of how to stream phone call audio through the IBM Watson Speech to Text and IBM Watson Natural Language Understanding services, and suggest a few things you could use this for.
Let’s start with a demo
That’s what I want to show you how to build.
At a high level, this is what you saw in that video:
1. Faith made a phone call to a phone number managed by Twilio.
2. Twilio routed the phone call to me, and I answered the call.
We then started talking to each other. And while we were doing this:
3. Twilio streamed a copy of the audio from the phone call to a demo Node.js app.
4. The Node.js app sent the audio to the Watson Speech to Text service for transcribing.
5. Watson Speech to Text asynchronously sent transcriptions to the Node.js app as soon as they were available.
6. The app then submitted the transcription text to Watson Natural Language Understanding for analysis.
7. All of this, the transcriptions and the analyses, was displayed on the demo web page.
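To make steps 3 and 4 a little more concrete, here is a minimal sketch of how a Node.js app might route the messages that Twilio sends over its Media Streams WebSocket. It is not the demo app's actual code. The function name `createTwilioMessageHandler` and the `stt` object (with `start`, `sendAudio` and `stop` methods) are hypothetical stand-ins for whatever wraps the Watson Speech to Text connection. The sketch assumes Twilio's JSON message format, where `media` events carry base64-encoded 8 kHz mu-law audio.

```javascript
// Sketch only: routes Twilio Media Streams messages to a speech-to-text sink.
// `createTwilioMessageHandler` and `stt` are hypothetical names for illustration.
function createTwilioMessageHandler(stt) {
  return function handleMessage(rawMessage) {
    // Twilio sends each WebSocket message as a JSON string.
    const message = JSON.parse(rawMessage);

    switch (message.event) {
      case "start":
        // Assumption: phone audio arrives as 8 kHz mu-law, so the
        // transcription session is opened with a matching content type.
        stt.start({ contentType: "audio/mulaw;rate=8000" });
        break;
      case "media":
        // The audio chunk is base64 text; decode it to raw bytes first.
        stt.sendAudio(Buffer.from(message.media.payload, "base64"));
        break;
      case "stop":
        // The call (or the stream) has ended.
        stt.stop();
        break;
      default:
        // Other events, such as "connected", need no action in this sketch.
        break;
    }
  };
}

// Usage with a stand-in sink that just logs what it receives.
const handle = createTwilioMessageHandler({
  start: (options) => console.log("start", options.contentType),
  sendAudio: (chunk) => console.log("audio bytes:", chunk.length),
  stop: () => console.log("stop"),
});
handle(JSON.stringify({ event: "start" }));
handle(JSON.stringify({ event: "media", media: { payload: "/38=" } }));
handle(JSON.stringify({ event: "stop" }));
```

Keeping the speech-to-text side behind a small object like `stt` means the Twilio message handling can be tried out without a live phone call or a Watson account.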