The Audiogram

So the audiogram up there is really just a proof of concept. One piece of a bigger puzzle I’ve been putting together.

For about three years now, I’ve been going on what I call “dictation walks”. I talk into my phone (full stops, commas, the lot) using Dragon Dictate or Just Press Record on the iPhone. Proper dictation, old school.

Amazon Transcribe and friends

Then I started playing with Amazon Transcribe. For not much money at all, you can throw raw audio at it — no punctuation, no formatting, just talking — and get a usable transcript back. The results are decent enough. But here’s the bit that actually got me excited: it identifies different speakers.

That changes everything. Because what I’m really after is a workflow for doing more journalism again. One recording, multiple outputs. I’ve now got that nailed down:

Record an audio note or interview. Use it as a podcast episode.
Send the audio to Amazon Transcribe. Edit the transcript, then publish it on the website or wherever else it needs to go.
Create an audiogram — a short visual snippet of the audio with waveforms. That’s the proof of concept I started with.
Produce a very short video of me or, better yet, of whoever I was talking to. Send the audio track to Amazon Transcribe and convert the transcript into SRT subtitles.
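
For the subtitle step, here is a rough sketch of how the conversion to SRT could look in Python. It is not necessarily how my own setup does it. It assumes the transcript JSON has a results.items list in which each spoken word carries start_time and end_time as strings, and punctuation marks arrive as separate untimed items. The function names and the eight-words-per-subtitle limit are made up for illustration.

```python
# Sketch: turn an Amazon Transcribe result (already parsed from JSON)
# into SRT subtitle text. The JSON layout is assumed, and the helper
# names are hypothetical.

def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp that SRT expects."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def transcript_to_srt(result: dict, max_words: int = 8) -> str:
    """Group timed words into subtitle cues of at most max_words words."""
    cues = []    # finished cues as (start, end, text)
    words = []   # words collected for the cue being built
    start = end = 0.0
    for item in result["results"]["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamps; attach it to the previous word.
            if words:
                words[-1] += content
            continue
        if len(words) == max_words:
            # The current cue is full, so close it before adding this word.
            cues.append((start, end, " ".join(words)))
            words = []
        if not words:
            start = float(item["start_time"])
        words.append(content)
        end = float(item["end_time"])
    if words:
        cues.append((start, end, " ".join(words)))
    blocks = [
        f"{n}\n{srt_timestamp(s)} --> {srt_timestamp(e)}\n{text}"
        for n, (s, e, text) in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n"
```

To use it, load the transcript file with json.load, pass the resulting dictionary to transcript_to_srt, and write the returned string to a .srt file next to the video. Splitting on a fixed word count is the crudest possible approach. Breaking at sentence ends or pauses would read better on screen.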

I’ll write more about what I’m planning to do with all of this soon. There’s a lot.