[tɒk bɒks] — a tiny hardware speech synthesizer/TTS

[tÉ’k bÉ’ks]: case
[tÉ’k bÉ’ks]: case
[tÉ’k bÉ’ks]: inside
[tÉ’k bÉ’ks]: inside. The observant amongst you will notice that the speech board is 1/10″ further in than it should be for ideal alignment with the USB serial adapter.
Back in the 1980s, the now-defunct Digital Equipment Corporation (“DEC”) sold a hardware speech synthesizer based on Dennis Klatt’s research at MIT.  These DECTalk boxes were compact and robust, and — despite not having the greatest speech quality — gave valuable speech, telephone and reading accessibility to many people. Stephen Hawking’s distinctive voice is from a pre-DEC version of the MIT hardware.

DEC is long gone, and the licensing of DECTalk has wandered off into mostly software. Much to the annoyance of those in earshot, I’ve always enjoyed dabbling in speech synthesis. DECTalk hardware remains expensive, partly because of demand from electronic music producers (its vocoder-like burr is on countless tracks), but also because there are still many people who rely on it for daily life. I couldn’t justify buying a real DECTalk, but I found this: the Parallax Emic 2 Text-to-Speech Module.  For about $80, this stamp-sized board brings a hardware DECTalk implementation to embedded projects.

The Emic 2 is really marketed to microcontroller hobbyists: Make Your Arduino Speak! sorta thing. But I wanted to make a DECTalk-ish hardware box, with serial input, a speaker, and switchable headphone/line jack.  [tɒk bɒks] (a fair approximation of how I pronounce “Talk Box”) is the result.

Hardware

  • Parallax Emic 2 Text-to-Speech Module
  • OSEPP FTDI USB-Serial Breakout — there are many USB-Serial boards that would do this, but two points in this one’s favour are: i) it has header pins for breadboard use, and ii) I had a spare one.
  • Small 8Ω speaker element — the one I used is most likely a headphone element, bought from Active Surplus (RIP). This should be as small as you can get away with (and still hear) as the USB-Serial connection isn’t designed to supply audio power.
  • Header pins and sockets
  • Toggle switch
  • Small project box with perfboard
  • Jumper wires and solder

Connections

Emic 2             Serial
======             ======
 GND                GND
 5V                 Vcc
 SOUT               RXD
 SIN                TXD

Emic 2             Speaker
======             =======
 SP-                -
 SP+  (via switch)  +

Using it

You’ll need some kind of serial terminal connection. In a pinch, you can use the serial monitor that is in the Arduino development environment. Either way, identify your serial port (/dev/ttyUSBN, COMN:, or /dev/tty-usbserialNNNN) and find a way to send 9600 baud, 8N1 characters to it. Hit Return, and you should be greeted by the Emic 2’s : prompt (or a ?, followed by :). Whether you get the prompt or not depends on whether local echo is set or not. Either way, try sending this line:

SAll watched over by machines of loving grace.

You should hear a voice say the title of Richard Brautigan’s lovely poem All Watched Over by Machines of Loving Grace (caution: video link contains nekkid hippies). You should get the : prompt back once the the speech has stopped. And that’s all there is to it: send an S, followed by up to 1023 bytes of (basically ASCII) text, followed by a newline, and it will be spoken. There’s more detail, of course, in the Emic 2 documentation and the Emic 2 Epson/Fonix DECTalk 501 User’s Guide for changing voices, etc. Yes, you can make it sing. No, you probably shouldn’t, though.

Notes

  1. The Emic 2 has no serial flow control, so you have to wait until the module stops speaking (or you send it the stop command) before you can send more. The easiest way is to poll the serial port and see if there’s the : prompt waiting. Until you see the prompt, any text you send it may be lost.
  2. The Emic 2 is an embedded device; Unicode is a bit of a stretch. It’s supposed to accept ISO Latin-1 8-bit characters (handy for Spanish mode), though.
  3. Starting every speech line with S may make this board incompatible with assistive technology software such as the JAWS screen reader. I don’t think that this was the goal for Emic 2’s designers (Grand Idea Studio), however.
  4. The output from the audio jack has a fair bit of noise on it, and you need to set the volume quite low to avoid hiss and hum. Your experience may be different, as I may have accidentally made a ground loop. There is a faintly  audible click at the start and end of the text, too.
  5. The Emic 2 uses DECTalk v5 commands and phonemes. Many DECTalk resources on the web (like these songs) use v4 or older, which are subtly incompatible. I haven’t found a reliable conversion protocol yet.

To end, here’s the Emic 2’s “Dennis” voice reading all of Brautigan’s All Watched Over By Machines of Loving Grace:


(plain link: molg-dennis-140wpm-16khz.mp3)

(even plainer link if you can’t decode MP2 files: molg-dennis-140wpm.mp3)

(recorded and edited for length with Audacity. No hippies — nekkid, or otherwise — were harmed in the making of this recording.)