Plug in the CH552 board. You may have to do something with the boot/reset button to make it turn up as the right USB ID (4348:55e0 WinChipHead).
Program the board: sudo ch55xisptool basic52s.bin
Disconnect the board, and wire it up to the USB UART (5 V, GND, TX → RXD [P3.0], RX → TXD [P3.1])
Hit return a few times to get a prompt
PWM is on P1.2, INT1 is on P3.3. See the Hackaday project for how to access I²C, and how to read and write SFR values using RDSFR / WRSFR. PORT3 is at SFR(0B0H).
please ignore the following for now …
Who wouldn’t want to run a solid BASIC interpreter on a $3 development board? So maybe there are a couple of drawbacks:
there’s no way to save the program to non-volatile storage: you have to be connected through a serial terminal at all times; and
you’ve got about 600 bytes for the whole program, with no way to expand it.
Despite these limitations, there’s some futile fun to be had. I’ll show you how to flash BASIC-52 onto one of these development boards, and give a quick intro to what you can do with it.
BASIC-52
Intel released the first version of BASIC-52 for their 8051 family of microcontrollers in 1984. They produced a chip (8052AH-BASIC) with the interpreter burnt into mask ROM in 1985. The source code was released into the public domain, and various features such as I²C support were added by the community around 2000.
As befits an embedded language, BASIC-52 supports pin management, timers and interrupts. It’s also a fairly full-featured BASIC interpreter with floating point support and mostly familiar keywords and functions. Because it’s designed for very limited memory use, its string handling is quite unlike any other BASIC dialect. It has one character array that you can treat as a string, and a few functions to work with characters, but that’s about all.
You might know WCH (aka QinHeng Electronics) from their inexpensive CH341 USB serial adapters and other interface boards. What you might not realize is that all of their older interface chips are based on an optimized 8051 design.
also known as “Monster BASICS Sound reactive RGB+IC Color Flow LED strip”. It’s $5 or so at Dollarama, and includes a USB cable for power and a remote control. It’s two metres long and includes 60 RGB LEDs. Are these really super-cheap NeoPixel clones?
I’m going to keep the USB power so I can power it from a power bank, but otherwise convert it to a string of smart LEDs. We lose the remote control capability.
Pull back the heatshrink at the USB end:
… and there are our connectors. We want to disconnect the blue Din (Data In) line from the built in controller, and solder new wires to Din and GND to run from a microcontroller board.
Maybe not the best solder job, but there are new wires feeding through the heatshrink and soldered onto the strip.
Here’s the heatshrink pushed back, and everything secured with a cable tie.
Now to feed it from standard MicroPython NeoPixel code, suitably jazzed up for 60 pixels.
So you bought that Brother laser printer like everyone told you to. And now it’s out of toner, so you replaced the cartridge. If you were in the USA, you could return the cartridge for free using the included label. But in Canada … it’s a whole deal including registering with Brother and giving away your contact details and, and, and …
Anyway, my dear fellow Canadians, I went through the process and downloaded the label PDF so you don’t have to:
Stewart Russell – scruss.com — 2024-03-26, at age 19999 days …
Summary
One’s thousand day(s) celebration occurs every thousand days of a person’s life. They are meant to be a recognition of getting this far, and are celebrated at the person’s own discretion.
Who is this for?
Maybe your birthday’s on a day associated with an unpleasant event. Your thousand day will never coincide with your birthday.
Maybe your birthday’s in the middle of winter, or in another part of the year that you’re not keen on. Your thousand day is every 2 years and 3 seasons, so it shifts back by a season every time it happens.
Quantities and scale
1000 days is approximately:
2.738 years
2 years 269 days
2 years 8.85 months
2 years, 3 seasons.
4000 days is just shy of 11 years.
Disadvantages
Compared to regular birthdays, thousand days:
must be calculated; it’s not intuitive when they’re going to happen. But we have computers and calendar reminders for that …
can be used to work out your actual date of birth, if someone knows that you’re going to be x000 days old on a particular day. It’s possible to know someone’s birthday, but not know their age.
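The calculation itself is small enough to hand to a computer. Here is a minimal Python sketch; the function name and the birth date are made up for illustration, and it assumes "age in days" simply means calendar days elapsed since the date of birth:

```python
from datetime import date, timedelta


def next_thousand_day(birth, today):
    """Return (age_in_days, date) for the first thousand day strictly after today."""
    days_old = (today - birth).days
    next_age = (days_old // 1000 + 1) * 1000
    return next_age, birth + timedelta(days=next_age)


# hypothetical birth date, purely for illustration
age, when = next_thousand_day(date(2000, 1, 1), date(2002, 1, 1))
print(age, when.isoformat())  # 1000 2002-09-27
```

If today is itself a thousand day, this version returns the following one; whether that is the right behaviour for a reminder is a matter of taste.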
There are no trademarks, patents, official websites, social media or official anythings attached to this concept. Please take the idea and do good with it.
So why aren’t you implementing this further?
I’ve had this idea kicking around my head for at least the last 20 years. For $REASONS, it turns out I’m not very good at implementing stuff. I’d far rather someone else took this idea and ran with it than let it sit undeveloped.
It’s mid-February in Toronto: -10 °C and snowy. The memory of chirping summer fields is dim. But in my heart there is always a cricket-loud meadow.
Short of moving somewhere warmer, I’m going to have to make my own midwinter crickets. I have micro-controllers and tiny speakers: how hard can this be?
more fun than a bucket of simulated crickets (video description: a plastic box containing three USB power banks, each with USB cable leading to a Raspberry Pi Pico board. Each board has a small electromagnetic speaker attached between ground and a data pin)
I could have merely made these beep away at a fixed rate, but I know that real crickets tend to chirp faster as the day grows warmer. This relationship is frequently referred to as Dolbear’s law. The American inventor Amos Dolbear published his observation (without data or species identification) in The American Naturalist in 1897: The Cricket as a Thermometer —
pretty bold assertions there without data eh, Amos old son …?
When emulating crickets I’m less interested in the rate of chirps per minute than in the period between chirps. I could also care entirely less about barbarian units, so I reformulated it in °C (t) and milliseconds (p):
t = ⅑ × (40 + 75000 ÷ p)
Since I know that the micro-controller has an internal temperature sensor, I’m particularly interested in the inverse relationship:
p = 15000 ÷ (9 × t ÷ 5 – 8)
I can check this against one of Dolbear’s observations for 70°F (= 21⅑ °C, or 190/9) and 120 chirps / minute (= 2 Hz, or a period of 500 ms):
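That check can be sketched in Python with exact fractions, so that 190/9 stays exact instead of becoming a rounded float. The function names here are only for illustration; the formulas are the two given above:

```python
from fractions import Fraction


def chirp_period_ms(t_c):
    # period between chirps (ms) for a temperature in °C
    return Fraction(15000) / (Fraction(9) * t_c / 5 - 8)


def temperature_c(p_ms):
    # temperature in °C for a period between chirps (ms)
    return (40 + Fraction(75000) / p_ms) / 9


t = Fraction(190, 9)              # 21 1/9 °C
assert t * 9 / 5 + 32 == 70       # ... which is 70 °F
assert chirp_period_ms(t) == 500  # 120 chirps / minute
assert temperature_c(500) == t    # and back again
```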
Now I’ve got the timing worked out, how about the chirp sound? From a couple of recordings of cricket meadows I’ve made over the years, I observed:
The total duration of a chirp is about ⅛ s
A chirp is made up of four distinct events:
a quieter short tone;
a longer louder tone of a fractionally higher pitch;
the same longer louder tone repeated;
the first short tone repeated
There is a very short silence between each tone
Each cricket appears to chirp at roughly the same pitch: some slightly lower, some slightly higher
The pitch of the tones is in the range 4500–5000 Hz: around D8 on the music scale
I didn’t attempt to model the actual stridulating mechanism of a particular species of cricket. I made what sounded sort of right to me. Hey, if Amos Dolbear could make stuff up and get it accepted as a “law”, I can at least get away with pulse width modulation and tiny tinny speakers …
This is the profile I came up with:
21 ms of 4568 Hz at 25% duty cycle
7 ms of silence
28 ms of 4824 Hz at 50% duty cycle
7 ms of silence
28 ms of 4824 Hz at 50% duty cycle
7 ms of silence
21 ms of 4568 Hz at 25% duty cycle
7 ms of silence
That’s a total of 126 ms, or ⅛ish seconds. In the code I made each instance play at a randomly-selected relative pitch of ±200 Hz on the above numbers.
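As a quick arithmetic check of that profile, here is a small Python sketch. The tuple layout is an assumption made for this example only; the numbers are the ones listed above:

```python
# (tone length in ms, frequency in Hz, duty cycle in %), taken from the list above
profile = [
    (21, 4568, 25),
    (28, 4824, 50),
    (28, 4824, 50),
    (21, 4568, 25),
]
SILENCE_MS = 7  # gap after every tone

total_ms = sum(length + SILENCE_MS for length, _, _ in profile)
print(total_ms)  # 126

# every duration is a whole multiple of 7 ms
units = [length // SILENCE_MS for length, _, _ in profile]
print(units)  # [3, 4, 4, 3]
```

Because everything is a multiple of 7 ms, the whole chirp can be described with small integers and a single length multiplier.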
For the speaker, I have a bunch of cheap PC motherboard beepers. They have a Dupont header that spans four pins on a Raspberry Pi Pico header, so if you put one on the ground pin at pin 23, the output will be connected to pin 26, aka GPIO 20:
# cricket thermometer simulator - scruss, 2024-02
# uses a buzzer on GPIO 20 to make cricket(ish) noises
# MicroPython - for Raspberry Pi Pico
# -*- coding: utf-8 -*-
from machine import Pin, PWM, ADC, freq
from time import sleep_ms, ticks_ms, ticks_diff
from random import seed, randrange
freq(125000000) # use default CPU freq
seed() # start with a truly random seed
pwm_out = PWM(Pin(20), freq=10, duty_u16=0) # can't do freq=0
led = Pin("LED", Pin.OUT)
sensor_temp = ADC(4) # adc channel for internal temperature
TOO_COLD = 10.0 # crickets don't chirp below 10 °C (allegedly)
temps = [] # for smoothing out temperature sensor noise
personal_freq_delta = randrange(400) - 199 # different pitch every time
chirp_data = [
# cadence, duty_u16, freq
# there is a cadence=1 silence after each of these
[3, 16384, 4568 + personal_freq_delta],
[4, 32768, 4824 + personal_freq_delta],
[4, 32768, 4824 + personal_freq_delta],
[3, 16384, 4568 + personal_freq_delta],
]
cadence_ms = 7 # length multiplier for playback
def chirp_period_ms(t_c):
    # for a given temperature t_c (in °C), returns the
    # estimated cricket chirp period in milliseconds.
    #
    # Based on
    # Dolbear, Amos (1897). "The cricket as a thermometer".
    # The American Naturalist. 31 (371): 970–971. doi:10.1086/276739
    #
    # The inverse function is:
    # t_c = (75000 / chirp_period_ms + 40) / 9
    return int(15000 / (9 * t_c / 5 - 8))
def internal_temperature(temp_adc):
    # see pico-micropython-examples / adc / temperature.py
    return (
        27
        - ((temp_adc.read_u16() * (3.3 / (65535))) - 0.706) / 0.001721
    )
def chirp(pwm_channel):
    for peep in chirp_data:
        pwm_channel.freq(peep[2])
        pwm_channel.duty_u16(peep[1])
        sleep_ms(cadence_ms * peep[0])
        # short silence
        pwm_channel.duty_u16(0)
        pwm_channel.freq(10)
        sleep_ms(cadence_ms)
led.value(0) # led off at start; blinks if chirping
### Start: pause a random amount (less than 2 s) before starting
sleep_ms(randrange(2000))
while True:
    loop_start_ms = ticks_ms()
    sleep_ms(5) # tiny delay to stop the main loop from thrashing
    temps.append(internal_temperature(sensor_temp))
    if len(temps) > 5:
        temps = temps[1:]
    avg_temp = sum(temps) / len(temps)
    if avg_temp >= TOO_COLD:
        led.value(1)
        loop_period_ms = chirp_period_ms(avg_temp)
        chirp(pwm_out)
        led.value(0)
        loop_elapsed_ms = ticks_diff(ticks_ms(), loop_start_ms)
        sleep_ms(loop_period_ms - loop_elapsed_ms)
There are a few more details in the code that I haven’t covered here:
The program pauses for a short random time on starting. This is to ensure that if you power up a bunch of these at the same time, they don’t start exactly synchronized
The Raspberry Pi Pico’s temperature sensor can be slightly noisy, so the chirping frequency is based on the average of (up to) the last five readings
There’s no chirping below 10 °C, because Amos Dolbear said so
The built-in LED also flashes if the board is chirping. It doesn’t mimic the speaker’s PWM cadence, though.
Before I show you the next video, I need to say: no real crickets were harmed in the making of this post. I took the bucket outside (roughly -5 °C) and the “crickets” stopped chirping as they cooled down. Don’t worry, they started back up chirping again when I took them inside.
“If You’re Cold They’re Cold, Bring Them Inside” (video description: a plastic box containing three USB power banks, each with USB cable leading to a Raspberry Pi Pico board. Each board has a small electromagnetic speaker attached between ground and a data pin)
First PostScript font: STSong (华文宋体) was released in 1991, making it the first PostScript font by a Chinese foundry [ref: Typekit blog — Pan-CJK Partner Profile: SinoType]. But STSong looks like Garamond(ish).
This is a mini celebratory post to say that I’ve fixed the database encoding problems on this blog. It looks like I will have to go through the posts manually to correct the errors still, but at least I can enter, store and display UTF-8 characters as expected.
“? µ ° × — – ½ ¾ £ é?êè”, he said with some relief.
Postmortem: For reasons I cannot explain or remember, the database on this blog flipped to an archaic character set: latin1, aka ISO/IEC 8859-1. A partial fix was effected by downloading the entire site’s database backup, and changing all the following references in the SQL:
For additional annoyance, the entire SQL dump was too big to load back into phpmyadmin, so I had to split it by table. Thank goodness for awk!
#!/usr/bin/awk -f
BEGIN {
outfile = "nothing.sql";
}
/^# Table: / {
# very special comment in WP backup that introduces a new table
# last field is table_name,
# which we use to create table_name.sql
t = $NF
gsub(/`/, "", t);
outfile = t ".sql";
}
{
print > outfile;
}
The data still appears to be confused. For example, in the post Compose yourself, Raspberry Pi!, what should appear as “That little key marked “Compose”” appears as “That little key marked “Composeâ€Â”. This isn’t a straight conversion of one character set to another. It appears to have been double-encoded, and wrongly too.
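For comparison, this is what a clean round of UTF-8 text being misread as latin1 looks like, and how it can be undone. It is a minimal Python sketch with made-up helper names, and it only covers the tidy case; the damage described above is messier than this and would not come back out so neatly:

```python
def mangle(text, rounds=1):
    # misread UTF-8 bytes as latin1, once per round
    for _ in range(rounds):
        text = text.encode("utf-8").decode("latin-1")
    return text


def repair(text, rounds=1):
    # reverse of mangle(); raises an error if the text was not mangled this way
    for _ in range(rounds):
        text = text.encode("latin-1").decode("utf-8")
    return text


once = mangle("é")
twice = mangle("é", rounds=2)
print(once)                     # Ã©
print(len(twice))               # 4 characters where there used to be one
print(repair(twice, rounds=2))  # é
```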
Still, at least I can now write again and have whatever new things I make turn up the way I like. Editing 20 years of blog posts awaits … zzz
My OpenProcessing demo “autumn in canada”, redone as a NAPLPS playback file. Yes, it would have been nice to have outlined leaves, but I’ve only got four colours to play with that are vaguely autumnal in NAPLPS’s limited 2-bit RGB.
NAPLPS — an almost-forgotten videotex vector graphics format with a regrettable pronunciation (/nap-lips/, no really) — was really hard to create. Back in the early days when it was a worthwhile Canadian initiative called Telidon (see Inter/Access’s exhibit Remember Tomorrow: A Telidon Story) it required a custom video workstation costing $$$$$$. It got cheaper by the time the 1990s rolled round, but it was never easy and so interest waned.
I don’t claim what I made is particularly interesting:
suspiciously canadian
but even decoding the tutorial and standards material was hard. NAPLPS made heavy use of bitfields interleaved and packed into 7 and 8-bit characters. It was kind of a clever idea (lower resolution data could be packed into fewer bytes) but the implementation is quite unpleasant.
A few of the references/tools/resources I relied on:
The 1983 BYTE Magazine article series NAPLPS: A New Standard for Text and Graphics. Also long and needlessly wordy, with digressions into extensions that were never implemented. Contains a commented byte dump of an image that explains most concepts by example
John Durno has spent years recovering Telidon / NAPLPS works. He has published many useful resources on the subject
Here’s the fragment of code I wrote to generate the NAPLPS:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# draw a disappointing maple leaf in NAPLPS - scruss, 2023-09
# stylized maple leaf polygon, quite similar to
# the coordinates used in the Canadian flag ...
maple = [
[62, 2],
[62, 35],
[94, 31],
[91, 41],
[122, 66],
[113, 70],
[119, 90],
[100, 86],
[97, 96],
[77, 74],
[85, 114],
[73, 108],
[62, 130],
[51, 108],
[39, 114],
[47, 74],
[27, 96],
[24, 86],
[5, 90],
[11, 70],
[2, 66],
[33, 41],
[30, 31],
[62, 35],
]
def colour(r, g, b):
    # r, g and b are limited to the range 0-3
    return chr(0o74) + chr(
        64
        + ((g & 2) << 4)
        + ((r & 2) << 3)
        + ((b & 2) << 2)
        + ((g & 1) << 2)
        + ((r & 1) << 1)
        + (b & 1)
    )
def coord(x, y):
    # if you stick with 256 x 192 integer coordinates this should be okay
    xsign = 0
    ysign = 0
    if x < 0:
        xsign = 1
        x = x * -1
        x = ((x ^ 255) + 1) & 255
    if y < 0:
        ysign = 1
        y = y * -1
        y = ((y ^ 255) + 1) & 255
    return (
        chr(
            64
            + (xsign << 5)
            + ((x & 0xC0) >> 3)
            + (ysign << 2)
            + ((y & 0xC0) >> 6)
        )
        + chr(64 + ((x & 0x38)) + ((y & 0x38) >> 3))
        + chr(64 + ((x & 7) << 3) + (y & 7))
    )
f = open("maple.nap", "w")
f.write(chr(0x18) + chr(0x1B)) # preamble
f.write(chr(0o16)) # SO: into graphics mode
f.write(colour(0, 0, 0)) # black
f.write(chr(0o40) + chr(0o120)) # clear screen to current colour
f.write(colour(3, 0, 0)) # red
# *** STALK ***
f.write(
chr(0o44) + coord(maple[0][0], maple[0][1])
) # point set absolute
f.write(
chr(0o51)
+ coord(maple[1][0] - maple[0][0], maple[1][1] - maple[0][1])
) # line relative
# *** LEAF ***
f.write(
chr(0o67) + coord(maple[1][0], maple[1][1])
) # set polygon filled
# append all the relative leaf vertices
for i in range(2, len(maple)):
    f.write(
        coord(
            maple[i][0] - maple[i - 1][0], maple[i][1] - maple[i - 1][1]
        )
    )
f.write(chr(0x0F) + chr(0x1A)) # postamble
f.close()
There are a couple of perhaps useful routines in there:
colour(r, g, b) spits out the code for two bits per component RGB. Inputs are limited to the range 0–3 without error checking
coord(x, y) converts integer coordinates to a NAPLPS output stream. Best limited to a 256 × 192 size. Will also work with positive/negative relative coordinates.
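To make the bit packing in coord(x, y) easier to follow, here is a sketch that pairs it with a decoder. The coord() below is a condensed copy of the routine above; decode_coord() is a hypothetical helper written for this example by reversing that routine, not something taken from the NAPLPS standard:

```python
def coord(x, y):
    # condensed copy of the routine above: three characters per coordinate pair
    xsign = ysign = 0
    if x < 0:
        xsign = 1
        x = ((abs(x) ^ 255) + 1) & 255  # 8-bit two's complement
    if y < 0:
        ysign = 1
        y = ((abs(y) ^ 255) + 1) & 255
    return (
        chr(64 + (xsign << 5) + ((x & 0xC0) >> 3) + (ysign << 2) + ((y & 0xC0) >> 6))
        + chr(64 + (x & 0x38) + ((y & 0x38) >> 3))
        + chr(64 + ((x & 7) << 3) + (y & 7))
    )


def decode_coord(s):
    # hypothetical inverse of coord(): the low six bits of each character carry data
    b0, b1, b2 = (ord(c) & 0x3F for c in s)
    x = ((b0 & 0x18) << 3) | (b1 & 0x38) | (b2 >> 3)
    y = ((b0 & 0x03) << 6) | ((b1 & 0x07) << 3) | (b2 & 0x07)
    if b0 & 0x20:  # x sign bit
        x -= 256
    if b0 & 0x04:  # y sign bit
        y -= 256
    return x, y


print(coord(62, 2))                  # @xr
print(decode_coord(coord(-32, 33)))  # (-32, 33)
```

Each character holds three bits of x and three bits of y, most significant bits first, which is the interleaving described earlier.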
After remarkable success with the SYN-6988 TTS module, then somewhat less success with the SYN-6658 and other modules, I didn’t hold out much hope for the YuTone SYN-6288, which – while boasting a load of background tunes that could play over speech – can only convert Chinese text to speech.
The wiring is similar to the SYN-6988: a serial UART connection at 9600 baud, plus a Busy (BY) line to signal when the chip is busy. The serial protocol is slightly more complicated, as the SYN-6288 requires a checksum byte at the end.
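The checksum is a plain XOR over every byte that precedes it. Here is a minimal sketch in ordinary Python, with no hardware needed, of building and checking such a frame. The frame layout is inferred from the script in this post rather than from the datasheet, and treating the second byte as the high byte of the length is an assumption:

```python
from functools import reduce
from operator import xor


def syn6288_frame(text, command=0x01, param=0x01):
    # header, 2-byte length, command, parameter, text, XOR checksum
    data = text.encode("utf-8")
    length = len(data) + 3  # command + parameter + checksum
    frame = bytearray([0xFD, length >> 8, length & 0xFF, command, param]) + data
    frame.append(reduce(xor, frame))  # XOR of everything so far
    return bytes(frame)


frame = syn6288_frame("[v10][x1]sounda")
print(frame.hex(" "))
# XOR over a complete frame, checksum included, comes out as zero
print(reduce(xor, frame))  # 0
```

The zero result is a handy self-check: any single corrupted byte makes it non-zero.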
As I’m not interested in the text-to-speech output itself, here’s a MicroPython script to play all of the sounds:
# very crude MicroPython demo of SYN6288 TTS chip
# scruss, 2023-07
import machine
import time
### setup device
ser = machine.UART(
0, baudrate=9600, bits=8, parity=None, stop=1
) # tx=Pin(0), rx=Pin(1)
busyPin = machine.Pin(2, machine.Pin.IN, machine.Pin.PULL_UP)
def sendspeak(u2, data, busy):
    # modified from https://github.com/TPYBoard/TPYBoard_lib/
    # u2 = UART(uart, baud)
    eec = 0
    buf = [0xFD, 0x00, 0, 0x01, 0x01]
    # buf = [0xFD, 0x00, 0, 0x01, 0x79] # plays with bg music 15
    buf[2] = len(data) + 3
    buf += list(bytearray(data, "utf-8"))
    for i in range(len(buf)):
        eec ^= int(buf[i])
    buf.append(eec)
    u2.write(bytearray(buf))
    while busy.value() != True:
        # wait for busy line to go high
        time.sleep_ms(5)
    while busy.value() == True:
        # wait for it to finish
        time.sleep_ms(5)
for s in "abcdefghijklmnopqrstuvwxy":
    playstr = "[v10][x1]sound" + s
    print(playstr)
    sendspeak(ser, playstr, busyPin)
    time.sleep(2)
for s in "abcdefgh":
    playstr = "[v10][x1]msg" + s
    print(playstr)
    sendspeak(ser, playstr, busyPin)
    time.sleep(2)
for s in "abcdefghijklmno":
    playstr = "[v10][x1]ring" + s
    print(playstr)
    sendspeak(ser, playstr, busyPin)
    time.sleep(2)
Each sound starts and stops with a very loud click, and the sound quality is not great. I couldn’t get a good recording of the sounds (some of which are over a minute long) as the only way I could get reliable audio output was through tiny headphones. Any recording came out hopelessly distorted:
I’m not too disappointed that this didn’t work well. I now know that the SYN-6988 is the good one to get. It also looks like I may never get to try the XFS5152CE speech synthesizer board: AliExpress has cancelled my shipment for no reason. It’s supposed to have some English TTS function, even if quite limited.
Here’s the auto-translated SYN-6288 manual, if you do end up finding a use for the thing
Yup, it’s another “let’s wire up a SYN6988 board” thing, this time for MMBasic running on the Armmite STM32F407 Module (aka ‘Armmite F4’). This board is also known as the BLACK_F407VE, which also makes a nice little MicroPython platform.
Uh, let’s not dwell too much on how the SYN6988 seems to parse 19:51 as “91 minutes to 20” …
Wiring
SYN6988 | Armmite F4
RX | PA09 (COM1 TX)
TX | PA10 (COM1 RX)
RDY | PA08
your choice of 3.3 V and GND connections, of course
Yes, I know it says it’s an XFS5152, but I got a SYN6988 and it seems to be about as reliable a source as one can find. The board is marked YS-V6E-V1.03, and even mentions SYN6988 on the rear silkscreen:
Code
REM SYN6988 speech demo - MMBasic / Armmite F4
REM scruss, 2023-07
OPEN "COM1:9600" AS #5
REM READY line on PA8
SETPIN PA8, DIN, PULLUP
REM you can ignore font/text commands
CLS
FONT 1
TEXT 0,15,"[v1]Hello - this is a speech demo."
say("[v1]Hello - this is a speech demo.")
TEXT 0,30,"[x1]soundy[d]"
say("[x1]soundy[d]"): REM chimes
TEXT 0,45,"The time is "+LEFT$(TIME$,5)+"."
say("The time is "+LEFT$(TIME$,5)+".")
END
SUB say(a$)
LOCAL dl%,maxlof%
REM data length is text length + 2 (for the 1 and 0 bytes)
dl%=2+LEN(a$)
maxlof%=LOF(#5)
REM SYN6988 simple data packet
REM byte 1 : &HFD
REM byte 2 : data length (high byte)
REM byte 3 : data length (low byte)
REM byte 4 : &H01
REM byte 5 : &H00
REM bytes 6-: ASCII string data
PRINT #5, CHR$(&hFD)+CHR$(dl%\256)+CHR$(dl% MOD 256)+CHR$(1)+CHR$(0)+a$;
DO WHILE LOF(#5)<maxlof%
REM pause while sending text
PAUSE 5
LOOP
DO WHILE PIN(PA8)<>1
REM wait until RDY is high
PAUSE 5
LOOP
DO WHILE PIN(PA8)<>0
REM wait until SYN6988 signals READY
PAUSE 5
LOOP
END SUB
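For readers who don’t speak MMBasic, the packet described in the REM comments above can be sketched in Python. The function name is made up for this example, and the layout comes only from those comments: there is no checksum, and the length counts the two bytes after it plus the text.

```python
def syn6988_frame(text):
    # &HFD, length high byte, length low byte, &H01, &H00, then the ASCII text
    data = text.encode("ascii")
    length = len(data) + 2  # the 1 and 0 bytes plus the text
    return bytes([0xFD, length // 256, length % 256, 0x01, 0x00]) + data


print(syn6988_frame("[x1]soundy[d]").hex(" "))
```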
The other week’s success with the SYN6988 TTS chip was not repeated with three other modules I ordered, alas. Two of them I couldn’t get a peep out of; the other didn’t support English text-to-speech.
SYN6658
This one looks remarkably like the SYN6988:
Yes, I added the 6658 label so I could tell the boards apart
Apart from the main chip, the only difference appears to be that the board’s silkscreen says YS-V6 V1.15 where the SYN6988’s said YS-V6E V1.02.
To be fair to YuTone (the manufacturer), they claim this only supports Chinese as an input language. If you feed it English, at best you’ll get it spelling out the letters. It does have quite a few amusing sounds, though, so at least you can make it beep and chime. My MicroPython library for the VoiceTX SYN6988 text to speech module can drive it as far as I understand it.
I’ve still got a SYN6288 to look at, plus an XFS5152CE TTS that may or may not be in the mail. The SYN6988 is the best of the bunch so far.
I have a bunch of other boards on order to see if the other chips (SYN6288, SYN6658, XFS5152) work in the same way. I really wonder which I’ll end up receiving!
Update (2023-07-09): Got the SYN6658. It does not support English TTS and thus is not recommended. It does have some cool sounds, though.
Embedded Text Command Sound Table
The github repo references Embedded text commands, but all of the sound references were too difficult to paste into a table there. So here are all of the ones that the SYN-6988 knows about:
Name is the string you use to play the sound, eg: [x1]sound101
Alias is an alternative name by which you can call some of the sounds. This is for better compatibility with the SYN6288 apparently. So [x1]sound101 is exactly the same as specifying [x1]sounda
Type is the sound description from the manual. Many of these are blank
Link is a playable link for a recording of the sound.
I’ve had one of these cheap(ish – $15) sound modules from AliExpress for a while. I hadn’t managed to get much out of it before, but I poked about at it a little more and found I was trying to drive the wrong chip. Aha! Makes all the difference.
Sensitive listener alert! There is a static click midway through. I edited out the clipped part, but it’s still a little jarring. It would always do this at the same point in playback, for some reason.
The only Pythonish code I could find for these chips was meant for the older SYN6288 and MicroPython (syn6288.py). I have no idea what I’m doing, but with some trivial modification, it makes sound.
I used the simple serial UART connection: RX -> TX, TX -> RX, 3V3 to 3V3 and GND to GND. My board is hard-coded to run at 9600 baud. I used the USB serial adapter that came with the board.
Here’s the code that read that text:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import serial
import time
# NB via MicroPython and old too! Also for a SYN6288, which I don't have
# nabbed from https://github.com/TPYBoard/TPYBoard_lib/
def sendspeak(port, data):
    eec = 0
    buf = [0xFD, 0x00, 0, 0x01, 0x01]
    buf[2] = len(data) + 3
    buf += list(bytearray(data, encoding='utf-8'))
    for i in range(len(buf)):
        eec ^= int(buf[i])
    buf.append(eec)
    port.write(bytearray(buf))
ser = serial.Serial("/dev/ttyUSB1", 9600)
sendspeak(ser, "[t5]I like to think [p100](it [t7]has[t5] to be!)[p100] of a cybernetic ecology [p100]where we are free of our labors and joined back to nature, [p100]returned to our mammal brothers and sisters, [p100]and all watched over by machines of loving grace")
time.sleep(8)
ser.close()
This code is bad. All I did was prod stuff until it stopped not working. Since all I have to work from includes a datasheet in Chinese, there’s lots of stuff I could do better. I used the tone and pause tags to give the reading a little more life, but it’s still a bit flat. For $15, though, a board that makes a fair stab at reading English is not bad at all. We can’t all afford vintage DECtalk hardware.
The one thing I didn’t do is use the SYN6988’s Busy/Ready line to see if it was still busy reading. That would mean I could send it text as soon as it was ready, rather than pausing for 8 seconds after the speech. This refinement will come later, most likely when I port this to MicroPython.
It’s now possible to build and run the DECtalk text to speech system on Linux. It even builds under emscripten, enabling DECtalk for Web in your browser. You too can annoy everyone within earshot making it prattle on about John Madden.
But DECtalk can sing! Because it’s been around so long, there are huge archives of songs in DECtalk format out there. The largest archive is at THE FLAME OF HOPE website, under the Dectalk section.
Building DECtalk songs isn’t easy, especially for a musical numpty like me. You need a decent grasp of music notation, phonemic/phonetic markup and patience with DECtalk’s weird and ancient text formats.
DECtalk phonemes
While DECtalk can accept text and turn it into a fair approximation of spoken English, for singing you have to use phonemes. Let’s say we have a solfège-ish major scale:
DECtalk uses a variant on the ARPABET convention to represent IPA symbols as ASCII text. The initial consonant sounds remain as you might expect: D, R, M, F, S, L and T. The vowel sounds, however, are much more complex. This will give us a DECtalk-speakable phrase:
[dow rey miy faa sow laa tiy dow].
Note the opening and closing brackets and the full stop at the end. The brackets introduce phonemes, and the full stop tells DECtalk that the text is at an end. Play it in the DECtalk for Web window and be unimpressed: while the pitch changes are non-existent, the sounds are about right.
If you want to have a rough idea of what the phonemes in a phrase might be, you can use DECtalk’s :log phonemes option. You might still have to massage the input and output a bit, like using sed to remove language codes:
say -l us -pre '[:log phonemes on]' -post '[:log phonemes off]' -a "doe ray me fah so lah tea doe" | sed 's/us_//g;'
d ' ow r ' ey m iy f ' aa) s ow ll' aa t ' iy d ' ow.
Music notation
To me — a not very musical person — staff notation looks like it was designed by a maniac. A more impractical system to indicate arrangement of notes and their durations I don’t think I could come up with: and yet we’re stuck with it.
DECtalk uses a series of numbered pitches plus durations in milliseconds for its singing mode. The notes (1–37) correspond to C2 to C5. If you’re familiar with MIDI note numbers, DECtalk’s 1–37 correspond to MIDI note numbers 36–72. This is how DECtalk’s pitch numbers would look as major scales on the treble clef:
The entire singing range of DECtalk as a C Major scale, from note 1 (C2, 65.4 Hz) to note 37 (C5, 523.3 Hz)
I’m not sure browsers can play MIDI any more, but here you go (doremi-abc.mid):
and since I had to learn abc notation to make these noises, here is the source:
%abc-2.1
X:1
T:Do Re Mi
C:Trad.
M:4/4
L:1/4
Q:1/4=120
K:C
%1
C,, D,, E,, F,,| G,, A,, B,, C,| D, E, F, G,| A, B, C D| E F G A| B c z2 |]
w:do re mi fa sol la ti do re mi fa sol la ti do re mi fa sol la ti do
Each element of a DECtalk song takes the following form:
phoneme <duration, pitch number>
The older DTC-03 manual hints that it takes around 100 ms for DECtalk to hit pitch, so for each ½ second utterance (or quarter note at 120 bpm, ish), I split it up as:
100 ms of the initial consonant;
337 ms of the vowel sound;
63 ms of pause (which has the phoneme code “_”). Pauses don’t need pitch numbers, unless you want them to preempt DECtalk’s pitch-change algorithm.
So the three lowest notes in the major scale would sing as:
You can paste that into the DECtalk browser window, or run the following from the command line on Linux:
say -pre '[:PHONE ON]' -a '[d<100,1>ow<337,1>_<63>r<100,3>ey<337,3>_<63>m<100,5>iy<337,5>_<63>f<100,6>aa<337,6>_<63>s<100,8>ow<337,8>_<63>l<100,10>aa<337,10>_<63>t<100,12>iy<337,12>_<63>d<100,13>ow<337,13>_<63>r<100,15>ey<337,15>_<63>m<100,17>iy<337,17>_<63>f<100,18>aa<337,18>_<63>s<100,20>ow<337,20>_<63>l<100,22>aa<337,22>_<63>t<100,24>iy<337,24>_<63>d<100,25>ow<337,25>_<63>r<100,27>ey<337,27>_<63>m<100,29>iy<337,29>_<63>f<100,30>aa<337,30>_<63>s<100,32>ow<337,32>_<63>l<100,34>aa<337,34>_<63>t<100,36>iy<337,36>_<63>d<100,37>ow<337,37>_<63>].'
It sounds like this:
Singing a scale is hardly singing a tune, but hey, you were warned that this was a terrible guide at the outset. I hope I’ve given you a start on which you can build your own songs.
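Typing out all those angle brackets by hand gets old quickly, so here is a rough Python sketch that builds the sung scale from the 100 / 337 / 63 ms split described above. The function names are invented for this example, and it only handles syllables made of one consonant and one vowel phoneme:

```python
# (consonant, vowel) phoneme pairs for do re mi fa sol la ti
SYLLABLES = [("d", "ow"), ("r", "ey"), ("m", "iy"), ("f", "aa"),
             ("s", "ow"), ("l", "aa"), ("t", "iy")]
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic


def scale_notes(lowest=1, highest=37):
    # DECtalk pitch numbers of a major scale from lowest up to highest
    notes = []
    octave = 0
    while True:
        for step in MAJOR_STEPS:
            note = lowest + 12 * octave + step
            if note > highest:
                return notes
            notes.append(note)
        octave += 1


def sing(notes, consonant_ms=100, vowel_ms=337, pause_ms=63):
    # one syllable per note: consonant, vowel, then an unpitched pause
    parts = []
    for i, note in enumerate(notes):
        c, v = SYLLABLES[i % len(SYLLABLES)]
        parts.append(f"{c}<{consonant_ms},{note}>{v}<{vowel_ms},{note}>_<{pause_ms}>")
    return "[" + "".join(parts) + "]."


print(sing(scale_notes()))
```

The printed text should match the phoneme string passed to say above, ready to paste into the DECtalk browser window.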
(One detail I haven’t tried yet: the older DTC-03 manual hints that singing notes can take Hz values instead of pitch numbers, and apparently loses the vibrato effect. It’s not that hard to convert from a note/octave to a frequency. Whether this still works, I don’t know.)
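That conversion can be sketched in a few lines of Python. This assumes equal temperament with A4 = 440 Hz and uses the MIDI mapping given earlier (DECtalk note 1 is MIDI note 36); the function names are made up, and whether DECtalk rounds frequencies the same way is not something this checks:

```python
import re


def note_to_hz(note):
    # DECtalk pitch number (1-37) to the nearest whole Hz
    midi = note + 35
    return round(440 * 2 ** ((midi - 69) / 12))


def notes_to_hz(song):
    # rewrite every <duration,pitch> pair; pauses like _<63> have no comma and are left alone
    return re.sub(
        r"<(\d+),(\d+)>",
        lambda m: "<%s,%d>" % (m.group(1), note_to_hz(int(m.group(2)))),
        song,
    )


print(note_to_hz(1), note_to_hz(34), note_to_hz(37))  # 65 440 523
print(notes_to_hz("[d<100,1>ow<337,1>_<63>r<100,3>ey<337,3>_<63>]"))
# [d<100,65>ow<337,65>_<63>r<100,73>ey<337,73>_<63>]
```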
This post from Patrick Perdue suggested to me I had to dig into the Hz value substitution because the results are so gloriously awful. Of course, I had to write a Perl regex to make the conversions from DECtalk 1–37 sung notes to frequencies from 65–523 Hz:
(as one does). So the sung scale ends up as this non-vibrato text:
say -pre '[:PHONE ON]' -a '[d<100,65>ow<337,65>_<63>r<100,73>ey<337,73>_<63>m<100,82>iy<337,82>_<63>f<100,87>aa<337,87>_<63>s<100,98>ow<337,98>_<63>l<100,110>aa<337,110>_<63>t<100,123>iy<337,123>_<63>d<100,131>ow<337,131>_<63>r<100,147>ey<337,147>_<63>m<100,165>iy<337,165>_<63>f<100,175>aa<337,175>_<63>s<100,196>ow<337,196>_<63>l<100,220>aa<337,220>_<63>t<100,247>iy<337,247>_<63>d<100,262>ow<337,262>_<63>r<100,294>ey<337,294>_<63>m<100,330>iy<337,330>_<63>f<100,349>aa<337,349>_<63>s<100,392>ow<337,392>_<63>l<100,440>aa<337,440>_<63>t<100,494>iy<337,494>_<63>d<100,523>ow<337,523>_<63>].'
That doesn’t sound as wondrously terrible as it should, most probably as there are very small differences between each sung word. So how about we try something better? Like the refrain from The Turtles’ Happy Together, as posted on TheFlameOfHope: