The Joy of BirdNetPi

I don’t think I’ve had as much enjoyment from a piece of software for a very long time as I’ve had with BirdNET-Pi. It’s a realtime acoustic bird classification system for the Raspberry Pi. It listens through a microphone you place somewhere near where you can hear birds, and it’ll go off and guess what it’s hearing, using a cut-down version of the BirdNET Sound ID model. It does this 24/7, and saves the samples it hears. You can then go to a web page (running on the same Raspberry Pi) and look up all the species it has heard.

Our Garden

a somewhat overgrown garden with budding green trees against blue sky

Not very impressive, kind of overgrown, in the wrong part of town. Small, too. But birds love it. At this time of year, it’s alive with birds. You can’t make them out, but there’s a pair of Rose-breasted Grosbeaks happily snacking near the top of the big tree. There are conifers next door too, so we get birds we wouldn’t expect.

We are next to two busy subway/train stations, and in between two schools. There’s a busy intersection nearby, too. Consequently, the background noise is horrendous.

What I used

This was literally “stuff I had lying around”:

  • Raspberry Pi 3B+ (with power supply, case, thermostatic fan and SD card)
  • USB extension cable (this, apparently, is quite important to isolate the USB audio device from electrical noise)
  • Horrible cheap USB sound card: I paid about $2 for a “3d sound” thing about a decade ago. It records in mono. It works. My one is wrapped in electrical tape as the case keeps threatening to fall off, plus it has a hugely bright flashing LED that is annoying.
  • Desktop mic (circa 2002): before video became a thing, PCs had conferencing microphones. I think I got this one free with a PC over 20 years ago. It’s entirely unremarkable and is not an audiophile device. I stuck it out a back window and used a strip of gaffer tape to stop bugs getting in. It’s not waterproof, but it didn’t rain the whole week it was out the window.
  • Raspberry Pi OS Lite 64-bit. Yes, it has to be 64 bit.
  • BirdNET-Pi installation on top.

I spent very little time optimizing this. I had to fiddle with microphone gain slightly. That’s all.

What I heard

To the best of my knowledge, I have actual observations of 30 species, observed between May 7th and May 16th 2023:

American Goldfinch, American Robin, Baltimore Oriole, Blue Jay, Cedar Waxwing, Chimney Swift, Clay-colored Sparrow, Common Grackle, Common Raven, Gray Catbird, Hermit Thrush, House Finch, House Sparrow, Killdeer, Mourning Dove, Nashville Warbler, Northern Cardinal, Northern Parula, Orchard Oriole, Ovenbird, Red-winged Blackbird, Ring-billed Gull, Rose-breasted Grosbeak, Ruby-crowned Kinglet, Song Sparrow, Veery, Warbling Vireo, White-throated Sparrow, White-winged Crossbill, Wood Thrush

I’ll put the recordings at the end of this post. Note, though, they’re noisy: Cornell Lab quality they ain’t.

What I learned

This is the first time that I’ve let an “AI” classifier model run with no intervention. If it flags some false positives, then it’s pretty low-stakes when it’s wrong. And how wrong did it get some things!

allegedly a Barred Owl, this is clearly a two-stroke leafblower
Black-Billed Cuckoo? How about kids playing in the school yard?
Emergency vehicles are Common Loons now, according to BirdNetPi
Police cars at 2:24 am are Eastern Screech-Owls. I wonder if we could use this classifier to detect over-policed, under-served neighbourhoods?
Great Black-backed Gulls, or kids playing? The latter
Turkey Vulture? How about a very farty two-stroke engine in a bicycle frame driving past?
(This thing stinks out the street, blecch)

There are also false positives for Trumpeter Swans (local dog) and Tundra Swans (kids playing). These samples had recognizable voices, so I didn’t include them here.

The 30 positive species identifications

Many of these have a fairly loud click at the start of the sample, so mind your ears.

American Goldfinch

American Robin

Baltimore Oriole

(I dunno what’s going on here; the next sample’s much more representative)

Blue Jay

Cedar Waxwing

Chimney Swift

Clay-colored Sparrow

Common Grackle

Common Raven

Gray Catbird

Hermit Thrush

House Finch

House Sparrow

Killdeer

Mourning Dove

Nashville Warbler

Northern Cardinal

Hey, we’ve got both of the repetitive songs that these little doozers chirp out all day. Song 1:

and song 2 …

Northern Parula

Orchard Oriole

Ovenbird

Red-winged Blackbird

Ring-billed Gull

Rose-breasted Grosbeak

Ruby-crowned Kinglet

Song Sparrow

Veery

Warbling Vireo

White-throated Sparrow

White-winged Crossbill

Wood Thrush

Boring technical bit

BirdNetPi doesn’t create combined spectrograms with audio as a single video file. What it does do is create an mp3 plus a PNG of the spectrogram. ffmpeg can make a nice not-too-large webm video for sharing:

ffmpeg -loop 1 -y -i 'birb.mp3.png' -i 'birb.mp3' -ac 1 -crf 48 -vf scale=720:-2 -shortest 'birb.webm'
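If you have a whole folder of detections to share, a loop saves some typing. This is only a sketch: it assumes every recording has its spectrogram saved alongside it as NAME.mp3.png (the naming used in the command above). The function name make_webms and the FFMPEG override variable are made up for this example and are not part of BirdNET-Pi.

```shell
#!/bin/bash
# Sketch: convert every mp3 + spectrogram PNG pair in a folder to webm.
# FFMPEG is a hypothetical override; set FFMPEG=echo for a dry run.
FFMPEG=${FFMPEG:-ffmpeg}

make_webms() {
    local dir=${1:-.} mp3 png
    for mp3 in "$dir"/*.mp3; do
        png="$mp3.png"    # assumed naming: spectrogram sits next to the mp3
        # skip recordings without a matching spectrogram (and empty folders)
        [ -f "$mp3" ] && [ -f "$png" ] || continue
        "$FFMPEG" -loop 1 -y -i "$png" -i "$mp3" -ac 1 -crf 48 \
            -vf scale=720:-2 -shortest "${mp3%.mp3}.webm"
    done
}
```

Call it as make_webms some/folder. Running it with FFMPEG=echo first prints the commands without encoding anything, which is a cheap way to check the file names.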

(Minor update, May 2024: the original project maintainer has moved on, so I changed the project link to point to Nachtzuster/BirdNET-Pi: A realtime acoustic bird classification system for the Raspberry Pi 5, 4B 3B+ 0W2 and more. Built on the TFLite version of BirdNET.)

An hour of Pink Noise

cover made by netpbm, of course
an hour of soothing 2-channel noise

Direct download: 01-pink_noise.mp3

There are a million variations on the simple “use sox to play masking pink noise”, such as:

play -n synth pinknoise gain -3

This will play synthesized pink noise until you hit Ctrl-C.

But if you want two independent noise channels rather than mono, that’s a little more complex. It’s probably easier to download/play the MP3 file above than show you the command line.

Note that MP3s really aren’t designed to encode such random data, and it’s likely that your player will cause the audio to clip in a couple of places. I’m not quite sure why it does this, but it does it repeatably.

If you want to create this for yourself (and create a bonus lossless FLAC, which was far too large to upload here), here’s what I did to make this:

#!/bin/bash

duration='60:00'
fade='1'
outfile='pinknoise.wav'

# make the track
sox --combine merge "|sox --norm=-3 -c 1 -b 16 -r 44100 -n -p synth $duration pinknoise" "|sox --norm=-3 -c 1 -b 16 -r 44100 -n -p synth $duration pinknoise" -c 2 -b 16 -r 44100 $outfile fade $fade fade 0 $duration $fade gain -n -3

# make the cover
# 1 - text - 500 x 500 px
pnmcat -white -tb <(pbmmake -white 500 114) <(pbmtextps -font HelveticaBold -fontsize 64 -resolution 180 "PINK" | pnmcrop) <(pbmmake -white 32 32) <(pbmtextps -font HelveticaBold -fontsize 64 -resolution 180 "NOISE" | pnmcrop) <(pbmmake -white 500 114) > cover-text.pbm
# 2 - make the noise bg
pgmnoise 500 500 > cover-noise.pgm
# 3 - make the magenta text
ppmchange black magenta cover-text.pbm > cover-text-magenta.ppm
# 4 - overlay with transparency
pnmcomp -alpha=<(pnminvert cover-text.pbm | pbmtopgm 35 35 ) cover-text-magenta.ppm cover-noise.pgm | cjpeg -qual 50 -opt -baseline -dct float > cover.jpg
# delete the temporary image files, leaving cover.jpg
rm -f cover-text.pbm cover-noise.pgm cover-text-magenta.ppm

# make the mp3
lame -V 2 --noreplaygain -m s --tt 'Pink Noise' --ta 'Pink Noise' --tl 'Pink Noise' --ty $(date +%Y) --tc "scruss, 2021-05" --tn 1/1 --tg Ambient --ti cover.jpg "$outfile" 01-pink_noise.mp3

# make the flac (and delete wav file)
flac --best --output-name=01-pink_noise.flac --delete-input-file --picture=cover.jpg --tag="TITLE=Pink Noise" --tag="ARTIST=Pink Noise" --tag="ALBUM=Pink Noise" --tag="DATE=$(date +%Y)" --tag="COMMENT=scruss, 2021-05" --tag="GENRE=Ambient" --tag="TRACKNUMBER=1" --tag="TRACKTOTAL=1" "$outfile"

You’ll likely need these packages installed:

sudo apt install sox libsox-fmt-all ghostscript gsfonts-x11 netpbm lame flac libjpeg-progs

Happy Birthday Alvin Lucier

Happy Birthday Alvin Lucier – spectrogram of ten generations of re-recordings

Just over 50 years ago, Alvin Lucier turned on his tape recorder and started to recite. This wasn’t an entirely unusual thing to do, but what he did next was a little different: he played that tape back, while recording it on another device. He kept doing this until all there was left was the ringing resonant frequency of the room, his voice smeared out of any recognizable sound. He called this piece I am sitting in a room, and it’s still a stunning work of sound art.

Alvin Lucier just turned 90 years old, so in recognition, I made this:

Many Happy Returns: for Alvin Lucier at 90

I don’t have a quiet room with two tape decks, but I do have a large plastic tote in which I can fit a whole bunch of battery-powered recording gubbins:

computer-based recording and playback equipment in plastic tote
my studio in a box. From L to R: Raspberry Pi 3A+, backup audio recorder, microphone stand, battery pack, microphone, USB audio interface, kalimba (for added resonance) and bluetooth speaker

After setting everything up and running, I put the box on my bed and covered it with several layers of blankets to keep our noisy neighbourhood sounds out. The initial audio was made in PicoTTS, and then playback and recording were controlled over wifi and VNC. I (or more accurately, the bash script I wrote) made 90 generations of recordings. I’ve only included the first 10 due to time constraints.

What I did

I guess I’ve been lucky with whatever audio system Raspberry Pi OS is using, because it recognized and used both my ancient Griffin iMic USB microphone adapter and my Sony Bluetooth speaker as default input/output devices.

One annoyance was having to build PicoTTS from source. Raspberry Pi OS doesn’t provide packages for it, but does have the source package. Building it goes something like this: Compile Pico TTS on Raspbian. You might prefer trying flite or espeak-ng, both of which have binary packages available.

I used this script, which may be rather too specific to my particular goal:

#!/bin/bash

parent=/home/pi/Desktop/sitting_in_a_box
gen0="$parent/box0.wav"
dest=${1:-$(date +%y%m%d%H%M)}
outdir="$parent/$dest"
mkdir -p "$outdir" && echo Created "$outdir"
n=0
outfile=$(printf "%s/%s_%03d.wav" "$outdir" "$dest" "$n")
tmpfile=$(printf "%s/%s_TMP.wav" "$outdir" "$dest")

# copy source file
sox -q --norm=-3 "$gen0" "$outfile"
while
    [ ! -f "$parent/STOP" ]
do
    infile="$outfile"
    n=$((n + 1))
    outfile=$(printf "%s/%s_%03d.wav" "$outdir" "$dest" "$n")
    echo Recording "$outfile" . . .
    arecord -f cd -q "$tmpfile" &
    rec_pid=$!
    echo Playing "$infile" . . .
    aplay -q "$infile"
    sleep 0.5
    kill -SIGINT "$rec_pid"
    echo Normalizing "$tmpfile" to "$outfile" . . .
    sox --norm=-3 "$tmpfile" -c 2 "$outfile" && rm "$tmpfile"
    echo ""
done
echo '***' Process stopped after "$n" iterations.

Some notes on the code:

  1. the script creates a folder for the output files, either as specified on the command line, or if not, from the current date/time
  2. I didn’t really know how many iterations I’d want, so I monitored occasional files by pulling them down using scp and listening. When I was satisfied, I’d create a file called STOP in the project folder that would make the loop exit
  3. $gen0 is the name of the source recording
  4. sox -q --norm=-3 infile outfile is a way of normalizing the audio so that it won’t clip
  5. We start the recording in the background (arecord … &) and immediately grab its process ID using $!. This allows us to stop the recording using kill after the audio file has played. arecord doesn’t update the file header when a process is stopped like this, so sox will complain quietly every time
  6. The sleep 0.5 after aplay is to prevent the recording cutting off. Both aplay and sox/play seem to exit as soon as the last block of audio data has been sent to the audio device, and not when the sound has stopped playing. This means you have to edit out increasingly longer gaps from your recordings, but at least you get everything.

What I’d do differently next time

  • Probably not used a Bluetooth speaker. Sure, they’re self-powered and you can’t complain about the number of wires, but they’re noisy and low quality
  • Done more work on the microphone interface. The pair of binaural mics I used to use with the Griffin interface (it’s stereo, unlike most USB interfaces) had a dead channel, so I had to fish around for other microphones. Somewhere I have a nice microphone from a Sony Pro Walkman, and I should have used that
  • Maybe considered a band-pass filter to cut the very high frequency ringing. Yes, I know the whole point of I am sitting in a room is to capture the room dynamics, but when you’re recording in a tiny space with very hard walls, the dynamics take over almost too quickly to be interesting
  • Spent more time on playback and record levels beyond the “hey, it works!” settings I used here
  • Used a higher bitrate recording. This might have resulted in more wolf-tones and ringing, though.

Ringing like 1984: Western Electric “Princess”

Western Electric “Princess” compact telephone from 1984

I got this phone at a junk swap event. It had a broken handset jack, but I got a replacement from OldPhoneWorks.

It has a distinctive, loud ring:

(Alternative Freesound link: Western Electric Princess Telephone Ringing)

That’s a lot of noise from a small phone!

Western Electric “Princess” compact telephone – base. Note mid-1984 production date: after the US Bell breakup

If you want the ringtone for your phone, here it is as an Ogg file for Android: WesternElectric-Princess_Ring-mobile.zip

Western Electric “Princess” Telephone Ringing
Recording © 2018, Stewart C. Russell — scruss.com

provided under the Creative Commons — Attribution 2.5 
Canada — CC BY 2.5 CA licence:
https://creativecommons.org/licenses/by/2.5/ca/

Work-in-progress: Sayso Globord audio decoding

You may still be able to find Sayso Globord programmable LED signs in surplus stores. It’s a 7×24 LED scrolling sign that you can program with a lightpen or with audio input.

sayso-001

The unit comes with no software, but has a link to https://www.dropbox.com/sh/q1q9yhahwtblb23/AACpMeXQjYyD8ZWC-65vNgcxa printed on the box. It’s an archive of the programming software, manual, and canned audio files for a whole bunch of standard messages. Here’s an archive if the dropbox link goes away: SaySo.zip

The audio files used for programming the display are clearly FSK-encoded, but I haven’t quite worked out the relationship between the tones and the display bits. Here’s what I’ve worked out so far:

  • Files are made up of 12 audio blocks, each about 0.9 seconds long. Each block appears to correspond to one 7×24 display screen.
  • Mark (1 bit): Three cycles, 96 samples at 44100 Hz: 1378.125 Hz
  • Space (0 bit): Four cycles, 256 samples at 44100 Hz: 689.0625 Hz
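The tone frequencies above fall straight out of the cycle and sample counts. Here is a quick way to check the arithmetic, as a minimal sketch: tone_hz is a helper name invented for this example, and it leans on awk for the floating-point division.

```shell
#!/bin/bash
# tone_hz CYCLES SAMPLES [RATE]
# Frequency of a tone that completes CYCLES full cycles in SAMPLES
# samples, at RATE samples per second (default 44100).
tone_hz() {
    awk -v c="$1" -v n="$2" -v r="${3:-44100}" \
        'BEGIN { printf "%.4f\n", r * c / n }'
}

tone_hz 3 96     # mark:  prints 1378.1250
tone_hz 4 256    # space: prints 689.0625
```

The mark tone works out to exactly twice the space tone, and a space bit (256 samples) lasts a little under three times as long as a mark bit (96 samples).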

The editor runs nicely under DOSBox, so you can experiment and save samples as WAV files. Here’s a sample display with its corresponding audio linked underneath:

sayed1_0_003

I’m not sure how much extra work I have time or inclination to put in on getting this working, but I hope that my preliminary work will be useful to someone (maybe this person).

Fixing a broken boombox

Catherine’s Insignia CD Boombox with FM Radio Model: NS-BIPCD01 (CD-player/iPod dock thingy) just stopped working. The traces on the power connector broke when it got bumped. It was a bit of a bear to open up. I was going to submit this to iFixit, but their editor is horrid.

You will need:

  1. Phillips #0 screwdriver
  2. Phillips #1 screwdriver
  3. Nice thick guitar pick.

Insignia-NS-BIPCD01-opening1

Underneath the device, peel off the two sticky feet next to the product label at (1). Underneath are #1 Phillips screws you should remove. These are countersunk, and should be kept separate from the other screws.

Insignia-NS-BIPCD01-opening2

At (2), peel off the sticky covers and remove the #1 screws.

Open the CD door, and remove the #1 screws near the top at (3).

Remove the #0 screws in the handle at (4). We’ve accounted for all the screws holding the case together, but there are a couple of clips we’ll need to work on.

Starting from near the top of the handle, pry the two halves of the case apart with the guitar pick. There’s an insert in the handle which will fall out; keep it aside.

At (5) and at (6), there are clips inside the case which you’ll need to press on with the guitar pick to get them open. They’re quite fragile, and I broke two out of four. If you do break them, make sure the loose bits don’t rattle about the case.

The case should slip apart now, and there are several short cables connecting buttons, displays and power supplies. If you lay the box on its back (with the iPod dock uppermost) you can set the top of the case up on the main circuit board. This will allow you to get at the power/audio board, which is secured by two large-flange #0 screws.

2014-03-15-131248

… and there’s the problem: the power trace (the lower of the three near the middle of the picture) has cracked. I re-soldered it, and also ran jumper wires between the pins. If this cracks again, the jumpers will be much more robust.

Introducing RAFTP — the Really Annoying File Transfer Protocol

I would like to describe a new and highly impractical method of transferring data between computers. Modern networks are getting more efficient every year. This protocol aims to reverse this trend, as RAFTP features:

  1. Slow file transfers
  2. A stubborn lack of error correction
  3. The ability to irritate neighbours while ensuring inaccurate transmission through playing the data over the air using Bell 202 tones.

doge-small-tx
Figure 1

Figure 1 shows a test image before it was converted into PGM format. This was then converted into an audio file using minimodem:

minimodem --tx -v 0.90 -f doge-small-1200.wav 1200 < doge-small-tx.pgm

This file was then transferred to an audio player. To ensure maximal palaver, the audio player was connected to a computer via a USB audio interface and a long, minimally-shielded audio cable. The output was captured as mp3 by Audacity as this file: RAFTP-demo

The mp3 file was then decoded back to an image:

madplay -o wav:- RAFTP-demo.mp3   | minimodem --rx -q -f - 1200 | rawtopgm 90 120 | pnmtopng > doge-small-rx.png

Figure 2 shows the received and decoded file:

Figure 2
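To put a number on “slow”, here is a rough estimate of the on-air time. This sketch assumes 8-N-1 framing (one start bit, eight data bits, one stop bit, so ten bits on the wire per byte), which I believe is the minimodem default, and one byte per pixel for the 90 × 120 image implied by the rawtopgm call above. raftp_seconds is a name invented for this example.

```shell
#!/bin/bash
# raftp_seconds BYTES [BAUD]
# Rough transfer time in whole seconds (rounded up), assuming 10 bits
# on the wire per byte (8-N-1 framing). BAUD defaults to 1200.
raftp_seconds() {
    local bytes=$1 baud=${2:-1200}
    echo $(( (bytes * 10 + baud - 1) / baud ))
}

# 90 x 120 pixels at one byte each, ignoring the short PGM header
raftp_seconds $((90 * 120))    # prints 90
```

So about a minute and a half of screeching for a thumbnail, before any retries.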

Raspberry Pi as a USB audio capture device

The Raspberry Pi’s hardware and software support has come a long way in the few months it has been in the wild. I first tried this application in the summer, and the results were dismal. Now, thanks to much-improved USB driver support under Raspbian, I’m pleased to say it works flawlessly.

Earlier this year, I bought a turntable (ack!) for transferring vinyl to mp3. I have a TC-772 USB phono preamp, which spits out a 48 kHz stereo audio stream. If you plug the USB output of the preamp into a Raspberry Pi (running Raspbian Wheezy with all the updates), it’s instantly recognized as an audio device:

$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 0424:9512 Standard Microsystems Corp. 
Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp. 
Bus 001 Device 004: ID 08bb:2902 Texas Instruments Japan PCM2902 Audio Codec

If you install the ALSA recording utilities (sudo apt-get install alsa-utils pulseaudio – this should pull in a whole bunch of necessary packages), you can record directly from this device with the following command:

arecord -D 'pulse' -V stereo -c 2 -f dat -d 900 out.wav

which records from the ‘pulse’ audio device, displaying a stereo text VU meter (handy for setting levels), writing to a two channel 16-bit 48 kHz file called ‘out.wav’ for a maximum of 900 seconds (15 minutes). arecord has a baffling number of recording source options; arecord -L will show them. ‘pulse’ was the first one I tried.
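Uncompressed audio fills an SD card quickly, so it can be worth working out how big a recording will be before starting one. A minimal sketch, with wav_bytes as a made-up helper name; it counts only the audio data and ignores the small WAV header.

```shell
#!/bin/bash
# wav_bytes SECONDS [RATE] [CHANNELS] [BITS]
# Size in bytes of the uncompressed audio data; the defaults match
# 'arecord -f dat' (48000 Hz, 2 channels, 16 bits per sample).
wav_bytes() {
    local secs=$1 rate=${2:-48000} ch=${3:-2} bits=${4:-16}
    echo $(( secs * rate * ch * bits / 8 ))
}

wav_bytes 900    # prints 172800000
```

That is roughly 165 MiB for each 15-minute recording.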

So how does it sound? Here’s a 30 second excerpt from the only single I owned for years, The Music Tapes’ “The Television Tells Us/Freeing Song by Reindeer”: Freeing Song by Reindeer – excerpt [mp3]. I’ve saved an even smaller snippet as lossless FLAC so you can see that the waveform’s pretty clean: FreeingSongbyReindeer-tiny_excerpt [flac].

Sounds pretty good. Not quite as good as having Julian play it in your house, I’ll allow, but not bad for a first try with a $35 computer.

good, not quite great

I accidentally dropped and broke my car mp3 player, so had to come up with another music solution. I caved and bought an iTrip for my iPod Nano. It sounds pretty good.

What’s good about it is that it allows you to charge your iPod from a standard USB Mini-B. What’s not so good is that it doesn’t have full USB pass-through, so you can’t sync your iPod, and have to stick with that stupid dock cable.

(and don’t get me started on the really annoying connector on my work cell phone …)

half-assed, but endearing


So I bought the Kross Bluetooth Hands Free Cell Phone Car Kit with FM Transmitter. It has its good points, but it has some quirks and serious shortcomings.

Here’s what’s good:

  • It’s cheap (< $40)
  • It provides in-car Bluetooth speakerphone
  • It plays MP3s from SD card, USB stick, or a line-level source.

Here’s what’s not so good:

  • Playback quality depends on finding an open FM frequency, which is hard in the GTA
  • The transmitter is not very powerful, so nearby vehicles can swamp your signal (or, if you want to call it a feature, it’s a “random positional mashup”)
  • The phone mic is a tiny port on the unit, so sometimes the caller can’t hear you too well
  • You need to have your radio on to answer your phone
  • The USB port doesn’t provide enough charging current for a phone or GPS
  • The remote isn’t very good
  • Voice dialling doesn’t seem to work with my Blackberry
  • The MP3 playback function usually remembers where you were when you start the car, but sometimes forgets, and needs the card ejected and reinserted
  • It doesn’t know about ID3 tags
  • Weirdest of all, it plays back files in the strict order they were written to the directory – not ordered by file name. It seems that, under Microsoft operating systems, files are copied in name order, but under Unix, they are (winging it here) copied by inode. Using tar on a Mac or Linux is the way to go, as it writes in name order.
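Another way to get files written in name order is to copy them one at a time, relying on bash expanding globs in sorted order. This is a sketch only: copy_in_name_order is a name invented for this example, and it assumes the destination folder on the card starts out empty, so that the directory entries end up in the order the files were copied.

```shell
#!/bin/bash
# copy_in_name_order SRC DEST
# Copy the regular files in SRC into DEST one at a time, in sorted name
# order, printing each name as it is copied.
copy_in_name_order() {
    local src=$1 dest=$2 f
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do          # bash sorts glob results by name
        [ -f "$f" ] || continue    # skip subfolders and unmatched globs
        cp -- "$f" "$dest/" && echo "${f##*/}"
    done
}
```

Call it as copy_in_name_order album-folder /path/to/card/album-folder, with the paths adjusted to wherever your card is mounted. Zero-padded track numbers in the file names (01, 02, …) keep the sort order sensible.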

The Kross S-150 Manual (scanned PDF) is pretty terse, and has been of limited use to me. For all its faults, it’s kind of useful, but if I had a USB-capable stereo, I wouldn’t need this.

less than 100 CDs to go …

1492 Artists / 999 Albums / 15245 Tracks / 34.9 Days / 62.12 GB
(and here’s me thinking I had about 2000 CDs, too)

CDs that wouldn’t read: 0 (so far). That’s not to say that there weren’t some difficulties (copy-controlled CDs can go die, glitching and gronking in my drives) and my oldest CD (XTC’s Skylarking, my copy of which I think has just turned 20) had a ton of retries.

Lost CDs: Thomas Dolby’s Aliens Ate My Buick is somewhere in the house, but nowhere I’ve looked.

Found CDs: My long-lost promo copy of the (Portland) Decemberists’ Picaresque, which I thought had vanished in a road trip to Missouri. It was lurking in a long-forgotten portable CD player in the bottom of a storage bin.

Pleasant surprises: that freedb is generally better than it used to be.

Peeves: copy-controlled CDs (see above); flappy cardboardy cases that only have the title on one spine; oversized CD cases (Japanese imports, I’m looking straight at you), dark blue text on a black background, idjit freedb submitters who insist on Band, The syntax or worse, submit whole albums called sdfsdf;aefhsdf; bonus DVD “premium” releases (who watches these?).

ripping dvd audio with Ubuntu

With more than a little help from How to Rip DVD audio to mp3 or ogg — Ubuntu Geek, here’s how I’d rip audio from a DVD:
for f in $(seq 1 12)
do
    transcode -i /dev/sr1 -x null,dvd -T 1,$f,1 -N 0x1 -y null,wav -m "$(printf "%02d" "$f").wav"
done

Your track count and device name will vary. You’ll note that I caved, and used the annoying $(…) syntax instead of good old-fashioned backticks (which some youngsters will claim are deprecated, but I claim as job security). WordPress munges those badly, so we’re stuck with the ugly.
You could use livemp3 to convert to mp3s (if I remembered to upload the version that handles wav files) under controlled circumstances.

rockin’ the plastic: four turntables and an mp3 share

Now I’ve got the Soundbridge set up to share from my server, I’ve been ripping CDs like crazy. I’ve got two drives on my Ubuntu box, and hooked an external CD drive to my laptop, so I’m rocking four drives at once. After years of using Grip, I converted to Abcde this weekend. What I really like about it is that I can run multiple copies at once, and it very nearly does things right (aka “my way”) out of the box.

By the end of tonight, I should have about 6700 tracks on my share, and a bunch of CDs in storage.

the analogue hole

I have a bunch of Catherine’s old family recordings to digitise (do people still do that – sit around a tape recorder and make recordings?) and I had recorded one of Ken’s shows on minidisc, so I needed a relatively clean way to get analogue audio onto the computer.

I ended up getting a Griffin iMic, a small USB audio input device. The sound quality is remarkably clean; here’s a sine wave recorded from CD to minidisc, then recorded on the iMic:

tracks000.png

The iMic seems to work with all Mac audio software as an input device. The free Final Vinyl recording software is pretty, but a bit buggy and, annoyingly, only works when the iMic is connected. I just use Audacity, and have done with it.