Updated 26 Feb 2002

Audio One Time Pad Encryption

Using digital CD audio data as the key stream for a one-time pad isn't a new idea. It was first mooted on Usenet in the mid 1980s. What has changed is that effective CD-DA extraction ("ripping") tools are now available, so it's possible to extract data repeatably from the same edition of a music CD.

MP3s (or Ogg files) look like another good source of audio data, but they are not as random as an uncompressed CD audio stream. Worse, it seems that the decoded data can often be slightly different on different hardware, even with the same MP3 decoding software. So they won't work for making useful pads.

Known Problems

Monophonic audio has repeating words, because each sample is duplicated across the left and right channels.

Silence creates runs of hex 80 bytes.

Solutions

Data compression removes repetition, but beware of file headers and repeating block details in some compression schemes.
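The warning about file headers can be seen with a short Python sketch. This is only an illustration: it uses Python's standard bz2 module rather than the bzip2 command, and the input is a made-up stand-in for ripped audio (a repeated word pair followed by a run of hex 80 bytes), not real CD data.

```python
import bz2

# Hypothetical stand-in for ripped audio: the same 16-bit word on both
# channels (mono material), then a run of hex 80 bytes (silence).
mono_frames = bytes([0x12, 0x34, 0x12, 0x34]) * 50_000
silence = bytes([0x80]) * 200_000
raw = mono_frames + silence

compressed = bz2.compress(raw, 9)

# The repetition is squeezed out...
print(len(raw), "raw bytes ->", len(compressed), "compressed bytes")

# ...but every bzip2 stream starts with the same fixed bytes: "BZh" plus
# the block-size digit, followed by a constant block marker.
print("stream header:", compressed[:4])
print("block marker: ", compressed[4:10].hex())
```

Those leading bytes are the same for any input compressed at the same level, so they carry no key material. Skipping some bytes at the start of the compressed stream avoids using them.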

Sample Implementation

Download cdencrypt, 7933 bytes, MD5 checksum: 86ad38fa15e350147701e4a9f24b4f05

You will need a Linux system with working cdparanoia and bzip2. Full documentation is included in the source, in POD format.

Documentation

Reproduced from the source.


NAME

cdencrypt - use CD-DA as the basis of an OTP encryption system


SYNOPSIS

cdencrypt [-v 1] [-s soffset] [-t toffset] track messagefile > cryptedfile


DESCRIPTION

cdencrypt encrypts the contents of messagefile against the compressed CD-DA data read from a given audio CD track, and outputs the encrypted data to cryptedfile. A number of bytes (1031 by default, or the value given with the -s option) are skipped at the start of the CD-DA stream. A similar offset (1031 by default, or the value given with the -t option) is skipped at the start of the compressed stream. This avoids compressing any traceable headers and provides many possible one-time pads.
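As a rough illustration of the pad derivation described here, the following Python sketch skips bytes, compresses, and skips again. It is not the cdencrypt source: derive_pad is a hypothetical name, the bz2 module stands in for the bzip2 command, and random bytes stand in for a track ripped with cdparanoia. Raising small offsets to 1031 follows the behaviour documented for -s and -t.

```python
import bz2
import os

DEFAULT_OFFSET = 1031  # documented default for both -s and -t


def derive_pad(raw_audio: bytes, soffset: int = DEFAULT_OFFSET,
               toffset: int = DEFAULT_OFFSET) -> bytes:
    """Skip soffset raw bytes, compress the rest, then skip toffset bytes."""
    # Offsets below the default are silently raised to the default.
    soffset = max(soffset, DEFAULT_OFFSET)
    toffset = max(toffset, DEFAULT_OFFSET)
    compressed = bz2.compress(raw_audio[soffset:])
    return compressed[toffset:]


# Assumption: random bytes stand in for the CD-DA stream of one track.
raw_audio = os.urandom(200_000)

pad_default = derive_pad(raw_audio)
pad_shifted = derive_pad(raw_audio, soffset=98765)

print(len(pad_default), len(pad_shifted))
print(pad_default[:16] != pad_shifted[:16])
```

Changing the first offset changes what is compressed, so the pad differs throughout rather than merely starting later.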


OPTIONS

-s soffset allows you to set the number of bytes skipped at the start of the audio stream to be soffset. If soffset is less than the default (1031), silently uses the default.

-t toffset allows you to set the number of bytes skipped at the start of the compressed stream to be toffset. If toffset is less than the default (1031), silently uses the default.

-v 1 sets verbose operation.


RETURN VALUE

Not used.


ERRORS

Has a few error messages/states:


EXAMPLES

        cdencrypt 1 file.txt > hehheh

would encrypt file.txt against audio CD track 1, outputting to file hehheh

        cdencrypt -s 98765 7 file.txt > hehheh

would encrypt file.txt against audio CD track 7, skipping 98765 bytes from the start of the audio track before forming the key, and then outputting to file hehheh

To decrypt, run the same command over the encrypted text with the same CD in the drive:

        cdencrypt -s 98765 7 hehheh > hehheh.plain

The above examples would have the default toffset value of 1031.
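Decryption can reuse the encryption command because the combining step is its own inverse. The sketch below assumes that step is a byte-wise XOR, which the documentation implies but does not state. The function name xor_with_pad is hypothetical, and random bytes stand in for the pad.

```python
import os


def xor_with_pad(data: bytes, pad: bytes) -> bytes:
    """XOR each data byte with the pad byte at the same position."""
    if len(pad) < len(data):
        raise ValueError("pad is shorter than the message")
    return bytes(d ^ p for d, p in zip(data, pad))


# Assumption: random bytes stand in for the compressed CD-DA stream.
pad = os.urandom(1024)
plaintext = b"Meet at the usual place at noon."

ciphertext = xor_with_pad(plaintext, pad)
recovered = xor_with_pad(ciphertext, pad)  # same operation, same pad

print(recovered == plaintext)
```

The round trip only works if both runs see exactly the same pad bytes, which is why the same CD, track and offsets are needed.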


ENVIRONMENT

This Perl script should not be affected by the environment, but the programs called by it may be.


FILES

No intermediate files created.


SEE ALSO

tail, cdparanoia, bzip2, perl


NOTES

Defaults to offsets of 1031 for no particular reason other than it's a prime number bigger than what a typical file header might be.

I had planned to also allow audio streams derived from MP3s, but it seems that the data can often be slightly different on different hardware, even with the same MP3 decoding software. So this won't work.

A minute of bzip2-compressed CD audio is typically ten million bytes. Thus, a one million byte message leaves just under nine million possible -s offset choices, and consequently the same number of distinct encryption keys. Do the sums on your CD collection. You have a lot of keys. And then there's the -t offset to think about.
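The sums can be sketched in a few lines of Python. Like the paragraph above, this treats the stream as a flat run of bytes and counts the offsets that still leave enough stream for the whole message. The helper offset_choices is a hypothetical name, and the ten million byte figure is the estimate quoted above, not a measurement.

```python
# CD-DA: 44,100 samples per second, 2 channels, 2 bytes per sample.
RAW_BYTES_PER_MINUTE = 44_100 * 2 * 2 * 60

DEFAULT_OFFSET = 1031  # smallest offset the tool will accept


def offset_choices(stream_bytes: int, message_bytes: int,
                   min_offset: int = DEFAULT_OFFSET) -> int:
    """Count offsets that still leave message_bytes of stream after them."""
    return max(0, stream_bytes - message_bytes - min_offset + 1)


print(RAW_BYTES_PER_MINUTE)                     # 10584000
print(offset_choices(10_000_000, 1_000_000))    # 8998970
```

A minute of uncompressed CD audio is a little over ten million bytes, and audio compresses poorly under bzip2, so the estimate above is plausible.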

I have no idea whether it would be better to compress the input message before encrypting. It's probably not too great an idea, as it reduces the OTP key length, and thus makes it easier to crack. You certainly can't compress the encrypted output by any meaningful amount. It wouldn't be much of an encryption routine if you could.

I like the idea of "encrupted data"; encrypted data that has been corrupted...


CAVEATS

Probably not very secure, as it explicitly opens a shell to run tail, cdparanoia and bzip2. This is, however, only a proof of concept.

Reads message and key into memory in one go. If your message is large, your core had better be too.

Although the record labels help you by distributing the CDs, you still have to find a way to transmit the information about the key CD, track and offsets securely.

Beware of trivially-different national release variations. Oh, and BMG own-pressing CDs. They're different.
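The memory caveat above could in principle be avoided by working through the message and key in fixed-size chunks. The following Python sketch shows the idea only. It is not how this version of cdencrypt works, xor_streams is a hypothetical name, it assumes the combining step is a byte-wise XOR, and in-memory buffers stand in for the message file and the compressed audio stream.

```python
import io
import os


def xor_streams(message, pad, out, chunk_size=64 * 1024):
    """XOR two binary file-like objects into out, one chunk at a time."""
    while True:
        block = message.read(chunk_size)
        if not block:
            break  # end of message
        key = pad.read(len(block))
        if len(key) < len(block):
            raise ValueError("pad ran out before the message did")
        out.write(bytes(m ^ k for m, k in zip(block, key)))


# Assumption: in-memory buffers stand in for real files and pipes.
message = io.BytesIO(b"attack at dawn " * 10_000)
pad = io.BytesIO(os.urandom(150_000))
out = io.BytesIO()

xor_streams(message, pad, out)
print(len(out.getvalue()))
```

Only one chunk of message and one chunk of key are held at a time, so memory use no longer grows with the message size.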


DIAGNOSTICS

If you have output, it worked. -v might help tell you something.


BUGS

Setting a very large offset which goes off the end of a track should start reading from the beginning of the next track, but doesn't in this version.

Is unlikely to handle huge messages on modest machines very well.


RESTRICTIONS

Your conscience alone.


AUTHOR

Stewart C. Russell <scruss@bigfoot.com>


HISTORY

Created 02002/02/19

$Log: cdencrypt,v $

Revision 1.4 2002/02/26 08:29:58 scruss added -t and -v options

Revision 1.3 2002/02/24 00:44:07 scruss unbroke program

Revision 1.2 2002/02/24 00:39:50 scruss added docos and -s option

Revision 1.1 2002/02/23 00:47:24 scruss Initial revision


Stewart C. Russell, Kirkintilloch, Scotland