#!/usr/local/bin/perl -w
# cdencrypt - use CD-DA as the basis of an OTP system
# created by scruss on 02002/02/19
# SCCS:    %W% %G%
# RCS/CVS: $Id: cdencrypt,v 1.4 2002/02/26 08:29:58 scruss Exp $

use strict;
use integer;
use Getopt::Std;

my $toffset=1031;		# compressed stream offset
my $soffset=1031;		# uncompressed stream offset
my $verbose=0;
my %opts;

sub xorstrings($$);		# xors two strings

# handle options (getopt, not getopts: every flag takes a value, hence "-v 1")
getopt('vst', \%opts);

if (defined($opts{'v'})) {
  $verbose=1;			# verbose flag
  print STDERR "Verbose operation on.\n";
}

if (defined($opts{'s'})) {
  if ($opts{'s'} > $soffset) {
    $soffset=$opts{'s'};
  }
  print STDERR "uncompressed stream offset: ", $soffset, "\n"
    if (1 == $verbose);
}

if (defined($opts{'t'})) {
  if ($opts{'t'} > $toffset) {
    $toffset=$opts{'t'};
  }
  print STDERR "compressed stream offset: ", $toffset, "\n"
    if (1 == $verbose);
}

print STDERR "argv: ", join(' ', @ARGV), "\n" if (1 == $verbose);

# check usage
if (1 != $#ARGV) {		# must have two args
  die "usage: $0 [-v 1] [-s soffset] [-t toffset] track messagefile > cryptedfile\n";
}

# two args: track file
my $track=shift;
$track += 0;			# make it numeric
if (($track < 1) or ($track > 99)) {	# 0 also catches non-numeric input
  die "Invalid CD track: $track\n";
}

my $file=shift;
unless ( -f $file ) {
  die "Input not a file: $file\n";
}
my @tmp=stat($file);
my $filelength = $tmp[7];

my $stailarg = '+' . ($soffset + 1) . 'c'; # for calling tail
my $ttailarg = '+' . ($toffset + 1) . 'c';

# don't punt out encrypted data to tty if that's where stdout is at
die "Can\'t output encrypted data to tty.\n" if ( -t STDOUT);


my $msg='';
my $key='';

# read the message
# three-argument open, so the file name is never parsed as a mode or pipe
open(INFILE, '<', $file) or die "$!: $file\n";
die "Couldn\'t read $filelength bytes from $file\n"
  unless ($filelength == read(INFILE, $msg, $filelength));
close(INFILE);
print STDERR "read $filelength bytes from $file okay\n" if (1 == $verbose);

# read the key
my $progstring="cdparanoia --quiet --output-wav --abort-on-skip $track - | tail $stailarg | bzip2 -9 --compress --stdout --quiet | tail $ttailarg";
print STDERR "command used: $progstring\n" if (1 == $verbose);
open(KEYFILE, "$progstring |") or die "$!: Can\'t make key chain\n";
die "Couldn\'t read $filelength bytes for key\n"
  unless ($filelength == read(KEYFILE, $key, $filelength));
close(KEYFILE);
print STDERR "read $filelength bytes for key okay\n" if (1 == $verbose);

print STDERR "processing strings\n" if (1 == $verbose);
print xorstrings($msg, $key);
print STDERR "done.\n" if (1 == $verbose);
exit;

sub xorstrings($$) {		# xors two strings without endian problems
  my @a = unpack("C*", shift);	# turn string into unsigned char array
  my @b = unpack("C*", shift);
  if ($#a != $#b) {
    return undef;		# function meaningless if args not same length
  }
  else {
    $_ ^= shift(@b) foreach @a;	# xor arrays, element by element
    return pack("C*", @a);	# and return packed from unsigned chars
  }
}

__END__

=head1 NAME

cdencrypt - use CD-DA as the basis of an OTP encryption system

=head1 SYNOPSIS

C<cdencrypt [-v 1] [-s soffset] [-t toffset] track messagefile E<gt> cryptedfile>

=head1 DESCRIPTION

cdencrypt encrypts the contents of messagefile against the compressed
CD-DA data read from a given audio CD track, and outputs the encrypted
data to cryptedfile. Some bytes are skipped at the start of the
CD-DA stream (1031 by default, or the number given with the C<-s>
option), and again at the start of the compressed stream (1031 by
default, or the number given with the C<-t> option). Skipping keeps
any traceable headers out of the key, and the choice of offsets
provides many possible one-time pads.

=head1 OPTIONS

C<-s soffset> allows you to set the number of bytes skipped at the
start of the audio stream to be C<soffset>. If C<soffset> is less than the
default (1031), silently uses the default.

C<-t toffset> allows you to set the number of bytes skipped at the
start of the compressed stream to be C<toffset>. If C<toffset> is less than the
default (1031), silently uses the default.

C<-v 1> sets verbose operation.

=head1 RETURN VALUE

Not used.

=head1 ERRORS

Has a few error messages/states:

=over 4

=item * Invalid CD track -- track number specified must be between 1 and 99.

=item * Input not a file -- the message file named is not a regular file.

=item * Can't output encrypted data to tty -- as the encrypted message
is pseudo-random binary data, it makes no sense to output it to the
terminal.

=item * Couldn't read I<N> bytes from I<file> -- the message file was
opened, but fewer bytes could be read from it than its reported length.

=item * Couldn't read I<N> bytes for key -- fewer than I<N> bytes
could be read from the compressed CD-DA stream. Check your track
number, and that the offsets are not too high.

=item * Can't make key chain -- the CD-DA reading process failed on
opening, somehow.

=back

=head1 EXAMPLES

	cdencrypt 1 file.txt > hehheh

would encrypt file.txt against audio CD track 1, outputting to file hehheh.

	cdencrypt -s 98765 7 file.txt > hehheh

would encrypt file.txt against audio CD track 7, skipping 98765 bytes
from the start of the audio track before forming the key, and then
outputting to file hehheh.

To decrypt, run the same command over the encrypted text with the same
CD in the drive:

	cdencrypt -s 98765 7 hehheh > hehheh.plain

The above examples would have the default C<toffset> value of 1031.
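
Decryption works with the same command because XORing twice with the
same key gives back the original bytes. A minimal sketch of that
property follows. It uses Perl's built-in string XOR rather than the
script's C<xorstrings> routine, and the helper name C<xor_bytes>, the
message and the key are all made up for illustration:

	use strict;
	use warnings;

	# bytewise XOR of two equal-length strings (both operands are
	# strings, so Perl applies ^ to the bytes, not to numbers)
	sub xor_bytes {
	    my ($data, $key) = @_;
	    return $data ^ $key;
	}

	my $msg = "attack at dawn";
	my $key = "lemonmeringue!";	# stand-in key, same length as $msg

	my $crypted = xor_bytes($msg, $key);		# "encrypt"
	my $plain   = xor_bytes($crypted, $key);	# same call decrypts

	print "round trip ok\n" if $plain eq $msg;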

=head1 ENVIRONMENT

This perl script should not be affected by environment, but the
programs called by it may be.

=head1 FILES

No intermediate files created.

=head1 SEE ALSO

tail, cdparanoia, bzip2, perl

=head1 NOTES

Defaults to offsets of 1031 for no particular reason other than it's
a prime number bigger than what a typical file header might be.

I had planned to also allow audio streams derived from MP3s, but it
seems that the data can often be slightly different on different
hardware, even with the same MP3 decoding software. So this won't
work.

A minute of bzip2-compressed CD audio is typically about ten million
bytes. For a one million byte message, that leaves just under nine
million possible C<-s> offset choices, and consequently the same
number of distinct encryption keys. Do the sums on your CD
collection. You have a lot of keys. And then there's the C<-t> offset
to think about.
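
As a rough check on those sums, the sketch below counts the usable
offsets while treating the key stream as one flat run of bytes, as
this note does. The function name C<offset_choices> and the sample
sizes are illustrative assumptions, not part of cdencrypt:

	use strict;
	use warnings;

	# count the offsets that still leave enough stream for the message
	sub offset_choices {
	    my ($streamlen, $msglen, $minoffset) = @_;
	    my $last = $streamlen - $msglen;	# highest usable offset
	    return $last < $minoffset ? 0 : $last - $minoffset + 1;
	}

	# ten million byte stream, one million byte message, default 1031
	print offset_choices(10_000_000, 1_000_000, 1031), "\n";

With these figures it prints 8998970, which is the "just under nine
million" quoted above.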

I have no idea whether it would be better to compress the input
message before encrypting. It's probably not too great an idea, as it
reduces the OTP key length, and thus makes it easier to crack. You
certainly can't compress the encrypted output by any meaningful
amount. It wouldn't be much of an encryption routine if you could.

I like the idea of "encrupted data"; encrypted data that has been corrupted...

=head1 CAVEATS

Probably not very secure, as it explicitly opens a shell to run tail,
cdparanoia and bzip2. This is, however, only a proof of concept.

Reads message and key into memory in one go. If your message is large,
your core had better be too.
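
One way round the memory limit would be to XOR the two streams a
block at a time. The following is only a sketch under assumptions:
C<xorstream>, its block size and the in-memory test handles are
hypothetical, and cdencrypt itself does not work this way.

	use strict;
	use warnings;

	# XOR $msgfh against $keyfh in fixed-size blocks, writing to $outfh
	sub xorstream {
	    my ($msgfh, $keyfh, $outfh, $blocksize) = @_;
	    my ($msg, $key);
	    while (my $got = read($msgfh, $msg, $blocksize)) {
	        my $keygot = read($keyfh, $key, $got);
	        die "key ran out\n"
	            unless defined($keygot) and $keygot == $got;
	        print {$outfh} $msg ^ $key;	# bytewise string XOR
	    }
	    return 1;
	}

	# exercise it on in-memory filehandles instead of a file and a CD
	my ($plain, $pad, $out) = ("hello, world", "some key bytes..", '');
	open(my $m, '<', \$plain) or die $!;
	open(my $k, '<', \$pad)   or die $!;
	open(my $o, '>', \$out)   or die $!;
	xorstream($m, $k, $o, 5);
	close($o);
	print "ok\n" if ($out ^ substr($pad, 0, length $plain)) eq $plain;

Only one block of message and one block of key are held at a time, so
memory use no longer grows with the message length.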

Although the record labels help you by distributing the CDs, you still
have to find a way to transmit the information about the key CD, track
and offsets securely.

Beware of trivially-different national release variations. Oh, and BMG
own-pressing CDs. They're different.

=head1 DIAGNOSTICS

If you have output, it worked. C<-v 1> might help tell you something.

=head1 BUGS

Setting a very large offset which goes off the end of a track should
start reading from the beginning of the next track, but doesn't in
this version.

Is unlikely to handle huge messages on modest machines very well.

=head1 RESTRICTIONS

Your conscience alone.

=head1 AUTHOR

Stewart C. Russell C<E<lt>scruss@bigfoot.comE<gt>>

=head1 HISTORY

Created 02002/02/19

$Log: cdencrypt,v $
Revision 1.4  2002/02/26 08:29:58  scruss
added -t and -v options

Revision 1.3  2002/02/24 00:44:07  scruss
unbroke program

Revision 1.2  2002/02/24 00:39:50  scruss
added docos and -s option

Revision 1.1  2002/02/23 00:47:24  scruss
Initial revision


=cut



