In the unlikely event you need to represent Emoji in RTF using Perl …

Of all the niche blog entries I’ve written, this must be the nichest. I don’t even like the topic I’m writing about. But I’ve worked it out, and there seems to be a shortage of documented solutions.

For the both of you that generate Rich Text Format (RTF) documents by hand, you might be wondering how RTF converts ‘💩’ (that’s code point U+1F4A9) to the seemingly nonsensical \u-10179?\u-9047?. It seems that RTF imposes two encoding limitations on characters: firstly, everything must be in 7-bit ASCII for easy transmission, and secondly, it uses the somewhat old-fashioned UTF-16 representation for non-ASCII characters.

UTF-16 grew out of an early standard, UCS-2, that was all like “Hey, there will never be a Unicode code point above 65536, so we can hard code the characters in two bytes … oh shiiii…”. So not merely does it have to escape emoji code points down to two bytes using a very dank scheme indeed, it then has to further escape everything to ASCII. That’s how your single emoji becomes 17 bytes in an RTF document.

So here’s a tiny subroutine to do the conversion. I wrote it in Perl, but it doesn’t do anything Perl-specific:

#!/usr/bin/env perl
# emoji2rtf - 2017 - scruss
# See UTF-16 decoder for the dank details
#  <https://en.wikipedia.org/wiki/UTF-16#U.2B10000_to_U.2B10FFFF>

use v5.20;
use strict;
use warnings;
use utf8;
sub emoji2rtf($);

my $c = substr( $ARGV[0], 0, 1 );
say join( "\t⇒ ", $c, sprintf( "U+%X", ord($c) ), emoji2rtf($c) );
exit;

sub emoji2rtf($) {
    my $n = ord( substr( shift, 0, 1 ) );
    die "emoji2rtf: code must be >= 65536\n" if ( $n < 0x10000 );
    return sprintf( "\\u%d?\\u%d?",
        0xd800 + ( ( $n - 0x10000 ) & 0xffc00 ) / 0x400 - 0x10000,
        0xdC00 + ( ( $n - 0x10000 ) & 0x3ff ) - 0x10000 );
}

This will take any emoji fed to it and spit out the RTF code:

📓	⇒ U+1F4D3	⇒ \u-10179?\u-9005?
💽	⇒ U+1F4BD	⇒ \u-10179?\u-9027?
🗽	⇒ U+1F5FD	⇒ \u-10179?\u-8707?
😱	⇒ U+1F631	⇒ \u-10179?\u-8655?
🙌	⇒ U+1F64C	⇒ \u-10179?\u-8628?
🙟	⇒ U+1F65F	⇒ \u-10179?\u-8609?
🙯	⇒ U+1F66F	⇒ \u-10179?\u-8593?
🚥	⇒ U+1F6A5	⇒ \u-10179?\u-8539?
🚵	⇒ U+1F6B5	⇒ \u-10179?\u-8523?
🛅	⇒ U+1F6C5	⇒ \u-10179?\u-8507?
💨	⇒ U+1F4A8	⇒ \u-10179?\u-9048?
💩	⇒ U+1F4A9	⇒ \u-10179?\u-9047?
💪	⇒ U+1F4AA	⇒ \u-10179?\u-9046?
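
Since the routine doesn’t do anything Perl-specific, here is roughly the same arithmetic as a Python sketch. The function name emoji_to_rtf is mine, not part of the original script; it splits the code point into a UTF-16 surrogate pair, then subtracts 65536 from each half because RTF wants signed 16-bit decimals:

```python
def emoji_to_rtf(ch):
    """Return the two RTF unicode escapes for one character above U+FFFF."""
    n = ord(ch)
    if n < 0x10000:
        raise ValueError("code point must be >= 0x10000")
    offset = n - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits -> low surrogate
    # both surrogates are >= 0x8000, so as signed 16-bit values they go negative
    return "\\u%d?\\u%d?" % (high - 0x10000, low - 0x10000)


print(emoji_to_rtf("\U0001F4A9"))      # \u-10179?\u-9047?
```

The trailing ? after each escape is the plain ASCII fallback that a reader without Unicode support would show instead.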

Just to show that this encoding scheme really is correct, I made a tiny test RTF file unicode-emoji.rtf that looked like this in Google Docs on my desktop:

It looks a bit better on my phone, but there are still a couple of glyphs that won’t render:

Ⓗⓞⓦ ⓣⓞ ⓑⓔ ⓐⓝⓝⓞⓨⓘⓝⓖ ⓦⓘⓣⓗ Ⓟⓔⓡⓛ ⓐⓝⓓ Ⓤⓝⓘⓒⓞⓓⓔ

It’s been so long since I’ve programmed in Perl. Twelve years ago, it was my life, but what with the Raspberry Pi intervening, I hadn’t used it in a while. It’s been so long, in fact, that I wasn’t aware of the new language structures available since version 5.14. Perl’s Unicode support has got a lot more robust, and I’m sick of Python’s whining about codecs when processing anything other than ASCII anyway. So I thought I’d combine re-learning some modern Perl with some childish amusement.

What I came up with was a routine to convert ASCII alphanumerics ([0-9A-Za-z]) to Unicode Enclosed Alphanumerics ([⓪-⑨Ⓐ-Ⓩⓐ-ⓩ]) for advanced lulz purposes. Ⓘ ⓣⓗⓘⓝⓚ ⓘⓣ ⓦⓞⓡⓚⓢ ⓡⓐⓣⓗⓔⓡ ⓦⓔⓛⓛ:

#!/usr/bin/perl
# annoying.pl - ⓑⓔ ⓐⓝⓝⓞⓨⓘⓝⓖ ⓦⓘⓣⓗ ⓤⓝⓘⓒⓞⓓⓔ
# created by scruss on 2014-05-18

use v5.14;
# fun UTF8 tricks from http://stackoverflow.com/questions/6162484/
use strict;
use utf8;
use warnings;
use charnames qw( :full :short );
sub annoyify;

die "usage: $0 ", annoyify('string to print like this'), "\n" if ( $#ARGV < 0 );
say annoyify( join( ' ', @ARGV ) );
exit;

# 💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩

sub annoyify {
    # convert ascii to chars in circles
    my $str = shift;
    my @out;
    foreach ( split( '', $str ) ) {
        my $c = ord($_);             # remember, can be > 127 for UTF8
        if ( $c == charnames::vianame("DIGIT ZERO") )
        {
            # 💩💩💩 sigh; this one's real special ... 💩💩💩
            $c = charnames::vianame("CIRCLED DIGIT ZERO");
        }
        elsif ($c >= charnames::vianame("DIGIT ONE")
            && $c <= charnames::vianame("DIGIT NINE") )
        {
            # numerals, 1-9 only (grr)
            $c =
              charnames::vianame("CIRCLED DIGIT ONE") +
              $c -
              charnames::vianame("DIGIT ONE");
        }
        elsif ($c >= charnames::vianame("LATIN CAPITAL LETTER A")
            && $c <= charnames::vianame("LATIN CAPITAL LETTER Z") )
        {
            # upper case
            $c =
              charnames::vianame("CIRCLED LATIN CAPITAL LETTER A") +
              $c -
              charnames::vianame("LATIN CAPITAL LETTER A");
        }
        elsif ($c >= charnames::vianame("LATIN SMALL LETTER A")
            && $c <= charnames::vianame("LATIN SMALL LETTER Z") )
        {
            # lower case
            $c =
              charnames::vianame("CIRCLED LATIN SMALL LETTER A") +
              $c -
              charnames::vianame("LATIN SMALL LETTER A");
        }
        else {
            # pass thru non-ascii chars
        }
        push @out, chr($c);
    }
    return join( '', @out );
}

Yes, I really did have to do that special case for ⓪; ⓪…⑨ are not contiguous like ASCII 0…9. ⓑⓞⓞ!
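
For comparison, a minimal Python sketch of the same mapping, assuming Python 3. unicodedata.lookup() plays the part of charnames::vianame, and the function name is simply borrowed from the Perl above:

```python
import unicodedata

# look the circled characters up by name, as the Perl does with charnames
ZERO = unicodedata.lookup("CIRCLED DIGIT ZERO")
ONE = ord(unicodedata.lookup("CIRCLED DIGIT ONE"))
UPPER_A = ord(unicodedata.lookup("CIRCLED LATIN CAPITAL LETTER A"))
LOWER_A = ord(unicodedata.lookup("CIRCLED LATIN SMALL LETTER A"))


def annoyify(text):
    """Convert ASCII alphanumerics to their circled equivalents."""
    out = []
    for ch in text:
        if ch == "0":
            out.append(ZERO)           # the special case: not next to 1-9
        elif "1" <= ch <= "9":
            out.append(chr(ONE + ord(ch) - ord("1")))
        elif "A" <= ch <= "Z":
            out.append(chr(UPPER_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(LOWER_A + ord(ch) - ord("a")))
        else:
            out.append(ch)             # everything else passes through
    return "".join(out)


print(annoyify("Perl and Unicode"))
```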

ICQuestionBank2csv

ICQuestionBank2csv: A tool to extract both the Basic and Advanced Amateur Radio Examination guides from Industry Canada’s rather annoying two-column PDFs. Written for IC’s 2014-02 database updates.

See: Amateur Radio Exam Generator.

Written by Stewart C. Russell (aka scruss) / VA3PID – 2014-03-07.

Requirements

  • Perl, with Text::CSV_XS
  • xpdf tools
  • Bash
  • wget

Usage

Run either basic2csv.sh or advanced2csv.sh to download the source PDF and extract the data.

Licence

WTFPL (srsly).

Morse Palindromes, or CQ Christian Bök

The longest palindrome in Morse code is “intransigence””, and it was on

First off, here’s the Morse code for the word intransigence:

·· –· – ·–· ·– –· ··· ·· ––· · –· –·–· ·
i  n  t r   a  n  s   i  g   e n  c    e

If you look at it as a simple stream of dits and dahs, then yes, it’s palindromic. But, like comedy, the secret of Morse (or CW) is timing. It’s important to include the spaces between the keyings, or letters become hard to identify as they run together. For a word to truly sound palindromic, it would need to have the same spacing too, and thus have to start and end on Morse codes that were mirror-images.

Not only that, but you get codes which when reversed, become another letter. a (·–) becomes n (–·) when reversed. So things are getting more complex, as we’ve now got to think of:

  1. Words which are both palindromes in the English and Morse code;
  2. Words which are palindromes in Morse, but not when written in English.

With only Convert::Morse and words to guide me, here’s what I found.

Firstly, here’s a Morse code table for reference:

 ! → –·–·––           3 → ···––          a → ·–          n → –·
 " → ·–··–·           4 → ····–          b → –···        o → –––
 ' → ·––––·           5 → ·····          c → –·–·        p → ·––·
 ( → –·––·            6 → –····          d → –··         q → ––·–
 ) → –·––·–           7 → ––···          e → ·           r → ·–·
 + → ·–·–·            8 → –––··          f → ··–·        s → ···
 , → ––··––           9 → ––––·          g → ––·         t → –
 - → –····–           : → –––···         h → ····        u → ··–
 . → ·–·–·–           ; → –·–·–          i → ··          v → ···–
 / → –··–·            = → –···–          j → ·–––        w → ·––
 0 → –––––            ? → ··––··         k → –·–         x → –··–
 1 → ·––––            @ → ·––·–·         l → ·–··        y → –·––
 2 → ··–––            _ → ··––·–         m → ––          z → ––··

From that, you can see that the letters which have symmetrical keyings are:

 " ' ) + , - 0 5 ; = ? e h i k m o p r s t x

So are there palindromic words composed only of the letters E, H, I, K, M, O, P, R, S, T & X? Here are the ones in my words file, longest first:

 sexes rotor toot sees poop peep kook tot
 tit SOS sis pop pip pep oho mom ere eke

(Somewhere, the ghost of Sigmund Freud is going “Hmm …”)

When encoded, rotor (·–· ––– – ––– ·–·) has more dahs than sexes (··· · –··– · ···), so takes longer to transmit. So rotor is the longest word that’s palindromic in both English and Morse.

The characters which have valid Morse codes when reversed are:

 " → "             8 → 2             l → f
 ' → '             9 → 1             m → m
 ) → )             ; → ;             n → a
 + → +             = → =             o → o
 , → ,             ? → ?             p → p
 - → -             a → n             q → y
 0 → 0             b → v             r → r
 1 → 9             d → u             s → s
 2 → 8             e → e             t → t
 3 → 7             f → l             u → d
 4 → 6             g → w             v → b
 5 → 5             h → h             w → g
 6 → 4             i → i             x → x
 7 → 3             k → k             y → q

Note how 1…9 reverse to 9…1. c, j & z don’t stand for anything backwards.

So, with only minimal messing about, here are the words that are palindromes in CW:

 ada → nun              ads → sun              ages → sewn
 ago → own              ail → fin              aim → min
 ana → nan              ani → ian              ant → tan
 ants → stan            boa → nov              eel → fee
 ego → owe              eire → erie            eke → eke
 emir → rime            emit → time            ere → ere
 erie → eire            eris → sire            eros → sore
 etna → nate            fee → eel              feel → feel
 fever → rebel          few → gel              fin → ail
 fins → sail            fool → fool            foot → tool
 foots → stool          footstool → footstool  fop → pol
 gel → few              gem → mew              gets → stew
 gnaw → gnaw            goa → now              gob → vow
 gog → wow              got → tow              hoop → pooh
 ian → ani              ids → sui              kans → sank
 kant → tank            keep → peek            kook → kook
 kroger → rework        leer → reef            leif → lief
 lief → leif            loops → spoof          meet → teem
 mew → gem              min → aim              mir → rim
 mit → tim              mom → mom              moor → room
 nan → ana              nate → etna            nerd → urea
 net → tea              nib → via              nit → tia
 nov → boa              now → goa              nun → ada
 oho → oho              otto → otto            outdo → outdo
 owe → ego              own → ago              owns → sago
 peek → keep            peep → peep            pees → seep
 pep → pep              per → rep              pets → step
 pip → pip              pis → sip              pit → tip
 pol → fop              pooh → hoop            poop → poop
 pop → pop              ports → strop          pot → top
 pots → stop            queer → reedy          quit → tidy
 rebel → fever          reedy → queer          reef → leer
 regor → rower          remit → timer          rep → per
 rework → kroger        rim → mir              rime → emir
 robert → trevor        room → moor            rot → tor
 rotor → rotor          rower → regor          runs → sadr
 sadr → runs            sago → owns            sail → fins
 saints → stains        sangs → swans          sank → kans
 sans → sans            seep → pees            sees → sees
 sewn → ages            sexes → sexes          sip → pis
 sire → eris            sis → sis              sling → waifs
 sloops → spoofs        sore → eros            sos → sos
 spit → tips            spoof → loops          spoofs → sloops
 sports → strops        spot → tops            spots → stops
 stains → saints        stan → ants            step → pets
 stew → gets            sting → waits          stool → foots
 stop → pots            stops → spots          strop → ports
 strops → sports        suds → suds            sui → ids
 sun → ads              sung → wads            swans → sangs
 swig → wigs            swigs → swigs          taint → taint
 tan → ant              tang → want            tank → kant
 tea → net              teem → meet            tet → tet
 tia → nit              tidy → quit            tim → mit
 time → emit            timer → remit          ting → wait
 tip → pit              tips → spit            tit → tit
 tog → wot              tool → foot            toot → toot
 top → pot              tops → spot            tor → rot
 tort → trot            tot → tot              tow → got
 trevor → robert        trot → tort            urea → nerd
 via → nib              vow → gob              wads → sung
 waifs → sling          wait → ting            waiting → waiting
 waits → sting          wang → wang            want → tang
 wig → wig              wigs → swig            wot → tog
 wow → gog

So of all of these, footstool (··–· ––– ––– – ··· – ––– ––– ·–··) is the longest English word that is a palindrome in CW. Here is how it sounds at 18wpm: forwards, backwards.
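
The reversal test can be sketched in a few lines of Python. This is an illustration rather than the script I actually ran (that used Convert::Morse): it carries its own letters-only table with ASCII . and - for dit and dah, and the function names are made up for the example.

```python
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
    "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
    "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
    "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
    "y": "-.--", "z": "--..",
}
LETTER = {code: letter for letter, code in MORSE.items()}


def is_stream_palindrome(word):
    """True if the dits and dahs, ignoring all spacing, read the same reversed."""
    stream = "".join(MORSE[c] for c in word.lower())
    return stream == stream[::-1]


def keyed_backwards(word):
    """Return the word heard when `word` is sent backwards, spacing kept,
    or None if any reversed code is not a letter (as with c, j and z)."""
    out = []
    for c in reversed(word.lower()):
        letter = LETTER.get(MORSE[c][::-1])
        if letter is None:
            return None
        out.append(letter)
    return "".join(out)


for w in ("intransigence", "footstool", "fever"):
    print(w, is_stream_palindrome(w), keyed_backwards(w))
```

A word is a true CW palindrome when keyed_backwards() returns the word itself; feeding every entry in a words file through it should reproduce the list above.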

Trolling the Bruce Nuclear Cost and Clean Air Calculator for Fun & Profit

You might have seen the Bruce Power Cost and Clean Air Calculator. It’s supposed to show that nuclear is both cheap and clean, and using anything else would make your bills and your emissions go through the roof. Well, here are 40+ scenarios that all save money and emissions while using no nuclear and no coal:

  1. 3.9% Solar, 5.7% Wind, 0.2% Gas, 0% Nuclear, 90.2% Hydro and 0% Coal saves $5.09/month and 89.7 t/CO2 annually.
  2. 2.1% Solar, 10.1% Wind, 0.3% Gas, 0% Nuclear, 87.5% Hydro and 0% Coal saves $8.97/month and 89.2 t/CO2 annually.
  3. 0.5% Solar, 23.6% Wind, 0.4% Gas, 0% Nuclear, 75.5% Hydro and 0% Coal saves $2.75/month and 88.7 t/CO2 annually.
  4. 2.1% Solar, 2.5% Wind, 0.8% Gas, 0% Nuclear, 94.6% Hydro and 0% Coal saves $16.32/month and 87.1 t/CO2 annually.
  5. 3.1% Solar, 0.2% Wind, 1.4% Gas, 0% Nuclear, 95.3% Hydro and 0% Coal saves $13.27/month and 84.5 t/CO2 annually.
  6. 0.6% Solar, 5.4% Wind, 1.8% Gas, 0% Nuclear, 92.2% Hydro and 0% Coal saves $19.52/month and 82.7 t/CO2 annually.
  7. 1.9% Solar, 15.8% Wind, 2.5% Gas, 0% Nuclear, 79.8% Hydro and 0% Coal saves $2.48/month and 79.8 t/CO2 annually.
  8. 0.3% Solar, 13.6% Wind, 2.5% Gas, 0% Nuclear, 83.6% Hydro and 0% Coal saves $12.08/month and 79.7 t/CO2 annually.
  9. 3.0% Solar, 11.7% Wind, 2.9% Gas, 0% Nuclear, 82.4% Hydro and 0% Coal saves $1.21/month and 78.0 t/CO2 annually.
  10. 0.1% Solar, 24.8% Wind, 3.1% Gas, 0% Nuclear, 72.0% Hydro and 0% Coal saves $1.35/month and 77.3 t/CO2 annually.
  11. 2.7% Solar, 4.8% Wind, 3.6% Gas, 0% Nuclear, 88.9% Hydro and 0% Coal saves $8.77/month and 75.2 t/CO2 annually.
  12. 4.1% Solar, 1.2% Wind, 3.9% Gas, 0% Nuclear, 90.8% Hydro and 0% Coal saves $5.96/month and 73.6 t/CO2 annually.
  13. 1.3% Solar, 0.3% Wind, 5.6% Gas, 0% Nuclear, 92.8% Hydro and 0% Coal saves $18.44/month and 66.3 t/CO2 annually.
  14. 2.4% Solar, 0.1% Wind, 6.0% Gas, 0% Nuclear, 91.5% Hydro and 0% Coal saves $13.26/month and 64.7 t/CO2 annually.
  15. 3.8% Solar, 4.6% Wind, 6.5% Gas, 0% Nuclear, 85.1% Hydro and 0% Coal saves $1.99/month and 62.2 t/CO2 annually.
  16. 1.4% Solar, 11.8% Wind, 6.8% Gas, 0% Nuclear, 80% Hydro and 0% Coal saves $5.54/month and 61.0 t/CO2 annually.
  17. 2.9% Solar, 5.7% Wind, 7.0% Gas, 0% Nuclear, 84.4% Hydro and 0% Coal saves $4.64/month and 60.1 t/CO2 annually.
  18. 0.6% Solar, 14.4% Wind, 7.6% Gas, 0% Nuclear, 77.4% Hydro and 0% Coal saves $6.09/month and 57.7 t/CO2 annually.
  19. 0.7% Solar, 12.1% Wind, 7.9% Gas, 0% Nuclear, 79.3% Hydro and 0% Coal saves $7.64/month and 56.4 t/CO2 annually.
  20. 2.1% Solar, 2.9% Wind, 8.5% Gas, 0% Nuclear, 86.5% Hydro and 0% Coal saves $104/month and 53.5 t/CO2 annually.
  21. 1.9% Solar, 13.5% Wind, 8.6% Gas, 0% Nuclear, 76.0% Hydro and 0% Coal saves $0.36/month and 53.1 t/CO2 annually.
  22. 2.5% Solar, 3.5% Wind, 8.6% Gas, 0% Nuclear, 85.4% Hydro and 0% Coal saves $7.63/month and 53.1 t/CO2 annually.
  23. 0% Solar, 5.4% Wind, 8.7% Gas, 0% Nuclear, 85.9% Hydro and 0% Coal saves $17.02/month and 52.9 t/CO2 annually.
  24. 0.5% Solar, 0.4% Wind, 8.8% Gas, 0% Nuclear, 90.3% Hydro and 0% Coal saves $19.53/month and 52.4 t/CO2 annually.
  25. 1.6% Solar, 3.9% Wind, 9.7% Gas, 0% Nuclear, 84.8% Hydro and 0% Coal saves $10.31/month and 48.5 t/CO2 annually.
  26. 2.6% Solar, 6.6% Wind, 9.9% Gas, 0% Nuclear, 80.9% Hydro and 0% Coal saves $2.76/month and 47.6 t/CO2 annually.
  27. 0.6% Solar, 9.6% Wind, 10.5% Gas, 0% Nuclear, 79.3% Hydro and 0% Coal saves $8.70/month and 45.2 t/CO2 annually.
  28. 1.4% Solar, 1.0% Wind, 10.5% Gas, 0% Nuclear, 87.1% Hydro and 0% Coal saves $13.58/month and 44.9 t/CO2 annually.
  29. 0.9% Solar, 12.1% Wind, 11.7% Gas, 0% Nuclear, 75.3% Hydro and 0% Coal saves $3.96/month and 39.9 t/CO2 annually.
  30. 0.4% Solar, 13.9% Wind, 12.6% Gas, 0% Nuclear, 73.1% Hydro and 0% Coal saves $3.89/month and 35.7 t/CO2 annually.
  31. 0.3% Solar, 10.7% Wind, 13.3% Gas, 0% Nuclear, 75.7% Hydro and 0% Coal saves $6.89/month and 32.9 t/CO2 annually.
  32. 0.3% Solar, 10.5% Wind, 13.3% Gas, 0% Nuclear, 75.9% Hydro and 0% Coal saves $7.11/month and 32.8 t/CO2 annually.
  33. 0.2% Solar, 17.8% Wind, 13.6% Gas, 0% Nuclear, 68.4% Hydro and 0% Coal saves $0.18/month and 31.8 t/CO2 annually.
  34. 2.3% Solar, 6.9% Wind, 14.0% Gas, 0% Nuclear, 76.8% Hydro and 0% Coal saves $0.96/month and 29.8 t/CO2 annually.
  35. 3.5% Solar, 0.2% Wind, 14.0% Gas, 0% Nuclear, 82.3% Hydro and 0% Coal saves $2.11/month and 29.7 t/CO2 annually.
  36. 0.6% Solar, 15.2% Wind, 14.0% Gas, 0% Nuclear, 70.2% Hydro and 0% Coal saves $0.68/month and 29.6 t/CO2 annually.
  37. 3.1% Solar, 3.4% Wind, 14.9% Gas, 0% Nuclear, 78.6% Hydro and 0% Coal saves $09/month and 26.0 t/CO2 annually.
  38. 2.2% Solar, 3.6% Wind, 16.8% Gas, 0% Nuclear, 77.4% Hydro and 0% Coal saves $2.65/month and 17.8 t/CO2 annually.
  39. 1.4% Solar, 1.3% Wind, 17.1% Gas, 0% Nuclear, 80.2% Hydro and 0% Coal saves $8.29/month and 16.2 t/CO2 annually.
  40. 1.1% Solar, 4.5% Wind, 18.2% Gas, 0% Nuclear, 76.2% Hydro and 0% Coal saves $5.74/month and 11.5 t/CO2 annually.
  41. 0.1% Solar, 13.3% Wind, 19.1% Gas, 0% Nuclear, 67.5% Hydro and 0% Coal saves $0.70/month and 7.9 t/CO2 annually.
  42. 0.1% Solar, 6.4% Wind, 19.8% Gas, 0% Nuclear, 73.7% Hydro and 0% Coal saves $7.47/month and 4.7 t/CO2 annually.
  43. 0.7% Solar, 8.7% Wind, 20.6% Gas, 0% Nuclear, 70% Hydro and 0% Coal saves $1.73/month and 1.2 t/CO2 annually.

Sure, some of these won’t be practical from a dispatch/capacity perspective, but hey, that’s Bruce’s issue to explain away.

I couldn’t have done it without this tiny routine to produce a list of random numbers that all add up to 1. No way was I clicking those sliders 10000+ times. Viewing the source was handy, though.

sub rndnormsum {
    # generate N uniformly distributed random numbers that sum to 1
    # see http://stackoverflow.com/a/2640079/377125
    my $n = shift;        # number of entries to return
    my @arr = ( 0, 1 );
    foreach ( 1 .. ( $n - 1 ) ) {
        push @arr, rand;
    }
    @arr = sort { $a <=> $b } @arr;    # numeric sort; the default is by string
    my @result = ();
    foreach ( 1 .. $n ) {
        push @result, $arr[$_] - $arr[ $_ - 1 ];
    }
    return @result;
}
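
The same trick as a Python sketch, for anyone not fluent in Perl: drop n - 1 random cut points on the interval from 0 to 1, sort them, and take the gaps between neighbours. The function name is carried over from the Perl; sorted() compares floats numerically.

```python
import random


def rndnormsum(n):
    """Return n non-negative random numbers that sum to 1."""
    # n - 1 random cut points plus the two ends of the interval
    cuts = sorted([0.0, 1.0] + [random.random() for _ in range(n - 1)])
    # the gaps between neighbouring cuts are the n shares
    return [b - a for a, b in zip(cuts, cuts[1:])]


shares = rndnormsum(6)    # one share each for solar, wind, gas, nuclear, hydro, coal
print(shares, sum(shares))
```

The sum can differ from 1 by a floating-point whisker, so compare with a tolerance rather than ==.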

Mac to Linux: 1Password to KeePassX

I have too many passwords to remember, so I’ve been using a password manager for years. First there was Keyring for Palm OS, then 1Password on the Mac. 1Password’s a very polished commercial program, but it only has Mac and Windows desktop clients. Sadly, it had to go.

Finding a replacement was tough. It needed to be free, and yet cross-platform. It needed to work on iOS and Android. It also needed to integrate with a cloud service like Dropbox so I could keep my passwords in sync. The only program that met all of these requirements was KeePassX. I’ve stuck with the stable (v 0.4.3) branch rather than the flashy 2.0 version, as the older database format does all I need and is fully portable. MiniKeePass on iOS and KeePassDroid on Android look after my mobile needs. But first, I needed to get my password data out of 1Password.

1Password offers two export formats: a delimited text format (which seemed to drop some of the more obscure fields), and the 1Password Interchange Format (1PIF). The latter is a JSONish format (ಠ_ಠ) containing a dump of all of the internal data structures. There is, of course, no documentation for this file format, because no-one would ever move away from this lovely commercial software, no …

So armed with my favourite swiss army chainsaw, I set about picking the file apart. JSON::XS and Data::Dumper::Simple were invaluable for this process, and pretty soon I had all the fields picked apart that I cared about. I decided to write a converter that wrote KeePassX 1.x XML, since it was readily imported into KeePassX, which could then write a database readable by all of the KeePass variants.

To run this converter you’ll need Perl, the JSON::XS and HTML::Entities modules, and if your Perl is older than about 5.12, the Time::Piece module (it’s a core module for newer Perls, so you don’t have to install it). Here’s the code:

#!/usr/bin/perl -w
# 1pw2kpxxml.pl - convert 1Password Exchange file to KeePassX XML
# created by scruss on 02013/04/21

use strict;
use JSON::XS;
use HTML::Entities;
use Time::Piece;

# print xml header
print <<HEADER;
<!DOCTYPE KEEPASSX_DATABASE>
<database>
 <group>
  <title>General</title>
  <icon>2</icon>
HEADER

##############################################################
# Field Map
#
# 1Password			KeePassX
# ============================  ==============================
# title        			title
# username			username
# password			password
# location			url
# notesPlain			comment
#    -				icon
# createdAt			creation
#    -				lastaccess	(use updatedAt)
# updatedAt			lastmod
#    -				expire		('Never')

# 1PW exchange files are made of single lines of JSON (O_o)
# interleaved with separators that start '**'
while (<>) {
    next if (/^\*\*/);    # skip separator
    my $rec = decode_json($_);

    # throw out records we don't want:
    #  - 'trashed' entries
    #  -  system.sync.Point entries
    next if ( exists( $rec->{'trashed'} ) );
    next if ( $rec->{'typeName'} eq 'system.sync.Point' );

    print '  <entry>', "\n";    # begin entry

    ################
    # title field
    print '   <title>', xq( $rec->{'title'} ), '</title>', "\n";

    ################
    # username field - can be in one of two places
    my $username = '';

    # 1. check secureContents as array
    foreach ( @{ $rec->{'secureContents'}->{'fields'} } ) {
        if (
            (
                exists( $_->{'designation'} )
                && ( $_->{'designation'} eq 'username' )
            )
          )
        {
            $username = $_->{'value'};
        }
    }

    # 2.  check secureContents as scalar
    if ( $username eq '' ) {
        $username = $rec->{'secureContents'}->{'username'}
          if ( exists( $rec->{'secureContents'}->{'username'} ) );
    }

    print '   <username>', xq($username), '</username>', "\n";

    ################
    # password field - as username
    my $password = '';

    # 1. check secureContents as array
    foreach ( @{ $rec->{'secureContents'}->{'fields'} } ) {
        if (
            (
                exists( $_->{'designation'} )
                && ( $_->{'designation'} eq 'password' )
            )
          )
        {
            $password = $_->{'value'};
        }
    }

    # 2.  check secureContents as scalar
    if ( $password eq '' ) {
        $password = $rec->{'secureContents'}->{'password'}
          if ( exists( $rec->{'secureContents'}->{'password'} ) );
    }

    print '   <password>', xq($password), '</password>', "\n";

    ################
    # url field
    print '   <url>', xq( $rec->{'location'} ), '</url>', "\n";

    ################
    # comment field
    my $comment = '';
    $comment = $rec->{'secureContents'}->{'notesPlain'}
      if ( exists( $rec->{'secureContents'}->{'notesPlain'} ) );
    $comment = xq($comment);    # pre-quote
    $comment =~ s,\\n,<br/>,g;  # replace escaped NL with HTML
    $comment =~ s,\n,<br/>,mg;  # replace NL with HTML
    print '   <comment>', $comment, '</comment>', "\n";

    ################
    # icon field (placeholder)
    print '   <icon>2</icon>', "\n";

    ################
    # creation field
    my $creation = localtime( $rec->{'createdAt'} );
    print '   <creation>', $creation->datetime, '</creation>', "\n";

    ################
    # lastaccess field
    my $lastaccess = localtime( $rec->{'updatedAt'} );
    print '   <lastaccess>', $lastaccess->datetime, '</lastaccess>', "\n";

    ################
    # lastmod field (= lastaccess)
    print '   <lastmod>', $lastaccess->datetime, '</lastmod>', "\n";

    ################
    # expire field (placeholder)
    print '   <expire>Never</expire>', "\n";

    print '  </entry>', "\n";    # end entry
}

# print xml footer
print <<FOOTER;
 </group>
</database>
FOOTER

exit;

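sub xq {                         # encode string for XML
    my $s = shift;               # lexical, so the caller's $_ is left alone
    $s = '' unless defined($s);  # some entries have no location
    return encode_entities( $s, q/<>&"'/ );
}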

To run it,

./1pw2kpxxml.pl data.1pif > data.xml

You can then import data.xml into KeePassX.
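
If you only want to poke at a 1PIF file rather than convert it, the reading half can be sketched in Python. This assumes nothing beyond what the Perl above relies on: one JSON object per line, separator lines starting with '**', and the username or password living in one of two places under secureContents. The sample records, including their typeName values and the separator text, are invented for the example.

```python
import json


def read_1pif(lines):
    """Yield the wanted records from an iterable of 1PIF lines."""
    for line in lines:
        if line.startswith("**") or not line.strip():
            continue                              # separator or blank line
        rec = json.loads(line)
        if "trashed" in rec or rec.get("typeName") == "system.sync.Point":
            continue                              # same filters as the Perl
        yield rec


def find_field(rec, designation):
    """Look for 'username' or 'password' in both places it can hide."""
    secure = rec.get("secureContents", {})
    value = ""
    for field in secure.get("fields", []):        # 1. the array of form fields
        if field.get("designation") == designation:
            value = field.get("value", "")
    if value == "":                               # 2. a plain scalar
        value = secure.get(designation, "")
    return value


# invented sample data, not from a real export
sample = [
    '{"typeName": "example.Login", "title": "Example", "secureContents": '
    '{"fields": [{"designation": "username", "value": "alice"}, '
    '{"designation": "password", "value": "hunter2"}]}}',
    "***separator***",
    '{"typeName": "example.Password", "title": "Router", '
    '"secureContents": {"password": "admin"}}',
    '{"typeName": "system.sync.Point", "title": "sync"}',
]

for rec in read_1pif(sample):
    print(rec["title"], find_field(rec, "username"), find_field(rec, "password"))
```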

Please be careful to delete the 1PIF file and the data.xml once you’ve finished the export/import. These files contain all of your passwords in plain text; if they fell into the wrong hands, it would be a disaster for your online identity. Be careful that none of these files accidentally slip onto backups, too. Also note that, while I think I’m quite a trustworthy bloke, to you, I’m Some Random Guy On The Internet. Check this code accordingly; I don’t warrant it for anything save for looking like line noise.

Now on github: scruss / 1pw2kpxxml, or download: 1pw2kpxxml.zip (gpg signature: 1pw2kpxxml.zip.sig)

SHA1 Checksums:

  • 3c25eb72b2cfe3034ebc2d251869d5333db74592 — 1pw2kpxxml.pl
  • 99b7705ff30a2b157be3cfd29bb1d4f137920c25 — readme.txt
  • de4a51fbe0dd6371b8d68674f71311a67da76812 — 1pw2kpxxml.zip
  • f6bd12e33b927bff6999e9e80506aef53e6a08fa — 1pw2kpxxml.zip.sig.txt

The converter has some limitations:

  • All attached files in the database are lost.
  • All entries are stored under the same folder, with the same icon.
  • It has not been widely tested, and as I’m satisfied with its conversion, it will not be developed further.

the russian peasants are multiplying!

Via this post, I found out about Russian Peasant Multiplication, a rather clever method of multiplication that only requires doubling, halving and adding. So I wrote some code to display it:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import sys
results=[]
indicator=' '

left=int(sys.argv[1])
right=int(sys.argv[2])

while right >= 1:
    indicator='X'                  # right number is even: row is struck out
    if right % 2:
        indicator=' '              # right number is odd,
        results.append(left)       #  so add left number to results
    print(" %s %16d \t %16d %s" % (indicator, left, right, indicator))
    left *= 2
    right //= 2                    # integer halving, remainder discarded

print("%s × %s = %s = %d" % (sys.argv[1], sys.argv[2],
                             ' + '.join(map(str,results)), sum(results)))

So to multiply 571 × 293:

$ ./rpmult.py 571 293
                571                   293  
 X             1142                   146 X
               2284                    73  
 X             4568                    36 X
 X             9136                    18 X
              18272                     9  
 X            36544                     4 X
 X            73088                     2 X
             146176                     1  
571 × 293 = 571 + 2284 + 18272 + 146176 = 167303

Python’s still got some weirdness compared to Perl; where I’d join the list of sum terms in Perl with join(' + ', @results), in Python you have to convert the integer values to strings, then call the join method of the separator string: ' + '.join(map(str,results)). Still, I’ll give Python props for having a built-in list sum() function, which Perl lacks.

learning to tolerate python

Python is okay, I guess, but there’s not a hint of music to it. I’m a dyed-in-the-wool Perl programmer since 4.036 days. When I think of how I’ll solve a programming problem, I think in Perl (or, more rarely, in PostScript, but I really have to be pretty off-balance to be thinking in stacks). I’m learning Python because all of the seemingly nifty open source geospatial software uses it, and if I’m to write anything for or about the Raspberry Pi, it seems that Python is the language they officially support on it.

So I’m learning Python by porting some of the simple Perl tools I use around here. It’s painful, not just dealing with the language’s Fortranesque space-significance, but also physically; I think I put my shoulder out picking up Mark Lutz’s giant books on Python. The first program I chose to port matches input lines against known words in the system dictionary file. Here’s the Perl version:

#!/usr/bin/perl -w

use strict;
use constant WORDLIST => '/usr/share/dict/words';

my %words;
open( WORDS, '<', WORDLIST ) or die "cannot open wordlist: $!\n";
while (<WORDS>) {
    chomp;
    my $word  = lc($_);
    $words{$word}++;
}
close(WORDS);

# now read candidate words from stdin
while (<>) {
  chomp;
  $_=lc($_);
  print $_,"\n" if defined($words{$_});
}

exit;

I most recently used this to look for available call signs that — minus the number — were real words. The input lines from the available call sign list look like this:

VA3PHZ
VA3PIA
VA3PID
VA3PIF
VA3PIH
...

so if I strip out the 3s and run it through the program:

sed 's/3//;' va3_avail.txt | ./callsigncheck.pl

I get one hit: vapid. Which is now my call sign, VA3PID. Moohah.

The Python version is much shorter, and I’m semi-impressed with the nifty little trick in line 5 (aka ‘dictionary comprehension’) which offers some hope for the future of terse, idiomatic code. The fileinput module gives Perlish stdin-or-ARGV[] file input, without which I’m sunk.

#!/usr/bin/python
import fileinput                        # Perl-like file input

# get our wordlist
words={w.lower(): 1 for w in open('/usr/share/dict/words', 'r').read().split()}

# read through input looking for matching words
for l in fileinput.input():
    ll=l.lower().rstrip()
    if words.get(ll, 0):
        print(ll)

(So far, I’ve found the PLEAC – Programming Language Examples Alike Cookbook useful in comparing the languages.)

Parsing ADIF with Perl

In ham radio, we’re plagued with a data log standard called ADIF, the Amateur Data Interchange Format. It certainly is amateur, in the bad sense of the word. It looks like someone once saw SGML in a fever dream, and wrote down what little they remembered.

Anyway, the following Perl snippet will parse an ADIF file into an array of hashes. It was based on some code from PerlMonks that kinda worked. This works for all the file (singular) I tested.

#!/usr/bin/perl -w
# modified from perlmonks - TedPride - http://www.perlmonks.org/?node_id=559222
use strict;

my ( $temp, @results ) = '';

### Fast forward past header
while (<>) {
  last if m/<eoh>\s+$/i;
}

### While there are records remaining...
while (<>) {
  $temp .= $_;

  ### Process if end of record tag reached
  if (m/<eor>\s+$/i) {
    my %hash;
    $temp =~ s/\n//g;
    $temp =~ s/<eoh>.*//i;
    $temp =~ s/<eor>.*//i;
    my @arr = split( '<', $temp );
    foreach (@arr) {
      next if (/^$/);
      my ( $key, $val ) = split( '>', $_ );
      $key =~ s/:.*$//;
      $hash{$key} = $val unless ( $key eq '' );
    }
    push @results, \%hash;
    $temp = '';
  }
}

# example: just pull out CALL and GRIDSQUARE for each record that has them
foreach (@results) {
  next unless exists( $_->{GRIDSQUARE} );
  print join( "\t", $_->{CALL}, $_->{GRIDSQUARE} ), "\n";
}

exit;

If you want some real code to manipulate ADIF files, adifmerg works.
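
One thing the snippet above ignores is the length that ADIF puts in every data tag (<NAME:LENGTH>, optionally followed by :TYPE). Here is a hedged Python sketch that honours it, so a value containing < or > survives intact. The function name and the sample log are made up for illustration, and I have not run it against real-world logs.

```python
import re

# <NAME>, <NAME:LENGTH> or <NAME:LENGTH:TYPE>
TAG = re.compile(r"<([A-Za-z0-9_]+)(?::(\d+))?(?::[^>]*)?>")


def parse_adif(text):
    """Parse ADIF text into a list of dicts keyed by upper-cased field name."""
    header_end = re.search(r"<eoh>", text, re.IGNORECASE)
    pos = header_end.end() if header_end else 0   # fast forward past header
    records, rec = [], {}
    while True:
        m = TAG.search(text, pos)
        if m is None:
            break
        name, pos = m.group(1).upper(), m.end()
        if name == "EOR":                         # end of record
            records.append(rec)
            rec = {}
        elif m.group(2) is not None:
            length = int(m.group(2))
            rec[name] = text[pos:pos + length]    # take exactly LENGTH characters
            pos += length
    return records


# invented sample log
sample = (
    "made-up header text\n<adif_ver:5>3.1.0\n<eoh>\n"
    "<call:6>VA3PID<gridsquare:4>FN03<eor>\n"
    "<call:5>VE3XX<mode:2>CW<eor>\n"
)

for qso in parse_adif(sample):
    if "GRIDSQUARE" in qso:
        print(qso["CALL"], qso["GRIDSQUARE"], sep="\t")
```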

creating a TrueType font from your handwriting with your scanner, your printer, and FontForge

Hey, this post is super old!
That means that installation and run instructions may not work as well, or even at all. Most of the *Ports Apple software repositories have given way to Homebrew: you may have some success on Mac (untested by me) if you brew install netpbm fontforge potrace. There’s also some font cleanup I’d recommend, like resolving overlaps, adding extrema, and rounding points to integer. One day I may update this post, but for now, I’m leaving it as is.

This looks more than a bit like my handwriting

because it is my handwriting! Sure, the spacing of the punctuation needs major work, and I could have fiddled with the baseline alignment, but it’s legible, which is more than can usually be said of my own chicken-scratch.

This process is a little fiddly, but all the parts are free, and it uses free software. This all runs from the command line. I wrote and tested this on a Mac (with some packages installed from DarwinPorts), but it should run on Linux. It might need Cygwin under Windows; I don’t know.

Software you will need:

  • a working Perl interpreter
  • NetPBM, the free graphics converter toolkit
  • FontForge, the amazing free font editor. (Yes, I said amazing. I didn’t say easy to use …)
  • autotrace or potrace so that FontForge can convert the scanned bitmaps to vectors
  • some kind of bitmap editor.

You will need to download

  • fonttrace.pl – splits up a (very particular) bitmap grid into character cells
  • chargrid.pdf – the font grid template for printing

Procedure:

  1. Print at least the first page of chargrid.pdf. The second page is guidelines that you can place under the page. This doesn’t work very well if you use thick paper.
  2. Draw your characters in the boxes. Keep well within the lines; there’s nothing clever about how fonttrace.pl splits the page up.
  3. Scan the page, making sure the page is as straight as possible and the scanner glass is spotless. You want to scan in greyscale or black and white.
  4. Crop/rotate/skew the page so the very corners of the character grid table are at the edges of the image. I find it helpful at this stage to clean off any specks/macules. I also scale and threshold the image so I get a very dark image at 300-600dpi.
  5. Save the image as a Portable Bitmap (PBM). It has to be 1-bit black and white. You might want to put a new font in a new folder, as the next stage creates lots of files, and might overwrite your old work.
  6. Run fonttrace.pl like this:
    fonttrace.pl infile.pbm | sh
    If you miss out the call to the shell, it will just print out the commands it would have run to create the character tiles.
  7. This should result in a bunch of files called uniNNNN.png in the current folder, such as uni0057.png (W), uni0069.png (i), uni0073.png (s), uni0070.png (p) and uni0079.png (y).
  8. Fire up FontForge. You’ll want to create a New font. Now File→Import…, and use Image Template as the format. Point it at the first of the image tiles (uni0020.png), and Import.
  9. Select Edit→Select→All, then Element→Autotrace. You’ll see your characters appear in the main window.
  10. And that’s – almost – it. You’ll need to fiddle with (auto)spacing, set up some kerning tables, set the font name (in Element→Font Info … – and you’ll probably want to set the em scale to 1024, as TrueType fonts like powers of two), then File→Generate Fonts. FontForge will throw you a bunch of warnings and suggestions, and I’d recommend reading the help to find out what they mean.
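
To give an idea of what "nothing clever" means in steps 2 and 4, here's a small Python sketch of splitting a cropped image into equal cells. This is not the code in fonttrace.pl: the function names, the row and column counts, and the even-division rule are all assumptions for illustration. Only the uniNNNN.png naming comes from the steps above.

```python
def cell_boxes(width, height, rows, cols):
    """Yield (row, col, left, top, right, bottom) for an evenly divided grid.

    Assumes the grid's corners sit exactly at the image corners,
    which is why the cropping step matters so much.
    """
    for row in range(rows):
        for col in range(cols):
            left = col * width // cols
            right = (col + 1) * width // cols
            top = row * height // rows
            bottom = (row + 1) * height // rows
            yield row, col, left, top, right, bottom


def tile_name(codepoint):
    """File name in the uniNNNN.png style (four hex digits, upper case)."""
    return f"uni{codepoint:04X}.png"
```

With this, tile_name(ord("W")) gives uni0057.png. Any skew or leftover margin in the scan shifts every cell boundary, so characters drawn near the lines get clipped.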

There are a couple of limitations to the process:

  • Most of the above process could be written into a FontForge script to make things easier
  • Only ASCII characters are supported, to keep the number of scanned pages small. Sorry. I’d really like to support more. You’re free to build on this.

Lastly, a couple of extra files:

  • CrapHand2.pbm – a sample array drawn by me, gzipped for your inconvenience (and no, I don’t know why WordPress is changing the file extension to ‘pbm_’ either).
  • chargrid.ods – the OpenOffice spreadsheet used to make chargrid.pdf

Have fun! Write nicely!

Calculating the second last Friday of the month

My boss, bless ’im (no really, do; he’s a sound bloke, great guy to work for, and is just getting through some serious health problems), needs a monthly status report on the second last Friday of every month. I live by my calendar applications reminding me to do things, so I thought it’d be no problem getting Outlook to set up a reminder.

No dice; it will only set up appointments on the 1st, 2nd, 3rd etc., starting from the beginning of the month. I did a web search, and really thought I’d found a solution for iCal. It was not to be; this was for a Unix program called ICal; dratted case-insensitive search. Curiously, it appears that the ics spec might support a second-from-last syntax, but Outlook and iCal (and Google Calendar) can’t create them. Phooey.

So I tried Excel, and really thought I’d found the basis of an answer: Last Friday of the month. And indeed, most of their assumptions are right; the code

DATE(year,month+1,1)-WEEKDAY(DATE(year,month+1,1),1)

really does give you the date of the last Saturday in the month. But you can’t assume that the day before the last Saturday is the last Friday – it is the second last, if the month ends on a Friday (April 2010 is a test case).
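
To check that claim without opening a spreadsheet, here's the same arithmetic as a Python sketch. The helper names are mine, and excel_weekday is only my imitation of WEEKDAY(date, 1), which counts Sunday as 1 through Saturday as 7.

```python
from datetime import date, timedelta


def excel_weekday(d):
    """Imitate Excel's WEEKDAY(d, 1): Sunday=1 ... Saturday=7."""
    return d.isoweekday() % 7 + 1


def last_saturday(year, month):
    """DATE(year,month+1,1) - WEEKDAY(DATE(year,month+1,1),1), in Python."""
    # first day of the following month, rolling over into next year after December
    first_of_next = date(year + month // 12, month % 12 + 1, 1)
    return first_of_next - timedelta(days=excel_weekday(first_of_next))


def day_before_last_saturday(year, month):
    """The tempting (and sometimes wrong) guess at the last Friday."""
    return last_saturday(year, month) - timedelta(days=1)
```

For April 2010 the last Saturday is the 24th, so the day before it is the 23rd, while the real last Friday is the 30th. For November 2009 the guess happens to land on the 27th, which is right, and that's what makes the shortcut so easy to trust.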

So I tried the Swiss Army chainsaw of brute-force date calculation: Perl with Date::Calc. What I do here is create an array of every Friday in the month, then print the second last member; never known to fail:

#!/usr/bin/perl -w
# second_last_friday.pl - show date of 2nd last friday
use strict;
use Date::Calc qw(Today
  Nth_Weekday_of_Month_Year
  Add_Delta_YMD);
my ( $new_year, $new_month ) = 0;

my ( $year, $month, $day ) = Today;
foreach ( 1 .. 24 ) {
    my @fridays = ();   # for every friday in this month
    foreach my $week ( 1 .. 5 ) {
        if (
            ( $new_year, $new_month, $day ) =
            Nth_Weekday_of_Month_Year(
                $year, $month, 5, $week
            )
          )
        {               # day of week 5 is Friday
            push @fridays, $day;
        }
        else {
            last;       # not a valid Friday
        }
    }
    printf( "%4d/%02d/%02d\n",
        $year, $month, $fridays[-2] );
    ( $year, $month, $day ) =
      Add_Delta_YMD( $year, $month, 1, 0, 1, 0 )
      ;                 # month++
}
exit;

and this gives


2009/11/20
2009/12/18
2010/01/22
2010/02/19
2010/03/19
2010/04/23
2010/05/21

...

See, notice the tricksy 23 April 2010, which – considering thirty days hath April et al – ends on a Friday and threw that simple Excel calculation off.
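
The brute-force idea isn't tied to Perl or Date::Calc. Here's a minimal Python sketch of the same approach using only the standard library: list every Friday in the month, then take the second last one. The function name is mine.

```python
import calendar
from datetime import date


def second_last_friday(year, month):
    """List every Friday in the month, then return the second last."""
    days_in_month = calendar.monthrange(year, month)[1]
    fridays = [
        day
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == calendar.FRIDAY
    ]
    # every month has at least four Fridays, so [-2] always exists
    return date(year, month, fridays[-2])
```

second_last_friday(2010, 4) gives 23 April 2010, matching the Perl output above.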

I’m disappointed that all these new applications like Outlook and iCal don’t seem to handle dates as elegantly as the old unix programs I used to use. pcal, in particular, could generate incredibly complex date formulae. I must dig around to solve this problem – and for now, actually have to remember to write that report on the second last Friday of this month …

renaming files to include datestamp

My Marantz PMD-620 has a reliable internal clock, and stamps the files with the time that recording stopped. File times are remarkably fragile, so I wanted to make sure that the times were preserved in the file name. Perl’s rename utility does this rather well, as it allows you to use arbitrary code in a rename operation. So:

rename -n 'use POSIX qw(strftime); my $mtime=(stat($_))[9]; s/\.WAV$//; $_ .= strftime("-%Y%m%d%H%M%S",localtime($mtime)); s/$/.WAV/;' *.WAV

which, for files 1007.WAV and 1008.WAV recorded last night, results in:

1007.WAV renamed as 1007-20091024192436.WAV
1008.WAV renamed as 1008-20091024193438.WAV

To actually rename the files, remove the -n from the command line. I left it in so you couldn’t blame me for b0rking up your files if you typed first, thought later.

There are probably smarter ways to handle the file extension. This works for me. Perfection comes later.
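
If you don't have Perl's rename to hand, the same idea can be sketched in Python. This is an illustration rather than a drop-in replacement: the function names are mine, and dry_run plays the part of -n, so nothing is renamed unless you ask for it.

```python
import os
import time


def stamped_name(path):
    """Return path with the file's modification time inserted before the extension."""
    root, ext = os.path.splitext(path)
    mtime = os.stat(path).st_mtime
    stamp = time.strftime("%Y%m%d%H%M%S", time.localtime(mtime))
    return f"{root}-{stamp}{ext}"


def rename_with_stamp(paths, dry_run=True):
    """Report each planned rename; only touch the files when dry_run is False."""
    for path in paths:
        new = stamped_name(path)
        print(f"{path} renamed as {new}")
        if not dry_run:
            os.rename(path, new)
```

Using os.path.splitext is one of those smarter ways to handle the extension: it works for any suffix, not just .WAV.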

how does he do that?

Someone asked how the automatic podcast works. It’s a bit complex, and they probably will be sorry they asked.

I have all my music saved as MP3s on a server running Firefly Media Server. It stores all its information about tracks in a SQLite database, so I can very easily grab a random selection of tracks.

Since I know the name of the track and the artist from the Firefly database, I have a selection of script lines that I can feed to flite, a very simple speech synthesizer. Each of these spoken lines is stored as a wav file, and then each candidate MP3 is converted to wav, and the whole mess is joined together using SoX. SoX also created the nifty (well, I think so) intro and outro sweeps.

The huge wav file of the whole show is converted to MP3 using LAME and uploaded to my webhost with scp. All of this process is done by one Perl script – it also creates the web page, the RSS feed, and even logs the tracks on Last.fm.

Couldn’t be simpler.
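
For the curious, the "grab a random selection of tracks" step might look something like this Python sketch. It is not the Perl script that runs the podcast. The table name songs, its artist, title and path columns, and the wording of the spoken line are all assumptions for illustration, so check your own database's schema before relying on them.

```python
import sqlite3


def random_tracks(conn, count):
    """Pick `count` random (artist, title, path) rows.

    Assumes a table called songs with these three columns.
    """
    cur = conn.execute(
        "SELECT artist, title, path FROM songs ORDER BY RANDOM() LIMIT ?",
        (count,),
    )
    return cur.fetchall()


def intro_line(artist, title):
    """One made-up script line of the kind a speech synthesizer could read out."""
    return f"Next up, {title}, by {artist}."
```

ORDER BY RANDOM() shuffles the whole table before taking the first few rows, which is fine for a few thousand tracks.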

m4a2mp3

m4a2mp3 – convert AAC to MP3. Uses Perl, LAME and faad. Semi-gracefully converts weird iTunes genres to ID3v2, or to “Other” if it’s something else. Uses lame’s new VBR settings, so you end up with an MP3 not massively bigger than the source M4A.

PS: broke the 8000 tunes on the Firefly server …

bbtrackerwpt – create GPX files of named waypoints from bbtracker

I like bbtracker – it’s a very simple GPS track logger for the Blackberry. It has (at least, at the current version) one problem – you can’t create waypoints in the way that most GPS applications would expect. You can, however, name trackpoints – so I wrote a little Perl script to extract all the named trackpoints from an exported GPX file, and save them as waypoints.

Download bbtrackerwpt – converts named trackpoints from bbtracker GPX into waypoints. You’ll need XML::Simple for this to work.

I imagine this script has a limited audience, and quite likely a limited lifetime. The author of bbtracker has said they’d provide waypoint support in the next version. You know me and patience, though …

If I remembered more XSLT, I’d have done this the proper way. As is, I create XML using Perl print statements. I’m probably okay, as the name field is the only piece of free-form text, and I do some rudimentary escaping of characters that XML doesn’t like. The output seems to validate, which is more than the GPX that bbtracker produces does. The length of your GPS track may vary 😉
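
Here's roughly what the less hacky way could look like, as a Python sketch using the standard library's XML parser. This is not the bbtrackerwpt script. It assumes a named trackpoint is a trkpt element with a name child, and it ignores namespaces on input so it doesn't care which GPX version was exported. The function names and the creator value are made up.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def named_trackpoints(gpx_text):
    """Return (lat, lon, name) for every trkpt that has a non-empty name child."""
    points = []
    for elem in ET.fromstring(gpx_text).iter():
        if local(elem.tag) != "trkpt":
            continue
        for child in elem:
            if local(child.tag) == "name" and child.text:
                points.append((elem.get("lat"), elem.get("lon"), child.text.strip()))
    return points


def waypoints_gpx(points):
    """Build a minimal GPX 1.1 document holding one wpt per point."""
    root = ET.Element(
        "gpx",
        version="1.1",
        creator="sketch",
        xmlns="http://www.topografix.com/GPX/1/1",
    )
    for lat, lon, name in points:
        wpt = ET.SubElement(root, "wpt", lat=lat, lon=lon)
        ET.SubElement(wpt, "name").text = name  # the library escapes &, < and >
    return ET.tostring(root, encoding="unicode")
```

Letting the library write the XML means the escaping of the name field is handled for you, which is the part I was doing by hand.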

auplabels – extract times of tracks in an Audacity file for adding labels

auplabels – extract times of tracks in an Audacity file for adding labels (download).

Audacity 1.3’s method of track splitting has always seemed a pain, so I wrote the above to help me.

Running auplabels file.aup will generate a somewhat sparse file of track offsets:

0.00000000
191.57333333
376.08000000
550.76000000

You’ll want to edit this to add track names (there should be a tab between the first column and the title):

0.00000000      Battle of the Blues
191.57333333    I Quit My Job
376.08000000    Ain't Goin' My Way
550.76000000    Wake Up Hill

If you use File -> Import… -> Labels… to import this into your project, the label track should exactly align with your track splits.
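
If you already have the titles in a list, the editing step can be scripted. A tiny Python sketch, assuming the two-column layout shown above (offset, a tab, then the title); the function name is mine.

```python
def label_lines(offsets, titles):
    """Pair each track offset (in seconds) with its title, tab-separated."""
    if len(offsets) != len(titles):
        raise ValueError("need exactly one title per offset")
    return [f"{offset:.8f}\t{title}" for offset, title in zip(offsets, titles)]
```

Joining the result with newlines and saving it gives a file in the same shape as the hand-edited one.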

(Of course, this should really be an XML application since Audacity AUP files are XML, but issues were had.)

how to fix the annoying Ubuntu/Debian XML::SAX install problems

Debian and its derived distributions have a policy about packages not being able to modify the configuration of other packages. While this might generally seem like a good idea, for the TIMTOWTDI world of Perl, this causes problems.

The problem arises if you have installed Perl XML modules from both CPAN and the Debian (or Ubuntu, or whatever) repositories. Debian’s modifications subtly break the XML::SAX module, on which most Perl XML modules (including the brilliant XML::Simple) depend. If you’ve been naughty and used a module from CPAN, Debian gets its knickers in a knot, and won’t configure or run anything remotely related to libxml-sax-perl.

If you get the error Can't locate object method "save_parsers_debian" via package "XML::SAX" at /usr/bin/update-perl-sax-parsers line 90, your system is affected. Another clue is that your Perl XML handlers freak out and fail in weird ways.

Here’s a method (there’s always more than one, of course) to fix it. This was combined from a couple of sources, each of which was on the right track but didn’t entirely work. Actually, the first might’ve been right on the money, but my hiragana’s a bit ropey …

  1. make sure you’ve got your system up to date with apt-get or aptitude.
  2. sudo cpan CPANPLUS (this will ask you lots of questions, to which you should almost always answer with the default)
  3. sudo cpanp -u XML::SAX (this takes quite a while, and produces no output for most of it)
  4. LC_ALL=C sudo apt-get install --reinstall libxml-sax-perl (the LC_ALL=C might not be strictly necessary, but it worked for me)

You must remember never to pretend to be smarter than the Debian maintainers, and suitably chastened, may now return to your normal OpenSSH patching activities …

Rise Up Singing! in freedb

It took me a while, but I finally put all the track information for Sing Out!’s Rise Up Singing teaching CDs (also on the artists’ website) on freedb. I was given the data just over a year ago by Mark D. Moss, the editor of Sing Out! magazine.
The discs are:

Perhaps what took longest was working out a UTF-8 safe processing workflow, from converting the original Excel table to e-mailing the entries to the freedb server. Let’s just say that OpenOffice, sqlite, and Perl were very helpful here.

now it really works

While I said quite early on that I had Ubuntu Feisty running in 64-bit, it wasn’t until today I got things really how I liked it. My earlier Perl problem was due to a broken gcc setup; all is happy now, and all the modules I’ve ever used are built and running as expected.

The one thing I’ll probably never get going is Citrix Metaframe presentation client. There’s no AMD64 package for it. I’m hardly heartbroken, as I still have two machines on which it runs just fine.

now feisty – and 64 bit!

I reinstalled Ubuntu completely last night, and took the opportunity to go to AMD64 mode. I had to sacrifice the cheapo ndiswrapper wireless card, so am now running a switch off the wireless bridge. So it works now!

It looks like Perl really doesn’t like 64-bit. CPAN’s having difficulty building.