xml – We Saw a Chicken …

svgo: silently destroying creators’ rights since whenever

svgo is, on the face of it, pretty neat: it takes those huge vector graphic files and squozes them down to something more acceptable. Unfortunately, though, the authors have seen too many files with junk machine-generated <metadata> sections, and decided that it’s all worthless.

Metadata isn’t junk; it’s provenance. Your RDF? Gone. Your diligently researched and carefully crafted Dublin Core entries? Blown away. The licence you agonized over? teh g0ne, man. svgo does this by default. It would be very easy to use this tool to take someone else’s graphic, strip out the ownership information, and claim it as your own. It would be wrong to do that, but the original creator would have to find your rip-off and go to the effort of challenging your use of it. All so much work, all so easily avoided.

You can make svgo do the right thing by calling it this way:

svgo --disable=removeMetadata -i infile.svg -o outfile.svg

There’s apparently a config option to make this permanent, but the combination of javascript, no docs and YAML brings me out in hives. Given that the metadata section of a complex file is typically a couple of percent of the total, it’s worth keeping. Software passes; but data lives forever, so be kind to it.
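If you want to be sure a given run didn't eat your metadata, a quick before-and-after check is easy enough. This is only a rough sketch: the script name and sub names are my own invention, nothing here is part of svgo, and the regex test is a crude stand-in for a real XML parse.

```perl
#!/usr/bin/perl
# check_svg_metadata.pl - hypothetical helper, not part of svgo
use strict;
use warnings;

# Return true if the SVG text contains a <metadata> element
# (optionally namespace-prefixed). Crude regex test, not a real parse.
sub has_metadata {
    my ($svg) = @_;
    return $svg =~ /<(?:\w+:)?metadata\b/ ? 1 : 0;
}

# Report whether metadata present in the input survived optimization.
sub metadata_survived {
    my ( $before, $after ) = @_;
    return 1 unless has_metadata($before);    # nothing to lose
    return has_metadata($after);
}

# Read a whole file into a string.
sub slurp {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "can't read $path: $!";
    local $/;
    my $text = <$fh>;
    close($fh);
    return $text;
}

# usage: check_svg_metadata.pl infile.svg outfile.svg
if ( @ARGV == 2 ) {
    my ( $in, $out ) = map { slurp($_) } @ARGV;
    my $msg = metadata_survived( $in, $out ) ? 'kept' : 'LOST';
    print "metadata $msg\n";
}
```

Run it against the same infile.svg and outfile.svg you gave svgo. It only tells you the element is still there, not that its contents are intact.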

Mac to Linux: 1Password to KeePassX

I have too many passwords to remember, so I’ve been using a password manager for years. First there was Keyring for Palm OS, then 1Password on the Mac. 1Password’s a very polished commercial program, but it only has Mac and Windows desktop clients. Sadly, it had to go.

Finding a replacement was tough. It needed to be free, and yet cross-platform. It needed to work on iOS and Android. It also needed to integrate with a cloud service like Dropbox so I could keep my passwords in sync. The only program that met all of these requirements was KeePassX. I’ve stuck with the stable (v 0.4.3) branch rather than the flashy 2.0 version, as the older database format does all I need and is fully portable. MiniKeePass on iOS and KeePassDroid on Android look after my mobile needs. But first, I needed to get my password data out of 1Password.

1Password offers two export formats: a delimited text format (which seemed to drop some of the more obscure fields), and the 1Password Interchange Format (1PIF). The latter is a JSONish format (ಠ_ಠ) containing a dump of all of the internal data structures. There is, of course, no documentation for this file format, because no-one would ever move away from this lovely commercial software, no …

So armed with my favourite swiss army chainsaw, I set about picking the file apart. JSON::XS and Data::Dumper::Simple were invaluable for this process, and pretty soon I had all the fields picked apart that I cared about. I decided to write a converter that wrote KeePassX 1.x XML, since it was readily imported into KeePassX, which could then write a database readable by all of the KeePass variants.
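The picking-apart went roughly like the sketch below: decode each line and dump the resulting structure to see what's in there. To keep it runnable without CPAN, I've assumed the core JSON::PP and Data::Dumper modules as stand-ins for JSON::XS and Data::Dumper::Simple. The sample record and the separator text are made up; only the field names come from the converter.

```perl
use strict;
use warnings;
use JSON::PP;        # core stand-in for JSON::XS
use Data::Dumper;    # core stand-in for Data::Dumper::Simple

# Decode every JSON line of 1PIF text, skipping '**' separator lines.
sub parse_1pif {
    my ($text) = @_;
    my @records;
    foreach my $line ( split /\n/, $text ) {
        next if $line =~ /^\*\*/;     # separator
        next if $line =~ /^\s*$/;     # blank line
        push @records, decode_json($line);
    }
    return @records;
}

# Invented sample: one record followed by a made-up separator line.
my $sample = <<'SAMPLE';
{"title":"Example","location":"https://example.com/","secureContents":{"fields":[{"designation":"username","value":"alice"}]}}
**made-up-separator**
SAMPLE

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easier to compare
print Dumper($_) foreach parse_1pif($sample);
```

Dumping a handful of real records this way is how you find out that the username can live in more than one place.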

To run this converter you’ll need Perl, the JSON::XS and HTML::Entities modules, and if your Perl is older than about 5.12, the Time::Piece module (it’s a core module for newer Perls, so you don’t have to install it). Here’s the code:

#!/usr/bin/perl -w
# 1pw2kpxxml.pl - convert 1Password Exchange file to KeePassX XML
# created by scruss on 02013/04/21

use strict;
use JSON::XS;
use HTML::Entities;
use Time::Piece;

# print xml header
print <<HEADER;
<!DOCTYPE KEEPASSX_DATABASE>
<database>
 <group>
  <title>General</title>
  <icon>2</icon>
HEADER

##############################################################
# Field Map
#
# 1Password			KeePassX
# ============================  ==============================
# title        			title
# username			username
# password			password
# location			url
# notesPlain			comment
#    -				icon
# createdAt			creation
#    -				lastaccess	(use updatedAt)
# updatedAt			lastmod
#    -				expire		('Never')

# 1PW exchange files are made of single lines of JSON (O_o)
# interleaved with separators that start '**'
while (<>) {
    next if (/^\*\*/);    # skip separator
    my $rec = decode_json($_);

    # throw out records we don't want:
    #  - 'trashed' entries
    #  -  system.sync.Point entries
    next if ( exists( $rec->{'trashed'} ) );
    next if ( $rec->{'typeName'} eq 'system.sync.Point' );

    print '  <entry>', "\n";    # begin entry

    ################
    # title field
    print '   <title>', xq( $rec->{'title'} ), '</title>', "\n";

    ################
    # username field - can be in one of two places
    my $username = '';

    # 1. check secureContents as array
    foreach ( @{ $rec->{'secureContents'}->{'fields'} } ) {
        if (
            (
                exists( $_->{'designation'} )
                && ( $_->{'designation'} eq 'username' )
            )
          )
        {
            $username = $_->{'value'};
        }
    }

    # 2.  check secureContents as scalar
    if ( $username eq '' ) {
        $username = $rec->{'secureContents'}->{'username'}
          if ( exists( $rec->{'secureContents'}->{'username'} ) );
    }

    print '   <username>', xq($username), '</username>', "\n";

    ################
    # password field - as username
    my $password = '';

    # 1. check secureContents as array
    foreach ( @{ $rec->{'secureContents'}->{'fields'} } ) {
        if (
            (
                exists( $_->{'designation'} )
                && ( $_->{'designation'} eq 'password' )
            )
          )
        {
            $password = $_->{'value'};
        }
    }

    # 2.  check secureContents as scalar
    if ( $password eq '' ) {
        $password = $rec->{'secureContents'}->{'password'}
          if ( exists( $rec->{'secureContents'}->{'password'} ) );
    }

    print '   <password>', xq($password), '</password>', "\n";

    ################
    # url field
    print '   <url>', xq( $rec->{'location'} ), '</url>', "\n";

    ################
    # comment field
    my $comment = '';
    $comment = $rec->{'secureContents'}->{'notesPlain'}
      if ( exists( $rec->{'secureContents'}->{'notesPlain'} ) );
    $comment = xq($comment);    # pre-quote
    $comment =~ s,\\n,<br/>,g;  # replace escaped NL with HTML
    $comment =~ s,\n,<br/>,mg;  # replace NL with HTML
    print '   <comment>', $comment, '</comment>', "\n";

    ################
    # icon field (placeholder)
    print '   <icon>2</icon>', "\n";

    ################
    # creation field
    my $creation = localtime( $rec->{'createdAt'} );
    print '   <creation>', $creation->datetime, '</creation>', "\n";

    ################
    # lastaccess field
    my $lastaccess = localtime( $rec->{'updatedAt'} );
    print '   <lastaccess>', $lastaccess->datetime, '</lastaccess>', "\n";

    ################
    # lastmod field (= lastaccess)
    print '   <lastmod>', $lastaccess->datetime, '</lastmod>', "\n";

    ################
    # expire field (placeholder)
    print '   <expire>Never</expire>', "\n";

    print '  </entry>', "\n";    # end entry
}

# print xml footer
print <<FOOTER;
 </group>
</database>
FOOTER

exit;

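sub xq {                         # encode string for XML
    my $s = shift;               # use a lexical; don't clobber $_
    $s = '' unless defined $s;   # missing fields become empty strings
    return encode_entities( $s, q/<>&"'/ );
}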

To run it,

./1pw2kpxxml.pl data.1pif > data.xml

You can then import data.xml into KeePassX.

Please be careful to delete the 1PIF file and the data.xml once you’ve finished the export/import. These files contain all of your passwords in plain text; if they fell into the wrong hands, it would be a disaster for your online identity. Be careful that none of these files accidentally slip onto backups, too. Also note that, while I think I’m quite a trustworthy bloke, to you, I’m Some Random Guy On The Internet. Check this code accordingly; I don’t warrant it for anything save for looking like line noise.

Now on github: scruss / 1pw2kpxxml, or download: 1pw2kpxxml.zip (gpg signature: 1pw2kpxxml.zip.sig)

SHA1 Checksums:

3c25eb72b2cfe3034ebc2d251869d5333db74592 – 1pw2kpxxml.pl
99b7705ff30a2b157be3cfd29bb1d4f137920c25 – readme.txt
de4a51fbe0dd6371b8d68674f71311a67da76812 – 1pw2kpxxml.zip
f6bd12e33b927bff6999e9e80506aef53e6a08fa – 1pw2kpxxml.zip.sig.txt
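If you'd rather check those sums in Perl than trust whatever sha1sum is lying around, something like this should do. It's a minimal sketch using the core Digest::SHA module; the sub names and the script name in the usage comment are my own.

```perl
use strict;
use warnings;
use Digest::SHA;

# Return the lowercase hex SHA-1 digest of a file's contents.
sub sha1_of_file {
    my ($path) = @_;
    my $sha = Digest::SHA->new(1);    # 1 selects SHA-1
    $sha->addfile( $path, 'b' );      # 'b' = read in binary mode
    return $sha->hexdigest;
}

# Compare a file against an expected checksum, ignoring case.
sub checksum_ok {
    my ( $path, $expected ) = @_;
    return lc( sha1_of_file($path) ) eq lc($expected) ? 1 : 0;
}

# usage: sha1check.pl file expected-sha1
if ( @ARGV == 2 ) {
    my ( $file, $want ) = @ARGV;
    my $msg = checksum_ok( $file, $want ) ? 'OK' : 'MISMATCH';
    print "$file: $msg\n";
}
```

A matching checksum only says the file is the one I uploaded; the gpg signature is what ties it to me.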

The converter has some limitations:

All attached files in the database are lost.
All entries are stored under the same folder, with the same icon.
It has not been widely tested, and as I’m satisfied with its conversion, it will not be developed further.

Updated: Augustus Carp, Esq: by Himself

I’ve updated the markup for Augustus Carp, Esq: by Himself, and now host a local copy. Apart from changing the TeX-style quotes to proper Unicode typographic ones, the main change has been converting the images to SVG for added crispness.

You’ll like it. It’s all about piety gone wrong.

implicit markup: easy to read, hard to parse

I don’t usually ponder about other people’s blog postings, but Jeff Atwood’s Responsible Open Source Code Parenting reminded me of some of the old wars that there used to be when I was a markup head. Jeff writes about his frustration that John Gruber’s Markdown text-to-html filter:

hasn’t been updated for some time
doesn’t quite do exactly what Jeff’s users at Stack Overflow want
appears to have any changes in its behaviour from v1.0.1 strenuously vetoed by Gruber himself.

Markdown is nice in that you can write screeds of text, and it does almost exactly what you’d expect. The markup doesn’t get in the way, usually. The difficulty arises when implicit markup (indented lines for quoted text, bulleted lists, highlights) has to give way to explicit (cross-references, code samples). Explicit markup is ugly, but sometimes, you’ve got to do it. Complex intent requires complex modes of communication, and sometimes plain text just hasn’t the bandwidth.

[As an aside, there was a hilarious lengthy recurring episode on John Mark Ockerbloom’s late bookpeople mailing list where a user (mercilessly skewered here) insisted that they could write a general Gutenberg plain-text converter that would re-create typeset quality in an e-book reader with no explicit markup, and that XML was completely unnecessary and ill-conceived. The un-markup language, called zen markup language (said user had an aversion to the shift key) lives on only in a single website: the home of z.m.l. As for XML, its executive assistant had no comment on the matter.]

Looking at Markdown, it looks like Gruber’s moved on from it. He made a 1.0.1 which did what he wanted. The code’s there to change if anyone needs it. I understand his frustration at people wanting to make changes and still call it Markdown; I’d be annoyed if I had text which I thought was in one format suddenly not be accepted, or do something unexpected. Seriously, that’s almost as bad as ‘deprecated’.

[At least Gruber didn’t go on a deletion rampage, like (admittedly smaller-time) erstwhile CHDK stalwart Barney Fife did when he was slighted in a forum. Looks like almost everything he contributed to CHDK has been removed, including some very useful control scripts and explanations.]

Personally, when I need to make text to web conversions, I still use txt2html and a bunch of shell and Perl glue to feed to tidy. It’s on its third maintainer, doesn’t do much, but does it simply. And I’m pretty simple that way.

Update: see also On my increasing exasperation with Markdown.

bbtrackerwpt – create GPX files of named waypoints from bbtracker

I like bbtracker – it’s a very simple GPS track logger for the Blackberry. It has (at least, at the current version) one problem – you can’t create waypoints in the way that most GPS applications would expect. You can, however, name trackpoints – so I wrote a little perl script to extract all the named trackpoints from an exported GPX file, and save them as waypoints.

Download bbtrackerwpt – converts named trackpoints from bbtracker GPX into waypoints. You’ll need XML::Simple for this to work.

I imagine this script has a limited audience, and quite likely a limited lifetime. The author of bbtracker has said they’d provide waypoint support in the next version. You know me and patience, though …

If I remembered more XSLT, I’d have done this the proper way. As is, I create XML using Perl print statements. I’m probably okay, as the name field is the only piece of free-form text, and I do some rudimentary escaping of characters that XML doesn’t like. The output seems to validate, which is more than the GPX that bbtracker produces does. The length of your GPS track may vary 😉
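For the curious, the print-statement approach looks something like this. It's a sketch rather than the actual bbtrackerwpt code: the sub names and sample points are invented, and it skips the XML::Simple parsing step to show only the escaping and output side.

```perl
use strict;
use warnings;

# Escape the five characters that XML treats specially.
sub xml_escape {
    my ($s) = @_;
    $s = '' unless defined $s;
    $s =~ s/&/&amp;/g;     # must come first, or it re-escapes the others
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    $s =~ s/'/&apos;/g;
    return $s;
}

# Build one GPX <wpt> element from a latitude, longitude and name.
sub wpt {
    my ( $lat, $lon, $name ) = @_;
    return sprintf( qq{ <wpt lat="%s" lon="%s"><name>%s</name></wpt>\n},
        $lat, $lon, xml_escape($name) );
}

# Made-up sample points; only the ones with a name become waypoints.
my @trackpoints = (
    { lat => '43.6532', lon => '-79.3832', name => 'Tim & Co <cafe>' },
    { lat => '43.6540', lon => '-79.3840' },
);

print qq{<?xml version="1.0" encoding="UTF-8"?>\n};
print qq{<gpx version="1.1" creator="sketch" xmlns="http://www.topografix.com/GPX/1/1">\n};
foreach my $pt ( grep { defined $_->{name} } @trackpoints ) {
    my $element = wpt( $pt->{lat}, $pt->{lon}, $pt->{name} );
    print $element;
}
print "</gpx>\n";
```

Escaping the ampersand first matters: do it last and you'll mangle the entities you just wrote.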

how to fix the annoying Ubuntu/Debian XML::SAX install problems

Debian and its derived distributions have a policy about packages not being able to modify the configuration of other packages. While this might generally seem like a good idea, for the TIMTOWTDI world of Perl, this causes problems.

The problem arises if you have installed Perl XML modules from both CPAN and the Debian (or Ubuntu, or whatever) repositories. Debian’s modifications subtly break the XML::SAX module, on which most Perl XML modules (including the brilliant XML::Simple) depend. If you’ve been naughty and used a module from CPAN, Debian gets its knickers in a knot, and won’t configure or run anything remotely related to libxml-sax-perl.

If you get the error Can’t locate object method “save_parsers_debian” via package “XML::SAX” at /usr/bin/update-perl-sax-parsers line 90, your system is affected. Another clue is that your Perl XML handlers freak out and fail in weird ways.

Here’s a method (there’s always more than one, of course) to fix it. This was combined from a couple of sources, each of which was on the right track but didn’t entirely work. Actually, the first might’ve been right on the money, but my hiragana’s a bit ropey …

make sure you’ve got your system up to date with apt-get or aptitude.
sudo cpan CPANPLUS (this will ask you lots of questions, to which you should almost always answer with the default)
sudo cpanp -u XML::SAX (this takes quite a while, and produces no output for most of it)
LC_ALL=C sudo apt-get install --reinstall libxml-sax-perl (the LC_ALL=C might not be strictly necessary, but it worked for me)
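Once the steps above have run, a quick sanity check that the modules actually load doesn't hurt. This is a small sketch of my own, not something from either of the sources; try_module is a made-up helper, and the parser listing assumes XML::SAX->parsers() behaves as its documentation describes.

```perl
use strict;
use warnings;

# Try to load a module by name; return (1, '') on success
# or (0, error text) on failure.
sub try_module {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    my $ok = eval { require $file; 1 };
    return $ok ? ( 1, '' ) : ( 0, $@ );
}

foreach my $module (qw(XML::SAX XML::Simple)) {
    my ( $loaded, $error ) = try_module($module);
    my $status = $loaded ? 'loads' : 'FAILS';
    print "$module: $status\n";
}

# If XML::SAX loaded, list the parsers it knows about.
if ( ( try_module('XML::SAX') )[0] ) {
    my $parsers = eval { XML::SAX->parsers() } || [];
    print "  parser: $_->{Name}\n" foreach @{$parsers};
}
```

If XML::SAX loads but lists no parsers, the reinstall step probably didn't take.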

You must remember never to pretend to be smarter than the Debian maintainers, and suitably chastened, may now return to your normal OpenSSH patching activities …

DocBook XSL: The Complete Guide

This is neat: DocBook XSL: The Complete Guide. Thanks for the indirect link, Emma!