Tag: cipher

  • ROT13 and other cypher silliness

    I wanted to encode a spoiler in a forum post last night, so I used the ancient ROT13 reciprocal cypher in the time-honoured way. That way, casual readers can’t immediately read the solutions, but they can get them by running the text through the ROT13 cypher again.

    It’s a very simple cypher: the letters a–m are mapped to n–z, and n–z are mapped to a–m in turn. Each letter is rotated 13 places in the alphabet, hence the name. So the phrase “fly at once all is discovered” becomes “syl ng bapr nyy vf qvfpbirerq”. A simple command line invocation might use the tr command like this:

    echo fly at once all is discovered | tr a-mn-z n-za-m
    syl ng bapr nyy vf qvfpbirerq
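
    Because the mapping is reciprocal, feeding the output back through the same tr command should return the original phrase. A minimal sketch of that round trip, where rot13_lower is a made-up helper name rather than a standard command:

```shell
# rot13_lower is a hypothetical helper name; like the rest of this post,
# it only touches lower-case letters.
rot13_lower() {
    tr a-mn-z n-za-m
}

plain='fly at once all is discovered'
cipher=$(printf '%s\n' "$plain" | rot13_lower)

printf '%s\n' "$cipher"                  # the encoded text
printf '%s\n' "$cipher" | rot13_lower    # the original phrase again
```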

    NB:

    1. Because I’m a low-effort hipster, I’m only going to work with lower case letters. Real implementations do more.
    2. Nothing in this article moves the state of cryptography forward in any way. People were doing this kind of thing 2000 years ago.
    3. I’m going to keep using tr, but you probably have a rot13 command.
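
    For what it's worth, handling upper case as well only needs a second pair of ranges. This is a sketch of one common way to do it with tr, not what any particular rot13 command does internally; rot13_mixed is a name invented for the example:

```shell
# rot13_mixed is a hypothetical name. It rotates A-Z and a-z separately and
# leaves digits, spaces and punctuation untouched.
rot13_mixed() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

printf '%s\n' 'Fly at once, all is discovered!' | rot13_mixed
```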

    I noticed that one of the encyphered words in the spoiler spelled onyx, an English word in its own right. Before discovering that this was all documented on Wikipedia already, I wrote some scripts to find which words are also words when run through ROT13:

    tr a-mn-z n-za-m < /usr/share/dict/words | sort |\
      comm -12 <(sort /usr/share/dict/words) - |\
      grep -P '^[a-z]{4,}$'

    This process has three phases:

    1. ROT13 the word list (/usr/share/dict/words) and sort it;
    2. Output the words common to the sorted word list and the ROT13 output using comm;
    3. Finally, output only those words that are all lowercase letters and at least four characters long.
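
    The three phases can be tried on a tiny hand-made word list without touching the system dictionary. This is only a sketch: rot13_words is a made-up helper name, it assumes mktemp is available, it uses a temporary file in place of the <(…) process substitution so it also runs in plain sh, and it swaps grep -P for grep -E, which is enough for this pattern.

```shell
# rot13_words is a hypothetical helper: print the words in file "$1" whose
# ROT13 is also in that file.
rot13_words() (
    export LC_ALL=C                 # same collation for sort and comm
    sorted=$(mktemp) || exit 1
    sort "$1" > "$sorted"
    # Phase 1: ROT13 the list and sort it.
    # Phase 2: comm -12 prints only lines found in both sorted inputs.
    # Phase 3: keep all-lowercase words of four or more letters.
    tr a-mn-z n-za-m < "$1" | sort |
        comm -12 "$sorted" - |
        grep -E '^[a-z]{4,}$'
    rm -f "$sorted"
)

# A tiny stand-in for /usr/share/dict/words.
words=$(mktemp)
printf '%s\n' balk be envy gnat hello onyx or rail tang > "$words"
rot13_words "$words"
```

    Here be/or is a genuine ROT13 pair but is dropped by the length filter, and hello has no partner in the list.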

    With a smallish word list, you’ll get results like:

    balk barf crag ebbs envy errs flap frag gnat ones onyx pent rail reef roof sent sync tang

    gnat/tang is the longest pair whose members are both the ROT13 and the reverse of each other. If you’re into very obscure words, nana/anan work too. (Anan: obsolete interjection meaning ‘in a moment’ or ‘at your service’; 17th century)

    There are more words that ROT13 turns into their own reversal, though the reversal isn’t necessarily a real word: ‘robe’, ‘serf’ and ‘thug’ are fairly common examples. (Yes, I know about ‘Ebor’, archaic name for York.)

    The longest ROT13 self-reversing word I can find is ‘tavering’ (Scots: to wander aimlessly). Next shorter are ‘rebore’, ‘ravine’ and ‘grivet’. Obscure examples are ‘averin’, ‘cherup’ and ‘granet’: all in OED, just.
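
    A search for these self-reversing words might look something like the sketch below. It is not the script actually used for the lists above: rot13_reversers is an invented name, it assumes the rev command is installed, and the word-at-a-time loop is slow on a full dictionary.

```shell
# rot13_reversers is a hypothetical helper: print the words in file "$1"
# (lower-case, four letters or more) whose ROT13 equals the word reversed.
rot13_reversers() {
    grep -E '^[a-z]{4,}$' "$1" | while read -r word; do
        rotated=$(printf '%s\n' "$word" | tr a-mn-z n-za-m)
        reversed=$(printf '%s\n' "$word" | rev)
        if [ "$rotated" = "$reversed" ]; then
            printf '%s\n' "$word"
        fi
    done
}

# A small sample list; onyx and hello should not be printed.
sample=$(mktemp)
printf '%s\n' robe hello gnat tavering onyx > "$sample"
rot13_reversers "$sample"
```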

    Other reciprocal cyphers

    Another historical cypher is Atbash: here, the range a–z is mapped to z–a. The cypher’s name derives from the first, last, second, and second to last Hebrew letters: “אתבש” (which reads right-to-left). A name derived from the Latin alphabet might be something like “azby”.

    Again, tr can do the needful:

    echo fly at once all is discovered | tr a-z zyxwvutsrqponmlkjihgfedcba
    uob zg lmxv zoo rh wrhxlevivw

    Seeing that all becomes zoo, there have to be some good English words that Atbash turns into other English words. Using code similar to the above:

    girl girt glow grog hold holy horn kiln prom slim slob slow tilt told trig trio

    Again, we can see a word pair that’s reversed when encoded: girt/trig. The longest reversed-by-Atbash words I can find are wizard and hovels.
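
    The same kind of search works for any key that can follow ‘tr a-z’. A rough sketch with the key passed in as an argument; cypher_words is a hypothetical helper name, and it assumes mktemp is available:

```shell
# cypher_words is a hypothetical helper: "$1" is a 26-letter key to put
# after 'tr a-z', "$2" is a word list with one word per line. It prints
# the words (lower-case, four letters or more) that encode to another
# word in the same list.
cypher_words() (
    export LC_ALL=C                 # same collation for sort and comm
    sorted=$(mktemp) || exit 1
    sort "$2" > "$sorted"
    tr a-z "$1" < "$2" | sort |
        comm -12 "$sorted" - |
        grep -E '^[a-z]{4,}$'
    rm -f "$sorted"
)

atbash=zyxwvutsrqponmlkjihgfedcba
list=$(mktemp)
printf '%s\n' all girl hold slow trio wizard zoo > "$list"
cypher_words "$atbash" "$list"
```

    all/zoo is a real Atbash pair but falls foul of the four-letter minimum, and wizard encodes to a non-word.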

    There are, in fact, a huge number of reciprocal cyphers you can create this way. As long as, for each letter pair (a, b), you map a→b and b→a, you’ll get a cypher that’s self-decrypting. Here’s an embarrassment of a shell one-liner that will generate a perfectly reciprocal mapping for a–z:

    for i in {a..z}; do echo "$i"; done | shuf | awk '{a[NR]=$0;} END{for (i=1;i<26;i+=2) {b[a[i]]=a[i+1]; b[a[i+1]]=a[i];}; for (m in b) {print m " " b[m];}}' | sort | awk '{print $2}' | fmt -w999 | tr -d ' '

    Here are 21 examples that can be put after ‘tr a-z’ to make a reciprocal cypher:

    bauonpqmtrzxhedfgjwicyslvk
    cyalpqnmzxrdhgwefktsvuojbi
    dcbajlzsxeoftqkrnphmwyuivg
    dknafehgovbmlciwyuxzrjpsqt
    eyfpacoihxutzsgdvwnlkqrjbm
    johmtyrcvawzdubsxgpenikqfl
    khjiznvbdcaouflwxsrymgpqte
    kwtlqosrnxaduifvehgcmpbjzy
    ohsjkvpbrdezqyagmicwxftunl
    petjbzyrmdlkivuaxhwconsqgf
    pyqjmztsvdoxerkacnhgwiulbf
    qldchvyekxibwtszauonrfmjgp
    ufzedbtosqmrkphnjligawvyxc
    uktiljzpdfbexsvhrqncaoymwg
    usdcnprwjitmlexfzgbkayhovq
    vintfeorbzpywcgkuhxdqamslj
    xjgzrmcwubonflksvepyiqhatd
    xoledhwfkpicutbjzyvnmsgarq
    ylfjgcepodrbnmihwktsvuqzax
    yregcwdnmotvihjsxbpkzlfqau
    badcfehgjilknmporqtsvuxwzy

    That last one is what you get if you don’t shuffle the letters before you pair them. My brain keeps wanting to put it in alphabetical order, and it can’t.
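
    If you want to check that a key really is self-decrypting, one way is to run the alphabet through it twice and see whether it comes back unchanged. A small sketch, with is_reciprocal as a made-up helper name:

```shell
# is_reciprocal is a hypothetical helper: succeed if "$1" is a 26-letter
# key that undoes itself when applied twice via 'tr a-z'.
# Note: this also accepts keys that leave some letters unchanged, which
# the pairing approach above never produces.
is_reciprocal() {
    alphabet=abcdefghijklmnopqrstuvwxyz
    [ "${#1}" -eq 26 ] || return 1
    twice=$(printf '%s' "$alphabet" | tr a-z "$1" | tr a-z "$1")
    [ "$twice" = "$alphabet" ]
}

# The unshuffled key from the list, then a plain shift-by-one for contrast.
for key in badcfehgjilknmporqtsvuxwzy bcdefghijklmnopqrstuvwxyza; do
    if is_reciprocal "$key"; then
        echo "$key: reciprocal"
    else
        echo "$key: not reciprocal"
    fi
done
```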

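    There are lots of these keys: by my count 7,905,853,580,625 (= 26! ÷ (2^13 × 13!)) of them, since the 26! orderings of the alphabet each describe the same pairing 2^13 × 13! times over (swap the two letters within any of the 13 pairs, or shuffle the pairs themselves). But they’re all trivially easy to crack. They only swap letters around, so the pattern of letter frequencies in the text survives, and if you have a long enough encrypted message, you can use standard frequency tables to break them.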
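
    To see why frequency tables work, compare how often letters occur before and after encoding. The sketch below is not an attack, just an illustration: count_profile is an invented helper that prints the letter counts, largest first, without saying which letter each count belongs to. For any substitution like these, the two lines it prints should be identical.

```shell
# count_profile is a hypothetical helper: read text on stdin and print how
# often each lower-case letter occurs, largest count first, without the
# letters themselves.
count_profile() {
    tr -cd 'a-z' | fold -w 1 | sort | uniq -c |
        awk '{print $1}' | sort -rn | tr '\n' ' '
}

plain='fly at once all is discovered'
cipher=$(printf '%s\n' "$plain" | tr a-z zyxwvutsrqponmlkjihgfedcba)

printf '%s\n' "$plain"  | count_profile; echo
printf '%s\n' "$cipher" | count_profile; echo
```

    A real message is far longer than this phrase, and that is what lets the most common cyphertext letters be matched against the most common letters of the language.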