unicode – We Saw a Chicken …

I (U+1F494, BROKEN HEART) UTF-8

Something has gone very wrong with the database encoding on this blog after a recent update, so all my lovely UTF-8 characters have gone mojibake.

Trying to find ways to fix it. It may have to be manual. Remember, kids: have backups before letting WordPress upgrade!

Here’s the Python equivalent of what I think the database has done:

bytes("I 💔 UTF-8", encoding='utf-8').decode(encoding='cp1252')
'I ðŸ’” UTF-8'
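Here is a minimal sketch of that round trip and its reversal, assuming the damage really is "UTF-8 bytes read back as Windows-1252" as guessed above. The function names `mojibake` and `unmojibake` are made up for illustration.

```python
def mojibake(text: str) -> str:
    """Encode as UTF-8, then misread those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def unmojibake(garbled: str) -> str:
    """Undo mojibake(): re-encode as Windows-1252, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


original = "I \U0001F494 UTF-8"      # U+1F494 BROKEN HEART
garbled = mojibake(original)
repaired = unmojibake(garbled)

print(ascii(garbled))                # escaped, so it prints on any terminal
print(repaired == original)
```

The reversal only works while every byte survives the trip. Python's cp1252 codec leaves five bytes undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so any character whose UTF-8 encoding contains one of them raises an error in `mojibake()`, and a database that silently drops such bytes loses the character for good.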

Quite why my hosting thought a character encoding from last century was appropriate, I’ll never know.

Update, November 2023: kinda-sorta fixed the backend, but the encoding is still weird. Can we…?

𒐳 / ༳ == ( ⑽ - 𐹭 ) * ( 𒐲 / 𐅉 ), of course

I just got brian d. foy’s Learning Perl 6 from the library. It’s a pretty good book, though it’ll take a good few readings for some of Perl 6’s features to stick.

Since Perl 6 is built using Unicode from the ground up, it does two rather wonderful things when dealing with numbers:

regular expressions match numerals beyond 0–9: ٤ is as much four as 4
numeric constants can (pretty much) be expressed in terms of Unicode values in your Perl 6 source code. Assigning π to a variable does what you think it does. Dividing by ¼ is the same as multiplying by, well, ٤.
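For comparison, a rough Python sketch of the same two ideas. Python does not accept these glyphs as numeric literals in source code, so this only covers parsing strings, and the constant names are made up for the example.

```python
import re
import unicodedata

ARABIC_INDIC_FOUR = "\u0664"   # ARABIC-INDIC DIGIT FOUR
ONE_QUARTER = "\u00bc"         # VULGAR FRACTION ONE QUARTER

# str patterns are Unicode-aware by default, so \d matches any decimal digit
digit_matches = re.fullmatch(r"\d", ARABIC_INDIC_FOUR) is not None

# int() accepts decimal digits from any script
four = int(ARABIC_INDIC_FOUR)

# unicodedata.numeric() returns the character's numeric value as a float
quarter = unicodedata.numeric(ONE_QUARTER)

print(digit_matches, four, quarter, 1 / quarter == four)
```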

So herewith a table (probably incomplete, and very unlikely to render properly for you) of Unicode glyphs accepted by Perl 6 as numeric values:

Value	Glyphs
-0.5	༳
0	0 Ù Û° ß€ à¥¦ à§¦ à©¦ à«¦ à¦ à¯¦ à±¦ à±¸ à³¦ àµ¦ à¹ à» à¼ á€ á‚ áŸ áŸ° á á¥† á§ áª€ áª á á®° á±€ á± â° â‚€ â†‰ â“ª â“¿ ã€‡ ê˜ ê›¯ ê£ ê¤€ ê§ ê© ê¯° ï¼ ð†Š ð’ ð‘¦ ð‘ƒ° ð‘„¶ ð‘‡ ð‘›€ ðŸŽ ðŸ˜ ðŸ¢ ðŸ¬ ðŸ¶ ðŸ„€ ðŸ„
0.0625	à§´ àµ ê ³
0.1	⅒
0.111111	⅑
0.125	à§µ à¶ â…› ê ´ ð’‘Ÿ
0.142857	⅐
0.166667	â…™ ð’‘¡
0.1875	à§¶ à· ê µ
0.2	⅕
0.25	Â¼ à§· à² àµ³ ê ° ð…€ ð¹¼ ð’‘ ð’‘¢
0.333333	â…“ ð¹½ ð’‘š ð’‘
0.375	⅜
0.4	⅖
0.5	Â½ à³ àµ´ à¼ª â³½ ê ± ð… ð…µ ð…¶ ð¹»
0.6	⅗
0.625	⅝
0.666667	â…” ð…· ð¹¾ ð’‘› ð’‘ž
0.75	Â¾ à§¸ à´ àµµ ê ² ð…¸
0.8	⅘
0.833333	â…š ð’‘œ
0.875	⅞
1	1 Â¹ Ù¡ Û± ß à¥§ à§§ à©§ à«§ à§ à¯§ à±§ à±¹ à±¼ à³§ àµ§ à¹‘ à»‘ à¼¡ á á‚‘ á© áŸ¡ áŸ± á ‘ á¥‡ á§‘ á§š áª áª‘ á‘ á®± á± á±‘ â‚ â…Ÿ â… â…° â‘ â‘´ â’ˆ â“µ â¶ âž€ âžŠ ã€¡ ã†’ ãˆ ãŠ€ ê˜¡ ê›¦ ê£‘ ê¤ ê§‘ ê©‘ ê¯± ï¼‘ ð„‡ ð…‚ ð…˜ ð…™ ð…š ðŒ ð‘ ð’¡ ð¡˜ ð¤– ð©€ ð©½ ð˜ ð¸ ð¹ ð‘’ ð‘§ ð‘ƒ± ð‘„· ð‘‡‘ ð‘› ð’• ð’ž ð’¬ ð’´ ð’‘ ð’‘˜ ð ðŸ ðŸ™ ðŸ£ ðŸ ðŸ· ðŸ„‚
1.5	༫
2	2 Â² Ù¢ Û² ß‚ à¥¨ à§¨ à©¨ à«¨ à¨ à¯¨ à±¨ à±º à±½ à³¨ àµ¨ à¹’ à»’ à¼¢ á‚ á‚’ áª áŸ¢ áŸ² á ’ á¥ˆ á§’ áª‚ áª’ á’ á®² á±‚ á±’ â‚‚ â…¡ â…± â‘¡ â‘µ â’‰ â“¶ â· âž âž‹ ã€¢ ã†“ ãˆ¡ ãŠ ê˜¢ ê›§ ê£’ ê¤‚ ê§’ ê©’ ê¯² ï¼’ ð„ˆ ð…› ð…œ ð… ð…ž ð’ ð’¢ ð¡™ ð¤š ð© ð™ ð¹ ð¹¡ ð‘“ ð‘¨ ð‘ƒ² ð‘„¸ ð‘‡’ ð‘›‚ ð’€ ð’– ð’Ÿ ð’£ ð’ ð’µ ð’‘Š ð’‘ ð’‘– ð’‘™ ð¡ ðŸ ðŸš ðŸ¤ ðŸ® ðŸ¸ ðŸ„ƒ
2.5	༬
3	3 Â³ Ù£ Û³ ßƒ à¥© à§© à©© à«© à© à¯© à±© à±» à±¾ à³© àµ© à¹“ à»“ à¼£ áƒ á‚“ á« áŸ£ áŸ³ á “ á¥‰ á§“ áªƒ áª“ á“ á®³ á±ƒ á±“ â‚ƒ â…¢ â…² â‘¢ â‘¶ â’Š â“· â¸ âž‚ âžŒ ã€£ ã†” ãˆ¢ ãŠ‚ ê˜£ ê›¨ ê£“ ê¤ƒ ê§“ ê©“ ê¯³ ï¼“ ð„‰ ð’£ ð¡š ð¤› ð©‚ ðš ðº ð¹¢ ð‘” ð‘© ð‘ƒ³ ð‘„¹ ð‘‡“ ð‘›ƒ ð’ ð’ˆ ð’— ð’ ð’¤ ð’¥ ð’® ð’¯ ð’¶ ð’· ð’º ð’» ð’‘‹ ð’‘‘ ð’‘— ð¢ ðŸ‘ ðŸ› ðŸ¥ ðŸ¯ ðŸ¹ ðŸ„„
3.141592653589793	π
3.5	༭
4	4 Ù¤ Û´ ß„ à¥ª à§ª à©ª à«ª àª à¯ª à±ª à³ª àµª à¹” à»” à¼¤ á„ á‚” á¬ áŸ¤ áŸ´ á ” á¥Š á§” áª„ áª” á” á®´ á±„ á±” â´ â‚„ â…£ â…³ â‘£ â‘· â’‹ â“¸ â¹ âžƒ âž ã€¤ ã†• ãˆ£ ãŠƒ ê˜¤ ê›© ê£” ê¤„ ê§” ê©” ê¯´ ï¼” ð„Š ð’¤ ð©ƒ ð› ð» ð¹£ ð‘• ð‘ª ð‘ƒ´ ð‘„º ð‘‡” ð‘›„ ð’‚ ð’‰ ð’ ð’˜ ð’¡ ð’¦ ð’° ð’¸ ð’¼ ð’½ ð’¾ ð’¿ ð’‘Œ ð’‘’ ð’‘“ ð£ ðŸ’ ðŸœ ðŸ¦ ðŸ° ðŸº ðŸ„…
4.5	༮
5	5 Ù¥ Ûµ ß… à¥« à§« à©« à«« à« à¯« à±« à³« àµ« à¹• à»• à¼¥ á… á‚• á áŸ¥ áŸµ á • á¥‹ á§• áª… áª• á• á®µ á±… á±• âµ â‚… â…¤ â…´ â‘¤ â‘¸ â’Œ â“¹ âº âž„ âžŽ ã€¥ ãˆ¤ ãŠ„ ê˜¥ ê›ª ê£• ê¤… ê§• ê©• ê¯µ ï¼• ð„‹ ð…ƒ ð…ˆ ð… ð…Ÿ ð…³ ðŒ¡ ð’¥ ð¹¤ ð‘– ð‘« ð‘ƒµ ð‘„» ð‘‡• ð‘›… ð’ƒ ð’Š ð’ ð’™ ð’¢ ð’§ ð’± ð’¹ ð’‘ ð’‘” ð’‘• ð¤ ðŸ“ ðŸ ðŸ§ ðŸ± ðŸ» ðŸ„†
5.5	༯
6	6 Ù¦ Û¶ ß† à¥¬ à§¬ à©¬ à«¬ à¬ à¯¬ à±¬ à³¬ àµ¬ à¹– à»– à¼¦ á† á‚– á® áŸ¦ áŸ¶ á – á¥Œ á§– áª† áª– á– á®¶ á±† á±– â¶ â‚† â…¥ â…µ â†… â‘¥ â‘¹ â’ â“º â» âž… âž ã€¦ ãˆ¥ ãŠ… ê˜¦ ê›« ê£– ê¤† ê§– ê©– ê¯¶ ï¼– ð„Œ ð’¦ ð¹¥ ð‘— ð‘¬ ð‘ƒ¶ ð‘„¼ ð‘‡– ð‘›† ð’„ ð’‹ ð’‘ ð’š ð’¨ ð’‘€ ð’‘Ž ð¥ ðŸ” ðŸž ðŸ¨ ðŸ² ðŸ¼ ðŸ„‡
6.5	༰
7	7 Ù§ Û· ß‡ à¥ à§ à© à« à à¯ à± à³ àµ à¹— à»— à¼§ á‡ á‚— á¯ áŸ§ áŸ· á — á¥ á§— áª‡ áª— á— á®· á±‡ á±— â· â‚‡ â…¦ â…¶ â‘¦ â‘º â’Ž â“» â¼ âž† âž ã€§ ãˆ¦ ãŠ† ê˜§ ê›¬ ê£— ê¤‡ ê§— ê©— ê¯· ï¼— ð„ ð’§ ð¹¦ ð‘˜ ð‘ ð‘ƒ· ð‘„½ ð‘‡— ð‘›‡ ð’… ð’Œ ð’’ ð’› ð’© ð’‘ ð’‘‚ ð’‘ƒ ð¦ ðŸ• ðŸŸ ðŸ© ðŸ³ ðŸ½ ðŸ„ˆ
7.5	༱
8	8 Ù¨ Û¸ ßˆ à¥® à§® à©® à«® à® à¯® à±® à³® àµ® à¹˜ à»˜ à¼¨ áˆ á‚˜ á° áŸ¨ áŸ¸ á ˜ á¥Ž á§˜ áªˆ áª˜ á˜ á®¸ á±ˆ á±˜ â¸ â‚ˆ â…§ â…· â‘§ â‘» â’ â“¼ â½ âž‡ âž‘ ã€¨ ãˆ§ ãŠ‡ ê˜¨ ê› ê£˜ ê¤ˆ ê§˜ ê©˜ ê¯¸ ï¼˜ ð„Ž ð’¨ ð¹§ ð‘™ ð‘® ð‘ƒ¸ ð‘„¾ ð‘‡˜ ð‘›ˆ ð’† ð’ ð’“ ð’œ ð’ª ð’‘„ ð’‘… ð§ ðŸ– ðŸ ðŸª ðŸ´ ðŸ¾ ðŸ„‰
8.5	༲
9	9 Ù© Û¹ ß‰ à¥¯ à§¯ à©¯ à«¯ à¯ à¯¯ à±¯ à³¯ àµ¯ à¹™ à»™ à¼© á‰ á‚™ á± áŸ© áŸ¹ á ™ á¥ á§™ áª‰ áª™ á™ á®¹ á±‰ á±™ â¹ â‚‰ â…¨ â…¸ â‘¨ â‘¼ â’ â“½ â¾ âžˆ âž’ ã€© ãˆ¨ ãŠˆ ê˜© ê›® ê£™ ê¤‰ ê§™ ê©™ ê¯¹ ï¼™ ð„ ð’© ð¹¨ ð‘š ð‘¯ ð‘ƒ¹ ð‘„¿ ð‘‡™ ð‘›‰ ð’‡ ð’Ž ð’” ð’ ð’« ð’‘† ð’‘‡ ð’‘ˆ ð’‘‰ ð¨ ðŸ— ðŸ¡ ðŸ« ðŸµ ðŸ¿ ðŸ„Š
10	à¯° àµ° á² â…© â…¹ â‘© â‘½ â’‘ â“¾ â¿ âž‰ âž“ ã€¸ ãˆ© ã‰ˆ ãŠ‰ ð„ ð…‰ ð… ð…— ð… ð…¡ ð…¢ ð…£ ð…¤ ðŒ¢ ð“ ð¡› ð¤— ð©„ ðœ ð¼ ð¹© ð‘› ð©
11	Ⅺ ⅺ ⑪ ⑾ ⒒ ⓫
12	Ⅻ ⅻ ⑫ ⑿ ⒓ ⓬
13	⑬ ⒀ ⒔ ⓭
14	⑭ ⒁ ⒕ ⓮
15	⑮ ⒂ ⒖ ⓯
16	৹ ⑯ ⒃ ⒗ ⓰
17	ᛮ ⑰ ⒄ ⒘ ⓱
18	ᛯ ⑱ ⒅ ⒙ ⓲
19	ᛰ ⑲ ⒆ ⒚ ⓳
20	á³ â‘³ â’‡ â’› â“´ ã€¹ ã‰‰ ð„‘ ð” ð¡œ ð¤˜ ð©… ð ð½ ð¹ª ð‘œ ðª
21	㉑
22	㉒
23	㉓
24	㉔
25	㉕
26	㉖
27	㉗
28	㉘
29	㉙
30	á´ ã€º ã‰Š ã‰š ð„’ ð…¥ ð¹« ð‘ ð«
31	㉛
32	㉜
33	㉝
34	㉞
35	㉟
36	㊱
37	㊲
38	㊳
39	㊴
40	áµ ã‰‹ ãŠµ ð„“ ð¹¬ ð‘ž ð¬
41	㊶
42	㊷
43	㊸
44	㊹
45	㊺
46	㊻
47	㊼
48	㊽
49	㊾
50	á¶ â…¬ â…¼ â†† ã‰Œ ãŠ¿ ð„” ð…„ ð…Š ð…‘ ð…¦ ð…§ ð…¨ ð…© ð…´ ðŒ£ ð©¾ ð¹ ð‘Ÿ ð
60	á· ã‰ ð„• ð¹® ð‘ ð®
70	á¸ ã‰Ž ð„– ð¹¯ ð‘¡ ð¯
80	á¹ ã‰ ð„— ð¹° ð‘¢ ð°
90	áº ð„˜ ð ð¹± ð‘£ ð±
100	à¯± àµ± á» â… â…½ ð„™ ð…‹ ð…’ ð…ª ð• ð¡ ð¤™ ð©† ðž ð¾ ð¹² ð‘¤
200	ð„š ð¹³
300	ð„› ð…« ð¹´
400	ð„œ ð¹µ
500	â…® â…¾ ð„ ð…… ð…Œ ð…“ ð…¬ ð… ð…® ð…¯ ð…° ð¹¶
600	ð„ž ð¹·
700	ð„Ÿ ð¹¸
800	ð„ ð¹¹
900	ð„¡ ðŠ ð¹º
1000	à¯² àµ² â…¯ â…¿ â†€ ð„¢ ð… ð…” ð…± ð¡ž ð©‡ ðŸ ð¿ ð‘¥
2000	ð„£
3000	ð„¤
4000	ð„¥
5000	â† ð„¦ ð…† ð…Ž ð…²
6000	ð„§
7000	ð„¨
8000	ð„©
9000	ð„ª
10000	á¼ â†‚ ð„« ð…• ð¡Ÿ
20000	ð„¬
30000	ð„
40000	ð„®
50000	â†‡ ð„¯ ð…‡ ð…–
60000	ð„°
70000	ð„±
80000	ð„²
90000	ð„³
100000	ↈ
216000	𒐲
432000	𒐳
Inf	∞
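A table like this can be approximated in Python by grouping every code point on its Unicode numeric value. This is only a sketch: `numeric_table` is a made-up name, the rows depend on which Unicode version your Python ships with, and π and ∞ will not show up because they have no numeric value in the Unicode database (Perl 6 treats them as named constants).

```python
import sys
import unicodedata
from collections import defaultdict


def numeric_table() -> dict:
    """Group every code point that has a numeric value by that value."""
    table = defaultdict(list)
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        value = unicodedata.numeric(ch, None)   # None when there is no value
        if value is not None:
            table[value].append(ch)
    return dict(table)


table = numeric_table()
for value in sorted(table)[:5]:
    # print code points rather than glyphs, so any terminal can show them
    print(value, " ".join(f"U+{ord(ch):04X}" for ch in table[value][:8]))
```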

So the title of this post really is accepted as a valid Perl 6 expression in the REPL:

$ perl6
To exit type 'exit' or '^D'
> 𒐳 / ༳ == ( ⑽ - 𐹭 ) * ( 𒐲 / 𐅉 )
True

What does it evaluate to? Well:

𒐳 ‘CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS MIN’ represents 432000
༳ ‘TIBETAN DIGIT HALF ZERO’ represents -½
⑽ ‘PARENTHESIZED NUMBER TEN’ represents 10
𐹭 ‘RUMI NUMBER FIFTY’ represents 50
𒐲 ‘CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS DISH’ represents 216000
𐅉 ‘GREEK ACROPHONIC ATTIC TEN TALENTS’ represents 10.

So both sides come to -864000: 432000 / -½ on the left, and (10 - 50) × (216000 / 10) on the right.
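The arithmetic can be checked outside Perl 6 too. This sketch looks each character up by the Unicode name quoted above and asks Python for its numeric value. `value_of` is a helper invented for the example, and it assumes your Python's Unicode database knows all six characters.

```python
import unicodedata


def value_of(name: str) -> float:
    """Look a character up by its Unicode name and return its numeric value."""
    return unicodedata.numeric(unicodedata.lookup(name))


shar2_gal_min = value_of("CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS MIN")
half_zero = value_of("TIBETAN DIGIT HALF ZERO")
paren_ten = value_of("PARENTHESIZED NUMBER TEN")
rumi_fifty = value_of("RUMI NUMBER FIFTY")
shar2_gal_dish = value_of("CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS DISH")
ten_talents = value_of("GREEK ACROPHONIC ATTIC TEN TALENTS")

left = shar2_gal_min / half_zero
right = (paren_ten - rumi_fifty) * (shar2_gal_dish / ten_talents)
print(left, right, left == right)
```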

Definitely into “just because you can doesn’t mean you should” territory, and a feature to make the Pythonistas reach for the Zantac again, poor dears.

ayaburnie!


nerdy spreadsheet tick/cross formatting

Screenshot from 2014-11-22 08:13:57

The magic custom format string for this is:

[Red][=0]✗;[Black][<>0]✓

Works with LibreOffice and Excel on every platform I’ve tried.

Ⓗⓞⓦ ⓣⓞ ⓑⓔ ⓐⓝⓝⓞⓨⓘⓝⓖ ⓦⓘⓣⓗ Ⓟⓔⓡⓛ ⓐⓝⓓ Ⓤⓝⓘⓒⓞⓓⓔ

It’s been so long since I’ve programmed in Perl. Twelve years ago, it was my life, but what with the Raspberry Pi intervening, I hadn’t used it in a while. It’s been so long, in fact, that I wasn’t aware of the new language structures available since version 5.14. Perl’s Unicode support has got a lot more robust, and I’m sick of Python’s whining about codecs when processing anything other than ASCII anyway. So I thought I’d combine re-learning some modern Perl with some childish amusement.

What I came up with was a routine to convert ASCII alphanumerics ([0-9A-Za-z]) to Unicode Enclosed Alphanumerics ([⓪-⑨Ⓐ-Ⓩⓐ-ⓩ]) for advanced lulz purposes. Ⓘ ⓣⓗⓘⓝⓚ ⓘⓣ ⓦⓞⓡⓚⓢ ⓡⓐⓣⓗⓔⓡ ⓦⓔⓛⓛ:

#!/usr/bin/perl
# annoying.pl - ⓑⓔ ⓐⓝⓝⓞⓨⓘⓝⓖ ⓦⓘⓣⓗ ⓤⓝⓘⓒⓞⓓⓔ
# created by scruss on 2014-05-18

use v5.14;
# fun UTF8 tricks from http://stackoverflow.com/questions/6162484/
use strict;
use utf8;
use warnings;
use charnames qw( :full :short );
use open qw( :std :encoding(UTF-8) );    # print wide characters as UTF-8 without warnings
sub annoyify;

die "usage: $0 ", annoyify('string to print like this'), "\n" if ( $#ARGV < 0 );
say annoyify( join( ' ', @ARGV ) );
exit;

# 💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩

sub annoyify {
    # convert ascii to chars in circles
    my $str = shift;
    my @out;
    foreach ( split( '', $str ) ) {
        my $c = ord($_);             # remember, can be > 127 for UTF8
        if ( $c == charnames::vianame("DIGIT ZERO") ) {
            # 💩💩💩 sigh; this one's real special ... 💩💩💩
            $c = charnames::vianame("CIRCLED DIGIT ZERO");
        }
        elsif ($c >= charnames::vianame("DIGIT ONE")
            && $c <= charnames::vianame("DIGIT NINE") )
        {
            # numerals, 1-9 only (grr)
            $c =
              charnames::vianame("CIRCLED DIGIT ONE") +
              $c -
              charnames::vianame("DIGIT ONE");
        }
        elsif ($c >= charnames::vianame("LATIN CAPITAL LETTER A")
            && $c <= charnames::vianame("LATIN CAPITAL LETTER Z") )
        {
            # upper case
            $c =
              charnames::vianame("CIRCLED LATIN CAPITAL LETTER A") +
              $c -
              charnames::vianame("LATIN CAPITAL LETTER A");
        }
        elsif ($c >= charnames::vianame("LATIN SMALL LETTER A")
            && $c <= charnames::vianame("LATIN SMALL LETTER Z") )
        {
            # lower case
            $c =
              charnames::vianame("CIRCLED LATIN SMALL LETTER A") +
              $c -
              charnames::vianame("LATIN SMALL LETTER A");
        }
        else {
            # pass everything else through unchanged (spaces, punctuation, non-ASCII)
        }
        push @out, chr($c);
    }
    return join( '', @out );
}

Yes, I really did have to do that special case for ⓪; ⓪…⑨ are not contiguous like ASCII 0…9. ⓑⓞⓞ!
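For comparison, the same mapping as a short Python sketch. This is not a port of the Perl above, just the same idea done with a translation table, and it keeps the separate entry for ⓪ (U+24EA), which sits well away from ①…⑨ (U+2460 to U+2468).

```python
def annoyify(text: str) -> str:
    """Map ASCII letters and digits to their circled Unicode equivalents."""
    mapping = {ord("0"): 0x24EA}              # CIRCLED DIGIT ZERO, the odd one out
    for i in range(9):
        mapping[ord("1") + i] = 0x2460 + i    # 1-9 -> U+2460..U+2468
    for i in range(26):
        mapping[ord("A") + i] = 0x24B6 + i    # A-Z -> U+24B6..U+24CF
        mapping[ord("a") + i] = 0x24D0 + i    # a-z -> U+24D0..U+24E9
    # characters missing from the mapping pass through unchanged
    return text.translate(mapping)


result = annoyify("Perl 5")
```

With this table, `annoyify("Perl 5")` gives Ⓟⓔⓡⓛ ⑤, and anything outside [0-9A-Za-z] is left alone.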

Compose yourself, Raspberry Pi!

NB: If you’re gonna complain about the text encoding in this article, please know that the database went and broke itself: I (U+1F494, BROKEN HEART) UTF-8

Years ago, I worked in multilingual dictionary publishing. I was on the computing team, so we had to support the entry and storage of text in many different languages. Computers could display accented and special characters, but we were stuck with 8-bit character sets. This meant that we could only have a little over 200 distinct characters display in the same font at the same time. We’d be pretty much okay doing French & English together, but French & Norwegian started to get a little trying, and Italian & Greek couldn’t really be together at all.

We were very fortunate to be using Sun workstations in the editorial office. These were quite powerful Unix machines, which means that they were a fraction of the speed and capabilities of a Raspberry Pi. Suns had one particularly neat feature:

(source: Compose key, Wikipedia.)

That little key marked “Compose” (to the right of the space bar) acted as a semi-smart typewriter backspace key: if you hit Compose, then the right key combination, an accented character or symbol would appear. Some of the straightforward compose key sequences are:

	Compose +			
Accent	First key	Second key	Result	Example
Acute	'	e	é	café
Grave	`	a	à	déjà
Cedilla	,	c	ç	soupçon
Circumflex	^	o	ô	hôtel
Umlaut	"	u	ü	küche
Ring	o	a	å	Håkon
Slash	/	L	Ł	Łukasiewicz
Tilde	~	n	ñ	mañana
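In effect the Compose key is a lookup from a two-key sequence to one character. A toy Python model of the rows above follows. It is purely illustrative and not how X11 implements it, and it assumes the acute and umlaut rows mean the plain ASCII apostrophe and double-quote keys. The names `COMPOSE_SEQUENCES` and `compose` are made up.

```python
# Keyed by (first key, second key) pressed after Compose; rows from the table above.
COMPOSE_SEQUENCES = {
    ("'", "e"): "\u00e9",   # e with acute
    ("`", "a"): "\u00e0",   # a with grave
    (",", "c"): "\u00e7",   # c with cedilla
    ("^", "o"): "\u00f4",   # o with circumflex
    ('"', "u"): "\u00fc",   # u with umlaut (diaeresis)
    ("o", "a"): "\u00e5",   # a with ring above
    ("/", "L"): "\u0141",   # L with stroke
    ("~", "n"): "\u00f1",   # n with tilde
}


def compose(first: str, second: str):
    """Return the composed character, or None when the sequence is unknown."""
    return COMPOSE_SEQUENCES.get((first, second))


print(ascii(compose("'", "e")), compose("x", "y"))
```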

Like every (non-embedded) Linux system I’ve used, the Raspberry Pi running Raspbian can use the compose key method for entering extra characters. I’m annoyed, though, that almost every setup tutorial either says to disable it, or doesn’t explain what it’s for. Let me fix that for you …

Setup

Run raspi-config

sudo raspi-config

and go to the ~~configure_keyboard~~ “4 Internationalisation Options” → “I3 Change Keyboard Layout” section. Your keyboard’s probably mostly set up the way you want it, so hit the Tab key and select <Ok> until you get to the Compose key section:

Choose whatever is convenient. The combined keyboard and trackpad I use (a SolidTek KB-3910) with my Raspberry Pi has a couple of “Windows® Logo” keys, and the one on the right works for me. Keep the rest of the keyboard options the same, and exit raspi-config. After the message

Reloading keymap. This may take a short while
[ ok ] Setting preliminary keymap...done.

appears, you now have a working Compose key.

Using the Compose key

raspi-config hints (‘On the text console the Compose key does not work in Unicode mode …’) that Compose might not work everywhere with every piece of software. I’ve tested it across quite a few pieces of software, both on the text console and under LXDE, and support seems to be almost universal. The only differences I can find are:

Text Console (a.k.a. the texty bit you see after booting): Despite raspi-config’s warning, accented alphabetical characters do seem to work (é è ñ ö ø å, etc). Most symbols, however, don’t (like ± × ÷ …). The currency symbol for your country is a special case. In Canada, I need to use Compose for € and £, but you’ve probably got a key for that.
LXDE (a.k.a. the mousey bit you see after typing ‘startx’): All characters and symbols I’ve tried work everywhere, in LXTerminal, Leafpad, Midori, Dillo (browser), IDLE, and FocusWriter (a very minimal word processor).

Special characters in Python’s IDLE

To find out which key sequences do what, the Compose key – Wikipedia page is a decent start. I prefer the slightly friendlier Ubuntu references GtkComposeTable and Compose Key, or the almost unreadable but frighteningly comprehensive UTF-8 (Unicode) compose sequence reference (which is essentially mirrored on your Raspberry Pi as the file /usr/share/X11/locale/en_US.UTF-8/Compose). Now go forth and work that Compose key like a boß.

(If you’re on a Mac and feeling a bit left out, you can do something similar with the Option key. Here’s how: Extended Keyboard Accent Codes for the Macintosh. On Windows®? ~~Out of luck, I’m afraid~~ WinCompose!)

disnaeland

Conclusive proof (if any were needed) that Scotland invented Unicode:

didnae

isnae

wasnae

If you try to display a UTF-8 apostrophe on an ISO 8859-15 system, you get a reasonable representation of didnae, isnae and wasnae.
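A quick sketch of why that happens, assuming the "UTF-8 apostrophe" is the curly one (U+2019) and that the terminal shows nothing for control characters. Its three UTF-8 bytes are E2 80 99. ISO 8859-15 reads E2 as â and the other two as invisible C1 controls, so all that is left on screen is an â where the apostrophe was. The helper names are made up.

```python
def as_iso8859_15(text: str) -> str:
    """Encode as UTF-8, then misread the bytes as ISO 8859-15."""
    return text.encode("utf-8").decode("iso8859_15")


def visible(text: str) -> str:
    """Drop C1 control characters (U+0080..U+009F), which usually render as nothing."""
    return "".join(ch for ch in text if not 0x80 <= ord(ch) <= 0x9F)


for word in ("didn\u2019t", "isn\u2019t", "wasn\u2019t"):   # U+2019 apostrophes
    print(ascii(visible(as_iso8859_15(word))))
```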