scan – We Saw a Chicken …

GTALUG Lightning Talk: Pages â†’ Searchable PDF (â†’ archive.org)

Here are my slides:

Pages_to_SearchablePDF-GTALUG_Lightning_Talk-20210713.odp Download

Or if you like PDF:

Pages_to_SearchablePDF-GTALUG_Lightning_Talk-20210713.pdf Download

Book: Fortran techniques with special reference to non-numerical applications (1972)

Programming flow diagram, with the flow of a program using subroutines on the left ("closed coding") and the same structure on the right written as a series of GOTO-controlled sections ("open coding") to save computer memory and execution time — â€œSubroutines do, however, bring with them considerable
overheads in both space and execution time.â€

Imagine you have a programming task that involves parsing and analyzing text. Nothing complicated: maybe just breaking it into tokens. Now imagine the only programming language you had available:

has no text handling functions at all: you can pack characters into numeric types, but how they are packed and how many you get per type are system dependent;
allows integers in variables starting with the letters Iâ†’N, with Aâ†’H and Oâ†’Z floating point;
has IF â€¦ THEN but no ELSE, with the preferred form being
IF (expr) neg, zero, pos
where expr is the expression to evaluate, and neg, zero and pos are statement labels to jump to if the evaluation is negative, zero or positive, respectively;
has only enough memory for (linear, non-associative) arrays of a couple of thousand entries;
disallows recursion completely;
charges for computing time such that a solo researcher’s work might cost many times their salary in a few weeks.

Sounds impossible, right? But that’s the world described in Colin Day’s book from 1972, Fortran techniques with special reference to non-numerical applications.

The programming language used is USA Standard FORTRAN X3.9 1966, commonly known as Fortran IV after IBM’s naming convention. For all it looks crude today, Fortran was an efficient, sod-the-theory-just-get-the-job-done language that allowed numerical problems to be described as a text program and solved with previously impossible speed. Every computer shipped with some form of Fortran compiler at the time. Day wasn’t alone working within Fortran IV’s text limitations in the early 1970s: the first Unix tools at Bell Labs were written in Fortran IV â€” that was before they built themselves their own toolchain and invented the segmentation fault.

The book is a small (~ 90 page) delight, and is a window into system limitations we might almost find unimaginable. Wanna create a lookup table of a thousand entries? Today it’s a fraction of a thought and microseconds of program time. But nearly fifty years ago, Colin Day described methods of manually creating two small index and target arrays and rolling your own hash functions to store and retrieve stuff. Text? Hollerith constants, mate; that’s yer lot â€” 6HOH HAI might fit in one computer word if you were running on big iron. Sorting and searching (especially without recursion) are revealed to be the immensely complex subjects they are, all hidden behind today’s one-liner methods. Day shows methods to simulate recursion with arrays standing in for pointer stacks of GO TO targets (:coding_horror_face:). And if it’s graphics you want, that’s what the line printer’s for:

Damped cosine 2d function density plot rendered as mono-spaced characters, approximately 60 colums across, made up of only X, 0, *, +, - and space characters — â€œ*â€¦ the most serious drawback to a density plot of the type shown above is the limited number of characters used to represent the height above the page*.â€
(This image was deemed impressive enough by Cambridge University Press that they used it as the cover of the book. The same function became a bit of a visual clichÃ©, with home computers being able to render it in colour and isometric 3D less than a decade later.)

Why do I like this book enough to track down a used copy, import it, scan it, correct it and upload it to the Internet Archive? To me, it shows the layers we now take for granted, and the privilege we have with these hard problems of half a century ago being trivially soluble on a $10 computer the size of a stick of gum. When we run today’s massive AI models with little interest in the underlying assumptions but a sharp focus on getting the results we want, we do a disservice to the years of R&D that got us here.

The â€˜charges for computing timeâ€™ comment above is from Colin’s website. Early central computing facilities had the SaaS billing down solid, partly because many mainframes were rented from the vendor and system usage was accounted for in minute detail. Apparently the system Colin used (when a new lecturer) was at another college, and it was the custom to send periodic invoices for CPU time and storage used back to the user’s department. Nowhere on these invoices did it say that these accounts were for information only and were not payable. Not the best way to greet your users.

(Incidentally, if you hate yourself and everyone else around you, you can get a feel of system billing on any Linux system by enabling user quotas. You’ll very likely stop doing this almost immediately as the restrictions and reporting burden seem utterly alien to us today.)

While the book is still very much in copyright, the copy I have sat unread at Lakehead University Library since June 1995; the due date slip’s still pasted in the back. It’s been out of print at Cambridge University Press since May 1987, even if they do have a plaintive/passive aggressive â€œhey we could totally make an ebook of this if you really want itâ€ link on their site. I â€” and the lovely folks hosting it at the Internet Archive â€” have saved them from what’s evidently too much trouble. I won’t even raise an eyebrow if they pull a Nintendo and start selling this scan.

Colossal thanks to Internet Archive for making the book uploading process much easier than I thought it was. They’ve completely revamped the processing behind it, and the fully open-source engine gives great results. As ever, if you assumed you knew how to do it, think again and read the How to upload scanned images to make a book guide. Uploading a zip file of images is much easier than mucking about with weird command-line TIFF and PDF tools. The resulting PDF is about half the size of the optimized scans I uploaded, and it’s nicely tagged with metadata and contains (mostly) searchable text. It took more than an hour to process on the archive’s spectacularly powerful servers, though, so I hate to think what Colin Day’s bill would have been in 1972 for that many CPU cycles â€¦ or if even a computer of that time, given enough storage, could complete the project by now.

Big old scanned manuals to small old scanned manuals

It is good that there are so many scanned manuals for old computer systems out there. Every old system did things its very own special way, and life’s too short to guess. I mean, there’s not much out there on the SYM-1 I’m trying to get working again:

â€” not much except for 6502.org’s excellent Synertek SYM-1 Resources, that is.

Some manuals, though, while lovingly scanned, are just too large to download, browse or file. Take, for instance, AppleIIScans’ Apple II BASIC Programming With ProDOS. It’s a very faithful colour scan, but at 170 MB for 280 pages, it’s a bit unwieldy. I suspect it’s Adobe Acrobat Paper Capture’s fault: while it makes turning scans into readable files really easy, it doesn’t warn against using 600 dpi full colour for a book with only decorative use of colour.

Good old Ghostscript saves the day, though:

gs -sDEVICE=pdfwrite -sColorConversionStrategy=Gray -dProcessColorModel=/DeviceGray -dPDFSETTINGS=/ebook -dNOPAUSE -dBATCH -dSAFER -q -sOutputFile=1983-A2L2013-m-a2-bpwp-grey.pdf -- 1983-A2L2013-m-a2-bpwp.pdf

By downsampling the scanned images and converting everything to greyscale, the result’s only 16 MB. All text and indexing from Acrobat is left intact.

Library Hand – Disjoint

LibHandDis â€” Based on scans of â€œLibrary Hand – Disjointâ€, described in Dana’s A Library Primer, with some modifications.

Major changes from scan:

As the scan only covered A-Z, a-z, 0-9 and â€˜&â€™, I had to make the rest up.
Many of the descenders had to be shortened to fit with modern typography conventions.
Kerning is much tighter than Dana’s guidelines suggest.

(idea for this came via MetaFilter, This question of library handwriting is an exceedingly practical one)

Local copy: LibHandDis.zip.

GTALUG Lightning Talk(s): Mostly Searchable and Mini Printers

My lightning talk for GTALUG seemed to go down quite well. Here are the slides. It’s mostly based on experience gleaned from My bank broke PDF â€¦ and how I used PDFBeads to fix it. I really must write this up properly â€¦ oh wait, I just did.

I also prepared â€” but didn’t get to use â€” notes on using Mini Printers and Linux. Again, this is fromÂ Thermal Printer driver for CUPS, Linux, and Raspberry Pi: zj-58 and Notes on mini-printers and Linux.

Arabic Geometrical Pattern and Design (Dover Pictorial Archive) [Kindle Edition] review

Product link: Arabic Geometrical Pattern and Design (Dover Pictorial Archive) eBook: J. Bourgoin: Amazon.ca: Kindle Store

Summary: Buy the paper edition; this book is illegible on Kindle.

The original book features very finely engraved line drawings, with construction lines showing how the patterns are built up. The Kindle edition has only low-resolution scans, so the lines break down into noise and are very hard to follow. You can’t zoom in, either. The figure numbering is entirely absent from the Kindle edition, so you can’t use this book for reference. Some of the page scans are squint and partially cut off, too.

Very disappointed in this purchase. You’re better off with the paper than trying to squint at these smudgy pixels.

kindlefail

(unedited text as simultaneously posted to Amazon)

A (mostly) colour-managed workflow for Linux for not too many $$$

Colour management is good. It means that what I see on the screen is what you meant it to look like, and anything I make with a colour-managed workflow you’ll see in the colours I meant it to have. (Mostly.) You can spend a lot of money to do this professionally, but you can also get most of the benefits for about $125, if you’re prepared to do some fiddly stuff.

The most important part is calibrating your display. Hughski’s ColorHug (which I’ve mentioned before) is as close to plug-and-play as you’ll get: plug it in, and the colour management software pops up with prompts on what to do next. Attach the ColorHug to the screen (with the newly supplied stretchy band), let it burble away for 10â€“20 minutes, and the next time you log in, colours will be just right.

Calibrating the scanner on my Epson WorkForce WF-7520 was much more work, and the process could use optimization. To calibrate any scanner, you need a physical colour target to scan and compare against reference data. The cheapest place to get these (unless there was one in the box with your scanner) is Wolf Faust’s Affordable IT 8.7 (ISO 12641) Scanner Colour Calibration Targets. If there are a bunch of likeminded folk in your area, it’s definitely worth clubbing together on a group buy to save on shipping. It’s also less work for Wolf, since he doesn’t have to send out so many little packages.

(I’ve known of Wolf Faust since my Amiga days. He produced the most glorious drivers for Canon printers, and Jeff Walker produced the camera-ready copy for JAM using Wolf’s code. While Macs had the high end DTP sewn up back then, you could do amazing things on a budget with an Amiga.)

The target comes packed in a protective sleeve, and along with a CD-R containing the calibration data which matches the print run of the target. Wolf makes a lot of targets for OEMs, and cost savings from his volume clients allow him to sell to individuals cheaply.

Scanning the thing without introducing automatic image corrections was the hard part. I found that my scanner had two drivers (epson2 and epkowa), the latter of which claimed to support 48-bit scanning. Unfortunately, it only supports 24-bit, like the epson2 driver, so whichever I chose was moot. I used the scanimage command line tool to make the scan:

scanimage --mode Color -x 175 -y 125 --format=tiff --resolution 300 > Epson-Workforce_WF-7520-WFaust-R1.tiff

which looks, when reduced down to web resolution, a bit like this:

It looks a lot darker than the physical target, so it’s clear that the scanner needs calibrating. To do this, you need two tools from the Argyll Colour Management System. The first creates a text representation of the scanned target’s colour patches:

scanin -v Epson-Workforce_WF-7520-WFaust-R1.tiff /usr/share/color/argyll/ref/it8.cht IT87/r130227.txt diag.tiff

The result is a smallish text file Epson-Workforce_WF-7520-WFaust-R1.ti3 which needs one more step to make a standard ICC profile:

colprof -A Epson -M 'Workforce WF-7520' -D 'WFaust R1' -ax -qm Epson-Workforce_WF-7520-WFaust-R1

I didn’t quite need to add that much metadata, but I could, so I did. The resultant ICC file can be used to apply colour calibrations to scanned images. Here’s the target scan, corrected:

Epson-Workforce_WF-7520-WFaust-R1-corrected

(I’ve made this a mouseover with the original image, so you can see the difference. Also, yes, there is a greasy thumb-print on my scanner glass near the bottom right, thank you so much for noticing.)

I used tifficc from theÂ Little CMS package to apply the colour correction:

tifficc -v -i Epson-Workforce_WF-7520-WFaust-R1.icc Epson-Workforce_WF-7520-WFaust-R1.tiff Epson-Workforce_WF-7520-WFaust-R1-corrected.tiff

There are probably many easier, quicker ways of doing this, but this was the first thing I found that worked.

To show you a real example, here’s an un-retouched scan of the cover of Algrove Publishing‘s book â€œAll the Knots You Needâ€, scanned at 75 dpi. Mouseover to see the corrected version:

all-the-knots-you-need_algrove

(Incidentally, there are two old but well-linked programs that are out there that purport to do scanner calibration:Â Scarse and LPROF. Don’t use them! They’re really hard to build on modern systems, and the Argyll tools work well.)

The last part of my workflow that remains uncalibrated is my printer. I could make a target with Argyll, print it, scan it, colour correct it, then use that as the input to colprof as above. I’m suspecting the results would be mediocre, as my scanner’s bit depth isn’t great, and I’d have to do this process for every paper and print setting combination. I’d also have to work out what magic CUPS does and compensate. Maybe later, but not yet.

Wind Power, 1940s style

This is how wind turbines were supposed to look, at least in the 1940s. It’s the experimental Smith-Putnam 1.25 MW unit than ran for a short while on a hill near Rutland, VT. The picture’s from a rather falling-apart copy of Large Horizontal-axis Wind Turbines (Thresher, R. W., & Solar Energy Research Institute. (1982). Large horizontal-axis wind turbines: Proceedings of a workshop held in Cleveland, Ohio, July 28-30, 1981. Golden, Colo: Solar Energy Research Institute) that I rescued from Jim‘s recycling years ago.

The first part of these proceedings has a historical review of the Smith-Putnam turbine, including an excerpt from the S. Morgan Smith Company’s house organ on the project. As the rest of the book is pretty much all about the MOD series of turbines, it’s of less interest. I’ve scanned the bits about the Smith-Putnam turbine, and put them here: NASA_DOE-1981-large_horizontal_axis_wind_turbines-excerpt. If anyone wants the book, let me know. It’s very ratty, but readable.

I’ve written about this turbine before, but in relation to a packet of crayons. More awesome turbine pictures from Paul Gipe: Smith-Putnam Industrial Photos.

.awesome

Here are the complete 1988-vintage Sun manuals â€œUsing NROFF and TROFFâ€ and â€œFormatting Documentsâ€ scanned just for you. I’d scanned these in 2000, and they’d sat on a forgotten archive volume since then.

Update: there are better versions on the Internet Archive: Using NROFF and TROFF and Formatting Documents, all as part of the Sun Microsystems, Inc. manual collection.

(if you need to get your troff on, go to Ralph’s troff.org.)

Too many QR Codes

I have, of late, been rather more attached to QR Codes than might be healthy. I’ve been trying all sorts of sizes and input data, printing them, and seeing what camera phones can scan them. I tried three different devices to scan the codes:

iPhone 4s – 8 MP, running either i-nigma (free) or Denso Wave’s own QRdeCODE ($2). QRdeCODE is better, but then, it should be, since it was created by the developer of the QR Code standard.
Nexus 7 – 1.2 MP, running Google Goggles.
Nokia X2-01 – Catherine‘s new(ish) phone, which I can’t believe only has a 0.3 MP VGA camera on it. Still, it worked for a small range of codes.

QR Code readability is defined by the module size; that is, the number of device pixels (screen or print) that represent a single QR Code pixel. Denso Wave recommends that each module is made up of 4 or more dots. I was amazed that the iPhone could read images with a module size of 1 from the screen, like this one:

hello_____-ei-m01-300dpi

On this laptop, one pixel is about 0.24 mm. The other cameras didn’t fare so well on reading from the screen:

iPhone 4s – Min module size: 1-2 pixels (0.24-0.48 mm/module)
Nexus 7 – Min module size: 2-3 pixels (0.48-0.72 mm/module)
Nokia X2-01 – Min module size: 3-4 pixels (0.72-0.96 mm/module)

So I guess for screen scanning, Denso Wave’s recommendation of 4 pixels/module will pretty much work everywhere.

I then generated and printed a bunch of codes on a laser printer, and scanned them. The results were surprisingly similar:

iPhone 4s – Min module size: 3-4 dots (0.25-0.34 mm/module)
Nexus 7 – Min module size: 4-5 dots (0.34-0.42 mm/module)
Nokia X2-01 – Min module size: 8-9 dots (0.68-0.76 mm/module)

A test print on an inkjet resulted in far less impressive results. I reckon you need to make the module size around 25% bigger on an inkjet than a laser, perhaps because the inkjet is less crisp.

I have to admit I went a bit nuts with QR Codes. I made a Vcard:

(and while I was at it, I created a new field for ham radio operators: X-CALLSIGN. Why not?). I even encoded some locations in QR Codes.

Just to show you what qrencode can do, here’s a favourite piece of little prose:

a_real_man

optar: paper-based archiving

I’ve spent most of the day messing around with Twibright Optar, a way of creating printed archives of binary data that can be scanned back in and restored.Â It looks like it was written as a proof-of-concept, as the only way to change options is to modify the code and recompile. Eppur si muove.

To compile the code on OS X, I found I had to change this line in the Makefile from:

LDFLAGS=-lm

LDFLAGS=-lmÂ  `libpng-config --L_opts`

After trying to print some samples at the default resolution, I had no luck, so for reliability I halved the data density settings in the file optar.h:

#define XCROSSES 33 /* Number of crosses horizontally */
#define YCROSSES 43 /* Number of crosses vertically */

It’s quite important that your image prints and scans with a whole number of printer dots to image pixels. This used to be quite easy to do, before the advent of PDF’s “Scale to fit” misfeature, and also printer drivers that do a tonne of work in the background to “improve” the image. Add the mismatch between laser printer resolutions (300, 600, 1200 dpi …) and inkjets (360, 720, 1440 dpi …), and you’ve got lots of ways that this can go wrong.

Thankfully, there’s one common resolution that works across both types of printers. If you output the image at 120 dpi, that’s 5 laser printer dots at 600 dpi, or six inkjet dots at 720 dpi. And there was peace in the kingdom.

Here’s a demo, based on this:

So I took this track (which I used to have as a 7″, got at a jumble sale in the mid-70s) and converted it to a really low quality MPEG-2.5: MichelinJingle8kbit â€” that’s 175KB for just shy of three minutes of music (which, at this bitrate, sounds like it’s played through a layer of socks at the bottom of the Marianas Trench, but still).

Passing it through optar (which I wish wouldn’t produce PGM files; its output is mono) and bundling the pages into a PDF, I get this: optar_mj.pdf (760KB). Scanning that printout at 600dpi and running the pages through unoptar, I got this: optar1_mj.mp3. It’s the same as the input file, except padded with zeros at the end.

Sometimes, the scanning and conversion doesn’t do so well:

mjoptar300dpi.mp3 â€” this is what happens when you scan at too low a resolution.
mjx.mp3 â€” I have no idea what went wrong here, but: glitchtastic!

My bank broke PDF … and how I used PDFBeads to fix it

I’m on a major decluttering toot. When I realised that the filing cabinet I bought three years ago would no longer close with all the papers stuffed in it, I knew something had to change. I’ve been shredding like it’s Houston in 2001. I have the duplex scanner to suck in the stuff I need to keep. I’m moving to paperless wherever possible to stop it building up again.

My bank provides PDF statements. Of this I approve. PDF is almost perfect for this: it provides an electronic version of the page, but with searchable text and the potential for some level of security. Except, this is not the way that my bank does it. At first glance, the text looks pretty harmless:

Zoom in, and it gets a bit blocky:

Zoom right in:

Aargh! Blockarama! Did they really store text as bitmaps? Sure enough, pdftotext output from the files contains no text. Running pdfimages produces hundreds of tiny images; here’s just a few:

Dear oh dear. This format is the worst of electronic, combined with paper’s lack of computer indexability. The producer claims to be Xenos D2eVision. Smooth work there, Xenos.

So, how can I fix this? It’s a bit of a pain to set this workflow up, but what I’ve done is:

Convert the PDF to individual TIFF files at 300 dpi. Ghostscript is good for this:
gs -SDEVICE=tiffg4 -r300x300 -sOutputFile=file%03d.tif -dNOPAUSE -dBATCH -- file.pdf
Run Tesseract OCR on the TIFF files to make hOCR output:
for f in file*tif do tesseract $f `basename $f` hocr done
Update: Cuneiform seems to work better than Tesseract when feeding pdfbeads:
for f in file*tif do cuneiform -f hocr -o `basename $f .tif`.html $f done
Move the images and the hOCR files to a new folder; the filenames must correspond, so file001.tif needs file001.html, file002.tif file002.html, etc.
In the new folder, run pdfbeads * > ../Output.pdf

The files are really small, and the text is recognized pretty well. It still looks pretty bad:

but at least the text can be copied and indexed.

This thread â€œConvert Scanned Images to a Single PDF Fileâ€ got me up and running with PDFBeads. You might also have success using the method described here: â€œHow to extract text with OCR from a PDF on Linux?â€ â€” it uses hocr2pdf to create single-page OCR’d PDFs, then joins them.

creating a TrueType font from your handwriting with your scanner, your printer, and FontForge

Hey, this post is super old!
That means that installation and run instructions may not work as well, or even at all. Most of the *Ports Apple software repositories have given way to Homebrew: you may have some success on Mac (untested by me) if you brew install netpbm fontforge potrace. There’s also some font cleanup I’d recommend, like resolving overlaps, adding extrema, and rounding points to integer. One day I may update this post, but for now, I’m leaving it as is.

This looks more than a bit like my handwriting

because it is my handwriting! Sure, the spacing of the punctuation needs major work, and I could have fiddled with the baseline alignment, but it’s legible, which is more than can usually be said of my own chicken-scratch.

This process is a little fiddly, but all the parts are free, and it uses free software. This all runs from the command line. I wrote and tested this on a Mac (with some packages installed from DarwinPorts), but it should run on Linux. It might need Cygwin under Windows; I don’t know.

Software you will need:

a working Perl interpreter
NetPBM, the free graphics converter toolkit
FontForge, the amazing free font editor. (Yes, I said amazing. I didn’t say easy to use …)
autotrace or potrace so that FontForge can convert the scanned bitmaps to vectors
some kind of bitmap editor.

You will need to download

fonttrace.pl – splits up a (very particular) bitmap grid into character cells
chargrid.pdf – the font grid template for printing

Procedure:

Print at least the first page of chargrid.pdf. The second page is guidelines that you can place under the page. This doesn’t work very well if you use thick paper.
Draw your characters in the boxes. Keep well within the lines; there’s nothing clever about how fonttrace.pl splits the page up.
Scan the page, making sure the page is as straight as possible and the scanner glass is spotless. You want to scan in greyscale or black and white.
Crop/rotate/skew the page so the very corners of the character grid table are at the edges of the image, like this: I find it helpful at this stage to clean off any specks/macules. I also scale and threshold the image so I get a very dark image at 300-600dpi.
Save the image as a Portable Bitmap (PBM). It has to be 1-bit black and white. You might want to put a new font in a new folder, as the next stage creates lots of files, and might overwrite your old work.
Run fonttrace.pl like this:
fonttrace.pl infile.pbm | sh
If you miss out the call to the shell, it will just print out the commands it would have run to create the character tiles.
This should result in a bunch of files called uniNNNN.png in the current folder, like these:
uni0057.png

uni0069.png

uni0073.png

uni0070.png

uni0079.png
Fire up FontForge. You’ll want to create a New font. Now Fileâ†’Import…, and use Image Template as the format. Point it at the first of the image tiles (uni0020.png), and Import.
Select Editâ†’Selectâ†’All, then Elementâ†’Autotrace. You’ll see your characters appear in the main window.
And that’s – almost – it. You’ll need to fiddle with (auto)spacing, set up some kerning tables, set the font name (in Elementâ†’Font Info … – and you’ll probably want to set the em scale to 1024, as TrueType fonts like powers of two), then Fileâ†’Generate Fonts. Fontforge will throw you a bunch of warnings and suggestions, and I’d recommend reading the help to find out what they mean.

There are a couple of limitations to the process:

Most of the above process could be written into a FontForge script to make things easier
Only ASCII characters are supported, to keep the number of scanned pages simple. Sorry. I’d really like to support more. You’re free to build on this.

Lastly, a couple of extra files:

CrapHand2.pbm – a sample array drawn by me, gzipped for your inconvenience (and no, I don’t know why WordPress is changing the file extension to ‘pbm_’ either).
chargrid.ods – the OpenOffice spreadsheet used to make chargrid.pdf

Have fun! Write nicely!

to scan film, or not

I’ve recently taken up film photography again. But processing is expensive.

To have 24 exposures processed and scanned at 6MP at Downtown Camera costs $12 + tax. That’s a pretty good price for black and white.

I can process at home (yay stinky toxic chemicals!) for a bit less. I’d need to buy a scanner, and the cheapest film scanners come in at around $300.

What to do, what to do?

Lady Goosepelt Rides Again!

Lady Goosepelt, from What a Life!

In case anyone wants them, the 600 dpi page images of What a Life! are stored in this PDF: what_a_life.pdf (16MB). If you merely wish to browse, all the images from the book are here.

I got a bit carried away with doing this. Instead of just smacking together all the 360 dpi TIFFs I scanned seven years ago, I had to scan a new set at a higher resolution, then crop them, then fix the page numbers, add chapter marks, and make the table of contents a set of live links.

I’ve got out of the way of thinking in PostScript, so I spent some time looking for tools that would do things graphically. Bah! These things’d cost a fortune, so armed only with netpbm, libtiff, ghostscript, the pdfmark reference, Aquamacs, awk to add content based on the DSC, and gimp to work out the link zones on the contents page, I made it all go. Even I’m impressed.

One thing that didn’t impress me, though:

aquamacs file size warning

I used to edit multi-gigabyte files with emacs on Suns. They never used to complain like this. They just loaded (admittedly fairly slowly) and let me do my thing. Real emacs don’t give warning messages.

All the printers I’ve ever owned …

bird you can see: hp print test

An ancient (even in 1985) Centronics serial dot-matrix printer that we never got working with the CPC464. The print head was driven along a rack, and when it hit the right margin, an idler gear was wedged in place, forcing the carriage to return. Crude, noisy but effective.
Amstrad DMP-2000. Plasticky but remarkably good 9-pin printer. Had an open-loop ribbon that we used to re-ink with thick oily endorsing ink until the ribbons wore through.
NEC Pinwriter P20. A potentially lovely 24-pin printer ruined by a design flaw. Print head pins would get caught in the ribbon, and snap off. It didn’t help that the dealer that sold it to me wouldn’t refund my money, and required gentle persuasion from a lawyer to do so.
Kodak-Diconix 300 inkjet printer. I got this to review for Amiga Computing, and the dealer never wanted it back. It used HP ThinkJet print gear which used tiny cartridges that sucked ink like no tomorrow; you could hear the droplets hit the page.
HP DeskJet 500. I got this for my MSc thesis. Approximately the shape of Torness nuclear power station (and only slightly smaller), last I heard it was still running.
Canon BJ 200. A little mono inkjet printer that ran to 360dpi, or 720 if you had all the time in the world and an unlimited ink budget.
Epson Stylus Colour. My first colour printer. It definitely couldn’t print photos very well.
HP LaserJet II. Big, heavy, slow, and crackling with ozone, this was retired from Glasgow University. Made the lights dim when it started to print. Came with a clone PostScript cartridge that turned it into the world’s second-slowest PS printer. We did all our Canadian visa paperwork on it.
Epson Stylus C80. This one could print photos tolerably well, but the cartridges dried out quickly, runing the quality and making it expensive to run.
Okidata OL-410e PS. The world’s slowest PostScript printer. Sold by someone on tortech who should’ve known better (and bought by someone who also should’ve known better), this printer jams on every sheet fed into it due to a damaged paper path. Unusually, it uses an LED imaging system instead of laser xerography, and has a weird open-hopper toner system that makes transporting a part-used print cartridge a hazard.
HP LaserJet 4M Plus. With its duplexer and extra paper tray it’s huge and heavy, but it still produces crisp pages after nearly 1,000,000 page impressions. I actually have two of these; one was bought for $99 refurbished, and the other (which doesn’t print nearly so well) was got on eBay for $45, including duplexer and 500-sheet tray. Combining the two (and judiciously adding a bunch of RAM) has given me a monster network printer which lets you know it’s running by dimming the lights from here to Etobicoke.
IBM Wheelwriter typewriter/ daisywheel printer. I’ve only ever produced a couple of pages on this, but this is the ultimate letter-quality printer. It also sounds like someone slowly machine-gunning the neighbourhood, so mostly lives under wraps.
HP PhotoSmart C5180. It’s a network photo printer/scanner that I bought yesterday. Really does print indistinguishably from photos, and prints direct from memory cards. When first installed, makes an amusing array of howls, boinks, squeals, beeps and sproings as it primes the print heads.

stones, as current vernacular would have it

Finatics' sign, by Big Al's

I’m no fan of billboards, but I have to congratulate Mike of Finatics for sheer gall when he put up this sign. See the plastic shark on the building behind? That’s Big Al’s, one of the biggest aquarium stores in Canada. Mike’s probably not going to get any favours from them any time soon.

Tag: scan