{"id":17455,"date":"2023-11-08T21:37:49","date_gmt":"2023-11-09T02:37:49","guid":{"rendered":"https:\/\/scruss.com\/blog\/?p=17455"},"modified":"2023-11-10T20:15:10","modified_gmt":"2023-11-11T01:15:10","slug":"can-we","status":"publish","type":"post","link":"https:\/\/scruss.com\/blog\/2023\/11\/08\/can-we\/","title":{"rendered":"can we\u2026?"},"content":{"rendered":"\n<p>This is a mini celebratory post to say that I&#8217;ve fixed the database encoding problems on this blog. It looks like I will have to go through the posts manually to correct the errors still, but at least I can enter, store and display UTF-8 characters as expected.<\/p>\n\n\n\n<p>\u201c? \u00b5 \u00b0 \u00d7 \u2014 \u2013 \u00bd \u00be \u00a3 \u00e9?\u00ea\u00e8\u201d, he said with some relief.<\/p>\n\n\n\n<p><strong>Postmortem<\/strong>: For reasons I cannot explain or remember, the database on this blog flipped to an archaic character set: latin1, aka <a href=\"https:\/\/en.wikipedia.org\/wiki\/ISO\/IEC_8859-1\">ISO\/IEC 8859-1<\/a>. A partial fix was effected by downloading the entire site&#8217;s database backup, and changing all the following references in the SQL:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>CHARSET=latin1 \u2192 CHARSET=utf8mb4<\/li>\n\n\n\n<li>COLLATE=latin1_german2_ci \u2192 COLLATE=utf8mb4_general_ci<\/li>\n\n\n\n<li>COLLATE utf8mb4_general_ci \u2192 COLLATE utf8mb4_general_ci<\/li>\n\n\n\n<li>latin1_general_ci \u2192 utf8mb4_general_ci<\/li>\n\n\n\n<li>COLLATE latin1_german2_ci \u2192 COLLATE utf8mb4_general_ci<\/li>\n\n\n\n<li>CHARACTER SET latin1 \u2192 CHARACTER SET utf8mb4<\/li>\n<\/ul>\n\n\n\n<p>For additional annoyance, the entire SQL dump was too big to load back into phpmyadmin, so I had to split it by table. Thank goodness for awk!<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n#!\/usr\/bin\/awk -f\n\nBEGIN {\n    outfile = &quot;nothing.sql&quot;;\n}\n\n\/^# Table: \/ {\n    # very special comment in WP backup that introduces a new table\n    # last field is table_name,\n    # which we use to create table_name.sql\n    t = $NF\n    gsub(\/`\/, &quot;&quot;, t);\n    outfile = t &quot;.sql&quot;;\n}\n\n{\n    print &gt; outfile;\n}\n<\/pre><\/div>\n\n\n<p>The data still appears to be confused. For example, in the post <a href=\"https:\/\/scruss.com\/blog\/2013\/05\/11\/compose-yourself-raspberry-pi\/\">Compose yourself, Raspberry Pi!<\/a>, what should appear as \u201c<em>That little key marked \u201cCompose\u201d<\/em>\u201d appears as \u201c<em>That little key marked \u00e2\u20ac\u0153Compose\u00e2\u20ac\u009d\u00c2<\/em>\u201d. This isn&#8217;t a straight conversion of one character set to another. It appears to have been double-encoded, and wrongly too.<\/p>\n\n\n\n<p>Still, at least I can now write again and have whatever new things I make turn up the way I like. Editing 20 years of blog posts awaits \u2026 <em>zzz<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is a mini celebratory post to say that I&#8217;ve fixed the database encoding problems on this blog. It looks like I will have to go through the posts manually to correct the errors still, but at least I can enter, store and display UTF-8 characters as expected. \u201c? \u00b5 \u00b0 \u00d7 \u2014 \u2013 \u00bd [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[7,2],"tags":[],"class_list":["post-17455","post","type-post","status-publish","format-standard","hentry","category-computers-suck","category-goatee-stroking-musing-or-something"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pQNZZ-4xx","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts\/17455","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/comments?post=17455"}],"version-history":[{"count":5,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts\/17455\/revisions"}],"predecessor-version":[{"id":17463,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts\/17455\/revisions\/17463"}],"wp:attachment":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/media?parent=17455"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/categories?post=17455"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/tags?post=17455"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}