{"id":73,"date":"2006-12-03T12:44:12","date_gmt":"2006-12-03T12:44:12","guid":{"rendered":"http:\/\/dalelane.co.uk\/blog\/?p=73"},"modified":"2006-12-03T12:44:12","modified_gmt":"2006-12-03T12:44:12","slug":"supporting-different-languages","status":"publish","type":"post","link":"https:\/\/dalelane.co.uk\/blog\/?p=73","title":{"rendered":"Supporting different languages"},"content":{"rendered":"<p>I got an email on Friday from a German guy called Bernard. He uses my <a href=\"http:\/\/dalelane.co.uk\/blog\/?p=47\" title=\"blog post from when I wrote it\" target=\"_blank\">wiki note-taking app<\/a> that I wrote to play with the Windows Mobile SDK (that in itself was a surprise!). <\/p>\n<p>He asked if I&#8217;d add support for accented characters to it, as he (unsurprisingly, being German!) wanted to use German characters in his notes. That was an easy enough fix &#8211; just add a lookup table to the wiki markup parser which replaces characters with their <a href=\"http:\/\/www.w3.org\/MarkUp\/html-spec\/html-spec_13.html\" title=\"w3.org\" target=\"_blank\">HTML code equivalent<\/a>.<\/p>\n<p>Hurrah &#8211; I could feel suitably smug for making it a little less English-centric. <\/p>\n<p>A guy called Alex brought me back down to earth on Saturday morning with an email pointing out that when he uses my wiki note-taking app (wow &#8211; how many people are using this??), it displays the wrong Chinese characters in &#8216;View&#8217; mode to the ones he enters in &#8216;Edit&#8217; mode. Chinese? Eeek&#8230; this isn&#8217;t something I knew about. <\/p>\n<p><!--more-->A bit of research (with Alex&#8217;s help) showed that the answer was actually pretty simple. Joel Spolsky has <a href=\"http:\/\/www.joelonsoftware.com\/articles\/Unicode.html\" title=\"Joel on Software\" target=\"_blank\">a brilliant introduction to the topic<\/a> which helped me get my head around it. <\/p>\n<p>Slightly concerningly&#8230;<\/p>\n<blockquote><p>I have an announcement to make: if you are a programmer working in 2003 and you don&#8217;t know the basics of characters, character sets, encodings, and Unicode, and I catch you, I&#8217;m going to punish you by making you peel onions for 6 months in a submarine. I swear I will.<\/p><\/blockquote>\n<p>A little scary. Still, he lives in the US, so I&#8217;m probably safe. <\/p>\n<p>The aim of the article is to:<\/p>\n<blockquote><p>fill you in on exactly what every working programmer should know. All that stuff about &#8220;plain text == ascii == characters are 8 bits&#8221; is not only wrong, it&#8217;s hopelessly wrong, and if you&#8217;re still programming that way, you&#8217;re not much better than a medical doctor who doesn&#8217;t believe in germs. Please do not write another line of code until you finish reading this article.<\/p><\/blockquote>\n<p>A little harsh? Still, any smugness I might have felt after fixing Bernard&#8217;s problem was now long gone \ud83d\ude42<\/p>\n<p>To get back to the problem Alex was having &#8211; the wrong characters were being displayed in the WebBrowser component (an embedded Internet Explorer control) used in the wiki&#8217;s &#8216;View&#8217; mode. To quote from Joel&#8217;s article some more:<\/p>\n<blockquote><p>Almost every stupid &#8220;my website looks like gibberish&#8221; or &#8220;she can&#8217;t read my emails when I use accents&#8221; problem comes down to one naive programmer who didn&#8217;t understand the simple fact that if you don&#8217;t tell me whether a particular string is encoded using UTF-8 or ASCII or ISO 8859-1 (Latin 1) or Windows 1252 (Western European), you simply cannot display it correctly or even figure out where it ends.<\/p><\/blockquote>\n<p>This was the problem &#8211; the HTML that my parser was spitting out didn&#8217;t identify an encoding. The answer seems to be easy. I can either:<\/p>\n<ol>\n<li>Tell users to right-click on the WebBrowser control, and use the &#8216;Encoding&#8217; menu to choose something that makes it look right.  This should work, but it&#8217;s a little icky. Or:<\/li>\n<li>Add something to the header of the HTML file that the Wiki markup parser produces which identifies UTF-8 as the encoding:\n<pre><code>&lt;html&gt;\r\n&lt;head&gt;\r\n&lt;meta http-equiv=\"Content-Type\" content=\"text\/html; charset=utf-8\"&gt;<\/code><\/pre>\n<\/li>\n<\/ol>\n<p>It seems that <a href=\"http:\/\/www.microsoft.com\/globaldev\/getwr\/nwr\/nwrpartVI.mspx\" title=\"microsoft advice on globalization\" target=\"_blank\">UTF-8 is a good choice<\/a> for any web-app which uses non-European characters. <\/p>\n<p>This has been interesting. Globalization\/Localization wasn&#8217;t at the front of my mind when hacking together an app for my own personal use, so I think I&#8217;ve got a reasonable excuse for not looking into this earlier. But I&#8217;m glad I can do a few simple things to make it a little better.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I got an email on Friday from a German guy called Bernard. He uses my wiki note-taking app that I wrote to play with the Windows Mobile SDK (that in itself was a surprise!). He asked if I&#8217;d add support for accented characters to it, as he (unsurprisingly, being German!) wanted to use German characters [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-73","post","type-post","status-publish","format-standard","hentry","category-code"],"_links":{"self":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/73","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=73"}],"version-history":[{"count":0,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/73\/revisions"}],"wp:attachment":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=73"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=73"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=73"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}