[aklug] Re: Unicode / UTF-8

From: Royce Williams <royce@tycho.org>
Date: Sat Mar 19 2011 - 17:32:00 AKDT

Christopher Howard said, on 03/19/2011 04:12 PM:
> I've been forced to learn a lot about unicode / UTF-8 lately (from a
> botched MySQL migration). I'm glad I did though, because I learned a lot
> of cool stuff. Maybe it's just my amateurish perspective, but it seems to
> me that UTF-8 was a very ingenious way to make the ASCII to Unicode
> transition possible for ASCII-based systems.

Amen, brother! I've got those scars, too.

This post from Derek Sivers, the guy who started CD Baby, was illuminating.

    http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html

Quoting it (the caps are in the original, not mine):

LESSON LEARNED: KEEP EVERYTHING IN UTF-8, ABSOLUTELY EVERYWHERE, FROM DAY ONE.

If anyone else out there has a database that has any chance of needing i18n *ever*, it's muuuuch easier to start out that way (or to convert as soon as possible). Your future self will thank you. And definitely start out any new projects that way.

Your HTTP headers, MySQL databases, tables, fields, CGI code (including regex routines), local console, browser, editor and email client all need to be on board to make development against UTF-8 reasonable - but once you do it, the whole world of international characters opens up. Not only is it just generally pretty neat, but unusual characters will never hold you back from doing something cool and new with your app.
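
To show why every layer has to agree, here is a minimal Python sketch of what happens when one link in the chain reads UTF-8 bytes as Latin-1. The sample string and variable names are made up for illustration, and Python's "latin-1" codec is only a stand-in for whatever a misconfigured layer assumes (MySQL's latin1 is not exactly the same thing).

```python
# Hypothetical sample value; any non-ASCII text behaves the same way.
original = "Bj\u00f6rk"  # "Björk"

# The application sends proper UTF-8 bytes down the wire.
utf8_bytes = original.encode("utf-8")  # b'Bj\xc3\xb6rk'

# A layer that wrongly assumes Latin-1 turns each byte into its own
# character, so the two-byte "ö" becomes two junk characters.
misread = utf8_bytes.decode("latin-1")

# If that misread text is then stored as UTF-8, it is double-encoded.
double_encoded = misread.encode("utf-8")

# Undoing the damage means reversing the wrong step explicitly.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")

print(repr(misread))
print(repaired == original)
```

The repair only works here because the sketch knows exactly which wrong step happened. In a real database you often have to guess, which is the argument for UTF-8 everywhere from day one.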

And agreed that UTF-8 was a clever way to do it. It's backwards-compatible with ASCII: text that contains only ASCII characters is byte-for-byte identical when encoded as UTF-8, so the two only diverge once non-ASCII characters show up.
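
A quick way to see that compatibility, as a small Python sketch (the sample strings are arbitrary):

```python
# Pure ASCII text produces the same bytes under both encodings.
text = "plain ASCII text"
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# A non-ASCII character becomes a multi-byte sequence in UTF-8, and
# every byte of that sequence has the high bit set (value >= 0x80),
# so it can never be mistaken for an ASCII character.
accented = "caf\u00e9"  # "café"
accented_bytes = accented.encode("utf-8")
high_bit_set = all(b >= 0x80 for b in accented_bytes[3:])

print(same_bytes, accented_bytes, high_bit_set)
```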

Royce
---------
Received on Sat Mar 19 17:32:04 2011
