2012-11-01
    UTF-8 everything
      snarklet
      
    Popular wisdom among developers does change over time.  But on character encoding, that wisdom has become consistent: use UTF-8.  Use UTF-8 everywhere.  In your Strings, in your database tables, in your data store, in your messages, in your pages, in your scripts.  "UTF-8 all the things."

    It raises the question: why is anything other than UTF-8 the default character encoding for any system?  Every system that contends with character encoding released since (oh, I'll be generous and keep it recent) 2010 should use UTF-8 out of the box, fresh from the git hub or bit locker.

    It angers me, frankly, that as a user of an API, I have to be the one to realize, oh lordy, there's character transcoding going on here, and for some insane reason this thing isn't using UTF-8.  Okay then, how do I tell it to snap out of its fit of insanity?

    Stop it.  I don't want ISO 8859.  I don't want Windows-1252.  I doubt anyone really does; they just don't know they need to tell you they don't.
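
    For illustration, here is a minimal Python sketch of what "telling it" tends to look like, and of what goes wrong when the wrong charset sneaks in.  The sample string and the helper names save_utf8 and load_utf8 are my own inventions for this example, not part of any particular API.

```python
# Hypothetical sample text; any string with a non-ASCII character would do.
text = "café"

# Encode explicitly as UTF-8 rather than relying on a platform default.
utf8_bytes = text.encode("utf-8")

# Decoding those bytes as Windows-1252 raises no error; it just
# quietly produces the wrong characters.
garbled = utf8_bytes.decode("cp1252")

# Naming UTF-8 on the way back in round-trips the text cleanly.
restored = utf8_bytes.decode("utf-8")


def save_utf8(path, content):
    """Write text to a file, naming the encoding instead of trusting the default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)


def load_utf8(path):
    """Read text from a file, again naming the encoding explicitly."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
```

    The point is that the wrong-charset decode fails silently, which is exactly why the caller should never have to remember to ask for UTF-8 in the first place.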
      