Sunday, October 26, 2008

Best OS for low-end laptops: take 3

Just a quick update: less than a month after my two posts on the best OS for low-end machines, somebody posted an Ask Slashdot on the same question.  A quick glance through the comments suggests that Linux is the most popular option, which isn't surprising given the audience.

The real value of the Ask Slashdot article, however, is that people are suggesting actual distros.  So far Xubuntu hasn't gotten many mentions, which suggests that perhaps I didn't select the very best option for my own head-to-head tests.  If you do decide to install Linux on your low-end laptop, you should check it out:

Friday, October 24, 2008

Converting Windows filenames to Unicode

Let's say you used FindFirstFile, etc. to get a list of file names. What character set will those names be in? The non-Unicode version of the call returns them in the system's ANSI code page, which is Windows code page 1252 on US and Western European systems.  Now, let's say you want to convert those 8-bit chars into Unicode. One easy way that almost works is:

wchar_t tmp[256]; // potential buffer overflow exploit here
swprintf(tmp, L"%S", name); // %S (note caps) converts narrow (8-bit) chars to Unicode

This will fail, sooner or later, because this conversion only seems to work for ASCII characters (i.e. char values less than 128).  WinLatin (1252) has plenty of characters which are not ASCII. You might think you'll never encounter them, but one way they can creep in is if you use MS Word's smart quotes and then paste text from Word into a filename.  Filenames written by people outside of the US probably tend to have these characters too, since accented letters are not ASCII either.

I don't know what would happen with swprintf if you had some other code page, but I'm guessing it also would be bad.

Here's the proper way to do it. 

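wchar_t tmp[1024];
// The last argument is the size of tmp in wide chars; a return value of 0 means the conversion failed
if (MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, name, -1, tmp, 1024) == 0)
    tmp[0] = L'\0'; // e.g. the name did not fit in tmp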

Not only does this work for 'fancy' characters, but it also prevents the buffer overflow bug.
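
To make it concrete what "converting 1252 to Unicode" means for those 'fancy' characters, here is a small portable sketch in plain C. It is only an illustration, not what Windows does internally: cp1252_to_wide and the cp1252_high table are hypothetical names made up for this example. The point is that bytes 0x80 through 0x9F (where the smart quotes live) map to code points far away from their byte values, so simply widening each byte can't work. Everything else in 1252 matches Unicode one for one.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Unicode code points for CP1252 bytes 0x80..0x9F. The five bytes that the
   code page leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) are passed
   through as the matching C1 control codes; that is an assumption made
   for this sketch. */
static const wchar_t cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

/* Hypothetical helper: converts a NUL-terminated CP1252 string to wide chars.
   dst_len is the size of dst in wide chars, terminator included.
   Returns the number of chars written (terminator excluded), or (size_t)-1
   if dst is too small. It never writes past dst_len. */
size_t cp1252_to_wide(const char *src, wchar_t *dst, size_t dst_len)
{
    size_t i;

    assert(src != NULL && dst != NULL);

    for (i = 0; src[i] != '\0'; i++) {
        unsigned char b = (unsigned char)src[i];

        if (i + 1 >= dst_len)       /* keep room for the terminator */
            return (size_t)-1;

        if (b >= 0x80 && b <= 0x9F)
            dst[i] = cp1252_high[b - 0x80];  /* the only range that differs */
        else
            dst[i] = (wchar_t)b;             /* ASCII and 0xA0..0xFF map 1:1 */
    }

    if (i >= dst_len)               /* covers the dst_len == 0 case */
        return (size_t)-1;

    dst[i] = L'\0';
    return i;
}
```

For example, the 1252 string "\x93hi\x94" (the word hi wrapped in Word-style smart quotes) comes out as U+201C, 'h', 'i', U+201D. In real Windows code you'd still use MultiByteToWideChar, since it handles whatever code page the system is actually using rather than assuming 1252.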