The not so great zero challenge
The Great Zero Challenge tries to dispel the myth that you can recover any data from a hard drive that has been intentionally erased by overwriting its contents once with a single stream of zeros.
It's a nice idea with one problem. If recovering the data is possible at all, it is probably hugely expensive, and it's highly unlikely that any company capable of doing it would take the challenge for a (recently increased) prize of $500. The amount itself would probably be irrelevant if a large media company backed the challenge and so gave some guarantee of positive media coverage. Still, I congratulate the organizers for betting their money in order to dispel untruths, as they put it.
Anyway, when I first heard of the challenge some time ago I noticed something interesting. The challenge says that you must identify the name of at least one of the files or folders on the disk. What if it were possible to win the challenge without even touching the disk? Here's an enlarged portion of one of the censored screenshots available on the challenge website:
Notice the dotted line? The censor has been a bit sloppy. That's one side of the selection box. So it looks like the folder was selected in the Explorer when the screenshot was made. This gives you a pretty good idea of the length of the folder name. More specifically, since the line is dotted, you know the width of the box is 62 or 63 pixels.
Windows uses a proportional font for file names (and by the looks of the screenshot they used the default Windows XP theme), so the known width further reduces the number of possible filenames. With a couple of trial screenshots I measured the width of every English letter, and a short C program then tried all possible combinations that included only letters.
The result? From a dictionary of English words, 2091 names matched. Far too many to be useful for guessing the correct name.
So knowing the length of a string rendered in proportional fonts isn't enough without some kind of a context. Is there any more information available in the screenshot to narrow the possibilities even further?
Take a look at what else hasn't been censored on the screenshot. There is one file with a .gz extension, one with a .tar extension and a directory. This suggests an unpacked distribution of some open source program (for example something like "linux-2.6.23.tar.gz", "linux-2.6.23.tar" and "linux-2.6.24"). The size of the .tar file supports that, since it's approximately twice the size of the .gz file, a typical compression ratio for ASCII text or source code. The modification date of the .tar file suggests this is a fairly old release from late 2006.
So, now all I need to find is an open source project that had a tar.gz release in November or December 2006, was 4.862 kB in size and had around 10 characters in the filename. Is there an easily searchable database of open source software out there that has this information? I haven't found one yet, but the Great Zero Challenge sure looks much less formidable when you look at it this way. And you don't even need to dust off that old electron microscope.
You could just recursively list the whole of kernel.org or ftp.arnes.si and check the timestamps and sizes in the listings.
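Here is a rough sketch of how such a listing could be filtered. It assumes the listing is in the usual `ls -l` long format with a four-digit year in the date column, and that the size quoted above means 4862 kB of 1024 bytes with a tolerance of 1 kB either way. Both are my assumptions, and the function names and sample lines are invented for the illustration.

```python
import re

# Matches regular-file lines of an `ls -l` style listing, for example:
#   -rw-r--r--   1 ftp ftp  4978000 Nov 29  2006 foo-1.2.3.tar.gz
# Entries that show a time instead of a year (recent files) are skipped.
LINE_RE = re.compile(
    r"^-\S+\s+\d+\s+\S+\s+\S+\s+(?P<size>\d+)\s+"
    r"(?P<month>[A-Z][a-z]{2})\s+\d{1,2}\s+(?P<year>\d{4})\s+"
    r"(?P<name>\S+)$"
)


def parse_line(line):
    """Return (name, size_in_bytes, month, year), or None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return m["name"], int(m["size"]), m["month"], int(m["year"])


def candidates(lines, size_kb=4862, tolerance_kb=1,
               months=("Nov", "Dec"), year=2006):
    """Pick .tar.gz files of about the right size from the right months."""
    low = (size_kb - tolerance_kb) * 1024
    high = (size_kb + tolerance_kb) * 1024
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        name, size, month, file_year = parsed
        if (name.endswith(".tar.gz") and low <= size <= high
                and month in months and file_year == year):
            found.append(name)
    return found


if __name__ == "__main__":
    # Invented sample lines, not taken from any real server.
    sample = [
        "-rw-r--r--   1 ftp ftp  4978000 Nov 29  2006 foo-1.2.3.tar.gz",
        "-rw-r--r--   1 ftp ftp  4978000 Jan 29  2007 bar-0.1.tar.gz",
        "drwxr-xr-x   2 ftp ftp     4096 Nov 29  2006 foo-1.2.3",
    ]
    print(candidates(sample))
```

The names that survive could then be checked against the measured width of the selection box to narrow things down further.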