C constants
Today I spent a couple of hours hunting down a bug in some of Zemanta's C++ code. I won't bore you with details, but I did learn something interesting about the way GCC works.
In essence, the bug was due to a broken implementation of string comparison, something like this:
int cmp_strings(const char *x, const char *y)
{
    return x == y;
}
Now, if you know some basics of C you should see right away that this won't compare strings at all. Instead, it compares the memory addresses at which the strings are stored. But that's not the interesting part.
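For reference, here is a minimal sketch of what a content comparison could look like, assuming the function is meant to return nonzero for equal strings. The name cmp_strings_fixed is made up for this example and is not from the original code; strcmp is the standard library function from <string.h>.

```c
#include <string.h>

/* Hypothetical fixed variant: returns 1 when x and y contain the same
   characters and 0 otherwise. strcmp returns 0 for equal strings, so
   the result is compared against 0. */
int cmp_strings_fixed(const char *x, const char *y)
{
    return strcmp(x, y) == 0;
}
```

Unlike the pointer comparison, this gives the same answer no matter where the two strings happen to be stored.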
The interesting part is that a function like this had a couple of unit tests and they all passed with flying colors, while in real usage it broke horribly (as it should).
How was this possible? Consider the following program using the function above:
#include <stdio.h>

int main()
{
    printf("%d\n", cmp_strings("en", "en"));
    printf("%d\n", cmp_strings("sl", "en"));
    return 0;
}
Can you say what the output looks like without compiling it first?
Interestingly, at first glance the output of this program supports the (wrong) theory that cmp_strings does in fact compare string content: it prints 1 for the first call and 0 for the second (and that's why the unit tests I mentioned passed).
What is really happening is that GCC optimizes the program's memory usage by merging identical string constants. There is no use in storing the same constant "en" string three times in three different locations when one copy will do just fine (they are constant, after all). So cmp_strings appears to work for string constants, but not for strings built at run time.
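To see the failure without any real-world data, it is enough to compare a string constant with a copy of it that lives somewhere else in memory. The sketch below repeats the broken function and adds a hypothetical helper, compare_runtime_copy (my name, not from the original code), that builds "en" in a local buffer first.

```c
#include <string.h>

/* The broken comparison from above: compares addresses, not content. */
int cmp_strings(const char *x, const char *y)
{
    return x == y;
}

/* Hypothetical helper: copies "en" into a local array at run time and
   compares that copy with the constant "en". The array is a separate
   object, so its address differs from the constant's address. */
int compare_runtime_copy(void)
{
    char lang[3];

    strcpy(lang, "en");            /* same content, different location */
    return cmp_strings(lang, "en");
}
```

Calling compare_runtime_copy() returns 0 even though both strings read "en", which is the kind of case a unit test for this function needs to include.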
Oh, and -fno-merge-constants doesn't help with this, since it only affects merging of identical constants across multiple compilation units (on GCC 4.2.3 at least). In fact, I see no way of disabling this optimization so that I could quickly check whether any other code is broken in the same way.


