What is string normalization, how do ordinal string comparisons work, and why should I care?

The following code will always return -1.

"Example\\̼".IndexOf("Example\\");

This is because U+033C (Combining Seagull Below) is a combining character: it attaches to the character before it. IndexOf(string) performs a culture-sensitive search by default, and that search treats the backslash plus the combining mark as a single unit that no longer matches a plain backslash. An ordinal comparison, which compares the strings code unit by code unit, fixes this problem. As an example, the following code will always return 0:

"Example\\̼".IndexOf("Example\\", StringComparison.OrdinalIgnoreCase);

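To see why the two kinds of search can disagree, here is a minimal sketch that counts the UTF-16 code units and the text elements of the string above, then runs both searches. The class name and variable names are made up for illustration, and the combining character is written as the escape \u033C so it stays visible. The culture-sensitive result is left without an expected value because it depends on the current culture and on the globalization library the runtime uses.

```csharp
using System;
using System.Globalization;

class OrdinalSearchSketch
{
    static void Main()
    {
        // "Example\" followed by U+033C (Combining Seagull Below).
        string text = "Example\\\u033C";
        string prefix = "Example\\";

        // 9 UTF-16 code units, but only 8 text elements: the backslash and
        // the combining mark after it count as one text element.
        Console.WriteLine(text.Length);                               // 9
        Console.WriteLine(new StringInfo(text).LengthInTextElements); // 8

        // Culture-sensitive search, which is what IndexOf(string) does by default.
        // The result depends on the culture and the runtime, so none is asserted here.
        Console.WriteLine(text.IndexOf(prefix, StringComparison.CurrentCulture));

        // Ordinal search compares UTF-16 code units one by one.
        Console.WriteLine(text.IndexOf(prefix, StringComparison.Ordinal)); // 0
    }
}
```

Passing the StringComparison value explicitly on every call is probably the safest habit, since the intent is then visible at the call site.
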
The following code will also always return -1.

"µ".IndexOf("μ");

This is because, although µ and μ look identical, they are actually two different characters in the Unicode character set: one is the Micro Sign character, U+00B5, and the other is the Greek Small Letter Mu character, U+03BC. Using an ordinal string comparison will not fix this, because the two code points differ, but normalization will: compatibility normalization (Form KC or Form KD) maps the Micro Sign to the Greek Small Letter Mu. The following code will always return 0:

"µ".Normalize(NormalizationForm.FormKD).IndexOf("μ".Normalize(NormalizationForm.FormKD));

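As a rough illustration of what normalization does here, the sketch below compares the two characters before and after normalizing. The class and variable names are made up for illustration, and the characters are written as escapes so they can be told apart. It also shows that a canonical form such as Form C leaves the Micro Sign alone; only the compatibility forms map it to the Greek letter.

```csharp
using System;
using System.Text;

class NormalizationSketch
{
    static void Main()
    {
        string micro = "\u00B5"; // Micro Sign
        string mu = "\u03BC";    // Greek Small Letter Mu

        // Different code points, so an ordinal comparison sees two different strings.
        Console.WriteLine(string.Equals(micro, mu, StringComparison.Ordinal)); // False

        // Canonical normalization (Form C) leaves the Micro Sign unchanged.
        Console.WriteLine(micro.Normalize(NormalizationForm.FormC) == micro);  // True

        // Compatibility normalization (Form KD) maps it to the Greek Small Letter Mu.
        Console.WriteLine(micro.Normalize(NormalizationForm.FormKD) == mu);    // True

        // Once both sides are normalized the same way, an ordinal search finds the match.
        string left = micro.Normalize(NormalizationForm.FormKD);
        string right = mu.Normalize(NormalizationForm.FormKD);
        Console.WriteLine(left.IndexOf(right, StringComparison.Ordinal));      // 0
    }
}
```

Note that compatibility normalization is lossy: it deliberately erases distinctions such as the one between these two characters, so it is better suited to searching and comparing than to storing the user's original text.
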
Alexandru
