Strange production errors
The following code caused a really strange error in production:
new MailAddress("test@gmail.com");
The specified string is not in the form required for an e-mail address.
Huh?!
Obviously it is!
After immediately leaping to the conclusion that .NET is crap and I should immediately start writing my own virtual machine, I decided to dig a little deeper:
| Character | Code (decimal) |
|---|---|
| t | 116 |
| e | 101 |
| s | 115 |
| t | 116 |
| @ | 64 |
| g | 103 |
| m | 109 |
| a | 97 |
| i | 105 |
| l | 108 |
| . | 46 |
| ? | 8203 |
| c | 99 |
| o | 111 |
| m | 109 |
8203 is the decimal value of U+200B, the zero-width space.
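A table like the one above can be produced with a few lines of code. This is only a minimal sketch: the `CharDump` class and its `Describe` method are names invented for illustration, not something from the original system.

```csharp
using System.Collections.Generic;

public static class CharDump
{
    // Returns one "U+XXXX (decimal)" entry per UTF-16 code unit in the input,
    // so that invisible characters show up as numbers.
    public static List<string> Describe(string s)
    {
        var result = new List<string>();
        foreach (char c in s)
        {
            result.Add($"U+{(int)c:X4} ({(int)c})");
        }
        return result;
    }
}
```

Calling `CharDump.Describe("test@gmail.\u200Bcom")` and printing each entry makes the extra character between the dot and `com` visible, even though the two strings look identical on screen.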
I guess that someone with a software testing background decided to get medieval on one of our systems.

Comments
Holy crap!
I just debugged the exact same issue on my client's system.
We were all similarly scratching our heads until I had to view the source.
My solution:
// Remove HTML characters
email = Regex.Replace(email, "&#[0-9]+;", "");
(A bit hacky)
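The regex above only helps when the character arrives as an HTML numeric entity such as `&#8203;`. If the raw character itself reaches the code, something along these lines might work instead. It is a sketch under assumptions: `EmailSanitizer`, `StripZeroWidth` and `Parse` are hypothetical names, and the set of characters to strip (U+200B to U+200D plus U+FEFF) is one reasonable choice, not an exhaustive list.

```csharp
using System.Net.Mail;
using System.Text.RegularExpressions;

public static class EmailSanitizer
{
    // Zero-width space, zero-width non-joiner, zero-width joiner,
    // and the byte order mark / zero-width no-break space.
    private static readonly Regex ZeroWidth = new Regex(@"[\u200B-\u200D\uFEFF]");

    // Removes the invisible characters listed above from the input.
    public static string StripZeroWidth(string input)
    {
        return ZeroWidth.Replace(input, "");
    }

    // Cleans the raw text first, then hands it to the MailAddress constructor.
    public static MailAddress Parse(string raw)
    {
        return new MailAddress(StripZeroWidth(raw).Trim());
    }
}
```

Stripping these characters is reasonable for e-mail addresses, but it would damage ordinary text in languages that use them legitimately, so it should stay limited to fields like this one.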
This usually happens when you copy-paste from Word. That guy isn't too sophisticated, he is just lazy...
We've just been dealing with something similar.
select id, catnum from table;
1 ABCD-1234
2 ABCD-1234
select id, '[' + catnum + ']' from table;
1 [ABCD-1234]
2 [ABCD-1234
(catnum is meant to be unique, too!)
Got some Unicode nonsense going on in there somewhere... I suspect a newline, but we still can't find it.
Another problem to watch out for when using the MailAddress constructor:
http://social.msdn.microsoft.com/forums/en-US/netfxnetcom/thread/2217c413-968f-4dcf-8035-45eaf2a3c609
I get this quite a lot in our databases. The source is usually legacy processes that rely on Excel spreadsheets/VBA for data loading (yuck).
So when is RavenVM coming out?
A closely related character, the Zero-Width Non-Joiner (ZWNJ, U+200C), is quite valid and common in some languages, such as Persian. It sits between the parts of a single word that you don't want separated when word-wrapping happens. E.g., the following word contains a ZWNJ: می‌روم
Since such characters are very common in some languages, it can easily happen that somebody switches the keyboard language accidentally and enters one without meaning to.