Canonicalizing URLs to Lowercase
I want to write an HTTP module that converts URLs to lowercase. My first attempt ignored international character sets and worked great:
// Convert URL virtual path to lowercase
string lowercase = context.Request.FilePath.ToLowerInvariant();
// If anything changed then issue 301 Permanent Redirect
if (!lowercase.Equals(context.Request.FilePath, StringComparison.Ordinal))
{
context.Response.RedirectPermanent(...lowercase URL...);
}
The Turkey Test (international cultures):
But what about cultures other than en-US? I referred to the Turkey Test to come up with a test URL:
http://example.com/Iıİi
This little insidious gem destroys any notion that case conversion in URLs is simple! Its lowercase and uppercase versions, respectively, are:
http://example.com/ııii
http://example.com/IIİİ
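The culture sensitivity can be seen directly in plain .NET, outside of ASP.NET entirely. A minimal console sketch of the four I's:

```csharp
// Minimal console sketch of the Turkey Test casing rules in .NET.
using System;
using System.Globalization;

class TurkeyTest
{
    static void Main()
    {
        CultureInfo tr = CultureInfo.GetCultureInfo("tr-TR");

        Console.WriteLine("I".ToLower(tr));         // ı  (dotless lowercase i)
        Console.WriteLine("I".ToLowerInvariant());  // i  (invariant ignores Turkish)
        Console.WriteLine("i".ToUpper(tr));         // İ  (dotted uppercase I)
        Console.WriteLine("i".ToUpperInvariant());  // I
    }
}
```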
For case conversion to work with Turkish URLs, I first had to set the current culture of ASP.NET to Turkish:
<system.web>
<globalization culture="tr-TR" />
</system.web>
Next, I had to change my code to use the current culture for the case conversion:
// Convert URL virtual path to lowercase
string lowercase = context.Request.FilePath.ToLower(CultureInfo.CurrentCulture);
// If anything changed then issue 301 Permanent Redirect
if (!lowercase.Equals(context.Request.FilePath, StringComparison.Ordinal))
{
context.Response.RedirectPermanent(...);
}
But wait! Will StringComparison.Ordinal still work? Or should I use StringComparison.CurrentCulture? I'm really not certain of either!
File names: It gets MUCH WORSE!
Even if the above works, using the current culture for case conversions breaks matching against the NTFS file system! Let's say I have a static file named Iıİi.html:
http://example.com/Iıİi.html
Even though the Windows file system is case-insensitive, it does not use language culture. Converting the above URL to lowercase results in a 404 Not Found because the file system doesn't consider the two names equal:
http://example.com/ııii.html
The correct case conversion for file names? WHO KNOWS?!
The MSDN article, Best Practices for Using Strings in the .NET Framework, has a note (about halfway through the article):
Note:
The string behavior of the file system, registry keys and values, and environment variables is best represented by StringComparison.OrdinalIgnoreCase.
Huh? Best represented??? Is that the best we can do in C#? So just what is the correct case conversion to match the file system? Who knows?!!? About all we can say is that string comparisons using the above will probably work MOST of the time.
Summary: Two case conversions: Static/Dynamic URLs
So we've seen that static URLs---URLs having a file path that matches a real directory/file in the file system---must use an unknown case conversion that is only "best represented" by StringComparison.OrdinalIgnoreCase. And please note there is no string.ToLowerOrdinal() method so it's very difficult to know exactly what case conversion equates to the OrdinalIgnoreCase string comparison. Using string.ToLowerInvariant() is probably the best bet, yet it breaks language culture.
On the other hand, dynamic URLs---URLs with a file path that does not match a real file on the disk (that map to your application)---can use string.ToLower(CultureInfo.CurrentCulture), but it breaks file system matching and it is somewhat unclear what edge cases exist that may break this strategy.
Thus, it appears case conversion first requires detection as to whether a URL is static or dynamic before choosing one of two conversion methods. For static URLs there is uncertainty how to change case without breaking the Windows file system. For dynamic URLs it is questionable if case conversion using culture will similarly break the URL.
Whew! Anyone have a solution to this mess? Or should I just close my eyes and pretend everything is ASCII?
I would challenge the premise here that there is any utility whatsoever in attempting to auto-convert URLs to lowercase.
Whether a full URL is case-sensitive or not depends entirely on the web server, web application framework, and underlying file system.
You're only guaranteed case-insensitivity in the scheme (http://, etc.) and hostname portions of the URL. And remember that not all URL schemes (file and news, for example) even include a hostname.
Everything else can be case-sensitive to the server, including paths (/), filenames, queries (?), fragments (#), and authority info (usernames/passwords before the @ in mailto, http, ftp, and some other schemes).
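This split is visible in .NET's System.Uri, which normalizes only the parts that are guaranteed case-insensitive (a small sketch):

```csharp
// System.Uri lowercases scheme and host, but preserves path/query case.
using System;

class UriCaseDemo
{
    static void Main()
    {
        var u = new Uri("HTTP://EXAMPLE.COM/Some/Path?Q=1");

        Console.WriteLine(u.Scheme);        // http        (normalized to lowercase)
        Console.WriteLine(u.Host);          // example.com (normalized to lowercase)
        Console.WriteLine(u.AbsolutePath);  // /Some/Path  (case preserved)
        Console.WriteLine(u.Query);         // ?Q=1        (case preserved)
    }
}
```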
You have some incompatible goals.
Have a culture-sensitive case-lowering. If Turkish seems bad, you don't want to know about some of the Georgian scripts, never mind that ß is upper-cased either to SS or, less commonly, to SZ. In either case, to have a full case-folding where lower("ß") matches lower(upper("ß")), you need to consider it equivalent to at least one of those two-character sequences. Generally we aim for case-folding rather than case-lowering if possible (not possible here).
Use this in a non-culture-sensitive context. URIs are ultimately opaque strings. That they may have a human-readable meaning is useful for coders, users, search engines and marketers alike, but their ultimate job is to identify a resource by a direct case-sensitive comparison.
Map this to NTFS, which has a case-preserving case-insensitivity based on the mappings in the $UpCase file, which it applies by comparing the upper-cased forms of names, in a culture-insensitive manner (at least it doesn't have to decide whether Σ lower-cases to σ or ς).
Presumably do well in terms of SEO and human readability. This may well be part of your original goal, but whileThisIsNotVeryEasyToReadOrParse itseasierforbothpeopleandmachinesthanthis. Case-folding loses information.
I suggest a different approach.
Start with your starting string, whatever that is and wherever it came from (NTFS filename, database entry, HttpHandler binding in web.config). Have that as your canonical form. By all means have rules that people should create these strings according to some canonical form, and perhaps enforce it where you can, but if something slips by that breaks your rules, then accept it as the official canonical name for that resource no matter how much you dislike it.
As much as possible the canonical name should be the only one "seen" by the outside world. This can be enforced programmatically or just a matter of it being best practice, as canonicalising after the fact with 301s won't solve the fact that outside entities don't know you do so until they dereference the URI.
When a request is received, test it according to how it is going to be used. Hence while you may choose to use a particular culture (or not) for those cases where you perform the resource-lookup yourself, with so-called "static" URIs, your logic can deliberately follow that of NTFS by simply using NTFS to do the work:
Find mapped file ignoring the matter of case sensitivity for now.
If non-match then 404, who cares about case?
If find, do case-sensitive ordinal comparison, if it doesn't match then 301 to the case-sensitive mapping.
Otherwise, proceed as usual.
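The steps above can be sketched roughly as an HTTP module fragment. GetExactVirtualPath is a hypothetical helper (it could be built on DirectoryInfo.GetFiles against the parent directory to read back the casing NTFS actually stored); this is a sketch of the strategy, not the answerer's literal code:

```csharp
// Sketch only: let NTFS resolve the name, then compare ordinally.
string requested = context.Request.FilePath;
string physical = context.Server.MapPath(requested);

if (!File.Exists(physical))
{
    // Steps 1-2: no match at all, so it's a 404; case is irrelevant.
    context.Response.StatusCode = 404;
}
else
{
    // Step 3: recover the virtual path with the casing stored on disk
    // (GetExactVirtualPath is a hypothetical helper).
    string canonical = GetExactVirtualPath(requested, physical);

    if (!string.Equals(requested, canonical, StringComparison.Ordinal))
    {
        context.Response.RedirectPermanent(canonical);  // 301 to canonical casing
    }
    // Step 4: otherwise, proceed as usual.
}
```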
Edit:
In some ways the question of domain names is more complicated. The rules for IDN have to cover more issues with less room for manœuvre. However, it's also simpler, at least as far as case-canonicalising goes.
(I'm going to ignore canonicalising of whether or not www. is used etc. though I'd guess it's part of the same job here, it's pushing the scope and we could end up writing a book between us if we don't stop somewhere :)
IDNs have their own case canonicalisation (and some other forms of normalisation) rules defined in RFC 3491. If you're going to canonicalise domain names on case, follow that.
Makes it nice and simple to answer, doesn't it? :)
There's also less pressure in a way, for while search engines have to recognise that http://example.net/thisisapath and http://example.net/thisIsAPath may be the same resource, they also have to recognise that they might be different, and that's where all of the SEO advantage of canonicalising on one of them (doesn't matter which) comes from.
However, they know that example.net and EXAMPLE.NET can't possibly be different sites, so there's little SEO advantage in making sure they're the same (still nice for things like caches and history lists that don't make that jump themselves). Of course, the issue remains with the fact that www.example.net or even maAndPasExampleEmporium.us might be the same site, but again, that moves away from case issues.
There's also the simple matter that most of the time we never have to deal with more than a couple of dozen different domains, so sometimes working harder rather than smarter (i.e. just make sure they're all set up right and don't do anything programmatically!) can do the trick.
A final note though, it's important not to canonicalise a third-party URI. You can end up breaking things if you change the path (they may not be treating it case-insensitively) and you might at least end up breaking their slightly different canonicalisation. Best to leave them as is at all times.
Firstly, never use case transformations to compare strings. It needlessly allocates a string, it has a small needless performance impact, it could throw a NullReferenceException if the value is null, and it could easily produce an incorrect comparison.
If this is important enough to you, I would manually traverse the file system and use your own comparisons against each file/directory name. You should be able to use the Accept-Language (or Accept-Encoding, if it includes a culture) HTTP header to find a suitable culture to use. Once you have the CultureInfo you can use it to perform the string comparisons:
var ci = CultureInfo.CurrentCulture; // Use Accept-Language to derive this.
ci.CompareInfo.Compare("The URL", "the url", CompareOptions.IgnoreCase);
I would only do this on an HTTP 404; the HTTP 404 handler would search for a matching file and then HTTP 301 the user to the correctly-cased URL (as manual file-system traversal can get expensive).
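For completeness, CompareInfo.Compare returns 0 when the strings match under the given options, so the check in such a 404 handler might look like this (a sketch; the culture would really come from Accept-Language):

```csharp
// Culture-aware, case-insensitive equality check via CompareInfo.
using System.Globalization;

var ci = CultureInfo.GetCultureInfo("en-US").CompareInfo;  // derive from Accept-Language
bool matches = ci.Compare("The URL", "the url", CompareOptions.IgnoreCase) == 0;
// matches is true: the two strings differ only by case under this culture
```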
Related
I have an issue with naming endpoints in a REST API.
Let's say you have a UI on the client side, and in that UI is a table with a list of files. Tapping a file downloads that selected one from the server. Additionally there is a button that, when clicked, will download all the files or the selected files.
So the endpoints on the API may be structured like so...
[GET] Api/Files/{fileName}
Gets a single file by the file name provided in the route.
[GET] Api/Files
Gets a list of the files, including: FileName, Size, Type, etc...
[GET] Api/Files
Gets the files, returned as a ZIP file.
As you can see the issue is the conflict of endpoints with Api/Files. I would expect both endpoints to do what I have specified that they do. But one of them needs to change... I've thought about adding something to the end but mostly verbs come to mind. Any ideas on how the formatting could be done?
Going over the different answers and testing them out, I think the best answer is just having a different endpoint name. So I've now gone for...
[GET] Api/Files/{fileName}
Gets a single file by the file name provided in the route.
[GET] Api/Files
Gets a list of the files, including: FileName, Size, Type, etc...
[GET] Api/Files/Archive
Gets the files, returned as a ZIP file.
It's not perfect, but it makes sense.
An alternative could be...
[GET] Api/Files/Zip
But I think this doesn't work very well. As endpoints should never change and I may want to change it from a zip at some point...
The HTTP/RESTy way is to specify the response type you want with the Accept header. The endpoint can return the results as JSON if Accept is application/json and a ZIP file if it's application/zip.
In the worst case, you can inspect a request's Accept header and return either a JSON result or create a ZIP file and return it with return File(...).
The Produces attribute can be used to specify the content type returned by each action, allowing you to write different actions for each content type.
Another option is to create a custom output formatter and have ASP.NET itself handle the generation of the ZIP file when the Accept header requests it. This blog post shows how to create an Excel output formatter which returns a list of items as an Excel file when the Accept header requests it.
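A minimal sketch of the "inspect Accept and branch" approach in an ASP.NET Core controller. BuildZip and ListFiles are hypothetical helpers, not real framework APIs:

```csharp
// Sketch: one endpoint, two representations, chosen by the Accept header.
[HttpGet("Api/Files")]
public IActionResult GetFiles()
{
    if (Request.Headers["Accept"].ToString().Contains("application/zip"))
    {
        byte[] zip = BuildZip();                      // hypothetical helper
        return File(zip, "application/zip", "files.zip");
    }

    return Json(ListFiles());                         // hypothetical helper
}
```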
I would expect both endpoints to do what I have specified that they do. But one of them needs to change...
Right - expanding on that idea, you have three resources (the contents of the file, the list of available files, the downloadable archive of files), but only two names; so you need at least one more name.
Good news: REST doesn't care what spelling conventions you use for your resource identifiers, so you don't actually need a good name.
/Api/0d660ac6-d067-42c1-b23b-daaaf946efc0
That will work just fine. The machines don't care.
Human beings do care though; it will be a lot easier to review an access log, or find things in documentation, if we aren't trying to guess what the different UUIDs mean.
mostly verbs come to mind
Verbs are fine. Notice that these URIs all work exactly like you would expect:
https://www.merriam-webster.com/dictionary/get
https://www.merriam-webster.com/dictionary/put
https://www.merriam-webster.com/dictionary/post
The HTTP/RESTy way is to specify the response type you want with the Accept header.
Probably not what you want, here. The effective target uri is the primary cache key; when we invalidate a cached entry, all of the representations will be invalidated. If that's not the relationship you want between your list of files and your downloadable zip, then having them share a resource identifier is going to be unhappy.
Accept makes more sense when you have multiple representations of the same thing (ie, the list of files, but represented as HTML, JSON, text, XML, etc).
You might also consider what these identifiers look like in the access logs; should the list of files and the zip be logged using the same URI? Do you want general purpose log parsing tools (like alarm systems) to consider the two different kinds of fetch to be equivalent?
I have this app.config, which contains FolderPath under the appSettings node.
During testing, the QAs used the path: C:\directory\test & test on FolderPath value that made the application crash on startup.
I know it's the unescaped character (specifically &) that made the error.
They're insisting that it's a program error and should be automatically escaped because some users may not know about escaping strings.
How do I deal with it?
If users are allowed to edit app.config, they must write correct XML, period. This is not a program error, and there is frankly nothing you can (or should) do about this. But you already know this.
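For reference, the value the QAs tried would need the ampersand escaped to be well-formed XML:

```xml
<appSettings>
  <!-- & must be written as &amp; in well-formed XML -->
  <add key="FolderPath" value="C:\directory\test &amp; test" />
</appSettings>
```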
I believe the point your QAs are trying to make is:
You should not expect "normal" users to know how to edit/write XML, or even know what XML is. Instead, for user-editable values, you should create a UI which edits the app.config, and otherwise tell your users to leave the file alone unless they "know what they are doing".
They're insisting that it's a program error and should be automatically escaped because some users may not know about escaping strings.
Users should never be allowed to deal directly with the settings file. If they have to update it, they should be provided a GUI where they can make the change as necessary. That way you have the control needed to validate whether or not the input is correct.
I am creating a new application right now and I want to get things right at the start so I can grow with it in the future.
I have looked at several guides describing how to make a multilanguage-supported application, but I can't figure out which one to use.
Some tutorials are old and I don't know if they are out of date.
http://www.codeproject.com/Articles/352583/Localization-in-ASP-NET-MVC-with-Griffin-MvcContri
http://geekswithblogs.net/shaunxu/archive/2012/09/04/localization-in-asp.net-mvc-ndash-upgraded.aspx
http://www.hanselman.com/blog/GlobalizationInternationalizationAndLocalizationInASPNETMVC3JavaScriptAndJQueryPart1.aspx
http://www.chambaud.com/2013/02/27/localization-in-asp-net-mvc-4/
https://github.com/turquoiseowl/i18n
I found that there are two ways of storing the language data: either in the db or in resource files.
What are the pro/cons?
Is there another way that is prefered?
This is what I want:
Easy to maintain (Add/Change/Remove)
Full language support. (views, currency, time/date, jquery, annotations and so on..)
Ability to change language.
Auto detect language.
Future safe.
What is the preferred way of doing this? Got any good tutorial that is best practice for 2013?
I've written a blog post covering the following aspects of a multilingual ASP.NET MVC app:
Database: I split each table in two parts, one containing the non-translatable fields and the other containing the fields that need translation.
Url Routes: I normally keep the culture in the URL that way you get a well indexed ML site.
routes.MapRoute(
name: "ML",
url: "{lang}/{controller}/{action}/{id}",
defaults: new { lang = "en-CA", controller = "Home", action = "Index", id = UrlParameter.Optional }
);
From a base controller class you can make that {lang} parameters available to your Controller. See the blog post for all details.
Querying your data: You would simply pass the culture to your repository or database context. The idea is to create a view model that merges the two tables that you separated for translation.
public ActionResult Index(string id)
{
var vm = pages.Get(id, Language); // Language is the {lang} from the route
return View(vm);
}
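The base-controller part mentioned above can be sketched roughly like this (names such as Language and the default "en-CA" are assumptions for illustration, not the blog post's exact code):

```csharp
// Sketch: expose the {lang} route value and apply it as the current culture.
using System.Globalization;
using System.Threading;
using System.Web.Mvc;

public abstract class BaseController : Controller
{
    // The {lang} segment from the ML route, with a fallback default.
    protected string Language
    {
        get { return (RouteData.Values["lang"] as string) ?? "en-CA"; }
    }

    protected override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        var culture = new CultureInfo(Language);
        Thread.CurrentThread.CurrentCulture = culture;    // dates, numbers, currency
        Thread.CurrentThread.CurrentUICulture = culture;  // resource file lookup
        base.OnActionExecuting(filterContext);
    }
}
```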
Your Views: Use .NET resource files with the generic two-letter language code, e.g. HomePage.en.resx, HomePage.fr.resx. The full locales en-US, en-CA are useful for formatting currency, dates, numbers, etc. Your resource files will most likely be the same for English US, Canada, etc.
Images: Use a format like imagename-en.png, imagename-fr.png. From your Views make the {lang} route parameter available via an extension method and display your images like this:
<img src="/content/logos/imagename-@(this.CurrentLanguage()).png" />
You may also have a complete separate folder for your language supported. Example: /content/en/imagename.png, /content/fr/imagename.png.
JavaScript: I usually create a folder named LanguagePacks with JS files called lang-en.js, lang-fr.js. The idea is to create a "static" object that you can use all over your other JS files:
// lang-en.js
var Lang = {
    globalDateFormat: 'mm-dd-yy',
    greeting: 'Hello'
};
On your Views, you load the correct Language file
<script type="text/javascript" src="/content/js/languagepacks/lang-@(this.CurrentLanguage()).js"></script>
From a JavaScript module
// UI Module
var UI = (function () {
    function notify() {
        alert(Lang.greeting);
    }
    return {
        notify: notify
    };
})();
There's no one way to do a multilingual web app. Use resources to translate your text; they can be swapped quickly at run-time, and there are multiple editors that can open them, so you can pass the files to translators, etc.
You can check the blog post and get a detailed example on how I do this.
I have an approach of my own based on the db; however, it may not be suitable for large-scale apps.
I create a Translation table/entity for holding titles and texts which should be multilingual.
So, whenever I want to render a view, first I retrieve the appropriate translation from db and then pass it to the view as model:
var t = dbCtx.Translations.Find(langID);
// ...
return View(t);
And in view, I render the content like the following:
<tr>
<td>@Html.DisplayFor(m => m.WelcomeMessage)</td>
<td>@Html.ActionLink(Model.EnterSite, "Index", "Home")</td>
</tr>
And about getting the appropriate language, well, you have several ways. You can use the session:
Session["Lang"] = "en";
// ...
var lang = (string)Session["Lang"] ?? "en";
Or by passing it through query string, or combination of them.
For auto-detecting the language, you should decide between the following:
a) Detecting from the browser
b) Detecting from user IP and guessing geo location of him/her and setting the appropriate language for him/her
I kinda have a feeling that this question does not have a straightforward answer. For formatting and whatnot you should always be using the Culture and the UICulture, i.e. stuff like 'en-GB' for British English and 'en-US' for US English (and yes, there is a difference). All of .NET is built around it, and that way you can use local formatting without really thinking about it.
You should also check out the System.Globalization namespace for more details:
http://msdn.microsoft.com/en-us/library/system.globalization%28v=vs.110%29.aspx
As for where the culture should come from: at least in our company the policy has always been, without exception, to take it from the query string. The reason is that if you use IP localization, for example, then a Spanish user looking at a Spanish site in Japan would be switched to the Japanese version; not exactly wrong, but it can be annoying if you've told the client that it's a direct Spanish link or something. That said, if the culture is undefined in the query string, then using the IP to guess which language the user would like is not a bad idea, but it really depends on your clients' needs.
As for where to get the translations: it really depends, as both resources and the DB have their ups and downs. The major up-points for the DB are that it's easy to share between applications, and if you need to update a phrase for any reason you can update it everywhere through one DB entry. But this can be a fault as well, because some phrases have dual meanings and can be translated completely differently in other languages even though the same sentence is used in English. The other big problem with the DB is that, to a degree (depending on implementation), you lose IntelliSense in VS for the phrases.
Resources of course have proper IntelliSense, but they are fairly static to the web application you are using; you can't share resources across applications... well, not exactly true, but more or less you can't. That static nature is also a plus, though, because all of the resources are contained within the application and you know that no outside effect can influence them.
In reality it really depends on the project at hand; you simply have to weigh your own pros and cons and decide for yourself... sorry. But if it helps at all, I can tell you how things are in my company. We make several applications that share the same phrases across various web applications. We used to keep them in DBs, but what happened was that the translations kept going awry, meaning a client asked for a better translation in Spanish for one app, but that made no sense whatsoever in the others. That happened more often than you might think. Then we moved to resources, but what happened there was that when one app's translation was updated, no-one remembered to update the other apps, and it actually happened that three different apps translated the same term in three different ways. In the end we decided to go back to the DB, because the ability to change all of the translations at once meant more to us than the fact that no outside influence could affect them.
In general there are other pros and cons as well, but all that is fairly irrelevant compared to the above. You really need to ask how you are using the translations. As for general editing (excluding the above point), either approach works just as well; you can just as easily change, edit or extend translations with both.
That said, if the DB and the DB calls are designed badly, then adding new languages can be easier with resources: simply copy the resource file, add the culture extension to the name, add the translations, and you are done. But again, this comes completely down to the DB design, so it is something to keep in mind when designing the DB, and it says fairly little about holding the translations in the DB as such.
By and large I would say that resources are easier to use and very easy to maintain and extend (and they are already built into .NET), though the DB has a clear advantage if you need to share translations. So if I had to choose, I would say the recommended way is to use resources, but the DB does have its place, and it really depends on the project.
I've been looking at the same sources as you for the same needs but as you say it is very hard to find one preferred solution.
But I decided to go with the solution by Michael Chambaud that you linked in your question. Reasons for this were that it was easy to implement, gave me the URLs I wanted, and, since I'm still not 100% certain... is easy to replace in the future.
In addition to this I will use jQuery Globalize for client-side formatting.
It would be really interesting to hear what solution you decided on?
Is it possible to use either the System.IO.Path class, or some similar object, to format a Unix-style path, providing functionality similar to the Path class? For example, I can do:
Console.WriteLine(Path.Combine("c:\\", "windows"));
Which shows:
"C:\\windows"
But if I try a similar thing with forward slashes (/) it just reverses them for me.
Console.WriteLine(Path.Combine("/server", "mydir"));
Which shows:
"/server\\mydir"
You've got bigger problems: Unix accepts characters in a file name that Windows does not allow. This code will bomb with an ArgumentException, "Illegal characters in path":
string path = Path.Combine("/server", "accts|payable");
You can't reliably use Path.Combine() for Unix paths.
Path.Combine uses the values of Path.DirectorySeparatorChar and Path.VolumeSeparatorChar, and these are determined by the class libraries in the runtime - so if you write your code using only Path.Combine calls, Environment.SpecialFolder values, and so forth, it will run fine everywhere, since Mono (and presumably any .NET runtime) implements the native way of getting and building those paths for any platform it runs on. (Your second example, for instance, returns /server/mydir for me, but the first example gives c:\/windows.)
If you want a UNIX-specific path hard-coded in all cases, Path.Combine isn't buying you anything: Console.WriteLine ("/server/mydir"); does what you want in the OP.
As Hans said though, different filesystems have different rules for allowed characters, path lengths, and etc., so the best practice, like with any cross-platform programming, is to restrict yourself to using the intersection of allowed features between the filesystems you're targeting. Watch case-sensitivity issues too.
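If a hard-coded Unix-style join really is needed on Windows, a tiny helper avoids Path.Combine altogether. This is a sketch only: it assumes rooted paths and does no validation of illegal characters:

```csharp
// Sketch: join segments with '/' regardless of the platform's separators.
using System;
using System.Linq;

static class UnixPath
{
    public static string Combine(params string[] parts)
    {
        // Trim stray slashes from each segment, drop empties, re-root with '/'.
        return "/" + string.Join("/",
            parts.Select(p => p.Trim('/')).Where(p => p.Length > 0));
    }
}

// UnixPath.Combine("/server", "mydir") → "/server/mydir"
```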
In this case I would use the System.Uri or System.UriBuilder class.
Side note: if you run your .NET code on a Linux system with the Mono runtime, the Path class should return your expected behavior. The information the Path class uses is provided by the underlying system.
FxCop wants me to spell Username with a capital N (i.e. UserName), due to it being a compound word. However, due to consistency reasons we need to spell it with a lowercase n - so either username or Username.
I've tried tweaking the CodeAnalysisDictionary.xml by adding the following section to the <Words> section:
<DiscreteExceptions>
<Term>username</Term>
</DiscreteExceptions>
From what I understand how custom dictionaries work, this should tell FxCop to treat username as a discrete term and prevent the CompoundWordsShouldBeCasedCorrectly (CA1702) check to fire an error.
Unfortunately this doesn't work. Does anybody have an idea why that is and how to solve this? I don't want to add suppressions, because this would seriously clutter the GlobalSuppressions file as there are quite a lot of occurrences.
Edited to add: For the time being I have solved this by using GlobalSuppressions, but given the nature of the issue this doesn't seem like the ideal way to solve this. Can anybody give a hint on where to look for further information on how FxCop applies the rules defined in a dictionary?
I was a developer on the FxCop / Managed Code Analysis Team for 3 years and I have your answer. Things have changed since my time, and I had forgotten exactly how custom dictionary handling worked and so it took me quite a bit of time to figure this out. :)
Executive Summary
The short answer is that you need to remove all references to username, usernames, UserName, and UserNames from C:\Program Files (x86)\Microsoft FxCop 1.36\CustomDictionary.xml.
Normally, I would not recommend this as it should not be required, but you have found what I believe is a bug, and this is the only workaround that I could find.
Full Story
OK, now for the long answer...
The rule has two distinct checks which work as follows:
A. Check for compound words that should be discrete
1. Split the identifier into tokens: e.g. FileName --> { "file", "name" }.
2. Spell check each adjacent pair of tokens.
3. If the spell check succeeds (e.g. filename is deemed to be a valid word), then we have found a potential problem, since a single word should not be expressed as two tokens.
4. However, if there is a <Term CompoundAlternate="FileName">filename</Term> in the <Compound> section of the custom dictionary, then it is taken to mean that although filename is a word, the design guidelines (largely as a nod to consistency with prior art in the Framework that predates the existence of the rule) insist it should be written as FileName, and so we must suppress the warning.
5. Also, if there is a <Term>filename</Term> entry in the <DiscreteExceptions> section of the custom dictionary, then it is taken to mean that although filename is a word, it might also be two words, file and name, in a different context. E.g. Onset is a word, but asking the user to change DoSomethingOnSet to DoSomethingOnset would be noise, and so we must suppress the warning.
B. Check for discrete words that should be compound:
1. Taking the tokens from A.1, check each one individually against the set of compound terms in the custom dictionary.
2. If there is a match, then we must warn, in keeping with the interpretation in step A.4.
Notice that your warning: Username should be UserName is detected in part B, which does not consult the DiscreteExceptions section, which is why you are not able to suppress the warning by modifying that section. The problem is that the default custom dictionary has an entry stating that the correct casing for username is always UserName. It needs to be removed or overridden somehow.
The Bug
Now, the ideal solution would be to leave the default custom dictionary alone, specify SearchFxCopDir=false in your project file, and then merge in only the parts of the default custom dictionary that you want in the CustomDictionary.xml that is used for your project. Sadly, this does not work as FxCop 1.36 ignores the SearchFxCopDir directive and always treats it as true. I believe this is a bug, but it is also possible that this was an intentional change since the directive is not documented and has no corresponding UI. I honestly don't know...
Conclusion
Given that FxCop always uses its default custom dictionary in addition to the project custom dictionary, your only recourse is to remove the entries in question from the default custom dictionary.
If I have a chance, I will contact the current code analysis team to see if this in fact a bug, and report back here...
In the custom dictionary that comes with FxCop (located on my system at C:\Program Files\Microsoft FxCop 1.36\CustomDictionary.xml, but YMMV), the Words\Compounds section has a <Term CompoundAlternate="UserName">username</Term> entry. Delete it. You still need the discrete exception.
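For clarity, the relevant pieces of the two dictionaries look roughly like this (trimmed; element nesting per the FxCop custom dictionary schema). The <Compound> entry in the default dictionary is the one to delete; the <DiscreteExceptions> entry belongs in your project's dictionary:

```xml
<!-- Default dictionary (FxCop install dir): delete the Compound term -->
<Dictionary>
  <Words>
    <Compound>
      <!-- This line forces CA1702 to demand UserName; remove it -->
      <Term CompoundAlternate="UserName">username</Term>
    </Compound>
    <DiscreteExceptions>
      <!-- Keep this in your project's CustomDictionary.xml -->
      <Term>username</Term>
    </DiscreteExceptions>
  </Words>
</Dictionary>
```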