FxCop: Compound word should be treated as discrete term - c#

FxCop wants me to spell Username with a capital N (i.e. UserName), due to it being a compound word. However, due to consistency reasons we need to spell it with a lowercase n - so either username or Username.
I've tried tweaking the CodeAnalysisDictionary.xml by adding the following section:
<DiscreteExceptions>
<Term>username</Term>
</DiscreteExceptions>
From what I understand of how custom dictionaries work, this should tell FxCop to treat username as a discrete term and prevent the CompoundWordsShouldBeCasedCorrectly (CA1702) check from firing an error.
Unfortunately this doesn't work. Does anybody have an idea why that is and how to solve this? I don't want to add suppressions, because this would seriously clutter the GlobalSuppressions file as there are quite a lot of occurrences.
Edited to add: For the time being I have solved this by using GlobalSuppressions, but given the nature of the issue this doesn't seem like the ideal way to solve this. Can anybody give a hint on where to look for further information on how FxCop applies the rules defined in a dictionary?

I was a developer on the FxCop / Managed Code Analysis Team for 3 years and I have your answer. Things have changed since my time, and I had forgotten exactly how custom dictionary handling worked and so it took me quite a bit of time to figure this out. :)
Executive Summary
The short answer is that you need to remove all references to username, usernames, UserName, and UserNames from C:\Program Files (x86)\Microsoft FxCop 1.36\CustomDictionary.xml.
Normally, I would not recommend this as it should not be required, but you have found what I believe is a bug, and this is the only workaround that I could find.
Full Story
OK, now for the long answer...
The rule has two distinct checks which work as follows:
A. Check for compound words that should be discrete
1. Split the identifier into tokens: e.g. FileName --> { "file", "name" }
2. Spell check each adjacent pair of tokens.
3. If the spell check succeeds (e.g. filename is deemed to be a valid word), then we have found a potential problem since a single word should not be expressed as two tokens.
4. However, if there is a <Term CompoundAlternate="FileName">filename</Term> in the <Compound> section of the custom dictionary, then it is taken to mean that although filename is a word, the design guidelines (largely as a nod to consistency with prior art in the Framework that predates the existence of the rule) insist it should be written as FileName, and so we must suppress the warning.
5. Also, if there is a <Term>filename</Term> entry in the <DiscreteExceptions> section of the custom dictionary, then it is taken to mean that although 'filename' is a word, it might also be two words 'file' and 'name' in a different context. e.g. Onset is a word, but asking the user to change DoSomethingOnSet to DoSomethingOnset would be noise, and so we must suppress the warning.
B. Check for discrete words that should be compound
1. Taking the tokens from A.1, check each one individually against the set of compound terms in the custom dictionary.
2. If there is a match, then we must warn in keeping with the interpretation in step A.4.
Notice that your warning: Username should be UserName is detected in part B, which does not consult the DiscreteExceptions section, which is why you are not able to suppress the warning by modifying that section. The problem is that the default custom dictionary has an entry stating that the correct casing for username is always UserName. It needs to be removed or overridden somehow.
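To make part B concrete, here is a minimal C# sketch of the idea. It is not FxCop's actual implementation: the class name, the tokenizer regex, and the in-memory dictionary are all assumptions made for illustration, standing in for the <Compound> entries of the default custom dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CompoundCheckSketch
{
    // Hypothetical stand-in for the <Compound> section of the default dictionary:
    // key = discrete spelling, value = the CompoundAlternate casing.
    private static readonly Dictionary<string, string> Compounds =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "username", "UserName" },
            { "filename", "FileName" },
        };

    // Naive tokenizer (an assumption, not FxCop's): splits a Pascal- or
    // camel-cased identifier at capital letters and keeps acronyms together.
    public static IEnumerable<string> Tokenize(string identifier)
    {
        foreach (Match m in Regex.Matches(identifier, "[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])"))
            yield return m.Value;
    }

    // Check B: any single token that matches a compound term produces a warning.
    // Note that nothing here looks at a DiscreteExceptions list.
    public static List<string> CheckDiscreteShouldBeCompound(string identifier)
    {
        var warnings = new List<string>();
        foreach (string token in Tokenize(identifier))
        {
            string alternate;
            if (Compounds.TryGetValue(token, out alternate))
                warnings.Add(token + " should be " + alternate);
        }
        return warnings;
    }
}
```

In this sketch, GetUsername tokenizes to Get and Username, so the dictionary lookup fires, while GetUserName tokenizes to Get, User and Name and passes. The only way to silence the warning is to take the entry out of the dictionary, which mirrors the workaround above.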
The Bug
Now, the ideal solution would be to leave the default custom dictionary alone, specify SearchFxCopDir=false in your project file, and then merge in only the parts of the default custom dictionary that you want in the CustomDictionary.xml that is used for your project. Sadly, this does not work as FxCop 1.36 ignores the SearchFxCopDir directive and always treats it as true. I believe this is a bug, but it is also possible that this was an intentional change since the directive is not documented and has no corresponding UI. I honestly don't know...
Conclusion
Given that FxCop always uses its default custom dictionary in addition to the project custom dictionary, your only recourse is to remove the entries in question from the default custom dictionary.
If I have a chance, I will contact the current code analysis team to see if this is in fact a bug, and report back here...

The custom dictionary that comes with FxCop (located on my system at C:\Program Files\Microsoft FxCop 1.36\CustomDictionary.xml, but YMMV) has a <Term CompoundAlternate="UserName">username</Term> entry under Words\Compound. Delete it. You still need the discrete exception.

Related

Dictionary of Words in SonarQube

We introduced SonarQube into our project which is working fine, but we have names like SelfAHFController. For Sonar, the name SelfAHFController should be SelfAhfController.
In StyleCop it was possible to create a dictionary of allowed words. We would like to have AHF in this dictionary, but also keep the CamelCase rule. That is, for words found in the dictionary, Sonar should ignore the CamelCase rule.
Where can I specify this in Sonar?
We still do not support custom dictionaries for the naming rules. We have an open ticket, but I cannot promise a specific date when it will be developed.

Ignoring files from checkin with certain pattern of change

Since having started using JetBrains Annotations, for my own benefit I've decorated all methods with [CanBeNull] or [NotNull]
For example, the following line:
public AccountController(IAccountService accountService)
Would be changed to:
public AccountController([CanBeNull] IAccountService accountService)
Another example would be:
public Account CreateAccountEntity(Account accountEntity)
would be changed to:
[CanBeNull]
public Account CreateAccountEntity([NotNull] Account accountEntity)
How can I bypass pending changes for annotations, specifically "[CanBeNull]", and have TFS completely ignore this change?
You cannot make TFS "ignore" the change. That is the purpose of TFS - to track all changes.
The way I interpret your question, you are wanting to avoid the noise of potentially many small but innocuous checkins due to your annotations. If this is correct then there is a way to use TFS that will minimize the noise:
create a branch from where you are currently working (let's call it "BranchA"), then make all the annotation changes in that new branch ("BranchB"), checking them in regularly
if this is going to take some time (days, weeks) to complete then ensure you do regular merges from BranchA to BranchB
when you think you've finished do a final merge from BranchA to BranchB. If you've pulled across any new methods then ensure you annotate them. Repeat this step if you made changes.
merge all changes from BranchB back to BranchA. This will have the effect of aggregating all your smaller changes into a single large checkin/changeset in BranchA. Provided you have been doing regular merges from BranchA to BranchB this should be problem free even if considerable time has passed since you started the decoration work.
In short, you shouldn't. The closest feature is .tfignore, but that ignores whole files.
On the other hand, if you really want this, you could create a tool using the TFS API and run it before check-ins. It would go through all the pending files in your solution, look for these small changes, and exclude the affected files. The risk is that at some point you make a real change to an excluded file, it doesn't get checked in, and that causes problems. You would need extra code to verify which files on the excluded list should be included after all.
External tool used inside VS: here you can see how to add tools to the Tools menu and send arguments to them.
TFS API Example
This example shows how to use the TFS API. There is a 'workspace.AddIgnoreFileExclusion()', but I don't have TFS here, so I'll verify how to ignore the files later.
Now in my experience, the only reason I wouldn't want to check in those changes would be to avoid conflicts with the team.
If I see a lot of value in some practice like using the annotations, I would talk with the team to get them to buy into the idea. That way everyone would be using annotations, soon every file would have them, and there wouldn't be any conflicts.
You can't selectively ignore changes within files, in TFVC or in any other SCM I've ever encountered.
I agree with other answers that such kind of feature isn’t officially supported by Microsoft.
But you can also override TFVC's behavior in a few ways if it is really needed. You can write your own Visual Studio plug-in or Source Control VSPackage.
If your main goal is to write better code with the help of ReSharper telling you whether you should expect nulls or not or produce other warnings and you don't want to disturb other team members with it I would suggest you to consider using External Annotations instead of annotation attributes in code.
You can then decide whether you want to commit those files or keep them locally. And even if you commit your code will still be clean without those extra attributes.
At least I would give it a try.

Canonicalize URL to lowercase without breaking file system or culture?

Canonicalizing URLs to Lowercase
I wish to write an HTTP module that converts URLs to lowercase. My first attempt ignored international character sets and works great:
// Convert URL virtual path to lowercase
string lowercase = context.Request.FilePath.ToLowerInvariant();
// If anything changed then issue 301 Permanent Redirect
if (!lowercase.Equals(context.Request.FilePath, StringComparison.Ordinal))
{
context.Response.RedirectPermanent(...lowercase URL...);
}
The Turkey Test (international cultures):
But what about cultures other than en-US? I referred to the Turkey Test to come up with a test URL:
http://example.com/Iıİi
This little insidious gem destroys any notion that case conversion in URLs is simple! Its lowercase and uppercase versions, respectively, are:
http://example.com/ııii
http://example.com/IIİİ
For case conversion to work with Turkish URLs, I first had to set the current culture of ASP.NET to Turkish:
<system.web>
<globalization culture="tr-TR" />
</system.web>
Next, I had to change my code to use the current culture for the case conversion:
// Convert URL virtual path to lowercase
string lowercase = context.Request.FilePath.ToLower(CultureInfo.CurrentCulture);
// If anything changed then issue 301 Permanent Redirect
if (!lowercase.Equals(context.Request.FilePath, StringComparison.Ordinal))
{
context.Response.RedirectPermanent(...);
}
But wait! Will StringComparison.Ordinal still work? Or should I use StringComparison.CurrentCulture? I'm really not certain of either!
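On the narrow question of the comparison: an ordinal comparison looks only at the raw UTF-16 code units, so it reports whether the case conversion changed anything no matter which culture did the converting. A minimal sketch, where the LowercaseCheck class and NeedsRedirect method are hypothetical names and not part of ASP.NET:

```csharp
using System;
using System.Globalization;

public static class LowercaseCheck
{
    // Returns true when lowercasing the path under the given culture changes it,
    // i.e. when the module would issue a 301 to the lowercase form.
    public static bool NeedsRedirect(string path, CultureInfo culture)
    {
        string lowercase = path.ToLower(culture);

        // Ordinal compares code unit by code unit, so any difference introduced
        // by the conversion is detected, whichever culture produced it.
        return !string.Equals(lowercase, path, StringComparison.Ordinal);
    }
}
```

A call such as LowercaseCheck.NeedsRedirect("/Iıİi", new CultureInfo("tr-TR")) exercises the Turkish case. StringComparison.CurrentCulture would be the wrong tool here, because a culture-sensitive comparison may treat strings with different code units as equal, and the redirect decision needs an exact answer.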
File names: It gets MUCH WORSE!
Even if the above works, using the current culture for case conversions breaks the NTFS file system! Let's say I have a static file with the name Iıİi.html:
http://example.com/Iıİi.html
Even though the Windows file system is case-insensitive it does not use language culture. Converting the above URL to lowercase results in a 404 Not Found because the file system doesn't consider the two names as equal:
http://example.com/ııii.html
The correct case conversion for file names? WHO KNOWS?!
The MSDN article, Best Practices for Using Strings in the .NET Framework, has a note (about halfway through the article):
Note:
The string behavior of the file system, registry keys and values, and environment variables is best represented by StringComparison.OrdinalIgnoreCase.
Huh? Best represented??? Is that the best we can do in C#? So just what is the correct case conversion to match the file system? Who knows?!!? About all we can say is that string comparisons using the above will probably work MOST of the time.
Summary: Two case conversions: Static/Dynamic URLs
So we've seen that static URLs---URLs having a file path that matches a real directory/file in the file system---must use an unknown case conversion that is only "best represented" by StringComparison.OrdinalIgnoreCase. And please note there is no string.ToLowerOrdinal() method so it's very difficult to know exactly what case conversion equates to the OrdinalIgnoreCase string comparison. Using string.ToLowerInvariant() is probably the best bet, yet it breaks language culture.
On the other hand, dynamic URLs---URLs with a file path that does not match a real file on the disk (that map to your application)---can use string.ToLower(CultureInfo.CurrentCulture), but it breaks file system matching and it is somewhat unclear what edge cases exist that may break this strategy.
Thus, it appears case conversion first requires detection as to whether a URL is static or dynamic before choosing one of two conversion methods. For static URLs there is uncertainty how to change case without breaking the Windows file system. For dynamic URLs it is questionable if case conversion using culture will similarly break the URL.
Whew! Anyone have a solution to this mess? Or should I just close my eyes and pretend everything is ASCII?
I would challenge the premise here that there is any utility whatsoever in attempting to auto-convert URLs to lowercase.
Whether a full URL is case-sensitive or not depends entirely on the web server, web application framework, and underlying file system.
You're only guaranteed case-insensitivity in the scheme (http://, etc.) and hostname portions of the URL. And remember that not all URL schemes (file and news, for example) even include a hostname.
Everything else can be case-sensitive to the server, including paths (/), filenames, queries (?), fragments (#), and authority info (usernames/passwords before the @ in mailto, http, ftp, and some other schemes).
You have some incompatible goals.
Have a culture-sensitive case-lowering. If Turkish seems bad, you don't want to know about some of the Georgian scripts, never mind that ß is either upper-cased to SS or less commonly to SZ - in either case to have a full case-folding where lower("ß") will match lower(upper("ß")) you need to consider it equivalent to at least one of those two-character sequences. Generally we aim for case-folding rather than case-lowering if possible (not possible here).
Use this in a non culture-sensitive context. URIs are ultimately opaque strings. That they may have a human-readable understanding is useful for coders, users, search-engines and marketers alike, but their ultimate job is to identify a resource by a direct case-sensitive comparison.
Map this to NTFS, which has a case-preserving case-insensitivity based on the mappings in the $UpCase file, which it applies by comparing the upper-cased forms of words (at least it doesn't have to decide whether Σ lower-cases to σ or ς in a culture-insensitive manner).
Presumably do well in terms of SEO and human readability. This may well be part of your original goal, but whileThisIsNotVeryEasyToReadOrParse itseasierforbothpeopleandmachinesthanthis. Case-folding loses information.
I suggest a different approach.
Start with your starting string, whatever that is and wherever it came from (NTFS filename, database entry, HttpHandler binding in web.config). Have that as your canonical form. By all means have rules that people should create these strings according to some canonical form, and perhaps enforce it where you can, but if something slips by that breaks your rules, then accept it as the official canonical name for that resource no matter how much you dislike it.
As much as possible the canonical name should be the only one "seen" by the outside world. This can be enforced programmatically or just a matter of it being best practice, as canonicalising after the fact with 301s won't solve the fact that outside entities don't know you do so until they dereference the URI.
When a request is received, test it according to how it is going to be used. Hence while you may choose to use a particular culture (or not) for those cases where you perform the resource-lookup yourself, with so-called "static" URIs, your logic can deliberately follow that of NTFS by simply using NTFS to do the work:
Find mapped file ignoring the matter of case sensitivity for now.
If non-match then 404, who cares about case?
If find, do case-sensitive ordinal comparison, if it doesn't match then 301 to the case-sensitive mapping.
Otherwise, proceed as usual.
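The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a drop-in HTTP module: StaticFileLookup, LookupResult and Resolve are hypothetical names, the directory is assumed to be the already validated physical folder for the request, and the OrdinalIgnoreCase match used to recover the on-disk spelling is only an approximation of the file system's own rules.

```csharp
using System;
using System.IO;

public enum LookupResult { NotFound, Redirect, Serve }

public static class StaticFileLookup
{
    // directory:     physical folder the request maps to (assumed validated).
    // requestedName: last path segment exactly as it appeared in the URL.
    // canonicalName: on-disk spelling when the file exists, otherwise null.
    public static LookupResult Resolve(string directory, string requestedName,
                                       out string canonicalName)
    {
        canonicalName = null;

        // Steps 1-2: let the file system decide whether the name matches anything.
        if (!File.Exists(Path.Combine(directory, requestedName)))
            return LookupResult.NotFound; // respond with 404

        // Step 3: recover the spelling stored on disk and compare it ordinally.
        string candidate = null;
        foreach (string path in Directory.GetFiles(directory))
        {
            string actual = Path.GetFileName(path);
            if (string.Equals(actual, requestedName, StringComparison.Ordinal))
            {
                canonicalName = actual;
                return LookupResult.Serve; // step 4: exact match, proceed as usual
            }
            if (candidate == null &&
                string.Equals(actual, requestedName, StringComparison.OrdinalIgnoreCase))
            {
                candidate = actual;
            }
        }

        // The file exists but under a different spelling: 301 to that spelling.
        // If no spelling could be recovered, serve the request unchanged.
        canonicalName = candidate ?? requestedName;
        return candidate != null ? LookupResult.Redirect : LookupResult.Serve;
    }
}
```

The point of the design is that the application never performs its own case conversion. Existence is answered by the file system, and the only string comparison left is the ordinal one that decides between serving and redirecting.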
Edit:
In some ways the question of domain names is more complicated. The rules for IDN have to cover more issues with less room for manœuver. However, it's also simpler at least as far as case-canonicalising goes.
(I'm going to ignore canonicalising of whether or not www. is used etc. though I'd guess it's part of the same job here, it's pushing the scope and we could end up writing a book between us if we don't stop somewhere :)
IDNs have their own case canonicalisation (and some other forms of normalisation) rules defined in RFC 3491. If you're going to canonicalise domain names on case, follow that.
Makes it nice and simple to answer, doesn't it? :)
There's also less pressure in a way, for while search engines have to recognise that http://example.net/thisisapath and http://example.net/thisIsAPath may be the same resource, they also have to recognise that they might be different, and that's where all of the SEO advantage of canonicalising on one of them (doesn't matter which) comes from.
However, they know that example.net and EXAMPLE.NET can't possibly be different sites, so there's little SEO advantage in making sure they're the same (still nice for things like caches and history lists that don't make that jump themselves). Of course, the issue remains with the fact that www.example.net or even maAndPasExampleEmporium.us might be the same site, but again, that moves away from case issues.
There's also the simple matter that most of the time we never have to deal with more than a couple of dozen different domains, so sometimes working harder rather than smarter (i.e. just make sure they're all set up right and don't do anything programmatically!) can do the trick.
A final note though, it's important not to canonicalise a third-party URI. You can end up breaking things if you change the path (they may not be treating it case-insensitively) and you might at least end up breaking their slightly different canonicalisation. Best to leave them as is at all times.
Firstly, never use case transformations to compare strings. It needlessly allocates a string, it has a needless small performance impact, could result in a NullReferenceException if the value is null and could likely result in an incorrect comparison.
If this is important enough to you I would manually traverse the file system and use your own comparisons against each file/directory name. You should be able to use the Accept-Language or Accept-Encoding (if it has a culture included) HTTP header to find the suitable culture to use. Once you have the CultureInfo you can use it to perform the string comparisons:
var ci = CultureInfo.CurrentCulture; // Use Accept-Language to derive this.
ci.CompareInfo.Compare("The URL", "the url", CompareOptions.IgnoreCase);
I would only do this on a HTTP 404; the HTTP 404 handler would search for a matching file and then HTTP 301 the user to the correctly-cased URL (as manual file-system traversal can get expensive).

Where can I find a list of all possible messages that an XmlException can contain?

I'm writing an XML code editor and I want to display syntax errors in the user interface. Because my code editor is strongly constrained to a particular problem domain and audience, I want to rewrite certain XmlException messages to be more meaningful for users. For instance, an exception message like this:
'"' is an unexpected token. The
expected token is '='. Line 30,
position 35
.. is very technical and not very informative to my audience. Instead, I'd like to rewrite it and other messages to something else. For completeness' sake that means I need to build up a dictionary of existing messages mapped to the new message I would like to display instead. To accomplish that I'm going to need a list of all possible messages XmlException can contain.
Is there such a list somewhere? Or can I find out the possible messages through inspection of objects in C#?
Edit: specifically, I am using XmlDocument.LoadXml to parse a string into an XmlDocument, and that method throws an XmlException when there are syntax errors. So specifically, my question is where I can find a list of messages applied to XmlException by XmlDocument.LoadXml. The discussion about there potentially being a limitless variation of actual strings in the Message property of XmlException is moot.
Edit 2: More specifically, I'm not looking for advice as to whether I should be attempting this; I'm just looking for any clues to a way to obtain the various messages. Ben's answer is a step in the right direction. Does anyone know of another way?
Technically there is no such thing: any class that throws an XmlException can set the message to any string. It really depends on which classes you are using and how they handle exceptions. It is perfectly possible you may be using a class that includes context-specific information in the message, e.g. info about some XML node or attribute that is malformed. In that case the number of unique message strings could be infinite, depending on the XML being processed. It is equally possible that a particular class does not work this way and has a finite number of messages that occur under specific circumstances. Perhaps a better approach would be to use try/catch blocks in specific parts of your code, where you understand the processing that is taking place, and provide more generic error messages based on what is happening. E.g. in your example you could simply look at the line and character number and produce an error along the lines of "Error processing xml file LineX CharacterY" or even something as general as "error processing file".
Edit:
Further to your edit, I think you will have trouble doing what you require. Essentially you are trying to change one text string to another based on certain keywords that may be in the string. This is likely to be messy and inconsistent. If you really want to do it, I would advise using something like Redgate .NET Reflector to decompile the LoadXml method and dig through the code to see how it handles different kinds of syntax errors in the XML and what messages it generates for each. This is likely to be time-consuming and difficult. If you want to hide the technical errors but still provide useful info to the user, then I would still recommend ignoring the error message and simply pointing the user to the location of the problem in the file.
Just my opinion, but ... spelunking the error messages and altering them before displaying them to the user seems like a really misguided idea.
First, the messages are different for each international language. Even if you could collect them for English, and you're willing to pay the cost, they'll be different for other languages.
Second, even if you are dealing with a single language, there's no way to be sure that an external package hasn't injected a novel XmlException into the scope of LoadXml.
Last, the list of messages is not stable. It may change from release to release.
A better idea is to just emit an appropriate message from your own app, and optionally display -- maybe upon demand -- the original error message contained in the XmlException.
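A minimal sketch of that approach, assuming the editor parses with XmlDocument.LoadXml as described in the question. The XmlErrorReporter class, the TryLoad method and the wording of the friendly message are hypothetical; LineNumber, LinePosition and Message are existing XmlException properties.

```csharp
using System;
using System.Xml;

public static class XmlErrorReporter
{
    // Returns null when the XML parses. Otherwise returns an app-defined message
    // built only from the error location, and hands back the original parser
    // message separately so the UI can show it on demand.
    public static string TryLoad(string xml, out string technicalDetail)
    {
        technicalDetail = null;
        try
        {
            new XmlDocument().LoadXml(xml);
            return null;
        }
        catch (XmlException ex)
        {
            technicalDetail = ex.Message;
            return string.Format(
                "The document has a syntax error at line {0}, position {1}.",
                ex.LineNumber, ex.LinePosition);
        }
    }
}
```

Because the friendly text never depends on the wording of ex.Message, it keeps working across languages and framework releases, which addresses the three objections above.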

Visual Studio - smarter word completion wanted

I'd like word completion to show all matching type names (not only those in imported namespaces). If the namespace of that type is not imported, it should be imported when I choose the type from the list, and if the type is in a non-referenced assembly, that assembly should be added to the project references (adding imports and references after a prompt, of course).
Trying to recollect an exact type name and its namespace is a real pain sometimes.
Is there any product with such completion?
(Yes, I know about Resharper. No, it doesn't support this)
PS and it would be really great, if word completion could show all types having text anywhere in the name - not only in the beginning. For example, I type "writer" - and completion shows me all writers (TextWriter, StringWriter, StreamWriter - etc)
You should take a look at ReSharper (Again) it does support the functionality with part of a type name or only writing the capital letters of a camel case type name e.g. SomeType can be found with ST.
The number of assemblies any tool will look in for possible types will always be limited. After all, unless you tell the tool about the assemblies (by registering an assembly in the GAC, referencing it, or any other means), the tool will simply not know of that assembly and will therefore not search it. On top of that, you really do not want the tool to search through too many assemblies, since you'd then risk having finished writing the full name of the type before the tool is done searching.
Right or wrong, the goal of intellisense is to provide legal completions for the current edit positions. This is by no means 100% accurate but we do strive for listing only valid completions.
Showing type names which are not imported and/or not a type in the assembly the current project references flies in the face of this approach. It is instead suggesting code that is known to be illegal to the user.
True we could then go back and fix this up by adding an assembly reference. But some users may find this very annoying. They are typing up code and suddenly references are added and their imports are changed.
I'm not saying this is necessarily a bad feature, just that it goes against the current design philosophy and has the potential to upset a good portion of users.
What you are looking for is ReSharper for C#.
Just type something like MD and press Ctrl+Space; it will bring up every standard include. Just press space to confirm (in this case MD5 will show up). It also learns what you use most.
In addition to Resharper, CodeRush also has this feature. The free version probably does too.
