Finding usages of string == operator in a large codebase - c#

I've had a request to look into the feasibility of replacing all of the string == operator usages in a reasonably large C# codebase with String.Equals() method calls that explicitly specify case-sensitivity.
Haven't had much luck figuring out a way to identify all the occurrences in the codebase, though.
Searching for "==" obviously finds countless instances of types other than strings being compared.
There doesn't seem to be a StyleCop rule to find this.
Nor a ReSharper rule.
As a last resort I tried loading the assemblies into JustDecompile and finding all usages of System.String.op_Equality but that doesn't seem to pick up usages inside of LINQ expressions such as .Where(x => x.StringField == stringField)
So I'm a little stumped and wondered if anyone had any ideas on how to search these pesky comparisons out?
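For context, here is a minimal sketch of the replacement being considered. For string operands, == performs an ordinal, case-sensitive comparison, so String.Equals with StringComparison.Ordinal should be the behavior-preserving substitute, while OrdinalIgnoreCase deliberately changes the semantics. The class and method names are made up for illustration.

```csharp
using System;

public static class EqualityDemo
{
    // Intended to give the same result as a == b:
    // ordinal, case-sensitive, and safe when either argument is null.
    public static bool SameAsOperator(string a, string b) =>
        string.Equals(a, b, StringComparison.Ordinal);

    // Explicitly case-insensitive; NOT equivalent to a == b.
    public static bool IgnoreCase(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
}
```

With "Hello" and "hello" as inputs, SameAsOperator returns false (as == would) and IgnoreCase returns true.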

You can use ReSharper to find all the usages. Here's what works for me:
Right click on the string type anywhere in your code and click Go to Declaration.
ReSharper will open string.cs from the .NET Framework.
Scroll down to operator ==, right click it, and select Find Usages.
It takes a bit of time, but you'll get a nice list of usages, ordered in a tree view.
I tried this with ReSharper 6.1 in VS2010.
UPDATE
There is a simpler way to do this:
Select == in a string comparison
Right click on the selection and choose Find Usages Advanced
In the dialog, under Find, check only 'Usages' and set the scope to 'Solution' to filter out any references in other libraries.

My advice would be to write a very basic and specific code parser which traverses each scope in your system, recording all string/String variable declarations and detecting any == comparisons that use those variables.
Anyone with a deeper knowledge on code parsing is welcome to comment. I'm sure there are some classes/tools out there that one could use.
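One possible way to build such a parser without writing it from scratch is Roslyn. The following is only a sketch under assumptions: it needs the Microsoft.CodeAnalysis.CSharp NuGet package, it analyzes a single source string rather than a whole solution, and the class name StringEqualityFinder is made up for illustration. It reports every == or != expression where either operand is typed as string. Because it works on the syntax tree, comparisons inside lambdas are visited like any other expression.

```csharp
// Assumption: the Microsoft.CodeAnalysis.CSharp NuGet package is referenced.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class StringEqualityFinder
{
    // Returns the source text of each ==/!= expression with a string operand.
    public static List<string> Find(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // A compilation is needed so operand types can be resolved.
        CSharpCompilation compilation = CSharpCompilation.Create("Scan")
            .AddReferences(MetadataReference.CreateFromFile(typeof(object).Assembly.Location))
            .AddSyntaxTrees(tree);
        SemanticModel model = compilation.GetSemanticModel(tree);

        return tree.GetRoot()
            .DescendantNodes()
            .OfType<BinaryExpressionSyntax>()
            .Where(b => b.IsKind(SyntaxKind.EqualsExpression)
                     || b.IsKind(SyntaxKind.NotEqualsExpression))
            .Where(b => IsString(model, b.Left) || IsString(model, b.Right))
            .Select(b => b.ToString())
            .ToList();
    }

    // True when the expression's static type is System.String.
    private static bool IsString(SemanticModel model, ExpressionSyntax expr) =>
        model.GetTypeInfo(expr).Type?.SpecialType == SpecialType.System_String;
}
```

For a real codebase you would load the projects (with all their references) instead of a single string, otherwise types from other assemblies will not resolve.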

Prevent null forgiving operator usage

As part of the work we do maintaining a project, we typically end up discussing the null-forgiving operator ! in PR reviews, and we have been exploring ways to prevent its use altogether.
Ideally, I am trying to find a way to prevent its usage through .editorconfig by making it an error, but I have struggled to find any clear options. Is this possible?
If it isn't, are there any other known ways of enforcing this behaviour?
I had started to write my own Roslyn Analyzer as suggested in comments until a friend pointed me at this repository which has already provided it:
https://github.com/tom-englert/Nullable.Extended
It is available as an extension or as a NuGet package, which looks like exactly what we need.

In C#, what is the recommended way to establish whether or not a string is a real word?

Working in C# I have an array of strings. Some of these strings are real words, others are complete nonsense. My goal is to come up with a way of deciding which of these words are real and which are false.
I had planned to find some kind of word list online that I could bring into my project, turn into a list, and compare against, but of course typing in "C# dictionary" comes up with an unrelated topic! I don't need a 100% accuracy rate.
To formalize the question:
In C#, what is the recommended way to establish whether or not a string is a real word?
Advice and guidance is very much appreciated!
Solution
Thanks for the great answers, they were all very useful. As it happens, the thing to do was to ask the same question in different wording. Searching for C# spellcheck brought up some great links, and I ended up using NHunspell, which you can get through NuGet and which is very easy to use.
The problem is that "Dictionary" is a type within the framework. So, searching with that word will end up with all sorts of results. What you are basically wanting to do is Spell Check. This will determine if a word is valid or not.
Searching for C# spell check yielded some promising results. Searching for open source spell check also has some.
I have previously implemented one of the open source ones within a VB6 project. I think it was ASpell. I haven't had to use a spell check library within C#, but I'm sure there is one, or at least one with a .NET wrapper to make implementation easier.
If you have special case words that do not exist in the dictionary/word file for a spell check solution, you can add them.
To do this I would use a freely available dictionary for Linux (googling "linux dictionaries" should get you on the right track), read and parse the file, and store it in a C# System.Collections.Generic.HashSet collection. I would probably store everything as .ToUpper() or as .ToLower(), but this depends on your requirements.
You can then check if any arbitrary string is in the HashSet efficiently.
I don't know of any word list file included by default on Windows, but most Unix-like operating systems include a words file for this purpose. Someone has also posted a words file on GitHub, suggested for use in Windows projects. These files are simple lists of words, one per line.
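A minimal sketch of the word-list approach, assuming a plain text file with one word per line. The WordList class and its members are hypothetical names. Instead of normalizing with .ToUpper() or .ToLower(), this version gives the HashSet a case-insensitive comparer, which should have the same effect for lookups.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class WordList
{
    private readonly HashSet<string> _words;

    public WordList(IEnumerable<string> lines)
    {
        // Case-insensitive comparer, so "Apple" and "APPLE" hit the same entry.
        _words = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            string word = line.Trim();
            if (word.Length > 0)
            {
                _words.Add(word);
            }
        }
    }

    // Hypothetical helper: loads a one-word-per-line file from disk.
    public static WordList FromFile(string path) =>
        new WordList(File.ReadLines(path));

    // Average O(1) lookup; blank input is never a word.
    public bool IsWord(string candidate) =>
        !string.IsNullOrWhiteSpace(candidate) && _words.Contains(candidate.Trim());
}
```

Usage would look something like WordList.FromFile(pathToWordsFile).IsWord("apple"), where the path is whatever word file you chose.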

Compile Times: Additional Using Statements [duplicate]

I've always run Remove and Sort Usings as a matter of course, because it seems the right thing to do.
But just now I got to wondering: Why do we do this?
Certainly, there's always a benefit to clean & compact code.
And there must be some benefit if MS took the time to have it as a menu item in VS.
Can anyone answer: why do this?
What are the compile-time or run-time (or other) benefits from removing and/or sorting usings?
As @craig-w mentions, there's a very small compile-time performance improvement.
The way the compiler works is that when it encounters a type, it looks in the current namespace and then starts searching each namespace with a using directive in the order presented until it finds the type it's looking for.
There's an excellent writeup on this in the book CLR Via C# by Jeffrey Richter (http://www.amazon.com/CLR-via-4th-Developer-Reference/dp/0735667454/ref=sr_1_1?ie=UTF8&qid=1417806042&sr=8-1&keywords=clr+via+c%23)
As to why MS provided the menu option, I would imagine that enough internal developers were asking for it for the same reasons that you mention: cleaner, more concise code.
There's probably a teeny-tiny (i.e. minuscule/virtually unmeasurable) performance improvement during compilation because it doesn't have to search through namespaces that you aren't actually using for unqualified types. I do it because it's just neater and easier to read in the end. Also, I use the Productivity Power Tools and have them set to do the Remove and Sort when I save the file.

Find all source hardcoded strings

I need to move all the hard coded strings in my source code in .resx files. Is there a tool that could help me find all the hardcoded strings within C# code?
ReSharper 5 is an obvious choice, but several settings must be configured to achieve your goal:
Turn on solution-wide analysis.
Go to ReSharper|Options|Code Inspection|Inspection Severity|Potential Code Quality Issues and set Element is localizable to Show as error.
Go back to Solution Explorer and click on the project (csproj).
In the Properties panel, under the ReSharper category, set Localizable to Yes and Localizable Inspector to Pessimistic.
Then you can find almost all you need in Errors in Solution panel.
Hope this helps.
Or do a search based upon a regular expression like discussed here:
https://vosseburchttechblog.azurewebsites.net/index.php/2014/12/16/find-all-string-literals-in-c-code-files-but-not-the-ones-in-comments/
(?=(^((?!///).)*$)).*((".+?")|('.+?')).*
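As a rough sketch of how that expression could be applied outside an editor, the following runs it line by line and returns the 1-based numbers of the lines that match. LiteralFinder is a made-up name. Note that the pattern only skips lines containing ///, so ordinary // comments with quotes in them will still be reported.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LiteralFinder
{
    // The pattern from the linked post, written as a verbatim string
    // (each double quote is doubled).
    private static readonly Regex Pattern =
        new Regex(@"(?=(^((?!///).)*$)).*(("".+?"")|('.+?')).*");

    // Returns the 1-based line numbers that contain a quoted literal
    // and no /// documentation comment.
    public static List<int> FindLines(IEnumerable<string> lines)
    {
        var hits = new List<int>();
        int lineNumber = 0;
        foreach (string line in lines)
        {
            lineNumber++;
            if (Pattern.IsMatch(line))
            {
                hits.Add(lineNumber);
            }
        }
        return hits;
    }
}
```

Feeding it File.ReadLines for each .cs file gives a quick first pass that you can then review by hand.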
You could always do a search for the " sign in all the .cs files. That should get you to most of them, without too much noise.
This tool, http://visuallocalizer.codeplex.com/, allows batch-moving strings to resources, along with other features. It is FOSS, so maybe you can give it a try.
(I am involved)
ReSharper 5.0 (Beta) allows you to move strings to resources (it has a built-in Localization feature). Give it a try. The beta works fine; I use it every day and have no problems. Best of all, it's free until it is out of beta. I even recommend using the nightly builds, as they seem to be stable.
Software localization and globalization have always been tough and at times unwanted tasks for developers. ReSharper 5 greatly simplifies working with resources by providing a full stack of features for resx files and resource usages in C# and VB.NET code, as well as in ASP.NET and XAML markup.
Dedicated features include Move string to resource, Find usages of resource and other navigation actions. Combined with refactoring support, inspections and fixes, you get a convenient localization environment.
Some are found by FxCop. Not sure what its limits are, I think it depends on parameter and property names (eg: a property called "Text" is considered to be localized).
