Compare Culture-Invariant and ignore Superscript - c#

A database with collation ...CI_AS makes no distinction between "²" and "2".
The C# default string comparer StringComparer.InvariantCultureIgnoreCase, on the other hand, does distinguish them.
So if I want to save an object to the database, there's a unique constraint error.
What is the common way to tell C# not to distinguish them either? (A custom comparer?)

Yes, a custom equality comparer (IEqualityComparer) should do the job. With it you can compare your strings in whatever way you need.
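A minimal sketch of such a comparer, assuming that Unicode compatibility normalization (FormKC, which maps "²" to "2") is close enough to what the database collation does. The class name is hypothetical, and FormKC also folds other compatibility characters (ligatures, full-width letters), so it is worth checking against the actual collation before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical comparer: ignores case and treats compatibility variants
// such as superscript digits as their plain counterparts.
public sealed class CompatibilityIgnoreCaseComparer : IEqualityComparer<string>
{
    // FormKC replaces compatibility characters, e.g. "²" (U+00B2) becomes "2".
    // Note: Normalize throws ArgumentException for invalid Unicode input.
    private static string Fold(string s) => s.Normalize(NormalizationForm.FormKC);

    public bool Equals(string x, string y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return StringComparer.InvariantCultureIgnoreCase.Equals(Fold(x), Fold(y));
    }

    // Must hash the folded form so that equal strings get equal hash codes.
    public int GetHashCode(string s)
    {
        if (s is null) throw new ArgumentNullException(nameof(s));
        return StringComparer.InvariantCultureIgnoreCase.GetHashCode(Fold(s));
    }
}
```

It can be passed wherever an IEqualityComparer<string> is accepted, for example new HashSet<string>(new CompatibilityIgnoreCaseComparer()), to catch duplicates before they reach the database.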

Related

How to use StepArgumentTransformation properly in Specflow(BDD)?

I encountered a scenario where I have to send an array of integers as a parameter from a SpecFlow feature file. I could have used tables, which I don't want to do because I would have to send it as row[] or col[]. If I pass the parameter as a string,
e.g.: Given set the value as '470,471,472,472'
and receive it and split it in the step definition file, how different is StepArgumentTransformation from that? Is there any other benefit in using step argument transformation? I understand we can convert XML, Date or any object. Why do we have to use StepArgumentTransformation?
I hope I understood the question correctly.
SpecFlow supports some automatic transformations out of the box, such as converting to Date, Double, int etc., and it does these by default because there is no ambiguity about them. You can easily convert a string to a double or a Date as you know the locale being used.
Why isn't converting to arrays supported? I suppose it could be, but there is some ambiguity. What should the list separator be? A comma? What about locales that use that as a separator between the whole and fractional part of a number?
So providing a default implementation of something which converted a list to int[] or IEnumerable<int> could be possible, but it's likely to just get some people asking why it doesn't work for them when they have used ☃ as a list separator.
It's better to leave the things with ambiguity to individuals to implement, rather than guess at the best implementation.
The StepArgumentTransformation you want is very simple to write and could be included in an external step assembly if you wanted to share it amongst many projects.
So to answer your many questions:
It's not really any different, it just encapsulates the conversion in a single place, which is good practice and is itself a benefit.
Yes, you can convert any object.
You don't have to use StepArgumentTransformation, and many people don't, but IMHO it makes your life much easier.
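For illustration, a transformation for the comma-separated case from the question might look like the sketch below. It assumes the SpecFlow NuGet package (TechTalk.SpecFlow namespace) is referenced. The class and method names are made up for this example, and the comma separator is this sketch's choice, not a SpecFlow default.

```csharp
using System.Globalization;
using System.Linq;
using TechTalk.SpecFlow;

[Binding]
public class IntArrayTransforms
{
    // Converts a captured step argument such as "470,471,472,472" into int[].
    // Parsing uses the invariant culture so the result does not depend on the
    // machine's locale.
    [StepArgumentTransformation]
    public int[] ToIntArray(string commaSeparated)
    {
        return commaSeparated
            .Split(',')
            .Select(part => int.Parse(part.Trim(), CultureInfo.InvariantCulture))
            .ToArray();
    }
}

[Binding]
public class ValueSteps
{
    // Hypothetical place to keep the parsed values for later steps.
    public int[] Values { get; private set; }

    // The step declares int[] directly; SpecFlow applies the transformation
    // to the text captured by the regex group.
    [Given(@"set the value as '(.*)'")]
    public void GivenSetTheValueAs(int[] values)
    {
        Values = values;
    }
}
```

Every step that takes an int[] parameter then gets the same parsing without repeating the Split call in each step definition.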

Fastest way to detect non-equal strings (without storing the string)?

I am writing a templating engine and I am searching for a good way to detect if a template has changed.
For this I have the following requirements (in order of importance):
non-equal strings are required to be detected different
as fast as possible
as little memory as possible (=> do not store the whole string for comparison)
high probability of detecting equal strings as equal
It is not a big problem if equal strings are sometimes not detected as equal, as this would just trigger a "re-rendering" which would not be needed, but because of the "heavy work" involved, this should happen as rarely as possible.
I first thought of using String.GetHashCode(), but the probability of getting the same hash code for two non-equal strings is pretty high.
Are there any good combinations, like checking hash code and Length, that bring the probability of two non-equal strings being wrongly detected as equal down to an unrealistically low number?
Or is using some hashing algorithm, like MD5 or SHA, a good alternative (after hash-code is equal)?
My rendering looks something like the following:
public string RenderTemplate(string name, string template)
{
    var cachedTemplate = Cache.Get(name);
    if (cachedTemplate == null || !cachedTemplate.Equals(template)) // <= Equals
    {
        cachedTemplate = new Template(name, template);
        cachedTemplate.Render();
        Cache.Set(name, cachedTemplate);
    }
    return cachedTemplate.Result;
}
The Equals is the point I am asking about.
I am also open to other suggestions on how this could be solved.
UPDATE:
To add some numbers to get more context:
I expect to have >1000 individual templates and each template will have up to at least a few thousand characters.
This is why I would like to avoid storing the whole template-string "in memory" only for the comparison.
Most of the templates are stored in the DB.
UPDATE 2:
What do you think about extending my RenderTemplate method with a timestamp as suggested by Nikola:
public string RenderTemplate(string name, string template, DateTime timestamp)
Then I could compare name, GetHashCode and timestamp which does not need much memory, should be pretty fast and the probability of a "wrongly detected equality" is practically 0. The timestamp I can read from the DB (have it already there) or the "last changed date" from the file-system for a file-based template.
You don't have much choice. If you don't compare the strings by their content, use a hash algorithm to determine whether they are equal. Personally, I would probably use a hash algorithm. If you are a bit paranoid and afraid of a collision, choose an algorithm with the widest output space (e.g. SHA512).
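As a rough sketch of the hashing idea, the class below keeps only a SHA-512 digest per template name instead of the template text. The class and member names are hypothetical. Note that a hash cannot strictly guarantee the first requirement: a collision between two different templates is astronomically unlikely with SHA-512, but not impossible.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: remembers one digest per template name.
public sealed class TemplateChangeDetector
{
    private readonly Dictionary<string, string> _hashes = new Dictionary<string, string>();

    // Returns the Base64-encoded SHA-512 digest of the UTF-8 bytes of the template.
    public static string ComputeHash(string template)
    {
        using (var sha = SHA512.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(template));
            return Convert.ToBase64String(digest);
        }
    }

    // True if the template is new or its digest differs from the stored one.
    // The stored digest is updated as a side effect.
    public bool HasChanged(string name, string template)
    {
        string hash = ComputeHash(template);
        if (_hashes.TryGetValue(name, out string known) && known == hash)
        {
            return false;
        }
        _hashes[name] = hash;
        return true;
    }
}
```

In RenderTemplate, the Equals check could then be replaced by a call to HasChanged(name, template). Hashing still reads the whole string on every call, so it saves memory rather than time.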
Why do you need to compare strings to determine that a template has changed? Why not use a different approach?
If the file is stored on disk, why not use a file watcher?
If it is stored in a database, why not use a timestamp to detect when it was saved?
If the application is restarted, reload the templates anyway.
Also, it's worrying that a UI template changes so often that you must make checks like this. I think you have more design problems besides comparing strings.

Keys.PageDown.ToString() returns Next

I've read a few articles on this issue. Basically PageDown and PageUp are linked to Next and Prior respectively, for backwards compatibility. The problem with this is there's no reliable way to get the wanted values out (at least none that I can see).
See here for a good explanation. Quite old though, I thought something might have been done to address this by now.
At present there are two options I can see;
Enum.GetNames(typeof (Keys)).GetValue(e.KeyValue);
This returns "Prior" for "PageUp" but "PageDown" for "PageDown".
e.KeyCode.ToString();
This returns "PageUp" for "PageUp" but "Next" for "PageDown".
I could handle it manually, but what if there's another instance like this?
Does anyone have a better solution?
Perhaps the best thing to do is to create a lookup table to translate the enum values.
You could implement the lookup table with a dictionary to map the enum values onto strings, and if the dictionary doesn't contain the enum value fall back to Enum.ToString() to get the value. That way you only need to add the exceptions (such as PageUp and PageDown) to the dictionary.
(Note that if you are displaying these strings to the user and you want to internationalize the strings you will probably need to add translated entries for most of the strings.)
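A sketch of that lookup table follows. To keep it self-contained it uses a small stand-in enum, DemoKeys, that mirrors the aliasing in System.Windows.Forms.Keys (Prior and PageUp share 33, Next and PageDown share 34). In a real application you would key the dictionary on Keys itself. All names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for System.Windows.Forms.Keys with the same aliased values.
public enum DemoKeys
{
    Prior = 33,
    PageUp = 33,
    Next = 34,
    PageDown = 34,
    Home = 36
}

public static class KeyNames
{
    // Only the exceptions are listed. Aliases share a value, so an entry for
    // PageDown also covers Next, and an entry for PageUp also covers Prior.
    private static readonly Dictionary<DemoKeys, string> Overrides =
        new Dictionary<DemoKeys, string>
        {
            { DemoKeys.PageUp, "PageUp" },
            { DemoKeys.PageDown, "PageDown" }
        };

    // Uses the override if there is one, otherwise falls back to ToString().
    public static string GetName(DemoKeys key)
    {
        return Overrides.TryGetValue(key, out string name) ? name : key.ToString();
    }
}
```

Because the dictionary is keyed on the enum value rather than its name, it does not matter which alias the caller passes in.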

Why do bool.TrueString and bool.FalseString exist?

I was reading the MSDN article of the boolean structure, when I saw that a boolean has two fields: TrueString and FalseString. These respectively return "True" and "False".
After some searching, the only example I could find is in this dotnetperls article. The article states:
Programs often need these strings. TrueString and FalseString are a useful pair of readonly members. They represent truth values in string format. They provide indirection and abstraction over directly using string literals.
So apparently it's useful for some situations. But the same article fails to provide a realistic example (IMHO anyway).
Some further reading also brought this to my attention: TrueString and FalseString are public static readonly fields. And this dotnetperls article states:
The language specification recommends using public static readonly fields ... when the field is subject to change in the future.
Now this I can somewhat understand. If the .NET developers ever decide to change "True" and "False" to "OkeyDokey" and "Negatory" respectively, it's smart to use TrueString and/or FalseString.
But that still leaves me with the question: in what kind of scenario do you want to compare a string with the string literal of a boolean? Because apparently: "Programs often need" them.
For the same reason that string.Empty exists. Some people prefer a semantically named value in code over a literal one.
In modern .NET (anything after .NET Framework) the following code prints True three times:
Console.WriteLine(ReferenceEquals("True", bool.TrueString));
Console.WriteLine(ReferenceEquals("False", bool.FalseString));
Console.WriteLine(ReferenceEquals("", string.Empty));
This tells us there is zero runtime difference between the literals and the fields. They are exactly the same object at runtime.
Try this for yourself on sharplab.io here.
Others have mentioned using it to compare with when parsing boolean strings, but I would not recommend that. If you want to convert a string to a bool, use bool.TryParse or bool.Parse. Using == does a case-sensitive comparison, which is probably not what you want. Furthermore, the framework's methods are optimised specifically for common cases. You can see these optimisations in the code on GitHub here: https://github.com/dotnet/runtime/blob/f8fa9f6d1554e8db291187dd7b2847162703381e/src/libraries/System.Private.CoreLib/src/System/Boolean.cs#L226
If the program stores data in a human-readable file or database, it may need to store values as strings. When you read the data back in, if you know the data was written by your application and uses a standard string representation, you can compare x == bool.TrueString faster than you can call bool.TryParse(x ...). You could also validate the data by checking that every value satisfies x == bool.TrueString || x == bool.FalseString.
If the data was typed by humans, or a different system, TryParse is a better option, as it accepts more values as true and differentiates between a definite false and an invalid input. (MSDN Boolean TryParse)
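The difference between the two approaches can be seen in a small sketch (the class and method names are made up for this example). The strict check only accepts the exact canonical string, while bool.TryParse ignores case and surrounding whitespace and reports invalid input separately.

```csharp
using System;

public static class BoolTextDemo
{
    // Strict: true only for the exact canonical string "True".
    public static bool IsCanonicalTrue(string text)
    {
        return text == bool.TrueString;
    }

    // Lenient: returns true/false for any text TryParse accepts,
    // and null when the text is not a valid boolean at all.
    public static bool? ParseLenient(string text)
    {
        return bool.TryParse(text, out bool value) ? value : (bool?)null;
    }
}
```

For example, IsCanonicalTrue("true") is false because the comparison is case-sensitive, whereas ParseLenient(" TRUE ") yields true and ParseLenient("yes") yields null.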
In simple terms: Boolean is a structure, and it exposes a ToString() method that returns human-readable text. So if you write something like
bool b = false;
b.ToString();
the output will be "False" instead of 0. "False" is readable by humans and easy to recognize.
You may also want to parse a text value into a boolean value, so strings can represent boolean values as well. For example, we use
Boolean.TryParse("false", out mybool)
and TryParse sets mybool to false, which shows that we can read boolean values from strings too.
It can be used as a default value for missing "stringly-typed" configuration parameters. Here's a concrete example I've recently used:
if (bool.Parse(ConfigurationManager.AppSettings["IsTestMode"] ?? bool.FalseString)) ...
...which is - in my humble opinion - simpler and more readable than
var isTestModeString = ConfigurationManager.AppSettings["IsTestMode"];
if (isTestModeString != null && bool.Parse(isTestModeString)) ...
(I deliberately do not use TryParse here, since I do not want to silently ignore invalid values. I want an exception to be thrown, if the configuration value is present and something other than True or False.)
There are many situations where you may need to compare if a string is equal to "True", such as checking an API response. Note: it's more efficient to compare strings but often safer to parse.
The only advantage of using the built-in fields is that you won't make typos (assuming you have IntelliSense) and you don't have to remember the casing (e.g. "true" instead of "True").

Efficient(?) string comparison

What could possibly be the reasons to use -
bool result = String.Compare(fieldStr, "PIN", true).Equals(0);
instead of,
bool result = String.Equals(fieldStr, "PIN", StringComparison.CurrentCultureIgnoreCase);
or, even simpler -
bool result = fieldStr.Equals("PIN", StringComparison.CurrentCultureIgnoreCase);
for comparing two strings in .NET with C#?
I've been assigned to a project with a large code-base that makes abundant use of the first one for simple equality comparison. I haven't yet found any reason why those senior guys used that approach, and not something simpler like the second or the third one. Is there any performance issue with the Equals (static or instance) method? Or is there any specific benefit of using the String.Compare method that even outweighs the extra operation of the trailing .Equals(0)?
I can't give immediate examples, but I suspect there are cases where the first would return true, but the second would return false. Two values may be equal in terms of sort order, while still being distinct even under case-ignoring rules. For example, one culture may decide not to treat accents as important while sorting, but still view two strings differing only in accented characters as unequal. (Or it's possible that the reverse may be true - that two strings may be considered equal, but one comes logically before the other.)
If you're basically interested in the sort order rather than equality, then using Compare makes sense. It also potentially makes sense if the code is going to be translated - e.g. for LINQ - and that overload of Compare is supported but that overload of Equals isn't.
I'll try to come up with an example where they differ. I would certainly say it's rare. EDIT: No luck so far, having tried accents, Eszett, Turkish "I" handling, and different kinds of spaces. That's a long way from saying it can't happen though.
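A small probe along those lines could look like the sketch below (the class and method names are hypothetical). It runs both forms from the question on a pair of strings and reports whether they agree. Both calls depend on the current culture, so the results can vary between machines.

```csharp
using System;

public static class ComparisonProbe
{
    // First form from the question: culture-sensitive Compare, ignoring case.
    public static bool ViaCompare(string a, string b)
    {
        return String.Compare(a, b, true).Equals(0);
    }

    // Second form from the question.
    public static bool ViaEquals(string a, string b)
    {
        return String.Equals(a, b, StringComparison.CurrentCultureIgnoreCase);
    }

    // Describes both results for one pair so disagreements are easy to spot.
    public static string Report(string a, string b)
    {
        bool compare = ViaCompare(a, b);
        bool equals = ViaEquals(a, b);
        return $"'{a}' vs '{b}': Compare={compare}, Equals={equals}, agree={compare == equals}";
    }
}
```

Calling Console.WriteLine(ComparisonProbe.Report("strasse", "straße")) for each candidate pair (accents, Eszett, Turkish "I", different kinds of spaces) makes it quick to check a pair under whatever culture the thread is running with.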
