Numeric check of string not always working as expected - c#

I have an issue that I cannot resolve myself. I am reading string values from a CSV file with the default encoding (UTF-8, as far as I know). I use the following method to determine whether a string contains only digits:
private static bool IsDigitsOnly(string str)
{
    return str.All(Char.IsDigit);
}
In almost all cases it works fine, but it returns 'false' when the input string is one of the following:
726849004
704152104
779450251
779459121
346751902
779459111
779459115
779428100
726852040
I tried another approach by changing the method to this:
str.All(c => c >= '0' && c <= '9');
It behaves the same way in the cases above.
When I debug, the string values look correct (no leading or trailing whitespace, stray characters, or anything else).
Can someone help me out here?
Thank you in advance,
Thomas.

I tried to solve this and found that there is a suspicious character at the end of each string except the 3rd and 4th. See my test app screenshot:
This character is causing the error. Its ASCII value is 31, listed as Unit Separator. You need to find out why it is there.
I know that I am not answering your question, but I am trying to give a pointer towards solving this interesting problem.

Replace
return str.All(char.IsDigit);
with
return str.All(c =>
char.IsDigit(c));
and set a breakpoint on the second line, then debug to inspect which characters are being passed and hopefully find the one which isn't a digit.
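If stepping through in the debugger is tedious, a small helper can list the offending characters instead. This is only a sketch: the names DigitCheck, DescribeNonDigits and IsDigitsOnlyIgnoringControlChars are made up for illustration, and it assumes the stray characters are control characters such as the U+001F mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DigitCheck
{
    // Lists every character that is not an ASCII digit, with its index,
    // code point and Unicode category, so invisible characters show up.
    public static List<string> DescribeNonDigits(string str)
    {
        var findings = new List<string>();
        for (int i = 0; i < str.Length; i++)
        {
            char c = str[i];
            if (c < '0' || c > '9')
            {
                findings.Add(string.Format("index {0}: U+{1:X4} ({2})",
                    i, (int)c, char.GetUnicodeCategory(c)));
            }
        }
        return findings;
    }

    // Assumption: control characters are noise from the file and can be dropped.
    // Unlike the original All(...) check, this returns false for an empty result.
    public static bool IsDigitsOnlyIgnoringControlChars(string str)
    {
        string cleaned = new string(str.Where(c => !char.IsControl(c)).ToArray());
        return cleaned.Length > 0 && cleaned.All(c => c >= '0' && c <= '9');
    }
}
```

Calling DigitCheck.DescribeNonDigits on one of the failing values should point at the trailing character. Whether it is safe to strip it depends on how the CSV file was produced.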


string.IsNullOrEmpty & string.IsNullOrWhiteSpace return false for empty string

I have run into a curious case in a block of code designed to weed out blank strings, periods, dashes and the like after a paragraph of text is processed by the MSFT Azure Phrase Breaker. I could use help figuring out how to debug the issue.
The following block of code returns true when given what appears to be "". The expectation is that the method returns false at the first if statement. Out of 899 phrases, only two seem to have this problem. It happens on another machine as well.
public static bool IsSentenceTranslatable(string sentence)
{
    string trimmedSentence = sentence.Trim();
    if (string.IsNullOrEmpty(trimmedSentence) || string.IsNullOrWhiteSpace(trimmedSentence))
    {
        return false;
    }
    switch (trimmedSentence)
    {
        case " ":
        case ".":
        case "-":
        case "·":
            return false;
    }
    return true;
}
Here is a snapshot of the debugger.
Could it be a bug in Visual Studio or .NET? I tried using the various visualizers, but still couldn't see anything in the box. .NET 4.5 C# 7.3
Try to get the string's byte representation. I suspect that it actually contains some character which is invisible in VS debugger but doesn't count as a whitespace.
See these questions for hints:
Invisible characters - ASCII
Converting string to byte array in C#
UPD: since your Watch window shows that after the call string trimmedSentence = sentence.Trim() you have trimmedSentence.Length == 1, I'd upgrade my suspicion to certainty.
As stated in my comment, the screenshot shows that trimmedSentence.Length is 1, so the string is not empty, and its content is definitely not a standard space. If the string appears empty, it is because it contains one of those so-called invisible characters. To check what your string holds, you can access that character directly with trimmedSentence[0].
If that character will appear often, you might want to consider doing this:
string trimmedSentence = sentence.Trim().Replace("<this special character>", "");
Alternatively, you can create that replacement string from the Unicode code point with Char.ConvertFromUtf32(yourCharCode), which already returns a string. You cannot use the Replace overload that takes char parameters, as there is no "empty" character, only an empty string. You should be able to get the code point while debugging, or if necessary by evaluating (int)trimmedSentence[0].
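If you would rather not hard-code one specific character, a more general sketch is to drop everything in the Control and Format Unicode categories before the blank check. This assumes the offending character falls into one of those categories (U+200B ZERO WIDTH SPACE is used here purely as an example; the question does not say which character it was). The names InvisibleChars, StripInvisible and IsBlankAfterCleanup are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class InvisibleChars
{
    // Removes Control and Format characters. Note that this also removes
    // tabs and line breaks, so use it for checks rather than for display text.
    public static string StripInvisible(string text)
    {
        return new string(text.Where(c =>
        {
            UnicodeCategory category = char.GetUnicodeCategory(c);
            return category != UnicodeCategory.Control
                && category != UnicodeCategory.Format;
        }).ToArray());
    }

    // True when nothing visible is left after the cleanup.
    public static bool IsBlankAfterCleanup(string sentence)
    {
        return string.IsNullOrWhiteSpace(StripInvisible(sentence ?? string.Empty));
    }
}
```

In IsSentenceTranslatable, the first if statement could then test InvisibleChars.IsBlankAfterCleanup(sentence) instead of the two string.IsNull... calls.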

Two string values are the same but CompareTo doesn't return 0

I am creating a WPF application, and in some cases I compare two string values. The local value comes from a RichTextBox, whose content comes from a Word document. I tried every solution on this site, but nothing changed: the comparison still returns false. I tried:
Replacing character 160 (non-breaking space) with character 32 (space) via linkedWord.Replace((char)160, (char)32);
Using string.Compare:
String.Compare(wr.Orthography, linkedWord, StringComparison.OrdinalIgnoreCase) == 0
Encoding both strings to byte arrays and using SequenceEqual
and more, but I cannot find a solution.
Please help me solve this problem.
The value that comes from the RichTextBox:
The value that comes from the database:
EDIT:
The result of the Compare method is -1.
The reason is probably that Cyrillic ә (U+04D9) and Latin ə (U+0259) are different characters, even though they look the same.
Check each character for equality. Below you can see the difference:
foreach (char c in "bәse")
Console.Write(((int)c).ToString("0000"));
Console.WriteLine("\n--------------------");
foreach (char c in "bəse")
Console.Write(((int)c).ToString("0000"));
Console.WriteLine("\n--------------------");
Console.WriteLine(("bәse"=="bəse").ToString());
Output
0098124101150101
--------------------
0098060101150101
--------------------
False
In this case you should replace the Cyrillic characters with their Latin counterparts.
There also appear to be libraries that can be used for this.
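A minimal sketch of such a replacement, assuming you only need to handle the schwa pair from this question. The Homoglyphs class and its mapping table are hypothetical and deliberately tiny; extend the table with whatever look-alike pairs occur in your data.

```csharp
using System.Collections.Generic;
using System.Text;

public static class Homoglyphs
{
    // Illustrative mapping only: Cyrillic look-alikes and their Latin counterparts.
    private static readonly Dictionary<char, char> CyrillicToLatin =
        new Dictionary<char, char>
        {
            { '\u04D9', '\u0259' }, // CYRILLIC SMALL LETTER SCHWA -> LATIN SMALL LETTER SCHWA
            { '\u04D8', '\u018F' }  // CYRILLIC CAPITAL LETTER SCHWA -> LATIN CAPITAL LETTER SCHWA
        };

    // Returns a copy of the text with every mapped character replaced.
    public static string ToLatin(string text)
    {
        var result = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            char mapped;
            result.Append(CyrillicToLatin.TryGetValue(c, out mapped) ? mapped : c);
        }
        return result.ToString();
    }
}
```

Apply Homoglyphs.ToLatin to both values before comparing them, for example to wr.Orthography and linkedWord from the question.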

validate excel worksheet name

I'm getting the error below when setting the worksheet name dynamically. Does anyone have a regex to validate the name before setting it?
The name that you type does not exceed 31 characters. The name does
not contain any of the following characters: : \ / ? * [ or ]
You did not leave the name blank.
You can use the following method to check whether the sheet name is valid:
private bool IsSheetNameValid(string sheetName)
{
    if (string.IsNullOrEmpty(sheetName))
    {
        return false;
    }
    if (sheetName.Length > 31)
    {
        return false;
    }
    char[] invalidChars = new char[] {':', '\\', '/', '?', '*', '[', ']'};
    if (invalidChars.Any(sheetName.Contains))
    {
        return false;
    }
    return true;
}
To do worksheet validation for those specified invalid characters using Regex, you can use something like this:
string wsName = @"worksheetName"; // verbatim string, so backslashes are taken literally
// Null/empty and length are checked first so that Regex never receives a null input.
bool nameIsValid = !string.IsNullOrEmpty(wsName)
    && wsName.Length <= 31
    && !Regex.IsMatch(wsName, @"[:\\/?*\[\]]");
This also includes a check to see whether the worksheet name is null or empty, or longer than 31 characters. Those two checks aren't done via Regex for the sake of simplicity and to avoid over-engineering this problem.
Let's match the start of the string, then between 1 and 31 things that aren't on the forbidden list, then the end of the string. Requiring at least one means we refuse empty strings:
^[^:\/\\\?\*\[\]]{1,31}$
There's at least one nuance that this regex will miss: this will accept a sequence of spaces, tabs and newlines, which will be a problem if that is considered to be blank (as it probably is).
If you take the length check out of the regex, then you can get the blankness check by doing something like:
^[^:\/\\\?\*\[\]]*[^\s:\/\\\?\*\[\]][^:\/\\\?\*\[\]]*$
How does that work? If we defined our class above as WORKSHEET, that would be:
^[^WORKSHEET]*[^\sWORKSHEET][^WORKSHEET]*$
So we match zero or more non-forbidden characters, then a character that is neither forbidden nor whitespace, then zero or more non-forbidden characters. The key is that we demand at least one non-whitespace character in the middle section.
But we've lost the length check. It's hard to do both the length check and the regex in one expression. In order to count, we have to phrase things in terms of matching n times, and the things being matched have to be known to be of length 1. But in order to allow whitespace to be placed freely - as long as it's not all whitespace - we need to have a part of the match that is not necessarily of length 1.
Well, that's not quite true. At this point this starts to become a really bad idea, but nevertheless: onwards, into the breach! (for educational purposes only)
Instead of using * for the possibly-blank sections, we can bound the length of each, and include all the possible ways for those three sections to add up to at most 31. The middle section always has length 1, so the other two share the remaining 30. How many ways are there to split 30 between two numbers? There are 31 of them: 0+30, 1+29, 2+28, ... 30+0:
^[^WORKSHEET]{0,0}[^\sWORKSHEET][^WORKSHEET]{0,30}$
|^[^WORKSHEET]{0,1}[^\sWORKSHEET][^WORKSHEET]{0,29}$
|^[^WORKSHEET]{0,2}[^\sWORKSHEET][^WORKSHEET]{0,28}$
....
|^[^WORKSHEET]{0,30}[^\sWORKSHEET][^WORKSHEET]{0,0}$
Obviously, if this were a good idea, you'd write a program that generates that expression rather than specifying it all by hand (and getting something wrong). But I don't think I need to tell you it's not a good idea. It is, however, the only answer I have to your question.
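For educational purposes only, here is a rough sketch of what such a generator could look like. The names SheetNameRegex, Forbidden and Build are made up for illustration. It uses bounded ranges of the form {0,k} for the outer sections, so that names shorter than 31 characters match as well.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class SheetNameRegex
{
    // The forbidden characters : \ / ? * [ ] written for use inside a character class.
    private const string Forbidden = @":\\/?*\[\]";

    // Builds one alternative per possible position of the mandatory
    // non-whitespace character, so the total length never exceeds maxLength.
    public static string Build(int maxLength = 31)
    {
        var alternatives = Enumerable.Range(0, maxLength).Select(before =>
            "[^" + Forbidden + "]{0," + before + "}" +
            "[^\\s" + Forbidden + "]" +
            "[^" + Forbidden + "]{0," + (maxLength - 1 - before) + "}");
        return "^(?:" + string.Join("|", alternatives) + ")$";
    }

    public static bool IsValid(string name)
    {
        return name != null && Regex.IsMatch(name, Build());
    }
}
```

The generated pattern is long and hard to read, which rather supports the point that encoding the conditions directly is the better approach.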
While admittedly not actually answering your question, I think @HatSoft has the right approach, encoding the conditions directly and clearly. After all, I'm now satisfied that an answer to your question as asked is not actually a helpful thing.
You might want to do a check for the name History as this is a reserved sheet name in Excel.
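Something like this?
// requires System.IO and System.Linq
public string validate(string name)
{
    // On Windows, GetInvalidFileNameChars covers : \ / ? * but not the brackets,
    // so '[' and ']' are appended explicitly.
    foreach (char c in Path.GetInvalidFileNameChars().Concat(new[] { '[', ']' }))
        name = name.Replace(c.ToString(), "");
    if (name.Length > 31)
        name = name.Substring(0, 31);
    return name;
}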

How to Compare localized strings in C#

I am localizing an ASP.NET web application. When the user enters a localized string "XXXX", I compare that string with a value in my localized resource file.
Example:
if ( txtCalender.Text == Resources.START_NOW)
{
//do something
}
But when I do that, the comparison returns false even when the two localized strings are equal, i.e.
txtCalender.Text ="இப்போது தொடங்க"
Resources.START_NOW="இப்போது தொடங்க"
This is localized for Tamil.
Please help.
Use one of the string.Equals overloads that takes a StringComparison value. This allows you to use the current culture for the comparison.
if ( txtCalender.Text.Equals(Resources.START_NOW, StringComparison.CurrentCulture))
{
//do something
}
Or, if you want case insensitive comparison:
if ( txtCalender.Text.Equals(Resources.START_NOW,
StringComparison.CurrentCultureIgnoreCase))
{
//do something
}
I found the answer and it works. Here is the solution.
It was not working when I tried from the Chrome browser, but it worked with Firefox. When I converted both strings to char arrays, txtCalender.Text returned 40 characters and Resources.START_NOW returned 46. So I tried normalizing the strings with the Normalize() method:
if(txtCalender.Text.Normalize() == Resources.START_NOW.Normalize())
Without normalization, one character was being represented as two separate characters.
With it, the comparison works fine. Thanks for your answers.
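To illustrate why the lengths can differ, here is a small sketch using a Latin example rather than the Tamil text from the question: "é" can be stored as one code point (U+00E9) or as "e" followed by a combining accent (U+0301). The NormalizedCompare class is a hypothetical helper, and it assumes Unicode Form C is the form you want, which is also what Normalize() uses by default.

```csharp
using System;
using System.Text;

public static class NormalizedCompare
{
    // Compares two strings after bringing both into the same normalization form.
    public static bool EqualsNormalized(string left, string right)
    {
        return string.Equals(
            left.Normalize(NormalizationForm.FormC),
            right.Normalize(NormalizationForm.FormC),
            StringComparison.Ordinal);
    }
}
```

With this helper, NormalizedCompare.EqualsNormalized("\u00E9", "e\u0301") is true, while a plain == on the same two strings is false because their char sequences differ.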
You can compare with InvariantCulture in String.Equals (static method):
String.Equals("XXX", "XXX", StringComparison.InvariantCulture);
Not sure whether this helps though, could others comment on it? I've never come across your actual error.
Use String.Equals or String.Compare.
There are some differences between these two. Whether a method is static or an instance method does not make it faster; what matters is the kind of comparison performed. An ordinal String.Equals is typically the cheapest option, while String.Compare with its default culture-sensitive rules does more work.
String.Equals returns a boolean. String.Compare returns 0 when the strings are equal; otherwise it returns a negative or positive number depending on whether the first string sorts before (less) or after (greater) the second string. Therefore, use String.Equals when you only need to know whether they are the same, and String.Compare when you need to make a decision based on more than equality.
You probably need to use .Equals
if(txtCalender.Text.Equals(Resources.START_NOW))
{ //...
And if case-insensitive comparison is what you're after (often is) use StringComparison.OrdinalIgnoreCase as the second argument to the .Equals call.
If this isn't working - then can I suggest you breakpoint the line and check the actual value of Resources.START_NOW - the only possible reason why this equality comparison would fail is if the two strings really aren't the same. So my guess is that your culture management isn't working properly.

Frustrated trying to read a path from an argument in C#

I'm passing /file:c:\myfile.doc and I'm getting back "/file:c:\myfile.doc" instead of "C:\myfile.doc". Could someone please advise where I am going wrong?
if (entry.ToUpper().IndexOf("FILE") != -1)
{
    //override default log location
    MyFileLocation = entry.Split(new char[] {'='})[1];
}
You are splitting on "=" instead of ":".
Try
if (entry.ToUpper().IndexOf("/FILE:") == 0)
{
    //override default log location
    MyFileLocation = entry.Split(new char[] {':'}, 2)[1];
}
The easiest way to do this is to just take a substring. Since you are reading this from the command line, the "/file:" portion will always be consistent.
entry.Substring(6);
This will return everything after the "/file:".
Not an answer, as I think it's been answered well enough already, but since you stated that you're a beginner I thought I would point out that:
entry.Split(new char[]{':'});
can be:
entry.Split(':');
This uses:
Split(params char[] separator);
This can be surprising for new C# programmers: the params keyword means that you can pass any number of chars as separate arguments, as in:
entry.Split(':','.',' ');
You could also just lop off the 'file:' part. It is clearly defined and will be constant so it isn't THAT bad. Not great, but not horrible.
Here is a good example of a command line argument parser.
The code you've posted would require the argument /file=c:\myfile.doc.
Either use that as the parameter or split on the colon (:) instead of equals (=).
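Putting the suggestions above together, a minimal sketch could look like this. ArgParser and TryGetFilePath are hypothetical names, and the sketch assumes the switch is always written as "/file:" (in any letter case) at the start of the argument.

```csharp
using System;

public static class ArgParser
{
    private const string FilePrefix = "/file:";

    // Returns everything after "/file:", or null when the argument
    // is not a /file: switch.
    public static string TryGetFilePath(string entry)
    {
        if (entry != null &&
            entry.StartsWith(FilePrefix, StringComparison.OrdinalIgnoreCase))
        {
            return entry.Substring(FilePrefix.Length);
        }
        return null;
    }
}
```

Because only the prefix is removed, the colon in the drive letter is left alone, which avoids the problem of splitting c:\myfile.doc on ":" a second time.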
