Why ("abc"+char.MaxValue).CompareTo("abc")==0? - c#

I have a sorted array of strings.
Given a string that identifies a prefix, I perform two binary searches to find the first and last positions in the array that contain words that start with that prefix:
string [] words = {"aaa","abc","abcd","acd"};
string prefix = "abc";
int firstPosition = Array.BinarySearch<string>(words, prefix);
int lastPosition = Array.BinarySearch<string>(words, prefix + char.MaxValue);
if (firstPosition < 0)
firstPosition = ~firstPosition;
if (lastPosition < 0)
lastPosition = ~lastPosition;
Running this code I get firstPosition and lastPosition both equal to 1, while the right answer is to have lastPosition equal to 3 (i.e., pointing to the first non-matching word).
The BinarySearch method uses the CompareTo method to compare the objects and I have found that
("abc"+char.MaxValue).CompareTo("abc")==0
meaning that the two strings are considered equal!
If I change the code with
int lastPosition = Array.BinarySearch<string>(words, prefix + "z");
I get the right answer.
Moreover I have found that
("abc"+char.MaxValue)==("abc")
correctly (with respect to my needs) returns false.
Could you please help me by explaining the behavior of the CompareTo method?
I would like CompareTo to behave like ==, so that the BinarySearch method returns 3 for lastPosition.

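string.CompareTo() does a current-culture compare. Internally it uses StringComparer.CurrentCulture, whereas the string equality operator (==) does an ordinal compare, i.e. it compares the raw char values. A culture-sensitive compare gives some characters no weight at all and ignores them, which is evidently how char.MaxValue (U+FFFF) is treated here, so "abc" + char.MaxValue compares equal to "abc".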
For example, if the current-culture is "DE", you will get the same results with "ss" and "ß":
Console.WriteLine("ss".CompareTo("ß")); // => 0
Console.WriteLine("ss" == "ß"); // => false
What you want is an ordinal compare, which you will get by using StringComparer.Ordinal:
StringComparer.Ordinal.Compare("ss", "ß"); // => -108 (only the sign is guaranteed)
StringComparer.Ordinal.Compare("abc"+char.MaxValue, "abc"); // => 65535 (only the sign is guaranteed)

According to MSDN, string.CompareTo should not be used to check whether two strings are equal:
The CompareTo method was designed primarily for use in sorting or alphabetizing operations. It should not be used when the primary purpose of the method call is to determine whether two strings are equivalent. To determine whether two strings are equivalent, call the Equals method.
To get the behavior you wish, you could make use of the overload that accepts an IComparer<T>:
int lastPosition = Array.BinarySearch<string>(words, prefix + char.MaxValue,
StringComparer.Ordinal);
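This will return -4 for lastPosition, as the array contains no element equal to prefix + char.MaxValue. A negative result is the bitwise complement of the index where the value would be inserted, so -4 means position 3, and your existing lastPosition = ~lastPosition step turns it into the 3 you expect.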
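Putting the pieces above together, the prefix range lookup can be sketched as follows. This is a minimal sketch, not the only way to do it: the LowerBound helper and the PrefixRange class are names invented for illustration, and it assumes the array is sorted in ordinal order and contains no duplicates (BinarySearch does not promise to return the first of several equal elements).

```csharp
using System;

public static class PrefixRange
{
    // Hypothetical helper: index of `key` if present, otherwise the index
    // where it would be inserted. Assumes `sorted` is in ordinal order.
    public static int LowerBound(string[] sorted, string key)
    {
        int index = Array.BinarySearch<string>(sorted, key, StringComparer.Ordinal);
        return index < 0 ? ~index : index;
    }

    public static void Main()
    {
        string[] words = { "aaa", "abc", "abcd", "acd" };
        string prefix = "abc";

        int firstPosition = LowerBound(words, prefix);
        int lastPosition = LowerBound(words, prefix + char.MaxValue);

        Console.WriteLine(firstPosition); // 1
        Console.WriteLine(lastPosition);  // 3
    }
}
```

The words with the prefix are then the elements from firstPosition up to, but not including, lastPosition.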

Related

Why does the Parse function return 0?

I am new to C# programming and I recently ran into a problem which looks pretty basic. I store a string value like SV_1 in the variable lastServiceNo, split it using the Split function, and store the result in a string array called index. Basically, index[1] holds a numeric value, but as a string. Now I want to convert that string into an int. The following code behaves as expected until the Parse call is reached. I cannot understand why Parse returns 0 when index[1] has a numeric value in it. Can somebody point out the problem, please?
public string GenerateServiceNo() {
DataAccessLayer.DataAccessLayer dlObj= new DataAccessLayer.DataAccessLayer();
string lastServiceNo = dlObj.GetLastServiceNo();
string[] index = lastServiceNo.Split('_');
int lastIndex = int.Parse(index[1]);
return "SV_"+(lastIndex++).ToString();
}
int.Parse(string s) throws an exception if the number is too big to fit in an int or if the string "s" is not in the correct numerical format.
The format that this method accepts is "[ws][sign]number[ws]" where:
[ws] is optional white space
[sign] is optional for "+" or "-"
See the int.Parse documentation for the full reference.
That said, I can assure you that if int.Parse(index[1]) returns 0, then index[1] equals "[ws][sign]0[ws]" in the notation above.
However, looking at your code, note that lastIndex++ is a post-increment: the expression evaluates to the value before the increment, and the incremented local variable is never used afterwards, so the method returns the same number it read. Perhaps you meant that the returned number shouldn't be 0?
If that's the case then I believe this is what you're trying to achieve:
public string GenerateServiceNo()
{
DataAccessLayer.DataAccessLayer dlObj= new DataAccessLayer.DataAccessLayer();
string lastServiceNo = dlObj.GetLastServiceNo();
string[] index = lastServiceNo.Split('_');
int lastIndex = int.Parse(index[1]);
return string.Format("SV_{0}", ++lastIndex);
}
Assuming index[1] == "0", this method will now return "SV_1".
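To make the difference between the two increment forms concrete, here is a small sketch. The ServiceNumbers class and its Next helper are made up for illustration (they stand in for the asker's GenerateServiceNo without the data access layer), and the use of int.TryParse to reject malformed input is my own addition rather than something from the question.

```csharp
using System;

public static class ServiceNumbers
{
    // Hypothetical helper: takes the last service number (e.g. "SV_7")
    // and returns the next one ("SV_8").
    public static string Next(string lastServiceNo)
    {
        string[] parts = lastServiceNo.Split('_');
        if (parts.Length != 2 || !int.TryParse(parts[1], out int lastIndex))
            throw new FormatException("Expected SV_<number>, got: " + lastServiceNo);
        return "SV_" + (lastIndex + 1);
    }

    public static void Main()
    {
        int n = 7;
        Console.WriteLine(n++); // 7 (value before the increment; n is now 8)
        Console.WriteLine(++n); // 9 (value after the increment)

        Console.WriteLine(Next("SV_0")); // SV_1
        Console.WriteLine(Next("SV_7")); // SV_8
    }
}
```

Writing lastIndex + 1 sidesteps the pre/post-increment question entirely, since the local variable is not needed after the return.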

UPDATE Keyword Equivalence From Java to C#

I am converting code from Java to C#, but I am having trouble figuring out some keyword equivalences. I have looked around the web and can't find anything. Update: added number 3.
1) Does anyone know what C# uses for charAt()? Below is how I am trying to use it.
curr = tokens[i].charAt(0);
2) Also having issues converting isEmpty() to C# syntax.
if (par.isEmpty())
3) How should I convert this:
op2 = compute.pop().intValue();
Thanks!
1) Strings can have their characters accessed by using the [] operator:
curr = (tokens[i])[0];
2) IsEmpty becomes String.IsNullOrEmpty or String.IsNullOrWhiteSpace depending on what you want (the second is only available in .NET 4 and later).
3) From what research I could find, it looks like intValue deals with boxing/unboxing. If you stick with working with ints, you shouldn't need to worry about that in C#. "Pop" will work the same if you have a Stack collection. Hopefully that gives you enough to convert the line.
1) In C# a string exposes an indexer, so you can access a character as if it were an array of characters:
curr = tokens[i][0];
2) You can compare your string with string.Empty or use String.IsNullOrEmpty method to check whether a string is empty or not:
if( par == string.Empty )
OR:
if( string.IsNullOrEmpty(par) )
Assuming tokens[i] is a string, treat the string as an array of characters:
var firstCharacter = tokens[i][0];
Assuming par is also a string, the string.IsNullOrEmpty() method can help you test whether or not a particular string is empty:
if (string.IsNullOrEmpty(par))
{
}
If par is a Stack<String> as you've indicated, then you could test whether it was empty (has no elements) with a simple bit of LINQ:
if (!par.Any())
{
// par has no elements
}
Alternatively, you could use the Count property in the Stack class:
if (par.Count == 0)
{
// par has no elements
}
A different approach:
1) You can use curr = tokens[i].ElementAt(0); (this needs using System.Linq;). It returns the same result as charAt(0).
2) if( string.IsNullOrEmpty(par) ) will do the job.
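The three conversions discussed in the answers above can be seen side by side in one short sketch. The variable names come from the question, but their types are assumptions: tokens is taken to be a string[], par a Stack<string> and compute a Stack<int>, and the sample values are invented.

```csharp
using System;
using System.Collections.Generic;

public static class JavaToCSharp
{
    public static void Main()
    {
        // Assumed types and sample data; the question does not show the declarations.
        string[] tokens = { "+", "42" };
        var par = new Stack<string>();
        var compute = new Stack<int>();
        compute.Push(5);

        char curr = tokens[0][0];              // Java: tokens[0].charAt(0)
        bool parIsEmpty = par.Count == 0;      // Java: par.isEmpty() on a stack
        bool blank = string.IsNullOrEmpty(""); // Java: "".isEmpty() on a string
        int op2 = compute.Pop();               // Java: compute.pop().intValue()

        Console.WriteLine(curr);       // +
        Console.WriteLine(parIsEmpty); // True
        Console.WriteLine(blank);      // True
        Console.WriteLine(op2);        // 5
    }
}
```

With a Stack<int>, Pop() already returns an int, so nothing corresponding to intValue() is needed.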

How to get the value of a System.String object instead of returning "System.String[]"

I am working on a file parser, and this bit of code is not giving me what I want. Before I go any farther, I should mention that I did not write this program, I am only editing the source to fix this specific problem. Also, I can compile the code, so that is not a problem (you know how downloaded programs always have compile errors). Here's the code.
case EsfValueType.Binary4E: //System.String[]
{
int size = (int)(this.reader.ReadUInt32() - ((uint)this.reader.BaseStream.Position));
var strings = new string[size / 4];
for (int i = 0; i < size / 4; i++)
strings[i] = this.stringValuesUTF16[this.reader.ReadUInt32()];
esfValue.Value = strings.ToString();
break;
}
Now, I added the .ToString(); part to the above line, but it made no difference. The problem is that esfValue.Value ends up with System.String[] as its value, and I want the value of the System.String object. If you can make sense out of this and tell me what is wrong, it would be appreciated.
The program name is ESF Editor 1.4.8.0.
case EsfValueType.Binary4E: //System.String[]
{
int size = (int)(this.reader.ReadUInt32() - ((uint)this.reader.BaseStream.Position));
var strings = new StringBuilder();
for (int i = 0; i < size / 4; i++)
{
strings.Append(this.stringValuesUTF16[this.reader.ReadUInt32()]); //or AppendLine, depending on what you need
}
esfValue.Value = strings.ToString();
break;
}
The strings variable is an array of strings - the Array class does not override the default ToString() implementation which returns the type of the object.
You need to concatenate all the strings in the array - either looping and concatenating or using LINQ and assign the resulting string to esfValue.Value. Of course, this assumes you want the values all in one string, one after the other.
Your issue is that strings isn't a single string, it's an array of strings. As a result your call to ToString is calling Object.ToString(), which returns the type of the object.
Maybe you want something like
esfValue.Value = strings.Aggregate((acc, next) => acc + next);
which will simply concatenate all the strings together.
When you do a .ToString() on a class that doesn't override the .ToString() base method to return a custom string (which string[] doesn't), you're always going to get the type's namespace/class as the result.
Arrays, in and of themselves, don't have values. What value are you trying to get? Are you trying to join the array into a single, character-delimited string? If so, this would work:
esfValue.Value = string.Join(",", strings);
Just replace the , with whatever character you want to delimit the array with.
I think you just need to join the string values contained in the string array. In order to do so, you need to call String.Join and pass the string separator and the string array. It returns a single System.String.
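The difference between the calls mentioned in these answers is easy to see in isolation. This is a small sketch with made-up sample data; the JoinDemo class is only for illustration and is not part of the ESF Editor code.

```csharp
using System;

public static class JoinDemo
{
    public static void Main()
    {
        string[] strings = { "alpha", "beta", "gamma" };

        // Arrays do not override ToString(), so this prints the type name.
        Console.WriteLine(strings.ToString());        // System.String[]

        // Join inserts a separator between the elements.
        Console.WriteLine(string.Join(",", strings)); // alpha,beta,gamma

        // Concat simply appends the elements one after another.
        Console.WriteLine(string.Concat(strings));    // alphabetagamma
    }
}
```

Unlike Aggregate without a seed, both Join and Concat return an empty string for an empty array instead of throwing.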

How to Compare Values in Array

If you have a string of "1,2,3,1,5,7" you can put this in an array or hash table or whatever is deemed best.
How do you determine that all values are the same? In the above example the check would fail, but if you had "1,1,1" it would be true.
This can be done nicely using lambda expressions.
For an array, named arr:
var allSame = Array.TrueForAll(arr, x => x == arr[0]);
For a list (List<T>), named lst:
var allSame = lst.TrueForAll(x => x == lst[0]);
And for an iterable (IEnumerable<T>), named col:
var first = col.First();
var allSame = col.All(x => x == first);
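Note that the IEnumerable<T> version does not handle an empty sequence: First() throws when there is nothing to return. The array and list versions return true for empty input, because the predicate is never invoked. Adding an explicit empty check is trivial.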
Iterate through each value, store the first value in a variable and compare the rest of the array to that variable. The instant one fails, you know all the values are not the same.
How about something like...
string numArray = "1,1,1,1,1";
return numArray.Split(',').Distinct().Count() <= 1;
I think using List<T>.TrueForAll would be a slick approach.
http://msdn.microsoft.com/en-us/library/kdxe4x4w.aspx
Not as efficient as a simple loop (as it always processes all items even if the result could be determined sooner), but:
if (new HashSet<string>(numbers.Split(',')).Count == 1) ...
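The simple loop that several answers allude to might look like the sketch below. The AllSame helper is a name chosen for illustration; treating an empty sequence as "all the same" and comparing with EqualityComparer<T>.Default are assumptions, not requirements from the question.

```csharp
using System;
using System.Collections.Generic;

public static class AllSameDemo
{
    // Returns true when every element equals the first one.
    // Assumption: an empty sequence counts as "all the same".
    public static bool AllSame<T>(IEnumerable<T> items)
    {
        var comparer = EqualityComparer<T>.Default;
        using (var e = items.GetEnumerator())
        {
            if (!e.MoveNext()) return true;   // empty input
            T first = e.Current;
            while (e.MoveNext())
                if (!comparer.Equals(e.Current, first)) return false; // stop at the first mismatch
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(AllSame("1,2,3,1,5,7".Split(','))); // False
        Console.WriteLine(AllSame("1,1,1".Split(',')));       // True
        Console.WriteLine(AllSame(new string[0]));            // True
    }
}
```

Because it returns at the first mismatch, it avoids the full pass that the Distinct and HashSet versions always make.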

Fastest way to compare two lists

I have a List (Foo) and I want to see if it's equal to another List (foo). What is the fastest way?
From .NET 3.5 onwards you may use a LINQ function for this:
List<string> l1 = new List<string> {"Hello", "World","How","Are","You"};
List<string> l2 = new List<string> {"Hello","World","How","Are","You"};
Console.WriteLine(l1.SequenceEqual(l2));
There is also an overload that lets you provide your own IEqualityComparer<T>.
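As a brief illustration of that comparer overload, the sketch below compares two lists case-insensitively. The sample data and the choice of StringComparer.OrdinalIgnoreCase are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceEqualDemo
{
    public static void Main()
    {
        var l1 = new List<string> { "Hello", "World" };
        var l2 = new List<string> { "hello", "WORLD" };

        // Default comparer: strings are compared case-sensitively.
        Console.WriteLine(l1.SequenceEqual(l2)); // False

        // Custom comparer passed as the second argument.
        Console.WriteLine(l1.SequenceEqual(l2, StringComparer.OrdinalIgnoreCase)); // True
    }
}
```

SequenceEqual is order-sensitive: the same elements in a different order compare as not equal.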
Here are the steps I would do:
Do an object.ReferenceEquals() if true, then return true.
Check the count, if not the same, return false.
Compare the elements one by one.
Here are some suggestions for the method:
Base the implementation on ICollection. This gives you the count, but doesn't restrict to specific collection type or contained type.
You can implement the method as an extension method to ICollection.
You will need to use the .Equals() for comparing the elements of the list.
Something like this:
public static bool CompareLists(List<int> l1, List<int> l2)
{
if (l1 == l2) return true;
if (l1.Count != l2.Count) return false;
for (int i=0; i<l1.Count; i++)
if (l1[i] != l2[i]) return false;
return true;
}
Some additional error checking (e.g. null-checks) might be required.
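A generic variant with the null checks included could be sketched as an extension method. The names ListComparison and ContentEquals are invented for this example, and it is based on IList<T> rather than ICollection because it indexes into the lists.

```csharp
using System;
using System.Collections.Generic;

public static class ListComparison
{
    // Hypothetical extension method: true when both lists hold equal
    // elements in the same order (or both references are null/identical).
    public static bool ContentEquals<T>(this IList<T> a, IList<T> b)
    {
        if (ReferenceEquals(a, b)) return true;     // same instance, or both null
        if (a == null || b == null) return false;   // exactly one is null
        if (a.Count != b.Count) return false;       // cheap early exit

        var comparer = EqualityComparer<T>.Default; // uses Equals() of the elements
        for (int i = 0; i < a.Count; i++)
            if (!comparer.Equals(a[i], b[i])) return false;
        return true;
    }

    public static void Main()
    {
        var x = new List<int> { 1, 2, 3 };
        Console.WriteLine(x.ContentEquals(new List<int> { 1, 2, 3 })); // True
        Console.WriteLine(x.ContentEquals(new List<int> { 1, 2 }));    // False
        Console.WriteLine(x.ContentEquals(null));                      // False
    }
}
```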
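Something like this, maybe, using a match delegate (a Func that returns bool, since an Action cannot return a result):
public static bool CompareList<T>(IList<T> obj1, IList<T> obj2, Func<T,T,bool> match)
{
if (obj1.Count != obj2.Count) return false;
for (int i = 0; i < obj1.Count; i++)
{
// null entries in obj2 are skipped, as in the original sketch
if (obj2[i] != null && !match(obj1[i], obj2[i]))
return false;
}
return true;
}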
Assuming you mean that you want to know if the CONTENTS are equal (not just the list's object reference.)
If you will be doing the equality check much more often than inserts then you may find it more efficient to generate a hashcode each time a value is inserted and compare hashcodes when doing the equality check. Note that you should consider if order is important or just that the lists have identical contents in any order.
Unless you are comparing very often I think this would usually be a waste.
One shortcut, that I didn't see mentioned, is that if you know how the lists were created, you may be able to join them into strings and compare directly.
For example...
In my case, I wanted to prompt the user for a list of words. I wanted to make sure that each word started with a letter, but after that, it could contain letters, numbers, or underscores. I'm particularly concerned that users will use dashes or start with numbers.
I use regular expressions (in JavaScript, in my case) to break it into 2 lists, and then join them back together and compare them as strings:
var testList = userInput.match(/[-|\w]+/g);
/* the above catches common errors:
using a dash or starting with a numeral */
var listToUse = userInput.match(/[a-zA-Z]\w*/g);
if (listToUse.join(" ") != testList.join(" ")) {
return "the lists don't match";
}
Since I knew that neither list would contain spaces, and that the lists only contained simple strings, I could join them together with a space, and compare them.
