using System;

public class Program
{
    public static void Main()
    {
        string t1, t2;
        t1 = "Test";
        t2 = "test";
        Console.WriteLine(t1.CompareTo(t2)); //prints 1, expected was -1
    }
}
The documentation says that CompareTo() returns 1 if the string is greater than the other string, -1 if it is less, and 0 if they are equal. In this example I am comparing "Test" with "test". As I understand it, 'A' < 'a' in ASCII.
So why does it report that t1 is greater when the only difference is the first letter, which by ASCII should make it smaller? Thanks.
The default C# string comparison (as in, if you don't specify it yourself anywhere) is a culture-aware comparison, and the rules depend on your computer's culture.
If you want to use ordinal comparison (ie ASCII as you call it, though C# strings are Unicode), you can use this instead:
Console.WriteLine(string.Compare(t1, t2, StringComparison.Ordinal));
Also note that the specification only requires a negative, zero or positive result, not specifically -1. The call above may return -32, for example (the difference between 'T' and 't').
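To make the difference concrete, here is a minimal sketch that runs the same pair of strings through an ordinal and a culture-aware comparison. The class name CompareDemo and the helper Sign are made up for this example; the invariant culture is used instead of the machine's culture only so that the result does not depend on local settings.

```csharp
using System;

public static class CompareDemo
{
    // Reduces any comparison result to -1, 0 or 1, since only the sign is specified.
    public static int Sign(int result) => Math.Sign(result);

    public static void Main()
    {
        string t1 = "Test", t2 = "test";

        // Ordinal: compares UTF-16 code units, so 'T' (84) sorts before 't' (116).
        Console.WriteLine(Sign(string.Compare(t1, t2, StringComparison.Ordinal))); // -1

        // Culture-aware, pinned to the invariant culture rather than the current one.
        // Linguistic rules usually place lowercase before uppercase, so this is typically 1.
        Console.WriteLine(Sign(string.Compare(t1, t2, StringComparison.InvariantCulture)));

        // Case-insensitive ordinal comparison treats the two strings as equal.
        Console.WriteLine(string.Equals(t1, t2, StringComparison.OrdinalIgnoreCase)); // True
    }
}
```

Checking the sign rather than comparing against -1 or 1 keeps the code correct whichever value a given runtime returns.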
I have started to learn LINQ recently. I came across a few built-in methods like Min() and Max().
These two methods work fine with int[]. But when it comes to string[], I am curious how they work. I have tried some code:
string[] cars = { "Volvo", "BMW", "Ford", "Mazda" };
Console.WriteLine(cars.Max());
Console.WriteLine(cars.Min());
The output was like this:
Volvo for Max()
BMW for Min()
Can you please explain how this works? Is it taking the first letter in alphabetical order, or is it using some other mechanism, for example one based on ASCII values?
All types that implement the IComparable or IComparable<T> interface can be compared, using the CompareTo method implemented by each type. All primitive types implement IComparable<T>, as do char and string. LINQ's Min(IEnumerable<T>) and Max(IEnumerable<T>) use this implementation to find the minimum or maximum in an enumerable.
String Comparisons
Comparing strings though is a bit more interesting than comparing integers. The strings are typically compared in dictionary order (lexicographically) but ... whose dictionary? Different languages have different sorting rules, and sometimes two letters are considered a single one. Even Danes forget that AA is equivalent to Å in Danish.
The dictionary used to compare strings is provided by the CultureInfo class. By default, the current thread's culture is used which typically matches the culture of the end user (in desktop applications) or the system locale in server applications. In a Danish culture for example, AA is treated differently from aa - I think one of them is ordered after other letters of the same case and the other isn't, but don't ask me which.
The InvariantCulture specifies a locale-insensitive culture that can be used to handle strings the same way in every locale. It uses mostly sensible settings (eg . for the decimal point) except dates, where it uses the US format instead of the ISO8601 (YYYY-MM-DD) format as everyone would expect.
Custom comparisons
It's possible to specify a different comparison method by passing a class that implements IComparer<T> to any LINQ method affected by order. Min(IEnumerable<T>, IComparer<T>) is one example.
The StringComparer class contains some predefined comparers :
CurrentCulture is the default
CurrentCultureIgnoreCase uses the current culture but ignores case, so A is equal to a. This is very useful eg in dictionaries.
InvariantCulture and InvariantCultureIgnoreCase use the Invariant culture for ordering
Finally, Ordinal and OrdinalIgnoreCase don't use a dictionary but compare the Unicode values of the characters. That's the fastest option if you don't care about locale rules
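As a rough illustration of how the predefined comparers change the outcome, the sketch below sorts the same three words with two of them. The class name ComparerDemo and the helper SortWith are invented for this example; only the ordinal comparers are shown because their results do not depend on the machine's culture.

```csharp
using System;
using System.Linq;

public static class ComparerDemo
{
    // Sorts a copy of the words using whichever StringComparer is passed in.
    public static string[] SortWith(string[] words, StringComparer comparer) =>
        words.OrderBy(w => w, comparer).ToArray();

    public static void Main()
    {
        string[] words = { "apple", "Banana", "cherry" };

        // Ordinal: 'B' (66) has a lower code than 'a' (97), so "Banana" comes first.
        Console.WriteLine(string.Join(", ", SortWith(words, StringComparer.Ordinal)));
        // Banana, apple, cherry

        // OrdinalIgnoreCase: case is ignored, so the order is a, b, c.
        Console.WriteLine(string.Join(", ", SortWith(words, StringComparer.OrdinalIgnoreCase)));
        // apple, Banana, cherry
    }
}
```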
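.Max() uses Comparer<TSource>.Default, which for strings performs the same culture-sensitive comparison as Compare(String, String):
it compares two specified String objects and returns an integer that indicates their relative position in the sort order.
Source code of .Max(), as used for the string comparison: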
public static TSource Max<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    Comparer<TSource> comparer = Comparer<TSource>.Default;
    TSource value = default(TSource);
    if (value == null) {
        foreach (TSource x in source) {
            if (x != null && (value == null || comparer.Compare(x, value) > 0))
                value = x;
        }
        return value;
    }
    else {
        bool hasValue = false;
        foreach (TSource x in source) {
            if (hasValue) {
                if (comparer.Compare(x, value) > 0) //Compare strings
                    value = x;
            }
            else {
                value = x;
                hasValue = true;
            }
        }
        if (hasValue) return value;
        throw Error.NoElements();
    }
}
Compare(String, String)
https://learn.microsoft.com/en-us/dotnet/api/system.string.compare?view=net-6.0#system-string-compare(system-string-system-string)
Strings are compared alphabetically, which translates into the following logic:
which item has a higher first character
which item has a higher second character
which item has a higher third character
...
The difference is detected at the first character that differs. So, when you compare string1 with string2, then string1 is greater than string2 if at their first differing character (from left to right), string1 has a greater value at that position than string2. If string2 is string1 + string3, then the first difference between string1 and string2 is beyond the end of string1 and at that point the comparison yields that string2 is greater than string1.
If you are dissatisfied with this comparison, then you can specify what comparer you intend to use, see here: https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.max?view=net-6.0#system-linq-enumerable-max-1(system-collections-generic-ienumerable((-0))-system-collections-generic-icomparer((-0)))
Basically in that case you need to implement an IComparer<string> and pass it as the second parameter, like
cars.Max(yourcomparer)
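A minimal sketch of such a comparer follows. LengthThenOrdinalComparer and CustomMaxDemo are hypothetical names chosen for this example, and the ordering rule (shorter strings first, ties broken by ordinal comparison) is just one arbitrary choice. It assumes non-null strings and a runtime where the Max/Min overloads taking an IComparer<T> are available (.NET 6 or later, as in the linked documentation).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: orders by length, then by ordinal value for equal lengths.
public sealed class LengthThenOrdinalComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int byLength = x.Length.CompareTo(y.Length);
        return byLength != 0 ? byLength : string.CompareOrdinal(x, y);
    }
}

public static class CustomMaxDemo
{
    public static void Main()
    {
        string[] cars = { "Volvo", "BMW", "Ford", "Mazda" };
        var comparer = new LengthThenOrdinalComparer();

        // "Volvo" and "Mazda" are both 5 letters long; 'V' > 'M' decides the tie.
        Console.WriteLine(cars.Max(comparer)); // Volvo
        Console.WriteLine(cars.Min(comparer)); // BMW
    }
}
```

On older frameworks without this overload, cars.OrderBy(c => c, comparer).Last() gives the same result at the cost of a full sort.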
For a list of strings, Min() and Max() return the elements that sort first and last, respectively, under the default string comparison, whereas for integers they return the numerically smallest and largest values in the list.
I'm trying to understand CompareTo() in C# and the following example made me more confused than ever. Can someone help me understand why the result for the 3rd variation is 1? 2nd word in the sentence "Hello wordd" is not the same as str1 "Hello world" so why am I getting 1? Shouldn't I get -1?
static void Main(string[] args)
{
    string str1 = "Hello world";
    Console.WriteLine(str1.CompareTo("Hello World"));
    Console.WriteLine(str1.CompareTo("Hello world"));
    Console.WriteLine(str1.CompareTo("Hello wordd"));
}
Results: -1, 0, 1
If the strings match, then CompareTo() gives 0. If they don't match, it gives a positive or negative number depending on which string comes first alphabetically.
In your example, both the results 1 and -1 indicate the strings do not match, while 0 indicates the strings match.
It looks like you are using it to determine equality and not to sort. If this is the case, then you should use Equals() instead.
The String.CompareTo method compares this instance with a specified object or String and returns an integer that indicates whether this instance precedes, follows, or appears in the same position in the sort order as the specified object or String.
Less than zero: This instance precedes value.
Zero: This instance has the same position in the sort order as value.
Greater than zero: This instance follows value, or value is null.
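To see why the third call returns 1, it helps to find the first position where the two strings differ. The sketch below does that with a small helper; FirstDifference and FirstDiffDemo are names made up for this example, and it uses plain character comparison only to locate the position, not to reproduce the culture-aware rules.

```csharp
using System;

public static class FirstDiffDemo
{
    // Returns the index of the first differing character, the length of the
    // shorter string if one is a prefix of the other, or -1 if they are identical.
    public static int FirstDifference(string a, string b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i]) return i;
        }
        return a.Length == b.Length ? -1 : shared;
    }

    public static void Main()
    {
        string str1 = "Hello world";
        string other = "Hello wordd";

        int index = FirstDifference(str1, other);
        Console.WriteLine(index);                              // 9
        Console.WriteLine(str1[index] + " vs " + other[index]); // l vs d
    }
}
```

At index 9 the comparison is between 'l' and 'd'. Since 'l' comes after 'd' in the alphabet, "Hello world" follows "Hello wordd", which is why CompareTo returns a positive number.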
Sample code to illustrate:
int res1 = "a".CompareTo("A"); // res1 = -1
int res2 = "ab".CompareTo("A"); // res2 = 1
I'm seeing res1 = -1, and res2 = 1 at the end, which was a bit unexpected.
I thought res1 would return 1, since on an ASCII chart "A" (0x41) comes before "a" (0x61).
Also, it seems strange that for res2, the length of the string seems to make a difference. i.e. if "a" comes before "A" (as res1 = -1 indicates), then I would have thought that "a"withAnythingAfterIt would also come before "A"withAnythingAfterIt.
Can someone shed some light?
Thanks.
This is the expected behavior. String.CompareTo(string) does a culture sensitive comparison, using its sort order. In fact it calls CultureInfo to do the job as we can see in the source code:
public int CompareTo(String strB) {
    if (strB==null) {
        return 1;
    }
    return CultureInfo.CurrentCulture.CompareInfo.Compare(this, strB, 0);
}
Your current culture puts 'A' after 'a' in the sort order: the two differ only in case, so the comparison is otherwise a tie and case acts as the tie-breaker, with lowercase sorting first. 'ab' is a different matter, since it clearly comes after either 'a' or 'A' in most sort orders I know; the extra letter decides the result before case is ever considered.
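The tie-breaking can be observed by calling CompareInfo directly with and without CompareOptions.IgnoreCase. This is only a sketch: the class name TieBreakDemo is invented here, and the invariant culture stands in for whatever the current culture happens to be on a given machine.

```csharp
using System;
using System.Globalization;

public static class TieBreakDemo
{
    public static void Main()
    {
        CompareInfo ci = CultureInfo.InvariantCulture.CompareInfo;

        // Ignoring case, "a" and "A" occupy the same position: a tie.
        Console.WriteLine(ci.Compare("a", "A", CompareOptions.IgnoreCase)); // 0

        // With case considered, the tie is broken; lowercase is expected to sort first.
        Console.WriteLine(Math.Sign(ci.Compare("a", "A", CompareOptions.None)));

        // "ab" has an extra letter, so it sorts after "A" regardless of case.
        Console.WriteLine(Math.Sign(ci.Compare("ab", "A", CompareOptions.None)));

        // Ordinal comparison looks only at code units: 'a' (97) is after 'A' (65).
        Console.WriteLine(Math.Sign(string.CompareOrdinal("a", "A"))); // 1
    }
}
```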
From MSDN
Definition
Compares this instance with a specified Object and indicates whether
this instance precedes, follows, or appears in the same position in
the sort order as the specified Object.
Note
The CompareTo method was designed primarily for use in sorting or
alphabetizing operations. It should not be used when the primary
purpose of the method call is to determine whether two strings are
equivalent. To determine whether two strings are equivalent, call the
Equals method.
CompareTo is an instance method.
If the first string sorts after the second, the result is positive (typically 1). If it sorts before the second, the result is negative (typically -1). If both strings occupy the same position, the result is 0. Only the sign is meaningful; the magnitude does not indicate how much "larger" the first string is.
Console.WriteLine("a".CompareTo("A")); // -1
Console.WriteLine("ab".CompareTo("A")); // 1
Console.WriteLine("a".CompareTo("a")); // 0
Console.WriteLine("ab".CompareTo("AB")); // -1
Console.WriteLine("A".CompareTo("a")); // 1
Console.WriteLine("AB".CompareTo("ab")); // 1
Console.WriteLine("A".CompareTo("A")); // 0
1)
class Program
{
    static void Main(string[] args)
    {
        int a;
        a = Convert.ToInt32( "a" );
        Console.Write(a);
    }
}
I get a FormatException with the message "Input string was not in a correct format.", which is understandable.
2)
class Program
{
    static void Main(string[] args)
    {
        int a;
        a = Convert.ToInt32( Console.Read() );
        Console.Write(a);
    }
}
In the second case, I can type any characters, for example abc, and a value is displayed in the console.
Question: Why doesn't the second case throw a FormatException, and why does it work with non-numeric characters?
UPDATE
With the Console.ReadLine() method, which returns a string, no FormatException is thrown either, and any character is printed to the console successfully.
class Program
{
    static void Main(string[] args)
    {
        int a;
        a = Convert.ToInt32(Console.ReadLine());
        Console.Write(a);
    }
}
The return type of Console.Read() is an int.
You then call Convert.ToInt32(int):
Returns the specified 32-bit signed integer; no actual conversion is performed.
Because the return value of Console.Read() is an int. It returns the integer code of the character you typed, so if you type a letter you get that character's integer representation, and the conversion has nothing left to do.
To see what is happening in detail:
int a;
a = Convert.ToInt32(Console.Read()); //input for example: abc
Console.WriteLine(a); //97
Console.WriteLine((char)a); //a
Return Value Type: System.Int32 The next character from the input
stream, or negative one (-1) if there are currently no more characters
to be read.
public static int Read()
Reference
FormatException : Value does not consist of an optional sign followed by a sequence of digits (0 through 9).
I strongly suspect you are mixing with Console.ReadLine and Console.Read methods.
From Console.Read doc;
Reads the next character from the standard input stream.
and
Return Value Type: System.Int32 The next character from the input
stream, or negative one (-1) if there are currently no more characters
to be read.
That means when you enter abc with this method it returns 97 (the ASCII value of the first character, 'a'), which is a valid integer.
https://msdn.microsoft.com/en-us/library/sf1aw27b(v=vs.110).aspx
ToInt32 does have an overloaded version that can take a string, but the string must be a representation of an integer; "a" is not a number, but if you had "101" it would parse correctly:
int a;
a = Convert.ToInt32("101"); //will parse to int
Console.Write(a);
a = Convert.ToInt32("a"); //will not parse to int; throws FormatException
Console.Write(a);
The reason your second example works while the first one doesn't, is because Console.Read returns the integer value based on the next character passed into it, so all is fine when you call ToInt32 with it.
UPDATE-
Just tested it with ReadLine too, and it still gave me an error.
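The difference between the two methods can be reproduced without typing anything by feeding the console a fixed input. In this sketch the class name ReadDemo and the helper Describe are invented for illustration, and int.TryParse is used in place of Convert.ToInt32 so that invalid text yields false instead of an exception.

```csharp
using System;
using System.IO;

public static class ReadDemo
{
    // Reports whether the text is a valid integer, without throwing.
    public static string Describe(string text) =>
        int.TryParse(text, out int value) ? "parsed " + value : "not a number: " + text;

    public static void Main()
    {
        // Simulated keyboard input: the line "abc" followed by the line "42".
        Console.SetIn(new StringReader("abc\n42\n"));

        int code = Console.Read();          // reads ONE character and returns its code
        Console.WriteLine(code);            // 97
        Console.WriteLine((char)code);      // a

        string rest = Console.ReadLine();   // the remainder of the first line: "bc"
        Console.WriteLine(Describe(rest));  // not a number: bc

        string second = Console.ReadLine(); // "42"
        Console.WriteLine(Describe(second)); // parsed 42
    }
}
```

Console.Read() already hands back an int, so there is nothing to parse; ReadLine() hands back text, which only converts if it really contains digits.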
Why is it that the following code won't work:
endDate.AddDays(7-endDate.DayOfWeek);
While this will:
endDate.AddDays(0-endDate.DayOfWeek + 7);
?
(By "won't work" I mean results in the following compilation error: "cannot convert from 'System.DayOfWeek' to 'double'")
To expand upon what Lasse said (or rather, make it a little more explicit).
Because the literal 0 is implicitly convertible to any enum type,
0 - endDate.DayOfWeek becomes
(DayOfWeek)0 - endDate.DayOfWeek
And since you can subtract one enum from another and get an integer difference:
(DayOfWeek)0 - endDate.DayOfWeek == -(int)endDate.DayOfWeek
Thus, since the result of the subtraction is an int, you can then add 7 to it.
endDate.AddDays(0-endDate.DayOfWeek + 7);
So, if endDate falls on a Monday, whose enum value is 1:
0 - endDate.DayOfWeek + 7 == -1 + 7 == 6
However, you can't do the reverse.
endDate.DayOfWeek - 0 + 7,
because the result type of the calculation depends on the leftmost operand. Thus, while 0 - endDate.DayOfWeek results in an integer, endDate.DayOfWeek - 0 results in an enum DayOfWeek.
Most interestingly, you could use this side-effect to get the value of an enum without casting, though I would consider this hackish and confusing... thus to be avoided.
int enumValue = -(0 - endDate.DayOfWeek);
This is very interesting. The right way to do this is:
endDate.AddDays(7 - (int)endDate.DayOfWeek);
But, your question isn't about a solution, but a reason for the behavior. It has something to do with the way the compiler treats a zero. Either line fails if no zero is present, while both lines work if a zero is present.
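Here is a small sketch comparing the explicit cast with the zero-literal form. The class name EnumMathDemo and the sample date are chosen only for this example (1 January 2024 fell on a Monday); the date is formatted with the invariant culture so the output does not depend on local settings.

```csharp
using System;
using System.Globalization;

public static class EnumMathDemo
{
    public static void Main()
    {
        DayOfWeek day = DayOfWeek.Monday; // underlying value 1

        // Explicit cast: clear about the intent and works with any constant.
        int viaCast = 7 - (int)day;
        Console.WriteLine(viaCast); // 6

        // Zero-literal form: 0 converts to DayOfWeek, enum - enum yields an int.
        int viaZero = 0 - day + 7;
        Console.WriteLine(viaZero); // 6

        // Applying the cast form to a real date: a Monday plus 6 days.
        DateTime endDate = new DateTime(2024, 1, 1);
        DateTime shifted = endDate.AddDays(7 - (int)endDate.DayOfWeek);
        Console.WriteLine(shifted.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2024-01-07
    }
}
```

Both forms produce the same number, but the cast states the conversion outright instead of relying on the special treatment of the literal 0.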
You can subtract two enum values to get their integer value difference:
using System;

namespace ConsoleApplication10
{
    public enum X { A, B, C, D }

    public class Program
    {
        static void Main()
        {
            // Subtracting two enum values yields the difference of their underlying integers.
            var x = X.D - X.A;
            Console.Out.WriteLine(x);
            Console.In.ReadLine();
        }
    }
}
Will print out 3.
But you can't add two enum values; that probably makes no sense as an operation.
In the case of "0", 0 is auto-convertible to all enum types, so basically "0 - enumvalue" means the same as "(enumtype)0 - enumvalue", which again works.