I have a problem which is already solved, but I don't know what is really happening. Here is the simplified task: I have a list of records. Each record consists of two fields, a key and a value. All keys are different. I want to sort them so that:
The row with an empty string as key comes first.
Then come the rows whose value contains "Key not found", in alphabetical order by key.
Then the rest of the rows, in alphabetical order by key.
So I made this class:
private class RecordComparer : IComparer<GridRecord>
{
    public int Compare(GridRecord x, GridRecord y)
    {
        if (x.Key == String.Empty)
            return -1;
        else if (y.Key == String.Empty)
            return 1;
        else if (x.Value.Contains("Key not found:") && !y.Value.Contains("Key not found:"))
            return -1;
        else if (!x.Value.Contains("Key not found:") && y.Value.Contains("Key not found:"))
            return 1;
        else
            return (x.Key.CompareTo(y.Key));
    }
}
When I try to use it, I get: "Comparer (or the IComparable methods it relies upon) did not return zero when Array.Sort called x.CompareTo(x). x: '' x's type: 'GridRecord' The IComparer:"
The error doesn't always appear; sometimes (usually the first time I use it in my program) it works fine. The second or third call crashes.
Inserting
if (x.Key == y.Key)
return 0;
at the beginning of the Compare function above solved the problem, and everything works fine. Why?
If you compare {Key=""} with anything, you are currently returning -1, even if you are comparing it with itself. When you compare something with itself (or with something semantically equivalent to it), you are supposed to return 0. That is what the error is about.
It is wise to enforce a total order in your custom comparer. One of the requirements for a total order is reflexivity: for any x, Compare(x, x) must be equal to zero. This property is required, for example, when the comparer is used to sort an array with non-unique values.
Some libraries may perform additional checks on custom comparers. There is no point in comparing an element to itself, but on the other hand such a check allows the runtime to find subtle errors (like the one you made). That is probably why you got your error message. Fixing such errors makes your code more stable. Usually checks like this exist in debug builds only.
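For illustration, here is one way the comparer could be written so that it stays reflexive while keeping the intended ordering. This is only a sketch based on the rules in the question, not the original project's code:

private class RecordComparer : IComparer<GridRecord>
{
    public int Compare(GridRecord x, GridRecord y)
    {
        // Reflexivity: equal keys (including x compared with itself) must yield 0.
        if (x.Key == y.Key)
            return 0;

        // The row with the empty key sorts before everything else.
        if (x.Key == String.Empty) return -1;
        if (y.Key == String.Empty) return 1;

        // "Key not found" rows come before the normal rows.
        bool xMissing = x.Value.Contains("Key not found:");
        bool yMissing = y.Value.Contains("Key not found:");
        if (xMissing != yMissing)
            return xMissing ? -1 : 1;

        // Otherwise sort alphabetically by key.
        return x.Key.CompareTo(y.Key);
    }
}

Because the first check returns 0 whenever the keys are equal, Compare(x, x) is always 0, which is exactly what the runtime's sanity check demands.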
I am working with existing data and have records which contain an array double[23] and double[46]. The values in the array can be the same across multiple records. I would like to generate an id (perhaps an int) to uniquely identify the values in each array.
There are places in the application where I need to group records based on the values in the array being identical. While there are ways to query for this, I was hoping for a single int field (or something similar) to group on. This would really help simplify queries and especially help with report tools where grouping on a smaller single field would help immensely.
I thought of generating a hash code, but I understand these are not guaranteed to be the same for each double[] with matching values. I had tried implementing
((IStructuralEquatable)combined).GetHashCode(EqualityComparer<double>.Default);
To compare the structure and data, but again, I don't think this is guaranteed to match another double[] having the same values.
Perhaps a form of checksum would work but admittedly I am having trouble implementing something. I am looking for suggestions/direction.
Here is data for 3 sample records. The data in records 1 and 3 is the same, so a generated id should match for those.
32.7,48.9,55.9,48.9,47.7,46.9,45.7,44.4,43.4,41.9,40.4,38.4,36.7,34.4,32.4,30.4,27.9,25.4,22.4,19.4,16.4,13.4,10.4,47.9
40.8,49.0,50.0,49.0,47.8,47.0,45.8,44.5,43.5,42.0,40.5,38.5,36.8,34.5,32.5,30.5,28.0,25.5,22.5,19.5,16.5,13.5,10.5,48.0
32.7,48.9,55.9,48.9,47.7,46.9,45.7,44.4,43.4,41.9,40.4,38.4,36.7,34.4,32.4,30.4,27.9,25.4,22.4,19.4,16.4,13.4,10.4,47.9
Perhaps this is not possible without just checking all the data, but was hoping for a better solution to simplify the application and improve the speed.
The goal is to add a new id field to the existing records to represent the array data. That way, passing records into report tools would group together easily on one field rather than checking the whole array on each record.
I appreciate any direction.
EDIT - Some issues I ran into trying things (in case it helps someone)
In trying to understand this originally, I was calling this code (which is part of .NET). I understood these functions would hash the values of the array together (only the last 8 values in this case), and I didn't think it included the array handle. The result was not quite as expected because of a bug MS corrected in a later version of .NET, as per the commented line below. With the fix I was getting better results.
int IStructuralEquatable.GetHashCode(IEqualityComparer comparer) {
    if (comparer == null)
        throw new ArgumentNullException("comparer");
    Contract.EndContractBlock();

    int ret = 0;

    for (int i = (this.Length >= 8 ? this.Length - 8 : 0); i < this.Length; i++) {
        ret = CombineHashCodes(ret, comparer.GetHashCode(GetValue(i)));
        // .NET 4.6.2; in .NET 4.5.2 it is ret = CombineHashCodes(ret, comparer.GetHashCode(GetValue(0)))
    }
    return ret;
}

internal static int CombineHashCodes(int h1, int h2) {
    return (((h1 << 5) + h1) ^ h2);
}
I modified this to handle more than 8 values and still had some hashes not matching. I later determined the issue was in the data; I was unaware that some of the records had doubles stored with more than one decimal place (they should have been rounded). This of course changed the hash. Now that the data is consistent, I am seeing matching hashes; any arrays with identical values have an identical hash.
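For reference, a sketch of what that modification might look like when every element is hashed instead of only the last 8. This is an assumed adaptation of the loop above, not the code actually used; CombineHashCodes mirrors the .NET source shown earlier:

// Hypothetical adaptation of the .NET loop above: combine all elements of the array.
internal static int GetValuesHashCode(double[] values)
{
    int ret = 0;
    for (int i = 0; i < values.Length; i++)
    {
        ret = CombineHashCodes(ret, values[i].GetHashCode());
    }
    return ret;
}

internal static int CombineHashCodes(int h1, int h2)
{
    return (((h1 << 5) + h1) ^ h2);
}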
I thought of generating a hash code, but I understand these are not guaranteed to be the same for each double[] with matching values
Quite the opposite: a hash function is required by design to return equal hashes for equal inputs. For example, a function that always returns 0 is a valid starting point, since it trivially returns the same value for equal rows. Everything else is just an optimization to reduce false positives (different inputs that happen to share a hash).
Perhaps this is not possible without just checking all the data
Of course you need to check all the data; how else would you do it?
However, your implementation is broken. The default hash function for an array hashes the reference to the array itself, so different array instances with the same data will show up as different. What you want to do is use a HashCode instance and Add() each element of your array to it to get a proper hash code.
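A minimal sketch of that approach (System.HashCode is available in .NET Core 2.1+ / .NET Standard 2.1; note that this guarantees equal hashes for equal contents, not uniqueness, so two different arrays can still collide):

using System;

static int GetArrayHash(double[] values)
{
    var hash = new HashCode();
    foreach (double v in values)
    {
        hash.Add(v); // combines each element's value, not the array reference
    }
    return hash.ToHashCode();
}

If a truly unique id is required, the hash can serve as a fast grouping key, with a full element-by-element comparison (or a dictionary from array contents to an assigned id) to resolve any collisions.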
I have the following code:
bool ColorExistsByName(string colorNameV)
{
    var colorName = colorPage.ColorList.CategoryList.FirstOrDefault(c => c.GetText().Contains(colorNameV));
    if (colorName == null)
    {
        return false;
    }
    return colorName.ExistsAndDisplayed;
}
I need to edit this so that it can either Contain OR Equal colorNameV... what's the best way to do this?
EDIT
Sorry - I wasn't clear enough. I need to somehow add a parameter (wholeWord) so that sometimes it uses Contains, and sometimes it uses Equals (when wholeWord is set) - is it possible to use a parameter like this? (We already have functionality written around a wholeWord parameter to check whether whole-word matching is requested.)
But your expression c.GetText().Contains(colorNameV) already does what you want. If you want to test "contains", that is exactly what this call does. If you want to test equality, it also works, because when c.GetText() equals colorNameV it necessarily contains it (a string always contains itself). So it covers both of your cases.
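If the intent is that wholeWord should force an exact match rather than a substring match, one way to sketch it is below. The members (colorPage, GetText(), ExistsAndDisplayed) are taken from the question; the default value and the use of Equals are assumptions:

bool ColorExistsByName(string colorNameV, bool wholeWord = false)
{
    var colorName = colorPage.ColorList.CategoryList.FirstOrDefault(c =>
        wholeWord
            ? c.GetText().Equals(colorNameV)      // whole word: exact match only
            : c.GetText().Contains(colorNameV));  // otherwise: substring match

    if (colorName == null)
    {
        return false;
    }
    return colorName.ExistsAndDisplayed;
}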
I have array of objects and I would like to find index of some specific object inside this array:
int ix = Array.IndexOf(products, products.Where(item => item != null && item.Id == "xxx").FirstOrDefault());
An item with Id="xxx" doesn't exist, but the ix result is 1.
So I guess the default for int is 1. How can I know whether 1 is the index of an actual item or just the default value? It would be nice if I could set the default value to -1.
In the end I did it with the FindIndex method, but I would like to know how to do it with the IndexOf method.
So you have two parts of one problem. First, you want to find a product:
var product = products.FirstOrDefault(item => item != null && item.Id == "xxx");
And when that product is found, you want to find its index in the products collection:
int index = Array.IndexOf(products, product);
You're halfway there using FirstOrDefault(). If a product with Id "xxx" does not exist, product will be null. So you can check for that and skip IndexOf() for null:
if (product == null)
{
    return -1;
}
else
{
    return Array.IndexOf(products, product);
}
The fact that your current code returns 1 means that products[1] is null.
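For comparison, Array.FindIndex collapses the two steps and already returns -1 when nothing matches, which is presumably why it worked for the questioner. A one-line sketch, assuming products is an array whose elements have an Id property as in the question:

// Returns -1 when no element matches, so no separate null check is needed.
int ix = Array.FindIndex(products, item => item != null && item.Id == "xxx");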
Per Microsoft:
Sometimes the value of default(TSource) is not the default value that you want to use if the collection contains no elements. Instead of checking the result for the unwanted default value and then changing it if necessary, you can use the DefaultIfEmpty(IEnumerable, TSource) method to specify the default value that you want to use if the collection is empty. Then, call First(IEnumerable) to obtain the first element.
However, I'm not so sure that's your issue. Your code isn't syntactically correct (you're missing a ')' somewhere), but it appears you're calling FirstOrDefault() after your Where(). This will either return an element or null. Since you said an item with id "xxx" doesn't exist, it's going to check for the index of null in your array. The default value for indexOf is (again, per Microsoft) "the lower bound of the array minus 1." This should return -1 (I'd hope) instead of 1.
Conclusion: take a better look at your code and see what's really going on. Break up this linq statement into two different parts.
var item = products.Where(item => item != null && item.Id == "xxx").FirstOrDefault();
int ix = Array.IndexOf(products, item);
Then step through your code to check the values of everything. I'm sure you will find your issue, and it won't be what you expected.
If you want to call FirstOrDefault on a sequence of value types where the default value is also a valid value, here's one thing you can do:
(I won't use your code, as the missing parenthesis prevents me from knowing what your goal is.)
myCollection.Where(myCondition).Cast<int?>().FirstOrDefault();
This way, 0 would be a valid first value, and null would mean that there are no matching values.
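A quick illustration of that trick with a plain int array (the names here are made up for the example):

using System.Linq;

int[] scores = { 0, 3, 7 };

// Without the cast: 0 could mean "first match is 0" or "nothing matched".
int first = scores.Where(s => s > 10).FirstOrDefault();                      // 0 (ambiguous)

// With the cast: null unambiguously means "nothing matched".
int? firstOrNull = scores.Where(s => s > 10).Cast<int?>().FirstOrDefault();  // null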
FirstOrDefault returns the first element (in this case, the first item matching the Where conditions) OR the default value.
It returns an object if the conditions are met (the first object that meets the condition).
However, if the conditions aren't met, it returns null (the default value for a reference type),
so your code ends up calling IndexOf(products, null), which searches the array for the first null entry.
FirstOrDefault is doing its job on the object type stored inside the array.
After this, the result is passed as a parameter to the IndexOf method.
IndexOf will return the index of the first object matching the "where" condition.
If nothing is found, IndexOf will return -1.
By the way, you're missing a parenthesis.
var strs = new Collection<string>();
bool b = strs.All(str => str == "ABC");
The code creates an empty collection of string, then tries to determine if all the elements in the collection are "ABC".
If you run it, b will be true.
But the collection does not even have any elements in it, let alone any elements equal to "ABC".
Is this a bug, or is there a reasonable explanation?
It's certainly not a bug. It's behaving exactly as documented:
true if every element of the source sequence passes the test in the specified predicate, or if the sequence is empty; otherwise, false.
Now you can argue about whether or not it should work that way (it seems fine to me; every element of the sequence conforms to the predicate), but the very first thing to check before you ask whether something is a bug is the documentation. (It's the first thing to check as soon as a method behaves in a way you didn't expect.)
All requires the predicate to be true for all elements of the sequence. This is explicitly stated in the documentation. It's also the only thing that makes sense if you think of All as being like a logical "and" between the predicate's results for each element. The true you're getting out for the empty sequence is the identity element of the "and" operation. Likewise, the false you get from Any for the empty sequence is the identity for logical "or".
If you think of All as "there are no elements in the sequence that are not", this might make more sense.
It is true, as nothing (no condition) makes it false.
The docs probably explain it. (Jon Skeet also mentioned something a few years back)
Same goes for Any (the opposite of All) returning false for empty sets.
Edit:
You can imagine All to be implemented semantically the same as:
foreach (var e in elems)
{
    if (!cond(e))
        return false;
}
return true; // no escape from loop
Most answers here seem to go along the lines of "because that's how it is defined". But there is also a logical reason why it is defined this way.
When defining a function, you want it to be as general as possible, so that it can be applied to the largest possible number of cases. Say, for instance, that I want to define the Sum function, which returns the sum of all the numbers in a list. What should it return when the list is empty? If you returned an arbitrary number x, you would define the function as the:
Function that returns the sum of all numbers in the given list, or x if the list is empty.
But if x is zero, you can also define it as the
Function that returns x plus the given numbers.
Note that definition 2 implies definition 1, but 1 does not imply 2 when x is not zero, which by itself is enough reason to pick 2 over 1. But also note that 2 is more elegant and, in its own right, more general than 1. It's like placing a spotlight farther away so that it lights a larger area. A lot larger, actually. I'm not a mathematician myself, but I'm sure mathematicians will find a ton of connections between definition 2 and other mathematical concepts, and not so many related to definition 1 when x is not zero.
In general, you can, and most likely want to, return the identity element (the one that leaves the other operand unchanged) whenever you have a function that applies a binary operator over a set of elements and the set is empty. This is the same reason a Product function will return 1 when the list is empty (note that you could just replace "x plus" with "one times" in definition 2). And it is the same reason All (which can be thought of as the repeated application of the logical AND operator) will return true when the list is empty (p && true is equivalent to p), and the same reason Any (the OR operator) will return false.
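A small sketch of that identity-element view using LINQ's Aggregate (the array name is invented for illustration):

using System.Linq;

int[] empty = new int[0];

// Each fold starts from the identity of its operator, so the empty sequence
// simply returns that identity: 0 for sum, 1 for product, true for "and", false for "or".
int sum      = empty.Aggregate(0,     (acc, n) => acc + n);       // 0
int product  = empty.Aggregate(1,     (acc, n) => acc * n);       // 1
bool all     = empty.Aggregate(true,  (acc, n) => acc && n > 5);  // true, like All()
bool any     = empty.Aggregate(false, (acc, n) => acc || n > 5);  // false, like Any()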
The method cycles through all elements until it finds one that does not satisfy the condition, or finds none that fail. If none fail, true is returned.
So, if there are no elements, true is returned (since there were none that failed)
Here is an extension that can do what OP wanted to do:
static bool All<T>(this IEnumerable<T> source, Func<T, bool> predicate, bool mustExist)
{
    foreach (var e in source)
    {
        if (!predicate(e))
            return false;
        mustExist = false;
    }
    return !mustExist;
}
...and as others have pointed out already this is not a bug but well-documented intended behavior.
An alternative solution if one does not wish to write a new extension is:
strs.DefaultIfEmpty().All(str => str == "ABC");
PS: The above does not work if looking for the default value itself!
(Which for strings would be null.)
In such cases it becomes less elegant with something similar to:
strs.DefaultIfEmpty(string.Empty).All(str => str == null);
If you can enumerate more than once the easiest solution is:
strs.All(predicate) && strs.Any();
i.e. simply add a check afterwards that there actually were any elements.
Implementation aside, does it really matter that it is true? Say you have some code that checks All() and then iterates over the enumerable to execute something. If All() is true for an empty enumerable, that code is still not going to run, since the enumerable doesn't have any elements in it.
var hungryDogs = Enumerable.Empty<Dog>();
bool allAreHungry = hungryDogs.All(d => d.Hungry);

if (allAreHungry)
    foreach (Dog dog in hungryDogs)
        dog.Feed(biscuits); // <-- this line will not run anyway.
What's the best way to go about handling nulls during a binary search over a List<string> (well, it would be a List<string> if I could read all the values out beforehand)?
int previous = 0;
int direction = -1;

if (itemToCompare == null) {
    previous = mid;
    for (int tries = 0; tries < 2; tries++) {
        mid += direction;
        itemToCompare = GetItem(mid);
        while (itemToCompare == null && insideInclusiveRange(min, max, mid)) {
            mid += direction;
            itemToCompare = GetItem(mid);
        }
        if (!insideInclusiveRange(min, max, mid)) {
            /* Reached an endpoint without finding anything,
               try the other direction. */
            mid = previous;
            direction = -direction;
        } else if (itemToCompare != null) {
            break;
        }
    }
}
I'm currently doing something like the above: if null is encountered, search linearly in one direction until either a non-null value or a point beyond the endpoint is reached; if that fails, repeat in the other direction. In the actual code I'm getting the direction from the previous comparison result, and GetItem() caches the values it retrieves. Is there an easier way, without building an intermediate list of non-null values (which takes far too long for my purposes because the GetItem() function above is slow)?
I guess I'm asking if there's a smarter way to handle null values than degrading to a linear search. In all likelihood there will only be a small percentage of nulls (1-5%), but it's possible for there to be sequences of hundreds of nulls.
Edit - The data looks something like this
aa aaa
b bb bbb
c cc
d ddd
where each row is a separate object, and not all cells are guaranteed to be filled. The user needs to be able to search across an entire row (so that both "bb" and "bbb" would match the entire second row). Querying each object is slow enough that a linear search will not work. For the same reason, creating a new list without nulls is not really feasible.
Unless there is a reason to actually select/find a null value (not sure what that means as null is a singleton and binary search is often most desirable on unique values), consider not allowing them in the list at all.
[Previous answer: After reflecting on the question more I have decided that nulls likely have no place in the problem-space -- take bits and parts as appropriate.]
If nulls are desired, just sort the list such that null values are first (or last) and update the logic correctly -- then just make sure not to invoke a method upon any of the null values ;-)
This should have little overall impact since a sort is already required. If items are changed to null -- which sounds like an icky side-effect! -- then just "compact" the List (e.g. "remove" the null item). I would, however, just not modify the sorted list unless there is a good reason.
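As a rough sketch of the "sort nulls to one end" idea: push the nulls to the end with a comparer, then binary-search only the non-null prefix. The list contents here are made up, and the question's real data comes from a slow, lazy GetItem(), so this only shows the shape of the approach:

using System;
using System.Collections.Generic;

var items = new List<string> { "cc", null, "aa", "bbb", null, "d" };

// Sort so that all nulls land at the end; non-null values stay in ordinal order.
items.Sort((a, b) =>
{
    if (a == null) return b == null ? 0 : 1;   // nulls compare after everything
    if (b == null) return -1;
    return string.CompareOrdinal(a, b);
});

// Only the non-null prefix participates in the binary search.
int nonNullCount = items.FindIndex(s => s == null);
if (nonNullCount < 0) nonNullCount = items.Count;

int index = items.BinarySearch(0, nonNullCount, "bbb", StringComparer.Ordinal);

The same idea applies when the data cannot be materialized: keep the nulls out of the range the binary search ever examines, and the search logic itself stays untouched.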
Binary search is only really designed/suitable for (entirely) sorted data. No point turning it into a binary-maybe-linear search.
Happy coding.