I have a function which I believe can be simplified into LINQ but have been unable to do so yet.
The function looks like this:
private IList<Colour> GetDifference(IList<Colour> firstList, IList<Colour> secondList)
{
// Create a new list
var list = new List<Colour>();
// Loop through the first list
foreach (var first in firstList)
{
// Create a boolean and set to false
var found = false;
// Loop through the second list
foreach (var second in secondList)
{
// If the first item id is the same as the second item id
if (first.Id == second.Id)
{
// Mark it as found
found = true;
}
}
// After we have looped through the second list, if we haven't found a match
if (!found)
{
// Add the item to our list
list.Add(first);
}
}
// Return our differences
return list;
}
Can this be converted to a LINQ expression easily?
What is Colour? If it overrides Equals to compare by Id then this would work:
firstList.Except(secondList);
If Colour does not override Equals or it would be wrong for you to do so in the wider context, you could implement an IEqualityComparer<Colour> and pass this as a parameter:
firstList.Except(secondList, comparer);
See the documentation
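For illustration, here is a minimal sketch of such a comparer. The Colour class and its Name property are assumptions (the question only shows that Colour has an Id), and ColourIdComparer is a made-up name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var firstList = new List<Colour> { new Colour(1, "Red"), new Colour(2, "Green"), new Colour(3, "Blue") };
var secondList = new List<Colour> { new Colour(2, "Green") };

// Keep the items of firstList whose Id does not occur in secondList
var difference = firstList.Except(secondList, new ColourIdComparer()).ToList();

Console.WriteLine(string.Join(", ", difference.Select(c => c.Name))); // Red, Blue

// Hypothetical Colour type, only here to make the sketch self-contained
public class Colour
{
    public Colour(int id, string name) { Id = id; Name = name; }
    public int Id { get; }
    public string Name { get; }
}

// Treats two colours as equal when their Ids match
public class ColourIdComparer : IEqualityComparer<Colour>
{
    public bool Equals(Colour x, Colour y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    // Must be consistent with Equals: equal Ids give equal hash codes
    public int GetHashCode(Colour obj) => obj.Id.GetHashCode();
}
```

The same comparer instance can be passed to the HashSet<Colour> constructor if you need to keep duplicates.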
As noted in the comments below, Except has the added side effect of removing any duplicates in the source (firstList in this example). This may or may not be an issue to you, but should be considered.
If keeping any duplicates in firstList is of importance, then this is the alternative:
var secondSet = new HashSet<Colour>(secondList, comparer);
var result = firstList.Where(c => !secondSet.Contains(c));
As before, comparer is optional if Colour implements appropriate equality
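To make the difference between the two approaches concrete, here is a small sketch that uses plain ints instead of Colour (an assumption to keep it short):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var firstList = new List<int> { 1, 1, 2, 3 };
var secondList = new List<int> { 3 };

// Except returns a set difference, so the repeated 1 appears only once
var viaExcept = firstList.Except(secondList).ToList();

// Filtering against a HashSet keeps every occurrence from firstList
var secondSet = new HashSet<int>(secondList);
var viaWhere = firstList.Where(x => !secondSet.Contains(x)).ToList();

Console.WriteLine(string.Join(", ", viaExcept)); // 1, 2
Console.WriteLine(string.Join(", ", viaWhere));  // 1, 1, 2
```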
Try the following:
var result = firstList.Where(x => !secondList.Any(y => y.Id == x.Id));
Edit:
If you care about runtime and don't mind creating your own IEqualityComparer<>, I would suggest you use Except as Charles suggested in his answer. Except appears to build a hash set from the second list, which makes it considerably faster than my O(n*m) query. However, be aware that Except also removes duplicates from its result, so repeated items in firstList are returned only once.
Related
I need to analyze a task that starts with the code below but I couldn't figure out what the LINQ part is doing. Any leads are appreciated
foreach (var item in list.GroupBy(x => x.AccountNumber).Select(g => g.First()))
{
...
}
Some roughly-equivalent code (i.e. has the same function, but works slightly differently) would be:
var seenAccountNumbers = new HashSet<int>(); // Or some other data type?
foreach (var item in list)
{
if (seenAccountNumbers.Add(item.AccountNumber))
{
...
}
}
This code is a (somewhat wasteful) way of getting the first item by account number. It's wasteful because there's no reason to group everything before trying to find the first item per group.
The same thing can be implemented with an iterator function by iterating over all items in the input list and keeping track of all the AccountNumber values found so far. When a new one is found, yield it and add it to the tracking list. Or rather, HashSet.
In fact, that's how MoreLinq's DistinctBy operator is implemented:
var knownKeys = new HashSet<TKey>(comparer);
foreach (var element in source)
{
if (knownKeys.Add(keySelector(element)))
yield return element;
}
From the method's description:
Returns all distinct elements of the given source, where "distinctness" is determined via a projection and the default equality comparer for the projected type.
If a key is seen multiple times, only the first element with that key is returned.
The question's code can be replaced with:
foreach (var item in list.DistinctBy(x => x.AccountNumber))
{
...
}
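If you are on .NET 6 or later, System.Linq ships its own Enumerable.DistinctBy, so MoreLinq is not needed for this. A minimal sketch, assuming the items can be represented as tuples with an int AccountNumber (the real item type is not shown in the question):

```csharp
using System;
using System.Linq;

var list = new[]
{
    (AccountNumber: 10, Owner: "Ann"),
    (AccountNumber: 20, Owner: "Bob"),
    (AccountNumber: 10, Owner: "Cid"),
};

// The query from the question: the first item of each account number
var viaGroupBy = list.GroupBy(x => x.AccountNumber).Select(g => g.First()).ToList();

// Built-in DistinctBy (.NET 6+): same items, without building the groups
var viaDistinctBy = list.DistinctBy(x => x.AccountNumber).ToList();

Console.WriteLine(string.Join(", ", viaDistinctBy.Select(x => x.Owner))); // Ann, Bob
Console.WriteLine(viaGroupBy.SequenceEqual(viaDistinctBy));                // True
```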
Think of it as creating a dictionary with AccountNumber as the key and putting the items from list into it, keeping one item per key. That is roughly what happens.
In LINQ to Objects, GroupBy yields the groups in the order their keys first appear and keeps the elements of each group in source order, so First() picks the first item seen for each account number. The loop below does the same by storing an item only when its key is not yet in the dictionary:
var dict = new Dictionary<KeyType, ElementType>();
foreach(var item in list)
if (!dict.ContainsKey(item.AccountNumber))
dict[item.AccountNumber] = item;
Your original iteration would now be:
foreach(var item in dict.Values)
{
.....
}
Asking for a non-LINQ solution is not strange: LINQ is often not the most performant option. Its main appeal is that it is concise and quick to write.
I am working with an XML standard called SDMX. It's fairly complicated but I'll make it as short as possible. I am receiving an object called CategoryScheme. This object can contain a number of Category objects, and each Category can contain more Category objects, and so on; the chain can be arbitrarily deep. Every Category has a unique ID.
Usually each Category contains a lot of Categories. Together with this object I am receiving an Array, that contains the list of IDs that indicates where a specific Category is nested, and then I am receiving the ID of that category.
What I need to do is to create an object that maintains the hierarchy of the Category objects, but each Category must have only one child and that child has to be the one of the tree that leads to the specific Category.
So I had an idea, but in order to do this I should generate LINQ queries inside a cycle, and I have no clue how to do this. More information of what I wanted to try is commented inside the code
Let's go to the code:
public void RemoveCategory(ArtefactIdentity ArtIdentity, string CategoryID, string CategoryTree)
{
try
{
WSModel wsModel = new WSModel();
// Prepare Art Identity and Array
ArtIdentity.Version = ArtIdentity.Version.Replace("_", ".");
var CatTree = JArray.Parse(CategoryTree).Reverse();
// Get Category Scheme
ISdmxObjects SdmxObj = wsModel.GetCategoryScheme(ArtIdentity, false, false);
ICategorySchemeMutableObject CatSchemeObj = SdmxObj.CategorySchemes.FirstOrDefault().MutableInstance;
foreach (var Cat in CatTree)
{
// The cycle should work like this.
// At every iteration it must delete all the elements except the correct one
// and on the next iteration it must delete all the elements of the previously selected element
// At the end, I need to have the CatSchemeObj full of the all chains of categories.
// Iteration 1...
//CatSchemeObj.Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
// Iteration 2...
//CatSchemeObj.Items.ToList().SingleOrDefault().Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
// Iteration 3...
//CatSchemeObj.Items.ToList().SingleOrDefault().Items.ToList().SingleOrDefault().Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
// Etc...
}
}
catch (Exception ex)
{
throw ex;
}
}
Thank you for your help.
So, as I already said in my comment, building a recursive function should fix the issue. If you're new to it, you can find some basic information about recursion in C# here.
The method could look something like this:
private void DeleteRecursively(int currentRecursionLevel, string[] catTree, ICategorySchemeMutableObject catSchemeObj)
{
catSchemeObj.Items.ToList().RemoveAll(x => x.Id != catTree[currentRecursionLevel].ToString());
var leftoverObject = catSchemeObj.Items.ToList().SingleOrDefault();
if(leftoverObject != null) DeleteRecursively(++currentRecursionLevel, catTree, leftoverObject);
}
Afterwards you can call this method in your main method, instead of the loop:
DeleteRecursively(0, CatTree, CatSchemeObj);
But as I also said, keep in mind that calling the method inside the loop seems pointless to me: after the first call the tree is already cleared except for the one leftover path, so calling the method again with the same tree but another category will result in an empty tree (in CatSchemeObj).
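To show the idea without the SDMX types, here is a rough sketch on a made-up Node class (Node, PruneToPath and the sample IDs are all assumptions, not part of the SDMX library). It removes directly from the real child list and stops once the end of the path is reached, leaving the children of the target node untouched:

```csharp
using System;
using System.Collections.Generic;

// Sample tree: A -> (B -> (C, X), Y), plus Z next to A
var root = new Node("root")
{
    Items =
    {
        new Node("A")
        {
            Items = { new Node("B") { Items = { new Node("C"), new Node("X") } }, new Node("Y") }
        },
        new Node("Z")
    }
};

PruneToPath(root.Items, new[] { "A", "B", "C" }, 0);

Console.WriteLine(root.Items[0].Items[0].Items[0].Id); // C

// On each level, drop everything that is not on the path, then descend
static void PruneToPath(List<Node> items, IReadOnlyList<string> path, int level)
{
    if (level >= path.Count) return;            // end of the path reached
    items.RemoveAll(x => x.Id != path[level]);  // works on the real list, not a copy
    if (items.Count == 1) PruneToPath(items[0].Items, path, level + 1);
}

// Hypothetical stand-in for a category with nested categories
public class Node
{
    public Node(string id) { Id = id; }
    public string Id { get; }
    public List<Node> Items { get; } = new List<Node>();
}
```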
CAUTION! Another thing I noticed just now: calling ToList on your Items property and then deleting entries will NOT affect your source object, because ToList creates a new list. The new list references the same original objects, but a deletion only affects the new list. So you must write the resulting list back to your Items property, or find a way to delete directly in the Items object. (Assuming Items is an IEnumerable and not a concrete collection type, you should write it back.)
Just try it out with this simple example, and you will see that the original list is not modified.
IEnumerable<int> test = new List<int>() { 1, 2, 3, 4 , 1 };
test.ToList().RemoveAll(a => a != 1);
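Spelled out with the counts printed, and with one possible way of writing the result back (reassigning the variable here stands in for setting a property, which is an assumption about your object):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

IEnumerable<int> test = new List<int> { 1, 2, 3, 4, 1 };

// ToList creates a separate list; RemoveAll only changes that copy
var copy = test.ToList();
copy.RemoveAll(a => a != 1);

Console.WriteLine(test.Count()); // 5
Console.WriteLine(copy.Count);   // 2

// Write the filtered list back so the change becomes visible
test = copy;
Console.WriteLine(test.Count()); // 2
```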
Edited:
So here is another possible way of going after the discussion below.
I'm not sure what you really need, so just try it out.
int counter = 0;
var list = CatSchemeObj.Items.ToList();
// Check before you use it or you will get an error
if (list != null)
{
while (true)
{
var temp = list.Where(x => CatTree[counter++] == x.Id); // or != ? play with it.
list = temp.Items.ToList().SingleOrDefault();
if (list == null)
{
break;
}
}
}
I just translated your problem into 2 solutions, but I am not sure whether the SingleOrDefault call is safe here. It returns the only item of a sequence (or the default value if the sequence is empty) and throws an exception if there is more than one item. I know you said only 1 item is left at that point, but still... :)
Let me know in comment if this worked for you or not.
//solution 1
// inside of this loop check each child list if empty or not
foreach (var Cat in CatTree)
{
var list = CatSchemeObj.Items.ToList();
// Check before you use it or you will get an error
if (list != null)
{
while (true)
{
list.RemoveAll(x => x.Id != Cat.ToString());
list = list.ToList().SingleOrDefault();
if (list == null)
{
break;
}
}
}
}
//solution 2
foreach (var Cat in CatTree)
{
var list = CatSchemeObj.Items.ToList();
// Check before you use it or you will get an error
if (list != null)
{
CleanTheCat(Cat.ToString(), list);
}
}
// Use this recursive function outside of the loop because it will call itself
void CleanTheCat(string cat, List<typeof(ICategorySchemeMutableObject.Items) /*Place here whatever type you have*/> CatSchemeObj)
{
CatSchemeObj.RemoveAll(x => x.Id != cat);
var catObj = CatSchemeObj.Items.ToList().SingleOrDefault();
if (catObj != null)
{
CleanTheCat(cat, catObj);
}
}
Thank you to whoever tried to help but I solved it by myself in a much easier way.
I just sent the full CategoryScheme object to the method that converted it in the XML format, then just one line did the trick:
XmlDocument.Descendants("Category").Where(x => !CatList.Contains(x.Attribute("id").Value)).RemoveIfExists();
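For anyone trying to reproduce this: RemoveIfExists looks like a custom helper, so the sketch below uses the Remove() extension method that LINQ to XML provides for sequences of nodes. The XML is a simplified, made-up stand-in (real SDMX documents use namespaces and more attributes), and catList is assumed to hold the IDs on the path:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical, simplified category tree
var doc = XDocument.Parse(
    "<CategoryScheme>" +
      "<Category id='A'>" +
        "<Category id='B'><Category id='C'/><Category id='X'/></Category>" +
        "<Category id='Y'/>" +
      "</Category>" +
      "<Category id='Z'/>" +
    "</CategoryScheme>");

// IDs on the path that should survive
var catList = new HashSet<string> { "A", "B", "C" };

// Remove every Category element whose id is not on the path
doc.Descendants("Category")
   .Where(x => !catList.Contains((string)x.Attribute("id")))
   .Remove();

var remaining = doc.Descendants("Category").Select(x => (string)x.Attribute("id")).ToList();
Console.WriteLine(string.Join(" > ", remaining)); // A > B > C
```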
Consider you have two lists in C#, first list contains elements of TypeOne and second list contains elements of TypeTwo:
TypeOne
{
int foo;
int bar;
}
TypeTwo
{
int baz;
int qux;
}
Now I need to find elements (with some property value) in the first list that don't exist in the second list, and similarly I want to find elements in the second list that don't exist in the first list. (There are only zero or one occurrences in either list.)
What I tried so far is to iterate both lists like this:
foreach (var item in firstList)
{
if (!secondList.Any(a => a.baz == item.foo))
{
// Item is in the first list but not in second list.
}
}
and again:
foreach (var item in secondList)
{
if (!firstList.Any(a => a.foo == item.baz))
{
// Item is in the second list but not in first list.
}
}
I hardly think this is a good way to do what I want. I'm iterating my lists two times and use Any in each of them which also iterates the list. So too many iterations.
What is the most efficient way to achieve this?
I am afraid there is no prebuilt solution for this, so the best we can do is optimize as much as possible. We only have to iterate the first list, because every item in the second list that has a match will already have been found along the way.
// First we need copies to operate on
var firstCopy = new List<TypeOne>(firstList);
var secondCopy = new List<TypeTwo>(secondList);
// Now we iterate the first list once, completely
foreach (var typeOne in firstList)
{
var match = secondCopy.FirstOrDefault(s => s.baz == typeOne.foo);
if (match == null)
{
// Item in first but not in second
}
else
{
// Match is duplicate and shall be removed from both
firstCopy.Remove(typeOne);
secondCopy.Remove(match);
}
}
After running this, both copies contain only the items that have no match in the other list. This halves the number of outer loops, and the inner search gets shorter over time because the second copy shrinks with each match.
Use these LINQ queries:
var result1 = secondList.Where(p2 => !firstList.Any(p1 => p1.foo == p2.baz));
var result2 = firstList.Where(p1 => !secondList.Any(p2 => p2.baz == p1.foo));
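If the lists are large, the nested Any calls can be avoided by putting the keys of each list into a HashSet first, so every lookup is a hash lookup instead of a scan. A minimal sketch, assuming foo and baz are the int fields to compare as in the question (the sample values are made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var firstList = new List<TypeOne> { new TypeOne { foo = 1 }, new TypeOne { foo = 2 }, new TypeOne { foo = 3 } };
var secondList = new List<TypeTwo> { new TypeTwo { baz = 2 }, new TypeTwo { baz = 3 }, new TypeTwo { baz = 4 } };

// One pass over each list to collect the keys
var firstKeys = new HashSet<int>(firstList.Select(a => a.foo));
var secondKeys = new HashSet<int>(secondList.Select(b => b.baz));

// One more pass over each list, with a hash lookup per item
var onlyInFirst = firstList.Where(a => !secondKeys.Contains(a.foo)).ToList();
var onlyInSecond = secondList.Where(b => !firstKeys.Contains(b.baz)).ToList();

Console.WriteLine(string.Join(", ", onlyInFirst.Select(a => a.foo)));  // 1
Console.WriteLine(string.Join(", ", onlyInSecond.Select(b => b.baz))); // 4

// Types from the question, with public fields so the sketch compiles
class TypeOne { public int foo; public int bar; }
class TypeTwo { public int baz; public int qux; }
```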
I had a strange bug I don't understand. Changing LINQ's IEnumerable to a list halfway through fixed it, and I don't understand why.
This is not the real code, but it is very similar.
The code below doesn't work:
// an IEnumerable of some object (classes), internally an array
var ansestors = GetAnsestors();
var current = GetCurrentServerNode();
var result = from serverNode in ansestors
select new PolicyResult
{
//Some irrelevant stuff
OnNotAvailableNode = NodeProcessingActionEnum.ContinueExecution,
};
var thisNode = new PolicyResult
{
//Some irrelevant stuff
OnNotAvailableNode = NodeProcessingActionEnum.ThrowException,
};
result = result.Reverse();
result = result.Concat(new List<PolicyResult> { thisNode });
result.First().OnNotAvailableNode = NodeProcessingActionEnum.ThrowException;
// When looking in the debugger, and in logs, the first element of the
// result sequence has OnNotAvailableNode set to ContinueExecution
// Which doesn't make any sense...
But when I change the ending to the following, it works:
result = result.Reverse();
result = result.Concat(new List<PolicyResult> { thisNode });
var policyResults = result.ToList();
var firstPolicyResult = policyResults.First();
firstPolicyResult.OnNotAvailableNode = NodeProcessingActionEnum.ThrowException;
return policyResults;
All the types here are classes (reference types) except NodeProcessingActionEnum which is an enum.
Is this a bug?
Am I missing something crucial about LINQ?
Help?
result.First() executes the (deferred / lazy) query.
That line will set the value OK but when you use result later the query will be executed again.
Later you are looking at a newly fetched copy. The fact that it is different lets me assume that GetAnsestors() is also lazily evaluated and is not an in memory List<>
This means that ToList() is a worthwhile optimization as well as a fix. Note that after the ToList you can also use
var firstPolicyResult = policyResults[0];
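The effect is easy to reproduce in isolation. In this sketch, Box is a made-up stand-in for PolicyResult, and the Select plays the role of the query that creates new objects:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var source = new[] { 1, 2, 3 };

// Deferred query: the lambda runs again on every enumeration
IEnumerable<Box> query = source.Select(n => new Box { Value = n });

// This changes a freshly created Box that is thrown away immediately
query.First().Value = 100;
Console.WriteLine(query.First().Value); // 1

// ToList runs the query once and keeps the created objects
var list = query.ToList();
list[0].Value = 100;
Console.WriteLine(list[0].Value); // 100

// Hypothetical reference type standing in for PolicyResult
class Box { public int Value { get; set; } }
```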
The problem is that running First on your IEnumerable removes it from the enumerator so you're then checking the next element. Actually I've changed my mind - that's probably not it. This solution might be worth a shot, though.
You could wrap the IEnumerable with something which makes the change for you, e.g. using the Select overload which accepts an index too:
var modifiedResults = result.Select((r, index) => {
if (index == 0) {
// This is the first element
r.OnNotAvailableNode = NodeProcessingActionEnum.ThrowException;
}
return r;
});
(untested) should do the trick.
I asked a question in which one of the responses contained the following LINQ code:
var selected = lstAvailableColors.Cast<ListItem>().Where(i => i.Selected).ToList();
selected.ForEach( x => { lstSelectedColors.Items.Add(x); });
selected.ForEach( x => { lstAvailableColors.Items.Remove(x);});
Can someone explain the above LINQ to a total newbie?
The LINQ operators use what's called a fluent interface, so you can read the first line as a series of function calls. Assuming that lstAvailableColors is IEnumerable<T>, the idea is that each available color flows through the LINQ operators.
Let's break it down:
var selected = lstAvailableColors
// each item is cast to ListItem type
.Cast<ListItem>()
// items that don't pass the test (Selected == true) are dropped
.Where(i => i.Selected)
// turn the stream into a List<ListItem> object
.ToList();
EDIT: As JaredPar pointed out, the last line above (ToList()) is very important. If you didn't do this, then each of the two selected.ForEach calls would re-run the query. This is called deferred execution and is an important part of LINQ.
You could rewrite this first line like this:
var selected = new List<ListItem>();
foreach (var item in lstAvailableColors)
{
var listItem = (ListItem)item;
if (!listItem.Selected)
continue;
selected.Add(listItem);
}
The last two lines are just another way to write a foreach loop and could be rewritten as:
foreach (var x in selected)
{
lstSelectedColors.Items.Add(x);
}
foreach (var x in selected)
{
lstAvailableColors.Items.Remove(x);
}
Probably the hardest part of learning LINQ is learning the flow of data and the syntax of lambda expressions.
Explanation from original question.
The LINQ version works in two parts. The first part is the first line, which finds the currently selected items and stores them in a List. It's very important that the line contains the .ToList() call because that forces the query to execute immediately rather than being deferred.
The next two lines iterate through each value which is selected and remove or add it to the appropriate list. Because the selected list is already stored we are no longer enumerating the collection when we modify it.
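A small sketch of why the snapshot matters, using a plain List<string> in place of the list control's Items collection (an assumption; the original uses ListItem objects):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var available = new List<string> { "Red", "Green", "Blue" };
var threw = false;

try
{
    // Removing from the list while it is being enumerated is not allowed
    foreach (var colour in available)
    {
        if (colour.StartsWith("G")) available.Remove(colour);
    }
}
catch (InvalidOperationException)
{
    threw = true;
}

Console.WriteLine(threw); // True

// With ToList the loop runs over a snapshot, so modifying the source is fine
foreach (var colour in available.Where(c => c.StartsWith("B")).ToList())
{
    available.Remove(colour);
}

Console.WriteLine(string.Join(", ", available)); // Red
```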
It casts each item in the list to type ListItem, then selects only those whose Selected property is true. It then creates a new list containing just these items. For each item in the resulting list, it adds that item to the selected colors list and removes it from the available colors list.
Maybe some translations would help
var selected = lstAvailableColors.Cast<ListItem>().Where(i => i.Selected).ToList();
could be written as:
List<ListItem> selected = new List<ListItem>();
foreach (ListItem item in lstAvailableColors)
{
if (item.Selected)
selected.Add(item);
}
Note that foreach implicitly casts the items on the list to whatever type the loop variable is, in this case ListItem, so that takes care of the Cast<ListItem> on the list. Where filters out any items for which the expression is false, so I do the same thing with an if statement. Finally, ToList turns the sequence into a list, so I just build up a list as I go. The end result is the same.
And:
selected.ForEach( x => { lstSelectedColors.Items.Add(x); });
selected.ForEach( x => { lstAvailableColors.Items.Remove(x); });
could be written as:
foreach (ListItem item in selected)
{
lstSelectedColors.Items.Add(item);
lstAvailableColors.Items.Remove(item);
}
I doubt if there's a good reason for writing it the more obscure way in that case.