Querying a chain of list of lists with LINQ - c#

I am working with an XML standard called SDMX. It's fairly complicated, but I'll keep this as short as possible. I am receiving an object called CategoryScheme. This object can contain a number of Category objects, each Category can contain more Category objects, and so on; the chain can be arbitrarily deep. Every Category has a unique ID.
Usually each Category contains a lot of child categories. Together with this object I receive an array containing the list of IDs that indicates where a specific Category is nested, and then the ID of that category.
What I need to do is create an object that maintains the hierarchy of the Category objects, but each Category must have only one child, and that child has to be the one on the path that leads to the specific Category.
So I had an idea, but in order to do this I would have to generate LINQ queries inside a loop, and I have no clue how to do this. More information about what I wanted to try is in the comments inside the code.
Let's go to the code:
public void RemoveCategory(ArtefactIdentity ArtIdentity, string CategoryID, string CategoryTree)
{
    try
    {
        WSModel wsModel = new WSModel();
        // Prepare Art Identity and Array
        ArtIdentity.Version = ArtIdentity.Version.Replace("_", ".");
        var CatTree = JArray.Parse(CategoryTree).Reverse();
        // Get Category Scheme
        ISdmxObjects SdmxObj = wsModel.GetCategoryScheme(ArtIdentity, false, false);
        ICategorySchemeMutableObject CatSchemeObj = SdmxObj.CategorySchemes.FirstOrDefault().MutableInstance;
        foreach (var Cat in CatTree)
        {
            // The cycle should work like this.
            // At every iteration it must delete all the elements except the correct one
            // and on the next iteration it must delete all the elements of the previously selected element
            // At the end, I need to have the CatSchemeObj full of the all chains of categories.
            // Iteration 1...
            //CatSchemeObj.Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
            // Iteration 2...
            //CatSchemeObj.Items.ToList().SingleOrDefault().Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
            // Iteration 3...
            //CatSchemeObj.Items.ToList().SingleOrDefault().Items.ToList().SingleOrDefault().Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
            // Etc...
        }
    }
    catch (Exception ex)
    {
        throw ex;
    }
}
Thank you for your help.

So, as I already said in my comment, building a recursive function should fix the issue. If you're new to it, you can find some basic information about recursion in C# here.
The method could look something like this:
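private void DeleteRecursively(int currentRecursionLevel, string[] catTree, ICategorySchemeMutableObject catSchemeObj)
{
    // Stop once every ID in the path has been consumed
    if (currentRecursionLevel >= catTree.Length) return;
    // Note: ToList() creates a copy, so this RemoveAll does not change catSchemeObj.Items itself
    catSchemeObj.Items.ToList().RemoveAll(x => x.Id != catTree[currentRecursionLevel]);
    var leftoverObject = catSchemeObj.Items.ToList().SingleOrDefault();
    if (leftoverObject != null) DeleteRecursively(currentRecursionLevel + 1, catTree, leftoverObject);
}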
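The recursive idea can be tried out in isolation. The following is only a minimal sketch, not SDMX code: Node, Children and PathPruner are hypothetical names invented for this illustration, and it assumes the children live in a concrete List<Node> so that RemoveAll changes the tree in place.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a category; not part of any SDMX library.
public sealed class Node
{
    public string Id { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string id, params Node[] children)
    {
        Id = id;
        Children.AddRange(children);
    }
}

public static class PathPruner
{
    // Keeps only the child whose Id matches path[level], then descends into it.
    // Children is a concrete List<Node>, so RemoveAll mutates the tree in place.
    public static void KeepOnlyPath(Node parent, IReadOnlyList<string> path, int level = 0)
    {
        if (level >= path.Count)
        {
            return; // ran out of IDs; deeper levels are left untouched
        }

        parent.Children.RemoveAll(child => child.Id != path[level]);

        // At most one child remains, assuming IDs are unique among siblings.
        if (parent.Children.Count == 1)
        {
            KeepOnlyPath(parent.Children[0], path, level + 1);
        }
    }
}
```

Calling PathPruner.KeepOnlyPath(root, new[] { "A", "A2" }) on such a tree would leave root with the single child "A", and "A" with the single child "A2".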
Afterwards you can call this method in your main method, instead of the loop:
DeleteRecursively(0, CatTree.Select(t => t.ToString()).ToArray(), CatSchemeObj);
But as I also said, keep in mind that calling the method inside the loop seems pointless to me: after the first call the tree is already cleared except for the one remaining path, so calling the method again with the same tree but another category will result in an empty tree (in CatSchemeObj).
CAUTION! Another thing I noticed just now: calling ToList on your Items property and then deleting entries will NOT affect your source object, because ToList creates a new list. The new list still references the original objects, but a deletion only affects that list. So you must write the resulting list back to your Items property, or find a way to delete directly in the Items object. (Assuming it's an IEnumerable and not a concrete collection type, you should write it back.)
Just try it out with this simple example, and you will see that the original list is not modified.
IEnumerable<int> test = new List<int>() { 1, 2, 3, 4 , 1 };
test.ToList().RemoveAll(a => a != 1);
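To make the effect visible, here is a small sketch that counts the elements before and after. ToListCopyDemo and its method names are made up for this illustration; only ToList, RemoveAll, Where, Clear and AddRange are the standard library calls.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ToListCopyDemo
{
    // Filters only the copy and reports how many elements each list holds afterwards.
    public static (int SourceCount, int CopyCount) FilterCopy()
    {
        IEnumerable<int> test = new List<int> { 1, 2, 3, 4, 1 };
        List<int> copy = test.ToList();   // a new list holding the same elements
        copy.RemoveAll(a => a != 1);      // removes from the copy only
        return (test.Count(), copy.Count);
    }

    // Writing the filtered elements back is what makes the change visible in the source.
    public static List<int> FilterAndWriteBack(List<int> source)
    {
        List<int> filtered = source.Where(a => a == 1).ToList();
        source.Clear();
        source.AddRange(filtered);
        return source;
    }
}
```

FilterCopy leaves the source at five elements while the copy shrinks to two; FilterAndWriteBack is one possible way to push the result back when the source is a concrete List<T>.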

Edited:
So here is another possible approach, following the discussion below.
I'm not sure what you really need, so just try it out.
// Walk down the tree one level per ID in CatTree
var ids = CatTree.Select(t => t.ToString()).ToArray();
var list = CatSchemeObj.Items.ToList();
int counter = 0;
// stop when the IDs run out or the current level has no children
while (counter < ids.Length && list.Count > 0)
{
    var id = ids[counter++];
    var match = list.SingleOrDefault(x => x.Id == id); // the child on the path, or null
    if (match == null)
    {
        break;
    }
    list = match.Items.ToList();
}
I just translated your problem into 2 solutions, but I am not sure whether you will run into trouble because of the SingleOrDefault call. It returns the only element of the sequence, or the default value if the sequence is empty, and throws an InvalidOperationException if there is more than one element. I know you said only one item is left at that point, but still... :)
Let me know in a comment whether this worked for you or not.
//solution 1
// inside this loop, check whether each child list is empty
foreach (var Cat in CatTree)
{
    var list = CatSchemeObj.Items.ToList();
    // keep descending while there are children left to filter
    while (list.Count > 0)
    {
        list.RemoveAll(x => x.Id != Cat.ToString());
        var child = list.SingleOrDefault();
        if (child == null)
        {
            break;
        }
        list = child.Items.ToList();
    }
}
//solution 2
foreach (var Cat in CatTree)
{
    var list = CatSchemeObj.Items.ToList();
    // check before you call it or you will get an error
    if (list.Count > 0)
    {
        CleanTheCat(Cat.ToString(), list);
    }
}
// define this recursive function outside of the loop because it will call itself
void CleanTheCat(string cat, List<ICategoryMutableObject /* assumed item type; place here whatever type Items holds */> items)
{
    items.RemoveAll(x => x.Id != cat);
    var catObj = items.SingleOrDefault();
    if (catObj != null)
    {
        CleanTheCat(cat, catObj.Items.ToList());
    }
}

Thank you to whoever tried to help but I solved it by myself in a much easier way.
I just sent the full CategoryScheme object to the method that converts it to XML, and then a single line did the trick:
XmlDocument.Descendants("Category").Where(x => !CatList.Contains(x.Attribute("id").Value)).RemoveIfExists();

Related

C# Recursive Search an array of Objects.parent_id for value, then search those and so on till none left

I'm looking for a way to find an object by its id, collect every object that is parented to it (directly or indirectly) in an array of objects, and then set object.missed = true on each of them.
Each object has an id and a parent_id. If the object doesn't have a parent, parent_id = id.
I know how to do it for one level of parent_id's. How can I go unlimited levels deep? Below is the code I have for searching 1 level.
public class EPlan
{
public int id;
public int parent_id;
public bool is_repeatable;
public bool missed;
}
EPlan[] plans = Array.FindAll(eventsPlan, item => item.parent_id == event_id);
foreach (EPlan plan in plans)
{
plan.missed = true;
plan.is_repeatable = false;
}
I'm trying to search for event_id, an int. So I search all of the object.id's for event_id. Once I find object.id == event_id, I need to set object.is_repeatable = false and object.missed = true.
Then I need to search all of the objects' parent_id's for the current object.id (event_id) and change all of those objects in the same way.
Then I need to check all of those object.id's against all of the object.parent_id's and do the same to those. Like a tree effect: 1 event was missed, and any of the events that are parented to that event need to be set as missed as well.
So far, all I can do is get 1 level deep, or code multiple nested foreach loops. But it could be 10 or more levels deep, so that doesn't make sense.
Any help is appreciated. There has to be a better way than the multiple loops.
I too was confused by the question, save for the one line you said:
1 event was missed, and any of the events that are parented to that event need to be set as missed as well.
With that in mind, I suggest the following code will do what you're looking for. Each time you call the method, it finds all of the objects in the array whose parent_id matches one of the given IDs and sets missed and is_repeatable accordingly.
It also keeps a running list of the IDs of the children it found during this scan. Once the inner loop is finished it calls itself, using that list instead of the passed-in list of event IDs it just used. That is the trick that makes the recursion work here.
To start the process off, call the method with the single event ID you used for the 1-level search.
findEvents(new List<int> { event_id }, eventsPlan);
private void findEvents(List<int> eventIDs, EPlan[] eventsPlan)
{
    foreach (int eventID in eventIDs)
    {
        // Children of this event; a root points at itself (parent_id == id), so skip it
        EPlan[] plans = Array.FindAll(eventsPlan, item => item.parent_id == eventID && item.id != eventID);
        List<int> childIDs = new List<int>();
        foreach (EPlan plan in plans)
        {
            plan.missed = true;
            plan.is_repeatable = false;
            // Collect the child's own id so the next call finds its children
            childIDs.Add(plan.id);
        }
        if (childIDs.Count > 0)
            findEvents(childIDs, eventsPlan);
    }
}
I also recommend that, if you have the chance to reengineer this code, you use a generic collection (like List<EPlan>) instead of arrays. That avoids the performance penalty this code has, because it builds a new array in memory each time you call the Array.FindAll method. Using a generic collection, or even a plain foreach loop, will be faster when processing a lot of data here.
Update 1:
To answer your question about how you might go about this using a Generic Collection instead:
private void findEventsAsList(List<int> eventIDs, List<EPlan> eventsPlans)
{
    List<int> childIDs = new List<int>();
    // Skip roots, which point at themselves (parent_id == id)
    foreach (EPlan plan in eventsPlans.Where(p => eventIDs.Contains(p.parent_id) && p.id != p.parent_id))
    {
        plan.missed = true;
        plan.is_repeatable = false;
        childIDs.Add(plan.id);
    }
    // Stop when no further children were found
    if (childIDs.Count > 0)
        findEventsAsList(childIDs, eventsPlans);
}
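If very deep chains are a concern, the same traversal can be written without recursion. The sketch below is one possible variant, not the answer's method: MissedEvents and MarkMissed are hypothetical names, EPlan is repeated from the question so the block is self-contained, and it assumes ids are unique and that roots have parent_id == id. Unlike the methods above, it also marks the starting event itself, as the question asks.

```csharp
using System.Collections.Generic;
using System.Linq;

public class EPlan
{
    public int id;
    public int parent_id;
    public bool is_repeatable;
    public bool missed;
}

public static class MissedEvents
{
    // Marks the event with the given id and all of its descendants as missed.
    public static void MarkMissed(IEnumerable<EPlan> plans, int eventId)
    {
        List<EPlan> all = plans.ToList();

        // Assumption: ids are unique, otherwise ToDictionary throws.
        Dictionary<int, EPlan> byId = all.ToDictionary(p => p.id);

        // Index children by parent once; roots point at themselves and are skipped.
        ILookup<int, EPlan> childrenByParent = all
            .Where(p => p.parent_id != p.id)
            .ToLookup(p => p.parent_id);

        var visited = new HashSet<int>();   // guards against cycles in the data
        var pending = new Queue<int>();
        pending.Enqueue(eventId);

        while (pending.Count > 0)
        {
            int current = pending.Dequeue();
            if (!visited.Add(current))
            {
                continue; // already handled
            }

            if (byId.TryGetValue(current, out EPlan plan))
            {
                plan.missed = true;
                plan.is_repeatable = false;
            }

            // A lookup returns an empty sequence for ids that have no children.
            foreach (EPlan child in childrenByParent[current])
            {
                pending.Enqueue(child.id);
            }
        }
    }
}
```

Building the lookup once means each level is a dictionary access instead of another scan over the whole collection.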

What is the most efficient way to find elements in a list that do not exist in another list and vice versa?

Consider you have two lists in C#, first list contains elements of TypeOne and second list contains elements of TypeTwo:
TypeOne
{
int foo;
int bar;
}
TypeTwo
{
int baz;
int qux;
}
Now I need to find elements (with some property value) in the first list that don't exist in the second list, and similarly I want to find elements in the second list that don't exist in the first list. (There are only zero or one occurrences in either list.)
What I tried so far is to iterate both lists like this:
foreach (var item in firstList)
{
    if (!secondList.Any(a => a.baz == item.foo))
    {
        // Item is in the first list but not in second list.
    }
}
and again:
foreach (var item in secondList)
{
    if (!firstList.Any(a => a.foo == item.baz))
    {
        // Item is in the second list but not in first list.
    }
}
I hardly think this is a good way to do what I want. I'm iterating my lists two times and use Any in each of them which also iterates the list. So too many iterations.
What is the most efficient way to achieve this?
I am afraid there is no prebuilt solution for this, so the best we can do is optimize as much as possible. We only have to iterate the first list, because by then everything in the second list will already have been compared.
// First we need copies to operate on
var firstCopy = new List<TypeOne>(firstList);
var secondCopy = new List<TypeTwo>(secondList);
// Now we iterate the first list once complete
foreach (var typeOne in firstList)
{
    var match = secondCopy.FirstOrDefault(s => s.baz == typeOne.foo);
    if (match == null)
    {
        // Item in first but not in second
    }
    else
    {
        // Match is duplicate and shall be removed from both
        firstCopy.Remove(typeOne);
        secondCopy.Remove(match);
    }
}
After running this both copies will only contain the values which are unique in this instance. This not only reduces it to half the number of iterations but also constantly improves because the second copy shrinks with each match.
Use this LINQ query.
var result1 = secondList.Where(p2 => !firstList.Any(p1 => p1.foo == p2.baz));
var result2 = firstList.Where(p1 => !secondList.Any(p2 => p2.baz == p1.foo));
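Both queries above still scan the other list once per element. A common alternative, sketched here under the assumption that foo and baz are the comparison keys, is to collect each list's keys in a HashSet<int> first, so every membership test is a hash lookup. ListDiff and Compare are names invented for this sketch, and the two types are repeated from the question to keep the block self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public class TypeOne { public int foo; public int bar; }
public class TypeTwo { public int baz; public int qux; }

public static class ListDiff
{
    // Returns the elements that appear in only one of the two lists,
    // matching TypeOne.foo against TypeTwo.baz.
    public static (List<TypeOne> OnlyInFirst, List<TypeTwo> OnlyInSecond) Compare(
        List<TypeOne> firstList, List<TypeTwo> secondList)
    {
        // One pass over each list to collect its keys.
        var fooKeys = new HashSet<int>(firstList.Select(a => a.foo));
        var bazKeys = new HashSet<int>(secondList.Select(b => b.baz));

        // One more pass over each list; Contains on a HashSet is a hash lookup.
        List<TypeOne> onlyInFirst = firstList.Where(a => !bazKeys.Contains(a.foo)).ToList();
        List<TypeTwo> onlyInSecond = secondList.Where(b => !fooKeys.Contains(b.baz)).ToList();

        return (onlyInFirst, onlyInSecond);
    }
}
```

This trades a little extra memory for the two sets against the nested scans of the Any-based versions.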

Function to linq conversion

I have a function which I believe can be simplified into LINQ but have been unable to do so yet.
The function looks like this:
private IList<Colour> GetDifference(IList<Colour> firstList, IList<Colour> secondList)
{
// Create a new list
var list = new List<Colour>();
// Loop through the first list
foreach (var first in firstList)
{
// Create a boolean and set to false
var found = false;
// Loop through the second list
foreach (var second in secondList)
{
// If the first item id is the same as the second item id
if (first.Id == second.Id)
{
// Mark it has being found
found = true;
}
}
// After we have looped through the second list, if we haven't found a match
if (!found)
{
// Add the item to our list
list.Add(first);
}
}
// Return our differences
return list;
}
Can this be converted to a LINQ expression easily?
What is Colour? If it overrides Equals to compare by Id then this would work:
firstList.Except(secondList);
If Colour does not override Equals or it would be wrong for you to do so in the wider context, you could implement an IEqualityComparer<Colour> and pass this as a parameter:
firstList.Except(secondList, comparer);
See the documentation
As noted in the comments below, Except has the added side effect of removing any duplicates in the source (firstList in this example). This may or may not be an issue to you, but should be considered.
If keeping any duplicates in firstList is of importance, then this is the alternative:
var secondSet = new HashSet<Colour>(secondList, comparer);
var result = firstList.Where(c => !secondSet.Contains(c));
As before, comparer is optional if Colour implements appropriate equality
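For completeness, here is a rough sketch of what such a comparer could look like. The question does not show Colour, so the class below is a minimal stand-in that assumes an int Id; ColourIdComparer and ColourDiff are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in; the real Colour class is not shown in the question.
public class Colour
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Treats two colours as equal when their Ids match.
public sealed class ColourIdComparer : IEqualityComparer<Colour>
{
    public bool Equals(Colour x, Colour y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    // Must agree with Equals: equal Ids give equal hash codes.
    public int GetHashCode(Colour obj) => obj is null ? 0 : obj.Id.GetHashCode();
}

public static class ColourDiff
{
    // Same result as the original nested loops, duplicates in firstList included.
    public static IList<Colour> GetDifference(IList<Colour> firstList, IList<Colour> secondList)
    {
        var secondSet = new HashSet<Colour>(secondList, new ColourIdComparer());
        return firstList.Where(c => !secondSet.Contains(c)).ToList();
    }
}
```

The same comparer instance could also be passed to firstList.Except(secondList, comparer) when dropping duplicates is acceptable.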
Try the following:
var result = firstList.Where(x => !secondList.Any(y => y.Id == x.Id));
Edit:
If you care about runtime and don't mind creating your own IEqualityComparer<>, I would suggest you use Except as Charles suggested in his answer. Except seems to use a hash table for the second list, which speeds it up quite a bit compared to my O(n*m) query. However, be aware that Except also removes duplicates: each distinct element of firstList appears at most once in the result.

Why does this LINQ query new up only one instance of the internal List?

Upon request, I have simplified this question. When trying to take two generic List and blend them, I get unexpected results.
private List<ConditionGroup> GetConditionGroupParents()
{
return (from Conditions in dataContext.Conditions
orderby Conditions.Name
select new ConditionGroup
{
GroupID = Conditions.ID,
GroupName = Conditions.Name,
/* PROBLEM */ MemberConditions = new List<Condition>()
}).ToList();
}
private List<ConditionGroup> BuildConditionGroups()
{
var results = GetConditionGroupParents();
// contents of ConditionMaps is irrelevant to this matter
List<ConditionMap> ConditionMaps = GenerateGroupMappings();
// now pair entries from the map into their appropriate group,
// adding them to the proper List<MemberConditions> as appropriate
foreach (var map in ConditionMaps)
{
results.Find(groupId => groupId.GroupID == map.GroupID)
.MemberConditions.Add(new ConditionOrphan(map));
}
return results;
}
I would expect each map in ConditionMaps to be mapped to a single ConditionGroup's MemberConditions in the "results.Find...." statement.
Instead, each map is being added to the list of every group, and that happens simultaneously/concurrently.
[edit] I've since proven that there is only a single instance of
List<Memberconditions>, being referenced by each group.
I unrolled the creation of the groups like so:
.
.
.
/* PROBLEM */ MemberConditions = null }).ToList();
foreach (var result in results)
{
    List<Condition> memberConditions = new List<Condition>();
    result.MemberConditions = memberConditions;
}
return results;
In that case I was able to watch each instantiation stepping
through the loop, and then it worked as expected. My question
remains, though, why the original code only created a single
instance. Thanks!
.
Why doesn't the LINQ query in GetConditionGroupParents "new up" a unique MemberConditions list for each Group, as indicated in the /* PROBLEM */ comment above?
Any insight is appreciated. Thanks!
Jeff Woods of
Reading, PA
This is a bug. As a workaround you can create a factory function
static List<T> CreateList<T>(int dummy) { ... }
And pass it any dummy value depending on the current row such as Conditions.ID.
This trick works because L2S, unlike EF, is capable of calling non-translatable functions in the last Select of the query. You will not have fun migrating to EF since they have not implemented this (yet).

C# List RemoveAt(0)

I have a list and want to iterate smoothly through it while removing one element after another. I thought I could do it like this:
List<Point> open = new List<Point>();
...
while (!(open == null))
{
Point p = open.RemoveAt(0);
...
However, it is not quite working how I would like it to, starting with "Cannot implicitly convert type 'void' to 'Point'". But shouldn't the call of RemoveAt give the point to P before removing it/making it void?
List.RemoveAt does not return the item you are removing. Also, the list will not become null when you remove all items; it will become empty, i.e. its Count will be 0. I would suggest using Queue<T> instead of List<T>, so that you can remove the first added item and get it at the same time:
Queue<Point> open = new Queue<Point>();
while (open.Count > 0)
{
    var point = open.Dequeue();
    // ...
}
If you want to use a list and remove the first items, you should retrieve the item by index and only then remove it from the list:
List<Point> open = new List<Point>();
while (open.Count > 0) // or open.Any()
{
    Point p = open[0];
    open.RemoveAt(0);
    // ...
}
No, it does not. It does not return anything, as per the specification. Try using a Queue<Point> instead. Also, removing the first item of a List<T> shifts all remaining elements down by one, so it is an O(n) operation. Avoid repeatedly removing the first element of a list, and always try to find the data structure that best fits your particular problem!
Example:
var open = new Queue<Point>();
// ... Fill it
// Any() is in general faster than Count() for checking that a collection has data.
// It is good practice to use it in general, although Count (the property) is just as fast,
// but not all enumerables have that one
while (open.Any())
{
    Point p = open.Dequeue();
    // ... Do stuff
}
