LINQ to SQL- Check if object has selected child, grandchild etc

I've been looking for an answer everywhere, but can't find anything. I have two tables, Media and Keywords, which have a many-to-many relationship. The Keywords table is quite simple: it has an ID, a Name and a ParentFK column that refers to the ID column (it's a tree structure).
The user can assign any single keyword to a media file, which means they can select a leaf without selecting the root or branch.
Now I have to be able to determine whether a root keyword has any child, grandchild etc. that is assigned to a media object, but I have to do it starting from the root.
Any help will be appreciated.

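Just look for any entry whose ParentFK is set to your ID.
public static bool HasChild(int id) {
    return db.Keywords.Any(item => item.ParentFK == id);
}
public static bool HasGrandChilds(int id) {
    // Written as a nested Any so the whole check can be translated to SQL;
    // a call to HasChild inside the query could not be.
    return db.Keywords
        .Where(item => item.ParentFK == id)
        .Any(item => db.Keywords.Any(grandChild => grandChild.ParentFK == item.ID));
}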
A more generic way:
public static bool HasGrandChilds(int id, int depth) {
    // IDs of the keywords on the current level; level 0 is the given keyword.
    // depth 1 checks for children, depth 2 for grandchildren, and so on.
    var parentIds = new List<int> { id };
    for (var i = 0; i < depth; i++) {
        // Search all entries whose parent is one of the possible parents.
        // Assumes ParentFK is a nullable int; Contains on a local list
        // is translated to a SQL IN clause.
        var currentIds = parentIds;
        parentIds = db.Keywords
            .Where(item => item.ParentFK.HasValue && currentIds.Contains(item.ParentFK.Value))
            .Select(item => item.ID)
            .ToList();
        if (!parentIds.Any())
        {
            // If no more children were found, the searched depth doesn't exist
            return false;
        }
    }
    return true;
}

From your current schema I can't think of a better solution than the following:
Issue a query to retrieve a list of all children of the root.
Issue queries to retrieve a list of all children of the children from the previous step.
Continue recursively to build a list of all descendants of the root.
Finally, query the DB for all media objects that have any of the keywords in the list.
This algorithm entails multiple calls to the DB. You can do it in a single query if you refine your schema a little. I would suggest that you keep for each keyword not only its parent FK, but also its root FK. This way you could issue a single query to get all objects that have a keyword whose root FK is the desired one.
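The level-by-level walk described above could be sketched roughly as follows. This is an in-memory illustration, not LINQ to SQL code: the Keyword class, the assignedKeywordIds set (standing in for the keyword IDs found in the Media/Keywords junction table) and the method names are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the Keywords table.
public class Keyword
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int? ParentFK { get; set; }
}

public static class KeywordTree
{
    // Collects the IDs of all descendants of rootId, walking the tree breadth-first.
    public static HashSet<int> GetDescendantIds(IEnumerable<Keyword> keywords, int rootId)
    {
        // Index the children by parent once so each lookup is cheap.
        var childrenByParent = keywords
            .Where(k => k.ParentFK.HasValue)
            .ToLookup(k => k.ParentFK.Value, k => k.ID);

        var descendants = new HashSet<int>();
        var pending = new Queue<int>();
        pending.Enqueue(rootId);
        while (pending.Count > 0)
        {
            foreach (var childId in childrenByParent[pending.Dequeue()])
            {
                // Add returns false for an ID seen before, which also guards against cycles.
                if (descendants.Add(childId))
                {
                    pending.Enqueue(childId);
                }
            }
        }
        return descendants;
    }

    // True if any descendant of rootId is among the keyword IDs assigned to media.
    public static bool HasAssignedDescendant(
        IEnumerable<Keyword> keywords, int rootId, ISet<int> assignedKeywordIds)
    {
        return GetDescendantIds(keywords, rootId).Overlaps(assignedKeywordIds);
    }
}
```

Against a real database each level would be one query, which is the cost the root FK suggestion is meant to avoid.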

Related

Querying a chain of list of lists with LINQ

I am working with an XML standard called SDMX. It's fairly complicated, but I'll keep it as short as possible. I am receiving an object called CategoryScheme. This object can contain a number of Category objects, each Category can contain more Category objects, and so on; the chain can be arbitrarily deep. Every Category has a unique ID.
Usually each Category contains a lot of Categories. Together with this object I am receiving an array that contains the list of IDs indicating where a specific Category is nested, followed by the ID of that Category.
What I need to do is create an object that maintains the hierarchy of the Category objects, but each Category must have only one child, and that child has to be the one on the path that leads to the specific Category.
So I had an idea, but in order to do this I would have to generate LINQ queries inside a loop, and I have no clue how to do this. More information on what I wanted to try is in the comments inside the code.
Let's go to the code:
public void RemoveCategory(ArtefactIdentity ArtIdentity, string CategoryID, string CategoryTree)
{
try
{
WSModel wsModel = new WSModel();
// Prepare Art Identity and Array
ArtIdentity.Version = ArtIdentity.Version.Replace("_", ".");
var CatTree = JArray.Parse(CategoryTree).Reverse();
// Get Category Scheme
ISdmxObjects SdmxObj = wsModel.GetCategoryScheme(ArtIdentity, false, false);
ICategorySchemeMutableObject CatSchemeObj = SdmxObj.CategorySchemes.FirstOrDefault().MutableInstance;
foreach (var Cat in CatTree)
{
// The cycle should work like this.
// At every iteration it must delete all the elements except the correct one
// and on the next iteration it must delete all the elements of the previously selected element
// At the end, I need to have the CatSchemeObj full of the all chains of categories.
// Iteration 1...
//CatSchemeObj.Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
// Iteration 2...
//CatSchemeObj.Items.ToList().SingleOrDefault().Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
// Iteration 3...
//CatSchemeObj.Items.ToList().SingleOrDefault().Items.ToList().SingleOrDefault().Items.ToList().RemoveAll(x => x.Id != Cat.ToString());
// Etc...
}
}
catch (Exception ex)
{
throw ex;
}
}
Thank you for your help.

So, as I already said in my comment, building a recursive function should fix the issue. If you're new to it, you can find some basic information about recursion in C# here.
The method could look something like this:
private void DeleteRecursively(int currentRecursionLevel, string[] catTree, ICategorySchemeMutableObject catSchemeObj)
{
catSchemeObj.Items.ToList().RemoveAll(x => x.Id != catTree[currentRecursionLevel].ToString());
var leftoverObject = catSchemeObj.Items.ToList().SingleOrDefault();
if(leftoverObject != null) DeleteRecursively(++currentRecursionLevel, catTree, leftoverObject);
}
Afterwards you can call this method in your main method, instead of the loop:
DeleteRecursively(0, CatTree.Select(c => c.ToString()).ToArray(), CatSchemeObj);
But as I also said, keep in mind that calling the method inside the loop seems pointless to me: you have already cleared the tree except for the one leftover path, so calling the method with the same tree but another category will result in an empty tree (in CatSchemeObj).
CAUTION! Another thing I noticed just now: calling ToList on your Items property and then deleting entries will NOT affect your source object, because ToList creates a new list. The new list holds references to the original objects, but a deletion only affects the list. So you must write the resulting list back to your Items property, or find a way to delete directly in the Items object. (Assuming it's an IEnumerable and not a concrete collection type, you should write it back.)
Just try it out with this simple example, and you will see that the original list is not modified.
IEnumerable<int> test = new List<int>() { 1, 2, 3, 4 , 1 };
test.ToList().RemoveAll(a => a != 1);
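To make the difference visible, here is a small sketch that compares removing from a ToList() copy with removing from the source list itself. The class and method names are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToListCopyDemo
{
    // Removes from a ToList() copy; the source list keeps all of its elements.
    public static int CountAfterRemovingFromCopy(List<int> source)
    {
        source.ToList().RemoveAll(a => a != 1);
        return source.Count;
    }

    // Removes from the source list itself.
    public static int CountAfterRemovingFromSource(List<int> source)
    {
        source.RemoveAll(a => a != 1);
        return source.Count;
    }
}
```

For the list { 1, 2, 3, 4, 1 } the first method returns 5 and the second returns 2.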

Edited:
So here is another possible way of going about it after the discussion below.
I'm not sure what you really need, so just try it out.
int counter = 0;
var path = CatTree.Select(c => c.ToString()).ToList();
var list = CatSchemeObj.Items.ToList();
// Walk down one level per ID in the path until the path or the tree ends
while (list.Count > 0 && counter < path.Count)
{
    var id = path[counter++];
    var match = list.SingleOrDefault(x => x.Id == id); // or != ? play with it.
    if (match == null)
    {
        break;
    }
    // Assumes a category exposes its children through Items, like the scheme does
    list = match.Items.ToList();
}
I just translated your problem into 2 solutions, but be careful with the SingleOrDefault call: it returns the only element, or the default value if the sequence is empty, and throws an InvalidOperationException if there is more than one element. I know you said you have only 1 item at that point, but still... :)
Let me know in a comment whether this worked for you or not.
//solution 1
// walk down the tree one level per ID and check each child list if empty or not
// (assumes Items is a mutable IList, so removing from it changes the source object)
var items = CatSchemeObj.Items;
foreach (var Cat in CatTree)
{
    //check before you call it or you will get an error
    if (items == null || items.Count == 0)
    {
        break;
    }
    // remove directly from Items; ToList() only snapshots the entries to delete
    foreach (var other in items.Where(x => x.Id != Cat.ToString()).ToList())
    {
        items.Remove(other);
    }
    var remaining = items.SingleOrDefault();
    if (remaining == null)
    {
        break;
    }
    items = remaining.Items;
}
//solution 2
var path = new Queue<string>(CatTree.Select(c => c.ToString()));
//check before you call it or you will get an error
if (CatSchemeObj.Items != null)
{
    CleanTheCat(path, CatSchemeObj.Items);
}
//use this recursive function instead of a loop because it will call itself
void CleanTheCat(Queue<string> path, IList<ICategoryMutableObject> /*Place here whatever type you have*/ items)
{
    if (path.Count == 0)
    {
        return;
    }
    var cat = path.Dequeue();
    foreach (var other in items.Where(x => x.Id != cat).ToList())
    {
        items.Remove(other);
    }
    var catObj = items.SingleOrDefault();
    if (catObj != null)
    {
        CleanTheCat(path, catObj.Items);
    }
}

Thank you to whoever tried to help but I solved it by myself in a much easier way.
I just sent the full CategoryScheme object to the method that converted it in the XML format, then just one line did the trick:
XmlDocument.Descendants("Category").Where(x => !CatList.Contains(x.Attribute("id").Value)).RemoveIfExists();

A special C# Tree algorithm in Umbraco CMS

I'm creating a special tree algorithm and I need a bit of help with the code that I currently have, but before you take a look at it, please let me explain what it is meant to do.
I have a tree structure and I'm interacting with a node (any of the nodes in the tree; these nodes are Umbraco CMS classes). Upon interaction I walk the tree up to the root and store these nodes in a global collection (List<Node> in this particular case). So far, so good. But upon interaction with another node I must check whether the list already contains the parents of the clicked node. If it contains every parent but not this node, then the interaction is on the lowest level (I hope you are still with me?).
Unfortunately, calling the Contains() function in Umbraco CMS doesn't detect that the list already contains the values, so the same values get added all over again even though I added the Contains() check.
Can anyone give me a hand here if they have already met such a problem? I exchanged the Contains() function for the Except and Union functions, and they yield the same result: the list still contains duplicates.
var currentValue = (string)CurrentPage.technologies;
List<Node> globalNodeList = new List<Node>();
string[] result = currentValue.Split(',');
foreach (var item in result)
{
var node = new Node(int.Parse(item));
if (globalNodeList.Count > 0)
{
List<Node> nodeParents = new List<Node>();
if (node.Parent != null)
{
while (node != null)
{
if (!nodeParents.Contains(node))
{
nodeParents.Add(node);
}
node = (Node)node.Parent;
}
}
else { globalNodeList.Add(node); }
if (nodeParents.Count > 0)
{
var differences = globalNodeList.Except<Node>(globalNodeList);
globalNodeList = globalNodeList.Union<Node>(differences).ToList<Node>();
}
}
else
{
if (node.Parent != null)
{
while (node != null)
{
globalNodeList.Add(node);
node = (Node)node.Parent;
}
}
else
{
globalNodeList.Add(node);
}
}
}
}

If I understand your question, you only want to see whether a particular node is an ancestor of another node. If so, just check the Path property of the node, which is a comma-separated string of IDs. No need to build the list yourself.
Just myNode.Path.Contains(",1001") will work.
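One caveat with a plain substring test is that ",1001" also matches a longer ID such as ",10012". A stricter variant compares whole segments, as in this sketch. The PathCheck class and HasAncestor method are hypothetical helpers, and the sample paths in the usage note only assume the comma-separated format described above.

```csharp
using System;
using System.Linq;

public static class PathCheck
{
    // True if ancestorId appears as a whole segment of a comma-separated path.
    // The last segment is the node's own ID, so it is left out of the comparison.
    public static bool HasAncestor(string path, int ancestorId)
    {
        var segments = path.Split(',');
        return segments
            .Take(segments.Length - 1)
            .Any(segment => segment.Trim() == ancestorId.ToString());
    }
}
```

For example, HasAncestor("-1,1001,1050", 1001) is true, while HasAncestor("-1,10012,1050", 1001) is false even though the substring test would match.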
Small remarks:
If you are using Umbraco 6, use IPublishedContent instead of Node.
If you do need to build a list the way you do, I would rather pass multiple IDs to the Umbraco helper and let Umbraco build the list (from cache).
For the second remark, you are able to do this:
var myList = Umbraco.Content(1001,1002,1003);
or with an array/list
var myList = Umbraco.Content(someNode.Path.Split(','));
and because you are crawling up to the root, you might need to add a .Reverse()
More information about the UmbracoHelper can be found in the documentation: http://our.umbraco.org/documentation/Reference/Querying/UmbracoHelper/
If you are using Umbraco 4 you can use @Library.NodesById(...)

Most appropriate way to construct a File and Directory class in order to easily filter results when placing them on a tree

I am creating a program that recursively finds all the files and directories in the specified path. So one node may have other nodes if that node happens to be a directory.
Here is my Node class:
class Node
{
public List<Node> Children = new List<Node>(); // if node is a directory then children will be the files and directories in this directory
public FileSystemInfo Value { get; set; } // can either be a FileInfo or DirectoryInfo
public bool IsDirectory
{
get{ return Value is DirectoryInfo;}
}
public long Size // HERE IS WHERE I AM HAVING PROBLEMS! I NEED TO RETRIEVE THE
{ // SIZE OF DIRECTORIES AS WELL AS FOR FILES.
get
{
long sum = 0;
if (Value is FileInfo)
sum += ((FileInfo)Value).Length;
else
sum += Children.Sum(x => x.Size);
return sum;
}
}
// this is the method I use to filter results in the tree
public Node Search(Func<Node, bool> predicate)
{
// if node is a leaf
if(this.Children.Count==0)
{
if (predicate(this))
return this;
else
return null;
}
else // Otherwise if node is not a leaf
{
var results = Children.Select(i => i.Search(predicate)).Where(i => i != null).ToList();
if (results.Any()) // THIS IS HOW I REMOVE NODES AND RECONSTRUCT THE TREE WITH A FILTER
{
var result = (Node)MemberwiseClone();
result.Children = results;
return result;
}
return null;
}
}
}
Thanks to that Node class I am able to display the tree as:
In one column I display the name of the directory or file and, on the right, the size. The size is formatted as currency just because the commas help visualize it more clearly.
So now my problem: the reason I have this program is to perform some advanced searches. For example, I may only want to search for files that have the ".txt" extension. If I perform that filter on my tree I will get:
(Note that I compile the text to a function that takes a Node and returns a bool, and I pass that function to the Search method on my Node class in order to filter results. More information on how to dynamically compile code can be found at: http://www.codeproject.com/Articles/10324/Compiling-code-during-runtime) Anyway, that has nothing to do with this question. The important part is that I removed all the nodes that did not match that criterion, and because I removed those nodes the sizes of the directories changed!
So my question is: how can I filter results while maintaining the real size of each directory? I guess I will have to remove the Size property and replace it with a field. The problem with that is that every time I add to the tree I will have to update the size of all the parent directories, and that gets complex. Before I start coding it that way I would appreciate your opinion on how I should implement the class.

Since you're using recursion and your size is a node-level property computed from the children, you can't expect it to keep summing nodes after you remove them. You either promote it to an upper level (the collection) or use an external counter within the recursion (one that counts independently of the filter; you'll need to carry it through the recursion).
Anyway, why are you implementing core .NET functionality again? Is there any reason beyond filtering or recursive search? Both are pretty well implemented in the BCL.
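One way to keep the real size through filtering is to compute each node's total once, while the full tree is built, and let the filtered copies carry that stored value. The sketch below is a simplified stand-in for the Node class in the question: SizedNode and its members are hypothetical, and sizes are plain numbers instead of values read from FileInfo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SizedNode
{
    public string Name { get; }
    public List<SizedNode> Children { get; private set; }
    // Computed once from the full tree and never recomputed after filtering.
    public long TotalSize { get; }

    public SizedNode(string name, long ownSize, params SizedNode[] children)
    {
        Name = name;
        Children = children.ToList();
        TotalSize = ownSize + Children.Sum(c => c.TotalSize);
    }

    // Returns a filtered copy of the subtree, or null if nothing matches.
    public SizedNode Search(Func<SizedNode, bool> predicate)
    {
        if (Children.Count == 0)
        {
            return predicate(this) ? this : null;
        }
        var matches = Children
            .Select(c => c.Search(predicate))
            .Where(c => c != null)
            .ToList();
        if (matches.Count == 0)
        {
            return null;
        }
        var copy = (SizedNode)MemberwiseClone(); // TotalSize is copied as-is
        copy.Children = matches;
        return copy;
    }
}
```

With a root holding a 10-byte "a.txt" and a 90-byte "b.jpg", filtering on ".txt" leaves one child under the copied root while its TotalSize stays 100. The trade-off is that the tree has to be built bottom-up (children before parents) for the stored totals to be right.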

Unable to get my basic tree structure working

Sometimes you get one of those days where, no matter how much you batter your head against a wall, even the simplest task eludes you (this is one of those days!).
So what I have is a list of categories
CategoryID, CategoryName, ParentID, Lineage
1, Root Category, NULL, /1/
2, Child Category, 1, /1/2/
3, Grandchild, 2, /1/2/3
4, Second Root, NULL, /4/
5, Second Child, 2, /1/2/5/
I've created a class to hold this where it contains all the values above, plus
ICollection<Category> Children;
This should create the tree
Root Category
`-- Child category
| `-- Grandchild
`-- Second Child
Second Root
So I'm trying to add a new category to the tree given the Lineage and the element, I convert the lineage to a queue and throw it into this function.
public void AddToTree(ref Category parentCategory, Category newCategory, Queue<Guid>lineage)
{
Guid lastNode = lineage.Dequeue();
if(lastNode == newCategory.CategoryId)
{
parentCategory.Children.Add(newCategory);
return;
}
foreach (var category in parentCategory.Children)
{
if(category.CategoryId == lastNode)
{
this.AddToTree(ref category, newCategory, lineage);
}
}
}
Now two problems I'm getting
The self referencing isn't too worrying (its designed to be recursive) but since the category in the foreach loop is a locally instantiated variable I can't make it by reference and use it as a pointer.
I'm sure there has to be an easier way than this!
Any pointers would be greatly received.

This code seems to be what you are looking for, but without any self references or recursion: it walks through the tree along the given lineage and inserts the given category at the end of the lineage.
Several assumptions:
Tree is stored as a list of its roots
lineage is a string
void AddCategory(List<Category> roots, Category categoryToAdd, string lineage)
{
    // Skip the empty entries produced by the leading and trailing '/'
    List<Guid> categoryIdList = lineage
        .Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(id => new Guid(id))
        .ToList();
    ICollection<Category> currentNodes = roots;
    foreach (Guid categoryId in categoryIdList)
    {
        // The lineage ends with the new category's own ID; stop before it
        if (categoryId == categoryToAdd.CategoryId)
        {
            break;
        }
        Category parentNode = currentNodes.Single(category => category.CategoryId == categoryId);
        currentNodes = parentNode.Children;
    }
    // For a root category this is the roots list itself
    currentNodes.Add(categoryToAdd);
}

You don't appear to need the "ref" at all. You are not modifying the object reference, just its state.
EDIT:
If you must use ref, then use a temporary variable, for example...
foreach (var temp in parentCategory.Children)
{
Category category = temp;
if (category.CategoryId == lastNode)
{
this.AddToTree(ref category, newCategory, lineage);
}
}
But even with this, the ref is just about useless. AddToTree does not modify the reference value. It modifies the referenced object's state. Maybe you have more code involved that we need to see.
If your intent is to modify the child reference in the parent, you will have an issue with the ICollection Children object. You cannot use "ref" on an element in the ICollection to replace the reference in effect. You would have to remove the child reference and add a new one.

Build Tree more efficiently?

I was wondering if this code is good enough or if there are glaring newbie no-no's.
Basically I'm populating a TreeView listing all Departments in my database. Here is the Entity Framework model:
Here is the code in question:
private void button1_Click(object sender, EventArgs e)
{
DepartmentRepository repo = new DepartmentRepository();
var parentDepartments = repo.FindAllDepartments()
.Where(d => d.IDParentDepartment == null)
.ToList();
foreach (var parent in parentDepartments)
{
TreeNode node = new TreeNode(parent.Name);
treeView1.Nodes.Add(node);
var children = repo.FindAllDepartments()
.Where(x => x.IDParentDepartment == parent.ID)
.ToList();
foreach (var child in children)
{
node.Nodes.Add(child.Name);
}
}
}
EDIT:
Good suggestions so far. Working with the entire collection makes sense I guess. But what happens if the collection is huge as in 200,000 entries? Wouldn't this break my software?
DepartmentRepository repo = new DepartmentRepository();
var entries = repo.FindAllDepartments();
var parentDepartments = entries
.Where(d => d.IDParentDepartment == null)
.ToList();
foreach (var parent in parentDepartments)
{
TreeNode node = new TreeNode(parent.Name);
treeView1.Nodes.Add(node);
var children = entries.Where(x => x.IDParentDepartment == parent.ID)
.ToList();
foreach (var child in children)
{
node.Nodes.Add(child.Name);
}
}

Since you are getting all of the departments anyway, why don't you get them all in one query and then run your queries against the in-memory collection instead of the database? That would be much more efficient.
In a more general sense, any database model that is recursive can lead to issues, especially if this could end up being a fairly deep structure. One possible thing to consider would be for each department to store all of its ancestors, so that you can get them all at once instead of having to query for them one level at a time.

In light of your edit, you might want to consider an alternative database schema that scales to handle very large tree structures.
There's an explanation on the FogBugz blog of how they handle hierarchies. They also link to this article by Joe Celko for more information.
Turns out there's a pretty cool solution for this problem explained by Joe Celko. Instead of attempting to maintain a bunch of parent/child relationships all over your database -- which would necessitate recursive SQL queries to find all the descendents of a node -- we mark each case with a "left" and "right" value calculated by traversing the tree depth-first and counting as we go. A node's "left" value is set whenever it is first seen during traversal, and the "right" value is set when walking back up the tree away from the node. A picture probably makes more sense:
The Nested Set SQL model lets us add case hierarchies without sacrificing performance.
How does this help? Now we just ask for all the cases with a "left" value between 2 and 9 to find all of the descendents of B in one fast, indexed query. Ancestors of G are found by asking for nodes with "left" less than 6 (G's own "left") and "right" greater than 6. Works in all databases. Greatly increases performance -- particularly when querying large hierarchies.
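The numbering and the two range queries from the quote can be tried out in memory with a sketch like the following. TreeItem, NestedSet and their members are names invented for this illustration; in a real schema "left" and "right" would be indexed columns and the Where clauses would be SQL range conditions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TreeItem
{
    public string Name { get; }
    public List<TreeItem> Children { get; } = new List<TreeItem>();
    public int Left { get; set; }
    public int Right { get; set; }

    public TreeItem(string name, params TreeItem[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

public static class NestedSet
{
    // Numbers the subtree depth-first and returns the next unused counter value.
    public static int Number(TreeItem node, int counter = 1)
    {
        node.Left = counter++;   // set when the node is first seen
        foreach (var child in node.Children)
        {
            counter = Number(child, counter);
        }
        node.Right = counter++;  // set when walking back up past the node
        return counter;
    }

    // Flattens the tree; the result stands in for the rows of a table.
    public static IEnumerable<TreeItem> Flatten(TreeItem node)
    {
        yield return node;
        foreach (var item in node.Children.SelectMany(c => Flatten(c)))
        {
            yield return item;
        }
    }

    // Descendants: rows whose "left" lies strictly inside the node's range.
    public static IEnumerable<TreeItem> Descendants(IEnumerable<TreeItem> rows, TreeItem node)
    {
        return rows.Where(r => r.Left > node.Left && r.Left < node.Right);
    }

    // Ancestors: rows whose range encloses the node's "left".
    public static IEnumerable<TreeItem> Ancestors(IEnumerable<TreeItem> rows, TreeItem node)
    {
        return rows.Where(r => r.Left < node.Left && r.Right > node.Left);
    }
}
```

For a sample tree A(B(D, E(G)), C), numbering gives B the range 2 to 9 and G a "left" of 6, so Descendants returns D, E and G for B, and Ancestors returns A, B and E for G. The price of the model is that inserting or moving a node means renumbering part of the tree.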

Assuming that you are getting the data from a database, the first thing that comes to mind is that you are going to be hitting the database n+1 times: once for the parents and once more for each parent. You should try to get the whole tree structure out in one hit.
Secondly, you seem to get the idea of patterns, seeing as you appear to be using the repository pattern, so you might want to look at IoC. It allows you to inject a dependency such as your repository into the class where it is going to be used, allowing for easier unit testing.
Thirdly, regardless of where you get your data from, move the structuring of the data into a tree into a service that returns an object containing all your departments, already organised (this basically becomes a DTO). This will help you reduce code duplication.
With anything, you need to apply the YAGNI principle. This basically says that you should only do something if you are going to need it, so if the code you have provided above is complete, needs no further work and is functional, don't touch it. The same goes for the select n+1 performance issue: if you are not seeing any performance hits, don't do anything, as it may be premature optimization.
In your edit
DepartmentRepository repo = new DepartmentRepository();
var entries = repo.FindAllDepartments();
var parentDepartments = entries.Where(d => d.IDParentDepartment == null).ToList();
foreach (var parent in parentDepartments)
{
TreeNode node = new TreeNode(parent.Name);
treeView1.Nodes.Add(node);
var children = entries.Where(x => x.IDParentDepartment == parent.ID).ToList();
foreach (var child in children)
{
node.Nodes.Add(child.Name);
}
}
You still have an n+1 issue. This is because the data is only retrieved from the database when you call ToList() or when you iterate over the enumeration. This would be better:
var entries = repo.FindAllDepartments().ToList();
var parentDepartments = entries.Where(d => d.IDParentDepartment == null);
foreach (var parent in parentDepartments)
{
TreeNode node = new TreeNode(parent.Name);
treeView1.Nodes.Add(node);
var children = entries.Where(x => x.IDParentDepartment == parent.ID);
foreach (var child in children)
{
node.Nodes.Add(child.Name);
}
}
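Once the rows are in memory, the repeated Where scans can also be replaced by grouping the departments by parent a single time, which additionally handles more than two levels. The sketch below is an illustration under stated assumptions: the Department class is a stand-in for the entity, DepartmentTree and Render are invented names, and indented strings take the place of TreeNode objects so that the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Department entity.
public class Department
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int? IDParentDepartment { get; set; }
}

public static class DepartmentTree
{
    // Renders the hierarchy as indented lines, to any depth.
    public static List<string> Render(IEnumerable<Department> departments)
    {
        // One pass to group by parent. Null parents are keyed as 0 here,
        // which assumes that no real department has the ID 0.
        var byParent = departments.ToLookup(d => d.IDParentDepartment ?? 0);
        var lines = new List<string>();
        AddLevel(byParent, 0, 0, lines);
        return lines;
    }

    private static void AddLevel(
        ILookup<int, Department> byParent, int parentId, int depth, List<string> lines)
    {
        // An ILookup returns an empty sequence for a missing key, so leaves need no special case.
        foreach (var department in byParent[parentId])
        {
            lines.Add(new string(' ', depth * 2) + department.Name);
            AddLevel(byParent, department.ID, depth + 1, lines);
        }
    }
}
```

In the WinForms code, the lines.Add call would become a node.Nodes.Add call on the parent TreeNode.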

That looks OK to me, but think about a collection of hundreds of thousands of nodes. The best way to handle that is asynchronous loading: note that you don't necessarily have to load all elements at the same time. Your tree view can be collapsed by default and you can load additional levels as the user expands the tree's nodes. Consider this case: you have a root node containing 100 nodes, and each of these nodes contains at least 1000 nodes. 100 * 1000 = 100000 nodes to load. Pretty much, isn't it? To reduce the database traffic you can load your first 100 nodes up front and then, when the user expands one of those, load its 1000 nodes. That will save a considerable amount of time.

Things that come to mind:
It looks like .ToList() is needless. If you are simply iterating over the returned result, why bother with the extra step?
Move this function into its own thing and out of the event handler.
As others have said, you could get the whole result in one call. Sort by IDParentDepartment so that the null ones come first. That way, you should be able to get the list of departments in one call and iterate over it only once, adding child departments to already existing parent ones.

Wrap the TreeView modifications with:
treeView.BeginUpdate();
// modify the tree here.
treeView.EndUpdate();
To get better performance.
Pointed out here by jgauffin

This should use only one (albeit possibly large) call to the database:
Departments.Join(
    Departments,
    child => child.IDParentDepartment,
    parent => (int?)parent.ID, // match on the parent's ID; the cast is needed because IDParentDepartment is nullable
    (o, i) => new { Child = o, Parent = i }
).GroupBy(x => x.Parent)
.Map(x => {
    var node = new TreeNode(x.Key.Name);
    x.Map(y => node.Nodes.Add(y.Child.Name));
    treeView1.Nodes.Add(node);
});
Where 'Map' is just a 'ForEach' for IEnumerables:
public static void Map<T>(this IEnumerable<T> source, Action<T> func)
{
foreach (T i in source)
func(i);
}
Note: This will still not help if the Departments table is huge as 'Map' materializes the result of the sql statement much like 'ToList()' does. You might consider Piotr's answer.

In addition to Bronumski's and Keith Rousseau's answers:
Also store the department ID with each node (in its Tag property) so that you don't have to re-query the database to get the department ID.
