I'm new to LINQ and I've been at this for hours. I have a List<> of objects where one of each object's properties is a List of selected categories. I also have, outside of the objects, a List representing a subset of categories, and I want to return all objects that contain at least one category that is also in the subset, as illustrated in the following pseudo code (not my actual code):
List<string> subset = cat, dog, mouse
List<myclass> myclasses =
{name:alphie, category:[cat,elephant]},{name:sally, category:[fish]}, {name:bob, category:[dog, mouse]}
In the above example I need to return alphie and bob since they both have at least one category that's in my subset.
The only solution so far is to get a list of both and then use expensive foreach loops to go through and compare. I'm sure LINQ must provide a more efficient way to achieve the same?
More details (I think my pseudo code is not detailed enough)
public class RadioProgram {
...
private List<string> _category = new List<string>();
public List<string> Category { get { return _category; } set { _category = value; } }
...
}
public class Category {
...
private string _categoryName = "";
private List<Category> _subCategories = new List<Category>();
public string CategoryName { get { return _categoryName; } set { _categoryName = value; } }
public List<Category> SubCategories { get { return _subCategories; } set { _subCategories = value; } }
...
}
I have a method, GetCategories(string parentCategory), that returns all child categoryNames as a List. Each radioProgram.Category (yes, the name needs to be refactored to plural) is itself a List and may contain zero, one or more categoryNames. I'm getting my master list of radioPrograms and I want to return the subset in which each program contains at least one categoryName that matches the set from GetCategories.
I'm trying to avoid changing the architecture of the application (which is a potential solution) as it means a lot of refactoring of existing functionality AND I think this happens to be a good exercise for tackling and understanding LINQ.
One thing you could use is
myclasses
.Where(o => o.category.Any(c => subset.Contains(c)));
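If the subset is large, the same query can be backed by a HashSet so that each lookup does not scan a List. This is a minimal sketch, assuming a hypothetical MyClass with Name and Category properties modelled on the pseudo code in the question; the class and method names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape, modelled on the pseudo code in the question.
public class MyClass
{
    public string Name { get; set; }
    public List<string> Category { get; set; }
}

public static class CategoryFilter
{
    // Returns the items that have at least one category in the subset.
    public static List<MyClass> WithAnyCategory(
        IEnumerable<MyClass> items, IEnumerable<string> subset)
    {
        // A HashSet gives constant-time lookups on average,
        // whereas List.Contains scans the whole list each time.
        var wanted = new HashSet<string>(subset);
        return items
            .Where(o => o.Category.Any(c => wanted.Contains(c)))
            .ToList();
    }
}
```

For a handful of categories the plain List version above is just as good; the HashSet only pays off when the subset grows.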
Related
I have a class:
public class DataMember {
public string ID{ get; set; }
public List<string> Versions { get; set; }
}
And another class:
public class MasterDataMember {
public string ID { get; set; }
public List<string> FoundVersions { get; set; }
}
I store both sets of data in a Cache as:
List<DataMember> datamembers
List<MasterDataMember> masterdatamembers
When originally built, the MasterDataMember is a list of partial "versions". These versions need to be confirmed and found in the list of DataMember's.
How can I update masterdatamembers with the confirmed versions found in datamembers?
(this code block is untested but it illustrates what I'm trying to do)
foreach (MasterDataMember item in masterdatamembers) {
List<string> confirmedvers = new List<string>();
foreach(string rawver in item.FoundVersions ){
foreach(DataMember checkitem in datamembers){
foreach (string confirmedver in checkitem.Versions) {
if (rawver.Contains(confirmedver)) {
confirmedvers.Add(confirmedver);
}
}
}
}
item.FoundVersions = confirmedvers;
}
Is there a LINQ query that can accomplish this more easily and faster (I've already tried lots of ideas and iterations)?
Speed is the key here since both lists can be hundreds to thousands long.
Thank you in advance!
foreach (MasterDataMember item in masterdatamembers) {
// keep only the raw versions that appear in some DataMember's Versions list
List<string> confirmedvers = item.FoundVersions.Where(rawver => datamembers.Any(checkitem => checkitem.Versions.Contains(rawver))).ToList();
item.FoundVersions = confirmedvers;
}
HOLY crap bro that was confusing as hell for me!
Awesome mind experiment though!
If speed really is your primary concern because of large lists, then you'll want to use hash table constructs. Using LINQ is slick, but won't necessarily make things faster (or clearer) for you. What you really need is to use the proper collection type.
Assumptions made for the code that follows:
datamembers cache cannot have duplicate DataMember entries (where more than one entry has the same ID).
masterdatamembers cache cannot have duplicate MasterDataMember entries (where more than one entry has the same ID).
In both DataMember and MasterDataMember, the Versions and FoundVersions lists cannot have duplicate version entries.
Algorithm Description
I still feel that your code block doesn't quite reflect your intent. And unfortunately, as a result, I think you got wrong answers.
This is the algorithm I followed, based on trying to interpret your intended result:
For each master data member, update its FoundVersions set (or list) by only keeping the versions that can also be found in the matching data member's Versions set (or list). If no matching data member is found, then I assume you want the master data member's FoundVersions set (or list) to be emptied, as none of the versions can be confirmed.
Implementation
Notice that I replaced a few uses of List<T> with Dictionary<K, V> or HashSet<T> where it would benefit performance. Of course, I am assuming that your lists can become large, as you said. Otherwise, the performance will be similar to that of simple lists.
Your 2 classes, (notice the change in types):
public class DataMember
{
public string ID { get; set; }
public HashSet<string> Versions { get; set; } // using hashset is faster here.
}
public class MasterDataMember
{
public string ID { get; set; }
public HashSet<string> FoundVersions { get; set; } // used HashSet for consistency, but for the purposes of the algorithm, a List can still be used here if you want.
}
Your cached data, (notice the change to a Dictionary):
Dictionary<string, DataMember> datamembers; // using a Dictionary here, where your key is the DataMember's ID, is your fastest option.
List<MasterDataMember> masterdatamembers; // this can stay as a list if you want.
And finally, the work is done here:
foreach (var masterDataMember in masterdatamembers)
{
DataMember dataMember;
if (datamembers.TryGetValue(masterDataMember.ID, out dataMember))
{
HashSet<string> newSet = new HashSet<string>();
foreach (var version in masterDataMember.FoundVersions)
{
if (dataMember.Versions.Contains(version))
{
newSet.Add(version);
}
}
masterDataMember.FoundVersions = newSet;
}
else
{
masterDataMember.FoundVersions.Clear();
}
}
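The inner loop above can also be expressed with HashSet<T>.IntersectWith, which removes every element that is not in the other collection. The following is a sketch under the same assumptions as the answer (unique IDs, HashSet-typed version properties); the VersionConfirmer class and Confirm method are hypothetical names. Note one difference: it modifies the existing FoundVersions set in place instead of assigning a new one.

```csharp
using System;
using System.Collections.Generic;

public class DataMember
{
    public string ID { get; set; }
    public HashSet<string> Versions { get; set; }
}

public class MasterDataMember
{
    public string ID { get; set; }
    public HashSet<string> FoundVersions { get; set; }
}

public static class VersionConfirmer
{
    // Keeps, for each master entry, only the versions confirmed by the
    // data member with the same ID; empties the set when there is no match.
    public static void Confirm(
        List<MasterDataMember> masters,
        Dictionary<string, DataMember> dataById)
    {
        foreach (var master in masters)
        {
            DataMember match;
            if (dataById.TryGetValue(master.ID, out match))
            {
                // In-place set intersection.
                master.FoundVersions.IntersectWith(match.Versions);
            }
            else
            {
                master.FoundVersions.Clear();
            }
        }
    }
}
```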
Your code would look something like this in LINQ:
masterDataMembers.ForEach(q=>q.FoundVersions = (from rawver in q.FoundVersions from checkitem in dataMembers from confirmedver in checkitem.Versions where rawver.Contains(confirmedver) select confirmedver).ToList());
Let me begin by illustrating what I have versus the goal I'm trying to achieve.
In the controller I get all the categories into a single generic list, in a pretty much random, unsorted manner:
var categories = new List<Category>(this.categoryService.GetCategories());
Each category has 4 properties that matter here: Id, ParentCategoryId, SortOrder, Text.
SortOrder has to be applied only among siblings on the same level of the hierarchy, and children always have to be positioned underneath their parent. Text has to change by prepending ".." for each level of depth.
I'd like this to be done properly with performance in mind; I don't want to loop recursively through a massive list multiple times.
Thanks for any input.
This might not be the most performant code, but should work for OP's problem.
Since OP didn't explicitly define the data structure (model) in question, I'm going to assume it's something like this:
public class Category {
public int Id { get; set; }
public int ParentCategoryId { get; set; }
public int SortOrder { get; set; }
public string Text { get; set; }
}
To sort a list of categories (List<Category>) into multilevel categories, we need a tree-like structure to hold the data.
First, I would extend the model (Category class) with new properties such as Level (to indicate the level-depth), Children (to hold the sub/child Category), and DisplayText (for displaying the category text according to its level):
public class CategoryNode : Category {
public CategoryNode(Category category) {
Id = category.Id;
ParentCategoryId = category.ParentCategoryId;
SortOrder = category.SortOrder;
Text = category.Text;
}
public CategoryTree Children { get; set; }
public int Level { get; set;}
public string DisplayText {
get {
// OP wants two-dots prefix per level
return string.Concat(new string('.', Level*2), Text);
}
}
}
Note: Depending on your preference, you could alter the Category class directly instead of subclassing it into CategoryNode.
Next, I would define a class called CategoryTree to wrap the collection of CategoryNode. It is a simple wrapper around List<CategoryNode> that exposes an IEnumerable interface. I'd also add a Flatten() method to CategoryTree, which flattens the tree-like structure into a single list. This method comes in handy for binding the data to a single-list (non-hierarchical) control such as DropDownList or ListBox. And last, I'd also add a static creation method called Create() to create an instance of CategoryTree from a given list of Category:
public class CategoryTree : IEnumerable<CategoryNode> {
private List<CategoryNode> innerList = new List<CategoryNode>();
public CategoryTree(IEnumerable<CategoryNode> nodes) {
innerList = new List<CategoryNode>(nodes);
}
public IEnumerator<CategoryNode> GetEnumerator()
{
return innerList.GetEnumerator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
public IEnumerable<CategoryNode> Flatten() {
foreach(var category in innerList.OrderBy(o => o.SortOrder)) {
yield return category;
if (category.Children != null) {
foreach(var child in category.Children.Flatten()) {
yield return child;
}
}
}
}
public static CategoryTree Create(
IEnumerable<Category> categories,
Func<Category, bool> parentPredicate,
int level = 0)
{
var nodes = categories
.Where(parentPredicate)
.OrderBy(o => o.SortOrder)
.Select(item => new CategoryNode(item) {
Level = level,
Children = Create(categories, o => o.ParentCategoryId == item.Id, level + 1)
});
return new CategoryTree(nodes);
}
}
Note: Again, arguably, you could just use List<CategoryNode> directly, extract the methods, and save yourself the hassle of creating a new class here. Your call.
With all pieces in place I could now use the following code to convert a list of Category (List<Category>) into a multilevel list of Category, and flatten that list to bind it into a DropDownList:
...
var categories = new List<Category>(this.categoryService.GetCategories());
// assuming ParentCategoryId == 0 is the root category
var categoryTree = CategoryTree.Create(categories, o => o.ParentCategoryId == 0);
var model = new SomeViewModel();
model.Categories = new SelectList(categoryTree.Flatten(), "Id", "DisplayText");
return View(model);
...
@Html.DropDownListFor(m => m.SelectedCategory, Model.Categories)
...
Demo (using Console.Out): https://ideone.com/ejOfrr
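Since the question asks to avoid scanning the whole list repeatedly, one possible variation is to group the categories by parent once with ToLookup and then walk the groups. This is only a sketch: the CategoryFlattener class is a hypothetical name, the Category shape is the one assumed above, root categories are assumed to have ParentCategoryId == 0, and the data is assumed to contain no cycles (a cycle would recurse forever).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Category
{
    public int Id { get; set; }
    public int ParentCategoryId { get; set; }
    public int SortOrder { get; set; }
    public string Text { get; set; }
}

public static class CategoryFlattener
{
    // Returns (Id, display text) pairs in display order:
    // siblings sorted by SortOrder, children directly under their parent.
    public static List<KeyValuePair<int, string>> Flatten(
        IEnumerable<Category> categories, int rootParentId = 0)
    {
        // One pass over the input: children grouped by parent id.
        ILookup<int, Category> byParent =
            categories.ToLookup(c => c.ParentCategoryId);
        var result = new List<KeyValuePair<int, string>>();
        Append(byParent, rootParentId, 0, result);
        return result;
    }

    private static void Append(
        ILookup<int, Category> byParent, int parentId, int level,
        List<KeyValuePair<int, string>> result)
    {
        // A lookup returns an empty sequence for a missing key,
        // so leaf categories end the recursion naturally.
        foreach (var c in byParent[parentId].OrderBy(x => x.SortOrder))
        {
            string text = new string('.', level * 2) + c.Text;
            result.Add(new KeyValuePair<int, string>(c.Id, text));
            Append(byParent, c.Id, level + 1, result);
        }
    }
}
```

Each category is visited once after the grouping step, so the remaining cost is dominated by sorting each group of siblings.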
Have you tried implementing a Dictionary structure?
As in:
Dictionary<Categories, List<T>> dictionary = new Dictionary<Categories, List<T>>();
This might conform to your requirements.
Further, if you want a sorted Category structure, then you can implement a SortedDictionary with Category as the key and a corresponding List as the value.
I've already searched through StackOverflow (and other websites) about transforming a DataTable to List with reflection in C#.
My results until now are pretty good: I can reflect 200k lines in 3.5 seconds (0.5 seconds in hardcoded mode).
But my entities (the classes that represent my data, but I think you already know that) follow this pattern:
My database have columns like this (I don't actually do this, but you'll get the idea):
Table: Clients
Columns:
ClientID, ClientName, ClientPhone, CityID[FK]
I'm using SqlConnection (MySqlConnection), so I have to hardcode my entities and transform the database result in a list of this entity. Like:
Select *, cit.* from Clients cli
Inner join Cities cit on (cit.CityID = cli.CityID)
Inner join Countries cou on (cou.CountryID = cit.CountID)
I don't know if this SQL is correct, but I think you got the idea. This should return some fields like this:
ClientID, ClientName, ClientPhone, CityID, CityName, CountryID, CountryName
This should result in a List<Client>.
Here's the problem: I have 2 inner joins and I represent this data in my entities like this (I like the expression "like this"):
public class Client
{
public int ClientID { get; set; }
public string ClientName { get; set; }
public string ClientPhone { get; set; }
public City ClientCity { get; set; }
}
public class City
{
public int CityID { get; set; }
public string CityName { get; set; }
public Country CityCountry { get; set; }
}
public class Country
{
public int CountryID { get; set; }
public string CountryName { get; set; }
}
So, if I have a Client object, I would get its country name with the expression client.ClientCity.CityCountry.CountryName. I call it a 3-level property accessor.
And I want to reflect it properly. Here is the main method to transform the DataTable into a List. My native language is Portuguese, but I tried to translate my comments to match my description above.
The idea of this code is: I try to find in the main class the column I have to set. If I don't find it, I search the property in the properties that are objects. Like CityName inside ClientCity inside Client. This code is a mess.
public List<T> ToList<T>(DataTable dt) where T : new()
{
Type type= typeof(T);
ReflectionHelper h = new ReflectionHelper(type);
insertPropInfo(type); //a pre-reflection work, I cache some delegates, etc..
List<T> list = new List<T>();
DataTableReader dtr = dt.CreateDataReader();
while (dtr.Read())
{
T obj = new T();
for (int i = 0; i < dtr.FieldCount; i++)
{
GetObject(ref obj, type, dtr.GetName(i), dtr.GetValue(i));
}
list.Add(obj);
}
return list;
}
//ref T obj: the object I create before calling this method
//Type classType: the type of the object (say, Client)
//string colName: this is the Database Column i'm trying to fill. Like ClientID or CityName or CountryName.
//colLineData: the data I want to put in the colName.
public void GetObject<T>(ref T obj, Type classType, string colName, object colLineData) where T : new()
{
//I do some caching to reflect just once, and after the first iteration, I think all the reflection I need is already done.
foreach (PropertyInfo info in _classPropInfos[classType])
{
//If the current PropertyInfo is a valuetype (like int, int64) or string, and so on
if (info.PropertyType.IsValueType || info.PropertyType == typeof(string))
{
//I think string.Equals is a little faster, but i had not much difference using "string" == "string"
if (info.Name.Equals(colName)) //did I find the property?
{
if (info.PropertyType != typeof(char)) //I have to convert the type if this is a Char. MySql returns char as string.
{
_delegateSetters[info](obj, colLineData); //if it isn't a char, just set it.
}
else
{
_delegateSetters[info](obj, Convert.ChangeType(colLineData, typeof(char)));
}
break; //stop only once the matching property has been set
}
}
else //BUT, if the property is a class, like ClientCity:
{
//I reflect the City class, if it isn't reflected yet:
if (!_classPropInfos.ContainsKey(info.PropertyType))
{
insertPropInfo(info.PropertyType);
}
//now I search for the property:
Boolean foundProperty = false;
object instance = _delegateGetters[info](obj); //Get the existing instance of ClientCity, so I can fill the CityID and CityName in the same object.
foreach (PropertyInfo subInfo in _classPropInfos[info.PropertyType])
{
if (subInfo.Name.Equals(colName))//did I found the property?
{
if (instance == null)
{
//This will happen if i'm trying to set the first property of the class, like CityID. I have to instanciate it, so in the next iteration it won't be null, and will have it's CityID filled.
instance = _initializers[info.PropertyType]();//A very fast object initializer. I'm worried about the Dictionary lookups, but i have no other idea about how to cache it.
}
_delegateSetters[subInfo](instance, colLineData);//set the data. This method is very fast. Search about lambda getters & setters using System.Linq.Expression.
foundProperty = true;
break;//I break the loops when I find the property, so it wont iterate anymore.
}
}
if (foundProperty)//if I found the property in the code above, I set the instance of ClientCity to the Client object.
{
_delegateSetters[info](obj, instance);
break;
}
}
}
}
There is a problem with this code: I can reach CityID and CityName and fill them, but CountryID and CountryName won't be filled. Because this code only does a 2-level reflection, I need some recursive approach to fill as many levels as I need. I tried to do this, BUT I got so many stack overflows and null reference exceptions that I almost gave up.
This code would make it much easier to fetch database rows. Have you already found a library or anything that does what I want? If not, how could I achieve an n-level reflection to make a proper List from a DataTable?
Your problem is really common, and practically every ORM in circulation addresses it.
Of course, changing an already written application to take advantage of an ORM is often impractical, but there are some simple ORMs that are really easy to add to an existing application and let you incrementally replace the already written code.
One of these ORMs is Dapper. It consists of just one source file that you can include directly in the same project with your POCO classes and repository methods (or you can just reference the compiled assembly). It is really easy to learn, and it is incredibly fast considering the complexity of the work to be carried out. Not to mention that the authors of this little gem are regularly on this site answering questions about their work. Just do a search with the #dapper tag.
The only nuisances that I have found to date are the one-to-one mapping between your POCO properties and the field names, and the sometimes elusive rules between PK and FK when your keys are not named ID. But that may just be me, as I still haven't fully understood these rules.
Consider using Entity Framework. It will automate all this work.
This is based on you fetching a DataSet with the 3 tables and creating the proper DataRelations.
For your particular case (200k rows) I don't know how it will perform, but it shouldn't be that bad :).
Your calling code could be something like this:
List<Client> clients = Test.CreateListFromTable<Client>(ds.Tables["Clients"]);
Remember, as I said, it relies on you fetching the DataSet and creating the relations.
Next, here is the class with the methods in question (ClientsToCity and CityToCountry are the names of the DataRelations; you can use your own):
public class Test
{
// function that set the given object from the given data row
public static void SetItemFromRow<T>(T item, DataRow row) where T : new()
{
foreach (DataColumn c in row.Table.Columns)
{
PropertyInfo prop = item.GetType().GetProperty(c.ColumnName);
if (prop != null && row[c] != DBNull.Value)
{
prop.SetValue(item, row[c], null);
}
else
{
if (c.ColumnName == "CityID")
{
object obj = Activator.CreateInstance(typeof(City));
SetItemFromRow<City>(obj as City, row.GetChildRows("ClientsToCity")[0]);
PropertyInfo nestedprop = item.GetType().GetProperty("ClientCity");
nestedprop.SetValue(item, obj, null);
}
else if (c.ColumnName == "CountryID")
{
object obj = Activator.CreateInstance(typeof(Country));
SetItemFromRow<Country>(obj as Country, row.GetChildRows("CityToCountry")[0]);
PropertyInfo nestedprop = item.GetType().GetProperty("CityCountry");
nestedprop.SetValue(item, obj, null);
}
}
}
}
// function that creates an object from the given data row
public static T CreateItemFromRow<T>(DataRow row) where T : new()
{
T item = new T();
SetItemFromRow(item, row);
return item;
}
// function that creates a list of an object from the given data table
public static List<T> CreateListFromTable<T>(DataTable tbl) where T : new()
{
List<T> lst = new List<T>();
foreach (DataRow r in tbl.Rows)
{
lst.Add(CreateItemFromRow<T>(r));
}
return lst;
}
}
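For the n-level case asked about in the question (a flat joined result whose columns belong to nested objects), a recursive search over the property graph is one possible approach. The sketch below uses plain reflection without the delegate caching from the question, so it will be slower. The NestedMapper class is a hypothetical name, the entity classes are the ones from the question, nullable and enum properties are not handled, and a depth limit is assumed as a guard against cyclic type graphs (a likely cause of the stack overflows mentioned).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;

public class Country
{
    public int CountryID { get; set; }
    public string CountryName { get; set; }
}

public class City
{
    public int CityID { get; set; }
    public string CityName { get; set; }
    public Country CityCountry { get; set; }
}

public class Client
{
    public int ClientID { get; set; }
    public string ClientName { get; set; }
    public City ClientCity { get; set; }
}

public static class NestedMapper
{
    private const int MaxDepth = 5; // assumed limit; stops runaway recursion

    public static List<T> ToList<T>(DataTable table) where T : new()
    {
        var result = new List<T>();
        foreach (DataRow row in table.Rows)
        {
            object item = new T();
            foreach (DataColumn column in table.Columns)
            {
                object value = row[column];
                if (value != DBNull.Value)
                    TrySet(item, column.ColumnName, value, 0);
            }
            result.Add((T)item);
        }
        return result;
    }

    private static bool IsSimple(Type t)
    {
        return t.IsValueType || t == typeof(string);
    }

    // Sets the first property named columnName found at this level,
    // otherwise descends into class-typed properties. Returns true on success.
    private static bool TrySet(object target, string columnName, object value, int depth)
    {
        if (depth > MaxDepth) return false;
        PropertyInfo[] props = target.GetType().GetProperties();
        foreach (PropertyInfo p in props)
        {
            if (IsSimple(p.PropertyType) && p.CanWrite
                && p.GetIndexParameters().Length == 0 && p.Name == columnName)
            {
                p.SetValue(target, Convert.ChangeType(value, p.PropertyType), null);
                return true;
            }
        }
        foreach (PropertyInfo p in props)
        {
            if (IsSimple(p.PropertyType) || !p.CanRead || !p.CanWrite) continue;
            if (p.GetIndexParameters().Length > 0) continue;
            if (p.PropertyType.GetConstructor(Type.EmptyTypes) == null) continue;
            // Reuse the nested instance if an earlier column already created it.
            object child = p.GetValue(target, null) ?? Activator.CreateInstance(p.PropertyType);
            if (TrySet(child, columnName, value, depth + 1))
            {
                p.SetValue(target, child, null); // attach only when something was set
                return true;
            }
        }
        return false;
    }
}
```

Because the search matches on property name alone, two nested classes that share a property name would be ambiguous; the first match in declaration order wins in this sketch.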
I have a class
public class Orders
{
public Orders() {}
private string _idOrder;
private string _totalPrice;
public string idOrder
{
get{ return _idOrder;}
set { _idOrder = value;}
}
public string totalPrice
{
get { return _totalPrice; }
set { _totalPrice = value; }
}
}
I am loading the list from database like this
while (dr.Read())
{
Orders.idOrder = dr["IdOrder"].ToString();
Orders.totalPrice= dr["totalPrice"].ToString();
}
It is showing me only the last record. How can I load all the orders and retrieve them back with a foreach loop?
Create a list :-)
List<Order> orders = new List<Order>();
while (dr.Read())
{
Order order = new Order();
order.idOrder = dr["IdOrder"].ToString();
order.totalPrice= dr["totalPrice"].ToString();
orders.Add(order);
}
As you see, I renamed your class from Orders to Order, because that's what it really represents: One order. To have more orders, you need to put those single orders into a list.
It's only showing you the one item, because you're only changing properties on the one item, not instantiating a new one:
var results = new List<Order>();
while (reader.Read())
{
var order = new Order
{
Id = (int)reader["IdOrder"],
TotalPrice = (decimal)reader["totalPrice"]
};
results.Add(order);
}
I think you are looking for something like this:
IEnumerable<Order> FetchOrders()
{
while(dr.Read())
yield return new Order {
idOrder=dr["IdOrder"].ToString(),
totalPrice=dr["totalPrice"].ToString()
};
}
That Orders class represents a single order! If what you need is a list of orders then I suggest you rename that class to Order, and then create a List<Order> (a list of order-objects) and populate that from your query results.
Also (forgive me for being pernickety) "idOrder" is not a good field name. The standard approaches are "orderId" or just plain old "Id" (ID, or even id). Likewise I would expect the price-of-ONE-order to be called just "amount", or even "price"... not "totalPrice"... it'll be too confusing when you come to total-up the totalPrices... get my drift?
Cheers. Keith.
I don't see how that will compile. Orders.idOrder is not a static property, it's an instance property.
If I understand you right, you want to use something like this:
List<Order> orderList = new List<Order>();
while (dr.Read())
{
Order newOrder = new Order();
newOrder.idOrder = dr["IdOrder"].ToString();
newOrder.totalPrice= dr["totalPrice"].ToString();
orderList.Add(newOrder);
}
Note that I am just expanding on @Grook's answer, which I think is very close to what you want.
IEnumerable<Order> FetchOrders()
{
while(dr.Read())
yield return new Order {
idOrder=dr["IdOrder"].ToString(),
totalPrice=dr["totalPrice"].ToString()
};
}
Then you can easily use a foreach loop:
foreach(Order order in FetchOrders())
{
doSomething(order);
}
Is it clear?
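A self-contained version of the iterator idea might look like the following. It is a sketch that assumes the reader is passed in as an IDataReader and that the Order class keeps the property names from the question; the OrderReader class is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Order
{
    public string idOrder { get; set; }
    public string totalPrice { get; set; }
}

public static class OrderReader
{
    // Yields one Order per row. Execution is deferred: rows are read
    // only while the caller enumerates, so the reader must stay open
    // until the enumeration has finished.
    public static IEnumerable<Order> FetchOrders(IDataReader reader)
    {
        while (reader.Read())
        {
            yield return new Order
            {
                idOrder = reader["IdOrder"].ToString(),
                totalPrice = reader["totalPrice"].ToString()
            };
        }
    }
}
```

If the reader has to be closed right away, copying the result into a list (for example with new List<Order>(OrderReader.FetchOrders(reader))) reads all rows before the connection goes away.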
I am working on a C# application which consists of objects Department, Course, and Section. Each Department has many Courses, and each Course has many Sections. Currently I have three classes: Department, Course, and Section. Department contains some properties and then a List Courses, which contains the courses the department offers. Course contains some properties and then a List Sections, which contains the sections of the course. Is this a good way to have the code structured or should I be doing it a different way?
Secondly, when I instantiate a department in my application, I set some properties and then would like to begin adding courses to the List Courses defined in the Department class. However, I seem to be unable to simply do Department.Courses.Add(Course) from the application. What must I do within the Department class so that I may add objects to that list without breaking the principle of encapsulation?
An example of what I have with the list right now is:
class Department
{
// ......
List<Course> Courses = new List<Course>();
}
however Department.Courses is not available in the program code after the class has been instantiated (all other properties of the class are available).
Instantiate the internal Courses list inside the parameterless constructor of your class.
private List<Course> _coursesList;
public Department()
{
_coursesList = new List<Course>();
}
Also, another way to ensure the encapsulation is to provide a method on your Department class to add the courses to it instead of directly exposing the courses list. Something like
public void AddCourse(Course c) { ... }
// or (adding the feature of doing the method calls in a composable way)
public Course AddCourse(Course c) { ... }
// or
public void AddCourse(String name, etc) { ... }
I think in your case it is not a good idea to expose the List directly, because List provides methods such as Add and Remove that could put your parent class into an invalid state. So if you choose to expose methods to manipulate the internal collection as I suggested, you could expose an array of Courses to your API clients (ToArray() returns a copy, so changes to the array do not affect the internal list), and your API consumers won't be able to create side effects on your Department class.
public Course[] Courses {
get { return _coursesList.ToArray(); }
}
In addition, you could also implement the IEnumerable interface on your Department class. It would enable you to take advantage of all the LINQ extension methods available in C# 3.0.
I hope it helps,
Carlos.
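Putting the suggestions above together, an encapsulated Department could look roughly like this. It is a sketch, assuming a minimal hypothetical Course class and .NET 4.5 or later for IReadOnlyList<T>; it exposes a read-only view instead of the array copy described above.

```csharp
using System;
using System.Collections.Generic;

// Minimal placeholder; the real Course class has more members.
public class Course
{
    public string Name { get; set; }
}

public class Department
{
    // The list itself stays private, so callers cannot Add or Remove directly.
    private readonly List<Course> _coursesList = new List<Course>();

    // Read-only view: callers can enumerate and index, but not modify.
    // Unlike ToArray(), this is a live wrapper, not a copy.
    public IReadOnlyList<Course> Courses
    {
        get { return _coursesList.AsReadOnly(); }
    }

    // All additions go through this method, so checks live in one place.
    public bool AddCourse(Course course)
    {
        if (course == null) throw new ArgumentNullException("course");
        if (_coursesList.Contains(course)) return false;
        _coursesList.Add(course);
        return true;
    }
}
```

Because Courses is public and the backing list is created with the object, client code can write department.AddCourse(course) and read department.Courses right after construction.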
Probably something similar to this. There are several ways of doing it; which one fits depends on your requirements.
public class Department
{
// Initialize the list inside Default Constructor
public Department()
{ courses = new List<Course>(); }
// Initialize the list by declaring it outside and passing it in when the Department is created
public Department(List<Course> _courses)
{ courses = _courses; }
List<Course> courses;
public List<Course> Courses
{
get
{
if (courses == null)
return new List<Course>();
else return courses;
}
set { courses = value; }
}
internal bool AddCourseToCourses(Course _course)
{
bool isAdded = false;
// DoSomeChecks here like
if (!courses.Contains(_course))
{
courses.Add(_course);
isAdded = true;
}
return isAdded;
}
}
public class Course
{
public Course(List<Subject> _subject)
{ subjects = _subject; }
List<Subject> subjects;
public List<Subject> Subjects
{
get { return subjects; }
set { subjects = value; }
}
}
// I do not get what do you mean by course "section", very general.
// used Subject instead, Change as you want just to give an idea
public class Subject
{
string name;
public string Name
{
get { return name; }
set { name = value; }
}
int creditHours;
public int CreditHours
{
get { return creditHours; }
set { creditHours = value; }
}
public Subject(string _name, int _creditHours)
{
name = _name;
creditHours = _creditHours;
}
}
public class TestClass
{
public void DoSomething()
{
// Subjects
Subject subj1 = new Subject("C#", 10);
Subject subj2 = new Subject(".Net", 10);
// List of Subjects
List<Subject> advancedPrSubjects = new List<Subject>();
advancedPrSubjects.Add(subj1);
advancedPrSubjects.Add(subj2);
// Course
Course advancedProgramming = new Course(advancedPrSubjects);
// Delegate the authority to add a Course to the Department class itself
Department dept = new Department();
dept.AddCourseToCourses(advancedProgramming);
}
}
There are better ways of doing this. Have a look at these tutorials for better insight:
http://www.csharp-station.com/Tutorials/Lesson07.aspx
http://www.functionx.com/csharp/index.htm
Hope it helps
As to your second question: without some code or more details it's a bit hard, but I'll take a guess.
You're probably not actually creating the list, just declaring it:
List<xxxx> _variable;
vs
List<xxxx> _variable = new List<xxxx>();
You must create a list (new List<xxxx>()) to be able to add to it.
You sound as if you're on the right track.
Your second problem could be down to many things.
It could be, as Ruddy says, that you're not creating the list.
It could also be that your Courses List is not public, or that you haven't instantiated a new Course object to add.