This question pertains to C#, LINQ grouping and Collections.
I'm currently working on a grouping issue and I wanted to get some feedback from the community. I've encountered this particular problem enough times in the past that I'm thinking of writing my own data structure for it. Here are the details:
Suppose you have a grouping that consists of manufacturers and products, and the data structure is grouped by manufacturer. There are many manufacturers and many products. The manufacturers each have a unique name and id. The products may have similar names, but they do have unique ids. The following list represents an example.
Ford 1, Fiesta 1945, Taurus 6413, Fusion 4716, F1 8749,
Toyota 2, Camry 1311, Prius 6415, Corolla 1117, Tacoma 9471
Chevrolet 3, Silverado 4746, Camero 6473, Volt 3334, Tahoe 9974
etc...
The data structure I would use to represent this would be
IEnumerable<Manufacturer, ManufacturerID, IEnumerable<Product, ProductID>>
but this doesn't exist. So my question for the community is: what data structure would you recommend, and why?
Update:
I would like to keep the types anonymous and avoid the dynamic keyword. So the data structure would be something like
IEnumerable SomeDataStructure<T, U, V>
The other requirement is that it can have duplicated items. Here's roughly what I'm thinking:
public class MyDataStructure<T, U, V>
{
// make this like a list, not a dictionary
}
Update:
I decided to go with a Tuple data structure. It's very powerful and easy to query against. The following code is how I ended up using it to create my manufacturer-vehicle relationships. The result is a nicely ordered data structure that has unique manufacturers ordered by name, each with its associated unique vehicles ordered by name.
public class ManufacturersVehicles
{
public int ManufacturerID { get; set; }
public string ManufacturerName { get; set; }
public int VehicleID { get; set; }
public string VehicleName { get; set; }
}
// "data" actually comes from the database. I'm just creating a list to use a mock structure to query against.
var data = new List<ManufacturersVehicles>
{
new ManufacturersVehicles { ManufacturerID = 1, ManufacturerName = "Ford", VehicleID = 1945, VehicleName = "Fiesta" },
new ManufacturersVehicles { ManufacturerID = 1, ManufacturerName = "Ford", VehicleID = 6413, VehicleName = "Taurus" },
new ManufacturersVehicles { ManufacturerID = 1, ManufacturerName = "Ford", VehicleID = 4716, VehicleName = "Fusion" },
// etc...
};
// Get a collection of unique manufacturers from the data collection and order it by the manufacturer's name.
var manufacturers = data.Select(x => new { ManufacturerID = x.ManufacturerID, ManufacturerName = x.ManufacturerName })
.Distinct()
.OrderBy(x => x.ManufacturerName)
.Select(x => Tuple.Create(x.ManufacturerID, x.ManufacturerName, new Dictionary<int, string>()))
.ToList();
// Add the manufacturer's vehicles to its associated dictionary collection, ordered by vehicle name.
foreach (var manufacturer in manufacturers)
{
// Get the collection of unique vehicles ordered by name.
var vehicles = data.Where(x => x.ManufacturerID == manufacturer.Item1)
.Select(x => new { VehicleID = x.VehicleID, VehicleName = x.VehicleName })
.Distinct()
.OrderBy(x => x.VehicleName);
foreach (var vehicle in vehicles)
{
manufacturer.Item3.Add(vehicle.VehicleID, vehicle.VehicleName);
}
}
I would create a class named Manufacturer:
public class Manufacturer
{
public int ManufacturerId { get; set;}
public string Name { get; set; }
public IEnumerable<Product> Products { get; set;}
}
Then create a Product class:
public class Product
{
public int ProductId { get; set;}
public string Name { get; set;}
}
Then use LINQ projection with the Select extension method to create the Manufacturer object.
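A minimal sketch of that projection, assuming the flat rows look like the ManufacturersVehicles class from the question. ManufacturerProjection.Build is a hypothetical helper name, and de-duplicating products by id and sorting by name are assumptions taken from the question's stated goal:

```csharp
using System.Collections.Generic;
using System.Linq;

// Flat row shape, copied from the question.
public class ManufacturersVehicles
{
    public int ManufacturerID { get; set; }
    public string ManufacturerName { get; set; }
    public int VehicleID { get; set; }
    public string VehicleName { get; set; }
}

public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
}

public class Manufacturer
{
    public int ManufacturerId { get; set; }
    public string Name { get; set; }
    public IEnumerable<Product> Products { get; set; }
}

public static class ManufacturerProjection
{
    // Hypothetical helper: groups flat rows by manufacturer and projects each group.
    public static List<Manufacturer> Build(IEnumerable<ManufacturersVehicles> rows)
    {
        return rows
            .GroupBy(r => new { r.ManufacturerID, r.ManufacturerName })
            .OrderBy(g => g.Key.ManufacturerName)
            .Select(g => new Manufacturer
            {
                ManufacturerId = g.Key.ManufacturerID,
                Name = g.Key.ManufacturerName,
                Products = g
                    .GroupBy(r => r.VehicleID) // one product per unique id
                    .Select(p => new Product { ProductId = p.Key, Name = p.First().VehicleName })
                    .OrderBy(p => p.Name)
                    .ToList()
            })
            .ToList();
    }
}
```

Grouping on an anonymous type works here because anonymous types compare by value, so two rows with the same id and name land in the same group.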
Like this?
public class Manufacturer : IEquatable<Manufacturer>
{
public string Name { get; private set; }
public int ID { get; private set; }
// ...
}
public class Product
{
public string Name { get; private set; }
public int ID { get; private set; }
// ...
}
// groups is of type IEnumerable<IGrouping<Manufacturer, Product>>
var groups = data.GroupBy(row => new Manufacturer(row), row => new Product(row));
EDIT: If you want to use anonymous types (as you mentioned now in your update) then GroupBy should work just as well if you construct anonymous objects, instead of declaring a Manufacturer and Product class as in my sample.
MyDataStructure sounds very similar to a Tuple; see the three-generic-parameter variant, Tuple<T1, T2, T3>. Tuple gives a strongly typed container for a fixed number of values of the specified types.
Related
Trying to get a query to work, but honestly not sure how (or if it's even possible) to go about it as everything I have tried hasn't worked.
Querying a total of 6 tables: Person, PersonVote, PersonCategory, Category, City, and FirstAdminDivision.
PersonVote is a user review table for people and contains a column called Vote that is a decimal accepting a value from 1-5 (5 being "best"). FirstAdminDivision would be synonymous with US states, like California. Person table has a column called CityId which is the foreign key to City. The other tables I believe are mostly self-explanatory so I won't comment unless needed.
My goal is to create a query that returns a list of the "most popular" people, based on the average of all votes in the PersonVote table for a particular person. For instance, if a person has 3 votes and all 3 votes are "5", they would be first in the list. I don't really care about secondary ordering at this point (e.g. most votes "wins" a tie).
I have this working without AutoMapper, but I love AM's ability to do projection using the ProjectTo extension method as the code is very clean and readable and would prefer to use that approach if possible but haven't had any luck getting it to work.
Here is what I have that does work....so basically, I am trying to see if this is possible with ProjectTo instead of LINQ's Select method.
List<PersonModel> people = db.People
.GroupBy(x => x.PersonId)
.Select(x => new PersonModel
{
PersonId = x.FirstOrDefault().PersonId,
Name = x.FirstOrDefault().Name,
LocationDisplay = x.FirstOrDefault().City.Name + ", " + x.FirstOrDefault().City.FirstAdminDivision.Name,
AverageVote = x.FirstOrDefault().PersonVotes.Average(y => y.Vote),
Categories = x.FirstOrDefault().PersonCategories.Select(y => new CategoryModel
{
CategoryId = y.CategoryId,
Name = y.Category.Name
}).ToList()
})
.OrderByDescending(x => x.AverageVote)
.ToList();
By looking at your code sample I tried to determine what your models would be in order to set up an example. I only implemented a few of the properties to show the functionality:
public class People
{
public int PersonId { get; set; }
public string Name { get; set; }
public City City { get; set; }
public IList<PersonVotes> PersonVotes { get; set; }
}
public class PersonVotes
{
public int Vote { get; set; }
}
public class City
{
public string Name { get; set; }
}
public class FirstAdminDivision
{
public string Name { get; set; }
}
public class PersonModel
{
public int PersonId { get; set; }
public string Name { get; set; }
public string LocationDisplay { get; set; }
public double AverageVote { get; set; }
}
To use the ProjectTo extension I then initialize AM through the static API:
Mapper.Initialize(cfg =>
{
cfg.CreateMap<IEnumerable<People>, PersonModel>()
.ForMember(
model => model.LocationDisplay,
conf => conf.MapFrom(p => p.FirstOrDefault().City.Name))
.ForMember(
model => model.AverageVote,
conf => conf.MapFrom(p => p.FirstOrDefault().PersonVotes.Average(votes => votes.Vote)));
});
So given the following object:
var people = new List<People>()
{
new People
{
PersonId = 1,
City = new City
{
Name = "XXXX"
},
PersonVotes = new List<PersonVotes>
{
new PersonVotes
{
Vote = 4
},
new PersonVotes
{
Vote = 3
}
}
}
};
I would then have a query:
var result = people
.GroupBy(p => p.PersonId)
.Select(peoples => peoples)
.AsQueryable()
.ProjectTo<PersonModel>();
I'm just using in-memory objects, which is why I convert to IQueryable in order to use the ProjectTo extension method in AM.
I'm hoping this was what you're looking for. Cheers,
UPDATED FOR LINQ TO ENTITIES QUERY:
var result = db.People
.GroupBy(p => p.PersonId)
.ProjectTo<PersonModel>(base.ConfigProvider) // AM requires me to pass Mapping Provider here.
.OrderByDescending(x => x.AverageVote)
.ToList();
I have a list of objects, TargetList populated from the database which I want to group together based on the AnalyteID, MethodID and InstrumentID fields, but the Unit fields will be stored in a list applicable to each grouped object.
Furthermore, it is only possible for one of the available units to have a target assigned to it. Therefore, during the grouping I need a check to see if a target is available and, if so, skip creation of the unit list.
The TargetList object contains the following attributes:
public int id { get; set; }
public int AnalyteID { get; set; }
public string AnalyteName { get; set; }
public int MethodID { get; set; }
public string MethodName { get; set; }
public int InstrumentID { get; set; }
public string InstrumentName { get; set; }
public int UnitID { get; set; }
public string UnitDescription { get; set; }
public decimal TargetMean { get; set; }
public List<Unit> Units { get; set; }
I have a method for multi-grouping using LINQ:
TargetList.GroupBy(x => new { x.AnalyteID, x.MethodID, x.InstrumentID })...
But I'm unsure how to check for a target on a row before extracting all available units for the current group when no target exists.
I created a solution which groups all rows returned from the database based on the AnalyteID, MethodID and InstrumentID (the 'names' of each of these are included in the grouping as well).
Additionally, all non-unique Unit attributes (UnitID and UnitDescription) are placed into a list only if the TargetMean is 0.
targetViewModel.TargetList
// Group by unique analyte/method/instrument
.GroupBy(x => new { x.AnalyteID, x.AnalyteName, x.MethodID, x.MethodName, x.InstrumentID, x.InstrumentName })
// Select all attributes and collect units together in a list
.Select(g => new TargetView
{
id = g.Max(i => i.id),
AnalyteID = g.Key.AnalyteID,
AnalyteName = g.Key.AnalyteName,
MethodID = g.Key.MethodID,
MethodName = g.Key.MethodName,
InstrumentID = g.Key.InstrumentID,
InstrumentName = g.Key.InstrumentName,
TargetMean = g.Max(i => i.TargetMean),
UnitID = g.Max(i => i.UnitID),
UnitDescription = g.Max(i => i.UnitDescription),
// only extract units when target mean is 0
Units = g.Where(y => y.TargetMean == 0)
.Select(c => new Unit { ID = c.UnitID, Description = c.UnitDescription }).ToList()
}).ToList();
Note: The Max method is used to extract any required non-key attributes, such as the TargetMean/id. This works fine because only one row will ever be returned if a TargetMean exists.
It does feel 'dirty' to use the Max method in order to obtain all other non-key attributes though so if anyone has any other suggestions, please feel free to drop an answer/comment as I am interested to see if there are any cleaner ways of achieving the same result.
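One possible alternative to Max, sketched for in-memory LINQ to Objects only (the statement lambda below will not translate to SQL). It picks a single representative row per group: the row carrying a target if there is one, otherwise the first row. TargetGrouping.Group is a hypothetical name, and using the same TargetView shape for both input and output is an assumption made to keep the sketch short:

```csharp
using System.Collections.Generic;
using System.Linq;

public class Unit
{
    public int ID { get; set; }
    public string Description { get; set; }
}

public class TargetView
{
    public int id { get; set; }
    public int AnalyteID { get; set; }
    public string AnalyteName { get; set; }
    public int MethodID { get; set; }
    public string MethodName { get; set; }
    public int InstrumentID { get; set; }
    public string InstrumentName { get; set; }
    public int UnitID { get; set; }
    public string UnitDescription { get; set; }
    public decimal TargetMean { get; set; }
    public List<Unit> Units { get; set; }
}

public static class TargetGrouping
{
    // Hypothetical helper: one output row per analyte/method/instrument.
    public static List<TargetView> Group(IEnumerable<TargetView> rows)
    {
        return rows
            .GroupBy(x => new { x.AnalyteID, x.MethodID, x.InstrumentID })
            .Select(g =>
            {
                // Prefer the row that carries a target; otherwise use the first row.
                var rep = g.FirstOrDefault(r => r.TargetMean != 0) ?? g.First();
                var hasTarget = rep.TargetMean != 0;
                return new TargetView
                {
                    id = rep.id,
                    AnalyteID = rep.AnalyteID,
                    AnalyteName = rep.AnalyteName,
                    MethodID = rep.MethodID,
                    MethodName = rep.MethodName,
                    InstrumentID = rep.InstrumentID,
                    InstrumentName = rep.InstrumentName,
                    UnitID = rep.UnitID,
                    UnitDescription = rep.UnitDescription,
                    TargetMean = rep.TargetMean,
                    // Skip the unit list when a target exists.
                    Units = hasTarget
                        ? new List<Unit>()
                        : g.Select(r => new Unit { ID = r.UnitID, Description = r.UnitDescription }).ToList()
                };
            })
            .ToList();
    }
}
```

Note the behavioural difference: for groups without a target, the non-key values come from the first row rather than from the per-column maximum, so id and UnitID may differ from what the Max version returns.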
I'm trying to use a Contains on a list within a list, but I've been stuck on this one:
var postFilter = PredicateBuilder.False<Company>();
// Loop through each word and see if it's in the company's facility
foreach (var term in splitSearch)
{
var sTerm = term.Trim();
postFilter = postFilter.Or(x =>
x.Facilities.Contains(y=>
y.Facility.Name.ToUpper().Contains(sTerm)) ||
x => x.Facilities.Contains(y =>
y.Facility.Description.ToUpper().Contains(sTerm)));
}
Postfilter is a list of companies; a company has a list of companyfacility items, which is a 1:m relationship. A facility also has a 1:m relationship with this table. The x thus represents a companyfacility object, and the y should represent a facility object.
(So a company can have many facilities, and a facility can belong to many companies. In between I use the companyfacility table for additional information about that company's facility. For example, a specific company can have a lathe table that goes to diameter 300 where other companies would go to diameter 250, so it's important to have the table in between.)
I want to return the companies which have the sTerm in their facility name or facility description, but this linq statement is invalid.
Thanks for the help!
Here is the LINQ and some example code:
void Main()
{
List<Company> postFilter = new List<UserQuery.Company>();
var sTerm = "XXX".Trim().ToUpper();
postFilter = postFilter.Where(x =>
x.Facilities.Any(f => f.Name.ToUpper().Contains(sTerm)
|| f.Description.ToUpper().Contains(sTerm))).ToList();
}
public class Company
{
public string CompanyName { get; set; }
public List<Facility> Facilities { get; set; }
}
public class Facility
{
public string Name { get; set; }
public string Description { get; set; }
}
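The question loops over several search terms, while the example above filters on one. A minimal in-memory sketch of the multi-term case, using the same Company and Facility shapes: CompanySearch.Filter is a hypothetical name, splitting the search string on spaces is an assumption (the question does not show how splitSearch is built), and Name and Description are assumed to be non-null:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Facility
{
    public string Name { get; set; }
    public string Description { get; set; }
}

public class Company
{
    public string CompanyName { get; set; }
    public List<Facility> Facilities { get; set; }
}

public static class CompanySearch
{
    // Hypothetical helper: keeps companies where ANY term matches ANY facility.
    public static List<Company> Filter(IEnumerable<Company> companies, string search)
    {
        var terms = search
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.Trim().ToUpperInvariant())
            .ToList();

        return companies
            .Where(c => terms.Any(t =>
                c.Facilities.Any(f =>
                    f.Name.ToUpperInvariant().Contains(t) ||
                    f.Description.ToUpperInvariant().Contains(t))))
            .ToList();
    }
}
```

The outer Any plays the role of the Or chain built with PredicateBuilder in the question; against a database provider, the PredicateBuilder approach with Any in place of Contains may still be the better fit.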
I'm not really sure if what I'm looking for actually exists, so maybe you guys can help out.
I have the below data:
Apples|3211|12
Markers|221|9
Turtle|1023123123|22
The first column is always a string; the second and third columns are ints. However, what I want to do is be able to reference these as strings or ints, and then be able to sort by the third column ascending. Any ideas?
Something like MyTable[i].Column[i], and in this case MyTable[1].Column[2] would produce 12 as an int (because it's ordered).
If you want type safety you will need to create a class that holds each record:
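class Record
{
// The properties must be public so they can be set and read from outside the class.
public string Name { get; set; }
public int SomeValue { get; set; }
public int OrderNr { get; set; }
}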
Afterwards store them in a generic List<>, then you can order them, as you like:
List<Record> items = // read them into a list of items;
List<Record> orderedList = items.OrderBy(i => i.OrderNr).ToList();
UPDATE
Since it was requested I customized the answer from JustinNiessner to fit to my example:
string data = // your data as string
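List<Record> records = data
.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries) // one entry per line
.Select(line => line.Split('|')) // split each line into its columns
.Select(item => new Record
{
Name = item[0],
SomeValue = int.Parse(item[1]),
OrderNr = int.Parse(item[2])
}).ToList();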
List<Record> orderedRecords = records.OrderBy(r => r.OrderNr).ToList();
This can be optimized by using var and not executing ToList() on the list, but is done this way in order to keep it simple for you to understand the concepts better.
Assuming you have your data stored in some sort of IEnumerable<string> type, you could try something like:
var sortedObjs = stringRows
.Select(row => row.Split('|')) // split each row into its columns
.Select(r => new
{
ColA = r[0],
ColB = int.Parse(r[1]),
ColC = int.Parse(r[2])
})
.OrderBy(r => r.ColC).ToList();
var specificVal = sortedObjs[1].ColC;
This speaks to a larger problem in your design. Using collections to hold a bunch of disparate types with the intent of organizing them into some sort of structure is fragile, error prone, and completely unnecessary.
Instead, create your own type to organize this information.
class MyType
{
public string Name { get; set; }
public int Whatever { get; set; }
public int AnotherProp { get; set; }
}
Now your data is logically grouped in a nice, tight, type safe package.
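As a rough sketch of how the sample rows from the question could be loaded into such a type and sorted by the third column: MyTypeParser.ParseAndSort is a hypothetical helper, and the mapping of the second and third columns to Whatever and AnotherProp is an assumption, since the question does not say what the ints mean.

```csharp
using System.Collections.Generic;
using System.Linq;

public class MyType
{
    public string Name { get; set; }
    public int Whatever { get; set; }
    public int AnotherProp { get; set; }
}

public static class MyTypeParser
{
    // Hypothetical helper: parses "Name|int|int" lines and sorts by the third column.
    public static List<MyType> ParseAndSort(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split('|'))
            .Select(cols => new MyType
            {
                Name = cols[0],
                Whatever = int.Parse(cols[1]),
                AnotherProp = int.Parse(cols[2])
            })
            .OrderBy(item => item.AnotherProp)
            .ToList();
    }
}
```

With the three sample lines from the question, ParseAndSort(lines)[1].AnotherProp is 12, which matches the MyTable[1].Column[2] lookup the question asks for.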
Your original post didn't specify what the ints were, but since you wanted to select them either by the Description(?) or the Id(?) and then sort them by the third column, perhaps something like this will work for you?
//Code tested in LinqPad
void Main()
{
//Apples|3211|12
//Markers|221|9
//Turtle|1023123123|22
//Create a list of items
var items = new List<Item>
{
new Item { Description = "Apple", Id = 3211, Sequence = 12 },
new Item { Description = "Markers", Id = 221, Sequence = 9 },
new Item { Description = "Turtle", Id = 1023123123, Sequence = 22 }
};
//Get sorted list of Apple by Description
var sortedByDescription = items.Where(i => i.Description == "Apple").OrderBy(i => i.Sequence);
//Get sorted list of Markers by Id (221 is the Id of Markers)
var sortedById = items.Where(i => i.Id == 221).OrderBy(i => i.Sequence);
}
public class Item
{
public string Description { get; set; }
public int Id { get; set; }
public int Sequence { get; set; }
}
I have a datatable with below values
id Name date
1 a 5/3/2011
1 a 6/4/2011
I want to retrieve the values with a list of associated dates for each id/name pair.
I would suggest you create a class which encapsulates all of that data, and then you can create a List<T> of the appropriate type. You'd create an instance of your new class per entry in the DataTable.
If you use a strongly-typed dataset you could use the generated DataRow type instead, if you wanted.
(It's not clear what you mean by "store in a list as single entry" - the whole table, or one entry per row?)
It's difficult to answer without the context of the usage. Is this going to be used right away, or communicated to other parts of the system? The code below assumes that it's not communicated to other parts of the system.
// dataTable is your DataTable instance; AsEnumerable() comes from System.Data.DataSetExtensions
var list = (from e in dataTable.AsEnumerable()
select new {
id = e["id"],
Name = e["Name"],
date = e["date"]
}).ToList();
Create a class that maps onto your table. You can use Entity Framework, LINQ to SQL, NHibernate or a custom ORM to retrieve data from the table as these objects. Select the objects and use the LINQ grouping operator to create either anonymous objects or instances of another class that holds the list of dates.
public class Foo
{
public int ID { get; set; }
public string Name { get; set; }
public DateTime Date { get; set; }
}
public class Bar
{
public int ID { get; set; }
public string Name { get; set; }
public List<DateTime> Dates { get; set; }
}
public class FooDataContext : DbContext
{
public IDbSet<Foo> Foos { get; set; }
}
using (var context = new FooDataContext())
{
List<Bar> bars = context.Foos
.GroupBy( f => new { f.ID, f.Name } )
.Select( g => new Bar
{
ID = g.Key.ID,
Name = g.Key.Name,
Dates = g.Select( f => f.Date ).ToList()
}).ToList();
}
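The same grouping can be tried directly against a DataTable, without an ORM. This is only a sketch: DateGrouping.GroupDates is a hypothetical helper, and it assumes the columns are typed as int, string and DateTime and contain no nulls.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public class Bar
{
    public int ID { get; set; }
    public string Name { get; set; }
    public List<DateTime> Dates { get; set; }
}

public static class DateGrouping
{
    // Hypothetical helper: one Bar per id/name pair, carrying all of its dates.
    public static List<Bar> GroupDates(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            .GroupBy(r => new { ID = (int)r["id"], Name = (string)r["Name"] })
            .Select(g => new Bar
            {
                ID = g.Key.ID,
                Name = g.Key.Name,
                Dates = g.Select(r => (DateTime)r["date"]).ToList()
            })
            .ToList();
    }
}
```

Rows.Cast<DataRow>() is used instead of AsEnumerable() so the sketch needs nothing beyond System.Data and System.Linq.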