Groupby Transform into another group

I have a
IGrouping<string, MyObj>
I want to transform it into another IGrouping. For argument's sake the key is the same, but MyObj will transform into MyOtherObj, i.e.
IGrouping<string, MyOtherObj>
I am using Linq2Sql, but I can cope with this last bit not being translatable into SQL.
I want it to still be an IGrouping<T,TT> because it is a recognised type and I want the signature and result to be apparent. I also want to be able to do this so I can break my LINQ query down a bit and put it into better-labelled methods, i.e.
GetGroupingWhereTheSearchTextAppearsMoreThanOnce()
RetrieveRelatedResultsAndMap()
Bundle up and return encased in an IEnumerable - no doubt as to what is going on.
I have come close by daisy chaining
IQueryable<IGrouping<string, MyObj>> grouping ....
IQueryable<IGrouping<string, IEnumerable<MyOtherObj>>> testgrouping = grouping.GroupBy(gb => gb.Key, contacts => contacts.Select(s => mapper.Map<MyObj, MyOtherObj>(s)));
but I end up with
IGrouping<string, IEnumerable<MyOtherObj>>
I know it is because of how I am accessing the enumerable that the IGrouping represents but I can't figure out how to do it.

You could just flatten the groupings with SelectMany(x => x) then do the GroupBy again, but then you're obviously doing the work twice.
You should be able to do the projection as part of the first GroupBy call instead.
Alternatively, you can add your own implementation of IGrouping, as described in "What is the implementing class for IGrouping?", then simply do:
groups.Select(g => new MyGrouping(g.Key, g.Select(myObj => Mapper.Map<MyObj,MyOtherObj>(myObj))))
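The custom-grouping idea can be sketched as follows. This is only a minimal illustration: the generic MyGrouping<TKey, TElement> class and the SelectElements extension method are hypothetical names, and the map delegate stands in for the mapper.Map call from the question. It runs in memory (LINQ to Objects), so with a Linq2Sql source you would switch to IEnumerable first.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: a minimal IGrouping implementation wrapping a key and a sequence.
public sealed class MyGrouping<TKey, TElement> : IGrouping<TKey, TElement>
{
    private readonly IEnumerable<TElement> _elements;

    public MyGrouping(TKey key, IEnumerable<TElement> elements)
    {
        Key = key;
        _elements = elements ?? throw new ArgumentNullException(nameof(elements));
    }

    public TKey Key { get; }

    public IEnumerator<TElement> GetEnumerator() => _elements.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class GroupingExtensions
{
    // Keeps each group's key and lazily projects its elements with the given map function.
    public static IEnumerable<IGrouping<TKey, TResult>> SelectElements<TKey, TSource, TResult>(
        this IEnumerable<IGrouping<TKey, TSource>> groups,
        Func<TSource, TResult> map)
    {
        return groups.Select(
            g => (IGrouping<TKey, TResult>)new MyGrouping<TKey, TResult>(g.Key, g.Select(map)));
    }
}
```

Under these assumptions, something like grouping.AsEnumerable().SelectElements(s => mapper.Map<MyObj, MyOtherObj>(s)) would yield an IEnumerable<IGrouping<string, MyOtherObj>> without grouping a second time.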

Related

using AsQueryable after ToList [duplicate]

I know some of the differences between LINQ to Entities and LINQ to Objects: the first implements IQueryable and the second implements IEnumerable. The scope of my question is EF 5.
My question is: what are the technical differences between those 3 methods (ToList(), AsEnumerable() and AsQueryable())? I see that in many situations all of them work. I also see combinations of them being used, like .ToList().AsQueryable().
What do those methods mean, exactly?
Is there any performance issue or something that would lead to the use of one over the other?
Why would one use, for example, .ToList().AsQueryable() instead of .AsQueryable()?

There is a lot to say about this. Let me focus on AsEnumerable and AsQueryable and mention ToList() along the way.
What do these methods do?
AsEnumerable and AsQueryable cast or convert to IEnumerable or IQueryable, respectively. I say cast or convert with a reason:
When the source object already implements the target interface, the source object itself is returned but cast to the target interface. In other words: the type is not changed, but the compile-time type is.
When the source object does not implement the target interface, the source object is converted into an object that implements the target interface. So both the type and the compile-time type are changed.
Let me show this with some examples. I've got this little method that reports the compile-time type and the actual type of an object (courtesy Jon Skeet):
void ReportTypeProperties<T>(T obj)
{
    Console.WriteLine("Compile-time type: {0}", typeof(T).Name);
    Console.WriteLine("Actual type: {0}", obj.GetType().Name);
}
Let's try an arbitrary linq-to-sql Table<T>, which implements IQueryable:
ReportTypeProperties(context.Observations);
ReportTypeProperties(context.Observations.AsEnumerable());
ReportTypeProperties(context.Observations.AsQueryable());
The result:
Compile-time type: Table`1
Actual type: Table`1
Compile-time type: IEnumerable`1
Actual type: Table`1
Compile-time type: IQueryable`1
Actual type: Table`1
You see that the table class itself is always returned, but its representation changes.
Now an object that implements IEnumerable, not IQueryable:
var ints = new[] { 1, 2 };
ReportTypeProperties(ints);
ReportTypeProperties(ints.AsEnumerable());
ReportTypeProperties(ints.AsQueryable());
The results:
Compile-time type: Int32[]
Actual type: Int32[]
Compile-time type: IEnumerable`1
Actual type: Int32[]
Compile-time type: IQueryable`1
Actual type: EnumerableQuery`1
There it is. AsQueryable() has converted the array into an EnumerableQuery, which "represents an IEnumerable<T> collection as an IQueryable<T> data source." (MSDN).
What's the use?
AsEnumerable is frequently used to switch from any IQueryable implementation to LINQ to objects (L2O), mostly because the former does not support functions that L2O has. For more details see "What is the effect of AsEnumerable() on a LINQ Entity?".
For example, in an Entity Framework query we can only use a restricted number of methods. So if, for example, we need to use one of our own methods in a query we would typically write something like
var query = context.Observations.Select(o => o.Id)
    .AsEnumerable().Select(x => MySuperSmartMethod(x));
ToList – which converts an IEnumerable<T> to a List<T> – is often used for this purpose as well. The advantage of using AsEnumerable vs. ToList is that AsEnumerable does not execute the query. AsEnumerable preserves deferred execution and does not build an often useless intermediate list.
On the other hand, when forced execution of a LINQ query is desired, ToList can be a way to do that.
AsQueryable can be used to make an enumerable collection accept expressions in LINQ statements. For more details see "Do i really need use AsQueryable() on collection?".
Note on substance abuse!
AsEnumerable works like a drug. It's a quick fix, but at a cost and it doesn't address the underlying problem.
In many Stack Overflow answers, I see people applying AsEnumerable to fix just about any problem with unsupported methods in LINQ expressions. But the price isn't always clear. For instance, if you do this:
context.MyLongWideTable // A table with many records and columns
    .Where(x => x.Type == "type")
    .Select(x => new { x.Name, x.CreateDate })
...everything is neatly translated into a SQL statement that filters (Where) and projects (Select). That is, both the length and the width, respectively, of the SQL result set are reduced.
Now suppose users only want to see the date part of CreateDate. In Entity Framework you'll quickly discover that...
.Select(x => new { x.Name, x.CreateDate.Date })
...is not supported (at the time of writing). Ah, fortunately there's the AsEnumerable fix:
context.MyLongWideTable.AsEnumerable()
    .Where(x => x.Type == "type")
    .Select(x => new { x.Name, x.CreateDate.Date })
Sure, it runs, probably. But it pulls the entire table into memory and then applies the filter and the projections. Well, most people are smart enough to do the Where first:
context.MyLongWideTable
    .Where(x => x.Type == "type").AsEnumerable()
    .Select(x => new { x.Name, x.CreateDate.Date })
But still all columns are fetched first and the projection is done in memory.
The real fix is:
context.MyLongWideTable
    .Where(x => x.Type == "type")
    .Select(x => new { x.Name, CreateDate = DbFunctions.TruncateTime(x.CreateDate) })
(But that requires just a little bit more knowledge...)
What do these methods NOT do?
Restore IQueryable capabilities
Now an important caveat. When you do
context.Observations.AsEnumerable()
    .AsQueryable()
you will end up with the source object represented as IQueryable. (Because both methods only cast and don't convert).
But when you do
context.Observations.AsEnumerable().Select(x => x)
    .AsQueryable()
what will the result be?
The Select produces a WhereSelectEnumerableIterator. This is an internal .Net class that implements IEnumerable, not IQueryable. So a conversion to another type has taken place and the subsequent AsQueryable can never return the original source anymore.
The implication of this is that using AsQueryable is not a way to magically inject a query provider with its specific features into an enumerable. Suppose you do
var query = context.Observations.Select(o => o.Id)
    .AsEnumerable().Select(x => x.ToString())
    .AsQueryable()
    .Where(...)
The Where condition will never be translated into SQL. AsEnumerable() followed by LINQ statements definitively cuts the connection with the Entity Framework query provider.
I deliberately show this example because I've seen questions here where people for instance try to 'inject' Include capabilities into a collection by calling AsQueryable. It compiles and runs, but it does nothing because the underlying object does not have an Include implementation anymore.
Execute
Neither AsQueryable nor AsEnumerable executes (or enumerates) the source object. They only change its type or representation. Both interfaces involved, IQueryable and IEnumerable, are nothing but "an enumeration waiting to happen". They are not executed before they're forced to do so, for example, as mentioned above, by calling ToList().
That means that executing an IEnumerable obtained by calling AsEnumerable on an IQueryable object will execute the underlying IQueryable. A subsequent execution of the IEnumerable will execute the IQueryable again, which may be very expensive.
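The cost of repeated execution can be made visible without a database. The sketch below is an illustration only: ExpensiveSource is a hypothetical stand-in for a provider-backed query, and the counter simply records how often the source is run from the start.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the underlying source has been enumerated from the start.
    public static int SourceRuns { get; private set; }

    // Hypothetical stand-in for an expensive query: every enumeration runs it again.
    private static IEnumerable<int> ExpensiveSource()
    {
        SourceRuns++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    // Returns the counter value observed after each step.
    public static int[] Run()
    {
        SourceRuns = 0;
        IEnumerable<int> query = ExpensiveSource().Where(x => x > 1).AsEnumerable();
        int afterBuild = SourceRuns;   // building the query executes nothing

        query.Count();                 // first execution of the source
        int afterCount = SourceRuns;

        query.First();                 // second execution of the source
        int afterFirst = SourceRuns;

        List<int> cached = query.ToList();   // third execution, result is now in memory
        int reused = cached.Count + cached[0]; // reading the list does not touch the source
        int afterCache = SourceRuns;

        return new[] { afterBuild, afterCount, afterFirst, afterCache };
    }
}
```

If a result is needed more than once, materializing it one time with ToList() and reusing the list avoids the repeated runs.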
Specific Implementations
So far, this was only about the Queryable.AsQueryable and Enumerable.AsEnumerable extension methods. But of course anybody can write instance methods or extension methods with the same names (and functions).
In fact, a common example of a specific AsEnumerable extension method is DataTableExtensions.AsEnumerable. DataTable does not implement IQueryable or IEnumerable, so the regular extension methods don't apply.

ToList()
Execute the query immediately
AsEnumerable()
lazy (execute the query later)
Parameter: Func<TSource, bool>
Loads EVERY record into application memory and then handles/filters them there. (E.g. with Where/Take/Skip it will select * from table1 into memory and then select the first X elements.) In this case what happens is: LINQ to SQL + LINQ to Objects.
AsQueryable()
lazy (execute the query later)
Parameter: Expression<Func<TSource, bool>>
Convert Expression into T-SQL (with the specific provider), query remotely and load result to your application memory.
That's why DbSet (in Entity Framework) also implements IQueryable: to get efficient queries.
It does not load every record, e.g. with Take(5) it will generate a select top 5 * SQL statement in the background. This makes the type friendlier to a SQL database, which is why it usually performs better and is recommended when dealing with a database.
So AsQueryable() usually works much faster than AsEnumerable(), as it first generates T-SQL that includes all the where conditions in your LINQ.
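The difference between the two parameter types listed above (a delegate versus an expression tree) can be sketched like this. It is a small in-memory illustration, not provider code; the class and method names are made up for the example, and an array wrapped with AsQueryable stands in for a real database-backed source.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class PredicateKinds
{
    // A Func is compiled code: it can only be invoked, not inspected.
    public static bool RunDelegate(int value)
    {
        Func<int, bool> compiled = x => x > 1;
        return compiled(value);
    }

    // An Expression is data describing the lambda, which a provider can translate (e.g. to SQL).
    public static string DescribeTree()
    {
        Expression<Func<int, bool>> tree = x => x > 1;
        return tree.Body.ToString();
    }

    // On an IQueryable, Where binds to Queryable.Where and is recorded in the query's expression.
    public static bool RecordsQueryableWhere()
    {
        IQueryable<int> source = new[] { 1, 2, 3 }.AsQueryable();
        IQueryable<int> filtered = source.Where(x => x > 1);
        return filtered.Expression is MethodCallExpression call
            && call.Method.DeclaringType == typeof(Queryable);
    }
}
```

The same lambda text compiles to either form depending on the static type of the source, which is why the compile-time type (IEnumerable or IQueryable) decides where a filter ends up running.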

ToList() will bring everything into memory, and then you will be working on it there.
So ToList().Where(...) (apply some filter) is executed locally.
AsQueryable() on a database-backed query keeps everything remote, i.e. a filter on it is sent to the database to be applied.
A queryable doesn't do anything until you execute it. ToList(), however, executes immediately.
Also, look at the answer to "Why use AsQueryable() instead of List()?".
EDIT:
Also, in your case, once you do ToList(), every subsequent operation is local, including those after AsQueryable(). You can't switch back to remote once you start executing locally.
Hope this makes it a little clearer.

I encountered bad performance with the code below.
void DoSomething<T>(IEnumerable<T> objects){
    var single = objects.First(); //load everything into memory before .First()
    ...
}
Fixed with
void DoSomething<T>(IEnumerable<T> objects){
    T single;
    if (objects is IQueryable<T>)
        single = objects.AsQueryable().First(); // SELECT TOP (1) ... is used
    else
        single = objects.First();
}
For an IQueryable, stay in IQueryable when possible; try not to use it like an IEnumerable.
Update: it can be further simplified into one expression, thanks to Gert Arnold.
T single = objects is IQueryable<T> q
    ? q.First()
    : objects.First();

Perform includes on key object, using LINQ, in a GroupBy situation

I have a relatively simple, yet somehow weirdly complicated case whereby I need to perform includes on a lengthy object graph, when I'm doing a group-by.
Here is roughly what my LINQ looks like:
var result = DbContext.ParentTable
    .Where(p => [...some criteria...])
    .GroupBy(p => p.Child)
    .Select(p => new
    {
        ChildObject = p.Key,
        AllTheThings = p.Sum(x => x.SomeNumericColumn),
        LatestAndGreatest = p.Max(x => x.SomeDateColumn)
    })
    .OrderByDescending(o => o.AllTheThings)
    .Take(100)
    .ToHashSet();
That gives me a listing of anonymous objects, just the way I want it, with child objects neatly associated with some aggregate stats about each of them. Fine. But I also need a fair share of the object graph associated with the child object.
This gets messier than it otherwise might because I want to reuse the existing code I already have for performing the includes. I.e., I have a static method which takes an IQueryable of my child object and, based upon parameters, gives me back another IQueryable with all the proper includes that I need (there are rather a lot of them).
I can't seem to figure out the correct way to take my child object as a queryable and give that to my include method, such that I get it back for expansion at the point where I assign it to the new anonymous object (where I'm saying ChildObject = p.Key).
Sorry if this is something of a duplicate -- I did search around and found solutions that were close to what I want, but not quite.

Clearing up many-to-many in Entity Framework 6 and Entity Framework Core querying

I have to eat humble pie and admit that I thought I understood many-to-many in EF6 and EF Core. I have your standard many-students, many-subjects scenario, but I get stuck when I try to navigate the collections to get at specific properties during projections: I can't figure out how to use either Select or SelectMany to get at the properties in my projection.
So, for example, how would I use either Select or SelectMany to finish this?
I need to understand these two LINQ methods properly, but can anyone help me?
Here is an example of where I'm stuck:
return await _db.Subjects
    .Include(s => s.Teachers)
    .Include(s => s.Students)
    .Where(s => s.Students.Select(x => x.Class.ClassName).Contains(classname))
    .Select(s => new SubjectViewModel
    {
        Class = s.Students.Select(p => p.Class.ClassName)
    })
So how do I complete this, do I do a SelectMany or Select? Oh and can anyone point me to some content other than MSDN to properly understand Select and SelectMany? Also can anyone show me how this would be done in EFCore? I think I just need help.

Based on what we have to work with (as @Matthew Cawley said in the comment), if you use Select you get an object of type IEnumerable<IEnumerable<string>> or IQueryable<IQueryable<string>>, which is a list of lists of strings.
If you need only one list of strings, you can use SelectMany, which iterates just like Select but flattens the results into one collection; you then apply the selector p => p.Class.ClassName to the flattened elements.
If you want to concatenate them into a single string you can use string.Join(",", <collection>), but not directly in the projection if you are using linq-to-sql, because it wouldn't know how to translate that into SQL code.
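To make the difference concrete, here is a small in-memory sketch. The Student and Subject classes are simplified, hypothetical stand-ins for the entities in the question (ClassName sits directly on the student here), and everything runs as LINQ to Objects rather than against EF.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified model for illustration only.
public sealed class Student
{
    public string ClassName { get; set; }
}

public sealed class Subject
{
    public List<Student> Students { get; } = new List<Student>();
}

public static class FlattenDemo
{
    public static List<Subject> Sample()
    {
        var maths = new Subject();
        maths.Students.Add(new Student { ClassName = "1A" });
        maths.Students.Add(new Student { ClassName = "1B" });
        var art = new Subject();
        art.Students.Add(new Student { ClassName = "2A" });
        return new List<Subject> { maths, art };
    }

    // Select keeps one inner list per subject: a list of lists.
    public static List<List<string>> Nested(IEnumerable<Subject> subjects) =>
        subjects.Select(s => s.Students.Select(p => p.ClassName).ToList()).ToList();

    // SelectMany flattens all students into one sequence before projecting.
    public static List<string> Flat(IEnumerable<Subject> subjects) =>
        subjects.SelectMany(s => s.Students).Select(p => p.ClassName).ToList();

    // string.Join runs in memory, after the data has been fetched.
    public static string Joined(IEnumerable<Subject> subjects) =>
        string.Join(",", Flat(subjects));
}
```

With the sample data, Nested gives two inner lists (one per subject), while Flat gives a single list of three class names that Joined turns into one comma-separated string.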

When using LINQ to Entities, is there a better method than casting to list in order to use unsupported code/extensions?

I find myself needing to do things like this frequently, and I'm just curious if there is a better way.
Suppose I have a class that holds a snapshot of some data:
private List<Person> _people;
In some method I populate that list from a LINQ query using Entity Framework, and perhaps I need to, for example, run a custom IEqualityComparer on it. Since this isn't supported in LINQ to entities, I end up with something like this:
_people = db.People.Where(...)
    .ToList()
    .Distinct(new MyCustomComparer())
    .ToList();
Another example might be using an extension method, which also is not supported in LINQ to entities:
_people = db.People.Where(...)
    .ToList()
    .Select(_ => new { Age = _.DOB.MyExtensionMethod() })
    .ToList();
In order to use either of these I have to cast the database entities into regular memory objects with the first ToList(), and then since I ultimately want a list anyway, I have a final cast ToList() at the end. This seems inefficient to me, and I'm wondering if there's a better pattern for these types of situations?

You can use AsEnumerable():
_people = db.People.Where(...)
    .AsEnumerable()
    .Distinct(new MyCustomComparer())
    .ToList();
Which is equivalent to:
IEnumerable<Person> _people = db.People.Where(...);
_people = _people.Distinct(new MyCustomComparer()).ToList();
This is not much of an improvement, but at least it doesn't create another List<T>, and it better expresses that you want to switch to the realm of IEnumerable<T> (in-memory).
See the MSDN documentation for Enumerable.AsEnumerable.
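A self-contained sketch of this pattern might look like the following. The Person shape, the NameComparer and the age filter are all assumptions made up for the example, and an in-memory array wrapped with AsQueryable can stand in for db.People when trying it out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity for illustration only.
public sealed class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hypothetical comparer: two people are "equal" when their names match, ignoring case.
public sealed class NameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y) =>
        StringComparer.OrdinalIgnoreCase.Equals(x?.Name, y?.Name);

    public int GetHashCode(Person obj) =>
        StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name ?? string.Empty);
}

public static class SnapshotDemo
{
    public static List<Person> AdultsDistinctByName(IQueryable<Person> people) =>
        people
            .Where(p => p.Age >= 18)       // still an expression a provider could translate
            .AsEnumerable()                // from here on, LINQ to Objects
            .Distinct(new NameComparer())  // the custom comparer runs in memory
            .ToList();                     // single materialization at the end
}
```

Keeping the translatable Where before AsEnumerable() means only the in-memory-only step (the comparer) is paid for locally.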

Faking IGrouping for LINQ

Imagine you have a large dataset that may or may not be filtered by a particular condition of the dataset elements that can be intensive to calculate. In the case where it is not filtered, the elements are grouped by the value of that condition - the condition is calculated once.
However, in the case where the filtering has taken place, although the subsequent code still expects to see an IEnumerable<IGrouping<TKey, TElement>> collection, it doesn't make sense to perform a GroupBy operation that would result in the condition being re-evaluated a second time for each element. Instead, I would like to be able to create an IEnumerable<IGrouping<TKey, TElement>> by wrapping the filtered results appropriately, and thus avoiding yet another evaluation of the condition.
Other than implementing my own class that provides the IGrouping interface, is there any other way I can implement this optimization? Are there existing LINQ methods to support this that would give me the IEnumerable<IGrouping<TKey, TElement>> result? Is there another way that I haven't considered?

"the condition is calculated once"
I hope those keys are still around somewhere...
If your data was in some structure like this:
public class CustomGroup<T, U>
{
    public T Key { get; set; }
    public IEnumerable<U> GroupMembers { get; set; }
}
You could project such items with a query like this:
var result = customGroups
    .SelectMany(cg => cg.GroupMembers, (cg, z) => new { Key = cg.Key, Value = z })
    .GroupBy(x => x.Key, x => x.Value);

Inspired by David B's answer, I have come up with a simple solution. So simple that I have no idea how I missed it.
In order to perform the filtering, I obviously need to know what value of the condition I am filtering by. Therefore, given a condition, c, I can just project the filtered list as:
filteredList.GroupBy(x => c)
This avoids any recalculation of properties on the elements (represented by x).
Another solution I realized would work is to reverse the ordering of my query and perform the grouping before I perform the filtering. This too would mean the conditions only get evaluated once, although it would unnecessarily allocate groupings that I wouldn't subsequently use.
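The constant-key trick can be sketched as below. Category is a hypothetical stand-in for the expensive condition, and the counter is there only to show that grouping by a constant does not evaluate it again.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConstantKeyGrouping
{
    // Counts how often the (pretend) expensive condition is evaluated.
    public static int Evaluations { get; private set; }

    // Hypothetical stand-in for the expensive condition.
    public static string Category(int n)
    {
        Evaluations++;
        return n % 2 == 0 ? "even" : "odd";
    }

    public static List<IGrouping<string, int>> FilterAndWrap(IEnumerable<int> data, string wanted)
    {
        Evaluations = 0;
        // The condition is evaluated once per element, during filtering.
        List<int> filtered = data.Where(n => Category(n) == wanted).ToList();
        // Constant key selector: wraps the filtered items in one group without re-evaluating.
        return filtered.GroupBy(_ => wanted).ToList();
    }
}
```

One thing to keep in mind with this approach: if the filter matches nothing, GroupBy produces no groups at all rather than one empty group.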

What about putting the result into a Lookup and using that from then on?
var lookup = data.ToLookup(i => Foo(i));
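A minimal sketch of the lookup idea follows. Foo here is a hypothetical stand-in for the expensive condition, with a counter added to show that it runs once per element when the lookup is built.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Counts how often the (pretend) expensive condition is evaluated.
    public static int Evaluations { get; private set; }

    // Hypothetical stand-in for the expensive condition.
    public static string Foo(int i)
    {
        Evaluations++;
        return i % 2 == 0 ? "even" : "odd";
    }

    // ToLookup executes immediately and evaluates the key selector once per element.
    public static ILookup<string, int> Build(IEnumerable<int> data)
    {
        Evaluations = 0;
        return data.ToLookup(i => Foo(i));
    }
}
```

An ILookup<TKey, TElement> is itself an IEnumerable<IGrouping<TKey, TElement>>, so it can be handed to code expecting groupings, and indexing it by a key (for example lookup["even"]) returns that group's elements, or an empty sequence for a key that is not present.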
