GroupBy linq works only on adjacent records - c#

I have a problem with the GroupBy operator in LINQ. I am using EF Core, so I am not sure whether this is a bug in the provider or whether I am doing something wrong. I am trying to group the records of a table. My grouping key is a non-anonymous class whose values come from navigating one level down the database structure.
In the following example, I want to group the records in ResultTable by the values of the TypeId and OtherId properties of the related DetailTable record.
ResultTable
- ResultId
- DetailId (FK)
DetailTable
- DetailId
- TypeId (maps to enum, not null)
- OtherId (FK, null)
My code to group is the following:
private IQueryable<IGrouping<Grouper, Result>> GetGroupedResults()
{
    var results = MyContext.ResultSet.Include(r => r.Detail);
    var groupedResults = results.GroupBy(r => new Grouper(r.Detail.TypeId, r.Detail.OtherId ?? 0));
    return groupedResults;
}
The definition of my grouper entity is as follows:
public class Grouper
{
    public Grouper(Type type, int otherId)
    {
        Type = type;
        OtherId = otherId;
    }
    public Type Type { get; } // this is an enum
    public int OtherId { get; }
    public override bool Equals(object obj)
    {
        var p = obj as Grouper;
        if (p == null)
        {
            return false;
        }
        return (Type == p.Type) && (OtherId == p.OtherId);
    }
    public override int GetHashCode()
    {
        return (int)Type;
    }
}
Here is what I would expect this to do. Let's say that I have the following records:
ResultTable
ResultId: 1, DetailId: 1
ResultId: 2, DetailId: 2
ResultId: 3, DetailId: 3
ResultId: 4, DetailId: 4
DetailTable
DetailId: 1, TypeId: 1, OtherId: 1
DetailId: 2, TypeId: 1, OtherId: 1
DetailId: 3, TypeId: 2, OtherId: NULL
DetailId: 4, TypeId: 1, OtherId: 1
Given that data, I would expect two groups with the following keys and values:
First group with key Grouper(Type: 1, OtherId: 1), values ResultId(1, 2, 4)
Second group with key Grouper(Type: 2, OtherId: 0), values ResultId(3)
However, this is not what I get. Instead I get three groups with the following keys and values:
First group with key Grouper(Type: 1, OtherId: 1), values ResultId(1, 2)
Second group with key Grouper(Type: 2, OtherId: 0), values ResultId(3)
Third group with key Grouper(Type: 1, OtherId: 1), values ResultId(4)
It seems as if the GroupBy operation were only capable of grouping records with consecutive IDs. Why is this happening?
Greetings
Luis.
EDIT: Using an anonymous type to perform grouping results in correct groups, could it be the way I am defining my Grouper class?
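The grouping described in the EDIT can be reproduced without a database. The following is a minimal sketch, assuming a hypothetical in-memory Row class that flattens a ResultTable record together with its DetailTable record. It only shows the grouping that the question expects when the key is an anonymous type; it makes no claim about how the EF Core provider handles the custom Grouper key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened row: one ResultTable record joined to its DetailTable record.
public class Row
{
    public int ResultId { get; set; }
    public int TypeId { get; set; }
    public int? OtherId { get; set; }
}

public static class GroupingDemo
{
    // The sample data from the question.
    public static List<Row> SampleRows() => new List<Row>
    {
        new Row { ResultId = 1, TypeId = 1, OtherId = 1 },
        new Row { ResultId = 2, TypeId = 1, OtherId = 1 },
        new Row { ResultId = 3, TypeId = 2, OtherId = null },
        new Row { ResultId = 4, TypeId = 1, OtherId = 1 },
    };

    // Groups by an anonymous-type key (compared by value) and renders each group as text.
    public static List<string> Describe(IEnumerable<Row> rows) =>
        rows.GroupBy(r => new { r.TypeId, OtherId = r.OtherId ?? 0 })
            .Select(g => $"Type {g.Key.TypeId}, OtherId {g.Key.OtherId}: " +
                         string.Join(",", g.Select(r => r.ResultId)))
            .ToList();

    public static void Main()
    {
        foreach (var line in Describe(SampleRows()))
        {
            Console.WriteLine(line);
        }
    }
}
```

With the sample data this produces two groups, one holding results 1, 2 and 4 and one holding result 3, regardless of whether matching rows are adjacent.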

Related

LINQ filter fields that starts with a particular integer

I need to filter records whose Id starts with a particular integer or list of integers. For example, when I enter an integer (e.g. 1), the query should return 1, 10, 119, 1187, 1098.
The following query gives me all the Schools that have an Id of 1. However, I need to modify it so that it returns 1, 10, 119, 1187, 1098.
return await _dbContext.Schools.Where(c => c.Id == Convert.ToInt32(schoolId)).Take(5).ToListAsync();
You could do something like this, assuming:
Id is an int
schoolId is a string
return await _dbContext
    .Schools
    .Where(c => c.Id.ToString().StartsWith(schoolId))
    .Take(5)
    .ToListAsync();
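The same prefix test can be tried out on an in-memory sequence. This is a small sketch with made-up IDs and a hypothetical helper name (IdsStartingWith); it runs with LINQ to Objects and says nothing about how a given EF provider translates ToString() and StartsWith to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixFilterDemo
{
    // Returns up to `limit` ids whose decimal text starts with `prefix`.
    public static List<int> IdsStartingWith(IEnumerable<int> ids, string prefix, int limit) =>
        ids.Where(id => id.ToString().StartsWith(prefix, StringComparison.Ordinal))
           .Take(limit)
           .ToList();

    public static void Main()
    {
        // Made-up sample ids, chosen to include the values from the question.
        var ids = new[] { 1, 2, 10, 21, 119, 1187, 1098, 301 };
        Console.WriteLine(string.Join(", ", IdsStartingWith(ids, "1", 5)));
    }
}
```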

Understanding the GroupJoin and Join in Linq chaining syntax (Homework)

I need help understanding the fourth argument of GroupJoin. From what I understand so far, GroupJoin takes four arguments. The first is the secondary (inner) list, and the second is a Func that returns the key from the first object type, that is, from the first list (people in this case). The third is a Func that returns the key from the second object type, from the second list (items in this case), and the fourth stores the grouped object with the group itself (I can't understand the code for this part). Considering this and having the code below:
var products = new Product[]
{
new Product { Id = 1, Type = "Phone", Model = "OnePlus", Price = 1000 },
new Product { Id = 2, Type = "Phone", Model = "Apple", Price = 2000 },
new Product { Id = 3, Type = "Phone", Model = "Samsung", Price = 1500 },
new Product { Id = 4, Type = "TV", Model = "Samsung 32", Price = 200 },
};
var people = new Person[]
{
new Person { Id = 1, Name = "Ivan Ivanov", Money = 150000 },
new Person { Id = 2, Name = "Dragan Draganov", Money = 250000 },
new Person { Id = 3, Name = "Ivelin Ivelinov", Money = 350000
}
};
var items = new Item[]
{
new Item { PersonId = 1, ProductId = 1, Amount = 1 },
new Item { PersonId = 1, ProductId = 4, Amount = 1 },
new Item { PersonId = 1, ProductId = 5, Amount = 1 },
new Item { PersonId = 1, ProductId = 7, Amount = 1 },
new Item { PersonId = 2, ProductId = 2, Amount = 1 },
};
Query:
var productOwnerList = people
.GroupJoin(
items,
o => o.Id,
i => i.PersonId,
(o, i) => new <--- (**)
{
Person = o,
Products = i
.Join(products,
o1 => o1.ProductId,
i2 => i2.Id,
(o1, i2) => i2) <--- (*)
.ToArray()
})
.ToArray();
Note that I have posted only a few rows of the data. I need help understanding what the fourth argument of the Join method is doing at (*) (does it store the grouped object with the group itself?). When I inspect the result, I see that it associates each person ID with the product keys and joins the two lists through the items list (one to many). But I cannot work out what exactly the line (o1, i2) => i2) means. It is obvious what it does (it puts all the products associated with a person ID into an array for each person), but what happens "under the hood"? One more question about the line marked (**): it creates a new object. Is this an anonymous class, and if not, what is it?
The fourth argument - which maps to the fifth parameter in the documentation (because the first parameter is the target of the extension method call) is just the result selector. It's a function accepting two parameters: the first is an element of the "outer" sequence (the people array in your case) and the second is a sequence of elements from the "inner" sequence (the items array in your case) which have the same key as the outer element. The function should return a "result" element, and the overall result of the method call is a sequence of those results.
The function is called once for each of the "outer" elements, so you'd have:
First call: person ID 1, and products with IDs 1, 4, 5, 7
Second call: person ID 2, and the product with ID 2
Third call: person ID 3, and an empty sequence of products
Your query is complex because you're using an anonymous type for your result, and constructing an instance of the anonymous type using another query. Here's a simpler query that might help to clarify:
var productOwnerList = people
    .GroupJoin(
        items,
        o => o.Id,
        i => i.PersonId,
        (person, personItems) => $"{person.Id}: {string.Join(",", personItems.Select(item => item.ProductId))}")
    .ToArray();
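To watch the result selector being called once per outer element, the simplified query can be wrapped in a small program. This is a sketch using the people and items data from the question; the Person and Item classes are trimmed to the properties the query needs, and the GroupJoinDemo and Summaries names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person { public int Id { get; set; } public string Name { get; set; } }
public class Item { public int PersonId { get; set; } public int ProductId { get; set; } }

public static class GroupJoinDemo
{
    // One result string per person: the person's ID followed by the matching product IDs.
    public static string[] Summaries(IEnumerable<Person> people, IEnumerable<Item> items) =>
        people.GroupJoin(
                  items,
                  person => person.Id,        // key of each outer element
                  item => item.PersonId,      // key of each inner element
                  // Result selector: one outer element plus all inner elements sharing its key.
                  (person, personItems) =>
                      person.Id + ": " + string.Join(",", personItems.Select(i => i.ProductId)))
              .ToArray();

    public static void Main()
    {
        var people = new[]
        {
            new Person { Id = 1, Name = "Ivan Ivanov" },
            new Person { Id = 2, Name = "Dragan Draganov" },
            new Person { Id = 3, Name = "Ivelin Ivelinov" },
        };
        var items = new[]
        {
            new Item { PersonId = 1, ProductId = 1 },
            new Item { PersonId = 1, ProductId = 4 },
            new Item { PersonId = 1, ProductId = 5 },
            new Item { PersonId = 1, ProductId = 7 },
            new Item { PersonId = 2, ProductId = 2 },
        };

        foreach (var line in Summaries(people, items))
        {
            Console.WriteLine(line);
        }
    }
}
```

Person 3 still produces a result, with an empty list after the colon, because GroupJoin calls the selector for every outer element even when no inner element matches.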

Entity Framework + LINQ Expression outer join error

I want to create a left outer join in a LINQ expression that queries data from a database via Entity Framework. This is the LINQ expression. What I am trying to do is look up the problem_vehicle_id from problemVehiclesTicket in the Problems table to see whether it exists; if it doesn't exist, I want to return a problem object that is null/empty. I believe this is a left outer join.
var ticketsDetails = (from tickets in DbContext.tickets
                      join problemVehiclesTicket in DbContext.problem_vehicle on tickets.tickets_id equals problemVehiclesTicket.tickets_id
                      join problems in DbContext.problem on problemVehiclesTicket.problem_vehicle_id equals problems.problem_vehicle_id into problemGroup
                      from problems in problemGroup.DefaultIfEmpty(new problem { })
                      where (tickets.tickets_id == ticketsId)
                      select new TicketsDetails
                      {
                          Ticket = tickets,
                          ProblemVehicle = problemVehiclesTicket,
                          Problems = problems,
                      }).ToList();
Problem is a class that mirrors the Problem table in the database:
`Problem`
id (int), description (string), type (short)
The error I get is "The entity or complex type 'SPOTS_Repository.speeding_offence' cannot be constructed in a LINQ to Entities query." The source is Entity Framework.
Any help is greatly appreciated.
The type problem in your case is a mapped entity. Therefore, you cannot project onto it. You can use an anonymous type or another non-mapped class (DTO).
Because your DefaultIfEmpty call constructs a new problem, which is a mapped entity, this is not allowed.
Fix
You do not need to pass anything to DefaultIfEmpty method. Actually in your case, you are not even allowed because the only thing you can pass is problem and that is mapped. Therefore, use .DefaultIfEmpty() without creating a new problem.
More Belabor
Here is an example, which will clarify the usage of DefaultIfEmpty:
Option 1: DefaultIfEmpty() with No Parameter
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty()
select j;
Output: 1, 2, 0, 0, 4
Why? Because 3 and 6 are not found and DefaultIfEmpty for an integer returns a 0.
Option 2: DefaultIfEmpty() with Parameter
In some cases we may want to indicate that if the item is not found in the join, what to return instead. We can do that by sending a single parameter to DefaultIfEmpty method like this:
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty(99) //<-- see this
select j;
Output: 1, 2, 99, 99, 4
Why? Because 3 and 6 are not found and we instructed DefaultIfEmpty to return a 99 in that case.
Please note that DefaultIfEmpty is a generic method. In my case it required an int because I am joining to the second list which is a List of int(s). In your case it is problem(s) but that is mapped. Therefore, you cannot construct it in your query.
Here is another example:
var depts = new List<Department>
{
new Department { Name = "Accounting" },
new Department { Name = "IT" },
new Department { Name = "Marketing" }
};
var persons = new List<Person>
{
new Person { DeptName = "Accounting", Name = "Bob" }
};
var selection2 =
from d in depts
join p in persons on d.Name equals p.DeptName into joined2
// See here DefaultIfEmpty can be passed a Person
from j2 in joined2.DefaultIfEmpty(new Person { DeptName = "Unknown", Name = "Alien" })
select j2;
foreach(var thisJ in selection2)
{
Console.WriteLine("Dept: {0}, Name: {1}", thisJ.DeptName, thisJ.Name);
}
Output:
Dept: Accounting, Name: Bob
Dept: Unknown, Name: Alien
Dept: Unknown, Name: Alien
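For comparison, here is the same kind of join with the parameterless DefaultIfEmpty(), which is the form recommended above. This is a LINQ to Objects sketch reusing the Department and Person shapes from the example; the LeftJoinDemo and Describe names are made up. For a reference type the default value is null, so the projection has to check for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Department { public string Name { get; set; } }
public class Person { public string DeptName { get; set; } public string Name { get; set; } }

public static class LeftJoinDemo
{
    // Left outer join: every department appears, with or without a matching person.
    public static List<string> Describe(IEnumerable<Department> depts, IEnumerable<Person> persons) =>
        (from d in depts
         join p in persons on d.Name equals p.DeptName into matches
         from m in matches.DefaultIfEmpty()   // m is null when no person matched
         select d.Name + ": " + (m == null ? "(nobody)" : m.Name))
        .ToList();

    public static void Main()
    {
        var depts = new List<Department>
        {
            new Department { Name = "Accounting" },
            new Department { Name = "IT" },
            new Department { Name = "Marketing" },
        };
        var persons = new List<Person>
        {
            new Person { DeptName = "Accounting", Name = "Bob" },
        };

        foreach (var line in Describe(depts, persons))
        {
            Console.WriteLine(line);
        }
    }
}
```

The null check happens in the projection into a plain string, so no instance of the joined type is constructed inside the query.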
public class problem
{
    public int id;
    public string description;
    public short type;
}
.DefaultIfEmpty(
    new problem
    {
        id = ticketsId,
        description = string.Empty,
    });
Create a class like this and use it in the LINQ query.
Hope it helps you.

Dividing up a table of two-way relationships into distinct groups

I'm working on an application where users can tag "components" as part of the workflow. In many cases, they end up with several tags that are synonyms of each other. They would like these to be grouped together so that when one tag is added to a component, the rest of the tags in the group can be added as well.
I decided to break up tag groups into two-way relationships between each pair of tags in the group. So if a group has tags 1 and 2, there's a record that looks like this:
ID  TagID  RelatedTagID
1   1      2
2   2      1
Basically, a group is represented as a Cartesian product of each tag in it. Extend that to 3 tags:
ID  Name
1   MM
2   Managed Maintenance
3   MSP
Our relationships look like this:
ID  TagID  RelatedTagID
1   1      2
2   2      1
3   1      3
4   3      1
5   2      3
6   3      2
I have a couple methods to group them together, but they're less than stellar. First, I wrote a view that lists each tag along with the list of tags in its group:
SELECT
TagKey AS ID,
STUFF
((SELECT ',' + cast(RelatedTagKey AS nvarchar)
FROM RelatedTags rt
WHERE rt.TagKey = t.TagKey
FOR XML PATH('')), 1, 1, '') AS RelatedTagKeys
FROM (
SELECT DISTINCT TagKey
FROM RelatedTags
) t
The problem with this is that each group appears in the results as many times as there are tags in it, which I wasn't able to think of a way to work around in a single query. So it gives me back:
ID  RelatedTagKeys
1   2,3
2   1,3
3   1,2
Then in my back-end, I discard all groups that contain a key that occurs in another group. Tags aren't being added to multiple groups, so that works, but I don't like how much extraneous data I'm pulling down.
The second solution I came up with was this LINQ query. The key used to group the tags is a listing of the group itself. This is probably much worse than I originally thought.
from t in Tags.ToList()
where t.RelatedTags.Any()
group t by
string.Join(",", (new List<int> { t.ID })
.Concat(t.RelatedTags.Select(i => i.Tag.ID))
.OrderBy(i => i))
into g
select g.ToList()
I really hate grouping by the result of calling string.Join, but when I tried just grouping by the list of keys, it didn't group properly, putting each tag in a group by itself. Also, the SQL it generated is monstrous. I'm not going to paste it here, but LINQPad shows that it generates about 12,000 lines of individual SELECT statements on my test database (we have 1562 tags and 67 records in RelatedTags).
These solutions work, but they're pretty naive and inefficient. I don't know where else to go with this, though. Any ideas?
I suppose working with your data gets easier if you have a groupId for each of your tags, such that tags that are related share the same value of groupId.
To explain what I mean, I added a second set of related tags to your dataset:
INSERT INTO tags ([ID], [Name]) VALUES
(1, 'MM'),
(2, 'Managed Maintenance'),
(3, 'MSP'),
(4, 'UM'),
(5, 'Unmanaged Maintenance');
and
INSERT INTO relatedTags ([ID], [TagID], [RelatedTagID]) VALUES
(1, 1, 2),
(2, 2, 1),
(3, 1, 3),
(4, 3, 1),
(5, 2, 3),
(6, 3, 2),
(7, 4, 5),
(8, 5, 4);
Then, a table holding the following information should make a lot of other things easier (I first explain the content of the table and then how to get it using a query):
tagId | groupId
------|--------
1 | 1
2 | 1
3 | 1
4 | 4
5 | 4
The data comprises two groups of related tags, i.e. {1,2,3} and {4,5}. Therefore, above table marks tags belonging to the same group with the same groupId, i.e. 1 for {1,2,3}, and 4 for {4,5}.
To achieve such a view/table, you could use the following query:
with rt as
( (select r2.tagId, r2.relatedTagId
from relatedTags r1 join relatedTags r2 on r1.tagId = r2.relatedTagId)
union
(select r3.tagId, r3.tagId as relatedTagId from relatedTags r3)
)
select rt.tagId, min(rt.relatedTagId) as groupId from rt
group by tagId
Of course, instead of introducing a new table / view, you could also extend your primary tags-table by a groupId attribute.
Hope this helps.
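The same groupId idea can be sketched in LINQ on the client side. This assumes, as in the question, that every pair of tags in a group is stored in both directions, so the smallest ID among a tag and its related tags identifies the group. The TagGroupDemo and GroupIds names and the tuple shape are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagGroupDemo
{
    // Maps each tag ID to the smallest tag ID in its group.
    // Assumes every pair of tags in a group is stored in both directions;
    // it does not follow chains of indirect relations.
    public static Dictionary<int, int> GroupIds(
        IEnumerable<(int TagId, int RelatedTagId)> relations) =>
        relations
            .GroupBy(r => r.TagId)
            .ToDictionary(
                g => g.Key,
                g => Math.Min(g.Key, g.Min(r => r.RelatedTagId)));

    public static void Main()
    {
        // The relatedTags rows from the INSERT statement above (ID column omitted).
        var relations = new List<(int TagId, int RelatedTagId)>
        {
            (1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2), (4, 5), (5, 4),
        };

        foreach (var pair in GroupIds(relations).OrderBy(p => p.Key))
        {
            Console.WriteLine($"tagId {pair.Key} -> groupId {pair.Value}");
        }
    }
}
```

Grouping the resulting dictionary by its values then yields one entry per tag group, which avoids using a joined string as the grouping key.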
I really don't understand the relationships. You didn't explain very well. But I somehow got same results. Not sure if I did it right.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication41
{
class Program
{
static void Main(string[] args)
{
Data.data = new List<Data>() {
new Data() { ID = 1, TagID = 1, RelatedTagID = 2},
new Data() { ID = 2, TagID = 2, RelatedTagID = 1},
new Data() { ID = 3, TagID = 1, RelatedTagID = 3},
new Data() { ID = 4, TagID = 3, RelatedTagID = 1},
new Data() { ID = 5, TagID = 2, RelatedTagID = 3},
new Data() { ID = 6, TagID = 3, RelatedTagID = 2}
};
var results = Data.data.GroupBy(x => x.RelatedTagID)
.OrderBy(x => x.Key)
.Select(x => new {
ID = x.Key,
RelatedTagKeys = x.Select(y => y.TagID).ToList()
}).ToList();
foreach (var result in results)
{
Console.WriteLine("ID = '{0}', RelatedTagKeys = '{1}'", result.ID, string.Join(",",result.RelatedTagKeys.Select(x => x.ToString())));
}
Console.ReadLine();
}
}
public class Data
{
public static List<Data> data { get; set; }
public int ID { get; set; }
public int TagID { get; set; }
public int RelatedTagID { get; set; }
}
}

Linq query to fetch DB records corresponding to List of int[2]

I am trying to count the VALUEs corresponding to a List<int[]> using Linq to Entity Framework.
1. I have a List<int[]> where each int[] in the List is of length 2.
2. I have a DB table VALUES which contains 3 columns, called ID, PARENT and VALUE, where each int[] in the List (see 1) may correspond to a record in the table, the 0 index being ID and the 1 index being PARENT. *But some arrays likely do not correspond to any existing records in the table.
3. Each combination of ID and PARENT corresponds to multiple DB records, with different VALUEs.
Several points that are important to note:
- One of the problems is that I can't rely on ID alone - each value is defined/located according to both the ID and PARENT.
- None of the int arrays repeat, though the value in each index may appear in several arrays, e.g.
List<int[]> myList = new List<int[]>();
myList.Add(new int[]{2, 1});
myList.Add(new int[]{3, 1}); //Notice - same "PARENT"
myList.Add(new int[]{4, 1}); //Notice - same "PARENT"
myList.Add(new int[]{3, 1}); //!!!! Cannot exist - already in the List
I can't seem to figure out how to request all of the VALUEs from the VALUES table that correspond to the ID, PARENT pairs in the List<int[]>.
I've tried several variations but keep arriving at the pitfall of attempting to compare an array in a linq statement... I can't seem to crack it without loading substantially more information that I actually need.
Probably the closest I've gotten is with the following line:
var myList = new List<int[]>();
// ... fill the list ...
var res = myContext.VALUES.Where(v => myList.Any(listItem => listItem[0] == v.ID && listItem[1] == v.PARENT));
Of course, this can't work because The LINQ expression node type 'ArrayIndex' is not supported in LINQ to Entities.
@chris huber
I tried it out but it was unsuccessful.
2 things:
1. Where you created "myValues" I have a DB table entity, not a List.
2. Due to point number 1, I am using LINQ to Entities, as opposed to LINQ to Objects.
My code then comes to something like this:
var q2 = from value in myContext.VALUES where myList.Select(x => new { ID = x.ElementAt(0), Parent = x.ElementAt(1) }).Contains(new { ID = value.ID, Parent = value.PARENT }) select value;
This returns the following error message when run:
LINQ to Entities does not recognize the method 'Int32 ElementAt[Int32](System.Collections.Generic.IEnumerable`1[System.Int32],Int32)' method, and this method cannot be translated into a store expression.
@Ovidiu
I attempted your solution as well but the same problem as above:
As I am using LINQ to Entities, there are simply certain things that cannot be performed; in this case, the ToString() method is "not recognized". Removing the ToString() call and simply concatenating Int32 + "|" + Int32 gives me a different error about LINQ to Entities not being able to cast an Int32 to Object.
Use the following LINQ expression:
List<int[]> myList = new List<int[]>();
myList.Add(new int[] { 2, 1 });
myList.Add(new int[] { 3, 1 }); //Notice - same "PARENT"
myList.Add(new int[] { 4, 1 }); //Notice - same "PARENT"
myList.Add(new int[] { 3, 1 });
List<int[]> myValues = new List<int[]>();
myValues.Add(new int[] { 2, 1 , 1});
myValues.Add(new int[] { 3, 1 , 2}); //Notice - same "PARENT"
myValues.Add(new int[] { 4, 1 , 3}); //Notice - same "PARENT"
myValues.Add(new int[] { 3, 1, 4 });
myValues.Add(new int[] { 3, 2, 4 });
var q2 = from value in myValues where myList.Select(x => new { ID = x.ElementAt(0), Parent = x.ElementAt(1) }).Contains(new { ID = value.ElementAt(0), Parent = value.ElementAt(1) }) select value;
var list = q2.ToList();
You could use this workaround: create a new List<string> from your List<int[]> and compare the values in your table with the new list.
I didn't test this with EF, but it might work
List<string> strList = myList.Select(x => x[0].ToString() + "|" + x[1].ToString()).ToList();
var res = myContext.VALUES.Where(x => strList.Contains(SqlFunctions.StringConvert((double)x.ID).Trim() + "|" + SqlFunctions.StringConvert((double)x.PARENT).Trim()));
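For reference, the pair matching that both answers aim for looks like this on rows that are already in memory. This is only a sketch of the intended semantics, assuming a hypothetical ValueRow class for the VALUES table; it uses value tuples in a HashSet and runs with LINQ to Objects, so it is not a query that LINQ to Entities would translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shape of one VALUES record.
public class ValueRow
{
    public int ID { get; set; }
    public int PARENT { get; set; }
    public int VALUE { get; set; }
}

public static class PairFilterDemo
{
    // Keeps the rows whose (ID, PARENT) combination appears in `pairs`.
    public static List<ValueRow> Matching(IEnumerable<ValueRow> rows, IEnumerable<int[]> pairs)
    {
        // Value tuples compare by value, unlike int[] which compares by reference.
        var wanted = new HashSet<(int, int)>(pairs.Select(p => (p[0], p[1])));
        return rows.Where(r => wanted.Contains((r.ID, r.PARENT))).ToList();
    }

    public static void Main()
    {
        var pairs = new List<int[]>
        {
            new[] { 2, 1 }, new[] { 3, 1 }, new[] { 4, 1 },
        };
        var rows = new List<ValueRow>
        {
            new ValueRow { ID = 2, PARENT = 1, VALUE = 1 },
            new ValueRow { ID = 3, PARENT = 1, VALUE = 2 },
            new ValueRow { ID = 4, PARENT = 1, VALUE = 3 },
            new ValueRow { ID = 3, PARENT = 1, VALUE = 4 },
            new ValueRow { ID = 3, PARENT = 2, VALUE = 4 },
        };

        Console.WriteLine(Matching(rows, pairs).Count);
    }
}
```

The row with ID 3 and PARENT 2 is excluded because only the full pair counts, which is the behaviour the string key "ID|PARENT" emulates in the workaround above.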
