Select Distinct rows using Linq

The data is as follows:
ID  Title         Category  About           Link        CategoryID
1   The Matrix    Sci-Fi    Text goes here  http://...  1
2   The Simpsons  Cartoon   Text goes here  http://...  2
3   Avengers      Action    Text goes here  http://...  3
4   The Matrix    Sci-Fi    Text goes here  http://...  1
5   The One       Sci-Fi    Text goes here  http://...  1
6   The Hobbit    Sci-Fi    Text goes here  http://...  1
I have a checkbox list containing the categories. The problem is that if the user selects 'Action' and 'Sci-Fi' as the categories to display, The Matrix is displayed twice.
This is my attempt at getting unique rows with a SQL query:
select distinct title, about, link from mytable
inner join tableCategories on categoryID = tableCategoriesID
group by title, about, link
Using LINQ:
(from table in movieTables
 join x in categoryIDList
 on table.categoryID equals x
 select table).Distinct()
Note that the categories are in a separate table linked by the categoryID.
I need help displaying unique (distinct) rows in LINQ.

You can happily select your result into a list of whatever you want:
var v = from entry in tables
where matching_logic_here
select new {id = some_id, val=some_value};
and then you can run your distinct on that list (well, a ToList() on the above will make it one), based on your needs.
The following should illustrate what I mean (just paste it into LINQPad; if you're using VS, get rid of the .Dump() calls):
void Main()
{
    var input = new List<mock_entry> {
        new mock_entry {id = 1, name = "The Matrix", cat = "Sci-Fi"},
        new mock_entry {id = 2, name = "The Simpsons", cat = "Cartoon"},
        new mock_entry {id = 3, name = "Avengers", cat = "Action"},
        new mock_entry {id = 4, name = "The Matrix", cat = "Sci-Fi"},
        new mock_entry {id = 5, name = "The One", cat = "Sci-Fi"},
        new mock_entry {id = 6, name = "The Hobbit", cat = "Sci-Fi"},
    };
    // Filter by category, then project to an anonymous type without the id
    var v = input.Where(e => e.cat == "Action" || e.cat == "Sci-Fi")
        .Dump()
        .Select(e => new {n = e.name, c = e.cat})
        .Dump()
        ;
    // Anonymous types compare by value, so Distinct() removes the repeated row
    var d = v.Distinct()
        .Dump()
        ;
}
// Define other methods and classes here
public struct mock_entry {
    public int id {get;set;}
    public string name {get;set;}
    public string cat {get;set;}
}
Another option would be to use DistinctBy from MoreLINQ.
Edit:
Even simpler, you can use GroupBy and just select the first entry of each group (you'll lose the ids of the dropped duplicates though, but that's up to you).
Here's an example that will work with the above:
var v = input.GroupBy(i => i.name)
    .Select(e => e.First())
    .Dump()
    .Where(e => e.cat == "Action" || e.cat == "Sci-Fi")
    .Dump()
    ;
will yield:
1 The Matrix Sci-Fi
3 Avengers Action
5 The One Sci-Fi
6 The Hobbit Sci-Fi
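To tie this back to the checkbox list in the question, the category filter and the de-duplication can be combined in one helper. This is only a minimal sketch under stated assumptions: the Movie class, the MovieFilter.DistinctByTitle name, and passing the selected category IDs as a collection are all invented here for illustration and are not the asker's actual types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the asker's table
public class Movie
{
    public int Id { get; set; }
    public string Title { get; set; }
    public int CategoryId { get; set; }
}

public static class MovieFilter
{
    // Keeps rows whose category is selected, then keeps the first row per title
    public static List<Movie> DistinctByTitle(
        IEnumerable<Movie> movies,
        ICollection<int> selectedCategoryIds)
    {
        return movies
            .Where(m => selectedCategoryIds.Contains(m.CategoryId))
            .GroupBy(m => m.Title)      // one group per distinct title
            .Select(g => g.First())     // first row of each group survives
            .ToList();
    }
}
```

With the sample rows from the question and categories 1 and 3 selected, this should return the rows with ids 1, 3, 5 and 6, because GroupBy on in-memory sequences yields groups in order of first appearance.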

Related

Create sub array of duplicates and randomize order c#

I am returning a list of items from the Database. List contains info like id, first name, last name, # of sales, $ revenue.
1 John James 431 213000
2 Scott Smith 301 43000
3 Jane Doe 431 300000
4 Tess Jones 431 14280
my results will contain the 4 rows as shown above. I am ordering my rows in descending order based on the # sales value. If the # sales value is the same then I order by id. So after ordering my results will look like so:
3 Jane Doe 431 300000
1 John James 431 213000
4 Tess Jones 431 14280
2 Scott Smith 301 43000
Now I would like to randomize the order of the rows where the # of sales is the same. I assume that means I won't order by id any more.
How can I create a subarray of the results where the # of sales is the same, randomize its order, and insert it back into the original array? The main reason I am doing this is to add some variety to the data I display, so that it is not always the same and every user gets the chance to appear in a different position.
This is how I am getting my original data:
results = results.OrderByDescending(x => Math.Max((uint)x.TotalSales, x.TotalRevenue))
.GroupBy(x => x.Id)
.Select(x => x.First()).ToList();
I believe I will be applying the randomizing part as follows:
var randomIndx= new Random(TotalSales).Next(100) % (#: will this be the number of duplicates?);
so that
I am not sure how to put things together. Any helpful tips are much appreciated.

Apply an ordering to your collection using Random, which will shuffle the results. Using ThenBy will ensure that entries are shuffled only within their previous-level ordering (items will be shuffled within their same "NumOfSales").
For example, with class:
public class Employee
{
public int Id{get;set;}
public string Name{get;set;}
public int NumOfSales{get;set;}
public int Revenue{get;set;}
}
Your code could look like this:
var originalEmployeeList = new List<Employee>()
{
new Employee { Id = 1, Name = "John James", NumOfSales = 431, Revenue = 213000 },
new Employee { Id = 2, Name = "Scott Smith", NumOfSales = 301, Revenue = 43000 },
new Employee { Id = 3, Name = "Jane Doe", NumOfSales = 431, Revenue = 300000 },
new Employee { Id = 4, Name = "Tess Jones", NumOfSales = 431, Revenue = 14280 },
};
var random = new Random();
var randomizedResults = originalEmployeeList
.OrderByDescending(x => x.NumOfSales)
.ThenBy(x => random.Next())
.ToList();
The key here is using random.Next() INSIDE a ThenBy. With this example, people with a NumOfSales = 431 will always appear before people with a NumOfSales = 301, but the listing of people within the same NumOfSales will be randomized.
Here's a runnable example: https://dotnetfiddle.net/W2ESGt
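For readers who want the literal "sub array" approach described in the question, a rough alternative is to group the rows by the sort key, shuffle each group, and concatenate the groups again. The sketch below swaps in a Fisher-Yates shuffle for the ThenBy(random.Next()) trick; the TieShuffler class and both method names are hypothetical, made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TieShuffler
{
    // In-place Fisher-Yates shuffle
    public static void Shuffle<T>(IList<T> items, Random rng)
    {
        for (int i = items.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // random index with 0 <= j <= i
            (items[i], items[j]) = (items[j], items[i]);
        }
    }

    // Orders by key descending; rows sharing a key come out in random order
    public static List<T> OrderDescendingShufflingTies<T, TKey>(
        IEnumerable<T> source,
        Func<T, TKey> keySelector,
        Random rng)
    {
        var result = new List<T>();
        foreach (var group in source.GroupBy(keySelector).OrderByDescending(g => g.Key))
        {
            var bucket = group.ToList(); // the "sub array" of rows with equal keys
            Shuffle(bucket, rng);
            result.AddRange(bucket);
        }
        return result;
    }
}
```

A call such as TieShuffler.OrderDescendingShufflingTies(employees, e => e.NumOfSales, new Random()) would keep the 431 group ahead of the 301 group while varying the order inside the 431 group from run to run.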

EF duplicate values in secondary table

I'm developing a web application using Entity Framework.
I need to run a select and pass the values to an IList, but it returns duplicate values.
IQueryable<establishmentInfo> filter = (from x in db.establishments
join t in db.establishment_categories on x.id equals t.establishment
join q in db.categories on t.category equals q.id
where (x.name.ToUpper().Contains(search.ToUpper()))
select new establishmentInfo
{
id = x.id,
name = x.name,
id_category = q.id,
category = q.name,
});
IList<establishmentInfo>establishments = filter.ToList();
Establishment table
id   name   email
---------------------------
1    AAA    a@a.com
2    BBB    b@b.com
Establishment_categories
id   establishment   category
-------------------------------
1    1               1
2    1               2
3    2               1
Categories
id   name
---------------------
1    alpha
2    beta
The problem is that it returns the same establishment twice, once with category 1 and once with category 2. I need to remove one of these.
Can anyone help?

As @NetMage said, your LINQ statement returns two rows, and they are not duplicates of each other.
There are two records with establishment set to 1 in your Establishment_categories table, so that establishment is returned once per category. If you inspect your establishments list, one entry should have id_category 1 and category alpha, and the other should have id_category 2 and category beta.
If you only want the first matching row, you can write the code as follows:
IQueryable<Establishment> filter = (from x in _context.Establishments
join t in _context.Establishment_Categories on x.Id equals t.EstablishmentId
join q in _context.Categories on t.CategoryId equals q.Id
where x.Name.ToUpper().Contains(search.ToUpper())
select new Establishment
{
Id = x.Id,
Name = x.Name,
CategoryId = q.Id,
CategoryName = q.Name,
}).Take(1);
List<Establishment> establishments = filter.ToList();
By the way, if your returned data does contain true duplicates, you can append .Distinct() to your LINQ query to remove them.
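If the goal is one row per establishment with all of its categories attached, rather than an arbitrary first row, grouping is another option. The following is an in-memory LINQ to Objects sketch only; the EstablishmentGrouping class, the tuple shapes, and the method name are assumptions for illustration, and how such a query translates in Entity Framework is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EstablishmentGrouping
{
    // Returns each establishment once, with the names of all its categories
    public static List<(int Id, string Name, List<string> Categories)> OneRowPerEstablishment(
        IEnumerable<(int Id, string Name)> establishments,
        IEnumerable<(int EstablishmentId, int CategoryId)> links,
        IEnumerable<(int Id, string Name)> categories)
    {
        return establishments
            .GroupJoin(
                links,
                e => e.Id,
                l => l.EstablishmentId,
                // Called once per establishment with its matching link rows
                (e, ls) => (
                    Id: e.Id,
                    Name: e.Name,
                    Categories: ls
                        .Join(categories, l => l.CategoryId, c => c.Id, (l, c) => c.Name)
                        .ToList()))
            .ToList();
    }
}
```

With the three tables from the question, establishment AAA would carry the categories alpha and beta, and BBB would carry alpha only.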

Understanding the GroupJoin and Join in Linq chaining syntax (Homework)

I need help understanding the fourth argument of GroupJoin. From what I understand so far, GroupJoin takes four arguments. (1, 2) The first is the secondary list, and the second is a Func that returns the key from the first object type, in other words from the first list, in this case people. (3, 4) The third is a Func that returns the key from the second object type, from the second list, in this case items, and the fourth is one that stores the grouped object with the group itself (I can't understand the code for this part). Given this and the code below:
var products = new Product[]
{
new Product { Id = 1, Type = "Phone", Model = "OnePlus", Price = 1000 },
new Product { Id = 2, Type = "Phone", Model = "Apple", Price = 2000 },
new Product { Id = 3, Type = "Phone", Model = "Samsung", Price = 1500 },
new Product { Id = 4, Type = "TV", Model = "Samsung 32", Price = 200 },
};
var people = new Person[]
{
new Person { Id = 1, Name = "Ivan Ivanov", Money = 150000 },
new Person { Id = 2, Name = "Dragan Draganov", Money = 250000 },
new Person { Id = 3, Name = "Ivelin Ivelinov", Money = 350000 }
};
var items = new Item[]
{
new Item { PersonId = 1, ProductId = 1, Amount = 1 },
new Item { PersonId = 1, ProductId = 4, Amount = 1 },
new Item { PersonId = 1, ProductId = 5, Amount = 1 },
new Item { PersonId = 1, ProductId = 7, Amount = 1 },
new Item { PersonId = 2, ProductId = 2, Amount = 1 },
};
Query:
var productOwnerList = people
.GroupJoin(
items,
o => o.Id,
i => i.PersonId,
(o, i) => new <--- (**)
{
Person = o,
Products = i
.Join(products,
o1 => o1.ProductId,
i2 => i2.Id,
(o1, i2) => i2) <--- (*)
.ToArray()
})
.ToArray();
Just to mention, I have posted only a few lines of the data. I need help understanding what the fourth argument of the Join method is doing here -> (*) (stores the grouped object with the group itself?). When I look at the result, I see that it takes all the person IDs associated with the product keys and joins the two lists based on the items list (one to many). But I cannot work out what exactly the line (o1, i2) => i2) means. It is obvious what it does (it puts all the items associated with a person ID into an array (items[]) for every person), but what happens "under the hood" here? One more question, about (**): this line creates a new object. Is it an anonymous class, and if not, what is it?

The fourth argument (which maps to the fifth parameter in the documentation, because the first parameter is the target of the extension method call) is just the result selector. It's a function accepting two parameters: the first is an element of the "outer" sequence (the people array in your case) and the second is a sequence of elements from the "inner" sequence (the items array in your case) which have the same key as the outer element. The function should return a "result" element, and the overall result of the method call is a sequence of those results.
The function is called once for each of the "outer" elements, so you'd have:
First call: person ID 1, and products with IDs 1, 4, 5, 7
Second call: person ID 2, and the product with ID 2
Third call: person ID 3, and an empty sequence of products
Your query is complex because you're using an anonymous type for your result, and constructing an instance of the anonymous type using another query. Here's a simpler query that might help to clarify:
var productOwnerList = people
    .GroupJoin(
        items,
        o => o.Id,
        i => i.PersonId,
        // Result selector: called once per person with that person's matching items
        (person, personItems) => $"{person.Id}: {string.Join(",", personItems.Select(item => item.ProductId))}")
    .ToArray();
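The inner Join in the question follows the same pattern: its last argument is also a result selector, called once for every (item, product) pair whose keys match, and (o1, i2) => i2 simply returns the product half of each pair. Here is a stripped-down sketch that uses plain ints and tuples instead of the Person, Item and Product classes; the OwnerQuery class and the method name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OwnerQuery
{
    public static List<(int PersonId, int[] ProductIds)> ProductIdsPerPerson(
        IEnumerable<int> personIds,
        IEnumerable<(int PersonId, int ProductId)> items,
        IEnumerable<int> productIds)
    {
        return personIds
            .GroupJoin(
                items,
                p => p,
                i => i.PersonId,
                // Outer result selector: one call per person
                (p, personItems) => (
                    PersonId: p,
                    ProductIds: personItems
                        // Inner result selector: one call per matching
                        // (item, product) pair; only the product is kept
                        .Join(productIds, i => i.ProductId, id => id, (i, id) => id)
                        .ToArray()))
            .ToList();
    }
}
```

Because Join is an inner join, items that point at a product ID missing from the product list (5 and 7 in the question's data) produce no pair and therefore do not show up in the result.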

Entity Framework + LINQ Expression outer join error

I want to create a left outer join in a LINQ expression that queries data from the database via Entity Framework. This is the LINQ expression. What I am trying to do is look up the problem_vehicle_id from problemVehiclesTicket in the Problems table to see whether it exists; if it doesn't exist, I want to return a problem object that is null/empty. I believe this is a left outer join.
var ticketsDetails = (from tickets in DbContext.tickets
                      join problemVehiclesTicket in DbContext.problem_vehicle on tickets.tickets_id equals problemVehiclesTicket.tickets_id
                      join problems in DbContext.problem on problemVehiclesTicket.problem_vehicle_id equals problems.problem_vehicle_id into problemGroup
                      from problems in problemGroup.DefaultIfEmpty(new problem { })
                      where (tickets.tickets_id == ticketsId)
                      select new TicketsDetails
                      {
                          Ticket = tickets,
                          ProblemVehicle = problemVehiclesTicket,
                          Problems = problems,
                      }).ToList();
Problem is a class that mirrors the Problem table in the database:
`Problem`
id (int), description (string), type (short)
The error I got is "The entity or complex type 'SPOTS_Repository.speeding_offence' cannot be constructed in a LINQ to Entities query." The source is Entity Framework.
Any help is greatly appreciated.

The type problem in your case is a mapped entity, so you cannot project onto it. You can use an anonymous type or another non-mapped class (a DTO) instead.
In your DefaultIfEmpty call you are constructing a new problem, which is a mapped entity, and this is not allowed.
Fix
You do not need to pass anything to DefaultIfEmpty. In your case you are not even allowed to, because the only thing you could pass is a problem, and that is mapped. Therefore, use .DefaultIfEmpty() without creating a new problem.
More Belabor
Here is an example, which will clarify the usage of DefaultIfEmpty:
Option 1: DefaultIfEmpty() with No Parameter
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty()
select j;
Output: 1, 2, 0, 0, 4
Why? Because 3 and 6 are not found and DefaultIfEmpty for an integer returns a 0.
Option 2: DefaultIfEmpty() with Parameter
In some cases we may want to indicate that if the item is not found in the join, what to return instead. We can do that by sending a single parameter to DefaultIfEmpty method like this:
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty(99) //<-- see this
select j;
Output: 1, 2, 99, 99, 4
Why? Because 3 and 6 are not found and we instructed DefaultIfEmpty to return a 99 in that case.
Please note that DefaultIfEmpty is a generic method. In my case it required an int because I am joining to the second list which is a List of int(s). In your case it is problem(s) but that is mapped. Therefore, you cannot construct it in your query.
Here is another example:
var depts = new List<Department>
{
new Department { Name = "Accounting" },
new Department { Name = "IT" },
new Department { Name = "Marketing" }
};
var persons = new List<Person>
{
new Person { DeptName = "Accounting", Name = "Bob" }
};
var selection2 =
from d in depts
join p in persons on d.Name equals p.DeptName into joined2
// See here DefaultIfEmpty can be passed a Person
from j2 in joined2.DefaultIfEmpty(new Person { DeptName = "Unknown", Name = "Alien" })
select j2;
foreach(var thisJ in selection2)
{
Console.WriteLine("Dept: {0}, Name: {1}", thisJ.DeptName, thisJ.Name);
}
Output:
Dept: Accounting, Name: Bob
Dept: Unknown, Name: Alien
Dept: Unknown, Name: Alien
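To connect this back to the ticket scenario, here is a small in-memory sketch of a left outer join that uses the parameterless DefaultIfEmpty() and projects into a plain result shape instead of constructing an entity. Everything in it is an assumption for illustration: the LeftJoinSketch class, the tuple shapes, and the idea of surfacing only the description. It runs on LINQ to Objects, so it shows the query shape but says nothing about Entity Framework translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeftJoinSketch
{
    // Every ticket appears at least once; tickets with no problem
    // get a null description instead of a constructed placeholder object
    public static List<(int TicketId, string ProblemDescription)> TicketsWithProblems(
        IEnumerable<int> ticketIds,
        IEnumerable<(int TicketId, string Description)> problems)
    {
        var query =
            from t in ticketIds
            join p in problems on t equals p.TicketId into problemGroup
            // Project to string first so the default value is null
            from description in problemGroup.Select(x => x.Description).DefaultIfEmpty()
            select (TicketId: t, ProblemDescription: description);

        return query.ToList();
    }
}
```

The caller then checks ProblemDescription for null to detect tickets without a matching problem.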

public class problem
{
    public int id;
    public string description;
    public short type;
}
.DefaultIfEmpty(
    new problem
    {
        id = ticketsId,
        description = string.Empty,
    });
Create a class like this and use it in the LINQ query.
Hope it helps you.

Dividing up a table of two-way relationships into distinct groups

I'm working on an application where users can tag "components" as part of the workflow. In many cases, they end up with several tags that are synonyms of each other. They would like these to be grouped together so that when one tag is added to a component, the rest of the tags in the group can be added as well.
I decided to break up tag groups into two-way relationships between each pair of tags in the group. So if a group has tags 1 and 2, there's a record that looks like this:
ID TagID RelatedTagID
1 1 2
2 2 1
Basically, a group is represented as a Cartesian product of each tag in it. Extend that to 3 tags:
ID Name
1 MM
2 Managed Maintenance
3 MSP
Our relationships look like this:
ID TagID RelatedTagID
1 1 2
2 2 1
3 1 3
4 3 1
5 2 3
6 3 2
I have a couple methods to group them together, but they're less than stellar. First, I wrote a view that lists each tag along with the list of tags in its group:
SELECT
TagKey AS ID,
STUFF
((SELECT ',' + cast(RelatedTagKey AS nvarchar)
FROM RelatedTags rt
WHERE rt.TagKey = t.TagKey
FOR XML PATH('')), 1, 1, '') AS RelatedTagKeys
FROM (
SELECT DISTINCT TagKey
FROM RelatedTags
) t
The problem with this is that each group appears in the results as many times as there are tags in it, which I wasn't able to think of a way to work around in a single query. So it gives me back:
ID RelatedTagKeys
1 2,3
2 1,3
3 1,2
Then in my back-end, I discard all groups that contain a key that occurs in another group. Tags aren't being added to multiple groups, so that works, but I don't like how much extraneous data I'm pulling down.
The second solution I came up with was this LINQ query. The key used to group the tags is a listing of the group itself. This is probably much worse than I originally thought.
from t in Tags.ToList()
where t.RelatedTags.Any()
group t by
string.Join(",", (new List<int> { t.ID })
.Concat(t.RelatedTags.Select(i => i.Tag.ID))
.OrderBy(i => i))
into g
select g.ToList()
I really hate grouping by the result of calling string.Join, but when I tried just grouping by the list of keys, it didn't group properly, putting each tag in a group by itself. Also, the SQL it generated is monstrous. I'm not going to paste it here, but LINQPad shows that it generates about 12,000 lines of individual SELECT statements on my test database (we have 1562 tags and 67 records in RelatedTags).
These solutions work, but they're pretty naive and inefficient. I don't know where else to go with this, though. Any ideas?

I suppose working with your data gets easier if you have a groupId for each of your tags, such that tags that are related share the same value of groupId.
To explain what I mean, I added a second set of related tags to your dataset:
INSERT INTO tags ([ID], [Name]) VALUES
(1, 'MM'),
(2, 'Managed Maintenance'),
(3, 'MSP'),
(4, 'UM'),
(5, 'Unmanaged Maintenance');
and
INSERT INTO relatedTags ([ID], [TagID], [RelatedTagID]) VALUES
(1, 1, 2),
(2, 2, 1),
(3, 1, 3),
(4, 3, 1),
(5, 2, 3),
(6, 3, 2),
(7, 4, 5),
(8, 5, 4);
Then, a table holding the following information should make a lot of other things easier (I first explain the content of the table and then how to get it using a query):
tagId | groupId
------|--------
1 | 1
2 | 1
3 | 1
4 | 4
5 | 4
The data comprises two groups of related tags, i.e. {1,2,3} and {4,5}. The table above therefore marks tags belonging to the same group with the same groupId, i.e. 1 for {1,2,3} and 4 for {4,5}.
To achieve such a view/table, you could use the following query:
with rt as
( (select r2.tagId, r2.relatedTagId
from relatedTags r1 join relatedTags r2 on r1.tagId = r2.relatedTagId)
union
(select r3.tagId, r3.tagId as relatedTagId from relatedTags r3)
)
select rt.tagId, min(rt.relatedTagId) as groupId from rt
group by tagId
Of course, instead of introducing a new table / view, you could also extend your primary tags-table by a groupId attribute.
Hope this helps.
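The same "smallest ID in the group" idea can be sketched in C# over rows that have already been loaded. This is a hedged illustration, not part of the original answer: the TagGroups class and GroupIdByTag method are hypothetical names, and the sketch assumes, as in the question, that every pair of tags in a group is stored in both directions, so each tag's own relations already list every other member of its group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagGroups
{
    // Maps each tag to the smallest tag ID in its group
    public static Dictionary<int, int> GroupIdByTag(
        IEnumerable<(int TagId, int RelatedTagId)> relations)
    {
        return relations
            .GroupBy(r => r.TagId)
            .ToDictionary(
                g => g.Key,
                // Compare the tag itself with all of its related tags
                g => Math.Min(g.Key, g.Min(r => r.RelatedTagId)));
    }
}
```

Grouping the resulting dictionary by value then yields one entry per tag group, which avoids building a string key with string.Join.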

I really don't understand the relationships. You didn't explain very well. But I somehow got same results. Not sure if I did it right.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication41
{
class Program
{
static void Main(string[] args)
{
Data.data = new List<Data>() {
new Data() { ID = 1, TagID = 1, RelatedTagID = 2},
new Data() { ID = 2, TagID = 2, RelatedTagID = 1},
new Data() { ID = 3, TagID = 1, RelatedTagID = 3},
new Data() { ID = 4, TagID = 3, RelatedTagID = 1},
new Data() { ID = 5, TagID = 2, RelatedTagID = 3},
new Data() { ID = 6, TagID = 3, RelatedTagID = 2}
};
var results = Data.data.GroupBy(x => x.RelatedTagID)
.OrderBy(x => x.Key)
.Select(x => new {
ID = x.Key,
RelatedTagKeys = x.Select(y => y.TagID).ToList()
}).ToList();
foreach (var result in results)
{
Console.WriteLine("ID = '{0}', RelatedTagKeys = '{1}'", result.ID, string.Join(",",result.RelatedTagKeys.Select(x => x.ToString())));
}
Console.ReadLine();
}
}
public class Data
{
public static List<Data> data { get; set; }
public int ID { get; set; }
public int TagID { get; set; }
public int RelatedTagID { get; set; }
}
}
