With LINQ DISTINCT a Data Table Multiple Columns Excluding a Single Column

I have a C# DataTable that I fill with data. After that, I want to keep only the distinct entries while creating a List<MyObject>.
Here is the code I am working with:
viewModelList = (from item in response.AsEnumerable()
                 select new
                 {
                     description = DataTableOperationHelper.GetStringValue(item, "description"),
                     unitCost = DataTableOperationHelper.GetDecimalValue(item, "unitcost"),
                     defaultChargeable = DataTableOperationHelper.GetBoolValue(item, "defaultChargeable"),
                     contractId = DataTableOperationHelper.GetIntValue(item, "contractID"),
                     consumableid = DataTableOperationHelper.GetIntValue(item, "consumableid")
                 })
                .Distinct()
                .Select(x => new ConsumablesViewModel(
                    x.description,
                    x.unitCost,
                    x.defaultChargeable,
                    x.contractId,
                    x.consumableid))
                .ToList();
I just want to exclude a single column (consumableid) from the DISTINCT comparison. How can I apply DISTINCT to the rest of the data while ignoring that one value (consumableid)?

Take a look at this answered question (LinQ distinct with custom comparer leaves duplicates).
Basically, you create an equality comparer for your type that allows you to decide what makes an object distinct.
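A minimal sketch of such a comparer follows. The ConsumablesViewModel class here is a hypothetical stand-in for the asker's type (its real property names are not shown in the question), and IgnoreConsumableIdComparer is an illustrative name. The comparer treats two items as equal when every field except the consumable ID matches, and it hashes exactly the same fields so that Distinct can bucket them correctly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's ConsumablesViewModel.
public sealed class ConsumablesViewModel
{
    public ConsumablesViewModel(string description, decimal unitCost,
        bool defaultChargeable, int contractId, int consumableId)
    {
        Description = description;
        UnitCost = unitCost;
        DefaultChargeable = defaultChargeable;
        ContractId = contractId;
        ConsumableId = consumableId;
    }

    public string Description { get; }
    public decimal UnitCost { get; }
    public bool DefaultChargeable { get; }
    public int ContractId { get; }
    public int ConsumableId { get; }
}

// Treats two view models as equal when every field except ConsumableId matches.
public sealed class IgnoreConsumableIdComparer : IEqualityComparer<ConsumablesViewModel>
{
    public bool Equals(ConsumablesViewModel x, ConsumablesViewModel y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Description == y.Description
            && x.UnitCost == y.UnitCost
            && x.DefaultChargeable == y.DefaultChargeable
            && x.ContractId == y.ContractId;
    }

    // Hashes exactly the fields that Equals compares; ConsumableId is left out.
    public int GetHashCode(ConsumablesViewModel obj) =>
        (obj.Description, obj.UnitCost, obj.DefaultChargeable, obj.ContractId).GetHashCode();
}

public static class DistinctDemo
{
    // Removes items that differ only in ConsumableId.
    public static List<ConsumablesViewModel> DistinctIgnoringId(
        IEnumerable<ConsumablesViewModel> items) =>
        items.Distinct(new IgnoreConsumableIdComparer()).ToList();
}
```

With this approach the view models are built first and Distinct is applied afterwards with the comparer. Each surviving item still carries the consumable ID of one of its duplicates, so that value should not be relied on after deduplication.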

Related

Why is linq reversing order in group by

I have a linq query which seems to be reversing one column of several in some rows of an earlier query:
var dataSet = from fb in ds.Feedback_Answers
              where fb.Feedback_Questions.Feedback_Questionnaires.QuestionnaireID == criteriaType
                    && fb.UpdatedDate >= dateFeedbackFrom && fb.UpdatedDate <= dateFeedbackTo
              select new
              {
                  fb.Feedback_Questions.Feedback_Questionnaires.QuestionnaireID,
                  fb.QuestionID,
                  fb.Feedback_Questions.Text,
                  fb.Answer,
                  fb.UpdatedBy
              };
This gets the first dataset and is confirmed working.
It is then grouped like this:
var groupedSet = from row in dataSet
                 group row by row.UpdatedBy
                 into grp
                 select new
                 {
                     Survey = grp.Key,
                     QuestionID = grp.Select(i => i.QuestionID),
                     Question = grp.Select(q => q.Text),
                     Answer = grp.Select(a => a.Answer)
                 };
While grouping, the resulting set (of type: string, list of int, list of string, list of int) sometimes, but not always, returns the questions in reverse order without reversing the answers or question IDs, which throws the lists out of alignment.
For example, if the set is questionID 1,2,3 and question A,B,C, it sometimes returns 1,2,3 and C,B,A.
Can anyone advise why it may be doing this? Why only on the one column? Thanks!
Edit: Got it, thanks all! In case it helps anyone in future, here is the solution used:
var groupedSet = from row in dataSet
                 group row by row.UpdatedBy
                 into grp
                 select new
                 {
                     Survey = grp.Key,
                     QuestionID = grp.OrderBy(x => x.QuestionID).Select(i => i.QuestionID),
                     Question = grp.OrderBy(x => x.QuestionID).Select(q => q.Text),
                     Answer = grp.OrderBy(x => x.QuestionID).Select(a => a.Answer)
                 };

The reversal of the grouped order is a coincidence: IQueryable<T>'s GroupBy returns groups in no particular order. Unlike the in-memory GroupBy, which specifies the order of its groups, the behavior of a query executed in an RDBMS depends on the implementation:
The query behavior that occurs as a result of executing an expression tree that represents calling GroupBy<TSource,TKey,TElement>(IQueryable<TSource>, Expression<Func<TSource,TKey>>, Expression<Func<TSource,TElement>>) depends on the implementation of the type of the source parameter.
If you want your rows in a specific order, you need to add OrderBy to your query to force it.
How do I do it and maintain the relative list order, rather than applying an order to the resulting set?
One approach is to group your data after bringing it into memory. Call ToList() on dataSet to bring the data into memory. After that, the order of the subsequent GroupBy query will be consistent with dataSet. A drawback is that the grouping is no longer done in the RDBMS.
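The in-memory approach can be sketched as below. AnswerRow is a hypothetical record standing in for the anonymous type produced by the first query, and GroupInMemory is an illustrative name. Rather than projecting three separate lists, this sketch keeps each row intact inside its group and sorts once, so the question ID, text, and answer cannot drift out of alignment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one row from the first query.
public sealed record AnswerRow(string UpdatedBy, int QuestionID, string Text, int Answer);

public static class GroupingDemo
{
    // rows is assumed to be materialized already (for example via ToList()),
    // so this GroupBy is LINQ to Objects, not a database query.
    public static List<(string Survey, List<AnswerRow> Rows)> GroupInMemory(
        IEnumerable<AnswerRow> rows) =>
        rows.GroupBy(r => r.UpdatedBy)
            // One sort per group keeps all columns of a row together.
            .Select(g => (Survey: g.Key, Rows: g.OrderBy(r => r.QuestionID).ToList()))
            .ToList();
}
```

The individual columns can still be read off afterwards, for example with group.Rows.Select(r => r.Text), and they will share the same order because they come from the same sorted list.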

Linq Distinct not bringing back the correct results

I'm trying to select distinct values from a DataTable using LINQ. The DataTable is populated from an Excel sheet whose columns are dynamic, except that every sheet has a mandatory column named SERIAL NUMBER.
I have a DataTable for demo purposes which consists of 4 serial numbers:
12345
12345
98765
98765
When I do
var distinctList = dt.AsEnumerable().Select(a => a).Distinct().ToList();
I get all four records instead of 2. If I do
var distinctList = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();
then I get the correct results, but the list only contains that one column from dt and none of the other columns.
Can someone tell me where I'm going wrong, please?

The problem is that the Distinct method uses the default equality comparer, which for DataRow compares by reference. To get the desired result, use the Distinct overload that accepts an IEqualityComparer<T> and pass DataRowComparer.Default. As the documentation puts it:
The DataRowComparer<TRow> class is used to compare the values of the DataRow objects and does not compare the object references.
var distinctList = dt.AsEnumerable().Distinct(DataRowComparer.Default).ToList();
For more info, see Comparing DataRows (LINQ to DataSet).
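The difference between the two comparisons can be seen in a small sketch. The table below is sample data invented for illustration: the MODEL column is a hypothetical extra column, and the helper names are not from the original answer.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DataRowDistinctDemo
{
    // Builds a demo table with four rows, two of each serial number.
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("SERIAL NUMBER", typeof(string));
        dt.Columns.Add("MODEL", typeof(string)); // hypothetical extra column
        dt.Rows.Add("12345", "A");
        dt.Rows.Add("12345", "A");
        dt.Rows.Add("98765", "B");
        dt.Rows.Add("98765", "B");
        return dt;
    }

    // Default comparer: every DataRow instance is unique, so nothing is removed.
    public static int CountByReference(DataTable dt) =>
        dt.AsEnumerable().Distinct().Count();

    // DataRowComparer.Default: rows are equal when all column values are equal.
    public static int CountByValue(DataTable dt) =>
        dt.AsEnumerable().Distinct(DataRowComparer.Default).Count();
}
```

Note that DataRowComparer.Default compares every column. Rows that share a serial number but differ in any other column still count as distinct, so this only answers the question when whole rows are duplicated.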

So, you want to group them by Serial Number and retrieve the full DataRow? Assuming that after grouping them we want to retrieve the first item:
var distinctList = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER"))
.Select(a => a.FirstOrDefault()).Distinct().ToList();
EDIT: As requested
var distinctValues = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();
var duplicateValues = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).SelectMany(a => a.Skip(1)).Distinct().ToList();
var duplicatesRemoved = dt.AsEnumerable().Except(duplicateValues);

In the ToTable method, the first parameter specifies whether you want distinct rows, and the remaining parameters name the columns to include in the returned table. Distinctness is evaluated over those columns only.
DataTable returnVals = dt.DefaultView.ToTable(true, "ColumnNameOnWhichYouWantDistinctRecords");
There is no need to use LINQ for this task!
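As a rough illustration of what ToTable returns, the sketch below uses invented sample data (the MODEL column and the helper names are hypothetical). Asking for one column gives a one-column table of distinct serial numbers; asking for two columns makes the distinct check apply to the pair.

```csharp
using System;
using System.Data;

public static class ToTableDemo
{
    // Demo table where one serial number appears with two different models.
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("SERIAL NUMBER", typeof(string));
        dt.Columns.Add("MODEL", typeof(string)); // hypothetical extra column
        dt.Rows.Add("12345", "A");
        dt.Rows.Add("12345", "B");
        dt.Rows.Add("98765", "C");
        dt.Rows.Add("98765", "C");
        return dt;
    }

    // Distinct over one column: the result contains only that column.
    public static DataTable DistinctSerials(DataTable dt) =>
        dt.DefaultView.ToTable(true, "SERIAL NUMBER");

    // Distinct over both columns: rows differ if either value differs.
    public static DataTable DistinctSerialAndModel(DataTable dt) =>
        dt.DefaultView.ToTable(true, "SERIAL NUMBER", "MODEL");
}
```

Because the returned table holds only the named columns, this fits when the distinct key is all you need. It does not return the full original rows for each distinct serial number.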

Using LINQ, a GroupBy would be better suited, by the sounds of it.
var groups = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).Select(g => new { Key = g.Key, Items = g });
This will then contain groupings based on the serial number, with each group holding the rows that share a serial number but may differ in their other column values.

Try this:
List<string> distinctValues = (from row in dt.AsEnumerable() select row.Field<string>("SERIAL NUMBER")).Distinct().ToList();
However to me this also works:
List<string> distinctValues = dt.AsEnumerable().Select(row => row.Field<string>("SERIAL NUMBER")).Distinct().ToList();

Lambda Distinct not working

I am unable to get a distinct list of 'Order' from my lambda query. Even though I am using Distinct(), it still returns repeated select list items.
public ActionResult Index()
{
    var query = _dbContext.Orders
        .ToList()
        .Select(x => new SelectListItem
        {
            Text = x.OrderID.ToString(),
            Value = x.ShipCity
        })
        .OrderBy(y => y.Value)
        .Distinct();
    ViewBag.DropDownValues = new SelectList(query, "Text", "Value");
    return View();
}
Any suggestions please?
UPDATE
Sorry guys, I genuinely left Distinct() out of my code. I have now added it.
I am basically trying to get the distinct rows where the values are the same but the IDs are different.
The same as this SQL query:
SELECT distinct [ShipCity] FROM [northwind].[dbo].[Orders] ORDER by ShipCity

I'm assuming you removed your Distinct from the end of the query.
For that matter, I don't see how you could get duplicate orders at all: your query does nothing except select from a database table, so you can't get the same row multiple times.
What do you call a "duplicate"? If you mean two rows with the same values except for their ID, that's not a duplicate at all; those are just two unrelated rows with the same values.
If, on the other hand, you expect them to be equal because you call .Distinct() after the Select and only use OrderID and ShipCity in there (and I really don't see why a column named OrderID in an orders table should have duplicates, but that's another issue), that still won't work. You're not selecting OrderID or ShipCity; you're selecting a new SelectListItem. Two instances of a reference type that hold the same values are not equal in .NET by default: unless the type overrides Equals, they need to be the same instance to be equal.
Edited following your comment:
var query = _dbContext.Orders
    .ToList()
    // Group them by what you want to be "distinct" on
    .GroupBy(item => item.ShipCity)
    // For each of those groups grab the first item (we just faked a distinct)
    .Select(item => item.First())
    .Select(x => new SelectListItem
    {
        Text = x.OrderID.ToString(),
        Value = x.ShipCity
    })
    .OrderBy(y => y.Value)
    .Distinct();
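If only the city names are needed, as in the SQL from the question, an alternative sketch is to project to the string first and call Distinct on that, since strings compare by value. The Order record below is a hypothetical stand-in for the entity, and DistinctCities is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Orders entity.
public sealed record Order(int OrderID, string ShipCity);

public static class ShipCityDemo
{
    // Mirrors SELECT DISTINCT ShipCity ... ORDER BY ShipCity.
    public static List<string> DistinctCities(IEnumerable<Order> orders) =>
        orders.Select(o => o.ShipCity) // project to a value-comparable type first
              .Distinct()
              .OrderBy(city => city)
              .ToList();
}
```

The resulting strings can then be turned into select list items in a final Select. The GroupBy version above is the one to use when an OrderID has to be kept alongside each city.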

How to convert to int and then compare in linq query c#

IEnumerable<classB> list = getItems();
//dt is datatable
list = list.Where(x => Convert.ToInt32( !dt.Columns["Id"]) == (x.Id));
I want to keep only the items in the list whose Id matches a value in the DataTable's Id column; the rest should be removed. I'm not doing it right.
The datatable can have: ID - 1,3,4,5,7
The list can have: ID - 1,2,3,4,5,6,7,8,9,10
I want the output list to have: ID - 1,3,4,5,7

Your code won't work because you're comparing a definition of a column to an integer value. That's not a sensible comparison to make.
What you can do is put all of the values from the data table into a collection that can be effectively searched and then get all of the items in the list that are also in that collection:
var ids = new HashSet<int>(dt.AsEnumerable()
    .Select(row => row.Field<int>("Id")));
list = list.Where(x => ids.Contains(x.Id));
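Put together as a self-contained sketch, using the IDs from the question. ClassB is a hypothetical stand-in for the asker's classB, and the Id column is assumed to be stored as int; Field<int> throws if the column holds another type.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical stand-in for the asker's classB.
public sealed class ClassB
{
    public int Id { get; set; }
}

public static class IdFilterDemo
{
    // Keeps only the items whose Id appears in the table's "Id" column.
    public static List<ClassB> KeepMatching(IEnumerable<ClassB> list, DataTable dt)
    {
        // Assumes the "Id" column has type int and contains no nulls.
        var ids = new HashSet<int>(dt.AsEnumerable().Select(row => row.Field<int>("Id")));
        return list.Where(x => ids.Contains(x.Id)).ToList();
    }
}
```

Building the set once and then probing it keeps each lookup cheap, which matters more as the table grows.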

Try this one
var idList = dt.AsEnumerable().Select(d => (int) d["Id"]).ToList();
list = list.Where(x => idList.Contains(x.Id));

You can't do it like that. dt.Columns["Id"] returns the DataColumn, not the value stored in that column for a specific DataRow. You need a join between two LINQ queries: the first is the list you already have, the other has to come from the DataTable.
var queryDt = (from dtRow in dt.AsEnumerable()
               where !dtRow.IsNull("Id")
               select Convert.ToInt32(dtRow["Id"])).ToList();
Now the join:
var qry = from nonNull in queryDt
          join existing in list on nonNull equals existing.Id
          select existing;

LINQ contains between 2 lists

I have a List<string> and a List<supplier>.
The string list contains some search terms and the supplier list contains supplier objects.
Now I need to find all the suppliers whose names match any of the items in the List<string>.
This is one of my failed attempts:
var query = some join with the supplier table.
query = query.where(k=>stringlist.contains(k.companyname)).select (...).tolist();
Any idea how to do that?
EDIT:
Maybe my question is not clear enough. I need to find a list of suppliers (not only the names, the whole objects) whose names match any of the items in the string list.
If I do
query = query.where(k=>k.companyname.contains("any_string")).select (...).tolist();
it works, but this is not my requirement.
My requirement is a list of strings, not a single string.

The following query returns the distinct supplier names that exist in the list of names:
suppliers.Where(s => stringlist.Contains(s.CompanyName))
         .Select(s => s.CompanyName) // remove if you need the whole supplier object
         .Distinct();
The generated SQL query will look like:
SELECT DISTINCT [t0].[CompanyName]
FROM [dbo].[Supplier] AS [t0]
WHERE [t0].[CompanyName] IN (@p0, @p1, @p2)
BTW, consider using better names, e.g. companyNames instead of stringlist.

You could use Intersect for this (for just matching names):
var suppliersInBothLists = supplierNames
.Intersect(supplierObjects.Select(s => s.CompanyName));
After your EDIT, for suppliers (not just names):
var suppliers = supplierObjects.Where(s => supplierNames.Contains(s.CompanyName));

var matches = yourList.Where(x => stringList.Contains(x.CompanyName)).Select(x => x.CompanyName).ToList();

Either use a join as Tim suggested, or just use a HashSet directly. This is much more efficient than using .Contains on a List, as some of the other answers do.
var stringSet = new HashSet<string>(stringList);
var result = query.Where(q => stringSet.Contains(q.Name));
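A self-contained sketch of the HashSet approach for in-memory data follows. The Supplier record is a hypothetical stand-in for the asker's type. The case-insensitive comparer is an assumption added here for illustration, not part of the original answer; drop it if names must match exactly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's supplier type.
public sealed record Supplier(int Id, string CompanyName);

public static class SupplierFilterDemo
{
    // Returns whole supplier objects whose CompanyName is in the names list.
    public static List<Supplier> MatchByName(
        IEnumerable<Supplier> suppliers, IEnumerable<string> names)
    {
        // Assumption: matching ignores case. Use new HashSet<string>(names) for exact matches.
        var nameSet = new HashSet<string>(names, StringComparer.OrdinalIgnoreCase);
        return suppliers.Where(s => nameSet.Contains(s.CompanyName)).ToList();
    }
}
```

This is meant for LINQ to Objects. When the query runs against a database, whether a HashSet lookup can be translated to SQL depends on the LINQ provider, so the plain list Contains shown in the other answers may be the safer choice there.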
