Linq Query Grouping Data table using two columns - c#

I've got one question here. I'm a newbie, so pardon my terminology. I am querying a data table and need to group its rows by date and by their unique access code.
var tMainTable = (from System.Data.DataRow b in _tData.data_table.Rows
group b by b["ACCESS_CODE"] into bGroup
select new
{ bGroup });
In my current grouping above, I am grouping my data table by access code. My data table is composed of 3 fields: DATE, ACCESS_CODE, COUNT. Note that I can't call AsEnumerable() on my data table.
So this time I want to add a second condition, grouping by date as well. Is there such a thing as:
var tMainTable = (from System.Data.DataRow b in _tData.data_table.Rows
**group b by b["ACCESS_CODE"] AND b["DATE"] into bGroup**
select new
{ bGroup });
Thanks for any inputs.

Use an anonymous type for your grouping:
var codeDateGroups = _tData.data_table.AsEnumerable()
.GroupBy(r => new {
AccessCode = r.Field<string>("ACCESS_CODE"),
Date = r.Field<DateTime>("DATE")
});
You can access it via the Key:
foreach(var group in codeDateGroups)
Console.WriteLine("Code=[{0}] Date=[{1}]"
, group.Key.AccessCode
, group.Key.Date);

var tMainTable = (from System.Data.DataRow b in _tData.data_table.Rows
group b by new { AccessCode = b["ACCESS_CODE"], Date = b["DATE"] } into bGroup
select new
{ bGroup });

var groups = _tData.data_table.AsEnumerable()
.GroupBy(row => new { AccessCode = row["ACCESS_CODE"], Date = row["DATE"] });
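For readers who want to try the anonymous-type grouping end to end, here is a minimal sketch. The sample rows, the column types (DateTime, string, int) and the names `table`, `Rows` and `Total` are assumptions made for illustration; only the column names come from the question. It goes through `Rows.Cast<DataRow>()`, so it does not depend on AsEnumerable().

```csharp
using System;
using System.Data;
using System.Linq;

// Written as a top-level program (C# 9 or later); wrap in Main for older compilers.
// Hypothetical sample data using the three column names from the question.
var table = new DataTable();
table.Columns.Add("DATE", typeof(DateTime));
table.Columns.Add("ACCESS_CODE", typeof(string));
table.Columns.Add("COUNT", typeof(int));
table.Rows.Add(new DateTime(2013, 1, 1), "A1", 2);
table.Rows.Add(new DateTime(2013, 1, 1), "A1", 3);
table.Rows.Add(new DateTime(2013, 1, 2), "A1", 1);
table.Rows.Add(new DateTime(2013, 1, 1), "B7", 4);

// Group on a composite key: two rows land in the same group only if both
// the access code and the date are equal.
var groups = table.Rows.Cast<DataRow>()
    .GroupBy(r => new
    {
        AccessCode = (string)r["ACCESS_CODE"],
        Date = (DateTime)r["DATE"]
    })
    .Select(g => new
    {
        g.Key.AccessCode,
        g.Key.Date,
        Rows = g.Count(),                       // number of rows in the group
        Total = g.Sum(r => (int)r["COUNT"])     // sum of the COUNT column
    })
    .ToList();

foreach (var g in groups)
    Console.WriteLine($"Code={g.AccessCode} Date={g.Date:yyyy-MM-dd} Rows={g.Rows} Total={g.Total}");
```

With this sample data there are three groups, because the two "A1" rows dated 2013-01-01 share a key while the other two rows each have a key of their own.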

Related

List<T> joins DataTable

I have a List of objects (lst) and DataTable (dt). I want to join the lst and dt on the common field (code as string) and need to return all matching rows in the lst.
My List contains two columns i.e code and name along with values below:
code name
==== ====
1 x
2 y
3 z
The DataTable contains two columns i.e code and value along with values below:
code value
==== =====
3 a
4 b
5 c
The result is:
3 z
Below is my code; but I know it is not a correct statement and thus seeking your advice here. I would be much appreciated if you could guide me on how to write the correct statement.
var ld = from l in lst
join d in dt.AsEnumerable() on l.code equals d.code
select new { l.code, l.name };
You can use a LINQ query or the Join extension method to join the two collections on code. The only catch is that when you read data from the DataTable, you need to use the Field method on each row. Use any of the following.
Query1:
var ld = lst.Join(dt.AsEnumerable(),
l => l.code,
d => d.Field<string>("code"),
(l, d) => new
{
l.code,
l.name,
value = d.Field<string>("value")
}).ToList();
Query2:
var ld = (from l in lst
join d in dt.AsEnumerable()
on l.code equals d.Field<string>("code")
select new
{
l.code,
l.name,
value = d.Field<string>("value")
}).ToList();
Query3:
var ld = (from l in lst
join d in dt.AsEnumerable()
on l.code equals d.Field<string>("code")
let value = d.Field<string>("value")
select new
{
l.code,
l.name,
value
}).ToList();
You can try any of the below.
var ld = from l in lst
join d in dt.AsEnumerable() on l.code equals d.Field<string>("code")
select new { l.code, l.name };
var ld = lst.Join(dt.AsEnumerable(), l => l.code, d => d.Field<string>("code"), (l,d) => new { l.code, l.name });
It's not clear what your required output is, but it looks as if you are correctly getting only the common records. You could extend your select to
select new { l.code, l.name, value = d.Field<string>("value") }
which would give all the columns from both tables.
code name value
==== ==== =====
3 z a
Try this:
var ld = from l in lst
join d in dt.Rows.Cast<DataRow>() on l.code equals d["code"].ToString()
select new { l.code, l.name };
So you have a List and a DataTable. You don't plan to use the Values of the DataTable, only the Codes.
You want to keep those List items that have a Code that is also a Code in the DataTable.
If you plan to use your DataTable for other things than just this problem, my advice would be to first create a procedure that converts your DataTable into an enumerable sequence.
This way you can add LINQ statements, not only for this problem, but also for other problems.
Let's create an extension method for your DataTable that converts the rows into the items they represent. See extension methods demystified.
Alas, I don't know what's in your DataTable, so let's assume that it contains Orders:
class CustomerOrder
{
public int Id {get; set;}
public int CustomerId {get; set;}
public int Code {get; set;}
public string Value {get; set;}
...
}
The extension method that extends functionality of class DataTable:
public static IEnumerable<CustomerOrder> ToCustomerOrders(this DataTable table)
{
return table.AsEnumerable().Select(row => new CustomerOrder
{
Id = ...
CustomerId = ...
Code = ...
Value = ...
});
}
I'm not really familiar with DataTables, but you know how to convert the cells of the row into the proper value.
Usage:
DataTable table = ...
int customerId = 14;
var ordersOfThisCustomer = table.ToCustomerOrders()
.Where(customerOrder => customerOrder.CustomerId == customerId)
.FirstOrDefault();
In words: convert the DataTable into CustomerOrders, row by row, and check whether each converted CustomerOrder has a CustomerId equal to 14. Stop at the first match; return null if there is no such row.
Now that you've got a nice reusable procedure that is also easy to test, debug and change, we can answer your question.
Given a DataTable with CustomerOrders, and a sequence of items that contain Code and Name, keep only those items from the sequence that have a Code that is also a Code in the DataTable.
var dataTable = ... // your DataTable, filled with CustomerOrders.
var codeNames = ... // your list with Codes and Names
var codesInDataTable = dataTable.ToCustomerOrders()
.Select(customerOrder => customerOrder.Code)
.Distinct();
This will create an enumerable sequence that will convert your DataTable row by row and extract property Code. Duplicate Code values will be removed.
If Codes are unique, you don't need Distinct.
Note: the enumerable sequence is not enumerated yet!
var result = codeNames
.Where(codeName => codesInDataTable.Contains(codeName.Code))
.ToList();
In words: for every [Code, Name] combination in your list, keep only those [Code, Name] combinations that have a value for Code that is also in codesInDataTable.
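The "keep only list items whose code appears in the table" idea can also be sketched without the intermediate CustomerOrder class. The snippet below is an illustration under assumptions: the tuple list, the names `lst`, `dt` and `codesInTable`, and the string column types are stand-ins modeled on the question's sample data. It collects the codes into a HashSet once, so the table is not re-enumerated for every list item.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Written as a top-level program (C# 9 or later); wrap in Main for older compilers.
// Hypothetical data mirroring the question: a list of (code, name) pairs
// and a DataTable with columns code and value.
var lst = new List<(string code, string name)> { ("1", "x"), ("2", "y"), ("3", "z") };

var dt = new DataTable();
dt.Columns.Add("code", typeof(string));
dt.Columns.Add("value", typeof(string));
dt.Rows.Add("3", "a");
dt.Rows.Add("4", "b");
dt.Rows.Add("5", "c");

// Materialize the table's codes once; HashSet also removes duplicates,
// so no separate Distinct() call is needed.
var codesInTable = new HashSet<string>(
    dt.Rows.Cast<DataRow>().Select(row => (string)row["code"]));

// Keep only the list items whose code also occurs in the table.
var result = lst.Where(item => codesInTable.Contains(item.code)).ToList();

foreach (var item in result)
    Console.WriteLine($"{item.code} {item.name}");   // prints "3 z"
```

Compared with a join, this form makes it explicit that nothing from the table other than the code is used in the result.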

Sum and Group by in linq using Datarows

Full disclosure, I'm pretty much a total noob when it comes to LINQ. I could be way off base on how I should be approaching this.
I have a DataTable with 3 columns
oid,idate,amount
each id has multiple dates, and each date has multiple amounts. What I need to do is sum the amount for each day for each id, so instead of:
id,date,amount
00045,02/13/2011,11.50
00045,02/14/2011,11.00
00045,02/14/2011,12.00
00045,02/15/2011,10.00
00045,02/15/2011,5.00
00045,02/15/2011,12.00
00054,02/13/2011,8.00
00054,02/13/2011,9.00
I would have:
id,date,SumOfAmounts
00045,02/13/2011,11.50
00045,02/14/2011,23.00
00045,02/15/2011,27.00
00054,02/13/2011,17.00
private void excelDaily_Copy_Into(DataTable copyFrom, DataTable copyTo)
{
var results = from row in copyFrom.AsEnumerable()
group row by new
{
oid = row["oid"],
idate = row["idate"]
} into n
select new
{
///unsure what to do
}
};
I've tried a dozen or so different ways of doing this and I always hit a wall where I can't figure out how to progress. I've been all over Stack Overflow and MSDN, and nothing so far has really helped me.
Thank you in advance!
You could try this:
var results = from row in copyFrom.AsEnumerable()
group row by new
{
oid = row.Field<int>("oid"),// Or string, depending what is the real type of your column
idate = row.Field<DateTime>("idate")
} into g
select new
{
g.Key.oid,
g.Key.idate,
SumOfAmounts = g.Sum(e => e.Field<decimal>("amount"))
};
I suggest to use Field extension method which provides strongly-typed access to each of the column values in the specified row.
Although you don't specify it, copyFrom is apparently an object of a class DataTable that can be enumerated.
According to MSDN, System.Data.DataTable does not itself implement IEnumerable (AsEnumerable() is an extension method). Without it, you can use the Rows property, which returns a collection of rows that implements IEnumerable:
IEnumerable<DataRow> rows = copyFrom.Rows.Cast<DataRow>();
If you use a different DataTable class, you'll probably do something similar to cast it to a sequence of DataRow.
An object of class System.Data.DataRow has an indexer to access the columns in the row. In your case the column names are oid, idate and amount.
To convert copyFrom into the sequence of items you want to process:
var itemsToProcess = copyFrom.Rows.Cast<DataRow>()
.Select(row => new
{
Id = row["oid"],
Date = (DateTime)row["idate"],
Amount = (decimal)row["amount"],
});
I'm not sure, but I assume that column idate contains dates and column amount contains some value. Feel free to use other types if your columns contain other types.
If your columns contain strings, convert them to the proper types using Parse:
var itemsToProcess = copyFrom.Rows.Cast<DataRow>()
.Select(row => new
{
Id = (string)row["oid"],
Date = DateTime.Parse((string)row["idate"]),
Amount = Decimal.Parse((string)row["amount"]),
});
If you are unfamiliar with lambda expressions, it helped me a lot to read this as follows:
itemsToProcess is a collection of items, taken from the collection of
DataRows, where from each row in this collection we created a new
object with three properties: Id = ...; Date = ...; Amount = ...
See:
Explanation of Standard LINQ operations for Cast and Select
Anonymous Types
Now we have a sequence where we can compare dates and sum the amounts.
What you want is to group all items in this sequence into groups with the same Id and Date. So you want one group with Id = 00045 and Date = 02/13/2011, another group with Id = 00045 and Date = 02/14/2011, and so on.
For this you use Enumerable.GroupBy. As the key selector (what all items in one group have in common) you use the combination of Id and Date:
var groups = itemsToProcess.GroupBy(item => new
{Id = item.Id, Date = item.Date} );
Now you have groups.
Each group has a property Key, of a type with two properties: Id and Date.
Each group is a sequence of items from your itemsToProcess collection (so each element is an "itemToProcess" with Id / Date / Amount properties).
All items in one group have the same Id and the same Date.
So all you have to do is sum the Amount of all elements in each group.
var resultSequence = groups.Select(groupItem => new
{
Id = groupItem.Key.Id,
Date = groupItem.Key.Date,
Sum = groupItem.Sum(itemToProcess => itemToProcess.Amount),
});
So putting it all together into one statement:
var resultSequence = copyFrom.Rows.Cast<DataRow>()
.Select(row => new
{
Id = (string)row["oid"],
Date = DateTime.Parse((string)row["idate"]),
Amount = Decimal.Parse((string)row["amount"]),
})
.GroupBy(itemToProcess => new
{
Id = itemToProcess.Id,
Date = itemToProcess.Date
})
.Select(groupItem => new
{
Id = groupItem.Key.Id,
Date = groupItem.Key.Date,
Sum = groupItem.Sum(itemToProcess => itemToProcess.Amount),
});
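As a rough, runnable illustration of the group-then-sum approach, including the step of writing the sums into a second table (which the question's method signature suggests but the answers do not show), consider the sketch below. The column types (string, DateTime, decimal), the schema of `copyTo` and the sample rows are assumptions, not something the question confirms.

```csharp
using System;
using System.Data;
using System.Linq;

// Written as a top-level program (C# 9 or later); wrap in Main for older compilers.
// Hypothetical source table; the column types are assumptions.
var copyFrom = new DataTable();
copyFrom.Columns.Add("oid", typeof(string));
copyFrom.Columns.Add("idate", typeof(DateTime));
copyFrom.Columns.Add("amount", typeof(decimal));
copyFrom.Rows.Add("00045", new DateTime(2011, 2, 13), 11.50m);
copyFrom.Rows.Add("00045", new DateTime(2011, 2, 14), 11.00m);
copyFrom.Rows.Add("00045", new DateTime(2011, 2, 14), 12.00m);
copyFrom.Rows.Add("00054", new DateTime(2011, 2, 13), 8.00m);
copyFrom.Rows.Add("00054", new DateTime(2011, 2, 13), 9.00m);

// Hypothetical destination table with one row per (oid, idate) pair.
var copyTo = new DataTable();
copyTo.Columns.Add("oid", typeof(string));
copyTo.Columns.Add("idate", typeof(DateTime));
copyTo.Columns.Add("SumOfAmounts", typeof(decimal));

// Group on the composite key, then sum the amounts inside each group.
var sums = copyFrom.Rows.Cast<DataRow>()
    .GroupBy(row => new { Id = (string)row["oid"], Date = (DateTime)row["idate"] })
    .Select(g => new { g.Key.Id, g.Key.Date, Sum = g.Sum(row => (decimal)row["amount"]) });

// Enumerating the query runs it; each result becomes one destination row.
foreach (var s in sums)
    copyTo.Rows.Add(s.Id, s.Date, s.Sum);

foreach (DataRow row in copyTo.Rows)
    Console.WriteLine($"{row["oid"]} {row["idate"]:yyyy-MM-dd} {row["SumOfAmounts"]}");
```

GroupBy on in-memory sequences yields groups in the order their keys first appear, so the destination rows follow the order of the source data.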

LINQ Query To Join Two Tables and Select Most Recent Records from Table B corresponding to Table A

I have two tables. Table One contains a list of Areas, and Table Two contains a list of Samples, with each Sample row containing Area_ID as a Foreign Key.
I need to retrieve all the records in my Area table with only the most recent corresponding Sample Status. I have this query, but it just returns one Area with the most recent sample from the Sample table:
var result = (
from a in db.area
join c in db.sample
on a.location_id equals c.location_id
select new
{
name = a.location_name,
status = c.sample_status,
date = c.sample_date
}).OrderByDescending(c => c.date).FirstOrDefault();
A solution could be to filter the second DbSet per area with a correlated subquery (areas without any sample are left out):
var result = from a in db.area
from c in db.sample.Where(s => s.location_id == a.location_id).OrderByDescending(s => s.sample_date).Take(1)
select new
{
name = a.location_name,
status = c.sample_status,
date = c.sample_date
};
Another solution could be applying a group join:
var result = from a in db.area
join c in db.sample
on a.location_id equals c.location_id into samples
let sample=samples.OrderByDescending(c => c.sample_date).FirstOrDefault()
select new
{
name = a.location_name,
status = sample.sample_status,
date = sample.sample_date
};
If you use navigation properties, it could be even easier. Supposing you have a one-to-many relationship between Area and Sample:
var result =from a in db.area
let sample= a.Samples.OrderByDescending(c => c.sample_date).FirstOrDefault()
select new
{
name = a.location_name,
status = sample.sample_status,
date = sample.sample_date
};
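To see the group-join shape in isolation, here is a small in-memory sketch. It runs against LINQ to Objects rather than a database, and the arrays `areas` and `samples`, their values and the name `latest` are hypothetical stand-ins for db.area and db.sample. The null-conditional operator used here is fine for in-memory sequences but is not allowed inside queries that are translated to expression trees, so treat this as an illustration of the logic, not as a drop-in Entity Framework query.

```csharp
using System;
using System.Linq;

// Written as a top-level program (C# 9 or later); wrap in Main for older compilers.
// Hypothetical in-memory stand-ins for db.area and db.sample.
var areas = new[]
{
    new { location_id = 1, location_name = "North" },
    new { location_id = 2, location_name = "South" },
    new { location_id = 3, location_name = "East" }    // deliberately has no samples
};
var samples = new[]
{
    new { location_id = 1, sample_status = "Pass", sample_date = new DateTime(2016, 1, 5) },
    new { location_id = 1, sample_status = "Fail", sample_date = new DateTime(2016, 3, 9) },
    new { location_id = 2, sample_status = "Pass", sample_date = new DateTime(2016, 2, 1) }
};

// Group join: every area is kept, paired with the (possibly empty) set of its samples.
var result = (from a in areas
              join s in samples on a.location_id equals s.location_id into areaSamples
              let latest = areaSamples.OrderByDescending(x => x.sample_date).FirstOrDefault()
              select new
              {
                  name = a.location_name,
                  status = latest?.sample_status,   // null when the area has no samples
                  date = latest?.sample_date        // DateTime? for the same reason
              }).ToList();

foreach (var r in result)
    Console.WriteLine($"{r.name}: {r.status ?? "(no samples)"} {r.date:yyyy-MM-dd}");
```

Each area appears exactly once: "North" carries its most recent sample (the one dated 2016-03-09) and "East" is kept with null status and date.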

Group By with multiple column

I have data in a table as below
RowId | User | Date
--------------------------
1 A 2015-11-11 08:50:48.243
2 A 2015-11-11 08:51:01.433
3 B 2015-11-11 08:51:05.210
Trying to get the data as below:
User, Date, Count
A 2015-11-11 2
B 2015-11-11 1
Select User,Date,Count(User) from Table1
Group By User,Date
It is returning me 3 rows because of time involved in Date field.
How to get this in SQL and Linq.
Please suggest me.
EDITING:
I am able to get it in SQL
Select User,Cast(Date as Date),Count(User) from Table1
Group By User,Cast(Date as Date)
EDITING:
adding linq query
var details = db.table1.GroupBy( r => new { r.RowId,r.User,r.Date})
.Select(g => new {Name = g.Key, Count = g.Count()}).ToList();
For the LINQ query, do the following (you need to import the System.Data.Entity.SqlServer namespace).
When this query executes, all calculations are done on the database server. Note that Table1s represents the DbSet for Table1 and context is your DbContext instance.
var query = from item in context.Table1s
group item by new
{
item.User,
Year = SqlFunctions.DatePart("yyyy", item.Date),
Month = SqlFunctions.DatePart("mm", item.Date),
Day = SqlFunctions.DatePart("dd", item.Date)
} into g
select new { g.Key.User, g.Key.Year, g.Key.Month, g.Key.Day, Count = g.Count() };
Then create the final result like this:
var result = query.ToList().Select(p =>
new
{
p.User,
Date = new DateTime(p.Year.Value, p.Month.Value, p.Day.Value),
p.Count
}).ToList();
Another solution is to create a SQL view that the DbContext uses to retrieve the data you want. The body of the view must be the SQL you wrote in your question.
EDIT 2 : DbFunctions
As Cetin Basoz pointed out in the comments, we can use System.Data.Entity.DbFunctions as well, and the code is cleaner than with SqlFunctions. This works only with EF 6 and later; the version using SqlFunctions works with EF 4 and later.
var query = from item in context.Table1s
group item by new
{
item.User,
Date = DbFunctions.TruncateTime(item.Date)
} into g
select new { g.Key.User, g.Key.Date, Count = g.Count() };
EDIT 1 : this is specific to Cetin Basoz's answer:
Using AsEnumerable is not efficient here, because the grouping is then done in memory instead of on the server.
The second solution he gives is:
var grouped = from d in db.MyTable
group d by new {
User = d.User,
Date=d.Date.HasValue ? d.Date.Value.Date : (DateTime?)null} into g
select new {User=g.Key.User, Date=g.Key.Date, Count=g.Count()};
This solution does not work with LINQ to Entities; it fails with:
The specified type member 'Date' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.
If the time is the problem, you can first convert it:
select User, CAST(dateColumn AS DATE) as dateConverted
into #tempTable
from myTable
then using a window function or a group by:
select *,
count(user) over (partition by date) as userCount
from #tempTable
This should work in SQL server, don't know about Linq
edit: If the date part is the problem, just select into from your table to a table with the casted date. Then you won't have this problem in Linq.
var grouped = from d in db.MyTable.AsEnumerable()
group d by new {
User = d.User,
Date=d.Date.HasValue ? d.Date.Value.Date : (DateTime?)null} into g
select new {User=g.Key.User, Date=g.Key.Date, Count=g.Count()};
Sooner or later, someone will say that this is not server-side grouping and that performance will suffer, and they would be right. Without AsEnumerable() it runs server-side, but at the cost of another call per group, so here is another way:
public class MyResult
{
public string User {get;set;}
public DateTime? Date {get;set;}
public int Count {get;set;}
}
var grouped = db.ExecuteQuery<MyResult>(@"select [User],
Cast([Date] as Date) as [Date],
Count(*) as [Count]
from myTable
group by [user], Cast([Date] as Date)");
EDIT: I don't know why I thought otherwise before; this just works server-side, and AsEnumerable() was not needed:
var grouped = from d in db.MyTable
group d by new {
User = d.User,
Date=d.Date.HasValue ? d.Date.Value.Date : (DateTime?)null} into g
select new {User=g.Key.User, Date=g.Key.Date, Count=g.Count()};
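The effect of the time component on the grouping is easy to reproduce in memory. The sketch below uses LINQ to Objects with hypothetical rows modeled on the question's table (the array `rows` and the key name `Day` are illustrative); it shows the logic only and says nothing about how a particular provider translates `.Date` to SQL.

```csharp
using System;
using System.Linq;

// Written as a top-level program (C# 9 or later); wrap in Main for older compilers.
// Hypothetical rows modeled on the question's table.
var rows = new[]
{
    new { RowId = 1, User = "A", Date = new DateTime(2015, 11, 11, 8, 50, 48) },
    new { RowId = 2, User = "A", Date = new DateTime(2015, 11, 11, 8, 51, 1) },
    new { RowId = 3, User = "B", Date = new DateTime(2015, 11, 11, 8, 51, 5) }
};

// Grouping on the full DateTime keeps every row apart, because the times differ.
var byExactTime = rows.GroupBy(r => new { r.User, r.Date }).Count();

// Grouping on Date.Date drops the time component, so both "A" rows share one key.
var counts = rows
    .GroupBy(r => new { r.User, Day = r.Date.Date })
    .Select(g => new { g.Key.User, g.Key.Day, Count = g.Count() })
    .ToList();

Console.WriteLine($"groups when keyed on full timestamp: {byExactTime}");
foreach (var c in counts)
    Console.WriteLine($"{c.User} {c.Day:yyyy-MM-dd} {c.Count}");
```

Keyed on the full timestamp there are three groups; keyed on the date part there are two, with counts 2 for user A and 1 for user B.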

How would I overload the except method to compare the first field of a Linq row collection?

How would I overload the except() method to compare only the first field in a row collection? What if I had an unequal number of columns in the two queries below (extra fields in one qry but not the other)?
I've read through some custom equality comparer questions similar to mine but could not find the exact answer for my solution.
Please help me in writing the overload code as I am new to the except method.
//Pass in your two datatables
//build the queries based on id and name.
var qry1 = datatable1.AsEnumerable().Select(a => new { ID = a["ID"].ToString(), Name = a["NAME"].ToString() });
var qry2 = datatable2.AsEnumerable().Select(b => new { ID = b["ID"].ToString(), Name = b["NAME"].ToString() });
//detect row deletes - a row is in datatable1 except missing from datatable2
var exceptAB = qry1.Except(qry2);
//detect row inserts - a row is in datatable2 except missing from datatable1
var exceptAB2 = qry2.Except(qry1);
//then I execute my code here
if (exceptAB.Any())
{
foreach (var id in exceptAB)
{
//print to console id and name
}
}
if (exceptAB2.Any())
{
foreach (var id in exceptAB2)
{
//print to console id and name
}
}
EDIT:
I solved this by using a linq query. I was storing the ID's in a variable already so I just used Contains() to pull the extra fields I needed.
var vProjectSummary = from a in dt1.AsEnumerable()
where sDelProjSummCheck.Contains(a.Field<string>("ID"))
select new
{
INV_ID = a.Field<string>("ID"),
It_Group = a.Field<string>("IT_GROUP"),
L6 = a.Field<string>("L6"),
Test_Mgr = a.Field<string>("TEST_MGR"),
INV_NAME = a.Field<string>("INV_NAME")
};
//
//
//
var vInsProjectSummary = from a in dt2.AsEnumerable()
where sInsProjSummCheck.Contains(a.Field<string>("ID"))
select new
{
INV_ID = a.Field<string>("ID"),
INV_NAME = a.Field<string>("INV_NAME")
};
Well, if you must use Except, you don't need to overload the method; you just need to write a class that implements the IEqualityComparer<T> interface. See http://msdn.microsoft.com/en-us/library/bb336390.aspx for example usages. The downside is that you can't do it with anonymous types; you would have to create a class to store the column data in.
If you don't have to use Except you can accomplish the same results with anonymous types by doing the following:
var exceptAB = qry1.Where(q1 => !qry2.Any(q2 => q2.ID == q1.ID));
var exceptBA = qry2.Where(q2 => !qry1.Any(q1 => q1.ID == q2.ID));
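For completeness, a comparer-based version of Except might look like the sketch below. The `Item` class, the `ItemIdComparer` class and the sample data are hypothetical names introduced for illustration; in the question's setting the two lists would be filled from datatable1 and datatable2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Written as a top-level program (C# 9 or later); wrap in Main for older compilers.
// Hypothetical contents of the two tables, already projected into Item objects.
var before = new List<Item> { new Item("1", "alpha"), new Item("2", "beta"), new Item("3", "gamma") };
var after = new List<Item> { new Item("2", "beta renamed"), new Item("3", "gamma"), new Item("4", "delta") };

var byId = new ItemIdComparer();
var deleted = before.Except(after, byId).ToList();    // in 'before' but missing from 'after'
var inserted = after.Except(before, byId).ToList();   // in 'after' but missing from 'before'

foreach (var d in deleted) Console.WriteLine($"deleted: {d.ID} {d.Name}");     // deleted: 1 alpha
foreach (var i in inserted) Console.WriteLine($"inserted: {i.ID} {i.Name}");   // inserted: 4 delta

// Hypothetical row type; a named class is needed so the comparer can name its type argument.
class Item
{
    public Item(string id, string name) { ID = id; Name = name; }
    public string ID { get; }
    public string Name { get; }
}

// Treats two items as equal when their IDs match, ignoring every other field.
class ItemIdComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y) => string.Equals(x?.ID, y?.ID, StringComparison.Ordinal);

    // Must be consistent with Equals: items with equal IDs return equal hash codes.
    public int GetHashCode(Item obj) => obj.ID == null ? 0 : obj.ID.GetHashCode();
}
```

Because only ID takes part in the comparison, item "2" is not reported even though its name changed, and the two sides could just as well carry different sets of extra fields.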
