Sum and Group by in linq using Datarows - c#

Full disclosure, I'm pretty much a total noob when it comes to LINQ. I could be way off base on how I should be approaching this.
I have a DataTable with 3 columns
oid,idate,amount
Each id has multiple dates, and each date has multiple amounts. What I need to do is sum the amounts for each day for each id, so instead of:
id,date,amount
00045,02/13/2011,11.50
00045,02/14/2011,11.00
00045,02/14/2011,12.00
00045,02/15/2011,10.00
00045,02/15/2011,5.00
00045,02/15/2011,12.00
00054,02/13/2011,8.00
00054,02/13/2011,9.00
I would have:
id,date,SumOfAmounts
00045,02/13/2011,11.50
00045,02/14/2011,23.00
00045,02/15/2011,27.00
00054,02/13/2011,17.00
private void excelDaily_Copy_Into(DataTable copyFrom, DataTable copyTo)
{
var results = from row in copyFrom.AsEnumerable()
group row by new
{
oid = row["oid"],
idate = row["idate"]
} into n
select new
{
// unsure what to do here
};
}
I've tried a dozen or so different ways of doing this and I always hit a wall where I can't figure out how to progress. I've been all over Stack Overflow and MSDN, and nothing so far has really helped me.
Thank you in advance!

You could try this:
var results = from row in copyFrom.AsEnumerable()
group row by new
{
oid = row.Field<int>("oid"),// Or string, depending what is the real type of your column
idate = row.Field<DateTime>("idate")
} into g
select new
{
g.Key.oid,
g.Key.idate,
SumOfAmounts = g.Sum(e => e.Field<decimal>("amount"))
};
I suggest using the Field extension method, which provides strongly typed access to each column value in the specified row.
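As a rough end-to-end sketch of the query above, the following builds a table with the sample rows from the question and writes one summed row per (oid, idate) pair into a new table. The class and method names (DailyTotals, BuildSample, SumPerDay) are made up for illustration, and the column types (string, DateTime, decimal) are assumptions about the real table.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DailyTotals
{
    // Builds a table shaped like the one in the question.
    // Column types are assumed: oid = string, idate = DateTime, amount = decimal.
    public static DataTable BuildSample()
    {
        var table = new DataTable();
        table.Columns.Add("oid", typeof(string));
        table.Columns.Add("idate", typeof(DateTime));
        table.Columns.Add("amount", typeof(decimal));

        table.Rows.Add("00045", new DateTime(2011, 2, 13), 11.50m);
        table.Rows.Add("00045", new DateTime(2011, 2, 14), 11.00m);
        table.Rows.Add("00045", new DateTime(2011, 2, 14), 12.00m);
        table.Rows.Add("00045", new DateTime(2011, 2, 15), 10.00m);
        table.Rows.Add("00045", new DateTime(2011, 2, 15), 5.00m);
        table.Rows.Add("00045", new DateTime(2011, 2, 15), 12.00m);
        table.Rows.Add("00054", new DateTime(2011, 2, 13), 8.00m);
        table.Rows.Add("00054", new DateTime(2011, 2, 13), 9.00m);
        return table;
    }

    // Groups the rows by (oid, idate) and returns a new table
    // holding one row per group with the summed amount.
    public static DataTable SumPerDay(DataTable copyFrom)
    {
        var copyTo = new DataTable();
        copyTo.Columns.Add("oid", typeof(string));
        copyTo.Columns.Add("idate", typeof(DateTime));
        copyTo.Columns.Add("SumOfAmounts", typeof(decimal));

        var results = from row in copyFrom.AsEnumerable()
                      group row by new
                      {
                          oid = row.Field<string>("oid"),
                          idate = row.Field<DateTime>("idate")
                      } into g
                      orderby g.Key.oid, g.Key.idate
                      select new
                      {
                          g.Key.oid,
                          g.Key.idate,
                          SumOfAmounts = g.Sum(e => e.Field<decimal>("amount"))
                      };

        // Materialize each anonymous result as a row of the target table.
        foreach (var r in results)
        {
            copyTo.Rows.Add(r.oid, r.idate, r.SumOfAmounts);
        }
        return copyTo;
    }
}
```

Calling DailyTotals.SumPerDay(DailyTotals.BuildSample()) should give four rows whose sums match the expected output in the question. The orderby clause is an addition for predictable row order and can be dropped.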

Although you don't specify it, copyFrom is apparently an object of class DataTable.
According to MSDN, System.Data.DataTable itself does not implement IEnumerable; the AsEnumerable() call in your code is an extension method from System.Data.DataTableExtensions. An alternative is the Rows property, which returns a collection of rows that implements IEnumerable:
IEnumerable<DataRow> rows = copyFrom.Rows.Cast<DataRow>();
If you use a different DataTable class, you'll probably do something similar to turn it into a sequence of DataRow.
An object of class System.Data.DataRow has an indexer to access the columns in the row. In your case the column names are oid, idate and amount.
To convert copyFrom into the sequence of items you want to process:
var itemsToProcess = copyFrom.Rows.Cast<DataRow>()
.Select(row => new
{
Id = row["oid"],
Date = (DateTime)row["idate"],
Amount = (decimal)row["amount"],
});
I'm not sure, but I assume that column idate contains dates and column amount contains some value. Feel free to use other types if your columns contain other types.
If your columns contain strings, convert them to the proper items using Parse:
var itemsToProcess = copyFrom.Rows.Cast<DataRow>()
.Select(row => new
{
Id = (string)row["oid"],
Date = DateTime.Parse( (string) row["idate"]),
Amount = decimal.Parse((string)row["amount"]),
});
If you are unfamiliar with lambda expressions, it helped me a lot to read them as follows:
itemsToProcess is a collection of items, taken from the collection of
DataRows, where from each row in this collection we created a new
object with three properties: Id = ...; Date = ...; Amount = ...
See:
Explanation of Standard LINQ operations for Cast and Select
Anonymous Types
Now we have a sequence where we can compare dates and sum the amounts.
What you want is to group all items in this sequence into groups with the same Id and Date. So you want one group with Id = 00045 and Date = 02/13/2011, another group with Id = 00045 and Date = 02/14/2011, and so on.
For this you use Enumerable.GroupBy. As the key selector (what all items in one group have in common) you use the combination of Id and Date:
var groups = itemsToProcess.GroupBy(item => new
{ Id = item.Id, Date = item.Date });
Now you have groups.
Each group has a property Key, of a type with two properties: Id and Date.
Each group is a sequence of items from your itemsToProcess collection (so each element is an "itemToProcess" with Id / Date / Amount properties).
All items in one group have the same Id and the same Date.
So all you have to do is sum the Amount of all elements in each group:
var resultSequence = groups.Select(groupItem => new
{
Id = groupItem.Key.Id,
Date = groupItem.Key.Date,
Sum = groupItem.Sum(itemToProcess => itemToProcess.Amount),
});
So putting it all together into one statement:
var resultSequence = copyFrom.Rows.Cast<DataRow>()
.Select(row => new
{
Id = (string)row["oid"],
Date = DateTime.Parse( (string) row["idate"]),
Amount = decimal.Parse((string)row["amount"]),
})
.GroupBy(item => new
{
Id = item.Id,
Date = item.Date
})
.Select(groupItem => new
{
Id = groupItem.Key.Id,
Date = groupItem.Key.Date,
Sum = groupItem.Sum(itemToProcess => itemToProcess.Amount),
});

Related

c# datatable groupby and sum column's values (without know the name)

I need to do a group by and sum the values for each columns. Actually I've been able to create a datatable as:
DataTable stats = dt.AsEnumerable().GroupBy(r => r["Data"]).OrderByDescending(r => r.Key).Select(g => g.OrderBy(r => r["Data"]).First()).CopyToDataTable();
Basically, I also need to sum the values of each column in the original DataTable (dt). Please note that, apart from a couple of columns, I may not know how many columns there are or what they are called.
In a previous test I used:
var query = from stat in stats.AsEnumerable()
group stat by stat.Field<string>("Data") into data
orderby data.Key
select new
{
Data = data.Key,
TotTWorked = data.Sum(stat => stat.Field<int>("Time_Work")),
TotTHold = data.Sum(stat => stat.Field<int>("Time_Hold")),
TotTAlarm = data.Sum(stat => stat.Field<int>("Time_Alarm")),
Productivity = 0,
};
But now I need to be more flexible so I can't specify the column name as above. Any help?
So assuming you have at least the list of column names, I'd go with the approach of creating a dictionary as part of the select and then transform it later to whatever form you need it. Here's an example:
var query = from stat in stats.AsEnumerable()
group stat by stat.Field<string>("Data") into data
orderby data.Key
select new
{
Data = data.Key,
SumsDictionary = listOfColumnNames
.Select(colName => new { ColName = colName, Sum = data.Sum(stat => stat.Field<int>(colName)) })
.ToDictionary(d => d.ColName, d => d.Sum),
Productivity = 0,
};
So that if you were to serialize the result object it would look something like this:
{
"Data": {},
"SumsDictionary": {
"Time_Work": 10,
"Time_Hold": 20,
"Time_Alarm": 30
},
"Productivity": 0
}
Hope it helps!
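If the column names are not known up front, one option is to read them from the table's schema at run time. The sketch below is only an illustration under assumptions: it treats every int column other than the grouping column as summable, assumes the grouping column holds non-null strings, and uses made-up names (DynamicSums, SumIntColumns).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DynamicSums
{
    // Returns, for each distinct value of groupColumn, a dictionary mapping
    // every int column name to the sum of that column within the group.
    // Assumption: groupColumn contains non-null strings (a null key would
    // make ToDictionary throw).
    public static Dictionary<string, Dictionary<string, int>> SumIntColumns(
        DataTable dt, string groupColumn)
    {
        // Discover the summable columns from the schema instead of hard-coding them.
        List<string> listOfColumnNames = dt.Columns.Cast<DataColumn>()
            .Where(c => c.DataType == typeof(int) && c.ColumnName != groupColumn)
            .Select(c => c.ColumnName)
            .ToList();

        return dt.AsEnumerable()
            .GroupBy(r => r.Field<string>(groupColumn))
            .ToDictionary(
                g => g.Key,
                g => listOfColumnNames.ToDictionary(
                    name => name,
                    // Field<int?> maps DBNull to null, which is counted as 0 here.
                    name => g.Sum(r => r.Field<int?>(name) ?? 0)));
    }
}
```

A call such as DynamicSums.SumIntColumns(dt, "Data") would then produce one inner dictionary per "Data" value, which can be reshaped into a DataTable or serialized as needed.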

How to convert to int and then compare in linq query c#

IEnumerable<classB> list = getItems();
//dt is datatable
list = list.Where(x => Convert.ToInt32( !dt.Columns["Id"]) == (x.Id));
I want to keep only the items in the list whose IDs appear in the DataTable's Id column; the rest should be removed. I'm not doing it right.
The datatable can have: ID - 1,3,4,5,7
The list can have: ID - 1,2,3,4,5,6,7,8,9,10
I want the output list to have: ID - 1,3,4,5,7
Your code won't work because you're comparing a definition of a column to an integer value. That's not a sensible comparison to make.
What you can do is put all of the values from the data table into a collection that can be effectively searched and then get all of the items in the list that are also in that collection:
var ids = new HashSet<int>(dt.AsEnumerable()
.Select(row => row.Field<int>("Id")));
list = list.Where(x => ids.Contains(x.Id));
Try this one
var idList = dt.AsEnumerable().Select(d => (int) d["Id"]).ToList();
list = list.Where(x => idList.Contains(x.Id));
You can't do it like that. dt.Columns["Id"] returns the DataColumn, not the value stored in that column for a specific DataRow. You need to join two LINQ queries: the one you already have, and another that you build from the DataTable.
var queryDt = (from dtRow in dt.AsEnumerable()
where !dtRow.IsNull("Id")
select Convert.ToInt32(dtRow["Id"])).ToList();
Now the join:
var qry = from nonNull in queryDt
join existing in list on nonNull equals existing.Id
select existing;

How would I overload the except method to compare the first field of a Linq row collection?

How would I overload the except() method to compare only the first field in a row collection? What if I had an unequal number of columns in the two queries below (extra fields in one qry but not the other)?
I've read through some custom equality comparer questions similar to mine but could not find the exact answer for my solution.
Please help me in writing the overload code as I am new to the except method.
//Pass in your two datatables
//build the queries based on id and name.
var qry1 = datatable1.AsEnumerable().Select(a => new { ID = a["ID"].ToString(), Name = a["NAME"].ToString() });
var qry2 = datatable2.AsEnumerable().Select(b => new { ID = b["ID"].ToString(), Name = b["NAME"].ToString() });
//detect row deletes - a row is in datatable1 except missing from datatable2
var exceptAB = qry1.Except(qry2);
//detect row inserts - a row is in datatable2 except missing from datatable1
var exceptAB2 = qry2.Except(qry1);
//then I execute my code here
if (exceptAB.Any())
{
foreach (var id in exceptAB)
{
//print to console id and name
}
}
if (exceptAB2.Any())
{
foreach (var id in exceptAB2)
{
//print to console id and name
}
}
EDIT:
I solved this by using a LINQ query. I was already storing the IDs in a variable, so I just used Contains() to pull the extra fields I needed.
var vProjectSummary = from a in dt1.AsEnumerable()
where sDelProjSummCheck.Contains(a.Field<string>("ID"))
select new
{
INV_ID = a.Field<string>("ID"),
It_Group = a.Field<string>("IT_GROUP"),
L6 = a.Field<string>("L6"),
Test_Mgr = a.Field<string>("TEST_MGR"),
INV_NAME = a.Field<string>("INV_NAME")
};
//
//
//
var vInsProjectSummary = from a in dt2.AsEnumerable()
where sInsProjSummCheck.Contains(a.Field<string>("ID"))
select new
{
INV_ID = a.Field<string>("ID"),
INV_NAME = a.Field<string>("INV_NAME")
};
Well, if you must use Except, you don't need to overload the method; you just need to write a class that implements the IEqualityComparer<T> interface and pass it to Except. See http://msdn.microsoft.com/en-us/library/bb336390.aspx for example usages. The downside is that you can't do it with anonymous types; you would have to create a class to store the column data in.
If you don't have to use Except you can accomplish the same results with anonymous types by doing the following:
var exceptAB = qry1.Where(q1 => !qry2.Any(q2 => q2.ID == q1.ID));
var exceptBA = qry2.Where(q2 => !qry1.Any(q1 => q1.ID == q2.ID));
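The IEqualityComparer<T> route mentioned above could look roughly like this. It is a sketch rather than a drop-in solution: IdName, IdOnlyComparer and TableDiff are hypothetical names, and the column names "ID" and "NAME" are taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical named type replacing the anonymous { ID, Name } projection.
public sealed class IdName
{
    public string ID { get; set; }
    public string Name { get; set; }
}

// Treats two items as equal when their IDs match, ignoring all other fields.
public sealed class IdOnlyComparer : IEqualityComparer<IdName>
{
    public bool Equals(IdName x, IdName y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.ID, y.ID, StringComparison.Ordinal);
    }

    // Must be consistent with Equals: hash only the ID.
    public int GetHashCode(IdName obj)
    {
        return obj == null || obj.ID == null
            ? 0
            : StringComparer.Ordinal.GetHashCode(obj.ID);
    }
}

public static class TableDiff
{
    // Projects each row to an IdName; extra columns in the table are ignored.
    public static List<IdName> Project(DataTable table)
    {
        return table.AsEnumerable()
            .Select(r => new IdName
            {
                ID = r["ID"].ToString(),
                Name = r["NAME"].ToString()
            })
            .ToList();
    }

    // Rows whose ID appears in source but not in other.
    public static List<IdName> MissingFrom(DataTable source, DataTable other)
    {
        return Project(source)
            .Except(Project(other), new IdOnlyComparer())
            .ToList();
    }
}
```

Because the comparer looks only at ID, the two tables may have different extra columns. Note that Except has set semantics, so duplicate IDs within the source collapse to a single result.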

Linq Query Grouping Data table using two columns

I have a question. I'm a newbie, so pardon my terminology. I am querying a DataTable and need to group its rows by date and by their unique access code.
var tMainTable = (from System.Data.DataRow b in _tData.data_table.Rows
group b by b["ACCESS_CODE"] into bGroup
select new
{ bGroup });
In my current grouping above, I am grouping the data table by access code. The data table has 3 fields: DATE, ACCESS_CODE, COUNT. Note that I can't use AsEnumerable() on my DataTable.
This time I want to add a second condition and group by date as well. Is there such a thing as:
var tMainTable = (from System.Data.DataRow b in _tData.data_table.Rows
**group b by b["ACCESS_CODE"] AND b["DATE"] into bGroup**
select new
{ bGroup });
Thanks for any inputs.
Use an anonymous type for your grouping:
var codeDateGroups = _tData.data_table.AsEnumerable()
.GroupBy(r => new {
AccessCode = r.Field<string>("ACCESS_CODE"),
Date = r.Field<DateTime>("DATE")
});
You can access it via the Key:
foreach(var group in codeDateGroups)
Console.WriteLine("Code=[{0}] Date=[{1}]"
, group.Key.AccessCode
, group.Key.Date);
var tMainTable = (from System.Data.DataRow b in _tData.data_table.Rows
group b by new { AccessCode = b["ACCESS_CODE"], Date = b["DATE"] } into bGroup
select new
{ bGroup });
var groups = _tData.data_table.AsEnumerable()
.GroupBy(row => new { AccessCode = row["ACCESS_CODE"], Date = row["DATE"] });
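For the case where AsEnumerable() is not available, a composite key can also be built over Rows.Cast<DataRow>(). The sketch below goes one step further and totals COUNT per (ACCESS_CODE, DATE) pair. The column types (string, DateTime, int), the choice to drop the time of day via .Date, and the names AccessCounts and TotalPerCodeAndDate are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class AccessCounts
{
    // Totals the COUNT column for every (ACCESS_CODE, DATE) combination.
    // Assumed column types: ACCESS_CODE = string, DATE = DateTime, COUNT = int.
    public static Dictionary<Tuple<string, DateTime>, int> TotalPerCodeAndDate(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            // Tuple has value-based equality, so it works as a composite key.
            // .Date strips the time of day; remove it if times should stay distinct.
            .GroupBy(r => Tuple.Create((string)r["ACCESS_CODE"], ((DateTime)r["DATE"]).Date))
            .ToDictionary(g => g.Key, g => g.Sum(r => (int)r["COUNT"]));
    }
}
```

A Tuple key is used here instead of an anonymous type only so the result can be returned from a method; inside a single method an anonymous type works the same way.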

How to select specific column in LINQ?

I have to select specific columns from my DataTable using LINQ.
I am using this code:
ds.Tables[0].AsEnumerable().Where<DataRow>(r=>r.Field<int>("productID")==23).CopyToDataTable();
But it is giving me all columns and I need only PRODUCTNAME , DESCRIPTION , PRICE
How I can write this query?
To expand a bit on @lazyberezovsky's answer, you can use an anonymous type projection to get all of the fields you want:
ds.Tables[0].AsEnumerable()
.Where<DataRow>(r => r.Field<int>("productID") == 23)
.Select(r => new { ProductName = r.Field<string>("productName"),
Description = r.Field<string>("description"),
Price = r.Field<decimal>("price") });
I don't know what name and type your product name, description, and price fields are, so you will have to substitute those.
Use Select method:
ds.Tables[0].AsEnumerable()
.Where<DataRow>(r=>r.Field<int>("productID")==23)
.Select(r => r.Field<int>("productID"));
UPDATE: In case you need to select several columns, you can return anonymous type:
var query = from row in ds.Tables[0].AsEnumerable()
where row.Field<int>("productID") == 23
select new {
ProductID = row.Field<int>("productID"),
Foo = row.Field<string>("foo")
};
If you need to copy that data to a new table, you'll face a problem (CopyToDataTable requires a collection of DataRow objects). See How to: Implement CopyToDataTable Where the Generic Type T Is Not a DataRow to solve this problem.
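If the goal is a DataTable that contains only some of the columns, another possible route is DataView.ToTable, which copies the named columns into a new table. This is a sketch under assumptions: the productID column is taken to be an int, and ColumnSubset and SelectColumns are made-up names.

```csharp
using System.Data;
using System.Linq;

public static class ColumnSubset
{
    // Returns a new table with only the requested columns, restricted to
    // the rows whose productID equals the given value.
    public static DataTable SelectColumns(
        DataTable products, int productId, params string[] columnNames)
    {
        DataRow[] matches = products.AsEnumerable()
            .Where(r => r.Field<int>("productID") == productId)
            .ToArray();

        // CopyToDataTable throws on an empty sequence, so fall back to an
        // empty clone of the schema when nothing matches.
        DataTable filtered = matches.Length > 0
            ? matches.CopyToDataTable()
            : products.Clone();

        // ToTable(distinct: false, ...) keeps all rows and only the named columns.
        return filtered.DefaultView.ToTable(false, columnNames);
    }
}
```

For the table in the question this might be called as ColumnSubset.SelectColumns(ds.Tables[0], 23, "PRODUCTNAME", "DESCRIPTION", "PRICE"), assuming those are the actual column names.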
