C# DataTable GroupBy and sum column values (without knowing the column names) - c#

I need to group rows and sum the values of each column. So far I've been able to create a DataTable like this:
DataTable stats = dt.AsEnumerable().GroupBy(r => r["Data"]).OrderByDescending(r => r.Key).Select(g => g.OrderBy(r => r["Data"]).First()).CopyToDataTable();
I also need to sum the values of each column in the original DataTable (dt). Please note that, apart from a couple of columns, I may not know how many columns there are or what they are called.
In a previous test I used:
var query = from stat in stats.AsEnumerable()
group stat by stat.Field<string>("Data") into data
orderby data.Key
select new
{
Data = data.Key,
TotTWorked = data.Sum(stat => stat.Field<int>("Time_Work")),
TotTHold = data.Sum(stat => stat.Field<int>("Time_Hold")),
TotTAlarm = data.Sum(stat => stat.Field<int>("Time_Alarm")),
Productivity = 0,
};
But now I need to be more flexible so I can't specify the column name as above. Any help?

So assuming you have at least the list of column names, I'd go with the approach of creating a dictionary as part of the select and then transform it later to whatever form you need it. Here's an example:
var query = from stat in stats.AsEnumerable()
group stat by stat.Field<string>("Data") into data
orderby data.Key
select new
{
Data = data.Key,
SumsDictionary = listOfColumnNames
.Select(colName => new { ColName = colName, Sum = data.Sum(stat => stat.Field<int>(colName)) })
.ToDictionary(d => d.ColName, d => d.Sum),
Productivity = 0,
};
So that if you were to serialize the result object it would look something like this:
{
"Data": {},
"SumsDictionary": {
"Time_Work": 10,
"Time_Hold": 20,
"Time_Alarm": 30
},
"Productivity": 0
}
Hope it helps!
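If the list of column names is not known in advance either, it can be read from the table's schema. The following is a minimal sketch, assuming that the grouping column is called "Data" (as in the question) and that every other column of type int should be summed; the sample rows and the names sumColumns and Sums are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class Program
{
    static void Main()
    {
        // Sample table; column names follow the question, values are made up.
        var dt = new DataTable();
        dt.Columns.Add("Data", typeof(string));
        dt.Columns.Add("Time_Work", typeof(int));
        dt.Columns.Add("Time_Hold", typeof(int));
        dt.Rows.Add("2020-01-01", 10, 1);
        dt.Rows.Add("2020-01-01", 5, 2);
        dt.Rows.Add("2020-01-02", 7, 3);

        // Assumption: sum every int column except the grouping column.
        List<string> sumColumns = dt.Columns.Cast<DataColumn>()
            .Where(c => c.ColumnName != "Data" && c.DataType == typeof(int))
            .Select(c => c.ColumnName)
            .ToList();

        var query = dt.AsEnumerable()
            .GroupBy(r => r.Field<string>("Data"))
            .OrderBy(g => g.Key)
            .Select(g => new
            {
                Data = g.Key,
                // One entry per discovered column; DBNull cells count as 0.
                Sums = sumColumns.ToDictionary(
                    name => name,
                    name => g.Sum(r => r.Field<int?>(name) ?? 0))
            });

        foreach (var row in query)
        {
            foreach (string name in sumColumns)
                Console.WriteLine($"{row.Data} {name} = {row.Sums[name]}");
        }
    }
}
```

Reading the value as int? and falling back to 0 avoids an exception on DBNull cells; whether treating missing values as zero is right depends on your data.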

Related

SQL to LINQ expres

I'm trying to convert a SQL expression to LINQ but I can't make it work. Can anyone help?
SELECT
COUNT(descricaoFamiliaNovo) as quantidades
FROM VeiculoComSeminovo
group by descricaoFamiliaNovo
I tried this:
ViewBag.familiasCount = db.VeiculoComSeminovo.GroupBy(a => a.descricaoFamiliaNovo).Count();
I need to know how many times each value repeats, but this only gives me the number of distinct values in the column.
You can try:
// ViewBag is a dynamic property, not a type, so project into an anonymous type
var list = from a in db.VeiculoComSeminovo
           group a by a.descricaoFamiliaNovo into g
           select new {
               familiasCount = g.Count()
           };
or
var list = db.VeiculoComSeminovo.GroupBy(a => a.descricaoFamiliaNovo)
    .Select(g => new
    {
        familiasCount = g.Count()
    });
If you need the column value as well:
new {
    FieldName = g.Key,
    familiasCount = g.Count()
};
You don't need the GROUP BY unless there are fields other than the one in COUNT. Try
SELECT
COUNT(descricaoFamiliaNovo) as quantidades
FROM VeiculoComSeminovo
UPDATE, from your comment:
SELECT
COUNT(descricaoFamiliaNovo) as quantidades,
descricaoFamiliaNovo
FROM VeiculoComSeminovo
GROUP BY descricaoFamiliaNovo
That's it as SQL. In LINQ it is something like:
var response = db.VeiculoComSeminovo.GroupBy(a => a.descricaoFamiliaNovo)
    .Select(n => new
    {
        Name = n.Key,
        Count = n.Count()
    });
Not tested.
Thanks, all, for the help.
I solved the problem using these lines:
// get the objects on db
var list = db.VeiculoComSeminovo.ToList();
// lists to receive the data
List<int> totaisFamilia = new List<int>();
List<int> totaisFamiliaComSN = new List<int>();
// loop through the families and add the values I need to their lists
foreach (var item in ViewBag.familias)
{
totaisFamilia.Add(list.Count(a => a.descricaoFamiliaNovo == item && a.valorSeminovo == null));
totaisFamiliaComSN.Add(list.Count(a => a.descricaoFamiliaNovo == item && a.valorSeminovo != null));
}
The query was a little slower than I expected, but I got the data.
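The loop above scans the whole list twice for every family. The same two counts can be computed per group in a single GroupBy. This is only a sketch: the Veiculo class is a hypothetical stand-in for the VeiculoComSeminovo entity, and valorSeminovo is assumed to be a nullable decimal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the VeiculoComSeminovo entity.
class Veiculo
{
    public string descricaoFamiliaNovo { get; set; }
    public decimal? valorSeminovo { get; set; } // assumed type
}

static class Program
{
    static void Main()
    {
        // Made-up sample data.
        var list = new List<Veiculo>
        {
            new Veiculo { descricaoFamiliaNovo = "A", valorSeminovo = null },
            new Veiculo { descricaoFamiliaNovo = "A", valorSeminovo = 100m },
            new Veiculo { descricaoFamiliaNovo = "B", valorSeminovo = null },
        };

        // One group per family, two counts per group.
        var totals = list
            .GroupBy(a => a.descricaoFamiliaNovo)
            .Select(g => new
            {
                Familia = g.Key,
                SemSeminovo = g.Count(a => a.valorSeminovo == null),
                ComSeminovo = g.Count(a => a.valorSeminovo != null)
            })
            .ToList();

        foreach (var t in totals)
            Console.WriteLine($"{t.Familia}: {t.SemSeminovo} without, {t.ComSeminovo} with");
    }
}
```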

Sum and Group by in linq using Datarows

Full disclosure: I'm pretty much a total noob when it comes to LINQ. I could be way off base on how I should be approaching this.
I have a DataTable with 3 columns
oid,idate,amount
each id has multiple dates, and each date has multiple amounts. What I need to do is sum the amount for each day for each id, so instead of:
id,date,amount
00045,02/13/2011,11.50
00045,02/14/2011,11.00
00045,02/14/2011,12.00
00045,02/15/2011,10.00
00045,02/15/2011,5.00
00045,02/15/2011,12.00
00054,02/13/2011,8.00
00054,02/13/2011,9.00
I would have:
id,date,SumOfAmounts
00045,02/13/2011,11.50
00045,02/14/2011,23.00
00045,02/15/2011,27.00
00054,02/13/2011,17.00
private void excelDaily_Copy_Into(DataTable copyFrom, DataTable copyTo)
{
var results = from row in copyFrom.AsEnumerable()
group row by new
{
oid = row["oid"],
idate = row["idate"]
} into n
select new
{
///unsure what to do
}
};
I've tried a dozen or so different ways of doing this and I always hit a wall where I can't figure out how to progress. I've been all over Stack Overflow and MSDN and nothing so far has really helped me.
Thank you in advance!
You could try this:
var results = from row in copyFrom.AsEnumerable()
              group row by new
              {
                  oid = row.Field<int>("oid"), // or string, depending on the real type of your column
                  idate = row.Field<DateTime>("idate")
              } into g
              select new
              {
                  g.Key.oid,
                  g.Key.idate,
                  SumOfAmounts = g.Sum(e => e.Field<decimal>("amount"))
              };
I suggest using the Field extension method, which provides strongly-typed access to each of the column values in the specified row.
Although you don't say so, I assume copyFrom is an object of a DataTable class that can be enumerated.
According to MSDN, System.Data.DataTable itself does not implement IEnumerable. If you use that class, you need the property Rows, which returns a collection of rows that implements IEnumerable:
IEnumerable<DataRow> rows = copyFrom.Rows.Cast<DataRow>();
If you use a different DataTable class, you'll probably do something similar to convert it to a sequence of DataRow.
An object of class System.Data.DataRow has an indexer to access the columns in the row. In your case the column names are oid, idate and amount.
To convert copyFrom into the sequence of items you want to process:
var itemsToProcess = copyFrom.Rows.Cast<DataRow>()
    .Select(row => new
    {
        Id = row["oid"],
        Date = (DateTime)row["idate"],
        Amount = (decimal)row["amount"],
    });
I'm not sure, but I assume that column idate contains dates and column amount contains some value. Feel free to use other types if your columns contain other types.
If your columns contain strings, convert them to the proper items using Parse:
var itemsToProcess = copyFrom.Rows.Cast<DataRow>()
    .Select(row => new
    {
        Id = (string)row["oid"],
        Date = DateTime.Parse((string)row["idate"]),
        Amount = decimal.Parse((string)row["amount"]),
    });
If you are unfamiliar with lambda expressions, it helped me a lot to read them as follows:
itemsToProcess is a collection of items, taken from the collection of
DataRows, where from each row in this collection we created a new
object with three properties: Id = ...; Date = ...; Amount = ...
See
Explanation of Standard Linq operations for Cast and Select
Anonymous Types
Now we have a sequence where we can compare dates and sum the amounts.
What you want is to group all items in this sequence into groups with the same Id and Date. So you want one group with Id = 00045 and Date = 02/13/2011, another group with Id = 00045 and Date = 02/14/2011, and so on.
For this you use Enumerable.GroupBy. As the key selector (= what all items in one group have in common) you use the combination of Id and Date:
var groups = itemsToProcess.GroupBy(item => new
    { Id = item.Id, Date = item.Date });
Now you have groups.
Each group has a property Key, of a type with two properties: Id and Date.
Each group is a sequence of items from your itemsToProcess collection (so each element has Id / Date / Amount properties),
and all items in one group have the same Id and the same Date.
So all you have to do is sum the Amount of all elements in each group.
var resultSequence = groups.Select(groupItem => new
{
    Id = groupItem.Key.Id,
    Date = groupItem.Key.Date,
    Sum = groupItem.Sum(itemToProcess => itemToProcess.Amount),
});
So putting it all together into one statement:
var resultSequence = copyFrom.Rows.Cast<DataRow>()
    .Select(row => new
    {
        Id = (string)row["oid"],
        Date = DateTime.Parse((string)row["idate"]),
        Amount = decimal.Parse((string)row["amount"]),
    })
    .GroupBy(item => new
    {
        Id = item.Id,
        Date = item.Date
    })
    .Select(groupItem => new
    {
        Id = groupItem.Key.Id,
        Date = groupItem.Key.Date,
        Sum = groupItem.Sum(itemToProcess => itemToProcess.Amount),
    });
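For reference, here is a small self-contained sketch of the whole round trip, including writing the sums into a second table as the question's excelDaily_Copy_Into method intends. It assumes oid is a string column, idate a DateTime column and amount a decimal column; the sample rows come from the question, and filling copyTo via Clone() and Rows.Add is only one possible way to do it.

```csharp
using System;
using System.Data;
using System.Linq;

static class Program
{
    static void Main()
    {
        // Assumed column types: string / DateTime / decimal.
        var copyFrom = new DataTable();
        copyFrom.Columns.Add("oid", typeof(string));
        copyFrom.Columns.Add("idate", typeof(DateTime));
        copyFrom.Columns.Add("amount", typeof(decimal));
        copyFrom.Rows.Add("00045", new DateTime(2011, 2, 13), 11.50m);
        copyFrom.Rows.Add("00045", new DateTime(2011, 2, 14), 11.00m);
        copyFrom.Rows.Add("00045", new DateTime(2011, 2, 14), 12.00m);
        copyFrom.Rows.Add("00054", new DateTime(2011, 2, 13), 8.00m);
        copyFrom.Rows.Add("00054", new DateTime(2011, 2, 13), 9.00m);

        // Empty table with the same three columns.
        DataTable copyTo = copyFrom.Clone();

        // Anonymous types compare by value, so they work as composite keys.
        var sums = copyFrom.AsEnumerable()
            .GroupBy(r => new
            {
                Oid = r.Field<string>("oid"),
                Date = r.Field<DateTime>("idate")
            })
            .Select(g => new
            {
                g.Key.Oid,
                g.Key.Date,
                Sum = g.Sum(r => r.Field<decimal>("amount"))
            });

        foreach (var s in sums)
            copyTo.Rows.Add(s.Oid, s.Date, s.Sum);

        foreach (DataRow row in copyTo.Rows)
            Console.WriteLine($"{row["oid"]},{(DateTime)row["idate"]:MM/dd/yyyy},{row["amount"]}");
    }
}
```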

MongoDB (v3.2.4) Fluent - Aggregate group sum is always 0

I'm trying to filter on millions of documents inside a collection, then applying a grouping on one of the properties. I need to know the sum of the amount properties. I've written the following code already, but the sum seems to be 0 always, even though the amount exists within each document in the collection.
var results = collection.Aggregate().Match(expression)
.Project(data => new
{
Id = data.Id,
Name = data.Name,
Amount = data.NetAmount
})
.Group(
key => new { key.Id, key.Name },
grouping => new
{
IdAndName = grouping.Key,
GroupTotalAmount = grouping.Sum(obj => obj.Amount)
})
.Project(arg => new
{
Id = arg.IdAndName.Id,
Name = arg.IdAndName.Name,
Amount = arg.GroupTotalAmount
});
If I use 1 instead of obj.Amount, in the above code snippet, I get the counts for the grouped collection.
Moreover, if I just project without group in the pipeline, I can see the Amount value populated for each document.
This is my expression in the above code:
Expression<Func<MyObject, bool>> expression = data => data.PropertyA == 55 && data.PropertyB == 2000;
Any help would be highly appreciated. Thanks.

Select from DataTable most recent result for each item

I have a DataTable containing thousands of rows. In the table there is a serial number column and a test number column. If a serial is tested more than once, the test number increments. I need to be able to select the most recent test for each serial from my DataTable and insert it into another DataTable. Currently I am using this:
DataTable newdata = data.AsEnumerable().Where(x => x.Field<Int16>("Test") ==
data.AsEnumerable().Where(y => y.Field<string>("Serial") ==
x.Field<string>("SerialNumber")).Select(y =>
y.Field<Int16>("Test")).Max()).Select(x => x).CopyToDataTable();
This does the job; however, it is clearly very inefficient, since it rescans the whole table for every row. Is there a more efficient way to select the top row of data for each serial number?
Thank you
Solution
So following on from Cam Bruce's answer I implemented the following code with a Dictionary rather than with a join:
//Get all of the serial numbers and their max test numbers
Dictionary<string, Int16> dict = data.AsEnumerable().GroupBy(x => x.Field<string>("SerialNumber")).ToDictionary(x => x.Key, x => x.Max(y => y.Field<Int16>("Test")));
//Create a datatable with only the max rows
DataTable newdata = data.AsEnumerable().Where(x => x.Field<Int16>("Test") ==
dict[x.Field<string>("SerialNumber")]).Select(x => x).CopyToDataTable();
//Clear the dictionary
dict.Clear();
This will give you each serial number, and the Max test. You can then join that result set back to the DataTable to get all the max rows.
var maxTest = data.AsEnumerable()
    .GroupBy(r => r.Field<string>("SerialNumber"))
    .Select(g => new
    {
        SerialNumber = g.Key,
        Test = g.Max(r => r.Field<Int16>("Test"))
    });
var maxRows = from d in data.AsEnumerable()
join m in maxTest
on new { S = d.Field<string>("SerialNumber"), T = d.Field<Int16>("Test") }
equals new { S = m.SerialNumber, T = m.Test }
select d;
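A third option avoids both the dictionary and the join by picking the highest-numbered row inside each group. This is a sketch under the same assumptions as above (a string column "SerialNumber" and an Int16 column "Test"); the sample rows are made up. Note that CopyToDataTable throws an InvalidOperationException when the source sequence is empty.

```csharp
using System;
using System.Data;
using System.Linq;

static class Program
{
    static void Main()
    {
        // Made-up sample data with the column names from the question.
        var data = new DataTable();
        data.Columns.Add("SerialNumber", typeof(string));
        data.Columns.Add("Test", typeof(short));
        data.Rows.Add("SN1", (short)1);
        data.Rows.Add("SN1", (short)2);
        data.Rows.Add("SN2", (short)1);

        // For each serial number keep only the row with the highest test number.
        DataTable newData = data.AsEnumerable()
            .GroupBy(r => r.Field<string>("SerialNumber"))
            .Select(g => g.OrderByDescending(r => r.Field<short>("Test")).First())
            .CopyToDataTable();

        foreach (DataRow row in newData.Rows)
            Console.WriteLine($"{row["SerialNumber"]}: test {row["Test"]}");
    }
}
```

With the sample rows this keeps test 2 for SN1 and test 1 for SN2. If two rows of one serial share the same highest test number, only one of them is kept, unlike the join and dictionary versions, which keep both.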

LINQ Group-by with complete object access

What I want is better explained with code. I have this query:
var items = context.Items.GroupBy(g => new {g.Name, g.Model})
    .Where(/*...*/)
    .Select(g => new ItemModel{
        Name = g.Key.Name,
        SerialNumber = g.FirstOrDefault().SerialNumber //<-- here
    });
Is there a better way to get the serial number or some other property that is not used in the key? The only way I could think of is to use FirstOrDefault.
Why not just include the serial number as part of the key via the anonymous type you're declaring:
var items = context.Items.GroupBy(g => new {g.Name, g.Model, g.SerialNumber })
    .Where(/*...*/)
    .Select(g => new ItemModel {
        Name = g.Key.Name,
        SerialNumber = g.Key.SerialNumber // now available directly from the key
    });
Or, alternatively, make your object the key:
var items = context.Items.Where(...).GroupBy(g => g)
.Select(i => new ItemModel {...});
Sometimes it can be easier to comprehend the query syntax (here, I've projected the Item object as part of the key):
var items = from i in context.Items
            group i by new { Serial = i.SerialNumber, Item = i } into gi
            where /* gi.Key.Item.GetType() == typeof(context.Items[0]) */
            select new ItemModel {
                Name = gi.Key.Item.Name,
                SerialNumber = gi.Key.Serial
                /*...*/
            };
EDIT: you could try grouping after projection like so:
var items = context.Items.Where(/*...*/).Select(i => new ItemModel { /*...*/})
.GroupBy(g => new { g.Name, g.Model });
From this you get a sequence of
IGrouping<AnonymousType, ItemModel>, with your arbitrary group-by as the key and your ItemModels as the grouped collection.
I would strongly advise against what you're doing. The serial number is being chosen arbitrarily, since you do no ordering in your queries. It would be better to specify exactly which serial number to choose; that way there are no surprises if the queries return items in a different order than "last time".
With that said, I think it would be cleaner to project the grouping and select the fields you need and take the first result. They all will have the same key values so that will stay the same, then you can add on any other fields you want.
var items = context.Items.GroupBy(i => new { i.Name, i.Model })
.Where(/*...*/)
.Select(g =>
g.OrderBy(i => i.Name).Select(i => new ItemModel
{
Name = i.Name,
SerialNumber = i.SerialNumber,
}).FirstOrDefault()
);
Since you need all the data, you need to store all the group data into your value (in the KeyValuePair).
I don't have the exact syntax in front of me, but it would look like:
/* ... */
.Select(g => new {
    Key = g.Key,
    Values = g
});
After that, you can loop through the Key to get your Name group. Inside of that loop, include a loop through the Values to get your ItemModel (I guess that's the object containing 1 element).
It would look like:
foreach (var g in items)
{
Console.WriteLine("List of SerialNumber in {0} group", g.Key);
foreach (var i in g.Values)
{
Console.WriteLine(i.SerialNumber);
}
}
Hope this helps!
You might want to look at Linq 101 samples for some help on different queries.
If the serial number is unique to the name and model, you should include it in your group-by object.
If it is not, then you have a list of serials per name and model, and selecting FirstOrDefault is probably plain wrong; that is, I can think of no scenario in which you would want this.
