How to calculate a cumulative sum in LINQ? - c#

I am developing my program in C# with Entity Framework, using LINQ for the queries. I have run into a problem converting one of my SQL queries to LINQ and don't know how to approach it.
Here is my SQL query in SQL Server:
SELECT bs.idBalancesortie, bs.datesortie, bs.c_num_debut, bs.c_num_fin, bs.nombredetickets,
       n.quotite, bs.montant,
       SUM(bs.montant) OVER (ORDER BY bs.idNature ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul,
       bs.idNature
FROM BalanceSortie bs
LEFT JOIN Natures n ON bs.idNature = n.idNature
ORDER BY bs.idNature

It calculates the cumulative sum of montant, but I want the cumulative sum to be computed per idNature, i.e. reset to zero for each new idNature.
For example:

id  Category  montant  cumul
1   C         49       49
2   A         4        4
3   A         16       20

Update: cumul now resets to zero for each idNature.
I have not tested this solution, but the key to getting a running sum is to create a "cumul" variable and use a multi-statement lambda. This requires early execution (.ToList() or .AsEnumerable()) after the Join.
I added a second variable that holds the "current" idNature. I assumed it is an int and that it can never be -1 in the data.
decimal cumul = 0;  // running total, reset for each new idNature
int idNature = -1;  // tracks the "current" idNature; assumed never -1 in the data
var target = BalanceSorties
    .Join(Natures, bs => bs.idNature, n => n.idNature, (bs, n) => new { bs, n })
    .OrderBy(q => q.n.idNature)
    .AsEnumerable() // switch to LINQ to Objects so the stateful lambda runs client-side
    .Select(q =>
    {
        if (idNature != q.n.idNature)
        {
            cumul = 0;                // new idNature: reset the running total
            idNature = q.n.idNature;
        }
        cumul += q.bs.montant;
        return new
        {
            q.bs.idBalancesortie,
            q.bs.datesortie,
            q.bs.c_num_debut,
            q.bs.c_num_fin,
            q.bs.nombredetickets,
            q.n.quotite,
            q.bs.montant,
            Cumul = cumul,
            q.bs.idNature
        };
    })
    .ToList();
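If the shared mutable variables feel fragile, here is an alternative sketch (also untested, using the same assumed entity and property names) that keeps each running total local to its idNature group by using GroupBy after switching to LINQ to Objects:
var target = BalanceSorties
    .Join(Natures, bs => bs.idNature, n => n.idNature, (bs, n) => new { bs, n })
    .OrderBy(q => q.n.idNature)
    .AsEnumerable()
    .GroupBy(q => q.bs.idNature)
    .SelectMany(g =>
    {
        decimal cumul = 0; // one running total per idNature group
        return g.Select(q => new
        {
            q.bs.idBalancesortie,
            q.bs.datesortie,
            q.bs.montant,
            Cumul = cumul += q.bs.montant, // accumulate within the group
            q.bs.idNature
        });
    })
    .ToList();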

Related

How to filter only last value in 2 columns grouping in LINQ C#

Let's say I have the following data:

EntryID | ColValueText
1113    | 20
1113    | 19
1024    | 20
1113    | 20
1024    | 21
1113    | 23
So in C#, using LINQ or any other method, I want to filter the data like this:

EntryID | ColValueText
1113    | 23
1024    | 21
That is: group by both columns, but get only the last ColValueText against each EntryID.
I am attaching my code, but I am unable to get the last ColValueText against each EntryID.
var groupSalesPersonData = (from r in sp
                            group r by new { r.EntryID, r.ColValueText }
                            into results
                            select new GroupByCommonDto
                            {
                                ParentGroupById = results.Select(a => a.EntryID).LastOrDefault(),
                                FieldValue = results.Select(a => a.FieldValue).LastOrDefault(),
                                Order = results.Sum(x => x.Order),
                                VehicleProfit = results.Sum(x => x.VehicleProfit),
                                ColValueText = results.Select(a => a.ColValueText).LastOrDefault()
                            })
                            .ToList();
I wouldn't use LINQ to solve this problem. You could potentially do it with Aggregate, as this is an aggregation problem, but I would use a simple foreach and a dictionary (a sketch follows at the end of this answer). If you do want to keep your query syntax, group by EntryID alone:
var groupSalesPersonData = (from r in sp
                            group r by new { r.EntryID }
                            into results
                            select new GroupByCommonDto
                            {
                                ParentGroupById = results.Select(a => a.EntryID).LastOrDefault(),
                                // Assuming this is a text value and you always want the last one in the collection.
                                ColValueText = results.Select(a => a.ColValueText).LastOrDefault()
                            })
                            .ToList();
If ColValueText actually holds numbers, using the correct numeric type together with the correct sort order will always give you the right value.
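A minimal sketch of the foreach-and-dictionary approach mentioned above (untested; it assumes EntryID is an int, ColValueText is a string, and sp yields rows in the order that defines "last"):
var lastByEntryId = new Dictionary<int, string>();
foreach (var r in sp)
    lastByEntryId[r.EntryID] = r.ColValueText; // later rows overwrite earlier ones

// For the sample data, lastByEntryId now maps 1113 -> "23" and 1024 -> "21".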

Partition By Logic in Code to calculate value of a DataTable Column

I'm using the following SQL for calculating the value of a column named weight within a view.
I need to move this calculation logic to code.
CASE
WHEN SUM(BaseVal) OVER (PARTITION BY TEMPS.MandateCode) = 0 THEN 0
ELSE (BaseVal / (SUM(BaseVal) OVER (PARTITION BY TEMPS.MandateCode))) END AS [Weight]
Is iterating over each row and grouping by MandateCode a good idea?
var datatableenum = datatable.AsEnumerable();
foreach (var item in datatableenum)
{
    List<DataTable> result = datatable.AsEnumerable()
        .GroupBy(row => row.Field<int>("MandateCode"))
        .Select(g => g.CopyToDataTable())
        .ToList();
}
I'm going to say "no", because as you have it, it will perform the whole group-and-copy operation once for every row in the table, which adds up to a huge amount of burnt resources. I would build a dictionary of MandateCode => sum first, and then use it while iterating the table:
var d = datatable.AsEnumerable()
    .GroupBy(
        row => row.Field<int>("MandateCode"),
        row => row.Field<double>("BaseVal")
    ).ToDictionary(g => g.Key, g => g.Sum());
Note that I've no idea what type BaseVal is; you'll need to adjust this. If it's an integer, remember that you'd be computing small_int/big_int, e.g. 12/6152, which is always 0 in integer division, so cast one of the operands to e.g. double so the result comes out like 0.1234.
Then use the dictionary on each row:
foreach (var item in datatableenum)
{
    double sumbv = d[item.Field<int>("MandateCode")];
    item["Weight"] = sumbv == 0 ? 0 : item.Field<double>("BaseVal") / sumbv;
}

Fetch every nth row with LINQ

We have a table in our SQL database with historical raw data I need to create charts from. We access the DB via Entity Framework and LINQ.
For smaller datetime intervals, I can simply read the data and generate the charts:
var mydata = entity.DataLogSet.Where(dt => dt.DateTime > dateLimit);
But we want to implement a feature where you can quickly "zoom out" from the charts to include larger date intervals (last 5 days, last month, last 6 months, last 10 years and so on and so forth.)
We don't want to chart every single data point for this. We want to use a sample of the data, by which I mean something like this --
Last 5 days: chart every data point in the table
Last month: chart every 10th data point in the table
Last 6 months: chart every 100th data point
The number of data points and chart names are only examples. What I need is a way to pick only the "nth" row from the database.
You can use the Select overload that includes the item's index in the enumeration. Something like this should do the trick --
var data = myDataLogEnumeration
    .Select((dt, i) => new { DataLog = dt, Index = i })
    .Where(x => x.Index % nth == 0)
    .Select(x => x.DataLog);
If you need to limit the query with a Where or sort with OrderBy, you must do it before the first Select, otherwise the indexes will be all wrong --
var data = myDataLogEnumeration
    .Where(dt => dt.DateTime > dateLimit)
    .OrderBy(dt => dt.SomeField)
    .Select((dt, i) => new { DataLog = dt, Index = i })
    .Where(x => x.Index % nth == 0)
    .Select(x => x.DataLog);
Unfortunately, as juharr commented, this overload is not supported in Entity Framework. One way to deal with this is to do something like this --
var data = entity.DataLogSet
    .Where(dt => dt.DateTime > dateLimit)
    .OrderBy(dt => dt.SomeField)
    .ToArray()
    .Select((dt, i) => new { DataLog = dt, Index = i })
    .Where(x => x.Index % nth == 0)
    .Select(x => x.DataLog);
Note the addition of ToArray(). This isn't ideal though, as it forces loading all the data that matches the initial query before selecting only every nth row.
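If the table happens to have a dense integer key, a hedged workaround (the column name Id here is an assumption about your schema) is to filter on the key modulo nth, since integer modulo is an operation Entity Framework can translate to SQL:
var data = entity.DataLogSet
    .Where(dt => dt.DateTime > dateLimit && dt.Id % nth == 0) // filter runs in SQL, not in memory
    .OrderBy(dt => dt.SomeField)
    .ToList();
If the keys have gaps from deleted rows, the spacing of the sample becomes approximate rather than exactly every nth row.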
There is a trick supported by EF that might work for this:
if (step != 0)
    query = query.Where(_ => Convert.ToInt32(_.Time.ToString().Substring(14, 2)) % step == 0);
This code converts the date to a string, cuts out the minutes, converts the minutes to an int, and then keeps every x-th minute. For example, if the variable step is 5, it keeps every 5th minute.
For PostgreSQL this translates to:
WHERE ((substring(c.time::text, 15, 2)::INT % #__step_1) = 0)
This works best with fixed measurement points, such as one sample per minute.
However, you can use the same method to group things up, by cutting the string at the hour, the minute, or the first digit of the minute (grouping by 10 minutes), and then applying aggregate functions such as max(), average() or sum(), which may even be more desirable.
For example, this groups by hour and takes the max of most columns but the average of the CPU load:
using var ef = new DbCdr.Context();
IQueryable<DbCdr.CallStatsLog> query;
query = from calls in ef.Set<DbCdr.CallStatsLog>()
        group calls by calls.Time.ToString().Substring(0, 13)
        into g
        orderby g.Max(_ => _.Time) descending
        select new DbCdr.CallStatsLog()
        {
            Time = g.Min(_ => _.Time),
            ConcurrentCalls = g.Max(_ => _.ConcurrentCalls),
            CpuUsage = (short)g.Average(_ => _.CpuUsage),
            ServerId = 0
        };
var res = query.ToList();
translates to:
SELECT MAX(c.time) AS "Time",
MAX(c.concurrent_calls) AS "ConcurrentCalls",
AVG(c.cpu_usage::INT::double precision)::smallint AS "CpuUsage",
0 AS "ServerId"
FROM call_stats_log AS c
GROUP BY substring(c.time::text, 1, 13)
ORDER BY MAX(c.time) DESC
Note: the examples work with Postgres and the ISO datestyle.

Counting number of items in ObservableCollection where it equals 1 - C#

I made an SQL query and filled the data into an ObservableCollection. The database contains many columns, so I want to count how many instances there are where a specific column = 1, then return that number as an int.
The query:
var test = from x in m_dcSQL_Connection.Testheaders
           where dtStartTime <= x.StartTime && dtEndtime >= x.StartTime
           select new
           {
               x.N,
               x.StartTime,
               x.TestTime,
               x.TestStatus,
               x.Operator,
               x.Login,
               x.DUT_id,
               x.Tester_id,
               x.PrintID
           };
Then I add the data pulled from the database to an ObservableCollection via:
lstTestData.Add(new clsTestNrData(item.N.ToString(),
                                  item.StartTime.ToString(),
                                  item.TestTime.ToString(),
                                  etc.....
I want to count how many times TestStatus = 1.
I have read about the .Count property but I do not fully understand how it works on ObservableCollections.
Any help?
The standard ObservableCollection<T>.Count property will give you the number of items in the collection.
What you are looking for is this:
testStatusOneItemCount = lstTestData.Where(item => item.TestStatus == 1).Count();
...which uses the IEnumerable<T>.Count() extension method that is part of LINQ.
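Equivalently, Count has an overload that takes the predicate directly, so the Where step can be folded in:
int testStatusOneItemCount = lstTestData.Count(item => item.TestStatus == 1);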
To elaborate a bit, Count will simply count the objects in your collection.
I suggest having a quick look at LINQ 101. Very good examples.
Here's an example:
// Assuming you have:
var list = new List<int> { 1, 2, 3, 4, 5, 6 };
var items_in_list = list.Count(); // = 6
Using LINQ's Where, you're basically filtering out items, producing a new sequence. So the following will let you count all the even numbers:
var evens = list.Where(item => item % 2 == 0);
var evens_count = evens.Count(); // = 3
You can combine this without the temp variables:
var total = Enumerable.Range(1, 6).Where(x => x % 2 == 0).Count(); // total = 3
Or you can then select something else:
var squares_of_evens = Enumerable.Range(1, 6)
    .Where(x => x % 2 == 0)
    .Select(x => x * x);
// squares_of_evens = { 4, 16, 36 }. You can count them, but you still get 3 :)

Best way to join / sort two lists

I am trying to figure out the best / fastest way to accomplish the following task.
There is a list of int:
{ 4, 1, 112, 78 }
and there is a list of objects:
object { Id, Date, Value }
Rules:
{int list} contains Ids which are not sorted in any particular order
{int list} contains an unknown number of elements
{object list} will always have only one occurrence of an Id on any particular day; no date can have two items with the same Id (the object list is already supplied like this). You could say that Id+Date uniquely identifies an object.
JOIN part: one day could have 1...n items, where 'n' is the number of elements in {int list}. The requirement is that in the final result every day has all 'n' Ids. So if the day 1/1/2014 has no item with Id=42, a new item with Value=0 is added for that day.
SORT part: {object list} needs to be sorted by date, then by Id, where the Id order must match the order in {int list}.
What would be the best algorithm to accomplish this task? This is what I do currently:
// first I insert all the missing Ids
// to achieve this, I sorted both lists so I know when to expect which Id
var orderedIntList = intList.OrderBy(x => x).ToList();
var orderedObjectList = objectList.OrderBy(x => x.Date).ThenBy(x => x.Id).ToList();
for (int i = 0; i < totalRecords; i++)
{
    currentIndex = i % orderedIntList.Count;
    currentId = orderedIntList[currentIndex];
    if (orderedObjectList.Count <= i || currentId != orderedObjectList[i].Id)
        orderedObjectList.Insert(i, new Object { Date = currentDate, Id = currentId });
    currentDate = orderedObjectList[i].Date;
}
// then, in order to have the items sorted in the original order, I use a LINQ join
var aListWithIndex = intList.Select((x, i) => new { Index = i, Id = x }).ToList();
return (from a in aListWithIndex
        join b in orderedObjectList on a.Id equals b.Id
        orderby b.Date, a.Index
        select b)
       .ToList();
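For comparison, here is a sketch of a more direct approach (untested; Item and its properties are invented names, since the real type isn't shown): map each Id to its position in {int list}, fill in the missing Ids per day, then sort by date and by that position:
// Position of each Id in the int list defines the required Id order.
var idOrder = intList
    .Select((id, index) => new { id, index })
    .ToDictionary(x => x.id, x => x.index);

var result = objectList
    .GroupBy(o => o.Date)
    .SelectMany(day => intList.Select(id =>
        day.FirstOrDefault(o => o.Id == id)                  // existing item for this day...
        ?? new Item { Id = id, Date = day.Key, Value = 0 })) // ...or a zero-valued filler
    .OrderBy(o => o.Date)
    .ThenBy(o => idOrder[o.Id])
    .ToList();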
