How to perform operation on grouped records?

These are my records:
Id EmpName Stats
1 Abc 1000
1 Abc 3000
1 Abc 2000
2 Pqr 4000
2 Pqr 5000
2 Pqr 6000
2 Pqr 7000
I am trying to group by the Id field, and after grouping I want output like this:
Expected output:
Id EmpName Stats
1 Abc 3000
2 Pqr 3000
For the 1st output record the calculation is:
3000 - 1000 = 2000 (i.e. highest minus lowest of the 1st and 2nd records)
3000 - 2000 = 1000 (i.e. highest minus lowest of the 2nd and 3rd records)
Total = 2000 + 1000 = 3000
For the 2nd output record the calculation is:
5000 - 4000 = 1000 (i.e. highest minus lowest of the first two records)
6000 - 5000 = 1000
7000 - 6000 = 1000
Total = 1000 + 1000 + 1000 = 3000
Here is a sample fiddle I have created: Fiddle
So far I have managed to group the records by Id, but how do I perform this calculation on the grouped records?

You can use the Aggregate method overload that allows you to maintain custom accumulator state.
In your case, we'll be maintaining the following:
decimal Sum; // Current result
decimal Prev; // Previous element Stats (zero for the first element)
int Index; // The index of the current element
The Index is needed only to avoid accumulating the first element's Stats into the result: the sum is reset to zero while Index < 2.
And here is the query:
var result = list.GroupBy(t => t.Id)
    .Select(g => new
    {
        ID = g.Key,
        Name = g.First().EmpName,
        Stats = g.Aggregate(
            new { Sum = 0m, Prev = 0m, Index = 0 },
            (a, e) => new
            {
                Sum = (a.Index < 2 ? 0 : a.Sum) + Math.Abs(e.Stats - a.Prev),
                Prev = e.Stats,
                Index = a.Index + 1
            }, a => a.Sum)
    }).ToList();
Edit: As requested in the comments, here is the foreach equivalent of the above Aggregate usage:
static decimal GetStats(IEnumerable<Employee> g)
{
    decimal sum = 0;
    decimal prev = 0;
    int index = 0;
    foreach (var e in g)
    {
        sum = (index < 2 ? 0 : sum) + Math.Abs(e.Stats - prev);
        prev = e.Stats;
        index++;
    }
    return sum;
}
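As an alternative sketch (not part of the answer above), the same "sum of differences between neighbouring records" can be written by pairing each value with its successor via Zip. The class and method names here are hypothetical, and Stats is assumed to be a decimal, as in GetStats above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConsecutiveDiff
{
    // Sums |current - previous| over every pair of neighbouring values.
    // Sequences with fewer than two elements yield 0.
    public static decimal SumOfAdjacentDifferences(IEnumerable<decimal> values)
    {
        var list = values.ToList(); // materialize once, since it is enumerated twice
        return list
            .Zip(list.Skip(1), (prev, cur) => Math.Abs(cur - prev))
            .Sum();
    }
}
```

Inside the grouped query this could be called as ConsecutiveDiff.SumOfAdjacentDifferences(g.Select(e => e.Stats)). One behavioural difference: for a group with a single record this returns 0, whereas the Aggregate version returns that record's Stats.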

Firstly, as mentioned in my comment, this can be done with a single LINQ query, but that brings complications, one being unreadable code.
Using a simple foreach over the groups:
Updated (handle dynamic group length):
var list = CreateData();
var groupList = list.GroupBy(t => t.Id);
var finalList = new List<Employee>();
// Iterate over the groups
foreach (var grp in groupList)
{
    var part1 = grp.Count() / 2;
    var part2 = (int)Math.Ceiling((double)grp.Count() / 2);
    var firstSet = grp.Select(i => i.Stats).Take(part2);
    var secondSet = grp.Select(i => i.Stats).Skip(part1).Take(part2);
    var total = (firstSet.Max() - firstSet.Min()) + (secondSet.Max() - secondSet.Min());
    finalList.Add(new Employee
    {
        Id = grp.Key,
        EmpName = grp.FirstOrDefault().EmpName,
        Stats = total
    });
}
Note:
You can optimize the logic used to get the data for the calculation.
A more complicated approach is to divide the group into equal parts when its length is not fixed.
Updated Fiddle
The LINQ way:
var list = CreateData();
var groupList = list.GroupBy(t => t.Id);
var testLinq = (from l in list
                group l by l.Id into grp
                let part1 = grp.Count() / 2
                let part2 = (int)Math.Ceiling((double)grp.Count() / 2)
                let firstSet = grp.Select(i => i.Stats).Take(part2)
                let secondSet = grp.Select(i => i.Stats).Skip(part1).Take(part2)
                select new Employee
                {
                    Id = grp.Key,
                    EmpName = grp.FirstOrDefault().EmpName,
                    Stats = (firstSet.Max() - firstSet.Min()) + (secondSet.Max() - secondSet.Min())
                }).ToList();

Related

How to remove objects that have 3 properties the same only leaving 1 but adding the number of those duplicates to a nr. column of that 1

I have a txt file of over 100k rows with values like this.
A B C D 1 2
A B C E 1 3
D E C F 1 3
D E C F 1 3
A B C B 1 2
E F G G 1 1
I read the file, fill an object from each row, and add it to a list. What I need to do next is merge the rows that have certain properties in common, summing one numeric column that always has the value 1 for those rows. So in the example I would get a list with objects like this:
A B C 3 6
D E C 2 6
E F G 1 1
There are 3 A B C values, so I keep only one, and the number is the sum but also the count, since that value is always 1. The other columns differ but are irrelevant to me: if the 3 I look at are the same, I consider the objects to be the same. One way I have found to do the grouping is using LINQ and group by with a key, and if I also create a counter I get a count of every value (which is the sum, since the number is always 1). However, this does not give me what I need.
Is there any way using LINQ after the group by to get this effect? Or another method?
EDIT
My Latest attempt
var dupes = serijskaLista.GroupBy(e => new { e.sreIsplatio, e.sreIznDob, e.sreSif, e.sreSerija })
    .Select(y => new { Element = y.Key, Counter = y.Count() });
ConcurrentBag<SreckaIsplacena> sgmooreList = new ConcurrentBag<SreckaIsplacena>();
List<Srecka> _srecke = _glavniRepository.UcitajSamoaktivneSrecke().OrderByDescending(item => item.ID).ToList<Srecka>();
ConcurrentBag<SreckaIsplacena> pomList = new ConcurrentBag<SreckaIsplacena>();
SreckaIsplacena _isplacena;
SreckaNagrade nag;
Parallel.ForEach(dupes, new ParallelOptions { MaxDegreeOfParallelism = 10 }, (dp) =>
{
    Srecka srec = (from s in _srecke
                   where s.Sifra == dp.Element.sreSif && s.Serija == dp.Element.sreSerija
                   select s).First();
    ConcurrentBag<SreckaNagrade> sreckaNagrade = new ConcurrentBag<SreckaNagrade>(_glavniRepository.DohvatiNagradeZaSrecku(srec.ID));
    if (sreckaNagrade != null)
    {
        nag = (from sn in sreckaNagrade
               where sn.Iznos == dp.Element.sreIznDob
               select sn).FirstOrDefault();
        Odobrenje odo = new Odobrenje();
        odo = odo.DohvatiOdobrenje(valutaGlavna.ID, dp.Element.sreIsplatio).FirstOrDefault();
        if (odo != null)
        {
            ConcurrentBag<PorezSrecka> listaPoreza = new ConcurrentBag<PorezSrecka>(_glavniRepository.UcitajPorezSrecka(valutaGlavna, odo, srec, nag.NagradaId));
            _isplacena = new SreckaIsplacena();
            decimal iz = Convert.ToDecimal(dp.Element.sreIznDob);
            _isplacena.BrojDobitaka = dp.Counter;
            _isplacena.Iznos = iz;
            _isplacena.Nagrada = nag;
            _isplacena.Prodavac = dp.Element.sreIsplatio;
            _isplacena.Valuta = valutaGlavna;
            _isplacena.Srecka = srec;
            _isplacena.Cijena = Convert.ToDecimal(srec.Cijena);
            if (listaPoreza.Count == 1)
            {
                PorezSrecka ps = listaPoreza.ElementAt(0);
                _isplacena.SreckaPorez = ps;
            }
            lock (_isplacena)
            {
                _isplacena.Save();
                lock (pomList)
                {
                    pomList.Add(_isplacena);
                }
            }
        }
    }
});
This seems to insert correctly into the DB, but the ConcurrentBag is not filled correctly, and I don't understand why.

This does not give you exactly the results you want, as I don't know how the last number is calculated, and you say the actual calculation of the last number is outside the scope of the question.
So, as an example, I will use a calculation of doubling the maximum of the last number in the input group. You will need to replace that with the actual calculation.
e.g.
string text = @"
A B C D 1 2
A B C E 1 3
D E C F 1 3
D E C F 1 3
A B C B 1 2
E F G G 1 1
";
// Split into lines and then by spaces to get data which can be queried.
var data = text.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
    .Select(l => l.Split(new char[] { ' ' }))
    .Select(a => new
    {
        L1 = a[0], L2 = a[1], L3 = a[2], L4 = a[3],
        N1 = Convert.ToInt32(a[4]),
        N2 = Convert.ToInt32(a[5])
    });
// Group by the first three letters
// and calculate the numeric values for each group
var grouped = (from r in data
               group r by r.L1 + " " + r.L2 + " " + r.L3 into results
               select new
               {
                   results.Key,
                   N1 = results.Sum(a => a.N1), // or N1 = results.Count(),
                   N2 = results.Max(a => a.N2) * 2 // Replace with actual calculation
               });
grouped.Dump(); // Dump() is a LINQPad extension method
// Or if you want to export it back to text
var text2 = String.Join("\r\n", grouped.Select(a => $"{a.Key} {a.N1} {a.N2}"));
text2.Dump();
Results in LinqPad would be (the text2 output, with the placeholder doubling calculation):
A B C 3 6
D E C 2 6
E F G 1 2

C# SQL to Linq - translation

I have a table with transactions, similar to:
REQUEST_ID  ITEM_ID  ITEM_STATUS_CD  EXECUTION_DTTM
1           1        1               2016-08-29 12:36:07.000
1           2        0               2016-08-29 12:37:07.000
2           3        5               2016-08-29 13:37:07.000
2           4        1               2016-08-29 15:37:07.000
2           5        10              2016-08-29 15:41:07.000
3           6        0               2016-08-29 15:41:07.000
What I want is a table showing the percentage of success/warning/error per Request_ID, with the end time of the latest transaction in that Request_ID:
REQUEST_ID  Transactions  EndTime                  Success  Warning  Error
1           2             2016-08-29 12:37:07.000  50       50       0
2           3             2016-08-29 15:41:07.000  0        33       66
3           1             2016-08-29 15:41:07.000  100      0        0
I get the table I want with the following SQL, but I don't know how to do it in LINQ (C#). Anyone?
SELECT distinct t1.[REQUEST_ID],
       t2.Transactions,
       t2.EndTime,
       t2.Success,
       t2.Warning,
       t2.Error
FROM [dbo].[jp_R_ITEM] t1 inner join (
    select top(100) max([EXECUTION_DTTM]) EndTime, REQUEST_ID,
           count([ITEM_ID]) as Transactions,
           coalesce(count(case when [ITEM_STATUS_CD] = 0 then 1 end), 0) * 100 / count([ITEM_ID]) as Success,
           coalesce(count(case when [ITEM_STATUS_CD] = 1 then 1 end), 0) * 100 / count([ITEM_ID]) as Warning,
           coalesce(count(case when [ITEM_STATUS_CD] > 1 then 1 end), 0) * 100 / count([ITEM_ID]) as Error
    from [dbo].[jp_R_ITEM]
    group by REQUEST_ID
    order by REQUEST_ID desc) t2
    on t1.[REQUEST_ID] = t2.REQUEST_ID and t1.[EXECUTION_DTTM] = t2.EndTime

So from all your Transactions with RequestId 1, you want to make one element. This one element should have the RequestId (in this case 1), the latest ExecutionDttm of all transactions with that RequestId, and finally the percentage of successes, warnings and errors among those transactions.
You want something similar for the Transactions with RequestId 2, for the Transactions with RequestId 3, etc.
Whenever you see something like "I want to group all items from a sequence into one object", you should immediately think of GroupBy.
This one object might be a very complex object: a List, a Dictionary, or an object of a class with a lot of properties.
So let's first make groups of Transactions that have the same RequestId:
var groupsWithSameRequestId = Transactions
    .GroupBy(transaction => transaction.RequestId);
Every group has a Key, which is the RequestId of all elements in the group. Every group is (not has) the sequence of all Transactions that have this RequestId value.
You want to transform every group into one result element. Every result element has a property RequestId and the number of transactions with this RequestId.
The RequestId is the Key of the group; the TransactionCount is of course the number of elements in the group:
var result = groupsWithSameRequestId.Select(group => new
{
    RequestId = group.Key,
    TransactionCount = group.Count(),
    ...
});
Those were the easiest ones.
EndTime is the maximum value of all ExecutionDttm in your group.
var result = groupsWithSameRequestId.Select(group => new
{
    RequestId = group.Key,
    TransactionCount = group.Count(),
    EndTime = group.Select(groupElement => groupElement.ExecutionDttm).Max(),
    ...
});
It might be that your query provider cannot translate Max on a DateTime. In that case, order descending and take the first:
EndTime = group.Select(groupElement => groupElement.ExecutionDttm)
    .OrderByDescending(date => date)
    .First(),
First() is enough; FirstOrDefault() is not needed, because we know there are no groups without any elements.
We have saved the most difficult / fun part for last. You want the percentage of successes / warnings / errors, which is based on the number of elements with ItemStatus 0 / 1 / more than 1.
Success = 100 * group
    .Where(groupElement => groupElement.ItemStatus == 0).Count()
    / group.Count(),
Warning = 100 * group
    .Where(groupElement => groupElement.ItemStatus == 1).Count()
    / group.Count(),
Error = 100 * group
    .Where(groupElement => groupElement.ItemStatus > 1).Count()
    / group.Count(),
It depends a bit on how smart your IQueryable / DbContext is. But at first glance it seems that Count() is called quite often. Introducing an extra Select will prevent this.
So combining this all into one LINQ statement:
var result = Transactions
    .GroupBy(transaction => transaction.RequestId)
    .Select(group => new
    {
        RequestId = group.Key,
        GroupCount = group.Count(),
        SuccessCount = group
            .Where(groupElement => groupElement.ItemStatus == 0).Count(),
        WarningCount = group
            .Where(groupElement => groupElement.ItemStatus == 1).Count(),
        ErrorCount = group
            .Where(groupElement => groupElement.ItemStatus > 1).Count(),
        EndTime = group
            .Select(groupElement => groupElement.ExecutionDttm)
            .Max(),
    })
    .Select(item => new
    {
        RequestId = item.RequestId,
        TransactionCount = item.GroupCount,
        EndTime = item.EndTime,
        // From here on these properties hold percentages, not counts
        SuccessCount = 100.0 * item.SuccessCount / item.GroupCount,
        WarningCount = 100.0 * item.WarningCount / item.GroupCount,
        ErrorCount = 100.0 * item.ErrorCount / item.GroupCount,
    });
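To make the two-step projection above easier to try out, here is a minimal in-memory sketch using LINQ to Objects and the six sample rows from the question. The Transaction, RequestSummary and RequestReport types are hypothetical stand-ins (the real entity classes are not shown in the question), and behaviour against a database provider may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the columns of the question's table.
public class Transaction
{
    public int RequestId { get; set; }
    public int ItemId { get; set; }
    public int ItemStatus { get; set; }
    public DateTime ExecutionDttm { get; set; }
}

// Hypothetical result type: one row per RequestId, percentages as doubles.
public class RequestSummary
{
    public int RequestId { get; set; }
    public int TransactionCount { get; set; }
    public DateTime EndTime { get; set; }
    public double Success { get; set; }
    public double Warning { get; set; }
    public double Error { get; set; }
}

public static class RequestReport
{
    public static List<RequestSummary> Summarize(IEnumerable<Transaction> transactions)
    {
        return transactions
            .GroupBy(t => t.RequestId)
            // First projection: count each status bucket once per group.
            .Select(g => new
            {
                RequestId = g.Key,
                Total = g.Count(),
                EndTime = g.Max(t => t.ExecutionDttm),
                Success = g.Count(t => t.ItemStatus == 0),
                Warning = g.Count(t => t.ItemStatus == 1),
                Error = g.Count(t => t.ItemStatus > 1)
            })
            // Second projection: turn the counts into percentages.
            .Select(x => new RequestSummary
            {
                RequestId = x.RequestId,
                TransactionCount = x.Total,
                EndTime = x.EndTime,
                Success = 100.0 * x.Success / x.Total,
                Warning = 100.0 * x.Warning / x.Total,
                Error = 100.0 * x.Error / x.Total
            })
            .OrderBy(s => s.RequestId)
            .ToList();
    }

    // The six sample rows from the question.
    public static List<Transaction> SampleData()
    {
        var day = new DateTime(2016, 8, 29);
        return new List<Transaction>
        {
            new Transaction { RequestId = 1, ItemId = 1, ItemStatus = 1,  ExecutionDttm = day.Add(new TimeSpan(12, 36, 7)) },
            new Transaction { RequestId = 1, ItemId = 2, ItemStatus = 0,  ExecutionDttm = day.Add(new TimeSpan(12, 37, 7)) },
            new Transaction { RequestId = 2, ItemId = 3, ItemStatus = 5,  ExecutionDttm = day.Add(new TimeSpan(13, 37, 7)) },
            new Transaction { RequestId = 2, ItemId = 4, ItemStatus = 1,  ExecutionDttm = day.Add(new TimeSpan(15, 37, 7)) },
            new Transaction { RequestId = 2, ItemId = 5, ItemStatus = 10, ExecutionDttm = day.Add(new TimeSpan(15, 41, 7)) },
            new Transaction { RequestId = 3, ItemId = 6, ItemStatus = 0,  ExecutionDttm = day.Add(new TimeSpan(15, 41, 7)) }
        };
    }
}
```

Calling RequestReport.Summarize(RequestReport.SampleData()) yields one row per request. Because of the 100.0 factor the percentages are fractional, so request 2 gets roughly 33.3 and 66.7 rather than the truncated 33 and 66 produced by the integer division in the SQL.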

var query = (from t1 in lst
             join t2 in (from b in lst
                         group b by b.REQUEST_ID into grp
                         select new
                         {
                             EndTime = (from g1 in grp select g1.EXECUTION_DTTM).Max(),
                             REQUEST_ID = grp.Key,
                             Transactions = grp.Count(),
                             Success = ((from g2 in grp select g2.ITEM_STATUS_CD).Count(x => x == 0)) * 100 / grp.Count(),
                             Warning = ((from g3 in grp select g3.ITEM_STATUS_CD).Count(x => x == 1)) * 100 / grp.Count(),
                             Error = ((from g4 in grp select g4.ITEM_STATUS_CD).Count(x => x > 1)) * 100 / grp.Count(),
                         }).OrderByDescending(x => x.REQUEST_ID).Take(100)
             on new { RID = t1.REQUEST_ID, EXDT = t1.EXECUTION_DTTM } equals new { RID = t2.REQUEST_ID, EXDT = t2.EndTime }
             select new
             {
                 t1.REQUEST_ID,
                 t2.Transactions,
                 t2.EndTime,
                 t2.Success,
                 t2.Warning,
                 t2.Error
             }).Distinct().ToList();

Get count and avg for specific criterias and also the rest

I have my data in the following format..
UserId Property1 Property2 Property3 Testval
1 1 1 10 35
2 1 2 3 45
3 2 5 6 55
and so on..
I have several criteria; a couple of examples are below.
a) Where Property1=1 and Property3=10
b) Where Property1!=1 and Property2=5
What I need is the user count and the Testval average for the users who fall within each of these criteria, and also for all the rest who do not.
So, result data structure would be as follows..
User Count
Criteria Users
a 100
b 200
rest 1000
TestVal Average
Criteria avg
a 25
b 45
rest 15
I know how to get the user list for a specific criterion separately:
data.Where(w=>w.Property1==1).Select(s=>s.UserId).ToList()
But how do I get the user count and average value, and more importantly the same for the rest of the users?
Any help is sincerely appreciated
Thanks

Looks like you are looking for a group by criteria. Something like this:
var result = data.GroupBy(x =>
        x.Property1 == 1 && x.Property3 == 10 ? 0 :
        x.Property1 != 1 && x.Property2 == 5 ? 1 :
        // ...
        -1)
    .Select(g => new
    {
        Criteria = g.Key,
        Users = g.Count(),
        Avg = g.Average(x => x.Testval),
    })
    .ToList();
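The group-by-criteria idea above can be sketched as a small runnable example with named buckets instead of numeric keys. The UserRow, CriteriaStat and CriteriaReport types are hypothetical, and Testval is assumed to be an int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type matching the columns in the question.
public class UserRow
{
    public int UserId { get; set; }
    public int Property1 { get; set; }
    public int Property2 { get; set; }
    public int Property3 { get; set; }
    public int Testval { get; set; }
}

public class CriteriaStat
{
    public string Criteria { get; set; }
    public int Users { get; set; }
    public double Avg { get; set; }
}

public static class CriteriaReport
{
    // Returns the label of the first criterion a row satisfies, or "rest".
    public static string Classify(UserRow u)
    {
        if (u.Property1 == 1 && u.Property3 == 10) return "a";
        if (u.Property1 != 1 && u.Property2 == 5) return "b";
        return "rest";
    }

    // One pass over the data: each row falls into exactly one bucket.
    public static List<CriteriaStat> Summarize(IEnumerable<UserRow> data)
    {
        return data
            .GroupBy(u => Classify(u))
            .Select(g => new CriteriaStat
            {
                Criteria = g.Key,
                Users = g.Count(),
                Avg = g.Average(u => u.Testval)
            })
            .ToList();
    }
}
```

Two properties of this design are worth keeping in mind: a row that satisfies several criteria is counted only under the first one that matches, and a criterion that no row satisfies does not appear in the result at all.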

Getting the count/average for a specific criterion is easy:
Func<MyUser, bool> criterion1 = user => user.Property1 == 1;
var avg = data.Where(criterion1).Average(user => user.Testval);
var count = data.Where(criterion1).Count();
(This will enumerate the data twice, so if that's an issue, you can materialize the data before the calculations.)
If you want to evaluate multiple criteria (and don't want to repeat this code as many times as there are criteria), you can put them in a dictionary and loop over them:
var criteria = new Dictionary<string, Func<MyUser, bool>>
{
    { "criterion1", user => user.Property1 == 1 },
    { "criterion2", user => user.Property1 != 1 && user.Property2 == 5 },
    //...
};
foreach (var criterion in criteria)
{
    var avg = data.Where(criterion.Value).Average(user => user.Testval);
    var count = data.Where(criterion.Value).Count();
    Console.WriteLine($"{criterion.Key} average: {avg}, count: {count}");
}
You can also put the results in another dictionary, something like
// Tuple<double, int> assumes Testval is an int or double, so Average returns a double
var results = new Dictionary<string, Tuple<double, int>>();
foreach (var criterion in criteria)
{
    var avg = data.Where(criterion.Value).Average(user => user.Testval);
    var count = data.Where(criterion.Value).Count();
    results.Add(criterion.Key, Tuple.Create(avg, count));
}
and then make a better-looking report, or you can even create a specific result class that will be easier to print afterwards.
To get the rest (the count/average of the data that does not fit any predicate), you can loop through all the predicates, negating them:
IEnumerable<MyUser> query = data;
foreach (var criterion in criteria.Values)
{
    query = query.Where(user => !criterion(user));
}
var restAvg = query.Average(user => user.Testval);
var restCount = query.Count();

You can do it using select new to return new anonymously typed objects which contain the values for your criteria.
public void Test()
{
    var list = new List<User>();
    list.Add(new User { UserId = 1, Property1 = 1, Property2 = 1, Property3 = 10, Testval = 35 });
    list.Add(new User { UserId = 1, Property1 = 2, Property2 = 2, Property3 = 3, Testval = 45 });
    list.Add(new User { UserId = 1, Property1 = 5, Property2 = 5, Property3 = 6, Testval = 55 });
    Func<User, bool> crit = u => u.Property1 == 1 && u.Property3 == 10;
    var zz = list.Where(crit)
        .GroupBy(t => new { ID = t.UserId })
        .Select(w => new
        {
            average = w.Average(a => a.Testval),
            count = w.Count(),
            rest = list.Except(list.Where(crit)).Average(a => a.Testval)
        }).Single();
}

How to: sum all values and assign a percentage of the total in Linq to sql

I have a simple linq query that I'm trying to extend so that I can first sum all the values in the VoteCount field and then for each Nominee I want to assign what percentage of votes the nominee received.
Here's the code:
TheVoteDataContext db = new TheVoteDataContext();
var results = from n in db.Nominees
              join v in db.Votes on n.VoteID equals v.VoteID
              select new
              {
                  Name = n.Name,
                  VoteCount = v.VoteCount,
                  NomineeID = n.NomineeID,
                  VoteID = v.VoteID
              };

Since selecting the single votes for each nominee and calculating the sum of all votes are two different tasks, I cannot think of a way of doing this efficiently in one single query. I would simply do it in two steps, as
var results = from n in db.Nominees
              join v in db.Votes on n.VoteID equals v.VoteID
              select new
              {
                  Name = n.Name,
                  VoteCount = v.VoteCount,
                  NomineeID = n.NomineeID,
                  VoteID = v.VoteID
              };
var sum = (decimal)results.Select(r => r.VoteCount).Sum();
var resultsWithPercentage = results.Select(r => new
{
    Name = r.Name,
    VoteCount = r.VoteCount,
    NomineeID = r.NomineeID,
    VoteID = r.VoteID,
    Percentage = sum != 0 ? (r.VoteCount / sum) * 100 : 0
});
You could also calculate the sum before the results (using an aggregate query), this would leave the task of summing to the Database engine. I believe that this would be slower, but you can always find out by trying :)
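For experimenting without a database, the two-step idea (total first, then each row's share) can be sketched with LINQ to Objects. The NomineeVotes, NomineeShare and VoteShares types are hypothetical and only stand in for the joined query result. This says nothing about how LINQ to SQL would translate or perform the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input row: a nominee with its vote count.
public class NomineeVotes
{
    public string Name { get; set; }
    public int VoteCount { get; set; }
}

// Hypothetical output row: the same data plus the share of all votes.
public class NomineeShare
{
    public string Name { get; set; }
    public int VoteCount { get; set; }
    public decimal Percentage { get; set; }
}

public static class VoteShares
{
    public static List<NomineeShare> WithPercentages(IEnumerable<NomineeVotes> rows)
    {
        var list = rows.ToList();                    // materialize once
        decimal total = list.Sum(r => r.VoteCount);  // step 1: the grand total

        // Step 2: each row's share of the total, guarding against division by zero.
        return list
            .Select(r => new NomineeShare
            {
                Name = r.Name,
                VoteCount = r.VoteCount,
                Percentage = total != 0 ? r.VoteCount / total * 100 : 0m
            })
            .ToList();
    }
}
```

Declaring total as decimal is what keeps r.VoteCount / total from being an integer division.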

Extract results from List<Customers> using LINQ

I have a List of Customers
List<Customers> cust = new List<Customers>();
cust.Add(new Customers(){ID=1, Name="Sam", PurchaseDate=DateTime.Parse("01/12/2008")});
cust.Add(new Customers(){ID=2, Name="Lolly", PurchaseDate=DateTime.Parse("03/18/2008")});
I want to show 2 separate results like:
Purchases by Customer in Yr // Grouping by yr and display id,name
Purchases by Customer in Yr - Month // Grouping by yr then month and display id,name
Also, what if I want to order by the yr?
Update:
Just one more addition. If I have a field called "Status" in the Customer class with one of the values 'Y', 'N', 'C' (standing for yes, no and cancel), how would I create a query that gives the ratio in %?
Y - 20%
N - 30%
C - 50%

Grouping by year:
var groupByYear = customers.GroupBy(customer => customer.PurchaseDate.Year);
foreach (var group in groupByYear)
{
    Console.WriteLine("Year: {0}", group.Key);
    foreach (var customer in group)
    {
        Console.WriteLine("{0}: {1}", customer.ID, customer.Name);
    }
}
Grouping by year and month:
var groupByYearMonth = customers.GroupBy(customer =>
    new DateTime(customer.PurchaseDate.Year, customer.PurchaseDate.Month, 1));
foreach (var group in groupByYearMonth)
{
    Console.WriteLine("Year/month: {0}/{1}", group.Key.Year, group.Key.Month);
    foreach (var customer in group)
    {
        Console.WriteLine("{0}: {1}", customer.ID, customer.Name);
    }
}
Ordering:
var ordered = customers.OrderBy(customer => customer.PurchaseDate.Year);
All of these use "dot notation" instead of query expressions because they're so simple - but you could use a query expression if you wanted to.
EDIT: For the status part, just use David B's answer.

int total = customers.Count();
var counts = customers.GroupBy(c => c.Status)
    .Select(g => new
    {
        Status = g.Key,
        TheRatio = (g.Count() * 100) / total
    });
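Note that (g.Count() * 100) / total is an integer division, so 1 out of 3 comes out as 33. If fractional percentages are wanted, a variant along these lines could be used. This is only a sketch: the StatusRatios class and its method are hypothetical, and it works on the bare status characters rather than on the Customers objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StatusRatios
{
    // Maps each status value to its share of all entries, in percent.
    // An empty input yields an empty dictionary instead of dividing by zero.
    public static Dictionary<char, double> Percentages(IEnumerable<char> statuses)
    {
        var list = statuses.ToList();
        if (list.Count == 0)
        {
            return new Dictionary<char, double>();
        }

        return list
            .GroupBy(s => s)
            .ToDictionary(g => g.Key, g => 100.0 * g.Count() / list.Count);
    }
}
```

With the customer list from the question this could be called as StatusRatios.Percentages(customers.Select(c => c.Status)), assuming Status is a char.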
