Get count and average for specific criteria and also the rest - C#

I have my data in the following format..
UserId Property1 Property2 Property3 Testval
1 1 1 10 35
2 1 2 3 45
3 2 5 6 55
and so on.
I have several criteria; a couple of examples are below:
a) Where Property1=1 and Property3=10
b) Where Property1!=1 and Property2=5
What I need is the user count and the Testval average for the users who meet each of these criteria, and also for all the rest who meet none of them.
The result data structure would be as follows:
User Count
Criteria Users
a 100
b 200
rest 1000
TestVal Average
Criteria avg
a 25
b 45
rest 15
I know how to get the user list for a specific criterion on its own:
data.Where(w=>w.Property1==1).Select(s=>s.UserId).ToList()
But how do I get the user count and the average value, and more importantly the same for the rest of the users?
Any help is sincerely appreciated.
Thanks

It looks like you want to group by criteria. Something like this:
var result = data.GroupBy(x =>
x.Property1 == 1 && x.Property3 == 10 ? 0 :
x.Property1 != 1 && x.Property2 == 5 ? 1 :
// ...
-1)
.Select(g => new
{
Criteria = g.Key,
Users = g.Count(),
Avg = g.Average(x => x.Testval),
})
.ToList();
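A minimal, self-contained sketch of the grouping approach above, using string labels instead of numeric keys. The `User` class, the `CriteriaGrouping` helper, the labels "a", "b", "rest", and the fourth sample row are assumptions made for illustration; the two criteria are the ones from the question. The conditional chain assigns each row to the first criterion it matches, so overlapping criteria are not counted twice, and a label that matches no rows is simply absent from the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the columns in the question.
public class User
{
    public int UserId { get; set; }
    public int Property1 { get; set; }
    public int Property2 { get; set; }
    public int Property3 { get; set; }
    public int Testval { get; set; }
}

public static class CriteriaGrouping
{
    // Label of the first criterion the row satisfies, or "rest" if it satisfies none.
    public static string Classify(User u) =>
        u.Property1 == 1 && u.Property3 == 10 ? "a" :
        u.Property1 != 1 && u.Property2 == 5 ? "b" :
        "rest";

    // Groups the rows by label and returns label -> (user count, average Testval).
    public static Dictionary<string, (int Users, double Avg)> Summarize(IEnumerable<User> data) =>
        data.GroupBy(u => Classify(u))
            .ToDictionary(
                g => g.Key,
                g => (Users: g.Count(), Avg: g.Average(x => x.Testval)));

    // Illustrative rows: the three from the question plus one made-up row.
    public static List<User> SampleData() => new List<User>
    {
        new User { UserId = 1, Property1 = 1, Property2 = 1, Property3 = 10, Testval = 35 },
        new User { UserId = 2, Property1 = 1, Property2 = 2, Property3 = 3, Testval = 45 },
        new User { UserId = 3, Property1 = 2, Property2 = 5, Property3 = 6, Testval = 55 },
        new User { UserId = 4, Property1 = 3, Property2 = 4, Property3 = 6, Testval = 15 },
    };

    public static void Main()
    {
        foreach (var entry in Summarize(SampleData()))
            Console.WriteLine($"{entry.Key}: users = {entry.Value.Users}, avg = {entry.Value.Avg}");
    }
}
```

Because the result is keyed by label, the "rest" row can be read the same way as any named criterion, for example `Summarize(data)["rest"].Users`.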

Getting the count and average for a specific criterion is easy:
Func<MyUser, bool> criterion1 = user => user.Property1==1;
var avg = data.Where(criterion1).Average(user => user.Testval);
var count = data.Where(criterion1).Count();
(This enumerates the data twice, so if that is an issue, you can materialize the data before the calculations.)
If you want to evaluate multiple criteria (and don't want to repeat this code as many times as there are criteria), you can put them in a dictionary, and loop over them:
var criteria = new Dictionary<string, Func<MyUser, bool>>{
    { "criterion1", user => user.Property1==1 },
    { "criterion2", user => user.Property1!=1 && user.Property2==5 },
    //...
};
foreach (var criterion in criteria){
    var avg = data.Where(criterion.Value).Average(user => user.Testval);
    var count = data.Where(criterion.Value).Count();
    Console.WriteLine($"{criterion.Key} average: {avg}, count: {count}");
}
You can also put the results in another dictionary, something like
var results = new Dictionary<string, Tuple<double, int>>();
foreach (var criterion in criteria){
    var avg = data.Where(criterion.Value).Average(user => user.Testval);
    var count = data.Where(criterion.Value).Count();
    results.Add(criterion.Key, Tuple.Create(avg, count));
}
and then make a better looking report, or you can even create a specific result class that will be easier to print after.
To get the rest (the count/average of the data that does not fit any predicate), you can loop through all the predicates, negating each one:
IEnumerable<MyUser> query = data;
foreach (var criterion in criteria.Values){
    query = query.Where(user => !criterion(user));
}
var restAvg = query.Average(user => user.Testval);
var restCount = query.Count();
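The dictionary approach and the "rest" calculation can be combined into one report using a dedicated result class, as suggested above. This is only a sketch: `MyUser`, `CriterionResult`, and `CriteriaReport` are hypothetical names, and the sample rows are made up. One detail worth handling is that `Enumerable.Average` throws an `InvalidOperationException` on an empty sequence of non-nullable values, so the sketch reports a null average when a criterion matches nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the columns in the question.
public class MyUser
{
    public int UserId { get; set; }
    public int Property1 { get; set; }
    public int Property2 { get; set; }
    public int Property3 { get; set; }
    public int Testval { get; set; }
}

// Hypothetical result type: one entry per criterion plus one for "rest".
public class CriterionResult
{
    public string Name { get; set; }
    public int Count { get; set; }
    public double? Average { get; set; } // null when no row matched
}

public static class CriteriaReport
{
    public static List<CriterionResult> Build(
        IReadOnlyCollection<MyUser> data,
        IDictionary<string, Func<MyUser, bool>> criteria)
    {
        var results = new List<CriterionResult>();
        foreach (var criterion in criteria)
        {
            // Materialize once so Count and Average do not re-run the filter.
            var matches = data.Where(criterion.Value).ToList();
            results.Add(Summarize(criterion.Key, matches));
        }

        // "rest": rows that satisfy none of the criteria.
        var rest = data.Where(u => !criteria.Values.Any(c => c(u))).ToList();
        results.Add(Summarize("rest", rest));
        return results;
    }

    private static CriterionResult Summarize(string name, List<MyUser> rows) =>
        new CriterionResult
        {
            Name = name,
            Count = rows.Count,
            // Average throws on an empty sequence, so guard against it.
            Average = rows.Count == 0 ? (double?)null : rows.Average(u => u.Testval)
        };

    // Illustrative rows: the three from the question plus one made-up row.
    public static List<MyUser> SampleData() => new List<MyUser>
    {
        new MyUser { UserId = 1, Property1 = 1, Property2 = 1, Property3 = 10, Testval = 35 },
        new MyUser { UserId = 2, Property1 = 1, Property2 = 2, Property3 = 3, Testval = 45 },
        new MyUser { UserId = 3, Property1 = 2, Property2 = 5, Property3 = 6, Testval = 55 },
        new MyUser { UserId = 4, Property1 = 3, Property2 = 4, Property3 = 6, Testval = 15 },
    };

    public static Dictionary<string, Func<MyUser, bool>> SampleCriteria() =>
        new Dictionary<string, Func<MyUser, bool>>
        {
            { "a", u => u.Property1 == 1 && u.Property3 == 10 },
            { "b", u => u.Property1 != 1 && u.Property2 == 5 },
        };

    public static void Main()
    {
        foreach (var r in Build(SampleData(), SampleCriteria()))
            Console.WriteLine($"{r.Name}: count = {r.Count}, average = {r.Average}");
    }
}
```

Unlike the grouping approach, a row that satisfies several criteria is counted under each of them here; whether that is desirable depends on how the criteria are meant to overlap.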

You can do it by projecting the matching rows into an anonymous type that holds the values you need.
public void Test()
{
    var list = new List<User>();
    list.Add(new User {UserId = 1, Property1 = 1, Property2 = 1, Property3 = 10, Testval = 35});
    list.Add(new User {UserId = 2, Property1 = 2, Property2 = 2, Property3 = 3, Testval = 45});
    list.Add(new User {UserId = 3, Property1 = 5, Property2 = 5, Property3 = 6, Testval = 55});
    Func<User, bool> crit = u => u.Property1 == 1 && u.Property3 == 10;
    var zz = list.Where(crit)
        .GroupBy(t => 1) // constant key: one group holding every matching row
        .Select(w => new
        {
            average = w.Average(a => a.Testval),
            count = w.Count(),
            rest = list.Except(list.Where(crit)).Average(a => a.Testval)
        }).Single(); // throws if no row matches crit
}

Related

How can i get specific item in a list using LINQ

So I am learning the basics of LINQ and I'm having a little trouble.
How can I get the intervals of the stations between the entry name and the exit name? I'm having a hard time solving this.
This is the code:
List<list> list = new List<list>();
station1 = new list()
{
no = 1,
interval = 0,
name = "name1",
};
station2 = new list()
{
no = 2,
interval = 1,
name = "name2",
};
station3 = new list()
{
no= 3,
interval = 2,
name = "name3",
};
station4 = new list()
{
no = 4,
interval = 1,
name = "name4",
};
station5 = new list()
{
no = 5,
interval = 1,
name = "name5",
};
For example, I enter the entry station and the exit station (name1, name5).
I want to add up the intervals of the stations between name1 and name5.
So the process will be
output = name2.interval = 1 + name3.interval = 2 + name4.interval = 1 ;
total interval = 4
What I tried, which is wrong and where I am stuck:
interval = list.GetRange(entry, exit);
This only gives me the interval of the entry, so I need to add a filter. I have been trying this and that with no luck. If anyone could give me more hints or be of some assistance, it would be greatly appreciated.
I suggest using Skip and Take:
int total = list
.OrderBy(item => item.no) // if stations are not ordered
.SkipWhile(item => item.name != "name1") // skip before 1st station
.Skip(1) // skip 1st station
.TakeWhile(item => item.name != "name5") // take up to the last station
.Sum(item => item.interval); // sum intervals
First, you'd have to get the no of stations with "name1" and "name5". After that, you can get the total with a LINQ query like this:
var no1 = list.First(x => x.name == "name1").no;
var no5 = list.First(x => x.name == "name5").no;
var total = list
.Where(x => x.no > no1 && x.no < no5)
.Sum(x => x.interval);
This sample assumes that the stations exist. After getting the no of the stations it filters the list for items with a no between the stations and afterwards builds the sum of the interval field for these items.
In addition, it iterates the list several times. If you want to find the stations by name more efficiently, you could change your list to a Dictionary<string, list> where name is the key and the item is the value. Then you can simply look the items up by name and iterate the list only once. In memory with a limited number of items, the difference will not be too big between the list and the dictionary.
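The dictionary idea could be sketched as follows. The `Station` class and the `StationIntervals.SumBetween` helper are hypothetical names (the question's class is called `list`), station names are assumed to be unique, and handling a reversed trip by comparing the two `No` values is an addition that the question does not ask for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical station type with the three fields from the question.
public class Station
{
    public int No { get; set; }
    public int Interval { get; set; }
    public string Name { get; set; }
}

public static class StationIntervals
{
    // Sums the intervals of the stations strictly between entry and exit.
    public static int SumBetween(IReadOnlyList<Station> stations, string entry, string exit)
    {
        // Name -> station lookup; ToDictionary throws if a name occurs twice.
        var byName = stations.ToDictionary(s => s.Name);

        if (!byName.TryGetValue(entry, out var entryStation) ||
            !byName.TryGetValue(exit, out var exitStation))
        {
            throw new ArgumentException("Unknown station name.");
        }

        // Assumption: a trip in the opposite direction covers the same stations.
        int low = Math.Min(entryStation.No, exitStation.No);
        int high = Math.Max(entryStation.No, exitStation.No);

        return stations.Where(s => s.No > low && s.No < high).Sum(s => s.Interval);
    }

    // The five stations from the question.
    public static List<Station> SampleStations() => new List<Station>
    {
        new Station { No = 1, Interval = 0, Name = "name1" },
        new Station { No = 2, Interval = 1, Name = "name2" },
        new Station { No = 3, Interval = 2, Name = "name3" },
        new Station { No = 4, Interval = 1, Name = "name4" },
        new Station { No = 5, Interval = 1, Name = "name5" },
    };

    public static void Main()
    {
        Console.WriteLine(SumBetween(SampleStations(), "name1", "name5"));
    }
}
```

If many lookups are made against the same station list, building the dictionary once and reusing it would avoid rebuilding it on every call.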

LINQ complex search query advice

I need some help with some LINQ logic that I am trying to do
Using EF, I have this result set:
Basically, what I want to achieve is this: if the user wants to find an element that has TagID 3 AND TagID 4, it should only return Low, Medium.
This should ignore Low, as that element doesn't have TagID 4.
Also, if the user just wants the elements that contain TagID 3, it should return Low, Medium and Low, as both contain TagID 3.
I have tried this just to get Low, Medium back (the harder logic), but to no avail.
var result = result.Where(x => x.TagID == 3 && x.TagID == 4).ToList();
A step in the right direction is all that is needed, please.
This should work if each tag appears only once per ID (i.e. there are no items with the same ID and the same tag ID).
I don't think EF will be able to translate this to SQL, so materialize the results first.
var q = result.ToList();
var tagIDs = new HashSet<int>() { 3, 4 };
IEnumerable<string> itemContents =
q.Where(x => tagIDs.Contains(x.TagID)). // Keep only the tags we're interested in
GroupBy(x => x.Id). // Group the items by ID
Where(g => (g.Count() == tagIDs.Count)). // Select the groups having the right number of items
SelectMany(g => g.Select(x => x.ItemContent)). // Extract ItemContent
Distinct(); // Remove duplicates
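If the same tag can appear more than once for an ID, the count check above would miscount. A possible variation, sketched here on in-memory data only, counts distinct tag IDs per group instead. `TaggedItem` and `TagSearch.WithAllTags` are hypothetical names, the duplicate row is made up to exercise that case, and the sketch returns one entry per Id rather than distinct ItemContent values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type matching the result set described in the question.
public class TaggedItem
{
    public int Id { get; set; }
    public int TagID { get; set; }
    public string ItemContent { get; set; }
}

public static class TagSearch
{
    // Returns the ItemContent of every Id that carries all of the wanted tags.
    public static List<string> WithAllTags(IEnumerable<TaggedItem> rows, IEnumerable<int> wanted)
    {
        var tagIDs = new HashSet<int>(wanted);
        return rows
            .Where(x => tagIDs.Contains(x.TagID))   // keep only the tags of interest
            .GroupBy(x => x.Id)                     // one group per item
            .Where(g => g.Select(x => x.TagID).Distinct().Count() == tagIDs.Count)
            .Select(g => g.First().ItemContent)     // one entry per Id
            .ToList();
    }

    // Rows from the question's example, plus a made-up duplicate of (12, 3).
    public static List<TaggedItem> SampleRows() => new List<TaggedItem>
    {
        new TaggedItem { Id = 12, TagID = 3, ItemContent = "Low" },
        new TaggedItem { Id = 12, TagID = 3, ItemContent = "Low" },
        new TaggedItem { Id = 13, TagID = 3, ItemContent = "Low, Medium" },
        new TaggedItem { Id = 13, TagID = 4, ItemContent = "Low, Medium" },
    };

    public static void Main()
    {
        foreach (var content in WithAllTags(SampleRows(), new[] { 3, 4 }))
            Console.WriteLine(content);
    }
}
```

With a plain `g.Count()` check, the duplicated (12, 3) row would make Id 12 look as if it had two matching tags; counting distinct tag IDs avoids that.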
I don't know whether EF can translate this, but here is an example:
var data = new[]
{
new { Id = 12, TagID = 3, ItemContent = "Low" },
new { Id = 13, TagID = 3, ItemContent = "Low, Medium" },
new { Id = 13, TagID = 4, ItemContent = "Low, Medium" },
};
var search = new List<int>(new[] { 3, 4 });
var result = data
// group the items on ItemContent
.GroupBy(item => item.ItemContent, d => d, (k, g) => new { ItemContent = k, g })
// only select groups when all searchitems are found in a list of TagID
.Where(groupedItem => search.All(i => groupedItem.g.Select(y => y.TagID).Contains(i)))
// select the result
.Select(groupedItem => groupedItem);
foreach (var r in result)
Console.WriteLine(r.ItemContent);
Console.ReadLine();

Linq Group By, Skip 1 where the skipped 1 was the most recent

I have a data set of objects that I have stored in the events List (the variables have all been declared earlier at class level):
[SetUp]
public void Setup()
{
eventLogObj = new EventLogObj();
event1 = new EventLogObj() { RecordId = 1, TableKey = "PERSON_CODE=1", Status = "S", EventTime = Convert.ToDateTime("2013-07-15 14:00:00") };
event2 = new EventLogObj() { RecordId = 2, TableKey = "PERSON_CODE=2", Status = "S", EventTime = Convert.ToDateTime("2013-07-15 13:00:00") };
event3 = new EventLogObj() { RecordId = 3, TableKey = "PERSON_CODE=3", Status = "S", EventTime = Convert.ToDateTime("2013-07-15 13:00:00") };
event4 = new EventLogObj() { RecordId = 4, TableKey = "PERSON_CODE=2", Status = "S", EventTime = Convert.ToDateTime("2013-07-15 14:00:00") };
event5 = new EventLogObj() { RecordId = 5, TableKey = "PERSON_CODE=1", Status = "S", EventTime = Convert.ToDateTime("2013-07-15 13:00:00") };
events = new List<EventLogObj>() { event1, event2, event3, event4, event5 };
}
I was initially just extracting the duplicates - which worked (below)
[Test]
public void StoreOnlyDuplicateDetailsFromRowsIntoCollection()
{
var duplicates = events.GroupBy(s => s.TableKey)
.SelectMany(grp => grp.Skip(1)).ToList();
Assert.AreEqual(2, duplicates.Count);
}
However, now I want to extract the duplicates with the lowest dates, and I'm not quite sure how to adjust the LINQ query I set up.
Here is what I have done so far, but it fails.
If you are wondering what duplicates2 is, it is a failed attempt to implement this: LINQ: Group by aggregate but still get information from the most recent row?
[Test]
public void pickDuplicateEventWithLeastDate()
{
var duplicates = events
//.OrderBy(e => e.EventTime)
.GroupBy(s => s.TableKey)
.SelectMany(grp => grp.Skip(1))
.ToList();
var duplicates2 = from res in events
group res by res.TableKey into g
select new
{
Count = g.Count(),
MemberID = g.Key,
MostRecent = g.OrderByDescending(x => x.EventTime)
.First()
};
Assert.AreEqual(2, duplicates.Count);
var e1 = duplicates[0];
var e2 = duplicates[1];
Assert.AreEqual(e1.EventTime, Convert.ToDateTime("2013-07-15 13:00:00"));
Assert.AreEqual(e2.EventTime, Convert.ToDateTime("2013-07-15 13:00:00"));
}
If you want to try it without having to set up interfaces, classes, etc. in Visual Studio, see https://dotnetfiddle.net/YYoSfM and fiddle about. Basically, if the tests pass, you should get nothing in the 'console window'.
Skip(1) drops the first item of each group, so the first item must be the one with the latest date. To achieve this, order each group with OrderByDescending inside the SelectMany, just before the Skip.
var duplicates = events
.GroupBy(s => s.TableKey)
.SelectMany(grp => grp.OrderByDescending(e => e.EventTime)
.Skip(1))
.ToList();
If you want the remaining duplicates to be ordered ascending, just add a .OrderBy(e => e.EventTime) right after the Skip.
Oh, and your test data is bogus. The first data item with PERSON_CODE=1 is on 2013-07-13 but should be on 2013-07-15 to pass the test suite.
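For reference, here is a self-contained sketch of the query applied to the five events listed in the question's Setup method. The `EventLogObj` class is reduced to the four properties used there, and `DuplicateEvents.OlderDuplicates` is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced version of the question's class: only the properties used in Setup.
public class EventLogObj
{
    public int RecordId { get; set; }
    public string TableKey { get; set; }
    public string Status { get; set; }
    public DateTime EventTime { get; set; }
}

public static class DuplicateEvents
{
    // For every TableKey, keeps everything except the most recent event.
    public static List<EventLogObj> OlderDuplicates(IEnumerable<EventLogObj> events) =>
        events.GroupBy(e => e.TableKey)
              .SelectMany(g => g.OrderByDescending(e => e.EventTime).Skip(1))
              .ToList();

    // The five events from the question's Setup method.
    public static List<EventLogObj> SampleEvents() => new List<EventLogObj>
    {
        new EventLogObj { RecordId = 1, TableKey = "PERSON_CODE=1", Status = "S", EventTime = new DateTime(2013, 7, 15, 14, 0, 0) },
        new EventLogObj { RecordId = 2, TableKey = "PERSON_CODE=2", Status = "S", EventTime = new DateTime(2013, 7, 15, 13, 0, 0) },
        new EventLogObj { RecordId = 3, TableKey = "PERSON_CODE=3", Status = "S", EventTime = new DateTime(2013, 7, 15, 13, 0, 0) },
        new EventLogObj { RecordId = 4, TableKey = "PERSON_CODE=2", Status = "S", EventTime = new DateTime(2013, 7, 15, 14, 0, 0) },
        new EventLogObj { RecordId = 5, TableKey = "PERSON_CODE=1", Status = "S", EventTime = new DateTime(2013, 7, 15, 13, 0, 0) },
    };

    public static void Main()
    {
        foreach (var e in OlderDuplicates(SampleEvents()))
            Console.WriteLine($"{e.RecordId} {e.TableKey} {e.EventTime:yyyy-MM-dd HH:mm}");
    }
}
```

A key that occurs only once, such as PERSON_CODE=3 here, contributes nothing, because skipping the single most recent event leaves its group empty.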

Delete the repeated Item inside a List of object C#

I want to compare the elements of a list of objects, delete the repeated items, and increase the quantity of the item that is kept (C# code). I don't know whether I should use LINQ, a for loop, or a foreach statement. I have a list of OrderItem; I want to delete the OrderItems that have the same FK_ArticleId and increment the Quantity of the remaining OrderItem. Example:
for (int i=1 ; i < lot.Count ; i ++)
{
for (j = i + 1; j <= lot.Count; j++)
{
if (lot[i].FK_ArticleId.Equals(lot[j].FK_ArticleId))
{
lot[i].Quantity += lot[j].Quantity;
lot.Remove(lot[j]);
}
}
}
You can use the LINQ GroupBy method and process the resulting groups. Given the class
public class Article
{
public int FK_ArticleId { get; set; }
public int Quantity { get; set; }
}
and the following list:
var list = new List<Article>()
{
new Article() {FK_ArticleId = 1, Quantity = 10}
, new Article() {FK_ArticleId = 1, Quantity = 10}
, new Article() {FK_ArticleId = 1, Quantity = 10}
, new Article() {FK_ArticleId = 2, Quantity = 100}
, new Article() {FK_ArticleId = 2, Quantity = 100}
, new Article() {FK_ArticleId = 3, Quantity = 1000}
};
The following linq query returns what you need:
list.GroupBy(a => a.FK_ArticleId)
.Select(g => new Article() {FK_ArticleId = g.Key, Quantity = g.Sum(a => a.Quantity)});
// article id 1, quantity 30
// article id 2, quantity 200
// article id 3, quantity 1000
If you don't want to create a new article, you can take the first of the resulting group and set its Quantity to the correct value:
var results = list.GroupBy(a => a.FK_ArticleId)
.Select(g =>
{
var firstArticleOfGroup = g.First();
firstArticleOfGroup.Quantity = g.Sum(a => a.Quantity);
return firstArticleOfGroup;
});
I didn't test but this should give you an idea of the power of linq...
var stuff = lot
.GroupBy(p => p.FK_ArticleId)
.Select(g => g)
.ToList();
This should give you groups of articleIDs whereby you can easily get counts, create new consolidated lists etc.
For starters you can't use foreach because you're modifying the list and the Enumerator will throw an exception. You can do the following with Linq:
var grouped = lot.GroupBy(x => x.FK_ArticleId).ToArray();
foreach(var group in grouped)
{
group.First().Quantity = group.Sum(item => item.Quantity);
}
Now, first item in each group will contain the sum of all the quantities of items with the same FK_ArticleId. Now, to get the results use this:
var results = grouped.Select(g => g.First());
At this point it's purely your decision whether to return the results as a separate collection or insert them into the original list. If you opt for the second approach, don't forget to clear the list first:
lot.Clear();
lot.AddRange(results);
EDIT
A more elegant solution to accumulating the Quantity property into the first item of each group would be the following:
lot.GroupBy(x=>x.FK_ArticleId)
    .Select(g => g.Aggregate((acc, item) =>
    {
        acc.Quantity += item.Quantity; // add each later item's quantity to the first item
        return acc;
    }));
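If mutating the existing items inside a query feels risky, a variation that builds fresh objects and then replaces the list contents might look like this. `OrderItem` is reduced to the two properties mentioned in the question, and `OrderItemMerger.MergeDuplicates` is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced version of the question's class: only the two properties it mentions.
public class OrderItem
{
    public int FK_ArticleId { get; set; }
    public int Quantity { get; set; }
}

public static class OrderItemMerger
{
    // Replaces the contents of lot with one item per FK_ArticleId,
    // whose Quantity is the sum over all items with that id.
    public static void MergeDuplicates(List<OrderItem> lot)
    {
        var merged = lot
            .GroupBy(item => item.FK_ArticleId)
            .Select(g => new OrderItem
            {
                FK_ArticleId = g.Key,
                Quantity = g.Sum(item => item.Quantity)
            })
            .ToList(); // materialize before the source list is cleared

        lot.Clear();
        lot.AddRange(merged);
    }

    // Same values as the sample list in the answer above.
    public static List<OrderItem> SampleLot() => new List<OrderItem>
    {
        new OrderItem { FK_ArticleId = 1, Quantity = 10 },
        new OrderItem { FK_ArticleId = 1, Quantity = 10 },
        new OrderItem { FK_ArticleId = 1, Quantity = 10 },
        new OrderItem { FK_ArticleId = 2, Quantity = 100 },
        new OrderItem { FK_ArticleId = 2, Quantity = 100 },
        new OrderItem { FK_ArticleId = 3, Quantity = 1000 },
    };

    public static void Main()
    {
        var lot = SampleLot();
        MergeDuplicates(lot);
        foreach (var item in lot)
            Console.WriteLine($"article id {item.FK_ArticleId}, quantity {item.Quantity}");
    }
}
```

The `ToList()` call matters here: the query is lazy and reads from `lot`, so it has to be evaluated before `lot.Clear()` runs.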

how to get an ordered list with default values using linq

I have an ICollection of records (userID, itemID, rating) and an IEnumerable of items.
For a specific userID and each itemID from a set of itemIDs, I need to produce a list of the user's ratings for the items, or 0 if no such record exists. The list should be ordered by the items.
Example:
records = [(1,1,2),(1,2,3),(2,3,1)]
items = [3,1]
userID = 1
result = [0,2]
My attempt:
dataset.Where((x) => (x.userID == uID) & items.Contains(x.iID)).Select((x) => x.rating);
It does the job, but it doesn't return 0 as the default value and it isn't ordered.
I'm new to C# and LINQ; a pointer in the right direction would be very much appreciated.
Thank you.
This does the job:
var records = new int[][] { new int[] { 1, 1, 2 }, new int[] { 1, 2, 3 }, new int[] { 2, 3, 1 } };
var items = new int[] { 3, 1 };
var userId = 1;
var result = items.Select(i =>
{
// When there's a match
if (records.Any(r => r[0] == userId && r[1] == i))
{
// Return all numbers
return records.Where(r => r[0] == userId && r[1] == i).Select(r => r[2]);
}
else
{
// Just return 0
return new int[] { 0 };
}
}).SelectMany(r => r); // flatten the int[][] to int[]
// output
result.ToList().ForEach(i => Console.Write("{0} ", i));
Console.ReadKey(true);
How about:
dataset.Where((x) => (x.userID == uID)).Select((x) => items.Contains(x.iID) ? x.rating : 0)
This does the job, but whether it's a maintainable/readable solution is a topic for another discussion:
// the example input from the question, written as anonymous-type records
var records = new[]
{
    new { UserId = 1, ItemId = 1, Rating = 2 },
    new { UserId = 1, ItemId = 2, Rating = 3 },
    new { UserId = 2, ItemId = 3, Rating = 1 },
};
var items = new[] { 3, 1 };
var userId = 1;
var output = items
    .OrderByDescending(i => i)
    .GroupJoin(records,
        i => i,
        r => r.ItemId,
        (i, r) => new { ItemId = i, Records = r })
    .Select(g => g.Records.FirstOrDefault(r => r.UserId == userId))
    .Select(r => r == null ? 0 : r.Rating);
How this query works:
The ordering is obvious.
The ugly GroupJoin joins every element from items with all records that share the same ItemId into an anonymous type {ItemId, Records}.
Next, we select the first record for each entry that matches userId; if none is found, null is returned (thanks to FirstOrDefault).
The last thing we do is check whether we have a record (in which case we select its Rating) or not (in which case we return 0).
How about this? Your question sounds a bit like an outer join from SQL, and you can do this with GroupJoin and SelectMany:
var record1 = new Record() { userID = 1, itemID = 1, rating = 2 };
var record2 = new Record() { userID = 1, itemID = 2, rating = 3 };
var record3 = new Record() { userID = 2, itemID = 3, rating = 1 };
var records = new List<Record> { record1, record2, record3 };
int userID = 1;
var items = new List<int> { 3, 1 };
var results = items
.GroupJoin( records.Where(r => r.userID == userID), item => item, record => record.itemID, (item, record) => new { item, ratings = record.Select(r => r.rating) } )
.OrderBy( itemRating => itemRating.item)
.SelectMany( itemRating => itemRating.ratings.DefaultIfEmpty(), (itemRating, rating) => rating);
To explain what is going on:
For each item, GroupJoin gets the list of ratings (or an empty list if there is no rating) for the specified user.
OrderBy is obvious.
SelectMany flattens the rating lists, providing a zero if a rating list is empty (via DefaultIfEmpty).
Hope this makes sense.
Be aware, if there is more than one rating for an item by a user, they will all appear in the final list.
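If the result should follow the order in which the items are given (as in the question's example, where items = [3,1] yields [0,2]) and contain exactly one value per item, a dictionary lookup is another option. This is a sketch under those assumptions: `RatingLookup.RatingsFor` is a hypothetical helper, and when a user has rated an item more than once it keeps only the first rating.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Record type with the three fields from the question.
public class Record
{
    public int userID { get; set; }
    public int itemID { get; set; }
    public int rating { get; set; }
}

public static class RatingLookup
{
    // Returns one rating per requested item, in the order of items; 0 when the user has none.
    public static List<int> RatingsFor(IEnumerable<Record> records, int userID, IEnumerable<int> items)
    {
        // itemID -> rating for this user; the first rating wins if there are several.
        var byItem = records
            .Where(r => r.userID == userID)
            .GroupBy(r => r.itemID)
            .ToDictionary(g => g.Key, g => g.First().rating);

        return items
            .Select(i => byItem.TryGetValue(i, out var rating) ? rating : 0)
            .ToList();
    }

    // The records from the question's example.
    public static List<Record> SampleRecords() => new List<Record>
    {
        new Record { userID = 1, itemID = 1, rating = 2 },
        new Record { userID = 1, itemID = 2, rating = 3 },
        new Record { userID = 2, itemID = 3, rating = 1 },
    };

    public static void Main()
    {
        var result = RatingsFor(SampleRecords(), 1, new[] { 3, 1 });
        Console.WriteLine(string.Join(",", result));
    }
}
```

Sorting `items` before passing it in gives a result ordered by item ID instead, if that is what "ordered by the items" is meant to say.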
