Count groups using linq - c#

I have an object called simulation results.
public class SimulationResult
{
public Horse Winner {get;set;}
public Horse Second {get;set;}
public Horse Third {get;set;}
public Horse Fourth {get;set;}
}
public class Horse
{
public Guid Id{get;set;}
}
So, I have a list of 50,000 SimulationResult objects. How can I determine the 50 most common results?
I tried using LINQ groupBy but the horseId appears in each object and it doesn't allow multiple occurrences of one value.
EDIT
Sorry, thought it was clear.
So we have 8 horses total. Say horse id is 1-8.
So in simulation result 1 the winner is 1, second is 2, third is 3, fourth is 4.
In simulation result 2 first is 5, second is 6 , third is 7, fourth is 8.
In simulation result 3 first is 1, second is 2, third is 3, fourth is 4.
So result set 1 and result set 3 are equal. So in this sample, winner 1 second 2 third 3 fourth 4 is the most common result.

I tried using LINQ groupBy but the horseId appears in each object and it doesn't allow multiple occurrences of one value.
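If you mean using an anonymous type as explained in Grouping by Composite Keys: although most of the time we can let the compiler infer the member names for us, we can always specify them explicitly. Here that is necessary, because every inferred name would be Id, and an anonymous type cannot contain two members with the same name: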
var topResults = simulationResults
.GroupBy(r => new
{
WinnerId = r.Winner.Id,
SecondId = r.Second.Id,
ThirdId = r.Third.Id,
FourthId = r.Fourth.Id,
})
.OrderByDescending(g => g.Count())
.Select(g => g.First()) // or new { Result = g.First(), Count = g.Count() } if you need the count
.Take(50)
.ToList();
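To see the grouping at work on the three sample results from the question's edit, here is a minimal, self-contained sketch. The names ResultCount, ResultStats and TopResults are made up for illustration; only the GroupBy with explicitly named key members comes from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Horse
{
    public Guid Id { get; set; }
}

public class SimulationResult
{
    public Horse Winner { get; set; }
    public Horse Second { get; set; }
    public Horse Third { get; set; }
    public Horse Fourth { get; set; }
}

// Hypothetical holder for one representative result plus how often it occurred.
public class ResultCount
{
    public SimulationResult Result { get; set; }
    public int Count { get; set; }
}

public static class ResultStats
{
    // Groups by the four horse ids (in finishing order) and returns
    // the 'take' most frequent finishing orders, most frequent first.
    public static List<ResultCount> TopResults(IEnumerable<SimulationResult> results, int take)
    {
        return results
            .GroupBy(r => new
            {
                WinnerId = r.Winner.Id,
                SecondId = r.Second.Id,
                ThirdId = r.Third.Id,
                FourthId = r.Fourth.Id,
            })
            .Select(g => new ResultCount { Result = g.First(), Count = g.Count() })
            .OrderByDescending(rc => rc.Count)
            .Take(take)
            .ToList();
    }
}
```

With the question's sample (results 1 and 3 are 1-2-3-4, result 2 is 5-6-7-8), ResultStats.TopResults(results, 50) should return two entries: the 1-2-3-4 order with Count 2, followed by 5-6-7-8 with Count 1.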

Simplest answer to your question:
class ExactResult {
public String CombinedId { get; set; }
public int Count { get; set; }
}
// Build one string key per result, then count how many results share each key.
var top50 = resultList
.GroupBy(l => l.Winner.Id.ToString() + l.Second.Id.ToString() + l.Third.Id.ToString() + l.Fourth.Id.ToString())
.Select(g => new ExactResult { CombinedId = g.Key, Count = g.Count() })
.OrderByDescending(e => e.Count)
.Take(50);
The answer is not very meaningful, though. If what you're really after is the four most likely winners across a set of results, this is not the way to go about it: it only shows the results that occur most often with the EXACT same four horses in the same order.
What you're probably looking for is statistical analysis, or maybe the spread, which is more complicated than what you're actually asking.

Maybe this is what you're looking for:
var q = (from x in mySimulationResultList
group x by x into g
let count = g.Count()
orderby count descending
select new { Value = g.Key, Count = count }).Take(50);
foreach (var x in q)
{
Console.WriteLine($"Value: {x.Value.ToString()} Count: {x.Count}");
}
If you want meaningful output from Console.WriteLine, you need to override ToString for Horse and SimulationResult.
You must also override Equals and GetHashCode for SimulationResult, something like this:
public override bool Equals(object obj)
{
SimulationResult simResult = obj as SimulationResult;
if (simResult == null)
{
return false;
}
// Compare the horses by Id. Using == on Horse would only compare
// references unless Horse itself defines equality.
return Winner.Id == simResult.Winner.Id
&& Second.Id == simResult.Second.Id
&& Third.Id == simResult.Third.Id
&& Fourth.Id == simResult.Fourth.Id;
}
public override int GetHashCode()
{
unchecked // overflow is fine here, just let it wrap
{
int hash = 12;
// Hash the same values that Equals compares.
hash = hash * 5 + Winner.Id.GetHashCode();
hash = hash * 5 + Second.Id.GetHashCode();
hash = hash * 5 + Third.Id.GetHashCode();
hash = hash * 5 + Fourth.Id.GetHashCode();
return hash;
}
}
Sources: here (group by query) and here (comparing objects against each other)

Related

Linq order list using list property

If I have a list of football teams, and each team contains a list of matches, and each match has a goals-scored property, how can I order the list of teams by the most goals scored in the latest match, followed by the match before, and so on?
The number of matches is unknown.
I can't figure out the LINQ, and I'm not having much luck investigating dynamic LINQ.
Many thanks
The number of matches will always be the same, and theoretically there isn't a maximum, although it is reasonable to expect that it will be less than 20. If the number of goals is the same, it will use team name in alphabetical order.
LINQ doesn't do recursion natively. You may need to define a custom comparer to do the recursive search, then pass that to OrderBy. Without seeing the actual structure, the pseudo-code would be:
N = 1
while(true)
{
if L has fewer than N matches
if R has fewer than N matches
return L.Name.CompareTo(R.Name) // both ran out: order by team name
else
return 1 // R has more matches
if R has fewer than N matches // L has more matches
return -1
compare Nth match of each team
if equal
N = N + 1;
else
return compare result
}
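The pseudo-code above could be turned into a real comparer roughly as follows. This is only a sketch under assumptions: the Team and Match classes (with Name, Matches and Goals) are invented here because the question does not show the real types, the last element of Matches is taken to be the latest match, and null teams are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: the question does not show the real classes.
public class Match
{
    public int Goals { get; set; }
}

public class Team
{
    public string Name { get; set; }
    public List<Match> Matches { get; set; } = new List<Match>();
}

// Orders teams by goals in the latest match (highest first), then the
// match before it, and so on; the team name breaks any remaining tie.
public class LatestMatchComparer : IComparer<Team>
{
    public int Compare(Team left, Team right)
    {
        int l = left.Matches.Count - 1;
        int r = right.Matches.Count - 1;

        // Walk both match lists backwards, starting at the latest match.
        while (l >= 0 && r >= 0)
        {
            // Right compared to left, so that more goals sorts first.
            int byGoals = right.Matches[r].Goals.CompareTo(left.Matches[l].Goals);
            if (byGoals != 0) return byGoals;
            l--;
            r--;
        }

        // Both teams ran out of matches at the same time: fall back to the name.
        if (l < 0 && r < 0)
            return string.Compare(left.Name, right.Name, StringComparison.Ordinal);

        // As in the pseudo-code, the team with more matches sorts first.
        return l < 0 ? 1 : -1;
    }
}
```

It would be used as teams.OrderBy(t => t, new LatestMatchComparer()). The loop is iterative rather than recursive, but it follows the same steps as the pseudo-code.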
Recursion doesn't seem to be necessary. Here's an iterative approach.
void Main() {
var teams = CreateTeams().ToArray();
int games = teams.Min(t => t.Games.Count);
var ordered = teams.OrderBy(team => 0);
for (int i = games - 1; i >= 0; i--) {
var captured = i; // the value of i will change, so capture the current value in its own variable
ordered = ordered.ThenByDescending(team => team.Games[captured].Points);
}
ordered = ordered.ThenBy(team => team.Name);
foreach (var team in ordered) {
Console.WriteLine("{0} {1}", team.Name, string.Join(", ", team.Games.Select(game => game.Points)));
}
}
IEnumerable<Team> CreateTeams() {
yield return (new Team("War Donkeys", 1, 2, 3));
yield return (new Team("Fighting Beavers", 2, 2, 3));
yield return (new Team("Angry Potatoes", 2, 1, 3));
yield return (new Team("Wispy Waterfalls", 3, 2, 1));
yield return (new Team("Frisky Felines", 1, 2, 3));
}
class Team {
public string Name { get; set; }
public IList<Game> Games { get; set; }
public Team(string name, params int[] points) {
this.Name = name;
this.Games = points.Select(p => new Game { Points = p }).ToArray();
}
}
class Game {
public int Points { get; set; }
}
The output is
Fighting Beavers 2, 2, 3
Frisky Felines 1, 2, 3
War Donkeys 1, 2, 3
Angry Potatoes 2, 1, 3
Wispy Waterfalls 3, 2, 1

How do I get total Qty using one linq query?

I have two linq queries, one to get confirmedQty and another one is to get unconfirmedQty.
There is a condition for getting unconfirmedQty. It should be average instead of sum.
result = Sum(confirmedQty) + Avg(unconfirmedQty)
Is there any way to just write one query and get the desired result instead of writing two separate queries?
My Code
class Program
{
static void Main(string[] args)
{
List<Item> items = new List<Item>(new Item[]
{
new Item{ Qty = 100, IsConfirmed=true },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
});
int confirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty));
int unconfirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed != true).Average(u => u.Qty));
//Output => Total : 140
Console.WriteLine("Total : " + (confirmedQty + unconfirmedQty));
Console.Read();
}
public class Item
{
public int Qty { get; set; }
public bool IsConfirmed { get; set; }
}
}
Actually, the accepted answer enumerates your items collection 2N + 1 times and adds unnecessary complexity to your original solution. If I came across this piece of code
(from t in items
let confirmedQty = items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty)
let unconfirmedQty = items.Where(o => o.IsConfirmed != true).Average(u => u.Qty)
let total = confirmedQty + unconfirmedQty
select new { tl = total }).FirstOrDefault();
it would take me some time to understand what type of data you are projecting items to. Yes, this query is a strange projection. It creates a SelectIterator to project each item of the sequence, then it creates some range variables, which involves iterating items twice, and finally it selects the first projected item. Basically, you have wrapped your original queries in an additional, useless query:
items.Select(i => {
var confirmedQty = items.Where(o => o.IsConfirmed).Sum(u => u.Qty);
var unconfirmedQty = items.Where(o => !o.IsConfirmed).Average(u => u.Qty);
var total = confirmedQty + unconfirmedQty;
return new { tl = total };
}).FirstOrDefault();
The intent is hidden deep in the code, and you still have the same two nested queries. What can you do here? You can simplify your two queries, make them more readable, and show your intent clearly:
int confirmedTotal = items.Where(i => i.IsConfirmed).Sum(i => i.Qty);
// NOTE: Average will throw an exception if there are no unconfirmed items!
double unconfirmedAverage = items.Where(i => !i.IsConfirmed).Average(i => i.Qty);
int total = confirmedTotal + (int)unconfirmedAverage;
If performance is more important than readability, then you can calculate total in single query (moved to extension method for readability):
public static int Total(this IEnumerable<Item> items)
{
int confirmedTotal = 0;
int unconfirmedTotal = 0;
int unconfirmedCount = 0;
foreach (var item in items)
{
if (item.IsConfirmed)
{
confirmedTotal += item.Qty;
}
else
{
unconfirmedCount++;
unconfirmedTotal += item.Qty;
}
}
if (unconfirmedCount == 0)
return confirmedTotal;
// NOTE: Will not throw if there are no unconfirmed items
return confirmedTotal + unconfirmedTotal / unconfirmedCount;
}
Usage is simple:
items.Total();
BTW, the second solution from the accepted answer is not correct. It's just a coincidence that it returns the correct value, because all your unconfirmed items have equal Qty. That solution calculates a sum instead of an average. A solution with grouping would look like:
var total =
items.GroupBy(i => i.IsConfirmed)
.Select(g => g.Key ? g.Sum(i => i.Qty) : (int)g.Average(i => i.Qty))
.Sum();
Here you group the items into two groups, confirmed and unconfirmed. Then you calculate either the sum or the average based on the group key, and add up the two group values. This is neither a readable nor an efficient solution, but it is correct.
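To check the sum-versus-average point on data where the unconfirmed quantities are not all equal, here is a small sketch. The QtyDemo class, its method names and the sample values (100 confirmed; 10, 20 and 60 unconfirmed) are made up for illustration. The unconfirmed average is (10 + 20 + 60) / 3 = 30, so both variants should give 100 + 30 = 130.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Qty { get; set; }
    public bool IsConfirmed { get; set; }
}

public static class QtyDemo
{
    // Two separate queries: sum of confirmed plus (truncated) average of unconfirmed.
    // Throws InvalidOperationException if there are no unconfirmed items.
    public static int TwoQueries(List<Item> items)
    {
        int confirmedTotal = items.Where(i => i.IsConfirmed).Sum(i => i.Qty);
        double unconfirmedAverage = items.Where(i => !i.IsConfirmed).Average(i => i.Qty);
        return confirmedTotal + (int)unconfirmedAverage;
    }

    // Single grouping query: sum for the confirmed group, average for the other.
    public static int Grouped(List<Item> items)
    {
        return items
            .GroupBy(i => i.IsConfirmed)
            .Select(g => g.Key ? g.Sum(i => i.Qty) : (int)g.Average(i => i.Qty))
            .Sum();
    }

    // Hypothetical sample data with unequal unconfirmed quantities.
    public static List<Item> Sample()
    {
        return new List<Item>
        {
            new Item { Qty = 100, IsConfirmed = true },
            new Item { Qty = 10, IsConfirmed = false },
            new Item { Qty = 20, IsConfirmed = false },
            new Item { Qty = 60, IsConfirmed = false },
        };
    }
}
```

Both QtyDemo.TwoQueries(QtyDemo.Sample()) and QtyDemo.Grouped(QtyDemo.Sample()) return 130, whereas summing the unconfirmed items instead of averaging them would give 190.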

Tricky combination of two lists using Linq

I hope somebody will be able to guide me in right direction here...
public class SubmissionLog
{
public int PKId {get;set;}
public int SubmissionId {get;set;}
public DateTime Created {get;set;}
public int StatusId {get;set;}
}
And this is the data:
1, 123, '1/24/2013 01:00:00', 1
2, 456, '1/24/2013 01:30:00', 1
3, 123, '1/25/2013 21:00:00', 2
4, 456, '1/25/2013 21:30:00', 2
5, 123, '2/25/2013 22:00:00', 1
6, 123, '2/26/2013 21:00:00', 2
7, 123, '2/16/2013 21:30:00', 1
What I am trying to do is the following:
I'd like to know the average time span from StatusId 1 to StatusId 2 on a given day.
So, let's say the date is 2/26/2013; then what I thought would make sense is to first get the list like this:
var endingList = (from sl in db.SubmissionLogs
where (DateTime.Now.AddDays(days).Date == sl.Created.Date) // days = passed number of days to make it 2/26/2013
&& (sl.StatusId == 2)
select sl).ToList();
var endingLookup = endingList.ToLookup(a => a.SubmissionId, a => a.Created); // thought of using a lookup because Dictionary doesn't allow duplicate keys
After that I thought I'd figure out starting points
var startingList = (from sl in db.SubmissionLogs
where endingList.Select(a => a.SubmissionId).ToArray().Contains(sl.SubmissionId)
&& sl.StatusId == 1
select sl).ToList();
And then what I did was the following:
var revisedList = endingLookup.Select(a =>
new SubmissionInterval {
SubmissionId = a.Key,
EndDateTime = endingLookup[a.Key].FirstOrDefault(), // This is where the problem is. This will only grab the first occurrence.
StartDateTime = startLookup[a.Key].FirstOrDefault() // This is where the problem is. This will only grab the first occurrence.
});
And then what I do to get the average is the following (again, this will only include the initial or first occurrences of status 1 and status 2 for a given submission id in the submission log):
return revisedList.Count() > 0 ? revisedList.Select(a=> a.EndDateTime.Subtract(a.StartDateTime).TotalHours).Average() : 0;
So, I hope somebody will understand what my problem here is first of all... To re-cap, I want to get timespan between each status 1 and 2. I pass the date in, and then I have to look up 2's as that ensures me that I will find 1's. If I went the other way around and looked for 1's, then 2's may not exist (don't want that anyway).
At the end I wanna be able to average stuff out...
So let's say if some submission first went from 1 to 2 in a time span of 5h (the code that I left, will get me up to this point), then let's say it got reassigned to 1 and then it went back to 2 in a new time span of 6h, I wanna be able to get both and do the average, so (5+6)/2.
Thanks
I think I understand what you're trying to do. Does this help?
void Main()
{
var list = new List<SubmissionLog>
{
new SubmissionLog(1, 123, "1/24/2013 01:00:00", 1),
new SubmissionLog(2, 456, "1/24/2013 01:30:00", 1),
new SubmissionLog(3, 123, "1/25/2013 21:00:00", 2),
new SubmissionLog(4, 456, "1/25/2013 21:30:00", 2),
new SubmissionLog(5, 123, "2/25/2013 22:00:00", 1),
new SubmissionLog(6, 123, "2/26/2013 21:00:00", 2),
new SubmissionLog(7, 123, "2/16/2013 21:30:00", 1),
};
// split out status 1 and 2
var s1s = list.Where (l => l.StatusId == 1).OrderBy (l => l.Created);
var s2s = list.Where (l => l.StatusId == 2).OrderBy (l => l.Created);
// use a sub-query to get the first s2 after each s1
var q = s1s.Select (s1 => new
{
s1,
s2 = s2s.FirstOrDefault (s2 =>
s1.SubmissionId == s2.SubmissionId &&
s2.Created >= s1.Created
)
}
).Where (s => s.s2 != null && s.s1.PKId < s.s2.PKId); // null check first, otherwise s.s2.PKId throws when no s2 was found
// extract the info we need
// note that TotalSeconds is OK in LINQ to Objects, but you'll
// probably need to use SqlFunctions or equivalent if this is to
// run against a DB.
var q1 = q.Select (x => new
{
Start=x.s1.Created,
End=x.s2.Created,
SubmissionId=x.s1.SubmissionId,
Seconds=(x.s2.Created - x.s1.Created).TotalSeconds
}
);
// group by submissionId and average the time
var q2 = q1.GroupBy (x => x.SubmissionId).Select (x => new {
x.Key,
Count=x.Count (),
Start=x.Min (y => y.Start),
End=x.Max (y => y.End),
Average=x.Average (y => y.Seconds)});
}
public class SubmissionLog
{
public SubmissionLog(int id, int submissionId, string date, int statusId)
{
PKId = id;
SubmissionId = submissionId;
Created = DateTime.Parse(date, CultureInfo.CreateSpecificCulture("en-US"));
StatusId = statusId;
}
public int PKId {get;set;}
public int SubmissionId {get;set;}
public DateTime Created {get;set;}
public int StatusId {get;set;}
}

C# - How to find all items in SortedDictionary which have similar keys?

I have a SortedDictionary:
static SortedDictionary<string, int> myDictionary = new SortedDictionary<string, int>();
where the keys are strings built something like this:
string key = someNumber + " " + row + " " + col + " " + someString;
What I want is to find all the items in the sorted dictionary that have specific row and col. For example if I have the following keys:
1 2 3 p
3 2 3 p
2 2 3 t
5 1 6 p
8 2 1 p
7 2 3 t
I want to get only these keys that have row=2 and col=3:
1 2 3 p
3 2 3 p
2 2 3 t
7 2 3 t
Unfortunately in this case you need to iterate over the whole collection and select the items that match your criteria (so not much use of the dictionary itself):
public IList<int> FindValues(int row, int col)
{
return myDictionary
.Where(item => MatchKey(item.Key, row, col))
.Select(item => item.Value)
.ToList();
}
public bool MatchKey(string key, int row, int col)
{
var splitKey = key.Split();
return splitKey[1] == row.ToString() && splitKey[2] == col.ToString();
// or match the key according to your logic
}
Though if you need to query by row and column often, then it's better to build a different data structure first. Maybe
Dictionary<Coord, IList<int>> myDict;
where Coord is a class/struct that overrides Equals and GetHashCode:
class Coord
{
public int Row { get; set; }
public int Column { get; set; }
}
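As a rough sketch of that idea, the following builds such an index once and then looks up by row and column without scanning. Several things here are assumptions and not part of the answer: Coord is written as an immutable struct, the hypothetical CoordIndex.Build helper stores the original keys (not the int values) in each bucket, and every key is assumed to have the space-separated form shown in the question.

```csharp
using System;
using System.Collections.Generic;

// Value-type key: two Coords are equal when row and column match.
public readonly struct Coord : IEquatable<Coord>
{
    public Coord(int row, int column)
    {
        Row = row;
        Column = column;
    }

    public int Row { get; }
    public int Column { get; }

    public bool Equals(Coord other)
    {
        return Row == other.Row && Column == other.Column;
    }

    public override bool Equals(object obj)
    {
        return obj is Coord other && Equals(other);
    }

    public override int GetHashCode()
    {
        unchecked // overflow is fine, just wrap
        {
            return Row * 397 + Column;
        }
    }
}

public static class CoordIndex
{
    // Hypothetical helper: groups keys of the form
    // "someNumber row col someString" by their (row, col) pair.
    public static Dictionary<Coord, List<string>> Build(IEnumerable<string> keys)
    {
        var index = new Dictionary<Coord, List<string>>();
        foreach (var key in keys)
        {
            var parts = key.Split(' ');
            var coord = new Coord(int.Parse(parts[1]), int.Parse(parts[2]));
            if (!index.TryGetValue(coord, out var bucket))
            {
                bucket = new List<string>();
                index[coord] = bucket;
            }
            bucket.Add(key);
        }
        return index;
    }
}
```

With the six keys from the question, CoordIndex.Build(keys)[new Coord(2, 3)] should contain "1 2 3 p", "3 2 3 p", "2 2 3 t" and "7 2 3 t". The index has to be kept in sync whenever the underlying dictionary changes, which is the price for the faster lookup.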

Fastest way to select distinct values from list based on two properties

I have this list:
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
I'm looking for the fastest way to select distinct objects based on recordNumber, but if there is more than one object with the same recordNumber (here recordNumber=3), I want to select the object based on its name (the name is provided by a parameter).
Thanks
It looks like you are really after something like:
Dictionary<int, List<myobject>> myDataStructure;
That allows you to quickly retrieve by record number. If the List<myobject> with that dictionary key contains more than one entry, you can then use the name to select the correct one.
Note that if your list is not terribly long, an O(n) check that just scans the list checking for the recordNumber and name may be fast enough, in the sense that other things happening in your program could obscure the list lookup cost. Consider that possibility before over-optimizing lookup times.
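A minimal sketch of the dictionary approach might look like this. RecordIndex, Build and Find are hypothetical names, and returning null when nothing matches is just one possible choice:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class myobject
{
    public int recordNumber { get; set; }
    public string name { get; set; }
}

public static class RecordIndex
{
    // Builds the lookup structure once: record number -> all objects with that number.
    public static Dictionary<int, List<myobject>> Build(IEnumerable<myobject> items)
    {
        return items
            .GroupBy(x => x.recordNumber)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    // Returns the only entry for the record number, or, if there are several,
    // the first one whose name matches. Returns null if nothing matches.
    public static myobject Find(Dictionary<int, List<myobject>> index, int recordNumber, string name)
    {
        if (!index.TryGetValue(recordNumber, out var bucket))
            return null;
        if (bucket.Count == 1)
            return bucket[0];
        return bucket.FirstOrDefault(x => x.name == name);
    }
}
```

For the list in the question, Find(index, 3, "n4") returns the n4 object, while Find(index, 1, anyName) returns n1 regardless of the name, because that record number has only one entry.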
Here's the LINQ way of doing this:
Func<IEnumerable<myobject>, string, IEnumerable<myobject>> getDistinct =
(ms, n) =>
ms
.ToLookup(x => x.recordNumber)
.Select(xs => xs.Skip(1).Any()
? xs.Where(x => x.name == n).Take(1)
: xs)
.SelectMany(x => x)
.ToArray();
I just tested this with a 1,000,000 randomly created myobject list and it produced the result in 106ms. That should be fast enough for most situations.
Are you looking for something like this?
class Program
{
static void Main(string[] args)
{
List<myobject> list = new List<myobject>();
list.Add(new myobject { name = "n1", recordNumber = 1 });
list.Add(new myobject { name = "n2", recordNumber = 2 });
list.Add(new myobject { name = "n3", recordNumber = 3 });
list.Add(new myobject { name = "n4", recordNumber = 3 });
//Generates Row Number on the fly
var withRowNumbers = list
.Select((x, index) => new
{
Name = x.name,
RecordNumber = x.recordNumber,
RowNumber = index + 1
}).ToList();
//Generates Row Number with Partition by clause
var withRowNumbersPartitionBy = withRowNumbers
.OrderBy(x => x.RowNumber)
.GroupBy(x => x.RecordNumber)
.Select(g => new { g, count = g.Count() })
.SelectMany(t => t.g.Select(b => b)
.Zip(Enumerable.Range(1, t.count), (j, i) => new { Rn = i, j.RecordNumber, j.Name}))
.Where(i=>i.Rn == 1)
.ToList();
//print the result
withRowNumbersPartitionBy.ToList().ForEach(i => Console.WriteLine("Name = {0} RecordNumber = {1}", i.Name, i.RecordNumber));
Console.ReadKey();
}
}
class myobject
{
public int recordNumber { get; set; }
public string name { get; set; }
}
Result:
Name = n1 RecordNumber = 1
Name = n2 RecordNumber = 2
Name = n3 RecordNumber = 3
Are you looking for a method to do this?
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
public myobject Find(int recordNumber, string name)
{
var matches = list.Where(l => l.recordNumber == recordNumber);
if (matches.Count() == 1)
return matches.Single();
else return matches.Single(m => m.name == name);
}
This will - of course - break if there are multiple matches, or zero matches. You need to write your own edge cases and error handling!
If the name and recordNumber combination is guaranteed to be unique, then you can always use a HashSet<T>.
You can then use recordNumber and name to generate the hash code by using the method described here.
class myobject
{
//override GetHashCode
public override int GetHashCode()
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + recordNumber.GetHashCode();
hash = hash * 23 + name.GetHashCode();
return hash;
}
}
//override Equals
}
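To fill in the part left as a comment above, one possible version with both overrides is sketched below. The Equals implementation and the IEquatable<myobject> interface are additions for illustration, not part of the original answer. Equals and GetHashCode use the same two properties, which is what HashSet<T> relies on.

```csharp
using System;
using System.Collections.Generic;

public class myobject : IEquatable<myobject>
{
    public int recordNumber { get; set; }
    public string name { get; set; }

    // Two objects are equal when both recordNumber and name match.
    public bool Equals(myobject other)
    {
        return other != null
            && recordNumber == other.recordNumber
            && name == other.name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as myobject);
    }

    public override int GetHashCode()
    {
        unchecked // overflow is fine, just wrap
        {
            int hash = 17;
            hash = hash * 23 + recordNumber.GetHashCode();
            hash = hash * 23 + (name?.GetHashCode() ?? 0); // tolerate a null name
            return hash;
        }
    }
}
```

With this in place, adding a second myobject with the same recordNumber and name to a HashSet<myobject> returns false from Add. One caveat: because the properties are settable, changing them while the object is in the set will break lookups for that object.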
