Linq for a list of lists - c#

class Bar
{
public DateTime Time { get; set; }
public double Price { get; set; }
}
class Instrument
{
public List<Bar> Bars { get; set; }
public string Name { get; set; }
public Instrument(string name, string path) {
// set the Bars list here reading from files
}
}
Here are two simplified versions of my classes. I'm creating a custom backtesting platform for currencies. The current problem is to chop off bars where I don't have price data on every instrument.
I read prices from an XML file.
var xlinqBarLists = xlinqInstruments.Select(i => i.Bars);
which is basically
List<List<Bar>> xlinqBarLists
What I want to do is loop through each individual list, find the latest start date and the earliest end date, then lop off all bars outside of that time window. My hacked together code is
var xlinqInstruments = root.Elements("instrument").Select( a =>
new Instrument( a.Element("name").Value, a.Element("path").Value ) );
var xlinqBarLists = xlinqInstruments.Select(i => i.Bars);
DateTime latestStartDate = DateTime.MinValue;
DateTime earliestEndDate = DateTime.MaxValue;
foreach (List<Bar> bars in xlinqBarLists)
{
if (bars.Min(b => b.Time) > latestStartDate)
latestStartDate = bars.Min(b => b.Time);
if (bars.Max(b => b.Time) < earliestEndDate)
earliestEndDate = bars.Max(b => b.Time);
}
foreach (List<Bar> barList in xlinqBarLists)
{
var timeWindowBars = from bar in barList
where bar.Time >= latestStartDate && bar.Time <= earliestEndDate
select bar;
// I need some way to overwrite the original Instrument.Bars property with timeWindowBars
// Suggestions?
}
Can I do this more quickly and efficiently by skipping the foreach loops?

For latest start date and earliest end date you can use
DateTime latestStartDate = xlinqInstruments.Max(i => i.Bars.Min(bar => bar.Time));
DateTime earliestEndDate = xlinqInstruments.Min(i => i.Bars.Max(bar => bar.Time));
And for the last part, maybe you would like to add a parameterless constructor to Instrument and then
var result = xlinqInstruments
.Select(i=>new Instrument()
{
Name = i.Name,
Bars = i.Bars.Where(bar => bar.Time >= latestStartDate
&& bar.Time <=earliestEndDate)
.ToList()
});
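To try the two aggregate lines on sample data, here is a minimal sketch. It assumes .NET 6 or later with top-level statements, and it uses a hypothetical dictionary of timestamps in place of the Instrument and Bar classes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: instrument name -> bar timestamps (prices omitted for brevity).
var barsByInstrument = new Dictionary<string, List<DateTime>>
{
    ["EURUSD"] = new List<DateTime> { new DateTime(2020, 1, 1), new DateTime(2020, 1, 2), new DateTime(2020, 1, 3) },
    ["GBPUSD"] = new List<DateTime> { new DateTime(2020, 1, 2), new DateTime(2020, 1, 3), new DateTime(2020, 1, 4) },
};

// Latest of the per-instrument start dates, earliest of the per-instrument end dates.
DateTime latestStart = barsByInstrument.Values.Max(times => times.Min());
DateTime earliestEnd = barsByInstrument.Values.Min(times => times.Max());

// Keep only the timestamps inside the shared window.
Dictionary<string, List<DateTime>> trimmed = barsByInstrument.ToDictionary(
    pair => pair.Key,
    pair => pair.Value.Where(t => t >= latestStart && t <= earliestEnd).ToList());

Console.WriteLine($"Window: {latestStart:yyyy-MM-dd} to {earliestEnd:yyyy-MM-dd}");
```

Note that this only trims the two ends. If an instrument has a gap in the middle of the window, those bars are still missing afterwards.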

Here's an answer, in case my comment above (copy/pasted below) turns out to describe what you want.
Are you wanting to set the Bars property of each instrument to a
List containing only those objects that fall within the minimum
end date and maximum start date across all Bar objects (including
those present on other Instruments)?
// Get the value of the earliest Bar.Time on each Instrument, and select the most recent of those.
DateTime latestStartDate = xlinqInstruments.Max(instrument => instrument.Bars.Min(bar => bar.Time));
// Get the value of the latest Bar.Time on each Instrument, and select the earliest of those.
DateTime earliestEndDate = xlinqInstruments.Min(instrument => instrument.Bars.Max(bar => bar.Time));
// Overwrite the Bars collection of each instrument with its contents truncated appropriately.
// I'd suggest doing this with a foreach loop as opposed to what I've provided below, but that's just me.
xlinqInstruments.ForEach(instrument =>
{
instrument.Bars = instrument.Bars.Where(obj => obj.Time >= latestStartDate && obj.Time <= earliestEndDate).ToList();
});
It may be worth noting that the ForEach method requires you to call .ToList() on the xlinqInstruments collection first. In my code, I assume the collection has already been materialized to a List<Instrument>.
You may also be interested in linq's Enumerable.SelectMany method.
Enumerable.SelectMany<TSource, TResult> Method (IEnumerable<TSource>, Func<TSource, IEnumerable<TResult>>)
Projects each element of a sequence to an IEnumerable<T> and flattens the resulting sequences into one sequence.
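As a rough illustration of what SelectMany does with a list of lists, here is a small sketch. It assumes .NET 6 or later with top-level statements, and the sample timestamps are made up; each inner list stands in for one Instrument.Bars.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: one inner list of bar timestamps per instrument.
var barTimesPerInstrument = new List<List<DateTime>>
{
    new List<DateTime> { new DateTime(2020, 1, 1), new DateTime(2020, 1, 2), new DateTime(2020, 1, 3) },
    new List<DateTime> { new DateTime(2020, 1, 2), new DateTime(2020, 1, 3), new DateTime(2020, 1, 4) },
};

// SelectMany flattens the list of lists into a single sequence.
List<DateTime> allTimes = barTimesPerInstrument.SelectMany(times => times).ToList();

// The flat sequence suits whole-collection questions, e.g. distinct timestamps.
List<DateTime> distinctTimes = allTimes.Distinct().OrderBy(t => t).ToList();

Console.WriteLine($"{allTimes.Count} timestamps in total, {distinctTimes.Count} distinct");
```

Flattening discards which instrument each bar came from, so the per-instrument Min and Max used for the time window still need the nested form.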

Call ToList before assigning to xlinqBarLists:
var xlinqBarLists = xlinqInstruments.Select(i => i.Bars).ToList();
Otherwise you're parsing the same XML over and over again.
You should most likely call ToList when creating xlinqInstruments as well, if you want to update them later.
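The re-parsing comes from deferred execution: a Select without ToList reruns its lambda on every enumeration. A minimal sketch of the effect, assuming .NET 6 or later with top-level statements; LoadInstrument is a hypothetical stand-in for the Instrument constructor that reads a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int loadCount = 0;

// Hypothetical stand-in for the expensive Instrument constructor.
string LoadInstrument(string name)
{
    loadCount++;
    return name.ToUpperInvariant();
}

string[] names = { "eurusd", "gbpusd" };

// Deferred query: the lambda runs again on every enumeration.
IEnumerable<string> lazy = names.Select(n => LoadInstrument(n));
foreach (string unused in lazy) { }
foreach (string unused in lazy) { }
int loadsWhenLazy = loadCount;

// Materialized once: later enumerations reuse the stored results.
loadCount = 0;
List<string> materialized = names.Select(n => LoadInstrument(n)).ToList();
foreach (string unused in materialized) { }
foreach (string unused in materialized) { }
int loadsWhenMaterialized = loadCount;

Console.WriteLine($"lazy: {loadsWhenLazy} loads, materialized: {loadsWhenMaterialized} loads");
```

The same effect is why updates get lost: each enumeration of an unmaterialized query creates fresh objects, so a Bars list assigned during one pass is gone on the next.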

Related

Finding duplicate items then selecting one with closest date to currentDate

Note: Using Windows Mobile 6.5 Compact Framework.
I have a collection of the following object.
public class RFileModel
{
public List<string> RequiredFilesForR = new List<string>();
public string Date { get; set; }
public string RouteId { get; set; }
}
var ListOfRFileModels = new List<RFileModel>();
There is the chance that the same RouteId will be in multiple instances of RFileModel but with a different Date.
I'm trying to identify the duplicates and select only one, the one closest to the current date.
I have the following LINQ so far:
var query = ListOfRFileModels.GroupBy(route => route.RouteId)
.OrderBy(newGroup => newGroup.Key)
.Select(newGroup => newGroup).ToList();
But I don't think this is what I need, since it still returns all elements. I was expecting a list of non unique RouteId, that way I can iterate each non-unique id and compare dates to see which one to keep.
How can I accomplish this with LINQ or just plain ole foreach?
Your expression sorts groups, not group elements. Here is how to fix it:
DateTime currentDate = ...
var query = ListOfRFileModels
.GroupBy(route => route.RouteId)
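.Select(g => g.OrderBy(fm => (currentDate - DateTime.Parse(fm.Date)).Duration()).First())
.ToList();
Because Date is declared as a string in RFileModel, it is parsed with DateTime.Parse first. The expression (currentDate - DateTime.Parse(fm.Date)).Duration() produces the absolute difference between the current date and the date of the RFileModel object, so dates before and after the current date are treated alike. The object with the smallest difference ends up in the first position of the ordered sequence. The call to First() picks it from the group to produce the final result.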
Assuming you want ONLY the members with duplicates, take @dasblinkenlight's answer and add a Where clause: .Where(grp => grp.Count()>1):
DateTime currentDate = DateTime.Now;
var query = ListOfRFileModels
.GroupBy(route => route.RouteId)
.Where(grp => grp.Count()>1)
.Select(g => g.OrderBy(fm => (currentDate - DateTime.Parse(fm.Date)).Duration()).First()) // Date is a string, so parse it first
.ToList();

Linq sum on object w/o group

This seems simple enough but I'm not getting it for some reason.
Given:
public class Foo
{
public List<Bar> Bars { get; private set; }
public Bar Totals { get; private set; }
public Foo()
{
// Blah blah something to populate List of Bars
this.Bars = new List<Bar>()
{
new Bar("Some dude", 50, 1),
new Bar("Some other dude", 60,25)
};
// Calculate Total
var totals = Bars
.GroupBy(gb => gb.CustomerName) // When I comment out line i get "Bar does not contain a definition for "Sum" and no extension...." I want the sum of this without the group.
.Select(s => new
{
Cost = s.Sum(x => x.Cost),
Donation = s.Sum(x => x.Donation),
}
).ToList();
Totals = new Bar("Totals", totals[0].Cost, totals[0].Donation);
}
}
public class Bar
{
public string CustomerName { get; private set; }
public int Cost { get; private set; }
public int Donation { get; private set; }
public int Total { get { return Cost + Donation; } }
public Bar(string customerName, int cost, int donation)
{
this.CustomerName = customerName;
this.Cost = cost;
this.Donation = donation;
}
}
I'm having a few problems here:
-This works with a group by, but if i take out the group by which is my end goal I get "Bar does not contain a definition for "Sum" and no extension....". I want this sum on the entire collection, so do not want a group by.
-I'm creating an anon object before placing into a Bar because I'm not sure how to create a Bar without a parameterless constructor (and I can't add one to this particular class)
-I don't like accessing the "var totals" data using index 0 - should I not be ToListing at the end? If not, how do i access the properties? totals.Cost does not work.
Please help me figure out the proper way to get around my issue (specifically the 3 bullet points above this paragraph). Pretty new to the fluent (and linq in general)syntax and I'm trying to figure out the right way to do it.
EDIT:
thanks for the responses all. Taking kind of a combination of several answers really got me to what my end goal was (but biggest thanks D Stanley)
This is how I'm implementing now:
public Foo()
{
// ....
// Calculate Total
Totals = new Bar("Totals", Bars.Sum(s => s.Cost), Bars.Sum(s => s.Donation));
}
Guess I was just making it more complicated than it needed to be! :O
The s variable in the lambda is of type Bar if you remove the GroupBy, and Bar has no Sum method. Sum needs a sequence of Bar objects, such as the Bars list itself. So, what I think you want is something like:
var totalCosts = Bars.Sum(x => x.Cost);
var totalDonations = Bars.Sum(x => x.Donation);
but if i take out the group by I get "Bar does not contain a definition for "Sum"
That's because when you take out the GroupBy you're iterating over the individual items instead of a collection of groups. If you want to sum the entire collection use
var totals = new
{
Cost = Bars.Sum(x => x.Cost),
Donation = Bars.Sum(x => x.Donation),
}
;
or if you want a collection with one item, just change your GroupBy:
var totals = Bars
.GroupBy(gb => true) // trivial grouping
.Select(s => new Bar("Totals", s.Sum(x => x.Cost), s.Sum(x => x.Donation))) // Bar has no parameterless constructor
.ToList();
-I'm casting to an anon object before placing into a Bar because I'm not sure how to cast it in without a parameterless constructor (and I can't add one to this particular class)
Just change your projection to
var totals = new Bar("Totals", Bars.Sum(x => x.Cost), Bars.Sum(x => x.Donation));
I don't like accessing the "var totals" data using index 0 - should I not be ToListing at the end? If not, how do i access the properties? totals.Cost does not work.
If you take out the group by you end up with just one object. If you have a collection with one item you could use First:
Totals = new Bar("Totals", totals.First().Cost, totals.First().Donation);
I want this sum on the entire collection, so do not want a group by.
Then use Sum on the collection
Bars.Sum(x => x.Cost)
I'm casting to an anon object before placing into a Bar because I'm not sure how to cast it in without a parameterless constructor (and I can't add one to this particular class)
You are not casting, you are creating anonymous objects
I don't like accessing the "var totals" data using index 0 - should I not be ToListing at the end? If not, how do i access the properties? totals.Cost does not work.
If you want single result use First.
It's simple enough, when you do Bars.Select(s =>, s is of type Bar and Bar has no definition of Sum. If you want the sum of all of it without any grouping, you can do:
Bars.Sum(b => b.Cost);
Bars.Sum(b => b.Donation);
You only need this :
Totals = new Bar("Totals", Bars.Sum(o => o.Cost), Bars.Sum(o => o.Donation));

Linq, select List Item where column is the max value

I basically have a List that has a few columns in it. All I want to do is select whichever List item has the highest int in a column called Counted.
List<PinUp> pinned= new List<PinUp>();
class PinUp
{
internal string pn { get; set; }
internal int pi{ get; set; }
internal int Counted { get; set; }
internal int pp{ get; set; }
}
So basically I just want pinned[whichever int has the highest Counted]
Hope this makes sense.
The problem is I want to remove this [whichever int has the highest Counted] from the current list. So I have to know which index it is in the list.
One way, order by it:
PinUp highest = pinned
.OrderByDescending(p => p.Counted)
.First();
This returns only one even if there are multiple with the highest Counted. So another way is to use Enumerable.GroupBy:
IEnumerable<PinUp> highestGroup = pinned
.GroupBy(p => p.Counted)
.OrderByDescending(g => g.Key)
.First();
If you instead just want to get the highest Counted (I doubt that), you just have to use Enumerable.Max:
int maxCounted = pinned.Max(p => p.Counted);
Update:
The problem is I want to remove this [whichever int has the highest Counted] from the current list.
Then you can use List<T>.RemoveAll:
int maxCounted = pinned.Max(p => p.Counted);
pinned.RemoveAll(p => p.Counted == maxCounted);
var excludingHighest = pinned.OrderByDescending(x => x.Counted)
.Skip(1);
If you need to keep a copy of the one being removed and still need to remove it, you can do something like
var highestPinned = pinned.OrderByDescending(x => x.Counted).Take(1);
var excludingHighest = pinned.Except(highestPinned);
You can order it:
var pin = pinned.OrderByDescending(p => p.Counted).FirstOrDefault();
// if pin == null then no elements found - probably empty.
If you want to remove, you don't need an index:
pinned.Remove(pin);
It is a sorting problem.
Sort your list by Counted in descending order and pick the first item.
Linq has a way to do it:
var highest = pinned.OrderByDescending(p => p.Counted).FirstOrDefault();
Try the following:
PinUp pin = pinned.OrderByDescending(x => x.Counted).First();

Average extension method in Linq for default value

Anyone know how I can set a default value for an average? I have a line like this...
dbPlugins = (from p in dbPlugins
select new { Plugin = p, AvgScore = p.DbVersions.Average(x => x.DbRatings.Average(y => y.Score)) })
.OrderByDescending(x => x.AvgScore)
.Select(x => x.Plugin).ToList();
which throws an error because I have no ratings yet. If I have none I want the average to default to 0. I was thinking this should be an extension method where I could specify what the default value should be.
There is: DefaultIfEmpty.
I'm not sure what your DbVersions and DbRatings are and which collection exactly has zero items, but this is the idea:
var emptyCollection = new List<int>();
var average = emptyCollection.DefaultIfEmpty(0).Average();
Update: (repeating what's said in the comments below to increase visibility)
If you find yourself needing to use DefaultIfEmpty on a collection of class type, remember that you can change the LINQ query to project before aggregating. For example:
class Item
{
public int Value { get; set; }
}
var list = new List<Item>();
var avg = list.Average(item => item.Value);
If you don't want to/can not construct a default Item with Value equal to 0, you can project to a collection of ints first and then supply a default:
var avg = list.Select(item => item.Value).DefaultIfEmpty(0).Average();
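Applied to a nested case like the one in the question, the same idea could look like the sketch below. It assumes .NET 6 or later with top-level statements and in-memory LINQ to Objects; the nested list of ints is a hypothetical stand-in for DbVersions and DbRatings, whose real types are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: each inner list holds the rating scores of one version.
var ratingsPerVersion = new List<List<int>>
{
    new List<int> { 4, 5 },
    new List<int>(),          // a version with no ratings yet
};

// The inner DefaultIfEmpty covers a version without ratings,
// the outer DefaultIfEmpty covers a plugin without versions.
double avgScore = ratingsPerVersion
    .Select(ratings => ratings.DefaultIfEmpty(0).Average())
    .DefaultIfEmpty(0.0)
    .Average();

double avgForNoVersions = new List<List<int>>()
    .Select(ratings => ratings.DefaultIfEmpty(0).Average())
    .DefaultIfEmpty(0.0)
    .Average();

Console.WriteLine($"{avgScore} / {avgForNoVersions}");
```

With this approach an unrated version counts as 0 and pulls the overall average down; whether that is the desired behaviour depends on how the score is used.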
My advice would be to create a reusable solution instead of a solution for this problem only.
Make an extension method AverageOrDefault, similar to FirstOrDefault. See extension methods demystified
public static class MyEnumerableExtensions
{
public static double AverageOrDefault(this IEnumerable<int> source)
{
// TODO: decide what to do if source equals null: exception or return default?
if (source.Any())
return source.Average();
else
return default(int);
}
}
Enumerable.Average has overloads for int, long, float, double and decimal, plus their nullable counterparts, each with and without a selector, so you'll need to create an AverageOrDefault for every type you use. They all look similar.
Usage:
// Get the average order total or default per customer
var averageOrderTotalPerCustomer = myDbContext.Customers
.GroupJoin(myDbContext.Orders,
customer => customer.Id,
order => order.CustomerId,
(customer, ordersOfThisCustomer) => new
{
Id = customer.Id,
Name = customer.Name,
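// Total is a hypothetical int property on the order; AverageOrDefault above takes IEnumerable<int>
AverageOrder = ordersOfThisCustomer.Select(order => order.Total).AverageOrDefault(),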
});
I don't think there's a way to select default, but how about this query
dbPlugins = (from p in dbPlugins
select new {
Plugin = p, AvgScore =
p.DbVersions.Any(x => x.DbRatings.Any()) ?
p.DbVersions.Average(x => x.DbRatings.Average(y => y.Score)) : 0 })
.OrderByDescending(x => x.AvgScore)
.Select(x => x.Plugin).ToList();
Essentially the same as yours, but we first ask if there are any ratings before averaging them. If not, we return 0.

Find minimal and maximal date in array using LINQ?

I have an array of classes with a property Date, i.e.:
class Record
{
public DateTime Date { get; private set; }
}
void Summarize(Record[] arr)
{
foreach (var r in arr)
{
// do stuff
}
}
I have to find the earliest (minimum) and the latest (maximum) dates in this array.
How can I do that using LINQ?
If you want to find the earliest or latest Date:
DateTime earliest = arr.Min(record => record.Date);
DateTime latest = arr.Max(record => record.Date);
Enumerable.Min, Enumerable.Max
If you want to find the record with the earliest or latest Date:
Record earliest = arr.MinBy(record => record.Date);
Record latest = arr.MaxBy(record => record.Date);
See: How to use LINQ to select object with minimum or maximum property value
old school solution without LINQ:
DateTime minDate = DateTime.MaxValue;
DateTime maxDate = DateTime.MinValue;
foreach (var r in arr)
{
if (minDate > r.Date)
{
minDate = r.Date;
}
if (maxDate < r.Date)
{
maxDate = r.Date;
}
}
The two in one LINQ query (and one traversal):
arr.Aggregate(
new { MinDate = DateTime.MaxValue,
MaxDate = DateTime.MinValue },
(accDates, record) =>
new { MinDate = record.Date < accDates.MinDate
? record.Date
: accDates.MinDate,
MaxDate = accDates.MaxDate < record.Date
? record.Date
: accDates.MaxDate });
Using lambda expressions:
void Summarise(Record[] arr)
{
if (!(arr == null || arr.Length == 0))
{
List<Record> recordList = new List<Record>(arr);
recordList.Sort((x,y) => { return x.Date.CompareTo(y.Date); });
// I may have this the wrong way round, but you get the idea.
DateTime earliest = recordList[0].Date;
DateTime latest = recordList[recordList.Count - 1].Date;
}
}
Essentially:
Sort into a new list in order of date
Select the first and last elements of that list
UPDATE: Thinking about it, I'm not sure that this is the way to do it if you care at all about performance, as sorting the entire list will result in many more comparisons than just scanning for the highest / lowest values.
I'd just make two properties Min,Max, assign them the value of the first item you add to the array, then each time you add a new item just check if its DateTime is less or greater than the Min Max ones.
It's nice and fast, and it will be much faster than iterating through the array each time you need to get Min and Max.
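A minimal sketch of that idea, assuming .NET 6 or later with top-level statements. The names dates, minDate, maxDate and AddDate are made up for illustration; in real code this state would more likely live inside the class that owns the collection.

```csharp
using System;
using System.Collections.Generic;

var dates = new List<DateTime>();
DateTime? minDate = null; // null until the first item is added
DateTime? maxDate = null;

// Add a date and update the running minimum and maximum in constant time.
void AddDate(DateTime date)
{
    dates.Add(date);
    if (minDate == null || date < minDate) minDate = date;
    if (maxDate == null || date > maxDate) maxDate = date;
}

AddDate(new DateTime(2021, 5, 1));
AddDate(new DateTime(2020, 1, 1));
AddDate(new DateTime(2022, 9, 30));

Console.WriteLine($"Earliest: {minDate:yyyy-MM-dd}, latest: {maxDate:yyyy-MM-dd}");
```

The trade-off is that removing an item can invalidate the cached values, in which case the collection has to be rescanned.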
