LINQ Select Records X Minutes apart - c#

I have a simple table that keeps track of the entry date. I would like to select records that are X minutes apart.
IMAGE_LOCATION  IMAGE DATE
==============  ===================
2227.jpg        08/03/2014 22:27:47
2228.jpg        08/03/2014 22:28:48
2229.jpg        08/03/2014 22:59:49
2230.jpg        08/03/2014 23:12:50
2231.jpg        08/03/2014 23:29:49
From the sample above I would like the query to return items that are at least X minutes apart, let's say 30 minutes. So from the list above only 2227.jpg, 2229.jpg and 2231.jpg would be returned.
This is what I have so far. It just returns the latest images; however, I need the latest ones separated by at least 30 minutes between records.
using (var db = new GibFrontierEntities())
{
var result = (from u in db.CCTV_IMAGES.OrderByDescending(u => u.ImageDate)
select u).Take(rows);
return result.ToList();
}

This is a quick attempt to achieve exactly what you asked for, a LINQ solution (tested and working in .NET 4):
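// Materialize first: the indexed Where below uses a statement lambda, which cannot be translated as an IQueryable expression tree.
var list = db.CCTV_IMAGES.OrderByDescending(u => u.ImageDate).ToList();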
return list.Where((d, i) =>
{
//Look ahead to compare against the next if it exists.
if (list.ElementAtOrDefault(i + 1) != null)
{
return d.ImageDate.Subtract(list.ElementAtOrDefault(i + 1).ImageDate).TotalMinutes > 30;
}
//Look behind to compare against the previous if this is the last item in the list.
if (list.ElementAtOrDefault(i - 1) != null)
{
return list.ElementAtOrDefault(i - 1).ImageDate.Subtract(d.ImageDate).TotalMinutes > 30;
}
return false;
}).ToList();
Per comments and a clearer definition of the requirement:
Because you stated in the comments below that you will have 1 item a minute and you previously stated that you need them separated by at least 30 minutes, would you consider simplifying the logic to grab every 30th item from the list?
return list.Where((d, i) => i % 30 == 0);

You can use SelectMany to achieve what you want:
using (var db = new GibFrontierEntities())
{
var images = db.CCTV_IMAGES;
var result = images
.SelectMany(i => images,
(first, second) => new { First = first, Second = second })
.Where(i => i.First != i.Second)
.Where(i => Math.Abs(
EntityFunctions
.DiffMinutes(i.First.ImageDate, i.Second.ImageDate)) >= 30)
.Select(i => i.First)
.Distinct()
.OrderByDescending(i => i.ImageDate)
.Take(rows)
.ToList();
return result;
}

As mentioned already, this can easily be achieved through iteration. However, if you really have to use a LINQ expression, here's a quick and dirty sample that will return dates that are 30 minutes apart:
List<DateTime> dateLlist = new List<DateTime>();
dateLlist.Add(new DateTime(2014, 1, 1, 1, 0, 0, 0));
dateLlist.Add(new DateTime(2014, 1, 1, 1, 10, 0, 0));
dateLlist.Add(new DateTime(2014, 1, 1, 1, 45, 0, 0));
DateTime previousTime = new DateTime();
bool shouldAdd = false;
List<DateTime> newList = dateLlist.Where(x =>
{
shouldAdd = (previousTime == DateTime.MinValue || previousTime.AddMinutes(30) < x);
previousTime = x;
return shouldAdd;
}).ToList();
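The sample above compares each item with its immediate predecessor. If the requirement is read as "at least 30 minutes after the last item that was kept" (which is what the expected result in the question suggests), a plain loop may be easier to get right. The following is a minimal sketch under that assumption; the class and method names (SpacedFilter, AtLeastApart) are made up for illustration, and the input is assumed to be sorted ascending.

```csharp
using System;
using System.Collections.Generic;

public static class SpacedFilter
{
    // Keeps the first item, then every item that is at least minGap
    // after the last KEPT item. Assumes the input is sorted ascending.
    public static List<DateTime> AtLeastApart(IEnumerable<DateTime> ascending, TimeSpan minGap)
    {
        var kept = new List<DateTime>();
        foreach (var current in ascending)
        {
            if (kept.Count == 0 || current - kept[kept.Count - 1] >= minGap)
            {
                kept.Add(current);
            }
        }
        return kept;
    }
}
```

With the five timestamps from the question and a 30 minute gap, this keeps the entries for 2227.jpg, 2229.jpg and 2231.jpg. Note the comparison is >= so that an item exactly 30 minutes later still counts as "at least 30 minutes apart".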

Related

How to consolidate date ranges in a list in C#

I have a list of dates organized like this:
(From, To)
(From, To)
...
(From, To)
I am trying to find how to consolidate ranges in an efficient way (it has to be quite fast because it is to consolidate financial data streams in realtime).
Dates do NOT overlap.
What I was thinking about is:
Sort everything by From time,
and then iterate through pairs to see if Pair1.To == Pair2.From to merge them, but this means several iterations.
Is there a better way to do this, such as in a single pass?
Here are some examples:
(2019-1-10, 2019-1-12)
(2019-3-10, 2019-3-14)
(2019-1-12, 2019-1-13)
expected output:
(2019-1-10, 2019-1-12) + (2019-1-12, 2019-1-13) -> (2019-1-10, 2019-1-13)
(2019-3-10, 2019-3-14) -> (2019-3-10, 2019-3-14)
In practice, it's really about seconds and not dates, but the idea is the same.
You mention that dates never overlap but I think it is slightly simpler to write code that just merges overlapping dates. First step is to define the date range type:
class Interval
{
public DateTime From { get; set; }
public DateTime To { get; set; }
}
You can then define an extension method that checks if two intervals overlap:
static class IntervalExtensions
{
public static bool Overlaps(this Interval interval1, Interval interval2)
=> interval1.From <= interval2.From
? interval1.To >= interval2.From : interval2.To >= interval1.From;
}
Notice that this code assumes that From <= To so you might want to change Interval into an immutable type and verify this in the constructor.
You also need a way to merge two intervals:
public static Interval MergeWith(this Interval interval1, Interval interval2)
=> new Interval
{
From = new DateTime(Math.Min(interval1.From.Ticks, interval2.From.Ticks)),
To = new DateTime(Math.Max(interval1.To.Ticks, interval2.To.Ticks))
};
The next step is to define another extension method that iterates a sequence of intervals and tries to merge consecutive overlapping intervals. This is best done using an iterator block:
public static IEnumerable<Interval> MergeOverlapping(this IEnumerable<Interval> source)
{
using (var enumerator = source.GetEnumerator())
{
if (!enumerator.MoveNext())
yield break;
var previousInterval = enumerator.Current;
while (enumerator.MoveNext())
{
var nextInterval = enumerator.Current;
if (!previousInterval.Overlaps(nextInterval))
{
yield return previousInterval;
previousInterval = nextInterval;
}
else
{
previousInterval = previousInterval.MergeWith(nextInterval);
}
}
yield return previousInterval;
}
}
If two consecutive intervals don't overlap, it yields the previous interval. However, if they overlap, it instead merges the two intervals and keeps the merged interval as the previous interval for the next iteration.
Your sample data is not sorted so before merging the intervals you have to sort them:
var mergedIntervals = intervals.OrderBy(interval => interval.From).MergeOverlapping();
However, if the real data is already sorted, as you have indicated in a comment, you can skip the sorting. The algorithm then does a single pass over the data and is thus O(n).
Give this a go:
var source = new[]
{
new { from = new DateTime(2019, 1, 10), to = new DateTime(2019, 1, 12) },
new { from = new DateTime(2019, 3, 10), to = new DateTime(2019, 3, 14) },
new { from = new DateTime(2019, 1, 12), to = new DateTime(2019, 1, 13) },
};
var data =
source
.OrderBy(x => x.from)
.ThenBy(x => x.to)
.ToArray();
var results =
data
.Skip(1)
.Aggregate(
data.Take(1).ToList(),
(a, x) =>
{
if (a.Last().to >= x.from)
{
a[a.Count - 1] = new { from = a.Last().from, to = x.to };
}
else
{
a.Add(x);
}
return a;
});
It's a nice query and it gives the output that you want.
Create two Dictionaries (i.e. hash maps), one using the To date as the key and the From-To date as the value, the other with the From date as the key.
Iterate over your date ranges and for each range check if the From date exists as a key in the To-date-keyed Dictionary, and vice versa.
If not a match in either then add the range to both the Dictionaries.
If there is a match in one but not the other then remove the matching range from both Dictionaries (using the appropriate key), merge the new range with the existing range and add the result to both.
If there is a match in both Dictionaries (the range being added fills a hole) then remove both matches from both Dictionaries, merge the three ranges (two existing and one new) and add the result to both Dictionaries.
At the end your Dictionaries contain an unsorted set of all date ranges, which you can extract by iterating over the keys of one of the Dictionaries.
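The steps above can be sketched roughly as follows. This is only an illustration under the same assumptions (ranges never overlap and no From or To value is duplicated); the names RangeConsolidator, byFrom and byTo are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeConsolidator
{
    public static List<(DateTime From, DateTime To)> Consolidate(
        IEnumerable<(DateTime From, DateTime To)> ranges)
    {
        // The same merged ranges, indexed once by From and once by To.
        var byFrom = new Dictionary<DateTime, (DateTime From, DateTime To)>();
        var byTo = new Dictionary<DateTime, (DateTime From, DateTime To)>();

        foreach (var range in ranges)
        {
            var merged = range;

            // An existing range ends where the new one starts: extend backward.
            if (byTo.TryGetValue(merged.From, out var before))
            {
                byFrom.Remove(before.From);
                byTo.Remove(before.To);
                merged = (before.From, merged.To);
            }

            // An existing range starts where the new one ends: extend forward.
            if (byFrom.TryGetValue(merged.To, out var after))
            {
                byFrom.Remove(after.From);
                byTo.Remove(after.To);
                merged = (merged.From, after.To);
            }

            // Covers all three cases: no match, one match, or a filled hole.
            byFrom[merged.From] = merged;
            byTo[merged.To] = merged;
        }

        // Unsorted set of consolidated ranges.
        return byFrom.Values.ToList();
    }
}
```

Each range costs a constant number of dictionary operations, so the whole pass is O(n) on average without sorting the input first. The result is unordered; sort it afterwards if order matters.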
Here is a 'two-dictionaries' implementation, that consolidates the ranges without sorting them first. The assumptions are that there is no overlapping, and no duplicate properties. A duplicate property will cause an exception to be thrown.
public static IEnumerable<TSource> Consolidate<TSource, TProperty>(
this IEnumerable<TSource> source,
Func<TSource, TProperty> property1Selector,
Func<TSource, TProperty> property2Selector,
Func<TSource, TSource, TSource> combine)
{
var dict1 = source.ToDictionary(property1Selector);
var dict2 = source.ToDictionary(property2Selector);
if (dict1.Keys.Count == 0) yield break;
var first = dict2.Values.First(); // Start with a random element
var last = first;
var current = first;
while (true) // Searching backward
{
dict1.Remove(property1Selector(first));
dict2.Remove(property2Selector(first));
if (dict2.TryGetValue(property1Selector(first), out current))
{
first = current; // Continue searching backward
}
else
{
while (true) // Searching forward
{
if (dict1.TryGetValue(property2Selector(last), out current))
{
last = current; // Continue searching forward
dict1.Remove(property1Selector(last));
dict2.Remove(property2Selector(last));
}
else
{
yield return combine(first, last);
break;
}
}
if (dict1.Keys.Count == 0) break;
first = dict1.Values.First(); // Continue with a random element
last = first;
}
}
}
Usage example:
var source = new List<(DateTime From, DateTime To)>()
{
(new DateTime(2019, 1, 10), new DateTime(2019, 1, 12)),
(new DateTime(2019, 3, 10), new DateTime(2019, 3, 14)),
(new DateTime(2019, 1, 12), new DateTime(2019, 1, 13)),
(new DateTime(2019, 3, 5), new DateTime(2019, 3, 10)),
};
var consolidated = source
.Consolidate(r => r.From, r => r.To, (r1, r2) => (r1.From, r2.To))
.OrderBy(r => r.From)
.ToList();
foreach (var range in consolidated)
{
Console.WriteLine($"{range.From:yyyy-MM-dd} => {range.To:yyyy-MM-dd}");
}
Output:
2019-01-10 => 2019-01-13
2019-03-05 => 2019-03-14
My take using MoreLinq and a functional style. IMO, it is easy to understand. Most lines here are sample data; the logic is only a few lines (the GetAsDays method and the all.Segment call).
How it is done: we transform date ranges into collections of days, union these collections and split them into separate ranges (wherever more than 1 day lies between the end of one range and the start of the next).
void Main()
{
var baseD = new DateTime(01, 01, 01);
var from = DateTime.Today.Dump("from");
var to = from.AddDays(20).Dump("to");
var range1 = GetAsDays(from, to);
var from2 = DateTime.Today.AddDays(10).Dump("from2");
var to2 = from2.AddDays(20).Dump("to2");
var from3 = DateTime.Today.AddDays(50).Dump("from3");
var to3 = from3.AddDays(10).Dump("to3");
var range2 = GetAsDays(from2, to2);
var range3 = GetAsDays(from3, to3);
var all = range3
.Union(range1)
.Union(range2)
.OrderBy(e=>e);
var split=all.Segment((iPlus1, i, a) => (iPlus1 - i) > 1);
split.Select(s=>(baseD.AddDays(s.First()),baseD.AddDays(s.Last()))).Dump();
}
public IList<int> GetAsDays(DateTime from, DateTime to)
{
var baseD = new DateTime(01, 01, 01);
var fromSpan = from - baseD;
var toSpan = to - baseD;
var set1 = Enumerable.Range((int)fromSpan.TotalDays, (int)(toSpan - fromSpan).TotalDays);
return new List<int>(set1);
}

Finding the longest overlapping period

I have a list of records containing Id, DateFrom, DateTo. For the sake of this question we can use this one:
List<(int, DateTime, DateTime)> data = new List<(int, DateTime, DateTime)>
{
(1, new DateTime(2012, 5, 16), new DateTime(2018, 1, 25)),
(2, new DateTime(2009, 1, 1), new DateTime(2011, 4, 27)),
(3, new DateTime(2014, 1, 1), new DateTime(2016, 4, 27)),
(4, new DateTime(2015, 1, 1), new DateTime(2015, 1, 3)),
(2, new DateTime(2013, 5, 10), new DateTime(2017, 4, 27)),
(5, new DateTime(2013, 5, 16), new DateTime(2018, 1, 24)),
(2, new DateTime(2017, 4, 28), new DateTime(2018, 1, 24)),
};
In my real case the List could be a lot bigger. Initially I was working with the assumption that there can be only one record for a certain Id and I was able to come up with a pretty good solution but now, as you can see, the assumption is that you can have several periods for an Id and all periods should be taken into consideration when comparing the whole time.
The task is to find the two records that have the longest time overlap and to return the ids and the number of days overlapped.
Which in this sample case means that these should be records 1 and 2.
My implementation of this is the following:
public (int, int, int) GetLongestElapsedPeriodWithDuplications(List<(int, DateTime, DateTime)> periods)
{
Dictionary<int, List<(DateTime, DateTime)>> periodsByPeriodId = new Dictionary<int, List<(DateTime, DateTime)>>();
foreach (var period in periods)
{
if (periodsByPeriodId.ContainsKey(period.Item1))
{
periodsByPeriodId[period.Item1].Add((period.Item2, period.Item3));
}
else
{
periodsByPeriodId[period.Item1] = new List<(DateTime, DateTime)>();
periodsByPeriodId[period.Item1].Add((period.Item2, period.Item3));
}
}
int firstId = -1;
int secondId = -1;
int periodInDays = 0;
foreach (var period in periodsByPeriodId)
{
var Id = period.Key;
foreach (var currPeriod in periodsByPeriodId)
{
int currentPeriodInDays = 0;
if (Id != currPeriod.Key)
{
for (var i = 0; i < period.Value.Count; i++)
{
for (var j = 0; j < currPeriod.Value.Count; j++)
{
var firstPeriodDateFrom = period.Value[i].Item1;
var firstPeriodDateTo = period.Value[i].Item2;
var secondPeriodDateFrom = currPeriod.Value[j].Item1;
var secondPeriodDateTo = currPeriod.Value[j].Item2;
if (secondPeriodDateFrom < firstPeriodDateTo && secondPeriodDateTo > firstPeriodDateFrom)
{
DateTime commonStartingDate = secondPeriodDateFrom > firstPeriodDateFrom ? secondPeriodDateFrom : firstPeriodDateFrom;
DateTime commonEndDate = secondPeriodDateTo > firstPeriodDateTo ? firstPeriodDateTo : secondPeriodDateTo;
currentPeriodInDays += (int)(commonEndDate - commonStartingDate).TotalDays;
}
}
}
if (currentPeriodInDays > periodInDays)
{
periodInDays = currentPeriodInDays;
firstId = Id;
secondId = currPeriod.Key;
}
}
}
}
return (firstId, secondId, periodInDays);
}
As you can see, the method is pretty big and, in my opinion, far from optimized in terms of execution speed. I know that those nested loops raise the complexity a lot, but this additional requirement to deal with more than one period for an Id really left me without ideas. How can I optimize this logic so that it executes faster for bigger inputs?
As in your original solution, you need to compare each interval with every other interval, except intervals with the same id, so I'd code it like this:
Supporting classes, just to simplify the actual algorithm:
class Period {
public DateTime Start { get; }
public DateTime End { get; }
public Period(DateTime start, DateTime end) {
this.Start = start;
this.End = end;
}
public int Overlap(Period other) {
DateTime a = this.Start > other.Start ? this.Start : other.Start;
DateTime b = this.End < other.End ? this.End : other.End;
return (a < b) ? b.Subtract(a).Days : 0;
}
}
class IdData {
public IdData() {
this.Periods = new List<Period>();
this.Overlaps = new Dictionary<int, int>();
}
public List<Period> Periods { get; }
public Dictionary<int, int> Overlaps { get; }
}
Method to find max overlap:
static int GetLongestElapsedPeriod(List<(int, DateTime, DateTime)> periods) {
int maxOverlap = 0;
Dictionary<int, IdData> ids = new Dictionary<int, IdData>();
foreach (var period in periods) {
int id = period.Item1;
Period idPeriod = new Period(period.Item2, period.Item3);
// preserve interval for ID
var idData = ids.GetValueOrDefault(id, new IdData());
idData.Periods.Add(idPeriod);
ids[id] = idData;
foreach (var idObj in ids) {
if (idObj.Key != id) {
// here we add the overlap of the new interval with all previously seen intervals of the other ID
int o = idObj.Value.Overlaps.GetValueOrDefault(id, 0);
foreach (var otherPeriods in idObj.Value.Periods)
o += idPeriod.Overlap(otherPeriods);
idObj.Value.Overlaps[id] = o;
// check whether the newly calculated overlap is the maximal one; preserve the IDs here too if needed
if (o > maxOverlap)
maxOverlap = o;
}
}
}
return maxOverlap;
}
You can use TimePeriodLibrary.NET:
PM> Install-Package TimePeriodLibrary.NET
TimePeriodCollection timePeriods = new TimePeriodCollection(
data.Select(q => new TimeRange(q.Item2, q.Item3)));
var longestOverlap = timePeriods
.OverlapPeriods(new TimeRange(timePeriods.Start, timePeriods.End))
.OrderByDescending(q => q.Duration)
.FirstOrDefault();
With an extension method:
public static T MaxBy<T, TKey>(this IEnumerable<T> src, Func<T, TKey> key, Comparer<TKey> keyComparer = null) {
keyComparer = keyComparer ?? Comparer<TKey>.Default;
return src.Aggregate((a, b) => keyComparer.Compare(key(a), key(b)) > 0 ? a : b);
}
And some helper functions
DateTime Max(DateTime a, DateTime b) => (a > b) ? a : b;
DateTime Min(DateTime a, DateTime b) => (a < b) ? a : b;
int OverlappingDays((DateTime DateFrom, DateTime DateTo) span1, (DateTime DateFrom, DateTime DateTo) span2) {
var maxFrom = Max(span1.DateFrom, span2.DateFrom);
var minTo = Min(span1.DateTo, span2.DateTo);
return Math.Max((minTo - maxFrom).Days, 0);
}
You can group together the spans with matching Ids
var dg = data.GroupBy(d => d.Id);
Generate all pairs of Ids
var pdgs = from d1 in dg
from d2 in dg.Where(d => d.Key > d1.Key)
select new[] { d1, d2 };
Then compute the overlap in days between each pair of Ids and find the maximum:
var MaxOverlappingPair = pdgs.Select(pdg => new {
Id1 = pdg[0].Key,
Id2 = pdg[1].Key,
OverlapInDays = pdg[0].SelectMany(d1 => pdg[1].Select(d2 => OverlappingDays((d1.DateFrom, d1.DateTo), (d2.DateFrom, d2.DateTo)))).Sum()
}).MaxBy(TwoOverlap => TwoOverlap.OverlapInDays);
Since efficiency is mentioned, I should say that implementing some of these operations directly instead of using LINQ is more efficient, but you are using Tuples and in-memory structures so I don't think it will make much difference.
I ran some performance tests using a list of 24000 spans with 1249 unique IDs. The LINQ code took about 16 seconds. By inlining some of the LINQ and replacing anonymous objects with tuples, it came down to about 3.1 seconds. By adding a shortcut skipping any IDs whose cumulative days were shorter than the current max overlapping days and a few more optimizations, I got it down to less than 1 second.
var baseDate = new DateTime(1970, 1, 1);
int OverlappingDays(int DaysFrom1, int DaysTo1, int DaysFrom2, int DaysTo2) {
var maxFrom = DaysFrom1 > DaysFrom2 ? DaysFrom1 : DaysFrom2;
var minTo = DaysTo1 < DaysTo2 ? DaysTo1 : DaysTo2;
return (minTo > maxFrom) ? minTo - maxFrom : 0;
}
var dgs = data.Select(d => {
var DaysFrom = (d.DateFrom - baseDate).Days;
var DaysTo = (d.DateTo - baseDate).Days;
return (d.Id, DaysFrom, DaysTo, Dist: DaysTo - DaysFrom);
})
.GroupBy(d => d.Id)
.Select(dg => (Id: dg.Key, Group: dg, Dist: dg.Sum(d => d.Dist)))
.ToList();
var MaxOverlappingPair = (Id1: 0, Id2: 0, OverlapInDays: 0);
for (int j1 = 0; j1 < dgs.Count; ++j1) {
var dg1 = dgs[j1];
if (dg1.Dist > MaxOverlappingPair.OverlapInDays)
for (int j2 = j1 + 1; j2 < dgs.Count; ++j2) {
var dg2 = dgs[j2];
if (dg2.Dist > MaxOverlappingPair.OverlapInDays) {
var testOverlapInDays = 0;
foreach (var d1 in dg1.Group)
foreach (var d2 in dg2.Group)
testOverlapInDays += OverlappingDays(d1.DaysFrom, d1.DaysTo, d2.DaysFrom, d2.DaysTo);
if (testOverlapInDays > MaxOverlappingPair.OverlapInDays)
MaxOverlappingPair = (dg1.Id, dg2.Id, testOverlapInDays);
}
}
}
Optimizations applied:
Convert each span's DateTimes to a number of days from an arbitrary baseDate, so the date conversion is done only once and the overlapping days calculation is cheaper.
Compute the total days for each span and skip any span pairs that can't exceed the current overlap.
Replace SelectMany/Select with nested foreach to compute overlapping days.
Use ValueTuples instead of anonymous objects which are (slightly) faster for this problem.
Replace pair generation LINQ with nested for loops generating each possible pair directly
Pass individual from/to parameters instead of objects to OverlappingDays function
Note: I tried a smarter overlapping days calculation but when the number of spans per ID is small, the overhead took longer than just doing the calculation directly.
There are already a few solutions, but if you want to improve the efficiency, you don't have to compare every object/value with every other value or object. You can use an interval search tree for this problem, and it can be solved in R log N, where R is the number of intersections between intervals.
I recommend you watch this video by Robert Sedgewick; the book is also available online.
Your basic problem here is how to identify a unique set of time periods. Give each one its own unique ID yourself.
When you write your final answer, include the additional details in the output so the user can understand which (original) IDs and original time periods resulted in the final answer.
Remember - the problem is still the same as in the original post (https://codereview.stackexchange.com/questions/186014/finding-the-longest-overlapping-period/186031?noredirect=1#comment354707_186031) and you still have the same information to work with. Don't get too hung up on the "ID"s as provided in the original list - you are still iterating through a list of time periods.

LINQ Where clause inline for loop?

I have database values called start and length.
start is the start time of a booking (1->09:00, 2->10:00 etc) and length is the length in hours.
I then have an array of start times and end times. I want to be able to check whether each start and end pair is already booked. So far I have figured out that if the start times are the same, it is booked, and if the end times are the same, it is also booked. But if the start and end times fall in between the comparison times, it will return not booked, which is wrong.
I am trying to write a LINQ query to test whether a booking is already in the database. So far I have
var proposedRequest = db.requests.Include(r => r.rooms);
proposedRequest = proposedRequest.Where(r => r.booked.Equals(1));
proposedRequest = proposedRequest.Where(r => r.roundID.Equals(roundID));
proposedRequest = proposedRequest.Where(r => r.day.Equals(day));
int[] startTimes = new int[length];
int[] endTimes = new int[length];
for(var q=0;q<length;q++)
{
startTimes[q] = time + q;
endTimes[q] = time + q + 1;
}
proposedRequest = proposedRequest.Where(s => startTimes.Contains(s.start) || endTimes.Contains(s.start+s.length));
Now, this only works if the new booking starts at the same time as a booking already in the DB, or if it ends at the same time. It doesn't cover the following case:
There is a record in the DB where start -> 2 and length -> 3,
so this booking runs from 10:00 -> 13:00.
But say I am checking this against an entry that starts at 11:00 and ends at 12:00. It would not come back as already booked, because the start and end times do not match.
What is the best way to solve this?
The only way I can see is to loop through my startTimes and endTimes arrays and have another clause for each pair, which would produce something like the following:
.Where((s => s.startTime<startTime[i] && (s.startTime + s.Length) > endTime[i]) || (s => s.startTime<startTime[i+1] && (s.startTime + s.Length) > endTime[i+1]))
But I don't think this is possible.
Based on this answer, two ranges overlap if (StartA <= EndB) and (EndA >= StartB)
In your case:
StartA = s.start
EndA = s.start + s.length
StartB = time
EndB = time + length
So your last condition should be like this:
proposedRequest = proposedRequest.Where(s => s.start <= time + length &&
s.start + s.length >= time);
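Note that the inclusive comparisons above also flag bookings that merely touch, for example one ending at 13:00 and another starting at 13:00. If back-to-back bookings should be allowed (an assumption, the question does not say), the ranges can be treated as half-open and compared with strict inequalities. A small sketch of that check, with a made-up helper name (BookingOverlap.Overlaps):

```csharp
public static class BookingOverlap
{
    // Treats each booking as the half-open range [start, start + length):
    // two bookings clash only if they share at least part of an hour slot.
    public static bool Overlaps(int startA, int lengthA, int startB, int lengthB)
    {
        return startA < startB + lengthB && startB < startA + lengthA;
    }
}
```

For the case in the question (existing booking start 2, length 3; proposed booking start 3, length 1) this returns true, while a proposed booking starting at 5 (13:00) does not clash. The same two comparisons can be written directly inside the Where clause.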
This will return objects that have your StartTime, EndTime, and a boolean that signifies whether the slot is already booked.
var proposedRequest = db.requests
.Include(r => r.rooms)
.Where(r => r.booked.Equals(1))
.Where(r => r.roundID.Equals(roundID))
.Where(r => r.day.Equals(day))
.ToList();
//int[] startTimes = new int[length];
//int[] endTimes = new int[length];
//for(var q=0;q<length;q++)
//{
// startTimes[q] = time + q;
// endTimes[q] = time + q + 1;
//}
var times=Enumerable
.Range(time,length)
.Select(r=>
new {
StartTime=r,
EndTime=r+1,
Booked=proposedRequest.Any(pr=>pr.start<=r && pr.start+pr.length>r) // entity properties are start and length, as in the question
}).ToList();

Parallel.For with date time

OK, this code is a bit meta, but it roughly explains how I have it now and what I want to achieve.
class SpecialObject {
DateTime date;
int number;
}
var startDate = lowest date in the list;
var endDate = highest date in the list;
List<SpecialObject> objs = (list from database where date > startDate and date < endDate)
// A list with a lot of dates and numbers; most of the dates are the same. The list contains roughly 1000 items, but it can be more or less.
for (var date = startDate; date < endDate; date = date.AddDays(1)) {
listItem = objs.Where(x => x.date == date).Sum(x => x.number);
}
Now, since I don't care which day I calculate first, I thought I could do this:
Parallel.For(startDate, endDate, day => {
listItem = objs.Where(x => x.date == day).Sum(x => x.number);
});
But how do I make it step through dates?
You can make a Range and iterate over it with Parallel.ForEach :
// not tested
var days = Enumerable
.Range(0, (endDate-startDate).Days) // check the rounding
.Select(i => startDate.AddDays(i));
Parallel.ForEach(days, day => ....)
Alternatively, you could use PLINQ over the original source, which is probably faster. Roughly:
// not tested
var sums = objs.AsParallel().GroupBy(x => x.date).Select(g => g.Sum(i => i.number));
All the overloads of Parallel.For take two integer variables for start and end. I also don't see any version which would support something like a step so you can't just use the tick count of a DateTime as the loop variable.
But it should be easy to use a Parallel.ForEach instead, when you create an IEnumerable<DateTime> as the source sequence.
var source = Enumerable.Range(0, (endDate - startDate).Days)
.Select(t => startDate.AddDays(t));
Add +1 to the count parameter if the endDate is inclusive.
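To tie this back to the question, here is a rough sketch of summing the numbers per day with Parallel.ForEach. It assumes the items are available as (Date, Number) tuples and that endDate is inclusive; the names DailySums and SumPerDay are hypothetical. Each day's result goes into a ConcurrentDictionary, since writing to a shared variable from the parallel body would not be safe.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DailySums
{
    public static IDictionary<DateTime, int> SumPerDay(
        IReadOnlyList<(DateTime Date, int Number)> items,
        DateTime startDate,
        DateTime endDate)
    {
        // Group once up front so each day is a cheap lookup instead of a full scan.
        var numbersByDay = items.ToLookup(x => x.Date.Date, x => x.Number);

        // One entry per calendar day; +1 because endDate is treated as inclusive here.
        var days = Enumerable
            .Range(0, (endDate.Date - startDate.Date).Days + 1)
            .Select(offset => startDate.Date.AddDays(offset));

        var sums = new ConcurrentDictionary<DateTime, int>();
        Parallel.ForEach(days, day =>
        {
            // A missing key yields an empty sequence, so the sum is 0 for empty days.
            sums[day] = numbersByDay[day].Sum();
        });
        return sums;
    }
}
```

For a list of roughly 1000 items the parallel overhead may well outweigh the gain, so it is worth measuring against a plain GroupBy before committing to this.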
OK, after a few days of searching I figured that if I placed all days in an array and "whiled" through it, it gives a pretty good result, with code that is easy to read:
var start = new DateTime(2014, 09, 09);
var end = new DateTime(2014, 10, 01);
var listOfDays = new List<DateTime>();
int i = 0;
for (var day = start; day < end; day = day.AddDays(1))
{
listOfDays.Add(day);
}
Parallel.ForEach(listOfDays.ToArray(), currentDay =>
{
for (var d = new DateTime(currentDay.Year, currentDay.Month, currentDay.Day, 0, 0, 0); d < new DateTime(currentDay.Year, currentDay.Month, currentDay.Day, 23, 59, 59); d = d.AddSeconds(5))
{
var str = "Loop: " + i + ", Date: " + d.ToString();
Console.WriteLine(str);
}
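System.Threading.Interlocked.Increment(ref i); // a plain i++ is not thread-safe inside Parallel.ForEach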
});
Console.Read();

In C#, what is the best way to find gaps in a DateTime array?

I have a list of dates that are a month apart, in the sense that all dates are the "First Friday of the month". In some cases months are missing, so I need to write a function to determine whether all dates are consecutive.
For example, for the list of dates below, the function would return true, as all items are the "First Friday of the month" and there are no gaps.
var date = new DateTime(2013, 1, 4);
var date1 = new DateTime(2013, 2, 1);
var date2 = new DateTime(2013, 3, 1);
var date3 = new DateTime(2013, 4, 5);
var dateArray = new DateTime[]{date, date1, date2, date3};
bool isConsecutive = IsThisListConsecutive(dateArray);
whereas the example below would return false because, even though the dates are also all "First Friday of the month", it's missing the March 2013 item.
var date = new DateTime(2013, 1, 4);
var date1 = new DateTime(2013, 2, 1);
var date3 = new DateTime(2013, 4, 5);
var dateArray = new DateTime[]{date, date1, date3};
bool isConsecutive = IsThisListConsecutive(dateArray);
So I am trying to figure out the right logic for the IsThisListConsecutive() method.
Here was my first try (note: I already know up front that all dates are the same day of week and same week of month, so the only thing I am looking for is a missing slot):
private bool IsThisListConsecutive(IEnumerable<DateTime> orderedSlots)
{
DateTime firstDate = orderedSlots.First();
int count = 0;
foreach (var slot in orderedSlots)
{
if (slot.Month != firstDate.AddMonths(count).Month)
{
return false;
}
count++;
}
return true;
}
The code above works except if the list crosses over from one year to another. I would like advice on a better way to write this function, and on how that line could be rewritten to deal with dates that cross over years.
To implement this, we'll start with a simple helper method that takes a sequence and returns a sequence of pairs, each made up of an item and its previous item.
public static IEnumerable<Tuple<T, T>> Pair<T>(this IEnumerable<T> source)
{
T previous;
using (var iterator = source.GetEnumerator())
{
if (iterator.MoveNext())
previous = iterator.Current;
else
yield break;
while(iterator.MoveNext())
{
yield return Tuple.Create(previous, iterator.Current);
previous = iterator.Current;
}
}
}
We'll also use this simple method to determine if two dates are in the same month:
public static bool AreSameMonth(DateTime first, DateTime second)
{
return first.Year == second.Year
&& first.Month == second.Month;
}
Using that, we can easily grab the month of each date and see if it's the month after the previous month. If it's true for all of the pairs, then we have consecutive months.
private static bool IsThisListConsecutive(IEnumerable<DateTime> orderedSlots)
{
return orderedSlots.Pair()
.All(pair => AreSameMonth(pair.Item1.AddMonths(1), pair.Item2));
}
Note: This is completely untested, and the date checks are probably pretty bad or somewhat redundant, but that’s the best I could come up with right now ^^
public bool AreSameWeekdayEveryMonth(IEnumerable<DateTime> dates)
{
var en = dates.GetEnumerator();
if (en.MoveNext())
{
DayOfWeek weekday = en.Current.DayOfWeek;
DateTime previous = en.Current;
while (en.MoveNext())
{
DateTime d = en.Current;
if (d.DayOfWeek != weekday || d.Day > 7)
return false;
// The first given weekday of consecutive months is always exactly 28 or 35 days apart.
if ((d - previous).Days != 28 && (d - previous).Days != 35)
return false;
previous = d;
}
}
return true;
}
I would recommend looking at the TimeSpan structure. Thanks to operator overloading, subtracting two dates gives you a TimeSpan that expresses the difference between them.
http://msdn.microsoft.com/en-us/library/system.timespan.aspx
Okay, your code doesn't work when the years cross over because Jan 1st may be a Monday in one year and a Tuesday in the next. If I were doing this, I would first check that:
a) they are the same day of the week in each month (use DateTime.DayOfWeek)
b) they are the same week of the month in each month*
use extension method DayOfMonth (see link)
* Calculate week of month in .NET *
(You said you already know a and b to be true, so let's go on to the third condition.)
c) we have to determine if they are in consecutive months
// Order the list of dates and place it into an array for ease of looping.
DateTime[] orderedSlots = slots.OrderBy(t => t).ToArray();
// Holds the date from the previous month.
DateTime temp = orderedSlots[0];
for (int index = 1; index < orderedSlots.Length; index++)
{
if (orderedSlots[index].Month != temp.AddMonths(1).Month ||
orderedSlots[index].Year != temp.AddMonths(1).Year)
{
return false;
}
// Advance the reference date so the next item is compared with this one.
temp = orderedSlots[index];
}
return true;
If you need to check conditions a and b as well, change the if statement as follows:
if (orderedSlots[index].Month != temp.AddMonths(1).Month ||
orderedSlots[index].Year != temp.AddMonths(1).Year ||
orderedSlots[index].DayOfWeek != temp.DayOfWeek ||
orderedSlots[index].GetWeekOfMonth() != temp.AddMonths(1).GetWeekOfMonth())
{
return false;
}
Remember that to use the GetWeekOfMonth extension method you have to include the code from
Calculate week of month in .NET
I'm sure there are typos as I did this in a text editor.
Well, here is my initial thought on how I would approach this problem.
First, is to define a function that will turn the dates into the ordinal values corresponding to the order in which they should appear.
int ToOrdinal(DateTime d, DateTime baseline) {
if (d.Day <= 7
&& d.DayOfWeek == baseline.DayOfWeek) {
// Since there is only one "First Friday" a month, and there are
// 12 months in year we can easily compose the ordinal.
// (As per default.kramer's comment, months normalized to [0,11].)
return d.Year * 12 + (d.Month - 1);
} else {
// Was not correct "kind" of day -
// Maybe baseline is Tuesday, but d represents Wednesday or
// maybe d wasn't in the first week ..
return 0;
}
}
var dates = ..;
var baseline = dates.FirstOrDefault();
var ordinals = dates.Select(d => ToOrdinal(d, baseline));
Then, for the dates provided, we end up with ordinal sequences like:
[24156 + 0, 24156 + 1, 24156 + 2, 24156 + 3]
And
[24156 + 0, 24156 + 1, /* !!!! */ 24156 + 3]
From here it is just a trivial matter of iterating the list and ensuring that the integers occur in sequence without gaps or stalls - that is, each item/integer is exactly one more than the previous.
I could be misinterpreting what you are trying to do, but I think this will work, assuming you don't have to handle ancient dates. See if there are any gaps in the dates converted to "total months"
int totalMonths = date.Year * 12 + (date.Month - 1);
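Putting the "total months" idea together, a minimal sketch of the check might look like this. It assumes the dates are already ordered and already known to fall on the right weekday and week of the month, as stated in the question; the class name MonthGaps is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MonthGaps
{
    public static bool IsThisListConsecutive(IEnumerable<DateTime> orderedSlots)
    {
        // Map every date to a month ordinal that keeps increasing across years.
        var ordinals = orderedSlots
            .Select(d => d.Year * 12 + (d.Month - 1))
            .ToList();

        // Each ordinal must be exactly one more than the one before it.
        return ordinals
            .Zip(ordinals.Skip(1), (previous, next) => next - previous)
            .All(step => step == 1);
    }
}
```

Because the ordinal includes the year, December 2013 followed by January 2014 counts as consecutive, while a list that skips March 2013 does not. An empty or single-item list returns true.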
