There are prices set for certain time periods... I'm having trouble coming up with an algorithm to determine the lowest price for a specific time period.
I'm doing this with a list of objects, where the object has properties DateTime StartDate, DateTime EndDate, decimal Price.
For example, two price sets and their active date ranges:
A. 09/26/16 - 12/31/17 at $20.00
B. 12/01/16 - 12/31/16 at $18.00
You can see that B is inside the A time period and is lower.
I need that converted to this:
A. 09/26/16 - 11/30/16 at $20.00
B. 12/01/16 - 12/31/16 at $18.00
C. 01/01/17 - 12/31/17 at $20.00
It has to work for any number of date ranges and combinations. Has anyone come across anything I can manipulate to get the result I need? Or any suggestions?
Edit: My data structure:
public class PromoResult
{
public int ItemId { get; set; }
public decimal PromoPrice { get; set; }
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
public int PromoType { get; set; } // can ignore this...
}
This is a good fit for LINQ. Assuming your price range object is called PriceRecord...
Collect every start and end date into one sorted list of boundaries. Then, for each pair of consecutive boundaries, find the price records that cover that span and take the lowest price. An implementation might look something like this:
public static IEnumerable<PriceRecord> ReduceOverlaps(IEnumerable<PriceRecord> source)
{
// Collect every range boundary, de-duplicated and sorted
var edges = source.SelectMany(record => new[] { record.StartDate, record.EndDate }).Distinct().OrderBy(d => d).ToArray();
// iterate over pairs of edges (i and i-1)
for (int i = 1; i < edges.Length; i++)
{
// prices of all records that cover the whole span from edge i-1 to edge i
var prices = source.Where(r => r.StartDate <= edges[i - 1] && r.EndDate >= edges[i]).Select(r => r.Price).ToList();
// no record covers this span (a gap between ranges): skip it, since Min() throws on an empty sequence
if (prices.Count == 0) continue;
// return a new record from i-1, i with the minimum price
yield return new PriceRecord() { StartDate = edges[i - 1], EndDate = edges[i], Price = prices.Min() };
}
}
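You may need to tinker with the comparison operators: as written, adjacent output ranges share a boundary date (one range ends on the day the next one begins), so shift the end dates by a day if your ranges are inclusive.
I have now tested the code; the example here works with the data in the question.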
Feel free to propose edits to improve this example.
I will use two helper functions, DateRange and GroupSequenceWhile (both defined at the end of this answer).
List<PromoResult> promoResult = new List<PromoResult>()
{
new PromoResult() { PromoPrice=20, StartDate = new DateTime(2016, 9, 26),EndDate=new DateTime(2017, 12, 31)},
new PromoResult() { PromoPrice=18, StartDate = new DateTime(2016, 12, 1),EndDate=new DateTime(2016, 12, 31)}
};
var result = promoResult.SelectMany(x => DateRange(x.StartDate, x.EndDate, TimeSpan.FromDays(1))
.Select(y => new { promo = x, date = y }))
.GroupBy(x => x.date).Select(x => x.OrderBy(y => y.promo.PromoPrice).First())
.OrderBy(x=>x.date)
.ToList();
var final = result.GroupSequenceWhile((x, y) => x.promo.PromoPrice == y.promo.PromoPrice)
.Select(g => new { start = g.First().date, end = g.Last().date, price = g.First().promo.PromoPrice })
.ToList();
foreach (var r in final)
{
Console.WriteLine(r.price + "$ " + r.start.ToString("MM/dd/yy", CultureInfo.InvariantCulture) + " " + r.end.ToString("MM/dd/yy", CultureInfo.InvariantCulture));
}
OUTPUT:
20$ 09/26/16 11/30/16
18$ 12/01/16 12/31/16
20$ 01/01/17 12/31/17
Algorithm:
1- create a <day, price> tuple for every day of every item in the promoResult list
2- group these tuples by day and select the minimum price
3- order the tuples by date
4- start a new range whenever the price changes between consecutive days, and take the first and last day of each range
IEnumerable<DateTime> DateRange(DateTime start, DateTime end, TimeSpan period)
{
for (var dt = start; dt <= end; dt = dt.Add(period))
{
yield return dt;
}
}
public static IEnumerable<IEnumerable<T>> GroupSequenceWhile<T>(this IEnumerable<T> seq, Func<T, T, bool> condition)
{
List<T> list = new List<T>();
using (var en = seq.GetEnumerator())
{
if (en.MoveNext())
{
var prev = en.Current;
list.Add(en.Current);
while (en.MoveNext())
{
if (condition(prev, en.Current))
{
list.Add(en.Current);
}
else
{
yield return list;
list = new List<T>();
list.Add(en.Current);
}
prev = en.Current;
}
if (list.Any())
yield return list;
}
}
}
Doesn't directly answer your question, but here is some SQL that I used to solve a similar problem I had (simplified down a bit, as I was also dealing with multiple locations and different price types):
SELECT RI.ItemNmbr, RI.UnitPrice, RI.CasePrice
, RP.ProgramID
, Row_Number() OVER (PARTITION BY RI.ItemNmbr
ORDER BY CASE WHEN RI.UnitPrice > 0
THEN RI.UnitPrice
ELSE 1000000 END ASC
, CASE WHEN RI.CasePrice > 0
THEN RI.CasePrice
ELSE 1000000 END ASC
, RP.EndDate DESC
, RP.BeginDate ASC
, RP.ProgramID ASC) AS RowNumBtl
, Row_Number() OVER (PARTITION BY RI.UnitPrice
ORDER BY CASE WHEN RI.CasePrice > 0
THEN RI.CasePrice
ELSE 1000000 END ASC
, CASE WHEN RI.UnitPrice > 0
THEN RI.UnitPrice
ELSE 1000000 END ASC
, RP.EndDate DESC
, RP.BeginDate ASC
, RP.ProgramID ASC) AS RowNumCase
FROM RetailPriceProgramItem AS RI
INNER JOIN RetailPriceMaster AS RP
ON RP.ProgramType = RI.ProgramType AND RP.ProgramID = RI.ProgramID
WHERE RP.ProgramType='S'
AND RP.BeginDate <= @date AND RP.EndDate >= @date
AND RI.Active=1
I select from that where RowNumBtl=1 for the UnitPrice and RowNumCase=1 for the CasePrice. If you then create a table of dates (which you can do using a CTE), you can cross apply on each date. This is a bit inefficient, since you only need to test at border conditions between date ranges, so... good luck with that.
I would start with the ranges in date order based on starting date, add the first entry as a range in its entirety so:
09/26/16 - 12/31/17 at $20.00
TBD:
12/01/16 - 12/31/16 at $18.00
Then take the next range. If it overlaps with the previous one, split the overlap (there are a few kinds of overlap, so make sure to handle them all), taking the minimum price for the overlapped region:
09/26/16 - 11/30/16 at $20.00
12/01/16 - 12/31/16 at $18.00
TBD:
01/01/17 - 12/31/17 at $20.00
Note that the last range is not final yet: any piece that is split off after the overlap goes back into your sorted list of "yet to be compared" items, because later ranges may still overlap it.
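As an alternative to splitting ranges one by one, here is a minimal sketch of a boundary sweep. This is a different technique from the one described above, not an implementation of it. It assumes whole-day dates with inclusive end dates, as in the question; the names PriceRange and LowestPriceTimeline are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type: EndDate is inclusive and dates carry no time part.
public class PriceRange
{
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
    public decimal Price { get; set; }
}

public static class LowestPriceTimeline
{
    public static List<PriceRange> Build(IEnumerable<PriceRange> source)
    {
        var ranges = source.ToList();

        // Every day on which the set of active prices can change:
        // each start date, and the day after each inclusive end date.
        var cuts = ranges
            .SelectMany(r => new[] { r.StartDate, r.EndDate.AddDays(1) })
            .Distinct()
            .OrderBy(d => d)
            .ToList();

        var result = new List<PriceRange>();
        for (int i = 0; i + 1 < cuts.Count; i++)
        {
            DateTime from = cuts[i];
            DateTime to = cuts[i + 1].AddDays(-1); // back to an inclusive end date

            // Ranges that cover the whole segment [from, to].
            var active = ranges.Where(r => r.StartDate <= from && r.EndDate >= to).ToList();
            if (active.Count == 0) continue; // nothing is priced in this segment

            decimal price = active.Min(r => r.Price);

            // Merge with the previous segment when it is adjacent and has the same price.
            var last = result.Count > 0 ? result[result.Count - 1] : null;
            if (last != null && last.Price == price && last.EndDate.AddDays(1) == from)
                last.EndDate = to;
            else
                result.Add(new PriceRange { StartDate = from, EndDate = to, Price = price });
        }
        return result;
    }
}
```

For the two ranges in the question this should produce three segments: 09/26/16 - 11/30/16 at 20, 12/01/16 - 12/31/16 at 18, and 01/01/17 - 12/31/17 at 20. Each segment scans all ranges, so this is quadratic in the worst case, which is probably fine for a few hundred price sets.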
Try this.
Let's say we have:
public class DatePrice
{
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
public decimal Price { get; set; }
}
and
IList<DatePrice> list = new List<DatePrice>(); // populate your data from the source..
var lowestPriceItem = list.OrderBy(item => item.Price).First();
This should give you the single item with the lowest price in the whole list.
class TimeRange
{
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
}
List<TimeRange> TimeRangeList = new List<TimeRange>()
{
new TimeRange() { StartDate = new DateTime(2050, 1, 1),
EndDate = new DateTime(2050, 1, 10) },
new TimeRange() { StartDate = new DateTime(2050, 2, 1),
EndDate = new DateTime(2050, 2, 10) },
// This item overlaps the first one, so the overlap validation should fail
new TimeRange() { StartDate = new DateTime(2050, 1, 5),
EndDate = new DateTime(2050, 1, 9) },
};
Even after reading similar topics, I still can't figure out an algorithm for checking whether any date ranges in this list overlap.
This is quite simple in SQL, according to "Checking for date overlap across multiple date range objects".
I just need to compare two date ranges like this:
SELECT COUNT(*)
FROM Table1
WHERE Table1.StartDate < 'endCheckDate'
AND Table1.EndDate > 'startCheckDate'
I find this difficult to do in LINQ: how do we compare every item in a collection against the other items in the same collection? Of course we can loop over the collection with foreach, just like comparing two lists, but how would that work in a Select?
Currently I'm doing something like this:
for (int i = 0; i < TimeRangeList.Count; ++i)
{
var item = TimeRangeList[i];
for (int y = i + 1; y < TimeRangeList.Count(); ++y)
{
var item2 = TimeRangeList[y];
if (IsOverLapped(item, item2))
{
// this is overlapped
};
}
}
private bool IsOverLapped(dynamic firstObj, dynamic secondObj)
{
return secondObj.StartDate <= firstObj.EndDate && firstObj.StartDate <= secondObj.EndDate;
}
Is there a more elegant way to do this without looping?
So my question is: how do we compare each item in a single list against the other items using LINQ?
A simple brute force idea:
bool overlap = TimeRangeList
.Any(r => TimeRangeList
.Where(q => q != r)
.Any(q => q.EndDate >= r.StartDate && q.StartDate <= r.EndDate) );
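If the list gets large, the quadratic brute force can be avoided. Here is a minimal sketch that sorts by start date and then compares each range only with the latest end seen so far. The class and method names are made up, the ranges are passed as value tuples rather than the question's TimeRange class, and touching end points count as an overlap, matching the inclusive comparisons in the brute-force version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverlapCheck
{
    // Returns true if any two ranges share at least one instant (end points inclusive).
    public static bool HasOverlap(IEnumerable<(DateTime Start, DateTime End)> ranges)
    {
        var sorted = ranges.OrderBy(r => r.Start).ToList();
        if (sorted.Count == 0) return false;

        DateTime latestEnd = sorted[0].End;
        for (int i = 1; i < sorted.Count; i++)
        {
            // After sorting, a range can only overlap an earlier one
            // if it starts on or before the latest end seen so far.
            if (sorted[i].Start <= latestEnd) return true;
            if (sorted[i].End > latestEnd) latestEnd = sorted[i].End;
        }
        return false;
    }
}
```

With the question's class this could be called as OverlapCheck.HasOverlap(TimeRangeList.Select(t => (t.StartDate, t.EndDate))). The cost is dominated by the sort, so it is O(n log n) instead of O(n²).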
Looking at your SQL code, it seems that you have a Table1 object which is a sequence of similar objects, let's say of class Table1Row. Every Table1Row has at least two DateTime properties, StartDate and EndDate. You also have two DateTime values: startCheckDate and endCheckDate.
You want to count all elements in Table1 that have a StartDate earlier than endCheckDate and an EndDate later than startCheckDate, which is exactly the overlap test from your SQL.
Written as an extension function of IQueryable:
public static int CountOverlapping(this IQueryable<Table1Row> table1,
DateTime startCheckDate,
DateTime endCheckDate)
{
// a row overlaps the checked range if it starts before the range ends and ends after the range starts
return table1
.Where (row => row.StartDate < endCheckDate && row.EndDate > startCheckDate)
.Count();
}
Usage:
DateTime startCheckDate = ...
DateTime endCheckDate = ...
IQueryable<Table1Row> table1 = ...
int nrOfOverlapping = table1.CountOverlapping(startCheckDate, endCheckDate);
Easy as pie?
I have a list of records containing Id, DateFrom, DateTo. For the sake of this question we can use this one:
List<(int, DateTime, DateTime)> data = new List<(int, DateTime, DateTime)>
{
(1, new DateTime(2012, 5, 16), new DateTime(2018, 1, 25)),
(2, new DateTime(2009, 1, 1), new DateTime(2011, 4, 27)),
(3, new DateTime(2014, 1, 1), new DateTime(2016, 4, 27)),
(4, new DateTime(2015, 1, 1), new DateTime(2015, 1, 3)),
(2, new DateTime(2013, 5, 10), new DateTime(2017, 4, 27)),
(5, new DateTime(2013, 5, 16), new DateTime(2018, 1, 24)),
(2, new DateTime(2017, 4, 28), new DateTime(2018, 1, 24)),
};
In my real case the list could be a lot bigger. Initially I assumed there could be only one record per Id, and I came up with a pretty good solution for that. But now, as you can see, an Id can have several periods, and all of them must be taken into account when computing the total overlap.
The task is to find the two Ids that have the longest total overlap and to return the Ids and the number of overlapping days.
In this sample case, these should be Ids 1 and 2.
My implementation of this is the following:
public (int, int, int) GetLongestElapsedPeriodWithDuplications(List<(int, DateTime, DateTime)> periods)
{
Dictionary<int, List<(DateTime, DateTime)>> periodsByPeriodId = new Dictionary<int, List<(DateTime, DateTime)>>();
foreach (var period in periods)
{
if (periodsByPeriodId.ContainsKey(period.Item1))
{
periodsByPeriodId[period.Item1].Add((period.Item2, period.Item3));
}
else
{
periodsByPeriodId[period.Item1] = new List<(DateTime, DateTime)>();
periodsByPeriodId[period.Item1].Add((period.Item2, period.Item3));
}
}
int firstId = -1;
int secondId = -1;
int periodInDays = 0;
foreach (var period in periodsByPeriodId)
{
var Id = period.Key;
foreach (var currPeriod in periodsByPeriodId)
{
int currentPeriodInDays = 0;
if (Id != currPeriod.Key)
{
for (var i = 0; i < period.Value.Count; i++)
{
for (var j = 0; j < currPeriod.Value.Count; j++)
{
var firstPeriodDateFrom = period.Value[i].Item1;
var firstPeriodDateTo = period.Value[i].Item2;
var secondPeriodDateFrom = currPeriod.Value[j].Item1;
var secondPeriodDateTo = currPeriod.Value[j].Item2;
if (secondPeriodDateFrom < firstPeriodDateTo && secondPeriodDateTo > firstPeriodDateFrom)
{
DateTime commonStartingDate = secondPeriodDateFrom > firstPeriodDateFrom ? secondPeriodDateFrom : firstPeriodDateFrom;
DateTime commonEndDate = secondPeriodDateTo > firstPeriodDateTo ? firstPeriodDateTo : secondPeriodDateTo;
currentPeriodInDays += (int)(commonEndDate - commonStartingDate).TotalDays;
}
}
}
if (currentPeriodInDays > periodInDays)
{
periodInDays = currentPeriodInDays;
firstId = Id;
secondId = currPeriod.Key;
}
}
}
}
return (firstId, secondId, periodInDays);
}
As you can see, the method is pretty big and, in my opinion, far from optimized in terms of execution speed. I know that the nested loops raise the complexity a lot, but the additional requirement to handle more than one period per Id really left me without ideas. How can I optimize this logic so that it runs faster on bigger inputs?
As in your original solution, you need to compare each interval with every other interval except those with the same Id, so I'd code it like this.
Supporting classes, just to simplify the actual algorithm:
class Period {
public DateTime Start { get; }
public DateTime End { get; }
public Period(DateTime start, DateTime end) {
this.Start = start;
this.End = end;
}
public int Overlap(Period other) {
DateTime a = this.Start > other.Start ? this.Start : other.Start;
DateTime b = this.End < other.End ? this.End : other.End;
return (a < b) ? b.Subtract(a).Days : 0;
}
}
class IdData {
public IdData() {
this.Periods = new List<Period>();
this.Overlaps = new Dictionary<int, int>();
}
public List<Period> Periods { get; }
public Dictionary<int, int> Overlaps { get; }
}
Method to find max overlap:
static int GetLongestElapsedPeriod(List<(int, DateTime, DateTime)> periods) {
int maxOverlap = 0;
Dictionary<int, IdData> ids = new Dictionary<int, IdData>();
foreach (var period in periods) {
int id = period.Item1;
Period idPeriod = new Period(period.Item2, period.Item3);
// preserve interval for ID
var idData = ids.GetValueOrDefault(id, new IdData());
idData.Periods.Add(idPeriod);
ids[id] = idData;
foreach (var idObj in ids) {
if (idObj.Key != id) {
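// add the overlap of the new interval with every interval seen so far for this other Id
int o = idObj.Value.Overlaps.GetValueOrDefault(id, 0);
foreach (var otherPeriods in idObj.Value.Periods)
o += idPeriod.Overlap(otherPeriods);
idObj.Value.Overlaps[id] = o;
// keep the pair total symmetric, so it is found again whichever of the two Ids gets its next period
idData.Overlaps[idObj.Key] = o;
// check whether the newly calculated overlap is the new maximum (store the Ids here too if you need them)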
if (o > maxOverlap)
maxOverlap = o;
}
}
}
return maxOverlap;
}
You can use TimePeriodLibrary.NET:
PM> Install-Package TimePeriodLibrary.NET
TimePeriodCollection timePeriods = new TimePeriodCollection(
data.Select(q => new TimeRange(q.Item2, q.Item3)));
var longestOverlap = timePeriods
.OverlapPeriods(new TimeRange(timePeriods.Start, timePeriods.End))
.OrderByDescending(q => q.Duration)
.FirstOrDefault();
With an extension method:
public static T MaxBy<T, TKey>(this IEnumerable<T> src, Func<T, TKey> key, Comparer<TKey> keyComparer = null) {
keyComparer = keyComparer ?? Comparer<TKey>.Default;
return src.Aggregate((a, b) => keyComparer.Compare(key(a), key(b)) > 0 ? a : b);
}
And some helper functions
DateTime Max(DateTime a, DateTime b) => (a > b) ? a : b;
DateTime Min(DateTime a, DateTime b) => (a < b) ? a : b;
int OverlappingDays((DateTime DateFrom, DateTime DateTo) span1, (DateTime DateFrom, DateTime DateTo) span2) {
var maxFrom = Max(span1.DateFrom, span2.DateFrom);
var minTo = Min(span1.DateTo, span2.DateTo);
return Math.Max((minTo - maxFrom).Days, 0);
}
You can group together the spans with matching Ids (this assumes the tuple elements are named Id, DateFrom and DateTo):
var dg = data.GroupBy(d => d.Id);
Generate all pairs of Ids
var pdgs = from d1 in dg
from d2 in dg.Where(d => d.Key > d1.Key)
select new[] { d1, d2 };
Then compute the overlap in days between each pair of Ids and find the maximum:
var MaxOverlappingPair = pdgs.Select(pdg => new {
Id1 = pdg[0].Key,
Id2 = pdg[1].Key,
OverlapInDays = pdg[0].SelectMany(d1 => pdg[1].Select(d2 => OverlappingDays((d1.DateFrom, d1.DateTo), (d2.DateFrom, d2.DateTo)))).Sum()
}).MaxBy(TwoOverlap => TwoOverlap.OverlapInDays);
Since efficiency is mentioned, I should say that implementing some of these operations directly instead of using LINQ is more efficient, but you are using Tuples and in-memory structures so I don't think it will make much difference.
I ran some performance tests using a list of 24000 spans with 1249 unique IDs. The LINQ code took about 16 seconds. By inlining some of the LINQ and replacing anonymous objects with tuples, it came down to about 3.1 seconds. By adding a shortcut skipping any IDs whose cumulative days were shorter than the current max overlapping days and a few more optimizations, I got it down to less than 1 second.
var baseDate = new DateTime(1970, 1, 1);
int OverlappingDays(int DaysFrom1, int DaysTo1, int DaysFrom2, int DaysTo2) {
var maxFrom = DaysFrom1 > DaysFrom2 ? DaysFrom1 : DaysFrom2;
var minTo = DaysTo1 < DaysTo2 ? DaysTo1 : DaysTo2;
return (minTo > maxFrom) ? minTo - maxFrom : 0;
}
var dgs = data.Select(d => {
var DaysFrom = (d.DateFrom - baseDate).Days;
var DaysTo = (d.DateTo - baseDate).Days;
return (d.Id, DaysFrom, DaysTo, Dist: DaysTo - DaysFrom);
})
.GroupBy(d => d.Id)
.Select(dg => (Id: dg.Key, Group: dg, Dist: dg.Sum(d => d.Dist)))
.ToList();
var MaxOverlappingPair = (Id1: 0, Id2: 0, OverlapInDays: 0);
for (int j1 = 0; j1 < dgs.Count; ++j1) {
var dg1 = dgs[j1];
if (dg1.Dist > MaxOverlappingPair.OverlapInDays)
for (int j2 = j1 + 1; j2 < dgs.Count; ++j2) {
var dg2 = dgs[j2];
if (dg2.Dist > MaxOverlappingPair.OverlapInDays) {
var testOverlapInDays = 0;
foreach (var d1 in dg1.Group)
foreach (var d2 in dg2.Group)
testOverlapInDays += OverlappingDays(d1.DaysFrom, d1.DaysTo, d2.DaysFrom, d2.DaysTo);
if (testOverlapInDays > MaxOverlappingPair.OverlapInDays)
MaxOverlappingPair = (dg1.Id, dg2.Id, testOverlapInDays);
}
}
}
Optimizations applied:
Convert each span's DateTimes to a number of days from an arbitrary baseDate, so the date conversion is done once and the overlap calculation works on integers.
Compute the total days for each ID and skip any ID pairs that can't exceed the current maximum overlap.
Replace SelectMany/Select with nested foreach to compute overlapping days.
Use ValueTuples instead of anonymous objects which are (slightly) faster for this problem.
Replace pair generation LINQ with nested for loops generating each possible pair directly
Pass individual from/to parameters instead of objects to OverlappingDays function
Note: I tried a smarter overlapping days calculation but when the number of spans per ID is small, the overhead took longer than just doing the calculation directly.
Several solutions have already been posted, but if you want to improve efficiency, you don't have to compare every value with every other value. You can use an interval search tree for this problem; it can be solved in R log N, where R is the number of intersections between intervals.
I recommend watching Robert Sedgewick's video on this; his book is also available online.
Your basic problem here is how to identify a unique set of time periods. Give each one its own unique ID yourself.
When you write your final answer, include the additional details in the output so the user can understand which (original) IDs and original time periods resulted in the final answer.
Remember - the problem is still the same as in the original post (https://codereview.stackexchange.com/questions/186014/finding-the-longest-overlapping-period/186031?noredirect=1#comment354707_186031) and you still have the same information to work with. Don't get too hung up on the "ID"s as provided in the original list - you are still iterating through a list of time periods.
I've got a Web API and a Get method, returning a query:
var query = from results in context.Table
where results.Date>= startDate && results.Date <= endDate
select new
{
Week = { this is where I need a method to group by weeks },
Average = results.Where(x => x.Number).Average()
}
return query.ToList();
I want to calculate the average for each 7 days (that being the first week).
Example:
Average 1 ... day 7 (Week 1)
Average 2 ... day 14 (Week 2)
How can I do that? Given an interval of dates, I want to group it into consecutive 7-day weeks counted from the start of the interval (not weeks of the year).
Try this (not tested with tables)
var avgResult = context.QuestionaireResults
.Where(r => (r.DepartureDate >= startDate && r.DepartureDate <= endDate)).ToList()
.GroupBy(g => (g.DepartureDate - startDate).Days / 7 + 1) // week number counted from startDate; the day of the month would mix different months together
.Select( g => new
{
Week = g.Key,
Avg = g.Average(n => n.Number)
});
You will need to group by the number of days, since a reference date, divided by 7, so
.GroupBy(x => Math.Floor(((x.DepartureDate - new DateTime(1980,1,1)).TotalDays + 2) / 7))
Subtracting "Jan 1, 1980" from your departure date, gives you a TimeSpan object with the difference between the two dates. The TotalDays property of that timespan gives you timespan in days. Adding 2 corrects for the fact that "Jan 1, 1980" was a Tuesday. Dividing by 7 gives you the number of weeks since then. Math.Floor rounds it down, so that you get a consistent integer for the week, given any day of the week or portion of days within the week.
You could simplify a little by picking a reference date that is a Sunday (assuming that is your "first day of the week"), so you don't have to add 2 to correct. Like so:
.GroupBy(x => Math.Floor(((x.DepartureDate - new DateTime(1979,12,30)).TotalDays) / 7))
If you are sure that your data all falls within a single calendar year, you could maybe use the Calendar.GetWeekOfYear method to figure out the week, but I am not sure it would be any simpler.
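Since the question asks for weeks counted from the start of the interval rather than calendar weeks, the same idea works with startDate as the reference date. A minimal in-memory sketch follows; the class name and the tuple shape are made up for illustration, it assumes every row falls on or after startDate, and it runs on LINQ to Objects (I have not checked how a database query provider would translate it).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeeklyAverages
{
    // Week 1 covers startDate and the following 6 days, week 2 the next 7 days, and so on.
    public static List<(int Week, double Average)> Compute(
        IEnumerable<(DateTime Date, double Number)> rows, DateTime startDate)
    {
        return rows
            // whole days since startDate, integer-divided into 7-day buckets
            .GroupBy(r => (r.Date.Date - startDate.Date).Days / 7 + 1)
            .OrderBy(g => g.Key)
            .Select(g => (Week: g.Key, Average: g.Average(r => r.Number)))
            .ToList();
    }
}
```

Using the integer Days property with integer division avoids the Math.Floor call, at the price of ignoring the time of day.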
Why not write a stored procedure? I think LINQ may limit your flexibility here, because GroupBy normally groups by a value of the item itself, so you can group by State or Age. But I guess you can group by week too... (new thought)
Add a property called EndOfWeek. For example, if the week ends on Sunday, then for this week EndOfWeek = 9.2.16, whereas for last week it was 8.28.16, and so on. Then you can easily group, but you still have to arrange the data.
I know I didn't answer the question but I hope that I sparked some brain activity in an area that allows you to solve the problem.
--------- UPDATED ----------------
Simple solution: loop through your records and, for each record, determine its EndOfWeek. That gives you a groupable value, so you can simply group by EndOfWeek. Now, @MikeMcCaughan, please tell me how this doesn't work? Is it illogical to extend an object? What are you talking about?
------------ HERE IS THE CODE ----------------
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace SandboxConsole
{
class Program
{
static void Main(string[] args)
{
var t = new Transactions();
List<Transactions> transactions = t.GetTransactions();
// Now let's add a Weeks end date so we can determine the average per week
foreach(var transaction in transactions)
{
var transactionDayOfWeek = transaction.TransactionDate;
int daysUntilEndOfWeek_Sat = ((int)DayOfWeek.Saturday - (int)transactionDayOfWeek.DayOfWeek + 7) % 7;
transaction.Newly_Added_Property_To_Group_By_Week_To_Get_Averages = transactionDayOfWeek.AddDays(daysUntilEndOfWeek_Sat).ToShortDateString();
//Console.WriteLine("{0} {")
}
foreach(var weekEnd in transactions.GroupBy(tt => tt.Newly_Added_Property_To_Group_By_Week_To_Get_Averages))
{
decimal weekTotal = 0;
foreach(var trans in weekEnd)
{
weekTotal += trans.Amount;
}
var weekAverage = weekTotal / 7;
Console.WriteLine("Week End: {0} - Avg {1}", weekEnd.Key.ToString(), weekAverage.ToString("C"));
}
Console.ReadKey();
}
}
class Transactions
{
public int Id { get; set; }
public string SomeOtherProp { get; set; }
public DateTime TransactionDate { get; set; }
public decimal Amount { get; set; }
public string Newly_Added_Property_To_Group_By_Week_To_Get_Averages { get; set; }
public List<Transactions> GetTransactions()
{
var results = new List<Transactions>();
for(var i = 0; i<100; i++)
{
results.Add(new Transactions
{
Id = i,
SomeOtherProp = "Customer " + i.ToString(),
TransactionDate = GetRandomDate(i),
Amount = GetRandomAmount()
});
}
return results;
}
// One shared generator: creating a new Random on every call can reuse the same seed and return the same values
private static readonly Random Rng = new Random();
public DateTime GetRandomDate(int i)
{
DateTime startTime = new DateTime(2016, 1, 1);
int range = (DateTime.Today - startTime).Days + i;
return startTime.AddDays(Rng.Next(range));
}
public int GetRandomAmount()
{
return Rng.Next(1000, 10000);
}
}
}
I have an initial and a final date range = 1/1/2015 - 1/30/2015
I have these date ranges that represent dates of unavailability.
1/5/2015 - 1/10/2015
1/15/2015 - 1/20/2015
1/22/2015 - 1/28/2015
I want this output, mainly the dates of availability from the main range:
A: 1/1/2015 - 1/4/2015
B: 1/11/2015 - 1/14/2015
C: 1/21/2015 - 1/21/2015
D: 1/29/2015 - 1/30/2015
I tried to generate a sequential date range like this in order to get the exception dates with Except() but I think I'm complicating the thing.
//dtStartDate = 1/1/2015
//dtEndDate = 1/30/2015
var days = (int)(dtEndDate - dtStartDate).TotalDays + 1;
var completeSeq = Enumerable.Range(0, days).Select(x => dtStartDate.AddDays(x)).ToArray();
How can I get the gaps between the date ranges within a period of time?
In other words, how can I get A, B, C and D from this picture:
http://www.tiikoni.com/tis/view/?id=ebe851c
If the unavailable ranges overlap, the overlap must not be reported; only real gaps count.
----------UPDATE-----------
I think if I do this:
var range = Enumerable.Range(0, (int)(new DateTime(2015, 1, 10) - new DateTime(2015, 1, 5)).TotalDays + 1).Select(i => new DateTime(2015, 1, 5).AddDays(i));
var missing = completeSeq.Except(range).ToArray();
for each unavailable date range, I will get the dates outside that range, but I still cannot get the gaps!
I saw your question this morning and really liked it, but I was busy the whole day. I finally got a chance to play with it, and believe me, I enjoyed it. Here is my code:
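DateTime startDate = new DateTime(2015, 1, 1);
DateTime endDate = new DateTime(2015, 1, 30);
int totalDays = (int)(endDate - startDate).TotalDays + 1;
// The unavailable ranges from the question, in date order (Availability is assumed to be a simple class with StartDate and EndDate properties)
var availability = new List<Availability>
{
new Availability { StartDate = new DateTime(2015, 1, 5), EndDate = new DateTime(2015, 1, 10) },
new Availability { StartDate = new DateTime(2015, 1, 15), EndDate = new DateTime(2015, 1, 20) },
new Availability { StartDate = new DateTime(2015, 1, 22), EndDate = new DateTime(2015, 1, 28) }
};
// Sentinel entry so the last gap has an upper bound (see Step 4)
availability.Add(new Availability { StartDate = endDate, EndDate = endDate });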
var result = from x in Enumerable.Range(0, totalDays)
let d = startDate.AddDays(x)
from a in availability.Select((v, i) => new { Value = v, Index = i })
where (a.Index == availability.Count - 1 ?
d <= a.Value.StartDate : d < a.Value.StartDate)
&& (a.Index != 0 ? d > availability[a.Index - 1].EndDate : true)
group new { d, a } by a.Value.StartDate into g
select new
{
AvailableDates = String.Format("{0} - {1}",g.Min(x => x.d),
g.Max(x => x.d))
};
This definitely needs an explanation, so here it is:
Step 1: Create a range of dates from Jan 01 till Jan 30 using Enumerable.Range.
Step 2: From the second unavailable range onwards, the selected dates must lie between the previous range's end date and the current range's start date, so I project the index of each range to get access to the previous end date.
Step 3: Once we have the index, all we need to do is filter the dates; the first range is the exception, since it has no previous range.
Step 4: The last gap has no upper bound, so I add endDate to the unavailable list as a one-day entry (hope this makes sense).
Here is the Working Fiddle. If you get confused, just remove the group by and the other filters, debug, and look at the resulting output; it will look fairly easy :)
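For comparison, the same gaps can be found without enumerating every day, by sorting the unavailable ranges and walking through them once. This is a minimal sketch rather than part of the query above; GapFinder and its tuple parameters are made-up names, and it assumes whole-day dates with inclusive end dates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GapFinder
{
    public static List<(DateTime Start, DateTime End)> FindGaps(
        DateTime rangeStart, DateTime rangeEnd,
        IEnumerable<(DateTime Start, DateTime End)> unavailable)
    {
        var gaps = new List<(DateTime Start, DateTime End)>();
        DateTime cursor = rangeStart; // first day not yet known to be unavailable

        foreach (var block in unavailable.OrderBy(b => b.Start))
        {
            if (block.Start > rangeEnd) break; // everything else lies after the main range

            // Free days between the cursor and the day before this block starts.
            if (block.Start > cursor)
                gaps.Add((cursor, block.Start.AddDays(-1)));

            // Move the cursor past this block; overlapping blocks never move it backwards.
            DateTime next = block.End.AddDays(1);
            if (next > cursor) cursor = next;
        }

        // Whatever is left after the last block is free as well.
        if (cursor <= rangeEnd)
            gaps.Add((cursor, rangeEnd));
        return gaps;
    }
}
```

Because the cursor only ever moves forward, overlapping or nested unavailable ranges do not produce spurious gaps.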
using System;
using System.Collections.Generic;
using System.Linq;
public static class Program {
public static void Main() {
Tuple<DateTime,DateTime> range=Tuple.Create(new DateTime(2015,1,1),new DateTime(2015,1,30));
Tuple<DateTime,DateTime>[] exclude=new[] {
Tuple.Create(new DateTime(2015,1,5),new DateTime(2015,1,10)),
Tuple.Create(new DateTime(2015,1,15),new DateTime(2015,1,20)),
Tuple.Create(new DateTime(2015,1,22),new DateTime(2015,1,28))
};
foreach(Tuple<DateTime,DateTime> r in ExcludeIntervals(range,exclude)) {
Console.WriteLine("{0} - {1}",r.Item1,r.Item2);
}
}
public static IEnumerable<Tuple<DateTime,DateTime>> ExcludeIntervals(Tuple<DateTime,DateTime> range,IEnumerable<Tuple<DateTime,DateTime>> exclude) {
IEnumerable<Tuple<DateTime,bool>> dates=
new[] { Tuple.Create(range.Item1.AddDays(-1),true),Tuple.Create(range.Item2.AddDays(1),false) }.
Concat(exclude.SelectMany(r => new[] { Tuple.Create(r.Item1,false),Tuple.Create(r.Item2,true) })).
OrderBy(d => d.Item1).ThenBy(d => d.Item2); //Get ordered list of time points where availability can change.
DateTime firstFreeDate=default(DateTime);
int count=1; //Number of unavailability intervals currently active. Start from 1 to treat the time before the range starts as unavailable.
foreach(Tuple<DateTime,bool> date in dates) {
if(date.Item2) { //false - start of unavailability interval. true - end of unavailability interval.
if(--count==0) { //Become available.
firstFreeDate=date.Item1.AddDays(1);
}
} else {
if(++count==1) { //Become unavailable.
DateTime lastFreeDate=date.Item1.AddDays(-1);
if(lastFreeDate>=firstFreeDate) { //If next unavailability starts right after previous ended, then no gap.
yield return Tuple.Create(firstFreeDate,lastFreeDate);
}
}
}
}
}
}
Got a little oopy...
public class DateRange
{
public DateTime Start { get; set; }
public DateTime End { get; set; }
public bool HasStart
{
get { return Start != DateTime.MinValue; }
}
public bool IsInRange(DateTime date)
{
return (date >= this.Start && date <= this.End);
}
public List<DateRange> GetAvailableDates(DateRange excludedRange)
{
return GetAvailableDates(new List<DateRange>(){excludedRange});
}
public List<DateRange> GetAvailableDates(List<DateRange> excludedRanges)
{
if (excludedRanges == null)
{
return new List<DateRange>() { this };
}
var list = new List<DateRange>();
var aRange = new DateRange();
var date = this.Start;
while (date <= this.End)
{
bool isInARange = excludedRanges.Any(er => er.HasStart && er.IsInRange(date));
if (!isInARange)
{
if (!aRange.HasStart)
{
aRange.Start = date;
}
aRange.End = date;
}
else
{
if (aRange.HasStart)
{
list.Add(aRange);
aRange = new DateRange();
}
}
date = date.AddDays(1);
}
if (aRange.HasStart)
{
list.Add(aRange);
}
return list;
}
}
Update 1 , following Ayende's answer
This is my first journey into RavenDB. To experiment with it I wrote a small map/reduce index, but unfortunately the result is empty.
I have around 1.6 million documents loaded into RavenDB.
A document:
public class Tick
{
public DateTime Time;
public decimal Ask;
public decimal Bid;
public double AskVolume;
public double BidVolume;
}
and wanted to get Min and Max of Ask over a specific period of Time.
The collection by Time is defined as:
var ticks = session.Query<Tick>().Where(x => x.Time > new DateTime(2012, 4, 23) && x.Time < new DateTime(2012, 4, 24, 00, 0, 0)).ToList();
Which gives me 90280 documents, so far so good.
But then the map/ reduce:
Map = rows => from row in rows
select new
{
Max = row.Bid,
Min = row.Bid,
Time = row.Time,
Count = 1
};
Reduce = results => from result in results
group result by new{ result.MaxBid, result.Count} into g
select new
{
Max = g.Key.MaxBid,
Min = g.Min(x => x.MaxBid),
Time = g.Key.Time,
Count = g.Sum(x => x.Count)
};
...
private class TickAggregationResult
{
public decimal MaxBid { get; set; }
public decimal MinBid { get; set; }
public int Count { get; set; }
}
I then create the index and try to Query it:
Raven.Client.Indexes.IndexCreation.CreateIndexes(typeof(TickAggregation).Assembly, documentStore);
var session = documentStore.OpenSession();
var g1 = session.Query<TickAggregationResult>(typeof(TickAggregation).Name);
var group = session.Query<Tick, TickAggregation>()
.Where(x => x.Time > new DateTime(2012, 4, 23) &&
x.Time < new DateTime(2012, 4, 24, 00, 0, 0)
)
.Customize(x => x.WaitForNonStaleResults())
.AsProjection<TickAggregationResult>();
But the group is just empty :(
As you can see I've tried two different Queries, I'm not sure about the difference, can someone explain?
Now I get an error:
The group is still empty :(
Let me explain what I'm trying to accomplish in pure sql:
select min(Ask), count(*) as TickCount from Ticks
where Time between '2012-04-23' and '2012-04-24'
Unfortunately, Map/Reduce doesn't work that way. Well, at least the Reduce part of it doesn't. In order to reduce your set, you would have to predefine specific time ranges to group by, for example - daily, weekly, monthly, etc. You could then get min/max/count per day if you reduced daily.
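To make the "reduce daily" idea concrete, here is what the grouping amounts to, written as plain in-memory LINQ. This is only an illustration of the shape of the result, not RavenDB index code; the class name and the tuple shape are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DailyTickStats
{
    // One row per calendar day: the group key is fixed in advance (the day),
    // which is what lets a reduce step combine partial results.
    public static List<(DateTime Day, decimal Min, decimal Max, int Count)> Compute(
        IEnumerable<(DateTime Time, decimal Bid)> ticks)
    {
        return ticks
            .GroupBy(t => t.Time.Date)
            .OrderBy(g => g.Key)
            .Select(g => (Day: g.Key,
                          Min: g.Min(t => t.Bid),
                          Max: g.Max(t => t.Bid),
                          Count: g.Count()))
            .ToList();
    }
}
```

A query for an arbitrary time range would then combine the per-day rows that fall inside it, which only works when the range boundaries line up with whole days.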
There is a way to get what you want, but it has some performance considerations. Basically, you don't reduce at all, but you index by time and then do the aggregation when transforming results. This is similar to if you ran your first query to filter and then aggregated in your client code. The only benefit is that the aggregation is done server-side, so you don't have to transmit all of that data to the client.
The performance concern here is the size of the time range you are filtering to, or more precisely, how many items fall inside your filter range. If that number is relatively small, you can use this approach. If it's too large, you will be waiting while the server goes through the result set.
Here is a sample program that illustrates this technique:
using System;
using System.Linq;
using Raven.Client.Document;
using Raven.Client.Indexes;
using Raven.Client.Linq;
namespace ConsoleApplication1
{
public class Tick
{
public string Id { get; set; }
public DateTime Time { get; set; }
public decimal Bid { get; set; }
}
/// <summary>
/// This index is a true map/reduce, but its totals are for all time.
/// You can't filter it by time range.
/// </summary>
class Ticks_Aggregate : AbstractIndexCreationTask<Tick, Ticks_Aggregate.Result>
{
public class Result
{
public decimal Min { get; set; }
public decimal Max { get; set; }
public int Count { get; set; }
}
public Ticks_Aggregate()
{
Map = ticks => from tick in ticks
select new
{
Min = tick.Bid,
Max = tick.Bid,
Count = 1
};
Reduce = results => from result in results
group result by 0
into g
select new
{
Min = g.Min(x => x.Min),
Max = g.Max(x => x.Max),
Count = g.Sum(x => x.Count)
};
}
}
/// <summary>
/// This index can be filtered by time range, but it does not reduce anything
/// so it will not be performant if there are many items inside the filter.
/// </summary>
class Ticks_ByTime : AbstractIndexCreationTask<Tick>
{
public class Result
{
public decimal Min { get; set; }
public decimal Max { get; set; }
public int Count { get; set; }
}
public Ticks_ByTime()
{
Map = ticks => from tick in ticks
select new {tick.Time};
TransformResults = (database, ticks) =>
from tick in ticks
group tick by 0
into g
select new
{
Min = g.Min(x => x.Bid),
Max = g.Max(x => x.Bid),
Count = g.Count()
};
}
}
class Program
{
private static void Main()
{
var documentStore = new DocumentStore { Url = "http://localhost:8080" };
documentStore.Initialize();
IndexCreation.CreateIndexes(typeof(Program).Assembly, documentStore);
var today = DateTime.Today;
var rnd = new Random();
using (var session = documentStore.OpenSession())
{
// Generate 100 random ticks
for (var i = 0; i < 100; i++)
{
var tick = new Tick { Time = today.AddMinutes(i), Bid = rnd.Next(100, 1000) / 100m };
session.Store(tick);
}
session.SaveChanges();
}
using (var session = documentStore.OpenSession())
{
// Query items with a filter. This will create a dynamic index.
var fromTime = today.AddMinutes(20);
var toTime = today.AddMinutes(80);
var ticks = session.Query<Tick>()
.Where(x => x.Time >= fromTime && x.Time <= toTime)
.OrderBy(x => x.Time);
// Output the results of the above query
foreach (var tick in ticks)
Console.WriteLine("{0} {1}", tick.Time, tick.Bid);
// Get the aggregates for all time
var total = session.Query<Tick, Ticks_Aggregate>()
.As<Ticks_Aggregate.Result>()
.Single();
Console.WriteLine();
Console.WriteLine("Totals");
Console.WriteLine("Min: {0}", total.Min);
Console.WriteLine("Max: {0}", total.Max);
Console.WriteLine("Count: {0}", total.Count);
// Get the aggregates with a filter
var filtered = session.Query<Tick, Ticks_ByTime>()
.Where(x => x.Time >= fromTime && x.Time <= toTime)
.As<Ticks_ByTime.Result>()
.Take(1024) // max you can take at once
.ToList() // required!
.Single();
Console.WriteLine();
Console.WriteLine("Filtered");
Console.WriteLine("Min: {0}", filtered.Min);
Console.WriteLine("Max: {0}", filtered.Max);
Console.WriteLine("Count: {0}", filtered.Count);
}
Console.ReadLine();
}
}
}
I can envision a solution to the problem of aggregating over a time filter with a potentially large scope. The reduce would have to break things down into progressively smaller units of time at different levels. The code for this is a bit complex, but I am working on it for my own purposes. When complete, I will post over in the knowledge base at www.ravendb.net.
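As a rough illustration of what "different levels" of time units could mean, the sketch below keeps pre-aggregated totals at two levels (daily and hourly) and answers a range query by using whole-day totals where a full day is covered and hourly totals at the edges. This is only one possible reading of the idea, done in memory with plain LINQ; it is not RavenDB index code and not the solution referred to above. Bucket and TwoLevelAggregate are hypothetical names, and the query bounds are assumed to fall on whole hours.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical pre-aggregated totals; none of these names come from RavenDB.
public class Bucket
{
    public decimal Min { get; set; }
    public decimal Max { get; set; }
    public int Count { get; set; }
}

public class TwoLevelAggregate
{
    private readonly Dictionary<DateTime, Bucket> _hourly;
    private readonly Dictionary<DateTime, Bucket> _daily;

    // Each input pair is (tick time, bid).
    public TwoLevelAggregate(IEnumerable<KeyValuePair<DateTime, decimal>> ticks)
    {
        var list = ticks.ToList();
        _hourly = Build(list, t => t.Date.AddHours(t.Hour));
        _daily = Build(list, t => t.Date);
    }

    private static Dictionary<DateTime, Bucket> Build(
        List<KeyValuePair<DateTime, decimal>> ticks, Func<DateTime, DateTime> keyOf)
    {
        return ticks.GroupBy(t => keyOf(t.Key)).ToDictionary(
            g => g.Key,
            g => new Bucket { Min = g.Min(t => t.Value), Max = g.Max(t => t.Value), Count = g.Count() });
    }

    // Aggregates ticks with from <= Time < to.
    // Assumes both bounds fall on whole hours; returns null if the range holds no ticks.
    public Bucket Query(DateTime from, DateTime to)
    {
        Bucket total = null;
        var cursor = from;
        while (cursor < to)
        {
            // Use the coarse daily total only when the whole day lies inside the range.
            bool wholeDay = cursor.TimeOfDay == TimeSpan.Zero && cursor.AddDays(1) <= to;
            Bucket part;
            bool found = wholeDay
                ? _daily.TryGetValue(cursor, out part)
                : _hourly.TryGetValue(cursor, out part);
            if (found) total = Merge(total, part);
            cursor = wholeDay ? cursor.AddDays(1) : cursor.AddHours(1);
        }
        return total;
    }

    private static Bucket Merge(Bucket a, Bucket b)
    {
        if (a == null) return new Bucket { Min = b.Min, Max = b.Max, Count = b.Count };
        return new Bucket
        {
            Min = Math.Min(a.Min, b.Min),
            Max = Math.Max(a.Max, b.Max),
            Count = a.Count + b.Count
        };
    }
}
```

With this layout a query spanning several weeks touches roughly one bucket per full day plus a few hourly buckets at each end, rather than every tick in the range. Adding finer or coarser levels (minutes, months) would follow the same pattern.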
UPDATE
I was playing with this a bit more, and noticed two things in that last query.
You MUST call ToList() before calling Single() in order to get the full result set.
Even though this runs on the server, the max you can have in the result range is 1024, and you have to specify a Take(1024) or you get the default of 128 max. Since this runs on the server, I didn't expect this. But I guess it's because you don't normally do aggregations in the TransformResults section.
I've updated the code for this. However, unless you can guarantee that the range is small enough for this to work, I would wait for the better full map/reduce that I spoke of. I'm working on it. :)