I am trying to write a method that will determine the closest date chronologically, given a list of dates and a target date. For example, given a (simplified) date set of {Jan 2011, March 2011, November 2011} and a target date of April 2011, the method would return March 2011.
At first I was thinking of using LINQ's Skip, but I'm not sure of an appropriate Func such that it would stop before the date was exceeded. The code below seems a viable solution, but I'm not sure if there's a more efficient means of doing this. Presumably Last and First are each linear time.
The source dateSet can contain between 0 and 10,000 dates, generally around 5,000. Also, I iterate over this whole process between 5 and 50 times (this is the number of target dates for different iterations).
// Assume dateSet is ordered ascending in time.
public DateTime GetClosestDate(IEnumerable<DateTime> dateSet, DateTime targetDate)
{
    var earlierDate = dateSet.Last(x => x <= targetDate);
    var laterDate = dateSet.First(x => x >= targetDate);
    // Compare the TimeSpans from earlier to target and later to target:
    return (targetDate - earlierDate) <= (laterDate - targetDate)
        ? earlierDate
        : laterDate;
}
Well, using MinBy from MoreLINQ:
var nearest = dateSet.MinBy(date => Math.Abs((date - targetDate).Ticks));
In other words, for each date, find out how far it is by subtracting one date from the other (either way round), taking the number of Ticks in the resulting TimeSpan, and finding the absolute value. Pick the date which gives the smallest result for that difference.
If you can't use MoreLINQ, you could either write something similar yourself, or do it in two steps (blech):
var nearestDiff = dateSet.Min(date => Math.Abs((date - targetDate).Ticks));
var nearest = dateSet.Where(date => Math.Abs((date - targetDate).Ticks) == nearestDiff).First();
Using Last and First iterates the dateSet twice. You could iterate the dateSet yourself using your own logic. This would be more efficient, but unless your dateSet is very large or enumerating the dateSet is very costly for some other reason, the small gain in speed is probably not worth writing more complicated code. Your code should be easy to understand in the first place.
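Since the dateSet is already sorted ascending, one alternative (not in the answers above) is a binary search: List<DateTime>.BinarySearch finds the insertion point in O(log n), and the closest date is then one of its two neighbours. A sketch, assuming the data fits in a List<DateTime>:

```csharp
using System;
using System.Collections.Generic;

static class ClosestDateSearch
{
    // Assumes dates are sorted ascending. O(log n) per query.
    public static DateTime GetClosest(List<DateTime> dates, DateTime target)
    {
        int i = dates.BinarySearch(target);
        if (i >= 0) return dates[i];                // exact match
        i = ~i;                                     // index of first date > target
        if (i == 0) return dates[0];                // target precedes all dates
        if (i == dates.Count) return dates[i - 1];  // target follows all dates
        DateTime earlier = dates[i - 1], later = dates[i];
        return (target - earlier) <= (later - target) ? earlier : later;
    }
}
```

With ~5,000 dates and up to 50 target dates, that is roughly 50 × 13 comparisons instead of 50 full scans.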
It is simple!
List<DateTime> MyDateTimeList = ....;
....
DateTime getNearest(DateTime dt)
{
    return MyDateTimeList.OrderBy(t => Math.Abs((dt - t).TotalMilliseconds)).First();
}
I'm having some problems comparing real numbers stored as doubles.
I think the problems are most likely caused by rounding errors, but I'm not sure.
What would be the best way to compare numbers stored as doubles and tested in LINQ?
I get a time as a string from a 3rd-party source.
It looks like it is seconds from the epoch;
converting it to real time, I'm sure it is in seconds and not milliseconds.
I convert that to a double using
double Time = Convert.ToDouble("1549666889.6220000");
Then I use LINQ to extract from a list all the entries that encompass this time:
Infos.Where(x => x.StartTime <= starttime
&& x.EndTime >= starttime).OrderBy(x => x.StartTime).ToList();
and the results I get seem to be outside the comparison boundary I expected.
I expected the returned items to be those where the time I'm testing for lies between the start and end times of the items in the Infos list.
I get something like:
Start time            End time
1549665989.622097     1549666889.6221507
1549665989.6690228    1549666889.6790602
1549665989.8786857    1549666889.8817368
1549665989.8926628    1549666889.9037011
These results seem wrong, especially the start times, as they should be less than the time index I'm given.
I think this is a rounding issue, but I'm not sure if it's that or my logic.
If it is a rounding issue, how should I be doing the testing in LINQ?
Any advice appreciated.
It just occurred to me: maybe I should multiply each double value by 10,000,000 to remove the decimals and compare just whole numbers?
Is that a good idea?
Converting a string like "1549665989.622097" to double runs into the limits of double precision. In this case the converted double will display as 1549665989.6221.
If precision errors of your doubles are a problem, you should make use of the decimal data type:
The decimal keyword indicates a 128-bit data type. Compared to other floating-point types, the decimal type has more precision and a smaller range, which makes it appropriate for financial and monetary calculations.
Convert.ToDecimal provides the required conversion from a string. The result will be 1549665989.622097 without a precision error.
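A quick way to see the difference (note: the exact string a double prints depends on the runtime; .NET Framework's default ToString rounds to 15 significant digits, while newer runtimes print the shortest round-trippable form):

```csharp
using System;
using System.Globalization;

class PrecisionDemo
{
    static void Main()
    {
        string s = "1549665989.622097";

        // double: ~15-17 significant digits; the value is only approximated.
        double d = Convert.ToDouble(s, CultureInfo.InvariantCulture);
        Console.WriteLine(d);   // .NET Framework prints 1549665989.6221

        // decimal: all digits of the original string are preserved.
        decimal m = Convert.ToDecimal(s, CultureInfo.InvariantCulture);
        Console.WriteLine(m);   // 1549665989.622097
    }
}
```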
Your conversion is inefficient
You do realize that you convert the StartTime string to a double for your Where, and many more times for your OrderBy: the OrderBy will compare the 1st element with the 2nd, the 1st with the 3rd, the 2nd with the 3rd, the 1st with the 4th, and so on; you convert your strings to doubles over and over again.
Wouldn't it be more efficient to remember this conversion and re-use the converted values?
You convert to the wrong type
As we are converting the 3rd party data anyway, why not convert it to a proper object that represents a point in time: System.DateTime?
Write two extension methods for class Info:
static class InfoExtensions
{
    // Ask your 3rd party what the value means. The question suggests
    // Unix time, i.e. seconds since 1970-01-01 UTC (an assumption):
    private static readonly DateTime epochTime =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    public static DateTime StartDateTime(this Info info)
    {
        return info.StartTime.ToDateTime();
    }

    public static DateTime EndDateTime(this Info info)
    {
        return info.EndTime.ToDateTime();
    }

    private static DateTime ToDateTime(this string date3rdParty)
    {
        double secondsSinceEpochTime =
            double.Parse(date3rdParty, CultureInfo.InvariantCulture);
        return epochTime.AddSeconds(secondsSinceEpochTime);
    }
}
Usage:
DateTime startTime = ...;
var result = Infos
    .Select(info => new
    {
        StartTime = info.StartDateTime(),
        EndTime = info.EndDateTime(),
        // select the Info properties you actually plan to use:
        ...
        // or select the complete Info:
        Info = info,
    })
    .Where(x => x.StartTime <= startTime && startTime <= x.EndTime)
    .OrderBy(x => x.StartTime)
    // only if you prefer to throw away your converted StartTime / EndTime:
    .Select(x => x.Info);
It might be that the precision of your 3rd-party time is different from the precision of DateTime, and that you want the ultimate precision. In that case, consider converting their string into DateTime.Ticks, and then use these Ticks to create a new DateTime object. Since Ticks are integers, you'll have less trouble with the conversion.
Separation of concerns
You should work more on separation of concerns. If you separated the way your 3rd party represents their idea of dates (some string representation of seconds since some epoch time) from the way you would like to have it (probably System.DateTime), then you wouldn't have this problem.
If you separated their info class from your info class, your code would be more maintainable, because there would be only one place where their info properties are translated into your info properties. If in future they add properties that you don't use, you would not notice it. If they decide to change their idea of a date, for instance by using a different epoch time, or maybe a System.DateTime, there will only be one place where you'll have to change your info. Also: if a fourth-party info comes along, there is only one place where you'll have to convert.
Separation is efficient: conversion is only done once, no matter how often you use property StartTime. For instance, if in future you want to get all Infos grouped by the same Date.
Separation is also easier to test: most of your code will work with your own converted info classes. Only one small piece of code will convert their info to your idea of info. You can test most code using your info class, and there is only one place where you'll have to test the conversion: once you know the conversion is okay, you'll never have to worry about it anymore.
Create a class MyNamespace.Info that has a constructor taking a ThirdPartyNamespace.Info:
class MyInfo
{
    public DateTime StartTime { get; set; }
    public DateTime EndTime { get; set; }
    ... // other info properties you actually plan to use

    // Constructors:
    public MyInfo() { } // default constructor

    public MyInfo(ThirdPartyNamespace.Info info)
    {
        this.StartTime = info.StartTime.ToDateTime();
        this.EndTime = info.EndTime.ToDateTime();
        ...
    }
}
Did you see how easy it is to add support for Info from a fourth party? Or how little change there is if the 3rd party info changes, or if you need more properties (or less)?
Almost all your code can be tested using your local info class. Only one test class is needed to test that the 3rd party info is properly converted to your info.
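Converting at the boundary then looks something like this fragment (it builds on the MyInfo class above; thirdPartyInfos is a placeholder name for whatever the 3rd party hands you):

```csharp
// Convert once, at the edge of your system; everything after this
// works with DateTime values and never sees the 3rd-party strings.
List<MyInfo> myInfos = thirdPartyInfos
    .Select(info => new MyInfo(info))
    .ToList();

// The original query, now expressed in terms of DateTime:
var hits = myInfos
    .Where(i => i.StartTime <= startTime && startTime <= i.EndTime)
    .OrderBy(i => i.StartTime)
    .ToList();
```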
I have got a text file in which every line looks like:
8:30 8:50 1
..........
20:30 20:35 151
Every line is a new user connection with its time period in In-net.
The goal is to find the periods of time where the quantity of connections reaches its maximum.
So, maybe someone knows an algorithm that can help me with this task (multiple intersections)? I find this task non-trivial (because I am new to programming); I have some ideas, but I find them awful, so maybe I should start with a mathematical algorithm to find the best way to achieve my goal.
To begin, we have to make some assumptions:
1. You are looking for the shortest period with the maximum number of connections.
2. Every line represents one connection. It's not clear from your question what the integer number after the start and end times on every line means, so I ignore it.
3. The lines are given in order of increasing period start time.
4. We are free to choose any local maximum as the answer in case we get several periods with the same number of simultaneous connections.
The first stage of the solution is parsing. Given a sequence of lines, we get a sequence of System.DateTime pairs, one pair for each period, in order.
static Tuple<DateTime, DateTime> Parse(string line)
{
var a = line.Split()
.Take(2) // take the start and end times only
.Select(p =>
DateTime.ParseExact(p, "H:m",
CultureInfo.InvariantCulture))
.ToArray();
return Tuple.Create(a[0], a[1]);
}
The next stage is the algorithm itself. It has two parts. First, we find local maximums as triples of start time, end time and connection count. Second, we select the absolute maximum from the set produced by the first part:
// the maximum number of simultaneous connections:
File.ReadLines(FILE_PATH).Select(Parse).GetLocalMaximums().Max(x => x.Item3)
// the first period with that maximum:
File.ReadLines(FILE_PATH).Select(Parse).GetLocalMaximums()
    .Aggregate((x, y) => x.Item3 > y.Item3 ? x : y)
// or the last such period:
File.ReadLines(FILE_PATH).Select(Parse).GetLocalMaximums()
    .Aggregate((x, y) => x.Item3 >= y.Item3 ? x : y)
The most sophisticated part is detection of a local maximum.
1. Take the first period A and write down its end time. Then write down its start time as the last known start time. Note there is one end time written and there is one active connection.
2. Take the next period B and write down its end time. Compare the start time of B to the minimum of the end times written.
3. If no written end time is smaller than B's start time, the number of connections increases at this time. So discard the previous value for the last known start time and replace it with B's start time, then proceed to the next period. Note again there is one more connection at this time and we have one more end time, so the number of active connections is always equal to the number of written-down end times.
4. If there is a value in the list of end times smaller than B's start time, we had a decrease in connection count, which means we just passed a local maximum. We have to report it: yield the triple (the last known start time, the minimum of the written end times, the number of end times written minus one). We should not count the end time for B we had already written. Then discard all the end times that are less than B's start time, replace the last known start time, and proceed to the next period.
5. When the minimum end time equals B's start time, it means we've lost one connection and gained another at the same time. This means we have to discard that end time and proceed to the next period.
6. Repeat from step 2 for all the periods we have.
The source code for the local maximum detection:
static IEnumerable<Tuple<DateTime, DateTime, int>>
GetLocalMaximums(this IEnumerable<Tuple<DateTime, DateTime>> ranges)
{
DateTime lastStart = DateTime.MinValue;
var queue = new List<DateTime>();
foreach (var r in ranges)
{
queue.Add(r.Item2);
var t = queue.Min();
if (t < r.Item1)
{
yield return Tuple.Create(lastStart, t, queue.Count-1);
do
{
queue.Remove(t);
t = queue.Min();
} while (t < r.Item1);
}
if (t == r.Item1) queue.Remove(t);
else lastStart = r.Item1;
}
// yield the last local maximum
if (queue.Count > 0)
yield return Tuple.Create(lastStart, queue.Min(), queue.Count);
}
While using List<T> was not the best decision, it's easy to understand. Use a sorted collection for better performance. Replacing the tuples with structs would eliminate a lot of memory allocations.
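The "sorted collection" suggestion could be sketched like this: keep the written-down end times sorted ascending, so the minimum is always at index 0 and the lookup is O(log n). (Insert and RemoveAt still shift elements internally, but the per-period O(n) Min() scan disappears.)

```csharp
using System;
using System.Collections.Generic;

static class SortedEndTimes
{
    // Insert a value while keeping the list sorted ascending.
    public static void InsertSorted(List<DateTime> endTimes, DateTime value)
    {
        int i = endTimes.BinarySearch(value);
        if (i < 0) i = ~i;         // complement of the insertion point
        endTimes.Insert(i, value);
    }
    // In GetLocalMaximums: queue.Min() becomes endTimes[0],
    // and queue.Remove(t) becomes endTimes.RemoveAt(0).
}
```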
You could do:
string[] lines = System.IO.File.ReadAllLines(filePath);
var connections = lines
.Select(d => d.Split(' '))
.Select(d => new
{
From = DateTime.Parse(d[0]),
To = DateTime.Parse(d[1]),
Connections = int.Parse(d[2])
})
.OrderByDescending(d=>d.Connections).ToList();
connections will contain the sorted list with the top results first.
So I have a list of timestamps; they're not uniformly spaced, meaning one timestamp can be 10 minutes after the previous one or 5 seconds after. What's the best way to find the index of the entry that is closest to (DateTime.Now.TotalSeconds - 3600)?
Since you didn't give any specific code, we can only make suggestions.
What you can do is take the absolute difference between each date in the list and the desired date, and take the lowest.
Something like:
list.OrderBy( x => Math.Abs((x.Date - desiredDate).TotalMilliseconds)).FirstOrDefault();
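Note that OrderBy sorts the whole list (O(n log n)) only to take one element, and it returns the item rather than its index, which is what the question asks for. A single O(n) pass tracking the best index might look like this (a sketch; the list and member names are assumptions):

```csharp
using System;
using System.Collections.Generic;

static class TimestampSearch
{
    // Returns the index of the timestamp closest to desired, or -1 if empty.
    public static int IndexOfClosest(List<DateTime> timestamps, DateTime desired)
    {
        int bestIndex = -1;
        double bestDiff = double.MaxValue;
        for (int i = 0; i < timestamps.Count; i++)
        {
            double diff = Math.Abs((timestamps[i] - desired).TotalMilliseconds);
            if (diff < bestDiff)
            {
                bestDiff = diff;
                bestIndex = i;
            }
        }
        return bestIndex;
    }
}
```

If the timestamps happen to be sorted ascending, a binary search would cut this to O(log n).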
I have these strings to sort (string format: 121030-1833 --> YYMMDD-HHMM)
String Example:
121030-1833
120823-2034
120807-2014
120627-2316
120525-1136
111226-1844
I want to sort them from latest to oldest.
Which sort method would be the best to use here?
Updated problem statement
I have Dictionary<string, string> "dictLocation" values with me, and want to sort the dictionary according to its key (key: YYMMDD-HHMM, latest to oldest):
{[110901-1226, sandyslocation.110901-1226]}
{[120823-2034, andyslocatio.120823-2034]}
{[110915-1720, mylocation.110915-1720]}
{[121030-1833, mylocation.121030-1833]}
I am trying var latestToOldest = dictLocation.OrderBy(key => key.Key)
It's not giving me the proper result. Is there anything I am missing?
As it happens, just sorting them in a descending way will work:
var latestToOldest = original.OrderByDescending(x => x);
That's assuming you want to assume all values are within the same century. (If you can possibly change the format, I'd suggest using a 4-digit year for clarity.)
However, I would recommend parsing the values into DateTime values as early as possible. Whenever you can, keep your data in its most "natural" form. Fundamentally these are dates and times, not strings - so convert them into that format early, and keep them in that format as long as you possibly can.
var sortedDateStrings = dateStrings.OrderBy(x => x).Reverse();
// or
var sortedDateStrings = dateStrings.OrderByDescending(x => x);
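Parsing into DateTime early, as recommended above, could look like this ("yyMMdd-HHmm" is the custom format matching 121030-1833; two-digit years are resolved via the calendar's century rule, 00-29 mapping to 20xx by default):

```csharp
using System;
using System.Globalization;
using System.Linq;

class SortDemo
{
    static void Main()
    {
        string[] raw = { "111226-1844", "121030-1833", "120823-2034" };

        // Parse each string once, then sort the DateTime values, newest first.
        var latestToOldest = raw
            .Select(s => DateTime.ParseExact(s, "yyMMdd-HHmm",
                                             CultureInfo.InvariantCulture))
            .OrderByDescending(d => d)
            .ToList();

        Console.WriteLine(latestToOldest.First()); // the 121030-1833 entry
    }
}
```

For the dictionary in the updated question, dictLocation.OrderByDescending(kvp => kvp.Key) gives the same latest-to-oldest order, again assuming all keys fall in the same century.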
I have the following line of code:
group p by new DateTime((p.DateTime.Value.Ticks / interval.Ticks) * interval.Ticks).TimeOfDay
This works great in LINQ-to-Objects but it will not work in LINQ-to-Entities. I have two questions:
Why does it work? I don't understand how dividing the time by an interval and then multiplying it by that same interval magically converts my 12:37 to 12:30. My basic experience with math says that (A/B)*B = A. (Cleared up by @spencer in the comments. Much thanks!)
How can I reformat this line to work in LINQ-to-Entities?
The DateTime methods you are using can't be translated by LINQ to Entities. Consider using the EntityFunctions class for supported time operations.
See here for more details.
For instance, say your interval were 60 seconds, I'm guessing you might do something like:
DateTime? start = new DateTime(2000, 1, 1); // start of first interval
...group p by
    EntityFunctions.AddSeconds(
        start,
        (EntityFunctions.DiffSeconds(p.DateTime, start) / 60) * 60
    )
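In both the LINQ-to-Objects original and the EntityFunctions version, the trick is integer division: Ticks and DiffSeconds are integral, so (a / b) * b truncates a down to the nearest multiple of b, which is exactly the start of its interval. A minimal illustration:

```csharp
using System;

class TruncationDemo
{
    static void Main()
    {
        // Integer division discards the remainder, so (a / b) * b
        // rounds a down to a multiple of b: 37 snaps to 30.
        int minutes = 37, interval = 30;
        Console.WriteLine((minutes / interval) * interval); // prints 30
    }
}
```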