I have an entity collection of Readings.
Each Reading is linked to an entity called Meter
(and each Meter holds multiple readings).
Each Reading holds a field for the meter id (int) and a field for the time.
Here is some simplified code to demonstrate it:
public class Reading
{
int Id;
int meterId;
DateTime time;
}
public class Meter
{
int id;
ICollection<Reading> readings;
}
Given a specific period and list of meterids,
what would be the most efficient way to get for each Meter
the first and last reading in that time period?
I am able to iterate through all meters and, for each meter, obtain the
first and last reading for the period,
but I was wondering if there is a more efficient way to achieve this.
And a bonus question: the same question, but with multiple periods of time to get data for,
instead of just one period.
I am not exactly sure how you want this data, but you could project it into an anonymous type:
var metersFirstAndLastReading = meters.Select(m => new
{
Meter = m,
FirstReading = m.readings.OrderBy(r => r.time).First(),
LastReading = m.readings.OrderBy(r => r.time).Last()
});
You can then read your result list like this (this example is just meant as an illustration):
foreach (var currentReading in metersFirstAndLastReading)
{
    string printReadings = String.Format("Meter id {0}, First = {1}, Last = {2}",
        currentReading.Meter.id,
        currentReading.FirstReading.time,
        currentReading.LastReading.time);
    // Do something...
}
Another option would be to create properties in Meter which dynamically return the first and last readings:
public class Meter
{
public int id;
public List<Reading> readings;
public Reading FirstReading
{
get
{
return readings.OrderBy(r => r.time).First();
}
}
public Reading LastReading
{
get
{
return readings.OrderBy(r => r.time).Last();
}
}
}
EDIT: I misunderstood the question a little.
Here is the implementation to determine the first and last readings for a meter within a date range (assuming meterIdList is an ICollection<int> of IDs, and begin and end delimit the specified date range):
var metersFirstAndLastReading = meters
.Where(m => meterIdList.Contains(m.id))
.Select(m => new
{
Meter = m,
FirstReading = m.readings
.Where(r => r.time >= begin && r.time <= end)
.OrderBy(r => r.time)
.FirstOrDefault(),
LastReading = m.readings
.Where(r => r.time >= begin && r.time <= end)
.OrderByDescending(r => r.time)
.FirstOrDefault()
});
You won't be able to use properties now (as you need to supply parameters), so methods will work just fine as an alternative:
public class Meter
{
public int id;
public List<Reading> readings;
public Reading GetFirstReading(DateTime begin, DateTime end)
{
    if (!HasReadings(begin, end))
    {
        throw new InvalidOperationException("No readings available during this period");
    }
    return readings.Where(r => r.time >= begin && r.time <= end)
        .OrderBy(r => r.time)
        .First();
}
public Reading GetLastReading(DateTime begin, DateTime end)
{
    if (!HasReadings(begin, end))
    {
        throw new InvalidOperationException("No readings available during this period");
    }
    return readings.Where(r => r.time >= begin && r.time <= end)
        .OrderByDescending(r => r.time)
        .First();
}
public bool HasReadings(DateTime begin, DateTime end)
{
return readings.Any(r => r.time >= begin && r.time <= end);
}
}
I have a very similar data model where this code is used to get the oldest readings; I just changed it to also include the newest.
I use query syntax to do something like this:
var query = from reading in db.Readings
group reading by reading.meterId
into readingsPerMeter
let oldestReadingPerMeter = readingsPerMeter.Min(r => r.time)
let newestReadingPerMeter = readingsPerMeter.Max(r => r.time)
from reading in readingsPerMeter
where reading.time == oldestReadingPerMeter || reading.time == newestReadingPerMeter
select reading; //returns IQueryable<Reading>
That would result in only the newest and oldest reading for each meter.
The reason I think this is efficient is that it's one lookup to the DB to get the readings for all meters, instead of a separate lookup per meter. We have ~40,000 meters with ~30 million readings; I just tested the lookup on our data and it took about 10s.
The SQL performed is a cross join between two sub-selects, one for the min date and one for the max date.
UPDATE:
Since this is an IQueryable, you should be able to apply a period filter afterwards, like this:
query.Where(r => r.time > someTime1 && r.time < someTime2)
Or put it into the original query; I just like it separated like this. The query isn't executed yet, since we haven't performed an action that fetches the data.
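To illustrate that deferred execution with a small in-memory sketch (the sample data is made up for the example; against a database the same principle applies through IQueryable):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Reading
{
    public int meterId;
    public DateTime time;
}

class Demo
{
    static void Main()
    {
        var readings = new List<Reading>
        {
            new Reading { meterId = 1, time = new DateTime(2012, 4, 23, 10, 0, 0) },
            new Reading { meterId = 1, time = new DateTime(2012, 4, 23, 11, 0, 0) },
            new Reading { meterId = 1, time = new DateTime(2012, 4, 24, 9, 0, 0) },
        };

        // Building the query does not enumerate the source yet.
        var query = readings.Where(r => r.time < new DateTime(2012, 4, 24));

        // Appending another filter still executes nothing.
        query = query.Where(r => r.time.Hour >= 11);

        // Only a terminal operation (ToList, Count, foreach, ...) runs the query.
        Console.WriteLine(query.Count()); // prints 1
    }
}
```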
Create a new class as the return type called Result, which looks like this
public class Result
{
public int MeterId;
public Reading Start;
public Reading Last;
}
I emulated your situation by making a list of Meters and populating some data; your query should be pretty much the same though:
var reads = Meters.Where(x => x.readings != null)
.Select(x => new Result
{
MeterId = x.id,
Start = x.readings.OrderBy(r => r.time).FirstOrDefault(),
Last = x.readings.OrderByDescending(r => r.time).FirstOrDefault()
});
public IEnumerable<Reading> GetFirstAndLastInPeriod
(IEnumerable<Reading> readings, DateTime begin, DateTime end)
{
return
from reading in readings
let span = readings.Where(item => item.time >= begin && item.time <= end)
where reading.time == span.Max(item => item.time)
|| reading.time == span.Min(item => item.time)
select reading;
}
meters.Where(mt=>desiredMeters.Contains(mt)).Select(mt=>
new{
mt.Id,
First = mt.Readings.Where(<is in period>).OrderBy(rd=>rd.Time).FirstOrDefault(),
Last = mt.Readings.Where(<is in period>).OrderBy(rd=>rd.Time).LastOrDefault()
});
If you have lots of readings per meter, this will not perform well, and you should consider keeping each meter's readings in a SortedList keyed by time.
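A sketch of that idea (the SortedReadings helper and its names are hypothetical, and it assumes reading times are unique per meter, since SortedList keys must be unique): keeping the readings in a SortedList<DateTime, Reading> lets you locate the period boundaries by binary search instead of sorting on every call.

```csharp
using System;
using System.Collections.Generic;

public class Reading
{
    public int meterId;
    public DateTime time;
}

public static class SortedReadings
{
    // Returns the first and last reading within [begin, end], or (null, null) if none.
    public static (Reading first, Reading last) FirstAndLast(
        SortedList<DateTime, Reading> readings, DateTime begin, DateTime end)
    {
        IList<DateTime> keys = readings.Keys;           // kept sorted by SortedList
        int lo = LowerBound(keys, begin);               // first index with key >= begin
        int hi = LowerBound(keys, end.AddTicks(1)) - 1; // last index with key <= end
        if (lo > hi)
            return (null, null);
        return (readings.Values[lo], readings.Values[hi]);
    }

    // Binary search: first index whose key is >= value (keys.Count if none).
    private static int LowerBound(IList<DateTime> keys, DateTime value)
    {
        int lo = 0, hi = keys.Count;
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (keys[mid] < value) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }
}

public class Demo
{
    public static void Main()
    {
        var list = new SortedList<DateTime, Reading>();
        foreach (var h in new[] { 10, 11, 14 })
        {
            var r = new Reading { meterId = 1, time = new DateTime(2012, 4, 23, h, 0, 0) };
            list.Add(r.time, r);
        }
        var (first, last) = SortedReadings.FirstAndLast(
            list, new DateTime(2012, 4, 23, 10, 30, 0), new DateTime(2012, 4, 23, 15, 0, 0));
        Console.WriteLine(first.time.Hour); // 11
        Console.WriteLine(last.time.Hour);  // 14
    }
}
```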
My solution returns exactly what you want (a list containing, for each Meter, its first and last Reading within the given time period):
public IList<Reading[]> GetFirstAndLastReadings(List<Meter> meterList, DateTime start, DateTime end)
{
    IList<Reading[]> fAndlReadingsList = new List<Reading[]>();
    meterList.ForEach(meter =>
    {
        var readingList = meter.readings
            .Where(r => r.time >= start && r.time <= end)
            .OrderBy(r => r.time)
            .ToList();
        if (readingList.Count > 0)
        {
            fAndlReadingsList.Add(new[] { readingList.First(), readingList.Last() });
        }
    });
    return fAndlReadingsList;
}
I got some very nice leads; thanks to all the responders.
Here is the solution that worked for me:
/// <summary>
/// Fills the result data with meter readings matching the filters.
/// Only takes the first and last reading for each meter in the period.
/// </summary>
/// <param name="intervals">time intervals</param>
/// <param name="meterIds">list of meter ids.</param>
/// <param name="result">for each meter id, a list of relevant meter readings</param>
private void AddFirstLastReadings(List<RangeFilter<DateTime>> intervals, List<int> meterIds, Dictionary<int, List<MeterReading>> result)
{
foreach (RangeFilter<DateTime> interval in intervals)
{
var metersFirstAndLastReading = m_context.Meter.Where(m => meterIds.Contains(m.Id)).Select(m => new
{
MeterId = m.Id,
FirstReading = m.MeterReading
.Where(r => r.TimeStampLocal >= interval.FromVal && r.TimeStampLocal < interval.ToVal)
.OrderBy(r => r.TimeStampLocal)
.FirstOrDefault(),
LastReading = m.MeterReading
.Where(r => r.TimeStampLocal >= interval.FromVal && r.TimeStampLocal < interval.ToVal)
.OrderByDescending(r => r.TimeStampLocal)
.FirstOrDefault()
});
foreach (var firstLast in metersFirstAndLastReading)
{
MeterReading firstReading = firstLast.FirstReading;
MeterReading lastReading = firstLast.LastReading;
if (firstReading != null)
{
result[firstLast.MeterId].Add(firstReading);
}
if (lastReading != null && lastReading != firstReading)
{
result[firstLast.MeterId].Add(lastReading);
}
}
}
}
}
I have this while loop to get the next working day, excluding holidays and Sundays.
But it calculates by adding 1 day, and I want that number of days to be given by the user. I get that input from the TextBox below (TboxIzin).
How can I execute that while loop to do the calculation the given number of times?
int i = 1;
int sayi;
IzinIslem i1 = new IzinIslem();
int.TryParse(i1.TboxIzin.Text, out sayi);
public static DateTime GetNextWeekDay(DateTime date,
IList<Holiday> holidays, IList<DayOfWeek> weekendDays)
{
int i = 1;
int sayi;
IzinIslem i1 = new IzinIslem();
int.TryParse(i1.TboxIzin.Text, out sayi);
// always start with tomorrow, and truncate time to be safe
date = date.Date.AddDays(1);
// calculate holidays for both this year and next year
var holidayDates = holidays.Select(x => x.GetDate(date.Year))
.Union(holidays.Select(x => x.GetDate(date.Year + 1)))
.Where(x => x != null)
.Select(x => x.Value)
.OrderBy(x => x).ToArray();
// increment to get working day
while (true)
{
if (weekendDays.Contains(date.DayOfWeek) ||
holidayDates.Contains(date))
date = date.AddDays(1);
else
return date;
}
}
I get a "not all code paths return a value"
error when I try nesting while loops.
while is a conditional loop. Here you put a non-condition (true) in the clause and immediately follow up with the real condition. Put the condition in the while clause instead:
while(weekendDays.Contains(date.DayOfWeek) || holidayDates.Contains(date)) {
date = date.AddDays(1);
}
return date;
The actual reason you're getting the error is that the compiler cannot prove that your if condition will ever resolve to false. If it never does, your function never returns.
With the modified while loop that may still happen, but then it results in an infinite loop, and the compiler is fine if you shoot yourself in the foot that way.
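A minimal pair showing the difference (the second method is commented out because it intentionally does not compile):

```csharp
using System;

class Demo
{
    // Compiles: the only way out of 'while (true)' is the return statement,
    // so the compiler knows no path can fall off the end of the method.
    public static int AlwaysReturns()
    {
        while (true)
        {
            return 1;
        }
    }

    // error CS0161: not all code paths return a value.
    // The compiler cannot prove the loop body ever runs, so the path where
    // the condition is false reaches the end of the method with no return.
    //public static int MayFallThrough(int n)
    //{
    //    while (n > 0)
    //    {
    //        return 1;
    //    }
    //}

    static void Main()
    {
        Console.WriteLine(AlwaysReturns()); // prints 1
    }
}
```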
You can change your else clause to break out of the loop, and then return after the loop.
while (true)
{
if (weekendDays.Contains(date.DayOfWeek) ||
holidayDates.Contains(date))
date = date.AddDays(1);
else
break;
}
return date;
Let's get rid of all UI code in GetNextWeekDay (like int.TryParse(i1.TboxIzin.Text, out sayi);):
public static DateTime GetNextWeekDay(DateTime date,
IEnumerable<Holiday> holidays,
IEnumerable<DayOfWeek> weekendDays) {
// public method arguments validation
if (null == holidays)
throw new ArgumentNullException(nameof(holidays));
else if (null == weekendDays)
throw new ArgumentNullException(nameof(weekendDays));
HashSet<DayOfWeek> wends = new HashSet<DayOfWeek>(weekendDays);
// Current Year and Next years - .AddYear(1)
HashSet<DateTime> hds = new HashSet<DateTime>(holidays
    .Select(item => item.Date)
    .Concat(holidays.Select(item => item.Date.AddYears(1))));
for (var day = date.Date.AddDays(1); ; day = day.AddDays(1))
    if (!wends.Contains(day.DayOfWeek) && !hds.Contains(day))
        return day;
}
Or if you prefer Linq, the loop can be rewritten as
return Enumerable
.Range(1, 1000)
.Select(day => date.Date.AddDays(day))
.TakeWhile(item => item.Year - date.Year <= 1)
.First(item => !wends.Contains(item.DayOfWeek) && !hds.Contains(item));
I am trying to search through a group of days, determine if a worker has worked each day, and get a total of days worked. The below works, but is terribly inefficient, since even after it finds that a guy worked a day it keeps looking through the rest of those days. If I could somehow advance the outer ForEach loop when the inner condition (day worked) is satisfied, it would surely be faster.
totalDaysWorked is what I'm after below:
public class StationSupportRequest
{
public string RequestNum;
public string Status;
public string Prefix;
public string PlantLoc;
public DateTime Date;
public string Departmnt;
public DateTime Time;
public string StationID;
public string Fixture;
public string Supervisor;
public string PartNo;
public string SerialNum;
public string FailedStep;
public string Reason;
public string OtherReason;
public string Details;
public string Urgency;
public DateTime Date_1;
public DateTime Time_1;
public DateTime Date_2;
public DateTime Time_2;
public string ProblemFound;
public string SolutionCode;
public string Solution;
public double ServiceTechHrs;
public double ServiceEngHrs;
public string DocHistory;
public DateTime CloseDate;
public DateTime IniDate;
public DateTime IniTime;
public string MOT;
public string Initiator;
public string Notification;
public string ServiceTech;
public string ServiceEng;
public string SolutionCode_1;
public string Solution_1;
public string UpdatedBy;
public List<string> UpdatedByList;
public string Revisions;
public List<DateTime> RevisionsDateTime;
public List<WorkedDatapoint> WorkedDataPointsList;
}
public class WorkedDatapoint
{
public string AssignerName { get; set; }
public string AssigneeName { get; set; }
public DateTime Date { get; set; }
public bool AssignedToOther { get; set; }
}
var DateRange = SSRList.Where(y => y.IniDate >= IniDate && y.CloseDate < EndDate);
//DateRange = DateRange.Where(dr => dr.Fixture != null && dr.Fixture.Length == 6); //To get valid fixtures if pivoting on "Fixture"
var groupedData = DateRange.GroupBy(x => new { DS = x.ServiceTech }).Select(x =>
{
double totalSsrsWorkedOn = x.Select(y => y.RequestNum).Count();
IEnumerable<TimeSpan> hoursWorked = x.Select(y => y.CloseDate - y.IniDate.AddDays(GetWeekendDaysToSubtract(y.IniDate, y.CloseDate)));
var averageReactionTimeMinutes = x.Where(d => d.IniDate != null && d.Revisions != null)
.Average(d => ((DateTime.Parse(d.Revisions.Split(',')[0]) - (DateTime)d.IniDate)).Minutes);
double[] listOfMinutesOpenTime = x.Where(d => d.IniDate != null && d.Revisions != null)
.Select(d => Convert.ToDouble(((DateTime.Parse(d.Revisions.Split(',')[0]) - (DateTime)d.IniDate)).Minutes))
.ToArray();
double[] listOfDaysOpenTime = x.Where(d => d.IniDate != null && d.CloseDate != null)
.Select(d => ((DateTime)d.CloseDate - (DateTime)d.IniDate.AddDays(GetWeekendDaysToSubtract(d.IniDate, d.CloseDate))).TotalDays)
.ToArray();
string testtech = x.Select(y => y.ServiceTech).FirstOrDefault();
List<DateTime> totalDaysInDateRange = Enumerable.Range(0, 1 + EndDate.Subtract(IniDate).Days)
.Select(offset => IniDate.AddDays(offset)).ToList();
double totalHoursLogged = x.Sum(d => d.ServiceEngHrs) + x.Sum(d => d.ServiceTechHrs);
int assignedToOthersCount = x.SelectMany(y => y.WorkedDataPointsList)
.Where(z => z.AssignerName.Contains(testtech) && z.AssignedToOther == true)
.Count();
int brokenWiresFixed = x.Where(d => d.SolutionCode != null)
.Where(d => d.SolutionCode.Contains("A01 -") ||
d.SolutionCode.Contains("F01 -") ||
d.SolutionCode.Contains("S01 -")).Count();
int npfResults = x.Where(d => d.ProblemFound != null).Where(d => d.ProblemFound.Contains("NPF")).Count();
int totalDaysWorked = 0;
List<DateTime> workingDatesList = new List<DateTime>();
totalDaysInDateRange.ForEach((day) =>
{
x.Select(y => y.WorkedDataPointsList).ForEach((WorkedDataPoint) =>
{
IEnumerable<WorkedDatapoint> dateList = WorkedDataPoint
.Where(y => testtech == y.AssignerName)
.DistinctBy(z => z.Date.Date);
foreach ( WorkedDatapoint date in dateList)
{
if (x.Any(b => b.Date.Date.Date == date.Date.Date.Date))
{
workingDatesList.Add(date.Date.Date.Date);
break;
}
}
});
});
workingDatesList.Dump("WorkingDatesList");
totalDaysWorked = workingDatesList.DistinctBy(b => b.Date).Count();
/*int totalDaysWorked = 0;
totalDaysInDateRange.ForEach((day) =>
{
if (AssignersList.Where(d => testtech.Contains(d.AssignerName))
.DistinctBy(d => d.Date.Date)
.Any(d => d.Date.Date == day.Date))
{
totalDaysWorked++;
}
}); TODO: Delete this once new is working*/
return new
{
//SSRs = x,
//Station = x.Select(d => d.StationID).FirstOrDefault(),
//Fixture = x.Select(d => d.Fixture).FirstOrDefault(),
//ProductTested = x.Select(d => d.Details).FirstOrDefault(),
TestTech = testtech,
//TestEng = x.Select(d => d.ServiceEng).Distinct().Where(d => d.Length > 0),
TotalSSRsWorkedOn = Math.Round(totalSsrsWorkedOn, 4),
TotalHoursLogged = Math.Round(totalHoursLogged, 4),
AssignedToOthersCount = assignedToOthersCount,
AssignedToOthersPercentage = 100 * Math.Round(assignedToOthersCount / (assignedToOthersCount + totalSsrsWorkedOn), 4),
//AverageReactionTimeMinutes = averageReactionTimeMinutes,
AverageTimeToCompleteHours = x.Where(y => y.CloseDate != null && y.Time_1 != null && y.Time_1 != DateTime.MinValue).Select(z => (z.CloseDate - z.Time_1).TotalHours).Average(),
//Close = x.Where(y => y.CloseDate != null && y.Time_1 != null).Select(z => (z.CloseDate)),
//Time = x.Where(y => y.CloseDate != null && y.Time_1 != null).Select(z => (z.Time_1)),
MedianDaysRequestOpen = Math.Round(GetMedian(listOfDaysOpenTime), 3),
DaysWorkedPerDateRange = totalDaysWorked,
AveSSRsClosedPerWorkedDay = Math.Round(totalSsrsWorkedOn / totalDaysWorked, 3),
AveHoursLoggedPerRequest = Math.Round((x.Select(y => y.ServiceTechHrs + y.ServiceEngHrs).Sum()) / totalSsrsWorkedOn, 3),
BrokenWiresFixed = brokenWiresFixed,
PercentageBrokenWires = 100 * Math.Round(brokenWiresFixed / totalSsrsWorkedOn, 4),
NPFResults = npfResults,
PercentageNPF = 100 * Math.Round(npfResults / totalSsrsWorkedOn, 4),
};
}).OrderByDescending(x => x.TotalSSRsWorkedOn)
.Dump("Summary");
return;
Sample output, with the duplicate dates evaluated (workingDatesList):
8/1/2017 12:00:00 AM
8/1/2017 12:00:00 AM
8/1/2017 12:00:00 AM
8/2/2017 12:00:00 AM
A couple of comments on the code you posted:
Since you don't ever use the day variable from the outermost loop, simply remove that loop altogether.
Why are you testing whether x.Any(...) within a loop that iterates over y? This seems fundamentally flawed.
I can't discern from your problem statement what your data structures are, nor what it is that you are actually trying to do. Your problem statement is currently worded as:
I am trying to search through a group of days, and determine if a worker has worked that day, and get a total of days worked.
It appears you are taking some input called testtech (String) and totalDaysInDateRange (List<DateTime>), then want to find all entries in some data structure x (I can't infer what this is) where String.equalsIgnoreCase(y.AssignerName, testtech) && totalDaysInDateRange.contains(y.Date). Is this interpretation correct?
If so, simply iterate over the entries in whatever your x data structure is, and run the above logic. If this doesn't solve your problem, then please give us more information on the layout of the data structure x and how information about each worker is actually associated with the other data about that worker.
BEGIN EDIT
OK, now that you have provided more information, I think you want to replace the totalDaysInDateRange.ForEach statement with the following:
x.SelectMany(y => y.WorkedDataPointsList).ForEach((wdp) =>
{
if (testtech == wdp.AssignerName && IniDate.Date <= wdp.Date.Date
&& wdp.Date.Date <= EndDate.Date)
{
workingDatesList.Add(wdp.Date.Date);
}
});
After changing your implementation, simply delete totalDaysInDateRange. I also recommend changing the type of workingDatesList to HashSet<DateTime>, since you don't seem to care about duplicate dates. Be sure to convert workingDatesList to a list and sort it once the loop is complete if you want the dates printed in chronological order.
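A sketch of that HashSet<DateTime> suggestion, using made-up dates matching the sample output above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        var workingDates = new HashSet<DateTime>();

        // Add() silently ignores duplicates, so no DistinctBy pass is needed.
        workingDates.Add(new DateTime(2017, 8, 1));
        workingDates.Add(new DateTime(2017, 8, 1)); // duplicate, not stored
        workingDates.Add(new DateTime(2017, 8, 1)); // duplicate, not stored
        workingDates.Add(new DateTime(2017, 8, 2));

        int totalDaysWorked = workingDates.Count;
        Console.WriteLine(totalDaysWorked); // prints 2

        // Sort once at the end if the dates must print chronologically.
        foreach (var day in workingDates.OrderBy(d => d))
            Console.WriteLine(day.ToShortDateString());
    }
}
```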
I have a list of log entries in an Audit class:
public class Audit
{
public DateTime TimeStamp { get; set; }
public string User { get; set; }
public string AuditType { get; set; }
}
so a list might look like this;
20140206 11:29:20 Owen Open
20140206 11:29:21 Owen Close
20140206 11:31:20 Owen Open
20140206 11:32:20 Owen Close
20140206 11:42:20 Owen Open
20140206 11:50:00 Owen Acknowledge
This gives us gaps of 1 second, 1 minute, and 40 seconds. So the longest time it was open was the middle pair, for 1 minute; then it was acknowledged at 11:50. I'm looking for the date pair where it was open longest, in this case 1 min.
I know I can process the list sequentially and find the biggest gap using a TimeSpan, but I figure there is a neat LINQ way to do it, maybe with groups?
UPDATE: It's not pretty, but this is the logic in a really expanded walk-through:
var audits = notice.AuditEntries.Where(a => a.User == user);
DateTime? currentOpen = null;
DateTime? bestOpen = null;
DateTime? bestClose = null;
foreach (var audit in audits)
{
if (audit.AuditType == "Open")
{
if (currentOpen.HasValue) continue;
currentOpen = audit.TimeStamp;
}
if (audit.AuditType == "Close" || audit.AuditType == "Acknowledge")
{
if (currentOpen.HasValue)
{
DateTime? currentClose = audit.TimeStamp;
if (!bestOpen.HasValue)
{
bestOpen = currentOpen;
bestClose = currentClose;
}
else
{
if (bestClose.Value.Subtract(bestOpen.Value) > currentClose.Value.Subtract(currentOpen.Value))
{
bestOpen = currentOpen;
bestClose = currentClose;
}
}
currentOpen = null;
}
}
}
I think this will do the trick:
IEnumerable<Audit> audits = ...
var longestAuditsByUser = audits.OrderBy(a => a.TimeStamp)
// group by user, since presumably we don't want to match an open from one user with a close from another
.GroupBy(a => a.User)
.Select(userAudits =>
{
// first, align each audit entry with its index within the entries for the user
var indexedAudits = userAudits.Select((audit, index) => new { audit, index });
// create separate sequences for open and close/ack entries
var starts = indexedAudits.Where(t => t.audit.AuditType == "Open");
var ends = indexedAudits.Where(t => t.audit.AuditType == "Close" || t.audit.AuditType == "Acknowledge");
// find the "transactions" by joining starts to ends where start.index = end.index - 1
var pairings = starts.Join(ends, s => s.index, e => e.index - 1, (start, end) => new { start, end });
// find the longest such pairing with Max(). This will throw if no pairings were
// found. If that can happen, consider changing this Select() to SelectMany()
// and returning pairings.OrderByDescending(time).Take(1)
var longestPairingTime = pairings.Max(t => t.end.audit.TimeStamp - t.start.audit.TimeStamp);
return new { user = userAudits.Key, time = longestPairingTime };
});
// now that we've found the longest time for each user, we can easily find the longest
// overall time as well
var longestOverall = longestAuditsByUser.Max(t => t.time);
Not tested but should work:
var auditGaps = audits
.GroupBy(a => a.User)
.Select(g => new
{
User = g.Key,
MinOpen = g.Where(a => a.AuditType == "Open").Select(a=> a.TimeStamp).Min(),
MaxClosed = g.Where(a => a.AuditType == "Close").Select(a=> a.TimeStamp).Max(),
MaxAcknowledge = g.Where(a => a.AuditType == "Acknowledge").Select(a=> a.TimeStamp).Max()
})
.Select(x => new
{
x.User,
LargestOpenCloseGap = x.MaxClosed - x.MinOpen,
LargestOpenAcknowledgeGap = x.MaxAcknowledge - x.MinOpen
});
I have a bunch of these Tasks that are all based on LINQ queries. I am looking for good way to refactor them and make them easier to read and allow me to change the queries depending on language/region etc.
var mailTaskOne = CreateTask(() => myService.Mail.Where(p => p.ProjectName == "Delta"
&& (p.MailLang== (int)MailLanguage.EU || p.MailLang == (int)MailLanguage.RU)
&& (p.DateEntered >= startDate && p.DateEntered <= endDate)
&& p.MailPriority == (int)MailPriority.High).Count());
One of the ways I thought would be convenient would be to split the query up into something like this.
var results = myService.Mail.Where(x => x.ProjectName == "Delta");
results = results.Where(p => p.MailLang== (int)MailLanguage.EU);
results = results.Where(p => p.DateModified >= startDate && p.DateModified <= endDate);
This would allow me to do this without having to repeat the whole query for each region.
if (MailLanguage == "English")
results = results.Where(p => p.MailLang== (int)MailLanguage.EU);
else
results = results.Where(p => p.MailLang== (int)MailLanguage.RU);
Does anyone know a better solution for this? I end up having huge functions, as I need to do maybe 20 of these queries depending on requirements such as region, project name, etc.
Edit:
Due to some limitations I did not know of with the back-end (web service/api) I could unfortunately not use some of the awesome answers mentioned in this question.
For example, the following does not get translated properly. That is not because the answer is incorrect; it simply does not work with the API I am working against, possibly because it is poorly implemented.
public bool IsValid(Type x)
{
return (x.a == b) && (x.c ==d) && (x.d == e);
}
Anyway, for anyone looking for similar solutions, all of these are valid answers, but in the end I went with something similar to the solution snurre provided.
I would go with just splitting the query up onto different lines like you suggested; it means you can put a comment per line to describe what each filter does. You are still only making one trip to the database, so you aren't losing anything in performance, but you gain readability.
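A sketch of that composition over an in-memory IQueryable (the Mail shape and the language codes 0/1 are simplified stand-ins for the MailLanguage values in the question):

```csharp
using System;
using System.Linq;

class Mail
{
    public string ProjectName;
    public int MailLang;
}

class Demo
{
    static void Main()
    {
        var mails = new[]
        {
            new Mail { ProjectName = "Delta", MailLang = 0 },
            new Mail { ProjectName = "Delta", MailLang = 1 },
            new Mail { ProjectName = "Echo",  MailLang = 0 },
        }.AsQueryable();

        // Each Where only extends the expression tree; nothing executes yet.
        var query = mails.Where(p => p.ProjectName == "Delta");

        string language = "English";
        query = language == "English"
            ? query.Where(p => p.MailLang == 0)  // stand-in for MailLanguage.EU
            : query.Where(p => p.MailLang == 1); // stand-in for MailLanguage.RU

        // Against a real provider this is one round trip with the combined predicate.
        Console.WriteLine(query.Count()); // prints 1
    }
}
```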
Why not simply have a method for the purpose?
public static int Count(this IQueryable<Mail> mails,
    string projectName,
    MailLanguage mailLanguage,
    DateTime startDate,
    DateTime endDate) {
    return mails.Count(p =>
        p.ProjectName == projectName
        && p.MailLang == (int)mailLanguage
        && p.DateEntered >= startDate
        && p.DateEntered <= endDate
        && p.MailPriority == (int)MailPriority.High);
}
then you can simply use it like this
CreateTask(() => myService.Mail.Count("Delta",MailLanguage.EU,startDate,endDate));
You could turn the project name, date modified, mail language, and any other criteria into variables and give them the values you want based on any condition. Then your query would use the variables, not the literal values.
var projectName = "Delta";
var mailLanguage = (int)MailLanguage.RU;
var results = myService.Mail.Where(x => x.ProjectName == projectName
    && x.MailLang == mailLanguage);
That way you can put most of the complexity into assigning values to the variables, and the LINQ query itself becomes easier to read and maintain.
You could create a parameter class like:
public class MailParameters
{
public DateTime EndTime { get; private set; }
public IEnumerable<int> Languages { get; private set; }
public int Priority { get; private set; }
public string ProjectName { get; private set; }
public DateTime StartTime { get; private set; }
public MailParameters(string projectName, DateTime startTime, DateTime endTime, MailLang language, Priority priority)
    : this(projectName, startTime, endTime, new[] { language }, priority)
{
}
public MailParameters(string projectName, DateTime startTime, DateTime endTime, IEnumerable<MailLang> languages, Priority priority)
{
ProjectName = projectName;
StartTime = startTime;
EndTime = endTime;
Languages = languages.Cast<int>();
Priority = (int)priority;
}
}
Then add these extension methods:
public static int Count(this IQueryable<Mail> mails, MailParameters p)
{
return mails.Count(m =>
m.ProjectName == p.ProjectName &&
p.Languages.Contains(m.MailLang) &&
m.EnteredBetween(p.StartTime, p.EndTime) &&
m.Priority == p.Priority);
}
public static bool EnteredBetween(this Mail mail, DateTime startTime, DateTime endTime)
{
return mail.DateEntered >= startTime && mail.DateEntered <= endTime;
}
The usage would then be:
var mailParametersOne = new MailParameters("Delta", startDate, endDate, new[] { MailLang.EU, MailLang.RU }, MailPriority.High);
var mailTaskOne = CreateTask(() => myService.Mail.Count(mailParametersOne));
Consider moving the complex comparisons into a function. For example, instead of
Results.Where(x => (x.a == b) && (x.c == d) && (x.d == e))
consider
Results.Where(x => IsValid(x))
...
public bool IsValid(Type x)
{
return (x.a == b) && (x.c ==d) && (x.d == e);
}
The code becomes more readable and IsValid is easy to test using an automated testing framework.
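For LINQ to Objects, extracting the rule also makes it unit-testable in isolation; a sketch with hypothetical fields below. (Note the question's edit above: some query providers cannot translate such a method call, in which case the predicate must stay inline as an expression.)

```csharp
using System;

class Mail
{
    public string ProjectName;
    public int MailLang;
    public int MailPriority;
}

class Demo
{
    // The extracted rule: plain, synchronous, no database required.
    public static bool IsHighPriorityDeltaMail(Mail m)
    {
        return m.ProjectName == "Delta" && m.MailLang == 0 && m.MailPriority == 2;
    }

    static void Main()
    {
        // A test can construct inputs directly and check the rule.
        var match = new Mail { ProjectName = "Delta", MailLang = 0, MailPriority = 2 };
        var miss  = new Mail { ProjectName = "Echo",  MailLang = 0, MailPriority = 2 };

        Console.WriteLine(IsHighPriorityDeltaMail(match)); // True
        Console.WriteLine(IsHighPriorityDeltaMail(miss));  // False
    }
}
```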
My final solution is based on an article by ScottGu.
http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx
I build the LINQ query like this.
var linqStatements = new List<String>();
linqStatements.Add(parser.StringToLinqQuery<Project>("ProjectId", report.Project));
linqStatements.Add(parser.StringToLinqQuery<Region>("RegionId", report.Region));
linqStatements.Add(parser.StringToLinqQuery<Status>("Status", report.Status));
linqStatements.Add(parser.StringToLinqQuery<Priority>("Priority", report.Priority));
linqStatements.Add(parser.StringToLinqQuery<Category>("CategoryId", report.Category));
linqStatements.Add(AccountIdsToLinqQuery(report.PrimaryAssignment));
string baseQuery = String.Join(" AND ", linqStatements.Where(s => !String.IsNullOrWhiteSpace(s)));
var linqQuery = service.Mail.Where(baseQuery).Cast<Mail>();
The StringToLinqQuery looks something like this (simplified version).
public string StringToLinqQuery<TEnum>(string field, string value) where TEnum : struct
{
if (String.IsNullOrWhiteSpace(value))
return String.Empty;
var valueArray = value.Split('|');
var query = new StringBuilder();
for (int i = 0; i < valueArray.Count(); i++)
{
TEnum result;
if (Enum.TryParse<TEnum>(valueArray[i].ToLower(), true, out result))
{
if (i > 0)
query.Append(" OR ");
query.AppendFormat("{0} == {1}", field, Convert.ToInt32(result));
}
else
{
throw new DynoException("Item '" + valueArray[i] + "' not found. (" + typeof(TEnum) + ")",
    query.ToString());
}
}
// Wrap field == value with parentheses ()
query.Insert(0, "(");
query.Insert(query.Length, ")");
return query.ToString();
}
And the end result would look something like this.
service.Mail.Where("(ProjectId == 5) AND (RegionId == 6 OR RegionId == 7) AND (Status == 5) AND (Priority == 5)")
In my project I store the values in an XML file and then feed them into the above LINQ query. If a field is empty it will be ignored. It also supports multiple values using the | sign; e.g. EU|US would translate to (RegionId == 5 OR RegionId == 6).
Update 1, following Ayende's answer:
This is my first journey into RavenDB, and to experiment with it I wrote a small map/reduce index, but unfortunately the result is empty.
I have around 1.6 million documents loaded into RavenDB.
A document:
public class Tick
{
public DateTime Time;
public decimal Ask;
public decimal Bid;
public double AskVolume;
public double BidVolume;
}
and wanted to get Min and Max of Ask over a specific period of Time.
The collection by Time is defined as:
var ticks = session.Query<Tick>().Where(x => x.Time > new DateTime(2012, 4, 23) && x.Time < new DateTime(2012, 4, 24, 00, 0, 0)).ToList();
Which gives me 90,280 documents, so far so good.
But then the map/reduce:
Map = rows => from row in rows
select new
{
Max = row.Bid,
Min = row.Bid,
Time = row.Time,
Count = 1
};
Reduce = results => from result in results
group result by new{ result.MaxBid, result.Count} into g
select new
{
Max = g.Key.MaxBid,
Min = g.Min(x => x.MaxBid),
Time = g.Key.Time,
Count = g.Sum(x => x.Count)
};
...
private class TickAggregationResult
{
public decimal MaxBid { get; set; }
public decimal MinBid { get; set; }
public int Count { get; set; }
}
I then create the index and try to Query it:
Raven.Client.Indexes.IndexCreation.CreateIndexes(typeof(TickAggregation).Assembly, documentStore);
var session = documentStore.OpenSession();
var g1 = session.Query<TickAggregationResult>(typeof(TickAggregation).Name);
var group = session.Query<Tick, TickAggregation>()
.Where(x => x.Time > new DateTime(2012, 4, 23) &&
x.Time < new DateTime(2012, 4, 24, 00, 0, 0)
)
.Customize(x => x.WaitForNonStaleResults())
.AsProjection<TickAggregationResult>();
But the group is just empty :(
As you can see I've tried two different queries; I'm not sure about the difference, can someone explain?
Now I get an error:
The groups are still empty :(
Let me explain what I'm trying to accomplish in pure SQL:
select min(Ask), count(*) as TickCount from Ticks
where Time between '2012-04-23' and '2012-04-24'
Unfortunately, Map/Reduce doesn't work that way. Well, at least the Reduce part of it doesn't. In order to reduce your set, you would have to predefine specific time ranges to group by, for example - daily, weekly, monthly, etc. You could then get min/max/count per day if you reduced daily.
There is a way to get what you want, but it has performance considerations. Basically, you don't reduce at all; instead, you index by time and then do the aggregation when transforming results. This is similar to running your first query to filter and then aggregating in your client code. The only benefit is that the aggregation is done server-side, so you don't have to transmit all of that data to the client.
The performance concern here is how big a time range you are filtering, or more precisely, how many items fall inside your filter range. If it's relatively small, you can use this approach. If it's too large, you will be waiting while the server goes through the result set.
Here is a sample program that illustrates this technique:
using System;
using System.Linq;
using Raven.Client.Document;
using Raven.Client.Indexes;
using Raven.Client.Linq;

namespace ConsoleApplication1
{
    public class Tick
    {
        public string Id { get; set; }
        public DateTime Time { get; set; }
        public decimal Bid { get; set; }
    }

    /// <summary>
    /// This index is a true map/reduce, but its totals are for all time.
    /// You can't filter it by time range.
    /// </summary>
    class Ticks_Aggregate : AbstractIndexCreationTask<Tick, Ticks_Aggregate.Result>
    {
        public class Result
        {
            public decimal Min { get; set; }
            public decimal Max { get; set; }
            public int Count { get; set; }
        }

        public Ticks_Aggregate()
        {
            Map = ticks => from tick in ticks
                           select new
                           {
                               Min = tick.Bid,
                               Max = tick.Bid,
                               Count = 1
                           };

            Reduce = results => from result in results
                                group result by 0 into g
                                select new
                                {
                                    Min = g.Min(x => x.Min),
                                    Max = g.Max(x => x.Max),
                                    Count = g.Sum(x => x.Count)
                                };
        }
    }

    /// <summary>
    /// This index can be filtered by time range, but it does not reduce anything,
    /// so it will not be performant if there are many items inside the filter.
    /// </summary>
    class Ticks_ByTime : AbstractIndexCreationTask<Tick>
    {
        public class Result
        {
            public decimal Min { get; set; }
            public decimal Max { get; set; }
            public int Count { get; set; }
        }

        public Ticks_ByTime()
        {
            Map = ticks => from tick in ticks
                           select new { tick.Time };

            TransformResults = (database, ticks) =>
                from tick in ticks
                group tick by 0 into g
                select new
                {
                    Min = g.Min(x => x.Bid),
                    Max = g.Max(x => x.Bid),
                    Count = g.Count()
                };
        }
    }

    class Program
    {
        private static void Main()
        {
            var documentStore = new DocumentStore { Url = "http://localhost:8080" };
            documentStore.Initialize();
            IndexCreation.CreateIndexes(typeof(Program).Assembly, documentStore);

            var today = DateTime.Today;
            var rnd = new Random();

            using (var session = documentStore.OpenSession())
            {
                // Generate 100 random ticks
                for (var i = 0; i < 100; i++)
                {
                    var tick = new Tick { Time = today.AddMinutes(i), Bid = rnd.Next(100, 1000) / 100m };
                    session.Store(tick);
                }
                session.SaveChanges();
            }

            using (var session = documentStore.OpenSession())
            {
                // Query items with a filter. This will create a dynamic index.
                var fromTime = today.AddMinutes(20);
                var toTime = today.AddMinutes(80);
                var ticks = session.Query<Tick>()
                    .Where(x => x.Time >= fromTime && x.Time <= toTime)
                    .OrderBy(x => x.Time);

                // Output the results of the above query
                foreach (var tick in ticks)
                    Console.WriteLine("{0} {1}", tick.Time, tick.Bid);

                // Get the aggregates for all time
                var total = session.Query<Tick, Ticks_Aggregate>()
                    .As<Ticks_Aggregate.Result>()
                    .Single();
                Console.WriteLine();
                Console.WriteLine("Totals");
                Console.WriteLine("Min: {0}", total.Min);
                Console.WriteLine("Max: {0}", total.Max);
                Console.WriteLine("Count: {0}", total.Count);

                // Get the aggregates with a filter
                var filtered = session.Query<Tick, Ticks_ByTime>()
                    .Where(x => x.Time >= fromTime && x.Time <= toTime)
                    .As<Ticks_ByTime.Result>()
                    .Take(1024) // max you can take at once
                    .ToList()   // required!
                    .Single();
                Console.WriteLine();
                Console.WriteLine("Filtered");
                Console.WriteLine("Min: {0}", filtered.Min);
                Console.WriteLine("Max: {0}", filtered.Max);
                Console.WriteLine("Count: {0}", filtered.Count);
            }

            Console.ReadLine();
        }
    }
}
I can envision a solution to the problem of aggregating over a time filter with a potentially large scope. The reduce would have to break things down into successively smaller units of time at different levels, for example by day, then by hour, then by minute. The code for this is a bit complex, but I am working on it for my own purposes. When complete, I will post it over in the knowledge base at www.ravendb.net.
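To sketch the shape of that idea (everything here is hypothetical: `Ticks_ByHour` is an assumed map/reduce index, analogous to `Ticks_Aggregate` above but reducing min/max/count into per-hour buckets keyed by a `DateTime Hour` field):

```csharp
// Hypothetical query against an assumed per-hour map/reduce index.
// 'session' is an open IDocumentSession, 'today' as in the sample above.
var fromTime = today;
var toTime = today.AddHours(6);

var buckets = session.Query<Ticks_ByHour.Result, Ticks_ByHour>()
    .Where(x => x.Hour >= fromTime && x.Hour < toTime)
    .Take(1024)
    .ToList();

// Whole-hour buckets combine cheaply on the client, without the
// server ever walking the individual ticks inside the filter:
var min = buckets.Min(x => x.Min);
var max = buckets.Max(x => x.Max);
var count = buckets.Sum(x => x.Count);
```

The multi-level version would pick the largest bucket size that fits the requested range (days where possible, hours or minutes at the ragged edges) so the number of buckets stays small regardless of how many raw ticks the range covers.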
UPDATE
I was playing with this a bit more and noticed two things about that last query:
1. You MUST call ToList() before calling Single() in order to get the full result set.
2. Even though this runs on the server, the maximum number of items in the result range is 1024, and you have to specify Take(1024) or you get the default maximum of 128. Since this runs on the server, I didn't expect that, but I guess it's because you don't normally do aggregations in the TransformResults section.
I've updated the code for this. However, unless you can guarantee that the range is small enough for this to work, I would wait for the better full map/reduce that I spoke of. I'm working on it. :)