Map/reduce in RavenDB, update 1 - C#

Update 1, following Ayende's answer
This is my first journey into RavenDB. To experiment with it I wrote a small map/reduce index, but unfortunately the result is empty.
I have around 1.6 million documents loaded into RavenDB.
A document:
public class Tick
{
    public DateTime Time;
    public decimal Ask;
    public decimal Bid;
    public double AskVolume;
    public double BidVolume;
}
I want to get the Min and Max of Ask over a specific period of Time.
The query filtering by Time is defined as:
var ticks = session.Query<Tick>().Where(x => x.Time > new DateTime(2012, 4, 23) && x.Time < new DateTime(2012, 4, 24, 00, 0, 0)).ToList();
Which gives me 90280 documents, so far so good.
But then the map/reduce:
Map = rows => from row in rows
              select new
              {
                  Max = row.Bid,
                  Min = row.Bid,
                  Time = row.Time,
                  Count = 1
              };

Reduce = results => from result in results
                    group result by new { result.MaxBid, result.Count } into g
                    select new
                    {
                        Max = g.Key.MaxBid,
                        Min = g.Min(x => x.MaxBid),
                        Time = g.Key.Time,
                        Count = g.Sum(x => x.Count)
                    };
...
private class TickAggregationResult
{
    public decimal MaxBid { get; set; }
    public decimal MinBid { get; set; }
    public int Count { get; set; }
}
I then create the index and try to Query it:
Raven.Client.Indexes.IndexCreation.CreateIndexes(typeof(TickAggregation).Assembly, documentStore);
var session = documentStore.OpenSession();
var g1 = session.Query<TickAggregationResult>(typeof(TickAggregation).Name);
var group = session.Query<Tick, TickAggregation>()
    .Where(x => x.Time > new DateTime(2012, 4, 23) &&
                x.Time < new DateTime(2012, 4, 24, 00, 0, 0))
    .Customize(x => x.WaitForNonStaleResults())
    .AsProjection<TickAggregationResult>();
But the group is just empty :(
As you can see, I've tried two different queries; I'm not sure about the difference. Can someone explain?
Now I get an error:
The group is still empty :(
Let me explain what I'm trying to accomplish in pure sql:
select min(Ask), count(*) as TickCount from Ticks
where Time between '2012-04-23' and '2012-04-24'

Unfortunately, Map/Reduce doesn't work that way. Well, at least the Reduce part of it doesn't. In order to reduce your set, you would have to predefine specific time ranges to group by, for example - daily, weekly, monthly, etc. You could then get min/max/count per day if you reduced daily.
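For illustration, a daily reduce might look roughly like this (an untested sketch that assumes the Tick class from the question and the same RavenDB client API used in the sample program below; Ticks_ByDay and its Result class are just made-up names):
// Rough sketch of a per-day map/reduce index: each document maps into a one-day
// bucket, and the reduce collapses each bucket to its min/max/count.
class Ticks_ByDay : AbstractIndexCreationTask<Tick, Ticks_ByDay.Result>
{
    public class Result
    {
        public DateTime Day { get; set; }
        public decimal Min { get; set; }
        public decimal Max { get; set; }
        public int Count { get; set; }
    }

    public Ticks_ByDay()
    {
        Map = ticks => from tick in ticks
                       select new
                       {
                           Day = tick.Time.Date,
                           Min = tick.Bid,
                           Max = tick.Bid,
                           Count = 1
                       };

        Reduce = results => from result in results
                            group result by result.Day into g
                            select new
                            {
                                Day = g.Key,
                                Min = g.Min(x => x.Min),
                                Max = g.Max(x => x.Max),
                                Count = g.Sum(x => x.Count)
                            };
    }
}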
There is a way to get what you want, but it has some performance considerations. Basically, you don't reduce at all; instead you index by time and then do the aggregation when transforming the results. This is similar to running your first query to filter and then aggregating in your client code. The only benefit is that the aggregation is done server-side, so you don't have to transmit all of that data to the client.
The performance concern here is how big a time range you are filtering, or more precisely, how many items there will be inside your filter range. If it's relatively small, you can use this approach. If it's too large, you will be waiting while the server goes through the result set.
Here is a sample program that illustrates this technique:
using System;
using System.Linq;
using Raven.Client.Document;
using Raven.Client.Indexes;
using Raven.Client.Linq;
namespace ConsoleApplication1
{
public class Tick
{
public string Id { get; set; }
public DateTime Time { get; set; }
public decimal Bid { get; set; }
}
/// <summary>
/// This index is a true map/reduce, but its totals are for all time.
/// You can't filter it by time range.
/// </summary>
class Ticks_Aggregate : AbstractIndexCreationTask<Tick, Ticks_Aggregate.Result>
{
public class Result
{
public decimal Min { get; set; }
public decimal Max { get; set; }
public int Count { get; set; }
}
public Ticks_Aggregate()
{
Map = ticks => from tick in ticks
select new
{
Min = tick.Bid,
Max = tick.Bid,
Count = 1
};
Reduce = results => from result in results
group result by 0
into g
select new
{
Min = g.Min(x => x.Min),
Max = g.Max(x => x.Max),
Count = g.Sum(x => x.Count)
};
}
}
/// <summary>
/// This index can be filtered by time range, but it does not reduce anything
/// so it will not be performant if there are many items inside the filter.
/// </summary>
class Ticks_ByTime : AbstractIndexCreationTask<Tick>
{
public class Result
{
public decimal Min { get; set; }
public decimal Max { get; set; }
public int Count { get; set; }
}
public Ticks_ByTime()
{
Map = ticks => from tick in ticks
select new {tick.Time};
TransformResults = (database, ticks) =>
from tick in ticks
group tick by 0
into g
select new
{
Min = g.Min(x => x.Bid),
Max = g.Max(x => x.Bid),
Count = g.Count()
};
}
}
class Program
{
private static void Main()
{
var documentStore = new DocumentStore { Url = "http://localhost:8080" };
documentStore.Initialize();
IndexCreation.CreateIndexes(typeof(Program).Assembly, documentStore);
var today = DateTime.Today;
var rnd = new Random();
using (var session = documentStore.OpenSession())
{
// Generate 100 random ticks
for (var i = 0; i < 100; i++)
{
var tick = new Tick { Time = today.AddMinutes(i), Bid = rnd.Next(100, 1000) / 100m };
session.Store(tick);
}
session.SaveChanges();
}
using (var session = documentStore.OpenSession())
{
// Query items with a filter. This will create a dynamic index.
var fromTime = today.AddMinutes(20);
var toTime = today.AddMinutes(80);
var ticks = session.Query<Tick>()
.Where(x => x.Time >= fromTime && x.Time <= toTime)
.OrderBy(x => x.Time);
// Output the results of the above query
foreach (var tick in ticks)
Console.WriteLine("{0} {1}", tick.Time, tick.Bid);
// Get the aggregates for all time
var total = session.Query<Tick, Ticks_Aggregate>()
.As<Ticks_Aggregate.Result>()
.Single();
Console.WriteLine();
Console.WriteLine("Totals");
Console.WriteLine("Min: {0}", total.Min);
Console.WriteLine("Max: {0}", total.Max);
Console.WriteLine("Count: {0}", total.Count);
// Get the aggregates with a filter
var filtered = session.Query<Tick, Ticks_ByTime>()
.Where(x => x.Time >= fromTime && x.Time <= toTime)
.As<Ticks_ByTime.Result>()
.Take(1024) // max you can take at once
.ToList() // required!
.Single();
Console.WriteLine();
Console.WriteLine("Filtered");
Console.WriteLine("Min: {0}", filtered.Min);
Console.WriteLine("Max: {0}", filtered.Max);
Console.WriteLine("Count: {0}", filtered.Count);
}
Console.ReadLine();
}
}
}
I can envision a solution to the problem of aggregating over a time filter with a potentially large scope. The reduce would have to break things down into decreasingly smaller units of time at different levels. The code for this is a bit complex, but I am working on it for my own purposes. When complete, I will post over in the knowledge base at www.ravendb.net.
UPDATE
I was playing with this a bit more, and noticed two things in that last query.
You MUST do a ToList() before calling Single() in order to get the full result set.
Even though this runs on the server, the max you can have in the result range is 1024, and you have to specify Take(1024) or you get the default maximum of 128. Since this runs on the server, I didn't expect this. But I guess it's because you don't normally do aggregations in the TransformResults section.
I've updated the code for this. However, unless you can guarantee that the range is small enough for this to work, I would wait for the better full map/reduce that I spoke of. I'm working on it. :)

Related

String data type and summing time durations using ASP.NET MVC

What I have is a string data type which stores the duration.
I am looking for the sum of the durations and then the average of that sum.
I am using ASP.NET MVC.
Example:
00:30:21
00:40:01
00:21:10
Model class
public DateTime? FeedbackDateTime { get; set; }
public DateTime? FeedbackSharedDateTime { get; set; }
public string AuditorAHT { get; set; }
ReportVM used to group the data and display it in the view:
public string FeedbackSharedBy { get; set; }
public int AuditCount { get; set; }
public string AudtAht { get; set; }
Controller code that saves the action performed by the auditor as a duration in
public string AuditorAHT { get; set; }
dto.FeedbackSharedDateTime = DateTime.Now;
string ahtString = string.Format("{0:hh\\:mm\\:ss}", dto.FeedbackSharedDateTime - dto.FeedbackDateTime);
dto.AuditorAHT = ahtString;
db.SaveChanges();
The action below should display the auditor's name, the count, and the average time spent. The name and count are working, but not the average time spent:
var audtName = db.Chats.Where(x => System.Data.Entity.DbFunctions.TruncateTime(x.MSTChatCreatedDateTime) >= mostRecentMonday
&& System.Data.Entity.DbFunctions.TruncateTime(x.MSTChatCreatedDateTime) <= weekEnd && x.Feedback != null && x.FeedbackSharedBy != null).Select(x => new {
x.FeedbackSharedBy,
x.AuditorAHT
}).ToList() // this hits the database
// We need to do grouping in the code (rather than the db)
// because timespans are stored as strings
.GroupBy(e => e.FeedbackSharedBy)
.Select(g => new ReportVM
{
FeedbackSharedBy = g.Key,
AuditCount = g.Count(),
AudtAht = TimeSpan.FromSeconds(g.Sum(t => TimeSpan.Parse(t.AuditorAHT).TotalSeconds / g.Count())).ToString()
})
.OrderByDescending(s => s.AuditCount).ToList();
ViewBag.AudtReport = audtName;
The above code is working for me; I managed to make it work.
You can convert the string duration into a TimeSpan and use that to do time calculations. To convert it you can use TimeSpan.Parse(), or if you have a fixed format, TimeSpan.ParseExact().
With a TimeSpan you can get the totals out of it with the various .Total* properties. You can also get the internal tick count with .Ticks; that's the one with the highest precision.
Now it's simple math: sum of all ticks / count = average ticks.
You can pass this average tick count into a TimeSpan again to read it as .TotalMilliseconds or output it formatted with .ToString().
Here is a basic sample:
using System;
public class Program
{
public static void Main()
{
var duration1 = TimeSpan.Parse("00:30:21");
var duration2 = TimeSpan.Parse("00:40:01");
var duration3 = TimeSpan.Parse("00:21:10");
var totalDuration = duration1.Add(duration2).Add(duration3);
var averageDurationTicks = totalDuration.Ticks / 3;
var averageDuration = TimeSpan.FromTicks(averageDurationTicks);
Console.WriteLine($"Total duration: {totalDuration}, Average duration: {averageDuration}");
}
}
Here is a .Net Fiddle: https://dotnetfiddle.net/1Q9tmV
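As a side note, if the strings always use a fixed hh:mm:ss format, a TimeSpan.ParseExact call might look roughly like this (a small, untested sketch; adjust the format string to your data):
using System.Globalization;

// Parsing a fixed "hh:mm:ss" duration string.
// Note that colons must be escaped in a custom TimeSpan format string.
var duration = TimeSpan.ParseExact("00:30:21", @"hh\:mm\:ss", CultureInfo.InvariantCulture);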
After spending a lot of time, and with help from tymtam, I made the code work with the code below.
var audtName = db.Chats.Where(x => System.Data.Entity.DbFunctions.TruncateTime(x.MSTChatCreatedDateTime) >= mostRecentMonday
&& System.Data.Entity.DbFunctions.TruncateTime(x.MSTChatCreatedDateTime) <= weekEnd && x.Feedback != null && x.FeedbackSharedBy != null).Select(x => new {
x.FeedbackSharedBy,
x.AuditorAHT
}).ToList() // this hits the database
// We need to do grouping in the code (rather than the db)
// because timespans are stored as strings
.GroupBy(e => e.FeedbackSharedBy)
.Select(g => new ReportVM
{
FeedbackSharedBy = g.Key,
AuditCount = g.Count(),
AudtAht = TimeSpan.FromSeconds(g.Sum(t => TimeSpan.Parse(t.AuditorAHT).TotalSeconds / g.Count())).ToString()
})
.OrderByDescending(s => s.AuditCount).ToList();
ViewBag.AudtReport = audtName;
If second precision is enough for you, you could combine LINQ's Sum and Average with TimeSpan's Parse and TotalSeconds:
Sum = TimeSpan.FromSeconds(g.Sum(t => TimeSpan.Parse(t.Time).TotalSeconds)),
Avg = TimeSpan.FromSeconds(g.Average(t => TimeSpan.Parse(t.Time).TotalSeconds))
Here is a full example:
var data = new []{
new { X = "A", Time = "00:30:21"},
new { X = "B", Time = "00:40:01"},
new { X = "B", Time = "00:21:10"}
};
var grouped = data
.GroupBy(e => e.X)
.Select( g => new {
X = g.Key,
Count = g.Count(),
Sum = TimeSpan.FromSeconds(g.Sum(t => TimeSpan.Parse(t.Time).TotalSeconds)),
Avg = TimeSpan.FromSeconds(g.Average(t => TimeSpan.Parse(t.Time).TotalSeconds))
});
foreach (var item in grouped)
{
Console.WriteLine( $"'{item.X}' has {item.Count} item(s), sum = {item.Sum}, avg = {item.Avg}");
}
This produces:
'A' has 1 item(s), sum = 00:30:21, avg = 00:30:21
'B' has 2 item(s), sum = 01:01:11, avg = 00:30:35.5000000
You could use TotalMilliseconds + FromMilliseconds, or even go super precise with Ticks.
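For instance, a Ticks-based version of the average might look roughly like this (an untested sketch of how it could slot into the Select above):
// Highest-precision variant: average the raw ticks, then build a TimeSpan from them.
Avg = TimeSpan.FromTicks((long)g.Average(t => TimeSpan.Parse(t.Time).Ticks))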
Variant with Aggregate
Another option is to Parse earlier:
Sum = g.Select(e => TimeSpan.Parse(e.Time)).Aggregate((t1, t2) => t1 + t2),
Avg = g.Select(e => TimeSpan.Parse(e.Time)).Aggregate((t1, t2) => t1 + t2) / g.Count()
For LINQ to Entities
As you report in your comment, if we try the above with your real code it results in: LINQ to Entities does not recognize the method 'System.TimeSpan FromSeconds(Double)' method, and this method cannot be translated into a store expression. (The same happens for TimeSpan.Parse.)
Because of this we would need to do the grouping in code. This is less efficient than it would be if the value were stored in the database as a TimeSpan.
var grouped = db.Chats
.Where(...)
.Select( x => new {
x.FeedbackSharedBy,
x.AuditorAHT
})
.ToList() // this hits the database
// We need to do grouping in the code (rather than the db)
// because timespans are stored as strings
.GroupBy(e => e.FeedbackSharedBy)
.Select(g => new
{
FeedbackSharedBy = g.Key,
AuditCount = g.Count(),
AuditorAHTSumSeconds = TimeSpan.FromSeconds(g.Sum(t => TimeSpan.Parse(t.AuditorAHT).TotalSeconds) / g.Count())
.ToString(),
})
.OrderByDescending(s => s.AuditCount)
.ToList(); // Optional

C# Join multiple collections into one

I have a problem with joining multiple collections into one.
I need to combine data from many sensors into one collection, so that for each time the output file has values from all sensors; e.g. if one sensor has no data at that time, the file should contain 0 for it.
Please help me, I am desperate.
public class MeasuredData
{
public DateTime Time { get; }
public double Value { get; }
public MeasuredData(DateTime time, double value)
{
Time = time;
Value = value;
}
}
If you have multiple variables containing List<MeasuredData>, one for each sensor, you can group them in an array and then query them.
First, you need an extension method to round the DateTimes (per @jdweng) if you aren't already canonicalizing them as you acquire them.
public static DateTime Round(this DateTime dt, TimeSpan rnd) {
if (rnd == TimeSpan.Zero)
return dt;
else {
var ansTicks = dt.Ticks + Math.Sign(dt.Ticks) * rnd.Ticks / 2;
return new DateTime(ansTicks - ansTicks % rnd.Ticks);
}
}
Now you can create an array of the sensor reading Lists:
var sensorData = new[] { sensor0, sensor1, sensor2, sensor3 };
Then you can extract all the rounded times to create the left hand side of the table:
var roundTo = TimeSpan.FromSeconds(1);
var times = sensorData.SelectMany(sdl => sdl.Select(md => md.Time.Round(roundTo)))
.Distinct()
.Select(t => new { Time = t, Measurements = Enumerable.Empty<MeasuredData>() });
Then you can join each sensor to the table:
foreach (var oneSensorData in sensorData)
times = times.GroupJoin(oneSensorData, t => t.Time, md => md.Time.Round(roundTo),
(t, mdj) => new { t.Time, Measurements = t.Measurements.Concat(mdj) });
Finally, you can convert each row to the time and a List of measurements ordered by time:
var ans = times.Select(tm => new { tm.Time, Measurements = tm.Measurements.ToList() })
.OrderBy(tm => tm.Time);
If you wanted to flatten the List of measurements out to fields in the answer, you would need to do that manually with another Select.
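For example, a manual flattening might look roughly like this (an untested sketch that goes back to the individual sensor lists, so a sensor with no reading at a given time contributes 0 as the question asks; sensor0 and sensor1 are the per-sensor lists from above):
var flat = ans.Select(tm => new
{
    tm.Time,
    // FirstOrDefault() yields 0.0 when a sensor has no reading at this time.
    Sensor0 = sensor0.Where(md => md.Time.Round(roundTo) == tm.Time)
                     .Select(md => md.Value).FirstOrDefault(),
    Sensor1 = sensor1.Where(md => md.Time.Round(roundTo) == tm.Time)
                     .Select(md => md.Value).FirstOrDefault()
    // ...and so on for the remaining sensors
});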
Assuming you have something to join on, you can use Enumerable.Join:
var result = collection1.Join(collection2,
    /* whatever your join is */ x => x.id,
    y => y.id,
    (a, b) => new { x = a, y = b });

foreach (var obj in result)
{
    Console.WriteLine($"{obj.x.id}, {obj.y.id}");
}
This prints the ids of the two objects, but you could access anything. The link is probably more helpful, but you didn't give us much info.

Finding lowest price for overlapping date ranges - C# algorithm

There are prices set for certain time periods... I'm having trouble coming up with an algorithm to determine the lowest price for a specific time period.
I'm doing this with a list of objects, where the object has properties DateTime StartDate, DateTime EndDate, decimal Price.
For example, two price sets and their active date ranges:
A. 09/26/16 - 12/31/17 at $20.00
B. 12/01/16 - 12/31/16 at $18.00
You can see that B is inside the A time period and is lower.
I need that converted to this:
A. 09/26/16 - 11/30/16 at $20.00
B. 12/01/16 - 12/31/16 at $18.00
C. 01/01/17 - 12/31/17 at $20.00
It has to work for any number of date ranges and combinations. Has anyone come across anything I can manipulate to get the result I need? Or any suggestions?
Edit: My data structure:
public class PromoResult
{
public int ItemId { get; set; }
public decimal PromoPrice { get; set; }
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
public int PromoType { get; set; } // can ignore this...
}
This is a great case for using LINQ. Assuming your price range object is called PriceRecord...
You will need to create a list of all dates and then filter down to price records that are between two consecutive dates. An implementation might look something like this:
public static IEnumerable<PriceRecord> ReduceOverlaps(IEnumerable<PriceRecord> source)
{
// Get a list of all edges of date ranges
// edit, added OrderBy (!)
var edges = source.SelectMany(record => new[] { record.StartDate, record.EndDate }).OrderBy(d => d).ToArray();
// iterate over pairs of edges (i and i-1)
for (int i = 1; i < edges.Length; i++)
{
// select min price for range i-1, i
var price = source.Where(r => r.StartDate <= edges[i - 1] && r.EndDate >= edges[i]).Select(r => r.Price).Min();
// return a new record from i-1, i with price
yield return new PriceRecord() { StartDate = edges[i - 1], EndDate = edges[i], Price = price };
}
}
I haven't tested this and you may need to tinker with the comparison operators, but it may be a good starting point.
I have now tested the code, the example here works with the data in the question.
Feel free to propose edits to improve this example.
I will use two functions, DateRange and GroupSequenceWhile:
List<PromoResult> promoResult = new List<PromoResult>()
{
new PromoResult() { PromoPrice=20, StartDate = new DateTime(2016, 9, 26),EndDate=new DateTime(2017, 12, 31)},
new PromoResult() { PromoPrice=18, StartDate = new DateTime(2016, 12, 1),EndDate=new DateTime(2016, 12, 31)}
};
var result = promoResult.SelectMany(x => DateRange(x.StartDate, x.EndDate, TimeSpan.FromDays(1))
.Select(y => new { promo = x, date = y }))
.GroupBy(x => x.date).Select(x => x.OrderBy(y => y.promo.PromoPrice).First())
.OrderBy(x=>x.date)
.ToList();
var final = result.GroupSequenceWhile((x, y) => x.promo.PromoPrice == y.promo.PromoPrice)
.Select(g => new { start = g.First().date, end = g.Last().date, price = g.First().promo.PromoPrice })
.ToList();
foreach (var r in final)
{
Console.WriteLine(r.price + "$ " + r.start.ToString("MM/dd/yy", CultureInfo.InvariantCulture) + " " + r.end.ToString("MM/dd/yy", CultureInfo.InvariantCulture));
}
OUTPUT:
20$ 09/26/16 11/30/16
18$ 12/01/16 12/31/16
20$ 01/01/17 12/31/17
Algorithm:
1- Create a <day, price> tuple for each item in the promoResult list.
2- Group these tuples by day and select the minimum price.
3- Order these tuples by date.
4- Select the starting and ending day whenever the price changes between consecutive days.
IEnumerable<DateTime> DateRange(DateTime start, DateTime end, TimeSpan period)
{
for (var dt = start; dt <= end; dt = dt.Add(period))
{
yield return dt;
}
}
public static IEnumerable<IEnumerable<T>> GroupSequenceWhile<T>(this IEnumerable<T> seq, Func<T, T, bool> condition)
{
List<T> list = new List<T>();
using (var en = seq.GetEnumerator())
{
if (en.MoveNext())
{
var prev = en.Current;
list.Add(en.Current);
while (en.MoveNext())
{
if (condition(prev, en.Current))
{
list.Add(en.Current);
}
else
{
yield return list;
list = new List<T>();
list.Add(en.Current);
}
prev = en.Current;
}
if (list.Any())
yield return list;
}
}
}
Doesn't directly answer your question, but here is some SQL that I used to solve a similar problem I had (simplified down a bit, as I was also dealing with multiple locations and different price types):
SELECT RI.ItemNmbr, RI.UnitPrice, RI.CasePrice
, RP.ProgramID
, Row_Number() OVER (PARTITION BY RI.ItemNmbr
ORDER BY CASE WHEN RI.UnitPrice > 0
THEN RI.UnitPrice
ELSE 1000000 END ASC
, CASE WHEN RI.CasePrice > 0
THEN RI.CasePrice
ELSE 1000000 END ASC
, RP.EndDate DESC
, RP.BeginDate ASC
, RP.ProgramID ASC) AS RowNumBtl
, Row_Number() OVER (PARTITION BY RI.UnitPrice
ORDER BY CASE WHEN RI.CasePrice > 0
THEN RI.CasePrice
ELSE 1000000 END ASC
, CASE WHEN RI.UnitPrice > 0
THEN RI.UnitPrice
ELSE 1000000 END ASC
, RP.EndDate DESC
, RP.BeginDate ASC
, RP.ProgramID ASC) AS RowNumCase
FROM RetailPriceProgramItem AS RI
INNER JOIN RetailPriceMaster AS RP
ON RP.ProgramType = RI.ProgramType AND RP.ProgramID = RI.ProgramID
WHERE RP.ProgramType='S'
AND RP.BeginDate <= #date AND RP.EndDate >= #date
AND RI.Active=1
I select from that where RowNumBtl=1 for the UnitPrice and RowNumCase=1 for the CasePrice. If you then create a table of dates (which you can do using a CTE), you can cross apply on each date. This is a bit inefficient, since you only need to test at border conditions between date ranges, so... good luck with that.
I would start with the ranges in date order based on starting date, add the first entry as a range in its entirety so:
09/26/16 - 12/31/17 at $20.00
TBD:
12/01/16 - 12/31/16 at $18.00
Next, grab the next range you have. If it overlaps with the previous one, split the overlap (there are a few kinds of overlaps; make sure to handle them all), taking the minimum value for the overlapped region:
09/26/16 - 11/30/16 at $20.00
12/01/16 - 12/31/16 at $18.00
TBD:
01/01/17 - 12/31/17 at $20.00
Note that you don't have the last one yet as you would take any splits that occur after and put them back into your sorted list of "yet to be compared" items.
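A rough, untested sketch of that splitting idea, assuming whole-day granularity and the PromoResult class from the question (FlattenToLowestPrice is just a made-up name; it needs using System, System.Collections.Generic and System.Linq), might look like this:
static List<PromoResult> FlattenToLowestPrice(IEnumerable<PromoResult> promos)
{
    var result = new List<PromoResult>();   // disjoint segments, lowest price wins
    var todo = new Queue<PromoResult>(promos.OrderBy(p => p.StartDate));

    while (todo.Count > 0)
    {
        var current = todo.Dequeue();

        // Find an already-accepted segment that overlaps the current range.
        var existing = result.FirstOrDefault(s =>
            s.StartDate <= current.EndDate && current.StartDate <= s.EndDate);

        if (existing == null) { result.Add(current); continue; }   // no overlap: keep as-is

        // Parts of `current` sticking out before/after the existing segment go back
        // on the queue so they can be compared against the other segments.
        if (current.StartDate < existing.StartDate)
            todo.Enqueue(new PromoResult { ItemId = current.ItemId, PromoPrice = current.PromoPrice,
                StartDate = current.StartDate, EndDate = existing.StartDate.AddDays(-1) });
        if (current.EndDate > existing.EndDate)
            todo.Enqueue(new PromoResult { ItemId = current.ItemId, PromoPrice = current.PromoPrice,
                StartDate = existing.EndDate.AddDays(1), EndDate = current.EndDate });

        // The overlapped region keeps whichever price is lower.
        var start = current.StartDate > existing.StartDate ? current.StartDate : existing.StartDate;
        var end   = current.EndDate   < existing.EndDate   ? current.EndDate   : existing.EndDate;
        if (current.PromoPrice < existing.PromoPrice)
        {
            result.Remove(existing);
            if (existing.StartDate < start)
                result.Add(new PromoResult { ItemId = existing.ItemId, PromoPrice = existing.PromoPrice,
                    StartDate = existing.StartDate, EndDate = start.AddDays(-1) });
            if (existing.EndDate > end)
                result.Add(new PromoResult { ItemId = existing.ItemId, PromoPrice = existing.PromoPrice,
                    StartDate = end.AddDays(1), EndDate = existing.EndDate });
            result.Add(new PromoResult { ItemId = current.ItemId, PromoPrice = current.PromoPrice,
                StartDate = start, EndDate = end });
        }
        // If the existing price is lower or equal, the overlapped part of `current` is simply dropped.
    }

    return result.OrderBy(p => p.StartDate).ToList();
}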
Try this.
Let's say we have:
public class DatePrice
{
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
public decimal Price { get; set; }
}
and
IList<DatePrice> list = new List<DatePrice>(); // populate your data from the source..
var lowestPriceItem = list.OrderBy(item => item.Price).First();
should give you the lowest price item.

ASP.NET MVC Filter datetime by weeks

I've got a Web API and a Get method, returning a query:
var query = from results in context.Table
where results.Date>= startDate && results.Date <= endDate
select new
{
Week = { this is where I need a method to group by weeks },
Average = results.Where(x => x.Number).Average()
}
return query.ToList();
I want to calculate the average for each 7 days (that being the first week).
Example:
Average 1 ... day 7 (Week 1)
Average 2 ... day 14 (Week 2)
How can I do that? Given an interval of datetimes, how do I filter it by weeks (not week of the year)?
Try this (not tested with tables)
var avgResult = context.QuestionaireResults
.Where(r => (r.DepartureDate >= startDate && r.DepartureDate <= endDate)).ToList()
.GroupBy( g => (Decimal.Round(g.DepartureDate.Day / 7)+1))
.Select( g => new
{
Week = g.Key,
Avg = g.Average(n => n.Number)
});
You will need to group by the number of days since a reference date, divided by 7, so:
.GroupBy(x => Math.Floor(((x.DepartureDate - new DateTime(1980,1,1)).TotalDays + 2) / 7))
Subtracting "Jan 1, 1980" from your departure date, gives you a TimeSpan object with the difference between the two dates. The TotalDays property of that timespan gives you timespan in days. Adding 2 corrects for the fact that "Jan 1, 1980" was a Tuesday. Dividing by 7 gives you the number of weeks since then. Math.Floor rounds it down, so that you get a consistent integer for the week, given any day of the week or portion of days within the week.
You could simplify a little by picking a reference date that is a Sunday (assuming that is your "first day of the week"), so you don't have to add 2 to correct. Like so:
.GroupBy(x => Math.Floor(((x.DepartureDate - new DateTime(1979,12,30)).TotalDays) / 7))
If you are sure that your data all falls within a single calendar year, you could maybe use the Calendar.GetWeekOfYear method to figure out the week, but I am not sure it would be any simpler.
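For reference, a GetWeekOfYear call might look roughly like this (an untested sketch; departureDate is a placeholder, and the week numbering depends on the CalendarWeekRule and first day of week you pick):
using System.Globalization;

// Gives a calendar week-of-year number rather than weeks relative to your start date.
var calendar = CultureInfo.InvariantCulture.Calendar;
int week = calendar.GetWeekOfYear(departureDate, CalendarWeekRule.FirstDay, DayOfWeek.Sunday);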
Why not write a stored procedure? I think there may be some limitations on your flexibility using LINQ, because normally GroupBy groups by a value (the value of the referenced "thing"), so you can group by State or Age, but I guess you can group by week... (new thought)
Add a property called EndOfWeek. For example, if the end of this week is (let's say Sunday) then EndOfWeek = 9.2.16, whereas last week's was 8.28.16, etc. Then you can easily group, but you still have to arrange the data.
I know I didn't answer the question but I hope that I sparked some brain activity in an area that allows you to solve the problem.
--------- UPDATED ----------------
Simple solution: loop through your records, and for each record determine the EndOfWeek for that record. After this you will have a groupable value, and you can easily group by EndOfWeek. Simple! Now, @MikeMcCaughan, please tell me how this doesn't work? Is it illogical to extend an object? What are you talking about?
------------ HERE IS THE CODE ----------------
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace SandboxConsole
{
class Program
{
static void Main(string[] args)
{
var t = new Transactions();
List<Transactions> transactions = t.GetTransactions();
// Now let's add a Weeks end date so we can determine the average per week
foreach(var transaction in transactions)
{
var transactionDayOfWeek = transaction.TransactionDate;
int daysUntilEndOfWeek_Sat = ((int)DayOfWeek.Saturday - (int)transactionDayOfWeek.DayOfWeek + 7) % 7;
transaction.Newly_Added_Property_To_Group_By_Week_To_Get_Averages = transactionDayOfWeek.AddDays(daysUntilEndOfWeek_Sat).ToShortDateString();
//Console.WriteLine("{0} {")
}
foreach(var weekEnd in transactions.GroupBy(tt => tt.Newly_Added_Property_To_Group_By_Week_To_Get_Averages))
{
decimal weekTotal = 0;
foreach(var trans in weekEnd)
{
weekTotal += trans.Amount;
}
var weekAverage = weekTotal / 7;
Console.WriteLine("Week End: {0} - Avg {1}", weekEnd.Key.ToString(), weekAverage.ToString("C"));
}
Console.ReadKey();
}
}
class Transactions
{
public int Id { get; set; }
public string SomeOtherProp { get; set; }
public DateTime TransactionDate { get; set; }
public decimal Amount { get; set; }
public string Newly_Added_Property_To_Group_By_Week_To_Get_Averages { get; set; }
public List<Transactions> GetTransactions()
{
var results = new List<Transactions>();
for(var i = 0; i<100; i++)
{
results.Add(new Transactions
{
Id = i,
SomeOtherProp = "Customer " + i.ToString(),
TransactionDate = GetRandomDate(i),
Amount = GetRandomAmount()
});
}
return results;
}
public DateTime GetRandomDate(int i)
{
Random gen = new Random();
DateTime startTime = new DateTime(2016, 1, 1);
int range = (DateTime.Today - startTime).Days + i;
return startTime.AddDays(gen.Next(range));
}
public int GetRandomAmount()
{
Random rnd = new Random();
int amount = rnd.Next(1000, 10000);
return amount;
}
}
}
------------ OUTPUT ---------------
Sample Output

How do I get the total Qty using one LINQ query?

I have two LINQ queries: one to get confirmedQty and another to get unconfirmedQty.
There is a condition for getting unconfirmedQty. It should be average instead of sum.
result = Sum(confirmedQty) + Avg(unconfirmedQty)
Is there any way to just write one query and get the desired result instead of writing two separate queries?
My Code
class Program
{
static void Main(string[] args)
{
List<Item> items = new List<Item>(new Item[]
{
new Item{ Qty = 100, IsConfirmed=true },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
});
int confirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty));
int unconfirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed != true).Average(u => u.Qty));
//Output => Total : 140
Console.WriteLine("Total : " + (confirmedQty + unconfirmedQty));
Console.Read();
}
public class Item
{
public int Qty { get; set; }
public bool IsConfirmed { get; set; }
}
}
Actually, the accepted answer enumerates your items collection 2N + 1 times and adds unnecessary complexity to your original solution. If I met this piece of code
(from t in items
let confirmedQty = items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty)
let unconfirmedQty = items.Where(o => o.IsConfirmed != true).Average(u => u.Qty)
let total = confirmedQty + unconfirmedQty
select new { tl = total }).FirstOrDefault();
it would take some time to understand what type of data you are projecting the items to. Yes, this query is a strange projection. It creates a SelectIterator to project each item of the sequence, then it creates some range variables, which involves iterating the items twice, and finally it selects the first projected item. Basically you have wrapped your original queries in an additional, useless query:
items.Select(i => {
var confirmedQty = items.Where(o => o.IsConfirmed).Sum(u => u.Qty);
var unconfirmedQty = items.Where(o => !o.IsConfirmed).Average(u => u.Qty);
var total = confirmedQty + unconfirmedQty;
return new { tl = total };
}).FirstOrDefault();
The intent is hidden deep in the code and you still have the same two nested queries. What can you do here? You can simplify your two queries, make them more readable, and show your intent clearly:
int confirmedTotal = items.Where(i => i.IsConfirmed).Sum(i => i.Qty);
// NOTE: Average will throw exception if there is no unconfirmed items!
double unconfirmedAverage = items.Where(i => !i.IsConfirmed).Average(i => i.Qty);
int total = confirmedTotal + (int)unconfirmedAverage;
If performance is more important than readability, then you can calculate the total in a single pass over the items (moved to an extension method for readability):
public static int Total(this IEnumerable<Item> items)
{
int confirmedTotal = 0;
int unconfirmedTotal = 0;
int unconfirmedCount = 0;
foreach (var item in items)
{
if (item.IsConfirmed)
{
confirmedTotal += item.Qty;
}
else
{
unconfirmedCount++;
unconfirmedTotal += item.Qty;
}
}
if (unconfirmedCount == 0)
return confirmedTotal;
// NOTE: Will not throw if there is no unconfirmed items
return confirmedTotal + unconfirmedTotal / unconfirmedCount;
}
Usage is simple:
items.Total();
By the way, the second solution from the accepted answer is not correct. It's just a coincidence that it returns the correct value, because all of your unconfirmed items have an equal Qty. That solution calculates a sum instead of an average. A solution with grouping would look like:
var total =
items.GroupBy(i => i.IsConfirmed)
.Select(g => g.Key ? g.Sum(i => i.Qty) : (int)g.Average(i => i.Qty))
.Sum();
Here you group the items into two groups, confirmed and unconfirmed. Then you calculate either the sum or the average based on the group key, and finally sum the two group values. This is neither the most readable nor the most efficient solution, but it is correct.
