Reduce cannot contain Average() methods in grouping - c#

Just upgraded to v2 and this no longer works; I get a similar error if I try to use Count()
public class Deck_Ratings : AbstractIndexCreationTask<DeckRating, Deck_Ratings.ReduceResult>
{
public class ReduceResult
{
public string DeckId { get; set; }
public int Rating { get; set; }
}
public Deck_Ratings()
{
Map = deckRatings => deckRatings.Select(deckRating => new
{
deckRating.DeckId,
deckRating.Rating
});
Reduce = reduceResults => reduceResults
.GroupBy(reduceResult => reduceResult.DeckId)
.Select(grouping => new
{
DeckId = grouping.Key,
Rating = grouping.Average(reduceResult => reduceResult.Rating)
});
}
}

Aggregates whose result depends on the size of the reduce batch (such as Count and Average) are prohibited because they yield wrong results. You may have been able to use them under 1.0, but your averages were probably wrong unless you had so few items that they were all processed in one reduce batch. To understand more about reduce batches, read Map / Reduce - A Visual Explanation.
Instead, count items by summing a 1 for each item. Average items by keeping a sum of the values as a total and a sum of 1's as a count, then dividing the total by the count.
public class Deck_Ratings : AbstractIndexCreationTask<DeckRating, Deck_Ratings.ReduceResult>
{
public class ReduceResult
{
public string DeckId { get; set; }
public int TotalRating { get; set; }
public int CountRating { get; set; }
public double AverageRating { get; set; }
}
public Deck_Ratings()
{
Map = deckRatings => deckRatings.Select(deckRating => new
{
deckRating.DeckId,
TotalRating = deckRating.Rating,
CountRating = 1,
AverageRating = 0
});
Reduce = reduceResults => reduceResults
.GroupBy(reduceResult => reduceResult.DeckId)
.Select(grouping => new
{
DeckId = grouping.Key,
TotalRating = grouping.Sum(reduceResult => reduceResult.TotalRating),
CountRating = grouping.Sum(reduceResult => reduceResult.CountRating)
})
.Select(x => new
{
x.DeckId,
x.TotalRating,
x.CountRating,
// Cast first: dividing two ints would truncate the average.
AverageRating = (double)x.TotalRating / x.CountRating
});
}
}
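To see why a batch-dependent aggregate goes wrong, here is a minimal sketch in plain LINQ to Objects (this is not RavenDB index code). The class and method names are hypothetical, and each int array stands in for one reduce batch of ratings for a single deck.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReduceBatchDemo
{
    // Re-reducing with Average(): averages the per-batch averages,
    // so a small batch weighs as much as a large one.
    public static double AverageOfAverages(IEnumerable<int[]> batches) =>
        batches.Select(batch => batch.Average()).Average();

    // Re-reducing with sums: totals and counts combine correctly
    // no matter how the items were split into batches.
    public static double AverageFromSums(IEnumerable<int[]> batches)
    {
        var partials = batches
            .Select(batch => (Total: batch.Sum(), Count: batch.Length))
            .ToList();
        int total = partials.Sum(p => p.Total);
        int count = partials.Sum(p => p.Count);
        return (double)total / count;
    }
}
```

With the batches {5, 5, 5, 5} and {1}, AverageOfAverages returns 3, while AverageFromSums returns 21 / 5 = 4.2, which is the true average.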

This is issue RavenDB-783. This is expected behavior since v2.0.
Not sure what he recommends as an alternative, though.

Related

Problem with fetching records from database with EntityFramework

I need a little help with the three functions below. I expect them to return the records for the current day, the current month, and the current year. However, I notice that the daily report shows a 'scrap' amount of around 126, while the monthly and yearly reports show 32.
Why is the 126 'scrap' from the daily report not included in the other reports? Thank you in advance.
public async Task<List<Scrap>> GetDailyScrap()
{
return await _dbContext.Scrap.Where(x =>
x.Created.Year == DateTime.Now.Year &&
x.Created.Month == DateTime.Now.Month &&
x.Created.Day == DateTime.Now.Day).ToListAsync();
}
public async Task<List<Scrap>> GetMonthlyScrap()
{
return await _dbContext.Scrap.Where(x =>
x.Created.Year == DateTime.Now.Year &&
x.Created.Month == DateTime.Now.Month).ToListAsync();
}
public async Task<List<Scrap>> GetYearScrap()
{
return await _dbContext.Scrap.Where(x =>
x.Created.Year == DateTime.Now.Year).ToListAsync();
}
I expect the amount of scrap for KST-420 (daily chart) to be reflected correctly in the monthly and yearly reports.
Scrap Model :
public class Scrap
{
public int Id { get; set; }
public int ScrapLineId { get; set; }
public string Line { get; set; }
public string Type { get; set; }
public string Position { get; set; }
public string Tag { get; set; }
public int Shift { get; set; }
public int ShiftLeaderPersonalId { get; set; }
public int OperatorPersonalId { get; set; }
public int Quantity { get; set; }
public int Week { get; set; }
public DateTime Created { get; set; }
}
endpoint:
//Daily Bar Chart SCRAP
List<Scrap> dailyScrap = await _scrapService.GetDailyScrap();
List<string> xValues = dailyScrap.DistinctBy(x => x.Line).Select(x => x.Line).ToList();
List<int> yValues = dailyScrap.DistinctBy(x => x.Quantity).Select(x => x.Quantity).ToList();
// Monthly Bar Chart SCRAP
List<Scrap> monthlyScrap = await _scrapService.GetMonthlyScrap();
List<string> xValuesMonthly = monthlyScrap.DistinctBy(x => x.Line).Select(x => x.Line).ToList();
List<int> yValuesMonthly = monthlyScrap.DistinctBy(x => x.Quantity).Select(x => x.Quantity).ToList();
// Year Bar Chart SCRAP
List<Scrap> yearScrap = await _scrapService.GetYearScrap();
List<string> xValuesYear = yearScrap.DistinctBy(x => x.Line).Select(x => x.Line).ToList();
List<int> yValuesYear= yearScrap.DistinctBy(x => x.Quantity).Select(x => x.Quantity).ToList();
As written, these queries return the distinct Quantity values, not a count or sum of items per line. For example, quantities 101 and 102 would produce two Y values, while 100 rows that each have a quantity of 100 would produce just one.
To get totals by line, use GroupBy with Count or Sum, e.g.:
var dailies=dailyScrap.GroupBy(s=>s.Line)
.Select(g=>new
{
X=g.Key,
Y=g.Sum(s=>s.Quantity)
})
.ToList();
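A small in-memory sketch makes the difference concrete. ScrapRow is a hypothetical cut-down version of the Scrap model with only the two fields the chart uses, and DistinctBy assumes .NET 6 or later, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScrapChartDemo
{
    // Hypothetical cut-down Scrap model: only the fields the chart needs.
    public sealed class ScrapRow
    {
        public string Line { get; set; } = "";
        public int Quantity { get; set; }
    }

    // What the endpoint does today: one entry per distinct Quantity value,
    // regardless of which line the row belongs to.
    public static List<int> DistinctQuantities(IEnumerable<ScrapRow> rows) =>
        rows.DistinctBy(r => r.Quantity).Select(r => r.Quantity).ToList();

    // What a per-line bar chart needs: one total per Line.
    public static Dictionary<string, int> TotalsByLine(IEnumerable<ScrapRow> rows) =>
        rows.GroupBy(r => r.Line)
            .ToDictionary(g => g.Key, g => g.Sum(r => r.Quantity));
}
```

For rows (KST-420, 10), (KST-420, 10), (KST-420, 6) and (KST-500, 10), DistinctQuantities yields only 10 and 6, whereas TotalsByLine yields 26 for KST-420 and 10 for KST-500. The second line name is made up for the example.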
This can be done in EF Core too, retrieving only the totals from the database :
var dateFrom=DateTime.Today;
var dateTo=dateFrom.AddDays(1);
var dailies=_dbContext.Scrap
.Where(s=> s.Created>=dateFrom
&& s.Created <dateTo)
.GroupBy(s=>s.Line)
.Select(g=>new
{
X=g.Key,
Y=g.Sum(s=>s.Quantity)
})
.ToList();
This generates
SELECT Line,SUM(Quantity)
FROM Scrap
WHERE Created >= @d1 AND Created < @d2
GROUP BY Line
The condition can be simplified to only Where(s=> s.Created>=DateTime.Today) if there are no future values.
The query can be adapted to cover any period by changing the From and To parameters, eg :
var dateFrom=new DateTime(DateTime.Today.Year,DateTime.Today.Month,1);
var dateTo=dateFrom.AddMonths(1);
or
var dateFrom=new DateTime(DateTime.Today.Year,1,1);
var dateTo=dateFrom.AddYears(1);
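The three ranges can be gathered in one place. The following is only a sketch; ReportPeriods and its method names are hypothetical, and each method returns a half-open range [From, To) so that it fits the Created >= dateFrom && Created < dateTo filter used above.

```csharp
using System;

public static class ReportPeriods
{
    // Range covering the calendar day that contains 'today'.
    public static (DateTime From, DateTime To) Day(DateTime today)
    {
        var from = today.Date; // strips the time-of-day part
        return (from, from.AddDays(1));
    }

    // Range covering the calendar month that contains 'today'.
    public static (DateTime From, DateTime To) Month(DateTime today)
    {
        var from = new DateTime(today.Year, today.Month, 1);
        return (from, from.AddMonths(1));
    }

    // Range covering the calendar year that contains 'today'.
    public static (DateTime From, DateTime To) Year(DateTime today)
    {
        var from = new DateTime(today.Year, 1, 1);
        return (from, from.AddYears(1));
    }
}
```

Passing DateTime.Today from the caller, rather than reading the clock inside the helper, keeps the ranges easy to test with fixed dates.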

Joining two lists of object optimization

I am looking for a way of optimizing my LINQ query.
Classes:
public class OffersObject
{
public List<SingleFlight> Flights { get; set; }
public List<Offer> Offers { get; set; } = new List<Offer>();
}
public class SingleFlight
{
public int Id { get; set; }
public string CarrierCode { get; set; }
public string FlightNumber { get; set; }
}
public class Offer
{
public int ProfileId { get; set; }
public List<ExtraOffer> ExtraOffers { get; set; } = new List<ExtraOffer>();
}
public class ExtraOffer
{
public List<int> Flights { get; set; }
public string Name { get; set; }
}
Sample object:
var sampleObject = new OffersObject
{
Flights = new List<SingleFlight>
{
new SingleFlight
{
Id = 1,
CarrierCode = "KL",
FlightNumber = "1"
},
new SingleFlight
{
Id = 2,
CarrierCode = "KL",
FlightNumber = "2"
}
},
Offers = new List<Offer>
{
new Offer
{
ProfileId = 41,
ExtraOffers = new List<ExtraOffer>
{
new ExtraOffer
{
Flights = new List<int>{1},
Name = "TEST"
},
new ExtraOffer
{
Flights = new List<int>{2},
Name = "TEST"
},
new ExtraOffer
{
Flights = new List<int>{1,2},
Name = "TEST"
}
}
}
}
};
Goal of LINQ query:
List of:
{ int ProfileId, string CommercialName, List<string> fullFlightNumbers }
FullFlightNumber should be created by "Id association" of a flight. It is created like: {CarrierCode} {FlightNumber}
What I have so far (works correctly, but not the fastest way I guess):
var result = sampleObject.Offers
.SelectMany(x => x.ExtraOffers,
(a, b) => {
return new
{
ProfileId = a.ProfileId,
Name = b.Name,
FullFlightNumbers = b.Flights.Select(f => $"{sampleObject.Flights.FirstOrDefault(fl => fl.Id == f).CarrierCode} {sampleObject.Flights.First(fl => fl.Id == f).FlightNumber}").ToList()
};
})
.ToList();
Final note
The part that looks wrong to me is:
.Select(f => $"{sampleObject.Flights.FirstOrDefault(fl => fl.Id == f)?.CarrierCode} {sampleObject.Flights.FirstOrDefault(fl => fl.Id == f)?.FlightNumber}").ToList()
I am basically looking for a way of "joining" those two lists of the OffersObject by Flight's Id.
Any tips appreciated.
If there will only be a few flights defined in sampleObject.Flights, a sequential search using a numeric key is hard to beat.
However, if the number of flights times the number of offers is substantial (1000s or more), I would suggest loading the list of flights into a dictionary with Id as the key for efficient lookup. Something like:
var flightLookup = sampleObject.Flights.ToDictionary(f => f.Id);
And then calculate your FullFlightNumbers as
FullFlightNumbers = b.Flights
.Select(flightId => {
flightLookup.TryGetValue(flightId, out SingleFlight flight);
return $"{flight?.CarrierCode} {flight?.FlightNumber}";
})
.ToList()
TryGetValue above will quietly return a null value for flight if no match is found. If you know that a match will always be present, the lookup could alternatively be coded as:
SingleFlight flight = flightLookup[flightId];
The above also uses a statement lambda. In short, lambda functions can have either expression or statement blocks as bodies. See the C# reference for more information.
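Put together, the dictionary approach might look like the self-contained sketch below. FlightJoinDemo and FullFlightNumbers are hypothetical names, and skipping unknown flight ids (instead of emitting an empty entry) is a design choice of this sketch, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlightJoinDemo
{
    public sealed class SingleFlight
    {
        public int Id { get; set; }
        public string CarrierCode { get; set; } = "";
        public string FlightNumber { get; set; } = "";
    }

    // Resolves flight ids through a prebuilt dictionary, so each lookup is a
    // hash lookup rather than a scan of the flight list.
    public static List<string> FullFlightNumbers(
        IReadOnlyDictionary<int, SingleFlight> flightLookup,
        IEnumerable<int> flightIds)
    {
        var result = new List<string>();
        foreach (var id in flightIds)
        {
            // Assumption of this sketch: ids without a matching flight are skipped.
            if (flightLookup.TryGetValue(id, out var flight))
            {
                result.Add($"{flight.CarrierCode} {flight.FlightNumber}");
            }
        }
        return result;
    }
}
```

Build the dictionary once with sampleObject.Flights.ToDictionary(f => f.Id) and reuse it for every ExtraOffer. Rebuilding it per offer would give up most of the benefit.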
I'd suggest replacing the double .FirstOrDefault() approach with .IntersectBy(). It is available in the System.Linq namespace, starting from .NET 6.
.IntersectBy() basically filters sampleObject.Flights by matching the flight ID for each flight in sampleObject with flight IDs in ExtraOffers.Flights.
In the code below, fl => fl.Id is the key selector for sampleObject.Flights (i.e. fl is a SingleFlight).
var result = sampleObject.Offers
.SelectMany(x => x.ExtraOffers,
(a, b) => {
return new
{
ProfileId = a.ProfileId,
Name = b.Name,
FullFlightNumbers = sampleObject.Flights
.IntersectBy(b.Flights, fl => fl.Id)
.Select(fl => fl.FullFlightNumber) // alternative 1
//.Select(fl => $"{fl.CarrierCode} {fl.FlightNumber}") // alternative 2
.ToList()
};
})
.ToList();
In my suggestion I have added the property FullFlightNumber to SingleFlight so that the Linq statement looks slightly cleaner:
public class SingleFlight
{
public int Id { get; set; }
public string CarrierCode { get; set; }
public string FlightNumber { get; set; }
public string FullFlightNumber => $"{CarrierCode} {FlightNumber}";
}
If defining SingleFlight.FullFlightNumber is not possible/desirable for you, the second alternative in the code suggestion can be used instead.

Find all duplicates in a list in C#

I have a Custom class shown below
internal class RecurringClusterModel
{
public int? From { get; set; }
public int? To { get; set; }
public string REC_Cluster_1 { get; set; }
public string REC_Cluster_2 { get; set; }
public string REC_Cluster_3 { get; set; }
public string REC_Cluster_4 { get; set; }
public string REC_Cluster_5 { get; set; }
public string REC_Cluster_6 { get; set; }
public string REC_Cluster_7 { get; set; }
public string REC_Cluster_8 { get; set; }
public string REC_Cluster_9 { get; set; }
public string REC_Cluster_10 { get; set; }
}
I have a List of this class
List<RecurringClusterModel> recurringRecords = new List<RecurringClusterModel>();
The data can be in the below format
recurringRecords[0].REC_Cluster_1 = "USA";
recurringRecords[0].REC_Cluster_2 = "UK";
recurringRecords[0].REC_Cluster_3 = "India";
recurringRecords[0].REC_Cluster_4 = "France";
recurringRecords[0].REC_Cluster_5 = "China";
recurringRecords[1].REC_Cluster_1 = "France";
recurringRecords[1].REC_Cluster_2 = "Germany";
recurringRecords[1].REC_Cluster_3 = "Canada";
recurringRecords[1].REC_Cluster_4 = "Russia";
recurringRecords[1].REC_Cluster_5 = "India";
....
I want to find the duplicate records between all the Cluster properties. This is just a subset; I have 50 properties, up to REC_Cluster_50. I want to find out which countries are duplicated between the 50 cluster properties of the list.
So in this case India and France are duplicated. I can group by an individual property and then find the duplicates by getting the count, but then I'd have to do it for all 50 REC_Cluster properties. I'm not sure if there is a better way of doing it.
Thanks
Since you want to capture the From and To, I suggest you structure your class like this:
internal class RecurringClusterModel
{
public int? From { get; set; }
public int? To { get; set; }
public IEnumerable<string> REC_Clusters { get; set; }
}
Then you can search for duplicates:
var dupes = recs
.Select(r => new
{
r.From,
r.To,
DuplicateClusters = r.REC_Clusters.GroupBy(c => c)
.Where(g => g.Count() > 1) // duplicates
.SelectMany(g => g) // flatten it back
.ToArray() // indexed
})
.Where(r => r.DuplicateClusters.Any()) //only interested in clusters with duplicates
.ToArray();
EDIT
If you want all duplicates, then it will be:
var allDupes = recs.SelectMany(r => r.REC_Clusters) // flatten every record's clusters into one sequence
.GroupBy(c => c)
.Where(g => g.Count() > 1) // values that occur more than once overall
.Select(g => g.Key)
.ToArray();
But now you lose track of the From/To.
I would add an enumerable to your class that iterates over all properties of that class:
internal class RecurringClusterModel
{
public string REC_Cluster_1 { get; set; }
public string REC_Cluster_2 { get; set; }
public string REC_Cluster_3 { get; set; }
public IEnumerable<string> Clusters => GetAllClusters();
private IEnumerable<string> GetAllClusters()
{
if (!string.IsNullOrEmpty(REC_Cluster_1))
yield return REC_Cluster_1;
if (!string.IsNullOrEmpty(REC_Cluster_2))
yield return REC_Cluster_2;
if (!string.IsNullOrEmpty(REC_Cluster_3))
yield return REC_Cluster_3;
}
}
With this you can flatten the list to the individual clusters and then group by. If you need the original object back again, you have to provide it while flattening. Here is an example:
var clusters = Enumerable
.Range(1, 10)
.Select(_ => new RecurringClusterModel
{
REC_Cluster_1 = _Locations[_Random.Next(_Locations.Count)],
REC_Cluster_2 = _Locations[_Random.Next(_Locations.Count)],
REC_Cluster_3 = _Locations[_Random.Next(_Locations.Count)],
})
.ToList();
var dictionary = clusters
// Flatten the list and preserve original object
.SelectMany(model => model.Clusters.Select(cluster => (cluster, model)))
// Group by flattened value and put original object into each group
.GroupBy(node => node.cluster, node => node.model)
// Take only groups with more than one element (duplicates)
.Where(group => group.Skip(1).Any())
// Depending on further processing you could put the groups into a dictionary.
.ToDictionary(group => group.Key, group => group.ToList());
foreach (var cluster in dictionary)
{
Console.WriteLine(cluster.Key);
foreach (var item in cluster.Value)
{
Console.WriteLine(" " + String.Join(", ", item.Clusters));
}
}
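With 50 properties, writing one yield return per property gets tedious. As an alternative technique (reflection, which neither answer above uses), the cluster values could be collected by property name. This is only a sketch: ClusterDuplicates and its members are hypothetical, and the model is cut down to three properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ClusterDuplicates
{
    // Cut-down stand-in for the question's class; the real one has 50 such properties.
    public sealed class RecurringClusterModel
    {
        public string REC_Cluster_1 { get; set; }
        public string REC_Cluster_2 { get; set; }
        public string REC_Cluster_3 { get; set; }
    }

    // Looked up once: every public string property whose name starts with "REC_Cluster_".
    private static readonly PropertyInfo[] ClusterProperties =
        typeof(RecurringClusterModel)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.PropertyType == typeof(string)
                     && p.Name.StartsWith("REC_Cluster_", StringComparison.Ordinal))
            .ToArray();

    // All non-empty cluster values of one record.
    public static IEnumerable<string> ClustersOf(RecurringClusterModel model) =>
        ClusterProperties
            .Select(p => p.GetValue(model) as string)
            .Where(value => !string.IsNullOrEmpty(value));

    // Values that occur more than once across all records and properties.
    public static List<string> Duplicated(IEnumerable<RecurringClusterModel> records) =>
        records
            .SelectMany(ClustersOf)
            .GroupBy(cluster => cluster)
            .Where(group => group.Skip(1).Any())
            .Select(group => group.Key)
            .ToList();
}
```

Note that a value repeated inside a single record also counts as a duplicate here. Reflection is slower than direct property access, which is why the PropertyInfo array is cached in a static field.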

Convert SQL to Linq with EF Core

I am using .NET Core 2.2, EF Core, C# and SQL Server 2017.
I am not able to translate the query I need to Linq.
This is the query I need to convert:
SELECT TOP 5
p.Id,
p.Title,
AVG(q.RatingValue) AvgRating
FROM Movies AS p
INNER JOIN Ratings AS q ON p.Id = q.MovieId
GROUP BY p.Id, p.Title
ORDER BY AvgRating DESC, p.Title ASC
The idea of the previous query is to get the Top 5 movies according to the Avg rating, ordering it by the highest average first, and in case of same average order alphabetically.
So far this is my query that makes the join, but then still missing: the group by, average, and ordering:
public class MovieRepository : IMovieRepository
{
private readonly MovieDbContext _moviesDbContext;
public MovieRepository(MovieDbContext moviesDbContext)
{
_moviesDbContext = moviesDbContext;
}
public IEnumerable<Movie> GetTopFive()
{
var result = _moviesDbContext.Movies.OrderByDescending(x => x.Id).Take(5).
Include(x => x.Ratings);
return result;
}
}
And these are the entities:
public class Movie
{
public int Id { get; set; }
public string Title { get; set; }
public int YearOfRelease { get; set; }
public string Genre { get; set; }
public int RunningTime { get; set; }
public IList<Rating> Ratings { get; set; }
}
public class Rating
{
public int Id { get; set; }
public int MovieId { get; set; }
public int UserId { get; set; }
public decimal RatingValue { get; set; }
}
I tried to use Linqer tool also to convert my query to Linq, but it was not working.
I will appreciate any help to convert that query to LINQ for the method "GetTopFive".
Thanks
Try this one -
var data = _moviesDbContext.Movies.Include(x => x.Ratings)
.Select(x => new {
Id = x.Id,
Title = x.Title,
// Nullable decimal keeps the fractional part and yields null for movies without ratings.
Average = x.Ratings.Average(y => (decimal?)y.RatingValue)
}).OrderByDescending(x => x.Average).ThenBy(x => x.Title).Take(5).ToList();
Try as follows:
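public IEnumerable<MovieRating> GetTopFive()
{
// An anonymous type cannot be returned as IEnumerable<Movie>, so project into
// the MovieRating class (Id, Title, Average) defined at the end of this thread.
var result = _moviesDbContext.Ratings.GroupBy(r => r.MovieId).Select(group => new MovieRating
{
Id = group.Key,
Title = group.Select(g => g.Movie.Title).FirstOrDefault(),
Average = group.Average(g => g.RatingValue)
}).OrderByDescending(s => s.Average).ThenBy(s => s.Title).Take(5).ToList();
return result;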
}
This will exclude the movies having no ratings.
But if you do as follows (as artista_14's answer):
public IEnumerable<Movie> GetTopFive()
{
var result = _moviesDbContext.Movies.GroupBy(x => new { x.Id, x.Title })
.Select(x => new {
Id = x.Key.Id,
Title = x.Key.Title,
Average = x.Average(y => y.Ratings.Sum(z => z.RatingValue))
}).OrderByDescending(x => x.Average).ThenBy(x => x.Title).Take(5).ToList();
return result;
}
this will include the movies having no ratings also.
Note: I see your Rating model class does not contain any Movie navigation property. Please add this as follows:
public class Rating
{
public int Id { get; set; }
public int MovieId { get; set; }
public int UserId { get; set; }
public decimal RatingValue { get; set; }
public Movie Movie { get; set; }
}
and finally this is the code working nicely:
var data = _moviesDbContext.Movies.Include(x => x.Ratings)
.Select(x => new MovieRating
{
Id = x.Id,
Title = x.Title,
Average = x.Ratings.Average(y => y.RatingValue)
}).OrderByDescending(x => x.Average).ThenBy(x => x.Title).Take(5).ToList();
return data;
The problem was creating an anonymous type in the select, so this line resolves the issue: .Select(x => new MovieRating
And this is the complete code for the method and the new class I have created to map the select fields with a concrete type:
public class MovieRepository : IMovieRepository
{
private readonly MovieDbContext _moviesDbContext;
public MovieRepository(MovieDbContext moviesDbContext)
{
_moviesDbContext = moviesDbContext;
}
public IEnumerable<Movie> GetAll()
{
return _moviesDbContext.Movies;
}
public IEnumerable<MovieRating> GetTopFive()
{
var result = _moviesDbContext.Movies.Include(x => x.Ratings)
.Select(x => new MovieRating
{
Id = x.Id,
Title = x.Title,
Average = x.Ratings.Average(y => y.RatingValue)
}).OrderByDescending(x => x.Average).ThenBy(x => x.Title).Take(5).ToList();
return result;
}
}
public class MovieRating
{
public int Id { get; set; }
public string Title { get; set; }
public decimal Average { get; set; }
}
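How movies without ratings behave is easy to check in memory. The sketch below uses plain LINQ to Objects, not EF Core, so it says nothing about the generated SQL. TopRatedDemo and its simplified Movie class are hypothetical. The nullable cast makes Average return null for an empty rating list instead of throwing, and nulls sort last when ordering descending.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopRatedDemo
{
    // Hypothetical simplified movie: ratings are plain decimals.
    public sealed class Movie
    {
        public string Title { get; set; } = "";
        public List<decimal> Ratings { get; set; } = new List<decimal>();
    }

    // Highest average first, ties broken alphabetically by title, at most five rows.
    public static List<(string Title, decimal? Average)> TopFive(IEnumerable<Movie> movies) =>
        movies
            .Select(m => (m.Title, Average: m.Ratings.Average(r => (decimal?)r)))
            .OrderByDescending(x => x.Average)
            .ThenBy(x => x.Title)
            .Take(5)
            .ToList();
}
```

For "Brazil" with ratings 4 and 5, "Alien" with 4 and 5, "Casablanca" with 5, and "Unrated" with no ratings, the order is Casablanca (5), Alien (4.5), Brazil (4.5), Unrated (null).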

How to count parent's property with linq

I have 2 object collections looking like this
public class Meter
{
public string UID { get; set; }
public string NR { get; set; }
public List<GSMData> data { get; set; }
}
public class GSMData : Meter
{
public DateTime TimeStamp { get; set; }
public int CellID { get; set; }
}
public static List<Meter> GetMeterUIDList()
{
return meters.Values.ToList();
}
public static List<GSMData> GetGsmdataList()
{
return meters.Values.SelectMany(m => m.data)
.OrderBy(t => t.TimeStamp)
.ToList();
}
I need to get all NR for each CellId and a count on how many NR there are on each CellID.
How can I do that?
Perhaps:
var idGroups = meters
.SelectMany(m => m.data)
.GroupBy(d => d.CellID)
.Select(g => new { CellID = g.Key, UniqueNr = g.Select(m => m.NR).Distinct() });
foreach (var g in idGroups)
Console.WriteLine("CellID: {0} Count: {1}", g.CellID, g.UniqueNr.Count());
If the NRs don't need to be unique, remove the Distinct.
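If the NR should come from the parent meter rather than from the data row itself, SelectMany with a result selector can pair each row with its meter. This is a sketch under a simplified, hypothetical model in which the GSM rows do not inherit from Meter. CellGroupingDemo and its members are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CellGroupingDemo
{
    // Hypothetical simplified model: a GSM row knows only its cell.
    public sealed class GsmRow
    {
        public int CellID { get; set; }
    }

    public sealed class Meter
    {
        public string NR { get; set; } = "";
        public List<GsmRow> Data { get; set; } = new List<GsmRow>();
    }

    // Maps each CellID to the distinct NRs of the meters that reported it.
    public static Dictionary<int, List<string>> NrByCell(IEnumerable<Meter> meters) =>
        meters
            // Pair every data row with the NR of the meter that owns it.
            .SelectMany(m => m.Data, (m, d) => new { m.NR, d.CellID })
            .GroupBy(x => x.CellID, x => x.NR)
            .ToDictionary(g => g.Key, g => g.Distinct().ToList());
}
```

The count per cell is then result[cellId].Count, and dropping Distinct() keeps repeated NRs.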
