LINQ to Entities - create median value in grouped data - c#

I have a LINQ to Entities query that groups data and computes some aggregates at the same time. It works except for the median calculation. The median is the middle value of the sorted column (the element at position count / 2). Here is my example:
private void button2_Click(object sender, EventArgs e)
{
var query = from t in _database.jon_export
orderby t.businessEmployeeCount
group t by t.county.ToString() into g
where g.Count() > 0
select new
{
County = g.Key,
CountValue = g.Count(),
BusinessEmployeeCount = g.Count(),
BusinessEmployeeAverageValue = g.Average(x => x.businessEmployeeCount),
//Median value from businessEmployeeCount column
BusinessRevenueAverageValue = g.Average(x => x.businessRevenue),
BusinessTurnover=g.Average(x => x.businessTurnover),
BooiqEconomicWellBeing=g.Average(x=>x.booiqEconomicWellBeing)
};
this.dataGridView1.DataSource = query.ToList();
}

Write an extension method on IEnumerable<int> and call it in your query on the right property.
public static double Median(this IEnumerable<int> items)
{
var data = items.OrderBy(n => n).ToArray();
if (data.Length % 2 == 0)
return (data[data.Length / 2 - 1] + data[data.Length / 2]) / 2.0;
return data[data.Length / 2];
}
Then add the following to the projection in the snippet above:
businessEmployeeMedian = g.Select(x => x.businessEmployeeCount).Median(),
To skip rows where businessEmployeeCount is null:
businessEmployeeMedian = g.Where(x => x.businessEmployeeCount.HasValue).Select(x => (int)x.businessEmployeeCount).Median()
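As a rough, self-contained sketch of the same idea in LINQ to Objects: the Median above would throw on an empty sequence (for example, a group where every businessEmployeeCount is null), so this variant returns null in that case. The names MedianOrNull and MedianByCounty and the tuple shape are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MedianSketch
{
    // Hypothetical variant of Median: returns null for an empty sequence
    // instead of throwing an IndexOutOfRangeException.
    public static double? MedianOrNull(this IEnumerable<int> items)
    {
        var data = items.OrderBy(n => n).ToArray();
        if (data.Length == 0)
            return null;
        int mid = data.Length / 2;
        return data.Length % 2 == 0
            ? (data[mid - 1] + data[mid]) / 2.0   // even count: mean of the two middle values
            : data[mid];                          // odd count: the middle value
    }

    // Groups in-memory rows by county and computes the median per county,
    // ignoring rows whose employee count is null.
    public static Dictionary<string, double?> MedianByCounty(
        IEnumerable<(string County, int? EmployeeCount)> rows)
    {
        return rows
            .GroupBy(r => r.County)
            .ToDictionary(
                g => g.Key,
                g => g.Where(r => r.EmployeeCount.HasValue)
                      .Select(r => r.EmployeeCount.Value)
                      .MedianOrNull());
    }
}
```

For a county with counts 1, 4, 3 and one null, this yields 3; a county with only null counts yields null.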
Complete LINQ in method syntax. Median() is a custom method that LINQ to Entities cannot translate to SQL, so switch to LINQ to Objects with AsEnumerable() before grouping (note that this loads the rows into memory):
var items = _database.jon_export
    .OrderBy(item => item.businessEmployeeCount)
    .AsEnumerable() // everything below runs in memory
    .GroupBy(item => item.county.ToString())
    .Where(g => g.Any())
    .Select(g => {
        return new
        {
            County = g.Key,
            CountValue = g.Count(),
            BusinessEmployeeCount = g.Count(),
            BusinessEmployeeAverageValue = g.Average(x => x.businessEmployeeCount),
            businessEmployeeMedian = g.Where(x => x.businessEmployeeCount.HasValue).Select(x => (int)x.businessEmployeeCount).Median(),
            BusinessRevenueAverageValue = g.Average(x => x.businessRevenue),
            BusinessTurnover = g.Average(x => x.businessTurnover),
            BooiqEconomicWellBeing = g.Average(x => x.booiqEconomicWellBeing)
        };
    }).ToList();

Related

GroupBy performs slowly

I have the following query, which is very slow: it processes 3000 records and produces 370 entries. How can I improve its performance?
dealerResults = _results.GroupBy(x => new { x.DealerName, x.DealerId })
.Select(x => new MarketingReportResults()
{
DealerId = x.Key.DealerId,
DealerName = x.Key.DealerName,
LinkedTotal = linkedLeadCores.Count(y => y.DealerId == x.Key.DealerId),
LeadsTotal = x.Count(),
SalesTotal = x.Count(y => y.IsSold),
Percent = (decimal)(x.Count() * 100) / count,
ActiveTotal = x.Count(y => y.IsActive),
}).ToList();
I think linkedLeadCores.Count() is the bottleneck here, because it loops through the entire linkedLeadCores list once for every group produced from _results. Your comments seem to confirm this assumption.
To remove the bottleneck, you could build a map (a dictionary) that holds the count for each dealer before doing anything with _results, like this ...
var linkedLeadCoresCountMap = linkedLeadCores
.GroupBy(y => y.DealerId )
.ToDictionary(y => y.Key, y => y.Count());
... and then you could write
LinkedTotal = linkedLeadCoresCountMap.ContainsKey(x.Key.DealerId) ?
linkedLeadCoresCountMap[x.Key.DealerId] : 0,
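A minimal sketch of the count-map idea on plain data, assuming DealerId is an int (the question does not show its type). It also uses TryGetValue, which does the lookup once instead of the two lookups made by ContainsKey plus the indexer. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CountMapSketch
{
    // Builds dealerId -> number of linked leads with a single pass over the ids.
    public static Dictionary<int, int> BuildCountMap(IEnumerable<int> linkedDealerIds)
    {
        return linkedDealerIds
            .GroupBy(id => id)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // Returns the stored count, or 0 when the dealer has no linked leads.
    public static int LinkedTotal(Dictionary<int, int> map, int dealerId)
    {
        return map.TryGetValue(dealerId, out var count) ? count : 0;
    }
}
```

The map is built once, so each group in the report query pays for one hash lookup rather than a scan of the whole list.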
A group join against linkedLeadCores uses an internal hash table for the lookup and should solve your problem.
var dealerResults =
(from r in _results.GroupBy(x => new { x.DealerName, x.DealerId })
join llc in linkedLeadCores on r.Key.DealerId equals llc.DealerId into g
select new MarketingReportResults()
{
DealerId = r.Key.DealerId,
DealerName = r.Key.DealerName,
LinkedTotal = g.Count(),
LeadsTotal = r.Count(),
SalesTotal = r.Count(y => y.IsSold),
Percent = (decimal)(r.Count() * 100) / count,
ActiveTotal = r.Count(y => y.IsActive),
}).ToList();

How to rank a list with original order in c#

I want to rank the values in a list and output the ranks in the original order.
This is my code so far:
var data = new[] { 7.806468478, 7.806468478, 7.806468478, 7.173501754, 7.173501754, 7.173501754, 3.40877696, 3.40877696, 3.40877696,
4.097010736, 4.097010736, 4.097010736, 4.036494085, 4.036494085, 4.036494085, 38.94333318, 38.94333318, 38.94333318, 14.43588131, 14.43588131, 14.43588131 };
var rankings = data.OrderByDescending(x => x)
.GroupBy(x => x)
.SelectMany((g, i) =>
g.Select(e => new { Col1 = e, Rank = i + 1 }))
.ToList();
However, the result is ordered by value, descending.
What I want is to display the ranks in the original order of the data,
e.g.: Rank = 3, Rank = 3, Rank = 3, Rank = 4, Rank = 4, Rank = 4, etc.
Thank you.
Using what you have, one method would be to keep track of the original order and sort a second time (ugly and potentially slow):
var rankings = data.Select((x, i) => new {Item = x, Index = i})
.OrderByDescending(x => x.Item)
.GroupBy(x => x.Item)
.SelectMany((g, i) =>
g.Select(e => new {
Index = e.Index,
Item = new { Col1 = e.Item, Rank = i + 1 }
}))
.OrderBy(x => x.Index)
.Select(x => x.Item)
.ToList();
I would instead suggest creating a dictionary with your rankings and joining this back with your list:
var rankings = data.Distinct()
.OrderByDescending(x => x)
.Select((g, i) => new { Key = g, Rank = i + 1 })
.ToDictionary(x => x.Key, x => x.Rank);
var output = data.Select(x => new { Col1 = x, Rank = rankings[x] })
.ToList();
As @AntonínLejsek kindly pointed out, replacing the above GroupBy call with Distinct() is the way to go.
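To make the dictionary approach concrete, here is a small self-contained sketch that returns only the ranks, in the original order. The class and method names are hypothetical, and the sketch relies on exact equality of doubles as dictionary keys, which holds for repeated literals like the ones in the question but not necessarily for computed values.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DenseRankSketch
{
    // Assigns rank 1 to the largest distinct value, 2 to the next, and so on,
    // then maps every element of the input back to its rank without reordering.
    public static int[] RankInOriginalOrder(IReadOnlyList<double> data)
    {
        var rankByValue = data
            .Distinct()
            .OrderByDescending(x => x)
            .Select((value, i) => new { value, rank = i + 1 })
            .ToDictionary(x => x.value, x => x.rank);

        return data.Select(x => rankByValue[x]).ToArray();
    }
}
```

For the input { 7.8, 7.8, 3.4, 38.9, 14.4 } this gives the ranks 3, 3, 4, 1, 2.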
Note doubles are not a precise type and thus are really not a good candidate for values in a lookup table, nor would I recommend using GroupBy/Distinct with a floating-point value as a key. Be mindful of your precision and consider using an appropriate string conversion. In light of this, you may want to define an epsilon value and forgo LINQ's GroupBy entirely, opting instead to encapsulate each data point into a (non-anonymous) reference type, then loop through a sorted list and assign ranks. For example (disclaimer: untested):
class DataPoint
{
    // double (not decimal) to match the source data and the epsilon comparison below
    public double Value { get; set; }
    public int Rank { get; set; }
}
var dataPointsPreservingOrder = data.Select(x => new DataPoint {Value = x}).ToList();
var sortedDescending = dataPointsPreservingOrder.OrderByDescending(x => x.Value).ToList();
var epsilon = 1E-15; //use a value that makes sense here
int rank = 0;
double? currentValue = null;
foreach(var x in sortedDescending)
{
if(currentValue == null || Math.Abs(x.Value - currentValue.Value) > epsilon)
{
currentValue = x.Value;
++rank;
}
x.Rank = rank;
}
Looking at the data, you will need to iterate over it twice.
The first pass captures the rankings:
var sorted = data
.OrderByDescending(x => x)
.GroupBy(x => x)
.Select((g, i) => new { Col1 = g.First(), Rank = i + 1 })
.ToList();
Now we have a ranking of highest to lowest with the correct rank value. Next we iterate the data again to find where the value exists in the overall ranks as:
var rankings = (from i in data
let rank = sorted.First(x => x.Col1 == i)
select new
{
Col1 = i,
Rank = rank.Rank
}).ToList();
This results in a ranked list in the original order of the data.
A bit shorter:
var L = data.Distinct().ToList(); // because SortedSet<T> doesn't have BinarySearch :[
L.Sort();
var rankings = Array.ConvertAll(data,
x => new { Col1 = x, Rank = L.Count - L.BinarySearch(x) });

Preserve order with linq after groupby and selectmany

Is there a way to preserve the order after this linq expression?
var results =
DateList
.GroupBy(x => x.Date.Subtract(firstDay).Days / 7 + 1)
.SelectMany(gx => gx, (gx, x) => new {Week = gx.Key,DateTime =x,Count = gx.Count(),});
I found "Preserving order with LINQ", but I'm not sure whether it's the GroupBy or the SelectMany causing the issue.
Yes. First project your DateList together with an index, using the overload of .Select whose delegate takes a second (int) parameter that receives the index of each item in the sequence:
DateList
.Select((dateTime, idx) => new {dateTime, idx})
.GroupBy(x => x.dateTime.Date.Subtract(firstDay).Days / 7 + 1)
...and persist the value through the linq chain
.SelectMany(gx => gx, (gx, x) => new {Week = gx.Key,
DateTime = x.dateTime,
Count = gx.Count(),
x.idx})
...then use it to re-order the output
.OrderBy(x => x.idx)
...and strip it from your final selection
.Select(x => new {x.Week, x.DateTime, x.Count});
then you can maintain the same order as the original list.
@spender's solution is good, but can it be done without OrderBy? It can, because we can use the index to write directly into an array, but it will no longer be a single LINQ query:
var resultsTmp =
DateList.Select((d, i) => new { d, i })
.GroupBy(x => x.d.Date.Subtract(firstDay).Days / 7 + 1)
.SelectMany(gx => gx, (gx, x) => new { Week = gx.Key, DateTime = x.d, Count = gx.Count(), x.i })
.ToArray();
var resultsTmp2 = resultsTmp.ToArray();
foreach (var r in resultsTmp) { resultsTmp2[r.i] = r; };
var results = resultsTmp2.Select(r => new { r.Week, r.DateTime, r.Count });
It looks a bit complex. I would probably do something more straightforward like:
var DateList2 = DateList.Select(d => new { DateTime = d, Week = d.Subtract(firstDay).Days / 7 + 1 }).ToArray();
var weeks = DateList2.GroupBy(d => d.Week).ToDictionary(k => k.Key, v => v.Count());
var results = DateList2.Select(d2 => new { d2.Week, d2.DateTime, Count = weeks[d2.Week] });
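The simpler variant can be sketched as a self-contained method. This is only an illustration, assuming DateList is a sequence of DateTime values; the names WeekCountSketch and Annotate are hypothetical, and the tuple field is called Date rather than DateTime to avoid clashing with the type name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeekCountSketch
{
    // Tags each date with its week number and the number of dates in that week,
    // keeping the dates in their original order (no GroupBy on the output path).
    public static List<(int Week, DateTime Date, int Count)> Annotate(
        IEnumerable<DateTime> dates, DateTime firstDay)
    {
        var withWeek = dates
            .Select(d => (Date: d, Week: d.Date.Subtract(firstDay).Days / 7 + 1))
            .ToArray();

        // One pass to count the dates per week.
        var countByWeek = withWeek
            .GroupBy(x => x.Week)
            .ToDictionary(g => g.Key, g => g.Count());

        // Second pass over the original order, looking the count up by week.
        return withWeek
            .Select(x => (x.Week, x.Date, countByWeek[x.Week]))
            .ToList();
    }
}
```

Because the output is produced by iterating the original sequence, no index bookkeeping or final OrderBy is needed.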

Order by Total Value on LinQ

I want to order the result of my LINQ GroupBy statement by Total, descending, so that the item with the largest total comes first, but I can't make it work.
This is my LINQ:
foreach (var item in db
.Pos.Where(r => r.Fecha.Day <= today.Day)
.Select(g => new { Pdv = g.Pdv, Total = g.Total })
.GroupBy(l => l.Pdv)
.AsEnumerable()
.Select(z => new {
Punto_De_Venta=z.Key,
Total = String.Format("{0:$#,##0.00;($#,##0.00);Zero}",
Decimal.Round(z.Sum(l => l.Total), 0))
}))
{
listadepuntos.Add(item.ToString());
}
var grupoPdv = new SelectList(listadepuntos.ToList());
ViewBag.GroupS = grupoPdv;
The output of my LINQ statement is:
Punto_De_Venta = Central, Total = 42,143.00
Punto_De_Venta = Restaurante, Total = 189,949.00
Punto_De_Venta = Venta Moto, Total = 89,678.00
And the output I'm looking for is:
Punto_De_Venta = Restaurante, Total = 189,949.00
Punto_De_Venta = Venta Moto, Total = 89,678.00
Punto_De_Venta = Central, Total = 42,143.00
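How can I do this? I can't find a way to make it work.
A List<T> preserves order, so sort the list before passing it to your SelectList. This assumes listadepuntos holds objects with a numeric Total property; the strings produced by item.ToString() in the question cannot be sorted this way.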
var grupoPdv = new SelectList(listadepuntos.OrderByDescending(l=>l.Total).ToList());
ViewBag.GroupS = grupoPdv;
Another approach: modify the source query to return a sorted list.
var results = db.Pos.Where(r => r.Fecha.Day <= today.Day)
.Select(g => new { Pdv = g.Pdv, Total = g.Total })
.GroupBy(l => l.Pdv).AsEnumerable()
.Select(z => new { Punto_De_Venta=z.Key, Total = String.Format("{0:$#,##0.00;($#,##0.00);Zero}", Decimal.Round(z.Sum(l => l.Total), 0))})
.OrderByDescending(l=>l.Total) // caution: Total is a formatted string here, so this sorts as text, not by numeric value
.ToList();
Once you have the sorted list, you can create your SelectList from it.
var grupoPdv = new SelectList(results);
ViewBag.GroupS = grupoPdv;
You'll need to do something like this:
foreach (var item in db.Pos.Where(r => r.Fecha.Day <= today.Day)
.Select(g => new { Pdv = g.Pdv, Total = g.Total })
.GroupBy(l => l.Pdv)
.AsEnumerable()
.Select(z => new { Punto_De_Venta = z.Key, Total = z.Sum(l => l.Total) })
.OrderByDescending(r => r.Total)
.Select(r => new { Punto_De_Venta = r.Punto_De_Venta, Total = String.Format("{0:$#,##0.00;($#,##0.00);Zero}", Decimal.Round(r.Total, 0)) }))
{
listadepuntos.Add(item.ToString());
}
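The key point is to sort on the numeric sum and only then format it as text. A small in-memory sketch of that ordering, with hypothetical names (TotalsSketch, FormatSortedTotals) and a simplified output format using the invariant culture so the result does not depend on machine settings:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TotalsSketch
{
    // Sums the totals per point of sale, sorts by the numeric sum (largest first),
    // and formats each line only after sorting.
    public static List<string> FormatSortedTotals(
        IEnumerable<(string Pdv, decimal Total)> sales)
    {
        return sales
            .GroupBy(s => s.Pdv)
            .Select(g => new { Pdv = g.Key, Total = g.Sum(s => s.Total) })
            .OrderByDescending(x => x.Total)   // numeric comparison, not string comparison
            .Select(x => string.Format(CultureInfo.InvariantCulture,
                "{0}: {1:#,##0.00}", x.Pdv, Math.Round(x.Total, 0)))
            .ToList();
    }
}
```

Sorting the formatted strings instead would compare them character by character, which would place "89,678.00" ahead of "189,949.00" in a descending sort.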

Translating SQL to lambda with groupby

I'm trying to translate this SQL statement
SELECT row, SUM(value) as VarSum, AVG(value) as VarAve, COUNT(value) as TotalCount
FROM MDNumeric
WHERE collectionid = 6 and varname in ('C3INEV1', 'C3INEVA2', 'C3INEVA3', 'C3INVA11', 'C3INVA17', 'C3INVA19')
GROUP BY row
into an EF 4 query using lambda expressions, and I am missing something.
I have:
sumvars = sv.staticvararraylist.Split(',');
var aavresult = _myIFR.MDNumerics
.Where(r => r.collectionid == _collid)
.Where(r => sumvars.Contains(r.varname))
.GroupBy(r1 =>r1.row)
.Select(rg =>
new
{
Row = rg.Key,
VarSum = rg.Sum(p => p.value),
VarAve = rg.Average(p => p.value),
TotalCount = rg.Count()
});
where the staticvararraylist has the string 'C3INEV1', 'C3INEVA2', 'C3INEVA3', 'C3INVA11', 'C3INVA17', 'C3INVA19' (without single quotes) and the _collid variable = 6.
While I'm getting the correct grouping, my sum, average, & count values aren't correct.
You didn't post your error message, but I suspect it's related to Contains. I've found that Any works just as well.
This should get you quite close:
var result =
from i in _myIFR.MDNumerics
where i.collectionid == _collid && sumvars.Any(v => i.varname == v)
group i by i.row into g
select new {
row = g.Key,
VarSum = g.Sum(p => p.value),
VarAve = g.Average(p => p.value),
TotalCount = g.Count()
};
Try this:
var aavresult = _myIFR.MDNumerics
.Where(r => r.collectionid == _collid && sumvars.Contains(r.varname))
.GroupBy(r1 =>r1.row,
(key,res) => new
{
Row = key,
VarSum = res.Sum(r1 => r1.value),
VarAve = res.Average(r1 => r1.value),
TotalCount = res.Count()
});
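One thing that may be worth checking, although it is only a guess because the question does not show the exact contents of staticvararraylist: if the names are separated by a comma followed by a space, Split(',') leaves a leading space on every name after the first, and those names will never match varname. The sketch below trims the names and runs the same grouping in memory so the aggregates can be compared against the SQL output. All names in it (VarNameFilterSketch, ParseNames, AggregateByRow, the tuple fields) are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class VarNameFilterSketch
{
    // Splits a comma-separated list and trims each entry, so "A, B" yields
    // "A" and "B" rather than "A" and " B".
    public static string[] ParseNames(string list)
    {
        return list.Split(',')
                   .Select(s => s.Trim())
                   .Where(s => s.Length > 0)
                   .ToArray();
    }

    // In-memory equivalent of the SQL: filter by name, group by row,
    // then compute SUM, AVG and COUNT per group.
    public static Dictionary<int, (decimal Sum, decimal Average, int Count)> AggregateByRow(
        IEnumerable<(int Row, string VarName, decimal Value)> rows, string[] names)
    {
        return rows
            .Where(r => names.Contains(r.VarName))
            .GroupBy(r => r.Row)
            .ToDictionary(
                g => g.Key,
                g => (g.Sum(r => r.Value), g.Average(r => r.Value), g.Count()));
    }
}
```

If the trimmed and untrimmed name lists give different counts, the filter rather than the GroupBy is the likely source of the discrepancy.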
