Here is my first datatable dt
sscode scons cscons cstagged
A 10 2 20
A 10 2 20
B 10 2 40
Here is my second datatable dt1
Unit sscode
A101 A
A101 A
B101 B
and I want this output:
Unit scons cscons cstagged
A101 20 4 40
I'm getting an error while executing this query.
Here is my code
IEnumerable<DataRow> result = from data1 in dt.AsEnumerable()
join data2 in dt1.AsEnumerable()
on data1.Field<string>("sscode") equals
data2.Field<string>("substation_code")
group data2.Field<string>("Unit") by new {unit= data2.Field<string>("Unit")} into grp
orderby grp.Key.unit
select new
{
unit = grp.Key.unit,
sscons = grp.Sum(s => s.Field<string>("cscons")),
cscons = grp.Sum(s => s.Field<string>("cscons")),
cstagged = grp.Sum(s => s.Field<string>("cstagged"))
};
result.CopyToDataTable();
The problem with your current code is that grp holds only the values you grouped (here the Unit strings taken from the second DataTable), so you cannot reach the columns of the first DataTable through it.
If I have understood your question correctly, this should give you the expected output:
var result = from data2 in dt1.AsEnumerable()
             group data2 by data2.Field<string>("Unit") into g
             select new { Unit = g.Key, dt2Obj = g.FirstOrDefault() } into t3
             let filteredData1 = dt.AsEnumerable()
                 .Where(x => x.Field<string>("sscode") == t3.dt2Obj.Field<string>("sscode"))
             select new
             {
                 unit = t3.Unit,
                 sscons = filteredData1.Sum(s => s.Field<int>("scons")),
                 cscons = filteredData1.Sum(s => s.Field<int>("cscons")),
                 cstagged = filteredData1.Sum(s => s.Field<int>("cstagged"))
             };
First we group by Unit in the second DataTable (as that is the group we need), then we project the entire row to get its sscode by using FirstOrDefault. After that, we simply filter the first table on the sscode obtained from each group and project the sums.
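To check the approach above against the sample tables from the question, here is a minimal, self-contained sketch. The column types (int for the numeric columns) and the names totals and matching are assumptions made for illustration; the real tables may store these values as strings or decimals.

```csharp
using System;
using System.Data;
using System.Linq;

// First table from the question (numeric columns assumed to be int).
var dt = new DataTable();
dt.Columns.Add("sscode", typeof(string));
dt.Columns.Add("scons", typeof(int));
dt.Columns.Add("cscons", typeof(int));
dt.Columns.Add("cstagged", typeof(int));
dt.Rows.Add("A", 10, 2, 20);
dt.Rows.Add("A", 10, 2, 20);
dt.Rows.Add("B", 10, 2, 40);

// Second table from the question: maps each Unit to an sscode.
var dt1 = new DataTable();
dt1.Columns.Add("Unit", typeof(string));
dt1.Columns.Add("sscode", typeof(string));
dt1.Rows.Add("A101", "A");
dt1.Rows.Add("A101", "A");
dt1.Rows.Add("B101", "B");

var totals = dt1.AsEnumerable()
    // One group per Unit, so duplicate mapping rows do not multiply the sums.
    .GroupBy(r => r.Field<string>("Unit"))
    .Select(g => new { Unit = g.Key, Code = g.First().Field<string>("sscode") })
    .Select(u =>
    {
        // Rows of the first table that belong to this unit's sscode.
        var matching = dt.AsEnumerable()
            .Where(r => r.Field<string>("sscode") == u.Code)
            .ToList();
        return new
        {
            u.Unit,
            scons = matching.Sum(r => r.Field<int>("scons")),
            cscons = matching.Sum(r => r.Field<int>("cscons")),
            cstagged = matching.Sum(r => r.Field<int>("cstagged"))
        };
    })
    .ToList();

foreach (var t in totals)
    Console.WriteLine($"{t.Unit} {t.scons} {t.cscons} {t.cstagged}");
// prints:
// A101 20 4 40
// B101 10 2 40
```

Note that a plain join of the two tables would pair each of the two "A" rows in dt with each of the two "A101" rows in dt1 and so double the sums; grouping the mapping table first avoids that.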
First, you have to group the whole row rather than a single field; otherwise only the grouped field (here Unit) is available inside the group.
Second, you cannot sum strings, only numeric fields (int, double, ...).
I'm not fluent in the inline LINQ syntax, so I've changed it to a method chain.
var result =
    dt.AsEnumerable()
        .Join(dt1.AsEnumerable(),
            data1 => data1.Field<string>("sscode"),
            data2 => data2.Field<string>("substation_code"),
            (data1, data2) => new { data1, data2 })
        // Group the rows of the first table by the Unit of the second table.
        .GroupBy(t => new { unit = t.data2.Field<string>("Unit") },
            t => t.data1)
        .Select(grp => new
        {
            unit = grp.Key.unit,
            sscons = grp.Sum(s => s.Field<int>("sscons")),
            cscons = grp.Sum(s => s.Field<int>("cscons")),
            cstagged = grp.Sum(s => s.Field<int>("cstagged"))
        });
Note: Be aware that you cannot call CopyToDataTable on the result of this query, because it yields anonymous objects rather than DataRow instances.
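If a DataTable is still needed, one option is to fill it by hand from the projected objects. This is only a sketch: the summary array stands in for the query result, and the output table, its column names and its int column types are assumptions.

```csharp
using System;
using System.Data;
using System.Linq;

// Stand-in for the anonymous objects produced by the grouped query.
var summary = new[]
{
    new { unit = "A101", sscons = 20, cscons = 4, cstagged = 40 }
};

// Define the target schema explicitly, then copy one row per result object.
var output = new DataTable();
output.Columns.Add("Unit", typeof(string));
output.Columns.Add("scons", typeof(int));
output.Columns.Add("cscons", typeof(int));
output.Columns.Add("cstagged", typeof(int));

foreach (var item in summary)
    output.Rows.Add(item.unit, item.sscons, item.cscons, item.cstagged);

Console.WriteLine(output.Rows.Count); // prints 1
```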
Update
Since I understand that your fields are stored as strings, you should use Convert.ToInt32:
grp.Sum(s => Convert.ToInt32(s.Field<string>("cscons")))
Update 2
As per the chat - it seems that the values are decimal and not ints:
sscons = grp.Sum(s => s.Field<decimal>("sscons")),
cscons = grp.Sum(s => s.Field<decimal>("cscons")),
cstagged = grp.Sum(s => s.Field<decimal>("cstagged"))
Related
I wrote a SQL query that gets the count of tickets, the count of closed tickets, and the closure rate (%), grouped by month (current year), and I would like to express it as a LINQ query that achieves the same result.
SELECT *, (ClosedCount * 100 / TicketCount) AS ClosureRate FROM (
SELECT COUNT(Id) as TicketCount, MONTH(InsertDate) as MonthNumber, DATENAME(MONTH, E1.InsertDate) as MonthName,
(SELECT COUNT(Id) FROM EvaluationHistoryTable E2 WHERE TicketStatus = 'CLOSED' AND YEAR(E2.InsertDate) = '2021') AS 'ClosedCount'
FROM EvaluationHistoryTable E1
WHERE YEAR(E1.InsertDate) = 2021
GROUP BY MONTH(InsertDate), DATENAME(MONTH, E1.InsertDate));
This is code that I'm working on:
var ytdClosureRateData = _context.EvaluationHistoryTable
.Where(t => t.InsertDate.Value.Year == DateTime.Now.Year)
.GroupBy(m => new
{
Month = m.InsertDate.Value.Month
})
.Select(g => new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(g.Key.Month),
MonthNumber = g.Key.Month,
ItemCount = g.Count(),
ClosedCount = // problem
ClosureRate = // problem
}).AsEnumerable()
.OrderBy(a => a.MonthNumber)
.ToList();
I am having trouble expressing the count of closed tickets (ClosedCount) in LINQ; I need that count to calculate the ClosureRate.
This won't be the same SQL but it should produce the same result in memory:
var ytdClosureRateData = _context.EvaluationHistoryTable
.Where(t => t.InsertDate.Value.Year == DateTime.Now.Year)
.GroupBy(m => new
{
Month = m.InsertDate.Value.Month
})
.Select(g => new
{
Month = g.Key.Month,
ItemCount = g.Count(),
ClosedCount = g.Where(t => t.TicketStatus == "CLOSED").Count()
}).OrderBy(a => a.Month)
.ToList()
.Select(x => new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(x.Month),
MonthNumber = x.Month,
ItemCount = x.ItemCount,
ClosedCount = x.ClosedCount,
ClosureRate = x.ClosedCount * 100D / x.ItemCount
})
.ToList();
Two techniques have been implemented here:
Use fluent (method) syntax to specify the filter applied to the ClosedCount set. You can combine fluent and query syntax to your heart's content; each has pros and cons, and in this instance it simply made the syntax simpler.
Focus the DB query on bringing back only the data that you need; the rest can easily be calculated in memory after the initial DB execution. That is why there are two projections here: the first should be translated purely into SQL, and the rest is evaluated as LINQ to Objects.
The general assumption is that traffic over the wire and serialization are the bottlenecks for simple queries like this, so we force LINQ to Entities (or LINQ to SQL) to produce the smallest payload that is practical and build the rest of the values and calculations in memory.
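The two-step shape can be tried without a database. The sketch below runs entirely as LINQ to Objects over a made-up tickets array (the data and the names tickets and rates are assumptions), so it only illustrates the grouping, the conditional count and the rate calculation, not the SQL that a provider would generate.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for EvaluationHistoryTable.
var tickets = new[]
{
    new { InsertDate = new DateTime(2021, 1, 5), TicketStatus = "CLOSED" },
    new { InsertDate = new DateTime(2021, 1, 9), TicketStatus = "OPEN" },
    new { InsertDate = new DateTime(2021, 2, 3), TicketStatus = "CLOSED" },
    new { InsertDate = new DateTime(2021, 2, 4), TicketStatus = "CLOSED" }
};

var rates = tickets
    .GroupBy(t => t.InsertDate.Month)
    // First projection: only the raw counts per month.
    .Select(g => new
    {
        Month = g.Key,
        ItemCount = g.Count(),
        ClosedCount = g.Count(t => t.TicketStatus == "CLOSED")
    })
    .OrderBy(x => x.Month)
    // Second projection: derived values computed from the counts.
    .Select(x => new
    {
        x.Month,
        x.ItemCount,
        x.ClosedCount,
        ClosureRate = x.ClosedCount * 100D / x.ItemCount
    })
    .ToList();

foreach (var r in rates)
    Console.WriteLine($"{r.Month}: {r.ClosedCount}/{r.ItemCount} = {r.ClosureRate}%");
// prints:
// 1: 1/2 = 50%
// 2: 2/2 = 100%
```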
UPDATE:
Svyatoslav Danyliv makes a really good point in this answer
The logic can be simplified, from both an SQL and LINQ perspective by using a CASE expression on the TicketStatus to return 1 or 0 and then we can simply sum that column, which means you can avoid a nested query and can simply join on the results.
Original query can be simplified to this one:
SELECT *,
(ClosedCount * 100 / TicketCount) AS ClosureRate
FROM (
SELECT
COUNT(Id) AS TicketCount,
MONTH(InsertDate) AS MonthNumber,
DATENAME(MONTH, E1.InsertDate) AS MonthName,
SUM(CASE WHEN TicketStatus = 'CLOSED' THEN 1 ELSE 0 END) AS 'ClosedCount'
FROM EvaluationHistoryTable E1
WHERE YEAR(E1.InsertDate) = 2021
GROUP BY MONTH(InsertDate), DATENAME(MONTH, E1.InsertDate));
Which is easily convertible to server-side LINQ:
var grouped =
from eh in _context.EvaluationHistoryTable
where eh.InsertDate.Value.Year == DateTime.Now.Year
group eh by new { eh.InsertDate.Value.Month } into g
select new
{
g.Key.Month,
ItemCount = g.Count(),
ClosedCount = g.Sum(t => t.TicketStatus == "CLOSED" ? 1 : 0)
};
var query =
from x in grouped
orderby x.Month
select new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(x.Month),
MonthNumber = x.Month,
ItemCount = x.ItemCount,
ClosedCount = x.ClosedCount,
ClosureRate = x.ClosedCount * 100D / x.ItemCount
};
var result = query.ToList();
I am able to produce a desirable set of results, but I need to group them and sum some of the fields, and I am struggling to understand how to approach this.
In my scenario, what would be the best way to get results that will:
Have a distinct [KeyCode] (right now I get many records with the same KeyCode but different occupation details)
SUM the wage and projection fields (in the same query)
Here is my LINQ code:
private IQueryable<MyAbstractCustomOccupationInfoClass> GetMyAbstractCustomOccupationInfoClass(string[] regionNumbers)
{
//Get a list of wage data
var wages = _db.ProjectionAndWages
.Join(
_db.HWOLInformation,
wages => wages.KeyCode,
hwol => hwol.KeyCode,
(wages, hwol) => new { wages, hwol }
)
.Where(o => regionNumbers.Contains(o.hwol.LocationID))
.Where(o => o.wages.areaID.Equals("48"))
.Where(o => regionNumbers.Contains(o.wages.RegionNumber.Substring(4))); //regions filter, remove first 4 characters (0000)
//Join OccupationInfo table to wage data, for "full" output results
var occupations = wages.Join(
_db.OccupationInfo,
o => o.wages.KeyCode,
p => p.KeyCode,
(p, o) => new MyAbstractCustomOccupationInfoClass
{
KeyCode = o.KeyCode,
KeyTitle = o.KeyTitle,
CareerField = o.CareerField,
AverageAnnualOpeningsGrowth = p.wages.AverageAnnualOpeningsGrowth,
AverageAnnualOpeningsReplacement = p.wages.AverageAnnualOpeningsReplacement,
AverageAnnualOpeningsTotal = p.wages.AverageAnnualOpeningsTotal,
});
//TO-DO: How to Aggregate and Sum "occupations" list here & make the [KeyCode] Distinct ?
return occupations;
}
I am unsure if I should perform the Grouping mechanism on the 2nd join? Or perform a .GroupJoin()? Or have a third query?
var occupations = _db.OccupationInfo.GroupJoin(
wages,
o => o.KeyCode,
p => p.wages.KeyCode,
(o, pg) => new MyAbstractCustomOccupationInfoClass {
KeyCode = o.KeyCode,
KeyTitle = o.KeyTitle,
CareerField = o.CareerField,
AverageAnnualOpeningsGrowth = pg.Sum(p => p.wages.AverageAnnualOpeningsGrowth),
AverageAnnualOpeningsReplacement = pg.Sum(p => p.wages.AverageAnnualOpeningsReplacement),
AverageAnnualOpeningsTotal = pg.Sum(p => p.wages.AverageAnnualOpeningsTotal),
});
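To see why GroupJoin yields one row per KeyCode, here is a small in-memory sketch. The occupations and wages arrays, their key codes and titles, and the single Total field are made up for illustration; the real query would run against _db and sum all three projection fields.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for OccupationInfo and the joined wage data.
var occupations = new[]
{
    new { KeyCode = "K1", KeyTitle = "First occupation" },
    new { KeyCode = "K2", KeyTitle = "Second occupation" }
};
var wages = new[]
{
    new { KeyCode = "K1", Total = 10 },
    new { KeyCode = "K1", Total = 5 },
    new { KeyCode = "K2", Total = 7 }
};

// GroupJoin hands each occupation the sequence of its matching wage rows,
// so the result has exactly one element per occupation.
var summed = occupations
    .GroupJoin(
        wages,
        o => o.KeyCode,
        w => w.KeyCode,
        (o, ws) => new { o.KeyCode, o.KeyTitle, Total = ws.Sum(w => w.Total) })
    .ToList();

foreach (var s in summed)
    Console.WriteLine($"{s.KeyCode} {s.Total}");
// prints:
// K1 15
// K2 7
```

Unlike an inner Join, an occupation with no wage rows would still appear here, with a Total of 0.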
I have a record set of approximately 1 million records. I'm trying to query the records to report monthly figures.
The following MySQL query executes in about 0.3 seconds
SELECT SUM(total), MONTH(create_datetime), YEAR(create_datetime)
FROM orders GROUP BY MONTH(create_datetime), YEAR(create_datetime)
However, I am unable to figure out an Entity Framework lambda expression that executes anywhere near as fast.
The only statement I have come up with that actually works is:
var monthlySales = db.Orders
.Select(c => new
{
Total = c.Total,
CreateDateTime = c.CreateDateTime
})
.GroupBy(c => new { c.CreateDateTime.Year, c.CreateDateTime.Month })
.Select(c => new
{
CreateDateTime = c.FirstOrDefault().CreateDateTime,
Total = c.Sum(d => d.Total)
})
.OrderBy(c => c.CreateDateTime)
.ToList();
But it is horribly slow.
How can I get this query to execute as quickly as it does directly in MySQL?
When you do ".ToList()" in the middle of query (before doing grouping) EF will effectively query all orders from database in memory and then do grouping in C#. Depending on amount of data in your table, that can take a while and I think this is why your query is so slow.
Try to rewrite your query having only 1 expression that enumerates results (ToList, ToArray, AsEnumerable)
Try this:
var monthlySales = (from c in db.Orders
                    group c by new { y = c.CreateDateTime.Year, m = c.CreateDateTime.Month } into g
                    select new
                    {
                        Total = g.Sum(t => t.Total),
                        Year = g.Key.y,
                        Month = g.Key.m
                    }).ToList();
I came across this setup which executes quickly
var monthlySales = db.Orders
.GroupBy(c => new { Year = c.CreateDateTime.Year, Month = c.CreateDateTime.Month })
.Select(c => new
{
Month = c.Key.Month,
Year = c.Key.Year,
Total = c.Sum(d => d.Total)
})
.OrderByDescending(a => a.Year)
.ThenByDescending(a => a.Month)
.ToList();
I have a question about the IGrouping object that results from a LINQ query in which I use a "group by" clause.
I have two tables in the database, Products and Responses they have a relationship 1 to *. In the Responses table we have a column called FinalRate which is the rate of the product. The products can have n responses or rates.
I want to get the Products order by the sum of the FinalRate divided by the number of rates done. That is to say, order by the average rate descending from higher to lower marks.
As it can be read in the code (at the end of the question), I try to get the responses first. To sum all the finalrates and divide them by the count I use a group.
There are 2 problems with the code, even if the current code works:
1. I tried to get the Products in a single query, but I could not, because I cannot use the Products table in the group and then use the Responses table in the "orderby". Also, LINQ only lets you group one table; it is impossible to write "group prod, response".
I couldn't get this sql sentence in LINQ:
select prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name,
prod.ProductImageUrl
from rev_product prod
inner join rev_response res on res.AtProductid=prod.ProductID
group by prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name
,prod.ProductImageUrl
order by (sum(res.FinalRate)/count(res.AtProductid))
I tried this:
var gruposproductos = (from prod in ctx.Products
join res in ctx.Responses on prod.ProductID equals res.AtProductId
group prod by prod.ProductID into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2);
But as I say, the "orderby (g.Sum..." gives an error, because "into g" groups the Product table, not the Response Table.
So this is why in my final code I don't get the products in the same LINQ sentence.
2. Having accepted this, the problem is that I get an IGrouping, and I cannot iterate its Responses without the two foreach loops in the code. I wanted a single loop, as one would use with a "List" object.
It is not really a nice method, but it works. Moreover, I have to make sure that in the second loop each product is added only once.
Any better code?
var groupproducts = (from res in ctx.Responses
group res by res.AtProductId into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2).ToList();
List<Product> theproducts = new List<Product>();
foreach (var groupresponse in groupproducts)
{
foreach (var response in groupresponse)
{
var producttemp= (from prod in ctx.Products
where prod.ProductID == response.AtProductId
select prod).First();
theproducts.Add(producttemp);
}
}
FINAL SOLUTION (thanks a lot, @Daniel)
var productsanonymtype = ctx.Products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Count() == 0 ? 0 : x.Responses.Select(r => (double)r.FinalRate).Sum() / x.Responses.Count()
}).OrderByDescending(x => x.AverageRating);
List<Product> products = new List<Product>();
foreach (var prod in productsanonymtype)
{
products.Add(prod.Product);
}
Try this:
products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Sum(r => r.FinalRate) /
x.Responses.Count()
});
The Sum overload I am using is not implemented in all providers. If that's a problem for you, you can use this alternate version:
products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Select(r => r.FinalRate)
.Sum() /
x.Responses.Count()
});
If there is no navigation property from product to its responses you should first try to fix that. If you can't you can use this version:
products.Join(responses, x => x.Id, x => x.ProductId,
(p, r) => new { Product = p, Response = r })
.GroupBy(x => x.Product)
.Select(g => new { Product = g.Key,
AverageRating = g.Select(x => x.Response.FinalRate)
.Sum() /
g.Count()
});
Assuming FinalRate is an int, both methods will calculate the average rating with an int, i.e. there will be no 4.5 rating. And there will be no rounding, i.e. an actual average rating of 4.9 will result in 4. You can fix that by casting one of the operands of the division to double.
Another problem is the case with no ratings so far. The code above will result in an exception in this case. If that's a problem for you, you can change the calculation to this:
AverageRating = g.Count() == 0
? 0
: g.Select(x => (double)x.Response.FinalRate).Sum() / g.Count()
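The two pitfalls described above (integer division and products without ratings) can be seen in isolation with plain arrays. This is a sketch with made-up ratings; the names finalRates and noRates are assumptions, and FinalRate is assumed to be an int as in the answer.

```csharp
using System;
using System.Linq;

int[] finalRates = { 5, 5, 5, 5, 4 };   // sum is 24, count is 5

// Both operands are int, so the fractional part is discarded: 24 / 5 is 4.
int intAverage = finalRates.Sum() / finalRates.Count();

// Casting one operand to double switches to floating-point division: 4.8.
double doubleAverage = (double)finalRates.Sum() / finalRates.Count();

// With no ratings the count is 0; an int division would throw
// DivideByZeroException, so the count is checked first.
int[] noRates = new int[0];
double emptyAverage = noRates.Count() == 0
    ? 0
    : (double)noRates.Sum() / noRates.Count();

Console.WriteLine(intAverage);   // prints 4
Console.WriteLine(emptyAverage); // prints 0
```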
ctx.Products.GroupBy(x => new {
ProductId = x.ProductID,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);
And here it is with the projection:
ctx.Products.Select(x => new {
ProductID = x.ProductID,
Commercial_Product_Name = x.Commercial_Product_Name,
Manufacturer_Name = x.Manufacturer_Name,
ProductImageUrl = x.ProductImageUrl,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.GroupBy(x => new {
ProductId = x.ProductID,
FinalRate = x.FinalRate,
CountProductId = x.CountProductId
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);
I am able to find the duplicates out of DataTable rows. Like following:
var groups = table.AsEnumerable()
.GroupBy(r => new
{
c1 = r.Field<String>("Version"),
});
var tblDuplicates = groups
.Where(grp => grp.Count() > 1)
.SelectMany(grp => grp)
.CopyToDataTable();
Now, I want to merge all the duplicate records into a single one and sum its Value column.
Pretty much like following:
DataTable with Duplicates:
Version Value
1 2
2 2
2 1
1 3
2 1
3 2
DataTable with no duplicates and Value summed.:
Version Value
1 5
2 4
3 2
I am aware about this link which does this with the help of reflection.
http://forums.asp.net/t/1570562.aspx/1
Is there any other way to do it?
Edit:
However, what if I have more than two columns, say five, and I still want to sum the Value column but also need the other columns' data in the resultant summed DataTable? How do I do that? At the moment I get only Version and Value in my result DataTable. I want the other columns with their values as well, like the following:
Version col1 col2 Value
1 A A 2
2 B B 2
2 B B 1
1 A A 3
2 B B 1
3 C C 2
var result = table.AsEnumerable()
.GroupBy(r => r.Field<string>("Version"))
.Select(g =>
{
var row = table.NewRow();
row.ItemArray = new object[]
{
g.Key,
g.Sum(r => r.Field<int>("Value"))
};
return row;
}).CopyToDataTable();
Edit:
If you want to keep the other fields, try the following:
var result = table.AsEnumerable()
.GroupBy(r => new
{
Version = r.Field<String>("Version"),
Col1 = r.Field<String>("Col1"),
Col2 = r.Field<String>("Col2")
})
.Select(g =>
{
var row = g.First();
row.SetField("Value", g.Sum(r => r.Field<int>("Value")));
return row;
}).CopyToDataTable();
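Note that the snippet above writes the sum back into the first row of each group, so the source table is modified. If that is not wanted, a variant is to fill a fresh table that shares the schema. The sketch below uses the sample data from the question; the column types (string and int) and the name merged are assumptions.

```csharp
using System;
using System.Data;
using System.Linq;

// Sample data from the question (types assumed).
var table = new DataTable();
table.Columns.Add("Version", typeof(string));
table.Columns.Add("col1", typeof(string));
table.Columns.Add("col2", typeof(string));
table.Columns.Add("Value", typeof(int));
table.Rows.Add("1", "A", "A", 2);
table.Rows.Add("2", "B", "B", 2);
table.Rows.Add("2", "B", "B", 1);
table.Rows.Add("1", "A", "A", 3);
table.Rows.Add("2", "B", "B", 1);
table.Rows.Add("3", "C", "C", 2);

// Clone copies the columns but no rows, so the source stays untouched.
var merged = table.Clone();

var groups = table.AsEnumerable()
    .GroupBy(r => new
    {
        Version = r.Field<string>("Version"),
        Col1 = r.Field<string>("col1"),
        Col2 = r.Field<string>("col2")
    });

// One output row per distinct (Version, col1, col2) with the summed Value.
foreach (var g in groups)
    merged.Rows.Add(g.Key.Version, g.Key.Col1, g.Key.Col2,
        g.Sum(r => r.Field<int>("Value")));

foreach (DataRow row in merged.Rows)
    Console.WriteLine($"{row["Version"]} {row["col1"]} {row["col2"]} {row["Value"]}");
// prints:
// 1 A A 5
// 2 B B 4
// 3 C C 2
```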