I simply can not get this to work out at all, so any expert help would be very much appreciated.
I'm trying (as the subject suggests) to join 2 datatables on Zip Code, but return a table which grouped this by State and has a SUM() of sales.
Here's the latest version of my troubles:
var results =(
from a in dtList.AsEnumerable()
join b in dtListCoded.AsEnumerable()
on a.Field<string>("ZIP") equals b.Field<string>("zip")
group a by {a.Field<string>("StateCode")} into g
select new {
StateCode = a.Field<string>("StateCode"),
SumSales = b.Sum(b => b.Field<double>("SUMSales"))
});
I can join the 2 tables but its getting the result i need that seems to be the tricky bit. If need be I will just have to do 2 queries, but that just seems a bit backward.
Thanks in advance.
Two queries wouldn't be any slower (they should be brought together into a single SQL query upon execution), and would be a lot more readable, transparent during debugging and reusable. I'd recommend breaking it down.
Related
I am doing a school project and need help with this last problem I am having.
Currently I am trying to do a many 2 many join and then fill an IEnumerable list with the result - using linq and lambda.
The purpose is to show the compatible games along with every product.
My code as of now:
else
{
var result = (from g in db.Games
join gs in db.GameSize
on g.GameId equals gs.GameId
join s in db.Size
on gs.SizeId equals s.SizeId
join p in db.Product
on s.SizeId equals p.SizeId
select p.Size.Name);
games = db.Games
.Where(game => game.GameSize
.All(s => s.Size.Name == result.FirstOrDefault()));
}
My idea is to join through the tables and find the gameid who have a matching productid - and then add them to "games".
I am aware that this table design is horrible and that I am only getting the first result in the list with FirstOrDefault().
Does anyone have a suggestion or solution to help me? Thanks.
Please ask if I am not making any sense.
Essentially I just wan't to show the games linked to a size. My table looks like this:
--SIZE
insert into size values ('Large')
insert into size values ('Medium')
insert into size values ('Small')
--GAMES
insert into games values ('Magic The Gathering')
insert into games values ('Pokemon')
insert into games values ('Dead of Winter')
--GAMESIZE (RELATION GAMES AND SIZE) (SIZEID, GAMEID)
insert into gamesize values (1, 1)
insert into gamesize values (2, 2)
insert into gamesize values (2, 3)
First, you should really move on from the LINQ to SQL syntax. The EF syntax is much easier and readable. For example:
var result = db.Games.Include("GameSize.Size.Product").Select(m => m.Size.Name);
Second, All doesn't do what you seem to think it does. It returns a boolean: true if all the items match the condition, false if not. It's very unlikely that all your GameSizes have the same Size.Name. You might be looking for Any here.
Third, this whole thing seems counter-intuitive. You're getting all the games and selecting all the size names. Then using that list of size names, to select games that have those size names. In other words, you're doing an extra query to get the results you already had. Remove the select from the result and use that for games instead.
Long and short, if I'm understanding your code properly, you can reduce all of this to just one simple line:
games = db.Games.Include("GameSize.Size.Product");
We have 18 table join which is typical for ERP systems. The join is done via LINQ over Entity Framework.
The join gets progressively slower as more joins are added. The return result set is small(15 records). The LINQ generated query is captured via SQL Profiler and when we run this via Microsoft Management Console it is very fast : 10ms. When we run it via our C# LINQ-over-EntityFramework it takes 4 seconds.
What i guess is happening:
The time it takes to compile expression tree into SQL is 2 seconds out of total 4 seconds, and another 2 seconds i guess is spent internally to convert SQL result set into actually C# classes. Also it is not connected to initialization of entity framework because we run some queries before and repetitive calls to this join produce same 4 seconds.
Is there a way to speed this up. Otherwise we are considering abandoning Entity Framework for being absolutely inefficient...
In case it helps, I had a nasty performance issue, whereby a simple query that took 1-2 seconds in raw SQL took about 11 seconds via EF.
I went from using...
List<GeographicalLocation> geographicalLocations = new SalesTrackerCRMEntities()
.CreateObjectSet<GeographicalLocation>()
.Where(g => g.Active)
.ToList();
which took about 11 seconds via EF, to using...
var geographicalLocations = getContext().CreateObjectSet<GeographicalLocation>()
.AsNoTracking()
.Where(g => g.Active).ToList();
which took less than 200 milliseconds.
The disadvantage to this is that it won't load related entities, so you have to load them manually afterwards, but it gives such an immense performance boost that it was well worth it (in this case at least).
You would have to assess each case individually to see if the extra speed is worth the extra code.
You correctly identified bottlenecks.
If you have quite complex queries, I would suggest you to use compiled queries to overcome expression tree to sql query conversion.
You can refer Compiled Queries in EF from here.
Fo second part if EF is using two much time materialize your object graph then I would suggest to use some other means to retrieve data apart from EF.
One option can be Dapper.NET, You can have your concise sql query and you can directly retrieve its result in concrete model objects using Dapper (or any other tiny ORM)
I suspect your query takes so long to generate becuase you are treating Entity Framework like it is a SQL Query, which is not correct. You have many joins and akward calls in your linq syntax. Generally, your syntax should be similar to the following fictitious modeling query:
var result = (from appointment in appointments
from operation in appointment.Operations
where appointment.Id == 12
select new Model {
Id = appointment.Id,
Name = appointment.Name,
// etc, etc
}).ToList();
There is no use of joins above, the navigation property between Appointment and Operations takes care of the neccessary plumbing. Remember, this is an ORM, there is no concept of a join, only a concept of relationships.
The call to Distinct at the end, also indicates the structure of the db schema may be problematic if it returns too many duplicate results.
If after refactoring the entity model and correctly constructing the query still leaves with underperformance, it is advisable to use a stored procedure and map the result with EF's built in methods for doing so.
It is hard to tell what is going wrong here without seeing how you are using linq, but I suspect this will fix your problem:
var myResult = dataContext.table.Where(x == "Your joins and otherstuff").ToList();
//after converting it to a list use it how you need, but not before.
If this does not help please post your code.
The problem is that you are probably passing it to a data source that is running all sorts of additional queries based on you open result set.
Try this instead:
IEnumerable<SigmaTEK.Web.Models.SchedulerGridModel> tasks = (from appointment in _appointmentRep.Value.Get().Where(a => (a.Start < DbContext.MaxTime && DbContext.MinTime < a.Expiration))
join timeApplink in _timelineAppointmentLink.Value.Get().Where(a => a.AppointmentId != Guid.Empty)
on appointment.Id equals timeApplink.AppointmentId
join timeline in timelineRep.Value.Get().Where(i => timelines.Contains(i.Id))
on timeApplink.TimelineId equals timeline.Id
join repeater in _appointmentRepeaterRep.Value.Get().Where(repeater => (repeater.Start < DbContext.MaxTime && DbContext.MinTime < repeater.Expiration))
on appointment.Id equals repeater.Appointment
into repeaters
from repeater in repeaters.DefaultIfEmpty()
join aInstance in _appointmentInstanceRep.Value.Get()
on appointment.Id equals aInstance.Appointment
into instances
from instance in instances.DefaultIfEmpty()
join opRes in opResRep.Get()
on instance.ResourceOwner equals opRes.Id
into opResources
from op in opResources.DefaultIfEmpty()
join itemResource in _opDocItemResourcelinkRep.Value.Get()
on op.Id equals itemResource.Resource
into itemsResources
from itemresource in itemsResources.DefaultIfEmpty()
join opDocItem in opDocItemRep.Get()
on itemresource.OpDocItem equals opDocItem.Id
into opDocItems
from opdocitem in opDocItems.DefaultIfEmpty()
join opDocSection in opDocOpSecRep.Get()
on opdocitem.SectionId equals opDocSection.Id
into sections
from section in sections.DefaultIfEmpty()
join opDoc in opDocRep.Get()
on section.InternalOperationalDocument equals opDoc.Id
into opdocs
from opdocitem2 in opDocItems.DefaultIfEmpty()
join opDocItemLink in opDocItemStrRep.Get()
on opdocitem2.Id equals opDocItemLink.Parent
into opDocItemLinks
from link in opDocItemLinks.DefaultIfEmpty()
join finItem in finItemsRep.Get()
on link.Child equals finItem.Id
into temp1
from rd1 in temp1.DefaultIfEmpty()
join sec in finSectionRep.Get()
on rd1.SectionId equals sec.Id
into opdocsections
from finopdocsec in opdocsections.DefaultIfEmpty()
join finopdoc in opDocRep.Get().Where(i => i.DocumentType == "Quote")
on finopdocsec.InternalOperationalDocument equals finopdoc.Id
into finOpdocs
from finOpDoc in finOpdocs.DefaultIfEmpty()
join entry in entryRep.Get()
on rd1.Transaction equals entry.Transaction
into entries
from entry2 in entries.DefaultIfEmpty()
join resproduct in resprosductRep.Get()
on entry2.Id equals resproduct.Entry
into resproductlinks
from resprlink in resproductlinks.DefaultIfEmpty()
join res in resRep.Get()
on resprlink.Resource equals res.Id
into rootResource
from finopdoc in finOpdocs.DefaultIfEmpty()
join rel in orgDocIndRep.Get().Where(i => (i.Relationship == "OrderedBy"))
on finopdoc.Id equals rel.OperationalDocument
into orgDocIndLinks
from orgopdoclink in orgDocIndLinks.DefaultIfEmpty()
join org in orgRep.Get()
on orgopdoclink.Organization equals org.Id
into toorgs
from opdoc in opdocs.DefaultIfEmpty()
from rootresource in rootResource.DefaultIfEmpty()
from toorg in toorgs.DefaultIfEmpty()
select new SigmaTEK.Web.Models.SchedulerGridModel()
{
Id = appointment.Id,
Description = appointment.Description,
End = appointment.Expiration,
Start = appointment.Start,
OperationDisplayId = op.DisplayId,
OperationName = op.Name,
AppContextId = _appContext.Id,
TimelineId = timeline.Id,
AssemblyDisplayId = rootresource.DisplayId,
//Duration = SigmaTEK.Models.App.Utils.StringHelpers.TimeSpanToString((appointment.Expiration - appointment.Start)),
WorkOrder = opdoc.DisplayId,
Organization = toorg.Name
}).Distinct().ToList();
//In your UI
MyGrid.DataSource = tasks;
MyGrid.DataBind();
//Do not use an ObjectDataSource! It makes too many extra calls
Currently I'm running some code and got some question about this. Below are the code listings of two LINQ to Entites queries.
Code listing A:
IQueryable list =
from tableProject in db.Project
select new {StaffInCharge = (
from tableStaff in db.Staff
where tableStaff.StaffId == tableProject.StaffInChargeId
select tableStaff.StaffName)};
Code listing B:
IQueryable list =
from tableProjectin db.Project
join tableStaff in db.Staff
on tableProject.StaffInChargeId
equal tableStaff.StaffId
select new {StaffInCharge = tableStaff.StaffName};
What I want to figure out is which one will be better and faster if I have to select many column from many others table.
Thanks.
this is the comment from #Tim Schmelter
"The article(actually my SO-Question) relates to LINQ-To-DataSet what is based on LINQ-To-Objects. Linq to SQL or Linq to Entities might be optimized by the DBMS in that way that a where clause has the same performance as a join."
and the link is
Why is LINQ JOIN so much faster than linking with WHERE?
i think it is very useful.
I am attempting to perform a join between two tables and limit results by 3 conditions. 2 of the conditions belong to the primary table, the third condition belongs to the secondary table. Here is the query I'm attempting:
var articles = (from article in this.Context.contents
join meta in this.Context.content_meta on article.ID equals meta.contentID
where meta.metaID == 1 && article.content_statusID == 1 && article.date_created > created
orderby article.date_created ascending
select article.content_text_key);
It is meant to join the two tables by the contentID, then filter based on the metaID (type of article), statusID, and then get all articles that are greater than the datetime created. The problem is that it returns 2 records (out of 4 currently). One has a date_created less than created and the other is the record that produced created in the first place (thus equal).
By removing the join and the where clause for the meta, the result produces no records (expected). What I can't understand is that when I translate this join into regular SQL it works just fine. Obviously I'm misunderstanding what the functionality of join is in this context. What would cause this behavior?
Edit:
Having tried this in LinqPad, I've noticed that LinqPad provides the expected results. I have tried these queries separately in code and it isn't until the join is added that odd results begin populating it appears to be happening on any date comparison where the record occurs on the same day as the limiter.
I can't seem to be able to add a comment but in debug mode you should be able to put a break point on this line of code. When you do you should be able to hover over it and have it tell you the sql that LINQ generates. Please post that sql.
At your suggestion, I'm posting my comment as the answer:
"It might also help to see your schema. The data types for metaID,
content_statusID, and date_created might come into play as well -- and
it's easy for me (somebody who's unfamiliar with your code) to make
assumptions about those data types."
Given that I have three tables (Customer, Orders, and OrderLines) in a Linq To Sql model where
Customer -- One to Many -> Orders -- One to Many -> OrderLines
When I use
var customer = Customers.First();
var manyWay = from o in customer.CustomerOrders
from l in o.OrderLines
select l;
I see one query getting the customer, that makes sense. Then I see a query for the customer's orders and then a single query for each order getting the order lines, rather than joining the two. Total of n + 1 queries (not counting getting customer)
But if I use
var tableWay = from o in Orders
from l in OrderLines
where o.Customer == customer
&& l.Order == o
select l;
Then instead of seeing a single query for each order getting the order lines, I see a single query joining the two tables. Total of 1 query (not counting getting customer)
I would prefer to use the first Linq query as it seems more readable to me, but why isn't L2S joining the tables as I would expect in the first query? Using LINQPad I see that the second query is being compiled into a SelectMany, though I see no alteration to the first query, not sure if that's a indicator to some problem in my query.
I think the key here is
customer.CustomerOrders
Thats an EntitySet, not an IQueryable, so your first query doesn't translate directly into a SQL query. Instead, it is interpreted as many queries, one for each Order.
That's my guess, anyway.
How about this:
Customers.First().CustomerOrders.SelectMany(item => item.OrderLines)
I am not 100% sure. But my guess is because you are traversing down the relationship that is how the query is built up, compared to the second solution where you are actually joining two sets by a value.
So after Francisco's answer and experimenting with LINQPad I have come up with a decent workaround.
var lines = from c in Customers
where c == customer
from o in c.CustomerOrders
from l in o.OrderLines
select l;
This forces the EntitySet into an Expression which the provider then turns into the appropriate query. The first two lines are the key, by querying the IQueryable and then putting the EntitySet in the SelectMany it becomes an expression. This works for the other operators as well, Where, Select, etc.
Try this query:
IQueryable<OrderLine> query =
from c in myDataContext.customers.Take(1)
from o in c.CustomerOrders
from l in o.OrderLines
select l;
You can go to the CustomerOrders property definition and see how the property acts when it used with an actual instance. When the property is used in a query expression, the behavior is up to the query provider - the property code is usually not run in that case.
See also this answer, which demonstrates a method that behaves differently in a query expression, than if it is actually called.