I've been struggling with this for the last 3 days.
I'm sure I'm doing something wrong, but I need help.
During the load of a form, I run a LINQ query (on a global dataset) to populate fields on that form. Since I want to be able to change the views of the form, I want queries that make the data available in a specific format, to avoid having to re-query every now and then (the dataset has 20,000 rows).
So I came up with this first query:
var results =
from row in Globals.ds.Tables["Song"].AsEnumerable()
group row by (row.Field<int>("year"), row.Field<int>("rating")) into grp
orderby grp.Key
select new
{
year = grp.Key.Item1,
conte = grp.ToList().Count,
rating = grp.Key.Item2,
duree = grp.Sum(r => r.Field<int>("duree"))
};
It works, and I'm pasting the result in the following screenshot (conte is the count):
[Screenshot: result of the query]
I have 2 issues:
1/ I really don't know how to handle that result: I would like to filter for a specific year and list all the corresponding ratings (I have from 1 to 6 per year). I tried .ToList(), but it only helped to get the count. CopyToDataTable is not available for the query.
2/ I have buttons in the form that need access to that query, yet the var results is only available in the load handler and I can't manage to declare it at the class level.
Thanks for the help :)
So:
Your first point has been answered by @jdweng.
LINQ can also be used on collections (e.g. List), not only on DB queries.
As for the second point: the result of the query is a sequence of an anonymous type, and an anonymous type can't be declared outside local scope. You must create a new class with the same structure.
public class MyResultClass
{
public int year;
public int conte;
public int rating;
public int duree;
}
Define your field:
List<MyResultClass> data;
And then use both:
var result =
from row in Globals.ds.Tables["Song"].AsEnumerable()
group row by (row.Field<int>("year"), row.Field<int>("rating")) into grp
orderby grp.Key
select new MyResultClass
{
year = grp.Key.Item1,
conte = grp.Count(),
rating = grp.Key.Item2,
duree = grp.Sum(r => r.Field<int>("duree"))
};
data = result.ToList();
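Once the results sit in a list, the first point (all ratings for one year) becomes an ordinary in-memory LINQ filter. Here is a minimal sketch of that idea. To keep it self-contained it uses value tuples with the same four members instead of MyResultClass, and the sample values are made up, not taken from the real dataset:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample rows standing in for the grouped query result.
var data = new List<(int year, int conte, int rating, int duree)>
{
    (2015, 3, 1, 600),
    (2016, 2, 2, 400),
    (2015, 5, 4, 1200),
};

// All ratings recorded for one year, in rating order.
var ratings2015 = data
    .Where(r => r.year == 2015)
    .OrderBy(r => r.rating)
    .ToList();

foreach (var r in ratings2015)
    Console.WriteLine($"year {r.year}, rating {r.rating}: conte {r.conte}, duree {r.duree}");
```

With the real class-level field, the same Where call can be used from any button handler, because the list is no longer local to the load method.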
I hope I was helpful.
I have a database table with over 200K records and a column containing a Date (NOT NULL). I am struggling to do a GroupBy on the date: since the table is massive, the query takes very long to process (about 1 minute).
My Theory:
Get the list of all records from that table
From that list find the end date and the start date (basically the oldest date and the newest)
Then take, say, 20 dates at a time to do the GroupBy on, so the query runs over a smaller set of records.
Here is my Model that I have to get the list:
registration.Select(c => new RegistrationViewModel()
{
DateReference = c.DateReference,
MinuteWorked = c.MinuteWorked,
});
DateReference is the database column that I have to work with.
I am not sure how to go through my list to get the start and end dates without taking too long.
Any idea how to do that?
EDIT:
var registrationList = await context.Registration
.Where(c => c.Status == StatusRegistration.Active) // getting all active registrations
.ToRegistrationViewModel() // this is simply a select method
.OrderBy(d => d.DateReference.Date) // this takes long
.ToListAsync();
The GroupBy:
var grpList = registrationList.GroupBy(x => x.DateReference.Date).ToList();
var tempList = new List<List<RegistrationViewModel>>();
foreach (var item in grpList)
{
var selList = item.Select(c => new RegistrationViewModel()
{
RegistrationId = c.RegistrationId,
DateReference = c.DateReference,
MinuteWorked = c.MinuteWorked,
}).ToList();
tempList.Add(selList);
}
This is my SQL table:
This is the ToRegistrationViewModel() function:
return registration.Select(c => new RegistrationViewModel()
{
RegistrationId = c.RegistrationId,
PeopleId = c.PeopleId,
DateReference = c.DateReference,
DateChange = c.DateChange,
UserRef = c.UserRef,
CommissionId = c.CommissionId,
ActivityId = c.ActivityId,
MinuteWorked = c.MinuteWorked,
Activity = new ActivityViewModel()
{
Code = c.Activity.Code,
Description = c.Activity.Description,
},
Commission = new CommissionViewModel()
{
Code = c.Commission.Code,
Description = c.Commission.Description
},
People = new PeopleViewModel()
{
UserId = c.People.UserId,
Code = c.People.Code,
Name = c.People.Name,
Surname = c.People.Surname,
Active = c.People.Active
}
});
There are multiple potential problems here
Lack of indexes
Your query uses Status and DateReference, and neither appears to have an index. If only a few rows have an active status, an index on that column might suffice; otherwise you need an index on the date to speed up sorting. You might also consider a composite index that includes both columns. An appropriate index should solve the sorting issue.
Materializing the query
ToListAsync triggers execution of the SQL query, so every subsequent operation runs on the client. I would also be highly suspicious of ToRegistrationViewModel: I would try changing it to an anonymous type and only convert to an actual type after the query has been materialized. Running things like sorting and grouping on the client is generally considered a bad idea, but you need to consider where the actual bottleneck is: optimizing the grouping will not help if transferring the data takes most of the time.
Transferring data
Fetching a large number of rows will be slow, no matter what. The goal is usually to do as much filtering in the database as possible so you do not need to fetch so many rows. If you have to fetch a large number of records you might use pagination, i.e. combine OrderBy with Skip and Take to fetch smaller chunks of data. This will not save time overall, but it allows for things like progress reporting and showing data continuously.
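To illustrate the pagination idea, here is a minimal sketch of the OrderBy/Skip/Take pattern. It runs on an in-memory list of made-up dates so that it is self-contained; GetPage, pageIndex and pageSize are names invented for this example. On an IQueryable from your context the same three operators would normally be translated to SQL, so only one page is transferred:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Returns one page of the dates, sorted ascending (hypothetical helper).
List<DateTime> GetPage(IEnumerable<DateTime> dates, int pageIndex, int pageSize) =>
    dates.OrderBy(d => d)
         .Skip(pageIndex * pageSize)
         .Take(pageSize)
         .ToList();

// Ten made-up dates, 10 January down to 1 January 2020.
var start = new DateTime(2020, 1, 1);
var sample = Enumerable.Range(0, 10).Select(i => start.AddDays(9 - i)).ToList();

// Second page of four: 5, 6, 7 and 8 January 2020.
var secondPage = GetPage(sample, pageIndex: 1, pageSize: 4);
foreach (var d in secondPage)
    Console.WriteLine(d.ToString("yyyy-MM-dd"));
```

The OrderBy matters: without a stable sort order, Skip and Take have no well-defined meaning across pages.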
I have a database table with records for each user/year combination.
How can I get data from the database using EF and a list of userId/year combinations?
Sample combinations:
UserId Year
1 2015
1 2016
1 2018
12 2016
12 2019
3 2015
91 1999
I only need the records defined in the above combinations. I can't wrap my head around how to write this using EF/LINQ.
List<UserYearCombination> userYears = GetApprovedYears();
var records = dbcontext.YearResults.Where(?????);
Classes
public class YearResult
{
public int UserId;
public int Year;
public DateTime CreatedOn;
public int StatusId;
public double Production;
public double Area;
public double Fte;
public double Revenue;
public double Diesel;
public double EmissionsCo2;
public double EmissionInTonsN;
public double EmissionInTonsP;
public double EmissionInTonsA;
....
}
public class UserYearCombination
{
public int UserId;
public int Year;
}
This is a notorious problem that I discussed before here. Krishna Muppalla's solution is among the solutions I came up with there. Its disadvantage is that it's not sargable, i.e. it can't benefit from any indexes on the involved database fields.
In the meantime I came up with another solution that may be helpful in some circumstances. Basically, it groups the input data by one of the fields and then, for each group, queries the database by the grouping key plus a Contains over the group's elements, concatenating the per-group queries:
IQueryable<YearResult> items = null;
foreach (var yearUserIds in userYears.GroupBy(t => t.Year, t => t.UserId))
{
var userIds = yearUserIds.ToList();
var grp = dbcontext.YearResults
.Where(x => x.Year == yearUserIds.Key
&& userIds.Contains(x.UserId));
items = items == null ? grp : items.Concat(grp);
}
I use Concat here because Union will waste time making results distinct. Also, in EF6 Concat generates SQL with chained UNION statements, while Union generates nested UNION statements, and the maximum nesting level may be hit.
This query may perform well enough when indexes are in place. In theory, the maximum number of UNIONs in a SQL statement is unlimited, but the number of items in an IN clause (which Contains translates to) should not exceed a couple of thousand. That means that the content of your data will determine which grouping field performs better, Year or UserId. The challenge is to minimize the number of UNIONs while keeping the number of items in all IN clauses below approx. 5000.
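One way to decide between the two grouping fields is to measure the input list before building the query. The sketch below does that for the sample combinations from the question, using value tuples in place of UserYearCombination so that it is self-contained. It only counts groups and group sizes and does not touch the database:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The sample combinations from the question.
var userYears = new List<(int UserId, int Year)>
{
    (1, 2015), (1, 2016), (1, 2018), (12, 2016), (12, 2019), (3, 2015), (91, 1999),
};

// Grouping by Year: one sub-query per distinct year, each with an IN list of user ids.
var byYear = userYears.GroupBy(t => t.Year, t => t.UserId).ToList();
// Grouping by UserId: one sub-query per distinct user, each with an IN list of years.
var byUser = userYears.GroupBy(t => t.UserId, t => t.Year).ToList();

Console.WriteLine($"By year: {byYear.Count} sub-queries, largest IN list {byYear.Max(g => g.Count())}");
Console.WriteLine($"By user: {byUser.Count} sub-queries, largest IN list {byUser.Max(g => g.Count())}");
```

For this sample, grouping by user gives 4 sub-queries against 5 when grouping by year, at the cost of a slightly longer IN list (3 items against 2). With real data the difference can be much larger in either direction.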
You can try this:
//add the possible filters to LIST
var searchIds = new List<string> { "1-2015", "1-2016", "2-2018" };
//use the list to check in Where clause
var result = (from x in YearResults
where searchIds.Contains(x.UserId.ToString()+'-'+x.Year.ToString())
select new UserYearCombination
{
UserId = x.UserId,
Year = x.Year
}).ToList();
Method 2
var d = YearResults
.Where(x=>searchIds.Contains(x.UserId.ToString() + '-' + x.Year.ToString()))
.Select(x => new UserYearCombination
{
UserId = x.UserId,
Year = x.Year
}).ToList();
Let's say I have a List of Detail objects with 1000 entries. How can I retrieve exactly the matching rows from the database Details table using LINQ method syntax, matching on the combination of the FirstCode and SecondCode properties?
public class Detail
{
public string FirstCode { get; set; }
public string SecondCode { get; set; }
}
Retrieving the rows one at a time would look like this:
foreach(var detail in details)
{
var retrievedData = context.Details
.Where(x => x.FirstCode == detail.FirstCode && x.SecondCode == detail.SecondCode)
.FirstOrDefault();
// Add to some list here
}
But I don't want to hit the database 1000 times, and I also don't want to load all the data from the Details table and do the searching at the .NET level, because that is not ideal with a lot of data (e.g. 500,000+ records in the Details table).
You need to programmatically generate the 'where' clause. Start with a query that returns all the rows in the Details database table...
IQueryable<XDetail> queryable = (from d in context.Details select d);
...where XDetail is the class type of the database table. I assume it is different from the Detail class in your question. Now you need to generate all the clauses to the query that specify the list of entries we want...
var predicate = PredicateBuilder.False<XDetail>();
foreach (Detail d in details)
    predicate = predicate.Or(xd => xd.FirstCode == d.FirstCode &&
                                   xd.SecondCode == d.SecondCode);
queryable = queryable.Where(predicate);
var results = queryable.ToList();
You can see the code for the PredicateBuilder class here. Note that Entity Framework will generate the required SQL, but there is a limit to how big that query can be, so adding 1000 clauses will almost certainly make it too big. You would have to experiment, but you might be limited to 100 clauses or fewer before you hit the limit.
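If the full list is too long for one query, one option is to split it into batches and run the predicate query once per batch. The sketch below shows only the batching step. It assumes .NET 6 or later for Enumerable.Chunk, uses made-up codes in place of real Detail objects, and treats the batch size of 100 as a starting guess to tune by experiment:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// 1000 made-up code pairs standing in for the Detail list.
var details = Enumerable.Range(0, 1000)
    .Select(i => (FirstCode: $"A{i}", SecondCode: $"B{i}"))
    .ToList();

const int batchSize = 100; // assumed limit, tune by experiment

// Chunk (available from .NET 6) splits the list into arrays of at most batchSize items.
var batches = details.Chunk(batchSize).ToList();

foreach (var batch in batches)
{
    // Here you would build the Or-predicate from 'batch' as shown above,
    // run queryable.Where(predicate).ToList() and append to one result list.
    Console.WriteLine($"batch of {batch.Length}, first pair {batch[0].FirstCode}/{batch[0].SecondCode}");
}
```

This trades one oversized query for ten moderate ones, which is still far fewer round trips than one query per Detail.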
I have a linq query which seems to be reversing one column of several in some rows of an earlier query:
var dataSet = from fb in ds.Feedback_Answers
where fb.Feedback_Questions.Feedback_Questionnaires.QuestionnaireID == criteriaType
&& fb.UpdatedDate >= dateFeedbackFrom && fb.UpdatedDate <= dateFeedbackTo
select new
{
fb.Feedback_Questions.Feedback_Questionnaires.QuestionnaireID,
fb.QuestionID,
fb.Feedback_Questions.Text,
fb.Answer,
fb.UpdatedBy
};
This gets the first dataset and is confirmed working.
This is then grouped like this:
var groupedSet = from row in dataSet
group row by row.UpdatedBy
into grp
select new
{
Survey = grp.Key,
QuestionID = grp.Select(i => i.QuestionID),
Question = grp.Select(q => q.Text),
Answer = grp.Select(a => a.Answer)
};
While grouping, the resulting returnset (of type: string, list int, list string, list int) sometimes, but not always, turns the question order back to front, without inverting answer or questionID, which throws it off.
i.e. if the set is questionID 1,2,3 and question A,B,C it sometimes returns 1,2,3 and C,B,A
Can anyone advise why it may be doing this? Why only on the one column? Thanks!
edit: Got it thanks all! In case it helps anyone in future, here is the solution used:
var groupedSet = from row in dataSet
group row by row.UpdatedBy
into grp
select new
{
Survey = grp.Key,
QuestionID = grp.OrderBy(x=>x.QuestionID).Select(i => i.QuestionID),
Question = grp.OrderBy(x=>x.QuestionID).Select(q => q.Text),
Answer = grp.OrderBy(x=>x.QuestionID).Select(a => a.Answer)
};
Reversal of a grouped order is a coincidence: IQueryable<T>'s GroupBy returns groups in no particular order. Unlike in-memory GroupBy, which specifies the order of its groups, queries performed in RDBMS depend on implementation:
The query behavior that occurs as a result of executing an expression tree that represents calling GroupBy<TSource,TKey,TElement>(IQueryable<TSource>, Expression<Func<TSource,TKey>>, Expression<Func<TSource,TElement>>) depends on the implementation of the type of the source parameter.
If you would like to have your rows in a specific order, you need to add OrderBy to your query to force it.
How do I do it and maintain the relative list order, rather than apply an order to the resulting set?
One approach is to apply the grouping to your data after bringing it into memory. Call ToList() on dataSet at the end to bring the data into memory. After that, the order of the subsequent GroupBy query will be consistent with dataSet. A drawback is that the grouping is no longer done in the RDBMS.
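A minimal sketch of the in-memory behavior, with made-up rows in place of the real dataSet. Enumerable.GroupBy yields groups in order of first key appearance and keeps the elements of each group in source order, so the projected lists stay aligned:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up rows, already in memory (as after dataSet.ToList()).
var rows = new[]
{
    (UpdatedBy: "ann", QuestionID: 1, Text: "A"),
    (UpdatedBy: "bob", QuestionID: 1, Text: "A"),
    (UpdatedBy: "ann", QuestionID: 2, Text: "B"),
    (UpdatedBy: "ann", QuestionID: 3, Text: "C"),
};

var grouped = rows
    .GroupBy(r => r.UpdatedBy)
    .Select(g => new
    {
        Survey = g.Key,
        QuestionID = g.Select(r => r.QuestionID).ToList(),
        Question = g.Select(r => r.Text).ToList(),
    })
    .ToList();

// First group is "ann" with ids 1,2,3 and questions A,B,C in matching order.
foreach (var g in grouped)
    Console.WriteLine($"{g.Survey}: {string.Join(",", g.QuestionID)} / {string.Join(",", g.Question)}");
```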
What I want to do is basically what this question asks: SQL Server - How to display most recent records based on dates in two tables. The only difference is that I am using LINQ to SQL.
I have two tables:
Assignments
ForumPosts
These are not very similar, but they both have a "LastUpdated" field. I want to get the most recent joined records. However, I also need a take/skip functionality for paging (and no, I don't have SQL 2012).
I don't want to create a new list (with ToList and AddRange) containing ALL my records just so that I know the whole set and can then order it. That seems extremely inefficient.
My attempt:
Please don't laugh at my inefficient code.. Well ok, a little (both because it's inefficient and... it doesn't do what I want when skip is more than 0).
public List<TempContentPlaceholder> LatestReplies(int take, int skip)
{
using (GKDBDataContext db = new GKDBDataContext())
{
var forumPosts = db.dbForumPosts.OrderBy(c => c.LastUpdated).Skip(skip).Take(take).ToList();
var assignMents = db.dbUploadedAssignments.OrderBy(c => c.LastUpdated).Skip(skip).Take(take).ToList();
List<TempContentPlaceholder> fps =
forumPosts.Select(
c =>
new TempContentPlaceholder()
{
Id = c.PostId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost
}).ToList();
List<TempContentPlaceholder> asm =
assignMents.Select(
c =>
new TempContentPlaceholder()
{
Id = c.UploadAssignmentId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost
}).ToList();
fps.AddRange(asm);
return fps.OrderBy(c=>c.LastUpdated).ToList();
}
}
Are there any awesome LINQ to SQL people who can give me a hint? I am sure someone can join their way out of this!
First, you should be using OrderByDescending to get the most recent updates, since later dates have greater values than earlier dates. Second, I think what you are doing will work for the first page, but you also need to take only the top take values from the merged list. That is, if you want the last 20 entries from both tables combined, take the last 20 entries from each, merge them, then take the last 20 entries from the merged list. The problem comes when you attempt to use paging, because you would need to know how many elements from each list went into making up the previous pages. I think your best bet is probably to merge them first, then use Skip/Take. I know you don't want to hear that, but other solutions are probably more complex. Alternatively, you could take the top skip+take values from each table, then merge, skip the skip values and apply take.
using (GKDBDataContext db = new GKDBDataContext())
{
var fps = db.dbForumPosts.Select(c => new TempContentPlaceholder()
{
Id = c.PostId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost
})
.Concat( db.dbUploadedAssignments.Select(c => new TempContentPlaceholder()
{
Id = c.UploadAssignmentId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost
}))
.OrderByDescending( c => c.LastUpdated )
.Skip(skip)
.Take(take)
.ToList();
return fps;
}
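The alternative mentioned above (take the top skip+take rows from each table, merge, then skip and take) can be sketched in memory as follows. Plain integers stand in for the LastUpdated values here, larger meaning more recent; the numbers are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up "timestamps" for the two sources; larger means more recent.
var posts = new List<int> { 12, 9, 7, 5, 3, 1 };
var assignments = new List<int> { 11, 10, 8, 6, 4, 2 };
int skip = 2, take = 3;

// Only the newest skip + take rows of each source can end up on the requested page,
// so nothing older needs to be fetched.
var page = posts.OrderByDescending(x => x).Take(skip + take)
    .Concat(assignments.OrderByDescending(x => x).Take(skip + take))
    .OrderByDescending(x => x)
    .Skip(skip)
    .Take(take)
    .ToList();

// The merged order is 12, 11, 10, 9, 8, ... so the page is 10, 9, 8.
Console.WriteLine(string.Join(", ", page));
```

This fetches at most 2 * (skip + take) rows, which is cheap for early pages but grows with the page number, so merging in the database as shown above is still the simpler choice when the provider can translate it.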