Linq fails to select from large data set

I am trying to project a result set into a list of objects of a specific type.
var temp = res.Select(a => new ContentAudit
{
    locale = a.locale,
    product_code = a.product_code,
    product_name = a.product_name,
    image = a.image,
    product_short_description = a.product_short_description,
    product_long_description = a.product_long_description,
    items = a.items,
    service_articles = GetServiceArticleCount(a.product_id).ToString(),
    is_deleted = a.is_deleted,
    views = a.views,
    placed_in_cart = GetPlacedInCartCount(a.product_id).ToString(),
    ordered = GetOrderedCount(a.product_id).ToString(),
    Importance = GetImportance(a.product_id),
    operation = a.product_id.ToString()
}).ToList();
I am selecting from the 'res' variable, which holds the result set selected from the database and has around 65,000 records. Because of that, the line of code above doesn't work and the server gets stuck. Is there any other way I can achieve this? Thank you.

There are several problems with this query.
First, you are selecting 65,000 records from the database and calling .ToList(), which iterates over every object and materializes them all at once.
Work against IEnumerable (or IQueryable) instead and rely on lazy (deferred) evaluation, so rows are only processed as you enumerate them.
If you do not need all of these objects, add a .Where() clause to limit the number of entities.
Second, the query calls methods that make even more requests to the database for every row. Do you really need all this data? If so, make sure everything is evaluated lazily. Do not iterate over it all at once!
I can see two solutions. If you don't need all this data, fetch only the data you need from the database and limit the number of retrieved entities as much as possible.
If you really need all this data, use lazy loading and add pagination (the .Skip() and .Take() methods) to limit the number of entities retrieved in one call.
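A minimal sketch of the pagination idea, run against in-memory data so it is self-contained. The names PagingSketch, GetPage and orderedCounts are hypothetical, and the dictionary merely stands in for counts fetched in one grouped query instead of one query per row. Against an IQueryable the same Skip/Take calls are normally translated into a paged SQL query, and providers typically expect an OrderBy before Skip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Returns one page of a sequence; pageIndex is zero-based.
    public static List<T> GetPage<T>(IEnumerable<T> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    public static void Main()
    {
        // Stand-in for the 65,000-row result set (hypothetical data).
        IEnumerable<int> productIds = Enumerable.Range(1, 65000);

        // Stand-in for per-product counts loaded once up front,
        // rather than calling a lookup method for every row.
        Dictionary<int, int> orderedCounts = productIds.ToDictionary(id => id, id => id % 7);

        // Only 100 items are materialized here, not all 65,000.
        List<int> page = GetPage(productIds, pageIndex: 2, pageSize: 100);
        foreach (int id in page.Take(3))
        {
            Console.WriteLine($"{id}: ordered {orderedCounts[id]} times");
        }
    }
}
```

Each call to GetPage materializes a single page, so memory use is bounded by the page size rather than by the size of the whole result set.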

Related

Need to get the data during an SaveChanges() within EF Core

I need data for a LINQ query that is not already saved to the database. Here is my code:
foreach (var item in BusProc)
{
    var WorkEffortTypeXref = new WorkEffortTypeXref
    {
        SourceDataEntityTypeName = "BusProc",
        SourceDataEntityTypeId = item.BusProcId,
    };
    _clDBContext.WorkEffortTypeXref.AddRange(WorkEffortTypeXref);
}
But I need this data in the SQL database before I run a LINQ join query on it. However, I don't really want to call OnSave() after this function, because I want to make the whole process transactional.
This is the LINQ I need to execute. What is the best way to do this?
var linqquery = from bupr in BusProc
                join wrtx in WorkEffortTypeXref on bupr.BusProcId equals wrtx.SourceDataEntityTypeId
                // where wrtx.SourceDataEntityTypeName == "BusProc"
                select new
                {
                    wrtx.TargetDataEntityTypeId
                };

First, try to compile your code. This code won't compile and as-is isn't comprehensible. You are treating WorkEffortTypeXref as a class, a singular object and a list all in the first section of code, so we really can't know what it is supposed to be.
Now, as I understand your question, you want to query information that is being added to a table (currently stored in memory as a collection of some sort), but you want to query it before it is added? What if the table has other rows that match your query? What if the insert violates a constraint and therefore fails? LINQ can query an in-memory collection, but you have to choose: are you querying the collection that you have in memory (which isn't yet a set of rows in your table) or the database (which has all of the rules, constraints, etc. that databases provide)? Until the records are saved to your table, they aren't the same thing.
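To illustrate the first of those two choices, here is a small sketch of a LINQ join that runs purely against in-memory collections. The classes BusProcRow and XrefRow are hypothetical stand-ins that reuse the property names from the question, and the int type of TargetDataEntityTypeId is an assumption. Nothing here touches the database, so rows already stored in the table and any database constraints are not taken into account.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes; only the property names come from the question.
public class BusProcRow
{
    public int BusProcId { get; set; }
}

public class XrefRow
{
    public string SourceDataEntityTypeName { get; set; }
    public int SourceDataEntityTypeId { get; set; }
    public int TargetDataEntityTypeId { get; set; }
}

public static class InMemoryJoinSketch
{
    // Joins the not-yet-saved xref rows to the business processes in memory.
    public static List<int> TargetIds(IEnumerable<BusProcRow> busProcs, IEnumerable<XrefRow> pendingXrefs)
    {
        var query = from bupr in busProcs
                    join wrtx in pendingXrefs on bupr.BusProcId equals wrtx.SourceDataEntityTypeId
                    where wrtx.SourceDataEntityTypeName == "BusProc"
                    select wrtx.TargetDataEntityTypeId;
        return query.ToList();
    }

    public static void Main()
    {
        var busProcs = new List<BusProcRow>
        {
            new BusProcRow { BusProcId = 1 },
            new BusProcRow { BusProcId = 2 },
        };
        var pending = new List<XrefRow>
        {
            new XrefRow { SourceDataEntityTypeName = "BusProc", SourceDataEntityTypeId = 1, TargetDataEntityTypeId = 10 },
            new XrefRow { SourceDataEntityTypeName = "Other", SourceDataEntityTypeId = 2, TargetDataEntityTypeId = 20 },
        };
        Console.WriteLine(string.Join(",", TargetIds(busProcs, pending))); // prints 10
    }
}
```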

Too many parameters were provided in this RPC request. The maximum is 2100 [duplicate]

from f in CUSTOMERS
where depts.Contains(f.DEPT_ID)
select f.NAME
depts is a list (IEnumerable<int>) of department ids
This query works fine until you pass a large list (say around 3000 dept ids). Then I get this error:
The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100.
I changed my query to:
var dept_ids = string.Join(" ", depts.ToStringArray());
from f in CUSTOMERS
where dept_ids.IndexOf(Convert.ToString(f.DEPT_id)) != -1
select f.NAME
Using IndexOf() fixed the error but made the query slow. Is there any other way to solve this? Thanks so much.

My solution (Guids is a list of ids you would like to filter by):
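List<MyTestEntity> result = new List<MyTestEntity>();
// Query in batches of 2000 ids to stay below the 2100-parameter limit.
for (int i = 0; i < Math.Ceiling((double)Guids.Count / 2000); i++)
{
    // Materialize the batch so the query receives a concrete list of ids.
    var nextGuids = Guids.Skip(i * 2000).Take(2000).ToList();
    result.AddRange(db.Tests.Where(x => nextGuids.Contains(x.Id)));
}
this.DataContext = result;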

Why not write the query in sql and attach your entity?
It's been awhile since I worked in Linq, but here goes:
IQuery q = Session.CreateQuery(@"
    select *
    from customerTable f
    where f.DEPT_id in (" + string.Join(",", depts.ToStringArray()) + ")");
q.AttachEntity(CUSTOMER);
Of course, you will need to protect against injection, but that shouldn't be too hard.

You will want to check out the LINQKit project, since somewhere in there is a technique for batching up such statements to solve this issue. I believe the idea is to use the PredicateBuilder to break the local collection into smaller chunks, but I haven't reviewed the solution in detail because I've instead been looking for a more natural way to handle this.
Unfortunately it appears from Microsoft's response to my suggestion to fix this behavior that there are no plans set to have this addressed for .NET Framework 4.0 or even subsequent service packs.
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=475984
UPDATE:
I've opened up some discussion regarding whether this was going to be fixed for LINQ to SQL or the ADO.NET Entity Framework on the MSDN forums. Please see these posts for more information regarding these topics and to see the temporary workaround that I've come up with using XML and a SQL UDF.

I had a similar problem and found two ways to fix it:
1. The Intersect method
2. A join on the IDs
To get values that are NOT in the list, I used the Except method or a left join.
Update
Entity Framework 6.2 runs the following query successfully:
var employeeIDs = Enumerable.Range(3, 5000);
var orders =
    from order in Orders
    where employeeIDs.Contains((int)order.EmployeeID)
    select order;

Your post was from a while ago, but perhaps someone will benefit from this. Entity Framework does a lot of query caching: every time you send in a different parameter count, another entry gets added to the cache. Using a "Contains" call causes the generated SQL to include a clause like "WHERE x IN (@p1, @p2, ... @pn)", which bloats the EF cache.
Recently I looked for a new way to handle this, and I found that you can create an entire table of data as a parameter. Here's how to do it:
First, you'll need to create a custom table type, so run this in SQL Server (in my case I called the custom type "TableId"):
CREATE TYPE [dbo].[TableId] AS TABLE(
    Id [int] PRIMARY KEY
)
Then, in C#, you can create a DataTable and load it into a structured parameter that matches the type. You can add as many data rows as you want:
DataTable dt = new DataTable();
dt.Columns.Add("id", typeof(int));
This is an arbitrary list of IDs to search on. You can make the list as large as you want:
dt.Rows.Add(24262);
dt.Rows.Add(24267);
dt.Rows.Add(24264);
Create a SqlParameter using the custom table type and your data table:
SqlParameter tableParameter = new SqlParameter("@id", SqlDbType.Structured);
tableParameter.TypeName = "dbo.TableId";
tableParameter.Value = dt;
Then you can call a bit of SQL from your context that joins your existing table to the values from your table parameter. This will give you all records that match your ID list:
var items = context.Dailies.FromSqlRaw<Dailies>("SELECT * FROM dbo.Dailies d INNER JOIN @id id ON d.Daily_ID = id.id", tableParameter).AsNoTracking().ToList();

You could always partition your list of depts into smaller sets before you pass them as parameters to the IN clause generated by LINQ. See here:
Divide a large IEnumerable into smaller IEnumerable of a fix amount of item
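A rough sketch of that partitioning approach, using in-memory data in place of the CUSTOMERS table. The helper Batch, the Customer class and the batch size of 2000 are assumptions made for illustration; on .NET 6 or later the built-in Enumerable.Chunk method could replace the helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the CUSTOMERS table.
public class Customer
{
    public string Name { get; set; }
    public int DeptId { get; set; }
}

public static class BatchingSketch
{
    // Splits a sequence into consecutive lists of at most 'size' items.
    public static IEnumerable<List<T>> Batch<T>(IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var bucket = new List<T>(size);
        foreach (T item in source)
        {
            bucket.Add(item);
            if (bucket.Count == size)
            {
                yield return bucket;
                bucket = new List<T>(size);
            }
        }
        if (bucket.Count > 0) yield return bucket;
    }

    public static void Main()
    {
        List<int> depts = Enumerable.Range(1, 3000).ToList();
        var customers = new List<Customer>
        {
            new Customer { Name = "A", DeptId = 5 },
            new Customer { Name = "B", DeptId = 2500 },
            new Customer { Name = "C", DeptId = 9999 },
        };

        var names = new List<string>();
        foreach (List<int> batch in Batch(depts, 2000))
        {
            // Against a real provider, each iteration would send at most 2000 parameters.
            names.AddRange(customers.Where(c => batch.Contains(c.DeptId)).Select(c => c.Name));
        }
        Console.WriteLine(string.Join(",", names)); // prints A,B
    }
}
```

Keeping the batch size a little below 2100 leaves room for any other parameters the query might carry.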

Query particular set of records with QueryExpression

Is there a way to build a QueryExpression returning just a particular set of records?
I have the following Criteria Types:
First: returns the first n records (i.e. select top)
Last: returns the last n records
Every: returns every n'th record
For the type "First" I can use
queryExpression.TopCount = number_of_records
But I have no idea how to achieve the other types of criteria. The issue is that the data volumes are quite large, and if I first need to get all records and then filter the result (for example with LINQ) to customize the result set, I will probably have a performance issue.
If I could build the QueryExpression to select exactly what I need, the whole thing would be more efficient.
Does anybody have an idea on how to achieve this with a QueryExpression?
The system in question is Microsoft Dynamics CRM Online

For the "last N" you can reverse the sort and use TopCount again.
For the "every Nth" you might want to consider paging the Query Expression.
Say you're looking for every 10th record. What I might do would be to set my page size to 10 (query.PageInfo.Count).
To iterate through the pages as quickly as possible I'd make my "main" query return only the GUIDs. When I retrieve a new page of GUIDs, I'd grab the first GUID and get the columns I want for that record using a separate Retrieve call.
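The page-walking logic described above might look roughly like the sketch below. To keep it self-contained it does not call the Dynamics SDK: fetchPage is a hypothetical delegate standing in for a paged RetrieveMultiple call that returns only ids, and the page number is assumed to be one-based. It keeps the first item of each page, which yields records 1, n+1, 2n+1 and so on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EveryNthSketch
{
    // fetchPage(pageNumber, pageSize) is a hypothetical stand-in for a paged query.
    public static List<T> EveryNth<T>(Func<int, int, List<T>> fetchPage, int n)
    {
        var picked = new List<T>();
        for (int pageNumber = 1; ; pageNumber++)
        {
            List<T> page = fetchPage(pageNumber, n);
            if (page.Count == 0) break;   // no more records
            picked.Add(page[0]);          // keep the first record of the page
            if (page.Count < n) break;    // a short page is the last page
        }
        return picked;
    }

    public static void Main()
    {
        // In-memory stand-in for the record ids (hypothetical data).
        List<int> ids = Enumerable.Range(1, 35).ToList();
        Func<int, int, List<int>> fetchPage =
            (pageNumber, pageSize) => ids.Skip((pageNumber - 1) * pageSize).Take(pageSize).ToList();

        Console.WriteLine(string.Join(",", EveryNth(fetchPage, 10))); // prints 1,11,21,31
    }
}
```

This still issues one request per page, so it trades fewer transferred columns for more round trips.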

Last N records: this is quite simple. Order by a particular field descending and then take the top N; that returns the last n records.
// Instantiate QueryExpression QEaccount
var QEaccount = new QueryExpression("account");
QEaccount.TopCount = 5;
// Add columns to QEaccount.ColumnSet
QEaccount.ColumnSet.AddColumns("name", "ah_account_type", "accountid");
QEaccount.AddOrder("name", OrderType.Descending);
Every nth record:
Do you have any particular criteria here, for example "give me all accounts where country = Germany"?
If so, you can use a condition to return a particular set of records, as below.
// Define Condition Values
var QEaccount_address1_country = "Germany";
// Instantiate QueryExpression QEaccount
var QEaccount = new QueryExpression("account");
// Add columns to QEaccount.ColumnSet
QEaccount.ColumnSet.AddColumns("name", "ah_account_type", "accountid", "address1_country");
// Define filter QEaccount.Criteria
QEaccount.Criteria.AddCondition("address1_country", ConditionOperator.Equal, QEaccount_address1_country);

How To insert records of one database to other database in LINQ

I have two databases with tables.
I want to insert records from the first database into a table in the second database using LINQ. I have created two dbml files with two data contexts, but I am unable to code the insertion of records.
I have list of records:
using(_TimeClockDataContext)
{
    var Query = (from EditTime in _TimeClockDataContext.tblEditTimes
                 orderby EditTime.ScanDate ascending
                 select new EditTimeBO
                 {
                     EditTimeID = EditTime.EditTimeID,
                     UserID = Convert.ToInt64(EditTime.UserID),
                     ScanCardId = Convert.ToInt64(EditTime.ScanCardId),
                 }).ToList();
    return Query;
}
Now I want to insert these records into a new table, which is in _Premire2DataContext.

If you want to "copy" records from one database to another using LINQ then you need two database contexts, one for the database you are reading from, and one for the database you are writing to.
EditTime[] sourceRows;
using (var sourceContext = CreateSourceContext())
{
    sourceRows = ReadRows(sourceContext);
}
using (var destinationContext = CreateDestinationContext())
{
    WriteRows(destinationContext, sourceRows);
}
You now just need to fill in the implementations for ReadRows and WriteRows using standard LINQ to SQL code. The code for writing rows should look a bit like this.
void WriteRows(TimeClockDataContext context, EditTime[] rows)
{
    foreach (var row in rows)
    {
        // Queue each row for insertion into the destination table.
        context.tblEditTimes.InsertOnSubmit(row);
    }
    // Send all queued inserts to the database.
    context.SubmitChanges();
}
Note that as long as the schema is the same you can use the same context class and therefore the same objects. When reading records we ideally want to return the correct array type, so reading is going to look a bit like this:
EditTime[] ReadRows(TimeClockDataContext context)
{
    return (
        from EditTime in context.tblEditTimes
        orderby EditTime.ScanDate ascending
        select EditTime
    ).ToArray();
}
You can use an array or a list - it doesn't really matter. I've used an array mostly because the syntax is shorter. Note that we return the original EditTime objects rather than create new ones as this means we can add those objects directly to the second data context.
I've not compiled any of this code yet, so it might contain some typos. Also apologies if I've made some obvious errors; it's been a while since I last used LINQ to SQL.
If you have foreign keys or the second database has a different schema then things get more complicated, but the fundamental process remains the same: read from one context (using standard LINQ to SQL) and store the results somewhere, then add the rows to the second context (using standard LINQ to SQL).
Also note that this isn't necessarily going to be particularly quick. If performance is an issue then you should look into using bulk inserts in the WriteRows method, or potentially even use linked servers to do the entire thing in SQL.

How can I set a value on a Linq object during the query?

I have a simple query as shown below
var trips = from t in ctx.Trips
select t;
The problem is that I have an extra property on the Trip object that needs to be assigned, preferably without iterating through the returned IQueryable.
Does anyone know how to set the value during the query? (i.e. select t, t.property = "value")

First of all, if you never iterate through the results, your code won't run. LINQ is "lazy", meaning that it only computes the next result when you ask for it.
However, if you do really need to do what you ask, try something like this:
Trip SetProperty(Trip t)
{
    t.property = "value";
    return t;
}
var trips = from t in ctx.Trips select SetProperty(t);
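The laziness mentioned at the start of this answer can be seen in a small self-contained sketch. It uses LINQ to Objects rather than a data context, and the names DeferredSketch, Calls and Touch are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Counts how many times the projection has actually run.
    public static int Calls;

    public static int Touch(int x)
    {
        Calls++;
        return x;
    }

    public static void Main()
    {
        // Building the query does not run the projection.
        IEnumerable<int> query = new[] { 1, 2, 3 }.Select(Touch);
        Console.WriteLine(Calls); // prints 0

        // Enumerating the query (here via ToList) runs it once per element.
        List<int> results = query.ToList();
        Console.WriteLine(Calls); // prints 3
    }
}
```

The same applies to the SetProperty projection: the property is only assigned once something enumerates trips.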

This would work:
var trips = from t in ctx.Trips select new Trip
{
    // db properties
    ID = t.ID,
    Name = t.Name,
    Description = t.Description,
    // non-db properties
    SomeValue = 45,
    SomeOtherValue = GetValueFromSomewhereInCode(t.ID)
};
By projecting your database rows into 'new' Trip objects you can set properties that are not populated by LINQ to SQL, without any additional enumeration required. You can also call custom code in your projection because it executes in code, and not as part of the SQL statement sent to the database.
However, the resulting Trip objects are not enabled for change-tracking so bear that in mind (you can't make changes to properties and have them automatically updated via SubmitChanges(), because the data context doesn't know about your Trip objects).
If you need to track changes as well then you will need to enumerate the tracked Trip objects again after you have retrieved them via LINQ to SQL. This isn't really a problem though as it should be an in-memory iteration.

If you don't want to iterate through the sequence of Trips, you are better off directly updating the database using a stored procedure or parameterized query.
If you're worried that the Trips sequence will contain too many items and be slow to process, you could of course add more filtering in the where clause.
If this isn't possible, it's just another reason to do the processing of very large result sets on the database using a stored proc etc.
Otherwise the usual LINQ to SQL approach is something like:
using (var context = new DefaultDataContext())
{
    var defects = context.Defects.Where(d => d.Status == Status.Open);
    foreach (Defect defect in defects)
    {
        defect.Status = Status.Fixed;
        defect.LastModified = DateTime.Now;
    }
    context.SubmitChanges();
}
