I have an IEnumerable of a POCO type containing around 80,000 rows
and a db table (L2E/EF4) containing the subset of rows where there was an error or a difference (about 5,000 rows, but often repeated, giving about 150 distinct entries).
The following code gets the distinct VSACode values "in error" and then attempts to update the complete result set, updating just the rows that match, but it doesn't work!
var vsaCodes = (from g in db.GLDIFFLs
                select g.VSACode)
               .Distinct();

foreach (var code in vsaCodes)
{
    var hasDifference = results.Where(r => r.VSACode == code);
    foreach (var diff in hasDifference)
        diff.Difference = true;
}

var i = results.Count(r => r.Difference == true);
After this code, i = 0
I've also tried:
foreach (var code in vsaCodes)
{
    results.Where(r => r.VSACode == code).Select(r => { r.Difference = true; return r; }).ToList();
}
How can I update the "results" to set only the matching Difference property?
Assuming results is just a query (you haven't shown it), it will be evaluated every time you iterate over it. If that query creates new objects each time, you won't see the updates. If it returns references to the same objects, you would.
If you change results to be a materialized query result - e.g. by adding ToList() to the end - then iterating over results won't issue a new query, and you'll see your changes.
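To make the difference concrete, here is a minimal sketch. The Row class and the BuildQuery method are hypothetical stand-ins (the question does not show how results is built); the sketch assumes results is a deferred query whose projection creates new objects on every enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the POCO in the question.
public class Row
{
    public string VSACode { get; set; }
    public bool Difference { get; set; }
}

public static class DeferredDemo
{
    // A deferred query: the projection creates NEW Row objects on every enumeration.
    public static IEnumerable<Row> BuildQuery()
    {
        var codes = new[] { "A", "B", "A" };
        return codes.Select(c => new Row { VSACode = c });
    }

    // Flags the rows matching the code, then counts the flagged rows.
    // 'results' is enumerated twice: once by the foreach, once by Count.
    public static int FlagAndCount(IEnumerable<Row> results, string code)
    {
        foreach (var row in results.Where(r => r.VSACode == code))
            row.Difference = true;
        return results.Count(r => r.Difference);
    }

    public static void Main()
    {
        // Deferred: the second enumeration rebuilds the rows, so the flags are lost.
        Console.WriteLine(FlagAndCount(BuildQuery(), "A"));          // 0
        // Materialized: both passes see the same objects.
        Console.WriteLine(FlagAndCount(BuildQuery().ToList(), "A")); // 2
    }
}
```

If the query instead returned references to existing objects (for example, a plain Where over an in-memory list), both calls would print 2.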
I had the same kind of error some time ago. The problem is that LINQ queries are usually deferred: they are not executed at the point where you appear to be calling them, but only when the results are enumerated.
Quotation from "Pro LINQ Language Integrated Query in C# 2010":
"Notice that even though we called the query only once, the results of
the enumeration are different for each of the enumerations. This is
further evidence that the query is deferred. If it were not, the
results of both enumerations would be the same. This could be a
benefit or detriment. If you do not want this to happen, use one of
the conversion operators that do not return an IEnumerable so that
the query is not deferred, such as ToArray, ToList, ToDictionary, or
ToLookup, to create a different data structure with cached results
that will not change if the data source changes."
Here you have a good explanation with examples of it:
http://blogs.msdn.com/b/charlie/archive/2007/12/09/deferred-execution.aspx
Parsing words pretty closely on @jonskeet's answer...
If your query is simply a filter and the underlying source objects are updated, the query will be reevaluated and may exclude these objects based on the filter condition in which case your query results will change on subsequent enumerations but the underlying objects will still have been updated.
The key is a lack of a projection to a new type as far as updating and persisting the changed objects.
ToList() is the usual solution to this issue, and it will solve the problem if there is a projection to a new type, but things get cloudy in the event that your query filters but does not project. Updates made through the query still affect the original source objects, given that everything references the same objects.
Again, parsing words but these edge cases can trip you up.
public class Widget
{
public string Name { get; set; }
}
var widgets1 = new[]
{
new Widget { Name = "Red", },
new Widget { Name = "Green", },
new Widget { Name = "Blue", },
new Widget { Name = "Black", },
};
// adding ToList() will result in 'static' query result but
// updates to the objects will still affect the source objects
var query1 = widgets1
.Where(i => i.Name.StartsWith("B"))
//.ToList()
;
foreach (var widget in query1)
{
widget.Name = "Yellow";
}
// produces no output unless you uncomment the ToList() above
// query1 is reevaluated and filters out "Yellow", which does not start with "B"
foreach (var name in query1)
Console.WriteLine(name.Name);
// produces Red, Green, Yellow, Yellow
// the underlying widgets were updated
foreach (var name in widgets1)
Console.WriteLine(name.Name);
Related
I am trying to use EF 5 to apply multiple search criteria to a result set (in this case, for a library catalog search). Here is the relevant code:
public IQueryable<LibraryResource> GetSearchResults(string SearchCriteria, int? limit = null)
{
List<string> criteria = SearchCriteria.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
IQueryable<LibraryResource> allResults = context.LibraryResources.Include("Type").Where(r => r.AuditInfo.DeletedAt == null);
foreach (string criterion in criteria)
{
allResults = allResults.Where(r => (r.Title.Contains(criterion) || r.Keywords.Contains(criterion) || r.Author.Contains(criterion) || r.Comments.Contains(criterion)));
}
allResults = allResults.OrderBy(r => r.Title);
if (limit.HasValue) allResults = allResults.Take(limit.Value);
return allResults;
}
Sample SearchCriteria = "history era"
For some reason, only the last criterion gets applied. For instance, in the sample above, all the books with "era" in the title, author, keywords and comments are returned, without also filtering by "history". I stepped through the code, and the loop executes twice, with the appropriate criterion each time. Can you see something I can't? Thanks!
You have fallen victim to modifying the value of a closed-over variable.
Change the code to this:
foreach (string criterion in criteria)
{
var crit = criterion;
allResults = allResults.Where(/* use crit here, not criterion */);
}
The problem here is that while you are building up the query your filtering expressions close over the variable criterion, in effect pulling it in scope at the point where the query is evaluated. However, at that time criterion will only have one value (the last one it happened to loop over) so all but the last of your filters will in fact be turned into duplicates of the last one.
Creating a local copy of criterion and referencing that inside the expressions corrects the problem because crit is a different local variable each time, with lifetime that does not extend from one iteration of the loop to the next one.
For more details you might want to read Is there a reason for C#'s reuse of the variable in a foreach?, where it is also mentioned that C# 5.0 will take a breaking change that applies to this scenario: the lifetime of the loop variable criterion is going to change, making this code work correctly without an extra local.
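The effect can be reproduced without a database. The sketch below uses LINQ to Objects and a hypothetical Titles array instead of the EF query; to imitate the pre-C# 5 foreach behaviour on any compiler, the first method deliberately declares one variable outside the loop so that every lambda captures the same storage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Hypothetical sample data standing in for the Title column.
    static readonly string[] Titles = { "history of an era", "modern era", "history" };

    // Every lambda captures the SAME variable, as the foreach loop variable did before C# 5.
    public static int SharedVariable(string[] criteria)
    {
        IEnumerable<string> query = Titles;
        string criterion = null;
        foreach (var c in criteria)
        {
            criterion = c;
            query = query.Where(t => t.Contains(criterion));
        }
        // The filters only run here, when criterion already holds the last value.
        return query.Count();
    }

    // Each lambda captures its own copy, as in the corrected code above.
    public static int LocalCopy(string[] criteria)
    {
        IEnumerable<string> query = Titles;
        foreach (var c in criteria)
        {
            var crit = c; // a fresh variable per iteration
            query = query.Where(t => t.Contains(crit));
        }
        return query.Count();
    }

    public static void Main()
    {
        var criteria = new[] { "history", "era" };
        Console.WriteLine(SharedVariable(criteria)); // 2: both filters test for "era"
        Console.WriteLine(LocalCopy(criteria));      // 1: "history" AND "era" are applied
    }
}
```

On C# 5 and later the extra local in LocalCopy is redundant for foreach, but it is harmless and still needed for a for loop, whose loop variable is shared across iterations.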
I have a list of Band Titles, and I wish to attach a SelectList with each one of them, depending on the BandID.
So first I am getting the list:-
List<BandQuestionTitles> bandQuestTitles = viewModel.PopulateBandQuestionTitles();
and then I have a loop on the BandQuestionTitles to populate a ViewData[var] from a SelectList:-
foreach (var bandQuestTitleItem in bandQuestTitles)
{
//populate the dropdownlist
string strViewDataString = bandQuestTitleItem.BandQuestTitlesText + "Data";
ViewData[strViewDataString] = new SelectList(viewModel.bandQuestionList.Where(p => p.BandQuestTitleID == bandQuestTitleItem.BandQuestTitlesID), "BandQuestID", "BandQuestText");
}
However, for some reason, although I am correctly getting the 7 ViewData[""] entries, I am always getting the same SelectList.
When I hard code it, it works fine :-
ViewData["PersonalData"] = new SelectList(viewModel.bandQuestionList.Where(p => p.BandQuestTitleID == 1), "BandQuestID", "BandQuestText");
ViewData["BusinessData"] = new SelectList(viewModel.bandQuestionList.Where(p => p.BandQuestTitleID == 2), "BandQuestID", "BandQuestText");
What am I doing wrong in the loop?
Thanks for your help and time
Looks like a problem with your LINQ not being executed when you think it is.
Try this:
new SelectList(viewModel.bandQuestionList.Where(p => p.BandQuestTitleID == bandQuestTitleItem.BandQuestTitlesID).ToList(), "BandQuestID", "BandQuestText");
Relevant article from MS:
http://msdn.microsoft.com/en-us/library/bb738633.aspx
In a query that returns a sequence of values, the query variable itself never holds the query results and only stores the query commands. Execution of the query is deferred until the query variable is iterated over in a foreach or For Each loop. This is known as deferred execution; that is, query execution occurs some time after the query is constructed.
In other words, by the time the queries are executed, bandQuestTitleItem will refer to the last (7th) item in your collection, so every query filters on that item's BandQuestTitlesID.
Adding the .ToList() will cause the queries to execute immediately.
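As a small illustration of deferred versus immediate execution, the following sketch uses LINQ to Objects on a made-up list of integers (not the view model from the question). The deferred query sees an element added after the query was built; the ToList() snapshot does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ImmediateDemo
{
    // Returns { count seen by the deferred query, count held by the snapshot }.
    public static int[] Run()
    {
        var source = new List<int> { 1, 2, 3 };

        IEnumerable<int> deferred = source.Where(n => n > 1);   // nothing runs yet
        List<int> snapshot = source.Where(n => n > 1).ToList(); // runs now: 2, 3

        source.Add(4); // changes the source after both queries were written

        // The deferred query runs here and sees 2, 3, 4; the snapshot still holds 2, 3.
        return new[] { deferred.Count(), snapshot.Count };
    }

    public static void Main()
    {
        var counts = Run();
        Console.WriteLine(counts[0]); // 3
        Console.WriteLine(counts[1]); // 2
    }
}
```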
I have two tables: Transactions and TransactionAgents. TransactionAgents has a foreign key to Transactions called TransactionID. Pretty standard.
I also have this code:
BrokerManagerDataContext db = new BrokerManagerDataContext();
var transactions = from t in db.Transactions
where t.SellingPrice != 0
select t;
var taAgents = from ta in db.TransactionAgents
select ta;
foreach (var transaction in transactions)
{
foreach(var agent in taAgents)
{
agent.AgentCommission = ((transaction.CommissionPercent / 100) * (agent.CommissionPercent / 100) * transaction.SellingPrice) - agent.BrokerageSplit;
}
}
dataGridView1.DataSource = taAgents;
Basically, a TransactionAgent has a property/column named AgentCommission, which is null for all TransactionAgents in my database.
My goal is to perform the math you see in the foreach(var agent in taAgents) to patch up the value for each agent so that it isn't null.
Oddly, when I run this code and set a breakpoint on agent.AgentCommission = (formula), it shows the value being calculated for AgentCommission and the object being updated, but when the data is displayed in my datagrid (used only for testing), the calculated value does not appear.
So, to me, it seems that the property isn't being permanently set on the object. What's more, if I persist this newly updated object back to the database with an update, I doubt the calculated AgentCommission will be set there.
Without having my table set up the same way, is there anyone that can look at the code and see why I am not retaining the property's value?
IEnumerable<T>s do not guarantee that updated values will persist across enumerations. For instance, a List will return the same set of objects on every iteration, so if you update a property, it will be saved across iterations. However, many other implementations of IEnumerables return a new set of objects each time, so any changes made will not persist.
If you need to store and update the results, pull the IEnumerable<T> down to a List<T> using .ToList() or project it into a new IEnumerable<T> using .Select() with the changes applied.
To specifically apply that to your code, it would look like this:
var transactions = (from t in db.Transactions
                    where t.SellingPrice != 0
                    select t).ToList();

var taAgents = (from ta in db.TransactionAgents
                select ta).ToList();

foreach (var transaction in transactions)
{
    foreach (var agent in taAgents)
    {
        agent.AgentCommission = ((transaction.CommissionPercent / 100) * (agent.CommissionPercent / 100) * transaction.SellingPrice) - agent.BrokerageSplit;
    }
}

dataGridView1.DataSource = taAgents;
Specifically, the problem is that each time you access the IEnumerable, it enumerates over the collection. In this case, the collection is a call to the database. In the first part, you're getting the values from the database and updating them. In the second part, you're getting the values from the database again and setting that as the datasource (or, pedantically, you're setting the enumerator as the datasource, and then that is getting the values from the database).
Use .ToList() or similar to keep the results in memory, and access the same collection every time.
Assuming you are using LINQ to SQL, if ObjectTrackingEnabled is false on the DataContext, then the objects will be constructed anew every time the query is run. Otherwise, you would be getting the same object instances each time and your changes would survive. However, as others have shown, instead of having the query execute multiple times, cache the results in a list. Not only will you get what you want working, you'll also have fewer database round trips.
I found that I had to locate the item in the list that I wanted to modify, extract the copy, modify the copy (by incrementing its count property), remove the original from the list and add the modified copy.
var x = stats.Where(d => d.word == s).FirstOrDefault();
var statCount = stats.IndexOf(x);
x.count++;
stats.RemoveAt(statCount);
stats.Add(x);
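The snippet above reads as if the list elements are value types, although the element type is not shown. Under that assumption, a sketch of an alternative is to write the modified copy back to the same index, which avoids the RemoveAt/Add pair and keeps the list order. WordStat is a hypothetical struct invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type; the original element type is not shown.
public struct WordStat
{
    public string word;
    public int count;
}

public static class StructListDemo
{
    // Increments the count for word s and returns the new count, or -1 if s is absent.
    public static int IncrementInPlace(List<WordStat> stats, string s)
    {
        int index = stats.FindIndex(d => d.word == s);
        if (index < 0) return -1;

        WordStat x = stats[index]; // the indexer returns a COPY of the struct
        x.count++;                 // this changes only the copy
        stats[index] = x;          // write the copy back to the same slot
        return stats[index].count;
    }

    public static void Main()
    {
        var stats = new List<WordStat>
        {
            new WordStat { word = "linq", count = 1 },
            new WordStat { word = "list", count = 4 },
        };
        Console.WriteLine(IncrementInPlace(stats, "list")); // 5
        Console.WriteLine(stats[1].word);                    // list (position unchanged)
    }
}
```

If the element type is a class rather than a struct, none of this is needed: x.count++ on the object returned by FirstOrDefault() already updates the instance held in the list.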
It is helpful to rewrite your LINQ expression using lambdas so that we can consider the code in more explicit terms.
//Original code from question
var taAgents = from ta in db.TransactionAgents
select ta;
//Rewritten to explicitly call attention to what Select() is actually doing
var taAgents = db.TransactionAgents.Select(ta => new TransactionAgents(/*database row's data*/));
In the rewritten code, we can clearly see that Select() is constructing a new object based on each row returned from the database. What's more, this object construction occurs every time the IEnumerable taAgents is iterated through.
So, explained more concretely, if there are 5 TransactionAgents rows in the database, in the following example, the TransactionAgents() constructor is called a total of 10 times.
// Assume there are 5 rows in the TransactionAgents table
var taAgents = from ta in db.TransactionAgents
select ta;
//foreach will iterate through the IEnumerable, thus calling the TransactionAgents() constructor 5 times
foreach(var ta in taAgents)
{
Console.WriteLine($"first iteration through taAgents - element {ta}");
}
// these first 5 TransactionAgents objects are no longer referenced and are eligible for garbage collection
//foreach will iterate through the IEnumerable, thus calling the TransactionAgents() constructor 5 MORE times
foreach(var ta in taAgents)
{
Console.WriteLine($"second iteration through taAgents - element {ta}");
}
// these second 5 TransactionAgents objects are likewise no longer referenced and are eligible for garbage collection
As we can see, all 10 of our TransactionAgents objects were created by the lambda in our Select() method, and do not exist outside of the scope of the foreach statement.
Hopefully I can explain this so that it makes sense, but I'm trying to get a list of objects out of a master list using a specific and complex (complex to me, at least) set of criteria.
I have a class called TableInfo that exposes a List of ForeignKeyInfo. ForeignKeyInfo has a string property (among others) called Table. I need to do some sequential processing using my TableInfo objects, but only work with the TableInfo objects I haven't yet processed. To keep track of which TableInfo objects have already been processed, I have a List which stores the name of each table after its processing has completed.
I want to loop until all of the items in my TableInfo collection appear in my processed list. For each iteration of the loop, I should be processing all of the TableInfo items where all of the ForeignKeyInfo.Table strings appear in my processed List.
Here's how I've written it in "standard" looping code:
while (processed.Count != _tables.Count)
{
    List<TableInfo> thisIteration = new List<TableInfo>();
    foreach (TableInfo tab in _tables)
    {
        bool allFound = true;
        foreach (ForeignKeyInfo fk in tab.ForeignKeys)
        {
            allFound = allFound && processed.Contains(fk.Table);
        }
        if (allFound && !processed.Contains(tab.Name))
        {
            thisIteration.Add(tab);
        }
    }
    //now do processing using thisIteration list
    //variable, "thisIteration", is what I'd like to replace with the result from LINQ
}
This should do it:
var thisIteration = _tables.Where(t => !processed.Contains(t.Name)
                                    && t.ForeignKeys
                                        .All(fk => processed.Contains(fk.Table)));
I'm assuming you just need to iterate over the thisIteration collection, in which case leaving it as an IEnumerable is fine. If you need it to be a list, you can just put in a .ToList() call at the end.
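For completeness, here is a sketch of the whole loop built around that query. The TableInfo and ForeignKeyInfo classes are minimal stand-ins based on the description in the question, and the check for an empty batch is an addition of mine to keep the loop from spinning forever on circular or missing dependencies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the types described in the question.
public class ForeignKeyInfo
{
    public string Table { get; set; }
}

public class TableInfo
{
    public string Name { get; set; }
    public List<ForeignKeyInfo> ForeignKeys { get; set; } = new List<ForeignKeyInfo>();
}

public static class DependencyOrderDemo
{
    // Returns table names in an order where every table follows the tables it references.
    public static List<string> ProcessInOrder(List<TableInfo> tables)
    {
        var processed = new List<string>();
        while (processed.Count != tables.Count)
        {
            // ToList() snapshots this batch before 'processed' is modified below.
            var thisIteration = tables
                .Where(t => !processed.Contains(t.Name)
                         && t.ForeignKeys.All(fk => processed.Contains(fk.Table)))
                .ToList();

            // Not in the original: guard against an endless loop.
            if (thisIteration.Count == 0)
                throw new InvalidOperationException("Circular or missing dependency.");

            // Real processing would happen here; this sketch only records the names.
            processed.AddRange(thisIteration.Select(t => t.Name));
        }
        return processed;
    }

    static TableInfo Table(string name, params string[] references)
    {
        return new TableInfo
        {
            Name = name,
            ForeignKeys = references.Select(r => new ForeignKeyInfo { Table = r }).ToList()
        };
    }

    public static List<TableInfo> SampleTables()
    {
        return new List<TableInfo>
        {
            Table("Customer"),
            Table("Order", "Customer"),
            Table("OrderLine", "Order", "Product"),
            Table("Product"),
        };
    }

    public static void Main()
    {
        // Prints: Customer,Product,Order,OrderLine
        Console.WriteLine(string.Join(",", ProcessInOrder(SampleTables())));
    }
}
```

Here the ToList() call matters: the loop body changes processed, which the query reads, so a lazily evaluated batch could shift while it is being consumed.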
I'm not really sure what you're trying to do here. However, you can convert the body of your loop into the following LINQ query, if that makes things simpler...
List<TableInfo> thisIteration = (from tab in _tables
let allFound = tab.ForeignKeys.Aggregate(true, (current, fk) => current && processed.Contains(fk.Table))
where allFound && !processed.Contains(tab.Name)
select tab).ToList();
I'm fetching data from all 3 tables at once to avoid network latency. Fetching the data is pretty fast, but looping through the results takes a lot of time:
Int32[] arr = { 1 };
var query = from a in arr
select new
{
Basket = from b in ent.Basket
where b.SUPERBASKETID == parentId
select new
{
Basket = b,
ObjectTypeId = 0,
firstObjectId = "-1",
},
BasketImage = from b in ent.Image
where b.BASKETID == parentId
select new
{
Image = b,
ObjectTypeId = 1,
CheckedOutBy = b.CHECKEDOUTBY,
firstObjectId = b.FIRSTOBJECTID,
ParentBasket = (from parentBasket in ent.Basket
where parentBasket.ID == b.BASKETID
select parentBasket).ToList()[0],
},
BasketFile = from b in ent.BasketFile
where b.BASKETID == parentId
select new
{
BasketFile = b,
ObjectTypeId = 2,
CheckedOutBy = b.CHECKEDOUTBY,
firstObjectId = b.FIRSTOBJECTID,
ParentBasket = (from parentBasket in ent.Basket
where parentBasket.ID == b.BASKETID
select parentBasket),
}
};
//Exception handling
var mixedElements = query.First();
ICollection<BasketItem> basketItems = new Collection<BasketItem>();
//Here 15 millis has been used
//only 6 elements were found
if (mixedElements.Basket.Count() > 0)
{
foreach (var mixedBasket in mixedElements.Basket){}
}
if (mixedElements.BasketFile.Count() > 0)
{
foreach (var mixedBasketFile in mixedElements.BasketFile){}
}
if (mixedElements.BasketImage.Count() > 0)
{
foreach (var mixedBasketImage in mixedElements.BasketImage){}
}
//the empty loops take 811 millis!!
Why are you bothering to check the counts before the foreach statements? If there are no results, the foreach will just finish immediately.
Your queries are actually all being deferred - they'll be executed as and when you ask for the data. Don't forget that your outermost query is a LINQ to Objects query: it's just returning the result of calling ent.Basket.Where(...).Select(...) etc... which doesn't actually execute the query.
Your plan to do all three queries in one go isn't actually working. However, by asking for the count separately, you may actually be executing each database query twice - once just getting the count and once for the results.
I strongly suggest that you get rid of the "optimizations" in this code which are making it much more complicated and slower than just writing the simplest code you can.
I don't know of any way of getting LINQ to SQL (or LINQ to EF) to execute multiple queries in a single call - but this approach certainly isn't going to do it.
One other minor hint which is irrelevant in this case, but can be useful in LINQ to Objects - if you want to find out whether there's any data in a collection, just use Any() instead of Count() > 0 - that way it can stop as soon as it's found anything.
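A quick LINQ to Objects sketch of that last point. The Numbers iterator and the Pulled counter are made up for the demonstration; they simply record how many elements the sequence had to produce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVsCountDemo
{
    // Number of elements the sequence has produced since the last reset.
    public static int Pulled;

    // A lazy sequence of 1000 items that records every element it yields.
    static IEnumerable<int> Numbers()
    {
        for (int i = 0; i < 1000; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    public static int PulledByAny()
    {
        Pulled = 0;
        Numbers().Any();   // stops after the first element
        return Pulled;
    }

    public static int PulledByCount()
    {
        Pulled = 0;
        Numbers().Count(); // has to walk the whole sequence
        return Pulled;
    }

    public static void Main()
    {
        Console.WriteLine(PulledByAny());   // 1
        Console.WriteLine(PulledByCount()); // 1000
    }
}
```

For sources that implement ICollection<T>, such as List<T>, Count() reads the stored count directly, so the gap only shows up with lazily produced sequences like this one.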
You're using IEnumerable in the foreach loop. Implementations only have to prepare data when it's asked for. In this way, I'd suggest that the above code is accessing your data lazily -- that is, only when you enumerate the items (which actually happens when you call Count().)
Put a System.Diagnostics.Stopwatch around the call to Count() and see whether that's taking the bulk of the time you're seeing.
I can't comment further here because you don't specify the type of ent in your code sample.