Getting weird behavior when retrieving data from Microsoft CRM using LINQ

I'm having a problem accessing the Contact entity using LINQ.
I have the two functions below.
If I run the first function and then call the second one, a lot of fields are missing from the second query's results. For example, firstname and lastname come back as null. The only fields that show up correctly in both runs are Id, ContactId and new_username.
If I run the second function on its own, I get the right data.
Any ideas what I am doing wrong?
Thanks a lot.
Here are the two functions:
public List<String> GetContactsUsernameOnly()
{
    IQueryable<String> _records = from _contactSet in _flinsafeContext.ContactSet
                                  where _contactSet.new_FAN == "username"
                                  orderby _contactSet.new_username
                                  select _contactSet.new_username;
    return _records.ToList();
}
public List<Contact> GetContacts()
{
    IQueryable<Contact> _records = from _contactSet in _flinsafeContext.ContactSet
                                   where _contactSet.new_FAN == "my-username-here"
                                   orderby _contactSet.new_username
                                   select _contactSet;
    return _records.ToList();
}

This happens because you are reusing the same CRM context (in your case _flinsafeContext) when you call both methods.
The context caches records, so the first method returns your contact but only brings back the new_username field.
The second method wants to return the whole record, but when it is called after the first one the record already exists in the context, so the context just returns that, despite only the one field being populated. It is not clever enough to lazy load the fields that have not been populated. If the second method is called first, the record doesn't yet exist in the context, so the whole record is returned.
There are 2 ways to get around this:
1) Don't reuse CRM contexts. Instead, create a new one in each method based on a singleton IOrganizationService.
2) There is a ClearChanges() method on your context. After calling it, the next query will go back to CRM and get the fields you have selected. This will also clear any unsaved creates, updates, deletes etc., so you have to be careful about what state the context is in.
As an aside, creating a new CRM context isn't an intensive operation, so it's not often worthwhile passing contexts around and reusing them. Creating the underlying organization service is the slowest bit.
This behaviour can be painful, because returning the entire record is inefficient and slow, so you WANT to select only the fields you need in each query.
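
The caching behaviour described in this answer can be illustrated without the CRM SDK. The sketch below is a toy model only, not the SDK's actual implementation: ToyRecord, ToyContext, Query, Clear and the dictionary that plays the "server" are all hypothetical names introduced for illustration. It shows just the idea that a context which hands back an already-tracked record will not refetch columns that an earlier, narrower query left out.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a CRM record: attribute values keyed by column name.
public class ToyRecord
{
    public Guid Id;
    public Dictionary<string, string> Attributes = new Dictionary<string, string>();

    // Returns null for a column that was never fetched.
    public string Get(string column)
    {
        string value;
        return Attributes.TryGetValue(column, out value) ? value : null;
    }
}

// Toy stand-in for a caching context (hypothetical, not the SDK class).
public class ToyContext
{
    private readonly Dictionary<Guid, Dictionary<string, string>> _server;
    private readonly Dictionary<Guid, ToyRecord> _tracked = new Dictionary<Guid, ToyRecord>();

    public ToyContext(Dictionary<Guid, Dictionary<string, string>> server)
    {
        _server = server;
    }

    // Fetches only the requested columns, but returns the tracked copy if one exists.
    public ToyRecord Query(Guid id, params string[] columns)
    {
        ToyRecord cached;
        if (_tracked.TryGetValue(id, out cached))
            return cached; // no refetch, even if 'columns' asks for more

        var record = new ToyRecord { Id = id };
        foreach (string column in columns)
            record.Attributes[column] = _server[id][column];
        _tracked[id] = record;
        return record;
    }

    // Forgets everything tracked so the next query goes back to the server.
    public void Clear()
    {
        _tracked.Clear();
    }
}

public static class ToyContextDemo
{
    // Builds a one-record "server" with two columns (sample data, made up).
    public static Dictionary<Guid, Dictionary<string, string>> BuildServer(Guid id)
    {
        var columns = new Dictionary<string, string>();
        columns["new_username"] = "jsmith";
        columns["firstname"] = "John";

        var server = new Dictionary<Guid, Dictionary<string, string>>();
        server[id] = columns;
        return server;
    }
}
```

With a shared ToyContext, querying new_username first and then asking for new_username plus firstname yields a record whose firstname is null. Either a fresh ToyContext over the same server or a call to Clear() on the shared one makes the second query return firstname, which mirrors the two workarounds listed in the answer.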

And here's how you return just the fields you want:
IEnumerable<ptl_billpayerapportionment> bpas = context.ptl_billpayerapportionmentSet
    .Where(bm => bm.ptl_bill.Id == billId)
    .Select(bm => new ptl_billpayerapportionment()
    {
        Id = bm.Id,
        ptl_contact = bm.ptl_contact
    });
This ensures that a much smaller SQL statement is executed against the context, as Id and ptl_contact are the only two fields being returned. But as Ben says above, further retrievals of the same entity in the same context will return nulls for fields not included in the initial select (as per the OP's question).
For bonus points, using IEnumerable and creating a new, lightweight entity gives you access to the usual LINQ methods, e.g. .Any(), .Sum() etc. The CRM SDK doesn't like using them against var datasets, apparently.

Related

Queryable Linq Query Differences In Entity Framework

I have a very simple many to many table in entity framework connecting my approvals to my transactions (shown below).
I am trying to do a query inside the approval object to count the amount of transactions on the approval, which should be relatively easy.
If I do something like this then it works super fast.
int count;
EntitiesContainer dbContext = new EntitiesContainer ();
var aCnt = from a in dbContext.Approvals
where a.id == id
select a.Transactions.Count;
count = aCnt.First();
However when I do this
count = Transactions.Count;
or this
count = Transactions.AsQueryable<Transaction>().Count();
it's exceedingly slow. I have traced the SQL running on the server, and it does indeed seem to be loading all the transactions instead of just doing a COUNT query on the collection of Transactions.
Can anyone explain to me why?
Additional :
Here is how the EF model looks in regards to these two classes
UPDATE :
Thanks for all the responses, I believe where I was going wrong was to believe that the collections attached to the Approval object would execute as IQueryable. I'm going to have to execute the count against the dbContext object.
Thanks everyone.

var aCnt = from a in dbContext.Approvals
where a.id == id
select a.Transactions.Count;
EF translates this query itself, so the query above is compiled into a SELECT COUNT over the transactions.
By contrast,
count = Transactions.AsQueryable<Transaction>().Count();
count = Transactions.Count;
these select all the records from the transactions and then compute the count in memory.

When you access the a.Transactions property, then you load the list of transactions (lazy loading). If you want to get the Count only, then use something like this:
dbContext.Transactions.Where(t => t.Approvals.Any(ap => ap.Id == a.Id)).Count();
where a is given Approval.

Your first method allows the counting to take place on the database server level. It will ask the database not to return the records, but to return the amount of records found. This is the most efficient method.
This is not to say that other methods can't work as efficiently, but with the other two lines, you are not making it clear in the first place that you are retrieving transactions from a join on Approvals. Instead, in the other two lines, you take the Transactions collection just by itself and do a count on that, basically forcing the collection to be filled so it can be counted.

Your first snippet causes a query to be executed on the database server. It works that way because the IQueryable instance is of type ObjectQuery, provided by the Entity Framework, which performs the necessary translation to SQL and then executes it.
The second snippet illustrates working with IEnumerable instances. Count() works on them by, in worst case, enumerating the entire collection.
In the third snippet you attempt to make the IEnumerable an IQueryable again. But the Enumerable.AsQueryable method has no way of knowing that the IEnumerable it is getting "came" from Entity Framework. The best it can do is to wrap the IEnumerable in a EnumerableQuery instance which simply dynamically compiles the expression trees given to all LINQ query operators and executes them in memory.
If you need the count to be calculated by the database server, you can either formulate the requisite query manually (that is, write what you already did in snippet one), or use the method CreateSourceQuery available to you if you're not using Code First. Note that it will really be executed on the database server, so if you have modified the collection and have not yet saved changes, the result will be different to what would be returned by calling Count directly.
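
The point about Enumerable.AsQueryable can be checked without Entity Framework. The following is a minimal sketch: CountingSequence and AsQueryableDemo are hypothetical names made up for this illustration, and the sequence simply records how many items were pulled from it. It shows that wrapping an in-memory sequence with AsQueryable() produces an EnumerableQuery and that Count() still walks every element.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// A sequence that records how many items have been pulled from it.
// It deliberately does not implement ICollection, so Count() has to enumerate.
public class CountingSequence : IEnumerable<int>
{
    private readonly int _length;
    public int ItemsEnumerated { get; private set; }

    public CountingSequence(int length)
    {
        _length = length;
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = 0; i < _length; i++)
        {
            ItemsEnumerated++;
            yield return i;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class AsQueryableDemo
{
    // Returns the runtime type name of the wrapper that AsQueryable() produces.
    public static string WrapperTypeName(IEnumerable<int> source)
    {
        return source.AsQueryable().GetType().Name;
    }

    // Counts through the IQueryable wrapper; the work still happens in memory.
    public static int CountViaQueryable(IEnumerable<int> source)
    {
        return source.AsQueryable().Count();
    }
}
```

Calling AsQueryableDemo.CountViaQueryable on a CountingSequence and then reading ItemsEnumerated shows that the whole sequence was enumerated to produce the count, which is the in-memory behaviour the answer describes for the third snippet.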

Data binding issues

I have started using the Entity Framework quite recently and think it is very good but I am a bit confused over a couple of things.
I'm writing a winforms-application where you are presented a list of persons in a list and if you click a particular person more information about that person appears in textboxes. Nothing fancy so far, but you are supposed to be able to edit the data about the person so data binding is nice here.
It works just fine but I am bit confused about what is the correct way to do it. First I did like this:
var query = _context.Person.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = query.ToList();
Then I read a bit and tried:
var local = _context.Person.Local;
IEnumerable<Customer> enumerable = local.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = enumerable.ToList();
This one seems to work fine as well.
Then I saw that someone suggested something like:
_context.Person.Load();
this.personBindingSource.DataSource = _context.Person.Local.ToBindingList();
I am a bit confused right now on what approach is the correct one and what is the difference between these three? Anyone can help me?

Depends on What You Want to do
I honestly never liked getting this answer, because it seems to be the answer to everything, but in this case it is really the only answer I can give.
What the Local Property Does
Local gives you a reference to the elements currently being tracked by the data context that have not been marked as deleted. Essentially, you are asking for the objects that the context has already loaded into memory. You can read more about it here: DbSet(TEntity).Local Property
What The Load Method Does
The Load method runs the query immediately and loads the results into the context. You can read more about it here:
DbExtensions.Load Method
What ToBindingList Does
Basically this creates a two way binding between the entities in the collection and the UI. That is, any changes made in the UI should be automatically reflected in the related entities within this collection. You can read more about it using the following links:
BindingList(T) Class, DbExtensions.ToBindingList()
What Each Of Your Examples Do
First Example
var query = _context.Person.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = query.ToList();
Under the covers the following is going on:
1) A query is created from your LINQ expression, to be processed by the server.
2) The content is fetched from the database and a list is created around it.
Here you are grabbing any people with an Id of chosenId from the database, loading them into your application, and creating a list.
Second Example
var local = _context.Person.Local;
IEnumerable<Customer> enumerable = local.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = enumerable.ToList();
Under the covers the following is going on:
1) Getting all of the objects currently tracked by the context that have been hit by a query but have not been marked as deleted.
2) Getting all of the elements in local memory whose Id is chosenId.
Here you are grabbing any people that have already been loaded into the context. This is not going to get all of your persisted data items; you must have hit them in other queries.
Third Example
_context.Person.Load();
this.personBindingSource.DataSource = _context.Person.Local.ToBindingList();
Under the covers the following is going on:
1) You load all of the people into local memory.
2) You create a binding list (which allows two way data binding between the objects and the UI).
3) You bind the list to the UI element personBindingSource.
Unless you want to load all of the items into memory, this is probably not what you want to do. If the dataset ever grows large enough, it will slow your program down and could possibly cause it to not work correctly (unlikely in the case of Person in most scenarios, but possible).
When You Should Use Them
First: when you want to just get the data that matches your query into local memory and don't need a link between the UI and the entities.
Second: when you have already run a query in the context and need to use the results someplace else, but don't need to rerun the query since the data is currently in memory.
Third: when you want to load all of the elements of an entity set into memory and create a two way data binding between them and a control.
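
The difference between a ToList() snapshot and a binding list can be sketched with plain .NET types, without Entity Framework or WinForms. This is only an illustration under stated assumptions: PersonRow and BindingListDemo are hypothetical names, and the sketch uses System.ComponentModel.BindingList<T> directly rather than the list that ToBindingList() returns. It shows that a BindingList raises ListChanged when rows are added (which is what lets a bound control refresh), while a ToList() result is a detached copy.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;

// Hypothetical row type standing in for the Person entity.
public class PersonRow
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class BindingListDemo
{
    // Returns how many ListChanged notifications were raised while adding one row.
    public static int NotificationsForOneAdd()
    {
        var rows = new BindingList<PersonRow>();
        rows.Add(new PersonRow { Id = 1, Name = "Ada" });

        // Subscribe after the initial row so only the next Add is counted.
        int notifications = 0;
        rows.ListChanged += (sender, e) => notifications++;

        rows.Add(new PersonRow { Id = 2, Name = "Grace" });
        return notifications;
    }

    // A ToList() snapshot is detached: later additions to the source do not reach it.
    public static int SnapshotCountAfterSourceGrows()
    {
        var source = new List<PersonRow>();
        source.Add(new PersonRow { Id = 1, Name = "Ada" });
        List<PersonRow> snapshot = source.Where(p => p.Id == 1).ToList();

        source.Add(new PersonRow { Id = 1, Name = "Duplicate" });
        return snapshot.Count;
    }
}
```

In the first and second examples from the question, the control is bound to a snapshot like the one in SnapshotCountAfterSourceGrows; in the third, it is bound to a list that notifies listeners the way NotificationsForOneAdd demonstrates.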

The only difference between the first and second examples is that one is deferred and one is not. The third one is a little different, as it creates a two way binding between your database and the control's data source, effectively making the data context track changes to the list for you (adds and deletes).
In all your examples, so long as you keep your data context open, changes to the objects themselves will be tracked.
As for which way is correct, that depends on your application. Pick the approach that works best for what you are trying to accomplish.

Entity Framework: Linq query finds entries by original data but returns reference to changed entry

I just spent some days now to find a bug caused by some strange behavior of the Entity Framework (version 4.4.0.0). For an explanation I wrote a small test program. At the end you'll find some questions I have about that.
Declaration
Here we have a class "Test" which represents our test dataset. It only has an ID (primary key) and a "value" property. In our TestContext we implement a DbSet Tests, which shall handle our "Test" objects as a database table.
public class Test
{
    public int ID { get; set; }
    public int value { get; set; }
}
public class TestContext : DbContext
{
    public DbSet<Test> Tests { get; set; }
}
Initialization
Now, we remove any (if existent) entries from our "Tests" table and add our one and only "Test" object. It has ID=1 (primary key) and value=10.
// Create a new DBContext...
TestContext db = new TestContext();
// Remove all entries...
foreach (Test t in db.Tests) db.Tests.Remove(t);
db.SaveChanges();
// Add one test entry...
db.Tests.Add(new Test { ID = 1, value = 10 });
db.SaveChanges();
Tests
Finally, we run some tests. We select our entry by its original value (=10) and we change the "value" of our entry to 4711. BUT we do not call db.SaveChanges()!
// Find our entry by its value (=10)
var result = from r in db.Tests
             where r.value == 10
             select r;
Test t2 = result.FirstOrDefault();
// change its value from 10 to 4711...
t2.value = 4711;
Now, we try to find the (old) entry by the original value (=10) and do some tests on the results of that.
// now we try to select it via its old value (==10)
var result2 = from r in db.Tests
              where r.value == 10
              select r;
// Did we get it?
if (result2.FirstOrDefault() != null && result2.FirstOrDefault().value == 4711)
{
    Console.WriteLine("We found the changed entry by it's old value...");
}
When running the program we'll actually see "We found the changed entry by it's old value...". That means we have run a query for r.value == 10 and found something. This would be acceptable, BUT we receive the already changed object (which does not fulfil value == 10)!
Note: You'll get an empty result set for "where r.value == 4711".
In some further testing, we find out, that the Entity Framework always hands out a reference to the same object. If we change the value in one reference, it's changed in the other one too. Well, that's ok... but one should know it happens.
Test t3 = result2.FirstOrDefault();
t3.value = 42;
if (t2.value == 42)
{
Console.WriteLine("Seems as if we have a reference to the same object...");
}
Summary
When running a LINQ query on the same database context (without calling SaveChanges()) we will receive references to the same object if it has the same primary key. The strange thing is: even if we change an object, we will find it (only!) by its old values. But we will receive a reference to the already changed object. This means that the restriction in our query (value == 10) is not guaranteed to hold for any entries that we changed since our last call of SaveChanges().
Questions
Of course, I'll probably have to live with some effects here. But I would like to avoid calling SaveChanges() after every little change, especially because I would like to use it for transaction handling, to be able to revert some changes if something goes wrong.
I would be glad if anyone could answer one or even both of the following questions:
1) Is there a possibility to change the behavior of Entity Framework so that it works as if I were communicating with a normal database during a transaction? If so, how do I do it?
2) Where is a good resource for "How to use the context of Entity Framework?" which answers questions like "What can I rely on?" and "How to choose the scope of my DbContext object"?
EDIT #1
Richard just explained how to access the original (unchanged) database values. While this is valuable and helpful I've got the urge to clarify the goal ...
Let's have a look at what happens when using SQL. We setup a table "Tests":
CREATE TABLE Tests (ID INT, value INT, PRIMARY KEY(ID));
INSERT INTO Tests (ID, value) VALUES (1,10);
Then we have a transaction that first looks for entries whose value is 10. After this, we update the value of these entries and look again for those entries. In SQL we already work on the updated version, so we will not find any results for our second query. Finally we do a "rollback", so the value of our entry should be 10 again...
START TRANSACTION;
SELECT ID, value FROM Tests WHERE value=10; {1 result}
UPDATE Tests SET value=4711 WHERE ID=1; {our update}
SELECT ID, value FROM Tests WHERE value=10; {no result, as value is now 4711}
ROLLBACK; { just for testing transactions... }
I would like to have exactly this behavior for the Entity Framework (EF), where db.SaveChanges() is equivalent to "COMMIT", all LINQ queries are equivalent to "SELECT" statements, and every write access to an entity is just like an "UPDATE". I don't care when EF actually issues the UPDATE statement, but it should behave the same way as using a SQL database directly. Of course, if SaveChanges() is called and returns successfully, it should be guaranteed that all data was persisted correctly.
Note: Yes, I could call db.SaveChanges() before every query, but then I would lose the possibility of a "rollback".
Regards,
Stefan

As you've discovered, Entity Framework tracks the entities it has loaded, and returns the same reference for each query which accesses the same entity. This means that the data returned from your query matches the current in-memory version of the data, and not necessarily the data in the database.
If you need to access the database values, you have several options:
Use a new DbContext to load the entity;
Use .AsNoTracking() to load an un-tracked copy of your entity;
Use context.Entry(entity).GetDatabaseValues() to load the property values from the database;
If you want to overwrite the properties of the local entity with the values from the database, you'll need to call context.Entry(entity).Reload().
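
The tracking behaviour described in this answer can be modelled in a few lines of plain C#. This is a toy model under stated assumptions, not how Entity Framework is implemented: Row, ToyTrackingContext and its Query method (including the tracking flag, which loosely plays the role of an un-tracked query) are hypothetical names. The filter is evaluated against the stored rows, but the result is swapped for the already-tracked instance with the same key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with a key and one value column.
public class Row
{
    public int ID;
    public int Value;
}

// Toy tracking context: filters run against stored rows, but an already-tracked
// instance with the same key is returned instead of a fresh copy.
public class ToyTrackingContext
{
    private readonly List<Row> _database;
    private readonly Dictionary<int, Row> _tracked = new Dictionary<int, Row>();

    public ToyTrackingContext(List<Row> database)
    {
        _database = database;
    }

    public List<Row> Query(Func<Row, bool> filter, bool tracking = true)
    {
        var results = new List<Row>();
        foreach (Row stored in _database.Where(filter))
        {
            // Materialize a copy of the stored values.
            var copy = new Row { ID = stored.ID, Value = stored.Value };
            if (!tracking)
            {
                results.Add(copy); // un-tracked: always reflects stored values
                continue;
            }

            Row existing;
            if (!_tracked.TryGetValue(stored.ID, out existing))
            {
                existing = copy;
                _tracked[stored.ID] = existing;
            }
            results.Add(existing); // tracked instance wins over stored values
        }
        return results;
    }
}
```

Reproducing the question's steps against this model (query for Value == 10, set the result's Value to 4711, query for Value == 10 again) returns the same object reference with Value 4711, a query for Value == 4711 returns nothing, and a query with tracking set to false returns a copy that still holds 10.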

You can wrap your updates in a transaction to achieve the same result as in your SQL example:
// TransactionScope lives in the System.Transactions namespace
using (var transaction = new TransactionScope())
{
    var result = from r in db.Tests
                 where r.value == 10
                 select r;
    Test t2 = result.FirstOrDefault();
    // change its value from 10 to 4711...
    t2.value = 4711;
    // send UPDATE to the database but don't commit the transaction
    db.SaveChanges();
    var result2 = from r in db.Tests
                  where r.value == 10
                  select r;
    // should not return anything
    Trace.Assert(result2.Count() == 0);
    // This way you can commit the transaction:
    // transaction.Complete();
    // but we do nothing and after this line, the transaction is rolled back
}
For more information see http://msdn.microsoft.com/en-us/library/bb896325(v=vs.100).aspx

I think your problem is the expression tree. The Entity Framework sends your changes to the database when you call SaveChanges(), as you already mentioned. When manipulating something within the context, the changes do not happen on the database; they happen in memory. Only when you call SaveChanges() are your actions translated to, let's say, SQL.
When you do a simple select, the database is queried at the moment you access the data. So if you have not called SaveChanges(), it finds the row in the database with (SQL) SELECT * FROM Test WHERE VALUE = 10 but interprets from the expression tree that it has to be value == 4711.
The transaction in EF is happening in your storage. Everything you do before SaveChanges() is your transaction. For further information, read: MSDN
A really good resource for information about EF, which is probably up to date, is the Microsoft Data Developer Center.

How do I make Entity Framework not join tables?

Ok so I am testing out EF once again for performance and I just want a simple result back from my database.
Example
var jobsList = from j in mf.Jobs
where j.UserID == 1001 select new { Job = j };
This unfortunately joins my User object to this list, which I don't want EF to do. How do I tell EF not to join just because there is a relationship. Basically I just want a simple row from that table.
Or do I need to use a different type of retrieval. I am still using the basic type of database retrieval below and I feel there are better ways to handle db work by now.
SqlConnection myconnection = new SqlConnection();
Edit
To put it more clearly: instead of only getting the following,
Job.JobID
Job.UserID
//Extra properties
I Get
Job.JobID
Job.UserID
Job.User
//Extra properties
That User object easily consumes way more memory than is needed, plus I don't need it.
My Solution
So I am still not believing in EF too much and here is why. I turned off LazyLoading and turned it on and didn't really notice too much of a performance difference there. I then compared the amount of data that my SqlConnection type method uses compared to my EF method.
I get back the exact same result set and here are the performance differences.
For my Entity Framework method I get back a list of jobs.
MyDataEntities mf = new MyDataEntities(); // 4MB for the connection...really?
mf.ContextOptions.LazyLoadingEnabled = false;
// 9MB for the list below
var test = from j in mf.Jobs
where j.UserID == 1031
select j;
foreach (Job job in test) {
Console.WriteLine(job.JobID);
}
For my SqlConnection method that executes a Stored Proc and returns a result set.
//356 KB for the connection and the EXACT same list.
List<MyCustomDataSource.Jobs> myJobs = MyCustomDataSource.Jobs.GetJobs(1031);
I fully understand that Entity Framework is doing way way more than a standard SqlConnection, but why all this hype if it is going to take at a minimum of 25x more memory for a result set. Just doesn't seem worth it.
My solution is not to go with EF after all.

The User property is part of the Job class but won't be loaded until you access it (lazy loading). So it is not actually "joined".
If you only want the two specified columns you can write
var jobsList = from j in mf.Jobs
               where j.UserID == 1001
               select new {
                   j.JobID,
                   j.UserID
               };

The most probable reason for this behavior is that you have the LazyLoadingEnabled property set to true.
If this is the case, the User isn't retrieved in the original query. But if you try to access this property, even through an inspection while debugging, it will be loaded from the database. But only if you try to access it.
You can check this by opening SQL Server Profiler and seeing what commands are being sent to the DB.
Your code is not using eager loading or explicit loading, so this must be the reason.
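
The load-on-first-access idea can be sketched with the standard Lazy<T> type. This is an analogy under stated assumptions, not Entity Framework's own proxy mechanism: UserInfo, JobRow, LazyLoadDemo and the LoadUser loader are hypothetical names, and the counter merely stands in for an extra database round trip.

```csharp
using System;

// Hypothetical related entity.
public class UserInfo
{
    public int UserID;
    public string Name;
}

// Hypothetical Job-like row whose User is only loaded when first read.
public class JobRow
{
    private readonly Lazy<UserInfo> _user;

    public int JobID { get; private set; }
    public int UserID { get; private set; }

    public JobRow(int jobId, int userId, Func<int, UserInfo> loadUser)
    {
        JobID = jobId;
        UserID = userId;
        // The loader is stored here, not called; it runs on first access to User.
        _user = new Lazy<UserInfo>(() => loadUser(userId));
    }

    public UserInfo User
    {
        get { return _user.Value; }
    }
}

public static class LazyLoadDemo
{
    // Counts how many times the loader ran.
    public static int LoadCount;

    // Stand-in for the extra query a lazy load would issue.
    public static UserInfo LoadUser(int userId)
    {
        LoadCount++;
        return new UserInfo { UserID = userId, Name = "user-" + userId };
    }
}
```

Constructing a JobRow leaves LoadCount untouched; the loader runs once on the first read of User (for example when a debugger inspects the property) and later reads reuse the loaded value.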

I think that EF doesn't know that you want only one result. Try something like this.
Job jobsItem = mf.Jobs.Single(j => j.UserID == 1001);
If you don't want to use lambdas...
Job JobItem = (from j in mf.Jobs where j.UserID == 1001 select j).Single();
I don't have a compiler at hand right now, so I hope the syntax is right. You can use var instead of Job for your variable if you prefer. It has no effect, but I think Job is more readable in this case.

User actually is not attached to context until you access User property of Job. Turn off Lazy Loading if you want to get a null for User.

Entity Framework does not support lazy loading of properties. However, it has table-splitting
Emphasized the properties. Of course, Entity Framework supports lazy loading of rows

How do I refactor a common LINQ subquery into a method?

I'm struggling to come up with the right words to summarize this problem, so any input on what I can add to clarify it would be appreciated.
The basic scenario is this: I have a basic CMS (with pages, users, etc.). Page is a LINQ data object, which maps directly to a Page table.
I've added a method to the Page class called GetUserPermissions. This method accepts a UserId and returns a non-LINQ class called PagePermissionSet, which describes what the user can do. PagePermissionSet is calculated via a LINQ query.
Now I want to get the list of Pages which a user has access to. The ideal implementation would be as follows:
from page in mDataContext.Pages
where page.GetUserPermissions(userId).CanView
select page
This fails, stating that there is no SQL equivalent for GetUserPermissions (which is reasonable enough), or after some refactoring of the method, that the CanView member can't be invoked on an IQueryable.
Attempt two was to add a method to the DataContext, which returns all of the permissions for each Page/User as an IQueryable:
IQueryable<PagePermissionSet> GetAllPagePermissions()
I then tried to join to this result set:
IQueryable<Page> GetAllPages(Guid? userId) {
    var permissions = mDataContext.GetAllPagePermissions();
    var pages =
        from page in mDataContext.WikiPages
        join permission in permissions on page.FileName equals permission.PageName
        where permission.CanView && permission.UserId == userId
        select page;
    return pages;
}
This produces the error: "The member 'WikiTome.Library.Model.PagePermissionSet.PageName' has no supported translation to SQL."
PagePermissionSet is pretty much just a shell holding data from the select clause in GetUserPermissions, and is initialized as follows:
select new PagePermissionSet(pageName, userName, canView, canEdit, canRename)
With all of that out of the way... How can I reuse the LINQ query in Page.GetUserPermissions in another query? I definitely don't want to duplicate the code, and I would prefer not to translate it to SQL for inclusion as a view at this point.

Maybe you need a compiled query?
http://msdn.microsoft.com/en-us/library/bb399335.aspx

You have a few options.
1) The quick and dirty solution is to use AsEnumerable() with your query to bring the entire Pages table down to the client side then operate on it. For small tables this should be fine, however for large tables it will be inefficient and lead to performance issues depending on the size. If you choose to use this be mindful of how it actually operates. This means updating your code to:
from page in mDataContext.Pages.AsEnumerable()
where page.GetUserPermissions(userId).CanView
select page
2) A more involved solution would be to create a stored procedure or UDF on the SQL server that you would call from the DataContext and pass parameters to. Take a look at Scott Gu's blog post: LINQ to SQL (Part 6 - Retrieving Data Using Stored Procedures).
You could then write something like:
mDataContext.GetUserPermissions(userId)
All the logic you do in your code would be written in SQL, and you would return the viewable pages for the given user. This bypasses the use of the PagePermissionSet properties that have no supported translation to SQL.

I was able to solve the bulk of this problem today.
The error "The member 'WikiTome.Library.Model.PagePermissionSet.PageName' has no supported translation to SQL." was caused by HOW I was initializing my PagePermissionSet objects.
I had been initializing them using a constructor, like this:
select new PagePermissionSet(pageName, userName, canView, canEdit, canRename)
However, in order for LINQ to properly track the properties, it needs to be initialized like this:
select new PagePermissionSet { PageName=pageName, UserName = userName, CanView = canView, CanEdit = canEdit, CanRename = canRename }
With this change in place, I can create a method which returns an IQueryable<PagePermissionSet>, and then join my query to that (as in the second example).
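
One way to see why the two forms differ is to look at the expression trees the compiler builds for them. The sketch below is illustrative only: PermissionSketch is a cut-down, hypothetical stand-in for PagePermissionSet, and BoundMembers is a made-up helper, not part of LINQ to SQL. It shows that a constructor call yields a NewExpression that carries only arguments, while an object initializer yields a MemberInitExpression with an explicit binding per property, which gives a query provider something to map property names to.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Cut-down, hypothetical stand-in for PagePermissionSet.
public class PermissionSketch
{
    public string PageName { get; set; }
    public bool CanView { get; set; }

    public PermissionSketch() { }

    public PermissionSketch(string pageName, bool canView)
    {
        PageName = pageName;
        CanView = canView;
    }
}

public static class ProjectionShapeDemo
{
    // Constructor call: the tree records arguments, not which property each one fills.
    public static readonly Expression<Func<string, bool, PermissionSketch>> ViaConstructor =
        (name, canView) => new PermissionSketch(name, canView);

    // Object initializer: the tree records an explicit binding per property.
    public static readonly Expression<Func<string, bool, PermissionSketch>> ViaInitializer =
        (name, canView) => new PermissionSketch { PageName = name, CanView = canView };

    // Lists the property names recorded as bindings in an initializer projection.
    public static string[] BoundMembers(LambdaExpression projection)
    {
        var init = projection.Body as MemberInitExpression;
        if (init == null)
            return new string[0]; // constructor form: no member bindings to inspect
        return init.Bindings.Select(b => b.Member.Name).ToArray();
    }
}
```

BoundMembers returns an empty array for ViaConstructor and the names PageName and CanView for ViaInitializer, which is consistent with the observation that only the initializer form let the later join on PageName be translated.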
