I was wondering what would be the best way to log changes made to objects created by LINQ.
I have searched around and this is what I came up with:
using (testDBDataContext db = new testDBDataContext())
{
    Sometable table = db.Sometables.Single(x => x.id == 1);
    table.Something = txtTextboxToChangeValue.Text;

    // Snapshot of the entity as it was originally loaded from the database
    Sometable tableBeforeChanges = db.Sometables.GetOriginalEntityState(table);

    foreach (System.Data.Linq.ModifiedMemberInfo item in db.Sometables.GetModifiedMembers(table))
    {
        // Obviously writing to Debug is not what I would like to do
        System.Diagnostics.Debug.WriteLine("Old value: " + item.OriginalValue.ToString());
        System.Diagnostics.Debug.WriteLine("New value: " + item.CurrentValue.ToString());
    }
}
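What I would rather do is write each change to an audit table instead of Debug; a rough sketch of what I mean (AuditLog is a hypothetical entity I would add to the DataContext just for this, nothing that exists yet):

foreach (System.Data.Linq.ModifiedMemberInfo item in db.Sometables.GetModifiedMembers(table))
{
    // Record one audit row per changed column; values stored as strings for simplicity
    db.AuditLogs.InsertOnSubmit(new AuditLog
    {
        TableName = "Sometable",
        ColumnName = item.Member.Name,
        OldValue = item.OriginalValue == null ? null : item.OriginalValue.ToString(),
        NewValue = item.CurrentValue == null ? null : item.CurrentValue.ToString(),
        ChangedAt = DateTime.UtcNow
    });
}
db.SubmitChanges(); // persists the data change and the audit rows together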
Is this really the way to go to log changes?
Change Tracking or Change Data Capture (both SQL Server features) are the way to go; LINQ has nothing to do with it. As a general rule, the client side cannot properly track changes that occur on the server, because changes may come in through other access paths that never go through the client. As a cautionary note, setting up a complete data audit for all changes is seldom successful, because the performance penalty is usually too high.
I've searched around and I think I'm going to use DoddleAudit (doddleaudit.codeplex.com); it seems to give me what I wanted. Thanks for helping anyway!
In my application there is an issues section that allows questions and responses. Something like this (pseudocode, as the real classes are generated by Entity Framework):
class Response
{
    string Author;
    string Comment;
    DateTime Date;
}

class Issue
{
    IEnumerable<Response> Responses;
}
We have a summary page where we just want to show the last two responses. I tried a LINQ query like this:
from issue in db.Issue
let responses = from response in issue.Responses orderby response.Date select response
select new
{
    Issue = issue,
    Question = responses.FirstOrDefault(),
    Answer = responses.Skip(1).FirstOrDefault()
}
But this gives me the error that Skip can only be used on ordered collections. I checked responses and it was an IOrderedEnumerable. I thought maybe the problem was that it was an IOrderedEnumerable instead of an IOrderedQueryable, and saw that this happened because issue.Responses is a collection, so I switched the let statement to:
let responses = from response in db.Responses where response.IssueId == issue.ID // etc.
but this did not resolve the issue (though responses did become an IOrderedQueryable), so I'm not really sure why Entity Framework won't accept the Skip here. If I put the Skip inside the let statement, it works without a problem (but then I can't get the first response). The issue seems to only occur when I put a portion of the statement in a variable before using it.
The problem here is: how will/should EF translate your query into a SQL statement? There is no straightforward SQL equivalent of Skip(1). Just try to write your query in SQL and you should see what I mean.
If you want an "easy" solution, then just get all responses from the DB and identify the ones you need in code.
If you want to minimize the data being read from the DB, the solutions might range from creating a view to writing a stored procedure to changing your tables so that your tables better reflect the data model in the application.
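For illustration, a rough sketch of that "easy" in-memory route, assuming the Issue/Response shape from the question and that each issue's Responses can be loaded along with it (untested):

var summaries = db.Issue
    .AsEnumerable() // switch to LINQ to Objects, so Skip is no longer translated to SQL
    .Select(i =>
    {
        var ordered = i.Responses.OrderBy(r => r.Date).ToList();
        return new
        {
            Issue = i,
            Question = ordered.FirstOrDefault(),
            Answer = ordered.Skip(1).FirstOrDefault()
        };
    })
    .ToList();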
I'm not quite sure what's going on here, but wouldn't this be maybe a little simpler:
var theResponse = db.Issue.Select(i => new {
    Issue = i,
    Question = i.Responses.OrderBy(r => r.Date).FirstOrDefault(),
    Answer = i.Responses.OrderBy(r => r.Date).Skip(1).FirstOrDefault()
});
But this is also weird, because you are getting a full Issue object with all of its properties and whatnot, including the Response objects, and stuffing it back into an Issue property of your anonymous type right beside all of the Response objects...
I'm trying to loop through a large table and write the entries to a CSV file. If I load all the objects into memory I get an OutOfMemoryException. My Employer class is mapped with Fluent NHibernate.
Here's what I've tried:
This loads all the objects on the first iteration and crashes:
var myQueryable = DataProvider.GetEmployer(); // returns IQueryable
foreach (var emp in myQueryable)
{
    // stuff...
}
No luck here:
var myEnumerator = myQueryable.GetEnumerator();
I thought this would work:
for (int i = 0; i < myQueryable.Count(); i++)
{
    Employer e = myQueryable.ElementAt(i);
}
but am getting this exception:
Could not parse expression
'value(NHibernate.Linq.NhQueryable`1[MyProject.Model.Employer]).ElementAt(0)':
This overload of the method 'System.Linq.Queryable.ElementAt' is currently not supported
Am I missing something here? Is this even possible with nHibernate?
Thanks!
I don't think loading your entries one by one will fully solve your problem, as it just pushes things in another bad direction: a huge load on the database side and a longer response time for your C# method. I can't imagine how long it would take, and the OutOfMemoryException you're already getting indicates you have a huge number of records. The mechanism you really should use is pagination. There are various materials on the Internet about this topic, such as NHibernate 3 paging and determining the total number of rows.
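Roughly, paging with NHibernate's LINQ provider might look like this (assuming an open ISession named session, the Employer mapping from the question, and an Id property to order by; a sketch, not tested):

// requires: using NHibernate.Linq; (for session.Query<T>())
const int pageSize = 1000;
int page = 0;
while (true)
{
    // Each round trip pulls one page instead of the whole table
    var batch = session.Query<Employer>()
                       .OrderBy(e => e.Id) // a stable order is required for paging
                       .Skip(page * pageSize)
                       .Take(pageSize)
                       .ToList();
    if (batch.Count == 0)
        break;

    foreach (var emp in batch)
    {
        // write emp to the CSV file here...
    }

    session.Clear(); // release the entities from the first-level cache
    page++;
}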
Cheers!
Looks like I'm going to have to follow this article:
http://ayende.com/blog/4548/nhibernate-streaming-large-result-sets
or use straight ADO.NET for performance.
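Very roughly, the plain ADO.NET route would stream rows with a SqlDataReader; the connection string, table and column names below are made up:

using (var conn = new System.Data.SqlClient.SqlConnection(connectionString))
using (var cmd = new System.Data.SqlClient.SqlCommand("SELECT Id, Name FROM Employer", conn))
using (var writer = new System.IO.StreamWriter("employers.csv"))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        // Rows are streamed one at a time, so memory stays flat
        while (reader.Read())
        {
            writer.WriteLine("{0},{1}", reader.GetInt32(0), reader.GetString(1));
        }
    }
}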
Thanks for the help!
I have a huge table which I need to read through on a certain order and compute some aggregate statistics. The table already has a clustered index for the correct order so getting the records themselves is pretty fast. I'm trying to use LINQ to SQL to simplify the code that I need to write. The problem is that I don't want to load all the objects into memory, since the DataContext seems to keep them around -- yet trying to page them results in horrible performance problems.
Here's the breakdown. Original attempt was this:
var logs =
    (from record in dataContext.someTable
     where [index is appropriate]
     select record);

foreach (linqEntity l in logs)
{
    // Do stuff with data from l
}
This is pretty fast and streams at a good rate, but the problem is that the memory use of the application keeps going up and never stops. My guess is that the LINQ to SQL entities are being kept around in memory and not being disposed properly. So after reading Out of memory when creating a lot of objects C#, I tried the following approach. This seems to be the common Skip/Take paradigm that many people use, with the added feature of saving memory.
Note that _conn is created beforehand, and a temporary data context is created for each query, resulting in the associated entities being garbage collected.
int skipAmount = 0;
bool finished = false;
while (!finished)
{
    // Trick to allow for automatic garbage collection while iterating through the DB
    using (var tempDataContext = new MyDataContext(_conn) { CommandTimeout = 600 })
    {
        var query =
            (from record in tempDataContext.someTable
             where [index is appropriate]
             select record);

        List<linqEntity> logs = query.Skip(skipAmount).Take(BatchSize).ToList();
        if (logs.Count == 0)
        {
            finished = true;
            continue;
        }

        foreach (linqEntity l in logs)
        {
            // Do stuff with data from l
        }

        skipAmount += logs.Count;
    }
}
Now I have the desired behavior: memory usage doesn't increase at all as I stream through the data. Yet I have a far worse problem: each Skip causes the data to load more and more slowly, because the underlying query apparently makes the server walk through all the data of all previous pages. While the query runs, each page takes longer and longer to load, and I can tell that this is turning into a quadratic operation. This problem has appeared in the following posts:
LINQ Skip() Problem
LINQ2SQL select orders and skip/take
I can't seem to find a way to do this with LINQ that allows me to have limited memory use by paging data, and yet still have each page load in constant time. Is there a way to do this properly? My hunch is that there might be some way to tell the DataContext to explicitly forget about the object in the first approach above, but I can't find out how to do that.
After madly grasping at some straws, I found that the DataContext's ObjectTrackingEnabled = false could be just what the doctor ordered. It is, not surprisingly, specifically designed for a read-only case like this.
using (var readOnlyDataContext =
    new MyDataContext(_conn) { CommandTimeout = really_long, ObjectTrackingEnabled = false })
{
    var logs =
        (from record in readOnlyDataContext.someTable
         where [index is appropriate]
         select record);

    foreach (linqEntity l in logs)
    {
        // Do stuff with data from l
    }
}
The above approach keeps memory usage flat while streaming through the objects. When writing data, I can use a different DataContext that has object tracking enabled, and that seems to work okay. However, this approach does have the problem of a single SQL query that can take an hour or more to stream and complete, so if there's a way to do the paging as above without the performance hit, I'm open to other alternatives.
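For what it's worth, one alternative that avoids both the quadratic Skip cost and object tracking is keyset (seek) paging: order by the clustered key and filter on the last key seen, so each page becomes a simple range query. A rough, untested sketch, assuming the table has an ascending key column Id and reusing BatchSize from above:

int lastId = 0;
bool done = false;
while (!done)
{
    using (var tempDataContext = new MyDataContext(_conn) { CommandTimeout = 600, ObjectTrackingEnabled = false })
    {
        // Translates to a TOP (BatchSize) query with WHERE Id > @lastId, so the server seeks straight to the next page
        var page = (from record in tempDataContext.someTable
                    where record.Id > lastId
                    orderby record.Id
                    select record)
                   .Take(BatchSize)
                   .ToList();

        if (page.Count == 0)
        {
            done = true;
            continue;
        }

        foreach (var l in page)
        {
            // Do stuff with data from l
        }

        lastId = page[page.Count - 1].Id;
    }
}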
A warning about turning object tracking off: I found out that when you try to do multiple concurrent reads with the same DataContext, you don't get the usual error There is already an open DataReader associated with this Command which must be closed first. Instead, the application just goes into an infinite loop with 100% CPU usage. I'm not sure if this is a bug or a feature.
This seems like a common use case... but somehow I cannot get it working.
I'm attempting to use MongoDB as an enumeration store with unique items. I've created a collection with a byte[] Id (the unique ID) and a timestamp (a long, used for enumeration). The store is quite big (terabytes) and distributed among different servers. I am able to re-build the store from scratch currently, since I'm still in the testing phase.
What I want to do is two things:
Create a unique id for each item that I insert. This basically means that if I insert the same ID twice, MongoDB will detect this and give an error. This approach seems to work fine.
Continuously enumerate the store for new items by other processes. The approach I took was to add a second index on InsertId, which is a high-precision timestamp combined with the server id and a counter (just to make it unique and ascending).
In the best scenario this would mean that the enumerator would keep track of an index cursor for every server. From what I've learned about MongoDB query processing, I expected this behavior. However, when I try to execute the code (below) it seems to take forever to get anything.
long lastid = 0;
while (true)
{
    DateTime first = DateTime.UtcNow;
    foreach (var item in collection.FindAllAs<ContentItem>().OrderBy((a)=>(a.InsertId)).Take(100))
    {
        lastid = item.InsertId;
    }
    Console.WriteLine("Took {0:0.00} for 100", (DateTime.UtcNow - first).TotalSeconds);
}
I've read about cursors, but am unsure if they fulfill the requirements when new items are inserted into the store.
As I said, I'm not bound to any particular table structure or anything like that... the only things that matter are that I can get new items over time and that I don't get duplicate items.
-Stefan.
Somehow I figured it out... more or less...
I created the query manually and ended up with something like this:
db.documents.find({ "InsertId" : { "$gt" : NumberLong("2020374866209304106") } }).limit(10).sort({ "InsertId" : 1 });
The LINQ query I put in the question doesn't generate this query. After some digging in the code I found that it should be this LINQ query:
foreach (var item in collection.AsQueryable().Where((a)=>(a.InsertId > lastid)).OrderBy((a) => (a.InsertId)).Take(100))
The AsQueryable() seems to be the key to execute the rewriting of LINQ to MongoDB queries.
This gives results, but they still appeared to be slow (4 secs for 10 results, 30 for 100). However, when I added explain() I noticed '0 millis' for the query execution.
I stopped the process doing the bulk inserts and, tada, it works, and fast. In other words: the issues I was having were due to the locking behavior of MongoDB and to the way I interpreted the LINQ implementation. Since the former was the result of the initial bulk-filling of the data store, the problem is solved.
On the 'negative' side of the solution: I would have preferred something involving serializable cursors or the like, since this 'take' approach has to walk the B-tree over and over again. If someone has an answer for this, please let me know.
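For completeness, a polling loop built around that query might look roughly like this (same collection, ContentItem and InsertId as above; an untested sketch):

long lastid = 0;
while (true)
{
    // Fetch the next batch of items with an InsertId greater than the last one seen
    var batch = collection.AsQueryable()
                          .Where(a => a.InsertId > lastid)
                          .OrderBy(a => a.InsertId)
                          .Take(100)
                          .ToList();

    foreach (var item in batch)
    {
        lastid = item.InsertId;
        // process item here...
    }

    if (batch.Count == 0)
    {
        System.Threading.Thread.Sleep(1000); // nothing new yet, wait a bit before polling again
    }
}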
-Stefan.
I'm doing a massive import, and calling .SubmitChanges() only every 1,000 records.
Example:
var targetRecord = new Data.User() { FirstName = sourceRecord.FirstName };
db.Users.InsertOnSubmit(targetRecord);
The above is in a loop, for each record from the source database. Then, later...
if (i % 1000 == 0) { db.SubmitChanges(); }
The problem is, the collection of items to be inserted keeps getting bigger and bigger, whereas I want to clear it out after each SubmitChanges().
What I'm looking for:
if (i % 1000 == 0) { db.SubmitChanges(); db.Dispose_InsertOnSubmit_Records(); }
Something like that. I could alternatively have a list of data records stored in a local variable that I continually re-instantiate after submitting changes, but, that's more code.
Hopefully this makes sense. Thanks!
You can initialize a new DataContext after each SubmitChanges, roughly as sketched below. I'm not sure of the performance implications, but I've done something similar in the past without any problems.
The only other solution I've seen is iterating through your changes and reverting them. It seems like the former would be a much more efficient method.
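Something along these lines; MyDataContext stands in for your generated DataContext type, and sourceRecords for whatever your import loop iterates over (a sketch, not tested):

var db = new MyDataContext();
int i = 0;
foreach (var sourceRecord in sourceRecords)
{
    var targetRecord = new Data.User() { FirstName = sourceRecord.FirstName };
    db.Users.InsertOnSubmit(targetRecord);

    if (++i % 1000 == 0)
    {
        db.SubmitChanges();
        db.Dispose();             // discards the tracked inserts along with the old context
        db = new MyDataContext(); // fresh context for the next batch
    }
}
db.SubmitChanges(); // flush the remainder
db.Dispose();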
Well, massive imports and LINQ to SQL do not go together, I'm afraid. It is just not made for batch processing.
If what you are doing is just a straight import (and your example indicates that), you are much better off using SqlBulkCopy. It is orders of magnitude faster. It's also more code, but if you are looking for speed there is no better solution.
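A minimal SqlBulkCopy sketch, with made-up connection string, table and column names (sourceRecords again stands for the question's source data):

// Build an in-memory table whose columns match the destination table
var table = new System.Data.DataTable();
table.Columns.Add("FirstName", typeof(string));

foreach (var sourceRecord in sourceRecords)
{
    table.Rows.Add(sourceRecord.FirstName);
}

using (var bulk = new System.Data.SqlClient.SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.Users"; // hypothetical destination table
    bulk.ColumnMappings.Add("FirstName", "FirstName");
    bulk.BatchSize = 1000;
    bulk.WriteToServer(table);
}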