Do you have any good advice on using EF in a multithreaded program?
I have 2 layers:
an EF layer to read/write into my database
a multithreaded service which uses my entities (read/write) and performs some computations (I use the Task Parallel Library in the framework)
How can I synchronize my object contexts across threads?
Do you know a good pattern to make it work?
Good advice is - just don't :-) EF barely manages to survive one thread - the nature of the beast.
If you absolutely have to use it, make the lightest DTOs you can, close the ObjectContext (OC) as soon as you have the data, repack the data, spawn your threads just to do calculations and nothing else, wait until they are done, then create another OC, write the data back to the DB, reconcile it, etc.
If another "main" thread (the one that spawns N calculation threads via TPL) needs to know when some other thread is done, then rather than firing an event, just set a flag in the other thread and let its code check the flag in its loop and react by creating a new OC and then reconciling data if it has to.
If your situation is simpler you can adapt this. The key is that you only set a flag and let the other thread react when it's ready. That means it's in a stable state, has finished a round of whatever it was doing and can do things without risking race conditions. Reset the flag (an int) with Interlocked operations and keep some timing data to make sure that your threads don't react again within some time T, otherwise they can spend their lifetime just querying the DB.
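The flag-plus-timing idea described above could be sketched roughly as follows. This is only an illustration: the class name RefreshSignal and its members are hypothetical, and the minimum interval plays the role of the time T mentioned in the answer.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: one instance per worker thread that wants to be notified.
sealed class RefreshSignal
{
    private int _pending;                         // 0 = nothing to do, 1 = refresh requested
    private readonly Stopwatch _clock = Stopwatch.StartNew();
    private readonly long _minIntervalMs;
    private long _lastRefreshMs;                  // touched only by the owning thread

    public RefreshSignal(TimeSpan minInterval)
    {
        _minIntervalMs = (long)minInterval.TotalMilliseconds;
        _lastRefreshMs = -_minIntervalMs;         // so the very first request is not throttled
    }

    // Called from any thread: it only sets the flag and returns.
    public void Request()
    {
        Interlocked.Exchange(ref _pending, 1);
    }

    // Called by the owning thread at a safe point in its loop.
    // Returns true when the caller should create a new context and reconcile.
    public bool TryConsume()
    {
        long now = _clock.ElapsedMilliseconds;
        if (now - _lastRefreshMs < _minIntervalMs)
            return false;                         // too soon; the flag stays set for later
        if (Interlocked.Exchange(ref _pending, 0) == 0)
            return false;                         // nothing was requested
        _lastRefreshMs = now;
        return true;
    }
}
```

The owning thread would call TryConsume() once per loop iteration and only then open a fresh context, so a burst of requests collapses into at most one refresh per interval.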
This is how I implemented it in my scenario.
var processing = new ConcurrentQueue<int>();
// Possibly multi-threaded enumeration; only records that are not already queued are processed.
// Note: Contains followed by Enqueue is not atomic, and TryDequeue removes the head
// of the queue, which is not necessarily the Id of the item just processed.
Parallel.ForEach(dataEnumeration, dataItem =>
{
    if (!processing.Contains(dataItem.Id))
    {
        processing.Enqueue(dataItem.Id);
        var myEntityResource = new EntityResource();
        myEntityResource.EntityRecords.Add(new EntityRecord
        {
            Field1 = "Value1",
            Field2 = "Value2"
        });
        Exception saveError;
        SaveContext(myEntityResource, out saveError);
        int itemIdProcessed;
        processing.TryDequeue(out itemIdProcessed);
    }
});
public void RefreshContext(DbContext context)
{
var modifiedEntries = context.ChangeTracker.Entries()
.Where(e => e.State == EntityState.Modified || e.State == EntityState.Deleted);
foreach (var modifiedEntry in modifiedEntries)
{
modifiedEntry.Reload();
}
}
public bool SaveContext(DbContext context,out Exception error, bool reloadContextFirst = true)
{
error = null;
var saved = false;
try
{
if (reloadContextFirst)
this.RefreshContext(context);
context.SaveChanges();
saved = true;
}
catch (DbUpdateConcurrencyException) // DbContext.SaveChanges wraps OptimisticConcurrencyException in this type
{
//retry saving on concurrency error
if (reloadContextFirst)
this.RefreshContext(context);
context.SaveChanges();
saved = true;
}
catch (DbEntityValidationException dbValEx)
{
var outputLines = new StringBuilder();
foreach (var eve in dbValEx.EntityValidationErrors)
{
outputLines.AppendFormat("{0}: Entity of type \"{1}\" in state \"{2}\" has the following validation errors:",
DateTime.Now, eve.Entry.Entity.GetType().Name, eve.Entry.State);
foreach (var ve in eve.ValidationErrors)
{
outputLines.AppendFormat("- Property: \"{0}\", Error: \"{1}\"", ve.PropertyName, ve.ErrorMessage);
}
}
throw new DbEntityValidationException(string.Format("Validation errors\r\n{0}", outputLines.ToString()), dbValEx);
}
catch (Exception ex)
{
error = new Exception("Error saving changes to the database.", ex);
}
return saved;
}
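The Contains/Enqueue pair in the Parallel.ForEach snippet above is a check-then-act sequence, so two threads can both pass the check for the same Id. One possible alternative, which is a swapped-in technique and not what the answer uses, is to let ConcurrentDictionary.TryAdd do the check and the insert atomically. The InFlightSet name is made up for this sketch.

```csharp
using System.Collections.Concurrent;

// Hypothetical helper: tracks which record Ids are currently being processed.
sealed class InFlightSet
{
    private readonly ConcurrentDictionary<int, byte> _ids =
        new ConcurrentDictionary<int, byte>();

    // Returns true for exactly one caller per id until Release(id) is called.
    public bool TryClaim(int id)
    {
        return _ids.TryAdd(id, 0);
    }

    // Removes this specific id (not "whatever is at the head of a queue").
    public void Release(int id)
    {
        byte ignored;
        _ids.TryRemove(id, out ignored);
    }
}
```

Inside the loop body this would be used as: if TryClaim(dataItem.Id) fails, skip the item; otherwise do the work in a try block and call Release(dataItem.Id) in finally.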
I think Craig might be right about your application not needing threads, but you might look into using ConcurrencyCheck in your models to make sure you don't overwrite your changes.
I don't know how much of your application is actually number crunching. If speed is the motivation for using multi-threading then it might pay off to take a step back and gather data about where the bottleneck is.
In a lot of cases I have found that the limiting factor in applications using a database server is the speed of the I/O system for your storage. For example, the speed of the hard disk drive(s) and their configuration can have a huge impact. A single 7,200 RPM hard disk drive can handle about 60 transactions per second (ballpark figure depending on many factors).
So my suggestion would be to first measure and find out where the bottleneck is. Chances are you don't even need threads. That would make the code substantially easier to maintain, and in all likelihood the quality would be much higher.
"How can I synchronize my object contexts in each thread ?"
This is going to be tough. First of all, stored procedures or DB queries can have parallel execution plans. So if you also have parallelism on the object context, you have to manually make sure that you have sufficient isolation, but only just enough, so that you don't hold locks so long that you cause deadlocks.
So I would say: don't do it.
But that might not be the answer you want. So can you explain a bit more about what you want to achieve with this multithreading? Is it more compute-bound or I/O-bound? If it involves long-running I/O-bound operations, then look at the APM (Asynchronous Programming Model) material by Jeff Richter.
I think your question is more about synchronization between threads, and EF is irrelevant here. If I understand correctly, you want to notify threads from one group when the main thread has performed some operation, in this case the "SaveChanges()" operation. The threads here are like client-server applications, where one thread is a server and the other threads are clients, and you want the client threads to react to server activity.
As someone noticed you probably do not need threads, but let's leave it as it is.
There is no fear of deadlocks as long as you use a separate OC per thread.
I also assume that your client threads are long-running threads in some kind of loop. If you want your code to be executed on the client thread, you can't use C# events.
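class ClientThread
{
    // volatile so that a write from the main thread is seen by the loop
    public volatile bool SomethingHasChanged;

    public void MainLoop()
    {
        while (true)
        {
            if (SomethingHasChanged)
            {
                // reset first, so a change signalled during Refresh() is not lost
                SomethingHasChanged = false;
                Refresh();
            }
            // your business logic here
        }
    }

    void Refresh()
    {
        // create a new OC and reload or reconcile data here
    }
}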
Now the question is how you will set the flag in all your client-threads? You could keep references to client threads in your main thread and loop through them and set all flags to true.
Back when I used EF, I simply had one ObjectContext, to which I synchronized all access.
This isn't ideal. Your database layer would effectively be singlethreaded. But, it did keep it thread-safe in a multithreaded environment. In my case, the heavy computation was not in the database code at all - this was a game server, so game logic was of course the primary resource hog. So, I didn't have any particular need for a multithreaded DB layer.
I have a Windows service that every 5 seconds checks for work. It uses System.Threading.Timer for handling the check and processing and Monitor.TryEnter to make sure only one thread is checking for work.
Just assume it has to be this way as the following code is part of 8 other workers that are created by the service and each worker has its own specific type of work it needs to check for.
readonly object _workCheckLocker = new object();
public Timer PollingTimer { get; private set; }
void InitializeTimer()
{
if (PollingTimer == null)
PollingTimer = new Timer(PollingTimerCallback, null, 0, 5000);
else
PollingTimer.Change(0, 5000);
Details.TimerIsRunning = true;
}
void PollingTimerCallback(object state)
{
if (!Details.StillGettingWork)
{
if (Monitor.TryEnter(_workCheckLocker, 500))
{
try
{
CheckForWork();
}
catch (Exception ex)
{
Log.Error(EnvironmentName + " -- CheckForWork failed. " + ex);
}
finally
{
Monitor.Exit(_workCheckLocker);
Details.StillGettingWork = false;
}
}
}
else
{
Log.Standard("Continuing to get work.");
}
}
void CheckForWork()
{
Details.StillGettingWork = true;
//Hit web server to grab work.
//Log Processing
//Process Work
}
Now here's the problem:
The code above is allowing 2 Timer threads to get into the CheckForWork() method. I honestly don't understand how this is possible, but I have experienced this with multiple clients where this software is running.
The logs I got today when I pushed some work showed that it checked for work twice and I had 2 threads independently trying to process which kept causing the work to fail.
Processing 0-3978DF84-EB3E-47F4-8E78-E41E3BD0880E.xml for Update Request. - at 09/14 10:15:501255801
Stopping environments for Update request - at 09/14 10:15:501255801
Processing 0-3978DF84-EB3E-47F4-8E78-E41E3BD0880E.xml for Update Request. - at 09/14 10:15:501255801
Unloaded AppDomain - at 09/14 10:15:10:15:501255801
Stopping environments for Update request - at 09/14 10:15:501255801
AppDomain is already unloaded - at 09/14 10:15:501255801
=== Starting Update Process === - at 09/14 10:15:513756009
Downloading File X - at 09/14 10:15:525631183
Downloading File Y - at 09/14 10:15:525631183
=== Starting Update Process === - at 09/14 10:15:525787359
Downloading File X - at 09/14 10:15:525787359
Downloading File Y - at 09/14 10:15:525787359
The logs are written asynchronously and are queued, so don't dig too deep on the fact that the times match exactly, I just wanted to point out what I saw in the logs to show that I had 2 threads hit a section of code that I believe should have never been allowed. (The log and times are real though, just sanitized messages)
Eventually what happens is that the 2 threads start downloading a big enough file where one ends up getting access denied on the file and causes the whole update to fail.
How can the above code actually allow this? I've experienced this problem last year when I had a lock instead of Monitor and assumed it was just because the Timer eventually started to get offset enough due to the lock blocking that I was getting timer threads stacked i.e. one blocked for 5 seconds and went through right as the Timer was triggering another callback and they both somehow made it in. That's why I went with the Monitor.TryEnter option so I wouldn't just keep stacking timer threads.
Any clue? In all cases where I have tried to solve this issue before, the System.Threading.Timer has been the one constant and I think it's the root cause, but I don't understand why.
I can see in the log you've provided that you got an AppDomain restart there, is that correct? If so, are you sure that you have one and only one object for your service during the AppDomain restart? I think that during the restart not all the threads are stopped at the same time, and some of them could carry on polling the work queue, so two different threads in different AppDomains got the same Id for work.
You could probably fix this by marking your _workCheckLocker with the static keyword, like this:
static object _workCheckLocker;
and introduce a static constructor for your class that initializes this field (with inline initialization you could face some more complicated problems). However, I'm not sure this will be enough in your case: during an AppDomain restart the static class will be reloaded too. As I understand it, this is not an option for you.
Maybe you could introduce a static dictionary instead of a single object for your workers, so you can check the Ids of the documents in process.
Another approach is to handle the Stopping event for your service, which would probably be raised during the AppDomain restart; there you could trigger a CancellationToken and use it to stop all the work in such circumstances.
Also, as @fernando.reyes said, you could use a heavier lock structure, a mutex, for synchronization, but this will degrade your performance.
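For reference, the mutex option mentioned above might look something like the sketch below. It assumes a named System.Threading.Mutex, which is visible across AppDomains and processes on the same machine; the CrossDomainGate class and the mutex name passed in are hypothetical. It would not have helped with the stored-procedure cause described later in this thread.

```csharp
using System;
using System.Threading;

static class CrossDomainGate
{
    // Runs 'work' only if no other thread, AppDomain or process currently holds
    // the mutex with the given name. Returns false when the work was skipped.
    public static bool TryRun(string mutexName, Action work)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            bool acquired;
            try
            {
                acquired = mutex.WaitOne(0);      // do not block, just test
            }
            catch (AbandonedMutexException)
            {
                acquired = true;                  // previous owner died; we own it now
            }
            if (!acquired)
                return false;
            try
            {
                work();
                return true;
            }
            finally
            {
                mutex.ReleaseMutex();             // must be released on the acquiring thread
            }
        }
    }
}
```

A timer callback could then wrap its check in CrossDomainGate.TryRun with a per-worker name. A mutex is owned by a thread, so acquiring and releasing inside one call, as here, keeps that rule satisfied on thread-pool threads.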
TL;DR
Production stored procedure has not been updated in years. Workers were getting work they should have never gotten and so multiple workers were processing update requests.
I was able to finally find the time to properly set myself up locally to act as a production client through Visual Studio. Although, I wasn't able to reproduce it like I've experienced, I did accidentally stumble upon the issue.
Those who assumed that multiple workers were picking up the work were indeed correct, and that's something that should never have been able to happen, as each worker is unique in the work it does and requests.
It turns out that in our production environment, the stored procedure to retrieve work based on the work type has not been updated in years (yes, years!) of deploys. Anything that checked for work automatically got updates which meant when the Update worker and worker Foo checked at the same time, they both ended up with the same work.
Thankfully, the fix is database side and not a client update.
I have a Web API project that is accessed by a very large number of users. When it is accessed from the frontend (an HTML5 web page) and a user does something like updating or retrieving data, the backend app (the Web API) writes to a single log file (a .log file whose content is JSON).
The problem is that, when it is accessed by many users at once, the frontend becomes unresponsive (always loading). The cause is the writing of the log file (a single log file written on behalf of a very large number of users). I have heard that a multithreading technique can solve the problem, but I don't know which one. Can anyone help me, please?
Here is my code (sorry for any typos; I am using my smartphone and the mobile version of Stack Overflow):
public static void JsonInputLogging<T>(T m, string methodName)
{
MemoryStream ms = new MemoryStream();
DataContractJsonSerializer ser = new
DataContractJsonSerializer(typeof(T));
ser.WriteObject(ms, m);
string jsonString = Encoding.UTF8.GetString(ms.ToArray());
ms.Close();
logging("MethodName: " + methodName + Environment.NewLine + jsonString.ToString());
}
public static void logging (string message)
{
string pathLogFile = @"D:\jsoninput.log"; // verbatim string, so the backslash is not treated as an escape
FileInfo jsonInputFile = new FileInfo(pathLogFile);
if (File.Exists(jsonInputFile.ToString()))
{
long fileLength = jsonInputFile.Length;
if (fileLength > 1000000)
{
File.Move(pathLogFile, pathLogFile.Replace(*some new path*));
}
}
File.AppendAllText(pathLogFile, *some text*);
}
You have to understand some internals here first. For each [x] users, ASP.Net will use a single worker process. One worker process holds multiple threads. If you're using multiple instances on the cloud, it's even worse because then you also have multiple server instances (I assume this ain't the case).
A few problems here:
You have multiple users and therefore multiple threads.
Multiple threads can deadlock each other writing the files.
You have multiple appdomains and therefore multiple processes.
Multiple processes can lock out each other
Opening and locking files
File.Open has a few flags for locking. You can basically lock files exclusively per process, which is a good idea in this case. A two-step approach with Exists and Open won't help, because in between another worker process might do something. Basically the idea is to call Open with write-exclusive access and, if it fails, try again with another filename.
This basically solves the issue with multiple processes.
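As a rough sketch of the "open exclusively, otherwise try another filename" idea: the ExclusiveLog class and the numeric-suffix naming scheme are assumptions made for illustration, not something the answer prescribes.

```csharp
using System;
using System.IO;

static class ExclusiveLog
{
    // Tries basePath, then basePath.1, basePath.2, ... and returns the first
    // stream that could be opened for appending with no sharing allowed.
    public static FileStream OpenExclusive(string basePath, int maxAttempts = 10)
    {
        for (int i = 0; i < maxAttempts; i++)
        {
            string path = i == 0 ? basePath : basePath + "." + i;
            try
            {
                // FileShare.None: no other handle may open the file while we hold it.
                return new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.None);
            }
            catch (IOException)
            {
                // Most likely held by another worker process; try the next name.
            }
        }
        throw new IOException("No log file could be opened exclusively.");
    }
}
```

Each worker process would open its stream once and keep it for its lifetime. FileShare.Read instead of FileShare.None would still keep other writers out while letting someone read the log as it grows.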
Writing from multiple threads
File access is single threaded. Instead of writing your stuff to a file, you might want to use a separate thread to do the file access, and multiple threads that tell the thing to write.
If you have more log requests than you can handle, you're in the wrong zone either way. In that case, the best way to handle it for logging IMO is to simply drop the data. In other words, make the logger somewhat lossy to make life better for your users. You can use the queue for that as well.
I usually use a ConcurrentQueue for this and a separate thread that works away all the logged data.
This is basically how to do this:
// Starts the worker thread that gets rid of the queue:
internal void Start()
{
loggingWorker = new Thread(LogHandler)
{
Name = "Logging worker thread",
IsBackground = true,
Priority = ThreadPriority.BelowNormal
};
loggingWorker.Start();
}
We also need something to do the actual work and some variables that are shared:
private Thread loggingWorker = null;
private int loggingWorkerState = 0;
private ManualResetEventSlim waiter = new ManualResetEventSlim();
private ConcurrentQueue<Tuple<LogMessageHandler, string>> queue =
new ConcurrentQueue<Tuple<LogMessageHandler, string>>();
private void LogHandler(object o)
{
Interlocked.Exchange(ref loggingWorkerState, 1);
while (Interlocked.CompareExchange(ref loggingWorkerState, 1, 1) == 1)
{
waiter.Wait(TimeSpan.FromSeconds(10.0));
waiter.Reset();
Tuple<LogMessageHandler, string> item;
while (queue.TryDequeue(out item))
{
writeToFile(item.Item1, item.Item2);
}
}
}
Basically this code enables you to work away all the items from a single thread using a queue that's shared across threads. Note that ConcurrentQueue doesn't use locks for TryDequeue, so clients won't feel any pain because of this.
Last thing that's needed is to add stuff to the queue. That's the easy part:
public void Add(LogMessageHandler l, string msg)
{
if (queue.Count < MaxLogQueueSize)
{
queue.Enqueue(new Tuple<LogMessageHandler, string>(l, msg));
waiter.Set();
}
}
This code will be called from multiple threads. It's not 100% correct because Count and Enqueue don't necessarily have to be called in a consistent way - but for our intents and purposes it's good enough. It also doesn't lock in the Enqueue and the waiter will ensure that the stuff is removed by the other thread.
Wrap all this in a singleton pattern, add some more logic to it, and your problem should be solved.
That can be problematic, since every client request is handled by a new thread by default anyway. You need some "root" object that is known across the project (I don't think you can achieve this with a static class) so you can lock on it before you access the log file. Note, however, that this will basically serialize the requests and will probably have a very bad effect on performance.
No, multi-threading does not solve your problem. How are multiple threads supposed to write to the same file at the same time? You would need to care about data consistency and I don't think that's the actual problem here.
What you are looking for is asynchronous programming. The reason your GUI becomes unresponsive is that it waits for the tasks to complete. If you know the logger is your bottleneck, then use async to your advantage: fire the log method and forget about the outcome, just write the file.
Actually I don't really think your logger is the problem. Are you sure there is no other logic which blocks you?
I've got a routine called GetEmployeeList that loads when my Windows Application starts.
This routine pulls in basic employee information from our Active Directory server and retains this in a list called m_adEmpList.
We have a few Windows accounts set up as Public Profiles that most of our employees on our manufacturing floor use. This m_adEmpList gives our employees the ability to log in to select features using those Public Profiles.
Once all of the Active Directory data is loaded, I attempt to "auto logon" that employee based on the System.Environment.UserName if that person is logged in under their private profile. (employees love this, by the way)
If I do not thread GetEmployeeList, the Windows Form will appear unresponsive until the routine is complete.
The problem with GetEmployeeList is that we have had times when the Active Directory server was down, the network was down, or a particular computer was not able to connect over our network.
To get around these issues, I have included a ManualResetEvent m_mre with the THREADSEARCH_TIMELIMIT timeout so that the process does not go off forever. I cannot login someone using their Private Profile with System.Environment.UserName until I have the list of employees.
I realize I am not showing ALL of the code, but hopefully it is not necessary.
public static ADUserList GetEmployeeList()
{
if ((m_adEmpList == null) ||
(((m_adEmpList.Count < 10) || !m_gotData) &&
((m_thread == null) || !m_thread.IsAlive))
)
{
m_adEmpList = new ADUserList();
m_thread = new Thread(new ThreadStart(fillThread));
m_mre = new ManualResetEvent(false);
m_thread.IsBackground = true;
m_thread.Name = FILLTHREADNAME;
try {
m_thread.Start();
m_gotData = m_mre.WaitOne(THREADSEARCH_TIMELIMIT * 1000);
} catch (Exception err) {
Global.LogError(_CODEFILE + "GetEmployeeList", err);
} finally {
if ((m_thread != null) && (m_thread.IsAlive)) {
// m_thread.Abort();
m_thread = null;
}
}
}
return m_adEmpList;
}
I would like to just put a basic lock using something like m_adEmpList, but I'm not sure if it is a good idea to lock something that I need to populate, and the actual data population is going to happen in another thread using the routine fillThread.
If the ManualResetEvent's WaitOne times out before the data I need has been collected, there is probably a network issue, and m_adEmpList does not have many records (if any). So, I would need to try to pull this information again the next time.
If anyone understands what I'm trying to explain, I'd like to see a better way of doing this.
It just seems too forced, right now. I keep thinking there is a better way to do it.
I think you're going about the multithreading part the wrong way. I can't really explain it, but threads should cooperate and not compete for resources, but that's exactly what's bothering you here a bit. Another problem is that your timeout is too long (so that it annoys users) and at the same time too short (if the AD server is a bit slow, but still there and serving). Your goal should be to let the thread run in the background and when it is finished, it updates the list. In the meantime, you present some fallbacks to the user and the notification that the user list is still being populated.
A few more notes on your code above:
You have a variable m_thread that is only used locally. Further, your code contains a redundant check whether that variable is null.
If you create a user list with defaults/fallbacks first and then update it through a function (make sure you are checking the InvokeRequired flag of the displaying control!) you won't need a lock. This means that the thread does not access the list stored as member but a separate list it has exclusive access to (not a member variable). The update function then replaces (!) this list, so now it is for exclusive use by the UI.
Lastly, if the AD server is really not there, try to forward the error from the background thread to the UI in some way, so that the user knows what's broken.
If you want, you can add an event to signal the thread to stop, but in most cases that won't even be necessary.
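The "fill a private list in the background, then replace the shared one" suggestion above could be sketched as follows. EmployeeDirectory, the string element type and the loader/onError delegates are placeholders for illustration; the real code would use ADUserList and the Active Directory query.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the employee list holder.
sealed class EmployeeDirectory
{
    // volatile: readers always see either the old or the new complete list.
    private volatile IReadOnlyList<string> _current;

    public EmployeeDirectory(IReadOnlyList<string> fallback)
    {
        _current = fallback;
    }

    public IReadOnlyList<string> Current
    {
        get { return _current; }
    }

    // Runs the (possibly slow) loader on the thread pool. On success the
    // reference is swapped; on failure the old list stays and onError is told.
    public Task RefreshAsync(Func<List<string>> loader, Action<Exception> onError)
    {
        return Task.Run(() =>
        {
            try
            {
                List<string> fresh = loader();    // private to this task until published
                _current = fresh.AsReadOnly();
            }
            catch (Exception ex)
            {
                onError(ex);
            }
        });
    }
}
```

No lock is needed because the background work never mutates a list the UI can see. In a Windows Forms application the onError handler and any control updates would still have to be marshalled to the UI thread (the InvokeRequired check mentioned above).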
In my C# application multiple clients access the same server. To process one client at a time, the code below was written. In it I use the Monitor class and also a queue. Will this code affect performance? If I use the Monitor class, should I remove the queue from the code?
Sometimes the remote server machine, where my application runs as a service, goes down completely. Is the code below the reason, because all the clients go into a queue? When I run netstat -an from the command prompt, for 8 clients it shows 50 connections held in TIME_WAIT...
Below is my code where the client accesses the server...
if (Id == "")
{
System.Threading.Monitor.Enter(this);
try
{
if (Request.AcceptTypes == null)
{
queue.Enqueue(Request.QueryString["sessionid"].Value);
string que = "";
que = queue.Dequeue();
TypeController.session_id = que;
langStr = SessionDatabase.Language;
filter = new AllThingzFilter(SessionDatabase, parameters, langStr);
TypeController.session_id = "";
filter.Execute();
Request.Clear();
return filter.XML;
}
else
{
TypeController.session_id = "";
filter = new AllThingzFilter(SessionDatabase, parameters, langStr);
filter.Execute();
}
}
finally
{
System.Threading.Monitor.Exit(this);
}
}
Locking this is pretty wrong, it won't work at all if every thread uses a different instance of whatever class this code lives in. It isn't clear from the snippet if that's the case but fix that first. Create a separate object just to store the lock and make it static or give it the same scope as the shared object you are trying to protect (also not clear).
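A minimal sketch of the dedicated lock object suggested above, assuming the protected state is static and therefore shared by every instance. RequestHandler and its counter are invented for the example.

```csharp
class RequestHandler
{
    // One lock object for all instances, so every request thread contends on
    // the same monitor regardless of which handler instance it is using.
    private static readonly object SessionLock = new object();

    // Stand-in for the shared state (for example a static session id).
    private static int _handled;

    public int Handle()
    {
        lock (SessionLock)
        {
            _handled = _handled + 1;   // read-modify-write made safe by the lock
            return _handled;
        }
    }

    public static int Handled
    {
        get { lock (SessionLock) { return _handled; } }
    }
}
```

The lock statement expands to Monitor.Enter and Monitor.Exit in a try/finally, so it is equivalent to the explicit Monitor calls in the question, only with an object that is guaranteed to be the same for every caller.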
You might still have trouble since this sounds like a deadlock rather than a race. Deadlocks are pretty easy to troubleshoot with the debugger since the code got stuck and is not executing at all. Debug + Break All, then Debug + Windows + Threads. Locate the worker threads in the thread list. Double click one to select it and use Debug + Call Stack to see where it got stuck. Repeat for other threads. Look back through the stack trace to see where one of them acquired a lock and compare to other threads to see what lock they are blocking on.
That could still be tricky if the deadlock is intricate and involves multiple interleaved locks. In which case logging might help. Really hard to diagnose mandelbugs might require a rewrite that cuts back on the amount of threading.
I am working on a project with peak performance requirements, so we need to bulk (batch?) several operations (for example persisting the data to a database) for efficiency.
However, I want our code to maintain an easy to understand flow, like:
input = Read();
parsed = Parse(input);
if (parsed.Count > 10)
{
status = Persist(parsed);
ReportSuccess(status);
return;
}
ReportFailure();
The feature I'm looking for here is automatically have Persist() happen in bulks (and ergo asynchronously), but behave to its user as if it's synchronous (user should block until the bulk action completes). I want the implementor to be able to implement Persist(ICollection).
I looked into flow-based programming, with which I am not highly familiar. I saw one library for fbp in C# here, and played a bit with Microsoft's Workflow Foundation, but my impression is that both are overkill for what I need. What would you use to implement a bulked flow behavior?
Note that I would like to get code that is exactly like what I wrote (simple to understand and debug), so solutions that involve yield or configuration in order to connect flows to one another are inadequate for my purpose. Also, chaining is not what I'm looking for: I don't want to first build a chain and then run it; I want code that looks as if it is a simple flow ("Do A, Do B, if C then do D").
Common problem. Instead of calling Persist, I usually load commands (or something along those lines) into a Persistor class; then, after the loop is finished, I call Persistor.Persist to persist the batch.
Just a few pointers: if you're generating SQL, the commands you add to the persistor can represent your queries somehow (with built-in objects, custom objects or just query strings). If you're calling stored procedures, you can use the commands to append to a piece of XML that will be passed down to the SP when you call the Persist method.
Hope it helps. I'm pretty sure there's a pattern for this but I don't know the name :)
I don't know if this is what you need, because it's SQL Server based, but have you tried taking a look at SSIS and/or DTS?
One simple thing you can do is create a MemoryBuffer into which you push the messages; pushing simply adds them to a list and returns. The MemoryBuffer has a System.Timers.Timer that fires periodically and does the "actual" updates.
One such implementation can be found in a Syslog Server (C#) at http://www.fantail.net.nz/wordpress/?p=5, in which the syslog messages are logged to SQL Server periodically in a batch.
This approach might not be suitable if the information being pushed to the database is important: if something goes wrong, you will lose the messages held in the MemoryBuffer.
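A buffer of that kind might look roughly like this. It is not the implementation from the linked Syslog Server; the generic MemoryBuffer<T> shape, the persistBatch delegate and the public Flush method are assumptions made for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Timers;

sealed class MemoryBuffer<T> : IDisposable
{
    private readonly object _gate = new object();
    private List<T> _pending = new List<T>();
    private readonly Action<IList<T>> _persistBatch;   // e.g. one bulk insert per call
    private readonly Timer _timer;

    public MemoryBuffer(Action<IList<T>> persistBatch, double intervalMs)
    {
        _persistBatch = persistBatch;
        _timer = new Timer(intervalMs);
        _timer.AutoReset = true;
        _timer.Elapsed += (sender, e) => Flush();
        _timer.Start();
    }

    // Cheap for callers: add to the in-memory list and return.
    public void Push(T item)
    {
        lock (_gate) { _pending.Add(item); }
    }

    // Swaps the list out under the lock, then persists outside it so that
    // Push is never blocked by the database call.
    public void Flush()
    {
        List<T> batch;
        lock (_gate)
        {
            if (_pending.Count == 0) return;
            batch = _pending;
            _pending = new List<T>();
        }
        _persistBatch(batch);
    }

    public void Dispose()
    {
        _timer.Stop();
        _timer.Dispose();
        Flush();                                       // best-effort final write
    }
}
```

Callers do not wait for the write here, which is exactly why anything still in the buffer is lost if the process dies or persistBatch throws.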
How about using the BackgroundWorker class to persist each item asynchronously on a separate thread? For example:
using System;
using System.Collections;
using System.Collections.Generic;
using System.ComponentModel;
using System.Threading;
class PersistenceManager
{
public void Persist(ICollection persistable)
{
// initialize a list of background workers
var backgroundWorkers = new List<BackgroundWorker>();
// launch each persistable item in a background worker on a separate thread
foreach (var persistableItem in persistable)
{
var worker = new BackgroundWorker();
worker.DoWork += new DoWorkEventHandler(worker_DoWork);
backgroundWorkers.Add(worker);
worker.RunWorkerAsync(persistableItem);
}
// wait for all the workers to finish
while (true)
{
// sleep a little bit to give the workers a chance to finish
Thread.Sleep(100);
// continue looping until all workers are done processing
if (backgroundWorkers.Exists(w => w.IsBusy)) continue;
break;
}
// dispose all the workers
foreach (var w in backgroundWorkers) w.Dispose();
}
void worker_DoWork(object sender, DoWorkEventArgs e)
{
var persistableItem = e.Argument;
// TODO: add logic here to save the persistableItem to the database
}
}