I've built a page that runs an admin task on a background thread using QueueBackgroundWorkItem. After queueing the task, the page begins polling a Page Method to check whether the task has completed. But I think my strategy for communicating from the worker thread to the status request threads is flawed.
To communicate across threads, I'm using an object in session state in StateServer mode. It seemed to work in all my initial local testing, but that was using InProc session state. Once we got it on the server, it started appearing to hang, polling forever without ever getting a status update. Here is the code:
//Object for communicating across threads
[Serializable]
public class BackgroundTaskStatus
{
    public enum BackgroundTaskStatusType
    {
        None = 0,
        Pending = 1,
        Started = 2,
        Error = 3,
        Complete = 4
    }

    public BackgroundTaskStatusType Status { get; set; }
    public string Message { get; set; }
}
//Class that holds a reference to the session state and
//contains the task for QueueBackgroundWorkItem
public class LocationSiteToolProcessor
{
    public static string CopyingStatusKey = "LST_CopyingStatus";

    private HttpSessionState _session;

    public LocationSiteToolProcessor(HttpSessionState session)
    {
        _session = session;
    }

    public void CopyPage(string relativeUrl, bool overwrite, bool subPages, CancellationToken cancellationToken)
    {
        if (_session[CopyingStatusKey] == null || !(_session[CopyingStatusKey] is BackgroundTaskStatus))
            _session[CopyingStatusKey] = new BackgroundTaskStatus();

        BackgroundTaskStatus taskStatus = _session[CopyingStatusKey] as BackgroundTaskStatus;
        taskStatus.Status = BackgroundTaskStatus.BackgroundTaskStatusType.Started;
        try
        {
            DateTime start = DateTime.Now;
            ElevateToWebAdmin();
            var pages = LocationSiteRepository.CopyTemplatePage(relativeUrl, overwrite, subPages);
            TimeSpan duration = DateTime.Now - start;
            taskStatus.Message = (pages != null ? "Page copied successfully." : "No pages were copied.") +
                " Time elapsed: " + duration.ToString("g");
            taskStatus.Status = BackgroundTaskStatus.BackgroundTaskStatusType.Complete;
        }
        catch (Exception ex)
        {
            taskStatus.Message = ex.ToString();
            taskStatus.Status = BackgroundTaskStatus.BackgroundTaskStatusType.Error;
        }
    }
}
//Code that kicks off the background thread
Session[LocationSiteToolProcessor.CopyingStatusKey] = new BackgroundTaskStatus() { Status = BackgroundTaskStatus.BackgroundTaskStatusType.Pending };
LocationSiteToolProcessor processor = new LocationSiteToolProcessor(Session);
HostingEnvironment.QueueBackgroundWorkItem(c => processor.CopyPage(relativeUrl, overwrite, subPages, c));
//Page Method to support client side status polling
[System.Web.Services.WebMethod(true)]
public static BackgroundTaskStatus GetStatus()
{
    //(Modified for brevity)
    BackgroundTaskStatus taskStatus = HttpContext.Current.Session[LocationSiteToolProcessor.CopyingStatusKey] as BackgroundTaskStatus;
    return taskStatus;
}
I've attached the debugger and what I've observed is the background thread sets the Status property of the BackgroundTaskStatus in the session, but when the subsequent status polling requests read that object from session, the property value is unchanged. They seem to be operating on two different copies of the session object.
Now I know that the State Server mode serializes the session and then deserializes the session when it binds it to a new request. So it's possible for GetStatus() and the background thread to deserialize their own simultaneous copies of the object. But I'm expecting the background thread's change to be serialized back to the same origin and since the GetStatus() method doesn't write to session, it should eventually read the updated Status property value after the background thread sets it.
However, it seems like either the session was branched at some point and is storing two different serialized copies of my object, or the Status set by the background thread is being overwritten, even though GetStatus() doesn't write to session. Where is it going wrong?
Additionally, is it safe to pass in the HttpSessionState object like I'm doing or can it be destroyed before the background thread completes (i.e. is it scoped to the initial request)? I was under the impression it was a static object, but now I'm doubtful of that. I want this to be safe to run on a farm but am hoping not to have to get a database involved.
Edit
I found some info on this page that is probably relevant:
When a page saves data to Session, the value is loaded into a made-to-measure dictionary class hosted by the HttpSessionState class. The contents of the dictionary is flushed to the state provider when the ongoing request completes.
To me, this sounds like it's saying that my polling request thread does have its entire session serialized back to the state server at the end of its request, even though it hasn't made any changes. Additionally, it would stand to reason that the session dictionary my background thread writes to never gets serialized back to the state server after I modify it, because its request has already ended. Can anyone confirm this?
I took the evidence above as confirmation. I resorted to storing the state in a database instead of session.
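For reference, a minimal sketch of what such a database-backed status store could look like, assuming System.Data.SqlClient; the TaskStatus table, its columns and the "StatusDb" connection string name are invented for illustration, not the actual schema. The worker writes through Set() instead of session, and GetStatus() reads through Get(), keyed by something stable such as SessionID plus the task key:

//Hedged sketch of a database-backed status store. The TaskStatus table, its
//columns and the "StatusDb" connection string are assumptions for illustration.
public static class TaskStatusStore
{
    private static readonly string ConnStr =
        ConfigurationManager.ConnectionStrings["StatusDb"].ConnectionString;

    public static void Set(string key, BackgroundTaskStatus status)
    {
        using (var conn = new SqlConnection(ConnStr))
        using (var cmd = new SqlCommand(
            "UPDATE TaskStatus SET Status = @s, Message = @m WHERE TaskKey = @k " +
            "IF @@ROWCOUNT = 0 INSERT INTO TaskStatus (TaskKey, Status, Message) VALUES (@k, @s, @m)", conn))
        {
            cmd.Parameters.AddWithValue("@k", key);
            cmd.Parameters.AddWithValue("@s", (int)status.Status);
            cmd.Parameters.AddWithValue("@m", (object)status.Message ?? DBNull.Value);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }

    public static BackgroundTaskStatus Get(string key)
    {
        using (var conn = new SqlConnection(ConnStr))
        using (var cmd = new SqlCommand("SELECT Status, Message FROM TaskStatus WHERE TaskKey = @k", conn))
        {
            cmd.Parameters.AddWithValue("@k", key);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                if (!reader.Read()) return null;
                return new BackgroundTaskStatus
                {
                    Status = (BackgroundTaskStatus.BackgroundTaskStatusType)reader.GetInt32(0),
                    Message = reader.IsDBNull(1) ? null : reader.GetString(1)
                };
            }
        }
    }
}

Because every poll round-trips to the same database row, the worker's writes remain visible after the originating request ends, and the approach holds up on a web farm.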
Related
I know that the message SQLite error (5): database is locked means there is some concurrent writing to the database.
But here I'm trying to queue all data (from multiple threads) into a ConcurrentQueue and lock the code processing that queue to ensure that only one thread works on the queue, which means only one thread writes to the database at a time.
The code is just like this:
public class DataService
{
    readonly ConcurrentQueue<Item> _cacheQueue = new ConcurrentQueue<Item>();

    public void SaveItem(Item item)
    {
        _cacheQueue.Enqueue(item);
        _processQueue();
    }

    void _saveItemCore(Item item)
    {
        //logic here to save the item (with write access to the SQLite database)
    }

    bool _isProcessingQueue;
    object _lo = new object();

    void _processQueue()
    {
        lock (_lo)
        {
            if (_isProcessingQueue) return;
            _isProcessingQueue = true;
            while (_cacheQueue.Count > 0)
            {
                Item e;
                if (_cacheQueue.TryDequeue(out e))
                {
                    _saveItemCore(e);
                }
            }
            _isProcessingQueue = false;
        }
    }
}
I have multiple threads (each corresponding to a Task.Run), and all of them call SaveItem concurrently.
But as you can see, my code does not seem to be working; I can still see the message SQLite error (5): database is locked. What could be wrong here?
NOTE: my threads also read from the database, so if the db is currently being written to by one thread, will that block reads from another thread? If so, that could explain my problem.
UPDATE: I've also tried enabling the WAL journal mode by modifying the connection string like this:
data source=app.db;PRAGMA journal_mode=WAL;
but I can still see the message SQLite error (5): database is locked. I'm also not sure whether the reads are succeeding or not.
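For what it's worth, a PRAGMA embedded in the connection string may simply be ignored by the provider. A hedged sketch, assuming System.Data.SQLite, that issues the pragmas on the open connection instead (the busy_timeout value is an illustrative choice, not a recommendation):

//Hedged sketch: enable WAL (and a busy timeout) on an open connection,
//assuming the System.Data.SQLite provider.
using (var conn = new SQLiteConnection("Data Source=app.db"))
{
    conn.Open();
    using (var cmd = new SQLiteCommand("PRAGMA journal_mode=WAL; PRAGMA busy_timeout=5000;", conn))
    {
        cmd.ExecuteNonQuery();
    }
    //under WAL, readers are no longer blocked by a single writer,
    //which is the usual fix for error 5 in read-while-write workloads
}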
I'm getting a bit frustrated with this problem:
I have a web site that manages some files for download. Because these files are very big and must be organized into folders and then compacted, I built an Ajax structure that does this job in the background, and when the files are ready to be downloaded, the job changes the status of an object in the user session (bool isReady = true, simple as that).
To achieve this, when the user clicks "download", a jQuery POST is sent to an API, and this API starts the "organizer" job and finishes the request (the main, request-scoped thread), leaving a background thread doing the magic (it's so beautiful, haha).
This "organizer" job is a background thread that receives the HttpSessionState (HttpContext.Current.Session) as a parameter. It organizes and zips the files, creates a download link and, at the end, changes an object in the session using the HttpSessionState it received.
This works great when I'm using the "InProc" session mode (I was very happy to deploy this piece of art to production after the tests).
But my nightmares started when I deployed the project to the production environment, because we use "StateServer" mode there.
In that environment, the changes are not applied.
What I have noticed so far is that with StateServer, every change I make in the background thread is not "committed" to the session when the change occurs AFTER the user request ends (the request that started the thread).
If I add a thread.Join() to wait for the thread to finish, the changes made inside the thread are applied.
I'm thinking about using the DB to store these values, but I will lose some performance :(
[HttpPost]
[Route("startDownloadNow")]
public void StartDownloadNow(DownloadStatusProxy input)
{
    //some pieces of code...
    ...

    //add the download request to the user session
    Downloads.Add(data);

    //pass the session as a parameter to the thread,
    //because the thread itself doesn't know the current HttpContext session
    HttpSessionState session = HttpContext.Current.Session;
    Thread thread = new Thread(() => ProccessDownload(data, session));
    thread.Start();

    //here, if I call thread.Join(), the changes inside the thread are applied
    //correctly, but I can't do that, otherwise it ceases to be Ajax
}

private void ProccessDownload(DownloadStatus currentDownload, HttpSessionState session)
{
    List<DownloadStatus> listDownload = ((List<DownloadStatus>)session["Downloads"]);
    try
    {
        //just make the magic...
        string downloadUrl = CartClient.CartDownloadNow(currentDownload.idRegion, currentDownload.idUser,
            currentDownload.idLanguage, currentDownload.listCartAsset.ToArray(), currentDownload.listCartAssetThumb.ToArray());

        listDownload.Find(d => d.hashId == currentDownload.hashId).downloadUrl = downloadUrl;
        listDownload.Find(d => d.hashId == currentDownload.hashId).isReady = true;

        //at this point, if I inspect the current session, the values are applied,
        //but in the next user request they are back in their previous state... sad... bad dog, .NET...
    }
    catch (Exception e)
    {
        listDownload.Find(d => d.hashId == currentDownload.hashId).msgError = Utils.GetAllErrors(e);
        LogService.Log(e);
    }

    //this was a desperate attempt: retrieve the object, manipulate it and put it
    //back into the session, but it doesn't work either...
    session["Downloads"] = listDownload;
}
We currently have an NServiceBus 5 system which contains two recurring Sagas. Since they act as dispatchers, periodically pulling several sorts of data from an external system, we're using Timeouts to trigger this: we created a general-purpose, empty class called ExecuteTask, which is used by the Sagas to handle the timeout.
public class ScheduleSaga1 : Saga<SchedulerSagaData>,
    IAmStartedByMessages<StartScheduleSaga1>,
    IHandleMessages<StopSchedulingSaga>,
    IHandleTimeouts<ExecuteTask>
And the other Saga is almost identically defined:
public class ScheduleSaga2 : Saga<SchedulerSagaData>,
    IAmStartedByMessages<StartScheduleSaga2>,
    IHandleMessages<StopSchedulingSaga>,
    IHandleTimeouts<ExecuteTask>
The timeout is handled identically in both Sagas:
public void Handle(StartScheduleSaga1 message)
{
    if (_schedulingService.IsDisabled())
    {
        _logger.Info($"Task '{message.TaskName}' is disabled!");
    }
    else
    {
        Debugger.DoDebug($"Scheduling '{message.TaskName}' started!");
        Data.TaskName = message.TaskName;

        // If the saga is already started, don't initiate any more tasks,
        // as those timeout messages will arrive when the specified time is up.
        if (!Data.IsTaskAlreadyScheduled)
        {
            // Set up a timeout for the specified interval for the task to be executed.
            Data.IsTaskAlreadyScheduled = true;

            // Send the first message immediately!
            SendMessage();

            // Set the timeout
            var timeout = _schedulingService.GetTimeout();
            RequestTimeout<ExecuteTask>(timeout);
        }
    }
}

public void Timeout(ExecuteTask state)
{
    if (_schedulingService.IsDisabled())
    {
        _logger.Info($"Task '{Data.TaskName}' is disabled!");
    }
    else
    {
        SendMessage();

        // Action that gets executed when the specified time is up
        var timeout = _schedulingService.GetTimeout();
        Debugger.DoDebug($"Request timeout for Task '{Data.TaskName}' set to {timeout}!");
        RequestTimeout<ExecuteTask>(timeout);
    }
}

private void SendMessage()
{
    // Send the message to the bus so that the handler can handle it
    Bus.Send(EndpointConfig.EndpointName, Activator.CreateInstance(typeof(PullData1Request)));
}
Now the problem: since both Sagas request Timeouts for ExecuteTask, the timeout gets dispatched to both Sagas!
Even worse, the stateful Data in the Sagas seems to get mixed up, since both Sagas send both messages.
It therefore looks like the Timeouts are sent to every Saga instance that requests them.
But looking at the example https://docs.particular.net/samples/saga/simple/ there is no special logic regarding multiple Saga instances and their state.
Is my assumption correct? If so, what are the best practices for having multiple Sagas request and receive Timeouts?
The only reason I can think of for this happening is that they share the same identifier used to uniquely identify the saga instance.
Both ScheduleSaga1 and ScheduleSaga2 use the same SchedulerSagaData class for storing state. NServiceBus sees an incoming message and tries to retrieve the state based on the unique identifier in that message. If both StartScheduleSaga1 and StartScheduleSaga2 come in with identifier 1, for example, NServiceBus will search the SchedulerSagaData table for saga state with unique identifier 1.
Both ScheduleSaga1 and ScheduleSaga2 will then share the same row!
Timeouts are based on the SagaId in the TimeoutEntity table. Because both sagas share the same SagaId, it's logical that both are executed once the timeout arrives.
At a minimum, you should not reuse the identifier to schedule tasks. It's probably better not to share the same class for storing saga state; that also makes it easier to debug.
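A hedged sketch of what that separation could look like; correlating on TaskName is an assumption, so pick whatever uniquely identifies each saga instance in your system:

//Hedged sketch: give each saga its own data class and its own correlation
//mapping, so the two sagas can never load each other's state.
public class ScheduleSaga1Data : ContainSagaData
{
    public string TaskName { get; set; }
    public bool IsTaskAlreadyScheduled { get; set; }
}

public class ScheduleSaga1 : Saga<ScheduleSaga1Data>,
    IAmStartedByMessages<StartScheduleSaga1>,
    IHandleMessages<StopSchedulingSaga>,
    IHandleTimeouts<ExecuteTask>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<ScheduleSaga1Data> mapper)
    {
        //correlate on a property that is unique per saga instance
        mapper.ConfigureMapping<StartScheduleSaga1>(m => m.TaskName)
              .ToSaga(s => s.TaskName);
    }

    //Handle and Timeout methods as before...
}

ScheduleSaga2 would get its own ScheduleSaga2Data and mapping in the same way; with distinct data classes (and therefore distinct SagaIds), each timeout is delivered only to the saga instance that requested it.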
I've got a routine called GetEmployeeList that loads when my Windows Application starts.
This routine pulls in basic employee information from our Active Directory server and retains this in a list called m_adEmpList.
We have a few Windows accounts set up as Public Profiles that most of our employees on our manufacturing floor use. This m_adEmpList gives our employees the ability to log in to select features using those Public Profiles.
Once all of the Active Directory data is loaded, I attempt to "auto logon" that employee based on the System.Environment.UserName if that person is logged in under their private profile. (employees love this, by the way)
If I do not thread GetEmployeeList, the Windows Form will appear unresponsive until the routine is complete.
The problem with GetEmployeeList is that we have had times when the Active Directory server was down, the network was down, or a particular computer was not able to connect over our network.
To get around these issues, I have included a ManualResetEvent m_mre with the THREADSEARCH_TIMELIMIT timeout so that the process does not go on forever. I cannot log someone in using their private profile with System.Environment.UserName until I have the list of employees.
I realize I am not showing ALL of the code, but hopefully it is not necessary.
public static ADUserList GetEmployeeList()
{
    if ((m_adEmpList == null) ||
        (((m_adEmpList.Count < 10) || !m_gotData) &&
         ((m_thread == null) || !m_thread.IsAlive)))
    {
        m_adEmpList = new ADUserList();
        m_thread = new Thread(new ThreadStart(fillThread));
        m_mre = new ManualResetEvent(false);
        m_thread.IsBackground = true;
        m_thread.Name = FILLTHREADNAME;
        try {
            m_thread.Start();
            m_gotData = m_mre.WaitOne(THREADSEARCH_TIMELIMIT * 1000);
        } catch (Exception err) {
            Global.LogError(_CODEFILE + "GetEmployeeList", err);
        } finally {
            if ((m_thread != null) && (m_thread.IsAlive)) {
                // m_thread.Abort();
                m_thread = null;
            }
        }
    }
    return m_adEmpList;
}
I would like to just put a basic lock around m_adEmpList, but I'm not sure if it is a good idea to lock something that I need to populate when the actual data population happens in another thread, in the routine fillThread.
If the ManualResetEvent's WaitOne call times out before the data is collected, there is probably a network issue, and m_adEmpList will not have many records (if any). So I would need to try to pull this information again the next time.
If anyone understands what I'm trying to explain, I'd like to see a better way of doing this.
It just seems too forced, right now. I keep thinking there is a better way to do it.
I think you're going about the multithreading part the wrong way. Threads should cooperate with, not compete for, resources, and competition is exactly what's bothering you here. Another problem is that your timeout is too long (so that it annoys users) and at the same time too short (if the AD server is a bit slow, but still there and serving). Your goal should be to let the thread run in the background and, when it is finished, have it update the list. In the meantime, you present some fallbacks to the user along with a notification that the user list is still being populated.
A few more notes on your code above:
You have a variable m_thread that is only used locally. Furthermore, your code contains a redundant check of whether that variable is null.
If you create a user list with defaults/fallbacks first and then update it through a function (make sure you are checking the InvokeRequired flag of the displaying control!), you won't need a lock. This means the thread does not access the list stored as a member, but a separate list it has exclusive access to. The update function then replaces (!) that list, so the new one is for exclusive use by the UI.
Lastly, if the AD server is really not there, try to forward the error from the background thread to the UI in some way, so that the user knows what's broken.
If you want, you can add an event to signal the thread to stop, but in most cases that won't even be necessary.
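A minimal sketch of that load-and-swap pattern, assuming a WinForms form; ADUserList, LoadFromActiveDirectory and ShowLoadError stand in for the poster's real types and routines:

//Hedged sketch: build the list on a background thread that owns it exclusively,
//then swap it in (and surface errors) on the UI thread via BeginInvoke.
private void StartEmployeeLoad()
{
    var worker = new Thread(() =>
    {
        ADUserList freshList;
        try
        {
            freshList = LoadFromActiveDirectory(); //only this thread touches it
        }
        catch (Exception ex)
        {
            BeginInvoke((Action)(() => ShowLoadError(ex))); //tell the user what broke
            return;
        }
        //replace the member wholesale on the UI thread; no lock needed because
        //the UI only ever sees complete lists, never one mid-population
        BeginInvoke((Action)(() => m_adEmpList = freshList));
    });
    worker.IsBackground = true;
    worker.Start();
}

Until the swap happens, the UI keeps showing the fallback list, so the form stays responsive regardless of how slow (or absent) the AD server is.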
I've read and looked at quite a few examples of thread pooling, but I just can't seem to understand it the way I need to. What I have managed to get working is not really what I need; it just runs the function in its own thread.
public static void Main()
{
    while (true)
    {
        try
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(Process));
            Console.WriteLine("ID has been queued for fetching");
        }
        catch (Exception ex)
        {
            Console.WriteLine("Error: " + ex.Message);
        }
        Console.ReadLine();
    }
}

public static void Process(object state)
{
    var s = StatsFecther("byId", "0"); //returns all player stats
    Console.WriteLine("Account: " + s.nickname);
    Console.WriteLine("ID: " + s.account_id);
    Console.ReadLine();
}
What I'm trying to do is have about 50 threads going (maybe more) that fetch serialized PHP data containing player stats, starting from user 0 all the way up to a user ID I specify (300,000). My question is not about how to fetch the stats (I know how to get them and read them), but how to write a thread pool that will keep fetching stats until it gets to the 300,000th user ID without the threads stepping on each other's toes, saving the stats to a database as it retrieves them.
static int _globalId = 0;

public static void Process(object state)
{
    // each queued Process call atomically claims its own player ID to fetch
    int processId = Interlocked.Increment(ref _globalId);
    var s = StatsFecther("byId", processId.ToString()); //returns all player stats
    Console.WriteLine("Account: " + s.nickname);
    Console.WriteLine("ID: " + s.account_id);
}
This is the simplest thing to do, but it is far from optimal: you are using synchronous calls, you are relying on the ThreadPool to throttle your call rate, you have no retry policy for failed calls, and your application will behave extremely badly under error conditions (when the web calls are failing).
First, you should consider using the async methods of WebRequest: BeginGetRequestStream (if you POST and have a request body) and/or BeginGetResponse. These methods scale much better, and you'll get higher throughput for less CPU (if the back end can keep up, of course).
Second, you should consider self-throttling. On a similar project I used a pending-request count. On success, each call would submit two more calls, capped by the throttling count. On failure, the call would not submit anything. If no calls are pending, a timer-based retry submits a new call every minute. This way you only attempt once per minute when the service is down, saving your own resources from spinning without traction, and you ramp the throughput back up to the throttling cap when the service comes back.
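A much-simplified sketch of that pending-count idea (without the timer-based retry); _pending, MaxPending and FetchOne are invented names for illustration:

//Hedged sketch of the self-throttling scheme: on success each call fans out
//into two more submissions, capped at MaxPending in flight; on failure the
//chain dies out. FetchOne stands in for the real (ideally async) fetch.
static int _pending = 0;
const int MaxPending = 50;

static void TrySubmit()
{
    if (Interlocked.Increment(ref _pending) > MaxPending)
    {
        Interlocked.Decrement(ref _pending); //over the cap, back off
        return;
    }
    ThreadPool.QueueUserWorkItem(_ =>
    {
        bool ok = FetchOne(); //returns false when the web call fails
        Interlocked.Decrement(ref _pending);
        if (ok)
        {
            TrySubmit(); //success: fan out toward the cap
            TrySubmit();
        }
    });
}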
You should also know that the .NET framework limits the number of concurrent connections it makes to any resource. You must find your destination ServicePoint and change the ConnectionLimit from its default value (2) to the maximum value you are willing to throttle at.
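Concretely, that looks something like this (the URL is a placeholder for the stats endpoint):

//Hedged sketch: raise the two-connections-per-host default, either globally
//before the first request or on the specific destination's ServicePoint.
ServicePointManager.DefaultConnectionLimit = 50;

ServicePoint sp = ServicePointManager.FindServicePoint(new Uri("http://example.com/stats"));
sp.ConnectionLimit = 50;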
About the database update part: there are way too many variables at play and way too little information to give any meaningful advice. Some general advice: use asynchronous methods for the database calls as well, size your connection pool to allow for your throttling cap, and make sure your updates use the player ID as a key so you don't deadlock on updating the same record from different threads.
How do you determine the user ID? One option is to segment the work so that thread X deals with IDs from 0 to N, the next thread with N+1 to 2N, and so on, splitting the range across however many threads you have.
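A hedged sketch of that segmentation; the worker count is an illustrative choice, and the fetch-and-save body is left as a comment:

//Hedged sketch: split the 0..300,000 ID range into contiguous slices, one per
//worker, so no two workers ever touch the same ID.
const int MaxId = 300000;
const int Workers = 50;
int slice = MaxId / Workers;

for (int w = 0; w < Workers; w++)
{
    int start = w * slice;
    int end = (w == Workers - 1) ? MaxId : start + slice;
    ThreadPool.QueueUserWorkItem(_ =>
    {
        for (int id = start; id < end; id++)
        {
            //fetch stats for this id and save them to the database
        }
    });
}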