Update 2
The queueing problem was probably solved already: we've since been able to run multiple requests concurrently, and the lib nicely reported progress for each operation. Other concurrency issues we're still facing were likely the reason for this apparent behaviour, but that's a design matter. To solve it, however, it would help to understand the inner workings of classes, modules and variables as used in VB6. A question arises: would encapsulating everything (connections, components etc.) in classes ensure that every created object does not share any data with other instances?
Update 1
We've refactored the application a bit more to handle resource disposal properly, especially when dealing with OCXs. Apparently that solved the out of memory issue. What still bothers me is that I don't understand what is happening beneath the surface. In this regard, is there a way to see which objects are currently in memory and how many references they hold? I know the reference counting model is different from garbage collector-based systems; still, I would suppose the RCW wrapping our COM objects keeps things clean for us. In this model, is that a safe assumption, or is there something we're missing?
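One crude way to peek at this (a sketch, not production code): Marshal.ReleaseComObject returns the RCW's updated reference count, so logging that value around our dispose calls would at least show whether a wrapper is still being held somewhere. The helper name here is made up for illustration:

using System.Runtime.InteropServices;

// Hypothetical helper: decrement the RCW count and log what remains.
// Note this is the RCW's count, not the native COM reference count
// kept inside the VB6 runtime.
static void ReleaseAndLog(object comObj, string name)
{
    int remaining = Marshal.ReleaseComObject(comObj);
    System.Diagnostics.Debug.WriteLine(name + ": RCW refcount after release = " + remaining);
}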
So, I've probably read the most varied collection of articles and docs on the topic of COM multithreading, but I still cannot work out how it is supposed to work exactly, especially when interacting with .Net technologies such as ASP.Net MVC. That could be dismissed as idle curiosity, except that we've got this quite critical project and we're experiencing severe issues trying to tie everything together. We're getting out of memory errors (in VB6), and apparently we got wrong how objects are created and how data is shared between them in COM. Continue reading to know how the story goes...
How things came to be
Not much to say here. We have a legacy VB6 desktop application made up of a number of ActiveX DLLs. These are configured to use Apartment as the threading model, and all classes are set as MultiUse. All worked well and nice until the time came when we were asked to transpose the app onto the mighty web :O
The problem we faced and how we (thought we) solved it
Since we haven't got the resources to design and develop a solution from scratch, we used a third-party java(script)-based framework to quickly build a web app. However, much of the real work is done by the legacy library, so we needed a way to interface the two components. The easiest way we could think of was to build a very basic (no auth, no UI) Asp.Net MVC website to use as the middle layer. This receives requests from the web app and translates them for the COM lib to crunch data.
To this end, and since the libs were never meant to be used as a server, we refactored the whole thing a bit so that most classes can now be used in a standalone manner: this included separating logic from the UI and eliminating all module and public vars where possible. Unfortunately, some UI coupling is still present, in particular some ComponentOne OCXs that handle reports and prints. All in all, this seemed to work just fine, until we had to deal with the COM threading model :O
Making sense of nonsense
Long story short, after a lot of digging and headaches we devised the current solution, which is outlined below:
we install the legacy app as usual, so that it registers its DLLs in the registry;
in our MVC solution we use System.Threading.Tasks, one task per request, to start the requested operation asynchronously. We assign the operation an id and return this id to the client. To start the task we call this method:
protected Task<TReturn> StartSTATask<TReturn>(Func<TReturn> function)
{
    var task = Task.Factory.StartNew(
        function,
        System.Threading.CancellationToken.None,
        TaskCreationOptions.None,
        STATaskScheduler // property storing the scheduler instance
    );
    return task;
}
the task is run by the StaTaskScheduler, which we modified so that it spawns a dedicated thread per task when the number of pooled threads is set to 0.
/// <summary>Initializes a new instance of the StaTaskScheduler class with the specified concurrency level.</summary>
/// <param name="numberOfThreads">The number of threads that should be created and used by this scheduler.</param>
public StaTaskScheduler(int numberOfThreads)
{
    // Validate arguments
    //if (numberOfThreads < 1) throw new ArgumentOutOfRangeException("concurrencyLevel");

    // Initialize the tasks collection
    _tasks = new BlockingCollection<Task>();

    if (numberOfThreads > 0)
    {
        // Create the threads to be used by this scheduler
        _threads = Enumerable.Range(0, numberOfThreads).Select(i =>
        {
            var thread = new Thread(() =>
            {
                // Continually get the next task and try to execute it.
                // This will continue until the scheduler is disposed and no more tasks remain.
                foreach (var t in _tasks.GetConsumingEnumerable())
                {
                    TryExecuteTask(t);
                }
            });
            thread.Name = "sta_thread_" + i;
            thread.IsBackground = true;
            thread.SetApartmentState(ApartmentState.STA);
            return thread;
        }).ToList();

        // Start all of the threads
        _threads.ForEach(t => t.Start());
    }
}
/// <summary>Queues a Task to be executed by this scheduler.</summary>
/// <param name="task">The task to be executed.</param>
protected override void QueueTask(Task task)
{
    if (_threads != null)
    {
        // Push it into the blocking collection of tasks
        _tasks.Add(task);
    }
    else
    {
        // Pool size was 0: dedicate a fresh STA thread to this task
        var thread = new Thread(() => TryExecuteTask(task));
        thread.Name = "sta_thread_task_" + task.Id;
        thread.IsBackground = true;
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
    }
}
And in our base controller's OnActionExecuting method we initialize it like so:
STATaskScheduler = HttpContext.Application["STATaskScheduler"] as TaskScheduler;
if (null == STATaskScheduler)
{
    STATaskScheduler = new StaTaskScheduler(0);
    HttpContext.Application["STATaskScheduler"] = STATaskScheduler;
}
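For completeness, a controller action then kicks off an operation roughly like this (a simplified sketch: the action name, the ChiusuraLogic class and the id handling are made-up stand-ins for our real code):

public ActionResult StartChiusuraMese()
{
    var operationId = Guid.NewGuid().ToString("N");

    // Fire the operation on an STA thread; the client polls for progress using the id.
    StartSTATask(() =>
    {
        // ChiusuraLogic stands in for one of our wrapper classes.
        var logic = new ChiusuraLogic(operationId);
        return logic.ChiusuraMese();
    });

    return Json(new { id = operationId }, JsonRequestBehavior.AllowGet);
}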
we use a thin wrapper to instantiate and call our COM libs through reflection:
// Libraries is a Dictionary containing the names of the registered dlls
protected object InitCom(Libraries lib)
{
    return InitCom(lib, true);
}

protected virtual object InitCom(Libraries lib, bool setOperation)
{
    var comObj = GetComInstance(lib);
    var success = SetUpConnection(comObj);
    if (!success)
        throw new LeafOperationException(lib, "Error while opening the connection: {1}".Printf(connectionString));
    if (setOperation)
        return InitOperation(comObj);
    return comObj;
}

protected object GetComInstance(Libraries lib)
{
    var comType = Type.GetTypeFromProgID(MALib[lib]);
    var comObj = Activator.CreateInstance(comType);
    return comObj;
}

protected virtual bool DisposeCom(object comObj)
{
    var success = CloseConnection(comObj);
    if (!success)
        throw new LeafOperationException("Error while closing the connection: {1}".Printf(connectionString));
    //Marshal.FinalReleaseComObject(comObj);
    //comObj = null;
    return success;
}

protected bool SetUpConnection(object comObj)
{
    var serverName = connectionString.ServerName();
    var catalogName = connectionString.CatalogName();
    return Convert.ToBoolean(comObj.InvokeMethod("Set_ConnectionWeb", serverName, catalogName));
}

protected bool CloseConnection(object comObj)
{
    return Convert.ToBoolean(comObj.InvokeMethod("Close_ConnectionWeb"));
}

protected object InitOperation(object comObj)
{
    comObj.GetType().InvokeMember("OperationID", BindingFlags.SetProperty, null, comObj, new object[] { OperationId });
    comObj.GetType().InvokeMember("OperationHash", BindingFlags.SetProperty, null, comObj, new object[] { OperationHash });
    return comObj;
}
The rationale behind this is that we create a new instance of the class with each request, eventually releasing it when done. Read here to know why we commented out the ReleaseComObject part. Basically, we were trading out of memory errors for a lot of "COM object that has been separated from its underlying RCW cannot be used" exceptions.
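For what it's worth, the pattern we keep circling back to is sketched below, reusing the helper names from the wrapper above (this is not what currently runs in production): release exactly once, in a finally block, on the same STA thread that created the object.

// Sketch: create, use and release a COM object on one STA thread.
protected TResult WithCom<TResult>(Libraries lib, Func<object, TResult> work)
{
    object comObj = null;
    try
    {
        comObj = InitCom(lib);
        return work(comObj);
    }
    finally
    {
        if (comObj != null)
        {
            CloseConnection(comObj);
            // Released once, after the last use: reusing a wrapper after
            // this call is what produces the "separated from its
            // underlying RCW" exceptions.
            Marshal.FinalReleaseComObject(comObj);
        }
    }
}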
The object is then used like this within methods of various classes:
public bool ChiusuraMese()
{
    try
    {
        PulisciMessaggi();
        var comObj = InitCom(Libraries.Chiusura);
        var byRefArgs = new int[] { 2 };
        var oReturn = comObj.InvokeMethodByRef("ChiusuraMese", byRefArgs, IdDitta, PeriodoGiornaliera, IdDipendenti.PadLeft(), IdGruppoInstallazione, CodGruppoGestione);
        DisposeCom(comObj);
        return Convert.ToInt32(oReturn) == 0;
    }
    catch (Exception ex)
    {
        using (ErrorLog Log = new ErrorLog(System.Reflection.Assembly.GetExecutingAssembly().FullName, ex)) { }
        aErrorMessage = ex.Message;
        return false;
    }
}
where InvokeMethodByRef is an extension method defined this way:
public static object InvokeMethodByRef(this object comObj, string methodName, int[] byRefArgs, params object[] args)
{
    // Mark the given argument positions as ByRef before invoking
    var modifiers = new ParameterModifier(args.Length);
    byRefArgs.ToList().ForEach(index => { modifiers[index] = true; });
    return comObj.GetType().InvokeMember(methodName, BindingFlags.InvokeMethod, null, comObj, args, new ParameterModifier[] { modifiers }, null, null);
}
Left out of the apartment
From what I understand, this whole apartment business is really hard to get right, with its cross-thread marshalling, message loops, yadda yadda whatnot. Add to that that we're using an old, unsupported technology for an application that was never architected for the purpose we're forcing it into. All that said, and taking for granted that the .Net side of things is working correctly, a couple of thoughts still wander in our minds. In particular:
is this the correct way to take advantage of multithreading with COM? Sometimes, multiple requests for the same object get stuck as if queued. This makes us wonder whether COM is actually sharing some instances between threads;
are we really creating and disposing of objects with each request, or does COM handle things differently under the hood? Apparently, we're getting public vars overwritten, so there's probably some resource contention and reentrancy somewhere we wouldn't expect;
is the setup correct? Are there alternatives that are easier to maintain and debug? Please keep in mind we have neither the time nor the resources to rewrite anything to a great extent. We could probably try something like creating an ActiveX EXE, but I wouldn't count on that;
what's the "least worst" way to use OCXs in a project of this kind (not using them is not an option at the moment)? Should we dispose of them in some particular way? We already checked that we set them to Nothing when finished, but maybe some other thread is still using them;
should we be aware of any particular COM limit related to our out of memory issue? We encountered the problem before, when a form had more than 256 unique controls displayed. Maybe the same is happening here somehow? The error seems especially related to classes using UI components.
Things I've already read (and probably did not understand)
Before you point me to online resources I should read, here are some topics I've already come across, in random order:
About SingleUse/MultiUse
http://www.vb-helper.com/howto_activex_dll.html
https://msdn.microsoft.com/en-us/library/aa242108(v=vs.60).aspx
Not really much choice here, if we want to stick with ActiveX DLLs with forms.
About (apartment) threading
https://msdn.microsoft.com/en-us/library/aa716297(v=vs.60).aspx
https://msdn.microsoft.com/en-us/library/aa716228(v=vs.60).aspx. By the way, this one probably hints that calls to objects are being serialized for access by other threads.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms680112%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396
About debugging
https://msdn.microsoft.com/en-us/library/aa241684(v=vs.60).aspx
https://msdn.microsoft.com/en-us/library/aa716193%28v=vs.60%29.aspx?f=255&MSPPError=-2147217396
Could a stack dump be of any help when we face the error? I don't even know how to use WinDbg, so I'd at least like to know whether that would be a total waste of time :D
We're kinda stuck here, as we've got no clue as to where or what to look for, so any kind of help would be really appreciated.
Comments
So it has been pointed out that I should read more about COM's threading model. I kind of expected that. Anyhow, to elaborate further, let me add some comments.
First, I don't have any control over CoInitialize or the like; I'm just instantiating some VB6 dlls. I guess COM is doing such and such under the hood. Fact is, I could not find anywhere what that is (edit: apparently, .Net is already taking care of that for me, see the answer to this question: Do I need to call CoInitialize before interacting with COM in .NET?).
To recap:
I'm using STA threads from the client app
I'm using Activator.CreateInstance, supposing it actually creates a new object every time it is called. The call is made within a new STA thread.
Let's set aside for a moment questions about thread-safety in the actual DLLs. What I'm mainly interested in understanding here is whether the described solution is a correct way (possibly not the best way, I'm aware of that) to exploit multithreading with COM libraries.
To cite some sources, to the best of my current knowledge I should be in the situation depicted in Figure 8.5 here: https://msdn.microsoft.com/en-us/library/aa716228(v=vs.60).aspx
I can't find any reason why this should not work since, as I said, I'm supposing each object resides in its own apartment and has its own variables, plus a copy of global vars (see here: https://msdn.microsoft.com/en-us/library/aa261361(v=vs.60).aspx).
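To at least probe that assumption, two cheap diagnostics come to mind (a sketch; Debug and Thread come from System.Diagnostics and System.Threading): check the apartment of the thread actually running the task, and check whether two CreateInstance calls come back wrapped by the same RCW, since .Net keeps one RCW per COM identity.

// Run inside code started via StartSTATask:
Debug.Assert(Thread.CurrentThread.GetApartmentState() == ApartmentState.STA);

// If the VB6 class effectively behaved as a singleton, both calls would
// return the same COM identity and hence the same RCW.
var first = GetComInstance(Libraries.Chiusura);
var second = GetComInstance(Libraries.Chiusura);
Debug.WriteLine("Same COM identity: " + ReferenceEquals(first, second));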
Background
We are trying to archive old user data to keep our most common tables smaller.
Issue
Normal EF code for removing records works for our custom tables. The AspNetUsers table is a different story. It appears that the way to do it is via _userManager.Delete or _userManager.DeleteAsync. These work as long as we don't make multiple db calls in one transaction. When I wrap the calls in a TransactionScope, it times out. Here is an example:
public bool DeleteByMultipleIds(List<string> idsToRemove)
{
    try
    {
        using (var scope = new TransactionScope())
        {
            foreach (var id in idsToRemove)
            {
                var user = _userManager.FindById(id);
                //copy user data to archive table
                _userManager.Delete(user); //causes timeout
            }
            scope.Complete();
        }
        return true;
    }
    catch (TransactionAbortedException e)
    {
        Logger.Publish(e);
        return false;
    }
    catch (Exception e)
    {
        Logger.Publish(e);
        return false;
    }
}
Note that while the code is running, even a direct call to the DB such as:
DELETE
FROM ASPNETUSERS
WHERE Id = 'X'
will also time out. The same SQL works fine before the C# code runs. Therefore, it appears that more than one db hit locks the table. How can I find the user (db hit #1) and delete the user (db hit #2) in one transaction?
For me, the problem involved the use of multiple separate DbContexts within the same transaction. The BeginTransaction() approach did not work.
Internally, UserManager.Delete() calls an async method inside a RunSync() wrapper. Therefore, passing the TransactionScopeAsyncFlowOption.Enabled parameter to my TransactionScope did the trick:
using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    _myContext1.Delete(organisation);
    _myContext2.Delete(orders);
    _userManager.Delete(user);
    scope.Complete();
}
Microsoft's advice is to use a different API when doing transactions with EF, because of the interactions between EF and the TransactionScope class. Implicitly, TransactionScope forces the isolation level up to Serializable, which causes a deadlock.
A good description of the EF internal API is here: MSDN Link
For reference, you may need to look into whether the user manager exposes its data context, and then replace your TransactionScope with using (var dbContextTransaction = context.Database.BeginTransaction()) { /* code */ }.
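If the user manager and your other operations can share one context, the shape would be roughly this (a sketch; it assumes context is the same instance the UserManager's store was built on):

using (var dbContextTransaction = context.Database.BeginTransaction())
{
    try
    {
        var user = _userManager.FindById(id);
        // copy user data to the archive table through the same context
        _userManager.Delete(user);
        context.SaveChanges();
        dbContextTransaction.Commit();
    }
    catch
    {
        dbContextTransaction.Rollback();
        throw;
    }
}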
Alternatively, looking at your scenario, you are actually quite safe finding the user by id, then trying to delete it, and simply catching the error if the user has been deleted in the fraction of a second between finding and deleting it.
I need to wrap some pieces of code in a TransactionScope. The code inside this using statement calls a managed C++ library, which in turn calls some unmanaged code. I also want to update my database, which uses Entity Framework.
Here comes the problem: when calling SaveChanges on the DbContext inside the TransactionScope, I always get some sort of timeout exception from the database layer. I've googled this, and it seems to be a fairly common problem, but I haven't found any answers applicable to my case. This is a snippet of my code:
using (var transactionScope = new TransactionScope())
{
    try
    {
        //Do call to the managed C++ Library
        using (var dbContext = _dbContextFactory.Create())
        {
            //doing some CRUD Operations on the DbContext
            //Probably some more dbContext related stuff
            dbContext.SaveChanges(); //Results in a timeout
        }
    }
    catch (Exception)
    {
        transactionScope.Dispose();
        throw;
    }
}
I'm using Entity Framework 6.1.3, so I can access BeginTransaction on the Database property, but I also need to wrap the C++ calls inside a TransactionScope.
Any suggestions?
You will need to pass in a TransactionOptions instance defining your timeout (how long to keep the transaction open). An example using the absolute upper limit:
TransactionOptions to = new TransactionOptions();
to.IsolationLevel = IsolationLevel.ReadCommitted;
to.Timeout = TransactionManager.MaximumTimeout;

using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required, to)) { /* ... */ }
You should definitely be aware of the impact of such a long-running transaction; this isn't typically required, and I'd highly recommend using something below MaximumTimeout that reflects how long you expect the transaction to run. Do your best to keep the period for which the transaction is held open as small as possible, doing any processing that doesn't have to be part of a single transaction outside the transaction scope.
It's also worth noting that, depending on the underlying database, it may enforce its own limits on transaction duration if configured to do so.
The thing is that SQL Server sometimes chooses a session as its deadlock victim when two processes lock each other out. One process does an update and the other just a read. During the read, SQL Server takes so-called 'shared locks', which do not block other readers but do block updaters. So far the only way to solve this is to reprocess the victimized operation.
Now, this is happening in a web application, and I would like to have a mechanism that can do the reprocessing (let's say with a maximum of 5 attempts) when needed.
I've looked at IHttpModule, which has BeginRequest() and EndRequest() events (amongst others), but that does not give me the ability to reprocess the request.
In fact, what I need is something that inserts itself between the HTTP handler and the process being called.
I could write something like this:
int maxtries = 5;
while (maxtries > 0)
{
    try
    {
        using (var scope = Session.OpenTransaction())
        {
            // process
            scope.Complete(); // commit
            return result;
        }
    }
    catch (DeadlockException)
    {
        maxtries--;
    }
}
throw new Exception("Deadlock not resolved after 5 attempts.");
but I would have to write that for every request, which is tedious and error-prone. It would be nice if I could just configure some kind of reprocessing handler via the Web.Config that is called automatically and does the deadlock reprocessing for me.
If you're getting deadlocks, you've got something wrong in your DB layer. You're missing indexes or something similar, or you are doing out-of-sequence updates within transactions that lock dependent entities.
Regardless, using HTTP as the mechanism to handle this error is not the way to go.
If you truly need to retry after a deadlock, then you should wrap the attempt in your own function and retry almost exactly as you describe above (a sketch follows below).
BUT I would strongly suggest that you identify the cause of the deadlock and resolve it.
Hope that does not sound too dismissive of your problem, but fix the cause of the problem, not the symptoms.
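If you do end up retrying, wrap the loop in a single helper instead of repeating it per request. SQL Server reports deadlock victims as error 1205 on the SqlException, so a minimal sketch (the helper name is made up) could look like this:

using System;
using System.Data.SqlClient;

public static class DeadlockRetry
{
    // Runs the action, retrying only when SQL Server reports error 1205
    // (transaction chosen as deadlock victim).
    public static T Run<T>(Func<T> action, int maxAttempts = 5)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (SqlException ex)
            {
                if (ex.Number != 1205 || attempt >= maxAttempts)
                    throw;
                // Deadlock victim: loop and try again.
            }
        }
    }
}

Usage would then be a one-liner such as var result = DeadlockRetry.Run(() => ProcessRequest());.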
Since you're using MVC, and assuming it is safe to rerun your entire action on DB failure, you can simply write a common base controller class from which all of your controllers inherit (if you don't already have one), and in it override OnActionExecuting to trap the specific exception(s) and retry. This way you'll have the code in only one place, but, again, only if it is safe to rerun the entire action.
Example:
public abstract class MyBaseController : Controller
{
    protected override void OnActionExecuting(
        ActionExecutingContext filterContext
    )
    {
        int maxtries = 5;
        while (maxtries > 0)
        {
            try
            {
                base.OnActionExecuting(filterContext);
                return;
            }
            catch (DeadlockException)
            {
                maxtries--;
            }
        }
        throw new Exception("Persistent DB locking - max retries reached.");
    }
}
... and then simply update every relevant controller to inherit from this controller (again, if you don't already have a common controller).
EDIT: Btw, Bigtoe's answer is correct: deadlock is the cause and should be dealt with accordingly. The above solution is really a workaround for when the DB layer cannot be reliably fixed. The first attempt should be to review and (re-)structure the queries so as to avoid deadlocks in the first place. Only if that is not practical should the above workaround be employed.
So I have a class that looks something like the following. A thread does some work using an Entity Framework Code First DbContext.
The problem I'm having is that the m_DB context seems to be caching data even though it should be disposed and recreated on every iteration of the processing loop.
What I've seen is that some data in a relationship isn't present in the loaded models. If I kill and restart the process, the data is suddenly found just like it should be.
The only thing I can think of is that this app uses MultipleActiveResultSets=true in the database connection string, but I can't find anything clearly stating that this would cause the behavior I'm seeing.
Any insight would be appreciated.
public class ProcessingService
{
    private MyContext m_DB = null;
    private bool m_Run = true;

    private void ThreadLoop()
    {
        while (m_Run)
        {
            try
            {
                if (m_DB == null)
                    m_DB = new MyContext();

                // the actual work done on each iteration
                ProcessingStepOne();
                ProcessingStepTwo();
            }
            catch (Exception ex)
            {
                //Log Error
            }
            finally
            {
                if (m_DB != null)
                {
                    m_DB.Dispose();
                    m_DB = null;
                }
            }
        }
    }

    private void ProcessingStepOne()
    {
        // Do some work with m_DB
    }

    private void ProcessingStepTwo()
    {
        // Do some work with m_DB
    }
}
Multiple Active Result Sets, or MARS, is a feature of SQL Server 2005/2008 and ADO.NET whereby one connection can be used by multiple active result sets (just as the name implies). Try switching it off in the connection string and observe the behaviour of the app; I am guessing this could be the likely cause of your problem. Read the following MSDN link for more on MARS:
MSDN - Multiple Active Result Sets
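Turning it off is just a connection-string change, along these lines (server and database names are placeholders):

// Same connection with MARS disabled:
var connectionString =
    "Data Source=myServer;Initial Catalog=myDb;" +
    "Integrated Security=True;MultipleActiveResultSets=False";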
Edit:
Try:
var results = from s in context.SomeEntity.AsNoTracking() where s.SomeProperty == someValue select s;
AsNoTracking() switches off internal change tracking of entities and it should also force Entity Framework to reload entities every time.
Whatever is said and done, you will need some amount of refactoring, since there's obviously a design flaw in your code.
I hate answering my own question, especially when I don't have a good explanation of why it fixes the problem.
I ended up removing MARS and it did resolve my issue. The best explanation I have is this:
Always read to the end of results for procedural requests regardless of whether they return results or not, and for batches that return multiple results. (http://technet.microsoft.com/en-us/library/ms131686.aspx)
My application doesn't always read through all the results returned, so my theory is that this somehow caused data to be cached and reused by the new DbContext.