How to put a timeout on Transform - c#

XslCompiledTransform.Transform will hang under certain conditions (stack overflow, infinite loop, etc.). This is a data (input) dependent error, so I can't fully prevent it. If this happens, I'd like to be notified gracefully, but I don't want it to destroy my application process and hence the GUI where the user is entering the input, which may be "valid" but "incomplete".
If I run the xslt file manually, I get
Process is terminated due to StackOverflowException
But XslCompiledTransform.Transform() will hang my application forever.
So, I want to wrap that call in a timeout, but nothing I've tried seems to work. It still hangs the application.
I want the thread that has the try block to not be hung. I want to create two tasks, one for Transform and the other timeout. Then start both at the same time. I don't know but I think the Run is running before the outer statement gets a chance to wire up the timeout and use the WhenAny.
How can this be fixed?
Update
I updated the code to reflect my current attempt. I can get into the if block if it times out, but whether I abort the thread or not, the application still hangs. I don't understand what it is about XslCompiledTransform.Transform that insists on taking the whole application down if it goes down.
public static Object Load(string mathML)
{
if (mathML == Notebooks.InputCell.EMPTY_MATH)
return null;
XmlDocument input = new XmlDocument();
input.LoadXml(mathML);
XmlDocument target = new XmlDocument(input.CreateNavigator().NameTable);
using (XmlWriter writer = target.CreateNavigator().AppendChild())
{
try
{
Thread thread = null;
var task = Task.Run(() =>
{
thread = Thread.CurrentThread;
XmlTransform.Transform(input, writer);
});
if (!task.Wait(TimeSpan.FromSeconds(5)))
{
thread.Abort();
throw new TimeoutException();
}
}
catch (XsltException xex)
{
if (xex.Message == "An item of type 'Attribute' cannot be constructed within a node of type 'Root'.")
return null;
else
throw;
}
}
return Load(target);
}
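The two-task idea described in the question (one task for the work, one for the timeout, combined with Task.WhenAny) can be sketched roughly as below. TimeoutRunner and TryRunAsync are hypothetical names, not part of the original code. Note the limits of this approach: on timeout the work is only abandoned, not stopped, and since a StackOverflowException cannot be caught in .NET it will still terminate the process no matter how the call is wrapped.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutRunner
{
    // Hypothetical helper: runs 'work' on the thread pool and waits at most 'timeout'.
    // Returns true if the work finished in time, false if the timeout elapsed first.
    // On timeout the work is NOT cancelled; it keeps running in the background.
    public static async Task<bool> TryRunAsync(Action work, TimeSpan timeout)
    {
        Task workTask = Task.Run(work);
        Task finished = await Task.WhenAny(workTask, Task.Delay(timeout));
        if (finished != workTask)
            return false;

        // Awaiting the completed task rethrows any exception the work threw
        // (for example an XsltException), so callers can still catch it.
        await workTask;
        return true;
    }
}
```

A caller would await TimeoutRunner.TryRunAsync(() => XmlTransform.Transform(input, writer), TimeSpan.FromSeconds(5)) and throw a TimeoutException when it returns false.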

Here's how I solved the issue
I took my xsl and compiled it into an assembly and referenced that assembly from my project (which is called Library)
Advantages:
Fixed the hang
Compiled xslt into an assembly is supposedly much faster
Disadvantages:
You tell me! I don't know :)
Library Properties / Build Events / Pre-build Event
"C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.7 Tools\xsltc.exe" /settings:script+ /class:Transform "myStylesheet.xslt"
Library / References
+ myStylesheet.dll
Loading the compiled transform
private static XslCompiledTransform xslTransform;
private static XslCompiledTransform XslTransform
{
get
{
if (xslTransform == null)
{
xslTransform = new XslCompiledTransform();
xslTransform.Load(typeof(Transform));
}
return xslTransform;
}
}
Calling the transform
Same as updated code in the Question

Related

Cancel yield from host

I have a network application that uses Lua scripts. Upon starting the application I create a global Lua state and load all script files, that contain various functions, and for every client that connects I create a Lua thread, for that connection.
// On start
var GL = luaL_newstate();
// register functions...
// load scripts...
// On connection
connection.State = lua_newthread(GL);
When a request that uses a script comes in, I get the global function and call it.
var NL = connection.State;
var result = lua_resume(NL, 0);
if (result != 0 && result != LUA_YIELD)
{
// error...
result = 0;
}
if (result == 0)
{
// function returned...
}
Now, some scripts require a response to something from the client, so I yield in those functions, to wait for it. When the response comes in, the script is resumed with lua_resume(NL, 1).
// Lua
text("How are you?")
local response = select("Good", "Bad")
// Host
private int select(IntPtr L)
{
// send response request...
return lua_yield(L, 1);
}
// On response
lua_pushstring(NL, response);
var result = lua_resume(NL, 1);
// ...
My problem is that I need to be able to cancel that yield, and return from the Lua function, without executing any more code in the Lua function, and without adding additional code to the scripts. In other words, I basically want to make the Lua thread throw an exception, get back to the start, and forget it ever executed that function.
Is that possible?
One thing I thought might work, but didn't, was calling lua_error. The result was an SEHException on the lua_error call. I assume because the script isn't currently running, but yielding.
While I didn't find a way to wipe a thread's slate clean (I don't think it's possible), I did find a solution in figuring out how lua_newthread works.
When the thread is created, the reference to it is put on the "global" state's stack, and it doesn't get collected until it's removed from there. All you have to do to clean up the thread is removing it from the stack with lua_remove. This requires you to create new threads regularly, but that's not much of a problem for me.
I'm now keeping track of the created threads and their index on the stack, so I can remove them when I'm done with them for whatever reason (cancel, error, etc). All other indices are updated, as the removal will shift the ones that came after it.
if (sessionOver)
{
lua_remove(GL, thread.StackIndex);
foreach (var t in threads)
{
if (t.StackIndex > thread.StackIndex)
t.StackIndex--;
}
}

The process cannot access the file because it is being used by another process

I am trying to do the following:
var path = Server.MapPath("File.js");
// Create the file if it doesn't exist or if the application has been restarted
// and the file was created before the application restarted
if (!File.Exists(path) || ApplicationStartTime > File.GetLastWriteTimeUtc(path)) {
var script = "...";
using (var sw = File.CreateText(path)) {
sw.Write(script);
}
}
However, the following error is occasionally thrown:
The process cannot access the file '...\File.js' because it is being
used by another process
I have looked on here for similar questions however mine seems slightly different from the others. Also I cannot replicate it until the server is under heavy load and therefore I wish to make sure it is correct before I upload the fix.
I'd appreciate it if someone could show me how to fix this.
Thanks
It sounds like two requests are running on your server at the same time, and they're both trying to write to that file at the same time.
You'll want to add in some sort of locking behavior, or else write a more robust architecture. Without knowing more about what specifically you're actually trying to accomplish with this file-writing procedure, the best I can suggest is locking. I'm generally not a fan of locking like this on web servers, since it makes requests depend on each other, but this would solve the problem.
Edit: Dirk pointed out below that this may or may not actually work. Depending on your web server configuration, static instances may not be shared, and the same result could occur. I've offered this as a proof of concept, but you should most definitely address the underlying problem.
private static object lockObj = new object();
private void YourMethod()
{
var path = Server.MapPath("File.js");
lock (lockObj)
{
// Create the file if it doesn't exist or if the application has been restarted
// and the file was created before the application restarted
if (!File.Exists(path) || ApplicationStartTime > File.GetLastWriteTimeUtc(path))
{
var script = "...";
using (var sw = File.CreateText(path))
{
sw.Write(script);
}
}
}
}
But, again, I'd be tempted to reconsider what you're actually trying to accomplish with this. Perhaps you could build this file in the Application_Start method, or even just a static constructor. Doing it for every request is a messy approach that will be likely to cause issues. Particularly under heavy load, where every request will be forced to run synchronously.
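As a rough sketch of the "build it once" suggestion, a Lazy<T> can make sure the file is written a single time per application start, with later callers simply reusing the result. ScriptFile and EnsureCreated are hypothetical names, and the temp-folder path stands in for Server.MapPath("File.js"). As with the static lock above, this only coordinates threads inside one process.

```csharp
using System;
using System.IO;
using System.Threading;

public static class ScriptFile
{
    // ExecutionAndPublication: the factory runs at most once, even when many
    // threads ask for the value at the same time; the others block until it is done.
    private static readonly Lazy<string> createdPath =
        new Lazy<string>(Create, LazyThreadSafetyMode.ExecutionAndPublication);

    // Call this from each request; only the first call after startup writes the file.
    public static string EnsureCreated()
    {
        return createdPath.Value;
    }

    private static string Create()
    {
        // Assumption: a temp-folder path replaces Server.MapPath("File.js") here.
        string path = Path.Combine(Path.GetTempPath(), "File.js");
        string script = "..."; // placeholder content, as in the question
        File.WriteAllText(path, script);
        return path;
    }
}
```

Because the static field is re-initialized whenever the application restarts, the file is regenerated after a restart without comparing timestamps. One caveat: in this mode, if the factory throws, the exception is cached and rethrown on every later access.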

Force loop containing asynchronous task to maintain sequence

Something tells me this might be a stupid question and I have in fact approached my problem from the wrong direction, but here goes.
I have some code that loops through all the documents in a folder - The alphabetical order of these documents in each folder is important, this importance is also reflected in the order the documents are printed. Here is a simplified version:
var wordApp = new Microsoft.Office.Interop.Word.Application();
foreach (var file in Directory.EnumerateFiles(folder))
{
fileCounter++;
// Print file, referencing a previously instantiated word application object
wordApp.Documents.Open(...)
wordApp.PrintOut(...)
wordApp.ActiveDocument.Close(...)
}
It seems (and I could be wrong) that the PrintOut code is asynchronous, and the application sometimes gets into a situation where the documents get printed out of order. This is confirmed because if I step through, or place a long enough Sleep() call, the order of all the files is correct.
How should I prevent the next print task from starting before the previous one has finished?
I initially thought that I could use a lock(someObject){} until I remembered that they are only useful for preventing multiple threads accessing the same code block. This is all on the same thread.
There are some events I can wire into on the Microsoft.Office.Interop.Word.Application object: DocumentOpen, DocumentBeforeClose and DocumentBeforePrint
I have just thought that this might actually be a problem with the print queue not being able to accurately distinguish lots of documents that are added within the same second. This can't be the problem, can it?
As a side note, this loop is within the code called from the DoWork event of a BackgroundWorker object. I'm using this to prevent UI blocking and to feedback the progress of the process.
Your event-handling approach seems like a good one. Instead of using a loop, you could add a handler to the DocumentBeforeClose event, in which you would get the next file to print, send it to Word, and continue. Something like this:
// Sort explicitly so the print order does not depend on the file system,
// and use a queue so files are taken from the front in alphabetical order
Queue<string> m_files = new Queue<string>(Directory.EnumerateFiles(folder).OrderBy(f => f));
wordApp.DocumentBeforeClose += ProcessNextDocument;
...
void ProcessNextDocument(...)
{
string file = null;
lock (m_files)
{
if (m_files.Count > 0)
{
// Take the next file from the front of the queue
file = m_files.Dequeue();
}
else
{
// Done!
}
}
if (file != null)
{
PrintDocument(file);
}
}
void PrintDocument(string file)
{
wordApp.Documents.Open(...);
wordApp.PrintOut(...);
wordApp.ActiveDocument.Close(...);
}
The first parameter of Application.PrintOut specifies whether the printing should take place in the background or not. By setting it to false it will work synchronously.

Explain Strange Synchronization in WCF Source Code

While looking at the source code of System.ServiceModel.Channels.BufferManager, I noticed this method:
void TuneQuotas()
{
if (areQuotasBeingTuned)
return;
bool lockHeld = false;
try
{
try { }
finally
{
lockHeld = Monitor.TryEnter(tuningLock);
}
// Don't bother if another thread already has the lock
if (!lockHeld || areQuotasBeingTuned)
return;
areQuotasBeingTuned = true;
}
finally
{
if (lockHeld)
{
Monitor.Exit(tuningLock);
}
}
//
// DO WORK... (code removed for brevity)
//
areQuotasBeingTuned = false;
}
Obviously, they want only one thread to run TuneQuotas(), and other threads to not wait if it is already being run by another thread. I should note that the code removed was not try protected.
I'm trying to understand the advantages of this method above over just doing this:
void TuneQuotas()
{
if(!Monitor.TryEnter(tuningLock)) return;
//
// DO WORK...
//
Monitor.Exit(tuningLock);
}
Any ideas why they might have bothered with all that? I suspect the way they use the finally blocks is to guard against a thread abort scenario, but I still don't see the point because, even with all this code, TuneQuotas() would be locked for good if that one thread doesn't make it all the way to the end to set areQuotasBeingTuned=false, for one reason or another. So is there something cool about this pattern that I'm missing?
EDIT:
As a side note, it seems the method exists in .NET 4.0, which I confirmed using this code running on framework 4 (although I cannot confirm that the content of the method hasn't changed from what I found on the web):
var buffMgr = BufferManager.CreateBufferManager(1, 1);
var pooledBuffMgrType = buffMgr.GetType()
.GetProperty("InternalBufferManager")
.GetValue(buffMgr, null)
.GetType();
Debug.WriteLine(pooledBuffMgrType.Module.FullyQualifiedName);
foreach (var methodInfo in pooledBuffMgrType
.GetMethods(BindingFlags.Instance | BindingFlags.NonPublic))
{
Debug.WriteLine(methodInfo.Name);
}
which outputs:
C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Runtime.DurableInstancing\v4.0_4.0.0.0__31bf3856ad364e35\System.Runtime.DurableInstancing.dll
ChangeQuota
DecreaseQuota
FindMostExcessivePool
FindMostStarvedPool
FindPool
IncreaseQuota
TuneQuotas
Finalize
MemberwiseClone
I'll add some comments:
void TuneQuotas()
{
if (areQuotasBeingTuned)
return; //fast-path, does not require locking
bool lockHeld = false;
try
{
try { }
finally
{
//finally-blocks cannot be aborted by Thread.Abort
//The thread could be aborted after getting the lock and before setting lockHeld
lockHeld = Monitor.TryEnter(tuningLock);
}
// Don't bother if another thread already has the lock
if (!lockHeld || areQuotasBeingTuned)
return; //areQuotasBeingTuned could have switched to true in the mean-time
areQuotasBeingTuned = true; //prevent others from needlessly trying to lock (trigger fast-path)
}
finally //ensure the lock being released
{
if (lockHeld)
{
Monitor.Exit(tuningLock);
}
}
//
// DO WORK... (code removed for brevity)
//
//this might be a bug. There should be a call to Thread.MemoryBarrier,
//or areQuotasBeingTuned should be volatile
//if not, the write might never reach other processor cores
//maybe this doesn't matter for x86
areQuotasBeingTuned = false;
}
The simple version you gave does not protect against some problems. At the very least it is not exception-safe (lock won't be released). Interestingly, the "sophisticated" version, doesn't either.
This method has been removed from .NET 4.
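For illustration, an exception-safe form of the simple TryEnter version might look like the sketch below. QuotaTuner and TryTune are hypothetical names. It uses the Monitor.TryEnter(object, ref bool) overload available from .NET 4 onwards. Unlike the WCF code, it holds the lock for the whole duration of the work instead of releasing it early and relying on a flag.

```csharp
using System;
using System.Threading;

public class QuotaTuner
{
    private readonly object tuningLock = new object();

    // Returns true if this call ran the work, false if another thread held the lock.
    public bool TryTune(Action work)
    {
        bool lockTaken = false;
        try
        {
            // lockTaken is set reliably, even if the thread is interrupted right after.
            Monitor.TryEnter(tuningLock, ref lockTaken);
            if (!lockTaken)
                return false; // someone else is already tuning; do not wait

            work();
            return true;
        }
        finally
        {
            // Runs on normal return and on exceptions, so the lock is always released.
            if (lockTaken)
                Monitor.Exit(tuningLock);
        }
    }
}
```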
Until .NET 4.0 there was essentially a bug in the code that was generated by a lock statement. It would generate something similar to the following:
Monitor.Enter(lockObject)
// see next paragraph
try
{
// code that was in the lock block
}
finally
{
Monitor.Exit(lockObject);
}
This means that if an exception occurred between Enter and try, the Exit would never be called. As usr alluded to, this could happen due to Thread.Abort.
Your example:
if(!Monitor.TryEnter(tuningLock)) return;
//
// DO WORK...
//
Monitor.Exit(tuningLock);
Suffers from this problem and more. The window in which this code can become interrupted and Exit not be called is basically the whole block of code, by any exception (not just one from Thread.Abort).
I have no idea why most of the code in .NET was written the way it was. But, I surmise that this code was written to avoid the problem of an exception between Enter and try. Let's look at some of the details:
try{}
finally
{
lockHeld = Monitor.TryEnter(tuningLock);
}
Finally blocks basically generate a constrained execution region in IL. Constrained execution regions cannot be interrupted by anything. So, putting the TryEnter in the finally block above ensures that lockHeld reliably holds the state of the lock.
That block of code is contained in a try/finally block whose finally statement calls Monitor.Exit if lockHeld is true. This means that there is no point between the TryEnter and the try block that can be interrupted.
FWIW, this method was still in .NET 3.5 and is visible in the WCF 3.5 source code (not the .NET source code). I don't know yet what's in 4.0; but I would imagine it would be the same; there's no reason to change working code even if the impetus for part of its structure no longer exists.
For more details on what lock used to generate see http://blogs.msdn.com/b/ericlippert/archive/2007/08/17/subtleties-of-c-il-codegen.aspx
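For comparison, with the C# 4.0 compiler and later a lock statement expands to roughly the pattern below, where the Enter call sits inside the try block and reports through a flag whether the lock was taken. Counter and Increment are hypothetical names used only to give the pattern a runnable home.

```csharp
using System.Threading;

public class Counter
{
    private readonly object sync = new object();
    private int value;

    // Hand-written equivalent of: lock (sync) { return ++value; }
    public int Increment()
    {
        bool lockTaken = false;
        try
        {
            // Enter is inside the try, so there is no gap between taking
            // the lock and being covered by the finally block.
            Monitor.Enter(sync, ref lockTaken);
            return ++value;
        }
        finally
        {
            if (lockTaken)
                Monitor.Exit(sync);
        }
    }
}
```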
Any ideas why they might have bothered with all that?
After running some tests, I think I see one reason (if not THE reason): They probably bothered with all that because it is MUCH faster!
It turns out Monitor.TryEnter is an expensive call IF the object is already locked (if it's not locked, TryEnter is still very fast -- no problems there). So all threads, except the first one, are going to experience the slowness.
I didn't think this would matter that much, since after all each thread is going to try taking the lock just once and then move on (not like they'd be sitting there, trying in a loop). However, I wrote some code for comparison and it showed that the cost of TryEnter (when already locked) is significant. In fact, on my system each call took about 0.3 ms without the debugger attached, which is several orders of magnitude slower than using a simple boolean check.
So I suspect, this probably showed up in Microsoft's test results, so they optimized the code as above, by adding the fast track boolean check. But that's just my guess..

How to Lock a file and avoid readings while it's writing

My web application returns a file from the filesystem. These files are dynamic, so I have no way to know their names or how many of them there will be. When a file doesn't exist, the application creates it from the database. I want to prevent two different threads from recreating the same file at the same time, and to prevent a thread from returning the file while another thread is still creating it.
Also, I don't want to take a lock on an element that is common to all the files. Therefore I should lock a file only while I'm creating it.
So I want to lock a file until its recreation is complete. If another thread tries to access it, it will have to wait until the file is unlocked.
I've been reading about FileStream.Lock, but I have to know the file length, and it won't prevent other threads from trying to read the file, so it doesn't work for my particular case.
I've also been reading about FileShare.None, but it will throw an exception (which exception type?) if another thread/process tries to access the file, so I would have to develop a "try again while it's failing" loop. I'd like to avoid generating exceptions, and I don't like that approach much, although maybe there is no better way.
The approach with FileShare.None would be this more or less:
static void Main(string[] args)
{
new Thread(new ThreadStart(WriteFile)).Start();
Thread.Sleep(1000);
new Thread(new ThreadStart(ReadFile)).Start();
Console.ReadKey(true);
}
static void WriteFile()
{
using (FileStream fs = new FileStream("lala.txt", FileMode.Create, FileAccess.Write, FileShare.None))
using (StreamWriter sw = new StreamWriter(fs))
{
Thread.Sleep(3000);
sw.WriteLine("trolololoooooooooo lolololo");
}
}
static void ReadFile()
{
Boolean readed = false;
Int32 maxTries = 5;
while (!readed && maxTries > 0)
{
try
{
Console.WriteLine("Reading...");
using (FileStream fs = new FileStream("lala.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
using (StreamReader sr = new StreamReader(fs))
{
while (!sr.EndOfStream)
Console.WriteLine(sr.ReadToEnd());
}
readed = true;
Console.WriteLine("Readed");
}
catch (IOException)
{
Console.WriteLine("Fail: " + maxTries.ToString());
maxTries--;
Thread.Sleep(1000);
}
}
}
But I don't like the fact that I have to catch exceptions, try several times and wait an inaccurate amount of time :|
You can handle this by using the FileMode.CreateNew argument to the stream constructor. One of the threads is going to lose, find out that the file was already created a microsecond earlier by another thread, and get an IOException.
It will then need to spin, waiting for the file to be fully created. Which you enforce with FileShare.None. Catching exceptions here doesn't matter, it is spinning anyway. There's no other workaround for it anyway unless you P/Invoke.
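A minimal sketch of this CreateNew-then-spin approach is shown below. FileCache, GetOrCreate and the retry limits are hypothetical choices, not from the original answer. It also assumes the winning thread always finishes writing: if the creator fails midway, an incomplete file is left behind and would need separate cleanup.

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileCache
{
    // Returns the file's text, creating the file from 'produce' if it does not exist yet.
    public static string GetOrCreate(string path, Func<string> produce)
    {
        try
        {
            // CreateNew throws IOException when the file already exists, so only one
            // caller wins. FileShare.None keeps readers out until the writer is done.
            using (var fs = new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None))
            using (var sw = new StreamWriter(fs))
            {
                string content = produce();
                sw.Write(content);
                return content;
            }
        }
        catch (IOException)
        {
            // Another thread created (or is still creating) the file; read it instead.
        }

        for (int attempt = 0; ; attempt++)
        {
            try
            {
                using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
                using (var sr = new StreamReader(fs))
                    return sr.ReadToEnd();
            }
            catch (IOException) when (attempt < 50)
            {
                // The writer still holds the file exclusively; wait and retry.
                // After 50 failed attempts the exception propagates to the caller.
                Thread.Sleep(100);
            }
        }
    }
}
```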
I think the right approach would be the following: keep a set of strings holding the names of the files already being processed, so that only one thread processes a given file at a time. Something like this:
// somewhere in your code, or put it in a singleton
static System.Collections.Generic.HashSet<String> filesAlreadyProcessed = new System.Collections.Generic.HashSet<String>();
// thread main method code
bool fileAlreadyProcessed = false;
lock (filesAlreadyProcessed) {
if (filesAlreadyProcessed.Contains(filename)) {
fileAlreadyProcessed = true;
}
else {
filesAlreadyProcessed.Add(filename);
}
}
if (!fileAlreadyProcessed) {
// ProcessFile
}
Do you have a way to identify what files are being created?
Say every one of those files corresponds to a unique ID in your database. You create a centralised location (Singleton?), where these IDs can be associated with something lockable (Dictionary). A thread that needs to read/write to one of those files does the following:
//Request access
ReaderWriterLockSlim fileLock = null;
bool needCreate = false;
lock(Coordination.Instance)
{
if(Coordination.Instance.ContainsKey(theId))
{
fileLock = Coordination.Instance[theId];
}
else if(!fileExists(theId)) //check if the file exists at this moment
{
Coordination.Instance[theId] = fileLock = new ReaderWriterLockSlim();
fileLock.EnterWriteLock(); //give no other thread the chance to get into write mode
needCreate = true;
}
else
{
//The file exists, and whoever created it, is done with writing. No need to synchronize in this case.
}
}
if(needCreate)
{
createFile(theId); //Writes the file from the database
lock(Coordination.Instance)
Coordination.Instance.Remove(theId);
fileLock.ExitWriteLock();
fileLock = null;
}
if(fileLock != null)
fileLock.EnterReadLock();
//read your data from the file
if(fileLock != null)
fileLock.ExitReadLock();
Of course, threads that don't follow this exact locking protocol will have access to the file.
Now, locking over a Singleton object is certainly not ideal, but if your application needs global synchronization then this is a way to achieve it.
Your question really got me thinking.
Instead of having every thread responsible for file access and having them block, what if you used a queue of files that need to be persisted and have a single background worker thread dequeue and persist?
While the background worker is cranking away, you can have the web application threads return the db values until the file does actually exist.
I've posted a very simple example of this on GitHub.
Feel free to give it a shot and let me know what you think.
FYI, if you don't have git, you can use svn to pull it http://svn.github.com/statianzo/MultiThreadFileAccessWebApp
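The queue-and-single-writer idea can be sketched with a BlockingCollection, as below. This is not the code from the linked repository. FileWriterQueue and its members are hypothetical names, and writing to a temporary name before moving the file into place is an added assumption so that readers should not see a half-written file.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public sealed class FileWriterQueue : IDisposable
{
    private readonly BlockingCollection<(string Path, string Content)> pending =
        new BlockingCollection<(string Path, string Content)>();
    private readonly Task worker;

    public FileWriterQueue()
    {
        // A single long-running consumer is the only code that ever writes files.
        worker = Task.Factory.StartNew(Drain, TaskCreationOptions.LongRunning);
    }

    // Called by request threads; returns immediately without touching the disk.
    public void Enqueue(string path, string content)
    {
        pending.Add((path, content));
    }

    private void Drain()
    {
        foreach (var item in pending.GetConsumingEnumerable())
        {
            if (File.Exists(item.Path))
                continue; // already persisted by an earlier request

            string temp = item.Path + ".tmp";
            File.WriteAllText(temp, item.Content);
            File.Move(temp, item.Path); // publish under the final name when complete
        }
    }

    public void Dispose()
    {
        pending.CompleteAdding(); // let the worker finish the remaining items
        worker.Wait();
        pending.Dispose();
    }
}
```

Request threads would check File.Exists for the final path, serve the database value while it is missing, and call Enqueue so the file is there for later requests.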
The question is old and there is already a marked answer. Nevertheless I would like to post a simpler alternative.
I think we can directly use the lock statement on the filename, as follows:
lock(string.Intern("FileLock:absoluteFilePath.txt"))
{
// your code here
}
Generally, locking a string is a bad idea because of String Interning. But in this particular case it should ensure that no one else is able to access that lock. Just use the same lock string before attempting to read. Here interning works for us and not against.
PS: The text 'FileLock' is just some arbitrary text to ensure that other string file paths are not affected.
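Wrapped in a small helper, the interned-string lock might look like the sketch below. FileLocks, For and WithLock are hypothetical names. It assumes every caller passes the path in exactly the same form (same casing, always absolute), since two different spellings of one path produce two different locks, and it only coordinates threads within a single process.

```csharp
using System;

public static class FileLocks
{
    // string.Intern returns one shared instance per distinct string value,
    // so equal paths always map to the same lock object.
    public static object For(string absolutePath)
    {
        return string.Intern("FileLock:" + absolutePath);
    }

    // Runs 'action' while holding the lock that belongs to this file path.
    public static T WithLock<T>(string absolutePath, Func<T> action)
    {
        lock (For(absolutePath))
        {
            return action();
        }
    }
}
```

Both the code that creates the file and the code that reads it would go through WithLock with the same path.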
Why aren't you just using the database - e.g. if you have a way to associate a filename with the data from the db it contains, just add some information to the db that specifies whether a file exists with that information currently and when it was created, how stale the information in the file is etc. When a thread needs some information, it checks the db to see if that file exists and if not, it writes out a row to the table saying it's creating the file. When it's done it updates that row with a boolean saying the file is ready to be used by others.
The nice thing about this approach - all your information is in 1 place - so you can do nice error recovery - e.g. if the thread creating the file dies badly for some reason, another thread can come along and decide to rewrite the file because the creation time is too old. You can also create simple batch cleanup processes and get accurate data on how frequently certain data is being used for a file, how often information is updated (by looking at the creation times etc). Also, you avoid having to do many many disk seeks across your filesystem as different threads look for different files all over the place - especially if you decide to have multiple front-end machines seeking across a common disk.
The tricky thing - you'll have to make sure your db supports row-level locking on the table that threads write to when they create files because otherwise the table itself may be locked which could make this unacceptably slow.
