I came across an issue, and I'm not sure if it's me or if there's an issue with thread locking.
I have a class I use for basic utilities. In that class is a method that creates or appends to a text file. Because I use it to debug, I have the method take a lock() so that access stays single-threaded. Except it appears to be failing and allowing multiple threads into the locked code.
When running my test threads it doesn't throw an error every time, which is a little weird. There are 50 threads/tasks being created. Each thread writes a line to a single file using the class below. It cycles through about 3100 individual tasks, but a maximum of 50 tasks exist at once to handle each batch; as each thread completes its task, a new one is created to take its place. The last batch processed 3188 commands and threw 16 errors.
I have tried using Monitor.Enter and Exit, and I have also tried making StdLibLockObj readonly, all with the same results.
Error: The process cannot access the file 'ThreadExe.txt' because it is being used by another process.
static class StdLib
{
    private static object StdLibLockObj = new object();

    public static void WriteLogFile(string AFileName, string FileData, bool AppendIfExists = true, bool AddAppPath = true)
    {
        lock (StdLibLockObj)
        {
            StreamWriter sw = null;
            try
            {
                if (AddAppPath)
                {
                    AFileName = Path.Combine(ApplicationPath(), AFileName);
                }
                if ((AppendIfExists) && File.Exists(AFileName))
                {
                    sw = File.AppendText(AFileName);
                }
                else
                {
                    sw = File.CreateText(AFileName);
                }
                sw.Write(FileData);
            }
            finally
            {
                if (sw != null)
                {
                    sw.Flush();
                    sw.Close();
                    sw.Dispose();
                }
                sw = null;
            }
        }
    }
}
My background is mostly in Delphi, where threading is a bit more granular.
Any help would be appreciated.
Wrap your StreamWriter entries in a "using" block. That will get rid of the file-locking problem by guaranteeing the writer is flushed, closed, and the OS file handle released even if an exception is thrown. Sort of like this:
public static void ErrorMessage(string logMessage)
{
    using (StreamWriter sw_errors = new StreamWriter(m_errors, true))
    {
        sw_errors.Write("\r\nLog Entry : ");
        sw_errors.WriteLine("{0} {1}", DateTime.Now.ToLongTimeString(),
            DateTime.Now.ToLongDateString());
        sw_errors.WriteLine(" :");
        sw_errors.WriteLine(" :{0}", logMessage);
        sw_errors.WriteLine("-------------------------------");
    }
}
Related: Writing StringBuilder to file asynchronously. This code takes control of a file, writes a stream to it, and releases it. It deals with requests from asynchronous operations, which may arrive at any time.
The FilePath is set per class instance (so the lock Object is per instance), but there is potential for conflict since these classes may share FilePaths. That sort of conflict, as well as all other types from outside the class instance, would be dealt with by retries.
Is this code suitable for its purpose? Is there a better way to handle this that means less (or no) reliance on the catch and retry mechanic?
Also, how do I avoid catching exceptions that have occurred for other reasons?
public string Filepath { get; set; }
private Object locker = new Object();

public async Task WriteToFile(StringBuilder text)
{
    int timeOut = 100;
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    while (true)
    {
        try
        {
            // Wait for resource to be free
            lock (locker)
            {
                using (FileStream file = new FileStream(Filepath, FileMode.Append, FileAccess.Write, FileShare.Read))
                using (StreamWriter writer = new StreamWriter(file, Encoding.Unicode))
                {
                    writer.Write(text.ToString());
                }
            }
            break;
        }
        catch
        {
            // File not available; conflict with other class instances or another application
        }
        if (stopwatch.ElapsedMilliseconds > timeOut)
        {
            // Give up.
            break;
        }
        // Wait and retry
        await Task.Delay(5);
    }
    stopwatch.Stop();
}
How you approach this is going to depend a lot on how frequently you're writing. If you're writing a relatively small amount of text fairly infrequently, then just use a static lock and be done with it. That might be your best bet in any case because the disk drive can only satisfy one request at a time. Assuming that all of your output files are on the same drive (perhaps not a fair assumption, but bear with me), there's not going to be much difference between locking at the application level and the lock that's done at the OS level.
So if you declare locker as:
static object locker = new object();
You'll be assured that there are no conflicts with other threads in your program.
If you want this thing to be bulletproof (or at least reasonably so), you can't get away from catching exceptions. Bad things can happen. You must handle exceptions in some way. What you do in the face of error is something else entirely. You'll probably want to retry a few times if the file is locked. If you get a bad path or filename error or disk full or any of a number of other errors, you probably want to kill the program. Again, that's up to you. But you can't avoid exception handling unless you're okay with the program crashing on error.
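As for avoiding catching exceptions that occurred for other reasons: catch only IOException in the retry loop and rethrow once you give up, so unrelated failures still surface. A minimal sketch along those lines, reusing the Filepath and text from your method (the attempt count and delay are arbitrary):
const int maxAttempts = 5;
for (int attempt = 1; ; attempt++)
{
    try
    {
        File.AppendAllText(Filepath, text.ToString());
        break; // success
    }
    catch (IOException)
    {
        // Sharing violations show up as IOException. Note that some other
        // failures (e.g. DirectoryNotFoundException) derive from IOException
        // too, so they will also be retried before finally propagating;
        // anything else (UnauthorizedAccessException, etc.) surfaces immediately.
        if (attempt == maxAttempts)
            throw; // give up and let the caller see the real error
        Thread.Sleep(5);
    }
}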
By the way, you can replace all of this code:
using (FileStream file = new FileStream(Filepath, FileMode.Append, FileAccess.Write, FileShare.Read))
using (StreamWriter writer = new StreamWriter(file, Encoding.Unicode))
{
    writer.Write(text.ToString());
}
With a single call:
File.AppendAllText(Filepath, text.ToString());
Assuming you're using .NET 4.0 or later. See File.AppendAllText.
One other way you could handle this is to have the threads write their messages to a queue, and have a dedicated thread that services that queue. You'd have a BlockingCollection of messages and associated file paths. For example:
class LogMessage
{
    public string Filepath { get; set; }
    public string Text { get; set; }
}

BlockingCollection<LogMessage> _logMessages = new BlockingCollection<LogMessage>();
Your threads write data to that queue:
_logMessages.Add(new LogMessage { Filepath = "foo.log", Text = "this is a test" });
You start a long-running background task that does nothing but service that queue:
foreach (var msg in _logMessages.GetConsumingEnumerable())
{
    // of course you'll want your exception handling in here
    File.AppendAllText(msg.Filepath, msg.Text);
}
Your potential risk here is that threads create messages too fast, causing the queue to grow without bound because the consumer can't keep up. Whether that's a real risk in your application is something only you can say. If you think it might be a risk, you can put a maximum size (number of entries) on the queue so that if the queue size exceeds that value, producers will wait until there is room in the queue before they can add.
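For example, bounding the queue at 10000 entries (the number is arbitrary) is just a constructor argument; Add then blocks producers whenever the queue is full:
BlockingCollection<LogMessage> _logMessages =
    new BlockingCollection<LogMessage>(boundedCapacity: 10000);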
You could also use ReaderWriterLock; it is considered a more 'appropriate' way to control thread safety when dealing with read/write operations...
To debug my web apps (when remote debug fails) I use the following ('debug.txt' ends up in the \bin folder on the server):
public static class LoggingExtensions
{
    static ReaderWriterLock locker = new ReaderWriterLock();

    public static void WriteDebug(string text)
    {
        try
        {
            locker.AcquireWriterLock(int.MaxValue);
            string dir = Path.GetDirectoryName(
                System.Reflection.Assembly.GetExecutingAssembly().GetName().CodeBase)
                .Replace("file:\\", "");
            System.IO.File.AppendAllLines(Path.Combine(dir, "debug.txt"), new[] { text });
        }
        finally
        {
            locker.ReleaseWriterLock();
        }
    }
}
Hope this saves you some time.
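As an aside, ReaderWriterLock is the legacy API; on .NET 3.5 and later, ReaderWriterLockSlim is the recommended replacement. The same helper would look roughly like this (path logic shortened for brevity):
static ReaderWriterLockSlim locker = new ReaderWriterLockSlim();

public static void WriteDebug(string text)
{
    locker.EnterWriteLock();
    try
    {
        System.IO.File.AppendAllLines("debug.txt", new[] { text });
    }
    finally
    {
        locker.ExitWriteLock();
    }
}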
I have a simple logging mechanism that should be thread safe. It works most of the time, but every now and then I get an exception on the line "_logQ.Enqueue(s);" saying that the queue is not long enough. Looking in the debugger, there are sometimes just hundreds of items, so I can't see it being a resource issue. The queue is supposed to expand as needed. Whether I catch the exception or let the debugger pause at it, I see the same error. Is there something not thread safe here? I don't even know how to start debugging this.
static void ProcessLogQ(object state)
{
    try
    {
        while (_logQ.Count > 0)
        {
            var s = _logQ.Dequeue();
            string dir = "";
            Type t = Type.GetType("Mono.Runtime");
            if (t != null)
            {
                dir = "/var/log";
            }
            else
            {
                dir = @"c:\log";
                if (!Directory.Exists(dir))
                    Directory.CreateDirectory(dir);
            }
            if (Directory.Exists(dir))
            {
                File.AppendAllText(Path.Combine(dir, "admin.log"), DateTime.Now.ToString("hh:mm:ss ") + s + Environment.NewLine);
            }
        }
    }
    catch (Exception)
    {
    }
    finally
    {
        _isProcessingLogQ = false;
    }
}

public static void Log(string s)
{
    if (_logQ == null)
        _logQ = new Queue<string>();
    lock (_logQ)
        _logQ.Enqueue(s);
    if (!_isProcessingLogQ)
    {
        _isProcessingLogQ = true;
        ThreadPool.QueueUserWorkItem(ProcessLogQ);
    }
}
Note that the threads all call Log(string s). ProcessLogQ is private to the logger class.
Edit: I made a mistake in not mentioning that this is in a .NET 3.5 environment, so I can't use Task or ConcurrentQueue. I am working on fixes for the current example within the .NET 3.5 constraints.
Edit 2: I believe I have a thread-safe version for .NET 3.5, listed below. I start the logger thread once from a single thread at program start, so there is only one thread logging to the file (t is a static Thread):
static void ProcessLogQ()
{
    while (true)
    {
        try
        {
            lock (_logQ) // no stray ';' here, or the lock would guard nothing
            {
                while (_logQ.Count > 0)
                {
                    var s = _logQ.Dequeue();
                    string dir = "../../log";
                    if (!Directory.Exists(dir))
                        Directory.CreateDirectory(dir);
                    if (Directory.Exists(dir))
                    {
                        File.AppendAllText(Path.Combine(dir, "s3ol.log"), DateTime.Now.ToString("hh:mm:ss ") + s + Environment.NewLine);
                    }
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
        Thread.Sleep(1000);
    }
}

public static void startLogger()
{
    lock (t)
    {
        if (t.ThreadState != ThreadState.Running)
            t.Start();
    }
}

private static void multiThreadLog(string msg)
{
    lock (_logQ)
        _logQ.Enqueue(msg);
}
Look at the Task Parallel Library. All the hard work is already done for you. If you're doing this to learn about multithreading, read up on locking techniques and the pros and cons of each.
Further, you're checking whether _logQ is null outside your lock statement; from what I can deduce, it's a static field that you're not initializing inside a static constructor. You can avoid the null check entirely (which would otherwise belong inside the lock, since it's critical code!) and ensure thread safety by making the field static readonly and initializing it in the static constructor.
Further, you're not properly handling queue states. Since there's no lock during the check of the queue count, it can change on every iteration. You're also missing a lock as you're dequeuing items.
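For illustration, a sketch of the pattern just described (not a drop-in rewrite of the whole class):
static readonly Queue<string> _logQ = new Queue<string>(); // created once by the type initializer

public static void Log(string s)
{
    lock (_logQ)
        _logQ.Enqueue(s);
    // ... kick off the consumer as before
}

static void ProcessLogQ(object state)
{
    while (true)
    {
        string s;
        lock (_logQ) // one lock guards the Count check and the Dequeue together
        {
            if (_logQ.Count == 0)
                break;
            s = _logQ.Dequeue();
        }
        // append s to the file outside the lock
    }
}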
Excellent resource:
http://www.yoda.arachsys.com/csharp/threads/
For a thread-safe queue, you should use ConcurrentQueue (.NET 4 and later) instead:
https://msdn.microsoft.com/en-us/library/dd267265(v=vs.110).aspx
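With ConcurrentQueue, the count check and the dequeue collapse into one atomic TryDequeue call, so no explicit lock is needed. A minimal sketch (the log path is assumed):
static readonly ConcurrentQueue<string> _logQ = new ConcurrentQueue<string>();

// producer threads:
_logQ.Enqueue(s);

// consumer:
string item;
while (_logQ.TryDequeue(out item))
{
    File.AppendAllText(logPath, item + Environment.NewLine); // logPath is assumed
}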
While keeping in mind that:
I am using a blocking queue that waits forever until something is added to it
I might get a FileSystemWatcher event twice
The updated code:
{
    FileProcessingManager processingManager = new FileProcessingManager();
    processingManager.RegisterProcessor(new ExcelFileProcessor());
    processingManager.RegisterProcessor(new PdfFileProcessor());
    processingManager.Completed += new ProcessingCompletedHandler(ProcessingCompletedHandler);
    processingManager.Completed += new ProcessingCompletedHandler(LogFileStatus);
    while (true)
    {
        try
        {
            var jobData = (JobData)fileMonitor.FileQueue.Dequeue();
            if (jobData == null)
                break;
            _pool.WaitOne();
            Application.Log(String.Format("{0}:{1}", DateTime.Now.ToString(CultureInfo.InvariantCulture), "Thread launched"));
            Task.Factory.StartNew(() => processingManager.Process(jobData));
        }
        catch (Exception e)
        {
            Application.Log(String.Format("{0}:{1}", DateTime.Now.ToString(CultureInfo.InvariantCulture), e.Message));
        }
    }
}
What are your suggestions on making this code multi-threaded, taking into consideration the possibility that two identical string paths may be added to the blocking queue? I have left open the possibility that this might happen, and in that case the file would be processed twice. The thing is that sometimes I get it twice, sometimes not; it is really awkward. If you have suggestions on this, please tell me.
The null check is for exiting the loop; I intentionally add a null from outside the threaded loop to tell it to stop.
For multi-threading this... I would probably add a "Completed" event to your FileProcessingManager and register for it. One argument of that event will be the "bool" return value you currently have. Then in that event handler, I would do the checking of the bool and re-queueing of the file. Note that you will have to keep a reference to the FileMonitorManager. So, I would have this ThreadProc method be in a class where you keep the FileMonitorManager and FileProcessingManager instances in a property.
To deduplicate, in ThreadProc, I would create a List outside of the while loop. Then inside the while loop, before you process a file, lock that list, check to see if the string is already in there, if not, add the string to the list and process the file, if it is, then skip processing.
Obviously, this is based on little information surrounding your method but my 2 cents anyway.
Rough code, from Notepad:
private static FileMonitorManager fileMon = null;
private static FileProcessingManager processingManager = new FileProcessingManager();
private static void ThreadProc(object param)
{
    processingManager.RegisterProcessor(new ExcelFileProcessor());
    processingManager.RegisterProcessor(new PdfFileProcessor());
    processingManager.Completed += ProcessingCompletedHandler;
    var procList = new List<string>();
    while (true)
    {
        try
        {
            var path = (string)fileMon.FileQueue.Dequeue();
            if (path == null)
                break;
            bool processThis = false;
            lock (procList)
            {
                if (!procList.Contains(path))
                {
                    processThis = true;
                    procList.Add(path);
                }
            }
            if (processThis)
            {
                Thread t = new Thread(new ParameterizedThreadStart(processingManager.Process));
                t.Start(path);
            }
        }
        catch (System.Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}

private static void ProcessingCompletedHandler(bool status, string path)
{
    if (!status)
    {
        fileMon.FileQueue.Enqueue(path);
        Console.WriteLine("\n\nError on file: " + path);
    }
    else
        Console.WriteLine("\n\nSuccess on file: " + path);
}
Environment: .NET 4.0
I have a task that transforms XML files with an XSLT stylesheet; here is my code:
public string TransformFileIntoTempFile(string xsltPath, string xmlPath)
{
    var transform = new MvpXslTransform();
    transform.Load(xsltPath, new XsltSettings(true, false),
        new XmlUrlResolver());
    string tempPath = Path.GetTempFileName();
    using (var writer = new StreamWriter(tempPath))
    using (XmlReader reader = XmlReader.Create(xmlPath))
    {
        transform.Transform(new XmlInput(reader), null,
            new XmlOutput(writer));
    }
    return tempPath;
}
I have X threads that can launch this task in parallel.
Sometimes my input files are about 300 MB; sometimes only a few MB.
My problem: I get an OutOfMemoryException when my program tries to transform several big XML files at the same time.
How can I avoid these OutOfMemoryExceptions? My idea is to stop a thread before executing the task until there is enough available memory, but I don't know how to do that. Or is there some other solution (like putting my task in a distinct application)?
Thanks
I don't recommend blocking a thread. In the worst case, you'll just end up starving the task that could potentially free the memory you need, leading to deadlock or very bad performance in general.
Instead, I suggest you keep a work queue with priorities. Schedule the tasks from the queue fairly across a thread pool. Make sure no thread ever blocks on a wait operation; instead, repost the task to the queue (with a lower priority).
So what you'd do (e.g. on receiving an OutOfMemory exception) is post the same job/task onto the queue and terminate the current task, freeing up the thread for another task.
A simplistic approach is to use LIFO, which ensures that a task posted to the queue will have 'lower priority' than any other jobs already on that queue.
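A sketch of the repost-on-failure idea (names are illustrative; I've swapped in a plain FIFO queue with re-enqueue at the back, which likewise gives a reposted job lower priority than everything already waiting):
static readonly Queue<Action> _jobs = new Queue<Action>();
static readonly object _sync = new object();

static void Worker()
{
    while (true)
    {
        Action job = null;
        lock (_sync)
        {
            if (_jobs.Count > 0)
                job = _jobs.Dequeue();
        }
        if (job == null)
        {
            Thread.Sleep(100); // idle; a real scheduler would block on an event
            continue;
        }
        try
        {
            job(); // e.g. one TransformFileIntoTempFile call
        }
        catch (OutOfMemoryException)
        {
            lock (_sync)
                _jobs.Enqueue(job); // repost at the back and free this thread
        }
    }
}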
Since .NET Framework 4 we have an API for the good old memory-mapped files feature, which has been available for many years in the Win32 API, so now you can use it from managed code.
For your task the "persisted memory-mapped files" option is the better fit.
MSDN:
Persisted files are memory-mapped files that are associated with a source file on a disk. When the last process has finished working with the file, the data is saved to the source file on the disk. These memory-mapped files are suitable for working with extremely large source files.
On the documentation page for the MemoryMappedFile.CreateFromFile() method you can find a nice example describing how to create memory-mapped views for an extremely large file.
Edit: update regarding the notes in the comments
Just found the method MemoryMappedFile.CreateViewStream(), which creates a stream of type MemoryMappedViewStream, inherited from System.IO.Stream.
I believe you can create an instance of XmlReader from this stream and then instantiate your custom implementation of the XslTransform using this reader/stream.
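An untested sketch of that idea (requires the System.IO.MemoryMappedFiles namespace; assumes the transform can consume a forward-only stream):
using (var mmf = MemoryMappedFile.CreateFromFile(xmlPath, FileMode.Open))
using (var mmStream = mmf.CreateViewStream())
using (XmlReader reader = XmlReader.Create(mmStream))
using (var writer = new StreamWriter(tempPath))
{
    transform.Transform(new XmlInput(reader), null, new XmlOutput(writer));
}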
Edit 2: remi bourgarel (the OP) has already tested this approach, and it looks like this particular XslTransform implementation (I wonder whether any would) won't work with a memory-mapped view stream the way it was supposed to.
The main problem is that you are loading the entire XML file. If you were to just transform-as-you-read, the out-of-memory problem should not normally appear.
That being said, I found an MS support article which suggests how it can be done:
http://support.microsoft.com/kb/300934
Disclaimer: I did not test this, so if you use it and it works please let us know.
You could consider using a queue to throttle how many concurrent transforms are being done, based on some sort of artificial memory boundary, e.g. file size. Something like the following could be used.
This throttling strategy can be combined with a maximum number of concurrently processed files to ensure your disk is not being thrashed too much.
NB: I have not included the necessary try/catch/finally around execution to ensure that exceptions are propagated to the calling thread and wait handles are always released. I could go into further detail here.
public static class QueuedXmlTransform
{
    private const int MaxBatchSizeMB = 300;
    private const double MB = (1024 * 1024);
    private static readonly object SyncObj = new object();
    private static readonly TaskQueue Tasks = new TaskQueue();
    private static readonly Action Join = () => { };
    private static double _CurrentBatchSizeMb;

    public static string Transform(string xsltPath, string xmlPath)
    {
        string tempPath = Path.GetTempFileName();
        using (AutoResetEvent transformedEvent = new AutoResetEvent(false))
        {
            Action transformTask = () =>
            {
                MvpXslTransform transform = new MvpXslTransform();
                transform.Load(xsltPath, new XsltSettings(true, false),
                    new XmlUrlResolver());
                using (StreamWriter writer = new StreamWriter(tempPath))
                using (XmlReader reader = XmlReader.Create(xmlPath))
                {
                    transform.Transform(new XmlInput(reader), null,
                        new XmlOutput(writer));
                }
                transformedEvent.Set();
            };
            double fileSizeMb = new FileInfo(xmlPath).Length / MB;
            lock (SyncObj)
            {
                if ((_CurrentBatchSizeMb += fileSizeMb) > MaxBatchSizeMB)
                {
                    _CurrentBatchSizeMb = fileSizeMb;
                    Tasks.Queue(isParallel: false, task: Join);
                }
                Tasks.Queue(isParallel: true, task: transformTask);
            }
            transformedEvent.WaitOne();
        }
        return tempPath;
    }

    private class TaskQueue
    {
        private readonly object _syncObj = new object();
        private readonly Queue<QTask> _tasks = new Queue<QTask>();
        private int _runningTaskCount;

        public void Queue(bool isParallel, Action task)
        {
            lock (_syncObj)
            {
                _tasks.Enqueue(new QTask { IsParallel = isParallel, Task = task });
            }
            ProcessTaskQueue();
        }

        private void ProcessTaskQueue()
        {
            lock (_syncObj)
            {
                if (_runningTaskCount != 0) return;
                while (_tasks.Count > 0 && _tasks.Peek().IsParallel)
                {
                    QTask parallelTask = _tasks.Dequeue();
                    QueueUserWorkItem(parallelTask);
                }
                if (_tasks.Count > 0 && _runningTaskCount == 0)
                {
                    QTask serialTask = _tasks.Dequeue();
                    QueueUserWorkItem(serialTask);
                }
            }
        }

        private void QueueUserWorkItem(QTask qTask)
        {
            Action completionTask = () =>
            {
                qTask.Task();
                OnTaskCompleted();
            };
            _runningTaskCount++;
            ThreadPool.QueueUserWorkItem(_ => completionTask());
        }

        private void OnTaskCompleted()
        {
            lock (_syncObj)
            {
                if (--_runningTaskCount == 0)
                {
                    ProcessTaskQueue();
                }
            }
        }

        private class QTask
        {
            public Action Task { get; set; }
            public bool IsParallel { get; set; }
        }
    }
}
Update
Fixed a bug in maintaining the batch size when rolling over to the next batch window:
_CurrentBatchSizeMb = fileSizeMb;
I've made my Logger, which logs a string, a static class with a static method, so I can call it from my entire project without having to make an instance of it.
Quite nice, but I want to make it run in a separate thread, since accessing the file costs time. Is that possible somehow, and what's the best way to do it?
It's a bit of a short description, but I hope the idea is clear. If not, please let me know.
Thanks in advance!
By the way, any other improvements to my code are welcome as well; I have the feeling not everything is as efficient as it can be:
internal static class MainLogger
{
    internal static void LogStringToFile(string logText)
    {
        DateTime timestamp = DateTime.Now;
        string str = timestamp.ToString("dd-MM-yy HH:mm:ss ", CultureInfo.InvariantCulture) + "\t" + logText + "\n";
        const string filename = Constants.LOG_FILENAME;
        FileInfo fileInfo = new FileInfo(filename);
        if (fileInfo.Exists)
        {
            if (fileInfo.Length > Constants.LOG_FILESIZE)
            {
                File.Create(filename).Dispose();
            }
        }
        else
        {
            File.Create(filename).Dispose();
        }
        int i = 0;
        while (true)
        {
            try
            {
                using (StreamWriter writer = File.AppendText(filename))
                {
                    writer.WriteLine(str);
                }
                break;
            }
            catch (IOException)
            {
                Thread.Sleep(10);
                i++;
                if (i >= 8)
                {
                    throw new IOException("Log file \"" + Constants.LOG_FILENAME + "\" not accessible after 8 tries");
                }
            }
        }
    }
}
If you're doing this as an exercise (i.e. just using a ready-made logger isn't an option), you could try a producer/consumer system.
Either make an Init function for your logger, or use the static constructor - inside it, launch a new System.Threading.Thread, which just runs through a while(true) loop.
Create a new Queue<string> and have your logging function enqueue onto it.
Your while(true) loop looks for items on the queue, dequeues them, and logs them.
Make sure you lock your queue before doing anything with it on either thread.
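Putting those steps together, a minimal sketch (reusing the MainLogger name from the question; Monitor.Wait/Pulse avoids busy-waiting on the queue):
internal static class MainLogger
{
    static readonly Queue<string> _queue = new Queue<string>();

    static MainLogger() // the static constructor launches the consumer thread once
    {
        var t = new Thread(Consume) { IsBackground = true };
        t.Start();
    }

    internal static void LogStringToFile(string logText)
    {
        lock (_queue)
        {
            _queue.Enqueue(logText);
            Monitor.Pulse(_queue); // wake the consumer
        }
    }

    static void Consume()
    {
        while (true)
        {
            string line;
            lock (_queue)
            {
                while (_queue.Count == 0)
                    Monitor.Wait(_queue); // releases the lock while waiting
                line = _queue.Dequeue();
            }
            // write outside the lock so producers are never blocked by file I/O
            File.AppendAllText(Constants.LOG_FILENAME, line + Environment.NewLine);
        }
    }
}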
Sorry, but don't reinvent the wheel: choose log4net (or any other (enterprise) logging engine) as your logger!
OK, simply put, you need to create a thread-safe static class. Below are some code snippets: a delegate that you call from any thread, which routes to the correct thread and then invokes the WriteToFile function.
When you start the application that you want to log against, pass it the following, where LogFile is the filename and path of your log file.
Log.OnNewLogEntry += Log.WriteToFile (LogFile, Program.AppName);
Then you want to put this inside your static Logging class. The wizard bit is the ThreadSafeAddEntry function; it makes sure you are in the correct thread before writing the entry away.
public delegate void AddEntryDelegate(string entry, bool error);

public static Form mainwin;

public static event AddEntryDelegate OnNewLogEntry;

public static void AddEntry(string entry)
{
    ThreadSafeAddEntry(entry, false);
}

private static void ThreadSafeAddEntry(string entry, bool error)
{
    try
    {
        if (mainwin != null && mainwin.InvokeRequired) // we are in a different thread to the main window
            mainwin.Invoke(new AddEntryDelegate(ThreadSafeAddEntry), new object[] { entry, error }); // call self from main thread
        else
            OnNewLogEntry(entry, error);
    }
    catch { }
}

public static AddEntryDelegate WriteToFile(string filename, string appName)
{
    // Do your WriteToFile work here
}
And finally to write a line...
Log.AddEntry ("Hello World!");
What you have in this case is a typical producer consumer scenario - many threads produce log entries and one thread writes them out to a file. The MSDN has an article with sample code for this scenario.
For starters, your logging mechanism should generally avoid throwing exceptions. Logging is frequently where errors get written to, so things get ugly when the logger itself starts erroring.
I would look into the BackgroundWorker class, as it allows you to fork off threads that can do the logging for you. That way your app isn't slowed down, and any exceptions raised are simply ignored.
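A minimal sketch of that suggestion (the log path is assumed; for heavy logging you would reuse one worker or queue the entries rather than spawn a worker per message):
var worker = new BackgroundWorker();
worker.DoWork += (sender, e) =>
    File.AppendAllText("app.log", (string)e.Argument + Environment.NewLine);
// any exception ends up in e.Error of RunWorkerCompleted; left unhandled
// here, which matches the "simply ignored" behavior described above
worker.RunWorkerAsync(logText);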