Multiple users writing to the same file - C#

I have a Web API project that is accessed by a very large number of users. When a user does something from the frontend (an HTML5 web page), such as updating or retrieving data, the backend (the Web API) writes to a single log file (a .log file whose content is JSON).
The problem is that when many users access it at once, the frontend becomes unresponsive (it keeps loading). The bottleneck is the writing of the log file, since that single file is written on behalf of all those users. I have heard that a multithreading technique can solve this, but I don't know which one. Can anyone help, please?
Here is my code (sorry for any typos; I am on my smartphone, using the mobile version of Stack Overflow):
public static void JsonInputLogging<T>(T m, string methodName)
{
    MemoryStream ms = new MemoryStream();
    DataContractJsonSerializer ser = new DataContractJsonSerializer(typeof(T));
    ser.WriteObject(ms, m);
    string jsonString = Encoding.UTF8.GetString(ms.ToArray());
    ms.Close();
    logging("MethodName: " + methodName + Environment.NewLine + jsonString.ToString());
}
public static void logging(string message)
{
    string pathLogFile = @"D:\jsoninput.log"; // verbatim string, so the backslash is not an escape
    FileInfo jsonInputFile = new FileInfo(pathLogFile);
    if (File.Exists(jsonInputFile.ToString()))
    {
        long fileLength = jsonInputFile.Length;
        if (fileLength > 1000000)
        {
            File.Move(pathLogFile, pathLogFile.Replace(*some new path*));
        }
    }
    File.AppendAllText(pathLogFile, *some text*);
}

You have to understand some internals here first. For each [x] users, ASP.Net will use a single worker process. One worker process holds multiple threads. If you're using multiple instances on the cloud, it's even worse because then you also have multiple server instances (I assume this ain't the case).
A few problems here:
You have multiple users and therefore multiple threads.
Multiple threads can deadlock each other writing the files.
You have multiple appdomains and therefore multiple processes.
Multiple processes can lock out each other
Opening and locking files
File.Open has a few flags for locking. You can lock a file exclusively per process, which is a good idea in this case. A two-step approach with Exists followed by Open won't help, because another worker process might do something in between. Basically, the idea is to call Open with write-exclusive access and, if that fails, to try again with another filename.
This basically solves the issue with multiple processes.
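A minimal sketch of the open-exclusively-and-fall-back idea described above. The class name, the method name and the numbered-suffix naming scheme are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;
using System.IO;

public static class ExclusiveLogFile
{
    // Hypothetical helper: tries basePath first, then basePath.1, basePath.2, ...
    // until one of them can be opened with exclusive access.
    public static FileStream OpenForAppend(string basePath, int maxAttempts)
    {
        for (int i = 0; i < maxAttempts; i++)
        {
            string candidate = i == 0 ? basePath : basePath + "." + i;
            try
            {
                // FileShare.None: nobody else may open the file while we hold it.
                return new FileStream(candidate, FileMode.Append, FileAccess.Write, FileShare.None);
            }
            catch (IOException)
            {
                // Most likely held by another worker process: try the next name.
            }
        }
        throw new IOException("No log file could be opened exclusively.");
    }
}
```

Each worker process would call OpenForAppend once at startup and keep the returned stream open, so that it owns its file for its whole lifetime.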
Writing from multiple threads
File access is single threaded. Instead of writing your stuff to a file, you might want to use a separate thread to do the file access, and multiple threads that tell the thing to write.
If you have more log requests than you can handle, you're in the wrong zone either way. In that case, the best way to handle it for logging IMO is to simply drop the data. In other words, make the logger somewhat lossy to make life better for your users. You can use the queue for that as well.
I usually use a ConcurrentQueue for this and a separate thread that works away all the logged data.
This is basically how to do this:
// Starts the worker thread that gets rid of the queue:
internal void Start()
{
loggingWorker = new Thread(LogHandler)
{
Name = "Logging worker thread",
IsBackground = true,
Priority = ThreadPriority.BelowNormal
};
loggingWorker.Start();
}
We also need something to do the actual work and some variables that are shared:
private Thread loggingWorker = null;
private int loggingWorkerState = 0;
private ManualResetEventSlim waiter = new ManualResetEventSlim();
private ConcurrentQueue<Tuple<LogMessageHandler, string>> queue =
new ConcurrentQueue<Tuple<LogMessageHandler, string>>();
private void LogHandler(object o)
{
Interlocked.Exchange(ref loggingWorkerState, 1);
while (Interlocked.CompareExchange(ref loggingWorkerState, 1, 1) == 1)
{
waiter.Wait(TimeSpan.FromSeconds(10.0));
waiter.Reset();
Tuple<LogMessageHandler, string> item;
while (queue.TryDequeue(out item))
{
writeToFile(item.Item1, item.Item2);
}
}
}
Basically this code enables you to work away all the items from a single thread using a queue that's shared across threads. Note that ConcurrentQueue doesn't use locks for TryDequeue, so clients won't feel any pain because of this.
Last thing that's needed is to add stuff to the queue. That's the easy part:
public void Add(LogMessageHandler l, string msg)
{
if (queue.Count < MaxLogQueueSize)
{
queue.Enqueue(new Tuple<LogMessageHandler, string>(l, msg));
waiter.Set();
}
}
This code will be called from multiple threads. It's not 100% correct because Count and Enqueue don't necessarily have to be called in a consistent way - but for our intents and purposes it's good enough. It also doesn't lock in the Enqueue and the waiter will ensure that the stuff is removed by the other thread.
Wrap all this in a singleton pattern, add some more logic to it, and your problem should be solved.

That can be problematic, since every client request is handled by a new thread by default anyway. You need some "root" object that is known across the project (I don't think you can achieve this with a static class), so that you can lock on it before you access the log file. However, note that this will basically serialize the requests, and it will probably have a very bad effect on performance.

No, multi-threading does not solve your problem. How are multiple threads supposed to write to the same file at the same time? You would need to care about data consistency and I don't think that's the actual problem here.
What you are looking for is asynchronous programming. The reason your GUI becomes unresponsive is that it waits for the tasks to complete. If you know that the logger is your bottleneck, then use async to your advantage: fire the log method and forget about the outcome, just write the file.
Actually I don't really think your logger is the problem. Are you sure there is no other logic which blocks you?

Related

Prevent multiple threads writing the same file

I am designing and developing an API where multiple threads download files from the net and then write them to disk.
If it is used incorrectly, the same file could be downloaded and written by more than one thread, which will lead to an exception at the moment of writing to disk.
I would like to avoid this problem with a lock() { ... } around the part that writes the file, but obviously I don't want to lock on a global object, just on something related to that specific file, so that not all threads are blocked while one file is written.
I hope this question is understandable.
So what you want to be able to do is synchronize a bunch of actions based on a given key. In this case, that key can be an absolute file name. We can implement this as a dictionary that maps a key to some synchronization object. This could be either an object to lock on, if we want to implement a blocking synchronization mechanism, or a Task, if we want to represent an asynchronous method of running the code when appropriate; I went with the latter. I also went with a ConcurrentDictionary to let it handle the synchronization, rather than handling it manually, and used Lazy to ensure that each task was created exactly once:
public class KeyedSynchronizer<TKey>
{
private ConcurrentDictionary<TKey, Lazy<Task>> dictionary;
public KeyedSynchronizer(IEqualityComparer<TKey> comparer = null)
{
dictionary = new ConcurrentDictionary<TKey, Lazy<Task>>(
comparer ?? EqualityComparer<TKey>.Default);
}
public Task ActOnKey(TKey key, Action action)
{
var dictionaryValue = dictionary.AddOrUpdate(key,
new Lazy<Task>(() => Task.Run(action)),
(_, task) => new Lazy<Task>(() =>
task.Value.ContinueWith(t => action())));
return dictionaryValue.Value;
}
public static readonly KeyedSynchronizer<TKey> Default =
new KeyedSynchronizer<TKey>();
}
You can now create an instance of this synchronizer, and then specify actions along with the keys (files) that they correspond to. You can be confident that the actions won't be executed until any previous actions on that file have completed. If you want to wait until an action completes, you can Wait on the task; if you don't need to wait, you can simply ignore it. This also allows you to do your processing asynchronously by awaiting the task.
You may consider using ReaderWriterLockSlim
http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlockslim.aspx
private ReaderWriterLockSlim fileLock = new ReaderWriterLockSlim();
fileLock.EnterWriteLock();
try
{
//write your file here
}
finally
{
fileLock.ExitWriteLock();
}
I had a similar situation, and resolved it by lock()ing on the StreamWriter object in question:
private Dictionary<string, StreamWriter> _writers; // Consider using a thread-safe dictionary
void WriteContent(string file, string content)
{
StreamWriter writer;
if (_writers.TryGetValue(file, out writer))
lock (writer)
writer.Write(content);
// Else handle missing writer
}
That's from memory, it may not compile. I'd read up on Andrew's solution (I will be), as it may be more exactly what you need... but this is super-simple, if you just want a quick-and-dirty.
I'll make it an answer with some explanation.
Windows already has something like what you want. The idea behind it is simple: allow multiple processes to access the same file and carry out all read/write operations so that 1) all processes operate on the most recent data of that file, and 2) multiple writes or reads occur without waiting (if possible).
It's called Memory-Mapped Files. I was using it for IPC mostly (without file), so can't provide an example, but there should be some.
You could mimic MMF behavior by using some buffer and sort of layer on top of it, which will redirect all reading/writing operations to that buffer and periodically flush updated content into physical file.
P.S: try to look also for file-sharing (open file for shared reading/writing).

Multi processes read&write one file

I have a text file ABC.txt which will be read and written by multiple processes. While one process is reading from or writing to ABC.txt, the file must be locked so that no other process can read from or write to it. I know the enum System.IO.FileShare may be the right way to handle this problem, but I used another way, and I'm not sure whether it is right. The following is my solution.
I added another file Lock.txt to the folder. Before I can read from or write to file ABC.txt, I must have the capability to read from file Lock.txt. And after I have read from or written to file ABC.txt, I have to release that capability. The following is the code.
#region Enter the lock
FileStream lockFileStream = null;
bool lockEntered = false;
while (lockEntered == false)
{
try
{
lockFileStream = File.Open("Lock.txt", FileMode.Open, FileAccess.Read, FileShare.None);
lockEntered = true;
}
catch (Exception)
{
Thread.Sleep(500);
}
}
#endregion
#region Do the work
// Read from or write to File ABC.txt
// Read from or write to other files
#endregion
#region Release the lock
try
{
if (lockFileStream != null)
{
lockFileStream.Dispose();
}
}
catch
{
}
#endregion
On my computer, it seems that this solution works well, but I still can not make sure if it is appropriate..
Edit: Multi processes, not multi threads in the same process.
C#'s named EventWaitHandle is the way to go here. Create an instance of wait handle in every process which wants to use that file and give it a name which is shared by all such processes.
EventWaitHandle waitHandle = new EventWaitHandle(true, EventResetMode.AutoReset, "SHARED_BY_ALL_PROCESSES");
Then when accessing the file wait on waitHandle and when finished processing file, set it so the next process in the queue may access it.
waitHandle.WaitOne();
/* process file*/
waitHandle.Set();
When you name an event wait handle then that name is shared across all processes in the operating system. Therefore in order to avoid possibility of collisions, use a guid for name ("SHARED_BY_ALL_PROCESSES" above).
A mutex in C# may be shared across multiple processes. Here is an example for multiple processes writing to a single file:
using (var mutex = new Mutex(false, "Strand www.jakemdrew.com"))
{
    mutex.WaitOne();
    try
    {
        File.AppendAllText(outputFilePath, theFileText);
    }
    finally
    {
        mutex.ReleaseMutex(); // release even if the write throws
    }
}
You need to make sure that the mutex is given a unique name that will be shared across the entire system.
Additional reading here:
http://www.albahari.com/threading/part2.aspx#_Mutex
Your solution is error prone. You've basically implemented double-checked locking (http://en.wikipedia.org/wiki/Double-checked_locking) which can be very unsafe.
A better solution would be one of two things. Either introduce thread isolation, whereby only one thread ever accesses the file and does so by reading from a queue upon which requests to read or write are placed by other threads (and of course the queue is protected by mutually exclusive access by threads). Or have the threads synchronize themselves, either with synchronization devices (lock sections, mutexes, whatever) or with some other file access logic (for example, System.IO.FileShare came up in a few responses here).
If it was me, I would install something like SQL Server Compact Edition for reading/writing this data.
However, if you want to be able to lock access to a resource that is shared between multiple processes, you need to use a Mutex or a Semaphore.
The Mutex class is a .Net wrapper around an OS Level locking mechanism.
Overview of Synchronization Primitives

Writing data to a buffer and reading data from a buffer

I am currently looking into writing a program in c# which deals with managing log files. The purpose is the log file shouldn't exceed 50mb so I am renaming the file and creating a new and then it should start writing to the new file. To avoid data being written to the log while the files are changed I was thinking that in one part of the program I add the data to a buffer, then in another part of the program it reads the data in the buffer and outputs it to the file. If it can't write to the file it keeps it in the buffer until it can write to the file.
How would I go about doing this? I can't seem to find anything on Google, and I'm not sure if I'm searching for the correct thing.
Thanks for any help you can provide
This sounds like a classic producer-consumer problem. This is a good starting point:
Blocking Collection
As others have pointed out, using a library is better than re-inventing the wheel. Log4net is a good logging library.
I would suggest using a BlockingCollection. Threads that want to write to the log just enqueue the log string to the BlockingCollection. A separate thread monitors the BlockingCollection, de-queuing strings and writing them to the file. If the file isn't available, the thread can wait and try again.
See http://www.informit.com/guides/content.aspx?g=dotnet&seqNum=821 for some simple examples.
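As a rough sketch of that BlockingCollection approach: the class name QueuedFileLogger, the capacity of 10000 and the choice of File.AppendAllText are assumptions made here for illustration, not details from the answer.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

public sealed class QueuedFileLogger : IDisposable
{
    // Bounded queue; the capacity of 10000 is an arbitrary illustrative value.
    private readonly BlockingCollection<string> queue = new BlockingCollection<string>(10000);
    private readonly Thread worker;
    private readonly string path;

    public QueuedFileLogger(string path)
    {
        this.path = path;
        worker = new Thread(Consume) { IsBackground = true, Name = "Log writer" };
        worker.Start();
    }

    // Safe to call from any thread. Returns false instead of blocking when the queue is full.
    public bool Log(string line)
    {
        return queue.TryAdd(line);
    }

    private void Consume()
    {
        // Blocks while the queue is empty; the loop ends after CompleteAdding once the queue is drained.
        foreach (string line in queue.GetConsumingEnumerable())
        {
            File.AppendAllText(path, line + Environment.NewLine);
        }
    }

    public void Dispose()
    {
        queue.CompleteAdding(); // no more items will be accepted
        worker.Join();          // wait until everything queued so far has been written
        queue.Dispose();
    }
}
```

Only the worker thread ever touches the file, so no file-level locking is needed inside one process. Retrying when the file is temporarily unavailable, as the answer suggests, is left out to keep the sketch short.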
If you can't use BlockingCollection, you can use a Queue and protect it with a lock. Your method to log something becomes:
private Queue<string> myQueue = new Queue<string>(); // owned by the logger
void WriteLog(string s)
{
lock (myQueue)
{
myQueue.Enqueue(s);
}
}
The thread that removes things can get them from the queue. It'll be a little less than ideal because it'll have to poll periodically, but it shouldn't be too bad:
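while (!shutdown) // do this until somebody shuts down the program
{
    while (true)
    {
        string s;
        lock (myQueue)
        {
            // check Count under the same lock that guards Enqueue and Dequeue
            if (myQueue.Count == 0)
                break;
            s = myQueue.Dequeue();
        }
        // write the string s to the log file (outside the lock, so writers are not blocked)
    }
    Thread.Sleep(1000); // sleep for a second and do it again.
}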
There are ways to do that without the busy waits, but I don't remember the implementation details. See http://msdn.microsoft.com/en-us/library/system.threading.monitor.pulse.aspx for a sample.
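One way to avoid the polling is Monitor.Wait and Monitor.Pulse. The following is only a sketch under assumptions: the SignalingQueue class and its Close method are made up for this example and do not come from the linked MSDN sample.

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class SignalingQueue
{
    private readonly Queue<string> items = new Queue<string>();
    private bool closed;

    public void Enqueue(string s)
    {
        lock (items)
        {
            items.Enqueue(s);
            Monitor.Pulse(items); // wake one waiting consumer
        }
    }

    public void Close()
    {
        lock (items)
        {
            closed = true;
            Monitor.PulseAll(items); // wake every waiter so it can observe the shutdown
        }
    }

    // Blocks until an item is available. Returns false once the queue is closed and empty.
    public bool TryDequeue(out string s)
    {
        lock (items)
        {
            while (items.Count == 0 && !closed)
                Monitor.Wait(items); // releases the lock while waiting, reacquires it on wake-up
            if (items.Count == 0)
            {
                s = null;
                return false;
            }
            s = items.Dequeue();
            return true;
        }
    }
}
```

The logger thread would then loop on TryDequeue and write each string to the file, sleeping inside Monitor.Wait instead of waking up once per second.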

How to Lock a file and avoid readings while it's writing

My web application returns a file from the filesystem. These files are dynamic, so I have no way to know their names or how many of them there will be. When a file doesn't exist, the application creates it from the database. I want to prevent two different threads from recreating the same file at the same time, and to prevent a thread from returning the file while another thread is still creating it.
Also, I don't want to take a lock on an element that is common to all the files. Therefore I should lock the file only while I'm creating it.
So I want to lock a file until its recreation is complete; if another thread tries to access it, that thread will have to wait for the file to be unlocked.
I've been reading about FileStream.Lock, but I would have to know the file length and it won't prevent another thread from trying to read the file, so it doesn't work for my particular case.
I've also been reading about FileShare.None, but it will throw an exception (which exception type?) if another thread or process tries to access the file, so I would have to implement a "try again while it is failing" loop. I'd like to avoid generating exceptions, and I don't like that approach much, although maybe there is no better way.
The approach with FileShare.None would be this more or less:
static void Main(string[] args)
{
new Thread(new ThreadStart(WriteFile)).Start();
Thread.Sleep(1000);
new Thread(new ThreadStart(ReadFile)).Start();
Console.ReadKey(true);
}
static void WriteFile()
{
using (FileStream fs = new FileStream("lala.txt", FileMode.Create, FileAccess.Write, FileShare.None))
using (StreamWriter sw = new StreamWriter(fs))
{
Thread.Sleep(3000);
sw.WriteLine("trolololoooooooooo lolololo");
}
}
static void ReadFile()
{
Boolean readed = false;
Int32 maxTries = 5;
while (!readed && maxTries > 0)
{
try
{
Console.WriteLine("Reading...");
using (FileStream fs = new FileStream("lala.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
using (StreamReader sr = new StreamReader(fs))
{
while (!sr.EndOfStream)
Console.WriteLine(sr.ReadToEnd());
}
readed = true;
Console.WriteLine("Readed");
}
catch (IOException)
{
Console.WriteLine("Fail: " + maxTries.ToString());
maxTries--;
Thread.Sleep(1000);
}
}
}
But I don't like the fact that I have to catch exceptions, try several times and wait an inaccurate amount of time :|
You can handle this by using the FileMode.CreateNew argument to the stream constructor. One of the threads is going to lose and find out that the file was already created a microsecond earlier by another thread. And will get an IOException.
It will then need to spin, waiting for the file to be fully created. Which you enforce with FileShare.None. Catching exceptions here doesn't matter, it is spinning anyway. There's no other workaround for it anyway unless you P/Invoke.
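A sketch of the CreateNew idea, assuming a hypothetical helper named GetOrCreate and an arbitrary retry limit of 100 attempts; neither comes from the answer.

```csharp
using System;
using System.IO;
using System.Threading;

public static class CreateOnce
{
    // Hypothetical helper: the first caller creates the file, every other caller waits and reads it.
    public static string GetOrCreate(string path, Func<string> produceContent)
    {
        try
        {
            // CreateNew throws an IOException if the file already exists.
            // FileShare.None keeps readers out until the writer is disposed.
            using (var fs = new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None))
            using (var sw = new StreamWriter(fs))
            {
                string content = produceContent();
                sw.Write(content);
                return content;
            }
        }
        catch (IOException)
        {
            // Lost the race, or the file existed already: fall through and read it.
        }

        for (int attempt = 0; attempt < 100; attempt++)
        {
            try
            {
                using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
                using (var sr = new StreamReader(fs))
                {
                    return sr.ReadToEnd();
                }
            }
            catch (IOException)
            {
                Thread.Sleep(50); // probably still being written: wait and retry
            }
        }
        throw new IOException("The file did not become readable in time.");
    }
}
```

Note that if produceContent throws, an empty file is left behind, so a real implementation would probably delete it or write to a temporary name first.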
I think the right approach would be the following:
create a set of strings where you save the names of the files currently being processed,
so that only one thread processes a given file at a time, something like this:
// somewhere in your code, or put it in a singleton
static System.Collections.Generic.HashSet<String> filesAlreadyProcessed = new System.Collections.Generic.HashSet<String>();
// thread main method code
bool fileAlreadyProcessed = false;
lock (filesAlreadyProcessed)
{
    if (filesAlreadyProcessed.Contains(filename))
    {
        fileAlreadyProcessed = true;
    }
    else
    {
        filesAlreadyProcessed.Add(filename);
    }
}
if (!fileAlreadyProcessed)
{
    // ProcessFile
}
Do you have a way to identify what files are being created?
Say every one of those files corresponds to a unique ID in your database. You create a centralised location (Singleton?), where these IDs can be associated with something lockable (Dictionary). A thread that needs to read/write to one of those files does the following:
//Request access
ReaderWriterLockSlim fileLock = null;
bool needCreate = false;
lock(Coordination.Instance)
{
if(Coordination.Instance.ContainsKey(theId))
{
fileLock = Coordination.Instance[theId];
}
else if(!fileExists(theId)) //check if the file exists at this moment
{
Coordination.Instance[theId] = fileLock = new ReaderWriterLockSlim();
fileLock.EnterWriteLock(); //give no other thread the chance to get into write mode
needCreate = true;
}
else
{
//The file exists, and whoever created it, is done with writing. No need to synchronize in this case.
}
}
if(needCreate)
{
createFile(theId); //Writes the file from the database
lock(Coordination.Instance)
Coordination.Instance.Remove(theId);
fileLock.ExitWriteLock();
fileLock = null;
}
if(fileLock != null)
fileLock.EnterReadLock();
//read your data from the file
if(fileLock != null)
fileLock.ExitReadLock();
Of course, threads that don't follow this exact locking protocol will have access to the file.
Now, locking over a Singleton object is certainly not ideal, but if your application needs global synchronization then this is a way to achieve it.
Your question really got me thinking.
Instead of having every thread responsible for file access and having them block, what if you used a queue of files that need to be persisted and have a single background worker thread dequeue and persist?
While the background worker is cranking away, you can have the web application threads return the db values until the file does actually exist.
I've posted a very simple example of this on GitHub.
Feel free to give it a shot and let me know what you think.
FYI, if you don't have git, you can use svn to pull it http://svn.github.com/statianzo/MultiThreadFileAccessWebApp
The question is old and there is already a marked answer. Nevertheless I would like to post a simpler alternative.
I think we can directly use the lock statement on the filename, as follows:
lock(string.Intern("FileLock:absoluteFilePath.txt"))
{
// your code here
}
Generally, locking a string is a bad idea because of String Interning. But in this particular case it should ensure that no one else is able to access that lock. Just use the same lock string before attempting to read. Here interning works for us and not against.
PS: The text 'FileLock' is just some arbitrary text to ensure that other string file paths are not affected.
Why aren't you just using the database - e.g. if you have a way to associate a filename with the data from the db it contains, just add some information to the db that specifies whether a file exists with that information currently and when it was created, how stale the information in the file is etc. When a thread needs some information, it checks the db to see if that file exists and if not, it writes out a row to the table saying it's creating the file. When it's done it updates that row with a boolean saying the file is ready to be used by others.
the nice thing about this approach - all your information is in 1 place - so you can do nice error recovery - e.g. if the thread creating the file dies badly for some reason, another thread can come along and decide to rewrite the file because the creation time is too old. You can also create simple batch cleanup processes and get accurate data on how frequently certain data is being used for a file, how often information is updated (by looking at the creation times etc). Also, you avoid having to do many many disk seeks across your filesystem as different threads look for different files all over the place - especially if you decide to have multiple front-end machines seeking across a common disk.
The tricky thing - you'll have to make sure your db supports row-level locking on the table that threads write to when they create files because otherwise the table itself may be locked which could make this unacceptably slow.

Issue writing to single file in Web service in .NET

I have created a webservice in .net 2.0, C#. I need to log some information to a file whenever different methods are called by the web service clients.
The problem comes when one user process is writing to a file and another process tries to write to it. I get the following error:
The process cannot access the file because it is being used by another process.
The solutions that I have tried to implement in C# and failed are as below.
Implemented singleton class that contains code that writes to a file.
Used lock statement to wrap the code that writes to the file.
I have also tried to use open source logger log4net but it also is not a perfect solution.
I know about logging to system event logger, but I do not have that choice.
I want to know if there exists a perfect and complete solution to such a problem?
The locking is probably failing because your webservice is being run by more than one worker process.
You could protect the access with a named mutex, which is shared across processes, unlike the locks you get by using lock(someobject) {...}:
Mutex mutex = new Mutex(false, "mymutex"); // (initiallyOwned, name); "lock" is a reserved word in C#
mutex.WaitOne();
// access file
mutex.ReleaseMutex();
You don't say how your web service is hosted, so I'll assume it's in IIS. I don't think the file should be accessed by multiple processes unless your service runs in multiple application pools. Nevertheless, I guess you could get this error when multiple threads in one process are trying to write.
I think I'd go for the solution you suggest yourself, Pradeep, build a single object that does all the writing to the log file. Inside that object I'd have a Queue into which all data to be logged gets written. I'd have a separate thread reading from this queue and writing to the log file. In a thread-pooled hosting environment like IIS, it doesn't seem too nice to create another thread, but it's only one... Bear in mind that the in-memory queue will not survive IIS resets; you might lose some entries that are "in-flight" when the IIS process goes down.
Other alternatives certainly include using a separate process (such as a Service) to write to the file, but that has extra deployment overhead and IPC costs. If that doesn't work for you, go with the singleton.
Maybe write a "queue line" of sorts for writing to the file, so when you try to write to the file it keeps checking to see if the file is locked, if it is - it keeps waiting, if it isn't locked - then write to it.
You could push the results onto an MSMQ Queue and have a windows service pick the items off of the queue and log them. It's a little heavy, but it should work.
Joel and charles. That was quick! :)
Joel: When you say "queue line" do you mean creating a separate thread that runs in a loop to keep checking the queue as well as write to a file when it is not locked?
Charles: I know about MSMQ and windows service combination, but like I said I have no choice other than writing to a file from within the web service :)
thanks
pradeep_tp
The trouble with all the approaches tried so far is that multiple threads can enter the code.
That is, multiple threads try to acquire and use the file handle, hence the errors. You need a single thread, outside of the worker threads, to do the work, with a single file handle held open.
Probably the easiest thing to do would be to create a thread during application start in Global.asax and have it listen to a synchronized in-memory queue (System.Collections.Generic.Queue<T>). Have the thread open and own the lifetime of the file handle; only that thread can write to the file.
Client requests in ASP will lock the queue momentarily, push the new logging message onto the queue, then unlock.
The logger thread will poll the queue periodically for new messages - when messages arrive on the queue, the thread will read and dispatch the data in to the file.
To show what I am trying to do in my code, the following is the singleton class I have implemented in C#:
public sealed class FileWriteTest
{
private static volatile FileWriteTest instance;
private static object syncRoot = new Object();
private static Queue logMessages = new Queue();
private static ErrorLogger oNetLogger = new ErrorLogger();
private FileWriteTest() { }
public static FileWriteTest Instance
{
get
{
if (instance == null)
{
lock (syncRoot)
{
if (instance == null)
{
instance = new FileWriteTest();
Thread MyThread = new Thread(new ThreadStart(StartCollectingLogs));
MyThread.Start();
}
}
}
return instance;
}
}
private static void StartCollectingLogs()
{
//Infinite loop
while (true)
{
cdoLogMessage objMessage = new cdoLogMessage();
if (logMessages.Count != 0)
{
objMessage = (cdoLogMessage)logMessages.Dequeue();
oNetLogger.WriteLog(objMessage.LogText, objMessage.SeverityLevel);
}
}
}
public void WriteLog(string logText, SeverityLevel errorSeverity)
{
cdoLogMessage objMessage = new cdoLogMessage();
objMessage.LogText = logText;
objMessage.SeverityLevel = errorSeverity;
logMessages.Enqueue(objMessage);
}
}
When I run this code in debug mode (simulates just one user access), I get the error "stack overflow" at the line where queue is dequeued.
Note: In the above code ErrorLogger is a class that has code to write to the File. objMessage is an entity class to carry the log message.
Alternatively, you might want to do error logging into the database (if you're using one)
Koth,
I have implemented the Mutex lock, which has removed the "stack overflow" error. I still have to do load testing before I can conclude whether it works in all cases.
I was reading about Mutex objects on one of the websites, which says that a Mutex affects performance. I want to know one thing about locking through a Mutex.
Suppose user Process1 is writing to a file and at the same time user Process2 tries to write to the same file. Since Process1 has put a lock on the code block, will Process2 keep trying, or will it just die after the first attempt?
thanks
pradeep_tp
It will wait until the mutex is released....
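To illustrate the waiting behaviour: WaitOne() with no argument blocks until the mutex is free, while the overload that takes a timeout gives up and returns false. The sketch below uses the timeout overload; the class name, the method name and the mutex name "MyWebServiceLogMutex" are made up for this example.

```csharp
using System;
using System.IO;
using System.Threading;

public static class MutexLog
{
    // Appends text under a named mutex. Returns false if the mutex could not be
    // acquired within the timeout. Every process must use the same mutex name.
    public static bool TryAppend(string path, string text, TimeSpan timeout)
    {
        using (var mutex = new Mutex(false, "MyWebServiceLogMutex"))
        {
            bool owned = false;
            try
            {
                try
                {
                    owned = mutex.WaitOne(timeout); // false if still held by someone else
                }
                catch (AbandonedMutexException)
                {
                    owned = true; // the previous owner exited without releasing; we own it now
                }
                if (!owned)
                    return false;
                File.AppendAllText(path, text);
                return true;
            }
            finally
            {
                if (owned)
                    mutex.ReleaseMutex();
            }
        }
    }
}
```

With a generous timeout this behaves like the plain WaitOne() call, except that a stuck writer cannot block the others forever.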
"Joel: When you say "queue line" do you mean creating a separate thread that runs in a loop to keep checking the queue as well as write to a file when it is not locked?"
Yeah, that's basically what I was thinking. Have another thread that has a while loop until it can get access to the file and save, then end.
But you would have to do it in a way where the first thread to start looking gets access first. Which is why I say queue.
