I have a web service that is generating random errors and I think I've found the fault.
Basically, it reads a config file as follows, then loads it into an XmlDocument:
var config = File.ReadAllText(filename);
xmlDoc.LoadXml(config); // LoadXml, since config holds the XML text itself (Load expects a path, stream, or reader)
However, at a later time (maybe a second later) one of the config values is updated and the file is saved:
xmlDoc.Save(filename);
So I'm now experiencing more errors during the first READ operation (unfortunately the original developer added an empty catch block, so I can't inspect the exception just now). I think it's because the read happens just as another process spawned from IIS is at the .Save part. I don't know how File.ReadAllText works, or whether it will fail on a write-locked file.
What's the best solution to ensure the read will always work? The value being written is just a counter; if the write fails it is ignored, as it's not that important, though I'd prefer it was written. I guess I could put the counter into a separate config file and live with the error, but I'd rather keep it all in one file.
Thanks.
You can use a lock to make sure that a read is completed before a write and vice versa. As in:
using System;
using System.Threading;
using System.Xml;

class Program
{
    static readonly object _fileAccess = new object();
    static readonly XmlDocument xmlDoc = new XmlDocument(); // shared document (example)
    static readonly string filename = "config.xml";         // example path

    static void Write()
    {
        // Obtain lock and write
        lock (_fileAccess)
        {
            // Write data to filename
            xmlDoc.Save(filename);
        }
    }

    static void Read()
    {
        // Obtain lock and read
        lock (_fileAccess)
        {
            // Read some data from filename
            xmlDoc.Load(filename);
        }
    }

    static void Main()
    {
        ThreadStart writeT = new ThreadStart(Write);
        new Thread(writeT).Start();
        ThreadStart readT = new ThreadStart(Read);
        new Thread(readT).Start();
    }
}
With the lock, the Read() must wait for the Write() to complete and Write() must wait for Read() to complete.
To answer your question about how File.ReadAllText() works, looking at the source, it uses a StreamReader internally which in turn uses a FileStream opened with FileAccess.Read and FileShare.Read, so that would prevent any other process from writing to the file (e.g. your XmlDocument.Save()) until the ReadAllText completed.
Meanwhile, your XmlDocument.Save() eventually uses a FileStream opened with FileAccess.Write and FileShare.Read, so it would allow the File.ReadAllText() as long as the Save started before the ReadAllText.
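If you want to see those sharing rules in action, here is a minimal sketch (my own demo, not the framework source; the file name is a placeholder) that opens the file with the same access/share combinations:

using System;
using System.IO;

class ShareModeDemo
{
    static void Main()
    {
        // Mimics ReadAllText's open: FileAccess.Read + FileShare.Read.
        using (var reader = new FileStream("config.xml", FileMode.Open,
            FileAccess.Read, FileShare.Read))
        {
            try
            {
                // Mimics XmlDocument.Save's open: FileAccess.Write + FileShare.Read.
                // This fails while the reader above is open, because the reader
                // only shares further *read* access.
                using (var writer = new FileStream("config.xml", FileMode.Open,
                    FileAccess.Write, FileShare.Read))
                {
                }
            }
            catch (IOException ex)
            {
                Console.WriteLine("Write blocked while the read is open: " + ex.Message);
            }
        }
    }
}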
References: https://referencesource.microsoft.com/#mscorlib/system/io/streamreader.cs,a820588d8233a829
https://referencesource.microsoft.com/#System.Xml/System/Xml/Dom/XmlDocument.cs,1db4dba15523d588
Related
I have a StreamWriter with AutoFlush = true. However, I still see the file only partially written when I randomly open it. I'm writing a file that needs to be either fully written (it's JSON) or not there at any given time.
var sw = new StreamWriter(@"C:\file.txt", true /* append */, Encoding.ASCII) { AutoFlush = true };
sw.WriteLine("....");
// long running (think like a logging application) -- 1000s of seconds
sw.Close();
In between the sw.WriteLine() call and sw.Close() I want to open the file, and always have it be in the "correct data format", i.e. my line should be complete.
Current Idea:
Increase the internal buffer of the FileStream (and/or StreamWriter) to, let's say, 128 KB. Then, every 128 KB - 1 bytes, call .Flush() on the FileStream object. This leads me to my next question: when I call Flush(), should I first get the Stream.Position and call FileStream.Lock(position, 128 KB - 1)? Or does Flush() take care of that?
Basically, I don't want the reader to be able to read the contents in between Flush() calls, because they may be partially broken.
using (StreamWriter sw = new StreamWriter("FILEPATH"))
{
    sw.WriteLine("contents");
    // if you open the file now, you may see partially written lines
    // since the sw is still working on it.
}
// access the file now, since the stream writer has been properly closed and disposed.
If you need a "log-like" file which is never half-written, the way to go is not keeping it open.
Every time you want to write to your file, you should instantiate a new StreamWriter, which will flush the contents to the file when it is disposed, like this:
private void LogLikeWrite(string filePath, string contents)
{
    using (StreamWriter streamWriter = new StreamWriter(filePath, true)) // the true makes you append to the file instead of overwriting its contents
    {
        streamWriter.Write(contents);
    }
}
This way your write operations will be flushed immediately.
If you are sharing the file between processes, you're going to have a race condition unless you introduce a locking mechanism of some kind. See https://stackoverflow.com/a/29127380/892327. This does require that you are able to modify both processes.
An alternative is to have process A wait for a file at a specified location. Process B writes to an intermediate file and, once B has flushed it, the file is copied to the location where process A expects it, so A can consume it. A sketch of this handoff follows.
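A minimal sketch of that handoff (both paths are illustrative); it uses File.Move (a rename) rather than a copy, because a rename on the same volume makes the file appear fully formed or not at all:

using System.IO;

static class AtomicFileHandoff
{
    // Process B side: write everything to a temp file, then rename it into
    // place. Process A only ever sees a complete file at finalPath.
    public static void Publish(string finalPath, string contents)
    {
        string tempPath = finalPath + ".tmp";
        File.WriteAllText(tempPath, contents); // fully written and closed here

        File.Delete(finalPath);                // drop a previous version, if any
        File.Move(tempPath, finalPath);        // atomic rename on the same volume
    }
}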
I have a program that continuously writes its log to a text file.
I don't have the source code of it, so I can not modify it in any way and it is also protected with Themida.
I need to read the log file and execute some scripts depending on the content of the file.
I can not delete the file because the program that is continuously writing to it has locked the file.
So what would be the best way to read the file, reading only the new lines? Saving the position of the last line read? Or is there something in C# that would be useful for solving this?
Perhaps use a FileSystemWatcher along with opening the file with a permissive FileShare mode (as it is being used by another process). Hans Passant has provided a nice answer for this part here:
var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
using (var sr = new StreamReader(fs))
{
    // etc...
}
Have a look at this question and the accepted answer which may also help.
using (var fs = new FileStream("test.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite | FileShare.Delete))
using (var reader = new StreamReader(fs))
{
    while (true)
    {
        var line = reader.ReadLine();
        if (!String.IsNullOrWhiteSpace(line))
            Console.WriteLine("Line read: " + line);
        else
            Thread.Sleep(100); // at the end of the file; wait for more data instead of spinning
    }
}
I tested the above code and it works if you are trying to read one line at a time. The only issue is that if a line is flushed to the file before it is finished being written, you will read that line in multiple parts. As long as the logging system writes each line all at once, it should be okay.
If not, you may want to read into a buffer instead of using ReadLine, so you can parse the buffer yourself by detecting each Environment.NewLine substring, as in the sketch below.
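A rough sketch of that buffered approach (the file name is just an example): raw chunks are accumulated, and only text followed by a complete Environment.NewLine is emitted, so a partially flushed line is held back until it is finished.

using System;
using System.IO;
using System.Text;
using System.Threading;

class BufferedTail
{
    static void Main()
    {
        var pending = new StringBuilder(); // holds the unterminated tail
        var buffer = new char[4096];

        using (var fs = new FileStream("test.txt", FileMode.Open,
                   FileAccess.Read, FileShare.ReadWrite | FileShare.Delete))
        using (var reader = new StreamReader(fs))
        {
            while (true)
            {
                int count = reader.Read(buffer, 0, buffer.Length);
                if (count == 0)
                {
                    Thread.Sleep(100); // nothing new yet
                    continue;
                }

                pending.Append(buffer, 0, count);
                string text = pending.ToString();

                // Emit every complete line; keep whatever follows the last
                // newline as the pending tail.
                int last = text.LastIndexOf(Environment.NewLine, StringComparison.Ordinal);
                if (last < 0)
                    continue;

                foreach (var line in text.Substring(0, last)
                             .Split(new[] { Environment.NewLine }, StringSplitOptions.None))
                    Console.WriteLine("Line read: " + line);

                pending.Clear();
                pending.Append(text.Substring(last + Environment.NewLine.Length));
            }
        }
    }
}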
You can just keep calling ReadToEnd() in a tight loop. Even after it reaches the end of the file, it'll just return an empty string "". If more data is written to the file, it will be picked up on a subsequent call.
while (true)
{
    string moreData = streamReader.ReadToEnd(); // empty string when there is nothing new
    // process moreData here
    Thread.Sleep(100);
}
Bear in mind you might read partial lines this way. Also if you are dealing with very large files you will probably need another approach.
Use a FileSystemWatcher to detect changes, then get the new lines by seeking to the last read position in the file.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek.aspx
The log file is being "continuously" updated, so you really shouldn't use FileSystemWatcher to raise an event each time the file changes. It would trigger constantly, since you already know the file changes very frequently.
I'd suggest using a timer event to periodically process the file. Read this SO answer for a good pattern for using System.Threading.Timer [1]. Keep a file stream open for reading, or reopen it each time and Seek to the end position of your last successful read. By "last successful read" I mean that you should encapsulate the reading and validating of a complete log line. Once you've successfully read and validated a log line, you have a new position for the next Seek.
[1] Note that System.Threading.Timer executes on a system-supplied thread kept in business by the ThreadPool. For short tasks this is more desirable than a dedicated thread.
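A sketch of that pattern, under a few assumptions (the log path, poll interval, and UTF-8/ASCII content without a BOM are all illustrative): every tick reopens the file, seeks to the last confirmed position, and only advances past newline-terminated lines, so a half-written final line is re-read on the next tick.

using System;
using System.IO;
using System.Threading;

class LogPoller
{
    static long _lastPosition; // end of the last complete line we consumed

    static void Main()
    {
        using (var timer = new Timer(Poll, null, 0, 1000)) // poll once per second
        {
            Console.ReadKey(true);
        }
    }

    static void Poll(object state)
    {
        using (var fs = new FileStream("app.log", FileMode.Open,
                   FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(fs))
        {
            fs.Seek(_lastPosition, SeekOrigin.Begin);
            string chunk = reader.ReadToEnd();

            // Only advance past the last newline-terminated line.
            int lastNewline = chunk.LastIndexOf('\n');
            if (lastNewline < 0)
                return;

            string complete = chunk.Substring(0, lastNewline + 1);
            foreach (var line in complete.Split(new[] { '\r', '\n' },
                         StringSplitOptions.RemoveEmptyEntries))
                Console.WriteLine("New line: " + line);

            // Assumes a byte-oriented encoding (e.g. UTF-8 without a BOM).
            _lastPosition += reader.CurrentEncoding.GetByteCount(complete);
        }
    }
}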
See this answer on another post: c# continuously read file.
That one is quite efficient: it checks once per second whether the file size has changed, so the file is usually not read-locked.
The other answers are quite valid and simple. A couple of them will read-lock the file continuously, but that's probably not a problem for most.
I am currently facing an issue. I have a method that queries specific ports of a server and writes the results to a text file called temp.txt. temp.txt should never contain duplicate data, and the file should be cleared before the method begins. However, sometimes the previous instance of the method is still running (it's asynchronous), and I often get duplicate data because the other instance is still performing queries and writing to the file.
Code Snippet:
StreamWriter sw = File.AppendText("temp");
sw.WriteLine("Check1=Success");
sw.Close();
You can implement some sort of lock:
Lock ensures that one thread does not enter a critical section of code while another thread is in the critical section. If another thread attempts to enter a locked section of code, it will wait (block) until the object is released.
class OnlyOneCallerAllowed
{
    private static readonly object locker = new object();

    public static void OnlyOneMethodCanWrite()
    {
        lock (locker)
        {
            using (StreamWriter sw = File.AppendText("temp"))
            {
                sw.WriteLine("Check1=Success");
            }
        }
    }
}
I would prefer to change the approach: use a queue to record that you need to update the file. It could be MSMQ or an in-memory queue, depending on the level of fault tolerance you want. Then have a single thread dequeue the events and update the file; this guarantees single-writer updates. It's a sort of publish-subscribe pattern with many publishers and a single subscriber, as sketched below. Alternatively you can lock the file access by using a lock() over a static object.
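A minimal in-memory sketch of that single-subscriber queue, using BlockingCollection (the file name matches the snippet above; everything else is illustrative):

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class SingleWriterQueue
{
    // All producers enqueue; exactly one consumer appends to the file,
    // so writes can never interleave.
    static readonly BlockingCollection<string> _lines = new BlockingCollection<string>();

    static void Main()
    {
        var writerTask = Task.Run(() =>
        {
            foreach (string line in _lines.GetConsumingEnumerable())
            {
                using (StreamWriter sw = File.AppendText("temp"))
                    sw.WriteLine(line);
            }
        });

        // Any number of threads can publish safely:
        _lines.Add("Check1=Success");
        _lines.Add("Check2=Success");

        _lines.CompleteAdding(); // signal that no more work is coming
        writerTask.Wait();
    }
}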
StreamWriter sw = File.AppendText("temp");
sw.WriteLine("Check1=Success");
sw.Flush(); // <======
sw.Close();
EDIT:
What about opening the file with exclusive access (FileShare.None)?
FileStream fs = new FileStream("temp", FileMode.Append, FileAccess.Write, FileShare.None);
StreamWriter sw = new StreamWriter(fs);
You will need to introduce some error handling as well, since you will get an exception if the other method still has the file open.
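For example, a sketch of that error handling (retry count and delay are arbitrary illustrative values):

using System;
using System.IO;
using System.Threading;

static class ExclusiveAppend
{
    // Retries a few times if another caller still holds the file open.
    public static void Append(string path, string line, int maxTries = 5)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                using (var fs = new FileStream(path, FileMode.Append,
                           FileAccess.Write, FileShare.None))
                using (var sw = new StreamWriter(fs))
                {
                    sw.WriteLine(line);
                }
                return;
            }
            catch (IOException) when (attempt < maxTries)
            {
                Thread.Sleep(100); // file still in use; back off and retry
            }
        }
    }
}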
I'm writing an application that manipulates a text file. The first half of my function reads the text file, while the second half (optionally) writes to the same file. Although I call .Close() on the StreamReader object before opening the StreamWriter object, I still get an IOException: The process cannot access the file "file.txt" because it is being used by another process.
How do I force my program to release the file before continuing?
public static void manipulateFile(String fileIn, String fileOut, String obj)
{
    // (declared here so the snippet compiles)
    String part1 = "", part2 = "";
    String[] part3 = new String[0];

    StreamReader sr = new StreamReader(fileIn);
    String line;
    while ((line = sr.ReadLine()) != null)
    {
        //code to split up file into part1, part2, and part3[]
    }
    sr.Close();

    //Write the file
    if (fileOut != null)
    {
        StreamWriter sw = new StreamWriter(fileOut);
        sw.Write(part1 + part2);
        foreach (String s in part3)
        {
            sw.WriteLine(s);
        }
        sw.Close();
    }
}
Your code as posted runs fine - I don't see the exception.
However, calling Close() manually like that is a bad idea: if an exception is thrown, your call to Close() might never be made. You should use a finally block, or better yet, a using statement.
using (StreamReader sr = new StreamReader(fileIn))
{
// ...
}
But the actual problem you are experiencing might not be specific to this method; it may be a general problem of forgetting to close files properly. I suggest you go through your whole code base, look for all the places where you use IDisposable objects, and check that you dispose of them correctly even when exceptions can occur.
Getting read access to a file that's already opened elsewhere isn't usually difficult. Most code would open a file for reading with FileShare.Read, allowing somebody else to read the file as well. StreamReader does so for example.
Getting write access is an entirely different ball of wax. That same FileShare.Read does not include FileShare.Write, which would be what allows you to write the file while somebody else is reading it. That would be very troublesome anyway: you'd be jerking the mat out from under that somebody else, suddenly providing entirely different data.
All you have to do is find out who that "somebody else" might be. Sysinternals' Handle utility can tell you. Hopefully it is your own program; then you can do something about it.
It may sound like a stupid question, but are you sure you didn't edit the file with another application which didn't release it? I've had this situation before, mostly with Excel files, where Excel didn't completely unload from memory (or I was just dumb enough not to close the other application). It might happen with whatever application you use for .txt files, if any. Just a suggestion.
My web application returns a file from the filesystem. These files are dynamic, so I have no way of knowing their names or how many of them there will be. When a file doesn't exist, the application creates it from the database. I want to avoid two different threads recreating the same file at the same time, or one thread trying to return the file while another is creating it.
Also, I don't want to take a lock over an element that is common to all the files; I should lock the file only while I'm creating it.
So I want to lock a file until its recreation is complete; if another thread tries to access it, it will have to wait for the file to be unlocked.
I've been reading about FileStream.Lock, but I would have to know the file length, and it won't prevent another thread from trying to read the file, so it doesn't work for my particular case.
I've also been reading about FileShare.None, but it will throw an exception (which exception type?) if another thread/process tries to access the file, so I would have to implement a "try again while it's failing" loop. I'd like to avoid generating exceptions, and I don't like that approach much, although maybe there is no better way.
The FileShare.None approach would be more or less this:
static void Main(string[] args)
{
    new Thread(new ThreadStart(WriteFile)).Start();
    Thread.Sleep(1000);
    new Thread(new ThreadStart(ReadFile)).Start();
    Console.ReadKey(true);
}

static void WriteFile()
{
    using (FileStream fs = new FileStream("lala.txt", FileMode.Create, FileAccess.Write, FileShare.None))
    using (StreamWriter sw = new StreamWriter(fs))
    {
        Thread.Sleep(3000);
        sw.WriteLine("trolololoooooooooo lolololo");
    }
}

static void ReadFile()
{
    bool done = false;
    int maxTries = 5;
    while (!done && maxTries > 0)
    {
        try
        {
            Console.WriteLine("Reading...");
            using (FileStream fs = new FileStream("lala.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
            using (StreamReader sr = new StreamReader(fs))
            {
                while (!sr.EndOfStream)
                    Console.WriteLine(sr.ReadToEnd());
            }
            done = true;
            Console.WriteLine("Read.");
        }
        catch (IOException)
        {
            Console.WriteLine("Fail: " + maxTries.ToString());
            maxTries--;
            Thread.Sleep(1000);
        }
    }
}
But I don't like the fact that I have to catch exceptions, try several times, and wait an imprecise amount of time :|
You can handle this by using the FileMode.CreateNew argument to the stream constructor. One of the threads is going to lose, finding out that the file was already created a microsecond earlier by another thread, and will get an IOException.
It will then need to spin, waiting for the file to be fully created, which you enforce with FileShare.None. Catching exceptions here doesn't matter; it is spinning anyway, and there's no other workaround for it unless you P/Invoke.
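A sketch of that scheme (the helper name is mine): the winner creates and fills the file under FileShare.None; losers catch the IOException and spin until the file can be opened for reading.

using System.IO;
using System.Threading;

static class CreateNewCoordinator
{
    public static Stream OpenOrCreate(string path, byte[] contents)
    {
        try
        {
            // Only one thread can succeed here; FileMode.CreateNew fails
            // if the file already exists.
            using (var fs = new FileStream(path, FileMode.CreateNew,
                       FileAccess.Write, FileShare.None))
            {
                fs.Write(contents, 0, contents.Length);
            } // file is fully written before the exclusive handle is released
        }
        catch (IOException)
        {
            // Lost the race: another thread is creating the file right now.
        }

        while (true)
        {
            try
            {
                return new FileStream(path, FileMode.Open,
                    FileAccess.Read, FileShare.Read);
            }
            catch (IOException)
            {
                Thread.Sleep(50); // the creator still holds FileShare.None
            }
        }
    }
}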
I think the right approach would be the following: create a set of strings where you save the name of each file currently being processed, so that only one thread processes a given file at a time. Something like this:
//somewhere in your code, or in a singleton
static System.Collections.Generic.HashSet<String> filesAlreadyProcessed = new System.Collections.Generic.HashSet<String>();

//thread main method code
bool fileAlreadyProcessed = false;
lock (filesAlreadyProcessed)
{
    if (filesAlreadyProcessed.Contains(filename))
    {
        fileAlreadyProcessed = true;
    }
    else
    {
        filesAlreadyProcessed.Add(filename);
    }
}

if (!fileAlreadyProcessed)
{
    //Process the file
}
Do you have a way to identify what files are being created?
Say every one of those files corresponds to a unique ID in your database. You create a centralised location (a singleton?) where these IDs can be associated with something lockable (a Dictionary). A thread that needs to read/write one of those files does the following:
//Request access
ReaderWriterLockSlim fileLock = null;
bool needCreate = false;

lock (Coordination.Instance)
{
    if (Coordination.Instance.ContainsKey(theId))
    {
        fileLock = Coordination.Instance[theId];
    }
    else if (!fileExists(theId)) //check if the file exists at this moment
    {
        Coordination.Instance[theId] = fileLock = new ReaderWriterLockSlim();
        fileLock.EnterWriteLock(); //give no other thread the chance to get into write mode
        needCreate = true;
    }
    else
    {
        //The file exists, and whoever created it is done writing. No need to synchronize in this case.
    }
}

if (needCreate)
{
    createFile(theId); //Writes the file from the database
    lock (Coordination.Instance)
        Coordination.Instance.Remove(theId);
    fileLock.ExitWriteLock();
    fileLock = null;
}

if (fileLock != null)
    fileLock.EnterReadLock();

//read your data from the file

if (fileLock != null)
    fileLock.ExitReadLock();
Of course, threads that don't follow this exact locking protocol will still have unsynchronized access to the file.
Now, locking over a Singleton object is certainly not ideal, but if your application needs global synchronization then this is a way to achieve it.
Your question really got me thinking.
Instead of having every thread be responsible for file access, and having them block, what if you used a queue of files that need to be persisted, with a single background worker thread that dequeues and persists them?
While the background worker is cranking away, the web application threads can return the DB values until the file actually exists.
I've posted a very simple example of this on GitHub.
Feel free to give it a shot and let me know what you think.
FYI, if you don't have git, you can use svn to pull it http://svn.github.com/statianzo/MultiThreadFileAccessWebApp
The question is old and there is already a marked answer. Nevertheless I would like to post a simpler alternative.
I think we can directly use the lock statement on the filename, as follows:
lock(string.Intern("FileLock:absoluteFilePath.txt"))
{
// your code here
}
Generally, locking on a string is a bad idea because of string interning: any code that locks the same literal shares your lock. But in this particular case, that is exactly what ensures no one else can take the lock for the same file; just use the same lock string before attempting to read. Here, interning works for us, not against us.
PS: The text 'FileLock' is just an arbitrary prefix to ensure that other uses of the same file-path string are not affected.
Why not just use the database? If you have a way to associate a filename with the data from the DB it contains, you can add some information to the DB specifying whether a file with that information currently exists, when it was created, how stale the information in the file is, etc. When a thread needs some information, it checks the DB to see if the file exists; if not, it writes a row to the table saying it's creating the file. When it's done, it updates that row with a boolean saying the file is ready to be used by others.
The nice thing about this approach is that all your information is in one place, so you can do nice error recovery. For example, if the thread creating the file dies badly for some reason, another thread can come along and decide to rewrite the file because the creation time is too old. You can also build simple batch cleanup processes and get accurate data on how frequently certain data is used for a file and how often it is updated (by looking at the creation times, etc.). You also avoid doing many disk seeks across your filesystem as different threads look for different files all over the place, especially if you decide to have multiple front-end machines seeking across a common disk.
The tricky part: you'll have to make sure your DB supports row-level locking on the table that threads write to when they create files, because otherwise the table itself may be locked, which could make this unacceptably slow. A sketch of this bookkeeping follows.
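A rough sketch of that claim/ready bookkeeping (the table name, columns, and SQL Server flavor are all hypothetical; a production version would also put a unique constraint on FileName):

using System;
using System.Data.SqlClient;

static class FileRegistry
{
    // Returns true if this thread won the right to create the file.
    public static bool TryClaimFile(SqlConnection conn, string fileName)
    {
        // Insert a "creating" row only if no row exists yet; row-level
        // locking in the database arbitrates concurrent claimers.
        const string sql = @"
            INSERT INTO FileStatus (FileName, IsReady, CreatedAt)
            SELECT @name, 0, SYSUTCDATETIME()
            WHERE NOT EXISTS (SELECT 1 FROM FileStatus WHERE FileName = @name)";
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@name", fileName);
            return cmd.ExecuteNonQuery() == 1; // one row inserted => we create the file
        }
    }

    // Called by the creator once the file is fully written.
    public static void MarkReady(SqlConnection conn, string fileName)
    {
        const string sql = "UPDATE FileStatus SET IsReady = 1 WHERE FileName = @name";
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@name", fileName);
            cmd.ExecuteNonQuery();
        }
    }
}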