I have a function that reads a file and if it doesn't exist it creates and fills it. However this is in a .NET Standard Library and is called from an AWS Lambda function which gives back the following error:
"Read-only file system"
How can I determine in code that the file system is read-only, so that I can skip the file creation when that is the case?
EDIT: My question is different from "Is there a way to check if a file is in use". That question is about trying to read a file before it has been saved; I am asking how to know whether the file system would even allow me to save a file (due to restrictions in the file system itself).
Generally, for nearly everything to do with file systems, the trick is to remember they are volatile, and therefore the answer is to just try whatever you want to do and handle the exception if it fails.
To understand what I mean by "volatile", let's say you have some magic Path.IsReadOnlyFileSystem() method that does exactly what you want, so you run some code like this:
if (!Path.IsReadOnlyFileSystem(fileIWantToCreate))
{
    using (var sw = new StreamWriter(fileIWantToCreate))
    {
        // fill file here
    }
}
But then something happens in your AWS cloud forcing the service to go into read-only mode. And it happens right here:
if (!Path.IsReadOnlyFileSystem(fileIWantToCreate))
{
    /* <==== AWS goes read-only here ====> */
    using (var sw = new StreamWriter(fileIWantToCreate))
Your magic method did its job perfectly. At the moment you called it, the file system still allowed writes. Unfortunately, a volatile file system means you can't trust operations from one moment to the next (and a non-volatile file system wouldn't be much good to you for saving data). So you might be tempted to go for a pattern like this:
try
{
    if (!Path.IsReadOnlyFileSystem(fileIWantToCreate))
    {
        using (var sw = new StreamWriter(fileIWantToCreate))
        {
            // fill file here
        }
    }
}
catch (IOException ex)   // e.g. "Read-only file system"
{
    // handle the exception here
}
And now it's clear you have to handle the exception anyway. The if() condition doesn't really gain anything in terms of actual functionality. All it really does is add complexity to your code.
Even so, there's a tendency to believe keeping the if() condition gains a performance benefit by saving an exception handler and saving a disk access. This belief is flawed.
The if() condition actually adds a disk access when it checks the file system status. Moreover, this cost is paid on every execution. Keeping the if() condition does save the exception handler, but only on the occasions where the initial access fails. Admittedly, unwinding the stack to handle an exception is among the most expensive operations you can do in all of computer science. This is the "why" behind the "don't use exceptions for normal control flow" mantra. However, disk access is one of the few things that is far and away worse for performance.
In summary, we have an exception handler stack unwind sometimes vs an extra disk access every time. From a performance standpoint it's clear you're better off without the if() condition at all. We don't need the if() condition for correctness reasons, and we don't want it for performance reasons.... remind me again why we have it? The better pattern skips that block completely, like this:
try
{
    using (var sw = new StreamWriter(fileIWantToCreate))
    {
        // fill file here
    }
}
catch (IOException ex)   // e.g. "Read-only file system"
{
    // handle the exception here
}
Thus the magic IsReadOnlyFileSystem() method is not needed or even helpful.
This isn't to say such a method doesn't exist (I don't think it's part of .NET Standard, but depending on your platform there's likely a lower-level construct you can call into). It's just that using it isn't a particularly good idea.
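If you do want the "skip the file creation when the file system won't allow it" behaviour from the question, a minimal sketch of it using nothing but the try/catch approach above could look like this. The helper name is illustrative, and the exception types caught (IOException and UnauthorizedAccessException) are my assumption about what a read-only file system surfaces as:

using System;
using System.IO;

static class FileBootstrap
{
    // Try to create and fill the file; if the file system refuses the write
    // (read-only, no permission), skip the creation instead of failing.
    public static void EnsureFileExists(string path, string content)
    {
        if (File.Exists(path))
            return; // nothing to do

        try
        {
            using (var sw = new StreamWriter(path))
            {
                sw.Write(content); // fill file here
            }
        }
        catch (IOException ex)                  // e.g. "Read-only file system"
        {
            Console.WriteLine("Skipping file creation: " + ex.Message);
        }
        catch (UnauthorizedAccessException ex)  // access denied
        {
            Console.WriteLine("Skipping file creation: " + ex.Message);
        }
    }
}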
Related
I have the following C# algorithm for config file writeback:
string strPathConfigFile = @"C:\File.txt";
string strPathBackupFile = @"C:\File.backup";
string strContent = "File Content Text";
bool oldFilePresent = File.Exists(strPathConfigFile);
// Step 1
if (oldFilePresent)
{
    File.Move(strPathConfigFile, strPathBackupFile);
}
// Step 2
using (FileStream f = new FileStream(strPathConfigFile, FileMode.Create, FileAccess.ReadWrite, FileShare.None))
{
    using (StreamWriter s = new StreamWriter(f))
    {
        s.Write(strContent);
        s.Close();
    }
    f.Close();
}
// Step 3
if (oldFilePresent)
{
    File.Delete(strPathBackupFile);
}
It works like this:
The original File.txt is renamed to File.backup.
The new File.txt is written.
File.backup is deleted.
This way, if there is a power blackout during the write operation, there is still an intact backup file present. The backup file is only deleted after the write operation is completed. The reading process can check if the backup file is present. If it is, the normal file is considered broken.
For this approach to work, it is crucial that the order of the 3 steps is strictly followed.
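For illustration, the reader-side check could look something like this (a minimal sketch; the method name is illustrative, the actual reading code is not shown above):

// Reader side: if File.backup still exists, the last writeback did not complete,
// so File.txt is considered broken and the intact backup is used instead.
static string ReadConfig(string strPathConfigFile, string strPathBackupFile)
{
    if (File.Exists(strPathBackupFile))
    {
        // Writeback was interrupted between steps 1 and 3: fall back to the backup.
        return File.ReadAllText(strPathBackupFile);
    }

    return File.ReadAllText(strPathConfigFile);
}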
Now to my question: is it possible that the C# compiler swaps steps 2 and 3?
There might be a slight performance benefit, as steps 1 and 3 are wrapped in identical if-conditions, which could tempt the compiler to merge them.
I suspect the compiler might do it, since steps 2 and 3 operate on completely different files. To a compiler that doesn't know the semantics of my exceptionally clever writeback procedure, steps 2 and 3 might seem unrelated.
According to the language specification, the C# compiler must preserve side effects when reordering statements. Writing to files is such a side effect.
In general, the compiler/JIT/CPU is free to reorder instructions as long as the result would be identical for a single thread. However, I/O, system calls and most things involved with multithreading involve memory barriers or other synchronization that prevents such reordering.
In the given example there is only a single thread involved. So if the File APIs are implemented correctly (and that would be a fairly safe assumption) there should be no risk of unintended behavior.
Reordering issues mostly pop up when writing multithreaded code without being aware of all the potential hazards and requirements for synchronization. As long as you only use a single thread, you should not need to worry about reordering.
I want to download a file from Azure Blob Storage that may not exist yet, and I'm looking for the most reliable and performant way of handling this. To that end, I've found two options that both work:
Option 1: Use ExistsAsync()
Duration: This takes roughly 1000~1100ms to complete.
if (await blockBlob.ExistsAsync())
{
    await blockBlob.DownloadToStreamAsync(ms);
    return ms;
}
else
{
    throw new FileNotFoundException();
}
Option 2: Catch the exception
Duration: This takes at least 1600ms, every time.
try
{
    await blockBlob.DownloadToStreamAsync(ms);
    return ms;
}
catch (StorageException e)
{
    // note: there is no 'StorageErrorCodeStrings.BlobNotFound'
    if (e.RequestInformation.ErrorCode == "BlobNotFound")
        throw new FileNotFoundException();

    throw;
}
The metrics were gathered through simple API calls on a Web API which consumes the above functions and returns an appropriate message; I've manually tested the end-to-end scenario through Postman. There is some overhead in this approach, of course. But in summary, the ExistsAsync() option consistently saves at least 500ms, at least on my local machine while debugging. Which is a bit remarkable, because the DoesServiceRequest attribute on ExistsAsync() seems to indicate it is another expensive HTTP call that needs to be made.
Also, the ExistsAsync API docs don't say anything about its usage or any side effects.
A blunt conclusion, based on poor man's testing, would therefore lead me to option no. 1, because:
it's faster in debug/localhost (the catch: this says nothing about a release build in production)
to me it's cleaner, especially because the alternative requires manually checking the ErrorCode against a particular string
I would assume the ExistsAsync() operation is there for this exact reason
But here is my actual question: is this the correct interpretation of the usage of ExistsAsync()?
I.e. is the "why" it exists to be more efficient than simply catching a not-found exception, particularly for performance reasons?
Thanks!
But here is my actual question: is this the correct interpretation of the usage of ExistsAsync()?
You can easily take a look at the implementation yourself.
ExistsAsync() is just a wrapper around an HTTP call: if the blob is not there the service responds with HTTP 404 and the method returns false; otherwise it returns true.
I'd say go for ExistsAsync, as it seems the more efficient way, especially if you expect that the blob will sometimes not be there. DownloadToStreamAsync has more work to do in terms of wrapping the error in a StorageException and possibly doing some more cleanup.
I would assume the ExistsAsync() operation is there for this exact reason
Consider this: sometimes you just want to know if a given blob exists without being interested in the content. For example to give a warning that something will be overwritten when uploading. In that case using ExistsAsync is a nice use case because using DownloadToStreamAsync will be expensive for just a check on existence since it will download the content if the blob is there.
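A minimal sketch of that overwrite-warning scenario, assuming the same client types as in the question (CloudBlockBlob from the older WindowsAzure.Storage package); the method name and the message are illustrative:

// Requires Microsoft.WindowsAzure.Storage.Blob, System, System.IO, System.Threading.Tasks.
// A cheap existence probe before uploading, instead of downloading anything.
public static async Task UploadWithWarningAsync(CloudBlockBlob blockBlob, Stream content)
{
    if (await blockBlob.ExistsAsync())
    {
        Console.WriteLine("Warning: '" + blockBlob.Name + "' already exists and will be overwritten.");
    }

    await blockBlob.UploadFromStreamAsync(content);
}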
I know it sounds really stupid, but I have a really simple application that saves some data from some users to a database, and then I want to write all the data to a .txt file.
The code is as follows:
List<MIEMBRO> listaMiembros = bd.MIEMBRO.ToList<MIEMBRO>();

fic.WriteLine("PARTICIPACIONES GRATUITAS MIEMBROS: ");
mi = new Miembro();
foreach (MIEMBRO_GRATIS m in listaMiembroGratis)
{
    mi.setNomMiembro(m.nomMiembro);
    mi.setNumRifa(m.numRifa.ToString());
    fic.WriteLine(mi.ToString());
}
fic.WriteLine();
As you see, really easy code. The thing is: I show the information in a datagrid and I know there are lots more members, but it stops writing at some point.
Is there any limit to the number of lines or characters you can write with a StreamWriter? Why can't I write all the members, only some of them?
fic is probably not being flushed by the time you are looking at the output file; if you instantiate it as the argument for a using block, it will be flushed, closed, and disposed of when you are done.
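For example, a minimal sketch reusing the names from your snippet (the output file name is illustrative):

// Wrapping the writer in a using block guarantees Flush/Close/Dispose,
// even if an exception is thrown halfway through the loop.
using (StreamWriter fic = new StreamWriter("salida.txt"))
{
    fic.WriteLine("PARTICIPACIONES GRATUITAS MIEMBROS: ");

    mi = new Miembro();
    foreach (MIEMBRO_GRATIS m in listaMiembroGratis)
    {
        mi.setNomMiembro(m.nomMiembro);
        mi.setNumRifa(m.numRifa.ToString());
        fic.WriteLine(mi.ToString());
    }

    fic.WriteLine();
} // <- flushed and closed here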
Also, in case you are flushing properly (but it is not being flushed by the time you are checking the file), you could flush at the end of each iteration:
foreach (MIEMBRO_GRATIS m in listaMiembroGratis)
{
    mi.setNomMiembro(m.nomMiembro);
    mi.setNumRifa(m.numRifa.ToString());
    fic.WriteLine(mi.ToString());
    fic.Flush();
}
This will decrease performance slightly, but it will at least give you an opportunity to see which record is failing to write (if, indeed, an exception is being thrown).
Is there any limit to the number of lines or characters you can write with a StreamWriter?
No, there isn't.
As you see, really easy code. The thing is: I show the information in a datagrid and I know there are lots more members, but it stops writing at some point.
My guess is that your code is throwing an exception and you aren't catching it. I would look at the implementation of setNomMiembro, setNumRifa and ToString in Miembro; by the way, setNomMiembro and setNumRifa should probably be implemented as properties ({ get; set; }) and not as methods.
For example, calling ToString on numRifa would throw a NullReferenceException if numRifa is null.
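A minimal sketch of that suggestion (the ToString format is illustrative; the null guard is the important part):

public class Miembro
{
    // Auto-properties instead of setNomMiembro/setNumRifa methods.
    public string NomMiembro { get; set; }
    public string NumRifa { get; set; }

    public override string ToString()
    {
        // Guard against nulls so ToString never throws while writing the file.
        return string.Format("{0} {1}", NomMiembro ?? string.Empty, NumRifa ?? string.Empty);
    }
}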
MSDN tells us that when you call "File.Delete( path );" on a file that doesn't exist an exception is generated.
Would it be more efficient to call the delete method and use a try/catch block to avoid the error or validate the existence of the file before doing the delete?
I'm inclined to think it's better to avoid the try/catch block. Why let an error occur when you know how to check for it?
Anyway, here is some sample code:
// Option 1: Just delete the file and ignore any exceptions
/// <summary>
/// Remove the files from the local server if the DeleteAfterTransfer flag has been set
/// </summary>
/// <param name="LocalFiles">a list of full file paths to be removed from the local server</param>
private void RemoveLocalFiles(List<string> LocalFiles)
{
    // Ensure there is something to process
    if (LocalFiles != null && LocalFiles.Count > 0 && m_DeleteAfterTransfer == true)
    {
        foreach (string file in LocalFiles)
        {
            try { File.Delete(file); }
            catch { }
        }
    }
}
// Option 2: Check for the existence of the file before deleting
private void RemoveLocalFiles(List<string> LocalFiles)
{
    // Ensure there is something to process
    if (LocalFiles != null && LocalFiles.Count > 0 && m_DeleteAfterTransfer == true)
    {
        foreach (string file in LocalFiles)
        {
            if (File.Exists(file) == true)
                File.Delete(file);
        }
    }
}
Some Background to what I'm trying to achieve:
The code is part of an FTP wrapper class which will simplify the features of FTP functionality to only what is required and can be called by a single method call.
In this case, we have a flag called "DeleteAfterTransfer" and if it is set to true the files are removed. If the file didn't exist in the first place, I'd expect to have had an exception before getting to this point.
I think I'm answering my own question here but checking the existence of the file is less important than validating I have permissions to perform the task or any of the other potential errors.
You have essentially three options, considering that File.Delete does not throw an exception when your file isn't there:
Use File.Exists, which requires an extra roundtrip to the disk each time (credits to Alexandre C), plus a roundtrip to the disk for File.Delete. This is slow. But if you want to do something specific when the file doesn't exist, this is the only way.
Use exception handling. Considering that entering a try/catch block is relatively fast (about 4-6 m-ops, I believe), the overhead is negligible and you have the option to catch the specific exceptions, like IOException when the file is in use. This can be very beneficial, but you will not be able to act when the file does not exist, because that doesn't throw. Note: this is the easiest way to avoid race conditions, as Alexandre C explains below in more detail.
Use both exception handling and File.Exists. This is potentially slowest, but only marginally so and the only way to both catch exceptions and do something specific (issue a warning?) when the file doesn't exist.
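A minimal sketch of option 3 (the method name and the specific exceptions caught are illustrative):

using System;
using System.IO;

static class FileCleanup
{
    // Option 3: File.Exists for the "not there" case, plus specific catches
    // for everything that can still go wrong between the check and the delete.
    public static void TryDelete(string path)
    {
        if (!File.Exists(path))
        {
            Console.WriteLine("Warning: '" + path + "' does not exist, nothing to delete.");
            return;
        }

        try
        {
            File.Delete(path);
        }
        catch (IOException ex)                  // file in use, disk failure, ...
        {
            Console.WriteLine("Could not delete '" + path + "': " + ex.Message);
        }
        catch (UnauthorizedAccessException ex)  // read-only file or no permission
        {
            Console.WriteLine("Could not delete '" + path + "': " + ex.Message);
        }
    }
}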
A summary of my original answer, giving some more general advice on using and handling exceptions:
Do not use exception handling when control flow suffices, that's simply more efficient and more readable.
Use exceptions and exception-handling for exceptional cases only.
Entering a try/catch block is very efficient (about 4-6 m-ops, I believe), but when an exception is actually thrown it is relatively costly.
An exception to the above is: whenever dealing with file functions, use exception handling. The reason is that race conditions may happen and that you never know what occurs between your if-statement and your file-delete statement.
Never ever, and I mean never ever, use a try/catch for all exceptions (an empty catch block); this is almost always a weak point in your application and needs improvement. Only catch specific exceptions. (Exception to this rule: when dealing with COM exceptions that don't inherit from Exception.)
Another option: use Windows API DeleteFile...
[DllImport("kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
public static extern bool DeleteFile(string path);
This returns true on success, otherwise false. When it returns false, you don't pay the large overhead of an exception.
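Usage could look something like this (a sketch; path is whatever file you want to remove, and Marshal comes from System.Runtime.InteropServices, which the DllImport above already needs):

// SetLastError = true on the DllImport lets us read the Win32 error code on failure.
if (!DeleteFile(path))
{
    int error = Marshal.GetLastWin32Error();
    // 2 = ERROR_FILE_NOT_FOUND, 5 = ERROR_ACCESS_DENIED, ...
    Console.WriteLine("DeleteFile failed with Win32 error {0}", error);
}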
MSDN says that no exception is generated. Actually, it is better this way since you have a race condition: between calls to File.Exists and File.Delete the file could have been deleted or created by another process.
If it were to throw, you are better catching the specific exception that could have been thrown (FileNotFoundException or similar). Note that because of the race condition, exceptions are the only way to go here.
If your problem is that the directory containing the file doesn't exist, then you can do:
if (LocalFiles != null && m_DeleteAfterTransfer == true)
{
    foreach (string file in LocalFiles)
    {
        try { File.Delete(file); }
        catch (DirectoryNotFoundException) { }
    }
}
Again, don't check for the directory existence before, because 1) it is cumbersome 2) it has the same race condition problem. Only File.Delete guarantees that the check and the delete are atomically executed.
Anyway, you never want to catch every exception here, since file I/O methods can fail for a lot of reasons (and you surely don't want to silently swallow a disk failure!).
There are many exceptions that can occur when trying to delete a file. Take a look here for all of them. So it's probably best to catch and deal with each of them as you see fit.
You should use File.Exists, but handle the exception anyway. In general it may be possible for the file to be deleted in between the two calls. Handling the exception still has some overhead, but checking the file for existence first reduces the throw frequency to almost-never. For the one in a million chance that the above scenario occurs, you could still ruin someone's work with an unhandled exception.
In the general case, it's indeed better to test for the exceptional case before calling the method, as exceptions should not be used for control flow.
However, we're dealing with the filesystem here, and even if you check that the file exists before deleting it, something else might remove it between your two calls. In that case, Delete() would still throw an exception even if you explicitly made sure it wouldn't.
So, in that particular case, I would be prepared to handle the exception anyway.
I think you should not only worry about efficiency, but also intention. Ask yourself questions like
Is it a faulty case that the list of files contains files that don't exist?
Do you bother about any of the other errors that can be the cause of failing to delete the file?
Should the process continue if an error other than "file not found" occurs?
Obviously, the Delete call can fail even if the file exists, so only adding that check will not protect your code from failure; you still need to catch exception. The question is more about what exceptions to catch and handle, and which ones should bubble upwards to the caller.
It's far better to do File.Exists first. Exceptions have a large overhead. In terms of efficiency, your definition isn't very clear, but performance and memory-wise, go for File.Exists.
See also my previous answer in this question about using Exceptions to control program flow:
Example of "using exceptions to control flow"
As shown below, I welcome anyone to try this for themselves. People talking about spindle speed and access times for hard drives: that is massively irrelevant, because we are not timing that. The OP asked what the most performant way of achieving his task is. As you can see clear as day here, it's to use File.Exists. This is repeatable:
Actual recorded performance results: try/catch vs File.Exists for non-existent files
Number of files (non-existent): 10,000
Source: http://pastebin.com/6KME40md
Results:
RemoveLocalFiles1 (try/catch) : 119ms
RemoveLocalFiles2 (File.Exists) : 106ms
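For anyone who wants to reproduce this without the pastebin, a rough sketch of the kind of harness used (a plain Stopwatch loop; the file count and temp paths are illustrative):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

static class DeleteBenchmark
{
    static void Main()
    {
        // 10,000 paths that do not exist on disk.
        var files = new List<string>();
        for (int i = 0; i < 10000; i++)
            files.Add(Path.Combine(Path.GetTempPath(), "missing_" + i + ".tmp"));

        var sw = Stopwatch.StartNew();
        foreach (string file in files)
        {
            try { File.Delete(file); }
            catch { }
        }
        Console.WriteLine("try/catch:   {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        foreach (string file in files)
        {
            if (File.Exists(file))
                File.Delete(file);
        }
        Console.WriteLine("File.Exists: {0} ms", sw.ElapsedMilliseconds);
    }
}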
Basically, is it better practice to store a value in a variable the first time through, or to keep looking the value up each time it's needed? The code will explain it better:
TextWriter tw = null;

if (!File.Exists(ConfigurationManager.AppSettings["LoggingFile"]))
{
    // ...
    tw = File.CreateText(ConfigurationManager.AppSettings["LoggingFile"]);
}
or
TextWriter tw = null;
string logFile = ConfigurationManager.AppSettings["LoggingFile"].ToString();

if (!File.Exists(logFile))
{
    // ...
    tw = File.CreateText(logFile);
}
Clarity is important, and DRY (don't repeat yourself) is important. This is a micro-abstraction: hiding a small, but still significant, piece of functionality behind a variable. The performance difference is negligible, but the positive impact on clarity can't be overstated. Use a well-named variable to hold the value once it's been acquired.
The 2nd solution is better for me because:
the dictionary lookup has a cost
it's more readable
Or you can have a singleton object with a private constructor that populates all the configuration data you need once.
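A minimal sketch of such a singleton (names are illustrative; the setting is read once, in the private constructor):

using System.Configuration;

public sealed class AppConfig
{
    public static AppConfig Instance { get; } = new AppConfig();

    // Read once, up front, in the private constructor.
    private AppConfig()
    {
        LoggingFile = ConfigurationManager.AppSettings["LoggingFile"];
    }

    public string LoggingFile { get; }
}

// Usage: string logFile = AppConfig.Instance.LoggingFile;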
Second one would be the best choice.
Imagine this situation: settings are updated by other threads, and since the setting value isn't locked, it changes to another value between your two reads.
In the first version, your execution can fail, or it will run fine but the code checked for a file with one name and later saves to a file that isn't the one it checked. That would be bad, wouldn't it?
Another benefit is that you're not retrieving the value twice. You get it once, and you use it wherever your code needs to read the setting.
I'm pretty sure the second one is more readable. But if you're talking about performance: don't optimize at an early stage and without a profiler.
I must agree with the others. Readability and DRY are important, and the cost of the variable is very low, considering that you are usually just holding a reference rather than actually storing the thing multiple times.
There might be exceptions with special or large objects. You must also keep in mind whether the value you cache might change in between, and whether or not you want your code to see the new value (most of the time you don't!). In your example, think about what might happen if ConfigurationManager.AppSettings["LoggingFile"] changes between the two calls (due to accessor logic, another thread, or the value always being re-read from a file on disk).
In summary: about 99% of the time you will want the second method / the cache!
IMO that would depend on what you are trying to cache. Caching a setting from App.config might not be as beneficial (apart from code readability) as caching the results of a web service call over a GPRS connection.