How to use TransferUtility to upload multiple files

I am trying to make sense of the documentation for:
TransferUtility.UploadDirectory
The documentation does not describe the error condition of the upload. Typically I would guess something like System.Net.Http.HttpRequestException.
After reading multiple comments, it seems that S3 does not support TransactionScope. The only thing that seems to be supported is at file level:
Are writes to Amazon S3 atomic (all-or-nothing)?
and
https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel
So my questions are:
Where can I find the error conditions of UploadDirectory?
Does it make sense to use UploadDirectory, since atomic operations are at the file (object) level?
My question is about uploading multiple files (i.e. S3 objects), not about doing a multipart upload of a single file.

AFAIK, the best you can do here is wrap it in a try catch:
try
{
...
}
catch (AmazonS3Exception e)
{
// implement rollback operation
...
}
catch (Exception e)
{
// no possible rollback operation, abort program ?
...
}
You can keep track of progress using an UploadDirectoryProgressEvent. In the event of an error, if you want to clean up, you'd have to compare the progress, note the diffs, and take action as appropriate (e.g. by removing objects if you don't want to keep them in S3 and you want the entire operation to be atomic).
Pay special attention to the fact that:
var request = new TransferUtilityUploadDirectoryRequest
{
UploadFilesConcurrently = true,
};
will have an impact on your rollback mechanism. Setting UploadFilesConcurrently to true implies that the UploadDirectoryProgressArgs received in UploadDirectoryProgressEvent has a null value for CurrentFile:
https://github.com/aws/aws-sdk-net/issues/317
In that case, rollback is only possible if you can delete the entire remote directory.
Note also the documentation on multi-part uploads:
If a multipart upload is interrupted, TransferUtility will attempt to abort the multipart upload. Under certain circumstances (network outage, power failure, etc.), TransferUtility will not be able to abort the multipart upload. In this case, in order to stop getting charged for the storage of uploaded parts, you should manually invoke TransferUtility.AbortMultipartUploads() to abort the incomplete multipart uploads.
The documentation has examples of both tracking and aborting multipart uploads.
As for your other question:
Does it make sense to use UploadDirectory, since atomic operations are at the file (object) level?
I'd say that depends. The code to upload an entire directory of files might be somewhat cleaner, but since you still possibly have to track progress and clean up, you might as well process the files one by one.
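A minimal sketch of the "one by one" approach with a best-effort rollback. To keep it self-contained it does not call the AWS SDK: uploadAsync and deleteAsync are hypothetical delegates standing in for whatever per-object upload and delete calls you use, and BatchUpload is an illustrative name, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BatchUpload
{
    // uploadAsync / deleteAsync are hypothetical stand-ins for real S3 calls
    // (a per-object upload and a per-object delete).
    public static async Task<bool> UploadAllOrNothingAsync(
        IEnumerable<string> keys,
        Func<string, Task> uploadAsync,
        Func<string, Task> deleteAsync)
    {
        var uploaded = new List<string>();
        try
        {
            foreach (var key in keys)
            {
                await uploadAsync(key);
                uploaded.Add(key); // recorded only after the upload succeeded
            }
            return true;
        }
        catch (Exception)
        {
            // Best-effort rollback: remove every object that was uploaded so far.
            foreach (var key in uploaded)
            {
                try { await deleteAsync(key); }
                catch (Exception) { /* object stays behind; log or retry elsewhere */ }
            }
            return false;
        }
    }
}
```

Note that this only removes objects created by the current run. If an upload overwrote an existing object, deleting it does not bring the previous content back.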

Related

.NET locking IO code

I have several Client objects (TCPClient wrappers) operating on separate threads. If any of these objects encounters a problem an error message is saved to an XML error log. Obviously file access is restricted to one process at a time so I need a way of preventing other threads from reading/writing while another is using it.
I'm currently using lock; however, an exception is still thrown saying that another process is using the file. I was under the impression that lock would manage waiting and retrying.
// Lock the XML IO for safety due to multi-threading
lock (this.xmlDoc) // Changed from this to the xmlDoc
{
// Attempt to load existing xml
try
{
this.xmlDoc.Load(this.logPath);
}
catch (FileNotFoundException e)
{
// xml file doesn't exist, create
this.xmlDoc.AppendChild(this.xmlDoc.CreateElement("root"));
}
// Get the doc root
XmlElement root = this.xmlDoc.DocumentElement;
// Create message entry
XmlElement msg = this.xmlDoc.CreateElement("message");
// Add <time></time> to msg
msg.AppendChild(this.xmlDoc.CreateElement("time")).InnerText = dt.ToString();
// Add <error></error> to msg
msg.AppendChild(this.xmlDoc.CreateElement("error")).InnerText = message;
// Add msg to root
root.AppendChild(msg);
// Save. Done.
this.xmlDoc.Save(this.logPath);
}

lock does not know what a file is. It operates on CLR objects, which exist per process. A different process will see a different object.
You need some cross-process mutual exclusion. Here are some options:
1. A named Mutex
2. File-level locking and a retry strategy (I suspect log4net does it that way)
3. Multiple files with different names (use random names so you don't get collisions)
Option 3 is the easiest.
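For comparison, the named Mutex option could look roughly like the sketch below. The mutex name "MyApp.ErrorLog", the 5-second timeout, and the CrossProcessLog class are assumptions made for illustration; the sketch appends a plain text line, but the XML load/modify/save from the question would go in the same critical section.

```csharp
using System;
using System.IO;
using System.Threading;

public static class CrossProcessLog
{
    // Hypothetical name; every cooperating process must use the same one.
    private const string MutexName = "MyApp.ErrorLog";

    public static void Append(string path, string line)
    {
        using (var mutex = new Mutex(false, MutexName))
        {
            bool owned = false;
            try
            {
                try
                {
                    // Wait until no other thread or process holds the mutex.
                    owned = mutex.WaitOne(TimeSpan.FromSeconds(5));
                }
                catch (AbandonedMutexException)
                {
                    // The previous owner exited without releasing; we own it now.
                    owned = true;
                }
                if (!owned)
                    throw new TimeoutException("Could not acquire the log mutex.");

                // Critical section: only one holder touches the file at a time.
                File.AppendAllText(path, line + Environment.NewLine);
            }
            finally
            {
                if (owned) mutex.ReleaseMutex();
            }
        }
    }
}
```

A named mutex is owned by a thread, so it must be released on the same thread that acquired it; that is why the sketch stays synchronous.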

If all your threads are coming from the same process:
Use System.Diagnostics.Trace. It's built in to the framework, is thread-safe, and you can use a config file to define a trace listener.
Construct an XML string and then call Trace.WriteLine().
If your threads are coming from different processes, each process will need to maintain its own log file, as @usr mentions.
As a side note, you could log errors to the windows event viewer if the volume is low enough to avoid file locking issues.

Upload file still in use sometimes when trying to move

After an image is uploaded to my server, my code moves it into a specific folder given by the user details. I think it sometimes tries to move the file too fast, or the uploaded file is still in use, so 9 times out of 10 the function won't perform the move.
Is there a way to add a 'wait' or a way to check if a file is in use and possibly perform a while loop until the file is allowed to be moved?
Current move function in my controller:
while (!File.Exists(uploadedPath))
{
}
File.Move(uploadedPath, savePath);
PS. I intend to add in a counter to ensure the while loop doesn't get stuck and has a timeout.

If you have control over the code receiving the file, I would update it to notify the moving code when the file is received completely. Alternatively I would move the file from there or even save the file where it should be eventually.
Otherwise, it will be a hack. You need to:
1. Try to move the file.
2. Catch the exception if it doesn't move.
3. Use Thread.Sleep to wait a few seconds.
4. Go to step 1.
Something along these lines:
bool success = false;
for (var count = 0; !success && count < 10; ++count)
{
try
{
File.Move(uploadedPath, savePath);
success = true;
}
catch (IOException)
{
Thread.Sleep(1000);
}
}
You also need to handle the situation when it cannot move the file at all. So it is a hack and should not be done in general if there are other ways to notify the moving code.
Also note:
From File.Move msdn:
If you try to move a file across disk volumes and that file is in use,
the file is copied to the destination, but it is not deleted from the
source.
which means that your file will remain in the received files directory after moving.

Are UploadFile and MoveFile two different components that are independent of each other? If so, I don't think it's a good architecture. I would recommend having UploadFile pass control to MoveFile once its part is done. This way you can avoid multiple processes trying to access the same file.

network sessions and sending files

Background
Hi.
I am writing a program that analyzes packets for specific words contained in them. I need to analyze outgoing email, Jabber and ICQ traffic. If the words are found, the packet is blocked. I got that working, but I have a problem with files and with email sent through the web.
Problems
Simple code:
while (Ndisapi.ReadPacket(hNdisapi, ref Request))
{
// some work
switch (protocol)
{
//....
case "HTTP":
// parse packet(byte[])
HTTP.HttpField field = HTTP.ParseHttp(ret);
if (field != null && field.Method == HTTP.HttpMethod.POST)
{
// analyze packet and drop if needed
DoWork();
}
break;
}
}
The problem is the following. For example, I attach a 500 KB file to an email. The file will be split into approximately 340 packets. In the code above, DoWork() will be executed only for the first packet.
OK, so I need to reassemble the complete session and pass the whole session to DoWork(). I did that. But I can't wait until the session is finished, because other packets (HTTP, ARP, all packets) will be held up (and after a couple of minutes the Internet connection drops).
Therefore, the first question:
How can I solve this problem (perhaps with advice on the program design)?
Now the email, suppose this code:
switch (protocol)
{
//....
case "HTTP":
// parse packet(byte[])
var httpMimeMessage = Mime.Parse(ret);
// analyze packet and drop if needed
DoSomeWork();
break;
}
For example, we are looking for the word "Finance". Then, if we open any website that contains the word "finance", the packet is blocked.
Second question: How do I determine that a packet belongs to an e-mail?
Thanks and sorry for my English.

To be able to analyze more than one packet/stream at the same time, you'll need to refactor your solution to use threading or some other form of multitasking and since your task appears to be both compute and io-intensive, you'll probably want to take a hard look at how to leverage event-handling at the operating system level (select, epoll, or the equivalent for your target platform).
And to answer your second question regarding email, you'll need to be able to identify and track the tcp session used to deliver email messages from client to server, assuming the session hasn't been encrypted.
As I'm sure you already know, the problem you're trying to solve is a very complicated one, requiring very specialized skills like realtime programming, deep knowledge of networking protocols, etc.
Of course, there are several "deep packet inspection" solutions out there already that do all of this for you, (typically used by public companies to fulfill regulatory requirements like Sarbanes-Oxley), but they are quite expensive.
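As a rough illustration of tracking a TCP session, the sketch below accumulates payload bytes per connection and checks the reassembled text for a keyword. Everything in it is an assumption for illustration: the SessionTracker class, the use of the address/port 4-tuple as the key, and the simplifications that packets arrive in order, without retransmissions, and unencrypted. Real TCP reassembly has to handle sequence numbers as well.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class SessionTracker
{
    // Key: (source address, source port, destination address, destination port).
    private readonly Dictionary<(string, int, string, int), MemoryStream> _sessions =
        new Dictionary<(string, int, string, int), MemoryStream>();

    // Appends a payload to its session and reports whether the bytes collected
    // so far contain the keyword (ASCII view, case-insensitive).
    public bool AppendAndCheck((string, int, string, int) flow, byte[] payload, string keyword)
    {
        if (!_sessions.TryGetValue(flow, out MemoryStream buffer))
        {
            buffer = new MemoryStream();
            _sessions[flow] = buffer;
        }
        buffer.Write(payload, 0, payload.Length);

        // Re-decoding the whole buffer is simple but quadratic; fine for a sketch.
        string text = Encoding.ASCII.GetString(buffer.GetBuffer(), 0, (int)buffer.Length);
        return text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Drops the buffered data once the connection is closed.
    public void Close((string, int, string, int) flow)
    {
        if (_sessions.TryGetValue(flow, out MemoryStream buffer))
        {
            _sessions.Remove(flow);
            buffer.Dispose();
        }
    }
}
```

Keying on the destination port also gives a crude way to tell mail traffic apart from web browsing (for example SMTP on port 25), although webmail sent over HTTP would still need inspection of the request itself.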

.NET Denying Access to Directories After Copy Operations

I'm performing a "safe" copy of a directory over another directory as follows:
Given the source C:\Source and target C:\Target
Copy C:\Source to C:\Target-incoming
Move C:\Target (if it exists) to C:\Target-outgoing
Move C:\Target-incoming to C:\Target
Delete C:\Target-outgoing (if it exists)
If any of the first three steps fail, I'll attempt to put things back as they were to prevent data loss.
However, the move of C:\Target-incoming to C:\Target fails with "Access to the path C:\Target-incoming is denied" most of the time.
At the moment, inserting Thread.Sleep(100) just before the move operation fixes the problem for me. However, waiting .1 of a second seems ridiculous to me. Thread.Sleep(10) isn't enough to fix it. I also have the sinking feeling that the value I have to wait depends on the speed of disk IO.
So, my questions:
Can I prevent this from happening?
If not, is there a way of finding out when the lock on the directory is released after copying it?
Edit: For clarity, I'm doing all these operations in one method on one thread, and I'm just using Thread.Sleep() to pause code flow for a moment. The moves and copies are being done with standard .NET Directory.Move(), Directory.CreateDirectory() and File.CopyTo() methods. It would appear that the .NET methods are returning before the locks on the respective files are being released, causing the necessity to wait an amount of time before continuing.
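For reference, the four steps described above can be sketched as follows. This is an illustrative reconstruction, not the original code: SafeDirectoryCopy, Replace and CopyRecursive are hypothetical names, and the rollback only covers a failure of step 3.

```csharp
using System.IO;

public static class SafeDirectoryCopy
{
    public static void Replace(string source, string target)
    {
        string incoming = target + "-incoming";
        string outgoing = target + "-outgoing";

        CopyRecursive(source, incoming);                  // step 1: copy to -incoming
        bool hadTarget = Directory.Exists(target);
        if (hadTarget) Directory.Move(target, outgoing);  // step 2: park the old target
        try
        {
            Directory.Move(incoming, target);             // step 3: promote -incoming
        }
        catch
        {
            // Put the old target back so no data is lost, then rethrow.
            if (hadTarget) Directory.Move(outgoing, target);
            throw;
        }
        if (hadTarget) Directory.Delete(outgoing, true);  // step 4: drop the old copy
    }

    // Copies all files and subdirectories of 'from' into a new directory 'to'.
    private static void CopyRecursive(string from, string to)
    {
        Directory.CreateDirectory(to);
        foreach (string file in Directory.GetFiles(from))
            File.Copy(file, Path.Combine(to, Path.GetFileName(file)));
        foreach (string dir in Directory.GetDirectories(from))
            CopyRecursive(dir, Path.Combine(to, Path.GetFileName(dir)));
    }
}
```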

What is probably happening is that your thread is trying to move C:\Target-incoming to C:\Target while the move of C:\Target to C:\Target-outgoing has not finished yet.
This theory is supported by the fact that your process succeeds after a short Thread.Sleep.
Try to chain your steps, i.e. divide each step into its own method and call the methods one after the other (synchronizing the start of each method with the end of the previous one).
There are various ways to do that (among others, synchronizing, locking, or chaining a separate thread per step).
You could look into thread synchronization in .NET.
But of course, this is not the only possible cause for your problem.

After a bunch of testing, it seems like the very act of trying to move a locked folder gets the OS to hurry up and release the lock, even if the first attempt fails.
I wrote this extension method to DirectoryInfo:
public static void TryToMoveTo(this DirectoryInfo o, string targetPath) {
int attemptsRemaining = 5;
while (true) {
try {
o.MoveTo(targetPath);
break;
} catch (Exception) {
if (attemptsRemaining == 0) {
throw;
} else {
attemptsRemaining--;
System.Threading.Thread.Sleep(10);
}
}
}
}
While debugging the original problem, I settled on waiting for 100ms as anything less seemed to cause exceptions (I tried 10, 25, 50, 75 and 100ms). However, in the method above I wait 10ms before retrying, and I never, ever got more than one exception thrown in each of my hundreds of test runs.

You can always try waiting in a loop, up to a maximum number of tries. You can check whether the directory is locked by calling CreateFile and checking its return code. Be sure to read through the "flags" section of the docs, because you need to pass in a special flag to open a directory.
Someone else mentioned in a comment that you may want to try Transactional NTFS. If you can, you might want to try that.

Check whether the source and target directories exist before copying or moving, using System.IO.Directory.Exists.
The access denied error is caused by either the source or the target not being found.

How to check if a file is in use?

Is there any way to first test if a file is in use before attempting to open it for reading? For example, this block of code will throw an exception if the file is still being written to or is considered in use:
try
{
FileStream stream = new FileStream(fullPath, FileMode.Open, FileAccess.Read, FileShare.Read);
}
catch (IOException ex)
{
// ex.Message == "The process cannot access the file 'XYZ' because it is being used by another process."
}
I've looked all around and the best I can find is to perform some sort of polling with a try catch inside, and that feels so hacky. I would expect there to be something on System.IO.FileInfo but there isn't.
Any ideas on a better way?

"You can call the LockFile API function through the P/Invoke layer directly. You would use the handle returned by the SafeFileHandle property on the FileStream.
Calling the API directly will allow you to check the return value for an error condition as opposed to resorting to catching an exception."
"The try/catch block is the CORRECT solution (though you want to catch IOException, not all exceptions). There's no way you can properly synchronize, because testing the lock + acquiring the lock is not an atomic operation."
"Remember, the file system is volatile: just because your file is in one state for one operation doesn't mean it will be in the same state for the next operation. You have to be able to handle exceptions from the file system."
Using C# is it possible to test if a lock is held on a file
http://www.dotnet247.com/247reference/msgs/32/162678.aspx

Well a function that would try and do it would simply try catch in a loop. Just like with databases, the best way to find out IF you can do something is to try and do it. If it fails, deal with it. Unless your threading code is off, there is no reason that your program shouldn't be able to open a file unless the user has it open in another program.
Unless of course you're doing interesting things.
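A minimal sketch of the "try it in a loop" approach from the answers above. FileOpener and OpenWithRetry are hypothetical names, and the attempt count and delay are left to the caller. The last failure is allowed to propagate so the caller can still handle it.

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileOpener
{
    // Tries to open the file for reading, retrying while it is in use.
    public static FileStream OpenWithRetry(string path, int attempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
            }
            // Retry only on I/O errors that may clear up (such as a sharing
            // violation); a missing file or the final attempt is rethrown.
            catch (IOException ex) when (attempt < attempts && !(ex is FileNotFoundException))
            {
                Thread.Sleep(delay);
            }
        }
    }
}
```

Because the stream is returned already open, there is no gap between "checking" and "using" the file, which is the race the quoted answer warns about.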
