Handle the download of several tens of thousands files c#

I'm making a small program that downloads several tens of thousands of files.
It's not efficient at all for now because I download the files one at a time, so it's very slow. Many of the files are also smaller than 100 KB.
Do you have any ideas for improving the download speed?
/*******************************
Worker work
/********************************/
private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
{
listCount = _downloadList.Count;
// no GUI method !
while (TotalDownloadFile < _downloadList.Count)
{
// handle closing form during download
if (_worker.CancellationPending)
{
_mainView = null;
_wc.CancelAsync();
e.Cancel = true;
}
else if (!DownloadInProgress && TotalDownloadFile < listCount)
{
_lv = new launcherVersion(_downloadList[TotalDownloadFile]);
var fileToDownloadPath = Info.getDownloadUrl() + _lv.Path;
var saveFileToPath = Path.GetFullPath("./") + _lv.Path;
if (Tools.IsFileExist(saveFileToPath))
File.Delete(saveFileToPath); // remove the file if it already exists
else
// create the directory that will hold the file (this call does nothing if the directory already exists)
Directory.CreateDirectory(Path.GetDirectoryName(saveFileToPath));
StartDownload(fileToDownloadPath, saveFileToPath);
UpdateRemaingFile();
_currentFile = TotalDownloadFile;
}
}
}
Start Download Function
/*******************************
start the download of files
/********************************/
public void StartDownload(string fileToDownloadLink, string pathToSaveFile)
{
try
{
using (_wc = new WebClient())
{
_wc.DownloadProgressChanged += client_DownloadProgressChanged;
_wc.DownloadFileCompleted += client_DownloadFileCompleted;
_wc.DownloadFileAsync(new Uri(fileToDownloadLink), pathToSaveFile);
DownloadInProgress = true;
}
}
catch (WebException e)
{
MessageBox.Show(fileToDownloadLink);
MessageBox.Show(e.ToString());
_worker.CancelAsync();
Application.Exit();
}
}

Expanding upon my comment: you could use multi-threading and concurrency to download entire batches at once. You'd have to put some thought into ensuring that each thread completes successfully and that files don't get downloaded twice. You would also have to protect your shared lists with something like lock.
I would personally implement three separate lists: ReadyToDownload, DownloadInProgress, and DownloadComplete.
ReadyToDownload would contain all objects that need to be downloaded. DownloadInProgress would contain both the item being downloaded and the Task handling the download. DownloadComplete would hold all objects that were downloaded and reference the Task that performed the download.
Each Task would probably work better as an instance of a custom object. That object would take a reference to each of the lists and would update them once its work either completes or fails. In the event of a failure, you could either add a fourth list to hold the failed items or reinsert them into the ReadyToDownload list.
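The three lists described above can be sketched with the thread-safe collections in System.Collections.Concurrent, which take the place of explicit lock statements. This is only a minimal sketch under stated assumptions: BatchDownloader, RunAsync, maxParallel and the DownloadFailed list are hypothetical names, and the downloadAsync delegate stands in for whatever call actually fetches one file.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: names are illustrative, not from the original post.
public sealed class BatchDownloader
{
    public ConcurrentQueue<string> ReadyToDownload { get; }
    public ConcurrentDictionary<string, Task> DownloadInProgress { get; } =
        new ConcurrentDictionary<string, Task>();
    public ConcurrentBag<string> DownloadComplete { get; } = new ConcurrentBag<string>();
    public ConcurrentBag<string> DownloadFailed { get; } = new ConcurrentBag<string>();

    private readonly Func<string, Task> _downloadAsync;

    // downloadAsync is an assumption: it performs the download of one URL.
    public BatchDownloader(IEnumerable<string> urls, Func<string, Task> downloadAsync)
    {
        ReadyToDownload = new ConcurrentQueue<string>(urls);
        _downloadAsync = downloadAsync;
    }

    // Starts maxParallel workers; each one pulls items until the queue is empty.
    public Task RunAsync(int maxParallel)
    {
        var workers = Enumerable.Range(0, maxParallel).Select(_ => WorkerAsync());
        return Task.WhenAll(workers);
    }

    private async Task WorkerAsync()
    {
        // TryDequeue hands each URL to exactly one worker, so nothing is downloaded twice.
        while (ReadyToDownload.TryDequeue(out var url))
        {
            try
            {
                var task = _downloadAsync(url);
                DownloadInProgress[url] = task;
                await task.ConfigureAwait(false);
                DownloadComplete.Add(url);
            }
            catch (Exception)
            {
                // Failed items go to a fourth list; they could be re-enqueued instead.
                DownloadFailed.Add(url);
            }
            finally
            {
                DownloadInProgress.TryRemove(url, out _);
            }
        }
    }
}
```

With this shape the delegate could wrap a single shared HttpClient, and maxParallel is something to tune by measurement; a small value such as 4 to 8 is only a starting guess, not a figure from the answer above.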

Related

C# Calculating download speed correctly when downloading with multiple threads

I'm trying to show the user the correct download speed, but my app downloads a bunch of small files at the same time, and to speed things up I'm using Parallel.ForEach. However, I can't calculate the download speed correctly. My current code basically calculates the average download speed, not the current one, because it updates the UI only when a download completes. With a normal foreach I can calculate it correctly, but then the download is slow. How can I correctly show the download speed (MB/s) with multiple threads and multiple files?
Note: This app is WPF, but I hardly used any MVVM. This is my first time using WPF; at the moment I'm just trying to make something good-looking that functions decently.
Download Function
var stopwatch = new Stopwatch();
stopwatch.Start();
DownloadController.stopwatch.Start();
DownloadController.IsDownloadStarted = true;
DownloadController.IsDownloadInProgress = true;
Parallel.ForEach(downloadList, new ParallelOptions { MaxDegreeOfParallelism = 5 }, file =>
{
try
{
DownloadController.LastDownloadingFileName = file.FileName;
GET_DownloadFile(file.FileName, file.LastUpdate.UnixTimeStampToDateTime()).GetAwaiter().GetResult();
logger.Info("Download", file.FileName, "Downloading file completed");
}
catch (Exception ex)
{
lock (_failedDownloads)
{
_failedDownloads.Add(file);
}
logger.Exception(ex, "Download", file.FileName, file.LastUpdate, file.Size, $"Failed to download file");
}
});
Progress Changed Event
public static void DownloadProgressChangedEvent(object sender, DownloadProgressChangedEventArgs e)
{
MainWindow._dispatcher.BeginInvoke(new Action(() =>
{
ButtonProgressAssist.SetValue(MainWindow.This.Prog_Downloading, ProgressValue);
ButtonController.ButtonPlay_Downloading();
if (e.ProgressPercentage == 100)
{
DownloadedSize += e.TotalBytesToReceive;
var downloadSpeed = string.Format("{0} ", (DownloadedSize / 1024.0 / 1024.0 / stopwatch.Elapsed.TotalSeconds).ToString("0.0"));
var text1 = $"({ProgressValue}% - {DownloadedFileCount}/{TotalFileUpdateCount}) # {downloadSpeed}MB/s {EasFile.GetFileNameWithExtension(LastDownloadingFileName)} ";
MainWindow.This.DownloadTextBlock.Text = text1;
}
}));
}
ProgressCompletedEvent
public static void DownloadProgressCompletedEvent(object? sender, AsyncCompletedEventArgs e)
{
if (!e.Cancelled)
{
DownloadedFileCount++;
}
}
I tried to use PerformanceCounter to watch my app's current network usage, but it only shows me the total usage on a specific network.

You have two choices here:
The first is to create a class that handles the download of a single file and counts the bytes it has downloaded. One possible solution is a member that accumulates the received bytes and is cleared every second. Before clearing it, the class reports the value to the main class, which keeps a similar member that aggregates the values for all files currently downloading.
The other solution, instead of having such a class, is to report the received bytes and the time on every progress event. The main class keeps a list that stores those pairs. Then, every second, you take only the records from the last second and sum the received bytes.
P.S. Access to these members must be thread-safe in both approaches.
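A minimal sketch of the first approach, assuming a single shared counter that every download adds to and that a once-per-second timer reads and clears. SpeedMeter, Report and SampleMegabytesPerSecond are hypothetical names. Note that DownloadProgressChangedEventArgs.BytesReceived is cumulative for a file, so the caller would need to report only the difference since the previous event.

```csharp
using System;
using System.Threading;

// Hypothetical helper: every download reports its newly received bytes here.
public sealed class SpeedMeter
{
    private long _bytesSinceLastSample;

    // Safe to call from any download thread.
    public void Report(long bytesReceived) =>
        Interlocked.Add(ref _bytesSinceLastSample, bytesReceived);

    // Reads and clears the counter in one atomic step, then converts to MB/s.
    // sinceLastSample is the time elapsed since the previous call (e.g. one second).
    public double SampleMegabytesPerSecond(TimeSpan sinceLastSample)
    {
        long bytes = Interlocked.Exchange(ref _bytesSinceLastSample, 0);
        return bytes / 1024.0 / 1024.0 / sinceLastSample.TotalSeconds;
    }
}
```

Because the counter is cleared on every sample, the displayed value reflects the last interval only rather than the average since the start, which is the behaviour the question asks for.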

Tell if a file is written to a networkshare completely [duplicate]

When a file is created (FileSystemWatcher_Created) in one directory, I copy it to another. But when I create a big (>10MB) file, the copy fails, because it starts before the file has finished being written.
This raises the error "Cannot copy the file, because it's used by another process". ;(
Any help?
class Program
{
static void Main(string[] args)
{
string path = @"D:\levan\FolderListenerTest\ListenedFolder";
FileSystemWatcher listener;
listener = new FileSystemWatcher(path);
listener.Created += new FileSystemEventHandler(listener_Created);
listener.EnableRaisingEvents = true;
while (Console.ReadLine() != "exit") ;
}
public static void listener_Created(object sender, FileSystemEventArgs e)
{
Console.WriteLine
(
"File Created:\n"
+ "ChangeType: " + e.ChangeType
+ "\nName: " + e.Name
+ "\nFullPath: " + e.FullPath
);
File.Copy(e.FullPath, @"D:\levan\FolderListenerTest\CopiedFilesFolder\" + e.Name);
Console.Read();
}
}

There is only a workaround for the issue you are facing.
Check whether the file is in use before starting the copy. You can call the following function until it returns false.
1st Method, copied directly from this answer:
private bool IsFileLocked(FileInfo file)
{
FileStream stream = null;
try
{
stream = file.Open(FileMode.Open, FileAccess.ReadWrite, FileShare.None);
}
catch (IOException)
{
//the file is unavailable because it is:
//still being written to
//or being processed by another thread
//or does not exist (has already been processed)
return true;
}
finally
{
if (stream != null)
stream.Close();
}
//file is not locked
return false;
}
2nd Method:
const int ERROR_SHARING_VIOLATION = 32;
const int ERROR_LOCK_VIOLATION = 33;
private bool IsFileLocked(string file)
{
//check that problem is not in destination file
if (File.Exists(file) == true)
{
FileStream stream = null;
try
{
stream = File.Open(file, FileMode.Open, FileAccess.ReadWrite, FileShare.None);
}
catch (Exception ex2)
{
//_log.WriteLog(ex2, "Error in checking whether file is locked " + file);
int errorCode = Marshal.GetHRForException(ex2) & ((1 << 16) - 1);
if ((ex2 is IOException) && (errorCode == ERROR_SHARING_VIOLATION || errorCode == ERROR_LOCK_VIOLATION))
{
return true;
}
}
finally
{
if (stream != null)
stream.Close();
}
}
return false;
}

From the documentation for FileSystemWatcher:
The OnCreated event is raised as soon as a file is created. If a file
is being copied or transferred into a watched directory, the
OnCreated event will be raised immediately, followed by one or more
OnChanged events.
So, if the copy fails, (catch the exception), add it to a list of files that still need to be moved, and attempt the copy during the OnChanged event. Eventually, it should work.
Something like (incomplete; catch specific exceptions, initialize variables, etc):
public static void listener_Created(object sender, FileSystemEventArgs e)
{
Console.WriteLine
(
"File Created:\n"
+ "ChangeType: " + e.ChangeType
+ "\nName: " + e.Name
+ "\nFullPath: " + e.FullPath
);
try {
File.Copy(e.FullPath, @"D:\levani\FolderListenerTest\CopiedFilesFolder\" + e.Name);
}
catch {
_waitingForClose.Add(e.FullPath);
}
Console.Read();
}
public static void listener_Changed(object sender, FileSystemEventArgs e)
{
if (_waitingForClose.Contains(e.FullPath))
{
try {
File.Copy(...);
_waitingForClose.Remove(e.FullPath);
}
catch {}
}
}

It's an old thread, but I'll add some info for other people.
I experienced a similar issue with a program that writes PDF files. Sometimes they take 30 seconds to render, which is the same period that my watcher_FileCreated class waits before copying the file.
The files were not locked.
In this case I checked the size of the PDF, waited 2 seconds, and compared it with the new size; if the two were unequal, the thread would sleep for 30 seconds and try again.

You're actually in luck - the program writing the file locks it, so you can't open it. If it hadn't locked it, you would have copied a partial file, without having any idea there's a problem.
When you can't access a file, you can assume it's still in use (better yet - try to open it in exclusive mode, and see if someone else is currently opening it, instead of guessing from the failure of File.Copy). If the file is locked, you'll have to copy it at some other time. If it's not locked, you can copy it (there's slight potential for a race condition here).
When is that 'other time'? I don't remember when FileSystemWatcher sends multiple events per file. Check it out; it might be enough for you to simply ignore the event and wait for another one. If not, you can always set up a timer and recheck the file in 5 seconds.

Well you already give the answer yourself; you have to wait for the creation of the file to finish. One way to do this is via checking if the file is still in use. An example of this can be found here: Is there a way to check if a file is in use?
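Note that you will have to modify this code for it to work in your situation. You might want to have something like the following sketch, where CheckFileInUse and CopyFile are placeholders for your own methods:
public static void listener_Created(object sender, FileSystemEventArgs e)
{
// poll until the file is no longer in use
while (CheckFileInUse(e.FullPath))
Thread.Sleep(1000); // wait 1000 milliseconds
CopyFile(e.FullPath);
}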
Obviously you should protect yourself from an infinite while just in case the owner application never releases the lock. Also, it might be worth checking out the other events from FileSystemWatcher you can subscribe to. There might be an event which you can use to circumvent this whole problem.

When the file is written in binary mode (byte by byte), the FileStream-based solutions above do not work, because the file is available (not locked) after each byte is written. In this situation you need another workaround, like this:
Do this when the file is created or when you want to start processing it:
long fileSize = 0;
var currentFile = new FileInfo(path);
while (fileSize < currentFile.Length)//check size is stable or increased
{
fileSize = currentFile.Length;//get current size
System.Threading.Thread.Sleep(500);//wait a moment for processing copy
currentFile.Refresh();//refresh length value
}
//Now file is ready for any process!
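The loop above has no upper bound if the writer keeps appending, and it exits immediately for a file that is still at 0 bytes. A hedged variant with a timeout could look like the sketch below. FileSizeWatcher and WaitUntilSizeStable are hypothetical names, and a stable size remains only a heuristic: a writer that pauses longer than the poll interval will still look finished.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper: polls the file size until it stops changing.
public static class FileSizeWatcher
{
    // Returns true once two consecutive size readings match; false on timeout.
    public static bool WaitUntilSizeStable(string path, TimeSpan pollInterval, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        var info = new FileInfo(path);
        long lastSize = -1; // -1 means "no reading yet" or "file missing"

        while (DateTime.UtcNow < deadline)
        {
            info.Refresh(); // re-read Exists and Length from disk
            if (info.Exists && info.Length == lastSize)
                return true;

            lastSize = info.Exists ? info.Length : -1;
            Thread.Sleep(pollInterval);
        }
        return false;
    }
}
```

A caller might use something like WaitUntilSizeStable(e.FullPath, TimeSpan.FromMilliseconds(500), TimeSpan.FromMinutes(1)) and still wrap the subsequent copy in a try/catch.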

So, having glanced quickly through some of these and other similar questions I went on a merry goose chase this afternoon trying to solve a problem with two separate programs using a file as a synchronization (and also file save) method. A bit of an unusual situation, but it definitely highlighted for me the problems with the 'check if the file is locked, then open it if it's not' approach.
The problem is this: the file can become locked between the time that you check it and the time you actually open it. It's also really hard to track down the sporadic "Cannot copy the file, because it's used by another process" error if you aren't looking for it.
The basic resolution is to just try to open the file inside a try/catch block so that if it's locked, you can try again. That way there is no elapsed time between the check and the opening; the OS does them at the same time.
The code here uses File.Copy, but it works just as well with any of the static methods of the File class: File.Open, File.ReadAllText, File.WriteAllText, etc.
/// <param name="timeout">how long to keep trying in milliseconds</param>
static void safeCopy(string src, string dst, int timeout)
{
while (timeout > 0)
{
try
{
File.Copy(src, dst);
//don't forget to either return from the function or break out of the while loop
break;
}
catch (IOException)
{
//you could do the sleep in here, but it's probably a good idea to exit the error handler as soon as possible
}
Thread.Sleep(100);
//if it's a very long wait this will accumulate very small errors.
//For most things it's probably fine, but if you need precision over a long time span, consider
// using some sort of timer or DateTime.Now as a better alternative
timeout -= 100;
}
}
Another small note on parallelism:
This is a synchronous method, which will block its thread both while waiting and while copying. This is the simplest approach, but if the file remains locked for a long time your program may become unresponsive. Parallelism is too big a topic to go into in depth here (and the number of ways you could set up asynchronous read/write is kind of preposterous), but here is one way it could be parallelized.
public class FileEx
{
// returns Task (not void) so that callers can await it; doWhenDone is optional
public static async Task CopyWaitAsync(string src, string dst, int timeout, Action doWhenDone = null)
{
while (timeout > 0)
{
try
{
File.Copy(src, dst);
doWhenDone?.Invoke();
break;
}
catch (IOException) { }
await Task.Delay(100);
timeout -= 100;
}
}
public static async Task<string> ReadAllTextWaitAsync(string filePath, int timeout)
{
while (timeout > 0)
{
try {
return File.ReadAllText(filePath);
}
catch (IOException) { }
await Task.Delay(100);
timeout -= 100;
}
return "";
}
// returns Task (not void) so that callers can await it
public static async Task WriteAllTextWaitAsync(string filePath, string contents, int timeout)
{
while (timeout > 0)
{
try
{
File.WriteAllText(filePath, contents);
return;
}
catch (IOException) { }
await Task.Delay(100);
timeout -= 100;
}
}
}
And here is how it could be used:
public static void Main()
{
test_FileEx();
Console.WriteLine("Me First!");
}
public static async void test_FileEx()
{
await Task.Delay(1);
//you can do this, but it gives a compiler warning because it can potentially return immediately without finishing the copy
//As a side note, if the file is not locked this will not return until the copy operation completes. Async functions run synchronously
//until the first 'await'. See the documentation for async: https://msdn.microsoft.com/en-us/library/hh156513.aspx
FileEx.CopyWaitAsync("file1.txt", "file1.bat", 1000);
//this is the normal way of using this kind of async function. Execution of the following lines will always occur AFTER the copy finishes
await FileEx.CopyWaitAsync("file1.txt", "file1.readme", 1000);
Console.WriteLine("file1.txt copied to file1.readme");
//The following line doesn't cause a compiler error, but it doesn't make any sense either.
FileEx.ReadAllTextWaitAsync("file1.readme", 1000);
//To get the return value of the function, you have to use this function with the await keyword
string text = await FileEx.ReadAllTextWaitAsync("file1.readme", 1000);
Console.WriteLine("file1.readme says: " + text);
}
//Output:
//Me First!
//file1.txt copied to file1.readme
//file1.readme says: Text to be duplicated!

You can use the following code to check if the file can be opened with exclusive access (that is, it is not opened by another application). If the file isn't closed, you could wait a few moments and check again until the file is closed and you can safely copy it.
You should still check if File.Copy fails, because another application may open the file between the moment you check the file and the moment you copy it.
public static bool IsFileClosed(string filename)
{
try
{
using (var inputStream = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.None))
{
return true;
}
}
catch (IOException)
{
return false;
}
}

I would like to add an answer here, because this worked for me. I used time delays, while loops, everything I could think of.
I had the Windows Explorer window of the output folder open. I closed it, and everything worked like a charm.
I hope this helps someone.

uwp c# async method waiting data not completely loaded exception

I will try to describe my problem in as simple words as possible.
In my UWP app, I am loading the data asynchronously in my Mainpage.xaml.cs:
public MainPage()
{
this.InitializeComponent();
LoadVideoLibrary();
}
private async void LoadVideoLibrary()
{
FoldersData = new List<FolderData>();
var folders = (await Windows.Storage.StorageLibrary.GetLibraryAsync
(Windows.Storage.KnownLibraryId.Videos)).Folders;
foreach (var folder in folders)
{
var files = (await folder.GetFilesAsync(Windows.Storage.Search.CommonFileQuery.OrderByDate)).ToList();
FoldersData.Add(new FolderData { files = files, foldername = folder.DisplayName, folderid = folder.FolderRelativeId });
}
}
So this is the code where I am loading up a List of FolderData objects.
In my other page, Library.xaml.cs, I am using that data to populate my GridView through data binding.
protected override void OnNavigatedTo(NavigationEventArgs e)
{
try
{
LoadLibraryMenuGrid();
}
catch { }
}
private async void LoadLibraryMenuGrid()
{
MenuGridItems = new ObservableCollection<MenuItemModel>();
var data = MainPage.FoldersData;
foreach (var folder in data)
{
var image = new BitmapImage();
if (folder.files.Count == 0)
{
image.UriSource = new Uri("ms-appx:///Assets/StoreLogo.png");
}
else
{
for (int i = 0; i < folder.files.Count; i++)
{
var thumb = (await folder.files[i].GetThumbnailAsync(Windows.Storage.FileProperties.ThumbnailMode.VideosView));
if (thumb != null) { await image.SetSourceAsync(thumb); break; }
}
}
MenuGridItems.Add(new MenuItemModel
{
numberofvideos = folder.files.Count.ToString(),
folder = folder.foldername,
folderid = folder.folderid,
image = image
});
}
GridHeader = "Library";
}
The problem I am facing is that when I launch my application, wait a few seconds, and then navigate to my library page, all the data loads properly.
But when I navigate to the library page immediately after launching the app, it throws an exception saying
"collection was modified so it cannot be iterated"
Using a breakpoint, I found out that if I give it a few seconds, the List of FolderData has already been loaded asynchronously; but if I don't, the async method is only halfway through loading the data, which causes the exception. How can I handle this async situation? Thanks.

What you need is a way to wait for data to arrive. How you fit that in with the rest of the application (e.g. MVVM or not) is a different story, and not important right now. Don't overcomplicate things. For example, you only need an ObservableCollection if you expect the data to change while the user is looking at it.
Anyway, you need to wait. So how do you wait for that data to arrive?
Use a static class that can be reached from everywhere. In there put a method to get your data. Make sure it returns a task that you cache for future calls. For example:
internal class Data { /* whatever */ }
internal static class DataLoader
{
private static Task<Data> loaderTask;
public static Task<Data> LoadDataAsync(bool refresh = false)
{
if (refresh || loaderTask == null)
{
loaderTask = LoadDataCoreAsync();
}
return loaderTask;
}
private static async Task<Data> LoadDataCoreAsync()
{
// your actual logic goes here
}
}
With this, you can start the download as soon as you start the application.
await DataLoader.LoadDataAsync();
When you need the data in that other screen, just call that method again. It will not download the data again (unless you set refresh to true), but will simply wait for the work that you started earlier to finish, if it is not finished yet.

I get that you don't have much experience yet. There are multiple issues with the way you are loading the data, and no quick fix.
What you need is a service that can give you an ObservableCollection of FolderData. I think MVVM might be out of scope at this point unless you are willing to spend a few hours on it, though MVVM would make things a lot easier here.
The main issue at hand is this:
You are using foreach to iterate the folders and the FolderData list. foreach cannot continue if the underlying collection changes.
First, you need to start using a for loop instead of foreach. Second, add a state flag that denotes whether loading has finished. Finally, use an observable data source. In my early days I used to create static properties in App.xaml.cs and use them to share and observe data.

Tying it all together, Directory Recursive Search + Events + Threading

What I would like to do, and have worked towards developing, is a standard class which I can use for retrieving all sub-directories (and their sub directories and files, and so on) and files.
WalkthroughDir(Dir)
Files a
Folders b
WalkthroughDir(b[i])
A straightforward recursive directory search.
Using this as a basis I wanted to extend it to fire events when:
A file is found;
A directory is found;
The search is completed
private void GetDirectories(string path)
{
GetFiles(path);
foreach (string dir in Directory.EnumerateDirectories(path))
{
if (DirectoryFound != null)
{
IOEventArgs<DirectoryInfo> args = new IOEventArgs<DirectoryInfo>(new DirectoryInfo(dir));
DirectoryFound(this, args);
}
// do something with the directory...
GetDirectories(dir);
}
}
private void GetFiles(string path)
{
foreach (string file in Directory.EnumerateFiles(path))
{
if (FileFound != null)
{
IOEventArgs<FileInfo> args = new IOEventArgs<FileInfo>(new FileInfo(file));
FileFound(this, args);
}
// do something with the file...
}
}
Where you find the comments above ("do something[...]") is where I might add the file or directory to some data structure.
The most common factor in doing this type of search though is the processing time, particularly for large directories. So naturally I wanted to take this yet another step forward and implement threading. Now, my knowledge of threading is pretty limited but so far this is an outline of what I've come up with:
public void Search()
{
m_searchThread = new Thread(new ThreadStart(SearchThread));
m_searching = true;
m_searchThread.Start();
}
private void SearchThread()
{
GetDirectories(m_path);
m_searching = false;
}
If I use this implementation and assign the events in a control, it throws errors (as I expected) saying that my GUI application is being accessed from another thread.
Could anyone give feedback on this implementation, as well as on how to accomplish the threading? Thanks.
UPDATE (selkathguy recommendation):
This is the adjusted code following selkathguy's recommendation:
private void GetDirectories(DirectoryInfo path)
{
GetFiles(path);
foreach (DirectoryInfo dir in path.GetDirectories())
{
if (DirectoryFound != null)
{
IOEventArgs<DirectoryInfo> args = new IOEventArgs<DirectoryInfo>(dir);
DirectoryFound(this, args);
}
// do something with the directory...
GetDirectories(dir);
}
}
private void GetFiles(DirectoryInfo path)
{
foreach (FileInfo file in path.GetFiles())
{
if (FileFound != null)
{
IOEventArgs<FileInfo> args = new IOEventArgs<FileInfo>(file);
FileFound(this, args);
}
// do something with the file...
}
}
Original code time taken: 47.87s
Altered code time taken: 46.14s

To address the first part of your request about raising your own events from the standard class: you can create a delegate to which other methods can be hooked as callbacks for the event. Please see http://msdn.microsoft.com/en-us/library/aa645739(v=vs.71).aspx as a good resource. It's fairly trivial to implement.
As for threading, I believe that would be unnecessary at least for your performance concerns. Most of the bottleneck of performance for recursively checking directories is waiting for the node information to load from the disk. Relatively speaking, this is what takes all of your time, as fetching a directory info is a blocking process. Making numerous threads all checking different directories can easily slow down the overall speed of your search, and it tremendously complicates your application with the management of the worker threads and delegation of work shares. With that said, having a thread per disk might be desirable if your search spans multiple disks or resource locations.
I have found that something as simple as recursion using DirectoryInfo.GetDirectories() was one of the fastest solutions, as it takes advantage of the caching that Windows already does. A search application I made using it can search tens of thousands of filenames and directory names per second.
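For the cross-thread errors mentioned in the question, one common pattern is to capture the SynchronizationContext of the thread that starts the search and post every event back to it. The sketch below is an assumption-laden illustration, not the asker's class: DirectoryWalker and its members are hypothetical names, the events carry plain path strings instead of the question's IOEventArgs<T>, and access-denied directories are not handled.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical class sketching events raised on the caller's (UI) thread.
public sealed class DirectoryWalker
{
    public event EventHandler<string> FileFound;
    public event EventHandler<string> DirectoryFound;
    public event EventHandler SearchCompleted;

    private SynchronizationContext _context;

    // Call from the UI thread; the walk itself runs on a background thread.
    public Thread Search(string path)
    {
        // Captured on the calling thread; this is null in a plain console app.
        _context = SynchronizationContext.Current;
        var thread = new Thread(() =>
        {
            Walk(path);
            Raise(() => SearchCompleted?.Invoke(this, EventArgs.Empty));
        });
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }

    private void Walk(string path)
    {
        foreach (string file in Directory.EnumerateFiles(path))
            Raise(() => FileFound?.Invoke(this, file));

        foreach (string dir in Directory.EnumerateDirectories(path))
        {
            Raise(() => DirectoryFound?.Invoke(this, dir));
            Walk(dir);
        }
    }

    // Posts to the captured context if there is one, otherwise invokes directly.
    private void Raise(Action action)
    {
        if (_context != null)
            _context.Post(_ => action(), null);
        else
            action();
    }
}
```

In a WinForms or WPF application, handlers attached to these events would then run on the UI thread and could touch controls directly. Posting one message per file may flood the UI on very large trees, so batching results is worth considering.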

FileSystemWatcher - is File ready to use

When a file is being copied to the file watcher folder, how can I identify whether the file is completely copied and ready to use? Because I am getting multiple events during file copy. (The file is copied via another program using File.Copy.)

When I ran into this problem, the best solution I came up with was to continually try to get an exclusive lock on the file; while the file is being written, the locking attempt will fail, essentially the method in this answer. Once the file isn't being written to any more, the lock will succeed.
Unfortunately, the only way to do that is to wrap a try/catch around opening the file, which makes me cringe - having to use try/catch is always painful. There just doesn't seem to be any way around that, though, so it's what I ended up using.
Modifying the code in that answer does the trick, so I ended up using something like this:
private void WaitForFile(FileInfo file)
{
FileStream stream = null;
bool FileReady = false;
while(!FileReady)
{
try
{
using(stream = file.Open(FileMode.Open, FileAccess.ReadWrite, FileShare.None))
{
FileReady = true;
}
}
catch (IOException)
{
//File isn't ready yet, so we need to keep on waiting until it is.
}
//We'll want to wait a bit between polls, if the file isn't ready.
if(!FileReady) Thread.Sleep(1000);
}
}

Here is a method that will retry file access up to X number of times, with a Sleep between tries. If it never gets access, the application moves on:
private static bool GetIdleFile(string path)
{
var fileIdle = false;
const int MaximumAttemptsAllowed = 30;
var attemptsMade = 0;
while (!fileIdle && attemptsMade <= MaximumAttemptsAllowed)
{
try
{
using (File.Open(path, FileMode.Open, FileAccess.ReadWrite))
{
fileIdle = true;
}
}
catch
{
attemptsMade++;
Thread.Sleep(100);
}
}
return fileIdle;
}
It can be used like this:
private void WatcherOnCreated(object sender, FileSystemEventArgs e)
{
if (GetIdleFile(e.FullPath))
{
// Do something like...
foreach (var line in File.ReadAllLines(e.FullPath))
{
// Do more...
}
}
}

I had this problem when writing a file. I got events before the file was fully written and closed.
The solution is to use a temporary filename and rename the file once finished. Then watch for the file rename event instead of file creation or change event.
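As a rough sketch of that idea, assuming you control the writing program: the writer saves to a temporary name and renames at the end, and the reader subscribes to Renamed. AtomicWriter, WriteThenRename, WatchForFinishedFiles and the ".tmp" suffix are hypothetical choices for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper showing both sides of the temp-name-then-rename pattern.
public static class AtomicWriter
{
    // Writer side: write to "<finalPath>.tmp" first, then rename, so a
    // watcher never sees the final name while the content is incomplete.
    public static void WriteThenRename(string finalPath, byte[] contents)
    {
        string tempPath = finalPath + ".tmp";
        File.WriteAllBytes(tempPath, contents);
        if (File.Exists(finalPath))
            File.Delete(finalPath);
        File.Move(tempPath, finalPath);
    }

    // Reader side: react to the rename instead of Created or Changed.
    public static FileSystemWatcher WatchForFinishedFiles(string folder, Action<string> onReady)
    {
        var watcher = new FileSystemWatcher(folder);
        watcher.Renamed += (sender, e) =>
        {
            // Ignore renames whose new name is still a temporary file.
            if (!e.FullPath.EndsWith(".tmp", StringComparison.OrdinalIgnoreCase))
                onReady(e.FullPath);
        };
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

The rename is cheap when the temporary and final names are in the same folder on the same volume, which is why the temporary file is created next to the target here.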

Note: this problem is not solvable in the general case. Without prior knowledge about how a file is used, you can't know whether other programs have finished working with it.
In your particular case you should be able to figure out what operations File.Copy consists of.
Most likely the destination file is locked during the whole operation. In that case you should be able to simply try to open the file and handle the "sharing mode violation" exception.
You can also wait for some time. This is a very unreliable option, but if you know the size range of the files you may be able to choose a reasonable delay that lets the copy finish.
You can also "invent" some sort of transaction system, i.e. create another file like "destination_file_name.COPYLOCK", which the program that copies the file would create before copying "destination_file_name" and delete afterward.

private Stream ReadWhenAvailable(FileInfo finfo, TimeSpan? ts = null) => Task.Run(() =>
{
ts = ts == null ? new TimeSpan(long.MaxValue) : ts;
var start = DateTime.Now;
while (DateTime.Now - start < ts)
{
Thread.Sleep(200);
try
{
return new FileStream(finfo.FullName, FileMode.Open);
}
catch { }
}
return null;
})
.Result;
...of course, you can modify aspects of this to suit your needs.

One possible solution (It worked in my case) is to use the Change event. You can log in the create event the name of the file just created and then catch the change event and verify if the file was just created. When I manipulated the file in the change event it didn't throw me the error "File is in use"

If you are doing some sort of inter-process communication, as I do, you may want to consider this solution:
App A writes the file you are interested in, eg "Data.csv"
When done, app A writes a 2nd file, eg. "Data.confirmed"
In your C# app B make the FileWatcher listen to "*.confirmed" files. When you get this event you can safely read "Data.csv", as it is already completed by app A.
(Edit: inspired by comments) Delete the *.confirmed file with app B when done processing the "Data.csv" file.
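The app B side of this handshake might look like the sketch below. It assumes the data file always has the same base name as the marker with a ".csv" extension, and that app A creates the marker as an empty file it closes immediately. ConfirmedFileWatcher, DataPathFor and Watch are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper for app B; assumes the data file is always "<name>.csv".
public static class ConfirmedFileWatcher
{
    // Maps ".../Data.confirmed" to ".../Data.csv" in the same folder.
    public static string DataPathFor(string confirmedPath) =>
        Path.ChangeExtension(confirmedPath, ".csv");

    // Listens only for "*.confirmed" files and hands the matching data file
    // to processData, then removes the marker.
    public static FileSystemWatcher Watch(string folder, Action<string> processData)
    {
        var watcher = new FileSystemWatcher(folder, "*.confirmed");
        watcher.Created += (sender, e) =>
        {
            processData(DataPathFor(e.FullPath)); // app A has finished writing
            File.Delete(e.FullPath);              // delete the marker when done
        };
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

If app A writes content into the marker file, the delete could still hit a sharing violation, so a retry around File.Delete may be needed in that case.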

I have solved this issue with two features:
Implement the MemoryCache pattern seen in this question: A robust solution for FileSystemWatcher firing events multiple times
Implement a try\catch loop with a timeout for access
You need to collect average copy times in your environment and set the memory cache timeout to be at least as long as the shortest lock time on a new file. This eliminates duplicates in your processing directive and allows some time for the copy to finish. You will have much better success on first try, which means less time spent in the try\catch loop.
Here is an example of the try\catch loop:
public static IEnumerable<string> GetFileLines(string theFile)
{
DateTime startTime = DateTime.Now;
TimeSpan timeOut = TimeSpan.FromSeconds(TimeoutSeconds);
TimeSpan timePassed;
do
{
try
{
return File.ReadLines(theFile);
}
catch (FileNotFoundException ex)
{
EventLog.WriteEntry(ProgramName, "File not found: " + theFile, EventLogEntryType.Warning, ex.HResult);
return null;
}
catch (PathTooLongException ex)
{
EventLog.WriteEntry(ProgramName, "Path too long: " + theFile, EventLogEntryType.Warning, ex.HResult);
return null;
}
catch (DirectoryNotFoundException ex)
{
EventLog.WriteEntry(ProgramName, "Directory not found: " + theFile, EventLogEntryType.Warning, ex.HResult);
return null;
}
catch (Exception ex)
{
// We swallow all other exceptions here so we can try again
EventLog.WriteEntry(ProgramName, ex.Message, EventLogEntryType.Warning, ex.HResult);
}
Task.Delay(777).Wait();
timePassed = DateTime.Now.Subtract(startTime);
}
while (timePassed < timeOut);
EventLog.WriteEntry(ProgramName, "Timeout after waiting " + timePassed.ToString() + " seconds to read " + theFile, EventLogEntryType.Warning, 258);
return null;
}
Where TimeoutSeconds is a setting that you can put wherever you hold your settings. This can be tuned for your environment.
