Detecting a File Delete on an Open File - c#

I am opening a file with read access and allowing subsequent read|write|delete file share access to the file (tailing the file). If the file is deleted during processing is there a way to detect that the file is pending delete (see Files section http://msdn.microsoft.com/en-us/library/aa363858(v=VS.85).aspx)? If some outside process (the owning process) has issued a delete, I want to close my handle as soon as possible to allow the file deletion so as not to interfere with any logic in the owning process.
I'm in C# and see no method of detecting the pending delete. The file was opened using a FileStream object. Is there some method for detecting the delete in C#, or some other Windows function?

You can use the Windows API function GetFileInformationByHandleEx to detect a pending delete on a file you have open. The second argument is an enumeration value which lets you specify what kind of information the function should return. The FileStandardInfo (1) value will cause it to return the FILE_STANDARD_INFO structure, which includes a DeletePending boolean.
Here is a demonstration utility:
using System;
using System.Text;
using System.IO;
using System.Runtime.InteropServices;
using System.Threading;

internal static class Native
{
    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern bool GetFileInformationByHandleEx(IntPtr hFile,
        int FileInformationClass,
        IntPtr lpFileInformation,
        uint dwBufferSize);

    public struct FILE_STANDARD_INFO
    {
        public long AllocationSize;
        public long EndOfFile;
        public uint NumberOfLinks;
        public byte DeletePending;
        public byte Directory;
    }

    public const int FileStandardInfo = 1;
}

internal static class Program
{
    public static bool IsDeletePending(FileStream fs)
    {
        IntPtr buf = Marshal.AllocHGlobal(4096);
        try
        {
            IntPtr handle = fs.SafeFileHandle.DangerousGetHandle();
            if (!Native.GetFileInformationByHandleEx(handle,
                                                     Native.FileStandardInfo,
                                                     buf,
                                                     4096))
            {
                Exception ex = new Exception("GetFileInformationByHandleEx() failed");
                ex.Data["error"] = Marshal.GetLastWin32Error();
                throw ex;
            }
            else
            {
                Native.FILE_STANDARD_INFO info = Marshal.PtrToStructure<Native.FILE_STANDARD_INFO>(buf);
                return info.DeletePending != 0;
            }
        }
        finally
        {
            Marshal.FreeHGlobal(buf);
        }
    }

    public static int Main(string[] args)
    {
        TimeSpan MAX_WAIT_TIME = TimeSpan.FromSeconds(10);
        if (args.Length == 0)
        {
            args = new string[] { "deleteme.txt" };
        }
        for (int i = 0; i < args.Length; ++i)
        {
            string filename = args[i];
            FileStream fs = null;
            try
            {
                fs = File.Open(filename,
                               FileMode.CreateNew,
                               FileAccess.Write,
                               FileShare.ReadWrite | FileShare.Delete);
                byte[] buf = new byte[4096];
                UTF8Encoding utf8 = new UTF8Encoding(false);
                string text = "hello world!\r\n";
                int written = utf8.GetBytes(text, 0, text.Length, buf, 0);
                fs.Write(buf, 0, written);
                fs.Flush();
                Console.WriteLine("{0}: created and wrote line", filename);
                DateTime t0 = DateTime.UtcNow;
                for (;;)
                {
                    Thread.Sleep(16);
                    if (IsDeletePending(fs))
                    {
                        Console.WriteLine("{0}: detected pending delete", filename);
                        break;
                    }
                    if (DateTime.UtcNow - t0 > MAX_WAIT_TIME)
                    {
                        Console.WriteLine("{0}: timeout reached with no delete", filename);
                        break;
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("{0}: {1}", filename, ex.Message);
            }
            finally
            {
                if (fs != null)
                {
                    Console.WriteLine("{0}: closing", filename);
                    fs.Dispose();
                }
            }
        }
        return 0;
    }
}

I would use a different signaling mechanism. (I am assuming all file access is within your control and not from a closed external program, mainly because of the share flags being employed.)
The only "solution" within those bounds I can think of is a poll on file-access and check the exception (if any) you get back. Perhaps there is something much more tricky (at a lower-level than the win32 file API?!?), but this is already going down the "uhg path" :-)

FileSystemWatcher would probably be the closest thing, but it can't detect a "pending" delete; when the file IS deleted, an event will be raised on FileSystemWatcher, and you can attach a handler that will gracefully interrupt your file processing. If the share mode you used when opening the file makes it possible for the file to be deleted at all, simply closing your read-only FileStream when that happens should not affect the file system.
The basic steps are to create a watcher, set its Path to the file's directory and its Filter to the file name (a watcher watches a directory, not a single file), set its NotifyFilter to the kind(s) of file system modifications you want to watch for, and attach your handler to the Deleted event. The handler can be as simple as setting a flag somewhere that your main process can read, and closing the FileStream. You'll then get an exception on your next attempt to work with the stream; catch it, read the flag, and if it's set just gracefully stop doing file work. You can also put the file processing in a separate worker thread, and the event handler can tell that thread to die in some graceful manner.
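A minimal sketch of that arrangement (the buffer size and sleep interval are placeholders, and how you process the bytes is up to you):

using System;
using System.IO;
using System.Threading;

class TailUntilDeleted
{
    private volatile bool _deleted;

    public void Tail(string fullPath)
    {
        using (var watcher = new FileSystemWatcher(
                   Path.GetDirectoryName(fullPath), Path.GetFileName(fullPath)))
        using (var fs = new FileStream(fullPath, FileMode.Open, FileAccess.Read,
                                       FileShare.ReadWrite | FileShare.Delete))
        {
            watcher.NotifyFilter = NotifyFilters.FileName;
            watcher.Deleted += (s, e) => _deleted = true; // just flip the flag
            watcher.EnableRaisingEvents = true;

            var buf = new byte[4096];
            while (!_deleted)
            {
                int read = fs.Read(buf, 0, buf.Length);
                if (read == 0) Thread.Sleep(100); // at EOF; wait for more data
                // else: hand buf[0..read) to your processing code
            }
        } // disposing the FileStream here lets the pending delete complete
    }
}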

If the file is small enough, your application could process a copy of the file, rather than the file itself. Also, if your application needs to know whether the owning process deleted the original file, set up a FileSystemWatcher (FSW) on the file. When the file disappears, the FSW could set a flag to interrupt processing:
private volatile bool _fileExists = true;

public void Process(string pathToOriginalFile, string pathToCopy)
{
    File.Copy(pathToOriginalFile, pathToCopy);
    FileSystemWatcher watcher = new FileSystemWatcher();
    // Path must be the directory; Filter narrows the watch to the one file.
    watcher.Path = Path.GetDirectoryName(pathToOriginalFile);
    watcher.Filter = Path.GetFileName(pathToOriginalFile);
    watcher.Deleted += new FileSystemEventHandler(OnFileDeleted);
    bool doneProcessing = false;
    watcher.EnableRaisingEvents = true;
    while (_fileExists && !doneProcessing)
    {
        // process the copy here
    }
    ...
}

private void OnFileDeleted(object source, FileSystemEventArgs e)
{
    _fileExists = false;
}

No, there's no clean way to do this. If you were concerned about other processes opening and/or modifying the file, then oplocks could help you. But if you're just looking for notification of when the delete disposition gets set, there isn't a straightforward way to do it (short of building a file system filter, hooking the APIs, etc., all of which are spooky things for an application to be doing without very good reason).

Related

Cannot kill process: "No process is associated with this object" error

In my program I try to start a new process (open a video file in the default player). This part works OK. Later, when I try to close the process (player), I get an error:
System.InvalidOperationException: No process is associated with this object.
My code:
string filename = "747225775.mp4";
var myProc = new Process()
{
    StartInfo = new ProcessStartInfo(filename)
};
myProc.Start();
Thread.Sleep(5000);
try
{
    myProc.Kill(); // Error is here
}
catch (Exception ex)
{
    Debug.WriteLine(ex);
    Debugger.Break();
}
What is wrong?
Process.Start associates the Process object with a native process handle only when it spawns the new process directly. When a file name is used as the argument instead of an executable name, Process searches the registry for association settings via shell32.dll functions, and the exact behavior depends on them.
When the association is configured the traditional way, to invoke a command line with the file name as the first argument (as for Notepad), Process.Start spawns the new process directly and correctly associates the object with the native handle. However, when the association is configured to execute a COM object (as for Windows Media Player), Process.Start only issues an RPC request to execute a COM object method and returns without associating the object with a process handle. (The actual process spawn occurs in the svchost.exe context, according to my tests.)
This issue can be solved with the following modified process-start method:
using System;
using System.ComponentModel;
using System.Text;
using System.Windows.Forms;
using System.Diagnostics;
using System.Threading;
using System.Runtime.InteropServices;

namespace ProcessTest
{
    public partial class Form1 : Form
    {
        [DllImport("Shlwapi.dll", SetLastError = true, CharSet = CharSet.Auto)]
        static extern uint AssocQueryString(AssocF flags, AssocStr str, string pszAssoc, string pszExtra, [Out] StringBuilder pszOut, ref uint pcchOut);

        /* Modified Process.Start */
        public static Process TrueProcessStart(string filename)
        {
            ProcessStartInfo psi;
            string ext = System.IO.Path.GetExtension(filename); // get extension
            var sb = new StringBuilder(500); // buffer for exe file path
            uint size = 500; // buffer size
            /* Get associated app */
            uint res = AssocQueryString(AssocF.None, AssocStr.Executable, ext, null, sb, ref size);
            if (res != 0)
            {
                Debug.WriteLine("AssocQueryString returned error: " + res.ToString("X"));
                psi = new ProcessStartInfo(filename); // can't get app, use standard method
            }
            else
            {
                psi = new ProcessStartInfo(sb.ToString(), filename);
            }
            return Process.Start(psi); // actually start process
        }

        public Form1()
        {
            InitializeComponent();
        }

        private void button2_Click(object sender, EventArgs e)
        {
            string filename = "c:\\images\\clip.wmv";
            var myProc = TrueProcessStart(filename);
            if (myProc == null)
            {
                MessageBox.Show("Process can't be killed");
                return;
            }
            Thread.Sleep(5000);
            try
            {
                myProc.Kill();
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.ToString());
            }
        }
    }

    [Flags]
    enum AssocF : uint
    {
        None = 0,
        Init_NoRemapCLSID = 0x1,
        Init_ByExeName = 0x2,
        Open_ByExeName = 0x2,
        Init_DefaultToStar = 0x4,
        Init_DefaultToFolder = 0x8,
        NoUserSettings = 0x10,
        NoTruncate = 0x20,
        Verify = 0x40,
        RemapRunDll = 0x80,
        NoFixUps = 0x100,
        IgnoreBaseClass = 0x200,
        Init_IgnoreUnknown = 0x400,
        Init_FixedProgId = 0x800,
        IsProtocol = 0x1000,
        InitForFile = 0x2000,
    }

    enum AssocStr
    {
        Command = 1,
        Executable,
        FriendlyDocName,
        FriendlyAppName,
        NoOpen,
        ShellNewValue,
        DDECommand,
        DDEIfExec,
        DDEApplication,
        DDETopic,
        InfoTip,
        QuickTip,
        TileInfo,
        ContentType,
        DefaultIcon,
        ShellExtension,
        DropTarget,
        DelegateExecute,
        SupportedUriProtocols,
        Max,
    }
}
Here we get the file type's associated application via AssocQueryString. The returned value is then passed to ProcessStartInfo. However, it does not always work, so we sometimes have to fall back to the standard method. For example, image files do not have an associated exe; it's just a DLL loaded into Explorer's process, so we can't outright kill a process in this case.
To answer your question, "What is wrong?": I can say the underlying cause is related to the Windows apps that are launched to handle this type of file (.mp4).
From what I can determine, there isn't anything wrong with your code sample, except that it doesn't account for this scenario (in which, admittedly, I do not understand why it behaves this way).
To replicate this, I used your code sample and an image file (.png); the program launches with 'Photos' by default.
I changed .png files to launch with the Paint application by default, then ran the program again. The code sample you've provided worked fine with the desktop application.

.NET FileSystemWatcher goes into infinite loop when moving file

I have an issue with the FileSystemWatcher. I'm using it in a Windows service to monitor certain folders; when a file is copied in, the service processes that file using an SSIS package. Everything works fine, but every now and then the watcher picks up the same file and fires the Created event multiple times in an infinite loop. The code below works as follows:
First, this method is called by the Windows service and creates a watcher:
private void CreateFileWatcherEvent(SSISPackageSetting packageSettings)
{
    // Create a new FileSystemWatcher and set its properties.
    FileSystemWatcher watcher = new FileSystemWatcher();
    watcher.IncludeSubdirectories = false;
    watcher.Path = packageSettings.FileWatchPath;
    /* Watch for changes in LastAccess and LastWrite times, and
       the renaming of files or directories. */
    watcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite
        | NotifyFilters.FileName | NotifyFilters.DirectoryName | NotifyFilters.Size;
    // Watch for all files
    watcher.Filter = "*.*";
    watcher.Created += (s, e) => FileCreated(e, packageSettings);
    // Begin watching.
    watcher.EnableRaisingEvents = true;
}
Next up, the Watcher.Created event handler looks something like this:
private void FileCreated(FileSystemEventArgs e, SSISPackageSetting packageSettings)
{
    // Bunch of other code not important to the issue
    ProcessFile(packageSettings, e.FullPath, fileExtension);
}
The ProcessFile method looks something like this:
private void ProcessFile(SSISPackageSetting packageSetting, string Filename, string fileExtension)
{
    // COMPLETE A BUNCH OF SSIS TASKS TO PROCESS THE FILE
    // NOW WE NEED TO CREATE THE OUTPUT FILE SO THAT SSIS CAN WRITE TO IT
    string errorOutPutfileName = packageSetting.ImportFailurePath + @"\FailedRows" + System.DateTime.Now.ToFileTime() + packageSetting.ErrorRowsFileExtension;
    File.Create(errorOutPutfileName).Close();
    MoveFileToSuccessPath(Filename, packageSetting);
}
Lastly, the MoveFile method looks like this:
private void MoveFileToSuccessPath(string filename, SSISPackageSetting ssisPackage)
{
    try
    {
        string newFilename = MakeFilenameUnique(filename);
        System.IO.File.Move(filename, ssisPackage.ArchivePath.EndsWith("\\") ? ssisPackage.ArchivePath + newFilename : ssisPackage.ArchivePath + "\\" + newFilename);
    }
    catch (Exception ex)
    {
        SaveToApplicationLog(string.Format
            ("Error occurred while moving a file to the success path. Filename {0}. Archive Path {1}. Error {2}", filename, ssisPackage.ArchivePath, ex.ToString()), EventLogEntryType.Error);
    }
}
So somewhere in there we go into an infinite loop, and the FileSystemWatcher keeps picking up the same file. Anyone have any idea? This happens randomly and intermittently.
When using the FileSystemWatcher I tend to add files to a dictionary when the notification event fires. A separate thread then uses a timer to pick files up from this collection once they are more than a few seconds old, somewhere around 5 seconds.
If my processing is also likely to change the last access time, and I watch that too, then I also compute a checksum which I keep in a dictionary along with the filename and last processed time for every file, and use it to suppress multiple firings in a row. It doesn't have to be expensive to calculate; I have used MD5 and even CRC32, since you are only trying to prevent duplicate notifications.
EDIT
This example code is very situation-specific and makes lots of assumptions you may need to change. It doesn't list all your code, just something like the bits you need to add:
// So, first thing to do is add a dictionary to store file info:
internal class FileWatchInfo
{
    public DateTime LatestTime { get; set; }
    public DateTime ProcessedTime { get; set; }
    public bool IsProcessing { get; set; }
    public bool Processed { get; set; }
    public string FullName { get; set; }
    public string Checksum { get; set; }
}

SortedDictionary<string, FileWatchInfo> fileInfos = new SortedDictionary<string, FileWatchInfo>();
private readonly object SyncRoot = new object();

// Now, when you set up the watcher, also set up a Timer to monitor that dictionary.
CreateFileWatcherEvent(new SSISPackageSetting { FileWatchPath = "H:\\test" });
int processFilesInMilliseconds = 5000;
Timer timer = new Timer(ProcessFiles, null, processFilesInMilliseconds, processFilesInMilliseconds);

// In FileCreated, don't process the file but add it to a list
private void FileCreated(FileSystemEventArgs e)
{
    var finf = new FileInfo(e.FullPath);
    DateTime latest = finf.LastAccessTimeUtc > finf.LastWriteTimeUtc
        ? finf.LastAccessTimeUtc : finf.LastWriteTimeUtc;
    latest = latest > finf.CreationTimeUtc ? latest : finf.CreationTimeUtc;
    // Beware of issues if other code sets the file times to crazy times in the past/future
    lock (SyncRoot)
    {
        // You need to work out what to do if you actually need to add this file again (i.e. someone
        // has edited it in the 5 seconds since it was created, and the time it took you to process it)
        if (!this.fileInfos.ContainsKey(e.FullPath))
        {
            FileWatchInfo info = new FileWatchInfo
            {
                FullName = e.FullPath,
                LatestTime = latest,
                IsProcessing = false,
                Processed = false,
                Checksum = null
            };
            this.fileInfos.Add(e.FullPath, info);
        }
    }
}
And finally, here is the process method as it now stands:
private void ProcessFiles(object state)
{
    FileWatchInfo toProcess = null;
    List<string> toRemove = new List<string>();
    lock (this.SyncRoot)
    {
        foreach (var info in this.fileInfos)
        {
            // You may want to sort your list by latest to avoid files being left in the queue for a long time
            if (info.Value.Checksum == null)
            {
                // If this fires the watcher, it doesn't matter, but beware of big files,
                // which may mean you need to move this outside the lock
                using (var md5 = MD5.Create())
                using (var stream = File.OpenRead(info.Value.FullName))
                {
                    info.Value.Checksum =
                        BitConverter.ToString(md5.ComputeHash(stream)).Replace("-", "").ToLower();
                }
            }
            // Data store (myFileInfoStore) is code I haven't included - use a Dictionary which you remove files from
            // after a few minutes, or a permanent database to store file checksums
            if ((info.Value.Processed && info.Value.ProcessedTime.AddSeconds(5) < DateTime.UtcNow)
                || myFileInfoStore.GetFileInfo(info.Value.FullName).Checksum == info.Value.Checksum)
            {
                toRemove.Add(info.Key);
            }
            else if (!info.Value.Processed && !info.Value.IsProcessing
                && info.Value.LatestTime.AddSeconds(5) < DateTime.UtcNow)
            {
                info.Value.IsProcessing = true;
                toProcess = info.Value;
                // This processes one file at a time, you could equally add a bunch to a list for parallel processing
                break;
            }
        }
        foreach (var filePath in toRemove)
        {
            this.fileInfos.Remove(filePath);
        }
    }
    if (toProcess != null)
    {
        ProcessFile(packageSettings, toProcess.FullName, new FileInfo(toProcess.FullName).Extension);
    }
}
Finally, ProcessFile needs to process your file; once it completes, take the lock, mark the entry in the fileInfos dictionary as Processed, set its ProcessedTime, then exit the lock and move the file. You will also want to update the checksum if it changes after an acceptable amount of time has passed.
It is very hard to provide a complete sample as I know nothing about your situation, but this is the general pattern I use. You will need to consider file rates, how frequently they are updated, etc. You can probably bring the time intervals down to sub-second instead of 5 seconds and still be OK.
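To make that last step concrete, here is a minimal sketch of the completion logic, reusing the fileInfos dictionary and SyncRoot lock from the sample above (the SSIS work itself is elided):

private void ProcessFile(SSISPackageSetting packageSetting, string fullName, string fileExtension)
{
    // ... run the SSIS tasks for the file here ...

    lock (this.SyncRoot)
    {
        FileWatchInfo info;
        if (this.fileInfos.TryGetValue(fullName, out info))
        {
            info.Processed = true;
            info.ProcessedTime = DateTime.UtcNow;
            info.IsProcessing = false;
        }
    }

    // Move outside the lock; the move may fire the watcher again, but the
    // Processed flag and checksum now suppress reprocessing.
    MoveFileToSuccessPath(fullName, packageSetting);
}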

UNC path pointing to local directory much slower than local access

Some code I'm working with occasionally needs to refer to long UNC paths (e.g. \\?\UNC\MachineName\Path), but we've discovered that no matter where the directory is located, even on the same machine, it's much slower when accessing through the UNC path than the local path.
For example, we've written some benchmarking code that writes a string of gibberish to a file, then later read it back, multiple times. I'm testing it with 6 different ways to access the same shared directory on my dev machine, with the code running on the same machine:
C:\Temp
\\MachineName\Temp
\\?\C:\Temp
\\?\UNC\MachineName\Temp
\\127.0.0.1\Temp
\\?\UNC\127.0.0.1\Temp
And here are the results:
Testing: C:\Temp
Wrote 1000 files to C:\Temp in 861.0647 ms
Read 1000 files from C:\Temp in 60.0744 ms
Testing: \\MachineName\Temp
Wrote 1000 files to \\MachineName\Temp in 2270.2051 ms
Read 1000 files from \\MachineName\Temp in 1655.0815 ms
Testing: \\?\C:\Temp
Wrote 1000 files to \\?\C:\Temp in 916.0596 ms
Read 1000 files from \\?\C:\Temp in 60.0517 ms
Testing: \\?\UNC\MachineName\Temp
Wrote 1000 files to \\?\UNC\MachineName\Temp in 2499.3235 ms
Read 1000 files from \\?\UNC\MachineName\Temp in 1684.2291 ms
Testing: \\127.0.0.1\Temp
Wrote 1000 files to \\127.0.0.1\Temp in 2516.2847 ms
Read 1000 files from \\127.0.0.1\Temp in 1721.1925 ms
Testing: \\?\UNC\127.0.0.1\Temp
Wrote 1000 files to \\?\UNC\127.0.0.1\Temp in 2499.3211 ms
Read 1000 files from \\?\UNC\127.0.0.1\Temp in 1678.18 ms
I tried the IP address to rule out a DNS issue. Could it be checking credentials or permissions on each file access? If so, is there a way to cache that? Does it just assume that since it's a UNC path it should do everything over TCP/IP instead of accessing the disk directly? Is something wrong with the code we're using for the reads/writes? I've ripped out the pertinent parts for benchmarking, shown below:
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;
using Microsoft.Win32.SafeHandles;
using Util.FileSystem;

namespace UNCWriteTest {
    internal class Program {
        [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        public static extern bool DeleteFile(string path); // File.Delete doesn't handle \\?\UNC\ paths

        private const int N = 1000;

        private const string TextToSerialize =
            "asd;lgviajsmfopajwf0923p84jtmpq93worjgfq0394jktp9orgjawefuogahejngfmliqwegfnailsjdhfmasodfhnasjldgifvsdkuhjsmdofasldhjfasolfgiasngouahfmp9284jfqp92384fhjwp90c8jkp04jk34pofj4eo9aWIUEgjaoswdfg8jmp409c8jmwoeifulhnjq34lotgfhnq34g";

        private static readonly byte[] _Buffer = Encoding.UTF8.GetBytes(TextToSerialize);

        public static string WriteFile(string basedir) {
            string fileName = Path.Combine(basedir, string.Format("{0}.tmp", Guid.NewGuid()));
            try {
                IntPtr writeHandle = NativeFileHandler.CreateFile(
                    fileName,
                    NativeFileHandler.EFileAccess.GenericWrite,
                    NativeFileHandler.EFileShare.None,
                    IntPtr.Zero,
                    NativeFileHandler.ECreationDisposition.New,
                    NativeFileHandler.EFileAttributes.Normal,
                    IntPtr.Zero);

                // if file was locked
                int fileError = Marshal.GetLastWin32Error();
                if ((fileError == 32 /* ERROR_SHARING_VIOLATION */) || (fileError == 80 /* ERROR_FILE_EXISTS */)) {
                    throw new Exception("oopsy");
                }

                using (var h = new SafeFileHandle(writeHandle, true)) {
                    using (var fs = new FileStream(h, FileAccess.Write, NativeFileHandler.DiskPageSize)) {
                        fs.Write(_Buffer, 0, _Buffer.Length);
                    }
                }
            }
            catch (IOException) {
                throw;
            }
            catch (Exception ex) {
                throw new InvalidOperationException(" code " + Marshal.GetLastWin32Error(), ex);
            }
            return fileName;
        }

        public static void ReadFile(string fileName) {
            var fileHandle =
                new SafeFileHandle(
                    NativeFileHandler.CreateFile(fileName, NativeFileHandler.EFileAccess.GenericRead, NativeFileHandler.EFileShare.Read, IntPtr.Zero,
                                                 NativeFileHandler.ECreationDisposition.OpenExisting, NativeFileHandler.EFileAttributes.Normal, IntPtr.Zero), true);

            using (fileHandle) {
                // check the handle here to get a bit cleaner exception semantics
                if (fileHandle.IsInvalid) {
                    // ms-help://MS.MSSDK.1033/MS.WinSDK.1033/debug/base/system_error_codes__0-499_.htm
                    int errorCode = Marshal.GetLastWin32Error();
                    // now that we've taken more than our allotted share of time, throw the exception
                    throw new IOException(string.Format("file read failed on {0} with error code {1}", fileName, errorCode));
                }

                // we have a valid handle and can actually read a stream, exceptions from serialization bubble out
                using (var fs = new FileStream(fileHandle, FileAccess.Read, 1 * NativeFileHandler.DiskPageSize)) {
                    // if serialization fails, we'll just let the normal serialization exception flow out
                    var foo = new byte[256];
                    fs.Read(foo, 0, 256);
                }
            }
        }

        public static string[] TestWrites(string baseDir) {
            try {
                var fileNames = new List<string>();
                DateTime start = DateTime.UtcNow;
                for (int i = 0; i < N; i++) {
                    fileNames.Add(WriteFile(baseDir));
                }
                DateTime end = DateTime.UtcNow;

                Console.Out.WriteLine("Wrote {0} files to {1} in {2} ms", N, baseDir, end.Subtract(start).TotalMilliseconds);
                return fileNames.ToArray();
            }
            catch (Exception e) {
                Console.Out.WriteLine("Failed to write for " + baseDir + " Exception: " + e.Message);
                return new string[] { };
            }
        }

        public static void TestReads(string baseDir, string[] fileNames) {
            try {
                DateTime start = DateTime.UtcNow;
                for (int i = 0; i < N; i++) {
                    ReadFile(fileNames[i % fileNames.Length]);
                }
                DateTime end = DateTime.UtcNow;

                Console.Out.WriteLine("Read {0} files from {1} in {2} ms", N, baseDir, end.Subtract(start).TotalMilliseconds);
            }
            catch (Exception e) {
                Console.Out.WriteLine("Failed to read for " + baseDir + " Exception: " + e.Message);
            }
        }

        private static void Main(string[] args) {
            foreach (string baseDir in args) {
                Console.Out.WriteLine("Testing: {0}", baseDir);

                string[] fileNames = TestWrites(baseDir);
                TestReads(baseDir, fileNames);

                foreach (string fileName in fileNames) {
                    DeleteFile(fileName);
                }
            }
        }
    }
}
This doesn't surprise me. You're writing/reading a fairly small amount of data, so the file system cache is probably minimizing the impact of the physical disk I/O; basically, the bottleneck is going to be the CPU. I'm not certain whether the traffic goes through the TCP/IP stack, but at a minimum the SMB protocol is involved. For one thing, that means requests are passed back and forth between the SMB client process and the SMB server process, so you've got context switching between three distinct processes, including your own. Using the local file system path, you switch into kernel mode and back, but no other process is involved. Context switching is much slower than the transition to and from kernel mode.
There are likely to be two distinct additional overheads: one per file and one per kilobyte of data. In this particular test, the per-file SMB overhead is likely to be dominant. Because the amount of data involved also affects the impact of physical disk I/O, you may find that this is only really a problem when dealing with lots of small files.
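If you want to check which overhead dominates, one rough approach is to hold the file count constant and vary the payload size; Stopwatch also gives more reliable timings than DateTime. A sketch:

using System;
using System.Diagnostics;
using System.IO;

// Hypothetical micro-benchmark: time the same loop with different payload sizes
// over the same directory to separate the fixed per-file cost from the per-byte cost.
static class OverheadProbe
{
    public static double MsPerFile(string dir, int count, byte[] payload)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            string path = Path.Combine(dir, Guid.NewGuid() + ".tmp");
            File.WriteAllBytes(path, payload);
            File.Delete(path);
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / count;
    }
}

// If ms/file barely moves between a 256-byte and a 64 KB payload on the UNC path,
// the fixed per-file SMB round trips dominate, as described above.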

Capturing standard out from tail -f "follow"

I am trying to capture the output from tail in follow mode, where it outputs the text as it detects changes in the file length - particularly useful for following log files as lines are added. For some reason, my call to StandardOutput.Read() is blocking until tail.exe exits completely.
Relevant code sample:
var p = new Process() {
    StartInfo = new ProcessStartInfo("tail.exe") {
        UseShellExecute = false,
        RedirectStandardOutput = true,
        Arguments = "-f c:\\test.log"
    }
};
p.Start();

// the following thread blocks until the process exits
Task.Factory.StartNew(() => p.StandardOutput.Read());

// main thread waits until the child process exits
p.WaitForExit();
I have also tried using the support for the OutputDataReceived event handler which exhibits the same blocking behavior:
p.OutputDataReceived += (proc, data) => {
    if (data != null && data.Data != null) {
        Console.WriteLine(data.Data);
    }
};
p.BeginOutputReadLine();
I do have a little bit more code around the call to StandardOutput.Read(), but this simplifies the example and still exhibits the undesirable blocking behavior. Is there something else I can do to allow my code to react to the availability of data in the StandardOutput stream prior to the child application exiting?
Is this just perhaps a quirk of how tail.exe runs? I am using version 2.0 compiled as part of the UnxUtils package.
Update: this does appear to be at least partially related to quirks in tail.exe. I grabbed the binary from the GnuWin32 project as part of the CoreUtils package, which bumped the version up to 5.3.0. If I use the -f option to follow without retries, I get the dreaded "bad file descriptor" issue on STDERR (easy to ignore) and the process terminates immediately. If I use the -F option to include retries, it seems to work properly after the bad-file-descriptor message has gone by and it attempts to open the file a second time.
Is there perhaps a more recent win32 build from the coreutils git repository I could try?
I know it is not exactly what you are asking, but as James says in the comments, you could implement the equivalent functionality directly in C# and save yourself having to launch another process.
One way you can do it is like this:
using System;
using System.IO;
using System.Text;
using System.Threading;

public class FollowingTail : IDisposable
{
    private readonly Stream _fileStream;
    private readonly Timer _timer;

    public FollowingTail(FileInfo file,
                         Encoding encoding,
                         Action<string> fileChanged)
    {
        _fileStream = new FileStream(file.FullName,
                                     FileMode.Open,
                                     FileAccess.Read,
                                     FileShare.ReadWrite);
        _timer = new Timer(o => CheckForUpdate(encoding, fileChanged),
                           null,
                           0,
                           500);
    }

    private void CheckForUpdate(Encoding encoding,
                                Action<string> fileChanged)
    {
        // Read the tail of the file off
        var tail = new StringBuilder();
        int read;
        var b = new byte[1024];
        while ((read = _fileStream.Read(b, 0, b.Length)) > 0)
        {
            tail.Append(encoding.GetString(b, 0, read));
        }

        // If we have anything notify the fileChanged callback
        // If we do not, make sure we are at the end
        if (tail.Length > 0)
        {
            fileChanged(tail.ToString());
        }
        else
        {
            _fileStream.Seek(0, SeekOrigin.End);
        }
    }

    // Not the best implementation of IDisposable but you get the idea
    // See http://msdn.microsoft.com/en-us/library/ms244737(v=vs.80).aspx
    // for how to do it properly
    public void Dispose()
    {
        _timer.Dispose();
        _fileStream.Dispose();
    }
}
Then to call for example:
new FollowingTail(new FileInfo(@"C:\test.log"),
                  Encoding.ASCII,
                  s =>
                  {
                      // Do something with the new stuff here, e.g. print it
                      Console.Write(s);
                  });

ASP.NET Schedule deletion of temporary files

Question: I have an ASP.NET application which creates temporary PDF files (for the user to download).
Now, many users over many days can create many PDFs, which take up a lot of disk space.
What's the best way to schedule deletion of files older than 1 day / 8 hours?
Preferably in the ASP.NET application itself...
For each temporary file that you need to create, make a note of the filename in the session:
// create temporary file:
string fileName = System.IO.Path.GetTempFileName();
Session[string.Concat("temporaryFile", Guid.NewGuid().ToString("d"))] = fileName;
// TODO: write to file
Next, add the following cleanup code to global.asax:
<%@ Application Language="C#" %>
<script RunAt="server">
    void Session_End(object sender, EventArgs e) {
        // Code that runs when a session ends.
        // Note: The Session_End event is raised only when the sessionstate mode
        // is set to InProc in the Web.config file. If session mode is set to StateServer
        // or SQLServer, the event is not raised.

        // remove files that have been uploaded, but not actively 'saved' or 'canceled' by the user
        foreach (string key in Session.Keys) {
            if (key.StartsWith("temporaryFile", StringComparison.OrdinalIgnoreCase)) {
                try {
                    string fileName = (string)Session[key];
                    Session[key] = string.Empty;
                    if ((fileName.Length > 0) && (System.IO.File.Exists(fileName))) {
                        System.IO.File.Delete(fileName);
                    }
                } catch (Exception) { }
            }
        }
    }
</script>
UPDATE: I'm now actually using a new (improved) method instead of the one described above. The new one involves HttpRuntime.Cache and checks that the files are older than 8 hours. I'll post it here if anyone's interested. Here's my new global.asax.cs:
using System;
using System.Web;
using System.Text;
using System.IO;
using System.Xml;
using System.Web.Caching;

public partial class global : System.Web.HttpApplication {

    protected void Application_Start() {
        RemoveTemporaryFiles();
        RemoveTemporaryFilesSchedule();
    }

    public void RemoveTemporaryFiles() {
        string pathTemp = "d:\\uploads\\";
        if ((pathTemp.Length > 0) && (Directory.Exists(pathTemp))) {
            foreach (string file in Directory.GetFiles(pathTemp)) {
                try {
                    FileInfo fi = new FileInfo(file);
                    if (fi.CreationTime < DateTime.Now.AddHours(-8)) {
                        File.Delete(file);
                    }
                } catch (Exception) { }
            }
        }
    }

    public void RemoveTemporaryFilesSchedule() {
        HttpRuntime.Cache.Insert("RemoveTemporaryFiles", string.Empty, null, DateTime.Now.AddHours(1), Cache.NoSlidingExpiration, CacheItemPriority.NotRemovable, delegate(string id, object o, CacheItemRemovedReason cirr) {
            if (id.Equals("RemoveTemporaryFiles", StringComparison.OrdinalIgnoreCase)) {
                RemoveTemporaryFiles();
                RemoveTemporaryFilesSchedule();
            }
        });
    }
}
Try using Path.GetTempPath(). It will give you the path to the Windows temp folder. Then it will be up to Windows to clean up :)
You can read more about the method here http://msdn.microsoft.com/en-us/library/system.io.path.gettemppath.aspx
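For example, a minimal sketch (pdfBytes stands in for whatever content you generate):

using System;
using System.IO;

// Hypothetical: write a generated PDF under the system temp path.
string tempFile = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".pdf");
File.WriteAllBytes(tempFile, pdfBytes); // pdfBytes: your generated PDF bytes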
The best way is to create a batch file that the Windows Task Scheduler calls at the interval you want.
OR
you can create a Windows service with the class below:
public class CleanUpBot
{
    public bool KeepAlive;
    private Thread _cleanUpThread;

    public void Run()
    {
        _cleanUpThread = new Thread(StartCleanUp);
        _cleanUpThread.Start(); // start the clean-up thread
    }

    private void StartCleanUp()
    {
        do
        {
            // HERE THE LOGIC FOR DELETE FILES
            _cleanUpThread.Join(TIME_IN_MILLISECOND); // effectively sleeps for the interval
        } while (KeepAlive);
    }
}
Notice that you can also call this class from Page_Load, and it won't affect processing time because the work is done on another thread. Just remove the do-while and the Thread.Join() in that case.
How do you store the files? If possible, you could just go with a simple solution where all files are stored in a folder named after the current date and time.
Then create a simple page or HttpHandler that deletes old folders, as sketched below. You could call this page at intervals using a Windows schedule or other cron job.
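A minimal sketch of such a handler (the folder location and the "yyyy-MM-dd_HH" naming scheme are my assumptions; map it to an .ashx and hit it from a scheduled task):

using System;
using System.Globalization;
using System.IO;
using System.Web;

// Hypothetical cleanup handler; assumes folders under ~/App_Data/pdf
// are named with the creation timestamp, e.g. "2012-05-01_14".
public class CleanupHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        string root = context.Server.MapPath("~/App_Data/pdf");
        foreach (string dir in Directory.GetDirectories(root))
        {
            DateTime created;
            if (DateTime.TryParseExact(Path.GetFileName(dir), "yyyy-MM-dd_HH",
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out created)
                && created < DateTime.Now.AddHours(-8))
            {
                try { Directory.Delete(dir, true); } // true: delete contents too
                catch (IOException) { /* a file is in use; retry next run */ }
            }
        }
    }

    public bool IsReusable { get { return true; } }
}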
Create a timer in Application_Start and schedule it to call a method every hour that flushes files older than 8 hours, 1 day, or whatever duration you need. For example:
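A rough sketch of that idea using System.Threading.Timer (the upload path is a placeholder):

using System;
using System.IO;
using System.Threading;

public class Global : System.Web.HttpApplication
{
    private static Timer _cleanupTimer; // keep a reference so the timer isn't collected

    protected void Application_Start(object sender, EventArgs e)
    {
        _cleanupTimer = new Timer(_ =>
        {
            foreach (string file in Directory.GetFiles(@"d:\uploads")) // hypothetical path
            {
                try
                {
                    if (File.GetCreationTime(file) < DateTime.Now.AddHours(-8))
                        File.Delete(file);
                }
                catch (IOException) { /* file in use; try again next round */ }
            }
        }, null, TimeSpan.Zero, TimeSpan.FromHours(1)); // run now, then hourly
    }
}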
I sort of agree with what's said in the answer by dirk.
The idea being that the temp folder you drop the files into is a fixed, known location. However, I differ slightly...
Each time a file is created, add the filename to a list in the session object (assuming there aren't thousands; if there are, do the cleanup early once the list hits a given cap).
When the session ends, the Session_End event should be raised in global.asax. Iterate all the files in the list and remove them.
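A rough sketch of that bookkeeping (the "tempFiles" key and the List<string> are my own choices, nothing standard):

// Needs: using System.Collections.Generic; using System.IO;

// Wherever you create a temporary file, remember it in the session:
var files = Session["tempFiles"] as List<string>;
if (files == null) Session["tempFiles"] = files = new List<string>();
files.Add(fileName);

// In global.asax:
void Session_End(object sender, EventArgs e)
{
    var files = Session["tempFiles"] as List<string>;
    if (files == null) return;
    foreach (string f in files)
    {
        try { if (File.Exists(f)) File.Delete(f); }
        catch (IOException) { /* locked; leave it for a scheduled sweep */ }
    }
}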
private const string TEMPDIRPATH = @"C:\mytempdir\";
private const int DELETEAFTERHOURS = 8;

private void cleanTempDir()
{
    foreach (string filePath in Directory.GetFiles(TEMPDIRPATH))
    {
        FileInfo fi = new FileInfo(filePath);
        if (!(fi.LastWriteTime.CompareTo(DateTime.Now.AddHours(DELETEAFTERHOURS * -1)) <= 0)) // created or modified more than x hours ago? if not, continue to the next file
        {
            continue;
        }
        try
        {
            File.Delete(filePath);
        }
        catch (Exception)
        {
            // something happened and the file probably isn't deleted. the next time give it another shot
        }
    }
}
The code above removes files in the temp directory that were created or modified more than 8 hours ago.
However, I would suggest another approach. As Fredrik Johansson suggested, you can delete the files created by the user when the session ends. Better is to work with an extra per-user directory, based on the user's session, inside your temp directory. When the session ends you simply delete the directory created for the user.
private const string TEMPDIRPATH = @"C:\mytempdir\";
string tempDirUserPath = Path.Combine(TEMPDIRPATH, HttpContext.Current.User.Identity.Name);

private void removeTempDirUser(string path)
{
    try
    {
        // true: delete the directory together with its contents
        Directory.Delete(path, true);
    }
    catch (Exception)
    {
        // an exception occurred while deleting the directory.
    }
}
Use the cache expiry notification to trigger file deletion:
private static void DeleteLater(string path)
{
    HttpContext.Current.Cache.Add(path, path, null, Cache.NoAbsoluteExpiration, new TimeSpan(0, 8, 0, 0), CacheItemPriority.NotRemovable, UploadedFileCacheCallback);
}

private static void UploadedFileCacheCallback(string key, object value, CacheItemRemovedReason reason)
{
    var path = (string)value;
    Debug.WriteLine(string.Format("Deleting uploaded file '{0}'", path));
    File.Delete(path);
}
ref: MSDN | How to: Notify an Application When an Item Is Removed from the Cache
