Setting timestamps on files/directories is extremely slow - C#

I'm working on a project which requires copying a lot of files and directories while preserving their original timestamps. So I need to make many calls to the target's SetCreationTime(), SetLastWriteTime() and SetLastAccessTime() methods in order to copy the original values from source to target. As profiling showed (screenshot omitted), these simple operations take up to 42% of the total computation time.
Since this limits my whole application's performance tremendously, I'd like to speed things up. I assume that each of these calls opens and closes a new stream to the file/directory. If that's the reason, I'd like to leave the stream open until I have finished writing all attributes. How do I accomplish this? I guess it would require some P/Invoke.
Update:
I followed Lukas' advice and used the WinAPI function CreateFile(..) with the FILE_WRITE_ATTRIBUTES access flag. To P/Invoke it, I created the following wrapper:
public class Win32ApiWrapper
{
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    private static extern SafeFileHandle CreateFile(string lpFileName,
        [MarshalAs(UnmanagedType.U4)] FileAccess dwDesiredAccess,
        [MarshalAs(UnmanagedType.U4)] FileShare dwShareMode,
        IntPtr lpSecurityAttributes,
        [MarshalAs(UnmanagedType.U4)] FileMode dwCreationDisposition,
        [MarshalAs(UnmanagedType.U4)] FileAttributes dwFlagsAndAttributes,
        IntPtr hTemplateFile);

    public static SafeFileHandle CreateFileGetHandle(string path, int fileAttributes)
    {
        return CreateFile(path,
            (FileAccess)(EFileAccess.FILE_WRITE_ATTRIBUTES | EFileAccess.FILE_WRITE_DATA),
            0,
            IntPtr.Zero,
            FileMode.Create,
            (FileAttributes)fileAttributes,
            IntPtr.Zero);
    }
}
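The EFileAccess enum referenced in the cast above is not reproduced here; a minimal assumed subset (these are the standard Win32 access-right constants) would be:

[Flags]
public enum EFileAccess : uint
{
    FILE_WRITE_DATA = 0x0002,       // needed to write the file's content
    FILE_WRITE_ATTRIBUTES = 0x0100  // sufficient on its own to change timestamps
}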
The full enums I used can be found here. This allowed me to do everything while opening the file only once: create the file, apply all attributes, set the timestamps and copy the actual content from the original file:
FileInfo targetFile;
int fileAttributes;
IDictionary<string, long> timeStamps;

using (var hFile = Win32ApiWrapper.CreateFileGetHandle(targetFile.FullName, fileAttributes))
using (var targetStream = new FileStream(hFile, FileAccess.Write))
{
    // copy the file's content into targetStream
    Win32ApiWrapper.SetFileTime(hFile, timeStamps);
}
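The SetFileTime wrapper used above is not shown in the snippet; a minimal sketch of what it might look like (the dictionary keys and the error handling are my assumptions):

[DllImport("kernel32.dll", SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool SetFileTime(SafeFileHandle hFile,
    ref long lpCreationTime, ref long lpLastAccessTime, ref long lpLastWriteTime);

public static void SetFileTime(SafeFileHandle hFile, IDictionary<string, long> timeStamps)
{
    // The values are FILETIME ticks (100 ns intervals since 1601-01-01 UTC),
    // e.g. obtained via DateTime.ToFileTimeUtc(). The key names are hypothetical.
    long creationTime = timeStamps["creation"];
    long lastAccessTime = timeStamps["lastAccess"];
    long lastWriteTime = timeStamps["lastWrite"];

    if (!SetFileTime(hFile, ref creationTime, ref lastAccessTime, ref lastWriteTime))
        throw new System.ComponentModel.Win32Exception(Marshal.GetLastWin32Error());
}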
Was it worth the effort? YES. It reduced computation time by ~40%, from 86s to 51s.
(Profiler screenshots of the results before and after the optimization omitted.)

I'm not a C# programmer and I don't know how those System.IO.FileSystemInfo methods are implemented. But I've made a few tests with the Win32 API function SetFileTime(..), which is what C# will call at some point.
Here is the code snippet of my benchmark loop:
#define NO_OF_ITERATIONS 100000

int iteration;
DWORD tStart;
SYSTEMTIME tSys;
FILETIME tFile;
HANDLE hFile;
DWORD tElapsed;

iteration = NO_OF_ITERATIONS;
GetLocalTime(&tSys);
tStart = GetTickCount();
while (iteration)
{
    tSys.wYear++;
    if (tSys.wYear > 2020)
    {
        tSys.wYear = 2000;
    }
    SystemTimeToFileTime(&tSys, &tFile);
    hFile = CreateFile("test.dat",
                       GENERIC_WRITE,   // FILE_WRITE_ATTRIBUTES
                       0,
                       NULL,
                       OPEN_EXISTING,
                       FILE_ATTRIBUTE_NORMAL,
                       NULL);
    if (hFile == INVALID_HANDLE_VALUE)
    {
        printf("CreateFile(..) failed (error: %d)\n", GetLastError());
        break;
    }
    SetFileTime(hFile, &tFile, &tFile, &tFile);
    CloseHandle(hFile);
    iteration--;
}
tElapsed = GetTickCount() - tStart;
I've seen that the expensive part of setting the file times is the opening/closing of the file. About 60% of the time is used to open the file and about 40% to close it (which needs to flush the modifications to disk). The above loop took about 9s for 10000 iterations.
A little research showed that calling CreateFile(..) with FILE_WRITE_ATTRIBUTES (instead of GENERIC_WRITE) is sufficient to change the time attributes of a file.
This modification speeds things up significantly! Now the same loop finishes within 2s for 10000 iterations. Since that number of iterations is quite small, I made a second run with 100000 iterations to get a more reliable time measurement:
FILE_WRITE_ATTRIBUTES: 5 runs with 100000 iterations: 12.7-13.2s
GENERIC_WRITE: 5 runs with 100000 iterations: 63.2-72.5s
Based on the above numbers, my guess is that the C# methods use the wrong access mode when opening the file to change the file times. Or some other C# behavior slows things down...
So maybe a solution to your speed issue is to implement a DLL which exports a C function that changes the file times using SetFileTime(..)? Or maybe you can even import the functions CreateFile(..), SetFileTime(..) and CloseHandle(..) directly to avoid calling the C# methods?
Good luck!

Related

CLR GC thread behavior: SafeFileHandle unexpectedly finalized

We recently hit some issues that may be related to the GC behavior of the CLR.
The problem I encountered is as follows:
We have a long running stress testing application written in C# that keeps opening file handles on a remote SMB file share (which is Azure Files Service), and uses those handles to perform file system operations like read/write, etc.
Typically we'll keep those handles open for quite a long time, as we use them repeatedly. But sometimes when we try to access one of those opened handles, we find it has already been closed. Here is one sample from the trace logs captured by Process Monitor:
fltmgr.sys!FltpPerformPreCallbacks+0x324
fltmgr.sys!FltpPassThroughInternal+0x8c
fltmgr.sys!FltpPassThrough+0x169
fltmgr.sys!FltpDispatch+0x9e
ntoskrnl.exe!IopCloseFile+0x146
ntoskrnl.exe!ObpDecrementHandleCount+0x9a
ntoskrnl.exe!NtClose+0x3d9
ntoskrnl.exe!KiSystemServiceCopyEnd+0x13
ntdll.dll!ZwClose+0xa
KERNELBASE.dll!CloseHandle+0x17
mscorlib.ni.dll!mscorlib.ni.dll!+0x566038
clr.dll!CallDescrWorkerInternal+0x83
clr.dll!CallDescrWorkerWithHandler+0x4a
clr.dll!DispatchCallSimple+0x60
clr.dll!SafeHandle::RunReleaseMethod+0x69
clr.dll!SafeHandle::Release+0x152
clr.dll!SafeHandle::Dispose+0x5a
clr.dll!SafeHandle::DisposeNative+0x9b
mscorlib.ni.dll!mscorlib.ni.dll!+0x48d9d1
mscorlib.ni.dll!mscorlib.ni.dll!+0x504b83
clr.dll!FastCallFinalizeWorker+0x6
clr.dll!FastCallFinalize+0x55
clr.dll!MethodTable::CallFinalizer+0xac
clr.dll!WKS::CallFinalizer+0x61
clr.dll!WKS::DoOneFinalization+0x92
clr.dll!WKS::FinalizeAllObjects+0x8f
clr.dll!WKS::FinalizeAllObjects_Wrapper+0x18
clr.dll!ManagedThreadBase_DispatchInner+0x2d
clr.dll!ManagedThreadBase_DispatchMiddle+0x6c
clr.dll!ManagedThreadBase_DispatchOuter+0x75
clr.dll!ManagedThreadBase_DispatchInCorrectAD+0x15
clr.dll!Thread::DoADCallBack+0xff
clr.dll!ManagedThreadBase_DispatchInner+0x1d822c
clr.dll!WKS::DoOneFinalization+0x145
clr.dll!WKS::FinalizeAllObjects+0x8f
clr.dll!WKS::GCHeap::FinalizerThreadWorker+0xa1
clr.dll!ManagedThreadBase_DispatchInner+0x2d
clr.dll!ManagedThreadBase_DispatchMiddle+0x6c
clr.dll!ManagedThreadBase_DispatchOuter+0x75
clr.dll!WKS::GCHeap::FinalizerThreadStart+0xd7
clr.dll!Thread::intermediateThreadProc+0x7d
KERNEL32.dll!BaseThreadInitThunk+0x1a
ntdll.dll!RtlUserThreadStart+0x1d
It seems that the handles were closed on the CLR GC finalizer thread. However, our handles are opened in the following pattern, which should not be GC'ed:
We use P/Invoke to open a file handle and obtain a SafeFileHandle and use that SafeFileHandle to construct a FileStream, and we’ll save the FileStream object in another object defined as follows:
public class ScteFileHandle
{
    /// <summary>
    /// local file handle
    /// </summary>
    [NonSerialized]
    public FileStream FileStreamHandle;

    /*
     * Some other fields
     */
}
The P/Invoke we use:
[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
public static extern SafeFileHandle CreateFile(
string lpFileName,
Win32FileAccess dwDesiredAccess,
Win32FileShare dwShareMode,
IntPtr lpSecurityAttributes,
Win32FileMode dwCreationDisposition,
Win32FileAttributes dwFlagsAndAttributes,
IntPtr hTemplateFile);
SafeFileHandle fileHandle = Win32FileIO.CreateFile(fullFilePath, win32FileAccess, win32FileShare, IntPtr.Zero, win32FileMode, win32FileAttr, IntPtr.Zero);
FileStream fileStream = new FileStream(fileHandle, fileAccess, Constants.XSMBFileSectorSize);
One thing we're sure of is that during the whole lifetime of our stress-testing application we definitely keep a reference to the ScteFileHandle object, so it should never be cleaned up by the GC. However, we have observed the SafeHandle referenced by the ScteFileHandle's FileStream being finalized on the CLR GC thread, as shown in the trace log above.
So I'm wondering what caused the SafeFileHandle to be GC'ed, and whether there is any approach to avoid this. I'm not familiar with CLR GC behavior, but from my perspective the SafeFileHandle is not supposed to be GC'ed.
Any pointer or insight is greatly appreciated! Please let me know if you need any other details to diagnose this issue. :)

SetConsoleActiveScreenBuffer does not display screen buffer

I am currently trying to write a console application in C# with two screen buffers, which should be swapped back and forth (much like VSync on a modern GPU). Since the System.Console class does not provide a way to switch buffers, I had to P/Invoke several methods from kernel32.dll.
This is my current code, grossly simplified:
static void Main(string[] args)
{
    IntPtr oldBuffer = GetStdHandle(-11); // Gets the handle for the default console buffer
    IntPtr newBuffer = CreateConsoleScreenBuffer(0, 0x00000001, IntPtr.Zero, 1, 0); // Creates a new console buffer

    /* Write data to newBuffer */

    SetConsoleActiveScreenBuffer(newBuffer);
}
The following things occurred:
The screen remains empty, even though it should be displaying newBuffer.
When writing to oldBuffer instead of newBuffer, the data appears immediately. Thus, my way of writing into the buffer should be correct.
Upon calling SetConsoleActiveScreenBuffer(newBuffer), the error code is 6, which means invalid handle. This is strange, as the handle is not -1, which the documentation describes as invalid.
I should note that I have only rarely worked with the Win32 API directly and have very little understanding of common Win32-related problems. I would appreciate any sort of help.
As IInspectable points out in the comments, you're setting dwDesiredAccess to zero. That gives you a handle with no access permissions. There are some edge cases where such a handle is useful, but this isn't one of them.
The only slight oddity is that you're getting "invalid handle" rather than "access denied". I'm guessing you're running Windows 7, so the handle is a user-mode object (a "pseudohandle") rather than a kernel handle.
At any rate, you need to set dwDesiredAccess to GENERIC_READ | GENERIC_WRITE as shown in the sample code.
Also, as Hans pointed out in the comments, the declaration on pinvoke.net was incorrect, specifying the last argument as a four-byte integer rather than a pointer-sized value. I believe the correct declaration is:
[DllImport("kernel32.dll", SetLastError = true)]
static extern IntPtr CreateConsoleScreenBuffer(
uint dwDesiredAccess,
uint dwShareMode,
IntPtr lpSecurityAttributes,
uint dwFlags,
IntPtr lpScreenBufferData
);
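Putting both fixes together, a minimal sketch of the corrected call (the constant values are the standard Win32 definitions):

const uint GENERIC_READ  = 0x80000000;
const uint GENERIC_WRITE = 0x40000000;

IntPtr newBuffer = CreateConsoleScreenBuffer(
    GENERIC_READ | GENERIC_WRITE, // dwDesiredAccess: no longer zero
    0x00000001,                   // dwShareMode: FILE_SHARE_READ
    IntPtr.Zero,                  // default security attributes
    1,                            // dwFlags: CONSOLE_TEXTMODE_BUFFER
    IntPtr.Zero);                 // reserved; must be a pointer-sized zero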

Server slowdown when copying files

I have a C# program which queries a database (server3) to determine the files a user is after, then copies those files from server1 to server2.
To simplify that further
C# application is executed on a desktop computer
Original files are on server1
Files are to be copied to server2
Server3 contains the database
When I run this program on my desktop, everything works fine except for server1, which seems to almost grind to a halt after about 5 minutes, even though the copying process itself continues working fine beyond that point. Any other application or user who tries to connect to that server can't:
they just get a spinning cursor, which only stops if I stop running the program on my desktop. For the first 5 minutes of the copying process, everything is fine for everyone. Beyond the 5-minute mark, the files continue to copy, but that's when others start experiencing connection problems to server1.
I have even tried using Sleep, as I assumed the slowdown was caused by too much network activity and/or too much disk I/O on server1. Sleep did nothing to help; the same problem continues, so I'm guessing it's happening for some other reason.
I am using code similar to this to copy the files:

while (reader1.Read())
{
    // System.Threading.Thread.Sleep(2000);
    System.IO.File.Copy(source, destination);
}
Why is this happening?
According to this article, the main cause of the slowdown is the use of buffering by the file copy.
On Windows Vista or later, it's possible to avoid using buffering by specifying COPY_FILE_NO_BUFFERING to the CopyFileEx() Windows API function.
You can specify the P/Invoke as follows:
enum CopyProgressResult : uint
{
    PROGRESS_CONTINUE = 0,
    PROGRESS_CANCEL = 1,
    PROGRESS_STOP = 2,
    PROGRESS_QUIET = 3
}

enum CopyProgressCallbackReason : uint
{
    CALLBACK_CHUNK_FINISHED = 0x00000000,
    CALLBACK_STREAM_SWITCH = 0x00000001
}

delegate CopyProgressResult CopyProgressRoutine(
    long TotalFileSize,
    long TotalBytesTransferred,
    long StreamSize,
    long StreamBytesTransferred,
    uint dwStreamNumber,
    CopyProgressCallbackReason dwCallbackReason,
    IntPtr hSourceFile,
    IntPtr hDestinationFile,
    IntPtr lpData);

[Flags]
enum CopyFileFlags : uint
{
    COPY_FILE_FAIL_IF_EXISTS = 0x00000001,
    COPY_FILE_RESTARTABLE = 0x00000002,
    COPY_FILE_OPEN_SOURCE_FOR_WRITE = 0x00000004,
    COPY_FILE_ALLOW_DECRYPTED_DESTINATION = 0x00000008,
    COPY_FILE_COPY_SYMLINK = 0x00000800, // NT 6.0+
    COPY_FILE_NO_BUFFERING = 0x00001000
}

[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
[return: MarshalAs(UnmanagedType.Bool)]
static extern bool CopyFileEx(
    string lpExistingFileName,
    string lpNewFileName,
    CopyProgressRoutine lpProgressRoutine,
    IntPtr lpData,
    ref Int32 pbCancel,
    CopyFileFlags dwCopyFlags);
Then call it like this (substituting your own file names):

int cancel = 0;
CopyFileEx(@"C:\tmp\test.bin", @"F:\test.bin", null, IntPtr.Zero, ref cancel, CopyFileFlags.COPY_FILE_NO_BUFFERING);
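The example above passes null for the progress routine. If you want progress reporting, a minimal sketch of a callback matching the delegate declared above (the method name and output are my own):

static CopyProgressResult ReportProgress(
    long totalFileSize, long totalBytesTransferred, long streamSize,
    long streamBytesTransferred, uint dwStreamNumber,
    CopyProgressCallbackReason dwCallbackReason,
    IntPtr hSourceFile, IntPtr hDestinationFile, IntPtr lpData)
{
    Console.WriteLine("{0} of {1} bytes copied", totalBytesTransferred, totalFileSize);
    return CopyProgressResult.PROGRESS_CONTINUE; // keep copying
}

// Pass it instead of null:
int cancel = 0;
CopyFileEx(@"C:\tmp\test.bin", @"F:\test.bin", ReportProgress, IntPtr.Zero,
    ref cancel, CopyFileFlags.COPY_FILE_NO_BUFFERING);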
It might be worth trying this and seeing if it helps.

How to change a file without affecting the "last write time"

I would like to write some stuff to a file, like:
using (var fs = File.OpenWrite(file))
{
    fs.Write(bytes, 0, bytes.Length);
}
However, this changes the "last write time". I can reset it later by using

File.SetLastWriteTime(file, <old last write time>);

But in the meantime, a FileSystemWatcher has already been triggered.
Now my question: Is it possible to write to a file without altering the "last write time"?
You can achieve it using P/Invoke calls into kernel32.dll.
This PowerShell script from MS TechNet achieves it, and explicitly states that a FileSystemWatcher's events are not triggered.
I have briefly looked into the script; the code is pretty straightforward and can easily be ported to your C# project.
Declaration:
[DllImport("kernel32.dll", SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
public static extern bool SetFileTime(IntPtr hFile, ref long lpCreationTime, ref long lpLastAccessTime, ref long lpLastWriteTime);
The script uses SetFileTime to lock the file times before writing.
private static long fileTimeUnchanged = -1; // 0xFFFFFFFF in both FILETIME dwords, the documented "leave unchanged" sentinel; a C# const cannot be passed by ref, hence a field

This value is passed by reference to the method for lpCreationTime, lpLastAccessTime and lpLastWriteTime:
// assuming fileStreamHandle is an IntPtr with the handle of the opened filestream
SetFileTime(fileStreamHandle, ref fileTimeUnchanged, ref fileTimeUnchanged, ref fileTimeUnchanged);
// Write to the file and close the stream
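Putting the pieces together, a minimal C# sketch of the approach (the SafeFileHandle-based usage and the write itself are my assumptions; per the SetFileTime documentation, 0xFFFFFFFF in both FILETIME dwords, i.e. -1 as a long, means "leave unchanged"):

using System;
using System.IO;
using System.Runtime.InteropServices;

class FileTimeFreezer
{
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool SetFileTime(IntPtr hFile, ref long lpCreationTime,
        ref long lpLastAccessTime, ref long lpLastWriteTime);

    public static void WriteWithoutTouchingTimes(string path, byte[] bytes)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Write))
        {
            // Freeze the times for this handle before writing.
            long unchanged = -1;
            SetFileTime(fs.SafeFileHandle.DangerousGetHandle(),
                ref unchanged, ref unchanged, ref unchanged);

            fs.Write(bytes, 0, bytes.Length);
        }
    }
}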
I don't think it's possible; at least I'm not aware of a way.
Also consider that the "last write time" Is Not Always Updated, which leads to some weird results if you're going to pick (say) some files from a folder based on that value, or rely on that property in some other way. It's not a parameter you can rely on in your development; it's just not reliable by the architecture of the OS.
Instead, simply use a flag: a boolean, if you write and watch from within the same application, or something like a specifically named file, if you write from one application and watch from another.

How do I disable the disk cache in C# by invoking the Win32 CreateFile API with FILE_FLAG_NO_BUFFERING?

I write a lot of files to disk every second, and I want to disable the disk cache to improve performance. A Google search turned up a solution: the Win32 CreateFile method with FILE_FLAG_NO_BUFFERING, and How to empty/flush Windows READ disk cache in C#?.
I wrote a little bit of code to test whether it would work:
const int FILE_FLAG_NO_BUFFERING = unchecked((int)0x20000000);

[DllImport("KERNEL32", SetLastError = true, CharSet = CharSet.Auto, BestFitMapping = false)]
static extern SafeFileHandle CreateFile(
    String fileName,
    int desiredAccess,
    System.IO.FileShare shareMode,
    IntPtr securityAttrs,
    System.IO.FileMode creationDisposition,
    int flagsAndAttributes,
    IntPtr templateFile);

static void Main(string[] args)
{
    var handler = CreateFile(@"d:\temp.bin", (int)FileAccess.Write, FileShare.None,
        IntPtr.Zero, FileMode.Create, FILE_FLAG_NO_BUFFERING, IntPtr.Zero);
    var stream = new FileStream(handler, FileAccess.Write, BlockSize); // BlockSize = 4096
    byte[] array = Encoding.UTF8.GetBytes("hello,world");
    stream.Write(array, 0, array.Length);
    stream.Close();
}
When I run this program, I get an exception: "IO operation will not work. Most likely the file will become too long or the handle was not opened to support synchronous IO operations."
Later I found this article: When you create an object with constraints, you have to make sure everybody who uses the object understands those constraints. But I couldn't fully understand it, so I changed my code to test:
var stream = new FileStream(handler, FileAccess.Write, 4096);
byte[] ioBuffer = new byte[4096];
byte[] array = Encoding.UTF8.GetBytes("hello,world");
Array.Copy(array, ioBuffer, array.Length);
stream.Write(ioBuffer, 0, ioBuffer.Length);
stream.Close();
This runs OK, but I only want the "hello,world" bytes written, not the whole buffer. Changing the block size to 1 or another integer that is not a multiple of 512 gives the same error. I also tried the Win32 WriteFile API and got the same error. Can someone help me?
The CreateFile() function in no-buffering mode imposes strict requirements on what may and may not be done. Having a buffer of a certain size (a multiple of the device sector size) is one of them.
Note that you can only improve file writes this way if you use buffering in your own code. If you want to write just 10 bytes without buffering, no-buffering mode won't help you.
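To illustrate the sector-size requirement: if you buffer in your own code, round the write length up to a multiple of the sector size and pad, which is what the asker's second snippet effectively does. A minimal sketch, assuming 4096-byte sectors and reusing the stream from the snippet above:

const int SectorSize = 4096; // in real code, query this via GetDiskFreeSpace()

byte[] data = Encoding.UTF8.GetBytes("hello,world");

// Round up to the next multiple of the sector size; the rest stays zero-padded.
int alignedLength = (data.Length + SectorSize - 1) / SectorSize * SectorSize;
byte[] alignedBuffer = new byte[alignedLength];
Array.Copy(data, alignedBuffer, data.Length);

stream.Write(alignedBuffer, 0, alignedBuffer.Length);
// If the trailing zeros matter, reopen the file without FILE_FLAG_NO_BUFFERING
// afterwards and call SetLength(data.Length) to trim the file.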
If I understood your requirements correctly, this is what I'd try first (see the sketch after this list):
Create a queue of objects that hold the data in memory and the target file on disk.
Start by writing the files into memory only; then, on another thread, start going through the queue, opening I/O-completion-port-based FileStream handles (isAsync = true). Just don't open too many of them at once, as at some point you'll probably start losing performance due to cache thrashing etc. You need to profile to see what amount is optimal for your system and SSDs.
After each open, you can use the asynchronous FileStream Begin... methods to start writing data from memory to the files. The isAsync flag imposes some requirements of its own, so this may not be as easy to get working in every corner case as using a FileStream normally.
As to whether using one thread to create the files and another to write to them via the async API brings any improvement: that may only be the case if creating/opening the files can block. SSDs perform various things internally to keep access to data fast, so when you start doing this sort of extreme performance work, there may be pronounced differences between SSD controllers. It's also possible that if the controller drivers aren't well implemented, Windows may start to feel sluggish or freeze. Hardware benchmark sites don't really stress this particular scenario (e.g. create and write x KB into a million files ASAP), and no doubt there are drivers out there that are slower than others.
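A minimal sketch of that queued-writes idea (the names, sizes and paths here are assumptions, not a tuned implementation):

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class QueuedWriter
{
    static void Main()
    {
        var queue = new BlockingCollection<Tuple<string, byte[]>>();

        // Consumer: drain the queue on another thread, writing asynchronously
        // through an overlapped (isAsync = true) FileStream handle.
        var writer = Task.Run(async () =>
        {
            foreach (var item in queue.GetConsumingEnumerable())
            {
                using (var fs = new FileStream(item.Item1, FileMode.Create, FileAccess.Write,
                    FileShare.None, 4096, useAsync: true))
                {
                    await fs.WriteAsync(item.Item2, 0, item.Item2.Length);
                }
            }
        });

        // Producer: build file contents in memory and enqueue them.
        queue.Add(Tuple.Create(@"d:\out1.bin", new byte[8192]));
        queue.Add(Tuple.Create(@"d:\out2.bin", new byte[8192]));
        queue.CompleteAdding();

        writer.Wait();
    }
}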
