Dump a process memory to file / recreate process from dump file - C#

Just curious, maybe someone knows a way:
Is it possible, with an open process (app domain), to dump its entire memory space to a file, send it over the wire to a LAN workstation, and recreate the process there exactly as it was on the first computer?
Assumptions:
the application exists on both computers;
the process is not creating any local settings/temporary files;
the OS is the same on both computers;

If you want to do this, you have to recreate the same environment for the "dumped" process to run in. Among other things:
You have to provide the same handles in the same state (process, thread, file, etc.)
The new environment must have the same memory addresses allocated (including runtime allocations) as the previous one had
All the libraries must be initialized and put into the same state
If you have a GUI, even the GPU must be in the same state (you have to preload all graphics resources, etc.)
And there is much more to take care of.

This is what's involved on Linux:
http://www.cs.iit.edu/~scs/psfiles/dsn08_dccs.pdf
Not exactly easy.
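For the "dump to a file" half only, here is a minimal sketch that P/Invokes MiniDumpWriteDump from dbghelp.dll. It produces a standard Windows minidump useful for offline debugging, not an image you can resume as a running process, and the dump type and output path are just assumptions for illustration:
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

static class DumpSketch
{
    [DllImport("dbghelp.dll", SetLastError = true)]
    static extern bool MiniDumpWriteDump(IntPtr hProcess, uint processId,
        SafeFileHandle hFile, uint dumpType, IntPtr exceptionParam,
        IntPtr userStreamParam, IntPtr callbackParam);

    static void Main()
    {
        var proc = Process.GetCurrentProcess();
        // Hypothetical output path; 0x2 is MiniDumpWithFullMemory.
        using (var fs = new FileStream("proc.dmp", FileMode.Create))
        {
            if (!MiniDumpWriteDump(proc.Handle, (uint)proc.Id, fs.SafeFileHandle,
                    0x2, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero))
                Console.WriteLine("Dump failed: " + Marshal.GetLastWin32Error());
        }
    }
}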

Related

When using FileStream FileMode.Append, does it overwrite what is lying next to the file?

Let's assume that exactly 1 byte after file 1's EOF, another file (file 2) starts.
If I open up file 1 and use FileStream FileMode.Append, does it overwrite file 2, or does it make another copy at a place where there is enough space?
Thanks and regards!
Edit:
For everyone who comes after me: I forgot that there is a file system, which splits files into chunks, making this question moot!
You appear to be laboring under the misapprehension that files are stored sequentially on disk and that extending one file might overwrite parts of another. This doesn't happen when you append via a FileStream in C#. The operating system will write the bytes you add however it likes, wherever it likes (and it likes not to overwrite other files), which is how files end up broken into smaller chunks scattered all over the disk (and why defragging is a thing). None of this is of any concern to you, because the OS presents those scattered fragments as a single contiguous stream of bytes to any program that wants to read them.
Of course, if you wrote a program that bypassed the OS and performed low-level disk access, located the end of the file and then blindly wrote more bytes into the locations after it, you would end up damaging other files, and even the OS's carefully curated filesystem... but a .NET file stream won't make that possible.
TL;DR: add your bytes and don't worry about it. Keeping the filesystem in order is not your job.
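For illustration, a minimal append sketch (the file name is just an example); the OS allocates whatever new clusters it needs, and nothing adjacent on disk is touched:
using System.IO;
using System.Text;

byte[] extra = Encoding.UTF8.GetBytes("appended line\n");
// FileMode.Append opens (or creates) the file and positions the stream at its end.
using (var fs = new FileStream("file1.txt", FileMode.Append, FileAccess.Write))
{
    fs.Write(extra, 0, extra.Length);
}
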
If I open up file 1 and use FileStream FileMode.Append, does it overwrite file 2, or does it make another copy at a place where there is enough space?
Thankfully no.
Here's a brief overview of why:
Your .NET C# code does not have direct OS-level interaction.
Your code is compiled into byte-code (IL), which the .NET runtime JIT-compiles and executes at runtime.
The .NET runtime that executes your byte-code is itself built mostly in a combination of C#, C and C++.
The runtime acquires what it calls SafeHandles, which are wrappers around the file handles provided by the Win32 API (or whatever OS-level provider of file handles your architecture runs on).
The runtime uses these handles to read and write data through the OS-level API.
It is the OS's job to ensure that changes made to yourfile.txt, through the handle it has provided to the runtime, affect only that file.
Files are not generally stored in memory, and as such are not subject to buffer overflows.
The runtime may use an in-memory buffer to, well, buffer your reads and writes, but that is implemented by the runtime and has no effect on the file or the operating system.
Any attempt to overflow this buffer is guarded against by the runtime itself, and execution of your code will stop. Even if an overflow of this buffer somehow succeeded, no extra bytes would be written to the underlying handle; instead, the runtime would most likely stop with a memory access violation or other unspecified behaviour.
The handle you're given is little more than a token that the OS uses to keep track of which file you want to read or write bytes to.
If you attempt to write more bytes to a file than the architecture allows, most operating systems have safeguards in place to end your process, close the file, or outright raise an interrupt that brings the system down.
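As a small illustration of the SafeHandle point above (the file name is assumed), the managed FileStream exposes the OS handle it wraps:
using System;
using System.IO;
using Microsoft.Win32.SafeHandles;

using (var fs = new FileStream("yourfile.txt", FileMode.OpenOrCreate))
{
    // The runtime wraps the OS file handle in a SafeFileHandle;
    // all reads and writes ultimately go through this handle.
    SafeFileHandle handle = fs.SafeFileHandle;
    Console.WriteLine(handle.IsInvalid ? "invalid handle" : "valid OS handle");
}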

What happens if the computer hangs while persisting a memory-mapped file?

I'm very interested in using managed memory-mapped files available since .NET 4.0.
Check the following statement extracted from MSDN:
Persisted files are memory-mapped files that are associated with a source file on a disk. When the last process has finished working with the file, the data is saved to the source file on the disk. These memory-mapped files are suitable for working with extremely large source files.
My question is: what happens if the computer hangs while persisting a memory-mapped file?
I mean, since memory-mapped files are stored in virtual memory (I understand that this means the page file), maybe the file could be restored from virtual memory and written to the source file again after restarting Windows.
The data pages that underlie a memory-mapped file reside in the OS cache (the file cache). Whenever you shut down Windows, it writes all modified cache pages to the file system.
The pages in the cache are either ordinary file data (from processes doing reads/writes from/to files) or memory-mapped pages, which are read/written by the paging system.
If Windows is unable to flush the cache contents to disk (e.g. it crashes or freezes), then that data is lost.
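To shrink the window for data loss, you can flush the mapped view explicitly instead of waiting for the OS cache; a minimal sketch, with the file name and capacity assumed:
using System.IO;
using System.IO.MemoryMappedFiles;

// Persisted map backed by a real file on disk.
using (var mmf = MemoryMappedFile.CreateFromFile("data.bin", FileMode.OpenOrCreate, null, 4096))
using (var view = mmf.CreateViewAccessor())
{
    view.Write(0, 12345L);   // write a 64-bit value at offset 0
    view.Flush();            // ask the OS to write the dirty pages back to the source file now
}
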
If persistence is enabled, the memory-mapped file is not removed after a reboot.
You can use an atomic update protocol with a flag that indicates whether the data is valid: if it is valid you can restore it, otherwise the data is lost.
If your OS supports it (kernel or filesystem lifetime), as Unix does, you can use shared memory with synchronisation, which is faster than a mapped file.
From Modern Operating Systems, 3rd ed. (2007), on memory-mapped files: Shared libraries are really a special case of a more general facility called memory-mapped files. The idea here is that a process can issue a system call to map a file onto a portion of its virtual address space. In most implementations, no pages are brought in at the time of the mapping, but as pages are touched, they are demand paged in one at a time, using the disk file as the backing store. When the process exits, or explicitly unmaps the file, all the modified pages are written back to the file.
Mapped files provide an alternative model for I/O. Instead of doing reads and writes, the file can be accessed as a big character array in memory. In some situations, programmers find this model more convenient.
If two or more processes map onto the same file at the same time, they can communicate over shared memory. Writes done by one process to the shared memory are immediately visible when the other one reads from the part of its virtual address space mapped onto the file. This mechanism thus provides a high-bandwidth channel between processes and is often used as such (even to the extent of mapping a scratch file). Now it should be clear that if memory-mapped files are available, shared libraries can use this mechanism.
From http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2044.html:
Shared Memory
POSIX defines a shared memory object as "An object that represents memory that can be mapped concurrently into the address space of more than one process."
Shared memory is similar to file mapping, and the user can map several regions of a shared memory object, just like with memory-mapped files. In some operating systems, like Windows, shared memory is a special case of file mapping, where the file mapping object accesses memory backed by the system paging file. However, in Windows, the lifetime of this memory ends when the last process connected to the shared memory object closes its connection or the application exits, so there is no data persistence. If an application creates shared memory, fills it with data and exits, the data is lost. This lifetime is known as process lifetime.
In POSIX operating systems the shared memory lifetime is different since for semaphores, shared memory, and message queues it's mandatory that the object and its state (including data, if any) is preserved after the object is no longer referenced by any process. Persistence of an object does not imply that the state of the object is preserved after a system crash or reboot, but this can be achieved since shared memory objects can actually be implemented as mapped files of a permanent file system. The shared memory destruction happens with an explicit call to unlink(), which is similar to the file destruction mechanism. POSIX shared memory is required to have kernel lifetime (the object is explicitly destroyed or it's destroyed when the operating system reboots) or filesystem persistence (the shared memory object has the same lifetime as a file).
This lifetime difference is important for achieving portability. Many portable runtimes have tried to achieve perfect portability between Windows and POSIX shared memory, but the author of this paper has not seen any satisfactory effort. Adding a reference count to POSIX shared memory is effective only as long as a process does not crash, which is not unusual. Emulating POSIX behaviour in Windows using native shared memory is not possible: we could try to dump shared memory to a file to obtain persistence, but a process crash would prevent persistence. The only viable alternative is to use memory-mapped files in Windows to simulate shared memory, while avoiding file-memory synchronization as much as possible.
Many other named synchronization primitives (like named mutexes or semaphores) suffer the same lifetime portability problem. Automatic shared memory cleanup is useful in many contexts, like shared libraries or DLL-s communicating with other DLL-s or processes. Even when there is a crash, resources are automatically cleaned up by the operating systems. POSIX persistence is also useful when a launcher program can create and fill shared memory that another process can read or modify. Persistence also allows data recovery if a server process crashes. All the data is still in the shared memory, and the server can recover its state.
This paper proposes POSIX lifetime (kernel or filesystem lifetime) as a more portable solution, but has no strong opinion about this. The C++ committee should take into account the use cases of both approaches to decide which behaviour is better or if both options should be available, forcing the modification of both POSIX and Windows systems.
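For contrast with the persisted case, a Windows-style non-persisted mapping (process lifetime) looks like this in .NET; the map name and capacity are just examples, and the data disappears once the last handle is closed:
using System.IO.MemoryMappedFiles;

// Backed by the system paging file, not a disk file: once the last
// process closes its handle, the data is gone (process lifetime).
using (var shared = MemoryMappedFile.CreateNew(@"Local\MySharedRegion", 4096))
using (var view = shared.CreateViewAccessor())
{
    view.Write(0, 42);
    // Another process could call MemoryMappedFile.OpenExisting(@"Local\MySharedRegion")
    // while this handle remains open.
}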

Is it wrong to embed some files in a DLL?

Performance-wise, is it wrong to embed a file in the resource section of a DLL?
This might seem silly, but I am trying to embed some info inside the DLL which can later be fetched by some methods, in case the whole solution and documentation are lost and we have only the DLL.
What are the downsides of doing such a thing?
Is it recommended or discouraged?
Embedded resources are handled very efficiently. Under the hood, this uses the demand-paged virtual memory capabilities of the operating system, the exact equivalent of a memory-mapped file. In other words, the resource is directly accessible in memory, and you don't pay for the resource until you start using it. The first access forces it to be read from the file and copied into RAM, and it is very cheap to unmap again: the operating system can simply discard the page. There is no way to make it more efficient.
The other side of the coin is that it is permanently mapped into virtual memory. In other words, your process loses the address space occupied by the resource. You'll run out of available address space more quickly, and an OutOfMemoryException becomes more likely.
This is not something you normally worry about until you gobble up, say, half a gigabyte in a 32-bit process. And don't fret about it at all in a 64-bit process.
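If the file is embedded as a managed resource (rather than in the native Win32 resource section), reading it back later is straightforward; the resource name below is hypothetical and follows the default <Namespace>.<FileName> convention:
using System;
using System.IO;
using System.Reflection;

var asm = Assembly.GetExecutingAssembly();
// Hypothetical resource name; list the real names with asm.GetManifestResourceNames().
using (Stream s = asm.GetManifestResourceStream("MyLibrary.BuildInfo.txt"))
{
    if (s == null)
    {
        Console.WriteLine("resource not found");
    }
    else
    {
        using (var reader = new StreamReader(s))
            Console.WriteLine(reader.ReadToEnd());
    }
}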

Finding file locks in legacy c# code

I have an interesting problem - I've inherited a large code base (brown field).
The application runs on a schedule, takes in a large number of data files (text), processes them, then exports a report and cleans up.
A bug has been discovered whereby, during clean-up, some files are left in a locked state even though all file activity has long since gone out of scope. This stops the application from being able to delete them during clean-up.
There are literally hundreds of IO and stream objects etc. being used in this application, and I want to know where to start looking so I don't have to review every instance of their use.
What are some good tools for investigating file locks in C# managed code, and how do you use them to do so?
This normally happens when you forget to dispose the parent object that owns a file handle, e.g. you forget to call Close/Dispose on a FileStream. The finalizer will then clean up the file handles during the next full GC, once they are no longer referenced.
You can check with WinDbg whether you have SafeFileHandles in the finalization queue ready for finalization. A profiler that can track such things is e.g. YourKit: with probes enabled it can also search for files closed in the finalizer and give you the creation call stack, which lets you search your code for the offending line.
Check out the Process Inspection tab of YourKit to find the probe check.
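The usual fix is to make ownership explicit with using, so the handle is released deterministically instead of waiting for a finalizer; a minimal sketch with an assumed path:
using System.IO;

// The handle is closed at the end of the block, even if an exception is thrown,
// so the clean-up step can delete the file later.
using (var fs = new FileStream(@"C:\data\input.txt", FileMode.Open, FileAccess.Read))
using (var reader = new StreamReader(fs))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // process the line...
    }
}
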
You can monitor file access (read/write) using ProcMon from SysInternals.
It's not specific to C# but a general tool that can be used for many other things. Note that you can export the results to CSV and investigate them later.
You can use one of the following guides:
Detailed Windows I/O: Process Monitor - How to do simple file monitoring.
Using Process Monitor to Monitor File Access - More detailed guide explaining how to export the results into a csv you can investigate later.
Edit:
I didn't find anything built for this purpose, so if I were you I would inherit from the stream type being used and wrap it with logging logic, as in the sketch below.
This logging stream object, for example named LogStream, would write a log entry on entering each method, call the base.Function(), and write another log entry when done.
This way you can monitor the file access as you wish: for example, give each stream instance an Id using Guid.NewGuid(), log the thread id using System.Threading.Thread.CurrentThread.ManagedThreadId, etc.
That lets you identify the instances and investigate the calls one by one.
A good starting point is to check whether there are equal numbers of stream opens and closes; an exception might have skipped one of the Dispose() calls.
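A rough sketch of that wrapping idea (the LogStream name and logging format are made up here); it derives from FileStream and logs construction and disposal so unmatched open/close pairs stand out:
using System;
using System.IO;
using System.Threading;

// Hypothetical wrapper: logs an id, the thread id and the path on open and on dispose.
class LogStream : FileStream
{
    readonly Guid id = Guid.NewGuid();
    readonly string path;

    public LogStream(string path, FileMode mode) : base(path, mode)
    {
        this.path = path;
        Console.WriteLine($"OPEN  {id} thread={Thread.CurrentThread.ManagedThreadId} {path}");
    }

    protected override void Dispose(bool disposing)
    {
        Console.WriteLine($"CLOSE {id} thread={Thread.CurrentThread.ManagedThreadId} {path}");
        base.Dispose(disposing);
    }
}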

System Wide persistent storage?

My program starts a process and I need to make sure it is killed before I can run the program again. To do this, I'd like to store the start time of the Process in something like a mutex that I could later retrieve and check to see if any process has a matching name and start time.
How could I do this? I don't really want to put anything on the hard drive that will stick around after the user logs out.
For reference, I'm using C# and .NET.
You want to store the process ID, not the process name and start time. That will make it simpler to kill the process.
You can store the file in %TMP% so that it will get cleaned up when hard drive space is low.
C# code to kill the process looks like this:
int pid = Convert.ToInt32(File.ReadAllText(pidFile));
Process proc = Process.GetProcessById(pid);
proc.Kill();
You can find out the %TMP% directory like this:
var tmp = Environment.GetEnvironmentVariable("TMP");
EDIT: The PID can be reused, so you will need to deal with that, too.
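One way to guard against PID reuse is to record the start time alongside the PID and compare both when you come back; a sketch, with the file location and child executable assumed:
using System;
using System.Diagnostics;
using System.Globalization;
using System.IO;

string pidFile = Path.Combine(Path.GetTempPath(), "myapp.pid");

// When launching the child process (hypothetical executable name):
var child = Process.Start("child.exe");
File.WriteAllText(pidFile, $"{child.Id}|{child.StartTime:o}");

// Later, before starting again:
string[] parts = File.ReadAllText(pidFile).Split('|');
int pid = int.Parse(parts[0]);
DateTime started = DateTime.Parse(parts[1], null, DateTimeStyles.RoundtripKind);
try
{
    var proc = Process.GetProcessById(pid);
    if (Math.Abs((proc.StartTime - started).TotalSeconds) < 1)
        proc.Kill();   // same process, not a reused PID
}
catch (ArgumentException)
{
    // No process with that id is running: nothing to kill.
}
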
I agree with rossfabricant that the Process ID should be stored instead of a name. Process IDs are not typically reused quickly, so this should be reasonably safe.
However, I'd recommend against using the TMP environment variable for storage, and instead look at Isolated Storage. This would be more .NET oriented in terms of storage. Also, it would lower the security required to run your application.
Unfortunately, both temporary directories and isolated storage will persist after a logout, though, so you'll need logic to handle that case. (Your app can clean out the info on shutdown, however).
If you have access to the code of process you are starting, it might be better to use something like named pipes or shared memory to detect whether the application is running. This also gives you a much cleaner way to shut down the process. Killing a process should be a last resort - in general, I wouldn't build an application where the design requires killing a process if it was at all avoidable.
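If you do control both programs, a named Mutex held by the child for its lifetime is another common, disk-free way to detect a running instance (a different technique from the pipes/shared memory mentioned above; the mutex name is an example):
using System;
using System.Threading;

// The child process creates and holds this mutex while it runs.
using (var mutex = new Mutex(true, @"Global\MyChildProcessRunning", out bool createdNew))
{
    if (!createdNew)
    {
        Console.WriteLine("Another instance is already running.");
    }
    else
    {
        // ... do the work; the mutex is released (or abandoned) when this process exits.
    }
}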
