How to read the contents of an entire disk bit by bit


I've got a flash card, and I need to compute a checksum over the entire contents of the drive.
If I could acquire a stream to the entire drive I could just read it in bit by bit.
Does anyone know if there is an API for doing this?
Everything I see so far requires me to open a file.
Is there any way to just read an entire drive's contents bit by bit?

If you want to write C# code, then you'll have to use P/Invoke to read data from your disk (RAW access).
Is there any way to just read an entire drive's contents bit by bit?
You'll have to distinguish between the drive (the logical representation of your flash card, with a file system installed on it, identified by the drive letter) and the disk (the physical representation of your flash card, identified by the disk number).
See my previous answer about how to read RAW data from a drive/disk:
Basically, you'll first need a handle to the disk/drive:
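// For a DISK (physical device, addressed by its disk number):
IntPtr hDisk = NativeMethods.CreateFile(
    string.Format("\\\\.\\PhysicalDrive{0}", diskNumber),
    GenericRead,        // desired access
    Read | Write,       // share mode
    IntPtr.Zero,        // security attributes
    OpenExisting,       // creation disposition
    0,                  // flags and attributes
    IntPtr.Zero);       // template file
// For a DRIVE (logical volume, addressed by its drive letter):
IntPtr hDrive = NativeMethods.CreateFile(
    string.Format("\\\\.\\{0}:", DriveLetter),
    GenericRead,
    Read | Write,
    IntPtr.Zero,
    OpenExisting,
    0,
    IntPtr.Zero);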
Then use SetFilePointerEx (so you can move the offset where you want to read), ReadFile (fills a buffer with bytes read from the disk/drive), CloseHandle (closes the handle opened by CreateFile).
Read the disk/drive by chunks (so basically, a loop from offset "0" to offset "disk/drive size").
Important (otherwise ReadFile will always fail): the size of each chunk you read must be a multiple of the sector size of your disk (generally 512 bytes).
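The chunked read loop described above can be sketched in managed code once a readable stream over the disk or drive exists. This is only a minimal sketch, not the answer's own code: the names RawChecksum and HashInSectorChunks, the 512-byte default and the choice of SHA-256 as the checksum are assumptions made for illustration, and obtaining the stream from the handle returned by CreateFile is left out.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class RawChecksum
{
    // Hashes a stream from its current position to the end, always requesting
    // a chunk whose size is a multiple of sectorSize (hypothetical helper).
    public static byte[] HashInSectorChunks(Stream source, int sectorSize = 512, int sectorsPerChunk = 128)
    {
        var buffer = new byte[sectorSize * sectorsPerChunk];
        using (var sha = SHA256.Create())
        {
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Feed only the bytes actually read into the running hash.
                sha.TransformBlock(buffer, 0, read, null, 0);
            }
            sha.TransformFinalBlock(buffer, 0, 0);
            return sha.Hash;
        }
    }
}
```

Because the hash is updated chunk by chunk, memory use stays at one buffer regardless of how large the card is.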

I'm not sure if there is direct support for this in .NET, but you can use Platform Invoke to call the Win32 API functions. CreateFile() should be your starting point, as it allows you to get a handle to the physical drive:
You can use the CreateFile function to open a physical disk drive or a volume, which
returns a direct access storage device (DASD) handle that can be used with the
DeviceIoControl function. This enables you to access the disk or volume directly, for
example such disk metadata as the partition table.
The documentation informs you of restrictions (such as requiring administrative privileges) and hints at how to get a physical drive number from the volume letter, etc.

"Stream" is an abstraction that normally assumes the presence of a file system, so conceptually it's not quite as simple as that.
On Windows, you might want to start by looking at the Windows Defragmentation API. Here are a couple of links I found:
MSDN page
Defrag API C# wrappers

Related

Validate very large .zip files (~12 GB) downloaded from FTP using chilkat

I am using chilkat to download large .zip files from an FTP server.
File sizes are usually around 12-13 GB, and after downloading I need to validate that the file is not corrupt.
I've tried using ICSharpCode.SharpZipLib.Zip
like this:
ZipFile zip = new ZipFile(path);
bool isValidZip = zip.TestArchive(true, TestStrategy.FindFirstError, null);
But validation takes a VERY long time or even crashes.
Is there a quicker solution?

If the customer is uploading to FTP, then maybe the customer can also upload a SHA256 hash. For example, if the customer uploads x.zip, then compute the SHA256 of x.zip and also upload x.zip.sha256. Then your application can download both x.zip and x.zip.sha256, and then use Chilkat.Crypt2.HashFile to hash the x.zip and check against x.zip.sha256.
If it's not possible to get an expected hash value, then you might first check the file size against what is on the server. FTP servers can differ in how file information is provided. Older servers will provide human-readable directory listings (LIST command), whereas newer servers (i.e. within the last 10 years) support MLSD. Chilkat will use MLSD if possible. The older FTP servers might provide inaccurate (non-exact) file size information, whereas MLSD will be accurate. You can call the Ftp2.Feat method to check whether MLSD is supported. If so, then you can first validate the size of the downloaded file. If it's not the expected size, then you can skip any remaining validation because you already know it's invalid. (You can set Ftp2.AutoGetSizeForProgress = true, and then Chilkat will not return a success status when MLSD is used and the total number of bytes downloaded is not equal to the expected download size.)
Assuming the byte counts are equal, or if you can't get an accurate byte count, and you don't have an expected hash, then you can test to see if the zip is valid. The first option is to call the Chilkat.Zip.OpenZip method. Opening the .zip will walk the zip's local file headers and central directory headers. Most errors will be caught if the .zip is corrupt. The more comprehensive check is only possible by actually decompressing the data for each file within the zip, and this is probably why SharpZipLib takes so long. The only way to validate the compressed data is to actually do the decompression. Corrupted bytes would likely cause the decompressor to encounter an impossible internal state, which is clearly corruption. Also, the CRC-32 of the uncompressed data is stored in each local file header within the .zip. Checking the CRC-32 requires decompression. SharpZipLib is surely checking the CRC-32 (after it decompresses, and it's probably trying to decompress in memory and runs out of memory). Chilkat.Zip.OpenZip does not check the CRC-32 because it's not decompressing. You can call Chilkat.Zip.Unzip to unzip to the filesystem, and the act of unzipping also checks the CRC-32.
Anyway, you might decide that checking the byte count and being able to call Chilkat.Zip.OpenZip successfully is sufficient for the validation check.
Otherwise, if you're dealing with huge files, it's best to design the validation (using a parallel .sha256 file) into the system architecture.
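The parallel .sha256 check can be sketched with the .NET base class library alone; the answer itself suggests Chilkat.Crypt2.HashFile, so this is a swapped-in illustration rather than the Chilkat approach. The names SidecarCheck, Sha256Hex and Matches are hypothetical, and the sketch assumes the sidecar file holds the hex digest as its first whitespace-separated token.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class SidecarCheck
{
    // Streams the file through SHA-256, so memory use stays constant
    // even for a 12 GB archive.
    public static string Sha256Hex(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            byte[] hash = sha.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Assumption: the sidecar's first token is the expected hex digest.
    public static bool Matches(string zipPath, string sidecarPath)
    {
        string[] tokens = File.ReadAllText(sidecarPath)
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length == 0) return false;
        return string.Equals(tokens[0], Sha256Hex(zipPath), StringComparison.OrdinalIgnoreCase);
    }
}
```

Hashing still reads every byte of the download once, but it avoids decompression entirely, which is the expensive part of a full zip test.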

Some FTP servers have implemented hash commands (see Appendix B). Issue HELP on ftp prompt in order to get a list of all available commands and see if your server supports a hash command. Otherwise you must stick to zip testing.

Save/Use images without allocating 300mb memory

I am building a Windows project in .NET 4.0 C#.
Right now I save images to the hard drive, and that takes hardly any memory because I only load an image when I want to view it. But now I need to stop creating image files on the hard drive and store them some other way, such as writing each one to a memory stream, saving it in an object and serializing that to the hard drive. The important part is that I can't have the images visible on the hard drive; they must be encrypted, or inside an object, or something similar.
So, when I tried to put each image in a memory stream, save it to a list and then serialize the list to the drive, I got a HUGE memory allocation: every image I create and keep as a memory stream in my list stays allocated, and my program grows to over 200-300 MB.
I really don't have any idea how to do this. Can I somehow save to a memory stream without allocating that memory in the program? Or can I save the pictures some other way without having them fully visible as images on the hard drive?
The main thing, as I said, is that I can't have the images as regular images on the hard drive; the user must not be able to view them without the application. And I need to find a way that doesn't allocate all of the computer's memory.
Thank you in advance!

save it to memory stream and not allocate that memory in the program
No. If it's in a memory stream, it is obviously in RAM.
So you need to either store the image entirely in RAM, or save it to disk.
If you don't want it to be viewable on the disk, then you need to encrypt the file so that the user can't view it outside of your application. How heavy the encryption is depends on how hard you want it to be to crack. If you don't care that much, then very basic XOR encryption will be very fast and not increase the size of the file. If you do care, then you want to use something like 3DES.

File access is built on the principle of streams, which can be plugged together in a chain. Instead of reading and writing the images directly from and to disk through a FileStream, you can plug a CryptoStream in between.
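A minimal sketch of such a chain follows. The class and method names are hypothetical, AES is an assumed choice of cipher (the answers above mention XOR and 3DES instead), and key management is left out entirely. Passing a FileStream as the destination of Encrypt, or as the source of Decrypt, gives the file-on-disk case.

```csharp
using System.IO;
using System.Security.Cryptography;

static class EncryptedImageStore
{
    // Reads plaintext from source and writes ciphertext to destination.
    // Disposing the CryptoStream writes the final padded block and also
    // closes the destination stream.
    public static void Encrypt(Stream source, Stream destination, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var encryptor = aes.CreateEncryptor(key, iv))
        using (var crypto = new CryptoStream(destination, encryptor, CryptoStreamMode.Write))
        {
            source.CopyTo(crypto); // copies in small chunks, not all at once
        }
    }

    // Reads ciphertext from source and writes plaintext to destination.
    public static void Decrypt(Stream source, Stream destination, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var decryptor = aes.CreateDecryptor(key, iv))
        using (var crypto = new CryptoStream(source, decryptor, CryptoStreamMode.Read))
        {
            crypto.CopyTo(destination);
        }
    }
}
```

Only one image needs to be decrypted into memory at a time, when it is about to be displayed, so the working set stays close to what the original file-based approach used.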

You can use a GZipStream and a CryptoStream to make the pictures both smaller and encrypted.
This article shows you exactly how:
http://www.liensberger.it/web/blog/?p=33

Is there an addressing issue for writing an FTP client that needs to upload files larger than 4 gigs?

If my FTP client intends to upload files over 4 gigs in size, assuming I'm streaming the data, my pointer is going to hit the wall at around 4 gigs, if it's a 32 bit pointer, right? I'm trying to imagine what's going on behind the scenes and am not able to visualize how this could work... however it MUST work, since I have downloaded files larger than this in the past.
So, my question is twofold: what happens on the client (and does it need to be a 64-bit client on a 64-bit machine), and what happens on the server (and does IT also have to be a 64-bit machine)?
I realize that the file will be broken into smaller files for transmission, but isn't the program going to explode just trying to address the parts of the file beyond the 4,294,967,295 mark?
I think this is a related post, but I'm not sure what conclusion they come to. The answers seem to point both to the limitations of the pointer (in their case PERL) and the OS.
Why can't my Perl program create files over 4 GB on Windows?

The client or server should read the data in chunks (I would do a multiple of the page size or something similar) and write the chunks to disk. There is no need to have the whole file in RAM all at once.
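Something like this sketch (error checking and similar omitted) on the receiving end, assuming socket is a connected Socket and file is an open FileStream:
byte[] chunk = new byte[4096];
int size;
// Receive returns 0 once the sender has closed the connection.
while ((size = socket.Receive(chunk)) > 0) {
    file.Write(chunk, 0, size);
}
So the above sample is for the server; the client would do something similar:
byte[] chunk = new byte[4096];
int size;
// Read returns 0 at the end of the file.
while ((size = file.Read(chunk, 0, chunk.Length)) > 0) {
    socket.Send(chunk, size, SocketFlags.None);
}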
EDIT:
To address your comment: one thing you have to keep in mind is that the offset in the file isn't necessarily 32-bit on a 32-bit system. It can be 64-bit, since it is not actually a pointer; it is simply an offset from the beginning of the file. If the OS supports 64-bit offsets (and modern Windows/Linux/OS X all do), then you don't have to worry about it. As noted elsewhere, the filesystem the OS is trying to access is also a factor, but I figure if you have a file that is greater than 4 GB, then it is clearly on a filesystem that supports it ;-).

I think your confusion may stem from the overloaded use of the word "pointer". A file's current position pointer is not the same as a pointer to an object in memory. Modern 32-bit OSes support 64-bit file pointers just fine.
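To make the distinction concrete, here is a small sketch (the names ChunkedCopy and Copy are made up for illustration). In .NET, Stream.Position and Stream.Length are declared as long, a 64-bit integer, regardless of whether the process is 32-bit or 64-bit, so a running byte count kept in a long can pass the 4,294,967,295 mark while the only memory ever addressed is one small buffer.

```csharp
using System;
using System.IO;

static class ChunkedCopy
{
    // Copies source to destination one chunk at a time and returns the
    // number of bytes copied. The count is a 64-bit long; the buffer is
    // the only memory the copy needs, whatever the file size.
    public static long Copy(Stream source, Stream destination, int chunkSize = 4096)
    {
        var chunk = new byte[chunkSize];
        long total = 0;
        int read;
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
        {
            destination.Write(chunk, 0, read);
            total += read;
        }
        return total;
    }
}
```

The same loop works whether the streams are a FileStream and a NetworkStream or anything else derived from Stream.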

Whether the client is 32-bit or 64-bit has nothing to do with file size. A 32-bit OS supports files larger than 4 GB; the only requirement is that the underlying file system supports them. FAT16 and FAT32 do not support files of 4 GB or more (FAT32 tops out at 4 GB minus one byte), whereas NTFS does.
Most programming SDKs support 64-bit file offsets, even on a 32-bit operating system. So even if you have a 32-bit server and client, you can still transfer a file larger than 4 GB.
The file position is kept as a long integer, see http://www.cplusplus.com/reference/clibrary/cstdio/ftell/. A long is 8 bytes on most 64-bit Unix-like systems, but only 4 bytes on Windows and on most 32-bit systems, where 64-bit variants such as ftello or _ftelli64 are needed for large files.
However, if your SDK or OS only supports 32-bit file pointers, then you have a problem.

Encrypting files in resource constrained mobile devices

So the basic question is in encrypting files in resource constrained devices.
I have used a rather dangerous approach to use two FileStreams, where
FileStream 1 is reading from the file and copying it to a byte array
The contents of the byte array is encrypted.
FileStream 2, writes the bytes back to the same file.
This works fine, but there is a high risk of corrupting the file if the encryption stops halfway.
So the normal approach is to write to a temp file and then move it to the original location, replacing the original file.
However, the problem is that on mobile phones resources (especially storage) are very limited, so creating, let's say, another 200 MB or 300 MB file may be impossible.
So what approaches are there to handle this problem on mobile devices? Do I have to gamble between space and messing up the file?

One way to make the process a little safer, could be to:
FileStream 1 is reading from the file and copying it to a byte array
The bytes you read are written to a small "scratch" file the same size as your buffer, along with the position of the last block successfully read.
The contents of the byte array is encrypted.
FileStream 2, writes the bytes back to the same file.
If the process is interrupted, check in the scratch file to see where your last position was. Then you can re-start the process from there, and still be able to encrypt the whole file. (And if you wanted to get the original file back, you would encrypt the remaining blocks, then decrypt it).
Of course, this process only works as described if your encryption algorithm does not rely on the result of the preceding blocks when encrypting the current block. Depending on your choice of algorithm, you might need to store a little bit more.
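The scratch-file steps above could be sketched roughly as follows. All names are hypothetical, and the sketch rests on loud assumptions: the transform is size-preserving and treats every block independently (a real cipher mode may need extra state saved, as noted above), and replacing the small journal file via a temporary file is treated as atomic.

```csharp
using System;
using System.IO;

static class InPlaceTransformer
{
    // transform must rewrite the first `count` bytes of the buffer in place.
    public static void Run(string dataPath, string journalPath,
                           Action<byte[], int> transform, int blockSize = 4096)
    {
        var buffer = new byte[blockSize];
        using (var data = new FileStream(dataPath, FileMode.Open, FileAccess.ReadWrite))
        {
            long position = 0;
            if (File.Exists(journalPath))
            {
                // A leftover journal means an earlier run was interrupted:
                // put the in-flight block back and resume from that block.
                using (var reader = new BinaryReader(File.OpenRead(journalPath)))
                {
                    position = reader.ReadInt64();
                    byte[] original = reader.ReadBytes(reader.ReadInt32());
                    data.Position = position;
                    data.Write(original, 0, original.Length);
                    data.Flush(true);
                }
            }
            data.Position = position;
            int read;
            while ((read = Fill(data, buffer)) > 0)
            {
                SaveJournal(journalPath, position, buffer, read); // original bytes first
                transform(buffer, read);
                data.Position = position;
                data.Write(buffer, 0, read);                      // then overwrite in place
                data.Flush(true);
                position += read;
            }
        }
        File.Delete(journalPath); // finished: no journal means nothing to resume
    }

    // Reads until the buffer is full or the stream ends.
    static int Fill(Stream stream, byte[] buffer)
    {
        int total = 0, n;
        while (total < buffer.Length &&
               (n = stream.Read(buffer, total, buffer.Length - total)) > 0)
            total += n;
        return total;
    }

    // Writes the journal to a temp file, then swaps it into place.
    static void SaveJournal(string journalPath, long position, byte[] block, int count)
    {
        string temp = journalPath + ".tmp";
        using (var stream = new FileStream(temp, FileMode.Create, FileAccess.Write))
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(position);
            writer.Write(count);
            writer.Write(block, 0, count);
            writer.Flush();
            stream.Flush(true);
        }
        if (File.Exists(journalPath)) File.Replace(temp, journalPath, null);
        else File.Move(temp, journalPath);
    }
}
```

The extra storage needed is one block plus a few bytes rather than a second copy of the file. Note that calling Run again after a run that completed would transform the data a second time, so the caller has to track completion separately.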

First of all, you can always check if there is enough space to write your array to a tmp file.
Next, the problem you describe is not a real problem: if you're encrypting, you have read the complete file into an array. Once encryption is finished, you can be sure that the byte array is encrypted. If this were not the case, the function would throw an exception. So, in step 3, when you write to the file, you can overwrite it.
edit
I now realize that you encrypt and write to file partially since otherwise it wouldn't fit into ram. Is that correct?

Do I have to gamble between space and messing up the file?
Basically, Yes.
If space-constraints force you to convert (encrypt) in-place, there is no rollback option.
The next problem is size. If your conversion can increase the size of the data, you have very limited room to maneuver. If ResultSize > (InputSize + Buffer), then you're not going to succeed.
In the case of encryption, you can put a compression stream (such as GZipStream) in front of the CryptoStream, but you won't be able to predict whether it's going to work.
In short, on a mobile device you have reached a limit. You will have to mandate an extra memory device.

Editing raw data of a FAT drive

I am trying to edit the raw data of a FAT drive with C# (I think I found a solution for NTFS, but it didn't work for FAT; I have nothing against FAT, it's just that all my devices use it). The result should be a drive in a different format: my own format. I was able to read the raw data from it (it was nice seeing the FAT from the inside) using CreateFile and opening a stream on the IntPtr I got, but I couldn't write to it.
I tried several computers, USB flash drives, SD cards, floppy disks - nothing.
If it isn't possible with C#, I can do it in another language and later call the function using DllImport.
Thanks.

If you edit/modify the drive at the sector level, it may no longer be fully compatible.
The standard way is to make one big file that fills all the space and then operate on those sectors.
Since your goal is space, FAT is actually not efficient. If you control both ends (read/write), you can just change sector 0 so that it is not recognized as an existing file system, and then you can write your own sectors.
Windows would nag you at insertion that the drive is not formatted.
