I am using Sharpziplib version 0.86 to extract a zip file. It is working fine but while extracting a winzip file through code,Lastwritetime is changing in seconds...
Have used this also File.SetLastWriteTime(fullPath, theEntry.DateTime);
Actual file Lastwritetime:4/8/2010 2:29:03PM
After zipping that file using winzip and while extracting that file using the code,extracted file Lastwritetime changes to 4/8/2010 2:29:04PM...Is there any fix for this???
I got this response from Sharpziplib Forum
Hi
This appears to be a WinZip bug. I have not noticed this before.
I did this test:
1) Use WinZip to add a file to a zip. In WinZip click Properties and Details. Look through the details list and find the file time stamp.
2) Use SharpZipLib to create a similar zip file with the same inputfile. Open the result in Winzip and look at the Properties > Details for the file time stamp.
My input file has a Modified timestamp (file properties) of 2010-12-14 15:51:28 and in my test, SharpZipLib stored it correctly in the zip, while WinZip stored it as 2010-12-14 15:51:30
In other words WinZip added 2 seconds when putting it into the zip.
After extracting (either with WinZip or SharpZip), the Modified is now 15:51:30 instead of the original 15:51:28.
it is amazing that such an obvious bug in WinZip can have gone unreported and unfixed for so long. If you have a paid version you should certainly raise a bug fault with them.
I just remembered something about 2 second granularity in the old 8.3 file system timestamps.
Quick google found this ...
Quote "Original DOS file system had only 32 bytes to represent a file in the directory. The very restrictive 8.3 filename and the limited granularity (2 second) in file date are corrected in the Win32 file systems (VFAT)."
from http://www.xxcopy.com/xxcopy15.htm
The Zip format only allows 2 second granularity in the standard time stamp entry.The date and time are encoded in standard MS-DOS format.
An optional NTFS Extra Data field (0x000a) can be included, which may hold last modification time, last access time and creation time. WinZip does not appear to create it. SharpZip will use it if present but as far as I can see, it is not created when using FastZip to create a zip. That might be a useful option to add to the code. You can certainly create it manually if using ZipFile.
Hope this helps,
David
I think it might just be the operating system that is causing this. I've tried what happens in Explorer. I've got a text file with a modified time stamp of 17:06:45. I right click on the file and choose Send to | Compressed (zipped) folder. Then I right click on the new zip-file and choose Extract All... followed by Next, Next, Finish. Now the extracted text file has a time stamp of 17:06:46.
The same happens when I use 7-Zip, or WinRar. But then it only happens when using a .zip file. If I let them create a .7Z or a .RAR file the time stamp isn't changed.
Found an article on Wikipedia about the zip format. If you search it for "seconds" you'll find a section describing that the ZIP file system mimics the DOS FAT file system, which only has a time resolution of two seconds.
Related
Context:
To save more space, I want to further compress some files in a zip archive with an algorithm, and other files in same archive with another algorithm. Later I need to revert the process to get the original zip archive, because the zip files are owned by users.
How to locate compressed bits of certain files in a zip archive for further processing?
Language: Guess this kind of code is usually C/C++ for performance, but C# is good too.
OS: Windows Server 2012 R2 or later.
Edit:
I learned that in zip(zlib) format, compressed files are organized in blocks. We should be able to locate the files by searching headers. Still checking on how to code it.
The zip file format is fully documented. e.g. http://www.fileformat.info/format/zip/corion.htm
You can find other sources and descriptions easily.
What you have to do is to read bytes according to the format and you have exact knowledge about where a certain file's compressed bits are. You can find open source libraries for this or you can roll your own in your preferred language.
As a side note, compressing an otherwise compressed file in a zip archive may not worth the efforts.
I have googled without any luck.
My question is if there is a way to see when the file has been zipped? I am not asking for when the file has been created or modified rather to see when it has been zipped.
Is there a byte I can read from the zipped file somehow to find out this in code (C#)?
Best regards
As far as I can tell,that information is not stored in the zip file. The only info you can get is the last time that file inside the zip was modified. That datetime could or could not be the first time that file was included in the zip file.
If you want to get the last time the entry in the zip archive was changed, you can use the ZipArchiveEntry.LastWriteTime in the System.IO.Compression Namespace.
If you have any doubt of what info is availble in a zip file, you can check this wikipedia article : Zip File Format
I'm creating an encrypted zip file using DotNetZip using WinZip AES 256. However I'm able to read the directory and even to remove some of the zipentries without having the encryption key.
As far as I understand the directory visibility is a limitation of the Zip format. I just wonder, if this also applies for any changes in removing / adding components to the zip file or does there exist a way for preventing such changes.
EDIT:
A quick read of Zip File Format seems to show, that double zipping seems to be the only solution to prevent random removal / addition of comoponents in a zipfile, regardless of encryption of the single entry.
From the kb of the Winzip last update last updated 20 Feb, 2013:
To hide the names of the files in your encrypted Zip file, you can double zip them. To do this:
So I'll say no :-)
Winrar has an option to encrypt the filenames, sadly the algorithm isn't public.
There is a virus that my brother got in his computer and what that virus did was to rename almost all files in his computer. It changed the file extensions as well. so a file that might have been named picture.jpg was renamed to kjfks.doc for example.
so what I have done in order to solve this problem is:
remove all file extensions from files. (I use a recursive method to search for all files in a directory and as I go through the files I remove the extension)
now the files do not have an extension. the files now look like:
I think this file names are stored in a local database created by the virus and if I purchase the anti virus they will be renamed back to their original name.
since my brother created a backup I selected the files that had a creation date latter than when my brother performed the backup. so I have placed that files in a directory.
I am not interested in getting the right extension as long as I can see the content of the file. for example, I will scan each file and if it has text inside I know it will have a .txt extension. maybe it was a .html or .css extension I will not be able to know that I know.
I belive that all pdf files should have something in common. or doc files should also have something in common. How can I figure what the most common types (pdf, doc, docx, png, jpg, etc) files have in common)
Edit:
I know it will probably take less time to go over all this 200 files and test each one instead of creating this program. it is just that I am curios to see if it will be possible to get the file extension.
In unix, you can use file to determine the type of file. There is also a port for windows and you can obviously write a script (batch, powershell, etc.) or C# program to automate this.
First, congratulate your brother on doing a backup. Many people don't, and are absolutely wiped out by these problems.
You're going to have to do a lot of research, I'm afraid, but you're on the right track.
Open each file with a TextReader or a BinaryReader and examine the headers. Most of them are detectable.
For instance: Every PDF starts with "%PDF-" and then its version number. Just look at those first 5 characters. If it's "%PDF-", then put a PDF on the filename and move on.
Similarly: "ÿØÿà..JFIF" for JPEG's, "[InternetShortcut]" for URL shortcuts, "L...........À......Fƒ" for regular shortcuts (the "." is a zero/null, BTW)
ZIPs / Compressed directories start with {0x50}{0x4B]{0x03}{0x04}{0x14}, and you should be aware that Office 2007/2010 documents are really ZIPs with XML files inside of them.
You'll have to do some digging as you find each type, but you should be able to write something to establish most of the file types.
You'll have to write some recursion to work through directories, but you can eliminate any file with no extension.
BTW - A great tool to help pwith this is HxD: http://www.mh-nexus.de/ It's what I used to pull this answer together!
Good luck!
"most common types" each have it's own format and most of them have some magic bytes at the fixed position near beginning of the file. You can detect most of formats quite easily. Even HTML, XML, .CSS and similar text files can be detected by analyzing their beginning. But it will take some time to write an application that will guess the format. For some types (such as ODF format or JAR format, which are built on top of regular ZIPs) you will be also able to detect this format.
But ... Can it be that there exists such application on the market? I guess you can find something if you search, cause the task is not as tricky as it initially seems to be.
I'm trying to find some lost .jpg pictures. Here's a .bat file to setup a simplified version of my situation
md TestSetup
cd TestSetup
md a
cd a
echo "Can we find this later?" > a.abc
del a.abc
cd..
rd a
What code would be needed to open the text file again? I'm actually looking for .jpeg files that were treated in a similar manner
More details: I'm trying to recover picture files from a previous one-touch backup where the directories and files have been deleted and everything was saved in the backup with a single character name and every file has the same 3 letter extension. There is a current backup but they need to view the previous deleted ones (or at least the .jpg files).
Here's how I was trying to approach it: C# code
To the best of my knowledge, most file recovery tools actually read the low-level filesystem format on the disk and try to piece together deleted files. This works because, at least in FAT, a deleted file still resides in the sector specifying the directory (just with a different first character to identify it as "deleted"). New files may overwrite these deleted entries and therefore make the file unrecoverable. That's just a little bit of theory.
There is a current backup but they
need to view the previous deleted ones
(or at least the .jpg files).
Unless there's a backup for that file at the time that you want to restore from, I believe you're going to have a hard time getting that file without resorting to a low-level filesystem read. And even then, you may be out of luck if enough revisions have been made (or it's not a trivial filesystem like FAT).