I am trying extract the word document, It has embedded files(word,excel,package). I am not able to extract package and save it Using C# Open XML.
The below code just extracts word and excel but not package.
using (WordprocessingDocument document = WordprocessingDocument.Open(fileName, false))
{
foreach (EmbeddedPackagePart pkgPart in document.MainDocumentPart.GetPartsOfType<EmbeddedPackagePart>())
{
if (pkgpart.uri.tostring().startswith(embeddingpartstring))
{
string filename1 = pkgpart.uri.tostring().remove(0, embeddingpartstring.length);
// get the stream from the part
system.io.stream partstream = pkgpart.getstream();
string filepath = "d:\\test\\" + filename1;
// write the steam to the file.
system.io.filestream writestream = new system.io.filestream(filepath, filemode.create, fileaccess.write);
readwritestream(pkgpart.getstream(), writestream);
}
}
}
The issue you're having is, that when you go to MainDocument.Parts and start searching, what you'll get is things like "Imagepart", "ChartPart" etc. where the ChartPart might have it's own embedded part, which could be the Excel or Word file you are looking for.
In short, you need to extend your search for embedded parts, to the actual parts in the mainDocument.
If I just wanted to extract all embedded parts in one of the files from my own project, I would go about it like this.
using (var document = WordprocessingDocument.Open(#"C:\Test\myTestDocument.docx", false))
{
//just grab all the parts, might be relevant to be a bit more clever about it, depending on sizes of files and how many files you want to search through
foreach(var part in document.MainDocumentPart.Parts)
{
//foreach part see if that part containts an EmbeddedPackagePart
var testForEmbedding = part.OpenXmlPart.GetPartsOfType<EmbeddedPackagePart>();
foreach(EmbeddedPackagePart embedding in testForEmbedding)
{
//You should probably insert some clever naming scheme here..
string fileName = embedding.Uri.OriginalString.Split('/').Last();
//stream the EmbeddedPackagePart to a file
using(FileStream myFile = File.Create(#"C:\test\" + fileName))
using (var stream = embedding.GetStream())
{
stream.Seek(0, SeekOrigin.Begin);
stream.CopyTo(myFile);
myFile.Close();
}
}
}
}
I hope this helps!
I have converted a .zip file into a byte[], and now I am trying to convert the byte[] back to the original .zip file. I am running out of the options that I have tried. Anyone give me a pointer how can I achieve this?
You want the System.IO.Compression.ZipArchive class:
using (ZipArchive zip = ZipFile.Open("test.zip", ZipArchiveMode.Create))
{
var entry = zip.CreateEntry("File Name.txt");
using (StreamWriter sw = new StreamWriter(entry.Open()))
{
sw.Write("Some Text");
}
}
using (ZipArchive zip = ZipFile.Open("test.zip", ZipArchiveMode.Read))
{
foreach (ZipArchiveEntry entry in zip.Entries)
{
using (StreamReader sr = new StreamReader(entry.Open()))
{
var result = sr.ReadToEnd();
}
}
}
You probably don't want to read in the raw zip file into a byte array first and then try to decompress it. Instead, access it through this helper method.
Note the use of ZipArchive.Entries to access the sub-files stored in the single zip archive; this tripped me up when first learning to use zip files.
I'm tring to create a program to archieve some files from a folder to a single binary file so I can read files from the binary archieve later.
So I created a archivation method but I don't really know how can I read the files from the binary without unpacking them...
Some code:
public static void PackFiles()
{
using (var doFile = File.Create("root.extension"))
using (var doBinary = new BinaryWriter(doFile))
{
foreach (var file in Directory.GetFiles("Data"))
{
doBinary.Write(true);
doBinary.Write(Path.GetFileName(file));
var data = File.ReadAllBytes(file);
doBinary.Write(data.Length);
doBinary.Write(data);
}
doBinary.Write(false);
}
}
Also, can I set a kind of "password" to the binary file so the archieve can only be unpacked if the password is known?
P.S: I dont need zip :)
I think the best way to go is using ZIP for this.
There is a solid and fast library called dotnetzip
Example using password:
using (ZipFile zip = new ZipFile())
{
zip.Password= "123456!";
zip.AddFile("ReadMe.txt");
zip.AddFile("7440-N49th.png");
zip.AddFile("2005_Annual_Report.pdf");
zip.Save("Backup.zip");
}
Is there anyway in .Net (C#) to extract data from a zip file without decompressing the complete file?
I possibly want to extract data (file) from the start of a zip file if the compression algorithm compress the file used was in a deterministic order.
With .Net Framework 4.5 (using ZipArchive):
using (ZipArchive zip = ZipFile.Open(zipfile, ZipArchiveMode.Read))
foreach (ZipArchiveEntry entry in zip.Entries)
if(entry.Name == "myfile")
entry.ExtractToFile("myfile");
Find "myfile" in zipfile and extract it.
DotNetZip is your friend here.
As easy as:
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
ZipEntry e = zip["MyReport.doc"];
e.Extract(OutputStream);
}
(you can also extract to a file or other destinations).
Reading the zip file's table of contents is as easy as:
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
foreach (ZipEntry e in zip)
{
if (header)
{
System.Console.WriteLine("Zipfile: {0}", zip.Name);
if ((zip.Comment != null) && (zip.Comment != ""))
System.Console.WriteLine("Comment: {0}", zip.Comment);
System.Console.WriteLine("\n{1,-22} {2,8} {3,5} {4,8} {5,3} {0}",
"Filename", "Modified", "Size", "Ratio", "Packed", "pw?");
System.Console.WriteLine(new System.String('-', 72));
header = false;
}
System.Console.WriteLine("{1,-22} {2,8} {3,5:F0}% {4,8} {5,3} {0}",
e.FileName,
e.LastModified.ToString("yyyy-MM-dd HH:mm:ss"),
e.UncompressedSize,
e.CompressionRatio,
e.CompressedSize,
(e.UsesEncryption) ? "Y" : "N");
}
}
Edited To Note: DotNetZip used to live at Codeplex. Codeplex has been shut down. The old archive is still available at Codeplex. It looks like the code has migrated to Github:
https://github.com/DinoChiesa/DotNetZip. Looks to be the original author's repo.
https://github.com/haf/DotNetZip.Semverd. This looks to be the currently maintained version. It's also packaged up an available via Nuget at https://www.nuget.org/packages/DotNetZip/
Something like this will list and extract the files one by one, if you want to use SharpZipLib:
var zip = new ZipInputStream(File.OpenRead(#"C:\Users\Javi\Desktop\myzip.zip"));
var filestream = new FileStream(#"C:\Users\Javi\Desktop\myzip.zip", FileMode.Open, FileAccess.Read);
ZipFile zipfile = new ZipFile(filestream);
ZipEntry item;
while ((item = zip.GetNextEntry()) != null)
{
Console.WriteLine(item.Name);
using (StreamReader s = new StreamReader(zipfile.GetInputStream(item)))
{
// stream with the file
Console.WriteLine(s.ReadToEnd());
}
}
Based on this example: content inside zip file
Here is how a UTF8 text file can be read from a zip archive into a string variable (.NET Framework 4.5 and up):
string zipFileFullPath = "{{TypeYourZipFileFullPathHere}}";
string targetFileName = "{{TypeYourTargetFileNameHere}}";
string text = new string(
(new System.IO.StreamReader(
System.IO.Compression.ZipFile.OpenRead(zipFileFullPath)
.Entries.Where(x => x.Name.Equals(targetFileName,
StringComparison.InvariantCulture))
.FirstOrDefault()
.Open(), Encoding.UTF8)
.ReadToEnd())
.ToArray());
the following code can read specific file as byte array :
using ZipArchive zipArchive = ZipFile.OpenRead(zipFilePath);
foreach(ZipArchiveEntry zipArchiveEntry in zipArchive.Entries)
{
if(zipArchiveEntry.Name.Equals(fileName,StringComparison.OrdinalIgnoreCase))
{
Stream stream = zipArchiveEntry.Open();
using MemoryStream memoryStream = new MemoryStream();
await stream.CopyToAsync(memoryStream);
return memoryStream.ToArray();
}
}
Zip files have a table of contents. Every zip utility should have the ability to query just the TOC. Or you can use a command line program like 7zip -t to print the table of contents and redirect it to a text file.
In such case you will need to parse zip local header entries. Each file, stored in zip file, has preceding Local File Header entry, which (normally) contains enough information for decompression, Generally, you can make simple parsing of such entries in stream, select needed file, copy header + compressed file data to other file, and call unzip on that part (if you don't want to deal with the whole Zip decompression code or library).
I am currently working with SharpZipLib under .NET 2.0 and via this I need to compress a single file to a single compressed archive. In order to do this I am currently using the following:
string tempFilePath = #"C:\Users\Username\AppData\Local\Temp\tmp9AE0.tmp.xml";
string archiveFilePath = #"C:\Archive\Archive_[UTC TIMESTAMP].zip";
FileInfo inFileInfo = new FileInfo(tempFilePath);
ICSharpCode.SharpZipLib.Zip.FastZip fZip = new ICSharpCode.SharpZipLib.Zip.FastZip();
fZip.CreateZip(archiveFilePath, inFileInfo.Directory.FullName, false, inFileInfo.Name);
This works exactly (ish) as it should, however while testing I have encountered a minor gotcha. Lets say that my temp directory (i.e. the directory that contains the uncompressed input file) contains the following files:
tmp9AE0.tmp.xml //The input file I want to compress
xxx_tmp9AE0.tmp.xml // Some other file
yyy_tmp9AE0.tmp.xml // Some other file
wibble.dat // Some other file
When I run the compression all the .xml files are included in the compressed archive. The reason for this is because of the final fileFilter parameter passed to the CreateZip method. Under the hood SharpZipLib is performing a pattern match and this also picks up the files prefixed with xxx_ and yyy_. I assume it would also pick up anything postfixed as well.
So the question is, how can I compress a single file with SharpZipLib? Then again maybe the question is how can I format that fileFilter so that the match can only ever pick up the file I want to compress and nothing else.
As an aside, is there any reason as to why System.IO.Compression not include a ZipStream class? (It only supports GZipStream)
EDIT : Solution (Derived from accepted answer from Hans Passant)
This is the compression method I implemented:
private static void CompressFile(string inputPath, string outputPath)
{
FileInfo outFileInfo = new FileInfo(outputPath);
FileInfo inFileInfo = new FileInfo(inputPath);
// Create the output directory if it does not exist
if (!Directory.Exists(outFileInfo.Directory.FullName))
{
Directory.CreateDirectory(outFileInfo.Directory.FullName);
}
// Compress
using (FileStream fsOut = File.Create(outputPath))
{
using (ICSharpCode.SharpZipLib.Zip.ZipOutputStream zipStream = new ICSharpCode.SharpZipLib.Zip.ZipOutputStream(fsOut))
{
zipStream.SetLevel(3);
ICSharpCode.SharpZipLib.Zip.ZipEntry newEntry = new ICSharpCode.SharpZipLib.Zip.ZipEntry(inFileInfo.Name);
newEntry.DateTime = DateTime.UtcNow;
zipStream.PutNextEntry(newEntry);
byte[] buffer = new byte[4096];
using (FileStream streamReader = File.OpenRead(inputPath))
{
ICSharpCode.SharpZipLib.Core.StreamUtils.Copy(streamReader, zipStream, buffer);
}
zipStream.CloseEntry();
zipStream.IsStreamOwner = true;
zipStream.Close();
}
}
}
This is an XY problem, just don't use FastZip. Follow the first example on this web page to avoid accidents.