I need to extract a zip file to memory (not to the disk). I cannot save it to a directory, even temporarily.
Is there a way to extract a zip file just to memory, and perform "File" functions there?
I can't open the file as a file stream because this doesn't allow me to read the metadata (last write time, attributes, etc). Some but not all the file attributes can be read from zip entry itself but this is insufficient for my purposes.
I've been using:
using (ZipArchive archive = ZipFile.OpenRead(openFileDialog.FileName)) // Read files from the zip file
{
foreach (ZipArchiveEntry entry in archive.Entries)
{
if(entry.Name.EndsWith(".txt", StringComparison.InvariantCultureIgnoreCase)) // get .txt file
{
FileStream fs = entry.Open() as FileStream;
}
}
}
Thanks.
The code below shows one way to get the file into memory as an array of strings, but it is unclear which file functions you are asking about. Other commenters have mentioned ExternalAttributes, which is OS dependent, so more information about the problem space would help.
using System;
using System.IO;
using System.IO.Compression;
namespace StackOverflowSampleCode
{
class Program
{
/// <summary>
/// Validate the extension is correct
/// </summary>
/// <param name="entry"></param>
/// <param name="ext"></param>
/// <returns></returns>
static bool validateExtension(ZipArchiveEntry entry, string ext)
{
return entry.Name.EndsWith(
ext,
StringComparison.InvariantCultureIgnoreCase);
}
/// <summary>
/// Convert the entry into an array of strings
/// </summary>
/// <param name="entry"></param>
/// <returns></returns>
static string[] extractFileStrings(ZipArchiveEntry entry)
{
// entry.Open() returns a decompression stream, not a MemoryStream,
// so copy the entry's contents into a MemoryStream first
using (var entryStream = entry.Open())
using (var ms = new MemoryStream())
{
entryStream.CopyTo(ms);
// Rewind to the start of the stream before reading
ms.Seek(0, SeekOrigin.Begin);
// Decode the bytes as text and split into lines
using (var reader = new StreamReader(ms))
{
return reader.ReadToEnd().Split(
new[] { "\r\n", "\n" }, // handles Windows and Unix line endings
StringSplitOptions.RemoveEmptyEntries);
}
}
}
static void Main(string[] args)
{
string fileName = "";
using (ZipArchive archive = ZipFile.OpenRead(fileName))
{
foreach (var entry in archive.Entries)
{
// Limit results to files with ".txt" extension
if (validateExtension(entry, ".txt"))
{
var file = extractFileStrings(entry);
foreach (var line in file)
{
Console.WriteLine(line);
}
Console.WriteLine($"Last Write Time: {entry.LastWriteTime}");
Console.WriteLine($"External Attributes: {entry.ExternalAttributes}");
}
}
}
}
}
}
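If the goal is simply to hold every entry's contents in memory, one option is to decompress each entry into a byte array. This is a minimal sketch, assuming the archive is available as a readable stream; the InMemoryZip class and its ReadAll method are hypothetical names, not part of the answer above. Metadata is still limited to what ZipArchiveEntry exposes (LastWriteTime, ExternalAttributes), since no real files are created.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class InMemoryZip
{
    // Hypothetical helper: decompresses every file entry into a byte array,
    // keyed by the entry's full name inside the archive. Nothing touches disk.
    public static Dictionary<string, byte[]> ReadAll(Stream zipStream)
    {
        var files = new Dictionary<string, byte[]>();
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (entry.Name.Length == 0)
                    continue;

                using (Stream source = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    source.CopyTo(buffer);
                    files[entry.FullName] = buffer.ToArray();
                }
            }
        }
        return files;
    }
}
```

Each byte array can then be wrapped in a new MemoryStream whenever a Stream is needed.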
I'm working on converting some files, but I'm having some issues on the 2nd step of this.
Load file from source location
Save file to temp folder
Save converted file to Output location
I have 2 methods for reading the original file, but there is a problem with both of them.
Method 1: The file remains locked (so when something goes wrong, I have to restart the app)
Method 2: The temp file is empty
Anybody got an idea on how to fix one of those problems?
Utilities class
/// <summary>
/// Get document stream
/// </summary>
/// <param name="documentLocation">Input document location</param>
public static Stream GetDocumentStreamFromLocation(string documentLocation)
{
try
{
//ExStart:GetDocumentStream
// Method one: works, but locks file
return File.Open(documentLocation, FileMode.Open, FileAccess.Read);
// Method two: gives empty file on temp folder
using (FileStream fsSource = File.Open(documentLocation, FileMode.Open, FileAccess.Read))
{
var stream = new MemoryStream((int)fsSource.Length);
fsSource.CopyTo(stream);
return stream;
}
//ExEnd:GetDocumentStream
}
catch (FileNotFoundException ioEx)
{
Console.WriteLine(ioEx.Message);
return null;
}
}
/// <summary>
/// Save file in any format
/// </summary>
/// <param name="filename">Save as provided string</param>
/// <param name="content">Stream as content of a file</param>
public static void SaveFile(string filename, Stream content, string location = OUTPUT_PATH)
{
try
{
//ExStart:SaveAnyFile
//Create file stream
using (FileStream fileStream = File.Create(Path.Combine(Path.GetFullPath(location), filename)))
{
content.CopyTo(fileStream);
}
//ExEnd:SaveAnyFile
}
catch (System.Exception ex)
{
Console.WriteLine(ex.Message);
}
}
I call these functions as follows:
public static StreamContent Generate(string sourceLocation)
{
// Get filename
var fileName = Path.GetFileName(sourceLocation);
// Create tempfilename
var tempFilename = $"{Guid.NewGuid()}_{fileName}";
// Put file in storage location
Utilities.SaveFile(tempFilename, Utilities.GetDocumentStreamFromLocation(sourceLocation), Utilities.STORAGE_PATH);
// ... More code
}
In order to copy the source file to a temp folder, the easiest way is to use the File.Copy method from the System.IO namespace. Consider the following:
// Assuming the variables have been set as you already had, this creates a copy in the intended location.
File.Copy(documentLocation, filename);
After some further digging, I found that you can pass a FileShare argument to File.Open that "fixes" this issue:
return File.Open(documentLocation, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
With the downside that you still can't move / rename the file, but the lock is removed.
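On the second method in the question: the empty temp file is most likely caused by the MemoryStream's position. After CopyTo, the position sits at the end of the copied data, so a later content.CopyTo(fileStream) finds nothing left to read. Below is a minimal sketch of a non-locking read that rewinds the stream before returning it; the StreamLoader class and LoadIntoMemory method are hypothetical names, not from the original code.

```csharp
using System.IO;

public static class StreamLoader
{
    // Hypothetical helper: reads the whole file into memory and releases
    // the file handle before returning, so the source is not left locked.
    public static MemoryStream LoadIntoMemory(string path)
    {
        var buffer = new MemoryStream();
        using (FileStream source = File.Open(path, FileMode.Open, FileAccess.Read))
        {
            source.CopyTo(buffer);
        }
        // CopyTo leaves the position at the end of the copied data;
        // rewind so the caller reads from the first byte.
        buffer.Position = 0;
        return buffer;
    }
}
```

The caller owns the returned stream and should dispose it once the copy to the temp folder is done.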
I need to create zip archive which should contain files with certain extensions only, but I need to save the structure of the original directory.
For example, I have a directory with the following structure:
dir\
sub_dir1\
1.exe
sub_dir_2\
1.txt
1.exe
1.txt
1.bat
and I need to get an archive with the following structure (only .exe and .bat files):
dir\
sub_dir1\
1.exe
sub_dir_2\
1.exe
1.bat
I know how to find these files via Directory.GetFiles method:
var ext = new List<string> {".exe", ".bat"};
var myFiles = Directory.GetFiles(dir, "*.*", SearchOption.AllDirectories)
.Where(s => ext.Any(e => s.EndsWith(e)));
but I don't know how to save the archive's structure then.
How can I achieve such behavior?
You can get all the files with extension .exe and .bat from all the sub directories like:
IList<FileInfo> info = null;
DirectoryInfo dirInfo = new DirectoryInfo(path);
info = dirInfo
.GetFiles("*.*", SearchOption.AllDirectories)
.Where( f => f.Extension
.Equals(".exe", StringComparison.CurrentCultureIgnoreCase)
|| f.Extension
.Equals(".bat", StringComparison.CurrentCultureIgnoreCase)
)
.ToList()
;
Then, based on this FileInfo list, you can create your zip and folder structure. You can find the FileInfo details here.
System.IO.Directory.GetFiles("C:\\temp", "*.exe", SearchOption.AllDirectories); will return an array with the full file paths like:
[C:\temp\dir1\app1.exe]
[C:\temp\dir2\subdir1\app2.exe]
[C:\temp\dir3\subdir2\subdir3\app3.exe]
So you won't have any trouble to put these files in a zip container with ZipArchive.CreateEntry because this method will create the same directory structure in the zip. However, you should remove the C:\ at the beginning.
I believe this very nice tutorial will help you to do that.
If you want to keep empty folders in the target zip, you may have to use the ZipArchive.CreateEntry method. In this demo, the author only uses the ZipArchive.CreateEntryFromFile method to archive a file from a file path.
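The approach described in these answers can be sketched with the built-in System.IO.Compression types. This is an illustration under stated assumptions, not the tutorial's code: the FilteredZip class and Create method are hypothetical names, rootDir is assumed not to be a drive root, and zipPath is assumed not to exist yet (ZipArchiveMode.Create fails if it does).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class FilteredZip
{
    // Hypothetical helper: zips files under rootDir whose extension is in
    // extensions, keeping the folder structure below (and including) rootDir.
    public static void Create(string rootDir, string zipPath, IEnumerable<string> extensions)
    {
        // Full path without a trailing separator, e.g. C:\temp\dir
        string root = Path.GetFullPath(rootDir).TrimEnd(Path.DirectorySeparatorChar);
        string rootName = Path.GetFileName(root);
        var wanted = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);

        using (ZipArchive zip = ZipFile.Open(zipPath, ZipArchiveMode.Create))
        {
            foreach (string file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
            {
                if (!wanted.Contains(Path.GetExtension(file)))
                    continue;

                // Strip the absolute prefix, then use forward slashes,
                // which is what zip entry names expect.
                string relative = file.Substring(root.Length + 1)
                    .Replace(Path.DirectorySeparatorChar, '/');
                zip.CreateEntryFromFile(file, rootName + "/" + relative);
            }
        }
    }
}
```

A call such as FilteredZip.Create("dir", "myFiles.zip", new[] { ".exe", ".bat" }) would then produce entries like dir/sub_dir1/1.exe. Empty folders are not preserved by this sketch.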
You can also use the DotNetZip library to solve your problem. For example, the following code snippet can help: call the CreateZip() method. In brief, find the files to write to the archive with GetFileNames(), then create the zip file with CreateZipFromFileNames():
/// <summary>
/// Create zip archive from root directory with search patterns
/// </summary>
/// <param name="rootDirectory">Root directory</param>
/// <param name="searchPatterns">Search patterns</param>
/// <param name="zipFileName">Zip archive file name</param>
public static void CreateZip(string rootDirectory, List<string> searchPatterns, string zipFileName)
{
var fileNames = GetFileNames(rootDirectory, searchPatterns, true);
CreateZipFromFileNames(rootDirectory, zipFileName, fileNames);
}
/// <summary>
/// Get file names filtered by search patterns
/// </summary>
/// <param name="rootDirectory">Root directory</param>
/// <param name="searchPatterns">Search patterns</param>
/// <param name="includeSubdirectories">True to include files from subdirectories</param>
/// <returns>List of file names</returns>
private static IEnumerable<string> GetFileNames(string rootDirectory, List<string> searchPatterns, bool includeSubdirectories)
{
var foundFiles = new List<string>();
var directoriesToSearch = new Queue<string>();
directoriesToSearch.Enqueue(rootDirectory);
// Breadth-First Search
while (directoriesToSearch.Count > 0)
{
var path = directoriesToSearch.Dequeue();
foreach (var searchPattern in searchPatterns)
{
foundFiles.AddRange(Directory.EnumerateFiles(path, searchPattern));
}
if (includeSubdirectories)
{
foreach (var subDirectory in Directory.EnumerateDirectories(path))
{
directoriesToSearch.Enqueue(subDirectory);
}
}
}
return foundFiles;
}
/// <summary>
/// Create zip archive from list of file names
/// </summary>
/// <param name="rootDirectory">Root directory (used to preserve the required directory structure)</param>
/// <param name="zipFileName">File name of zip archive</param>
/// <param name="fileNames">List of file names</param>
private static void CreateZipFromFileNames(string rootDirectory, string zipFileName, IEnumerable<string> fileNames)
{
var rootZipPath = Directory.GetParent(rootDirectory).FullName;
using (var zip = new ZipFile(zipFileName))
{
foreach (var filePath in fileNames)
{
var directoryPathInArchive = Path.GetFullPath(Path.GetDirectoryName(filePath)).Substring(rootZipPath.Length);
zip.AddFile(filePath, directoryPathInArchive);
}
zip.Save();
}
}
Example of use:
CreateZip("dir", new List<string> { "*.exe", "*.bat" }, "myFiles.zip");
What @Didgeridoo said: DotNetZip. But DotNetZip lets you be even more expressive. For instance:
string cwd = Environment.CurrentDirectory ;
try
{
Environment.CurrentDirectory = @"c:\root\of\directory\tree\to\be\zipped" ;
using ( ZipFile zipfile = new ZipFile() )
{
zipfile.AddSelectedFiles( "name = *.bat OR name = *.exe" , true ) ;
zipfile.Save( @"c:\foo\bar\my-archive.zip") ;
}
}
finally
{
Environment.CurrentDirectory = cwd ;
}
Edited To Note: DotNetZip used to live at Codeplex. Codeplex has been shut down. The old archive is still available at Codeplex. It looks like the code has migrated to Github:
https://github.com/DinoChiesa/DotNetZip. Looks to be the original author's repo.
https://github.com/haf/DotNetZip.Semverd. This looks to be the currently maintained version. It's also packaged up and available via NuGet at https://www.nuget.org/packages/DotNetZip/
I can't imagine this is hard to do, but I haven't been able to get it to work. I have a files class that just stores the location, directory, and name of the files I want to zip. The files I'm zipping exist on disk so the FileLocation is the full path. ZipFileDirectory doesn't exist on disk. If I have two items in my files list,
{ FileLocation = "path/file1.doc", ZipFileDirectory = @"\", FileName = "CustomName1.doc" },
{ FileLocation = "path/file2.doc", ZipFileDirectory = @"\NewDirectory", FileName = "CustomName2.doc" }
I would expect to see CustomName1.doc in the root, and a folder named NewDirectory containing CustomName2.doc, but what happens is that they both end up in the root using this code:
using (var zip = new Ionic.Zip.ZipFile())
{
foreach (var file in files)
{
zip.AddFile(file.FileLocation, file.ZipFileDirectory).FileName = file.FileName;
}
zip.Save(HttpContext.Current.Response.OutputStream);
}
If I use this:
zip.AddFiles(files.Select(o => o.FileLocation), false, "NewDirectory");
Then it creates the new directory and puts all of the files inside, as expected, but then I lose the ability to use the custom naming with this method, and it also introduces more complexities that the first method would handle perfectly.
Is there a way I can get the first method (AddFile()) to work as I expect?
On further inspection, since posting a comment a few minutes ago, I suspect that setting FileName is erasing the archive path.
Testing confirms this.
Setting the name to @"NewDirectory\CustomName2.doc" will fix the problem.
You can also use @"\NewDirectory\CustomName2.doc"
Not sure if this exactly suits your needs, but I thought I would share. It is a method from a helper class that I created to make working with DotNetZip a bit easier for my dev team. The IOHelper class is another simple helper class that you can ignore.
/// <summary>
/// Create a zip file adding all of the specified files.
/// The files are added at the specified directory path in the zip file.
/// </summary>
/// <remarks>
/// If the zip file exists then the file will be added to it.
/// If the file already exists in the zip file an exception will be thrown.
/// </remarks>
/// <param name="filePaths">A collection of paths to files to be added to the zip.</param>
/// <param name="zipFilePath">The fully-qualified path of the zip file to be created.</param>
/// <param name="directoryPathInZip">The directory within the zip file where the file will be placed.
/// Ex. specifying "files\\docs" will add the file(s) to the files\docs directory in the zip file.</param>
/// <param name="deleteExisting">Delete the zip file if it already exists.</param>
public void CreateZipFile(ICollection<FileInfo> filePaths, string zipFilePath, string directoryPathInZip, bool deleteExisting)
{
if (deleteExisting)
{
IOHelper ioHelper = new IOHelper();
ioHelper.DeleteFile(zipFilePath);
}
using (ZipFile zip = new ZipFile(zipFilePath))
{
foreach (FileInfo filePath in filePaths)
{
zip.AddFile(filePath.FullName, directoryPathInZip);
}
zip.Save();
}
}
I'm testing how the classes FileStream and StreamReader work together, via a console application.
I'm trying to go in a file and read the lines and print them on the console.
I've been able to do it with a while-loop, but I want to try it with a foreach loop.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace testing
{
public class Program
{
public static void Main(string[] args)
{
string file = @"C:\Temp\New Folder\New Text Document.txt";
using(FileStream fs = new FileStream(file, FileMode.Open, FileAccess.Read))
{
using(StreamReader sr = new StreamReader(fs))
{
foreach(string line in file)
{
Console.WriteLine(line);
}
}
}
}
}
}
The error I keep getting for this is: Cannot convert type 'char' to 'string'
The while loop, which does work, looks like this:
while((line = sr.ReadLine()) != null)
{
Console.WriteLine(line);
}
I'm probably overlooking something really basic, but I can't see it.
If you want to read a file line-by-line via foreach (in a reusable fashion), consider the following iterator block:
public static IEnumerable<string> ReadLines(string path)
{
using (StreamReader reader = File.OpenText(path))
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
Note that this is lazily evaluated - there is none of the buffering that you would associate with File.ReadAllLines(). The foreach syntax will ensure that the iterator is Dispose()d correctly even for exceptions, closing the file:
foreach(string line in ReadLines(file))
{
Console.WriteLine(line);
}
(this bit is added just for interest...)
Another advantage of this type of abstraction is that it plays beautifully with LINQ - i.e. it is easy to do transformations / filters etc with this approach:
DateTime minDate = new DateTime(2000,1,1);
var query = from line in ReadLines(file)
let tokens = line.Split('\t')
let person = new
{
Forname = tokens[0],
Surname = tokens[1],
DoB = DateTime.Parse(tokens[2])
}
where person.DoB >= minDate
select person;
foreach (var person in query)
{
Console.WriteLine("{0}, {1}: born {2}",
person.Surname, person.Forname, person.DoB);
}
And again, all evaluated lazily (no buffering).
To read all lines in New Text Document.txt:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace testing
{
public class Program
{
public static void Main(string[] args)
{
string file = @"C:\Temp\New Folder\New Text Document.txt";
using(FileStream fs = new FileStream(file, FileMode.Open, FileAccess.Read))
{
using(StreamReader sr = new StreamReader(fs))
{
while(!sr.EndOfStream)
{
Console.WriteLine(sr.ReadLine());
}
}
}
}
}
}
I have a LineReader class in my MiscUtil project. It's slightly more general than the solutions given here, mostly in terms of the way you can construct it:
From a function returning a stream, in which case it will use UTF-8
From a function returning a stream, and an encoding
From a function which returns a text reader
From just a filename, in which case it will use UTF-8
From a filename and an encoding
The class "owns" whatever resources it uses, and closes them appropriately. However, it does this without implementing IDisposable itself. This is why it takes Func<Stream> and Func<TextReader> instead of the stream or the reader directly - it needs to be able to defer the opening until it needs it. It's the iterator itself (which is automatically disposed by a foreach loop) which closes the resource.
As Marc pointed out, this works really well in LINQ. One example I like to give is:
var errors = from file in Directory.GetFiles(logDirectory, "*.log")
from line in new LineReader(file)
select new LogEntry(line) into entry
where entry.Severity == Severity.Error
select entry;
This will stream all the errors from a whole bunch of log files, opening and closing as it goes. Combined with Push LINQ, you can do all kinds of nice stuff :)
It's not a particularly "tricky" class, but it's really handy. Here's the full source, for convenience if you don't want to download MiscUtil. The licence for the source code is here.
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Text;
namespace MiscUtil.IO
{
/// <summary>
/// Reads a data source line by line. The source can be a file, a stream,
/// or a text reader. In any case, the source is only opened when the
/// enumerator is fetched, and is closed when the iterator is disposed.
/// </summary>
public sealed class LineReader : IEnumerable<string>
{
/// <summary>
/// Means of creating a TextReader to read from.
/// </summary>
readonly Func<TextReader> dataSource;
/// <summary>
/// Creates a LineReader from a stream source. The delegate is only
/// called when the enumerator is fetched. UTF-8 is used to decode
/// the stream into text.
/// </summary>
/// <param name="streamSource">Data source</param>
public LineReader(Func<Stream> streamSource)
: this(streamSource, Encoding.UTF8)
{
}
/// <summary>
/// Creates a LineReader from a stream source. The delegate is only
/// called when the enumerator is fetched.
/// </summary>
/// <param name="streamSource">Data source</param>
/// <param name="encoding">Encoding to use to decode the stream
/// into text</param>
public LineReader(Func<Stream> streamSource, Encoding encoding)
: this(() => new StreamReader(streamSource(), encoding))
{
}
/// <summary>
/// Creates a LineReader from a filename. The file is only opened
/// (or even checked for existence) when the enumerator is fetched.
/// UTF8 is used to decode the file into text.
/// </summary>
/// <param name="filename">File to read from</param>
public LineReader(string filename)
: this(filename, Encoding.UTF8)
{
}
/// <summary>
/// Creates a LineReader from a filename. The file is only opened
/// (or even checked for existence) when the enumerator is fetched.
/// </summary>
/// <param name="filename">File to read from</param>
/// <param name="encoding">Encoding to use to decode the file
/// into text</param>
public LineReader(string filename, Encoding encoding)
: this(() => new StreamReader(filename, encoding))
{
}
/// <summary>
/// Creates a LineReader from a TextReader source. The delegate
/// is only called when the enumerator is fetched
/// </summary>
/// <param name="dataSource">Data source</param>
public LineReader(Func<TextReader> dataSource)
{
this.dataSource = dataSource;
}
/// <summary>
/// Enumerates the data source line by line.
/// </summary>
public IEnumerator<string> GetEnumerator()
{
using (TextReader reader = dataSource())
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
/// <summary>
/// Enumerates the data source line by line.
/// </summary>
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
The problem is in:
foreach(string line in file)
{
Console.WriteLine(line);
}
It's because "file" is a string, and string implements IEnumerable<char>. Its enumerator returns "char" values, and the foreach loop cannot convert a "char" to string.
You should use the while loop, as you said.
Slightly more elegant is the following...
using (var fileStream = new FileStream(file, FileMode.Open, FileAccess.Read))
{
using (var streamReader = new StreamReader(fileStream))
{
while (!streamReader.EndOfStream)
{
yield return streamReader.ReadLine();
}
}
}
Looks like homework to me ;)
You're iterating over the filename (a string) itself which gives you one character at a time. Just use the while approach that correctly uses sr.ReadLine().
Instead of using a StreamReader and then trying to find lines inside the String file variable, you can simply use File.ReadAllLines:
string[] lines = File.ReadAllLines(file);
foreach(string line in lines)
Console.WriteLine(line);
You are enumerating a string, and when you do that, you take one char at a time.
Are you sure this is what you want?
foreach(string line in file)
A simplistic (not memory efficient) approach of iterating every line in a file is
foreach (string line in File.ReadAllLines(file))
{
..
}
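A lazy counterpart to this loop, assuming .NET Framework 4 or later, is File.ReadLines, which returns an IEnumerable<string> and reads lines only as the loop asks for them. The sketch below uses it to count non-empty lines; the LazyLines class and CountNonEmpty method are hypothetical names chosen for illustration.

```csharp
using System.IO;

public static class LazyLines
{
    // Hypothetical helper: counts non-empty lines without loading the
    // whole file; File.ReadLines yields one line per loop iteration.
    public static int CountNonEmpty(string path)
    {
        int count = 0;
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length > 0)
                count++;
        }
        return count;
    }
}
```

The file is closed when the foreach loop finishes or exits early, because foreach disposes the enumerator.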
I presume you want something like this:
using ( FileStream fileStream = new FileStream( file, FileMode.Open, FileAccess.Read ) )
{
using ( StreamReader streamReader = new StreamReader( fileStream ) )
{
string line = "";
while ( null != ( line = streamReader.ReadLine() ) )
{
Console.WriteLine( line );
}
}
}
I've got a few files in resources (xsd files) that I use for validating received xml messages. The resource file I use is named AppResources.resx and it contains a file called clientModels.xsd. When I try to use the file like this: AppResources.clientModels, I get a string with the file's content. I would like to get a stream instead.
I do not wish to use assembly.GetManifestResourceStream as I had bad experiences with it (using these streams to archive files with SharpZipLib didn't work for some reason).
Is there any other way to do it? I've heard about ResourceManager - is it anything that could help me?
Could you feed the string you get into a System.IO.StringReader, perhaps? That may do what you want. You may also want to check out MemoryStream.
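The MemoryStream route can be sketched as follows. It is a minimal illustration, assuming UTF-8 is an acceptable encoding for the consumer of the stream; the StringStreams class and ToStream method are hypothetical names.

```csharp
using System.IO;
using System.Text;

public static class StringStreams
{
    // Hypothetical helper: wraps a string in a read-only MemoryStream.
    // UTF-8 is an assumption; use whatever encoding the consumer expects.
    public static Stream ToStream(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return new MemoryStream(bytes, writable: false);
    }
}
```

With the names from the question, usage would look like using (Stream xsd = StringStreams.ToStream(AppResources.clientModels)) { ... }. If the consumer accepts a TextReader, a StringReader over the same string avoids the encoding step entirely.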
Here is the code from the link:
//Namespace reference
using System;
using System.Resources;
#region ReadResourceFile
/// <summary>
/// method for reading a value from a resource file
/// (.resx file)
/// </summary>
/// <param name="file">file to read from</param>
/// <param name="key">key to get the value for</param>
/// <returns>a string value</returns>
public string ReadResourceValue(string file, string key)
{
//value for our return value
string resourceValue = string.Empty;
try
{
// specify your resource file name
string resourceFile = file;
// get the path of your file
string filePath = System.AppDomain.CurrentDomain.BaseDirectory.ToString();
// create a resource manager for reading from
//the resx file
ResourceManager resourceManager = ResourceManager.CreateFileBasedResourceManager(resourceFile, filePath, null);
// retrieve the value of the specified key
resourceValue = resourceManager.GetString(key);
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
resourceValue = string.Empty;
}
return resourceValue;
}
#endregion
I did not write this code; it came from http://www.dreamincode.net/code/snippet1683.htm
HTH
bones
I have a zip file loaded as a resource, and referencing it directly from the namespace gives me bytes, not a string. Right-click on your file in the resources designer, and change the file type from text to binary. Then you will get a byte array, which you could load into a MemoryStream.