Putting Multiple Pdfs in a MemoryStream - c#

I'm trying to take preexisting pdf files and read them all into a memory stream to then be shown on a telerik pdf viewer. If I just do one file it works but as soon as I try multiple files it gives me a internal null error (object ref not set to blah blah) and can't step in the code to see where its actualy null. Am I doing this wrong or something?
List<string> applicableReports = CurrentWizard.GetApplicableReports();
previousReportsStream = new MemoryStream();
Stream[] streams = new Stream[applicableReports.Count];
for (int i = 0; i < streams.Length; i++)
{
streams[i] = new MemoryStream(DocumentHelper.Instance.ConvertFileToByteArray(applicableReports[i]));
streams[i].CopyTo(previousReportsStream);
}
RadPdfViewer radPdfViewer = new RadPdfViewer();
RadFixedDocument document = new PdfFormatProvider(previousReportsStream, FormatProviderSettings.ReadAllAtOnce).Import();
radPdfViewer.Document = document;
This is where error is thrown:
RadFixedDocument document = new PdfFormatProvider(previousReportsStream, FormatProviderSettings.ReadAllAtOnce).Import();
DocumentHelper File to byte[]:
public byte[] ConvertFileToByteArray(string fileName)
{
FileInfo fileInfo = new FileInfo(fileName);
byte[] fileData = null;
using (FileStream fileStream = new FileStream(fileInfo.FullName, FileMode.Open, FileAccess.Read))
{
BinaryReader binaryReader = new BinaryReader(fileStream);
fileData = binaryReader.ReadBytes((int)fileStream.Length);
}
return fileData;
}

One possible cause is the h process is out of memory because the code creates many MemoryStream object and does not dispose them.
Try change code to this:
List<string> applicableReports = CurrentWizard.GetApplicableReports();
previousReportsStream = new MemoryStream();
try
{
for (int i = 0; i < streams.Length; i++)
{
using( MemoryStream memStream = new MemoryStream(DocumentHelper.Instance.ConvertFileToByteArray(applicableReports[i]))
{
memStream.CopyTo(previousReportsStream);
}
}
RadPdfViewer radPdfViewer = new RadPdfViewer();
RadFixedDocument document = new PdfFormatProvider(previousReportsStream, FormatProviderSettings.ReadAllAtOnce).Import();
radPdfViewer.Document = document;
}
finally
{
previousReportsStream.Close();
}
As MemoryStream implements the IDisposable interface, you call dispose it to free the native resources; if not, the it will lead to high memory usage.
Please read MSDN for more details.

Related

How write memorystream into ZipFile

I have a zip file. I want open it with SharpZipLib and add a new ZipEntry to it that it is created in memory.
I am new to SharpZipLib. I googled very much but couldn't find similar problem.
My Sample Code is:
public Stream GetNewZipFileStream(string zipFilePath)
{
byte[] zipFileBytes = null;
zipFileBytes = ReadFileBytes(zipFilePath);
var zipFileMemoryStream = new MemoryStream(zipFileBytes);
ZipOutputStream zipOutStream = new ZipOutputStream(zipFileMemoryStream);
var newEntry = new ZipEntry("NewFile.txt");
zipOutStream.PutNextEntry(newEntry);
var newFileMemoryStream = MakeOnTheFlyStream();
StreamUtils.Copy(newFileMemoryStream , zipOutStream, new byte[4096]);
zipOutStream.CloseEntry();
newFileMemoryStream.Close();
zipOutStream.IsStreamOwner = false;
zipOutStream.Close();
newFileMemoryStream.Position = 0;
return newFileMemoryStream;
}
ReadFileBytes and MakeOnTheFlyStream are my methods.

Using a generic stream to create a zipped file with SharpCompress

Since System.IO.Compression seems to be out of reach for now if I want to use both dotnet core + net461, I've tried with SharpCompress.
The "read zip" part was easy, but I am having trouble finding out how to write to a zip stream.
The wiki of the project is a bit outdated. This is the only example that I've found that applies to writing to streams. I've tried to follow it and adapt it to my needs, but I am stuck at the exception it throws:
using Microsoft.VisualStudio.TestTools.UnitTesting;
using SharpCompress.Common;
using SharpCompress.Compressors.Deflate;
using SharpCompress.Writers;
using System;
using System.IO;
namespace DbManager.DjdbCore.Tests
{
[TestClass]
public class ZipTests
{
public ZipTests()
{
Directory.SetCurrentDirectory(AppContext.BaseDirectory);
}
[TestMethod]
public void Test()
{
var zip = File.OpenWrite(#"..\..\..\..\..\test-resources\zip_file_test.zip");
var writerOptions = new WriterOptions(CompressionType.Deflate);
var zipWriter = WriterFactory.Open(zip, ArchiveType.Zip, writerOptions);
var memoryStream = new MemoryStream();
var binaryWriter = new BinaryWriter(memoryStream);
binaryWriter.Write("Test string inside binary file - text to fill it up: qoiwjqefñlawijfñlaskdjfioqwjefñalskvndñaskvnqo`wiefowainvñaslkfjnwpowiqjfeopwiqjnfjñlaskdjfñlasdfjiowiqjefñaslkdjfñalskjfpqwoiefjqw");
var deflateStream = new DeflateStream(memoryStream, SharpCompress.Compressors.CompressionMode.Compress);
deflateStream.Write(memoryStream.ToArray(), 0, Convert.ToInt32(memoryStream.Length));
// EXCEPTION: SharpCompress.Compressors.Deflate.ZlibException: 'Cannot Read after Writing.'
// Source code: if (_streamMode != StreamMode.Reader) { throw new ZlibException("Cannot Read after Writing."); }
zipWriter.Write("test_file_inside_zip.bin", deflateStream, DateTime.Now);
zip.Flush();
zipWriter.Dispose();
zip.Dispose();
}
}
}
In case it helps, this is what I used (and it worked, but only in dotnet core) using the library System.IO.Compression:
private void WriteAsZipBinary()
{
//Open the zip file if it exists, else create a new one
var zip = ZipPackage.Open(this.FileFullPath, FileMode.OpenOrCreate, FileAccess.ReadWrite);
var zipStream = ZipManager.GetZipWriteStream(zip, nameOfFileInsideZip);
var memoryStream = new MemoryStream();
var binaryWriter = new BinaryWriter(memoryStream);
// Here is where strings etc are written to the binary file:
WriteStuffInBinaryStream(ref binaryWriter);
//Read all of the bytes from the file to add to the zip file
byte[] bites = new byte[Convert.ToInt32(memoryStream.Length - 1) + 1];
memoryStream.Position = 0;
memoryStream.Read(bites, 0, Convert.ToInt32(memoryStream.Length));
binaryWriter.Dispose();
binaryWriter = null;
memoryStream.Dispose();
memoryStream = null;
zipStream.Position = 0;
zipStream.Write(bites, 0, bites.Length);
zip.Close();
}
public static Stream GetZipWriteStream(Package zip, string renamedFileName)
{
//Replace spaces with an underscore (_)
string uriFileName = renamedFileName.Replace(" ", "_");
//A Uri always starts with a forward slash "/"
string zipUri = string.Concat("/", Path.GetFileName(uriFileName));
Uri partUri = new Uri(zipUri, UriKind.Relative);
string contentType = "Zip"; // System.Net.Mime.MediaTypeNames.Application.Zip;
//The PackagePart contains the information:
// Where to extract the file when it's extracted (partUri)
// The type of content stream (MIME type): (contentType)
// The type of compression: (CompressionOption.Normal)
PackagePart pkgPart = zip.CreatePart(partUri, contentType, CompressionOption.Normal);
//Compress and write the bytes to the zip file
return pkgPart.GetStream();
}
I'll post here the answer on github from #adamhathcock (the owner of the project):
[TestMethod]
public void Test()
{
var writerOptions = new WriterOptions(CompressionType.Deflate);
using(var zip = File.OpenWrite(#"..\..\..\..\..\test-resources\zip_file_test.zip"))
using(var zipWriter = WriterFactory.Open(zip, ArchiveType.Zip, writerOptions))
{
var memoryStream = new MemoryStream();
var binaryWriter = new BinaryWriter(memoryStream);
binaryWriter.Write("Test string inside binary file - text to fill it up: qoiwjqefñlawijfñlaskdjfioqwjefñalskvndñaskvnqo`wiefowainvñaslkfjnwpowiqjfeopwiqjnfjñlaskdjfñlasdfjiowiqjefñaslkdjfñalskjfpqwoiefjqw");
memoryStream.Position = 0;
zipWriter.Write("test_file_inside_zip.bin", memoryStream, DateTime.Now);
}
}
2 things:
You forgot to reset the MemoryStream after writing to it so it can be read.
You don't need to manually use the DeflateStream. You've told the ZipWriter what compression to use. If it worked, you would have double compressed the bytes which would be garbage really.

C# ZipArchive losing data

I'm trying to copy the contents of one Excel file to another Excel file while replacing a string inside of the file on the copy. It's working for the most part, but the file is losing 27 kb of data. Any suggestions?
public void ReplaceString(string what, string with, string path) {
List < string > doneContents = new List < string > ();
List < string > doneNames = new List < string > ();
using(ZipArchive archive = ZipFile.Open(_path, ZipArchiveMode.Read)) {
int count = archive.Entries.Count;
for (int i = 0; i < count; i++) {
ZipArchiveEntry entry = archive.Entries[i];
using(var entryStream = entry.Open())
using(StreamReader reader = new StreamReader(entryStream)) {
string txt = reader.ReadToEnd();
if (txt.Contains(what)) {
txt = txt.Replace(what, with);
}
doneContents.Add(txt);
string name = entry.FullName;
doneNames.Add(name);
}
}
}
using(MemoryStream zipStream = new MemoryStream()) {
using(ZipArchive newArchive = new ZipArchive(zipStream, ZipArchiveMode.Create, true, Encoding.UTF8)) {
for (int i = 0; i < doneContents.Count; i++) {
int spot = i;
ZipArchiveEntry entry = newArchive.CreateEntry(doneNames[spot]);
using(var entryStream = entry.Open())
using(var sw = new StreamWriter(entryStream)) {
sw.Write(doneContents[spot]);
}
}
}
using(var fileStream = new FileStream(path, FileMode.Create)) {
zipStream.Seek(0, SeekOrigin.Begin);
zipStream.CopyTo(fileStream);
}
}
}
I've used Microsoft's DocumentFormat.OpenXML and Excel Interop, however, they are both lacking in a few main components that I need.
Update:
using(var fileStream = new FileStream(path, FileMode.Create)) {
var wrapper = new StreamWriter(fileStream);
wrapper.AutoFlush = true;
zipStream.Seek(0, SeekOrigin.Begin);
zipStream.CopyTo(wrapper.BaseStream);
wrapper.Flush();
wrapper.Close();
}
Try the process without changing the string and see if the file size is the same. If so then it would seem that your copy is working correctly, however as Marc B suggested, with compression, even a small change can result in a larger change in the overall size.

iTextSharp is producing a corrupt PDF

The code snippet below returns a corrupt PDF document however if I return mergedDocument instead it always returns a valid PDF. mergedDocument is based on a PDF file i created using Word, whereas completed document is entirely programmatically generated. The code "works" in that it throws no exceptions. Why is iTextSharp creating a corrupt PDF?
byte[] completedDocument = null;
using (MemoryStream streamCompleted = new MemoryStream())
{
using (Document document = new Document())
{
PdfCopy copy = new PdfCopy(document, streamCompleted);
document.Open();
copy.Open();
foreach (var item in eventItems)
{
byte[] mergedDocument = null;
PdfReader reader = new PdfReader(pdfTemplates[item.DataTokens[NotifyTokenType.OrganisationID]]);
using (MemoryStream streamTemplate = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, streamTemplate))
{
foreach (var token in item.DataTokens)
{
if (stamper.AcroFields.Fields.Any(fld => fld.Key == token.Key.ToString()))
{
stamper.AcroFields.SetField(token.Key.ToString(), token.Value);
}
}
stamper.FormFlattening = true;
stamper.Writer.CloseStream = false;
}
mergedDocument = new byte[streamTemplate.Length];
streamTemplate.Position = 0;
streamTemplate.Read(mergedDocument, 0, (int)streamTemplate.Length);
}
reader = new PdfReader(mergedDocument);
for (int i = 1; i <= reader.NumberOfPages; i++)
{
document.SetPageSize(PageSize.A4);
copy.AddPage(copy.GetImportedPage(reader, i));
}
}
completedDocument = new byte[streamCompleted.Length];
streamCompleted.Position = 0;
streamCompleted.Read(completedDocument, 0, (int)streamCompleted.Length);
}
}
return completedDocument;
You need to close the document and copy objects to flush the PDF writing buffer. This, however, causes some problems when trying to read the stream into an array. The fix for that is to use the ToArray() method of the MemoryStream which still works on closed streams. The changes I made have comments on them.
byte[] completedDocument = null;
using (MemoryStream streamCompleted = new MemoryStream())
{
using (Document document = new Document())
{
PdfCopy copy = new PdfCopy(document, streamCompleted);
document.Open();
copy.Open();
foreach (var item in eventItems)
{
byte[] mergedDocument = null;
PdfReader reader = new PdfReader(pdfTemplates[item.DataTokens[NotifyTokenType.OrganisationID]]);
using (MemoryStream streamTemplate = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, streamTemplate))
{
foreach (var token in item.DataTokens)
{
if (stamper.AcroFields.Fields.Any(fld => fld.Key == token.Key.ToString()))
{
stamper.AcroFields.SetField(token.Key.ToString(), token.Value);
}
}
stamper.FormFlattening = true;
stamper.Writer.CloseStream = false;
}
//Copy the stream's bytes
mergedDocument = streamTemplate.ToArray();
}
reader = new PdfReader(mergedDocument);
for (int i = 1; i <= reader.NumberOfPages; i++)
{
document.SetPageSize(PageSize.A4);
copy.AddPage(copy.GetImportedPage(reader, i));
}
//Close the document and the copy
document.Close();
copy.Close();
}
//ToArray() can operate on closed streams
completedDocument = streamCompleted.ToArray();
}
}
return completedDocument;
Also make sure your html doesn't contains hr tag while converting html to pdf
hdnEditorText.Value.Replace("\"", "'").Replace("<hr />", "").Replace("<hr/>", "")

Writing to MemoryStream with StreamWriter returns empty

I am not sure what I am doing wrong, have seen a lot of examples, but can't seem to get this working.
public static Stream Foo()
{
var memStream = new MemoryStream();
var streamWriter = new StreamWriter(memStream);
for (int i = 0; i < 6; i++)
streamWriter.WriteLine("TEST");
memStream.Seek(0, SeekOrigin.Begin);
return memStream;
}
I am doing a simple test on this method to try and get it to pass, but no matter what, my collection count is 0.
[Test]
public void TestStreamRowCount()
{
var stream = Foo();
using (var reader = new StreamReader(stream))
{
var collection = new List<string>();
string input;
while ((input = reader.ReadLine()) != null)
collection.Add(input);
Assert.AreEqual(6, collection.Count);
}
}
Note: I changed some syntax above without compiling in the Test method. What is more important is the first method which seems to be returning an empty stream (my reader.ReadLine() always reads once). Not sure what I am doing wrong. Thank you.
You are forgetting to flush your StreamWriter instance.
public static Stream Foo()
{
var memStream = new MemoryStream();
var streamWriter = new StreamWriter(memStream);
for (int i = 0; i < 6; i++)
streamWriter.WriteLine("TEST");
streamWriter.Flush(); <-- need this
memStream.Seek(0, SeekOrigin.Begin);
return memStream;
}
Also note that StreamWriter is supposed to be disposed of, since it implements IDisposable, but that in turn generates another problem, it will close the underlying MemoryStream as well.
Are you sure you want to return a MemoryStream here?
I would change the code to this:
public static byte[] Foo()
{
using (var memStream = new MemoryStream())
using (var streamWriter = new StreamWriter(memStream))
{
for (int i = 0; i < 6; i++)
streamWriter.WriteLine("TEST");
streamWriter.Flush();
return memStream.ToArray();
}
}
[Test]
public void TestStreamRowCount()
{
var bytes = Foo();
using (var stream = new MemoryStream(bytes))
using (var reader = new StreamReader(stream))
{
var collection = new List<string>();
string input;
while ((input = reader.ReadLine()) != null)
collection.Add(input);
Assert.AreEqual(6, collection.Count);
}
}
Since you are not using "using" or streamWriter.Flush() the writer did not commit changes to the stream. As result Stream itslef does not have data yet. In general you want to wrap manipulation with Stream and StremaWriter instances with using.
You also should consider returning new instance of MemoryStream:
using(var memStream = new MemoryStream())
{
....
return new MemoryStream(memStream.ToArray(), false /*writable*/);
}
Try flushing streamWriter after writing your lines.

Categories

Resources