I have implemented a POC to read an entire file's content into a byte[] array. I can successfully read files smaller than 100 MB, but when I load a file larger than 100 MB it throws:
Convert.ToBase64String(mybytearray) Cannot obtain value of the
local variable or argument because there is not enough memory
available.
Below is the code I use to read the file content into the byte array:
var sFile = fileName;
var mybytearray = File.ReadAllBytes(sFile);
var binaryModel = new BinaryModel
{
fileName = binaryFile.FileName,
binaryData = Convert.ToBase64String(mybytearray),
filePath = string.Empty
};
My model class is as below
public class BinaryModel
{
public string fileName { get; set; }
public string binaryData { get; set; }
public string filePath { get; set; }
}
I am getting the "Cannot obtain value of the local variable or argument because there is not enough memory available." error at Convert.ToBase64String(mybytearray).
Is there anything I need to take care of to prevent this error?
Note: I do not want to add line breaks to my file content
To save memory, you can convert the stream of bytes in packs of three. Every three input bytes produce four Base64 characters, so you don't need the whole file in memory at once.
Here is pseudocode:
Repeat
1. Try to read max 3 bytes from stream
2. Convert to base64, write to output stream
And a simple implementation:
using (var inStream = File.OpenRead("E:\\Temp\\File.xml"))
using (var outStream = File.CreateText("E:\\Temp\\File.base64"))
{
var buffer = new byte[3];
int read;
while ((read = inStream.Read(buffer, 0, 3)) > 0)
{
var base64 = Convert.ToBase64String(buffer, 0, read);
outStream.Write(base64);
}
}
Hint: any buffer size that is a multiple of 3 is valid. A larger buffer uses more memory but gives better performance; a smaller one uses less memory but gives worse performance.
Additional info:
The file stream here is just an example. For the result stream, use [HttpContext].Response.OutputStream and write directly to it. Processing hundreds of megabytes in one chunk will kill you and your server.
Think about total memory requirements: 100 MB of file data becomes roughly 133 MB of Base64 text, and since you wrote about a model, I expect a copy of those 133 MB in the response as well. And remember, that is just a single request; a few such requests could drain your memory.
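For example, a minimal sketch of that approach (assuming a classic ASP.NET System.Web context; the path and content type are placeholders):
using (var inStream = File.OpenRead("E:\\Temp\\File.xml"))
{
    // Assumption: classic ASP.NET, so the response is reached via HttpContext.Current.
    var response = HttpContext.Current.Response;
    response.ContentType = "text/plain";

    // Chunk length is a multiple of 3, so no Base64 padding ('=') appears mid-stream.
    var buffer = new byte[3 * 1024]; // 3072 input bytes -> 4096 Base64 characters
    int read;
    while ((read = inStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Only the final chunk can be shorter than the buffer, so padding,
        // if any, is emitted only at the very end of the output.
        response.Write(Convert.ToBase64String(buffer, 0, read));
    }
}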
I would use two FileStreams - one to read the large file, one to write the result back out.
So in chunks you would convert to Base64 ... then convert the resulting string to bytes ... and write.
private static void ConvertLargeFileToBase64()
{
    // Buffer length must be a multiple of 3 so that no Base64 padding ('=')
    // is produced in the middle of the output.
    var buffer = new byte[3 * 4096];
    using (var fsIn = new FileStream("D:\\in.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    using (var fsOut = new FileStream("D:\\out.txt", FileMode.CreateNew, FileAccess.Write))
    {
        int read;
        while ((read = fsIn.Read(buffer, 0, buffer.Length)) > 0)
        {
            // convert only the bytes actually read to Base64, then to ASCII bytes
            // for writing back to a file
            var b64 = Encoding.ASCII.GetBytes(Convert.ToBase64String(buffer, 0, read));
            // write the whole encoded chunk to the output filestream
            fsOut.Write(b64, 0, b64.Length);
        }
    }
}
I have a method that takes a web-based image, converts it into a byte array and then passes that to a CDN as an image.
I have used this successfully hundreds of times; however, while migrating a particular set of images, I've noticed that these JPEG files are being identified as .png files, and when they arrive at the CDN they are blank.
Using some code that I copied from the web, I am able to identify the file extension from the image byte array after it is built.
So it's the conversion from the original image to the byte array that is mysteriously changing the file type.
This is my method:
public byte[] Get(string fullPath)
{
byte[] imageBytes = { };
var imageRequest = (HttpWebRequest)WebRequest.Create(fullPath);
var imageResponse = imageRequest.GetResponse();
var responseStream = imageResponse.GetResponseStream();
if (responseStream != null)
{
using (var br = new BinaryReader(responseStream))
{
imageBytes = br.ReadBytes(500000);
br.Close();
}
responseStream.Close();
}
imageResponse.Close();
return imageBytes;
}
I have also tried converting this to use MemoryStream instead.
I'm not sure what else I can do to ensure that this identifies the correct file type.
Edit
I have now updated the number of allowed bytes, which has resulted in viable images.
However, the issue with the JPEG files being altered to PNG is still ongoing.
It's only this selection of images that are affected.
These images were saved in an old CMS system so I do wonder if the way that they were saved is the cause?
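For context, that kind of file-type detection is usually just a check of the file's leading "magic bytes". A minimal sketch of the idea (illustrative only, not the exact code I copied):
// Hypothetical helper: identifies common image formats from their leading bytes.
private static string GetImageExtension(byte[] bytes)
{
    // JPEG files start with FF D8 FF; PNG files start with 89 50 4E 47.
    if (bytes.Length >= 3 && bytes[0] == 0xFF && bytes[1] == 0xD8 && bytes[2] == 0xFF)
        return ".jpg";
    if (bytes.Length >= 4 && bytes[0] == 0x89 && bytes[1] == 0x50 && bytes[2] == 0x4E && bytes[3] == 0x47)
        return ".png";
    return string.Empty; // unknown format
}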
As it stands, the code reads at most 500,000 bytes for each file. If the file is larger than that, the end is truncated and the content is no longer valid. In order to read all the bytes, you can use the following code:
public byte[] Get(string fullPath)
{
    List<byte> imageBytes = new List<byte>(500000);
    var imageRequest = (HttpWebRequest)WebRequest.Create(fullPath);
    using (var imageResponse = imageRequest.GetResponse())
    using (var responseStream = imageResponse.GetResponseStream())
    using (var br = new BinaryReader(responseStream))
    {
        var buffer = new byte[500000];
        int bytesRead;
        while ((bytesRead = br.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Add only the bytes actually read in this pass; otherwise the
            // final (partial) chunk would append stale data from the buffer.
            for (int i = 0; i < bytesRead; i++)
            {
                imageBytes.Add(buffer[i]);
            }
        }
    }
    return imageBytes.ToArray();
}
The above sample reads the data in chunks of 500,000 bytes; for most of your files, a single chunk should be sufficient. If a file is larger, the code keeps reading chunks until there are no more bytes to read, and all the chunks are assembled in a list.
This ensures that all the bytes are read, even if the content is larger than 500,000 bytes.
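If you are on .NET 4 or later, Stream.CopyTo can do the chunked reading for you; a minimal sketch of that alternative:
public byte[] Get(string fullPath)
{
    var imageRequest = (HttpWebRequest)WebRequest.Create(fullPath);
    using (var imageResponse = imageRequest.GetResponse())
    using (var responseStream = imageResponse.GetResponseStream())
    using (var ms = new MemoryStream())
    {
        responseStream.CopyTo(ms); // reads in chunks until the end of the stream
        return ms.ToArray();       // copies only the bytes actually written
    }
}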
I am using the code below from Jon Skeet's article. Of late, the binary data that needs to be processed has grown many times over. The binary file that I am trying to import is ~900 MB, almost 1 GB. How do I increase the memory stream size?
public static byte[] ReadFully (Stream stream)
{
byte[] buffer = new byte[32768];
using (MemoryStream ms = new MemoryStream())
{
while (true)
{
int read = stream.Read (buffer, 0, buffer.Length);
if (read <= 0)
return ms.ToArray();
ms.Write (buffer, 0, read);
}
}
}
Your method returns a byte array, which means it will return all of the data in the file. Your entire file will be loaded into memory.
If that is what you want to do, then simply use the built-in File methods:
byte[] bytes = System.IO.File.ReadAllBytes(path);
string text = System.IO.File.ReadAllText(path);
If you don't want to load the entire file into memory, take advantage of your Stream:
using (var fs = new FileStream("path", FileMode.Open))
using (var reader = new StreamReader(fs))
{
var line = reader.ReadLine();
// do stuff with 'line' here, or use one of the other
// StreamReader methods.
}
You don't have to increase the size of MemoryStream - by default it expands to fit the contents.
Apparently there can be problems with memory fragmentation, but you can pre-allocate memory to avoid them:
using (MemoryStream ms = new MemoryStream(1024 * 1024 * 1024)) // initial capacity 1GB
{
}
In my opinion 1GB should be no big deal these days, but it's probably better to process the data in chunks if possible. That is what Streams are designed for.
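For example, a minimal sketch of chunked processing (processChunk is a placeholder for whatever work needs to be done on each block):
// The file is never held in memory as a whole, only one 32 KB buffer at a time.
public static void ProcessInChunks(Stream stream, Action<byte[], int> processChunk)
{
    var buffer = new byte[32768];
    int read;
    while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        processChunk(buffer, read); // only the first 'read' bytes are valid
    }
}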
The issue is as follows, I am using an HttpWebRequest to request some online data from dmo.gov.uk. The response I am reading using a BinaryReader and writing to a MemoryStream. I have packaged the code being used into a simple test method:
public static byte[] Test(int bufferSize)
{
var request = (HttpWebRequest)WebRequest.Create("http://www.dmo.gov.uk/xmlData.aspx?rptCode=D3B.2");
request.Method = "GET";
request.Credentials = CredentialCache.DefaultCredentials;
var buffer = new byte[bufferSize];
using (var httpResponse = (HttpWebResponse)request.GetResponse())
{
using (var ms = new MemoryStream())
{
using (var reader = new BinaryReader(httpResponse.GetResponseStream()))
{
int bytesRead;
while ((bytesRead = reader.Read(buffer, 0, bufferSize)) > 0)
{
ms.Write(buffer, 0, bytesRead);
}
}
return ms.GetBuffer();
}
}
}
My real-life code uses a buffer size of 2048 bytes usually, however I noticed today that this file has a huge amount of empty bytes (\0) at the end which bloats the file size. As a test I tried increasing the buffer size to near-on the file size I expected (I was expecting ~80Kb so made the buffer size 79000) and now I get the right file size. But I'm confused, I expected to get the same file size regardless of the buffer size used to read the data.
The following test:
Console.WriteLine(Test(2048).Length);
Console.WriteLine(Test(79000).Length);
Console.ReadLine();
Yields the following output:
131072
81341
The second figure, using the high buffer size, is the exact file size I was expecting (this file changes daily, so expect that size to differ after today's date). The first figure contains \0 for everything after the expected file size.
What's going on here?
You should change ms.GetBuffer(); to ms.ToArray();.
GetBuffer returns the MemoryStream's entire underlying buffer, including the unused capacity at the end, while ToArray returns only the bytes that have actually been written to the MemoryStream.
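A small illustration of the difference (the exact buffer length is an implementation detail and may vary):
using (var ms = new MemoryStream())
{
    ms.Write(new byte[] { 1, 2, 3 }, 0, 3);

    byte[] raw = ms.GetBuffer(); // underlying buffer, e.g. 256 bytes, zero-padded
    byte[] data = ms.ToArray();  // exactly the 3 bytes written: 1, 2, 3

    Console.WriteLine(raw.Length);  // e.g. 256
    Console.WriteLine(data.Length); // 3
}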
I'm trying to get a byte[] array filled with the request response, without any extra garbage data.
This is how I fetch the data:
using (Stream MyResponseStream = hwresponse.GetResponseStream())
{
byte[] MyBuffer = new byte[4096];
int BytesRead;
while (0 < (BytesRead = MyResponseStream.Read(MyBuffer, 0, MyBuffer.Length)))
{
ByteArrayToFile("request.txt", MyBuffer);
}
}
I use the function 'ByteArrayToFile' to see what data has been received.
public void ByteArrayToFile(string _FileName, byte[] _ByteArray)
{
System.IO.FileStream _FileStream = new System.IO.FileStream(_FileName, System.IO.FileMode.Append, System.IO.FileAccess.Write);
_FileStream.Write(_ByteArray, 0, _ByteArray.Length);
_FileStream.Close();
}
I get the request written to the file, but a lot of null characters are added at the end. How do I trim them? Since I'm going to need this to handle binary files, how can I safely trim the ends and get just a pure array of the response? Thanks!
You need to utilise the value of BytesRead; it indicates exactly how many bytes were received:
public void ByteArrayToFile(string _FileName, byte[] _ByteArray, int _BytesRead)
{
using (var _FileStream = new FileStream(
_FileName, FileMode.Append, FileAccess.Write))
{
_FileStream.Write(_ByteArray, 0, _BytesRead);
}
}
Otherwise you're writing out an array of length X that has only been populated with Y elements, causing a number of 'unused' elements in the array to be written out as well. There is also the possibility of stale data remaining in the buffer from a previous pass, meaning leftover data could end up being written out with the next write.
You should also dispose of FileStream instances when done (although Close does this for a Stream, I'd recommend the consistency of calling Dispose in one of two ways: explicitly, or, as illustrated in the code above, implicitly via the using construct).
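The calling loop from the question then simply passes the count along, for example:
using (Stream MyResponseStream = hwresponse.GetResponseStream())
{
    byte[] MyBuffer = new byte[4096];
    int BytesRead;
    while (0 < (BytesRead = MyResponseStream.Read(MyBuffer, 0, MyBuffer.Length)))
    {
        // Pass the number of bytes actually read so only valid data is written.
        ByteArrayToFile("request.txt", MyBuffer, BytesRead);
    }
}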
I'm very sorry for the conservative title and my question itself, but I'm lost.
The samples provided with ICsharpCode.ZipLib don't include what I'm searching for.
I want to decompress a byte[] by putting it through an InflaterInputStream (ICSharpCode.SharpZipLib.Zip.Compression.Streams.InflaterInputStream).
I found a decompress function, but it doesn't work.
public static byte[] Decompress(byte[] Bytes)
{
ICSharpCode.SharpZipLib.Zip.Compression.Streams.InflaterInputStream stream =
new ICSharpCode.SharpZipLib.Zip.Compression.Streams.InflaterInputStream(new MemoryStream(Bytes));
MemoryStream memory = new MemoryStream();
byte[] writeData = new byte[4096];
int size;
while (true)
{
size = stream.Read(writeData, 0, writeData.Length);
if (size > 0)
{
memory.Write(writeData, 0, size);
}
else break;
}
stream.Close();
return memory.ToArray();
}
It throws an exception at the line (size = stream.Read(writeData, 0, writeData.Length);) saying it has an invalid header.
My question is not how to fix the function; it is not provided with the library, I just found it while googling. My question is: how do I decompress the same way the function does with InflaterInputStream, but without exceptions?
Thanks, and again - sorry for the conservative question.
The code in Lucene is very nice:
public static byte[] Compress(byte[] input) {
// Create the compressor with highest level of compression
Deflater compressor = new Deflater();
compressor.SetLevel(Deflater.BEST_COMPRESSION);
// Give the compressor the data to compress
compressor.SetInput(input);
compressor.Finish();
/*
* Create an expandable byte array to hold the compressed data.
* You cannot use an array that's the same size as the original because
* there is no guarantee that the compressed data will be smaller than
* the uncompressed data.
*/
MemoryStream bos = new MemoryStream(input.Length);
// Compress the data
byte[] buf = new byte[1024];
while (!compressor.IsFinished) {
int count = compressor.Deflate(buf);
bos.Write(buf, 0, count);
}
// Get the compressed data
return bos.ToArray();
}
public static byte[] Uncompress(byte[] input) {
Inflater decompressor = new Inflater();
decompressor.SetInput(input);
// Create an expandable byte array to hold the decompressed data
MemoryStream bos = new MemoryStream(input.Length);
// Decompress the data
byte[] buf = new byte[1024];
while (!decompressor.IsFinished) {
int count = decompressor.Inflate(buf);
bos.Write(buf, 0, count);
}
// Get the decompressed data
return bos.ToArray();
}
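For example, a quick round trip with these two methods (the payload here is arbitrary):
// Compress a small payload and inflate it back.
byte[] original = Encoding.UTF8.GetBytes("some sample payload");
byte[] compressed = Compress(original);
byte[] restored = Uncompress(compressed);
Console.WriteLine(restored.Length == original.Length); // True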
Well it sounds like the data is just inappropriate, and that otherwise the code would work okay. (Admittedly I'd use a "using" statement for the streams instead of calling Close explicitly.)
Where did you get your data from?
Why don't you use the System.IO.Compression.DeflateStream class (available since .NET 2.0)? It uses the same compression/decompression method but doesn't require an extra library dependency.
Since .NET 2.0, you only need ICSharpCode.ZipLib if you need the file container support.
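A minimal sketch of that suggestion (note that DeflateStream expects raw deflate data; input with a zlib or gzip header would need the header handled first, so whether this drop-in works depends on how your data was compressed):
// Requires System.IO and System.IO.Compression.
public static byte[] Decompress(byte[] bytes)
{
    using (var input = new MemoryStream(bytes))
    using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
    using (var output = new MemoryStream())
    {
        var buffer = new byte[4096];
        int read;
        while ((read = deflate.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
        }
        return output.ToArray();
    }
}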