I have a method that takes a web-based image, converts it into a byte array, and then passes that to a CDN.
I have used this successfully hundreds of times; however, while migrating a particular set of images, I've noticed that these JPEG files are identified as .png files, and when they arrive at the CDN they are blank.
Using some code that I copied from the web, I am able to identify the file extension from the image byte array after it is built.
So it's the conversion from the original image to the byte array that is mysteriously changing the file type.
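For reference, the detection works by checking the file's leading magic bytes, along these lines (a simplified sketch, not the exact code I copied):
// Simplified sketch of signature-based detection:
// JPEG files start with FF D8 FF, PNG files start with 89 50 4E 47
private static string GuessExtension(byte[] bytes)
{
    if (bytes.Length >= 3 && bytes[0] == 0xFF && bytes[1] == 0xD8 && bytes[2] == 0xFF)
        return ".jpg";
    if (bytes.Length >= 4 && bytes[0] == 0x89 && bytes[1] == 0x50 && bytes[2] == 0x4E && bytes[3] == 0x47)
        return ".png";
    return "";
}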
This is my method:
public byte[] Get(string fullPath)
{
    byte[] imageBytes = { };
    var imageRequest = (HttpWebRequest)WebRequest.Create(fullPath);
    var imageResponse = imageRequest.GetResponse();
    var responseStream = imageResponse.GetResponseStream();
    if (responseStream != null)
    {
        using (var br = new BinaryReader(responseStream))
        {
            imageBytes = br.ReadBytes(500000);
            br.Close();
        }
        responseStream.Close();
    }
    imageResponse.Close();
    return imageBytes;
}
I have also tried converting this to use MemoryStream instead.
I'm not sure what else I can do to ensure that this identifies the correct file type.
Edit
I have now increased the number of bytes read, which has resulted in viable images.
However, the issue with the JPEG files being altered to PNG is still ongoing.
It's only this selection of images that are affected.
These images were saved in an old CMS system, so I do wonder if the way they were saved is the cause.
Up to now, the code reads at most 500,000 bytes per file. If the file is larger than that, the end is truncated and the content is no longer valid. In order to read all the bytes, you can use the following code:
public byte[] Get(string fullPath)
{
    List<byte> imageBytes = new List<byte>(500000);
    var imageRequest = (HttpWebRequest)WebRequest.Create(fullPath);
    using (var imageResponse = imageRequest.GetResponse())
    {
        using (var responseStream = imageResponse.GetResponseStream())
        {
            using (var br = new BinaryReader(responseStream))
            {
                var buffer = new byte[500000];
                int bytesRead;
                while ((bytesRead = br.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // Add only the bytes actually read; the final chunk is
                    // usually smaller than the buffer (requires using System.Linq)
                    imageBytes.AddRange(buffer.Take(bytesRead));
                }
            }
        }
    }
    return imageBytes.ToArray();
}
The sample above reads the data in chunks of 500,000 bytes; for most of your files, a single chunk should be sufficient. If a file is larger, the code reads more chunks until there are no more bytes to read, and all the chunks are assembled in a list.
This ensures that all the bytes are read, even if the content is larger than 500,000 bytes.
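Alternatively, Stream.CopyTo achieves the same thing more concisely, since it keeps reading until the stream is exhausted (a sketch of the same approach):
public byte[] Get(string fullPath)
{
    var imageRequest = (HttpWebRequest)WebRequest.Create(fullPath);
    using (var imageResponse = imageRequest.GetResponse())
    using (var responseStream = imageResponse.GetResponseStream())
    using (var ms = new MemoryStream())
    {
        // CopyTo loops internally until Read returns 0, so nothing is truncated
        responseStream.CopyTo(ms);
        return ms.ToArray();
    }
}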
Related
I am trying to read an IFormFile received from an HTTP POST request like this:
public async Task<ActionResult> UploadDocument([FromForm]DataWrapper data)
{
    IFormFile file = data.File;
    string fileName = file.FileName;
    long length = file.Length;
    if (length < 0)
        return BadRequest();
    using FileStream fileStream = new FileStream(fileName, FileMode.OpenOrCreate);
    byte[] bytes = new byte[length];
    fileStream.Read(bytes, 0, (int)file.Length);
    ...
}
but something is wrong; after this line executes:
fileStream.Read(bytes, 0, (int)file.Length);
all of the elements of bytes are zero.
Also, the file with the same name is created in my Visual Studio project, which I would prefer not to happen.
You can't open an IFormFile the same way you would a file on disk; you'll have to use IFormFile.OpenReadStream() instead.
public async Task<ActionResult> UploadDocument([FromForm]DataWrapper data)
{
    IFormFile file = data.File;
    long length = file.Length;
    if (length < 0)
        return BadRequest();
    using var fileStream = file.OpenReadStream();
    byte[] bytes = new byte[length];
    fileStream.Read(bytes, 0, (int)file.Length);
}
The reason that fileStream.Read(bytes, 0, (int)file.Length); appears to read nothing is because the file really is empty. IFormFile.FileName is the name of the file given by the request; it doesn't exist on disk.
Your code's intent seems to be to write to a FileStream, not a byte buffer. What it actually does, though, is create a new empty file and read from it into an already zeroed buffer. The uploaded file is never used.
Writing to a file
If you really want to save the file, you can use CopyTo:
using (var stream = File.Create(Path.Combine(folder_I_Really_Want, file.FileName)))
{
    file.CopyTo(stream);
}
If you want to read from the uploaded file into a buffer without saving to disk, use a MemoryStream. That's just a Stream API over a byte[] buffer. You don't have to specify the size, but doing so reduces reallocations as the internal buffer grows.
Reading into byte[]
Reading into a byte[] through a MemoryStream is essentially the same:
var stream = new MemoryStream((int)file.Length);
file.CopyTo(stream);
var bytes = stream.ToArray();
The problem is that you are opening a new FileStream based on the file name in your model, which will be the name of the file the user selected when uploading. Your code creates a new empty file with that name, which is why you are seeing the file in your file system. Your code then reads the bytes from that file, which is empty.
You need to use the IFormFile.OpenReadStream method or one of the CopyTo methods to get the actual data from the stream.
You then write that data to a file on your file system with the name you want.
var filename = "[Enter or create name for your file here]";
// Create the file in your file system with the name you want.
using (FileStream fs = new FileStream(filename, FileMode.OpenOrCreate))
{
    using (MemoryStream ms = new MemoryStream())
    {
        // Copy the uploaded file data to a memory stream
        file.CopyTo(ms);
        // Now write the data in the memory stream to the new file
        byte[] data = ms.ToArray();
        fs.Write(data, 0, data.Length);
    }
}
I have implemented a POC to read entire file content into a byte[] array. I can successfully read files below 100MB, but when I load a file whose size is more than 100MB, it throws
Convert.ToBase64String(mybytearray) Cannot obtain value of the local variable or argument because there is not enough memory available.
Below is the code that I have tried for reading content from the file into a byte array:
var sFile = fileName;
var mybytearray = File.ReadAllBytes(sFile);
var binaryModel = new BinaryModel
{
    fileName = binaryFile.FileName,
    binaryData = Convert.ToBase64String(mybytearray),
    filePath = string.Empty
};
My model class is as below
public class BinaryModel
{
    public string fileName { get; set; }
    public string binaryData { get; set; }
    public string filePath { get; set; }
}
I am getting the "Cannot obtain value of the local variable or argument because there is not enough memory available." error at Convert.ToBase64String(mybytearray).
Is there anything I need to take care of to prevent this error?
Note: I do not want to add line breaks to my file content
To save memory you can convert the stream of bytes in chunks of three. Every three input bytes produce four bytes of Base64 output, so you don't need the whole file in memory at once.
Here is pseudocode:
Repeat:
1. Read at most 3 bytes from the input stream.
2. Convert them to Base64 and write to the output stream.
And a simple implementation:
using (var inStream = File.OpenRead("E:\\Temp\\File.xml"))
using (var outStream = File.CreateText("E:\\Temp\\File.base64"))
{
    var buffer = new byte[3];
    int read;
    while ((read = inStream.Read(buffer, 0, 3)) > 0)
    {
        var base64 = Convert.ToBase64String(buffer, 0, read);
        outStream.Write(base64);
    }
}
Hint: any multiple of 3 is a valid buffer size. Higher means more memory but better performance; lower means less memory but worse performance.
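To see why the chunk size matters: Convert.ToBase64String pads any chunk whose length is not a multiple of 3, so concatenating such chunks produces invalid Base64 with padding in the middle. A quick illustration:
var data = new byte[] { 1, 2, 3, 4 };
// Encoded in one go: "AQIDBA=="
Console.WriteLine(Convert.ToBase64String(data));
// Encoded in 2-byte chunks, padding appears mid-stream: "AQI=AwQ="
Console.WriteLine(Convert.ToBase64String(data, 0, 2) + Convert.ToBase64String(data, 2, 2));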
Additional info:
The file stream is just an example. For the result stream, use [HttpContext].Response.OutputStream and write directly to it; processing hundreds of megabytes in one chunk will kill you and your server.
Think about the total memory requirements: 100MB of binary data becomes roughly 133MB as a Base64 string, and since you wrote about a model, I expect a copy of those 133MB in the response. And remember, that's just a single request; a few such requests could drain your memory.
I would use two FileStreams: one to read the large file and one to write the result back out.
So, in chunks, you would convert to Base64, convert the resulting string to bytes, and write.
private static void ConvertLargeFileToBase64()
{
    // The chunk size must be a multiple of 3 so that no Base64 padding
    // is inserted mid-stream (16 * 1024 is not a multiple of 3)
    var buffer = new byte[3 * 4096];
    using (var fsIn = new FileStream("D:\\in.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        using (var fsOut = new FileStream("D:\\out.txt", FileMode.CreateNew, FileAccess.Write))
        {
            int read;
            while ((read = fsIn.Read(buffer, 0, buffer.Length)) > 0)
            {
                // convert only the bytes actually read to Base64, then to
                // ASCII bytes for writing back to the output file
                var b64 = Encoding.ASCII.GetBytes(Convert.ToBase64String(buffer, 0, read));
                fsOut.Write(b64, 0, b64.Length);
            }
        }
    }
}
I use the following code to return a byte array in HttpResponseMessage:
using (WebResponse response = (HttpWebResponse)request.GetResponse())
{
    byte[] bytes = ReadFully(response.GetResponseStream());
    ......
}
public static byte[] ReadFully(Stream input)
{
    byte[] buffer = new byte[16 * 1024];
    using (MemoryStream ms = new MemoryStream())
    {
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            ms.Write(buffer, 0, read);
        }
        return ms.ToArray(); // This line throws OutOfMemory exception
    }
}
An OutOfMemory exception is thrown in the last return ms.ToArray() statement.
I need to set the resulting byte[] as HttpResponseMessage.Content.
You should return the stream directly instead of reading it into memory first.
public HttpResponseMessage CreateMessage(Stream input)
{
    HttpResponseMessage result = new HttpResponseMessage(HttpStatusCode.OK);
    result.Content = new StreamContent(input);
    return result;
}
Do not forget to set the appropriate headers etc.
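For example, the content type and a suggested file name can be set on the content headers like this (hypothetical values, adjust to your case):
result.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")
{
    FileName = "data.bin" // hypothetical file name
};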
Edit
... i need to write the byte array from the HttpResponseMessage into a file
Based on your last comment, you changed your question and want to go the other way. Here is an example of writing a web response to a file.
public void writetoFile(HttpWebResponse response)
{
    using (var inStream = response.GetResponseStream())
    using (var file = System.IO.File.OpenWrite("your file path here"))
    {
        inStream.CopyTo(file);
    }
}
Igor posted the solution, and it is the correct way to deal with stream content. Use one of the MVC helper functions like File(stream, contentType) or classes like StreamContent to send the stream contents directly to the client, e.g.:
return File(myStream, myExcelContentTypeString);
or
return File(myStream, myExcelContentTypeString, "ReallyBigFile.xlsx");
The reason for the error is that an OOM exception can occur when memory is too fragmented to allocate a new object. A MemoryStream stores data in a buffer; when the data exceeds the buffer's limits, the stream allocates a new buffer with double the capacity and copies the old data over. Copying 250MB of data like this causes a lot of reallocations and thus a lot of memory fragmentation.
This can be avoided by specifying the desired capacity in the stream's constructor, which allocates a large enough buffer immediately.
It's even better, though, to avoid caching this content at all by sending it to the browser directly.
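If the content really must be buffered, a pre-sized MemoryStream avoids the repeated doubling. A minimal sketch, assuming the length is known up front (input and expectedLength are stand-ins):
// expectedLength is assumed known, e.g. from a Content-Length header
using (var ms = new MemoryStream(expectedLength))
{
    input.CopyTo(ms);
    return ms.ToArray();
}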
The issue is as follows: I am using an HttpWebRequest to request some online data from dmo.gov.uk. I read the response using a BinaryReader and write it to a MemoryStream. I have packaged the code into a simple test method:
public static byte[] Test(int bufferSize)
{
    var request = (HttpWebRequest)WebRequest.Create("http://www.dmo.gov.uk/xmlData.aspx?rptCode=D3B.2");
    request.Method = "GET";
    request.Credentials = CredentialCache.DefaultCredentials;
    var buffer = new byte[bufferSize];
    using (var httpResponse = (HttpWebResponse)request.GetResponse())
    {
        using (var ms = new MemoryStream())
        {
            using (var reader = new BinaryReader(httpResponse.GetResponseStream()))
            {
                int bytesRead;
                while ((bytesRead = reader.Read(buffer, 0, bufferSize)) > 0)
                {
                    ms.Write(buffer, 0, bytesRead);
                }
            }
            return ms.GetBuffer();
        }
    }
}
My real-life code usually uses a buffer size of 2048 bytes; however, I noticed today that this file has a huge number of empty bytes (\0) at the end, which bloats the file size. As a test, I tried increasing the buffer size to nearly the file size I expected (I was expecting ~80KB, so I made the buffer size 79000), and now I get the right file size. But I'm confused: I expected to get the same file size regardless of the buffer size used to read the data.
The following test:
Console.WriteLine(Test(2048).Length);
Console.WriteLine(Test(79000).Length);
Console.ReadLine();
Yields the following output:
131072
81341
The second figure, using the large buffer size, is the exact file size I was expecting (this file changes daily, so expect the size to differ after today's date). In the first figure, everything after the expected file size is \0.
What's going on here?
You should change ms.GetBuffer() to ms.ToArray().
GetBuffer returns the entire underlying buffer of the MemoryStream, including any unused allocated space, while ToArray returns only the bytes that have actually been written to the stream.
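A quick way to see the difference:
var ms = new MemoryStream();
ms.Write(new byte[] { 1, 2, 3 }, 0, 3);
Console.WriteLine(ms.GetBuffer().Length); // capacity of the internal buffer (e.g. 256), the rest is \0
Console.WriteLine(ms.ToArray().Length);   // 3 - only the bytes actually written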
I have the following method, and for some reason the first call to CopyTo seems to do nothing. Anyone know why?
The input to the method is compressed and Base64-encoded; I can supply the method that produces it if needed.
private byte[] GetFileChunk(string base64)
{
    using (
        MemoryStream compressedData = new MemoryStream(Convert.FromBase64String(base64), false),
        uncompressedData = new MemoryStream())
    {
        using (GZipStream compressionStream = new GZipStream(compressedData, CompressionMode.Decompress))
        {
            // first copy does nothing ?? second works
            compressionStream.CopyTo(uncompressedData);
            compressionStream.CopyTo(uncompressedData);
        }
        return uncompressedData.ToArray();
    }
}
If the first call to Read() returns 0, then Stream.CopyTo() isn't going to work either. While this points to a problem with GZipStream, it is very unlikely that it has a bug like this. Far more likely is that something went wrong when you created the compressed data, like compressing 0 bytes first, followed by compressing the real data.
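For reference, CopyTo is essentially just a Read/Write loop, so the first Read drives everything that follows. A simplified sketch of what CopyTo does internally (source and destination are stand-ins):
// Simplified equivalent of source.CopyTo(destination)
var buffer = new byte[81920]; // CopyTo's default buffer size
int read;
while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
{
    destination.Write(buffer, 0, read);
}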
Just a guess, but is it because the new GZipStream constructor leaves the index at the end of the array, and the first CopyTo resets it to the start, so that when you call the second CopyTo it's now at the start and copies the data properly?
How sure are you that the first copy does nothing and the second works? That would be a bug in the GZipStream class. Your code should work fine without calling CopyTo twice.
Hi, thank you for everybody's input. It turns out the error was caused by a mistake in the encoding method. The method was:
/// <summary>
/// Compresses file data and then Base64-encodes the compressed data for safe transportation in XML.
/// </summary>
/// <returns>Base64 string of file chunk</returns>
private string GetFileChunk()
{
    // MemoryStream for compression output
    using (MemoryStream compressed = new MemoryStream())
    {
        using (GZipStream zip = new GZipStream(compressed, CompressionMode.Compress))
        {
            // read chunk from file
            byte[] plaintext = new byte[this.readSize];
            int read = this.file.Read(plaintext, 0, plaintext.Length);
            // write chunk to compression
            zip.Write(plaintext, 0, read);
            plaintext = null;
            // Base64 the compressed data (bug: the GZipStream has not been
            // flushed yet, so the compressed data is incomplete at this point)
            return Convert.ToBase64String(compressed.ToArray());
        }
    }
}
The return line should be below the using block, allowing the compression stream to close and flush; this caused the inconsistent behaviour when decompressing the stream.
/// <summary>
/// Compresses file data and then Base64-encodes the compressed data for safe transportation in XML.
/// </summary>
/// <returns>Base64 string of file chunk</returns>
private string GetFileChunk()
{
    // MemoryStream for compression output
    using (MemoryStream compressed = new MemoryStream())
    {
        using (GZipStream zip = new GZipStream(compressed, CompressionMode.Compress))
        {
            // read chunk from file
            byte[] plaintext = new byte[this.readSize];
            int read = this.file.Read(plaintext, 0, plaintext.Length);
            // write chunk to compression
            zip.Write(plaintext, 0, read);
            plaintext = null;
        }
        // Base64 the compressed data (the GZipStream is now closed and flushed)
        return Convert.ToBase64String(compressed.ToArray());
    }
}
Thank you for everybody's help.