What is the preferred method for creating a byte array from an input stream?
Here is my current solution with .NET 3.5.
Stream s;
byte[] b;
using (BinaryReader br = new BinaryReader(s))
{
b = br.ReadBytes((int)s.Length);
}
Is it still a better idea to read and write chunks of the stream?
It really depends on whether or not you can trust s.Length. For many streams, you just don't know how much data there will be. In such cases - and before .NET 4 - I'd use code like this:
public static byte[] ReadFully(Stream input)
{
byte[] buffer = new byte[16*1024];
using (MemoryStream ms = new MemoryStream())
{
int read;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
ms.Write(buffer, 0, read);
}
return ms.ToArray();
}
}
With .NET 4 and above, I'd use Stream.CopyTo, which is basically equivalent to the loop in my code - create the MemoryStream, call stream.CopyTo(ms) and then return ms.ToArray(). Job done.
I should perhaps explain why my answer is longer than the others. Stream.Read doesn't guarantee that it will read everything it's asked for. If you're reading from a network stream, for example, it may read one packet's worth and then return, even if there will be more data soon. BinaryReader.ReadBytes will keep going until the end of the stream or your specified size, but you still have to know the size to start with.
The above method will keep reading (and copying into a MemoryStream) until it runs out of data. It then asks the MemoryStream to return a copy of the data in an array. If you know the size to start with - or think you know the size, without being sure - you can construct the MemoryStream to be that size to start with. Likewise you can put a check at the end, and if the length of the stream is the same size as the buffer (returned by MemoryStream.GetBuffer) then you can just return the buffer. So the above code isn't quite optimised, but will at least be correct. It doesn't assume any responsibility for closing the stream - the caller should do that.
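The two optimisations described above (pre-sizing the MemoryStream and returning its internal buffer when it is exactly full) might look like the sketch below. The class name and the expectedSize parameter are made up for illustration; expectedSize is only a hint and is allowed to be wrong.

```csharp
using System;
using System.IO;

public static class StreamReadSketch
{
    // expectedSize is a hypothetical hint from the caller; the result is
    // still correct when the hint is too small or too large.
    public static byte[] ReadFully(Stream input, int expectedSize)
    {
        byte[] chunk = new byte[16 * 1024];
        using (MemoryStream ms = new MemoryStream(Math.Max(expectedSize, 0)))
        {
            int read;
            while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
            {
                ms.Write(chunk, 0, read);
            }
            // If the internal buffer is exactly full, hand it back without copying;
            // otherwise fall back to ToArray, which copies only the bytes written.
            byte[] internalBuffer = ms.GetBuffer();
            return internalBuffer.Length == ms.Length ? internalBuffer : ms.ToArray();
        }
    }
}
```

When the hint is accurate, the method returns the stream's own buffer and skips the final copy; when it is not, the behaviour is the same as the plain loop.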
See this article for more info (and an alternative implementation).
While Jon's answer is correct, he is rewriting code that already exists in CopyTo. So for .NET 4 and later use Sandip's solution, and for earlier versions of .NET use Jon's answer. Sandip's code would be improved by a "using" statement, because exceptions in CopyTo are quite likely in many situations and would otherwise leave the MemoryStream undisposed.
public static byte[] ReadFully(Stream input)
{
using (MemoryStream ms = new MemoryStream())
{
input.CopyTo(ms);
return ms.ToArray();
}
}
Just want to point out that if you already have a MemoryStream, you can simply call memorystream.ToArray() on it.
Also, if you are dealing with streams of unknown or varying subtypes and may receive a MemoryStream, you can rely on that method for those cases and still use the accepted answer for the others, like this:
public static byte[] StreamToByteArray(Stream stream)
{
if (stream is MemoryStream)
{
return ((MemoryStream)stream).ToArray();
}
else
{
// Jon Skeet's accepted answer
return ReadFully(stream);
}
}
MemoryStream ms = new MemoryStream();
file.PostedFile.InputStream.CopyTo(ms);
var byts = ms.ToArray();
ms.Dispose();
Just my two cents: I often organize methods like this into a custom helper class.
public static class StreamHelpers
{
public static byte[] ReadFully(this Stream input)
{
using (MemoryStream ms = new MemoryStream())
{
input.CopyTo(ms);
return ms.ToArray();
}
}
}
Add the namespace to the config file and use it anywhere you wish.
You can simply use the ToArray() method of the MemoryStream class, for example:
MemoryStream ms = (MemoryStream)dataInStream;
byte[] imageBytes = ms.ToArray();
You can even make it fancier with extensions:
namespace Foo
{
public static class Extensions
{
public static byte[] ToByteArray(this Stream stream)
{
using (stream)
{
using (MemoryStream memStream = new MemoryStream())
{
stream.CopyTo(memStream);
return memStream.ToArray();
}
}
}
}
}
And then call it as a regular method:
byte[] arr = someStream.ToByteArray();
I get a compile time error with Bob's (i.e. the questioner's) code. Stream.Length is a long whereas BinaryReader.ReadBytes takes an integer parameter. In my case, I do not expect to be dealing with Streams large enough to require long precision, so I use the following:
Stream s;
byte[] b;
if (s.Length > int.MaxValue) {
throw new Exception("This stream is larger than the conversion algorithm can currently handle.");
}
using (var br = new BinaryReader(s)) {
b = br.ReadBytes((int)s.Length);
}
In case anyone likes it, here is a .NET 4+ only solution formed as an extension method without the needless Dispose call on the MemoryStream. This is a hopelessly trivial optimization, but it is worth noting that failing to Dispose a MemoryStream is not a real failure.
public static class StreamHelpers
{
public static byte[] ReadFully(this Stream input)
{
var ms = new MemoryStream();
input.CopyTo(ms);
return ms.ToArray();
}
}
The one above is OK, but you will encounter data corruption when you send the data over SMTP (if you need to). I've changed it to the following, which reads the file byte for byte:
using System;
using System.IO;
private static byte[] ReadFully(string input)
{
FileStream sourceFile = new FileStream(input, FileMode.Open); // open the file stream
BinaryReader binReader = new BinaryReader(sourceFile);
byte[] output = new byte[sourceFile.Length]; // create a byte array the size of the file
for (long i = 0; i < sourceFile.Length; i++)
output[i] = binReader.ReadByte(); // read one byte at a time until done
sourceFile.Close(); // close the file stream
binReader.Close(); // close the reader
return output;
}
Combining two of the most up-voted answers into an extension method:
public static byte[] ToByteArray(this Stream stream)
{
if (stream is MemoryStream)
return ((MemoryStream)stream).ToArray();
else
{
using MemoryStream ms = new();
stream.CopyTo(ms);
return ms.ToArray();
}
}
Create a helper class and reference it anywhere you wish to use it.
public static class StreamHelpers
{
public static byte[] ReadFully(this Stream input)
{
using (MemoryStream ms = new MemoryStream())
{
input.CopyTo(ms);
return ms.ToArray();
}
}
}
The RestSharp.Extensions namespace has a ReadAsBytes method. Internally it uses a MemoryStream with the same code as some of the examples on this page, but if you are already using RestSharp this is the easiest way.
using RestSharp.Extensions;
var byteArray = inputStream.ReadAsBytes();
If a stream supports the Length property, the byte array can be created directly. The advantage over the MemoryStream approach is that MemoryStream.ToArray effectively allocates the data twice (once for the internal buffer and once for the copy it returns), and the internal buffer probably holds some unused extra bytes as well. This solution allocates exactly the array needed. If the stream does not support the Length property, it will throw a NotSupportedException.
It is also worth noting that arrays cannot be bigger than int.MaxValue.
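public static async Task<byte[]> ToArrayAsync(this Stream stream)
{
var array = new byte[stream.Length];
int total = 0, read;
// ReadAsync may return fewer bytes than requested, so loop until the array is full
while (total < array.Length &&
(read = await stream.ReadAsync(array, total, array.Length - total)) > 0)
total += read;
return array;
}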
Complete code which switches between both versions based on whether the stream supports seeking or not. It includes checks for Position and unreliable Length. That might slightly reduce speed. In my tests ToArrayAsyncDirect is about 3 times faster compared to ToArrayAsyncGeneral.
public static class StreamExtensions
{
public static readonly byte[] TempArray = new byte[4];
/// <summary>
/// Converts stream to byte array.
/// </summary>
/// <param name="stream">Stream</param>
/// <param name="cancellationToken">Cancellation token</param>
/// <returns>Stream data as array</returns>
/// <returns>Binary data from stream in an array</returns>
public static async Task<byte[]> ToArrayAsync(this Stream stream, CancellationToken cancellationToken)
{
if (!stream.CanRead)
{
throw new AccessViolationException("Stream cannot be read");
}
if (stream.CanSeek)
{
return await ToArrayAsyncDirect(stream, cancellationToken);
}
else
{
return await ToArrayAsyncGeneral(stream, cancellationToken);
}
}
/// <summary>
/// Converts stream to byte array through MemoryStream. This doubles allocations compared to ToArrayAsyncDirect.
/// </summary>
/// <param name="stream">Stream</param>
/// <param name="cancellationToken">Cancellation token</param>
/// <returns></returns>
private static async Task<byte[]> ToArrayAsyncGeneral(Stream stream, CancellationToken cancellationToken)
{
using MemoryStream memoryStream = new MemoryStream();
await stream.CopyToAsync(memoryStream, cancellationToken);
return memoryStream.ToArray();
}
/// <summary>
/// Converts stream to byte array without unnecessary allocations.
/// </summary>
/// <param name="stream">Stream</param>
/// <param name="cancellationToken">Cancellation token</param>
/// <returns>Stream data as array</returns>
/// <exception cref="ArgumentException">Thrown if stream is not providing correct Length</exception>
private static async Task<byte[]> ToArrayAsyncDirect(Stream stream, CancellationToken cancellationToken)
{
if (stream.Position > 0)
{
throw new ArgumentException("Stream is not at the start!");
}
var array = new byte[stream.Length];
int bytesRead = await stream.ReadAsync(array, 0, (int)stream.Length, cancellationToken);
if (bytesRead != array.Length ||
await stream.ReadAsync(TempArray, 0, TempArray.Length, cancellationToken) > 0)
{
throw new ArgumentException("Stream does not have reliable Length!");
}
return array;
}
}
This is the function I am using; it is tested and works well.
Please bear in mind that 'input' should not be null and that 'input.Position' should be reset to 0 before reading; otherwise the read loop ends immediately and nothing is read into the array.
public static byte[] StreamToByteArray(Stream input)
{
if (input == null)
return null;
byte[] buffer = new byte[16 * 1024];
input.Position = 0;
using (MemoryStream ms = new MemoryStream())
{
int read;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
ms.Write(buffer, 0, read);
}
byte[] temp = ms.ToArray();
return temp;
}
}
Since there's no modern (i.e. async) version of this answer, this is the extension method I use for this purpose:
public static async Task<byte[]> ReadAsByteArrayAsync(this Stream source)
{
// Optimization
if (source is MemoryStream memorySource)
return memorySource.ToArray();
using var memoryStream = new MemoryStream();
await source.CopyToAsync(memoryStream);
return memoryStream.ToArray();
}
The optimization is based on the fact that the source code for ToArray calls some internal methods.
You can use this extension method.
public static class StreamExtensions
{
public static byte[] ToByteArray(this Stream stream)
{
var bytes = new List<byte>();
int b;
// -1 is a special value that mark the end of the stream
while ((b = stream.ReadByte()) != -1)
bytes.Add((byte)b);
return bytes.ToArray();
}
}
I was able to make it work on a single line:
byte[] byteArr = ((MemoryStream)localStream).ToArray();
As clarified by johnnyRose, the above code will only work for a MemoryStream.
Related
The following is a simple compression method I wrote using DeflateStream:
public static int Compress(
byte[] inputData,
int inputStartIndex,
int inputLength,
byte[] outputData,
int outputStartIndex,
int outputLength)
{
if (inputData == null)
throw new ArgumentNullException("inputData must be non-null");
MemoryStream memStream = new MemoryStream(outputData, outputStartIndex, outputLength);
using (DeflateStream dstream = new DeflateStream(memStream, CompressionLevel.Optimal))
{
dstream.Write(inputData, inputStartIndex, inputLength);
return (int)(memStream.Position - outputStartIndex);
}
}
What is special in this method is that I didn't use the parameter-less constructor of MemoryStream. This is because it is a high-throughput server. Array outputData is rented from ArrayPool, to be used to hold the compressed bytes, so that after I make use of it I can return it to ArrayPool.
The compression happened properly, and the compressed data is properly placed into outputData, but memStream.Position was zero, so I can't find out how many bytes have been written into the MemoryStream.
Only part of outputData is occupied by the compressed data. How do I find out the length of the compressed data?
MemoryStream.Position is 0 because the data had not actually been written there yet at the point you read Position. Instead, tell DeflateStream to leave the underlying stream (the MemoryStream) open, then dispose the DeflateStream. At that point you can be sure it has finished writing whatever it needs to. Now you can read MemoryStream.Position to check how many bytes were written:
public static int Compress(
byte[] inputData,
int inputStartIndex,
int inputLength,
byte[] outputData,
int outputStartIndex,
int outputLength)
{
if (inputData == null)
throw new ArgumentNullException("inputData must be non-null");
using (var memStream = new MemoryStream(outputData, outputStartIndex, outputLength)) {
// leave open
using (DeflateStream dstream = new DeflateStream(memStream, CompressionLevel.Optimal, leaveOpen: true)) {
dstream.Write(inputData, inputStartIndex, inputLength);
}
return (int) memStream.Position; // now it's not 0
}
}
You also don't need to subtract outputStartIndex, because Position is already relative to the index you passed to the constructor.
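As a rough check of both points (Position is only final once the DeflateStream is disposed, and it is relative to the start index), here is a small round-trip sketch with a rented buffer. The class and method names, the offset of 16 and the 1024-byte headroom are assumptions made for the example, and ArrayPool requires .NET Core or the System.Buffers package.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class PooledCompressSketch
{
    // Compresses input into output[outputStart .. outputStart + outputLength)
    // and returns the number of compressed bytes written.
    public static int Compress(byte[] input, byte[] output, int outputStart, int outputLength)
    {
        using (var memStream = new MemoryStream(output, outputStart, outputLength))
        {
            using (var dstream = new DeflateStream(memStream, CompressionLevel.Optimal, leaveOpen: true))
            {
                dstream.Write(input, 0, input.Length);
            } // disposing the DeflateStream writes the remaining compressed data
            return (int)memStream.Position; // relative to outputStart
        }
    }

    // Decompresses 'count' bytes starting at 'start'.
    public static byte[] Decompress(byte[] data, int start, int count)
    {
        using (var source = new MemoryStream(data, start, count))
        using (var dstream = new DeflateStream(source, CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            dstream.CopyTo(result);
            return result.ToArray();
        }
    }

    // Compresses into a rented buffer at a non-zero offset and verifies the round trip.
    public static bool RoundTrips(byte[] input)
    {
        const int start = 16; // arbitrary non-zero offset for the demonstration
        byte[] rented = ArrayPool<byte>.Shared.Rent(input.Length + 1024);
        try
        {
            int written = Compress(input, rented, start, rented.Length - start);
            byte[] restored = Decompress(rented, start, written);
            return restored.SequenceEqual(input);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

If the returned length were wrong, or were offset by the start index, the decompression step would fail or produce different bytes.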
I've been needing to convert my driver to bytes, so I can load it without downloading anything.
Here is what I've tried.
class Program
{
public byte[] StreamFile(string filename)
{
FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read);
// Create a byte array of file stream length
byte[] ImageData = new byte[fs.Length];
//Read block of bytes from stream into the byte array
fs.Read(ImageData, 0, System.Convert.ToInt32(fs.Length));
byte[] bytes = System.IO.File.ReadAllBytes(filename);
//Close the File Stream
fs.Close();
return ImageData; //return the byte data
}
static void Main(string[] args)
{
StreamFile(@"");
}
}
I get an error in my Main,
An object reference is required for the non-static field, method, or property "Program.StreamFile(string)"
Does anyone know why this happens?
Setting aside that it is not clear to me what you would like to achieve, your code does not compile because StreamFile needs to be static: you are calling it from a static method.
So this fixes the compile error:
class Program
{
// add the static keyword to this method
public static byte[] StreamFile(string filename)
{
FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read);
// Create a byte array of file stream length
byte[] ImageData = new byte[fs.Length];
//Read block of bytes from stream into the byte array
fs.Read(ImageData, 0, System.Convert.ToInt32(fs.Length));
byte[] bytes = System.IO.File.ReadAllBytes(filename);
//Close the File Stream
fs.Close();
return ImageData; //return the byte data
}
static void Main(string[] args)
{
StreamFile(@"");
}
}
You can rewrite your code and make it more robust in this way (.NET Core):
class Program
{
public static byte[] StreamFile(string filename)
{
return System.IO.File.ReadAllBytes(filename);
}
static void Main(string[] args)
{
StreamFile(@"");
}
}
Edit: the code below shows how to use using, but it requires a little more knowledge about how to handle the buffers (I just throw an exception on a short read, as that is out of scope here).
class Program
{
public static byte[] StreamFile(string filename)
{
byte[] data;
// let the stream be managed on its own
using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read))
{
// If you want to ask for the length of the file, be sure nobody is changing it over time. See the FileShare.Read above
data = new byte[fs.Length];
if (data.Length != fs.Read(data, 0, data.Length))
throw new InvalidOperationException("Something went wrong.");
}
return data;
}
static void Main(string[] args)
{
StreamFile(@"");
}
}
Please note that the using construct automatically closes/disposes the resource you are using when it goes out of scope [1]. This way you cannot forget to release it, and your intent is clear.
You must lock the file (in your case, just requesting via FileShare.Read that nobody else can change it); otherwise you have a race condition on the file and might read inconsistent data.
[1] The resource must implement the IDisposable interface.
The error you get is caused by calling a non-static method from inside the static Main method.
Maybe you should change your question title.
Add the static keyword to the StreamFile method.
So I have a file upload form which (after uploading) encrypts the file and uploads it to an S3 bucket. However, I'm doing an extra step which I want to avoid. First, I'll show you some code what I am doing now:
using (MemoryStream memoryStream = new MemoryStream())
{
Security.EncryptFile(FileUpload.UploadedFile.OpenReadStream(), someByteArray, memoryStream);
memoryStream.Position = 0; // reset its position
await S3Helper.Upload(objectName, memoryStream);
}
My Security.EncryptFile method:
public static void EncryptFile(Stream inputStream, byte[] key, Stream outputStream)
{
CryptoStream cryptoStream;
using (SymmetricAlgorithm cipher = Aes.Create())
using (inputStream)
{
cipher.Key = key;
// aes.IV will be automatically populated with a secure random value
byte[] iv = cipher.IV;
// Write a marker header so we can identify how to read this file in the future
outputStream.WriteByte(69);
outputStream.WriteByte(74);
outputStream.WriteByte(66);
outputStream.WriteByte(65);
outputStream.WriteByte(69);
outputStream.WriteByte(83);
outputStream.Write(iv, 0, iv.Length);
using (cryptoStream =
new CryptoStream(inputStream, cipher.CreateEncryptor(), CryptoStreamMode.Read))
{
cryptoStream.CopyTo(outputStream);
}
}
}
The S3Helper.Upload method:
public async static Task Upload(string objectName, Stream inputStream)
{
try
{
// Upload a file to bucket.
using (inputStream)
{
await minio.PutObjectAsync(S3BucketName, objectName, inputStream, inputStream.Length);
}
Console.Out.WriteLine("[Bucket] Successfully uploaded " + objectName);
}
catch (MinioException e)
{
Console.WriteLine("[Bucket] Upload exception: {0}", e.Message);
}
}
So, what happens above is I'm creating a MemoryStream, running the EncryptFile() method (which outputs it back to the stream), I reset the stream position and finally reuse it again to upload it to the S3 bucket (Upload()).
The question
What I'd like to do is the following (if possible): directly upload the uploaded file to the S3 bucket, without storing the full file in memory first (kinda like the code below, even though it's not working):
await S3Helper.Upload(objectName, Security.EncryptFile(FileUpload.UploadedFile.OpenReadStream(), someByteArray));
So I assume it has to return a buffer to the Upload method, which will upload it, and waits for the EncryptFile() method to return a buffer again until the file has been fully read. Any pointers to the right direction will be greatly appreciated.
What you could do is write your own EncryptionStream class that derives from Stream. When you read from this stream, it takes a block from the input stream, encrypts it and then outputs the encrypted data.
As an example, something like this:
public class EncrypStream : Stream {
private Stream _cryptoStream;
private SymmetricAlgorithm _cipher;
private Stream InputStream { get; }
private byte[] Key { get; }
public EncrypStream(Stream inputStream, byte[] key) {
this.InputStream = inputStream;
this.Key = key;
}
public override int Read(byte[] buffer, int offset, int count) {
int headerLength = 0;
if (this._cipher == null) {
_cipher = Aes.Create();
_cipher.Key = Key;
// _cipher.IV is automatically populated with a secure random value
byte[] iv = _cipher.IV;
// Write a marker header so we can identify how to read this file in the future
// #TODO Make sure the BUFFER is big enough...
var idx = offset;
buffer[idx++] = 69;
buffer[idx++] = 74;
buffer[idx++] = 66;
buffer[idx++] = 65;
buffer[idx++] = 69;
buffer[idx++] = 83;
Array.Copy(iv, 0, buffer, idx, iv.Length);
// The marker and IV use up part of the caller's buffer, so shrink the remaining window
headerLength = 6 + iv.Length;
offset += headerLength;
count -= headerLength;
// Startup stream
this._cryptoStream = new CryptoStream(InputStream, _cipher.CreateEncryptor(), CryptoStreamMode.Read);
}
// Read the next encrypted block and count the header bytes as part of this read
return headerLength + this._cryptoStream.Read(buffer, offset, count);
}
protected override void Dispose(bool disposing) {
base.Dispose(disposing);
// Make SURE you properly dispose the underlying streams!
this.InputStream?.Dispose();
this._cipher?.Dispose();
this._cryptoStream?.Dispose();
}
// Omitted other methods from stream for readability...
}
Which allows you to call the stream as:
using (var stream = new EncrypStream(FileUpload.UploadedFile.OpenReadStream(), someByteArray)) {
await S3Helper.Upload(objectName, stream);
}
I notice that your upload method requires the total byte length of the encrypted data; you can look into this post here to get an idea of how to calculate it.
(I'm guessing that the CryptoStream does not report the expected length of the encrypted data, but please correct me if I'm wrong on this.)
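For the format used in the question (six marker bytes, then the IV, then the ciphertext), the total length can be predicted up front, assuming the defaults of Aes.Create() are left unchanged (CBC mode, PKCS7 padding, 16-byte blocks). The sketch below is only an illustration under those assumptions; the class and method names are made up, and Measure exists only to compare the prediction against a real encryption.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class EncryptedLengthSketch
{
    private const int MarkerLength = 6; // the six marker bytes written before the IV
    private const int BlockSize = 16;   // AES block size in bytes; the IV has the same size

    // Predicted total output size for a plaintext of the given length.
    public static long Predict(long plaintextLength)
    {
        // PKCS7 always adds between 1 and 16 padding bytes, so the ciphertext is
        // the next multiple of 16 strictly greater than the plaintext length.
        long cipherLength = (plaintextLength / BlockSize + 1) * BlockSize;
        return MarkerLength + BlockSize + cipherLength;
    }

    // Encrypts for real and returns the measured size, to compare with Predict.
    public static long Measure(byte[] plaintext)
    {
        using (Aes aes = Aes.Create())
        using (var output = new MemoryStream())
        {
            output.Write(new byte[MarkerLength], 0, MarkerLength);
            output.Write(aes.IV, 0, aes.IV.Length);
            using (var input = new MemoryStream(plaintext))
            using (var crypto = new CryptoStream(input, aes.CreateEncryptor(), CryptoStreamMode.Read))
            {
                crypto.CopyTo(output);
            }
            return output.Length;
        }
    }
}
```

This only helps when the length of the uploaded file is known before encryption starts; a different cipher mode or padding setting would need a different formula.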
I'm trying to compress some text in my UWP application. I created this method to make it easier later on:
public static byte[] Compress(this string s)
{
var b = Encoding.UTF8.GetBytes(s);
using (MemoryStream ms = new MemoryStream())
using (GZipStream zipStream = new GZipStream(ms, CompressionMode.Compress))
{
zipStream.Write(b, 0, b.Length);
zipStream.Flush(); //Doesn't seem like Close() is available in UWP, so I changed it to Flush(). Is this the problem?
return ms.ToArray();
}
}
But unfortunately this always returns 10 bytes, no matter what the input text is. Is it because I don't use .Close() on the GZipStream?
You are returning the byte data too early.
The Close() method has been replaced by the Dispose() method, and the GZIP data is only fully written when the stream is disposed, that is, after you leave the using (GZipStream) { } block.
public static byte[] Compress(string s)
{
var b = Encoding.UTF8.GetBytes(s);
var ms = new MemoryStream();
using (GZipStream zipStream = new GZipStream(ms, CompressionMode.Compress))
{
zipStream.Write(b, 0, b.Length);
zipStream.Flush(); // Flush() alone does not finish the GZIP stream; disposing it at the end of the using block does
}
// we create the data array here once the GZIP stream has been disposed
var data = ms.ToArray();
ms.Dispose();
return data;
}
I have this code:
public static List<ReplicableObject> ParseStreamForObjects(Stream stream)
{
List<ReplicableObject> result = new List<ReplicableObject>();
while (true)
{
// HERE I want to check that there's at least four bytes left in the stream
BinaryReader br = new BinaryReader(stream);
int length = br.ReadInt32();
// HERE I want to check that there's enough bytes left in the stream
byte[] bytes = br.ReadBytes(length);
MemoryStream ms = new MemoryStream(bytes);
ms.Position = 0;
result.Add((ReplicableObject) Formatter.Deserialize(ms));
ms.Close();
br.Close();
}
return result;
}
Unfortunately, the stream object is always going to be a TCP stream, which means no seek operations. So how can I check to make sure that I'm not over-running the stream where I've put the // HERE comments?
I don't think there's any way to query a NetworkStream to find the data you're looking for. What you'll probably need to do is buffer whatever data the stream makes available into another data structure, then parse objects out of that structure once you know it's got enough bytes in it.
The NetworkStream class provides a DataAvailable property that tells you if any data is available to be read, and the Read() method returns a value indicating how many bytes it actually retrieved. You should be able to use those values to do the buffering you need.
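The buffering idea could be sketched roughly as follows: append whatever each Read call returns, and only take a message out once the 4-byte length prefix and the whole payload have arrived. FrameBuffer and its members are hypothetical names for this illustration, and the sketch assumes the length prefix uses the machine's byte order (little-endian on typical hardware, which matches BinaryReader.ReadInt32).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class FrameBuffer
{
    private readonly List<byte> _pending = new List<byte>();

    // Appends the first 'count' bytes of 'data', e.g. the result of one Read call.
    public void Append(byte[] data, int count)
    {
        for (int i = 0; i < count; i++)
        {
            _pending.Add(data[i]);
        }
    }

    // Removes one complete [4-byte length][payload] frame if enough bytes are buffered.
    public bool TryTakeFrame(out byte[] payload)
    {
        payload = null;
        if (_pending.Count < 4) return false; // length prefix not complete yet

        int length = BitConverter.ToInt32(_pending.GetRange(0, 4).ToArray(), 0);
        if (length < 0) throw new InvalidDataException("Negative frame length.");
        if (_pending.Count - 4 < length) return false; // payload not complete yet

        payload = _pending.GetRange(4, length).ToArray();
        _pending.RemoveRange(0, 4 + length);
        return true;
    }
}
```

A List<byte> keeps the example short; a high-throughput server would more likely use a ring buffer or a pooled array to avoid the per-byte overhead.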
See Mr. Skeet's page:
Sometimes, you don't know the length of the stream in advance (for instance a network stream) and just want to read the whole lot into a buffer. Here's a method to do just that:
/// <summary>
/// Reads data from a stream until the end is reached. The
/// data is returned as a byte array. An IOException is
/// thrown if any of the underlying IO calls fail.
/// </summary>
/// <param name="stream">The stream to read data from</param>
public static byte[] ReadFully (Stream stream)
{
byte[] buffer = new byte[32768];
using (MemoryStream ms = new MemoryStream())
{
while (true)
{
int read = stream.Read (buffer, 0, buffer.Length);
if (read <= 0)
return ms.ToArray();
ms.Write (buffer, 0, read);
}
}
}
This should give you some ideas. Once you have the byte array, checking the Length will be easy to do.
In your example, it would look something like this:
int bytes_to_read = 4;
byte[] length_bytes = new byte[bytes_to_read];
int bytes_read = stream.Read(length_bytes, 0, length_bytes.Length);
// Check that there's at least four bytes left in the stream
if(bytes_read != bytes_to_read) break;
int bytes_in_msg = BitConverter.ToInt32(length_bytes, 0);
byte[] msg_bytes = new byte[bytes_in_msg];
bytes_read = stream.Read(msg_bytes, 0, msg_bytes.Length);
// Check that there's enough bytes left in the stream
if(bytes_read != bytes_in_msg ) break;
...
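One caveat with the snippet above: on a TCP stream a single Read call may return fewer bytes than requested even though more are on the way, so comparing bytes_read with the requested count can break out of the loop too early. A minimal sketch of a helper that keeps reading until the buffer is full or the stream ends is shown below; the class and method names are made up for illustration.

```csharp
using System;
using System.IO;

public static class ExactReadSketch
{
    // Fills the whole buffer, or returns false if the stream ends first.
    public static bool TryReadExact(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) return false; // end of stream before the buffer was full
            total += read;
        }
        return true;
    }

    // Reads one [4-byte length][payload] message; returns null if the stream is truncated.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] lengthBytes = new byte[4];
        if (!TryReadExact(stream, lengthBytes)) return null;

        int length = BitConverter.ToInt32(lengthBytes, 0);
        if (length < 0) throw new InvalidDataException("Negative message length.");

        byte[] payload = new byte[length];
        return TryReadExact(stream, payload) ? payload : null;
    }
}
```

On a blocking network stream, TryReadExact waits until the remaining bytes arrive or the connection is closed, which is usually what a length-prefixed protocol needs.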