Channels & Memory Management Strategies for Large Objects - c#

I'm trying to determine how to best implement .Net Core 3 Channels and whether it's a good idea to pass very large objects between tasks. In my example, one task that is very fast can read in a 1GB chunk from a very large file. A number of consumer tasks can read a chunk from the channel and process them in parallel, as processing is much slower and needs parallel (multi-threaded) execution.
In testing my code, there is a massive amount of GC happening and total RAM used far exceeds the sum of all data waiting in one bounded channel and all executing tasks. I've simplified my code down to the most basic example hoping someone can give me some tips on how to better allocate/manage memory or if this approach is a good idea?
using System;
using System.IO;
using System.Threading.Channels;
using System.Threading.Tasks;
namespace MergeSort
{
public class Example
{
private Channel<byte[]> _channelProcessing;
public async Task DoSort(int queueDepth, int parallelTaskCount)
{
// Hard-code some values so we can talk about details
queueDepth = 2;
parallelTasks = 8;
_channelProcessing = Channel.CreateBounded<byte[]>(queueDepth);
Task[] processingTasks = new Task[parallelTaskCount];
int outputBufferSize = 1024 * 1024;
for (int x = 0; x < parallelTaskCount; x++)
{
string outputFile = $"C:\\Output.{x:00000000}.txt";
processingTasks[x] = Task.Run(() => ProcessChunkAsync(outputBufferSize));
}
// Task put unsorted chunks on the channel
string inputFile = "C:\\Input.txt";
int chunkSize = 1024 * 1024 * 1024; // 1GiB
Task inputTask = Task.Run(() => ReadInputAsync(inputFile, chunkSize));
// Wait for all tasks building chunk files to complete before continuing
await inputTask;
await Task.WhenAll(processingTasks);
}
private async Task ReadInputAsync(string inputFile, int chunkSize)
{
int bytesRead = 0;
byte[] chunkBuffer = new byte[chunkSize];
using (FileStream fileStream = File.Open(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
{
// Read chunks until input EOF
while (fileStream.Position != fileStream.Length)
{
bytesRead = fileStream.Read(chunkBuffer, 0, chunkBuffer.Length);
// Fake code him to simulate the work I need to do showing outBuffer.Length is calculated at runtime
Random rnd = new Random();
int runtimeCalculatedAmount = rnd.Next(100, 600);
byte[] tempBuffer = new byte[runtimeCalculatedAmount];
// Create the buffer with a slightly variable size that needs to be passed to the channel for next task
byte[] outBuffer = new byte[1024 * 1024 * 1024 + runtimeCalculatedAmount];
Array.Copy(chunkBuffer, outBuffer, bytesRead);
Array.Copy(tempBuffer, 0, outBuffer, bytesRead, outBuffer.Length);
await _channelProcessing.Writer.WriteAsync(outBuffer);
outBuffer = null;
}
}
// Not sure if it's safe to .Complete() before consumers have read all data from channel?
_channelProcessing.Writer.Complete();
}
private async Task ProcessChunkAsync(int outputBufferSize)
{
while (await _channelProcessing.Reader.WaitToReadAsync())
{
if (_channelProcessing.Reader.TryRead(out byte[] inBuffer))
{
// myBigThing is also a very large object (result of processing inBuffer and slightly larger)
MyBigThing myBigThing = new MyBigThing(inBuffer);
inBuffer = null;
// Create file and write all rows
using (FileStream fileStream = File.Create("C:\\Output.txt", outputBufferSize, FileOptions.SequentialScan))
{
// Write myBigThing to output file
fileStream.Write(myBigThing.Data);
}
myBigThing = null;
}
}
}
}
}

Related

How can I improve the performance of this CopyTo method?

EDIT: I have now solved this. My answer posted below and will mark as solved when SO lets me.
I have a CopyTo (and a CopyToAsync) method to copy files in my C# application.
I have found that it is actually quite slow to copy the files, compared to something like Xcopy.
I extracted the core functionality of the copy method and placed it into a test console app to get the speed that it operates at versus Xcopy, and found the results actually quite different.
The results I get are:
Async Method: 36.59 seconds - Average speed: 1512.63 mb/sec
Sync Method: 36.49 seconds - Average speed: 1516.72 mb/sec
XCOPY: 5.62 seconds - Average speed: 9842.11 mb/sec
All three of these used the exact same file, and the exact same destination.
StreamExtensions class:
public static class StreamExtensions
{
const int DEFAULT_BUFFER = 0x1000; // 4096 bits
public static async Task CopyToAsync(this Stream source, Stream destination, IProgress<long> progress, CancellationToken cancellationToken = default, int bufferSize = DEFAULT_BUFFER)
{
var buffer = new byte[bufferSize];
int bytesRead;
long totalRead = 0;
while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken)) > 0)
{
await destination.WriteAsync(buffer, 0, bytesRead, cancellationToken);
cancellationToken.ThrowIfCancellationRequested();
totalRead += bytesRead;
progress.Report(totalRead);
}
}
public static void CopyTo(this Stream source, Stream destination, IProgress<long> progress, int bufferSize = DEFAULT_BUFFER)
{
var buffer = new byte[bufferSize];
int bytesRead;
long totalRead = 0;
while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
{
destination.Write(buffer, 0, bytesRead);
totalRead += bytesRead;
progress.Report(totalRead);
}
}
}
The IProgress<long> object is to report the file progress back to the calling method.
Example call implementation:
// Asynchronous version
public static async Task CopyFileSetAsync(Dictionary<string, string> fileSet)
{
for (var x = 0; x < fileSet.Count; x++)
{
var item = fileSet.ElementAt(x);
var from = item.Key;
var to = item.Value;
int currentProgress = 0;
long fileSize = new FileInfo(from).Length;
IProgress<long> progress = new SynchronousProgress<long>(value =>
{
decimal fileProg = (decimal)(value * 100) / fileSize;
if (fileProg != currentProgress)
{
currentProgress = (int)fileProg;
OnUpdateFileProgress(null, new FileProgressEventArgs(fileProg));
}
});
using (var outStream = new FileStream(to, FileMode.Create, FileAccess.Write, FileShare.Read))
{
using (var inStream = new FileStream(from, FileMode.Open, FileAccess.Read, FileShare.Read))
{
await inStream.CopyToAsync(outStream, progress);
}
}
OnUpdateFileProgress(null, new FileProgressEventArgs(100)); // Probably redundant
}
}
// Synchronous version
public static void CopyFileSet(Dictionary<string, string> fileSet)
{
for (var x = 0; x < fileSet.Count; x++)
{
var item = fileSet.ElementAt(x);
var from = item.Key;
var to = item.Value;
int currentProgress = 0;
long fileSize = new FileInfo(from).Length;
IProgress<long> progress = new SynchronousProgress<long>(value =>
{
decimal fileProg = (decimal)(value * 100) / fileSize;
if (fileProg != currentProgress)
{
currentProgress = (int)fileProg;
OnUpdateFileProgress(null, new FileProgressEventArgs(fileProg));
}
});
using (var outStream = new FileStream(to, FileMode.Create, FileAccess.Write, FileShare.Read))
{
using (var inStream = new FileStream(from, FileMode.Open, FileAccess.Read, FileShare.Read))
{
inStream.CopyTo(outStream, progress, 1024);
}
}
OnUpdateFileProgress(null, new FileProgressEventArgs(100)); // Probably redundant
}
}
Is there something that's preventing this from running as fast as it could? I'm just stumped as to how much slower it is compared to copy.
EDIT: Fixed a typo where I forgot a single ` around IProgress
Thanks to Tom and xanatos, I answered my own question:
I misunderstood the impact of buffer size. I had only gone so far as 8192 bytes as the buffer size. After taking on their suggestions, I increased the buffer size to 1mb (1048576 bytes), and this made a massive difference to the performance.
Async Method: 5.57 seconds - Average speed: 9938.68 mb/sec
Sync Method: 5.52 seconds - Average speed: 10028.36 mb/sec
XCOPY: 5.03 seconds - Average speed: 11007.84 mb/sec

What is different with the writing in FileStream?

When I searched the method about decompress the file by using SharpZipLib, I found lot of methods like this:
public static void TarWriteCharacters(string tarfile, string targetDir)
{
using (TarInputStream s = new TarInputStream(File.OpenRead(tarfile)))
{
//some codes here
using (FileStream fileWrite = File.Create(targetDir + directoryName + fileName))
{
int size = 2048;
byte[] data = new byte[2048];
while (true)
{
size = s.Read(data, 0, data.Length);
if (size > 0)
{
fileWrite.Write(data, 0, size);
}
else
{
break;
}
}
fileWrite.Close();
}
}
}
The format FileStream.Write is:
FileStream.Write(byte[] array, int offset, int count)
Now I try to separate part of read and write because I want to use thread to speed up the decompress rate in write function, and I use dynamic array byte[] and int[] to deposit the file's data and size like below
Read:
public static void TarWriteCharacters(string tarfile, string targetDir)
{
using (TarInputStream s = new TarInputStream(File.OpenRead(tarfile)))
{
//some codes here
using (FileStream fileWrite= File.Create(targetDir + directoryName + fileName))
{
int size = 2048;
List<int> SizeList = new List<int>();
List<byte[]> mydatalist = new List<byte[]>();
while (true)
{
byte[] data = new byte[2048];
size = s.Read(data, 0, data.Length);
if (size > 0)
{
mydatalist.Add(data);
SizeList.Add(size);
}
else
{
break;
}
}
test = new Thread(() =>
FileWriteFun(pathToTar, args, SizeList, mydatalist)
);
test.Start();
streamWriter.Close();
}
}
}
Write:
public static void FileWriteFun(string pathToTar , string[] args, List<int> SizeList, List<byte[]> mydataList)
{
//some codes here
using (FileStream fileWrite= File.Create(targetDir + directoryName + fileName))
{
for (int i = 0; i < mydataList.Count; i++)
{
fileWrite.Write(mydataList[i], 0, SizeList[i]);
}
fileWrite.Close();
}
}
Edit
(1)byte[] data = new byte[2048] into while loop to assign data to new array.
(2)change int[] SizeList = new int[2048] to List<int> SizeList = new List<int>() because of int range
As read on a stream is only guarantied to return one byte (typically it will be more, but you can't rely on the full requested length each time), your solution can theoretically fail after 2048 bytes as your SizeList can only hold 2048 entries.
You could use a List to hold the sizes.
Or use a MemoryStream instead of inventing your own.
But the two main problems are:
1) You keep reading into the same byte array, overwriting previously read data. When you add your data byte array to mydatalist, you must assign data to a new byte array.
2) you close your stream before the second thread is done writing.
In general threading is difficult and should only be used where you know it will improve performance. Simply reading and writing data is typically IO bound in performance, not cpu bound, so introducing a second thread will just give a small performance penalty and no gain in speed. You could use multithreading to ensure concurrent read/write operations, but most likely the disk cache will do this for you if you stick to the first solution - amd if not, using async is easier than multithreaded to achieve this.

Binaryreader read from Filestream which loads in chunks

I'm reading values from a huge file (> 10 GB) using the following code:
FileStream fs = new FileStream(fileName, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
int count = br.ReadInt32();
List<long> numbers = new List<long>(count);
for (int i = count; i > 0; i--)
{
numbers.Add(br.ReadInt64());
}
unfortunately the read-speed from my SSD is stuck at a few MB/s. I guess the limit are the IOPS of the SSD, so it might be better to read in chunks from the file.
Question
Does the FileStream in my code really read only 8 bytes from the file everytime the BinaryReader calls ReadInt64()?
If so, is there a transparent way for the BinaryReader to provide a stream that reads in larger chunks from the file to speed up the procedure?
Test-Code
Here's a minimal example to create a test-file and to measure the read-performance.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
namespace TestWriteRead
{
class Program
{
static void Main(string[] args)
{
System.IO.File.Delete("test");
CreateTestFile("test", 1000000000);
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
IEnumerable<long> test = Read("test");
stopwatch.Stop();
Console.WriteLine("File loaded within " + stopwatch.ElapsedMilliseconds + "ms");
}
private static void CreateTestFile(string filename, int count)
{
FileStream fs = new FileStream(filename, FileMode.CreateNew);
BinaryWriter bw = new BinaryWriter(fs);
bw.Write(count);
for (int i = 0; i < count; i++)
{
long value = i;
bw.Write(value);
}
fs.Close();
}
private static IEnumerable<long> Read(string filename)
{
FileStream fs = new FileStream(filename, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
int count = br.ReadInt32();
List<long> values = new List<long>(count);
for (int i = 0; i < count; i++)
{
long value = br.ReadInt64();
values.Add(value);
}
fs.Close();
return values;
}
}
}
You should configure the stream to use SequentialScan to indicate that you will read the stream from start to finish. It should improve the speed significantly.
Indicates that the file is to be accessed sequentially from beginning
to end. The system can use this as a hint to optimize file caching. If
an application moves the file pointer for random access, optimum
caching may not occur; however, correct operation is still guaranteed.
using (
var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 8192,
FileOptions.SequentialScan))
{
var br = new BinaryReader(fs);
var count = br.ReadInt32();
var numbers = new List<long>();
for (int i = count; i > 0; i--)
{
numbers.Add(br.ReadInt64());
}
}
Try read blocks instead:
using (
var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 8192,
FileOptions.SequentialScan))
{
var br = new BinaryReader(fs);
var numbersLeft = (int)br.ReadInt64();
byte[] buffer = new byte[8192];
var bufferOffset = 0;
var bytesLeftToReceive = sizeof(long) * numbersLeft;
var numbers = new List<long>();
while (true)
{
// Do not read more then possible
var bytesToRead = Math.Min(bytesLeftToReceive, buffer.Length - bufferOffset);
if (bytesToRead == 0)
break;
var bytesRead = fs.Read(buffer, bufferOffset, bytesToRead);
if (bytesRead == 0)
break; //TODO: Continue to read if file is not ready?
//move forward in read counter
bytesLeftToReceive -= bytesRead;
bytesRead += bufferOffset; //include bytes from previous read.
//decide how many complete numbers we got
var numbersToCrunch = bytesRead / sizeof(long);
//crunch them
for (int i = 0; i < numbersToCrunch; i++)
{
numbers.Add(BitConverter.ToInt64(buffer, i * sizeof(long)));
}
// move the last incomplete number to the beginning of the buffer.
var remainder = bytesRead % sizeof(long);
Buffer.BlockCopy(buffer, bytesRead - remainder, buffer, 0, remainder);
bufferOffset = remainder;
}
}
Update in response to a comment:
May I know what's the reason that manual reading is faster than the other one?
I don't know how the BinaryReader is actually implemented. So this is just assumptions.
The actual read from the disk is not the expensive part. The expensive part is to move the reader arm into the correct position on the disk.
As your application isn't the only one reading from a hard drive the disk have to re-position itself every time an application requests a read.
Thus if the BinaryReader just reads the requested int it have to wait on the disk for every read (if some other application make a read in-between).
As I read a much larger buffer directly (which is faster) I can process more integers without having to wait for the disk between reads.
Caching will of course speed things up a bit, and that's why it's "just" three times faster.
(future readers: If something above is incorrect, please correct me).
You can use a BufferedStream to increase the read buffer size.
In theory memory mapped files should help here. You could load it into memory using several very large chunks. Not sure though how much is this relevant when using SSDs.

Process buffer in chunk sizes

I have an IRandomAccessStream to store data on WindowsPhone 8.1 and I want to process the data in a certain BLOCKSIZE.
Here is the code
in the capture function:
await this.mediaCapture.StartRecordToStreamAsync(recordProfile, this.message);
in the stop function, when done I stop:
await this.mediaCapture.StopRecordAsync();
using (var dataReader = new DataReader(message.GetInputStreamAt(0)))
{
dataReader.UnicodeEncoding = Windows.Storage.Streams.UnicodeEncoding.Utf8;
dataReader.ByteOrder = Windows.Storage.Streams.ByteOrder.LittleEndian;
// await dataReader.LoadAsync((uint)message.Size);
await dataReader.LoadAsync(((uint)message.Size + (int)FREQUENCY) * 10);
byte[] buffer = new byte[((int)message.Size + (int)FREQUENCY) * 10];
dataReader.ReadBytes(buffer);
short[] signal = new short[BLOCKSIZE];
int bufferReadResult = 0;
while (bufferReadResult != buffer.Length)
{
for (int index = 0; index < BLOCKSIZE; index += 2, bufferReadResult++)
{
signal[bufferReadResult] = BitConverter.ToInt16(buffer, index);
}
// .....process the signal[] in BLOCKSIZE,
// keep processing BLOCKSIZE until end of buffer
// so process [BLOCKSIZE][BLOCKSIZE]..............buffer end
}
}
The problem is that when bufferReadResult reaches BLOCKSIZE that's it. It won't continue with a new block unit end of buffer.
What is the best way to do this? I have a buffer or IRandomAccessStream and I want to process the all the data in chunks of short[BLOCKSIZE].

How do you implement asynchronous I/O?

I'm writing a program that encrypts files, this means I need to read a lot of data, encrypt that data, and write that data to a new file. In an attempt to increase performance, I've tried to multi-thread the process. I've created two blocking collections, using them as queues I read data into a byte array and pass that array into the first blocking collection, I take that array out of the blocking collection, encrypt it and pass it to the second blocking collection where I take that array out and write it to a file.
Although i'm getting improvements on speed, the process is very unreliable. It can start at 30/mb's, peak at 35mb/s and trail of drastically to almost a standstill after about 800Mb have been processed. I have a strong feeling i'm doing this wrong, and I have spent the best part of three hours this afternoon, and even longer yesterday trying to understand how I could instead using filestream.beginread() asynchronously to improve reliability and speed.
Here is the code:
public void Encrypt()
{
using (BlockingCollection<Byte[]> EncryptionQueue = new BlockingCollection<Byte[]>(32),
WritingQueue = new BlockingCollection<Byte[]>(32))
{
Task readAction = Task.Factory.StartNew(() =>
{
FileStream input = new FileStream(DataFile.FullName, FileMode.Open, FileAccess.Read, FileShare.Read, 1024 * 8);
BinaryReader binaryInput = new BinaryReader(input);
ByteTotal = (int)input.Length;
while (BytesLeft < ByteTotal)
{
EncryptionQueue.Add(binaryInput.ReadBytes(8192));
BytesLeft += 8192;
}
EncryptionQueue.CompleteAdding();
input.Close();
}, TaskCreationOptions.LongRunning);
Task encryptAction = Task.Factory.StartNew(() =>
{
RC4Engine StreamCipher = new RC4Engine(this.keyString);
foreach (Byte[] chunk in EncryptionQueue.GetConsumingEnumerable())
{
int len = chunk.Length;
for (int x = 0; x < len; x++)
{
chunk[x] ^= StreamCipher.OutputByte();
}
WritingQueue.Add(chunk);
}
WritingQueue.CompleteAdding();
}, TaskCreationOptions.LongRunning);
Task writeAction = Task.Factory.StartNew(() =>
{
FileStream output = new FileStream(TempFile.FullName, FileMode.Create, FileAccess.Write, FileShare.Write, 1024 * 8);
foreach (Byte[] chunk in WritingQueue.GetConsumingEnumerable())
{
output.Write(chunk, 0, chunk.Length);
}
output.Close();
}, TaskCreationOptions.LongRunning);
Task timerAction = Task.Factory.StartNew(() =>
{
while (BytesLeft < ByteTotal)
{
Thread.Sleep(10);
progressReport.Report((BytesLeft / (1024 * 1024)).ToString());
}
}, TaskCreationOptions.LongRunning);
Task.WaitAll(readAction, encryptAction, writeAction, timerAction);
// Delete the existing file if it exists
File.Delete(DataFile.FullName);
// Rename the temporary file created during encryption to it's final filename.
File.Move(TempFile.FullName, DataFile.FullName);
}
Progress.report is just a way of updating the ui thread with how many bytes have been encrypted so far.
How would I write this if I were to use Asynchronous I/O?

Categories

Resources