How can I improve the performance of this CopyTo method? - c#

EDIT: I have now solved this. My answer is posted below and I will mark it as solved when SO lets me.
I have a CopyTo (and a CopyToAsync) method to copy files in my C# application.
I have found that it is actually quite slow to copy the files, compared to something like Xcopy.
I extracted the core functionality of the copy method into a test console app to measure its speed against Xcopy, and found the results quite different.
The results I get are:
Async Method: 36.59 seconds - Average speed: 1512.63 MB/sec
Sync Method: 36.49 seconds - Average speed: 1516.72 MB/sec
XCOPY: 5.62 seconds - Average speed: 9842.11 MB/sec
All three of these used the exact same file, and the exact same destination.
StreamExtensions class:
public static class StreamExtensions
{
const int DEFAULT_BUFFER = 0x1000; // 4096 bytes
public static async Task CopyToAsync(this Stream source, Stream destination, IProgress<long> progress, CancellationToken cancellationToken = default, int bufferSize = DEFAULT_BUFFER)
{
var buffer = new byte[bufferSize];
int bytesRead;
long totalRead = 0;
while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken)) > 0)
{
await destination.WriteAsync(buffer, 0, bytesRead, cancellationToken);
cancellationToken.ThrowIfCancellationRequested();
totalRead += bytesRead;
progress.Report(totalRead);
}
}
public static void CopyTo(this Stream source, Stream destination, IProgress<long> progress, int bufferSize = DEFAULT_BUFFER)
{
var buffer = new byte[bufferSize];
int bytesRead;
long totalRead = 0;
while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
{
destination.Write(buffer, 0, bytesRead);
totalRead += bytesRead;
progress.Report(totalRead);
}
}
}
The IProgress<long> object is to report the file progress back to the calling method.
Example call implementation:
// Asynchronous version
public static async Task CopyFileSetAsync(Dictionary<string, string> fileSet)
{
for (var x = 0; x < fileSet.Count; x++)
{
var item = fileSet.ElementAt(x);
var from = item.Key;
var to = item.Value;
int currentProgress = 0;
long fileSize = new FileInfo(from).Length;
IProgress<long> progress = new SynchronousProgress<long>(value =>
{
decimal fileProg = (decimal)(value * 100) / fileSize;
if (fileProg != currentProgress)
{
currentProgress = (int)fileProg;
OnUpdateFileProgress(null, new FileProgressEventArgs(fileProg));
}
});
using (var outStream = new FileStream(to, FileMode.Create, FileAccess.Write, FileShare.Read))
{
using (var inStream = new FileStream(from, FileMode.Open, FileAccess.Read, FileShare.Read))
{
await inStream.CopyToAsync(outStream, progress);
}
}
OnUpdateFileProgress(null, new FileProgressEventArgs(100)); // Probably redundant
}
}
// Synchronous version
public static void CopyFileSet(Dictionary<string, string> fileSet)
{
for (var x = 0; x < fileSet.Count; x++)
{
var item = fileSet.ElementAt(x);
var from = item.Key;
var to = item.Value;
int currentProgress = 0;
long fileSize = new FileInfo(from).Length;
IProgress<long> progress = new SynchronousProgress<long>(value =>
{
decimal fileProg = (decimal)(value * 100) / fileSize;
if (fileProg != currentProgress)
{
currentProgress = (int)fileProg;
OnUpdateFileProgress(null, new FileProgressEventArgs(fileProg));
}
});
using (var outStream = new FileStream(to, FileMode.Create, FileAccess.Write, FileShare.Read))
{
using (var inStream = new FileStream(from, FileMode.Open, FileAccess.Read, FileShare.Read))
{
inStream.CopyTo(outStream, progress, 1024);
}
}
OnUpdateFileProgress(null, new FileProgressEventArgs(100)); // Probably redundant
}
}
Is there something that's preventing this from running as fast as it could? I'm just stumped as to how much slower it is compared to Xcopy.
EDIT: Fixed a typo where I forgot a single ` around IProgress

Thanks to Tom and xanatos, I answered my own question:
I misunderstood the impact of buffer size. I had only gone as far as 8192 bytes for the buffer size. After taking on their suggestions, I increased the buffer size to 1 MB (1,048,576 bytes), and this made a massive difference to the performance.
Async Method: 5.57 seconds - Average speed: 9938.68 MB/sec
Sync Method: 5.52 seconds - Average speed: 10028.36 MB/sec
XCOPY: 5.03 seconds - Average speed: 11007.84 MB/sec
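For reference, the only change needed at the call sites above is the buffer size argument. A minimal sketch of a copy using a 1 MiB buffer (the SequentialScan hint is an addition of mine, not part of the original code):

```csharp
using System.IO;

public static class BufferSizeDemo
{
    const int OneMiB = 1024 * 1024; // 1 MiB instead of the 4 KiB default

    public static void Copy(string from, string to)
    {
        // SequentialScan hints the OS cache that we read start-to-end.
        using (var inStream = new FileStream(from, FileMode.Open, FileAccess.Read,
            FileShare.Read, OneMiB, FileOptions.SequentialScan))
        using (var outStream = new FileStream(to, FileMode.Create, FileAccess.Write,
            FileShare.Read, OneMiB))
        {
            inStream.CopyTo(outStream, OneMiB); // built-in overload that takes a buffer size
        }
    }
}
```

In my experience the returns diminish past a few megabytes, and the sweet spot depends on the drive, so measure rather than assume.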

Related

Channels & Memory Management Strategies for Large Objects

I'm trying to determine how to best implement .Net Core 3 Channels and whether it's a good idea to pass very large objects between tasks. In my example, one task that is very fast can read in a 1GB chunk from a very large file. A number of consumer tasks can read a chunk from the channel and process them in parallel, as processing is much slower and needs parallel (multi-threaded) execution.
In testing my code, there is a massive amount of GC happening, and the total RAM used far exceeds the sum of all data waiting in the bounded channel and all executing tasks. I've simplified my code down to the most basic example, hoping someone can give me some tips on how to better allocate/manage memory, or whether this approach is even a good idea.
using System;
using System.IO;
using System.Threading.Channels;
using System.Threading.Tasks;
namespace MergeSort
{
public class Example
{
private Channel<byte[]> _channelProcessing;
public async Task DoSort(int queueDepth, int parallelTaskCount)
{
// Hard-code some values so we can talk about details
queueDepth = 2;
parallelTaskCount = 8;
_channelProcessing = Channel.CreateBounded<byte[]>(queueDepth);
Task[] processingTasks = new Task[parallelTaskCount];
int outputBufferSize = 1024 * 1024;
for (int x = 0; x < parallelTaskCount; x++)
{
string outputFile = $"C:\\Output.{x:00000000}.txt";
processingTasks[x] = Task.Run(() => ProcessChunkAsync(outputBufferSize));
}
// Task put unsorted chunks on the channel
string inputFile = "C:\\Input.txt";
int chunkSize = 1024 * 1024 * 1024; // 1GiB
Task inputTask = Task.Run(() => ReadInputAsync(inputFile, chunkSize));
// Wait for all tasks building chunk files to complete before continuing
await inputTask;
await Task.WhenAll(processingTasks);
}
private async Task ReadInputAsync(string inputFile, int chunkSize)
{
int bytesRead = 0;
byte[] chunkBuffer = new byte[chunkSize];
using (FileStream fileStream = File.Open(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
{
// Read chunks until input EOF
while (fileStream.Position != fileStream.Length)
{
bytesRead = fileStream.Read(chunkBuffer, 0, chunkBuffer.Length);
// Fake code here to simulate the work I need to do, showing that outBuffer.Length is calculated at runtime
Random rnd = new Random();
int runtimeCalculatedAmount = rnd.Next(100, 600);
byte[] tempBuffer = new byte[runtimeCalculatedAmount];
// Create the buffer with a slightly variable size that needs to be passed to the channel for next task
byte[] outBuffer = new byte[1024 * 1024 * 1024 + runtimeCalculatedAmount];
Array.Copy(chunkBuffer, outBuffer, bytesRead);
Array.Copy(tempBuffer, 0, outBuffer, bytesRead, tempBuffer.Length);
await _channelProcessing.Writer.WriteAsync(outBuffer);
outBuffer = null;
}
}
// Not sure if it's safe to .Complete() before consumers have read all data from channel?
_channelProcessing.Writer.Complete();
}
private async Task ProcessChunkAsync(int outputBufferSize)
{
while (await _channelProcessing.Reader.WaitToReadAsync())
{
if (_channelProcessing.Reader.TryRead(out byte[] inBuffer))
{
// myBigThing is also a very large object (result of processing inBuffer and slightly larger)
MyBigThing myBigThing = new MyBigThing(inBuffer);
inBuffer = null;
// Create file and write all rows
using (FileStream fileStream = File.Create("C:\\Output.txt", outputBufferSize, FileOptions.SequentialScan))
{
// Write myBigThing to output file
fileStream.Write(myBigThing.Data);
}
myBigThing = null;
}
}
}
}
}
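One common way to cut the GC pressure described above is to rent the chunk buffers from ArrayPool&lt;byte&gt; instead of allocating a fresh giant array per chunk. This is a rough sketch, not the poster's code (the names are mine); the channel payload becomes a tuple so the consumer knows how many bytes are valid and can hand the buffer back to the pool:

```csharp
using System;
using System.Buffers;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class PooledChannelSketch
{
    // Payload carries the rented buffer plus the number of valid bytes in it,
    // because ArrayPool may hand back an array larger than requested.
    private static readonly Channel<(byte[] Buffer, int Length)> _channel =
        Channel.CreateBounded<(byte[] Buffer, int Length)>(2);

    public static async Task ProduceAsync(int chunkSize, int chunkCount)
    {
        for (int i = 0; i < chunkCount; i++)
        {
            byte[] buffer = ArrayPool<byte>.Shared.Rent(chunkSize);
            // ... fill buffer[0..chunkSize) from the input file here ...
            await _channel.Writer.WriteAsync((buffer, chunkSize));
        }
        // Completing the writer before consumers finish is safe:
        // readers still drain the items already written.
        _channel.Writer.Complete();
    }

    public static async Task<long> ConsumeAsync()
    {
        long total = 0;
        while (await _channel.Reader.WaitToReadAsync())
        {
            while (_channel.Reader.TryRead(out var item))
            {
                total += item.Length; // ... process item.Buffer[0..item.Length) here ...
                ArrayPool<byte>.Shared.Return(item.Buffer); // reuse instead of leaving it to the GC
            }
        }
        return total;
    }
}
```

The pool keeps the large arrays alive for reuse, so the steady-state working set stays near queueDepth plus in-flight buffers, rather than one fresh large-object-heap allocation per chunk.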

Processing one set of numbers at a time

I am dividing a big text file 'file1' using the code below into 64-byte sets. I want this code to process the first 64 bytes and then feed that data to Main() for further processing. Once processing in Main() is done, I would like the program to come back here to process the next set of 64 bytes, and so on till all the data in 'file1' is processed. How can this be done? Please advise.
public static List<byte> ByteValueCaller()
{
List<byte> numbers = new List<byte>();
GetValue(0, numbers);
return numbers;
}
public static void GetValue(int startingByte, List<byte> numbers)
{
using (FileStream fs = File.Open(@"C:\Users\file1.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (BinaryReader br = new BinaryReader(fs))
{
//determines if the last position to use is inside your stream, or if the last position is the end of the stream.
int bytesToRead = startingByte + 64 > br.BaseStream.Length ? (int)br.BaseStream.Length - startingByte : 64;
//move your stream to the given position
br.BaseStream.Seek(startingByte, SeekOrigin.Begin);
//populates databuffer with the given bytes
byte[] dataBuffer = br.ReadBytes(bytesToRead);
numbers.AddRange(dataBuffer);
//recursive call to the same
if (startingByte + bytesToRead < fs.Length)
GetValue(startingByte + bytesToRead, numbers);
}
}
I'm not sure a recursive function is a good idea; I would do something like this:
public class Program
{
public static void Main(string[] args)
{
const int chunkSize = 64;
foreach (IEnumerable<byte> bytes in ReadByChunk(chunkSize))
{
Console.WriteLine("==========================================");
Console.WriteLine(Encoding.ASCII.GetString(bytes.ToArray()));
}
}
public static IEnumerable<IEnumerable<byte>> ReadByChunk(int chunkSize)
{
IEnumerable<byte> result;
int startingByte = 0;
do
{
result = ReadBytes(startingByte, chunkSize);
startingByte += chunkSize;
yield return result;
} while (result.Any());
}
public static IEnumerable<byte> ReadBytes(int startingByte, int byteToRead)
{
byte[] result;
using (FileStream stream = File.Open(@"<path>", FileMode.Open, FileAccess.Read, FileShare.Read))
using (BinaryReader reader = new BinaryReader(stream))
{
int bytesToRead = Math.Max(Math.Min(byteToRead, (int)reader.BaseStream.Length - startingByte), 0);
reader.BaseStream.Seek(startingByte, SeekOrigin.Begin);
result = reader.ReadBytes(bytesToRead);
}
return result;
}
}
Note that done this way, the file will be opened and closed every time you read a chunk.
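If the repeated open/close matters, the stream can be kept open across the whole enumeration with an iterator. A rough variant (the helper name is mine):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkReader
{
    // Opens the file once and lazily yields successive chunks until EOF.
    public static IEnumerable<byte[]> ReadByChunk(string path, int chunkSize)
    {
        using (var stream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            while (true)
            {
                byte[] buffer = new byte[chunkSize];
                int read = stream.Read(buffer, 0, chunkSize);
                if (read == 0)
                    yield break; // EOF
                if (read < chunkSize)
                    Array.Resize(ref buffer, read); // trim the final partial chunk
                yield return buffer;
            }
        }
    }
}
```

One caveat: Stream.Read may legally return fewer bytes than requested even before EOF; for a local FileStream that is rare, but a stricter version would loop until the buffer is full.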

How to read a binary file quickly in c#? (ReadOnlySpan vs MemoryStream)

I'm trying to parse a binary file as fast as possible. So this is what I first tried to do:
using (FileStream filestream = path.OpenRead())
{
    using (var d = new GZipStream(filestream, CompressionMode.Decompress))
    {
        using (MemoryStream m = new MemoryStream())
        {
            d.CopyTo(m);
            m.Position = 0;
            using (BinaryReaderBigEndian b = new BinaryReaderBigEndian(m))
            {
                while (b.BaseStream.Position != b.BaseStream.Length)
                {
                    UInt32 value = b.ReadUInt32();
                }
            }
        }
    }
}
Where the BinaryReaderBigEndian class is implemented as follows:
public class BinaryReaderBigEndian : BinaryReader
{
    public BinaryReaderBigEndian(Stream stream) : base(stream) { }

    public override UInt32 ReadUInt32()
    {
        var x = base.ReadBytes(4);
        Array.Reverse(x);
        return BitConverter.ToUInt32(x, 0);
    }
}
Then, I tried to get a performance improvement using ReadOnlySpan instead of MemoryStream. So, I tried doing:
using (FileStream filestream = path.OpenRead())
{
    using (var d = new GZipStream(filestream, CompressionMode.Decompress))
    {
        using (MemoryStream m = new MemoryStream())
        {
            d.CopyTo(m);
            int position = 0;
            ReadOnlySpan<byte> stream = new ReadOnlySpan<byte>(m.ToArray());
            while (position != stream.Length)
            {
                UInt32 value = stream.ReadUInt32(position);
                position += 4;
            }
        }
    }
}
Where the BinaryReaderBigEndian class changed to:
public static class BinaryReaderBigEndian
{
    public static UInt32 ReadUInt32(this ReadOnlySpan<byte> stream, int start)
    {
        var data = stream.Slice(start, 4).ToArray();
        Array.Reverse(data);
        return BitConverter.ToUInt32(data, 0);
    }
}
But, unfortunately, I didn't notice any improvement. So, where am I going wrong?
I did some measurements of your code on my computer (Intel Q9400, 8 GiB RAM, SSD disk, Win10 x64 Home, .NET Framework 4.7.2, tested with a 15 MB (when unpacked) file) with these results:
No-Span version: 520 ms
Span version: 720 ms
So the Span version is actually slower! Why? Because new ReadOnlySpan<byte>(m.ToArray()) performs an additional copy of the whole file, and ReadUInt32() performs many slicings of the Span (slicing is cheap, but not free). Since you performed more work, you can't expect performance to be any better just because you used Span.
So can we do better? Yes. It turns out that the slowest part of your code is actually the garbage collection caused by the 4-byte arrays repeatedly allocated by the .ToArray() calls in the ReadUInt32() method. You can avoid it by implementing ReadUInt32() yourself. It's pretty easy and also eliminates the need for Span slicing. You can also replace new ReadOnlySpan<byte>(m.ToArray()) with new ReadOnlySpan<byte>(m.GetBuffer()).Slice(0, (int)m.Length), which performs cheap slicing instead of a copy of the whole file. So now the code looks like this:
public static void Read(FileInfo path)
{
using (FileStream filestream = path.OpenRead())
{
using (var d = new GZipStream(filestream, CompressionMode.Decompress))
{
using (MemoryStream m = new MemoryStream())
{
d.CopyTo(m);
int position = 0;
ReadOnlySpan<byte> stream = new ReadOnlySpan<byte>(m.GetBuffer()).Slice(0, (int)m.Length);
while (position != stream.Length)
{
UInt32 value = stream.ReadUInt32(position);
position += 4;
}
}
}
}
}
public static class BinaryReaderBigEndian
{
public static UInt32 ReadUInt32(this ReadOnlySpan<byte> stream, int start)
{
UInt32 res = 0;
for (int i = 0; i < 4; i++)
{
res = (res << 8) | (((UInt32)stream[start + i]) & 0xff);
}
return res;
}
}
With these changes I get from 720 ms down to 165 ms (4x faster). Sounds great, doesn't it? But we can do even better: we can completely avoid the MemoryStream copy and inline and further optimize ReadUInt32():
public static void Read(FileInfo path)
{
using (FileStream filestream = path.OpenRead())
{
using (var d = new GZipStream(filestream, CompressionMode.Decompress))
{
var buffer = new byte[64 * 1024];
do
{
int bufferDataLength = FillBuffer(d, buffer);
if (bufferDataLength % 4 != 0)
throw new Exception("Stream length not divisible by 4");
if (bufferDataLength == 0)
break;
for (int i = 0; i < bufferDataLength; i += 4)
{
uint value = unchecked(
(((uint)buffer[i]) << 24)
| (((uint)buffer[i + 1]) << 16)
| (((uint)buffer[i + 2]) << 8)
| (((uint)buffer[i + 3]) << 0));
}
} while (true);
}
}
}
private static int FillBuffer(Stream stream, byte[] buffer)
{
int read = 0;
int totalRead = 0;
do
{
read = stream.Read(buffer, totalRead, buffer.Length - totalRead);
totalRead += read;
} while (read > 0 && totalRead < buffer.Length);
return totalRead;
}
And now it takes less than 90 ms (8x faster than the original!). And without Span! Span is great in situations where it lets you slice and avoid an array copy, but it won't improve performance just because you blindly use it. After all, Span is designed to have performance characteristics on par with Array, but not better (and only on runtimes that have special support for it, such as .NET Core 2.1).
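For completeness: on .NET Core 2.1+ the hand-rolled byte shuffling can be replaced with BinaryPrimitives, which decodes big-endian values from a span without per-value allocation. A minimal sketch (the helper name is mine):

```csharp
using System;
using System.Buffers.Binary;

public static class BigEndianSketch
{
    // Reads consecutive big-endian UInt32 values from a span with no per-value allocation.
    public static uint[] ReadAll(ReadOnlySpan<byte> data)
    {
        var result = new uint[data.Length / 4];
        for (int i = 0; i < result.Length; i++)
            result[i] = BinaryPrimitives.ReadUInt32BigEndian(data.Slice(i * 4, 4));
        return result;
    }
}
```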

Progressbar in Windows Phone 8.1

I am trying to make my progress bar work in Windows Phone 8.1.
Here is my code :
using (var writeStream = await newZipFile.OpenAsync(FileAccessMode.ReadWrite))
{
using (var outputStream = writeStream.GetOutputStreamAt(0))
{
using (var dataWriter = new DataWriter(outputStream))
{
using (Stream input = webResponse.GetResponseStream())
{
long totalSize = 0;
int read;
uint zeroUint = Convert.ToUInt32(0);
uint readUint;
while ((read = input.Read(buffer, 0, buffer.Length)) >0)
{
// totalSize += read;
totalSize = totalSize + read;
//pb2.Value = totalSize * 100 / sizeFit;
await dispatcher.RunAsync(CoreDispatcherPriority.Normal, () =>
{
//Declaration of variables
// load.progresschanged(totalSize * 100 / sizeFit);
pb2.Value = totalSize * 100 / sizeFit;
});
readUint = Convert.ToUInt32(read);
IBuffer ibuffer = buffer.AsBuffer();
dataWriter.WriteBuffer(ibuffer, zeroUint, readUint);
}
await dataWriter.StoreAsync();
await outputStream.FlushAsync();
dataWriter.DetachStream();
}
}
}
}
The problem here is that the progress bar updates only when the value is 100.
Any help appreciated.
1) Make sure the max value of the progress bar is 100 in the view.
2) pb2.Value = sizeFit * 100 / totalSize; - you might be calculating the percentage the wrong way around. Since I can't find where sizeFit is set in your code, why not place a breakpoint and check what value is actually being assigned.
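An alternative to dispatching manually on every read is to report through an IProgress&lt;int&gt; created on the UI thread (e.g. new Progress&lt;int&gt;(p =&gt; pb2.Value = p), which marshals the callback for you), and to report only when the integer percentage actually changes. A rough, self-contained sketch (the names are mine; the synchronous progress class mirrors the SynchronousProgress&lt;T&gt; used in the first question):

```csharp
using System;
using System.IO;

// Synchronous IProgress<T>: Report runs the callback inline (handy for tests;
// on the phone you would use Progress<int> created on the UI thread instead).
public sealed class SyncProgress<T> : IProgress<T>
{
    private readonly Action<T> _action;
    public SyncProgress(Action<T> action) { _action = action; }
    public void Report(T value) { _action(value); }
}

public static class ProgressDemo
{
    // Streams input to output, reporting each distinct integer percentage once.
    public static void CopyWithProgress(Stream input, Stream output, long totalLength, IProgress<int> progress)
    {
        var buffer = new byte[4096];
        long totalRead = 0;
        int lastPercent = -1;
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
            totalRead += read;
            int percent = (int)(totalRead * 100 / totalLength);
            if (percent != lastPercent) // avoid flooding the UI with duplicate values
            {
                lastPercent = percent;
                progress.Report(percent);
            }
        }
    }
}
```

Reporting only on change also keeps the dispatcher queue from filling with hundreds of identical updates, which can itself make the bar appear frozen until the end.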

Binaryreader read from Filestream which loads in chunks

I'm reading values from a huge file (> 10 GB) using the following code:
FileStream fs = new FileStream(fileName, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
int count = br.ReadInt32();
List<long> numbers = new List<long>(count);
for (int i = count; i > 0; i--)
{
numbers.Add(br.ReadInt64());
}
Unfortunately the read speed from my SSD is stuck at a few MB/s. I guess the limit is the IOPS of the SSD, so it might be better to read in chunks from the file.
Question
Does the FileStream in my code really read only 8 bytes from the file everytime the BinaryReader calls ReadInt64()?
If so, is there a transparent way for the BinaryReader to provide a stream that reads in larger chunks from the file to speed up the procedure?
Test-Code
Here's a minimal example to create a test-file and to measure the read-performance.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
namespace TestWriteRead
{
class Program
{
static void Main(string[] args)
{
System.IO.File.Delete("test");
CreateTestFile("test", 1000000000);
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
IEnumerable<long> test = Read("test");
stopwatch.Stop();
Console.WriteLine("File loaded within " + stopwatch.ElapsedMilliseconds + "ms");
}
private static void CreateTestFile(string filename, int count)
{
FileStream fs = new FileStream(filename, FileMode.CreateNew);
BinaryWriter bw = new BinaryWriter(fs);
bw.Write(count);
for (int i = 0; i < count; i++)
{
long value = i;
bw.Write(value);
}
fs.Close();
}
private static IEnumerable<long> Read(string filename)
{
FileStream fs = new FileStream(filename, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
int count = br.ReadInt32();
List<long> values = new List<long>(count);
for (int i = 0; i < count; i++)
{
long value = br.ReadInt64();
values.Add(value);
}
fs.Close();
return values;
}
}
}
You should configure the stream to use SequentialScan to indicate that you will read the stream from start to finish. It should improve the speed significantly.
Indicates that the file is to be accessed sequentially from beginning
to end. The system can use this as a hint to optimize file caching. If
an application moves the file pointer for random access, optimum
caching may not occur; however, correct operation is still guaranteed.
using (
var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 8192,
FileOptions.SequentialScan))
{
var br = new BinaryReader(fs);
var count = br.ReadInt32();
var numbers = new List<long>();
for (int i = count; i > 0; i--)
{
numbers.Add(br.ReadInt64());
}
}
Try reading blocks instead:
using (
var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 8192,
FileOptions.SequentialScan))
{
var br = new BinaryReader(fs);
var numbersLeft = br.ReadInt32(); // the count was written as an Int32
byte[] buffer = new byte[8192];
var bufferOffset = 0;
var bytesLeftToReceive = sizeof(long) * numbersLeft;
var numbers = new List<long>();
while (true)
{
// Do not read more than possible
var bytesToRead = Math.Min(bytesLeftToReceive, buffer.Length - bufferOffset);
if (bytesToRead == 0)
break;
var bytesRead = fs.Read(buffer, bufferOffset, bytesToRead);
if (bytesRead == 0)
break; //TODO: Continue to read if file is not ready?
//move forward in read counter
bytesLeftToReceive -= bytesRead;
bytesRead += bufferOffset; //include bytes from previous read.
//decide how many complete numbers we got
var numbersToCrunch = bytesRead / sizeof(long);
//crunch them
for (int i = 0; i < numbersToCrunch; i++)
{
numbers.Add(BitConverter.ToInt64(buffer, i * sizeof(long)));
}
// move the last incomplete number to the beginning of the buffer.
var remainder = bytesRead % sizeof(long);
Buffer.BlockCopy(buffer, bytesRead - remainder, buffer, 0, remainder);
bufferOffset = remainder;
}
}
Update in response to a comment:
May I know what's the reason that manual reading is faster than the other one?
I don't know how the BinaryReader is actually implemented. So this is just assumptions.
The actual read from the disk is not the expensive part. The expensive part is moving the read head into the correct position on the disk.
As your application isn't the only one reading from the hard drive, the disk has to re-position itself every time an application requests a read.
Thus if the BinaryReader just reads the requested int, it has to wait on the disk for every read (if some other application makes a read in-between).
As I read a much larger buffer directly (which is faster) I can process more integers without having to wait for the disk between reads.
Caching will of course speed things up a bit, and that's why it's "just" three times faster.
(future readers: If something above is incorrect, please correct me).
You can use a BufferedStream to increase the read buffer size.
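The BufferedStream suggestion might look like the sketch below; the BinaryReader then mostly reads from the in-memory buffer instead of hitting the device for every 8 bytes. The 1 MiB size is a guess to tune, and the file layout assumed is the one from the question: an Int32 count followed by Int64 values.

```csharp
using System.IO;

public static class BufferedReadSketch
{
    // Sums the Int64 values in a file laid out as: Int32 count, then count Int64s.
    public static long SumAll(string path)
    {
        long sum = 0;
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var bs = new BufferedStream(fs, 1024 * 1024)) // 1 MiB read buffer
        using (var br = new BinaryReader(bs))
        {
            int count = br.ReadInt32();
            for (int i = 0; i < count; i++)
                sum += br.ReadInt64(); // usually served from the in-memory buffer
        }
        return sum;
    }
}
```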
In theory memory-mapped files should help here. You could load it into memory using several very large chunks. Not sure though how relevant this is when using SSDs.
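A memory-mapped variant of the same read might look like the following sketch (same assumed file layout: an Int32 count followed by Int64 values); whether it beats buffered sequential reads on an SSD is something to measure rather than assume:

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;

public static class MmfSketch
{
    // Sums the Int64 values in a file laid out as: Int32 count, then count Int64s.
    public static long SumAll(string path)
    {
        long sum = 0;
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (var accessor = mmf.CreateViewAccessor())
        {
            int count = accessor.ReadInt32(0);
            for (int i = 0; i < count; i++)
                sum += accessor.ReadInt64(4 + i * 8L); // values start after the 4-byte count
        }
        return sum;
    }
}
```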
