(C#) What is the fastest way to count bytes in a file?

I want to know the fastest way to count all the bytes in a file. I need to work on large binary files.
I want to know how many times each byte value occurs in the file (the count of 0x00, 0x01, ... 0xFF).
It is for adding a graph that represents the file's contents to my WPF hex editor user control, https://github.com/abbaye/WPFHexEditorControl, like the one in the HxD hex editor.
This code works fine, but it is too slow for large files.
    public Dictionary<int, long> GetByteCount()
    {
        if (IsOpen)
        {
            Position = 0;
            int currentByte = 0;
            // Build dictionary
            Dictionary<int, long> cd = new Dictionary<int, long>();
            for (int i = 0; i <= 255; i++) cd.Add(i, 0);
            for (int i = 0; i <= Length; i++)
            {
                //if (EOF) break;
                currentByte = ReadByte();
                if (currentByte != -1) cd[currentByte]++;
            }
            return cd;
        }
        return new Dictionary<int, long>();
    }

    /// <summary>
    /// Gets an array of long values holding the total count of each byte value in the file.
    /// The array index is the byte value, so storedCnt[0xFF] is the number of 0xFF bytes.
    /// </summary>
    public long[] GetByteCount()
    {
        if (IsOpen)
        {
            const int bufferLength = 1048576; // 1 MB
            var storedCnt = new long[256];
            Position = 0;
            while (!Eof)
            {
                var testLength = Length - Position;
                var buffer = testLength <= bufferLength ? new byte[testLength] : new byte[bufferLength];
                Read(buffer, 0, buffer.Length);
                foreach (var b in buffer)
                    storedCnt[b]++;
                Position += bufferLength;
            }
            return storedCnt;
        }
        return null;
    }

I have optimized David's solution a bit. The Position calls are not necessary. I found that the buffer length and the unbuffered read mode matter very little, but using a for loop instead of a foreach loop for the counting made a big difference.
Results with
    foreach (var b in buffer.Take(count))
file length 4110217216
duration 00:00:51.1686821
Results with
    for (var i = 0; i < count; i++)
file length 4110217216
duration 00:00:05.9695418
Here is the program:
    private static void Main()
    {
        const string fileForCheck = @"D:\Data\System\en_visual_studio_enterprise_2015_x86_x64_dvd_6850497.iso";
        var watch = new Stopwatch();
        watch.Start(); // start timing before the file is read
        var counter = new FileBytesCounter(fileForCheck);
        var results = counter.GetByteCount();
        watch.Stop();
        Console.WriteLine(string.Join(", ", results.Select((c, b) => $"{b} -> {c}")));
        var sumBytes = results.Sum(c => c);
        Debug.Assert((new FileInfo(fileForCheck)).Length == sumBytes); // here's the proof
        Console.WriteLine($"file length {sumBytes}");
        Console.WriteLine($"duration {watch.Elapsed}");
    }
And here is the class:
    internal class FileBytesCounter
        : FileStream
    {
        private const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;
        private const int CopyBufferSize = 1024 * 1024;
        //private const int CopyBufferSize = 4 * 1024 * 16;
        public FileBytesCounter(string path, FileShare share = FileShare.Read)
            : base(path, FileMode.Open, FileAccess.Read, share, CopyBufferSize/*, FileFlagNoBuffering*/)
        {
        }
        public long[] GetByteCount()
        {
            var buffer = new byte[CopyBufferSize];
            var storedCnt = new long[256];
            int count;
            Position = 0;
            // Read returns the number of bytes actually read; only that many are counted.
            while ((count = Read(buffer, 0, CopyBufferSize)) > 0)
            {
                for (var i = 0; i < count; i++)
                    storedCnt[buffer[i]]++;
            }
            return storedCnt;
        }
    }
See also https://www.codeproject.com/Articles/172613/Fast-File-Copy-With-Managed-Code-UBCopy-update for FileFlagNoBuffering
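As a self-contained illustration of the buffered for-loop approach described in this answer, here is a minimal sketch that works on any Stream rather than on a FileStream subclass. The names ByteHistogram and Count are hypothetical and not part of the original answer, and the 1 MB default buffer is just the value used in the thread.

```csharp
using System;
using System.IO;

public static class ByteHistogram
{
    // Counts how often each byte value (0x00..0xFF) occurs in the stream.
    // The array index is the byte value; the element is its count.
    public static long[] Count(Stream stream, int bufferSize = 1024 * 1024)
    {
        var counts = new long[256];
        var buffer = new byte[bufferSize];
        int read;
        // Read may return fewer bytes than requested, so only the first
        // 'read' entries of the buffer are valid on each pass.
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (var i = 0; i < read; i++)
            {
                counts[buffer[i]]++;
            }
        }
        return counts;
    }
}
```

It could be called with something like ByteHistogram.Count(File.OpenRead(path)); a plain indexed loop avoids the enumerator overhead that buffer.Take(count) adds per element.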

It seems like you want something like this:
    public Dictionary<char, long> GetCharCount(string filePath)
    {
        var result = new Dictionary<char, long>();
        var content = File.ReadAllText(filePath);
        foreach (var c in content)
        {
            if (result.ContainsKey(c))
                result[c] = result[c] + 1;
            else
                result.Add(c, 1);
        }
        return result;
    }


NAudio SampleProvider not updating buffer on Read as expected

I am trying to use NAudio to read audio and then output it again after it has been processed by a plugin. For the output step I created a custom SampleProvider, but the buffer is not behaving as I expect and I can't hear any sound. The code that reads the audio and attempts to play it again is as follows:
    var audioFile = new AudioFileReader(@"C:\Users\alex.clayton\Downloads\Rhythm guitar.mp3");
    var vstSampleProvider = new VstSampleProvider(44100, 2);
    var devices = DirectSoundOut.Devices.Last();
    var output = new DirectSoundOut(devices.Guid);
    int chunckStep = 0;
    while (chunckStep < audioFile.Length)
    {
        var nAudiobuffer = new float[blockSize * 2];
        audioFile.Read(nAudiobuffer, 0, blockSize * 2);
        var leftSpan = inputMgr.Buffers.ToArray()[0].AsSpan();
        for (int i = 0; i < blockSize; i++)
            leftSpan[i] = nAudiobuffer[i*2] / int.MaxValue;
        var rightSpan = inputMgr.Buffers.ToArray()[0].AsSpan();
        for (int i = 1; i < blockSize; i++)
            rightSpan[i] = nAudiobuffer[i*2 + 1] / int.MaxValue;
        PluginContext.PluginCommandStub.Commands.ProcessReplacing(inputBuffers, outputBuffers);
        chunckStep += blockSize;
    }
The sample provider code is this:
    public class VstSampleProvider : ISampleProvider
    {
        private readonly int _sampleRate;
        private readonly int _channels;
        private readonly Queue<float> _buffer;
        public VstSampleProvider(int sampleRate, int channels)
        {
            _sampleRate = sampleRate;
            _channels = channels;
            _buffer = new Queue<float>();
        }
        public WaveFormat WaveFormat => WaveFormat.CreateIeeeFloatWaveFormat(_sampleRate, _channels);
        public void LoadBuffer(VstAudioBuffer[] outputBuffers)
        {
            var totalSampleCount = outputBuffers[0].SampleCount * _channels;
            try
            {
                if (_channels == 1)
                {
                    for (int i = 0; i < totalSampleCount; i++)
                        _buffer.Enqueue(outputBuffers[0][i]);
                }
                else
                {
                    // Interleave: even positions from channel 0, odd positions from channel 1
                    for (int i = 0; i < totalSampleCount; i++)
                    {
                        if (i % 2 == 0)
                        {
                            var value = outputBuffers[0][i / 2];
                            _buffer.Enqueue(value);
                        }
                        else
                            _buffer.Enqueue(outputBuffers[1][(i - 1) / 2]);
                    }
                }
            }
            catch (Exception ex)
            {
                // Probably should log or something
            }
        }
        public int Read(float[] buffer, int offset, int count)
        {
            if (_buffer.Count < count)
                return 0;
            if (offset > 0)
                throw new NotImplementedException();
            for (int i = 0; i < count; i++)
            {
                var value = _buffer.Dequeue();
                buffer[i] = value;
            }
            if (buffer.Any(f => f > 1))
                return count;
            return count;
        }
    }
When I look at the values being dequeued they are all between -1 and 1, as expected, but when I put a breakpoint on the line after if (buffer.Any(f => f > 1)) I can see that the buffer values are integers larger than 1, or 0, and bear no resemblance to the dequeued values that, I thought, were added to the buffer.
I expect I have not understood something about how the SampleProvider is supposed to work, but looking at the ones already in NAudio I can't see what I'm doing wrong.
Any help would be much appreciated. Thank you.
So it turns out that the main issue was in reading the input file: the division turned the volume right down, so the audio was playing, but very quietly.
    leftSpan[i] = nAudiobuffer[i*2] / int.MaxValue;
There was no need for the / int.MaxValue.
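To make the fix concrete, here is a minimal sketch of splitting an interleaved stereo float buffer into separate left and right arrays without any rescaling, on the assumption (as the answer found) that the samples are already floats in the -1 to 1 range. StereoDeinterleaver and Split are hypothetical names, and plain arrays stand in for the plugin buffers. Unlike the loop in the question, both channels start at index 0 and each is written to its own array.

```csharp
using System;

public static class StereoDeinterleaver
{
    // Splits interleaved stereo samples (L0, R0, L1, R1, ...) into two channel arrays.
    // 'frames' is the number of sample pairs to copy.
    public static void Split(float[] interleaved, int frames, float[] left, float[] right)
    {
        for (var i = 0; i < frames; i++)
        {
            left[i] = interleaved[i * 2];       // even positions hold the left channel
            right[i] = interleaved[i * 2 + 1];  // odd positions hold the right channel
        }
    }
}
```

The values are copied unchanged, so a sample of 0.5f stays 0.5f instead of being divided down to a nearly inaudible level.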

ARC4 encryption not working correctly server side

I have a socket.io client and server that send data to each other, encrypted with ARC4.
I have tried several different scenarios, but decryption keeps failing and I'm not sure why.
The class: ARC4_New
    public class ARC4_New
    {
        private int i;
        private int j;
        private byte[] bytes;
        public const int POOLSIZE = 256;
        public ARC4_New()
        {
            bytes = new byte[POOLSIZE];
        }
        public ARC4_New(byte[] key)
        {
            bytes = new byte[POOLSIZE];
            this.Initialize(key);
        }
        public void Initialize(byte[] key)
        {
            this.i = 0;
            this.j = 0;
            for (i = 0; i < POOLSIZE; ++i)
                this.bytes[i] = (byte)i;
            for (i = 0; i < POOLSIZE; ++i)
            {
                j = (j + bytes[i] + key[i % key.Length]) & (POOLSIZE - 1);
                this.Swap(i, j);
            }
            this.i = 0;
            this.j = 0;
        }
        private void Swap(int a, int b)
        {
            byte t = this.bytes[a];
            this.bytes[a] = this.bytes[b];
            this.bytes[b] = t;
        }
        public byte Next()
        {
            this.i = ++this.i & (POOLSIZE - 1);
            this.j = (this.j + this.bytes[i]) & (POOLSIZE - 1);
            this.Swap(i, j);
            return this.bytes[(this.bytes[i] + this.bytes[j]) & 255];
        }
        public void Encrypt(ref byte[] src)
        {
            for (int k = 0; k < src.Length; k++)
                src[k] ^= this.Next();
        }
        public void Decrypt(ref byte[] src)
        {
            // ARC4 is symmetric: decrypting is the same XOR with the keystream
            this.Encrypt(ref src);
        }
    }
    public System.Numerics.BigInteger RandomInteger(int bitSize)
    {
        var integerData = new byte[bitSize / 8];
        _numberGenerator.NextBytes(integerData);
        integerData[integerData.Length - 1] &= 0x7f; // clear the sign bit so the value is non-negative
        return new System.Numerics.BigInteger(integerData);
    }
My script which generates a key:
    System.Numerics.BigInteger DHPrivate = RandomInteger(256);
    System.Numerics.BigInteger DHPrimal = RandomInteger(256);
    System.Numerics.BigInteger DHGenerated = RandomInteger(256);
    if (DHGenerated > DHPrimal)
    {
        System.Numerics.BigInteger tempG = DHGenerated;
        DHGenerated = DHPrimal;
        DHPrimal = tempG;
    }
Then with those values I generate a public key:
    System.Numerics.BigInteger DHPublic = System.Numerics.BigInteger.ModPow(DHGenerated, DHPrivate, DHPrimal);
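For readers unfamiliar with this step, the ModPow call is the public-key half of a Diffie-Hellman exchange. The following is a minimal sketch with textbook-sized toy numbers, not the values or the protocol from the question. DiffieHellmanSketch and its method names are hypothetical, and a real exchange would use a vetted prime and generator rather than random integers.

```csharp
using System.Numerics;

public static class DiffieHellmanSketch
{
    // Public value: generator^privateKey mod prime.
    public static BigInteger PublicKey(BigInteger generator, BigInteger privateKey, BigInteger prime)
        => BigInteger.ModPow(generator, privateKey, prime);

    // Shared secret: otherPublic^privateKey mod prime.
    // Both sides arrive at the same value without sending their private keys.
    public static BigInteger SharedSecret(BigInteger otherPublic, BigInteger privateKey, BigInteger prime)
        => BigInteger.ModPow(otherPublic, privateKey, prime);
}
```

With prime 23, generator 5, and private keys 4 and 3, the public values are 4 and 10, and both sides derive the shared secret 18.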
Then I encrypt this key:
    string pkey = EncryptY(CalculatePublic, DHPublic);
(Additional code for the encryption is below.)
    protected virtual string EncryptY(Func<System.Numerics.BigInteger, System.Numerics.BigInteger> calculator, System.Numerics.BigInteger value)
    {
        byte[] valueData = Encoding.UTF8.GetBytes(value.ToString());
        valueData = PKCSPad(valueData);
        var paddedInteger = new System.Numerics.BigInteger(valueData);
        System.Numerics.BigInteger calculatedInteger = calculator(paddedInteger);
        byte[] paddedData = calculatedInteger.ToByteArray();
        string encryptedValue = Utils.Converter.BytesToHexString(paddedData).ToLower();
        return encryptedValue.StartsWith("00") ? encryptedValue.Substring(2) : encryptedValue;
    }
    protected virtual byte[] PKCSPad(byte[] data)
    {
        var buffer = new byte[128 - 1];
        int dataStartPos = (buffer.Length - data.Length);
        buffer[0] = (byte)Padding;
        Buffer.BlockCopy(data, 0, buffer, dataStartPos, data.Length);
        int paddingEndPos = (dataStartPos - 1);
        bool isRandom = (Padding == PKCSPadding.RandomByte);
        for (int i = 1; i < paddingEndPos; i++)
        {
            buffer[i] = (byte)(isRandom ?
                _numberGenerator.Next(1, 256) : byte.MaxValue);
        }
        return buffer;
    }
After all that, I send the string pkey to the server.
After decrypting the string, the server gets the public key, which is for example: 127458393
When I connect both my client and server using 127458393:
    BigInteger key = System.Numerics.BigInteger.Parse("127458393");
    client = new ARC4_New(PrimalDing.ToByteArray());
My client sends a string like:
And my server reads it like:
But it fails and produces a random, unreadable string.
What am I doing wrong here?
I managed to fix the issue.
For some reason, my server was, and still is, reversing the bytes I used in the ARC4 client.
So I simply reverse them now as a hotfix:
    System.Numerics.BigInteger temp = System.Numerics.BigInteger.Parse(textBox1.Text);
    client = new ARC4_New(temp.ToByteArray().Reverse().ToArray());
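One plausible explanation for the reversal, offered as an assumption since the server code is not shown, is byte order: BigInteger.ToByteArray() returns the bytes in little-endian order (least significant byte first), while many other big-integer libraries serialize most significant byte first. The sketch below shows both orders for the example key; KeyBytes and its methods are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Numerics;

public static class KeyBytes
{
    // BigInteger.ToByteArray() yields little-endian bytes (least significant first).
    public static byte[] LittleEndian(string decimalKey)
        => BigInteger.Parse(decimalKey).ToByteArray();

    // Reversed copy: most significant byte first.
    public static byte[] BigEndian(string decimalKey)
        => LittleEndian(decimalKey).Reverse().ToArray();
}
```

For 127458393 (0x0798DC59) the little-endian bytes are 59 DC 98 07 and the reversed bytes are 07 98 DC 59. Note that ToByteArray() can append an extra 0x00 sign byte when the top bit of the most significant byte is set, which would also have to match on both sides.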

How to play back an array of samples after reading them from an 'ISampleSource'

I have a piece of code that reads an ISampleSource into a float[][], where the first dimension is the channel and the second is the sample data within that channel. I am going to apply signal processing to this data, but for debugging purposes I might want to manipulate the sample array and then play it back, so that I can "hear" what the code is doing. Is there an easy way to take the data returned by ISampleSource.Read and put it back into a new ISampleSource, so that it can then be converted to an IWaveSource and played using WasapiOut?
Here is the class I have tried to make so far. You pass it the float[][] and essentially all the data needed to build a WaveFormat, but it doesn't actually do anything: it doesn't throw, it doesn't play, it just does nothing. What am I doing wrong?
    private class SampleSource : ISampleSource
    {
        public long Position { get; set; }
        public WaveFormat WaveFormat { get; private set; }
        public bool CanSeek => true;
        public long Length => _data.Length;
        private float[] _data;
        private long readPoint = 0;
        public SampleSource(float[][] samples, int sampleRate, int bits, int channels)
        {
            WaveFormat = new WaveFormat(sampleRate, bits, channels);
            if (samples.Length <= 0) return;
            _data = new float[samples[0].Length * samples.Length];
            int cchannels = samples.Length;
            int sampleLength = samples[0].Length;
            for (var i = 0; i < sampleLength; i += cchannels)
                for (var n = 0; n < cchannels; n++)
                    _data[i + n] = samples[n][i / cchannels];
        }
        public int Read(float[] buffer, int offset, int count)
        {
            if (_data.Length < Position + count)
                count = (int) (_data.Length - Position);
            float[] outFloats = new float[count];
            for (var i = 0; i < count; i++)
                outFloats[i] = _data[i + Position + offset];
            buffer = outFloats;
            Position += count;
            return count;
        }
        public void Dispose() =>_data = null;
    }
Rather than trying to set buffer to a new array (which has no effect outside the method, because the array reference itself is passed by value), I needed to write directly to the elements of the buffer array so that the caller can see them. I don't really like doing it this way, and maybe it fixes an issue I don't see, but clearly that's how the library I'm using does it.
    private class SampleSource : ISampleSource
    {
        public long Position { get; set; }
        public WaveFormat WaveFormat { get; private set; }
        public bool CanSeek => true;
        public long Length => _data.Length;
        private float[] _data;
        public SampleSource(float[][] samples, int sampleRate, int bits, int channels)
        {
            WaveFormat = new WaveFormat(sampleRate, bits, channels);
            if (samples.Length <= 0) return;
            _data = new float[samples[0].Length * samples.Length];
            int cchannels = samples.Length;
            int sampleLength = samples[0].Length;
            // Interleave: frame i of channel n goes to position i * cchannels + n
            for (var i = 0; i < sampleLength; i++)
                for (var n = 0; n < cchannels; n++)
                    _data[i * cchannels + n] = samples[n][i];
        }
        public int Read(float[] buffer, int offset, int count)
        {
            if (_data.Length < Position + count)
                count = (int) (_data.Length - Position);
            // offset applies to the destination buffer, Position to the source data
            for (var i = 0; i < count; i++)
                buffer[offset + i] = _data[Position + i];
            Position += count;
            return count;
        }
        public void Dispose() =>_data = null;
    }
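The difference between reassigning the buffer parameter and writing into it can be seen in isolation. This is a small sketch in plain C# with no audio library involved; BufferSemantics, Reassign, and CopyInto are hypothetical names used only for the illustration.

```csharp
using System;

public static class BufferSemantics
{
    // Rebinds the local parameter only; the caller's array is left untouched.
    public static void Reassign(float[] buffer)
    {
        buffer = new float[] { 1f, 2f, 3f };
    }

    // Writes into the caller's array starting at 'offset', reading from
    // 'source' starting at 'position'. Returns the number of samples copied.
    public static int CopyInto(float[] buffer, int offset, float[] source, int position, int count)
    {
        var available = Math.Max(0, Math.Min(count, source.Length - position));
        Array.Copy(source, position, buffer, offset, available);
        return available;
    }
}
```

Read-style APIs follow the second pattern because the caller owns the buffer and can reuse it across calls, which avoids allocating a new array for every block of audio.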

Reading MNIST Database

I am currently exploring neural networks and machine learning, and I implemented a basic neural network in C#. Now I want to test my backpropagation training algorithm with the MNIST database, but I am having serious trouble reading the files correctly.
Spoiler: the code is currently very badly optimised for performance. My aim for now is to grasp the subject and get a structured view of how things work before I start swapping my data structures for faster ones.
To train the network I want to feed it a custom TrainingSet data structure:
    public class TrainingSet
    {
        public Dictionary<List<double>, List<double>> data = new Dictionary<List<double>, List<double>>();
    }
Keys will be my input data (784 pixels per image, representing the greyscale values in the range 0 to 1). Values will be my output data (10 entries representing the digits 0-9, with all entries at 0 except the expected one at 1).
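The input and output encoding described above can be sketched as two small helpers. This is only an illustration of the contract, with hypothetical names (MnistEncoding, OneHot, ScalePixels); it is not the code from the question.

```csharp
using System;
using System.Collections.Generic;

public static class MnistEncoding
{
    // Builds the expected output: all zeros except a 1 at the label's index.
    public static List<double> OneHot(byte label, int classes = 10)
    {
        var output = new List<double>(new double[classes]);
        output[label] = 1.0;
        return output;
    }

    // Scales raw greyscale bytes (0..255) into the 0..1 range.
    public static List<double> ScalePixels(byte[] pixels)
    {
        var input = new List<double>(pixels.Length);
        foreach (var p in pixels)
        {
            input.Add(p / 255.0);
        }
        return input;
    }
}
```

For a label of 3 the output list has ten entries with a single 1 at index 3, and a 28x28 image yields 784 input values.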
Now I want to read the MNIST database according to this contract. I am currently on my second attempt, which is inspired by this blog post: https://jamesmccaffrey.wordpress.com/2013/11/23/reading-the-mnist-data-set-with-c/ . Sadly it still produces the same nonsense as my first attempt, scattering the pixels in a strange pattern.
My current reading algorithm:
    public static TrainingSet GenerateTrainingSet(FileInfo imagesFile, FileInfo labelsFile)
    {
        MnistImageView imageView = new MnistImageView();
        TrainingSet trainingSet = new TrainingSet();
        List<List<double>> labels = new List<List<double>>();
        List<List<double>> images = new List<List<double>>();
        using (BinaryReader brLabels = new BinaryReader(new FileStream(labelsFile.FullName, FileMode.Open)))
        using (BinaryReader brImages = new BinaryReader(new FileStream(imagesFile.FullName, FileMode.Open)))
        {
            int magic1 = brImages.ReadBigInt32(); //Reading as BigEndian
            int numImages = brImages.ReadBigInt32();
            int numRows = brImages.ReadBigInt32();
            int numCols = brImages.ReadBigInt32();
            int magic2 = brLabels.ReadBigInt32();
            int numLabels = brLabels.ReadBigInt32();
            byte[] pixels = new byte[numRows * numCols];
            // each image
            for (int imageCounter = 0; imageCounter < numImages; imageCounter++)
            {
                List<double> imageInput = new List<double>();
                List<double> expectedOutput = new List<double>();
                for (int i = 0; i < 10; i++) //generate empty expected output
                    expectedOutput.Add(0);
                //read image
                for (int p = 0; p < pixels.Length; p++)
                {
                    byte b = brImages.ReadByte();
                    pixels[p] = b;
                    imageInput.Add(b / 255.0f); //scale in 0 to 1 range
                }
                //read label
                byte lbl = brLabels.ReadByte();
                expectedOutput[lbl] = 1; //modify expected output
                images.Add(imageInput);
                labels.Add(expectedOutput);
                //Debug view showing parsed image.......................
                Bitmap image = new Bitmap(numCols, numRows);
                for (int y = 0; y < numRows; y++)
                    for (int x = 0; x < numCols; x++)
                        image.SetPixel(x, y, Color.FromArgb(255 - pixels[x * y], 255 - pixels[x * y], 255 - pixels[x * y])); //invert colors to have 0,0,0 be white as specified by mnist
            }
        }
        for (int i = 0; i < images.Count; i++)
            trainingSet.data.Add(images[i], labels[i]);
        return trainingSet;
    }
All images produce such a pattern. It is never exactly the same pattern, but the pixels always seem to be "pulled" down towards the bottom-right corner.
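One likely cause, not confirmed anywhere in the thread, is the index used in the debug view: pixels[x * y] is 0 for the entire first row and first column and skips most of the array, whereas a flat row-major image is normally addressed as y * width + x. The sketch below shows that mapping on a plain byte array; PixelGrid and ToGrid are hypothetical names.

```csharp
using System;

public static class PixelGrid
{
    // Converts a flat row-major pixel array into a [row, column] grid.
    // Row y starts at offset y * cols in the flat array.
    public static byte[,] ToGrid(byte[] pixels, int rows, int cols)
    {
        var grid = new byte[rows, cols];
        for (var y = 0; y < rows; y++)
        {
            for (var x = 0; x < cols; x++)
            {
                grid[y, x] = pixels[y * cols + x];
            }
        }
        return grid;
    }
}
```

With this mapping every pixel of the image is visited exactly once, which is not the case for x * y.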
That is how I did it:
    public static class MnistReader
    {
        private const string TrainImages = "mnist/train-images.idx3-ubyte";
        private const string TrainLabels = "mnist/train-labels.idx1-ubyte";
        private const string TestImages = "mnist/t10k-images.idx3-ubyte";
        private const string TestLabels = "mnist/t10k-labels.idx1-ubyte";
        public static IEnumerable<Image> ReadTrainingData()
        {
            foreach (var item in Read(TrainImages, TrainLabels))
                yield return item;
        }
        public static IEnumerable<Image> ReadTestData()
        {
            foreach (var item in Read(TestImages, TestLabels))
                yield return item;
        }
        private static IEnumerable<Image> Read(string imagesPath, string labelsPath)
        {
            BinaryReader labels = new BinaryReader(new FileStream(labelsPath, FileMode.Open));
            BinaryReader images = new BinaryReader(new FileStream(imagesPath, FileMode.Open));
            int magicNumber = images.ReadBigInt32();
            int numberOfImages = images.ReadBigInt32();
            // The IDX header stores the row count (height) before the column count (width)
            int height = images.ReadBigInt32();
            int width = images.ReadBigInt32();
            int magicLabel = labels.ReadBigInt32();
            int numberOfLabels = labels.ReadBigInt32();
            for (int i = 0; i < numberOfImages; i++)
            {
                var bytes = images.ReadBytes(width * height);
                var arr = new byte[height, width];
                // Pixels are stored row by row, so row j starts at offset j * width
                arr.ForEach((j, k) => arr[j, k] = bytes[j * width + k]);
                yield return new Image()
                {
                    Data = arr,
                    Label = labels.ReadByte()
                };
            }
        }
    }
Image class:
    public class Image
    {
        public byte Label { get; set; }
        public byte[,] Data { get; set; }
    }
Some extension methods:
    public static class Extensions
    {
        public static int ReadBigInt32(this BinaryReader br)
        {
            var bytes = br.ReadBytes(sizeof(Int32));
            if (BitConverter.IsLittleEndian) Array.Reverse(bytes);
            return BitConverter.ToInt32(bytes, 0);
        }
        public static void ForEach<T>(this T[,] source, Action<int, int> action)
        {
            for (int w = 0; w < source.GetLength(0); w++)
                for (int h = 0; h < source.GetLength(1); h++)
                    action(w, h);
        }
    }
Usage:
    foreach (var image in MnistReader.ReadTrainingData())
    {
        //use image here
    }
    foreach (var image in MnistReader.ReadTestData())
    {
        //use image here
    }
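As an alternative to reversing the bytes by hand in ReadBigInt32, newer runtimes offer System.Buffers.Binary.BinaryPrimitives. The sketch below assumes .NET Core 2.1 or later (or the System.Memory package) and reads from a byte array instead of a BinaryReader; IdxHeader is a hypothetical name.

```csharp
using System;
using System.Buffers.Binary;

public static class IdxHeader
{
    // Reads a 32-bit big-endian integer at the given offset,
    // regardless of the endianness of the machine running the code.
    public static int ReadInt32BigEndian(byte[] data, int offset)
        => BinaryPrimitives.ReadInt32BigEndian(data.AsSpan(offset, 4));
}
```

The MNIST image files begin with the magic number 2051 (bytes 00 00 08 03), followed by the image count, so checking the first value is a quick way to confirm the byte order is being handled correctly.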
Why not use a NuGet package?
MNIST.IO: just a data reader (disclaimer: my package).
Accord.DataSets: contains classes to download and parse machine learning datasets such as MNIST, News20, and Iris. This package is part of the Accord.NET Framework.

Convert byte/int to List<int> reversed and vice versa

I was wondering how I can convert an int to a List<int> of its bits in reverse order (least significant bit first), padded with zeroes, and vice versa.
I have a byte that represents a List(8), sometimes 2 bytes for a List(16), or 8 bytes for a List(64), so I am looking for a good way to convert to an int list, manipulate it, then convert back again.
E.g. an input of 3 gives a List of 1,1,0,0,0,0,0,0
Or an input of 42 gives a List of 0,1,0,1,0,1,0,0
And vice versa: take a List of 1,1,0,0,0,0,0,0 and return 3, or a List of 0,1,0,1,0,1,0,0 and return 42
What I have done at present is build a couple of functions to handle both scenarios. It all works fine; I'm just wondering if there is a better or more elegant solution that I've completely overlooked.
    private List<int> IntToList(int _Input)
    {
        string _Binary = ReverseString(Convert.ToString(_Input, 2).PadLeft(8, '0'));
        List<int> _List = new List<int>(8);
        for (int i = 0; i < _Binary.Length; i++)
            _List.Add(Convert.ToInt32(_Binary.Substring(i, 1)));
        return _List;
    }
    private int IntsToByte(List<int> _List)
    {
        string _Binary = "";
        for (int i = 7; i > -1; i--)
            _Binary += _List[i];
        return Convert.ToInt32(_Binary, 2);
    }
You can work with bitwise operations. They should be faster than converting through strings.
Warning: be aware of little/big endian byte order.
The following code works:
    private List<int> IntToList(int _Input, int _MaxSize = 8)
    {
        int padding = 1;
        List<int> resultList = new List<int>(_MaxSize);
        while (padding < 1 << _MaxSize)
        {
            resultList.Add((_Input & padding) == padding ? 1 : 0);
            padding = padding << 1;
        }
        return resultList;
    }
    private int IntsToByte(List<int> _List)
    {
        int result = 0, padding = 0;
        foreach (int i in _List)
            result = result | (i << padding++);
        return result;
    }
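Since the question also mentions 16-bit and 64-bit lists, here is a sketch of the same bitwise idea using ulong and an explicit bit count, so that a full 64-bit width does not overflow an int mask. BitList, ToBits, and FromBits are hypothetical names and not part of the answer above.

```csharp
using System;
using System.Collections.Generic;

public static class BitList
{
    // Returns the lowest 'bitCount' bits of 'value', least significant bit first.
    public static List<int> ToBits(ulong value, int bitCount = 8)
    {
        var bits = new List<int>(bitCount);
        for (var i = 0; i < bitCount; i++)
        {
            bits.Add((int)((value >> i) & 1UL));
        }
        return bits;
    }

    // Rebuilds the number from a list whose first element is the least significant bit.
    public static ulong FromBits(IReadOnlyList<int> bits)
    {
        ulong value = 0;
        for (var i = 0; i < bits.Count; i++)
        {
            if (bits[i] != 0)
            {
                value |= 1UL << i;
            }
        }
        return value;
    }
}
```

ToBits(3) gives 1,1,0,0,0,0,0,0 and ToBits(42) gives 0,1,0,1,0,1,0,0, matching the examples in the question, and FromBits reverses both.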
This should work:
    int number = 42;
    char[] reverse = Convert.ToString(number, 2).PadLeft(8, '0').ToCharArray();
    Array.Reverse(reverse); // least significant bit first
Try this:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                List<ulong> results = null;
                List<byte> output = null;
                List<byte> input1 = new List<byte>() { 1, 1, 0, 0, 0, 0, 0, 0 };
                results = ReadList(input1, 1);
                output = WriteList(results,1);
                List<byte> input2 = new List<byte>() { 0, 1, 0, 1, 0, 1, 0, 0 };
                results = ReadList(input2, 1);
                output = WriteList(results,1);
            }
            static List<ulong> ReadList(List<byte> input, int size)
            {
                List<ulong> results = new List<ulong>();
                MemoryStream stream = new MemoryStream(input.ToArray());
                BinaryReader reader = new BinaryReader(stream);
                int count = 0;
                ulong newValue = 0;
                while (reader.PeekChar() != -1)
                {
                    switch (size)
                    {
                        case 1:
                            newValue = ((ulong)Math.Pow(2, size) * newValue) + (ulong)reader.ReadByte();
                            break;
                        case 2:
                            newValue = ((ulong)Math.Pow(2, size) * newValue) + (ulong)reader.ReadInt16();
                            break;
                    }
                    if (++count == size)
                    {
                        results.Add(newValue);
                        newValue = 0;
                        count = 0;
                    }
                }
                return results;
            }
            static List<byte> WriteList(List<ulong> input, int size)
            {
                List<byte> results = new List<byte>();
                foreach (ulong num in input)
                {
                    ulong result = num;
                    for (int count = 0; count < size; count++)
                    {
                        if (result > 0)
                        {
                            byte bit = (byte)(result % Math.Pow(2, size));
                            results.Add(bit);
                            result = (ulong)(result / Math.Pow(2, size));
                        }
                    }
                }
                return results;
            }
        }
    }
Solution from OP.
I have gone with Jean Bob's suggestion of using bitwise operations.
For anyone else's benefit, here is my modified version, which reads and writes in blocks of 8 to and from the list.
    private List<int> IntToList(List<int> _List, int _Input)
    {
        int _Padding = 1;
        while (_Padding < 1 << 8)
        {
            _List.Add((_Input & _Padding) == _Padding ? 1 : 0);
            _Padding = _Padding << 1;
        }
        return _List;
    }
    private int IntsToByte(List<int> _List, int l)
    {
        int _Result = 0, _Padding = 0;
        for (int i = l; i < (l + 8); i++)
            _Result = _Result | (_List[i] << _Padding++);
        return _Result;
    }

