Processing one set of numbers at a time - c#

I am dividing a big text file 'file1' into 64-byte sets using the code below. I want this code to process the first 64 bytes and then feed that data to Main() for further processing. Once processing in Main() is done, I would like the program to come back here to process the next set of 64 bytes, and so on until all the data in 'file1' is processed. How can this be done? Please advise.
public static List<byte> ByteValueCaller()
{
List<byte> numbers = new List<byte>();
GetValue(0, numbers);
return numbers;
}
public static void GetValue(int startingByte, List<byte> numbers)
{
using (FileStream fs = File.Open(@"C:\Users\file1.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (BinaryReader br = new BinaryReader(fs))
{
//determines if the last position to use is inside your stream, or if the last position is the end of the stream.
int bytesToRead = startingByte + 64 > br.BaseStream.Length ? (int)br.BaseStream.Length - startingByte : 64;
//move your stream to the given position
br.BaseStream.Seek(startingByte, SeekOrigin.Begin);
//populates databuffer with the given bytes
byte[] dataBuffer = br.ReadBytes(bytesToRead);
numbers.AddRange(dataBuffer);
//recursive call to the same
if (startingByte + bytesToRead < fs.Length)
GetValue(startingByte + bytesToRead, numbers);
}
}

I'm not sure that a recursive function is a good idea; I would do something like this:
public class Program
{
public static void Main(string[] args)
{
const int chunkSize = 64;
foreach (IEnumerable<byte> bytes in ReadByChunk(chunkSize))
{
Console.WriteLine("==========================================");
Console.WriteLine(Encoding.ASCII.GetString(bytes.ToArray()));
}
}
public static IEnumerable<IEnumerable<byte>> ReadByChunk(int chunkSize)
{
IEnumerable<byte> result;
int startingByte = 0;
// stop before yielding the empty chunk that marks the end of the file
while ((result = ReadBytes(startingByte, chunkSize)).Any())
{
startingByte += chunkSize;
yield return result;
}
}
public static IEnumerable<byte> ReadBytes(int startingByte, int byteToRead)
{
byte[] result;
using (FileStream stream = File.Open(@"<path>", FileMode.Open, FileAccess.Read, FileShare.Read)) // <path> is a placeholder
using (BinaryReader reader = new BinaryReader(stream))
{
int bytesToRead = Math.Max(Math.Min(byteToRead, (int)reader.BaseStream.Length - startingByte), 0);
reader.BaseStream.Seek(startingByte, SeekOrigin.Begin);
result = reader.ReadBytes(bytesToRead);
}
return result;
}
}
Note that with this approach the file is opened and closed every time you read a chunk.
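If reopening the file for every chunk is a concern, one option is to open it once and yield chunks from the same stream. The following is only a sketch: the class name ChunkReader, the method ReadChunks, and the helper FillBuffer are made up for illustration, and the path is passed in as a parameter rather than hard-coded.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the original answer.
public static class ChunkReader
{
    // Yields the file's contents in chunks of at most chunkSize bytes.
    // The file is opened once and stays open until enumeration finishes.
    public static IEnumerable<byte[]> ReadChunks(string path, int chunkSize)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] buffer = new byte[chunkSize];
            int filled;
            while ((filled = FillBuffer(stream, buffer)) > 0)
            {
                // Copy so the caller owns the chunk; the last one may be shorter.
                byte[] chunk = new byte[filled];
                Array.Copy(buffer, chunk, filled);
                yield return chunk;
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream ends.
    private static int FillBuffer(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

Main() can then consume it with foreach (byte[] chunk in ChunkReader.ReadChunks(path, 64)) { ... }. Because the method is an iterator, each chunk is read only when the loop asks for it, so control returns to Main() between chunks.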

Related

How can I improve the performance of this CopyTo method?

EDIT: I have now solved this. My answer is posted below, and I will mark it as solved when SO lets me.
I have a CopyTo (and a CopyToAsync) method to copy files in my C# application.
I have found that it is actually quite slow to copy the files, compared to something like Xcopy.
I extracted the core functionality of the copy method and placed it into a test console app to get the speed that it operates at versus Xcopy, and found the results actually quite different.
The results I get are:
Async Method: 36.59 seconds - Average speed: 1512.63 mb/sec
Sync Method: 36.49 seconds - Average speed: 1516.72 mb/sec
XCOPY: 5.62 seconds - Average speed: 9842.11 mb/sec
All three of these used the exact same file, and the exact same destination.
StreamExtensions class:
public static class StreamExtensions
{
const int DEFAULT_BUFFER = 0x1000; // 4096 bytes
public static async Task CopyToAsync(this Stream source, Stream destination, IProgress<long> progress, CancellationToken cancellationToken = default, int bufferSize = DEFAULT_BUFFER)
{
var buffer = new byte[bufferSize];
int bytesRead;
long totalRead = 0;
while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken)) > 0)
{
await destination.WriteAsync(buffer, 0, bytesRead, cancellationToken);
cancellationToken.ThrowIfCancellationRequested();
totalRead += bytesRead;
progress.Report(totalRead);
}
}
public static void CopyTo(this Stream source, Stream destination, IProgress<long> progress, int bufferSize = DEFAULT_BUFFER)
{
var buffer = new byte[bufferSize];
int bytesRead;
long totalRead = 0;
while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
{
destination.Write(buffer, 0, bytesRead);
totalRead += bytesRead;
progress.Report(totalRead);
}
}
}
The IProgress<long> object is to report the file progress back to the calling method.
Example call implementation:
// Asynchronous version
public static async Task CopyFileSetAsync(Dictionary<string, string> fileSet)
{
for (var x = 0; x < fileSet.Count; x++)
{
var item = fileSet.ElementAt(x);
var from = item.Key;
var to = item.Value;
int currentProgress = 0;
long fileSize = new FileInfo(from).Length;
IProgress<long> progress = new SynchronousProgress<long>(value =>
{
decimal fileProg = (decimal)(value * 100) / fileSize;
if (fileProg != currentProgress)
{
currentProgress = (int)fileProg;
OnUpdateFileProgress(null, new FileProgressEventArgs(fileProg));
}
});
using (var outStream = new FileStream(to, FileMode.Create, FileAccess.Write, FileShare.Read))
{
using (var inStream = new FileStream(from, FileMode.Open, FileAccess.Read, FileShare.Read))
{
await inStream.CopyToAsync(outStream, progress);
}
}
OnUpdateFileProgress(null, new FileProgressEventArgs(100)); // Probably redundant
}
}
// Synchronous version
public static void CopyFileSet(Dictionary<string, string> fileSet)
{
for (var x = 0; x < fileSet.Count; x++)
{
var item = fileSet.ElementAt(x);
var from = item.Key;
var to = item.Value;
int currentProgress = 0;
long fileSize = new FileInfo(from).Length;
IProgress<long> progress = new SynchronousProgress<long>(value =>
{
decimal fileProg = (decimal)(value * 100) / fileSize;
if (fileProg != currentProgress)
{
currentProgress = (int)fileProg;
OnUpdateFileProgress(null, new FileProgressEventArgs(fileProg));
}
});
using (var outStream = new FileStream(to, FileMode.Create, FileAccess.Write, FileShare.Read))
{
using (var inStream = new FileStream(from, FileMode.Open, FileAccess.Read, FileShare.Read))
{
inStream.CopyTo(outStream, progress, 1024);
}
}
OnUpdateFileProgress(null, new FileProgressEventArgs(100)); // Probably redundant
}
}
Is there something that's preventing this from running as fast as it could? I'm just stumped as to how much slower it is compared to copy.
EDIT: Fixed a typo where I forgot a single ` around IProgress
Thanks to Tom and xanatos, I answered my own question:
I misunderstood the impact of buffer size. I had only gone as far as 8192 bytes for the buffer size. After taking on their suggestions, I increased the buffer size to 1 MB (1048576 bytes), and this made a massive difference to the performance.
Async Method: 5.57 seconds - Average speed: 9938.68 mb/sec
Sync Method: 5.52 seconds - Average speed: 10028.36 mb/sec
XCOPY: 5.03 seconds - Average speed: 11007.84 mb/sec
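The change described above comes down to the bufferSize argument. As a rough sketch of the synchronous copy loop with a 1 MB default (the names LargeBufferCopy, Copy and OneMegabyte are invented for this example; how much a larger buffer helps will depend on the disks and file sizes involved):

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class LargeBufferCopy
{
    public const int OneMegabyte = 1024 * 1024; // 1048576 bytes

    // Copies source to destination in bufferSize pieces and returns the
    // number of bytes copied. progress is optional.
    public static long Copy(Stream source, Stream destination,
                            int bufferSize = OneMegabyte,
                            IProgress<long> progress = null)
    {
        var buffer = new byte[bufferSize];
        long totalRead = 0;
        int bytesRead;
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, bytesRead);
            totalRead += bytesRead;
            if (progress != null) progress.Report(totalRead);
        }
        return totalRead;
    }
}
```

With the extension methods from the question, the equivalent call would be inStream.CopyTo(outStream, progress, 1048576); the example call in the question passes 1024, which is smaller than even the 4096-byte default.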

How can I modify a small section of bytes in a MemoryStream that was written to using BinaryWriter?

How do I edit the first four bytes in a memory stream? Imagine "bytes" in the following code is a few hundred bytes long. I need to write a placeholder of, say, 4 bytes of value 0 and then come back and update those bytes with new values.
static MemoryStream stream = new MemoryStream();
static BinaryWriter writer = new BinaryWriter(stream);
writer.Write(bytes);
How about this solution:
static void UpdateNthLong(MemoryStream ms, long idx, long newValue)
{
var currPos = ms.Position;
try
{
var offset = sizeof(long) * idx;
ms.Position = offset;
var bw = new BinaryWriter(ms);
bw.Write(newValue);
}
finally { ms.Position = currPos; }
}
static void ShowByteArray(byte[] array)
{
Console.WriteLine("Size: {0}", array.Length);
for(int i = 0; i < array.Length; i++)
{
Console.WriteLine("{0} => {1}", i, array[i]);
}
}
static void Main(string[] args)
{
using (var ms = new MemoryStream())
{
var bw = new BinaryWriter(ms);
bw.Write(1L); // 0-th
bw.Write(2L); // 1-th
bw.Write(3L); // 2-th
bw.Write(4L); // 3-th
var bytes = ms.ToArray();
Console.WriteLine("Before update:");
ShowByteArray(bytes);
// Update 0-th
UpdateNthLong(ms, 0, 0xFFFFFFFFFFFFFF);
// Update 3-th
UpdateNthLong(ms, 3, 0xBBBBBBBBBBBBBBB);
bytes = ms.ToArray();
Console.WriteLine("After update:");
ShowByteArray(bytes);
}
}
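The answer above works on 8-byte long values, while the question asks about a 4-byte placeholder at the start of the stream. A minimal sketch of that specific case follows; the class and method names are hypothetical, and storing the payload length in the placeholder is just an assumed example of a value that is only known later.

```csharp
using System;
using System.IO;

// Hypothetical example, not taken from the original answer.
public static class PlaceholderPatch
{
    // Writes a 4-byte placeholder, then the payload, then goes back and
    // overwrites the placeholder with the payload length.
    public static byte[] WriteWithLengthPrefix(byte[] payload)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            long placeholderPos = stream.Position;
            writer.Write(0);               // Int32 zero: 4 placeholder bytes
            writer.Write(payload);
            writer.Flush();

            long endPos = stream.Position; // remember where writing stopped
            stream.Position = placeholderPos;
            writer.Write(payload.Length);  // overwrites exactly 4 bytes
            writer.Flush();
            stream.Position = endPos;      // restore so later writes append

            return stream.ToArray();
        }
    }
}
```

BinaryWriter writes an int as 4 bytes in little-endian order, so setting the position back and writing another int replaces the placeholder without changing the length of the stream.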

Reading a file one byte at a time in reverse order

Hi, I am trying to read a file one byte at a time in reverse order. So far I have only managed to read the file from beginning to end and write it to another file.
I need to be able to read the file from the end to the beginning and write it to another file.
This is what I have so far:
string fileName = Console.ReadLine();
using (FileStream file = new FileStream(fileName ,FileMode.Open , FileAccess.Read))
{
//file.Seek(endOfFile, SeekOrigin.End);
int bytes;
using (FileStream newFile = new FileStream("newsFile.txt" , FileMode.Create , FileAccess.Write))
{
while ((bytes = file.ReadByte()) >= 0)
{
Console.WriteLine(bytes.ToString());
newFile.WriteByte((byte)bytes);
}
}
}
I know that I have to use the Seek method on the FileStream to get to the end of the file. I already did that in the commented-out portion of the code, but I do not know how to read the file from there in the while loop.
How can I achieve this?
string fileName = Console.ReadLine();
using (FileStream file = new FileStream(fileName, FileMode.Open, FileAccess.Read))
{
byte[] output = new byte[file.Length]; // reversed file
// read the file backwards using SeekOrigin.Current
//
long offset;
file.Seek(0, SeekOrigin.End);
for (offset = 0; offset < file.Length; offset++)
{
file.Seek(-1, SeekOrigin.Current);
output[offset] = (byte)file.ReadByte();
file.Seek(-1, SeekOrigin.Current);
}
// write entire reversed file array to new file
//
File.WriteAllBytes("newsFile.txt", output);
}
You could do it by reading one byte at a time, or you could read a larger buffer, write it to the output file in reverse, and continue like that until you've reached the beginning of the file. For example:
string inputFilename = "inputFile.txt";
string outputFilename = "outputFile.txt";
using (var ofile = File.Create(outputFilename)) // Create truncates an existing file; OpenWrite would not
{
using (var ifile = File.OpenRead(inputFilename))
{
int bufferSize = 4096;
byte[] buffer = new byte[bufferSize];
long filePos = ifile.Length;
do
{
long newPos = Math.Max(0, filePos - bufferSize);
int bytesToRead = (int)(filePos - newPos);
ifile.Seek(newPos, SeekOrigin.Begin);
int bytesRead = ifile.Read(buffer, 0, bytesToRead);
// write the buffer to the output file, in reverse
for (int i = bytesRead-1; i >= 0; --i)
{
ofile.WriteByte(buffer[i]);
}
filePos = newPos;
} while (filePos > 0);
}
}
An obvious optimization would be to reverse the buffer after you've read it, and then write it in one whole chunk to the output file.
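That optimization could look roughly like the sketch below, which reverses each block in place with Array.Reverse and writes it with a single call. The class name FileReverser and the method ReverseFile are made up for this example.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class FileReverser
{
    // Writes the bytes of inputPath to outputPath in reverse order,
    // working backwards through the input one block at a time.
    public static void ReverseFile(string inputPath, string outputPath, int bufferSize = 4096)
    {
        using (FileStream input = File.OpenRead(inputPath))
        using (FileStream output = File.Create(outputPath))
        {
            byte[] buffer = new byte[bufferSize];
            long filePos = input.Length;
            while (filePos > 0)
            {
                long newPos = Math.Max(0, filePos - bufferSize);
                int bytesToRead = (int)(filePos - newPos);
                input.Seek(newPos, SeekOrigin.Begin);

                // Read may return fewer bytes than requested, so loop.
                int total = 0;
                while (total < bytesToRead)
                {
                    int read = input.Read(buffer, total, bytesToRead - total);
                    if (read == 0) throw new EndOfStreamException();
                    total += read;
                }

                // Reverse only the part of the buffer that was filled,
                // then write it out in one call.
                Array.Reverse(buffer, 0, bytesToRead);
                output.Write(buffer, 0, bytesToRead);
                filePos = newPos;
            }
        }
    }
}
```

Memory use stays at one buffer regardless of file size, so this should also work for files that do not fit in RAM.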
And if you know that the file will fit into memory, it's really easy:
var buffer = File.ReadAllBytes(inputFilename);
// now, reverse the buffer
int i = 0;
int j = buffer.Length-1;
while (i < j)
{
byte b = buffer[i];
buffer[i] = buffer[j];
buffer[j] = b;
++i;
--j;
}
// and write it
File.WriteAllBytes(outputFilename, buffer);
If the file is small (fits in your RAM) then this would work:
public static IEnumerable<byte> Reverse(string inputFilename)
{
var bytes = File.ReadAllBytes(inputFilename);
Array.Reverse(bytes);
foreach (var b in bytes)
{
yield return b;
}
}
Usage:
foreach (var b in Reverse("smallfile.dat"))
{
}
If the file is large (bigger than your RAM) then this would work:
using (var inputFile = File.OpenRead("bigfile.dat"))
using (var inputFileReversed = new ReverseStream(inputFile))
using (var binaryReader = new BinaryReader(inputFileReversed))
{
while (binaryReader.BaseStream.Position != binaryReader.BaseStream.Length)
{
var b = binaryReader.ReadByte();
}
}
It uses the ReverseStream class which can be found here.

Replace sequence of bytes in binary file

What is the best way to replace a sequence of bytes in a binary file with another sequence of the same length? The binary files will be pretty large, about 50 MB, and should not be loaded into memory all at once.
Update: I do not know the location of the bytes that need to be replaced; I need to find them first.
Assuming you're trying to replace a known section of the file.
Open a FileStream with read/write access
Seek to the right place
Overwrite existing data
Sample code coming...
public static void ReplaceData(string filename, int position, byte[] data)
{
using (Stream stream = File.Open(filename, FileMode.Open))
{
stream.Position = position;
stream.Write(data, 0, data.Length);
}
}
If you're effectively trying to do a binary version of a string.Replace (e.g. "always replace bytes { 51, 20, 34 } with { 20, 35, 15 }") then it's rather harder. As a quick description of what you'd do:
Allocate a buffer of at least the size of data you're interested in
Repeatedly read into the buffer, scanning for the data
If you find a match, seek back to the right place (e.g. stream.Position -= buffer.Length - indexWithinBuffer;) and overwrite the data
Sounds simple so far... but the tricky bit is if the data starts near the end of the buffer. You need to remember all potential matches and how far you've matched so far, so that if you get a match when you read the next buffer's-worth, you can detect it.
There are probably ways of avoiding this trickiness, but I wouldn't like to try to come up with them offhand :)
EDIT: Okay, I've got an idea which might help...
Keep a buffer which is at least twice as big as you need
Repeatedly:
Copy the second half of the buffer into the first half
Fill the second half of the buffer from the file
Search throughout the whole buffer for the data you're looking for
That way at some point, if the data is present, it will be completely within the buffer.
You'd need to be careful about where the stream was in order to get back to the right place, but I think this should work. It would be trickier if you were trying to find all matches, but at least the first match should be reasonably simple...
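The two-half buffer idea above can be sketched as follows. This is an illustration under assumptions, not the answerer's code: StreamSearch, IndexOf and ReplaceFirst are invented names, the search inside the buffer is a plain nested loop, and only the first match is handled, as the answer suggests. The first pass fills the whole buffer; after that the second half is copied into the first half and the second half is refilled.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of any library.
public static class StreamSearch
{
    // Returns the offset (relative to the stream position at the time of
    // the call) of the first occurrence of pattern, or -1 if not found.
    public static long IndexOf(Stream stream, byte[] pattern, int halfSize = 4096)
    {
        if (pattern.Length == 0)
            throw new ArgumentException("Pattern must not be empty.", nameof(pattern));

        // Each half must be able to hold the whole pattern.
        int half = Math.Max(halfSize, pattern.Length);
        byte[] buffer = new byte[2 * half];
        int count = 0;        // number of valid bytes at the start of the buffer
        long baseOffset = 0;  // offset of buffer[0] relative to the search start

        while (true)
        {
            // Fill the free part of the buffer; Read may return short counts.
            int read;
            while (count < buffer.Length &&
                   (read = stream.Read(buffer, count, buffer.Length - count)) > 0)
            {
                count += read;
            }

            // Search the whole valid region.
            for (int i = 0; i + pattern.Length <= count; i++)
            {
                int j = 0;
                while (j < pattern.Length && buffer[i + j] == pattern[j]) j++;
                if (j == pattern.Length) return baseOffset + i;
            }

            if (count < buffer.Length) return -1; // end of stream reached

            // Copy the second half into the first half and continue.
            Array.Copy(buffer, half, buffer, 0, half);
            baseOffset += half;
            count = half;
        }
    }

    // Overwrites the first occurrence of search with replacement (same length).
    public static bool ReplaceFirst(Stream stream, byte[] search, byte[] replacement)
    {
        if (search.Length != replacement.Length)
            throw new ArgumentException("Lengths must match.");
        stream.Position = 0;
        long index = IndexOf(stream, search);
        if (index < 0) return false;
        stream.Position = index;
        stream.Write(replacement, 0, replacement.Length);
        return true;
    }
}
```

For a file, ReplaceFirst would be called on a FileStream opened with FileAccess.ReadWrite. Because each half is at least as long as the pattern, a match that straddles the end of one read is still fully inside the buffer after the next read.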
My solution:
/// <summary>
/// Copy data from one file to another, replacing the search term, ignoring case.
/// </summary>
/// <param name="originalFile"></param>
/// <param name="outputFile"></param>
/// <param name="searchTerm"></param>
/// <param name="replaceTerm"></param>
private static void ReplaceTextInBinaryFile(string originalFile, string outputFile, string searchTerm, string replaceTerm)
{
byte b;
//UpperCase bytes to search
byte[] searchBytes = Encoding.UTF8.GetBytes(searchTerm.ToUpper());
//LowerCase bytes to search
byte[] searchBytesLower = Encoding.UTF8.GetBytes(searchTerm.ToLower());
//Temporary bytes during found loop
byte[] bytesToAdd = new byte[searchBytes.Length];
//Search length
int searchBytesLength = searchBytes.Length;
//First Upper char
byte searchByte0 = searchBytes[0];
//First Lower char
byte searchByte0Lower = searchBytesLower[0];
//Replace with bytes
byte[] replaceBytes = Encoding.UTF8.GetBytes(replaceTerm);
int counter = 0;
using (FileStream inputStream = File.OpenRead(originalFile)) {
//input length
long srcLength = inputStream.Length;
using (BinaryReader inputReader = new BinaryReader(inputStream)) {
using (FileStream outputStream = File.OpenWrite(outputFile)) {
using (BinaryWriter outputWriter = new BinaryWriter(outputStream)) {
for (int nSrc = 0; nSrc < srcLength; ++nSrc)
//first byte
if ((b = inputReader.ReadByte()) == searchByte0
|| b == searchByte0Lower) {
bytesToAdd[0] = b;
int nSearch = 1;
//next bytes
for (; nSearch < searchBytesLength; ++nSearch)
//get byte, save it and test
if ((b = bytesToAdd[nSearch] = inputReader.ReadByte()) != searchBytes[nSearch]
&& b != searchBytesLower[nSearch]) {
break;//fail
}
//Avoid overflow. No need, in my case, because no chance to see searchTerm at the end.
//else if (nSrc + nSearch >= srcLength)
// break;
if (nSearch == searchBytesLength) {
//success
++counter;
outputWriter.Write(replaceBytes);
nSrc += nSearch - 1;
}
else {
//failed, add saved bytes
outputWriter.Write(bytesToAdd, 0, nSearch + 1);
nSrc += nSearch;
}
}
else
outputWriter.Write(b);
}
}
}
}
Console.WriteLine("ReplaceTextInBinaryFile.counter = " + counter);
}
You can use my BinaryUtility to search and replace one or more bytes without loading the entire file into memory like this:
var searchAndReplace = new List<Tuple<byte[], byte[]>>()
{
Tuple.Create(
BitConverter.GetBytes((UInt32)0xDEADBEEF),
BitConverter.GetBytes((UInt32)0x01234567)),
Tuple.Create(
BitConverter.GetBytes((UInt32)0xAABBCCDD),
BitConverter.GetBytes((UInt16)0xAFFE)),
};
using(var reader =
new BinaryReader(new FileStream(@"C:\temp\data.bin", FileMode.Open)))
{
using(var writer =
new BinaryWriter(new FileStream(@"C:\temp\result.bin", FileMode.Create)))
{
BinaryUtility.Replace(reader, writer, searchAndReplace);
}
}
BinaryUtility code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
public static class BinaryUtility
{
public static IEnumerable<byte> GetByteStream(BinaryReader reader)
{
const int bufferSize = 1024;
byte[] buffer;
do
{
buffer = reader.ReadBytes(bufferSize);
foreach (var d in buffer) { yield return d; }
} while (bufferSize == buffer.Length);
}
public static void Replace(BinaryReader reader, BinaryWriter writer, IEnumerable<Tuple<byte[], byte[]>> searchAndReplace)
{
foreach (byte d in Replace(GetByteStream(reader), searchAndReplace)) { writer.Write(d); }
}
public static IEnumerable<byte> Replace(IEnumerable<byte> source, IEnumerable<Tuple<byte[], byte[]>> searchAndReplace)
{
foreach (var s in searchAndReplace)
{
source = Replace(source, s.Item1, s.Item2);
}
return source;
}
public static IEnumerable<byte> Replace(IEnumerable<byte> input, IEnumerable<byte> from, IEnumerable<byte> to)
{
var fromEnumerator = from.GetEnumerator();
fromEnumerator.MoveNext();
int match = 0;
foreach (var data in input)
{
if (data == fromEnumerator.Current)
{
match++;
if (fromEnumerator.MoveNext()) { continue; }
foreach (byte d in to) { yield return d; }
match = 0;
fromEnumerator.Reset();
fromEnumerator.MoveNext();
continue;
}
if (0 != match)
{
foreach (byte d in from.Take(match)) { yield return d; }
match = 0;
fromEnumerator.Reset();
fromEnumerator.MoveNext();
}
yield return data;
}
if (0 != match)
{
foreach (byte d in from.Take(match)) { yield return d; }
}
}
}
public static void BinaryReplace(string sourceFile, byte[] sourceSeq, string targetFile, byte[] targetSeq)
{
FileStream sourceStream = File.OpenRead(sourceFile);
FileStream targetStream = File.Create(targetFile);
try
{
int b;
long foundSeqOffset = -1;
int searchByteCursor = 0;
while ((b=sourceStream.ReadByte()) != -1)
{
if (sourceSeq[searchByteCursor] == b)
{
if (searchByteCursor == sourceSeq.Length - 1)
{
targetStream.Write(targetSeq, 0, targetSeq.Length);
searchByteCursor = 0;
foundSeqOffset = -1;
}
else
{
if (searchByteCursor == 0)
{
foundSeqOffset = sourceStream.Position - 1;
}
++searchByteCursor;
}
}
else
{
if (searchByteCursor == 0)
{
targetStream.WriteByte((byte) b);
}
else
{
targetStream.WriteByte(sourceSeq[0]);
sourceStream.Position = foundSeqOffset + 1;
searchByteCursor = 0;
foundSeqOffset = -1;
}
}
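}
// a partial match still pending at end of file was never written; emit it unchanged
if (searchByteCursor > 0) { targetStream.Write(sourceSeq, 0, searchByteCursor); }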
}
finally
{
sourceStream.Dispose();
targetStream.Dispose();
}
}

zlib from C++ to C# (How to convert byte[] to stream and stream to byte[])

My task is to decompress a received packet using zlib and then use an algorithm to build a picture from the data.
The good news is that I have the code in C++, but the task is to do it in C#.
C++
//Read the first values of the packet received
DWORD image[200 * 64] = {0}; //used for the algorithm (width is always 200 and height is always 64)
int imgIndex = 0; //used for the algorithm
unsigned char rawbytes_[131072] = {0}; //read below
unsigned char * rawbytes = rawbytes_; //destination parameter for decompression (ptr)
compressed = r.Read<WORD>(); //the length of the compressed bytes(picture)
uncompressed = r.Read<WORD>(); //the length that should be after decompression
width = r.Read<WORD>(); //the width of the picture
height = r.Read<WORD>(); //the height of the picture
LPBYTE ptr = r.GetCurrentStream(); //the bytes(file that must be decompressed)
outLen = uncompressed; //copy the len into another variable
//Decompress
if(uncompress((Bytef*)rawbytes, &outLen, ptr, compressed) != Z_OK)
{
printf("Could not uncompress the image code.\n");
Disconnect();
return;
}
//Algorithm to make up the picture
// Loop through the data
for(int c = 0; c < (int)height; ++c)
{
for(int r = 0; r < (int)width; ++r)
{
imgIndex = (height - 1 - c) * width + r;
image[imgIndex] = 0xFF000000;
if(-((1 << (0xFF & (r & 0x80000007))) & rawbytes[((c * width + r) >> 3)]))
image[imgIndex] = 0xFFFFFFFF;
}
}
I'm trying to do this with zlib.NET, but all the demos use the following code to decompress (C#):
private void decompressFile(string inFile, string outFile)
{
System.IO.FileStream outFileStream = new System.IO.FileStream(outFile, System.IO.FileMode.Create);
zlib.ZOutputStream outZStream = new zlib.ZOutputStream(outFileStream);
System.IO.FileStream inFileStream = new System.IO.FileStream(inFile, System.IO.FileMode.Open);
try
{
CopyStream(inFileStream, outZStream);
}
finally
{
outZStream.Close();
outFileStream.Close();
inFileStream.Close();
}
}
public static void CopyStream(System.IO.Stream input, System.IO.Stream output)
{
byte[] buffer = new byte[2000];
int len;
while ((len = input.Read(buffer, 0, 2000)) > 0)
{
output.Write(buffer, 0, len);
}
output.Flush();
}
My problem: I don't want to save the file after decompression, because I have to use the algorithm shown in the C++ code.
How do I convert the byte[] array into a stream similar to the one in the C# zlib code to decompress the data, and then how do I convert the stream back into a byte array?
Also, how do I change the zlib.NET code so that it does NOT save files?
Just use MemoryStreams instead of FileStreams:
// Assuming inputData is a byte[]
MemoryStream input = new MemoryStream(inputData);
MemoryStream output = new MemoryStream();
Then you can use output.ToArray() afterwards to get a byte array out.
Note that it's generally better to use using statements instead of a single try/finally block - as otherwise if the first call to Close fails, the rest won't be made. You can nest them like this:
using (MemoryStream output = new MemoryStream())
using (Stream outZStream = new zlib.ZOutputStream(output))
using (Stream input = new MemoryStream(bytes))
{
CopyStream(input, outZStream);
return output.ToArray();
}
I just ran into this same issue.
For Completeness... (since this stumped me for several hours)
In the case of ZLib.Net you also have to call finish(), which usually happens during Close(), before you call return output.ToArray()
Otherwise you will get an empty/incomplete byte array from your memory stream, because the ZStream hasn't actually written all of the data yet:
public static void CompressData(byte[] inData, out byte[] outData)
{
using (MemoryStream outMemoryStream = new MemoryStream())
using (ZOutputStream outZStream = new ZOutputStream(outMemoryStream, zlibConst.Z_DEFAULT_COMPRESSION))
using (Stream inMemoryStream = new MemoryStream(inData))
{
CopyStream(inMemoryStream, outZStream);
outZStream.finish();
outData = outMemoryStream.ToArray();
}
}
public static void DecompressData(byte[] inData, out byte[] outData)
{
using (MemoryStream outMemoryStream = new MemoryStream())
using (ZOutputStream outZStream = new ZOutputStream(outMemoryStream))
using (Stream inMemoryStream = new MemoryStream(inData))
{
CopyStream(inMemoryStream, outZStream);
outZStream.finish();
outData = outMemoryStream.ToArray();
}
}
In this example I'm also using the zlib namespace:
using zlib;
Originally found in this thread:
ZLib decompression
I don't have enough points to vote up yet, so...
Thanks to Tim Greaves for the tip regarding finish before ToArray
And Jon Skeet for the tip regarding nesting the using statements for streams (which I like much better than try/finally)
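For the second half of the original task, once the decompressed byte[] is available, the C++ pixel loop from the question might be ported to C# along these lines. This is an unverified sketch: BitmapDecoder and Decode are hypothetical names, and it assumes that the C++ test reduces to "bit (r & 7) of byte ((c * width + r) >> 3) is set", which holds for non-negative r.

```csharp
using System;

// Hypothetical port of the C++ loop from the question.
public static class BitmapDecoder
{
    // Expands a 1-bit-per-pixel buffer into 32-bit ARGB pixels,
    // flipping the image vertically as the C++ code does.
    public static uint[] Decode(byte[] rawBytes, int width, int height)
    {
        var image = new uint[width * height];
        for (int c = 0; c < height; ++c)
        {
            for (int r = 0; r < width; ++r)
            {
                int imgIndex = (height - 1 - c) * width + r;
                int byteIndex = (c * width + r) >> 3;

                // Same bit selection as the original: bit (r & 7) of the byte.
                bool bitSet = (rawBytes[byteIndex] & (1 << (r & 7))) != 0;

                // Opaque white where the bit is set, opaque black otherwise.
                image[imgIndex] = bitSet ? 0xFFFFFFFFu : 0xFF000000u;
            }
        }
        return image;
    }
}
```

The output of DecompressData above would be passed as rawBytes, with width and height taken from the packet header.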
