I need to read bytes from a stream (the data was converted from a string to a byte array and then sent to the stream) and stop reading as soon as I encounter a specific sequence, in my case [13, 10, 13, 10], or "\r\n\r\n" if converted to an ASCII string.
Currently I have two versions of the same process:
1) Read from the stream one byte at a time and check after EVERY byte whether the last 4 bytes read equal [13, 10, 13, 10]. (Note that I can't read and check 4 bytes at a time: if the incoming sequence is, say, 7 bytes long, the first read would take 4 bytes and the next one would get stuck, because only 3 of the 4 requested bytes are available.)
NetworkStream streamBrowser = tcpclientBrowser.GetStream();
byte[] data;
using (MemoryStream ms = new MemoryStream())
{
    byte[] check = new byte[4] { 13, 10, 13, 10 };
    byte[] buff = new byte[1];
    do
    {
        if (streamBrowser.Read(buff, 0, 1) == 0) break; // end of stream
        ms.Write(buff, 0, 1);
        data = ms.ToArray();
    } while (!data.Skip(data.Length - 4).SequenceEqual(check));
}
2) Use StreamReader.ReadLine to read until "\r\n", then read again to see if the returned line is null, and add "\r\n" back to each returned string; that way I'll get a string that ends with "\r\n\r\n".
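Roughly like this (a sketch of what I mean, not my exact code):

// Reconstruction of approach 2. Caveat: StreamReader buffers ahead,
// so it may consume bytes from the NetworkStream beyond the "\r\n\r\n".
var reader = new StreamReader(streamBrowser, Encoding.ASCII);
var sb = new StringBuilder();
string line;
while (!string.IsNullOrEmpty(line = reader.ReadLine()))
{
    sb.Append(line).Append("\r\n");
}
sb.Append("\r\n"); // the empty/null line read above marks the terminator
string result = sb.ToString(); // ends with "\r\n\r\n"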
My question is: which method is preferable in terms of performance (if either; it may be that both are too slow and there is a better way, which I would really like to know about)?
I am drunk, but maybe this will help:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace ConsoleApp5
{
    class Program
    {
        static void Main(string[] args)
        {
            var ms = new MemoryStream(new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 });
            var sequence = new byte[] { 6, 7, 8 };
            var buffer = new byte[1];
            var queue = new Queue<byte>();
            int position = 0;
            while (true)
            {
                int count = ms.Read(buffer, 0, 1);
                if (count == 0) return; // end of stream, sequence not found
                queue.Enqueue(buffer[0]);
                position++;
                if (IsSequenceFound(queue, sequence))
                {
                    Console.WriteLine("Found sequence at position: " + (position - queue.Count));
                    return;
                }
                // keep the queue one element shorter than the sequence
                if (queue.Count == sequence.Length) queue.Dequeue();
            }
        }

        static bool IsSequenceFound(Queue<byte> queue, byte[] sequence)
        {
            // a plain for (int i ...) loop can be faster than SequenceEqual
            return queue.SequenceEqual(sequence);
        }
    }
}
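If performance is the concern: both of your versions do more work than necessary (version 1 copies the whole MemoryStream with ToArray() on every byte). A buffered sketch that reads a chunk at a time and scans only the new bytes, assuming it is acceptable that bytes arriving after the delimiter end up in the returned buffer:

static byte[] ReadUntilDelimiter(Stream stream, byte[] delimiter)
{
    var ms = new MemoryStream();
    var buff = new byte[4096];
    int searchStart = 0;
    int read;
    while ((read = stream.Read(buff, 0, buff.Length)) > 0)
    {
        ms.Write(buff, 0, read);
        byte[] data = ms.GetBuffer(); // internal buffer, avoids ToArray() copies
        int last = (int)ms.Length - delimiter.Length;
        for (int i = searchStart; i <= last; i++)
        {
            int j = 0;
            while (j < delimiter.Length && data[i + j] == delimiter[j]) j++;
            if (j == delimiter.Length)
                return ms.ToArray(); // includes the delimiter and any over-read
        }
        // no match yet; the next pass only needs to scan the new positions
        searchStart = Math.Max(0, (int)ms.Length - delimiter.Length + 1);
    }
    throw new EndOfStreamException("Delimiter not found");
}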
For the last few weeks I've been working with a TCP protocol to send packets from an Arduino to Unity, using this code:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;
using UnityEngine;

public class TCPConnection : MonoBehaviour
{
    public string IP_Seat = "192.168.137.161";
    public int port = 34197;

    #region private members
    private TcpClient socketConnection;
    private Thread clientReceiveThread;
    public float a, b, c, vel;
    public float test = 0.0f;
    #endregion

    // Use this for initialization
    void Awake()
    {
        ConnectToTcpServer();
    }

    /// <summary>
    /// Setup socket connection.
    /// </summary>
    private void ConnectToTcpServer()
    {
        try
        {
            clientReceiveThread = new Thread(new ThreadStart(ListenForData));
            clientReceiveThread.IsBackground = true;
            clientReceiveThread.Start();
        }
        catch (Exception e)
        {
            Debug.Log("On client connect exception " + e);
        }
    }

    /// <summary>
    /// Runs in background clientReceiveThread; listens for incoming data.
    /// </summary>
    private void ListenForData()
    {
        var aold = 0.0f;
        var bold = 0.0f;
        var cold = 0.0f;
        var velold = 0.0f;
        try
        {
            socketConnection = new TcpClient(IP_Seat, port);
            //socketConnection.ConnectAsync(IP_Seat, port); // does not connect
            //socketConnection.Client.Blocking = false;
            //socketConnection.Client.ConnectAsync(IP_Seat, port);
            Byte[] bytes = new Byte[16];
            while (socketConnection.Connected)
            {
                // Get a stream object for reading
                using (NetworkStream stream = socketConnection.GetStream())
                {
                    int length;
                    // Read incoming stream into byte array.
                    while ((length = stream.Read(bytes, 0, bytes.Length)) != 0)
                    {
                        //Debug.Log("I'm receiving Data");
                        if (length == 16)
                        {
                            //Debug.Log("I'm receiving len 16 and I like it");
                            var incomingData = new byte[length];
                            var A = new Byte[4];
                            var B = new Byte[4];
                            var C = new Byte[4];
                            var VEL = new Byte[4];
                            Array.Copy(bytes, 0, incomingData, 0, length);
                            // Convert byte array to string message.
                            string serverMessage = Encoding.ASCII.GetString(incomingData);
                            Array.Copy(bytes, 0, A, 0, 4);
                            Array.Copy(bytes, 4, B, 0, 4);
                            Array.Copy(bytes, 8, C, 0, 4);
                            Array.Copy(bytes, 12, VEL, 0, 4);
                            a = BitConverter.ToSingle(A, 0) < 0 ? BitConverter.ToSingle(A, 0) : aold;
                            b = BitConverter.ToSingle(B, 0) < 0 ? BitConverter.ToSingle(B, 0) : bold;
                            c = BitConverter.ToSingle(C, 0) < 0 ? BitConverter.ToSingle(C, 0) : cold;
                            vel = BitConverter.ToSingle(VEL, 0); //< 0 ? BitConverter.ToSingle(C, 0) : 0;
                            //Debug.Log("server message received as: " + serverMessage + a + " " + b + " " + c + " " + vel);
                            aold = a;
                            bold = b;
                            cold = c;
                            velold = vel;
                        }
                        else
                        {
                            // keep the scale from waiting for a TCP ack
                        }
                    }
                }
            }
        }
        catch (SocketException socketException)
        {
            Debug.Log("Socket exception: " + socketException);
        }
    }
}
Right now I'm having blocking issues: I'd like to use the async TCP methods, but tcpClient.ConnectAsync returns a SocketException and I can't figure out why.
The Arduino sends 4 floats in a 16-byte packet and 98-99% of them arrive correctly, but the missing 1-2% causes the system to block, which is undesirable behaviour (since I'm programming a device, I can't afford a delay waiting for an ack or a packet).
How can I make this socket async?
EDIT:
Using .NET 4.x: how can I use the ConnectAsync(ip, port) method in this script?
As already said, TCP is just an undifferentiated stream of data; you need to implement a protocol on top of it to know when a message ends (is fully received) and when a new one starts.
Now in your case you seem to already know that each message is exactly 16 bytes.
NetworkStream.Read, however, does not necessarily wait until 16 bytes have actually been received. If for some reason (a delay in the network) fewer bytes are available in the stream at the moment it is called, it will return only the amount that is available.
Now let's say your sender sends 3 packets of 16 bytes each (48 bytes in total).
It might happen that the first read returns only 8 bytes, and every following read call then receives 16 bytes.
=> Result: you get two complete buffers, but with invalid data, since you always started reading in the middle of a message.
Note the second parameter of Read:
int offset -> the location in buffer at which to begin storing the data.
What you want to do is wait until the buffer is actually full, like:
var receivedBytes = 0;
while (receivedBytes < bytes.Length)
{
    int read = stream.Read(bytes, receivedBytes, bytes.Length - receivedBytes);
    if (read == 0) break; // connection closed
    receivedBytes += read;
}
// Use bytes
This will now fill exactly one 16-byte buffer before continuing, since at the same time as we increase the offset we also decrease the maximum number of bytes to read.
And another note: you have a huge amount of redundant array creation, copying and BitConverter work going on!
I would rather use
var bytes = new byte[16];
var floats = new float[4];
And then later on after receiving the bytes do
Buffer.BlockCopy(bytes, 0, floats, 0, 16);
This copies the bytes over into floats directly at the underlying byte level.
And then you can simply do
a = floats[0] < 0 ? floats[0] : aold;
b = floats[1] < 0 ? floats[1] : bold;
c = floats[2] < 0 ? floats[2] : cold;
vel = floats[3];
Note: typed on the phone, but I hope the idea is clear.
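Putting both parts together, the inner receive loop might look roughly like this (a sketch under the question's assumption of 16-byte messages holding four floats, reusing the stream, aold/bold/cold variables from above; not tested against the actual device):

var bytes = new byte[16];
var floats = new float[4];
while (socketConnection.Connected)
{
    // block until one complete 16-byte message has been received
    var receivedBytes = 0;
    while (receivedBytes < bytes.Length)
    {
        int read = stream.Read(bytes, receivedBytes, bytes.Length - receivedBytes);
        if (read == 0) return; // connection closed
        receivedBytes += read;
    }

    // reinterpret the 16 bytes as 4 floats without per-field copies
    Buffer.BlockCopy(bytes, 0, floats, 0, 16);
    a = floats[0] < 0 ? floats[0] : aold;
    b = floats[1] < 0 ? floats[1] : bold;
    c = floats[2] < 0 ? floats[2] : cold;
    vel = floats[3];
    aold = a; bold = b; cold = c;
}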
When I searched for ways to decompress a file using SharpZipLib, I found a lot of methods like this:
public static void TarWriteCharacters(string tarfile, string targetDir)
{
    using (TarInputStream s = new TarInputStream(File.OpenRead(tarfile)))
    {
        //some codes here
        using (FileStream fileWrite = File.Create(targetDir + directoryName + fileName))
        {
            int size = 2048;
            byte[] data = new byte[2048];
            while (true)
            {
                size = s.Read(data, 0, data.Length);
                if (size > 0)
                {
                    fileWrite.Write(data, 0, size);
                }
                else
                {
                    break;
                }
            }
            fileWrite.Close();
        }
    }
}
The signature of FileStream.Write is:
FileStream.Write(byte[] array, int offset, int count)
Now I am trying to separate the read and write parts, because I want to use a thread to speed up the decompression rate in the write function, and I use a List<byte[]> and a List<int> to hold the file's data and block sizes, like below.
Read:
public static void TarWriteCharacters(string tarfile, string targetDir)
{
    using (TarInputStream s = new TarInputStream(File.OpenRead(tarfile)))
    {
        //some codes here
        using (FileStream fileWrite = File.Create(targetDir + directoryName + fileName))
        {
            int size = 2048;
            List<int> SizeList = new List<int>();
            List<byte[]> mydatalist = new List<byte[]>();
            while (true)
            {
                byte[] data = new byte[2048];
                size = s.Read(data, 0, data.Length);
                if (size > 0)
                {
                    mydatalist.Add(data);
                    SizeList.Add(size);
                }
                else
                {
                    break;
                }
            }
            test = new Thread(() =>
                FileWriteFun(pathToTar, args, SizeList, mydatalist)
            );
            test.Start();
            streamWriter.Close();
        }
    }
}
Write:
public static void FileWriteFun(string pathToTar, string[] args, List<int> SizeList, List<byte[]> mydataList)
{
    //some codes here
    using (FileStream fileWrite = File.Create(targetDir + directoryName + fileName))
    {
        for (int i = 0; i < mydataList.Count; i++)
        {
            fileWrite.Write(mydataList[i], 0, SizeList[i]);
        }
        fileWrite.Close();
    }
}
Edit
(1) Moved byte[] data = new byte[2048] into the while loop so that each iteration assigns data to a new array.
(2) Changed int[] SizeList = new int[2048] to List<int> SizeList = new List<int>(), since a fixed-size array limits the number of entries.
As a read on a stream is only guaranteed to return at least one byte (typically it will be more, but you can't rely on getting the full requested length each time), your original solution could theoretically fail after 2048 reads, as an int[2048] SizeList can only hold 2048 entries.
You could use a List<int> to hold the sizes.
Or use a MemoryStream instead of inventing your own.
But the two main problems are:
1) You keep reading into the same byte array, overwriting previously read data. When you add your data byte array to mydatalist, you must assign data to a new byte array.
2) You close your stream before the second thread is done writing.
In general, threading is difficult and should only be used where you know it will improve performance. Simply reading and writing data is typically IO bound, not CPU bound, so introducing a second thread gives a small performance penalty and no gain in speed. You could use multithreading to ensure concurrent read/write operations, but most likely the disk cache will do this for you if you stick to the first solution; and if not, async is easier than multithreading for achieving this.
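For illustration, the MemoryStream variant could look like this (a sketch; directoryName and fileName are the same placeholders as in the question's code):

using (TarInputStream s = new TarInputStream(File.OpenRead(tarfile)))
using (MemoryStream ms = new MemoryStream())
{
    byte[] data = new byte[2048];
    int size;
    // MemoryStream grows as needed and tracks the total length,
    // replacing the hand-rolled List<byte[]> plus List<int> bookkeeping
    while ((size = s.Read(data, 0, data.Length)) > 0)
    {
        ms.Write(data, 0, size);
    }
    File.WriteAllBytes(targetDir + directoryName + fileName, ms.ToArray());
}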
I collect some large log output using a C# tool, so I searched for a way to compress that giant string and found this snippet to do the trick:
public static string CompressString(string text)
{
    byte[] buffer = Encoding.UTF8.GetBytes(text);
    var memoryStream = new MemoryStream();
    using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
    {
        gZipStream.Write(buffer, 0, buffer.Length);
    }

    memoryStream.Position = 0;

    var compressedData = new byte[memoryStream.Length];
    memoryStream.Read(compressedData, 0, compressedData.Length);

    var gZipBuffer = new byte[compressedData.Length + 4];
    Buffer.BlockCopy(compressedData, 0, gZipBuffer, 4, compressedData.Length);
    Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gZipBuffer, 0, 4);
    return Convert.ToBase64String(gZipBuffer);
}
After my logging action, the C# tool sends this compressed string to a node.js REST interface, which writes it into a database.
Now (in my naive understanding of compression) I thought that I could simply use something like the following code on the node.js side to decompress it:
zlib.gunzip(Buffer.from(compressedLogMessage, 'base64'), function(err, uncompressedLogMessage) {
    if (err) {
        console.error(err);
    }
    else {
        console.log(uncompressedLogMessage.toString('utf-8'));
    }
});
But I get the error:
{ Error: incorrect header check
at Zlib._handle.onerror (zlib.js:370:17) errno: -3, code: 'Z_DATA_ERROR' }
It seems that the compression method does not match the decompression function. I expect that anyone with compression/decompression knowledge may spot the issue(s) immediately.
What could I change or improve to make the decompression work?
Thanks a lot!
========== UPDATE ===========
It seems that message receiving and base64 decoding work.
Using CompressString("Hello World") results in:
// before compression
"Hello World"
// after compression before base64 encoding
new byte[] { 11, 0, 0, 0, 31, 139, 8, 0, 0, 0, 0, 0, 0, 3, 243, 72, 205, 201, 201, 87, 8, 207, 47, 202, 73, 1, 0, 86, 177, 23, 74, 11, 0, 0, 0 }
// after base64 encoding
CwAAAB+LCAAAAAAAAAPzSM3JyVcIzy/KSQEAVrEXSgsAAAA=
And on node js side:
// after var buf = Buffer.from('CwAAAB+LCAAAAAAAAAPzSM3JyVcIzy/KSQEAVrEXSgsAAAA=', 'base64');
{"buf":{"type":"Buffer","data":[11,0,0,0,31,139,8,0,0,0,0,0,0,3,243,72,205,201,201,87,8,207,47,202,73,1,0,86,177,23,74,11,0,0,0]}}
// after zlib.gunzip(buf, function(err, dezipped) { ... }
{ Error: incorrect header check
at Zlib._handle.onerror (zlib.js:370:17) errno: -3, code: 'Z_DATA_ERROR' }
=============== Update 2 ==================
#01binary's answer was correct! That's the working solution:
function toArrayBuffer(buffer) {
    var arrayBuffer = new ArrayBuffer(buffer.length);
    var view = new Uint8Array(arrayBuffer);
    for (var i = 0; i < buffer.length; ++i) {
        view[i] = buffer[i];
    }
    return arrayBuffer;
}

// Hello World (compressed with C#) => CwAAAB+LCAAAAAAAAAPzSM3JyVcIzy/KSQEAVrEXSgsAAAA=
var arrayBuffer = toArrayBuffer(Buffer.from('CwAAAB+LCAAAAAAAAAPzSM3JyVcIzy/KSQEAVrEXSgsAAAA=', 'base64'));

var zlib = require('zlib');
zlib.gunzip(Buffer.from(arrayBuffer, 4), function(err, uncompressedMessage) {
    if (err) {
        console.log(err);
    }
    else {
        console.log(uncompressedMessage.toString()); // Hello World
    }
});
The snippet you found appears to write 4 extra bytes at the beginning of the output stream, containing the "uncompressed" size of the original data. The original author must have assumed that the logic on the receiving end would read those 4 bytes, know that it needs to allocate a buffer of that size, and pass the rest of the stream (at a +4 offset) to gunzip.
If you are using this signature on the Node side:
https://nodejs.org/api/buffer.html#buffer_class_method_buffer_from_arraybuffer_byteoffset_length
...then pass a byte offset of 4. The first two bytes of your gzip stream should be { 0x1F, 0x8b }, and you can see in your array that those two bytes start at offset 4. A simple example of the zlib header can be found here:
Zlib compression incompatibile C vs C# implementations
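Alternatively, the length prefix can simply be dropped on the C# side, so the Node side can gunzip the Base64-decoded buffer directly without any offset handling. A sketch of that variant of the question's method (gzip does not need the uncompressed length up front):

public static string CompressString(string text)
{
    byte[] buffer = Encoding.UTF8.GetBytes(text);
    using (var memoryStream = new MemoryStream())
    {
        using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
        {
            gZipStream.Write(buffer, 0, buffer.Length);
        }
        // no 4-byte length prefix: the result is a plain gzip stream
        return Convert.ToBase64String(memoryStream.ToArray());
    }
}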
I have to write a program that reads an image and puts it into a byte array:
var Imagenoriginal = File.ReadAllBytes("10M.bmp");
and divides that byte array into 3 different arrays, in order to send each of these new arrays to another computer (using pipes) to process them there, and finally bring them back to the original computer to produce the result.
But my question is: how do I write an algorithm able to divide the byte array into three different byte arrays when the selected image can be of any size?
Thanks for your help, have a nice day. =)
You can divide the length of the array so that you have three integers n1, n2 and n3 that sum up to array.Length. Then this snippet, using LINQ, should be of help:
var arr1 = sourceArray.Take(n1).ToArray();
var arr2 = sourceArray.Skip(n1).Take(n2).ToArray();
var arr3 = sourceArray.Skip(n1+n2).Take(n3).ToArray();
Now arr1, arr2 and arr3 hold the three parts of your source array. Since this uses LINQ, don't forget using System.Linq; at the beginning of the code.
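For example, to split as evenly as possible into three parts whatever the image size (the last part absorbs the remainder), and to avoid LINQ's repeated enumeration over a large source:

int n1 = sourceArray.Length / 3;
int n2 = sourceArray.Length / 3;
int n3 = sourceArray.Length - n1 - n2; // absorbs the remainder

// Array.Copy avoids re-walking the source for each part,
// which matters for a 10 MB image
var arr1 = new byte[n1]; Array.Copy(sourceArray, 0, arr1, 0, n1);
var arr2 = new byte[n2]; Array.Copy(sourceArray, n1, arr2, 0, n2);
var arr3 = new byte[n3]; Array.Copy(sourceArray, n1 + n2, arr3, 0, n3);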
You can try it like this:
public static IEnumerable<IEnumerable<T>> DivideArray<T>(this T[] array, int size)
{
    for (var i = 0; i < (float)array.Length / size; i++)
    {
        yield return array.Skip(i * size).Take(size);
    }
}
Call it like this:
var arr = new byte[] {1, 2, 3, 4, 5,6};
var dividedArray = arr.DivideArray(3);
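One thing to watch: size here is the chunk length, not the number of parts, and the chunks are produced lazily. To get three concrete arrays, something like this should work:

// derive the length of each part, then materialize every chunk
int partSize = (int)Math.Ceiling(arr.Length / 3.0);
byte[][] parts = arr.DivideArray(partSize).Select(c => c.ToArray()).ToArray();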
Here is a LINQ approach:
public List<List<byte>> DivideArray(List<byte> arr)
{
    return arr
        .Select((x, i) => new { Index = i, Value = x })
        .GroupBy(x => x.Index / 100) // 100 is the chunk size in elements
        .Select(x => x.Select(v => v.Value).ToList())
        .ToList();
}
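The hardcoded 100 is the chunk size; a parameterized sketch of the same idea:

public List<List<byte>> DivideArray(List<byte> arr, int chunkSize)
{
    return arr
        .Select((x, i) => new { Index = i, Value = x })
        .GroupBy(x => x.Index / chunkSize) // consecutive indices share a group
        .Select(g => g.Select(v => v.Value).ToList())
        .ToList();
}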
Have you considered using Streams? You could extend the Stream class to provide the desired behavior, as follows:
public static class Utilities
{
    public static IEnumerable<byte[]> ReadBytes(this Stream input, long bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            using (MemoryStream tempStream = new MemoryStream())
            {
                tempStream.Write(buffer, 0, read);
                yield return tempStream.ToArray();
            }
        }
    }

    public static IEnumerable<byte[]> ReadBlocks(this Stream input, int nblocks)
    {
        long bufferSize = (long)Math.Ceiling((double)input.Length / nblocks);
        return input.ReadBytes(bufferSize);
    }
}
The ReadBytes extension method reads the input Stream and returns its data as a sequence of byte arrays, using the specified bufferSize.
The ReadBlocks extension method calls ReadBytes with the appropriate buffer size, so that the number of elements in the sequence equals nblocks.
You could then use ReadBlocks to achieve what you want:
public class Program
{
    static void Main()
    {
        using (FileStream inputStream = File.Open(@"E:\10M.bmp", FileMode.Open))
        {
            const int nblocks = 3;
            foreach (byte[] block in inputStream.ReadBlocks(nblocks))
            {
                Console.WriteLine("{0} bytes", block.Length);
            }
        }
    }
}
Note how ReadBytes uses tempStream and read to buffer in memory exactly the bytes read from the input stream before converting them into a byte array; this solves the problem with the leftover bytes mentioned in the comments.
So I have this really simple code that reads a file and spits its data out in hex-viewer fashion. Here it is:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace HexViewer
{
    class Program
    {
        static void Main(string[] args)
        {
            BinaryReader br = new BinaryReader(new FileStream("C:\\dump.bin", FileMode.Open));
            for (int i = 0; i < br.BaseStream.Length; i += 16)
            {
                Console.Write(i.ToString("x") + ": ");
                byte[] data = new byte[16];
                br.Read(data, i, 16);
                Console.WriteLine(BitConverter.ToString(data).Replace("-", " "));
            }
            Console.ReadLine();
        }
    }
}
The problem is that after the first iteration, when I do
br.Read(data, 16, 16);
the byte array is padded with 16 bytes and then filled with data from the 15th byte to the 31st byte of the file. Because 32 bytes can't fit into a 16-byte array, it throws an exception. You can try this code with any file larger than 16 bytes. So, the question is: what is wrong with this code?
Just change br.Read(data, i, 16); to br.Read(data, 0, 16);
You are reading a new block of data each time, so there is no need to use i for the data buffer: that offset parameter is the position to write to in the destination array, not a position in the file.
Even better, change:
byte[] data = new byte[16];
br.Read(data, 0, 16);
To:
var data = br.ReadBytes(16);
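Note that ReadBytes returns a shorter array when fewer than 16 bytes remain in the file, so the last line of the dump comes out with the correct length instead of being padded with zero bytes from a partially filled buffer.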