Optimization - Encode a string and get hexadecimal representation of 3 bytes - c#

I am currently working in an environment where performance is critical, and this is what I am doing:
var iso_8859_5 = System.Text.Encoding.GetEncoding("iso-8859-5");
var dataToSend = iso_8859_5.GetBytes(message);
Then I need to group the bytes in threes, so I have a for loop that does this (i being the loop index):
byte[] dataByteArray = { dataToSend[i], dataToSend[i + 1], dataToSend[i + 2], 0 };
I then get an integer out of these 4 bytes
BitConverter.ToUInt32(dataByteArray, 0)
and finally the integer is converted to a hexadecimal string that I can place in a network packet.
The last two lines repeat about 150 times.
I am currently hitting 50 milliseconds of execution time and ideally I would want to get as close to 0 as possible. Is there a faster way to do this that I am not aware of?
UPDATE
Just tried
string hex = BitConverter.ToString(dataByteArray);
hex.Replace("-", "")
to get the hex string directly but it is 3 times slower
Ricardo Silva's answer adapted
public byte[][] GetArrays(byte[] fullMessage, int size)
{
var returnArrays = new byte[(fullMessage.Length / size)+1][];
int i, j;
for (i = 0, j = 0; i < (fullMessage.Length - 2); i += size, j++)
{
returnArrays[j] = new byte[size + 1];
Buffer.BlockCopy(
src: fullMessage,
srcOffset: i,
dst: returnArrays[j],
dstOffset: 0,
count: size);
returnArrays[j][returnArrays[j].Length - 1] = 0x00;
}
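switch (fullMessage.Length % size) // bytes left after the last full group ("% i" divides by zero when the input is shorter than size)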
{
case 0: {
returnArrays[j] = new byte[] { 0, 0, EOT, 0 };
} break;
case 1: {
returnArrays[j] = new byte[] { fullMessage[i], 0, EOT, 0 };
} break;
case 2: {
returnArrays[j] = new byte[] { fullMessage[i], fullMessage[i + 1], EOT, 0 };
} break;
}
return returnArrays;
}

After the line below you have the complete byte array.
var dataToSend = iso_8859_5.GetBytes(message);
My suggestion is to work with Buffer.BlockCopy and test whether it is faster than your current method.
Try the code below and tell us if it is faster than your current code:
public byte[][] GetArrays(byte[] fullMessage, int size)
{
var returnArrays = new byte[fullMessage.Length / size][];
// Only full groups are copied; trailing bytes that do not fill a group are skipped.
for (int i = 0, j = 0; i + size <= fullMessage.Length; i += size, j++)
{
returnArrays[j] = new byte[size + 1];
Buffer.BlockCopy(
src: fullMessage,
srcOffset: i,
dst: returnArrays[j],
dstOffset: 0,
count: size);
returnArrays[j][returnArrays[j].Length - 1] = 0x00;
}
return returnArrays;
}
EDIT 1: I ran the test below and the output was 245900 ns (0.2459 ms).
[TestClass()]
public class Form1Tests
{
[TestMethod()]
public void GetArraysTest()
{
var expected = new byte[] { 0x30, 0x31, 0x32, 0x00 };
var size = 3;
var stopWatch = new Stopwatch();
stopWatch.Start();
var iso_8859_5 = System.Text.Encoding.GetEncoding("iso-8859-5");
var target = iso_8859_5.GetBytes("012");
var arrays = Form1.GetArrays(target, size);
BitConverter.ToUInt32(arrays[0], 0);
stopWatch.Stop();
foreach(var array in arrays)
{
for(int i = 0; i < expected.Count(); i++)
{
Assert.AreEqual(expected[i], array[i]);
}
}
Console.WriteLine(string.Format("{0}ns", stopWatch.Elapsed.TotalMilliseconds * 1000000));
}
}
EDIT 2
I looked at your code and I have only one suggestion. I understand that you need to append an end-of-message marker (EOT in your code) and that the length of the input array will not always be a multiple of the chunk size.
However, the method now has two responsibilities, which breaks the S in SOLID.
The S stands for Single Responsibility: each method should have one, and only one, responsibility.
The code you posted has two (splitting the input array into N smaller arrays, and appending the marker). Try to split it into two completely independent methods: one that breaks an array into N smaller arrays, and another that adds the marker to any array you pass it. That lets you write unit tests for each method (so you know they work and are not broken by later changes), and then call both from the class that integrates them.
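For the original goal (one hex string per group of three bytes), it may also be worth measuring a version that skips the intermediate 4-byte arrays altogether. The following is only a sketch under assumptions: the names TripletHex, Pack and ToHex are made up for illustration, the "X6" format is a guess at the hex layout the packet expects, and the EOT handling from the question is left out. Pack yields the same value BitConverter.ToUInt32 gives for { b0, b1, b2, 0 } on a little-endian machine.

```csharp
using System;
using System.Text;

public static class TripletHex
{
    // Packs up to three bytes starting at offset into one uint.
    // Missing trailing bytes are treated as zero.
    public static uint Pack(byte[] data, int offset)
    {
        uint value = data[offset];
        if (offset + 1 < data.Length) value |= (uint)data[offset + 1] << 8;
        if (offset + 2 < data.Length) value |= (uint)data[offset + 2] << 16;
        return value;
    }

    // Appends the hex form of every group to a single StringBuilder,
    // so no per-group byte[] is allocated.
    public static string ToHex(byte[] data)
    {
        var sb = new StringBuilder(((data.Length + 2) / 3) * 6);
        for (int i = 0; i < data.Length; i += 3)
        {
            // "X6" (six upper-case hex digits) is an assumed format.
            sb.Append(Pack(data, i).ToString("X6"));
        }
        return sb.ToString();
    }
}
```

Whether this beats the BlockCopy version for about 150 groups is something to measure with a Stopwatch on the real messages rather than assume.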

Related

Convert byte array to array segments of a certain length

I have a byte array and I would like to return sequential chunks (in the form of new byte arrays) of a certain size.
I tried:
originalArray = BYTE_ARRAY
var segment = new ArraySegment<byte>(originalArray,0,640);
byte[] newArray = new byte[640];
for (int i = segment.Offset; i < segment.Offset + segment.Count; i++)
{
newArray[i - segment.Offset] = segment.Array[i];
}
Obviously this only creates an array of the first 640 bytes from the original array. Ultimately, I want a loop that goes through the first 640 bytes and returns an array of those bytes, then goes through the NEXT 640 bytes and returns an array of THOSE bytes. The purpose of this is to send messages to a server, and each message must contain 640 bytes. I cannot guarantee that the original array length is divisible by 640.
Thanks
If speed isn't a concern:
var bytes = new byte[640 * 6];
for (var i = 0; i < bytes.Length; i += 640) // "<", otherwise the last pass yields an empty chunk
{
var chunk = bytes.Skip(i).Take(640).ToArray();
...
}
Alternatively, you could use Span<T>.Slice or Buffer.BlockCopy(Array, Int32, Array, Int32, Int32).
Span
Span<byte> bytes = arr; // Implicit cast from T[] to Span<T>
...
var slicedBytes = bytes.Slice(i, 640);
BlockCopy
Note: of the approaches that copy each chunk into a new array, this will probably be the fastest (Span.Slice does not copy at all).
var chunk = new byte[640];
Buffer.BlockCopy(bytes, i, chunk, 0, 640);
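Since the question says the length is not guaranteed to be a multiple of 640, here is one way the Span approach could handle the short final chunk. This is a sketch, not a drop-in: SpanChunks and Split are made-up names, and it assumes a runtime where Span<T> is available.

```csharp
using System;
using System.Collections.Generic;

public static class SpanChunks
{
    // Copies each chunk into its own array; the last one may be shorter.
    public static List<byte[]> Split(byte[] source, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunks = new List<byte[]>();
        ReadOnlySpan<byte> span = source; // implicit conversion from byte[]
        for (int i = 0; i < span.Length; i += chunkSize)
        {
            // Clamp the length so the final slice never runs past the end.
            int length = Math.Min(chunkSize, span.Length - i);
            chunks.Add(span.Slice(i, length).ToArray());
        }
        return chunks;
    }
}
```

If the server really requires exactly 640 bytes per message, the last chunk would need padding instead; that decision depends on the protocol and is not covered here.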
If you truly want to make new arrays from each 640-byte chunk, then you're looking for .Skip and .Take.
Here's a working example that I hacked together.
using System;
using System.Linq;
using System.Text;
using System.Collections;
using System.Collections.Generic;
class MainClass {
public static void Main (string[] args) {
// mock up a byte array from something
var seedString = String.Join("", Enumerable.Range(0, 1024).Select(x => x.ToString()));
var byteArrayInput = Encoding.ASCII.GetBytes(seedString);
var skip = 0;
var take = 640;
var total = byteArrayInput.Length;
var output = new List<byte[]>();
while (skip < total) { // Take() returns the shorter remainder on the last pass
output.Add(byteArrayInput.Skip(skip).Take(take).ToArray());
skip += take;
}
output.ForEach(c => Console.WriteLine($"chunk: {BitConverter.ToString(c)}"));
}
}
It's probably better to use ArraySegment properly, unless this is an assignment to learn the LINQ extensions.
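To illustrate what using ArraySegment for this might look like, here is a minimal sketch (SegmentChunks and Split are hypothetical names). Each segment is just a window over the original array, so nothing is copied; the final segment is shorter when the length is not a multiple of the chunk size.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentChunks
{
    // Yields windows over the source array instead of new arrays.
    public static IEnumerable<ArraySegment<byte>> Split(byte[] source, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        for (int offset = 0; offset < source.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, source.Length - offset);
            yield return new ArraySegment<byte>(source, offset, count);
        }
    }
}
```

A consumer would then pass segment.Array, segment.Offset and segment.Count to whatever write call it uses, for example Stream.Write(byte[], int, int).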
You can write a generic helper method like this:
public static IEnumerable<T[]> AsBatches<T>(T[] input, int n)
{
for (int i = 0, r = input.Length; r >= n; r -= n, i += n)
{
var result = new T[n];
Array.Copy(input, i, result, 0, n);
yield return result;
}
}
Then you can use it in a foreach loop:
byte[] byteArray = new byte[123456];
foreach (var batch in AsBatches(byteArray, 640))
{
Console.WriteLine(batch.Length); // Do something with the batch.
}
Or if you want a list of batches just do this:
List<byte[]> listOfBatches = AsBatches(byteArray, 640).ToList();
If you want to get fancy you could make it an extension method, but this is only recommended if you will be using it a lot (don't make an extension method for something you'll only be calling in one place!).
Here I've changed the name to InChunksOf() to make it more readable:
public static class ArrayExt
{
public static IEnumerable<T[]> InChunksOf<T>(this T[] input, int n)
{
for (int i = 0, r = input.Length; r >= n; r -= n, i += n)
{
var result = new T[n];
Array.Copy(input, i, result, 0, n);
yield return result;
}
}
}
Which you could use like this:
byte[] byteArray = new byte[123456];
// ... initialise byteArray[], then:
var listOfChunks = byteArray.InChunksOf(640).ToList();
[EDIT] Corrected loop terminator from r > n to r >= n.

checksum calculation using ArraySegment<byte>

I have an issue with the following method. I don't understand why it behaves the way it does.
private static bool chksumCalc(ref byte[] receive_byte_array)
{
Console.WriteLine("receive_byte_array -> " + receive_byte_array.Length); //ok,151 bytes in my case
ArraySegment<byte> segment = new ArraySegment<byte>(receive_byte_array, 0, 149);
Console.WriteLine("segment # -> " + segment.Count); //ok,149 bytes
BitArray resultBits = new BitArray(8); //hold the result
Console.WriteLine("resultBits.Length -> " + resultBits.Length); //ok, 8bits
//now loop through the 149 bytes
for (int i = segment.Offset; i < (segment.Offset + segment.Count); ++i)
{
BitArray curBits = new BitArray(segment.Array[i]);
Console.WriteLine("curBits.Length -> " + curBits.Length); //gives me 229 not 8?
resultBits = resultBits.Xor(curBits);
}
//some more things to do ... return true...
//or else
return false;
}
I need to XOR 149 bytes and I don't understand why segment.Array[i] doesn't give me 1 byte. If I have an array of 149 bytes and I use, for example, segment.Array[1], it has to yield the 2nd byte, or am I wrong? Where does the 229 come from? Can someone please clarify? Thank you.
This is the constructor you're calling: BitArray(int length)
Initializes a new instance of the BitArray class that can hold the specified number of bit values, which are initially set to false.
segment.Array[i] does give you a single byte, but that byte is implicitly converted to int and used as the length, so a byte with the value 229 produces a BitArray of 229 bits. I don't see why you need to use the BitArray class at all, though. Just use a byte to store your XOR result:
private static bool chksumCalc(ref byte[] receive_byte_array)
{
var segment = new ArraySegment<byte>(receive_byte_array, 0, 149);
byte resultBits = 0;
for (var i = segment.Offset; i < (segment.Offset + segment.Count); ++i)
{
var curBits = segment.Array[i];
resultBits = (byte)(resultBits ^ curBits);
}
//some more things to do ... return true...
//or else
return false;
}
I don't think you need the ArraySegment<T> either (not for the code presented), but I left it as is since it's beside the point of the question.
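The constructor mix-up is easy to reproduce in isolation. The sketch below (BitArrayDemo and its method names are made up for illustration) shows both overloads side by side, plus an XOR over a plain array range without ArraySegment.

```csharp
using System;
using System.Collections;

public static class BitArrayDemo
{
    // The byte is implicitly converted to int, so this binds to
    // BitArray(int length): the VALUE of the byte becomes the bit count.
    public static int LengthFromByteValue(byte value)
    {
        return new BitArray(value).Length;
    }

    // BitArray(byte[]) copies the bits of each element: 8 bits per byte.
    public static int LengthFromByteArray(byte value)
    {
        return new BitArray(new[] { value }).Length;
    }

    // XOR of data[offset .. offset + count - 1], kept in a single byte.
    public static byte Xor(byte[] data, int offset, int count)
    {
        byte result = 0;
        for (int i = offset; i < offset + count; i++)
        {
            result ^= data[i];
        }
        return result;
    }
}
```

With a byte of value 229, the first method reports 229 bits and the second reports 8, which matches what the question observed.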

Intersect and Union in byte array of 2 files

I have two files: a source file and a destination file.
Below is my code to intersect and union the two files using byte arrays.
FileStream frsrc = new FileStream("Src.bin", FileMode.Open);
FileStream frdes = new FileStream("Des.bin", FileMode.Open);
int length = 24; // get file length
byte[] src = new byte[length];
byte[] des = new byte[length]; // create buffer
int Counter = 0; // actual number of bytes read
int subcount = 0;
while (frsrc.Read(src, 0, length) > 0)
{
try
{
Counter = 0;
frdes.Position = subcount * length;
while (frdes.Read(des, 0, length) > 0)
{
var data = src.Intersect(des);
var data1 = src.Union(des);
Counter++;
}
subcount++;
Console.WriteLine(subcount.ToString());
}
catch (Exception ex)
{
}
}
It works fine and runs very fast.
But now I want the counts, and when I use the code below it becomes very slow.
var data = src.Intersect(des).Count();
var data1 = src.Union(des).Count();
So, is there any solution for that?
If so, please let me know.
Thanks
Intersect and Union are not the fastest operations. The reason you see it being fast is that you never actually enumerate the results!
Both return an enumerable, not the actual results of the operation. You're supposed to go through that and enumerate the enumerable, otherwise nothing happens - this is called "deferred execution". Now, when you do Count, you actually enumerate the enumerable, and incur the full cost of the Intersect and Union - believe me, the Count itself is relatively trivial (though still an O(n) operation!).
You'll need to make your own methods, most likely. You want to avoid the enumerable overhead, and more importantly, you'll probably want a lookup table.
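As a rough idea of what such a hand-written method with a lookup table could look like, here is a sketch. ByteSetCounts and its members are hypothetical names, and the explicit length parameters are an assumption so that only the bytes actually returned by Read are considered. It produces the same numbers as Intersect(...).Count() and Union(...).Count(), because both LINQ operators return distinct values.

```csharp
using System;

public static class ByteSetCounts
{
    // Marks which of the 256 possible byte values occur in data[0 .. length - 1].
    private static bool[] Seen(byte[] data, int length)
    {
        var seen = new bool[256];
        for (int i = 0; i < length; i++)
        {
            seen[data[i]] = true;
        }
        return seen;
    }

    // Counts distinct byte values present in both buffers (intersect)
    // and in at least one buffer (union).
    public static void Count(byte[] src, int srcLength, byte[] des, int desLength,
                             out int intersect, out int union)
    {
        bool[] inSrc = Seen(src, srcLength);
        bool[] inDes = Seen(des, desLength);
        intersect = 0;
        union = 0;
        for (int value = 0; value < 256; value++)
        {
            if (inSrc[value] && inDes[value]) intersect++;
            if (inSrc[value] || inDes[value]) union++;
        }
    }
}
```

Each call does a fixed 256-step pass plus one pass over each buffer, with no enumerator or hash set allocations.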
A few points: the comment // get file length is misleading as it is the buffer size. Counter is not the number of bytes read, it is the number of blocks read. data and data1 will end up with the result of the last block read, ignoring any data before them. That is assuming that nothing goes wrong in the while loop - you need to remove the try structure to see if there are any errors.
What you can do is count the number of occurrences of each byte in each file. If the count of a byte in any file is greater than zero, it is a member of the union of the files, and if the count of a byte in all the files is greater than zero, it is a member of the intersection of the files.
It is just as easy to write the code for more than two files as it is for two files, whereas LINQ is easy for two but a little bit more fiddly for more than two. (I put in a comparison with using LINQ in a naïve fashion for only two files at the end.)
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var file1 = @"C:\Program Files (x86)\Electronic Arts\Crysis 3\Bin32\Crysis3.exe"; // 26MB
var file2 = @"C:\Program Files (x86)\Electronic Arts\Crysis 3\Bin32\d3dcompiler_46.dll"; // 3MB
List<string> files = new List<string> { file1, file2 };
var sw = System.Diagnostics.Stopwatch.StartNew();
// Prepare array of counters for the bytes
var nFiles = files.Count;
int[][] count = new int[nFiles][];
for (int i = 0; i < nFiles; i++)
{
count[i] = new int[256];
}
// Get the counts of bytes in each file
int bufLen = 32768;
byte[] buffer = new byte[bufLen];
int bytesRead;
for (int fileNum = 0; fileNum < nFiles; fileNum++)
{
using (var sr = new FileStream(files[fileNum], FileMode.Open, FileAccess.Read))
{
bytesRead = bufLen;
while (bytesRead > 0)
{
bytesRead = sr.Read(buffer, 0, bufLen);
for (int i = 0; i < bytesRead; i++)
{
count[fileNum][buffer[i]]++;
}
}
}
}
// Find which bytes are in any of the files or in all the files
var inAny = new List<byte>(); // union
var inAll = new List<byte>(); // intersect
for (int i = 0; i < 256; i++)
{
Boolean all = true;
for (int fileNum = 0; fileNum < nFiles; fileNum++)
{
if (count[fileNum][i] > 0)
{
if (!inAny.Contains((byte)i)) // avoid adding same value more than once
{
inAny.Add((byte)i);
}
}
else
{
all = false;
}
};
if (all)
{
inAll.Add((byte)i);
};
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
// Display the results
Console.WriteLine("Union: " + string.Join(",", inAny.Select(x => x.ToString("X2"))));
Console.WriteLine();
Console.WriteLine("Intersect: " + string.Join(",", inAll.Select(x => x.ToString("X2"))));
Console.WriteLine();
// Compare to using LINQ.
// N/B. Will need adjustments for more than two files.
var srcBytes1 = File.ReadAllBytes(file1);
var srcBytes2 = File.ReadAllBytes(file2);
sw.Restart();
var intersect = srcBytes1.Intersect(srcBytes2).ToArray().OrderBy(x => x);
var union = srcBytes1.Union(srcBytes2).ToArray().OrderBy(x => x);
Console.WriteLine(sw.ElapsedMilliseconds);
Console.WriteLine("Union: " + String.Join(",", union.Select(x => x.ToString("X2"))));
Console.WriteLine();
Console.WriteLine("Intersect: " + String.Join(",", intersect.Select(x => x.ToString("X2"))));
Console.ReadLine();
}
}
}
The counting-the-byte-occurrences method is roughly five times faster than the LINQ method on my computer, even without the latter loading the files and on a range of file sizes (a few KB to a few MB).

Code change vb.net to c#

I have no idea where to ask a question like this, so I should probably apologise right away.
Private Function RCON_Command(ByVal Command As String, ByVal ServerData As Integer) As Byte()
Dim Packet As Byte() = New Byte(CByte((13 + Command.Length))) {}
Packet(0) = Command.Length + 9 'Packet Size (Integer)
Packet(4) = 0 'Request Id (Integer)
Packet(8) = ServerData 'SERVERDATA_EXECCOMMAND / SERVERDATA_AUTH (Integer)
For X As Integer = 0 To Command.Length - 1
Packet(12 + X) = System.Text.Encoding.Default.GetBytes(Command(X))(0)
Next
Return Packet
End Function
Can someone tell me what this code should look like in C#? I tried it myself but I always get the error: Cannot implicitly convert type 'int' to 'byte'. An explicit conversion exists (are you missing a cast?)
I tried to cast, but then I get an error saying there is no need to cast.
My code:
private byte[] RCON_Command(string command, int serverdata)
{
byte[] packet = new byte[command.Length + 13];
packet[0] = command.Length + 9;
packet[4] = 0;
packet[8] = serverdata;
for (int i = 0; i < command.Length; i++)
{
packet[12 + i] = System.Text.Encoding.UTF8.GetBytes(command[i])[0];
}
return packet;
}
The error is on the packet[0] and packet[8] lines.
You need to cast the two items to byte before assigning them. Another option I've done below is to change the method to accept serverdata as a byte instead of int - there's no point in taking the extra bytes only to throw them away.
Another problem is in the for loop - the indexer of string returns a char, which UTF8.GetBytes() can't accept. I think my translation should work, but you'll need to test it.
private byte[] RCON_Command(string command, byte serverdata)
{
byte[] packet = new byte[command.Length + 13];
packet[0] = (byte)(command.Length + 9);
packet[4] = 0;
packet[8] = serverdata;
for (int i = 0; i < command.Length; i++)
{
packet[12 + i] = System.Text.Encoding.UTF8.GetBytes(command)[i];
}
return packet;
}
Here you go. The Telerik converter was no use - that code wouldn't compile.
This code runs...
private byte[] RCON_Command(string Command, int ServerData)
{
byte[] commandBytes = System.Text.Encoding.Default.GetBytes(Command);
byte[] Packet = new byte[13 + commandBytes.Length + 1];
for (int i = 0; i < Packet.Length; i++)
{
Packet[i] = (byte)0;
}
int index = 0;
//Packet Size (Integer)
byte[] bytes = BitConverter.GetBytes(Command.Length + 9);
foreach (var byt in bytes)
{
Packet[index++] = byt;
}
//Request Id (Integer)
bytes = BitConverter.GetBytes((int)0);
foreach (var byt in bytes)
{
Packet[index++] = byt;
}
//SERVERDATA_EXECCOMMAND / SERVERDATA_AUTH (Integer)
bytes = BitConverter.GetBytes(ServerData);
foreach (var byt in bytes)
{
Packet[index++] = byt;
}
foreach (var byt in commandBytes)
{
Packet[index++] = byt;
}
return Packet;
}
In addition to the need for casting, you need to be aware that C# uses array sizes when creating the array, not the upper bound that VB uses - so you need "14 + Command.Length":
private byte[] RCON_Command(string Command, int ServerData)
{
byte[] Packet = new byte[14 + Command.Length];
Packet[0] = Convert.ToByte(Command.Length + 9); //Packet Size (Integer)
Packet[4] = 0; //Request Id (Integer)
Packet[8] = Convert.ToByte(ServerData); //SERVERDATA_EXECCOMMAND / SERVERDATA_AUTH (Integer)
for (int X = 0; X < Command.Length; X++)
{
Packet[12 + X] = System.Text.Encoding.Default.GetBytes(new[] { Command[X] })[0];
}
return Packet;
}
Just add the explicit casts. You might want to make sure that it's safe to down cast from a 32-bit value type to an 8-bit type.
packet[0] = (byte)(command.Length + 9);
...
packet[8] = (byte)serverdata;
EDIT:
TheEvilPenguin is also right that you will have a problem with your call to GetBytes().
This is how I would fix it to make sure I don't change the meaning of the existing VB.NET code:
packet[12 + i] = System.Text.Encoding.UTF8.GetBytes(new char[] {command[i]})[0];
And also, one more detail:
When you declare an array in VB.NET, you define the maximum array index. In C#, the number in the array declaration represents the number of elements in the array. This means that in the translation from VB.NET to C#, to keep equivalent behavior, you need to add + 1 to the number in the array declaration:
byte[] packet = new byte[command.Length + 13 + 1]; // or + 14 if you want
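If you don't need a line-for-line translation, a BinaryWriter over a MemoryStream avoids the index arithmetic and the casts entirely. The sketch below is an assumption-laden illustration: RconPacket and Build are made-up names, UTF8 is used for the command text (the VB.NET code uses Encoding.Default), and the field values simply mirror the VB.NET code (size field = length + 9, total array of 14 + length bytes), so check them against the protocol you are targeting.

```csharp
using System.IO;
using System.Text;

public static class RconPacket
{
    public static byte[] Build(string command, int serverData)
    {
        byte[] body = Encoding.UTF8.GetBytes(command); // assumed encoding

        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            // BinaryWriter writes each int as 4 bytes, little-endian.
            writer.Write(body.Length + 9); // packet size, as in the VB.NET code
            writer.Write(0);               // request id
            writer.Write(serverData);      // SERVERDATA_EXECCOMMAND / SERVERDATA_AUTH
            writer.Write(body);            // command text
            writer.Write((byte)0);         // two trailing zero bytes, matching the
            writer.Write((byte)0);         // zero-filled tail of the VB.NET array
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

Because the three header fields are written as full 4-byte integers, values above 255 no longer get truncated the way a cast to byte would.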

Byte[] Split Up Into 1022 Length Sections Then Restored Not Matching Up

I am trying to take a PDF document and upload it via an MVC website to be stored in an SAP structure. The SAP structure requires the byte array to be broken up into sections of length 1022. The program seems to work well up to the point where I try to view the PDF document out of SAP. Unfortunately, I cannot view the PDF data stored in SAP due to access rights. So I created a sort of mock program to compare the byte array before it is sent to SAP (fileContent) with what it should look like once it is returned from SAP (fileContentPostSAP).
The program compares the byte arrays and finds mismatching values at array location 1022.
Is there a bug in my program that is causing the byte arrays to not match? They are supposed to match exactly, right?
ClaimsIdentityMgr claimIdentityMgr = new ClaimsIdentityMgr();
ClaimsIdentity currentClaimsIdentity = claimIdentityMgr.GetCurrentClaimsIdentity();
var subPath = "~/App_Data/" + currentClaimsIdentity.EmailAddress;
var destinationPath = Path.Combine(Server.MapPath(subPath), "LG WM3455H Spec Sheet.pdf");
byte[] fileContent = System.IO.File.ReadAllBytes(destinationPath);
//pretend this is going to SAP
var arrList = SAPServiceRequestRepository.CreateByteListForStructure(fileContent);
var mockStructureList = new List<byte[]>();
foreach (byte[] b in arrList)
mockStructureList.Add(b);
//now get it back from Mock SAP
var fileContentPostSAP = new byte[fileContent.Count()];
var rowCounter = 0;
var prevLength = 0;
foreach (var item in mockStructureList)
{
if (rowCounter == 0)
System.Buffer.BlockCopy(item, 0, fileContentPostSAP, 0, item.Length);
else
System.Buffer.BlockCopy(item, 0, fileContentPostSAP, prevLength, item.Length);
rowCounter++;
prevLength = item.Length;
}
//compare the orginal array with the new one
var areEqual = (fileContent == fileContentPostSAP);
for (var i = 0; i < fileContent.Length; i++)
{
if (fileContent[i] != fileContentPostSAP[i])
throw new Exception("i = " + i + " | fileContent[i] = " + fileContent[i] + " | fileContentPostSAP[i] = " + fileContentPostSAP[i]);
}
And here is the CreateByteListForStructure function:
public static List<byte[]> CreateByteListForStructure(byte[] fileContent)
{
var returnList = new List<byte[]>();
for (var i = 0; i < fileContent.Length; i += 1022)
{
if (fileContent.Length - i >= 1022)
{
var localByteArray = new byte[1022];
System.Buffer.BlockCopy(fileContent, i, localByteArray, 0, 1022);
returnList.Add(localByteArray);
}
else
{
var localByteArray = new byte[fileContent.Length - i];
System.Buffer.BlockCopy(fileContent, i, localByteArray, 0, fileContent.Length - i);
returnList.Add(localByteArray);
}
}
return returnList;
}
There seems to be a simple bug in the code.
This loop, which reconstructs the contents of the array from the blocks:
var prevLength = 0;
foreach (var item in mockStructureList)
{
if (rowCounter == 0)
System.Buffer.BlockCopy(item, 0, fileContentPostSAP, 0, item.Length);
else
System.Buffer.BlockCopy(item, 0, fileContentPostSAP, prevLength, item.Length);
rowCounter++;
prevLength = item.Length;
}
By the description of the blocks, every block is 1022 bytes, which means that after the first iteration, prevLength is set to 1022, but after the next iteration it is set to 1022 again.
The more correct assignment of prevLength would be this:
prevLength += item.Length;
^
|
+-- added this
This will correctly move the pointer in the output array forward one block at a time, instead of moving it to the second block and then leaving it there.
Basically you write block 0 in the correct place, but all the other blocks on top of block 1, leaving block 2 and onwards in the output array as zeroes.
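A stripped-down round trip makes the fix easy to verify outside the MVC project. This is a sketch with made-up names (BlockRoundTrip, Split, Join); Split follows the same idea as CreateByteListForStructure, and Join keeps a running offset instead of the length of the previous block.

```csharp
using System;
using System.Collections.Generic;

public static class BlockRoundTrip
{
    // Splits content into blocks of blockSize bytes; the last may be shorter.
    public static List<byte[]> Split(byte[] content, int blockSize)
    {
        var blocks = new List<byte[]>();
        for (int i = 0; i < content.Length; i += blockSize)
        {
            var block = new byte[Math.Min(blockSize, content.Length - i)];
            Buffer.BlockCopy(content, i, block, 0, block.Length);
            blocks.Add(block);
        }
        return blocks;
    }

    // Reassembles the blocks into one array of totalLength bytes.
    public static byte[] Join(List<byte[]> blocks, int totalLength)
    {
        var result = new byte[totalLength];
        int offset = 0; // running total of bytes written so far
        foreach (var block in blocks)
        {
            Buffer.BlockCopy(block, 0, result, offset, block.Length);
            offset += block.Length; // "+=", not "=": advance past every block
        }
        return result;
    }
}
```

Comparing the result with Enumerable.SequenceEqual (or a loop like the one in the question) is needed here, since == on two arrays only compares references.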
