Find bytes from an offset - c#

So I have this:
public static long FindPosition(Stream stream, byte[] byteSequence)
{
if (byteSequence.Length > stream.Length)
return -1;
byte[] buffer = new byte[byteSequence.Length];
using (BufferedStream bufStream = new BufferedStream(stream, byteSequence.Length))
{
int i;
while ((i = bufStream.Read(buffer, 0, byteSequence.Length)) == byteSequence.Length)
{
if (byteSequence.SequenceEqual(buffer))
return bufStream.Position - byteSequence.Length;
else
bufStream.Position -= byteSequence.Length - PadLeftSequence(buffer, byteSequence);
}
}
return -1;
}
private static int PadLeftSequence(byte[] bytes, byte[] seqBytes)
{
int i = 1;
while (i < bytes.Length)
{
int n = bytes.Length - i;
byte[] aux1 = new byte[n];
byte[] aux2 = new byte[n];
Array.Copy(bytes, i, aux1, 0, n);
Array.Copy(seqBytes, aux2, n);
if (aux1.SequenceEqual(aux2))
return i;
i++;
}
return i;
}
Which works perfectly to get an offset that has a specific set of bytes, but now I want to do the inverse, find a set of bytes from a specific offset.
How can I do that?

Try this example:
long offset = 100L; // Offset
int bytesCount = 20; // Number of bytes to read
byte[] buffer = new byte[bytesCount];
stream.Seek( offset, SeekOrigin.Begin ); // Move to the offset, measured from the beginning of the stream
int bytesRead = stream.Read( buffer, 0, bytesCount ); // Read up to bytesCount bytes from that offset; returns the number actually read
More detail about Seek and Read
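Note that Read is allowed to return fewer bytes than requested. A minimal sketch of a helper that keeps reading until the buffer is full could look like this. The class name StreamOffsetReader and the method name ReadAt are made up for illustration, and the sketch assumes the stream supports seeking:

```csharp
using System;
using System.IO;

public static class StreamOffsetReader
{
    // Hypothetical helper: reads exactly `count` bytes starting at `offset`.
    public static byte[] ReadAt(Stream stream, long offset, int count)
    {
        if (!stream.CanSeek)
            throw new NotSupportedException("The stream must support seeking.");

        // Position the stream at the requested offset from its beginning.
        stream.Seek(offset, SeekOrigin.Begin);

        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            // Read may return fewer bytes than requested, so keep reading.
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
                throw new EndOfStreamException("The stream ended before all requested bytes were read.");
            total += read;
        }
        return buffer;
    }
}
```

Calling StreamOffsetReader.ReadAt(stream, 100L, 20) would then return the 20 bytes that start at offset 100, or throw if the stream is too short.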

Related

C# - upload file by chunks - bad last chunk size

I am trying to upload large files to a third-party service in chunks, but I have a problem with the last chunk. The last chunk will almost always be smaller than 5 MB, yet all chunks, including the last one, have the same size of 5 MB.
My code:
int chunkSize = 1024 * 1024 * 5;
using (Stream streamx = new FileStream(file.Path, FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[chunkSize];
int bytesRead = 0;
long bytesToRead = streamx.Length;
while (bytesToRead > 0)
{
int n = streamx.Read(buffer, 0, chunkSize);
if (n == 0) break;
// do work on buffer...
// uploading chunk ....
var partRequest = HttpHelpers.InvokeHttpRequestStream
(
new Uri(endpointUri + "?partNumber=" + i + "&uploadId=" + UploadId),
"PUT",
partHeaders,
buffer
); // upload buffer
bytesRead += n;
bytesToRead -= n;
}
streamx.Dispose();
}
buffer is uploaded to the third-party service.
Solved. Someone posted updated code in a comment and deleted the comment a few seconds later, but it contained the solution. I added this part after
if (n == 0)
It resizes the last chunk to the right size:
// Let's resize the last incomplete buffer
if (n != buffer.Length)
Array.Resize(ref buffer, n);
Thank you all.
Here is the full working code:
int chunkSize = 1024 * 1024 * 5;
using (Stream streamx = new FileStream(file.Path, FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[chunkSize];
int bytesRead = 0;
long bytesToRead = streamx.Length;
while (bytesToRead > 0)
{
int n = streamx.Read(buffer, 0, chunkSize);
if (n == 0) break;
// Let's resize the last incomplete buffer
if (n != buffer.Length)
Array.Resize(ref buffer, n);
// do work on buffer...
// uploading chunk ....
var partRequest = HttpHelpers.InvokeHttpRequestStream
(
new Uri(endpointUri + "?partNumber=" + i + "&uploadId=" + UploadId),
"PUT",
partHeaders,
buffer
); // upload buffer
bytesRead += n;
bytesToRead -= n;
}
}
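One caveat with resizing the shared buffer: after Array.Resize the buffer is shorter than chunkSize, so if a short read ever happened before the end of the stream, the next Read call with chunkSize would no longer fit. A more defensive sketch, under the assumption that each chunk can be copied into its own array, is shown below. The names ChunkReader and ReadChunks are hypothetical and the upload call is left out:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkReader
{
    // Hypothetical helper: yields the stream content in arrays of at most
    // chunkSize bytes; only the last array can be shorter.
    public static IEnumerable<byte[]> ReadChunks(Stream stream, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize];
        while (true)
        {
            // Fill the buffer as far as possible; Read may return short counts.
            int filled = 0;
            while (filled < chunkSize)
            {
                int n = stream.Read(buffer, filled, chunkSize - filled);
                if (n == 0) break; // end of stream
                filled += n;
            }
            if (filled == 0) yield break;

            // Copy out exactly the bytes that were read, so the shared
            // buffer keeps its full size for the next iteration.
            byte[] chunk = new byte[filled];
            Buffer.BlockCopy(buffer, 0, chunk, 0, filled);
            yield return chunk;

            if (filled < chunkSize) yield break; // that was the last chunk
        }
    }
}
```

Each array returned by ReadChunks(streamx, chunkSize) could then be passed to the upload call in place of buffer.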

c# efficient way of reading arrays of specific type from stream

I'm looking for an efficient way of reading multiple arrays of a specific type from a stream.
So far I'm using a class like this below to read single values like: int, byte, sbyte, uint, short, ushort, ...
but also for arrays like: ushort[], short[], uint[], int[], byte[], sbyte[], ...
public byte[] ReadBytes(int count)
{
byte[] buffer = new byte[count];
int retValue = _Stream.Read(buffer, 0, count);
return buffer;
}
public ushort ReadUshort()
{
byte[] b = ReadBytes(2);
if (BitConverter.IsLittleEndian) // for motorola (big endian)
Array.Reverse(b);
return BitConverter.ToUInt16(b, 0);
}
public ushort[] ReadUshorts(int count)
{
ushort[] data = new ushort[count];
for (int i = 0; i < count; i++)
{
data[i] = ReadUshort();
}
return data;
}
public uint ReadUint()
{
byte[] b = ReadBytes(4);
if (BitConverter.IsLittleEndian) // for motorola (big endian)
Array.Reverse(b);
return BitConverter.ToUInt32(b, 0);
}
public uint[] ReadUints(int count)
{
// ...
}
Is there a more efficient way to read the arrays than the code I've shared here?
I have a feeling that a for loop with a separate read call for each element is not very efficient. The problem is that I need to check IsLittleEndian for each value and reverse the bytes if needed, so I can't simply read many bytes at once. I'm not sure whether this could be rewritten more efficiently.
You could write a generic method, and use Buffer.BlockCopy to copy the data into the target array:
public static T[] ReadElements<T>(Stream input, int count)
{
int bytesPerElement = Marshal.SizeOf(typeof(T));
byte[] buffer = new byte[bytesPerElement * count];
int remaining = buffer.Length;
int offset = 0;
while (remaining > 0)
{
int read = input.Read(buffer, offset, remaining);
if (read == 0) throw new EndOfStreamException();
offset += read;
remaining -= read;
}
if (BitConverter.IsLittleEndian)
{
for (int i = 0; i < buffer.Length; i += bytesPerElement)
{
Array.Reverse(buffer, i, bytesPerElement);
}
}
T[] result = new T[count];
Buffer.BlockCopy(buffer, 0, result, 0, buffer.Length);
return result;
}
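On newer runtimes there is an alternative that avoids the manual reversal. The following is only a sketch, assuming System.Buffers.Binary.BinaryPrimitives is available (.NET Core 2.1 or later, or the System.Memory package). The class name BigEndianReader is made up for illustration, and the method covers ushort only:

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public static class BigEndianReader
{
    // Hypothetical helper: reads `count` big-endian ushort values in one pass.
    public static ushort[] ReadUshorts(Stream input, int count)
    {
        byte[] buffer = new byte[count * sizeof(ushort)];
        int offset = 0;
        while (offset < buffer.Length)
        {
            // Read may return fewer bytes than requested, so loop until full.
            int read = input.Read(buffer, offset, buffer.Length - offset);
            if (read == 0) throw new EndOfStreamException();
            offset += read;
        }

        ushort[] result = new ushort[count];
        for (int i = 0; i < count; i++)
        {
            // Interprets the two bytes as big-endian regardless of the
            // byte order of the machine running the code.
            result[i] = BinaryPrimitives.ReadUInt16BigEndian(buffer.AsSpan(i * 2, 2));
        }
        return result;
    }
}
```

The same pattern should carry over to the other types through the matching BinaryPrimitives methods such as ReadUInt32BigEndian and ReadInt16BigEndian.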

What am I doing wrong when parsing a wav file?

I'm trying to parse a wav file. I'm not sure if there can be multiple data chunks in a wav file, but I originally assumed there was only 1 since the wav file format description I was reading only mentioned there being 1.
But I noticed that the subchunk2size was very small (like 26) when the wav file being parsed was something like 36MB and the sample rate was 44100.
So I tried to parse it assuming there were multiple chunks, but after the 1st chunk, there was no subchunk2id to be found.
To go chunk by chunk, I was using the below code
int chunkSize = System.BitConverter.ToInt32(strm, 40);
int widx = 44; //wav data starts at the 44th byte
//strm is a byte array of the wav file
while(widx < strm.Length)
{
widx += chunkSize;
if(widx < 1000)
{
//log "data" or "100 97 116 97" for the subchunkid
//This is only getting printed the 1st time though. All prints after that are garbage
Debug.Log( strm[widx] + " " + strm[widx+1] + " " + strm[widx+2] + " " + strm[widx+3]);
}
if(widx + 8 < strm.Length)
{
widx += 4;
chunkSize = System.BitConverter.ToInt32(strm, widx);
widx += 4;
}else
{
widx += 8;
}
}
A basic .wav file has 3 chunks:
Each chunk starts with a 4-byte ID followed by a 4-byte size.
The first chunk is the "RIFF" chunk. After the ID it contains 8 bytes: the file size (4 bytes) and the name of the format (4 bytes, usually "WAVE").
The next chunk is the "fmt " chunk (the space in the chunk name is important). It includes the audio format (2 bytes), the number of channels (2 bytes), the sample rate (4 bytes), the byte rate (4 bytes), the block align (2 bytes) and the bits per sample (2 bytes).
The third and last chunk is the data chunk. It holds the actual data, the amplitudes of the samples. Its 4-byte size field gives the number of bytes of data.
You can find further explanations of the properties of a .wav-file here.
From this knowledge I have already created the following class:
public sealed class WaveFile
{
//privates
private int fileSize;
private string format;
private int fmtChunkSize;
private int audioFormat;
private int numChannels;
private int sampleRate;
private int byteRate;
private int blockAlign;
private int bitsPerSample;
private int dataSize;
private int[][] data;//One array per channel
//publics
public int FileSize => fileSize;
public string Format => format;
public int FmtChunkSize => fmtChunkSize;
public int AudioFormat => audioFormat;
public int NumChannels => numChannels;
public int SampleRate => sampleRate;
public int ByteRate => byteRate;
public int BitsPerSample => bitsPerSample;
public int DataSize => dataSize;
public int[][] Data => data;
public WaveFile(string path)
{
FileStream fs = File.OpenRead(path);
LoadChunk(fs); //read RIFF Chunk
LoadChunk(fs); //read fmt Chunk
LoadChunk(fs); //read data Chunk
fs.Close();
}
private void LoadChunk(FileStream fs)
{
ASCIIEncoding Encoder = new ASCIIEncoding();
byte[] bChunkID = new byte[4];
fs.Read(bChunkID, 0, 4);
string sChunkID = Encoder.GetString(bChunkID);
byte[] ChunkSize = new byte[4];
fs.Read(ChunkSize, 0, 4);
if (sChunkID.Equals("RIFF"))
{
fileSize = BitConverter.ToInt32(ChunkSize, 0);
byte[] Format = new byte[4];
fs.Read(Format, 0, 4);
this.format = Encoder.GetString(Format);
}
if (sChunkID.Equals("fmt "))
{
fmtChunkSize = BitConverter.ToInt32(ChunkSize, 0);
byte[] audioFormat = new byte[2];
fs.Read(audioFormat, 0, 2);
this.audioFormat = BitConverter.ToInt16(audioFormat, 0);
byte[] numChannels = new byte[2];
fs.Read(numChannels, 0, 2);
this.numChannels = BitConverter.ToInt16(numChannels, 0);
byte[] sampleRate = new byte[4];
fs.Read(sampleRate, 0, 4);
this.sampleRate = BitConverter.ToInt32(sampleRate, 0);
byte[] byteRate = new byte[4];
fs.Read(byteRate, 0, 4);
this.byteRate = BitConverter.ToInt32(byteRate, 0);
byte[] blockAlign = new byte[2];
fs.Read(blockAlign, 0, 2);
this.blockAlign = BitConverter.ToInt16(blockAlign, 0);
byte[] bitsPerSample = new byte[2];
fs.Read(bitsPerSample, 0, 2);
this.bitsPerSample = BitConverter.ToInt16(bitsPerSample, 0);
}
if (sChunkID.Equals("data"))
{
dataSize = BitConverter.ToInt32(ChunkSize, 0);
data = new int[this.numChannels][];
byte[] temp = new byte[dataSize];
for (int i = 0; i < this.numChannels; i++)
{
data[i] = new int[this.dataSize / (numChannels * bitsPerSample / 8)];
}
for (int i = 0; i < data[0].Length; i++)
{
for (int j = 0; j < numChannels; j++)
{
if (fs.Read(temp, 0, blockAlign / numChannels) > 0)
{
if (blockAlign / numChannels == 2)
{ data[j][i] = BitConverter.ToInt16(temp, 0); } // 2 bytes per sample: signed 16-bit
else
{ data[j][i] = BitConverter.ToInt32(temp, 0); }
}
}
}
}
}
}
Needed using-directives:
using System;
using System.IO;
using System.Text;
This class reads the chunks one field at a time and sets the properties. You just have to create an instance with the path of your wave file, and its properties will hold the values read from that file.
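Real files can contain additional chunks (for example "LIST") between "fmt " and "data", which would explain a very small size at a fixed offset of 40. A minimal sketch of a chunk walker that skips chunks it does not recognise is shown below. The names RiffChunks and FindChunk are made up for illustration, and the sketch assumes the whole file is already in a byte array and that the code runs on a little-endian machine:

```csharp
using System;
using System.Text;

public static class RiffChunks
{
    // Hypothetical helper: returns the offset of the payload of the first
    // chunk with the given ID, or -1 if there is none. `size` receives the
    // payload length in bytes.
    public static int FindChunk(byte[] wav, string id, out int size)
    {
        size = 0;
        int pos = 12; // skip "RIFF", the 4-byte file size and "WAVE"
        while (pos + 8 <= wav.Length)
        {
            string chunkId = Encoding.ASCII.GetString(wav, pos, 4);
            // RIFF sizes are little-endian; BitConverter uses host byte order.
            int chunkSize = BitConverter.ToInt32(wav, pos + 4);
            if (chunkSize < 0) return -1; // corrupt or too large for this sketch

            if (chunkId == id)
            {
                size = chunkSize;
                return pos + 8;
            }

            // Chunks are word-aligned: an odd-sized payload is followed by
            // one pad byte that is not counted in the size field.
            long next = (long)pos + 8 + chunkSize + (chunkSize & 1);
            if (next > wav.Length) return -1;
            pos = (int)next;
        }
        return -1;
    }
}
```

With the strm array from the question, FindChunk(strm, "data", out int dataSize) would give the index of the first sample byte instead of relying on the fixed offset 44.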
In the reference you added, I don't see any mention of the chunk size being repeated for each data chunk...
Try something like this:
int chunkSize = System.BitConverter.ToInt32(strm, 40);
int widx = 44; //wav data starts at the 44th byte
//strm is a byte array of the wav file
while(widx < strm.Length)
{
if(widx < 1000)
{
//log "data" or "100 97 116 97" for the subchunkid
//This is only getting printed the 1st time though. All prints after that are garbage
Debug.Log( strm[widx] + " " + strm[widx+1] + " " + strm[widx+2] + " " + strm[widx+3]);
}
widx += chunkSize;
}

How to do a generic conversion from 32/24-bit bytes to 16-bit bytes

I have been searching for a solution for two days.
I want to convert my 32-bit or 24-bit wave to 16-bit.
This is my code, written after reading a few Stack Overflow topics:
byte[] data = Convert.FromBase64String("-- Wav String encoded --"); // 32 or 24 bits
int conv = Convert.ToInt16(data);
byte[] intBytes = BitConverter.GetBytes(conv);
if (BitConverter.IsLittleEndian)
Array.Reverse(intBytes);
byte[] result = intBytes;
But when I write the result with WriteAllBytes, there is nothing to hear...
Here is a method that cuts the least significant bits:
byte[] data = ...
var skipBytes = 0;
byte[] data16bit;
int samples;
if( /* data was 32 bit */ ) {
skipBytes = 2;
samples = data.Length / 4;
} else if( /* data was 24 bit */ ) {
skipBytes = 1;
samples = data.Length / 3;
}
data16bit = new byte[samples * 2];
int writeIndex = 0;
int readIndex = 0;
for(var i = 0; i < samples; ++i) {
readIndex += skipBytes; //skip the least significant bytes
//read the two most significant bytes
data16bit[writeIndex++] = data[readIndex++];
data16bit[writeIndex++] = data[readIndex++];
}
This assumes a little endian byte order (least significant byte is the first byte, usual for WAV RIFF). If you have big endian, you have to put the readIndex += ... after the two read lines.
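A compact, compilable variant of the same idea might look like the sketch below. It assumes that data holds only raw little-endian integer PCM samples (no RIFF header) and that the caller knows the sample width. The names PcmConverter and To16Bit are made up for illustration:

```csharp
using System;

public static class PcmConverter
{
    // Hypothetical helper. bytesPerSample is 3 for 24-bit input and
    // 4 for 32-bit integer input.
    public static byte[] To16Bit(byte[] data, int bytesPerSample)
    {
        if (bytesPerSample < 2)
            throw new ArgumentOutOfRangeException(nameof(bytesPerSample));

        int skipBytes = bytesPerSample - 2;         // least significant bytes to drop
        int samples = data.Length / bytesPerSample; // incomplete trailing bytes are ignored
        byte[] data16bit = new byte[samples * 2];

        for (int i = 0; i < samples; i++)
        {
            // In little-endian order the two most significant bytes come last.
            int readIndex = i * bytesPerSample + skipBytes;
            data16bit[2 * i] = data[readIndex];
            data16bit[2 * i + 1] = data[readIndex + 1];
        }
        return data16bit;
    }
}
```

For a playable file, the header fields that depend on the sample width (bits per sample, block align, byte rate and the data size) would have to be rewritten as well, which this sketch does not do.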
You could implement your own conversion iterator for this task like so:
IEnumerable<byte> ConvertTo16Bit(byte[] data, int skipBytes)
{
int bytesToRead = 0;
int bytesToSkip = skipBytes;
int readIndex = 0;
while (readIndex < data.Length)
{
if (bytesToSkip > 0)
{
readIndex += bytesToSkip;
bytesToSkip = 0;
bytesToRead = 2;
continue;
}
if (bytesToRead == 0)
{
bytesToSkip = skipBytes;
continue;
}
yield return data[readIndex++];
bytesToRead--;
}
}
This way you don't have to create a new array if there is no need for it. And you could simply convert the data array to a new 16 bit array with the IEnumerable<T> extension methods:
var data16bit = ConvertTo16Bit(data, 1).ToArray();
Or if you don't need the array, you can iterate the data skipping the least significant bytes:
foreach (var b in ConvertTo16Bit(data, 1))
{
Console.WriteLine(b);
}

Fastest way to extract variable width signed integer from byte[]

The title speaks for itself. I have a file containing a base64 encoded byte[] of variable width integer, min 8 bit, max 32bit
I have a large file (48MB) and I am trying to find the fastest way of grabbing integers from the stream.
This is the fastest code from a perf app:
static int[] Base64ToIntArray3(string base64, int size)
{
List<int> res = new List<int>();
byte[] buffer = new byte[4];
using (var ms = new System.IO.MemoryStream(Convert.FromBase64String(base64)))
{
while(ms.Position < ms.Length)
{
ms.Read(buffer, 0, size);
res.Add(BitConverter.ToInt32(buffer, 0));
}
}
return res.ToArray();
}
I can't see a faster way of padding the bytes to 32bit. Any ideas, chaps and chapettes? Solutions should be in c#. I could fall down to C/++ if i must but i don't want to.
There is no reason to use a memory stream to move bytes from one array to another; just read from the array directly. Also, the size of the result is known, so there is no need to add the items to a list that is then converted to an array. You can use an array from the start:
static int[] Base64ToIntArray3(string base64, int size) {
byte[] data = Convert.FromBase64String(base64);
int cnt = data.Length / size;
int[] res = new int[cnt];
for (int i = 0; i < cnt; i++) {
switch (size) {
case 1: res[i] = data[i]; break;
case 2: res[i] = BitConverter.ToInt16(data, i * 2); break;
case 3: res[i] = data[i * 3] + data[i * 3 + 1] * 256 + data[i * 3 + 2] * 65536; break;
case 4: res[i] = BitConverter.ToInt32(data, i * 4); break;
}
}
return res;
}
Note: Untested code! You have to verify that it actually does what it is supposed to do, but at least it shows the principle.
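Since the question asks for signed integers, the 1-byte and 3-byte cases would also need sign extension. A minimal sketch of one way to do that for every width from 1 to 4 bytes follows, assuming the values are stored in little-endian order. The names VarWidthDecoder and ToSignedInts are made up for illustration:

```csharp
using System;

public static class VarWidthDecoder
{
    // Hypothetical helper: decodes little-endian signed integers that are
    // `size` bytes wide (1 to 4) into an int array.
    public static int[] ToSignedInts(byte[] data, int size)
    {
        if (size < 1 || size > 4)
            throw new ArgumentOutOfRangeException(nameof(size));

        int count = data.Length / size;
        int[] res = new int[count];
        int shift = 32 - 8 * size; // unused high bits in a 32-bit int

        for (int i = 0; i < count; i++)
        {
            // Assemble the bytes, least significant byte first.
            int value = 0;
            for (int b = 0; b < size; b++)
                value |= data[i * size + b] << (8 * b);

            // Shift the sign bit up to bit 31, then shift back; the
            // arithmetic right shift copies the sign into the high bits.
            res[i] = (value << shift) >> shift;
        }
        return res;
    }
}
```

It would be called as VarWidthDecoder.ToSignedInts(Convert.FromBase64String(base64), size). The sketch assembles the bytes itself, so it does not depend on the byte order of the machine.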
This is probably how I would do it. Not using a stream should increase performance. This seems like the sort of thing that should be easy to do using Linq but I couldn't figure it out.
static int[] Base64ToIntArray3(string base64, int size)
{
if (size < 1 || size > 4) throw new ArgumentOutOfRangeException("size");
byte[] data = Convert.FromBase64String(base64);
List<int> res = new List<int>();
byte[] buffer = new byte[4];
for (int i = 0; i < data.Length; i += size )
{
Buffer.BlockCopy(data, i, buffer, 0, size);
res.Add(BitConverter.ToInt32(buffer, 0));
}
return res.ToArray();
}
Ok so I believe this is the Linq way to do this:
static int[] Base64ToIntArray3(string base64, int size)
{
byte[] data = Convert.FromBase64String(base64);
return data.Select((Value, Index) => new { Value, Index })
.GroupBy(p => p.Index / size)
.Select(g => BitConverter.ToInt32(g.Select(p => p.Value).Concat(new byte[4 - size]).ToArray(), 0)) // Concat, not Union: Union would drop duplicate byte values
.ToArray();
}
