I am trying to read a binary file (.bin) and convert the contents of the file into a matrix. The code I use to read the file is here.
using (BinaryReader Reader = new BinaryReader(File.Open(string.Format("{0}{1}.bin", DefaultFilePath, "MyBinaryFile"), FileMode.Open)))
{
//the code to convert binary to AxB matrix here.
byteArray = Reader.ReadBytes(100000);
float myFloat = System.BitConverter.ToSingle(byteArray, 0);
}
I need to write a piece of code which can convert the contents of a binary file into an AxB matrix. From the code above, you can see that I convert the binary file into a byte[] and then into a float, but I am stuck here.
In Matlab, you can read .bin file easily and get the AxB array such as in this link.
How can I proceed?
If the file is just a long list of 32-bit floats, you already converted the first value correctly. Now you just need to do the rest by adding a loop and incrementing the second argument for ToSingle by 4 each time.
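A minimal sketch of that loop, assuming the buffer holds nothing but 32-bit floats in the machine's own byte order (which is what BitConverter expects). The class and method names are made up for illustration:

```csharp
using System;

public static class FloatDecoder
{
    // Converts a raw byte buffer into floats, 4 bytes per value.
    // Trailing bytes that do not form a complete float are ignored.
    public static float[] ToFloats(byte[] bytes)
    {
        float[] values = new float[bytes.Length / 4];
        for (int i = 0; i < values.Length; i++)
        {
            // The second argument is a byte offset, so it advances by 4 per value.
            values[i] = BitConverter.ToSingle(bytes, i * 4);
        }
        return values;
    }
}
```

With the question's code this would be called as FloatDecoder.ToFloats(byteArray).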
Or, since you're already using a BinaryReader you could just use its ReadSingle method in a loop. If you want a two-dimensional matrix, using a multidimensional array might be a good idea.
// In reality you might want to figure out the array size based on the file size
float[,] floatArray = new float[5000, 32];
using (BinaryReader reader = new BinaryReader(File.Open(string.Format("{0}{1}.bin", DefaultFilePath, "MyBinaryFile"), FileMode.Open)))
{
for (int x = 0; x < floatArray.GetLength(0); x++)
{
for (int y = 0; y < floatArray.GetLength(1); y++)
floatArray[x, y] = reader.ReadSingle();
}
}
Note: you might need to flip things around depending on whether your data file and desired memory representation are row-major or column-major.
Also remember that multidimensional arrays are contiguous in memory, so if your file is huge, you might run into problems processing it.
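As a rough sketch of both notes, the row count can be derived from the file length once the column count is known, and a flag can select the storage order. The class name, the method name and the columnMajor parameter are assumptions for illustration, not part of any library:

```csharp
using System;
using System.IO;

public static class MatrixFile
{
    // Reads a file of 32-bit floats into a [rows, columns] array.
    // 'columns' must be known in advance; the row count comes from the file size.
    // 'columnMajor' is a hypothetical flag: set it when the file stores one
    // column after another (MATLAB's native matrix layout).
    public static float[,] Read(string path, int columns, bool columnMajor)
    {
        using (BinaryReader reader = new BinaryReader(File.OpenRead(path)))
        {
            long count = reader.BaseStream.Length / sizeof(float);
            int rows = (int)(count / columns);
            float[,] matrix = new float[rows, columns];
            if (columnMajor)
            {
                for (int c = 0; c < columns; c++)
                    for (int r = 0; r < rows; r++)
                        matrix[r, c] = reader.ReadSingle();
            }
            else
            {
                for (int r = 0; r < rows; r++)
                    for (int c = 0; c < columns; c++)
                        matrix[r, c] = reader.ReadSingle();
            }
            return matrix;
        }
    }
}
```

Any values left over after the last complete row are simply not read.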
I plan on reading the marks from a text file and then calculating what the average mark is based upon data written in previous code. I haven't been able to read the marks though or calculate how many marks there are as BinaryReader doesn't let you use .Length.
I have tried using an array to hold each mark, but it doesn't like each mark being an integer.
public static int CalculateAverage()
{
int count = 0;
int total = 0;
float average;
BinaryReader markFile;
markFile = new BinaryReader(new FileStream("studentMarks.txt", FileMode.Open));
//A loop to read each line of the file and add it to the total
{
//total = total + eachMark;
//count++;
}
//average = total / count;
//markFile.Close();
//Console.WriteLine("Average mark:", average);
return 0;
}
This is my studentMark.txt file in VS
First of all, don't use BinaryReader; you can use StreamReader, for example.
Also, with a using statement you don't need to call Close() yourself.
There is already an answer using a while loop; with LINQ (using System.Linq) you can do it in one line:
var avg = File.ReadAllLines("file.txt").Average(a => Int32.Parse(a));
Console.WriteLine("avg = "+avg); //5
Also, according to the docs, File.ReadAllLines() loads the file into memory and then closes it, so nothing is left open afterwards:
Opens a text file, reads all lines of the file into a string array, and then closes the file.
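For completeness, a StreamReader version might look like the sketch below. It assumes one integer mark per line; the class and method names are invented for this example:

```csharp
using System;
using System.IO;

public static class MarkStats
{
    // Averages the integers in a text file, one mark per line.
    // Blank lines are skipped; an empty file yields 0.
    public static double AverageFromTextFile(string path)
    {
        int total = 0;
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length == 0) continue;
                total += int.Parse(line);
                count++;
            }
        }
        // Cast before dividing so the fractional part is not lost.
        return count == 0 ? 0.0 : (double)total / count;
    }
}
```

Unlike ReadAllLines, this reads the file line by line rather than loading it all at once.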
Edit to add the way to read using BinaryReader.
The first thing to know is that you are reading a .txt file. Unless you created the file using BinaryWriter, BinaryReader will not work. And if you are creating a binary file, naming it .txt is not good practice.
So, assuming your file is binary, you need to loop and read every integer, so this code should work.
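var fileName = "file.txt";
int total = 0;
int count = 0;
if (File.Exists(fileName))
{
    using (BinaryReader reader = new BinaryReader(File.Open(fileName, FileMode.Open)))
    {
        // Read 4-byte integers until the end of the stream is reached
        while (reader.BaseStream.Position < reader.BaseStream.Length)
        {
            total += reader.ReadInt32();
            count++;
        }
    }
    // Cast before dividing to keep the fraction; guard against an empty file
    double average = count > 0 ? (double)total / count : 0;
    Console.WriteLine("Average = " + average); // 5
}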
I've used using to ensure the file is closed at the end.
If your file only contains numbers, you only have to use ReadInt32() and it will work.
Also, if your file is not binary, then obviously BinaryReader will not work. By the way, my binary file.txt created using BinaryWriter looks like this:
So I'm assuming you don't have a binary file...
I have an XML file which holds a lot of data.
At the moment I can read all of the data in C# except an mp3 file, which is held as a base64 string in a child element named Data with the comment line: "4 bytes float array converted to base64".
I am very new to C#, and before that just a beginner in PHP/Java, so please be indulgent.
I have attached the base64 string in a text file and the original mp3; maybe it helps.
Can you tell me how I can convert this back? I already tried to get single bytes out of the array into a stream and write them back as an mp3 file, but the result is 4 times bigger, nowhere near the same file, and contains only garbage.
https://www.file-upload.net/download-12719496/base64string.rar.html
edit:
With L.B's help I got this, thank you.
var mp3base64string = Convert.FromBase64String(child.Element("Data").Value);
using(FileStream file = File.Create(mp3datafilename)) {
using(BinaryWriter writer = new BinaryWriter(file)) {
for (int i = 0; i < mp3base64string.Length; i += 4) {
writer.Write((byte)(967.644334f * BitConverter.ToSingle(mp3base64string, i)));
}
}
}
This code works and the output is exactly the same as original mp3, but don't ask how I got that magic number :) (Does author of xml think it is some kind of encryption/obfuscation?)
var buf = Convert.FromBase64String(File.ReadAllText(@"base64string.txt"));
int count = 0;
var buf2 = buf.GroupBy(x => count++ / 4)
.Select(g => (byte)(967.644334f * BitConverter.ToSingle(g.ToArray(), 0)))
.ToArray();
File.WriteAllBytes(@"base64string.mp3", buf2);
PS: A non-linq version will be faster....
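A non-LINQ version could be sketched roughly as follows. The helper name is made up, and the scale factor is passed in rather than hard-coded; it avoids allocating a group and a 4-byte array per output byte:

```csharp
using System;

public static class FloatPayload
{
    // Each group of 4 input bytes is one float; scaling and truncating it
    // yields one output byte. Incomplete trailing groups are ignored.
    public static byte[] Decode(byte[] buf, float scale)
    {
        byte[] result = new byte[buf.Length / 4];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = (byte)(scale * BitConverter.ToSingle(buf, i * 4));
        }
        return result;
    }
}
```

It would be used as File.WriteAllBytes("base64string.mp3", FloatPayload.Decode(buf, 967.644334f)), with buf being the decoded base64 data from above.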
Assuming my WAV file contains 16 bit PCM, How can I read wav file as double array:
using (WaveFileReader reader = new WaveFileReader("myfile.wav"))
{
Assert.AreEqual(16, reader.WaveFormat.BitsPerSample, "Only works with 16 bit audio");
byte[] bytesBuffer = new byte[reader.Length];
int read = reader.Read(bytesBuffer, 0, bytesBuffer.Length);
// HOW TO GET AS double ARRAY
}
Just use the ToSampleProvider extension method on your WaveFileReader, and the Read method will take a float[] with the samples converted to floating point. Alternatively use AudioFileReader instead of WaveFileReader and again you can access a version of the Read method that fills a float[]
16-bit PCM is a signed-integer encoding. Presuming you want doubles between -1 and 1, you simply read each sample as a 16-bit signed integer, and then divide by 32768.0:
var floatSamples = new double[read/2];
for(int sampleIndex = 0; sampleIndex < read/2; sampleIndex++) {
var intSampleValue = BitConverter.ToInt16(bytesBuffer, sampleIndex*2);
floatSamples[sampleIndex] = intSampleValue/32768.0;
}
Note that stereo channels are interleaved (left sample, right sample, left sample, right sample).
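If the file is stereo, the interleaved samples can be split into one array per channel. This is only a sketch, assuming exactly two channels of 16-bit samples; the class and method names are hypothetical:

```csharp
using System;

public static class Pcm16
{
    // Splits interleaved 16-bit stereo PCM into two double arrays scaled
    // to the range [-1, 1). 'byteCount' is the number of valid bytes.
    public static void SplitStereo(byte[] bytes, int byteCount,
                                   out double[] left, out double[] right)
    {
        int frames = byteCount / 4; // 2 channels * 2 bytes per sample
        left = new double[frames];
        right = new double[frames];
        for (int i = 0; i < frames; i++)
        {
            // Each frame holds the left sample followed by the right sample.
            left[i] = BitConverter.ToInt16(bytes, i * 4) / 32768.0;
            right[i] = BitConverter.ToInt16(bytes, i * 4 + 2) / 32768.0;
        }
    }
}
```

For a real file you would check reader.WaveFormat.Channels first and fall back to the mono loop when it is 1.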
There's some good info on the format here: http://blog.bjornroche.com/2013/05/the-abcs-of-pcm-uncompressed-digital.html
I want to deserialize a list of 1 million pairs of (String,Guid) for a performance critical app. The format can be anything I choose, and serialization does not have the same performance requirements.
What sort of approach is best? Text or binary? Write each pair (string,guid) consecutively, or write all strings followed by all guids?
I started playing with LinqPad, (and the simpler example of deserializing strings only) and found that (slightly counter-intuitively), using a TextReader and ReadLine() was a fair bit faster than using a BinaryReader and ReadString(). (Is the filesystem cache playing tricks on me?)
public string[] DeSerializeBinary()
{
var tmr = System.Diagnostics.Stopwatch.StartNew();
long ms = 0;
string[] arr = null;
using (var rdr = new BinaryReader(new FileStream(file, FileMode.Open, FileAccess.Read)))
{
var num = rdr.ReadInt32();
arr = new String[num];
for (int i = 0; i < num; i++)
{
arr[i] = rdr.ReadString();
}
tmr.Stop();
ms = tmr.ElapsedMilliseconds;
Console.WriteLine("DeSerializeBinary took {0}ms", ms);
}
return arr;
}
public string[] DeserializeText()
{
var tmr = System.Diagnostics.Stopwatch.StartNew();
long ms = 0;
string[] arr = null;
using (var rdr = File.OpenText(file))
{
var num = Int32.Parse(rdr.ReadLine());
arr = new String[num];
for (int i = 0; i < num; i++)
{
arr[i] = rdr.ReadLine();
}
tmr.Stop();
ms = tmr.ElapsedMilliseconds;
Console.WriteLine("DeserializeText took {0}ms", ms);
}
return arr;
}
Some Edits:
I used RamMap to clear the file system cache, and it turns out there was very little difference to Text & Binary reader for strings only.
I have a fairly simple class that holds the string and guid. It also holds an int index which corresponds to its position in the list. Obviously there's no need to include this in serialization.
In a test for (binary) deSerializing Strings and Guids alternately, I get around 500ms.
Ideal timing is 50ms, or as close as I can get. However, a simple experiment showed it takes at least 120ms to read the (compressed) file into memory from a reasonably fast SSD drive, without any sort of parsing at all. So 50ms seems unlikely.
Our strings have no theoretical length restrictions. However, we can assume that the performance target only applies if they are all 20 characters or less.
Timings include opening the file.
Reading the Strings is the clear bottleneck now (hence my experiments with serializing strings only). The JIT_NewFast took 30% before I preallocated an array of 16bytes for reading GUIDs.
It's not surprising that reading a bunch of strings is faster with StreamReader than with BinaryReader. StreamReader reads in blocks from the underlying stream, and parses the strings from that buffer. BinaryReader doesn't have a buffer like that. It reads the string length from the underlying stream, and then reads that many characters. So BinaryReader makes more calls to the base stream's Read method.
But there's more to deserializing a (String, Guid) pair than just reading. You also have to parse the Guid. If you write the file in binary then the Guid is written in binary, which makes it much easier and faster to create a Guid structure. If it's a string, then you have to call new Guid(string) to parse the text and create a Guid, after you split the line into its two fields.
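For comparison, the text route could look roughly like this. It assumes one pair per line with a tab between the string and the GUID; the separator, the class name and the method name are assumptions, and the string itself must not contain a newline:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TextPairs
{
    // Reads "text<TAB>guid" lines until the reader is exhausted.
    public static List<KeyValuePair<string, Guid>> Read(TextReader reader)
    {
        var pairs = new List<KeyValuePair<string, Guid>>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Split on the last tab so tabs inside the string are preserved.
            int tab = line.LastIndexOf('\t');
            string text = line.Substring(0, tab);
            Guid id = new Guid(line.Substring(tab + 1));
            pairs.Add(new KeyValuePair<string, Guid>(text, id));
        }
        return pairs;
    }
}
```

Passing File.OpenText(file) as the reader gives the same buffered reading as the DeserializeText test in the question.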
Hard to say which of those will be faster.
I can't imagine that we're talking about a whole lot of time here. Certainly reading a file with a million lines will take around a second. Unless the string is really long. A GUID is only 36 characters if you count the separators, right?
With BinaryWriter, you can write the file like this:
writer.Write(count); // integer number of records
foreach (var pair in pairs)
{
writer.Write(pair.theString);
writer.Write(pair.theGuid.ToByteArray());
}
And to read it, you have:
count = reader.ReadInt32();
byte[] guidBytes = new byte[16];
for (int i = 0; i < count; ++i)
{
string s = reader.ReadString();
reader.Read(guidBytes, 0, guidBytes.Length);
pairs.Add(new Pair(s, new Guid(guidBytes)));
}
Whether that's faster than splitting a string and calling the Guid constructor that takes a string parameter, I don't know.
I suspect that any difference is going to be pretty slight. I'd probably go with the simplest method: a text file.
If you want to get really crazy, you can write a custom format that you can easily slurp up in just a couple of large reads (a header, an index, and two arrays for strings and GUIDs), and do everything else in memory. That would almost certainly be faster. But faster enough to warrant the extra work? Doubtful.
Update
Or maybe not doubtful. Here's some code that writes and reads a custom binary format. The format is:
count (int32)
guids (count * 16 bytes)
strings (one big concatenated string)
index (index of each string's starting character in the big string)
I assume you're using a Dictionary<string, Guid> to hold these things. But your data structure doesn't really matter. The code would be substantially the same.
Note that I tested this very briefly. I won't say that the code is 100% bug free, but I think you can get the idea of what I'm doing.
private void WriteGuidFile(string filename, Dictionary<string, Guid>guids)
{
using (var fs = File.Create(filename))
{
using (var writer = new BinaryWriter(fs, Encoding.UTF8))
{
List<int> stringIndex = new List<int>(guids.Count);
StringBuilder bigString = new StringBuilder();
// write count
writer.Write(guids.Count);
// Write the GUIDs and build the string index
foreach (var pair in guids)
{
writer.Write(pair.Value.ToByteArray(), 0, 16);
stringIndex.Add(bigString.Length);
bigString.Append(pair.Key);
}
// Add one more entry to the string index.
// makes deserializing easier
stringIndex.Add(bigString.Length);
// Write the string that contains all of the strings, combined
writer.Write(bigString.ToString());
// write the index
foreach (var ix in stringIndex)
{
writer.Write(ix);
}
}
}
}
Reading is just slightly more involved:
private Dictionary<string, Guid> ReadGuidFile(string filename)
{
using (var fs = File.OpenRead(filename))
{
using (var reader = new BinaryReader(fs, Encoding.UTF8))
{
// read the count
int count = reader.ReadInt32();
// The guids are in a huge byte array sized 16*count
byte[] guidsBuffer = new byte[16*count];
reader.Read(guidsBuffer, 0, guidsBuffer.Length);
// Strings are all concatenated into one
var bigString = reader.ReadString();
// Index is an array of int. We can read it as an array of
// ((count+1) * 4) bytes.
byte[] indexBuffer = new byte[4*(count+1)];
reader.Read(indexBuffer, 0, indexBuffer.Length);
var guids = new Dictionary<string, Guid>(count);
byte[] guidBytes = new byte[16];
int startix = 0;
int endix = 0;
for (int i = 0; i < count; ++i)
{
endix = BitConverter.ToInt32(indexBuffer, 4*(i+1));
string key = bigString.Substring(startix, endix - startix);
Buffer.BlockCopy(guidsBuffer, (i*16),
guidBytes, 0, 16);
guids.Add(key, new Guid(guidBytes));
startix = endix;
}
return guids;
}
}
}
A couple of notes here. First, I'm using BitConverter to convert the data in the byte arrays to integers. It would be faster to use unsafe code and just index into the arrays using an int32*.
You might gain some speed by using pointers to index into guidsBuffer and calling the Guid constructor Guid(Int32, Int16, Int16, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte) rather than using Buffer.BlockCopy to copy the GUID into the temporary array.
You could make the string index an index of lengths rather than the starting positions. That would eliminate the need for the extra value at the end of the array, but it's unlikely that it'd make any difference in the speed.
There might be other optimization opportunities, but I think you get the general idea here.
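One possible middle ground between BitConverter and unsafe pointers is to copy the whole index buffer into an int[] in a single call. This swaps in Buffer.BlockCopy for the pointer approach and has not been benchmarked here; the class and method names are invented for the sketch:

```csharp
using System;

public static class IndexBuffer
{
    // Reinterprets a byte buffer of int32 values as an int[] in one bulk copy.
    // Assumes the machine byte order matches the file, which is the same
    // assumption BitConverter.ToInt32 makes.
    public static int[] ToInt32Array(byte[] bytes)
    {
        int[] values = new int[bytes.Length / sizeof(int)];
        Buffer.BlockCopy(bytes, 0, values, 0, values.Length * sizeof(int));
        return values;
    }
}
```

In ReadGuidFile the loop could then read endix from the resulting array at position i + 1 instead of calling BitConverter.ToInt32 on each iteration.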
public void EncryptFile()
{
OpenFileDialog dialog = new OpenFileDialog();
dialog.Filter = "JPEG Files (*.jpeg)|*.jpeg|PNG Files (*.png)|*.png|All files (*.*)|*.*";
dialog.InitialDirectory = @"C:\";
dialog.Title = "Please select an image file to encrypt.";
if (dialog.ShowDialog() == DialogResult.OK)
{
byte[] ImageBytes = File.ReadAllBytes(dialog.FileName);
foreach (byte X in ImageBytes)
{
//How can I take byte "X" and add a numerical value to it?
}
}
}
So, I'm trying to encrypt an image file by just converting it to byte[] array and then adding a numerical value to each byte.
How can I add a numerical value to a byte?
You just add it. The problem is that you can't modify the value in your foreach loop there. You actually want a for loop:
for(int k = 0; k < ImageBytes.Length; k++){
ImageBytes[k] = (byte) (ImageBytes[k] + 5); // needs a cast
}
byte is a value type, which means it's always copied when it's returned. Consequently, you can only add a value to the local byte value inside your foreach, pretty much like changing the value of a byte argument inside a function won't change the value outside the function (unless, of course, you used the ref keyword).
You can't use a foreach for this task. Use a regular for loop:
for(int i = 0; i < ImageBytes.Length; i++)
ImageBytes[i] += MyNumericValue;
You need to use modulo (specifically modulo 256) addition, so that the operation is reversible. Alternatively you could use a bitwise operation, XOR is a common choice.
Modulo 256 addition is simple to implement for bytes (byte is already unsigned in C#); you just need to cast the result back to byte, as in:
ImageBytes[k] = unchecked((byte)(ImageBytes[k] + x));
Beware, however, that such "encryption" is rather weak. One way to strengthen it is to add a distinct value for each byte, for example by taking the added values from a circular buffer (i.e. a sequence which eventually repeats itself). Better still, make the operands depend on values that have already been decoded.
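The XOR variant with a repeating key might be sketched like this. The class and method names are hypothetical, and this remains obfuscation rather than real encryption:

```csharp
using System;

public static class ByteObfuscator
{
    // XORs every byte with a key byte taken cyclically from 'key'.
    // Calling it a second time with the same key restores the original data.
    public static void XorInPlace(byte[] data, byte[] key)
    {
        if (key.Length == 0) throw new ArgumentException("key must not be empty");
        for (int i = 0; i < data.Length; i++)
        {
            data[i] ^= key[i % key.Length];
        }
    }
}
```

Because XOR never overflows, no cast or modulo arithmetic is needed, and the same method serves for both directions.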
Question: Why not just use one of the built in crypto streams in .NET?
If you don't want to do that, assuming that you are going to want to use the image in some way after you obscure the bits of it, I would look at writing a custom stream class and just modifying the bytes as they come in.
There is a great end-to-end walkthrough here: Custom Transform Streams (and the rotate stream would be a better, faster way to solve your problem of obscuring the image file). This also gets rid of the overflow issues with adding to a byte.