I want to create a binary file and store string data in it. I used this sample:
FileStream fs = new FileStream("c:\\test.data", FileMode.Create);
BinaryWriter bw = new BinaryWriter(fs);
bw.Write(Encoding.ASCII.GetBytes("david stein"));
bw.Close();
But when I opened the file created by this sample (test.data) in Notepad, it showed the string data ("david stein"). So my question is: what is the difference between this binary writing and text writing when the result is a string?
I want to store data in a binary file so that a user cannot open it in Notepad and read my data; if they do open it in Notepad, they should see real binary data.
With some files, such as JPG files, you cannot read the content when you open them in a text editor, even though they do not use any encryption. How does that work, and how can I write my data like this?
So my question is: what is the difference between this binary writing and text writing when the result is a string?
The data in a file is always "a sequence of bytes". In this case, the sequence of bytes you've written is "the bytes representing the text 'david stein'" in the ASCII encoding. So yes, if you open the file in any editor which tries to interpret the bytes as text in a way which is compatible with ASCII, you'll see the text "david stein". Really it's just a load of bytes though - it all depends on how you interpret them.
If you'd written:
File.WriteAllText("c:\\test.data", "david stein", Encoding.ASCII);
you'd have ended up with the exact same sequence of bytes. There are any number of ways you could have created a file with the same bytes in. There's nothing about File.WriteAllText which "marks" the file as text, and there's nothing about FileStream or BinaryWriter which marks the file as binary.
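If you want to convince yourself of that, here is a minimal sketch that writes the same text through both APIs and compares the resulting files byte for byte. The class and method names are made up for illustration, and it uses temporary files rather than c:\test.data.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class SameBytesDemo
{
    // Writes the text once via BinaryWriter and once via File.WriteAllText,
    // then reports whether the two files contain identical bytes.
    public static bool ProducesIdenticalFiles(string text)
    {
        string viaBinaryWriter = Path.GetTempFileName();
        string viaWriteAllText = Path.GetTempFileName();
        try
        {
            using (var fs = new FileStream(viaBinaryWriter, FileMode.Create))
            using (var bw = new BinaryWriter(fs))
            {
                bw.Write(Encoding.ASCII.GetBytes(text));
            }

            File.WriteAllText(viaWriteAllText, text, Encoding.ASCII);

            byte[] a = File.ReadAllBytes(viaBinaryWriter);
            byte[] b = File.ReadAllBytes(viaWriteAllText);
            return a.SequenceEqual(b);
        }
        finally
        {
            // Clean up the temporary files whatever happens.
            File.Delete(viaBinaryWriter);
            File.Delete(viaWriteAllText);
        }
    }
}
```

Calling SameBytesDemo.ProducesIdenticalFiles("david stein") should return true, because ASCII has no byte order mark and both paths end up writing the same 11 bytes.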
EDIT: From comments:
I want to store data in a binary file so that a user cannot open it in Notepad and read my data
Well, there are lots of ways of doing that with different levels of security. Ideally, you'd want some sort of encryption - but then the code reading the data would need to be able to decrypt it as well, which means it would need to be able to get a key. That then moves the question to "how do I store a key".
If you only need to verify the data in the file (e.g. check that it matches something from elsewhere) then you could use a cryptographic hash instead.
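As a rough sketch of the hashing option, assuming SHA-256 is acceptable for your purposes (the choice of algorithm and the helper names are mine, not something the question requires):

```csharp
using System;
using System.Security.Cryptography;

public static class FileHashDemo
{
    // Returns the SHA-256 hash of the data as a lowercase hex string.
    public static string ComputeSha256Hex(byte[] data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(data);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Checks the data against a hash obtained from elsewhere.
    public static bool Matches(byte[] data, string expectedHex)
    {
        return string.Equals(ComputeSha256Hex(data), expectedHex,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

Note that a hash only lets you check the data; it does not hide it, and you cannot get the original text back from it.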
If you only need to prevent the most casual of snoopers, you could use something which is basically just obfuscation - a very weak form of encryption with no "key" as such. Anyone who decompiled your code would easily be able to get at the data in that case, but you may not care too much about that.
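To make the obfuscation idea concrete, here is a deliberately weak sketch that XORs every byte with a fixed mask. The mask value 0xA5 and the class name are arbitrary choices of mine; this is not encryption and offers no real security.

```csharp
public static class XorObfuscator
{
    // Arbitrary fixed mask (an assumption for illustration only).
    // Anyone who decompiles the code can read it.
    private const byte Mask = 0xA5;

    // XOR is its own inverse, so the same method both obfuscates and restores.
    public static byte[] Transform(byte[] input)
    {
        var output = new byte[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            output[i] = (byte)(input[i] ^ Mask);
        }
        return output;
    }
}
```

You would write XorObfuscator.Transform(Encoding.ASCII.GetBytes("david stein")) to the file and call Transform again on the bytes you read back. With this mask every ASCII byte ends up above 0x7F, so a text editor will no longer show readable ASCII.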
It all depends on your requirements.
All data is binary. A text file is binary data that happens to use only the limited subset of byte values that represent valid characters, but it's still binary.
The way text editors typically differentiate a text file from a binary file is they scan a certain portion of the file for zero values, \0. These practically never appear in ASCII or UTF-8 text files (UTF-16 text is an exception, since it uses a zero byte in most characters) and almost always exist in binary files.
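A minimal sketch of that heuristic, assuming it is enough to look at the first few thousand bytes (the 8000-byte sample size and the names are my own choices, not taken from any particular editor):

```csharp
using System;

public static class BinarySniffer
{
    // Returns true if a zero byte occurs within the first sampleSize bytes.
    // This is only a heuristic: UTF-16 text will be reported as binary.
    public static bool LooksBinary(byte[] data, int sampleSize = 8000)
    {
        int limit = Math.Min(data.Length, sampleSize);
        for (int i = 0; i < limit; i++)
        {
            if (data[i] == 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

For the file from the question, LooksBinary(File.ReadAllBytes(path)) would return false, since the ASCII bytes of "david stein" contain no zeros.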
Related
I am trying to open and read a bunch of geo-referenced timelog files that are in binary format. They supposedly follow the ISO-11783 (ISOBUS) standard for agricultural machinery, but after reading 100s of pages of the standard I cannot figure out how to read the files either with a hex editor or programmatically with .NET c#. I know the timelog comes in file-pairs: an xml file and a binary file. The binary file, for example, is named TLG00004.bin and in notepad it looks like this (partial):
and when I open that file in Visual Studio 2015 (Community) as a binary file the hex looks like this:
which does not help me. I don't even know how to begin reading this as a byte stream in code (or anything else for that matter).
I know the file is supposed to look like this in human readable form:
(TimeStart, PositionNorth, PositionEast, PositionStatus, # DLV, DLV 0, PDV 0, DLV 1, PDV 1, DLV 2, PDV 2,...) it can have up to 255 DLV-PDV pairs which I believe are 32-bit integers. An example was shown as: (2005-05-02T16:32:00,51.00678,6.03489,1,2,0,10,1,15)
Little hints I have seen in the documentation indicate to me this must be utf-8 and perhaps base64 encoding with little endian and no Byte Order Mark. But I tried opening this in the free version of Hexinator and can't (human) read it using any of the dozens of encodings in that app, including utf-8, 16, 32...
I know this is not normal programming stuff but am throwing it out there to see if I'm lucky enough that someone has done this before and sees this. Any hints or resource-pointing would find me grateful, and I would be very thankful if someone can share any code that reads this kind of file.
Your data seems to follow the ISO 11783-10 standard for "Log data binary file structure" data exchange.
You will need to unpack your binary data into data types according to the specification. For example, the first 32 bits of the data are the milliseconds since midnight stored as a 32 bit unsigned integer. The next 16 bits are the days since 1980-01-01 stored as a 16 bit unsigned integer.
Unpacking binary data is programming language specific and some programming languages have useful libraries to assist in sifting through binary data.
As your question is about the general parsing of ISOBUS and I'm not proficient in your given language (C#), I can only give you an initial pointer.
BinaryReader looks to be the ideal way of unpacking a binary file by reading a number of bytes from a stream and advancing the pointer through it:
using (BinaryReader reader = new BinaryReader(File.Open(fileName, FileMode.Open)))
{
    // First 4 bytes: milliseconds since midnight; next 2 bytes: days since 1980-01-01.
    uint milliSecondsSinceMidnight = reader.ReadUInt32();
    ushort daysSince1980 = reader.ReadUInt16();
}
If you need further help, you can now ask a specific question about byte parsing in C#.
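As a small sketch of the next step, the two fields described above can be combined into a DateTime. This only covers the timestamp; the layout of the remaining fields has to come from the specification. It also assumes the values are little-endian (which is what BinaryReader reads) and the class name is hypothetical.

```csharp
using System;
using System.IO;

public static class TimeLogTimestamp
{
    // The day counter in the file is relative to 1980-01-01.
    private static readonly DateTime Epoch = new DateTime(1980, 1, 1);

    // Reads 4 bytes (milliseconds since midnight) followed by 2 bytes
    // (days since 1980-01-01) and combines them into one DateTime.
    // Assumes little-endian byte order, which is what BinaryReader uses.
    public static DateTime Read(BinaryReader reader)
    {
        uint milliSecondsSinceMidnight = reader.ReadUInt32();
        ushort daysSince1980 = reader.ReadUInt16();
        return Epoch.AddDays(daysSince1980).AddMilliseconds(milliSecondsSinceMidnight);
    }
}
```

Calling TimeLogTimestamp.Read(reader) at the start of each record leaves the reader positioned at the byte after the timestamp, ready for the position fields.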
I have binary data stored in a database which I need to convert back into files for backup purposes. Most of them are .doc files with images attached in the document. My method to restore them is to write the binary data to a byte string and write those bytes to a file such as mydoc.doc. The problem is that it works for .txt files, and it actually works for the text part of a .doc file as well. Since most of the .doc files contain an attached JPEG, after conversion I get some readable text and random characters, which I believe are there for the picture attached in the .doc file. Any help is appreciated. Thanks in advance...
Note: The binary data is stored in the image data type in the database. The database contains the file path and name (which no longer exists) and the corresponding binary data stored in the image type, so from the path I can detect the file type that it was before... some of them are .txt (which I was able to convert perfectly), some of them are .doc (which is the problem because of the attachments inside them)
Here is my code:
string s = "D0CF11E0A1B11AE100000000000000000000"; // note: string is for example
var bytes = GetBytesFromByteString(s).ToArray();
File.WriteAllBytes("C:\\temp\\test.doc", bytes);
A .doc file is not a string, nor is it text or ASCII. It is a raw binary file format.
So if your database cell contains a BLOB (Binary Large Object) simply treat it as an array of bytes and write it out to a (binary) file. No conversions, no encodings, nothing.
Edit
Whoever designed this database designed it to store all kinds of files as an image (in the sense of a memory-dump image), i.e. a series of raw bytes in a cell of type image.
You should treat these bytes exactly as mentioned above: A series of raw bytes.
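A minimal sketch of what that looks like in code. HexToBytes is a hypothetical stand-in for the question's GetBytesFromByteString helper (whose implementation was not shown) and is only needed if the data really arrives as a hex string; if you already have a byte[] from the database, pass it to File.WriteAllBytes as it is.

```csharp
using System;
using System.IO;

public static class BlobRestore
{
    // Hypothetical stand-in for GetBytesFromByteString: turns "D0CF11E0"
    // into the bytes D0 CF 11 E0, with no text encoding involved.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
        {
            throw new ArgumentException("Hex string must have an even number of digits.", nameof(hex));
        }
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // Writes the raw bytes unchanged: no string conversion, no encoding.
    public static void Restore(byte[] blob, string path)
    {
        File.WriteAllBytes(path, blob);
    }
}
```

The important part is that the bytes never pass through a string or an Encoding on their way to the file, since that is where embedded images usually get damaged.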
I have a binary file that I have loaded into a byte array. The file size can be 20 MB or more. I then want to parse it, or find a particular value in it. I am doing it in two ways:
1. By converting the full file to a char array.
2. By converting the full file to a hex string (I also have the hex values).
What is the best way to parse the full file? Or should I work on it in binary form? I am using VS 2005.
From the aspect of memory consumption, it would be best if you could parse it directly, on-the-fly.
Converting it to a char array in C# effectively doubles its size in memory (presuming you convert each byte to a char, since C# chars are 16-bit Unicode characters), while a hex string will take at least 4 times the size (two 16-bit chars per byte).
On the other hand, if you need to make many searches and parsing passes over an existing set of data repeatedly, you may benefit from having it stored in any form which suits your needs better.
What's stopping you from searching in the byte[]?
IMHO, if you're simply searching for a byte of a specified value, or several contiguous bytes, this is the easiest and most efficient way to do it.
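For what it's worth, a straightforward search over the byte[] could look like the sketch below. It is a naive scan (the names are mine), which is usually fine for a 20 MB array and short patterns.

```csharp
public static class ByteSearch
{
    // Returns the index of the first occurrence of needle in haystack,
    // or -1 if it does not occur. An empty needle matches at index 0.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        if (needle.Length == 0)
        {
            return 0;
        }
        for (int i = 0; i <= haystack.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && haystack[i + j] == needle[j])
            {
                j++;
            }
            if (j == needle.Length)
            {
                return i;
            }
        }
        return -1;
    }
}
```

No conversion to char[] or to a hex string is needed; if you have the value as hex, convert the (short) pattern to bytes instead of converting the (large) file to text.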
If I understood your question correctly you need to find strings which can contain any characters in a large binary file. Does the binary file contain text? If so do you know the encoding? If so you can use StreamReader class like so:
using (StreamReader sr = new StreamReader(@"C:\test.dat", System.Text.Encoding.UTF8))
{
string s = sr.ReadLine();
}
In any case I think it's much more efficient using some kind of stream access to the file, instead of loading it all to memory.
You could load it into memory in chunks, and then use some pattern matching algorithm (like Knuth-Morris-Pratt or Rabin-Karp).
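Here is one way the chunked approach might look, using Knuth-Morris-Pratt. KMP keeps its match state between bytes, so a match that straddles two chunks is still found without re-reading anything. The class name and the 64 KB default chunk size are my own choices.

```csharp
using System;
using System.IO;

public static class StreamSearch
{
    // KMP failure table: failure[i] is the length of the longest proper
    // prefix of pattern[0..i] that is also a suffix of it.
    private static int[] BuildFailureTable(byte[] pattern)
    {
        var failure = new int[pattern.Length];
        int k = 0;
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = failure[k - 1];
            if (pattern[i] == pattern[k]) k++;
            failure[i] = k;
        }
        return failure;
    }

    // Returns the offset (relative to where reading started) of the first
    // match, or -1. Only one chunk is held in memory at a time.
    public static long IndexOf(Stream stream, byte[] pattern, int chunkSize = 64 * 1024)
    {
        if (pattern.Length == 0) return 0;
        int[] failure = BuildFailureTable(pattern);
        var buffer = new byte[chunkSize];
        long chunkStart = 0; // offset of buffer[0] in the stream
        int matched = 0;     // pattern bytes matched so far, kept across chunks
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                while (matched > 0 && buffer[i] != pattern[matched]) matched = failure[matched - 1];
                if (buffer[i] == pattern[matched]) matched++;
                if (matched == pattern.Length) return chunkStart + i - pattern.Length + 1;
            }
            chunkStart += read;
        }
        return -1;
    }
}
```

You would call it as StreamSearch.IndexOf(File.OpenRead(path), pattern), ideally inside a using block for the stream.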
Today I'm cutting video at work (yea me!), and I came across a strange video format, a MOD file format with a companion MOI file.
I found this article online from the wiki, and I wanted to write a file format handler, but I'm not sure how to begin.
I want to write a file format handler to read the information files, has anyone ever done this and how would I begin?
Edit:
Thanks for all the suggestions, I'm going to attempt this tonight, and I'll let you know. The MOI files are not very large, maybe 5KB in size at most (I don't have them in front of me).
You're in luck in that the MOI format at least spells out the file definition. All you need to do is read in the file and interpret the results based on the file definition.
Following the definition, you should be able to create a class that could read and interpret a file which returns all of the file format definitions as properties in their respective types.
Reading the file requires opening the file and generally reading it on a byte-by-byte progression, such as:
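string path = "path-to-your-file"; // replace with the actual file path
using (FileStream fs = File.OpenRead(path)) {
    while (true) {
        int b = fs.ReadByte(); // returns -1 at the end of the stream
        if (b == -1) {
            break;
        }
        //Interpret byte or bytes here....
    }
}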
Per the wiki article's referenced PDF, it looks like someone already reverse engineered the format. From the PDF, here's the first entry in the format:
Hex-Address: 0x00
Data Type: 2 Byte ASCII
Value (Hex): "V6"
Meaning: Version
So, a simplistic implementation could pull the first 2 bytes of data from the file stream and convert to ASCII, which would provide a property value for the Version.
Next entry in the format definition:
Hex-Address: 0x02
Data Type: 4 Byte Unsigned Integer
Value (Hex):
Meaning: Total size of MOI-file
Interpreting the next 4 bytes and converting to an unsigned int would provide a property value for the MOI file size.
Hope this helps.
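A sketch of how the two entries quoted above might be read into properties. The class name is hypothetical, and the byte order of the 4-byte integer is an assumption on my part (big-endian here); check it against the PDF and against a real file before relying on it.

```csharp
using System;
using System.IO;
using System.Text;

public sealed class MoiHeader
{
    public string Version { get; private set; }  // offset 0x00, 2 bytes ASCII
    public uint FileSize { get; private set; }   // offset 0x02, 4 bytes unsigned

    // Reads the first two entries of the format definition from the stream.
    // The stream is left open for the caller.
    public static MoiHeader Read(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.ASCII, true))
        {
            byte[] version = reader.ReadBytes(2);
            byte[] size = reader.ReadBytes(4);
            if (version.Length < 2 || size.Length < 4)
            {
                throw new EndOfStreamException("File is too short to contain the header.");
            }

            // ASSUMPTION: big-endian byte order. If the sizes look wrong,
            // reverse the order of the four bytes.
            uint fileSize = ((uint)size[0] << 24) | ((uint)size[1] << 16)
                          | ((uint)size[2] << 8) | size[3];

            return new MoiHeader
            {
                Version = Encoding.ASCII.GetString(version),
                FileSize = fileSize
            };
        }
    }
}
```

A quick sanity check is to compare FileSize with the actual length of the file on disk: if they agree, the byte order guess is probably right.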
If the files are very large and just need to be streamed in, I would create a new reader object that uses an UnmanagedMemoryStream to read the information in.
I've done a lot of different file format processing like this. More recently, I've taken to making a lot of my readers more functional where reading tends to use 'yield return' to return read only objects from the file.
However, it all depends on what you want to do. If you are trying to create a general purpose format for use in other applications or create an API, you probably want to conform to an existing standard. If however you just want to get data into your own application, you are free to do it however you want. You could use a BinaryReader on the stream and construct the information you need within your app, or get the reader to return objects representing the contents of the file.
The one thing I would recommend: make sure your reader implements IDisposable and that you wrap it in a using!
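To illustrate the shape I mean, here is a small sketch of a disposable reader that hands records back lazily with yield return. The record layout (a plain sequence of 32-bit little-endian unsigned integers) is purely hypothetical; a real format would return richer objects. It also assumes a seekable stream, since it checks Length and Position.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class RecordReader : IDisposable
{
    private readonly BinaryReader _reader;

    public RecordReader(Stream stream)
    {
        _reader = new BinaryReader(stream);
    }

    // Lazily yields one record at a time; nothing is read until the
    // caller starts enumerating. Hypothetical layout: consecutive UInt32s.
    public IEnumerable<uint> ReadRecords()
    {
        Stream s = _reader.BaseStream;
        while (s.Length - s.Position >= sizeof(uint))
        {
            yield return _reader.ReadUInt32();
        }
    }

    // Disposing the reader also closes the underlying stream.
    public void Dispose()
    {
        _reader.Dispose();
    }
}
```

Typical use would be using (var reader = new RecordReader(File.OpenRead(path))) { foreach (uint value in reader.ReadRecords()) { ... } }, with the enumeration done inside the using block so the stream is still open.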
I have the following problem.
I have a binary file to which I write vital system data.
One of the fields is the time, which I format with DateTime.Now.ToString("HHmmssffffff"), i.e. down to microseconds. I convert this string to a char array (with ToCharArray) and have checked in the debugger that it is fine: it contains the valid time down to the microseconds.
Then I write it and flush it to the file. When I open the file with PSPad, which translates binary to ASCII, I see that the data is corrupted in this field and in another one, but the rest of the message is fine.
The code:
void Write(string strData) {
    char[] cD = strData.ToCharArray();
    bw.Write(cD); // bw is of type BinaryWriter
    bw.Flush();
}
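You're writing out characters, not ASCII bytes: BinaryWriter.Write(char[]) encodes them with the writer's own encoding (UTF-8 unless you passed a different one to the constructor), which is not guaranteed to be ASCII. If you want ASCII bytes, you should change this to use the Encoding class.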
byte[] data = Encoding.ASCII.GetBytes(strData);
bw.Write(data);
I strongly recommend reading Joel Spolsky's article on character sets and encoding. It may help you understand why your current code is not working properly.
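To see how much the choice of encoding matters for the time field, here is a small sketch that formats a time as in the question and encodes it two ways. The class and method names are made up for illustration; the UTF-16 variant is only there for comparison.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class TimestampBytes
{
    // Formats the time as HHmmssffffff (12 digits) and encodes it as ASCII:
    // one byte per digit.
    public static byte[] ToAscii(DateTime time)
    {
        string text = time.ToString("HHmmssffffff", CultureInfo.InvariantCulture);
        return Encoding.ASCII.GetBytes(text);
    }

    // The same text encoded as UTF-16 (called "Unicode" in .NET):
    // two bytes per digit, every second byte being zero.
    public static byte[] ToUtf16(DateTime time)
    {
        string text = time.ToString("HHmmssffffff", CultureInfo.InvariantCulture);
        return Encoding.Unicode.GetBytes(text);
    }
}
```

The ASCII version always produces 12 bytes and the UTF-16 version 24, so anything reading the file with fixed field widths has to agree with the writer on which encoding was used.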