I am working on a system that needs to read a binary file containing certain Persian names/stock instruments. I need to convert the binary data into a string to be used in further processing. I have googled it and haven't really found a solution to my problem. Has anyone here worked in such a scenario, or does anyone know how to tackle such a problem?
Here is the code that I am using to convert the bytes to a string (simple as it may be):
byte[] data = binaryReader.ReadBytes(amountOfData);
string symbolRead = Encoding.ASCII.GetString(data);
FYI, I have tried to change my system locale to Persian and that hasn't helped either. Although it does allow me to view already written text in Persian.
Hoping to find a solution.
Thanks.
Don't use ASCII to decode: it only covers 7-bit characters, so Persian text cannot survive it. First try Encoding.Default after setting your locale; if that does not work, ask whoever produces the file which encoding it uses for Persian text, and use that one.
Determine which encoding is used in your file and use the corresponding Encoding instead of Encoding.ASCII.GetString(...). Possible candidates are Encoding.UTF8.GetString(...) or Encoding.Default.GetString(...), which uses your system encoding. See the documentation of the Encoding class for other possibilities.
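To illustrate the difference, here is a minimal sketch that assumes, purely for illustration, that the file holds UTF-8; the real encoding has to be confirmed against the file itself. The class and method names (DecodeDemo, TryDecodeUtf8, DecodeAscii) are hypothetical. The strict decoder throws on invalid input, so a wrong guess is detected instead of being silently papered over.

```csharp
using System;
using System.Text;

public static class DecodeDemo
{
    // Strict UTF-8 decoder: throwOnInvalidBytes = true makes a wrong guess
    // about the encoding fail loudly instead of substituting characters.
    private static readonly Encoding StrictUtf8 = new UTF8Encoding(false, true);

    // Returns true and the decoded text if the bytes are valid UTF-8.
    public static bool TryDecodeUtf8(byte[] data, out string text)
    {
        try
        {
            text = StrictUtf8.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = null;
            return false;
        }
    }

    // What the question's code does: every byte above 0x7F is replaced,
    // so non-Latin text cannot be recovered from the result.
    public static string DecodeAscii(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }
}
```

If TryDecodeUtf8 returns false for real data, the file is probably in a legacy code page, and the next candidate should come from whoever produces the file rather than from guessing.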
Related
I'm working on a C# project. I created a method that receives a file's contents as a byte[] parameter.
Method
public static FileEncryptionModel Encrypt(byte[] Filebytes)
{
//Some code here
}
The question is: how can I determine the file extension just from its bytes?
Thank you.
There is no foolproof way to achieve this. Take a look at the list of file signatures here: https://en.m.wikipedia.org/wiki/List_of_file_signatures
As mentioned here: https://en.m.wikipedia.org/wiki/File_format#Magic_number
Originally, this term was used for a specific set of 2-byte identifiers at the beginnings of files, but since any binary sequence can be regarded as a number, any feature of a file format which uniquely distinguishes it can be used for identification.
Your best bet is to store the file type along with the byte array.
You can try the magic number approach, but how would you tell an HTML file from any other text file?
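For the magic number approach, a rough sketch could look like the following. It only checks four well-known signatures from the Wikipedia list, and the names FileSignature and GuessExtension are hypothetical. As noted above, this is a guess rather than a guarantee, and plain-text formats have no signature at all.

```csharp
using System;

public static class FileSignature
{
    // Returns a likely extension based on the leading bytes, or null if unknown.
    public static string GuessExtension(byte[] bytes)
    {
        if (StartsWith(bytes, 0x25, 0x50, 0x44, 0x46)) return ".pdf";   // "%PDF"
        if (StartsWith(bytes, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) return ".png";
        if (StartsWith(bytes, 0xFF, 0xD8, 0xFF)) return ".jpg";
        // "PK\x03\x04" is shared by every ZIP-based format (.docx, .xlsx, .jar, ...),
        // so ".zip" here only identifies the container.
        if (StartsWith(bytes, 0x50, 0x4B, 0x03, 0x04)) return ".zip";
        return null;
    }

    // True if 'bytes' begins with the given signature.
    private static bool StartsWith(byte[] bytes, params byte[] signature)
    {
        if (bytes == null || bytes.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (bytes[i] != signature[i]) return false;
        }
        return true;
    }
}
```

A null result means "unknown", which is the case where storing the file type alongside the byte array is the only reliable option.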
Okay, I have this big .NET project which uses multiple databases with a lot of already-written requests. The databases all use WE8DEC as the character set. Until now, all the data was Latin and there was no problem.
But I now have the task of using a new database, again in WE8DEC, but this database stores Russian data, written in Cyrillic. A tool like DBeaver shows data like ÇÎËÎÒÀ�Å instead of the actual Cyrillic text.
I know I can retrieve the byte data directly from the database using the dump function to retrieve the bytes and then convert them.
WORD | DUMP(WORD)
ÇÎËÎÒÀ�Å | Typ=1 Len=9: 199,206,203,206,210,192,208,197,194
But I don't feel like duplicating or altering all my requests and the way I retrieve the results in C#. I have a place just before sending the data as JSON where I could re-encode all the strings before sending them.
So I was looking for a way to retrieve the bytes just like in Oracle, and found one using this line of code:
byte[] bytes = Encoding.GetEncoding("Windows-1252").GetBytes(word);
But my main problem is this: I can't find an exact equivalent of Oracle's WE8DEC encoding in .NET. Windows-1252 is the closest I found (but it is still incorrect).
So the question is: is there an exact equivalent of WE8DEC, also called MCS, in C#?
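This does not settle whether .NET has a WE8DEC equivalent, but it may help to check what the raw bytes from DUMP actually contain. The sketch below assumes the stored bytes are Windows-1251, which is an assumption not stated in the question. It hand-decodes only the basic Russian letters, relying on the fact that Windows-1251 maps 0xC0..0xFF to U+0410..U+044F. The class name Cp1251Sketch is hypothetical; a full implementation would use Encoding.GetEncoding(1251), which on .NET Core and later needs the code pages provider registered first.

```csharp
using System;
using System.Text;

public static class Cp1251Sketch
{
    // Partial Windows-1251 decoder: ASCII plus the letters А..я only.
    public static string Decode(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
        {
            if (b < 0x80)
            {
                sb.Append((char)b);             // ASCII range is unchanged
            }
            else if (b >= 0xC0)
            {
                sb.Append((char)(b + 0x350));   // 0xC0..0xFF -> U+0410..U+044F
            }
            else
            {
                sb.Append('\uFFFD');            // 0x80..0xBF not handled in this sketch
            }
        }
        return sb.ToString();
    }
}
```

Under that assumption, the dumped bytes 199,206,203,206,210,192,208,197,194 decode to ЗОЛОТАРЕВ. This suggests working from the raw bytes rather than from a string that has already been passed through a Latin character set.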
Is there a way to tell the BinaryReader to interpret as big-endian? Like just saying "interpret everything big endian" so I don't have to write extra code to manually read in bytes, reverse them, and then convert it to int or float or whatever I need.
UPDATE
Looked around; it seems like you can't.
Which is kind of strange; I figured it's something you'd naturally do when writing a class that will read binary data from arbitrary files.
Try creating the BinaryReader with the BinaryReader(stream, encoding) constructor, passing Encoding.BigEndianUnicode.
Since it was pointed out that the encoding affects text only, you will have to write your own code to convert the values manually, or you can use Scott Chamberlain's example at the end of this MSDN Forum posting.
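As a rough idea of what "your own code" might look like, here is a minimal sketch of two helper methods that read big-endian values through an ordinary BinaryReader. The class and method names are hypothetical, and this is not the code from the linked forum post.

```csharp
using System;
using System.IO;

public static class BigEndian
{
    // Reads a 32-bit signed integer stored most-significant byte first.
    public static int ReadInt32(BinaryReader reader)
    {
        byte[] b = ReadExactly(reader, 4);
        // Shifting is independent of the machine's own byte order.
        return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3];
    }

    // Reads a 32-bit IEEE 754 float stored most-significant byte first.
    public static float ReadSingle(BinaryReader reader)
    {
        byte[] b = ReadExactly(reader, 4);
        // BitConverter uses the machine's byte order, so reverse only when needed.
        if (BitConverter.IsLittleEndian) Array.Reverse(b);
        return BitConverter.ToSingle(b, 0);
    }

    // ReadBytes may return fewer bytes at end of stream; treat that as an error.
    private static byte[] ReadExactly(BinaryReader reader, int count)
    {
        byte[] b = reader.ReadBytes(count);
        if (b.Length != count) throw new EndOfStreamException();
        return b;
    }
}
```

Wrapping these in a subclass of BinaryReader is another option; static helpers just keep the sketch short.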
I have a byte[] with some data in it, and I would like to write this byte array AS-IS to the log file using log4net. The problems that I am facing are:
There is no overload for byte[] in TextWriter, so even implementing an IObjectRenderer is of no use.
I don't have access to the underlying Stream object of log4net.
I also tried converting the byte[] into a char[], but when I write it, an extra byte is added.
Is this even possible with log4net?
Thanks in advance.
Log files are usually plain text files. It's probably best to log your byte array represented as string.
Have a look at BitConverter.ToString or Convert.ToBase64String.
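A small sketch of both suggestions follows; the wrapper class ByteLogFormat is hypothetical, while the BitConverter and Convert calls are the standard ones. Hex is easier to read by eye; Base64 is more compact and can be turned back into the exact bytes.

```csharp
using System;

public static class ByteLogFormat
{
    // Hex with dashes, e.g. bytes DE AD become "DE-AD".
    public static string ToHex(byte[] data)
    {
        return BitConverter.ToString(data);
    }

    // Compact, reversible text form of the bytes.
    public static string ToBase64(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recovers the original bytes from a logged Base64 string.
    public static byte[] FromBase64(string text)
    {
        return Convert.FromBase64String(text);
    }
}
```

Assuming a log4net ILog instance named log (not shown in the question), the result could then be passed along as an ordinary string, for example log.Debug(ByteLogFormat.ToHex(payload)).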
Nope. Have you thought about writing it out as a hex string (see this post)?
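I also think that logging any larger data is kind of useless; however, I guess this is what you are looking for. It converts your bytes to a string (note that ASCII decoding replaces every byte above 0x7F with '?', so it is only lossless for 7-bit data).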
System.Text.Encoding.ASCII.GetString(byteArray)
I believe you can figure out how to use that for logging.
Pz, the TaskConnect developer
If you are logging into a DB, then use a Binary type with maximum size.
Today I'm cutting video at work (yea me!), and I came across a strange video format: a MOD file with a companion MOI file.
I found this article online from the wiki, and I wanted to write a file format handler, but I'm not sure how to begin.
I want to write a file format handler to read the information files. Has anyone ever done this, and how would I begin?
Edit:
Thanks for all the suggestions, I'm going to attempt this tonight, and I'll let you know. The MOI files are not very large, maybe 5KB in size at most (I don't have them in front of me).
You're in luck in that the MOI format at least spells out the file definition. All you need to do is read in the file and interpret the results based on the file definition.
Following the definition, you should be able to create a class that could read and interpret a file which returns all of the file format definitions as properties in their respective types.
Reading the file requires opening the file and generally reading it on a byte-by-byte progression, such as:
string path = "example.moi"; // hypothetical path; substitute your own file
using (FileStream fs = File.OpenRead(path)) {
    while (true) {
        int b = fs.ReadByte(); // returns -1 at end of stream
        if (b == -1) {
            break;
        }
        // Interpret byte or bytes here....
    }
}
Per the wiki article's referenced PDF, it looks like someone already reverse engineered the format. From the PDF, here's the first entry in the format:
Hex-Address: 0x00
Data Type: 2 Byte ASCII
Value (Hex): "V6"
Meaning: Version
So, a simplistic implementation could pull the first 2 bytes of data from the file stream and convert to ASCII, which would provide a property value for the Version.
Next entry in the format definition:
Hex-Address: 0x02
Data Type: 4 Byte Unsigned Integer
Value (Hex):
Meaning: Total size of MOI-file
Interpreting the next 4 bytes and converting to an unsigned int would provide a property value for the MOI file size.
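The two entries above could be read along these lines. This is only a sketch: the class name MoiHeader is hypothetical, and the excerpt does not state the byte order of the size field, so big-endian is assumed here and should be checked against the PDF.

```csharp
using System;
using System.IO;
using System.Text;

public sealed class MoiHeader
{
    public string Version { get; private set; }   // 2 ASCII bytes at offset 0x00
    public uint FileSize { get; private set; }    // 4-byte unsigned int at offset 0x02

    public static MoiHeader Read(Stream stream)
    {
        byte[] buf = new byte[6];
        int read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until full.
        while (read < buf.Length)
        {
            int n = stream.Read(buf, read, buf.Length - read);
            if (n == 0) throw new EndOfStreamException("File is shorter than the header.");
            read += n;
        }

        return new MoiHeader
        {
            Version = Encoding.ASCII.GetString(buf, 0, 2),
            // ASSUMPTION: big-endian byte order; swap the shifts if the format is little-endian.
            FileSize = ((uint)buf[2] << 24) | ((uint)buf[3] << 16) | ((uint)buf[4] << 8) | buf[5]
        };
    }
}
```

Further fields from the format definition would be added the same way, each as a typed property filled from its documented offset.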
Hope this helps.
If the files are very large and just need to be streamed in, I would create a new reader object that uses an UnmanagedMemoryStream to read the information in.
I've done a lot of different file format processing like this. More recently, I've taken to making a lot of my readers more functional where reading tends to use 'yield return' to return read only objects from the file.
However, it all depends on what you want to do. If you are trying to create a general-purpose format for use in other applications or create an API, you probably want to conform to an existing standard. If, however, you just want to get data into your own application, you are free to do it however you want. You could use a BinaryReader on the stream and construct the information you need within your app, or get the reader to return objects representing the contents of the file.
The one thing I would recommend: make sure it implements IDisposable and that you wrap it in a using block!
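A minimal sketch of that shape is below: a reader that owns a BinaryReader, implements IDisposable, and uses yield return to hand back records lazily. The class name ByteRecordReader and the fixed-size record layout are assumptions for illustration; a real format would yield typed objects instead of raw byte arrays.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class ByteRecordReader : IDisposable
{
    private readonly BinaryReader _reader;

    public ByteRecordReader(Stream stream)
    {
        _reader = new BinaryReader(stream);
    }

    // Lazily yields fixed-size records until the stream runs out.
    public IEnumerable<byte[]> ReadRecords(int recordSize)
    {
        if (recordSize <= 0) throw new ArgumentOutOfRangeException(nameof(recordSize));
        while (true)
        {
            byte[] record = _reader.ReadBytes(recordSize);
            // A short read means end of data; a trailing partial record is dropped.
            if (record.Length < recordSize) yield break;
            yield return record;
        }
    }

    // Disposing the BinaryReader also closes the underlying stream.
    public void Dispose()
    {
        _reader.Dispose();
    }
}
```

Usage would be along the lines of using (var reader = new ByteRecordReader(File.OpenRead(path))) { foreach (var record in reader.ReadRecords(16)) { ... } }, where path and the record size of 16 are placeholders.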