C# Working with files/bytes

I have some questions about editing files with C#.
I have managed to read a file into a byte[]. How can I get the ASCII code of each byte and show it in the text area of my form?
Also, how can I change the bytes and then write them back into a file?
For example:
I have a file and I know the first three bytes are letters. How can I change, say, the second letter to "A", then save the file?
Thanks!

If the file is ASCII, then each byte IS the ASCII code. Printing the value of a byte to, say, a label is as simple as this.
If you have read your file into byte[] file:
label1.Text = file[1].ToString(); // shows the decimal code, e.g. "66" for 'B'
To change the second letter to A:
file[1] = (byte)'A';
The explicit cast is required because C# has no implicit conversion from char to byte. Going through int first, as in (byte)(int)'A', also compiles but adds nothing.
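To cover the "then save the file" part of the question, here is a minimal sketch of the byte-level round trip. It assumes the file is small enough to load into memory in one go; the class name ByteEdit, the method name SetByte, and its parameters are made up for illustration.

```csharp
using System.IO;

static class ByteEdit
{
    // Overwrites the byte at the given zero-based index and saves the file.
    public static void SetByte(string path, int index, byte value)
    {
        byte[] file = File.ReadAllBytes(path);  // whole file in memory
        file[index] = value;                    // throws if index is out of range
        File.WriteAllBytes(path, file);         // replaces the file contents
    }
}
```

Calling ByteEdit.SetByte("c:\\test.txt", 1, (byte)'A') would change the second letter to "A" and leave every other byte untouched.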
But seriously, if it is a text file, you are better off reading it in as text, not as a byte[], and you would probably want to manipulate it using a StringBuilder.
First, to read it in as a string:
// Read the file as one string; the using block closes the file even if reading throws.
string myString;
using (System.IO.StreamReader myFile = new System.IO.StreamReader("c:\\test.txt"))
{
    myString = myFile.ReadToEnd();
}
This works for Unicode files as well: StreamReader assumes UTF-8 by default and switches encoding if it finds a byte order mark.
Then you can get the Unicode value of a character (identical to the ASCII code for the 128 ASCII characters) like so: int value = (int)myString[5];
You can then write back to a file like so:
System.IO.File.WriteAllText("c:\\test.txt", myString);
If you are going to do heavy modifications on the text, you should use a StringBuilder, otherwise, normal string operations would be fine.
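As a rough sketch of the text-based route applied to the original example, the snippet below replaces one character through a StringBuilder and writes the text back. The names TextEdit and SetChar are hypothetical. Note one assumption: File.WriteAllText saves as UTF-8, so a file that was originally in another encoding would be re-encoded.

```csharp
using System.IO;
using System.Text;

static class TextEdit
{
    // Replaces the character at the given zero-based index and saves the file.
    public static void SetChar(string path, int index, char value)
    {
        var sb = new StringBuilder(File.ReadAllText(path));
        sb[index] = value;  // StringBuilder's indexer is writable; string's is read-only
        File.WriteAllText(path, sb.ToString());
    }
}
```

For a single replacement like this, the StringBuilder is mostly a convenience; it pays off when many edits are made to the same text.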

I can only assume that you want to practice reading and writing files byte by byte. If you want to see the bytes as text, look into the BitConverter class; there is a lot of help out there for it. Once you have read the file into a byte[], it would look something like this.
string s = BitConverter.ToString(byteArray); // hex pairs separated by dashes, e.g. "41-42-43"
You can then make your adjustments to the string. BitConverter has no method that parses this string back (its GetBytes overloads take primitive values such as int or double, not strings), so convert each hex pair yourself:
byte[] newByteArray = Array.ConvertAll(s.Split('-'), h => Convert.ToByte(h, 16));
Then you can write your bytes back to your file.

Related

Open file, read as hex and convert it to ASCII?

Is it possible to read a file's hex values into C# and output the corresponding ASCII? I can open the file in a hex editor and see the ASCII next to the hex, but rather than manually copying out the parts I need, I imagine there is a way to have the machine do it for me in a C# program.
I did find "Converting HEX data in a file to ascii", but that didn't really help.
It sounds like you just need:
string text = File.ReadAllText("file.txt");
There's no such thing as "hex values" in a file - they're just bytes which are shown as hex in various editors geared towards editing non-text files.
The above line of code will load a text file, decoding it as UTF-8 - which is compatible with ASCII, so if your file is truly ASCII, it should be fine. If you need to specify a different encoding, you can do it with an overload, e.g.
// Load an ISO-8859-1 file
string text = File.ReadAllText("file.txt", Encoding.GetEncoding(28591));
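If the file is not pure text and the goal is to reproduce what the hex editor shows, one possible sketch is to format the raw bytes yourself. The class HexView and method Row below are invented for this example, and treating 0x20 through 0x7E as "printable" is an assumption that mirrors what most hex editors display.

```csharp
using System.Text;

static class HexView
{
    // Formats bytes like one hex editor row: hex pairs, then the ASCII column.
    public static string Row(byte[] bytes)
    {
        var hex = new StringBuilder();
        var ascii = new StringBuilder();
        foreach (byte b in bytes)
        {
            hex.Append(b.ToString("X2")).Append(' ');
            // Printable ASCII is shown as-is; everything else becomes a dot.
            ascii.Append(b >= 0x20 && b < 0x7F ? (char)b : '.');
        }
        return hex.ToString() + "| " + ascii.ToString();
    }
}
```

For the bytes 0x48, 0x69, 0x00 this returns "48 69 00 | Hi.", which is the same information a hex editor shows side by side.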

How to read and write byte array line by line in text file

I want to record binary message packets in a text file, one message per line. After that I will read the file line by line and parse each line into a meaningful message.
I looked into the BinaryWriter class and found a Write method which writes a byte array, but could not find a WriteLine method.
Please suggest a good approach to recording byte arrays in a text file.
When you write binary to a file, you aren't writing this:
1011100111011
0110101010101
1000110100101
Because that's not actually binary. It is a textual (human-readable) representation of binary. If you open a real binary file as text, the editor decodes the raw bytes as ASCII/Unicode characters, which is very hard to read; for proof, just open a PNG file in Notepad++.
Thus, line endings make no sense in a binary file. Hence, there is no WriteLine method on BinaryWriter.
If you want to write out the binary above, you need to format it as a string, like so:
textWriter.WriteLine(Convert.ToString(value, 2));
Now, you probably can just use BinaryWriter (that is how you write byte[] after all) but just don't expect it to be human readable! You would then use BinaryReader to deserialize your written file.
If you really want to save binary data to a text file, but also have line breaks, you might want to use Convert.ToBase64String. This will ensure you don't have any line feed characters inside of your binary data, that would inadvertently break the line.
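The Base64 approach could look roughly like the sketch below: one packet per line on write, one byte[] per line on read. PacketLog, Append, and ReadAll are hypothetical names, and the sketch assumes the log file is only ever written through Append.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class PacketLog
{
    // Appends one packet as a single Base64 line.
    public static void Append(string path, byte[] packet)
    {
        // This ToBase64String overload never inserts line breaks itself.
        File.AppendAllText(path, Convert.ToBase64String(packet) + Environment.NewLine);
    }

    // Reads every line back into the original byte array.
    public static List<byte[]> ReadAll(string path)
    {
        return File.ReadLines(path)
                   .Select(line => Convert.FromBase64String(line))
                   .ToList();
    }
}
```

Because Base64 output contains only letters, digits, '+', '/', and '=', a packet that happens to contain the bytes 0x0D or 0x0A cannot split a line.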

Converting a string representation of a file (byte array) back to a file in C#

As in the title, I'm trying to convert a string representation of a byte array back to the original file the bytes were taken from.
What I've done:
I've a web service that gets a whole file and sends it:
answer.FileByte = File.ReadAllBytes(@"C:\QRY.txt");
After serialization, the transmitted result XML contains this line:
<a:FileByte>TVNIfGF8MjAxMzAxMDF8YQ1QSUR8YXxhfGF8YXxhfGF8YXwyMDEzMDEwMXxhfGF8YXxhfGF8YXxhfGF8YXxhfGF8YXxhDVBWMXxhfGF8YXxhfGF8YXxhfGF8YXxhfDIwMTMwMTAxfDIwMTMwMTAxfDB8MHxhDQo=</a:FileByte>
I've tried to convert it back with this line in another simple application:
//filepath is the path of the file created
//bytearray is the string from the xml (copypasted)
File.WriteAllBytes(filepath, Encoding.UTF8.GetBytes(bytearray));
I've used UTF8 as the encoding since the XML declares this charset. Keeping the data type is not an option since I'm writing a simple utility to check the file conversion.
Maybe I'm missing something very basic, but I'm not able to come up with a working solution.
This certainly isn't UTF8, the serializer probably converted it to Base64.
Use Convert.FromBase64String() to get your bytes back
Assuming that bytearray is the "TVNIfGF8M..." string, try:
string bytearray = ...;
File.WriteAllBytes(filepath, Convert.FromBase64String(bytearray));
UTF8 is a way to convert arbitrary text into bytes.
It was used by ReadAllText() to turn the bytes on disk back into XML.
You're seeing a mechanism to convert arbitrary bytes into text that can fit into XML (that text is then converted to different bytes using UTF8).
It's probably Base64; use Convert.FromBase64String().
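A quick way to check the Base64 guess is to decode just the start of the string from the XML and look at the result. The helper below is only an illustration (Base64Demo and DecodeToAscii are made-up names), and it assumes the original file is plain ASCII text.

```csharp
using System;
using System.Text;

static class Base64Demo
{
    // Decodes Base64 text to bytes, then shows those bytes as ASCII.
    public static string DecodeToAscii(string base64)
    {
        byte[] raw = Convert.FromBase64String(base64);
        return Encoding.ASCII.GetString(raw);
    }
}
```

Base64Demo.DecodeToAscii("TVNIfGF8") returns "MSH|a|", which looks like the start of a real text file. Encoding.UTF8.GetBytes("TVNIfGF8"), by contrast, just yields the eight bytes of the letters T, V, N, I and so on, which is why the file written that way contained the Base64 text itself.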

Reading special characters from Byte[]

I'm writing to and reading from Mifare RFID cards.
To WRITE to the card, I'm using a byte[] like this:
byte[] buffer = Encoding.ASCII.GetBytes(txt_IDCard.Text);
Then, when I READ from the card, special characters come back wrong. Where it is supposed to show é, ã, õ, á, à... I get ? instead:
string result = System.Text.Encoding.UTF8.GetString(buffer);
string result2 = System.Text.Encoding.ASCII.GetString(buffer, 0, buffer.Length);
string result3 = Encoding.UTF7.GetString(buffer);
E.g., instead of Àgua, amanhã, você I read ?gua, amanh?, voc?.
How can I solve this?
ASCII by its very definition only supports 128 characters.
What you need is ANSI characters if you are reading legacy text.
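You can use Encoding.Default instead of Encoding.ASCII to interpret characters in the current locale's default ANSI code page. This applies to .NET Framework; on .NET Core and later, Encoding.Default is always UTF-8.
Ideally, you would know exactly which code page you are expecting the ANSI characters to use and specify it explicitly using the Encoding.GetEncoding(int codePage) overload (on .NET Core and later, code pages such as 1252 are only available after registering CodePagesEncodingProvider), for example: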
string result = System.Text.Encoding.GetEncoding(1252).GetString(buffer);
Here's a very good reference page on Unicode: http://www.joelonsoftware.com/articles/Unicode.html
And another here: http://msdn.microsoft.com/en-us/library/b05tb6tz%28v=vs.90%29.aspx
But maybe you can just use UTF8 when reading and writing
I don't know the details of the card reader. Is the data you read and write to the card just a load of bytes?
If so, you can just use UTF8 for both reading and writing and it will all just work. It's only necessary to use ANSI if you are working with a legacy device which is expecting (or providing) ANSI text. If the device just stores bytes blindly without implying any particular format, you can do what you like - in this case, just always use UTF8.
It seems like you're using characters that aren't mapped in 7-bit ASCII, but in its "extensions" ISO-8859-1 or ISO-8859-15. You'll need to choose a specific encoding for mapping to your byte array and things should work fine:
byte[] buffer = Encoding.GetEncoding("ISO-8859-1").GetBytes(txt_IDCard.Text);
You have two problems there:
ASCII supports only a limited number of characters.
You're currently using two different Encodings for reading and writing.
You should write with the same Encoding as you read.
Writing
byte[] buffer = Encoding.UTF8.GetBytes(txt_IDCard.Text);
Reading
string result = Encoding.UTF8.GetString(buffer);
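To see where the question marks come from, it can help to run the text through each encoding without the card in between. The sketch below does only that; EncodingCheck and RoundTrip are hypothetical names, and the card reader itself is left out because its API is not shown here.

```csharp
using System.Text;

static class EncodingCheck
{
    // Encodes text to bytes and decodes it again with the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] buffer = encoding.GetBytes(text);  // what would be written to the card
        return encoding.GetString(buffer);        // what would be read back
    }
}
```

EncodingCheck.RoundTrip("você", Encoding.ASCII) returns "voc?", because the damage happens at write time: Encoding.ASCII.GetBytes replaces every character it cannot represent with '?', so no choice of decoder can recover it afterwards. EncodingCheck.RoundTrip("você", Encoding.UTF8) returns "você". One thing to keep in mind is that UTF-8 needs two bytes for each of these accented letters, so "você" occupies five bytes rather than four.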

ByteArray In C# Is Unable To Show All Contents In TextBox

I'm parsing a PDF file. I converted the data into a byte array, but the text box doesn't show the full file.
I don't want to use any library or other software.
FileStream fs = new FileStream(fname, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
int pos = 0;
int length = (int)br.BaseStream.Length;
byte [] file = br.ReadBytes(length);
String text = System.Text.ASCIIEncoding.ASCII.GetString(file);
displayFile.Text = text;
It would really help if you'd give more detail - including some code, preferably a short but complete program that demonstrates the problem.
My guess is that when you're doing the conversion you end up with some text containing a null character ('\0') - which Windows Forms controls treat as a string terminator.
For example, if you use:
label.Text = "hello\0there";
you'll only see "hello".
Now you may have this problem due to converting from a byte array to text using the wrong encoding - but we can't really help much more with the little information you've provided.
Based on your code example, I would say that the problem is that you are assuming that the PDF file contains plain ASCII text, which is not the case. PDF is a complicated format, and there are libraries that allow you to parse it.
Doing a quick google search: iTextSharp can read the pdf format.
You cannot convert a PDF to text by just interpreting it as ASCII. You may be lucky enough that some of the text actually is ASCII, but you can also expect some of the non-text contents to be indistinguishable from ASCII.
Instead use one of the solutions for parsing PDF. Here is one way using PDFBox and IKVM: Naspinski.net: Parsing/Reading a PDF file with C# and Asp.Net to text
Even the pure ASCII set contains lots of non-printable, non-displayable, and control characters.
Like Jon said, a \0 (NUL) in a string cuts off everything after it when the string is shown in a Windows Forms control; the .NET string itself still holds the full text. I had a painful experience with this behavior years back. Control characters like 'bell' and 'backspace' will also give you funny output. But do not expect to hear a bell ringing :P.
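If the goal is only to look at the raw content in a text box, one workaround is to replace control characters before assigning the text. This is a sketch under that assumption, not a way to extract readable text from a PDF; DisplayText and MakePrintable are invented names, and keeping tab, carriage return, and line feed is an arbitrary choice.

```csharp
using System.Text;

static class DisplayText
{
    // Replaces control characters (including '\0') with dots so a
    // UI control does not cut the text off or render garbage.
    public static string MakePrintable(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            bool keep = !char.IsControl(c) || c == '\r' || c == '\n' || c == '\t';
            sb.Append(keep ? c : '.');
        }
        return sb.ToString();
    }
}
```

DisplayText.MakePrintable("hello\0there") returns "hello.there", so a label or text box would show the whole string instead of stopping at "hello".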
