Bit Conversion trying to write into a stream - c#

I have a string that represents a binary number: "1010", which is 10 (ten) in decimal.
I need to write this string to a Stream while keeping the binary format. Normally, when you write a string, .NET converts it to a byte array using an encoding and writes those bytes to the stream. I do not want that, because the bytes I want in my stream are the ones spelled out by the string, "1010" for example.
How do I do this?

If "1010" is a string, you can still write it to a stream and preserve the format, provided the receiving end uses the proper encoding. Of course, you could use a StreamWriter and just write the string as a string.
UPDATE
Ok, so your comments seem to clarify your question a little. So it seems like you want to convert a base-2 string into a byte, so that you aren't storing multiple bytes to represent in text what you can represent in a single byte. Is that fair? If so, use Convert.ToByte(String, Int32), and specify 2 for the base. Then you have a byte corresponding to your string and you can write it out.
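As a minimal sketch of that suggestion (the class and method names here are hypothetical, and it assumes the string holds at most 8 binary digits so that it fits in one byte):

```csharp
using System;
using System.IO;

static class BinaryStringDemo
{
    // Parses a base-2 string (up to 8 digits) into one byte and writes it to the stream.
    public static void WriteBinaryString(Stream stream, string bits)
    {
        byte value = Convert.ToByte(bits, 2); // "1010" -> 10
        stream.WriteByte(value);
    }

    static void Main()
    {
        using var stream = new MemoryStream();
        WriteBinaryString(stream, "1010");
        Console.WriteLine(BitConverter.ToString(stream.ToArray())); // prints "0A"
    }
}
```

Longer bit strings would need to be split into 8-digit groups first, with one call per group.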

Related

BigInteger.Parse() trouble reading in large numbers

Presently I am attempting to do this challenge (http://cryptopals.com/sets/1/challenges/1) and I am having some trouble completing the task in C#. I can not seem to parse the number into a big integer.
So code looks like below:
string output = "";
BigInteger hexValue = BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6");
output = Convert.ToBase64String(hexValue.ToByteArray());
Console.WriteLine(hexValue);
Console.WriteLine(output);
Console.ReadKey();
return "";
And at present the problem I am getting is when I run the program it fails with the error
System.FormatException: 'The value could not be parsed.' and I am not entirely sure why.
So, what is the appropriate way to get a large integer from a string into a BigInt?
The initial problem
The BigInteger.Parse method expects the value to be decimal, not hex. You can "fix" that by passing in NumberStyles.HexNumber.
The bigger problem with using BigInteger for this
If you're just trying to convert a string of hex digits into bytes, I would avoid using BigInteger at all. For one thing, you could end up with problems if the original byte array started with zeroes, for example. The zeroes wouldn't be in the resulting byte array. (Sample input: "0001" - you want to get two bytes out, but you'll only get one, after persuading it to parse hex.)
Even if you don't lose any information, the byte[] you receive from BigInteger.ToByteArray() isn't what you were probably expecting. For example, consider this code, which just converts the data to byte[] and back to hex via BitConverter:
BigInteger bigInt = BigInteger.Parse("1234567890ABCDEF", NumberStyles.HexNumber);
byte[] bytes = bigInt.ToByteArray();
Console.WriteLine(BitConverter.ToString(bytes));
The output of that is "EF-CD-AB-90-78-56-34-12" - because BigInteger.ToByteArray returns the data in little-endian order:
The individual bytes in the array returned by this method appear in little-endian order. That is, the lower-order bytes of the value precede the higher-order bytes.
That's not what you want - because it means the last part of the original string is the first part of the byte array, etc.
Avoiding BigInteger altogether
Instead, parse the data directly to a byte array, as in this question, or this one, or various others. I won't reproduce the code here, but it's simple enough, with different options depending on whether you're trying to create simple source code or an efficient program.
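For illustration only, one simple (not especially fast) way to do the direct conversion might look like the sketch below. HexConverter and HexToBytes are hypothetical names, and the code assumes the input contains only hex digits:

```csharp
using System;

static class HexConverter
{
    // Converts an even-length string of hex digits to bytes,
    // preserving both the byte order and any leading zeroes.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even number of digits.", nameof(hex));

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Each pair of hex digits becomes one byte.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(HexToBytes("0001")));     // prints "00-01"
        Console.WriteLine(Convert.ToBase64String(HexToBytes("49276d"))); // prints "SSdt"
    }
}
```

Note that "0001" comes out as two bytes in the original order, which is exactly what the BigInteger route loses.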
General advice on conversions
In general it's a good idea to avoid intermediate representations of data unless you're absolutely convinced that you won't lose information in the process - as you would here. It's fine to convert the hex string to a byte array before converting the result to base64, because that's not a lossy transformation.
So your conversions are:
String (hex) to BigInteger: lossy (in the context of leading 0s being significant, as they are in this situation)
BigInteger to byte[]: not lossy
byte[] to String (base64): not lossy
I'm recommending:
String (hex) to byte[]: not lossy (assuming you have an even number of nybbles to convert, which is generally a reasonable assumption)
byte[] to String (base64): not lossy
Use NumberStyles.HexNumber:
BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6",
NumberStyles.HexNumber,
CultureInfo.InvariantCulture);
If your number is supposed to be always positive, add a leading zero to your string.
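The reason for the leading zero is that hex parsing treats a set high bit in the first digit as a sign bit. A small sketch of the difference (HexSignDemo and ParseHex are hypothetical names):

```csharp
using System;
using System.Globalization;
using System.Numerics;

static class HexSignDemo
{
    public static BigInteger ParseHex(string hex) =>
        BigInteger.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);

    static void Main()
    {
        // Without a leading zero, "FF" is read as a two's complement negative number.
        Console.WriteLine(ParseHex("FF"));  // prints -1
        // With a leading zero, the same digits are read as a positive number.
        Console.WriteLine(ParseHex("0FF")); // prints 255
    }
}
```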
The problem is that the input is not decimal but hexadecimal, therefore you need to pass an additional parameter for parsing:
BigInteger number = BigInteger.Parse(
hexString,
NumberStyles.AllowHexSpecifier);

How a real packet should be constructed?

This question will probably anger all the Stack Overflow gods, but I just can't get my head around what a packet should look like.
I mean how to make a packet by these guidelines?
byte[] data0 = Encoding.Unicode.GetBytes("1");
byte[] data1 = Encoding.Unicode.GetBytes(0x03);
string data2 = "127.0.0.0:80";
string data3 = "";
I imagine everything like this and make a byte list/array out of this?
Or that string has to be converted to byte and then pack it to array/list?
Or maybe someone has a simple explanation how its done?
You appear to be trying to implement a client to use Valve's "Master Server Query Protocol". If you click on the "String Zero" link, you'll find it simply describes a null-terminated string. Presumably the encoding is ASCII, but I don't see anything in the documentation that makes that clear.
You will need to construct a datagram with the bytes formatted as described in the document. For ASCII characters, you can cast a char directly to byte: a char is a UTF-16 code unit, and UTF-16 uses the same values as ASCII for the characters the two share. Non-ASCII characters would give values outside the valid ASCII range. However, none of this really matters in this particular example, because the only part of the protocol described as a specific character is one whose byte value you already know, so you can just specify it explicitly.
To build up the byte[] for the datagram, I'd recommend using MemoryStream (you could also use BinaryWriter for more complex types of data, but here you're really only dealing with bytes…BinaryWriter uses its own length-prefixed format for strings, so you'd have to convert to byte[] for the strings anyway). Something like the following ought to work:
// Requires: using System.IO; using System.Text;
byte[] GetMasterServerQueryDatagram(byte regionCode, string address, string filter)
{
    MemoryStream stream = new MemoryStream();

    // 0x31 is the ASCII code for the character '1'
    stream.WriteByte(0x31);
    stream.WriteByte(regionCode);

    // Each string is written as ASCII bytes followed by a null terminator
    byte[] stringZero = Encoding.ASCII.GetBytes(address + "\0");
    stream.Write(stringZero, 0, stringZero.Length);

    stringZero = Encoding.ASCII.GetBytes(filter + "\0");
    stream.Write(stringZero, 0, stringZero.Length);

    return stream.ToArray();
}
Notes:
You might want to declare an enum based on the table in the Valve documentation to represent the regionCode value, so that your other code can refer to the region by name.
Pay close attention to the requirement in the documentation that you pass "0.0.0.0:0" as the first IP:port value, but then pass the last value returned by their servers in your subsequent queries.
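To make the resulting layout concrete, here is a self-contained sketch that builds the first query and prints its bytes. The class and helper names are hypothetical, the region byte 0x03 is simply the value from the question, and ASCII is assumed as discussed above:

```csharp
using System;
using System.IO;
using System.Text;

static class MasterServerQuery
{
    public static byte[] GetDatagram(byte regionCode, string address, string filter)
    {
        using var stream = new MemoryStream();
        stream.WriteByte(0x31);       // ASCII '1'
        stream.WriteByte(regionCode);
        WriteStringZero(stream, address);
        WriteStringZero(stream, filter);
        return stream.ToArray();
    }

    // Writes the text as ASCII bytes followed by a single 0x00 terminator.
    static void WriteStringZero(Stream stream, string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text + "\0");
        stream.Write(bytes, 0, bytes.Length);
    }

    static void Main()
    {
        byte[] datagram = GetDatagram(0x03, "0.0.0.0:0", "");
        Console.WriteLine(BitConverter.ToString(datagram));
        // prints "31-03-30-2E-30-2E-30-2E-30-3A-30-00-00"
    }
}
```

The two trailing 00 bytes are the terminator of the address and the terminator of the empty filter string.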

Change string to Byte Array without conversion

I have a JSON formatted object, and the byte array is coming through as a string. I need to change that string to a byte array, but without converting the char's.
static byte[] GetBytes(string str)
{
return str.Select(Convert.ToByte).ToArray();
}
The above code half solves the issue; unfortunately, it's still converting each char to its respective byte.
For completeness, here is my string
"PCFET0NUWVBFIGh0bWwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMDEvL0VOIiAiaHR0cDovL3d3dy53My5vcmcvVFIvaHRtbDQvc3RyaWN0LmR0ZCI+PGh0bWw+PGhlYWQ+PE1FVEEgaHR0cC1lcXVpdj0iQ29udGVudC1UeXBlIiBjb250ZW50PSJ0ZXh0L2h0bWw7IGNoYXJzZXQ9dXRmLTE2Ij48dGl0bGU+Q2l2aWwgUHJvY2VkdXJlIGluIE1hZ2lzdHJhdGVzJyBDb3VydHM8L3RpdGxlPjxsaW5rIHR5cGU9InRleHQvY3NzIiByZWw9InN0eWxlc2hlZXQiIGhyZWY9Ii4uL1N0eWxlcy9CV0NvbW1vbi5jc3MiPjxsaW5rIHR5cGU9InRleHQvY3NzIiByZWw9InN0eWxlc2hlZXQiIGhyZWY9Ii4uL1N0eWxlcy9TaXRlQ29tbW9uLmNzcyI+PHNjcmlwdCB0eXBlPSJ0ZXh0L2phdmFzY3JpcHQiIHNyYz0iLi4vc2NyaXB0cy9qcXVlcnktMS42LjIubWluLmpzIj48L3NjcmlwdD48c2NyaXB0IHR5cGU9InRleHQvamF2YXNjcmlwdCIgc3JjPSIuLi9zY3JpcHRzL3BvcHVwLmpzIj48L3NjcmlwdD48c2NyaXB0IHR5cGU9InRleHQvamF2YXNjcmlwdCIgc3JjPSIuLi9zY3JpcHRzL2hvdmVyYm94LmpzIj48L3NjcmlwdD48c2NyaXB0IHR5cGU9InRleHQvamF2YXNjcmlwdCIgc3JjPSIuLi9TY3JpcHRzL2pxdWVyeS5wcmludEVsZW1lbnQuanMiPjwvc2NyaXB0PjwvaGVhZD48Ym9keSBvbmxvYWQ9ImlmKHBhcmVudC5zZXRUb29scylwYXJlbnQuc2V0VG9vbHMoKSI+PGRpdiBpZD0iY29udGVudCI+PHAgY2xhc3M9IkdlbmVyYXRvci1IZWFkaW5nIj5Db250ZW50czwvcD48cCBjbGFzcz0iR2VuZXJhdG9yLUl0ZW0iIHN0eWxlPSJtYXJnaW4tbGVmdDoxNXB0Ij48YSBocmVmPSIjIiBvbmNsaWNrPSJsb2FkQ29udGVudCgnLi4vMWInLCBmYWxzZSk7Ij5DaXZpbCBQcm9jZWR1cmU8L2E+PC9wPjxwIGNsYXNzPSJHZW5lcmF0b3ItSXRlbSIgc3R5bGU9Im1hcmdpbi1sZWZ0OjE1cHQ7Y29sb3I6IzAwMjY3RjttYXJnaW4tdG9wOjEycHQiPjxiPsKgwqA8aW1nIHN0eWxlPSJib3JkZXI6MCIgd2lkdGg9IjgiIGhlaWdodD0iOSIgc3JjPSIuLi9SZXNvdXJjZXMvSW1hZ2VzL2Fycm93cy5naWYiIGFsdD0iIj7CoMKgQ2l2aWwgUHJvY2VkdXJlIGluIE1hZ2lzdHJhdGVzJyBDb3VydHM8L2I+PC9wPjxwIGNsYXNzPSJHZW5lcmF0b3ItSXRlbSIgc3R5bGU9Im1hcmdpbi1sZWZ0OjQ1cHQ7Y29sb3I6IzAwMjY3RjttYXJnaW4tYm90dG9tOjEycHQiPjxiPkF1dGhvcnFxcTogPC9iPkVkaXRvcnM6IERSIEhhcm1zLCBBZHZvY2F0ZSBvZiB0aGUgSGlnaCBDb3VydCwgTWVtYmVyIG9mIHRoZSBQcmV0b3JpYSBCYXI7IEYgU291dGh3b29kLiBGb3JtZXIgQ29udHJpYnV0b3JzOiBJIHZhbiBkZXIgV2FsdCwgQWR2b2NhdGUgb2YgdGhlIEhpZ2ggQ291cnQsIE1lbWJlciBvZiB0aGUgUHJldG9yaWEgQmFyOyBDIExvdXcsIEFkdm9jYXRlIG9mIHRoZSBIaWdoIENvdXJ0LCBNZW1iZXIgb2YgdGhlIFByZXRvcmlhIEJhcjsgQnJlbmRhIE5ldWtpcmNoZXIsIEFkdm9jYXRlIG9
mIHRoZSBIaWdoIENvdXJ0LCBNZW1iZXIgb2YgdGhlIFByZXRvcmlhIEJhcjsgSkEgRmFyaXMsIFByb2Zlc3NvciBvZiBMYXcsIFVuaXZlcnNpdHkgb2YgU291dGggQWZyaWNhPGJyPjxiPkxhc3QgVXBkYXRlZDogPC9iPk9jdG9iZXIgMjAxMyAtIFNJIDMyLiBQcm9kdWN0IGRldmVsb3BlcjogQ3JhaWdlbiBTdXJhamxhbGw8L3A+PHA+PGJyPjxicj48L3A+PHA+PGJyPjxicj48L3A+PC9kaXY+PGRpdiBpZD0iY29udGV4dE1lbnUiPjwvZGl2PjwvYm9keT48c2NyaXB0IHR5cGU9InRleHQvamF2YXNjcmlwdCI+dHJ5e3dpbmRvdy5wYXJlbnQuc2V0RnJhbWVIZWlnaHQoIE1hdGgubWF4KE1hdGgubWF4KGRvY3VtZW50LmJvZHkuc2Nyb2xsSGVpZ2h0LCBkb2N1bWVudC5kb2N1bWVudEVsZW1lbnQuc2Nyb2xsSGVpZ2h0KSwgTWF0aC5tYXgoZG9jdW1lbnQuYm9keS5vZmZzZXRIZWlnaHQsIGRvY3VtZW50LmRvY3VtZW50RWxlbWVudC5vZmZzZXRIZWlnaHQpLCBNYXRoLm1heChkb2N1bWVudC5ib2R5LmNsaWVudEhlaWdodCwgZG9jdW1lbnQuZG9jdW1lbnRFbGVtZW50LmNsaWVudEhlaWdodCkpKTt9Y2F0Y2goZSl7fTwvc2NyaXB0PjwvaHRtbD4="
I need to change that to a byte array, such as
['P','C','F'] etc, without converting each char to its respective byte
This is not a duplicate of: How do I get a consistent byte representation of strings in C# without manually specifying an encoding?
In that question, the string is being converted. It's literally in the title that I do not want to convert.
Assuming this is your actual problem description:
I have a base64-encoded string, that I wish to convert to a byte array where each single byte contains the ASCII code for one character from the base64 string.
Then you can very easily do that:
byte[] characterBytes = Encoding.ASCII.GetBytes(input);
Because the characters used in a base64 string are all below Unicode code point 127, they can all be represented in a single byte obtained through Encoding.ASCII.
In fact, if that is your actual problem description, that'd make this question a duplicate of C# Convert a string to ASCII bytes.
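A short sketch of what that produces, using only the first 12 characters of the string from the question. The class and method names are hypothetical, and the second method is shown purely for contrast, in case the decoded data is what is actually wanted:

```csharp
using System;
using System.Text;

static class Base64TextDemo
{
    // One byte per character: the ASCII codes of 'P', 'C', 'F', ...
    public static byte[] CharacterBytes(string input) => Encoding.ASCII.GetBytes(input);

    // Decodes the base64 text back into the data it represents.
    public static byte[] DecodedBytes(string input) => Convert.FromBase64String(input);

    static void Main()
    {
        string input = "PCFET0NUWVBF";

        Console.WriteLine(BitConverter.ToString(CharacterBytes(input), 0, 3));  // prints "50-43-46"
        Console.WriteLine(Encoding.ASCII.GetString(DecodedBytes(input)));       // prints "<!DOCTYPE"
    }
}
```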

Getting a string, int, etc in binary representation?

Is it possible to get strings, ints, etc in binary format? What I mean is that assume I have the string:
"Hello" and I want to store it in binary format, so assume "Hello" is
11110000110011001111111100000000 in binary (I know it's not, I just typed something quickly).
Can I store the above binary not as a string, but in the actual format with the bits.
In addition to this, is it actually possible to store less than 8 bits. What I am getting at is if the letter A is the most frequent letter used in a text, can I use 1 bit to store it with regards to compression instead of building a binary tree.
Is it possible to get strings, ints, etc in binary format?
Yes. There are several different methods for doing so. One common method is to make a MemoryStream out of an array of bytes, and then make a BinaryWriter on top of that memory stream, and then write ints, bools, chars, strings, whatever, to the BinaryWriter. That will fill the array with the bytes that represent the data you wrote. There are other ways to do this too.
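A minimal sketch of the MemoryStream plus BinaryWriter approach. The names are hypothetical, and this variant uses a growable MemoryStream rather than one wrapped around a fixed array:

```csharp
using System;
using System.IO;

static class BinaryWriterDemo
{
    public static byte[] Serialize(int number, bool flag, string text)
    {
        var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(number); // 4 bytes, little-endian
            writer.Write(flag);   // 1 byte: 0x01 for true, 0x00 for false
            writer.Write(text);   // length prefix, then the UTF-8 bytes
        }
        // ToArray still works after the writer has closed the stream.
        return stream.ToArray();
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(Serialize(1, true, "Hi")));
        // prints "01-00-00-00-01-02-48-69"
    }
}
```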
Can I store the above binary not as a string, but in the actual format with the bits.
Sure, you can store an array of bytes.
is it actually possible to store less than 8 bits.
No. The smallest unit of storage in C# is a byte. However, there are classes that will let you treat an array of bytes as an array of bits. You should read about the BitArray class.
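As a rough sketch of how BitArray bridges bits and bytes (BitArrayDemo and PackBits are hypothetical names):

```csharp
using System;
using System.Collections;

static class BitArrayDemo
{
    // Packs individual bits into bytes; unused bits in the last byte stay 0.
    public static byte[] PackBits(bool[] bits)
    {
        var bitArray = new BitArray(bits);
        var bytes = new byte[(bits.Length + 7) / 8];
        bitArray.CopyTo(bytes, 0); // bit 0 lands in the least significant bit of byte 0
        return bytes;
    }

    static void Main()
    {
        byte[] packed = PackBits(new[] { true, false, true, false });
        Console.WriteLine(packed[0]); // prints 5
    }
}
```

Four bits still occupy a whole byte once stored, which is the point made above.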
What encoding would you be assuming?
What you are looking for is something like Huffman coding, it's used to represent more common values with a shorter bit pattern.
How you store the bit codes is still limited to whole bytes. There is no data type that uses less than a byte. The way that you store variable width bit values is to pack them end to end in a byte array. That way you have a stream of bit values, but that also means that you can only read the stream from start to end, there is no random access to the values like you have with the byte values in a byte array.
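Packing variable-width values end to end could be sketched as follows. BitPacker is a hypothetical helper, and writing the most significant bit first is an arbitrary choice made for this example:

```csharp
using System;
using System.Collections.Generic;

sealed class BitPacker
{
    private readonly List<byte> _bytes = new List<byte>();
    private int _bitCount; // total number of bits written so far

    // Appends the lowest `width` bits of `value`, most significant bit first.
    public void Write(uint value, int width)
    {
        for (int i = width - 1; i >= 0; i--)
        {
            if (_bitCount % 8 == 0) _bytes.Add(0); // start a new byte
            if (((value >> i) & 1) != 0)
                _bytes[_bytes.Count - 1] |= (byte)(0x80 >> (_bitCount % 8));
            _bitCount++;
        }
    }

    public byte[] ToArray() => _bytes.ToArray();
}

static class BitPackerDemo
{
    static void Main()
    {
        var packer = new BitPacker();
        packer.Write(0b1, 1);     // 1 bit
        packer.Write(0b01, 2);    // 2 bits
        packer.Write(0b00101, 5); // 5 bits
        Console.WriteLine(BitConverter.ToString(packer.ToArray())); // prints "A5"
    }
}
```

Three values of different widths share a single byte here, and a reader has to consume them in the same order to find where each one starts.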
What I am getting at is if the letter A is the most frequent letter used in a text, can I use 1 bit to store it with regards to compression instead of building a binary tree.
The algorithm you're describing is known as Huffman coding. To relate to your example, if 'A' appears frequently in the data, then the algorithm will represent 'A' as simply 1. If 'B' also appears frequently (but less frequently than A), the algorithm usually would represent 'B' as 01. Then, the rest of the characters would be 00xxxxx... etc.
In essence, the algorithm performs statistical analysis on the data and generates a code that will give you the most compression.
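To illustrate only the encoding step, the sketch below uses a small hand-picked prefix code. The table is an assumption made for this example and is not derived by Huffman's algorithm, which would build it from the actual character frequencies:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class PrefixCodeDemo
{
    // Hand-picked prefix code: no code is the beginning of another code.
    static readonly Dictionary<char, string> Codes = new Dictionary<char, string>
    {
        ['A'] = "1", ['B'] = "01", ['C'] = "001", ['D'] = "000",
    };

    // Returns the encoded text as a string of '0' and '1' characters.
    public static string Encode(string text)
    {
        var bits = new StringBuilder();
        foreach (char c in text)
            bits.Append(Codes[c]); // throws KeyNotFoundException for characters outside the table
        return bits.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Encode("ABACAD")); // prints "10110011000"
    }
}
```

The six characters take 11 bits instead of 48, because the frequent 'A' costs only one bit each time.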
You can use things like:
BitConverter.GetBytes(1);
Encoding.ASCII.GetBytes("text");
Encoding.Unicode.GetBytes("text");
Once you have the bytes, you can do all the bit twiddling you want. You would need an algorithm of some sort before we can give you much more useful information.
The string is actually stored in binary format, as are all strings.
The difference between a string and another data type is that when your program displays the string, it retrieves the binary and shows the corresponding characters (in .NET, strings are stored as UTF-16).
If you were to store data in a compressed format, you would need to assign more than 1 bit per character. How else would you identify which character is the most frequent?
If 1 represents an 'A', what does 0 mean? All the other characters?

Using C#, what is the most efficient method of converting a string containing binary data to an array of bytes

While there are 100 ways to solve the conversion problem, I am focusing on performance.
Given that the string only contains binary data, what is the fastest method, in terms of performance, of converting that data to a byte[] (not char[]) under C#?
Clarification: This is not ASCII data, rather binary data that happens to be in a string.
UTF8Encoding.GetBytes
I'm not sure ASCIIEncoding.GetBytes is going to do it, because it only supports the range 0x0000 to 0x007F.
You say the string contains only bytes. But a .NET string is an array of chars, and one char is 2 bytes (because .NET stores strings as UTF-16). So there are two possible situations for storing the bytes 0x42 and 0x98:
1. The string was an ANSI string containing those bytes and was converted to a Unicode string, so the bytes are 0x00 0x42 0x00 0x98. (The string is stored as the chars 0x0042 and 0x0098.)
2. The string was just a byte array that was typecast to, or simply received as, a string, so it holds the bytes 0x42 0x98. (The string is stored as the single char 0x9842.)
With ASCIIEncoding.GetBytes, the first situation would result in 0x42 and 0x3F (ASCII for "B?"). The second situation would result in 0x3F (ASCII for "?"). This is logical, because the chars are outside the valid ASCII range and the encoder does not know what to do with those values.
So I'm wondering why it's a string containing bytes?
Maybe it contains a byte encoded as a string (for instance Base64)?
Maybe you should start with a char array or a byte array?
If you really do have situation 2 and you want to get the bytes out of it, you should use the UnicodeEncoding.GetBytes call, because that will return 0x42 and 0x98.
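The two situations can be checked with a short sketch (EncodingComparison and Hex are hypothetical names; Encoding.Unicode is little-endian UTF-16):

```csharp
using System;
using System.Text;

static class EncodingComparison
{
    // Encodes the text and formats the resulting bytes as hex.
    public static string Hex(Encoding encoding, string text) =>
        BitConverter.ToString(encoding.GetBytes(text));

    static void Main()
    {
        // Situation 1: two chars, 0x0042 and 0x0098.
        Console.WriteLine(Hex(Encoding.ASCII, "\u0042\u0098")); // prints "42-3F"
        // Situation 2: one char, 0x9842.
        Console.WriteLine(Hex(Encoding.ASCII, "\u9842"));       // prints "3F"
        Console.WriteLine(Hex(Encoding.Unicode, "\u9842"));     // prints "42-98"
    }
}
```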
If you'd like to go from a char array to a byte array, the fastest way would be marshaling. But that's not really nice, and it uses double the memory.
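// Requires: using System; using System.Runtime.InteropServices;
public Byte[] ConvertToBytes(Char[] source)
{
    Byte[] result = new Byte[source.Length * sizeof(Char)];

    // Temporary unmanaged buffer; this is the extra memory mentioned above
    IntPtr tempBuffer = Marshal.AllocHGlobal(result.Length);
    try
    {
        // Copy the chars into unmanaged memory, then read them back as raw bytes
        Marshal.Copy(source, 0, tempBuffer, source.Length);
        Marshal.Copy(tempBuffer, result, 0, result.Length);
    }
    finally
    {
        Marshal.FreeHGlobal(tempBuffer);
    }
    return result;
}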
There is no such thing as an ASCII string in C#! Strings always contain UTF-16. Not realizing this leads to a lot of problems. That said, the methods mentioned before work because they consider the string as UTF-16 encoded and transform the characters to ASCII symbols.
/EDIT in response to the clarification: how did the binary data get in the string? Strings aren't supposed to contain binary data (use byte[] for that).
If you want to go from a string to binary data, you must know what encoding was used to convert the binary data to a string in the first place. Otherwise, you might not end up with the correct binary data. So, the most efficient way is likely GetBytes() on an Encoding subclass (such as UTF8Encoding), but you must know for sure which encoding.
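A small sketch of why the encoding has to match. The sample bytes and the names are made up for illustration; Latin-1 (ISO-8859-1) is used only because it maps every byte value to exactly one character:

```csharp
using System;
using System.Linq;
using System.Text;

static class RoundTripDemo
{
    // True if bytes -> string -> bytes gives back the original data under this encoding.
    public static bool RoundTrips(Encoding encoding, byte[] data) =>
        encoding.GetBytes(encoding.GetString(data)).SequenceEqual(data);

    static void Main()
    {
        byte[] data = { 0x42, 0x98, 0xFF };

        Console.WriteLine(RoundTrips(Encoding.GetEncoding("iso-8859-1"), data)); // prints True
        Console.WriteLine(RoundTrips(Encoding.UTF8, data));                      // prints False
    }
}
```

With UTF-8, the bytes 0x98 and 0xFF are not valid on their own, so they are replaced during decoding and cannot be recovered.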
The comment by Kent Boogaart on the original question sums it up pretty well. ;]
