Converting from System.String to System.Byte[] while reading from MySQL - C#

I have a BLOB field which I'm reading from the database. Following is the code:
byte[] bytes;
sdr.Read();
bytes = (byte[])sdr["proposalDoc"];
But the following exception occurs:
"unable to convert from system.string to system.byte"

I wrote the following before noticing your clarification that the value returned as a string is really just a binary blob. If that's correct, then the link provided by the other commenters looks like what you need. But if the "blob" is actually a series of ASCII characters transformed to Unicode (or a stream of bytes where each byte was widened to a word by setting the high-order byte to 0), then something like the following would apply.
Assuming that the field returned by sdr["proposalDoc"] is really just an ASCII string converted to Unicode, and that all you're trying to do is reconstruct the ASCII byte string (nul-terminated), you could do something like the following. (Note that there may be more efficient ways of doing this, but this should get you started.)
// Get record field.
string tempString = (string)sdr["proposalDoc"];
// Create byte array to hold one ASCII character per Unicode character
// in the field, plus a terminating nul.
bytes = new byte[tempString.Length + 1];
// Copy each character from the field to the byte array,
// keeping the low byte of the character.
int i = 0;
foreach (char c in tempString)
{
bytes[i++] = (byte)c;
}
// Store the terminating nul character, assuming a
// nul-terminated ASCII string is the desired outcome.
bytes[i] = 0;
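If the field really is plain ASCII text, the same low-byte copy (including the terminating nul) can be done in one call; a minimal sketch, assuming that's the case. Note that Encoding.ASCII substitutes '?' for anything outside ASCII, so the manual loop above is safer if stray high-value characters can appear.
// Compact alternative, assuming the field is ASCII text widened to UTF-16.
// Appending "\0" keeps the terminating nul.
byte[] asciiBytes = Encoding.ASCII.GetBytes(tempString + "\0");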

Related

Convert RTU Mode sensor data to ASCII mode

I am trying to develop a Windows application in C# for a Modbus RTU mode (RS-485) sensor.
Reading the sensor data is no problem, but when I try to read the version of the sensor, the result comes out as:
01041A4350532D524D2056312E303020323031383033323900000000007B00
But I need the result to look like this:
CPS-RM V1.00 20180329
I searched the internet and I think I have to convert it to ASCII, but I haven't found a solution. Do you have any idea how to do this?
It looks like only part of the string is actually text. I suspect the third byte is the number of bytes to treat as text following it (so the final two bytes aren't part of the text). Note that it's padded with Unicode NUL characters (U+0000) that you may want to trim.
So if you have your data in a variable called bytes:
string text = Encoding.ASCII
// Decode from the 4th byte, using the 3rd byte as the length
.GetString(bytes, index: 3, count: bytes[2])
// Trim any trailing U+0000 characters
.TrimEnd('\0');
Console.WriteLine(text);
I would mention that that's based on guesswork though. I would strongly advise you to try to find a specification for the data format to check my assumption about the use of the third byte as a length.
If you haven't already got the data as bytes (instead having it in hex) I would suggest you convert it to a byte array first. There are lots of pieces of code on Stack Overflow to do that already, e.g. here and here.
I found an answer and it worked:
public static string ConvertHex(String hexString)
{
try
{
string ascii = string.Empty;
for (int i = 0; i < hexString.Length; i += 2)
{
String hs = hexString.Substring(i, 2);
uint decval = System.Convert.ToUInt32(hs, 16);
char character = System.Convert.ToChar(decval);
ascii += character;
}
return ascii;
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
return string.Empty;
}
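For completeness, here is a sketch that combines both answers: first turn the hex dump into a byte array, then decode only the length-prefixed text portion as suggested above. The HexToBytes helper is hypothetical, and the assumption that the third byte is the text length is still guesswork about the sensor's protocol.
// Hypothetical helper: two hex digits per byte.
public static byte[] HexToBytes(string hex)
{
byte[] result = new byte[hex.Length / 2];
for (int i = 0; i < result.Length; i++)
result[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
return result;
}
// Usage, assuming the third byte holds the length of the text that follows:
byte[] bytes = HexToBytes("01041A4350532D524D2056312E303020323031383033323900000000007B00");
string text = Encoding.ASCII.GetString(bytes, index: 3, count: bytes[2]).TrimEnd('\0');
Console.WriteLine(text); // CPS-RM V1.00 20180329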

Convert a Binary String longer than 8 into a single unicode character

I need to convert a string into binary and back.
But when the input is a Unicode character, the resulting binary string is longer than 8 characters, and converting it back using Convert.ToByte causes an OverflowException.
I used the Convert.ToString method to convert the string into binary.
String input = "⌓"; // Unicode character U+2313
String binary = Convert.ToString(input[0], 2); // Convert to binary
//->> Returns "10001100010011"
byte re = Convert.ToByte("10001100010011", 2); // This causes an OverflowException
// What should I do here? Is there a better reverse of Convert.ToString than Convert.ToByte?
Is there an alternative or something similar?
I think the problem here is that a byte is only 8 bits, and you are putting in a string with 14 bits in it. Try using Convert.ToUInt16() and declare "re" as UInt16, like so:
UInt16 re = Convert.ToUInt16("10001100010011", 2);
// UInt16 is large enough to hold the data
Console.WriteLine(re); // prints 8979
//unicode decimal 8979: http://www.codetable.net/decimal/8979
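To get the character back from the binary string, convert the parsed value back to a char. A short round-trip sketch, assuming the string holds a single character from the Basic Multilingual Plane:
string input = "⌓";
string binary = Convert.ToString(input[0], 2); // "10001100010011"
char restored = (char)Convert.ToUInt16(binary, 2); // back to '⌓'
Console.WriteLine(restored == input[0]); // True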

C# utf8-encoding bytearray out of range

I have the following problem: if the string contains a character that is not part of ASCII, the encoding substitutes the value 63 ('?').
Because of that I changed the encoding to UTF-8, but then a character can be two bytes long, so I get an out-of-range error.
How can I solve the problem?
System.Text.ASCIIEncoding enc = new System.Text.ASCIIEncoding();
byte[] baInput = enc.GetBytes(strInput);
// Split the byte array (6 bytes) into date (days) and time (ms) parts
byte[] baMsec = new byte[4];
byte[] baDays = new byte[2];
for (int i = 0; i < baInput.Length; i++)
{
if (4 > i)
{
baMsec[i] = baInput[i];
}
else
{
baDays[i - 4] = baInput[i];
}
}
The problem you seem to be having is that you know the number of characters, but not the number of bytes, when using UTF8. To solve just that problem, you could use:
byte[] baMsec = Encoding.UTF8.GetBytes(strInput.Substring(0, 4));
byte[] baDays = Encoding.UTF8.GetBytes(strInput.Substring(4));
Recommended Solution:
1) Split strInput using the Substring(Int32, Int32) method and put the date and time parts into separate String variables, say strDate and strTime.
2) Then call UTF8Encoding.GetBytes on strDate and strTime and collect the byte arrays in baDays and baMsec respectively.
Why this works:
A C# String is UTF-16 encoded internally, which represents non-ASCII characters just as well. Hence, no data is lost.
General Caution:
Never try to manipulate encoded strings directly at the byte level; you'll get lost. Use the String and Encoding class methods if you want bytes.
Alternate approach:
I'm wondering (like others) why your date-time data contains non-numeric characters. I saw in a comment that you get your data from reader["TIMESTAMP2"].ToString(); and the sample content is §║ ê or l¦h. Check whether you are mistakenly interpreting numeric data stored in reader["TIMESTAMP2"] as a String when you should actually treat it as a numeric type. Otherwise, even with this method, you'll soon get unexpected output.
The problem is that baInput can contain more bytes than baDays and baMsec together can hold. After six iterations you run past the end of the arrays; hence the exception.
On the seventh iteration, i - 4 yields 6 - 4 = 2.
Since baDays only has two items, you can only set the values at indexes 0 and 1.
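A quick way to see why the fixed-size arrays overflow: with UTF-8 the byte count can exceed the character count. A small illustration with a made-up input string:
string s = "ab§d£f"; // 6 characters; '§' and '£' are outside ASCII
Console.WriteLine(s.Length); // 6
Console.WriteLine(Encoding.UTF8.GetByteCount(s)); // 8 - the two non-ASCII characters take two bytes each
Console.WriteLine(Encoding.ASCII.GetBytes(s)[2]); // 63 - ASCII replaces '§' with '?'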

Bit Array to String and back to Bit Array

Possible Duplicate: Converting byte array to string and back again in C#
I am using Huffman Coding for compression and decompression of some text from here
The code there builds a Huffman tree and uses it for encoding and decoding. Everything works fine when I use the code directly.
For my situation, I need to get the compressed content, store it, and decompress it whenever needed.
The output from the encoder and the input to the decoder are BitArrays.
When I try to convert this BitArray to a String and back to a BitArray and decode it using the following code, I get a weird answer.
string input = Console.ReadLine();
Tree huffmanTree = new Tree();
huffmanTree.Build(input);
BitArray encoded = huffmanTree.Encode(input);
// Print the bits
Console.Write("Encoded Bits: ");
foreach (bool bit in encoded)
{
Console.Write((bit ? 1 : 0) + "");
}
Console.WriteLine();
// Convert the bit array to bytes
Byte[] e = new Byte[(encoded.Length / 8 + (encoded.Length % 8 == 0 ? 0 : 1))];
encoded.CopyTo(e, 0);
// Convert the bytes to string
string output = Encoding.UTF8.GetString(e);
// Convert string back to bytes
e = new Byte[output.Length];
e = Encoding.UTF8.GetBytes(output);
// Convert bytes back to bit array
BitArray todecode = new BitArray(e);
string decoded = huffmanTree.Decode(todecode);
Console.WriteLine("Decoded: " + decoded);
Console.ReadLine();
The output of the original code from the tutorial and the output of my code differ (screenshots omitted).
Where am I wrong, friends? Help me, thanks in advance.
You cannot stuff arbitrary bytes into a string. That concept is just undefined. Conversions happen using Encoding.
string output = Encoding.UTF8.GetString(e);
e is just binary garbage at this point; it is not a UTF-8 string, so calling UTF-8 methods on it does not make sense.
Solution: don't convert to string and back. This does not round-trip. Why are you doing that in the first place? If you need a string, use a round-trippable format like base-64 or base-85.
I'm pretty sure Encoding doesn't roundtrip - that is you can't encode an arbitrary sequence of bytes to a string, and then use the same Encoding to get bytes back and always expect them to be the same.
If you want to be able to roundtrip from your raw bytes to string and back to the same raw bytes, you'd need to use base64 encoding e.g.
http://blogs.microsoft.co.il/blogs/mneiter/archive/2009/03/22/how-to-encoding-and-decoding-base64-strings-in-c.aspx
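For reference, a minimal sketch of the Base64 round trip, assuming encoded is the BitArray produced by huffmanTree.Encode:
// BitArray -> bytes -> Base64 string
byte[] e = new byte[(encoded.Length + 7) / 8];
encoded.CopyTo(e, 0);
string output = Convert.ToBase64String(e);
// Base64 string -> bytes -> BitArray (round-trips the bytes exactly)
byte[] back = Convert.FromBase64String(output);
BitArray todecode = new BitArray(back);
string decoded = huffmanTree.Decode(todecode);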

How to prevent conversion of Windows-1252 argument into a Unicode string?

I've written my first COM classes. My unit tests work fine, but my first use of the COM objects has hit a snag.
The COM classes provide methods which accept a string, manipulate it and return a string. The consumer of the COM objects is a dBASE PLUS program.
When the input string contains common keyboard characters (ASCII 127 or lower), the COM methods work fine. However, if the string contains characters beyond the ASCII range, some of them get remapped from Windows-1252 to C#'s Unicode. This table shows the mapping that takes place: http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
For example, if the dBASE program calls the COM object with:
oMyComObject.MyMethod("It will cost€123") where the € is hex 80,
the C# method receives it as Unicode:
public string MyMethod(string source)
{
// source is Unicode and now the Euro symbol is hex 20AC
...
}
I would like to avoid this remapping because I want the original hex content of the string.
I've tried adding the following to MyMethod to convert the string back to Windows-1252, but the Euro symbol gets lost because it becomes a question mark:
byte[] UnicodeBytes = Encoding.Unicode.GetBytes(source.ToString());
byte[] Win1252Bytes = Encoding.Convert(Encoding.Unicode, Encoding.GetEncoding(1252), UnicodeBytes);
string Win1252 = Encoding.GetEncoding(1252).GetString(Win1252Bytes);
Is there a way to prevent this conversion of the "source" parameter to Unicode? Or, is there a way to convert it 100% from Unicode back to Windows-1252?
Yes, I'm answering my own question. The answer by "Jigsore" put me on the right track, but I want to explain more clearly in case someone else makes the same mistake I made.
I eventually figured out that I had misdiagnosed the problem. dBASE was passing the string fine and C# was receiving it fine. It was how I checked the contents of the string that was in error.
This turnkey example builds on Jigsore's answer:
void Main()
{
string unicodeText = "\u20AC\u0160\u0152\u0161";
byte[] unicodeBytes = Encoding.Unicode.GetBytes(unicodeText);
byte[] win1252bytes = Encoding.Convert(Encoding.Unicode, Encoding.GetEncoding(1252), unicodeBytes);
for (int i = 0; i < win1252bytes.Length; i++)
Console.Write("0x{0:X2} ", win1252bytes[i]); // output: 0x80 0x8A 0x8C 0x9A
// win1252String represents the string passed from dBASE to C#
string win1252String = Encoding.GetEncoding(1252).GetString(win1252bytes);
Console.WriteLine("\r\nWin1252 string is " + win1252String); // output: Win1252 string is €ŠŒš
Console.WriteLine("looking at the code of the first character the wrong way: " + (int)win1252String[0]);
// output: looking at the code of the first character the wrong way: 8364
byte[] bytes = Encoding.GetEncoding(1252).GetBytes(win1252String[0].ToString());
Console.WriteLine("looking at the code of the first character the right way: " + bytes[0]);
// output: looking at the code of the first character the right way: 128
// Warning: If your input contains character codes that are larger in value than what a byte
// can hold (ex: multi-byte Chinese characters), then you will need to look at more than just bytes[0].
}
The reason the first method was wrong is that casting with (int)win1252String[0] (or the converse, casting an integer j to a character with (char)j) converts via the Unicode character set that C# uses internally, not via Windows-1252.
I consider this resolved and would like to thank each person who took the time to comment or answer for their time and trouble. It is appreciated!
Actually you're doing the Unicode to Win-1252 conversion correctly, but you're performing an extra step. The original Win1252 codes are in the Win1252Bytes array!
Check the following code:
string unicodeText = "\u20AC\u0160\u0152\u0161";
byte[] unicodeBytes = Encoding.Unicode.GetBytes(unicodeText);
byte[] win1252bytes = Encoding.Convert(Encoding.Unicode, Encoding.GetEncoding(1252), unicodeBytes);
for (int i = 0; i < win1252bytes.Length; i++)
Console.Write("0x{0:X2} ", win1252bytes[i]);
The output shows the Win-1252 codes for the unicodeText string; you can check this by looking at the CP1252.TXT table.
