Sending a string containing special characters through a TcpClient (byte[]) - c#

I'm trying to send a string containing special characters through a TcpClient (byte[]). Here's an example:
Client enters "amé" in a textbox
Client converts string to byte[] using a certain encoding (I've tried all the predefined ones plus some like "iso-8859-1")
Client sends byte[] through TCP
Server receives and outputs the string reconverted with the same encoding (to a listbox)
Edit:
I forgot to mention that the resulting string was "am?".
Edit-2 (as requested, here's some code):
@DJKRAZE, here's a bit of code:
byte[] buffer = Encoding.ASCII.GetBytes("amé");
((TcpClient)server).Client.Send(buffer);
On the server side:
byte[] buffer = new byte[1024];
Client.Receive(buffer);
string message = Encoding.ASCII.GetString(buffer);
ListBox1.Items.Add(message);
The string that appears in the listbox is "am?"
=== Solution ===
Encoding encoding = Encoding.GetEncoding("iso-8859-1");
byte[] message = encoding.GetBytes("babé");
Update:
Simply using Encoding.UTF8.GetBytes("ééé"); works like a charm (any encoding that can represent "é" works, as long as both sides use the same one).

It's never too late to answer a question, I think; I hope someone will find answers here.
C# strings use 16-bit chars, and the ASCII encoding only covers the 7-bit range, so any character outside it is replaced with '?'. After some research, I found UTF-8 to be the best encoding for special characters.
//data to send via TCP or any stream/file
byte[] string_to_send = UTF8Encoding.UTF8.GetBytes("amé");
//when receiving, pass the received byte array into this to get the string back
string received_string = UTF8Encoding.UTF8.GetString(string_to_send);

Your problem appears to be the Encoding.ASCII.GetBytes("amé"); and Encoding.ASCII.GetString(buffer); calls, as hinted at by '500 - Internal Server Error' in his comments.
The é character is encoded in UTF-8 as the multi-byte sequence C3 A9. When you use the Encoding.ASCII class to encode and decode, the é character is converted to a question mark, since it has no direct ASCII encoding. The same is true of any character without an ASCII equivalent.
Change your code to use Encoding.UTF8.GetBytes() and Encoding.UTF8.GetString() and it should work for you.
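A minimal sketch of the corrected round trip (hedged: 'client' and 'socket' stand for the connected TcpClient and accepted Socket from the question, and Receive's return value is used so that only the bytes actually read are decoded):
// sender: UTF-8 keeps "é" as the two bytes C3 A9
byte[] data = Encoding.UTF8.GetBytes("amé");
client.Client.Send(data); // 'client' is the assumed connected TcpClient
// receiver: decode only the bytes actually received, not the whole 1024-byte buffer
byte[] recvBuffer = new byte[1024];
int count = socket.Receive(recvBuffer); // 'socket' is the assumed accepted Socket
string message = Encoding.UTF8.GetString(recvBuffer, 0, count);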

Your question and your error are not clear to me, but using a Base64 string may solve the problem.
Something like this:
static public string EncodeTo64(string toEncode)
{
    byte[] toEncodeAsBytes = System.Text.ASCIIEncoding.ASCII.GetBytes(toEncode);
    string returnValue = System.Convert.ToBase64String(toEncodeAsBytes);
    return returnValue;
}
static public string DecodeFrom64(string encodedData)
{
    byte[] encodedDataAsBytes = System.Convert.FromBase64String(encodedData);
    string returnValue = System.Text.ASCIIEncoding.ASCII.GetString(encodedDataAsBytes);
    return returnValue;
}
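Note that Base64 only makes bytes safe to pass through text channels; it cannot restore characters that the ASCII step above has already replaced with '?'. Here is a hedged variant of the same helpers with UTF-8 swapped in, so strings like "amé" survive the round trip:
static public string EncodeTo64Utf8(string toEncode)
{
    // UTF-8 preserves non-ASCII characters such as "é" before Base64-encoding
    return System.Convert.ToBase64String(System.Text.Encoding.UTF8.GetBytes(toEncode));
}
static public string DecodeFrom64Utf8(string encodedData)
{
    return System.Text.Encoding.UTF8.GetString(System.Convert.FromBase64String(encodedData));
}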

Related

Best way to transform string to valid encoding in C#

Sorry in advance if this is a duplicate or a simple question! I can't find the answer.
I'm working with a DLL made in Delphi. Data can be sent to the device using the DLL. However, when the data is sent, some strings are not accepted or are written blank. The data sent to the device is stored in a txt file, which was generated by a third-party program.
That is, I think the strings are in an unknown encoding. If I send them in UTF-8 format, the device receives all the information, but some strings still come out as ???? ????.
Many of my texts are in the Cyrillic alphabet.
What I did:
// string that is sent to the device
[MarshalAsAttribute(UnmanagedType.LPStr, SizeConst = 36)]
public string Name;
When I did this, the device received only 10 of the 100 records.
If I encode with UTF-8:
byte[] bytes = Encoding.Default.GetBytes(getDvsName[1].ToString());
string res = Encoding.UTF8.GetString(bytes);
I got all the data this way, but too many strings came out as ??? ????.
I also tried this:
static private string Win1251ToUTF8(string source)
{
    Encoding utf8 = Encoding.GetEncoding("utf-8");
    Encoding win1251 = Encoding.GetEncoding("windows-1251");
    byte[] utf8Bytes = win1251.GetBytes(source);
    byte[] win1251Bytes = Encoding.Convert(win1251, utf8, utf8Bytes);
    source = win1251.GetString(win1251Bytes);
    return source;
}
None of the above methods helped. How can I receive incoming information in the correct format? Are there other ways?
Hi there. Here is what went wrong: you encoded the string with Encoding.Default instead of Encoding.UTF8.
string tom = "ටොම් හැන්ක්ස්";
byte[] bytes = Encoding.UTF8.GetBytes(tom);
string res = Encoding.UTF8.GetString(bytes);
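If the Delphi DLL actually expects single-byte ANSI strings rather than UTF-8 (windows-1251 is the usual ANSI code page for Cyrillic), a sketch of converting before handing the data over; the code page here is an assumption, not something the question confirms:
// assumption: the DLL expects windows-1251 (a common ANSI code page for Cyrillic)
// on .NET Core / .NET 5+, register the provider first:
// Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Encoding win1251 = Encoding.GetEncoding("windows-1251");
byte[] deviceBytes = win1251.GetBytes(name); // 'name' is the string handed to the DLL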

What can cause Base64 decoding to throw FormatException

I am using C# and .NET to encode and decode Base64 strings. The following are snippets of my code:
Base64 encoding:
using (var stream = new MemoryStream())
{
    // ...
    return Convert.ToBase64String(stream.ToArray());
}
Base64 decoding
byte[] bytes = Convert.FromBase64String(messageBody);
My code fails 99% of the time, with only a 1% chance of success. The stack trace is as follows:
5xx Error Returned: System.FormatException: The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.
   at System.Convert.FromBase64_ComputeResultLength(Char* inputPtr, Int32 inputLength)
   at System.Convert.FromBase64CharPtr(Char* inputPtr, Int32 inputLength)
   at System.Convert.FromBase64String(String s)
Does anyone know what can cause Base64 decoding to fail? My encoding and decoding methods are symmetric, so I am really confused about what the root cause of this issue could be.
Thanks for all your replies.
It turned out there were still some old messages in JSON format that had previously failed to be delivered and kept being retried in our system. Meanwhile, a code change on our receiving side was deployed, so the receiver now expects messages in protobuf format, which causes a deserialization failure whenever one of the old JSON messages arrives.
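A hedged sketch of the kind of defensive check that helps here, assuming (as described above) that the stale messages are plain JSON text while new ones are Base64-encoded protobuf:
static bool TryDecodeBase64(string messageBody, out byte[] payload)
{
    try
    {
        payload = Convert.FromBase64String(messageBody);
        return true;
    }
    catch (FormatException)
    {
        payload = null; // likely one of the old JSON-format messages; handle separately
        return false;
    }
}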
In order to debug an issue like this I usually write some tests or create a console app to watch the variables as they change from function to function.
One of the possible scenarios for Base64 decoding to fail is when the decoder input is URL-encoded. This is common when you pass an encrypted string into a URL, for example: it will automatically be URL-encoded, and then it sometimes can and sometimes can't be decoded, depending on which characters the Base64 output contains.
Here's a simple console app to demonstrate this.
using System;
using System.IO;
using System.Net;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        TestEncodeDecode("test");
        TestEncodeDecode("testa");
        TestEncodeDecode("testaa");
        Console.ReadLine();
    }

    private static void TestEncodeDecode(string input)
    {
        string encoded = Encode(input);
        Console.WriteLine($"Encoded: {encoded}");
        string urlEncoded = WebUtility.UrlEncode(encoded);
        Console.WriteLine($"UrlEncoded: {urlEncoded}");
        string decodedString = Decode(urlEncoded);
        Console.WriteLine($"Decoded: {decodedString}");
        Console.WriteLine();
    }

    private static string Decode(string urlEncoded)
    {
        try
        {
            byte[] decoded = Convert.FromBase64String(urlEncoded);
            return Encoding.ASCII.GetString(decoded);
        }
        catch (Exception)
        {
            return "Decoding failed";
        }
    }

    private static string Encode(string input)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(input);
        using (var stream = new MemoryStream())
        {
            // write the input bytes to the stream, then Base64-encode its contents
            stream.Write(bytes, 0, bytes.Length);
            return Convert.ToBase64String(stream.ToArray());
        }
    }
}
You'll see that the first two inputs ("test" and "testa") fail to decode, because their Base64 forms ("dGVzdA==" and "dGVzdGE=") end in '=' padding, which UrlEncode turns into "%3D"; the third ("testaa") encodes to "dGVzdGFh" with no padding, so it survives URL encoding unchanged and decodes successfully.
In order to "fix" this, change the Decode method as follows:
private static string Decode(string urlEncoded)
{
    try
    {
        // undo the URL encoding first, then Base64-decode
        string regularEncodedString = WebUtility.UrlDecode(urlEncoded);
        byte[] decoded = Convert.FromBase64String(regularEncodedString);
        return Encoding.ASCII.GetString(decoded);
    }
    catch (Exception)
    {
        return "Decoding failed";
    }
}

Equivalent of System.IO.File.ReadAllBytes(filePath)

I'm trying to get all the bytes from a file, but the issue is that I only have access to the file once. So what I'm doing is saving the file's contents as a string in a txt file; when I need them, I read the saved string back and convert it to a byte array, but something is not working.
Basically, what I need is the equivalent of System.IO.File.ReadAllBytes(filePath), which works fine but is not what I need in this case.
I've tried this so far:
public byte[] getByteArray(string fileString)
{
    Encoding utf8 = Encoding.UTF8;
    Encoding ascii = Encoding.ASCII;
    Encoding unicode = Encoding.Unicode;
    return utf8.GetBytes(fileString);
}
I have tried every encoding in that class, but it didn't work.
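For what it's worth, no text encoding can round-trip arbitrary binary file bytes through a string, but Base64 can. A sketch under that assumption ('savedPath', the txt file holding the string, is hypothetical):
// read the file once and store its bytes as a Base64 string
byte[] original = System.IO.File.ReadAllBytes(filePath);
System.IO.File.WriteAllText(savedPath, Convert.ToBase64String(original));
// later: recover the exact original bytes from the saved string
byte[] restored = Convert.FromBase64String(System.IO.File.ReadAllText(savedPath));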

how can I safely convert byte array to string in C# on Linux (under mono)?

Currently I am using something like this:
private static ASCIIEncoding ASCEncoding = new ASCIIEncoding();
...
...
and my method:
...
public object some_method(object BinaryRequest)
{
byte[] byteRequest = (byte[])BinaryRequest;
string strRequest = ASCEncoding.GetString(byteRequest);
...
}
Some characters, when checked under Windows, are different from when checked under Linux:
9I9T (win)
98T (linux)
When you are communicating between systems, it's a good idea to use a specific and documented encoding for your text. For text written in the English language (including programming languages, which use English for keywords), the UTF-8 encoding is likely to use the fewest bytes overall in the encoded representation.
byte[] byteRequest = (byte[])BinaryRequest;
string strRequest = Encoding.UTF8.GetString(byteRequest);
Obviously to use this, you are expected to produce your requests using the same encoding.
string strRequest = ...
byte[] byteRequest = Encoding.UTF8.GetBytes(strRequest);
string stringValue = Encoding.Default.GetString(byteArray);
Note, though, that Encoding.Default follows the platform's active code page, so it can give different results on Windows and on Mono under Linux; that is exactly the symptom described above.

BlockingSenderDestination.sendReceive() UTF-8 issue

In my Blackberry application I am loading JSON using the following method.
private static Object loadJson(String uriStr) {
    Object _json = null;
    Message response = null;
    BlockingSenderDestination bsd = null;
    try {
        bsd = (BlockingSenderDestination) DestinationFactory.getSenderDestination(
                "CommAPISample", URI.create(uriStr));
        if (bsd == null) {
            bsd = DestinationFactory.createBlockingSenderDestination(
                    new Context("CommAPISample"),
                    URI.create(uriStr), new JSONMessageProcessor());
        }
        response = bsd.sendReceive();
        _json = response.getObjectPayload();
    } catch (Exception e) {
        System.out.println(e.toString());
    } finally {
        if (bsd != null) {
            bsd.release();
        }
    }
    return _json;
}
This is working fine. But the problem is that when I get the JSON, Arabic characters show as junk
(الرئيس التنÙ). I submitted this issue to the Blackberry support forum:
Arabic shows corrupted in the JSON output
As per the discussion, I encoded the Arabic characters into \uxxxx format (in my server-side application) and it worked. But now I have to use JSON from somebody else, where I can't change the server-side code.
They are using ASP.NET C#; according to them, they are sending the data like the following.
JsonResult result = new JsonResult();
result.ContentEncoding = System.Text.Encoding.UTF8;
result.JsonRequestBehavior = JsonRequestBehavior.AllowGet;
result.Data = "Data Object (Contains Arabic) comes here";
return result;
So my question is: if the server provides the data in the above manner, can the BlockingSenderDestination.sendReceive() method receive UTF-8 data? Or does it expect only \uxxxx-encoded data for non-ASCII? Or do I have to do something else (like sending some header to the server) so that I can use the UTF-8 data directly?
In debug mode I checked the value of 'response'; it already shows junk characters.
Except for JSON, I am able to handle Arabic everywhere else.
Yesterday I posted this issue in the Blackberry forum, but so far there is no reply.
I am new to Blackberry and Java, so I am sorry if this is a silly question.
Thanks in advance.
What is the content type in the response? Is the server explicitly defining the UTF-8 character encoding in the HTTP header? e.g.:
Content-Type: text/json; charset=UTF-8
If the API is ignoring the charset in the HTTP content type, an easier way to do the String conversion is to determine whether the Message received is a ByteMessage or a StreamMessage, retrieve the payload as a byte array, and then convert it to a string using the UTF-8 encoding,
i.e.:
Message msg = bsd.sendReceive();
byte[] msgBytes = null;
if (msg instanceof ByteMessage) {
    msgBytes = ((ByteMessage) msg).getBytePayload();
} else { /* StreamMessage */
    // TODO read the bytes from the stream into a byte array
}
return new String(msgBytes, "UTF-8");
At last I found the solution myself.
The data sent from the server was UTF-8, which uses more than one byte for non-ASCII characters, but BlockingSenderDestination.sendReceive() did not take that into account and created one character per byte. So the solution was to take each character of the mis-decoded string, recover its byte, and add it to a byte array, then create a string from that byte array with the UTF-8 encoding.
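The same repair can be sketched in C# terms (hedged: it assumes the payload was wrongly decoded one character per byte, as described above; RepairUtf8 is a hypothetical helper):
// 'junk' is the mis-decoded string in which each char holds one raw byte
static string RepairUtf8(string junk)
{
    byte[] raw = new byte[junk.Length];
    for (int i = 0; i < junk.Length; i++)
        raw[i] = (byte)junk[i]; // recover the original byte from each char
    return System.Text.Encoding.UTF8.GetString(raw); // decode the bytes as UTF-8
}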
If anyone knows how to make BlockingSenderDestination.sendReceive() handle UTF-8 directly, please post it here, so that we can avoid this extra conversion step.
