C# COM Port - Possible Loss of Data

I have been trying to troubleshoot an issue and I really don't have any ideas about how to resolve the following problem:
I have a software application that listens for a device that uploads data to a computer. Once this data is captured, it is written to a text file and stored for later use.
void DataReceived(object sender, SerialDataReceivedEventArgs e)
{
    while ((bufferedData = comport.ReadLine()) != null)
    {
        uploadedData += bufferedData + Environment.NewLine;
    }
    comport.Close();

    System.IO.StreamWriter writeUploadedPro = new System.IO.StreamWriter(uploadFilePath);
    writeUploadedPro.Write(uploadedData);
    writeUploadedPro.Close();
    isUploadComplete = true;
}
I can establish, receive, and verify a connection, and what I have programmed does indeed produce a text file of the uploaded data; however, the data it contains is not complete.
Example:
%
N0?77??.5???3
G0? X3.??? Z4.5??6
Z5.??
?3.5?76
G01 Z5.??
Z4.9471
X?.?3 Z4.???9
Z?.???6
?3.?? Z?.???
Z4.????
X3.7??4
G?? X3.???? ?4.5??6
M30
?
It has numerous '?' characters which should each be a letter or a digit. I have verified that the settings I have for the COM port (Baud, Data, Stop, Parity, Handshake, and COM name) are correctly specified. I have also tried setting the ReadBufferSize, ReceivedBytesThreshold, Encoding, and NewLine properties. I am not at all familiar with these properties, and I didn't find MSDN to be too helpful in explaining them either.
If you have any ideas or suggestions as to why I am getting incomplete lines of data in my upload, it would be most appreciated. Thank you for your assistance.

The problem is in your encoding. Try changing the Encoding property to UTF8. The default is ASCII. In ASCII encoding, any character greater than 0x7F gets converted into a ? by default because ASCII only goes up to 0x7F (127 decimal).
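For example, assuming the SerialPort field is named comport as in the question:
comport.Encoding = System.Text.Encoding.UTF8;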
Though that might fix your problem, the better way to go would be to read the data into a byte array, then use one of the Encoding classes to convert it into a proper string. The reason that works is that you're no longer converting the received bytes into a string as they arrive. You would be using the Read(byte[], int, int) overload, which doesn't do a string conversion. It's pure bytes.
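A minimal sketch of that approach, assuming a SerialPort field named comport and that the device actually sends UTF-8 (substitute whatever encoding the device really uses; real code would also need to synchronize access to the byte list, since DataReceived fires on a worker thread):

using System.Collections.Generic;
using System.IO.Ports;
using System.Linq;
using System.Text;

class Uploader
{
    private readonly SerialPort comport = new SerialPort("COM1"); // example port name
    private readonly List<byte> received = new List<byte>();

    public Uploader()
    {
        comport.DataReceived += Comport_DataReceived;
        comport.Open();
    }

    private void Comport_DataReceived(object sender, SerialDataReceivedEventArgs e)
    {
        int count = comport.BytesToRead;
        byte[] buffer = new byte[count];
        int read = comport.Read(buffer, 0, count);   // raw bytes, no string conversion yet
        received.AddRange(buffer.Take(read));
    }

    public string DecodeUpload()
    {
        // Decode everything in one go so characters split across reads stay intact.
        return Encoding.UTF8.GetString(received.ToArray());
    }
}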

I think the problem lies in:
while ((bufferedData = comport.ReadLine()) != null)
Try changing the statement to something like:
while (comport.BytesToRead > 0)
{
    uploadedData += comport.ReadExisting() + Environment.NewLine;
}
Let me know if that helped.

Related

Decode Stream to CSV in Python by Byte (Translate from C# code)

I am trying to consume a streamed response in Python from a SOAP API and output a CSV file. The response is a string encoded in Base64, which I do not know what to do with. Also, the API documentation says that the response must be read to a destination buffer-by-buffer.
Here is the C# code that was provided in the API's documentation:
byte[] buffer = new byte[4000];
bool endOfStream = false;
int bytesRead = 0;
int totalBytes = 0;

using (FileStream localFileStream = new FileStream(destinationPath, FileMode.Create, FileAccess.Write))
{
    using (Stream remoteStream = client.DownloadFile(jobId))
    {
        while (!endOfStream)
        {
            bytesRead = remoteStream.Read(buffer, 0, buffer.Length);
            if (bytesRead > 0)
            {
                localFileStream.Write(buffer, 0, bytesRead);
                totalBytes += bytesRead;
            }
            else
            {
                endOfStream = true;
            }
        }
    }
}
I have tried many different things to get this stream into a readable CSV file, but none have worked.
with open('test.csv', 'w') as f: f.write(FileString)
This returns a CSV with the Base64 string spread over multiple lines.
Here is my latest attempt:
with open('csvfile13.csv', 'wb') as csvfile:
    FileString = client.service.DownloadFile(yyy.JobId, False)
    stream = io.BytesIO(str(FileString))
    with open(stream, "rt", 4000) as readstream:
        csvfile.write(readstream)
This produces the error:
TypeError: coercing to Unicode: need string or buffer, _io.BytesIO
Any help would be greatly appreciated, even if it is just to point me in the right direction. I will be sure to award the points to whoever is the most helpful, even if I do not completely solve the issue!
I have asked several questions similar to this one, but I have yet to find an answer that works completely:
What is the Python equivalent to FileStream in C#?
Write Streamed Response(file-like object) to CSV file Byte by Byte in Python
How to replicate C# 'byte' and 'Write' in Python
Let me know if you need further clarification!
Update:
I have tried print(base64.b64decode(str(FileString)))
This gives me a page full of webdings like
]�P�O�J��Y��KW �
I have also tried
for data in client.service.DownloadFile(yyy.JobId, False):
    print data
But this just loops through the output character by character like any other string.
I have also managed to get a long string of bytes like \xbc\x97_D\xfb (not actual bytes, just a similar format) by decoding the entire string, but I do not know how to make this readable.
Edit: Corrected the output of the sample python, added more example code, formatting
It sounds like you need to use the base64 module to decode the downloaded data.
It might be as simple as:
with open(destinationPath, 'w') as localFile:
    remoteFile = client.service.DownloadFile(yyy.JobId, False)
    remoteData = str(remoteFile).decode('base64')
    localFile.write(remoteData)
I suggest you break the problem down and determine what data you have at each stage. For example what exactly are you getting back from client.service.DownloadFile?
Decoding your sample downloaded data (given in the comments):
'UEsYAItH7brgsgPutAG\AoAYYAYa='.decode('base64')
gives
'PK\x18\x00\x8bG\xed\xba\xe0\xb2\x03\xee\xb4\x01\x80\xa0\x06\x18\x01\x86'
This looks suspiciously like a ZIP file header. I suggest you rename the file .zip and open it as such to investigate.
If remoteData is a ZIP something like the following should extract and write your CSV.
import io
import zipfile

remoteFile = client.service.DownloadFile(yyy.JobId, False)
remoteData = str(remoteFile).decode('base64')

zipStream = io.BytesIO(remoteData)
z = zipfile.ZipFile(zipStream, 'r')
csvData = z.read(z.infolist()[0])

with open(destinationPath, 'w') as localFile:
    localFile.write(csvData)
Note: BASE64 can have some variations regarding padding and alternate character mapping but once you can see the data it should be reasonably clear what you need. Of course carefully read the documentation on your SOAP interface.
Are you sure FileString is a Base64 string? Based on the source code here, suds.sax.text.Text is a subclass of Unicode. You can write this to a file as you would a normal string but whatever you use to read the data from the file may corrupt it unless it's UTF-8-encoded.
You can try writing your Text object to a UTF-8-encoded file using io.open:
import io

with io.open('/path/to/my/file.txt', 'w', encoding='utf_8') as f:
    f.write(FileString)
Bear in mind, your console or text editor may have trouble displaying non-ASCII characters but that doesn't mean they're not encoded properly. Another way to inspect them is to open the file back up in the Python interactive shell:
import io

with io.open('/path/to/my/file.txt', 'r', encoding='utf_8') as f:
    next(f)  # displays the representation of the first line of the file as a Unicode object
In Python 3, you can even use the built-in csv to parse the file, however in Python 2, you'll need to pip install backports.csv because the built-in module doesn't work with Unicode objects:
from backports import csv
import io

with io.open('/path/to/my/file.txt', 'r', encoding='utf_8') as f:
    r = csv.reader(f)
    next(r)  # displays the representation of the first line of the file as a list of Unicode objects (each value separated)

A socket message from Python to C# comes through garbled

I'm trying to set up a very basic ZeroMQ-based socket link between a Python server and a C# client, using simplejson and Json.NET.
I try to send a dict from Python and read it into an object in C#. Python code:
message = {'MessageType':"None", 'ContentType':"None", 'Content':"OK"}
message_blob = simplejson.dumps(message).encode(encoding = "UTF-8")
alive_socket.send(message_blob)
The message is sent as a normal UTF-8 string or, if I use UTF-16, as "'\xff\xfe{\x00"\x00..." etc.
Code in C# is where my problem is:
string reply = client.Receive(Encoding.UTF8);
The UTF-8 message is received as "≻潃瑮湥≴›..." etc.
I tried to use UTF-16 and the message comes through OK, but the first symbols are still the little-endian \xFF \xFE BOM so when I try to feed it to the deserializer,
PythonMessage replyMessage = JsonConvert.DeserializeObject<PythonMessage>(reply);
//PythonMessage is just a very simple class with properties,
//not relevant to the problem
I get an error (obviously occurring at the first symbol, \xFF):
Unexpected character encountered while parsing value: .
Something is obviously wrong in the way I'm using encoding. Can you please show me the right way to do this?
Python's "utf-16" codec always writes a byte-order mark. You can use UTF-16LE or UTF-16BE to assume a particular byte order, and then no BOM will be generated. That is, use:
message_blob = simplejson.dumps(message).encode(encoding = "UTF-16le")
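Alternatively, if you keep plain UTF-16 on the Python side, the C# side can strip a leading BOM before handing the string to Json.NET. A sketch that reuses the Receive(Encoding) call from the question (Encoding.Unicode is UTF-16LE in .NET):
string reply = client.Receive(Encoding.Unicode);   // UTF-16LE, possibly starting with a BOM
reply = reply.TrimStart('\uFEFF');                 // drop the BOM character if present
PythonMessage replyMessage = JsonConvert.DeserializeObject<PythonMessage>(reply);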

Google protocol buffers - Sending a message from a C# client to a Java server

The client sends a 1481-byte array.
The server can read the entire 1481-byte message without any problems, but when parsing the message from the received binary array I get this exception:
com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
The binary data is the same, and I checked that I am using the right version of the .proto files. I am a bit at a loss, to be honest. Any help appreciated.
Code
byte[] data = IOUtils.toByteArray(br1, "ASCII");
System.out.println("SIZE:" + data.length);

AddressBook adb1 = AddressBook.parseFrom(data);
System.out.println("Server: Addressbook:" + adb1.getPersonCount());
System.out.println("Server: Addressbook:" + adb1.getPerson(0).getName());
Question:
I need to find a way to correctly parse the received AddressBook message from the 1481-byte array that was read.
Thanks.
This is the problem:
br1 = new InputStreamReader(s.getInputStream());
That's trying to treat opaque binary data as text. It's not text, it's binary data. So when you convert that Reader into a byte array, you've lost a load of the original data - no wonder it's an invalid protocol buffer.
Just use:
AddressBook adb1 = AddressBook.parseFrom(s.getInputStream());
and avoid the lossy text conversion. That's assuming you haven't got something equally broken on the C# side, of course.
If you must go via text, you should use base64 encoding on both sides.
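For completeness, a minimal sketch of what the C# sending side could look like to keep everything binary; this assumes the Google.Protobuf runtime and a generated AddressBook class, since the question doesn't show the actual sending code:

using System.Net.Sockets;
using Google.Protobuf;

static void SendAddressBook(TcpClient client, AddressBook addressBook)
{
    // Write the serialized message to the socket as raw bytes - no text encoding anywhere.
    NetworkStream stream = client.GetStream();
    addressBook.WriteTo(stream);
    stream.Flush();
}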
Now it works. I had made the same mistake when serializing and sending the protocol buffers message.

Get the video file after converting from Base64 and UTF-8 in C#

What I need to do is send a file from Java to C#. Java acts as the client and C# acts as the server.
The file is loaded in Java through FileInputStream and converted to UTF-8, then Base64. See the code.
FileInputStream fin=new FileInputStream(fileName);
byte[] content = new byte[fin.available()];
fin.read(content, 0, content.length);
String asString = new String(content, "UTF8");
byte[] newBytes = asString.getBytes("UTF8");
String base64 = Base64.encodeToString(newBytes, Base64.DEFAULT);
The server (written in C#) will read the data sent and convert it back into a file. I am decoding from Base64, then UTF-8, and after that I am not sure how to finish. What I am trying to send is video.mp4, 144 KB or less in size. So far, the output shows the "Wrong format" catch. See the code.
try
{
    for (int i = 0; i <= _server.Q.NoOfItem - 1; i++)
    {
        words = _server.Q.ElementAtBuffer(i).ToString();
        //textBox1.Text = words;

        byte[] encodedDataAsBytes = System.Convert.FromBase64String(words);
        string returnValue = System.Text.Encoding.UTF8.GetString(encodedDataAsBytes);
        textBox1.Text = returnValue;
    }
}
catch (ArgumentNullException argNull)
{
    textBox1.Text = "Received null value";
}
catch (FormatException FrmtEx)
{
    textBox1.Text = "Wrong format";
}
You can ignore the for (int i = 0; i <= _server.Q.NoOfItem - 1; i++) loop, as it is just the way I capture/retrieve the data sent.
P/S: it works when I just pass any string without loading the file (string >> UTF-8 >> Base64) and receive it (Base64 >> UTF-8 >> string).
The file is loaded in Java through FileInputStream and converted to UTF-8
Then you've lost data. Video data is not text data, so don't load it as text data. Treat it as binary data - by all means encode it to base64 if you need to represent it as a string somewhere but don't perform any text decoding on it, as that's only meant for encoded text data, which this isn't.
It's really important to understand what's wrong here. The only thing the two lines below can do is lose data. If they don't lose data, they serve no purpose - and if they do lose data, they're clearly a bad idea:
String asString = new String(content, "UTF8");
byte[] newBytes = asString.getBytes("UTF8");
You should analyze how you ended up with this code in the first place... why did you feel the need to convert the byte array to a string and back?
jowierun's answer is also correct - you shouldn't be using available() at all. You might want to use utility methods from Guava, such as Files.toByteArray if you definitely need to read the whole file into memory in one go.
P/S: it works when I just pass any string without loading the file (string >> UTF-8 >> Base64) and receive it (Base64 >> UTF-8 >> string).
Well yes - if you start with text data, then that's fine - UTF-8 can represent every valid string, and base64 is lossless, so you're fine. (Admittedly you could break it by presenting an invalid string with half of a surrogate pair, but...) The problem is at the point where you treat non-text data as text in the first place.
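On the C# side, once the Java client sends plain Base64 of the raw file bytes, the receive path is just a Base64 decode straight to disk. A sketch, where words is the received Base64 string from the question's loop and the output path is only an example:
byte[] fileBytes = System.Convert.FromBase64String(words);
System.IO.File.WriteAllBytes(@"C:\received\video.mp4", fileBytes);   // example path; no UTF-8 step anywhere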
You shouldn't use fin.available() to assume you can read the file in one go. That is likely to work only for small files. Instead, you need to do the read in a loop and collect all the contents together before you encode them.
It would also make sense (on the Java side at least) to have a decode routine you can use to test that your encode is working (a unit test, perhaps?). You will probably find that the test fails consistently with the problem you are getting.

Read UNIX encoded file with C#

I have a C# program we use to replace some values with others, to be used afterwards as parameters. For example, 'NAME1' is replaced with &1, 'NAME2' with &2, and so on.
The problem is that the data to modify is in a text file encoded on UNIX, and special characters like í get read as a square (invalid char), even in memory. Due to specifications that are out of my control, the file can't be changed and I have no choice but to read it as it is.
I have tried reading it with most of the 130 encodings C# offers me:
EncodingInfo[] info = System.Text.Encoding.GetEncodings();
string text;
for (int a = 0; a < info.Length; ++a)
{
    text = File.ReadAllText(fn, info[a].GetEncoding());
    File.WriteAllText(fn + a, text, info[a].GetEncoding());
}
fn is the file path to read. I have checked all of the generated files (about 130 of them); none of them writes the í properly, so I am out of ideas and unable to find anything on the internet.
SOLUTION:
It looks like this code finally did the work of getting the text properly; I also had to use the same encoding for the writing part:
System.Text.Encoding encoding = System.Text.Encoding.GetEncodings()[41].GetEncoding(); //Central European (Windows)
String text = File.ReadAllText(fn, encoding); // get file text
// DO ALL THE STUFF I HAD TO
File.WriteAllText(fn, text, encoding); // write it back with the same encoding

/* ALL THESE ENCODINGS APPARENTLY WORKED FOR ME WITH ALL THE WEIRD CHARS I WAS ABLE TO WRITE :P
System.Text.Encoding.GetEncodings()[115].GetEncoding(); //Latin 9 (ISO)
System.Text.Encoding.GetEncodings()[108].GetEncoding(); //Baltic (ISO)
System.Text.Encoding.GetEncodings()[107].GetEncoding(); //Latin 3 (ISO)
System.Text.Encoding.GetEncodings()[106].GetEncoding(); //Central European (ISO)
System.Text.Encoding.GetEncodings()[105].GetEncoding(); //Western European (ISO)
System.Text.Encoding.GetEncodings()[49].GetEncoding(); //Vietnamese (Windows)
System.Text.Encoding.GetEncodings()[45].GetEncoding(); //Turkish (Windows)
System.Text.Encoding.GetEncodings()[41].GetEncoding(); //Central European (Windows) <-- Used this one
*/
Thank you very much for your help
Noman(1)
You have to get the proper encoding format. Try using file -i on the file. That will output MIME-type information, which will also include the character-set encoding. I found a man page for it, too :)
Or try enca. It can guess and even convert between encodings. Just look at the man page.
Once you have the proper encoding, look for a way to apply it to your file reading.
Quoted from: How to find encoding of a file in Unix via script(s)
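Once you know the real encoding (say file -i reports iso-8859-1), it is safer to apply it by name rather than by an index into GetEncodings(), since that index is not stable across machines. A sketch; the encoding name here is only an example:
System.Text.Encoding encoding = System.Text.Encoding.GetEncoding("iso-8859-1"); // substitute whatever file -i reports
string text = System.IO.File.ReadAllText(fn, encoding);
// ... do all the replacements ...
System.IO.File.WriteAllText(fn, text, encoding);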
