convert binary file to text - c#

I have a program that gets a response from a URL in binary format, and I do not know how to convert it to a text file.
byte[] postBytes = System.Text.Encoding.UTF8.GetBytes(postString);
request.ContentLength = postBytes.Length;
Stream stream = request.GetRequestStream();
stream.Write(postBytes, 0, postBytes.Length);
stream.Close();
response = (HttpWebResponse)request.GetResponse();
Stream ReceiveStream = response.GetResponseStream();
string filename = "C:\\responseGot.txt";
byte[] buffer = new byte[1024];
FileStream outFile = new FileStream(filename, FileMode.Create);
int bytesRead;
while ((bytesRead = ReceiveStream.Read(buffer, 0, buffer.Length)) != 0)
    outFile.Write(buffer, 0, bytesRead);
When I open responseGot.txt, it is a binary file. How do I get a text file?

What format is the response in? Strictly speaking, there is no such thing as a text file. There are only binary files. HTTP is also 100% binary.
Text is an interpretation of bytes, and it exists only inside a running application. You can never write text to a file directly. You can only convert the text to bytes (using some encoding) and write the bytes.
Therefore, ask yourself why the bytes you received cannot be interpreted by notepad.exe as text. Maybe the response is not plain text but a ZIP file or something similar.
You can guess the format with a hex editor.
You can ask the website owner.
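To make the "text is an interpretation of bytes" point concrete, here is a minimal sketch in Python (not the asker's C#). The sample word is arbitrary and chosen only for illustration.

```python
# The same text becomes different bytes under different encodings,
# and the same bytes become different text under different decodings.
text = "café"

utf8_bytes = text.encode("utf-8")      # 5 bytes: the 'é' takes two bytes
latin1_bytes = text.encode("latin-1")  # 4 bytes: the 'é' takes one byte

# Decoding with the encoding that produced the bytes recovers the text.
round_trip = utf8_bytes.decode("utf-8")

# Decoding with a different encoding raises no error here,
# it just produces the wrong characters.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes, latin1_bytes, round_trip, misread)
```

A program such as notepad.exe has to make the same choice of decoding when it opens a file, which is why identical bytes can look fine in one tool and garbled in another.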

Your code sample doesn't show the file being saved anywhere.
But to convert the response to a string, you can use:
using (HttpWebResponse response = req.GetResponse() as HttpWebResponse)
{
    StreamReader reader = new StreamReader(response.GetResponseStream());
    string ResponseTXT = reader.ReadToEnd();
}
Then you can save it with the usual techniques:
http://msdn.microsoft.com/en-us/library/6ka1wd3w%28v=vs.110%29.aspx
Did you mean that?

All data in digital computing these days is built from two-state bits, i.e. it is binary (electrical/magnetic signals: on/off or north/south).
Every file written to disk is therefore also a binary file, i.e. a sequence of (8-bit) bytes.
ASCII and the ANSI code pages define a character map for byte values, and only about 95 of the 256 possible byte values are considered printable (text) characters.
A file that contains only printable characters is usually called a plain text file. Your downloaded file seems to contain more than just printable characters.
To view the file as it is (in your current encoding settings):
type <file.ext>
To view in a different code page:
chcp <codepage>
type <file.ext>
To view a (plain) text representation of your file, you'd encode it first (i.e. translate it to text), e.g. as a hex-coded string in a hex editor.
The first few bytes of the hex sequence should give a magic number that indicates the type of file being read. You'd then open the file with the associated program (one that is capable of opening that type of file).
If you were expecting a text file and instead got a file that has more than just printable (plain text) characters, then it is likely that some sort of compression or encryption has been applied to it. Once again, the magic number should hint at how the file should be treated, e.g. decompressed before attempting to read the data. (Encrypted files should come with a decryption hint/key, unless one was exchanged or agreed on earlier.)
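The magic-number check can also be done in code rather than by eye. The following is a rough Python sketch: the function names `guess_format` and `hex_preview` are made up for illustration, and the signature table covers only a handful of well-known formats.

```python
# A few well-known file signatures ("magic numbers"); far from exhaustive.
SIGNATURES = [
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
]


def guess_format(data: bytes) -> str:
    """Guess a file type from its leading bytes (hypothetical helper)."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    # No known signature: see whether the bytes at least decode as UTF-8.
    try:
        data.decode("utf-8")
        return "text"
    except UnicodeDecodeError:
        return "unknown"


def hex_preview(data: bytes, count: int = 8) -> str:
    """Return the first `count` bytes as space-separated hex (Python 3.8+)."""
    return data[:count].hex(" ")


print(guess_format(b"PK\x03\x04rest-of-archive"), hex_preview(b"PK\x03\x04"))
```

If the guess comes back as `zip` or `gzip`, the download needs to be decompressed before it can be read as text.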

Use the ReadFully method from the topic "Creating a byte array from a stream" to read the response into a byte array.
Then convert the byte array to an actual string:
string text = System.Text.Encoding.Default.GetString(byteArray);
And finally create the text file and write the content:
using (StreamWriter sw = new StreamWriter("C:\\responseGot.txt"))
{
    sw.WriteLine(text);
}

Related

How to encode a csv file to UTF-8 using C# and asp.net core?

I have a CSV file which contains Latin characters (character codes > 127). The file can be uploaded with any encoding and shows the right data after uploading, but it gets converted automatically to UTF-8 after operations are performed on it.
After that conversion to UTF-8, I no longer see the same characters.
I believe that if I upload files with UTF-8 encoding only, I will see the same characters that were present when the file was uploaded. So I want to encode the file as UTF-8.
I get an IFormFile from the function. I tried these methods to change the encoding, but they do not affect the file in any way.
First method
//'file' is the IFormFile
string[] filecontent;
StreamReader sr = new StreamReader(file.FileName);
string data = sr.ReadLine();
filecontent = data.Split(",");
File.WriteAllLines(file.FileName, filecontent, Encoding.UTF8);
Second method
var fileStream2 = File.OpenWrite(file.FileName);
var sw = new StreamWriter(fileStream2, Encoding.UTF8, 1024, false);
sw.Write(fileStream2);
sw.Close();
Is there any other way to do this, or is there a library that can encode the CSV file as UTF-8 directly?

Decode Stream to CSV in Python by Byte (Translate from C# code)

I am trying to consume a streamed response in Python from a SOAP API and output a CSV file. The response is a base64-encoded string, which I do not know what to do with. The API documentation also says that the response must be read into a destination buffer by buffer.
Here is the C# code provided by the API's documentation:
byte[] buffer = new byte[4000];
bool endOfStream = false;
int bytesRead = 0;
using (FileStream localFileStream = new FileStream(destinationPath, FileMode.Create, FileAccess.Write))
{
    using (Stream remoteStream = client.DownloadFile(jobId))
    {
        while (!endOfStream)
        {
            bytesRead = remoteStream.Read(buffer, 0, buffer.Length);
            if (bytesRead > 0)
            {
                localFileStream.Write(buffer, 0, bytesRead);
                totalBytes += bytesRead;
            }
            else
            {
                endOfStream = true;
            }
        }
    }
}
I have tried many different things to get this stream into a readable CSV file, but none have worked.
with open('test.csv', 'w') as f: f.write(FileString)
Returns a csv with the base64 string spread over multiple lines
Here is my latest attempt:
with open('csvfile13.csv', 'wb') as csvfile:
    FileString = client.service.DownloadFile(yyy.JobId, False)
    stream = io.BytesIO(str(FileString))
    with open(stream,"rt",4000) as readstream:
        csvfile.write(readstream)
This produces the error:
TypeError: coercing to Unicode: need string or buffer, _io.BytesIO
Any help would be greatly appreciated, even if it is just to point me in the right direction. I will be sure to award the points to whoever is the most helpful, even if I do not completely solve the issue!
I have asked several questions similar to this one, but I have yet to find an answer that works completely:
What is the Python equivalent to FileStream in C#?
Write Streamed Response(file-like object) to CSV file Byte by Byte in Python
How to replicate C# 'byte' and 'Write' in Python
Let me know if you need further clarification!
Update:
I have tried print(base64.b64decode(str(FileString)))
This gives me a page full of webdings like
]�P�O�J��Y��KW �
I have also tried
for data in client.service.DownloadFile(yyy.JobId, False):
    print data
But this just loops through the output character by character, like any other string.
I have also managed to get a long string of bytes like \xbc\x97_D\xfb (not the actual bytes, just a similar format) by decoding the entire string, but I do not know how to make this readable.
Edit: Corrected the output of the sample python, added more example code, formatting
It sounds like you need to use the base64 module to decode the downloaded data.
It might be as simple as:
with open(destinationPath, 'wb') as localFile:  # binary mode: the decoded data is raw bytes
    remoteFile = client.service.DownloadFile(yyy.JobId, False)
    remoteData = str(remoteFile).decode('base64')
    localFile.write(remoteData)
I suggest you break the problem down and determine what data you have at each stage. For example what exactly are you getting back from client.service.DownloadFile?
Decoding your sample downloaded data (given in the comments):
'UEsYAItH7brgsgPutAG\AoAYYAYa='.decode('base64')
gives
'PK\x18\x00\x8bG\xed\xba\xe0\xb2\x03\xee\xb4\x01\x80\xa0\x06\x18\x01\x86'
This looks suspiciously like a ZIP file header. I suggest you rename the file .zip and open it as such to investigate.
If remoteData is a ZIP something like the following should extract and write your CSV.
import io
import zipfile
remoteFile = client.service.DownloadFile(yyy.JobId, False)
remoteData = str(remoteFile).decode('base64')
zipStream = io.BytesIO(remoteData)
z = zipfile.ZipFile(zipStream, 'r')
csvData = z.read(z.infolist()[0])
with open(destinationPath, 'wb') as localFile:  # binary mode: csvData is raw bytes
    localFile.write(csvData)
Note: BASE64 can have some variations regarding padding and alternate character mapping but once you can see the data it should be reasonably clear what you need. Of course carefully read the documentation on your SOAP interface.
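The snippets above are Python 2 (`str.decode('base64')` does not exist in Python 3). As a rough Python 3 sketch of the same decode-then-unzip idea: the helper names `extract_first_member` and `_make_sample` are invented for illustration, and the sample archive is built in memory so the example runs without the SOAP service.

```python
import base64
import io
import zipfile


def extract_first_member(b64_text: str) -> bytes:
    """Decode base64 text and return the first file stored in the ZIP."""
    raw = base64.b64decode(b64_text)
    # ZIP archives start with the bytes "PK"; fail early if that is missing.
    if not raw.startswith(b"PK"):
        raise ValueError("decoded data does not look like a ZIP archive")
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        first = archive.infolist()[0]
        return archive.read(first)


def _make_sample() -> str:
    """Build a small base64-encoded ZIP in memory to stand in for the API."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("report.csv", "id,name\n1,Ada\n")
    return base64.b64encode(buf.getvalue()).decode("ascii")


csv_bytes = extract_first_member(_make_sample())
print(csv_bytes.decode("utf-8"))
```

In real use, the string returned by the service would take the place of `_make_sample()`, and `csv_bytes` would be written to disk with a file opened in `'wb'` mode.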
Are you sure FileString is a Base64 string? Based on the source code here, suds.sax.text.Text is a subclass of Unicode. You can write this to a file as you would a normal string but whatever you use to read the data from the file may corrupt it unless it's UTF-8-encoded.
You can try writing your Text object to a UTF-8-encoded file using io.open:
import io
with io.open('/path/to/my/file.txt', 'w', encoding='utf_8') as f:
    f.write(FileString)
Bear in mind, your console or text editor may have trouble displaying non-ASCII characters but that doesn't mean they're not encoded properly. Another way to inspect them is to open the file back up in the Python interactive shell:
import io
with io.open('/path/to/my/file.txt', 'r', encoding='utf_8') as f:
    next(f) # displays the representation of the first line of the file as a Unicode object
In Python 3, you can even use the built-in csv to parse the file, however in Python 2, you'll need to pip install backports.csv because the built-in module doesn't work with Unicode objects:
from backports import csv
import io
with io.open('/path/to/my/file.txt', 'r', encoding='utf_8') as f:
    r = csv.reader(f)
    next(r) # displays the representation of the first line of the file as a list of Unicode objects (each value separated)

Saving a string to a txt file on an FTP server

I am trying to save a string containing JSON to a .txt file on an FTP server.
I tried using this example http://msdn.microsoft.com/en-us/library/ms229715.aspx which worked great.
But this example takes an existing local .txt file and uploads it to the FTP server.
I would like to create or update a .txt file on the FTP server directly from a string variable, without first having to create the .txt file locally on my PC.
Your example link is exactly what you need, but you need to get your information from a MemoryStream instead of an existing file.
You can turn a string directly into a Stream with this:
MemoryStream memStr = new MemoryStream(Encoding.UTF8.GetBytes("asdf"));
However, you can shortcut this more by directly turning your string into a byte array, avoiding the need to make a Stream altogether:
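// Note: ASCII turns any non-ASCII character into '?'; use Encoding.UTF8 if the string may contain such characters
System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
Byte[] bytes = encoding.GetBytes(yourString);
//and now plug that into your example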
Stream requestStream = request.GetRequestStream();
requestStream.Write(bytes, 0, bytes.Length);
requestStream.Close();
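The same idea (encode the string to bytes in memory and hand those bytes to the upload call) looks like this as a Python sketch using the standard `ftplib` module. The helper names `text_to_stream` and `upload_text`, the host name, and the credentials are all placeholders, not part of the original answer.

```python
import io
from ftplib import FTP


def text_to_stream(text: str) -> io.BytesIO:
    """Encode the string as UTF-8 and wrap the bytes in an in-memory stream."""
    return io.BytesIO(text.encode("utf-8"))


def upload_text(ftp: FTP, remote_name: str, text: str) -> None:
    """Store `text` as `remote_name` on the server without a local file."""
    ftp.storbinary(f"STOR {remote_name}", text_to_stream(text))


# Hypothetical usage; host, user and password are placeholders:
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     upload_text(ftp, "data.txt", '{"key": "value"}')
```

No temporary file is involved: the only copy of the data on the client side is the byte buffer inside the `BytesIO` object.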

Compressing a file from memory with SevenZipSharp, strange mistakes

I downloaded the SevenZipSharp library in order to compress some files.
I used this to compress a file:
var libPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles), "7-zip", "7z.dll");
SevenZip.SevenZipCompressor.SetLibraryPath(libPath);
SevenZip.SevenZipCompressor compressor = new SevenZipCompressor();
compressor.CompressFiles(@"C:\myTestFile.mdf", new string[] { @"C:\myTestFileCompressed.7z" });
With this, my file is compressed without problems, and I can decompress it.
Now I would like to compress the same file, but instead of compressing the file directly, I would like to:
Read the file into a string. Yes, into a string, and not into a byte[].
Convert my string to a byte[].
Compress the byte[] to another byte[].
Here is my attempt:
string strToCompress = File.ReadAllText(@"C:\myTestFile.mdf");
SevenZipCompressor compressor = new SevenZipCompressor();
byte[] byteArrayToCompress = Encoding.ASCII.GetBytes(strToCompress);
MemoryStream stream = new MemoryStream(byteArrayToCompress);
MemoryStream streamOut = new MemoryStream();
compressor.CompressStream(stream, streamOut);
string strcompressed = Encoding.ASCII.GetString(streamOut.ToArray());
File.WriteAllText(@"C:\myfileCompressed.7z",strcompressed);
My problem is very simple:
If I compare the sizes produced by these two methods, they are 3 603 443 bytes vs 3 604 081 bytes.
In addition, I cannot decompress the file produced by the second method.
Maybe it's because I used ASCII encoding, but the file I want to compress is not text; it's a binary file.
Could anyone explain how to solve this, please? I need to read my file into a string and compress it. (I don't want to read the file directly into a byte[].)
Thanks a lot,
Best regards,
Nixeus
You cannot put binary data into a string: not every byte value has a Unicode code point. Using ASCII encoding will likewise always cause irretrievable data loss, because it only has characters for byte values 0 through 127; higher values produce a '?'.
You certainly can convert a byte[] to a string, but it needs to be encoded. The standard encoding for that is available in .NET through the Convert.ToBase64String() method, and you recover the byte[] again with Convert.FromBase64String(). Inevitably it won't be as compact: the string has about 4/3 as many characters as the original byte[] has bytes.
You can never produce a valid .7z archive that way; the format of course uses the most compact possible storage, and that is bytes. You must pass a FileStream to the CompressStream() method.
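Both claims (ASCII loses data, Base64 does not but costs about a third more space) can be checked with a small Python sketch. It illustrates the principle and is not SevenZipSharp code. Python's ASCII decoder substitutes U+FFFD where .NET substitutes '?', but the loss is the same.

```python
import base64

# Every possible byte value once, as a stand-in for arbitrary binary data.
binary = bytes(range(256))

# Forcing the bytes through ASCII: values above 127 cannot be represented,
# so they are replaced and the original values are gone for good.
as_ascii_text = binary.decode("ascii", errors="replace")
lossy = as_ascii_text.encode("ascii", errors="replace")

# Base64 maps every 3 bytes to 4 ASCII characters and is fully reversible.
encoded = base64.b64encode(binary).decode("ascii")
restored = base64.b64decode(encoded)

print(lossy == binary, restored == binary, len(binary), len(encoded))
```

The Base64 text is safe to hold in a string, but a .7z archive still has to be written to disk as the raw bytes, not as that text.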

How to convert this to an image to store in SQL

I am trying to upload a form with a file input, reading the form fields only from the input stream and not through Request.Files.
The form contains a file input and a couple of text boxes, so I can't just upload the stream to the database.
I convert it to a string using this method:
int len = Convert.ToInt32(context.Request.InputStream.Length);
byte[] stream = new byte[len];
context.Request.InputStream.Read(stream, 0, len);
string mime = Encoding.UTF8.GetString(stream);
and then split the multipart/form-data at the boundaries and read the first line of each part to see whether it is a file or not. You can see the full code here.
A file will look something like this
-----------------------------17901701330412
Content-Disposition: form-data; name="file"; filename="IMG00004-20101209-1704.jpg"
Content-Type: image/jpeg
�����ExifII* ����(1�2�i��Research In MotionBlackBerry 9105HHRim Exif Version1.00a2010:12:09 17:03:59 ��n�0220�v���� � ��� �2010:12:09 17:03:59��� $ &&$ #"(-:1(+6+"#2D36;=#A#'0GLF?K:?#> >)#)>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>��!������ }!1AQa"q2���#B��R��$3br� %&'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz������������������������������������������������������������������������� w!1AQaq"2�B���� #3R�br� $4�%�&'()*56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz�������������������������������������������������������������������������� ?�`��ⳓ5��f)¤b��a�BS�Sb�)Xz�֝�q�"s�K�PA���}F7&��Vm��GӬ��%]� Uҵ�Z7��h�`�#&i ��i��MKB�P��r���-�B|ϛ=3Yٶ ��
and a field will look something like this
-----------------------------17901701330412
Content-Disposition: form-data; name="parent"
clientphotos
Parsing the field is easy, and getting the image content is easy, but saving it to the database so that I can read it back out as an image isn't so easy.
I have tried byte[] data = Encoding.UTF8.GetBytes(rawdata); but the output isn't correct.
Does anyone have any ideas on how to take the image content and save it to a byte[] as it should be?
UPDATE
The first line is from getting the image from context.Request.Files[name].InputStream.Read(data, 0, fs);
The second line is using Encoding.UTF8.GetBytes(rawdata);
The third line is using Encoding.ASCII.GetBytes(rawdata);
Obviously the first line is correct and works.
For now I'm just using the first line to get the result, and it will probably stay that way unless someone can teach me how to read it from the input stream.
UPDATE
Found a nice place to share the code: Source Code. The trouble is on line 49, which for now just reads Request.Files.
You shouldn't store the image as text; store it in its raw binary form. If you HAVE to store binary data as text, you will need to encode it with Base64, but it's going to be a lot bigger than it needs to be.
http://www.vbforums.com/showthread.php?t=287324
byte[] encData_byte = new byte[data.Length];
encData_byte = System.Text.Encoding.UTF8.GetBytes(data);
string encodedData = Convert.ToBase64String(encData_byte);
return encodedData;
You will also need to do the reverse to get it back into a usable image format. A much better solution would be to store the binary image in a blob/binary field type in the database.
EDITED: I just reread your post and I see your data is already in base64. Why not just store that in the database?
EDITED AGAIN: The initial question has been edited a number of times. I believe I answered the original question, but it has changed through several iterations and my initial answer is now less applicable.
Don't get the content as a string. Ideally, you should have a varbinary(max) column to store this data (there's also a file storage option that I haven't tried before in SQL Server 2008). Here's how you'd read the data into a byte[] to store in SQL Server:
var file = Request.Files[0];
var buffer = new byte[file.ContentLength];
using (var stream = file.InputStream)
{
    var bytesRead = 0;
    while (bytesRead < file.ContentLength)
    {
        bytesRead += stream.Read(
            buffer, bytesRead, file.ContentLength - bytesRead);
    }
}
As far as Opera goes, I'm not familiar with the issue you brought up. Maybe you could post a link explaining what you mean by "Opera has to upload in base64 using FileReader." If you do still have to process a Base64 string, here's how you'd convert it to a byte[]:
using (var reader = new StreamReader(context.Request.InputStream))
{
    var base64Str = reader.ReadToEnd();
    var bytes = Convert.FromBase64String(base64Str);
}
