What I need to do is send a file from Java to C#. Java acts as the client and C# acts as the server.
The file is loaded in Java through a FileInputStream, converted to UTF-8 and then to Base64. See the code:
FileInputStream fin=new FileInputStream(fileName);
byte[] content = new byte[fin.available()];
fin.read(content, 0, content.length);
String asString = new String(content, "UTF8");
byte[] newBytes = asString.getBytes("UTF8");
String base64 = Base64.encodeToString(newBytes, Base64.DEFAULT);
The server (written in C#) reads the data sent and should convert it back into a file. I decode the Base64, then convert to UTF-8, and I am not sure what to do after that. What I'm trying to send is video.mp4, 144 KB or less. So far, the output hits the catch block that prints "Wrong format". See the code:
try
{
    for (int i = 0; i <= _server.Q.NoOfItem - 1; i++)
    {
        words = _server.Q.ElementAtBuffer(i).ToString();
        //textBox1.Text = words;
        byte[] encodedDataAsBytes = System.Convert.FromBase64String(words);
        string returnValue = System.Text.Encoding.UTF8.GetString(encodedDataAsBytes);
        textBox1.Text = returnValue;
    }
}
catch (ArgumentNullException argNull)
{
    textBox1.Text = "Received null value";
}
catch (FormatException FrmtEx)
{
    textBox1.Text = "Wrong format";
}
You can ignore the for (int i = 0; i <= _server.Q.NoOfItem - 1; i++) loop, as this is just the way I capture/retrieve the data sent.
P.S.: It works when I just pass a string without loading the file (string >> UTF-8 >> Base64) and receive it (Base64 >> UTF-8 >> string).
"the file is loaded in java through fileinputstream and its been converted to utf8"
Then you've lost data. Video data is not text data, so don't load it as text data. Treat it as binary data - by all means encode it to base64 if you need to represent it as a string somewhere but don't perform any text decoding on it, as that's only meant for encoded text data, which this isn't.
It's really important to understand what's wrong here. The only thing the two lines below can do is lose data. If they don't lose data, they serve no purpose - and if they do lose data, they're clearly a bad idea:
String asString = new String(content, "UTF8");
byte[] newBytes = asString.getBytes("UTF8");
You should analyze how you ended up with this code in the first place... why did you feel the need to convert the byte array to a string and back?
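To see the loss concretely, here is a minimal sketch in Python (chosen only for brevity; the question's code is Java). The byte values are a made-up stand-in for the start of a binary file, and the replacement behaviour shown is Python's `errors="replace"` mode, which mirrors what Java's `new String(bytes, "UTF8")` does with malformed input as far as I know.

```python
import base64

# Hypothetical stand-in for a few bytes of a binary file; 0xFF, 0xFE and a
# lone 0x80 are not valid UTF-8 sequences.
binary = bytes([0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70, 0xFF, 0xFE, 0x80])

# Text round trip: each invalid byte is replaced with U+FFFD, so the bytes
# that come back out are different from the bytes that went in.
as_text = binary.decode("utf-8", errors="replace")
text_round_trip = as_text.encode("utf-8")

# Base64 round trip: works on the raw bytes and is lossless.
b64_round_trip = base64.b64decode(base64.b64encode(binary))

print(text_round_trip == binary)  # False
print(b64_round_trip == binary)   # True
```

So the fix is simply to base64-encode the bytes as read from the file, with no string step in between.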
jowierun's answer is also correct - you shouldn't be using available() at all. You might want to use utility methods from Guava, such as Files.toByteArray if you definitely need to read the whole file into memory in one go.
"p/s: it works when im just trying to pass any string without load the file (string >> utf8 >> base64) and to receive (base64 >> utf8 >> string)."
Well yes - if you start with text data, then that's fine - UTF-8 can represent every valid string, and base64 is lossless, so you're fine. (Admittedly you could break it by presenting an invalid string with half of a surrogate pair, but...) The problem is at the point where you treat non-text data as text in the first place.
You shouldn't use fin.available() to assume you can read the file in one go. That is likely to work for only small files. Instead you need to do the read in a loop and collect all the contents together before you encode it.
It would make sense (on the Java side at least) to have a decode routine that you can use to TEST that your encode is working (a unit test, perhaps?). You will probably find that the test fails consistently with the same problem you are getting.
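As a rough illustration of both points (read in a loop, then test the encode against a decode), here is a sketch in Python rather than Java. The helper names `read_all`, `encode_file` and `decode_payload` are made up for this example, and the random 150 KB payload just stands in for the video file.

```python
import base64
import os
import tempfile

def read_all(path, chunk_size=4096):
    """Read a whole file in binary mode, one chunk at a time."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            chunks.append(chunk)
    return b"".join(chunks)

def encode_file(path):
    """Return the file contents as a base64 string (no text decoding step)."""
    return base64.b64encode(read_all(path)).decode("ascii")

def decode_payload(payload):
    """Inverse of encode_file: base64 string back to the original bytes."""
    return base64.b64decode(payload)

# Round-trip check with random binary data standing in for the video.
original = os.urandom(150000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(original)
try:
    round_trip_ok = decode_payload(encode_file(tmp.name)) == original
finally:
    os.remove(tmp.name)

print(round_trip_ok)  # True
```

If a check like this passes on the sending side and the receiver still fails, the problem is in transport or in the receiver's decoding rather than in the encode.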
I am trying to consume a streamed response in Python from a SOAP API and output a CSV file. The response is a base64-encoded string, which I do not know what to do with. The API documentation also says that the response must be read into a destination buffer by buffer.
Here is the C# code provided by the API's documentation:
byte[] buffer = new byte[4000];
bool endOfStream = false;
int bytesRead = 0;
using (FileStream localFileStream = new FileStream(destinationPath, FileMode.Create, FileAccess.Write))
{
    using (Stream remoteStream = client.DownloadFile(jobId))
    {
        while (!endOfStream)
        {
            bytesRead = remoteStream.Read(buffer, 0, buffer.Length);
            if (bytesRead > 0)
            {
                localFileStream.Write(buffer, 0, bytesRead);
                totalBytes += bytesRead;
            }
            else
            {
                endOfStream = true;
            }
        }
    }
}
I have tried many different things to get this stream into a readable CSV file, but none have worked.
with open('test.csv', 'w') as f: f.write(FileString)
Returns a csv with the base64 string spread over multiple lines
Here is my latest attempt:
with open('csvfile13.csv', 'wb') as csvfile:
    FileString = client.service.DownloadFile(yyy.JobId, False)
    stream = io.BytesIO(str(FileString))
    with open(stream,"rt",4000) as readstream:
        csvfile.write(readstream)
This produces the error:
TypeError: coercing to Unicode: need string or buffer, _io.BytesIO
Any help would be greatly appreciated, even if it is just to point me in the right direction. I will be sure to award the points to whoever is the most helpful, even if I do not completely solve the issue!
I have asked several questions similar to this one, but I have yet to find an answer that works completely:
What is the Python equivalent to FileStream in C#?
Write Streamed Response(file-like object) to CSV file Byte by Byte in Python
How to replicate C# 'byte' and 'Write' in Python
Let me know if you need further clarification!
Update:
I have tried print(base64.b64decode(str(FileString)))
This gives me a page full of webdings like
]�P�O�J��Y��KW �
I have also tried
for data in client.service.DownloadFile(yyy.JobId, False):
    print data
But this just loops through the output character by character, like any other string.
I have also managed to get a long string of bytes like \xbc\x97_D\xfb(not actual bytes, just similar format) by decoding the entire string, but I do not know how to make this readable.
Edit: Corrected the output of the sample python, added more example code, formatting
It sounds like you need to use the base64 module to decode the downloaded data.
It might be as simple as:
# Python 2: open in binary mode, since the decoded data may not be text
with open(destinationPath, 'wb') as localFile:
    remoteFile = client.service.DownloadFile(yyy.JobId, False)
    remoteData = str(remoteFile).decode('base64')
    localFile.write(remoteData)
I suggest you break the problem down and determine what data you have at each stage. For example what exactly are you getting back from client.service.DownloadFile?
Decoding your sample downloaded data (given in the comments):
'UEsYAItH7brgsgPutAG\AoAYYAYa='.decode('base64')
gives
'PK\x18\x00\x8bG\xed\xba\xe0\xb2\x03\xee\xb4\x01\x80\xa0\x06\x18\x01\x86'
This looks suspiciously like a ZIP file header. I suggest you rename the file .zip and open it as such to investigate.
If remoteData is a ZIP something like the following should extract and write your CSV.
import io
import zipfile
remoteFile = client.service.DownloadFile(yyy.JobId, False)
remoteData = str(remoteFile).decode('base64')
zipStream = io.BytesIO(remoteData)
z = zipfile.ZipFile(zipStream, 'r')
csvData = z.read(z.infolist()[0])
with open(destinationPath, 'wb') as localFile:
    localFile.write(csvData)
Note: BASE64 can have some variations regarding padding and alternate character mapping but once you can see the data it should be reasonably clear what you need. Of course carefully read the documentation on your SOAP interface.
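To make the note about variations a bit more concrete, here is a hedged sketch of a tolerant decoder. `decode_lenient` and `looks_like_zip` are hypothetical helper names, not part of any SOAP client, and whether your service actually uses the URL-safe alphabet or drops padding is something only its documentation can confirm.

```python
import base64

def decode_lenient(data):
    """Decode base64 text that may be URL-safe and/or missing '=' padding."""
    # Remove whitespace and line breaks that some transports insert.
    cleaned = "".join(data.split())
    # Map the URL-safe alphabet ('-' and '_') onto the standard one.
    cleaned = cleaned.replace("-", "+").replace("_", "/")
    # Restore padding so the length is a multiple of four.
    cleaned += "=" * (-len(cleaned) % 4)
    return base64.b64decode(cleaned)

def looks_like_zip(blob):
    """ZIP archives start with the two bytes 'PK'."""
    return blob[:2] == b"PK"

sample = base64.b64encode(b"PK\x03\x04hello!").decode("ascii").rstrip("=")
print(looks_like_zip(decode_lenient(sample)))  # True
```

Checking the first bytes of the decoded data like this is a cheap way to find out what you actually received before deciding how to write it out.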
Are you sure FileString is a Base64 string? Based on the source code here, suds.sax.text.Text is a subclass of Unicode. You can write this to a file as you would a normal string but whatever you use to read the data from the file may corrupt it unless it's UTF-8-encoded.
You can try writing your Text object to a UTF-8-encoded file using io.open:
import io
with io.open('/path/to/my/file.txt', 'w', encoding='utf_8') as f:
    f.write(FileString)
Bear in mind, your console or text editor may have trouble displaying non-ASCII characters but that doesn't mean they're not encoded properly. Another way to inspect them is to open the file back up in the Python interactive shell:
import io
with io.open('/path/to/my/file.txt', 'r', encoding='utf_8') as f:
    next(f) # displays the representation of the first line of the file as a Unicode object
In Python 3, you can even use the built-in csv to parse the file, however in Python 2, you'll need to pip install backports.csv because the built-in module doesn't work with Unicode objects:
from backports import csv
import io
with io.open('/path/to/my/file.txt', 'r', encoding='utf_8') as f:
    r = csv.reader(f)
    next(r) # displays the representation of the first line of the file as a list of Unicode objects (each value separated)
I need to change the encoding of some text files from UTF-8 to ASCII programmatically in my Windows Store app project (C#). On WinRT/Win8.1, we can do this manually by opening the file with Notepad and choosing the "Save as" menu, but my question is how to do it in code (C#).
[EDIT]
In WinRT, we can use FileIO.WriteLinesAsync() or FileIO.WriteTextAsync() to save a string to a text file, but we can only specify a UnicodeEncoding value as the encoding. So the SDK is quite different compared to the full-fledged .NET SDK.
[EDIT]
I know ASCII is a subset of UTF-8, but I really need to make sure the file encoding is ASCII, because I want to upload the file to a web site that only accepts ASCII-encoded txt files (UTF-8/Unicode encoding causes it to complain about a file format error!).
[EDIT]
Problem solved:
public async void SaveStringToAnsiFile()
{
StorageFile file = await ApplicationData.Current.LocalFolder.CreateFileAsync("test.txt", CreationCollisionOption.ReplaceExisting);
await Windows.Storage.FileIO.WriteBytesAsync(file, Encoding.GetEncoding("gb2312").GetBytes("abcd→1234"));
}
Since ASCII isn't directly supported, you'll need to convert the text to a byte array and use something like WriteBytesAsync (reference). Here's a simple technique. Of course, non-ascii characters won't work (but that's not what you need anyway).
string str = "these are characters";
byte[] bytes = new byte[str.Length];
for (var i = 0; i < str.Length; i++)
{
bytes[i] = Convert.ToByte(str[i]);
}
// create the file here ... then ...
await Windows.Storage.FileIO.WriteBytesAsync(file, bytes);
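The same idea (turn the text into bytes yourself, and fail loudly on anything outside ASCII) can be sketched in a few lines of Python for illustration. `to_ascii_bytes` is a made-up helper name, and this is not WinRT code; it only shows the check you would want before writing the bytes.

```python
def to_ascii_bytes(text):
    """Return text as ASCII bytes, or raise ValueError naming the first bad character."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError as err:
        bad = text[err.start]
        raise ValueError("non-ASCII character %r at index %d" % (bad, err.start))

print(to_ascii_bytes("these are characters"))
```

A string such as "abcd→1234" would be rejected here, which is usually preferable to silently writing a byte the receiving site cannot interpret.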
I am trying to use the FileReader to obtain the base-64 representation of an image and submit that to a .net WebApi service for image uploading.
My problem is that the contents of fileReader.result are not valid as a base-64 encoded image, at least according to .net.
I am just using a very simple method, and testing with fiddler to post to the service. If I post the complete result string from filereader.result, I get an error "Invalid length for a Base-64 char array or string" when I try and read the string using FromBase64String.
public void Post([FromBody]string imgString)
{
var myString = imgString.Split(new char[]{','});
byte[] bytes = Convert.FromBase64String(myString[1]);
using (MemoryStream ms = new MemoryStream(bytes))
{
Image image = Image.FromStream(ms);
image.Save("myBlargyImage.jpg");
}
}
Is cut+paste into fiddler doing something to the string that I need to account for here, or is there something else I am not doing correctly? This seems like it should be straightforward: Encode the image as a string, send the string, decode the string, save the image.
For example, using filereader to display a preview of the image on the client, I get the following in filereader.result:
src="...oBUA00AqYL/AMCg3//Z"
I have tried both sending the entire string ("data...Z"), and just the Base64 string. Currently, I am splitting the string server side to get the Base64 string. Doing this, I always get the invalid length error.
Alternatively, I have tried sending just the base64 string. Not knowing if the leading / was actually part of the string or not, I deleted it in the post body. Doing THIS, I can actually read the value into a byte array, but then I get an error using Image.FromStream that the array is not a valid image.
So, either I get an error that the entire string as provided by filereader is an invalid length, or, I hack it up and get an error that even if it is a valid length, it is not a valid image (when reading the bytearray). That is what makes me wonder if there is some issue of translation or formatting between the filereader.read, dev tools in chrome, then cutting and pasting into fiddler.
UPDATE:
I tried a more realistic implementation by just taking the filereader.result and putting it in a $.post() call, and it works as expected.
It appears I was right, that I, or notepad++, or fiddler, are doing something to the data when I touch it to cut and paste filereader.result into a service call.
If someone knows exactly what that might be, or how one can verify they are sending a valid base-64 encoding of an image to a service, it might help others who are attempting the same thing in the future.
Again, if in the browser filereader.result yielded '', I was simply copying that string from the developer tools panel, creating a fiddler call and in the request body including the copied string: "=". Somehow, the base-64 'somestring' was getting munched in the cut+paste.
function readURL(input) {
    if (input.files && input.files[0]) {
        reader = new FileReader();
        reader.onload = function (e) {
            $('#imgPreview').attr('src', e.target.result);
            $.post('/api/testy/t/4',
                {'':e.target.result}
            );
        };
        reader.readAsDataURL(input.files[0]);
        reader.onloadend = function (e) {
            console.log(e.target.result);
        };
    }
}
$("#imgUpload").change(function () {
    readURL(this);
});
Don't forget to remove the 'noise' from a dataUrl,
For example in
data:image/png;base64,..._DATA_HERE
you have to remove the data:image/png;base64, part before, so you process only the base 64 portion.
With js, it would be
var b64 = dataUrl.split("base64,")[1];
Hope this helps. Cheers
A data URI is not a base64-encoded string; it may contain a base64-encoded string at the end of it. In this case it does, so you need to send only the base64-encoded part.
var imagestr = datauri.split(',')[1];
sendToWebService(imagestr);
Make sure fiddler is not truncating the Base 64 String
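For anyone wanting to verify what they are actually sending, here is a small sketch of the checks described above, written in Python for illustration only. The function names are hypothetical, and the JPEG test only looks at the first three bytes, so it is a sanity check rather than real image validation.

```python
import base64
import binascii

def base64_from_data_uri(data_uri):
    """Return the payload after the comma of a 'data:...;base64,' URI."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    return payload

def decode_strict(payload):
    """Decode base64, rejecting stray characters and bad padding."""
    try:
        return base64.b64decode(payload, validate=True)
    except binascii.Error as err:
        raise ValueError("invalid base64: %s" % err)

def looks_like_jpeg(blob):
    """JPEG files start with the bytes FF D8 FF."""
    return blob[:3] == b"\xff\xd8\xff"

# Made-up 12-byte stand-in for a JPEG file.
jpeg_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 8
uri = "data:image/jpeg;base64," + base64.b64encode(jpeg_bytes).decode("ascii")
print(looks_like_jpeg(decode_strict(base64_from_data_uri(uri))))  # True
```

A payload that lost a character in a copy and paste would typically fail in `decode_strict` with a padding error, which corresponds to the "Invalid length" error .NET reports.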
I'm wrestling with the following problem.
I am working with V.S.10 and using the .NET framework 2.0. Coding in C#.
I'm making a simple editor which hands over its text to a web service. I know that .NET uses UTF-16 (I believe the default is LE? And I want Big Endian). I want the file to work in any editor and therefore attach a BOM. The problem is that going through HTML it gets changed, I believe to UTF-8? Or at least that is what it seems from the following error:
Client found response content type of 'text/html;
charset=UTF-8', but expected 'text/xml'.
The request failed with an empty response.
EDIT: The documentation warns that the encoding of all the properties is UTF-8 withOUT a BOM marker. editorTextString is one of the properties. BUT the file content to upload must be in UTF-16BE WITH a BOM. I've checked whether .NET automatically translates the encoding and it does not, or at least the Chinese letters become ?'s. So I need to re-encode, or better said convert, the text to UTF-16BE WITH BOM instead of the UTF-8 without BOM that it is in now.
I've looked through a ton of examples and can't see what I'm doing wrong here. Can someone offer advice or correct the code? (Yes I've also read Jon's really cool article about unicode :)) The theory is clear, the actual practice is lacking.
// Convert to UTF-16 Big Endian
Encoding leUnicode = Encoding.Unicode;
Encoding beUnicode = Encoding.BigEndianUnicode;
byte[] editorTextBytesLE = leUnicode.GetBytes(editorTextString);
Console.WriteLine("Little Endian - Encoded bytes:");
foreach (Byte b in editorTextBytesLE)
{
Console.Write("[{0}]", b);
}
Console.WriteLine();
byte[] editorTextBytesBE = Encoding.Convert(leUnicode, beUnicode, editorTextBytesLE);
Console.WriteLine("BIG ENDIAN - Encoded bytes:");
foreach (Byte b in editorTextBytesBE)
{
Console.Write("[{0}]", b);
}
Console.WriteLine();
String decodedString = UnicodeEncoding.BigEndianUnicode.GetString(editorTextBytesBE);
Console.WriteLine();
Console.WriteLine("Decoded bytes:");
Console.WriteLine(decodedString);
// inserting UTF-16BE BOM marker, which eases recognition for any editor
byte[] editorTextBytesToSend = { 0xfe, 0xff };
editorTextBytesToSend.CopyTo(editorTextBytesBE, 2);
File.WriteAllText(fileName, decodedString);
Console.WriteLine("Uploading {0} to {1} ...", fileName, myURL);
// Upload the file to the URL
editorTextBytesBE = myWebClient.UploadFile(myURL, "PUT", fileName);
I haven't been able to find anything to switch to big endian, but I've seen some examples (which I couldn't get working alas) to switch TO UTF-8. Would much appreciate any help, examples, or links to get the code to UTF-16BE.
Partial answer:
The following code does not look like it is inserting anything. Instead it overwrites 2 bytes at positions 2 and 3 with your BOM. It skips the first 2.
// inserting UTF-16BE BOM marker, which eases recognition for any editor
byte[] editorTextBytesToSend = { 0xfe, 0xff };
editorTextBytesToSend.CopyTo(editorTextBytesBE, 2);
To get a file with a BOM in any of the UTF-X encodings, simply create a TextWriter with the correct encoding:
using (var writer =
    new StreamWriter(fileName, false, new UnicodeEncoding(true, true, true)))
{
    writer.Write(editorTextString);
}
Use the UnicodeEncoding constructor that lets you request a BOM (here: big-endian, emit BOM, throw on invalid bytes).
Side note: there is a good chance that your problem is not related to use of this rare encoding, but it should fix what your code tries to do now.
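I managed to get it working with the following code:
byte[] bom = { 0xfe, 0xff };
byte[] textBytes = System.Text.Encoding.BigEndianUnicode.GetBytes(editorTextString);
// Allocate room for BOM + text; copying the BOM to offset 0 of the text bytes would overwrite the first character
byte[] editorTextBytesToSend = new byte[bom.Length + textBytes.Length];
bom.CopyTo(editorTextBytesToSend, 0);
textBytes.CopyTo(editorTextBytesToSend, bom.Length);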
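The byte layout being aimed for (two BOM bytes, followed by the big-endian code units) is easy to check in a short sketch. This one is Python rather than C#, purely to show the expected bytes, and `utf16be_with_bom` is a made-up helper name.

```python
import codecs

def utf16be_with_bom(text):
    """Encode text as UTF-16 big-endian, prefixed with the FE FF byte order mark."""
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")

payload = utf16be_with_bom("A\u4e2d")  # 'A' followed by one Chinese character
print(payload)  # b'\xfe\xff\x00AN-'
```

Decoding the result with a BOM-aware UTF-16 decoder should give back the original text with the BOM removed, which is a quick test that nothing was overwritten.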
Trying to upload a form with a file input, only getting the form fields from the input stream and not the File.Request method.
The form contains a file input and a couple of text boxes, so I can't just upload the stream to the database.
I convert to string using this method
int len = Convert.ToInt32(context.Request.InputStream.Length);
byte[] stream = new byte[len];
context.Request.InputStream.Read(stream, 0, len);
string mime = Encoding.UTF8.GetString(stream);
and then split the multipart/form-data at the boundaries and read the first line of each part to see if its a file or not. You can see the full code Here
A file will look something like this
-----------------------------17901701330412
Content-Disposition: form-data; name="file"; filename="IMG00004-20101209-1704.jpg"
Content-Type: image/jpeg
�����ExifII* ����(1�2�i��Research In MotionBlackBerry 9105HHRim Exif Version1.00a2010:12:09 17:03:59 ��n�0220�v���� � ��� �2010:12:09 17:03:59��� $ &&$ #"(-:1(+6+"#2D36;=#A#'0GLF?K:?#> >)#)>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>��!������ }!1AQa"q2���#B��R��$3br� %&'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz������������������������������������������������������������������������� w!1AQaq"2�B���� #3R�br� $4�%�&'()*56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz�������������������������������������������������������������������������� ?�`��ⳓ5��f)¤b��a�BS�Sb�)Xz�֝�q�"s�K�PA���}F7&��Vm��GӬ��%]� Uҵ�Z7��h�`�#&i ��i��MKB�P��r���-�B|ϛ=3Yٶ ��
and a field will look something like this
-----------------------------17901701330412
Content-Disposition: form-data; name="parent"
clientphotos
Parsing the field is easy, and getting the image content is easy, but saving that into the database so I can read it back out as an image isn't so easy.
I have tried byte[] data = Encoding.UTF8.GetBytes(rawdata); but the output isn't correct.
Does anyone have any ideas how to take the image content and save it to a byte[] as it should be?
UPDATE
The first line is from getting the image from context.Request.Files[name].InputStream.Read(data, 0, fs);
The second line is using Encoding.UTF8.GetBytes(rawdata);
The third line is using Encoding.ASCII.GetBytes(rawdata);
Obviously the first line is correct and works.
For now I'm just using the first line to get the result, and it'll probably stay that way unless someone can teach me how to read it from the input stream.
UPDATE
Found a nice place to share the code Source Code The trouble is on line 49 which for now just reads the Request.Files
You shouldn't store the image as text; store it in its raw binary form. If you HAVE to store binary data as text you will need to encode it with Base64, but it's going to be a lot bigger than it needs to be.
http://www.vbforums.com/showthread.php?t=287324
byte[] encData_byte = new byte[data.Length];
encData_byte = System.Text.Encoding.UTF8.GetBytes(data);
string encodedData = Convert.ToBase64String(encData_byte);
return encodedData;
You will also need to do the reverse to get it back into a usable image format. A much better solution would be to store the binary image in a blob/binary field type in the database.
EDITED: Just reread your post and I see your data is already in base64. Why not just store that in the database?
EDITED AGAIN: The Initial Question has been edited a number of times, I believe I answered the question but now the question has changed through a number of iterations and my initial answer is now less applicable.
Don't get the content as a string. Ideally, you should have a varbinary(max) column to store this data (there's also a file storage option that I haven't tried before in SQL Server 2008). Here's how you'd read the data into a byte[] to store in SQL Server:
var file = Request.Files[0];
var buffer = new byte[file.ContentLength];
using (var stream = file.InputStream)
{
    var bytesRead = 0;
    while (bytesRead < file.ContentLength)
    {
        var n = stream.Read(
            buffer, bytesRead, file.ContentLength - bytesRead);
        if (n == 0) break; // stream ended early; avoid looping forever
        bytesRead += n;
    }
}
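If you do end up parsing the raw request body yourself, the key is to split it as bytes and only decode the header lines as text. Below is a simplified sketch of that idea in Python (not ASP.NET code). `split_multipart` is a hypothetical helper, and it assumes a well-formed body whose boundary never occurs inside the file content.

```python
def split_multipart(body, boundary):
    """Split a multipart/form-data body into (headers_text, content_bytes) pairs."""
    delimiter = b"--" + boundary
    parts = []
    for chunk in body.split(delimiter)[1:]:
        if chunk.startswith(b"--"):  # closing delimiter "--boundary--"
            break
        if chunk.startswith(b"\r\n"):
            chunk = chunk[2:]
        # Headers and content are separated by an empty line.
        headers, _, content = chunk.partition(b"\r\n\r\n")
        if content.endswith(b"\r\n"):  # line break before the next delimiter
            content = content[:-2]
        # Only the headers are text; the content stays as raw bytes.
        parts.append((headers.decode("utf-8", errors="replace"), content))
    return parts

# Made-up example body with one text field and one binary "file".
image = b"\xff\xd8\xff\xe0\x00\x10JFIF\r\n\x00\xfe"
body = (
    b'--X123\r\nContent-Disposition: form-data; name="parent"\r\n\r\n'
    b"clientphotos\r\n"
    b'--X123\r\nContent-Disposition: form-data; name="file"; filename="a.jpg"\r\n'
    b"Content-Type: image/jpeg\r\n\r\n" + image + b"\r\n--X123--\r\n"
)
parts = split_multipart(body, b"X123")
print(parts[1][1] == image)  # True
```

Because the file content never passes through a text decoder, the bytes that come out are identical to the ones uploaded, which is exactly what Encoding.UTF8.GetBytes(rawdata) cannot guarantee.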
As far as Opera goes, I'm not familiar with the issue you brought up. Maybe you could post a link explaining what you mean by "Opera has to upload in base64 using FileReader." If you do still have to process a Base64 string, here's how you'd convert it to a byte[]:
using (var reader = new StreamReader(context.Request.InputStream))
{
var base64Str = reader.ReadToEnd();
var bytes = Convert.FromBase64String(base64Str);
}