So, the scenario is this:
A user uploads a file, my code turns this file into an array of bytes, and then the array is passed to an external API. This part works fine.
The problem is that this file contains special characters like æ,ø,å, and when the byte[] is converted into characters again, these characters are replaced by "?".
public void UploadFile(HttpPostedFileBase file)
{
    var binary = new byte[file.ContentLength];
    file.InputStream.Read(binary, 0, file.ContentLength);
    var result = API.UploadDocument(binary); // Passes the file to the external API
}
Can I add some encoding info to the byte array or the InputStream, or is it the API's responsibility to make sure that the text is properly decoded when converting the byte[] back to characters?
It is the API's responsibility to convert the byte array to characters properly. If you have access to the API code, then you should add an encoding parameter to the UploadDocument method.
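To illustrate why the decoding side matters, here is a minimal Node.js sketch. It is not the external API's code, which the question does not show, and it assumes the uploaded file was saved as UTF-8. The bytes are the same in both cases; only the charset chosen when decoding changes the result.

```javascript
// Bytes of "æøå" as they would appear in a UTF-8 encoded file (assumption).
const original = "æøå";
const bytes = Buffer.from(original, "utf8");

// Decoding with the charset the file was written in restores the text.
const decodedCorrectly = bytes.toString("utf8");

// Decoding the same bytes with a different charset (Latin-1 here) garbles it.
const decodedWrongly = bytes.toString("latin1");

console.log(decodedCorrectly === original); // true
console.log(decodedWrongly === original);   // false
```

The byte array itself carries no encoding information, so the receiver has to be told (or has to assume) which charset to use.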
I need to change the encoding of a text file from UTF-8 to ASCII programmatically in my Windows Store app project (C#). On WinRT/Win8.1 we can do this manually by opening the file in Notepad and choosing "Save as", but my question is how to do it in code (C#).
[EDIT]
In WinRT, we can use FileIO.WriteLinesAsync() or FileIO.WriteTextAsync() to save a string to a text file, but we can only specify a UnicodeEncoding value as the encoding. So the SDK is quite different compared to the full-fledged .NET SDK.
[EDIT]
I know ASCII is a subset of UTF-8, but I really need to make sure the file encoding is ASCII, because I want to upload the file to a web site that only accepts ASCII-encoded txt files (UTF-8/Unicode encoding causes it to complain about a file format error).
[EDIT]
Problem solved:
public async void SaveStringToAnsiFile()
{
    StorageFile file = await ApplicationData.Current.LocalFolder.CreateFileAsync("test.txt", CreationCollisionOption.ReplaceExisting);
    await Windows.Storage.FileIO.WriteBytesAsync(file, Encoding.GetEncoding("gb2312").GetBytes("abcd→1234"));
}
Since ASCII isn't directly supported, you'll need to convert the text to a byte array and use something like WriteBytesAsync (reference). Here's a simple technique. Of course, non-ASCII characters won't work (but that's not what you need anyway).
string str = "these are characters";
byte[] bytes = new byte[str.Length];
for (var i = 0; i < str.Length; i++)
{
    bytes[i] = Convert.ToByte(str[i]);
}
// create the file here ... then ...
await Windows.Storage.FileIO.WriteBytesAsync(file, bytes);
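One caveat with a per-character conversion like this: as far as I know, Convert.ToByte only rejects characters above 255, so characters in the 128 to 255 range would pass through without complaint. If the target really must be pure ASCII, an explicit range check is safer. Below is a sketch of that check in JavaScript (Node.js); the helper name toAsciiBytes is made up for illustration and is not part of any SDK.

```javascript
// Hypothetical helper: converts a string to ASCII bytes and
// rejects any character outside the 0-127 range instead of guessing.
function toAsciiBytes(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 127) {
      throw new RangeError(`Non-ASCII character at index ${i}`);
    }
    bytes[i] = code;
  }
  return bytes;
}

console.log(Array.from(toAsciiBytes("abc"))); // [ 97, 98, 99 ]
```

Failing loudly on the first non-ASCII character makes it obvious when the input cannot be represented, rather than producing a file the web site rejects later.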
I am trying to use the FileReader to obtain the base-64 representation of an image and submit that to a .net WebApi service for image uploading.
My problem is that the contents of fileReader.result are not valid as a base-64 encoded image, at least according to .net.
I am just using a very simple method, and testing with fiddler to post to the service. If I post the complete result string from filereader.result, I get an error "Invalid length for a Base-64 char array or string" when I try and read the string using FromBase64String.
public void Post([FromBody]string imgString)
{
    var myString = imgString.Split(new char[]{','});
    byte[] bytes = Convert.FromBase64String(myString[1]);
    using (MemoryStream ms = new MemoryStream(bytes))
    {
        Image image = Image.FromStream(ms);
        image.Save("myBlargyImage.jpg");
    }
}
Is cut+paste into fiddler doing something to the string that I need to account for here, or is there something else I am not doing correctly? This seems like it should be straightforward: Encode the image as a string, send the string, decode the string, save the image.
For example, using filereader to display a preview of the image on the client, I get the following in filereader.result:
src="...oBUA00AqYL/AMCg3//Z"
I have tried both sending the entire string ("data...Z"), and just the Base64 string. Currently, I am splitting the string server side to get the Base64 string. Doing this, I always get the invalid length error.
Alternatively, I have tried sending just the base64 string. Not knowing if the leading / was actually part of the string or not, I deleted it in the post body. Doing THIS, I can actually read the value into a byte array, but then I get an error using Image.FromStream that the array is not a valid image.
So, either I get an error that the entire string as provided by filereader is an invalid length, or, I hack it up and get an error that even if it is a valid length, it is not a valid image (when reading the bytearray). That is what makes me wonder if there is some issue of translation or formatting between the filereader.read, dev tools in chrome, then cutting and pasting into fiddler.
UPDATE:
I tried a more realistic implementation by just taking the filereader.result and putting it in a $.post() call, and it works as expected.
It appears I was right, that I, or notepad++, or fiddler, are doing something to the data when I touch it to cut and paste filereader.result into a service call.
If someone knows exactly what that might be, or how one can verify they are sending a valid base-64 encoding of an image to a service, it might help others who are attempting the same thing in the future.
Again, if in the browser filereader.result yielded '', I was simply copying that string from the developer tools panel, creating a fiddler call and in the request body including the copied string: "=". Somehow, the base-64 'somestring' was getting munched in the cut+paste.
function readURL(input) {
    if (input.files && input.files[0]) {
        var reader = new FileReader();
        reader.onload = function (e) {
            $('#imgPreview').attr('src', e.target.result);
            $.post('/api/testy/t/4',
                {'':e.target.result}
            );
        };
        reader.readAsDataURL(input.files[0]);
        reader.onloadend = function (e) {
            console.log(e.target.result);
        };
    }
}
$("#imgUpload").change(function () {
    readURL(this);
});
Don't forget to remove the 'noise' from a data URL.
For example, in
data:image/png;base64,_DATA_HERE
you have to remove the data:image/png;base64, part first, so that you process only the Base64 portion.
With js, it would be
var b64 = dataUrl.split("base64,")[1];
Hope this helps. Cheers
A data URI is not a Base64-encoded string; it may contain a Base64-encoded string at the end. In this case it does, so you need to send only the Base64-encoded part.
var imagestr = datauri.split(',')[1];
sendToWebService(imagestr);
Make sure fiddler is not truncating the Base 64 String
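As for verifying what is being sent, a rough sanity check can be run on the client before posting. The sketch below is plain JavaScript; the helper names extractBase64 and looksLikeBase64 and the sample data URL are made up for illustration. It only checks the shape of the string (standard alphabet, padding at the end, length a multiple of 4), not whether the decoded bytes form a valid image.

```javascript
// Hypothetical helper: returns the Base64 payload of a data URL,
// or null if the URL has no ";base64," marker.
function extractBase64(dataUrl) {
  const marker = ";base64,";
  const index = dataUrl.indexOf(marker);
  return index === -1 ? null : dataUrl.slice(index + marker.length);
}

// Hypothetical helper: checks alphabet, trailing padding and length only.
function looksLikeBase64(text) {
  return text.length % 4 === 0 && /^[A-Za-z0-9+/]*={0,2}$/.test(text);
}

const sample = "data:text/plain;base64,aGVsbG8="; // "hello", not an image
const payload = extractBase64(sample);

console.log(payload);                  // aGVsbG8=
console.log(looksLikeBase64(payload)); // true
console.log(looksLikeBase64(sample));  // false, the prefix is still attached
```

A string that was truncated or had characters altered while being copied between tools will often fail the length or alphabet check, which matches the "Invalid length" error reported on the .NET side.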
I downloaded the SevenZipSharp library in order to compress some files.
I used this to compress a file:
var libPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles), "7-zip", "7z.dll");
SevenZip.SevenZipCompressor.SetLibraryPath(libPath);
SevenZip.SevenZipCompressor compressor = new SevenZipCompressor();
compressor.CompressFiles(@"C:\myTestFile.mdf", new string[] { @"C:\myTestFileCompressed.7z" });
With this, my file is compressed without problems, and I can decompress it.
Now I would like to compress the same file, but instead of compressing the file directly, I would like to:
Read the file into a string. Yes, into a string, and not into a byte[].
Convert my string to a byte[].
Compress the byte[] to another byte[].
Here is my attempt:
string strToCompress = File.ReadAllText(@"C:\myTestFile.mdf");
SevenZipCompressor compressor = new SevenZipCompressor();
byte[] byteArrayToCompress = Encoding.ASCII.GetBytes(strToCompress);
MemoryStream stream = new MemoryStream(byteArrayToCompress);
MemoryStream streamOut = new MemoryStream();
compressor.CompressStream(stream, streamOut);
string strcompressed = Encoding.ASCII.GetString(streamOut.ToArray());
File.WriteAllText(@"C:\myfileCompressed.7z", strcompressed);
My problem is very simple:
Comparing the sizes produced by these two methods, I get 3 603 443 bytes vs 3 604 081 bytes.
In addition, I cannot decompress the file produced by the second method.
Maybe it's because I used ASCII encoding, but the file to compress is not text, it's a binary file.
Could anyone explain how to solve this, please? I need to read my file into a string and compress it (I don't want to read the file directly into a byte[]).
Thanks a lot,
Best regards,
Nixeus
You cannot put binary data into a string; not every byte value has a Unicode code point. Using ASCII encoding will similarly always cause irretrievable data loss: it only has characters for byte values 0 through 127, and higher values will produce a ?.
You certainly can convert a byte[] to a string, but it needs to be encoded. The standard encoding for that is available in .NET through the Convert.ToBase64String() method. You recover the byte[] again with Convert.FromBase64String(). Inevitably it won't be as compact: it will be about 4/3 the size of the original data in the byte[].
You can never produce a valid .7z archive that way; it of course uses the most compact possible storage, and that is bytes. You must pass a FileStream to the CompressStream() method.
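The difference between the two round trips can be sketched in JavaScript (Node.js) rather than C#. The buffer contents here are arbitrary test values, not the .mdf file from the question. Note that Node's "ascii" decoder clears the high bit instead of emitting "?", so the damage looks different from .NET's, but the original bytes are unrecoverable in both cases.

```javascript
// 300 bytes, including values above 127, standing in for arbitrary binary data.
const data = Buffer.alloc(300);
for (let i = 0; i < data.length; i++) {
  data[i] = i % 256;
}

// Round trip through an ASCII string: values above 127 cannot be restored.
const viaAscii = Buffer.from(data.toString("ascii"), "ascii");

// Round trip through Base64: every byte survives.
const base64Text = data.toString("base64");
const viaBase64 = Buffer.from(base64Text, "base64");

console.log(viaAscii.equals(data));  // false
console.log(viaBase64.equals(data)); // true
console.log(base64Text.length);      // 400, i.e. 4/3 of 300
```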
What I need to do is send a file from Java to C#. Java acts as the client, while C# acts as the server.
The file is loaded in Java through a FileInputStream and converted to UTF-8, then Base64. See the code.
FileInputStream fin=new FileInputStream(fileName);
byte[] content = new byte[fin.available()];
fin.read(content, 0, content.length);
String asString = new String(content, "UTF8");
byte[] newBytes = asString.getBytes("UTF8");
String base64 = Base64.encodeToString(newBytes, Base64.DEFAULT);
The server (written in C#) reads the data sent and converts it back into a file. I'm decoding Base64, then UTF-8, and I'm not sure what the last step should be. What I'm trying to send is video.mp4, 144 KB or less. So far, the output shows the "Wrong format" message from the catch block. See the code.
try
{
    for (int i = 0; i <= _server.Q.NoOfItem - 1; i++)
    {
        words = _server.Q.ElementAtBuffer(i).ToString();
        //textBox1.Text = words;
        byte[] encodedDataAsBytes = System.Convert.FromBase64String(words);
        string returnValue = System.Text.Encoding.UTF8.GetString(encodedDataAsBytes);
        textBox1.Text = returnValue;
    }
}
catch (ArgumentNullException argNull)
{
    textBox1.Text = "Received null value";
}
catch (FormatException FrmtEx)
{
    textBox1.Text = "Wrong format";
}
You can ignore the for (int i = 0; i <= _server.Q.NoOfItem - 1; i++) loop, as this is just how I capture/retrieve the data sent.
P.S.: it works when I just pass a string without loading the file (string >> utf8 >> base64) and receive it (base64 >> utf8 >> string).
The file is loaded in Java through a FileInputStream and converted to UTF-8
Then you've lost data. Video data is not text data, so don't load it as text data. Treat it as binary data - by all means encode it to base64 if you need to represent it as a string somewhere but don't perform any text decoding on it, as that's only meant for encoded text data, which this isn't.
It's really important to understand what's wrong here. The only thing the two lines below can do is lose data. If they don't lose data, they serve no purpose - and if they do lose data, they're clearly a bad idea:
String asString = new String(content, "UTF8");
byte[] newBytes = asString.getBytes("UTF8");
You should analyze how you ended up with this code in the first place... why did you feel the need to convert the byte array to a string and back?
jowierun's answer is also correct - you shouldn't be using available() at all. You might want to use utility methods from Guava, such as Files.toByteArray if you definitely need to read the whole file into memory in one go.
P.S.: it works when I just pass a string without loading the file (string >> utf8 >> base64) and receive it (base64 >> utf8 >> string).
Well yes - if you start with text data, then that's fine - UTF-8 can represent every valid string, and base64 is lossless, so you're fine. (Admittedly you could break it by presenting an invalid string with half of a surrogate pair, but...) The problem is at the point where you treat non-text data as text in the first place.
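The effect of treating non-text data as text can be reproduced in a few lines. This is a JavaScript (Node.js) sketch rather than the Java from the question, and the six sample bytes are arbitrary values chosen because they are not valid UTF-8; they stand in for the video content.

```javascript
// Arbitrary bytes that are not valid UTF-8, standing in for binary file content.
const binary = Buffer.from([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10]);

// Same idea as new String(content, "UTF8") followed by getBytes("UTF8"):
// invalid sequences are replaced during decoding and cannot be recovered.
const asText = binary.toString("utf8");
const reencoded = Buffer.from(asText, "utf8");

// Encoding the untouched bytes as Base64 and decoding again keeps everything.
const restored = Buffer.from(binary.toString("base64"), "base64");

console.log(reencoded.equals(binary)); // false
console.log(restored.equals(binary));  // true
```

Dropping the text decode and re-encode step and passing the raw bytes straight to the Base64 encoder is what makes the transfer lossless.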
You shouldn't use fin.available() to assume you can read the file in one go. That is likely to work for only small files. Instead you need to do the read in a loop and collect all the contents together before you encode it.
It would make sense (on the Java side at least) to have a decode routine that you can use to TEST that your encode is working (a unit test, perhaps?). You will probably find that the test fails consistently with the same problem you are getting.
I want to output a Byte[] array to a string so I can send it along a HTTPRequest. Can it be done? And will the server pick up the data and create a file from it? Or does some special encoding need to be done?
The file is an image. At the moment I have:
Byte[] fBuff = File.ReadAllBytes("C:/pic.jpeg");
I need to take what's in fBuff and output it to send along a post request.
Use the Convert.ToBase64String method
Byte[] fBuff = File.ReadAllBytes("C:/pic.jpeg");
String base64 = Convert.ToBase64String(fBuff);
This way the string will be as compact as possible, and it is more or less the "standard" way of writing bytes to a string and back to bytes.
To convert back to bytes use Convert.FromBase64String:
String base64 = ""; // get the string
Byte[] fBuff = Convert.FromBase64String(base64);
You could just create a String where each byte is a character of the String. If you apply the reverse procedure at the receiver, you will not have any problems (I have done something similar, but in Java).
Convert.ToBase64String looks like your best option to store the bytes in a transmittable array, you should look into these functions.
If you are sending just the file, you can use the UploadFile method of the WebClient class:
using (WebClient client = new WebClient()) {
    client.UploadFile("http://site.com/ThePage.aspx", @"C:\pic.jpeg");
}
This will post the file as a regular file upload, just as from a web page with a file input. On the receiving server the file comes in the Request.Files collection.
Is there any reason not to use the WebClient UploadFile method?