C#: Converting byte[] to UTF8 encoded string - c#

I am using a library called EXIFextractor to extract metadata from images. Part of this library relies on System.Drawing.Imaging.PropertyItem to do all the hard work. According to the Microsoft documentation, some of the data in PropertyItem, such as Image Details, is returned as an ASCII string stored in a byte[].
My problem is that international characters (å, ä, ö, etc.) are dropped and replaced by question marks. When I debug the code, it is apparent that the byte[] actually contains UTF-8 encoded text.
I'd like to parse the byte[] as a UTF-8 string. How can I do this without losing any information in the process?
Thanks in advance!
Update:
I have been asked to provide a snippet from my code:
The first snippet is from the class I use, namely the EXIFextractor.cs written by Asim Goheer
foreach( System.Drawing.Imaging.PropertyItem p in parr )
{
string v = "";
// ...
else if( p.Type == 0x2 )
{
// string
v = ascii.GetString(p.Value);
}
And this is my code where I try my best to handle the results of the above.
try {
EXIFextractor exif = new EXIFextractor(ref bmp, "");
object o;
if ((o = exif["Image Description"]) != null)
MediaFile.Description = Tools.UTF8Encode(o.ToString());
I have also tried a couple of other ways of getting my precious å, ä, ö from the data, but nothing seems to do the trick. I am starting to think Hans Passant is right about his conclusions in his answer below.

string yourText = System.Text.Encoding.UTF8.GetString(yourByteArray);

Use the GetString method on the Encoding.UTF8 object.
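
As a rough illustration of why the choice of decoder matters here, the sketch below decodes the same bytes with Encoding.UTF8 and with Encoding.ASCII. The sample bytes and the class and method names are made up for this example; they are not taken from EXIFextractor.

```csharp
using System;
using System.Text;

public static class Utf8DecodeDemo
{
    // Decodes the bytes as UTF-8, so each two-byte sequence becomes one character.
    public static string DecodeUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Decodes the bytes as ASCII; every byte above 0x7F is replaced by '?'.
    public static string DecodeAscii(byte[] bytes)
    {
        return Encoding.ASCII.GetString(bytes);
    }

    public static void Main()
    {
        // Hypothetical sample data: the UTF-8 bytes for å, ä, ö (two bytes each).
        byte[] sample = { 0xC3, 0xA5, 0xC3, 0xA4, 0xC3, 0xB6 };

        Console.WriteLine(DecodeUtf8(sample).Length);  // 3
        Console.WriteLine(DecodeAscii(sample));        // ??????
    }
}
```

If the library hands back only the already decoded string, the question marks cannot be turned back into the original characters, so the decoding has to be changed where the byte[] is still available.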

Yes, this is a problem with the app or camera that originated the image. The EXIF standard has horrible support for text: it has to be encoded in ASCII, which only ever works out well when the photographer speaks English. No doubt the software that encoded the image is ignoring this requirement. The PropertyItem class ignores it as well: it encodes a string to byte[] with Marshal.StringToHGlobalAnsi(), which assumes the system's default code page.
There's no obvious fix for this. You'll get mojibake whenever the photo was made too far away from your machine, that is, on a system whose default code page differs from yours.
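
The mojibake effect described above can be reproduced with a small sketch that writes text with one encoding and reads it back with another. The choice of UTF-8 and ISO-8859-1 is only an assumption for illustration, as is the helper name Transcode; a real image may have been written with any code page.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes text with one encoding and decodes the bytes with another,
    // which is what happens when writer and reader assume different code pages.
    public static string Transcode(string text, Encoding writer, Encoding reader)
    {
        return reader.GetString(writer.GetBytes(text));
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string original = "\u00E5\u00E4\u00F6"; // å, ä, ö

        // Same encoding on both sides: the text survives.
        Console.WriteLine(Transcode(original, Encoding.UTF8, Encoding.UTF8) == original); // True

        // UTF-8 bytes read as Latin-1: each character turns into two wrong ones.
        Console.WriteLine(Transcode(original, Encoding.UTF8, latin1).Length); // 6
    }
}
```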

Maybe you could try another encoding, such as UTF-16 (Encoding.Unicode in .NET)?
If you aren't sure whether it was encoded correctly in the first place, try viewing the EXIF metadata with another EXIF reader.

Related

Parameter Not valid Image Conversion

Code:
Image imgnew = null;
using (var ms1 = new MemoryStream(img))
{
imgnew = Image.FromStream(ms1);
}
I am getting a "Parameter is not valid" error while trying to convert binary data to an image.
I have read a lot of solutions, most of them claiming that the byte data is incorrect, whereas I generated the data with
this site: http://codebeautify.org/base64-to-image-converter
and the bytes represent the correct image.
Thanks
UPDATE:
Sorry for the unclear question earlier; I was running out of time.
I do not have the exact code at hand right now, but these are the steps:
Receiving a string
Converting it to a byte array using Encoding.ASCII.GetBytes(base64String)
and then passing the byte array to the above code
It turned out to be an encoding issue caused by:
Encoding.ASCII.GetBytes(base64String)
I sorted it out by changing it to:
Convert.FromBase64String(base64String)
Hope this might be of some help to someone else.
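
A small sketch of the difference between the two calls, assuming a hypothetical four-byte payload instead of a real image: Encoding.ASCII.GetBytes returns the character codes of the base64 text itself, while Convert.FromBase64String returns the bytes that the text encodes.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Base64DecodeDemo
{
    // Returns the ASCII codes of the base64 characters, not the encoded data.
    public static byte[] WrongWay(string base64)
    {
        return Encoding.ASCII.GetBytes(base64);
    }

    // Returns the original binary data that the base64 text represents.
    public static byte[] RightWay(string base64)
    {
        return Convert.FromBase64String(base64);
    }

    public static void Main()
    {
        // Hypothetical payload: the first four bytes of a PNG file.
        byte[] original = { 0x89, 0x50, 0x4E, 0x47 };
        string base64 = Convert.ToBase64String(original);

        Console.WriteLine(base64);                                   // iVBORw==
        Console.WriteLine(WrongWay(base64).Length);                  // 8
        Console.WriteLine(RightWay(base64).SequenceEqual(original)); // True
    }
}
```

Image.FromStream only understands the second result, which is why the first one fails with "Parameter is not valid".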

Can't figure out where these C# and Java code differ

I have some C# code that converts an image to a base64 string. The code is:
MemoryStream ms = new MemoryStream();
Image img = Image.FromFile(filename);
img.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
string s = Convert.ToBase64String(ms.GetBuffer());
I am trying to implement it in Java. My Java code is:
BufferedImage img = null;
img = ImageIO.read(new File(filename));
byte[] bytes = ((DataBufferByte)img.getData().getDataBuffer()).getData();
String js = Base64.encodeBase64String(bytes);
These two pieces of code should return the same string for the same image file, but they are returning different strings. I am unable to figure out why. Can anyone shed some light on it?
"These two pieces of code should return the same string for the same image file"
No, they really shouldn't.
The C# code is returning the base64 representation of a JPEG-encoded version of the image data - and potentially some 0s at the end, as you're using GetBuffer instead of ToArray. (You want ToArray here.)
The Java code is returning the base64 representation of the raw raster data, according to its SampleModel. I'd expect this to be significantly larger than the string returned by the C# code.
Even if both pieces of code encoded the image with the same format, that doesn't mean they'll come up with the exact same data - it will depend on the encoding.
Importantly, if you just want "the contents of the file in base64" then you don't need to go via an Image at all. For example, in C# you could use:
string base64 = Convert.ToBase64String(File.ReadAllBytes(filename));
The fact that it's an image is irrelevant in that respect - the file is just a collection of bytes, and you can base64-encode that without understanding the meaning of those bytes at all.
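
The GetBuffer versus ToArray point made earlier in this answer can be checked with a minimal sketch. The three sample bytes and the helper names are assumptions for illustration; no image is involved.

```csharp
using System;
using System.IO;

public static class MemoryStreamDemo
{
    // Returns the stream's whole internal buffer, including unused capacity.
    public static byte[] ViaGetBuffer(byte[] data)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(data, 0, data.Length);
            return ms.GetBuffer();
        }
    }

    // Returns a copy of only the bytes that were actually written.
    public static byte[] ViaToArray(byte[] data)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(data, 0, data.Length);
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        byte[] data = { 1, 2, 3 };

        Console.WriteLine(ViaToArray(data).Length);                  // 3
        Console.WriteLine(ViaGetBuffer(data).Length >= data.Length); // True
        Console.WriteLine(Convert.ToBase64String(ViaToArray(data))); // AQID
    }
}
```

Base64-encoding the GetBuffer result starts with the same characters but typically carries on with padding produced by the unused zero bytes.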

Obfuscating/Randomizing a String

How would I go about obfuscating text so it's unreadable to a user reading a text file, but my program could still read it? Basically, I'm going to have something like True*True*False*True*False*False*False*True*False*true* in a text file, and I need it to be all crazy looking.
I know how to get the text from the file and write to the file and all that stuff, I just need to figure out how to obfuscate the string and de-obfuscate it. Is this possible without getting into all crazy encryption stuff? I think AES and other encryption methods are overkill because in my program, this info isn't top secret or something, it can be viewed in the program anyways. I just don't want it edited directly through the file.
Thanks a bunch :D
Nathan
Is this possible without getting into all crazy encryption stuff?
Sure, but if the user even remotely knows what he's doing, he will be able to decode it with no problem.
// Encode
var bytes = Encoding.UTF8.GetBytes("true*false*true");
var base64 = Convert.ToBase64String(bytes);
// Decode
var data = Convert.FromBase64String(base64);
var decodedString = Encoding.UTF8.GetString(data); // decode the bytes back into a string

Converting Binary Data to String [In Persian]

I am working on a system that needs to read a binary file containing certain Persian names/stock instruments. I need to convert the binary data into string to be used in further processes. I have googled it and haven't really found a solution to my problem. Anyone here who has worked in such a scenario or knows how to tackle such a problem?
Here is the code that I am using to convert the bytes to a string (simple as it may be):
byte[] data = binaryReader.ReadBytes(amountOfData);
string symbolRead = Encoding.ASCII.GetString(data);
FYI, I have tried to change my system locale to Persian and that hasn't helped either. Although it does allow me to view already written text in Persian.
Hoping to find a solution.
Thanks.
Don't use ASCII for decoding. First try Encoding.Default after setting your locale; if that doesn't work, ask someone directly which encoding is most commonly used for Persian text, and use that one.
Determine which encoding is used in your file and use the corresponding Encoding object instead of Encoding.ASCII.GetString(...). Possible candidates are Encoding.UTF8.GetString(...) or Encoding.Default.GetString(...), which uses your system encoding. See the documentation of the Encoding class for other possibilities.
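
One way to narrow down the encoding, sketched here under the assumption that UTF-8 is a likely candidate, is to decode with a strict UTF8Encoding that throws on invalid bytes instead of silently substituting characters. The helper name TryDecodeStrictUtf8 and the sample bytes are hypothetical.

```csharp
using System;
using System.Text;

public static class EncodingProbe
{
    // Tries to decode the data as UTF-8 and reports failure instead of
    // replacing invalid bytes, so a wrong guess becomes visible.
    public static bool TryDecodeStrictUtf8(byte[] data, out string text)
    {
        var strict = new UTF8Encoding(false, true); // no BOM, throw on invalid bytes
        try
        {
            text = strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = string.Empty;
            return false;
        }
    }

    public static void Main()
    {
        // Hypothetical input: a Persian word encoded as UTF-8.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("\u0633\u0644\u0627\u0645");
        // Hypothetical input: bytes that do not form valid UTF-8 sequences,
        // as a single-byte code page might produce.
        byte[] otherBytes = { 0xD3, 0xE1, 0xC7, 0xE3 };

        string decoded;
        Console.WriteLine(TryDecodeStrictUtf8(utf8Bytes, out decoded));  // True
        Console.WriteLine(TryDecodeStrictUtf8(otherBytes, out decoded)); // False
    }
}
```

If the strict UTF-8 check fails, the file probably uses a legacy code page, and the candidate has to come from whoever produced the file.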

Convert image to string (base64) in metro/windows 8 .NET 4.5

I need to convert a picture (stored in an object of type Image) to a string for storage (and later convert it back into an Image object for display) in a Metro app.
I have found lots of answers for converting an image to a base64 string in .NET 4.0 etc., but in 4.5 the System.Windows.Bitmap namespace isn't there (the Image class is in Windows.UI.Xaml.Media.Imaging), and the method in that namespace that made this possible in 4.0, "Save()", doesn't seem to be in 4.5... unless I just can't find it.
There's an example of doing this here, but like I said, it doesn't work in a Metro app/.NET 4.5.
Any ideas?
More details:
The method that does this will convert an instance field that contains an Image object (I've used its Source property; is this correct?) and needs to store the string resulting from the conversion in an instance string field. The whole object can then be serialized, ignoring the Image field, with the hope of deserializing it later and restoring the Image field from the string for display. So far I've tried to use a DataContractSerializer to serialize a string from the image, but it doesn't seem to like it. Once I get a string from the image I would be able to serialize that, but it's not something I've ever done before.
Also, it seems that the only .NET 4.5 documentation that is definitely correct for Metro apps is the set of pages here: http://msdn.microsoft.com/library/windows/apps/
Pages on the "normal-looking" MSDN site for .NET 4.5 don't always seem to apply to Metro apps (just a theory).
[solved]
I finally got it! For anyone else who ever has to do this, the answer is here: http://social.msdn.microsoft.com/Forums/en-US/winappswithcsharp/thread/38c6cb85-7454-424f-ae94-32782c036567/
I did this:
var reader = new DataReader(myMemoryStream.GetInputStreamAt(0));
var bytes = new byte[myMemoryStream.Size];
await reader.LoadAsync((uint)myMemoryStream.Size);
reader.ReadBytes(bytes);
After this sequence, the byte array bytes holds the data from the stream. From there I set a string to the value of
Convert.ToBase64String(bytes);
I'm not sure about this because I do not have .NET 4.5 installed here, but I think this could work:
You could use the BitmapSource.CopyPixels() method to extract the pixels of the image:
http://msdn.microsoft.com/en-us/library/ms616043(v=vs.110).aspx
Then use Convert.ToBase64String() to do the conversion.
Also, here are some useful imaging HOW-TOs:
http://msdn.microsoft.com/en-us/library/ms750864(v=vs.110)
Try BitmapEncoder. Example of how to create a BitmapEncoder here. The appropriate namespace is Windows.Graphics.Imaging.
BitmapEncoder gets you an encoder. To read the pixel data you can use GetPixelDataAsync(BitmapPixelFormat, BitmapAlphaMode, BitmapTransform, ExifOrientationMode, ColorManagementMode), which as far as I know is a method of BitmapDecoder in the same namespace rather than of BitmapEncoder. Following that you can use any generic C# base64 encoder.
(Examples are Javascript but should work for C# as well since the classes exist in C#)
You should store the image in its encoded format (say, JPEG). To restore it, decode the stored string back to a byte[] and create a MemoryStream over it; the Metro BitmapImage can then be created from that stream.
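
The storage round trip described above might look roughly like the following sketch. It covers only the platform-neutral part: the helper names and the sample bytes are assumptions, and handing the resulting stream to a Metro BitmapImage is left out because that step is platform-specific.

```csharp
using System;
using System.IO;

public static class EncodedImageStorage
{
    // Turns already encoded image bytes (for example a JPEG file's content)
    // into a string that can be serialized.
    public static string ToStorageString(byte[] encodedImageBytes)
    {
        return Convert.ToBase64String(encodedImageBytes);
    }

    // Turns the stored string back into a stream over the same bytes.
    public static MemoryStream ToStream(string stored)
    {
        return new MemoryStream(Convert.FromBase64String(stored));
    }

    public static void Main()
    {
        // Hypothetical stand-in for real encoded image data.
        byte[] encoded = { 0xFF, 0xD8, 0xFF, 0xE0 };

        string stored = ToStorageString(encoded);
        using (MemoryStream restored = ToStream(stored))
        {
            Console.WriteLine(restored.Length == encoded.Length); // True
        }
    }
}
```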
