I've handled base64 encoded images and strings and have been able to decode them using C# in the past.
I'm now trying on what looks to me like a base64 string, but the value I'm getting is about 98% accurate and I just don't understand what is affecting the output.
Here is the string:
http://pastebin.com/ntcth6uN
And this is the decoded value:
http://pastebin.com/Buh4xXDA
That IS what it should be, but you can clearly see where there are artifacts and the decoded value isn't quite right.
Any idea why it's failing?
var data = Convert.FromBase64String(Faces[i].InfoData);
Faces[i].InfoData = Encoding.UTF8.GetString(data);
Thanks for your help.
The string is not encoded as UTF8, but instead as another encoding. Thus the same encoding must be used to decode it.
Use the following to decode it:
Encoding.ASCII.GetString(data);
If ASCII isn't the correct encoding, here's some code that will iterate through available encodings and list the first 200 characters in order to manually select original encoding:
String encStr = "PFdhdGNoIG5hbWU9IlBpa2FjaHUDV2F0Y2DDQ2hyb25vIiBkZXNjcmlwdGlvbj0iIiBhdXRob3I9IkFsY2hlbWlzdFByaW1lIiB3ZWJfbGluaz0iaHR0cgov42ZhY2VyZXBv4mNvbS9hcHAvZmFjZXMvdXNlci9BbGNoZW1pc3RQcmltZS8xIiBiZ19jb2xvcj0iYWJhYmFiIiBpbmRfbG9jPSJ0YyIDaW5kX2JnPSJZIiBob3R3b3JkX2xvYz0idGMiIGhvdHdvcmRfYmc9IlkiPDoDICADPExheWVyIHR5cGU9ImltYWdlIiBLPSIwIiB5PSIwIiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDcGF0ag0i4mltZzQ5OTIucHBuZyIDd2lkdGD9IjU1MCIDaGVpZ2h0PSI1NTAiIGNvbG9yPSJmZmZmZmYiIGRpc3BsYXk9ImJkIi8+CiADICA8TGF5ZXIDdHlwZT0idGVLdCIDeg0iMTciIHk9IjQ5IiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDdGVLdg0ie2RofTp7ZG16fSIDdGVLdF9zaXplPSIzMiIDZm9udg0iTENEUEhPTkUiIHRyYW5zZm9ybT0ibiIDY29sb3JfZGltPSIxZgFkMWQiIGNvbG9yPSIxZgFkMWQiIGRpc3BsYXk9ImJkIi8+CiADICA8TGF5ZXIDdHlwZT0idGVLdCIDeg0iOTAiIHk9IjU0IiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDdGVLdg0ie2Rzen0iIHRleHRfc2l6ZT0iMTDiIGZvbnQ9IkxgRFBIT05FIiB0cmFuc2Zvcm09ImLiIGNvbG9yX2RpbT0iMWQxZgFkIiBjb2xvcj0iMWQxZgFkIiBkaXNwbGF5PSJiZCIvPDoDICADPExheWVyIHR5cGU9InRleHQiIHD9IjEyNCIDeT0iNTQiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0ie3N3cnN9PT0DMCBhbmQDMTAwIG9yIgAiIGFsaWdubWVudg0iY2MiIHRleHQ9IntkYX0iIHRleHRfc2l6ZT0iMjAiIGZvbnQ9IkxgRFBIT05FIiB0cmFuc2Zvcm09ImLiIGNvbG9yX2RpbT0iMWQxZgFkIiBjb2xvcj0iMWQxZgFkIiBkaXNwbGF5PSJiZCIvPDoDICADPExheWVyIHR5cGU9InRleHQiIHD9Ii0LOSIDeT0iNTIiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0iMTAwIiBhbGlnbm1lbnQ9ImNjIiB0ZXh0PSJ7Ymx9IiB0ZXh0X3NpemU9IjE5IiBmb250PSJMQ0RQSE9ORSIDdHJhbnNmb3JtPSJuIiBjb2xvcl9kaW09IjFkMWQxZCIDY29sb3I9IjFkMWQxZCIDZGlzcGxheT0iYmQi4zLKICADIgxMYXllciB0eXBlPSJzaGFwZSIDeg0iMSIDeT0iNgDiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0ie3N3cnN9PT0DMCBhbmQDMCBvciAxMgAiIGFsaWdubWVudg0iY2MiIHNoYXBlPSJTcXVhcmUiIHdpZHRoPSIyNzDiIGhlaWdodg0iNgUiIGNvbG9yPSJhYmFiYWIiIGRpc3BsYXk9ImJkIi8+CiADICA8TGF5ZXIDdHlwZT0idGVLdCIDeg0iMCIDeT0iNgDiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0ie3N3cnN9PT0DMCBhbmQDMCBvciAxMgAiIGFsaWdubWVudg0iY2MiIHRleHQ9Intzd219Ontzd3N9Ontzd3Nzc30iIHRleHRfc2l6ZT0iMzYiIGZvbnQ9IkxgRFBIT05FIiB0cmFuc2Zvcm09ImLiIGNvbG9yX2RpbT0iMWQxZgFkIiBjb2xvcj0iMWQxZgFkIiBkaXNwbGF5PSJiZCIvPDoDICADPExheWVyIHR5cGU9InRleHQiIHD9Ii0xMTDiIHk9Ijk2IiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDdGVLdg0ie3N3cn0DYW5kICZhcG9zO1NUT1AmYXBvczsDb3IDJmFwb3M7U1RBUlQmYXBvczsiIHRleHRfc2l6ZT0iMTUiIGZvbnQ9IlJvYm90by1SZWd1bGFyIiB0cmFuc2Zvcm09ImLiIGNvbG9yX2RpbT0iMWQxZgFkIiBjb2xvcj0iMWQxZgFkIiBkaXNwbGF5PSJiZCIDdGFwX2FjdGlvbj0ic3dfc3RhcnRfc3RvcCIvPDoDICADPExheWVyIHR5cGU9InRleHQiIHD9IjExOCIDeT0iOTYiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0iMTAwIiBhbGlnbm1lbnQ9ImNjIiB0ZXh0PSJSRVNFVCIDdGVLdF9zaXplPSIxNSIDZm9udg0iUm9ib3Rv4VJlZ3VsYXIiIHRyYW5zZm9ybT0ibiIDY29sb3JfZGltPSIxZgFkMWQiIGNvbG9yPSIxZgFkMWQiIGRpc3BsYXk9ImJkIiB0YXBfYWN0aW9uPSJzd19yZXNldCIvPDoDICADPExheWVyIHR5cGU9InNoYXBlIiBLPSItMTI0IiB5PSIxNgEiIGd5cm89IjAiIHJvdGF0aW9uPSItOTAiIHNrZXdfeg0iMCIDc2tld195PSIwIiBvcGFjaXR5PSIxMgAiIGFsaWdubWVudg0iY2MiIHNoYXBlPSJUcmlhbmdsZSIDd2lkdGD9IjM1IiBoZWlnaHQ9IjM1IiBjb2xvcj0iMWQxZgFkIiBkaXNwbGF5PSJiZCIvPDoDICADPExheWVyIHR5cGU9InNoYXBlIiBLPSItMTE3IiB5PSIxNgQiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0iMCIDYWxpZ25tZW50PSJjYyIDc2hhcGU9IlNxdWFyZSIDd2lkdGD9IjEyMCIDaGVpZ2h0PSIxMjAiIGNvbG9yPSIyOWI5ZgMiIGRpc3BsYXk9ImJkIiB0YXBfYWN0aW9uPSJzd19zdGFydF9zdG9wIi8+CiADICA8TGF5ZXIDdHlwZT0ic2hhcGUiIHD9IjEyNCIDeT0iMTQxIiBneXJvPSIwIiByb3RhdGlvbj0iOTAiIHNrZXdfeg0iMCIDc2tld195PSIwIiBvcGFjaXR5PSIxMgAiIGFsaWdubWVudg0iY2MiIHNoYXBlPSJUcmlhbmdsZSIDd2lkdGD9IjM1IiBoZWlnaHQ9IjM1IiBjb2xvcj0iMWQxZgFkIiBkaXNwbGF5PSJiZCIvPDoDICADPExheWVyIHR5cGU9InNoYXBlIiBLPSIxMTciIHk9IjE0NCIDZ3lybz0iMCIDcm90YXRpb2L9IjAiIHNrZXdfeg0iMCIDc2tld195PSIwIiBvcGFjaXR5PSIwIiBhbGlnbm1lbnQ9ImNjIiBzaGFwZT0iU3F1YXJlIiB3aWR0ag0iMTIwIiBoZWlnaHQ9IjEyMCIDY29sb3I9IjI5YjlkMyIDZGlzcGxheT0iYmQiIHRhcF9hY3Rpb2L9InN3X3Jlc2V0Ii8+CiADICA8TGF5ZXIDdHlwZT0idGVLdCIDeg0i4TDyIiB5PSItNgMiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0iMTAwIiBhbGlnbm1lbnQ9ImNjIiB0ZXh0PSJ7d3RkfSIDdGVLdF9zaXplPSIyNSIDZm9udg0iTENEUEhPTkUiIHRyYW5zZm9ybT0ibiIDY29sb3JfZGltPSIxZgFkMWQiIGNvbG9yPSIxZgFkMWQiIGRpc3BsYXk9ImJkIi8+CiADICA8TGF5ZXIDdHlwZT0iaW1hZ2VfY29uZCIDeg0i4TD5IiB5PSItOTIiIGd5cm89IjAiIHJvdGF0aW9uPSIwIiBza2V3X3D9IjAiIHNrZXdfeT0iMCIDb3BhY2l0eT0iMTAwIiBhbGlnbm1lbnQ9ImNjIiBwYXRoPSJ3ZWF0aGVyX3NldF8z4nBwbmciIHdpZHRoPSI3MCIDaGVpZ2h0PSI3MCIDY29sb3I9IjQ1NgU0NSIDY29uZF92YWx1ZT0iJmFwb3M7e3djaX0mYXBvczsDPT0DJmFwb3M7MgFkJmFwb3M7IGFuZCAxIG9yICZhcG9zO3t3Y2l9JmFwb3M7Ig09ICZhcG9zOzAyZCZhcG9zOyBhbmQDMiBvciAmYXBvczt7d2NpfSZhcG9zOyA9PSAmYXBvczswM2QmYXBvczsDYW5kIgMDb3IDJmFwb3M7e3djaX0mYXBvczsDPT0DJmFwb3M7MgRkJmFwb3M7IGFuZCA0IG9yICZhcG9zO3t3Y2l9JmFwb3M7Ig09ICZhcG9zOzA5ZCZhcG9zOyBhbmQDNSBvciAmYXBvczt7d2NpfSZhcG9zOyA9PSAmYXBvczsxMGQmYXBvczsDYW5kIgYDb3IDJmFwb3M7e3djaX0mYXBvczsDPT0DJmFwb3M7MTFkJmFwb3M7IGFuZCA3IG9yICZhcG9zO3t3Y2l9JmFwb3M7Ig09ICZhcG9zOzEzZCZhcG9zOyBhbmQDOCBvciAmYXBvczt7d2NpfSZhcG9zOyA9PSAmYXBvczs1MGQmYXBvczsDYW5kIgkDb3IDMSIDY29uZF9ncmlkPSIzegMiIGRpc3BsYXk9ImJkIi8+CiADICA8TGF5ZXIDdHlwZT0idGVLdCIDeg0iMTEzIiB5PSIzIiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDdGVLdg0iTW9vbiIDdGVLdF9zaXplPSIxOCIDZm9udg0iTENEUEhPTkUiIHRyYW5zZm9ybT0ibiIDY29sb3JfZGltPSIxZgFkMWQiIGNvbG9yPSIxZgFkMWQiIGRpc3BsYXk9ImJkIi8+CiADICA8TGF5ZXIDdHlwZT0ic2hhcGUiIHD9IjExNCIDeT0i4TUyIiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDc2hhcGU9IkNpcmNsZSIDd2lkdGD9IjYwIiBoZWlnaHQ9IjYwIiBjb2xvcj0iNgU0NTQ1IiBkaXNwbGF5PSJiZCIvPDoDICADPExheWVyIHR5cGU9ImltYWdlX2NvbmQiIHD9IjExNCIDeT0i4TUyIiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDcGF0ag0ibW9vbl9zZXRfMy5wcG5nIiB3aWR0ag0iNjAiIGhlaWdodg0iNjAiIGNvbG9yPSJhYmFiYWIiIGNvbmRfdmFsdWU9Int3bXB9IiBjb25kX2dyaWQ9IjNLMyIDZGlzcGxheT0iYmQi4zLKICADIgxMYXllciB0eXBlPSJ0ZXh0IiBLPSIwIiB5PSIwIiBneXJvPSIwIiByb3RhdGlvbj0iMCIDc2tld19LPSIwIiBza2V3X3k9IjAiIG9wYWNpdHk9IjEwMCIDYWxpZ25tZW50PSJjYyIDdGVLdg0iIiB0ZXh0X3NpemU9IjQwIiBmb250PSJSb2JvdG8tUmVndWxhciIDdHJhbnNmb3JtPSJuIiBjb2xvcl9kaW09ImZmZmZmZiIDY29sb3I9ImZmZmZmZiIDZGlzcGxheT0iYmQi4zLKICADIgxMYXllciB0eXBlPSJ0ZXh0IiBLPSIxMCIDeT0i4TExNiIDZ3lybz0iMCIDcm90YXRpb2L9IjAiIHNrZXdfeg0iMCIDc2tld195PSIwIiBvcGFjaXR5PSIxMgAiIGFsaWdubWVudg0iY2MiIHRleHQ9IntkZHd9IHtkbm5ufSB7ZGR9IiB0ZXh0X3NpemU9IjE2IiBmb250PSJMQ0RQSE9ORSIDdHJhbnNmb3JtPSJuIiBjb2xvcl9kaW09IjFkMWQxZCIDY29sb3I9IjFkMWQxZCIDZGlzcGxheT0iYmQi4zLKPC9XYXRjagLK";
var data = Convert.FromBase64String(encStr);
// Iterate over all encodings, and decode
foreach (EncodingInfo ei in Encoding.GetEncodings())
{
Encoding e = ei.GetEncoding();
Console.Write("{0,-15} - {1}{2}", ei.CodePage, e.EncodingName, System.Environment.NewLine);
Console.WriteLine(e.GetString(data).Substring(0, 200));
Console.Write(System.Environment.NewLine + "---------------------" + System.Environment.NewLine);
}
Fiddle: https://dotnetfiddle.net/LcV5s8
Related
Im having a problem generating a base64string with a specific encoder.
I have an application that generate this base64string
RQA6AFwAUAByAG8AagBlAGMAdABzAFwAWQBvAHUAdAB1AGIAZQAuAE0AYQBuAGEAZwBlAHIAXABZAG8AdQB0AHUAYgBlAC4ATQBhAG4AYQBnAGUAcgAuAE0AbwBkAGUAbABzAC4AQwBvAG4AdABhAGkAbgBlAHIAXABvAGIAagBcAFIAZQBsAGUAYQBzAGUAXABuAGUAdABzAHQAYQBuAGQAYQByAGQAMgAuADAAXABZAG8AdQB0AHUAYgBlAC4ATQBhAG4AYQBnAGUAcgAuAE0AbwBkAGUAbABzAC4AQwBvAG4AdABhAGkAbgBlAHIALgBkAGwAbAAAAA==
which is equal to
E:\Projects\Youtube.Manager\Youtube.Manager.Models.Container\obj\Release\netstandard2.0\Youtube.Manager.Models.Container.dll
Now im trying convert
E:\Projects\Youtube.Manager\Youtube.Manager.Models.Container\obj\Release\netstandard2.0\Youtube.Manager.Models.Container.dll
To base64string but im getting this instead
RTpcUHJvamVjdHNcWW91dHViZS5NYW5hZ2VyXFlvdXR1YmUuTWFuYWdlci5Nb2RlbHMuQ29udGFpbmVyXG9ialxSZWxlYXNlXG5ldHN0YW5kYXJkMi4wXFlvdXR1YmUuTWFuYWdlci5Nb2RlbHMuQ29udGFpbmVyLmRsbA==
I want to get the same result as the first base64string which is
RQA6AFwAUAByAG8AagBlAGMAdABzAFwAWQBvAHUAdAB1AGIAZQAuAE0AYQBuAGEAZwBlAHIAXABZAG8AdQB0AHUAYgBlAC4ATQBhAG4AYQBnAGUAcgAuAE0AbwBkAGUAbABzAC4AQwBvAG4AdABhAGkAbgBlAHIAXABvAGIAagBcAFIAZQBsAGUAYQBzAGUAXABuAGUAdABzAHQAYQBuAGQAYQByAGQAMgAuADAAXABZAG8AdQB0AHUAYgBlAC4ATQBhAG4AYQBnAGUAcgAuAE0AbwBkAGUAbABzAC4AQwBvAG4AdABhAGkAbgBlAHIALgBkAGwAbAAAAA==
How can i do it?
This is my code which is giving me a wrong result
var bytes= Encoding.ASCII.GetBytes(msg);
return Convert.ToBase64String(bytes);
The problem here is the text encoding you're using.
The first Base64 string you posted is encoded using Unicode with a nul terminator byte pair. The trailing 'AAAAA==' is a dead giveaway here. You can see it yourself by examining the byte array:
var originalB64 = "RQA6AFwAUAByAG8AagBlAGMAdABzAFwAWQBvAHUAdAB1AGIAZQAuAE0AYQBuAGEAZwBlAHIAXABZAG8AdQB0AHUAYgBlAC4ATQBhAG4AYQBnAGUAcgAuAE0AbwBkAGUAbABzAC4AQwBvAG4AdABhAGkAbgBlAHIAXABvAGIAagBcAFIAZQBsAGUAYQBzAGUAXABuAGUAdABzAHQAYQBuAGQAYQByAGQAMgAuADAAXABZAG8AdQB0AHUAYgBlAC4ATQBhAG4AYQBnAGUAcgAuAE0AbwBkAGUAbABzAC4AQwBvAG4AdABhAGkAbgBlAHIALgBkAGwAbAAAAA==";
var bytes = Convert.FromBase64String(originalB64);
Converting this to a string will give you a null-terminated string 125 characters long, with the last character being nul.
Given a path that is not nul-terminated you can reproduce that string as follows:
string path = #"E:\Projects\Youtube.Manager\Youtube.Manager.Models.Container\obj\Release\netstandard2.0\Youtube.Manager.Models.Container.dll";
string newB64 = Convert.ToBase64String(Encoding.Unicode.GetBytes(path + "\0"));
This matches the original Base64 string exactly in my tests.
I need an advice on decoding base64. I will be doing it in c#.
The thing is, I don't know what type of format the decoding will output it may be text, XML, images or PDF. I only have the base64 encoded string.
How do you guys advice me to proceed? Any suggestions?
Many image types and pdfs include a magic number, where the first X bytes identify the file type. You should decode the string and examine the binary for these (https://asecuritysite.com/forensics/magic gives a list of them). If you still can't identify it check whether it parses as XML using an XML parser else assume it's text.
Extract the MIME type from a base64 string:
/**
* Extract the MIME type from a base64 string
* #param encoded Base64 string
* #return MIME type string
*/
private static String extractMimeType(final String encoded) {
final Pattern mime = Pattern.compile("^data:([a-zA-Z0-9]+/[a-zA-Z0-9]+).*,.*");
final Matcher matcher = mime.matcher(encoded);
if (!matcher.find())
return "";
return matcher.group(1).toLowerCase();
}
Usage:
final String encoded = "data:image/png;base64,iVBORw0KGgoAA...5CYII=";
extractMimeType(encoded); // "image/png"
extractMimeType("garbage"); // ""
Then you can write your byte array:
var filePath = System.IO.Path.Combine(folderPath, string.Format("pdf_{0}.pdf", Guid.NewGuid()));
var byteArray = Convert.FromBase64String(base64pdf);
File.WriteAllBytes(filePath, byteArray);
And open you file:
Device.OpenUri(new Uri("file://" + filePath));
Or tokenize the data since the 64 encoded data look like this "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAoAAAAKAC" and parse that string.
I get question marks in output of my program: ?????? ??????
string str = "Привет медвед";
Encoding srcEncodingFormat = Encoding.GetEncoding("utf-16");
Encoding dstEncodingFormat = Encoding.ASCII;
byte [] originalByteString = srcEncodingFormat.GetBytes(str);
byte [] convertedByteString = Encoding.Convert(srcEncodingFormat,
dstEncodingFormat, originalByteString);
string finalString = dstEncodingFormat.GetString(convertedByteString);
Console.WriteLine (finalString);
There is no text but encoded text. But, .NET's char and string use Unicode/UTF-16, as you know. So, you can simplify your code by calling GetBytes and passing in the string instead of doing it twice as your code does.
As for your question, you have a choice of a lossy conversion or no conversion at all. Below is code that prevents a lossy conversion.
Now, how to see the result? As with all text, it is a sequence of bytes. Your best bet is to write them to a file and open the file in an editor that you can indicate the encoding to and that can use a font that supports the characters you want to see.
string str = "Привет медвед";
Encoding dstEncodingFormat = Encoding.GetEncoding("US-ASCII",
new EncoderExceptionFallback(),
new DecoderReplacementFallback());
byte[] output = dstEncodingFormat.GetBytes(str);
File.WriteAllBytes("Test Привет медвед.txt", output);
Possible Duplicate Converting byte array to string and back again in C#
I am using Huffman Coding for compression and decompression of some text from here
The code in there builds a huffman tree to use it for encoding and decoding. Everything works fine when I use the code directly.
For my situation, i need to get the compressed content, store it and decompress it when ever need.
The output from the encoder and the input to the decoder are BitArray.
When I tried convert this BitArray to String and back to BitArray and decode it using the following code, I get a weird answer.
Tree huffmanTree = new Tree();
huffmanTree.Build(input);
string input = Console.ReadLine();
BitArray encoded = huffmanTree.Encode(input);
// Print the bits
Console.Write("Encoded Bits: ");
foreach (bool bit in encoded)
{
Console.Write((bit ? 1 : 0) + "");
}
Console.WriteLine();
// Convert the bit array to bytes
Byte[] e = new Byte[(encoded.Length / 8 + (encoded.Length % 8 == 0 ? 0 : 1))];
encoded.CopyTo(e, 0);
// Convert the bytes to string
string output = Encoding.UTF8.GetString(e);
// Convert string back to bytes
e = new Byte[d.Length];
e = Encoding.UTF8.GetBytes(d);
// Convert bytes back to bit array
BitArray todecode = new BitArray(e);
string decoded = huffmanTree.Decode(todecode);
Console.WriteLine("Decoded: " + decoded);
Console.ReadLine();
The Output of Original code from the tutorial is:
The Output of My Code is:
Where am I wrong friends? Help me, Thanks in advance.
You cannot stuff arbitrary bytes into a string. That concept is just undefined. Conversions happen using Encoding.
string output = Encoding.UTF8.GetString(e);
e is just binary garbage at this point, it is not a UTF8 string. So calling UTF8 methods on it does not make sense.
Solution: Don't convert and back-convert to/from string. This does not round-trip. Why are you doing that in the first place? If you need a string use a round-trippable format like base-64 or base-85.
I'm pretty sure Encoding doesn't roundtrip - that is you can't encode an arbitrary sequence of bytes to a string, and then use the same Encoding to get bytes back and always expect them to be the same.
If you want to be able to roundtrip from your raw bytes to string and back to the same raw bytes, you'd need to use base64 encoding e.g.
http://blogs.microsoft.co.il/blogs/mneiter/archive/2009/03/22/how-to-encoding-and-decoding-base64-strings-in-c.aspx
I have strings like this:
RowKey = "Local (Automatic/Manual) Tests",
When I try to store in Windows Azure then this fails as I assume it does not accept the "/" as part of the row key.
Is there a simple way that I can encode the value before putting into RowKey?
Also once the data is in the table I get it out with the following:
var Stores = storeTable.GetAll(u => u.PartitionKey == "ABC");
Is there a simple way that I can get out the value of RowKey and decode it?
One possible way for handling is this by converting the PartitionKey and RowKey values in Base64 encoded string and save it. Later when you retrieve the values, you just decode it. In fact I have had this issue some days back in our tool and Base64 encoding was suggested to me on MSDN forums: http://social.msdn.microsoft.com/Forums/en-US/windowsazuredata/thread/a20cd3ce-20cb-4273-a1f2-b92a354bd868. But again it is not fool proof.
I'm not familiar with Azure, so I don't know if there is an existing API for that. But it's not hard to code:
encode:
const string escapeChar='|';
RowKey.Replace(escapeChar,escapeChar+escapeChar).Replace("/",escapeChar+"S");
decode:
StringBuilder sb=new StringBuilder(s.Length);
bool escape=false;
foreach(char c in s)
{
if(escape)
{
if(c=='S')
sb.Append('/');
else if(c==escapeChar)
sb.Append(escapeChar);
else
throw new ArgumentException("Invalid escape sequence "+escapeChar+c);
}
else if(c!=escapeChar)
{
sb.Append(c);
escape=false;
}
else
escape=true;
return sb.ToString();
When a string is Base64 encoded, the only character that is invalid in an Azure Table Storage key column is the forward slash ('/'). To address this, simply replace the forward slash character with another character that is both (1) valid in an Azure Table Storage key column and (2) not a Base64 character. The most common example I have found (which is cited in other answers) is to replace the forward slash ('/') with the underscore ('_').
private static String EncodeToKey(String originalKey)
{
var keyBytes = System.Text.Encoding.UTF8.GetBytes(originalKey);
var base64 = System.Convert.ToBase64String(keyBytes);
return base64.Replace('/','_');
}
When decoding, simply undo the replaced character (first!) and then Base64 decode the resulting string. That's all there is to it.
private static String DecodeFromKey(String encodedKey)
{
var base64 = encodedKey.Replace('_', '/');
byte[] bytes = System.Convert.FromBase64String(base64);
return System.Text.Encoding.UTF8.GetString(bytes);
}
Some people have suggested that other Base64 characters also need encoding. According to the Azure Table Storage docs this is not the case.