How to "iso-8859-1" encode a string in jQuery? - c#

I'm looking for a jQuery (or jQuery plugin) equivalent of this C# code block. It encodes a string as bytes in the iso-8859-1 character set and then converts those bytes to a Base64 string.
string authInfo = "encrypted secret";
Encoding encoding = Encoding.GetEncoding("iso-8859-1");
byte[] authBytes = encoding.GetBytes(authInfo);
string encryptedMsg = Convert.ToBase64String(authBytes);
Is there a plugin out there that can do this?

Found a jQuery plugin that's close enough to what I need: Base64 encode and decode
It doesn't have an option to specify character set but I can live with it for now. So the jQuery code becomes:
authInfo = $.base64.encode(authInfo);

I believe you must set the character encoding of the page (or wherever authInfo is defined) to ISO-8859-1. You may also specify the character encoding on the <script> tag for referenced JavaScript files if authInfo is defined in one of those.
As for base64 encoding, here's a page that has a code snippet that does just that: http://www.webtoolkit.info/javascript-base64.html

Related

C# Get site source code with letters other than english

I'm trying to get a site's source in C# using
WebClient client = new WebClient();
string content = client.DownloadString(url);
And it gets it just fine.
However, the source code contains Hebrew characters, which show up as gibberish in the content variable.
What do I need to do for it to recognize it?
WebClient client = new WebClient();
client.Encoding = System.Text.UTF8Encoding.UTF8; // added
string content = client.DownloadString(url);
You have to specify the encoding: by default you are probably decoding with a non-Unicode encoding, while the content could be in UTF-8. This is an example where the encoding is set to UTF-8. If you are not sure what the page uses, check the source manually first and then specify the encoding accordingly. For more info see Remarks in the documentation.
The problem is the Encoding of your WebClient. MSDN says:
... the method uses the encoding specified in the Encoding property to convert the resource to a String.
Solution: Set a specific Encoding like
client.Encoding = Encoding.UTF8;
and try it again
string content = client.DownloadString(url);
UTF-8 should do the trick and decode the Hebrew characters correctly as well.
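To see the effect without touching the network, here is a minimal sketch that decodes the same UTF-8 bytes two ways. The sample text and the DecodeDemo class are illustrative assumptions, and ISO-8859-1 merely stands in for whatever non-UTF-8 encoding a client might fall back to.

```csharp
using System;
using System.Text;

public static class DecodeDemo
{
    // Turns raw response bytes into a string using the given encoding,
    // which is the step that goes wrong when the encoding is mismatched.
    public static string Decode(byte[] body, Encoding encoding)
    {
        return encoding.GetString(body);
    }

    public static void Main()
    {
        // Hypothetical page content: the Hebrew word "shalom" (4 letters).
        string original = "\u05E9\u05DC\u05D5\u05DD";
        byte[] body = Encoding.UTF8.GetBytes(original);

        string wrong = Decode(body, Encoding.GetEncoding("iso-8859-1"));
        string right = Decode(body, Encoding.UTF8);

        Console.WriteLine(wrong == original); // False
        Console.WriteLine(right == original); // True
        Console.WriteLine(wrong.Length);      // 8: one char per UTF-8 byte
    }
}
```

The wrong decoding does not throw; it silently yields one character per byte, which is why the result looks like gibberish rather than failing.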

Encode cp1252 string to utf-8 string in c#

How can I convert a cp1252 string to a UTF-8 string in C#?
I tried this code, but it doesn't work:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.GetEncoding(1251);
byte[] wind1252Bytes = ReadFile(myString1252);
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string myStringUtf8 = Encoding.UTF8.GetString(utf8Bytes);
var myGoodString = System.IO.File.ReadAllText(
@"C:\path\to\file.txt",
Encoding.GetEncoding("Windows-1252")
);
A .NET/CLR string in memory cannot be UTF-8. It is just Unicode, or UTF-16 if you like.
The above code will properly read a text file in CP1252 into a .NET string.
If you insist on going through a byte[] wind1252Bytes, it is simply:
var myGoodString = Encoding.GetEncoding("Windows-1252").GetString(wind1252Bytes);
Since this answer was written, new versions of .NET have appeared which do not by default recognize all the old (legacy) Windows-specific code pages. If Encoding.GetEncoding("Windows-1252") throws an exception with your runtime version, try registering an additional provider with
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
(may need additional assembly reference to System.Text.Encoding.CodePages.dll) before you use Encoding.GetEncoding("Windows-1252").
See CodePagesEncodingProvider class documentation.
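If what you actually need is UTF-8 bytes (for example, to write a file or send a request), the conversion happens at the byte level. The following is a minimal sketch under that assumption; the Cp1252ToUtf8 class and the sample bytes are made up for illustration, and the RegisterProvider call is only required on runtimes where code page 1252 is not built in.

```csharp
using System;
using System.Text;

public static class Cp1252ToUtf8
{
    // Re-encodes CP1252 bytes as UTF-8 bytes.
    public static byte[] Convert1252ToUtf8(byte[] cp1252Bytes)
    {
        // Needed on .NET Core / .NET 5+; registering more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding cp1252 = Encoding.GetEncoding(1252);
        return Encoding.Convert(cp1252, Encoding.UTF8, cp1252Bytes);
    }

    public static void Main()
    {
        // "€ café" in CP1252: 0x80 is the euro sign, 0xE9 is 'é'.
        byte[] input = { 0x80, 0x20, 0x63, 0x61, 0x66, 0xE9 };
        byte[] utf8 = Convert1252ToUtf8(input);

        Console.WriteLine(BitConverter.ToString(utf8));
        // E2-82-AC-20-63-61-66-C3-A9
        Console.WriteLine(Encoding.UTF8.GetString(utf8) == "\u20AC caf\u00E9"); // True
    }
}
```

Note that the target encoding is Encoding.UTF8 here; the question's code passes code page 1251 (Windows Cyrillic) as the "UTF-8" encoding, which is one reason it does not work.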

C# decoding "â„¢" to "TM"

On a web page there is the following string:
"Qualcomm Snapdragon™ S4"
When I get this string in my .NET code, it is converted to "Qualcomm Snapdragonâ„¢ S4".
The "™" character changes to "â„¢".
How can I decode "â„¢" back to "™"?
Update
The following is the code that downloads the string using a web proxy
(wc is the web proxy):
wc.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8");
string html = Server.HtmlEncode(wc.DownloadString(url));
You should read the webpage in its proper encoding in the first place. In this case it seems you are reading with Encoding.Default (i.e. probably CP1252) and the page is really in UTF-8. This should be apparent either by reading the Content-Type header of the response or by looking for a <meta http-equiv="Content-Type" content='text/html; charset=utf-8'> in the content.
If you still need to do this after the fact, then use
var bytes = Encoding.Default.GetBytes(myString);
var correctString = Encoding.UTF8.GetString(bytes);
In any case you would need to know the exact encodings that were used on the page and for reading the malformed string in the first place. Furthermore I'd generally advise explicitly against using Encoding.Default because its value isn't fixed. It's just the legacy encoding on a Windows system for use in non-Unicode applications and also gets used as the default non-Unicode text file encoding. It should have no place whatsoever in handling external resources.
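As a rough sketch of the after-the-fact repair with the encoding named explicitly instead of relying on Encoding.Default: the MojibakeRepair class below is hypothetical, and it assumes the damage came from UTF-8 bytes being decoded as Windows-1252, which matches the "â„¢" symptom.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Reverses "UTF-8 bytes decoded as Windows-1252".
    public static string Repair(string garbled)
    {
        // Needed on .NET Core / .NET 5+, where code page 1252 is not built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1252 = Encoding.GetEncoding(1252);

        // Recover the original bytes, then decode them as what they really are.
        byte[] originalBytes = cp1252.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    public static void Main()
    {
        // "â„¢" written with escapes: U+00E2, U+201E, U+00A2.
        string garbled = "Qualcomm Snapdragon\u00E2\u201E\u00A2 S4";
        string repaired = Repair(garbled);

        Console.WriteLine(repaired == "Qualcomm Snapdragon\u2122 S4"); // True
    }
}
```

This only works when the wrong decoding was lossless; reading the page with the right encoding in the first place remains the more robust fix.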

How to encode string for ID3 tags in C#

I am having problems with encoding in ID3 tags. I query a webservice which returns back some XML including a node such as the one below:
<name>Blue Öyster Cult</name>
I am then using this information to update my ID3 tags. The problem is that the tag is updated as:
Blue Ã–yster Cult
I know this is an encoding issue, but I'm struggling to work out how to get it to work. My understanding is that ID3 tags need to be encoded as ISO-8859-1.
I wrote this code, but it makes no difference:
Encoding newEncoding = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = Encoding.UTF8.GetBytes(newArtistName);
byte[] asciBytes = Encoding.Convert(utf8, newEncoding, utfBytes);
string encodedArtistName = newEncoding.GetString(asciBytes);
Is this in the right direction or not?
Any advice much appreciated.
ID3 v2
Textual frames are marked with an encoding byte.
$00 – ISO-8859-1 (ASCII).
$01 – UCS-2 (UTF-16 encoded Unicode with BOM), in ID3v2.2 and ID3v2.3.
$02 – UTF-16BE encoded Unicode without BOM, in ID3v2.4.
$03 – UTF-8 encoded Unicode, in ID3v2.4.
Detailed specification can be found at http://id3.org/id3v2-00.
Also see View/edit ID3 data for MP3 files - post with similar issue.
The issue that I was having was actually prior to saving to the ID3 tags. The XML I was receiving was set to UTF-8, but the WebClient requesting the page was not. Adding the second line shown here resolved the problem.
WebClient client = new WebClient();
client.Encoding = Encoding.UTF8;
String htmlCode = client.DownloadString(requestURL);
When the value is extracted from this XML, it now has the correct encoding to be saved to the file's ID3 tag.
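A small sketch of why the conversion code in the question made no difference: going from a string to bytes and back to a string with a matching encoding returns the same string, whether it was correct or already garbled. The RoundTripDemo class is an illustrative assumption, and ISO-8859-1 stands in for the encoding the download was wrongly decoded with.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Same steps as the question:
    // string -> UTF-8 bytes -> ISO-8859-1 bytes -> string.
    public static string RoundTrip(string text)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        byte[] latin1Bytes = Encoding.Convert(Encoding.UTF8, latin1, utf8Bytes);
        return latin1.GetString(latin1Bytes);
    }

    public static void Main()
    {
        string good = "Blue \u00D6yster Cult";
        Console.WriteLine(RoundTrip(good) == good); // True: a no-op

        // Simulate the real bug: UTF-8 bytes decoded with the wrong encoding.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        string garbled = latin1.GetString(Encoding.UTF8.GetBytes(good));

        Console.WriteLine(garbled.Length - good.Length);  // 1: 'Ö' became two chars
        Console.WriteLine(RoundTrip(garbled) == garbled); // True: damage is kept
    }
}
```

The damage is done when the bytes are first decoded, so the fix belongs at the download step, as described above.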

Unicode in Content-Disposition header

I am using the HttpContext object in an HttpHandler subclass to download a file. When the file name contains non-ASCII characters, it looks weird in IE, whereas it looks fine in Firefox.
Below is the code:
context.Response.ContentType = ".cs";
context.Response.AppendHeader("Content-Length", data.Length.ToString());
context.Response.AppendHeader("Content-Disposition", String.Format("attachment; filename={0}",filename));
context.Response.OutputStream.Write(data, 0, data.Length);
context.Response.Flush();
When I supply 'ß' 'ä' 'ö' 'ü' 'ó' 'ß' 'ä' 'ö' 'ü' 'ó' in the file name field, IE shows something different from what I entered, while Firefox shows it correctly. Adding EncodingType and charset has been of no use.
In IE it is 'ß''ä''ö''ü''ó''ß''ä''ö''ü'_'ó' and in Firefox it is 'ß' 'ä' 'ö' 'ü' 'ó' 'ß' 'ä' 'ö' 'ü' 'ó'.
Any idea how this can be fixed?
I had a similar problem. You have to use HttpUtility.UrlEncode or Server.UrlEncode to encode the filename. I also remember that Firefox didn't need it; moreover, URL encoding ruined the filename there. My code:
// IE needs url encoding, FF doesn't support it, Google Chrome doesn't care
if (Request.Browser.IsBrowser ("IE"))
{
fileName = Server.UrlEncode(fileName);
}
Response.Clear ();
Response.AddHeader ("content-disposition", String.Format ("attachment;filename=\"{0}\"", fileName));
Response.AddHeader ("Content-Length", data.Length.ToString (CultureInfo.InvariantCulture));
Response.ContentType = mimeType;
Response.BinaryWrite(data);
Edit
I have read the specification more carefully. First of all, RFC 2183 states that:
Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII.
But then I found references saying that [RFC 2045] is obsolete and one must reference RFC 2231, which states:
Asterisks ("*") are reused to provide the indicator that language and character set information is present and encoding is being used. A single quote ("'") is used to delimit the character set and language information at the beginning of the parameter value. Percent signs ("%") are used as the encoding flag, which agrees with RFC 2047.
Which means that you can use UrlEncode for non-ascii symbols, as long as you include the encoding as stated in the rfc. Here is an example:
string.Format("attachment; filename=\"{0}\"; filename*=UTF-8''{0}", HttpUtility.UrlEncode(fileName, Encoding.UTF8));
Note that filename is included in addition to filename* for backwards compatibility. You can also choose another encoding and modify the parameter accordingly, but UTF-8 covers everything.
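Here is a minimal, framework-independent sketch of building such a header value. The ContentDispositionBuilder class and the underscore fallback are my own assumptions rather than anything prescribed by the RFCs, and it uses Uri.EscapeDataString (as implemented in .NET Framework 4.5+ and .NET Core) so that spaces become %20 rather than '+'.

```csharp
using System;
using System.Text;

public static class ContentDispositionBuilder
{
    // Builds an "attachment" header value with an ASCII-only fallback in
    // filename and the UTF-8 percent-encoded name in filename*.
    public static string Build(string fileName)
    {
        var fallback = new StringBuilder();
        foreach (char c in fileName)
        {
            // Keep printable ASCII, except the quote and backslash, which
            // would need escaping inside a quoted string.
            bool safe = c >= 0x20 && c < 0x7F && c != '"' && c != '\\';
            fallback.Append(safe ? c : '_');
        }

        // Percent-encodes the UTF-8 bytes of every character that is not
        // an unreserved URI character.
        string encoded = Uri.EscapeDataString(fileName);

        return "attachment; filename=\"" + fallback.ToString()
             + "\"; filename*=UTF-8''" + encoded;
    }

    public static void Main()
    {
        Console.WriteLine(Build("gr\u00FC\u00DFe 1.txt"));
        // attachment; filename="gr__e 1.txt"; filename*=UTF-8''gr%C3%BC%C3%9Fe%201.txt
    }
}
```

Clients that understand filename* are expected to prefer it, while older ones fall back to the plain filename parameter.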
HttpUtility.UrlPathEncode might be a better option, as UrlEncode will replace spaces with '+' signs.
For me this solution is working on all major browsers:
Response.AppendHeader("Content-Disposition", string.Format("attachment; filename*=UTF-8''{0}", HttpUtility.UrlPathEncode(fileName).Replace(",", "%2C")));
var mime = MimeMapping.GetMimeMapping(fileName);
return File(fileName, mime);
Using ASP.NET MVC 3.
The Replace is necessary, because Chrome doesn't like Comma (,) in parameter values: http://www.gangarasa.com/lets-Do-GoodCode/tag/err_response_headers_multiple_content_disposition/
You may want to read RFC 6266 and look at the tests at http://greenbytes.de/tech/tc2231/.
For me this solved the problem:
var result = new HttpResponseMessage(HttpStatusCode.OK)
{
Content = new ByteArrayContent(data)
};
result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")
{
FileNameStar = "foo-ä-€.html"
};
When I look at the response in Fiddler, I can see that the filename has automatically been encoded using UTF-8.
(Screenshot: Fiddler response example with encoded Content-Disposition filename using UTF-8)
If we look at the value of the Content-Disposition header, we can see it will be the same as in @Johannes Geyer's answer. The only difference is that we didn't have to do the encoding ourselves; the ContentDispositionHeaderValue class takes care of that.
I used the Testcases for the Content-Disposition header on: http://greenbytes.de/tech/tc2231/ as mentioned by Julian Reschke.
Information about the ContentDispositionHeaderValue class can be found on MSDN.
For ASP.NET Core (version 2 as of this post), UrlPathEncode is deprecated. Here's how to achieve the desired result:
System.Net.Mime.ContentDisposition cd = new System.Net.Mime.ContentDisposition
{
FileName = Uri.EscapeUriString(fileName),
Inline = true // false = prompt the user for downloading; true = browser to try to show the file inline
};
Response.Headers.Add("Content-Disposition", cd.ToString());
I'm using Uri.EscapeUriString to percent-encode characters that are not allowed in a URI, and string.Normalize to apply Unicode normalization form C.
(tested in ASP.NET MVC5 framework 4.5)
var contentDispositionHeader = new System.Net.Mime.ContentDisposition
{
Inline = false,
FileName = Uri.EscapeUriString(Path.GetFileName(pathFile).Normalize()) // normalize to form C before escaping
};
Response.Headers.Add("Content-Disposition", contentDispositionHeader.ToString());
string mimeType = MimeMapping.GetMimeMapping(Server.MapPath(pathFile));
return File(file, mimeType);
