C# Json not handling accents correctly [duplicate] - c#

The following code:
var text = (new WebClient()).DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
results in a variable text that contains, among many other things, the string
"$Îº$-Minkowski space, scalar field, and the issue of Lorentz invariance"
However, when I visit that URL in Firefox, I get
$κ$-Minkowski space, scalar field, and the issue of Lorentz invariance
which is actually correct. I also tried
var data = (new WebClient()).DownloadData("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
var text = System.Text.UTF8Encoding.Default.GetString(data);
but this gave the same problem.
I'm not sure where the fault lies here. Is the feed lying about being UTF8-encoded, and the browser is smart enough to figure that out, but not WebClient? Is the feed properly UTF8-encoded, but WebClient is failing in some other way? What can I do to mitigate this?

It's not lying. You should set the webclient's encoding first before calling DownloadString.
using (WebClient webClient = new WebClient())
{
    webClient.Encoding = Encoding.UTF8;
    string s = webClient.DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
}
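As for why your alternative isn't working: System.Text.UTF8Encoding.Default is just the static Encoding.Default property inherited from the base class, which on the .NET Framework is the system's ANSI code page, not UTF-8. It should be:
System.Text.Encoding.UTF8.GetString(data)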
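The mismatch can be reproduced without any network access. The following is a minimal sketch, not the asker's code: it takes the UTF-8 bytes of a string and decodes them two ways. ISO-8859-1 stands in here, as an assumption, for whatever non-UTF-8 default the client falls back to; the class and method names are hypothetical.

```csharp
using System.Text;

public static class MojibakeDemo
{
    // Bytes as a UTF-8 server would send them.
    public static byte[] Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // Wrong decoder: treats every byte as one ISO-8859-1 character.
    public static string DecodeAsLatin1(byte[] data)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(data);
    }

    // Matching decoder: reassembles multi-byte UTF-8 sequences.
    public static string DecodeAsUtf8(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }
}
```

For the single character κ (two bytes in UTF-8), DecodeAsLatin1 returns the two characters Îº, while DecodeAsUtf8 returns κ again.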

Related

C# WebClient DownloadString and DownloadFile giving different results

I am attempting to retrieve some information from a website, parse out a specific item, and then move on with my life.
I noticed that when I check "view source" on the website, the results match with what I see when I use the WebClient class' method of DownloadFile. On the other hand, when I use the DownloadString method, the contents of that string are different from both view source and DownloadFile.
I need DownloadString to return similar contents to view source and DownloadFile. Any suggestions? My relevant code is below:
string criticalPathUrl = "http://blahblahblah&sessionId=" + sessionId;
WebClient wc = new WebClient();
wc.Encoding = System.Text.Encoding.UTF8;
//this is different
string urlContentsString = wc.DownloadString(criticalPathUrl);
//than this
wc.DownloadFile(criticalPathUrl, "rawDlTxt2.txt");
Edit: Please ignore this question as I just didn't scroll up far enough. Ugh. One of those days.
Use DownloadData instead of DownloadString, decode the bytes with the appropriate encoding, and then save the file.
Details: https://www.pavey.me/2016/04/aspnet-c-downloadstring-vs-downloaddata.html
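One way to pick the "appropriate encoding" is to read the charset from the response's Content-Type header and fall back to UTF-8 when none is given. This is a hedged sketch rather than what the linked article does; BodyDecoder and its UTF-8 fallback are assumptions made for illustration.

```csharp
using System;
using System.Net.Mime;
using System.Text;

public static class BodyDecoder
{
    // Decodes raw response bytes using the charset named in a
    // Content-Type header value, e.g. "text/html; charset=utf-8".
    public static string Decode(byte[] data, string contentTypeHeader)
    {
        Encoding encoding = Encoding.UTF8; // assumed fallback
        if (!string.IsNullOrEmpty(contentTypeHeader))
        {
            try
            {
                string charset = new ContentType(contentTypeHeader).CharSet;
                if (!string.IsNullOrEmpty(charset))
                {
                    encoding = Encoding.GetEncoding(charset);
                }
            }
            catch (FormatException)
            {
                // Unparsable header: keep the fallback.
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the fallback.
            }
        }
        return encoding.GetString(data);
    }
}
```

With WebClient, the header value is available as wc.ResponseHeaders[HttpResponseHeader.ContentType] once DownloadData has returned.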

Can't decode cyrillic value from Request.QueryString

I have an ASP.NET WebForms site on IIS7, and I use Cyrillic values in the query string. I apply HttpUtility.UrlEncode to the parameters when I redirect, so I end up with a URL like:
http://mysite.com/Search.aspx?SearchText=текст
When I try to read the SearchText parameter (including through HttpUtility.UrlDecode()), I get the wrong value ÑекÑÑ instead of текст.
It works on localhost under the ASP.NET development server, but not on IIS7 (including a local IIS7).
In my web.config I added the line
<globalization requestEncoding="utf-8" responseEncoding="utf-8" />
but it still doesn't work.
Appreciate any help,
Thanks a lot!
The problem was actually in UrlRewriting.net, which I use in my web application.
I solved the same problem by Base64-encoding the value with Convert.ToBase64String.
Before redirecting to the target page, I encoded the value:
Dim Data() As Byte 'For the data to be encoded
'Convert the string into a byte array
Dim encoding As New System.Text.UTF8Encoding
Data = encoding.GetBytes(ParamToPass)
'Convert the byte array to a Base64 string
Dim EncodedStringToPass As String = Convert.ToBase64String(Data)
Page.Response.Redirect("TargetPage.aspx?Param=" & EncodedStringToPass, False)
At the target page:
Dim Data() As Byte 'For the data to be decoded
Data = Convert.FromBase64String(Page.Request.Params("Param"))
Dim encoding As New System.Text.UTF8Encoding
Dim ParamToPass As String = encoding.GetString(Data)
P.S. The only disadvantage of this method is that the real parameter values are no longer readable in the browser's address bar. In my case that was not a problem.
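A C# sketch of the same round trip follows. It adds one step that the VB code does not have: standard Base64 output can contain +, / and =, and a + in a query string is commonly read back as a space, so the sketch percent-escapes the Base64 text as well. Base64Param is a hypothetical helper name.

```csharp
using System;
using System.Text;

public static class Base64Param
{
    // String -> UTF-8 bytes -> Base64 -> percent-escaped query value.
    public static string Encode(string value)
    {
        string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(value));
        return Uri.EscapeDataString(base64);
    }

    // Reverses Encode. Unescaping a value that was already unescaped
    // is harmless, because Base64 text never contains '%'.
    public static string Decode(string queryValue)
    {
        string base64 = Uri.UnescapeDataString(queryValue);
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }
}
```

The value produced by Encode is plain ASCII, so it no longer depends on how the redirect or a URL rewriter treats non-ASCII characters.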
If you use the Redirect function, then yes: inside it there is this call
url = UrlEncodeRedirect(url);
which breaks Cyrillic, Greek, and probably other characters. If I remember correctly (I say "remember" because I ran into this some months ago), the characters that break are the ones after the ? symbol. In any case, I have the same issue.
Possible solutions:
Write your own redirect. It may not be as good as the original, but it bypasses this issue.
Find an alternative to your redirect logic.
Write your own text encoding that uses only valid URL characters that the redirect does not change, and decode them again on the other side. The downside is that the search word becomes opaque text in the URL rather than something readable.
This is a very basic version of the redirect:
public static void RedirectSimple(string url, bool endResponse)
{
    HttpResponse MyResponse = HttpContext.Current.Response;
    MyResponse.Clear();
    MyResponse.TrySkipIisCustomErrors = true;
    MyResponse.StatusCode = 302;
    MyResponse.Status = "302 Temporarily Moved";
    MyResponse.RedirectLocation = url;
    MyResponse.Write("<html><head><title>Object moved</title></head><body>\r\n");
    MyResponse.Write("<h2>Object moved to here.</h2>\r\n");
    MyResponse.Write("</body></html>\r\n");
    if (endResponse)
    {
        MyResponse.End();
    }
}
You can wrap it in a function and try it to see whether it works correctly.
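If the URL handed to a custom redirect is built so that the query value is already percent-encoded, the Location value contains only ASCII characters. The sketch below shows one way to build such a URL; RedirectUrl.WithQueryValue is a hypothetical helper, and whether this avoids the problem in a given setup (for example with a URL rewriter in the pipeline) is an assumption that would need testing.

```csharp
using System;

public static class RedirectUrl
{
    // Builds "path?name=value" with the name and value percent-encoded
    // as UTF-8, so the result is ASCII-only.
    public static string WithQueryValue(string path, string name, string value)
    {
        return path + "?" + Uri.EscapeDataString(name) + "=" + Uri.EscapeDataString(value);
    }
}
```

For example, RedirectUrl.WithQueryValue("Search.aspx", "SearchText", "текст") yields Search.aspx?SearchText=%D1%82%D0%B5%D0%BA%D1%81%D1%82, which can then be passed to RedirectSimple.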

Google Translate Api and Special Characters

I've recently started using the Google Translate API inside a C# project. I am trying to translate some text from English to French, but I am having issues with some special characters.
For example, the word Company comes through as SociÃ©tÃ© instead of Société as it should. Is there some way in code to convert these to the correct special characters, i.e. Ã© to é?
Thanks
If you need any more info, let me know.
I ran into this same exact issue. If you're using the WebClient class to download the json response from google, try setting the Encoding property to UTF8.
using (var webClient = new WebClient { Encoding = Encoding.UTF8 })
{
    string json = webClient.DownloadString(someUri);
    ...
}
I have reproduced your problem, and it looks like you are using the UTF7 encoding. UTF8 is the way you need to go.
I use Google's API by creating a WebRequest to get an HTTP response from the server, then I read the response stream with a StreamReader. StreamReader defaults to UTF8, but to reproduce your problem, I passed Encoding.UTF7 into the StreamReader's constructor.
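The effect of the encoding passed to StreamReader can be sketched offline with a MemoryStream in place of the HTTP response stream. Note the substitution: this sketch uses ISO-8859-1 as the mismatched encoding rather than UTF7, purely as an illustrative assumption; StreamDecodeDemo is a hypothetical name.

```csharp
using System.IO;
using System.Text;

public static class StreamDecodeDemo
{
    // Reads all text from the given bytes as if they were a response
    // stream, decoding with the supplied encoding.
    public static string Read(byte[] responseBytes, Encoding encoding)
    {
        using (var stream = new MemoryStream(responseBytes))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Given the UTF-8 bytes of Société, reading with Encoding.UTF8 returns Société, while reading with Encoding.GetEncoding("iso-8859-1") returns SociÃ©tÃ©.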

How can I migrate email functionality from ASP Classic to ASP.NET?

I previously used CDO.Message and CDO.Configuration in ASP Classic to create HTML emails, which was VERY simple to do. In .NET, it appears that you have to give the System.Net.Mail.MailMessage object an HTML string for the content and then somehow embed the required images. Is there an easy way to do this in .NET? I'm pretty new to .NET MVC and would most appreciate any help.
This is how it looks in ASP Classic:
Set objCDO = Server.CreateObject("CDO.Message")
objCDO.To = "someone@somthing.com"
objCDO.From = "me@myaddress.com"
objCDO.CreateMHTMLBody "http://www.example.com/somepage.html"
objCDO.Subject = sSubject
'the following are for advanced CDO schematics
'for authentication and external SMTP
Set cdoConfig = CreateObject("CDO.Configuration")
With cdoConfig.Fields
    .Item(cdoSendUsingMethod) = cdoSendUsingPort '2 - send using port
    .Item(cdoSMTPServer) = "mail.myaddress.com"
    .Item(cdoSMTPServerPort) = 25
    .Item(cdoSMTPConnectionTimeout) = 10
    .Item(cdoSMTPAuthenticate) = cdoBasic
    .Item(cdoSendUsername) = "myusername"
    .Item(cdoSendPassword) = "mypassword"
    .Update
End With
Set objCDO.Configuration = cdoConfig
objCDO.Send
Basically I would like to send one of my views (minus site.master) as an email, images embedded.
I don't know of a simple way right off, but you could use WebClient to get your page, then pass the response as the body.
Example:
var webClient = new WebClient();
byte[] returnFromPost = webClient.UploadValues(Url, Inputs);
var utf = new UTF8Encoding();
string returnValue = utf.GetString(returnFromPost);
return returnValue;
Note: Inputs is just a dictionary of post variables.
One problem I think you'll run into right off is that I don't think you'd get the images. You could parse the HTML you get and then make the images absolute back to your server.
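An alternative to absolute image URLs is to embed each image in the message and reference it from the HTML by content ID. The following is a minimal sketch using System.Net.Mail, not a full replacement for CreateMHTMLBody: it assumes the HTML has already been fetched and that its img tags have been rewritten to src="cid:..." form. HtmlMailSketch and its parameters are hypothetical.

```csharp
using System.IO;
using System.Net.Mail;
using System.Net.Mime;
using System.Text;

public static class HtmlMailSketch
{
    // Builds an HTML message with one embedded image. The HTML is
    // expected to reference the image as <img src="cid:CONTENT_ID">.
    public static MailMessage Build(string from, string to, string subject,
        string html, byte[] imageBytes, string imageMediaType, string contentId)
    {
        var message = new MailMessage(from, to) { Subject = subject };

        AlternateView view = AlternateView.CreateAlternateViewFromString(
            html, Encoding.UTF8, MediaTypeNames.Text.Html);

        var image = new LinkedResource(new MemoryStream(imageBytes), imageMediaType)
        {
            ContentId = contentId
        };
        view.LinkedResources.Add(image);

        message.AlternateViews.Add(view);
        return message;
    }
}
```

The resulting message can be sent with SmtpClient.Send, configured with the same server, port, and credentials that the CDO configuration used.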
Thank you both for your help - here is a very clean and comprehensive tutorial posted by a .NET MVP
http://msdn.microsoft.com/en-us/vbasic/bb630227.aspx
