HtmlDecode - you’ll - c#

public static UserItem DownloadJSONString(string urlJson)
{
    using (WebClient wc = new WebClient())
    {
        var json = wc.DownloadString(urlJson);
        UserItem userItems = JsonConvert.DeserializeObject<RootObject>(json);
        return userItems;
    }
}
I'm deserializing a JSON file into a C# POCO class, something like this:
var root = JsonConvert.DeserializeObject<RootObject>(json);
I have noticed that it is translating you’ll to youâ€™ll. I'm not sure where that is coming from; I have looked at the JSON file in the browser and it renders as you’ll, NOT youâ€™ll.
I tried HttpUtility.HtmlDecode, but it does not decode it.
PS: I am not sure if this helps, but I'm using Newtonsoft.Json for deserializing.

This is probably not an issue specific to the JSON parser.
More likely, the issue is with how you are acquiring the JSON. It appears to be an encoding issue: perhaps you are reading from a source that uses a Windows-1252 or ISO-8859-1 encoding, but the reader is treating it as UTF-8. Or vice versa.
WebClient.DownloadString uses a supplied Encoding to do this conversion. You need to set it explicitly:
public static UserItem DownloadJSONString(string urlJson)
{
    using (WebClient wc = new WebClient())
    {
        wc.Encoding = Encoding.UTF8;
        var json = wc.DownloadString(urlJson);
        UserItem userItems = JsonConvert.DeserializeObject<RootObject>(json);
        return userItems;
    }
}
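The mismatch can be reproduced without any network call. The sketch below is only an illustration: the class and method names are made up, and it uses ISO-8859-1 rather than Windows-1252 because that encoding is available on every .NET runtime without registering extra code pages. With Windows-1252 the wrongly decoded apostrophe shows up as â€™; with ISO-8859-1 it shows up as â followed by two control characters, but the mechanism is the same.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of WebClient or Json.NET.
public static class MojibakeDemo
{
    // ISO-8859-1 maps every byte value to exactly one character,
    // so decoding and re-encoding with it never loses bytes.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Encodes the text as UTF-8 and then decodes those bytes with the
    // wrong encoding, which is what produces the garbled text.
    public static string DecodeWrongly(string original)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(original);
        return Latin1.GetString(utf8Bytes);
    }

    // Reverses the damage: recover the raw bytes, then decode them as UTF-8.
    public static string Repair(string garbled)
    {
        byte[] rawBytes = Latin1.GetBytes(garbled);
        return Encoding.UTF8.GetString(rawBytes);
    }
}
```

Calling MojibakeDemo.DecodeWrongly("you\u2019ll") turns the single apostrophe into three characters, one per UTF-8 byte. Setting the right encoding before the download avoids having to repair anything afterwards.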

Related

How can I search for a Hebrew word on a website using C#

I'm trying to search for a Hebrew word on a website using C#, but I can't figure it out.
This is the code I'm currently working with:
var client = new WebClient();
Encoding encoding = Encoding.GetEncoding(1255);
var text = client.DownloadString("http://shchakim.iscool.co.il/default.aspx");
if (text.Contains("ביטול"))
{
    MessageBox.Show("idk");
}
Thanks for any help :)
The problem seems to be that WebClient is not using the right encoding when converting the response into a string. For this conversion to happen correctly, you must set the WebClient.Encoding property to the encoding the server actually uses.
I inspected the response from the server and it is encoded using UTF-8. The updated code below reflects this change:
using (var client = new WebClient())
{
    client.Encoding = System.Text.Encoding.UTF8;
    var text = client.DownloadString("http://shchakim.iscool.co.il/default.aspx");
    // The response from the server doesn't contain the word ביטול, so for demo purposes I replaced it with שוחרות, which is present in the response.
    if (text.Contains("שוחרות"))
    {
        MessageBox.Show("idk");
    }
}
Here you can find more information about the WebClient.Encoding property:
https://learn.microsoft.com/en-us/dotnet/api/system.net.webclient.encoding?view=netframework-4.7.2
Hope this helps.

Kanji characters from WebClient html different from actual Kanji in website

So, I'm trying to get a portion of text from a website called Kanji-A-Day.com, but I have a problem.
You see, I'm trying to get the daily kanji from the website, and I was able to narrow the HTML down to what I want, but it seems the characters are different..?
[Image: what it looks like]
[Image: what it should look like]
What's even more strange is that I produced the results for the second image by copying and pasting directly from the site, so it's not a font problem.
Here's the code I use for getting the character:
public void UpdateDailyKanji() // Called at the initialization of a new main form
{
    string kanji;
    using (WebClient client = new WebClient()) // Grab the string
        kanji = client.DownloadString("http://www.kanji-a-day.com/level4/index.php");
    // Trim the HTML to just the Kanji
    kanji = kanji.Remove(0, kanji.IndexOf(@"<div class=""glyph"">") + 19);
    kanji = kanji.Remove(kanji.IndexOf("</div>")-2);
    kanji = kanji.Trim();
    Text_DailyKanji.Text = kanji; // Set the Kanji
}
Does anyone know what's going on here? I'm guessing it's some Unicode thing but I don't know much about it.
Thanks in advance.
The page you're trying to download as a string is encoded using charset=EUC-JP, also known as Japanese (EUC) (CodePage 51932). This is clearly set in the page headers.
Why is the string returned by WebClient.DownloadString encoded using the wrong encoder?
The MSDN Docs state this:
This method retrieves the specified resource. After it downloads the
resource, the method uses the encoding specified in the Encoding
property to convert the resource to a String.
Thus, you have to know beforehand what encoding will be used and specify it, setting the WebClient.Encoding property.
To verify this, check the .NET Reference Source for the WebClient.DownloadString method:
try {
    WebRequest request;
    byte [] data = DownloadDataInternal(address, out request);
    string stringData = GetStringUsingEncoding(request, data);
    if(Logging.On)Logging.Exit(Logging.Web, this, "DownloadString", stringData);
    return stringData;
} finally {
    CompleteWebClientState();
}
The encoding is set using the Request settings, not the Response ones.
The result is, the downloaded string is encoded using the default CodePage.
What you can do now is one of the following:
Download the page twice, using the first download to check whether the WebClient encoding and the HTML page encoding match.
Re-encode the string with the correct encoding, as set in the underlying WebResponse.
Don't use WebClient; use HttpClient or WebRequest directly. Or, if you like this tool, create a custom WebClient class to handle the WebRequest/WebResponse in a more direct way.
This is a method to perform the re-encoding task:
The string returned by WebClient is converted to a Byte Array and passed to a MemoryStream, then re-encoded using a StreamReader with the Encoding retrieved from the Content-Type: charset Response Header.
EDIT:
Now using Reflection to get the page Encoding from the underlying HttpWebResponse. This should avoid errors in parsing the original CharacterSet as defined by the remote response.
using System;
using System.IO;
using System.Net;
using System.Reflection;
using System.Text;
public string WebClient_DownLoadString(Uri uri)
{
    using (var client = new WebClient())
    {
        // If Windows 7 - Windows Server 2008 R2
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
        client.CachePolicy = new System.Net.Cache.RequestCachePolicy(System.Net.Cache.RequestCacheLevel.BypassCache);
        client.Headers.Add(HttpRequestHeader.Accept, "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        client.Headers.Add(HttpRequestHeader.AcceptLanguage, "en-US,en;q=0.8");
        client.Headers.Add(HttpRequestHeader.KeepAlive, "keep-alive");
        string result = client.DownloadString(uri);
        // Read the private field that holds the underlying response
        var flags = BindingFlags.Instance | BindingFlags.NonPublic;
        using (var response = (HttpWebResponse)client.GetType().GetField("m_WebResponse", flags).GetValue(client))
        {
            // Encoding declared by the server in the Content-Type header
            var pageEncoding = Encoding.GetEncoding(response.CharacterSet);
            // Recover the bytes and decode them again with the page encoding
            byte[] bytes = client.Encoding.GetBytes(result);
            using (var ms = new MemoryStream(bytes, 0, bytes.Length))
            using (var reader = new StreamReader(ms, pageEncoding))
            {
                ms.Position = 0;
                return reader.ReadToEnd();
            }
        }
    }
}
Now your code should get the Japanese characters in their correct form.
Uri uri = new Uri("http://www.kanji-a-day.com/level4/index.php", UriKind.Absolute);
string kanji = WebClient_DownLoadString(uri);
kanji = kanji.Remove(0, kanji.IndexOf("<div class=\"glyph\">") + 19);
kanji = kanji.Remove(kanji.IndexOf("</div>")-2);
kanji = kanji.Trim();
Text_DailyKanji.Text = kanji;
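As a variation on the re-encoding idea, the raw bytes can be decoded once, using the charset named in the Content-Type header, so no reflection and no second conversion are needed. This is a minimal sketch, not the method used above: ResponseDecoder and Decode are hypothetical names, and falling back to UTF-8 when no charset is declared is an assumption. On .NET Core and later, EUC-JP is only available after registering CodePagesEncodingProvider from the System.Text.Encoding.CodePages package.

```csharp
using System;
using System.Net.Mime;
using System.Text;

// Hypothetical helper, shown for illustration only.
public static class ResponseDecoder
{
    // Decodes a response body using the charset declared in the
    // Content-Type header value, e.g. "text/html; charset=EUC-JP".
    public static string Decode(byte[] body, string contentTypeHeader)
    {
        Encoding encoding = Encoding.UTF8; // assumed fallback when no charset is declared

        if (!string.IsNullOrWhiteSpace(contentTypeHeader))
        {
            // ContentType parses the header value and exposes the charset parameter.
            string charset = new ContentType(contentTypeHeader).CharSet;
            if (!string.IsNullOrEmpty(charset))
            {
                encoding = Encoding.GetEncoding(charset);
            }
        }

        return encoding.GetString(body);
    }
}

// Possible usage with WebClient (requires using System.Net):
//   byte[] body = client.DownloadData(uri);
//   string html = ResponseDecoder.Decode(
//       body, client.ResponseHeaders[HttpResponseHeader.ContentType]);
```

Because DownloadData returns the bytes untouched, nothing is lost to an intermediate decoding step with the wrong encoding.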

How do I grab specific strings from a source?

I want to create an app that shows current information. I can get the information using a simple (at least, it looks pretty simple) API.
You use the API by going to the website and entering the username. The resulting URL looks something like this: site.web/api/public/user?name=Usename.
That page contains all the information I need, in the form of one line of 'code':
{"uniqueId":"hhus-7723dec98ecb9bc6643f10588e0bb3f4","name":"Username","figureString":"hr-125-40.hd-209-1369.ch-210-64.lg-270-1408.he-3329-1408-1408","selectedBadges":[],"motto":"sample txt","memberSince":"2012-08-25T14:01:04.000+0000","profileVisible":true,"lastWebAccess":null}
I want to extract this information and display it in my program, example:
{"uniqueId":"this is an ID"}
I only want the actual ID to be shown: this is an ID.
Thanks for helping!
The format you're receiving is called JSON. There are lots of libraries to read it easily, the most widely used in C# is JSON.NET.
If you only need to extract one property, you can do something like this:
string json = ...
var obj = JObject.Parse(json);
string uniqueId = obj["uniqueId"].Value<string>();
If you also need the other properties, it's probably easier to use deserialization: create a class with the same properties as the JSON object, and use JsonConvert.DeserializeObject to read the JSON into an instance of the class.
The one line of code you're referring to is JSON data. It's stored in the format "key":"value","key":"value","key":"value" and so on.
You should take a look at Newtonsoft.Json which helps you do exactly this: parse JSON data :)
https://www.nuget.org/packages/Newtonsoft.Json/
Tying it all together for you...
using System.IO;
using System.Net;
using Newtonsoft.Json.Linq;
...
WebClient client = new WebClient();
Stream stream = client.OpenRead("http://site.web/api/public/user?name=Usename");
StreamReader reader = new StreamReader(stream);
string userJson = reader.ReadLine();
reader.Close();
JObject jObject = JObject.Parse(userJson);
string uniqueId = (string)jObject["uniqueId"];
This is an example of JSON. The most type-safe way is to deserialize the data into a class you define.
Such a class could look like this:
public class MyClass
{
    public string uniqueId { get; set; }
}
If you have the data in a string, you can just deserialize it with the Newtonsoft.Json NuGet package.
MyClass obj = JsonConvert.DeserializeObject<MyClass>(myJsonString);
If you get the data over HTTP, it is easier to use a client which can do the deserialization for you. Such a client is found in the NuGet package Microsoft.AspNet.WebApi.Client:
using (var client = new HttpClient())
{
    var response = await client.GetAsync(myUrl);
    response.EnsureSuccessStatusCode();
    MyClass obj = await response.Content.ReadAsAsync<MyClass>();
}
Of course, this assumes the server is standards compliant and specifies its content type as application/json.
Bonus: the classes you deserialize to can be auto-generated from an example at this site: http://json2csharp.com/ .

Deserializing the object inside an http post

Hi, I am trying to deserialize an object from an HttpPost method call inside an authorize attribute. I am using the ASP.NET Web API framework.
Here is my code:
public override void OnAuthorization(HttpActionContext actionContext)
{
    var rezult = DeserializeStream<EvaluationFormDataContract>(actionContext.Request.Content.ReadAsStreamAsync().Result);
}
private T DeserializeStream<T>(Stream stream)
{
    var binaryFormatter = new BinaryFormatter();
    var rez = binaryFormatter.Deserialize(stream);
    var t = (T)binaryFormatter.Deserialize(stream);
    return t;
}
When this code is executed, I get this exception when the binaryFormatter tries to deserialize the stream:
The input stream is not a valid binary format. The starting contents (in bytes) are: 73-74-75-64-65-6E-74-41-73-73-69-67-6E-6D-65-6E-74 ...
What am I doing wrong?
You are trying to use BinaryFormatter to deserialize data which was not binary serialized. The bytes listed in the exception are simply the character codes of a plain text string.
73-74-75-64-65-6E-74-41-73-73-69-67-6E-6D-65-6E-74 decoded is studentAssignment
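To check this kind of hex dump yourself, the bytes can be converted back to text in a few lines. This is only an illustrative sketch; HexDump and ToAsciiText are made-up names, and it assumes the bytes are plain ASCII, as they are in this exception message.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper for inspecting the bytes quoted in an exception message.
public static class HexDump
{
    // Converts a dash-separated hex string such as "73-74-75" into text,
    // assuming every byte is an ASCII character code.
    public static string ToAsciiText(string dashSeparatedHex)
    {
        byte[] bytes = dashSeparatedHex
            .Split('-')
            .Select(pair => Convert.ToByte(pair, 16)) // parse each pair as base 16
            .ToArray();
        return Encoding.ASCII.GetString(bytes);
    }
}
```

Passing the byte sequence from the exception message to HexDump.ToAsciiText returns studentAssignment, which looks like a property name rather than binary serializer output.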
This leads me to believe you are doing a simple AJAX call and sending JSON data to the Web API service.
You need to deserialize the stream as JSON:
Read the request content as a string.
If the content is JSON, deserialize it using JSON.NET.
var json = actionContext.Request.Content.ReadAsStringAsync().Result;
var m = JsonConvert.DeserializeObject<EvaluationFormDataContract>(json);
If the request content is not JSON but form data, you can parse it like a query string.
var stringData = actionContext.Request.Content.ReadAsStringAsync().Result;
NameValueCollection data = HttpUtility.ParseQueryString(stringData);
string personId = data["personId"];

Consuming web services in C#

I just started playing around with some APIs in C#. In my form I added a service reference to http://wsf.cdyne.com/WeatherWS/Weather.asmx. Everything works great and I am able to use its library. Now I am trying to use, for example, http://free.worldweatheronline.com/feed/apiusage.ashx?key=(key goes in here)&format=xml. [I have a key] When I try to add it as a service reference, I am not able to use it.
Do I have to call it in my form instead of referencing it, or do some sort of conversion? Also, does it matter if it's XML or JSON?
ASMX is old technology and uses SOAP under the hood. SOAP doesn't tend to work with query string parameters, it takes parameters as part of the message.
ASHX is something different (it could be anything; it's one way to write a raw HTML/XML page in .NET), so you can't transfer the method for calling one to the other. It also won't have a service reference; you will most likely request it via a raw HTTP request. You'll need to consult the service documentation to discover how to use it.
worldweatheronline doesn't return a SOAP-XML that is consumable by a WebService client. Therefore you should download the response and parse it as done with many REST services.
string url = "http://free.worldweatheronline.com/feed/apiusage.ashx?key=" + apikey;
using (WebClient wc = new WebClient())
{
    string xml = wc.DownloadString(url);
    var xDoc = XDocument.Parse(xml);
    var result = xDoc.Descendants("usage")
        .Select(u => new
        {
            Date = u.Element("date").Value,
            DailyRequest = u.Element("daily_request").Value,
            RequestPerHour = u.Element("request_per_hour").Value,
        })
        .ToList();
}
Also does it matter if its xml or json type?
No, at the end you have to parse the response by yourself.
string url = "http://free.worldweatheronline.com/feed/apiusage.ashx?format=json&key=" + apikey;
using (WebClient wc = new WebClient())
{
    string json = wc.DownloadString(url);
    dynamic dynObj = JsonConvert.DeserializeObject(json);
    var jArr = (JArray)dynObj.data.api_usage[0].usage;
    var result = jArr.Select(u => new
        {
            Date = (string)u["date"],
            DailyRequest = (string)u["daily_request"],
            RequestPerHour = (string)u["request_per_hour"]
        })
        .ToList();
}
PS: I used Json.Net to parse the json string
