I know this looks like a duplicate question, but I could not find a solution that works for my issue.
I am trying to scrape data from a website that provides cargo ship data: Link (it's a Korean website; the black button on the right is the search button).
To obtain data from it, some radio buttons have to be set before hitting search.
I thought I could just pass the parameter values through FormUrlEncodedContent and then use PostAsync, but somehow the values do not get through.
Here is my code so far:
using (var client = new HttpClient())
{
client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36");
client.DefaultRequestHeaders.TryAddWithoutValidation("Content-Type", "application/x-www-form-urlencoded");
var doc = new HtmlAgilityPack.HtmlDocument();
var content = new FormUrlEncodedContent(structInfo.ScriptValues);
var response = await client.PostAsync(structInfo.PageURL, content);
var responseString = await response.Content.ReadAsStringAsync();
Console.WriteLine(responseString);
}
using (WebClient client = new WebClient())
{
var reqparm = new System.Collections.Specialized.NameValueCollection();
reqparm.Add("v_time", "month");
reqparm.Add("ROCD", "ALL");
reqparm.Add("ORDER", "item2");
reqparm.Add("v_gu", "S");
byte[] responsebytes = client.UploadValues("http://info.bptc.co.kr:9084/content/sw/frame/berth_status_text_frame_sw_kr.jsp", "POST", reqparm);
string responsebody = Encoding.UTF8.GetString(responsebytes);
Console.WriteLine(responsebody);
}
These are the values I put in the StructInfo class:
PageURL = "http://info.bptc.co.kr:9084/content/sw/frame/berth_status_text_frame_sw_kr.jsp",
ScriptValues = new Dictionary<string, string>
{
{"v_time", "month"},
{"ROCD", "ALL"},
{"ORDER", "item2"},
{"v_gu", "S"}
},
I have tried HttpClient, WebClient, and WebBrowser so far, with no luck.
The strange thing is that when I send the POST with Burp Suite, the data comes back just the way I wanted.
I've been searching for a solution for the last 4 hours without success.
Would you mind helping me?
Thanks
Generated code for C# - RestSharp by Postman
var client = new RestClient("http://info.bptc.co.kr:9084/Berth_status_text_servlet_sw_kr");
client.Timeout = -1;
var request = new RestRequest(Method.POST);
request.AddHeader("Content-Type", "application/x-www-form-urlencoded");
request.AddParameter("v_time", "3days");
request.AddParameter("ROCD", "ALL");
request.AddParameter("ORDER", "item2");
request.AddParameter("v_gu", "S");
IRestResponse response = client.Execute(request);
Console.WriteLine(response.Content);
HttpClient version
using var client = new HttpClient();
var content = new FormUrlEncodedContent(new[]
{
new KeyValuePair<string, string>("v_time", "3days"),
new KeyValuePair<string, string>("ROCD", "ALL"),
new KeyValuePair<string, string>("ORDER", "item2"),
new KeyValuePair<string, string>("v_gu", "S"),
});
string url = "http://info.bptc.co.kr:9084/Berth_status_text_servlet_sw_kr";
var response = await client.PostAsync(url, content);
var bytes = await response.Content.ReadAsByteArrayAsync();
string responseString = Encoding.UTF8.GetString(bytes);
Console.WriteLine(responseString);
The issue
Regarding the HttpClient version, and assuming you are using .NET Core:
The exception is thrown by the ReadAsStringAsync call.
More specifically, here:
https://github.com/microsoft/referencesource/blob/aaca53b025f41ab638466b1efe569df314f689ea/System/net/System/Net/Http/HttpContent.cs#L95
The response has Content-Type: text/html; charset=euc-kr.
The problem is that .NET Core does not support the Korean charset out of the box.
My workaround is to use ReadAsByteArrayAsync instead and then decode the bytes with the supported UTF-8 encoding. That garbles the Korean characters, though.
A better way would be to reference the System.Text.Encoding.CodePages package and then call Encoding.RegisterProvider.
Something like this: Encoding.GetEncoding can't work in UWP app
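As a minimal sketch of the RegisterProvider approach, assuming the System.Text.Encoding.CodePages package is referenced: the class and method names below are hypothetical, and the hard-coded sample bytes stand in for what ReadAsByteArrayAsync would return from the server.

```csharp
using System;
using System.Text;

static class EucKrDecoding
{
    // Decodes a response body that the server declared as charset=euc-kr.
    // Hypothetical helper, not part of any library.
    public static string Decode(byte[] responseBytes)
    {
        // Makes the code-page encodings (including euc-kr) visible to Encoding.GetEncoding.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("euc-kr").GetString(responseBytes);
    }

    static void Main()
    {
        // The four bytes below are the EUC-KR encoding of the two-syllable word "한글".
        byte[] sample = { 0xC7, 0xD1, 0xB1, 0xDB };
        Console.WriteLine(Decode(sample) == "한글");
    }
}
```

In the HttpClient version above, the byte array passed to Decode would be the result of ReadAsByteArrayAsync, replacing the Encoding.UTF8.GetString call.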
Related
I'm using HttpRequestMessage from HttpClient in a couple of methods and currently, I'm repeating the following piece of code all over my code:
This code was converted by https://curl.olsh.me/ so I'm not sure if best practices were used here.
// using System.Net.Http;
using (var httpClient = new HttpClient(handler))
{
using (var request = new HttpRequestMessage(new HttpMethod("POST"), "https://www.url.com/"))
{
request.Headers.TryAddWithoutValidation("authority", "www.url.com");
request.Headers.TryAddWithoutValidation("pragma", "no-cache");
request.Headers.TryAddWithoutValidation("cache-control", "no-cache");
request.Headers.TryAddWithoutValidation("dnt", "1");
request.Headers.TryAddWithoutValidation("x-requested-with", "XMLHttpRequest");
request.Headers.TryAddWithoutValidation("x-odesk-csrf-token", "19b91748869456a4ae700ffb69077745");
request.Headers.TryAddWithoutValidation("user-agent", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36");
request.Headers.TryAddWithoutValidation("accept", "*/*");
request.Headers.TryAddWithoutValidation("origin", "https://www.url.com");
request.Headers.TryAddWithoutValidation("sec-fetch-site", "same-origin");
request.Headers.TryAddWithoutValidation("sec-fetch-mode", "cors");
request.Headers.TryAddWithoutValidation("sec-fetch-dest", "empty");
request.Headers.TryAddWithoutValidation("referer", "https://www.url.com/");
request.Headers.TryAddWithoutValidation("accept-language", "pt-BR,pt;q=0.9,en-US;q=0.8,en;q=0.7,fr-FR;q=0.6,fr;q=0.5");
request.Headers.TryAddWithoutValidation("cookie", "G_AUTHUSER_H=1; AccountSecurity_cat=fc4d14f1.oauth2v2_812293");
request.Content.Headers.ContentType = MediaTypeHeaderValue.Parse("application/json");
var response = await httpClient.SendAsync(request);
IEnumerable<string> cookies = new List<string>();
response.Headers.TryGetValues("Set-Cookie", out cookies);
I assume you are trying to understand how to extract the code in question into a helper method given its use of using blocks. If not, please clarify and I will adjust my answer. Here is what I would do in this case:
// The method must be marked async to use await; the awaited result is returned directly.
async Task<HttpResponseMessage> PostAsyncWithHeaders(Uri uri)
{
    using (var httpClient = new HttpClient(handler))
    using (var request = new HttpRequestMessage(HttpMethod.Post, uri))
    {
        //Add your headers as before
        return await httpClient.SendAsync(request);
    }
}
You can then call the method from anywhere it is accessible like this:
var response = await PostAsyncWithHeaders(new Uri("https://www.url.com/"));
IEnumerable<string> cookies = new List<string>();
response.Headers.TryGetValues("Set-Cookie", out cookies);
//Presumably consume cookies (yum!)
I made this test in .NET Core 3 for reference.
If you want to be able to reuse your method throughout the code-base, you can try something like the following:
static async Task Main(string[] args)
{
    var client = new HttpClient();
    // this is wrapped in the using statement.
    using var requestMessage = GetRequestMessage("https://www.url.com/", HttpMethod.Post);
    // Content-Type is a content header, so it is set on the request content
    // rather than on DefaultRequestHeaders (requires using System.Text).
    // "{}" is only a placeholder body for illustration.
    requestMessage.Content = new StringContent("{}", Encoding.UTF8, "application/json");
    await client.SendAsync(requestMessage);
}
// Get the request message for reuse. You can then reuse this for maybe different end-points and method types.
// the "referer" or "origin" values can be passed in as parameters too.
static HttpRequestMessage GetRequestMessage(string url, HttpMethod method)
{
var request = new HttpRequestMessage(method, url);
request.Headers.TryAddWithoutValidation("authority", "www.url.com");
request.Headers.TryAddWithoutValidation("pragma", "no-cache");
request.Headers.TryAddWithoutValidation("cache-control", "no-cache");
request.Headers.TryAddWithoutValidation("dnt", "1");
request.Headers.TryAddWithoutValidation("x-requested-with", "XMLHttpRequest");
request.Headers.TryAddWithoutValidation("x-odesk-csrf-token", "19b91748869456a4ae700ffb69077745");
request.Headers.TryAddWithoutValidation("user-agent", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36");
request.Headers.TryAddWithoutValidation("accept", "*/*");
request.Headers.TryAddWithoutValidation("origin", "https://www.url.com");
request.Headers.TryAddWithoutValidation("sec-fetch-site", "same-origin");
request.Headers.TryAddWithoutValidation("sec-fetch-mode", "cors");
request.Headers.TryAddWithoutValidation("sec-fetch-dest", "empty");
request.Headers.TryAddWithoutValidation("referer", "https://www.url.com/");
request.Headers.TryAddWithoutValidation("accept-language", "pt-BR,pt;q=0.9,en-US;q=0.8,en;q=0.7,fr-FR;q=0.6,fr;q=0.5");
request.Headers.TryAddWithoutValidation("cookie", "G_AUTHUSER_H=1; AccountSecurity_cat=fc4d14f1.oauth2v2_812293");
return request;
}
If you are using this in a console program, creating the HttpClient in a using statement is fine. However, if you intend to use the client in something like a web app in .NET Core, use the services.AddHttpClient method and dependency injection instead. HttpClient isn't intended to be disposed of after every use, especially if you use it multiple times in a given call.
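A rough sketch of the AddHttpClient route, assuming the Microsoft.Extensions.Http package is referenced. A console host is used here only to keep the example self-contained; in a web app the AddHttpClient call would normally sit in the service registration code, and the class name is hypothetical.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;

static class HttpClientFactoryDemo
{
    static void Main()
    {
        // Register IHttpClientFactory with the dependency injection container.
        var services = new ServiceCollection();
        services.AddHttpClient();

        using var provider = services.BuildServiceProvider();

        // Resolve the factory and ask it for a client instead of calling new HttpClient().
        var factory = provider.GetRequiredService<IHttpClientFactory>();
        HttpClient client = factory.CreateClient();

        Console.WriteLine(client != null);
    }
}
```

The factory manages the lifetime of the underlying handlers, so calling code can request a client wherever it needs one without managing disposal itself.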
I really appreciate the answers from #Jaquez and #Kuroiyatsu; combining both, I arrived at the following:
public async Task<string> postAsync(param1, param2)
{
using (var httpClient = new HttpClient(handler))
{
using (var request = new HttpRequestMessage(new HttpMethod("POST"), "https://www.url.com/"))
{
....
var response = await PostAsync(param1, param2);
var variable = JsonSerializer.Deserialize<Obj>(response);
WebScrapFunc(response);
...
Although it seems weird to return a Task<string> from PostAsync, it suits my purposes fine.
I'm trying to deserialize some JSON in C#, but when I run my program I'm getting this error message:
I've looked through all my code, and I can't find a "<" anywhere there shouldn't be one, and I went to the web address that the json is coming from:
http://forecast.weather.gov/MapClick.php?lat=47.1211&lon=-88.5694&FcstType=json,
and there isn't a "<" character. I used json2csharp.com to translate to C# classes, and everything there seems fine as well. Any thoughts? Here is the part of my code where I try to do all of this:
var http = new HttpClient();
var url = "http://forecast.weather.gov/MapClick.php?lat=47.1211&lon=-88.5694&FcstType=json";
var response = await http.GetAsync(url);
var result = await response.Content.ReadAsStringAsync();
var serializer = new DataContractJsonSerializer(typeof(RootObject2));
var ms = new MemoryStream(Encoding.UTF8.GetBytes(result));
var data = (RootObject2)serializer.ReadObject(ms);
return data;
Your call is failing because you are not setting a header the API is expecting. Add a user agent and check for success prior to attempting to read the response.
var http = new HttpClient();
var url = "http://forecast.weather.gov/MapClick.php?lat=47.1211&lon=-88.5694&FcstType=json";
//Supply the same header as chrome
http.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36");
var response = await http.GetAsync(url);
if (response.IsSuccessStatusCode)
{
var result = await response.Content.ReadAsStringAsync();
var ms = new MemoryStream(Encoding.UTF8.GetBytes(result));
var serializer = new DataContractJsonSerializer(typeof(RootObject2));
var data = (RootObject2)serializer.ReadObject(ms);
}
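A stray "<" in a JSON parsing error often means the server sent back an HTML page rather than JSON. As a small sketch, a check like the following could run before deserializing; the helper name is hypothetical and it only inspects the first non-whitespace character, so it is a quick diagnostic rather than a validator.

```csharp
using System;

static class BodyChecks
{
    // Returns true when the body starts like a JSON object or array.
    public static bool LooksLikeJson(string body)
    {
        if (string.IsNullOrWhiteSpace(body))
        {
            return false;
        }

        char first = body.TrimStart()[0];
        return first == '{' || first == '[';
    }

    static void Main()
    {
        Console.WriteLine(LooksLikeJson("{\"time\": {}}"));        // JSON-like body
        Console.WriteLine(LooksLikeJson("<html>Forbidden</html>")); // HTML error page
    }
}
```

If the check fails, logging the raw result string usually shows what the server actually returned.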
Check this answer; it describes a connection issue where the full response was not being received from the API:
Unexpected character encountered while parsing value:
I am trying to make a request to an API called Pacer.gov. I'm expecting a file to be returned, but I'm not getting it. Can someone help me with what I'm missing?
So my C# Rest call looks like this:
(The variable PacerSession is the authentication cookie I got (with help from #jonathon-reinhart); read more about that here: How do I use RestSharp to POST a login and password to an API?)
var client = new RestClient("https://pcl.uscourts.gov/dquery");
client.CookieContainer = new System.Net.CookieContainer();
//var request = new RestRequest("/dquery", Method.POST);
var request = new RestRequest(Method.POST);
request.AddParameter("download", "1");
request.AddParameter("dl_fmt", "xml");
request.AddParameter("party", "Moncrief");
request.AddHeader("user-agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36");
request.AddHeader("content-type", "text/plain; charset=utf-8");
request.AddHeader("accept", "*/*");
request.AddHeader("accept-encoding", "gzip, deflate, sdch");
request.AddHeader("accept-language", "en-US,en;q=0.8");
request.AddHeader("cookie", "PacerSession=" + PacerSession);
IRestResponse response = client.Execute(request);
If I just type the URL https://pcl.uscourts.gov/dquery?download=1&dl_fmt=xml&party=Moncrief into Chrome, I get back an XML file. When I look at the IRestResponse, I don't see anything that looks like a file. Is there something wrong with my request or am I getting the file back and just need to know how to retrieve it?
Here's part of the file I get back if I use the URL directly in the browser:
Here's what I see in VS when I debug it and look at the IRestResponse variable:
UPDATE - 6/3/16
Received this response from Pacer tech support:
In the Advanced REST Client, you will see a HTTP 302 response (a redirect to another page). In a normal browser, the redirect is automatically followed without the user seeing anything (even on the URL in the browser).
The ARC does not automatically follow that redirect to the target page.
You can see in the header of the response the target URL that has the results.
If you manually cut and paste this URL to the ARC as a HTTP GET request, you will get the XML results. I have never used C#, but there is usually a property associated with web clients that will force the client to follow the redirect.
I tried adding this:
client.FollowRedirects = true;
but I'm still not seeing an xml file when I debug this code:
IRestResponse response = client.Execute(request);
How do I get the file? Is there something I have to do to get the file from the URL it's being redirected to?
There's one major problem with your code. You're only carrying one of the three cookies that check-pacer-passwd.pl returns. You need to preserve all three. The following code is a possible implementation of this, with some notes afterwards.
public class PacerClient
{
private CookieContainer m_Cookies = new CookieContainer();
public string Username { get; set; }
public string Password { get; set; }
public PacerClient(string username, string password)
{
this.Username = username;
this.Password = password;
}
public void Connect()
{
var client = new RestClient("https://pacer.login.uscourts.gov");
client.CookieContainer = this.m_Cookies;
RestRequest request = new RestRequest("/cgi-bin/check-pacer-passwd.pl", Method.POST);
request.AddParameter("loginid", this.Username);
request.AddParameter("passwd", this.Password);
IRestResponse response = client.Execute(request);
if (response.Cookies.Count < 1)
{
throw new WebException("No cookies returned.");
}
}
public XmlDocument SearchParty(string partyName)
{
string requestUri = $"/dquery?download=1&dl_fmt=xml&party={partyName}";
var client = new RestClient("https://pcl.uscourts.gov");
client.CookieContainer = this.m_Cookies;
var request = new RestRequest(requestUri);
IRestResponse response = client.Execute(request);
if (!String.IsNullOrEmpty(response.Content))
{
XmlDocument result = new XmlDocument();
result.LoadXml(response.Content);
return result;
}
else return null;
}
}
It's easiest to just keep a hold of the CookieContainer throughout the entire time you're working with Pacer. I wrapped the functionality into a class, just to make it a little easier to package up with this answer, but you can implement it however you want. I didn't put in any real error checking, so you probably want to check that response.ResponseUri is actually the search page and not the logon page, and that the content is actually well-formed XML.
I've tested this using my own Pacer account, like so:
PacerClient client = new PacerClient(Username, Password);
client.Connect();
var document = client.SearchParty("Moncrief");
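The well-formedness check mentioned above could be sketched as follows. The helper name is hypothetical; it simply wraps XmlDocument.LoadXml so that a logon page or other non-XML body yields false instead of an exception.

```csharp
using System;
using System.Xml;

static class XmlChecks
{
    // Tries to parse the content as XML; returns false for empty or malformed input.
    public static bool TryLoadXml(string content, out XmlDocument document)
    {
        document = null;
        if (string.IsNullOrWhiteSpace(content))
        {
            return false;
        }

        var candidate = new XmlDocument();
        try
        {
            candidate.LoadXml(content);
        }
        catch (XmlException)
        {
            // Not well-formed XML, for example an HTML logon page.
            return false;
        }

        document = candidate;
        return true;
    }

    static void Main()
    {
        Console.WriteLine(TryLoadXml("<cases><case id=\"1\" /></cases>", out _));
        Console.WriteLine(TryLoadXml("<p>Please log in<br></p>", out _));
    }
}
```

In SearchParty, response.Content would be passed to this helper in place of the direct LoadXml call.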
I am having an issue with a REST API call in a Windows Phone application.
The situation is as follows:
I want to pass two parameters, named "session_token" and "userid", as headers in a REST API call. I am using the following code, but the output does not match the Postman output (attached in the screenshot).
I have the following code:
using (var client = new HttpClient())
{
client.BaseAddress = new Uri(Connection);
client.DefaultRequestHeaders.Accept.Clear();
client.DefaultRequestHeaders.TryAddWithoutValidation("Content-ype", "application/x-www-form-urlencoded");
client.DefaultRequestHeaders.TryAddWithoutValidation("session_token", "sdfsffsdfsdffsfsdfsdfsdfsdf");
client.DefaultRequestHeaders.TryAddWithoutValidation("userid", "sdfsdfsdfsdfsdfsdd");
var postData = new List<KeyValuePair<string, string>>();
postData.Add(new KeyValuePair<string, string>("changepasswordinput", "{\"oldpassword\":\"sdfsdfdf\",\"newpassword\":\"sdfsdfsdf\"}"));
HttpContent content = new FormUrlEncodedContent(postData);
HttpResponseMessage response = await client.PostAsync("changepassword", content);
if (response.IsSuccessStatusCode)
{
var outputstring = await response.Content.ReadAsStringAsync();
responseBaseClass = await response.Content.ReadAsAsync<ResponseBaseClass>();
}
}
Please tell me what I am doing wrong.
Thanks in advance.
You are doing it right, the headers are there. Just click on the Headers(2) tab in Postman on the first screenshot and you will see them.
I am using WinRT and trying to parse an HTML page for results.
But to get the results, I must fill out a search page and hit the submit button.
Is it possible to do that in code in WinRT?
If you find your button using a WinJS query, you can programmatically fire the click event like this:
element.fireEvent("onclick");
I guess you haven't downloaded the page yet (or displayed in a WebView). To make a request have a closer look at HttpClient and HttpClientHandler. Depending on whether the page uses GET or POST you will need to create a HttpRequestMessage additionally. Search for the url of the form (most often the form's action attribute) to know your request uri.
Example:
var ClientHandler = new HttpClientHandler();
ClientHandler.UseCookies = true;
ClientHandler.AllowAutoRedirect = true;
ClientHandler.UseDefaultCredentials = true;
ClientHandler.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
var Client = new HttpClient(ClientHandler);
Client.DefaultRequestHeaders.Add("Accept", "text/html, application/xhtml+xml, */*");
Client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)");
var Response = await Client.GetAsync(RequestUri);
Your RequestUri could be something like http://www.example.com/search?query=search. But if the page you want uses POST to submit your query I think you need to create a HttpRequestMessage as below:
var RequestMessage = new HttpRequestMessage();
RequestMessage.Content = new StringContent(YourPostData, Encoding.UTF8, "application/x-www-form-urlencoded");
RequestMessage.Method = HttpMethod.Post;
RequestMessage.RequestUri = new Uri(OtherRequestUri);
Response = await Client.SendAsync(RequestMessage);
To parse the content of the response, HtmlAgilityPack is probably your best option.
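For the POST case, the body string does not have to be assembled by hand. As a sketch, FormUrlEncodedContent can build and encode it from key/value pairs; the field names and values below are made up for illustration and would need to match the names of the inputs in the real form.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

static class FormBodyDemo
{
    // Encodes the given form fields the way a browser would for a form POST.
    public static async Task<string> BuildBodyAsync(IEnumerable<KeyValuePair<string, string>> fields)
    {
        using var content = new FormUrlEncodedContent(fields);
        return await content.ReadAsStringAsync();
    }

    static async Task Main()
    {
        // Hypothetical field names; use the ones from the page's form.
        var fields = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("query", "search term"),
            new KeyValuePair<string, string>("page", "1"),
        };

        Console.WriteLine(await BuildBodyAsync(fields));
    }
}
```

The same FormUrlEncodedContent instance can be assigned to RequestMessage.Content directly, in which case the Content-Type header is set automatically.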