I wrote a little tool to check the availability of a product (yes, the PS5) by checking the product's shop page:
var client = new HttpClient();
HttpResponseMessage response = await client.GetAsync("https://www.mediamarkt.de/de/product/_sony-playstation®5-2661938.html");
HttpContent responseContent = response.Content;
using (var reader = new StreamReader(await responseContent.ReadAsStreamAsync()))
{
// await the read instead of blocking on .Result
var output = await reader.ReadToEndAsync();
Console.WriteLine(output);
}
For some reason the returned page asks me to solve a captcha, while opening the exact same URL in my browser gives me the correct page without a captcha.
What is the reason for this behaviour and how do I avoid it?
This is not a direct answer but a workaround.
This website is protected by Cloudflare, which shows a reCAPTCHA that can only be solved in a JavaScript environment. HttpClient does not provide one. While there are solutions for this in other languages, I could not find any for C#. Here is an example using Selenium, a web testing framework that drives a real browser (Chrome in my case).
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;
using System;
class Program
{
public static void Main(string[] args)
{
using (var driver = new ChromeDriver())
{
driver.Url = "https://www.mediamarkt.de/de/product/_sony-playstation®5-2661938.html";
// selenium does not behave well when element you are looking for is not visible,
// this method helps us to close cookie banner that blocks the view
CloseCookieBannerIfAppears(driver);
var buyButton = By.XPath("//div[contains(@class, \"Badge\")]").FindElement(driver);
Console.WriteLine(buyButton.Text); // Ausverkauft
}
}
private static void CloseCookieBannerIfAppears(IWebDriver driver)
{
var buttonInAcceptCookieBannerSelector = By.XPath("//button[@id=\"privacy-layer-accept-all-button\"]");
var waitForCookieBanner = new WebDriverWait(driver, TimeSpan.FromSeconds(5));
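try
{
// Until throws WebDriverTimeoutException if the banner never shows up
waitForCookieBanner.Until(x => x.FindElements(buttonInAcceptCookieBannerSelector).Count > 0);
driver.FindElement(buttonInAcceptCookieBannerSelector)
.Click();
}
catch (WebDriverTimeoutException)
{
// no cookie banner appeared within 5 seconds, nothing to close
}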
}
}
It also looks like they have an unprotected API, so you should be able to get this data directly as well. Note that the same ID appears both in your link and in the API call: _sony-playstation®5-2661938.html vs productId=2661938
using Newtonsoft.Json.Linq;
using System;
using System.Net.Http;
using System.Threading.Tasks;
class Program
{
public static async Task Main(string[] args)
{
var httpClient = new HttpClient();
var response = await httpClient.GetAsync("https://delivery-prod-teasermanagement.cloud.mmst.eu/api/teaser/find?productId=2661938");
var content = await response.Content.ReadAsStringAsync();
var status = JArray.Parse(content)[0]["promotionData"]["badge"];
Console.WriteLine(status); // Ausverkauft
}
}
Maybe there are some other edge cases, but you should be able to get the point.
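If you want to avoid hard-coding the ID, something like the following might work. This is only a sketch: it assumes the product ID is always the run of digits right before ".html" at the end of the product URL, which I have only seen in the one link above. The class and method names (ProductUrl, ExtractProductId, BuildTeaserApiUrl) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProductUrl
{
    // Assumption: the product ID is the digits between the last '-' and ".html".
    private static readonly Regex IdPattern = new Regex(@"-(\d+)\.html$", RegexOptions.Compiled);

    // Returns the product ID, or null if the URL does not match the assumed pattern.
    public static string ExtractProductId(string productUrl)
    {
        Match match = IdPattern.Match(productUrl);
        return match.Success ? match.Groups[1].Value : null;
    }

    // Builds the API URL from the answer above for a given product ID.
    public static string BuildTeaserApiUrl(string productId)
    {
        return "https://delivery-prod-teasermanagement.cloud.mmst.eu/api/teaser/find?productId="
            + Uri.EscapeDataString(productId);
    }
}
```

You would then pass ProductUrl.BuildTeaserApiUrl(ProductUrl.ExtractProductId(link)) to GetAsync, after checking that the ID is not null.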
Related
I want to get all hyperlinks from Wikipedia page that lead to another Wikipedia page in C#.
For example:
In the screenshot above you can see that I only want the links that lead to another Wikipedia article (red rectangles), even though there are other links on the page. I have written a function that scrapes every link on the page and returns a HashSet of them. Its body is as follows:
private async Task<HashSet<string>> GetPages(CrawlerPage page)
{
var client = new HttpClient();
client.DefaultRequestHeaders.Add("User-Agent", "C# console program");
var htmlContent = await client.GetStringAsync(page.mainLink);
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(htmlContent);
var programmerLinks = htmlDoc.DocumentNode
.Descendants("li")
.Where(node => !node.GetAttributeValue("class", "").Contains("tocsection")).ToList();
HashSet<string> wikiLinks = new();
foreach (var link in programmerLinks)
{
if (link.FirstChild.Attributes.Count > 0)
wikiLinks.Add("https://en.wikipedia.org/" + link.FirstChild.Attributes[0].Value);
}
return wikiLinks;
}
The function works fine, but it scrapes everything. Have a look at the screenshot below:
You can see that the links in the red rectangles are the ones I want; the rest is junk (links I don't need).
I figured out that all of these links are inside <p> tags in the HTML, and the links themselves are <a href> elements, but I still cannot figure out how to get exactly these links.
Can you tell me how I can get these links?
Thanks!
I tried to come up with something like the code below, which should get only the links you need, that is, those pointing to /wiki/.
I took the liberty of using HtmlAgilityPack since it's a well-documented library.
using System;
using System.Collections.Generic;
using HtmlAgilityPack;
using System.Net.Http;
using System.Linq;
public class Program
{
public static void Main()
{
string url = "https://en.wikipedia.org/wiki/Axis_powers";
string result = "";
using (HttpClient client = new HttpClient())
{
using (HttpResponseMessage response = client.GetAsync(url).Result)
{
using (HttpContent content = response.Content)
{
result = content.ReadAsStringAsync().Result;
}
}
}
var links = ParseLinks(result).Where(x => x.Contains("/wiki/") && !x.Contains("https://")).ToList();
foreach (var link in links){
Console.WriteLine(link.ToString());
}
List<string> ParseLinks(string html)
{
var doc = new HtmlDocument();
doc.LoadHtml(html);
var nodes = doc.DocumentNode.SelectNodes("//a[@href]");
// keep only the href value of each anchor, not every attribute value
return nodes == null ? new List<string>() : nodes
.Select(r => r.GetAttributeValue("href", ""))
.ToList();
}
}
}
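The /wiki/ filter can be tightened a bit. Below is a rough sketch of a stricter filter, under two assumptions of mine: article links are relative hrefs that start with /wiki/, and non-article pages (File:, Category:, Help: and so on) have a colon in the title. The second assumption is only a heuristic and will also drop the occasional real article whose title contains a colon. WikiLinkFilter and ArticleLinks are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WikiLinkFilter
{
    private const string Prefix = "/wiki/";

    // Takes raw href values and returns absolute URLs of (probable) article links.
    public static List<string> ArticleLinks(IEnumerable<string> hrefs)
    {
        return hrefs
            // relative links into the wiki only
            .Where(h => h.StartsWith(Prefix, StringComparison.Ordinal))
            // heuristic: a colon in the title usually means a namespace page
            .Where(h => !h.Substring(Prefix.Length).Contains(":"))
            // drop "#section" fragments so the same article is not listed twice
            .Select(h =>
            {
                int hash = h.IndexOf('#');
                return hash >= 0 ? h.Substring(0, hash) : h;
            })
            .Distinct()
            .Select(h => "https://en.wikipedia.org" + h)
            .ToList();
    }
}
```

You could feed it the list returned by ParseLinks(result) in place of the Where(...) call in Main.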
I have a site on localhost that I created to learn how to send HTTP requests. I send a POST request to it, trying to simulate submitting data via a form. I expect the data I send to be added to the database, as happens when I submit the form, but this does not happen. I assume I am sending the POST request incorrectly and something is missing from it.
The site: localhost site
Post request: Post request headers and data
My C# code:
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
namespace AirParsingScript
{
class Program
{
private static readonly HttpClient Client = new HttpClient();
static async Task Main()
{
var values = new Dictionary<string, string>
{
{ "Person.DocId_pre", "421" },
{ "Person.DocId", "ADGAGSA" },
{ "Person.Email", "4124421" },
{ "Person.TeleNumber", "4444" }
};
var content = new FormUrlEncodedContent(values);
var response = await Client.PostAsync("https://localhost:44391", content);
var responseString = await Client.GetStringAsync("https://localhost:44391");
// var responseString = await response.Content.ReadAsStringAsync();
Console.WriteLine(responseString);
}
}
}
I have created a web API in Visual Studio 2015 using a MySQL database. The API is working perfectly.
So I decided to make a console client application in which I can consume my web-service (web API). The client code is based on HttpClient, and in the API I have used HttpResponse. Now when I run my console application code, I get nothing. Below is my code:
Class
class meters_info_dev
{
public int id { get; set; }
public string meter_msn { get; set; }
public string meter_kwh { get; set; }
}
This class is same as in my web API model class:
Model in web API
namespace WebServiceMySQL.Models
{
using System;
using System.Collections.Generic;
public partial class meters_info_dev
{
public int id { get; set; }
public string meter_msn { get; set; }
public string meter_kwh { get; set; }
}
}
Console application code
static HttpClient client = new HttpClient();
static void ShowAllProducts(meters_info_dev mi)
{
Console.WriteLine($"Meter Serial Number:{mi.meter_msn}\t Meter_kwh: {mi.meter_kwh}", "\n");
}
static async Task<List<meters_info_dev>> GetAllRecordsAsync(string path)
{
List<meters_info_dev> mID = new List<meters_info_dev>();
HttpResponseMessage response = await client.GetAsync(path);
if (response.IsSuccessStatusCode)
{
mID = await response.Content.ReadAsAsync<List<meters_info_dev>>();
}
else
{
Console.WriteLine("No Record Found");
}
return mID;
}
static void Main()
{
RunAsync().Wait();
}
static async Task RunAsync()
{
client.BaseAddress = new Uri("http://localhost:2813/");
client.DefaultRequestHeaders.Accept.Clear();
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
var m = await GetAllRecordsAsync("api/metersinfo/");
foreach(var b in m)
{
ShowAllProducts(b);
}
}
In my API I have 3 GET methods under a single controller, so I have created different routes for them. Also the URL for them is different.
http://localhost:2813/api/metersinfo/ will return all records
While debugging the code, I found that List<meters_info_dev> mID = new List<meters_info_dev>(); is empty:
While the response is 302 Found, the URL is also correct:
Update 1
After a suggestion I have done the following:
using (var client = new HttpClient())
{
List<meters_info_dev> mID = new List<meters_info_dev>();
HttpResponseMessage response = await client.GetAsync(path);
if (response.IsSuccessStatusCode)
{
mID = await response.Content.ReadAsAsync<List<meters_info_dev>>();
}
else
{
Console.WriteLine("No Record Found");
}
return mID;
}
When I run the application, I get the exception "An invalid request URI was provided. The request URI must either be an absolute URI or BaseAddress must be set."
Update 2
I have added a new piece of code:
using (var cl = new HttpClient())
{
var res = await cl.GetAsync("http://localhost:2813/api/metersinfo");
var resp = await res.Content.ReadAsStringAsync();
}
And in the response I am getting all the records:
I don't know why it's not working with the other logic and what the problem is. I have also read the questions Httpclient consume web api via console app C# and Consuming Api in Console Application.
Any help would be highly appreciated.
The code needs quite a bit of work.
The line you highlighted will always be empty because that's where you initialise the variable. What you want is to run through the code until you get the result back from the call.
First, make sure your API actually works: call the GET method you want in the browser and check that you see results.
using (var client = new HttpClient())
{
var result = await client.GetAsync("bla");
return await result.Content.ReadAsStringAsync();
}
That's an example of course, so replace that with your particular data and methods.
Now, when you check the results, the fact that response.IsSuccessStatusCode is false doesn't mean there are no records. It means that the call failed completely. A successful result with an empty list is not the same thing as a complete failure.
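To make the difference concrete, here is a small sketch that reports the two cases separately instead of printing "No Record Found" for both. ResponseCheck and Describe are names I made up, and the messages are just placeholders.

```csharp
using System.Net;
using System.Net.Http;

public static class ResponseCheck
{
    // Distinguishes "the call failed" from "the call worked but returned nothing".
    public static string Describe(HttpResponseMessage response, int recordCount)
    {
        if (!response.IsSuccessStatusCode)
        {
            // Any status outside 200-299 ends up here, including a 302 redirect.
            return $"Request failed with HTTP status {(int)response.StatusCode}";
        }

        return recordCount == 0
            ? "Request succeeded, but the list is empty"
            : $"Request succeeded with {recordCount} record(s)";
    }
}
```

With the 302 from the question this would report a failed request, which tells you to look at the URL or the redirect rather than at the database.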
If you want to see what you get back you can alter your code a little bit:
if(response.IsSuccessStatusCode)
{
var responseData = await response.Content.ReadAsStringAsync();
//more stuff
}
Put a breakpoint on this line and see what you actually get back; only then worry about converting the result to your list of objects. Just make sure you get back the same thing you get when you test the call in the browser.
<------------------------------->
More details after edit.
Why don't you simplify your code a little bit?
For example, just set the URL of the request in one go:
using (var client = new HttpClient())
{
var result = await client.GetAsync("http://localhost:2813/api/metersinfo");
var response = await result.Content.ReadAsStringAsync();
//set debug point here and check to see if you get the correct data in the response object
}
Your first order of the day is to see if you can hit the url and get the data.
You can worry about the base address once you get a correct response. Start simple and work your way up from there, once you have a working sample.
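When you do get to the base address, this sketch shows how I understand the relative path to be combined with it. The "invalid request URI" exception from Update 1 is most likely because the new HttpClient inside the using block has no BaseAddress, so the relative path "api/metersinfo/" has nothing to be resolved against. UriDemo, Resolve and IsAbsolute are hypothetical helper names; the code only uses System.Uri and makes no HTTP call.

```csharp
using System;

public static class UriDemo
{
    // Combines a base address and a relative path the way a base URI is normally resolved.
    public static Uri Resolve(string baseAddress, string relativePath)
    {
        return new Uri(new Uri(baseAddress), relativePath);
    }

    // True only if the string is a complete URI on its own (scheme, host, ...).
    public static bool IsAbsolute(string candidate)
    {
        return Uri.TryCreate(candidate, UriKind.Absolute, out _);
    }
}
```

UriDemo.Resolve("http://localhost:2813/", "api/metersinfo/") gives http://localhost:2813/api/metersinfo/, while UriDemo.IsAbsolute("api/metersinfo/") is false, which is why that path cannot be used without a base address.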
<----------------- new edit ---------------->
Ok, now that you are getting a response back, you can deserialise the string into the list of objects using something like Newtonsoft.Json. This is a NuGet package; you might already have it installed, and if not, just add it.
Add a using statement at the top of the file.
using Newtonsoft.Json;
Then your code becomes something like:
using (var client = new HttpClient())
{
var result = await client.GetAsync("bla");
var response = await result.Content.ReadAsStringAsync();
var mID = JsonConvert.DeserializeObject<List<meters_info_dev>>(response);
}
At this point you should have your list of objects and you can do whatever else you need.
I'm trying out the basic example for the Microsoft Linguistic Analysis API. I have subscribed and added my Key 1 in Ocp-Apim-Subscription-Key and Key 2 as the subscription key here: client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "{subscription key}");.
Then I added Newtonsoft.Json to the application's References via Manage NuGet Packages, even though it is not listed in the using directives of this particular example (using Newtonsoft.Json; using Newtonsoft.Json.Serialization;). I'm not sure about this, as I'm new to this tool.
I'm trying this example, Linguistics API for C#, to get natural language processing results for text analysis, mainly Verb and Noun values, as in these example results. I'm not sure if I'm heading in the right direction with this example, or whether I have missed something I need to install. I also found this Analyze Method, but I'm not sure whether or how I have to use it for this particular goal.
But something seems to be wrong with var queryString = HttpUtility.ParseQueryString(string.Empty);: HttpUtility does not exist.
using System;
using System.Net.Http.Headers;
using System.Text;
using System.Net.Http;
using System.Web;
namespace CSHttpClientSample
{
static class Program
{
static void Main()
{
MakeRequest();
Console.WriteLine("Hit ENTER to exit...");
Console.ReadLine();
}
static async void MakeRequest()
{
var client = new HttpClient();
var queryString = HttpUtility.ParseQueryString(string.Empty);
// Request headers
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "{subscription key}");
var uri = "https://westus.api.cognitive.microsoft.com/linguistics/v1.0/analyze?" + queryString;
HttpResponseMessage response;
// Request body
byte[] byteData = Encoding.UTF8.GetBytes("{body}");
using (var content = new ByteArrayContent(byteData))
{
content.Headers.ContentType = new MediaTypeHeaderValue("< your content type, i.e. application/json >");
response = await client.PostAsync(uri, content);
}
}
}
}
You can create a new writeable instance of HttpValueCollection by calling System.Web.HttpUtility.ParseQueryString(string.Empty), and then use it as any NameValueCollection, like this:
NameValueCollection queryString = System.Web.HttpUtility.ParseQueryString(string.Empty);
Try adding a reference to System.Web, and possibly to System.Runtime.Serialization.
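Once the reference resolves, using the collection could look roughly like this. It is a sketch only: "language" and "mode" are parameter names I invented for illustration, not parameters of the Linguistics API, and QueryStringDemo is a hypothetical class.

```csharp
using System.Collections.Specialized;
using System.Web;

public static class QueryStringDemo
{
    // Builds a query string from key/value pairs; ToString() joins them with '&'
    // and URL-encodes keys and values.
    public static string Build()
    {
        NameValueCollection query = HttpUtility.ParseQueryString(string.Empty);
        query["language"] = "en";  // hypothetical parameter
        query["mode"] = "full";    // hypothetical parameter
        return query.ToString();   // "language=en&mode=full"
    }
}
```

The result can be appended after the "?" exactly as the sample does with queryString.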
I come from an iOS (Swift) background. In one of my Swift apps, I have this class that calls an API. I'm trying to port it to C# (Windows Form application) but I'm hitting several snags. First here's the Swift code. Nothing fancy. One method does a POST request to login to the API and the other function executes a GET method to retrieve the JSON response for a user profile. Both these methods are asynchronous.
import Foundation
class API {
private let session = NSURLSession.sharedSession()
private let baseURL = "https://www.example.com/api/"
func login(userID userID: String, password: String, completion: (error: NSError?) -> ()) {
let url = NSURL(string: baseURL + "login")!
let params = ["username": userID, "password": password]
let request = NSMutableURLRequest(URL: url)
request.HTTPMethod = "POST"
request.encodeParameters(params) // encodeParameters is an extension method
session.dataTaskWithRequest(request, completionHandler: { data, response, error in
if let httpResponse = response as? NSHTTPURLResponse {
if httpResponse.statusCode != 200 {
completion(error: error)
} else {
completion(error: nil)
}
}
}).resume()
}
func fetchUser(completion: (user: User?, error: NSError?) -> ()) {
let url = NSURL(string: baseURL + "profile")!
let request = NSURLRequest(URL: url)
session.dataTaskWithRequest(request, completionHandler: { data, response, error in
if let error = error {
completion(user: nil, error: error)
} else {
// Parsing JSON
var jsonDict = [String: AnyObject]()
do {
jsonDict = try NSJSONSerialization.JSONObjectWithData(data, options: []) as! [String: AnyObject]
} catch {
print("Error occurred parsing data: \(error)")
completion(user: nil, error: error)
}
let user = User()
user.name = jsonDict["name"] as! String
user.age = jsonDict["age"] as! Int
completion(user: user, error: nil)
}
}).resume()
}
}
Here's my attempt to convert this to C#.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Drawing;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Net.Http;
using System.Runtime.Serialization.Json;
using System.Text;
using System.Xml.Linq;
using System.Xml.XPath;
namespace MyTrayApp
{
public partial class Form1 : Form
{
private string baseURL = "https://www.example.com/api/";
public Form1()
{
InitializeComponent();
}
private async void Form1_Load(object sender, EventArgs e)
{
await login("myusername", "mypassword");
await fetchUser();
}
async Task login(string userID, string password)
{
using (var client = new HttpClient())
{
client.BaseAddress = new Uri(baseURL);
var parameters = new Dictionary<string, string>
{
{ "username", userID },
{ "password", password }
};
var encodedParameters = new FormUrlEncodedContent(parameters);
var response = await client.PostAsync("login", encodedParameters);
string responseString = await response.Content.ReadAsStringAsync();
//Console.WriteLine(responseString);
}
}
async Task fetchUser()
{
using (var client = new HttpClient())
{
client.BaseAddress = new Uri(baseURL);
var response = await client.GetAsync("profile");
response.EnsureSuccessStatusCode();
var responseString = await response.Content.ReadAsStringAsync();
var jsonReader = JsonReaderWriterFactory.CreateJsonReader(Encoding.UTF8.GetBytes(responseString.ToCharArray()), new System.Xml.XmlDictionaryReaderQuotas());
var root = XElement.Load(jsonReader);
Console.WriteLine(root.XPathSelectElement("//name").Value);
//Console.WriteLine(responseString);
}
}
}
}
These are the problems I'm having.
In my Swift methods, they have completion handlers. How can I do the same in C#?
In Swift, you get an NSData object and you can pass it to NSJSONSerialization to create a JSON object. In my current implementation, I get an XML exception at XElement.Load(jsonReader);. I'm not even sure this is the correct way to do it. I found tons of different solutions here on SO, but some are for Metro apps and some are for the web, and it's all too overwhelming. Also, most solutions use third-party libraries like JSON.NET. I'm trying to achieve this without third-party libraries.
In my Swift methods, they have completion handlers. How can I do the same in C#?
The point of wiring up a completion handler is so that you don't tie up a thread while waiting for the HTTP call to complete. The beauty of async/await is that you don't have to do this in C#. The await keyword instructs the compiler to literally rewrite the rest of the method as a callback. The current thread is freed as soon as await is encountered, preventing your UI from freezing up. You have written your async code correctly; it will behave asynchronously even though it looks synchronous.
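To illustrate, here is the same operation written both ways. This is a sketch with made-up names (CompletionDemo, FetchNameAsync and so on), and Task.Delay stands in for the HTTP call. The callback version mirrors the Swift completion(user, error) shape; the await version puts the "completion" code directly after the await and handles the error with an ordinary try/catch.

```csharp
using System;
using System.Threading.Tasks;

public static class CompletionDemo
{
    // Stand-in for an asynchronous HTTP call.
    public static async Task<string> FetchNameAsync()
    {
        await Task.Delay(10);
        return "Alice";
    }

    // Callback style: the completion handler receives either a result or an error.
    public static Task FetchNameWithCallback(Action<string, Exception> completion)
    {
        return FetchNameAsync().ContinueWith(t =>
        {
            if (t.IsFaulted) completion(null, t.Exception.GetBaseException());
            else completion(t.Result, null);
        });
    }

    // await style: everything after the await runs once the call has completed.
    public static async Task<string> GreetAsync()
    {
        try
        {
            string name = await FetchNameAsync();
            return "Hello, " + name;
        }
        catch (Exception ex)
        {
            return "Failed: " + ex.Message;
        }
    }
}
```

In your form you would just write var greeting = await CompletionDemo.GreetAsync(); inside Form1_Load, as you already do with login and fetchUser.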
Your second question is a bit broad, but I will make 2 suggestions:
Don't use XElement when dealing with JSON data. It is part of one of Microsoft's XML parsing libraries and has nothing to do with JSON.
I'm not sure why achieving this without a 3rd-party library is important. I know people have their reasons, but Json.NET in particular has become so popular and ubiquitous that Microsoft itself has baked it into their ASP.NET MVC and Web API frameworks. That said, if you must avoid it, here is how you would deserialize JSON using only Microsoft libraries.