How to read the response from a web site? - C#

I have a website URL that returns the corresponding city name when given a zip code as an input parameter. Now I want to know how to read the response from the site.
This is the link I am using: http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680

You'll have to use the HttpWebRequest object to connect to the site and scrape the information from the response.
Look for HTML tags or class names that wrap the content you are trying to find, then use either regexes or string functions to get the required data.

Try this (you'll need to include System.Text and System.Net):
WebClient client = new WebClient();
string url = "http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680";
Byte[] requestedHTML;
requestedHTML = client.DownloadData(url);
UTF8Encoding objUTF8 = new UTF8Encoding();
string html = objUTF8.GetString(requestedHTML);
Response.Write(html);

The simplest way is to use the lightweight WebClient class in the System.Net namespace. The following example code will just download the entire response as a string:
using (WebClient wc = new WebClient())
{
string response = wc.DownloadString("http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680");
}
However, if you require more control over the request and response process, you can use the more heavyweight HttpWebRequest class. For instance, you may want to deal with different status codes or headers. There's an example of using HttpWebRequest in the article How to use HttpWebRequest and HttpWebResponse in .NET on CodeProject.

Use the WebClient class (http://msdn.microsoft.com/en-us/library/system.net.webclient%28v=VS.100%29.aspx) to request the page and get the response as a string.
WebClient wc = new WebClient();
String s = wc.DownloadString(DestinationUrl);
You can search the response for specific HTML using String.IndexOf, Substring, etc., or regular expressions, or try something like the HTML Agility Pack (http://htmlagilitypack.codeplex.com/), which was created specifically to help parse HTML.
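As a rough illustration of the String.IndexOf / Substring approach, here is a minimal sketch. The helper name Between and the sample markup are made up for this example; the real page's tags have to be checked by viewing its source, and this kind of matching breaks as soon as the markup changes.

```csharp
using System;

public static class HtmlText
{
    // Returns the trimmed text between the first occurrence of startMarker
    // and the next occurrence of endMarker, or null if either is missing.
    public static string Between(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return null;
        start += startMarker.Length;

        int end = html.IndexOf(endMarker, start, StringComparison.OrdinalIgnoreCase);
        if (end < 0) return null;

        return html.Substring(start, end - start).Trim();
    }

    public static void Main()
    {
        // Hypothetical markup, not the actual response from the site.
        string html = "<table><tr><td class=\"city\"> Chicago </td></tr></table>";
        Console.WriteLine(Between(html, "<td class=\"city\">", "</td>"));
    }
}
```

In practice you would pass in the string returned by wc.DownloadString instead of the hard-coded sample.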

First of all, you would be better off finding a good web service for this purpose.
That said, this is an HttpWebRequest example:
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create("http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680");
httpRequest.Credentials = CredentialCache.DefaultCredentials;
HttpWebResponse httpResponse = (HttpWebResponse)httpRequest.GetResponse();
Stream dataStream = httpResponse.GetResponseStream();
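The snippet above stops once it has the response stream. One way to turn that stream into a string is to wrap it in a StreamReader, roughly as sketched below. ReadAll is a hypothetical helper name, and the MemoryStream only stands in for httpResponse.GetResponseStream() so the sketch runs without a network connection. UTF-8 is an assumption here; the real encoding should come from the response headers.

```csharp
using System;
using System.IO;
using System.Text;

public static class ResponseReader
{
    // Reads the whole stream as text and disposes the reader (and with it
    // the underlying stream) when done.
    public static string ReadAll(Stream dataStream, Encoding encoding)
    {
        using (var reader = new StreamReader(dataStream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Stand-in for httpResponse.GetResponseStream().
        var fake = new MemoryStream(Encoding.UTF8.GetBytes("<html>ok</html>"));
        Console.WriteLine(ReadAll(fake, Encoding.UTF8));
    }
}
```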

You need to use HttpWebRequest to receive the content, plus a tool for parsing the HTML and finding what you need. One of the most popular libraries for working with HTML in C# is HtmlAgilityPack. You can see a simple example here: http://www.fairnet.com/post/2010/08/28/Html-screen-scraping-with-HtmlAgilityPack-Library.aspx

You can use a WebClient object, and an easy way to scrape the data is with XPath.
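To give an idea of what an XPath query looks like, here is a small sketch using only System.Xml. It assumes the markup is well-formed XML, which real-world HTML often is not; for messy pages an HTML parser such as HtmlAgilityPack is the usual choice. The helper name SelectText, the sample markup, and the class='city' attribute are invented for the example.

```csharp
using System;
using System.Xml;

public static class XPathScrape
{
    // Loads well-formed markup and returns the trimmed inner text of the
    // first node matching the XPath expression, or null if nothing matches.
    public static string SelectText(string xhtml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml); // throws XmlException if the markup is not well-formed
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText.Trim();
    }

    public static void Main()
    {
        // Hypothetical, well-formed markup; not the real page.
        string xhtml = "<html><body><table><tr><td class='city'>Chicago</td></tr></table></body></html>";
        Console.WriteLine(SelectText(xhtml, "//td[@class='city']"));
    }
}
```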

Related

POST request implementation

I need to implement a POST request in a C# WinForms application in my project. Until now I have only implemented GET requests. I have checked that the API URI is working well (I checked it using Postman). I have never implemented POST requests before. I implement the GET requests using the following code:
WebClient n = new WebClient();
string uri = "API_URI";
string json = n.DownloadString(uri);
Now my requirement is to download a JSON string using the POST method, with an "apikey" and its value that I need to provide when calling the URI.
When I use the above code, it searches for "API_URI" in my local application directory.
Any direction, sample code, or tutorial would be much appreciated. Please help me with that.
Since you have the call tested in Postman, as a starting point use the "Code" link in Postman to generate your call using RestSharp so that you can test it and refine it further.
https://learning.getpostman.com/docs/postman/sending-api-requests/generate-code-snippets/
You can do something like this:
WebClient client = new WebClient();
string uri = "API_URI";
string json = "{\"some\":\"json data\"}"; // JSON keys must be quoted
client.Headers.Add(HttpRequestHeader.ContentType, "application/json");
client.Headers.Add("Authorization", "apikey");
string response = client.UploadString(uri,json);
This is the documentation: https://learn.microsoft.com/en-us/dotnet/api/system.net.webclient.uploadstring?view=netframework-4.8
You can use the POST method in this way:
WebClient client = new WebClient();
string uri = "API_URI";
var reqparm = new NameValueCollection(); // Used for passing request parameters
reqparm.Add("some", "json data");
string response = Encoding.UTF8.GetString(client.UploadValues(uri, "POST", reqparm));
I hope this will help you.

WebRequest response content

I'm trying to get the response content of a given URL using HttpWebRequest:
var targetUri = new Uri("http://www.foo.com/Message/CheckMsg?msg=test");
var webRequest = (HttpWebRequest)WebRequest.Create(targetUri);
var webRequestResponse = webRequest.GetResponse();
The above code always returns the home page (http://www.foo.com) content. I was expecting the http://www.foo.com/Message page content. Is something wrong, or am I missing something?
Is CheckMsg an HTML or PHP file? When I access websites using WebRequest, I always have to include the extension. Otherwise the website will think it's a folder. I would recommend trying to add that.
var targetUri = new Uri("http://www.foo.com/Message/CheckMsg.html?msg=test");

How to scrape data

I am trying to scrape data from this URL: http://icecat.biz/en/p/Coby/DP102/desc.htm
I want to scrape the specs table from that URL.
But when I checked the page source, the specs table was not there. I think that is because the table is loaded using Ajax.
How can I get that table? What needs to be done?
I used the following code:
string Strproducturl = "http://icecat.biz/en/p/Coby/DP102/desc.htm";
System.Net.ServicePointManager.Expect100Continue = false;
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(Strproducturl);
httpWebRequest.KeepAlive = true;
ASCIIEncoding encoding = new ASCIIEncoding();
HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse();
Stream responseStream = httpWebResponse.GetResponseStream();
StreamReader streamReader = new StreamReader(responseStream);
string response = streamReader.ReadToEnd();
As IanNorton mentioned, you'll need to make your request to the URL that Icecat use to load the specs using AJAX. For the example link you provided, the specs details URL you'll need to request will be:
http://icecat.biz/index.cgi?ajax=productPage;product_id=1091664;language=en;request=feature
You can then work your way through the HTML response to get the spec details you require.
You mentioned in your comment that the scraping process is automated. The specs URL is in a basic format; you just need the product ID. However, if you don't have the IDs, just a series of URLs like the example one in your original question, you'll need to get the product ID from the URL you have.
For example, the URL example you gave redirects to a different URL:
http://icecat.biz/p/coby/dp102/digital-photo-frames-0716829961025-dp-102-digital-photo-frame-1091664.html
This URL contains the product ID, right at the end.
You could make an HttpWebRequest to your original URL, stop before it follows the redirect, and catch the redirecting URL:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://icecat.biz/en/p/Coby/DP102/desc.htm");
request.AllowAutoRedirect = false;
request.KeepAlive = true;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
// Declared outside the if block so it is still in scope afterwards.
string redirectUrl = null;
if (response.StatusCode == HttpStatusCode.Redirect) {
redirectUrl = response.GetResponseHeader("Location");
}
Once you've got the redirectUrl variable, you can use Regex to get the ID then do another HttpWebRequest to the specs detail URL.
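The Regex step could look roughly like the sketch below. It assumes the product ID is always the run of digits immediately before ".html" at the end of the redirect URL, as in the example URL above. The class and method names are made up for illustration, and the specs URL format is simply copied from the one quoted earlier in this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IcecatUrls
{
    // Returns the digits that sit between the last '-' and the trailing
    // ".html", or null if the URL does not end in that pattern.
    public static string ExtractProductId(string redirectUrl)
    {
        Match m = Regex.Match(redirectUrl, @"-(\d+)\.html$");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Builds the AJAX specs URL in the format quoted above.
    public static string BuildSpecsUrl(string productId)
    {
        return "http://icecat.biz/index.cgi?ajax=productPage;product_id="
            + productId + ";language=en;request=feature";
    }

    public static void Main()
    {
        string redirectUrl = "http://icecat.biz/p/coby/dp102/digital-photo-frames-0716829961025-dp-102-digital-photo-frame-1091664.html";
        string id = ExtractProductId(redirectUrl);
        Console.WriteLine(id);                // 1091664
        Console.WriteLine(BuildSpecsUrl(id)); // URL to request next
    }
}
```

If ExtractProductId returns null, the redirect URL did not match the assumed pattern and the specs request should be skipped.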
I would suggest that you use a library like HtmlAgilityPack to select various elements from the html document.
I took a quick look at the link and noticed that the data is actually loaded using an additional Ajax request. You can use the following URL to get the Ajax data:
http://icecat.biz/index.cgi?ajax=productPage;product_id=1091664;language=en;request=feature
Then use HtmlAgilityPack to parse that data.
I know this is very old, but you could more easily just retrieve the XML from:
https://openIcecat-xml:freeaccess@data.icecat.biz/export/freexml.int/EN/1091664.xml
You will also get all images and descriptions as well :-)

How to read a file from a URI using StreamReader?

I have a file at a URI that I would like to read using StreamReader. Obviously, this causes a problem since File.OpenText does not support URI paths. The file is a txt file with a bunch of html in it. I have multiple web pages that use this same piece of html, so I have put it in a txt file, and am reading it into the page when the page loads (I can get it to work when I put the file on the file system, but need to put it in a document repository online so that a business user can get to it). I am trying to avoid using an iframe. Is there a way to use StreamReader with URI formats? If not, what other options are there using C# to read in the txt file of html? If this is not optimal, can someone suggest a better approach?
Is there a specific requirement to use StreamReader? Unless there is, you can use the WebClient class:
var webClient = new WebClient();
string readHtml = webClient.DownloadString("your_file_path_url");
You could try using the HttpWebRequest class, or WebClient. Here's the slightly more complicated web request example. Its advantage over WebClient is that it gives you more control over how the request is made:
HttpWebRequest httpRequest = (HttpWebRequest) WebRequest.Create(lcUrl);
httpRequest.Timeout = 10000; // 10 secs
httpRequest.UserAgent = "Code Sample Web Client";
HttpWebResponse webResponse = (HttpWebResponse) httpRequest.GetResponse();
StreamReader responseStream = new StreamReader(webResponse.GetResponseStream());
string content = responseStream.ReadToEnd();
If you are behind a proxy don't forget to set your credentials:
WebRequest request=WebRequest.Create(url);
request.Timeout=30*60*1000;
request.UseDefaultCredentials=true;
request.Proxy.Credentials=request.Credentials;
WebResponse response=(WebResponse)request.GetResponse();
using (Stream s=response.GetResponseStream())
...

Is there any C# equivalent to the Perl's LWP::UserAgent?

In a project I'm involved in, there is a requirement that the price of certain stocks will be queried from some web interface and be displayed in some way.
I know the "query" part of the requirement can be easily implemented using a Perl module like LWP::UserAgent. But for some reason, C# has been chosen as the language to implement the display part. I don't want to add any IPC (like a socket, or indirectly via a database) to this tiny project, so my question is: is there any C# equivalent to Perl's LWP::UserAgent?
You can use the System.Net.HttpWebRequest object.
It looks something like this:
// Setup the HTTP request.
HttpWebRequest httpWebRequest = (HttpWebRequest)HttpWebRequest.Create("http://www.google.com");
// This is optional, I'm just demoing this because of the comments received.
httpWebRequest.UserAgent = "My Web Crawler";
// Send the HTTP request and get the response.
HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse();
if (httpWebResponse.StatusCode == HttpStatusCode.OK)
{
// Get the HTML from the httpWebResponse...
Stream responseStream = httpWebResponse.GetResponseStream();
StreamReader reader = new StreamReader(responseStream);
string html = reader.ReadToEnd();
}
I'm not sure, but are you simply trying to make an HTTP Request? If so, you can use the HttpWebRequest class. Here's an example http://www.csharp-station.com/HowTo/HttpWebFetch.aspx
If you want to simply fetch data from the web, you could use the WebClient class. It seems to be quite good for quick requests.
