How to read a file from a URI using StreamReader? - c#

I have a file at a URI that I would like to read using StreamReader. Obviously, this causes a problem since File.OpenText does not support URI paths. The file is a txt file with a bunch of html in it. I have multiple web pages that use this same piece of html, so I have put it in a txt file, and am reading it into the page when the page loads (I can get it to work when I put the file on the file system, but need to put it in a document repository online so that a business user can get to it). I am trying to avoid using an iframe. Is there a way to use StreamReader with URI formats? If not, what other options are there using C# to read in the txt file of html? If this is not optimal, can someone suggest a better approach?

Is there a specific requirement to use StreamReader? Unless there is, you can use the WebClient class:
var webClient = new WebClient();
string readHtml = webClient.DownloadString("your_file_path_url");
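If a StreamReader is specifically wanted, one option is WebClient.OpenRead, which returns the response as a Stream that a StreamReader can wrap. The following is a minimal sketch, not a definitive implementation: the class name, the ReadAllText helper, and the temp-file stand-in are all assumptions made for illustration. A file:// URI is used so the sample runs without a network; an http:// URL is passed the same way.

```csharp
using System;
using System.IO;
using System.Net;

class UriReaderDemo
{
    // Hypothetical helper: opens the URI as a stream and reads it with StreamReader.
    static string ReadAllText(string uri)
    {
        using (var client = new WebClient())
        using (Stream stream = client.OpenRead(uri))
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Stand-in for the remote txt file: a temp file addressed through a file:// URI.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "<p>shared snippet</p>");
        try
        {
            Console.WriteLine(ReadAllText(new Uri(path).AbsoluteUri));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```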

You could try using the HttpWebRequest class, or WebClient. Here's a slightly more complicated web request example. Its advantage over WebClient is that it gives you more control over how the request is made:
HttpWebRequest httpRequest = (HttpWebRequest) WebRequest.Create(lcUrl); // lcUrl holds the URL of the file
httpRequest.Timeout = 10000; // 10 secs
httpRequest.UserAgent = "Code Sample Web Client";
string content;
// Dispose the response and the reader once the content has been read
using (HttpWebResponse webResponse = (HttpWebResponse) httpRequest.GetResponse())
using (StreamReader responseStream = new StreamReader(webResponse.GetResponseStream()))
{
content = responseStream.ReadToEnd();
}

If you are behind a proxy, don't forget to set your credentials:
WebRequest request=WebRequest.Create(url);
request.Timeout=30*60*1000;
request.UseDefaultCredentials=true;
request.Proxy.Credentials=request.Credentials;
WebResponse response=(WebResponse)request.GetResponse();
using (Stream s=response.GetResponseStream())
...

Related

Downloading xml file from URL

I am trying to store data from an XML file at a remote URL (the one used in the code below).
The XML file contains questions and answer options on English grammar. I want to save the file so that I can parse this information later.
From a browser (Chrome or IE), the XML file loads normally, yet saving it programmatically does not work: the data retrieved from that URL appears to be something else.
This is my code for getting data from the URL:
using (WebClient client = new WebClient())
{
client.Encoding = Encoding.UTF8;
// the data retrieved is a message meaning "Link does not exist"
string data = client.DownloadString("http://farm04.gox.vn/edu/IOE_Exam/IOE/l5/v1/g1/exam1.xml?v=");
}
The above approach returns nothing but a string saying that there is no such link (though I think the server itself sends this message rather than a 404 error).
My second attempt used WebRequest and WebResponse, with no luck. I have read that WebClient is a wrapper around these classes, so both give the same result:
WebRequest request = WebRequest.Create(Url);
WebResponse response = request.GetResponse();
using (Stream stream = response.GetResponseStream())
{
return new StreamReader(stream).ReadToEnd();
}
I suppose the server has a mechanism to prevent clients from downloading the file. However, Chrome does the job successfully: I load the page and press "Ctrl + S" (Save as ...) to save the XML file without any problems.
What are the differences between a browser and the WebClient API? Does the browser perform some additional action?
Any suggestion would be much appreciated.

WebRequest response content

I'm trying to read the response content of a given URL using HttpWebRequest:
var targetUri = new Uri("http://www.foo.com/Message/CheckMsg?msg=test");
var webRequest = (HttpWebRequest)WebRequest.Create(targetUri);
var webRequestResponse = webRequest.GetResponse();
The above code always returns the home page (http://www.foo.com) content. I was expecting the http://www.foo.com/Message page content. Is something wrong, or am I missing something?
Is CheckMsg an HTML or PHP file? When I access websites using WebRequest, I always have to include the extension; otherwise the website treats the path as a folder. I would recommend trying to add it:
var targetUri = new Uri("http://www.foo.com/Message/CheckMsg.html?msg=test");

Parse and extract a value from streaming XML file?

After much ado, I managed to create a RESTful service in ASP.NET MVC following Omar's brilliant Restful Asp.net article.
Just one little thing remains.
My ASP.NET MVC controller returns an XML file, which has this tag:
<FileCode>24233224</FileCode>
This is a console application I use to send a GET request, which gives me the whole XML file:
//Generate get request
string url = "http://localhost:1193/Home/index?File=343456789012286";
HttpWebRequest GETRequest = (HttpWebRequest)WebRequest.Create(url);
GETRequest.Method = "GET";
GETRequest.ContentType = "text/xml";
GETRequest.Accept = "text/xml";
Console.WriteLine("Sending GET Request");
HttpWebResponse GETResponse = (HttpWebResponse)GETRequest.GetResponse();
Stream GETResponseStream = GETResponse.GetResponseStream();
StreamReader sr = new StreamReader(GETResponseStream);
Console.WriteLine("Response from Server");
// This writes whole file on screen
Console.WriteLine(sr.ReadToEnd());
I could perhaps save this file and then use LINQ to parse it, but can't I just get the value out of my tag without saving it? I simply need the FileCode.
Thank you :)
You could employ the XPathReader (source download).
It comes with source and a test suite.
What it gives you is the ability to work with high-level query constructs (XPath) in streaming mode.
There is also a similar article on CodeProject: Fast screen scraping with XPath over a modified XmlTextReader and SgmlReader
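As an alternative to XPathReader, the built-in System.Xml.XmlReader can also pull a single element out of a stream without saving the file. This is only a sketch under stated assumptions: the sample XML layout, the class name, and the ReadFileCode helper are hypothetical, and a MemoryStream stands in for GETResponse.GetResponseStream() so that the example runs on its own.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class FileCodeDemo
{
    // Hypothetical helper: returns the text of the first <FileCode> element, or null if absent.
    static string ReadFileCode(Stream xmlStream)
    {
        using (XmlReader reader = XmlReader.Create(xmlStream))
        {
            // ReadToFollowing moves forward-only through the stream until the named element is found.
            return reader.ReadToFollowing("FileCode")
                ? reader.ReadElementContentAsString().Trim()
                : null;
        }
    }

    static void Main()
    {
        // Assumed document shape; only the FileCode tag comes from the question.
        string sample = "<File><Name>demo</Name><FileCode> 24233224</FileCode></File>";
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(sample)))
        {
            Console.WriteLine(ReadFileCode(stream));
        }
    }
}
```

In the console application from the question, the response stream would be passed to ReadFileCode in place of the MemoryStream.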

In C#, how can I get the HTML content of a website before displaying it?

I have a web browser project in C#. I am thinking of the following system: when the user types a URL and clicks the "go" button, my browser gets the content of that website (it shouldn't visit the page, I mean it shouldn't display anything). Then I want to look for a specific keyword, e.g. "violence"; if it exists, I can navigate the browser to a local page that shows a warning. In short: in C#, how can I get the content of a website before visiting it?
Sorry for my English,
Thanks in advance!
System.Net.WebClient:
string url = "http://www.google.com";
System.Net.WebClient wc = new System.Net.WebClient();
string html = wc.DownloadString(url);
You can use WebRequest and WebResponse to load a site:
Example:
string GetPageSource (string url)
{
HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(url);
webrequest.Method = "GET";
HttpWebResponse webResponse = (HttpWebResponse)webrequest.GetResponse();
string responseHtml;
using (StreamReader responseStream = new StreamReader(webResponse.GetResponseStream()))
{
responseHtml = responseStream.ReadToEnd().Trim();
}
return responseHtml;
}
After that you can check responseHtml for keywords, for example with Regex.
You can make an HTTP request to the site (via HttpClient) and parse the results looking for the various keywords. Then you can decide whether or not to visibly 'navigate' the user there.
There's an HTTP client sample on Dev Center that may help.
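Once the HTML is available as a string, the keyword check itself can be quite small. The following is a rough sketch rather than a complete filter: the class name, the ContainsBlockedKeyword helper, the keyword list, and the sample HTML are assumptions made for illustration, and the HTML is hard-coded so the example runs without a network.

```csharp
using System;

class KeywordFilterDemo
{
    // Hypothetical helper: true when any blocked keyword occurs in the HTML, ignoring case.
    static bool ContainsBlockedKeyword(string html, string[] keywords)
    {
        foreach (string keyword in keywords)
        {
            if (html.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
                return true;
        }
        return false;
    }

    static void Main()
    {
        string[] blocked = { "violence" };
        // Stand-in for the string returned by DownloadString or GetPageSource.
        string html = "<html><body><p>Scenes of Violence ahead.</p></body></html>";
        Console.WriteLine(ContainsBlockedKeyword(html, blocked));
    }
}
```

Note that a plain substring search also matches keywords inside tags, attributes, and scripts; parsing the HTML first would be needed to restrict the check to visible text.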

How to read the response from a web site?

I have a website URL that returns the corresponding city names when given a zip code as an input parameter. Now I want to know how to read the response from the site.
This is the link I am using: http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680
You'll have to use the HttpWebRequest object to connect to the site and scrape the information from the response.
Look for HTML tags or class names that wrap the content you are trying to find, then use either regexes or string functions to get the required data.
Good example here:
Try this (you'll need to include System.Text and System.Net):
WebClient client = new WebClient();
string url = "http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680";
Byte[] requestedHTML;
requestedHTML = client.DownloadData(url);
UTF8Encoding objUTF8 = new UTF8Encoding();
string html = objUTF8.GetString(requestedHTML);
Response.Write(html);
The simplest way is to use the lightweight WebClient class in the System.Net namespace. The following example code just downloads the entire response as a string:
using (WebClient wc = new WebClient())
{
string response = wc.DownloadString("http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680");
}
However, if you require more control over the request and response process, you can use the more heavyweight HttpWebRequest class. For instance, you may want to deal with different status codes or headers. There's an example of using HttpWebRequest in the article How to use HttpWebRequest and HttpWebResponse in .NET on CodeProject.
Use the WebClient class (http://msdn.microsoft.com/en-us/library/system.net.webclient%28v=VS.100%29.aspx) to request the page and get the response as a string.
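To illustrate what "more control" can look like, here is a minimal sketch that inspects the status code and a header, and handles error responses. It is an illustration only, not the code from the CodeProject article: the class name is an assumption, and the sample needs network access to the URL from the question.

```csharp
using System;
using System.IO;
using System.Net;

class StatusCodeDemo
{
    static void Main()
    {
        // URL taken from the question; any HTTP URL can be substituted.
        var request = (HttpWebRequest)WebRequest.Create("http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680");
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                Console.WriteLine("Status: " + (int)response.StatusCode);
                Console.WriteLine("Content-Type: " + response.ContentType);
                string body = reader.ReadToEnd();
                Console.WriteLine("Length: " + body.Length);
            }
        }
        catch (WebException ex)
        {
            // 4xx/5xx responses surface as WebException; the response, if any, is attached to it.
            var error = ex.Response as HttpWebResponse;
            Console.WriteLine(error != null
                ? "HTTP error: " + (int)error.StatusCode
                : "Request failed: " + ex.Status);
        }
    }
}
```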
WebClient wc = new WebClient();
String s = wc.DownloadString(DestinationUrl);
You can search the response for specific HTML using String.IndexOf, Substring, and similar methods, or regular expressions, or try something like the HTML Agility Pack (http://htmlagilitypack.codeplex.com/), which was created specifically to help parse HTML.
First of all, you would be better off finding a good web service for this purpose.
And this is an HttpWebRequest example:
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create("http://zipinfo.com/cgi-local/zipsrch.exe?zip=60680");
httpRequest.Credentials = CredentialCache.DefaultCredentials;
HttpWebResponse httpResponse = (HttpWebResponse)httpRequest.GetResponse();
Stream dataStream = httpResponse.GetResponseStream();
You need to use HttpWebRequest to receive the content, and some tool to parse the HTML and find what you need. One of the most popular libraries for working with HTML in C# is HtmlAgilityPack; you can see a simple example here: http://www.fairnet.com/post/2010/08/28/Html-screen-scraping-with-HtmlAgilityPack-Library.aspx
You can use a WebClient object, and an easy way to scrape the data is with XPath.
