Downloading xml file from URL - c#

I am trying to store data from an XML file located at this link: Link to xml file
The XML file contains questions and answer options relating to English grammar. The main reason for saving the file is so that I can parse this information later.
In a browser (Chrome or IE) the XML file loads normally, yet saving it programmatically does not work: the data retrieved from that URL turns out to be something else.
This is my code for getting data from the URL:
using (WebClient client = new WebClient())
{
client.Encoding = Encoding.UTF8;
// the data retrieved is something meaning : "Link does not exist"
string data = client.DownloadString("http://farm04.gox.vn/edu/IOE_Exam/IOE/l5/v1/g1/exam1.xml?v=");
}
The above approach returns nothing but a string indicating that there isn't such a link (though I still think the server itself sends this message rather than a 404 error).
My second attempt used WebRequest and WebResponse, but with no luck. I have read that WebClient is a wrapper around the latter, so both give the same result:
WebRequest request = WebRequest.Create(Url);
WebResponse response = request.GetResponse();
using (Stream stream = response.GetResponseStream())
{
return new StreamReader(stream).ReadToEnd();
}
I suppose the server has a mechanism to prevent clients from downloading it. However, Chrome does the job successfully: loading the page and then pressing "Ctrl + S" (Save as ...) saves the XML file without any problems.
What are the differences between a browser and the WebClient API? Is there an additional action that the browser performs?
Any suggestion would be much appreciated.

Related

Parse and extract a value from streaming XML file?

After much ado, I managed to create a restful service in asp.net MVC following Omar's brilliant Restful Asp.net article
Just one little thing remains.
My ASP.NET MVC controller returns an XML file, which has this tag:
<FileCode>24233224</FileCode>
This is the console application I use to send a GET request, which gives me the whole XML file:
//Generate get request
string url = "http://localhost:1193/Home/index?File=343456789012286";
HttpWebRequest GETRequest = (HttpWebRequest)WebRequest.Create(url);
GETRequest.Method = "GET";
GETRequest.ContentType = "text/xml";
GETRequest.Accept = "text/xml";
Console.WriteLine("Sending GET Request");
HttpWebResponse GETResponse = (HttpWebResponse)GETRequest.GetResponse();
Stream GETResponseStream = GETResponse.GetResponseStream();
StreamReader sr = new StreamReader(GETResponseStream);
Console.WriteLine("Response from Server");
// This writes whole file on screen
Console.WriteLine(sr.ReadToEnd());
I could perhaps save this file and then use LINQ to parse it, but can't I just get the value out of my tag without saving it? I simply need the FileCode.
Thank you :)
You could employ the XPathReader (source download).
It comes with source and a test suite.
It gives you the ability to work with high-level query constructs (XPath) in streaming mode.
There is also a similar article on CodeProject: Fast screen scraping with XPath over a modified XmlTextReader and SgmlReader
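If pulling in XPathReader is more than you need, the built-in XmlReader can also read a single element in streaming mode. The following is only a minimal sketch: the class and method names (FileCodeReader, ReadFileCode) are made up for illustration, and it assumes the FileCode element carries no namespace prefix.

```csharp
using System;
using System.IO;
using System.Xml;

public static class FileCodeReader
{
    // Reads forward through the XML without loading the whole document
    // and returns the text of the first <FileCode> element,
    // or null if no such element is found.
    public static string ReadFileCode(Stream xmlStream)
    {
        using (XmlReader reader = XmlReader.Create(xmlStream))
        {
            // ReadToFollowing advances to the next element with this name.
            if (reader.ReadToFollowing("FileCode"))
            {
                // Trim because the sample tag has whitespace around the value.
                return reader.ReadElementContentAsString().Trim();
            }
        }
        return null;
    }
}
```

With the console application from the question, this would be called as FileCodeReader.ReadFileCode(GETResponse.GetResponseStream()) instead of wrapping the stream in a StreamReader.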

Force the Download File dialog Not Working - ASP.NET C#

I am trying to implement a forced download dialog in my ASP.NET C# application. The files I'd like to force-download are media files that are not available locally on the web server but are served from a different location.
I am getting the error: 'http://remote-site-to-webserver/somefile.asf' is not a valid virtual path.
I have searched the web for solutions, but all the examples point to a relative path on the server using Server.MapPath.
In the example below I created a webhandler.ashx page and send the download request to this page.
<%@ WebHandler Language="C#" Class="DownloadHandler" %>
using System;
using System.Web;
public class DownloadHandler : IHttpHandler {
public void ProcessRequest(HttpContext context) {
var fileName = "http://remote-site-to-webserver/somefile.asf";
var r = context.Response;
r.AddHeader("Content-Disposition", "attachment; filename=" + fileName);
r.WriteFile(context.Server.MapPath(fileName));
}
public bool IsReusable { get { return false; } }
}
The Content-Disposition header looks wrong to me. I think it should be:
r.AddHeader("Content-Disposition",
"attachment; filename=DefaultNewFilename.ext");
The filename is the default name given to the downloaded file; in other words, it is what is shown in the browser's save dialog.
You may also want:
r.AddHeader("Content-Type", "application/octet-stream");
I'm not sure that's required, but I've always included it for video files and so on.
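As an illustration of the Content-Disposition point, here is a small sketch that derives the default file name from the remote URL rather than passing the whole URL as the filename. The helper name (DownloadHeaders.BuildContentDisposition) is hypothetical, and it assumes the last path segment of the URL is a sensible file name.

```csharp
using System;
using System.IO;

public static class DownloadHeaders
{
    // Builds a Content-Disposition value whose filename is only the last
    // path segment of the URL, e.g. "somefile.asf", not the full address.
    public static string BuildContentDisposition(string fileUrl)
    {
        string name = Path.GetFileName(new Uri(fileUrl).LocalPath);
        return "attachment; filename=\"" + name + "\"";
    }
}
```

In the handler this would be used as r.AddHeader("Content-Disposition", DownloadHeaders.BuildContentDisposition(fileName)).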
In order for the download to start from a different server, you need to send a redirect answer to the client (Response.Redirect(mediaURL)).
As a consequence, you cannot force the download dialog from your web server because the browser will send a separate request to the other server. This must be solved on the server where the media is served from.
The only alternative is that you act as an intermediate, i.e. you download the media file to your server and send it as the response to the client. This shouldn't be too difficult if it's a small file that easily fits into memory. However, if it's a large file it might involve some tricky coding so you can receive and send it piecewise.
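The piecewise receive-and-send step can be sketched as a plain buffered copy between two streams. This is an assumption-laden illustration, not the only way to do it: the names (StreamRelay, CopyInChunks) and the buffer size are invented for the example, and Stream.CopyTo does much the same thing on .NET 4 and later.

```csharp
using System.IO;

public static class StreamRelay
{
    // Copies source to destination one buffer at a time, so the whole
    // file never has to fit in memory. Returns the number of bytes copied.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize = 81920)
    {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // Read returns 0 once the end of the source stream is reached.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

In a handler, the source would be the remote response stream and the destination context.Response.OutputStream; you would probably also want response buffering turned off so the chunks are actually sent as they arrive.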
Server.MapPath()
is not used for remote HTTP files. It is just a tool for converting virtual paths to physical paths, i.e. you can retrieve "C:\inetpub\wwwroot\MyWebSite\Files\blah.txt" by passing "~/Files/blah.txt" to the Server.MapPath method.
If you are interested in downloading a file from another web server, you will have to use the HttpWebRequest class.
Here is some sample code:
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create("http://remote-site-to-webserver/somefile.asf");
httpRequest.Credentials = CredentialCache.DefaultCredentials; //or a NetworkCredential if needed
HttpWebResponse httpResponse = (HttpWebResponse)httpRequest.GetResponse();
Stream dataStream = httpResponse.GetResponseStream();
Now you can write the dataStream to your response.

Downloading a file from server and save it in client

I am currently developing an ASP.NET application in which I generate a Word document on the server, and I want to save it on the machine of the client who uses that feature, without any user interaction. How can I download it and save it on the client machine using JavaScript?
You can't save it on the client's machine without the client's knowledge.
You can provide a link to the Word document; the user needs to click the link and save the file on their machine.
<a href="serverLink.doc" >Click to Save Word document</a>
Note: you can't manipulate anything on the client PC using JavaScript or any other scripting language.
You can do either of this :
CASE 1 :
private static string GetWebTest1(string url)
{
System.Net.WebClient Client = new WebClient();
return Client.DownloadString(url);
}
CASE 2 :
private static string GetWebTest2(string url)
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
WebResponse response = request.GetResponse();
Stream stream = response.GetResponseStream();
StreamReader reader = new StreamReader(stream);
return reader.ReadToEnd();
}
Use System.Net.WebClient.DownloadFile, and if you are doing bulk downloads, use the WebClient.DownloadProgressChanged event. Sometimes during a bulk download the user gets the impression that the system is stuck or has failed somewhere and starts hitting refresh. Avoid that!
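A rough sketch of the progress idea follows. Note that DownloadProgressChanged is raised by the asynchronous download methods, so the sketch uses DownloadFileTaskAsync rather than the blocking DownloadFile. The class name, the Describe helper, and the report callback are all hypothetical and only there to keep the example self-contained.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;

public static class ProgressDownloader
{
    // Turns raw byte counts into a short status line for the user.
    // A non-positive total means the server did not report a length.
    public static string Describe(long received, long total)
    {
        if (total <= 0)
        {
            return received + " bytes received";
        }
        return (received * 100 / total) + "% (" + received + " of " + total + " bytes)";
    }

    // Downloads url to path and reports progress through the callback.
    public static async Task DownloadAsync(string url, string path, Action<string> report)
    {
        using (var client = new WebClient())
        {
            client.DownloadProgressChanged += (sender, e) =>
                report(Describe(e.BytesReceived, e.TotalBytesToReceive));
            await client.DownloadFileTaskAsync(new Uri(url), path);
        }
    }
}
```

A caller might pass Console.WriteLine as the report callback, or update a progress bar from it in a desktop application.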

C# 403 error because the file contains an inaccessible image? or what?

I'm trying to get a stream from the URL http://actueel.nl.pwc.com/site/syndicate.jsp but I get a 403 error. It doesn't require a login. I used Fiddler to check why IE can open it while my code can't. What I found was that two connections were made when opening the link in IE: one succeeded while the other got a 403. The 403 was for a sub-link to a GIF image. It seems the XML is a public file, but the image it references is located in an inaccessible folder.
I need to know how to ignore the image so I can still get the rest of the stream. This is my test code (by the way, I also tried with WebClient and with headers):
try
{
WebRequest request = WebRequest.Create("http://actueel.nl.pwc.com/site/syndicate.jsp");
request.Credentials = CredentialCache.DefaultCredentials;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
MessageBox.Show(reader.ReadToEnd());
}
catch(Exception ex){
MessageBox.Show(ex.Message);
}
Thanks for your reactions ;)
I agree with Dmytro. The WebRequest is NOT attempting to download the GIF image referenced in the JSP file; only the contents of the JSP itself are being downloaded. Try comparing the IE request with yours carefully (in Fiddler), not only the URL but also all the request/response headers, and see if anything else is missing, such as cookies or Accept headers.
Using Wireshark and wget, the differences were in the headers only.
The remote server requires User-Agent and Accept headers.
e.g.:
WebRequest request = WebRequest.Create("http://actueel.nl.pwc.com/site/syndicate.jsp");
((HttpWebRequest)request).UserAgent = "stackoverflow.com/q/4233673/111013";
((HttpWebRequest) request).Accept = "*/*";
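Since the question mentions trying WebClient as well, the same two headers can be set there through its Headers collection. This is just a sketch: the factory name (BrowserLikeClient.Create) and the user-agent value passed in are placeholders, not anything the remote server is known to require.

```csharp
using System.Net;

public static class BrowserLikeClient
{
    // Creates a WebClient that sends User-Agent and Accept headers,
    // which WebClient does not add on its own.
    public static WebClient Create(string userAgent)
    {
        var client = new WebClient();
        client.Headers[HttpRequestHeader.UserAgent] = userAgent;
        client.Headers[HttpRequestHeader.Accept] = "*/*";
        return client;
    }
}
```

Usage would then be along the lines of: using (var client = BrowserLikeClient.Create("my-app/1.0")) { string xml = client.DownloadString(url); }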

How to read a file from a URI using StreamReader?

I have a file at a URI that I would like to read using StreamReader. Obviously, this causes a problem since File.OpenText does not support URI paths. The file is a txt file with a bunch of html in it. I have multiple web pages that use this same piece of html, so I have put it in a txt file, and am reading it into the page when the page loads (I can get it to work when I put the file on the file system, but need to put it in a document repository online so that a business user can get to it). I am trying to avoid using an iframe. Is there a way to use StreamReader with URI formats? If not, what other options are there using C# to read in the txt file of html? If this is not optimal, can someone suggest a better approach?
Is there a specific requirement to use StreamReader? Unless there is, you can use the WebClient class:
var webClient = new WebClient();
string readHtml = webClient.DownloadString("your_file_path_url");
You could try using the HttpWebRequest class, or WebClient. Here's the slightly more complicated web request example. Its advantage over WebClient is that it gives you more control over how the request is made:
HttpWebRequest httpRequest = (HttpWebRequest) WebRequest.Create(lcUrl);
httpRequest.Timeout = 10000; // 10 secs
httpRequest.UserAgent = "Code Sample Web Client";
HttpWebResponse webResponse = (HttpWebResponse) httpRequest.GetResponse();
StreamReader responseStream = new StreamReader(webResponse.GetResponseStream());
string content = responseStream.ReadToEnd();
If you are behind a proxy, don't forget to set your credentials:
WebRequest request=WebRequest.Create(url);
request.Timeout=30*60*1000;
request.UseDefaultCredentials=true;
request.Proxy.Credentials=request.Credentials;
WebResponse response=(WebResponse)request.GetResponse();
using (Stream s=response.GetResponseStream())
...
