Downloading a file from a redirection page - C#

How would I go about downloading a file from a redirection page (which itself does some calculations based on the user)?
For example, if I wanted the user to download a game, I would use WebClient and do something like:
client.DownloadFile("http://game-side.com/downloadfetch/");
It's not as simple as doing
client.DownloadFile("http://game-side.com/download.exe");
But if the user were to click on the first link, it would redirect and download the file.

As far as I know this isn't possible with DownloadFile().
You could use this instead:
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("http://game-side.com/downloadfetch/");
myHttpWebRequest.MaximumAutomaticRedirections = 1;
myHttpWebRequest.AllowAutoRedirect = true;
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
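To actually save the downloaded bytes, a minimal sketch (requires System.IO; the output path is only a placeholder) that copies the response stream to disk:
// Copy the response body to a local file (path is just an example)
using (Stream responseStream = myHttpWebResponse.GetResponseStream())
using (FileStream fileStream = new FileStream(@"C:\Downloads\download.exe", FileMode.Create))
{
    responseStream.CopyTo(fileStream);
}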
See also
Download file through code that has a redirect?

I think you should go with a slightly customized WebClient class like this. It will follow HTTP 3xx redirects:
public class MyWebClient : WebClient
{
    protected override WebResponse GetWebResponse(WebRequest request)
    {
        (request as HttpWebRequest).AllowAutoRedirect = true;
        WebResponse response = base.GetWebResponse(request);
        return response;
    }
}
...
WebClient client = new MyWebClient();
client.DownloadFile("http://game-side.com/downloadfetch/", "download.zip");

How to download file from url that redirects?

I'm trying to download a file from a link that doesn't contain the file itself, but instead redirects to another (temporary) link that contains the actual file. The objective is to get an updated copy of the program without the need to open a browser. The link is:
http://www.bleepingcomputer.com/download/minitoolbox/dl/65/
I've tried to use WebClient, but it won't work:
private void Button1_Click(object sender, EventArgs e)
{
    WebClient webClient = new WebClient();
    webClient.DownloadFileCompleted += new AsyncCompletedEventHandler(Completed);
    webClient.DownloadFileAsync(new Uri("http://www.bleepingcomputer.com/download/minitoolbox/dl/65/"), @"C:\Downloads\MiniToolBox.exe");
}
After searching and trying many things, I've found this solution that involves using HttpWebRequest.AllowAutoRedirect.
Download file through code that has a redirect?
// Create a new HttpWebRequest object to the mentioned URL.
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("http://www.contoso.com");
myHttpWebRequest.MaximumAutomaticRedirections = 1;
myHttpWebRequest.AllowAutoRedirect = true;
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
It seems that's exactly what I'm looking for, but I simply don't know how to use it :/
I guess the link is a parameter of WebRequest.Create, but how can I retrieve the file to my directory? Yes, I'm a noob... Thanks in advance for your help.
I switched from a WebClient-based approach to an HttpWebRequest too, because auto redirects didn't seem to be working with WebClient. I was using code similar to yours but could never get it to work; it never redirected to the actual file. Looking in Fiddler I could see I wasn't actually getting the final redirect.
Then I came across some code for a custom version of WebClient in this question:
class CustomWebclient : WebClient
{
    [System.Security.SecuritySafeCritical]
    public CustomWebclient() : base()
    {
    }

    public CookieContainer cookieContainer = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri myAddress)
    {
        WebRequest request = base.GetWebRequest(myAddress);
        if (request is HttpWebRequest)
        {
            (request as HttpWebRequest).CookieContainer = cookieContainer;
            (request as HttpWebRequest).AllowAutoRedirect = true;
        }
        return request;
    }
}
The key part in that code is AllowAutoRedirect = true. It's supposed to be on by default according to the documentation, which states:
AllowAutoRedirect is set to true in WebClient instances.
but that didn't seem to be the case when I was using it.
I also needed the CookieContainer part for this to work with the SharePoint external URLs we were trying to access.
I guess the easy option is simply this (after what you've got there, with the URL you provided in place of http://www.contoso.com):
using (var responseStream = myHttpWebResponse.GetResponseStream())
{
    using (var fileStream =
        new FileStream(Path.Combine("folder_here", "filename_here"), FileMode.Create))
    {
        responseStream.CopyTo(fileStream);
    }
}
EDIT:
In fact, this won't work. It isn't an HTTP redirect that downloads the file. Look at the source of that page and you'll see this:
<meta http-equiv="refresh" content="3; url=http://download.bleepingcomputer.com/dl/1f92ae2ecf0ba549294300363e9e92a8/52ee41aa/windows/security/security-utilities/m/minitoolbox/MiniToolBox.exe">
It relies on the browser to perform the redirect (a meta refresh), so an HTTP-level AllowAutoRedirect never sees it. Unfortunately, what you're trying to do won't work that way.
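If you still wanted to handle this case in code, one rough sketch (assuming the page keeps serving a meta-refresh tag in the shape shown above; the output path is only a placeholder) would be to download the HTML, pull the url= value out of that tag, and download that address directly:
// Sketch: extract the meta-refresh target and download it
// (requires System.Net and System.Text.RegularExpressions)
string pageHtml = new WebClient().DownloadString("http://www.bleepingcomputer.com/download/minitoolbox/dl/65/");
Match match = Regex.Match(pageHtml, "http-equiv=\"refresh\"[^>]*url=([^\"]+)\"", RegexOptions.IgnoreCase);
if (match.Success)
{
    string fileUrl = match.Groups[1].Value;
    new WebClient().DownloadFile(fileUrl, @"C:\Downloads\MiniToolBox.exe");
}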

C# NET.WebClient DownloadString() Issue - Page redirects

I have this problem - I am writing a simple web spider and it works well so far. The problem is that the site I am working on has the nasty habit of redirecting or adding things to the address sometimes. On some pages it adds "/about" after you load them, and on some it totally redirects to another page.
The WebClient gets confused, since it downloads the HTML and starts to parse the links, but many of them are in the format "../../something", so it simply crashes after a while, because it resolves each link against the originally given address (before the redirect or the added "/about"). When the newly constructed page comes out of the queue, it throws a 404 Not Found exception (surpriiise).
Now I can just add "/about" to every page myself, but for shits and giggles, the website itself doesn't always add it...
I would appreciate any ideas.
Thank you for your time and all best!
If you want to get the redirected URI of a page for parsing the links inside it, use a subclass of WebClient like this:
class MyWebClient : WebClient
{
    Uri _responseUri;

    public Uri ResponseUri
    {
        get { return _responseUri; }
    }

    protected override WebResponse GetWebResponse(WebRequest request)
    {
        WebResponse response = base.GetWebResponse(request);
        _responseUri = response.ResponseUri;
        return response;
    }
}
Now use MyWebClient instead of WebClient and resolve the parsed links against ResponseUri, as in the sketch below.
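A minimal usage sketch (the page URL and the relative link are placeholders):
var client = new MyWebClient();
string html = client.DownloadString("http://example.com/somepage");

// Resolve a relative link against the address we actually ended up on,
// not the address we originally requested.
Uri absoluteLink = new Uri(client.ResponseUri, "../../something");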

in C#, how can I get the HTML content of a website before displaying it?

I have a web browser project in C#. I am thinking of a system like this: when the user types a URL and clicks the "go" button, my browser gets the content of the web site (it shouldn't visit that page, I mean it shouldn't display anything). Then I want to look for a specific keyword, for example "violence"; if it exists, I can navigate the browser to a local page that shows a warning. In short: in C#, how can I get the content of a web site before visiting it?
Sorry for my English,
Thanks in advance!
System.Net.WebClient:
string url = "http://www.google.com";
System.Net.WebClient wc = new System.Net.WebClient();
string html = wc.DownloadString(url);
You have to use WebRequest and WebResponse to load a site:
example:
string GetPageSource(string url)
{
    HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(url);
    webrequest.Method = "GET";
    HttpWebResponse webResponse = (HttpWebResponse)webrequest.GetResponse();

    string responseHtml;
    using (StreamReader responseStream = new StreamReader(webResponse.GetResponseStream()))
    {
        responseHtml = responseStream.ReadToEnd().Trim();
    }
    return responseHtml;
}
After that you can check responseHtml for your keywords, for example with a Regex.
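A minimal sketch of such a check (requires System.Text.RegularExpressions; the keyword is only an example):
// Case-insensitive keyword check on the downloaded HTML
bool containsKeyword = Regex.IsMatch(responseHtml, @"\bviolence\b", RegexOptions.IgnoreCase);
if (containsKeyword)
{
    // navigate to the local warning page instead of showing the site
}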
You can make an HTTP request (via HttpClient) to the site and parse the results, looking for the various keywords. Then you can make the decision whether or not to visibly 'navigate' the user there.
There's an HTTP client sample on Dev Center that may help.
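A rough sketch of that approach (assuming an async context; the URL and keyword are placeholders):
// Requires System.Net.Http
HttpClient httpClient = new HttpClient();
string html = await httpClient.GetStringAsync("http://www.example.com");
if (html.IndexOf("violence", StringComparison.OrdinalIgnoreCase) >= 0)
{
    // show the local warning page instead of navigating to the site
}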

Download file through code that has a redirect?

I have some URLs in the database. The problem is that these URLs redirect to what I actually want.
I have something like this
http://www.mytestsite.com/test/test/?myphoto=true
Now if I go to this URL, it redirects to the photo, so the URL ends up being
http://www.mytestsite.com/test/myphoto.jpg
Is it possible to somehow scrape (download) this through C#, let it follow the redirect, and get the real URL so I can download the image?
I think you are after the HttpWebRequest.AllowAutoRedirect Property. The property gets or sets a value that indicates whether the request should follow redirection responses.
Example taken from MSDN
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("http://www.contoso.com");
myHttpWebRequest.MaximumAutomaticRedirections = 1;
myHttpWebRequest.AllowAutoRedirect = true;
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
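Once the redirect has been followed, the response exposes the final address via ResponseUri. A minimal sketch (the output path is only a placeholder):
// The address the request was redirected to, e.g. http://www.mytestsite.com/test/myphoto.jpg
Uri realUrl = myHttpWebResponse.ResponseUri;

// Download the image from the resolved URL
new WebClient().DownloadFile(realUrl, @"C:\temp\myphoto.jpg");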
I had issues trying to get HttpWebRequest to always fully redirect when using it with SharePoint external URLs; I simply couldn't get it to work.
After a lot of faffing about I discovered that this can be done with WebClient too and that proved more reliable for me.
To get that to work with WebClient you seem to have to create a class that derives from WebClient so that you can manually force AllowAutoRedirect to true.
I wrote up a bit more on this in this answer, which borrows its code from this question.
The key code was:
class CustomWebclient : WebClient
{
    [System.Security.SecuritySafeCritical]
    public CustomWebclient() : base()
    {
    }

    public CookieContainer cookieContainer = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri myAddress)
    {
        WebRequest request = base.GetWebRequest(myAddress);
        if (request is HttpWebRequest)
        {
            (request as HttpWebRequest).CookieContainer = cookieContainer;
            (request as HttpWebRequest).AllowAutoRedirect = true;
        }
        return request;
    }
}
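A minimal usage sketch with that class (the URL and file name are placeholders):
var client = new CustomWebclient();
// Follows the redirect (and carries cookies) before saving the target file
client.DownloadFile("http://www.mytestsite.com/test/test/?myphoto=true", "myphoto.jpg");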

How to check if a file exists on an webserver by its URL?

In our application we have some kind of online help. It works really simply: if the user clicks the help button, a URL is built depending on the current language and help context (e.g. "http://example.com/help/" + [LANG_ID] + [HELP_CONTEXT]) and opened in the browser.
So my question is: how can I check whether a file exists on the web server without loading the complete file content?
Thanks for your help!
Update: Thanks for your help. My question has been answered.
Now we have proxy authentication problems and cannot send the HTTP request ;)
You can use .NET to do a HEAD request and then look at the status of the response.
Your code would look something like this (adapted from The Lowly HTTP HEAD Request):
// create the request
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
// instruct the server to return headers only
request.Method = "HEAD";
// make the connection
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
// get the status code
HttpStatusCode status = response.StatusCode;
Here's a list detailing the status codes that can be returned by StatusCode (the HttpStatusCode enumeration).
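One caveat worth noting: with HttpWebRequest, a missing file (404) makes GetResponse() throw a WebException instead of returning a response, so a sketch of the existence check (reusing the url variable from above) would wrap the call in a try/catch:
bool exists;
try
{
    HttpWebRequest headRequest = WebRequest.Create(url) as HttpWebRequest;
    // instruct the server to return headers only
    headRequest.Method = "HEAD";
    using (HttpWebResponse headResponse = headRequest.GetResponse() as HttpWebResponse)
    {
        exists = headResponse.StatusCode == HttpStatusCode.OK;
    }
}
catch (WebException)
{
    // a 404 (and other error statuses) ends up here
    exists = false;
}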
Can we assume that you are running your web application on the same web server as you are retrieving your help pages from? If yes, then you can use the Server.MapPath method to find a path to the file on the server combined with the File.Exists method from the System.IO namespace to confirm that the file exists.
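If that assumption holds, a minimal sketch (assuming this runs inside an ASP.NET page where Server is available; the relative path format is only a placeholder):
// Only works when the help files live on the same server as the web application
string relativePath = "~/help/" + LANG_ID + HELP_CONTEXT;
string physicalPath = Server.MapPath(relativePath);
bool helpPageExists = System.IO.File.Exists(physicalPath);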
Had the same problem myself and found this question and the answers here really useful.
But the answers here use the old WebRequest class, which is a bit outdated (it has no async support, for starters). So I wanted to use the more modern way of doing it with HttpClient. Here is an example with a little helper class to check if the file exists:
using System.Net.Http;
using System.Threading.Tasks;

class HttpClientHelper
{
    private static HttpClient _httpClient;

    public static async Task<bool> DoesFileExist(string url)
    {
        if (_httpClient == null)
        {
            _httpClient = new HttpClient();
        }

        using (HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Head, url))
        {
            using (HttpResponseMessage response = await _httpClient.SendAsync(request))
            {
                return response.StatusCode == System.Net.HttpStatusCode.OK;
            }
        }
    }
}
Usage:
if (await HttpClientHelper.DoesFileExist("https://www.google.com/favicon.ico"))
{
    // Yes it does!
}
else
{
    // No it doesn't!
}
Send a HEAD request for the URL (instead of a GET). The server will return a 404 if it doesn't exist.
Take a look at the HttpWebResponse class. You could do something like this:
string url = "http://example.com/help/" + LANG_ID + HELP_CONTEXT;
WebRequest request = WebRequest.Create(url);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusDescription == "OK")
{
// worked
}
If you want to check the status of a document on the server from client-side JavaScript:
function fetchStatus(address) {
    var client = new XMLHttpRequest();
    client.onreadystatechange = function() {
        // in case of network errors this might not give reliable results
        if (this.readyState == 4)
            returnStatus(this.status);
    };
    client.open("HEAD", address);
    client.send();
}
Thank you.
EDIT: Apparently a good method to do this would be a HEAD request.
You could also create a server-side application that stores the name of every available web page on the server. Your client application could then query this application and respond a little bit quicker than a full page request, and without throwing a 404 error every time the file doesn't exist.
