How to process http links? - c#

As you all know, there are many file-hosting websites. Is there a way to process the HTTP link of a file on one of those sites and find out whether the file exists, or whether the link itself even exists? I know that some of those file-hosting websites may have their own APIs, but I want a more generic way.
Edit:
So as I understand it, there is no file on a server as such; I just have to read the response and interpret it properly. I want to ask another thing: what about redirection? If I get the response for a link that redirects to another link, will I get the final target from the response?

You can find out whether a local file exists using the File.Exists method:
bool System.IO.File.Exists(string path)
To find out whether a file exists on a remote server, you can try this:
WebRequest request;
WebResponse response;
string strMSG = string.Empty;
request = WebRequest.Create(new Uri("http://www.yoururl.com/yourfile.jpg"));
// HEAD asks for the headers only, so the file itself is not downloaded
request.Method = "HEAD";
try
{
    response = request.GetResponse();
    strMSG = string.Format("{0} {1}", response.ContentLength, response.ContentType);
    response.Close();
}
catch (Exception ex)
{
    // If the file does not exist, the server returns a (404) error
    strMSG = ex.Message;
}

If I understand you correctly, you're trying to tell whether a given URL has content.
Use the WebClient class. Request the URL; if you receive a 200, you're good to go. A 404 exception or similar probably means the link is no good.
Or, an even better way is to send an HTTP HEAD request, which returns only the headers and not the body.
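On the redirection question from the edit: with automatic redirects left on (the default for HttpWebRequest), the response's ResponseUri property holds the address that finally answered. The following is only a minimal sketch. The class name LinkProbe and the method name ResolveFinalUri are made up for illustration, and it assumes the link is an http:// or https:// URL.

```csharp
using System;
using System.Net;

// Hypothetical helper for illustration; not part of any library.
static class LinkProbe
{
    // Sends a HEAD request and returns the URI that finally answered it,
    // or null if the request failed (404, DNS failure, refused connection, ...).
    public static Uri ResolveFinalUri(string url)
    {
        // Assumption: url is http:// or https://; other schemes fail this cast.
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";           // headers only, no body
        request.AllowAutoRedirect = true;  // the default, shown for clarity

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                // If redirects were followed, this differs from the requested URI.
                return response.ResponseUri;
            }
        }
        catch (WebException)
        {
            // GetResponse throws for error statuses and for network failures.
            return null;
        }
    }
}
```

Comparing the returned URI with the one you requested tells you whether a redirect happened. A redirect to a non-HTTP scheme such as ftp:// is not followed this way and surfaces as a failure instead.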

Related

WebClient.DownloadFile requires what exactly for the URI?

I'm good with the code, it works great for other solutions of mine. I have a knowledge gap as I do not understand what constitutes a URI. This should work, but does not:
https://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nasdaq&render=download
Now I'm thinking that this is not a file, right? Pasting the above into a browser does provide a file, though. The exception message is "The underlying connection was closed: An unexpected error occurred on a receive."
String address = "https://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nasdaq&render=download";
.....
using (WebClient Client = new WebClient())
{
    try
    {
        Client.DownloadFile(address, destPath + filename);
    }
    catch (Exception ex)
    {
        Log.Line("Error: " + ex.Message);
        return 1;
    }
}

You've got a perfectly valid URI. The target server may respond to requests differently than you expect, though, for example depending on your web client. To debug issues like this, use curl (quote the URL so the shell does not interpret the & characters):
curl -v "https://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nasdaq&render=download"
The above command shows you that the server does not reply with the expected CSV file. That's not a problem in your code. You can try sending a different user agent with curl's -H flag, or set some redirection options, until you get there.
In your specific case it seems to be the header Accept-Encoding: gzip that solves the issue.
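One way to send that header from WebClient is to override GetWebRequest and turn on automatic decompression, which makes HttpWebRequest advertise gzip and unpack the reply. This is a sketch only: the class name GzipWebClient and the helper EnableCompression are hypothetical, and whether this alone satisfies the server is an assumption based on the curl observation above.

```csharp
using System;
using System.Net;

// Hypothetical WebClient subclass; the name is an assumption for this sketch.
class GzipWebClient : WebClient
{
    // Makes the request send Accept-Encoding: gzip, deflate and
    // transparently decompress the response body.
    public static void EnableCompression(HttpWebRequest request)
    {
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
    }

    // WebClient calls this for every request it creates.
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        var http = request as HttpWebRequest;
        if (http != null)
        {
            EnableCompression(http);
        }
        return request;
    }
}
```

In the question's code, replacing new WebClient() with new GzipWebClient() would be the only change needed; the DownloadFile call stays the same.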

C# 403 error because the file contains an inaccessible image? or what?

I'm trying to get a stream from a URL: http://actueel.nl.pwc.com/site/syndicate.jsp but I get a 403 error. It doesn't require a login. I used Fiddler to check why IE can open it while my code can't. What I found was that two connections were made when opening the link in IE: one succeeded while the other got a 403. The 403 was for a sublink to a GIF image. It seems the XML is a public file, but the image it references is located in an inaccessible folder.
I need to know how to ignore the image so I can still get the rest of the stream. This is my code to test it (by the way, I tried with WebClient and with headers too):
try
{
    WebRequest request = WebRequest.Create("http://actueel.nl.pwc.com/site/syndicate.jsp");
    request.Credentials = CredentialCache.DefaultCredentials;
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    Stream dataStream = response.GetResponseStream();
    StreamReader reader = new StreamReader(dataStream);
    MessageBox.Show(reader.ReadToEnd());
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
}
Thanks for your reactions ;)

I agree with Dmytro. The WebRequest is NOT attempting to download the GIF image referenced in the JSP file; only the contents of the JSP itself are being downloaded. Try looking carefully (in Fiddler) at the IE request compared to yours, not only the URL but also all the request/response headers, and see if anything else is missing, such as cookies or Accept headers.

Using Wireshark and wget, the differences were in the headers only.
The remote server requires both a User-Agent and an Accept header.
E.g.:
WebRequest request = WebRequest.Create("http://actueel.nl.pwc.com/site/syndicate.jsp");
((HttpWebRequest)request).UserAgent = "stackoverflow.com/q/4233673/111013";
((HttpWebRequest) request).Accept = "*/*";

Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones

Basically, I'm trying to grab an EXE from CNet's Download.com.
So I created a web parser, and so far all is going well.
Here is a sample link pulled directly from their site:
http://dw.com.com/redir?edId=3&siteId=4&oId=3001-20_4-10308491&ontId=20_4&spi=e6323e8d83a8b4374d43d519f1bd6757&lop=txt&tag=idl2&pid=10566981&mfgId=6250549&merId=6250549&pguid=PlvcGQoPjAEAAH5rQL0AAABv&destUrl=ftp%3A%2F%2F202.190.201.108%2Fpub%2Fryl2%2Fclient%2Finstaller-ryl2_v1673.exe
Here is the problem: when you attempt to download, it begins with HTTP, then redirects to an FTP site. I have tried .NET's WebClient and HttpWebRequest objects, and it looks like neither can handle such a redirect.
This code fails at GetResponse():
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://dw.com.com/redir");
WebResponse response = req.GetResponse();
Now, I also tried this:
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://dw.com.com/redir");
req.AllowAutoRedirect = false;
WebResponse response = req.GetResponse();
string s = new StreamReader(response.GetResponseStream()).ReadToEnd();
And it does not throw the error anymore; however, the variable s turns out to be an empty string.
I'm at a loss! Can anyone help out?

You can get the value of the "Location" header from the response's Headers collection, and then create a new FtpWebRequest to download that resource.

In your first code snippet you are redirected to a link that uses a different protocol (i.e. it's no longer HTTP, as HttpWebRequest expects), so it fails due to a malformed HTTP response.
In the second snippet you're no longer redirected, and hence you don't receive an FTP response (which would be malformed when interpreted as an HTTP response).
You need to acquire the FTP link. As ferozo wrote, you can do this by getting the value of the "Location" header, and then use an FtpWebRequest to access the file.
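The two steps described in these answers can be sketched as follows. Treat it as an illustration under assumptions rather than a tested downloader: the class and method names are hypothetical, and it assumes the server answers with a 3xx status carrying a Location header and that the FTP server allows anonymous access.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper for illustration; not part of any library.
static class CrossProtocolDownloader
{
    // Turns a Location header value into an absolute URI.
    // Relative values are resolved against the URI that was requested.
    public static Uri ResolveLocation(Uri requestUri, string location)
    {
        return string.IsNullOrEmpty(location) ? null : new Uri(requestUri, location);
    }

    // Step 1: ask the HTTP server where it wants to send us, without following.
    public static Uri GetRedirectTarget(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = false;
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            string location = response.Headers[HttpResponseHeader.Location];
            return ResolveLocation(request.RequestUri, location);
        }
    }

    // Step 2: download the target with whatever request type fits its scheme.
    public static void Download(Uri target, string destPath)
    {
        // WebRequest.Create returns an FtpWebRequest for ftp:// URIs.
        WebRequest request = WebRequest.Create(target);
        var ftp = request as FtpWebRequest;
        if (ftp != null)
        {
            ftp.Method = WebRequestMethods.Ftp.DownloadFile;
        }

        using (WebResponse response = request.GetResponse())
        using (Stream input = response.GetResponseStream())
        using (FileStream output = File.Create(destPath))
        {
            input.CopyTo(output);
        }
    }
}
```

GetRedirectTarget returns null when no Location header is present, so the caller should check for that before calling Download.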

How can one check to see if a remote file exists using C#

How can I check whether a file exists at a web location in ASP.NET (in a different web application, but on the same web server)? Currently I am doing it like this. Is there a better way?
using (WebClient client = new WebClient())
{
    try
    {
        Stream stream = client.OpenRead("http://localhost/images/myimage.jpg");
        if (stream != null)
        {
            //exists
        }
    }
    catch
    {
        //Not exists
    }
}

Remember that you are never going to get a 100% definitive response on the existence of a file, but the way I do it would be pretty similar to yours...
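// Requires: using System.Net; using System.Net.Cache;
bool remoteFileExists(string addressOfFile)
{
    try
    {
        HttpWebRequest request = WebRequest.Create(addressOfFile) as HttpWebRequest;
        request.Method = "HEAD"; // ask for headers only
        request.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);
        // Dispose the response so the underlying connection is released
        using (var response = request.GetResponse() as HttpWebResponse)
        {
            return (response.StatusCode == HttpStatusCode.OK);
        }
    }
    catch (WebException)
    {
        // 404 and other error statuses end up here
        return false;
    }
}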
EDIT: Looking at the answer by Anton Gogolev, I should have cast the response to an HttpWebResponse object and checked the status code. I have edited the code to reflect that.

If a file is accessible via HTTP, you can issue an HTTP HEAD request for that particular URL using HttpWebRequest. If HttpWebResponse.StatusCode is 200, then the file is there.
EDIT: Note that GetResponse throws a WebException for error status codes such as 404, when it arguably should not do that.

You can use Server.MapPath to get the directory and then check whether the file exists using standard IO methods like File.Exists.

The 404 or Not Found error message is an HTTP standard response code indicating that the client was able to communicate with the server, but the server could not find what was requested. A 404 does not say whether the absence is permanent, so the requested resource may be available in the future.
You can use a HEAD request (HttpWebRequest.Method = "HEAD")
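Because GetResponse throws for error statuses, a 404 and a network failure both arrive as a WebException. A rough sketch of telling them apart is below. The names RemoteFileState, RemoteFileCheck, Classify and GetStatus are invented for this example, and treating only 404 and 410 as "missing" is a design assumption, not a rule from the answers above.

```csharp
using System;
using System.Net;

// Hypothetical three-way result: a failed request is not proof of absence.
enum RemoteFileState { Exists, Missing, Unknown }

// Hypothetical helper for illustration; not part of any library.
static class RemoteFileCheck
{
    // Maps a status code (or null for "no HTTP answer at all") to a state.
    public static RemoteFileState Classify(HttpStatusCode? status)
    {
        if (status == null) return RemoteFileState.Unknown;

        int code = (int)status.Value;
        if (code >= 200 && code < 300) return RemoteFileState.Exists;
        if (status == HttpStatusCode.NotFound || status == HttpStatusCode.Gone)
            return RemoteFileState.Missing;

        // 403, 500, unfollowed redirects, ...: the file may or may not be there.
        return RemoteFileState.Unknown;
    }

    // Sends a HEAD request and returns the status code, including the one
    // carried by a WebException, or null when the server never answered.
    public static HttpStatusCode? GetStatus(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException wex)
        {
            var errorResponse = wex.Response as HttpWebResponse;
            if (errorResponse == null) return null; // DNS failure, timeout, ...
            using (errorResponse)
            {
                return errorResponse.StatusCode;
            }
        }
    }
}
```

A caller would combine the two, for example RemoteFileCheck.Classify(RemoteFileCheck.GetStatus(url)), and decide separately how to handle the Unknown case.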

How to check if a file exists on an webserver by its URL?

In our application we have some kind of online help. It works really simply: if the user clicks the help button, a URL is built depending on the current language and help context (e.g. "http://example.com/help/" + [LANG_ID] + [HELP_CONTEXT]) and opened in the browser.
So my question is: How can i check if a file exists on the web server without loading the complete file content?
Thanks for your Help!
Update: Thanks for your help. My question has been answered.
Now we have proxy authentication problems and cannot send the HTTP request ;)

You can use .NET to do a HEAD request and then look at the status of the response.
Your code would look something like this (adapted from The Lowly HTTP HEAD Request):
// create the request
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
// instruct the server to return headers only
request.Method = "HEAD";
// make the connection
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
// get the status code
HttpStatusCode status = response.StatusCode;
The HttpStatusCode enumeration lists the status codes that the StatusCode property can return.

Can we assume that you are running your web application on the same web server as you are retrieving your help pages from? If yes, then you can use the Server.MapPath method to find a path to the file on the server combined with the File.Exists method from the System.IO namespace to confirm that the file exists.

Had the same problem myself and found this question and the answers here really useful.
But the answers here use the old WebRequest class, which is a bit outdated; its async support is clumsy, for starters. So I wanted to use the more modern way of doing it with HttpClient. Here is an example with a little helper class to check whether the file exists:
using System.Net.Http;
using System.Threading.Tasks;

class HttpClientHelper
{
    private static HttpClient _httpClient;

    public static async Task<bool> DoesFileExist(string url)
    {
        if (_httpClient == null)
        {
            _httpClient = new HttpClient();
        }
        using (HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Head, url))
        {
            using (HttpResponseMessage response = await _httpClient.SendAsync(request))
            {
                return response.StatusCode == System.Net.HttpStatusCode.OK;
            }
        }
    }
}
Usage:
if (await HttpClientHelper.DoesFileExist("https://www.google.com/favicon.ico"))
{
    // Yes it does!
}
else
{
    // No it doesn't!
}

Send a HEAD request for the URL (instead of a GET). The server will return a 404 if it doesn't exist.
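Some servers do not answer HEAD and reply with 405 or 501 instead; that is an assumption about the target server, not something stated in the question. In that case a GET that stops after the headers still avoids loading the whole file. A possible sketch with HttpClient, where HeadWithFallback, HeadUnsupported and ExistsAsync are hypothetical names:

```csharp
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of any library.
static class HeadWithFallback
{
    // True for the statuses a server uses to say "I don't do HEAD".
    public static bool HeadUnsupported(HttpStatusCode status)
    {
        return status == HttpStatusCode.MethodNotAllowed
            || status == HttpStatusCode.NotImplemented;
    }

    // Tries HEAD first; falls back to GET but reads the headers only.
    // Network errors (HttpRequestException) are left to the caller.
    public static async Task<bool> ExistsAsync(HttpClient client, string url)
    {
        using (var head = new HttpRequestMessage(HttpMethod.Head, url))
        using (var response = await client.SendAsync(head))
        {
            if (!HeadUnsupported(response.StatusCode))
            {
                return response.IsSuccessStatusCode;
            }
        }

        using (var get = new HttpRequestMessage(HttpMethod.Get, url))
        using (var response = await client.SendAsync(
            get, HttpCompletionOption.ResponseHeadersRead))
        {
            // Disposing the response here abandons the unread body.
            return response.IsSuccessStatusCode;
        }
    }
}
```

The HttpClient is passed in so that one shared instance can be reused across calls.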

Take a look at the HttpWebResponse class. You could do something like this:
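string url = "http://example.com/help/" + LANG_ID + HELP_CONTEXT;
WebRequest request = WebRequest.Create(url);
// Dispose the response so the connection is released
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    // Compare the status code rather than the free-text description
    if (response.StatusCode == HttpStatusCode.OK)
    {
        // worked
    }
}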

If you want to check the status of a document on the server:
function fetchStatus(address) {
    var client = new XMLHttpRequest();
    client.onreadystatechange = function() {
        // in case of network errors this might not give reliable results
        if (this.readyState == 4)
            returnStatus(this.status);
    }
    client.open("HEAD", address);
    client.send();
}
Thank you.

EDIT: Apparently a good method to do this would be a HEAD request.
You could also create a server-side application that stores the name of every available web page on the server. Your client application could then query this application and respond a little bit quicker than a full page request, and without throwing a 404 error every time the file doesn't exist.
