I want to check whether a given URL is a link to http://youtube.com.
I know there are lots of shortened versions of the links (e.g. http://youtu.be), so what I am after is a way to resolve the URL and see if it ends up at http://youtube.com.
A couple of example inputs are:
http://www.youtube.com/v/[videoid]
http://www.youtu.be/watch?v=[videoid]
Does anyone know of a way to do this?
You could perform a HEAD request:
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("http://www.youtu.be/Ddn4MGaS3N4");
request.Method = "HEAD";
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse()) {
Console.WriteLine("Does this resolve to youtube?: {0}", response.ResponseUri.ToString().Contains("youtube.com") ? "Yes" : "No");
}
Appears to work fine. Unsure of edge cases but seems to do the job.
(Note: No error checking here such as 404 errors, etc).
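// Uri.Host keeps any "www." prefix, so match subdomains as well as the bare names.
string host = new Uri(url).Host;
bool isYoutube =
    host.Equals("youtube.com", StringComparison.OrdinalIgnoreCase) ||
    host.EndsWith(".youtube.com", StringComparison.OrdinalIgnoreCase) ||
    host.Equals("youtu.be", StringComparison.OrdinalIgnoreCase) ||
    host.EndsWith(".youtu.be", StringComparison.OrdinalIgnoreCase);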
First you may have to check what the hostname is for YouTube (I'm just assuming it is youtube.com), but once you have that, the following code will do what you want:
using System.Net;
// Dns.Resolve is obsolete; Dns.GetHostEntry is its replacement.
IPHostEntry host = Dns.GetHostEntry(theInputHostName);
// HostName is a bare host name, so compare without the "http://" scheme.
if (host.HostName == "youtube.com")
{
    // it resolves to youtube, do something.
}
If you want to know whether a given URL redirects (using status codes 301/302) to a YouTube URL, you can either use WebClient/HttpWebRequest/whatever directly and check the response, or disable HttpWebRequest.AllowAutoRedirect and traverse all redirects manually (checking the status code and then the Location HTTP header).
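The manual traversal described above could be sketched roughly as follows. The class name RedirectResolver, its method names, and the maxHops limit of 10 are made up for this illustration; it also assumes the server answers HEAD requests the same way it answers GET, which not every site does. WebRequest.Create is marked obsolete on recent .NET versions but still works.

```csharp
using System;
using System.Net;

static class RedirectResolver
{
    // True for the status codes that carry a Location header to follow.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Location may be relative, so resolve it against the URL that returned it.
    public static Uri NextUri(Uri current, string location)
    {
        return new Uri(current, location);
    }

    // Follows redirects by hand and returns the last URL reached.
    // maxHops is an arbitrary guard against redirect loops.
    public static Uri Resolve(Uri start, int maxHops = 10)
    {
        Uri current = start;
        for (int hop = 0; hop < maxHops; hop++)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(current);
            request.Method = "HEAD";
            request.AllowAutoRedirect = false;
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                string location = response.Headers["Location"];
                if (!IsRedirect(response.StatusCode) || string.IsNullOrEmpty(location))
                {
                    return current; // not a redirect: this is the final URL
                }
                current = NextUri(current, location);
            }
        }
        return current; // gave up after maxHops redirects
    }
}
```

Calling RedirectResolver.Resolve(new Uri(url)).Host then gives the host of the final URL, which can be compared against the YouTube host names.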
Related
I am creating an application which will check for broken links in content.
Everything works apart from YouTube links, where I get a mixed response: broken links (or codes I have just made up) sometimes come up with 200 OK and sometimes they come up as broken.
Is there a different way of checking broken links on YouTube?
I'm using standard .NET/C# code
try
{
HttpWebRequest request = WebRequest.Create(match.Groups[1].ToString()) as HttpWebRequest;
//Setting the Request method HEAD, you can also use GET too.
request.Method = "HEAD";
//Getting the Web Response.
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
//Returns TRUE if the Status code == 200
// result = "true";
result = response.StatusDescription;
response.Close();
// return (response.StatusCode == HttpStatusCode.OK);
}
catch
{
//Any exception returns false.
result = "false";
}
if(match.Groups[1].ToString().Contains(@"\n"))
{
//....
}
"sometimes come up with 200 OK and sometimes they come up as broken."
What you are doing is similar to a web crawler, and YouTube almost certainly has an anti-crawler mechanism, so if you keep requesting links without pause, your access may be blocked. To reduce the chance of this happening, lower the frequency of your requests and make them look as much like a real user's requests as possible.
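A minimal sketch of that advice, assuming a fixed pause between requests and a browser-like User-Agent header are enough; the class name ThrottledLinkChecker, the two-second pause, and the User-Agent string are placeholders invented for this example, not values YouTube is known to accept.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading;

static class ThrottledLinkChecker
{
    // Hypothetical values: tune the pause and the User-Agent for your own checker.
    static readonly TimeSpan Pause = TimeSpan.FromSeconds(2);
    const string UserAgent = "Mozilla/5.0 (compatible; LinkChecker/1.0)";

    // Builds a GET request (what a browser would send) with a User-Agent header set.
    public static HttpWebRequest BuildRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = UserAgent;
        return request;
    }

    // Checks each URL in turn, sleeping between requests to keep the rate low.
    public static Dictionary<string, bool> CheckAll(IEnumerable<string> urls)
    {
        var results = new Dictionary<string, bool>();
        foreach (string url in urls)
        {
            try
            {
                using (var response = (HttpWebResponse)BuildRequest(url).GetResponse())
                {
                    results[url] = response.StatusCode == HttpStatusCode.OK;
                }
            }
            catch (WebException)
            {
                // Thrown for 4xx/5xx statuses and for network failures.
                results[url] = false;
            }
            Thread.Sleep(Pause);
        }
        return results;
    }
}
```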
As you all know, there are many file hosting websites. Is there a way to process the HTTP link of a file on one of those sites and find out whether the file exists, or whether the link even exists? I know that some of those file hosting websites may have their own APIs, but I want a more generic way.
Edit:
So as I understand it, there is no file on a server as such; I just have to read the response and interpret it properly. I want to ask another thing: what about redirection? If I get the response of a link that redirects to another link, will I get the final target from the response?
You can find out if a local file exists using the Exists method:
bool System.IO.File.Exists(string path)
///
To find out if a file exists on a remote server, you can try this:
WebRequest request;
WebResponse response;
String strMSG = string.Empty;
request = WebRequest.Create(new Uri("http://www.yoururl.com/yourfile.jpg"));
request.Method = "HEAD";
try
{
    response = request.GetResponse();
    strMSG = string.Format("{0} {1}", response.ContentLength, response.ContentType);
}
catch (Exception ex)
{
    // If the file does not exist, the server returns a 404 error.
    strMSG = ex.Message;
}
If I understand you correctly, you're trying to tell if a given URL has content.
Use the WebClient class.
Call the URL; if you receive a 200, you're good to go. A 404 exception or similar probably means the link is no good.
Or, an even better way to do this is to send an HTTP HEAD request.
I am trying to unshorten URLs and have not been able to find code (VB.NET/C#) to do this. These are Twitter-shortened URLs. I guess I could try to access one of the available web services with an HttpWebRequest, but I would prefer to find some programmatic way of doing this.
You can get it directly from the response to the shortened URL, since it will return a MovedPermanently status code and the location of the real URL. (This should work for most sites without the need to navigate to the real URL.)
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://t.co/xqbLEi6s");
req.AllowAutoRedirect = false;
var resp = req.GetResponse();
string realUrl = resp.Headers["Location"];
Other test data: http://goo.gl/zdf2n , http://tinyurl.com/8xc9vca , http://x.co/iEup, http://is.gd/vTOlz6 , http://bit.ly/FUA4YU
There is no magic way to unshorten a URL without asking the service which created the URL (and the way to ask will be different for each service), or more pragmatically, just opening the URL and watching where it redirects to.
How can I check whether a file exists at a web location in ASP.NET (in a different web application, but on the same web server)? Currently I am doing it like this. Is there any better way of doing this?
using (WebClient client = new WebClient())
{
try
{
Stream stream = client.OpenRead("http://localhost/images/myimage.jpg");
if (stream != null)
{
//exists
}
}
catch
{
//Not exists
}
}
Remember that you are never going to get a 100% definitive response on the existence of a file, but the way I do it would be pretty similar to yours...
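// Requires: using System.Net; using System.Net.Cache;
bool remoteFileExists(string addressOfFile)
{
    try
    {
        HttpWebRequest request = WebRequest.Create(addressOfFile) as HttpWebRequest;
        request.Method = "HEAD";
        request.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);
        // Dispose the response so the underlying connection is released.
        using (var response = request.GetResponse() as HttpWebResponse)
        {
            return (response.StatusCode == HttpStatusCode.OK);
        }
    }
    catch (WebException)
    {
        // GetResponse throws for 4xx/5xx statuses and for network failures.
        return false;
    }
}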
EDIT :: looking at the edit by Anton Gogolev above (How can one check to see if a remote file exists using C#) I should have cast the response to a HttpWebResponse object and checked the status code. Edited the code to reflect that
If a file is accessible via HTTP, you can issue an HTTP HEAD request for that particular URL using HttpWebRequest. If HttpWebResponse.StatusCode is 200, then the file is there.
EDIT: See this on why GetResponse throws stupid exceptions when it actually should not do that.
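Because GetResponse reports 404 and other error statuses by throwing a WebException, the status code has to be read from the exception in that case. A rough sketch of one way to do it, with HeadProbe and its method names invented for this example:

```csharp
using System;
using System.Net;

static class HeadProbe
{
    // Extracts the HTTP status from a WebException, or null when there was no
    // HTTP response at all (DNS failure, timeout, refused connection).
    public static HttpStatusCode? StatusFrom(WebException ex)
    {
        HttpWebResponse response = ex.Response as HttpWebResponse;
        return response == null ? (HttpStatusCode?)null : response.StatusCode;
    }

    // Sends a HEAD request and returns the status code,
    // or null on a network-level failure.
    public static HttpStatusCode? GetStatus(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // ex.Response may be null; a using statement tolerates that.
            using (ex.Response)
            {
                return StatusFrom(ex);
            }
        }
    }
}
```

A caller can then tell HttpStatusCode.NotFound apart from a null result, which means the server could not be reached at all.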
You can use Server.MapPath to get the directory and then check whether the file exists using standard System.IO methods like File.Exists.
The 404 or Not Found error message is a HTTP standard response code indicating that the client was able to communicate with the server but the server could not find what was requested. A 404 error indicates that the requested resource may be available in the future.
You can use a HEAD request (HttpWebRequest.Method = "HEAD")
In our application we have some kind of online help. It works really simply: if the user clicks on the help button, a URL is built depending on the current language and help context (e.g. "http://example.com/help/" + [LANG_ID] + [HELP_CONTEXT]) and opened in the browser.
So my question is: How can I check if a file exists on the web server without loading the complete file content?
Thanks for your Help!
Update: Thanks for your help. My question has been answered.
Now we have proxy authentication problems and cannot send the HTTP request ;)
You can use .NET to do a HEAD request and then look at the status of the response.
Your code would look something like this (adapted from The Lowly HTTP HEAD Request):
// create the request
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
// instruct the server to return headers only
request.Method = "HEAD";
// make the connection
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
// get the status code
HttpStatusCode status = response.StatusCode;
The HttpStatusCode enumeration lists the status codes that the StatusCode property can return.
Can we assume that you are running your web application on the same web server as you are retrieving your help pages from? If yes, then you can use the Server.MapPath method to find a path to the file on the server combined with the File.Exists method from the System.IO namespace to confirm that the file exists.
I had the same problem myself and found this question and the answers here really useful.
But the answers here use the old WebRequest class, which is a bit outdated; it has no async support, for starters. So I wanted to use the more modern way of doing it with HttpClient. Here is an example with a little helper class to check if the file exists:
using System.Net.Http;
using System.Threading.Tasks;
class HttpClientHelper
{
private static HttpClient _httpClient;
public static async Task<bool> DoesFileExist(string url)
{
if (_httpClient == null)
{
_httpClient = new HttpClient();
}
using (HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Head, url))
{
using (HttpResponseMessage response = await _httpClient.SendAsync(request))
{
return response.StatusCode == System.Net.HttpStatusCode.OK;
}
}
}
}
Usage:
if (await HttpClientHelper.DoesFileExist("https://www.google.com/favicon.ico"))
{
// Yes it does!
}
else
{
// No it doesn't!
}
Send a HEAD request for the URL (instead of a GET). The server will return a 404 if it doesn't exist.
Take a look at the HttpWebResponse class. You could do something like this:
string url = "http://example.com/help/" + LANG_ID + HELP_CONTEXT;
WebRequest request = WebRequest.Create(url);
// GetResponse throws a WebException for 404 and other error statuses.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusCode == HttpStatusCode.OK)
    {
        // worked
    }
}
If you want to check the status of a document on the server:
function fetchStatus(address) {
var client = new XMLHttpRequest();
client.onreadystatechange = function() {
// in case of network errors this might not give reliable results
if(this.readyState == 4)
returnStatus(this.status);
}
client.open("HEAD", address);
client.send();
}
Thank you.
EDIT: Apparently a good method to do this would be a HEAD request.
You could also create a server-side application that stores the name of every available web page on the server. Your client application could then query this application and respond a little bit quicker than a full page request, and without throwing a 404 error every time the file doesn't exist.