unshortening urls - c#

I am trying to unshorten urls and have not been able to find code (vb.net/c#) to do this. These are the twitter shortened urls and I guess I could try and access one of the web services available and do a httpwebrequest but would prefer to find some programmatic way of doing this.

You can get it directly from response of the shortened url since it will return a status code MovedPermanently and the location for the real url.(This should work for most of the sites without the need for navigating to the real url)
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://t.co/xqbLEi6s");
req.AllowAutoRedirect = false;
var resp = req.GetResponse();
string realUrl = resp.Headers["Location"];
Other test data: http://goo.gl/zdf2n , http://tinyurl.com/8xc9vca , http://x.co/iEup, http://is.gd/vTOlz6 , http://bit.ly/FUA4YU

There is no magic way to unshorten a URL without asking the service which created the URL (and the way to ask will be different for each service), or more pragmatically, just opening the URL and watching where it redirects to.

Related

Redirect url to obtain the direct url

I did an application which parses an html document and then obtains some urls, the problem is the urls only can be downloaded directly from the navigator.
In VB.NET or C#, how I could redirect this url to obtain a direct link for later paste the link to download it in a Download Manager?
dim url as string = "http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033"
I need to say that I'm not much experimented with http things, maybe I'm wrong and the url has anything to redirect or something similar fault, please just say me how can I redirect that kind of urls or If I'm wrong.
UPDATE:
Tried this, but I get the same url without any changes:
Dim url As String = _
"http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033"
Dim request As HttpWebRequest = DirectCast(HttpWebRequest.Create(url), HttpWebRequest)
request.AllowAutoRedirect = True
Dim response As HttpWebResponse
Dim resUri As String
response = request.GetResponse
resUri = response.ResponseUri.AbsoluteUri
MsgBox(resUri)
UPDATE 2:
In the answer from here HttpWebRequest Login data Then Redirect
He says
If the redirect is handled transparently, the _response.ResponseURI
will contain the address it redirected to. If not, you have to read
the redirect header and decide yourself whether or not to request the
new page.
so...if I need to do thatm, how I can do that?
UPDATE 3:
DownloadThemAll plugin for Firefox can obtain the direct urls... as you can see all the urls finishes with an .mp3 file extension, that's what I need
To my knowledge, the url
http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033
IS the direct url, a direct file url does not need to have the filetype in it.
you can download the file using
string url = "http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033"
WebClient wc = new WebClient();
wc.DownloadFile(url, fileName);
you can get the fileName (Madonna-Like A Virgin -www.mrtzcmp3.net.mp3) by using
HttpWebRequest myHttpWebRequest = (HttpWebRequest)HttpWebRequest.Create(url);
string header = myHttpWebResponse.Headers.ToString();
fileName = header.Remove(0, header.IndexOf("filename=")+10);
fileName = fileName.Remove(fileName.IndexOf('"'));
that is untested, but it should work.
edit: I think this does what you want, but I may have misunderstood your question
you can perform a web request using web client to get the content (url) from that url, then you just need to perform the redirect.
Use an HttpWebRequest and use the AllowAutoRedirect=true to get the direct link and download the file.
Can you try to paste the URL to an URl shortener like tinyUrl or BitLy? Maybe there is a shortener Service that provides an API?
The file then will be downloaded at: http://tinyurl.com/phzhxsr
You will never get a direct URL from the site owner because the URL is dynamicaly parsed and the file is send with the retrun datastream, not by downloading a specific URL.

Logging into website programmatically with C# and WebRequest class

I'm trying to login to a website using C# and the WebRequest class. This is the code I wrote up last night to send POST data to a web page:
public string login(string URL, string postData)
{
Stream webpageStream;
WebResponse webpageResponse;
StreamReader webpageReader;
byte[] byteArray = Encoding.UTF8.GetBytes(postData);
_webRequest = WebRequest.Create(URL);
_webRequest.Method = "POST";
_webRequest.ContentType = "application/x-www-form-urlencoded";
_webRequest.ContentLength = byteArray.Length;
webpageStream = _webRequest.GetRequestStream();
webpageStream.Write(byteArray, 0, byteArray.Length);
webpageResponse = _webRequest.GetResponse();
webpageStream = webpageResponse.GetResponseStream();
webpageReader = new StreamReader(webpageStream);
string responseFromServer = webpageReader.ReadToEnd();
webpageReader.Close();
webpageStream.Close();
webpageResponse.Close();
return responseFromServer;
}
and it works fine, but I have no idea how I can modify it to send POST data to a login script and then save a cookie(?) and log in.
I have looked at my network transfers using Firebug on the websites login page and it is sending POST data to a URL that looks like this:
accountName=myemail%40gmail.com&password=mypassword&persistLogin=on&app=com-sc2
As far as I'm aware, to be able to use my account with this website in my C# app I need to save the cookie that the web server sends, and then use it on every request? Is this right? Or can I get away with no cookie at all?
Any help is greatly apprecated, thanks! :)
The login process depends on the concrete web site. If it uses cookies, you need to use them.
I recommend to use Firefox with some http-headers watching plugin to look inside headers how they are sent to your particular web site, and then implement it the same way in C#. I answered very similar question the day before yesterday, including example with cookies. Look here.
I've found more luck using the HtmlElement class to manipulate around websites.
Here is cross post to an example of how logging in through code would work (provided you're using a WebBrowser Control)

Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones

Basically, I'm trying to grab an EXE from CNet's Download.com
So i created web parser and so far all is going well.
Here is a sample link pulled directly from their site:
http://dw.com.com/redir?edId=3&siteId=4&oId=3001-20_4-10308491&ontId=20_4&spi=e6323e8d83a8b4374d43d519f1bd6757&lop=txt&tag=idl2&pid=10566981&mfgId=6250549&merId=6250549&pguid=PlvcGQoPjAEAAH5rQL0AAABv&destUrl=ftp%3A%2F%2F202.190.201.108%2Fpub%2Fryl2%2Fclient%2Finstaller-ryl2_v1673.exe
Here is the problem: When you attempt to download, it begins with HTTP, then redirects to an FTP site. I have tried .NET's WebClient and HttpWebRequest Objects, and it looks like Neither can support Redirects.
This Code Fails at GetResponse();
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://dw.com.com/redir");
WebResponse response = req.GetResponse();
Now, I also tried this:
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://dw.com.com/redir");
req.AllowAutoRedirect = false;
WebResponse response = req.GetResponse();
string s = new StreamReader(response.GetResponseStream()).ReadToEnd();
And it does not throw the error anymore, however variable s turns out to be an empty string.
I'm at a loss! Can anyone help out?
You can get the value of the "Location" header from the response.headers, and then create a new FtpWebRequest to download that resource.
in your first code snippet you will be redirected to a link using a different protocol (i.e it's no longer Http as in HttpWebRequest) so it fails du to a malformed http response.
In the second part you're no longer redirected and hence you don't receive a FTP response (which is not malform when interpreted as HTTP response).
You need to acquire FTP link,as ferozo wrote you can do this by getting the value of the header "location", and use a FtpWebRequest to access the file

HttpRequest: pass through AuthLogin

I would need to make a simple program that logs with given credentials to certain website and then navigate to some element (link).
It is even possible (I mean this Authlogin thing)?
EDIT: SORRY - I am on my company machine and I cannot click on "Vote" or "Add comment" - the page says "Done, but with errors on page" (IE..). I do appreciate your answers and comments, you have helped me a lot!
Main things to do are:
Start using Fiddler to see what needs to be sent and in what way
Assuming we're talking a normal web form you'll probably need to use a CookieContainer with your WebRequests in order to accept the cookies that come from the login request and then re-supply them when sending subsequent requests (such context is not automagically maintained by HttpWebRequest) :-
CookieContainer _cookieContainer = new CookieContainer();
((HttpWebRequest)request).CookieContainer = _cookieContainer;
yes. it is possible.
see following code:
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(uri);
req.AuthenticationLevel = System.Net.Security.AuthenticationLevel.MutualAuthRequested;
req.Credentials = new NetworkCredential("admin", "admin");
req.PreAuthenticate = true;
It will partly depend on how the login process is managed. Is this actually done via a web form? If so, you'll need to post the form, just as a normal browser would.
If it's done over HTTP authentication, you should be able to set the credentials in the web request, tell it to pre-authenticate, and all should be well.

HTTP Post in C# console app doesn't return the same thing as a browser request

I have a C# console app (.NET 2.0 framework) that does an HTTP post using the following code:
StringBuilder postData = new StringBuilder(100);
postData.Append("post.php?");
postData.Append("Key1=");
postData.Append(val1);
postData.Append("&Key2=");
postData.Append(val2);
byte[] dataArray = Encoding.UTF8.GetBytes(postData.ToString());
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create("http://example.com/");
httpRequest.Method = "POST";
httpRequest.ContentType = "application/x-www-form-urlencoded";
httpRequest.ContentLength = dataArray.Length;
Stream requestStream = httpRequest.GetRequestStream();
requestStream.Write(dataArray, 0, dataArray.Length);
requestStream.Flush();
requestStream.Close();
HttpWebResponse webResponse = (HttpWebResponse)httpRequest.GetResponse();
if (httpRequest.HaveResponse == true) {
Stream responseStream = webResponse.GetResponseStream();
StreamReader responseReader = new System.IO.StreamReader(responseStream, Encoding.UTF8);
String responseString = responseReader.ReadToEnd();
}
The outputs from this are:
webResponse.ContentLength = -1
webResponse.ContentType = text/html
webResponse.ContentEncoding is blank
The responseString is HTML with a title and body.
However, if I post the same URL into a browser (http://example.com/post.php?Key1=some_value&Key2=some_other_value), I get a small XML snippet like:
<?xml version="1.0" ?>
<RESPONSE RESULT="SUCCESS"/>
with none of the same HTML as in the application. Why are the responses so different? I need to parse the returned result which I am not getting in the HTML. Do I have to change how I do the post in the application? I don't have control over the server side code that accepts the post.
If you are indeed supposed to use the POST HTTP method, you have a couple things wrong. First, this line:
postData.Append("post.php?");
is incorrect. You want to post to post.php, you don't want post the value "post.php?" to the page. Just remove this line entirely.
This piece:
... WebRequest.Create("http://example.com/");
needs post.php added to it, so...
... WebRequest.Create("http://example.com/post.php");
Again this is assuming you are actually supposed to be POSTing to the specified page instead of GETing. If you are supposed to be using GET, then the other answers already supplied apply.
You'll want to get an HTTP sniffer tool like Fiddler and compare the headers that are being sent from your app to the ones being sent by the browser. There will be something different that is causing the server to return a different response. When you tweak your app to send the same thing browser is sending you should get the same response. (It could be user-agent, cookies, anything, but something is surely different.)
I've seen this in the past.
When you run from a browser, the "User-Agent" in the header is "Mozilla ...".
When you run from a program, it's different and generally specific to the language used.
I think you need to use a GET request, instead of POST. If the url you're using has querystring values (like ?Key1=some_value&Key2=some_other_value) then it's expecting a GET. Instead of adding post values to your webrequest, just put this data in the querystring.
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create("http://example.com/?val1=" + val1 + "&val2=" + val2);
httpRequest.Method = "GET";
httpRequest.ContentType = "application/x-www-form-urlencoded";
....
So, the result you're getting is different when you POST the data from your app because the server-side code has a different output when it can't read the data it's expecting in the querystring.
In your code you a specify the POST method which sends the data to the PHP file without putting the data in the web address. When you put the information in the address bar, that is not the POST method, that is the GET method. The name may be confusing, but GET just means that the data is being sent to the PHP file through the web address, instead of behind the scenes, not that it is supposed to get any information. When you put the address in the browser it is using a GET.
Create a simple html form and specify POST as the method and your url as the action. You will see that the information is sent without appearing in the address bar.
Then do the same thing but specify GET. You will see the information you sent in the address bar.
I believe the problem has something to do with the way your headers are set up for the WebRequest.
I have seen strange cases where attempting to simulate a browser by changing headers in the request makes a difference to the server.
The short answer is that your console application is not a web browser and the web server of example.com is expecting to interact with a browser.
You might also consider changing the ContentType to be "multipart/form-data".
What I find odd is that you are essentially posting nothing. The work is being done by the query string. Therefore, you probably should be using a GET instead of a POST.
Is the form expecting a cookie? That is another possible reason why it works in the browser and not from the console app.

Categories

Resources