I have already read many articles on this topic but I can't find a solution.
Please don't mark this question as a duplicate: the other solutions don't work and are out of date.
I have a web application with a page containing a GridView (one button per row).
The button creates an HttpWebRequest (or a WebClient, it makes no difference) to request another page and get its HTML.
I tried passing just the session cookie, and also all the cookies, but with no success.
This is the code:
String path = Request.Url.GetLeftPart(UriPartial.Authority) + VirtualPathUtility.ToAbsolute("~/");
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(path + "MyPage.aspx");
CookieContainer cookieContainer = new CookieContainer();
HttpCookie httpCookie = HttpContext.Current.Request.Cookies.Get("ASP.NET_SessionId");
if (httpCookie != null)
{
Cookie myCookie = new Cookie();
// Convert the System.Web.HttpCookie into a System.Net.Cookie...
myCookie.Domain = webRequest.RequestUri.Host;
myCookie.Expires = httpCookie.Expires;
myCookie.Name = httpCookie.Name;
myCookie.Path = httpCookie.Path;
myCookie.Secure = httpCookie.Secure;
myCookie.Value = httpCookie.Value;
cookieContainer.Add(myCookie);
}
webRequest.CookieContainer = cookieContainer;
string responseHTML = string.Empty;
using (HttpWebResponse response = (HttpWebResponse)webRequest.GetResponse())
{
using (Stream responseStream = response.GetResponseStream())
{
using (StreamReader responseReader = new StreamReader(responseStream))
{
responseHTML = responseReader.ReadToEnd();
}
}
}
The call to webRequest.GetResponse() times out.
I think the problem is the domain (localhost). I know it's not possible, but I don't have a domain and I don't want to create a fake one in web.config. Moreover, I have tried using a fake domain, without success.
Without the following line
webRequest.CookieContainer = cookieContainer;
it works nicely without sharing session.
Note that the cookie's Domain must be set; otherwise I get the corresponding error.
Session access must be serialized. When you use ASP.NET session, it is necessary to "serialize" HTTP requests to avoid threading issues. If two or more requests were processed in parallel, that would mean two threads could change or read session variables at the same time, which could cause a variety of issues.
The good news: ASP.NET will serialize the requests for you, automatically. If you send a second request with the same ASP.NET_SessionId, it will wait until the first one has completed.
The bad news: That means that a mechanism like the one you are attempting will not work. Your web request runs in the context of one HTTP request that is already in progress; it will block any additional HTTP requests until it is completed, including the request that you are sending via WebRequest.
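The blocking behaviour described above can be illustrated with a toy model. This is only a sketch: SessionLockDemo, HandleRequest and NestedRequestSucceeds are hypothetical names, and a SemaphoreSlim per session ID stands in for ASP.NET's real session lock, which is implemented differently. It shows why a request that waits on a nested request with the same session ID can only end in a timeout.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class SessionLockDemo
{
    // One exclusive gate per session ID (a toy stand-in for the ASP.NET session lock).
    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Locks =
        new ConcurrentDictionary<string, SemaphoreSlim>();

    // Simulates handling one request. Returns false if the session gate
    // could not be acquired before the timeout expired.
    public static bool HandleRequest(string sessionId, TimeSpan timeout, Func<bool> body)
    {
        SemaphoreSlim gate = Locks.GetOrAdd(sessionId, _ => new SemaphoreSlim(1, 1));
        if (!gate.Wait(timeout))
        {
            return false; // the simulated request timed out waiting for the session
        }
        try
        {
            return body();
        }
        finally
        {
            gate.Release();
        }
    }

    // The outer request holds its session gate while it waits for a nested
    // request, just like a page that calls WebRequest.GetResponse().
    public static bool NestedRequestSucceeds(string outerSessionId, string innerSessionId)
    {
        TimeSpan timeout = TimeSpan.FromMilliseconds(200);
        bool innerSucceeded = false;
        HandleRequest(outerSessionId, timeout, () =>
        {
            innerSucceeded = Task.Run(
                () => HandleRequest(innerSessionId, timeout, () => true)).Result;
            return true;
        });
        return innerSucceeded;
    }
}
```

In this model, NestedRequestSucceeds("abc", "abc") fails because the inner request waits for a gate the outer request still holds, while using two different session IDs succeeds. That matches the observation in the question that the request works when the session cookie is not forwarded.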
More good news: If your page reads session data and does not write it, it can specify a hint that will allow two threads to run concurrently. Try adding this to both pages (the page your code is behind and the page that your code is attempting to access):
<%@ Page EnableSessionState="ReadOnly" %>
If ASP.NET recognizes that the session needs are read-only, it'll allow two read-only threads to run at the same time with the same session ID.
If you need read/write access in either page, you are out of luck.
An alternative would be to use HttpServerUtility.Transfer instead. The role of the first page would change. Instead of serving as a proxy to the second page, it hands off control to the second page. By putting the pages in series, you avoid any issues with parallelism.
Example:
Server.Transfer("MyPage.aspx");
I am trying to read a sharepoint site using HttpWebRequest, but the below code throws an exception (403 Forbidden):
HttpWebRequest r = (HttpWebRequest)WebRequest.Create(@"https://myCompany.sharepoint.com/sites/it/abc/ScriptAttest/docs/");
r.Method = "GET";
WebResponse rs = r.GetResponse();
I get the same response if I add
client.Credentials = new NetworkCredential("username", "secret");
(using my domain credentials of course)
or specify default credentials.
However, if I create a browser control (called documentBrowser) and execute the following:
documentBrowser.Navigate(@"https://myCompany.sharepoint.com/sites/it/abc/ScriptAttest/docs/");
I get the data. However, it takes a long time, and I don't really need to display the page. My objective is to parse the html and only pull out certain elements. Additionally, the data comes in stages and the control triggers the DocumentCompleted event after each segment, so I don't really know when the entire page has loaded.
SharePoint Online does not support NetworkCredential. documentBrowser.Navigate actually uses the embedded IE browser, which may have cached SharePoint Online (SPO) authentication data, which is why it can navigate to the site. To fetch data from SPO, you could use the REST API or CSOM. If you just want to access the site page, you could authenticate with cookies instead:
var login = "admin@***.onmicrosoft.com";
var password = "P@ssw0rd";
var siteUrl = "https://***.sharepoint.com/";
var creds = new SharePointOnlineCredentials(login, password);
var auth = creds.AuthenticateAsync(new Uri(siteUrl), true);
var request = (HttpWebRequest)WebRequest.Create(siteUrl);
request.CookieContainer = auth.Result.CookieContainer;
var result = (HttpWebResponse)request.GetResponse();
I'm using C# to download the HTML of a webpage, but when I check the actual code of the web page and my downloaded code, they are completely different. Here is the code:
public static string getSourceCode(string url) {
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
    req.Method = "GET";
    // Dispose the response and the reader even if reading throws.
    using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
    using (StreamReader sRead = new StreamReader(resp.GetResponseStream(), Encoding.UTF8)) {
        // return the data
        return sRead.ReadToEnd();
    }
}
private void button1_Click(object sender, EventArgs e) {
string url = "http://www.booking.com/hotel/tr/nena.en-gb.html?label=gog235jc-hotel-en-tr-mina-nobrand-tr-com-T002-1;sid=fcc1c6c78f188a42870dcbe1cabf2fb4;dcid=1;origin=disamb;srhash=3938286438;srpos=5";
string sourceCode = Finder.getSourceCode(url);
StreamWriter sw = new StreamWriter("HotelPrice.txt"); // Here the code is completely different from the web page's code.
sw.Write(sourceCode);
sw.Close();
#region //Get Score Value
int StartIndex = sourceCode.IndexOf("<strong id=\"rsc_total\">") + 23;
sourceCode = sourceCode.Substring(StartIndex, 3);
#endregion
}
Most likely the difference arises because the browser requests the page as part of a session, and no such session is established when you request the same page with WebRequest.
Looking at the URL, the query parameter sid appears to be a session identifier or a nonce of some sort. The page probably verifies it against the actual session ID, and when they differ it returns some sort of "Oops, wrong session" response.
In order to mimic the browser's request you will have to make sure you generate the proper request which may need to include one or more of the following:
cookies (previously sent to you by the webserver)
a valid/proper user agent
some specific query parameters (again depending on what the page expects)
potentially a referrer URL
authentication credentials
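As a rough sketch of the list above, a helper along these lines sets the most common browser-like properties on an HttpWebRequest. BrowserLikeRequest and its Create method are hypothetical names, and the user agent and Accept values are placeholder assumptions; the values a given site actually expects have to be read from a captured browser session. Credentials and site-specific query parameters are left out.

```csharp
using System;
using System.Net;

public static class BrowserLikeRequest
{
    // Builds a GET request that carries cookies, a user agent and an
    // optional referrer. Nothing is sent until GetResponse() is called.
    public static HttpWebRequest Create(string url, CookieContainer cookieJar, string referer)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        // Cookies previously sent by the server are replayed from this jar,
        // and new ones set by the response are stored back into it.
        request.CookieContainer = cookieJar;

        // Placeholder values (assumptions): copy the real ones from your browser.
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";
        request.Accept = "text/html,application/xhtml+xml";

        if (!string.IsNullOrEmpty(referer))
        {
            request.Referer = referer;
        }

        // Browsers accept compressed responses; let the framework decompress them.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }
}
```

Reusing the same CookieContainer for every request in the conversation is what lets the server treat them as one session.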
The best way to determine what you need is to follow the conversation between your browser and the web server serving that page from start to finish, and see exactly which pages are requested, in what order, and what information is passed back and forth. You can do this with Wireshark or Fiddler, both free tools.
I ran into the same problem when trying to use HttpWebRequest to crawl a page, and the page used ajax to load all the data I was after. In order to get the ajax calls to occur I switched to the WebBrowser control.
This answer provides an example of how to use the control outside of a WinForms app. You'll want to hook up to the browser's DocumentCompleted event before parsing the page. Be warned: this event may fire multiple times before the page is ready to be parsed. You may want to add something like this
if(browser.ReadyState == WebBrowserReadyState.Complete)
to your event handler, to know when the page is completely done loading.
I would like to grab some content from a website that is made with Drupal.
The challenge here is that I need to log in to this site before I can access the page I want to scrape. Is there a way to automate this login process in my C# code, so I can grab the secure content?
To access the secured content, you'll need to store and send cookies with every request to your server, starting with the request that sends your log in info and then saving the session cookie that the server gives you (which is your proof that you are who you say you are).
You can use System.Windows.Forms.WebBrowser for an out-of-the-box solution that handles cookies for you, at the cost of less control.
My preferred method is to use System.Net.HttpWebRequest to send and receive all web data and then use the HtmlAgilityPack to parse the returned data into a Document Object Model (DOM) which can be easily read from.
The trick to getting System.Net.HttpWebRequest to work is that you must create a long-lived System.Net.CookieContainer that will keep track of your log in info (and other things the server expects you to keep track of). The good news is that the HttpWebRequest will take care of all of this for you if you provide the container.
You need a new HttpWebRequest for each call you make, so you must set each request's .CookieContainer to the same object every time. Here is an example:
UNTESTED
using System.Net;
public void TestConnect()
{
CookieContainer cookieJar = new CookieContainer();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.mysite.com/login.htm");
request.CookieContainer = cookieJar;
HttpWebResponse response = (HttpWebResponse) request.GetResponse();
// do page parsing and request setting here
request = (HttpWebRequest)WebRequest.Create("http://www.mysite.com/submit_login.htm");
// add specific page parameters here
request.CookieContainer = cookieJar;
response = (HttpWebResponse) request.GetResponse();
request = (HttpWebRequest)WebRequest.Create("http://www.mysite.com/secured_page.htm");
request.CookieContainer = cookieJar;
// this will now work since you have saved your authentication cookies in 'cookieJar'
response = (HttpWebResponse) request.GetResponse();
}
http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.aspx
HttpWebRequest Class
http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.cookiecontainer.aspx
You'll have to use the Services module to do that. Also check out this link for a bit of explanation.
I generate my connection string in a classic ASP page using some existing functionality. Rebuilding that functionality from scratch in .NET would be redundant. To avoid this, I want to pass the connection string variable from the .asp page to the .NET page, i.e. the aspx.cs. Is this possible? The options I found on Google are Server.Execute and sending a web request from .NET to the .asp page to get those values. If this is actually possible, I would like to know the latency associated with these methods.
There is a file getconnstring.asp (a classic ASP file).
In this file I construct the connection string like this:
strACHConnection="Provider=MSDAORA.1;Password=..."
I want to use this variable's value in an ASP.NET website, in getconnstring.aspx.cs. Is it possible to do this using an Ajax request?
You can get the connection string or any other information from your .asp application by making a WebRequest from your ASP.NET application to your .asp app.
However, there will be latency issues depending on where the two reside with respect to each other. So I would get the info once and then save it to a file or something and then read it from there the next time.
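A minimal sketch of that "fetch once, then read from a file" idea might look like the following. ConnectionStringCache and GetOrFetch are hypothetical names, and the fetch delegate stands for whatever code performs the WebRequest to the .asp page. Keep in mind that the cached file would contain the password in plain text, so it should live outside the web root with restricted permissions.

```csharp
using System;
using System.IO;

public static class ConnectionStringCache
{
    // Returns the cached value if the file exists; otherwise calls fetch()
    // once, stores the result in the file, and returns it.
    public static string GetOrFetch(string cachePath, Func<string> fetch)
    {
        if (File.Exists(cachePath))
        {
            return File.ReadAllText(cachePath);
        }

        string value = fetch(); // the slow call, e.g. the request to the .asp page
        File.WriteAllText(cachePath, value);
        return value;
    }
}
```

Only the first call pays the latency of the web request; later calls read the local file. Deleting the file forces a refresh.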
I'm posting another answer so I can post some code that doesn't get garbled. Below is a Task based version.
var webRequest = WebRequest.Create("http://www.microsoft.com");
webRequest.GetResponseAsync().ContinueWith(t =>
{
if (t.Exception == null)
{
using (var sr = new StreamReader(t.Result.GetResponseStream()))
{
string str = sr.ReadToEnd();
}
}
else
System.Diagnostics.Debug.WriteLine(t.Exception.InnerException.Message);
});
And here is a sync version that's untested but should get you going.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.microsoft.com");
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string str = reader.ReadToEnd();
I'm trying to call a web service from a C# application, passing a session ID.
In order to do this I need to set the Domain property of a cookie.
In Fiddler it looks like - "ASP.NET_SessionId=izdtd4tbzczsa3nlt5ujrbf5" (no domain is specified in the cookie).
The web service is at - "http://[some ip goes here]:8989/MyAPI.asmx".
I've tried:
http://[ip] ,
http://[ip]:8989 ,
http://[ip]:8989/MyAPI.asmx
All of these cause a runtime error.
I've also tried the ip alone (i.e. 100.10.10.10) , which doesn't cause a runtime error, and sets the cookie, but the cookie is never sent when I invoke a web method.
Here's my code for setting the domain:
if (!string.IsNullOrEmpty(currentSessionID))
{
req.CookieContainer=new CookieContainer();
Cookie cookie = new Cookie("ASP.NET_SessionId", currentSessionID);
cookie.Domain = GetCookieUrl(); //<- What should this be?
req.CookieContainer.Add(cookie);
}
So what should the domain be?
Thanks.
I believe it should simply be [ip]. Drop the http:// part of what you've tried.
According to this page on MSDN, your code should be
cookie.Domain = "100.10.10.10";
Next, exactly what error are you getting? Also, are you confusing a Compile error with a Runtime error? I find it hard to believe you are getting a compilation error as Domain is a String property which means you can put pretty much anything into it.
Finally, why are you sending a cookie to a web service? The normal way is to pass everything in the form post or on the query string.
Update
BTW, if you absolutely must add a cookie to the header in order to pass it to a web service, the way you do this is (taken from here):
byte[] buffer = Encoding.ASCII.GetBytes("fareId=123456"); //the data you want to send to the web service
HttpWebRequest WebReq = (HttpWebRequest)WebRequest.Create(url);
WebReq.Method = "POST";
WebReq.ContentType = "application/x-www-form-urlencoded";
WebReq.ContentLength = buffer.Length;
WebReq.Headers["Cookie"] = "ASP.NET_SessionId=izdtd4tbzczsa3nlt5ujrbf5";
Stream PostData = WebReq.GetRequestStream();
Note that this sets the header inline with the request without instantiating a "cookie" object. The Domain property of a cookie is to help ensure the cookie is only sent to the domain listed. However, if you are initiating the request and trying to append a cookie to it, then the best way is to just add it as a string to the request headers.
The reason the cookie was not sent is that the request's content length should be set after adding the cookie, and not before.
The domain is the ip alone.
// Simple function to get cookie domain
private string GetCookieDomain(string uri)
{
Uri req_uri = new Uri(uri);
return req_uri.GetComponents(UriComponents.Host, UriFormat.Unescaped);
}
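If you would rather not choose the Domain string yourself, one option is the CookieContainer.Add(Uri, Cookie) overload, which fills in the cookie's domain and path from the URI. The sketch below assumes the service address from the question; SessionCookieJar and its Create method are hypothetical names. GetCookieHeader is a convenient way to check, without sending anything, which cookies the container would attach to a request for a given URI.

```csharp
using System;
using System.Net;

public static class SessionCookieJar
{
    // Creates a cookie container holding the session cookie for the service.
    // The domain and path are derived from serviceUri instead of being set by hand.
    public static CookieContainer Create(Uri serviceUri, string sessionId)
    {
        CookieContainer jar = new CookieContainer();
        jar.Add(serviceUri, new Cookie("ASP.NET_SessionId", sessionId));
        return jar;
    }
}
```

Usage: create the container with new Uri("http://100.10.10.10:8989/MyAPI.asmx") and the current session ID, assign it to req.CookieContainer, and inspect jar.GetCookieHeader(uri) while debugging to confirm that the session cookie is matched for that address.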