C# HttpWebRequest suddenly hangs the program (it did not earlier)

Earlier I made an HttpWebRequest that worked perfectly fine, and my StreamReader read the HTML of the website perfectly.
But all of a sudden, after I had tested its functionality and confirmed that it worked many times, the program hangs when it reaches the StreamReader line.
I have tried removing that line, and the code continues.
The thing is: I tried inputting a different website than the one I need (I put in www.google.com) and it worked perfectly fine. So my conclusion is that it is only the website I need that I can no longer access, which makes me think that the other end (the website) is cancelling my connection or blocking me or something. BUT the HttpWebRequest itself doesn't hang or anything, which must mean that it successfully established a request to the website?
Enough chit-chat, here's the code:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("website here");
MessageBox.Show("1"); //This is shown.
string HTMLLink = (new StreamReader(request.GetResponse().GetResponseStream())).ReadToEnd(); //This is where the program hangs....
MessageBox.Show("2"); //This is not shown! Below this isn't being executed.
if (HTMLLink.Length > 0)
{
    HTMLLink = HTMLLink.Substring(HTMLLink.IndexOf("uuu"), HTMLLink.Length - HTMLLink.IndexOf("uuu"));
    HTMLLink = HTMLLink.Substring(0, HTMLLink.IndexOf("\" TARGET="));
    request = (HttpWebRequest)WebRequest.Create(HTMLLink);
    string HTML = (new StreamReader(request.GetResponse().GetResponseStream())).ReadToEnd();
    if (HTML.Length > 0 && HTML.Contains(" </script><br><br><br>") && HTML.Contains(" <br><br><script "))
    {
        HTML = HTML.Substring(HTML.IndexOf(" </script><br><br><br>") + 22, HTML.IndexOf("<br><br><script "));
        HTML = HTML.Substring(0, HTML.IndexOf("<br><br><script "));
        HTML = HTML.Replace("\r\n", "");
        HTML = HTML.Replace("\n", "");
        HTML = HTML.Replace("<br>", "\r\n");
        HTML = HTML.Replace("<BR>", "\r\n");
        HTML = HTML.Replace("<br />", "\r\n");
        HTML = HTML.Replace("<BR />", "\r\n");
        textBox.Text = HTML;
    }
}
And please keep in mind that it worked perfectly earlier, then all of a sudden it started hanging, and that it still works fine with www.google.com.
And by the way, yes, I have done many searches. No useful results.
I have already tried setting a timeout; the request does time out.
Maybe the website has blocked my program, thinking it's a spider? What then?
Every time I reach the StreamReader (no matter how I set it up), it hangs.
And it keeps hanging; it never delivers any result.
This ONLY happens with lyrics007.com, which is the exact website I need. It works fine with Google.
Help, please!
Thanks in advance!

WebRequest.GetResponse() is a blocking call. It will wait until it can successfully connect and receive the response before it returns control to the caller, or will throw an exception if unsuccessful. This behaviour can't be modified.
You usually don't want your application to sit waiting for something to happen, though, so you delegate the GetResponse() call to another thread and continue doing other work in the current one.
The usual way to achieve this is to make the call asynchronously. Rather than calling GetResponse(), you call BeginGetResponse(), passing in a function to be executed when the operation completes (e.g. the remainder of your current method, plus a call to EndGetResponse()). Control is returned to the caller while the response is being waited for on a background thread, handled for you automatically by the .NET thread pool.
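A minimal sketch of that pattern (the URL, class and callback names here are placeholders, not taken from the question):

using System;
using System.IO;
using System.Net;

class AsyncRequestSketch
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/"); // placeholder URL

        // BeginGetResponse returns immediately; the callback runs on a thread-pool
        // thread once the response (or an error) arrives.
        request.BeginGetResponse(OnResponse, request);

        Console.WriteLine("Request sent; the calling thread keeps running...");
        Console.ReadLine(); // keep the demo process alive until the callback fires
    }

    static void OnResponse(IAsyncResult ar)
    {
        var request = (HttpWebRequest)ar.AsyncState;
        try
        {
            // EndGetResponse completes the call and rethrows any failure as an exception.
            using (var response = (HttpWebResponse)request.EndGetResponse(ar))
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                string html = reader.ReadToEnd();
                Console.WriteLine("Received {0} characters", html.Length);
            }
        }
        catch (WebException ex)
        {
            Console.WriteLine("Request failed: " + ex.Status);
        }
    }
}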

The request is not sent until the call to GetResponse. If that is where it is hanging, I would be inclined to say the site is not responding. Did you try using a web browser to connect to that URL and see if it works?
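If you want the call to fail fast while you investigate, instead of hanging, a small diagnostic sketch like this can help (the URL and timeout values are placeholders I chose, not from the question):

using System;
using System.Net;

class DiagnoseRequest
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/"); // placeholder URL
        request.Timeout = 10000;          // give up on connecting/receiving after 10 seconds
        request.ReadWriteTimeout = 10000; // also bound the time spent reading the response stream

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                Console.WriteLine("Status: {0}", response.StatusCode);
            }
        }
        catch (WebException ex)
        {
            // Covers timeouts, DNS failures, refused connections and HTTP error codes.
            Console.WriteLine("Failed: {0}", ex.Status);
            var httpResponse = ex.Response as HttpWebResponse;
            if (httpResponse != null)
                Console.WriteLine("HTTP status: {0}", httpResponse.StatusCode);
        }
    }
}

If this reports a timeout for lyrics007.com but succeeds for google.com, that points at the remote site (or something between you and it) rather than at your code.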

Related

Multiple WebClient calls from worker threads get blocked in loopback requests back to self

I have a .NET 4.5.2 ASP.NET webapp in which a chunk of code makes async webclient calls back into web pages inside the same webapp. (Yes, my webapp makes async calls back into itself.) It does this in order to screen scrape, grab the html, and hand it to a PDF generator.
I had this all working...except that it was very slow because there are about 15 labor-intensive reports that take roughly 3 seconds each, or 45 seconds in total. Since that is so slow I attempted to generate all these concurrently in parallel, and that's when things hit the fan.
What is happening is that my aspx reports (the ones hit by WebClient) never make it past the class constructor until they time out. Page_Load isn't hit until the timeout, nor are any other page events. The report generation (and the WebClient calls) are triggered when the user clicks Save in the webapp, and a bunch of stuff happens, including this async page generation activity. The webapp requires Windows authentication, which I'm handling fine.
So when the multithreaded stuff kicks off, a bunch of WebClient requests are made, and they all get stuck in the reports' class constructor for a few minutes and then time out. During/after the timeout, session data is cleared, and when that happens the reports cannot get their data.
Here is the multithreaded code:
Parallel.ForEach(folders, ( folderPath ) =>
{
    ...
    string html = getReportHTML(fullReportURL, aspNetSessionID);
    // hand html to the PDF generator here...
    ...
});

private string getReportHTML( string url, string aspNetSessionID ) {
    using( WebClient webClient = new WebClient() ) {
        webClient.UseDefaultCredentials = true;
        webClient.Headers.Add(HttpRequestHeader.Cookie, "ASP.NET_SessionId=" + aspNetSessionID);
        string fullReportURL = url;
        byte[] reportBytes = webClient.DownloadData(fullReportURL);
        if( reportBytes != null && reportBytes.Length > 0 ) {
            string html = Encoding.ASCII.GetString(reportBytes);
            return html;
        }
    }
    return string.Empty;
}
Important points:
Notice I have to include the ASP.NET session cookie, or the web call doesn't work.
webClient.UseDefaultCredentials = true is required for the winauth.
The fragile session state and architecture are not changeable in the short term - it's an old and massive webapp and I am stuck with it. The reports are complex and rely heavily on session state (and before the session state is populated, many DB lookups and calculations occur).
Even though I'm calling reports from my webapp back into the same webapp, I must use an absolute URL - a relative URL throws errors.
When I extract the code samples above into a separate .NET console app, it works well and doesn't get stuck in the constructor. Because of this, the issue must lie (at least in part) in the fact that my web app is making async calls back to itself. I don't know how to avoid doing this. I even flirted with Server.Execute(), which really blows up inside worker threads.
The reports cannot be generated in a windows service or some other process - it must be linked to the webapp's save event.
There's a lot going on here, but I think the most fundamental question/problem is that these concurrent webclient calls hit the ASPX pages and get stuck in the constructor, going no further into page events. And after about 2 minutes, all those threads flood down into the page events, where failures occur because the main webapp's session state is no longer active.
Chicken or egg: I don't know whether the threads unblock and eventually hit page events because the session state was cleared, or the other way around. Or maybe there is no connection.
Any ideas?

How to fetch API data using WebClient and multiple threads?

So I am trying to query an API that's accessible via HTTP (no authorization). To speed things up, I tried to use a Parallel.ForEach loop, but it seems like the longer it runs, the more errors pop up.
It fails to retrieve more and more requests. I know the API provider isn't limiting me, because I can request the very same blocked URLs in my browser. Also, these are different failed URLs each time, so it doesn't seem to be a case of malformed requests.
The errors don't seem to occur when I use a single-threaded foreach loop.
My malfunctioning loop is below:
Parallel.ForEach(this.urlArray, singleUrl => {
    this.apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
    this.responsesDictionary.Add(singleUrl, apiResponseBlob);
});
Normal foreach loop works fine but is very slow:
foreach (string singleUrl in this.urlArray) {
    this.apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
    this.responsesDictionary.Add(singleUrl, apiResponseBlob);
}
Also: I had a solution for this in PHP - I spawned several "fetchers" simultaneously and it never hung up. It seems strange to me that PHP would handle multithreaded retrieval better than C#, so I must obviously be missing something.
What is the fastest way to query the API, without these strange failures?
Hi, did you try to speed up your code with async downloads, as in this question (see the marked answer):
DownloadStringAsync wait for request completion
You could loop through your URIs and get a callback for each successful download.
EDIT: I have seen that you assign every result to this.apiResponseBlob. When you use multithreading, every thread tries to write to that variable, which could be the reason for your bug. Try using a local variable per request, or use a lock so that only one thread can write to the shared collection at a time.
http://msdn.microsoft.com/de-de/library/c5kehkcz.aspx
Like this:
private readonly object responsesLock = new object();

Parallel.ForEach(this.urlArray, singleUrl => {
    var apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
    lock (responsesLock) { // one shared lock object, so only one thread touches the dictionary at a time
        this.responsesDictionary.Add(singleUrl, apiResponseBlob);
    }
});
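If you are on .NET 4.5 or later, the async route the linked question describes could be sketched roughly like this (the method and variable names are mine, not from the question). Each URL gets its own WebClient and its own local result, and the dictionary is built on a single thread after all downloads have completed, so no lock is needed there:

// Requires: using System.Collections.Generic; using System.Linq;
//           using System.Net; using System.Threading.Tasks;
private async Task<Dictionary<string, string>> DownloadAllAsync(IEnumerable<string> urls)
{
    // One download task per URL; each task owns its own WebClient instance.
    var tasks = urls.Select(async url =>
    {
        using (var client = new WebClient())
        {
            string body = await client.DownloadStringTaskAsync(url);
            return new { url, body };
        }
    }).ToList();

    var results = await Task.WhenAll(tasks);

    // All tasks have finished at this point, so this runs on one thread only.
    return results.ToDictionary(r => r.url, r => r.body);
}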

HttpWebRequest possibly slowing website

Using Visual Studio 2012, C# .NET 4.5, SQL Server 2008, Feefo, nopCommerce.
Hey guys, I have recently implemented a new review service on a current site we have.
When the change went live, everything worked fine on the first day.
Since then, though, the sending of sales to Feefo hasn't been working, and there are no logs of anything going wrong.
In OrderProcessingService.cs in nopCommerce's services, I make an HttpWebRequest when an order has been confirmed as completed. Here is the code:
var email = HttpUtility.UrlEncode(order.Customer.Email.ToString());
var name = HttpUtility.UrlEncode(order.Customer.GetFullName().ToString());
var description = HttpUtility.UrlEncode(productVariant.ProductVariant.Product.MetaDescription != null ? productVariant.ProductVariant.Product.MetaDescription.ToString() : "product");
var orderRef = HttpUtility.UrlEncode(order.Id.ToString());
var productLink = HttpUtility.UrlEncode(string.Format("myurl/p/{0}/{1}", productVariant.ProductVariant.ProductId, productVariant.ProductVariant.Name.Replace(" ", "-")));

string itemRef = "";
try
{
    itemRef = HttpUtility.UrlEncode(productVariant.ProductVariant.ProductId.ToString());
}
catch
{
    itemRef = "0";
}

var url = string.Format("feefo Url",
    login, password, email, name, description, orderRef, productLink, itemRef);

var request = (HttpWebRequest)WebRequest.Create(url);
request.KeepAlive = false;
request.Timeout = 5000;
request.Proxy = null;

using (var response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusDescription == "OK")
    {
        var stream = response.GetResponseStream();
        if (stream != null)
        {
            using (var reader = new StreamReader(stream))
            {
                var content = reader.ReadToEnd();
            }
        }
    }
}
So as you can see, it's a simple web request that is processed on an order, and all product variants are sent to Feefo.
Now:
this hasn't been working all week, since the 15th (the day of the implementation)
the site has been grinding to a halt recently.
The stream and reader that populate the content variable are there for debugging.
I'm wondering: does the code red-flag anything to you that could relate to the site's slowdown?
Also note I have run some SQL statements to check for deadlocks or large lock escalations; so far that seems fine, and the logs have also been fine, just the usual logging of bots.
Any help would be much appreciated!
EDIT: also note that this code is in a method that is called and wrapped in a try/catch.
UPDATE: well, forget about the "not sending" part; that's because I was just told my code was rolled back last week.
A call to another web site while processing the order can degrade performance, as you are calling to a site that you do not control. You don't know how much time it is going to take. Furthermore, the GetResponse method can throw an exception, if you don't log anything in your outer try/catch block then you won't be able to know what's happening.
The best way to perform such a task is to implement something like the "Send Emails" scheduled task, and send data when you can afford to wait for the remote service. It is easy if you try. It is more resilient and easier to maintain if you upgrade the nopCommerce code base.
This is how I do similar things:
Avoid modifying the OrderProcessingService: Create a custom service or plugin that consumes the OrderPlacedEvent or the OrderPaidEvent (just implement the IConsumer<OrderPaidEvent> or IConsumer<OrderPlacedEvent> interface).
Do not call a third-party service directly while processing the request if you don't need the response at that moment; it will only delay your process. In the service created in step 1, store the data and send it to Feefo later. You can store the data in the database, or use a static collection if you don't mind losing pending data when the site restarts (that could be OK for statistical data, for instance).
The best way to implement point 2 is to add a new scheduled task implementing ITask (remember to add a record to the ScheduleTask table). Just retrieve the stored data and do your processing; a rough sketch follows this list.
Add some logging. It is easy, just get an ILogger instance and call Insert.
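A rough sketch of points 2 and 3 using a static collection (all the type and member names below are hypothetical, and the real nopCommerce ITask/event-consumer interfaces may differ between versions):

// Requires: using System.Collections.Concurrent; using System.Net;

// Hypothetical in-memory queue; pending entries are lost if the site restarts,
// which the answer above considers acceptable for this kind of data.
public static class FeefoQueue
{
    public static readonly ConcurrentQueue<string> PendingUrls = new ConcurrentQueue<string>();
}

// 1) In the event consumer (or wherever the order is processed), only store the URL:
//        FeefoQueue.PendingUrls.Enqueue(feefoUrl);
//    No HTTP call is made while the customer's request is being served.

// 2) A scheduled task (e.g. a nopCommerce ITask implementation) drains the queue later.
public class SendFeefoSalesTask
{
    public void Execute()
    {
        string url;
        while (FeefoQueue.PendingUrls.TryDequeue(out url))
        {
            try
            {
                var request = (HttpWebRequest)WebRequest.Create(url);
                request.Timeout = 5000;
                using (var response = (HttpWebResponse)request.GetResponse()) { }
            }
            catch (WebException)
            {
                // Log (ILogger.Insert) and optionally re-enqueue; failures here
                // never slow down or break order processing.
            }
        }
    }
}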
As far as I can see, you are making a blocking synchronous call to another website during the request-response cycle, which will definitely slow down your site. What Marco has suggested is valid: try to do it in an ITask. Or you can use an asynchronous web request to potentially remove the blocking, if you need things done immediately instead of on a schedule. :)

C# How to process several web requests at once

I have been reading a lot about thread pools, tasks, and threads. After a while I got pretty confused by the whole thing, with lots of people saying negative/positive things about each... Maybe someone can help me find a solution for my problem. I created a simple diagram here to get my point across better.
Basically, on the left is a list of 5 strings (URLs) that need to be processed. In the center is just my idea of a handler that has 2 events to track progress. Inside that handler, it takes all 5 URLs and creates separate tasks for them, shown in blue. Once each one completes, I want it to return the webpage results to the handler. When they have all returned a value, I want OnComplete to be called and all this information passed back to the main thread.
Hopefully you can understand what I am trying to do. Thanks in advance for anyone who would like to help!
Update
I have taken your suggestions and put them to use, but I still have a few questions. Here is the code I have built; mind you, it will not build as-is, it is just a concept to see if I'm going in the right direction. Please read the comments, as I have included my questions on how to proceed in there. Thank you to everyone who has taken an interest in my question so far.
public List<String> ProcessList (string[] URLs)
{
    List<string> data = new List<string>();
    for(int i = 0; i < URLs.Length - 1; i++)
    {
        //not sure how to do this now??
        //I want only 10 HttpWebRequest running at once.
        //Also I want this method to block until all the URL data has been returned.
    }
    return data;
}

private async Task<string> GetURLData(string URL)
{
    //First setup out web client
    HttpWebRequest Request = GetWebRequest(URL);
    //
    //Check if the client holds a value. (There were no errors)
    if (Request != null)
    {
        //GetCouponsAsync will return to the calling function and resumes
        //here when GetResponse is complete.
        WebResponse Response = await Request.GetResponseAsync();
        //
        //Setup our Stream to read the reply
        Stream ResponseStream = Response.GetResponseStream();
        //return the reply string here...
    }
}
As @fendorio and @ps2goat pointed out, async/await is perfect for your scenario. Here is another MSDN article:
http://msdn.microsoft.com/en-us/library/hh300224.aspx
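A hedged sketch of that approach for the ProcessList/GetURLData code above, limiting the work to 10 simultaneous requests and blocking until every URL has returned. The throttling value and helper names are illustrative, and it assumes GetURLData has been finished so that it actually returns the page text:

// Requires: using System.Collections.Generic; using System.Linq;
//           using System.Threading; using System.Threading.Tasks;
public List<string> ProcessList(string[] URLs)
{
    // Blocks the caller until everything is done; if this is called from a UI thread,
    // prefer awaiting ProcessListAsync directly instead of blocking.
    return Task.Run(() => ProcessListAsync(URLs)).GetAwaiter().GetResult();
}

private async Task<List<string>> ProcessListAsync(string[] URLs)
{
    using (var throttle = new SemaphoreSlim(10)) // at most 10 requests in flight at once
    {
        var tasks = URLs.Select(async url =>
        {
            await throttle.WaitAsync();
            try
            {
                return await GetURLData(url);
            }
            finally
            {
                throttle.Release();
            }
        });

        var results = await Task.WhenAll(tasks);
        return results.ToList();
    }
}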
It seems to me that you are trying to replicate a webserver within a webserver.
Each web request starts its own thread in a webserver. As these requests can originate from anywhere that has access to the server, nothing but the server itself has access or the ability to manage them (in a clean way).
If you would like to handle requests and keep track of them like I believe you are asking, AJAX requests would be the best way to do this. This way you can leave the server to manage the threads and requests as it does best, but you can manage their progress and monitor them via JSON return results.
Look into jQuery.ajax for some ideas on how to do this.
To achieve the above-mentioned functionality in a simple way, I would prefer using a BackgroundWorker for each of the tasks. You can keep track of progress, plus you get a notification upon task completion.
Another reason to choose this is that the tasks you describe look like a back-end job and are not tightly coupled with the UI.
Here's an MSDN link, and this is a link to a good tutorial.
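For completeness, a minimal BackgroundWorker sketch for a single URL (the URL is a placeholder). DoWork runs on a thread-pool thread, while ProgressChanged and RunWorkerCompleted are raised back on the thread that started the worker:

// Requires: using System.ComponentModel; using System.Net;
var worker = new BackgroundWorker { WorkerReportsProgress = true };

worker.DoWork += (s, e) =>
{
    var bw = (BackgroundWorker)s;
    string url = (string)e.Argument;
    bw.ReportProgress(0); // signal that the download is starting
    using (var client = new WebClient())
        e.Result = client.DownloadString(url); // runs on a background thread
    bw.ReportProgress(100); // signal that the download finished
};

worker.ProgressChanged += (s, e) =>
{
    // Update a progress bar or label here (raised on the calling thread).
};

worker.RunWorkerCompleted += (s, e) =>
{
    // e.Error holds any exception; otherwise e.Result is the downloaded HTML.
    string html = e.Error == null ? (string)e.Result : null;
};

worker.RunWorkerAsync("http://example.com/"); // placeholder URL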

Cannot redirect after HTTP headers have been sent

When I try to redirect to another page through Response.Redirect(URL), I get the following error: System.Web.HttpException: Cannot redirect after HTTP headers have been sent.
I call Response.Write("Sometext"); and Response.Flush(); before calling the redirect method.
In this case, how do I use Response.Redirect(URL)?
I'm executing a stored procedure through an async call. The SP takes almost 3 minutes to execute, and by then I get a load-balancer timeout error from the server, because this application is running in the cloud. To avoid the load-balancer timeout, I'm writing some text to the browser (Response.Write() and Response.Flush()).
You need to ensure that you do not write/flush anything before trying to send a HTTP header.
After sending headers there is no proper way to do a redirect as the only things you can do are outputting JavaScript to do the redirect (bad) or sending a 'meta refresh/location' tag which will most likely not be at the correct position (inside HEAD) and thus result in invalid html.
I had the same error with the same approach. You might want to try using JavaScript instead of calling Response.Redirect directly:
Response.Write("<script type='text/javascript'>");
Response.Write("window.location = '" + url + "';</script>");
Response.Flush();
It worked fine for me, although I still need to check it in different browsers.
if (!Response.IsRequestBeingRedirected)
Response.Redirect("~/RMSPlusErrorPage.aspx?ErrorID=" + 100, false);
You can't use Response.Redirect as you've gone past headers and written out "Sometext". You have to check (redirect condition) before you start writing out data to the client or make a META redirect.
If you want one of those pages that shows some text and redirects after 5 seconds, a META refresh is your option.
You won't get this error, if you redirect before the rendering of your page begins (for example when you redirect from the Load or PreRender events of the page).
I see now in your comments, that you would like to redirect after a long-running stored procedure completes. You might have to use a different approach in this case.
You could put, for example, an AJAX UpdatePanel with a Timer on your page, and the Timer could check every few seconds whether the stored procedure has completed, and then do the redirection.
This approach also has the advantage, that you can put some "in progress" message on the page, while the procedure is running, so your user would know, that things are still happening.
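A hedged sketch of the code-behind for that idea, assuming the page markup contains a ScriptManager, an UpdatePanel, an asp:Timer named PollTimer wired to this Tick handler, and a Label named StatusLabel (the session key, page name and messages are placeholders of mine):

// Code-behind; PollTimer and StatusLabel are controls declared in the page markup.
protected void PollTimer_Tick(object sender, EventArgs e)
{
    // Hypothetical flag that the code running the stored procedure sets when it finishes.
    object done = Session["SpCompleted"];

    if (done != null && (bool)done)
    {
        PollTimer.Enabled = false;                // stop polling
        Response.Redirect("Results.aspx", false); // placeholder target page
    }
    else
    {
        StatusLabel.Text = "Still working... " + DateTime.Now.ToLongTimeString();
    }
}

Because each timer tick is a fresh partial postback, no headers have been flushed on it yet, so Response.Redirect works normally there.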
Try the following:
try
{
    // your code that writes to the response and then redirects
}
catch (System.Threading.ThreadAbortException)
{
    // To handle the HTTP exception "Cannot redirect after HTTP headers have been sent".
}
catch (Exception e)
{
    // Here you can put your context.Response.Redirect("page.aspx");
}
In my case, the cause of the problem was that loading data into the GridView on scroll takes a long time, and I pressed the redirect button before the GridView data had finished loading; that is when I got this error.
You can either fetch less data, or prevent the redirect button from being pressed until loading has completed.
