I have been reading a lot about ThreadPools, Tasks, and Threads. After a while I got pretty confused with the whole thing. Lots of people say negative and positive things about each... Maybe someone can help me find a solution for my problem. I created a simple diagram here to get my point across better.
Basically, on the left is a list of 5 strings (URLs) that need to be processed. In the center is my idea of a handler that has 2 events to track progress. Inside, that handler takes all 5 URLs and creates separate tasks for them, shown in blue. Once each one completes, I want it to return the webpage results to the handler. When they have all returned a value, I want OnComplete to be called and all this information passed back to the main thread.
Hopefully you can understand what I am trying to do. Thanks in advance for anyone who would like to help!
Update
I have taken your suggestions and put them to use, but I still have a few questions. Here is the code I have built; mind that it will not compile as is, it is just a concept to see if I'm going in the right direction. Please read the comments, I have included my questions on how to proceed in there. Thank you to all who have taken an interest in my question so far.
public List<string> ProcessList(string[] URLs)
{
    List<string> data = new List<string>();
    for (int i = 0; i < URLs.Length; i++)
    {
        //not sure how to do this now??
        //I want only 10 HttpWebRequest running at once.
        //Also I want this method to block until all the URL data has been returned.
    }
    return data;
}
private async Task<string> GetURLData(string URL)
{
    //First set up our web client
    HttpWebRequest Request = GetWebRequest(URL);
    //
    //Check if the client holds a value. (There were no errors)
    if (Request != null)
    {
        //GetURLData will return to the calling function and resume
        //here when GetResponseAsync is complete.
        WebResponse Response = await Request.GetResponseAsync();
        //
        //Set up our Stream to read the reply
        Stream ResponseStream = Response.GetResponseStream();
        //return the reply string here...
    }
}
As @fendorio and @ps2goat pointed out, async/await is perfect for your scenario. Here is another MSDN article:
http://msdn.microsoft.com/en-us/library/hh300224.aspx
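To make the "only 10 requests at once" part of the question concrete, here is a minimal sketch of one common way to throttle async work, using SemaphoreSlim together with Task.WhenAll. The class name ThrottledDownloader, the method ProcessListAsync and the fetch delegate are made up for illustration; fetch stands in for a method like GetURLData, so the sketch does not depend on any particular HTTP client.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDownloader
{
    // fetch is a hypothetical stand-in for GetURLData;
    // maxConcurrency caps how many fetches are in flight at once.
    public static async Task<string[]> ProcessListAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch,
        int maxConcurrency = 10)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = urls.Select(async url =>
            {
                // Wait for a free slot before starting the request.
                await gate.WaitAsync();
                try
                {
                    return await fetch(url);
                }
                finally
                {
                    // Always free the slot, even if fetch throws.
                    gate.Release();
                }
            }).ToList();

            // Completes when every fetch has finished;
            // results are in the same order as the input URLs.
            return await Task.WhenAll(tasks);
        }
    }
}
```

A caller that can be async would simply await ProcessListAsync(...). If the method really has to block, calling .GetAwaiter().GetResult() on the returned task is possible, but blocking on async code can deadlock in UI or classic ASP.NET contexts, so treat that as a last resort.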
It seems to me that you are trying to replicate a webserver within a webserver.
Each web request starts its own thread in a webserver. As these requests can originate from anywhere that has access to the server, nothing but the server itself has access or the ability to manage them (in a clean way).
If you would like to handle requests and keep track of them like I believe you are asking, AJAX requests would be the best way to do this. This way you can leave the server to manage the threads and requests as it does best, but you can manage their progress and monitor them via JSON return results.
Look into jQuery.ajax for some ideas on how to do this.
To achieve the above mentioned functionality in a simple way, I would prefer calling a BackgroundWorker for each of the tasks. You can keep track of the progress plus you get a notification upon task completion.
Another reason to choose this is that the mentioned tasks look like a back-end job and not tightly coupled with the UI.
Here's a MSDN link and this is the link for a cool tutorial.
I'm sure this question is going to prove my ignorance, but I'm having a hard time understanding this. I'm willing to ask a dumb question to get a good answer.
All of the posts I've read about async streams do a good job of showing off the feature, but they don't explain why it's an improvement over the alternative.
Or, perhaps, when should one use async streams over good old client-server communication?
I can see where streaming the contents of a large file might be a good use for async streams, but many of the examples I've seen use async streams to transmit small bits of sensor data (temperature, for example). It seems like an IoT device with a temperature sensor could just HTTP POST the data to a server, and the server could respond. Why would the server implement async streams in that case?
I can already feel your pain as you struggle to make sense of those words, but please have mercy on me. :)
As requested, here are some examples I've come across that confused me. I'll post more as I find them, but I wanted to go ahead and get you started:
The first half of the .NET Conf keynote was a massive async stream demo... I couldn't understand why they were using async streams here: https://www.youtube.com/watch?v=1xQE2bWkwjo&list=PLReL099Y5nRd04p81Q7p5TtyjCrj9tz1t&index=4&t=
Here's another example that confused me
I wanted to write a professional response but the crude one is probably needed too:
Forget you ever heard about async streams. What were they thinking?
Call it await foreach, or async enumerables or async iterators. It has nothing to do with IO and streams.
The term is used because it exists in other languages, not because it has anything to do with IO. In Java for example, streams are Java's implementation of C#'s IEnumerable. So, to ease adoption by future Android devs, C# adopted Java's bad idea.
We can look at the language design meetings for the actual justification for this term I guess.
Serious original answer
There's no vs. It's like contrasting automatic gear boxes and cars. Cars can have automatic gear boxes, they aren't used instead of gear boxes.
Async streams is purely a programming concept that allows the creation of async iterators. It's the feature that allows us to write this to make HTTP calls in a loop and process the results as they arrive:
await foreach (var someValue in someAsyncIterator(5))
{
    ...
}
async IAsyncEnumerable<string> someAsyncIterator(int max)
{
    for (int i = 0; i < max; i++)
    {
        var response = await httpClient.GetStringAsync($"{baseUrl}/{i}");
        yield return response;
    }
}
When they appear as action results, it's only to allow the ASP.NET Core middleware to start processing results as they are produced; they don't affect the contents of the HTTP response itself.
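For a version that runs without any HTTP server, here is a small self-contained sketch of the same idea. The names AsyncIteratorDemo, SlowSquares and CollectAsync are invented for the example, and Task.Delay is only a placeholder for real asynchronous work such as an HTTP call. It needs C# 8 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncIteratorDemo
{
    // An async iterator: each value is produced after an awaited operation.
    public static async IAsyncEnumerable<int> SlowSquares(int max)
    {
        for (int i = 0; i < max; i++)
        {
            await Task.Delay(10);   // placeholder for real async work
            yield return i * i;
        }
    }

    // Consumes the iterator; the loop body runs as each value is yielded,
    // not after the whole sequence has been produced.
    public static async Task<List<int>> CollectAsync(int max)
    {
        var seen = new List<int>();
        await foreach (var value in SlowSquares(max))
        {
            seen.Add(value);
        }
        return seen;
    }
}
```

Nothing here involves IO streams or the network; the "stream" is just a sequence whose next element may have to be awaited.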
gRPC's streams, on the other hand, allow the server to send individual responses to the client asynchronously. Laurent Kempe in gRPC and C# 8 Async stream and Steve Gordon in Server Streaming with GRPC and .NET Core show how these can be used together.
Copying from Steve Gordon's samples, let's say we have a weather service that sends forecasts to the client, whose proto file contains:
service WeatherForecasts {
    rpc GetWeather (google.protobuf.Empty) returns (WeatherReply);
    rpc GetWeatherStream (google.protobuf.Empty) returns (stream WeatherData);
    rpc GetTownWeatherStream (stream TownWeatherRequest) returns (stream TownWeatherForecast);
}
Before C# 8, the client would have to block until it received all responses before processing them:
using var channel = GrpcChannel.ForAddress("https://localhost:5005");
var client = new WeatherForecastsClient(channel);
var reply = await client.GetWeatherAsync(new Empty());
foreach (var forecast in reply.WeatherData)
{
    //Do something with the data
}
In C# 8 though, the responses can be received and processed as they arrive :
using var replies = client.GetWeatherStream(new Empty(), cancellationToken: cts.Token);
await foreach (var weatherData in replies.ResponseStream.ReadAllAsync(cancellationToken: cts.Token))
{
    //Do something with the data
}
I have an interaction with another server which makes POST calls to my web app. The problem I have is that the server making the calls tends to lock records which my app would go back to update.
So I need to accept the post, pass it off to another thread/process in the background and get the connection closed as soon as possible.
I've tried things like:
public IHttpActionResult Post(myTestModel passIn)
{
    if (ModelState.IsValid) {
        logger.debug("conn open");
        var tasks = new []
        {
            _mymethod.PassOutOperation(passIn)
        };
        logger.debug("conn closed");
        return Ok("OK");
    }
    return BadRequest("Error in model");
}
I can tell by the amount of time the inbound requests take that the connections aren't being closed as quickly as they could be. In testing, there are just 3 consecutive posts to my web app.
Looking at my logs I would have expected my entries for connection open and closed to be at the top of the log. However the closed connections are at the bottom, after the operations that I was trying to pass out have completed.
Has anyone got any tips?
Thanks in advance!
For anyone interested, I solved the problem.
I'm now using:
var tasks = new Thread(() =>
{
    _mymethod.PassOutOperation(passIn);
});
tasks.Start();
The reason the code was stopping was that I was originally reading HttpContext.Current.Request.UserHostName inside my other method, and it was out of scope by the time the new thread ran. I have since changed this: I now declare a variable outside the code block that creates the new thread and pass it in as an argument to the method, e.g.
_myMethod.PassOutOperation(passIn, userHostName);
Hope that helps someone in the future!
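The fix described above boils down to reading request-scoped values on the request thread and handing only plain values to the background work. A minimal sketch of that pattern follows, using Task.Run instead of a raw Thread. BackgroundHandOff, readHostName and work are hypothetical names; readHostName stands in for reading HttpContext.Current.Request.UserHostName, and work stands in for PassOutOperation.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundHandOff
{
    // readHostName: hypothetical stand-in for reading a request-scoped value.
    // work: hypothetical stand-in for the long-running operation.
    public static Task Start(Func<string> readHostName, Action<string> work)
    {
        // Read the request-scoped value now, on the calling (request) thread...
        string hostName = readHostName();

        // ...so the background work closes over a plain string only and
        // never touches the request context after the response is sent.
        return Task.Run(() => work(hostName));
    }
}
```

In the controller the returned task would deliberately not be awaited, so the response goes out immediately. Keep in mind that work started this way is not tracked by the web host and may be lost if the application recycles.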
So I am trying to query an API that's accessible via HTTP ( no authorization ). To speed things up, I tried to use a Parallel.ForEach loop but it seems like the longer it runs, the more errors pop up.
It fails to retrieve more and more requests. I know the API provider isn't limiting me because I can request the very same blocked URLs in my Internet browser. Also, these are different failed URLs each time, so it doesn't seem to be the case of malformed requests.
The error doesn't seem to occur while I use single threaded foreach loop.
My malfunctioning loop is below:
Parallel.ForEach(this.urlArray, singleUrl => {
    this.apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
    this.responsesDictionary.Add(singleUrl, apiResponseBlob);
});
Normal foreach loop works fine but is very slow:
foreach (string singleUrl in this.urlArray) {
this.apiResponseBlob = new System.Net.WebClient ().DownloadString(singleUrl);
this.responsesDictionary.Add(singleUrl, apiResponseBlob);
}
Also: I've had a solution in PHP where I spawned several "fetchers" simultaneously and it never hung up. It seems strange to me that PHP would handle multithreaded retrieval better than C#, so I must obviously be missing something.
How do I query the API the fastest way, without these strange failures?
Hi, did you try to speed up your code with async downloads, like in this question (see the marked answer):
DownloadStringAsync wait for request completion
You could loop through your URIs and get a callback for each successful download.
EDIT: I have seen that you use
this.apiResponseBlob = DL
When you use multithreading, every thread tries to write to that variable. This could be a reason for your bug. Try using a separate local instance of that type in each thread, or use
lock{}
so that only one thread can write this variable at a time.
http://msdn.microsoft.com/de-de/library/c5kehkcz.aspx
like
Parallel.ForEach(this.urlArray, singleUrl => {
    var apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
    // Lock on one shared object; a per-URL string would give every thread its own lock.
    lock (this.responsesDictionary) {
        this.responsesDictionary.Add(singleUrl, apiResponseBlob);
    }
});
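As an alternative to taking a lock, a thread-safe collection avoids the shared-state problem altogether. The sketch below is one possible shape, not the asker's code: ParallelFetcher, FetchAll and the download delegate are hypothetical names, and download stands in for new WebClient().DownloadString so the example runs without network access.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelFetcher
{
    // download: hypothetical stand-in for WebClient.DownloadString.
    // maxParallel: assumed cap on simultaneous downloads.
    public static ConcurrentDictionary<string, string> FetchAll(
        IEnumerable<string> urls,
        Func<string, string> download,
        int maxParallel = 4)
    {
        var responses = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(urls, options, url =>
        {
            // Local variable: nothing is shared between iterations.
            string body = download(url);

            // ConcurrentDictionary's indexer is safe to call from many threads.
            responses[url] = body;
        });

        return responses;
    }
}
```

Capping MaxDegreeOfParallelism is optional, but it keeps the number of simultaneous requests predictable.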
Using Visual studio 2012, C#.net 4.5 , SQL Server 2008, Feefo, Nopcommerce
Hey guys, I have recently implemented a new review service into a current site we have.
When the change went live, all worked fine on the first day.
Since then, though, the sending of sales to Feefo hasn't been working. There are no logs of anything going wrong either.
In OrderProcessingService.cs in nopCommerce's services, I call an HttpWebRequest when an order has been confirmed as completed. Here is the code.
var email = HttpUtility.UrlEncode(order.Customer.Email.ToString());
var name = HttpUtility.UrlEncode(order.Customer.GetFullName().ToString());
var description = HttpUtility.UrlEncode(productVariant.ProductVariant.Product.MetaDescription != null ? productVariant.ProductVariant.Product.MetaDescription.ToString() : "product");
var orderRef = HttpUtility.UrlEncode(order.Id.ToString());
var productLink = HttpUtility.UrlEncode(string.Format("myurl/p/{0}/{1}", productVariant.ProductVariant.ProductId, productVariant.ProductVariant.Name.Replace(" ", "-")));
string itemRef = "";
try
{
    itemRef = HttpUtility.UrlEncode(productVariant.ProductVariant.ProductId.ToString());
}
catch
{
    itemRef = "0";
}
var url = string.Format("feefo Url",
login, password,email,name,description,orderRef,productLink,itemRef);
var request = (HttpWebRequest)WebRequest.Create(url);
request.KeepAlive = false;
request.Timeout = 5000;
request.Proxy = null;
using (var response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusDescription == "OK")
    {
        var stream = response.GetResponseStream();
        if (stream != null)
        {
            using (var reader = new StreamReader(stream))
            {
                var content = reader.ReadToEnd();
            }
        }
    }
}
So as you can see, it's a simple web request that is processed on an order, and all product variants are sent to Feefo.
Now:
this hasn't been happening all week since the 15th (day of the implementation)
the site has been grinding to a halt recently.
The stream and reader around var content are there for debugging.
I'm wondering, does the code red-flag anything to you that could relate to the website's process?
Also note I have run some SQL statements to see if there are any deadlocks or large escalations; so far it seems fine. Logs have also been fine, just the usual logging of bots.
Any help would be much appreciated!
EDIT: also note that this code is in a method that is called wrapped in a try/catch.
UPDATE: well, forget about the "not sending"; that's because I was just told my code was rolled back last week.
A call to another web site while processing the order can degrade performance, because you are calling a site that you do not control and you don't know how long it is going to take. Furthermore, the GetResponse method can throw an exception; if you don't log anything in your outer try/catch block, you won't be able to know what's happening.
The best way to perform such a task is to implement something like the "Send Emails" scheduled task, and send data when you can afford to wait for the remote service. It is easy if you try. It is more resilient and easier to maintain if you upgrade the nopCommerce code base.
This is how I do similar things:
1. Avoid modifying the OrderProcessingService: create a custom service or plugin that consumes the OrderPlacedEvent or the OrderPaidEvent (just implement the IConsumer<OrderPaidEvent> or IConsumer<OrderPlacedEvent> interface).
2. Do not call a third-party service directly while processing the request if you don't need the response at that moment. It will only delay your process. In the service created in step 1, store the data and send it to Feefo later. You can store the data in the database, or use a static collection if you don't mind losing pending data when the site restarts (that could be OK for statistical data, for instance).
3. The best way to implement point #2 is to add a new scheduled task implementing ITask (remember to add a record to the ScheduleTask table). Just recover the stored data and do your processing.
4. Add some logging. It is easy: just get an ILogger instance and call Insert.
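The "static collection" variant of the steps above could look roughly like the sketch below. It deliberately leaves out the nopCommerce pieces (the event consumer and the scheduled task) and only shows the queue in between. PendingFeefoQueue, Enqueue, Drain, send and logError are all hypothetical names; send stands in for the actual web request and logError for whatever logger is used.

```csharp
using System;
using System.Collections.Concurrent;

public static class PendingFeefoQueue
{
    // In-memory only: pending items are lost when the site restarts.
    private static readonly ConcurrentQueue<string> Pending =
        new ConcurrentQueue<string>();

    // Called while handling the order event: cheap, no network involved.
    public static void Enqueue(string url)
    {
        Pending.Enqueue(url);
    }

    // Called from the scheduled task. Returns how many items were sent.
    public static int Drain(Action<string> send, Action<string, Exception> logError)
    {
        int sent = 0;

        // Only handle what was queued when the run started, so items
        // re-queued after a failure wait for the next run.
        int toProcess = Pending.Count;
        for (int i = 0; i < toProcess; i++)
        {
            string url;
            if (!Pending.TryDequeue(out url)) break;

            try
            {
                send(url);
                sent++;
            }
            catch (Exception ex)
            {
                // Log the failure and keep the item for a later attempt.
                logError(url, ex);
                Pending.Enqueue(url);
            }
        }
        return sent;
    }
}
```

The order event handler would call Enqueue with the prepared URL, and the scheduled task would call Drain with a delegate that performs the actual request, so a slow or failing remote service never delays order processing.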
As far as I can see, you are making a blocking synchronous call to other websites, which will definitely slow down your site in between the request-response process. What Marco has suggested is valid, try to do it in an ITask. Or you can use an asynchronous web request to potentially remove the block, if you need things done immediately instead of scheduled. :)
Having set up a ReferenceDataRequest I send it along to an EventQueue
Service refdata = _session.GetService("//blp/refdata");
Request request = refdata.CreateRequest("ReferenceDataRequest");
// append the appropriate symbol and field data to the request
EventQueue eventQueue = new EventQueue();
Guid guid = Guid.NewGuid();
CorrelationID id = new CorrelationID(guid);
_session.SendRequest(request, eventQueue, id);
long _eventWaitTimeout = 60000;
myEvent = eventQueue.NextEvent(_eventWaitTimeout);
Normally I can grab the message from the queue, but I'm hitting the situation now that if I'm making a number of requests in the same run of the app (normally around the tenth), I see a TIMEOUT EventType
if (myEvent.Type == Event.EventType.TIMEOUT)
    throw new Exception("Timed Out - need to rethink this strategy");
else
    msg = myEvent.GetMessages().First();
These are being made on the same thread, but I'm assuming that there's something somewhere along the line that I'm consuming and not releasing.
Anyone have any clues or advice?
There aren't many references on SO to BLP's API, but hopefully we can start to rectify that situation.
I just wanted to share something, thanks to the code you included in your initial post.
If you make a request for historical intraday data for a long duration (which results in many events generated by Bloomberg API), do not use the pattern specified in the API documentation, as it may end up making your application very slow to retrieve all events.
Basically, do not call NextEvent() on a Session object! Use a dedicated EventQueue instead.
Instead of doing this:
var cID = new CorrelationID(1);
session.SendRequest(request, cID);
do {
Event eventObj = session.NextEvent();
...
}
Do this:
var cID = new CorrelationID(1);
var eventQueue = new EventQueue();
session.SendRequest(request, eventQueue, cID);
do {
Event eventObj = eventQueue.NextEvent();
...
}
This can result in some performance improvement, though the API is known to not be particularly deterministic...
I didn't really ever get around to solving this question, but we did find a workaround.
Based on a small, apparently throwaway, comment in the Server API documentation, we opted to create a second session. One session is responsible for static requests, the other for real-time. e.g.
_marketDataSession.OpenService("//blp/mktdata");
_staticSession.OpenService("//blp/refdata");
This means one session operates in subscription mode, the other more synchronously. I think it was this duality which was at the root of our problems.
Since making that change, we've not had any problems.
My reading of the docs agrees that you need separate sessions for the "//blp/mktdata" and "//blp/refdata" services.
A client appeared to have a similar problem. I solved it by making hundreds of sessions rather than passing in hundreds of requests in one session. Bloomberg may not be too happy with this BFI (brute force and ignorance) approach, as we are sending the field requests for each session, but it works.
Nice to see another person on stackoverflow enjoying the pain of bloomberg API :-)
I'm ashamed to say I use the following pattern (I suspect copied from the example code). It seems to work reasonably robustly, but probably ignores some important messages. But I don't get your time-out problem. It's Java, but all the languages work basically the same.
cid = session.sendRequest(request, null);
while (true) {
    Event event = session.nextEvent();
    MessageIterator msgIter = event.messageIterator();
    while (msgIter.hasNext()) {
        Message msg = msgIter.next();
        // equals() rather than ==, so the match does not depend on object identity
        if (msg.correlationID().equals(cid)) {
            processMessage(msg, fieldStrings, result);
        }
    }
    if (event.eventType() == Event.EventType.RESPONSE) {
        break;
    }
}
This may work because it consumes all messages off each event.
It sounds like you are making too many requests at once. BB will only process a certain number of requests per connection at any given time. Note that opening more and more connections will not help, because there are limits per subscription as well. If you make a large number of time-consuming requests simultaneously, some may time out. Also, you should either process each request completely (until you receive the RESPONSE message) or cancel it. A partial request that is outstanding is wasting a slot.
Since splitting into two sessions seems to have helped you, it sounds like you are also making a lot of subscription requests at the same time. Are you using subscriptions as a way to take snapshots? That is, subscribe to an instrument, get the initial values, and unsubscribe. If so, you should try to find a different design. This is not the way subscriptions are intended to be used. An outstanding subscription request also uses a request slot. That is why it is best to batch as many subscriptions as possible in a single subscription list instead of making many individual requests. Hope this helps with your use of the API.
By the way, I can't tell from your sample code, but while you are blocked on messages from the separate event queue, are you also reading from the main event queue? You must process all the messages out of the queue, especially if you have outstanding subscriptions. Responses can queue up really fast. If you are not processing messages, the session may hit some queue limits, which may be why you are getting timeouts. Also, if you don't read messages, you may be marked as a slow consumer and not receive more data until you start consuming the pending messages. The API is async. Event queues are just a way to block on specific requests without having to process all messages from the main queue, in a context where blocking is OK and it would otherwise be difficult to interrupt the logic flow to process parts asynchronously.