Are these webrequests actually concurrent?

I have a UrlList of only 4 URLs which I want to use to make 4 concurrent requests. Does the code below truly make 4 requests which start at the same time?
My testing appears to show that it does, but am I correct in thinking that there will actually be 4 requests retrieving data from the URL target at the same time or does it just appear that way?
static void Main(string[] args)
{
var t = Do_TaskWhenAll();
t.Wait();
}
public static async Task Do_TaskWhenAll()
{
var downloadTasksQuery = from url in UrlList select Run(url);
var downloadTasks = downloadTasksQuery.ToArray();
Results = await Task.WhenAll(downloadTasks);
}
public static async Task<string> Run(string url)
{
var client = new WebClient();
AddHeaders(client);
var content = await client.DownloadStringTaskAsync(new Uri(url));
return content;
}

Correct, when ToArray is called, the enumerable downloadTasksQuery will yield a task for every URL, running your web requests concurrently.
await Task.WhenAll ensures your task completes only when all web requests have completed.
You can rewrite your code to be less verbose, while doing effectively the same, like so:
public static async Task Do_TaskWhenAll()
{
var downloadTasks = from url in UrlList select Run(url);
Results = await Task.WhenAll(downloadTasks);
}
There's no need for ToArray because Task.WhenAll will enumerate your enumerable for you.
I advise you to use HttpClient instead of WebClient. With HttpClient you don't have to create a new client instance for each concurrent request, because a single instance can be reused to make multiple requests concurrently.

The short answer is yes: if you generate multiple Tasks without awaiting each one individually, they can run simultaneously, as long as they are truly asynchronous.
When DownloadStringTaskAsync is awaited, a Task is returned from your Run method, allowing the next iteration to occur whilst waiting for the response.
So the next HTTP request is allowed to be sent without waiting for the first to complete.
As an aside, your method can be written more concisely:
public static async Task Do_TaskWhenAll()
{
Results = await Task.WhenAll(UrlList.Select(Run));
}
Task.WhenAll has an overload that accepts IEnumerable<Task<TResult>> which is returned from UrlList.Select(Run).
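If you want to convince yourself that the operations overlap, here is a minimal sketch that replaces the real download with Task.Delay so it can run without a network. ConcurrencyDemo, FakeDownloadAsync and the delay value are made up for illustration and are not part of the question's code:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Hypothetical stand-in for Run(url): waits asynchronously instead of calling a server.
    public static async Task<string> FakeDownloadAsync(string url, int delayMs)
    {
        await Task.Delay(delayMs);
        return "content of " + url;
    }

    // Starts every download before awaiting any of them, then reports the total elapsed time.
    public static async Task<(string[] Results, TimeSpan Elapsed)> DownloadAllAsync(
        string[] urls, int delayMs)
    {
        var stopwatch = Stopwatch.StartNew();
        string[] results = await Task.WhenAll(
            urls.Select(url => FakeDownloadAsync(url, delayMs)));
        stopwatch.Stop();
        return (results, stopwatch.Elapsed);
    }
}
```

With four URLs and a 300 ms delay, the elapsed time should be close to one delay rather than four, which is what you would expect if the operations overlap. The results come back in the same order as the input URLs.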

No, there is no guarantee that your requests will be executed in parallel, or immediately.
Starting a task merely queues it to the thread pool. If all of the pool's threads are occupied, that task will necessarily wait until a thread frees up.
In your case, since there are a relatively large number of threads available in the pool, and you are queueing only a small number of items, the pool has no problem servicing them as they come in. The more tasks you queue at once, the more likely this is to change.
If you truly need concurrency, you need to be aware of what the thread pool size is, and how busy it is. The ThreadPool class will help you to manage this.

Related

Use of async/await in REST API with CPU intensive tasks

I am having problems understanding the advantages of using async/await in a REST API.
I have this CPU intensive task:
[HttpGet("GetHeavyStuffAsync")]
public async Task<string> GetHeavyStuffAsync()
{
Guid id = Guid.NewGuid();
System.Diagnostics.Debug.WriteLine($"{id} has started");
await _expensiveOperations.DoHeavyStuffOnDifferentThread();
System.Diagnostics.Debug.WriteLine($"{id} has finished");
return "request processed";
}
Where DoHeavyStuffOnDifferentThread() does this:
public async Task DoHeavyStuffOnDifferentThread()
{
var t = Task.Run(() =>
{
var limit = 4000;
var array = Enumerable.Range(0, limit).ToArray();
stoogesort(array, 0, array.Count() - 1);
});
await t;
}
I am using Task.Run(() => ... so that the CPU-heavy stuff executes on a different thread without blocking the main one, and using async/await in the hope that the controller thread is not blocked by the heavy task and can continue attending to requests.
To test that, I wrote a program that launched 250 requests against GetHeavyStuffAsync(), and after that I made a request from Swagger to a different endpoint in the same API controller:
[HttpGet]
public IEnumerable<WeatherForecast> Get()
{
System.Diagnostics.Debug.WriteLine("GETTING FORECAST ...");
var rng = new Random();
return Enumerable.Range(1, 5).Select(index => new WeatherForecast
{
Date = DateTime.Now.AddDays(index),
TemperatureC = rng.Next(-20, 55),
Summary = Summaries[rng.Next(Summaries.Length)]
})
.ToArray();
}
As you can see, this last endpoint is the one Visual Studio creates by default as an example when you create an API project, and it is a very simple function that returns immediately.
What I expected to happen: the calls to GetHeavyStuffAsync would be processed; at await _expensiveOperations.DoHeavyStuffOnDifferentThread(); control would pass to the thread doing the heavy stuff, and the API controller would be free to process other requests. At some point DoHeavyStuffOnDifferentThread would finish, and the controller would continue with the instruction System.Diagnostics.Debug.WriteLine($"{id} has finished");, finish the method, and be free to continue processing other requests.
What actually happened was that public IEnumerable<WeatherForecast> Get() took minutes to return.
So why was there no difference in behavior from what I would have had if I hadn't used a different thread or async/await?
(Note: during the test, CPU and memory usage on my laptop remained at about 50%.)

I am using Task.Run(() => ... so that the cpu heavy stuff executes in a different thread without blocking the main one
There is no benefit to doing this in ASP.NET.
In a desktop application, the UI can only be updated by the main thread. So offloading CPU-intensive tasks to another thread makes sense because it frees up the UI thread to continue responding to user input.
However, in ASP.NET, there is no one "main thread." Every new request is assigned a new thread, up until the max ThreadPool count is hit, then any further requests have to wait.
So when you use Task.Run, you are freeing the main request's thread, but you're using another thread. So the net effect on the ThreadPool count is still the same.
In the article ASP.NET Core Performance Best Practices, Microsoft recommends:
Do not:
Call Task.Run and immediately await it. ASP.NET Core already runs app code on normal Thread Pool threads, so calling Task.Run only results in extra unnecessary Thread Pool scheduling. Even if the scheduled code would block a thread, Task.Run does not prevent that.
Asynchronous code only helps you when making I/O requests (network, file system, etc.), since there is truly nothing to do while waiting. But with CPU-intensive tasks, there is no benefit.

Is there a way to limit the number of parallel Tasks globally in an ASP.NET Web API application?

I have an ASP.NET 5 Web API application which contains a method that takes objects from a List<T> and makes HTTP requests to a server, 5 at a time, until all requests have completed. This is accomplished using a SemaphoreSlim, a List<Task>(), and awaiting on Task.WhenAll(), similar to the example snippet below:
public async Task<ResponseObj[]> DoStuff(List<Input> inputData)
{
const int maxDegreeOfParallelism = 5;
var tasks = new List<Task<ResponseObj>>();
using var throttler = new SemaphoreSlim(maxDegreeOfParallelism);
foreach (var input in inputData)
{
tasks.Add(ExecHttpRequestAsync(input, throttler));
}
ResponseObj[] responses = await Task.WhenAll(tasks).ConfigureAwait(false);
return responses;
}
private async Task<ResponseObj> ExecHttpRequestAsync(Input input, SemaphoreSlim throttler)
{
await throttler.WaitAsync().ConfigureAwait(false);
try
{
using var request = new HttpRequestMessage(HttpMethod.Post, "https://foo.bar/api");
request.Content = new StringContent(JsonConvert.SerializeObject(input), Encoding.UTF8, "application/json");
var response = await HttpClientWrapper.SendAsync(request).ConfigureAwait(false);
var responseBody = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
var responseObject = JsonConvert.DeserializeObject<ResponseObj>(responseBody);
return responseObject;
}
finally
{
throttler.Release();
}
}
This works well, however I am looking to limit the total number of Tasks that are being executed in parallel globally throughout the application, so as to allow scaling up of this application. For example, if 50 requests to my API came in at the same time, this would start at most 250 tasks running parallel. If I wanted to limit the total number of Tasks that are being executed at any given time to say 100, is it possible to accomplish this? Perhaps via a Queue<T>? Would the framework automatically prevent too many tasks from being executed? Or am I approaching this problem in the wrong way, and would I instead need to Queue the incoming requests to my application?

I'm going to assume the code is fixed, i.e., Task.Run is removed and the WaitAsync / Release are adjusted to throttle the HTTP calls instead of List<T>.Add.
I am looking to limit the total number of Tasks that are being executed in parallel globally throughout the application, so as to allow scaling up of this application.
This does not make sense to me. Limiting your tasks limits your scaling up.
For example, if 50 requests to my API came in at the same time, this would start at most 250 tasks running parallel.
Concurrently, sure, but not in parallel. It's important to note that these aren't 250 threads, and that they're not 250 CPU-bound operations waiting for free thread pool threads to run on, either. These are Promise Tasks, not Delegate Tasks, so they don't "run" on a thread at all. It's just 250 objects in memory.
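To illustrate the distinction, here is a small sketch using TaskCompletionSource, which is one common way Promise Tasks are produced. The class and method names are made up for illustration; the point is only that a pending task of this kind is an object waiting for a signal, not code occupying a thread:

```csharp
using System;
using System.Threading.Tasks;

public static class PromiseTaskDemo
{
    // Creates tasks that stay pending until someone completes them.
    // While pending they occupy memory, but no thread is running or blocked for them.
    public static TaskCompletionSource<int>[] CreatePending(int count)
    {
        var sources = new TaskCompletionSource<int>[count];
        for (int i = 0; i < count; i++)
        {
            sources[i] = new TaskCompletionSource<int>();
        }
        return sources;
    }

    // Completes every task, much like an I/O completion signal would.
    public static void CompleteAll(TaskCompletionSource<int>[] sources)
    {
        for (int i = 0; i < sources.Length; i++)
        {
            sources[i].SetResult(i);
        }
    }
}
```

Calling CreatePending(250) gives you 250 incomplete tasks without starting any work on the thread pool, which is roughly the situation with 250 outstanding HTTP calls.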
If I wanted to limit the total number of Tasks that are being executed at any given time to say 100, is it possible to accomplish this?
Since (these kinds of) tasks are just in-memory objects, there should be no need to limit them, any more than you would need to limit the number of strings or List<T>s. Apply throttling where you do need it; e.g., number of HTTP calls done simultaneously per request. Or per host.
Would the framework automatically prevent too many tasks from being executed?
The framework has nothing like this built-in.
Perhaps via a Queue? Or am I approaching this problem in the wrong way, and would I instead need to Queue the incoming requests to my application?
There's already a queue of requests. It's handled by IIS (or whatever your host is). If your server gets too busy (or gets busy very suddenly), the requests will queue up without you having to do anything.

If I wanted to limit the total number of Tasks that are being executed at any given time to say 100, is it possible to accomplish this?
What you are looking for is to limit the MaximumConcurrencyLevel of what's called the Task Scheduler. You can create your own task scheduler that regulates the MaximumConcurrencyLevel of the tasks it manages. I would recommend implementing a queue-like object that tracks incoming requests and currently working requests, and waits for the current requests to finish before consuming more. The information below may still be relevant.
The task scheduler is in charge of how Tasks are prioritized, and in charge of tracking the tasks and ensuring that their work is completed, at least eventually.
The way it does this is actually very similar to what you mentioned. In general, the Task Scheduler handles tasks in a FIFO (first in, first out) model, very similar to how a ConcurrentQueue<T> works (at least starting in .NET 4).
Would the framework automatically prevent too many tasks from being executed?
By default the TaskScheduler that is created with most applications appears to default to a MaximumConcurrencyLevel of int.MaxValue. So theoretically yes.
The fact that there is practically no limit to the number of tasks (at least with the default TaskScheduler) might not be that big of a deal for your scenario.
Tasks are separated into two types, at least when it comes to how they are assigned to the available thread pools. They're separated into Local and Global queues.
Without going too far into detail, the way it works is that if a task creates other tasks, those new tasks are part of the parent task's queue (a local queue). Tasks spawned by a parent task are limited to the parent's thread pool (unless the task scheduler takes it upon itself to move queues around).
If a task isn't created by another task, it's a top-level task and is placed into the Global Queue. These would normally be assigned their own thread (if available), and if one isn't available they're treated in a FIFO model, as mentioned above, until their work can be completed.
This is important because although you can limit the amount of concurrency that happens with the TaskScheduler, it may not necessarily be important, for example if you have a top-level task that's marked as long-running and is in charge of processing your incoming requests. This would be helpful since all the tasks spawned by this top-level task will be part of that task's local queue and therefore won't spam all the available threads in your thread pool.
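For reference, .NET ships a ready-made scheduler with a concurrency cap, ConcurrentExclusiveSchedulerPair, so a custom TaskScheduler is not strictly required to experiment with this. The sketch below is illustrative only (the class and method names are made up). It limits only delegate tasks that are started on that scheduler. Asynchronous I/O operations such as HttpClient calls do not run on a scheduler while they are pending, so they would not be throttled this way.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LimitedSchedulerDemo
{
    // Runs 'count' blocking work items on a scheduler capped at 'maxConcurrency'
    // and returns the highest number of items observed running at the same time.
    public static int RunAndMeasurePeak(int count, int maxConcurrency)
    {
        TaskScheduler scheduler = new ConcurrentExclusiveSchedulerPair(
            TaskScheduler.Default, maxConcurrency).ConcurrentScheduler;
        var factory = new TaskFactory(scheduler);

        var gate = new object();
        int running = 0;
        int peak = 0;

        Task[] tasks = Enumerable.Range(0, count).Select(_ => factory.StartNew(() =>
        {
            lock (gate)
            {
                running++;
                peak = Math.Max(peak, running);
            }
            Thread.Sleep(50); // simulated synchronous (delegate) work
            lock (gate)
            {
                running--;
            }
        })).ToArray();

        Task.WaitAll(tasks);
        return peak;
    }
}
```

For example, RunAndMeasurePeak(8, 2) should never report more than 2 work items running at once.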

When you have a bunch of items and you want to process them asynchronously and with limited concurrency, the SemaphoreSlim is a great tool for this job. There are two ways that it can be used. One way is to create all the tasks immediately and have each task acquire the semaphore before doing its main work, and the other is to throttle the creation of the tasks while the source is enumerated. The first technique is eager, and so it consumes more RAM, but it's more maintainable because it is easier to understand and implement. The second technique is lazy, and it's more efficient if you have millions of items to process.
The technique that you have used in your sample code is the second (lazy) one.
Here is an example of using two SemaphoreSlims in order to impose two maximum concurrency policies, one per request and one globally. First the eager approach:
private const int maxConcurrencyGlobal = 100;
private static SemaphoreSlim globalThrottler
= new SemaphoreSlim(maxConcurrencyGlobal, maxConcurrencyGlobal);
public async Task<ResponseObj[]> DoStuffAsync(IEnumerable<Input> inputData)
{
const int maxConcurrencyPerRequest = 5;
var perRequestThrottler
= new SemaphoreSlim(maxConcurrencyPerRequest, maxConcurrencyPerRequest);
Task<ResponseObj>[] tasks = inputData.Select(async input =>
{
await perRequestThrottler.WaitAsync();
try
{
await globalThrottler.WaitAsync();
try
{
return await ExecHttpRequestAsync(input);
}
finally { globalThrottler.Release(); }
}
finally { perRequestThrottler.Release(); }
}).ToArray();
return await Task.WhenAll(tasks);
}
The Select LINQ operator provides an easy and intuitive way to project items to tasks.
And here is the lazy approach for doing exactly the same thing:
private const int maxConcurrencyGlobal = 100;
private static SemaphoreSlim globalThrottler
= new SemaphoreSlim(maxConcurrencyGlobal, maxConcurrencyGlobal);
public async Task<ResponseObj[]> DoStuffAsync(IEnumerable<Input> inputData)
{
const int maxConcurrencyPerRequest = 5;
var perRequestThrottler
= new SemaphoreSlim(maxConcurrencyPerRequest, maxConcurrencyPerRequest);
var tasks = new List<Task<ResponseObj>>();
foreach (var input in inputData)
{
await perRequestThrottler.WaitAsync();
await globalThrottler.WaitAsync();
Task<ResponseObj> task = Run(async () =>
{
try
{
return await ExecHttpRequestAsync(input);
}
finally
{
try { globalThrottler.Release(); }
finally { perRequestThrottler.Release(); }
}
});
tasks.Add(task);
}
return await Task.WhenAll(tasks);
static async Task<T> Run<T>(Func<Task<T>> action) => await action();
}
This implementation assumes that the await globalThrottler.WaitAsync() will never throw, which is a given according to the documentation. This will no longer be the case if you decide later to add support for cancellation, and you pass a CancellationToken to the method. In that case you would need one more try/finally wrapper around the task-creation logic. The first (eager) approach could be enhanced with cancellation support without such considerations. Its existing try/finally infrastructure is already sufficient.
It is also important that the internal helper Run method is implemented with async/await. Eliding the async/await would be an easy mistake to make, because in that case any exception thrown synchronously by the ExecHttpRequestAsync method would be rethrown immediately, and it would not be encapsulated in a Task<ResponseObj>. Then the task returned by the DoStuffAsync method would fail without releasing the acquired semaphores, and also without awaiting the completion of the already started operations. That's another argument for preferring the eager approach. The lazy approach has too many gotchas to watch for.
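As a rough sketch of the eager approach with cancellation added, assuming a generic operation delegate in place of ExecHttpRequestAsync (ThrottledRunner, RunAsync and the operation parameter are hypothetical names, not part of the code above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRunner
{
    private const int MaxConcurrencyGlobal = 100;
    private static readonly SemaphoreSlim GlobalThrottler =
        new SemaphoreSlim(MaxConcurrencyGlobal, MaxConcurrencyGlobal);

    // Eager approach: every task is created up front and waits for both semaphores.
    public static async Task<TResult[]> RunAsync<TInput, TResult>(
        IEnumerable<TInput> inputs,
        Func<TInput, CancellationToken, Task<TResult>> operation,
        int maxConcurrencyPerRequest,
        CancellationToken cancellationToken = default)
    {
        using var perRequestThrottler =
            new SemaphoreSlim(maxConcurrencyPerRequest, maxConcurrencyPerRequest);

        Task<TResult>[] tasks = inputs.Select(async input =>
        {
            // If WaitAsync is canceled, the semaphore was not acquired,
            // and because the try block has not been entered, nothing is released.
            await perRequestThrottler.WaitAsync(cancellationToken);
            try
            {
                await GlobalThrottler.WaitAsync(cancellationToken);
                try
                {
                    return await operation(input, cancellationToken);
                }
                finally { GlobalThrottler.Release(); }
            }
            finally { perRequestThrottler.Release(); }
        }).ToArray();

        return await Task.WhenAll(tasks);
    }
}
```

Each WaitAsync sits just before its own try block, so a canceled wait never leads to a Release for a semaphore that was not acquired. This is the reason the eager version needs no additional wrapper.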

Performance of multiple awaits compared to Task.WhenAll

General Information
I want to improve the performance of a program issuing multiple HTTP requests to the same external API endpoint. Therefore, I have created a console application to perform some tests. The method GetPostAsync sends an asynchronous HTTP request to the external API and returns the result as a string.
private static async Task<string> GetPostAsync(int id)
{
var client = new HttpClient();
var response = await client.GetAsync($"https://jsonplaceholder.typicode.com/posts/{id}");
return await response.Content.ReadAsStringAsync();
}
Additionally, I have implemented the methods below to compare the execution time of multiple calls to await and Task.WhenAll.
private static async Task TaskWhenAll(IEnumerable<int> postIds)
{
var tasks = postIds.Select(GetPostAsync);
await Task.WhenAll(tasks);
}
private static async Task MultipleAwait(IEnumerable<int> postIds)
{
foreach (var postId in postIds)
{
await GetPostAsync(postId);
}
}
Test Results
Using the integrated Stopwatch class, I have measured the timings of the two methods and interestingly enough, the approach using Task.WhenAll performed way better than its counterpart:
Issue 50 HTTP requests
TaskWhenAll: ~650ms
MultipleAwait: ~4500ms
Why is the method using Task.WhenAll so much faster, and are there any negative effects (e.g. exception handling) when choosing this approach over the other?

Why is the method using Task.WhenAll so much faster
It is faster because you are not awaiting each GetPostAsync call before starting the next one. Every time execution reaches await client.GetAsync($"https://jsonplaceholder.typicode.com/posts/{id}");, control returns to the caller, which can then start another HTTP request. Since an HTTP request takes much longer than creating a new client, you effectively run multiple HTTP requests concurrently. WhenAll just creates a single suspension point that waits for all tasks to finish.
With the multiple-await approach, you make the HTTP requests sequentially, one by one, by awaiting GetPostAsync(postId) inside the foreach loop. You start each task and immediately create a suspension point that waits for it to finish.
are there any negative effects (i.e exception handling, etc.) when choosing this approach over the other?
There are no negative effects. With the async/await pattern, exception handling works as usual with a try-catch block. WhenAll will aggregate the exceptions from every task that ends up in the Faulted state.
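One detail to be aware of when relying on that aggregation: awaiting the WhenAll task rethrows only one of the exceptions, while the full set is available on the task's Exception property. A minimal sketch, with made-up names and deliberately failing tasks:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllExceptionDemo
{
    // Hypothetical operation that always fails asynchronously.
    private static async Task<int> FailAsync(string message)
    {
        await Task.Yield();
        throw new InvalidOperationException(message);
    }

    // Returns the messages of all exceptions collected by Task.WhenAll.
    public static async Task<string[]> CollectFailuresAsync()
    {
        Task<int[]> all = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        try
        {
            await all;
        }
        catch (InvalidOperationException)
        {
            // await surfaces a single exception; the task itself holds all of them.
        }
        return all.Exception.InnerExceptions.Select(e => e.Message).ToArray();
    }
}
```

So if you need to log every failure, keep a reference to the task returned by Task.WhenAll and inspect its Exception property in the catch block.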

Making more remoting calls than threads by making synchronous methods async

I have a bunch of remoting calls that are all synchronous (3rd party library). Most of them take a lot of time, so I'm not able to make them more often than about 5 to 10 times per second. This is too slow because I need to call them at least 3000 times every couple of minutes, and many more if the service was stopped for some time. There is virtually no CPU work on the client. It gets the data, checks some simple conditions and makes another call that it has to wait for.
What would be the best way to make them async (call them in an async fashion - I guess I need some async wrapper) so that I can make more requests at the same time? Currently it's limited by the number of threads (which is four).
I was thinking about calling them with Task.Run, but every article I read says it's for CPU-bound work and that it uses thread-pool threads. If I get it correctly, with this approach I won't be able to break the thread limit, will I? So which approach would actually fit best here?
What about Task.FromResult? Can I await such methods asynchronously in a greater number than there are threads?
public async Task<Data> GetDataTakingLotsOfTime(object id)
{
var data = remoting.GetData(id);
return await Task.FromResult(data);
}

I was thinking about calling them with Task.Run but every article I read says it's for CPU bound work and that it uses thread-pool threads.
Yes, but when you're stuck with a sync API then Task.Run() might be your lesser evil, especially on a Client.
Your current version of GetDataTakingLotsOfTime() is not really async. The FromResult() merely helps to suppress the Warning about that.
What about Task.FromResult? Can I await such methods asynchronously in a greater number than there are threads?
Not clear where your "number of threads" idea comes from but yes, starting a Task method and awaiting it later essentially runs it on the ThreadPool. But Task.Run is clearer in that respect.
Note that that does not depend on the async modifier of the method - async is an implementation detail, the caller only cares that it returns a Task.
Currently It's limited by the number of threads (which is four).
This needs some explaining. I don't get it.
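A minimal sketch of the Task.Run option, assuming the blocking call can be passed in as a delegate. SyncCallWrapper, RunAllAsync and the maxConcurrency parameter are illustrative names, and blockingCall stands in for the third-party remoting.GetData method:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SyncCallWrapper
{
    // Runs a blocking call for each id on the thread pool,
    // with at most maxConcurrency calls in progress at a time.
    public static async Task<TResult[]> RunAllAsync<TId, TResult>(
        IEnumerable<TId> ids, Func<TId, TResult> blockingCall, int maxConcurrency)
    {
        using var throttler = new SemaphoreSlim(maxConcurrency);

        Task<TResult>[] tasks = ids.Select(async id =>
        {
            await throttler.WaitAsync();
            try
            {
                // The call still blocks a thread pool thread while it waits for the server.
                return await Task.Run(() => blockingCall(id));
            }
            finally
            {
                throttler.Release();
            }
        }).ToArray();

        return await Task.WhenAll(tasks);
    }
}
```

Every call in progress still holds one thread pool thread for its whole duration, so this trades threads for throughput. The concurrency limit is there to keep that trade under control.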

You are executing a remote call, and your thread needs to wait idly for the result of the remote call. During this wait your thread could do useful things, like executing other remote calls.
Times when your thread is idly waiting for other processes to finish, like writing to a disk, querying a database or fetching information from the internet are typically situations where you'll see an async function next to a non-async function: Write and WriteAsync, Send and SendAsync.
If at the deepest level of your synchronous call you have access to an async version of the call, then your life would be easy. Alas it seems that you don't have such an async version.
Your proposed solution using Task.Run has the disadvantage of the overhead in starting a new thread (or running one from the thread pool).
You could lower this overhead by creating a workshop object. In the workshop, a dedicated thread (a worker), or several dedicated threads, wait at one input point for an order to do something. A thread performs the task and posts the result at the output point.
Users of the workshop have one access point (front office?) where they post the request to do something and await the result.
For this I used System.Threading.Tasks.Dataflow.BufferBlock. Install Nuget package TPL Dataflow.
You can dedicate your workshop to accept only work to GetDataTakingLotsOfTime; I made my workshop generic: I accept every job that implements interface IWork:
interface IWork
{
void DoWork();
}
The WorkShop has two BufferBlocks: one for incoming work requests and one for finished work. The workshop has a thread (or several threads) that waits at the input BufferBlock until a job arrives, does the work, and when finished posts the job to the output BufferBlock:
class WorkShop
{
public WorkShop()
{
this.workRequests = new BufferBlock<IWork>();
this.finishedWork = new BufferBlock<IWork>();
this.frontOffice = new FrontOffice(this.workRequests, this.finishedWork);
}
private readonly BufferBlock<IWork> workRequests;
private readonly BufferBlock<IWork> finishedWork;
private readonly FrontOffice frontOffice;
public FrontOffice FrontOffice {get{return this.frontOffice;} }
public async Task StartWorkingAsync(CancellationToken token)
{
while (await this.workRequests.OutputAvailableAsync(token))
{ // some work request at the input buffer
IWork requestedWork = await this.workRequests.ReceiveAsync(token);
requestedWork.DoWork();
this.finishedWork.Post(requestedWork);
}
// if here: no work expected anymore:
this.finishedWork.Complete();
}
// function to close the WorkShop
public async Task CloseShopAsync()
{
// signal that no more work is to be expected:
this.workRequests.Complete();
// await until the worker has finished his last job for the day:
await this.finishedWork.Completion;
}
}
TODO: proper reaction on CancellationToken.CancellationRequested
TODO: proper reaction on exceptions thrown by work
TODO: decide whether to use several threads doing the work
FrontOffice has one async function that accepts work, sends the work to the work requests buffer, and awaits the finished work:
public async Task<IWork> OrderWorkAsync(IWork work, CancellationToken token)
{
// workRequests and finishedWork are the BufferBlocks passed to the FrontOffice constructor
await this.workRequests.SendAsync(work, token);
IWork finishedWork = await this.finishedWork.ReceiveAsync(token);
return finishedWork;
}
So your process creates a WorkShop object and starts one or more threads that will run StartWorkingAsync.
Whenever any thread (including your main thread) needs some work to be performed in async-await fashion:
Create an object that holds the input parameters and the DoWork function
Ask the WorkShop for the FrontOffice
await OrderWorkAsync
class InformationGetter : IWork
{
public int Id {get; set;} // the input Id
public Data FetchedData {get; private set;} // the result from Remoting.GetData(id);
public void DoWork()
{
this.FetchedData = remoting.GetData(this.Id);
}
}
Finally, the async version of your remote call:
async Task<Data> RemoteGetDataAsync(int id)
{
// create the job to get the information:
InformationGetter infoGetter = new InformationGetter() {Id = id};
// go to the front office of the workshop and order to do the job
await this.MyWorkShop.FrontOffice.OrderWorkAsync(infoGetter, CancellationToken.None);
return infoGetter.FetchedData;
}

Optimizing for fire & forget using async/await and tasks

I have about 5 million items to update. I don't really care about the response (A response would be nice to have so I can log it, but I don't want a response if that will cost me time.) Having said that, is this code optimized to run as fast as possible? If there are 5 million items, would I run the risk of getting any task cancelled or timeout errors? I get about 1 or 2 responses back every second.
var tasks = items.Select(async item =>
{
await Update(CreateUrl(item));
}).ToList();
if (tasks.Any())
{
await Task.WhenAll(tasks);
}
private async Task<HttpResponseMessage> Update(string url)
{
var client = new HttpClient();
var response = await client.SendAsync(url).ConfigureAwait(false);
//log response.
}
UPDATE:
I am actually getting TaskCanceledExceptions. Did my system run out of threads? What could I do to avoid this?

Your method will kick off all tasks at the same time, which may not be what you want. There wouldn't be any threads involved, because with async operations there is no thread, but there may be limits on the number of concurrent connections.
There may be better tools to do this but if you want to use async/await one option is to use Stephen Toub's ForEachAsync as documented in this article. It allows you to control how many simultaneous operations you want to execute, so you don't overrun your connection limit.
Here it is from the article:
public static class Extensions
{
public static async Task ExecuteInPartition<T>(IEnumerator<T> partition, Func<T, Task> body)
{
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select ExecuteInPartition(partition, body));
}
}
Usage:
public async Task UpdateAll()
{
// Allow for 100 concurrent Updates
await items.ForEachAsync(100, async t => await Update(t));
}
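On .NET 6 or later there is also a built-in alternative, Parallel.ForEachAsync, which is a different technique from the partitioner-based extension above but serves the same purpose of capping the number of operations in flight. A minimal sketch, where ThrottledUpdater and UpdateAllAsync are made-up names and the update delegate stands in for the Update method:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledUpdater
{
    // Invokes 'update' for every item with at most maxConcurrency operations in flight.
    public static Task UpdateAllAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> update)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrency };
        return Parallel.ForEachAsync(items, options, async (item, cancellationToken) =>
        {
            await update(item);
        });
    }
}
```

Usage would look like await ThrottledUpdater.UpdateAllAsync(items, 100, item => Update(CreateUrl(item)));, with the limit tuned to whatever your connection limits allow.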

A much better approach would be to use TPL Dataflow's ActionBlock with MaxDegreeOfParallelism and a single HttpClient:
Task UpdateAll(IEnumerable<Item> items)
{
var block = new ActionBlock<Item>(
item => UpdateAsync(CreateUrl(item)),
new ExecutionDataflowBlockOptions {MaxDegreeOfParallelism = 1000});
foreach (var item in items)
{
block.Post(item);
}
block.Complete();
return block.Completion;
}
async Task UpdateAsync(string url)
{
var response = await _client.SendAsync(url).ConfigureAwait(false);
Console.WriteLine(response.StatusCode);
}
A single HttpClient can be used concurrently for multiple requests, so it's much better to create and dispose of a single instance instead of 5 million.
There are numerous problems with firing so many requests at the same time: the machine's network stack, the target web site, timeouts and so forth. The ActionBlock caps that number with the MaxDegreeOfParallelism (which you should test and optimize for your specific case). It's important to note that TPL may choose a lower number when it deems it to be appropriate.
When you have a single async call at the end of an async method or lambda expression, it's better for performance to remove the redundant async-await and just return the task (i.e., return block.Completion;).
Complete will notify the ActionBlock to not accept any more items, but finish processing items it already has. When it's done the Completion task will be done so you can await it.

I suspect you are suffering from outgoing connection management preventing large numbers of simultaneous connections to the same domain. The answers given in this extensive Q+A might give you some avenues to investigate.
What is limiting the # of simultaneous connections my ASP.NET application can make to a web service?
In terms of your code structure, I'd personally try to use a dynamic pool of connections. You know that you can't actually get 5m connections simultaneously, so attempting it will just fail. You may as well work with a reasonable, configured limit of (for instance) 20 connections and use them in a pool. That way you can tune the limit up or down.
Alternatively, you could investigate HTTP pipelining (which I've not used), which is intended specifically for the job you are doing (batching up HTTP requests). http://en.wikipedia.org/wiki/HTTP_pipelining
