I am only trying to test the connection to the Redis server: are all the connection settings correct, and can I establish a connection to the Redis server from C#?
Here is the code that I used:
class Program
{
static readonly ConnectionMultiplexer redis = ConnectionMultiplexer.Connect(
new ConfigurationOptions
{
EndPoints = { "******windows.net", "6380" },
Password = "****",
Ssl = true,
AbortOnConnectFail = false,
AllowAdmin = true,
ConnectTimeout = 30000,
SyncTimeout = 30000
});
static async Task Main(string[] args)
{
ThreadPool.SetMinThreads(10, 10);
var db = redis.GetDatabase();
var pong = await db.PingAsync();
Console.WriteLine(pong);
}
}
Here is the error I am getting:
StackExchange.Redis.RedisTimeoutException:
'The timeout was reached before the message could be written to the output buffer,
and it was not sent, command=PING,
timeout: 30000,
inst: 0,
qu: 0,
qs: 0,
aw: False,
bw: CheckingForTimeout,
rs: NotStarted,
ws: Initializing,
in: 0,
last-in: 0,
cur-in: 0,
serverEndpoint: 0.0.24.236:6380,
mc: 1/1/0,
mgr: 10 of 10 available,
clientName: SJAIN(SE.Redis-v2.6.80.25426),
IOCP: (Busy=0,Free=1000,Min=10,Max=1000),
WORKER: (Busy=2,Free=32765,Min=10,Max=32767),
POOL: (Threads=13,QueuedItems=0,CompletedItems=699),
v: 2.6.80.25426
(Please take a look at this article for some common client-side issues
that can cause timeouts:
https://stackexchange.github.io/StackExchange.Redis/Timeouts)'
What am I missing here with this code?
Here are complete logs:
11:20:43.7290: Connecting (sync) on .NET Core 3.1.31 (StackExchange.Redis: v2.6.80.25426)
11:20:43.8989: endpoint.windows.net,0.0.24.236,syncTimeout=30000,allowAdmin=True,connectTimeout=30000,password=*****,ssl=True,abortConnect=False
11:20:43.9637: endpoint.windows.net:6380/Interactive: Connecting...
11:20:44.0600: endpoint.windows.net:6380: BeginConnectAsync
11:20:44.1316: 0.0.24.236:6380/Interactive: Connecting...
11:20:44.1318: 0.0.24.236:6380: BeginConnectAsync
11:20:44.1503: 2 unique nodes specified (with tiebreaker)
11:20:44.1519: endpoint.windows.net:6380: OnConnectedAsync init (State=Connecting)
11:20:44.1521: 0.0.24.236:6380: OnConnectedAsync init (State=Connecting)
11:20:44.1538: Allowing 2 endpoint(s) 00:00:30 to respond...
11:20:44.2065: Awaiting 2 available task completion(s) for 30000ms, IOCP: (Busy=2,Free=998,Min=10,Max=1000), WORKER: (Busy=0,Free=32767,Min=10,Max=32767), POOL: (Threads=6,QueuedItems=0,CompletedItems=9)
11:20:44.4260: 0.0.24.236:6380: OnConnectedAsync completed (Disconnected)
11:20:44.5679: Connection failed: 0.0.24.236:6380 (Subscription, UnableToConnect): UnableToConnect on 0.0.24.236:6380/Subscription, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.6.80.25426
11:20:44.5679: Connection failed: 0.0.24.236:6380 (Interactive, UnableToConnect): UnableToConnect on 0.0.24.236:6380/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.6.80.25426
11:20:44.5679: Connection failed: 0.0.24.236:6380 (Subscription, UnableToConnect): UnableToConnect on 0.0.24.236:6380/Subscription, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.6.80.25426
11:20:44.5679: Connection failed: 0.0.24.236:6380 (Interactive, UnableToConnect): UnableToConnect on 0.0.24.236:6380/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.6.80.25426
11:20:47.5071: endpoint.windows.net:6380: OnConnectedAsync completed (Disconnected)
11:20:47.5183: Connection failed: endpoint.windows.net:6380 (Interactive, UnableToConnect): UnableToConnect on endpoint.windows.net:6380/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 3s ago, last-write: 3s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 3s ago, v: 2.6.80.25426
11:20:47.5183: Connection failed: endpoint.windows.net:6380 (Subscription, UnableToConnect): UnableToConnect on endpoint.windows.net:6380/Subscription, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 3s ago, last-write: 3s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 3s ago, v: 2.6.80.25426
11:20:47.5427: All 2 available tasks completed cleanly, IOCP: (Busy=0,Free=1000,Min=10,Max=1000), WORKER: (Busy=3,Free=32764,Min=10,Max=32767), POOL: (Threads=13,QueuedItems=0,CompletedItems=29)
11:20:47.5559: Endpoint summary:
11:20:47.5571: endpoint.windows.net:6380: Endpoint is (Interactive: Connecting, Subscription: Connecting)
11:20:47.5571: 0.0.24.236:6380: Endpoint is (Interactive: Disconnected, Subscription: Disconnected)
11:20:47.5571: Task summary:
11:20:47.5574: endpoint.windows.net:6380: Returned, but incorrectly
11:20:47.5574: 0.0.24.236:6380: Returned, but incorrectly
11:20:47.5883: Election summary:
11:20:47.5955: Election: endpoint.windows.net:6380 had no tiebreaker set
11:20:47.5955: Election: 0.0.24.236:6380 had no tiebreaker set
11:20:47.5955: Election: No primaries detected
11:20:47.6068: Endpoint Summary:
11:20:47.6079: endpoint.windows.net:6380: Standalone v4.0.0, primary; keep-alive: 00:01:00; int: Connecting; sub: Disconnected; not in use: DidNotRespond
11:20:47.6609: endpoint.windows.net:6380: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2
11:20:47.6626: endpoint.windows.net:6380: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)
11:20:47.6626: 0.0.24.236:6380: Standalone v4.0.0, primary; keep-alive: 00:01:00; int: Disconnected; sub: Disconnected; not in use: DidNotRespond
11:20:47.6645: 0.0.24.236:6380: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2
11:20:47.6645: 0.0.24.236:6380: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)
11:20:47.6646: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago
11:20:47.6648: Resetting failing connections to retry...
11:20:47.6765: Retrying - attempts left: 2...
11:20:47.6765: 2 unique nodes specified (with tiebreaker)
11:20:47.6766: endpoint.windows.net:6380: OnConnectedAsync init (State=Connecting)
11:20:47.6766: 0.0.24.236:6380: OnConnectedAsync init (State=Connecting)
11:20:47.6766: Allowing 2 endpoint(s) 00:00:30 to respond...
11:20:47.6906: Awaiting 2 available task completion(s) for 30000ms, IOCP: (Busy=1,Free=999,Min=10,Max=1000), WORKER: (Busy=4,Free=32763,Min=10,Max=32767), POOL: (Threads=13,QueuedItems=0,CompletedItems=35)
11:20:47.7117: 0.0.24.236:6380: OnConnectedAsync completed (Disconnected)
11:20:47.7527: endpoint.windows.net:6380: OnConnectedAsync completed (Disconnected)
11:20:47.7888: All 2 available tasks completed cleanly, IOCP: (Busy=0,Free=1000,Min=10,Max=1000), WORKER: (Busy=3,Free=32764,Min=10,Max=32767), POOL: (Threads=13,QueuedItems=0,CompletedItems=42)
11:20:47.7888: Endpoint summary:
11:20:47.7888: endpoint.windows.net:6380: Endpoint is (Interactive: Disconnected, Subscription: Disconnected)
11:20:47.7889: 0.0.24.236:6380: Endpoint is (Interactive: Disconnected, Subscription: Disconnected)
11:20:47.7889: Task summary:
11:20:47.7889: endpoint.windows.net:6380: Returned, but incorrectly
11:20:47.7889: 0.0.24.236:6380: Returned, but incorrectly
11:20:47.7889: Election summary:
11:20:47.7889: Election: endpoint.windows.net:6380 had no tiebreaker set
11:20:47.7889: Election: 0.0.24.236:6380 had no tiebreaker set
11:20:47.7889: Election: No primaries detected
11:20:47.7889: Endpoint Summary:
11:20:47.7889: endpoint.windows.net:6380: Standalone v4.0.0, primary; keep-alive: 00:01:00; int: Disconnected; sub: Disconnected; not in use: DidNotRespond
11:20:47.7889: endpoint.windows.net:6380: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=3; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=3
11:20:47.7890: endpoint.windows.net:6380: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)
11:20:47.7890: 0.0.24.236:6380: Standalone v4.0.0, primary; keep-alive: 00:01:00; int: Disconnected; sub: Disconnected; not in use: DidNotRespond
11:20:47.7890: 0.0.24.236:6380: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=3; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=3
11:20:47.7890: 0.0.24.236:6380: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)
11:20:47.7890: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago
11:20:47.7890: Resetting failing connections to retry...
11:20:47.8632: Retrying - attempts left: 1...
11:20:47.8632: 2 unique nodes specified (with tiebreaker)
11:20:47.8632: endpoint.windows.net:6380: OnConnectedAsync init (State=Connecting)
11:20:47.8632: 0.0.24.236:6380: OnConnectedAsync init (State=Disconnected)
11:20:47.8632: Allowing 2 endpoint(s) 00:00:30 to respond...
11:20:47.9120: endpoint.windows.net:6380: OnConnectedAsync completed (Disconnected)
11:20:47.9120: Awaiting 2 available task completion(s) for 30000ms, IOCP: (Busy=0,Free=1000,Min=10,Max=1000), WORKER: (Busy=5,Free=32762,Min=10,Max=32767), POOL: (Threads=13,QueuedItems=0,CompletedItems=47)
11:21:17.8775: Not all available tasks completed cleanly (from ReconfigureAsync#1292, timeout 30000ms), IOCP: (Busy=0,Free=1000,Min=10,Max=1000), WORKER: (Busy=1,Free=32766,Min=10,Max=32767), POOL: (Threads=12,QueuedItems=0,CompletedItems=86)
11:21:17.8866: Server[0] (endpoint.windows.net:6380) Status: RanToCompletion (inst: 0, qs: 0, in: -1, qu: 0, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: NA. ws: NA)
11:21:17.8873: Server[1] (0.0.24.236:6380) Status: WaitingForActivation (inst: 0, qs: 0, in: -1, qu: 0, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: NA. ws: NA)
11:21:17.8873: Endpoint summary:
11:21:17.8873: endpoint.windows.net:6380: Endpoint is (Interactive: Disconnected, Subscription: Disconnected)
11:21:17.8873: 0.0.24.236:6380: Endpoint is (Interactive: Disconnected, Subscription: Disconnected)
11:21:17.8873: Task summary:
11:21:17.8873: endpoint.windows.net:6380: Returned, but incorrectly
11:21:17.8874: 0.0.24.236:6380: Did not respond (Task.Status: WaitingForActivation)
11:21:17.8874: Election summary:
11:21:17.8874: Election: endpoint.windows.net:6380 had no tiebreaker set
11:21:17.8874: Election: 0.0.24.236:6380 had no tiebreaker set
11:21:17.8875: Election: No primaries detected
11:21:17.8876: Endpoint Summary:
11:21:17.8885: endpoint.windows.net:6380: Standalone v4.0.0, primary; keep-alive: 00:01:00; int: Disconnected; sub: Disconnected; not in use: DidNotRespond
11:21:17.8890: endpoint.windows.net:6380: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=4; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=4
11:21:17.8897: endpoint.windows.net:6380: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)
11:21:17.8898: 0.0.24.236:6380: Standalone v4.0.0, primary; keep-alive: 00:01:00; int: Disconnected; sub: Disconnected; not in use: DidNotRespond
11:21:17.8898: 0.0.24.236:6380: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=4; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=4
11:21:17.8898: 0.0.24.236:6380: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)
11:21:17.8898: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago
11:21:17.8905: Starting heartbeat...
11:21:18.9710: 0.0.24.236:6380: OnConnectedAsync completed (Disconnected)
11:21:48.4544: Encountered exception: StackExchange.Redis.RedisTimeoutException: The timeout was reached before the message could be written to the output buffer, and it was not sent, command=SUBSCRIBE, timeout: 30000, inst: 0, qu: 0, qs: 0, aw: False, bw: CheckingForTimeout, last-in: 0, cur-in: 0, serverEndpoint: 0.0.24.236:6380, mc: 1/1/0, mgr: 10 of 10 available, clientName: SJAIN(SE.Redis-v2.6.80.25426), IOCP: (Busy=0,Free=1000,Min=10,Max=1000), WORKER: (Busy=0,Free=32767,Min=10,Max=32767), POOL: (Threads=14,QueuedItems=0,CompletedItems=294), v: 2.6.80.25426 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
at StackExchange.Redis.Maintenance.AzureMaintenanceEvent.AddListenerAsync(ConnectionMul
Updated Logs After Marc's answer
EndPoints = { "******windows.net", "6380" },
should be
EndPoints = { { "******windows.net", 6380 } },
or perhaps more simply:
EndPoints = { "******windows.net:6380" },
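Right now, you're connecting to "******windows.net" on the default port (6379, or 6380 when Ssl is enabled, which is what your logs show), and (separately, as a different endpoint) "6380" on that same default port. The bare "6380" is being parsed as the IP address 0.0.24.236, which is why that address appears in the logs.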
Note that you will need to be able to reach the machine - it must resolve by DNS (or be specified as an IP address), be routable by you, and the designated port(s) must be open.
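To see where the odd 0.0.24.236 address in the logs comes from, here is a minimal sketch that uses only System.Net and not StackExchange.Redis itself. It assumes the library's endpoint parsing behaves like IPAddress.Parse for a bare number, which matches what the logs show; the class name is just for illustration.

```csharp
using System;
using System.Net;

class EndpointParseDemo
{
    static void Main()
    {
        // A bare number is accepted as a one-part IPv4 address,
        // so "6380" is treated as an address rather than a port.
        IPAddress parsed = IPAddress.Parse("6380");
        Console.WriteLine(parsed); // prints 0.0.24.236

        // The same conversion by hand: 6380 = 24 * 256 + 236.
        int value = 6380;
        Console.WriteLine($"0.0.{value / 256}.{value % 256}"); // prints 0.0.24.236
    }
}
```

Writing the endpoint as "host:port" in a single string avoids the ambiguity, because the port can no longer be mistaken for a second host.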
I am using Google Cloud SQL with MySQL v5.7 from a C# .NET Core 2.2 and Entity Framework 6 application.
In my logs I can see the following exception from multiple locations in the code that I use the database from:
MySql.Data.MySqlClient.MySqlException (0x80004005): Connect Timeout expired. ---> System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeoutAsync(TaskNode asyncWaiter, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at MySqlConnector.Core.ConnectionPool.GetSessionAsync(MySqlConnection connection, IOBehavior ioBehavior, CancellationToken cancellationToken) in C:\projects\mysqlconnector\src\MySqlConnector\Core\ConnectionPool.cs:line 42
at MySql.Data.MySqlClient.MySqlConnection.CreateSessionAsync(Nullable`1 ioBehavior, CancellationToken cancellationToken) in C:\projects\mysqlconnector\src\MySqlConnector\MySql.Data.MySqlClient\MySqlConnection.cs:line 507
at MySql.Data.MySqlClient.MySqlConnection.CreateSessionAsync(Nullable`1 ioBehavior, CancellationToken cancellationToken) in C:\projects\mysqlconnector\src\MySqlConnector\MySql.Data.MySqlClient\MySqlConnection.cs:line 523
at MySql.Data.MySqlClient.MySqlConnection.OpenAsync(Nullable`1 ioBehavior, CancellationToken cancellationToken) in C:\projects\mysqlconnector\src\MySqlConnector\MySql.Data.MySqlClient\MySqlConnection.cs:line 232
at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenDbConnectionAsync(Boolean errorsExpected, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenAsync(CancellationToken cancellationToken, Boolean errorsExpected)
at Microsoft.EntityFrameworkCore.Storage.Internal.RelationalCommand.ExecuteAsync(IRelationalConnection connection, DbCommandMethod executeMethod, IReadOnlyDictionary`2 parameterValues, CancellationToken cancellationToken)
This happens briefly, for a split second, when there is some load on the database (not very high, about 20% CPU on the database machine).
Configuring The Context:
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
if (!optionsBuilder.IsConfigured)
{
optionsBuilder.UseMySql(
new System.Net.NetworkCredential(string.Empty, ConfigurationManager.CacheCS).Password, builder =>
{
builder.EnableRetryOnFailure(15, TimeSpan.FromSeconds(30), null);
}
);
}
}
This allows up to 15 retries, with a maximum delay of 30 seconds between retries.
It seems from the log that MySqlConnector does not retry on this specific error.
What I tried
I tried adding transient error numbers to the list of additional error numbers to retry on:
builder.EnableRetryOnFailure(15, TimeSpan.FromSeconds(30), MySqlErrorCodes.TransientErrors);
where MySqlErrorCodes.TransientErrors is defined as:
public enum MySqlErrorCode
{
// Too many connections
ConnectionCountError = 1040,
// Unable to open connection
UnableToConnectToHost = 1042,
// Lock wait timeout exceeded; try restarting transaction
LockWaitTimeout = 1205,
// Deadlock found when trying to get lock; try restarting transaction
LockDeadlock = 1213,
// Transaction branch was rolled back: deadlock was detected
XARBDeadlock = 1614
}
public class MySqlErrorCodes
{
static MySqlErrorCodes()
{
TransientErrors = new HashSet<int>()
{
(int)MySqlErrorCode.ConnectionCountError,
(int)MySqlErrorCode.UnableToConnectToHost,
(int)MySqlErrorCode.LockWaitTimeout,
(int)MySqlErrorCode.LockDeadlock,
(int)MySqlErrorCode.XARBDeadlock
};
}
public static HashSet<int> TransientErrors { get; private set; }
}
This didn't work.
Questions
How can I solve this issue?
Is there a way to make Entity Framework more resilient to such connectivity issues?
Edit
The issue occurs when I use this code to execute a raw sql command to call a stored procedure:
public static async Task<RelationalDataReader> ExecuteSqlQueryAsync(this DatabaseFacade databaseFacade,
string sql,
CancellationToken cancellationToken = default(CancellationToken),
params object[] parameters)
{
var concurrencyDetector = databaseFacade.GetService<IConcurrencyDetector>();
using (concurrencyDetector.EnterCriticalSection())
{
var rawSqlCommand = databaseFacade
.GetService<IRawSqlCommandBuilder>()
.Build(sql, parameters);
return await rawSqlCommand
.RelationalCommand
.ExecuteReaderAsync(
databaseFacade.GetService<IRelationalConnection>(),
parameterValues: rawSqlCommand.ParameterValues,
cancellationToken: cancellationToken);
}
}
...
using (var context = new CacheDbContext())
{
using (var reader = await context
.Database
.ExecuteSqlQueryAsync("CALL Counter_increment2(@p0, @p1, @p2)",
default(CancellationToken),
new object[] { id, counterType, value })
.ConfigureAwait(false)
)
{
reader.DbDataReader.Read();
if (!(reader.DbDataReader[0] is DBNull))
return Convert.ToInt32(reader.DbDataReader[0]);
else
{
Logger.Error($"Counter was not found! ('{id}', '{counterType}')");
return 1;
}
}
}
I think this may be why there are no retries for the connect timeout.
How can I retry this safely while not executing the same stored procedure twice?
Edit
These are the global variables:
SHOW GLOBAL VARIABLES LIKE '%timeout%'
connect_timeout 10
delayed_insert_timeout 300
have_statement_timeout YES
innodb_flush_log_at_timeout 1
innodb_lock_wait_timeout 50
innodb_rollback_on_timeout OFF
interactive_timeout 28800
lock_wait_timeout 31536000
net_read_timeout 30
net_write_timeout 60
rpl_semi_sync_master_async_notify_timeout 5000000
rpl_semi_sync_master_timeout 3000
rpl_stop_slave_timeout 31536000
slave_net_timeout 30
wait_timeout 28800
SHOW GLOBAL STATUS LIKE '%timeout%'
Ssl_default_timeout 7200
Ssl_session_cache_timeouts 0
SHOW GLOBAL STATUS LIKE '%uptime%'
Uptime 103415
Uptime_since_flush_status 103415
In addition to the connect timeout issue, I am also seeing the following log entry:
MySql.Data.MySqlClient.MySqlException (0x80004005): MySQL Server rejected client certificate ---> System.IO.IOException: Unable to read data from the transport connection: Broken pipe. ---> System.Net.Sockets.SocketException: Broken pipe
This seems to be a related issue with the connection to the database.
Is it safe to retry on such an exception?
The issue was caused by a low value for Max Pool Size (maximumpoolsize) in the connection string.
When multiple threads use the database and there are not enough connections to handle all the requests, callers wait for a free connection, and this can cause a Connect Timeout.
To fix it, set this option in the connection string to a higher value:
Max Pool Size={maxConnections};
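To illustrate why a small pool surfaces as a connect timeout, here is a rough sketch that models the pool with a SemaphoreSlim, the same primitive that appears in the stack trace in the question. It does not use MySqlConnector; the class and method names and all the timings are made up for the example.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PoolStarvationDemo
{
    // Hypothetical stand-in for taking a pooled connection:
    // the semaphore count plays the role of Max Pool Size.
    public static async Task<bool> UseConnectionAsync(
        SemaphoreSlim pool, TimeSpan connectTimeout, TimeSpan workTime)
    {
        // Waiting for a free slot is bounded by the connect timeout.
        if (!await pool.WaitAsync(connectTimeout))
            return false; // the analogue of "Connect Timeout expired"
        try
        {
            await Task.Delay(workTime); // pretend to run a query
            return true;
        }
        finally
        {
            pool.Release(); // hand the slot back to the pool
        }
    }

    // Starts all callers at once and counts how many never got a slot.
    public static async Task<int> CountTimeoutsAsync(int maxPoolSize, int callers)
    {
        using (var pool = new SemaphoreSlim(maxPoolSize, maxPoolSize))
        {
            bool[] results = await Task.WhenAll(
                Enumerable.Range(0, callers).Select(i =>
                    UseConnectionAsync(
                        pool,
                        TimeSpan.FromMilliseconds(200),    // connect timeout
                        TimeSpan.FromMilliseconds(500)))); // time each caller holds a slot
            return results.Count(ok => !ok);
        }
    }

    public static async Task Main()
    {
        Console.WriteLine($"pool=2,  callers=10: {await CountTimeoutsAsync(2, 10)} timed out");
        Console.WriteLine($"pool=10, callers=10: {await CountTimeoutsAsync(10, 10)} timed out");
    }
}
```

In this model every caller holds its slot for longer than the connect timeout, so any caller beyond the pool size times out even though the "database" itself is barely loaded. That matches the symptom of timeouts at only about 20% CPU.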
I think I've managed to write a test that reproduces this problem reliably, at least on my system. This question relates to HttpClient being used against a bad endpoint (a nonexistent endpoint, or a target that is down).
The problem is that the number of completed tasks falls short of the total, usually by a handful. I don't mind requests failing, but this leaves the app hanging when the results are awaited.
I get the following result from the test code below:
Elapsed: 237.2009884 seconds.
Tasks in batch array: 8000 Completed Tasks : 7993
If I set batchSize to 8 instead of 8000, it completes. For 8000 it jams on the WhenAll.
I wonder if other people get the same result, if I am doing something wrong, and if this appears to be a bug.
using System;
using System.Diagnostics;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
namespace CustomArrayTesting
{
/// <summary>
/// Problem: a large batch of async http requests is done in a loop using HttpClient, and a few of them never complete
/// </summary>
class ProgramTestHttpClient
{
static readonly int batchSize = 8000; //large batch size brings about the problem
static readonly Uri Target = new Uri("http://localhost:8080/BadAddress");
static TimeSpan httpClientTimeout = TimeSpan.FromSeconds(3); // short Timeout seems to bring about the problem.
/// <summary>
/// Sends off a bunch of async httpRequests using a loop, and then waits for the batch of requests to finish.
/// I installed asp.net web api client libraries Nuget package.
/// </summary>
static void Main(String[] args)
{
httpClient.Timeout = httpClientTimeout;
stopWatch = new Stopwatch();
stopWatch.Start();
// this timer updates the screen with the number of completed tasks in the batch (see the timerAction method below Main)
TimerCallback _timerAction = timerAction;
TimerCallback _resetTimer = ResetTimer;
TimerCallback _timerCallback = _timerAction + _resetTimer;
timer = new Timer(_timerCallback, null, TimeSpan.FromSeconds(1), Timeout.InfiniteTimeSpan);
//
for (int i = 0; i < batchSize; i++)
{
Task<HttpResponseMessage> _response = httpClient.PostAsJsonAsync<Object>(Target, new Object());//WatchRequestBody()
Batch[i] = _response;
}
try
{
Task.WhenAll(Batch).Wait();
}
catch (Exception ex)
{
}
timer.Dispose();
timerAction(null);
stopWatch.Stop();
Console.WriteLine("Done");
Console.ReadLine();
}
static readonly TimeSpan timerRepeat = TimeSpan.FromSeconds(1);
static readonly HttpClient httpClient = new HttpClient();
static Stopwatch stopWatch;
static System.Threading.Timer timer;
static readonly Task[] Batch = new Task[batchSize];
static void timerAction(Object state)
{
Console.Clear();
Console.WriteLine("Elapsed: {0} seconds.", stopWatch.Elapsed.TotalSeconds);
var _tasks = from _task in Batch where _task != null select _task;
int _tasksCount = _tasks.Count();
var _completedTasks = from __task in _tasks where __task.IsCompleted select __task;
int _completedTasksCount = _completedTasks.Count();
Console.WriteLine("Tasks in batch array: {0} Completed Tasks : {1} ", _tasksCount, _completedTasksCount);
}
static void ResetTimer(Object state)
{
timer.Change(timerRepeat, Timeout.InfiniteTimeSpan);
}
}
}
Sometimes it just crashes before finishing, with an unhandled Access Violation exception. The call stack just says:
> mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode = 1225, uint numBytes = 0, System.Threading.NativeOverlapped* pOVERLAP = 0x08b38b98)
[Native to Managed Transition]
kernel32.dll!@BaseThreadInitThunk@12()
ntdll.dll!___RtlUserThreadStart@8()
ntdll.dll!__RtlUserThreadStart@8()
Most of the time it doesn't crash but just never finishes waiting on the WhenAll. In any case, the following first-chance exceptions are thrown for each request:
A first chance exception of type 'System.Net.Sockets.SocketException' occurred in System.dll
A first chance exception of type 'System.Net.WebException' occurred in System.dll
A first chance exception of type 'System.AggregateException' occurred in mscorlib.dll
A first chance exception of type 'System.ObjectDisposedException' occurred in System.dll
I made the debugger stop on the Object disposed exception, and got this call stack:
> System.dll!System.Net.Sockets.NetworkStream.UnsafeBeginWrite(byte[] buffer, int offset, int size, System.AsyncCallback callback, object state) + 0x136 bytes
System.dll!System.Net.PooledStream.UnsafeBeginWrite(byte[] buffer, int offset, int size, System.AsyncCallback callback, object state) + 0x19 bytes
System.dll!System.Net.ConnectStream.WriteHeaders(bool async = true) + 0x105 bytes
System.dll!System.Net.HttpWebRequest.EndSubmitRequest() + 0x8a bytes
System.dll!System.Net.HttpWebRequest.SetRequestSubmitDone(System.Net.ConnectStream submitStream) + 0x11d bytes
System.dll!System.Net.Connection.CompleteConnection(bool async, System.Net.HttpWebRequest request = {System.Net.HttpWebRequest}) + 0x16c bytes
System.dll!System.Net.Connection.CompleteConnectionWrapper(object request, object state) + 0x4e bytes
System.dll!System.Net.PooledStream.ConnectionCallback(object owningObject, System.Exception e, System.Net.Sockets.Socket socket, System.Net.IPAddress address) + 0xf0 bytes
System.dll!System.Net.ServicePoint.ConnectSocketCallback(System.IAsyncResult asyncResult) + 0xe6 bytes
System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x65 bytes
System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken) + 0x92 bytes
System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xa6 bytes
System.dll!System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped) + 0x98 bytes
mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pOVERLAP) + 0x6e bytes
[Native to Managed Transition]
The exception message was:
{"Cannot access a disposed object.\r\nObject name: 'System.Net.Sockets.NetworkStream'."} System.Exception {System.ObjectDisposedException}
Notice the relationship to the unhandled access violation exception that I see only rarely.
So it seems that HttpClient is not robust when the target is down. I am doing this on Windows 7 32-bit, by the way.
I looked through the source of HttpClient using Reflector. For the synchronously executed part of the operation (when it is kicked off), there seems to be no timeout applied to the returned task, as far as I can see. There is a timeout implementation that calls Abort() on an HttpWebRequest object, but again they seem to have missed any timeout cancellation or faulting of the returned task on this side of the async function. There may be something on the callback side, but sometimes the callback is probably "going missing", leading to the returned Task never completing.
I posted a question asking how to add a timeout to any Task, and an answerer gave this very nice solution (here as an extension method):
public static Task<T> WithTimeout<T>(this Task<T> task, TimeSpan timeout)
{
var delay = task.ContinueWith(t => t.Result
, new CancellationTokenSource(timeout).Token);
return Task.WhenAny(task, delay).Unwrap();
}
So, calling HttpClient like this should prevent any "Tasks gone bad" from never ending:
Task<HttpResponseMessage> _response = httpClient.PostAsJsonAsync<Object>(Target, new Object()).WithTimeout<HttpResponseMessage>(httpClient.Timeout);
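As a quick check of how that extension method behaves, here is a small self-contained sketch. It uses a TaskCompletionSource that is never completed as a stand-in for a "lost" request (an assumption for the demo, not what HttpClient does internally), and the demo class names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskTimeoutExtensions
{
    // Same extension method as above: the continuation is cancelled by the
    // token if the original task has not completed before the timeout.
    public static Task<T> WithTimeout<T>(this Task<T> task, TimeSpan timeout)
    {
        var delay = task.ContinueWith(t => t.Result,
            new CancellationTokenSource(timeout).Token);
        return Task.WhenAny(task, delay).Unwrap();
    }
}

public static class WithTimeoutDemo
{
    public static void Main()
    {
        // A task that never completes, standing in for a request whose callback goes missing.
        Task<int> never = new TaskCompletionSource<int>().Task;
        try
        {
            never.WithTimeout(TimeSpan.FromMilliseconds(200)).Wait();
            Console.WriteLine("completed");
        }
        catch (AggregateException ex) when (ex.InnerException is TaskCanceledException)
        {
            // The wrapped task ends up cancelled instead of hanging forever.
            Console.WriteLine("timed out");
        }

        // A task that has already completed passes through with its result.
        int value = Task.FromResult(42).WithTimeout(TimeSpan.FromSeconds(1)).Result;
        Console.WriteLine(value);
    }
}
```

Note that the wrapped task ends in the Canceled state, so a WhenAll over a batch of such tasks will still complete, but callers need to treat cancellation as "no response".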
A couple more things that I think made requests less likely to go missing:
1. Increasing the timeout from 3s to 30s made all the tasks finish in the program that I posted with this question.
2. Increasing the number of concurrent connections allowed, for example with System.Net.ServicePointManager.DefaultConnectionLimit = 100;
I came across this question when googling for solutions to a similar problem with WCF. That series of exceptions is exactly the same pattern I see. Eventually, after a ton of investigation, I found a bug in HttpWebRequest, which HttpClient uses. The HttpWebRequest gets into a bad state and only sends the HTTP headers. It then sits waiting for a response that will never be sent.
I've raised a ticket with Microsoft Connect which can be found here: https://connect.microsoft.com/VisualStudio/feedback/details/1805955/async-post-httpwebrequest-hangs-when-a-socketexception-occurs-during-setsocketoption
The specifics are in the ticket but it requires an async POST call from the HttpWebRequest to a non-localhost machine. I've reproduced it on Windows 7 in .Net 4.5 and 4.6. The failed SetSocketOption call, which raises the SocketException, only fails on Windows 7 in testing.
For us the UseNagleAlgorithm setting causes the SetSocketOption call, but we can't avoid it as WCF turns off UseNagleAlgorithm and you can't stop it. In WCF it appears as a timed out call. Obviously this isn't great as we're spending 60s waiting for nothing.
Your exception information is being lost in the WhenAll task. Instead of using that, try this:
Task aggregateTask = Task.Factory.ContinueWhenAll(
Batch,
TaskExtrasExtensions.PropagateExceptions,
TaskContinuationOptions.ExecuteSynchronously);
aggregateTask.Wait();
This uses the PropagateExceptions extension method from the Parallel Extensions Extras sample code to ensure that exception information from the tasks in the batch operation is not lost:
/// <summary>Propagates any exceptions that occurred on the specified tasks.</summary>
/// <param name="tasks">The Task instances whose exceptions are to be propagated.</param>
public static void PropagateExceptions(this Task [] tasks)
{
if (tasks == null) throw new ArgumentNullException("tasks");
if (tasks.Any(t => t == null)) throw new ArgumentException("tasks");
if (tasks.Any(t => !t.IsCompleted)) throw new InvalidOperationException("A task has not completed.");
Task.WaitAll(tasks);
}
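Here is a small self-contained sketch of that pattern, so you can see what comes out of the aggregate task. The batch uses pre-built tasks from Task.FromResult and Task.FromException (available from .NET 4.6) as stand-ins for real requests; the demo class name and the exception messages are made up.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskExtrasExtensions
{
    // Same helper as above: rethrows the exceptions of the completed tasks.
    public static void PropagateExceptions(this Task[] tasks)
    {
        if (tasks == null) throw new ArgumentNullException("tasks");
        if (tasks.Any(t => t == null)) throw new ArgumentException("tasks");
        if (tasks.Any(t => !t.IsCompleted)) throw new InvalidOperationException("A task has not completed.");
        Task.WaitAll(tasks);
    }
}

public static class PropagateDemo
{
    // Builds a batch of stand-in "requests": one success and two failures.
    public static Task[] BuildBatch()
    {
        return new Task[]
        {
            Task.FromResult(1),
            Task.FromException(new InvalidOperationException("request 2 failed")),
            Task.FromException(new TimeoutException("request 3 timed out")),
        };
    }

    public static void Main()
    {
        Task aggregateTask = Task.Factory.ContinueWhenAll(
            BuildBatch(),
            TaskExtrasExtensions.PropagateExceptions,
            TaskContinuationOptions.ExecuteSynchronously);
        try
        {
            aggregateTask.Wait();
        }
        catch (AggregateException ex)
        {
            // Flatten unwraps the nested AggregateException thrown by WaitAll.
            foreach (Exception inner in ex.Flatten().InnerExceptions)
                Console.WriteLine($"{inner.GetType().Name}: {inner.Message}");
        }
    }
}
```

Keep in mind that ContinueWhenAll still only fires once every task in the batch has completed, so this surfaces the exceptions but does not by itself unblock a batch in which some tasks never finish.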