To begin with, I'm using Unity, which means I'm stuck with .NET 3.5. I'm working on a server program that uses the Socket class's asynchronous methods (e.g. BeginReceive, BeginAccept, BeginReceiveFrom). When the server receives a packet from a client, the packet arrives on a worker thread. That leaves me with data on a worker thread, and I want the main thread to process it using a function that I specify. I implemented this:
using System;
using System.Threading;
using System.Collections;
using System.Collections.Generic;

public class MyDispatcherClass
{
    public delegate void MyDel();

    private readonly Queue<MyDel> commands = new Queue<MyDel>();
    private readonly object lockObj = new object();

    public void Add(MyDel dc)
    {
        lock (lockObj)
        {
            commands.Enqueue(dc);
        }
    }

    public void Invoke()
    {
        lock (lockObj)
        {
            while (commands.Count > 0)
            {
                commands.Dequeue().Invoke();
            }
        }
    }
}
Then I would use it this way:
// As a global variable:
MyDispatcherClass SomeDispatcher = new MyDispatcherClass();

// The function that I want to call:
public void MyFunction(byte[] data)
{
    // Do some stuff on the main thread
}

// When I receive a message on a worker thread I do this
// (assume "data" is the message I received from a client):
SomeDispatcher.Add(() => MyFunction(data));

// Each frame on the main thread I call:
SomeDispatcher.Invoke();
After some research, I found that the lock statement does not guarantee 100% FIFO ordering, which is not what I want; in the worst case this could bring the whole server down. I want the same result with a 100% guarantee that data will be processed in the order it was received from a client. How can I accomplish that?
Threads will run in whatever order they want, so you can't force the order in which items go into the queue. But you can put more data into the queue than just what you will eventually process.
If you add a DateTime (or even just an int with a specified order) to the data being sent, you can sort the queue on that when you pull data from it (and possibly not pull any data less than 0.5 seconds old, to give other threads time to write theirs). A minimal sketch of that idea follows.
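For illustration only (the OrderedDispatcher class and its ticket mechanism are not from the original post, and this is .NET 3.5-friendly but untested): the receive path takes a ticket the moment a message arrives, fixing the processing order before any lock contention can scramble it, and the main thread then runs actions strictly in ticket order, holding back anything that arrives early.

using System;
using System.Collections.Generic;
using System.Threading;

public class OrderedDispatcher
{
    private readonly Dictionary<long, Action> pending = new Dictionary<long, Action>();
    private readonly object lockObj = new object();
    private long nextToRun;
    private long nextToAssign;

    // Called on the worker thread as soon as a message is received;
    // the ticket fixes the processing order at receive time.
    public long TakeTicket()
    {
        return Interlocked.Increment(ref nextToAssign) - 1;
    }

    public void Add(long ticket, Action action)
    {
        lock (lockObj) { pending[ticket] = action; }
    }

    // Called once per frame on the main thread; runs actions strictly
    // in ticket order and waits out any gaps.
    public void Invoke()
    {
        while (true)
        {
            Action action;
            lock (lockObj)
            {
                if (!pending.TryGetValue(nextToRun, out action)) return;
                pending.Remove(nextToRun);
                nextToRun++;
            }
            action(); // run outside the lock
        }
    }
}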
Normally when dealing with client-server relationships, each thread represents one client, so you don't have to worry about this: commands are FIFO within a thread (although they might not be when two different clients are sending messages).
Do you close and re-open the socket for the same client? That could make it use different threads. If you need a specific order and are sending things in quick succession, it might be better to leave the socket open.
Background
We have a service operation that can receive concurrent asynchronous requests and must process those requests one at a time.
In the following example, the UploadAndImport(...) method receives concurrent requests on multiple threads, but its calls to the ImportFile(...) method must happen one at a time.
Layperson Description
Imagine a warehouse with many workers (multiple threads). People (clients) can send the warehouse many packages (requests) at the same time (concurrently). When a package comes in, a worker takes responsibility for it from start to finish, and the person who dropped off the package can leave (fire-and-forget). The workers' job is to put each package down a small chute, and only one worker can put a package down the chute at a time, otherwise chaos ensues. If the person who dropped off the package checks in later (polling endpoint), the warehouse should be able to report on whether the package went down the chute or not.
Question
The question then is how to write a service operation that...
can receive concurrent client requests,
receives and processes those requests on multiple threads,
processes requests on the same thread that received the request,
processes requests one at a time,
is a one way fire-and-forget operation, and
has a separate polling endpoint that reports on request completion.
We've tried the following and are wondering two things:
Are there any race conditions that we have not considered?
Is there a more canonical way to code this scenario in C#.NET with a service oriented architecture (we happen to be using WCF)?
Example: What We Have Tried
This is the service code that we have tried. It works, though it feels like something of a hack or kludge.
static ImportFileInfo _inProgressRequest = null;

static readonly ConcurrentDictionary<Guid, ImportFileInfo> WaitingRequests =
    new ConcurrentDictionary<Guid, ImportFileInfo>();

public void UploadAndImport(ImportFileInfo request)
{
    // Receive the incoming request
    WaitingRequests.TryAdd(request.OperationId, request);

    while (null != Interlocked.CompareExchange(ref _inProgressRequest, request, null))
    {
        // Wait for any previous processing to complete
        Thread.Sleep(500);
    }

    // Process the incoming request
    ImportFile(request);

    Interlocked.Exchange(ref _inProgressRequest, null);
    WaitingRequests.TryRemove(request.OperationId, out _);
}

public bool UploadAndImportIsComplete(Guid operationId) =>
    !WaitingRequests.ContainsKey(operationId);
This is example client code.
private static async Task UploadFile(FileInfo fileInfo, ImportFileInfo importFileInfo)
{
    using (var proxy = new Proxy())
    using (var stream = new FileStream(fileInfo.FullName, FileMode.Open, FileAccess.Read))
    {
        importFileInfo.FileByteStream = stream;
        proxy.UploadAndImport(importFileInfo);
    }

    await Task.Run(() => Poller.Poll(timeoutSeconds: 90, intervalSeconds: 1, func: () =>
    {
        using (var proxy = new Proxy())
        {
            return proxy.UploadAndImportIsComplete(importFileInfo.OperationId);
        }
    }));
}
It's hard to write a minimum viable example of this in a Fiddle, but here is a start that gives a sense of it and that compiles.
As before, the above seems like a hack/kludge, and we are asking both about potential pitfalls in its approach and for alternative patterns that are more appropriate/canonical.
A simple solution using the producer-consumer pattern to pipe requests through when the thread count is restricted.
You still have to implement a simple progress reporter or event. I suggest replacing the expensive polling approach with the asynchronous communication offered by Microsoft's SignalR library. It uses WebSockets to enable asynchronous behavior. The client and server can register their callbacks on a hub, and using RPC the client can invoke server-side methods and vice versa. You would post progress to the client by using the hub (client side). In my experience SignalR is very simple to use and very well documented. It has libraries for most popular server-side languages (e.g. Java). A sketch of the server side follows.
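For illustration only, assuming classic ASP.NET SignalR 2.x; the hub name and the client-side reportProgress callback are hypothetical:

using System;
using Microsoft.AspNet.SignalR;

public class ProgressHub : Hub { }

public static class ProgressNotifier
{
    // Called from the server-side import code to push progress to all
    // connected clients instead of making them poll.
    public static void Report(Guid operationId, int percentComplete)
    {
        var context = GlobalHost.ConnectionManager.GetHubContext<ProgressHub>();
        context.Clients.All.reportProgress(operationId, percentComplete);
    }
}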
Polling, as I understand it, is the total opposite of fire-and-forget: you can't forget, because you have to keep checking something on an interval. Event-based communication, like SignalR, is fire-and-forget, since you fire and later get a reminder (because you forgot). The "event side" invokes your callback instead of you having to wait and check yourself.
Requirement 5 is ignored, since I didn't see a reason for it; waiting for a thread to complete would eliminate the fire-and-forget character.
private const int MaxNumberOfThreads = 8;

private BlockingCollection<ImportFileInfo> requestQueue = new BlockingCollection<ImportFileInfo>();
private bool isServiceEnabled;
private readonly Semaphore semaphore = new Semaphore(MaxNumberOfThreads, MaxNumberOfThreads);
private readonly object syncLock = new object();

public void UploadAndImport(ImportFileInfo request)
{
    // Start the request handler background loop
    if (!this.isServiceEnabled)
    {
        this.requestQueue?.Dispose();
        this.requestQueue = new BlockingCollection<ImportFileInfo>();

        // Fire and forget (requirement 4)
        Task.Run(() => HandleRequests());
        this.isServiceEnabled = true;
    }

    // Cache multiple incoming client requests (requirement 1) (and enable throttling)
    this.requestQueue.Add(request);
}

private void HandleRequests()
{
    while (!this.requestQueue.IsCompleted)
    {
        // Wait while the thread limit is exceeded (some throttling)
        this.semaphore.WaitOne();

        // Process the incoming requests in a dedicated thread (requirement 2)
        // until the BlockingCollection is marked completed.
        Task.Run(() => ProcessRequest());
    }

    // Reset the request handler after the BlockingCollection was marked completed
    this.isServiceEnabled = false;
    this.requestQueue.Dispose();
}

private void ProcessRequest()
{
    ImportFileInfo request = this.requestQueue.Take();
    UploadFile(request);

    // The updated question says the "ImportFile()" method requires synchronization.
    // This is a bottleneck and will significantly drop performance when the method is long running.
    lock (this.syncLock)
    {
        ImportFile(request);
    }

    this.semaphore.Release();
}
Remarks:
BlockingCollection is IDisposable.
TODO: You have to "close" the BlockingCollection by marking it completed ("BlockingCollection.CompleteAdding()"), or it will loop indefinitely, waiting for further requests. Maybe you introduce an additional request method for the client to cancel and/or update the process and mark adding to the BlockingCollection as completed. Or use a timer that waits for an idle period before marking it completed. Or make your request handler thread block or spin.
Replace Take() and Add(...) with TryTake(...) and TryAdd(...) if you want cancellation support.
The code is not tested.
Your "ImportFile()" method is a bottleneck in this multithreaded environment. I suggest making it thread-safe. In the case of I/O that requires synchronization, I would cache the data in a BlockingCollection and then write it out one item at a time, as sketched below.
The problem is that your total bandwidth is very small (only one job can run at a time), yet you want to handle parallel requests. That means queue time could vary wildly. Implementing your job queue in memory may not be the best choice: it makes your system more brittle and more difficult to scale out as your business grows.
A traditional, scalable way to architect this would be:
An HTTP service to accept requests, load balanced/redundant, with no session state.
A SQL Server database to persist the requests in a queue, returning a persistent unique job ID.
A Windows service to process the queue, one job at a time, and mark jobs as complete; the worker process would probably be single-threaded. (A sketch of this worker follows the list.)
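Here is a rough sketch of the Windows-service worker loop; the JobQueue table, its columns, and the ProcessQueue method are all hypothetical. UPDATE TOP (1) ... OUTPUT claims a row atomically, so even a second worker could not grab the same job.

using System;
using System.Data.SqlClient;
using System.Threading;

static void ProcessQueue(string connectionString)
{
    while (true)
    {
        Guid jobId = Guid.Empty;
        string payload = null;

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            @"UPDATE TOP (1) dbo.JobQueue
              SET Status = 'InProgress'
              OUTPUT inserted.JobId, inserted.Payload
              WHERE Status = 'Pending'", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                if (reader.Read())
                {
                    jobId = reader.GetGuid(0);
                    payload = reader.GetString(1);
                }
            }
        }

        if (payload == null)
        {
            Thread.Sleep(1000); // queue empty; poll again shortly
            continue;
        }

        // ... run the job identified by jobId, then UPDATE its Status to 'Complete' ...
    }
}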
This solution requires you to choose a web server. A common choice is IIS running ASP.NET. On that platform, each request is guaranteed to be handled in a single-threaded manner (i.e. you don't need to worry much about race conditions), but due to a feature called thread agility a request might finish on a different thread, albeit in the original synchronization context, which means you will probably never notice unless you are debugging and inspecting thread IDs.
Given the constraints and context of our system, this is the implementation we ended up using:
static ImportFileInfo _importInProgressItem = null;

static readonly ConcurrentQueue<ImportFileInfo> ImportQueue =
    new ConcurrentQueue<ImportFileInfo>();

public void UploadAndImport(ImportFileInfo request)
{
    UploadFile(request);
    ImportFileSynchronized(request);
}

// Synchronize the file import,
// because the database allows a user to perform only one write at a time.
private void ImportFileSynchronized(ImportFileInfo request)
{
    ImportQueue.Enqueue(request);
    do
    {
        ImportQueue.TryPeek(out var next);
        if (null != Interlocked.CompareExchange(ref _importInProgressItem, next, null))
        {
            // Queue processing is already under way in another thread.
            return;
        }

        ImportFile(next);
        ImportQueue.TryDequeue(out _);
        Interlocked.Exchange(ref _importInProgressItem, null);
    }
    while (ImportQueue.Any());
}

public bool UploadAndImportIsComplete(Guid operationId) =>
    ImportQueue.All(waiting => waiting.OperationId != operationId);
This solution works well for the loads we are expecting: a maximum of about 15-20 concurrent PDF file uploads, which tend to arrive all at once and then go quiet for several hours until the next batch arrives.
Criticism and feedback are most welcome.
From what I've read, BeginReceive is considered superior to Receive in almost all cases (or all?). This is because BeginReceive is asynchronous and waits for data to arrive on a separate thread, allowing the program to complete other tasks on the main thread while waiting.
But with BeginReceive, the callback function still runs on the main thread. So there is overhead in switching back and forth between the worker thread and the main thread each time data is received. I know the overhead is small, but why not avoid it by simply using a separate thread to host a continuous loop of Receive calls?
Can someone explain to me what is inferior about programming in the following style:
static void Main()
{
    double current_temperature = 0; // variable to hold data received from the server

    Thread t = new Thread(UpdateData);
    t.Start();

    // other code down here that will occasionally do something with current_temperature,
    // e.g. send it to a GUI when a button is pressed
    // ... more code here
}

static void UpdateData()
{
    Socket my_socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    my_socket.Connect(server_endpoint);
    byte[] buffer = new byte[1024];

    while (true)
    {
        my_socket.Receive(buffer); // will receive data 50-100 times per second

        // code here to take the data in buffer
        // and use it to update current_temperature
        // ... more code here
    }
}
I have a fairly vanilla web service (old school asmx). One of the methods kicks off some async processing that has no bearing on the result returned to the client. Hopefully, the little snippet below makes sense:
[System.Web.Services.WebMethod]
public List<Foo> SampleWebMethod(string id)
{
    // sample db query
    var foo = db.Query<Foo>("WHERE id=#0", id);

    // kick off async stuff here (for example, firing off emails);
    // don't wait to send the result
    DoAsyncStuffHere();

    return foo;
}
My initial implementation for the DoAsyncStuffHere method made use of the ThreadPool.QueueUserWorkItem. So, it looks something like:
public void DoAsyncStuffHere()
{
    ThreadPool.QueueUserWorkItem(delegate
    {
        // DO WORK HERE
    });
}
This approach works fine under low load conditions. However, I need something that can handle a fairly high load. So, the producer/consumer pattern would seem to be the best way to go.
Where I am confused is how to constrain all work being done by the queue to a single thread across all instances of the web service. How would I best go about setting up a single queue that can be accessed by any instance of the web service?
You can use a System.Collections.Concurrent.BlockingCollection<T> with a System.Collections.Concurrent.ConcurrentQueue<T> as the underlying collection.
As the name of the namespace implies, the collections are thread safe.
Start a consumer thread (or a few) to pull items from the collection, using the Take() method. When no items are available, the thread will block.
Your DoAsyncStuffHere method adds items to the BlockingCollection. These items could be unstarted System.Threading.Tasks.Task objects; the consumer thread(s) would in that case Start the tasks after taking them from the collection. A sketch of that arrangement follows.
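This is illustrative only (the WorkPipeline class is not from the original answer); it serializes the queued work on one background consumer thread:

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class WorkPipeline
{
    static readonly BlockingCollection<Task> WorkItems =
        new BlockingCollection<Task>(new ConcurrentQueue<Task>());

    static WorkPipeline()
    {
        var consumer = new Thread(() =>
        {
            while (true)
            {
                Task work = WorkItems.Take(); // blocks while the queue is empty
                work.Start();
                work.Wait(); // serialize: one work item at a time
            }
        });
        consumer.IsBackground = true;
        consumer.Start();
    }

    // DoAsyncStuffHere just wraps the work in an unstarted Task and queues it.
    public static void DoAsyncStuffHere(Action work)
    {
        WorkItems.Add(new Task(work));
    }
}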
One easy way to do it would be to implement your queue as a database table.
The producers would be the request threads handled by each instance of the web service.
The consumer could be any kind of continuously running process (Windows Forms app, Windows service, database job, etc.) that monitors the queue and processes items one at a time.
You can't do this with ThreadPool. Instead, you could have a static constructor that launches a worker Thread; DoAsyncStuffHere would insert its work item into a Queue of work you want done, and the worker Thread would check whether there are any items in the Queue to work on. If so, it does the work; otherwise it sleeps for a few milliseconds.
The static constructor ensures that it's only called once, so only a single Thread should be launched (unless there's some bizarre .NET edge case that I'm unaware of).
Here's a layout for an example; you'd probably need to implement some locking on the queue and add a bit more sophistication to the worker thread, but I've used this pattern before with success. The WorkItem object holds the state you want passed along to the worker thread.
static WebService()
{
    // Initialize the queue before the worker thread can touch it.
    WorkQueue = new Queue<WorkItem>();
    new Thread(WorkerThread).Start();
}

public static void WorkerThread()
{
    while (true)
    {
        if (WorkQueue.Count > 0) // needs locking in real use, as noted above
        {
            WorkQueue.Dequeue().DoWork();
        }
        else
        {
            Thread.Sleep(100);
        }
    }
}

public static Queue<WorkItem> WorkQueue { get; set; }

[System.Web.Services.WebMethod]
public List<Foo> SampleWebMethod(string id)
{
    WorkQueue.Enqueue(new WorkItem());
    // ... build and return the result as before ...
    return null;
}
When using a single-threaded loop, I could easily limit the messages sent per second by putting the thread to sleep (i.e. Thread.Sleep(1000/MessagesPerSecond)). Easy enough, but now that I have expanded into parallel threads this no longer works correctly.
Does anyone have a suggestion on how to throttle messages sent when using parallel threads?
Parallel.For(0, NumberOfMessages, delegate(int i)
{
    // Code here

    if (MessagesPerSecond != 0)
        Thread.Sleep(1000 / MessagesPerSecond);
});
Use an AutoResetEvent and a timer. Whenever the timer fires, have it Set the AutoResetEvent.
Then have your process that sends messages WaitOne on the AutoResetEvent immediately before sending.
private static readonly AutoResetEvent _Next = new AutoResetEvent(true);
private static Timer _NextTimer;

private static void SendMessages(IEnumerable<Message> messages)
{
    if (_NextTimer == null)
        InitializeTimer();

    Parallel.ForEach(
        messages,
        m =>
        {
            _Next.WaitOne();
            // Do something
        }
    );
}

// One plausible implementation (not spelled out in the original answer):
// fire SetNext once per send interval, e.g. every 1000 / MessagesPerSecond ms.
private static void InitializeTimer()
{
    _NextTimer = new Timer(SetNext, null, 0, 1000 / MessagesPerSecond);
}

private static void SetNext(object state)
{
    _Next.Set();
}
You might consider using a shared ConcurrentQueue, which your parallel loop would populate with prepared messages. Use a System.Threading.Timer to pull messages from the queue at your desired interval and send them. Note that this design only makes sense if preparing the messages is expensive; if the actual sending is the expensive part, there is no reason to run the loop in parallel.
If you need to stop the timer after the messages have been sent, you'll have to do some additional work, but this design works well for a throttled message sender that has to handle asynchronous message queuing. Another boundary case to consider is message pile-up, where messages are queued faster than they can be processed. You might want to generate an error in this case (as it may indicate a bug) or use a BlockingCollection. A sketch of the basic shape follows.
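For illustration only (ThrottledSender, Enqueue, and Send are hypothetical names, and the sketch assumes messagesPerSecond > 0):

using System;
using System.Collections.Concurrent;
using System.Threading;

static class ThrottledSender
{
    static readonly ConcurrentQueue<Message> Prepared = new ConcurrentQueue<Message>();
    static Timer _sendTimer;

    // The parallel loop calls this instead of sending directly.
    public static void Enqueue(Message m) => Prepared.Enqueue(m);

    public static void Start(int messagesPerSecond)
    {
        // One tick per send interval; each tick sends at most one message.
        _sendTimer = new Timer(_ =>
        {
            Message m;
            if (Prepared.TryDequeue(out m))
                Send(m);
        }, null, 0, 1000 / messagesPerSecond);
    }

    static void Send(Message m) { /* actual network send */ }
}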
I'm trying to make a networked console-based application, but it needs to listen to standard input and input from a socket at the same time. In C++ I would use the POSIX select() function for this, but in C# the equivalent (Socket.Select) works on sockets only. Is there a way I can listen to both inputs in C# without resorting to multiple threads?
To wait on multiple inputs, you need a WaitHandle for each one; then you call the static method WaitHandle.WaitAny.
Another option is to use async I/O. Use the BeginXXXX methods to start read/receive operations, supplying a callback in each case that executes on completion. After you launch the operations, you wait on a monitor object, and in the callbacks you pulse that monitor object to signal completion. This is a very efficient form of multithreaded programming, and you don't have to start any threads explicitly.
To get a raw Stream for standard input, use Console.OpenStandardInput. A sketch of the WaitAny approach follows.
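A minimal sketch (the class and event names are illustrative): each input source signals its own event when data is ready, and the main loop wakes on whichever fires first.

using System.Threading;

class InputMultiplexer
{
    // Each source sets its own event when data is ready
    // (e.g. from a BeginReceive callback or a stdin reader).
    static readonly AutoResetEvent StdinReady = new AutoResetEvent(false);
    static readonly AutoResetEvent SocketReady = new AutoResetEvent(false);

    static void MainLoop()
    {
        var handles = new WaitHandle[] { StdinReady, SocketReady };
        while (true)
        {
            // Blocks until either source signals; returns the index that fired.
            int which = WaitHandle.WaitAny(handles);
            if (which == 0)
            {
                // consume the stdin data that its reader queued up
            }
            else
            {
                // consume the socket data that its callback queued up
            }
        }
    }
}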
I will start by saying that, no, you cannot use Select() for standard input.
But C# gives you a better way to listen to multiple I/O sources.
async/await is a better approach, because it avoids blocking entirely, even for the calling function.
In the following example, the program listens to standard input with ReadAsync(), and in the meantime the PrintStaff() function prints the variable i to the console every 3 seconds.
Example:
using System;
using System.Threading;
using System.IO;
using System.Text;

namespace AsyncExplore
{
    class Program
    {
        static void Main(string[] args)
        {
            ReadFromConsole();
            PrintStaff();
        }

        private static void PrintStaff()
        {
            int i = 0;
            while (true)
            {
                Thread.Sleep(3000);
                Console.WriteLine(i++);
            }
        }

        private static async void ReadFromConsole()
        {
            // Open standard input once, outside the loop.
            using (Stream stdin = Console.OpenStandardInput())
            {
                byte[] buffer = new byte[4000];
                while (true)
                {
                    // Read from standard input without blocking, so control
                    // yields back to Main and another function can run.
                    int numBytes = await stdin.ReadAsync(buffer, 0, buffer.Length);

                    // Convert only the bytes actually read to a string.
                    Console.WriteLine(Encoding.ASCII.GetString(buffer, 0, numBytes));
                }
            }
        }
    }
}
The advantages are:
Readability: you can follow the sequence easily.
Coding: the coding process is much more intuitive and close to the synchronous style.
Zero extra threads: you can run this Main as a "listener" without starting a new thread. Unlike Select(), which requires you to stay blocked on the call, here you can make the whole function asynchronous, returning control to the caller.
Note: in the example, Main() is not async, to keep things simple, so it stays "blocked" in the while loop. But you can make PrintStaff() async as well, and Main too, and await on PrintStaff(); in that case the thread is never blocked at all, as in the sketch below.
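A minimal sketch of that fully non-blocking variant, assuming C# 7.1+ for async Main; note both methods now return Task instead of void, and Thread.Sleep becomes Task.Delay:

using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        // Both loops run concurrently and neither blocks the thread.
        await Task.WhenAll(ReadFromConsole(), PrintStaff());
    }

    static async Task PrintStaff()
    {
        int i = 0;
        while (true)
        {
            await Task.Delay(3000); // non-blocking, unlike Thread.Sleep
            Console.WriteLine(i++);
        }
    }

    static async Task ReadFromConsole()
    {
        // Same body as before, but returning Task so Main can await it.
        using (Stream stdin = Console.OpenStandardInput())
        {
            byte[] buffer = new byte[4000];
            while (true)
            {
                int numBytes = await stdin.ReadAsync(buffer, 0, buffer.Length);
                Console.WriteLine(Encoding.ASCII.GetString(buffer, 0, numBytes));
            }
        }
    }
}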