WCF PerCall inside-operation cross-caller event handling - c#

With my WCF service, I am solving an issue that has both performance and design implications.
The service is a stateless RESTful PerCall service that does a lot of simple and common things, which all work just dandy.
But there is one operation that has started to scare me a lot recently, so here is the problem:
Clients make parametrized calls to the operation, and the computation of the result takes a lot of time to finish. But the result of a call with identical parameters will always be the same until the data on the server change. And clients make an awful LOT of calls with exactly the same parameters. The server, however, cannot predict which parameters the users will like, so sadly enough, the results cannot be precomputed.
So I came up with a caching layer that stores the result object as a key-value pair, where the key represents the parameters that lead to this result. And if the relevant data change, I just flush the cache. Still simple and no problems with this.
A client calls the service; the server receives the call, checks whether the result is already cached, and returns it if so. But if the result is not cached yet, that call starts the computation. The computation may take up to 2 minutes (average time 10-15 seconds) to finish, and in the meantime other clients may come, and because the result is still not in the cache, each of them would start their own computation. That is NOT what we want, so there is a flag marking that someone has already started the computation with these parameters. This is the place in the code where the other callers stop and wait for the computation to finish and its result to be inserted into the cache, from where each of the waiting instances grabs the result, returns it to its client, and is disposed.
And this is the part, which I am really struggling with.
By now, my solution looks something like this (before you read further, I want to warn you that my experience is nowhere near a decent level and I am still a big noob in all things C#, WCF and related stuff... no need to tell me I'm a noob, because I am fully aware of that):
Stopwatch sw = new Stopwatch();
sw.Start();
while (true)
{
    if (Cache.Contains(parameters) || sw.Elapsed > threshold)
        break;
    Thread.Sleep(100);
}
...do relevant stuff here
As you can see, there are several problems with this solution:
Having the loop, the check and all this stuff does not only feel ugly; with many clients waiting this way, resource usage tends to jump up.
If the operation fails (the initial caller's computation fails to deliver within the threshold), I do not really know which client should be next to try the computation, or how, or even whether I should run the operation again or return a fault to the client...
EDIT: This is not related to synchronization. I am aware of the need for locking in some parts of my application, so my concerns are not synchronization-related.
What should I do when the relevant server-side data change while the invoked code is still performing the computation (making its result a wrong one)? ... Moreover, this has some other horrible effects on the application, but yeah, I am getting to the question here:
So, like most of the time, I did my homework and googled around before asking, but did not succeed in finding guidance that I would either understand or that would suit my issues and domain.
I have a strong feeling that I need to introduce some kind of (static?) event-based and/or asynchronous class (call it a layer if you will) that organizes and manages all of this in a register-with-me-and-I-will-give-you-a-poke / poke-all-registered-threads manner. But despite being able (to a certain extent) to use tasks, the TPL, and async-await, I have very limited experience in this field, and I really need help understanding how it could come together with events (or do I even need them?)... When I try little things in a test console application, I might succeed, but when I bring them into the bigger environment of my WCF application, I struggle to get a clue.
So guys, I will gladly welcome every kind of relevant thought, advice, guidance, link, code and criticism touching my topic.
I am aware that this might be confusing and will do my best to clear up all misunderstandings and tricky parts; just ask me.
Thanks for help!
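One way the "register and get poked" idea from the question could be sketched is to cache the running computation itself (a Task) instead of only the finished result, so every caller with the same parameters awaits the same Task. This is a minimal sketch under assumptions, not the asker's code: the class name SingleFlightCache, the string key and the compute delegate are all hypothetical, and it is in-memory only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: at most one running computation per parameter key.
public sealed class SingleFlightCache<TResult>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<TResult>>> _entries =
        new ConcurrentDictionary<string, Lazy<Task<TResult>>>();

    public async Task<TResult> GetOrComputeAsync(string key, Func<TResult> compute, TimeSpan timeout)
    {
        // GetOrAdd may build several Lazy objects under contention, but only one is stored,
        // and Lazy (default thread-safety mode) runs its factory once.
        Lazy<Task<TResult>> entry = _entries.GetOrAdd(
            key, _ => new Lazy<Task<TResult>>(() => Task.Run<TResult>(compute)));
        Task<TResult> task = entry.Value;

        // Waiters are suspended here without polling and without blocking a thread.
        Task finished = await Task.WhenAny(task, Task.Delay(timeout)).ConfigureAwait(false);
        if (finished != task)
            throw new TimeoutException("Computation for '" + key + "' did not finish in time.");

        try
        {
            return await task.ConfigureAwait(false);
        }
        catch
        {
            // Remove exactly this failed entry so that the next caller starts a fresh attempt.
            ((ICollection<KeyValuePair<string, Lazy<Task<TResult>>>>)_entries)
                .Remove(new KeyValuePair<string, Lazy<Task<TResult>>>(key, entry));
            throw;
        }
    }

    // Call when the server-side data change.
    public void Flush()
    {
        _entries.Clear();
    }
}
```

On a timeout only the waiting caller gives up; the shared computation keeps running and later callers can still pick up its result. A failure is seen by every waiter and the entry is dropped, which answers the "who goes next" question with "whoever calls next". Callers that were already waiting when Flush() ran still receive the result computed from the old data, so that case would need extra handling (for example a data version stored next to the key).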

Related

C# stop async inheritance

I'm getting in touch with the whole async / await functionality in C# right now.
I think I know what it is good for. But I have encountered places where I do not want every method that calls a library function of mine to have to become "async"-aware in turn.
Consider this (rough pseudo-code, not really representing the real thing, it's just about the context):
string jokeOfTheHour;
public string GiveJokeOfTheHour()
{
    if (HourIsOver)
    {
        // Blocks the calling thread until the async third-party call completes.
        jokeOfTheHour = thirdPartyLibrary.GetNewJoke().GetAwaiter().GetResult();
    }
    return jokeOfTheHour;
}
I have a web-back-end library function which is called up to a million times per hour (or even more).
Exactly one time of these million calls per hour, the logic within uses a third party library which just supports async calls for the methods I want to use from it.
I don't want the users of my library to even think that it would make any sense for them to run any code asynchronously when calling my library function, because it would only generate unnecessary overhead for their code and runtime the vast majority of the time.
The reasons I would state here are:
Separation of concerns. I know how I work, my user does not need to.
Context is everything. As a developer, having background knowledge is how I know which cases I need to consider when writing code, and which not. That enables me to omit writing hundreds of lines of code for stuff that should never happen.
Now, I want to know what general rules there are for doing this. But sadly, I can't find simple statements or rules on the web where anybody says "In this, this and this situation, you can stop this "async" keyword bubbling up your method call tree". I've only seen people (some of them Microsoft MVPs) saying that there absolutely are situations where this should be done, and that you should then use .GetAwaiter().GetResult() as a best practice, but they are never specific about the situations themselves.
What I am looking for is a down-to-earth general rule with which I can say:
Even though I might call third party functions which are async, I do not execute async, and do not want to appear as such. I'm a bottom level function using caches 99.99999% of the time. I don't need my user to implement the async methodology all the way up to where my actual user needs to decide where the async execution stops (which makes my user, who should actually save time by using my library, write more code and have more execution time).
I would really be thankful for your help :)
You seem to want your method to introduce itself with: "I'm fast". The truth is that from time to time it can actually be (very) slow. This potentially has serious consequences.
The statement
I'm a bottom level function using caches 99.99999% of the time
is not correct if you call your method once an hour.
It is better for consumers of your method to see "I can be slow, but if you call me often, I cache the result, so I will return fast" (which would be GiveJokeOfTheHourAsync() with a comment.)
If you want your method to always be fast I would suggest one of these options:
Have an UpdateJokeAsync method that you call without waiting for it in your if(HourIsOver). This would mean returning a stale result until you have fetched a new one.
Update your joke using a timer.
Make 'get' always return the last known joke and have UpdateJokeAsync update the joke.
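The first option above (start the update without waiting and return the stale value meanwhile) could look roughly like this. It is a sketch under assumptions: the JokeCache class, the fetchJokeAsync delegate (standing in for thirdPartyLibrary.GetNewJoke()) and the initial joke are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class JokeCache
{
    private readonly Func<Task<string>> _fetchJokeAsync; // assumption: wraps the async third-party call
    private readonly TimeSpan _maxAge;
    private volatile string _joke;
    private long _fetchedAtTicks;  // UTC ticks of the last successful refresh (0 = never)
    private int _refreshing;       // 0 = idle, 1 = a refresh is in flight

    public JokeCache(Func<Task<string>> fetchJokeAsync, TimeSpan maxAge, string initialJoke)
    {
        _fetchJokeAsync = fetchJokeAsync;
        _maxAge = maxAge;
        _joke = initialJoke;
    }

    // Synchronous and always fast: returns the last known joke and, if it is too old,
    // starts at most one background refresh.
    public string GiveJokeOfTheHour()
    {
        string current = _joke;
        bool stale = DateTime.UtcNow.Ticks - Interlocked.Read(ref _fetchedAtTicks) > _maxAge.Ticks;
        if (stale && Interlocked.CompareExchange(ref _refreshing, 1, 0) == 0)
        {
            Task.Run(RefreshAsync); // fire and forget; failures are handled inside RefreshAsync
        }
        return current;
    }

    private async Task RefreshAsync()
    {
        try
        {
            _joke = await _fetchJokeAsync().ConfigureAwait(false);
            Interlocked.Exchange(ref _fetchedAtTicks, DateTime.UtcNow.Ticks);
        }
        catch (Exception)
        {
            // Keep serving the stale joke; real code should log this.
        }
        finally
        {
            Interlocked.Exchange(ref _refreshing, 0);
        }
    }
}
```

The trade-off is the one named in the answer: callers never wait and never see "async", but around each refresh they get the old joke for a short while.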

WCF, long running server operations and a WinRT client

Here's a problem I'm currently facing:
A WCF service exposes a large number of methods, some of which can take a longer amount of time.
The client is a WinRT (Metro-style) application (so some .NET classes are unavailable).
The timeout on the client has already been increased to 1.5 minutes.
Despite the increased timeout, some operations can take longer still (but not always).
If a timeout happens, the service continues on its merry way. The result of the requested operation is lost. Even worse, if the operation is a success, then the client won't get the data required, and the server won't "rollback".
All operations are already implemented using the async pattern on the client. I could use an event-based implementation but, as far as I'm aware, the timeouts will still occur then.
Increasing the timeout value is definitely an option, but it feels like a very dirty solution - it feels like pushing the problem away rather than solving it.
Implementing a WS transaction flow on the server seems impossible - I don't have access to TransactionScope class when designing WinRT apps.
WS Atomic seems like overkill as well (it also requires a lot more set up, and I'm willing to bet the limited capabilities of WinRT applications will prove a big hassle to overcome).
So far my only idea (albeit one with a lot more moving parts, which sort of feels like reinventing the wheel) is to create two service methods - one which begins some long-running operation and returns some kind of "task ID", then runs the operation in the background, and saves the result of the operation (be it error or success) into a DB / storage with that task ID. The client can then poll for the operations result using that task ID via the second service method every once in a while until such a result is available (be it a success or an error).
This approach also has its drawbacks:
long operations become even longer, as the client needs to poll for the results
lots of new moving parts, potentially making the whole thing less stable
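The two-method idea (start an operation and get a task ID back, then poll for the result) can be sketched on the server side as follows. All names here (LongOperationService, JobStatus, Start, Poll) are hypothetical, and an in-memory dictionary stands in for the DB / storage mentioned above; a real service would take operation parameters instead of a delegate and would persist the status.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public enum JobState { Running, Succeeded, Failed }

public sealed class JobStatus
{
    public JobState State { get; set; }
    public string Result { get; set; }
    public string Error { get; set; }
}

public sealed class LongOperationService
{
    // In-memory stand-in for the DB / storage keyed by task ID.
    private readonly ConcurrentDictionary<Guid, JobStatus> _jobs =
        new ConcurrentDictionary<Guid, JobStatus>();

    // First service method: returns immediately with a task ID.
    public Guid Start(Func<string> operation)
    {
        Guid id = Guid.NewGuid();
        _jobs[id] = new JobStatus { State = JobState.Running };
        Task.Run(() =>
        {
            try
            {
                _jobs[id] = new JobStatus { State = JobState.Succeeded, Result = operation() };
            }
            catch (Exception ex)
            {
                // Errors are stored as an outcome too, so the client always gets an answer.
                _jobs[id] = new JobStatus { State = JobState.Failed, Error = ex.Message };
            }
        });
        return id;
    }

    // Second service method: cheap lookup, returns null for an unknown ID.
    public JobStatus Poll(Guid id)
    {
        JobStatus status;
        return _jobs.TryGetValue(id, out status) ? status : null;
    }
}
```

Each poll is a short call, so the client timeout no longer has to cover the whole operation; the extra latency is at most one polling interval.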
What else could I possibly try to solve this issue?
PS. The actual service side is also not without limitations - it's an MS DAX service, which likely comes with its own set of potential pitfalls and traps.
EDIT:
It appears my question has some similarity to this SO question... however, given the WinRT nature of the client and the MS DAX nature of the service I'm not sure anything in the answer is really useful to me.

Unit testing - how to emulate a delay

We've got a large C# solution with multiple APIs, SVCs and so on.
Usual sort of enterprisy mess that you get after the same code has been worked on for years by multiple people.
Anyway! We have the ability to call an external service, and we have some unit tests in place that use a Moq-like stub implementation of the service's interface.
It so happens that there can be a large delay in calling the external service and it's not anything that we can control (it's a GDS interface).
We've been working on a way to streamline the user experience for this part of our platform.
The problem is, the stub doesn't actually do much at all - and of course, is lightning fast, compared to the real thing.
We want to introduce a random delay into one of the stubbed methods, that will cause the call to take between 10 and 20 seconds to complete.
The naive approach is to do:
int sleepTimer = random.Next(10, 20);
Thread.Sleep(sleepTimer * 1000);
But something about this gives me a bad feeling.
What other ways do people have of solving this kind of scenario, or is Thread.Sleep actually Ok to use in this context ?
Thanks for your time!
-Russ
Edit, To answer some of the comments:
Basically, we don't want to call the live external service from our test suite, because it costs money and other business problems.
However, we want to test that our new processes work well, even when there's a variable delay in this essential call to the external service.
I would love to explain the exact process, but I'm not allowed to.
But yeah, the summary is that our test needs to ensure that a long running call to an external service doesn't obstruct the rest of the flow; and we need to ensure that other tasks don't get into any kind of race conditions, as they depend on the result of this call.
I agree that calling it a unit-test is somewhat incorrect now!
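As an alternative to Thread.Sleep inside the stub, the delay can be injected, so that slow-path tests use a realistic range while the rest of the suite stays fast. The sketch below assumes the service interface is (or can be made) Task-returning; IExternalService, SearchAsync and DelayedStubService are hypothetical names, not the real GDS interface.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the external service's interface.
public interface IExternalService
{
    Task<string> SearchAsync(string query);
}

public sealed class DelayedStubService : IExternalService
{
    private readonly Func<TimeSpan> _nextDelay;

    // The test decides how long each call takes.
    public DelayedStubService(Func<TimeSpan> nextDelay)
    {
        _nextDelay = nextDelay;
    }

    public async Task<string> SearchAsync(string query)
    {
        // Task.Delay waits without blocking a thread, unlike Thread.Sleep.
        await Task.Delay(_nextDelay()).ConfigureAwait(false);
        return "stubbed result for " + query;
    }

    // Random delay in whole seconds, both bounds inclusive. A fixed seed keeps runs repeatable.
    public static DelayedStubService WithRandomDelay(int minSeconds, int maxSeconds, int seed)
    {
        var random = new Random(seed);
        var gate = new object(); // Random is not thread-safe
        return new DelayedStubService(() =>
        {
            lock (gate)
            {
                // The upper bound of Random.Next is exclusive, hence the + 1.
                return TimeSpan.FromSeconds(random.Next(minSeconds, maxSeconds + 1));
            }
        });
    }
}
```

A test for the slow path could use DelayedStubService.WithRandomDelay(10, 20, 1234), while ordinary tests pass () => TimeSpan.Zero. Note that random.Next(10, 20) in the naive version never returns 20.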

Any 'quick wins' to make .NET remoting faster on a single machine?

I've been badly let-down and received an application that in certain situations is at least 100 times too slow, which I have to release to our customers very soon (a matter of weeks).
Through some very simple profiling I have discovered that the bottleneck is its use of .NET Remoting to transfer data between a Windows service and the graphical front-end - both running on the same machine.
Microsoft guidelines say "Minimize round trips and avoid chatty interfaces": write
MyComponent.SaveCustomer("bob", "smith");
rather than
MyComponent.Firstname = "bob";
MyComponent.LastName = "smith";
MyComponent.SaveCustomer();
I think this is the root of the problem in our application. Unfortunately calls to MyComponent.* (the profiler shows that 99.999% of the time is spent in such statements) are scattered liberally throughout the source code and I don't see any hope of redesigning the interface in accordance with the guidelines above.
Edit: In fact, most of the time the front-end reads properties from MyComponent rather than writes to it. But I suspect that MyComponent can change at any time in the back-end.
I looked to see if I can read all properties from MyComponent in one go and then cache them locally (ignoring the change-at-any-time issue above), but that would involve altering hundreds of lines of code.
My question is: Are there any 'quick-win' things I can try to improve performance?
I need at least a 100-times speed-up. I am a C/C++/Delphi programmer and am pretty-much unfamiliar with C#/.NET/Remoting other than what I have read up on in the last couple of days. I'm looking for things that can be completed in a few days - a major restructuring of the code is not an option.
Just for starters, I have already confirmed that it is using BinaryFormatter.
(Sorry, this is probably a terrible question along the lines of 'How can I feasibly fix X if I rule out all of the feasible options'… but I'm desperate!)
Edit 2
In response to Richard's comment below: I think my question boils down to:
Is there any setting I can change to reduce the cost of a .NET Remoting round-trip when both ends of the connection are on the same machine?
Is there any setting I can change to reduce the number of round-trips - so that each invocation of a remote object property doesn't result in a separate round-trip? And might this break anything?
Under .NET Remoting you have 3 ways of communicating: HTTP, TCP and IPC. If the communication is on the same PC, I suggest using IPC channels; it will speed up your calls.
In short, no there are no quick wins here. Personally I would not make MyComponent (as a DTO) a MarshalByRefObject (which is presumably the problem), as those round trips are going to cripple you. I would keep it as a regular class, and just move a few key methods to pump them around (i.e. have a MarshalByRef manager/repository/etc class).
That should reduce round-trips; if you still have problems then it will probably be bandwidth related; this is easier to fix; for example by changing the serializer. protobuf-net allows you to do this easily by simply implementing ISerializable and forwarding the two methods (one from the interface, plus the ctor) to ProtoBuf.Serializer - it then does all the work for you, and works with remoting. I can provide examples of this if you like.
Actually, protobuf-net may help with CPU usage too, as it is a much more CPU-efficient serializer.
Could you make MyComponent a class that will cache the values and only submit them when SaveCustomer() is called?
You can try compressing the traffic. Even if it is not a 100-times increase, you'll still gain some performance benefit.
If you need the latest data (always see the real value), and the cost of getting the data each time dominates the runtime then you need to be radical.
How about changing polling to push? Rather than calling the remote side each time you need a value, have the remote side push all changes and cache the latest values locally.
Local lookups (after the initial get) are always up to date with all remoting overhead being done in the background (on another thread). Just be careful about thread safety for non-atomic types.
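The push-and-cache idea can be sketched as a small local cache that the front-end reads from, while change notifications from the service update it in the background. How the notifications arrive (a remoting callback, an event, a separate channel) is not shown; LocalPropertyCache, LoadSnapshot and OnRemoteChange are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical front-end side cache of the remote component's properties.
public sealed class LocalPropertyCache
{
    private readonly ConcurrentDictionary<string, object> _values =
        new ConcurrentDictionary<string, object>();

    // One remote round trip at start-up: copy all current values.
    public void LoadSnapshot(IEnumerable<KeyValuePair<string, object>> snapshot)
    {
        foreach (KeyValuePair<string, object> pair in snapshot)
        {
            _values[pair.Key] = pair.Value;
        }
    }

    // Called on a background thread whenever the service pushes a change.
    public void OnRemoteChange(string name, object value)
    {
        _values[name] = value;
    }

    // Local lookup, no remoting involved. Throws KeyNotFoundException for unknown names.
    public T Get<T>(string name)
    {
        return (T)_values[name];
    }
}
```

ConcurrentDictionary keeps individual reads and writes safe across threads; values that must change together (for example first name and last name) would still need to be pushed and stored as one object.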

Is this a good time to use multithreading in ASP.NET MVC and how is it implemented?

I want a certain action request to trigger a set of e-mail notifications. The user does something, and it sends the e-mails. However, I do not want the user to wait for the page response until the system generates and sends the e-mails. Should I use multithreading for this? Will this even work in ASP.NET MVC? I want the user to get a page response back and the system to just finish sending the e-mails at its own pace. Not even sure if this is possible or what the code would look like. (PS: Please don't offer me an alternative solution for sending e-mails, I don't have time for that kind of reconfiguration.)
SmtpClient.SendAsync is probably a better bet than manual threading, though multi-threading will work fine with the usual caveats.
http://msdn.microsoft.com/en-us/library/x5x13z6h.aspx
As other people have pointed out, success/failure cannot be indicated deterministically when the page returns before the send is actually complete.
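A minimal sketch of SmtpClient.SendAsync with a SendCompleted handler, so that a failed send at least ends up in a log instead of disappearing silently. NotificationMailer, the log delegate and the host parameter are hypothetical; how this behaves inside a particular ASP.NET request pipeline is not covered here.

```csharp
using System;
using System.ComponentModel;
using System.Net.Mail;

public static class NotificationMailer
{
    // Starts the send and returns immediately; the outcome is reported to 'log' later.
    public static void SendInBackground(string host, MailMessage message, Action<string> log)
    {
        // One client per message: SmtpClient allows only one SendAsync at a time.
        var client = new SmtpClient(host);
        client.SendCompleted += (sender, e) =>
        {
            log(DescribeOutcome(e));
            message.Dispose();
            client.Dispose();
        };
        client.SendAsync(message, null); // second argument is an optional user token
    }

    // SendAsync does not throw for delivery problems; they arrive here as e.Error.
    public static string DescribeOutcome(AsyncCompletedEventArgs e)
    {
        if (e.Cancelled) return "Mail send was cancelled.";
        if (e.Error != null) return "Mail send failed: " + e.Error.Message;
        return "Mail sent.";
    }
}
```

The user still cannot be told about a failure in the same page response, but the failure becomes visible to whoever watches the log.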
A couple of observations when using asynchronous operations:
1) They will come back to bite you in some way or another. It's a risk versus benefit discussion. I like the SendAsync() method I proposed because it means forms can return instantly even if the email server takes a few seconds to respond. However, because it doesn't throw an exception, you can have a broken form and not even know it.
Of course unit testing should address this initially, but what if the production configuration file gets changed to point to a broken mail server? You won't know it, you won't see it in your logs, you only discover it when someone asks you why you never responded to the form they filled out. I speak from experience on this one. There are ways around this, but in practicality, async is always more work to test, debug, and maintain.
2) Threading in ASP.Net works in some situations if you understand the ThreadPool, app domain refreshes, locking, etc. I find that it is most useful for executing several operations at once to increase performance where the end result is deterministic, i.e. the application waits for all threads to complete. This way, you gain the performance benefits while still having a clear indication of results.
3) Threading/Async operations do not increase performance, only perceived performance. There may be some edge cases where that is not true (such as processor optimizations), but it's a good rule of thumb. Improperly used, threading can hurt performance or introduce instability.
The better scenario is out of process execution. For enterprise applications, I often move things out of the ASP.Net thread pool and into an execution service.
See this SO thread: Designing an asynchronous task library for ASP.NET
I know you are not looking for alternatives, but using a MessageQueue (such as MSMQ) could be a good solution for this problem in the future. Using multithreading in asp.net is normally discouraged, but in your current situation I don't see why you shouldn't. It is definitely possible, but beware of the pitfalls related to multithreading (stolen here):
• There is a runtime overhead associated with creating and destroying threads. When your application creates and destroys threads frequently, this overhead affects the overall application performance.
• Having too many threads running at the same time decreases the performance of your entire system. This is because your system is attempting to give each thread a time slot to operate inside.
• You should design your application well when you are going to use multithreading, or otherwise your application will be difficult to maintain and extend.
• You should be careful when you implement a multithreading application, because threading bugs are difficult to debug and resolve.
At the risk of violating your no-alternative-solution prime directive, I suggest that you write the email requests to a SQL Server table and use SQL Server's Database Mail feature. You could also write a Windows service that monitors the table and sends emails, logging successes and failures in another table that you view through a separate ASP.Net page.
You probably can use ThreadPool.QueueUserWorkItem
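A small sketch of ThreadPool.QueueUserWorkItem for this case. The wrapper (BackgroundWork.Queue) and its onError callback are hypothetical; the important part is the try/catch, because an unhandled exception on a thread pool thread terminates the process.

```csharp
using System;
using System.Threading;

public static class BackgroundWork
{
    // Queues 'work' on the thread pool and returns immediately.
    public static void Queue(Action work, Action<Exception> onError)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                // Report instead of letting the exception escape the pool thread.
                onError(ex);
            }
        });
    }
}
```

In the controller action this would be called as BackgroundWork.Queue(() => SendNotifications(order), LogError), where SendNotifications and LogError are placeholders for the application's own methods, and the action then returns its result without waiting.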
Yes this is an appropriate time to use multi-threading.
One thing to look out for, though, is how you will tell the user when the email sending ultimately fails. Not blocking the user is a good step toward improving your UI. But it must not give a false sense of success when the send ultimately fails at a later time.
Don't know if any of the above links mentioned it, but don't forget to keep an eye on request timeout values, the queued items will still need to complete within that time period.
