I have an ASP.NET web application running on IIS 7 in web-garden mode. I want to clear runtime cache items across all worker processes in a single step. I could store a key-value flag in the database, but then a thread in every worker process, on every load-balanced web server, would have to poll that flag and flush the cache when it changes. That is a poor mechanism given that I flush cache items at most once per day. I also cannot use push notifications through SqlCacheDependency with Service Broker, because my database is MySQL. Any thoughts? Is there a dirty workaround?
One possible workaround: expose an aspx page and hit it multiple times using the IP and port on which the site is hosted instead of the domain name (e.g. http://ip.ip.ip.ip:82/CacheClear.aspx), so that requests for that page might reach all the worker processes on that web server, and clear the cache items in Page_Load. But this is a really dirty hack, and it may not work if all the requests end up in the same worker process.
You need to set up inter-process communication.
For caching there are two commonly used ways of doing this:
Set up a shared cache (memcached or the like).
Set up a message queue (e.g. MSMQ or RabbitMQ) and use it to spread state to the local caches.
A shared cache is the ultimate solution, as it means the whole cache is distributed, but it is also the most complex: it has to be set up so that the cache load is properly distributed between nodes, and you have to make sure it doesn't become a bottleneck.
The second option requires more code on your part but it is easier if you don't want to share the cache content (as in your case.)
The easiest is to setup a listener thread or task to handle the cache clear or individual entries invalidation messages. This thread will be dormant if there are no messages so the impact on performance is minimal.
You can also forgo the listener thread by handling messages as part of the usual IIS request pipeline, i.e. set up a filter/module that checks for messages in the queue and processes them before handling the request. Performance-wise, though, the first option is (slightly) better.
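As a rough illustration of the listener-thread approach, here is a minimal Python sketch. Everything in it is an assumption made for illustration: an in-process queue.Queue stands in for the external message queue (MSMQ, RabbitMQ or similar), a plain dict stands in for the runtime cache, and the message format ("clear" or a cache key) is invented.

```python
import queue
import threading

# Hypothetical stand-ins: `local_cache` plays the role of the per-process
# runtime cache, `messages` the role of the external message queue.
local_cache = {"user:1": "alice", "user:2": "bob", "config": "v1"}
cache_lock = threading.Lock()
messages = queue.Queue()

STOP = object()  # sentinel used only to shut the sketch down cleanly


def handle(message):
    """Apply one invalidation message to the local cache."""
    with cache_lock:
        if message == "clear":
            local_cache.clear()
        else:
            # Any other message is treated as a single key to invalidate.
            local_cache.pop(message, None)


def listener():
    """Block on the queue; the thread sleeps while no messages are waiting."""
    while True:
        message = messages.get()  # blocks without busy-waiting
        if message is STOP:
            break
        handle(message)


worker = threading.Thread(target=listener, daemon=True)
worker.start()

messages.put("user:1")  # invalidate one entry
messages.put(STOP)
worker.join()
```

Because the listener blocks on get(), it costs essentially nothing while the queue is empty, which is the point made above about the dormant thread.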
Related
How can I make a button called Kill Session on grid view to kill session for users in C# asp.net web forms?
The naive answer is to have each request check a persistent queue or table of such pending "session kills" before any processing, and conditionally abandon the session.
A quick optimization is to have the application listen globally (i.e. from appstart) and actively to the external queue, so that "kill requests" can be pushed to it immediately. You can then prepare them in a local thread-safe data structure so that normal HTTP requests aren't slowed down.
Your session management application simply sends messages to that queue.
Sidenote: Any "service bus" will usually do. It is rather important that the queue be persistent, as you need to handle scenarios where the application restarts while messages are queued up.
The proper answer is to communicate directly with the actual session handler. I don't believe there are any standard APIs to do what you want, so this will depend on the session handler (and likely on the host/server).
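The queue-listening optimization described above could look roughly like the following Python sketch. All names are hypothetical: killed_sessions is the local thread-safe structure, on_kill_message is what a queue listener would call, and the sessions dict stands in for whatever session store the application really uses.

```python
import threading

# Hypothetical names throughout: `killed_sessions` is the local thread-safe
# structure filled by the queue listener; `sessions` stands in for the
# real session store.
killed_sessions = set()
killed_lock = threading.Lock()
sessions = {"abc": {"user": "alice"}, "def": {"user": "bob"}}


def on_kill_message(session_id):
    """Called by the queue listener when a kill request arrives."""
    with killed_lock:
        killed_sessions.add(session_id)


def begin_request(session_id):
    """Run before normal processing; returns the session, or None if killed."""
    with killed_lock:
        killed = session_id in killed_sessions
        killed_sessions.discard(session_id)
    if killed:
        sessions.pop(session_id, None)  # abandon the session
        return None
    return sessions.get(session_id)
```

Each request only pays for an in-memory set lookup; the slow call to the external queue happens on the listener thread.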
We have an ASP MVC 3.0 application that reads data from the db using Entity framework (all on Azure). We have several long running queries (optimization has been done) and we want to make sure that the solution is scalable and prevent thread starvation.
We looked at async controllers and using I/O completion ports to run the query (using BeginExecute instead of the usual EF). However, async is hard to debug and increases the complexity of the code.
The proposed solution is as follows:
The web server (web role) gets a request that involves a long running query (example customer segmentation)
It enters the request information into a table along with the relevant parameters and returns thereby allowing the thread to process other requests.
We set a flag in the db that enables the UI to state that the query is in progress whenever a refresh to the page is done.
A worker role constantly queries this table and as soon as it finds this entry processes the long running query (customer segmentation) and updates the original customer table with the results.
In this case an immediate return of status to the users is not necessary. Users can check back within a couple of minutes to see if their request has been worked on. Instead of the table we were planning to use Azure Queues (but I guess Azure queues cannot notify a worker role, so a db table will do just fine). Is this a workable solution? Are there any pitfalls to doing it this way?
While Windows Azure Storage queues don't give you a notification after a message has been processed, you could implement that yourself (perhaps with Windows Azure Storage tables). The nice part about queues: They handle concurrency and failed attempts.
For instance: If you have 2 worker instances processing messages off the same queue, every time a queue message is read, the message goes invisible in the queue, for an amount of time you specify. While invisible, only the worker instance that read the message has it. If that instance finishes processing, it can just delete the queue message (and update your notification table). If it fails (maybe due to the role instance crashing), the message re-appears on the queue after the invisibility timeout expires. Going one step further: Let's say it's simply a bad message that causes your code to crash every time. You can check the dequeue count before processing the message. If it's greater than, say, 2, simply store the message in a dead-letter table and inspect it manually.
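To make the invisibility timeout and dequeue-count check concrete, here is a small in-memory imitation in Python. It is not the Azure Storage API: ToyQueue, process_next and the explicit `now` parameter are all invented for this sketch, and only the threshold of 2 comes from the text above.

```python
from dataclasses import dataclass

MAX_DEQUEUE_COUNT = 2  # threshold taken from the "say, 2" in the text


@dataclass(eq=False)
class Message:
    body: str
    dequeue_count: int = 0
    visible_at: float = 0.0  # time at which the message becomes visible again


class ToyQueue:
    """In-memory imitation of a queue with an invisibility timeout."""

    def __init__(self):
        self.messages = []

    def put(self, body):
        self.messages.append(Message(body))

    def get(self, now, visibility_timeout):
        """Return the first visible message and hide it for the timeout."""
        for message in self.messages:
            if message.visible_at <= now:
                message.dequeue_count += 1
                message.visible_at = now + visibility_timeout
                return message
        return None

    def delete(self, message):
        self.messages.remove(message)


def process_next(q, now, handler, dead_letters, visibility_timeout=30.0):
    """Read one message; dead-letter it if it has been dequeued too often."""
    message = q.get(now, visibility_timeout)
    if message is None:
        return False
    if message.dequeue_count > MAX_DEQUEUE_COUNT:
        dead_letters.append(message.body)  # keep it for manual inspection
        q.delete(message)
        return True
    handler(message.body)  # if this raises, the message is not deleted
    q.delete(message)
    return True
```

If the handler raises, the message is simply left in the queue and becomes visible again once the timeout has passed, which mirrors the crash-and-retry behaviour described above.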
One caveat with queues: the operations carried by queue messages need to be idempotent. A message is delivered at least once, so it may be processed more than once, and processing it again must leave exactly the same end state as processing it once.
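One common way to get idempotent processing is to record which messages have already been applied. The Python sketch below assumes each message carries a unique id; processed_ids, totals and apply_once are hypothetical names, and in a real system the record of processed ids would have to live in durable storage rather than in memory.

```python
processed_ids = set()  # hypothetical record of message ids already applied
totals = {"customer-1": 0}


def apply_once(message_id, customer, amount):
    """Apply the message only if it has not been applied before."""
    if message_id in processed_ids:
        return False  # duplicate delivery: no additional side effect
    totals[customer] += amount
    processed_ids.add(message_id)
    return True
```

Calling apply_once twice with the same message id changes the total only the first time.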
If you go with a table instead of a queue, you'll need to deal with scaling (multiple threads or role instances processing the table), and dead-letter handling.
This depends. If your worker role does nothing other than delegate the heavy work to a SQL database, it is a waste of resources and money; using a web role with async requests lets you reduce the cost. If the heavy work has to be done in the worker role itself, then it is a good approach.
You can also use AJAX or WebSocket. Start the database query and return the response immediately. The client can either poll the web role to see whether the query has finished (if you use HTTP), or the web role can notify the client directly (if you use WebSocket).
We have multiple services that do some heavy data processing that we'd like to put multiple copies of them across multiple servers. Basically the idea is this:
Create multiple copies of identical servers with the collection of services running on them
a separate server will have an executable stub that will be run to contact one of these servers (determined arbitrarily from a list) to begin the data processing
The first server to be contacted will become the "master" server and delegate the various data processing tasks to the other "slave" servers.
We've spent quite a bit of time figuring out how to architect this and I think the design should work quite well but I thought I'd see if anyone had any suggestions on how to improve this approach.
The solution is to use a load balancer.
I am a bit biased here, since I am from WSO2: the open source WSO2 ESB can be used as a load balancer, and it has the flexibility of load balancing and routing based on different criteria. It also supports failover (FO) load balancing.
Here are a few samples related to load balancing with WSO2 ESB...
You can download the product from here...
eBay is using WSO2 ESB to process more than 1 Billion transactions per day in their main stream API traffic...
The first server to be contacted will become the "master" server and delegate the various data processing tasks to the other "slave" servers.
That is definitely not how I would build this.
I would build this with the intent to use cloud computing (regardless of whether it uses true cloud computing or not). I would have a service that receives requests and saves them to a queue. I would then have multiple worker applications that each take an item from the queue, mark it as in process, and do whatever needs to be done. Upon completion the queue item is updated as done.
At this point I would either notify the client that the work is done, or you could have the client poll the server for reading the status of the queue.
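The queue-plus-workers design just described can be sketched in Python as follows. This is only an in-process illustration under stated assumptions: threads stand in for the separate worker applications, queue.Queue stands in for a durable queue, the status values are invented, and sum(payload) is a placeholder for the real data processing.

```python
import queue
import threading

requests = queue.Queue()
status = {}   # request id -> "queued" | "in process" | "done"
results = {}
status_lock = threading.Lock()


def submit(request_id, payload):
    """Receiving service: record the request and put it on the queue."""
    with status_lock:
        status[request_id] = "queued"
    requests.put((request_id, payload))


def worker():
    """Worker: take an item, mark it in process, do the work, mark it done."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: stop this worker
            break
        request_id, payload = item
        with status_lock:
            status[request_id] = "in process"
        outcome = sum(payload)  # placeholder for the real data processing
        with status_lock:
            results[request_id] = outcome
            status[request_id] = "done"


workers = [threading.Thread(target=worker) for _ in range(3)]
for t in workers:
    t.start()

submit("job-1", [1, 2, 3])
submit("job-2", [10, 20])

for _ in workers:
    requests.put(None)  # one sentinel per worker
for t in workers:
    t.join()
```

A polling client would simply read status[request_id]; no worker is special, so there is no "master" to elect or to lose.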
We have a large process in our application that runs once a month. This process typically runs in about 30 minutes and generates 342000 or so log events. Recently we updated our logging to a centralized model using WCF and are now having difficulty with performance. Whereas the previous solution would complete in about 30 minutes, with the new logging it now takes 3 or 4 hours. The problem seems to be that the application is actually waiting for the WCF request to complete before execution continues. The WCF method is already configured as IsOneWay, and I wrapped the client-side call to that WCF method in a different thread to try to prevent this type of problem, but it doesn't seem to have worked. I have thought about using the async WCF calls, but before trying something else I thought I would ask here to see if there is a better way to handle this.
342000 log events in 30 minutes, if I did my math correctly, comes out to 190 log events per second. I think your problem may have to do with the default throttling settings in WCF. Even if your method is set to one-way, the call can still block: if you're creating a new proxy for each logged event, it blocks while the proxy is created and the channel is opened, and if you're using an HTTP-based binding, it blocks until the message has been received by the service (an HTTP-based binding sends back a null response for a one-way method call when the message is received). The default WCF throttling limits concurrent instances to 10 on the service side, which means only 10 requests will be handled at a time and any further requests will get queued. Pair that with an HTTP binding, and any request after the first 10 will block at the client until it becomes one of the 10 being handled. Without knowing how your services are configured (instance mode, etc.) it's hard to say more than that, but if you're using per-call instancing, I'd recommend setting MaxConcurrentCalls and MaxConcurrentInstances on your ServiceBehavior to something much higher (the defaults are 16 and 10, respectively).
Also, to build on what others have mentioned about aggregating multiple events and submitting them all at once, I've found it helpful to set up a static Logger.LogEvent(eventData) method. That way it's simple to use throughout your code, and you can control in your LogEvent method how you want logging to behave throughout your application, such as configuring how many events should get submitted at a time.
Making a call to another process or remote service (i.e. calling a WCF service) is about the most expensive thing you can do in an application. Doing it 342,000 times is just sheer insanity!
If you must log to a centralized service, you need to accumulate batches of log entries and then, only when you have say 1000 or so in memory, send them all to the service in one hit. This will give you a reasonable performance improvement.
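A minimal Python sketch of that batching idea follows. BatchingLogger and send_batch are hypothetical names; send_batch stands in for the single remote call that ships a whole batch to the logging service. The sketch assumes a single calling thread; a multi-threaded caller would need a lock around the buffer.

```python
class BatchingLogger:
    """Accumulate log events and hand them off in batches."""

    def __init__(self, send_batch, batch_size=1000):
        self._send_batch = send_batch  # callable that ships one batch
        self._batch_size = batch_size
        self._buffer = []

    def log_event(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered; call this at the end of the process."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send_batch(batch)


sent = []  # each element is one batch that would go over the wire
logger = BatchingLogger(sent.append, batch_size=1000)
for i in range(2500):
    logger.log_event(f"event {i}")
logger.flush()  # push the final partial batch
```

With a batch size of 1000, the 342000 events from the question would need 342 remote calls instead of 342000.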
log4net has a buffering system that exists outside the context of the calling thread, so it won't hold up your call while it logs. Its usage should be clear from the many appender config examples - search for the term bufferSize. It's used on many of the slower appenders (eg. remoting, email) to keep the source thread moving without waiting on the slower logging medium, and there is also a generic buffering meta-appender that may be used "in front of" any other appender.
We use it with an AdoNetAppender in a system of similar volume and it works wonderfully.
There's always the traditional syslog; there are plenty of syslog daemons that run on Windows. It's designed to be a more efficient way of doing centralised logging than WCF, which is designed for less intensive operations, especially if you're not using the TCP/IP WCF configuration.
In other words, have a go with this - the correct tool for the job.
I'm creating an application that I want to put into the cloud. This application has one main function.
It hosts socket CLIENT sessions on behalf of other users (think of Beejive IM for the iPhone, where it hosts IM sessions for clients to maintain state on those IM networks, allowing the client to connect/disconnect at will, without breaking the IM network connection).
The way I've planned it now, one 'worker instance' can likely only handle a finite number of client sessions (let's say 50,000 for argument's sake). Those sessions will be very long-lived worker tasks.
The issue I'm trying to get my head around is that I will sometimes need to perform tasks to specific client sessions (eg: If I need to disconnect a client session). With Azure, would I be able to queue up a smaller task that only the instance hosting that specific client session would be able to dequeue?
Right now I'm contemplating GoGrid as my provider, and I solve this issue by using Apache's Active Messaging Queue software. My web app enqueues 'disconnect' tasks that are assigned to a specific instance Id. Each client session is therefore assigned to a specific instance id. The instance then only dequeues 'disconnect' tasks that are assigned to it.
I'm wondering if it's feasible to do something similar on Azure, and how I would generally do it. I like the idea of not having to set up many different VMs to scale, but instead just deploying a single package. Also, it would be nice to make use of Azure's Queues instead of integrating a third-party product such as Apache ActiveMQ, or even MSMQ.
I'd be very concerned about building a production application on Azure until the feature set, pricing, and licensing terms are finalized. For starters, you can't even do a cost comparison between it and e.g. GoGrid or EC2 or Mosso, so I don't see how it could possibly end up a front-runner. Also, we know that all of these systems will have glitches as they mature. Amazon's services are in much wider use than any of the others, and have been publicly available for many years. IMHO choosing Azure is a recipe for pain as they stabilize.
Have you considered Amazon's Simple Queue Service for queueing?
I think you can absolutely use Windows Azure for this. My recommendation would be to create a queue for each session you're tracking. Then enqueue the disconnect message (for example) on the queue for that session. The worker instance that's handling that connection should be the only one polling that queue, so it should handle performing the task on that connection.
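The queue-per-session routing suggested above can be illustrated with a small Python sketch. It uses in-memory queue.Queue objects rather than Azure queues, and every name in it (session_queues, instance_sessions, open_session, enqueue, poll) is made up for illustration.

```python
import queue

session_queues = {}     # session id -> queue dedicated to that session
instance_sessions = {}  # instance id -> set of session ids it hosts


def open_session(instance_id, session_id):
    """Create the session's queue and record which instance hosts it."""
    session_queues[session_id] = queue.Queue()
    instance_sessions.setdefault(instance_id, set()).add(session_id)


def enqueue(session_id, task):
    """Web app side: address a task to one specific session."""
    session_queues[session_id].put(task)


def poll(instance_id):
    """Worker side: collect pending tasks only for sessions hosted here."""
    pending = []
    for session_id in sorted(instance_sessions.get(instance_id, ())):
        q = session_queues[session_id]
        while True:
            try:
                pending.append((session_id, q.get_nowait()))
            except queue.Empty:
                break
    return pending
```

With tens of thousands of sessions per instance, polling one queue per session could get expensive; one queue per instance, as in the ActiveMQ setup described in the question, is the obvious alternative with the same routing effect.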
Regarding the application hosting socket connections for clients to connect to, I'd double-check on what's allowed as I think only HTTP and HTTPS connections are allowed to be made with Azure.