We have recently started using AWS free tier for our CRM product.
We are currently facing speed-related issues, so we are planning to change the EC2 instance type.
It's a .NET-based website using ASP.NET, C#, Microsoft SQL Server 2012 and IIS 7.
It would be great if someone could suggest the right EC2 instance for our usage. We are planning to use t2.medium, an MS SQL Enterprise license, Route 53, a 30 GB EBS volume, CloudWatch, SES and SNS. Are we missing anything here? Also, what would be the approximate monthly bill for this usage?
Thanks in advance. Cheers!!
It's impossible to say for sure what the issue is without some performance monitoring. If you haven't already, set up CloudWatch monitors. Personally I like to use monitoring services like New Relic, as they can dive deep into your system - down to the stored procedure and ASP.NET code level - to identify bottlenecks.
The primary reason for doing this is to identify whether your instance is maxing out on CPU usage or memory usage, swapping to disk, or whether your bottleneck is network bandwidth.
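If you want to create those alarms from code rather than the console, here's a minimal sketch using the AWS SDK for .NET - the alarm name, instance ID, threshold and SNS topic ARN are all placeholders (and note that memory and disk metrics require custom metrics or an agent; CPU is available out of the box):

```csharp
using System.Collections.Generic;
using Amazon;
using Amazon.CloudWatch;
using Amazon.CloudWatch.Model;

class AlarmSetup
{
    static void Main()
    {
        // Credentials come from the usual SDK locations (credentials file, environment, instance role).
        var cloudWatch = new AmazonCloudWatchClient(RegionEndpoint.USEast1);

        // Alarm when average CPU stays above 80% for two consecutive 5-minute periods.
        cloudWatch.PutMetricAlarm(new PutMetricAlarmRequest
        {
            AlarmName = "crm-web-high-cpu",                 // placeholder
            Namespace = "AWS/EC2",
            MetricName = "CPUUtilization",
            Dimensions = new List<Dimension>
            {
                new Dimension { Name = "InstanceId", Value = "i-0123456789abcdef0" }  // placeholder
            },
            Statistic = Statistic.Average,
            Period = 300,
            EvaluationPeriods = 2,
            Threshold = 80,
            ComparisonOperator = ComparisonOperator.GreaterThanThreshold,
            AlarmActions = new List<string> { "arn:aws:sns:us-east-1:123456789012:ops-alerts" }  // placeholder SNS topic
        });
    }
}
```

Since you're already planning to use SNS, pointing the alarm action at an SNS topic gives you e-mail alerts with no extra moving parts.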
That being said, as jas_raj mentioned, the t-series instances are burstable, meaning if you have steady heavy traffic, you won't get good use from them. They're better suited for occasional peaks in load.
The m-series will provide a more stable level of performance but, in some cases, can be exceeded in performance by a bursting t-series machine. When I run CMS, CRM and similar apps in EC2, I typically start with an M3 instance.
There are some other things to consider as well.
Consider putting your DB on RDS or on a separate server with high performance EBS volumes (EBS optimized, provisioned IOPS, etc.).
If you can, separate your app and session state (as well as the data layer) so you can consider using smaller EC2 instances but scale them based on traffic and demand.
As you can imagine, there are a lot of factors that go into performance, but I hope this helps.
You can calculate the pricing based on your options by using Amazon's Simple Monthly Calculator.
Regarding your usage, I don't have a lot of experience on the Windows side of AWS, but I would point out that the amount of CPU allocated to t2 instances is based on a credit system. If that suits your usage, fine; otherwise switch to a non-t2 instance for more predictable CPU performance.
If you have a good understanding of your application, I would suggest you check here for the differences between instance types and selection guidance.
I have come across a tool called ScaleArc which helps with handling automatic failover, caching, dynamic load balancing, etc.
Has anyone used it? And how can I download it and integrate it with SSMS?
I can attest to using ScaleArc successfully within our SSAS CRM automotive platform consisting of 4000+ Databases on 100+ SQL Servers in our production environment.
ScaleArc is an appliance that you add your SQL Server availability groups to; it provides caching of current login activity and customizable load balancing across all the replicas within that group. The only requirement is that you have SQL Server Always On configured on your servers, and ScaleArc does the rest; it's very easy to implement!
We have experienced automatic failovers within our single-point-of-failure servers and NONE of our customers even noticed. We have also conducted daytime maintenance that has required us to fail over to secondary nodes during business hours; again, no customer impact.
If uptime is of extreme concern to you, then ScaleArc is your answer. Prior to ScaleArc we experienced a lot of downtime; now we have a 99.9999% uptime record on our single-point-of-failure servers.
I hope this helps!
Michael Atkins, Director, IT Operations
I will second Mr. Atkins' experience, although my implementation was much smaller.
It enabled us to reduce our SQL Server footprint from 22 servers to 12, on smaller SKUs, dropping our overall costs by more than the ScaleArc investment. Add to that the increased uptime and the drop in issue investigations, and it was a great investment.
I put ScaleArc in place for a customer support forum and it does a great job with failovers. It also makes patching much easier, as you can leverage the script builder to take servers in and out of rotation gracefully and then add that to the run book for your patching and deployment activities. Our uptime went up considerably after the ScaleArc implementation, to >99.99%.
The automatic load balancing algorithm actually improved performance by sending calls to a 2nd datacenter 65ms away. When we saw this in the report, we were able to dig into the code and find two bugs that needed to be addressed.
I strongly recommend that you take a look at ScaleArc. It may not be right for every engagement, but is well worth the time to see.
Michael Schaeffer
Senior Business Program Manager
We are scraping a web-based API using Microsoft Azure. The issue is that there is SO much data to retrieve (there are combinations/permutations involved).
If we use a standard Web Job approach, we calculated it would take about 200 years to process all the data we want to get - and we would like our data to be refreshed every week.
Each request/response from the API takes about 0.5-1.0 seconds to process. The request size is on average 20,000 bytes and the average response is 35,000 bytes. I believe the total number of requests is in the millions.
Another way to think about this question would be: how would you use Azure to Web scrape - and make sure you don't overload (in terms of memory + network) the VM it's running on? (I don't think you need too much CPU processing in this case).
What we have tried so far:
Used Service Bus Queues/Worker Roles scaled to 8 small VMs - but this caused a lot of network errors to occur (there must be some network limit to how much EACH worker role VM can handle).
Used Service Bus Queues/Continuous Web Job scaled to 8 small VMs - but this seems to work more slowly, and even scaled out, it doesn't give us much control over what's happening behind the scenes. (We don't REALLY know how many VMs are up.)
It seems that these things are built for CPU calculation - not for Web/API scraping.
Just to clarify: I throw my requests into a queue, which then get picked up by my multiple VMs for processing to get the responses. That's how I was using the queues. Each VM was using the ServiceBusTrigger class as prescribed by Microsoft.
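Roughly, the processing function looks like the sketch below (the queue name is a placeholder and what we do with each response is simplified away):

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();
        config.UseServiceBus();                 // Service Bus connection string comes from app config
        new JobHost(config).RunAndBlock();
    }
}

public class Functions
{
    static readonly HttpClient Http = new HttpClient();

    // Invoked by the WebJobs SDK whenever a message lands on the queue.
    // The message body is the API URL to call.
    public static async Task ProcessQueueMessage(
        [ServiceBusTrigger("scrape-requests")] string requestUrl,
        TextWriter log)
    {
        string response = await Http.GetStringAsync(requestUrl);   // ~0.5-1.0 s per call
        await log.WriteLineAsync("Fetched " + response.Length + " bytes from " + requestUrl);
        // ... persist the response (blob/table/SQL) here ...
    }
}
```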
Is it better to have a lot of small VMs or a few massive VMs?
What C# classes should we be looking at?
What are the technical best practices when trying to do something like this on Azure?
Actually, a web scraper is something that I have had up and running in Azure for quite some time now :-)
AFAIK there is no 'magic bullet'. Scraping a lot of sources with deadlines is quite hard.
How it works (the most important things):
I use worker roles and C# code for the code itself.
For scheduling, I use queue storage. I put crawling tasks on the queue with a timeout (i.e. 'when to crawl this') and have the scraper pull them off. You can put triggers on the queue size to ensure you meet deadlines in terms of speed; personally I don't need them.
SQL Azure is slow, so I don't use that. Instead, I only use table storage for storing the scraped items. Note that updating data might be quite complex.
Don't use too much threading; instead, use async IO for all network traffic (there's a rough sketch of this queue-driven, async loop further down).
Also you might have to consider that extra threads require extra memory (parse trees can become quite big) - so there's a trade-off there... I do recall using some threads, but it's really just a few.
Note that this probably requires you to redesign and reimplement your complete web scraper if you're currently using a threaded approach. Then again, there are some benefits:
Table storage and queue storage are cheap.
I currently use a single Extra Small VM to scrape well over a thousand web sources.
Inbound network traffic is for free.
As such, the result is quite cheap as well; I'm sure it's much less than the alternatives.
As for classes that I use... well, that's a bit of a long list. I'm using HttpWebRequest for the async HTTP requests and the Azure SDK -- but all the rest is hand crafted (and not open source).
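To make that a bit more concrete, here is a rough, heavily simplified sketch of the queue-driven loop described above; the queue name, timeouts and re-crawl interval are illustrative, and the real thing also does retries, parsing and writing to table storage:

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

class ScraperLoop
{
    static async Task Run(string connectionString)
    {
        CloudQueue tasks = CloudStorageAccount.Parse(connectionString)
            .CreateCloudQueueClient()
            .GetQueueReference("crawl-tasks");      // placeholder queue name
        tasks.CreateIfNotExists();

        while (true)
        {
            // The visibility timeout hides the message while we work on it;
            // if the worker dies, the task reappears on the queue automatically.
            CloudQueueMessage msg = tasks.GetMessage(TimeSpan.FromMinutes(5));
            if (msg == null) { await Task.Delay(TimeSpan.FromSeconds(10)); continue; }

            string url = msg.AsString;
            var request = (HttpWebRequest)WebRequest.Create(url);

            // Async IO instead of one thread per download.
            using (var response = (HttpWebResponse)await request.GetResponseAsync())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                string html = await reader.ReadToEndAsync();
                // ... parse and store the scraped items in table storage here ...
            }

            tasks.DeleteMessage(msg);

            // Re-schedule the source: invisible until it is due to be crawled again.
            tasks.AddMessage(new CloudQueueMessage(url), null, TimeSpan.FromHours(1));
        }
    }
}
```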
P.S.: This doesn't just hold for Azure; most of this also holds for on-premise scrapers.
I have some experience with scraping so I will share my thoughts.
It seems that these things are built for CPU calculation - not for Web/API scraping.
They are built for dynamic scaling which given your task is not something you really need.
How to make sure you don't overload the VM?
Measure the response times and error rates and tune your code to lower them.
I don't think you need too much CPU processing in this case.
Depends on how much data is coming in each second and what you are doing with it. More complex parsing on quickly incoming data (if you decide to do it on the same machine) will eat up CPU pretty quickly.
8 small VMs caused a lot of network errors to occur (there must be some network limit)
The smaller the VMs, the fewer shared resources they get. There are throughput limits, and then there is the issue of your neighbours sharing the actual hardware with you. Often, the smaller your instance size, the more trouble you run into.
Is it better to have a lot small VMs or few massive VMs?
In my experience, smaller VMs are too crippled. However, your mileage may vary and it all depends on the particular task and its solution implementation. Really, you have to measure yourself in your environment.
What C# classes should we be looking at?
What are the technical best practices when trying to do something like this on Azure?
With high throughput scraping you should be looking at infrastructure. You will have different latency in different Azure datacenters, and different experience with network latency/sustained throughput at different VM sizes, and depending on who in particular is sharing the hardware with you. The best practice is to try and find what works best for you - change datacenters, VM sizes and otherwise experiment.
Azure may not be the best solution to this problem (unless you are on a spending spree). 8 small VMs is $450 a month. That is enough to pay for an unmanaged dedicated server with 256 GB of RAM, 40 hardware threads and 500 Mbps - 1 Gbps (or even up to several Gbps bursts) of quality network bandwidth without latency issues.
For your budget, you will have a dedicated server that you cannot overload. You will have more than enough RAM to deal with async pinning (if you decide to go async), or enough hardware threads for multi-threaded synchronous IO, which gives the best throughput (if you choose to go synchronous with a fixed-size threadpool).
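Whichever way you go, the key to not overloading the machine (or the API) is to cap concurrency explicitly. A minimal async sketch, with purely illustrative numbers and a placeholder URL, using SemaphoreSlim as the throttle:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class ThrottledFetcher
{
    // One shared client avoids socket exhaustion; 50 concurrent requests is an arbitrary cap.
    static readonly HttpClient Http = new HttpClient();
    static readonly SemaphoreSlim Throttle = new SemaphoreSlim(50);

    static async Task<string> FetchAsync(string url)
    {
        await Throttle.WaitAsync();            // wait for a free slot
        try
        {
            return await Http.GetStringAsync(url);
        }
        finally
        {
            Throttle.Release();
        }
    }

    static void Main()
    {
        IEnumerable<string> urls = Enumerable.Range(0, 1000)
            .Select(i => "https://api.example.com/items?page=" + i);   // placeholder API

        string[] bodies = Task.WhenAll(urls.Select(FetchAsync)).GetAwaiter().GetResult();
        Console.WriteLine("Fetched " + bodies.Length + " responses");
    }
}
```

Tune the semaphore count up or down based on the error rates and response times you measure.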
On a sidenote, depending on the API specifics, it might turn out that your main issue will be the API owner simply throttling you down to a crawl when you start to put too much pressure on the API endpoints.
The needs of my company are quite simple: we have a multi-threaded .NET computing program that reads many GB of binary files, performs massive calculations, and stores the results in a SQL Server database. We would like to do this in the cloud to perform this recurring task in the shortest time possible. So we are right in the middle of the cloud/grid/cluster computing thing.
I thought there would be tons of resources on the subject and plenty of available alternatives. I was simply stunned to figure out how wrong I was.
While launching/running EC2 instances was a breeze, finding a relatively simple and straightforward way to parallelize and aggregate the processing power of those EC2 instances was not easy. Amazon customer service keeps dodging the question and I was simply unable to get a concrete answer from them.
I found Utilify, which sounds promising. It is developed by the Alchemi people.
However, the documentation link is broken and I had no answer to my emails when I contacted support so this was not very reassuring.
We have chosen Amazon over Azure because AMIs are straightforward, seamless VMs (no need to "bundle" the app or anything else) and because EBS is more convenient storage, as it is a "real" filesystem. On the other hand, Azure seems HPC-ready for Windows, whereas AWS offers that for Linux-powered AMIs only.
Any help and propositions are more than welcome
EDIT:
The .NET application is multi-threaded and consists of hundreds of parallel workers doing exactly the same task asynchronously.
Amazon EC2 is inherently an Infrastructure as a Service (IaaS) system, which means that EC2 will give you the hardware and OS but will not solve your grid computing problem for you. This is in contrast to Windows Azure, which is a Platform as a Service (PaaS) system that requires a different architecture, where your application is broken out into different roles (web role, worker role, etc.) that can easily be scaled out into a grid. See this question for more details about IaaS vs PaaS.
The difference in deployment on Azure vs EC2 comes precisely from the fact that Azure requires you to think at a larger scale than EC2 does. If you want to scale on EC2 you have to do it on your own, or use Elastic Beanstalk, which currently only supports Java on Apache Tomcat.
As for how to design the system, my recommendation would be to find a way to break the problem down into chunks that can be processed on individual machines, and to load a message into a queue that describes how to perform the work. You would then have EC2 instances or Azure roles pull work out of the queue, perform the required calculations, and either store the results directly in the destination or send them to an output queue that aggregates the results. That is the simplest method of performing grid computing without completely redesigning around something like MapReduce. You do still need to worry about what happens if a VM dies before committing its results, but this can be managed by not deleting the queue entry until the results have been committed.
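On the EC2 side, a minimal sketch of such a worker loop using Amazon SQS from the AWS SDK for .NET; the queue URL is a placeholder, and ProcessChunk/StoreResult are hypothetical stand-ins for your own calculation and persistence code:

```csharp
using Amazon;
using Amazon.SQS;
using Amazon.SQS.Model;

class GridWorker
{
    const string QueueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/work-items";  // placeholder

    static void Main()
    {
        var sqs = new AmazonSQSClient(RegionEndpoint.USEast1);

        while (true)
        {
            var response = sqs.ReceiveMessage(new ReceiveMessageRequest
            {
                QueueUrl = QueueUrl,
                MaxNumberOfMessages = 1,
                WaitTimeSeconds = 20,       // long polling
                VisibilityTimeout = 600     // hide the message while this worker crunches it
            });

            foreach (var message in response.Messages)
            {
                // Hypothetical helpers: parse the chunk description, crunch it, persist the result.
                string result = ProcessChunk(message.Body);
                StoreResult(result);

                // Delete only after the result is safely committed; if the VM dies first,
                // the message becomes visible again and another worker picks it up.
                sqs.DeleteMessage(new DeleteMessageRequest
                {
                    QueueUrl = QueueUrl,
                    ReceiptHandle = message.ReceiptHandle
                });
            }
        }
    }

    static string ProcessChunk(string description) { /* heavy calculation goes here */ return description; }
    static void StoreResult(string result) { /* write to SQL Server or an output queue */ }
}
```

The same pattern maps directly onto Azure worker roles with storage or Service Bus queues.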
If you can go back to Azure rather than EC2, then:
David Pallman produced an example Grid project for Azure - http://azuregrid.codeplex.com/
the Lokad.Cloud project has some interesting framework code, including a simple Map-Reduce example - http://lokadcloud.codeplex.com/
Sorry - don't have any similar references for EC2 - although you may be able to get some inspiration from Microsoft's Dryad projects (I think these are currently only available under "educational" non-commercial license)
You should be looking at Windows HPC
Microsoft is working hard to deliver HPC nodes on Windows Azure, which is exactly what you're looking for. Here's a white paper on it:
http://download.microsoft.com/download/4/5/C/45C520F4-424C-41CF-A115-E76A38ADB280/Windows_HPC_Server_and_Windows_Azure.docx
from here:
http://www.microsoft.com/hpc/en/us/default.aspx
http://www.networkworld.com/news/2010/111610-microsoft-hpc-server.html
I have used ASP.NET mostly in intranet scenarios and am pretty familiar with it, but for something such as a shopping cart or similar session data there are various possibilities. To name a few:
1) State-Server session
2) SQL Server session
3) Custom database session
4) Cookie
What have you used, what were your successes or lessons learnt, and what would you recommend? This would obviously make a difference in a large-scale public website, so please comment on your experiences.
I have not mentioned in-proc since in a large-scale app this has no place.
Many thanks
Ali
The biggest lesson I learned was one I already knew in theory, but got to see in practice.
Removing all use of sessions entirely from an application (which does not necessarily mean all of the site) is something we all know should bring a big improvement in scalability.
What I learnt was just how much of an improvement it could be. By removing the use of sessions, and adding some code to handle what they had handled before (which at each individual point was a performance loss, as each individual point was now doing more work than before), the overall gain was massive: actions that used to take many seconds, or even a couple of minutes, became sub-second; CPU usage became a fraction of what it had been; and the number of machines and amount of RAM went from clearly not enough to cope to a rather over-indulgent amount of hardware.
If sessions cannot be removed entirely (people don't like the way browsers use HTTP authentication, alas), moving most of their use into a few well-defined spots, ideally in a separate application on the server, can have a bigger effect than which session-storage method is used.
In-proc certainly can have a place in a large-scale application; it just requires sticky sessions at the load-balancing level. In fact, the reduced maintenance cost and infrastructure overhead of using in-proc sessions can be considerable. Any enterprise-grade content switch you'd be using in front of your farm would certainly offer such functionality, and it's hard to argue for the cash and manpower of purchasing/configuring/integrating state servers versus just flipping a switch. I am using this in quite large-scale ASP.NET systems with no issues to speak of. RAM is far too cheap to ignore this as an option.
In-proc session (at least when using IIS6) can recycle at any time and is therefore not very reliable because the sessions will end when the server decides, not when the session actually times out. The sessions will also expire when you deploy a new version of the web site, which is not true of server-based session providers. This can potentially give your users a bad experience if you update in the middle of their session.
Using SQL Server is the best option because it is possible to have sessions that never expire. However, the cost of the server, disk space, maintenance and performance all have to be considered. I was using one on my e-commerce app for several years until we changed providers to one with very little database space. It was a shame that it had to go.
We have been using the state service for about 3 years now and haven't had any issues. That said, we now have the session timeout set at an hour, and in e-commerce that is probably costing us some business versus the never-expire model.
When I worked for a large company, we used a clustered SQL Server in another application that was more critical to keep online. We had multiple redundancy on every part of the system, including the network cards. Keep in mind that adding a state server or service adds a potential single point of failure for the application unless you go the clustered route, which is more expensive to maintain.
There was also an issue when we first switched to the SQL based approach where binary objects couldn't be serialized into session state. I only had a few and modified the code so it wouldn't need the binary serialization so I could get the site online. However, when I went back to fix the serialization issue a few weeks later, it suddenly didn't exist anymore. I am guessing it was fixed in a Windows Update.
If you are concerned about security, state server is a no-no. State server performs absolutely no access checks; anybody who is granted access to the TCP port the state server uses can access or modify any session state.
In-proc is unreliable (and you mentioned that), so that's not worth considering.
Cookies aren't really a session-state replacement, since you can't store much data there.
I vote for database-based storage of some kind (if any is needed at all); it has the best potential to scale.
Following on from this question...
What to do when you’ve really screwed up the design of a distributed system?
... the client has reluctantly asked me to quote for option 3 (the expensive one), so they can compare prices to a company in India.
So, they want me to quote (hmm). In order to get this as accurate as possible, I need to decide how I'm actually going to do it. Here are 3 scenarios...
Scenarios
Split the database
My original idea (perhaps the trickiest) will yield the best speed on both the website and the desktop application. However, it may require some synchronising between the two databases, as the two "systems" are so heavily connected. If not done properly and not tested thoroughly, synchronisation can, as I've learnt, be hell on earth.
Implement caching on the smallest system
To side-step the sync option (which I'm not fond of), I figured it may be more productive (and cheaper) to move the entire central database and web service to their office (i.e. in-house), and have the website (still on the hosted server) download data from the central office and store it in a small database (acting as a cache)...
Set up a new server in the customer's office (in-house).
Move the central database and web service to the new in-house server.
Keep the web site on the hosted server, but alter the web service URL so that it points to the office server.
Implement a simple cache system for images and the most frequently accessed data (such as product information) - there's a rough sketch of this a little further down.
... the down-side is that when the end-user in the office updates something, their customers will effectively be downloading the data from a 60KB/s upload connection (albeit once, as it will be cached).
Not all data can be cached either, for example when a customer updates their order. Connection redundancy also becomes a huge factor here: what if the office connection is offline? There's nothing to do but show an error message to the customers, which is nasty, but a necessary evil.
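For the cache itself I'm picturing something as simple as the sketch below - MemoryCache on the hosted website, falling back to the office web service on a miss. GetProductFromOfficeService is a stand-in for the existing web-service call, and the 30-minute expiry is arbitrary:

```csharp
using System;
using System.Runtime.Caching;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ProductCache
{
    static readonly MemoryCache Cache = MemoryCache.Default;

    public Product GetProduct(int productId)
    {
        string key = "product:" + productId;

        var cached = (Product)Cache.Get(key);
        if (cached != null)
            return cached;      // served locally, no trip over the office's slow uplink

        // Cache miss: fetch once over the office connection, then keep it for a while.
        Product product = GetProductFromOfficeService(productId);
        Cache.Set(key, product, new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(30)
        });
        return product;
    }

    // Stand-in for the real web-service call to the in-house server.
    Product GetProductFromOfficeService(int productId)
    {
        return new Product { Id = productId, Name = "fetched from office" };
    }
}
```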
Mystery option number 3
Suggestions welcome!
SQL replication
I had considered MS SQL replication, but I have no experience with it, so I'm worried about how conflicts are handled, etc. Is this even an option, considering there are physical files involved and so on? Also, I believe we'd need to upgrade from SQL Server Express to a paid edition and buy two licences.
Technical
Components
ASP.Net website
ASP.net web service
.Net desktop application
MSSQL 2008 express database
Connections
Office connection: 8 mbit down and 1 mbit up contended line (50:1)
Hosted virtual server: Windows 2008 with 10 megabit line
Having just read your original question related to this for the first time, I'd say you may already have laid the foundation for resolving the problem, simply because you are communicating with the database via a web service.
This web service may well be the saving grace as it allows you to split the communications without affecting the client.
A good while back I was involved in designing just such a system.
The first thing we identified was the data which rarely changes - and we immediately locked all of this out of consideration for distribution. A manual administration process via the web server was the only way to change this data.
The second thing we identified was the data that should be owned locally. By this I mean data that only one person or location at a time would need to update, but that may need to be viewed at other locations. We fixed all of the keys on the related tables to ensure that duplication could never occur and that no auto-incrementing fields were used.
The third item was the tables that were truly shared - and although we worried a lot about these during stages 1 and 2, in our case this part was straightforward.
When I'm talking about a server here I mean a DB Server with a set of web services that communicate between themselves.
As designed our architecture had 1 designated 'master' server. This was the definitive for resolving conflicts.
The rest of the servers were, in the first instance, a large cache of anything covered by item 1. In fact it wasn't so much a cache as a database duplicate, but you get the idea.
The second function of the each non-master server was to coordinate changes with the master. This involved a very simplistic process of actually passing through most of the work transparently to the master server.
We spent a lot of time designing and optimising all of the above - to finally discover that the single best performance improvement came from simply compressing the web service requests to reduce bandwidth (but it was over a single channel ISDN, which probably made the most difference).
The fact is that if you do have a web service then this will give you greater flexibility about how you implement this.
I'd probably start by investigating the feasibility of implementing one of the SQL Server replication methods.
Usual disclaimers apply:
Splitting the database will not help a lot, but it will add a lot of headaches. IMO, you should first try to optimize the database: update some indexes or maybe add several more, optimize some queries and so on. For database performance tuning I recommend reading some articles from simple-talk.com.
Also, in order to save bandwidth, you can add bulk processing to your Windows client and add zipping (compression) to your web service.
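For the zipping part, a minimal sketch using GZipStream over the serialized payload (your web service stack may also offer transparent HTTP compression, which is even simpler to switch on):

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

static class PayloadCompression
{
    // Compress a serialized response (e.g. XML/JSON) before sending it over the slow link.
    public static byte[] Compress(string payload)
    {
        byte[] raw = Encoding.UTF8.GetBytes(payload);
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
                gzip.Write(raw, 0, raw.Length);
            return output.ToArray();    // ToArray still works after the streams are closed
        }
    }

    // The Windows client reverses the process.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Text-heavy payloads such as XML typically compress very well, so this alone can noticeably reduce the load on the upload connection.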
You should probably also upgrade to MS SQL 2008 Express; it's free as well.
It's hard to recommend a good solution for your problem with the information I have; it's not clear where the bottleneck is. I strongly recommend you profile your application to find the exact location of the bottleneck (e.g. is it in the database, in a fully saturated connection, and so on) and add a description of it to the question.
EDIT 01/03:
When the bottleneck is the upload connection, you can only do the following:
1. Add compression (archiving) of messages to the service and client.
2. Implement bulk operations and use them (there's a rough sketch of a batched contract at the end of this answer).
3. Try to reduce the number of operations per use case for the most frequent cases.
4. Add a local database for the Windows clients, perform all operations against it, and synchronize the local DB with the main one on a timer.
SQL replication will not help you much in this case. The fastest and cheapest solution is to increase the upload connection, because all the other options (except the first one) will take a lot of time to implement.
If you choose to rewrite the service to support bulking, I recommend you have a look at the Agatha project.
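On the bulk-operations point, the idea is simply to replace N chatty calls with one call that carries N items. A rough sketch of what such a contract could look like - the DTO names are made up, and Agatha gives you a fuller request/response pipeline out of the box:

```csharp
using System.Collections.Generic;
using System.Runtime.Serialization;

// One round trip carries many order updates instead of one service call per update.
[DataContract]
public class UpdateOrdersBatchRequest
{
    [DataMember]
    public List<OrderUpdate> Updates { get; set; }
}

[DataContract]
public class OrderUpdate
{
    [DataMember] public int OrderId { get; set; }
    [DataMember] public string Status { get; set; }
    [DataMember] public decimal Total { get; set; }
}

[DataContract]
public class UpdateOrdersBatchResponse
{
    // One entry per update, in the same order as the request; null/empty means success.
    [DataMember]
    public List<string> Errors { get; set; }
}
```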
Actually, hearing how many users they have on that one connection, it may be time to increase the bandwidth at the office (not at all my normal response). If you factor out the CRM system, what else is a top user of the bandwidth? It may be that they have simply reached the point of needing more bandwidth, period.
But I am still curious to see how much of the information you are passing actually gets used. Make sure you are transferring data efficiently - is there any chance you could add some quick and easy measurements to see how much data people actually consume when looking at it?