I am using an SOA architecture for a project on the Microsoft .NET 3.5 platform. Can you give me steps, tools, guidelines, or pointers on the shortest and fastest route to finding the methods that cause the major hardware bottlenecks, such as CPU time and memory usage? Please also suggest ways to improve throughput, scalability, and response time.
Regards/Anand
I don't know of any "short and fast route" to finding any kind of bottleneck. This is how I would approach the problem:
We usually generate logs for general time measures. You could inject a WCF behaviour which logs duration of each server method call. You could produce statistics from that. Consider duration of a method call and also number of calls to the same method (only optimize frequent method calls).
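As a rough illustration of the "duration plus number of calls" idea, here is a minimal sketch. It is not WCF-specific: `CallStats`, `Measure`, and `Record` are hypothetical names, and in a real service you would call something like this from whatever behaviour or interceptor you inject.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper: collects call count and total duration per method name.
public sealed class CallStats
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, long[]> _data = new Dictionary<string, long[]>();

    // Times a single call and records it under the given name.
    public T Measure<T>(string name, Func<T> call)
    {
        var sw = Stopwatch.StartNew();
        try { return call(); }
        finally
        {
            sw.Stop();
            Record(name, sw.ElapsedTicks);
        }
    }

    public void Record(string name, long elapsedTicks)
    {
        lock (_gate)
        {
            long[] entry;
            if (!_data.TryGetValue(name, out entry))
            {
                entry = new long[2]; // [0] = call count, [1] = total Stopwatch ticks
                _data[name] = entry;
            }
            entry[0]++;
            entry[1] += elapsedTicks;
        }
    }

    public long CallCount(string name)
    {
        lock (_gate)
        {
            long[] entry;
            return _data.TryGetValue(name, out entry) ? entry[0] : 0;
        }
    }

    // Method names ordered by total time spent, most expensive first.
    public List<string> ByTotalTime()
    {
        lock (_gate)
        {
            return _data.OrderByDescending(p => p.Value[1]).Select(p => p.Key).ToList();
        }
    }
}
```

Sorting by total time (duration multiplied by frequency) rather than by single-call duration is what surfaces the frequently called methods worth optimizing.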
Memory is more complicated. To measure the memory used by a single method you need to call that method in isolation, and the result mostly depends on the data that already exists. There are tools to hunt memory leaks, if you intend to do this.
I found most unnecessary performance problems by observing database activity (e.g. using SQL Server Profiler).
We are scraping a Web-based API using Microsoft Azure. The issue is that there is SO much data to retrieve (there are combinations/permutations involved).
If we use a standard Web Job approach, we calculated it would take about 200 years to process all the data we want to get, and we would like our data to be refreshed every week.
Each request/response from the API takes about 0.5-1.0 seconds to process. The average request size is 20000 bytes and the average response is 35000 bytes. I believe the total number of requests is in the millions.
Another way to think about this question would be: how would you use Azure to Web scrape - and make sure you don't overload (in terms of memory + network) the VM it's running on? (I don't think you need too much CPU processing in this case).
What we have tried so far:
Used Service Bus Queues/Worker Roles scaled to 8 small VMs - but this caused a lot of network errors to occur (there must be some network limit to how much EACH worker role VM can handle).
Used Service Bus Queues/Continuous Web Job scaled to 8 small VMs - but this seems to work slower - and even scaled, doesn't give us too much control on what's happening behind the scenes. (We don't REALLY know how many VMs are up).
It seems that these things are built for CPU calculation - not for Web/API scraping.
Just to clarify: I throw my requests into a queue, and they then get picked up by my multiple VMs for processing to get the responses. That's how I was using the queues. Each VM was using the ServiceBusTrigger class as prescribed by Microsoft.
Is it better to have a lot of small VMs or a few massive VMs?
What C# classes should we be looking at?
What are the technical best practices when trying to do something like this on Azure?
Actually a web scraper is something that I have up and running, in Azure, for quite some time now :-)
AFAIK there is no 'magic bullet'. Scraping a lot of sources with deadlines is quite hard.
How it works (the most important things):
I use worker roles and C# code for the code itself.
For scheduling, I use the queue storage. I put crawling tasks on the queue with a timeout (e.g. 'when to crawl then') and have the scraper pull them off. You can put triggers on the queue size to ensure you meet deadlines in terms of speed -- personally I don't need them.
SQL Azure is slow, so I don't use that. Instead, I only use table storage for storing the scraped items. Note that updating data might be quite complex.
Don't use too much threading; instead, use async IO for all network traffic.
Also you might have to consider that extra threads require extra memory (parse trees can become quite big) - so there's a trade-off there... I do recall using some threads, but it's really just a few.
Note that this probably does require you to redesign and reimplement your complete web scraper if you're currently using a threaded approach. Then again, there are some benefits:
Table storage and queue storage are cheap.
I currently use a single Extra Small VM to scrape well over a thousand web sources.
Inbound network traffic is free.
As such, the result is quite cheap as well; I'm sure it's much less than the alternatives.
As for classes that I use... well, that's a bit of a long list. I'm using HttpWebRequest for the async HTTP requests and the Azure SDK -- but all the rest is hand crafted (and not open source).
P.S.: This doesn't just hold for Azure; most of this also holds for on-premise scrapers.
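To make the "async IO instead of many threads" advice a bit more concrete, here is a minimal sketch of bounded concurrent fetching. It assumes a newer C# than the answer was written for (async/await), and `BoundedFetcher`, `FetchAllAsync`, and `maxInFlight` are hypothetical names. The actual HTTP call is passed in as a delegate, so the sketch does not claim any particular HTTP class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedFetcher
{
    // fetch is supplied by the caller (for example a wrapper around an async
    // HTTP request). maxInFlight caps how many requests are outstanding at once.
    public static async Task<string[]> FetchAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch,
        int maxInFlight)
    {
        using (var gate = new SemaphoreSlim(maxInFlight))
        {
            var tasks = urls.Select(async url =>
            {
                await gate.WaitAsync();      // waits without blocking a thread
                try { return await fetch(url); }
                finally { gate.Release(); }
            }).ToList();                     // ToList starts all tasks; the semaphore holds most back

            // Results come back in the same order as the input URLs.
            return await Task.WhenAll(tasks);
        }
    }
}
```

No thread is parked while a request is in flight, so memory use is driven by `maxInFlight` and the size of the responses rather than by a thread count. Tuning `maxInFlight` against measured error rates is one way to avoid overloading a small VM.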
I have some experience with scraping so I will share my thoughts.
It seems that these things are built for CPU calculation - not for Web/API scraping.
They are built for dynamic scaling which given your task is not something you really need.
How to make sure you don't overload the VM?
Measure the response times and error rates and tune your code to lower them.
I don't think you need too much CPU processing in this case.
Depends on how much data is coming in each second and what you are doing with it. More complex parsing on quickly incoming data (if you decide to do it on the same machine) will eat up CPU pretty quickly.
8 small VMs caused a lot of network errors to occur (there must be some network limit)
The smaller the VMs the less shared resources they get. There are throughput limits and then there is an issue with your neighbors sharing the actual hardware with you. Often, the smaller your instance size the more trouble you run into.
Is it better to have a lot of small VMs or a few massive VMs?
In my experience, smaller VMs are too crippled. However, your mileage may vary and it all depends on the particular task and its solution implementation. Really, you have to measure yourself in your environment.
What C# classes should we be looking at?
What are the technical best practices when trying to do something like this on Azure?
With high throughput scraping you should be looking at infrastructure. You will have different latency in different Azure datacenters, and different experience with network latency/sustained throughput at different VM sizes, and depending on who in particular is sharing the hardware with you. The best practice is to try and find what works best for you - change datacenters, VM sizes and otherwise experiment.
Azure may not be the best solution to this problem (unless you are on a spending spree). 8 small VMs is $450 a month. That is enough to pay for an unmanaged dedicated server with 256 GB of RAM, 40 hardware threads and 500 Mbps - 1 Gbps (or even bursts of up to several Gbps) of quality network bandwidth without latency issues.
For your budget, you will have a dedicated server that you cannot overload. You will have more than enough RAM to deal with async pinning (if you decide to go async), or enough hardware threads for multi-threaded synchronous IO, which gives the best throughput (if you choose to go synchronous with a fixed-size thread pool).
On a sidenote, depending on the API specifics, it might turn out that your main issue will be the API owner simply throttling you down to a crawl when you start to put too much pressure on the API endpoints.
Are there any tips, tricks and techniques to prevent or minimize slowdowns or temporary freeze of an app because of the .NET GC?
Maybe something along the lines of:
Try to use structs if you can, unless the data is too large or will be mostly used inside other classes, etc.
The description of your App does not fit the usual meaning of "realtime". Realtime is commonly used for software that has a max latency in milliseconds or less.
You have a requirement of responsiveness to the user, meaning you could probably tolerate an incidental delay of 500 ms or more. 100 ms won't be noticed.
Luckily for you, the GC won't cause delays that long. And if it did you could use the Server (background) version of the GC, but I know little about the details.
But if your "user experience" does suffer, it probably won't be the GC.
IMHO, if the performance of your application is being affected noticeably by the GC, something is wrong. The GC is designed to work without intervention and without significantly affecting your application. In other words, you shouldn't have to code with the details of the GC in mind.
I would examine the structure of your application and see where the bottlenecks are, maybe using a profiler. Maybe there are places where you could reduce the number of objects that are being created and destroyed.
If parts of your application really need to be real-time, perhaps they should be written in another language that is designed for that sort of thing.
Another trick is to use GC.RegisterForFullGCNotification on the back-end.
Let's say that you have a load-balancing server and N app servers. When the load balancer receives information about a possible full GC on one of the servers, it forwards requests to the other servers for some time, so the SLA will not be affected by the GC (which is especially useful for x64 boxes, where more than 4 GB can be addressed).
Updated
No, unfortunately I don't have any code, but there is a very simple example on MSDN with dummy methods like RedirectRequests and AcceptRequests, which can be found here: Garbage Collection Notifications
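For what it's worth, a rough sketch of the idea could look like the following. This is not the MSDN sample: `GcAwareServer` and its members are hypothetical, the thresholds of 10 are arbitrary, and the `AcceptingRequests` flag stands in for whatever signal your load balancer actually polls. The notification API is generally documented as unavailable when concurrent GC is enabled, so the sketch treats registration failure as "not supported here".

```csharp
using System;
using System.Threading;

public sealed class GcAwareServer
{
    private volatile bool _accepting = true;

    // A health check polled by the load balancer could expose this flag.
    public bool AcceptingRequests { get { return _accepting; } }

    public void OnFullGcApproaching() { _accepting = false; }
    public void OnFullGcCompleted() { _accepting = true; }

    // Returns false if full-GC notifications cannot be registered
    // (for example because concurrent GC is enabled).
    public bool TryStartWatching()
    {
        try { GC.RegisterForFullGCNotification(10, 10); } // thresholds are arbitrary here
        catch (InvalidOperationException) { return false; }

        var watcher = new Thread(() =>
        {
            while (true)
            {
                if (GC.WaitForFullGCApproach() != GCNotificationStatus.Succeeded) break;
                OnFullGcApproaching();
                if (GC.WaitForFullGCComplete() != GCNotificationStatus.Succeeded) break;
                OnFullGcCompleted();
            }
            OnFullGcCompleted(); // fail open if notifications stop arriving
        });
        watcher.IsBackground = true;
        watcher.Start();
        return true;
    }
}
```

The MSDN example goes a step further and forces a collection once traffic has been redirected. Whether that is worthwhile depends on your workload.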
I am getting ready to perform a series of performance comparisons of various off-the-shelf products.
What do I need to do to show credibility in the tests? How do I design my benchmark tests so that they are respectable?
I am also interested in any suggestions on the actual design of the tests: ways to load data without affecting the tests (Heisenberg Uncertainty Principle), ways to monitor, etc.
This is a bit tricky to answer without knowing what sort of "off the shelf" products you are trying to assess. Are you looking for UI responsiveness, throughput (e.g. email, transactions/sec), startup time, etc - all of these have different criteria for what measures you should track and different tools for testing or evaluating. But to answer some of your general questions:
Credibility - this is important. Try to make sure that whatever you are measuring has little run to run variance. Utilize the technique of doing several runs of the same scenario, get rid of outliers (i.e. your lowest and highest), and evaluate your avg/max/min/median values. If you're doing some sort of throughput test, consider making it long running so you have a good sample set. For example, if you are looking at something like Microsoft Exchange and thus are using their perf counters, try to make sure you are taking frequent samples (once per sec or every few secs) and have the test run for 20mins or so. Again, chop off the first few mins and the last few mins to eliminate any startup/shutdown noise.
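The "several runs, drop the outliers, look at avg/max/min/median" step can be sketched like this. `BenchmarkStats` and `RunSummary` are hypothetical names, and dropping exactly one lowest and one highest sample is just one reasonable reading of the advice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class RunSummary
{
    public double Min, Max, Mean, Median;
}

public static class BenchmarkStats
{
    // Drops the single lowest and single highest sample, then summarizes the rest.
    public static RunSummary Summarize(IEnumerable<double> samples)
    {
        var sorted = samples.OrderBy(x => x).ToList();
        if (sorted.Count < 3)
            throw new ArgumentException("Need at least three runs.", "samples");

        var kept = sorted.Skip(1).Take(sorted.Count - 2).ToList();
        int mid = kept.Count / 2;
        double median = kept.Count % 2 == 1
            ? kept[mid]
            : (kept[mid - 1] + kept[mid]) / 2.0;

        return new RunSummary
        {
            Min = kept[0],
            Max = kept[kept.Count - 1],
            Mean = kept.Average(),
            Median = median
        };
    }
}
```

For example, the run times 12, 10, 11, 50, 9, 10 are summarized over 10, 10, 11, 12, which gives a mean of 10.75 and a median of 10.5. A large gap between mean and median after trimming is a hint that run-to-run variance is still too high.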
Heisenberg - tricky. In most modern systems, depending on what application/measures you are measuring, you can minimize this impact by being smart about what/how you are measuring. Sometimes (like in the Exchange example), you'll see near 0 impact. Try to use the least invasive tools possible. For example, if you're measuring startup time, consider using xperfinfo and utilize the events built into the kernel. If you're using perfmon, don't flood the system with extraneous counters that you don't care about. If you're doing some extremely long-running test, ratchet down your sampling interval.
Also try to eliminate any sources of environment variability or possible sources of noise. If you're doing something network intensive, consider isolating the network. Try to disable any services or applications that you don't care about. Limit any sort of disk IO, memory intensive operations, etc. If disk IO might introduce noise in something that is CPU bound, consider using SSD.
When designing your tests, keep repeatability in mind. If you're doing some sort of microbenchmark-type testing (e.g. perf unit test), then have your infrastructure support running the same operation n times in exactly the same way. If you're driving UI, try not to physically drive the mouse and instead use the underlying accessibility layer (MSAA, UIAutomation, etc.) to hit controls directly programmatically.
Again, this is just general advice. If you have more specifics then I can try to follow up with more relevant guidance.
Enjoy!
Your question is very interesting, but a bit vague, because without knowing what to test it is not easy to give you some clues.
You can test performance from many different angles, then, depending on the use or target of the library you should try one approach or another; I will try to enumerate some of the things you may have to consider for measurement:
Multithreading: if the library uses it, or your software will use the library in a multithreaded context, then you may have to test it with many different processor and multiprocessor configurations to see how it reacts.
Startup time: its importance depends on how intensively you will use the library and on the nature of the product being built with it (client, server, ...).
Response time: for this, do not take the first execution; execute the same call many times after the first one and take an average. System.Diagnostics.Stopwatch can be very useful for that.
Memory consumption: analyze the growth, and beware of exponential growth ;). Go a step further and measure the number of objects being created and disposed.
Responsiveness: you should not only measure raw performance; how fast the product feels to the user is very important too.
Network: if the library uses resources on the network, you may have to test it with different bandwidth and latency configurations. There is software to simulate these situations.
Data: try to create many different test data packages, covering for example: one big chunk of raw data, a large set made of many smaller chunks, a long iteration over small pieces of data, ...
Tools:
System.Diagnostics.Stopwatch: essential for benchmarking method calls
Performance counters: whenever available they are very useful to know what’s happening inside, allowing you to monitor the software without affecting its performance.
Profilers: there are some good memory and performance profilers in the market, but as you said, they always affect the measurements. They are good for finding bottlenecks in your software, but I don’t think you can use them for a comparison test.
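The response-time advice above (skip the first execution, repeat the call, average) can be sketched with Stopwatch as follows. `MicroBenchmark` and `AverageMilliseconds` are hypothetical names, and a single warm-up call is an assumption; some libraries need more.

```csharp
using System;
using System.Diagnostics;

public static class MicroBenchmark
{
    // Returns the average time per call in milliseconds, excluding one warm-up call.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException("iterations");

        action(); // warm-up: JIT compilation, caches, lazy initialization

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds / iterations;
    }
}
```

Timing the whole loop once, instead of starting and stopping the Stopwatch around every call, keeps the measurement overhead out of very short operations.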
Why do you care about the performance? In both cases the time taken to write the message to wherever you are storing your log will be a lot greater than anything else.
If you are really doing that much logging, then you are likely to need to index your log files so you can find the log entry you need, and at that point you are not doing standard logging.
I am trying to get some detailed performance information from an application my company is developing. Examples of information I am trying to get would be how long a network transaction takes, how much CPU/memory the application is using, how long it takes for a given method to complete, etc.
I have had some failed attempts at this in the past (like trying to measure small time periods by using DateTime.Now). On top of that I don't know much of anything about getting CPU and memory statistics. Are there any good .Net classes, libraries, or frameworks out there that would help me collect this sort of information and/or log it to a text file?
What you are looking for is Performance Counters. For .NET you need the PerformanceCounter class.
Performance counters are one way to go, and together with the System.Diagnostics.Stopwatch class they are good foundational places to start.
With performance counters (beyond those provided) you will need to manage both the infrastructure of tracking the events, as well as reporting the data. The performance counter base classes supply the connection details for hooking up to the event log, but you will need to provide other reporting infrastructure if you need to report the data in another way (such as to a log file, or database).
The Stopwatch class is a wrapper around the high-resolution performance timer, giving you microsecond or better resolution depending on the processor and the platform (see Stopwatch.IsHighResolution and Stopwatch.Frequency). If you do not need that much resolution, you can take the difference of two System.DateTime.Now.Ticks values. Ticks are counted in 100-nanosecond units, but on many systems the underlying clock only advances every several milliseconds, so this is suitable for coarse measurements only, not for timing short operations.
When tracking CPU statistics be aware that multiple processors and multiple cores will complicate any accurate statistics in some cases.
One last caution with performance counters, be aware that not all performance counters are on all machines. For instance ASP.NET counters are not present on a machine which does not have IIS installed, etc.
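If all you need is a quick CPU and memory reading for your own process written to a text file, a simpler route than custom performance counters is the System.Diagnostics.Process class. Note that this is a swapped-in technique, not the performance counter API discussed above. A minimal sketch follows; `ProcessSnapshot`, `Describe`, and the log line format are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

public static class ProcessSnapshot
{
    // Builds one log line with cumulative CPU time and current memory figures
    // for the running process.
    public static string Describe()
    {
        using (var p = Process.GetCurrentProcess())
        {
            return string.Format(
                CultureInfo.InvariantCulture,
                "{0:o} cpu_ms={1:F0} working_set_bytes={2} managed_heap_bytes={3}",
                DateTime.UtcNow,
                p.TotalProcessorTime.TotalMilliseconds, // CPU time summed over all cores
                p.WorkingSet64,                         // physical memory in use
                GC.GetTotalMemory(false));              // managed heap estimate, no forced GC
        }
    }
}
```

A line can then be appended with something like `System.IO.File.AppendAllText("perf.log", ProcessSnapshot.Describe() + Environment.NewLine)`. Because the CPU figure is cumulative, the difference between two snapshots divided by the elapsed wall-clock time gives an approximate CPU load, which can exceed 100% on multi-core machines.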
For a modern open-source library for performance metrics and monitoring, consider using App Metrics.
GitHub: https://github.com/AppMetrics/AppMetrics
Website: https://www.app-metrics.io
A platform-independent open-source performance recorder built on top of Aspect Injector can be found here: https://gitlab.com/hectorjsmith/csharp-performance-recorder. It can be added to any C# project. Instructions on how to use it can be found in the README on GitLab.
Benefits
It's aspect oriented, meaning you don't have to litter your code with DateTime.Now - you just annotate the appropriate methods
It takes care of pretty-printing the results - you can do with them what you like, e.g. printing them to a file
Notes
This performance recorder is focused on the timing of methods only. It doesn't cover CPU or memory usage.
For CPU/memory, use performance counters. For more specific information about specific methods (or lines of code) and specific objects, use a profiler. Red Gate makes a great profiler.
http://www.red-gate.com/products/ants_performance_profiler
I have written a WinForms application in C#. How can I check the performance of my code? By that I mean, how can I check which form references are active at a given time or event, so that I can remove them if they are not required (make them available for garbage collection)? Is there a way to do it using VS 2005 or any free tool? Any tutorials or guides will be useful.
[Edit] Sorry if my question is confusing. I am not looking for a professional tool, but ways to know/understand the working of my code better and code more efficiently.
Thanks
Making code efficient is always a secondary step for me. First I write the code so that it works. Next, I profile it if I am unhappy with the performance. The truth is most applications run fast enough after the first time writing them. Sometimes, though, better performance is needed. Performance can be gained in many different ways. It all depends on your application. I write LOB apps mainly, so I deal with a lot of IO to databases, services and storage. These calls are all very expensive and need to be limited, so they are my first area to optimize. I optimize by lazy-loading, eager-loading, batching calls, making less frequent calls and so on. I recently had a WinForms app that created hundreds of controls dynamically, and it took a long time. That's another bottleneck that I have to address. I use a profiler to measure the performance of the applications.
Use the free EQATEC profiler. It will show you how long calls take and how many times a call is made. The profiler gives a nice report and a visual display that can drill down into the call stacks.
Red Gate Performance Profiler
...it's been said here a million times before. If you suspect performance issues, profile your application. It will tell you how long calls are taking and point out the bottlenecks in your code.
Kobra,
What you're looking for is called a memory profiler. There happens to be one (paid) tool for .NET aptly named ".NET Memory Profiler". I've not used it extensively, but it should answer the questions you're asking. There are a few others which do basically the same thing: they give you instance counts of loaded types and help you identify when instances are not being garbage collected for one reason or another (e.g. event handler references, static properties, etc.).
Hope this helps,
Dylan