After upgrading from Sitecore 8.1 to 8.2 in a split environment (separate CD and CMS servers), I'm seeing a few performance issues. The CMS works fine, although the thread count is around 200 even locally, whereas the CD server freezes shortly after the site starts, consuming all available memory. There are no errors in the logs either.
Any idea what might be wrong?
First, check whether any errors are being logged continuously in the log files on the CD server.
If you have redirects configured, check whether they are causing problems; we found these to be an issue in our instance in the past.
You can also watch the Sitecore performance counters for further diagnosis.
For high CPU usage, I think you need to stop some of the scheduled agent processes, which are among the more memory-consuming tasks. Try stopping this cleanup agent in the sitecore.config file by setting its interval to 00:00:00:
<agent type="Sitecore.Tasks.CleanupAgent" method="Run" interval="06:00:00">
A possible reason may be the size of the Sitecore event queue.
Check the record count in the EventQueue table of each database, especially the web database:
SELECT count(*) FROM [EventQueue]
If the count is high (100K or more, for example), you need to clean it up for better performance; the event queue works best when it contains at most a few thousand records.
See:
Publish Queue, History and Event Queue too big
sitecore-event-queue-how-to-clean-it-and-why
We had a similar issue where the CPU usage was spiking, but we could not find any errors in the Sitecore log files. The web traffic was also normal, as observed from the IIS logs. This is how we resolved the issue.
In the IIS user interface, under the application pool where the site is hosted, check the currently running IIS worker processes. There you can see which part of the ASP.NET pipeline each request is in and which HTTP module it is currently executing.
Now check whether any requests are getting stuck at a particular stage. If multiple requests for the same URL are stuck, it means that some module handling that URL is hanging or going into an infinite loop. You can then investigate the modules used by that URL and find the actual issue.
My Azure web application sometimes responds very slowly. It waits a few seconds before executing the request.
Of course, I have the "Always On" setting turned on.
It's running on an S2 service plan.
Avg users online 3
No vertical or horizontal scaling configured.
Application
ASP.NET MVC
.NET Framework 4.6.1
C#
Does anyone have an idea why this problem occasionally occurs?
OK, based on your picture I can see a wait time of 98.71%, with a lot of that time spent in the compiler, so I would recommend using precompiled views in your MVC app to avoid runtime compilation of the views. If you are using Azure DevOps, you should be able to change your build task and add the following options to the MSBuild arguments:
/p:PrecompileBeforePublish=true /p:UseMerge=true /p:SingleAssemblyName=AppCode
When you see the web app being slow, it is important to understand which HTTP requests are slow, and whether those requests are slow all the time or only intermittently. How do the CPU and memory metrics look, and what is the pattern of the slowness? If you have Application Insights enabled, navigate to the "Performance" tab to see which requests are slow and whether they depend on an external component.
Collecting a CLR profiler trace while the slowness is occurring will reveal where the time is spent.
You can navigate to Azure Portal --> Web App --> "Diagnose and solve problems" blade --> Diagnostic tools --> Auto-Heal and enable a rule to collect CLR profiler traces on slowness.
Once the rule triggers it will collect the profiler traces and build a report for your review.
Our web application uses a .NET Core web API running behind a load balancer and an Angular client. We access the DB using EF Core.
We have a long-running background task that does a great amount of calculation and takes about 2-3 hours, but it will only be initiated by administrators of the application 3-4 times a year.
While the job is running we want to prevent users from adding/editing/deleting data, and our client told us it is even fine if the application is not available for the duration, as they will mostly run it overnight.
The easiest way to do this is to redirect users to an information page while the job is running, but I have found no way of actually determining whether the task is running or not.
I could set a flag indicating whether the job is running and check it on every request, but I found no way to access application-wide state.
I cannot save a flag to the DB, because while the transaction is committing at the end of the job (~1 hour) we cannot read from the DB.
What baffles me most is that I have not found a single article or question about a problem like this, which doesn't seem too outlandish to me, so I guess I'm missing something very obvious.
The simplest way is to store the value of your "maintenance mode" flag in a singleton class on the server (no database call needed). The value will remain there for as long as the server process is running.
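A minimal sketch of that idea, assuming ASP.NET Core with the built-in dependency injection container (the type, flag, and route names below are illustrative, not from the original post):

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

public sealed class MaintenanceState
{
    // volatile so the flag set by the job thread is immediately visible to request threads
    private volatile bool _isJobRunning;
    public bool IsJobRunning
    {
        get => _isJobRunning;
        set => _isJobRunning = value;
    }
}

public static class MaintenanceSetup
{
    public static void ConfigureServices(IServiceCollection services)
    {
        // one shared instance for the lifetime of the server process
        services.AddSingleton<MaintenanceState>();
    }

    public static void Configure(IApplicationBuilder app)
    {
        // send every request to an information page while the job is running
        app.Use(async (context, next) =>
        {
            var state = context.RequestServices.GetRequiredService<MaintenanceState>();
            if (state.IsJobRunning && !context.Request.Path.StartsWithSegments("/maintenance"))
            {
                context.Response.Redirect("/maintenance");
                return;
            }
            await next();
        });
    }
}

// The job itself just flips the flag:
//   state.IsJobRunning = true;
//   try { RunCalculation(); } finally { state.IsJobRunning = false; }

Keep in mind that with several servers behind the load balancer each instance holds its own copy of the flag, which is why a distributed cache is also mentioned.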
If a distributed cache (as already mentioned) is not an option, you can run the long-running task in a (uniquely) named transaction and then check the list of active transactions to determine whether the task is still running.
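A rough sketch of that approach with plain ADO.NET against SQL Server (the transaction name "LongRunningJob" and the surrounding code are illustrative assumptions; querying sys.dm_tran_active_transactions requires the VIEW SERVER STATE permission):

using System;
using System.Data.SqlClient;

static class JobTransactionSketch
{
    // The job starts an explicitly named transaction and keeps it open while it works.
    public static void RunJob(string connectionString, Action<SqlConnection> longRunningWork)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            new SqlCommand("BEGIN TRANSACTION LongRunningJob", conn).ExecuteNonQuery();
            longRunningWork(conn);
            new SqlCommand("COMMIT TRANSACTION LongRunningJob", conn).ExecuteNonQuery();
        }
    }

    // Any request can ask SQL Server whether that named transaction is still open.
    public static bool IsJobRunning(string connectionString)
    {
        const string sql =
            "SELECT COUNT(*) FROM sys.dm_tran_active_transactions WHERE name = 'LongRunningJob'";
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            return (int)cmd.ExecuteScalar() > 0;
        }
    }
}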
This is completely dependent on your setup but a simple way to approach this problem might be to make it the long-running job's responsibility to divert traffic from your site while it is running, and then undo that once it is finished.
As an example, if you were running this with an old-school .NET site in IIS the job could drop an app_offline.htm file into the site folder, run, then delete it again. Your setup is different, but if you could do something similar with your load-balancer (configure it to serve some static file instead of routing the requests to your servers) then it could work for you.
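For the IIS example, a minimal sketch might look like this (paths, markup, and method names are illustrative; note that the file has to be written by a process outside the site itself, because IIS unloads the application as soon as app_offline.htm appears):

using System;
using System.IO;

static class OfflineMaintenanceJob
{
    public static void Run(string siteRootPath, Action longRunningWork)
    {
        var offlineFile = Path.Combine(siteRootPath, "app_offline.htm");

        // While this file exists, IIS/ASP.NET serves its contents for every request.
        File.WriteAllText(offlineFile,
            "<html><body>The site is down for scheduled maintenance.</body></html>");
        try
        {
            longRunningWork();
        }
        finally
        {
            // Bring the site back online even if the job throws.
            File.Delete(offlineFile);
        }
    }
}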
I have a C# MVC app, and I know one of the calls will take something like 12 hours; it generates some reports.
I want to know whether IIS will keep the process running for that long.
If I make the call async, will IIS somehow set it aside and let it run for that long?
In general, this is a very poor practice. Your app pool may be recycled for many reasons outside of your control.
You would greatly benefit from processing this in an external process and then providing the user the results via IIS.
However, starting with .NET 4.5.2 (https://blogs.msdn.microsoft.com/dotnet/2014/05/05/announcing-the-net-framework-4-5-2/) you can use the HostingEnvironment.QueueBackgroundWorkItem API.
The idea behind this is that ASP.NET will try to finish this work in the case of a graceful shutdown of the app pool.
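A small sketch of how that API is typically used, assuming an ASP.NET MVC controller (GenerateReportsAsync is a placeholder for the real report code). Note that the registration only buys a limited grace period when the app pool shuts down, so it still does not make a 12-hour job safe:

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Web.Hosting;
using System.Web.Mvc;

public class ReportsController : Controller
{
    public ActionResult StartReport()
    {
        // ASP.NET tracks this work item and signals the token when the app pool
        // begins shutting down, so the job can stop (or checkpoint) gracefully.
        HostingEnvironment.QueueBackgroundWorkItem(
            (CancellationToken token) => GenerateReportsAsync(token));

        return Content("Report generation started.");
    }

    // Placeholder for the real report generation.
    private static Task GenerateReportsAsync(CancellationToken token)
    {
        return Task.Delay(TimeSpan.FromMinutes(1), token);
    }
}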
I'm assuming the report is generated with something like SSRS. Why not create a batch job on the backend to run the reports at a specified time? Update a table with the status of the report and just poll that status from the front end; when it's ready, download it. Imagine your report has been running for 6 hours and it relies on the website being up: if someone restarts the website, that's 6 hours of processing gone.
I have a problem I've been fighting for a week now. I have a WCF service running in IIS 8.5 on Windows Server 2012 R2 and a Windows service client that makes one or two requests every 30 seconds. At some point (usually within two hours of the service running) one of the requests causes the service app pool (separate from the other app pools) process to gain CPU usage. In the IIS worker processes section it can be seen that this request never ends and hangs in the ServiceModel-4 module in the AuthenticateRequest state (i.e. most likely it is in an infinite loop somewhere). At some point another such request joins the first one, until there are four of them, staying forever and causing 100% CPU usage (there are 4 logical processors on the machine). Here is what I did to investigate and fix this problem:
I used WCF tracing and custom logging to determine where the problem is. WCF tracing actually shows all the requests made to the server completing successfully in milliseconds (!) (while WCF tracing on the client side, of course, shows a timeout for the same requests). Custom logging also shows that the service code reaches the return statement of the requested operation. The result of the method is two simple DTO objects, so there is no possible serialization issue, and there are also no endpoint behaviors or any other custom code executing before the reply is sent from the service (apart from the method code, which, as I mentioned, returns successfully).
I used IIS Failed Request Tracing, which shows the request reaching ServiceModel-4 and not continuing, with the following information:
ModuleName : ServiceModel-4.0
Notification: AUTHENTICATE_REQUEST
HttpStatus: 500
HttpReason: Internal Server Error
HttpSubStatus: 0
ErrorCode: The operation completed successfully (0x0)
I used DebugDiag to trace requests running for more than 10 minutes and saw the threads that had been running for a long time. The stack trace is as follows:
or as follows:
I've seen that these are called from the IIS process. Since these are .NET functions, I first suspected a corrupted .NET installation, especially since both .NET 4.5 and .NET 4 were installed on the server (which I don't know exactly how it could have happened). So:
I uninstalled .NET 4 and, from Windows Features on/off, turned off the .NET 4.5 features, restarted, then turned them back on and restarted again, without success.
After that, I reinstalled IIS in the same way (from Windows Features). Again, no success.
I don't have any more ideas.
It seems I have found the answer (though I haven't used dotTrace or other tools). There was access to a generic Dictionary from multiple threads. This seems to be a known problem:
https://blogs.msdn.microsoft.com/tess/2009/12/21/high-cpu-in-net-app-using-a-static-generic-dictionary/
https://blogs.msdn.microsoft.com/asiatech/2009/05/11/asp-net-application-100-cpu-caused-by-system-collections-generic-dictionary/
Actually, I noticed this problem at the beginning of the research but ruled it out, because I couldn't reproduce it (probably because I wasn't testing the dictionary in an IIS app; I received various exceptions, of course, but not 100% CPU), and mainly because all logs showed that the code accessing the dictionary had completed, and the stack trace above has nothing to do with the dictionary.
However, I think the problem happened during serialization of this dictionary (which is a data contract), which would explain the logged information.
I still cannot explain exactly how this is happening. If anyone can explain it, I think it will be good knowledge for everyone.
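For reference, the failure mode described in those posts can be avoided by making the shared dictionary thread-safe. A minimal sketch (the names are illustrative, assuming a static cache shared between requests):

using System.Collections.Concurrent;

// Concurrent writes to a shared Dictionary<TKey, TValue> can corrupt its internal
// buckets, after which a lookup can loop forever and pin a CPU core at 100%.
// ConcurrentDictionary (or a lock around the Dictionary) avoids that.
static class LookupCache
{
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    public static string GetOrAdd(string key)
    {
        return Cache.GetOrAdd(key, LoadValue);
    }

    // Placeholder for the real (expensive) lookup.
    private static string LoadValue(string key)
    {
        return key.ToUpperInvariant();
    }
}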
I have a rather high-load deployment on Azure: 4 Large instances serving about 300-600 requests per second. Under normal conditions the "Average Response Time" is 70 to 150 ms; sometimes it may grow to 200-300 ms, but that's absolutely OK.
However, once or twice per day (not at "rush hours") I see a picture like this on the Web Site Monitoring tab:
The number of requests per minute drops significantly, the average response time grows to up to 3 minutes, and after a while everything comes back to normal.
During this "blackout" only 0.1% of requests are dropped (HTTP server errors with timeouts); the other requests just wait in the queue and are processed normally after a few minutes. Though not all clients are ready to wait :-(
Memory usage is under 30% all the time, CPU usage is only up to 40-50%.
What have I already checked?
Traces for timed-out requests: they timed out at random locations.
Throttling for Azure Storage and other components used: no throttling at all.
I also tried to route all traffic through CloudFlare: and saw the same problems.
What could be the reason for such problems? What may I check next?
Thank you all in advance!
Update 1: BenV proposed a good thing to try, but unfortunately it showed nothing :-(
I configured process recycling every 500k requests and also added worker nodes, so CPU utilization is now less than 40% all day long, but the blackouts still appear.
Update 2: Project uses ASP.Net MVC 4.
I had this exact same problem. In my case, I saw a lot of WinCache errors in my logs.
Whenever the site would fail, there would be a lot of WinCache errors in the log. WinCache is how IIS handles PHP to try to speed up processing. It's a Microsoft-built add-on that is enabled by default in IIS and on all Azure sites. WinCache would get hung up, and instead of recycling and continuing, it would consume all the memory and file handles on an instance, essentially locking it up.
I added a new app setting in the Azure Portal to scan a folder for php.ini settings changes:
d:\home\site\ini
Then I added a file at d:\home\site\ini\settings.ini that contains the following:
wincache.fcenabled=1
session.save_handler = files
memory_limit = 256M
wincache.chkinterval=5
wincache.ucachesize=200
wincache.scachesize=64
wincache.enablecli=1
wincache.ocenabled=0
This does a few things:
wincache.fcenabled=1
Enables file caching using WinCache (I think that's the default anyway)
session.save_handler = files
Changes the session handler from WinCache (Azure Default) to standard file based to reduce the cache engine stress
memory_limit = 256M
wincache.chkinterval=5
wincache.ucachesize=200
wincache.scachesize=64
wincache.enablecli=1
Sets the WinCache size to 256 megabytes per thread and limits the overall Cache size. This forces WinCache to clear out old data and recycle the cache more often.
wincache.ocenabled=0
This is the big one: it DISABLES WinCache opcode caching, i.e. WinCache caching the compiled PHP scripts in memory. Files are still cached (per the first setting above), but PHP is interpreted as normal and not cached into large binary files.
I went from having my Azure website crash about once every 3 days, with logs that look like yours, to 120 days straight so far without any issues.
Good luck!
There are some nice tools available for Web Apps in the preview portal.
The Application Insights extension especially can be useful for monitoring and troubleshooting app performance.