One DBA's Ongoing Search for Clarity in the Middle of Nowhere


*or*

Yet Another Andy Writing About SQL Server

Thursday, January 24, 2013

Why is My New Server Under CPU Pressure?



Signal wait time is the time a task spends waiting for a CPU after the resource it was waiting on has become available, so it frequently reflects how much pressure SQLOS is under for CPU resources, as outlined in this TechNet article (http://technet.microsoft.com/en-us/magazine/hh781189.aspx). The metric is usually expressed as signal wait time as a percentage of total wait time. While a lower number is almost always better, it is worth starting to pay attention when the number rises over 10%. Your system will usually start presenting issues at the hard line of 25%.  This is something that we check on every server that we touch as part of our regular health check.
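
As a rough illustration of that percentage, here is a minimal sketch against sys.dm_os_wait_stats. It is not necessarily the exact query our health check runs, and the column aliases are my own. Note that wait_time_ms already includes signal_wait_time_ms, so the resource portion is the difference between the two.

```sql
-- Sketch: signal waits as a percentage of all wait time since the
-- wait statistics were last reset. Column aliases are illustrative.
SELECT
    SUM(signal_wait_time_ms)                     AS signal_wait_ms,
    SUM(wait_time_ms) - SUM(signal_wait_time_ms) AS resource_wait_ms,
    CAST(100.0 * SUM(signal_wait_time_ms)
         / NULLIF(SUM(wait_time_ms), 0)          -- avoid divide-by-zero
         AS DECIMAL(5, 2))                       AS signal_wait_pct
FROM sys.dm_os_wait_stats;
```

The NULLIF guard returns NULL instead of raising an error if no wait time has accumulated yet.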

Today I tripped over a conundrum - I was given a new server with a fresh install of SQL Server 2008 R2 and Reporting Services, and on this system the Signal Waits were measured at 25.19% on our scan, which suggested that the system was under some CPU pressure.  All wait statistics are reset each time the SQL Server service restarts, so this number (25.19%) reflects only the time since the last service restart.
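
To see how long a window the statistics cover, one option is to check when the service started. This is a sketch that assumes the sqlserver_start_time column of sys.dm_os_sys_info (available in SQL Server 2008 and later); the minutes_since_start alias is my own.

```sql
-- Sketch: when did the SQL Server service last start, and how long ago?
SELECT
    sqlserver_start_time,
    DATEDIFF(MINUTE, sqlserver_start_time, GETDATE()) AS minutes_since_start
FROM sys.dm_os_sys_info;
```

If someone has cleared the wait statistics manually since the restart, the window is shorter than this query suggests.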

The system in question is a physical box with two sockets of six cores each and hyperthreading enabled (so 24 logical CPUs), and that made me pause: why does a new server with two small databases (from Reporting Services) and 24 logical CPUs show high Signal Waits, which usually indicate CPU pressure?

The answer is that in this case, this new server is *too* quiet, and as such the numbers are skewed.  Here are the specific wait stats, measured after clearing the total wait statistics using the DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR) command.
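
A sketch of how such a measurement could be taken follows. The DBCC command is the one named above; the TOP (10) cutoff and the ordering are my own choices for illustration.

```sql
-- Step 1: reset the accumulated wait statistics.
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);

-- Step 2: after the server has run for a while, list the wait types
-- that have accumulated the most signal wait time.
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
ORDER BY signal_wait_time_ms DESC;
```

Running the second statement immediately after the first returns little or nothing, so leave some time between the two steps.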


 
In those results, almost all of the Signal Wait time is in two categories, XE_TIMER_EVENT and REQUEST_FOR_DEADLOCK_SEARCH. Both are basically system background processes (waiting for Extended Events processing and deadlock search processing, respectively), so neither is relevant to measuring the busyness of the server.  On a regularly busy server, all of the other wait categories (over 400 in total) greatly outweigh these two background processes, and the normal query we use, which compares signal wait time to total wait time, is useful.  In this case if we exclude these two categories:
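
Excluding them might look like the following sketch, which takes the usual signal-wait percentage calculation and filters out only the two wait types named above. The signal_wait_pct alias is my own, and a fuller health check may well exclude other benign background waits too.

```sql
-- Sketch: signal wait percentage, ignoring the two background waits
-- that dominate on an idle server.
SELECT
    CAST(100.0 * SUM(signal_wait_time_ms)
         / NULLIF(SUM(wait_time_ms), 0)          -- avoid divide-by-zero
         AS DECIMAL(5, 2)) AS signal_wait_pct
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'XE_TIMER_EVENT',
                        N'REQUEST_FOR_DEADLOCK_SEARCH');
```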

  
We now see that the server has zero Signal Waits and no CPU pressure, as expected.  For a good description of the different types of waits and what they mean, look at http://blogs.msdn.com/b/psssql/archive/2009/11/03/the-sql-server-wait-type-repository.aspx from the Microsoft Customer Service and Support (CSS) team.

Linchi Shea has written several great articles on Signal Waits and what they mean in SQL Server.  They are here and here - check them out as well.
