One DBA's Ongoing Search for Clarity in the Middle of Nowhere


Yet Another Andy Writing About SQL Server

Friday, April 22, 2016

Where Is My Primary Replica Again?

We have many clients with multi-node Availability Groups - that is, AGs with more than two replicas.  One of the problems I have always had with Availability Group management via the GUI (admit it, you use the GUI sometimes all you non-PowerShell geeks) is the fact that most of the work needs to be done from the primary replica.  You can connect to the Availability Group manager on a secondary replica, but you can only see limited data about that particular secondary replica and can't see much about the other replicas, including *which* replica is the current primary replica!

To perform management you almost always need to connect to the primary replica, but how can I figure out which one is primary without just connecting to the instances one by one until I get lucky?
Enter the T-SQL:
SELECT AS AvailabilityGroupName
, HAGS.primary_replica AS PrimaryReplicaName
, HARS.role_desc as LocalReplicaRoleDesc
, DRCS.database_name AS DatabaseName
, HDRS.synchronization_state_desc as SynchronizationStateDesc
, HDRS.is_suspended AS IsSuspended
, DRCS.is_database_joined AS IsJoined
FROM master.sys.availability_groups AS AG
LEFT OUTER JOIN master.sys.dm_hadr_availability_group_states as HAGS
ON AG.group_id = HAGS.group_id
INNER JOIN master.sys.availability_replicas AS AR
ON AG.group_id = AR.group_id
INNER JOIN master.sys.dm_hadr_availability_replica_states AS HARS
ON AR.replica_id = HARS.replica_id AND HARS.is_local = 1
INNER JOIN master.sys.dm_hadr_database_replica_cluster_states AS DRCS
ON HARS.replica_id = DRCS.replica_id
LEFT OUTER JOIN master.sys.dm_hadr_database_replica_states AS HDRS
ON DRCS.replica_id = HDRS.replica_id
AND DRCS.group_database_id = HDRS.group_database_id
ORDER BY, DRCS.database_name
This query can be run on any of the replicas, and it will return information about the Availability Groups and their member databases, *including* the name of the primary replica instance!
Hope this helps!

Thursday, April 14, 2016

How Long Did That Job Run?

When you use the msdb tables to try to gather information on your SQL Server Agent jobs, inevitably you will end up in a situation like this:
SELECT TOP 10 as JobName
,sjh.run_date as RunDate
,sjh.run_time as RunTime
,sjh.run_duration as Duration
from msdb.dbo.sysjobhistory sjh
join msdb.dbo.sysjobs sj
on sjh.job_id = sj.job_id
where step_id=0
JobName RunDate RunTime Duration
ServerA_RESTORE_FROM_PROD_V2 20140904 222821 4131
ServerA_RESTORE_FROM_PROD_V2 20140904 231032 3055
ServerA_RESTORE_FROM_PROD_V2 20140904 231130 5003
Bloomberg_Pkg 20140904 231233 205
ServerA_RESTORE_FROM_PROD_V2 20140904 231350 2343
DatabaseIntegrityCheck - SYSTEM_DATABASES 20140923 1500 433
DatabaseBackup - SYSTEM_DATABASES - FULL 20140923 10000 23
syspolicy_purge_history 20140923 20000 36
DatabaseBackup - USER_DATABASES - FULL 20140923 40000 2333
DatabaseIntegrityCheck - SYSTEM_DATABASES 20140924 1500 113
So....when did "DatabaseIntegrityCheck - SYSTEM_DATABASES" start? At 1500 - is that 3pm?  You may be able hash out that this translates to 12:15am local time...but what if you want to perform datetime-style math on the RunDate/RunTime?  Sure you can do multiple leaps to say (RunDate>X and RunDate<=Y) AND (RunTime>A and RunTime<=B), but you then need to explicitly format your X, Y, A, and B in the appropriate integer-style format.  Wouldn't it be easier to just be able to datetime math?

The next part is even worse - quick - how long did the first instance of "ServerA_RESTORE_FROM_PROD_V2" run?

4,131 somethings (seconds, ms, etc), right?
In truth, the job ran forty-one minutes and thirty-one seconds.
Yes, yes I am.  No precision beyond seconds, and no quick way to do math.  if you want to figure out how much longer instance 1 of the job ran than instance 2, you have to do some serious goofiness to figure it out.

Here is the fix for both of these problems!


The first item is easy although not directly obvious without a little help from that old standby:
(Yes I am that old...)

Microsoft included a system scalar function msdb.dbo.agent_datetime(run_date, run_time) that turns the combination  on run_date and run_time into a datetime:
SELECT TOP 10 as JobName
,sjh.run_date as RunDate
,sjh.run_time as RunTime
,msdb.dbo.agent_datetime(sjh.run_date,sjh.run_time) as RunDateTime
from msdb.dbo.sysjobhistory sjh
join msdb.dbo.sysjobs sj
on sjh.job_id = sj.job_id
where step_id=0
09/04/2014 22:28:21
09/04/2014 23:10:32
09/04/2014 23:11:30
09/04/2014 23:12:33
09/04/2014 23:13:50
DatabaseIntegrityCheck - SYSTEM_DATABASES201409231500
09/23/2014 00:15:00
DatabaseBackup - SYSTEM_DATABASES - FULL2014092310000
09/23/2014 01:00:00
09/23/2014 02:00:00
DatabaseBackup - USER_DATABASES - FULL2014092340000
09/23/2014 04:00:00
DatabaseIntegrityCheck - SYSTEM_DATABASES201409241500
09/24/2014 00:15:00
I agree with the author of this post who calls the agent_datetime() function "undocumented" since there wasn't a record of it in Books Online - I checked around and couldn't find any standard documentation of it on MSDN or TechNet.

Now that we have a datetime, we can perform all of the regular datetime manipulation functions such as DATEDIFF() on the values.


The second part is a little more obnoxious - there isn't a quick Microsoft function (documented or otherwise) to make the run_duration into a process-ready value.

To hash the run_duration into a useful value I wrote a CASE statement some time ago:
SELECT TOP 10 as JobName
,CASE len(sjh.run_duration)
WHEN 1 THEN cast('00:00:0'
+ cast(sjh.run_duration as char) as char (8))
WHEN 2 THEN cast('00:00:'
+ cast(sjh.run_duration as char) as char (8))
WHEN 3 THEN cast('00:0'
+ Left(right(sjh.run_duration,3),1)
+':' + right(sjh.run_duration,2) as char (8))
WHEN 4 THEN cast('00:'
+ Left(right(sjh.run_duration,4),2)
+':' + right(sjh.run_duration,2) as char (8))
WHEN 5 THEN cast('0'
+ Left(right(sjh.run_duration,5),1)
+':' + Left(right(sjh.run_duration,4),2)
+':' + right(sjh.run_duration,2) as char (8))
WHEN 6 THEN cast(Left(right(sjh.run_duration,6),2)
+':' + Left(right(sjh.run_duration,4),2)
+':' + right(sjh.run_duration,2) as char (8))
END as 'Duration'

from msdb.dbo.sysjobhistory sjh
join msdb.dbo.sysjobs sj
on sjh.job_id = sj.job_id
where step_id=0
JobName Duration run_duration
ServerA_RESTORE_FROM_PROD_V2 00:41:31 4131
ServerA_RESTORE_FROM_PROD_V2 00:00:03 3
ServerA_RESTORE_FROM_PROD_V2 00:00:05 5
Bloomberg_Pkg 00:00:02 2
ServerA_RESTORE_FROM_PROD_V2 00:00:02 2
DatabaseIntegrityCheck - SYSTEM_DATABASES 00:00:43 43
DatabaseBackup - SYSTEM_DATABASES - FULL 00:00:02 2
syspolicy_purge_history 00:00:36 36
DatabaseBackup - USER_DATABASES - FULL 00:00:02 2
DatabaseIntegrityCheck - SYSTEM_DATABASES 00:00:11 11

Interestingly I recently discovered a more elegant solution (while looking for a answer to a different problem) that utilizes the STUFF() function.  Look at this forum post on SQLServerCentral.  The fourth item down from "Mudluck" is almost exactly the same as my CASE statement above, but look at the reply below it from "JG-324908":
SELECT stuff(stuff(replace(str(run_duration,6,0),' ','0'),3,0,':'),6,0,':') FROM sysJobHist
Sure enough, running this STUFF() code results in the same output as the much more complicated CASE statement above.  JG notes that he/she found it in a forum post somewhere and I dug around a little to see if I could find the original author without any luck :(


As with many other bits of code, I keep these things in a big NotePad file of useful code snippets (some people like OneNote instead, and there are dozens of choices - use the one you normally prefer) so that I can quickly reference them when needed.  I always note the website or forum post where I found the code if I didnt create it myself as well as the original author.  This lets me give credit where credit is due when showing the code to others, but it also gives me an idea of someone to approach in the future if I have a question about a similar topic.

It was especially interesting to find the STUFF() code because STUFF() isn't a function I often use, and in this case it was perfect.


Hope this helps!

Monday, April 4, 2016

Don't Use Delimiters In Your Object Names!

It started with a failed backup status check:
Msg 50000, Level 16, State 1, Line 73
** Backups listed in this file are missing / out of date **
db                status   recovery model   last full backup   Full BU hours old
[FinanceDB]       ONLINE   SIMPLE           NULL             NULL
Check out the database name - [FinanceDB],  At first my response was something like:
I went and looked on the server, and sure enough in Management Studio I saw one database named "FinanceDB" and a database named "[FinanceDB]".

This was on a SQL 2008R2 instance, but as a test I created a database named [test] on my local SQL 2014 instance and sure enough it worked!

The source of the problem at the client was the LiteSpeed maintenance plan.  Even though the backup task was set to backup all user databases, it wasn't picking up the square-bracketed database.

On my test system I set up a "regular" SQL Server maintenance plan, and it had did not have the same problem - the backup job did pick up the square-bracketed [test] database.

I next ran the Ola Hallengren DatabaseBackup that I already had installed, and it also picked up the rogue database, creating a backup named TestServer_[test]_FULL_20160404_143721.bak.


Since I don't have LiteSpeed on my test system I created a quick test on the client's test system  (SQL 2008R2) - I created a new database named [test] and created a new LiteSpeed maintenance plan to back up *just* that database - the job failed with this error:
Executed as user: Domain\User. Microsoft (R) SQL Server Execute Package Utility  Version 11.0.5058.0 for 64-bit  Copyright (C) Microsoft Corporation. All rights reserved.    Started:  3:47:35 PM  Progress: 2016-04-04 15:47:35.81     Source: {D8AD0CC9-710A-4C59-A8E6-1B9228562535}      Executing query "DECLARE @Guid UNIQUEIDENTIFIER EXECUTE msdb..sp_ma...".: 100% complete  End Progress  Error: 2016-04-04 15:47:37.01     Code: 0xC002F210     Source: Fast Compression Backup 1 Execute SQL Task     Description: Executing the query "execute master..xp_slssqlmaint N'-BkUpMedia DISK -..." failed with the following error: "LiteSpeed? for SQL Server?  ? 2015 Dell Inc.    Database 'test' is invalid: not found    Msg 62401, Level 16, State 1, Line 0: Database 'test' is invalid: not found    ".   End Error  DTExec: The package execution returned DTSER_FAILURE (1).  Started:  3:47:35 PM  Finished: 3:47:37 PM  Elapsed:  1.809 seconds.  The package execution failed.  The step failed.
That was strange - remember the original backup job succeeded (although not backing up the square-bracketed database).  I considered and realized in the original situation there were two databases - one with brackets and one without.  Sure enough when I created a database just named "test" and included it in my new maintenance plan, the job succeeded.

How did it succeed?  It backed up "test" TWICE:

The LiteSpeed Fast Compression backup backed up the "test" database twice (Fast Compression runs a FULL every so often and DIFF's in-between - this is why two backups in a row results in a FULL and a DIFF).  I ran the job again and saw the same thing:

I verified by creating an object in [test] and running the backups again and running a restore - what I was backing up was truly "test" over and over.


This is not about bashing Dell/Quest LiteSpeed - it is about pointing out a shortcoming in how SQL Server allows us to name objects and a cautionary tale on that naming.

Although it worked in most of the test scenarios above, you can see how the square-bracketed database name failed under one maintenance scenario - and not only did it fail, it failed in a particularly nefarious way because the backup job *didn't* fail - it just didn't pick up the database.


I re-created my [test] database on my local instance and wanted to see what the syntax would look like for a DROP - I went through the SCRIPT action in Management Studio and ended up with this:

USE [master]

/****** Object:  Database [[test]]    Script Date: 4/4/2016 2:00:55 PM ******/

Note the *three* square brackets at the end of the DROP statement – two open brackets but three close brackets.

A syntax checker parse showed statement OK – when I removed one of the closing brackets, so that I had two open and two close, the parse showed me this:

Msg 105, Level 15, State 1, Line 5
Unclosed quotation mark after the character string '[test]

Msg 102, Level 15, State 1, Line 5
Incorrect syntax near '[test]

…and an execution resulted in the same errors.  When I re-added the third close square bracket, my DROP statement succceeded.

A script of a simple BACKUP statement resulted in the same thing:

TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Backup\[test].bak' WITH NOFORMAT, NOINIT,  NAME = N'[test]-Full Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10


This third bracket may be the source of the issue in LiteSpeed and it may not - but it shows the importance of handling object names properly and how touchy the parser can be when you don't make it happy.


At the end of the day - just don't do it.  There is always going to be some other name that is just as useful as any name that includes a delimiter, and you don't need to handle the bizarre and inconsistent scenarios that will arise.

Hope this helps!

Monday, March 28, 2016

Why I Love #SQLSaturday

PASS SQLSaturdays are free 1-day training events for SQL Server professionals that focus on local speakers, providing a variety of high-quality technical sessions, and making it all happen through the efforts of volunteers. Whether you're attending a SQLSaturday or thinking about hosting your own, we think you'll find it's a great way to spend a Saturday – or any day. -
Sounds simple, doesn't it?  Here are the top three reasons I *LOVE* SQLSaturdays:

The opportunity to learn - the quality and quantity of free training available at a SQLSaturday never ceases to amaze me.  Most SQLSaturdays have anywhere from three to six tracks, resulting in 15-30 sessions on everything from database administration to business intelligence to professional development and everything in between.  Sessions range from 100-level introductory up through 400-level expert.
If you want to pay $100-$150, most SQLSaturdays offer full-day "pre-conference" sessions on the Friday before the event (and sometimes even the Thursday before as well). While it isn't free, there aren't very many places to get a full day of high end training for a hundred bucks!

Another aspect of this is *who* provides the training.  I regularly see sessions from Microsoft Certified Masters (MCM's) and Most Valuable Professionals (MVP's) as well as amazing content from first-timers dipping their toes in the pond.  At a recent SQL Saturday (in Boston) I saw an MCM present a great talk on working with virtualized SQL, one MVP speak on working with your infrastructure team, and another MVP talk about the Query Store, an upcoming feature from the unreleased next version of SQL Server.  Having said that, one of the best SQLSaturday sessions I have ever seen came a couple years ago from a first-time speaker excitedly presenting a new way she had devised to analyze trace data. 

All of these speakers share their expertise without pay or reward (other than usually a nice dinner the night before).

The opportunity to share - another fun part of SQLSaturday for me is being one of those speakers sharing what I know for free.  I have written before about the benefits of blogging and speaking, and they are many.  The biggest benefit to me personally (not counting the joy of helping others) is how creating a presentation helps force me to learn a new topic or a better way to do something I already do.

The presentation I used to give all the time was about doing health checks using Glenn Berry's DMV scripts, Ola Hallengren's maintenance solution, and how to make them work together to check client servers.  The presentation I frequently give now is about Extended Events (XEvents) - I told myself about a year ago I needed to learn more about XEvents and Powershell, and I knew (for me) creating presentations would help.  I submitted an Intro to Extended Events 100-level session to a couple of SQLSaturdays, and when it was chosen I was suddenly very motivated to learn more about the topic to produce the content!

The first presentation - the health check talk - highlights an aspect of how I work with SQL Server, and it is shared by many others.  The are lots of free tools and shared knowledge out there about SQL Server, and you don't need to recreate the wheel nine times out of ten - a little Google-Fu or #sqlhelp will usually give you an answer or at least give you raw material you can mold into an answer.  Just because you are working with someone else's raw material does not mean you can't write or speak about the situation - just make 100% sure you give credit (and notation) whenever it is due.  If you read my blog or see me speak with any regularity you will see that a lot if what I write about is *how* I use someone else's scripts, whether modified or "straight from the box," as opposed to creating completely new scripts, but I also reference those authors' blog posts, forum answers, and Twitter feeds.

You don't have to create a completely new way to do something for it to be useful to share!

The opportunity to network - I list this third, but it can often be the most important. While it can be very useful to interact with the #sqlfamily online, there is no substitute for being able to sit down across the table from an expert on a topic you need help with and getting hundreds of dollars worth of free consulting while forming friendships that continue on after the event ends. There is nothing like the #sqlfamily, and it is fun to watch people from other areas as their eyes open wide. I have seen an Oracle DBA visit a SQLSaturday and watched their jaw hit the floor when they heard a speaker sharing their expert knowledge for free; I have had a manager ask me "How did you get the answer so fast?" and reply "I got on Twitter and asked the person who wrote the feature and he told me"; I have watched people who have never met in person raise thousands of dollars for charitable causes suggested by #sqlfamily members.

Another way to get involved and network is to volunteer at a SQLSaturday. I mentioned the speakers are volunteers, but so ate all of the others involved - coordinators, room monitors, check-in staff, and the rest.

This networking is invaluable to your career in other ways - two of my last three jobs came from a member of the #sqlfamily notifying me directly of an opening or making a key introduction to help me start the process.

SQLSaturdays are awesome!!!

Thursday, March 24, 2016

Pulling Security Info

A frequent request I receive is to pull a list of logins/users with certain accesses, role memberships, etc.

I had a query to use xp_logininfo to pull group membership chains - that is, DOMAIN\Andy has access, but not directly - DOMAIN\Andy has access because he is a member of DOMAIN\DBAGroup.  The query is this:
Domain Login Group Security Info
DECLARE @name sysname 
CREATE TABLE ##logininfo
[account name] sysname,
[type] nvarchar(50),
[privilege] nvarchar(50),
[mapped login name] sysname,
[permission path] sysname
DECLARE namecursor cursor fast_forward
select name from master.sys.server_principals
where type='G' and name not like 'NT SERVICE%' 
open  namecursor
fetch next from namecursor into @name 
WHILE @@fetch_status=0
insert into ##logininfo EXEC ('xp_logininfo '''+ @name+''',''members''')
fetch next from namecursor into @name
CLOSE namecursor
DEALLOCATE namecursor 
select @@SERVERNAME as InstanceName, *
from ##logininfo
where [mapped login name] like'%agalb%'
DROP TABLE ##logininfo
I needed the other half (or two-thirds) - I needed the ability to pull server and database role memberships.  Rather than script something from scratch I went looking and sure enough found the raw material for what I wanted on SQLServerCentral in the forums in the post "Query to get the lisst of logins having sysadmin and serveradmin."

The query is pulled from Jagan Kondapalli's (@JVKondapalli) reply to the original poster's question.  I modified it in a couple places and am posting my modification here:
IF  EXISTS (SELECT * FROM tempdb.dbo.sysobjects WHERE name = '##Users' AND type in (N'U'))
    DROP TABLE ##Users
IF  EXISTS (SELECT * FROM tempdb.dbo.sysobjects WHERE name = '##LOGINS' AND type in (N'U'))
USE tempdb
[Login Name] varchar(50),
[Default Database] varchar(60),
[Login Type] varchar(40),
[AD Login Type] varchar(40),
[sysadmin] char(5),
[securityadmin] char(5),
[serveradmin] char(5),
[setupadmin] char(5),
[processadmin] char(5),
[diskadmin] char(5),
[dbcreator] char(5),
[bulkadmin] char(5)
    [Database] VARCHAR(64),
    [Database User ID] VARCHAR(64),
    [Server Login] VARCHAR(64),
    [Database Role] VARCHAR(64)
use master
SELECT  sid,
        loginname AS [Login Name],
        dbname AS [Default Database],
        CASE isntname
            WHEN 1 THEN 'AD Login'
            ELSE 'SQL Login'
        END AS [Login Type],
            WHEN isntgroup = 1 THEN 'AD Group'
            WHEN isntuser = 1 THEN 'AD User'
            ELSE ''
        END AS [AD Login Type],
        CASE sysadmin
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [sysadmin],
        CASE [securityadmin]
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [securityadmin],
        CASE [serveradmin]
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [serveradmin],
        CASE [setupadmin]
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [setupadmin],
        CASE [processadmin]
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [processadmin],
        CASE [diskadmin]
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [diskadmin],
        CASE [dbcreator]
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [dbcreator],
        CASE [bulkadmin]
            WHEN 1 THEN 'Yes'
            ELSE 'No'
        END AS [bulkadmin]
SELECT @@SERVERNAME as InstanceName, [Login Name],         [Default Database],  
        [Login Type],         [AD Login Type],         [sysadmin],         [securityadmin],         [serveradmin],         [setupadmin],         [processadmin],         [diskadmin],         [dbcreator],         [bulkadmin]
FROM tempdb..##LOGINS
--where [mapped login name] like'%agalb%'
ORDER BY [Login Type], [AD Login Type], [Login Name]
USE master
DECLARE @DBName             VARCHAR(60)
DECLARE @SQLCmd             VARCHAR(1024)
Declare @DBID varchar(3)
set @DBID = (select MAX(database_id) from sys.databases)
--print @DBID
set @DBName = (select DB_NAME (''+@DBID+''))
SELECT @SQLCmd = 'INSERT ##Users ' +
        '  SELECT ''' + @DBName + ''' AS [Database],' +
        '       su.[name] AS [Database User ID], ' +
        '       COALESCE (u.[Login Name], ''** Orphaned **'') AS [Server Login], ' +
        '       COALESCE (, ''Public'') AS [Database Role] ' +
        '    FROM [' + @DBName + '].[dbo].[sysusers] su' +
        '        LEFT OUTER JOIN ##LOGINS u' +
        '            ON su.sid = u.sid' +
        '        LEFT OUTER JOIN ([' + @DBName + '].[dbo].[sysmembers] sm ' +
        '                             INNER JOIN [' + @DBName + '].[dbo].[sysusers] sug  ' +
        '                                 ON sm.groupuid = sug.uid)' +
        '            ON su.uid = sm.memberuid ' +
        '    WHERE su.hasdbaccess = 1' +
        '      AND su.[name] != ''dbo'' '
     print @DBID
     set @DBID = @DBID - 1
SELECT @@SERVERNAME as InstanceName,*
FROM ##Users
--where [server login] like '%AGALBRAI%'
ORDER BY [Database], [Database User ID]
IF  EXISTS (SELECT * FROM tempdb.dbo.sysobjects WHERE name = '##LOGINS' AND type in (N'U'))
IF  EXISTS (SELECT * FROM tempdb.dbo.sysobjects WHERE name = '##Users' AND type in (N'U'))
    DROP TABLE ##Users
Again - this is *not* my underlying query but it is exactly what I needed to pull role memberships.  I usually pull this info and dump it into Excel for client consumption.

I could not find a blog or other presence for Jagan besides his SQLServerCentral profile and the inactive Twitter account mentioned above, but thanks!


Hope this helps!

Wednesday, March 9, 2016

Update Your Scripts!

I had an incident today where a colleague was running a Wait Stats Report and the output looked like this:

As soon as he showed me the report I knew he had an outdated script.

Each new version of SQL Server introduces new wait types, often (but not always) related to new features.  For example, waits related to AlwaysOn (the feature formerly known as HADRON) have a prefix of HADR.

The wait at the top of his list, HADR_FILESTREAM_IOMGR_IOCOMPLETION  means "The FILESTREAM AlwaysOn I/O manager is waiting for I/O completion" - the catch is the AlwaysOn I/O manager is almost always waiting for I/O completion as data flows back and forth - this isn't indicative of any issue, so it can be excluded.

Paul Randal (@PaulRandal/blog) keeps a list of the "excludable" wait types in a blog post he maintains related to wait stats.  The post includes a version of the Wait Stats script from Glenn Berry's (@GlennAlanBerry/blog) DMV scripts, and that Wait Stats script has a large list of exclusions:
WHERE [wait_type] NOT IN (
        N'BROKER_TASK_STOP',                N'BROKER_TO_FLUSH',
        N'CHKPT',                           N'CLR_AUTO_EVENT',
        N'CLR_MANUAL_EVENT',                N'CLR_SEMAPHORE',
        N'EXECSYNC',                        N'FSAGENT',
        N'HADR_TIMER_TASK',                 N'HADR_WORK_QUEUE',
        N'KSOURCE_WAKEUP',                  N'LAZYWRITER_SLEEP',
        N'LOGMGR_QUEUE',                    N'ONDEMAND_TASK_QUEUE',
        N'SERVER_IDLE_CHECK',               N'SLEEP_BPOOL_FLUSH',
        N'SLEEP_DBSTARTUP',                 N'SLEEP_DCOMSTARTUP',
        N'SLEEP_SYSTEMTASK',                N'SLEEP_TASK',
        N'WAITFOR',                         N'WAITFOR_TASKSHUTDOWN',
        N'XE_DISPATCHER_WAIT',              N'XE_TIMER_EVENT')
If you refer back to my colleague's original screen shot, you'll see that the top ten items on his list - everything above "OLEDB" - is on the excludable list!


The broader lesson here is to make sure you update your script libraries regularly - even if a script still runs and provides output (that is, you think it "works") it doesn't mean you are receiving valid data.

Although this example is about wait stats and wait types, it is applicable to a wide array of configurations and settings.  Changes like this are often version-related, but even within a version it can be decided that a particular wait type/trace flag/sp_configure setting/etc. is no longer important and can be ignored - or even worse, that some item is now important but wasn't included in your original scripts!

A little regular maintenance and ongoing research will help your toolbox stay clean and organized.

Hope this helps!

Tuesday, March 1, 2016

Another Source of Knowledge

It started with a tweet from one the many members of the #sqlfamily who always has something meaningful to say, Jes Borland (blog/@grrl_geek):

The referenced link is

I followed Jes's advice, paused the other items I was working on, and watched the talk.  It was a nice tight eleven minutes on how to be present in the moment, paying attention to those around you and interacting with them in a more meaningful way.

I was not formerly familiar with the speaker Celeste Headlee (blog/@CelesteHeadlee), a public radio talk host, but after listening to her talk I could definitely listen to her again - she spoke very clearly and made her points without much fluff.

I'm sure some people will tune out right there, thinking they would never agree with a public radio host or her agenda - my take has always been that it is worth giving any speaker at least two chances.  If I have listened to you speak/present/etc. twice (anyone can have a bad day) and both times you have failed to hold my interest (on topics in which I am interested) I will discount then you as a speaker - at least as a speaker I would be interested in listening to.


While I enjoyed this particular talk, it reminded me of something I used to really enjoy doing and have all but stopped over the last couple years.

I tripped over TED talks online five or six years ago, back when I still worked in an on-site office.  I don't remember how - probably from Twitter or Facebook - but I discovered this wide array of free recordings that I could listen to on my breaks or when I had a free few minutes.  As a full-time work-from-home these last few years I have not handled my workday in the same way (I eat lunch with my family now for example) and have not watched talks in the same way I used to.

TED is a nonprofit organization that puts on a series of conferences on Technology, Entertainment, and Design.  The talks given at the conferences are by practice 18 minutes or less (although some do run longer), with many talks being around ten minutes long.

The conferences themselves aren't cheap (and as such I have never been in person) - TED2017 in Vancouver next year has a registration cost of US$8500 - but the real treasure trove are the talks themselves, posted online for free on TED's website (and often on YouTube) after a reasonable time has passed.

TED talks are given by an amazing array of speakers from all different backgrounds, from industry leaders like Bill Gates and Richard Branson to intellectuals like Sir Ken Robinson and Dan Gilbert to entertainers like Apollo Robbins and Ze Frank.

You can search the talks by topic or speaker or length, or you can watch one of the curated playlists of talks to see talks related to one another on a given subject.

A sidebar to the TED conferences and talks is TEDx, TEDx is an offshoot of the parent conference where local staff organize an event that is a combination of TED-style talks from local/regional speakers and airings of national TED talks.  The best comparison I can make for SQL Server is to think of TEDx as TED's SQL Saturday - not entirely accurate but close.  TEDx talks are also available for viewing for free at

Another benefit of a resource like this is to a speaker (or a potential speaker, which is about everyone).  You can learn a lot about speaking to an audience from simply watching other people present, and these talks are great examples.


At the end of the day, you probably won't learn anything directly about SQL Server from seeking out things like TED talks, but you will learn a lot about the world around you, and remember Knowledge is Power!


Hope this helps!

Tuesday, February 23, 2016

How is SQL Server using more than the Max Server Memory?

As usual, the pager went off...

"Server is near 100% memory utilization - what is running?"
I saw when I signed on that the Windows server was at 98% memory utilization (the sum of all processes in Windows, including but not limited to SQL Server) and that the sqlservr.exe process for the single SQL Server 2008R2 instance was using a lion’s share of it – 77GB out of 80GB on the server.

The Max Server Memory cap in SQL Server was set to 72.5GB, so how was this possible?!?

A key that many DBAs don't consider is that there are multiple other processes that run outside of the Buffer Pool memory, and before SQL server 2012, the Buffer Pool was the only thing that is governed by the Max Server Memory cap.  This is how the sqlservr process can use more than the cap.  

When you configure Max Server Memory (you *have* configured Max Server Memory, right!!?!?!?!?!) all it was doing before SQL Server 2012 was setting a cap on the Buffer Pool.  Not only does this not impact external SQL Server processes such as Reporting Service and Analysis Services, it doesn't even cover everything inside sqlservr.exe.

(SQL Server 2012 dramatically changed what is and isn't included under Max Server Memory - see the post "Memory Manager Configuration changes in SQL Server 2012" from the MS SQLOS team which discusses how several things were shifted to be included under Max Server Memory as of 2012.)

An interesting point in this situation was the fact that I saw that the Page Life Expectancy (PLE) was through the roof – over 88,000 on both NUMA nodes.  Regardless of what guidelines you subscribe to, this is a very high number and indicates a large amount of free Buffer Pool.  

This relates to how the SQL Server application (and most other enterprise applications) manage memory – they gradually grab more and more memory as they need it (for large queries, etc.) but they don’t release it gracefully (if at all).
At some point some large query or other unit of work probably needed that 72GB of Buffer Pool memory, but it was mostly free at the time I checked the PLE value (as evidenced by the large number).  

In many unfortunate cases the only way to release this memory from the Windows process is to restart the SQL Server service (MSSQLServer or MSSQL$instance) in Windows, or to lower the Max Server Memory cap in SQL Server (which will gradually force the Buffer Pool to release the memory it is holding to the level of the new cap - this can take many minutes depending on how much the value of Max Server Memory is decreased).


Sidebar: WARNING - as we have discussed in the past, remember that PLE is a trending number - the fact that PLE is 88,000 right now does *not* by itself indicate you have too much memory configured for your process - it just means that at the moment it was measured, there was significant free memory.  

PLE needs to be monitored over time at different times of day during different times of the week and month.  It may be that there is significant free memory right now, but at 2am when CheckDB is running (you *do* run CheckDB, right!!?!?!?!?!) or on the last day of the month when the payroll reports run, PLE may drop down to nothing as that process needs all of the memory available and then some.  

NEVER, EVER, make capacity planning decisions solely on a single point-in-time value measurement - even if you think it is a busy time on the server!  This is *NOT* an #ItDepends

As I was saying, there are many components that work outside of the buffer pool, and one of them (in 2008R2) is CLR (Common Language Runtime), the run-time process for executing .NET managed code (such as VB.NET and C#) inside SQL Server.

This was the output for the CLR memory space (from DBCC MEMORYSTATUS):
MEMORYCLERK_SQLCLR (node 0)              KB
---------------------------------------- -----------
VM Reserved                              6313088
VM Committed                             25792
Locked Pages Allocated                   0
SM Reserved                              0
SM Committed                             0
SinglePage Allocator                     1408
MultiPage Allocator                      39256

6313088KB reserved equals 6GB, which is a fair chunk of the “extra” memory SQL Server was using.  As you can see here the “committed” number was significantly smaller than the “reserved” number, meaning that at the specific moment I ran the query there wasn’t much CLR traffic going on, but there was enough recent work that it had to reserve 6GB of space.

The DMV companion to this is sys.dm_os_memory_clerks, and a useful query is:
SELECT top 10 type,
from sys.dm_os_memory_clerks
order by virtual_memory_reserved_kb desc

As you can see, the DMV returns comparable information. and you receive it in a nice tight query result set rather than an ugly 80-page DBCC output. :)


In the article referenced above, CLR is one of the things that shifts under Max Server Memory as of SQL Server 2012, so if this had been a SQL Server 2012/2014 the problem may not have even been noticed.  With so much free Buffer Pool (high PLE) there might have been sufficient head room under Max Server Memory to handle the CLR needs *without* taking the server to 98%+ RAM utilization.

CLR is one of those things you just have to allow for when planning memory capacity and setting Max Server Memory on a SQL Server – in this case *unless* there was something unusual going on with the CLR – such as a new code release that has badly memory-managed code in it – this showed that the Max Server Memory cap needed to be set lower on this instance (or RAM needed to be added to the server) to allow for what CLR needed.

IMPORTANT - realize this need to plan for CLR is true regardless of your SQL version - on a SQL Server 2005/2008/2008R2, Max Server Memory needs to be set sufficiently low to give head room for CLR *outside* the cap, while on a newer SQL Server with the changes described above, the cap needs to be set high enough to include CLR's needs.


As the on-call I wasn't familiar with the regular workload of this server, so I advised the client and the primary DBA that if the current situation is the “real” and “normal” situation on this server, I recommended lowering Max Server Memory by 2GB-4GB to allow more head room for the operating system, etc. while still satisfying the needs of the CLR space.

I lowered the Max Server Memory cap is 70GB (71600MB) to try to help for the immediate time – since the PLE was so high it could handle it in the immediate term and the free memory on the Windows server went up to 5% (from 2% when I logged on).

At the end of the day I turned it over to the client's primary DBA to follow-up with the client as to whether this was a memory capacity problem - does the client really need 72.5GB of Buffer Pool *and* 6+GB of CLR memory? - or if it was a code/intermittent problem - was there a new piece of poorly written code that was the underlying cause of the issue?


The takeaway here is to remember that there are many things - CLR, extended events sessions, the lock manager, linked servers, and lots more - that function outside the realm of the Buffer Pool, depending on your SQL Server version.  If your server uses CLR components, or if you do a lot of XEvents tracking, or use complicated extended stored procedures, etc., make sure to allow for that when capacity planning and when configuring Max Server Memory.

Hope this helps!