One DBA's Ongoing Search for Clarity in the Middle of Nowhere


*or*

Yet Another Andy Writing About SQL Server

Tuesday, February 23, 2016

How is SQL Server using more than the Max Server Memory?

As usual, the pager went off...

"Server is near 100% memory utilization - what is running?"
When I signed on, I saw that the Windows server was at 98% memory utilization (the sum of all processes in Windows, including but not limited to SQL Server) and that the sqlservr.exe process for the single SQL Server 2008R2 instance was using the lion’s share of it – 77GB of the 80GB on the server.

The Max Server Memory cap in SQL Server was set to 72.5GB, so how was this possible?!?

A key fact that many DBAs don't consider is that multiple other processes run outside of the Buffer Pool memory, and before SQL Server 2012, the Buffer Pool was the only thing governed by the Max Server Memory cap.  This is how the sqlservr process can use more than the cap.

When you configure Max Server Memory (you *have* configured Max Server Memory, right!!?!?!?!?!), all it did before SQL Server 2012 was set a cap on the Buffer Pool.  Not only does this cap not cover external SQL Server processes such as Reporting Services and Analysis Services, it doesn't even cover everything inside sqlservr.exe.
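If you want to verify where the cap currently sits, sys.configurations shows both the configured and in-use values - a quick sketch (works on 2005+):

SELECT name, value AS [configured], value_in_use AS [running]
FROM sys.configurations
WHERE name = 'max server memory (MB)';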

(SQL Server 2012 dramatically changed what is and isn't included under Max Server Memory - see the post "Memory Manager Configuration changes in SQL Server 2012" from the MS SQLOS team which discusses how several things were shifted to be included under Max Server Memory as of 2012.)

An interesting point in this situation was that the Page Life Expectancy (PLE) was through the roof – over 88,000 on both NUMA nodes.  Regardless of which guidelines you subscribe to, this is a very high number and indicates a large amount of free Buffer Pool.
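You don't have to use Perfmon to spot-check this - PLE is exposed per NUMA node in sys.dm_os_performance_counters.  A minimal sketch (note the object_name prefix varies for named instances):

SELECT [object_name], instance_name AS [numa_node], cntr_value AS [ple_seconds]
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
AND [object_name] LIKE '%Buffer Node%';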

This relates to how SQL Server (like most other enterprise applications) manages memory – it gradually grabs more and more memory as it needs it (for large queries, etc.) but doesn't release it gracefully (if at all).
At some point some large query or other unit of work probably needed that 72GB of Buffer Pool memory, but it was mostly free at the time I checked the PLE value (as evidenced by the large number).  

In many unfortunate cases the only way to release this memory from the Windows process is to restart the SQL Server service (MSSQLServer or MSSQL$instance) in Windows, or to lower the Max Server Memory cap in SQL Server, which gradually forces the Buffer Pool to release the memory it is holding down to the level of the new cap (this can take many minutes depending on how much Max Server Memory is decreased).
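For reference, lowering the cap is a simple sp_configure call - a sketch, with the value in MB (71680MB = 70GB):

/* Max Server Memory is an advanced option */
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 71680;
RECONFIGURE;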

--

Sidebar: WARNING - as we have discussed in the past, remember that PLE is a trending number - the fact that PLE is 88,000 right now does *not* by itself indicate you have too much memory configured for your instance - it just means that at the moment it was measured, there was significant free memory.

PLE needs to be monitored over time at different times of day during different times of the week and month.  It may be that there is significant free memory right now, but at 2am when CheckDB is running (you *do* run CheckDB, right!!?!?!?!?!) or on the last day of the month when the payroll reports run, PLE may drop down to nothing as that process needs all of the memory available and then some.  

NEVER, EVER make capacity planning decisions based solely on a single point-in-time measurement - even if you think it is a busy time on the server!  This is *NOT* an #ItDepends


--

As I was saying, there are many components that work outside of the buffer pool, and one of them (in 2008R2) is CLR (Common Language Runtime), the run-time process for executing .NET managed code (such as VB.NET and C#) inside SQL Server.

This was the output for the CLR memory space (from DBCC MEMORYSTATUS):
MEMORYCLERK_SQLCLR (node 0)              KB
---------------------------------------- -----------
VM Reserved                              6313088
VM Committed                             25792
Locked Pages Allocated                   0
SM Reserved                              0
SM Committed                             0
SinglePage Allocator                     1408
MultiPage Allocator                      39256

6313088KB reserved equals 6GB, which is a fair chunk of the “extra” memory SQL Server was using.  As you can see here the “committed” number was significantly smaller than the “reserved” number, meaning that at the specific moment I ran the query there wasn’t much CLR traffic going on, but there was enough recent work that it had to reserve 6GB of space.

The DMV companion to this is sys.dm_os_memory_clerks, and a useful query is:
SELECT top 10 type,
virtual_memory_reserved_kb,
virtual_memory_committed_kb
from sys.dm_os_memory_clerks
order by virtual_memory_reserved_kb desc

As you can see, the DMV returns comparable information, and you receive it in a nice tight query result set rather than an ugly 80-page DBCC output. :)

--

In the article referenced above, CLR is one of the things that shifted under Max Server Memory as of SQL Server 2012, so if this had been a SQL Server 2012/2014 instance the problem might not have even been noticed.  With so much free Buffer Pool (high PLE) there might have been sufficient head room under Max Server Memory to handle the CLR needs *without* taking the server to 98%+ RAM utilization.

CLR is one of those things you simply have to allow for when planning memory capacity and setting Max Server Memory on a SQL Server.  In this case - *unless* there was something unusual going on with the CLR, such as a new code release with badly memory-managed code in it - this showed that the Max Server Memory cap needed to be set lower on this instance (or RAM needed to be added to the server) to allow for what CLR needed.

IMPORTANT - realize this need to plan for CLR is true regardless of your SQL version - on a SQL Server 2005/2008/2008R2, Max Server Memory needs to be set sufficiently low to give head room for CLR *outside* the cap, while on a newer SQL Server with the changes described above, the cap needs to be set high enough to include CLR's needs.

--

As the on-call DBA I wasn't familiar with the regular workload of this server, so I advised the client and the primary DBA that if the current situation was the "real" and "normal" state of this server, Max Server Memory should be lowered by 2GB-4GB to allow more head room for the operating system, etc. while still satisfying the needs of the CLR space.

To help in the immediate term I lowered the Max Server Memory cap to 70GB (71680MB) – since the PLE was so high the instance could handle it, and the free memory on the Windows server went up to 5% (from 2% when I logged on).

At the end of the day I turned it over to the client's primary DBA to follow up with the client on whether this was a memory capacity problem - does the client really need 72.5GB of Buffer Pool *and* 6+GB of CLR memory? - or a code/intermittent problem - was there a new piece of poorly written code that was the underlying cause of the issue?

--

The takeaway here is to remember that there are many things - CLR, extended events sessions, the lock manager, linked servers, and lots more - that function outside the realm of the Buffer Pool, depending on your SQL Server version.  If your server uses CLR components, or if you do a lot of XEvents tracking, or use complicated extended stored procedures, etc., make sure to allow for that when capacity planning and when configuring Max Server Memory.

Hope this helps!


Tuesday, February 16, 2016

Configuring a Perfmon Collector for SQL Server

Something I always stress to people when I write and present is the importance of tracking data *before* you need it.  Microsoft does a fair job collecting some data via DMVs, the default trace, and Extended Events, but the data is still very limited in many ways, and in many cases the data doesn't persist past a SQL service restart.

One of the important tools for a SQL Server DBA (or Dev, or Analyst, or...) is Windows Perfmon.  Perfmon can be configured to track a ton of data, but it doesn't collect anything unless you are actively watching via the Perfmon GUI or you have configured a Perfmon Collector.

One of the downsides to Perfmon, like most other monitoring tools, is performance overhead.  The more frequently you measure data, the more impactful it can be.

Because of this, I set up my collector to gather data every five minutes to lessen that impact.  Based on past discussions with colleagues, five minutes is a happy medium - frequent enough to show trends over time while still infrequent enough to have minimal impact.  Every five minutes may strike you as liable to miss problems, and it can - if something spikes (or valleys) for a moment - or even a minute or two - you may not see it.  For many Perfmon counters, however, you will see an extreme followed by a gradual change, like this image from my recent "Server Using All of my RAM" blog post:


As you can see, Page Life Expectancy (PLE) on this graph dips, gradually climbs, and then dips again.  With a collection every five minutes you may not catch the exact peak - all you know is that the PLE was 50,000 at 12:55am and then only 100 at 1:00am on 03/13.  It may have climbed higher than that before it dipped, but by 1:00am it had dipped down to around 100 (coincidentally at 1am the CheckDB job had kicked off on a large database).

If you really need to know (in this example) exactly how high PLE gets before it dips, or exactly how low it dips, or at what specific time it peaks or valleys, you need to actively watch the counter or set up a collector with a more frequent collection interval.  You will find that in most cases this absolute value isn't important - it is sufficient to know that a certain item peaks/valleys in a certain five minute interval, or that during a certain five minute interval ("The server was slow last night at 3am") a value was in an acceptable/unacceptable range.

If you do set up a collector with a more frequent interval, make sure to delete it (or at least turn it off) after you have collected your data.  I am a fan of deleting it outright so that it doesn't accidentally get turned back on and cause impact, but sometimes it does make more sense to leave it in place. #ItDepends
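If you do go the delete route, removing a temporary collector is just two logman calls (logman is the same utility the batch files below use to create the collector; the collector name here is hypothetical):

logman stop SQLServerPerfHiFreq
logman delete SQLServerPerfHiFreq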

--

My mechanism (not originally created by me, but significantly modified from source material whose original author I do not know) uses a folder structure that I can ZIP and then unzip into C:\PerfLogs, with batch files to create the collector and manage it via Windows Scheduled Tasks.

Here are the counters I collect by default:

  • "\Memory\Available MBytes"
  • "\Memory\Pages/sec"
  • "\MSSQLSERVER:Access Methods\Forwarded Records/sec"
  • "\MSSQLSERVER:Access Methods\Full Scans/sec"
  • "\MSSQLSERVER:Access Methods\Index Searches/sec"
  • "\MSSQLSERVER:Buffer Manager\Buffer cache hit ratio"
  • "\MSSQLSERVER:Buffer Manager\Free List Stalls/sec"
  • "\MSSQLSERVER:Buffer Manager\Free pages"
  • "\MSSQLSERVER:Buffer Manager\Lazy writes/sec"
  • "\MSSQLSERVER:Buffer Manager\Page life expectancy"
  • "\MSSQLSERVER:Buffer Manager\Page reads/sec"
  • "\MSSQLSERVER:Buffer Manager\Page writes/sec"
  • "\MSSQLSERVER:General Statistics\User Connections"
  • "\MSSQLSERVER:Latches(*)\Latch Waits/sec"
  • "\MSSQLSERVER:Locks(*)\Lock Waits/sec"
  • "\MSSQLSERVER:Locks(*)\Number of Deadlocks/sec"
  • "\MSSQLSERVER:Memory Manager\Target Server Memory (KB)"
  • "\MSSQLSERVER:Memory Manager\Total Server Memory (KB)"
  • "\MSSQLSERVER:SQL Statistics\Batch Requests/sec"
  • "\MSSQLSERVER:SQL Statistics\SQL Compilations/sec"
  • "\MSSQLSERVER:SQL Statistics\SQL Re-Compilations/sec"
  • "\Paging File(*)\% Usage"
  • "\PhysicalDisk(*)\Avg. Disk sec/Read"
  • "\PhysicalDisk(*)\Avg. Disk sec/Write"
  • "\PhysicalDisk(*)\Disk Reads/sec"
  • "\PhysicalDisk(*)\Disk Writes/sec"
  • "\Process(sqlservr)\% Privileged Time"
  • "\Process(sqlservr)\% Processor Time"
  • "\Processor(*)\% Privileged Time"
  • "\Processor(*)\% Processor Time"
  • "\SQLSERVER:Access Methods\Forwarded Records/sec"
  • "\SQLSERVER:Access Methods\Full Scans/sec"
  • "\SQLSERVER:Access Methods\Index Searches/sec"
  • "\SQLSERVER:Buffer Manager\Buffer cache hit ratio"
  • "\SQLSERVER:Buffer Manager\Free List Stalls/sec"
  • "\SQLSERVER:Buffer Manager\Free pages"
  • "\SQLSERVER:Buffer Manager\Lazy writes/sec"
  • "\SQLSERVER:Buffer Manager\Page life expectancy"
  • "\SQLSERVER:Buffer Manager\Page reads/sec"
  • "\SQLSERVER:Buffer Manager\Page writes/sec"
  • "\SQLSERVER:General Statistics\User Connections"
  • "\SQLSERVER:Latches(*)\Latch Waits/sec"
  • "\SQLSERVER:Locks(*)\Lock Waits/sec"
  • "\SQLSERVER:Locks(*)\Number of Deadlocks/sec"
  • "\SQLSERVER:Memory Manager\Target Server Memory (KB)"
  • "\SQLSERVER:Memory Manager\Total Server Memory (KB)"
  • "\SQLSERVER:SQL Statistics\Batch Requests/sec"
  • "\SQLSERVER:SQL Statistics\SQL Compilations/sec"
  • "\SQLSERVER:SQL Statistics\SQL Re-Compilations/sec"
  • "\System\Processor Queue Length"

As you can see, there is duplication between the MSSQLSERVER counters and the SQLSERVER counters - this is because at some version of SQL Server the hive name changed from MSSQLSERVER to SQLSERVER, and including both in the list covers all of the bases regardless of the Windows/SQL Server version being monitored.  If a listed hive doesn't exist, the collector still creates without error, so it doesn't hurt to have both in the list.

This is my default list, curated over time from experiences and input from others - if you have other counters you want to collect just edit the list accordingly.

One note - if you have a named instance, the counter hive will be named differently - something like MSSQL$instancename.  The easiest way to handle this is to edit the counter list, copy-paste the SQLSERVER counter list, and then find-replace SQLSERVER to MSSQL$instancename for the new items.
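For example, for a hypothetical named instance called SALES, the new items would look like this:

  • "\MSSQL$SALES:Buffer Manager\Page life expectancy"
  • "\MSSQL$SALES:SQL Statistics\Batch Requests/sec"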

--

The folder structure starts at the top level with a folder named SQLServerPerf (in my case Ntirety-SQLServerPerf).  Below that are two folders, SQLServerPerf\BatchFiles and SQLServerPerf\Logs.

By default, the Logs folder is empty.  In BatchFiles are five files - a .CFG file that includes the counter list above, and four .BAT files to create/start/stop/cycle the collector itself.

To start, unzip the package (or copy the SQLServerPerf folder) into C:\PerfLogs, resulting in C:\PerfLogs\SQLServerPerf\

Why do I use C:?  By default Windows creates a PerfLogs folder there, plus using that path (almost) guarantees that the batch files will run, since Windows servers in general have C: drives.  Using a different drive would require edits to the files to reference that drive, and if you absolutely can't write to C: that is the fix - edit the files to change references from C: to the drive of your choice.

--

SQLServerPerf\BatchFiles\SQLServer.CFG is a text file whose contents are just the counter list:
"\Memory\Available MBytes"
"\Memory\Pages/sec"
"\MSSQLSERVER:Access Methods\Forwarded Records/sec"
"\MSSQLSERVER:Access Methods\Full Scans/sec"
"\MSSQLSERVER:Access Methods\Index Searches/sec"
"\MSSQLSERVER:Buffer Manager\Buffer cache hit ratio"
"\MSSQLSERVER:Buffer Manager\Free List Stalls/sec"
"\MSSQLSERVER:Buffer Manager\Free pages"
"\MSSQLSERVER:Buffer Manager\Lazy writes/sec"
"\MSSQLSERVER:Buffer Manager\Page life expectancy"
"\MSSQLSERVER:Buffer Manager\Page reads/sec"
"\MSSQLSERVER:Buffer Manager\Page writes/sec"
"\MSSQLSERVER:General Statistics\User Connections"
"\MSSQLSERVER:Latches(*)\Latch Waits/sec"
"\MSSQLSERVER:Locks(*)\Lock Waits/sec"
"\MSSQLSERVER:Locks(*)\Number of Deadlocks/sec"
"\MSSQLSERVER:Memory Manager\Target Server Memory (KB)"
"\MSSQLSERVER:Memory Manager\Total Server Memory (KB)"
"\MSSQLSERVER:SQL Statistics\Batch Requests/sec"
"\MSSQLSERVER:SQL Statistics\SQL Compilations/sec"
"\MSSQLSERVER:SQL Statistics\SQL Re-Compilations/sec"
"\Paging File(*)\% Usage"
"\PhysicalDisk(*)\Avg. Disk sec/Read"
"\PhysicalDisk(*)\Avg. Disk sec/Write"
"\PhysicalDisk(*)\Disk Reads/sec"
"\PhysicalDisk(*)\Disk Writes/sec"
"\Process(sqlservr)\% Privileged Time"
"\Process(sqlservr)\% Processor Time"
"\Processor(*)\% Privileged Time"
"\Processor(*)\% Processor Time"
"\SQLSERVER:Access Methods\Forwarded Records/sec"
"\SQLSERVER:Access Methods\Full Scans/sec"
"\SQLSERVER:Access Methods\Index Searches/sec"
"\SQLSERVER:Buffer Manager\Buffer cache hit ratio"
"\SQLSERVER:Buffer Manager\Free List Stalls/sec"
"\SQLSERVER:Buffer Manager\Free pages"
"\SQLSERVER:Buffer Manager\Lazy writes/sec"
"\SQLSERVER:Buffer Manager\Page life expectancy"
"\SQLSERVER:Buffer Manager\Page reads/sec"
"\SQLSERVER:Buffer Manager\Page writes/sec"
"\SQLSERVER:General Statistics\User Connections"
"\SQLSERVER:Latches(*)\Latch Waits/sec"
"\SQLSERVER:Locks(*)\Lock Waits/sec"
"\SQLSERVER:Locks(*)\Number of Deadlocks/sec"
"\SQLSERVER:Memory Manager\Target Server Memory (KB)"
"\SQLSERVER:Memory Manager\Total Server Memory (KB)"
"\SQLSERVER:SQL Statistics\Batch Requests/sec"
"\SQLSERVER:SQL Statistics\SQL Compilations/sec"
"\SQLSERVER:SQL Statistics\SQL Re-Compilations/sec"
"\System\Processor Queue Length"
--

SQLServerPerf\BatchFiles\SQLPerfmonCollector-Create.bat is a batch (text) file whose contents create the collector and also several Windows Scheduled Tasks that manage the collector:
logman create counter SQLServerPerf -f bin -si 300 -v nnnnnn -o "c:\perflogs\SQLServerPerf\Logs\SQLServerPerf" -cf "c:\perflogs\SQLServerPerf\BatchFiles\SQLServer.cfg"
timeout /T 2
logman start SQLServerPerf
timeout /T 2
schtasks /create /tn "Cycle SQLServerPerf Perfmon Counter Log" /tr C:\PerfLogs\SQLServerPerf\BatchFiles\SQLPerfmonCollector-Cycle.bat /sc daily /st 23:59:58 /ed 01/01/2099 /ru system
timeout /T 2
schtasks /create /tn "Start SQLServerPerf Perfmon Counter Log" /tr C:\PerfLogs\SQLServerPerf\BatchFiles\SQLPerfmonCollector-Start.bat /sc onstart /ru system
timeout /T 2
schtasks /create /tn "Purge SQLServerPerf Perfmon Counter Log" /tr "PowerShell -command {Get-ChildItem -path C:\PerfLogs\SQLServerPerf\Logs -Filter *.blg | where {$_.Lastwritetime -lt (date).addmonths(-13)} | remove-item}" /sc daily /st 23:59:58 /ed 01/01/2099 /ru system
pause
As you can see there are references to the direct path to our C:\PerfLogs\SQLServerPerf - if you move the files to another drive/path, these references need to be changed.

This batch file does the following:

  1. Uses "logman create" to create the actual collector, writing files to to the Logs folder and gathering the counters listed in the SQLServer.cfg file
  2. Uses "timeout" to pause for two seconds to allow the user to see any return messages
  3. Uses "logman start" to start the collector
  4. Pauses two more seconds
  5. Uses "schtasks /create" to create a scheduled task to run SQLPerfmonCollector-Cycle.bat nightly at 11:59:58pm (Batch file contents to follow)
  6. Pauses two more seconds
  7. Uses "schtasks /create" to create a scheduled task to run SQLPerfmonCollector-Start.bat on system startup - this makes sure that after a reboot the collector is automatically started for minimal interruption
  8. Pauses two more seconds
  9. Uses "schtasks /create" to create a scheduled task to run nightly at 11:59:58pm to run a Powershell command to delete log (.BLG) files older than 13 months
  10. Uses "pause" to stop progress until the user hits a key - this again prevents the window from closing until the user acknowledges the progress
This is specifically written to be run interactively (right-click on the BAT file and "Run" or, if available, "Run As Administrator") - if you want to truly automate the deployment of the collector, the pause command should be removed.

--

**Important Note** - current versions of Windows have settings for some of what I am using Scheduled Tasks to do, such as restarting the collector.  This wasn't always the case, and I wanted a solution I could deploy without worrying about the Windows version.

--

SQLServerPerf\BatchFiles\SQLPerfmonCollector-Cycle.bat is a batch (text) file whose contents stop and then start the collector:
logman stop SQLServerPerf
timeout /T 2
logman start SQLServerPerf
I included the timeout to pause for two seconds because I sometimes had issues with the stop/start process where the start command would return an error complaining that the collector was already running, since the stop hadn't completed yet - adding the two-second pause gives the collector time to completely stop before the start attempt.

--

SQLServerPerf\BatchFiles\SQLPerfmonCollector-Start.bat is a batch (text) file whose contents start the collector:
logman start SQLServerPerf
--

SQLServerPerf\BatchFiles\SQLPerfmonCollector-Stop.bat is a batch (text) file whose contents stop the collector:
logman stop SQLServerPerf
I included this for completeness when I was creating the batch files although I have yet to need it - if I want to stop the collector for some reason I simply do so interactively in the Perfmon GUI.

--

As noted above, I keep the SQLServerPerf folder as a ZIP; to deploy it I unzip it into C:\PerfLogs and run the Create batch file - it's as simple as that.

The output file from a Perfmon collector by default is a .BLG file.  As noted above, there is a scheduled task to stop and start the collector every night, resulting in a new .BLG file each day:


To look at an individual day's info, you simply open that day's file in Perfmon (by opening Perfmon and then opening the file, or by double-clicking the BLG file itself).  If you wish to string multiple days' data together, use the Relog command as I described in this post from a couple years ago: "Handling Perfmon Logs with Relog."
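As a quick example (the file names here are illustrative), Relog can merge two daily files into a single binary log like this:

relog SQLServerPerf_000001.blg SQLServerPerf_000002.blg -f BIN -o SQLServerPerf_Merged.blg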

As you can see in the above screenshot, on my laptop the collector stores about 1MB per day.  Even on a big OLTP production cluster I don't see more than 1.5MB-2.0MB/day.  Stretch that out over 13 months (remember we purge files older than 13 months) and it comes out to a maximum size of around 800MB.  Hopefully 1GB of Perfmon data isn't going to fill your C: drive (if it is, you have other problems), but if it is going to be a problem, relocate the folders as described above.

--

There are few things more important for *reacting* to problems than to be *proactively* gathering data - and a Perfmon collector helps you to do just that.

Hope this helps!


Wednesday, February 10, 2016

How Bad Are Your Indexes?

In my last post "Copying SSIS packages With DTUTIL" I described a "gotcha" where the dynamic management views (DMVs) had different columns in SQL Server 2005 compared to 2008+, and I showed a version check I had built into the script to handle it.

It made me think about other places this check would be useful, and the first thing that came to mind was the Bad Indexes DMV query.

If you read my blog or have seen me speak at a SQL Saturday, you know I am a *big* fan of Glenn Berry (@GlennAlanBerry/blog) and his DMV Diagnostic Queries.

Glenn has put many hours of work into deciphering and joining the dynamic management views/functions that have been in SQL Server since 2005 into useful query frameworks, and I leverage his work (with credit) whenever I can.

The concept of a "bad" index in this context is an index with many more writes than reads - that is, the index requires more effort to maintain than there is benefit from its existence.


** IMPORTANT ** - any time you consider removing an index, always verify via use patterns, user interviews, etc. that the index is truly removable - you do *not* want to be the one to remove a "bad" index only to find that it is needed for the monthly payroll run, the quarterly bonus reports, or some other critical business process.  Quite often "bad" indexes are only used periodically but are crucial when they are.  An alternative to dropping the index outright is to check whether it can be dropped and then recreated when it is needed, as sketched below - but often the index creation process incurs too much overhead for this to be viable.
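As a sketch of that drop/recreate pattern (the table and index names are hypothetical):

/* drop the "bad" index during the heavy OLTP period */
DROP INDEX IX_Orders_PayrollReport ON dbo.Orders;
/* ...then recreate it just before the monthly payroll run needs it */
CREATE NONCLUSTERED INDEX IX_Orders_PayrollReport
ON dbo.Orders (EmployeeID, PayPeriodEnd);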
Out of the box, the bad index query from Glenn's script is database-specific (rather than instance-wide):
-- Possible Bad NC Indexes (writes > reads)  (Query 47) (Bad NC Indexes)
SELECT OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id,
i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor,
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference]
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
AND s.database_id = DB_ID()
AND user_updates > (user_seeks + user_scans + user_lookups)
AND i.index_id > 1
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION (RECOMPILE);
-- Look for indexes with high numbers of writes and zero or very low numbers of reads
-- Consider your complete workload, and how long your instance has been running
-- Investigate further before dropping an index!
The query relies on database-specific tables/views (such as sys.indexes) and therefore returns results for the current database context.

The first thing I wanted to do was to wrap the query in my undocumented little friend sp_msforeachdb.
The query turned out like this:
EXEC sp_msforeachdb '
/* MODIFIED from Glenn - Possible Bad NC Indexes (writes > reads)  (Query 58) (Bad NC Indexes) */
SELECT ''?'' as DBName,o.Name AS [Table Name], i.name AS [Index Name],
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference],
i.index_id,
i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN [?].sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
INNER JOIN [?].sys.objects as o WITH (nolock)
on i.object_ID=o.Object_ID
WHERE o.type = ''U''
AND s.database_id = DB_ID(''?'')
/* AND user_updates > (user_seeks + user_scans + user_lookups) */
AND i.index_id > 1
AND user_updates - (user_seeks + user_scans + user_lookups) >75000
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;
'
The query has all of the standard artifacts of sp_msforeachdb, such as the question mark placeholder for the database name sprinkled throughout to set the proper context for all of the database-specific tables and views (such as [?].sys.indexes).

This was where the version-specific problem came up - SQL Server 2008 introduced the concept of filtered indexes, and with it a new column (has_filter) in sys.indexes.  The result is that running the above query (which came from Glenn's SQL 2008 query script) against a SQL 2005 instance errors out with a non-existent column error.
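For reference, a filtered index - the SQL 2008+ feature behind the has_filter column - is simply an index with a WHERE clause; a hypothetical example:

CREATE NONCLUSTERED INDEX IX_Orders_Open
ON dbo.Orders (OrderDate)
WHERE OrderStatus = 'Open';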

One fix would have been to keep a modified version of the query without the offending column, which would line up with how Glenn publishes his queries - different scripts for each SQL Server version.

For *my* purpose I wanted a single script that I could run against any SQL Server 2005+, and the version check logic allows for that.

Here is the version checked version of the Bad Indexes For All Databases script:
/*
Bad Indexes DMV For All Databases
Modified by Andy Galbraith to run across all databases on the instance
Modified version of the Bad Indexes query in the Glenn Berry DMV scripts
http://www.sqlskills.com/blogs/glenn/category/dmv-queries/
Tested on MSSQL 2005/2008/2008R2/2012/2014
*/
SET NOCOUNT ON
DECLARE @SQLVersion char(4)
SET @SQLVersion = left(cast(SERVERPROPERTY('productversion') as varchar),4)
/* PRINT @SQLVersion */
IF LEFT(@SQLVersion,1) NOT IN ('1','9') /* Not 2005+ */
BEGIN
 PRINT 'SQL Server Version Not Supported By This Script'
END
ELSE
BEGIN
IF @SQLVersion = '9.00' /* 2005 */
BEGIN
/* SQL 2005 Version - removes i.has_filter column */
EXEC sp_msforeachdb '
/*
MODIFIED from Glenn - Possible Bad NC Indexes (writes > reads)  (Query 58) (Bad NC Indexes)
*/
SELECT ''?'' as DBName,o.Name AS [Table Name], i.name AS [Index Name],
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference],
i.index_id,
i.is_disabled, i.is_hypothetical, i.fill_factor
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN [?].sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
INNER JOIN [?].sys.objects as o WITH (nolock)
on i.object_ID=o.Object_ID
WHERE o.type = ''U''
AND s.database_id = DB_ID(''?'')
/* AND user_updates > (user_seeks + user_scans + user_lookups) */
AND i.index_id > 1
AND user_updates - (user_seeks + user_scans + user_lookups) >75000
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;'
END
ELSE
BEGIN
EXEC sp_msforeachdb '
/*
MODIFIED from Glenn - Possible Bad NC Indexes (writes > reads)  (Query 58) (Bad NC Indexes)
*/
SELECT ''?'' as DBName,o.Name AS [Table Name], i.name AS [Index Name],
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference],
i.index_id,
i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN [?].sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
INNER JOIN [?].sys.objects as o WITH (nolock)
on i.object_ID=o.Object_ID
WHERE o.type = ''U''
AND s.database_id = DB_ID(''?'')
/* AND user_updates > (user_seeks + user_scans + user_lookups) */
AND i.index_id > 1
AND user_updates - (user_seeks + user_scans + user_lookups) >75000
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;'
END
END
This did exactly what I wanted, returning all non-clustered indexes with at least 75,000 more writes than reads (my chosen threshold) across all databases on the SQL Server 2005+ instance.

Hope this helps!



Friday, February 5, 2016

Copying SSIS packages With DTUTIL

A frequent need when performing a server migration is to copy the SSIS packages from one server to a new server.  There are a couple of different ways to do this, including a wizard in SSMS (see https://www.mssqltips.com/sqlservertip/2061/how-to-manage-ssis-packages-stored-in-multiple-sql-server-database-instances/).  The catch is that these methods are manual and they only move one package at a time.

I recently had to migrate a server with over twenty packages, and I knew I didn't want to click-click-click over and over again.  :)

I looked around and was reminded of dtutil, the utility designed to manage DTS and then SSIS packages from the command line.  I found a comment at http://www.sqlservercentral.com/Forums/Topic1068518-1550-1.aspx that included a SELECT statement to generate dtutil commands based on the contents of msdb.dbo.sysssispackages:

select 'DTUTIL /SQL "'+f.foldername+'"/"'+ name +'" /DestServer [YOURSQLSERVERDEST] /COPY SQL;"'+f.foldername+'"/"'+name+'" /QUIET' 
from msdb.dbo.sysssispackages p
inner join msdb.dbo.sysssispackagefolders f
on p.folderid = f.folderid

I played with it a little and it served my purpose - I was able to generate twenty dtutil commands, drop them into a batch file in Notepad, and successfully run that batch from Windows to move the packages.
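Each generated command looked something like this (the folder and package names here are hypothetical):

DTUTIL /SQL "Maintenance"/"LoadOrders" /DestServer [YOURSQLSERVERDEST] /COPY SQL;"Maintenance"/"LoadOrders" /QUIET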

I fiddled with the script and started testing it on different SQL Server versions.  The biggest gotcha I found was that on SQL Server 2005 there is no sysssispackages table - the comparable table is sysdtspackages90 (and sysdtspackagefolders90).  A quick modification to the script to add a version check dealt with this:

-------

/*

SSIS Package Copy with DTUTIL in xp_cmdshell

Run on source server where packages are stored
Set parameter @TargetServer to server name where packages are moving

Modified by Andy Galbraith @DBA_Andy from an idea at http://www.sqlservercentral.com/Forums/Topic1068518-1550-1.aspx

Tested on MSSQL 2005/2008/2008R2/2012/2014

*/

SET NOCOUNT ON

DECLARE @TargetServer sysname,  @SQLVersion char(4)

SET @TargetServer = 'ServerB' 

SET @SQLVersion = left(cast(SERVERPROPERTY('productversion') as varchar),4)

/* PRINT @SQLVersion */

IF LEFT(@SQLVersion,1) NOT IN ('1','9') /* Not 2005+ */
BEGIN
PRINT 'SQL Server Version Not Supported By This Script'
END
ELSE
BEGIN
IF @SQLVersion = '9.00' /* 2005 */
BEGIN
select 'EXEC xp_cmdshell ''DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET''' 
from msdb.dbo.sysdtspackages90 p
inner join msdb.dbo.sysdtspackagefolders90 f
on p.folderid = f.folderid
END
ELSE /* 2008+ */
BEGIN
select 'EXEC xp_cmdshell ''DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET''' 
from msdb.dbo.sysssispackages p
inner join msdb.dbo.sysssispackagefolders f
on p.folderid = f.folderid
END
END

-------

In the above script I wrapped the dtutil statements in xp_cmdshell calls so that I could run it from SQL Server rather than the Windows command line (or batch files).
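Note that xp_cmdshell must be enabled for this version to work - a minimal sketch, subject to the security caveats discussed next:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;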

If your environment doesn't support xp_cmdshell (which is a completely different best practices discussion - see a great post by K. Brian Kelley (blog/@kbriankelley) here about the risks of enabling xp_cmdshell in your environment), then it is easy to remove the xp_cmdshell piece and return the results to simple dtutil calls:

-------

/*

SSIS Package Copy with DTUTIL

Run on source server where packages are stored
Set parameter @TargetServer to server name where packages are moving

Modified by Andy Galbraith @DBA_Andy from an idea at http://www.sqlservercentral.com/Forums/Topic1068518-1550-1.aspx

Tested on MSSQL 2005/2008/2008R2/2012/2014

*/

SET NOCOUNT ON

DECLARE @TargetServer sysname,  @SQLVersion char(4)

SET @TargetServer = 'ServerB' 

SET @SQLVersion = left(cast(SERVERPROPERTY('productversion') as varchar),4)

/* PRINT @SQLVersion */

IF LEFT(@SQLVersion,1) NOT IN ('1','9') /* Not 2005+ */
BEGIN
PRINT 'SQL Server Version Not Supported By This Script'
END
ELSE
BEGIN
IF @SQLVersion = '9.00' /* 2005 */
BEGIN
select 'DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET' 
from msdb.dbo.sysdtspackages90 p
inner join msdb.dbo.sysdtspackagefolders90 f
on p.folderid = f.folderid
END
ELSE /* 2008+ */
BEGIN
select 'DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET' 
from msdb.dbo.sysssispackages p
inner join msdb.dbo.sysssispackagefolders f
on p.folderid = f.folderid
END
END

-------

Hope this helps!