One DBA's Ongoing Search for Clarity in the Middle of Nowhere


Yet Another Andy Writing About SQL Server

Tuesday, December 8, 2015

T-SQL Tuesday #73 - Not Running CHECKDB is Naughty!

This month T-SQL Tuesday is hosted by Bradley Balls (blog/@SQLBalls) and his topic of choice is “Naughty or Nice.”

(If you don’t know about T-SQL Tuesday check out the information here – each month there is a new topic and it is a great excuse to write each month (and to start writing!) because someone offers a topic, so you already have taken the first step!).


Probably the number one problem (Naughty, Naughty!) I find when I run across a new server is DBCC CHECKDB - or rather, the complete lack of CHECKDB.

I have heard any number of excuses over the years, but they are almost all some variant of this:

It takes too much time/CPU/RAM/IO/Wheatons to run CHECKDB on my databases.

To this I say one thing - <BLEEP!>

At the end of the day you can't afford *not* to run CHECKDB- you just can't.  Your database objects are only as reliable as your last clean CHECKDB (meaning your last CHECKALLOC + CHECKCATALOG + individual CHECKTABLE's on all of your tables - but more on that in a moment).

If you are not running regular CHECKDB, you may have endless unknown corruption across your databases, and you won't find out until someone tried to access the data.  

Ask yourself - wouldn't you rather find out about data corruption from a regular maintenance process than from the Senior Vice President when she tries to run her Management Big Bonuses Report and it fails?

My recommendation is always to run CHECKDB as often as is plausible, with a practical line of at most once per day (there is no need to run CHECKDB every minute just because you can).

If you can't run daily CHECKDB's, then run it weekly (or even monthly), but run it.

One thing I have done at multiple clients that have not had "sufficient" resources to run full CHECKDB on a regular basis is to split the job up into its component parts mentioned above.  This comes from an idea originally given by Paul Randal (blog/@PaulRandal) and works out like this:

First, analyze the size of your tables to create grouping of comparable sizes.  I like a personal variant of the script at from "marc_s" that breaks out the objects:
s.Name AS SchemaName,
t.NAME AS TableName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM sys.tables t
INNER JOIN sys.schemas s
ON s.schema_id = t.schema_id
INNER JOIN sys.indexes i
ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p
ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a
ON p.partition_id = a.container_id
t.NAME NOT LIKE 'dt%'    /* filter out system tables for diagramming */
AND t.is_ms_shipped = 0
GROUP BY t.Name, s.Name, p.Rows
ORDER BY TotalSpaceKB desc
Next, after you have divided the tables into two to four groups, schedule them into jobs running CHECKTABLE individually on groups of tables, with a final job running a catch-all step for all tables not explicitly checked - something like this (I like Ola Hallengren maintenance so this syntax is from there, but the concept will be visible even if you aren't familiar with his code):

Job 1 - run CHECKTABLE on three big tables in BigDatabase01:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'BigDatabase01.dbo.BigTable01, BigDatabase01.dbo.BigTable02, BigDatabase01.dbo.BigTable03',
@LogToTable = 'Y'

Job 2 - run CHECKTABLE on the next two big tables in BigDatabase01:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'BigDatabase01.dbo.BigTable04, BigDatabase01.dbo.BigTable05',
@LogToTable = 'Y'

Job 3 - Step 1 - run CHECKTABLE on all tables in BigDatabase01 except the five big tables from Jobs 1 and 2 - this "catch-all" step is very important because you need to make sure new tables added to the database get checked - if all you have are steps running against individual tables, new table never get checked!
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'ALL_OBJECTS, -BigDatabase01.dbo.BigTable01, -BigDatabase01.dbo.BigTable02, -BigDatabase01.dbo.BigTable03, -BigDatabase01.dbo.BigTable04, -BigDatabase01.dbo.BigTable05',
@LogToTable = 'Y'

Job 3 - Step 2 - run CHECKTABLE on all tables in all databases except BigDatabase01 - the same "catch-all" logic applies here - if you run checks against individual named databases (rather than all databases except BigDatabase01) then new databases don't get picked up:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'USER_DATABASES, -BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'ALL_OBJECTS',
@LogToTable = 'Y'

Job 3 - Step 3 - run CHECKALLOC on all databases:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'USER_DATABASES',
@CheckCommands = 'CHECKALLOC',
@LogToTable = 'Y'

Job 3 - Step 4 - run CHECKCATALOG on all databases:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'USER_DATABASES',
@CheckCommands = 'CHECKCATALOG',
@LogToTable = 'Y'

Over the course of three jobs (usually scheduled over three nights), this plan runs CHECKTABLE on all tables in all databases, runs CHECKALLOC on all databases, and runs CHECKCATALOG on all databases - BINGO! CHECKDB!

This is a relatively simple example with a server that has one large database with multiple large tables. (Of course, this is a fairly common situation on many of these "can't CHECKDB" servers.)  The same principle - divide the CHECKDB operation into component parts - allows me to scheduled CHECKDB on multiple systems when a client says "oh we can *never* run CHECKDB here."

This same principle of exclusion is useful to start running CHECKDB (or its components) on systems that have large databases or objects even if you can't get the offending objects in progress.  Usually when I find one of these servers, there are no CHECKDB's in place on *any* databases even though the problem is ReallyBigTable01 in BiggestDatabase01.  A great first step is to set up CHECKDB or its components on all objects except the offending one(s) - as above, set up CHECKTABLE on everything on the server except ReallyBigTable01 and also CHECKCATALOG and CHECKALLOC everywhere, and you have much more protection while you figure out what to do about ReallyBigTable01.


At the end of the day - stay off the naughty list - 

No comments:

Post a Comment