One DBA's Ongoing Search for Clarity in the Middle of Nowhere


*or*

Yet Another Andy Writing About SQL Server

Wednesday, April 18, 2018

Why Don't I have a Preferred Replica?


The client reported that the log file on their main database was growing unusually large, and when they checked the backup target the database didn't have any LOG backups!

Oh, and the database was part of an Availability Group.

https://imgflip.com/i/28n8j8
--

After some investigation we found that while the Availability Group was configured to prefer the primary replica for backups, it wasn't evaluating as the preferred backup via the system function sys.fn_hadr_backup_is_preferred_replica!

What we found in the output logs for the LOG backup (Hallengren) job looked like this:

--

Availability group: DATABASE1_AG1
Availability group role: PRIMARY
Availability group backup preference: PRIMARY
Is preferred backup replica: No

--

Even though the preferred backup replica is the primary replica, and we are on the primary replica, the is_preferred check says No.

A test showed that a manual BACKUP LOG statement worked, but both the Hallengren DatabaseBackup *and* a "regular" maintenance plan backup ignored Database1.

To check the preferred replica status we manually ran the command in a query window:

--

SELECT sys.fn_hadr_backup_is_preferred_replica(‘Database1’)

--

...it came back as 0 (false) on both nodes – SQL1 and SQL2.


As long as the preferred backup replica is 0, no intelligent backups will run against that replica, which is why backups weren’t happening on SQL1 or SQL2.

--

Now what?  Back to the old standby:

https://imgflip.com/i/28n92w

As described in the solution post, one possible answer is if the internal server names don't match - we checked the servers and sure enough:

--

SELECT SERVERPROPERTY('ServerName') as [InstanceName]
, SERVERPROPERTY('ComputerNamePhysicalNetBIOS') as [ComputerNamePhysicalNetBIOS]
, @@SERVERNAME as ServerName

--


InstanceName
ComputerNamePhysicalNetBIOS
@@SERVERNAME
SQL1
SQL1
123-223



InstanceName
ComputerNamePhysicalNetBIOS
@@SERVERNAME
SQL2
SQL2
437-326

--

The SERVERPROPERTY names reflect the friendly SQL1 and SQL2 names while the @@SERVERNAME reflects the client's standard infrastructure team ###-### names.

This often means that the Windows Server was renamed after SQL was installed but the final step of renaming SQL wasn’t taken (more on that at the end of the email).

--

Renaming the servers (taking a downtime) was not immediately acceptable so after some consideration we tried manually editing the code of the master.dbo.DatabaseBackup procedure to hard-code the fact that the primary replica is truly the preferred backup replica:

-- 
Original code: 
IF @Version >= 11 AND @Cluster IS NOT NULL AND @CurrentAvailabilityGroup IS NOT NULL
BEGIN
SELECT @CurrentIsPreferredBackupReplica = sys.fn_hadr_backup_is_preferred_replica(@CurrentDatabaseName)
END


Modified code: 
IF @Version >= 11 AND @Cluster IS NOT NULL AND @CurrentAvailabilityGroup IS NOT NULL AND @CurrentAvailabilityGroupRole = 'PRIMARY'
BEGIN
SELECT @CurrentIsPreferredBackupReplica = 1
END
--
               
…and that allowed the backups to run successfully by setting the procedure to true as long as the replica is primary.

http://2.bp.blogspot.com/_s03XW1ZvVvE/S9E3tjNwDMI/AAAAAAAAA6w/2QFwfKb-aNo/s1600/kool-aid.jpg

--

One other catch that came up after that also relates directly back to the mismatched names…after making the above change, the LOG backup of Database1 was succeeding but the overall LOG backup job was still failing.  Looking in the job output we found an error:

--

Msg 50000, Level 16, State 1, Server SQL1, Procedure DatabaseBackup, Line 1041

--

The author of DatabaseBackup (Ola Hallengren) is ready for the fact that mismatched names can cause issues and has a custom error built into his procedure to throw the Error 50000 if the names don’t match.  The procedure proceeds to run backups if it can, but still ultimately fails overall with the 50000 message.

To mitigate this we further edited the code of DatabaseBackup to comment out the code block that generates this specific error:

--

/*
  2018-04-18 - Commented out to prevent 50000 error since @@SERVERNAME does not match ServerName

  AndyG, Ntirety


  IF @@SERVERNAME <> SERVERPROPERTY('ServerName') AND SERVERPROPERTY('IsHadrEnabled') = 1
  BEGIN
    SET @ErrorMessage = 'The @@SERVERNAME does not match SERVERPROPERTY(''ServerName''). See ' + CASE WHEN SERVERPROPERTY('IsClustered') = 0 THEN 'https://docs.microsoft.com/en-us/sql/database-engine/install-windows/rename-a-computer-that-hosts-a-stand-alone-instance-of-sql-server' WHEN SERVERPROPERTY('IsClustered') = 1 THEN 'https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/install/rename-a-sql-server-failover-cluster-instance' END + '.' + CHAR(13) + CHAR(10) + ' '
    RAISERROR(@ErrorMessage,16,1) WITH NOWAIT
    SET @Error = @@ERROR
  END
*/

--

This directly bypasses the error check and prevents the 50000 error for the name mismatch.

--

As an aside I want to comment on the code editing we were doing - when modifying code of any type consider the source and their licensing/intellectual property rights.  Ola's licensing is very short and sweet:

--

License 
The SQL Server Maintenance Solution is licensed under the MIT license, a popular and widely used open source license.
Copyright (c) 2018 Ola Hallengren 
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
https://ola.hallengren.com/license.html 

--

At the end of the day as long as you don't remove his copyright notices you have the right to modify his code.

This is very common with code online - ALWAYS ATTRIBUTE YOUR SOURCES - but if you are looking at any type of commercial code be extremely careful ; not only may you be violating the letter (or spirit) of the law, you may even prevent the code's creator from supporting you in the future!

--

The edits to DatabaseBackup are allowing the jobs to run successfully for now, but it is truly a bandage rather than a real fix.  The ultimate fix is to rename the SQL Server internally (as described at the link in Ola’s 5000 error https://docs.microsoft.com/en-us/sql/database-engine/install-windows/rename-a-computer-that-hosts-a-stand-alone-instance-of-sql-server)

The document title references a “stand-alone instance of SQL Server” but that is what this is since it is two SQL Servers with an availability group rather than a failover cluster (shared) instance.

*BUT*

It is unknown from the information here whether changing the @@SERVERNAME from 123-223 to SQL1 would impact any other business or application processes – this is something the client's teams would need to determine before making any changes to the name.

Also changing the server name would require a SQL downtime (service restart, not Windows reboot) for the name change to take effect (as described in the Microsoft doc) so even if you are comfortable with the name change it needs to be scheduled for a downtime window.

--

Hope this helps!


Wednesday, April 4, 2018

Toolbox - Do You Have Current Backups?

On our managed servers we run a job each day to check that either a FULL or DIFF has run in the last 24 hours, but all-too-frequently I find myself on a server that we don't manage, trying to determine the current situation (aka the "health check").

An important thing to check with the health check - or even when responding to a 911-SAVEME call - is whether there are current backups.

https://stephcalvertart.com/wp-content/uploads/2014/07/funny-memes-wordpress-maintenance-backups-updates-hearts-and-laserbeams-star-wars-obi-wan-kenobi.jpg
Similar to my previous Toolbox post,  I wanted a script I could just run without worrying whether the server was 2005 or 2016, and it also needed to handle availability groups.  This second requirement made the first even more important as their are new DMV's for availability groups that are needed in SQL 2012+ but *don't exist* in 2005-2008.

I used the same SQLVersion logic as the previous post, CONVERTing the SERVERPROPERTY('ProductVersion') into a four character value:
9.00
10.0
10.5
11.0
12.0
13.0
...which I then used to branch the code.

Like most of my code, I started with several general queries (cited in the code) and then modified them significantly for my needs. #AlwaysGiveCreditWhereDue

--


https://memegenerator.net/instance/53270254/yeah-thatd-be-great-yeah-if-you-could-just-show-me-some-code-thatd-be-great
Without further ado, the code:

--
/*
Most Recent Backups

There is a code branch to allow for availability groups so that it is visible
whether the given database is the preferred backup replica or not
since backups may appear “missing” on the non-preferred replicas

Guts of availability group backup query modified from:
http://www.centinosystems.com/blog/sql/when-was-your-last-backup/

Modified to exclude AG-only fields for pre-2012
*/


DECLARE @SQLVersion as NVARCHAR(4)

SET @SQLVersion = LEFT(CONVERT(VARCHAR, SERVERPROPERTY('ProductVersion')),4)

/*
PRINT @SQLVersion
*/


IF @SQLVersion in ('9.00', '10.0', '10.5')
BEGIN
SELECT @@SERVERNAME as [InstanceName]
, db.[name] as [DatabaseName]
, db.[recovery_model_desc] as [RecoveryModelDescription]
, 'Not AG Database' as [IsAGDatabase] /* Placeholder since no AGs before 2012 */
, 'N/A' as [AGName]
, 'N/A' as [IsPreferredReplica] /* Placeholder since no AGs before 2012 */
,  CASE
WHEN LastFullBackup is NULL
THEN 'NONE'
ELSE 
CONVERT(varchar(10),t.[LastFullBackup], 111) + ' ' /* Format YYYY/MM/DD */
+ CONVERT(varchar(10),t.[LastFULLBackup], 108) /* Format HH:MM:SS */
  END  as [LastFullBackupDate]
, CASE
WHEN LastDiffBackup is NULL
THEN 'NONE'
ELSE 
CONVERT(varchar(10),t.[LastDiffBackup], 111) + ' '  /* Format YYYY/MM/DD */
+ CONVERT(varchar(10),t.[LastDiffBackup], 108) /* Format HH:MM:SS */
  END  as [LastDiffBackupDate]
, CASE
WHEN db.[recovery_model_desc] = 'SIMPLE'
THEN 'N/A - SIMPLE RECOVERY' /* No LOG backups in SIMPLE */
WHEN LastLogBackup is NULL
THEN 'NONE'
ELSE 
CONVERT(varchar(10),t.[LastLogBackup], 111) + ' ' /* Format YYYY/MM/DD */ 
+ CONVERT(varchar(10),t.[LastLogBackup], 108) /* Format HH:MM:SS */
  END as [LastLogBackupDate]
, CASE
WHEN LastFullBackup is NULL
THEN 'N/A - NO FULLS'
ELSE  cast(t.[DaysSinceLastFullBackup] as varchar(100))
  END as [DaysSinceLastFullBackup]
, CASE
WHEN LastDiffBackup is NULL
THEN 'N/A - NO DIFFS'
ELSE  cast(t.[DaysSinceLastDiffBackup] as varchar(100))
  END as [DaysSinceLastDiffBackup]
, CASE
WHEN db.[recovery_model_desc] = 'SIMPLE'
THEN 'N/A - SIMPLE RECOVERY' /* No LOG backups in SIMPLE */
WHEN LastLogBackup is NULL
THEN 'N/A - NO LOGS'ELSE
cast (t.[MinutesSinceLastLogBackup]  as varchar(100))
  END as [MinutesSinceLastLogBackup]
FROM sys.databases db
LEFT OUTER JOIN
(
SELECT p.[database_name] as [DatabaseName]
, MAX(p.[D]) as [LastFullBackup]
, MAX(p.[I]) as [LastDiffBackup]
, MAX(p.[L]) as [LastLogBackup]
, DATEDIFF(DAY, MAX(p.[D]), GETDATE()) as [DaysSinceLastFullBackup]
, DATEDIFF(DAY, MAX(p.[I]), GETDATE()) as [DaysSinceLastDiffBackup]
, DATEDIFF(MINUTE, MAX(p.[L]), GETDATE()) as [MinutesSinceLastLogBackup]
FROM msdb.dbo.backupset bs
PIVOT (MAX(bs.backup_finish_date) FOR [type] IN ([D],[L],[I])) as p
GROUP BY p.[database_name]
) t
ON db.[name] = t.[DatabaseName]
/*
-- Commented Out since no AG's before 2012
LEFT OUTER JOIN sys.dm_hadr_database_replica_states agdb
ON agdb.[database_id] = db.[database_id]
AND agdb.[is_local] = 1
LEFT OUTER JOIN sys.dm_hadr_name_id_map agmap
ON agdb.group_id = agmap.ag_id
*/
WHERE db.[name]<>'tempdb'
ORDER BY LastFullBackupDate DESC
END

ELSE /* SQL 2012+ */

BEGIN
SELECT @@SERVERNAME as [InstanceName]
, db.[name] as [DatabaseName]
, db.[recovery_model_desc] as [RecoveryModelDescription]
, CASE
  WHEN agdb.[database_id] IS NOT NULL
  THEN 'AG Database'
  ELSE 'Not AG Database'
  END as [IsAGDatabase]
, ISNULL(agmap.ag_name,'N/A') as [AGName]
, CASE
  WHEN
  sys.fn_hadr_backup_is_preferred_replica(db.[name]) = 1
  AND agdb.[database_id] IS NOT NULL
  THEN 'YES'
  WHEN sys.fn_hadr_backup_is_preferred_replica(db.[name]) = 1
  AND agdb.[database_id] IS NULL
  THEN 'N/A'
  ELSE 'NO'
  END as [IsPreferredReplica]
, CASE
WHEN LastFullBackup is NULL
THEN 'NONE'
ELSE 
CONVERT(varchar(10),t.[LastFullBackup], 111) + ' ' /* Format YYYY/MM/DD */ 
+ CONVERT(varchar(10),t.[LastFULLBackup], 108) /* Format HH:MM:SS */
  END  as [LastFullBackupDate]
, CASE
WHEN LastDiffBackup is NULL
THEN 'NONE'
ELSE 
CONVERT(varchar(10),t.[LastDiffBackup], 111) + ' ' /* Format YYYY/MM/DD */ 
+ CONVERT(varchar(10),t.[LastDiffBackup], 108) /* Format HH:MM:SS */
  END  as [LastDiffBackupDate]
, CASE
WHEN db.[recovery_model_desc] = 'SIMPLE'
THEN 'N/A - SIMPLE RECOVERY' /* No LOG backups in SIMPLE */
WHEN LastLogBackup is NULL
THEN 'NONE'
ELSE 
CONVERT(varchar(10),t.[LastLogBackup], 111) + ' ' /* Format YYYY/MM/DD */ 
+ CONVERT(varchar(10),t.[LastLogBackup], 108) /* Format HH:MM:SS */
  END as [LastLogBackupDate]
, CASE
WHEN LastFullBackup is NULL
THEN 'N/A - NO FULLS'
ELSE  cast(t.[DaysSinceLastFullBackup] as varchar(100))
  END as [DaysSinceLastFullBackup]
, CASE
WHEN LastDiffBackup is NULL
THEN 'N/A - NO DIFFS'
ELSE  cast(t.[DaysSinceLastDiffBackup] as varchar(100))
  END as [DaysSinceLastDiffBackup]
, CASE
WHEN db.[recovery_model_desc] = 'SIMPLE'
THEN 'N/A - SIMPLE RECOVERY' /* No LOG backups in SIMPLE */
WHEN LastLogBackup is NULL
THEN 'N/A - NO LOGS'ELSE
cast (t.[MinutesSinceLastLogBackup]  as varchar(100))
  END as [MinutesSinceLastLogBackup]
FROM sys.databases db
LEFT OUTER JOIN
(
SELECT p.[database_name] as [DatabaseName]
, MAX(p.[D]) as [LastFullBackup]
, MAX(p.[I]) as [LastDiffBackup]
, MAX(p.[L]) as [LastLogBackup]
, DATEDIFF(DAY, MAX(p.[D]), GETDATE()) as [DaysSinceLastFullBackup]
, DATEDIFF(DAY, MAX(p.[I]), GETDATE()) as [DaysSinceLastDiffBackup]
, DATEDIFF(MINUTE, MAX(p.[L]), GETDATE()) as [MinutesSinceLastLogBackup]
FROM msdb.dbo.backupset bs
PIVOT (MAX(bs.backup_finish_date) FOR [type] IN ([D],[L],[I])) as p
GROUP BY p.[database_name]
) t ON db.[name] = t.[DatabaseName]
LEFT OUTER JOIN sys.dm_hadr_database_replica_states agdb
ON agdb.[database_id] = db.[database_id]
AND agdb.[is_local] = 1
LEFT OUTER JOIN sys.dm_hadr_name_id_map agmap
ON agdb.group_id = agmap.ag_id
WHERE db.[name]<>'tempdb'
ORDER BY LastFullBackupDate DESC
END
--

The results look like this (split for readability):

--

InstanceName DatabaseName Recovery
Model
Description
IsAGDatabase AGName IsPreferredReplica
Instance01 Database01 FULL Not AG Database N/A N/A
Instance01 Database02 FULL AG Database AvailGroup99 YES
Instance01 Database03 FULL AG Database AvailGroup99 YES
Instance01 Database04 FULL AG Database AvailGroup99 YES
Instance01 Database05 FULL AG Database AvailGroup99 YES
Instance01 Database06 FULL AG Database AvailGroup99 YES
Instance01 msdb SIMPLE Not AG Database N/A N/A
Instance01 model FULL Not AG Database N/A N/A
Instance01 master SIMPLE Not AG Database N/A N/A

InstanceName DatabaseName LastFullBackupDate LastDiffBackupDate LastLogBackupDate
Instance01 Database01 NONE NONE NONE
Instance01 Database02 2018/04/03 22:30:39 NONE 2018/04/04 13:01:09
Instance01 Database03 2018/04/03 22:30:38 NONE 2018/04/04 13:01:08
Instance01 Database04 2018/04/03 22:30:38 NONE 2018/04/04 13:01:09
Instance01 Database05 2018/04/03 22:29:40 NONE 2018/04/04 13:01:02
Instance01 Database06 2018/04/03 22:28:29 NONE 2018/04/04 13:01:01
Instance01 msdb 2018/04/03 22:00:05 NONE N/A - SIMPLE RECOVERY
Instance01 model 2018/04/03 22:00:04 NONE NONE
Instance01 master 2018/04/03 22:00:02 NONE N/A - SIMPLE RECOVERY

InstanceName DatabaseName DaysSinceLastFullBackup DaysSinceLastDiffBackup MinutesSinceLastLogBackup
Instance01 Database01 N/A - NO FULLS N/A - NO DIFFS N/A - NO LOGS
Instance01 Database02 1 N/A - NO DIFFS 23
Instance01 Database03 1 N/A - NO DIFFS 23
Instance01 Database04 1 N/A - NO DIFFS 23
Instance01 Database05 1 N/A - NO DIFFS 23
Instance01 Database06 1 N/A - NO DIFFS 23
Instance01 msdb 1 N/A - NO DIFFS N/A - SIMPLE RECOVERY
Instance01 model 1 N/A - NO DIFFS N/A - NO LOGS
Instance01 master 1 N/A - NO DIFFS N/A - SIMPLE RECOVERY

--

The code flags whether each database is in an AG, and if so whether it is currently the primary replica and the name of its availability group.

It calculates the age of the most recent backup of each type and then sorts by the FULL value desc.

--

I use this query all the time - hope this helps!