One DBA's Ongoing Search for Clarity in the Middle of Nowhere


*or*

Yet Another Andy Writing About SQL Server

Tuesday, March 2, 2021

Disabling Non-Clustered Indexes For Fun and Profit

One of the most frequently quoted ETL best practices is to disable your non-clustered indexes during a data load.  The conventional wisdom goes like this: disable the non-clustered indexes, run your load, and then enable the non-clustered indexes again, which functionally rebuilds them with the newly loaded data (a rebuild you should be performing after a load of any size even if you *don't* disable your indexes - rebuild those newly fragmented indexes!).
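In code form the pattern is just three steps - a minimal sketch, with hypothetical table and index names:

```sql
-- 1. Disable the non-clustered index (repeat for each one on the table)
ALTER INDEX IX_BigTable_SomeColumn ON dbo.BigTable DISABLE;

-- 2. Run the data load
-- INSERT INTO dbo.BigTable SELECT ... FROM dbo.BigTable_Stg;

-- 3. Re-enable the index - for a disabled index, REBUILD *is* the enable step
ALTER INDEX IX_BigTable_SomeColumn ON dbo.BigTable REBUILD;
```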

So why doesn't anybody seem to do this?

https://memegenerator.net/img/instances/53039992/ponder-we-must.jpg

--

IMPORTANT NOTE <cue scary music> - do NOT disable the clustered index when doing data loads or at any other time you want the table to be accessible.  Disabling the clustered index makes the object (table or view) inaccessible.

If you disable the clustered index you functionally disable the object, and any attempt to access it will fail with Error 8655:

Msg 8655, Level 16, State 1, Line 23
The query processor is unable to produce a plan because the index 'PK_Person_BusinessEntityID' on table or view 'Person' is disabled.
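If you want to see the error for yourself without touching a real table, a quick throwaway repro (table and index names here are hypothetical) looks like this:

```sql
-- Throwaway demo: disable a clustered index and watch the table become unreachable
CREATE TABLE dbo.DisableDemo (Id int NOT NULL, Val varchar(10));
CREATE CLUSTERED INDEX CIX_DisableDemo ON dbo.DisableDemo (Id);

ALTER INDEX CIX_DisableDemo ON dbo.DisableDemo DISABLE;

SELECT * FROM dbo.DisableDemo;   -- fails with Msg 8655 - the index is disabled

ALTER INDEX CIX_DisableDemo ON dbo.DisableDemo REBUILD;  -- table is accessible again

DROP TABLE dbo.DisableDemo;
```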

https://memegenerator.net/img/instances/62333367/no-please-dont-do-that.jpg

-- 

I recently had a situation with a client whose load process altogether runs over 24 hours - not one day-long query, but a string of processes that usually takes 24+ hours to complete.  It had worsened to the point where other things were being impacted by how long the process was taking, and they asked me to check it out.

I started with all of the basics, and right away Page Life Expectancy during the relevant period jumped out:



The server has 128GB of RAM...so a PLE consistently below 1,000 is pretty low.
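(For reference, you can spot-check PLE yourself without a monitoring tool - it is exposed in sys.dm_os_performance_counters:)

```sql
-- Current Page Life Expectancy in seconds - one row per NUMA node (Buffer Node)
-- plus the server-wide value (Buffer Manager)
SELECT [object_name], counter_name, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Page life expectancy';
```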

I used XEvents to gather queries with large memory grants, and it brought to light an issue the client had previously considered but never implemented - disabling indexes during the load.

** SIDEBAR**

The code for the XEvents session I used to pull large memory grant queries is as follows - it collects any memory grant over 8MB (a threshold you can modify in the WHERE clause) along with the calling query:

CREATE EVENT SESSION [Ntirety_MemoryGrantUsage] ON SERVER
ADD EVENT sqlserver.query_memory_grant_usage
(
    ACTION
    (
        sqlserver.client_app_name,
        sqlserver.client_hostname,
        sqlserver.database_id,
        sqlserver.database_name,
        sqlserver.server_principal_name,
        sqlserver.session_id,
        sqlserver.sql_text
    )
    WHERE ([package0].[greater_than_uint64]([granted_memory_kb],(8192)))
)
ADD TARGET package0.event_file(SET filename=N'Ntirety_MemoryGrantUsage',max_file_size=(256)),
ADD TARGET package0.ring_buffer(SET max_events_limit=(0),max_memory=(1048576))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=5 SECONDS,
      MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
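One note - CREATE EVENT SESSION only defines the session (STARTUP_STATE=ON just makes it auto-start after an instance restart), so you still have to start it yourself:

```sql
-- Start collecting; stop the session again once you have what you need
ALTER EVENT SESSION [Ntirety_MemoryGrantUsage] ON SERVER STATE = START;
-- ALTER EVENT SESSION [Ntirety_MemoryGrantUsage] ON SERVER STATE = STOP;
```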

The query to then retrieve the data is this - it dynamically pulls your default error log path and then queries the XEL output file from the event_file target in the session at that path:

USE master
GO

DECLARE @ErrorLogPath nvarchar(400), @XELPath nvarchar(500)

-- Strip the trailing 'ERRORLOG' (8 characters) to get the log directory
SET @ErrorLogPath = (
SELECT LEFT(CAST(SERVERPROPERTY('ErrorLogFileName') AS nvarchar(400)), LEN(CAST(SERVERPROPERTY('ErrorLogFileName') AS nvarchar(400))) - 8)
)

SET @XELPath = @ErrorLogPath + N'Ntirety_MemoryGrantUsage*.xel'

SELECT DISTINCT *
FROM
(
    SELECT
      DATEADD(hh, DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), event_data.value(N'(event/@timestamp)[1]', N'datetime')) AS Event_time
    , CAST(n.value('(data[@name="granted_memory_kb"]/value)[1]', 'bigint') / 1024.0 AS decimal(30,2)) AS granted_memory_mb
    , CAST(n.value('(data[@name="used_memory_kb"]/value)[1]', 'bigint') / 1024.0 AS decimal(30,2)) AS used_memory_mb
    , n.value('(data[@name="usage_percent"]/value)[1]', 'int') AS usage_percent
    , n.value('(action[@name="database_name"]/value)[1]', 'nvarchar(50)') AS database_name
    , n.value('(action[@name="client_app_name"]/value)[1]', 'nvarchar(50)') AS client_app_name
    , n.value('(action[@name="client_hostname"]/value)[1]', 'nvarchar(50)') AS client_hostname
    , n.value('(action[@name="server_principal_name"]/value)[1]', 'nvarchar(500)') AS [server_principal_name]
    , n.value('(@name)[1]', 'varchar(50)') AS event_type
    , n.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS sql_text
    FROM
    (
        -- fn_xe_file_target_read_file returns event_data as text, so cast it to XML first
        SELECT CAST(event_data AS XML) AS event_data
        FROM sys.fn_xe_file_target_read_file(@XELPath, NULL, NULL, NULL)
    ) AS Event_Data_Table
    -- CROSS APPLY must sit outside the derived table so n is in scope for the SELECT list
    CROSS APPLY event_data.nodes('event') AS q(n)
) MemoryGrantUsage
ORDER BY Event_time DESC

** END SIDEBAR **

Looking at the top memory consumers over the time period turned up some pretty basic queries:

INSERT INTO "dbo"."Junk" 
SELECT "UID"
, "ContainerId"
, "CategoryCode"
, "CategoryName"
, "CategoryDataValue"
, "CharacteristicCode"
, "CharacteristicName"
, "CharacteristicDataValue"
, "DimensionCode"
, "DimensionName"
, "DimensionDataValue"
, "OptionId"
, "NumberOfItems"
, "IsStandard"
, "LanguageId"
, "Market"
, "VehicleType"
, "YearMonth"
, "BatchRefreshDate"
, "CreatedBy"
, "CreatedTimestamp"
, "ModifiedBy"
, "ModifiedTimestamp"
, "BatchId" 
FROM "dbo"."Junk_Stg" 
where "Market" = 'XYZ'       
  
https://media.makeameme.org/created/so-simple-right.jpg

The immediate problem to me was that memory grants usually come from SORT and HASH operations (check out Erik Darling's great description at https://www.erikdarlingdata.com/starting-sql/starting-sql-memory-grants-in-execution-plans/), and this query doesn't obviously do either - it is a very straightforward INSERT...SELECT from one table into another.  Yet the query plan looks like this:

 
Pretty crazy, right?  With all of those SORTs, no wonder SQL Server wants to give this query a giant memory grant!

But why are there six different SORT operations on this one basic INSERT...SELECT?

Then I realized this is why:


Six SORTs...for SIX non-clustered indexes!

Inserting rows into a table with enabled non-clustered indexes requires a SORT per index - the INSERT can't simply dump the rows onto the end of each index, but has to sort the incoming rows to match each index's key order.

As a test I created a copy of the target table that I could tweak the indexes on, and a second copy without the indexes in place, and my INSERT...SELECT query plan against the No Index copy looked a little bit better:


Next I copy-pasted the two INSERT...SELECT's into a single query window and pulled the plans for each so that I could compare them:


The Query Cost numbers in Management Studio always have to be taken with a grain of salt, but for comparison's sake consider: the six-index insert costs almost four times as much as the no-index query!

To test further I disabled three of the indexes on my copy table and ran the plan again:


Bingo - six streams now down to three for the three remaining enabled indexes.

I disabled the rest of the non-clustered indexes and:


https://memegenerator.net/img/instances/63334945/booyah.jpg

For one final comparison I ran the double query against my "Indexes All Newly Disabled" copy and my "No Index" copy together:


Spot on!

--

As I mentioned above, the cost of this approach is that at the end of the process you have to allow time to enable the indexes again, which rebuilds each of them.
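If you want to try this on your own load, one way to script it is to generate the ALTER INDEX statements from sys.indexes - a sketch, using the dbo.Junk table from the example above, that deliberately skips the clustered index per the warning at the top of this post:

```sql
-- Disable every currently enabled non-clustered index on the target table
DECLARE @sql nvarchar(max) = N'';

SELECT @sql += N'ALTER INDEX ' + QUOTENAME(i.name)
             + N' ON dbo.Junk DISABLE;' + CHAR(13) + CHAR(10)
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID(N'dbo.Junk')
  AND i.type_desc = N'NONCLUSTERED'
  AND i.is_disabled = 0;

EXEC sys.sp_executesql @sql;

-- ...run the load, then generate the same statements with REBUILD
-- instead of DISABLE to enable (and defragment) the indexes again
```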

As always, Your Mileage May Vary - the direct impact will depend on the size of the table, the number of non-clustered indexes in place, and several other factors - but check it out, especially if your process runs longer than you want.  This may be the problem!

Hope this helps!

Monday, February 8, 2021

Unable to Connect to Target Server? But, I'm Already Connected!

I am working on an Azure SQL Database migration from an on-premises VM, and at the client's request I am working through the BACPAC Import/Export process rather than using Azure Database Migration Service.

https://i.imgflip.com/2t69mm.jpg
I exported the database into a BACPAC file, and went to perform the Import operation into Azure using the wizards in SSMS.

(Awesome step-by-step instructions at https://www.mssqltips.com/sqlservertip/5255/learn-how-to-migrate-a-sql-server-database-to-sql-azure-server/ - thanks Vitor Montalvão!)

The problem is my Import operation failed with an error that didn't make any sense.  To start the process you connect to your Azure SQL server in Management Studio, and then run the Import Data-Tier Application wizard:

...and point it at your BACPAC file to import, providing details on what you want the new database to be called, what service tier it needs to be in, etc.  At that point the import runs, and everything is slick, right?

"Unable to connect to target server"

Um...I was already connected to the server to get *this* far!

After some poking around, I tripped over several references to a potential "bug" related to importing BACPAC files, as described here https://stackoverflow.com/questions/19990779/unable-to-connect-to-target-server-when-restoring-bacpac-file-to-sql-server-20 

When I checked, sure enough, the Management Studio on this server was the 2014 version.

https://media.tenor.com/images/258a205ccd6bfa979a3505e95f1fcda7/raw

I went to https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms/ and downloaded the current Management Studio and applied it to the server, and:

If you can, keep your tools updated - in most cases the tools are backwards compatible, so unless you have written custom plug-ins, etc. you are likely OK.  Also, most recent versions install side-by-side, so you can add a new version without deleting the old one.

Hope this helps!


Friday, January 15, 2021

Another One Bites the Dust - PASS is Gone, so...What Now?

There has been a lot of noise over the last few weeks since it was announced that, due to financial constraints caused by canceling the in-person Summit during COVID-19, PASS would cease to exist as of January 15, 2021.

After reading the release and a few tweets, I paused and considered.

As I read through the various materials, I was struck by several distinct emotions...

https://tenor.com/view/blue-cry-sad-bad-day-gif-13423780

My initial response was Sadness; I knew when the decision was made (and basically had to be made) to move Summit from in-person to all-virtual that there would be repercussions, especially financial ones.  I have never been on the Board or anything like that, but I have always paid enough attention to be aware that a majority of the funding that kept our no-membership-fee organization going came from Summit registrations, and you can't charge in-person experience fees for a virtual conference.  I have been to a half dozen Summits over the years (presenting twice), presented at numerous SQLSaturdays, and been a part of I-don't-know-how-many in-person and virtual user groups.  The first thing through my mind was that all of that was...gone.


The more I read, the more I felt Anger and Disgust; over the years there has been an ongoing conflict within the SQL Server community regarding the operations of PASS and its management company Christianson & Company (C&C).  There have been numerous snipes back and forth on whether C&C did a good job, did they cost too much money, did they care about the community, etc. etc. etc.  I have always stayed out of these discussions, but also always been annoyed how some people couldn't make the distinction between the workings of the organization and the volunteers that make things work.  At times I have seen people on the PASS Executive Board and other volunteers attacked because of one decision or another, and it just isn't right.  Even if a certain person participated in a decision (or made it individually themselves) there is no reason to attack someone who is giving personally of their time.  I have met many of the people who have served as PASS leadership over the years, and while they have varying personalities they have all been dedicated to supporting the organization for little personal gain, not back-room power-playing, and I salute them.

https://imgflip.com/memetemplate/63747205/inside-out-joy

Finally I arrived at a wistful sense of Joy.  I have learned so much from the various functions of PASS over the years, but even more importantly I have met and interacted with so many amazing people along the way - people I never would have met without PASS.  I blog because I saw a presentation talking about how blogging can help you learn and share your knowledge. I became a speaker as the next step down that learn and share path, and becoming a speaker is what really opened up networking to me.  For everything I have learned in a session at a SQLSaturday or User Group session, I have learned even more from someone I met at a Speaker Dinner or in the hallway between sessions.  This is true of Community members and Microsoft employees alike - on multiple occasions I have met and had long discussions with Microsoft Product Team members who design, build, and fix the SQL Server product, and I would not have done so without PASS.

So now what?  What should we feel now?

https://wallha.com/wallpaper/fear-inside-out-440608

Fear? NO!!!

There are lots of options for content post-PASS, and I have compiled a list I want to share with you!

--

The biggest single source of content I have found is meetup.com.  Many of the larger user groups have turned to Meetup during COVID while they couldn't meet in-person, and even more are springing up there now since PASS made its announcement.  All of the groups I list here are "Public" so regardless of what geographic area they are affiliated with, anyone should be able to join!

--

First are a couple of groups that have arisen from the ashes of large PASS Virtual Chapters:

Database Professionals Virtual Meetup Group - 1,453 members - Meets Biweekly https://www.meetup.com/dbavug/ 

DBA Fundamentals Group - 877 members - Meets Weekly

There are many regional/local groups on Meetup as well; I gathered this list by spending some time searching Meetup for SQL and Azure content and then checking the groups to see if they have regular meetings.  All of the groups listed here meet at least monthly and have regular meetings scheduled.  I found many groups that had hundreds of people listed as members but if they didn't have anything scheduled I didn't include them.

Here they are sorted by member count - an omission from this list does not mean anything in particular other than a group didn't show up on my search or didn't have regular meetings scheduled at the time I ran the search:

New England SQL Server User Group - 2,767 members - Meets Monthly

Azure in the ATL - 2,409 members - Meets Monthly

Northern Virginia SQL Server Users Group - 1,553 members - Meets Monthly

Triangle SQL Server User Group (NC) - 1,541 members - Meets Monthly

Arizona SQL Server User Group - 1,136 members - Meets Monthly

Sacramento SQL Server User Group - 994 members - Meets Monthly

North Boston Azure Cloud User Group - 899 members - Meets Monthly

Utah SQL Server Group - 848 members - Meets Monthly

Rhode Island SQL Server User Group - 610 members - Meets Monthly

PASSMN (Minnesota SQL Server User Group) - 596 Members - Meets Monthly

Atlanta Azure Data User Group - 589 members - Meets Monthly

CincyPASS - Cincinnati SQL Server Users Group - 476 members - Meets Monthly

Albuquerque SQL Server User Group - 354 members - Meets Monthly

Data Platform Downunder Meetup Group (Australia Time Zone) - 90 members - Meets Weekly

--

A couple of more non-Meetup links:

Data Saturdays - a potential replacement for SQLSaturdays - there are a few upcoming events scheduled already:

SQLUGs - a site to host local user group webpages being provided for *free* by Denny Cherry and Associates (website/@DCACco) as a service to the SQL Community:

--

Let me know if you find other treasure troves of information to share, and hope this helps!


https://www.lemax.net/wp-content/uploads/2018/09/Much-to-learn-we-all-still-have.jpg