One DBA's Ongoing Search for Clarity in the Middle of Nowhere


*or*

Yet Another Andy Writing About SQL Server

Monday, June 13, 2016

Finding File Growths with Extended Events

I give a SQLSaturday session called "Introduction to Extended Events," and at a recent presentation a pair of attendees asked me afterwards for a more "real world" example.  I had a basic demo in my session showing how to set up a session with a predicate filter to pull particular events for a single database, but it didn't really show using XEvents to solve a business problem.

I pondered for a while and decided to try to recreate something I normally do with the default trace.  The obvious example to me was pulling data on file growths.

What?  You didn't know you could pull information about file growths from the default trace?

I discovered this some time ago when I needed to find out why a database was blowing up at 1am without needing to be awake and online at 1am.  At first I was going to set up a manual trace, but then I found a pair of blog posts that helped me out.

The first was "Did we have recent autogrows?" from Tibor Karaszi (blog/@TiborKaraszi), which clued me in to the fact that autogrow information is even in the default trace.  The catch to Tibor's query is that it relied on a specific trc file - the default trace is a rollover structure, so at any given time there are usually multiple sequential trc files, and using his query meant I needed to run it, edit the file name, run it again, and so on.

The next piece was the Simple Talk article "Collecting the Information in the Default Trace" from Feodor Georgiev (blog/@FeodorGeorgiev) that showed me how to dynamically build the log file path *and* the fact that I could simply reference log.trc (instead of log_123.trc) and the function would concatenate all of the trc files into a single resultset.
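For reference, here is the path-building piece in isolation (it is also embedded in the full query further down) - it strips the rollover file name off the default trace path and tacks log.trc back on:
/* Strip the rollover file name (log_123.trc etc.) from the default trace path
   and replace it with log.trc so fn_trace_gettable reads all of the rollover files */
SELECT REVERSE(SUBSTRING(REVERSE(path), CHARINDEX('\', REVERSE(path)), 256)) + 'log.trc' AS default_trace_path
FROM sys.traces
WHERE is_default = 1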

I have said it before and I will say it again - when you need help, odds are someone else has already solved the same (or a similar) problem - look around and you will probably find it!  When you save or reference someone else's work, always make sure to give credit (as I have done above) - even if you make modifications to code or ideas, always acknowledge the original material.

I took the two queries I found and combined them into a single modified query that uses Tibor's base query to pull autogrowths together with Feodor's concept of the custom-built log path:
/*
Default Trace AutoGrow Query
Modified from Tibor Karaszi
http://sqlblog.com/blogs/tibor_karaszi/archive/2008/06/19/did-we-have-recent-autogrow.aspx
and Feodor Georgiev
https://www.simple-talk.com/sql/database-administration/collecting-the-information-in-the-default-trace/
*/

DECLARE @df bit
SELECT @df = is_default FROM sys.traces WHERE id = 1
IF @df = 0 OR @df IS NULL
BEGIN
  RAISERROR('No default trace running!', 16, 1)
  RETURN
END
SELECT te.name as EventName
, t.DatabaseName
, t.FileName
, t.StartTime
, t.ApplicationName
, HostName
, LoginName
, Duration
, TextData
FROM fn_trace_gettable(
(SELECT REVERSE(SUBSTRING(REVERSE(path), CHARINDEX('\', REVERSE(path)),256)) + 'log.trc'
FROM    sys.traces
WHERE   is_default = 1
), DEFAULT) AS t
INNER JOIN sys.trace_events AS te
ON t.EventClass = te.trace_event_id
WHERE 1=1
and te.name LIKE '%Auto Grow'
--and DatabaseName='tempdb'
--and StartTime>'05/27/2014'
ORDER BY StartTime
--
SELECT TOP 1 'Oldest StartTime' as Label, t.StartTime
FROM fn_trace_gettable(
(SELECT REVERSE(SUBSTRING(REVERSE(path), CHARINDEX('\', REVERSE(path)),256)) + 'log.trc'
FROM    sys.traces
WHERE   is_default = 1
), DEFAULT) AS t
INNER JOIN sys.trace_events AS te
ON t.EventClass = te.trace_event_id
ORDER BY StartTime   
The top query returns all rows for events matching '%Auto Grow' (to capture both DATA and LOG autogrowths), and as the commented-out predicates show, you can also filter on a specific database, date range, etc.

The second query simply uses the same base table set to return the datetime of the oldest row in the current default trace.  Since the default trace is a rolling first-in-first-out (FIFO) arrangement, there is no way to definitively say how far back the trace goes other than to actually look in the trace like this.  If you have a lot of trace-relevant events or a lot of service restarts, your default trace may only hold a few days (or even hours) of data and not have what you need - for example, if you are trying to troubleshoot a problem from last Saturday and by Monday it has already aged out of the trace, this approach isn't useful.

The resultset looks like this:



Pretty slick, right?

...but what does this have to do with Extended Events?

As I said at the start, I was looking for something slick I do with a trace that I could replicate in Extended Events, and this was a great candidate.

--

The catch, as I discovered, is that while file growths are captured in the default trace, they are *not* captured in the system_health session...

This was slightly annoying, but it also gave me an excuse to create a demo for setting up an XEvents session :)
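If you want to verify that for yourself, one quick way is to list the events the built-in system_health session collects - neither file size change event shows up:
/* List the events collected by the built-in system_health session */
SELECT e.name AS event_name
FROM sys.server_event_sessions AS s
INNER JOIN sys.server_event_session_events AS e
ON e.event_session_id = s.event_session_id
WHERE s.name = 'system_health'
ORDER BY e.name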

--

First, the session:
USE [master]
GO
 
SET NOCOUNT ON 
/* Create Extended Events Session */
IF EXISTS (SELECT 1 FROM master.sys.server_event_sessions WHERE name = 'DemoFileSize')
DROP EVENT SESSION [DemoFileSize] ON SERVER
GO
CREATE EVENT SESSION [DemoFileSize] ON SERVER
ADD EVENT sqlserver.database_file_size_change(SET collect_database_name=(1)
    ACTION(package0.collect_system_time,sqlos.task_time,
sqlserver.client_app_name,sqlserver.client_hostname,
sqlserver.client_pid,sqlserver.database_id,sqlserver.database_name,
sqlserver.server_instance_name,sqlserver.session_id,
sqlserver.sql_text,sqlserver.username)),
/* Note - no predicate/filter - will collect *all* DATA file size changes */
ADD EVENT sqlserver.databases_log_file_size_changed(
    ACTION(package0.collect_system_time,sqlos.task_time,
sqlserver.client_app_name,sqlserver.client_hostname,
sqlserver.client_pid,sqlserver.database_id,sqlserver.database_name,
sqlserver.server_instance_name,sqlserver.session_id,
sqlserver.sql_text,sqlserver.username))
/* Note - no predicate/filter - will collect *all* LOG file size changes */
ADD TARGET package0.event_file(SET filename=N'C:\Temp\DemoFileSize.xel',
max_file_size=(500),max_rollover_files=(10))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,
MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO

ALTER EVENT SESSION [DemoFileSize] ON SERVER
STATE = START;
GO
This just creates a basic session tracking two events, sqlserver.database_file_size_change and sqlserver.databases_log_file_size_changed, and writing the output to a file target.
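As a quick sanity check, you can confirm the session exists and is actually running before moving on:
/* Confirm the DemoFileSize session exists and whether it is currently running */
SELECT s.name
, CASE WHEN d.name IS NULL THEN 'Stopped' ELSE 'Running' END AS session_state
FROM sys.server_event_sessions AS s
LEFT JOIN sys.dm_xe_sessions AS d
ON d.name = s.name
WHERE s.name = 'DemoFileSize'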

--

Next the database and table:
/* Create DemoGrowth database, dropping any existing DemoGrowth database */
IF EXISTS (SELECT database_id from sys.databases WHERE name = 'DemoGrowth')
BEGIN
 ALTER DATABASE [DemoGrowth] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
 DROP DATABASE [DemoGrowth]
END
CREATE DATABASE [DemoGrowth]
 ON  PRIMARY
( NAME = N'DemoGrowth_Data'
 , FILENAME = 'C:\Temp\DemoGrowth.MDF'
 , SIZE = 4MB , MAXSIZE = UNLIMITED, FILEGROWTH = 2MB )
 /* Small increments for maximum autogrowths */
 LOG ON
( NAME = N'DemoGrowth_log'
 , FILENAME = 'C:\Temp\DemoGrowth.LDF'
 , SIZE = 2MB , MAXSIZE = 2048GB , FILEGROWTH = 2MB )

ALTER DATABASE [DemoGrowth] SET RECOVERY FULL
GO

BACKUP DATABASE [DemoGrowth]
TO DISK = 'C:\TEMP\DemoGrowth.bak'
WITH STATS = 5, INIT

USE [DemoGrowth]
GO
/* Create DemoGrow table, dropping any existing DemoGrow table */
IF OBJECT_ID('dbo.DemoGrow','U') is not null
DROP TABLE dbo.DemoGrow
CREATE TABLE dbo.DemoGrow
/* Purposely large datatypes for maximum needed autogrowths */
(
ID BIGINT IDENTITY(1,1)
,BIGNAME NCHAR(4000) DEFAULT 'BOB'
)
Creating the table with an identity column and a character column that has a default value makes the test inserts straightforward to code.  The database was created small and the table with large datatypes in order to maximize potential auto-growths.
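For a rough sense of scale: an NCHAR(4000) column stores 8,000 bytes, so each row takes up most of an 8 KB page - the 6,000 inserts below work out to roughly 47 MB of data against a 4 MB data file growing 2 MB at a time, which means on the order of twenty data file autogrowths (plus whatever the log needs under FULL recovery).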

--

The next step is to Watch Live Data for the session in order to see the autogrowths as they happen - note that the session as configured writes to an event_file target, so Watch Live Data is the easiest way to watch events arrive in real time, albeit with some minor performance impact.



Next, let's insert some rows and see what happens:
/* Insert 6000 rows and return file sizes before and after inserts */
SELECT name, size*8.0/1024.0 as SizeMB from DemoGrowth.sys.sysfiles 
GO
INSERT INTO dbo.DemoGrow DEFAULT VALUES 
GO 6000 
SELECT count(ID) from dbo.DemoGrow (NOLOCK) 
SELECT name, size*8.0/1024.0 as SizeMB from DemoGrowth.sys.sysfiles
/* If after sizes <> before sizes, check Watch Live Data for rows */
The sysfiles query shows whether the data and log files have grown after the inserts - it is possible (although highly unlikely) that the 6000 inserts don't cause either file to grow; if this is the case (or if you just want to generate more traffic) you can re-run the code block multiple times.

--

Next we need to back up the LOG (remember that we are in FULL recovery) and manually shrink the database files in order to see that the shrinks also cause events:
BACKUP LOG DemoGrowth
TO DISK = 'C:\TEMP\DemoGrowth.trn'
WITH STATS = 5, INIT
/* Shrink files and return sizes before and after */
SELECT name, size*8.0/1024.0 as SizeMB from DemoGrowth.sys.sysfiles 
DBCC SHRINKFILE (N'DemoGrowth_Data')
GO
DBCC SHRINKFILE (N'DemoGrowth_Log')
GO
SELECT name, size*8.0/1024.0 as SizeMB from DemoGrowth.sys.sysfiles
/* Assuming shrink occurs, check Watch Live Data for rows */
/* If the LOG file doesn't shrink, re-run the LOG backup and the shrink */
Just as a shrink logs events, so do manual growths:
/* Manually grow DATA file to see effect on session */
SELECT name, size*8.0/1024.0 as SizeMB from DemoGrowth.sys.sysfiles
/* Check Current File Size to set an appropriate size in ALTER grow statement */
ALTER DATABASE [DemoGrowth] MODIFY FILE ( NAME = N'DemoGrowth_Data', SIZE = 128MB )
GO
SELECT name, size*8.0/1024.0 as SizeMB from DemoGrowth.sys.sysfiles
/* Check Watch Live Data for rows */
--

So what is in the event_file (XEL file)?  There is a system function (sys.fn_xe_file_target_read_file) to return that data:
/* So let's look at the XEL file */
SELECT * FROM sys.fn_xe_file_target_read_file('C:\temp\DemoFileSize*.xel',NULL, NULL, NULL);  

BLEAH!

As always, someone has already come up with a better solution.  Google brought me to SQL Zealot and his query to shred the XML at https://sqlzealots.com/2015/04/01/auto-file-growth-track-the-growth-events-using-extended-events-in-sql-server/  - I modified the query slightly to return the data elements I wanted, including the timestamp:
SELECT
Case when file_type = 'Data file' Then 'Data File Grow' Else File_Type End AS [Event Name]
, database_name AS DatabaseName
, dateadd(minute, datediff(minute, sysutcdatetime(), sysdatetime()), timestamp1) as LocalTimeStamp
/* added the timestamp and in XE is UTC - this code converts it to local server time zone */
, file_names
, size_change_mb
, duration
, client_app_name AS Client_Application
, client_hostname
, session_id AS SessionID
, sql_txt
, sql_username
, Is_Automatic
FROM
(
SELECT
(n.value ('(data[@name="size_change_kb"]/value)[1]', 'int')/1024.0) AS size_change_mb
, n.value('(@timestamp)[1]', 'datetime2') as timestamp1
, n.value ('(data[@name="database_name"]/value)[1]', 'nvarchar(50)') AS database_name
, n.value ('(data[@name="duration"]/value)[1]', 'int') AS duration
, n.value ('(data[@name="file_type"]/text)[1]','nvarchar(50)') AS file_type
, n.value ('(action[@name="client_app_name"]/value)[1]','nvarchar(50)') AS client_app_name
, n.value ('(action[@name="session_id"]/value)[1]','nvarchar(50)') AS session_id
, n.value ('(action[@name="client_hostname"]/value)[1]','nvarchar(50)') AS Client_HostName
, n.value ('(data[@name="file_name"]/value)[1]','nvarchar(50)') AS file_names
, n.value ('(data[@name="is_automatic"]/value)[1]','nvarchar(50)') AS Is_Automatic
FROM
(
SELECT CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file(
N'C:\temp\DemoFileSize*.xel'
, NULL
, NULL
, NULL
)
) AS Event_Data_Table
CROSS APPLY event_data.nodes('event') AS q(n)) xyz
ORDER BY timestamp1 desc
/* Much Better! */

--

So as you can see, a basic XEvents session provides information comparable to what the default trace gives you.

But... XEvents is supposed to be better, right?

I neglected to tell you one thing... in the query above you can see that we also pull the sql_text action - look what's in it:
INSERT INTO dbo.DemoGrow DEFAULT VALUES
Pretty slick, huh?  The sql_text shows you *what query* caused the file size change - something the default trace does not give you.  If you go back to the default trace, you will see there is a TextData field, but that field is NULL for all of the operations in question (it is used for a few other types of events, such as DBCC events, but not for the file growth events).

Also, you may have noticed that our XEvents are called sqlserver.database_file_size_change and sqlserver.databases_log_file_size_changed, and they return data for growths *and* shrinks.  The default trace only returns data on file growths directly (you could scrape the "Audit DBCC Event" rows looking for DBCC SHRINKDATABASE and DBCC SHRINKFILE, but there are a lot of other DBCC events to search through).
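If you did want to chase shrinks through the default trace, a rough sketch would be to reuse the log.trc path trick from earlier and filter the Audit DBCC Event rows for shrink commands:
/* Scrape the default trace DBCC events for shrink commands */
SELECT te.name as EventName
, t.DatabaseName
, t.StartTime
, t.TextData
FROM fn_trace_gettable(
(SELECT REVERSE(SUBSTRING(REVERSE(path), CHARINDEX('\', REVERSE(path)),256)) + 'log.trc'
FROM    sys.traces
WHERE   is_default = 1
), DEFAULT) AS t
INNER JOIN sys.trace_events AS te
ON t.EventClass = te.trace_event_id
WHERE te.name = 'Audit DBCC Event'
AND (t.TextData LIKE '%SHRINKFILE%' OR t.TextData LIKE '%SHRINKDATABASE%')
ORDER BY t.StartTime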

--

Extended Events are awesome, and with a little work can give you everything that trace gives you and so much more.

Hope this helps!



2 comments:

  1. Great article, but shouldn't the final query be as below, which includes the action nodes, rather than the one in the article?
    SELECT
    Case when file_type = 'Data file' Then 'Data File Grow' Else File_Type End AS [Event Name]
    , database_name AS DatabaseName
    , dateadd(minute, datediff(minute, sysutcdatetime(), sysdatetime()), timestamp1) as LocalTimeStamp
    /* added the timestamp and in XE is UTC - this code converts it to local server time zone */
    , file_names
    , size_change_mb
    , duration
    , client_app_name AS Client_Application
    , client_hostname
    , session_id AS SessionID
    , sql_txt
    , sql_username
    , Is_Automatic

    FROM
    (
    SELECT
    (n.value ('(data[@name="size_change_kb"]/value)[1]', 'int')/1024.0) AS size_change_mb
    , n.value('(@timestamp)[1]', 'datetime2') as timestamp1
    , n.value ('(data[@name="database_name"]/value)[1]', 'nvarchar(50)') AS database_name
    , n.value ('(data[@name="duration"]/value)[1]', 'int') AS duration
    , n.value ('(data[@name="file_type"]/text)[1]','nvarchar(50)') AS file_type
    , n.value ('(action[@name="client_app_name"]/value)[1]','nvarchar(50)') AS client_app_name
    , n.value ('(action[@name="session_id"]/value)[1]','nvarchar(50)') AS session_id
    , n.value ('(action[@name="client_hostname"]/value)[1]','nvarchar(50)') AS Client_HostName
    , n.value ('(data[@name="file_name"]/value)[1]','nvarchar(50)') AS file_names
    , n.value ('(data[@name="is_automatic"]/value)[1]','nvarchar(50)') AS Is_Automatic
    , n.value ('(action[@name="sql_text"]/value)[1]','nvarchar(500)') AS sql_txt
    , n.value ('(action[@name="username"]/value)[1]','nvarchar(50)') AS sql_username

    FROM
    (
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file(
    N'C:\temp\DemoFileSize*.xel'
    , NULL
    , NULL
    , NULL
    )
    ) AS Event_Data_Table
    CROSS APPLY event_data.nodes('event') AS q(n)) xyz
    ORDER BY timestamp1 desc

    1. Spot on - the query I included in the blog originally was an earlier iteration and omitted the sql_text and sql_username fields - I have updated it to include those fields as your example shows - thanks!
