One DBA's Ongoing Search for Clarity in the Middle of Nowhere


*or*

Yet Another Andy Writing About SQL Server

Monday, November 18, 2013

The Next Phase of My Career

For anyone who hasn't seen an update somewhere else in the social media universe, I recently started a new job as a Senior SQL Server DBA for Ntirety on their remote DBA team.  Ntirety is a top Remote DBA services firm that works in MSSQL, Oracle, and mySQL support.

Why would you do that, Andy?

I made this move for several reasons:
  • Ntirety has a larger Microsoft SQL Server business than my previous employer - sure Ntirety also handles Oracle and mySQL, but MSSQL is the largest portion of the business.  My previous employer was heavily Oracle (and VMware) focused.
  • The ability to work from home - at my previous job, I had the opportunity to intermittently work from home when I had an appointment, etc. (and they were very easy to work with from this point-of-view), but the expectation was clearly that we should be in the office whenever possible (and this expectation has grown over the two years I was there).  My new job is 100% WFH with occasional trips to the mother ship in Boston.  I have never worked 100% from home, but am looking forward to the opportunity to not have a 30-40 minute commute each way every day.
  • Decreased travel - at my previous job we were 50/50 (maybe 60/40) consulting/remote managed services, and almost all of the consulting was traveling to client sites all over the United States.  While I traveled less than some of my colleagues, it was still significantly more travel than I would like.  I came into the company after an interview process discussion about how the company did both types of work and that people could choose a path, but it became readily apparent that the model was designed around starting in managed services and then "graduating" (for lack of a better analogy) into becoming a traveling consultant, and this model is not for me.
  • The opportunity to work with other senior personnel - two of the staff at Ntirety are MCMs and I know just working with them will help me enhance my skillset. (HASHTAG MCMsAreCool HASHTAG Fanboy HASHTAG TotallyStealingThisFromJimmyAndJT)  As many readers of my blog already know, the star MSSQL employee (and hopefully soon-to-be MCM) from my former company recently left to form his own practice, leaving a void that has not been filled.
Sure there is a little more money (plus some savings in gas, etc. from no commute) but that isn't the reason I moved - if I had just needed a little more cash I would have worked with my employer to find out what I needed to do to make that happen. This is 100% about quality of life, the opportunity for technical growth and (most importantly) the opportunity to be more present with my family and our boys.

So what's next?

As you may (or may not) have noticed, my blogging has been very slack over the last year,  and I hope my new role will re-energize me to increase my focus on this important aspect of technical development.  As mentioned above, I have never worked from home full-time, so expect to see posts in the coming weeks on my experiences with this new job style.  There are already multiple blogs describing tips and tricks and personal experiences on working from home (listed in no particular order):
While I greatly value all of these fellow #sqlfamily members' opinions, I also know that everyone's experience is different and I look forward to documenting my personal journey with telecommuting.

I also know that every job brings a new set of technological experiences, and hope to blog about those as well. 

Wish me luck!

Wednesday, July 10, 2013

Hung SQL Server Restores from Tivoli Storage Manager/Tivoli Data Protection



I ran into this with a client last night when trying to restore their SQL Server data warehouse database from TSM.

We went through several iterations of the restore failing without any meaningful error other than a cryptic “TCP/IP Connection Refused” even though we could ping and telnet from the database server to the AIX server hosting TSM.  The TSM technician at the client could see multiple tapes loading and then just sitting, with only a little data transferred to the database server at the beginning of the process but no large scale data transfer.  From my end on SQL Server I could see several connections from the TSM client (Tivoli Data Protection/TDP) that appeared to be waiting as well for BACKUPIO with little CPU or I/O traffic, but the wait time was cycling up and down rather than continuously increasing, as if the thread was doing a little work rather than completely blocked.  Perfmon showed no I/O on the disk to which we were going to restore.

The first several tries at the restore ran for about 20 minutes and then failed with the TCP/IP error.  When this occurred, a database was created on the SQL Server in “restoring” mode immediately before the failure and empty files were created on the disk at this same time.  For each subsequent attempt I deleted the restoring database and the empty files.

I found an article that talks about the length of time required to zero-initialize large database files and the potential need to increase the timeout on the TSM server (COMMTIMEOUT) to allow the SQL Server time to zero-initialize the files.  We tried this, but realized after 25 minutes of no change that with no traffic on the disk (in PerfMon) that it couldn’t be just waiting for SQL to zero-init the files.

The TSM technician found a IBM article about Tivoli Data Protection that talked about hung restores being related to striped backups.  In this client’s case, they stripe their TSM/TDP backups with four stripes to increase performance (fine) but they do it to media pools (not fine).  As a result in this case, there were four backup stripes written to only three tapes.  By default, the TSM/TDP client restores with the same number of stripes as the backup, but when it tried to queue up all four stripes at once it couldn’t with two of them on one tape. 

The answer was to modify the restore request to run as a single striped restore, which succeeded.  It still took about two hours for the restore of the 214GB database, but when we kicked off the single striped restore we immediately saw a difference on both the TSM server and the SQL Server.  It took the TSM server a couple minutes to load the first tape and then progress was visible in the TSM/TDP client window, data was being transferred from the TSM server, and the connections in SQL Server showed I/O.

This is potentially an issue with any enterprise backup system that backs up using multiple stripes to tapes in media pools unless that system is internally intelligent enough to prevent the “two stripes on one tape” issue described.

Wednesday, June 19, 2013

New Page - "How to Get Involved With the SQL Server Community"

I recently composed an email (on the side) for a client DBA looking for info on getting involved with the community, looking for training, etc - and now I have turned it into a web page here on my blog - please check it out and please let me know anything I should add or fix - I want this to be as accurate and helpful as possible - thanks!

http://nebraskasql.blogspot.com/p/how-to-get-started-with-sql-server.html


Tuesday, May 28, 2013

SQLPerformance.com

I have had the opportunity to go through the first three SQLSkills Immersion Events, and one of the side benefits of that is getting to meet and interact with their team.  One of their team that I never really even knew of before he joined SQLskills was Joe Sack, and (like everyone at SQLskills) he is just too smart (It isnt fair!) :)

He has a great new article up on SQLPerformance.com about "Troubleshooting SQL Server CPU Performance Issues" and while it may seem like basic stuff it isn't - CPU issues are not something every DBA deals with every day (Memory and IO issues yes, but CPU not so much) and he describes a good framework for where to look for information and how to get started.

I don't think for a minute that there is anybody reading my little blog that doesn't already read the SQLskills blogs, but I wanted to draw attention to this because it isn't a SQLskills blogSQLPerformance.com is a dual effort between SQLskills and SQL Sentry, and their blogroll is crazy-talented:



Make sure to check it out - a new article comes out every week or two, and the content is top of the line.  You won't regret it!

Tuesday, May 14, 2013

T-SQL Tuesday #42 - The Long and Winding Road








It's T-SQL Tuesday again (Thanks go to Adam Machanic (B/T)) and this month the host is Wendy Pastrick (B/T).  Wendy's excellent subject is "the long and winding road" - changes in our work lives and how they impact us.

I am luckier than most in that many of the changes in my work life over the years have been self-induced - I have never been fired, laid off, downsized, or any of those other HR terms.  I started as a help desk support admin who got the opportunity to learn Microsoft SQL Server when our group at the University went to a Microsoft development model (Visual Basic 6.0 on SQL Server 7.0 at the time - yes, I'm old).  Since then I have changed jobs from working for a major public University to a large regional bank to a regional healthcare system then back to that bank and then on to my current consulting gig, and all along I have changed positions on my terms - not always exactly when I would have preferred, but at least on my say-so.

Most of the direct changes in my work life over the years have been changes of technology and role - learning MSSQL 2000-2012, gaining experience in different industries and interacting with different side technologies such as VMware, becoming a senior/lead DBA and then a consultant, building amazing relationships with my #sqlfamily, and meeting some of the most brilliant (and nicest) people on the planet.

All of that aside, the biggest change in my 13+ year DBA work life is an indirect change that has nothing to do with any of the above.  Six years ago I married my best friend, and over the last three-and-a-half years we have had three amazing boys (Noah 3 years, Jonah 20 months, and Micah 3 months).

Wow - this really all I can say - Wow.

I am lucky enough that what I do is sufficiently lucrative that my wife can stay home with our boys, but an unfortunate side effect of what I do is that I have to go on the road for my job - both to consult and for training - sometimes for days, and sometimes for weeks in a row.  I am writing this in a hotel room in Chicago while they are 500 miles away at home.  My growing desire to be involved with SQL Saturdays and the community (my #sqlfamily) just infringes on my real family even more.  It has become an amazing balancing act and many days I don't feel like I do one side or the other justice, but I keep trying.

...and trying.
 
I am at the point in my career life where nothing is more important than my wife and kids, and getting further ahead in my job is less important than getting a raise/promotion/whatever at work.  My job is merely a means to an end to allow me to support my family - it is not my reason for being.  This is a double-edged sword as it can be difficult to even keep up in our field without excessive time off-clock keeping our knowledge updated, and it also makes me painfully aware of the times I bring work frustrations home and how that impacts my wife and sons (even as little as they are).  I begrudge every minute I am on-call or have to work an after-hours release because I know I am missing some first in my boys' little lives, but that's the job, and I know that.

No matter what, I will always keep trying - they are my everything and they are definitely worth it.



Thursday, February 7, 2013

Announcing SQL Saturday 197 Omaha Pre-Cons!

Here is a flyer produced by someone on the team - he says it all very well:

--

Friday, April 5, 2013
Get Warmed Up For SQL Saturday By Attending The Preconference!  

Three Great Speakers, Three Great Topics!

--

Scaling SQL Server 2012 (Glenn Berry - @GlennAlanBerry)

How can you scale SQL Server 2012? Scaling up is relatively easy (but can be expensive), while scaling out requires significant engineering time and effort. If you suggest hardware upgrades you may be accused of simply “throwing hardware at the problem”, and if you try to scale out, you may be thwarted by a lack of development resources or 3rd party software restrictions. As your database server nears its load capacity, what can you do? This session gives you concrete, practical advice on how to deal with this situation. Starting with your present workload, configuration and hardware, we will explore how to find and alleviate bottlenecks, whether they are workload related, configuration related, or hardware related. Next, we will cover how you can decide whether you should scale up or scale out your data tier. Once that decision is made, you will learn how to scale up properly, with nearly zero down-time. If you decide to scale out, you will learn about practical, production-ready techniques such as vertical partitioning, horizontal partitioning, and data dependent routing. We will also cover how to use middle-tier caching and other application techniques to increase your overall scalability.


Practical Self-Service BI with PowerPivot 2012 for Excel (Bill Pearson - @Bill_Pearson)


SQL Server MVP and Business Intelligence Architect Bill Pearson leads this full-day, hands-on introduction to using PowerPivot 2012 for Excel to deliver self-service business intelligence.  The focus of the course is to help those new to PowerPivot to become familiar with the assembly of data from diverse sources into models that deliver business intelligence upon demand.  Participants will gain exposure to accessing and relating data, and to employing the Data Analysis Expressions (DAX) language, to construct and share PowerPivot applications that support analysis and reporting throughout the enterprise.  The intended audience is information workers (business- or IT-based) involved in analysis and reporting of data,  intermediate - to - advanced Excel users,  practitioners that want to gain familiarity with PowerPivot to build compelling analysis and reporting applications to deliver self-service BI.


 


Prof. PowerShell: Or How I Learned to Stop Worrying and Love PowerShell (Jeff Hicks - @JeffHicks)


Windows PowerShell has a special talent for simultaneously exciting and terrifying IT Pros. Many see it as just another scripting language or passing fad. Other's see it as an essential technology for managing today's Windows-based environments. Some IT Pros feel overwhelmed and don't know where to start and others dive right into the deep end, sometimes with mixed results. In this one day session, PowerShell MVP, author and Prof. PowerShell will dispel fears, myths and misconceptions surrounding this must-know management tool. If you've been dabbling in PowerShell, wondering what all the fuss is about, or simply want to get a jump start, this is the session for you. After this crash course on PowerShell 3.0 you'll realize you had nothing to worry about begin to feel the PowerShell love.

This isn't a formal training course, but bring your laptop with PowerShell 3.0 and feel free to follow along.


Get Tickets Now:
http://SQLSaturdayOmahaPrecon.eventbrite.com/

Pricing:
Early Bird Special = $100
After March 22nd = $115

--

Register now!

Tuesday, February 5, 2013

The Value of Certifications

A former co-worker recently sent me an email asking about Microsoft certifications, and if I pursued them, and what I thought about them.  I started to write a response and it sounded like a good blog post {-: so here it is:

--

I am certified through the gills in SQL 2000 (MCDBA) and 2005 (MCITP), but when it came time to upgrade to 2008 I got distracted by life and let it slide.  I finally tried the upgrade exam once right before it retired and just missed passing it.

My experience with certs is that many people don't value Microsoft certs because they are seen as just paper - the only cert that really is valued in the community is the MCM (now the MCSM) - the Master certification.  The catch in the MCM/MCSM is that it requires most of the lower level certs as prerequisites - you have to be an MCITP Admin *and* an MCITP Developer as prereq's for Master.  (The developer requirement is why I sincerely doubt I will ever pursue MCM/MCSM.)

Unless you are interested in going for Master (or are looking at a specific job listing that requires you to be an MCITP) I would not spend your effort (and $$) on it - you will not see much real reward out of it other than whatever intrinsic feeling of accomplishment you get from passing tests - there are no raises or better jobs or anything else just from being certified *if* you already have experience.  For a brand-new DBA there may be some benefit if you are competing for jobs against other new DBA's - the guy with the no experience and the cert may have an edge over the guy with no experience and no cert.

Having said that, the certification blueprints (topic lists) are good sources of study material if you just want to know what there is to learn for learning's sake - but you can do that without shelling out a few hundred dollars per test for multiple tests.  Most of my SQL 2000 and 2005 certs were achieved while I was working for a boss who saw great value in certification and education and therefore funded everything.

Someone told me back when I worked at the University of Nebraska - a certification just tells a potential employer one thing - that you know how to take tests.  For Microsoft certs lower than Master, I completely agree.  The Master is a two-part written and lab exam and is supposed to be the best exam MS has ever created.  One of the guys in our office recently took (and passed) the Knowledge (written) half of the exam and it blew his mind compared to all of the MCITP tests.

So....if you are thinking of becoming a certified Master (I think there are ~75 of them in the world) then you *have* to take the lower tests and you should get on it.  If you are not thinking of that, then I wouldn't worry about it.

--

...and that's my $.02

Thursday, January 24, 2013

Why is My New Server Under CPU Pressure?



Signal waits frequently translate to the amount of time that SQLOS is under pressure for CPU resources, as outlined in this TechNet article (http://technet.microsoft.com/en-us/magazine/hh781189.aspx). While a lower number is almost always better, it is a useful metric is to start paying attention to when the number rises over 10%. Your system will usually start presenting issues at the hard line of 25%.  This something that we check on every server that we touch as part of our regular health check.

Today I tripped over a conundrum - I was given a new server with a fresh install of SQL Server 2008R2 and Reporting Services, and on this system the Signal Waits were measured at 25.19% on our scan, which shows that this system appeared to be under some CPU pressure.  All wait statistics are reset each time the SQL Server service restarts, so this number (25.19%) is only since the last service restart.

This system in question is a physical box and has two sockets with six cores each with hyperthreading (so 24 logical CPU’s), and that made me pause – why does a new server with two small databases (from Reporting Services) and 24 logical CPU’s show high Signal Waits, which usually indicates CPU pressure?

The answer is that in this case, this new server is *too* quiet, and as such the numbers are skewed.  Here are the specific wait stats, measured after clearing the total wait statistics using the DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR) command.


 
As can be seen in this shot, almost all of the Signal Wait time is in two categories, XE_TIMER_EVENT and REQUEST_FOR_DEADLOCK_SEARCH, both of which are basically system background processes (waiting for XEVENT processing and deadlock processing, respectively) and neither of which are therefore relevant to measuring the busyness of the server.  On a regularly busy server, all of the other wait categories (over 400 in total) greatly outweigh these two background processes and the normal query we use that compare signal wait time to total wait time is useful.  In this case if we exclude these two categories:

  
We now see that the server has zero Signal Waits and no CPU pressure, as expected.  For a good description of the different types of waits and what they mean, look at http://blogs.msdn.com/b/psssql/archive/2009/11/03/the-sql-server-wait-type-repository.aspx from the Microsoft Customer Service and Support (CSS) team.

Linchi Shea has written several great articles on Signal Waits and what they mean in SQL Server.  They are here and here - check them out as well.