May 23, 2011
 

If there’s one thing that really stands out about SQL Server, it’s the fantastic social community behind the product. If you keep an eye on SQL blogs and Twitter networks, you can see events running every week across the US and the UK. Australia also has quite a strong SQL Server community, with the Melbourne SQL Server user group being exceptionally popular. Although 30 minutes is set aside for socialising at each meeting, I feel it’s just not enough – I’m such a social butterfly, you see.

To that end, Luke Hayler (@lukehayler) and I have organised an evening of social drinks. Come along, meet some other SQL Server professionals based in Melbourne, and discuss anything you like: SQL Server, virtualisation, your dog, or just wax lyrical about the quality of the beer available.

More details, and signup information at http://sqlserversocial.eventbrite.com/

Luke’s original post


That’s my kind of database cluster!

Apr 12, 2011
 

Welcome to T-SQL Tuesday for April 12, 2011. This month is generously hosted by Matt Velic (Blog | Twitter), who poses the topic of APPLY, one of the many excellent additions to the T-SQL language in SQL Server 2005.

If you read the other T-SQL Tuesday posts from today, you’ll pick up some excellent tips on how to use APPLY, and lists of the great things you can do with it. I’m going to go in a slightly different direction, and examine what your options are when using APPLY with database compatibility level 80.

Books Online states that the database must be at least compatibility level 90 to use APPLY.

It turns out that you can create a query that uses APPLY, and have it run in compatibility level 80. Consider the following setup:

CREATE DATABASE Compat80
GO
ALTER DATABASE Compat80 SET COMPATIBILITY_LEVEL=80
GO

USE Compat80
GO
SET NOCOUNT ON 
CREATE TABLE A (i int, name varchar(20)) 
CREATE TABLE B (i int, amount decimal(15,3)) 
INSERT INTO A (i, name) VALUES (1, 'Jack') 
INSERT INTO A (i, name) VALUES (2, 'Bob') 
INSERT INTO A (i, name) VALUES (3, 'Sally') 
INSERT INTO B (i, amount) VALUES (1, 25.0) 
INSERT INTO B (i, amount) VALUES (1, 50.5) 
INSERT INTO B (i, amount) VALUES (2, 16.0) 
INSERT INTO B (i, amount) VALUES (3, 110.0) 
GO 

Can we use an APPLY in Compatibility 80?

We’ve created a database in compatibility level 80, and created two tables, A and B. Let’s try a query with CROSS APPLY:

SELECT A.Name, dv.Amount FROM A 
CROSS APPLY (SELECT i, SUM(Amount) Amount 
             FROM B WHERE B.i = A.i GROUP BY i) dv 

(Query results: Jack – 75.500, Bob – 16.000, Sally – 110.000)

There’s not much to say here, except that I’ve proved Books Online wrong – I can do a CROSS APPLY in compatibility level 80. I don’t know exactly why this works, but it’s likely due to the simple nature of the query (which also makes it less interesting). I’ll do a quick little victory dance, and we’ll move on.

A more useful application of APPLY

Where APPLY really shines is when functions are used. Let’s go ahead and create a simple function to test with:

CREATE FUNCTION dbo.fn_getB(@i AS int) 
RETURNS @result TABLE 
( 
      i   int NOT NULL 
    , amount decimal(15,3) 
) 
AS BEGIN 
        INSERT INTO @result 
        SELECT i, amount FROM B WHERE i = @i 
        RETURN 
END 
GO

The logic of this function is slightly different from the earlier query, but that only serves to make it more interesting. Let’s go ahead and use this function in an APPLY, remembering that we’re still in compatibility level 80:

SELECT * FROM A 
CROSS APPLY dbo.fn_getB(A.i) dv 

Msg 102, Level 15, State 1, Line 2

Incorrect syntax near ‘.’.

That’s disappointing, and rather cryptic. Let’s try again after switching to Compatibility 90:

ALTER DATABASE Compat80 SET COMPATIBILITY_LEVEL=90
GO 
SELECT * FROM A 
CROSS APPLY dbo.fn_getB(A.i) dv 
GO 

(Query results: each row of A joined to its matching rows from B – four rows in total)

Success!

Getting around this restriction

We’ve seen that APPLY with functions is a handy tool, but can we still use it against databases that are in compatibility level 80? The following workaround is a bit of a hack, but I just like thinking outside the box.

The limitation here is that we cannot let a Compatibility 80 database execute the query. We can, however, run the query from within a different database, such as Master (ignoring permission issues) by using sp_executesql.

EXEC sp_executesql N'USE Master; 
        SELECT * FROM Compat80.dbo.A 
        CROSS APPLY Compat80.dbo.fn_getB(A.i) dv ' 
GO

The main reason I’d want to use APPLY is to use the DMFs with the DMVs provided by SQL Server 2005, such as sys.dm_exec_*. However, these DMVs typically supply a database_id column, meaning you can run the query from within Master anyway. With that in mind, I can’t think of any real benefit of this technique – it’s almost always a better option to simply upgrade your databases to compatibility 90 or above – we are in 2011, after all.
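
For completeness, that typical DMV-plus-DMF pattern looks something like the following sketch – a query you can run from master (or any database at compatibility level 90 or above) to list the statement text of currently executing requests:

SELECT r.session_id, r.status, r.wait_type, t.text AS statement_text 
FROM sys.dm_exec_requests r 
-- CROSS APPLY passes each request's sql_handle to the sys.dm_exec_sql_text DMF 
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t 
WHERE r.session_id <> @@SPID 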

Thanks again for this month’s edition of T-SQL Tuesday, Matt!

Apr 12, 2011
 

It’s happened. After a long adolescence, SQL Server 2005 has finally grown up and left the nest. Today, April 12, 2011, marks the end-of-life date for mainstream support for SQL Server 2005.

As a parent, Microsoft will no longer have to completely support SQL Server 2005 in every way. MS won’t have to pick it up from Saturday night parties, or drive it to sports on Saturday mornings. (Quiz: Which sport would SQL Server 2005 play? Leave a comment!) While MS may spot SQL 2005 if it needs to catch a taxi home after a big night out, this sort of support will be few and far between.

SQL Server 2005 can now smoke if it chooses to, and if it happens to, it’s up to MS to decide whether to dissuade it, or let it continue. If you do catch your server smoking, I highly recommend putting out the fire immediately.

I think I’ve beaten that analogy quite enough. Extended support will continue until 12 April 2016, but it’s definitely time to consider an upgrade (to SQL Server 2008 R2 or SQL11) as part of your three-year plan. If you’ve still got SQL Server 2000 databases, though, I’d consider those the top priority.

Thanks, SQL Server 2005! It’s been a pleasure, but it’s time to fly the coop. Good luck! (And yes, I do realise the irony of wishing it luck at the same time as telling it to go away and never come back!)

(Full list of End Of Life dates for SQL Server)

Mar 28, 2011
 

A colleague mentioned he received a warning while building indexes online the other day, so I decided to check it out. A quick search in sys.messages for messages LIKE ‘%online%index%’ found the following message:

Warning: Online index operation on table ‘%.*ls’ will proceed but concurrent access to the table may be limited due to residual lock on the table from a previous operation in the same transaction.
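
For reference, the search itself is a one-liner along these lines (a sketch – I’m assuming the English message set, language_id 1033):

SELECT message_id, text 
FROM sys.messages 
WHERE text LIKE '%online%index%' 
  AND language_id = 1033 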

The warning is fairly clear – if you have a transaction open with existing locks, you’ll get this message, letting you know that your online index rebuild might not be as online as you think it will be.

As a test, let’s try the following code on AdventureWorks2008:

BEGIN TRAN
UPDATE Person.Person SET FirstName = 'Kenneth' WHERE BusinessEntityID = 1
-- Rebuild an index on this table
ALTER INDEX IX_Person_LastName_FirstName_MiddleName ON Person.Person 
   REBUILD WITH (ONLINE = ON)
ROLLBACK

In this case, we get the warning message due to the five extra locks taken as part of the UPDATE command. While this index is being rebuilt, other sessions trying to access that row will be blocked. This is a good warning.
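
If you’re curious about those locks, a quick sketch along these lines, run inside the transaction before the ALTER INDEX, will list what the session is holding:

-- Locks held by the current session; the extra locks from the UPDATE trigger the warning 
SELECT resource_type, request_mode, request_status 
FROM sys.dm_tran_locks 
WHERE request_session_id = @@SPID 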

However, what happens if we try to rebuild an index on a different table? Ideally, I’d want the same sort of warning – if I have any locks that could cause blocking, I’d like to know immediately. The following code will test that:

BEGIN TRAN
UPDATE Person.Person SET FirstName = 'Kenneth' WHERE BusinessEntityID = 1
-- Rebuild an index on a completely different table
ALTER INDEX IX_Customer_TerritoryID ON Sales.Customer 
   REBUILD WITH (ONLINE = ON)
ROLLBACK

The result is that we get no warning. Rebuilding an index on the Sales.Customer table has nothing to do with the Person.Person update we performed, so we miss out on the warning.

This goes against what I’d ideally like. In this case, access to the row modified in Person.Person will result in a block until the index rebuild has finished.

This is not much of an issue, though. I can’t think of many situations where I’d want to rebuild an index inside a larger transaction – it’s the type of task that’s more likely to be run in its own narrow, constrained transaction.

Mar 15, 2011
 

Every couple of years, I pull out a particular book that’s been sitting on my bookshelf for the past decade, read it, and am suddenly refilled with a passion for computers and computing science. This book is Code: The Hidden Language of Computer Hardware and Software, by Charles Petzold.

This book is perfect for those who never did an EE or CS degree at university, and want to know exactly how and why computers work. I’m sure a lot of DBAs fall into this category! Petzold takes us on a wonderful journey through how communication is encoded into our lives, focussing particularly on digital codes – binary. Starting with the simplest concepts (communicating simple messages with flashlights and Morse code over short range), new ideas are introduced in simple, logical steps. From the flashlights, electricity is introduced, and a telegraph system is designed.

From Morse code, we are introduced to Braille and to UPC barcodes, all tied together with the theme of binary logic – the information can be coded by a series of ones and zeroes.

The book slowly builds up these concepts in an easy to follow fashion, using telegraph relays to build logic gates, through to a machine that can add two binary numbers, and finally culminates in a design for a theoretical general purpose, programmable computer.

From there, the concept of a transistor is introduced, the Intel 8080 and the Motorola 68000 CPUs are examined in detail, and it’s then a whirlwind of fast logical steps from machine code, to assembly, to higher level languages, operating systems and file systems (with a reasonably detailed look at CP/M). Finally, modern (well, 1999) graphical operating systems are examined, along with how various types of information are stored – graphics formats, sound formats, video, and HTML.

The book doesn’t go into networking protocols (except for a brief couple of paragraphs on modems), but it is easy to extrapolate how computers can talk to one another from the previous concepts.

All along the way, Petzold sprinkles in the history of computing, and the entire story is woven in a very engaging tone. Every time I read this book, I’m reawakened to the beauty and underlying simplicity (underlying – there’s nothing simple about 10 million transistor processors!) of computing.

Highly recommended!

Feb 14, 2011
 

I recently got around to doing exam 70-433 TS: Microsoft SQL Server 2008, Database Development. As I had completed exam 70-451 PRO: Designing Database Solutions and Data Access Using Microsoft SQL Server 2008 while it was in beta two years earlier, this granted me both the MCTS Database Development and the MCITP Database Developer 2008 certifications at the same time.


Next step – upgrade my MCITP Database Administrator from 2005 to 2008.

Jan 18, 2011
 

When you think of a craftsman, a likely image to immediately pop into your head is one of an older man, working with hand tools on a piece of wood.  Each movement he makes is deliberate and precise, and he seems to know intuitively what needs to happen next.  His tools are so familiar to him, and used so effortlessly, that they seem like an extension of his body.

Although database work is a far cry from any form of woodwork, it is still a craft, albeit one that garners far less sympathy at family gatherings than the classic crafts.  As with any craft, the two main things database professionals will talk about are "what have you created or done?" and "which tools do you use?".

This second question is a great one, as you may discover a fantastic tool that you’d never heard of – one that can vastly improve the efficiency of your regular time-consuming tasks, whether by providing necessary information more easily, or by automating tasks you currently do manually.

I frequently get odd looks when I state that I don’t regularly use third-party tools at all (with the exception of backup compression tools). 

What?  No Tools?

The primary reason for this stance is that I work for many different clients, and the majority do not have any tools beyond what is provided with SQL Server, and for various reasons cannot justify acquiring any additional tools.

This is not to say that I am against third-party tools – I definitely do make use of them when available, but I believe that for every task that a tool provides assistance with, a database professional should be able to do without the tool at a pinch.

I liken this to our craftsman’s view of power tools.  There are many situations where power tools will greatly increase the speed and ease of creating some projects in the workshop, but for other situations hand-tools reign supreme.  Perhaps the craftsman is visiting family and is asked to repair a broken chair, or has a simple task to do that doesn’t justify setting up the power tools.

Let’s See Some Examples

As an example, Ola Hallengren’s Index Optimisation scripts (amongst other things) are an excellent, free utility for performing index maintenance more intelligently.  However, at a pinch, a DBA should be able to query the sys.dm_db_index_* family of DMVs to determine which indexes require maintenance, and issue the appropriate ALTER INDEX commands.
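
As a rough sketch of that hand-tool approach (the 30% threshold, index name and table name are purely illustrative):

-- Find fragmented indexes in the current database 
SELECT OBJECT_NAME(ps.object_id) AS table_name, i.name AS index_name, 
       ps.avg_fragmentation_in_percent 
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps 
JOIN sys.indexes i ON i.object_id = ps.object_id AND i.index_id = ps.index_id 
WHERE ps.avg_fragmentation_in_percent > 30 AND ps.index_id > 0 

-- ...and then rebuild (or reorganise) the offenders 
ALTER INDEX IX_SomeIndex ON dbo.SomeTable REBUILD 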

Automatic code-completion is another helpful tool that some people swear by.  As this is a relatively new feature in the SQL Server world (whether as a third-party tool, or the one provided as part of SQL Server 2008’s Management Studio), many DBAs are used to working without it.  In the future, however, we will become used to having code-completion tools available.  It’s still important to know how to quickly query the database structure using sp_help, sp_helptext, sys.objects, INFORMATION_SCHEMA, etc. (Incidentally, I wow at least five people a year by using the Alt-F1 shortcut for sp_help. Simply highlight a table name in a query window in Management Studio, and hit Alt-F1.  Very useful.)
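
For the record, those quick metadata queries look something like this (using AdventureWorks’ Person table as the guinea pig; the procedure name is just a placeholder):

-- Equivalent to highlighting the table name and pressing Alt-F1 
EXEC sp_help 'Person.Person' 

-- View the definition of a procedure, function or view 
EXEC sp_helptext 'dbo.uspSomeProcedure' 

-- Or go straight to the catalog 
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE 
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME = 'Person' 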

There are a number of tools available that can be used to trace expensive queries, and I do enjoy using a tool developed in-house to provide trace control and analysis.  If this is not available, or if it would take too much time to set up, I’m happy to pull out Profiler, script out a focussed trace, collect some data, and then analyse the resulting trace files with fn_trace_gettable().
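
That quick-and-dirty analysis looks something like the sketch below – the file path is made up, and remember that Duration is reported in microseconds from SQL Server 2005 onwards:

-- Load a trace file and pull out the longest-running statements 
SELECT TOP 20 TextData, Duration, CPU, Reads, Writes, StartTime 
FROM sys.fn_trace_gettable('C:\Traces\MyTrace.trc', DEFAULT) 
WHERE EventClass IN (10, 12) -- RPC:Completed, SQL:BatchCompleted 
ORDER BY Duration DESC 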

There are numerous other examples.  Can you read a deadlock graph, or do you need Profiler to draw a picture?  Can you read a blocking chain in sp_who2, or do you need Adam Machanic’s (blog | twitter) Who Is Active?  What if you are called upon to help with a SQL Server 2000 instance, where this is unavailable?
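
On SQL Server 2005 and later, for example, a rough equivalent of reading the blocking chain is a query along these lines (on SQL Server 2000 you’d look at the blocked column in master..sysprocesses instead):

-- Who is blocked, and by whom 
SELECT session_id, blocking_session_id, wait_type, wait_time, wait_resource 
FROM sys.dm_exec_requests 
WHERE blocking_session_id <> 0 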

Regardless of whether you primarily use the base set of tools, or primarily use the "power" tools, it pays to be familiar with both types, and to be able to use each tool in the proper way in the right situation.

Jan 11, 2011
 

Another T-SQL Tuesday has rolled around, and for once I was prepared for it.  This month’s edition, hosted by Jen McCown (blog|twitter), is on the topic of Resolutions.  I spent a week rolling ideas around in my head for this topic, hit upon some excellent angles, and then had fate completely change the direction of my post.

 

I’m not a big believer in "New Year’s Resolutions", mostly because they’re so clichéd and popular opinion is that you’re doomed to fail if you have one.  Instead, I appreciate the holiday period as a time to reflect and set more general life directions, as opposed to the focussed task of "exercise more". 

In terms of SQL Server related goals that I want to accomplish this year, a quick brainstorm came up with the following:

  • I really should blog more often, say once per week.   Something 95% of SQL bloggers say every few months.
  • I really should post a bit more on my favourite SQL-based forums.  Two quality posts a day sounds feasible.  This one mostly comes down to time.
  • Wow, that MCM certification looks awfully interesting, challenging, yet achievable.  Let’s do that!
  • 2011 will be the year where I get my head around recursive CTEs!
  • Can I please, please, please use the MERGE command in production code?
  • Denali, I’m playing with you.  Get ready to be swept off your feet!
  • I’m going to slow my handwriting down, and write legibly and neatly for the first time in 15 years. (Not a technical wish, but still important!)

That third point, the MCM should be enough to keep me busy for the first half of the year at least.  I’ll need to do the MCITP 2008: Upgrade from 2005 DBA exam (70-453), and then the MCTS 2008: Developer exam (70-433) (I’ve already done the 70-451, back when it was in beta).

I have already planned for these in a more serious way.  I changed different things in my life in November, December and January, so there’s nothing particularly significant about January 1.  Do it now – don’t wait until some arbitrary date!

However, it is important to consider the non-technical side of your life, which is something that hit home to me again last night as I received news of my wife’s family in flood-ravaged south-east Queensland.  Don’t be so fixated on SQL Server that you dilute or lose connections with your loved ones.  No matter how warm SQL Server may be (especially when you’re standing next to the server during business hours!), it’s a very poor substitute for family and friends.

Dec 22, 2010
 

In my previous post, I mentioned that the superior error checking and correcting abilities of a solid state disk (SSD) may allow you to trust your database files to a single drive and avoid RAID.  Let’s have a closer look at the issues.

SSDs will, in theory, be more reliable due to their lack of moving parts and lower power and cooling requirements.  Additionally, their superior ability to detect and correct errors (via Hamming codes, allowing double error detection and single error correction) provides more protection than a RAID 5 array.  However, it is still possible for the drive to fail, and this issue must be considered prior to betting the farm on a single SSD.

Firstly, unless you’re happy to roll back to your previous full backup, transaction logs should be mirrored, whether they’re on magnetic hard disks or SSDs, and appropriate off-server log backups should be taken frequently.
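
For completeness, the log backup itself is standard fare – something along these lines (the database name and UNC path are placeholders), scheduled as frequently as your recovery point objective demands:

-- Frequent off-server log backups limit how much work is lost if the log drive fails 
BACKUP LOG MyDatabase 
TO DISK = N'\\BackupServer\SQLBackups\MyDatabase_log.trn' 
WITH INIT, CHECKSUM 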

In the case of using a single SSD for TempDB, what happens if the SSD fails?

  1. If you have another server to fail over to, great.  You’ll have a short outage while the other server picks up the workload, and hopefully you won’t lose any transactions.  If the standby server has no SSDs then performance may be slower, but still acceptable.  If there is no standby server, read on.
  2. You’ll have a SQL Server outage for the amount of time it takes to re-home TempDB to another disk.
  3. You’ll need to source a new home for TempDB.  You may already have space available on another attached drive, or you may need to provision more space from the SAN (if you have one).
  4. If you have no additional space, you’re in deeper trouble – you’ll need more disks, or move your databases to another server.
  5. Once you have the space, you’ll also need to consider performance.  Did you originally use an SSD for TempDB because the workload was so high that regular drives could barely handle it?  Your load might have grown since then, and you may simply not be able to handle the TempDB load without an SSD.
  6. How long does it take to get a new SSD shipped in?  Until then, your system may be down.
  7. If you have a spare SSD sitting on the shelf that you can quickly slot in, why didn’t you just RAID it in the first place, or put it in a secondary server?

In this case, we don’t even care about the contents of the SSD – TempDB will be recreated when we restart SQL Server, and things will be back to normal.  The main issues stem from how quickly you can re-home TempDB, and the restrictions on where you can place it.
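
For reference, re-homing TempDB is just a matter of pointing its files at the new location and restarting SQL Server – a sketch, with the paths obviously being placeholders:

-- Takes effect at the next restart of the SQL Server service 
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = N'D:\SQLData\tempdb.mdf') 
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = N'D:\SQLLogs\templog.ldf') 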

If you have TempDB on dual SSDs in a RAID configuration, and one fails, then you’ll continue running with no outage (although you may need to schedule one to replace the faulty SSD).  Of course, this is a much more expensive option, and it’s possible that neither SSD will ever fail – but that’s the insurance game.  You pay for peace of mind.

Data files are a similar story, but there are some differences.  While the loss of TempDB will guarantee a full SQL Server outage, loss of a single data file will only result in that database having reduced availability.  If the data is critical, and your system cannot run without it, then you will have a problem until you failover to a standby, or restore the data elsewhere.  If you can survive without this data, and you can regenerate or restore it later, there’s less of an issue.

The summary is that you need to consider answers to the following questions for each of your SSDs, and then make an appropriate decision.

  1. If this SSD fails, what are the effects in terms of server availability, data loss, and the amount of time to recover?  What is the action plan, with estimated times to get this back online?
  2. During this downtime, what effect will this have on the business?  Will the entire organisation grind to a halt, or will a non-critical data warehouse be unavailable for 24 hours?  What is the cost of this (lack of productivity, lost sales, your job)?
  3. What is the additional cost of providing redundancy to protect against this failure, whether as an additional SSD in a RAID configuration, or a standby server (which may or may not have SSDs)?

Non-critical data on a single SSD can replace a RAID-5 array, but you need to thoroughly understand the risks first, and have a solid contingency plan in place.

Nov 17, 2010
 

I was at the Melbourne SQL Server user group last night, listening to Greg Linwood (of MyDBA) present on solid state disks (SSDs).  It was a very informative session, with Greg going into detail on how SSDs work, and the specific things you have to keep in mind when using them, such as providing sufficient OS memory to manage the SSD’s allocations, and over-provisioning space because updates are written to a new location and the old space is not freed synchronously.

The first of the three main points of the session was that SQL Server is a read-oriented system, in that to perform any write operation, the rows to be updated or deleted must first be located and read in from disk.  In particular, indexes are the objects most likely to benefit from being on SSDs, as the access pattern is completely random – an insert into a clustered index may go predictably to the end via an IDENTITY value, but the non-clustered indexes will experience inserts into the middle of the B-tree structure.

The second main point was that enterprise-level SSDs, such as Fusion-io’s offerings, provide a large reduction in space requirements (the devices can be much more densely packed) and cooling requirements (there are no moving parts, so things don’t get as hot), and consequently lower power requirements.  There is also less of a need to RAID SSDs together – individual hard disks fail frequently due to their mechanical nature, but SSDs are less likely to experience this thanks to their more robust construction.  Of course, the entire SSD could fail, but this is akin to losing your RAID controller.  If your transaction logs are mirrored on a hard disk array (which is ideally suited to sequential write operations), then the SSD’s error checking and correction capabilities (Hamming codes, so single error correction and double error detection, which is superior to RAID-5’s single error detection) may allow you to trust your data files to a single device.  I should point out that this is not a recommendation – it is up to you to perform your own risk assessment and contingency planning.

The third main point was cost: hard disks beat SSDs when comparing single units, but the superior performance of SSDs, coupled with the lower space, cooling and power requirements, makes SSDs significantly cheaper overall.  Compare the TPC’s current TPC-C benchmarks – the best performing hard disk solution comes in at 6 million transactions per minute with $20 million in storage costs, while the best performing SSD solution comes in at 10 million transactions per minute with $3 million in storage costs. (It’s worth noting that both of these are DB2 systems, not SQL Server, but the RDBMS isn’t as important here.)

SSDs are definitely the next big thing, and are now considered stable and mainstream enough to improve what can be considered the biggest bottleneck in database systems today – disk I/O.  Greg presented a captivating session and definitely cleared up a lot of misconceptions around SSD technology and their uses with database systems.