Category: Articles

Easily removing repeated lines using Excel

Imagine that you’ve just used SQL Server Management Studio to generate a script, and it’s done it’s job wonderfully, except for a lot of unsightly GO and blank lines:

GRANT SELECT ON [Person].[Address] TO [SomeUser];
GO

GRANT INSERT ON [Person].[Address] TO [SomeUser];
GO

GRANT UPDATE ON [Person].[Address] TO [SomeUser];
GO

GRANT DELETE ON [Person].[Address] TO [SomeUser];
GO

Although technically correct, this may offend your sense of aesthetics. If there’s only a couple of rows (as in the excerpt above, you can manually remove the unsightly GOs and blank lines, but this is a bigger ask when you have more than about 20 commands.

The simplest method I’ve come across is to use one of the DBA’s most trusted tools – Excel. Although there are other ways of doing this sort of operation, Excel is installed on almost every Windows machine, and this is a quick operation.

First, copy the entire set of commands, and paste into Excel as Column B, and in column A, add a number for each row (essentially an IDENTITY column). Remember that you can just fill out 1 and 2, select both of these, and drag down to the bottom:

image

This number column is used so that we know which order each row was in originally, because we’re about to sort the data. Select columns A and B for all rows, and sort by column B:

image

Next, select the rows you don’t want, and delete them.

image

Now, select the remaining data, and sort again, this time on column A, to return to the original sort order (minus the offending rows):

image

Copy the commands back into Management Studio, and you’re done.

GRANT SELECT ON [Person].[Address] TO [SomeUser];
GRANT INSERT ON [Person].[Address] TO [SomeUser];
GRANT UPDATE ON [Person].[Address] TO [SomeUser];
GRANT DELETE ON [Person].[Address] TO [SomeUser];

T-SQL Tuesday #17–APPLY in Compatibility 80

Welcome to T-SQL Tuesday for April 12, 2011. This month is generously hosted by Matt Velic (Blog | Twitter), who poses the topic of APPLY, one of the many excellent additions to the T-SQL language in SQL Server 2005.

If you read other T-SQL Tuesday posts from today, you’ll get some excellent tips on how to use APPLY, and a list of the excellent things you can do with it. I’m going to go in a slightly different direction, and examine what your options are when using APPLY with database compatibility level 80.

Books Online states that the database must be at least compatibility level 90 to use APPLY.

It turns out that you can create a query that uses APPLY, and have it run in compatibility level 80. Consider the following setup:

CREATE DATABASE Compat80
GO
ALTER DATABASE Compat80 SET COMPATIBILITY_LEVEL=80
GO

USE Compat80
GO
SET NOCOUNT ON
CREATE TABLE A (i int, name varchar(20))
CREATE TABLE B (i int, amount decimal(15,3))
INSERT INTO A (i, name) VALUES (1, 'Jack')
INSERT INTO A (i, name) VALUES (2, 'Bob')
INSERT INTO A (i, name) VALUES (3, 'Sally')
INSERT INTO B (i, amount) VALUES (1, 25.0)
INSERT INTO B (i, amount) VALUES (1, 50.5)
INSERT INTO B (i, amount) VALUES (2, 16.0)
INSERT INTO B (i, amount) VALUES (3, 110.0)
GO 

Can we use an APPLY in Compatibility 80?

We’ve created a database in compatibility level 80, and created two tables, A and B. Let’s try a query with CROSS APPLY:

SELECT A.Name, dv.Amount FROM A
CROSS APPLY (SELECT i, SUM(Amount) Amount
             FROM B WHERE B.i = A.i GROUP BY i) dv 

image

There’s not much to say here, except that I proved Books Online wrong – I can do a CROSS APPLY in Compatibility 80. I don’t know the exact reason why I can do this, but it’s likely to be the simplistic nature of the query (which makes it less interesting). I’ll do a quick little victory dance, and we’ll move on.

A more useful use of APPLY

Where APPLY really shines is when functions are used. Let’s go ahead and create a simple function to test with:

CREATE FUNCTION dbo.fn_getB(@i AS int)
RETURNS @result TABLE
(
      i   int NOT NULL
    , amount decimal(15,3)
)
AS BEGIN
        INSERT INTO @result
        SELECT i, amount FROM B WHERE i = @i
        RETURN
END
GO

The logic for this function is slightly different than the previous, but it only serves to make it more interesting. Let’s go ahead and use this function in an APPLY, remembering that we’re still in Compatibility 80:

SELECT * FROM A
CROSS APPLY dbo.fn_getB(A.i) dv 

Msg 102, Level 15, State 1, Line 2

Incorrect syntax near ‘.’.

That’s disappointing, and rather cryptic. Let’s try again after switching to Compatibility 90:

ALTER DATABASE Compat80 SET COMPATIBILITY_LEVEL=90
GO
SELECT * FROM A
CROSS APPLY dbo.fn_getB(A.i) dv
GO 

image

Success!

Getting around this restriction

We’ve seen that APPLY using functions is a handy tool, but can we still use it in databases that are in Compatibility 80? The following is a workaround, but it’s a bit of a hack. I just like thinking outside the box.

The limitation here is that we cannot let a Compatibility 80 database execute the query. We can, however, run the query from within a different database, such as Master (ignoring permission issues) by using sp_executesql.

EXEC sp_executesql N'USE Master;
        SELECT * FROM Compat80.dbo.A
        CROSS APPLY Compat80.dbo.fn_getB(A.i) dv '
GO

The main reason I’d want to use APPLY is to use the DMFs with the DMVs provided by SQL Server 2005, such as sys.dm_exec_*. However, these DMVs typically supply a database_id column, meaning you can run the query from within Master anyway. With that in mind, I can’t think of any real benefit of this technique – it’s almost always a better option to simply upgrade your databases to compatibility 90 or above – we are in 2011, after all.

Thanks again for this month’s edition of T-SQL Tuesday, Matt!

What’s the difference between a filter on the ON and the WHERE clause?

The question is, what is the difference between having a filter on the ON clause, or in the WHERE clause?

Consider the following queries:

USE TempDB
GO
CREATE TABLE A (i int, name varchar(20))
INSERT INTO A (i, name) VALUES (1, 'Jack'), (2, 'Ryan')
    , (3, 'Simon'), (4, 'Sandra'), (5, 'Daryl')
GO 

CREATE TABLE B (i int, amount decimal(9,3))
INSERT INTO B (i, amount) VALUES (1, 100.25), (2, 10000)
    , (1, 0.45), (4, 234.23) 

-- WHERE
SELECT * FROM A
INNER JOIN B ON A.i = B.i
WHERE A.Name NOT LIKE 'S%' 

-- ON
SELECT * FROM A
INNER JOIN B ON A.i = B.i AND A.Name NOT LIKE 'S%' 

Consider these two queries are the bottom. The first uses a “A.Name NOT LIKE ‘S%’” filter in the WHERE clause, whereas the second uses the same filter in the ON clause. The results are identical:

image

Where this gets interesting is when we use an OUTER JOIN. For example, consider the following two queries, which are identical to the previous two, but using a LEFT OUTER JOIN:

-- Query A - WHERE filter 
SELECT * FROM A
LEFT JOIN B ON A.i = B.i
WHERE A.Name NOT LIKE 'S%' 

-- Query B - ON filter 
SELECT * FROM A
LEFT JOIN B ON A.i = B.i AND A.Name NOT LIKE 'S%' 

And the results:

image

What’s happened here? The only difference was the placement of the A.Name clause, but the second query (the ON) has added Simon and Sandra’s rows.

What’s happened is due to the ordering of the logical query processing steps. Logically, all JOINs are performed as INNER JOINs using the ON filters, and then, as a subsequent step, any OUTER JOIN rows are added back to the necessary side. After all JOINs are complete is the WHERE filter processed.

This means that in Query A (WHERE), the inner join between A and B was completed, then rows 3 (Simon), and 5 (Daryl) were added back in. Then the WHERE was applied, and Simon and Sandra were removed as their name begins with S.

In Query B (ON), A and B were joined together, but any rows in A with a name LIKE ‘S%’ were not joined, as the ON filter ignored them during the initial JOIN phase. The OUTER JOIN phase then added in all rows from the LEFT that weren’t joined, and so Simon and Sandra were added back in.

Is this a problem? I don’t believe it’s much of an issue, as I tend to think of the WHERE clauses as being distinct from the ON clauses. Just remember to keep your join filters (ON) and your row excluders (WHERE) separate, and understand the difference when necessary.

If you ran the sample code, don’t forget to clean up:

DROP TABLE A
DROP TABLE B 

Can We Reproduce Columnstore Aggregations in SQL 2008? (T-SQL Tuesday #016)

The subject of T-SQL Tuesday #016, kindly hosted by Jes Borland (blog | twitter), is Aggregation. Although it’s a fine topic, I could not figure out what I could write about aggregation that would capture my interest, and more importantly, dear reader, yours.

I thought about the topic ceaselessly while doing dishes, feeding children, and cleaning up the house. I considered aggregate functions, and optional grouping parameters. I twisted the topic inside and out, trying to figure out an interesting angle. I considered telling a story about the Tuples vs the Aggregates on their way to Query Station (and spent an embarrassingly long time on that train of thought). Finally, I went out for a run, thought of a topic pretty quickly and started turning the idea over in my mind for the next 7 km.

Columnstore Indexes

A columnstore index is a new feature of Denali, whereby an index is stored vertically – one column per page.  Here is the whitepaper download – highly recommended, and only 5 pages of text. Imagine a row of a typical covering index (in this case, AdventureWork’s Person.Person table, and I’ve covered every column):

image

Every row stored in this particular index is stored on the same page. As each row is quite wide, only 5 rows will fit per 8 KB page.

A columnstore index, however, stores each column on a separate set of pages. I am unsure of the ordering of the rows* but each page is compressed. As the column is likely to contain similar data, this means that each page can have excellent compression ratios – a factor of 4-15 is quoted in the whitepaper – and thousands of rows can fit on a single page.

* Due to a lack of internals documentation so far. Are they ordered by the column, or by the clustering key? I can see arguments for either way, but haven’t yet dived deep enough to decide one or the other.

This allows for fantastic performance gains when performing aggregates, as fewer pages need to be read to service the query, and only the columns actually required by the query need to be accessed.

Howerver, due to the fact that columnstore indexes are read only, they really only have applications in data warehouses.

Can we reproduce columnstore indexes in SQL Server 2008?

The question of whether it is possible to reproduce columnstore indexes in SQL Server 2008 popped into my mind, and the short answer is “No, not a chance”. Sorry.

Let’s look at the properties of a columnstore index.

Read Only – SQL Server 2008 can certainly build index structures on read-only data. And, as it’s read only, separate aggregation steps can be created in 2008 to summarise data ahead of time. However, one of columnstore’s selling points is to remove the need to aggregate. Columnstore gives you a massive performance benefit with little effort.

Highly Compressed – SQL Server 2008 can compress indexes. However, to get highly compressible indexes, similar to columnstore’s you’ll only want a single column, which makes the index less useful, and forces you to join frequently.

Only select required columns – A query accessing a columnstore index will only access those columns that are needed by the query. To get the same sort of functionality in 2008, you would need to either create a covering index for each query you want (nothing like columnstore!), or join multiple indexes together (resulting in a large amount of logical reads).

Different table structure – Denali is able to use a completely different page structure, particularly as this data is read-only. This is not possible in SQL Server 2008, and still be usable with T-SQL.

No code changes – Denali’s implementation allows us to create a single columnstore index on all columns in the table, and we don’t need to modify any queries to use these columnstore indexes. It’s completely seamless.

I tried to create an example where we could create similar compressed indexes in 2008, but I was never able to merge multiple indexes together to query the data efficiently. Denali has specific optimisations available for columnstore indexes, and these cannot be reproduced. About the best we can do is to create a read-only compressed covering index, which is nowhere  near impressive these days.

After a few hours considering this, my conclusion is that you are better off optimising your data warehouse for those queries that you know are going to come, and ad-hoc queries will need to continue to run slowly, at least until you’ve upgraded to Denali. Frequently run queries will benefit from extra customisation (pre-aggregation, specialised indexes with compression), but there no chance of a similar payoff inside the database engine.

Not too long to wait now, and you’ll be able to take advantage of columnstore indexes!

Out of Identity Values – Are There Performance Considerations for Re-seeding?

It’s finally happened. The table you created five years ago with an clustered integer IDENTITY column is about to hit 231 values (2,147,483,648). Five years ago, you were inexperienced with SQL Server, and had no idea that this table would grow this large – or that you’d be around to see it!

First, you reseed the IDENTITY value, back to -231.

DBCC CHECKIDENT (LargeTable, RESEED, -2147483647)

This should allow you to get another couple of years out of the table, and you will have enough time to implement a better table structure.

A flash of inspiration

A few hours later, you’re relaxing after work, quietly pleased about the bullet you’ve dodged today. A thought suddenly strikes you like a lightning bolt. You recall that inserts into a clustered index work best when inserting into an ever increasing value, at the end of the index. Can you  have made a massive error in judgement in forcing all inserts to always happen just before identity value 1?

While you toss and turn about this all night, feel free to consider the issue before reading on. What do you think will happen? Will you have an enormous number of page splits are you constantly run into identity 1? This is obviously a frequently inserted table, given that it has had 2 billion rows added in the past five years (1.1 million per day).

Testing the hypothesis

Here’s a nice block of code to test the problem. We create a table with 1000 rows approximately 1000 bytes long, to fit 8 rows per page.

CREATE DATABASE IdentityTest
GO
USE IdentityTest
GO

-- Create a test table, just under 1000 bytes per row. We can fit 8 rows per page.
CREATE TABLE T (i INT IDENTITY PRIMARY KEY, j char(960))
GO

SET NOCOUNT ON

-- Backup the database so we have a consistent log to examine
ALTER DATABASE IdentityTest SET RECOVERY FULL
BACKUP DATABASE IdentityTest TO DISK = 'C:\Work\IdentityTest_Full.bak' WITH INIT
GO

-- Seed the table with 1000 rows with identity ranges from 1-1000
INSERT INTO T (j) VALUES ('AAAAAAA')
GO 1000

-- Reseed back to a large negative number
DBCC CHECKIDENT (T, RESEED, -2147483647)
GO

-- Add another 100 rows with negative identities counting towards positive
INSERT INTO T (j) VALUES ('BBBBBBB')
GO 100

-- Examine the transaction log to see page split operations.
-- Note the "CheckIdent" Transaction Name around row 1136 where we reseeded.
SELECT [Transaction Name], * FROM fn_dblog(null, null)
WHERE [Transaction Name] IS NOT NULL
ORDER BY [Current LSN]
GO

-- Cleanup
USE master
GO
DROP DATABASE IdentityTest
GO

We can now clearly see the page split pattern in the transaction log. As expected, during inserts prior to the re-seeding, we have a pattern of 8 inserts before a Page Split occurs, signalling that no more rows will fit onto this page.

image

Analysis of the results

After we run the CheckIdent operation (row 1136), we can see that the next insert operation causes a page split, because we are trying to insert onto the first page of the clustered index. This already has 8 rows on it (IDs 1-8), so ID -2,147,483,646 won’t fit.  The new page ends up with the new row, plus IDs 1, 2, and 3. This leaves room for four more rows to be inserted.

As expected, the fifth insert after the CheckIdent (row 1142) causes another page split. This time, the four negative ID rows are moved to one page, and rows 1,2,3 are on the other page.

So far, the pattern seems bad. We’ve had two splits after inserting five rows, and we get another split on the ninth insert (row 1147). In this case, however, we have not bumped up against the positive IDs. We currently have a single page holding the previous 8 negative ID rows, and the ninth won’t fit at the end, so a new page is allocated with this single row.

Now we’re into a stable state. Every 8 inserts, a new page is allocated, and the inserts never again touch the pages containing the first positive rows.

Conclusion

The results of this test are pretty clear – the page splitting algorithms will quickly re-stabilise after re-seeding. You’ll no longer be inserting into the end of the clustered index B-tree, but you will be inserting into a consistent point, which is separated from the start of the values, and so you should not see any performance degradation.

You were stressing for nothing. Relax!

How I Study for Microsoft Certification Exams

I’ve done five SQL Server Microsoft Certification exams in the past few years, passing each on the first try, and only using an Exam Preparation book for the first exam. I believe there are two methods of studying for an exam – either cramming and hoping you retain enough knowledge during the exam, or creating a targeted list of topics, and living and breathing the subject until you know it intimately.

This article will talk about the second method, which I believe results in a much deeper understanding of the exam contents.

Daily Practice

Use SQL Server daily. This is the most important step. If you are wanting to take an exam on SQL Server, you should be using it daily (alright, you can have weekends off!). The point of this is to be continually using what you’ve been learning, reinforcing it, and completely internalising it. If you aren’t using SQL Server, why get certified in it? 

I recommend spending 30-60 minutes each day using SQL Server in addition to your day job. Create a database in your development environment, and practice the techniques and commands you are learning, and then utilise them in production where appropriate (and approved). Even if you’re doing routine work and have no current need to use the new features that will be on the exam, you should still aim to practice each day. This study period can be at any time of the day – whatever works for you. I recommend either during lunch, or before starting work in the morning. Personally, evenings don’t work for me, as I’m less alert at the end of the day and can’t get started until 9:30pm due to family commitments.

If you don’t have the correct environment at work (for example, you’re studying for SQL Server 2008 certifications, but you only use 2000 at work), it may be harder to get the necessary practice. In this case, I recommend discussing the options with your boss. It’s in his interests to get you skilled in the latest version of SQL Server, due to the amount of technical debt growing with older versions of SQL Server. If this is not possible, buy your own copy of SQL Server 2008 Developer Edition, and install it on your own computer (it runs fine on a netbook – just ask Noel McKinney!)

What’s Going To Be Examined?

How do you know which topics will be on the exam? The title of the exam is often not a very good indicator, and goes into no detail, but the exact requirements of the exam are freely available on the Microsoft Learning web site. For example, scan through the list of requirements for the 70-433 exam, available on the “Skills Measured” tab.

You’ll see that there are seven major sections, with four or five dot-points against each.  This tells you exactly which topics may be included, and as long as you have a good handle on each of these, you’re good to go!

I like to make a copy of this list, and cut out the topics that I already know well, leaving behind a list of subjects that I believe I need to work on. I then take one of these subjects, and throw myself into learning the topic.

This can be from:

  • reading Books Online
  • searching for online articles or blog posts on the topic
  • using one of the Exam Preparation guides
  • watching videos (e.g. from www.sqlshare.com)

Reproduce to Reinforce

When reading material with examples, reproduce each example in your own environment. Type out the code each time, run it, and check that it works.

After a period of time (an hour, a day), reproduce the example without looking at the original code. If you get stuck, use Books Online for syntax assistance.

This method of following the examples has two effects. Firstly, you are more likely to retain the syntax if you’re actively typing it, rather than passively copying and pasting. Secondly, by reproducing the example with minimal assistance, you’re proving to yourself that you now know the material.

Apply Your Knowledge

Finally, consider the pearls of wisdom that you’re learning, and see if you can apply it back to your own situation at work. If you’re learning about mirroring, and you currently have a cluster at work, consider the benefits of your current cluster, and contrast these with the benefits of switching to mirroring. You don’t actually have to change to a mirror, but consider how things would be different if you did. How does your DR plan change?

If you’re studying performance, and you don’t have a performance baseline for your servers, look into setting one up, and start to monitor performance more closely.

If you have a colleague available (preferably one that already knows this material) take them out for coffee and take them through your reasoning. The act of explaining your thoughts will solidify the concepts in your head, and they can suggest things you might have missed.

If no colleagues are available, you can throw your ideas to the Internet. Find a SQL Server forum, search for related threads, and if none are found, start your own. Or, utilise the #sqlhelp tag on Twitter.

Rinse and Repeat

Once you’ve completed the above steps for a specific topic, go back and choose another topic. Once you’ve worked your way through the list, you should be ready for the exam. Good luck, and if these tips are helpful, please let me know!

A Craftsman’s Tools

imageWhen you think of a craftsman, a likely image to immediately pop into your head would be one of an older man, working with hand tools on a piece of wood.  Each movement he makes are deliberate and precise, and he seems to know intuitively what needs to happen next.  His tools are so familiar to him, and used so effortlessly that they seem like an extension of his body.

Although database work is a far cry from any form of woodwork, it is still a craft, albeit one that is difficult to garner as much sympathy at family gatherings as the classic crafts.  As a craft, one of the two main things database professionals will talk about is "what have you created or done?", and "which tools do you use?".

This second question is a great question, as you may discover a fantastic tool that you’d never heard of, and which can provide vast improvements in the efficiency of your regular time-consuming tasks, whether by providing necessary information more easily, or by automating tasks you currently do manually.

I frequently get odd looks when I state that I don’t regularly use third-party tools at all (with the exception of backup compression tools). 

What?  No Tools?

The primary reason for this stance is that I work for many different clients, and the majority do not have any tools over what is provided with SQL Server, and for various reasons, cannot justify acquiring any additional tools.

This is not to say that I am against third-party tools – I definitely do make use of them when available, but I believe that for every task that a tool provides assistance with, a database professional should be able to do without the tool at a pinch.

I liken this to our craftsman’s view of power tools.  There are many situations where power tools will greatly increase the speed and ease of creating some projects in the workshop, but for other situations hand-tools reign supreme.  Perhaps the craftsman is visiting family and is asked to repair a broken chair, or has a simple task to do that doesn’t justify setting up the power tools.

Let’s See Some Examples

As an example, Ola Hallengren’s excellent Index Optimisation scripts (amongst other things) are an excellent and free utility for more intelligently performing index maintenance.  However, at a pinch, a DBA should be able to query the sys.dm_db_index_* family of DMVs to determine which indexes require maintenance, and issue the appropriate ALTER INDEX command.

Automatic code-completion is another helpful tool that some people swear by.  As this is a relatively new feature in the SQL Server world (whether a third-party tool, or the one provided as part of SQL Server 2008’s Management Studio), many DBAs are used to not using this.  In the future, however, we will become used to having code completion tools available.  It’s still important to know how to quickly query the database structure using sp_help, sp_helptext, sys.objects, INFORMATION_SCHEMA, etc. (Incidentally, I wow at least 5 people a year by using the Alt-F1 shortcut for sp_help. Simply highlight a table name in a query window in Management Studio, and hit Alt-F1.  Very useful.)

There are a number of tools available that can be used to trace expensive queries, and I do enjoy using a tool developed in-house to provide trace control and analysis.  If this is not available, or if it would take too much time to set up, I’m happy to pull out Profiler, script out a focussed trace, collect some data, and then analyse the resulting trace files with fn_trace_gettable().

There are numerous other examples.  Can you read a deadlock graph, or do you need Profiler to draw a picture?  Can you read a blocking chain in sp_who2, or do you need Adam Machanic’s (blog | twitter) Who Is Active?  What if you are called upon to help with a SQL Server 2000 instance, where this is unavailable?

Regardless of whether you primarily use the base set of tools, or primarily use the "power" tools, it pays to be familiar with both types, and to be able to use each tool in the proper way in the right situation.

Solid Foundations

When a building is designed, one of the main requirements is to have an appropriate foundation, and a larger and stronger building requires a larger and stronger foundation.  A one-storey house, for example, requires a much simpler foundation than an 85-storey office building.  The deeper the foundation is, the higher the building can be, and the more resistant it is to outside forces, such as strong winds.  The foundations are hidden from the casual glance, but are absolutely crucial to the success of the building.

Careers, particularly in SQL Server administration, are much the same as a building.  You must first create a solid foundation of skills, and, like a building’s foundation, these may not be visible.  This will include both technical knowledge and "soft"skills, such as:

    · The Windows operating system (including monitoring)

    · File system knowledge

    · Knowledge of networking protocols

    · Scripting basics for automation purposes (e.g., Powershell)

    · Server architecture

    · Storage technologies and I/O concepts

    · Basic Windows/Active Directory security concepts

    · Interpersonal relationships and communication

    · Knowledge of table structure and basic SQL syntax

    · Data backups, restores, and recoverability

    · SQL Server security

(I hesitate to put SQL Server Internals in here as they are a foundational skill for more in-depth SQL Server work, but at a much deeper level, and requires the other foundations first.)

You’ll notice that there is not a lot on that list that is specific to SQL Server.  At the heart of it, a database administrator is a specialised system administrator, and requires similar base skills as a Windows sys admin, with the addition some database knowledge.  At the least, a DBA needs to be know how to configure security, query and change data, and provide a measure of data protection in the forms of backups.

The analogy with a building breaks down a little when you realise that, unlike building construction, you don’t have to complete your foundations prior to working on the "visible" parts of your career.  This is a good thing, though.  You can get started with a DBA career without knowing much about Windows Server, and your limited knowledge will be sufficient to keep you from blowing over during normal weather, but will be inadequate if you attempt to configure a clustered SQL Server deployment.

This holiday period is a good time to reflect.  If you made any New Years Resolutions, after a week they’re either sticking or they’re shot, so you can reflect now without having to worry about the stigma of frequently broken New Years Resolutions hanging over your head.  What areas of your foundations are a little shaky, and could use some attention?  Which SQL Server features would you like to focus on this year, but would require an improvement in the foundations before you can really understand it?  As an example, if you want to improve your backup speeds, you may need to improve your knowledge about I/O throughput to your SAN before you can confidently use SQL Server’s backup performance enhancements (compression, striped backup files, etc).

Personally, I need to improve my knowledge of SANs, and more up to speed with the features that they can bring to the table, and the basics how each of these features work – enough to understand the pros and cons of each feature.

So, get to it. Build those foundations stronger and become a rock!

Using RAISERROR for Progress Messages

As mentioned in my last post, it is possible to use RAISERROR WITH NOWAIT in order to immediately send a message back to the client.  This is useful for long, procedural (i.e., not set-based) stored procedures that loop over many different rows.

Consider the following stored procedure:

CREATE PROCEDURE dbo.InfoMsgTest
AS
    DECLARE @i int;
    SET @i = 1;
    WHILE @i < 100
    BEGIN
        RAISERROR('%d', 0, 1, @i) WITH NOWAIT;
        -- Do some processing!
        WAITFOR DELAY '00:00:01';
        SET @i = @i + 1;
    END
GO

This procedure is a simple loop that counts to 100.  Each time around the loop, a RAISERROR command is executed, passing out the value of @i.  Any message at all could be passed – you could include how many rows have been processed, how many to go, and what the primary key is of the current row.

On the client, consider the following C# console application.  All error handling has been removed, and I haven’t written any .NET code in two years, so your forgiveness is appreciated!

using System;
using System.Data.SqlClient;

namespace InfoMessages
{
    class Program
    {
        static void Main(string[] args)
        {
            SqlConnection conn = new SqlConnection(
                "Data Source=(local);Initial Catalog=AdventureWorks;"
                    + "Integrated Security=SSPI;");
            conn.InfoMessage +=
                new SqlInfoMessageEventHandler(InfoMessage);
            conn.Open();
            SqlCommand cmd = new SqlCommand("exec dbo.InfoMsgTest", conn);
            cmd.CommandTimeout = 120;
            Console.WriteLine("Processing starting.");
            cmd.ExecuteReader();
            conn.Close();
            Console.WriteLine("Processing complete.");
        }

        private static void InfoMessage (object sender,
            SqlInfoMessageEventArgs e)
        {
            Console.WriteLine("Percent completed: " + e.Message + "%");
        }
    }
}

Note that it is vital to use a cmd.ExecuteReader().  cmd.ExecuteNonQuery() will not fire the InfoMessage handler. 

And the output:

image

There you have it!  A GUI application shouldn’t be too much harder.  Little things like this can make the difference between having a responsive application that informs the user as to what is happening, versus a black box that appears to hang for 30 seconds while the stored procedure is executed.

PRINT vs RAISERROR

SQL Server provides two primary ways of communicating data to the client – Result Sets and Messages.  Typically, a client application will respond to Result Sets, and any error messages that are raised by SQL Server with a severity higher than 10.  For error messages with a severity of 10 or less, the .NET event SQLConnection.InfoMessasge can be used to return information during query processing.

In Management Studio, the difference between a Message and and Error is that the Error is flagged in red on the Messages result panel and may trigger rollbacks or break connections, depending on the severity of the error.

PRINT

One use of communicating data back to the client is for stored procedures to let the user know where they are up to.  While this could be used for production code, it is usually used as a poor man’s debugger.  By sprinkling PRINT “Currently at point x” statements through your stored procedure, you can get an inkling of where the processing is up to.

However, PRINT has a noticeable drawback – the results are not returned immediately.  Instead, anything sent to PRINT will be buffered, and not released until the buffer is full, or the query completes.  This buffer is around 8KB in size.

“No problem!” I hear you cry. “I’ll just pad my PRINT message out to be 8KB!”  Nice try, but unfortunately, the PRINT statement will trim to varchar(8000) or nvarchar(4000), which isn’t enough.  For example:

PRINT 'A' + REPLICATE(' ', 8000)
PRINT 'B' + REPLICATE(' ', 124)
WAITFOR DELAY '00:00:05'
PRINT 'C'

In this example, we’re using REPLICATE to try to pad out the PRINT’s message, but we need two PRINT statements to get anything back immediately.  By running the example, and flicking to the Messages screen in Management Studio, you can see if A is being returned before or after the WAITFOR DELAY statement.  In my tests, the 124 on the B line is not a static value – it was 134 for a different server. 

So, two PRINT messages does not really seem like an acceptable solution.

RAISERROR

Enter RAISERROR. While the RAISERROR syntax is slightly more complicated, it’s also a lot more powerful (although the misspelling is quite annoying).

RAISERROR ('Message', 0, 1, ..., ...) WITH NOWAIT

The first parameter is simply a textual description of the error/message.  Next (0) is the Severity level.  If this value is 10 or less, it will be counted as a Message, and not as an Error.  The 1 indicates the State of the message – for a message, you’ll generally keep this at 1. After the State, you can list multiple parameters that will be inserted into the first parameter – more on this shortly.

 

Example 1 shows two methods of RAISERROR, one where the text of the message is stored in a variable, and one where it is included in the RAISERROR command.  This simply returns “Currently at position 56” in both instances.  Note the WITH NOWAIT.  This tells SQL Server to send the message back to the client immediately, effectively avoiding the problems PRINT has.

-- Example 1
DECLARE @msg nvarchar(200) = 'Currently at position %d.'
RAISERROR (@msg, 0, 1, 56) WITH NOWAIT
RAISERROR ('Currently at position %d.', 0, 1, 56) WITH NOWAIT

Note that the equivalent PRINT statement would be:

PRINT 'Currently at position ' + CONVERT(varchar(10), 124) + '.'

 

Example 2 shows how easy it is to output a text value.  This is useful for displaying the current value of the a loop.

-- Example 2
DECLARE @somevalue varchar(200) = 'Melbourne'
DECLARE @msg nvarchar(200) = '@somevalue is currently %s.'
RAISERROR (@msg, 0, 1, @somevalue) WITH NOWAIT

 

Finally, Example 3 shows how you can combine multiple values in your output.

-- Example 3
DECLARE @somevalue varchar(200) = 'Melbourne'
DECLARE @msg nvarchar(200) = '@somevalue is currently "%s" at position %d.'
RAISERROR (@msg, 0, 1, @somevalue, 124) WITH NOWAIT

 

Monitoring

Another benefit of RAISERROR over PRINT is that it is much easier to trace  RAISERROR in Profiler.  Simply capture “User Error Message” events for Error 50000, and you’ll get the messages.  Of course, you can always filter on the severity or the SPID, or any other filter that is appropriate.

image

So, there you have it!  RAISERROR is a much more sophisticated method of returning status messages to the client that using PRINT.

WordPress Themes