Aug 15 2007

Remote Desktop and Macs came up in conversation with Phill last week, and I finally got around to installing the beta, available from Microsoft’s site. It looks like it will expire at the end of March, 2008, but until then, it seems to work perfectly!

No more will I need to use VNC to remotely control my Windows desktop machine. Definitely worth checking out.

Aug 09 2007

An interesting post popped up on the SQLDownUnder mailing list today, concerning the error message “Cannot sort a row of size 8365, which is greater than the allowable maximum row of 8094” when running a query that joins multiple tables together under SQL Server 2000 (2005 has “Row Overflow” functionality, where long rows are split across multiple pages).

In this case, it was a DISTINCT clause causing the results to be sorted, but it could happen to any recordset that joins two tables with large row sizes and then sorts on columns from both. If the recordset is sorted on columns from just one table, SQL Server will usually elect to sort that table first, then perform a nested loop join onto the other table.

So, what do you do if you want to sort on two columns, one from each table? For example, imagine we had Books and BookReviews, and we wanted to get a list of all Books and Reviews for a particular author, ordered by the number of stars given to the review, and the book’s title.

CREATE TABLE Book (
    ISBN CHAR(15),
    Title VARCHAR(100),
    AuthorID INT,
    Description VARCHAR(7900),
    PRIMARY KEY (ISBN)
)
CREATE TABLE BookReview (
    ReviewID INT IDENTITY,
    ISBN CHAR(15),
    Review VARCHAR(7900),
    NumStars INT,
    ReviewDate DATETIME,
    PRIMARY KEY (ReviewID)
)
(Design for illustration purposes only; in reality a TEXT field would be better than limiting reviews to 7900 characters.)

SELECT b.ISBN, b.Title, b.Description, br.ReviewDate, br.NumStars, br.Review
FROM Book b
LEFT JOIN BookReview br ON b.ISBN = br.ISBN
WHERE b.AuthorID = @AuthorID
ORDER BY br.NumStars, b.Title

This code will fail whenever the combined size of the joined row exceeds 8094 bytes (for example, if the Description and the Review are both 5000 bytes long).
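To see how close your data actually comes to the limit, one option (a sketch against the tables above, ignoring row overhead and the narrow fixed-width columns, so treat the result as an approximation) is to measure the data lengths of the columns taking part in the sort row:

```sql
-- Widest combined rows; DATALENGTH returns NULL for books with no review,
-- so COALESCE treats a missing review as zero bytes
SELECT TOP 10
       DATALENGTH(b.Description) + COALESCE(DATALENGTH(br.Review), 0)
       + DATALENGTH(b.ISBN) + DATALENGTH(b.Title) AS ApproxSortRowBytes
FROM Book b
LEFT JOIN BookReview br ON b.ISBN = br.ISBN
ORDER BY ApproxSortRowBytes DESC
```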

The best option is to perform the sort first, using only the narrow key columns, and then join the wide columns back onto the recordset afterwards:

SELECT b.ISBN, b.Title, b.Description, br.ReviewDate, br.NumStars, br.Review
FROM (SELECT TOP 100 PERCENT RowNumber, Book.ISBN, BookReview.ReviewID
      FROM Book
      LEFT JOIN BookReview ON Book.ISBN = BookReview.ISBN
      WHERE Book.AuthorID = @AuthorID
      ORDER BY BookReview.NumStars, Book.Title) sorted
JOIN Book b ON sorted.ISBN = b.ISBN
LEFT JOIN BookReview br ON sorted.ReviewID = br.ReviewID
ORDER BY sorted.RowNumber

RowNumber then needs to be generated by your favourite method, for example joining onto a predefined table containing positive integers.
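On SQL Server 2005, which the post mentions in passing, the ROW_NUMBER() window function generates this directly. A sketch using the tables above; note that the OVER clause sorts only the narrow key columns, so the wide Description and Review columns stay out of the sort:

```sql
SELECT ROW_NUMBER() OVER (ORDER BY br.NumStars, b.Title) AS RowNumber,
       b.ISBN, br.ReviewID
FROM Book b
LEFT JOIN BookReview br ON b.ISBN = br.ISBN
WHERE b.AuthorID = @AuthorID
```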

Aug 09 2007

Another fun, life changing event has occurred this year – we’re having a baby! Sometime in January, expect to see me very sleepy. Amanda and I are both very excited!

Aug 01 2007

I recently received an email from a colleague asking if he was missing something in an email he had received from a vendor. His application was running a report that was taking too long, and the vendor had suggested shrinking his transaction log (TLog) from 60 GB to 1 GB and making hourly TLog backups. The vendor had apparently run a test showing that with a 60 GB log file the report took around 3 minutes, whereas after shrinking the TLog it took just 20 seconds. My colleague was confused, as he couldn’t see any reason why this would be so.

The TLog is simply a circular file that holds a record of all transactions that make changes to the database. The TLog has a head marker, which keeps track of where it is up to, and when the head reaches the end of the file, it wraps around to the start. You get a “transaction log full” error when the head tries to move past the tail. When you perform a TLog backup, you back up everything from the tail to the head (as at the time of the backup), mark that region as empty space, and move the tail forward to the point the head occupied when the backup was taken.
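The hourly backups the vendor suggested are just scheduled log backups; a minimal sketch, where the database name and backup path are placeholders:

```sql
-- Back up the transaction log; afterwards the space between the old tail
-- and the head (at backup time) is marked free, and the tail moves forward
BACKUP LOG MyDatabase
TO DISK = 'D:\Backups\MyDatabase_log.trn'
```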

If your transaction log is on a single physical disk, which is reserved for ONLY the TLog, then there is no benefit to having a smaller log file. In fact, it should take up the entire disk, if that’s all that’s running on it.

So, what could cause a report to run faster after shrinking the TLog? Two thoughts spring to mind:

a) It’s a report. Reports typically don’t use the TLog, as they don’t write data to the database. Usually a report will create some temporary tables, but those would be stored in TempDB, unless the application creates them as permanent tables in its own database.

b) Was SQL Server fully warmed up for both experiments? It’s possible the vendor tested by attaching the database, running the report, shrinking the database, setting up hourly TLog backups, and running the report again. In that case, the first time the report ran, every page had to be read physically from disk. The second time, many of those pages were still in memory, so only logical reads were required. Logical (memory) reads are measured in nanoseconds, while physical (disk) reads are typically measured in milliseconds.
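One way to see the cold-versus-warm difference for yourself (a sketch; DBCC DROPCLEANBUFFERS requires sysadmin rights and should only be run on a test server, and dbo.LongReport is a hypothetical stand-in for the vendor’s report procedure):

```sql
SET STATISTICS IO ON      -- report logical and physical reads per statement

CHECKPOINT                -- flush dirty pages so the next command can discard them
DBCC DROPCLEANBUFFERS     -- empty the buffer cache: force a cold start

EXEC dbo.LongReport       -- first run: mostly physical (disk) reads

EXEC dbo.LongReport       -- second run: pages cached, mostly logical reads
```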

I think there’s a large gap between most developers’ knowledge of their database engine of choice and how it actually operates – it’s a little more complicated than just sticking things in third normal form and understanding SELECT, INSERT, UPDATE and DELETE statements. This example illustrates the point, as the person recommending a smaller TLog doesn’t understand the principles of what a TLog does, or the concept of SQL Server warming up.