Don’t forget the basics when query tuning!

The world of performance tuning, specifically query tuning, is quite vast, with entire books devoted to the subject.  And those books come with pages and pages of dense and complex technical material.  But in your quest to figure out how to get more performance from that slower-than-molasses stored procedure, don’t forget some basic steps:

  1. Define Success. Before you do anything, decide what the goal is?  A 50% reduction in time needed.  75%?  What??  And once you have met the goal stated for success, stop!  I’m sure you have other tasks on your to-do list.  Nobody in your company is going to care if you make it faster then needed. When the customer/end user is happy, move on.
  2. Do no harm.  Whatever you do, don’t make the problem worst then it is.  For example don’t throw a bunch of indexes into your database in an attempt to fix one procedure, without understanding the rest of the queries in your system.  It is better to start off with a review of the code first.  Changing the code will only affect that procedure where as creating or dropping indexes will surely affect other queries on your system.  And please fully test all changes you do make! The last thing you want to do is introduce logical errors into your code.
  3. Document your changes.  Oh yes, the dreaded need for documentation.  I know all of the excuses for not documenting a code change…you’re too busy…nobody will read it…nobody will care.  Trust me, the new developer who comes in behind you will be so grateful if you do this.  And if you need to work with the same code again, you’ll be happy if you document your changes, because you will probably forget what you did 6 months later!  So, write down what you changed before you forget!

Documentation, Just Do It

For the past 6 weeks I’ve been attending a civilian “Boot Camp”, an fitness program designed to help you lose weight and get in shape.  With this program comes the requirement to  write down everything you eat in a food journal.  Oh, how I have hated keeping this food journal, it is harder then the daily exercise sessions.  Why do I want to remember what I ate last week if a lot of it was junk?…but that’s just the point.  If you write your food choices down you are more likely to pick healthy food, not junk.  Keeping track of my eating habits in a journal has given me insight into my eating habits and how they affect my health.

So what does this have to do with documenting our databases?  Well, in my experience the worst databases I’ve come across have the least amount of documentation, in some cases, none.  If you are throwing hacks into your database, you just want to purge the whole event from your mind, not write it down.  We don’t want to remember the hack we had to put into the database or the fact we really don’t like the all the other “junk” that the last developer/DBA put into the poor database either.

But like your diet, the worse the database is, the more it needs documentation.   Like the improvements in our bodily health, improvements in our SQL Servers will be much easier to come by if we take the time to document what we’ve got, as junky as it might be.  But don’t try to document the whole thing at once.  My food journal only asks for what you ate at your last meal or snack, not your food choices for the past month.  A little bit at time makes it more likely that you will actually do it.  So document your databases as you go, a small portion at a time.  That way it won’t feel like such a chore.

I’m going on a SQLCruise!!!

About 2 weeks ago I heard of a SQL Server training event held on a cruise ship.  Huh?  What?  A cruise ship….my curiosity led me to SQLCruise.com to read all about it.  The more I read the more excited I got about going, so I consulted the checkbook and the Hubby, got the green light to go, and signed up!

Just 78 days until we depart.  I can’t wait!

Time to wake up the Blog with some odds and ends

Gee, where did the month of April go to…I look up and here is May.  Lots of work, SQL Saturday, Jury Duty…some fun and not so fun times.

The first ever SQL Saturday to come to SoCal was filled with lots of great education.  Andrew Karcher ( Blog | Twitter ) and his gang did a bang up job putting in a lot of hard work in a very short time to bring this event to the SoCal SQL Server Community.  I can’t wait for the next one!

I did manage to look at my feed reader a few times.  Here are the best links of April:

Rock Stars, Normal People, and You by Brent Ozar  – Quite possibly the most encouraging thing I’ve read in years.  Excellent example of hard work, long hours and some risking taking paying off big time.

The April Linchpin Session – By Seth Godin – Seth is brilliant.  Not much more to add here.

24 hours of PASS – more free training, yea!

T-SQL Tuesday #005 – Reporting – The Round-Up By Aaron Nelson – So much good reporting stuff in here, I can’t list it all

And on the subject of jury duty, well, I’d just rather forget all about it.

Have a happy May 2010!

The two things you need to know about outer joins

Today I’ll discuss two very important items to keep in mind when coding with an outer join.  These issues apply to both the left and right outer join.

Join order matters – with an inner join, the join order does not affect the results set.  The SELECT statement returns the same number of rows regardless of the order in which the tables are listed.  But that is not the case with an outer join.  Here is an example from the AdventureWorks sample database.

First I’ll join the Production.Product table to the Production.ProductInventory table with a left join so I can list all the products(including those without any inventory) and their corresponding quantities.

SELECT     p.Name as ProductName,
        inv.Quantity
FROM Production.Product p
LEFT JOIN Production.ProductInventory inv
    ON p.ProductID = inv.ProductId

But after reviewing the output, I realize that the location name is needed to make this query complete, so I’ll add the location table to the query with an inner join:

SELECT     p.Name as ProductName,
        inv.Quantity
FROM Production.Product p
LEFT JOIN Production.ProductInventory inv
    ON p.ProductID = inv.ProductId
INNER JOIN Production.Location l
    ON l.LocationID = inv.LocationID
ORDER BY p.Name desc

But now something is wrong with the code, because the result set is missing the products without inventory.  Why?

That’s because using an inner join to add the location table filters out the products with no quantity.  Since they have no inventory, they do not have a row in the ProductInventory table and hence nothing to join to in the ON l.LocationID = inv.LocationID clause.  Since I want to keep the rows where inv.Quantity is null, let’s change this to a left join:

SELECT     p.Name as ProductName,
        inv.Quantity
FROM Production.Product p
LEFT JOIN Production.ProductInventory inv
    ON p.ProductID = inv.ProductId
LEFT JOIN Production.Location l
    ON l.LocationID = inv.LocationID

And that works!  All of our products (even if they do not have any inventory) are in the result set.

WHERE clause vs. ON clause

The WHERE clause is typically used to restrict the results of a query.  But in the case of an outer join, using the WHERE clause can give you the wrong results.  Here I want to see all the products and the total quantity of each product currently in work.  So I left join the Production.Product table to the Production.WorkOrder table to see all 504 products and the quantity in work.  Quantity is zero for products not currently in production.

SELECT    p.Name,
        COUNT(wo.WorkOrderID)
FROM Production.Product p
LEFT JOIN Production.WorkOrder wo
    ON p.productID = wo.ProductID
GROUP BY p.Name

But I don’t want to see all work orders, just the most recent ones.  So I try to limit the query to the most recent work orders by putting the StartDate in the WHERE clause:

SELECT    p.Name,
        COUNT(wo.WorkOrderID)
FROM Production.Product p
LEFT JOIN Production.WorkOrder wo
    ON p.productID = wo.ProductID
WHERE wo.StartDate >= ’1/1/2004′
GROUP BY p.Name

But now I have lost my products not currently in production.  So now what?  I’ll move the StartDate restriction to the ON clause, so it will be part of the join:

SELECT    p.name,
        COUNT(wo.WorkOrderID)
FROM Production.Product p
LEFT JOIN Production.WorkOrder wo
    ON p.productID = wo.ProductID
        AND StartDate >= ’1/1/2004′
GROUP BY p.Name

Now I will return all 504 products, with the quantities for the most recent work orders.  So why does using the ON clause work and not the WHERE clause?  There are two reasons.  First the ON clause is applied as part of the join and the WHERE clause is applied after the join.  Secondly, the WHERE clause is applied to the right side of the left outer join, the WorkOrder table, the table we only want rows from if there is a match.

So, don’t be afraid to use an outer join, just be aware of the gotcha’s when coding a left/right join.