Category Archives: T-SQL

Fun and Joy with Optional Search Parameters

The past few weeks I’ve been writing stored procedures that use optional search parameters. Since this subject has been written about at length by many other SQL folks, with articles that have been very helpful to me, I decided to list the resources here on my blog.

Dynamic Search Conditions in T-SQL – Version for SQL 2005 and Earlier by Erland Sommarskog

Dynamic Search Conditions in T-SQL Version for SQL 2008 (SP1 CU5 and later) by Erland Sommarskog

Sometimes the Simplest Solution Isn’t the Best Solution by Bart Duncan

The two things you need to know about outer joins

Today I’ll discuss two very important items to keep in mind when coding with an outer join.  These issues apply to both the left and right outer join.

Join order matters – with an inner join, the join order does not affect the results set.  The SELECT statement returns the same number of rows regardless of the order in which the tables are listed.  But that is not the case with an outer join.  Here is an example from the AdventureWorks sample database.

First I’ll join the Production.Product table to the Production.ProductInventory table with a left join so I can list all the products(including those without any inventory) and their corresponding quantities.

SELECT     p.Name as ProductName,
        inv.Quantity
FROM Production.Product p
LEFT JOIN Production.ProductInventory inv
    ON p.ProductID = inv.ProductId

But after reviewing the output, I realize that the location name is needed to make this query complete, so I’ll add the location table to the query with an inner join:

SELECT     p.Name as ProductName,
        inv.Quantity
FROM Production.Product p
LEFT JOIN Production.ProductInventory inv
    ON p.ProductID = inv.ProductId
INNER JOIN Production.Location l
    ON l.LocationID = inv.LocationID
ORDER BY p.Name desc

But now something is wrong with the code, because the result set is missing the products without inventory.  Why?

That’s because using an inner join to add the location table filters out the products with no quantity.  Since they have no inventory, they do not have a row in the ProductInventory table and hence nothing to join to in the ON l.LocationID = inv.LocationID clause.  Since I want to keep the rows where inv.Quantity is null, let’s change this to a left join:

SELECT     p.Name as ProductName,
        inv.Quantity
FROM Production.Product p
LEFT JOIN Production.ProductInventory inv
    ON p.ProductID = inv.ProductId
LEFT JOIN Production.Location l
    ON l.LocationID = inv.LocationID

And that works!  All of our products (even if they do not have any inventory) are in the result set.

WHERE clause vs. ON clause

The WHERE clause is typically used to restrict the results of a query.  But in the case of an outer join, using the WHERE clause can give you the wrong results.  Here I want to see all the products and the total quantity of each product currently in work.  So I left join the Production.Product table to the Production.WorkOrder table to see all 504 products and the quantity in work.  Quantity is zero for products not currently in production.

SELECT    p.Name,
        COUNT(wo.WorkOrderID)
FROM Production.Product p
LEFT JOIN Production.WorkOrder wo
    ON p.productID = wo.ProductID
GROUP BY p.Name

But I don’t want to see all work orders, just the most recent ones.  So I try to limit the query to the most recent work orders by putting the StartDate in the WHERE clause:

SELECT    p.Name,
        COUNT(wo.WorkOrderID)
FROM Production.Product p
LEFT JOIN Production.WorkOrder wo
    ON p.productID = wo.ProductID
WHERE wo.StartDate >= ’1/1/2004′
GROUP BY p.Name

But now I have lost my products not currently in production.  So now what?  I’ll move the StartDate restriction to the ON clause, so it will be part of the join:

SELECT    p.name,
        COUNT(wo.WorkOrderID)
FROM Production.Product p
LEFT JOIN Production.WorkOrder wo
    ON p.productID = wo.ProductID
        AND StartDate >= ’1/1/2004′
GROUP BY p.Name

Now I will return all 504 products, with the quantities for the most recent work orders.  So why does using the ON clause work and not the WHERE clause?  There are two reasons.  First the ON clause is applied as part of the join and the WHERE clause is applied after the join.  Secondly, the WHERE clause is applied to the right side of the left outer join, the WorkOrder table, the table we only want rows from if there is a match.

So, don’t be afraid to use an outer join, just be aware of the gotcha’s when coding a left/right join.

SQL to get all Month End and Begin Dates

Many web sites and blogs will tell you how to find the last day, first day of a month or year

But what if you need the month end and begin dates for a specific time period?  Well, here is the code for that:

–script to get month begining & end dates for a time period
–declare all variables
DECLARE @StartDate datetime
DECLARE @EndDate datetime
DECLARE @MonthEndDate datetime
DECLARE @MonthBeginDate datetime

DECLARE @MonthlyDates Table
(    MonthID INT NOT NULL IDENTITY(1,1),
MonthBeginDate SMALLDATETIME NOT NULL,
MonthEndDate SMALLDATETIME NOT NULL
)

–set the time period here
SET @StartDate = ’3/1/2009′
SET @EndDate = ’5/31/2012′

–Get the first month’s begin and end dates
SELECT @MonthBeginDate = DATEADD(dd,-(DAY(@StartDate)-1),@StartDate)
SELECT @MonthEndDate = DATEADD(mm,1,@StartDate) – DAY(DATEADD(mm,1,@StartDate))

–do the first insert into the holding table
INSERT INTO @MonthlyDates
(MonthBeginDate, MonthEndDate)
VALUES
(@MonthBeginDate, @MonthEndDate)

–use while loop to move through each month
WHILE @MonthEndDate < @EndDate
BEGIN

SELECT @MonthBeginDate = DATEADD(mm, 1, @MonthBeginDate)
SELECT @MonthEndDate = DATEADD(dd, -1, DATEADD(mm, 1, @MonthBeginDate))

INSERT INTO @MonthlyDates
(MonthBeginDate, MonthEndDate)
VALUES
(@MonthBeginDate, @MonthEndDate)

END
SELECT
MonthBeginDate,
MonthEndDate
FROM @MonthlyDates

So this code will give you all the months begin & end dates, in order.  Now in this code I’m not looking at when the @StartDate or @EndDate falls in the month.  For example, if I pass into the code a start date of March 10, 2009 it will return, as the first row, the begin and end dates for the month of March.  If you wanted to only return dates for whole months that your range covers (in the above example that would be a start month of April), you would need to alter the code.

Oh, and it is good for leap years too!

Good reference links for the TOP Keyword

This past week I needed to use the TOP keyword for some code and came across some excellent resources:

First, the quick basics of top:

Dynamically controlling the number of rows affected by a SQL Server query

Next, a good article on how to retrieve top and bottom rows together:

How to Retrieve TOP and BOTTOM Rows Together using T-SQL

And an excellent article on the TOP per group scenario:

Who’s On First? Solving the Top per Group Problem

And the book Inside Microsoft SQL Server 2008 T-SQL Querying there is a whole chapter mostly on TOP; Chapter 9 TOP and APPLY.

TOP isn’t something used everyday in SQL Server, but when you need it, you’ll be happy it’s part of TSQL.

The New Date & Time Datatypes are great, but…

The first time I read about the new Date and Time datatypes I though “Well, that’s nice”.  But the more I think about it the happier I am.  No more stripping off the time values to do a date only comparison.  Thank goodness.   Some other pluses;  possible use of indexes, safe to use BETWEEN for date ranges, easier reading of code without all of those CONVERT functions. 

But don’t be so eager to leave out the time part.  You need to be absolutely sure that you don’t need the time piece.  If your not sure, keep the time part in a column separate from the date.

There is some good sample data for these datatypes in the AdventureWorks2008 database.  Take a look at the Human Resources.Employee table, Birthdates & Hire Date columns.  And also look at Human Resources.Shift table, Start Time & End Time columns.