T-SQL Tuesday #011 – Filtered Index and an Alternative

This month’s T-SQL Tuesday is brought to us by Sankar Reddy (Blog|Twitter).  He has graciously invited us to discuss Misconceptions in SQL Server.  This is quite a large topic, and one popular misconception that I touched on in a previous post was that SQL Server doesn’t require a DBA!  Today however, I’m going to look at something more obscure, and whether or not it’s even a misconception is a matter of interpretation and the circumstances.

The misconception I want to tackle is that filtered indexes require SQL Server 2008 or later.  Okay, it’s not entirely true to call this a misconception.  Filtered index creation syntax was introduced with SQL Server 2008.  However, there’s a work-around that can be used in earlier versions to obtain similar benefits if the index is unique.

Let’s come up with a contrived example using a filtered unique index.  Note that everything here is created to demonstrate a specific point with brevity.  So please don’t look to the resulting table that is created below as an example of good database design or practice.

With that out of the way, let’s say we need to come up with a database of information about sheep.  We create a simple table with a national tag id, U.S. state of registry, the sheep’s name, gender, breed, and date of birth.  The national tag id uniquely identifies animals across the United States.  This sounds like a reasonable primary key.  Owners give sheep names for various reasons, from marketing to affection.  Similar to the case for people, there’s no assumption that names are unique.  However, in the spirit of devising a contrived example, let’s say that the state of California conducted a multi-million dollar study of nervous sheep.  After years of research, this study concluded that California sheep were suffering from a crisis of identity.  The solution to this severe problem was to enact a law requiring that no two sheep registered in the state of California can have the same name.

So let’s create our sheep table in a test database.  Standard disclaimer applies that there are no guarantees, warranties, don’t blindly use this on a production system, etc.

CREATE DATABASE [TestIndex] ;
GO

USE [TestIndex] ;

CREATE TABLE [dbo].[TestSheep]
( [TagID] INT IDENTITY(1,1) NOT NULL
, [State]     CHAR(2)       NOT NULL
, [Name]      VARCHAR(30)   NOT NULL
, [Gender]    CHAR(1)       NOT NULL
, [Breed]     VARCHAR(30)   NOT NULL
, [Birth]     SMALLDATETIME NOT NULL
, CONSTRAINT [PK_TestSheep] PRIMARY KEY CLUSTERED ([TagID]) ) ;
GO

Okay, that looks fine, but there is nothing here to require Name to be unique when the state is California.  No problem, we can just add a unique filtered index to enforce this constraint.

CREATE UNIQUE NONCLUSTERED INDEX [FI_TestSheep_Name_State]
    ON [dbo].[TestSheep] ( [Name] )
    WHERE [State] = 'CA' ;
GO

The above looks like a typical index creation statement, except that it contains a WHERE clause.  This clause is what makes it a filtered index.  Also note that a WHERE clause is not allowed when using the ALTER TABLE ADD CONSTRAINT UNIQUE syntax.  In my prior T-SQL Tuesday post, I discussed why this might be a preferred syntax for enforcing a uniqueness business rule.  However if a WHERE clause filter is needed, then the only option is to use CREATE UNIQUE INDEX syntax.

Let’s now see the filtered index in action by attempting to execute the following eight INSERT statements.

INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('MT','Nelly','F','Hampshire','2009-02-18') ;

INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('MT','Nelly','F','Hampshire','2010-03-04') ;

INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('IN','Tipsy','F','Montadale','2009-12-27') ;

INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('IN','Trent','M','Suffolk','2009-12-28') ;

INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('OH','Tipsy','F','Southdown','2009-02-17') ;

INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('CA','Nelly','F','Cheviot','2009-12-20') ;

INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('CA','Willow','F','Dorset','2009-12-22') ;

--this should fail
INSERT INTO dbo.TestSheep ([State],[Name],Gender,Breed,Birth)
VALUES('CA','Willow','F','Suffolk','2010-01-10') ;

The last INSERT failed with “Msg 2601, Level 14, State 1, Line 24 Cannot insert duplicate key row in object ‘dbo.TestSheep’ with unique index ‘FI_TestSheep_Name_State’.”  This was the filtered index at work, preventing two sheep with the same name of Willow in the state of California to be inserted.  Noticed that there was nothing preventing two sheep from having the same name in other states.

So that works and it seems like the job is done.  Let’s say you did the above development work on a SQL Server 2008 test server.  Now you go to install it on production, which happens to be running SQL Server 2005.  The table creation statement completes successfully, but you run the unique filtered index creation script and get an error message “Msg 156, Level 15, State 1, Line 3 Incorrect syntax near the keyword ‘WHERE’.”

The issue causing this error message is that using a WHERE clause in an index creation statement to specify a filtered index was introduced with SQL Server 2008.  You cannot use this syntax on earlier versions of SQL Server.

Now we’re stuck, right?  There’s no way to create an index that enforces uniqueness only under certain conditions unless you have SQL Server 2008, right?  Wrong.  Hence, we finally get to the misconception.  With that, let’s look at how we can obtain this functionality in earlier versions of SQL Server.

What we’ll do is create a view containing our WHERE clause acting as a filter, then create a unique index on the view.  Below is the code to drop and recreate the table, then create the view and index the view.

--Set statements required for creating materialized views.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

--Set statements required for creating index to materialized view.
SET ANSI_PADDING ON
GO
SET ANSI_WARNINGS ON
GO
SET ARITHABORT ON   --only required in 80 compatibility mode.
GO
SET CONCAT_NULL_YIELDS_NULL ON
GO
SET NUMERIC_ROUNDABORT OFF
GO

DROP TABLE [dbo].[TestSheep] ;

CREATE TABLE [dbo].[TestSheep]
( [TagID] INT IDENTITY(1,1) NOT NULL
, [State]     CHAR(2)       NOT NULL
, [Name]      VARCHAR(30)   NOT NULL
, [Gender]    CHAR(1)       NOT NULL
, [Breed]     VARCHAR(30)   NOT NULL
, [Birth]     SMALLDATETIME NOT NULL
, CONSTRAINT [PK_TestSheep] PRIMARY KEY CLUSTERED ([TagID]) ) ;
GO

CREATE VIEW [dbo].[vTestSheepCA] ([Name]) WITH SCHEMABINDING
AS
SELECT [Name] FROM [dbo].[TestSheep] WHERE [State] = 'CA' ;
GO

CREATE UNIQUE CLUSTERED INDEX [UI_vTestSheepCA_Name]
  ON [dbo].[vTestSheepCA] ( [Name] ) ;
GO

Next, run the eight statements above that INSERT records into the TestSheep table to see that, as before, the first seven succeed while the eighth fails as intended with error message “Msg 2601, Level 14, State 1, Line 24 Cannot insert duplicate key row in object ‘dbo.vTestSheepCA’ with unique index ‘UI_vTestSheepCA_Name’.”  Thus, while using SQL Server 2005 we were able to obtain the same conditional uniqueness functionality that’s offered with a filtered index in SQL Server 2008.

I’ve enjoyed participating in this month’s T-SQL Tuesday, an event that was the brainchild of Adam Machanic (Blog|Twitter).  Thanks again to Sankar Reddy for hosting T-SQL Tuesday #11!

About Noel McKinney

Noel McKinney is the administrator of noelmckinney.com
This entry was posted in SQL Server, T-SQL Tuesday and tagged . Bookmark the permalink.

8 Responses to T-SQL Tuesday #011 – Filtered Index and an Alternative

  1. Pingback: T-SQL Tuesday #011 – Filtered Index and an Alternative | Noel NOT … | Birth Database

  2. Pingback: Tweets that mention T-SQL Tuesday #011 – Filtered Index and an Alternative | Noel NOT NULL; -- Topsy.com

  3. Bob Pusateri says:

    Pretty slick – I hadn’t thought of that!

    • Thanks Bob! It happened in a kind of backward way. I had a column that just had to be unique sometimes… the route I was heading down was to partition the data into two tables, one where the column was unique and the other where it wasn’t, then I’d union the tables in a view. Then I started working on putting an index on that view, and that’s when I wondered if this solution would work, and it did. Much easier than partitioning the data into two tables and maintaining the juggling of those tables with stored procedures.

  4. Brandie Tarvin says:

    I just love the Nervous Sheep Law. Every state should have one. @=)

    Nice example. Thanks.

  5. Pingback: T-SQL Tuesday #11 Round up, Misconceptions in SQL Server | Sankar Reddy, SQL Server Developer/DBA

Comments are closed.