This month’s T-SQL Tuesday is brought to us by Michael J. Swart (Blog | Twitter). He wants to know if we use indexes or indices for the plural form of index. I guess my answer is in the title of this post. Okay, that’s not all Michael wants from us this month. In fact, he’s opened the door to any aspect dealing with indexes. So with that, I’m taking this opportunity to delve deeper into something that’s been on my mind for a while, and that’s the potential for identification of “unused” indexes that actually might be used in a more indirect manner. In other words, they aren’t utilized by seeks or scans during queries. However, we might want to keep them around, even though they have costs in terms of write operations and storage space.
Starting with SQL Server 2005, we received a very powerful set of tools with the introduction of Dynamic Management Views (DMV). Several of these are related to indexes and open up a wealth of information on how indexes are used, missing indexes, etc. For example, Michelle Ufford (Blog|Twitter) has written some impressive scripts using DMVs for index maintenance.
As I’ve mentioned in previous posts, I was one of the fortunate few to board SQLCruise last month, and it was not all fun, games and swag. Tim Ford (Blog|Twitter) turned up the educational intensity volume with his Deep Dive Into Dynamic Management Objects training session, which included an examination of the index-related DMVs. Among the scripts and demos that Tim presented was one looking at index usage and comparing reads and writes (which the economist in me wants to interpret as benefits vs. costs) as a way of finding those that might best be removed. I have done this before using the same DMV and system view, but here’s the thing… I had only done this on databases that I either designed or otherwise work with frequently. So I would see indexes that are not used from the standpoint of the DMV, but due to my familiarity with the structures I also knew that these indexes were enforcing a business rule. Until I saw Tim’s presentation, I didn’t realize how often I rule out certain indexes for removal due to familiarity or naming convention. As an aside, this is a reminder of why attending training, presentations, etc. even on subjects you (think you) know is important… another perspective or frame of reference can wake you up to valuable revelations.
For the rest of this post, I’m going to provide a demonstration involving index and constraint creation and utilization, followed by use of the “sys.dm_db_index_usage_stats” DMV with a query that investigates index utilization. Then I’ll modify that query to add information on which indexes are constraints. Finally I’ll mention the pitfalls that remain and the difficulty in addressing those pitfalls.
Let’s create a test database and two tables with indexes and constraints. Then we’ll insert a few rows and run some queries that will just use some of the indexes. Source code in this post has been used with both SQL Server 2005 and SQL Server 2008R2. Also, these are contrived examples created to demonstrate a point, so don’t look to these tables for examples of good database design or practice. Standard disclaimer applies that there are no guarantees, warranties, don’t blindly use this on a production system, etc.
CREATE DATABASE [TestIndex] ; GO USE [TestIndex] ; CREATE TABLE [dbo].[TestAccount] ( [AccountID] INT IDENTITY(1,1) NOT NULL , [AccountNumber] CHAR(6) NOT NULL , [AccountName] VARCHAR(50) NOT NULL , [AccountCurrency] CHAR(3) NOT NULL , [OpenDate] DATETIME NOT NULL , [CloseDate] DATETIME NULL ) ; ALTER TABLE [dbo].[TestAccount] ADD CONSTRAINT [PK_TestAccount] PRIMARY KEY CLUSTERED ( [AccountID] ) ; ALTER TABLE [dbo].[TestAccount] ADD CONSTRAINT [AK_TestAccount_AccountNumber] UNIQUE NONCLUSTERED ( [AccountNumber] ) ; CREATE UNIQUE NONCLUSTERED INDEX [AK_TestAccount_AccountName] ON [dbo].[TestAccount] ( [AccountName] ) ; CREATE NONCLUSTERED INDEX [IX_TestAccount_OpenDate] ON [dbo].[TestAccount] ( [OpenDate] ) ; GO CREATE TABLE [dbo].[TestEntity] ( [EntityID] INT IDENTITY(1,1) NOT NULL , [EntityName] VARCHAR(50) NOT NULL , [AccountID] INT NOT NULL ) ; ALTER TABLE [dbo].[TestEntity] ADD CONSTRAINT [PK_TestEntity] PRIMARY KEY CLUSTERED ( [EntityID] ) ; ALTER TABLE [dbo].[TestEntity] ADD CONSTRAINT [AK_TestEntity_EntityName] UNIQUE NONCLUSTERED ( [EntityName] ) ; ALTER TABLE [dbo].[TestEntity] ADD CONSTRAINT [FK_TestEntity_ref_TestAccount] FOREIGN KEY ([AccountID]) REFERENCES [dbo].[TestAccount] ([AccountID]) ; GO INSERT INTO dbo.TestAccount (AccountNumber,AccountName,AccountCurrency,OpenDate,CloseDate) VALUES('100201','BOAT ANCHOR FUND','USD','2002-02-20',NULL) ; INSERT INTO dbo.TestAccount (AccountNumber,AccountName,AccountCurrency,OpenDate,CloseDate) VALUES('100202','PIRATE HAT ACQUISITION','GBP','2003-05-12',NULL) ; INSERT INTO dbo.TestAccount (AccountNumber,AccountName,AccountCurrency,OpenDate,CloseDate) VALUES('100203','SAIL SAVINGS','JPY','2003-08-01',NULL) ; INSERT INTO dbo.TestAccount (AccountNumber,AccountName,AccountCurrency,OpenDate,CloseDate) VALUES('100204','VESSEL FUNDING','USD','2004-03-17',NULL) ; INSERT INTO dbo.TestAccount (AccountNumber,AccountName,AccountCurrency,OpenDate,CloseDate) VALUES('100205','RIGGING TRUST','CHF','2005-12-24',NULL) ; INSERT INTO dbo.TestAccount (AccountNumber,AccountName,AccountCurrency,OpenDate,CloseDate) VALUES('100206','AIR HORN FUTURES','USD','2006-01-03',NULL) ; GO INSERT INTO [dbo].[TestEntity] (EntityName,AccountID) VALUES ('CAPE CRUISE AIDER',3) ; INSERT INTO [dbo].[TestEntity] (EntityName,AccountID) VALUES ('CRUISE CAPITAL',4) ; INSERT INTO [dbo].[TestEntity] (EntityName,AccountID) VALUES ('GETTING RICH KWIK',2) ; INSERT INTO [dbo].[TestEntity] (EntityName,AccountID) VALUES ('MAID OFFSHORE',5) ; INSERT INTO [dbo].[TestEntity] (EntityName,AccountID) VALUES ('SEA SIDEWAYS',4) ; GO SELECT a.AccountCurrency FROM dbo.TestEntity e INNER JOIN dbo.TestAccount a ON e.AccountID = a.AccountID WHERE e.EntityName = 'MAID OFFSHORE' SELECT e.EntityName FROM dbo.TestEntity e INNER JOIN dbo.TestAccount a ON e.AccountID = a.AccountID WHERE a.AccountNumber = '100204' GO
The two SELECT statements at the end were crafted to utilize certain indexes and leave others unused for seek or scan operations. To see this, we’ll use the next query, which is adapted from the previously mentioned presentation by Tim Ford. Tim’s version has additional filters in the WHERE clause that I took out, plus he had more columns in the SELECT clause (system_seeks, system_scans, etc.) that I removed for brevity. I would encourage tacking an asterisk to the end of that SELECT clause at some point just to see the wealth of information available to you.
SELECT OBJECT_NAME(u.object_id) AS [object name] , i.name AS [index name] , u.user_seeks , u.user_scans , u.user_lookups , i.rowcnt AS [index rows] , u.user_updates FROM sys.dm_db_index_usage_stats u INNER JOIN sys.sysindexes i ON u.object_id = i.id AND u.index_id = i.indid WHERE u.database_id = db_id() ORDER BY OBJECT_NAME(u.object_id), i.name, u.user_updates DESC ; GO
The query’s results are below. You’ll see that the three indexes named PK_TestAccount, AK_TestEntity_EntityName and PK_TestEntity were each used in at least one seek, scan or lookup. The other three indexes were not.
|object name||index name||user seeks||user scans||user lookups||index rows||user updates|
Should we drop the three indexes that were not used in at least one seek, scan or lookup? Maybe not. The AK_TestAccount_AccountName and AK_TestAccount_AccountNumber indexes were declared UNIQUE, and the naming prefix “AK” might be used to indicate that these are alternative keys, thus they might be used for enforcing certain business rules or constraints. On the other hand, the index IX_TestAccount_OpenDate looks like an index we will want to keep an eye on to see if it is ever used.
Now, I didn’t consider removing the two indexes with an “AK” prefix in their name, but what if we can’t infer that these are meant to be constraints from the name? To this end, I modified the above query to add the “sys.key_constraints” view as seen below. This allowed the addition of a new column named “constraint type” which will indicate if the index is a Primary Key (PK), Unique (UQ) or not a constraint (NULL).
SELECT OBJECT_NAME(u.object_id) AS [object name] , i.name AS [index name] , u.user_seeks , u.user_scans , u.user_lookups , i.rowcnt AS [index rows] , u.user_updates , k.[type] AS [constraint type] FROM sys.dm_db_index_usage_stats u INNER JOIN sys.sysindexes i ON u.object_id = i.id AND u.index_id = i.indid LEFT OUTER JOIN sys.key_constraints k ON i.id = k.parent_object_id AND i.indid = k.unique_index_id WHERE u.database_id = db_id() ORDER BY OBJECT_NAME(u.object_id), i.name, u.user_updates DESC ; GO
From the results below (the first column is not displayed so that the more interesting information fits on the page) we seem to have had some success. The new “constraint type” column identifies AK_TestAccount_AccountNumber as a Unique constraint. Furthermore, the “constraint type” is NULL for IX_TestAccount_OpenDate indicating, as expected, that it’s not a constraint. However, there is a problem remaining.
|index name||user seeks||user scans||user lookups||index rows||user updates||constraint type|
The “constraint type” for index AK_TestAccount_AccountName is NULL, implying that it is not a constraint. What went wrong? Well, the issue isn’t our query. It has to do with AK_TestAccount_AccountName being defined using CREATE UNIQUE INDEX syntax, rather than using ALTER TABLE ADD CONSTRAINT UNIQUE as was done when adding the indexes AK_TestAccount_AccountNumber and AK_TestEntity_EntityName.
So we might have a lesson here that if a unique index is intended to be a constraint, then we should not use CREATE UNIQUE INDEX syntax. This is interesting to me because I’ve often found these two statements used interchangeably as though there was a belief that the final result is the same, so it doesn’t matter which is used. Of course, this begs the question of whether there is a time when we might have to use CREATE UNIQUE INDEX when creating constraints, and I can think of two. One would be if you have a business rule that can be enforced using a filtered index (which is only available with SQL Server 2008 and later, although there is an indirect way of achieving the same result in earlier versions using an indexed view). The other reason would be if you wanted to have an included column (using the INCLUDE syntax available with SQL Server 2005 and later) on an index that is also enforcing a uniqueness requirement. Of course, we could debate the wisdom of such a design, but that doesn’t keep it from happening. I plan to write more on these topics (constraints implemented with filtered indexes and included columns) in a later post. This one is already long enough and I need to wrap it up!
Finally, Rob Farley (Blog|Twitter) wrote Chapter 40 of SQL Server MVP Deep Dives
titled “When is an unused index not an unused index?” Rob makes an interesting point that an index-related DMV only tells us part of the story. He shows that sometimes the optimizer uses unique indexes in a manner that we won’t detect using DMVs alone. So that is another caveat that before removing an index, we might need to do further investigation beyond seeing if the index is used in seeks, scans and lookups.