T-SQL Tuesday #018 – CTE with OPENROWSET

May’s T-SQL Tuesday is brought to us by Bob Pusateri (Blog|Twitter). Bob is very active in the SQL Server community here in Chicago and was one of the first people I met when I started attending user group meetings. So there was no way I was going to miss out on this month’s T-SQL Tuesday event! For this month, Bob is asking “Have you ever solved or created a problem by using CTEs? Got a tip, trick, or something nifty to share?” Sure, I think I can fill that order.

CTE is the abbreviation for Common Table Expression, a feature that was new in SQL Server 2005. Much of the excitement with this feature’s arrival was the capability to implement recursion in T-SQL code via CTEs. I’ve mostly used CTEs to create more readable code in situations that might have involved complicated subqueries as well as some (but not all) circumstances that might utilize temp tables or table variables.

One of my favorite ways to use CTEs is when I want to access a file using the T-SQL OPENROWSET feature. There’s nothing wrong with using OPENROWSET in the FROM clause. However, it seems cleaner to break it out into a CTE so that all that awkward syntax is up and away from the FROM clause. That way, once you have your OPENROWSET syntax done, you can fiddle around with the FROM clause and not worry about accidentally breaking your OPENROWSET work.

Let’s say you have a tab-delimited file of some folks you know and their favorite words. The file has three columns: a line number, the person’s first name, and their favorite word. So this file might look like the following.

1	Bob	coefficient of thermal expansion
2	Brent	gaga
3	Buck	cloud
4	Harper	awesome
5	Jes	run
6	Karen	data
7	Norman	hammer
8	Paul	sheep
9	Yanni	hike

Note that this tab-delimited data file named “FavoriteWords.txt” is available in a compressed zip file that can be downloaded here. Also available in this zip file is the format file named “FavoriteWords.xml” that is utilized in the queries that follow. So first let’s see if we can query this file in a CTE utilizing OPENROWSET with the BULK provider. As usual, nothing here comes with any guarantees that your server won’t explode, data won’t be wiped out permanently, etc. Like any source code or files you find on the web, don’t run anything here untested on a production machine or any other system that you care about.

; WITH [FavoriteWords] ([LineNumber],[FirstName],[FavoriteWord])
 AS ( SELECT [LineNumber],[FirstName],[FavoriteWord]
        FROM OPENROWSET( BULK 'C:\temp\FavoriteWords.txt'
                       , FORMATFILE = 'C:\temp\FavoriteWords.xml')
      AS [fav] )
SELECT [LineNumber],[FirstName],[FavoriteWord]
  FROM [FavoriteWords] ;

That’s not too bad, although you may have to read up on OPENROWSET in Books Online in case you run into permissions issues, etc. Perhaps you can already see why I like pushing all that OPENROWSET syntax up into a CTE. We could now add more complexity to our SELECT, or even write an INSERT/UPDATE/DELETE, and not have to touch the OPENROWSET syntax in the CTE. For example, if you have the AdventureWorks sample database available, you can write a query like this.

; WITH [FavoriteWords] ([LineNumber],[FirstName],[FavoriteWord])
 (SELECT [fav].[LineNumber],[fav].[FirstName],[fav].[FavoriteWord]
    FROM OPENROWSET( BULK 'C:\temp\FavoriteWords.txt'
           , FORMATFILE = 'C:\temp\FavoriteWords.xml') AS [fav] )
SELECT [e].[Title],[f].[FirstName],[f].[FavoriteWord]
  FROM [AdventureWorks].[HumanResources].[Employee] [e]
 INNER JOIN [FavoriteWords] [f]
    ON [e].[EmployeeID] = [f].[LineNumber] ;

Of course, if you don’t have AdventureWorks, you could use some other table and join on an integer primary key to get similar results.  Running the above query in SQL Server Management Studio gave the results seen in this screenshot snippet.

BONUS: Did you notice that the file path includes the drive letter in the above queries? Yuck. Plus, what if the folder containing the file being queried is on C:\ in some environments but E:\ in others? Well, the nice thing is that you can also specify the share name if the folder is shared.  So if the C:\temp folder is shared as “temp” then you can use that to write queries such as the one below.

; WITH [FavoriteWords] ([LineNumber],[FirstName],[FavoriteWord])
AS ( SELECT [LineNumber],[FirstName],[FavoriteWord]
     FROM OPENROWSET( BULK '\\LOCALHOST\temp\FavoriteWords.txt'
 , FORMATFILE = '\\LOCALHOST\temp\FavoriteWords.xml') AS [fav] )
SELECT [LineNumber],[FirstName],[FavoriteWord]
  FROM [FavoriteWords] ;

With that, I’m looking forward to reading the contributions to this month’s T-SQL Tuesday and seeing how everyone is using CTEs.  Thank you to Bob Pusateri for hosting T-SQL Tuesday #18, and thanks agaom to Adam Machanic (Blog|Twitter) for creating this monthly blog event!


About Noel McKinney

Noel McKinney is the administrator of noelmckinney.com
This entry was posted in T-SQL Tuesday and tagged , . Bookmark the permalink.

7 Responses to T-SQL Tuesday #018 – CTE with OPENROWSET

  1. Claire says:

    Nice. I hadn’t known that you could specify the share name if the folder is shared–makes things a bit easier!

  2. Great use of CTE’s I think my favorite part about T-SQL Tuesday is seeing all the cool ways people use the syntax!

    • Thanks Bradley! Same here, it really makes T-SQL come to life when people talk about how they use features, Adam did the community a great service by coming up with T-SQL Tuesday.

  3. Bob Pusateri says:

    Hi Noel – Thanks for the kind words and the excellent contribution to T-SQL Tuesday. I really need to play with the XML functions more, and your example sounds like a great place to start!

    • Wow, thanks Bob! I should have mentioned that you don’t have to use an XML format file, I just prefer using XML when possible for configuration files and such.

  4. Pingback: T-SQL Tuesday #18 Wrapup | Bob Pusateri - The Outer Join

Comments are closed.