Sometimes I want to quickly load a data file into a SQL Server table, and one of my favorite methods is using the BULK INSERT statement. Yes, I’m also a fan of SSIS (SQL Server Integration Services) and it’s forerunner DTS, as well as the BCP executable. If nothing else, file this under TIMTOWTDI (There Is More Than One Way To Do It) because I’m not trying to convince anyone of what method is best. However, sometimes there are situations where I want or need to do something in T-SQL rather than using other available options. I’m also not going into a lot of BULK INSERT details here and many aspects are kept simple to avoid deviations.
Now, using BULK INSERT to load data into a table seems acceptable enough, but why would I want to use it with a view? Well, one day I wanted to load a data file with columns that didn’t match the underlying table. I also didn’t want to mess around with creating a format file. So I created a view on the underlying table containing only columns that were in the data file, then did a simple BULK INSERT into the view, and it worked! This left me wondering if I could also use this technique for generating the file’s line number of each row. That’s what I’ll demonstrate here.
To start, we’ll create a database, table, and view into which we can BULK INSERT a file’s contents.
CREATE DATABASE TestBIV ; USE TestBIV ; CREATE TABLE TestBulkInsert ( LineNumber INT IDENTITY(1,1) NOT NULL , RowData VARCHAR(100) ) ; CREATE VIEW vTestBulkInsert AS SELECT RowData FROM TestBulkInsert ;
The file that we’ll load just has one column of less than 100 characters of text data. That’s what we’ll load into the “RowData” column of the “TestBulkInsert” table. That table has an integer “LineNumber” column with the IDENTITY property that will start off at 1 and will increment by 1 for each row inserted. I created a file named “RowsSixteenK.txt” and saved it to the C:\Temp folder. This file is available in the zip file mentioned at the end of this post and it contains 16,000 rows. Each row has the line number prefixed to a string of characters; this let me verify that the “LineNumber” integer value generated by the IDENTITY property matched the actual line number of the row in the file. Below is the statement to load the file.
TRUNCATE TABLE TestBIV.dbo.TestBulkInsert ; BULK INSERT TestBIV.dbo.vTestBulkInsert FROM 'C:\Temp\RowsSixteenK.txt' WITH (BATCHSIZE = 8000 , TABLOCK) ;
Nothing has been loaded into the table yet, so the TRUNCATE isn’t needed unless you want to reload the file. TRUNCATE is used instead of DELETE so that the seed to the IDENTITY will be reset and values for LineNumber will start at 1 if data is reloaded. If DELETE is used then the LineNumber values will continue from the prior load. Also note that I specified a BATCHSIZE value of 8000 rows. If this is not specified then all the rows are loaded as one batch. Since there are 16,000 rows in the data file, this breaks the load into two batches. The optimal value to use varies among situations, it’s just a value I used here for demonstration purposes.
With the data file loaded into the table, a query showed that “LineNumber” column values were generated that matched the line number of the data file.
As mentioned above, a zip file with the T-SQL source code used in this demonstration, as well as a file with 16,000 rows of data for inserting, is here if you want to try this yourself (with the usual disclaimer of no guarantees, don’t blindly use this on a production system, etc).