Import data from Excel into SQL Server

By Mike Gledhill

Supposing we have the following Excel file, and we wish to import its sales figures into SQL Server.

A sample Excel file

A sample Excel file Sales.xlsx

With SQL Server, this is really easy.

All you need is the relevant OLEDB drivers installed on your server, and the following SELECT statement. Notice how it contains the filename of the Excel file, plus the worksheet name (followed by a dollar-sign) to read in.

SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Sample Excel\Sales.xlsx;HDR=NO;IMEX=1',
    'SELECT * FROM [SalesFigures$]')

Note: The Excel file must be on the same server as SQL Server itself, otherwise (in this example) SQL Server will be looking for the file on it's own C: drive, rather than your local C: drive.
 

If you're lucky, then straight away, you'll see the Excel values in the Results window.
(If this doesn't work, then see below to find out how to get around known problems.

A sample Excel file

To do something with this data, I generally import the data into a temporary table, merging in a "Row Number" field, then filter out the data as required.

Some sample code is given below, but yours will be entirely dependant on the format of the data in your Excel file.

SELECT RowNumber=IDENTITY(int,0,1), *
INTO #tmpExcel
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Sample Excel\Sales.xlsx;HDR=NO;IMEX=1',
    'SELECT * FROM [SalesFigures$]')

SELECT * FROM #tmpExcel     -- First, let's display ALL rows of data

SELECT F1 AS 'Employee name',
    F2 AS 'Sales 2011',
    F3 AS 'Sales 2010',
    F4 AS 'Sales 2009',
    F5 AS 'Sales 2008',
    F6 AS 'Sales 2007',
    F7 AS 'Sales 2006',
    F8 AS 'Sales 2005'
FROM #tmpExcel
WHERE RowNumber >= 3        -- Just get the rows we're interested in.
AND RowNumber <= 11

DROP TABLE #tmpExcel

This code produces two sets of results.
First, it shows all of the rows from Excel, and then it shows just the rows with sales data in them (except for the Totals Row we put in row 13 of our worksheet):

After importing the Excel file

Simple, hey !

This method can cause problems though: you need to make sure you have sufficient permissions to read in external files, that the relevant OLEDB library is installed on your SQL Server server, and you must copy the Excel file onto your SQL Server machine.

But, once you overcome these requirements, this method lets you import Excel data extremely quickly.

Headers

It's also worth mentioning that in this example, our data doesn't start at the top of our Excel spreadsheet. Rows 1-3 contain a few lines describing what's in the worksheet, rather than jumping straight into the samples figures directly.

However, if your data does start at the top of a worksheet, then you can tell the OLEDB drivers that your Excel worksheet does contain header information. In return, the SELECT statement will use these header items as the field names it returns.

For example, let's delete the top three lines of our Excel worksheet, so that it now looks like this:

A sample Excel file, with Headers

A sample Excel file, with headers SalesWithHeader.xls

Now, compare the output, if we select the top 4 items from this worksheet, first without headers, and then with headers.

SELECT TOP(4) * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Samples\SalesWithHeader.xls;HDR=NO;IMEX=1',
    'SELECT * FROM [Sales$]')

SELECT TOP(4) * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Samples\SalesWithHeader.xls;HDR=YES;IMEX=1',
    'SELECT * FROM [Sales$]')

Notice the difference in the fieldnames which the SELECT produces:

SELECT, with an without header information

This can be very useful.
For example, now that we have a field called "2011", we could get the total of the 2011 sales figures using the following command:

SELECT SUM([2011]) FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Samples\SalesWithHeader.xls;HDR=YES;IMEX=1',
    'SELECT * FROM [Sales$] WHERE [F1] IS NOT NULL')

Total 2011 sales

Did you notice that I had to slip in WHERE [F1] IS NOT NULL into the SELECT statement ? Without this, the Total would've included the Excel's own total value in cell B11. The SUM() value would've been double the expected amount.

By excluding any rows where F1 (column A in the worksheet) is blank, we will just SUM the sales figures on rows where column A isn't blank.


Windows 7 64-bit and SQL Server 2008

I'm starting to regret starting this topic !

For some reason, once you start using 64-bit operating systems, connecting to Excel files using OLEDB seems erratic, to say the least.

The "cleanest" way to test whether your SQL Server environment can successfully connect to Excel 2007 files is to create a Linked Server. You can do this using the following command:

EXEC master.dbo.sp_addlinkedserver
    @server ='ExcelLinkedServer',               -- Create a Linked-Server called "ExcelLinkedServer"...
    @srvproduct= 'Excel',
    @provider= 'Microsoft.ACE.OLEDB.12.0',
    @datasrc= 'D:\ExcelFiles\Sample.xls',       -- ...pointing to this Excel file.
    @provstr= 'Excel 12.0'

(This command will also work with Excel 2007 .xslx files.)

Alternatively, you can right-click on the Linked Servers branch (under Server Objects), select "New linked Server.." and provide the details manually:

Creating a linked server

Either way, if this works, then you'll now be able to select your new Linked Server, and expand the Catalogs, default, then Tables branches, to see a list of the worksheets in your Excel file:

Linked server test

If something did go wrong when creating your Linked Server connection, then you're likely to see an error which might help. And it might not. But it'll certainly show you that running a command to directly access the Excel file is unlikely to work for you (until you fix this problem).

-- A SELECT like this is unlikely to work, if you were unable to create a Linked Server to the .xlsx file
SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Sample Excel\Sales.xlsx;HDR=NO;IMEX=1',
    'SELECT * FROM [SalesFigures$]')

Google is full of ideas of how to fix the numerous problems you'll face, and how to overcome them.

My best suggestion is that you make sure to give the Everyone user full read-write access to whichever directory on your SQL Server machine contains your Excel file.

Also, I have had numerous problems when trying to open Excel 2007 .xlsx files, which don't occur for Excel 2003 .xls files. So if you can't open an .xlsx file, try loading the .xlsx file with Excel, re-saving it as an .xls file, and see if you can open that instead.

It won't solve your problem, but at least it'll reassure you that the OLEDB libraries are alive and kicking, and the problem lies elsewhere.

SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Sample Excel\Sales.xls;HDR=NO;IMEX=1',
    'SELECT * FROM [SalesFigures$]')

Some articles recommand using a Provider of "Excel 12.0 Xml" instead of "Excel 12.0" for opening Excel 2007 files, but I have seen little difference.

I could be wrong..



Troubleshooting

If you aren't able to run the SELECT statement above, to read data from an Excel file, here's some of the typical error messages.

Click on the links, to see how to solve these issues:


 

Msg 15281, Level 16, State 1, Line 2
SQL Server blocked access to STATEMENT 'OpenRowset/OpenDatasource' of component 'Ad Hoc Distributed Queries' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ad Hoc Distributed Queries' by using sp_configure. For more information about enabling 'Ad Hoc Distributed Queries', see "Surface Area Configuration" in SQL Server Books Online.

If you get this Ad-Hoc error, you need to run the following command to turn on Ad-Hoc queries:

sp_configure 'show advanced options', 1
reconfigure
go
sp_configure 'Ad Hoc Distributed Queries', 1;
reconfigure
go

If this works (i.e. if you have permission to change these settings), you'll see something like this:

Configuration option 'show advanced options' changed from 0 to 1. Run the RECONFIGURE statement to install.
Configuration option 'Ad Hoc Distributed Queries' changed from 0 to 1. Run the RECONFIGURE statement to install.

If you see these messages, try re-running the SELECT statement again, and see if SQL Server is now able to read in the data from the Excel file.

If you get the following error, particularly on a 64-bit version of Windows, try running the script below:

Cannot fetch a row from OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)".

EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0' , N'AllowInProcess' , 1
GO
EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0' , N'DynamicParameters' , 1
GO



Plan B

If you find you can't use the OLEDB drivers, due to errors like this:

The OLE DB provider "Microsoft.ACE.OLEDB.12.0" has not been registered.

...then an alternative solution is to re-save your Excel file as .xls, then use the old crappy Jet drivers.

Try to read the contents of one of your Excel worksheets using this command:

SELECT * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
    'Excel 8.0;Database=C:\Sample Excel\Sales.xlsx;HDR=NO;IMEX=1',
    'SELECT * FROM [SalesFigures$]')

If you get this error:

The OLE DB provider "Microsoft.Jet.OLEDB.4.0" has not been registered.

...then you need to register a few .dlls on your server:

cd c:\windows\sysWOW64
regsvr32 Msjetoledb40.dll
regsvr32 Msjet40.dll
regsvr32 Mswstr10.dll
regsvr32 Msjter40.dll
regsvr32 Msjint40.dll

But... if your SQL Server machine doesn't have Office installed, then your safest bet is to just install the ACE drivers using the following link, and stick to using the "Microsoft.ACE.OLEDB.12.0" drivers.

http://www.microsoft.com/en-us/download/details.aspx?id=13255




Useful links

Excel Import to SQL Server using Distributed Queries
http://www.excel-sql-server.com/excel-import-to-sql-server-using-distributed-queries.htm

Comments

blog comments powered by Disqus