Now that I have the time to look up and see what’s going on in the world around me, I’m trying to spend more time on the SSIS Forums on MSDN. This is a great community-driven resource where lots of really bright and dedicated people (and sometimes me, too) help the growing body of SSIS developers solve their problems. And looking around the forums I’ve found several posts (including this one) asking about how to load data from multiple Excel files using a single SSIS package. This is an exceptionally easy task for SSIS, but I could not find any online resource that walked you through step by step.
So here you go: Step by step instructions on how to load data from multiple Excel files with a single SSIS package.
1. Create an SSIS project in Visual Studio
7. Click the Preview button to be certain that it works
8. Build the rest of data flow – in this case we’re simply counting the rows, but you might want to put the data into a database or something 😉
9. Make sure the Excel file is not open – this will cause errors every time if Excel has the file open when you try to load it
10. Execute the package
11. At this point we know that we have everything built correctly to load a single Excel file – now we need to modify the package to work with an arbitrary set of files
12. Copy and paste into notepad at the ConnectionString property of the connection manager – it should look something like this: Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Projects2005\SSIS_Excel_Loop\SSIS_Excel_Loop\email_book_01.xls;Extended Properties=”Excel 8.0;HDR=YES”;
13. Create a package variable called FileName and give it the value of the file name from the ConnectionString
14. Update the connection manager’s ConnectionString property to be based on an expression that uses the FileName variable to duplicate its tested value – “Provider=Microsoft.Jet.OLEDB.4.0;Data Source=” + @[User::FileName] + “;Extended Properties=\”Excel 8.0;HDR=YES\”;”
(Note how we’re using the backslash “\” character to escape the double-quotes that are part of the connection string without breaking the expression!)
15. Run the package again to make sure it still works – By testing after each change you will catch errors early, and not need to try to figure out where you went wrong if you get to the last step and things don’t work
16. Add a Foreach Loop Container to the control flow and drag the Data Flow task to be contained within it
17. Edit the Foreach Loop Container so that it loops through all xls files in the folder and retrieves the fully qualified file name and maps that value to the FileName variable
18. Run the package again – Voila!! We’ve now loaded multiple Excel files with a single SSIS package
19. If you’re feeling particularly skeptical (after all, how can you be certain that really looped through the files?) you can easily add a Data Viewer to the data flow and look at the data being loaded from each Excel file
And there you have it. It’s not exactly a complicated scenario, but it is very common, very “real world” and it shows off some of SSIS’s most powerful features, namely variables and property expressions. We could also easily update this example to use a configuration to set the path of the folder through which the Foreach Loop Container enumerates, but I’ll leave hat as an exercise for the reader.