Showing posts with label ssis for each ado.net schema rowset enumerator. Show all posts
Showing posts with label ssis for each ado.net schema rowset enumerator. Show all posts

Friday, May 3, 2013

Loop through excel column names using SSIS

How do I get the field names from a specific excel sheet in an excel file? There was a question asked on SSIS MSDN forum which forced me to blog on this.

Suppose the excel file has more than one sheets and the sheet we are interested in is ImpData which has two columns (Data1 and Data2). The aim is to get these field names and check if these fields are present in a specific sql server table. If fields Data1 and Data2 are not present in the table then add them to the table. Once the columns are added to the table load the excel data to the sql server table.

Step1:Create a table ExcelFields (Field1 Varchar(20))

Step2:Create 4 package scoped string variables SheetName, FieldName, ExcelSheetName and AlterSQL. Set the value for SheetName as ImpData$ as this is the sheet that contains the relevant data to be fetched and loaded to sql server table.

Step3:Create an ado.net connection for the excel file. Use a foreach loop to iterate through excel sheets and save each of the sheet name in ExcelSheetName variable. Check my blog on how to create ado.net connection manager and save the excel sheet name in a variable.

Step4:Have a foreach loop within the foreach loop used to capture the sheet name (Step3) and click on Collection in the left pane in foreach loop editor. Select the enumerator as "Foreach ADO.NET Schema Rowset Enumerator". Under enumerator configuration select the ado.net connection created in Step2. Select "Columns" as Schema from the drop down box.
Click on "SetRestrictions" button and select Table_Catalog, Table_Schema and Table_Name. Highlight Table_Name and Select SheetName variable as shown:


 
Step5:Click on Variable Mappings in left pane and select the variable FieldName (created in Step1) and set the index as 3.

Following is the control flow so far:



















Step6: Put a dummy script task inside the outer foreach loop and hook it to the inner foreach loop container. Add a precedence constraint (Evaluation operation as expression and constraint and value as success)using following expression: @[User::SheetName]==@[User::ExcelSheetName].

Step7: Set the evaulate as expression property for variable AlterSQL as true and set the following expression: "Alter Table ExcelFields Add " +  @[User::FieldName] + "  Varchar (50)"

Step8: Add an execute sql task inside the inner foreach loop and set the connection manager to point to the table that needs to be altered. Use AlterSQL as the sql source in execute sql task.

Step9: An execute sql task can be added out of the both the foreach loops and be used to load the excel data to table using the distributed queries mentioned here.

Note: The expression for AlterSQL can be modified to check and ignore if the fields are already present in the table