Friday, May 3, 2013

Loop through excel column names using SSIS

How do I get the field names from a specific excel sheet in an excel file? There was a question asked on SSIS MSDN forum which forced me to blog on this.

Suppose the excel file has more than one sheets and the sheet we are interested in is ImpData which has two columns (Data1 and Data2). The aim is to get these field names and check if these fields are present in a specific sql server table. If fields Data1 and Data2 are not present in the table then add them to the table. Once the columns are added to the table load the excel data to the sql server table.

Step1:Create a table ExcelFields (Field1 Varchar(20))

Step2:Create 4 package scoped string variables SheetName, FieldName, ExcelSheetName and AlterSQL. Set the value for SheetName as ImpData$ as this is the sheet that contains the relevant data to be fetched and loaded to sql server table.

Step3:Create an ado.net connection for the excel file. Use a foreach loop to iterate through excel sheets and save each of the sheet name in ExcelSheetName variable. Check my blog on how to create ado.net connection manager and save the excel sheet name in a variable.

Step4:Have a foreach loop within the foreach loop used to capture the sheet name (Step3) and click on Collection in the left pane in foreach loop editor. Select the enumerator as "Foreach ADO.NET Schema Rowset Enumerator". Under enumerator configuration select the ado.net connection created in Step2. Select "Columns" as Schema from the drop down box.
Click on "SetRestrictions" button and select Table_Catalog, Table_Schema and Table_Name. Highlight Table_Name and Select SheetName variable as shown:


 
Step5:Click on Variable Mappings in left pane and select the variable FieldName (created in Step1) and set the index as 3.

Following is the control flow so far:



















Step6: Put a dummy script task inside the outer foreach loop and hook it to the inner foreach loop container. Add a precedence constraint (Evaluation operation as expression and constraint and value as success)using following expression: @[User::SheetName]==@[User::ExcelSheetName].

Step7: Set the evaulate as expression property for variable AlterSQL as true and set the following expression: "Alter Table ExcelFields Add " +  @[User::FieldName] + "  Varchar (50)"

Step8: Add an execute sql task inside the inner foreach loop and set the connection manager to point to the table that needs to be altered. Use AlterSQL as the sql source in execute sql task.

Step9: An execute sql task can be added out of the both the foreach loops and be used to load the excel data to table using the distributed queries mentioned here.

Note: The expression for AlterSQL can be modified to check and ignore if the fields are already present in the table



6 comments:

  1. Error: The GetEnumerator method of the ForEach Enumerator has failed with error 0x80131509 "(null)". This occurs when the ForEach Enumerator cannot enumerate.

    I am getting this error. Do you know how to fix it

    ReplyDelete
    Replies
    1. Which step are you getting the error?

      Delete
  2. At the second for each loop container.

    ReplyDelete
    Replies
    1. Can you mail me the details of your implementation in addition to the version of excel?

      Delete
    2. I have seen your mail and you need three for each loop containers as you need to iterate through the excel files (first for each loop) and then iterate through each excel sheet (second for each loop) in each of the excel file and then to iterate through each column name in each excel sheet (third for each loop).Please go through the link mentioned in Step3

      Delete
  3. hi, actuallyi have an error parameter incorrect ?? could you help me please ??
    thank you in advance

    ReplyDelete