In the previous post, we learnt about the Get Metadata activity and the various metadata types that can be extracted using the activity. One of the most common use cases of the Get Metadata activity is to extract a list of files from a storage folder. This is a very common scenario while loading data into a Data Warehouse.
In this example, we will be using SQL to parse the output from the Get Metadata activity, extract a list of Filenames (and the corresponding Filetypes) from a folder and load these into an SQL table named JSON_Output.
The output from the Get Metadata activity is generated in JSON format, in-order- to run the SQL statements to parse the JSON output, we will use a Stored Procedure activity.
Let’s look at the steps:
Step 1: Create a table in the database using the following code:
CREATE TABLE dbo.JSON_Output ( Filelist [NVARCHAR] (max), FileName [NVARCHAR] (max), FileType [NVARCHAR] (100) )
Step 2: Create a Stored Procedure as shown below. The Stored Proc takes an input variable @JSON, which we are going to pass from the data factory pipeline in the next steps.
CREATE PROC dbo.Process_JsonOutput @JSON [NVARCHAR] (max) AS BEGIN INSERT INTO dbo.JSON_Output ( Filelist, FileName, FileType ) SELECT Filelist ,filename ,filetype FROM OPENJSON(@JSON) WITH ( Filelist NVARCHAR(MAX) '$' AS JSON ) J1 OUTER APPLY OPENJSON (J1.Filelist) WITH( filename NVARCHAR(MAX) '$.name' ,filetype VARCHAR(500) '$.type' )J2 END
Step 3: Link the output of the Get Metadata Activity to the stored Procedure activity. Configure the settings for the Linked Service and select the stored proc that we created in Step 2.
Step 4: The last and the most important step is to pass the JSON output from the Get Metadata activity as an input parameter to the stored proc that we created in Step 2 above. To do this, we will create a new parameter of data type String named “Json” under the Stored Procedure activity settings.
Use the following code as the parameter value.
Note: I have used Get Metadata1 as an example, replace ‘Get Metadata1’ with the actual name of the Get Metadata activity in your pipeline.