Configure Azure Data Factory Authentication to ADLS Gen2

A convenient way to move data to the cloud for analytics use cases is to create a data lake using Azure Data Lake Storage (ADLS). Azure Data Factory (or Synapse Analytics) pipelines can then be used to transform this data for further analysis. To reach the data, Data Factory connects to ADLS through a linked service. We discussed linked services and Data Factory pipelines in a previous post.

When we create a linked service, we are creating an authenticated connection between Azure Data Factory and ADLS. In today’s post, let’s look at the options for configuring Data Factory access using the ADLS Gen2 connector.

Data Factory Linked Service: ADLS Gen2 Connector

There are four authentication types available for the ADLS Gen2 connector:

1. Account Key: This method requires two primary properties to be configured.

a. The URL of the storage account's ADLS Gen2 endpoint, which has the form https://<accountname>.dfs.core.windows.net. It can be found in the properties of the ADLS storage account in the Azure Portal.

b. A storage account access key. This is available under Settings > Access keys on the storage account page in the Azure Portal. Microsoft recommends storing access keys in Azure Key Vault for secure access.
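As a rough illustration of these two properties, the sketch below builds a linked service definition as a Python dictionary and prints it as JSON. The property names (`AzureBlobFS`, `url`, `accountKey`, `AzureKeyVaultSecret`) follow the connector's JSON format as we understand it from the reference at the end of this post, so check them against the current documentation. The storage account, Key Vault linked service, and secret names are placeholders invented for the example.

```python
import json


def account_key_linked_service(name, account_name, key_vault_linked_service, secret_name):
    """Return an ADLS Gen2 linked service definition that reads the
    account key from a Key Vault secret instead of embedding it."""
    return {
        "name": name,
        "properties": {
            "type": "AzureBlobFS",  # connector type used for ADLS Gen2
            "typeProperties": {
                # Property (a): the storage account's Data Lake endpoint.
                "url": f"https://{account_name}.dfs.core.windows.net",
                # Property (b): the access key, referenced from Key Vault.
                "accountKey": {
                    "type": "AzureKeyVaultSecret",
                    "store": {
                        "referenceName": key_vault_linked_service,
                        "type": "LinkedServiceReference",
                    },
                    "secretName": secret_name,
                },
            },
        },
    }


# All names below are hypothetical placeholders.
definition = account_key_linked_service(
    name="AdlsGen2AccountKey",
    account_name="mydatalake",
    key_vault_linked_service="MyKeyVault",
    secret_name="adls-access-key",
)
print(json.dumps(definition, indent=2))
```

Referencing the key through Key Vault keeps the secret out of the linked service definition itself, which is in line with the recommendation above.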

2. Service Principal Authentication: In this method, an application that represents Azure Data Factory needs to be registered in Azure Active Directory (Azure AD). Instructions about the process can be found here. Once the application is registered, the following information is required to configure service principal authentication:

Application Id

Application Key

Tenant Id

The service principal must also be granted the appropriate permissions to access ADLS.
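The three values listed above map onto the linked service definition roughly as in the sketch below. The property names (`servicePrincipalId`, `servicePrincipalCredentialType`, `servicePrincipalCredential`, `tenant`) are taken from the connector documentation as we recall it and should be verified there. The function name and every ID and key value are placeholders made up for the example.

```python
import json


def service_principal_linked_service(name, account_name, application_id,
                                     application_key, tenant_id):
    """Return an ADLS Gen2 linked service definition that authenticates
    with a service principal (application ID, key, and tenant ID)."""
    return {
        "name": name,
        "properties": {
            "type": "AzureBlobFS",
            "typeProperties": {
                "url": f"https://{account_name}.dfs.core.windows.net",
                "servicePrincipalId": application_id,  # Application Id
                "servicePrincipalCredentialType": "ServicePrincipalKey",
                "servicePrincipalCredential": {        # Application Key
                    "type": "SecureString",
                    "value": application_key,
                },
                "tenant": tenant_id,                   # Tenant Id
            },
        },
    }


# Placeholder values only; never commit a real application key.
definition = service_principal_linked_service(
    name="AdlsGen2ServicePrincipal",
    account_name="mydatalake",
    application_id="00000000-0000-0000-0000-000000000000",
    application_key="<application-key>",
    tenant_id="11111111-1111-1111-1111-111111111111",
)
print(json.dumps(definition, indent=2))
```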

3. System-Assigned Managed Identity: A system-assigned managed identity is a unique identity for the Data Factory instance, created by default when the Data Factory is provisioned. The Data Factory managed identity needs to be granted the appropriate role on the ADLS account or path, depending on whether the linked service is used as a source or a sink.

Linked Service Type | Role
Source              | Storage Blob Data Reader
Sink                | Storage Blob Data Contributor

Table: ADLS Roles by Linked Service Type
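One way to apply these roles is with the Azure CLI command `az role assignment create`. The sketch below only looks up the role for a source or sink and prints the command; it does not run it. The helper names, the subscription ID, the resource group, and the identity's object ID are hypothetical placeholders, and the scope shown targets the whole storage account, which may be broader than you need.

```python
import shlex

# Roles from the table above, keyed by how the linked service is used.
ROLE_BY_USAGE = {
    "source": "Storage Blob Data Reader",
    "sink": "Storage Blob Data Contributor",
}


def storage_account_scope(subscription_id, resource_group, account_name):
    """Build the Azure resource ID of a storage account."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_command(usage, principal_object_id, scope):
    """Return the Azure CLI arguments that grant the role for this usage."""
    role = ROLE_BY_USAGE[usage.lower()]
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_object_id,
        "--role", role,
        "--scope", scope,
    ]


# Placeholder identifiers for illustration only.
scope = storage_account_scope(
    "00000000-0000-0000-0000-000000000000", "my-resource-group", "mydatalake"
)
command = role_assignment_command(
    "sink", "22222222-2222-2222-2222-222222222222", scope
)
print(shlex.join(command))
```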

4. User-Assigned Managed Identity: This method is similar to the system-assigned managed identity method, with some differences:

a. A Data Factory can be assigned multiple user-assigned managed identities, instead of just one system-assigned managed identity.

b. A credential must be created in Data Factory for each user-assigned managed identity before it can be used for authentication when creating linked services.
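To make the difference between the two managed identity options concrete, the sketch below builds both kinds of linked service definition. It assumes, based on our reading of the connector documentation, that a system-assigned identity needs only the `url` property, while a user-assigned identity adds a `credential` entry of type `CredentialReference` that points at the credential created in Data Factory. The function name and the account and credential names are placeholders.

```python
import json


def managed_identity_linked_service(name, account_name, credential_name=None):
    """Return an ADLS Gen2 linked service definition for managed identity.

    With no credential_name the definition relies on the system-assigned
    identity; otherwise it references a Data Factory credential that
    wraps a user-assigned identity.
    """
    type_properties = {"url": f"https://{account_name}.dfs.core.windows.net"}
    if credential_name is not None:
        type_properties["credential"] = {
            "referenceName": credential_name,
            "type": "CredentialReference",
        }
    return {
        "name": name,
        "properties": {"type": "AzureBlobFS", "typeProperties": type_properties},
    }


# Placeholder names for illustration only.
system_assigned = managed_identity_linked_service("AdlsGen2SystemMI", "mydatalake")
user_assigned = managed_identity_linked_service(
    "AdlsGen2UserMI", "mydatalake", credential_name="MyUserAssignedCredential"
)
print(json.dumps(system_assigned, indent=2))
print(json.dumps(user_assigned, indent=2))
```

Because each user-assigned identity is referenced by credential name, different linked services in the same Data Factory can use different identities.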

Reference: Copy and transform data in Azure Data Lake Storage Gen2 – Azure Data Factory & Azure Synapse | Microsoft Docs
