
Azure Data Factory Parameterized Linked Services

In Azure Data Factory (ADF), managing linked services can be a bit tricky. It is often a balancing act to figure out who should manage them, how they should be managed across environments, and how to audit and monitor them from a security perspective. Things get especially interesting in data factories that contain multiple projects or use cases. The question becomes where it is best to manage and audit the linked service configuration.

If you are using Azure Resource Manager (ARM) templates to manage your ADF, you might initially think that building individual linked services for all the use cases and then making use of template parameters is the right way to go. This could result in a potentially large number of linked services to manage and enforce configuration on.
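
To make that concrete, ADF's ARM template export turns each linked service's environment-specific values into template parameters that you override per environment. The excerpt below is only a rough illustration (the factory name, linked service names, and generated parameter names are hypothetical and will differ in your factory), but it shows how quickly the parameter list grows when every use case gets its own linked service:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "factoryName": {
            "value": "adf-prod"
        },
        "ProjectABlobStorage_properties_typeProperties_connectionString_secretName": {
            "value": "projecta-storageconnstring"
        },
        "ProjectBBlobStorage_properties_typeProperties_connectionString_secretName": {
            "value": "projectb-storageconnstring"
        }
    }
}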

An alternate approach is to use parameterized linked services. In this setup, when you configure a dataset, you are prompted for additional information relating to the target linked service. This allows you to reduce the number of linked services by pushing configuration up to the dataset layer. Depending on your environment, this may or may not be easier to audit/maintain/secure than a myriad of individually configured linked service connections.

The goal of this post is to walk through how to set up parameterized linked services (you can read the official documentation here). In this example, we do this for an Azure storage account.

When you create a linked service for an Azure storage account, the following JSON is generated automatically.

{
    "name": "AzureBlobStorage1",
    "properties": {
        "annotations": [],
        "type": "AzureBlobStorage",
        "typeProperties": {
            "connectionString": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "AzureKeyVault1",
                    "type": "LinkedServiceReference"
                },
                "secretName": "storageconnstring"
            }
        }
    }
}

As you can see above, I’ve opted to store my connection string in Key Vault. With this standard configuration, creating a dataset gives you the usual view: you simply pick the linked service, with no additional prompts.
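
If it helps to see that in JSON rather than the UI, a dataset created against this linked service looks something like the sketch below (the dataset name, type, and container are hypothetical; the part that matters is the linkedServiceName reference):

{
    "name": "InputFiles",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorage1",
            "type": "LinkedServiceReference"
        },
        "annotations": [],
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "input"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        }
    }
}

Notice the dataset supplies nothing about the connection string; that is hardcoded in the linked service.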

Let’s now add a parameter so that the secret name used to store the connection string can be supplied from the dataset, rather than hardcoded in the linked service.

{
    "name": "AzureBlobStorage2",
    "properties": {
        "annotations": [],
        "type": "AzureBlobStorage",
        "typeProperties": {
            "connectionString": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "AzureKeyVault1",
                    "type": "LinkedServiceReference"
                },
                "secretName": "@{linkedService().keyVaultSecretName}"
            }
        },
        "parameters": {
            "keyVaultSecretName": {
                "type": "String"
            }
        }
    }
}

Here is what the dataset creation looks like now: when you create a dataset against this linked service, you are prompted to supply a value for keyVaultSecretName.
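
In JSON terms, the dataset now passes a value for keyVaultSecretName when it references the linked service. Here is a rough sketch using the same hypothetical dataset as before, with an example secret name:

{
    "name": "InputFiles",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorage2",
            "type": "LinkedServiceReference",
            "parameters": {
                "keyVaultSecretName": "storageconnstring"
            }
        },
        "annotations": [],
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "input"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        }
    }
}

Each dataset can now point the same linked service at a different secret, which is exactly the configuration that has been pushed up to the dataset layer.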

Conclusion

While this approach might not make a ton of sense for changing secret names dynamically on a storage account, it can come in handy for other linked services, such as Azure Databricks, so that different datasets and pipelines can specify different cluster IDs. This is, of course, dependent on how you configure your overall ETL process, because now that the linked service is parameterized, it does not show up in the ARM template parameters.
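
As a rough sketch of that Databricks idea (the domain, secret name, and parameter name below are illustrative, not taken from a real factory), the linked service could expose the cluster ID as a parameter in the same way:

{
    "name": "AzureDatabricks1",
    "properties": {
        "annotations": [],
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
            "existingClusterId": "@{linkedService().clusterId}",
            "accessToken": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "AzureKeyVault1",
                    "type": "LinkedServiceReference"
                },
                "secretName": "databricks-token"
            }
        },
        "parameters": {
            "clusterId": {
                "type": "String"
            }
        }
    }
}

Datasets and pipelines that reference this linked service would then each supply their own clusterId value.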

 

About Shamir Charania

Shamir Charania, a seasoned cloud expert, possesses in-depth expertise in Amazon Web Services (AWS) and Microsoft Azure, complemented by his six-year tenure as a Microsoft MVP in Azure. At Keep Secure, Shamir provides strategic cloud guidance, pairing senior architecture-level decision-making with the technical chops to back it all up. With a strong emphasis on cybersecurity, he develops robust global cloud strategies prioritizing data protection and resilience. Leveraging complexity theory, Shamir delivers innovative and elegant solutions to address complex requirements while driving business growth, positioning himself as a driving force in cloud transformation for organizations in the digital age.
