Data Factory Pipelines are often used to run business-critical data loads. Most of them are scheduled using triggers. We discussed options to automate the execution of Data Factory Pipelines using triggers in a previous post.
Data Factory metrics can be captured using Azure Monitor. At runtime, however, an activity sometimes takes longer than usual, and this can delay the overall completion of the Pipeline.
Recently, Microsoft added a new feature to Data Factory called the Elapsed Time Pipeline Run metric. This setting lets Data Factory users specify an expected completion duration for a Pipeline. If a Pipeline run is still in progress after this duration has passed, Data Factory emits a metric.
This metric can also be used to trigger alert emails to action groups (data engineers/developers), who will then be able to make informed decisions about subsequent Pipelines/Activities and minimize the impact of the delay. Creating alerts for Data Factory metrics will be discussed in a future post.
So, to summarize, this metric enables proactive alerts that inform the stakeholders that the Pipeline run duration has exceeded the expected duration even before the Pipeline run finishes.
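The check behind this behavior can be illustrated with a minimal Python sketch. It is not how Data Factory implements the metric; the function name and the sample times are hypothetical and only show the comparison between the elapsed run time and the expected duration while a run is still in progress.

```python
from datetime import datetime, timedelta, timezone


def exceeds_expected_duration(run_start, expected, now):
    """Return True once a still-running pipeline has run longer than expected."""
    return (now - run_start) > expected


# Hypothetical run: started at 02:00 UTC with an expected duration of 30 minutes.
run_start = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
expected = timedelta(minutes=30)

# 20 minutes in: still within the expected duration, so nothing would be emitted.
on_time = exceeds_expected_duration(run_start, expected, run_start + timedelta(minutes=20))
# 45 minutes in: the run has crossed the expected duration while still running.
overdue = exceeds_expected_duration(run_start, expected, run_start + timedelta(minutes=45))
print(on_time, overdue)
```

The point of the comparison is that it can be evaluated while the run is in progress, which is what makes an alert possible before the Pipeline finishes.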
To set a value for the metric, just click on the blank canvas area of the Pipeline and go to the Settings tab as shown in the screenshot below:
The metric generated by this setting can be found in Azure Monitor with the ID PipelineElapsedTimeRuns.
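For readers who manage Pipelines as JSON rather than through the Settings tab, the following Python sketch builds the fragment that the setting is assumed to produce. The property names (policy, elapsedTimeMetric, duration) and the d.hh:mm:ss duration format are assumptions not confirmed by this post, so check them against the code view of your own Pipeline. The helper names are hypothetical.

```python
import json
from datetime import timedelta


def to_timespan(duration):
    """Format a timedelta as a d.hh:mm:ss string (assumed duration format)."""
    total_seconds = int(duration.total_seconds())
    days, remainder = divmod(total_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{days}.{hours:02d}:{minutes:02d}:{seconds:02d}"


def pipeline_policy_fragment(expected):
    """Build the pipeline properties fragment for the expected duration."""
    # Assumed property names; verify against the JSON view of a real Pipeline.
    return {"policy": {"elapsedTimeMetric": {"duration": to_timespan(expected)}}}


# Hypothetical expected duration of 30 minutes.
fragment = pipeline_policy_fragment(timedelta(minutes=30))
print(json.dumps(fragment, indent=2))
```

Generating the fragment from a timedelta keeps the expected duration readable in code and avoids hand-typing the duration string.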