In the previous post, we looked at a specific scenario for branching the control flow based on a business rule or condition. In this post, we are going to look at a different scenario.
Sometimes, during a scheduled data load, the source data store may not be ready, i.e., the files in the source folder have not yet been generated when the pipeline gets triggered. By default, this would cause the pipeline to fail. However, Data Factory provides an easy way to hold the pipeline at that point and keep re-checking the source until the data shows up or a time limit is reached. This can be accomplished using the Validation Activity.
As we can see in the screenshot above, under the Settings tab, there are three configuration settings. The first two are:
- Timeout: The maximum time for which the Validation Activity stays active and keeps checking. The default value is 7 days.
- Sleep: The delay (in seconds) between validation attempts.
So, for example, if the pipeline trigger time is 2 PM and we would like the pipeline to keep checking for a file every 2 minutes until 5 PM, we can set these values as:
Timeout: 0.03:00:00 (format d.hh:mm:ss)
Sleep: 120 (seconds)
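To make the 2 PM to 5 PM example concrete, here is a minimal sketch of how those two values might end up in the activity's JSON definition, where the timeout is written in the d.hh:mm:ss form. The activity name `WaitForSourceFile`, the dataset name `SourceFileDataset`, and both helper functions are hypothetical and used only for illustration; the script just builds and prints a dictionary and does not call any Data Factory API.

```python
import json
from datetime import timedelta


def format_timeout(delta: timedelta) -> str:
    """Render a timedelta as d.hh:mm:ss (hypothetical helper)."""
    total = int(delta.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}.{hours:02d}:{minutes:02d}:{seconds:02d}"


def validation_activity(name: str, dataset: str,
                        timeout: timedelta, sleep_seconds: int) -> dict:
    """Build a Validation activity definition as a plain dict.

    The activity and dataset names passed in are assumptions for this sketch.
    """
    return {
        "name": name,
        "type": "Validation",
        "typeProperties": {
            "dataset": {"referenceName": dataset, "type": "DatasetReference"},
            "timeout": format_timeout(timeout),  # total time to keep checking
            "sleep": sleep_seconds,              # delay between checks
        },
    }


# Triggered at 2 PM, keep checking every 2 minutes until 5 PM (3 hours).
activity = validation_activity(
    "WaitForSourceFile", "SourceFileDataset",
    timeout=timedelta(hours=3), sleep_seconds=120,
)
print(json.dumps(activity, indent=2))
```

With these values the activity would check roughly 90 times at most (10,800 seconds divided by 120) before timing out.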
This brings us to the third setting:
- Child Items: This setting is available only when the selected dataset points to a folder. It lets us control the pipeline execution based on a check of the folder's contents. There are three options here:
  - Ignore: Checks only that the folder exists, irrespective of whether there are any objects in it.
  - True: Checks that the folder exists and that it contains at least one object.
  - False: Checks that the folder exists and that it is empty.
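The three Child Items options, combined with Timeout and Sleep, can be sketched against a local folder. This is only an analogue to illustrate the semantics described above, not how Data Factory implements the check; `folder_is_valid` and `wait_until_valid` are hypothetical names, and `None` stands in for the Ignore option.

```python
import tempfile
import time
from pathlib import Path
from typing import Optional


def folder_is_valid(folder: Path, child_items: Optional[bool]) -> bool:
    """Local analogue of the Child Items options.

    None  -> Ignore: the folder only has to exist
    True  -> the folder must exist and contain at least one item
    False -> the folder must exist and be empty
    """
    if not folder.is_dir():
        return False
    if child_items is None:
        return True
    has_items = any(folder.iterdir())
    return has_items == child_items


def wait_until_valid(folder: Path, child_items: Optional[bool],
                     timeout_seconds: float, sleep_seconds: float) -> bool:
    """Re-check the folder every sleep_seconds until it is valid or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if folder_is_valid(folder, child_items):
            return True
        if time.monotonic() >= deadline:
            return False  # timed out; in a pipeline this would be a failure
        time.sleep(sleep_seconds)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Empty folder: passes Ignore and False, fails True.
    print(folder_is_valid(folder, None),
          folder_is_valid(folder, True),
          folder_is_valid(folder, False))

    (folder / "data.csv").write_text("id,value\n")
    # Folder now has one file: the True check succeeds on the first attempt.
    print(wait_until_valid(folder, True, timeout_seconds=1, sleep_seconds=0.1))
```

In the scheduled-load scenario, True is usually the option of interest, since it keeps the pipeline waiting until the source folder actually contains something to load.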