Optimizing performance is one of the most important design decisions while implementing a Data Warehouse using dedicated SQL Pool in Azure Synapse Analytics.
We have discussed how to update Data Warehousing Units in Azure Synapse Analytics in previous post.
Data Warehousing Units or DWUs (cDWUs) determine the query performance level of the Synapse Analytics dedicated SQL Pool. One of the simplest ways to improve query performance would be to increase the DWU allocation. This is known as scaling out.
Microsoft recommends using a dataset of at least 1 TB size to analyse the true impact of a DWU scale out.
The optimal amount of DWUs will depend on the following factors:
1. Amount of data processed during the read/write operation
2. The underlying partitioning and indexing of the tables
3. Network data transfer speeds (if the data is being loaded from/to external sources/sinks)
The optimal performance level should be guided by the use case requirement of the application. However, the best way to find the optimal value for DWUs in Azure Synapse Analytics is by performing a scale-out test.
Some basic guidelines to perform scale out testing for Azure Synapse Analytics:
- Start with a small number say, 200 DWUs and keep increasing the DWUs step by step , run a few queries and monitor the completion time or the application performance
- For the next step, say 300 DWUs, assume linear scale e.g. if the query took 20 secs with 200 DWUs, the same query should take around 15 secs with 300 DWUs (50% better performance).
- At some point you will find that the query performance does not increase linearly with DWUs, this DWU number is an ideal candidate for optimum performance and cost.