In this post, we will discuss the options available in Azure Stream Analytics for outputting results to an Azure SQL Database and how to tune the configuration settings to achieve optimal throughput.
Depending on the SQL database's schema design, indexing, and compression, the easiest way to enhance performance of an Azure Stream Analytics job is to create multiple output partitions, with each output partition loading data into its own Azure SQL table partition. The write operations then happen in parallel, and a fully parallel job topology can be achieved.
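As a sketch of such a fully parallel topology, the Stream Analytics query below partitions its processing by the input's partition key. The input and output names (`[eventhub-input]`, `[sql-output]`) and the column names are illustrative assumptions, not taken from the original post; the `PARTITION BY` clause applies to jobs on compatibility level 1.1 or lower, while level 1.2 partitions compatible queries implicitly.

```sql
-- Hedged sketch: keep each input partition's data on its own
-- processing path so bulk inserts to SQL run in parallel.
SELECT
    PartitionId,
    DeviceId,
    AVG(Temperature) AS AvgTemperature
INTO [sql-output]
FROM [eventhub-input] TIMESTAMP BY EventTime
PARTITION BY PartitionId
GROUP BY PartitionId, DeviceId, TumblingWindow(minute, 1)
```

For end-to-end parallelism, the number of SQL table partitions would typically match the number of input partitions.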
Next, let's look at some configuration options for optimizing performance:
Inherit Partitioning: With this configuration setting, the output can inherit the partitioning scheme of the previous query step or of the Stream Analytics input.
Batch Size: This is the maximum number of records sent with every bulk insert transaction. This configuration setting is available on the Azure Stream Analytics output. The appropriate value depends on the nature of the destination table: for a table with a clustered columnstore index, the optimal batch size is around 100,000 records, whereas for disk-based (rowstore) tables, a value of 10,000 or lower is recommended.
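To illustrate the columnstore case, the T-SQL below defines a destination table suited to large batch sizes. The table and column names are hypothetical, chosen only for this example.

```sql
-- Hedged sketch: a destination table for high-throughput streaming inserts.
CREATE TABLE dbo.SensorReadings
(
    DeviceId     INT        NOT NULL,
    EventTime    DATETIME2  NOT NULL,
    Temperature  FLOAT      NOT NULL
);

-- A clustered columnstore index favors large bulk inserts (~100,000 rows),
-- letting each batch land directly as compressed row groups.
CREATE CLUSTERED COLUMNSTORE INDEX cci_SensorReadings
    ON dbo.SensorReadings;
```

With a rowstore table instead, smaller batches reduce lock and log pressure, which is why a batch size of 10,000 or lower is suggested there.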
Input Message Tuning: This configuration setting in Azure Stream Analytics increases the number of input events per message per partition, thereby increasing the write throughput.
We will look at ways to optimize Azure Stream Analytics output to other destinations in a future post.
Reference: https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-sql-output-perf