We discussed Azure Data Lake Storage (ADLS) in a previous post. The main application of a data lake is to store various types of data at a low cost. With the evolution of the Lakehouse architecture, however, it is becoming increasingly simple to analyse massive amounts of data and extract information cost-effectively. To enable this, Microsoft has included a built-in service on the Azure cloud called Azure Data Lake Analytics.
Azure Data Lake Analytics is an analytics job service optimized for distributed data processing. It is designed to handle jobs of varied scale (from a few gigabytes to petabytes) and to adapt its compute power to each job. This is achieved by dynamically provisioning resources.
Let’s have a look at some of the main features of Azure Data Lake Analytics:
- U-SQL: Probably the biggest advantage of using Data Lake Analytics is its query language, U-SQL, whose SQL-style syntax will feel familiar to T-SQL users. U-SQL integrates features from T-SQL as well as C#, complementing and extending the programmability and data analysis capabilities of both.
- Visual Studio Integration: Integration with Visual Studio gives users a familiar IDE in which to write, debug and test analytics jobs.
- Compatibility with Azure Data Services: Data Lake Analytics can interact with ADLS Gen1, Azure SQL, Blob Storage and Synapse Analytics.
- Supports Diverse Workloads: Due to its scalability and flexibility, Data Lake Analytics can be used for various types of workloads, such as ETL, machine learning and sentiment analysis.
- Pay-per-job: Users are charged only for the time and resources that a job consumes (on a per-job-run basis), which means there are no hardware or service-specific charges.
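The pay-per-job point is easier to picture with a little arithmetic. The sketch below assumes that a job is billed as compute units multiplied by running time, where the compute unit is what Azure calls an Analytics Unit (AU). The function name `estimate_job_cost` and the rate of 2.0 per AU-hour are hypothetical and used only for illustration; check the current Azure pricing page for real figures.

```python
def estimate_job_cost(analytics_units: int,
                      runtime_seconds: float,
                      rate_per_au_hour: float) -> float:
    """Rough cost of one job run: AUs x hours x rate.

    All three inputs are illustrative assumptions, not official pricing.
    """
    if analytics_units < 1:
        raise ValueError("a job needs at least one analytics unit")
    if runtime_seconds < 0 or rate_per_au_hour < 0:
        raise ValueError("runtime and rate must be non-negative")

    # Convert the run time to hours, then scale by the number of units.
    au_hours = analytics_units * runtime_seconds / 3600
    return round(au_hours * rate_per_au_hour, 2)


# A job that holds 10 AUs for 30 minutes at a hypothetical rate of 2.0
# per AU-hour consumes 5 AU-hours.
print(estimate_job_cost(10, 30 * 60, 2.0))  # prints 10.0
```

Under this model nothing is charged while no job is running, and a larger job simply costs more AU-hours for the time it runs.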
Limitations: One of the major limitations of Azure Data Lake Analytics is that it does not support ADLS Gen2; currently it works only with ADLS Gen1.