Comparing SQL Server and Synapse Analytics IDENTITY Columns

IDENTITY is a SQL keyword to specify that a column in the table will be used as a Surrogate Key. Surrogate Key column contains numerical data to uniquely identify a row in the table. A surrogate key is used to assign unique row identifiers in various scenarios. One of the most common use cases for IDENTITY column is, when we are inserting new rows into a Dimension Table while loading data into a Data Warehouse.

Following is an example of IDENTITY column definition in Azure Synapse Analytics Hash distributed table:

CREATE TABLE dbo.T1
(    ADE1 INT IDENTITY(1,1) NOT NULL
,    ADE2 INT NULL
)

WITH
(   DISTRIBUTION = HASH(ADE2)
,   CLUSTERED COLUMNSTORE INDEX
)
;

Now that we have a basic understanding of the IDENTITY column, let’s have a look at the key differences between SQL Server and Synapse Analytics in terms of the features of the IDENTITY column:

IDENTITY column BehaviourSQL Server (including SQL Azure DB)Azure Synapse Analytics
Unique Row IdentifierYesYes
Column value order1SequentialRandom
Data Types Supported2Decimal, Int, Numeric, Smallint, Bigint, TinyintInt, BigInt
Number of allowed IDENTITY columns per table11
Support for @@IDENTITY system function3YesNo
IDENTITY column: Differences between SQL Server and Synapse Analytics

As we can see above, there are some major differences between how the IDENTITY column works in SQL Server(both on-prem and Azure SQL Database) and Azure Synapse Analytics.

Explanation:

 1. Column Value Order: The IDENTITY column in Synapse Analytics is populated with RANDOM values (not sequential). Phew!! Yes, it is hard to believe, but the architecture of Synapse Analytics doesn’t allow the IDENTITY column to have sequential values as is the case with traditional SQL Server.

2. Data Types Supported: Only INT and BIGINT data types are supported in Synapse Analytics. This is done to avoid premature exhaustion of key values for the identity column, so only the biggest datatypes are supported.

3. Support for @@IDENTITY: The @@IDENTITY is a system function that returns the last value of the most recently used IDENTITY column in the current session. @@IDENTITY is not available as a system function in Synapse Analytics.

Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-identity

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: