Azure Cosmos DB Global Data Distribution

Azure Cosmos DB is Microsoft's globally distributed NoSQL database service. It is available in all Azure regions worldwide.

Microsoft's documentation (see the reference at the end of this post) includes a diagram that shows how global data distribution is managed in Cosmos DB:

An Azure region can contain multiple data centres. Within a data centre, Cosmos DB is deployed on stamps (massive groups of machines), which are organised into clusters. The machines in a cluster are typically spread across 10-20 fault domains. Spreading a cluster across fault domains provides high availability if some of its machines fail, and data is replicated across machines to provide redundancy within the cluster.
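To see why spreading machines across fault domains helps, here is a minimal sketch. It assumes a hypothetical group of 4 replicas placed round-robin across 10 fault domains. Both numbers and the placement strategy are illustrative assumptions; the actual placement logic is internal to Cosmos DB.

```python
def place_replicas(num_replicas: int, num_fault_domains: int) -> list[int]:
    """Assign each replica to a fault domain in round-robin order (assumed strategy)."""
    return [i % num_fault_domains for i in range(num_replicas)]


def survivors_after_failure(placement: list[int], failed_domain: int) -> int:
    """Count the replicas that remain available when one fault domain goes down."""
    return sum(1 for fault_domain in placement if fault_domain != failed_domain)


# Hypothetical numbers: 4 replicas spread over 10 fault domains.
placement = place_replicas(num_replicas=4, num_fault_domains=10)

# Worst case over every possible single fault-domain failure.
worst_case = min(survivors_after_failure(placement, fd) for fd in range(10))
print(placement, worst_case)
```

Because no two replicas share a fault domain in this placement, losing any single fault domain takes out at most one replica.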

A Cosmos DB database consists of containers. Data stored in Cosmos DB is replicated at two levels:

Within an Azure region: Data is distributed locally based on the partition key. As covered in the previous post, partition servers manage the underlying physical partitions.

Global replication: Each physical partition is replicated across the Azure regions associated with the Cosmos DB account. This gives clients low-latency access to the data from the region nearest to them.
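The partition-key-based distribution within a region can be sketched as a hash lookup. This is a simplified illustration only: the function name, the use of SHA-256, and the modulo mapping are assumptions, whereas Cosmos DB uses its own internal hash function and assigns hash ranges to physical partitions.

```python
import hashlib


def physical_partition_for(partition_key: str, num_physical_partitions: int) -> int:
    """Map a partition key value to a physical partition index by hashing it.

    Hypothetical helper: modulo over a SHA-256 prefix stands in for the
    service's internal hash-range mapping.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    return hash_value % num_physical_partitions


keys = ["user-1", "user-2", "user-3", "user-1"]
assignments = [physical_partition_for(key, 4) for key in keys]
print(dict(zip(keys, assignments)))
```

The property that matters is determinism: every item with the same partition key value lands on the same physical partition, so a read for that key knows where to look.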

Cosmos DB manages the partitions internally. Each physical partition is implemented by a group of replicas called a replica set. Replica sets will be discussed in detail in a future post.
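The two levels of replication can be pictured as a nested structure: every physical partition has one replica set in each region. The sketch below is a toy model, not Cosmos DB's API. The class, the function, the region names, and the figure of 4 replicas per set are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaSet:
    """Hypothetical model of the replicas backing one physical partition in one region."""
    region: str
    partition_id: int
    replicas: list[str] = field(default_factory=list)


def build_topology(regions: list[str], num_partitions: int, replicas_per_set: int):
    """Return {partition_id: {region: ReplicaSet}} for every partition and region."""
    topology = {}
    for pid in range(num_partitions):
        topology[pid] = {
            region: ReplicaSet(
                region=region,
                partition_id=pid,
                replicas=[f"{region}/p{pid}/r{i}" for i in range(replicas_per_set)],
            )
            for region in regions
        }
    return topology


# Assumed example: 2 physical partitions, 2 regions, 4 replicas per replica set.
topology = build_topology(["West US", "North Europe"], num_partitions=2, replicas_per_set=4)
for pid, per_region in topology.items():
    for region, replica_set in per_region.items():
        print(pid, region, replica_set.replicas)
```

Reading the structure top down mirrors the text: the outer key is the physical partition, the inner key is the region (global replication), and each list holds the replicas inside that region (local redundancy).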

Reference: https://docs.microsoft.com/en-us/azure/cosmos-db/global-dist-under-the-hood
