Technical Glossary

Sharding

Definition: Horizontal database partitioning technique that distributes data across multiple instances to scale storage and query capacity.

— Source: NERVICO, Product Development Consultancy

What is Sharding

Sharding is a horizontal database partitioning technique that splits a large dataset into smaller fragments called shards. Each shard contains a subset of the total data and is hosted on a separate database instance. Together, all shards form the complete dataset. It is one of the fundamental strategies for scaling databases beyond the limits of a single server.

How it works

The system uses a shard key to determine which shard stores each record. The most common strategies are range-based sharding (users 1 to 1,000,000 on shard A, 1,000,001 to 2,000,000 on shard B), hash-based sharding (a hash function is applied to the ID for uniform distribution), or geographic sharding (European users on one shard, American users on another). A routing layer, either in the database client or an intermediary proxy, directs each query to the correct shard based on the shard key. Queries that target a single shard are efficient, while cross-shard queries (requiring data from multiple shards) need additional coordination and are more costly.

Why it matters

When a database reaches the limits of a single server (millions of writes per second, terabytes of data, or queries that degrade performance), sharding is the primary strategy for continued scaling. It distributes both read and write load across multiple servers, something that read replicas alone cannot achieve. However, sharding adds significant complexity to the application and should be adopted when simpler alternatives (query optimization, indexing, caching, read replicas) are no longer sufficient.

Practical example

A SaaS analytics platform processes 500 million events daily. Its single-server PostgreSQL database begins to saturate CPU and disk. The team implements sharding by tenant_id: each customer’s data is stored on one of twelve shards, distributed via a hash function. Queries for an individual tenant resolve on a single shard with low latency. Global reports that cross data from multiple tenants execute as parallel queries to all shards, aggregating results at the application layer.


Last updated: February 2026

Need help with product development?

We help you accelerate your development with cutting-edge technology and best practices.