nervico-team · cloud-architecture · 11 min read
Serverless on AWS: Lambda, API Gateway, and DynamoDB in Practice
Practical guide to serverless architecture on AWS: how to design APIs with Lambda and API Gateway, DynamoDB data access patterns, real costs, and when serverless is not the answer.
Serverless does not mean there are no servers. It means they are not your problem. AWS handles provisioning, scaling, security patches, and availability. Your team writes functions, deploys them, and pays only for actual invocations. It sounds ideal, and in many cases it is.
But serverless is not a universal solution. It has concrete limitations that, if not understood before you start, lead to frustration, unexpected costs, and architectures that end up more complex than the ones they were meant to replace.
This article explains how the Lambda + API Gateway + DynamoDB triad works in practice, with real patterns, verified costs, and the design decisions that separate a well-thought-out serverless architecture from one that becomes a maintenance burden.
What Serverless Is and What It Is Not
The Execution Model
In a serverless architecture, the cloud provider runs your code in response to events. There are no servers to manage, no capacity to reserve. The model rests on three principles:
- On-demand execution: Code runs only when an event arrives (HTTP request, queue message, database change).
- Automatic scaling: If 10 simultaneous requests arrive, 10 instances of your function are created. If 10,000 arrive, 10,000 instances are created. No additional configuration.
- Pay-per-use billing: You pay for the number of invocations and execution time. If there is no traffic, you pay nothing.
What Serverless Does Not Solve
Serverless eliminates infrastructure management but does not eliminate complexity:
- It does not remove the need for architecture: A poorly designed serverless application is just as problematic as a poorly designed monolith. Design patterns still matter.
- It does not simplify debugging: Tracing a chain of 15 Lambda functions communicating through events is significantly harder than attaching a debugger to a local monolith.
- It does not automatically reduce latency: Lambda cold starts add between 100ms and several seconds of latency on the first invocation, depending on the runtime and package size.
- It is not always cheaper: For constant, predictable workloads, a reserved EC2 instance can be 5-10x more economical than Lambda.
AWS Lambda: The Compute Unit
How It Works Internally
Lambda runs your code inside an isolated execution environment called a microVM, based on Firecracker technology that AWS developed internally. Each invocation follows this flow:
- Incoming event: A trigger (API Gateway, SQS, S3, EventBridge) sends an event to the Lambda service.
- Environment allocation: Lambda looks for an available “warm” execution environment. If one exists, it reuses it. If not, it creates a new one (cold start).
- Execution: Your function receives the event, processes it, and returns a response.
- Billing: Lambda charges per invocation ($0.20 per million) and per duration in 1ms increments (price depends on allocated memory).
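The billing model above is easy to sketch in a few lines. The rates used here are the published us-east-1 x86 prices at the time of writing ($0.20 per million requests, $0.0000166667 per GB-second) and will differ by region and architecture; this is an estimation aid, not a reproduction of your AWS bill:

```python
def lambda_monthly_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_million=0.20,
                        price_per_gb_second=0.0000166667):
    """Estimate monthly Lambda cost: per-request charge + GB-second charge."""
    request_cost = invocations / 1_000_000 * price_per_million
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * price_per_gb_second

# 10M invocations/month, 200ms average duration, 256 MB memory
print(round(lambda_monthly_cost(10_000_000, 200, 256), 2))
```

Because duration is billed in GB-seconds, the formula also shows why raising memory can lower cost: if doubling memory more than halves duration, the product shrinks.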
Configuration That Matters
Memory: Lambda allows between 128 MB and 10,240 MB. But memory does not only affect available RAM. AWS allocates CPU proportionally to memory. A function with 1,769 MB gets 1 full vCPU. With 128 MB, it gets a minimal fraction. If your function does intensive processing, increasing memory can make it run faster and cost less because duration decreases.
Timeout: The maximum is 15 minutes. For REST APIs, a 30-second timeout is reasonable. For batch processing, 15 minutes may fall short. If you need more, consider Step Functions or ECS.
Layers: Shared dependencies between functions are packaged as layers. A layer with the AWS SDK, for example, avoids duplicating 50 MB across every function. The limit is 5 layers per function and 250 MB uncompressed total.
Environment variables: Connection strings, API keys, configuration. Sensitive variables should be encrypted with AWS KMS. Never hardcode secrets in your code.
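Putting the configuration points together, a minimal Python handler behind API Gateway proxy integration might look like this. The environment variable name is illustrative; note that configuration is read at module scope so warm invocations reuse it:

```python
import json
import os

# Read configuration once, at init time; warm invocations reuse it.
TABLE_NAME = os.environ.get("TABLE_NAME", "users")  # illustrative variable name

def handler(event, context):
    """Proxy-integration handler: event dict in, HTTP-shaped dict out."""
    user_id = (event.get("pathParameters") or {}).get("id", "unknown")
    body = {"userId": user_id, "table": TABLE_NAME}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```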
Cold Start: The Real Problem
The cold start is the time Lambda needs to create a new execution environment. Times vary by runtime:
| Runtime | Typical Cold Start | With VPC |
|---|---|---|
| Python | 100-300ms | 200-500ms |
| Node.js | 100-300ms | 200-500ms |
| Java | 800ms-3s | 1-5s |
| .NET | 400ms-1.5s | 600ms-2.5s |
| Rust/Go | 10-50ms | 50-200ms |
Practical solutions:
- Provisioned Concurrency: Keeps N instances always warm. Eliminates cold starts but introduces fixed cost. Useful for latency-critical functions with SLAs.
- SnapStart: AWS takes a snapshot of the execution environment after initialization and restores it when a new environment is needed, instead of rerunning init. Launched for Java and later extended to Python and .NET, it brings Java cold starts under 200ms.
- Minimize the package: Every additional MB increases cold start time. Use tree-shaking, remove unnecessary dependencies, and consider lightweight runtimes like Rust for latency-critical functions.
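A cheap way to observe cold starts in your own functions is a module-level flag: module scope runs exactly once per execution environment, so the flag is true only on the invocation that paid the cold start. A sketch:

```python
import time

# Module scope runs once per execution environment, i.e. on a cold start.
_INIT_TIME = time.time()
_is_cold_start = True

def handler(event, context):
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False  # every later invocation in this environment is warm
    # In a real function, emit this as a structured log line or CloudWatch metric.
    return {"coldStart": cold,
            "environmentAgeSeconds": round(time.time() - _INIT_TIME, 3)}
```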
API Gateway: The Front Door
REST API vs HTTP API
AWS offers two types of API Gateway, and the choice matters more than it appears:
HTTP API (recommended for most cases):
- Cost: $1.00 per million requests.
- Latency: up to 60% lower than REST API.
- Supports native JWT authorizers, automatic CORS, Lambda integration, parameterized routes.
- Does not support: usage plans, API keys as authentication, request/response transformations, built-in schema validation.
REST API (for specific use cases):
- Cost: $3.50 per million requests.
- Supports everything above plus: WAF, built-in caching, usage plans with per-API-key throttling, validation models, VTL transformations.
- Required if you need: response caching at the API level, IP whitelisting with WAF, API key monetization.
Practical recommendation: Start with HTTP API. If you need caching or WAF, migrate to REST API. The route structure is compatible.
Integration Patterns
The most direct approach is proxy integration with Lambda:
```
GET  /users/{id}  -> Lambda function -> JSON response
POST /users       -> Lambda function -> JSON response
```

Each route can point to a different Lambda function (one handler per endpoint) or to a single function that routes internally (monolambda). The choice has consequences:
One handler per endpoint:
- Independent deployments per endpoint.
- Smaller packages, faster cold starts.
- Greater granularity in IAM permissions.
- More functions to manage.
Monolambda (one function for the entire API):
- Single deployment, single package.
- Larger package, slower cold starts.
- Simpler local development.
- Less permission granularity.
For APIs with fewer than 20 endpoints, monolambda is usually the most practical option. For large APIs or multiple teams, one handler per endpoint allows independent deployments.
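A monolambda needs a small internal router. A sketch using a plain route table (route and handler names are illustrative); it assumes the REST-style proxy payload, which carries `httpMethod` and `resource` — HTTP API's v2 payload uses `routeKey` instead:

```python
import json

def get_user(event):
    user_id = event["pathParameters"]["id"]
    return {"statusCode": 200, "body": json.dumps({"id": user_id})}

def create_user(event):
    return {"statusCode": 201, "body": json.dumps({"created": True})}

# Route table keyed on (HTTP method, resource path as defined in API Gateway).
ROUTES = {
    ("GET", "/users/{id}"): get_user,
    ("POST", "/users"): create_user,
}

def handler(event, context):
    route = ROUTES.get((event["httpMethod"], event["resource"]))
    if route is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return route(event)
```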
Authentication and Authorization
API Gateway supports three authentication mechanisms:
- JWT Authorizer (HTTP API): Validates JWT tokens from any OIDC provider (Cognito, Auth0, Firebase Auth). No additional cost, no additional Lambda. The simplest and most efficient option.
- Lambda Authorizer: A Lambda function that receives the token, validates it, and returns an IAM policy. Useful when you need complex authorization logic (roles, granular permissions, database lookups).
- IAM Authorization: Uses SigV4 signatures. Ideal for service-to-service communication within AWS, not for public APIs.
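A Lambda authorizer must return an IAM policy document in the shape API Gateway expects. A minimal sketch — the token check is a placeholder; a real authorizer would verify a JWT signature or look the token up:

```python
def _policy(principal_id, effect, resource):
    """Build the IAM policy document API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": resource,
            }],
        },
    }

def handler(event, context):
    token = event.get("authorizationToken", "")
    # Placeholder check -- replace with real JWT/signature validation.
    if token == "allow-me":
        return _policy("user-123", "Allow", event["methodArn"])
    return _policy("anonymous", "Deny", event["methodArn"])
```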
DynamoDB: Data Design for Serverless
The Mindset Shift
DynamoDB is not a relational database. It does not support JOINs. It does not have a flexible query language like SQL. And that is intentional. DynamoDB is designed to deliver single-digit millisecond latency at any scale, but it demands that you think about access patterns before designing the table.
In a relational database, you design the tables first and the queries later. In DynamoDB, it is the opposite: first you define all the queries your application needs, then you design the table to support them.
Single Table Design
The most powerful DynamoDB pattern is single table design. Instead of creating a table per entity (Users, Orders, Products), you store all entities in a single table with generic keys:
| PK (Partition Key) | SK (Sort Key) | Data |
|---|---|---|
| USER#123 | PROFILE | {name, email...} |
| USER#123 | ORDER#2024-001 | {total, status...} |
| USER#123 | ORDER#2024-002 | {total, status...} |
| PRODUCT#abc | METADATA | {name, price...} |
| PRODUCT#abc | REVIEW#2024-001 | {rating, text...} |

Advantages:
- A single `Query` operation with `PK = USER#123` returns the user profile and all their orders.
- No JOINs, no additional queries, no added latency.
- Scaling is automatic: DynamoDB distributes data by partition key.
Disadvantages:
- The initial design requires knowing all access patterns upfront.
- Later changes to access patterns may require data migration.
- The learning curve is steep for teams accustomed to SQL.
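The single-query access pattern above maps to one DynamoDB `Query` call. The sketch below builds the low-level request parameters as a plain dict so the shape is visible; with boto3 you would pass them as `client.query(**params)`. The table name is illustrative:

```python
def user_with_orders_params(user_id, table_name="app-table"):
    """Parameters for one Query that returns a user's profile and all orders."""
    return {
        "TableName": table_name,  # illustrative name
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": {"S": f"USER#{user_id}"}},
        # Items come back sorted by SK: PROFILE, then ORDER#2024-001, ...
    }

params = user_with_orders_params("123")
# boto3.client("dynamodb").query(**params)  # one round trip, no JOINs
```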
Capacity Modes and Cost
DynamoDB offers two billing modes:
On-Demand (pay per use):
- $1.25 per million writes.
- $0.25 per million reads.
- No capacity to plan.
- Ideal for unpredictable workloads or early-stage products.
Provisioned (reserved capacity):
- You define read/write capacity units per second.
- More economical for predictable workloads (up to 70% savings with Reserved Capacity).
- Risk of throttling if you underestimate capacity.
Recommended strategy: Start in On-Demand mode. When your traffic pattern is stable and predictable (usually after 3-6 months in production), evaluate whether Provisioned with auto-scaling reduces costs.
Global Secondary Indexes (GSI)
GSIs let you query the table with alternative keys. If your primary table has PK = USER#123 but you need to search by email, you create a GSI with PK = email.
Each GSI is a partial copy of the table with a different key schema. It consumes additional read/write capacity. The limit is 20 GSIs per table.
Practical rule: If you need more than 5 GSIs, your table design probably needs rethinking. A good single table design covers most access patterns with 2-3 GSIs.
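Querying a GSI is the same `Query` call plus an `IndexName`. A sketch of the email lookup mentioned above; the index and attribute names are illustrative:

```python
def user_by_email_params(email, table_name="app-table", index_name="email-index"):
    """Query parameters for looking a user up by email through a GSI."""
    return {
        "TableName": table_name,
        "IndexName": index_name,  # GSIs are addressed by name, not a separate table
        "KeyConditionExpression": "email = :e",
        "ExpressionAttributeValues": {":e": {"S": email}},
    }

# boto3.client("dynamodb").query(**user_by_email_params("ana@example.com"))
```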
Serverless Architecture Patterns
Synchronous REST API
The most common pattern:
```
Client -> API Gateway -> Lambda -> DynamoDB -> Response
```

Appropriate for: CRUD operations, queries, anything that must respond within 3 seconds.
Asynchronous Processing with Queues
For operations that do not need an immediate response:
```
Client -> API Gateway -> Lambda (validation) -> SQS -> Lambda (processing) -> DynamoDB
```

Advantages: Decouples reception from processing. If processing fails, the message returns to the queue. If there is a traffic spike, SQS absorbs messages and Lambda processes them at its own pace.
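The validation Lambda in this pattern only checks the payload and enqueues it; the heavy work happens in the consumer. A sketch — the required fields are an illustrative schema, and the `send_message` call is left commented because it needs a queue URL and AWS credentials:

```python
import json

REQUIRED_FIELDS = {"orderId", "amount"}  # illustrative schema

def validate(payload):
    """Return the message body to enqueue, or raise if the payload is invalid."""
    missing = REQUIRED_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return json.dumps(payload)

def handler(event, context):
    body = validate(json.loads(event["body"]))
    # boto3.client("sqs").send_message(QueueUrl=QUEUE_URL, MessageBody=body)
    return {"statusCode": 202, "body": json.dumps({"accepted": True})}
```

Returning 202 Accepted makes the contract explicit: the request was received and will be processed later.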
Event-Driven with EventBridge
For cross-domain communication:
```
Service A -> EventBridge -> Rule 1 -> Lambda (notifications)
                         -> Rule 2 -> Lambda (analytics)
                         -> Rule 3 -> Step Functions (workflow)
```

EventBridge allows services to emit events without knowing the consumers. It is the fundamental pattern for serverless architectures spanning multiple domains.
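Emitting such an event is a single `put_events` call; the rules on the bus decide which consumers run. A sketch that builds the event entry — the source and detail-type names are illustrative, and the boto3 call is commented since it needs AWS credentials:

```python
import json

def order_created_entry(order_id, total, bus_name="default"):
    """Build a PutEvents entry; EventBridge rules match on source and detail-type."""
    return {
        "Source": "shop.orders",          # illustrative domain name
        "DetailType": "OrderCreated",
        "Detail": json.dumps({"orderId": order_id, "total": total}),
        "EventBusName": bus_name,
    }

entry = order_created_entry("2024-001", 49.90)
# boto3.client("events").put_events(Entries=[entry])
```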
Orchestration with Step Functions
For workflows involving multiple steps with conditional logic, retries, and error handling:
```
Step Functions:
1. Validate data (Lambda)
2. If valid -> Process payment (Lambda)
3. If payment OK -> Generate invoice (Lambda) + Send email (SES)
4. If payment fails -> Notify error (SNS) + Retry (up to 3 times)
```

Step Functions is preferable to chaining Lambdas directly because it handles retries, errors, and timeouts declaratively. The cost is $0.025 per 1,000 state transitions.
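The retry behavior lives in the state machine definition, not in function code. A sketch of the payment step in Amazon States Language, written as a Python dict — the state names and the Lambda ARN are illustrative:

```python
import json

# Fragment of an ASL definition: one task with declarative retries and a catch.
process_payment_state = {
    "ProcessPayment": {
        "Type": "Task",
        # Illustrative ARN -- substitute your function's.
        "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:process-payment",
        "Retry": [{
            "ErrorEquals": ["States.TaskFailed"],
            "MaxAttempts": 3,        # up to 3 retries, handled by Step Functions
            "IntervalSeconds": 2,
            "BackoffRate": 2.0,      # exponential backoff between attempts
        }],
        "Catch": [{
            "ErrorEquals": ["States.ALL"],
            "Next": "NotifyError",   # e.g. a state that publishes to SNS
        }],
        "Next": "GenerateInvoice",
    }
}

print(json.dumps(process_payment_state, indent=2))
```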
Real Costs: Serverless in Numbers
Scenario 1: REST API with 100,000 Monthly Active Users
Estimate for an API with 50 endpoints, 10 million requests per month, and 500 GB of DynamoDB storage:
| Service | Configuration | Cost/Month |
|---|---|---|
| Lambda | 10M invocations, 200ms avg, 256MB | ~$20 |
| API Gateway (HTTP) | 10M requests | ~$10 |
| DynamoDB (On-Demand) | 5M writes + 30M reads | ~$14 |
| DynamoDB Storage | 500 GB | ~$125 |
| CloudWatch Logs | 10 GB/month | ~$5 |
| Total | | ~$174 |
Scenario 2: The Same Load on EC2
| Service | Configuration | Cost/Month |
|---|---|---|
| EC2 (2x t3.medium) | On-Demand, 24/7 | ~$122 |
| ALB | Application Load Balancer | ~$25 |
| RDS (db.t3.medium) | PostgreSQL, Multi-AZ | ~$150 |
| CloudWatch | Metrics + logs | ~$10 |
| Total | | ~$307 |
In this scenario, serverless is 43% cheaper. But if the load increases to 100 million requests per month with constant 24/7 traffic, EC2 with Reserved Instances can be more economical.
The Inflection Point
There is no universal rule, but as a reference:
- Fewer than 50M requests/month with variable traffic: Serverless usually wins.
- More than 100M requests/month with constant traffic: EC2/ECS with Reserved Instances is usually cheaper.
- Traffic with extreme spikes (events, launches): Serverless wins on automatic scaling, regardless of volume.
When Not to Use Serverless
Serverless is not the right answer for every case. These situations call for alternatives:
Applications with Persistent In-Memory State
Long-duration WebSockets, in-memory caches, persistent database connections. Lambda constantly creates and destroys instances. If your application depends on maintaining state between requests, you need containers (ECS/EKS) or EC2 instances.
Long-Running Processing
Lambda has a 15-minute limit. If you need to process video files, train models, or run ETL jobs that last hours, Lambda is not viable. Use ECS Fargate for batch tasks or AWS Batch for processing at scale.
Guaranteed Ultra-Low Latency
If your SLA requires consistent responses under 10ms, Lambda cold starts are a risk. Provisioned Concurrency mitigates the issue but adds fixed cost and does not guarantee sub-10ms latency.
Predictable, Constant Workloads
A service that processes the same load 24/7 does not benefit from pay-per-use. EC2 with Reserved Instances or Savings Plans will be significantly cheaper.
Tools for Local Development
Developing and testing Lambda functions locally requires specific tooling:
- AWS SAM (Serverless Application Model): AWS framework for defining, testing, and deploying serverless applications. Includes `sam local invoke` for running functions locally and `sam local start-api` for emulating API Gateway.
- Serverless Framework: Third-party alternative with a larger plugin ecosystem. Supports multiple cloud providers.
- LocalStack: Emulates AWS services in Docker. Useful for integration tests without cost.
- SST (Serverless Stack): Modern framework with live Lambda development that connects your local code directly to real AWS services.
Infrastructure as Code for Serverless
Do not deploy serverless from the console. Ever. Use Infrastructure as Code from day one:
- AWS CDK: Define infrastructure in TypeScript, Python, or Java. Generates CloudFormation. The most powerful option if your team already codes in these languages.
- Terraform: Cloud-agnostic. More verbose than CDK but with a massive ecosystem and no CloudFormation dependency.
- AWS SAM: Simplified CloudFormation extension for serverless. Less flexible than CDK but more concise for purely serverless use cases.
Conclusion
Serverless on AWS is a powerful tool when applied to the right problem. Lambda + API Gateway + DynamoDB form a triad that lets you build scalable APIs with predictable costs and minimal operations overhead. But it is not magic. It requires designing data access patterns before writing code, understanding cold start limitations, and accepting that distributed debugging is inherently more complex.
The choice between serverless and containers is not binary. Many successful architectures combine both: serverless for APIs and event processing, containers for stateful services and long-running workloads.
If you are evaluating a serverless migration or designing a new architecture on AWS, NERVICO helps technical teams make these decisions based on data, not hype. Request a free audit and we will review your current architecture to identify where serverless makes sense and where it does not.