ElastiCache vs Self-Hosted Redis Sentinel: Cost and Feature Comparison

AWS ElastiCache is the default choice when teams on AWS need Redis. It handles provisioning, replication, patching, and failover. But it comes with significant cost overhead, networking constraints, and configuration limitations that are worth understanding before you commit.

Self-hosted Redis Sentinel on your own servers gives you full control at a fraction of the price, provided you handle the initial setup correctly. This guide compares the two approaches across cost, features, failover behavior, and operational trade-offs so you can make an informed decision.

What ElastiCache Gives You

ElastiCache for Redis is a managed service that runs Redis on AWS-managed EC2 instances. You pick a node type, configure replication, and AWS handles the rest. The key components:

Replication groups -- a primary node with up to 5 read replicas. Multi-AZ deployments place the primary and replicas in different availability zones.
Automatic failover -- when Multi-AZ is enabled, ElastiCache promotes a replica if the primary fails and updates the primary DNS endpoint to point to the new primary.
Managed patching -- AWS applies Redis engine patches during your maintenance window.
Snapshots -- automated daily backups to S3, with configurable retention.
CloudWatch integration -- built-in metrics for memory usage, connections, replication lag, and cache hits/misses.

What you do not get: SSH access to the underlying nodes, full control over Redis configuration parameters, the ability to install custom modules, or the option to run Redis on hardware outside of AWS.

What Self-Hosted Redis Sentinel Gives You

Redis Sentinel is Redis's built-in high-availability system. You run it on your own servers -- cloud VMs, dedicated machines, or bare metal. The architecture consists of:

1 Redis primary -- handles all writes and optionally reads
1+ Redis replicas -- receive replicated data via async streaming replication
3+ Sentinel processes -- monitor Redis health, detect failures, and coordinate automatic failover
HAProxy (recommended) -- provides a single connection endpoint that always routes to the current primary

You own every layer of the stack. You choose the hardware, the Redis version, the configuration parameters, and the network topology. You are also responsible for setup, monitoring, and maintenance.
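The sizing rules behind that component list can be checked mechanically. The sketch below is illustrative only: the `Topology` class and its field names are hypothetical, and the rules it encodes are the usual Sentinel guidance that a failover must be authorized by a majority of Sentinels, which is why three is the practical minimum.

```python
from dataclasses import dataclass


@dataclass
class Topology:
    """Hypothetical description of a Sentinel deployment (names are illustrative)."""
    replicas: int
    sentinels: int
    quorum: int

    def majority(self) -> int:
        # A failover must be authorized by more than half of all Sentinels.
        return self.sentinels // 2 + 1

    def tolerated_sentinel_failures(self) -> int:
        # Sentinels that can be lost while a majority is still reachable.
        return self.sentinels - self.majority()

    def problems(self) -> list[str]:
        issues = []
        if self.replicas < 1:
            issues.append("need at least 1 replica to fail over to")
        if self.sentinels < 3:
            issues.append("need at least 3 Sentinels to survive losing one")
        if not 1 <= self.quorum <= self.sentinels:
            issues.append("quorum must be between 1 and the Sentinel count")
        return issues


three_node = Topology(replicas=2, sentinels=3, quorum=2)
print(three_node.majority(), three_node.tolerated_sentinel_failures(), three_node.problems())
```

With two Sentinels the majority is also two, so losing either one blocks failover entirely; the check flags that case.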

Cost Comparison

This is where the two approaches diverge sharply. ElastiCache pricing is based on node-hours for the specific cache node type, plus costs for backup storage and data transfer. Self-hosted pricing is whatever your server provider charges, with no per-service markup.

The following comparisons use on-demand ElastiCache pricing in US East (N. Virginia) and standard Hetzner Cloud pricing. All ElastiCache setups include Multi-AZ with automatic failover (primary + 1 replica minimum). Self-hosted setups use 3 nodes running Redis + Sentinel with HAProxy.

Small Scale: Development / Low-Traffic Production

For workloads needing 2-4 GB of Redis memory with basic HA.

Component	ElastiCache (cache.t4g.medium)	Hetzner 3-Node Sentinel
Node type / server	cache.t4g.medium (2 vCPU, 3.09 GB)	CPX21 (3 vCPU, 4 GB RAM)
Node count	2 (primary + 1 replica)	3 (primary + 2 replicas + Sentinel)
Compute cost	~$94/mo ($0.065/hr x 2 x 730hr)	~$18/mo ($5.99/mo x 3)
Backup storage (5 GB)	~$0.43/mo	~$1/mo (object storage)
Data transfer (50 GB out)	~$4.50/mo	$0 (included)
Monthly total	~$99/mo	~$19/mo
Annual total	~$1,188/yr	~$228/yr
Annual savings	--	~$960/yr (81%)

Medium Scale: Standard Production

For workloads needing 8-16 GB of Redis memory with production-grade HA.

Component	ElastiCache (cache.r7g.large)	Hetzner 3-Node Sentinel
Node type / server	cache.r7g.large (2 vCPU, 13.07 GB)	CCX23 (4 vCPU, 16 GB RAM)
Node count	2 (primary + 1 replica)	3 (primary + 2 replicas + Sentinel)
Compute cost	~$219/mo ($0.150/hr x 2 x 730hr)	~$45/mo ($14.99/mo x 3)
Backup storage (20 GB)	~$1.70/mo	~$2/mo
Data transfer (100 GB out)	~$9/mo	$0
Monthly total	~$230/mo	~$47/mo
Annual total	~$2,760/yr	~$564/yr
Annual savings	--	~$2,196/yr (80%)

Large Scale: High-Memory Production

For workloads needing 32-64 GB of Redis memory with multiple replicas.

Component	ElastiCache (cache.r7g.2xlarge)	Hetzner 3-Node Sentinel
Node type / server	cache.r7g.2xlarge (8 vCPU, 52.82 GB)	CCX43 (16 vCPU, 64 GB RAM)
Node count	3 (primary + 2 replicas)	3 (primary + 2 replicas + Sentinel)
Compute cost	~$657/mo ($0.300/hr x 3 x 730hr)	~$165/mo ($54.99/mo x 3)
Backup storage (100 GB)	~$8.50/mo	~$5/mo
Data transfer (200 GB out)	~$18/mo	$0
Monthly total	~$684/mo	~$170/mo
Annual total	~$8,208/yr	~$2,040/yr
Annual savings	--	~$6,168/yr (75%)

The cost ratio holds at roughly 4-5x across all scale points. Reserved node pricing on AWS narrows the gap to approximately 3x, but requires a 1- or 3-year commitment.
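To rerun the comparison with your own prices, the arithmetic behind the three tables can be written as a small script. This is a sketch, not a pricing tool: the `Scenario` class and its field names are hypothetical, and the figures are copied from the tables above rather than fetched from either provider.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # same convention as the tables above


@dataclass
class Scenario:
    name: str
    ec_hourly: float    # ElastiCache on-demand price per node-hour
    ec_nodes: int
    ec_backup: float    # ElastiCache backup storage per month
    ec_transfer: float  # ElastiCache data transfer out per month
    sh_monthly: float   # self-hosted server price per month
    sh_nodes: int
    sh_backup: float    # self-hosted backup storage per month

    def elasticache_monthly(self) -> float:
        compute = self.ec_hourly * self.ec_nodes * HOURS_PER_MONTH
        return compute + self.ec_backup + self.ec_transfer

    def self_hosted_monthly(self) -> float:
        return self.sh_monthly * self.sh_nodes + self.sh_backup

    def savings_pct(self) -> float:
        ec = self.elasticache_monthly()
        return 100 * (ec - self.self_hosted_monthly()) / ec


SCENARIOS = [
    Scenario("small", 0.065, 2, 0.43, 4.50, 5.99, 3, 1.0),
    Scenario("medium", 0.150, 2, 1.70, 9.00, 14.99, 3, 2.0),
    Scenario("large", 0.300, 3, 8.50, 18.00, 54.99, 3, 5.0),
]

for s in SCENARIOS:
    ratio = s.elasticache_monthly() / s.self_hosted_monthly()
    print(f"{s.name}: ${s.elasticache_monthly():.2f} vs ${s.self_hosted_monthly():.2f} "
          f"per month, {s.savings_pct():.1f}% savings, {ratio:.1f}x")
```

The unrounded totals land within about a dollar of the table rows, which round each line item before summing.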

Feature Comparison

Feature	ElastiCache	Self-Hosted Sentinel
Automatic failover	Yes (Multi-AZ)	Yes (Sentinel quorum)
Failover time	30-60 seconds (DNS propagation)	10-30 seconds (Sentinel + HAProxy)
Read replicas	Up to 5 per replication group	Unlimited (add more servers)
Redis version control	Limited to AWS-supported versions	Any version, including Valkey
Custom Redis modules	Not supported	Full support
SSH access	No	Full root access
Configuration control	Partial (parameter groups)	Full control over redis.conf
TLS/SSL	Supported (in-transit encryption)	Supported (Redis 6+)
At-rest encryption	Supported	Your responsibility (disk encryption)
Automated backups	Built-in	Your responsibility (RDB/AOF + object storage)
Monitoring	CloudWatch	Prometheus, Grafana, or any stack
Multi-region	Global Datastore (additional cost)	Deploy replicas anywhere with SSH access
Provider lock-in	AWS only	Any provider or bare metal
Cluster mode (sharding)	Supported	Use Redis Cluster separately

ElastiCache Limitations

Beyond the cost premium, ElastiCache has structural limitations worth understanding before you commit.

VPC Lock-In

ElastiCache nodes run inside your VPC and are not accessible from outside it. This means your application must run in the same VPC (or a peered VPC) to connect. Cross-region access requires Global Datastore, which adds another replication group at full cost. If you later want to move your application to a different cloud provider or to bare metal, your Redis layer cannot follow -- you need to migrate data and rebuild.

Limited Configuration

ElastiCache exposes Redis configuration through parameter groups, but not all parameters are available. You cannot modify certain low-level settings related to memory management, persistence, and networking. If you need fine-grained tuning -- for example, adjusting hz, changing activedefrag thresholds, or configuring custom eviction behavior -- you may find the parameter group does not expose what you need.

No SSH Access

You cannot SSH into ElastiCache nodes. This means no direct debugging with redis-cli on the server, no access to Redis log files on disk, no ability to inspect the underlying OS for performance issues, and no way to run tools like redis-rdb-tools directly on the snapshot files. When something goes wrong, you are limited to CloudWatch metrics, the ElastiCache console events, and whatever redis-cli commands you can run over the network.

No Custom Modules

ElastiCache does not support loading custom Redis modules. Recent engine versions ship a built-in JSON data type, but you cannot load RediSearch, RedisTimeSeries, RedisBloom, or any third-party module. If you depend on one of those, ElastiCache is not an option. AWS offers MemoryDB as an alternative for some of these, but at an even higher price point.

Upgrade Constraints

Engine upgrades are managed by AWS and follow their release schedule. New Redis versions typically become available on ElastiCache weeks to months after the upstream release. You cannot skip versions or pin to a specific patch release. Maintenance windows for patching are configurable, but the patches themselves are mandatory.

Failover Behavior

How each system handles a primary node failure matters for your application's availability guarantees.

ElastiCache Failover

When Multi-AZ is enabled and the primary fails, ElastiCache promotes a replica to primary. The process involves updating the DNS endpoint for the primary to point to the new node. Total failover time is typically 30-60 seconds, but can take longer in edge cases because DNS propagation is involved. During failover, writes fail and read replicas may serve stale data.

ElastiCache does not support custom health check logic. Failover is triggered by AWS's internal monitoring, which checks node health at intervals you do not control.
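Because writes fail for the duration of a failover, application code usually needs a retry policy sized to that window. The sketch below shows one possible shape using exponential backoff. The function name and all defaults are assumptions for illustration, and `retry_on` must be set to whatever exception types your Redis client actually raises (client libraries often define their own rather than using the built-in ones).

```python
import time


def retry_during_failover(operation, attempts=10, base_delay=1.0, max_delay=10.0,
                          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(), retrying on connection errors with exponential backoff.

    The defaults wait 1, 2, 4, 8, then 10 seconds per retry, about 65 seconds
    in total, which is meant to outlast a typical 30-60 second failover.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # retry budget exhausted, surface the error to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Usage (hypothetical client object):
#   retry_during_failover(lambda: client.set("key", "value"))
```

Only retry operations that are safe to repeat. A write that timed out may already have been applied before the primary failed.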

Redis Sentinel Failover

Sentinel processes continuously ping the Redis primary. When a configurable number of Sentinels (the quorum) agree the primary is unreachable, they initiate a leader election among themselves. The winner needs votes from a majority of all Sentinels, not just the quorum. The elected Sentinel then promotes the most up-to-date replica to primary, and the other replicas are reconfigured to replicate from the new primary.

With HAProxy in front of the Redis nodes, your application connects to a single stable endpoint. HAProxy health checks identify the new primary within seconds of promotion, so total failover time is typically 10-30 seconds. You control the down-after-milliseconds and quorum settings, allowing you to tune the trade-off between failover speed and false-positive detection.
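The two tuning knobs named above live in sentinel.conf. As a rough illustration, the helper below renders a minimal Sentinel configuration from Python. The function name, the `mymaster` label, the IP address, and the timing values are all placeholders chosen for the example, not recommendations. The directives themselves are standard Sentinel settings.

```python
def render_sentinel_conf(name, primary_host, primary_port=6379, quorum=2,
                         down_after_ms=5000, failover_timeout_ms=60000):
    """Render a minimal sentinel.conf for one monitored primary (values illustrative)."""
    lines = [
        "port 26379",
        # Sentinels that must agree the primary is down before failover starts.
        f"sentinel monitor {name} {primary_host} {primary_port} {quorum}",
        # How long the primary must be unreachable before this Sentinel flags it.
        f"sentinel down-after-milliseconds {name} {down_after_ms}",
        # Upper bound on how long a single failover attempt may take.
        f"sentinel failover-timeout {name} {failover_timeout_ms}",
        # Replicas re-synced from the new primary at the same time.
        f"sentinel parallel-syncs {name} 1",
    ]
    return "\n".join(lines) + "\n"


print(render_sentinel_conf("mymaster", "10.0.0.1"))
```

Lowering `down_after_ms` shortens detection time but makes a brief network blip more likely to trigger an unnecessary failover.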

Sentinel failover is generally faster than ElastiCache because it does not depend on DNS propagation. HAProxy detects the topology change through direct TCP health checks.
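A common way to implement such a health check is to ask each node for INFO replication and route only to the one reporting role:master. The Python sketch below performs the same probe over a raw socket to make the mechanism concrete. It is not how HAProxy or sshploy is implemented, the function names are hypothetical, and it assumes nodes without AUTH or TLS.

```python
import socket


def parse_role(info_text: str) -> str:
    """Return the value of the role: field from INFO replication output."""
    for line in info_text.splitlines():
        if line.startswith("role:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no role field in INFO output")


def fetch_role(host: str, port: int = 6379, timeout: float = 1.0) -> str:
    """Ask one Redis node for its role. Assumes no AUTH and no TLS."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"INFO replication\r\n")  # inline command form
        reader = sock.makefile("rb")
        header = reader.readline()             # bulk string header, e.g. b"$123\r\n"
        if not header.startswith(b"$"):
            raise RuntimeError(f"unexpected reply: {header!r}")
        body = reader.read(int(header[1:]))    # payload length comes from the header
        return parse_role(body.decode())


def find_primary(nodes):
    """Return the first (host, port) reporting role:master, or None."""
    for host, port in nodes:
        try:
            if fetch_role(host, port) == "master":
                return host, port
        except (OSError, RuntimeError, ValueError):
            continue  # unreachable or unexpected reply: treat as not primary
    return None


# Usage with placeholder addresses:
#   find_primary([("10.0.0.1", 6379), ("10.0.0.2", 6379), ("10.0.0.3", 6379)])
```

Because the probe talks to each node directly, the result changes as soon as a replica is promoted, with no DNS record or TTL in the path.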

Performance Comparison

Raw Redis performance is primarily determined by memory speed, network latency, and CPU clock speed. The Redis process itself is single-threaded for command execution (with I/O threading available in Redis 6+).

ElastiCache runs on AWS EC2 instances with network-attached EBS storage (for persistence) and shared network infrastructure. Network latency between your application and ElastiCache within the same AZ is typically 0.1-0.5ms. Cross-AZ latency adds 0.5-1ms.

Self-hosted on dedicated servers (Hetzner dedicated, OVH, etc.) with local NVMe storage can achieve lower and more consistent latency, especially for persistence operations. If your application runs on the same provider and in the same datacenter, network latency is comparable to same-AZ latency on AWS.

For most workloads, the performance difference is negligible. Redis operations are fast on both. Where self-hosted can edge ahead is in persistence-heavy workloads where local NVMe outperforms network-attached EBS, and in scenarios where you co-locate Redis on the same physical network as your application servers.

Networking and Access Considerations

ElastiCache's VPC-only access model has practical implications:

Local development: You cannot connect to ElastiCache from your local machine without a VPN or SSH tunnel into the VPC.
Multi-cloud: If part of your infrastructure runs outside AWS, connecting to ElastiCache requires VPN tunnels or AWS PrivateLink, adding complexity and latency.
Migration: Moving away from ElastiCache requires provisioning new Redis infrastructure and migrating data. There is no "export and move" path.

Self-hosted Redis can be configured with any network topology you need. Firewall rules control access at the OS level. You can expose Redis on a private network, restrict access to specific IP ranges, or set up WireGuard tunnels between datacenters. The flexibility is complete, but you are responsible for securing it correctly.

Decision Framework

Choose ElastiCache when:

Your entire stack is on AWS and will stay on AWS for the foreseeable future.
You have no engineers comfortable with Linux server administration.
Your Redis usage is small enough that the cost premium is immaterial (under $100/month).
You need compliance certifications that are simpler to achieve with a managed service (AWS handles SOC 2, HIPAA eligibility, etc. for the infrastructure layer).
You need Global Datastore for multi-region replication and do not want to manage cross-region Sentinel setups yourself.

Choose self-hosted Redis Sentinel when:

Infrastructure cost is a meaningful line item and you want 75-80% savings.
You need full control over Redis configuration, version, or modules.
Your application runs outside AWS, or you want provider flexibility.
You need faster failover than ElastiCache's DNS-based approach provides.
You have engineers who can manage Linux servers, or you use deployment tooling that handles the setup.
You want to avoid VPC lock-in and maintain the ability to move your Redis layer to any provider.

How sshploy Deploys Redis Sentinel

sshploy automates the full Redis Sentinel deployment through tested Ansible playbooks. You select your servers, configure your topology (number of Redis nodes, optional HAProxy load balancers), set a password, and sshploy handles the rest: Redis primary and replica configuration, Sentinel quorum setup across all nodes, HAProxy for single-endpoint routing to the current primary, firewall rules scoped to cluster-internal traffic, and Docker-based process management. The entire deployment runs over SSH to servers you already own on any provider -- Hetzner, OVH, Vultr, DigitalOcean, or bare metal. What would normally take hours of manual configuration finishes in minutes and leaves you with a production-ready cluster with automatic failover.

FAQ

Can I migrate from ElastiCache to self-hosted Redis Sentinel?

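Yes. The simplest approach is to take a manual backup of your ElastiCache replication group, export it to an S3 bucket you own, download the RDB file, and restore it on your new self-hosted primary. ElastiCache restricts administrative and replication commands such as BGSAVE, SYNC, and PSYNC, so you cannot trigger a dump by hand or attach an external node to the ElastiCache primary with REPLICAOF. For near-zero-downtime migration, use a tool that copies keys over the normal client protocol (for example, SCAN combined with DUMP and RESTORE) while the application keeps running, then briefly pause writes, copy the remaining changes, and cut your application over to the new endpoint.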

Does ElastiCache support Redis Sentinel?

No. ElastiCache uses its own failover mechanism, not Redis Sentinel. When you enable Multi-AZ, AWS handles failover through internal monitoring and DNS endpoint updates. Your application connects via the ElastiCache primary endpoint, which is a DNS name that AWS updates during failover. Sentinel-aware Redis clients are not needed and will not work with ElastiCache.

Is ElastiCache Serverless a better option than standard ElastiCache?

ElastiCache Serverless (launched in late 2023) removes the need to choose node types and handles scaling automatically. However, it is significantly more expensive per GB of data stored -- roughly $0.125/GB-hour for data storage plus $0.0034 per million ElastiCache Processing Units (ECPUs) for compute. For predictable workloads, standard ElastiCache with reserved nodes is cheaper. For unpredictable or spiky workloads, Serverless avoids over-provisioning. Both are substantially more expensive than self-hosted.
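To see how the storage rate dominates a Serverless bill, here is a small worked calculation. It is a sketch under stated assumptions: the function name is hypothetical, the ECPU rate is taken as charged per million ECPUs, a month is 730 hours as in the cost tables, and the 10 GB and 1 billion ECPU figures are made-up inputs rather than measurements.

```python
HOURS_PER_MONTH = 730  # same month length as the cost tables in this guide


def serverless_monthly_cost(avg_gb_stored, ecpus_per_month,
                            gb_hour_rate=0.125, rate_per_million_ecpus=0.0034):
    """Estimate a monthly ElastiCache Serverless bill from the two metered rates."""
    storage = avg_gb_stored * gb_hour_rate * HOURS_PER_MONTH
    compute = ecpus_per_month / 1_000_000 * rate_per_million_ecpus
    return storage + compute


# Hypothetical workload: 10 GB stored on average, 1 billion ECPUs per month.
estimate = serverless_monthly_cost(avg_gb_stored=10, ecpus_per_month=1_000_000_000)
print(f"${estimate:.2f} per month")
```

For this input, storage comes to $912.50 (10 x 0.125 x 730) and compute to $3.40, so the bill tracks the amount of data held far more than the request volume.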

What about Valkey instead of Redis?

Valkey is a community fork of Redis maintained under the Linux Foundation, created after Redis changed its license. It is fully compatible with Redis Sentinel and works as a drop-in replacement. If licensing is a concern, self-hosted Valkey with Sentinel gives you the same architecture described in this guide without Redis's dual-license model. ElastiCache has also offered Valkey as an engine option since late 2024, subject to the same managed-service constraints described above.

How much operational work is self-hosted Redis Sentinel after initial setup?

Minimal for most teams. Redis is a stable, mature piece of software. After the initial deployment, ongoing work consists of: monitoring dashboards for memory usage and replication lag (set up once), validating backups periodically, and upgrading Redis versions a few times per year. Sentinel handles failover automatically. The most common operational task is replacing a failed node, which with sshploy means adding a new server and re-running the deployment. Most teams spend less than an hour per month on Redis operations.
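The replication-lag part of that monitoring can be as simple as parsing INFO replication output from the primary. The sketch below is one way to do it, assuming the field names that current Redis releases emit (`master_repl_offset` and `slaveN:ip=...,offset=...`). The function name and sample addresses are made up, and fetching the text from the server is left to your client or exporter.

```python
def replica_byte_lag(info_text: str) -> dict[str, int]:
    """Map each replica address to how many bytes it trails the primary by."""
    primary_offset = None
    replica_offsets = {}
    for line in info_text.splitlines():
        key, _, value = line.partition(":")
        if key == "master_repl_offset":
            primary_offset = int(value)
        elif key.startswith("slave") and key[5:].isdigit():
            # e.g. slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=0
            fields = dict(item.split("=", 1) for item in value.split(","))
            replica_offsets[f"{fields['ip']}:{fields['port']}"] = int(fields["offset"])
    if primary_offset is None:
        raise ValueError("master_repl_offset not found in INFO output")
    return {addr: primary_offset - off for addr, off in replica_offsets.items()}


SAMPLE = (
    "# Replication\r\n"
    "role:master\r\n"
    "connected_slaves:2\r\n"
    "slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=0\r\n"
    "slave1:ip=10.0.0.3,port=6379,state=online,offset=900,lag=1\r\n"
    "master_repl_offset:1000\r\n"
)
print(replica_byte_lag(SAMPLE))
```

A replica whose byte lag keeps growing, or that disappears from the output, is the signal worth alerting on.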