What is Redis Sentinel? A Beginner's Guide

A single Redis instance is a single point of failure. When it crashes or the server loses power, every service that depends on it stops working until someone intervenes manually.

Redis Sentinel is the built-in solution to this problem. It monitors your Redis instances, detects failures, and promotes a replica to primary without human intervention. This guide explains how Sentinel works, what its architecture looks like, and what you need to know before deploying it.

The Problem Sentinel Solves

A standalone Redis setup has one server handling all reads and writes. If that server goes down, your application loses access to its data. Adding a replica helps with durability, but it does not help with automatic recovery. Someone still needs to:

Detect that the primary is unreachable
Promote the replica manually using REPLICAOF NO ONE
Reconfigure your application to point at the new primary
Set up remaining replicas to follow the new primary

This manual process takes minutes at best, hours at worst. Redis Sentinel automates the entire sequence -- it monitors your instances, coordinates with other Sentinel processes to confirm a failure, and promotes a replica to primary. Typical failover completes in 10-30 seconds.

Architecture: How the Pieces Fit Together

A Redis Sentinel deployment has three types of processes:

Redis primary -- the single instance that handles all writes (and usually reads)
Redis replicas -- one or more instances that receive a continuous copy of the primary's data via asynchronous replication
Sentinel processes -- lightweight monitoring daemons that watch the Redis instances and coordinate failover

The standard minimum deployment uses three servers, each running one Redis instance and one Sentinel process:

Server 1                Server 2                Server 3
+----------------+      +----------------+      +----------------+
| Redis PRIMARY  |      | Redis REPLICA  |      | Redis REPLICA  |
|   port 6379    |      |   port 6379    |      |   port 6379    |
+----------------+      +----------------+      +----------------+
| Sentinel       |      | Sentinel       |      | Sentinel       |
|   port 26379   |      |   port 26379   |      |   port 26379   |
+----------------+      +----------------+      +----------------+
        |                       |                       |
        +----------- network communication ------------+
                  (health checks, voting, failover)

Data flows from the primary to the replicas via Redis replication. Sentinel processes communicate with each other and with all Redis instances over separate connections. Sentinels discover each other automatically through the Redis primary's Pub/Sub system.

Sentinel does not touch your data. It only monitors health, manages configuration, and orchestrates failover. Your application talks to Redis normally.

How Failover Works Step by Step

When the primary goes down, failover happens in a specific sequence. Understanding this sequence helps when debugging issues or tuning timeout values.

Step 1: Subjective Down (SDOWN)

Each Sentinel periodically pings the Redis primary (by default, every second). If the primary does not return a valid reply within the configured down-after-milliseconds threshold (5000ms in this guide's configuration; Redis's built-in default is 30000ms), that individual Sentinel marks the primary as subjectively down (SDOWN). This is a local decision -- one Sentinel's opinion, not yet confirmed by others.
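The SDOWN rule comes down to a single time comparison. The snippet below is a minimal sketch of that check, not Sentinel's actual code: the MonitoredInstance record and the is_subjectively_down function are hypothetical names, and the 5000 ms threshold is the value used in this guide's configuration.

```python
from dataclasses import dataclass


@dataclass
class MonitoredInstance:
    """Hypothetical record one Sentinel keeps for an instance it watches."""
    name: str
    last_valid_reply_ms: int  # timestamp of the last acceptable PING reply


def is_subjectively_down(instance: MonitoredInstance, now_ms: int,
                         down_after_ms: int = 5000) -> bool:
    """Return True once no valid reply has arrived for longer than down_after_ms."""
    return now_ms - instance.last_valid_reply_ms > down_after_ms


primary = MonitoredInstance(name="mymaster", last_valid_reply_ms=10_000)
print(is_subjectively_down(primary, now_ms=12_000))  # False: silent for 2 s
print(is_subjectively_down(primary, now_ms=15_500))  # True: silent for 5.5 s
```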

Step 2: Objective Down (ODOWN)

The Sentinel that detected SDOWN asks the other Sentinels: "Do you also think the primary is down?" If a quorum of Sentinels agree the primary is unreachable, the state changes to objectively down (ODOWN). This two-phase detection prevents false positives caused by network issues between a single Sentinel and the primary.
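The jump from SDOWN to ODOWN is a simple count against the quorum. As a rough illustration (the function name and the Sentinel labels are made up for this example):

```python
from typing import Dict


def is_objectively_down(sdown_reports: Dict[str, bool], quorum: int) -> bool:
    """True when at least `quorum` Sentinels report the primary as down.

    sdown_reports maps a Sentinel label to that Sentinel's own SDOWN opinion.
    """
    return sum(sdown_reports.values()) >= quorum


# Two of three Sentinels cannot reach the primary; quorum is 2.
reports = {"sentinel-1": True, "sentinel-2": True, "sentinel-3": False}
print(is_objectively_down(reports, quorum=2))  # True

# Only one Sentinel has lost contact: its opinion alone is not enough.
reports = {"sentinel-1": True, "sentinel-2": False, "sentinel-3": False}
print(is_objectively_down(reports, quorum=2))  # False
```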

Step 3: Leader Election

The Sentinels run a leader election among themselves. One Sentinel is elected to actually execute the failover -- the others watch and verify. The Sentinel that first detected the failure typically wins, but any Sentinel can be elected.
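To make the vote counting concrete, here is a simplified sketch of how a winner could be determined from one round of votes. It assumes a candidate needs votes from a majority of all known Sentinels and at least as many votes as the configured quorum; epochs, retries, and timeouts are left out, and elect_leader is a hypothetical helper, not part of Redis.

```python
from collections import Counter
from typing import Dict, Optional


def elect_leader(votes: Dict[str, str], total_sentinels: int,
                 quorum: int) -> Optional[str]:
    """Return the winning Sentinel, or None if nobody has enough votes.

    votes maps each voting Sentinel to the candidate it voted for.
    """
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    majority = total_sentinels // 2 + 1
    needed = max(majority, quorum)
    return candidate if count >= needed else None


# s1 asked first and collected two of three votes.
print(elect_leader({"s1": "s1", "s2": "s1", "s3": "s3"},
                   total_sentinels=3, quorum=2))  # s1

# A split vote elects nobody; the Sentinels would try again later.
print(elect_leader({"s1": "s1", "s2": "s2"},
                   total_sentinels=3, quorum=2))  # None
```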

Step 4: Replica Selection and Promotion

The elected leader chooses the best replica to promote based on replica priority (a configurable value -- lower is preferred), replication offset (how much data the replica has received), and run ID as a tiebreaker. It sends REPLICAOF NO ONE to the chosen replica, promoting it to primary, then reconfigures remaining replicas to follow the new primary.
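The selection order described above can be expressed as a sort key. This is an illustrative sketch only: the Replica class and pick_replica function are hypothetical, and the real implementation also filters out replicas that are disconnected or have been out of contact with the primary for too long. One detail worth knowing is that a replica priority of 0 marks a replica as never eligible for promotion.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    run_id: str
    priority: int     # replica-priority; lower is preferred, 0 = never promote
    repl_offset: int  # how much of the primary's stream was received


def pick_replica(replicas: List[Replica]) -> Optional[Replica]:
    """Pick by lowest priority, then highest offset, then smallest run ID."""
    candidates = [r for r in replicas if r.priority != 0]
    if not candidates:
        return None
    return min(candidates,
               key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica(run_id="aaa111", priority=100, repl_offset=500),
    Replica(run_id="bbb222", priority=100, repl_offset=900),
    Replica(run_id="ccc333", priority=0, repl_offset=9999),  # excluded
]
print(pick_replica(replicas).run_id)  # bbb222
```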

Step 5: Client Notification

Sentinel publishes a +switch-master notification. Sentinel-aware clients (most modern Redis libraries support this) automatically discover the new primary and reconnect. If you use HAProxy in front of Redis, it detects the change through health checks.
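The payload of a +switch-master message is a space-separated string containing the primary's name, the old address and port, and the new address and port. A client library typically handles this for you; the sketch below only shows what parsing that payload might look like. The SwitchMaster type and parse_switch_master function are names invented for this example.

```python
from typing import NamedTuple


class SwitchMaster(NamedTuple):
    name: str
    old_host: str
    old_port: int
    new_host: str
    new_port: int


def parse_switch_master(payload: str) -> SwitchMaster:
    """Parse '<name> <old-ip> <old-port> <new-ip> <new-port>'."""
    name, old_host, old_port, new_host, new_port = payload.split()
    return SwitchMaster(name, old_host, int(old_port), new_host, int(new_port))


event = parse_switch_master("mymaster 192.168.1.1 6379 192.168.1.2 6379")
print(event.new_host, event.new_port)  # 192.168.1.2 6379
```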

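The entire process takes 10-30 seconds with the settings used in this guide. The largest contributor to this delay is down-after-milliseconds, so a deployment that keeps Redis's built-in default of 30000ms will take correspondingly longer to react.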

Quorum and Why It Matters

The quorum is the minimum number of Sentinels that must agree a primary is down before failover begins. With 3 Sentinels, the quorum is typically set to 2.

This prevents a single isolated Sentinel from triggering an unnecessary failover. If a network partition isolates one Sentinel from the primary, that Sentinel cannot trigger a failover alone -- it needs others to confirm. Since the remaining Sentinels can still reach the primary, they will not agree, and no failover occurs.

Rules of thumb:

3 Sentinels, quorum of 2 -- the standard setup. Tolerates 1 Sentinel failure.
5 Sentinels, quorum of 3 -- for larger deployments. Tolerates 2 Sentinel failures.
Always use an odd number of Sentinels to avoid ties in voting.
Never set quorum to 1 in production -- it defeats the purpose of distributed consensus.

Note that quorum only governs the ODOWN decision. The actual failover execution also requires a majority of Sentinels to authorize it. With 3 Sentinels, you need at least 2 available to perform a failover. With 5, you need at least 3.
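The two conditions (quorum for ODOWN, majority for authorization) can be checked together with a little arithmetic. The following is a hedged sketch that assumes every reachable Sentinel also sees the primary as down; failover_possible is a hypothetical helper.

```python
def failover_possible(total_sentinels: int, reachable_sentinels: int,
                      quorum: int) -> bool:
    """True if enough Sentinels remain to both declare ODOWN and authorize."""
    majority = total_sentinels // 2 + 1
    return reachable_sentinels >= quorum and reachable_sentinels >= majority


print(failover_possible(3, 2, quorum=2))  # True: 2 of 3 is a majority
print(failover_possible(3, 1, quorum=1))  # False: quorum met, majority not
print(failover_possible(5, 3, quorum=3))  # True
print(failover_possible(5, 2, quorum=2))  # False: 2 of 5 is not a majority
```

The second case shows why a quorum of 1 does not let a lone Sentinel fail over on its own: it can declare ODOWN, but it still cannot gather a majority.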

Key Configuration Parameters

Sentinel configuration is straightforward. The most important settings are:

sentinel monitor mymaster 192.168.1.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1
sentinel auth-pass mymaster your-redis-password

What each one does:

sentinel monitor -- defines the primary to watch, its address, port, and the quorum count (2 in this example).
down-after-milliseconds -- how long a node must be unreachable before Sentinel considers it down. Lower values mean faster detection but increase false positives on unreliable networks. Redis's built-in default is 30000ms (30 seconds); 5000ms (5 seconds) is a sensible starting point for most deployments.
failover-timeout -- the time budget for a failover. If the failover does not finish within this window, it is aborted and can be retried. The example above sets 60 seconds; Redis's built-in default is 180000ms (3 minutes).
parallel-syncs -- how many replicas can resync with the new primary simultaneously after failover. Setting this to 1 means replicas sync one at a time, keeping most replicas available for reads during the transition.
auth-pass -- the password for authenticating with the monitored Redis instances.

You do not need to list replicas in the Sentinel configuration. Sentinel discovers them automatically by querying the primary's INFO replication output.
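For a sense of what that discovery involves, the sketch below parses a sample of the replication section of INFO and extracts the replica addresses. The sample text follows the usual shape of that output (one slaveN line per connected replica), but the addresses and offsets are made up, and parse_replicas is a hypothetical helper rather than anything Sentinel exposes.

```python
import re
from typing import List, Tuple

SAMPLE_INFO = """# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.2,port=6379,state=online,offset=1204,lag=0
slave1:ip=192.168.1.3,port=6379,state=online,offset=1204,lag=1
"""


def parse_replicas(info_text: str) -> List[Tuple[str, int]]:
    """Return (ip, port) for every slaveN line in INFO replication output."""
    replicas = []
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not re.fullmatch(r"slave\d+", key):
            continue  # skip headers and unrelated fields
        fields = dict(item.split("=", 1) for item in value.split(","))
        replicas.append((fields["ip"], int(fields["port"])))
    return replicas


print(parse_replicas(SAMPLE_INFO))
# [('192.168.1.2', 6379), ('192.168.1.3', 6379)]
```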

Common Pitfalls

Running Fewer Than 3 Sentinels

Two Sentinels cannot form a majority if one goes down, which means failover becomes impossible at the worst possible time. Always run at least 3.

Placing All Sentinels on the Same Server

If that server fails, you lose all monitoring and failover capability. Distribute Sentinels across separate machines -- ideally in different failure domains (different racks, availability zones, or data centers).

Ignoring Asynchronous Replication

Redis replication is asynchronous by default. When the primary fails, any writes acknowledged to the client but not yet replicated to a replica are lost. For most caching workloads, this is acceptable. For use cases where data loss must be kept to a minimum, configure min-replicas-to-write and min-replicas-max-lag so the primary rejects writes when too few replicas are connected and keeping up. This narrows the window of writes that can be lost; it does not make replication synchronous.
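The rule behind these two settings can be sketched in a few lines. This is a simplified model under stated assumptions, not Redis source code: lag is the number of seconds since each replica last acknowledged the primary, a value of 0 for min-replicas-to-write disables the check, and primary_accepts_writes is a hypothetical function name.

```python
from typing import List


def primary_accepts_writes(replica_lags_s: List[int],
                           min_replicas_to_write: int,
                           min_replicas_max_lag: int) -> bool:
    """True if enough replicas are within the allowed lag."""
    if min_replicas_to_write == 0:
        return True  # feature disabled
    healthy = sum(1 for lag in replica_lags_s if lag <= min_replicas_max_lag)
    return healthy >= min_replicas_to_write


# Require at least 1 replica that acknowledged within the last 10 seconds.
print(primary_accepts_writes([0, 1], 1, 10))    # True
print(primary_accepts_writes([15, 20], 1, 10))  # False: both too far behind
print(primary_accepts_writes([], 1, 10))        # False: no replicas connected
```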

Hardcoding the Primary Address in Your Application

If your application connects directly to a specific Redis IP, failover is useless -- your app will keep trying to reach the old primary. Use a Sentinel-aware client library that queries Sentinel for the current primary address, or place an HAProxy instance in front of Redis that routes to whichever node is currently primary.
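In practice you would rely on your client library's Sentinel support, but the underlying idea is simple: ask each known Sentinel in turn for the current primary and use the first answer. The sketch below illustrates that loop with an injected `ask` callable standing in for the network call (a real client would send SENTINEL get-master-addr-by-name to the Sentinel). The names resolve_primary and fake_ask, and the addresses, are assumptions made for this example.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def resolve_primary(sentinels: Iterable[Address], master_name: str,
                    ask: Callable[[Address, str], Optional[Address]]) -> Address:
    """Return the primary's address from the first Sentinel that answers."""
    for sentinel in sentinels:
        try:
            primary = ask(sentinel, master_name)
        except OSError:
            continue  # this Sentinel is unreachable; try the next one
        if primary is not None:
            return primary
    raise ConnectionError(f"no Sentinel could resolve {master_name!r}")


def fake_ask(sentinel: Address, master_name: str) -> Optional[Address]:
    """Stand-in for a real query: the first Sentinel is down."""
    if sentinel[0] == "192.168.1.1":
        raise ConnectionRefusedError("sentinel down")
    return ("192.168.1.2", 6379)


sentinels = [("192.168.1.1", 26379), ("192.168.1.2", 26379),
             ("192.168.1.3", 26379)]
print(resolve_primary(sentinels, "mymaster", fake_ask))
# ('192.168.1.2', 6379)
```

Listing all Sentinels, rather than one, is what keeps the application working when the server that hosted the old primary (and its Sentinel) is the one that failed.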

Setting down-after-milliseconds Too Low

Values under 2000ms can lead to false failovers during brief network hiccups or latency spikes on the Redis host (for example, a long-running command that blocks the server). The result is unnecessary failovers that disrupt clients and can cause data loss. Start with 5000ms and only lower it if your network is stable and you have measured the impact.

How sshploy Deploys Redis Sentinel

sshploy automates the full Redis Sentinel deployment across your own servers. You select your nodes, set a password, and sshploy runs Ansible playbooks that configure a Redis primary with replicas, deploy Sentinel processes on each node with quorum-based failover, and optionally set up HAProxy so your application connects to a single endpoint that always routes to the current primary. It handles internal DNS resolution between nodes, firewall rules that restrict Redis and Sentinel ports to cluster-internal traffic, and all the configuration details like down-after-milliseconds, failover-timeout, and min-replicas-to-write. The deployment runs over SSH with no agents or vendor lock-in -- you keep full access to your servers and configuration files.

FAQ

How many servers do I need for Redis Sentinel?

The minimum is 3 servers. Each server runs one Redis instance and one Sentinel process. This gives you 1 primary, 2 replicas, and 3 Sentinels -- enough for quorum-based failover that tolerates 1 server failure. You can scale up to more replicas for read capacity or additional Sentinels (5 or 7) to tolerate more simultaneous Sentinel failures.

Does Redis Sentinel support sharding (partitioning data across nodes)?

No. Sentinel provides high availability for a single dataset that is fully replicated across all nodes. Every replica holds a complete copy of the primary's data. If your dataset is too large for a single server's memory, you need Redis Cluster, which partitions data across multiple shards. See our Redis Sentinel vs Cluster guide for a detailed comparison.

What happens to my application during a failover?

During the 10-30 second failover window, writes to Redis will fail. Read requests to replicas may continue working if your client supports reading from replicas. Sentinel-aware clients (available in most languages -- Jedis, redis-py, ioredis, go-redis) automatically discover the new primary and reconnect. If you use HAProxy in front of Redis, it detects the new primary through health checks and your application does not need to handle the topology change.

Can I use Redis Sentinel with Redis 7?

Yes. Sentinel is included with every Redis distribution and works with all modern Redis versions, including Redis 7.x. There are no version-specific limitations for Sentinel. The same applies to Valkey, the open-source Redis fork -- Sentinel works identically.

Is Redis Sentinel the same as Redis Cluster?

No. They solve different problems. Sentinel provides high availability (automatic failover) for a single replicated dataset. Redis Cluster provides both high availability and data sharding across multiple nodes. Most applications that need Redis HA should start with Sentinel because it is simpler to operate and has no restrictions on multi-key commands, Lua scripts, or transactions. Consider Cluster only when your data exceeds single-server memory or you need distributed write throughput.