Implementing Delayed Jobs with Redis Sorted Sets

This guide builds a delayed-job queue from primitives, the hand-rolled version of the mechanism described in scheduled and delayed jobs and part of the broader Queue Fundamentals & Architecture collection. It is the right tool when your broker's native delay is too limited — most often AWS SQS, whose DelaySeconds caps at 15 minutes.

The concrete problem: you need to run a job at an arbitrary future time — 6 hours, 30 days, an exact calendar instant — without holding a worker thread idle for the duration and without losing the job if a process restarts. A Redis sorted set solves this cleanly: store each delayed job scored by its due timestamp, then poll for everything now due and atomically move it onto a ready list that ordinary workers consume.

Prerequisites

  • A reachable Redis instance and a client library (pip install redis; examples are Python but the commands are language-agnostic).
  • A ready/work queue your normal workers already consume — here a Redis list, jobs:ready, drained with BRPOP.
  • Understanding of why sleeping in a worker is wrong, and of clock-skew risks — both covered in the parent guide.
  • A serialization format for job payloads (JSON below; for larger payloads see optimizing JSON vs Protobuf for job payloads).

Step 1: Schedule a job with ZADD

Enqueue a delayed job by adding it to a sorted set with the absolute due time (Unix seconds) as the score. Store an absolute instant, never a relative offset, so the schedule survives restarts and clock corrections.

import json, time, uuid
import redis

r = redis.Redis()
DELAYED = "jobs:delayed"   # the sorted set
READY   = "jobs:ready"     # the list workers consume

def schedule(payload: dict, delay_seconds: float) -> str:
    job_id = str(uuid.uuid4())
    due = time.time() + delay_seconds
    member = json.dumps({"id": job_id, "payload": payload})
    r.zadd(DELAYED, {member: due})   # score = absolute due timestamp
    return job_id

# run a job in 6 hours — far beyond SQS's 15-minute limit
schedule({"task": "send_reminder", "user": 42}, delay_seconds=6 * 3600)

ZADD keeps the set ordered by score, so the jobs due soonest always sit at the front of the set and are the cheapest to find.
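The introduction also mentions firing at an exact calendar instant. A minimal sketch of producing the score for that case, assuming the caller passes a timezone-aware datetime; due_timestamp is a hypothetical helper and not part of the guide's own code:

```python
from datetime import datetime, timezone

def due_timestamp(when: datetime) -> float:
    """Convert a timezone-aware datetime to Unix seconds for use as a ZADD score."""
    if when.tzinfo is None:
        # A naive datetime would be read in the machine's local zone, which
        # makes the schedule depend on where the producer happens to run.
        raise ValueError("pass a timezone-aware datetime")
    return when.timestamp()

# Midnight UTC on 1 January 2030, expressed as a sorted-set score.
score = due_timestamp(datetime(2030, 1, 1, tzinfo=timezone.utc))
```

The returned value can stand in for time.time() + delay_seconds when building the ZADD mapping.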

Step 2: Find due jobs with ZRANGEBYSCORE

A job is due when its score is less than or equal to "now." ZRANGEBYSCORE from -inf to the current time returns exactly that set, ordered, and LIMIT lets you promote in bounded batches so one tick never tries to move a million jobs at once.

def due_jobs(limit: int = 100):
    now = time.time()
    return r.zrangebyscore(DELAYED, "-inf", now, start=0, num=limit)

Step 3: Promote atomically with a Lua script

The dangerous moment is between reading due jobs and removing them: if two poller instances both read the same job, or a crash lands between the move and the remove, you get duplicates or losses. The fix is to do find + push-to-ready + remove-from-delayed as one atomic Lua script. Redis runs it as an indivisible unit, so no other client can observe a half-completed promotion.

-- promote_due.lua
-- KEYS[1] = delayed sorted set
-- KEYS[2] = ready list
-- ARGV[1] = now (unix seconds)  -- prefer redis TIME; passed here for testability
-- ARGV[2] = max jobs to move this tick
local now   = tonumber(ARGV[1])
local limit = tonumber(ARGV[2])

local due = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', now, 'LIMIT', 0, limit)
for _, member in ipairs(due) do
  redis.call('LPUSH', KEYS[2], member)   -- make it runnable
  redis.call('ZREM', KEYS[1], member)    -- remove from delayed set
end
return #due   -- how many were promoted

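For skew-free timing, replace the passed now with redis.call('TIME') inside the script so every promotion is judged against the single Redis clock rather than each poller's local clock. TIME returns a two-element array of seconds and microseconds, so combine the two into one number before comparing it against scores.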
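A sketch of that variant, kept as a Python string so it can be registered the same way as promote_due.lua. It assumes Redis 5.0 or later, where a script may write after calling TIME; older servers would likely need redis.replicate_commands() first. PROMOTE_WITH_REDIS_CLOCK and load_promote_script are illustrative names, and the batch limit moves to ARGV[1] because now is no longer passed in.

```python
# Lua variant of the promotion script that reads the clock from Redis itself.
# TIME returns {seconds, microseconds}; the two parts are combined into one number.
PROMOTE_WITH_REDIS_CLOCK = """
local t     = redis.call('TIME')
local now   = tonumber(t[1]) + tonumber(t[2]) / 1000000
local limit = tonumber(ARGV[1])

local due = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', now, 'LIMIT', 0, limit)
for _, member in ipairs(due) do
  redis.call('LPUSH', KEYS[2], member)   -- make it runnable
  redis.call('ZREM', KEYS[1], member)    -- remove from delayed set
end
return #due
"""

def load_promote_script(client):
    """Register the script on a redis-py client and return the callable."""
    return client.register_script(PROMOTE_WITH_REDIS_CLOCK)
```

With this version the poller would call promote(keys=[DELAYED, READY], args=[batch]) and stop passing time.time().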

Step 4: Run the poller loop

The poller is a small, supervised process that calls the script on a fixed cadence. Match the interval to the precision you need; one second is plenty for most "remind me later" work and costs one cheap ranged read per tick.

with open("promote_due.lua") as f:
    promote = r.register_script(f.read())

def run_poller(interval: float = 1.0, batch: int = 500):
    while True:
        moved = promote(keys=[DELAYED, READY], args=[time.time(), batch])
        if moved == batch:
            continue          # backlog: loop again immediately, don't sleep
        time.sleep(interval)  # caught up: wait for the next tick

The "if we filled the batch, loop again immediately" detail drains a large backlog quickly after downtime instead of trickling it out one batch per second.

Step 5: Consume the ready list in a worker

Once promoted, a job is an ordinary ready item. Regular workers block on the ready list with BRPOP and execute — they never know or care that the job was delayed.

def run_worker():
    while True:
        _, raw = r.brpop(READY)            # blocks until a job is ready
        job = json.loads(raw)
        try:
            handle(job["payload"])         # handle() is your own job dispatcher
        except Exception:
            # route failures to a dead-letter list (add retries first if needed)
            r.lpush("jobs:dead", raw)

Verification

Functional: a job appears only after its delay. Schedule with a 2-second delay and assert the ready list is empty before and populated after.

import time
jid = schedule({"task": "noop"}, delay_seconds=2)
assert r.llen(READY) == 0                       # not yet due
promote(keys=[DELAYED, READY], args=[time.time(), 100])
assert r.llen(READY) == 0                       # still hidden
time.sleep(2.1)
promote(keys=[DELAYED, READY], args=[time.time(), 100])
assert r.llen(READY) == 1                       # now ready

Timing accuracy. Record actual_promote_time - due for a sample of jobs; this lateness should stay within roughly one poll interval. If it grows beyond that, the poller is falling behind and needs a shorter interval or a larger batch.
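A small sketch of summarizing those samples, assuming you collect (due, promoted_at) pairs yourself, for example by logging from the poller. lateness_report is a hypothetical helper; lateness is taken as promotion time minus due time, so positive values mean late.

```python
def lateness_report(samples):
    """Summarize lateness for (due, promoted_at) pairs given in Unix seconds.

    Returns the maximum and the nearest-rank 95th percentile lateness.
    """
    lateness = sorted(promoted_at - due for due, promoted_at in samples)
    if not lateness:
        raise ValueError("no samples")
    # Nearest-rank percentile: ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(lateness) + 99) // 100
    return {"max": lateness[-1], "p95": lateness[rank - 1]}
```

Comparing the p95 value against the poll interval gives a quick signal of whether the poller is keeping up.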

No duplicates under concurrent pollers. Run two poller processes against the same keys, schedule 1000 jobs, and assert LLEN jobs:ready == 1000 with no member appearing twice — the proof that the atomic Lua promotion holds.
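The duplicate check can be sketched as a pure function over the list contents. duplicate_members is a hypothetical helper; it assumes no worker is draining jobs:ready while the test runs, so the list still holds everything that was promoted.

```python
from collections import Counter

def duplicate_members(members):
    """Return each member that appears more than once, with its count."""
    counts = Counter(members)
    return {member: n for member, n in counts.items() if n > 1}
```

In the concurrency test, passing r.lrange("jobs:ready", 0, -1) should yield an empty dict alongside the LLEN check.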

Gotchas and edge cases

Poller crash leaves jobs stuck. The delayed set itself is durable — jobs survive a poller crash because they're persisted in Redis. The risk is liveness: if the only poller dies, due jobs silently never get promoted. Run the poller under a supervisor (systemd, Kubernetes), and emit a heartbeat metric plus an alert on "oldest due job age" so a stalled poller is caught fast.
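One way to compute the "oldest due job age" metric is to read the lowest-scored member of the delayed set. This is a sketch, assuming a redis-py client is passed in; oldest_due_age is a hypothetical name, and how the value reaches your metrics system is left open.

```python
import time

def oldest_due_age(client, delayed_key="jobs:delayed", now=None):
    """Seconds the earliest-due job is past its due time (0.0 if nothing is overdue)."""
    now = time.time() if now is None else now
    # ZRANGE 0 0 WITHSCORES returns the member with the lowest score, if any.
    head = client.zrange(delayed_key, 0, 0, withscores=True)
    if not head:
        return 0.0
    _member, due = head[0]
    return max(0.0, now - due)
```

A healthy poller keeps this value near the poll interval; a value that keeps climbing suggests promotion has stalled.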

Crash mid-promotion. Because promotion is a single Lua script, a crash either happens before it (job stays in delayed, retried next tick) or after it (job is in ready, gone from delayed) — never half-done. This is precisely why the multi-command move must not be split into separate client calls.

Duplicate enqueues and at-least-once handling. As written, the Step 5 worker is at-most-once: a worker that dies after BRPOP but before finishing loses the in-flight job. Adding a processing list with reclaim closes that gap and makes delivery at-least-once, which in turn means a reclaimed job can run twice. Make handlers idempotent; see preventing duplicate job execution with idempotency.
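One possible shape for the processing structure with reclaim, sketched with a second Redis list. It assumes Redis 6.2 or later for BLMOVE and LMOVE; the jobs:processing key and the work_one and reclaim helpers are hypothetical names, not part of the guide's code.

```python
import json

READY = "jobs:ready"
PROCESSING = "jobs:processing"   # hypothetical key for in-flight jobs

def work_one(client, handle, timeout=5):
    """Run one job via the processing list. Returns True if a job was handled."""
    # BLMOVE pops from the right of READY and pushes onto PROCESSING in one
    # step, so the job is never held only in worker memory.
    raw = client.blmove(READY, PROCESSING, timeout, src="RIGHT", dest="LEFT")
    if raw is None:
        return False                     # timed out, nothing was ready
    handle(json.loads(raw)["payload"])   # on a crash or exception, raw stays in PROCESSING
    client.lrem(PROCESSING, 1, raw)      # acknowledge only after success
    return True

def reclaim(client):
    """Push every job left in PROCESSING back onto READY; returns how many moved."""
    moved = 0
    while client.lmove(PROCESSING, READY, src="RIGHT", dest="RIGHT") is not None:
        moved += 1
    return moved
```

Reclaiming everything at once is only safe while no worker is mid-job, for example at startup of a single worker; with several workers you would likely need per-worker processing lists or a visibility timeout instead.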

Memory growth from far-future jobs. A sorted set holding millions of month-out jobs keeps every payload in memory for the whole waiting period. Monitor ZCARD jobs:delayed, and for very large horizons consider a database-backed store with a periodic sweep into Redis only for the near term.

Clock skew between producers and the poller. Producers compute due times from their local clocks. If those drift from Redis, jobs fire early or late. Run NTP everywhere and, where exactness matters, compute due times from redis TIME rather than the producer's clock.
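A sketch of computing due times from the Redis clock on the producer side, assuming a redis-py client, whose time() method returns a (seconds, microseconds) pair. redis_now and due_from_redis_clock are hypothetical helper names.

```python
def redis_now(client) -> float:
    """Current time according to the Redis server, as Unix seconds."""
    seconds, micros = client.time()   # redis-py returns (seconds, microseconds)
    return seconds + micros / 1_000_000

def due_from_redis_clock(client, delay_seconds: float) -> float:
    """Absolute due time measured against the Redis clock, not the local one."""
    return redis_now(client) + delay_seconds
```

This costs one extra round trip per scheduled job, so it is probably worth it only where the timing has to be exact.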

For recurring rather than one-shot timing, the sorted-set poller is the wrong tool — use a cron scheduler instead, as covered in cron-style scheduling with Celery Beat.

FAQ

Why a sorted set instead of one key per job with a TTL and keyspace notifications? Keyspace-notification expiry is best-effort and not delivered reliably under load or after a restart, so jobs can be missed. A sorted set is explicit and pollable: you always see exactly what is due, and promotion is atomic and recoverable.

How precise can the timing be? Precision is bounded by the poll interval. A 1-second interval yields roughly second-level accuracy. You can poll faster for tighter timing, but for jobs measured in minutes or hours, second-level jitter is irrelevant — don't over-poll.

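Can I cancel a scheduled job before it fires? Yes. ZREM the job from the delayed set before it is promoted. ZREM matches the exact serialized member, and schedule as written returns only the job id, so keep the member string as well (or a mapping from job id to member). Once promoted to the ready list, cancellation requires a separate check in the worker.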
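A sketch of both halves of cancellation, assuming the caller kept the serialized member alongside the job id. The jobs:cancelled set and the cancel and is_cancelled helpers are hypothetical; nothing here cleans up old ids from that set, which a real deployment would need to handle.

```python
DELAYED = "jobs:delayed"
CANCELLED = "jobs:cancelled"   # hypothetical set of cancelled job ids

def cancel(client, job_id: str, member: str) -> str:
    """Cancel a job. Returns "removed" if it was still delayed, else "flagged"."""
    if client.zrem(DELAYED, member) == 1:
        return "removed"               # never promoted, nothing more to do
    client.sadd(CANCELLED, job_id)     # already promoted: ask the worker to skip it
    return "flagged"

def is_cancelled(client, job_id: str) -> bool:
    """Worker-side check to run before executing a promoted job."""
    return bool(client.sismember(CANCELLED, job_id))
```

A worker would call is_cancelled(r, job["id"]) right after parsing the job and skip execution when it returns True. A flag set after the job has already run has no effect.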
