How do I autoscale Kubernetes pods based on Redis queue length using KEDA?

We have a background job processor running in Kubernetes that pulls tasks from a Redis Streams queue. Right now we manually set replica counts, which means we either waste resources during quiet periods or get backlogs during spikes.

I’ve heard KEDA (Kubernetes Event-Driven Autoscaling) can scale pods based on external metrics like queue length. But I’m not sure how to set it up.

Questions:

  1. How does KEDA differ from the built-in Horizontal Pod Autoscaler (HPA)?
  2. What does a basic KEDA ScaledObject look like for Redis Streams?
  3. Can KEDA scale to zero when the queue is empty?
  4. Any tips for avoiding thundering herd issues when scaling up rapidly?

Running Kubernetes 1.29 on EKS.


This is seed content posted by the DevForums team to help get our community started. Have a better answer or want to add context? Jump in!

KEDA is a fantastic fit for this — it’s basically purpose-built for scaling based on external event sources. Here’s a full breakdown.

1. KEDA vs. built-in HPA

The built-in Horizontal Pod Autoscaler only scales based on metrics already inside Kubernetes (CPU, memory, or custom metrics you’ve already piped into the metrics API). KEDA acts as a metrics adapter that connects external systems — Redis, Kafka, RabbitMQ, PostgreSQL, AWS SQS, and dozens more — directly to the HPA machinery. Under the hood, KEDA creates and manages an HPA for you.

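The killer feature: KEDA can scale to zero and back. The standard HPA cannot go below 1 replica unless the alpha HPAScaleToZero feature gate is enabled, and managed control planes such as EKS generally do not let you turn alpha gates on.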

2. Setup

First, install KEDA on your EKS cluster:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace

Then create a ScaledObject that targets your Deployment:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: job-processor-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: job-processor          # your Deployment name
  minReplicaCount: 0              # scale to zero when idle
  maxReplicaCount: 20             # cap during spikes
  pollingInterval: 15             # check Redis every 15s
  cooldownPeriod: 120             # wait 2min after the last active trigger before scaling to zero
  triggers:
    - type: redis-streams
      metadata:
        address: redis.default.svc.cluster.local:6379
        stream: task-queue          # your Redis Stream name
        consumerGroup: processors   # your consumer group
        pendingEntriesCount: "5"    # target of 5 pending entries per replica

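pendingEntriesCount is the target metric: KEDA tries to keep roughly 5 pending entries per pod, so 50 pending entries scale the Deployment to 10 pods. Bear in mind that "pending" in Redis Streams means entries already delivered to a consumer in the group but not yet acknowledged (the group’s pending entries list). With zero replicas nothing is reading, so that count may stay at 0 even as new messages arrive. If you want scale-to-zero, check the scaler’s lagCount (Redis 7+) or streamLength settings as the target instead.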
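To make the arithmetic concrete, here is a rough sketch of how a per-pod target turns into a replica count. The function name and defaults are my own for illustration; the real HPA also applies a tolerance band and stabilisation windows, which this ignores.

```python
import math


def desired_replicas(pending: int, target_per_pod: int,
                     min_replicas: int = 0, max_replicas: int = 20) -> int:
    """Approximate the replica count for an average-value target metric."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    # Ceiling division: any remainder needs one more pod.
    wanted = math.ceil(pending / target_per_pod)
    # Clamp to the minReplicaCount / maxReplicaCount bounds.
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    for pending in (0, 3, 50, 500):
        print(pending, "->", desired_replicas(pending, target_per_pod=5))
```

With the values from the ScaledObject above, 50 pending entries give 10 pods and anything past 100 is capped at maxReplicaCount.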

3. Scale-to-zero behaviour

Yes, KEDA can scale to zero. When minReplicaCount: 0 and the queue is empty for the duration of cooldownPeriod, KEDA removes all pods. When new messages arrive, KEDA detects them during the next pollingInterval and spins up pods.

One thing to know: scale-from-zero latency includes pod scheduling + container startup + your app’s boot time. For jobs where a 15-30 second delay on the first message is acceptable, this is great. If you need sub-second response, keep minReplicaCount: 1.
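If you want to sanity-check whether scale-to-zero fits your latency budget, a back-of-the-envelope calculation like this can help. Only the polling interval comes from the config above; the other three numbers are placeholders I made up, so swap in your own measurements.

```python
def worst_case_first_message_delay(polling_interval_s: float,
                                   scheduling_s: float,
                                   container_start_s: float,
                                   app_boot_s: float) -> float:
    """Rough upper bound on the wait for the first message after scaling from zero."""
    # A message that lands just after a poll waits almost a full interval
    # before KEDA notices it, so the polling interval counts in full.
    return polling_interval_s + scheduling_s + container_start_s + app_boot_s


if __name__ == "__main__":
    # pollingInterval=15 is from the ScaledObject; the rest are hypothetical.
    delay = worst_case_first_message_delay(
        polling_interval_s=15, scheduling_s=2, container_start_s=5, app_boot_s=8)
    print(f"worst-case delay: {delay:.0f}s")
```

Lowering pollingInterval shrinks the first term, at the cost of more frequent calls to Redis.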

4. Avoiding thundering herd

A few strategies that work well in production:

  • Set maxReplicaCount conservatively — better to process a bit slower than overwhelm your database or downstream services.
  • Use advanced.horizontalPodAutoscalerConfig to tune HPA behaviour:

spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 30
          policies:
            - type: Pods
              value: 3
              periodSeconds: 60

This limits the HPA that KEDA manages to adding at most 3 pods per 60 seconds, with a 30-second stabilisation window, so the Deployment cannot jump from 0 to 20 pods at once.
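To get a feel for what that policy means during a spike, here is a simplified simulation of the ramp. It assumes the first step happens immediately and ignores the stabilisation window and controller sync timing, so treat the numbers as a rough guide. It starts from 1 replica because KEDA itself handles the 0-to-1 activation and the HPA takes over from there.

```python
def scale_up_timeline(start: int, target: int, pods_per_period: int = 3,
                      period_seconds: int = 60) -> list[tuple[int, int]]:
    """Return (elapsed_seconds, replicas) steps under a Pods-type scale-up policy."""
    if pods_per_period <= 0:
        raise ValueError("pods_per_period must be positive")
    timeline = []
    elapsed, replicas = 0, start
    while replicas < target:
        # Each period may add at most pods_per_period replicas.
        replicas = min(target, replicas + pods_per_period)
        timeline.append((elapsed, replicas))
        elapsed += period_seconds
    return timeline


if __name__ == "__main__":
    for elapsed, replicas in scale_up_timeline(start=1, target=20):
        print(f"t={elapsed:>3}s  replicas={replicas}")
```

Under these assumptions, going from 1 to 20 pods takes seven steps spread over about six minutes, which is the trade-off to weigh against how fast your backlog grows.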

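  • Have each worker start reading from the stream only after it has finished initialising. Consumers pull their own work, so Kubernetes probes do not gate message delivery; a startup probe is still useful to stop slow-booting pods from being restarted by the liveness probe.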
  • Add a rate limiter in your consumer code so each pod processes at a controlled rate regardless of how many messages are pending.
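For the rate limiter, a token bucket is one common option. This is a minimal single-process sketch using only the standard library; the class and its parameters are my own naming, not part of KEDA or any Redis client.

```python
import time


class TokenBucket:
    """Allow about `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable so the bucket can be tested
        self._tokens = capacity      # start full to allow an initial burst
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of blocking."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False

    def acquire(self, sleep=time.sleep) -> None:
        """Block until a token is available."""
        while not self.try_acquire():
            # Sleep roughly long enough for one token to accumulate.
            sleep(max(0.0, (1 - self._tokens) / self.rate))
```

In a worker you would call bucket.acquire() before handling each message read from the stream. Since every pod gets its own bucket, the total load on downstream services is roughly the per-pod rate times the replica count, so pick the rate with maxReplicaCount in mind.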

We’ve been running this pattern in production for about a year with Redis Streams and it’s been rock solid. Happy to answer follow-up questions about monitoring or alerting around this setup.
