Priority Queues, Fairness, and Backpressure

Once a job system becomes shared infrastructure, scheduling policy becomes a product feature. Without fairness, one tenant or job class can starve everyone else. Without backpressure, producers keep enqueueing through a temporary dependency slowdown, and the resulting backlog becomes latency debt that outlasts the slowdown itself. Priority is useful, but priority without quotas becomes a denial-of-service mechanism.

The Scheduling Problem

Workers are finite. Jobs are not equal:

  • Some jobs are user-visible.
  • Some jobs are batch maintenance.
  • Some tenants pay for stronger guarantees.
  • Some downstreams have strict quotas.
  • Some retries should wait behind fresh work.

The scheduler decides who gets scarce execution slots.

Priority Is Not Enough

Under strict priority, if tenant A never stops enqueueing high-priority jobs, every lower-priority job starves, no matter how long it has waited. Priority must be combined with fairness or admission control.

Common Policies

| Policy | How it works | Risk |
| --- | --- | --- |
| FIFO | Oldest job first | Urgent work waits behind bulk work |
| Strict priority | Highest priority first | Starvation |
| Weighted fair queueing | Each class gets a share | More scheduler complexity |
| Deficit round robin | Classes spend credits by job cost | Requires cost estimates |
| Earliest deadline first | Closest deadline first | Bad estimates cause thrashing |
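
To make the deficit round robin row concrete, here is a minimal sketch. The `JobClass` type, the `quantum` values, and the job costs are hypothetical and chosen only for illustration. Each class earns `quantum` credits per round and may run jobs while it has enough credit to cover their estimated cost.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class JobClass:
    name: str
    quantum: int                 # credits granted per round (hypothetical cost units)
    deficit: int = 0             # unspent credits carried between rounds
    queue: deque = field(default_factory=deque)  # items are (job_id, estimated_cost)


def drr_schedule(classes, rounds):
    """Return job ids in the order a deficit-round-robin pass would run them."""
    order = []
    for _ in range(rounds):
        for cls in classes:
            if not cls.queue:
                cls.deficit = 0  # idle classes do not bank credit
                continue
            cls.deficit += cls.quantum
            # Run jobs from the head of the queue while credit covers their cost.
            while cls.queue and cls.queue[0][1] <= cls.deficit:
                job_id, cost = cls.queue.popleft()
                cls.deficit -= cost
                order.append(job_id)
            if not cls.queue:
                cls.deficit = 0
    return order


interactive = JobClass("interactive", quantum=2,
                       queue=deque([("i1", 1), ("i2", 1), ("i3", 1)]))
batch = JobClass("batch", quantum=1, queue=deque([("b1", 3), ("b2", 1)]))
order = drr_schedule([interactive, batch], rounds=4)
print(order)
```

The expensive batch job `b1` has to accumulate three rounds of credit before it runs, but it does run: cheap interactive jobs cannot hold it back indefinitely. The quality of the result depends on the cost estimates, which is the risk the table names.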

Tenant Fairness

Use per-tenant budgets, and fold budget overruns and downstream pressure into each job's score:

```text
effective_score =
  priority_weight
  + age_boost
  - tenant_over_budget_penalty
  - downstream_pressure_penalty
```

Fairness does not mean equal throughput. It means no tenant can consume unbounded shared capacity without policy approval.
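
One possible way to compute the score above is sketched below. The `Job` type, the share-based budgets, and the two weight parameters are assumptions made for this example; the formula itself does not prescribe how the penalties are derived.

```python
from dataclasses import dataclass


@dataclass
class Job:
    tenant: str
    priority_weight: float
    age_boost: float = 0.0


def effective_score(job, usage_share, budget_share, downstream_pressure,
                    over_budget_weight=200.0, pressure_weight=50.0):
    """Score a job; higher scores run first.

    usage_share / budget_share: tenant -> fraction of worker time (0..1).
    downstream_pressure: 0..1 signal for the job's dependency (assumed scale).
    The weights are illustrative tuning knobs, not recommended values.
    """
    overage = max(0.0, usage_share.get(job.tenant, 0.0)
                  - budget_share.get(job.tenant, 0.0))
    tenant_over_budget_penalty = over_budget_weight * overage
    downstream_pressure_penalty = pressure_weight * downstream_pressure
    return (job.priority_weight
            + job.age_boost
            - tenant_over_budget_penalty
            - downstream_pressure_penalty)


usage = {"tenant_a": 0.60, "tenant_b": 0.10}
budget = {"tenant_a": 0.25, "tenant_b": 0.25}

noisy = Job("tenant_a", priority_weight=100.0)   # high priority, far over budget
quiet = Job("tenant_b", priority_weight=50.0)    # lower priority, within budget

score_noisy = effective_score(noisy, usage, budget, downstream_pressure=0.0)
score_quiet = effective_score(quiet, usage, budget, downstream_pressure=0.0)
print(score_noisy, score_quiet)
```

With these numbers the over-budget tenant's high-priority job scores below the in-budget tenant's normal job, which is the intended effect: priority still matters, but it no longer buys unbounded capacity.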

Aging

Aging raises priority as a job waits:

```text
age_boost = min(max_boost, floor(wait_seconds / aging_interval) * boost_step)
```

This prevents low-priority jobs from waiting forever while preserving preference for urgent work.
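
The aging formula translates directly into code. The default values below (a 60-second interval, a step of 1, a cap of 10) are placeholders for illustration, not recommendations.

```python
import math


def age_boost(wait_seconds, aging_interval=60.0, boost_step=1.0, max_boost=10.0):
    """Boost grows by boost_step per full aging_interval waited, capped at max_boost."""
    if aging_interval <= 0:
        raise ValueError("aging_interval must be positive")
    steps = math.floor(max(0.0, wait_seconds) / aging_interval)
    return min(max_boost, steps * boost_step)


for waited in (59, 60, 125, 100_000):
    print(waited, age_boost(waited))
```

The cap matters: without `max_boost`, a sufficiently old batch job would eventually outrank every fresh urgent job, which turns aging into a slow form of FIFO.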

Backpressure Signals

| Signal | Producer response |
| --- | --- |
| Queue age exceeds SLO | Slow or reject new background work |
| Downstream 429 rate increases | Reduce concurrency for that integration |
| DB latency spikes | Pause DB-heavy job types |
| Worker error rate spikes | Stop retry amplification |
| Tenant exceeds quota | Defer or reject tenant jobs |

Backpressure should affect enqueueing, not just workers. Otherwise the system accepts work it cannot finish.
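
As a sketch of backpressure applied at enqueue time, the function below acts on the queue-age signal. The `Decision` enum, the `is_background` flag, and the `reject_factor` threshold are hypothetical names introduced for this example.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    DEFER = "defer"      # e.g. ask the producer to retry later
    REJECT = "reject"


def enqueue_decision(is_background, oldest_job_age_s, queue_age_slo_s,
                     reject_factor=2.0):
    """Decide at enqueue time whether to take new work.

    Background work is deferred once queue age exceeds the SLO and rejected
    once it exceeds reject_factor times the SLO. Non-background work is always
    accepted here; a separate admission check can handle it.
    """
    if not is_background or oldest_job_age_s <= queue_age_slo_s:
        return Decision.ACCEPT
    if oldest_job_age_s <= reject_factor * queue_age_slo_s:
        return Decision.DEFER
    return Decision.REJECT


print(enqueue_decision(True, oldest_job_age_s=90, queue_age_slo_s=60))
```

The important property is where the check runs: in the producer's enqueue path, before the job is persisted, so the queue never holds work the system has already decided it cannot serve in time.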

Admission Control

For user-facing operations, rejecting early is often better than accepting a job that completes hours late.
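
A simple admission check can compare an estimated completion time against the caller's deadline. This sketch assumes the queue's outstanding work is tracked in estimated seconds and spread evenly across workers; both are simplifications, and all names are hypothetical.

```python
def estimated_completion_s(queued_cost_s, workers, job_cost_s):
    """Rough estimate: time to drain the work ahead of us, plus our own runtime."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return queued_cost_s / workers + job_cost_s


def should_admit(queued_cost_s, workers, job_cost_s, deadline_s):
    """Admit only if the job is expected to finish within its deadline."""
    return estimated_completion_s(queued_cost_s, workers, job_cost_s) <= deadline_s


# 600 s of queued work across 10 workers: about 65 s to completion.
print(should_admit(600, workers=10, job_cost_s=5, deadline_s=120))
# 36,000 s of queued work: about an hour, so reject immediately.
print(should_admit(36_000, workers=10, job_cost_s=5, deadline_s=120))
```

A rejected request gives the caller a fast, actionable failure, which it can retry or surface to the user, instead of a job that silently completes long after anyone cares about the result.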

Cost-Aware Scheduling

A one-minute video transcode and a 50 ms email job should not cost the same number of scheduling tokens.

Track estimated cost:

  • CPU seconds.
  • Memory footprint.
  • DB queries.
  • External API calls.
  • GPU seconds.
  • Expected runtime.

Use actual execution metrics to correct estimates over time.
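
One common way to correct estimates from observed runtimes is an exponentially weighted moving average. The sketch below tracks a single cost dimension (runtime in seconds) per job type; the `CostEstimator` class and its parameters are illustrative assumptions.

```python
class CostEstimator:
    """Per-job-type runtime estimate, corrected with an EWMA of observed runtimes."""

    def __init__(self, default_cost_s=1.0, alpha=0.2):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.default_cost_s = default_cost_s
        self.alpha = alpha          # weight given to the newest observation
        self._estimates = {}

    def estimate(self, job_type):
        return self._estimates.get(job_type, self.default_cost_s)

    def observe(self, job_type, actual_cost_s):
        previous = self._estimates.get(job_type)
        if previous is None:
            self._estimates[job_type] = actual_cost_s  # first sample seeds the estimate
        else:
            self._estimates[job_type] = (
                (1 - self.alpha) * previous + self.alpha * actual_cost_s
            )


estimator = CostEstimator(default_cost_s=1.0, alpha=0.5)
estimator.observe("video_transcode", 60.0)
estimator.observe("video_transcode", 40.0)
print(estimator.estimate("video_transcode"), estimator.estimate("email"))
```

A smaller `alpha` makes estimates more stable but slower to react when a job type's cost genuinely changes, for example after a deploy.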

Failure Modes

| Failure | Symptom | Mitigation |
| --- | --- | --- |
| Starvation | Old low-priority jobs never run | Aging and minimum shares |
| Priority inversion | Low-priority job holds scarce dependency | Dependency-specific concurrency pools |
| Retry amplification | Failed jobs dominate workers | Retry queue with lower priority |
| Noisy tenant | One tenant fills queue | Per-tenant quotas |
| Scheduler hot loop | Repeatedly scans unrunnable jobs | Partition by runnable time and class |
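
As an illustration of dependency-specific concurrency pools, the sketch below caps how many jobs may use a given dependency at once, using one semaphore per dependency. The `DependencyPools` class and the dependency name are hypothetical. A job that cannot get a slot is expected to be requeued rather than block a worker.

```python
import threading
from contextlib import contextmanager


class DependencyPools:
    """One bounded semaphore per scarce dependency."""

    def __init__(self, limits):
        # limits: dependency name -> maximum concurrent jobs
        self._slots = {name: threading.BoundedSemaphore(n)
                       for name, n in limits.items()}

    @contextmanager
    def slot(self, dependency):
        """Yield True if a slot was acquired, False if the pool is full."""
        semaphore = self._slots[dependency]
        acquired = semaphore.acquire(blocking=False)  # never block the worker
        try:
            yield acquired
        finally:
            if acquired:
                semaphore.release()


pools = DependencyPools({"billing_api": 1})

with pools.slot("billing_api") as first:
    with pools.slot("billing_api") as second:
        print(first, second)  # the second caller should requeue its job
```

Because the limit is enforced per dependency rather than per worker pool, low-priority jobs can be given a smaller pool (or a share of one) so they cannot occupy every slot a high-priority job needs.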

Operational Metrics

  • Queue age by priority and tenant.
  • Share of worker time by tenant.
  • Jobs rejected/deferred by admission control.
  • Starvation count.
  • Priority inversion incidents.
  • Downstream-pressure throttles.
  • Scheduler decision latency.
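
Two of these metrics are simple to derive from execution records. The sketch below assumes records are available as `(tenant, runtime_seconds)` pairs and that "starved" means waiting longer than a chosen threshold; both are assumptions for illustration.

```python
from collections import defaultdict


def worker_time_share(executions):
    """executions: iterable of (tenant, runtime_seconds). Returns tenant -> share."""
    totals = defaultdict(float)
    for tenant, runtime_s in executions:
        totals[tenant] += runtime_s
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {tenant: total / grand_total for tenant, total in totals.items()}


def starvation_count(wait_ages_s, threshold_s):
    """Number of queued jobs that have waited longer than threshold_s."""
    return sum(1 for age in wait_ages_s if age > threshold_s)


shares = worker_time_share([("a", 30), ("b", 10), ("a", 60)])
starved = starvation_count([10, 400, 900], threshold_s=300)
print(shares, starved)
```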

A practical reference for distributed system design. Released under the MIT License.