Scheduling Tasks: Cron, Queues, and Background Jobs
- ShiftQuality Contributor
- Jun 16, 2025
Not every piece of work should happen while a user waits. Sending an email, generating a report, processing an upload, syncing data with a third party — these tasks are too slow, too unreliable, or too resource-intensive to run inline with a user's request.
The user clicks "Generate Report." If you generate the 50-page PDF synchronously, they stare at a spinner for 30 seconds. If you queue the job and notify them when it's done, they continue working while the report generates in the background.
Background work falls into three categories: things that run on a schedule, things triggered by events, and things that are too slow for a request. Each has a different solution.
Scheduled Tasks: Cron
Cron is the Unix tool for running tasks on a schedule. It's been doing this since 1975 and it works.
# Run a backup every day at 2 AM
0 2 * * * /scripts/backup.sh
# Run a report every Monday at 8 AM
0 8 * * 1 /scripts/weekly-report.sh
# Clean up temp files every hour
0 * * * * /scripts/cleanup.sh
The five fields, in order, are minute, hour, day of month, month, and day of week (0 is Sunday, so 1 is Monday). An asterisk (*) means "every."
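To make the field order concrete, here is a minimal Python sketch that names the five fields and checks whether a given time matches an expression. The function names `parse_cron` and `matches` are made up for this illustration, and the matcher is deliberately simplified: it understands only `*` and single integers, not ranges, lists, or steps, and it does not reproduce cron's special handling when both day fields are restricted.

```python
from datetime import datetime

FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expr: str) -> dict[str, str]:
    """Split a five-field cron expression into named fields."""
    parts = expr.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


def matches(expr: str, when: datetime) -> bool:
    """Return True if `when` satisfies the expression.

    Simplified: each field must be '*' or a single integer.
    """
    fields = parse_cron(expr)
    actual = {
        "minute": when.minute,
        "hour": when.hour,
        "day_of_month": when.day,
        "month": when.month,
        # cron counts Sunday as 0; isoweekday() counts Sunday as 7
        "day_of_week": when.isoweekday() % 7,
    }
    return all(
        spec == "*" or int(spec) == actual[name]
        for name, spec in fields.items()
    )
```

For example, `matches("0 8 * * 1", datetime(2025, 6, 16, 8, 0))` is true because June 16, 2025 is a Monday, while the same expression does not match 8 AM on the following Tuesday.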
When to Use Cron
Database backups on a schedule
Nightly data aggregation or reporting
Periodic cleanup (temp files, expired sessions, stale data)
Regular data syncs with external systems
Health checks and monitoring scripts
Cron's Limitations
No retry on failure. If the job fails, cron doesn't retry. It runs again at the next scheduled time. For critical jobs, you need monitoring to detect failures and alerting to notify you.
No overlap protection. If a job takes 90 minutes and runs every hour, two instances will run simultaneously. This can cause data corruption or resource contention. Use a lock file or a tool like flock to prevent overlap:
# Only run if no other instance is running
0 * * * * flock -n /tmp/myjob.lock /scripts/myjob.sh
No visibility. Cron doesn't have a dashboard. You find out a job failed by checking logs (if you set up logging) or noticing the side effects (the backup didn't happen, the report is missing).
Server-dependent. Cron jobs run on a specific server. If that server goes down, the jobs don't run. For critical scheduled tasks, consider a managed scheduler.
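Some of these gaps can be narrowed with a small wrapper that cron calls instead of the script itself. The sketch below is one possible approach, not a standard tool: `run_exclusive` is a hypothetical name, it takes a non-blocking file lock in the same spirit as `flock -n`, and it logs start, success, and failure so there is at least a record to check. It relies on `fcntl`, so it assumes a Unix-like system.

```python
import fcntl
import logging
import subprocess

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("cron-wrapper")


def run_exclusive(command: list[str], lock_path: str) -> int:
    """Run `command` unless another run holds the lock; return an exit code."""
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail at once if it is already held
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            log.warning("skipped: previous run still holds %s", lock_path)
            return 0
        log.info("started: %s", command)
        result = subprocess.run(command)
        if result.returncode != 0:
            log.error("failed with exit code %d: %s", result.returncode, command)
        else:
            log.info("finished: %s", command)
        # The lock is released when the file is closed on leaving the block
        return result.returncode
```

A small script could call `run_exclusive(["/scripts/myjob.sh"], "/tmp/myjob.lock")` and exit with the returned code. The wrapper still does not retry or alert; it only makes failures and skipped runs visible in the log.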
Modern Alternatives to Cron
Cloud schedulers: Amazon EventBridge Scheduler, Google Cloud Scheduler, Azure Functions timer triggers. These run scheduled tasks in the cloud without a server for you to manage.
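Application-level schedulers: Celery Beat (Python), Hangfire (.NET), node-cron (Node.js). These run within your application. Depending on the tool, they add retry logic, monitoring, or a dashboard that bare cron lacks. Hangfire, for example, includes a dashboard, while node-cron is a bare in-process scheduler.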
Task Queues: Process Work Asynchronously
A task queue decouples "requesting work" from "doing work." Your web application puts a message on a queue. A separate worker process picks it up and does the work. The user doesn't wait.
Web App                    Queue                      Worker
   |                         |                          |
   |-- "Send welcome email"->|                          |
   |   (returns immediately) |                          |
   |                         |-- "Send welcome email"-->|
   |                         |                          |-- Send the email
   |                         |                          |-- Done
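The same flow can be sketched in a single Python process using the standard library's `queue.Queue` and a thread standing in for the worker. This is only an illustration of the decoupling: a real deployment would use a separate worker process and a durable queue such as Redis or SQS, and the names `enqueue_welcome_email`, `worker`, and `sent` are invented for the example.

```python
import queue
import threading

jobs: queue.Queue = queue.Queue()
sent: list[str] = []  # records what the worker "sent"


def enqueue_welcome_email(address: str) -> str:
    """Called by the web app: put the job on the queue and return right away."""
    jobs.put({"type": "welcome_email", "to": address})
    return "queued"


def worker() -> None:
    """Runs separately: take jobs off the queue and do the slow work."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: stop the worker
            jobs.task_done()
            break
        sent.append(job["to"])  # stand-in for actually sending the email
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

status = enqueue_welcome_email("new.user@example.com")  # returns immediately

jobs.join()     # demo only: wait until the worker has handled the job
jobs.put(None)  # ask the worker to stop
thread.join()
```

The request handler only ever calls `enqueue_welcome_email`, so its response time does not depend on how long the email takes to send.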
Common Queue Systems
Redis + BullMQ (Node.js): Redis as the queue backend, BullMQ for the API. Simple, fast, good for most applications.
Celery + Redis/RabbitMQ (Python): The standard Python task queue. Powerful, well-documented, handles complex workflows.
Hangfire (.NET): Background jobs for .NET applications. Dashboard included. Supports delayed, recurring, and fire-and-forget jobs.
Amazon SQS / Google Cloud Tasks: Managed queue services. No infrastructure to maintain. Pay per message.
What Goes in a Queue
Email sending. Email services can be slow (1-3 seconds). Don't make the user wait.
Image/video processing. Resizing, converting, generating thumbnails — CPU-intensive work that doesn't belong in a request.
Third-party API calls. External APIs are unreliable. Queue the call, retry on failure.
Report generation. Anything that takes more than a few seconds.
Webhook delivery. When your system needs to notify external systems.
Queue Design Principles
Idempotency. Processing the same job twice should produce the same result as processing it once. Most queue systems guarantee at-least-once delivery, so a job can be delivered twice, for example if the worker crashes after processing but before acknowledging. Design for this.
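One common way to get idempotency is to give every job a unique ID and remember which IDs have already been applied. The sketch below assumes a hypothetical `credit_account` job; the in-memory set and dict stand in for a durable store, where the check and the update would need to happen in one transaction.

```python
processed_ids: set[str] = set()  # stand-in for a durable store of handled job IDs
balances = {"alice": 0}          # stand-in for account data


def credit_account(job: dict) -> bool:
    """Apply a credit once per job ID; return False for a duplicate delivery."""
    if job["id"] in processed_ids:
        return False  # already applied, so a redelivery changes nothing
    balances[job["account"]] += job["amount"]
    processed_ids.add(job["id"])
    return True


job = {"id": "job-123", "account": "alice", "amount": 25}
first = credit_account(job)   # applies the credit
second = credit_account(job)  # duplicate delivery, ignored
```

After both calls the balance is 25, not 50, which is the property the queue's delivery guarantee requires.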
Retries with backoff. When a job fails, retry it — but not immediately. Use exponential backoff: wait 1 second, then 2, then 4, then 8. This prevents hammering a failing service.
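The 1, 2, 4, 8 schedule can be written as a small helper. This is a minimal sketch with made-up names (`backoff_delays`, `retry_with_backoff`); it catches every exception for brevity, whereas real code would usually retry only errors known to be transient.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 1.0, retries: int = 4) -> list[float]:
    """Delays that double each time: base, 2*base, 4*base, ..."""
    return [base * 2 ** n for n in range(retries)]


def retry_with_backoff(
    func: Callable[[], T],
    retries: int = 4,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func; after each failure wait longer before trying again."""
    for delay in backoff_delays(base, retries):
        try:
            return func()
        except Exception:
            sleep(delay)
    # Final attempt: if this fails too, the exception reaches the caller
    return func()
```

With the defaults, a job that keeps failing is attempted five times in total, with waits of 1, 2, 4, and 8 seconds between attempts. The `sleep` parameter exists so the waiting can be replaced in tests.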
Dead letter queues. Jobs that fail after all retries go to a dead letter queue instead of disappearing. You can inspect them, fix the issue, and replay them.
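Queue systems typically provide dead letter handling themselves, but the idea is easy to see in a toy version. In this sketch the two deques, the `MAX_ATTEMPTS` limit, and the function names are all assumptions made for the illustration.

```python
from collections import deque
from typing import Callable

MAX_ATTEMPTS = 3
main_queue: deque = deque()
dead_letters: deque = deque()


def enqueue(payload: str) -> None:
    main_queue.append({"payload": payload, "attempts": 0})


def process_all(handler: Callable[[str], None]) -> None:
    """Run every job; park a job in dead_letters once it has failed too often."""
    while main_queue:
        job = main_queue.popleft()
        try:
            handler(job["payload"])
        except Exception as exc:
            job["attempts"] += 1
            job["last_error"] = repr(exc)  # kept for later inspection
            if job["attempts"] >= MAX_ATTEMPTS:
                dead_letters.append(job)
            else:
                main_queue.append(job)  # retry later


def replay_dead_letters() -> int:
    """Move parked jobs back to the main queue once the cause is fixed."""
    count = len(dead_letters)
    while dead_letters:
        job = dead_letters.popleft()
        job["attempts"] = 0
        main_queue.append(job)
    return count
```

A job that fails three times ends up in `dead_letters` with its last error attached, rather than being retried forever or silently dropped.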
Monitoring. Track queue depth (how many jobs are waiting), processing time (how long jobs take), and failure rate (how often jobs fail). A growing queue depth means workers can't keep up. A rising failure rate means something's broken.
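As a rough sketch of what tracking these three numbers might look like, the class below accumulates per-job outcomes and reports them alongside the current queue depth. `QueueStats` and `snapshot` are hypothetical names; in practice these values would usually be sent to a metrics system rather than held in memory.

```python
from dataclasses import dataclass, field


@dataclass
class QueueStats:
    durations: list[float] = field(default_factory=list)
    succeeded: int = 0
    failed: int = 0

    def record(self, seconds: float, ok: bool) -> None:
        """Record one finished job: how long it took and whether it worked."""
        self.durations.append(seconds)
        if ok:
            self.succeeded += 1
        else:
            self.failed += 1

    @property
    def failure_rate(self) -> float:
        total = self.succeeded + self.failed
        return self.failed / total if total else 0.0

    @property
    def mean_duration(self) -> float:
        return sum(self.durations) / len(self.durations) if self.durations else 0.0


def snapshot(stats: QueueStats, waiting_jobs: int) -> dict:
    """Combine the three signals into one report."""
    return {
        "queue_depth": waiting_jobs,
        "mean_seconds": stats.mean_duration,
        "failure_rate": stats.failure_rate,
    }
```

Comparing successive snapshots is what reveals the trends described above: a depth that keeps growing, or a failure rate that keeps climbing.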
Background Workers
Workers are the processes that pull jobs from queues and execute them. They run separately from your web server — different processes, often different servers.
Scaling Workers
The beauty of queue-based architectures: scaling is straightforward. If the queue is backing up, add more workers. If the queue is empty, reduce workers. The web application doesn't need to change.
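A small in-process sketch shows why: the code that fills the queue is identical whether one worker or four drain it. Threads stand in for worker processes here, and `run_workers` is a hypothetical helper whose "work" is just squaring a number.

```python
import queue
import threading


def run_workers(jobs: queue.Queue, worker_count: int) -> list[int]:
    """Drain `jobs` with the given number of workers and collect the results."""
    results: list[int] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                item = jobs.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            with lock:
                results.append(item * item)  # stand-in for real work
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fill(n: int) -> queue.Queue:
    """The producer side: unchanged no matter how many workers run."""
    q: queue.Queue = queue.Queue()
    for i in range(n):
        q.put(i)
    return q
```

Calling `run_workers(fill(10), 1)` and `run_workers(fill(10), 4)` yields the same set of results; only the order and the elapsed time can differ.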
Worker Considerations
Keep workers simple. A worker should do one thing: pull a job, process it, acknowledge it. Complex worker logic with multiple responsibilities is hard to debug and maintain.
Handle shutdown gracefully. When a worker is stopped (deployment, scaling down), it should finish its current job before exiting — not abandon it mid-processing.
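A common way to do this is to have the stop signal set a flag that the worker checks only between jobs. The sketch below assumes the deployment tool stops workers with SIGTERM; `request_stop`, `install_signal_handlers`, and `run_worker` are names invented for the example.

```python
import signal
import threading
from collections import deque
from typing import Callable

stop_requested = threading.Event()


def request_stop(signum=None, frame=None) -> None:
    """Signal handler: note the request, but do not interrupt the current job."""
    stop_requested.set()


def install_signal_handlers() -> None:
    """Call once at worker startup, from the main thread."""
    signal.signal(signal.SIGTERM, request_stop)
    signal.signal(signal.SIGINT, request_stop)


def run_worker(jobs: deque, handler: Callable[[object], None]) -> int:
    """Process jobs until the queue is empty or a stop was requested."""
    done = 0
    # The flag is checked only between jobs, never in the middle of one
    while jobs and not stop_requested.is_set():
        job = jobs.popleft()
        handler(job)  # runs to completion once started
        done += 1
    return done
```

Jobs that were never started stay on the queue for another worker to pick up, so nothing is abandoned halfway.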
Log everything. Workers run without user visibility. When something goes wrong, logs are your only diagnostic tool. Log job start, completion, failure, and retry for every job.
Putting It Together
A typical application uses all three:
Cron for scheduled maintenance: nightly backups, weekly reports, hourly data syncs
Task queues for event-triggered background work: email sending, image processing, webhook delivery
Workers to process the queue and execute the jobs
[User Request] → [Web App] → [Queue] → [Worker] → [External Service]
                     ↓
          [Response: "Processing..."]
[Cron] → [Scheduled Script] → [Queue] → [Worker] → [Database/Report]
Key Takeaway
Work that's slow, unreliable, or scheduled shouldn't run during user requests. Cron handles scheduled tasks but lacks retry and visibility. Task queues (BullMQ, Celery, Hangfire, SQS) decouple requesting work from doing work, with retry logic and monitoring. Workers process queue jobs independently and scale horizontally. Design jobs to be idempotent, retry with backoff, and use dead letter queues for failed jobs. Monitor queue depth and failure rates. The pattern — queue the work, process it asynchronously, notify when done — applies to almost every application that grows beyond basic CRUD.


