Every sysadmin eventually learns that there are two kinds of people: those who make backups, and those who wish they had. But backups are useless without automation, and automation is useless without scheduling. These two skills form a feedback loop that underpins all of operations.
Cron takes its name from chronos, the Greek word for time. It first appeared in Version 7 Unix in 1979, and the basic syntax has not changed in over 45 years — a testament to how well it was designed (or how reluctant sysadmins are to learn something new).
The 3-2-1 backup rule was coined by photographer Peter Krogh in his 2005 book about digital asset management. It spread to IT because photographers, like sysadmins, deal with irreplaceable data and unreliable storage.
rsync was created in 1996 by Andrew Tridgell (who also co-created Samba). Its delta-transfer algorithm was part of his PhD thesis. Instead of copying entire files, rsync checksums blocks and only transfers the differences — turning a 10GB copy into a 50MB transfer.
systemd timers can replace cron entirely — and on some modern distributions, the default scheduled tasks (like log rotation and temporary file cleanup) already use timers instead of cron. The transition is happening, but slowly.
Cron is the workhorse of Linux automation. It runs in the background, checks its schedule every minute, and executes commands at the times you specify.
Every cron entry has five time fields followed by the command:
+------------------ minute (0-59)
| +---------------- hour (0-23)
| | +-------------- day of month (1-31)
| | | +------------ month (1-12)
| | | | +---------- day of week (0-7; 0 and 7 = Sunday)
| | | | |
* * * * *  command to execute
Special characters:
* — Every value (wildcard)
, — List of values (1,15 = 1st and 15th)
- — Range (1-5 = Monday through Friday)
/ — Step values (*/5 = every 5 units)
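Putting the fields and special characters together, a few illustrative entries (the script paths are placeholders):

```
*/15 * * * *    /usr/local/bin/poll.sh          # every 15 minutes
0 9-17 * * 1-5  /usr/local/bin/workday.sh       # on the hour, 9 AM-5 PM, weekdays only
30 2 1,15 * *   /usr/local/bin/twice-a-month.sh # 02:30 on the 1st and 15th
```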
Try It Now: Want to see what a cron expression translates to in plain English? If you have a browser open, go to crontab.guru and type */15 9-17 * * 1-5. It’s a lifesaver for checking your work before committing a schedule.
# Edit your own crontab (opens $EDITOR)
crontab -e

# List your current cron jobs
crontab -l

# Remove your entire crontab (DANGEROUS -- no confirmation!)
crontab -r

# Edit crontab for a specific user (requires root)
sudo crontab -u deploy -e

# List another user's crontab
sudo crontab -u deploy -l
Exam tip: crontab -r deletes ALL your cron jobs without asking. Many sysadmins have accidentally typed crontab -r when they meant crontab -e (the keys are adjacent). A safer habit is crontab -ri: the -i flag makes the removal prompt for confirmation first.
Beyond per-user crontabs, Linux provides system-wide cron locations:
/etc/crontab          # System crontab (includes a username field)
/etc/cron.d/          # Drop-in cron files (same format as /etc/crontab)
/etc/cron.hourly/     # Scripts run every hour
/etc/cron.daily/      # Scripts run once a day
/etc/cron.weekly/     # Scripts run once a week
/etc/cron.monthly/    # Scripts run once a month
The /etc/cron.d/ directory is the cleanest approach for system tasks — each application can drop in its own file without editing a shared crontab.
The cron.hourly/, cron.daily/, etc. directories contain executable scripts (not crontab-format lines). The exact time they run depends on the system — typically controlled by anacron or a cron entry in /etc/crontab.
Stop and think: If you put a script in /etc/cron.hourly/, how do you pass arguments to it? You can’t. Scripts in these directories are executed directly without arguments. If you need arguments, you must use /etc/crontab or a file in /etc/cron.d/.
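For comparison, a hypothetical drop-in file shows the extra username field and the arguments that the run-parts directories cannot provide:

```
# /etc/cron.d/log-cleanup  (hypothetical example; --days is a made-up flag)
# min hour dom mon dow  user  command
15    3    *   *   0    root  /usr/local/bin/cleanup.sh --days 30
```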
The four classic reasons a command works in your shell but silently fails under cron:
# 1. PATH is minimal in cron -- use full paths to commands
# 2. Environment variables from .bashrc are NOT loaded
# 3. The script is not executable (chmod +x)
# 4. The script uses relative paths that don't resolve in cron's working directory
Pro tip: Always test your cron command by running it manually first. Then add >> /var/log/myscript.log 2>&1 to capture any output when it runs via cron.
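Crontabs also accept variable assignments at the top of the file, which addresses the PATH and environment pitfalls directly. A sketch (the address and script path are placeholders):

```
# Give cron a sane PATH so commands resolve without absolute paths
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# cron mails any job output to this address (hypothetical)
MAILTO=admin@example.com

0 2 * * * /usr/local/bin/backup.sh >> /var/log/myscript.log 2>&1
```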
systemd timers schedule work with the OnCalendar directive, whose format is more expressive than cron:
# Every day at midnight
OnCalendar=daily
# Every Monday and Friday at 9 AM
OnCalendar=Mon,Fri *-*-* 09:00:00
# Every 15 minutes
OnCalendar=*:0/15
# First day of every month
OnCalendar=*-*-01 00:00:00
# Every weekday at 6 PM
OnCalendar=Mon..Fri *-*-* 18:00:00
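An OnCalendar expression lives in a timer unit, which activates a service unit of the same name. A minimal sketch (the unit names and script path are placeholders):

```
# /etc/systemd/system/backup.timer
[Unit]
Description=Nightly backup timer

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true        # catch up at next boot if the 02:00 run was missed

[Install]
WantedBy=timers.target

# /etc/systemd/system/backup.service
[Unit]
Description=Run the backup script

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
```

Enable it with sudo systemctl enable --now backup.timer, and verify the schedule with systemctl list-timers.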
Test your expressions with systemd-analyze calendar:
systemd-analyze calendar "Mon..Fri *-*-* 09:00:00"
# Original form: Mon..Fri *-*-* 09:00:00
# Normalized form: Mon..Fri *-*-* 09:00:00
# Next elapse: Mon 2025-01-13 09:00:00 UTC
# (in UTC) Mon 2025-01-13 09:00:00 UTC
# From now: 2 days left
Try It Now: Test this on your own system. Run systemd-analyze calendar "Fri *-*-13 00:00:00". When is the next time Friday the 13th happens? Systemd will calculate it instantly.
While cron handles recurring schedules, at is for one-off tasks: “run this command once at a specific time.”
# Install at (if not present)
sudo apt install -y at     # Debian/Ubuntu
sudo dnf install -y at     # RHEL/Rocky

# Enable the at daemon
sudo systemctl enable --now atd

# Schedule a task for 3 PM today
echo "/usr/local/bin/deploy.sh" | at 15:00

# Schedule for a specific date and time
echo "reboot" | at 02:00 AM December 25

# Schedule relative to now
echo "/usr/local/bin/cleanup.sh" | at now + 30 minutes
echo "/usr/local/bin/report.sh" | at now + 2 hours

# List pending at jobs
atq
# 3    Tue Jan 14 15:00:00 2025 a user
# 4    Thu Dec 25 02:00:00 2025 a user

# View the contents of a pending job
at -c 3

# Remove a pending job
atrm 3
When to use at vs cron: Use at for tasks you want to run exactly once — a scheduled reboot, a one-time data migration, or a reminder. Use cron for anything recurring.
Standard cron assumes the machine is always on. If a cron job is scheduled for 2 AM and the laptop is closed, the job is simply skipped. Anacron solves this.
Anacron does not run continuously. It checks timestamps at boot (and periodically) to determine whether a job is overdue, then runs it. This makes it ideal for laptops, desktops, and any machine with irregular uptime.
# /etc/anacrontab format: period(days)  delay(minutes)  job-identifier  command

# Run daily jobs, with a 5-minute delay after boot
1    5    daily-backup     /usr/local/bin/backup.sh

# Run weekly jobs, with a 10-minute delay
7    10   weekly-cleanup   /usr/local/bin/cleanup.sh

# Run monthly jobs, with a 15-minute delay
30   15   monthly-report   /usr/local/bin/report.sh
# Check when anacron last ran each job
ls -la /var/spool/anacron/
# -rw------- 1 root root 9 Jan 14 03:05 daily-backup
# The file content is a date stamp: 20250114

# Force anacron to run all overdue jobs now
sudo anacron -f -n
# -f = force (ignore timestamps)
# -n = now (don't wait for delay)

# Check /etc/anacrontab for syntax errors (does not run jobs)
sudo anacron -T
On most modern systems, the cron.daily/, cron.weekly/, and cron.monthly/ directories are actually triggered by anacron, not by cron itself. This ensures those maintenance tasks run even on machines with variable uptime.
A mid-size e-commerce company ran nightly backups of their PostgreSQL database to a network share. The cron job ran dutifully every night. Nagios showed green. The backup script exited with code 0. Life was good — for six months.
Then a developer accidentally ran a DROP TABLE on the production orders table. No problem, they thought, we have backups. The DBA went to restore and discovered that the backup script had been silently failing since a password rotation six months earlier. The pg_dump command returned an authentication error, wrote an empty file, and exited 0 because the script used pg_dump ... ; gzip instead of pg_dump ... && gzip — the gzip succeeded on the empty file, so the script exited cleanly. Six months of empty .sql.gz files, each about 20 bytes.
They recovered partial data from application logs and read replicas. They lost three months of historical order data.
The lessons:
A backup you never test is not a backup — it is a hope
Check backup file sizes — a 20-byte database dump is not a good sign
Use set -e in scripts or chain commands with &&, never ;
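The difference between ; and && is easy to demonstrate. A minimal sketch, using the false command to stand in for a failing pg_dump:

```shell
# With ';', the second command runs even though the first failed
false ; echo "ran anyway"

# With '&&', the second command is skipped after a failure
false && echo "this never prints"

# In pipelines, 'set -o pipefail' makes the pipeline report any stage's failure
set -o pipefail
if ! false | gzip > /dev/null; then
    echo "pipeline failure detected"
fi
```

Run as-is, this prints "ran anyway" and "pipeline failure detected", and nothing for the && line: exactly the behavior that would have caught the empty dump.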
tar (tape archive) bundles files and directories into a single archive. Despite the name referencing tape drives, it remains the standard archiving tool on Linux.
Rule of thumb: Use gzip for daily backups (speed matters), xz for archives you will keep for months or years (ratio matters), and skip compression for data that is already compressed (images, videos, encrypted files).
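A minimal tar workflow — create, inspect, extract — sketched against a throwaway directory in /tmp:

```shell
# Create some sample data
mkdir -p /tmp/tardemo/src
echo "hello" > /tmp/tardemo/src/file.txt

# -c create, -z gzip, -f archive filename; -C changes directory first
tar -czf /tmp/tardemo/src.tar.gz -C /tmp/tardemo src

# -t lists the contents without extracting
tar -tzf /tmp/tardemo/src.tar.gz

# -x extracts; here into a separate restore directory
mkdir -p /tmp/tardemo/restore
tar -xzf /tmp/tardemo/src.tar.gz -C /tmp/tardemo/restore
cat /tmp/tardemo/restore/src/file.txt
```

Per the rule of thumb above, swap -z for -J (xz) on long-term archives, or drop the compression flag entirely for already-compressed data.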
rsync is the Swiss Army knife of file transfer. Unlike cp, rsync only transfers what has changed, can resume interrupted transfers, and works over SSH for remote copies.
Understanding these three strategies is essential for designing backup systems:
Week 1:        Sun    Mon    Tue    Wed    Thu    Fri    Sat

Incremental:   FULL   inc    inc    inc    inc    inc    inc
                      each archive holds only that single day's changes
                      (Mon's changes, Tue's changes, ...)

Differential:  FULL   diff   diff   diff   diff   diff   diff
                      each archive holds everything changed since Sunday's FULL
                      (Mon; Mon-Tue; Mon-Wed; Mon-Thu; ...)
Strategy       Backup Size   Restore Speed   Restore Complexity
Full           Largest       Fastest         Simplest (1 backup needed)
Incremental    Smallest      Slowest         Most complex (full + all incrementals)
Differential   Medium        Medium          Moderate (full + latest differential)
Full backup: Complete copy of everything. Simple but slow and storage-heavy.
Incremental: Only what changed since the last backup (full or incremental). Smallest backups, but restoring requires the full backup plus every incremental in sequence.
Differential: Only what changed since the last full backup. Grows over the week, but restoring only needs the full backup plus the latest differential.
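GNU tar can implement the incremental strategy directly via a snapshot file. A sketch against throwaway data (all paths are placeholders):

```shell
mkdir -p /tmp/incdemo/data
echo "a" > /tmp/incdemo/data/a.txt

# Level 0 (full): -g records file state in the snapshot file
tar -czf /tmp/incdemo/full.tar.gz -g /tmp/incdemo/snapshot -C /tmp/incdemo data

# A day later, a new file appears...
echo "b" > /tmp/incdemo/data/b.txt

# Level 1 (incremental): only changes since the snapshot are archived
tar -czf /tmp/incdemo/inc1.tar.gz -g /tmp/incdemo/snapshot -C /tmp/incdemo data

# Restore: extract the full backup, then each incremental in order
# (-g /dev/null marks them as incremental archives without updating a snapshot)
mkdir -p /tmp/incdemo/restore
tar -xzf /tmp/incdemo/full.tar.gz -g /dev/null -C /tmp/incdemo/restore
tar -xzf /tmp/incdemo/inc1.tar.gz -g /dev/null -C /tmp/incdemo/restore
ls /tmp/incdemo/restore/data
```

Note how restore order matters: the full archive first, then every incremental in sequence, which is exactly the complexity trade-off in the table above.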
This is not paranoia — it is probability. A single hard drive has roughly a 1-2% annual failure rate. Two independent drives failing simultaneously is rare but not impossible (especially drives from the same batch). Adding an offsite copy protects against correlated failures: the fire that destroys the server room destroys both local copies.
Pause and predict: If you use a cloud sync service to mirror your Documents folder, does that count as a backup under the 3-2-1 rule? (Hint: If you accidentally delete a file locally, or a ransomware encrypts it, those changes are immediately synced to the cloud. Sync is not backup!)
Question 1: Your company’s log server is running out of disk space every weekend. You wrote a script at /usr/local/bin/cleanup.sh to archive old logs. To minimize impact on active users, you need this script to execute exactly at 3:15 AM every Sunday. What is the correct crontab entry to achieve this?
Show Answer
15 3 * * 0 /usr/local/bin/cleanup.sh
The cron format expects five time-and-date fields followed by the command: minute, hour, day of month, month, and day of week. By setting the minute to 15 and the hour to 3, you specify the exact time of 3:15 AM. Leaving the day of month and month as wildcards (*) ensures it runs regardless of the date, while setting the day of week to 0 (or 7) restricts the execution strictly to Sundays. Using a shortcut like @weekly would not work here because it defaults to midnight, missing your specific maintenance window.
Question 2: You manage a fleet of developer laptops that run a daily systemd timer for backing up local code repositories at 2:00 AM. Developers frequently close their laptops and take them home at 6:00 PM, only opening them again at 9:00 AM. What systemd timer configuration directive ensures these backups still happen, and how does it function in this scenario?
Show Answer
The critical directive you need is Persistent=true in the [Timer] section of your systemd timer unit. When a system is powered off or asleep during a scheduled execution time, standard cron jobs simply miss their window and are skipped until the next occurrence. By setting Persistent=true, systemd records the time the timer last triggered on disk. When the developer opens their laptop at 9:00 AM, systemd checks this record, realizes the 2:00 AM backup was missed, and immediately executes the service to catch up, ensuring data is not left unprotected.
Question 3: A junior admin was tasked with migrating the /var/www/html directory to a new backup disk mounted at /mnt/backup. They executed rsync -av /var/www/html /mnt/backup/ but then panicked because the backup disk didn’t contain index.html at the root, but instead had a nested html folder. What caused this behavior, and how should the command have been written to avoid it?
Show Answer
The junior admin omitted the trailing slash on the source directory, which fundamentally changes how rsync interprets the command. When you run rsync -av /var/www/html (without a trailing slash), rsync reads it as “copy this specific directory and place it inside the destination,” resulting in /mnt/backup/html/index.html. To achieve the intended result of copying the contents directly, the command should have been rsync -av /var/www/html/ /mnt/backup/. The trailing slash instructs rsync to copy the contents of the source directory rather than the directory itself, ensuring files like index.html land directly in /mnt/backup/.
Question 4: You are designing a disaster recovery policy for a critical database. Your manager suggests simply copying the database dump to a secondary local hard drive every night to save costs. How would you apply the 3-2-1 backup rule to explain the vulnerabilities in their plan, and what specific scenarios does each component of the rule protect against?
Show Answer
The manager’s plan violates almost every tenet of the 3-2-1 backup rule, which requires 3 total copies of data, stored on 2 different media types, with 1 copy kept offsite. Having only 3 copies (the primary data and two backups) ensures that if one backup is found to be corrupted during a restore attempt, a fallback exists. Using 2 different media types (e.g., SSD and cloud object storage, or hard drive and tape) mitigates the risk of a single hardware defect or firmware bug wiping out all copies simultaneously. Finally, requiring 1 copy offsite is critical because the manager’s secondary local hard drive would be instantly destroyed or compromised by site-wide disasters like fires, floods, or a ransomware infection spreading across the local network.
Question 5: A database migration script scheduled in cron runs pg_dump production_db > backup.sql; gzip backup.sql. After a major database crash, you attempt to restore the backup but find that backup.sql.gz is only 20 bytes long. Checking the system logs reveals that the database was restarting exactly when the cron job ran. Why did the script finish without reporting an error, and how should it be rewritten to prevent this silent failure?
Show Answer
The silent failure occurred because the script used a semicolon (;) to separate commands, which instructs the shell to execute the second command regardless of whether the first one succeeded or failed. When the database was restarting, pg_dump failed to connect and produced an empty backup.sql file, but then gzip happily compressed that empty file and exited with a successful status code of 0. To fix this, you should chain the commands with the logical AND operator (&&), like pg_dump production_db > backup.sql && gzip backup.sql, so that compression only occurs if the dump succeeds. Alternatively, using a pipe with set -o pipefail (e.g., pg_dump production_db | gzip > backup.sql.gz) is even more robust and saves disk space by avoiding the intermediate uncompressed file.
Question 6: The financial compliance team needs a script to pull stock market data at the start of trading hours. They have requested a systemd timer that triggers the data collection service exclusively from Monday through Friday, precisely at 9:00 AM. What OnCalendar expression accurately captures this complex scheduling requirement?
Show Answer
OnCalendar=Mon..Fri *-*-* 09:00:00
Systemd timers use a highly expressive calendar event syntax formatted as DayOfWeek Year-Month-Day Hour:Minute:Second. By specifying Mon..Fri, you instruct the timer to restrict execution to weekdays, entirely skipping the weekend. The *-*-* portion acts as a wildcard for the date, meaning it matches every year, month, and day of the month. Finally, 09:00:00 locks the execution to the exact time required by the compliance team. You can always validate such expressions before deploying them by running systemd-analyze calendar "Mon..Fri *-*-* 09:00:00".
Question 7: A developer used the at command to schedule an emergency patch deployment script to run at midnight. An hour later, they realize the script contains a critical bug that will corrupt the database. How can they view the queue of scheduled one-off tasks to find their specific job, and what command must they run to cancel the job identified as task number 5 before it executes?
Show Answer
To view the queue of pending one-off tasks, the developer should use the atq command, which lists all scheduled jobs along with their job IDs, execution times, and the user who scheduled them. Once they identify the erroneous deployment script as job ID 5, they must execute atrm 5 to remove it from the queue. If they are unsure whether job 5 is indeed their script, they can inspect the exact commands scheduled to run by typing at -c 5 before issuing the removal command, preventing accidental deletion of a different critical task.
Hands-On Exercise: Build an Automated Backup System
You now have the skills to automate anything on a schedule and protect data with proper backups. Return to the LFCS Learning Path to review remaining study areas, or revisit Module 8.1: Storage Management if you want to combine LVM snapshots with your backup strategy.