I'm using borgbackup to back up a few systems. They all use the same borgbackup repository. I have written a systemd timer that starts a backup job using borgmatic. I have an Ansible playbook that deploys everything and sets up the timer. My timer file looks like this:
[Unit]
Description=Run borgmatic backup
[Timer]
OnCalendar=*-*-* 00:00:00
RandomizedDelaySec=10000
OnBootSec=1200
Persistent=true
[Install]
WantedBy=timers.target
All systems have the same OnCalendar expression. The issue I'm trying to solve is that even with the RandomizedDelaySec set, the backups often run close enough to the same time that one has a lock and the second job fails.
I know that I can solve this in a few different ways: I could use separate repos, I could change the OnCalendar expression with Ansible when I deploy, or I could change it manually. The solution I'm trying to pursue, for the sake of learning systemd better, is a configuration that retries after some time if a backup job fails. It seems to me that systemd must provide a way to do this. Is it possible? If so, how?
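Yes. Add these two options to the [Service] section of the service unit that the timer starts:
Restart=on-failure
RestartSec=5min
Both are described in man systemd.service.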
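As a minimal sketch of where these two options would sit, here is a service unit that the timer above could activate. The file name borgmatic.service, the ExecStart path, and the start-limit values are assumptions for illustration, not taken from the question; adjust them to match your deployment. Note that, as far as I know, Restart=on-failure is only accepted for Type=oneshot services on systemd 244 or later.

```ini
# /etc/systemd/system/borgmatic.service
# (hypothetical name; it must match the timer's name, or be set with Unit= in the timer)
[Unit]
Description=borgmatic backup
# Optional cap on retries: give up after 5 start attempts within 1 hour.
# These values are illustrative assumptions.
StartLimitIntervalSec=1h
StartLimitBurst=5

[Service]
Type=oneshot
# Assumed install path for borgmatic; check with `command -v borgmatic`.
ExecStart=/usr/bin/borgmatic
# Retry when borgmatic exits non-zero, e.g. because the repository is locked.
Restart=on-failure
# Wait 5 minutes between attempts.
RestartSec=5min
```

Without the two StartLimit settings, the default rate limit (5 starts within 10 seconds) is never reached with a 5 minute RestartSec, so a job that keeps failing would be retried indefinitely. After editing the unit, run systemctl daemon-reload so the change takes effect.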