Bash rsync script won't complete on large backups

Over the last year or so, I've been trying to learn bash scripting on Ubuntu (one 20.04 LTS box and one 21.04). I feel that my skills have been progressing nicely, but I'm at my wits' end with one particular backup script I've written.

Long story short, if the backup script is only trying to back up my /home directory, it runs smoothly from beginning to end. If it is trying to back up my / directory, it runs until the rsync process is complete (including creating a log file), and then the script simply seems to stop, even though I've got some housekeeping code that follows to alert me. I even added an exit trap, but that isn't triggered.

I don't know what I'm missing, but I'm eager to try to learn, if anyone can help me out.

I don't know if it will help, but here's a slightly simplified version of my script:

#!/bin/bash
#
# backup
# unified back up, home and/or root folders to raid on Betty
# runs daily from sudo crontab (time varies by device); day of month determines which backup is performed
#
# related files
# generates backup.log in home folder of main user, archives existing backup.log to logs folder as yyyy-mm-dd-backup.log
# requires two files in $homefold: home-backup.exclude and boot-backup.exclude, to provide backup exclusions for rsync
#
# exit trap
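function badquit {
   # inside a function "$1" is the function's own argument, so the trap below passes the script's "$1" in
   if [[ "$1" = "test" ]] ; then
      exit 0
   elif [[ "$goodflag" = "0" ]] ; then
      echo "$(date +%r) : $shname exited with goodflag=0" > "$logfold/$shname.quit"
   fi
}
trap 'badquit "$1"' EXIT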
#
# variables
host="$(hostname)"
hostlc="${host,,}"
shname="$(basename "$0")"
hostuser="$(getent passwd "1000" | cut -d: -f1)"
homefold="/home/$hostuser"
logfold="$homefold/logs"
log="$homefold/$shname.log"
date="$(date +%Y-%m-%d)"
dom="$(date +%-d)" # day of month at runtime, unpadded (GNU date) so that "8" matches targdom="8"; determines which backup is performed (daily home, weekly home, quarterly boot)
soonwarn="7" # days in advance to create .soon report
case $host in # $bumount location varies by device: Betty mounts it differently than all others
   Betty)
      bumount="/mnt/raid"
      ;;
   *)
      bumount="/media/betty-raid"
      ;;
esac
bufold="$bumount/$hostlc"
goodflag="0" # default status of goodflag to trigger unexpected exit message
#
# pre-check
# confirm existence of $bufold, attempt to mount if not found, set fail message if unsuccessful
if [[ ! -e $bufold ]] ; then
   mount 192.168.x.x:/mnt/raid $bumount
   sleep 5
   if [[ ! -e $bufold ]] ; then
      echo "$(date +%r) : $shname exited because $bufold could not be mounted" > $logfold/$shname.quit
      goodflag="1"
      exit 1
   fi
fi
# 
# log management
if [[ -e $log ]] ; then
   mv $log $logfold/"$(date -r $log +"%Y-%m-%d")"-$shname.log
fi
#
# primary tasks
# set targdom per old boot backup schedule (to allow time for .soon)
case $host in
   Betty)
      targdom="8"
      ;;
   Veronica)
      targdom="12"
      ;;
   *)
      echo "$(date +%r) : $shname cannot identify $host" > $logfold/$shname.quit
      goodflag="1"
      exit 1
      ;;
esac   
# determine which backup to run, set final variables
if [[ "$dom" = "$targdom" ]] ; then # targdom triggers non-daily backups
   if [[ "$(date +%m)" =~ ^(01|04|07|10)$ ]] ; then # quarterly boot backup Jan,Apr,Jul,Oct (string match, because -eq would read 08 and 09 as invalid octal)
      ropts="-avuHkbi --delete --exclude-from=$homefold/boot-backup.exclude --backup-dir=$bufold/Backup.$date --log-file=$log" 
      rsource="/"
   else # monthly home backup
      ropts="-axvuHkbi --delete --exclude-from=$homefold/home-backup.exclude --backup-dir=$bufold/Backup.$date --log-file=$log"
      rsource="/home"
   fi
else # not targdom, daily home backup
   ropts="-axvuHkbi --exclude-from=$homefold/home-backup.exclude --backup-dir=$bufold/Backup.$date --log-file=$log" #daily home backup
   rsource="/home"
fi
# if this is a test, set variables appropriately
if [[ "$1" = "test" ]] ; then
   ropts="-navuHkbi --exclude-from=$homefold/boot-backup.exclude --backup-dir=$bufold/Backup.$date --log-file=$log" # dry-run ON
   rsource="/"
fi
# run the backup, finally
rsync $ropts $rsource $bufold # --backup-dir and --log-file are already set in $ropts
rc="$?" # capture rsync's exit code now; any later test would overwrite $?
#
# set exitcond
if [[ "$rc" = "0" ]] ; then
# successful, log saved
   exitcond="successfully, log saved"
else
# unsuccessful, exit cond
   exitcond="unsuccessfully, exit code $rc"
fi
#
echo "$(date +%r) : $shname completed $exitcond" >> $logfold/$shname.done
goodflag="1"

sudodus:
When you try to back up the root partition `/`, it will also contain active system files that are modified while rsync is running. So **either exclude the directories with such files (e.g. `/run` and `/proc`) or boot from another system** (for example a live system booted from USB, or a second system in a dual-boot setup). I guess that you are already running the script with `sudo` to make it possible to read all files and to preserve all permissions and ownerships.

Thomas Aichinger:
My first guess is that there are some files that cannot be copied (sockets, devices). To track that problem down, try to eliminate the bash script and write just a one-line rsync command with all parameters. Then try to reproduce this error and post it here.
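
A minimal sketch of such a one-line test, assuming the dry-run options from the script's test branch (trimmed of `--backup-dir` and `--log-file`). The user name and host folder are placeholders, not values from the original post.

```shell
# Placeholder values (assumptions): substitute the real user and host folder.
homefold="/home/youruser"
bufold="/media/betty-raid/veronica"

# Same short options as the script's test branch; -n makes this a dry run.
rsync_cmd="rsync -navuHkbi --exclude-from=$homefold/boot-backup.exclude / $bufold"

# Print the line for review; run it by hand with sudo to reproduce the problem.
echo "$rsync_cmd"
```

Running the printed line directly in a terminal, outside cron and outside the script, should show whether rsync itself stalls or whether the problem is in the surrounding bash code.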

Artur Meinild:
+1 for excluding system directories. Look here for more [info on the filesystem](https://www.linux.com/training-tutorials/linux-filesystem-explained/). I would assume `/dev`, `/proc`, `/run` and `/sys` should always be excluded, since these are all virtual filesystems.
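
As a rough illustration of those exclusions, here is a sketch that writes a throwaway exclude file and prints (rather than runs) an rsync dry-run command that would use it. The temporary file and the destination folder are assumptions for the example; the real `boot-backup.exclude` was not posted.

```shell
# Write a hypothetical exclude file covering the virtual filesystems named above.
exclude_file="$(mktemp)"
cat > "$exclude_file" <<'EOF'
/dev/*
/proc/*
/run/*
/sys/*
EOF

# Assumed destination, following the script's $bumount/$hostlc layout.
bufold="/media/betty-raid/veronica"

# Print the command instead of running it, so nothing is copied by accident.
echo "rsync -na --exclude-from=$exclude_file / $bufold"
```

Because the source is `/`, a pattern such as `/proc/*` is anchored at the root of the transfer: it skips the contents of `/proc` but still creates the empty directory in the backup. If the backup is run without `-x`, the backup mount point itself probably belongs in the list as well.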

KneadToKnow:
I will post my .exclude file this evening when I can get to it again. I think I kept most of the mentioned directories out of the backup. I may have stumbled across something when I was looking into what I thought was a different problem, though: I think trying to back up the /usr/bin/x11 directory may be creating an infinite loop, which would certainly explain why the script never finishes.

KneadToKnow:
Well, when I went to get my .exclude file, I discovered that the script was, in fact, still chugging along, having run at that point for about 17 hours. I had to attend to other things, but I know from my housekeeping software that it was still running when the regular nightly backup kicked off at 1:00 this morning, so I'll need to follow up on all of that tomorrow when I can sit down and examine all the logs. I still suspect the real problem is /usr/bin/x11 and possibly similar directories, which hopefully a good combination of include-file and exclude-file parameters will sort out.
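
One way to test that suspicion: the script passes `-k` (`--copy-dirlinks`), which makes rsync treat a symlink to a directory as the directory itself. If /usr/bin/X11 is a symlink to its own directory, as it commonly is on Ubuntu, rsync would keep descending into it. The sketch below checks for that kind of link; `is_self_link` is a hypothetical helper name, and the demo uses a temporary directory so it does not depend on the real /usr/bin.

```shell
# Return 0 if the path is a symlink that resolves to the directory containing it.
is_self_link() {
    [ -L "$1" ] || return 1
    target="$(readlink -f "$1")" || return 1
    parent="$(cd "$(dirname "$1")" && pwd -P)" || return 1
    [ "$target" = "$parent" ]
}

# Build a throwaway example of such a link (X11 -> .) to try the check on.
demo="$(mktemp -d)"
ln -s . "$demo/X11"

if is_self_link "$demo/X11"; then
    echo "self-referencing symlink: exclude it or drop -k from the rsync options"
fi
```

On the real system, `is_self_link /usr/bin/X11` would run the same check; `ls -l /usr/bin/X11` shows the link target directly.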