Score:2

Bash script: Conditionally delete older files while keeping latest copies


Note: Though there's an answer from jeff-schaller, it depends on zsh; so I would like to get an answer based on Bash.

I would like to create a Bash script that conditionally deletes older files from a backup directory.

There are 2 conditions for 2 distinct file backups:

1, Keep the latest copy of Edge_Profile_*.tgz, and delete the rest of the Edge_Profile_*.tgz files only if they are older than 5 days.

2, Keep the latest copy of Firefox_Profile_*.tgz, and delete the rest of the Firefox_Profile_*.tgz files, no matter how old they are.

Here's how I have modified this AskUbuntu answer: https://askubuntu.com/a/933098/928088

Script:

#!/bin/bash

LOG_FILE="/home/admn/Cleanup.log"
TEMP_LOG="/tmp/Temp_Cleanup.log"

mv $LOG_FILE $TEMP_LOG

{

cd /home/admn/Downloads/Test;

echo "Cleanup log:" `date`

find /home/admn/Downloads/Test/Edge_Profile_*.tgz -type f \( -mtime +5 -printf 'Y\t' -o -printf 'N\t' \) -printf '%A@\t%p\0' |
    sort -zk2,2 | head -zn -1 | while read -r -d '' flag _ file; do \
        case "$flag" in 
            'Y') echo rm "$file" 
               ;; 
            *) echo "skipping $file (too new)"
               ;; 
        esac;
done

echo

find /home/admn/Downloads/Test/Firefox_Profile_*.tgz -type f \( -printf 'Y\t' -o -printf 'N\t' \) -printf '%A@\t%p\0' |
    sort -zk2,2 | head -zn -1 | while read -r -d '' flag _ file; do \
        case "$flag" in 
            'Y') echo rm "$file" 
               ;; 
            *) echo "skipping $file (too new)"
               ;; 
        esac        
done

} &>> $LOG_FILE

cat $TEMP_LOG >>$LOG_FILE

exit;

Output in the logfile with echo:

/usr/local/scripts/cleanup.sh

rm /home/admn/Downloads/Test/Edge_Profile_Jul_06_2021_00-35.tgz
rm /home/admn/Downloads/Test/Edge_Profile_Jul_07_2021_00-35.tgz
....
skipping /home/admn/Downloads/Test/Edge_Profile_Jul_12_2021_00-35.tgz (too new)
skipping /home/admn/Downloads/Test/Edge_Profile_Jul_13_2021_00-35.tgz (too new)
....

rm /home/admn/Downloads/Test/Firefox_Profile_Jul_01_2021_00-35.tgz
rm /home/admn/Downloads/Test/Firefox_Profile_Jul_02_2021_00-35.tgz
....

Output in the logfile while actually running the script, without echo:

/home/admn/Downloads/cleanup.sh: line 24: skipping /home/admn/Downloads/Test/Edge_Profile_Jul_12_2021_00-35.tgz (too new): No such file or directory
/home/admn/Downloads/cleanup.sh: line 24: skipping /home/admn/Downloads/Test/Edge_Profile_Jul_13_2021_00-35.tgz (too new): No such file or directory
....

Total files in the directory: 20 files

1, Edge_Profile_*.tgz: From July 06 to July 17: 12 files

2, Firefox_Profile_*.tgz: From July 01 to July 08: 8 files

The issues:

(1) I think the script more or less works, but I'm not sure, as I modified most of it without understanding what is going on.

(2) Output to logfile:

I would like the logfile to contain exactly the output I get with echo, but with bare filenames instead of full paths, like:

rm Edge_Profile_Jul_11_2021_00-35.tgz

skipping Edge_Profile_Jul_12_2021_00-35.tgz (too new)

OS: Ubuntu MATE 21.04

Thanks a lot.

Score:1

Manipulating files based on their modification times is much easier in a shell that lets you access those times directly. zsh is one such shell; run sudo apt install zsh to install it. Since your files appear to be in one directory, this answer is non-recursive. Demonstrating the simpler case first:

  • To keep the latest copy of Firefox_Profile_*.tgz and delete the rest of them no matter how old they are:

    echo would rm -v -- Firefox_Profile_*.tgz(.om[2,-1])
    

    Remove the echo would portion if the output looks correct. This uses a glob (wildcard) qualifier inside the parentheses to do three things:

    • select only plain files (not directories, sockets, etc.) with .
    • order (sort) the files by their modification time, newest to oldest, with om
    • select a slice of the resulting list starting from the second element to the end -- skipping the first (newest) file, with [2,-1]

    If there are no matching files, zsh will stop and complain with "zsh: no matches found", and will not execute the rm.

  • To keep the latest copy of Edge_Profile_*.tgz and delete the rest of them only if they are older than 5 days, first we grab the latest one:

    newest=(Edge_Profile_*.tgz(.om[1]))
    

    ... and then we get the ones that are older than five days:

    older=(Edge_Profile_*.tgz(.m+5))
    

    The new part here is the +5 on the m qualifier, which selects files last modified more than 5 days ago. After that, we make sure the newest one isn't in the list to remove:

    remove=("${(@)older:|newest}")
    

    The new part here is the array subtraction symbol :|; it is documented in the Parameter Expansion section of the zsh manual. It selects the elements of "older" that are not in "newest". Finally, we remove that list of files:

    echo would rm -v -- "${remove[@]}"
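
For readers who need to stay in Bash, the same two rules can be approximated with GNU find, sort, and head. The following is only a sketch under stated assumptions: the function name prune_backups, its arguments, and the --delete switch are made up for illustration, and it relies on GNU extensions (find -printf, sort -z, head -z). It prints bare file names, and nothing is deleted unless --delete is passed. It compares modification times against a cutoff in seconds, which differs slightly from find -mtime +5, since find rounds file ages down to whole days.

```shell
#!/bin/bash
# Hypothetical helper, not from the question or the answer above.
# Usage: prune_backups DIR PREFIX MIN_AGE_DAYS [--delete]
# Keeps the newest DIR/PREFIX*.tgz; of the rest, removes those whose
# modification time is at least MIN_AGE_DAYS days in the past.
prune_backups() {
    local dir=$1 prefix=$2 min_age_days=$3 mode=${4:-}
    local cutoff mtime name

    # Files modified at or before this epoch time count as old enough.
    cutoff=$(( $(date +%s) - min_age_days * 86400 ))

    # %T@ = modification time in epoch seconds, %f = name without the path.
    # Records are NUL-terminated so unusual file names survive the pipeline.
    find "$dir" -maxdepth 1 -type f -name "${prefix}*.tgz" -printf '%T@\t%f\0' |
        sort -z -n |        # oldest first
        head -z -n -1 |     # drop the last record, i.e. the newest file
        while IFS=$'\t' read -r -d '' mtime name; do
            if (( ${mtime%.*} <= cutoff )); then
                echo "rm $name"
                if [[ $mode == --delete ]]; then
                    rm -- "$dir/$name"
                fi
            else
                echo "skipping $name (too new)"
            fi
        done
}

# Dry run first; append --delete once the log looks right:
#   prune_backups /home/admn/Downloads/Test Edge_Profile_ 5
#   prune_backups /home/admn/Downloads/Test Firefox_Profile_ 0
```

Passing 0 as the minimum age treats every file except the newest as old enough, which corresponds to the Firefox rule. Only the rm call is guarded by --delete; the "skipping" message is always just printed, so it is never run as a command.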
    

Jags: Thank you for the answer, but I would like to use `Bash`, because the last time I installed `zsh` I ended up having to format and do a clean installation.

Jeff Schaller: You don't have to use it as your day-to-day shell; just install it and use it for this one script.

Jags: Oh okay, I'm going to try this later, first in a VM. Thanks again.