I have a directory on a CentOS 7 machine that fills up with tar.gz archives for different packages, where each package can have multiple archive versions.

I'm trying to find a way to delete any archives older than D days while always keeping at least the newest N archives of each package.

Thus, the following command:

find "$top_dir" -type f -name "*tar.gz" -mtime +21 -exec rm -f {} \;

is not good enough: if a package has not changed in the last three weeks, all versions of that package may be deleted.
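
A minimal sketch for previewing this without deleting anything, assuming GNU find. The function name preview_expired is hypothetical; it runs the same tests as the command above but prints the matches instead of removing them.

```shell
#!/bin/bash
# Hypothetical helper: list (do not delete) the archives that the
# find command above would remove.
# $1 = top directory, $2 = age threshold in days (21 in the command above)
preview_expired() {
    # Same tests as above, with -print in place of -exec rm.
    find "$1" -type f -name "*tar.gz" -mtime +"$2" -print
}
```

If every archive of some package shows up in the output of preview_expired "$top_dir" 21, that package would be wiped out completely.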

So I need a command or script that deletes all archives older than D days but keeps the newest N versions of each package's archives.

Edit:

The package name prefixes of interest are produced by the following pipeline:

ls -1 * | grep -v "name-base.tar.gz" | grep 'tar.gz' | awk -F- '{ print $1"-"$2 }' | tr -d '[.0-9]' | sort -u

Example archive names: pkgactions-4.2.0-973-g43e2a14.tar.gz and pkg-elastic-4.2.0-develop-129-ge38848d.tar.gz

Script to keep the 3 newest archives:

The following script keeps only the newest three archives of each package, but it has no provision for sparing archives that are newer than D days.

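#!/bin/bash

# Stop if the directory is missing, so rm never runs somewhere else.
cd /usr/share/nginx/rpm || exit 1
# Derive one name prefix per package from the archive names.
for pkg in $( ls -1 * | grep -v "pkg-base.tar.gz" | grep 'tar.gz' | awk -F- '{ print $1"-"$2 }' | tr -d '[.0-9]' | sort -u ); do
        # Archives of this package, oldest first (sorted by mtime).
        ordered=$( ls -1atr ${pkg}* )
        pkg_num=$( echo "$ordered" | wc -l )
        # -gt compares numbers; > inside [[ ]] compares strings, so 10 > 3 is false.
        if [[ ${pkg_num} -gt 3 ]]; then
                num2delete=$(( pkg_num - 3 ))
        else
                num2delete=0
        fi
        # The first num2delete entries are the oldest archives.
        oldest=$( echo "$ordered" | head -n "$num2delete" )
        rm -f $oldest
done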
2
  • Any specific naming scheme for the archives?
    – xenoid
    Commented Jun 1, 2020 at 20:47
  • Thanks for the response, @xenoid: see the Edit in my question.
    – boardrider
    Commented Jun 2, 2020 at 18:23

2 Answers

1

Not a complete answer, but since you have

  • A script to find the newest N archives (output to newerN.lst)
  • A script to find all archives older than D days (output to older.lst)

You can easily produce a list of the old files that are not among the newest N using

grep --invert-match --file newerN.lst older.lst

This list is what you want to erase.
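
A slightly stricter variant of that filter, as a sketch assuming GNU grep and one file name per line in both lists. The function name list_deletable is hypothetical. Without --fixed-strings and --line-regexp, each line of newerN.lst is treated as a regular expression and matched as a substring, so the dots in the names match any character and a kept name can also shield a longer name that contains it.

```shell
#!/bin/bash
# Hypothetical helper: print the old archives that are not among the newest N.
# $1 = list of archives to keep (newerN.lst)
# $2 = list of archives older than D days (older.lst)
list_deletable() {
    # --fixed-strings: take each line of $1 literally, not as a regex
    # --line-regexp:   a line of $2 must equal a line of $1 exactly
    # --invert-match:  print only the lines of $2 that are NOT in $1
    grep --fixed-strings --line-regexp --invert-match --file "$1" "$2"
}
```

Both lists need to use the same form of name (both bare file names, or both with the same directory prefix), otherwise no line matches exactly and nothing is protected. With GNU xargs, the result could then be piped into xargs -r -d '\n' rm -f -- once the list has been checked by eye.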

0

You'll have to track each package individually, probably by partial name, and keep an archive if it is among the last N versions OR newer than X days.

for each commonThing
do
    if more than N copies exist
    then
        delete the copies beyond the newest N that are older than X days
    fi
done

or some such similar arrangement.

Doing this reliably is a challenge, as you'll probably need a list of the commonThings to identify them, and if that list gets corrupted (and it will), you'll end up deleting too much or not enough.
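
The body of that loop could look roughly like this for a single commonThing. This is a sketch, not a tested production script: it assumes bash, GNU find and GNU coreutils, archive names without newlines, and a name prefix that identifies exactly one package. The function name prune_package and its arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prune the archives of ONE package.
# Usage: prune_package DIR PREFIX KEEP DAYS
#   DIR    directory holding the archives
#   PREFIX start of the archive names of this package, e.g. "pkgactions-"
#   KEEP   number of newest archives that are never deleted
#   DAYS   archives younger than this many days are never deleted
prune_package() {
    local dir=$1 prefix=$2 keep=$3 days=$4
    local cutoff mtime name
    # Epoch time before which an archive counts as "older than DAYS days".
    cutoff=$(( $(date +%s) - days * 86400 ))

    # Print "mtime name" for every archive of the package, sort newest
    # first, and skip the KEEP newest lines.
    find "$dir" -maxdepth 1 -type f -name "${prefix}*.tar.gz" -printf '%T@ %f\n' |
        sort -rn |
        tail -n +"$(( keep + 1 ))" |
        while read -r mtime name; do
            # %T@ has a fractional part; compare whole seconds only.
            if (( ${mtime%.*} < cutoff )); then
                rm -f -- "$dir/$name"
            fi
        done
}
```

For example, prune_package /usr/share/nginx/rpm pkgactions- 3 21 would keep the three newest pkgactions archives and remove the remaining ones only if they are more than 21 days old. Replacing rm -f with echo gives a dry run. Note that a prefix which is also the start of another package's name would mix the two packages together, so the list of prefixes still needs care.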

1
  • I have the script to delete the oldest archives while keeping the newest three versions of each archive. What I don't have an idea for is how to combine that with also not deleting archives if they are newer than D days. See "Script to keep the 3 newest archives:"
    – boardrider
    Commented Jun 2, 2020 at 18:27
