Using the perl rename utility:
Note: perl rename is also known as file-rename, perl-rename, or prename. Not to be confused with the rename utility from util-linux, which has completely different and incompatible capabilities and command-line options. perl rename is the default rename on Debian. IIRC, it's in the prename package on CentOS, where the command should be executed as prename rather than rename.
$ rename -n 'if (m/(^\d{4}_\d\d_\d\d)_(\d\d)/) {
my ($date,$hour) = ($1,$2);
my $dir = "./$date/$hour/";
mkdir $date;
mkdir $dir;
s=^=$dir=
}' *
rename(2021_10_15_23_35_SIP_CDR_pid3894_ins2_thread_1_4718.csv.gz, ./2021_10_15/23/2021_10_15_23_35_SIP_CDR_pid3894_ins2_thread_1_4718.csv.gz)
rename(2021_11_24_21_15_Gi_pid25961_ins2_thread_1_6438.csv.gz, ./2021_11_24/21/2021_11_24_21_15_Gi_pid25961_ins2_thread_1_6438.csv.gz)
rename(2021_11_24_21_15_Gi_pid27095_ins2_thread_1_6485.csv.gz, ./2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins2_thread_1_6485.csv.gz)
rename(2021_11_24_21_15_Gi_pid27095_ins3_thread_2_6485.csv.gz, ./2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins3_thread_2_6485.csv.gz)
rename(2021_11_24_21_15_Gi_pid27095_ins4_thread_3_6485.csv.gz, ./2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins4_thread_3_6485.csv.gz)
rename(2021_11_24_21_15_Gi_pid681_ins5_thread_4_6457.csv.gz, ./2021_11_24/21/2021_11_24_21_15_Gi_pid681_ins5_thread_4_6457.csv.gz)
rename(2021_11_25_20_55_Gi_pid29741_ins5_thread_4_7540.csv.gz, ./2021_11_25/20/2021_11_25_20_55_Gi_pid29741_ins5_thread_4_7540.csv.gz)
rename(2021_11_25_20_55_Gi_pid30842_ins3_thread_2_7489.csv.gz, ./2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins3_thread_2_7489.csv.gz)
rename(2021_11_25_20_55_Gi_pid30842_ins4_thread_3_7488.csv.gz, ./2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins4_thread_3_7488.csv.gz)
rename(2021_11_25_20_55_Gi_pid30842_ins5_thread_4_7489.csv.gz, ./2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins5_thread_4_7489.csv.gz)
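The -n option is a dry run: it only shows what rename would do, without actually renaming anything. Remove it (or replace it with -v for verbose output) when you're sure the rename script is going to do what you want. Note that the perl code itself still runs during a dry run, so the mkdir calls in the script will create the directories even with -n.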
The script works by first extracting the date and hour portions of each filename (skipping any filenames that don't match). Then it creates the date and date/hour directories, and finally moves the file into the date/hour directory.
This assumes that the files are in the current directory. If they aren't, you'll have to adjust the m// matching regex in the first line AND the s=== substitution in the second-last line.
Alternate version using the File::Path module (a core module, so it's included with perl) instead of calling mkdir twice. Its make_path function works like the mkdir -p shell command:
$ rename -v 'BEGIN {use File::Path qw(make_path)};
if (m/(^\d{4}_\d\d_\d\d)_(\d\d)/) {
my $dir = "./$1/$2/";
make_path $dir;
s=^=$dir=
}' *
2021_10_15_23_35_SIP_CDR_pid3894_ins2_thread_1_4718.csv.gz renamed as ./2021_10_15/23/2021_10_15_23_35_SIP_CDR_pid3894_ins2_thread_1_4718.csv.gz
2021_11_24_21_15_Gi_pid25961_ins2_thread_1_6438.csv.gz renamed as ./2021_11_24/21/2021_11_24_21_15_Gi_pid25961_ins2_thread_1_6438.csv.gz
2021_11_24_21_15_Gi_pid27095_ins2_thread_1_6485.csv.gz renamed as ./2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins2_thread_1_6485.csv.gz
2021_11_24_21_15_Gi_pid27095_ins3_thread_2_6485.csv.gz renamed as ./2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins3_thread_2_6485.csv.gz
2021_11_24_21_15_Gi_pid27095_ins4_thread_3_6485.csv.gz renamed as ./2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins4_thread_3_6485.csv.gz
2021_11_24_21_15_Gi_pid681_ins5_thread_4_6457.csv.gz renamed as ./2021_11_24/21/2021_11_24_21_15_Gi_pid681_ins5_thread_4_6457.csv.gz
2021_11_25_20_55_Gi_pid29741_ins5_thread_4_7540.csv.gz renamed as ./2021_11_25/20/2021_11_25_20_55_Gi_pid29741_ins5_thread_4_7540.csv.gz
2021_11_25_20_55_Gi_pid30842_ins3_thread_2_7489.csv.gz renamed as ./2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins3_thread_2_7489.csv.gz
2021_11_25_20_55_Gi_pid30842_ins4_thread_3_7488.csv.gz renamed as ./2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins4_thread_3_7488.csv.gz
2021_11_25_20_55_Gi_pid30842_ins5_thread_4_7489.csv.gz renamed as ./2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins5_thread_4_7489.csv.gz
This isn't really any better than the first version, but it does demonstrate that you can use any perl code, any perl module to rename and/or move files.
Third version: this one uses File::Basename to split the input pathname into $path and $file portions. It can cope with filenames in the current directory, or in any other directory. File::Basename is a core perl module, so it is included with perl. It provides three useful functions: basename() and dirname() (which work similarly to the shell tools of the same name), and fileparse(), which is what I'm using in this script to extract both the basename and the directory into separate variables.
$ rename -n 'BEGIN {use File::Path qw(make_path); use File::Basename};
my ($file, $path) = fileparse($_);
if ($file =~ m/(\d{4}_\d\d_\d\d)_(\d\d)/) {
my $dir = "$path/$1/$2";
make_path $dir;
$_ = "$dir/$file"
}' /home/cas/rename-test/*
rename(/home/cas/rename-test/2021_10_15_23_35_SIP_CDR_pid3894_ins2_thread_1_4718.csv.gz, /home/cas/rename-test/2021_10_15/23/2021_10_15_23_35_SIP_CDR_pid3894_ins2_thread_1_4718.csv.gz)
rename(/home/cas/rename-test/2021_11_24_21_15_Gi_pid25961_ins2_thread_1_6438.csv.gz, /home/cas/rename-test/2021_11_24/21/2021_11_24_21_15_Gi_pid25961_ins2_thread_1_6438.csv.gz)
rename(/home/cas/rename-test/2021_11_24_21_15_Gi_pid27095_ins2_thread_1_6485.csv.gz, /home/cas/rename-test/2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins2_thread_1_6485.csv.gz)
rename(/home/cas/rename-test/2021_11_24_21_15_Gi_pid27095_ins3_thread_2_6485.csv.gz, /home/cas/rename-test/2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins3_thread_2_6485.csv.gz)
rename(/home/cas/rename-test/2021_11_24_21_15_Gi_pid27095_ins4_thread_3_6485.csv.gz, /home/cas/rename-test/2021_11_24/21/2021_11_24_21_15_Gi_pid27095_ins4_thread_3_6485.csv.gz)
rename(/home/cas/rename-test/2021_11_24_21_15_Gi_pid681_ins5_thread_4_6457.csv.gz, /home/cas/rename-test/2021_11_24/21/2021_11_24_21_15_Gi_pid681_ins5_thread_4_6457.csv.gz)
rename(/home/cas/rename-test/2021_11_25_20_55_Gi_pid29741_ins5_thread_4_7540.csv.gz, /home/cas/rename-test/2021_11_25/20/2021_11_25_20_55_Gi_pid29741_ins5_thread_4_7540.csv.gz)
rename(/home/cas/rename-test/2021_11_25_20_55_Gi_pid30842_ins3_thread_2_7489.csv.gz, /home/cas/rename-test/2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins3_thread_2_7489.csv.gz)
rename(/home/cas/rename-test/2021_11_25_20_55_Gi_pid30842_ins4_thread_3_7488.csv.gz, /home/cas/rename-test/2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins4_thread_3_7488.csv.gz)
rename(/home/cas/rename-test/2021_11_25_20_55_Gi_pid30842_ins5_thread_4_7489.csv.gz, /home/cas/rename-test/2021_11_25/20/2021_11_25_20_55_Gi_pid30842_ins5_thread_4_7489.csv.gz)
BTW, it would be trivial to modify this so that it moves the files to a completely different path: just use something like my $dir = "/my/new/path/$1/$2"; instead of my $dir = "$path/$1/$2";
The key thing to understand about how the perl rename utility works is that iff the rename script modifies the $_ variable, rename will attempt to rename the file to the new value of $_. If $_ is unchanged, it will not try to rename it. This is why you can use any perl code to rename files: all the code has to do is change $_. Most often you'll probably use very simple sed-like rename scripts (e.g. rename 's/ +/_/g' * to replace each run of spaces in filenames with an underscore), but the rename algorithm can be as complex as you need it to be.
$_ is a very important variable in perl. It's used as the default variable to hold input read from file handles and as the iterator variable in for loops if the programmer doesn't specify one. It's also the default operand for several operators (like m//, s///, tr///) and the default argument for many (but not all) functions. See man perlvar and search for $_ (you'll need to escape that in less as \$_).
BTW, one thing I didn't mention about rename earlier is that it can take filenames either as arguments on the command line or from stdin. If no filenames are given as arguments, it reads them from stdin, newline-separated by default (so it won't work with filenames that contain newlines, an annoying but completely valid possibility). You can use the -0 option to make it read NUL-separated input instead, so it can work with any filenames, taking input from anything that can generate a list of NUL-separated filenames (e.g. find ... -print0, but it's probably better to just use find's -exec ... {} + option).
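As a sketch, the two find variants could look like this. Both are dry runs (-n), both assume the perl rename is installed under the name rename (use prename where that applies), and the function names and the *.csv.gz pattern are just placeholders of mine.

```shell
# Feed NUL-separated names from find to rename on stdin (dry run).
dry_run_via_stdin() {
  find "$1" -type f -name '*.csv.gz' -print0 | rename -0 -n 's/ +/_/g'
}

# Let find pass the names as arguments instead (dry run).
dry_run_via_exec() {
  find "$1" -type f -name '*.csv.gz' -exec rename -n 's/ +/_/g' {} +
}
```

Call either one with a directory, e.g. dry_run_via_exec /home/cas/rename-test, and drop the -n once the output looks right.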
rename will also refuse to rename a file over an existing file unless you use its -f or --force option.