I am using GNU tar to process a tar (a layer of a Docker image) in order to modify some jars inside it. This is what I am doing:
- save image to disk as tar
- extract it, so I have each layer in a dir
- enter each layer, where I have a layer.tar, a json and a VERSION
- iterate all */*.jar files in layer.tar, trying to find some class file
- if I found them, extract the jar with file tree structure, remove the class file from it, and put it back to layer.tar, overwriting the original jar
- package each layer back to a new tar, use docker to load it and push later (not done yet)
I created a script for this, which almost does the work, but layer.tar ends up with 2 jars side by side: one that still has the classes to remove, and another without them.
#!/bin/bash
# tar needs find to package without ".". $3 is the tar mode: u for update, r for append, c for create
function pack_all_without_period() {
    # test the type first, then print the path relative to $1; -mindepth 1 skips the root dir itself
    find "$1" -mindepth 1 \( -type f -o -type l -o -type d \) -printf "%P\n" | sudo tar "-${3}vf" "$2" --no-recursion -C "$1" -T -
}
if [ -z "$1" ]; then
    printf "Save the image as tar, extract, and enter each layer to remove the vulnerable classes(JMSAppender/SocketServer/SimpleSocketServer)\nPlease provide the image name. \n"
    exit 1
fi
dir="log4j-1.x-fix"
image_tar=amq-image-to-fix.tar
if [ ! -d "$dir" ]; then
    mkdir "$dir"
fi
# save image to tar
docker save "$1" -o "$image_tar"
# extract tar
tar xf "$image_tar" -C "$dir"
# each layer is extracted to a folder, each folder has a "layer.tar".
# Go into each folder, extract the jars from `layer.tar`, and use `zip -d` to remove the classes
# and package them back to `layer.tar` (-r to append), and delete the extracted folders.
# at last, package all layers + manifest.json and so on back into another tar, WITHOUT COMPRESSION
cd "$dir" || exit 1
# enter layer and exit
for layer in */; do
    echo "Processing layer $layer"
    cd "$layer" || continue
    # tar does not support overwrite, as tape cannot be overwritten; so I wanted to remove the original jar from tar,
    # then append it back with tar -u/-A/-r; but then I found tar --delete is extremely slow (by design)
    # so at last I have to extract all jars and package them back
    mkdir temp
    # file tree is preserved, so packaging them back is easy
    if sudo tar --extract --directory=temp --file layer.tar --wildcards "*.jar"; then
        for f in $(find . -mindepth 2 -name "*.jar" -not -type l -printf "%P\n"); do # exclude jolokia.jar (link)
            # grep -E takes a regular expression, not a glob, so no leading "*"
            if sudo jar -tvf "$f" | grep -E "(JMSAppender.*|SocketServer|log4j.*)\.class"; then
                echo "Found classes in $f"
                read -r -p "Do you want to remove these classes? (Y/N) " option
                if [[ $option == 'Y' || $option == 'y' ]]; then
                    echo "Removing class file from $f"
                    sudo zip -d "$f" "*JMSAppender.class" "*SocketServer.class" "*SimpleSocketServer.class"
                    ######### here I need to delete the original jar with the classes I just deleted, but I don't know how ############
                else
                    continue
                fi
            else
                continue
            fi
        done
        # append folders to tar, without leading "."
        echo "Appending modified folders to layer.tar anew"
        pack_all_without_period temp layer.tar r
    fi
    sudo rm -r $(find . -maxdepth 1 -mindepth 1 -type d -print)
    cd .. # back to $dir
done
cd ..
# tar will always include a folder "." as root. pack_all_without_period gets rid of it, so the archive
# only contains the content of the folder
# tar preserves ownership and group by default when creating; and to extract while preserving the same info,
# we use '--same-owner', which is the default when running with sudo.
# again, add all layers and files to a new tar, without leading "."
echo "after processing all layers, we are at $(pwd)"
pack_all_without_period "$dir" amq-image-fixed.tar c
sudo rm -Irv "$dir" "$image_tar"
But I found that:
- tar can only append; it will not overwrite. So I changed the script so that it would first delete the original jar in layer.tar, then append.
- Then I found that tar --delete some/path/foo.tar does not work with tar --file xxx --delete path-to-jar. The GNU tar documentation claims that --delete works in a pipe from stdin to stdout (https://www.gnu.org/software/tar/manual/html_node/delete.html). But what is the correct syntax? I tried these, but they do not work:
sudo tar tf layer.tar $f | sudo tar --delete #not deleting
sudo tar xf layer.tar --exclude $f | sudo tar cf layer.tar -T - # create tar of size 0
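For comparison, here is a minimal sketch of the filter form as I understand it from the manual. In both attempts above, the second tar never receives archive data on stdin: tar tf prints member names, and tar xf prints nothing on stdout. The sketch instead feeds the archive itself to tar on stdin and names it with -f -, so the result is written to stdout. delete_member is a hypothetical helper name, and sudo is left out here for brevity.

```shell
# Hypothetical helper: copy an archive while dropping one member.
# Arguments: source tar, member name as stored in the archive, destination tar.
delete_member() {
    local src=$1 member=$2 dst=$3
    # With "-f -", tar --delete reads the archive from stdin
    # and writes the archive without the member to stdout.
    tar --delete -f - "$member" < "$src" > "$dst"
}
```

It would be called as delete_member layer.tar "$f" layer-new.tar, where the destination must be a different file from the source.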
Some more considerations:
- I don't want to extract all files, as each layer contains /usr or /boot that I don't want to deal with. My jars are basically under /opt or so (not 100% sure).
- I need to preserve the ownership, timestamps and so on. That's why I use sudo (but I am not sure whether that achieves my purpose).
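With these constraints in mind, the replace step I am after would touch only the one jar and leave everything else in the layer alone. A rough sketch, assuming the edited jar sits in a staging directory that mirrors the archive tree (like temp in the script); replace_member is a hypothetical helper name. Run without sudo, the appended member is recorded with the current user as owner, so preserving ownership would presumably still require extracting and appending as root.

```shell
# Hypothetical helper: swap one member of a tar for the edited copy
# found under a staging directory that mirrors the archive tree.
# Arguments: archive, member name as stored, staging directory.
replace_member() {
    local archive=$1 member=$2 staging=$3
    local tmp
    tmp=$(mktemp "${archive}.XXXXXX") || return 1
    # Drop the old copy by filtering the archive from stdin to stdout.
    tar --delete -f - "$member" < "$archive" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Append the edited copy; -C keeps the stored name free of the staging prefix.
    tar -rf "$tmp" -C "$staging" "$member" || { rm -f "$tmp"; return 1; }
    # Only replace the original once both steps succeeded.
    mv "$tmp" "$archive"
}
```

For example, replace_member layer.tar opt/some/lib.jar temp, where opt/some/lib.jar is a placeholder path. Writing to a temporary file first means a failed delete or append leaves the original layer.tar untouched.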
I use the script like this:
./remove-log4j-1.x-classes.sh registry.access.redhat.com/jboss-amq-6/amq63-openshift:1.4-44.1638430186
Please help, thanks!
EDIT: I have now tried:
tar tf layer.tar -O | tar f - --delete $f > layer-new.tar
or
zcat -f layer.tar | tar f - --delete $f > layer-new.tar
But it fails with this error:
tar: opt/amq/lib/optional/log4j-1.2.17.redhat-1.jar: Not found in archive
tar: Exiting with failure status due to previous errors
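One thing that may help narrow this down: as far as I can tell, tar --delete looks for the member name as it is stored in the archive, so a name stored as ./opt/... would not match opt/.... Whether that is the cause here is only a guess. A small sketch for checking the stored names first; list_matching_members is a hypothetical helper name.

```shell
# Hypothetical helper: print the stored member names that contain a fixed string.
# Arguments: archive, string to look for (not a regular expression).
list_matching_members() {
    tar -tf "$1" | grep -F -- "$2"
}
```

For example, list_matching_members layer.tar log4j shows each matching name exactly as tar stores it, which is the form to pass to --delete.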