Here's one possible answer. You can use the find command to generate a list of all the files in your repository, excluding those contained in submodules. We identify a submodule as "a directory that contains a .git file" (in a submodule checkout, .git is normally a file pointing at the real git directory, which is why the command tests with -f). The find command would be:
find . \( \! -depth 0 -type d -execdir test -f {}/.git \; -prune \) \
-o -type f -print
That means "looking at everything contained in this directory, if it is a directory that contains a file named .git and is not the top-level directory, do not descend into it; otherwise, if it is a regular file, print out the filename".
You can use this as input to the tar command:
tar -T- -n -cz -f repo.tar.gz
That accepts a list of files on stdin (-T-), makes sure tar does not recurse into subdirectories (-n, because find has done that for us), and creates (-c) gzip-compressed (-z) output.
The full command looks like:
find . \( \! -depth 0 -type d -execdir test -f {}/.git \; -prune \) \
-o -type f -print |
tar -T- -n -cz -f repo.tar.gz
This would create a compressed tar archive named repo.tar.gz in your current directory.
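A portability note, as far as I know: -depth 0 and tar's -n are the BSD spellings (as found on macOS). GNU find's -depth does not take an argument, and GNU tar spells the no-recursion option --no-recursion. Here is a rough sketch of the same pipeline using -mindepth 1 and --no-recursion instead, assuming GNU find and GNU tar. The function name archive_without_submodules and the optional output-path argument are made up for illustration and are not part of the command above.

```shell
#!/bin/sh
# Sketch only: assumes GNU find and GNU tar.
# archive_without_submodules is a hypothetical helper name.
archive_without_submodules() {
    # Output path; defaults to the name used in the answer.
    out=${1:-repo.tar.gz}

    # -mindepth 1 keeps the top-level directory itself out of the test.
    # Any directory holding a .git *file* is treated as a submodule
    # and pruned; every other regular file is printed.
    find . -mindepth 1 -type d -exec test -f '{}/.git' \; -prune \
        -o -type f -print |
    # Read the file list from stdin and write a gzip-compressed archive.
    tar --no-recursion -T- -czf "$out"
}
```

Called with no argument from the top of the repository, this should behave like the command above. Passing a path outside the working tree (for example /tmp/repo.tar.gz) keeps the archive itself out of the list that find produces.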
Ways that this will fail:
By not using git archive, you may accidentally include files in your working directory that are not part of the git repository. Using git clean can help avoid that.
You may also build an archive that includes changes that have not yet been committed to the repository. You can avoid this by explicitly checking for uncommitted changes before running your script.
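One way to guard against both failure modes is a small pre-flight check. This is only a sketch: the helper name worktree_is_clean is hypothetical, and it relies on git status --porcelain printing one line per modified, staged, or untracked path.

```shell
#!/bin/sh
# worktree_is_clean is a hypothetical helper name.
# Returns 0 if there is nothing uncommitted and nothing untracked,
# 1 if there is, and 2 if git status itself failed (e.g. not a repo).
worktree_is_clean() {
    status=$(git status --porcelain) || return 2
    [ -z "$status" ]
}
```

A script could run worktree_is_clean || exit 1 before the find pipeline. Note that ignored files are not reported by git status by default, so they would still end up in the archive; git clean -n -d -x lists what a full clean would remove without deleting anything.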