1

For some context, I'm working on a package manager-like utility that supports building packages as a non-root user. I want to make sure that packages built by a root user and built by a non-root user are absolutely indistinguishable rather than, say, using a tar archive and ignoring the metadata.

Is there a format/utility a bit like tar where files and directories inside the archive don't (and ideally can't) contain metadata like permission bits, timestamps, and ownership-related info? I'd like the archive to be completely described by the directories and files that exist in it and the file contents (and thus it is incapable of storing symlinks or hard links either).

I'm also okay with an archive format that doesn't have the ability to distinguish between absolute and relative paths (i.e. /a/b and a/b map to the same thing because the archive's notion of a path is different from a Unix path).

3
  • Or you could stick with the archive file formats that exist, and do as Debian does: run pax or tar or whatever under fakeroot when building the archive for a package as a non-superuser.
    – JdeBP
    Commented Mar 8, 2017 at 3:39
  • Is there a way to strip the metadata from a tar archive after creating it? Or configure a cpio archive not to keep track of timestamps and permissions? I'm not trying to reinvent the wheel. I'd like to be able to inspect the archive after creating it. Commented Mar 8, 2017 at 4:14
  • Have you read the Debian wiki on reproducible builds? Commented Mar 8, 2017 at 22:40

2 Answers

3

You cannot remove user information when using tar (or cpio), but you can force them to store values that do not reveal who built the package. With tar, the following options help (see man tar):

  • -P, --absolute-names: keeps the leading '/' in file names (by default tar strips it). If you can, avoid putting absolute paths on the command line, so that the paths are stored exactly as you give them (try -C or --change-directory if you cannot cd into the root directory you want).
  • --owner: forces the user stored in the tar file, ignoring the actual owner of the files/directories (e.g. --owner=root)
  • --group: force group stored in tar file (e.g. --group=root).
  • --no-acls: avoids copying your own ACLs in tar file
  • --numeric-owner: stores only the numeric UID/GID instead of the names of the accounts on your local machine (not needed if you force root, since root is always 0)
  • --mtime: forces the modification time of all files/directories, in order to mask when they were actually modified

Be aware that symlinks with absolute paths will be stored as is. It is generally better to use relative symlinks when they point inside your package tree.
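Putting the options listed above together, a minimal sketch could look like the following. It assumes GNU tar; the directory layout, the file names and the chosen --mtime value are made up for illustration. Note that file modes are not touched here, so they still depend on the builder's umask.

```shell
#!/bin/sh
# Sketch: create an archive whose owner, group and mtime do not depend on
# who built it. Assumes GNU tar; all paths below are hypothetical.
set -eu

work=$(mktemp -d)                       # scratch area for the example
mkdir -p "$work/pkgroot/usr/bin"        # stand-in for a real package tree
printf 'hello\n' > "$work/pkgroot/usr/bin/hello"

# -C enters the tree first, so only relative paths (./usr/bin/hello) are stored.
tar -cf "$work/pkg.tar" \
    -C "$work/pkgroot" \
    --owner=root --group=root \
    --numeric-owner \
    --no-acls \
    --mtime='2017-03-08 00:00Z' \
    .

# Inspect what was actually stored (owner/group shown as numbers).
tar --numeric-owner -tvf "$work/pkg.tar"
```

The verbose listing at the end is a quick way to confirm that every entry shows the forced owner and group rather than the account that ran the build.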

1
1

The best I found was the following, which attempts to normalise by

  • sorting the file list
  • using numeric 0 for owner and group
  • removing the r and w bits for the owner, and all the permissions for everyone else
  • fixing the mtime to the UNIX epoch

find <files> -print0 \
| sort -z \
| tar -cf <output>.tar \
      --format=posix \
      --numeric-owner \
      --owner=0 \
      --group=0 \
      --mode="go-rwx,u-rw" \
      --mtime='1970-01-01 00:00Z' \
      --no-recursion \
      --null \
      --files-from -

I wrote more about this at http://h2.jaguarpaw.co.uk/posts/reproducible-tar/
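One way to check such a pipeline is to build the same tree twice, change only metadata in between, and compare checksums. The sketch below is not from the linked post: build_archive is a hypothetical helper, the tree is made up, and it assumes GNU tar, find, sort and touch plus sha256sum from coreutils. It also swaps in --format=ustar instead of --format=posix, as an assumption to keep the comparison simple, since the pax format can carry additional timestamp records.

```shell
#!/bin/sh
# Sketch: build the same tree twice and compare the resulting archives.
# Assumptions: GNU tar/find/sort/touch and sha256sum are available.
# build_archive is a hypothetical helper, not part of any existing tool.
set -eu

build_archive() {
    # $1 = directory to archive, $2 = output tar file
    (
        cd "$1"
        find . -print0 \
        | LC_ALL=C sort -z \
        | tar -cf - \
              --format=ustar \
              --numeric-owner --owner=0 --group=0 \
              --mode="go-rwx,u-rw" \
              --mtime='1970-01-01 00:00Z' \
              --no-recursion --null --files-from -
    ) > "$2"
}

work=$(mktemp -d)
mkdir -p "$work/tree/bin"
printf 'echo hi\n' > "$work/tree/bin/hi"

build_archive "$work/tree" "$work/a.tar"
# Change only metadata (the file's mtime) before the second build.
touch -d '2001-01-01 00:00Z' "$work/tree/bin/hi"
build_archive "$work/tree" "$work/b.tar"

sum_a=$(sha256sum < "$work/a.tar")
sum_b=$(sha256sum < "$work/b.tar")
if [ "$sum_a" = "$sum_b" ]; then
    echo "archives are identical"
else
    echo "archives differ"
fi
```

Sorting under LC_ALL=C keeps the file order independent of the builder's locale, which is another setting that can differ between a root and a non-root environment.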
