
I know that if I have files in a bare repository, I can access them using git show HEAD:path/to/file.

But can I add new content to a bare repository without cloning and modifying a working tree?

  • You can do it without a worktree, by using low-level git commands to construct commits -- a perfectly honorable way to do things, but rarely an improvement on the convenience commands. And you can do it without cloning; there are even lots of options. But choosing a reasonable way to go about it depends on the relations between the history you want to construct and what's already there. So, what situation do you find yourself in that prompts the question here, please?
    – jthill
    Commented Apr 1, 2015 at 14:48
  • The contents of said git repository are small files that could one day number in the millions. While I know there will be performance costs as things get larger, the concern of having those millions of small files and folders (in the extreme case) on a running server doesn't sit well. Using a bare repository alleviates some of that while alternatives are investigated. Commented Apr 1, 2015 at 15:06
  • So a single commit could be of millions of files, which you often or usually won't want all checked out at once, yes? Okay, that's very helpful. Next issue: is it the monster checkout you're trying to avoid with the no-clone requirement, or something else? To update a repo you either have to be on the same filesystem or push to it; the only questions left are tradeoffs on where to do the work and details of the commit structure.
    – jthill
    Commented Apr 1, 2015 at 15:36
  • Each file has an associated commit; they're only ever added/read, never removed/modified. The primary worry at the moment is file system and disk performance of having a massive file/folder tree. Commented Apr 1, 2015 at 15:39
  • Okay. Git's built to be extensible into territory like this; again, it's just a question of the best way to proceed here. So, if you don't want to check out all the files, is there a reason to have every commit include them all? But really, there are enough questions here that I think a fairly complete narrative description of the system you're setting up, and how you currently see git fitting into it, would be quickest. Then we'd be able to suggest the next step for you with a clearer picture of where you're starting from.
    – jthill
    Commented Apr 1, 2015 at 15:51

1 Answer


if I add 1 file, only the 1 file is in that commit, aka we added something new and it's now set in stone

There are several convenient ways to add a single-file commit to the tip of the master branch in a bare repo.

So it appears I need to create a blob object, attach it to a tree, then attach that tree object to a commit.

All the ways to commit anything boil down to doing that; it's just a question of how well the convenience commands suit your purpose. git add creates a blob and makes an entry for it in the index. git commit does a git write-tree, which adds any new trees for what's in the index; then a git commit-tree, which adds a commit of the resulting top-level tree; and finally a git update-ref to keep HEAD up to date. Bare repos do have a HEAD commit, generally attached to (aka a symbolic ref for) a branch like master.
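As a sketch of exactly those steps, here's a one-file commit built from plumbing alone -- the repo path, file name, identity, and message are all invented for the demo:

```shell
#!/bin/sh
# Demo of the plumbing behind git add / git commit (all names hypothetical).
set -e
cd "$(mktemp -d)"
git init --bare repo.git
export GIT_DIR=$PWD/repo.git
# commit-tree needs an identity; supply one explicitly for the demo
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# 1. store the file's contents as a blob (git add's storage step)
blob=$(printf 'hello\n' | git hash-object -w --stdin)

# 2. build a tree holding that blob (what git write-tree does from the index)
tree=$(printf '100644 blob %s\tgreeting.txt\n' "$blob" | git mktree)

# 3. wrap the tree in a commit object
commit=$(git commit-tree "$tree" -m 'add greeting.txt')

# 4. point the branch at the new commit (first commit, so no -p parent)
git update-ref refs/heads/master "$commit"

git show master:greeting.txt   # prints: hello
```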

So git's convenience commands are already doing almost exactly what you want. Especially with just the one file, this is going to be very easy.

Say for example your files appear in ~server/data/logs/, the bare repo you're using for distribution is at ~server/repo.git, you want the committed files to be at data/logs in the repo, and you always want to commit the latest logfile:

#!/bin/sh
cd ~server

# supply locations git ordinarily does on its own in working i.e. non-bare repos:

export GIT_DIR=$PWD/repo.git                  # bare repos don't have defaults for these
export GIT_WORK_TREE=$PWD                     # so supply some to suit our purpose
export GIT_INDEX_FILE=$GIT_DIR/scratch-index  # ...

# payload:  commit (only) the latest file in data/logs:

git read-tree --empty                       # make the index all pretty, and 
git add "data/logs/$(ls -1t data/logs | sed q)"  # everything's ordinary from here - add and
git commit -m'new logfile'                  # commit

git read-tree loads index entries from committed trees. It's what underlies checkout, merge, reset, and probably some others I'm forgetting at the moment. Here, we just want an empty index to start, hence --empty.
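To illustrate -- a hypothetical sketch, with the same environment-variable setup as the script above and invented paths -- read-tree can just as easily prime the scratch index from the current tip, which is what you'd use if each new commit should keep the previously committed files instead of starting empty:

```shell
#!/bin/sh
# Sketch: read-tree --empty vs. read-tree HEAD (all paths invented).
set -e
cd "$(mktemp -d)"
git init --bare repo.git
export GIT_DIR=$PWD/repo.git GIT_WORK_TREE=$PWD
export GIT_INDEX_FILE=$GIT_DIR/scratch-index
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo one > a.txt
git read-tree --empty        # start from nothing: first commit holds only a.txt
git add a.txt
git commit -m 'first'

echo two > b.txt
git read-tree HEAD           # load the tip's tree into the index, so...
git add b.txt
git commit -m 'second'       # ...this commit contains both a.txt and b.txt

git ls-tree --name-only HEAD   # prints a.txt and b.txt
```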

use push/pull/remote to synchronize data while using a tool already available on every machine

You said "millions" of files over time, and if you don't want that full history distributed, rsync -- as I gather you already suspect -- might be a better bet. But one at a time, at one new file per minute, it'll take two years to accumulate just one million. So, is that scale actually imminent?

Whatever the case, the above procedure extends pretty efficiently to any smallish number of files per commit. For bulk work there are better ways.
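One example of a better bulk way -- my suggestion, not something endorsed above -- is git fast-import, which streams blobs, trees, and commits into the repo in a single pass, with no index file and no working tree involved. The ref name, paths, and contents here are all invented:

```shell
#!/bin/sh
# Sketch: a many-file commit streamed into a bare repo with git fast-import.
set -e
cd "$(mktemp -d)"
git init --bare repo.git

# each "data <n>" line announces exactly n bytes of raw content to follow
git --git-dir=repo.git fast-import --quiet <<'EOF'
commit refs/heads/master
committer Demo <demo@example.com> 1700000000 +0000
data 12
bulk import
M 100644 inline data/logs/log1
data 6
alpha
M 100644 inline data/logs/log2
data 5
beta
EOF

git --git-dir=repo.git ls-tree --name-only master:data/logs   # log1, log2
```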

