Introduction to Git for developers
- 3. Brief history
  - Local only
    - Open-source: SCCS (1972) · RCS (1982)
    - Proprietary: PVCS (1985)
  - Client-server
    - Open-source: CVS (1990) · CVSNT (1998) · Subversion (2000)
    - Proprietary: Software Change Manager (1970s) · ClearCase (1992) · Visual SourceSafe (1994) · Perforce (1995) · Team Foundation Server (2005)
  - Distributed
    - Open-source: GNU arch (2001) · Darcs (2002) · DCVS (2002) · SVK (2003) · Monotone (2003) · Codeville (2005) · Git (2005) · Mercurial (2005) · Bazaar (2005) · Fossil (2007)
    - Proprietary: TeamWare (1990s?) · Code Co-op (1997) · BitKeeper (1998) · Plastic SCM (2006)
- 4. Background
  - For most of the Linux kernel's existence (1991-2002), changes were made to the code by accepting patches and archiving versions.
  - In 2002 the project moved to the proprietary BitKeeper.
  - In 2005 relations between the Linux kernel developer community and the owner of BitKeeper soured, and the right to use the product free of charge was revoked.
  - Guess who started git? Linus Torvalds
- 5. Problems
  - Linus’ April 7, 2005 email:
    - “SCMs I've looked at make this hard. One of the things (the main thing, in fact) I've been working at is to make that process really efficient.”
    - “If it takes half a minute to apply a patch […] a series of 250 emails takes two hours”
  - Linus’ 2007 Google Tech Talk:
    - “When I say I hate CVS with a passion, I have to also say that if there are any SVN (Subversion) users in the audience, you might want to leave […] I see Subversion as being the most pointless project ever started”
    - “The slogan of Subversion for a while was "CVS done right", or something like that, and if you start with that kind of slogan, there's nowhere you can go. There is no way to do CVS right”
- 6. “git”?
  - “I’m an egotistical bastard, and I name all my projects after myself. First Linux, now git.” – Linus
  - git – (British) a foolish or worthless person
  - Examples of GIT: That git of a brother of yours has ruined everything! <oh, don't be such a silly git, of course your mates want you around>
- 8. Design criteria
  - Take CVS as an example of what not to do; if in doubt, make the exact opposite decision.
  - Support a distributed, BitKeeper-like workflow.
  - Very strong safeguards against corruption, either accidental or malicious.
  - Very high performance.
- 9. Characteristics
  - Non-linear development
    - rapid branching and merging
    - specific tools for visualizing and navigating a non-linear development history
  - Distributed development
    - like Darcs, BitKeeper, Mercurial, SVK, Bazaar and Monotone
  - Compatibility with existing systems/protocols
    - repositories can be published via HTTP, FTP, rsync, or a Git protocol over either a plain socket or ssh
    - CVS server emulation
    - Subversion and svk repositories can be used directly with git-svn
- 10. Characteristics
  - Efficiency
    - an order of magnitude faster than some revision control systems
    - fetching revision history from a locally stored repository can be two orders of magnitude faster than fetching it from the remote server
    - Git does not get slower as the project history grows larger
  - Toolkit-based design
    - a set of programs written in C
    - shell scripts that provide wrappers around those programs
- 11. INTERNALS No man should marry until he has studied anatomy and dissected at least one woman. - Honore de Balzac
- 12. Storage model
  - Subversion, CVS, Perforce and Mercurial are delta storage systems
    - they store the differences between one commit and the next
    - yes, Mercurial is a delta-storage system
  - Git is different
    - it stores a snapshot of what all the files in your project look like, in a tree structure, each time you commit
- 13. Everything has a hash
  - All the information needed to represent the history of a project is stored in files referenced by a 40-digit "object name" that looks something like this: 6ff87c4664981e4397625791c8ea3bbb5f2279a3
  - The name is the SHA1 hash of the contents of the object.
  - Advantages:
    - Git can quickly determine whether two objects are identical or not, just by comparing names.
    - Since object names are computed the same way in every repository, the same content stored in two repositories will always be stored under the same name.
    - Git can detect errors when it reads an object, by checking that the object's name is still the SHA1 hash of its contents.
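The naming rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual object header (`<type> <size>` followed by a NUL byte) is hashed together with the content; the helper name `git_object_name` is hypothetical and not part of Git.

```python
import hashlib


def git_object_name(obj_type: str, content: bytes) -> str:
    """Return the 40-hex-digit SHA-1 object name for the given content."""
    # The hash covers a small header ("<type> <size>\0") plus the raw content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# Identical content yields the identical name, whichever repository computes it.
print(git_object_name("blob", b"test content\n"))
print(git_object_name("blob", b""))
```

Because the name is derived only from the bytes, comparing two names is enough to decide whether two objects are identical, and re-hashing an object on read reveals corruption.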
- 14. Objects
  - Every object consists of three things: a type, a size and content.
  - There are four different types of objects: "blob", "tree", "commit", and "tag".
    - A "blob" is used to store file data; it is generally a file.
    - A "tree" is basically like a directory.
    - A "commit" points to a single tree, marking it as what the project looked like at a certain point in time.
    - A "tag" is a way to mark a specific commit as special in some way, for example as a release.
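The "type, size, content" triple can be made concrete with a small round-trip sketch. It assumes the loose-object layout (header, NUL byte, content, all zlib-compressed); `encode_object` and `decode_object` are hypothetical helper names used only for illustration.

```python
import zlib


def encode_object(obj_type: str, content: bytes) -> bytes:
    """Serialize an object as header + content, then compress it with zlib."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return zlib.compress(header + content)


def decode_object(raw: bytes) -> tuple[str, int, bytes]:
    """Split a compressed object back into (type, size, content)."""
    data = zlib.decompress(raw)
    # The first NUL byte always ends the header, even if the content has NULs.
    header, _, content = data.partition(b"\x00")
    obj_type, size_text = header.decode("ascii").split(" ")
    size = int(size_text)
    if size != len(content):
        raise ValueError("declared size does not match content length")
    return obj_type, size, content


stored = encode_object("blob", b"int main() { return 0; }\n")
print(decode_object(stored))
```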
- 15. Blob Object Chunk of binary data Files with same content (anywhere in repo) share same blob
- 16. Tree Object Simple object with pointers to blobs and other trees – like directory. Two trees have the same hash name if and only if their contents (including, recursively, the contents of all subdirectories) are identical
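The "same hash if and only if same contents" property follows from how a tree is named. The sketch below assumes the tree body is a sorted sequence of `<mode> <name>`, a NUL byte, and the 20-byte binary object name of each entry; it handles files only (directories sort slightly differently), and `object_name` and `tree_name` are hypothetical helpers.

```python
import hashlib


def object_name(obj_type: str, content: bytes) -> str:
    """SHA-1 over the "<type> <size>\\0" header plus the content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def tree_name(entries: dict[str, tuple[str, str]]) -> str:
    """entries maps a file name to (mode, 40-hex object name)."""
    body = b""
    # Sorting makes the result independent of the order entries were added in.
    for name in sorted(entries, key=lambda n: n.encode("utf-8")):
        mode, hex_name = entries[name]
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\x00"
        body += bytes.fromhex(hex_name)
    return object_name("tree", body)


blob = object_name("blob", b"test content\n")  # two files share this one blob
tree_a = tree_name({"a.txt": ("100644", blob), "b.txt": ("100644", blob)})
tree_b = tree_name({"b.txt": ("100644", blob), "a.txt": ("100644", blob)})
tree_c = tree_name({"a.txt": ("100644", object_name("blob", b"changed\n"))})
print(tree_a == tree_b, tree_a == tree_c)
```

Since a subdirectory appears in its parent only as another object name, a change anywhere below a tree changes that tree's name as well.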
- 17. Commit Object Links a physical state of a tree with a description of how we got there and why.
- 18. Commit Object
  - A commit is defined by:
    - a tree: the SHA1 name of a tree object, representing the contents of a directory at a certain point in time.
    - parent(s): the SHA1 name of some number of commits which represent the immediately previous step(s) in the history of the project. A commit with no parents is called a "root" commit, and represents the initial revision of a project.
    - an author: the name of the person responsible for this change, together with its date.
    - a committer: the name of the person who actually created the commit, with the date it was done.
    - a comment describing this commit.
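The fields listed on this slide map onto a short text body. The following is a rough sketch of that layout, not Git's own code: the identity string, the timestamp and the helper names `object_name` and `build_commit` are all made up for illustration.

```python
import hashlib


def object_name(obj_type: str, content: bytes) -> str:
    """SHA-1 over the "<type> <size>\\0" header plus the content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def build_commit(tree: str, parents: list[str], author: str,
                 committer: str, message: str) -> bytes:
    """Assemble the commit body: tree, parents, author, committer, comment."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # a root commit has none
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # name of an empty tree
WHO = "A U Thor <author@example.com> 1300000000 +0200"   # hypothetical identity

root = build_commit(EMPTY_TREE, [], WHO, WHO, "first commit")
root_name = object_name("commit", root)
second = build_commit(EMPTY_TREE, [root_name], WHO, WHO, "second commit")
print(second.decode("utf-8"))
print(object_name("commit", second))
```

Because the parent's name is part of the hashed body, changing any earlier commit changes the name of every commit built on top of it.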
- 20. REVISION HISTORY You can either have software quality or you can have pointer arithmetic, but you cannot have both at the same time. -- Bertrand Meyer
- 21. History is a DAG In computer science speak, the Git object data is a directed acyclic graph . That is, starting at any commit you can traverse its parents in one direction and there is no chain that begins and ends with the same object
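The DAG property can be checked on a toy history. In this sketch the commit names `A` to `D` and the helpers `ancestors` and `is_acyclic` are hypothetical; the history is just a mapping from each commit to its list of parents.

```python
def ancestors(history: dict[str, list[str]], start: str) -> set[str]:
    """Return every commit reachable from start by following parent links."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(history[commit])
    return seen


def is_acyclic(history: dict[str, list[str]]) -> bool:
    """True if no commit can be reached again by walking up from its parents."""
    return all(
        commit not in ancestors(history, parent)
        for commit, parents in history.items()
        for parent in parents
    )


# A is the root, B and C branch off A, D merges B and C.
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(sorted(ancestors(history, "D")))
print(is_acyclic(history))
```

In a real repository a cycle cannot be constructed in the first place, because a commit's name is a hash that already includes the names of its parents.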
- 22. History is a DAG To keep all the information and history on the three versions of this tree, Git stores 16 immutable , signed , compressed objects.
- 23. BRANCHES There are two major products that come out of Berkeley: LSD and UNIX. We don’t believe this to be a coincidence. -- Jeremy S. Anderson
- 24. Objects vs References
  - Git objects are immutable.
  - Besides objects, there are references.
  - Unlike objects, references can change.
  - References are simple pointers to a particular commit.
- 25. Branches
  - Examples of references are branches and remotes.
  - A branch in Git is just a file that contains the SHA-1 of the most recent commit of that branch.
  - Creating a branch is nothing more than writing 40 characters to a file.
  - As you continue to commit, one of the branches will keep changing to point to the new commit SHA-1s, while the other one can stay where it was.
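The "branch is just a file" idea can be imitated without Git at all. This sketch writes plain files under a temporary directory laid out like `refs/heads/`; it is a simulation under that assumption (a real repository may also keep references in a packed file), and `write_ref` and `read_ref` are hypothetical helpers. The two commit names are the ones shown in the `git log` slide.

```python
import tempfile
from pathlib import Path


def write_ref(repo: Path, branch: str, commit: str) -> None:
    """Point a branch at a commit by writing its 40-character name to a file."""
    ref = repo / "refs" / "heads" / branch
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit + "\n")


def read_ref(repo: Path, branch: str) -> str:
    """Read the commit name a branch currently points to."""
    return (repo / "refs" / "heads" / branch).read_text().strip()


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    first = "207b79dd89469a75c9e92a38c4b3eac904bea603"
    second = "57e762203d0b522fa3a47afcc907af313b5d6d78"
    write_ref(repo, "master", first)
    # "Creating a branch": copy the current commit name into a new file.
    write_ref(repo, "mytest", read_ref(repo, "master"))
    # A new commit on master moves only master; mytest stays where it was.
    write_ref(repo, "master", second)
    print(read_ref(repo, "master"))
    print(read_ref(repo, "mytest"))
```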
- 31. What to do with fast branches
  - Create a new branch each time you begin to work on a story or feature.
  - If you get blocked and need to put it on hold, it doesn’t affect anything else.
  - Often you merge the branch back into development and delete it the same day that you created it.
  - If you get a huge project or idea (refactoring, etc.), you create a long-term branch, continuously rebase it to keep it in line with other development, and once everything is tested and ready, merge it into your master.
- 33. PRACTICE : LOCAL REPOSITORY The function of good software is to make the complex appear to be simple. -- Grady Booch
- 34. git init
    c:\> mkdir test
    c:\> cd test
    c:\test> git init
    Initialized empty Git repository in c:/test/.git/
- 35. git add
    c:\test> dir /b
    cities.cpp
    cities.h
    c:\test> git add cities.h
    c:\test> git status
    # On branch master
    #
    # Initial commit
    #
    # Changes to be committed:
    #   (use "git rm --cached <file>..." to unstage)
    #
    #       new file:   cities.h
    #
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       cities.cpp
- 36. git commit
    c:\test> git commit -m "first commit"
    [master (root-commit) 207b79d] first commit
     1 files changed, 44 insertions(+), 0 deletions(-)
     create mode 100644 cities.h
    c:\test> git status
    # On branch master
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       cities.cpp
    nothing added to commit but untracked files present (use "git add" to track)
- 37. git commit -a
    c:\test> git status
    # On branch master
    nothing to commit (working directory clean)
    c:\test> echo "aaa" > cities.cpp
    c:\test> git commit -m "test"
    # On branch master
    # Changed but not updated:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #       modified:   cities.cpp
    #
    no changes added to commit (use "git add" and/or "git commit -a")
    c:\test> git commit -m "test" -a
    [master 6eaf41e] test
     1 files changed, 1 insertions(+), 210 deletions(-)
     rewrite cities.cpp (100%)
- 39. git log
    c:\test> git commit
    Aborting commit due to empty commit message.
    c:\test> git log
    commit 57e762203d0b522fa3a47afcc907af313b5d6d78
    Author: Dmitry Guyvoronsky <dmitry.guyvoronsky@gmail.com>
    Date:   Fri Feb 25 16:18:15 2011 +0200

        second commit

    commit 207b79dd89469a75c9e92a38c4b3eac904bea603
    Author: Dmitry Guyvoronsky <dmitry.guyvoronsky@gmail.com>
    Date:   Fri Feb 25 16:15:17 2011 +0200

        first commit

    c:\test> git log --pretty=oneline
    57e762203d0b522fa3a47afcc907af313b5d6d78 second commit
    207b79dd89469a75c9e92a38c4b3eac904bea603 first commit
- 40. git branch
    c:\test> git status
    # On branch master
    nothing to commit (working directory clean)
    c:\test> git branch
    * master
    c:\test> git branch mytest
    c:\test> git branch
    * master
      mytest
    c:\test> git checkout mytest
    Switched to branch 'mytest'
    c:\test> git branch
      master
    * mytest
- 41. git checkout -b
    c:\test> git branch
      master
    * mytest
    c:\test> git checkout -b another
    Switched to a new branch 'another'
    c:\test> git branch
    * another
      master
      mytest
- 43. Cloning
  - To clone a repo = to create a copy of it.
  - Git can clone a repository over several transports, including local, HTTP, HTTPS, SSH, its own git protocol, and rsync.
- 44. Remote branches
  - Remotes are pointers to branches in other people's copies of the same repository.
  - If you got your repository by cloning it, the repository you copied it from is automatically added as a remote, named origin by default.
- 45. Remote branches A fetch pulls all the refs and objects that you don’t already have from the remote repository you specify.
- 46. Remote branches We look at the origin/idea branch and like it, but we also want the changes they’ve made on their origin/master branch So we do a 3-way merge of their two branches and our master . We don’t know how well this is going to work, so we make a tryidea branch first and then do the merge there.
- 47. Just for your information The current record for number of commit parents in the Linux kernel is 12 branches merged in a single commit
- 48. git clone
    c:\test> git clone git@dreamiurg.unfuddle.com:dreamiurg/test.git
    Initialized empty Git repository in c:/test/test/.git/
    remote: Counting objects: 10, done.
    remote: Compressing objects: 100% (10/10), done.
    remote: Total 10 (delta 1), reused 0 (delta 0)
    Receiving objects: 100% (10/10), 5.69 KiB, done.
    Resolving deltas: 100% (1/1), done.
- 49. Local branches are yours only
    c:\test\test> git branch -a
    * master
      remotes/origin/HEAD -> origin/master
      remotes/origin/master
    c:\test\test> git checkout -b working
    Switched to a new branch 'working'
    c:\test\test> git branch -a
      master
    * working
      remotes/origin/HEAD -> origin/master
      remotes/origin/master
- 50. git fetch ; git merge (here `git st` is presumably the presenter's own alias for `git status`; it is not a built-in command)
    c:\test\test> git st
    # On branch master
    nothing to commit (working directory clean)
    c:\test\test> git fetch
    remote: Counting objects: 4, done.
    remote: Compressing objects: 100% (2/2), done.
    remote: Total 3 (delta 1), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    From dreamiurg.unfuddle.com:dreamiurg/test
       3ade0ca..6309355  master     -> origin/master
    c:\test\test> git st
    # On branch master
    # Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
    #
    nothing to commit (working directory clean)
    c:\test\test> git merge origin/master
    Updating 3ade0ca..6309355
    Fast-forward
     new.cpp |    1 +
     1 files changed, 1 insertions(+), 0 deletions(-)
     create mode 100644 new.cpp
- 51. git pull = git fetch ; git merge
    c:\test\test> git pull
    remote: Counting objects: 5, done.
    remote: Compressing objects: 100% (2/2), done.
    remote: Total 3 (delta 1), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    From dreamiurg.unfuddle.com:dreamiurg/test
       6309355..a4212c7  master     -> origin/master
    Updating 6309355..a4212c7
    Fast-forward
     new.cpp |    1 +
     1 files changed, 1 insertions(+), 0 deletions(-)
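The "Fast-forward" in the output above can be modelled as a two-step update of references. This is a toy sketch, not how Git is implemented: `is_ancestor` and `pull` are hypothetical helpers, the history mapping reuses the abbreviated commit names from the transcript, and the history before `3ade0ca` is simply omitted.

```python
def is_ancestor(history: dict[str, list[str]], older: str, newer: str) -> bool:
    """True if older can be reached from newer by following parent links."""
    stack, seen = [newer], set()
    while stack:
        commit = stack.pop()
        if commit == older:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(history.get(commit, []))
    return False


def pull(refs: dict[str, str], history: dict[str, list[str]],
         branch: str, tracking: str, fetched_tip: str) -> str:
    """Model pull as fetch (move the tracking ref) followed by merge."""
    refs[tracking] = fetched_tip  # fetch step: only the remote-tracking ref moves
    if refs[branch] == refs[tracking]:
        return "Already up to date"
    if is_ancestor(history, refs[branch], refs[tracking]):
        refs[branch] = refs[tracking]  # merge step: just move the branch forward
        return "Fast-forward"
    return "Merge commit needed"


history = {"3ade0ca": [], "6309355": ["3ade0ca"], "a4212c7": ["6309355"]}
refs = {"master": "6309355", "origin/master": "6309355"}
print(pull(refs, history, "master", "origin/master", "a4212c7"))
print(refs["master"])
```

A fast-forward is possible only when the local branch has no commits of its own; otherwise the two lines of history have diverged and a real merge commit with two parents is required.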
- 52. GUIs Simplicity is not a matter of dumbing things down. Simplicity is when someone takes care of the details.
- 58. git vs. Mercurial
  - git
    - History is a DAG
    - Tools to manipulate history: rebase, reset, commit amend, etc.
    - A branch is just a reference (head)
    - Faster on Linux systems
    - Written in C
    - Used by Linux, Rails, Perl, Android, Wine, Fedora, Gnome, etc.
    - Hosting: github.com (619,333 users, 1,783,177 repos)
    - That’s it
  - Mercurial
    - History is a DAG, but tries to be linear, causing negative effects in some places (same rev number over different repos)
    - No tools to manipulate history by default
    - Confusion working with branches: named/unnamed, etc.
    - Written in Python
    - Used by Mozilla, OpenJDK, OpenSolaris, Xen, Symbian, Go, etc.
    - Hosting: bitbucket.org (100,000+ users, 49,334 repos)
    - That’s it
- 59. … investigate it yourself Rebase Git stash Git bisect (binary search) Tagging (with/without message, +signed tags possible) … Profit!
- 60. Q & A
  - Start here: http://book.git-scm.com/
  - For those who know SVN - http://git.or.cz/course/svn.html
  - Git for Windows - http://code.google.com/p/msysgit/