67

I've been reading Martin Fowler's note on Continuous Integration and he lists as a must "Everyone Commits To the Mainline Every Day".

I do not like to commit code unless the section I'm working on is complete, so in practice I commit my code every three days: one day to investigate/reproduce the task and make some preliminary changes, a second day to complete the changes, and a third day to write the tests and clean it up^ for submission. I would not feel comfortable submitting the code sooner.

Now, I pull changes from the repository and integrate them locally usually twice a day, but I do not commit that often unless I can carve out a smaller piece of work.

Question: is committing every day such a good practice that I should change my workflow to accommodate it, or is it not that advisable?

Edit: I guess I should have clarified that I meant "commit" in the CVS meaning of it (aka "push") since that is likely what Fowler would have meant in 2006 when he wrote this.

^ The order is more or less arbitrary and depends on the task; my point was to illustrate the time span and activities, not the exact sequence.

16
  • 21
    You may commit your code if it compiles and performs some useful logic. Better to commit code in short cycles if you are working in a team environment.
    – Yusubov
    Commented Jul 5, 2012 at 20:35
  • 4
    Is Martin Fowler assuming a VCS that isn't distributed?
    – user16764
    Commented Jul 6, 2012 at 0:22
  • 4
    Note the date on that article: May 1st, 2006. Git and Mercurial weren't even begun until April 2005, and my impression is that they really started getting traction in about 2008. I can't find any articles on Fowler's site that refer to either one of them prior to 2009. So this article from 2006 clearly assumes a centralized source control system like SVN. The advice is not applicable to teams using a DVCS.
    – Kyralessa
    Commented Jul 6, 2012 at 2:35
  • 2
    @Kyralessa: The article even states that "Subversion is the modern [version control system]".
    – che
    Commented Jul 6, 2012 at 8:40
  • 4
    First the code and then the tests?
    – user1249
    Commented Jul 6, 2012 at 9:32

9 Answers

46

I do not agree with this rule and I agree with what Mason Wheeler said. I would like to add a few ideas.

I try to commit every time I have a meaningful change to commit: this can be several times a day if I fix several small bugs, or once a week if I am working on a larger piece of software that cannot be used by the rest of the code in any meaningful way until it reaches a consistent state.

Also, I interpret committing as publishing a meaningful revision that contributes new functionality to the code base. I think one should try to clean up the code before committing so that other developers can understand the meaning and the purpose of the change when they look at the revision history. The fewer changes other developers see in the history, the better: when I look at the revision history I want to see increments that add some meaningful functionality; I am not interested in every small idea each developer had and wanted to try out before they reached the solution.

Furthermore, I do not think it is a good idea to use the SVN server (or whatever version control system) as a backup facility to which the current snapshot of the code is committed (provided that it compiles): you can use a USB stick or an external USB-drive or a network disk to mirror your current code so that it does not get lost if your computer breaks down. Revision control and data backup are two different things. Publishing a revision is not the same as saving a snapshot of your code.

Finally, I think that it should not be a problem to commit every now and then (i.e. only when one is really satisfied with the current state of the code) and avoiding merge conflicts is not a good justification for committing (too) often. Many merge conflicts happen when different people work on the same files at the same time, which is a bad practice (see e.g. this article, point 7). Merge conflicts should be reduced by splitting a project into modules with clear interfaces and as few dependencies as possible, and by coordinating the work of developers so that the code they work on overlaps as little as possible.

Just my 2 cents.

EDIT

Another reason against premature commits that came to my mind is that a (very) buggy version cannot be tested. If you are committing on the trunk and your test team is testing every day, they might have no testable version for a few hours (or for a day). Even if you do not try to fix the bug and just revert your changes, a rebuild can take a couple of hours. With, say, five testers working in your team, you have wasted 5 x 2 = 10 hours of the team's time due to inactivity. It happened to me once, so I really try to avoid premature commits in the name of "commit as soon as possible".

10
  • 24
    A 'commit' is not a 'publish'. 'Commit' means 'snapshot'; 'publish' is called 'push' in scm-lingo. Of course, SVN just merges both concepts into one, making many sensible workflows impossible, but that's a limitation of the tool, not of source control workflows in general.
    – tdammers
    Commented Jul 5, 2012 at 22:08
  • 3
    "Revision control and data backup are two different things": yes, I definitely feel this way.
    – Sled
    Commented Jul 6, 2012 at 1:59
  • 1
    @tdammers: I meant publish in an informal way: As long as the code is on my computer it is my private changes to the common code. As soon as I commit it, it is published, known to the rest of the team and part of the official project history.
    – Giorgio
    Commented Jul 6, 2012 at 4:37
  • 1
    In that case, 'commit' is probably the wrong word. Many SCMs allow for local commits, and sharing your code with the rest of the team is a separate action, usually called 'push'. Again, SVN lumps the two concepts together, but that is a limitation of the tool, and if it gets in the way of your workflow, consider switching to a different SCM.
    – tdammers
    Commented Jul 6, 2012 at 9:02
  • @tdammers: To have a clear distinction between local commit and publish would be a step forward. In SVN I can use a separate branch for that though. But again, I wonder why I would want to keep track of a revision that does not make much sense to me. I am not convinced that I want a new revision (even a private one) just because it is 5 o'clock and I am going home. I prefer to have a backup instead.
    – Giorgio
    Commented Jul 6, 2012 at 9:34
111

I commit code several times a day. Whenever I reach a point where the code is complete enough to compile and doesn't break other things, it goes in.

You should look at breaking up your work so you can safely check in a few times a day.

The rationales for this are two:

  1. Any work that is not checked in may be lost - your computer may have a catastrophic failure. In this case, the longer you wait, the more work you lose.
  2. The more work you do without checking in, the more code others will need to integrate when you finally decide that it is baked. This introduces more chances of conflicts and merge issues.
20
  • 3
    If you've got a serious problem with conflicts and merge issues, it means your project manager isn't doing his job. Multiple cases involving similar functionality should go to the same developer, precisely so that you don't have two or more coders stomping each other's work.
    Commented Jul 5, 2012 at 20:19
  • 15
    @MasonWheeler - After 3 days of work that has not been committed, there is a very good chance that one has touched code that others have touched at the same time. If you have a bunch of programmers doing this, even the best project manager can't prevent conflicts from occurring.
    – Oded
    Commented Jul 5, 2012 at 20:22
  • 3
    @Oded: Maybe. I suppose my response is colored by my experience on a codebase large enough that our developers (about a dozen coders on the team) all tend to have non-overlapping responsibilities. Not sure how different it would be on smaller projects.
    Commented Jul 5, 2012 at 20:27
  • 3
    @ArtB - What if there is someone like yourself that only checks in every 3 days? Or once a week? You are relying on others doing the right thing.
    – Oded
    Commented Jul 6, 2012 at 8:30
  • 3
    When I read the question, my response was "is that like asking if it's a good idea to shower every week"?
    Commented Jul 8, 2012 at 12:05
41

Slavishly adhering to any methodology or practice without understanding the reasons behind it is never a good idea. That's where cargo-cult programming comes from.

Therefore, "I should commit every day because Martin Fowler said so" is just stupid. And sometimes it's impractical too. If you're working on a complicated new feature, you might not reach a point where it's worth checking in until you've already worked on it for a few days.

This doesn't mean you should make sure everything's perfect before checking it in. That's a good way to lose work if something goes wrong. The correct thing to do is to develop and use good judgment on the matter. Rules of thumb can only help you so much.

4
  • 2
    Then if it is a complex feature integration/development, it is still a big loss not to commit it, maybe not to the trunk, but at least to a branch for this feature. That's what branches are for!
    – Vincent B.
    Commented Jul 6, 2012 at 1:32
  • 2
    What do you mean 'worth checking in'? If it doesn't break anyone else's code, why wouldn't you check it in?
    Commented Jul 6, 2012 at 3:05
  • 3
    "What do you mean 'worth checking in'? If it doesn't break anyone else's code, why wouldn't you check it in?": because I do not want to keep old copies of the code just because they existed at some point in time. I want to keep an old copy of the code if it contains some useful information that I might want to retrieve in the future. Otherwise I am just producing useless noise in the revision history.
    – Giorgio
    Commented Jul 6, 2012 at 9:54
  • 3
    +1. I once worked in a team where we had to check code into the VCS every day, even if the code was a spike or a useless investigation. It proved inefficient and wasteful, particularly because it required periodic maintenance to clean the VCS up. It was due to a combination of paranoia over potentially losing a little time to redo something, and the manager having read in a book that you should commit every day. An extreme example perhaps, but seriously, if you haven't the judgement to know whether it's "worth" checking something in, you're probably not well suited to the job.
    – S.Robins
    Commented Jul 6, 2012 at 13:43
15

Oded gave two important reasons to commit code as frequently as possible. I'll add a few more:

  1. While you are working on your piece of code, others might need some functions in that code. They shouldn't have to wait 6 days to get them. In this case my colleagues usually create a prototype in my piece of code and commit it; then I add the body and commit it again. This is usually done in a few hours.

  2. The 'common' code is for everyone to see every change as soon as possible. If the piece of code you're working on is totally separate from others' work and you will not have them wait, then it is recommended to create a branch for you to work on, and then, if everything is successful, merge it to the mainline.
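
The prototype-first workflow from point 1 might look roughly like the sketch below. It is only an illustration: the function names `parse_order_stub` and `parse_order` and the input format are hypothetical and not taken from any real project. The idea is that the colleague who needs the function commits the agreed signature first, and the body follows in a second commit.

```python
# First commit (by the colleague who needs the function): the signature is
# agreed on and committed as a stub, so calling code can already be written.
def parse_order_stub(line: str) -> dict:
    """Hypothetical stub: the interface is fixed, the body is still missing."""
    raise NotImplementedError("body to follow in a later commit")


# Second commit (by the owner of the module, a few hours later): same
# parameters and return type, now with a real body.
def parse_order(line: str) -> dict:
    """Parse a line such as 'widget, 3' into an item name and a quantity."""
    item, quantity = line.split(",")
    return {"item": item.strip(), "quantity": int(quantity)}


if __name__ == "__main__":
    print(parse_order("widget, 3"))
```

In a real repository both versions would carry the same name; they are named differently here only so that the two stages fit in one runnable file.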

3
  • 1
    Why is this answer, with (IMO) the only correct and accurate answer (point 2), so low rated?! Of course that's the point of a branch! @Mason Wheeler: So you enjoy coding several days in a row without committing a single time? Then why use a version control system?!
    – Vincent B.
    Commented Jul 6, 2012 at 1:32
  • 2
    This is the correct answer. If your task is many days of work before it is usable, then branch. Otherwise you commit whenever it works to ensure that team members have the latest version, they can test that it works, and they identify added/missing features ASAP.
    Commented Jul 6, 2012 at 2:36
  • "So you enjoy coding several days in a row without committing a single time? Then why use a version control system?!": Because eventually you do want to commit a revision, even though you are not forced to blindly commit every day. Rather, it is up to you to decide whether you commit several times a day or you work three days in a row without committing. I really do not see the point in committing some unfinished feature that nobody can use: just make a backup, the next day you can finish it and commit it.
    – Giorgio
    Commented Jul 7, 2012 at 22:08
8

I'm a strong believer in committing every logical change that is worth keeping. Commit often, and if code isn't worth keeping, revert it back to a clean state. The longer you wait to push/publish your code back, the harder it is to integrate, and the more problems you'll run into. You'll also get feedback about your contributions a lot quicker:

  • do they break the build?
  • are you duplicating another team member's efforts?
  • are you doing something incorrect?
  • or are people waiting on things from you?

Small changes are a lot easier to manage.

Also, it's worth noting the difference between version control systems. Some, such as Git (distributed), allow you to commit and control your entire history locally, only pushing when you are ready to publish. Others, like SVN (centralized), combine the two steps, making small commits very inefficient.
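
To make the distinction concrete, here is a toy model in Python. It is a sketch only: `ToyRepo` and its methods are invented for illustration and are not the API or data model of Git, SVN, or any other tool. It shows commits accumulating privately and reaching the shared history only on an explicit push, whereas a centralized tool would effectively do both steps at once.

```python
class ToyRepo:
    """Toy model (hypothetical, not Git's real design) of local vs. shared history."""

    def __init__(self, shared_history: list):
        self.shared = shared_history  # what the whole team sees
        self.local = []               # commits only this developer sees

    def commit(self, message: str) -> None:
        # Distributed style: a commit is a private snapshot.
        self.local.append(message)

    def push(self) -> None:
        # Publishing is a separate, deliberate step.
        self.shared.extend(self.local)
        self.local.clear()


mainline = []
repo = ToyRepo(mainline)
repo.commit("reproduce bug")
repo.commit("fix bug")
print(mainline)  # [] : nothing has been published yet
repo.push()
print(mainline)  # ['reproduce bug', 'fix bug']
```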

Don't forget that your commits are essentially change documentation. When things go wrong, you'll be glad to have too much history rather than not enough. A single commit for a week's work seems useless to me. I'd just end up reading every single line of code changed rather than the summary of each logical chunk.

5

I think most of the answers here miss one of the main points in Martin Fowler's statement. This is related to Continuous Integration. Code that isn't checked in (pushed/published/merged) into the mainline isn't tested.

This should not be read as an encouragement to commit whatever code you have in your local machine whenever it's time to leave the office. As pointed out by several others here that would be bad, would break the build and cause an unstable mainline.

However, it is an encouragement to try to make your changes in small steps that can be checked in to the mainline without causing problems. This encourages evolution of the code instead of ripping it all apart and rewriting.

Now, what's good about this way of working?

  1. Not committing large chunks of code or revolutionary changes reduces the chance of breaking the build.
  2. If your commit breaks the build it is fairly trivial to identify what the problems are, to revert it and then commit a fixed version quickly.
  3. By making sure all tests run on every small change in the code, you ensure that you don't introduce subtle bugs or regressions that can come from having code grow outside of the continuous integration scheme.

Of course not all changes lend themselves to this approach. As others pointed out, no rule is absolute. However, for changes that are expected to stay out of mainline for a long time, set up an alternative mainline with its own continuous integration scheme and follow the same approach towards it. With the distributed VCSs of today that's a fairly easy thing to do.

4
  • +1: "Of course not all changes lend themselves to this approach." I think this is the point. I find Fowler's advice OK, but one should judge from case to case. Instead, this advice is often generalized to an absolute rule and followed without any further consideration.
    – Giorgio
    Commented Jul 13, 2012 at 7:14
  • @Giorgio, I absolutely agree with you on that. No advice should be taken as absolute rules, no matter who is behind it.
    – harald
    Commented Jul 13, 2012 at 20:33
  • Some more ideas on this. "Code that isn't checked in (pushed/published/merged) into the mainline isn't tested.": I agree that this is a good principle and one should not wait weeks before checking in and having their code tested. However, blind application of this principle can lead to a broken application that cannot even be tested (I have seen this live: the whole test team sits idle for days and cannot test anything until the code is brought back into a usable state). Maybe what other users wrote is applicable to some situations but it is not in general.
    – Giorgio
    Commented Jul 13, 2012 at 21:12
  • 1
    Checking in unstable code is never ok. A commit that breaks the CI should be reverted. If you commit small incremental changes often, there's less chance of introducing such breakage than if you have a big change that has been left untested for a long time. It may also be easier to revert if it breaks the build. But as you say, sometimes there's no way outside of a disruptive change. Then by all means polish it as best as you can, and test thoroughly before committing it. The point is not following rules, but understanding where the advice comes from.
    – harald
    Commented Jul 14, 2012 at 8:08
3

Arguments for checking in every day:

  • Code is stored and backed up against hard drive failure
  • Activity can be recorded in commit notes (what did I do on Thursday...?)
  • Integration with existing code base happens earlier and in smaller chunks, hopefully identifying conflicts or merge issues sooner
  • Your team have visibility of what you have been working on
  • Your colleagues can work against your interfaces sooner, giving them more time to integrate with your 'big complex bit of code'
  • Your code will be real-world tested sooner, or at least exposed to more use than you will give it, leading to earlier identification of bugs or omissions.

Arguments against checking in every day:

  • Don't need to or don't want to
  • Haven't 'cleaned up' my code yet, it's a mess
  • Don't have time

I don't believe there's any good reason to check in less than daily apart from laziness or disorganisation. There is nothing worse than seeing that the code running in the development environment doesn't match the code in the development branch because someone 'hasn't finished yet' and thus hasn't checked in.

I'd love to be wrong on this so please let me know any legitimate argument against daily check-in.

8
  • "I don't believe there's any good reason to check in less than daily apart from laziness or disorganisation.": I believe the opposite for exactly the same reason. I can take the time to look at the current state of the code and decide if it contains some relevant information that is worth remembering, or, if I am lazy and disorganized, I can simply check it in (and produce extra revisions with little information content) as long as it compiles.
    – Giorgio
    Commented Jul 6, 2012 at 9:57
  • 1
    I understand your point that one should not be lazy and clean up their code every day so that it can be checked in. On the other hand, when working on some complex code, this is difficult to achieve because the clean up may take several hours, and you cannot spend several hours every day only to clean up your code.
    – Giorgio
    Commented Jul 6, 2012 at 10:04
  • @Giorgio So you spend several days cleaning up your code? I've given some good reasons for checking in daily - your reason is that you will have to clean up your code? Just write cleaner code straight up.
    Commented Jul 8, 2012 at 22:56
  • This is not always possible, e.g. if I am developing from scratch some complex code (> 4000 LOC) which needs a lot of experimentation to get right. It is quite possible that at the end of the day the code is a bit messy and I don't want to fix it until I get to a consistent state, which is a couple of days later. Unfortunately I am not so smart that finished, perfect code forms in my mind and I can always have it all written down in a few hours (i.e. at the end of one day). I had such an experience lately and the typical development cycle (from one consistent state to the next) was 2 to 3 days.
    – Giorgio
    Commented Jul 9, 2012 at 1:20
  • @Giorgio do you not have a development branch that you are checking into? The code should be checked in so that other people can review and test it as well.
    Commented Jul 9, 2012 at 1:31
2

If you mean "commit" as "merge into mainline", then you definitely should not be doing that every day on a software project that's being released to customers. You should be merging changes that are done and tested, so that the mainline is always working and releasable, and not in some broken state with half-finished features.

However, the luxury of working with today's distributed version control is that you can both keep mainline stable, and at the same time do your git/hg/whatever commit every time you feel you want to preserve the state of things. I do this once every few hours and definitely at the end of every day.

With DVCS, you can publish your work, collaborate on it with others in your team, and keep it up to date with changes in the mainline branch. You can do all this without polluting the stability of code your customers and/or other teams depend on.

In times when Subversion was the latest technology and there was no way to fork and merge feature branches without extreme pain, having a mainline where several different features were in simultaneous construction might have been the best approach. But this superiority does not scale beyond 2010.

2

In Team Foundation Server you can 'Shelve', which is not the same as a check-in but just makes a backup of your code, so that if your machine dies you have not lost the changes.

I have also seen software houses that have a 'developer line' and a 'mainline'. Devs are free to check in to the developer line whenever they deem fit and only the team leader has access to the mainline so they are responsible for copying code from dev to main when it is production ready.
