104

Why would you run unit tests on a CI server?

Surely, by the time something gets committed to master, a developer has already run all the unit tests before and fixed any errors that might've occurred with their new code. Isn't that the point of unit tests? Otherwise they've just committed broken code.

  • 52
    Our developers are not allowed to commit to master. They push to a feature branch, the CI server then merges with master and runs tests. If they succeed, the changes are merged to master. So code with broken tests cannot be on master... Commented Jan 27, 2016 at 16:45
  • 2
    @BoristheSpider - very good workflow indeed. master should always be sane, and preferably automatically deployed on each merge to a staging environment for internal QA & testing. Commented Jan 27, 2016 at 18:04
  • 139
    "Surely, by the time something gets committed to master, a developer has already run all the unit tests before and fixed any errors that might've occurred with their new code." What fantasy world do you live in?
    – jpmc26
    Commented Jan 27, 2016 at 22:21
  • 6
    In some industries, the important part isn't just to run the tests on the code, it's to run the tests on the binaries. Running the tests on the CI output means that you can guarantee that the delivered product works, because the exact binary your client received is the one that passed all your tests. It sounds trivial, but sometimes this can have an effect (one I've seen is obfuscation; on complex projects, or when set up oddly, it can cause problems in the obfuscated build that weren't there in the clean version). Commented Jan 28, 2016 at 16:53
  • 6
    "Surely, by the time something gets committed to master, a developer has already run all the unit tests before and fixed any errors that might've occurred with their new code."...not sure if serious
    – chucksmash
    Commented Jan 29, 2016 at 16:44

9 Answers

232

Surely, by the time something gets committed to master, a developer has already run all the unit tests before and fixed any errors that might've occurred with their new code.

Or not. There can be many reasons why this can happen:

  • The developer doesn't have the discipline to do that
  • They have forgotten
  • They didn't commit everything and pushed an incomplete commit set (thanks Matthieu M.)
  • They only ran some tests, but not the whole suite (thanks nhgrif)
  • They tested on their branch prior to merging (thanks nhgrif * 2)

But the real point is to run the tests on a machine that is not the developer machine. One that is configured differently.

This helps catch out issues where tests and/or code depend on something specific to a developer box (configuration, data, timezone, locale, whatever).
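
For example (a contrived sketch, not taken from the answer itself, with a hypothetical helper name), a test like this passes on a developer box whose clock happens to be set to UTC but fails on a CI server, or a colleague's machine, in any other timezone, exposing the hidden dependency on local configuration:

```python
import datetime
import unittest

def format_build_stamp(ts: float) -> str:
    """Format a Unix timestamp for a build report (hypothetical helper)."""
    # Bug: uses the machine's local timezone instead of UTC.
    return datetime.datetime.fromtimestamp(ts).strftime("%Y-%m-%d %H:%M")

class BuildStampTest(unittest.TestCase):
    def test_epoch_formats_as_expected(self):
        # Only true when the local timezone happens to be UTC.
        self.assertEqual(format_build_stamp(0), "1970-01-01 00:00")

if __name__ == "__main__":
    unittest.main()
```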

Other good reasons for CI builds to run tests:

  • Testing on platforms other than the main development platforms, which may be difficult for a developer to do. (thanks TZHX)

  • Acceptance, integration, end-to-end, and really long-running tests can be run on the CI server, where they would not usually be run on a developer box. (thanks Ixrec)

  • A developer may make a tiny change before pushing/committing (thinking this is a safe change and therefore not running the tests). (thanks Ixrec * 2)

  • The CI server configuration doesn't usually include all the developer tools and configuration and thus is closer to the production system

  • CI systems build the project from scratch every time, meaning builds are repeatable

  • A library change could cause problems downstream - a CI server can be configured to build all dependent codebases, not just the library one (a sketch of this follows the list)
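
A minimal, invented Python sketch of that last point: the library's own tests still pass after a seemingly harmless change, and only the downstream project's suite, rebuilt by the CI server, catches the broken assumption. All names here are illustrative.

```python
from collections import namedtuple

Order = namedtuple("Order", "id done")

# Library code after an "optimisation": it now returns a set instead of a
# sorted list, and its own tests (which only check membership) still pass.
def completed_order_ids(orders):
    return {o.id for o in orders if o.done}   # ordering no longer guaranteed

# Downstream codebase, written against the old sorted-list behaviour:
def latest_completed_order_id(orders):
    # Mistaken assumption: "the ids come back in ascending order,
    # so the last one is the most recent".
    return list(completed_order_ids(orders))[-1]
```

The downstream tests only fail if the CI server rebuilds dependent codebases when the library changes, which is exactly why configuring it that way is worthwhile.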


  • 37
    Other common reasons: 1) The CI server may run high-level integration/acceptance tests that take too long for the developers to always run them. 2) The developer did run them and then made one tiny change before pushing that they were very sure wouldn't break anything, but we want to be certain.
    – Ixrec
    Commented Jan 27, 2016 at 13:34
  • 11
    A change to a dependency often also runs all the downstream builds too. If a change that a developer makes breaks something downstream, it isn't easily seen when modifying a library (say changing an underlying datatype from a SortedSet to a HashSet (only providing the contract of Set) and someone downstream worked on the mistaken assumption that the Set was sorted). Not running the (downstream) tests on the CI server would let that bug fester for a while.
    – user40980
    Commented Jan 27, 2016 at 14:10
  • 2
    @MichaelT Good catch. That's actually the cause for >90% of our CI failures these days, not sure how I forgot it...
    – Ixrec
    Commented Jan 27, 2016 at 15:14
  • 35
    Also, running them in a CI environment usually means you set up your project from scratch, ensuring your build is repeatable. Commented Jan 27, 2016 at 18:54
  • 5
    Also, two changes could be committed that tested okay separately, but break together (e.g. one removing an unused API, and the other starting to use it). Commented Jan 27, 2016 at 20:20
77

As a developer who doesn't run all the integration and unit tests before making a commit to source control, I'll offer up my defense here.

I would have to build, test and verify that an application runs correctly on:

  • Microsoft Windows XP and Vista with Visual Studio 2008 compiler.
  • Microsoft Windows 7 with Visual Studio 2010 compiler.
    • Oh, and the MSI builds for each of those.
  • RHEL 5 and 6 with GCC 4.1 and 4.4 respectively (similarly CentOS)
    • 7 soon. Woop-de-woop.
  • Fedora Workstation with GCC, for the three most recent releases.
  • Debian (and derivatives like Ubuntu), for the three most recent releases.
  • Mac OS X, for the three most recent releases.
    • And the packages (rpm, dmg, etc)

Add in the Fortran (with both Intel and GNU compilers), Python (and its various versions depending on the OS) and bash / bat script components and, well, I think you can see how things spiral out of control.

So that's sixteen machines I'd have to have, just to run a few tests a couple of times a day. It would be almost a full time job just to manage the infrastructure for that. I think almost anyone would agree that's unreasonable, especially when you multiply it out by the number of people on the project. So we let our CI servers do the work.
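
To make that concrete with one small, invented example: the snippet below passes its test on the Linux and macOS machines but fails on a Windows build agent, because text mode writes "\r\n" line endings there. It is exactly the kind of difference a developer working on a single OS never sees locally.

```python
import os
import tempfile
import unittest

def write_names(path, names):
    # Text mode: on Windows every "\n" is written to disk as "\r\n".
    with open(path, "w") as fh:
        fh.write("\n".join(names) + "\n")

class WriteNamesTest(unittest.TestCase):
    def test_file_size(self):
        path = os.path.join(tempfile.gettempdir(), "names.txt")
        write_names(path, ["ada", "grace"])
        try:
            # 10 bytes on Linux/macOS ("ada\ngrace\n"),
            # 12 bytes on Windows ("ada\r\ngrace\r\n").
            self.assertEqual(os.path.getsize(path), 10)
        finally:
            os.remove(path)

if __name__ == "__main__":
    unittest.main()
```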

Unit tests don't stop you committing broken code; they tell you when they know you've broken something. People can say "unit tests should be fast", and go on about principles and design patterns and methodologies, but in reality sometimes it's just better to let the computers we've designed for repetitive, monotonous tasks do that work, and only get involved if they tell us they've found something.

  • 3
    Unit testing tests code, not configurations. It would be seriously remiss of you to add a new test and throw it over the wall without even running it locally first...
    – Robbie Dee
    Commented Jan 27, 2016 at 15:18
  • 34
    @RobbieDee I'm afraid I can't see your point? I don't suggest creating new tests without testing them locally, or just blindly committing things to source control without testing them yourself, and I would run the tests on my own machine -- but "configuration" does need to be tested for consistent behavior, and it's better to do that relatively quickly when the developer's mind is still in that area than finding an issue when the team who predominantly uses Macs wake up four thousand miles away and update their copies.
    – TZHX
    Commented Jan 27, 2016 at 15:29
  • 7
    @RobbieDee I'd say TZHX would run all the tests locally if they could do so, but they can't. Since TZHX can't, they run some tests locally (those that can run on their dev system and are short enough, or are most relevant to the changed code, for example), and let the full battery run on the CI system. Fairly reasonable.
    – muru
    Commented Jan 27, 2016 at 23:52
  • 11
    @RobbieDee: He believes in unit testing. So he runs the tests on his MacBook Air, they pass, and he checks in. The CI servers running Red Hat, Solaris and Windows then run those tests again. Isn't it nice to know that what you tested also works on production platforms?
    – slebetman
    Commented Jan 28, 2016 at 7:27
  • 2
    @RobbieDee: I have often written unit tests that were specific to a certain compiler on a certain platform. Consider e.g. a graphics subsystem that makes use of AMD (the Intel competitor) specific CPU instructions which are only available on g++ (the GNU C++ compiler) version 4.5 or newer, but I happen to work on an Atom CPU and ICC (the Intel C++ Compiler). It would be nonsense to run the AMD/g++4.5 tests every time on that machine, yet it is code to be tested before release; plus my own CPU-independent code must be tested for proper interoperability. Sure, there are VMs and emulators, ...
    – phresnel
    Commented Jan 28, 2016 at 14:41
23

You'd think so, wouldn't you? But developers are human and they sometimes forget.

Also, developers often fail to pull the latest code. Their latest tests might run fine, and then at the point of check-in someone else commits a breaking change.

Your tests may also rely on a local (unchecked-in) resource - something your local unit test run wouldn't flag.
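
For instance (an invented sketch), a test that quietly reads a fixture sitting only in the author's working copy will pass locally and fail on the CI server's clean checkout:

```python
import json
import unittest

class CustomerImportTest(unittest.TestCase):
    def test_sample_file_has_three_customers(self):
        # This fixture exists on the developer's machine but was never added
        # to version control, so the CI server's fresh checkout raises
        # FileNotFoundError here.
        with open("tests/fixtures/customers.json") as fh:
            customers = json.load(fh)
        self.assertEqual(len(customers), 3)

if __name__ == "__main__":
    unittest.main()
```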

If you think all the above is fanciful, there is a level above CI (on TFS at least) called gated check-in, where builds with failing tests are shelved and never reach the code base.

  • 7
    I've seen more "oops, I forgot to commit that" CI failures than I care to admit to. Commented Jan 27, 2016 at 15:10
  • 1
    @DanNeely To be fair, it beats getting your butt kicked by the build manager because you forgot to tell him/her about something... :-)
    – Robbie Dee
    Commented Jan 27, 2016 at 15:12
  • 3
    That's one of the reasons I love CI. Finding out about and fixing your own oopses is much better than having someone else find them for you. Commented Jan 27, 2016 at 15:14
23

Apart from Oded's excellent answer:

  • You test the code from the repository. It may work on your machine with your files... that you forgot to commit. It may depend on a new table whose creation script was never checked in (in Liquibase, for example), or on some configuration data or properties files.
  • You avoid code integration problems. One developer pulls the latest version, writes unit and integration tests, adds code, gets all tests passing on his machine, commits and pushes. Another developer has just done the same. Both changes are correct on their own, but merged together they cause a bug. This might happen at the repository merge, or it might not even be detected as a conflict. E.g. Dev 1 deletes a file that was not used at all, while Dev 2 writes code against that file and tests without Dev 1's changes.
  • You develop a script to deploy automatically from the repository. Having a universal build and deployment script solves a lot of issues. Some developer may have added a lib or a compile option that is not shared by everybody. Not only does this save you time, but more importantly, it makes the deployment safe and predictable. Furthermore, you can go back in your repository to version 2.3.1 and deploy that version with a script that works with that version. This includes database objects like views, stored procedures, and triggers, which should be versioned. (Otherwise you won't be able to go back to a workable version.)
  • Other tests: integration, performance and end-to-end tests. These can be slow and might involve testing tools like Selenium. You may need a full set of data with a real database instead of mock objects or HSQL.

I once worked at a firm that had a lot of bugs on deployment due to its merging and deployment process, which was caused by a weird proprietary framework that made testing and CI hard. It was not a happy experience to find that code that worked perfectly in development didn't arrive in production intact.

  • Yep, simply forgetting to commit some of the changes is very common. I'd say forgetting to "svn add" new files and so forgetting to commit them later is the most popular way to get a failing automatic build.
    – sharptooth
    Commented Jan 28, 2016 at 7:28
14

by the time something gets committed to master

I usually set up my CI to run on every single commit. Branches don't get merged into master until the branch has been tested. If you're relying on running tests on master, then that opens a window for the build to be broken.

Running the tests on a CI machine is about reproducible results. Because the CI server has a known clean environment pulled from your VCS, you know that the test results are correct. When running locally, you could forget to commit some code needed for them to pass, or have uncommitted code that makes them pass when they should be failing.

It can also save developers time by running different suites in parallel, especially when some are slow, multi-minute tests that aren't likely to be run locally after each change.
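
One common way to split those suites is with pytest markers (a sketch; the marker name, timings and commands below are examples, not something from this answer):

```python
import time
import pytest

@pytest.mark.slow   # register the "slow" marker in pytest.ini/pyproject.toml
def test_full_reindex_end_to_end():
    # Stand-in for a genuinely slow integration test.
    time.sleep(120)
    assert True

# Developers run the quick subset locally:
#     pytest -m "not slow"
# while the CI job runs everything, possibly split across parallel agents:
#     pytest
```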

At my current work our production deployment is gated on CI passing all tests. The deploy scripts will prevent deployment unless they're passing. This makes it impossible to accidentally forget to run them.

CI being part of the workflow takes a burden off developers as well. As a developer, do you usually run a linter, static analyzer, unit tests, code coverage, and integration tests for every single change? CI can, completely automatically and without anyone needing to think about it, which reduces decision fatigue.
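
A minimal sketch of such a CI step, assuming a Python project and one particular (example) set of tools; any of them could be swapped for your own linters and test runners:

```python
#!/usr/bin/env python3
"""One entry point a CI job can call for every change."""
import subprocess
import sys

STEPS = [
    ["flake8", "src", "tests"],                    # linter
    ["mypy", "src"],                               # static analysis
    ["coverage", "run", "-m", "pytest", "tests"],  # unit/integration tests
    ["coverage", "report", "--fail-under=80"],     # coverage gate
]

def main() -> int:
    for cmd in STEPS:
        print("running:", " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            return 1   # fail the build on the first broken step
    return 0

if __name__ == "__main__":
    sys.exit(main())
```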

  • 1
    You shouldn't really have slow unit tests - this violates FIRST principles.
    – Robbie Dee
    Commented Jan 27, 2016 at 13:33
  • 4
    @RobbieDee: I think that usually the CI server runs all the tests, not just the unit tests. Commented Jan 27, 2016 at 13:35
  • 4
    @RobbieDee: in theory all unit tests are fast. In practice.... Regardless, CI can and should run all the tests - linters, static analysis, unit tests, integration tests.
    – Daenyth
    Commented Jan 27, 2016 at 13:39
  • 2
    @RobbieDee Obviously the specifics of configuration will vary from team to team. Even when the builds take multiple minutes, it's often possible to run multiple of those builds in parallel. Given a single monolithic codebase it could be a larger drawback, but IME it's not a barrier.
    – Daenyth
    Commented Jan 27, 2016 at 13:50
  • 1
    @RobbieDee I think it depends more on your architecture. I've seen it work first-hand for an engineering team of ~80, but that's with well-defined sub-teams for product areas.
    – Daenyth
    Commented Jan 28, 2016 at 3:42
5

By the time something gets committed to master, a developer should have already run all the unit tests ... but what if they haven't? If you don't run the unit tests on the CI server, you won't know until someone else pulls the changes to their machine and discovers that the tests have just broken on them.

In addition, the developer may have made a mistake and referenced a local resource specific to their machine. When they check in the code and the CI run fails, the problem is immediately identified and can be corrected.

3

Assuming (contrary to other answers) that developers are quite disciplined and do run unit tests before committing, there can still be several reasons:

  • running unit tests can take a long time under some special setups. For example, running unit tests under a memory checker (like Valgrind) can take much longer, and even though all the unit tests pass, the memory check can fail.
  • the result is not that important for some special settings - for example, measuring code coverage requires special compilation flags. For most developers code coverage is not that important day to day; it matters more to the people responsible for keeping code quality up, like team leads.
3

It is possible to imagine cases where change A does not break the tests and change B does not break the tests, but A and B together do. If A and B are made by different developers, only the CI server will detect the new bug. A and B may even be two parts of the same larger change.

Imagine a train pulled by two locomotives, A and B. Maybe one is more than enough, so removing the other is a sensible fix. However, if the two "fixes" are applied independently and both locomotives are removed, the train will not move.
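
In code form (an invented minimal example): change A deletes a helper that looks unused, change B starts calling it; each branch's tests pass against the master it was built from, and only the CI build of the merged result fails.

```python
# --- master, before either change ------------------------------------
# def legacy_discount(price):        # looks unused, so change A deletes it
#     return price * 0.9

# --- change B, written and tested against the old master -------------
def total(prices, member=False):
    subtotal = sum(prices)
    # B starts relying on the helper that A is simultaneously removing.
    return legacy_discount(subtotal) if member else subtotal

# On each branch in isolation every test passes. On the merged result,
# total([100], member=True) raises NameError because legacy_discount is
# gone -- a failure only the CI build of the merged code can show.
```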

Also, not all developers run all the unit tests, even though most good developers do.

2

Let's ask an equivalent question:

Why would you build the code on a CI server?

Surely, by the time something gets committed to master, a developer has already built the code before and fixed any errors that might've occurred with their new code. Isn't that the point of building code? Otherwise they've just committed broken code.


There are several reasons for doing CI, but the main point of CI is to get an idea of what the state of the code is over time. The main benefit (out of several) this provides is that we can find out when the build breaks, figure out what broke it, and then fix it.

If the code is never broken, why do we even use CI? To deliver builds for testing, nightly builds would be good enough.
