23

Assume you're using continuous integration processes that frequently update some target environments, so that whenever there are changes, "you" can test them right away. That's part of the point of CI, no?

But also assume that you have other people involved in your test cycle, e.g. managers or customers. It makes sense to get other people involved in trying to review (break?) your upcoming changes, no?

But if you continuously keep delivering changes to the environment in which those other people are, seriously, trying to test, then multiple issues may arise, such as:

  • they might waste their time reporting issues which, by the time they save their (in-depth) report, they can no longer reproduce themselves (e.g. because you accidentally ran into the same issue and already fixed it in their environment).
  • you might not be able to reproduce issues they reported, since the environment in which they ran into the issue is no longer identical (you (!!!) might have overlaid their environment).

So what can you do (how to configure things?) to avoid such (frustrating) situations?

5 Answers

10

I'll give my experience on this one, mostly because it showcases why some answers are not always applicable.

Some context to start:

  • We have 7 environments to host roughly 80 applications, most of which rely on each other through web services or shared tables on DB2/iSeries.
  • For better or worse, the iSeries is our DB system of reference.
  • This last point rules out any idea of bringing up the app with its dependencies in an isolated environment, as spinning up an AS400 for each would cost too much, and we wouldn't have the hardware to run them anyway.

What we are doing is not fully automated Continuous Delivery; we have a schedule of releases to bring up a coherent set of applications for general operations. Aside from this, each test team can trigger a release into one of the Q/A environments for the application they are testing, and can put a lock on an application version to prevent another team's request from breaking their tests.

Application dependencies are checked before release, so the system won't release something if the other applications can't be updated or don't match the required dependencies. The main idea is to allow updates when they won't impact anyone: if there are no tests planned, a version should flow from the previous environment (and in the mid term, now that we have validated this 'on demand' system, we're aiming to remove the scheduled releases in the first 5 environments).
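Purely as an illustration (this is not our actual tooling), such a pre-release dependency gate could look something like the sketch below; the application names, versions, and data structures are all invented:

```python
# Hypothetical sketch of a pre-release dependency gate. The app names,
# version strings, and registry layout are invented for illustration.

# Versions currently deployed in the target environment.
deployed = {"billing": "2.3", "customer-api": "1.7"}

# What the candidate release requires from other applications.
required = {"customer-api": "1.8"}  # app -> minimum version needed

def release_allowed(required, deployed):
    """Refuse the release if any dependency is missing or too old."""
    for app, needed in required.items():
        current = deployed.get(app)
        if current is None or current < needed:  # naive string compare
            print(f"blocked: {app} needs >= {needed}, found {current}")
            return False
    return True

if release_allowed(required, deployed):
    print("ok to release")
```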

The short version is to have a 'semaphore' system around the applications in the environment: a team should be able to lock its target application along with its dependencies (and transitive dependencies) for the duration of the manual tests.
The implementation of this semaphore is highly dependent on your automation system, so I won't expand on that.
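As a hedged sketch only (the dependency graph, team names, and in-memory lock table are assumptions for illustration; a real system would persist the locks in your automation tooling):

```python
# Hypothetical sketch of the 'semaphore': a team locks an application
# plus its transitive dependencies for the duration of manual tests.
# The dependency graph and team names are invented for illustration.

deps = {  # app -> direct dependencies
    "orders": ["billing", "customer-api"],
    "billing": ["customer-api"],
    "customer-api": [],
}

locks = {}  # app -> team currently holding the lock

def transitive(app):
    """All apps reachable from `app`, including itself."""
    seen, stack = set(), [app]
    while stack:
        a = stack.pop()
        if a not in seen:
            seen.add(a)
            stack.extend(deps.get(a, []))
    return seen

def acquire(team, app):
    """Lock app + transitive deps, or fail if any is held by another team."""
    wanted = transitive(app)
    blockers = {a: locks[a] for a in wanted
                if locks.get(a) not in (None, team)}
    if blockers:
        return False, blockers
    for a in wanted:
        locks[a] = team
    return True, wanted

def release(team, app):
    """Drop every lock this team holds on app's closure."""
    for a in transitive(app):
        if locks.get(a) == team:
            del locks[a]

ok, info = acquire("team-a", "orders")    # locks orders, billing, customer-api
ok2, info2 = acquire("team-b", "billing")  # fails: billing held by team-a
```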

Of course the easy way, as others mentioned, is to create a fresh environment for an application with all its dependencies, which avoids the need for the semaphore described above.
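Where isolation is feasible (unlike our AS400 case), a throwaway per-review environment is straightforward to script. A minimal sketch, assuming a hypothetical docker-compose.test.yml that describes the application and its dependencies:

```python
# Hypothetical sketch: spin up a throwaway test environment per reviewer
# with Docker Compose, hand it over, then tear it down. The compose file
# "docker-compose.test.yml" and the project naming are assumptions.
import subprocess
import uuid

project = f"review-{uuid.uuid4().hex[:8]}"  # unique name -> isolated env

subprocess.run(
    ["docker", "compose", "-p", project,
     "-f", "docker-compose.test.yml", "up", "-d"],
    check=True,
)
try:
    pass  # hand the environment's URL to the reviewer, run smoke tests, ...
finally:
    subprocess.run(
        ["docker", "compose", "-p", project,
         "-f", "docker-compose.test.yml", "down", "-v"],
        check=True,
    )
```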

3
  • This answer is a variation of what I'm used to (mainframes), where we have been doing this kind of thing for at least 1.5 decades or so already (before "DevOps" was born). I wonder if it would make sense to add my own answer here (to further expand on this answer, how we do this with CMN/ZMF for e.g. "banks"), or just take it to a new (self-answered) question. What do you think? Also, I'm curious about that semaphore thing, worth a new question (with reference to this answer)? PS: do you mind if I correct some typos?
    – Pierre.Vriens
    Commented Mar 18, 2017 at 9:49
  • No problem for the edit :) I did keep it generic; that's not very specific to a DevOps org IMHO. Again, DevOps is an organizational change, which may help set up better automation by sharing the concerns... so I call this a semaphore as in programming. I don't think it's worth a question, but that's up to you
    – Tensibai
    Commented Mar 18, 2017 at 10:11
  • Ok, edit completed (as usual: rollback / improve as you see fit). BTW, do you have an "s" on your keyboard?!?!?! Apart from that: stuff to think about over the weekend: see my newest meta question ... Have a good weekend! Time for gardening over here (pruning ...)
    – Pierre.Vriens
    Commented Mar 18, 2017 at 10:20
8

Sounds like you're talking about a test environment which is constantly re-used without being reliably re-initialized for every test execution. This makes such testing unreliable - similar, from a reliability perspective, to manual testing, if you want.

IMHO you shouldn't be using such testing for your CI/CD qualification purposes, as that would effectively invalidate your qualification process (at least in that area). Saying that the software passes test X without actually executing test X for every software version delivered, or without being certain that a pass result is not accidental (due to false positives), will erode confidence in your testing. False negatives are not credibility-damaging, but they are also undesirable because of the unnecessary "noise" they create.

It's fine to execute such testing outside your CI/CD qualification process. But you'd be treating a failed result in such testing just like a customer-found bug: you'd need to reliably reproduce the issue to be able to develop a fix for it and confirm that the fix is working. And you can't really do that if the testing is not reliable.

If you plan to address the issue, then ideally you'd first develop an automated, reliable test case for reproducing it, which you'd use to develop a fix and confirm its effectiveness (the test result should transition from FAIL to PASS). You can (should?) also place this test case inside your CI/CD qualification process to prevent future recurrence, if desired - to increase your overall software release quality level.
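As a sketch of what such a FAIL-to-PASS regression test could look like (the pricing module, function, and issue number below are invented for illustration):

```python
# Hypothetical pytest regression test for a reported issue. The module
# `pricing` and the scenario are invented; the point is that the test
# reliably reproduces the bug (FAIL) before the fix and passes after.
import pytest

from pricing import apply_discount  # assumed application code

def test_issue_1234_discount_not_applied_twice():
    # Reported scenario: re-applying the same code doubled the discount.
    price = apply_discount(100.0, code="WELCOME10")
    price = apply_discount(price, code="WELCOME10")  # second apply: no-op
    assert price == pytest.approx(90.0)
```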

2
  • There's a lot to digest in your answer (I'm not sure I completely get it yet). But regarding what you wrote about "execute such testing outside your CI/CD qualification process": I would expect that the ultimate result of what gets produced/delivered is stored in your QA and prod environments (via CD, either automatic or manual). But then it "seems" to me that CI should also deliver its output over there, while "outside" sounds like separation or duplication or something, no?
    – Pierre.Vriens
    Commented Mar 13, 2017 at 18:35
  • The inside and outside references are relative to the CI verification loop. Basically I question the reason for the QA environment's existence - most of the tests done in there should be reliable and eventually executed as part of the CI verifications, especially in a continuous deployment context - since you want to execute them on every CI iteration (successful up to that point, at least) anyway. Commented Mar 14, 2017 at 2:21
7

The usual approach is to create different environments:

DEV - this is the place where the dev team can mess things up. Here all changes and tunings are made, new versions deployed, and so on. This is the place where CI is fully integrated.

PREPROD/QA - this is the place where the QA/test/validation team "plays" and does its tests. This environment is usually frozen during the tests. CI integration with this environment is limited to providing new versions of the product, configurations, etc.

PRODUCTION - does it need explaining :)?
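One hedged sketch of how this split could be encoded in a deployment script, with a freeze flag for the QA environment (the environment names and policy table are assumptions, not a standard API):

```python
# Hypothetical per-environment deployment policy: CI deploys to DEV
# continuously, respects a freeze flag on PREPROD/QA, and never touches
# PRODUCTION automatically. All names here are invented.

POLICY = {
    "dev":  {"auto_deploy": True,  "freezable": False},
    "qa":   {"auto_deploy": True,  "freezable": True},
    "prod": {"auto_deploy": False, "freezable": True},
}

frozen = {"qa"}  # set by the test team while a test campaign runs

def can_deploy(env):
    rules = POLICY[env]
    if not rules["auto_deploy"]:
        return False, "manual release only"
    if rules["freezable"] and env in frozen:
        return False, "environment frozen for testing"
    return True, "ok"

for env in POLICY:
    ok, why = can_deploy(env)
    print(f"{env}: {'deploy' if ok else 'skip'} ({why})")
```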

5
  • ok, that should help to improve stability, thanks! My question is about "test" environments, so obviously "production" should not be considered as such. Despite those who use "production" for testing - you know the saying "The best test is to activate it in production, and if it doesn't work, just perform a rollback/backout!"?
    – Pierre.Vriens
    Commented Mar 13, 2017 at 18:09
  • @Pierre.Vriens, "playing" in prod IMHO is not wise :) Such separation of environments is intentional. At a previous job we had 5 different environments.... At your service Commented Mar 13, 2017 at 18:14
  • 1
    "I" agree that such playing is not wise. However what can "you" do about the cowboys (my 'term' I use for such juppies) who keep doing this over and over, and each and every time they get approval from their managers to get around the (eg) monthly release activation, by yet another bugfix (eg for their bugfix from the day before ... which introduced a new bug). You think that doesn't happen in the real world? BTW: about the "freeze" in your answer, you think it makes sense to post a question like "What are sample implementations of a frozen environment?"
    – Pierre.Vriens
    Commented Mar 13, 2017 at 18:22
  • @Pierre.Vriens, for me it makes perfect sense to post such a question. Normally this is regulated by company rules, but DevOps creates quite a dynamic environment and this can be a real challenge :) Commented Mar 13, 2017 at 18:36
  • 1
    This is my preferred approach: it gives the devs an environment where they can immediately test their changes in an integrated setting, but keeps QA clean until the code is ready to be formally tested
    – Taegost
    Commented Mar 29, 2017 at 20:43
3

If you're doing CI/CD, that implies that some automated tests are run (CI) prior to deployment (CD). If you're finding a lot of issues in your test environment, that means they aren't being caught by the tests run prior to deployment; this indicates insufficient automated testing. If the developers keep seeing defects crop up in the test environment(s), they need to improve their automated test suites to prevent this. That will also improve quality and reliability overall, all the way through to production.
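Reduced to a sketch, the principle is simply that nothing reaches the shared test environment unless the automated suite passes (the test runner and deploy command below are placeholders, not a prescribed setup):

```python
# Hypothetical CI gate: run the automated test suite and only deploy
# if it passes. The actual test and deploy commands are placeholders.
import subprocess
import sys

tests = subprocess.run(["pytest", "-q"])  # assumed test runner
if tests.returncode != 0:
    sys.exit("tests failed: nothing is deployed to the test environment")

subprocess.run(["./deploy.sh", "test-env"], check=True)  # placeholder
```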

3

To add to Romeo Ninov's answer, internally within an environment you need to try to separate out the applications as much as possible. This is partly why Docker has been so successful for dev/test: it lets you almost pretend that you aren't sharing an environment at all.

The other option is to have very clearly defined servers on which the applications run, separate from the rest of the infrastructure that makes up your environment. I.e., all the environment management or enablement machinery goes on separate, long-lived servers. Then you hook in new short-lived servers, based on a known image, to test an application; if any changes are made to the base image, you need to apply those changes everywhere, for every new component, which means testing the changes against everything.
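As an illustrative sketch only, using the Docker SDK for Python (the image tag, network name, and container name are invented): the long-lived management machinery stays put, while the application under test runs in a disposable container created from the pinned base image.

```python
# Hypothetical sketch of the short-lived-server idea with the Docker SDK
# for Python. Image tag, network, and container name are assumptions.
import docker

client = docker.from_env()

container = client.containers.run(
    "registry.example.com/base-image:2024-01-15",  # known, pinned image
    name="app-under-test",
    detach=True,
    network="test-env-net",  # hooks into the long-lived environment
)
try:
    pass  # run the application's test suite against the container
finally:
    container.remove(force=True)  # short-lived: throw it away afterwards
```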

If an appdev team asks for a change that breaks someone else's application, then tough luck: they need to internally create a mitigant in their code and keep their specific requirements separate from the environment's offering.

1
  • Interesting viewpoints/additions, though there are some things in it that you may want to refine/rework: (1) "applications" in this context - what do you mean (some examples?); (2) any idea how this could work in (good old) mainframe environments; (3) what's a "mitigant" in this context here? PS: let me know if you think I should create a new question for any of these "things" (bullets).
    – Pierre.Vriens
    Commented Mar 13, 2017 at 18:16
