
My friend and I were discussing different approaches to A/B testing. His technique is to make a single small change (e.g. changing the colour of a button) and then A/B test it to see what effect the change has. My technique is to make quite a radical change (e.g. completely changing the layout and content) and A/B test that.

His technique is more scientific and gives you a better understanding of the users, but you could end up refining a non-optimal path. With my technique you're not exactly sure why one version is better than another, but it lets you move faster and try new ideas.

Is one of these techniques better than the other?

  • Is your app really so close to perfect that the color of a single button will improve it? Are there better ways to spend your time? Commented Feb 5, 2013 at 17:19
  • Multivariate testing would let you test with both approaches at the same time: test the two subtle changes against a wholly radical change and see which fares better in the same test.
    – Taj Moore
    Commented Feb 5, 2013 at 21:05
  • Larger changes could mean larger results, but the tradeoff is not knowing why those results occurred. Smaller changes might mean smaller wins, but you're more likely to know the cause.
    – d-_-b
    Commented Nov 25, 2015 at 23:17

4 Answers


No, neither is better. They deal with different aspects or strategies, and in general you need both.

A small change lets you refine your design and get a better understanding of what affects conversion, but may leave you stuck at a local maximum.

A more radical change with many elements will not help you understand what affects conversion, but may also show you a significantly better design.

Think of the second method as coarse optimisation, and the first as fine optimisation.

[Figure: a curve with a small local maximum and a higher global maximum]

Imagine that optimising your design means moving along the x-axis in the figure above, and that the y-axis is the conversion rate you get from that change.

Now if you were near the local maximum, you would want fine optimisation, whereas if you were further away, you would want coarser optimisation. The problem is that we don't have a nice curve like this; we only get to see what we test. That is why you need to use both techniques.
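Here is a minimal sketch of that idea in Python. The curve, step size, and sample count are all invented for illustration: fine optimisation is a hill-climb with small steps, which gets stuck on whichever peak it starts near, while coarse optimisation samples widely different designs and can land near the global peak.

    import random
    from math import exp

    def conversion(x):
        # Invented conversion-rate curve with a local peak (x ~ 2) and a
        # higher global peak (x ~ 7), standing in for the figure above.
        return 0.05 * exp(-(x - 2) ** 2) + 0.08 * exp(-((x - 7) ** 2) / 2)

    def fine_optimise(x, step=0.1, iters=100):
        # Small A/B changes: move to a nearby variant only if it converts better.
        for _ in range(iters):
            for cand in (x - step, x + step):
                if conversion(cand) > conversion(x):
                    x = cand
        return x

    def coarse_optimise(samples=10):
        # Radical redesigns: try widely different designs and keep the best.
        return max((random.uniform(0, 10) for _ in range(samples)), key=conversion)

    random.seed(1)
    x_fine = fine_optimise(1.0)   # starting near the local peak, we get stuck on it
    x_coarse = coarse_optimise()  # likely lands in the basin of the global peak
    print(f"fine only:        rate {conversion(x_fine):.3f}")
    print(f"coarse only:      rate {conversion(x_coarse):.3f}")
    print(f"coarse then fine: rate {conversion(fine_optimise(x_coarse)):.3f}")

The last line refines the coarse winner with small steps: coarse search finds the right hill, fine search climbs it.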


It sounds like you need multivariate testing. For example, select four elements on the page you'd like to test. You could radically redesign each of the elements, so that in one version of the test you're testing a mostly redesigned page. Those four elements would then be turned "on" or "off" in every combination, giving you 2⁴ = 16 variations, one of which (every element "off") is your control, the original design.

So say your elements are the numbers 1, 2, 3 and 4. You'd test the variations like the pattern below, with a dash (-) meaning that the element was not "turned on" in that variation:

1234
123-
12-4
12--
1-34
1-3-
1--4
-234
(etc.)
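If you'd rather enumerate the full set than write it out by hand, a short Python sketch (the element names are just placeholders) generates all 16 combinations:

    from itertools import product

    elements = ["1", "2", "3", "4"]  # placeholder names for the four redesigned elements

    # Each combination turns every element "on" (redesigned) or "off" (original).
    for flags in product((True, False), repeat=len(elements)):
        variant = "".join(e if on else "-" for e, on in zip(elements, flags))
        print(variant)  # "----" is the control: every element in its original form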

Smashing Magazine has a good article on multivariate testing.

As to whether one method is better than the other: with multivariate testing (versus A/B), you will know exactly which element caused a greater click-through rate or improved the page's usability, since each element is tested both with and without all of the others.

With A/B testing, especially of a sweeping redesign, you will not know whether you could have done better: when everything changes at once, you can't attribute the result to any single element.
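To make the attribution concrete, here is a rough sketch with invented conversion rates for all 16 variants: each element's main effect is the average rate with that element "on" minus the average with it "off", which is exactly the comparison a full-redesign A/B test can't give you.

    from itertools import product

    # Invented conversion rates for all 16 variants, keyed by on/off flags.
    # In a real multivariate test these numbers come from your analytics.
    rates = {
        flags: 0.050
               + (0.010 if flags[0] else 0.0)   # pretend element 1 helps
               + (-0.004 if flags[2] else 0.0)  # pretend element 3 hurts
        for flags in product((True, False), repeat=4)
    }

    for i in range(4):
        on = [r for f, r in rates.items() if f[i]]
        off = [r for f, r in rates.items() if not f[i]]
        print(f"element {i + 1}: main effect {sum(on)/len(on) - sum(off)/len(off):+.4f}")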

As a designer, I know I love to do big redesigns, but often what helps the end user is the careful testing of single elements. And I firmly believe that, especially in application flows or servicing interactions, the best UI changes are the ones that are invisible to the end user. By gradually rolling out the elements that test better, you'll end up with a completely redesigned page over time, but your users will never notice, because the new elements were introduced slowly.

This way, you'll be able to avoid the cognitive dissonance that a one-and-done redesign can cause.


Both are valid approaches. You might reach statistical significance sooner than your friend, because the more radical the redesign, the larger the variation between results tends to be. It takes much longer to achieve statistical relevance when testing a small change, such as changing the colour of a conversion button from red to green, than when testing a big change such as a complete redesign, though this is not an absolute rule.
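As a rough illustration (all conversion figures below are invented), the standard two-proportion sample-size approximation shows how much longer a small change takes to test:

    from statistics import NormalDist

    def visitors_per_variant(p_a, p_b, alpha=0.05, power=0.8):
        # Standard two-proportion sample-size approximation.
        z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
        z_power = NormalDist().inv_cdf(power)
        variance = p_a * (1 - p_a) + p_b * (1 - p_b)
        return (z_alpha + z_power) ** 2 * variance / (p_a - p_b) ** 2

    # Button-colour tweak: conversion 5.0% -> 5.3%
    print(round(visitors_per_variant(0.050, 0.053)))  # roughly 85,000 per variant
    # Complete redesign: conversion 5% -> 7%
    print(round(visitors_per_variant(0.05, 0.07)))    # roughly 2,200 per variant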

The thing is, the A/B testing approach you choose should follow from the hypothesis you want to disprove.

When starting any new A/B test, you should begin the experiment by assuming a null hypothesis; the purpose of running the test is to disprove it. You can therefore use either method (small or big changes in the redesign) according to the goal you want to achieve.
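For instance, a minimal two-proportion z-test (the visitor counts are invented) shows what disproving the null hypothesis looks like for a conversion test:

    from math import sqrt
    from statistics import NormalDist

    def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
        # Null hypothesis: variants A and B share the same conversion rate.
        p_a, p_b = conv_a / n_a, conv_b / n_b
        pooled = (conv_a + conv_b) / (n_a + n_b)
        se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = (p_b - p_a) / se
        return z, 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

    z, p = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=590, n_b=10_000)
    print(f"z = {z:.2f}, p = {p:.3f}")  # p < 0.05 here, so we reject the null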

You can read some of these considerations in this article: http://www.growthgiant.com/blog/the-pitfalls-ab-testing/


To draw any reasonable conclusion from an A/B test you need statistics, and the statistics are only reliable if you have enough visitors and they are comparable. You also need to take your conversion rate and visitor flow into account: with 10 visitors per day, testing a change to a single parameter will take a very long time. Keep the time aspect in mind, unless you and your friend have a huge flow of visitors and solid mathematical tools for analysing the data.
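To put the time aspect in numbers, a back-of-the-envelope sketch with invented figures:

    # The required sample size would come from a calculation like the one
    # shown earlier; both numbers here are invented for illustration.
    required_per_variant = 85_000  # visitors needed to detect a tiny lift
    daily_visitors = 10            # total traffic, split across two variants

    days = required_per_variant / (daily_visitors / 2)
    print(f"{days:,.0f} days")     # 17,000 days: hopeless at this traffic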
