Creating a culture that provokes failure and boosts improvement

A or B?
Creating a Culture that Provokes
Failure and Boosts Improvement
Ben Dressler

Failing
= not reaching the goal you set yourself

“Anything we design, we’re going to test and iterate, Lean Startup-style. Just
because something looks good, doesn’t mean it’s actually working. This data-driven
approach gives us a more enhanced resolution on how the product is
behaving and succeeding compared to what a typical startup would do.”
Garrett Camp (StumbleUpon, Uber)
Successful companies, start-ups and corporations alike, use strategies that are
powered by failure as a way of learning and adapting.

Progressive failure means failing inexpensively and rapidly, with clear learnings
and fast recovery.

It’s not about risking catastrophic damage.

Ingredients for progressive failure
1.  The Thumb of Caesar
2.  A Flight Recorder
3.  A Big Blackboard

Augmented reality
Shop Direct – UK retailer, ex-catalogue business with a 2 billion dollar turnover.
It is adopting a test-and-fail culture on its journey to become a world-class digital retailer.

Cimagine – Israeli startup with a markerless augmented reality app for
furniture.

“It’s great”
“I would use this”
First verbal reactions from Shop Direct customers were very encouraging – without
exception they were impressed by the technology. But…

… users were not able to use basic functions of the app successfully.

By our earlier definition this is a failure.

The usual response to failing. We’re not going to do this.

1. The Thumb of Caesar

The first element of successful processes that are based on failure: Having a
clear, measurable criterion that tells you whether or not you missed your goal. It
is crucial to know THAT you failed if you want to take lessons from it.

Think evolution: No matter how or why – genes being passed on means
success, genes not passed on means failure.

Examples
- Metrics in an A/B test 
- Completion rates in user testing 
- Any measurable goal (be tough on yourself!)

Ron Kohavi
(Bing experimentation team)

“We measure 500 metrics. The
shipping decision is based on three.”

Ron Kohavi


= 0.6% of all data (3 of 500 metrics) influences the decision

Focus is key here. If the criterion doesn’t reflect what you’re trying to achieve as a
business overall, it will drive you in the wrong direction in the long term.

Success criterion: 100% of users can use all basic features

Measurement 1: 0% of users could use all basic features

Thumb of Caesar
Know THAT you failed
-  Yes/No answer
-  Eliminate ideas/prototypes/hypotheses
-  Base tests on a rock solid criterion
-  Statistics may apply
-  You’ll learn one thing, but you’ll learn it for sure
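
A minimal sketch of such a yes/no check in Python, assuming each user-testing session is recorded as the set of tasks the user completed. The Session structure, the feature names and the function names are illustrative assumptions and do not come from the talk; only the 100% success criterion is taken from the example above.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One user-testing session (hypothetical structure)."""
    user_id: str
    completed_tasks: set


# Hypothetical list of "basic features"; the talk does not enumerate them.
BASIC_FEATURES = {"place_item", "move_item", "rotate_item"}


def completion_rate(sessions):
    """Share of users who completed every basic feature."""
    if not sessions:
        raise ValueError("need at least one session")
    passed = sum(1 for s in sessions if BASIC_FEATURES <= s.completed_tasks)
    return passed / len(sessions)


def thumb_of_caesar(sessions, required_rate=1.0):
    """Yes/no verdict: True means thumb up, False means thumb down."""
    return completion_rate(sessions) >= required_rate


if __name__ == "__main__":
    # Made-up sessions: nobody manages all basic features.
    first_round = [
        Session("u1", {"place_item"}),
        Session("u2", set()),
        Session("u3", {"place_item", "move_item"}),
    ]
    print(completion_rate(first_round), thumb_of_caesar(first_round))
```

The verdict is deliberately a single boolean: everything else that was observed in the sessions belongs to the investigation, not to the decision.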

2. A Flight Recorder
Knowing THAT you failed is the basics. But in order to improve you need more
information. That is why you also need to know HOW you failed.

In the 1950s no one was interested in funding what would later become the
flight recorder, or black box. In spring 2014 an estimated €60m is being spent
on finding a single one of those devices.

Games do it well
Failing in a game usually leaves you with a trace of audiovisual feedback that
gives you a good idea of all the events leading to the failure.

A multitude of sources
This stage is all about gathering lots of rich, varied data. It’s not about
answering questions (yet), it’s about generating as many as possible!

Ron Kohavi


= 99.4% of all data (the other 497 of 500 metrics) is used for investigating

It’s the antithesis of the Thumb of Caesar – we’re not concerned with
measuring or testing. All we want is data of all kinds to investigate.

User testing

Live on-demand data feed
Observation: Users aim higher than they should and drag in unexpected ways

A Flight Recorder
Know HOW you failed
-  It’s about having loads of data
-  It’s about generating ideas
-  Don’t confuse with hard evidence
-  No need to monitor all the time

3. A Big Blackboard
When we know THAT we failed and HOW we failed, it is time to think about the
WHY. Now we throw all the data we have at the blackboard and try to understand
relations, build theories and come up with clean hypotheses.

Examples
- Why it looks good: Design theory
- Why users do XYZ: Psychology
- Why you’ll have product-market fit: Market models
Theory: Collection of ideas and assumptions that try to explain causal
relationships of a system (e.g. the user behaviour or growth development)

1.  Older users are not familiar with 3D technology
2.  Users aim too high because they have a mental image of an overlay rather than a 3D environment
3.  Many users skip tutorials

The Big Blackboard
Test hypothesis:
Masking the lower half of the camera screen will nudge users to aim
lower with the device.

This variation on the app tests nothing but the specific hypothesis we created. If
completion rates don’t improve we need to form a new one.
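
One way to decide whether completion rates improved is a one-sided two-proportion z-test, sketched below in Python. The talk does not say which statistic was used, so the choice of test, the function names, the 5% threshold and all numbers in the usage example are assumptions for illustration. With only a handful of users in a lab test, the normal approximation behind this test is rough.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, p) for the one-sided hypothesis that rate B > rate A.

    A is the original app, B is the variant (e.g. the masked camera screen).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one user")
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("both rates are 0% or 100%; z is undefined")
    z = (rate_b - rate_a) / std_err
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p


def hypothesis_survives(success_a, n_a, success_b, n_b, alpha=0.05):
    """Yes/no answer: did the variant improve completion rates?"""
    _, p = two_proportion_z_test(success_a, n_a, success_b, n_b)
    return p < alpha


if __name__ == "__main__":
    # Made-up counts: 40 of 100 users complete the task in A, 60 of 100 in B.
    print(two_proportion_z_test(40, 100, 60, 100))
    print(hypothesis_survives(40, 100, 60, 100))
```

If the function returns False, the hypothesis is eliminated and the blackboard needs a new one.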

A Big Blackboard
Know WHY you failed
-  Have a theory of the relevant system
-  Let different theories rival each other
-  Build yes/no hypotheses to predict effects
-  Modify theory after failure

Flight Recorder: Constant data feeds
Big Blackboard: Theory and Hypotheses
Thumb of Caesar: Test: Fail/Success

After a few iterations
Thumb of Caesar: 100% of users could use all basic features

Now we not only have a better product that is fit for launch – we have also learnt
fundamental things about how our users and the product behave (see Garrett Camp’s
quote at the start).

Big Blackboard
1.  Design of Spotify lags behind
2.  Design is a factor in attracting users
3.  A good design results from... (insert design theory here)
1. Blackboard = theories. These were some of the theories the team at Spotify
had when going into the redesign.

The old design (top left) plus 3 different versions were used to get an initial feel
for user preferences.

Thumb of Caesar
1.  Users will prefer one out of four
2.  The winning design will increase brand perception
3.  The new design will make users more satisfied with the product
4.  Any redesign will not hurt the commercial metric
2. Thumb of Caesar = testing yes/no hypotheses. These were some of the hypotheses at the
time of going into the testing stage. Importance of focus: Improving commercial metrics in the
short term wasn’t a focus at this point, so the success criterion was only to not lower them!

1.  Users will prefer one design: Thumb up
2.  Brand perception and user satisfaction up: Thumb up

A vs. B
User activation: No change
User retention: No change
# of songs played: No change

Flight Recorder
1.  Raving press reviews
2.  Great ratings
3.  Positive user comments
3. Flight recorder = rich, diverse data collection. In this case the team gathered press
reviews, app store ratings, user comments and all test data that wasn’t used for the primary
hypotheses.

Good things only become a culture if you keep doing them! Look at the data and keep asking questions.
Challenge your ideas, your products, your business and your colleagues’ models, and establish success and fail criteria. And build
razor-sharp hypotheses – “Our business idea will change the world” is too high level!
