
Critical Thinking for Testers
James Bach
http://www.satisfice.com
james@satisfice.com
Twitter: @jamesmarcusbach
Michael Bolton
http://www.developsense.com
michael@developsense.com
Twitter: @michaelbolton

Bolton’s Definition of Critical Thinking

• Michael Bolton


Why Don’t People Think Well?

• It’s got to be farmer, because there are 18 times more male
farm workers in the USA than male librarians as of 2010 (source:
US Bureau of Labor Statistics) and probably a much higher
percentage worldwide.
• (By contrast, for females the numbers of librarians and farmers
are roughly equal.)
• You must consider prior probability before answering questions
like this.
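A minimal sketch of the base-rate arithmetic behind that answer, assuming the classic "is he a librarian or a farmer?" vignette this slide appears to answer; the 18:1 ratio comes from the slide, while the description-fit probabilities are made-up illustrations:

```python
def p_farmer(prior_ratio, fit_librarian, fit_farmer):
    """P(farmer | description) by Bayes' rule, for a male subject."""
    prior_farmer = prior_ratio / (prior_ratio + 1)  # 18 farmers per librarian
    prior_librarian = 1.0 - prior_farmer
    evidence = fit_farmer * prior_farmer + fit_librarian * prior_librarian
    return fit_farmer * prior_farmer / evidence

# Even if a shy, tidy description fits librarians four times better:
print(p_farmer(prior_ratio=18, fit_librarian=0.8, fit_farmer=0.2))
# ~0.82: the base rate still makes "farmer" the better answer.
```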


Reflex is IMPORTANT
But Critical Thinking is About Reflection

[Diagram: a loop between two modes of thinking. REFLEX (System 1) is
faster and looser; REFLECTION (System 2) is slower and surer, and sends
you back to get more data.]

See Thinking Fast and Slow, by Daniel Kahneman

Why Don’t People Think Well?


• Answer: $0.05 (to Kahneman’s bat-and-ball puzzle: a bat and a ball
cost $1.10 in total, and the bat costs $1.00 more than the ball; how
much does the ball cost?)
• Many people who get the right answer find that the
wrong answer ALSO popped into their heads…
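A quick check of the arithmetic (assuming the bat-and-ball reading above), using exact fractions to sidestep float noise:

```python
from fractions import Fraction

total = Fraction(110, 100)   # $1.10 together
diff = Fraction(100, 100)    # the bat costs $1.00 more than the ball

ball = (total - diff) / 2    # ball + (ball + diff) == total
bat = ball + diff
print(ball, bat)             # 1/20 21/20, i.e. $0.05 and $1.05
# The reflexive answer, $0.10, would make the total $1.20.
```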

Exercise: Calculator Test

“You are carrying a calculator.
You drop it!
Perhaps it is damaged!
What might you do to test it?”


…and dry it out before attempting to test it.
When did I drop it? Was I in the middle of a calculation? If so, part of my testing might be to visually
inspect the status of the display to determine whether the calculator appears to still be in the state it was at the
time I dropped it. If so, I might continue the calculation from that point, unless I believe that the drop
probably damaged the calculator.
Did I drop it on a hard surface with a force that makes me suspect internal damage? If so, then I would
expect possible hairline fractures. I imagine that would lead to intermittent or persistent short circuits or
broken circuits. I also suspect damage to moving parts, battery or solar cell connections, or the screen.
Did I drop it into a destructive chemical environment? If so, I might worry more about the progressive
decay of the components.
Did I drop it into a dangerous biological or radiological environment? If so, the functions of the
calculator may be of less concern than the contaminants. I may have to test it with a Geiger counter.
Was the calculator connected to anything else whereby the connection (data cable, or AC cable, or duct
tape that fastened it to a Fabergé egg) could have been damaged, or could have damaged the thing it
was connected to?
Did I detect anything while it was dropping that leads me to suspect any damage in particular (e.g. an
electrical flash, or maybe a loud popping sound)?
Am I aware of a history of “drop”-related problems with this calculator? Have I ever dropped it before?
Is the calculator ruggedized? Is it designed to be dropped in this way?
What is my relationship to this calculator? Is it mine or someone else’s? Maybe I’m just borrowing it.
What is the value of this calculator? I assume that this is not a precious artifact from a museum. The
exercise as presented appears to be about a calculator as a calculating machine, rather than as a precious
Minoan urn that happens to have calculator functions built into it.

What makes an assumption
more dangerous?
• Not “what specific assumptions are more
dangerous?”…
• But “what factors would make one assumption more
dangerous than another?”
• Or “what would make the same assumption more
dangerous from one time to another?”


What makes an assumption
more dangerous?
1. Consequential: required to support critical plans and activities. (Changing the assumption
would change important behavior.)
2. Unlikely: may conflict with other assumptions or evidence that you have. (The
assumption is counterintuitive, confusing, obsolete, or has a low probability of being
true.)
3. Blind: regards a matter about which you have no evidence whatsoever.
4. Controversial: may conflict with assumptions or evidence held by others. (The assumption
ignores controversy.)
5. Impolitic: expected to be declared, by social convention. (Failing to disclose the
assumption violates law or local custom.)
6. Volatile: regards a matter that is subject to sudden or extreme change. (The assumption
may be invalidated unexpectedly.)
7. Unsustainable: may be hard to maintain over a long period of time. (The assumption
must be stable.)
8. Premature: regards a matter about which you don’t yet need to assume.
9. Narcotic: any assumption that comes packaged with assurances of its own safety.
10. Latent: Otherwise critical assumptions that we have not yet identified and dealt with.
(The act of managing assumptions can make them less critical.)

Themes
• Technology consists of complex and ephemeral relationships that
can seem simple, fixed, objective, and dependable even when
they aren’t.
• Testers are people who ponder and probe complexity.
• Basic testing is a straightforward technical process.
• But, excellent testing is a difficult social and psychological process
in addition to the technical stuff.

A tester is someone
who knows that things can be different.
Jerry Weinberg


Don’t Be A Turkey

[Graph: “My Fantastic Life! Page 25!” (by the most intelligent Turkey in
the world): well-being plotted over time, rising through hundreds of DATA
points, with the last few labeled ESTIMATED POSTHUMOUSLY AFTER THANKSGIVING
and a final annotation: “Corn meal a little off today!”]

• Every day the turkey adds one more data point to his analysis proving
that the farmer LOVES turkeys.
• Hundreds of observations support his theory.
• Then, a few days before Thanksgiving…

Based on a story told by Nassim Taleb, who stole it from
Bertrand Russell, who stole it from David Hume.

Don’t Be A Turkey
• No experience of the past can LOGICALLY be
projected into the future, because we have no
experience OF the future.
• No big deal in a world of
stable, simple patterns.
• BUT SOFTWARE IS NOT
STABLE OR SIMPLE.
• “PASSING” TESTS CANNOT
PROVE SOFTWARE GOOD.
Based on a story told by Nassim Taleb, who stole it from
Bertrand Russell, who stole it from David Hume.


How Do We Know What “Is”?
“We know what is because we see what is.”

We believe
we know what is because we see
what we interpret as signs that indicate
what is
based on our prior beliefs about the world.

How Do We Know What “Is”?

“If I see X, then probably Y, because probably A, B, C, D, etc.”

• THIS CAN FAIL:
• Getting into a car – oops, not my car.
• Bad driving – Why?
• Bad work – Why?
• Ignored people at my going-away party – Why?
• Couldn’t find soap dispenser in restroom – Why?
• Ordered orange juice at seafood restaurant – waitress misunderstood


The Role of Tacit Knowledge
• Although people tend to favour the explicit, much of
what we do in testing is based on evolving tacit
knowledge.

What are the elements of tacit knowledge in your
testing? What parts can be made explicit?

Remember this, you testers!


Models Link Observation and Inference
• A model is an idea, activity, or object…
such as an idea in your mind, a diagram, a list of words, a spreadsheet, a
person, a toy, an equation, a demonstration, or a program

• …that represents another idea, activity, or object…
such as something complex that you need to work with or study.

• …whereby understanding the model may help you understand
or manipulate what it represents.
A map helps navigate across a terrain.
2+2=4 is a model for adding two apples to a basket that already has two apples.
Atmospheric models help predict where hurricanes will go.
A fashion model helps understand how clothing would look on actual humans.
Your beliefs about what you test are a model of what you test.

Models Link Observation & Inference

[Diagram: “My Model of the World” sits between “I believe…” and “I see…”,
linking what we observe to what we infer.]

• Testers must distinguish
observation from inference!
• Our mental models form the
link between them
• Defocusing is lateral thinking.
• Focusing is logical (or “vertical”)
thinking.

Testing against requirements
is all about modeling.

“The system shall operate at an input voltage range
of nominal 100-250 VAC.”

“Try it with an input voltage in the range of 100-250.”
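The requirement invites a richer test model than restating it as a single in-range value. A minimal sketch of one such model (the boundary offsets and the zero-input probe are illustrative assumptions, not from the slide):

```python
# Probe the stated range, its edges, and just beyond them -- not merely
# "an input voltage in the range of 100-250."
def voltage_probes(low=100.0, high=250.0, eps=0.5):
    return [
        low - eps,          # just below range: refuse? degrade? damage?
        low,                # lower boundary
        (low + high) / 2,   # a nominal mid-range value
        high,               # upper boundary
        high + eps,         # just above range
        0.0,                # power interrupted entirely
    ]

for v in voltage_probes():
    print(f"apply {v} VAC, vary slowly and abruptly, observe behavior")
```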

How many test cases are needed to test the product
represented by this flowchart?


This is what people think you do

“Compare the product to its specification”

[Diagram: described ↔ actual]

This is more like what you really do

[Diagram: a triangle linking imagined, described, and actual]

“Compare the idea of the product to a description of it” (imagined vs. described)
“Compare the idea of the product to the actual product” (imagined vs. actual)
“Compare the actual product to a description of it” (actual vs. described)


This is what you find…

• The designer INTENDS the product to be Firefox compatible, but never
says so, and it actually is not.
• The designer INTENDS the product to be Firefox compatible, SAYS SO IN
THE SPEC, but it actually is not.
• The designer INTENDS the product to be Firefox compatible, MAKES IT
FIREFOX COMPATIBLE, but forgets to say so in the spec.
• The designer INTENDS the product to be Firefox compatible, SAYS SO,
and IT IS.
• The designer assumes the product is not Firefox compatible, and it
actually is not, but the ONLINE HELP SAYS IT IS.
• The designer assumes the product is not Firefox compatible, but it
ACTUALLY IS, and the ONLINE HELP SAYS IT IS.
• The designer assumes the product is not Firefox compatible, and no one
claims that it is, but it ACTUALLY IS.

How to Think Critically:
Slowing down your thinking
• You may not understand. (errors in interpreting
and modeling a situation, communication errors)
• What you understand may not be true. (missing
information, observations not made, tests not
run)
• The truth may not matter, or may matter much
more than you think. (poor understanding of
risk)


To What Do We Apply Critical Thinking?
• Words and Pictures
• Causation
• The Product
  • Design
  • Behavior
• The Project
  • Schedule
  • Infrastructure
• The Test Strategy
  • Coverage
  • Oracles
  • Procedures

“Huh?”
Critical Thinking About Words
• Among other things, testers question premises.
• A suppressed premise is an unstated premise that an
argument needs in order to be logical.
• A suppressed premise is something that should be there,
but isn’t…
• (…or is there, but it’s invisible or implicit.)
• Among other things, testers bring suppressed premises to
light and then question them.
• A diverse set of models can help us to see the things that
“aren’t there.”

Example: Missing Words
• “I performed the tests. All my tests passed.
Therefore, the product works.”
• “The programmer said he fixed the bug. I can’t
reproduce it anymore. Therefore it must be
fixed.”
• “Microsoft Word frequently crashes while I am
using it. Therefore it’s a bad product.”
• “Step 1. Reboot the test system.”
• “Step 2. Start the application.”

Example: Generating Interpretations
• Selectively emphasize each word in a statement; also
consider alternative meanings.
MARY had a little lamb.
Mary HAD a little lamb.
Mary had A little lamb.
Mary had a LITTLE lamb.
Mary had a little LAMB.

“Really?”
The Data Question

Safety Language
(aka “epistemic modalities”)
• “Safety language” in software testing means qualifying
or otherwise drafting statements of fact so as to avoid
false confidence.
• Examples:

I think… I infer… I assumed…
It appears… It seems… apparently…
So far…

Instead of “The feature worked”:
“I have not yet seen any failures in the feature…”


Some Common Beliefs About Testing
• Every test must have an expected, predicted result.
• Effective testing requires complete, clear, consistent, and
unambiguous specifications.
• Bugs found earlier cost less to fix than bugs found later.
• Testers are the quality gatekeepers for a product.
• Repeated tests are fundamentally more valuable.
• You can’t manage what you can’t measure.
• Testing at boundary values is the best way to find bugs.

Some Common Beliefs About Testing
• Test documentation is needed to deflect legal liability.
• The more bugs testers find before release, the better the testing
effort.
• Rigorous planning is essential for good testing.
• Exploratory testing is unstructured testing, and is therefore
unreliable.
• Adopting best practices will guarantee that we do a good job of
testing.
• Step-by-step instructions are necessary to make testing a
repeatable process.


Critical Thinking About Projects
• You will have five weeks to test the product:

[Diagram: a timeline, five weeks long]

“So?”
Critical Thinking About Risk
“When the user presses a button on the
touchscreen, the system shall respond within
300 milliseconds.”
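As a sketch of how a tester might interrogate that requirement rather than merely confirm it: measure a distribution, not a single pass/fail. Everything here is an illustrative assumption. press_button() stands in for a real device driver, and the load, the measurement start point, and the percentile to report are exactly the questions the requirement leaves open:

```python
import statistics, time

def press_button():
    """Hypothetical driver: press and block until the response is observed."""
    time.sleep(0.12)  # stand-in for the real system under test

samples = []
for _ in range(100):
    start = time.perf_counter()
    press_button()
    samples.append((time.perf_counter() - start) * 1000.0)

samples.sort()
print(f"median {statistics.median(samples):.0f} ms, "
      f"p95 {samples[94]:.0f} ms, max {max(samples):.0f} ms (spec: 300 ms)")
```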


Heuristic Model:
The Four Part Risk Story
Someone may be hurt or annoyed
because of something that might go wrong while operating the product,
due to some vulnerability in the product
that is exploited by some threat.

• Victim: Someone who experiences the impact of a problem. Ultimately no bug
can be important unless it victimizes a human.

• Problem: Something the product does that we wish it wouldn’t do.
• Vulnerability: Something about the product that causes or allows it to
exhibit a problem, under certain conditions.

• Threat: Some condition or input external to the product that, were it to occur,
would trigger a problem in a vulnerable product.

Critical Thinking About Diagrams
Analysis
• [pointing at a box] What if the function in this box fails?
• Can this function ever be invoked at the wrong time?
• [pointing at any part of the diagram] What error checking do you
do here?
• [pointing at an arrow] What exactly does this arrow mean? What
would happen if it was broken?

[Diagram: Browser → Web Server → App Server → Database Layer]


Guideword Heuristics
for Diagram Analysis

• Boxes
  • Interfaces (testable)
  • Missing/Drop out
  • Extra/Interfering/Transient
  • Incorrect
  • Timing/Sequencing
  • Contents/Algorithms
  • Conditional behavior
  • Limitations
  • Error Handling

• Lines
  • Missing/Drop out
  • Extra/Forking
  • Incorrect
  • Timing/Sequencing
  • Status Communication
  • Data Structures

• Paths
  • Simplest
  • Popular
  • Critical
  • Complex
  • Pathological
  • Challenging
  • Error Handling
  • Periodic

[Diagram: Browser → Web Server → App Server → Database Layer]

Testability!
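One way to put the guidewords to work: cross every element of the diagram with the guideword list for its kind, yielding a checklist of analysis questions. A minimal sketch, assuming the four-box diagram above:

```python
# Generate diagram-analysis prompts: element x guideword.
boxes = ["Browser", "Web Server", "App Server", "Database Layer"]
lines = list(zip(boxes, boxes[1:]))  # connections between adjacent boxes

box_words = ["Interfaces", "Missing/Drop out", "Extra/Interfering/Transient",
             "Incorrect", "Timing/Sequencing", "Contents/Algorithms",
             "Conditional behavior", "Limitations", "Error Handling"]
line_words = ["Missing/Drop out", "Extra/Forking", "Incorrect",
              "Timing/Sequencing", "Status Communication", "Data Structures"]

for box in boxes:
    for word in box_words:
        print(f"{box}: what would '{word}' look like here? How would we see it?")
for a, b in lines:
    for word in line_words:
        print(f"{a} -> {b}: question this connection under '{word}'")
```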

Visualizing Test Coverage: Annotation

[Diagram: the Browser → Web Server → App Server → Database Layer
architecture annotated with test activities and oracles: survey;
performance history and performance data; server stress (build stressbots);
inspect reports; build history oracle; history oracles; man-in-middle;
data generator with datagen oracle; build error monitors; force fail (at
two points); review error output; table consistency oracle; coverage
analysis.]

Beware Visual Bias!

[Diagram: the same four-box architecture, with the test ideas below
attached to the Browser box:]

• setup
• browser type & version
• cookies
• security settings
• screen size
• review client-side scripts & applets
• usability
• specific functions

Exercise: Overlapping Events Testing

[Diagram: Event A and Event B drawn as overlapping bars along a time axis]

• You want to test the interaction between two
potentially overlapping events.
• How would you test this?
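One systematic way in: enumerate the distinct ways two intervals can relate in time, and derive at least one test from each. A sketch based on Allen's interval algebra (an assumption; the slide prescribes no particular scheme):

```python
# The seven basic interval relations; with the inverses of the first six
# (B before A, etc.) they make up Allen's 13 relations.
RELATIONS = [
    "A entirely before B",
    "A meets B (A ends exactly as B starts)",
    "A overlaps the start of B",
    "A starts with B but ends first",
    "A entirely during B",
    "A finishes with B but starts later",
    "A equals B (identical start and end)",
]

for relation in RELATIONS:
    print(f"test: arrange events so that {relation}; observe the interaction")
```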


Critical Thinking About Practices:
What does “best practice” mean?

• Someone: Who is it? What do they know?
• Believes: What specifically is the basis of their belief?
• You: Is their belief applicable to you?
• Might: How likely is the suffering to occur?
• Suffer: So what? Maybe it’s worth it?
• Unless: Really? There’s no alternative?
• You do this practice: What does it mean to “do” it? What does it
cost? What are the side effects? What if you do it badly? What if
you do something else really well?

Beware of…
• Numbers: “We cut test time by 94%.”
• Documentation: “You must have a written plan.”
• Judgments: “That project was chaotic. This project was a success.”
• Behavior Claims: “Our testers follow test plans.”
• Terminology: Exactly what is a “test plan”?
• Contempt for Current Practice: CMM Level 1 (initial) vs. CMM
Level 2 (repeatable)
• Unqualified Claims: “A subjective and unquantifiable requirement is
not testable.”


Look For…
• Context: “This practice is useful when you want the power of creative
testing but you need high accountability, too.”

• People: “The test manager must be enthusiastic and a real hands on leader
or this won’t work very well.”

• Skill: “This practice requires the ability to tell a complete story about testing:
coverage, techniques, and evaluation methods.”

• Learning Curve: “It took a good three months for the testers to get good
at producing test session reports.”

• Caveats: “The metrics are useless unless the test manager holds daily
debriefings.”

• Alternatives: “If you don’t need the metrics, you can ditch the daily
debriefings and the specifically formatted reports.”

• Agendas: “I run a testing business, specializing in exploratory testing.”

Some Common Thinking Errors
• Reification Error
• giving a name to a concept, and then believing it has an
objective existence in the world
• ascribing material attributes to mental constructs—“that
product has quality”
• mistaking relationships for things—“its purpose is…”
• purpose and quality are relationships, not attributes; they
depend on the person
• how can we count ideas? how can we quantify
relationships?


Some Common Thinking Errors
• Fundamental Attribution Error
• “it always works that way”; “he’s a jerk”
• failure to recognize that circumstance and context play a part
in behaviour and effects

• The Similarity Uniqueness Paradox
• “all companies are like ours”; “no companies are like ours”
• failure to consider that everything incorporates similarities
and differences

• Missing multiple paths of causation
• “A causes B” (even though C and D are also required)

Some Common Thinking Errors
• Assuming that effects are linear with causes
• “If we have 20% more traffic, throughput will slow by 20%”
• this kind of error ignores non-linearity and feedback loops—
cf. general systems

• Reactivity Bias
• the act of observing affects the observed
• a.k.a. “Heisenbugs”, the Hawthorne Effect

• The Probabilistic Fallacy
• confusing unpredictability and randomness
• after the third hurricane hits Florida, is it time to relax?


Some Common Thinking Errors
• Binary Thinking Error / False Dilemmas
• “all manual tests are bad”; “that idea never works”
• failure to consider gray areas; belief that something is
either entirely something or entirely not

• Unidirectional Thinking
• expresses itself in testing as a belief that “the application
works”
• failure to consider the opposite: what if the application
fails?
• to find problems, we need to be able to imagine that they
might exist

Some Common Thinking Errors
• Availability Bias
• the tendency to favor prominent or vivid instances in
making a decision or evaluation
• example: people are afraid to fly, yet automobiles are far
more dangerous per passenger mile
• to a tech support person (or to some testers), the product
always seems completely broken
• spectacular failures often get more attention than
grinding little bugs

• Confusing concurrence with correlation
• “A and B happen at the same time; they must be related”


Some Common Thinking Errors
• Nominal Fallacies
• believing that we know something well because we can name
it

• “equivalence classes”
• believing that we don’t know something because we don’t
have a name for it at our fingertips
• “the principle of concomitant variation”; “inattentional
blindness”

• Evaluative Bias of Language
• failure to recognize the spin of word choices
• …or an attempt to game it
• “our product is full featured; theirs is bloated”

Some Common Thinking Errors
• Selectivity Bias
• choosing data (beforehand) that fits your preconceptions or
mission
• ignoring data that doesn’t fit

• Assimilation Bias
• modifying the data or observation (afterwards) to fit the
model
• grouping distinct things under one conceptual umbrella
• Jerry Weinberg refers to this as “lumping”
• for testers, the risk is in identifying setup, pinpointing,
investigating, reporting, and fixing as “testing”


Some Common Thinking Errors
• Narrative Bias
• a.k.a. “post hoc, ergo propter hoc”
• explaining causation after the facts are in

• The Ludic Fallacy
• confusing complex human activities with random, roll-of-the-dice
games
• “Our project has a two in three chance of success”

• Confusing correlation with causation
• “When I change A, B changes; therefore A must be causing B”

Some Common Thinking Errors
• Automation bias
• people have a tendency to believe in results from an automated
process out of all proportion to validity

• Formatting bias
• It’s more credible when it’s on a nicely formatted spreadsheet or
document
• (I made this one up)

• Survivorship bias
• we record and remember results from projects (or people) who
survived
• the survivors prayed to Neptune, but so did the sailors who died
• What was the bug rate for projects that were cancelled?


Do you prefer A or B?

Program A:
Program B:

Do you prefer C or D?

Program C:
Program D:


A = C and B = D, yet A > B and D > C

Program A: (3/4 surveyed prefer this to B)
Program B:

Program C:
Program D: (3/4 surveyed prefer this to C)

Some Verbal Heuristics:
“A vs. THE”
When trying to explain something,
prefer "a" to "the".
• Example: “A problem…” instead of “THE problem…”
• Using “A” instead of “THE” helps us to avoid several kinds
of critical thinking errors
• single path of causation
• confusing correlation and causation
• single level of explanation


Some Verbal Heuristics:
“Unless…”
Try adding "unless..."
• When someone asks a question based on a false or
incomplete premise, try adding “unless…” to the
premise
• When someone offers a Grand Truth about testing,
append “unless…” or “except in the case of…”

Some Verbal Heuristics:
“And Also…”
Whatever is happening,
something else
may ALSO be happening.
• The product gives the correct result! Yay!
• …It also may be silently deleting system files.
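A sketch of how this heuristic might show up in an automated check: assert the visible result, and also look for anything else that changed. The calculate() stand-in and the file-snapshot approach are illustrative assumptions:

```python
import os

def snapshot(path="."):
    """Record file names and sizes so we can notice files changing."""
    return {name: os.path.getsize(os.path.join(path, name))
            for name in os.listdir(path)}

def calculate():
    return 2 + 2  # stand-in for the feature under test

before = snapshot()
assert calculate() == 4  # the correct result! Yay!
assert snapshot() == before, "something ELSE also happened: files changed"
```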


Some Verbal Heuristics:
“So far” and “Not yet”
Whatever is true now,
may not be true for long.
• The product works… so far.
• We haven’t seen it fail… yet.
• No customer has complained… yet.
• Remember: There is no test for ALWAYS.

