Benchmarking Techniques
Michael Hope <michael.hope@linaro.org>
bzr branch lp:~michaelh1/+junk/benchmarking-techniques
r6
2
Why benchmark?
3
Issues are
Relevance
Accuracy
Repeatability
4
Picking relevant benchmarks:
Profile
Workload
Features
We use SPEC 2000 and EEMBC
We'd like shareable benchmarks
5
Test Platform
Build / test / benchmark via Linux / web / commodity hardware
6
Measuring
7
Realtime timers
See 'man clock_gettime'
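A minimal sketch of the call that man page describes, using Python's `time.clock_gettime` wrapper. It assumes a Linux host and picks `CLOCK_MONOTONIC`; the helper names `clock_resolution` and `time_call` are hypothetical and not from the slides.

```python
import time

# Assumption: a monotonic clock, which is unaffected by wall-clock adjustments.
CLOCK = time.CLOCK_MONOTONIC


def clock_resolution():
    """Return the resolution the kernel reports for CLOCK, in seconds."""
    return time.clock_getres(CLOCK)


def time_call(fn, *args):
    """Run fn(*args) once and return (result, elapsed seconds)."""
    start = time.clock_gettime(CLOCK)
    result = fn(*args)
    end = time.clock_gettime(CLOCK)
    return result, end - start


if __name__ == "__main__":
    _, elapsed = time_call(sum, range(1_000_000))
    print(f"resolution: {clock_resolution():.9f} s, elapsed: {elapsed:.6f} s")
```

Comparing the reported resolution with the measured interval gives a first hint of whether a timer is fine-grained enough for the workload.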
8
<comparison of different timers>
See https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/TimerAc
11
External influences
12
Other features to be wary of
scheduler
governor
cpuidle, power management, thermal limiting
SMP
bugs, like core lockdown
NEON startup
See “Understanding the Linux Kernel” ch10:
http://oreilly.com/catalog/linuxkernel/chapter/ch10.html
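One of these influences, the cpufreq governor, can be checked before a run. The sketch below reads the standard Linux sysfs files; the function name `read_governors` is hypothetical, and treating anything other than `performance` as a warning is an assumption rather than a rule from the slides.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")  # standard Linux sysfs location


def read_governors(root=CPU_ROOT):
    """Return {cpu name: governor} for every CPU that exposes cpufreq."""
    governors = {}
    for path in sorted(Path(root).glob("cpu[0-9]*/cpufreq/scaling_governor")):
        # path is <root>/cpuN/cpufreq/scaling_governor, so go up two levels for cpuN
        governors[path.parent.parent.name] = path.read_text().strip()
    return governors


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        # Assumption: only "performance" keeps the clock fixed during a run.
        note = "" if governor == "performance" else "  <- may scale frequency mid-run"
        print(f"{cpu}: {governor}{note}")
```

On systems without cpufreq support the function returns an empty dictionary instead of failing.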
13
Putting things together and running
We use:
timer built into the app
run five times
collect everything
post process
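The steps above can be sketched as a small harness. This is an illustration under assumptions, not the tooling used for the slides: `workload` is a hypothetical stand-in for the application, and the timer wraps the call in-process to mimic a timer built into the app.

```python
import time


def workload():
    """Hypothetical stand-in for the application under test."""
    return sum(i * i for i in range(200_000))


def run_benchmark(fn, runs=5):
    """Time fn `runs` times and keep every sample for post-processing."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    for i, seconds in enumerate(run_benchmark(workload), start=1):
        print(f"run {i}: {seconds:.6f} s")
```

Keeping every sample, rather than only the best or the mean, leaves the choice of statistic to the post-processing step.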
14
Statistics
17
Standard deviation
Dispersion / coefficient of variation
t-scores
t-test
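A minimal sketch of the first two measures using Python's standard `statistics` module. The function name `summarize` is hypothetical, and the sample (n - 1) standard deviation is an assumption about which estimator is intended.

```python
import statistics


def summarize(samples):
    """Return the mean, sample standard deviation, and coefficient of variation."""
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)  # sample standard deviation (divides by n - 1)
    return {"mean": mean, "std": std, "cv": std / mean}


if __name__ == "__main__":
    print(summarize([1.0, 2.0, 3.0]))
```

The coefficient of variation is the standard deviation divided by the mean, so it stays comparable across benchmarks with different run times.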
18
\[ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}} \]
See http://en.wikipedia.org/wiki/Welch%27s_t_
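Welch's t statistic can be computed directly from two sets of timings. The sketch below uses only the standard library; `welch_t` is a hypothetical helper name, and the variances are the sample (n - 1) variances.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t: difference of means over the combined standard error."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


if __name__ == "__main__":
    print(welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` computes the same statistic and also returns a p-value.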
19
Compiler                Mean  Std    CV
gcc-4.6.2               1.00  134u   134u
gcc-linaro-4.6-2011.11  4.25  5035u  1178u
t = 30,300
21
Variant   Mean  Std   CV
Plain     1.00  320u  320u
With SMS  1.01  430u  426u
t = 118
22
Variant          Mean   Std   CV
Plain            1.00   320u  320u
With vectoriser  1.001  404u  404u
t = 8.27 - significant
23
Our tools
perf
difftest
betters
tabulate
Python
LibreOffice!
Other statistical tools like scipy.stats, R, Judge, and ministat
24
http://help.libreoffice.org/Calc/Applying_AutoFilter
25
www.linaro.org / wiki.linaro.org
people.linaro.org/~michaelh/presentations

Q2.12: Benchmarking Techniques