Q2.12: Benchmarking Techniques

Benchmarking Techniques
Michael Hope <michael.hope@linaro.org>
bzr branch lp:~michaelh1/+junk/benchmarking-techniques
r6

3
Issues are
Relevance
Accuracy
Repeatability

4
Picking relevant benchmarks:
Profile
Workload
Features
We use SPEC 2000 and EEMBC
We'd like shareable benchmarks

5
Test Platform
Build / test / benchmark via Linux / web / commodity hardware

7
Realtime timers
See 'man clock_gettime'
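As a rough illustration of the clock_gettime interface, the sketch below queries a clock's reported resolution and the smallest step actually observed between readings. It uses Python's wrappers around the same system calls. The choice of CLOCK_MONOTONIC and the function names are assumptions made for this example, not something the slides specify.

```python
import time


def timer_resolution_ns(clock_id=time.CLOCK_MONOTONIC):
    """Return the resolution the kernel reports for clock_id, in nanoseconds."""
    return round(time.clock_getres(clock_id) * 1e9)


def smallest_observed_step_ns(clock_id=time.CLOCK_MONOTONIC, samples=1000):
    """Return the smallest non-zero gap seen between back-to-back readings."""
    smallest = None
    for _ in range(samples):
        start = time.clock_gettime_ns(clock_id)
        end = time.clock_gettime_ns(clock_id)
        # Spin until the clock visibly advances, so the step is never zero.
        while end == start:
            end = time.clock_gettime_ns(clock_id)
        step = end - start
        if smallest is None or step < smallest:
            smallest = step
    return smallest


if __name__ == "__main__":
    print("reported resolution (ns):", timer_resolution_ns())
    print("smallest observed step (ns):", smallest_observed_step_ns())
```

The reported resolution and the observed step can differ, which is one reason to measure a timer on the target board rather than trust the documentation.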

8
<comparison of different timers>
See https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/TimerAc

12
Other features to be wary of
scheduler
governor
cpuidle, power management, thermal limiting
SMP
bugs, like core lockdown
NEON startup
See “Understanding the Linux Kernel” ch10:
http://oreilly.com/catalog/linuxkernel/chapter/ch10.html
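One way to check the governor item in the list above before a run is to read the cpufreq governor for each CPU from sysfs. This is a minimal sketch that assumes the standard Linux layout under /sys/devices/system/cpu. The function name is hypothetical, and boards without cpufreq support simply yield an empty result.

```python
import glob
import os


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name to its cpufreq governor, where exposed."""
    governors = {}
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        # Path ends in .../cpuN/cpufreq/scaling_governor, so cpuN is third from last.
        cpu = path.split(os.sep)[-3]
        try:
            with open(path) as handle:
                governors[cpu] = handle.read().strip()
        except OSError:
            # Skip CPUs whose governor file cannot be read.
            continue
    return governors


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        print(cpu, governor)
```

Recording this alongside each result makes it easier to spot runs taken under a different governor.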

13
Putting things together and running
We use:
timer built into the app
run five times
collect everything
post process
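The steps above can be sketched as a small harness: time only the workload itself, repeat it five times, and keep every raw sample for later post-processing. The names run_benchmark and workload are placeholders for this example, and time.perf_counter stands in for whatever timer the real application embeds.

```python
import time


def run_benchmark(workload, runs=5):
    """Time workload() `runs` times and return every raw sample in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would call the code under test.
    samples = run_benchmark(lambda: sum(range(100_000)))
    print(samples)
```

Returning all samples rather than an average keeps outliers visible, so the decision about how to summarise them is left to the post-processing step.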

17
Standard deviation
Dispersion / coefficient of variation
t-scores
t-test
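A minimal sketch of the first two statistics, using only the Python standard library. It assumes the sample standard deviation (N - 1 denominator) and defines the coefficient of variation as the standard deviation divided by the mean. The function name summarise is made up for this example.

```python
import statistics


def summarise(samples):
    """Return the mean, sample standard deviation, and coefficient of variation."""
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)  # sample standard deviation (N - 1)
    return {"mean": mean, "std": std, "cv": std / mean}


if __name__ == "__main__":
    print(summarise([1.0, 2.0, 3.0]))
```

Because the coefficient of variation is relative to the mean, it allows the spread of benchmarks with very different run times to be compared directly.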

18
$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}} $$
See http://en.wikipedia.org/wiki/Welch%27s_t_
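Welch's t statistic can be computed directly from two sets of timings. The sketch below assumes sample variances (N - 1 denominator) and returns only the statistic, without degrees of freedom or a p-value. The function name welch_t is chosen for this example.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic for two independent samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Standard error of the difference, allowing unequal variances and sizes.
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / standard_error


if __name__ == "__main__":
    print(welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

For a full test including the p-value, scipy.stats.ttest_ind with equal_var=False performs the same Welch variant.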

19
Compiler                 Mean   Std     CV
gcc-4.6.2                1.00   134u    134u
gcc-linaro-4.6-2011.11   4.25   5035u   1178u
t = 30,300

21
Variant    Mean   Std    CV
Plain      1.00   320u   320u
With SMS   1.01   430u   426u
t = 118

22
Variant           Mean    Std    CV
Plain             1.00    320u   320u
With vectoriser   1.001   404u   404u
t = 8.27 - significant

23
Our tools
perf
difftest
betters
tabulate
Python
LibreOffice!
Other statistical tools like scipy.stats, R, Judge, and ministat

24
http://help.libreoffice.org/Calc/Applying_AutoFilter

25
www.linaro.org / wiki.linaro.org
people.linaro.org/~michaelh/presentations
