DevoxxUK: Optimizing Application Performance on Kubernetes
- 2. About Me
● Architect, Runtime Cloud Optimization
● Former Maintainer, AdoptOpenJDK Community Docker Images
● Interested in every aspect of running Java Apps in K8s, including Cloud Native as well as Legacy migration to Cloud
● Ex Linux Kernel and glibc hacker
Dinakar Guniguntala (@dinogun)
Runtimes Cloud Architect, Red Hat
- 3. Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration … blah blah blah
Kitna Deti Hai ?*
Any questions ?
* What's the mileage ?
- 7. What is the granularity of observation ?
● Trade-off between accurate info and overhead
Additional Operational Info
● Quarkus Micrometer
● Spring Actuator
● Liberty MicroProfile
● Node.js prom-client
Observability
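All four of the libraries above expose a Prometheus-format metrics endpoint. One common way to have it scraped — an assumption, since it depends on how your Prometheus is configured — is via pod annotations:

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/q/metrics"   # Quarkus Micrometer default; adjust per framework
    prometheus.io/port: "8080"
```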
- 8. BIOS
● CPU Power and Performance Policy: <Performance>
OS / Hypervisor
● CPU Scaling governor: <performance>
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
performance powersave
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
Hyperthreading
● Do not count hyperthreaded cores during capacity planning
Don’t Forget The Hardware
- 10. Node Affinity
● Helps match workloads to the right resources
Pod Affinity
● Helps schedule related pods together
Node and Pod Affinities
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: topology.kubernetes.io/zone
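The example above shows pod (anti-)affinity; node affinity is expressed under the same `affinity:` key. A minimal sketch — the `disktype` label is an assumption for illustration, not from the talk:

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype        # assumed node label for illustration
            operator: In
            values:
            - ssd
```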
- 11. SRE: Lower My Response Time!
CPU Request / Limit, Memory Request / Limit, Node Affinity, Pod Affinity
- 12. K8s QoS classes: Guaranteed, Burstable, BestEffort
Right Size
apiVersion: apps/v1
kind: Deployment
metadata:
  name: acmeair
  labels:
    app: acmeair-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: acmeair-deployment
  template:
    metadata:
      labels:
        name: acmeair-deployment
        app: acmeair-deployment
        app.kubernetes.io/name: "acmeair-mono"
        version: v1
    spec:
      volumes:
      - name: test-volume
        hostPath:
          path: "/root/icp/jLogs"
          type: ""
      containers:
      - name: acmeair-libertyapp
        image: dinogun/acmeair-monolithic
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: 500M
            cpu: 2
          limits:
            memory: 1024M
            cpu: 3
        volumeMounts:
        - name: "test-volume"
          mountPath: "/opt/jLogs"
Ensure LimitRange does not get in the way of your deployment!
apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range
spec:
  limits:
  - default:
      cpu: 1
      memory: 512Mi
    defaultRequest:
      cpu: 0.5
      memory: 256Mi
    type: Container
Requests → Should cover the observed peaks
Limits → Handle any spikes !
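For latency-sensitive pods, the Guaranteed QoS class (requests equal to limits for every container) avoids throttling surprises. A minimal sketch of the resources stanza — the values are illustrative assumptions:

```yaml
resources:
  requests:
    memory: 1024M
    cpu: 2
  limits:
    memory: 1024M   # requests == limits → Guaranteed QoS
    cpu: 2
```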
- 13. SRE: Lower My Response Time!
CPU Request / Limit, Memory Request / Limit, Node Affinity, Pod Affinity, Java Heap Size / Ratio
- 14. Container Aware JVM
Don’t Hardcode the Java Heap!
Use -XX:MaxRAMPercentage and -XX:InitialRAMPercentage instead of -Xmx and -Xms.
Comparing a fixed heap size with a “MaxRAMPercentage” setting, here “-XX:MaxRAMPercentage=80”:
With -Xmx = 2G, Heap = 2G whether Container Mem = 2G, 3G or 4G.
With -XX:MaxRAMPercentage=80, Heap = 1.6G / 2.4G / 3.2G for Container Mem = 2G / 3G / 4G.
Beware of Default Hotspot Settings: if container “mem < 1G”, the JVM assumes a “client-class” machine and the default is “serial GC”!
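In a Deployment, the percentage-based flags can be passed via an environment variable; a minimal sketch reusing the container from the earlier example (the exact flag values are illustrative assumptions):

```yaml
containers:
- name: acmeair-libertyapp
  image: dinogun/acmeair-monolithic
  env:
  - name: JAVA_TOOL_OPTIONS   # picked up automatically by HotSpot JVMs
    value: "-XX:InitialRAMPercentage=50 -XX:MaxRAMPercentage=80"
```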
- 15. SRE: Lower My Response Time!
CPU Request / Limit, Memory Request / Limit, Node Affinity, Pod Affinity, Java Heap Size / Ratio, VPA / HPA / CA
- 16. It’s All About the Scaling
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k
Set HPA with app specific metrics
- type: External
  external:
    metric:
      name: concurrent_connections
      selector:
        matchLabels:
          connection: current
    target:
      type: Value
      value: 1200
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: zookeeper
Use PodDisruptionBudget with CA
to ensure no service disruption
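Slide 15 also lists the VPA; a minimal VerticalPodAutoscaler sketch, assuming the VPA operator is installed in the cluster (the target reuses the HPA example's Deployment name):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: php-apache-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  updatePolicy:
    updateMode: "Auto"   # VPA applies recommendations by evicting and recreating pods
```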
- 20. So what do we need here ?
● Multiple stakeholders to express requirements as an “Objective Function”
● Autonomously detect all the right options that try to match the “Objective Function”
● Try options intelligently and provide a recommendation
- 22. Autotune Architecture
Example Autotune yaml
apiVersion: "recommender.com/v1"
kind: "Autotune"
metadata:
  name: "quarkusapp-autotune"
  namespace: "quarkusapp-autotune-ns"
spec:
  slo:
    objective_function: "performedChecks_total"
    direction: "maximize"
    slo_class: "throughput"
    hpo_algo_impl: optuna_tpe
    function_variables:
    - name: "performedChecks_total"
      query: "metrics_QuarkusApp_performedChecks_total"
      datasource: "prometheus"
      value_type: "double"
  mode: "show"
  selector:
    matchLabel: "app.kubernetes.io/name"
    matchLabelValue: "quarkusApp-deployment"
  datasource:
    name: "prometheus"
    value: "prometheus_URL"
[Architecture diagram] The Dependency Analyzer builds a search space — objective function + tunables (Container + Runtime + App Server + App) + ranges — from layer info such as Micrometer metrics and tuning sets. The Experiment Manager deploys App Pods (Training) with experimental configs alongside the App Pods (Production) that receive the incoming app load, gathers app metrics from metric providers, and feeds experiment results to hyper-parameter optimization (optuna_tpe, tpe_multivariate, optuna_scikit). The Recommendation Manager turns the results summary into a config recommendation, which the Autotune Operator hands to the App Operator(s).
- 24. Objective Fn: Reduce Response Time
[Layer] [Tunable] [Default, Range]
[Quarkus] quarkus.thread-pool.core-threads [1, 3-256]
[Quarkus] quarkus.thread-pool.queue-size [unbounded, 0-10000]
[Quarkus] quarkus.datasource.jdbc.min-size [0, 2-31]
[Quarkus] quarkus.datasource.jdbc.max-size [20, 32-100]
[Hotspot] FreqInlineSize [325, 325-1000]
[Hotspot] MaxInlineLevel [9, 9-50]
[Hotspot] MinInliningThreshold [250, 0-500]
[Hotspot] CompileThreshold [1500, 1000-20000]
[Hotspot] CompileThresholdScaling [1, 1-20]
[Hotspot] ConcGCThreads [0, 0-32]
[Hotspot] InlineSmallCode [1000, 500-5000]
[Hotspot] LoopUnrollLimit [50, 20-250]
[Hotspot] LoopUnrollMin [4, 0-20]
[Hotspot] MinSurvivorRatio [3, 3-48]
[Hotspot] NewRatio [2, 1-20]
[Hotspot] TieredStopAtLevel [4, 0-4]
[Hotspot] TieredCompilation [false, ]
[Hotspot] AllowParallelDefineClass [false, ]
[Hotspot] AllowVectorizeOnDemand [true, ]
[Hotspot] AlwaysCompileLoopMethods [false, ]
[Hotspot] AlwaysPreTouch [false, ]
[Hotspot] AlwaysTenure [false, ]
[Hotspot] BackgroundCompilation [true, ]
[Hotspot] DoEscapeAnalysis [true, ]
[Hotspot] UseInlineCaches [true, ]
[Hotspot] UseLoopPredicate [true, ]
[Hotspot] UseStringDeduplication [false, ]
[Hotspot] UseSuperWord [true, ]
[Hotspot] UseTypeSpeculation [true, ]
[Container] cpuRequest [None, 1-32]
[Container] memoryRequest [None, 270M-8192M]
OpenShift version 4.8.13
3 Master, 6 Worker, 32C – 32GB each, RHEL 8.3
4C – 8GB
Benchmark → TechEmpower Framework – Quarkus RestEasy
K8s resource requests = limits
Incoming load is constant = 512 users
- 27. Summary: Better perf at the cost of a higher hardware config
For full results please see
https://github.com/kruize/autotune-results/tree/main/techempower/experiment-4
Autotune vs Default Config – Take 1
[ Obj Fn = Minimal Response Time ]
60% better response time 19% better throughput
- 29. Summary: Better perf but slightly higher tail latencies
For full results please see
https://github.com/kruize/autotune-results/tree/main/techempower/experiment-6
Autotune vs Default Config – Take 2
[ Obj Fn = Minimal Response Time + Fixed Resources (4C, 4GB) ]
64% better response time 6% better throughput
- 31. Best perf taking into account all requirements !
For full results please see
https://github.com/kruize/autotune-results/tree/main/techempower/experiment-7
Autotune vs Default Config – Take 3
[ Obj Fn = Minimal Response Time + Fixed Resources (4C, 4GB) + Low Tail Latency ]
62% better response time 7% better throughput
- 32. Cost for handling 1 million transactions / sec
For full results please see
https://github.com/kruize/autotune-results/tree/main/techempower/experiment-7
Autotune vs Default Config – Take 3 - COST
[ Obj Fn = Minimal Response Time + Fixed Resources (4C, 4GB) + Low Tail Latency ]
8% cost reduction
- 33. Objective Fn: Reduce Response Time
[Layer] [Tunable] [Default, Range] Best Config (1.91 ms)
[Quarkus] quarkus.thread-pool.core-threads [1, 0-32] = 19
[Quarkus] quarkus.thread-pool.queue-size [unbounded, 0-10000] = 3700
[Quarkus] quarkus.datasource.jdbc.min-size [0, 1-12] = 10
[Quarkus] quarkus.datasource.jdbc.max-size [12, 12-90] = 86
[Hotspot] FreqInlineSize [325, 325-500] = 340
[Hotspot] MaxInlineLevel [9, 9-50] = 50
[Hotspot] MinInliningThreshold [250, 0-200] = 55
[Hotspot] CompileThreshold [1500, 1000-10000] = 6930
[Hotspot] CompileThresholdScaling [1, 1-15] = 8.3
[Hotspot] ConcGCThreads [0, 0-8] = 6
[Hotspot] InlineSmallCode [1000, 500-5000] = 1416
[Hotspot] LoopUnrollLimit [50, 20-250] = 128
[Hotspot] LoopUnrollMin [4, 0-20] = 13
[Hotspot] MinSurvivorRatio [3, 3-48] = 12
[Hotspot] NewRatio [2, 1-10] = 9
[Hotspot] TieredStopAtLevel [4, 0-4] = 4
[Hotspot] TieredCompilation [false, ] = true
[Hotspot] AllowParallelDefineClass [false, ] = false
[Hotspot] AllowVectorizeOnDemand [true, ] = true
[Hotspot] AlwaysCompileLoopMethods [false, ] = false
[Hotspot] AlwaysPreTouch [false, ] = false
[Hotspot] AlwaysTenure [false, ] = true
[Hotspot] BackgroundCompilation [true, ] = true
[Hotspot] DoEscapeAnalysis [true, ] = true
[Hotspot] UseInlineCaches [true, ] = false
[Hotspot] UseLoopPredicate [true, ] = false
[Hotspot] UseStringDeduplication [false, ] = false
[Hotspot] UseSuperWord [true, ] = true
[Hotspot] UseTypeSpeculation [true, ] = true
[Container] cpuRequest [None, 1-4] = 4
[Container] memoryRequest [None, 270M-4096M] = 3319M
OpenShift version 4.8.13
3 Master, 6 Worker, 32C – 32GB each, RHEL 8.3
4C – 8GB
Benchmark → TechEmpower Framework – Quarkus RestEasy
K8s resource requests = limits
Incoming load is constant = 512 users
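One way to apply a recommended config like the one above — an illustrative sketch, not how the talk deploys it — is container resources plus a JVM options env var; container name and image below are hypothetical:

```yaml
containers:
- name: quarkusapp                  # hypothetical container name
  image: quarkus-resteasy:latest    # hypothetical image
  env:
  - name: JAVA_OPTIONS              # a subset of the tuned HotSpot flags above
    value: >-
      -XX:MaxInlineLevel=50 -XX:CompileThreshold=6930
      -XX:+TieredCompilation -XX:TieredStopAtLevel=4
  resources:
    requests:
      cpu: 4
      memory: 3319M
    limits:                         # requests == limits, per the experiment setup
      cpu: 4
      memory: 3319M
```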
- 34. Autotune Roadmap
● Autotune MVP expected 1H 2022
● Currently single service only
● For Dev / QA environments
● Different load conditions = multiple recommended configs
● HPA recommendation
- 35. Summary
● Observability is Key
● Do not forget to tune the hardware
● Set Node and Pod Affinities
● Ensure requests and limits are set for all app pods and right sized
● Do not hardcode the Java heap
● Use app specific scaling metrics
● Ensure no disruption with PDB
● Check out Autotune for autonomous tuning and stay tuned(!) for updates.
- 36. Repo’s and Contributing
● Kruize Project - https://github.com/kruize
● Autotune - https://github.com/kruize/autotune
● Autotune Demo - https://github.com/kruize/autotune-demo
● Benchmarks - https://github.com/kruize/benchmarks
● Autotune Results - https://github.com/kruize/autotune-results
Call for collaboration !
Kruize Slack
@dinogun