Replacing iptables with eBPF in Kubernetes with Cilium
Cilium, eBPF, Envoy, Istio, Hubble
Michal Rostecki
Software Engineer
mrostecki@suse.com
mrostecki@opensuse.org
Swaminathan Vasudevan
Software Engineer
svasudevan@suse.com
What’s wrong with iptables?
What’s wrong with legacy iptables?
iptables runs into several significant problems:
● Updates must be made by recreating and replacing all rules in a single transaction.
● Chains of rules are implemented as a linked list, so almost all operations are O(n).
● Access control lists (ACLs) are conventionally implemented in iptables as a sequential list of rules.
● Matching is based on IPs and ports; iptables is not aware of L7 protocols.
● Every time there is a new IP or port to match, rules must be added and the chain changed.
● Resource consumption is high on Kubernetes.
Complexity of iptables
● Linked list.
● All rules in the chain have to be replaced as a whole.
Rule 1 → Rule 2 → ... → Rule n
● Search: O(n)
● Insert: O(1)
● Delete: O(n)

Kubernetes uses iptables for...
● kube-proxy: the component that implements Services and load balancing with DNAT iptables rules
● Network Policies: most CNI plugins implement them with iptables
What is BPF?
Linux Network Stack
[Diagram: layers of the Linux network stack, from hardware up to userspace: HW / Bridge / OVS; Netdevice / Drivers; Traffic Shaping; Ethernet; IPv4 / IPv6; Netfilter; TCP / UDP / Raw; Sockets; System Call Interface; Processes]
● The Linux kernel network stack is split into multiple abstraction layers.
● Linux has maintained strong userspace API compatibility for years.
● The layering shows how complex the Linux kernel is and reflects its years of evolution.
● The stack cannot be replaced in the short term.
● The layers are very hard to bypass.
● The Netfilter module has been supported by Linux for more than two decades, and packet filtering has to be applied to packets as they move up and down the stack.
[Diagram: the same Linux network stack, annotated with BPF attachment points: BPF system calls, BPF sockmap and sockops, BPF TC hooks, BPF XDP, BPF kernel hooks, BPF cgroups]

[Chart: axis unit Mpps (million packets per second)]
BPF replaces iptables
[Diagram: the packet path between two netdevs (physical or virtual devices) and local processes, passing through the netfilter hooks PREROUTING, INPUT, FORWARD, OUTPUT and POSTROUTING with their FILTER and NAT tables and the routing decisions between them. eBPF code is attached at the XDP hooks and the eBPF TC hooks, next to the iptables netfilter hooks.]
BPF based filtering architecture
[Diagram: on the way from a netdev (physical or virtual device) to the Linux stack, the TC/XDP ingress hook runs an ingress chain selector that chooses between the INGRESS chain (local destination) and the FORWARD chain (remote destination). On the way from the Linux stack to a netdev, the TC egress hook runs an egress chain selector that distinguishes local-source traffic (OUTPUT chain) from remote-source traffic. Connection tracking labels each packet and stores and updates the session on every path. Netfilter sits between the hooks and the Linux stack on both sides.]
BPF based tail calls
[Diagram: a packet enters from the eBPF hook and passes through a chain of eBPF programs connected by tail calls: header parsing; an IP.dst lookup (IP1 → bitv1, IP2 → bitv2, IP3 → bitv3); an IP.proto lookup (* → bitv1, udp → bitv2, tcp → bitv3); further per-field programs up to eBPF Program #N; a bitwise AND of the bit-vectors; a search for the first matching rule; a counter update (rule1 → cnt1, rule2 → cnt2); and the action (drop/accept; rule1 → act1, rule2 → act2, rule3 → act3), after which the packet returns to the eBPF hook. The packet header offsets and the bitvector with the temporary result are each kept in a per-CPU array shared across the entire program chain.]
● LBVS (Linear Bit Vector Search) is implemented with a chain of eBPF programs connected through tail calls.
● Header parsing is done once, and the results are kept in a shared map for performance reasons.
● Each eBPF program is injected only if there are rules operating on that field.
● Each eBPF program can use a different matching algorithm (e.g., exact match, longest prefix match).
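The bit-vector search on this slide can be sketched in plain Python. This is only an illustration under stated assumptions: the rule set, the default action and all names below are hypothetical, and each loop iteration in `classify` stands in for one eBPF program in the tail-call chain. It is not the code used by any real BPF datapath.

```python
# Hypothetical rule set; index 0 has the highest priority.
RULES = [
    {"dst": "10.0.0.1", "proto": "tcp", "action": "accept"},
    {"dst": "10.0.0.2", "proto": "*", "action": "drop"},
    {"dst": "*", "proto": "udp", "action": "drop"},
]
DEFAULT_ACTION = "accept"  # assumption: packets matching no rule are accepted
FIELDS = ("dst", "proto")


def build_table(field):
    """Map each field value to a bit vector; bit i is set if rule i accepts that value."""
    wildcard = 0
    for i, rule in enumerate(RULES):
        if rule[field] == "*":
            wildcard |= 1 << i
    table = {"*": wildcard}
    for i, rule in enumerate(RULES):
        value = rule[field]
        if value != "*":
            # Wildcard rules also match every concrete value.
            table[value] = table.get(value, wildcard) | (1 << i)
    return table


TABLES = {field: build_table(field) for field in FIELDS}
COUNTERS = [0] * len(RULES)  # per-rule hit counters


def classify(packet):
    """Return the action of the first rule matching every field of the packet."""
    bitv = (1 << len(RULES)) - 1  # start with all rules as candidates
    for field in FIELDS:  # one lookup per field, like one program per field
        table = TABLES[field]
        bitv &= table.get(packet[field], table["*"])
        if bitv == 0:
            return DEFAULT_ACTION  # no rule can match any more
    first = (bitv & -bitv).bit_length() - 1  # lowest set bit = first matching rule
    COUNTERS[first] += 1
    return RULES[first]["action"]
```

With this rule set, `classify({"dst": "10.0.0.2", "proto": "tcp"})` intersects the two per-field vectors and lands on the second rule. The cost per packet depends on the number of fields, not on the position of the matching rule in a list.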

BPF goes into...
● Load balancers - katran
● perf
● systemd
● Suricata
● Open vSwitch - AF_XDP
● And many many others
BPF is used by...
Cilium
What is Cilium?

CNI Functionality
CNI is a CNCF (Cloud Native Computing Foundation) project for Linux containers.
It consists of a specification and libraries for writing plugins.
CNI is concerned only with the network connectivity of containers:
● ADD/DEL
General container runtime considerations for CNI:
The container runtime must
● create a new network namespace for the container before invoking any plugins.
● determine which networks the container belongs to and add the container to each of them by calling the corresponding plugin for each network.
● not invoke parallel operations for the same container.
● order ADD and DEL operations for a container such that ADD is always eventually followed by a corresponding DEL.
● not call ADD twice (without a corresponding DEL) for the same (network name, container ID, name of the interface inside the container).
When the CNI ADD call is invoked, the plugin tries to add the container to the network by creating the respective veth pair and assigning an IP address from the respective IPAM plugin or from the host scope.
When the CNI DEL call is invoked, the plugin tries to remove the container from the network, release the IP address to the IPAM manager, and clean up the veth pair.
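The ADD/DEL contract above can be sketched as a toy in-memory plugin. Everything here is an assumption made for illustration: the class name, the `10.0.0.0/29` pool, the `lxc-` interface naming, and the choice to tolerate a repeated DEL. A real plugin creates actual veth pairs and talks to a real IPAM plugin.

```python
import ipaddress


class ToyCniPlugin:
    """In-memory stand-in for a CNI plugin with a trivial IPAM pool (hypothetical)."""

    def __init__(self, subnet="10.0.0.0/29"):
        # Hypothetical IPAM: hand out host addresses of the subnet in order.
        self.free_ips = [str(ip) for ip in ipaddress.ip_network(subnet).hosts()]
        # Keyed by (network name, container ID, interface name inside the container).
        self.attachments = {}

    def add(self, network, container_id, ifname):
        key = (network, container_id, ifname)
        if key in self.attachments:
            # The runtime must not call ADD twice without a corresponding DEL.
            raise RuntimeError("ADD called twice without DEL for %r" % (key,))
        if not self.free_ips:
            raise RuntimeError("IPAM pool exhausted")
        ip = self.free_ips.pop(0)
        # Record the veth pair as (host side, container side); names are made up.
        self.attachments[key] = {"ip": ip, "veth": ("lxc-" + container_id, ifname)}
        return ip

    def delete(self, network, container_id, ifname):
        # Design choice of this sketch: a DEL for an unknown attachment is a no-op.
        entry = self.attachments.pop((network, container_id, ifname), None)
        if entry is not None:
            self.free_ips.insert(0, entry["ip"])  # release the IP back to the pool
```

A runtime following the rules above would call `add("cilium", "c1", "eth0")` once, use the returned address, and eventually call `delete("cilium", "c1", "eth0")`, after which the address is available again.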
Cilium CNI Plugin control flow
[Diagram: in userspace, kubectl talks to the Kubernetes API server, followed by the kubelet, CRI-containerd, the Cilium CNI plugin (cni-add()) and the Cilium agent. The agent issues bpf_syscall() into the Linux kernel, where BPF maps and a BPF hook in the network stack sit between eth0 and the containers (Container1, Container2) of a K8s pod.]
Cilium components, with BPF hook points and BPF maps shown in the Linux stack
[Diagram label: Orchestrator]
[Diagram: containers A, B and C, each with its own eth0, attached to the host-side interfaces lxc0, lxc0 and lxc1; two node-level eth0 interfaces]

Networking modes
● Encapsulation
  Use case: Cilium handles routing between nodes.
  [Diagram: Node A, Node B and Node C connected to each other over VXLAN]
● Direct routing
  Use case: using cloud provider routers or a BGP routing daemon.
  [Diagram: Node A, Node B and Node C connected through cloud or BGP routing]
L3 filtering – label based, ingress
[Diagram: pods labeled role=frontend (IP 10.0.0.1 and 10.0.0.2), pods labeled role=backend (IP 10.0.0.3 and 10.0.0.4) and an unlabeled pod (IP 10.0.0.5); connections are marked allow or deny]

L3 filtering – label based, ingress
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
description: "Allow frontends to access backends"
metadata:
  name: "frontend-backend"
spec:
  endpointSelector:
    matchLabels:
      role: backend
  ingress:
  - fromEndpoints:
    - matchLabels:
        role: frontend
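How such a label-based ingress rule is evaluated can be sketched as follows, using the role labels from the pod diagram. The dictionary layout and function names are hypothetical, and treating endpoints that no policy selects as unrestricted is an assumption of this sketch, not a statement about Cilium's datapath.

```python
# Hypothetical in-memory form of the policy: who it applies to and who may connect.
POLICY = {
    "endpointSelector": {"role": "backend"},
    "fromEndpoints": [{"role": "frontend"}],
}


def selects(match_labels, labels):
    """True if every key/value pair of the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def ingress_allowed(policy, src_labels, dst_labels):
    if not selects(policy["endpointSelector"], dst_labels):
        # Assumption of this sketch: endpoints not selected by the policy are unrestricted.
        return True
    return any(selects(sel, src_labels) for sel in policy["fromEndpoints"])
```

A frontend pod reaching a backend pod is allowed, while a pod without labels is denied. Note that the decision never looks at IP addresses, only at labels.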
L3 filtering – CIDR based, egress
[Diagram: a pod in Cluster A labeled role=backend (IP 10.0.0.1), a destination with IP 10.0.1.1 in subnet 10.0.1.0/24, a destination with IP 10.0.2.1 in subnet 10.0.2.0/24, and any IP not belonging to 10.0.1.0/24; connections are marked allow or deny]
L3 filtering – CIDR based, egress
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
description: "Allow backends to access 10.0.1.0/24"
metadata:
  name: "frontend-backend"
spec:
  endpointSelector:
    matchLabels:
      role: backend
  egress:
  - toCIDR:
    - "10.0.1.0/24"
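The CIDR check behind this egress rule can be sketched with Python's standard `ipaddress` module. The function name and the list of allowed networks are illustrative assumptions; the sketch only shows the membership test, not how a BPF datapath performs it.

```python
import ipaddress

# Networks the backend pods may reach, taken from the policy's toCIDR entry.
ALLOWED_CIDRS = [ipaddress.ip_network("10.0.1.0/24")]


def egress_allowed(dst_ip):
    """True if the destination address falls inside any allowed CIDR."""
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in ALLOWED_CIDRS)
```

With the addresses from the diagram, `egress_allowed("10.0.1.1")` passes and `egress_allowed("10.0.2.1")` does not.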
L4 filtering
[Diagram: traffic to a pod labeled role=backend (IP 10.0.0.1) on TCP/80 and on any other port; connections are marked allow or deny]

L4 filtering
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
description: "Allow access to backends only on TCP/80"
metadata:
  name: "frontend-backend"
spec:
  endpointSelector:
    matchLabels:
      role: backend
  ingress:
  - toPorts:
    - ports:
      - port: "80"
        protocol: "TCP"
L7 filtering – API Aware Security
[Diagram: a pod (IP 10.0.0.5) sends the HTTP requests GET /articles/{id} and GET /private to a pod labeled role=api (IP 10.0.0.1)]
L7 filtering – API Aware Security
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
description: "L7 policy to restrict access to specific HTTP endpoints"
metadata:
  name: "frontend-backend"
spec:
  endpointSelector:
    matchLabels:
      role: backend
  ingress:
  - toPorts:
    - ports:
      - port: "80"
        protocol: "TCP"
      rules:
        http:
        - method: "GET"
          path: "/article/$"
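An L7 rule of this kind matches on HTTP method and path instead of IPs and ports. The sketch below is a rough approximation: it uses Python's `re` module as a stand-in for the proxy's own regex matching, and the path pattern `/articles/[0-9]+` is a hypothetical rule chosen to mirror the `GET /articles/{id}` and `GET /private` requests from the diagram, not the pattern from the policy above.

```python
import re

# Hypothetical HTTP rules; each request must match the method and the full path.
HTTP_RULES = [
    {"method": "GET", "path": "/articles/[0-9]+"},
]


def http_allowed(method, path):
    """True if some rule matches both the method and the whole request path."""
    return any(
        rule["method"] == method and re.fullmatch(rule["path"], path) is not None
        for rule in HTTP_RULES
    )
```

Under these assumed rules, `GET /articles/42` is allowed, while `GET /private` and `POST /articles/42` are rejected even though all three arrive on the same TCP port.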
Standalone proxy, L7 filtering
[Diagram: Node A (Pod A + BPF) and Node B (Pod B + BPF) connected over VXLAN. On each node, BPF programs are generated for L3/L4 filtering, and Envoy generates BPF programs for L7 filtering through libcilium.so.]

Features
Cluster Mesh
[Diagram: Cluster A and Cluster B, each with nodes (+ BPF) running pods whose containers have an eth0 interface; the clusters share an external etcd]
[Diagram: two services, each with sockets on top of TCP/IP, Ethernet and iptables layers; local traffic passes over loopback, and traffic between hosts passes over eth0 and the network]
[Diagram: the same two services with Cilium CNI; the sockets of each service are shown without the iptables and loopback layers, and only the sockets that cross the network sit on top of TCP/IP, Ethernet and eth0]

[Diagrams: Service A, Service B and Service C; Service A and Service B; Service A and Service B with an external GitHub service and an external cloud network]
Kubernetes Services
BPF, Cilium
● Hash table (Key → Value).
● Search: O(1), Insert: O(1), Delete: O(1)
Iptables, kube-proxy
● Linked list (Rule 1 → Rule 2 → ... → Rule n).
● All rules in the chain have to be replaced as a whole.
● Search: O(n), Insert: O(1), Delete: O(n)
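The difference between the two data structures can be illustrated with a small sketch that counts comparisons. The service addresses and backend names are made up, a Python list stands in for the rule chain, and a `dict` stands in for a BPF hash map; none of this is kube-proxy or Cilium code.

```python
def build_services(n):
    """Generate n hypothetical (ClusterIP, port) -> backend entries."""
    return [(("10.96.%d.%d" % (i // 256, i % 256), 80), "backend-%d" % i) for i in range(n)]


def chain_lookup(rules, key):
    """Walk the rules in order, as a sequential chain would; return (target, steps)."""
    steps = 0
    for match, target in rules:
        steps += 1
        if match == key:
            return target, steps
    return None, steps


def map_lookup(table, key):
    """Single hashed lookup; the cost does not grow with the number of services."""
    return table.get(key)


services = build_services(1000)
service_map = dict(services)
last_key = services[-1][0]
```

For the last of the 1000 services, `chain_lookup(services, last_key)` needs 1000 comparisons, while `map_lookup(service_map, last_key)` finds the same backend with one hashed access on average.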

[Chart: y-axis in usec, x-axis: number of services in cluster]
CNI chaining
● Policy enforcement, load balancing, multi-cluster connectivity
● IP allocation, configuring network interface, encapsulation/routing inside the cluster
Native support for AWS ENI
To sum it up
Why is Cilium awesome?
● It makes the disadvantages of iptables disappear and always gets the best from the Linux kernel.
● Cluster Mesh / multi-cluster.
● Makes Istio faster.
● Offers L7 API-aware filtering as a Kubernetes resource.
● Integrates with other popular CNI plugins: Calico, Flannel, Weave, Lyft, AWS CNI.
Replacing iptables with eBPF in Kubernetes with Cilium

  • 1. Replacing iptables with eBPF in Kubernetes with Cilium Cilium, eBPF, Envoy, Istio, Hubble Michal Rostecki Software Engineer mrostecki@suse.com mrostecki@opensuse.org Swaminathan Vasudevan Software Engineer svasudevan@suse.com
  • 3. What’s wrong with legacy iptables?
      iptables runs into a couple of significant problems:
      ● Updates must be made by recreating and updating all rules in a single transaction.
      ● Chains of rules are implemented as a linked list, so almost all operations are O(n).
      ● Access control lists (ACLs) are implemented as a sequential list of rules.
      ● Matching is based on IPs and ports, with no awareness of L7 protocols.
      ● Every time there is a new IP or port to match, rules need to be added and the chain changed.
      ● It consumes a lot of resources on Kubernetes.
  • 4. Complexity of iptables
      ● Linked list.
      ● All rules in the chain have to be replaced as a whole.
      [Diagram: Rule 1, Rule 2, ..., Rule n]
      Search O(n), Insert O(1), Delete O(n)
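The cost difference on this slide can be illustrated with a small sketch. It is not iptables code: the `Rule` class, the helper names, and the rule contents are made up for illustration. It only contrasts a first-match walk over a linked list with a hash-table lookup, which is the kind of structure BPF maps make available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """One rule in a toy chain (hypothetical structure, not the kernel's)."""
    dst_ip: str
    dst_port: int
    verdict: str
    next: Optional["Rule"] = None


def build_chain(n):
    """Build a linked list of n rules; each new rule is prepended in O(1)."""
    head = None
    for i in reversed(range(n)):
        head = Rule(f"10.0.{i // 256}.{i % 256}", 80, "ACCEPT", head)
    return head


def chain_lookup(head, dst_ip, dst_port):
    """Walk the chain until the first match; return (verdict, rules visited)."""
    visited = 0
    node = head
    while node is not None:
        visited += 1
        if node.dst_ip == dst_ip and node.dst_port == dst_port:
            return node.verdict, visited
        node = node.next
    return "DROP", visited  # default verdict when nothing matches


def build_table(n):
    """Same rules keyed by (ip, port): one hash lookup regardless of n."""
    return {(f"10.0.{i // 256}.{i % 256}", 80): "ACCEPT" for i in range(n)}


chain = build_chain(1000)
table = build_table(1000)

# The last rule in the chain is the worst case for the linear walk.
verdict, visited = chain_lookup(chain, "10.0.3.231", 80)
print(verdict, visited)
print(table.get(("10.0.3.231", 80), "DROP"))
```

The linear walk visits every rule before reaching the last one, while the dictionary answers with a single lookup.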
  • 5. Kubernetes uses iptables for...
      ● kube-proxy: the component which implements Services and load balancing by DNAT iptables rules
      ● most CNI plugins use iptables for Network Policies
  • 7. Linux Network Stack
      [Diagram layers: HW, Bridge, OVS; Netdevice / Drivers; Traffic Shaping; Ethernet; IPv4, IPv6; Netfilter; TCP, UDP, Raw; Sockets; System Call Interface; Processes]
      ● The Linux kernel network stack is split into multiple abstraction layers.
      ● Linux has kept strong userspace API compatibility for years.
      ● This shows how complex the Linux kernel is after years of evolution.
      ● It cannot be replaced in the short term.
      ● It is very hard to bypass the layers.
      ● The Netfilter module has been supported by Linux for more than two decades, and packet filtering has to be applied to packets that move up and down the stack.
  • 8. [Diagram: the same stack layers (HW, Bridge, OVS; Netdevice / Drivers; Traffic Shaping; Ethernet; IPv4, IPv6; Netfilter; TCP, UDP, Raw; Sockets; System Call Interface; Processes) annotated with BPF hook points: BPF system calls, BPF sockmap and sockops, BPF TC hooks, BPF XDP, BPF kernel hooks, BPF cgroups]
  • 10. BPF replaces iptables
      [Diagram labels: netdev (physical or virtual device) on both sides; local processes; iptables netfilter hooks PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING with FILTER and NAT tables and routing decisions; eBPF code at the eBPF TC hooks and XDP hooks]
  • 11. BPF based filtering architecture
      [Diagram labels: netdev (physical or virtual device); TC/XDP ingress hook; ingress chain selector; INGRESS chain [local dst]; FORWARD chain [remote dst]; to Linux stack; NetFilter; from Linux stack; TC egress hook; egress chain selector; OUTPUT chain [local src]; [remote src]; connection tracking: label packet, store session, update session]
  • 12. BPF based tail calls
      [Diagram labels: packet in, from eBPF hook; headers parsing; IP.dst lookup (IP1: bitv1, IP2: bitv2, IP3: bitv3); IP.proto lookup (*: bitv1, udp: bitv2, tcp: bitv3); bitwise AND of bit-vectors; search first matching rule (rule1: act1, rule2: act2, rule3: act3); update counters (rule1: cnt1, rule2: cnt2); action (drop/accept); packet out, to eBPF hook; eBPF programs #1 to #N linked by tail calls; packet header offsets and the bit-vector with the temporary result each live in a per-CPU array shared across the entire program chain]
      ● Each eBPF program can exploit a different matching algorithm (e.g., exact match, longest prefix match).
      ● Each eBPF program is injected only if there are rules operating on that field.
      ● LBVS is implemented with a chain of eBPF programs connected through tail calls.
      ● Header parsing is done once and the results are kept in a shared map for performance reasons.
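The bit-vector pipeline in this slide can be sketched in plain Python. This is an illustration under assumptions, not the eBPF implementation: the rule set, the helper names, and the use of exact-match dictionaries for every field are made up here (the slide notes that real programs may use longest prefix match, for example). Each field lookup returns a bit-vector of the rules that the field value satisfies, the vectors are ANDed, and the lowest set bit is the first matching rule.

```python
# Rules in priority order; None means "any value" for that field.
# These rules are hypothetical and exist only for this sketch.
RULES = [
    {"dst": "10.0.0.1", "proto": "tcp", "action": "accept"},  # rule 0
    {"dst": "10.0.0.2", "proto": None, "action": "accept"},   # rule 1
    {"dst": None, "proto": "udp", "action": "drop"},          # rule 2
]
DEFAULT_ACTION = "drop"


def build_field_table(rules, field):
    """Map each concrete field value to a bit-vector of rules it satisfies.

    Bit i is set when rule i accepts that value. Rules that wildcard the
    field are present in every vector and in the fallback vector.
    """
    wildcard = 0
    for i, rule in enumerate(rules):
        if rule[field] is None:
            wildcard |= 1 << i
    table = {}
    for i, rule in enumerate(rules):
        value = rule[field]
        if value is not None:
            table[value] = table.get(value, wildcard) | (1 << i)
    return table, wildcard


DST_TABLE, DST_ANY = build_field_table(RULES, "dst")
PROTO_TABLE, PROTO_ANY = build_field_table(RULES, "proto")


def classify(dst, proto):
    """Return (action, rule index), or (default action, None) if no rule matches."""
    # One lookup per field, then a bitwise AND of the partial results.
    bits = DST_TABLE.get(dst, DST_ANY) & PROTO_TABLE.get(proto, PROTO_ANY)
    if bits == 0:
        return DEFAULT_ACTION, None
    # Isolate the lowest set bit: the highest-priority matching rule.
    first = (bits & -bits).bit_length() - 1
    return RULES[first]["action"], first


print(classify("10.0.0.2", "udp"))
```

In the slide's design each field lookup would be its own eBPF program reached through a tail call, with the intermediate bit-vector kept in a per-CPU array; here the two lookups are simply two dictionary reads.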
  • 13. BPF goes into...
      ● Load balancers - katran
      ● perf
      ● systemd
      ● Suricata
      ● Open vSwitch - AF_XDP
      ● And many others
  • 14. BPF is used by...
  • 17. CNI Functionality
      CNI is a CNCF (Cloud Native Computing Foundation) project for Linux containers. It consists of a specification and libraries for writing plugins, and it only cares about the network connectivity of containers.
      ● ADD/DEL
      General container runtime considerations for CNI. The container runtime must:
      ● create a new network namespace for the container before invoking any plugins;
      ● determine the networks for the container and add the container to each network by calling the corresponding plugin for each network;
      ● not invoke parallel operations for the same container;
      ● order ADD and DEL operations for a container such that ADD is always eventually followed by a corresponding DEL;
      ● not call ADD twice (without a corresponding DEL) for the same (network name, container ID, name of the interface inside the container).
      When a CNI ADD call is invoked, the plugin tries to add the container to the network by creating the respective veth pair and assigning an IP address from the respective IPAM plugin or using the host scope.
      When a CNI DEL call is invoked, the plugin tries to remove the container from the network, releases the IP address to the IPAM manager, and cleans up the veth pair.
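The ADD/DEL ordering rules above can be modelled with a toy runtime. This sketch does not use the real CNI libraries or create any interfaces: `ToyIpam` and `ToyRuntime` are hypothetical names, and the "network setup" is reduced to leasing an address from a small pool. It only shows the bookkeeping implied by the slide: one ADD per (network name, container ID, interface name) until a matching DEL releases it.

```python
import ipaddress


class ToyIpam:
    """Hypothetical stand-in for an IPAM plugin: leases addresses from one CIDR."""

    def __init__(self, cidr):
        self._free = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self._leases = {}

    def allocate(self, key):
        ip = self._free.pop(0)  # raises IndexError when the pool is exhausted
        self._leases[key] = ip
        return ip

    def release(self, key):
        # Return the address to the front of the pool so it is reused first.
        self._free.insert(0, self._leases.pop(key))


class ToyRuntime:
    """Tracks attachments so ADD is never repeated without a DEL in between."""

    def __init__(self, ipam):
        self.ipam = ipam
        self.attached = set()

    def add(self, network, container_id, ifname):
        key = (network, container_id, ifname)
        if key in self.attached:
            raise RuntimeError("ADD called twice without a corresponding DEL")
        ip = self.ipam.allocate(key)  # a real plugin would also create the veth pair
        self.attached.add(key)
        return ip

    def delete(self, network, container_id, ifname):
        key = (network, container_id, ifname)
        if key in self.attached:  # treat a DEL for an unknown attachment as a no-op
            self.attached.remove(key)
            self.ipam.release(key)  # a real plugin would also remove the veth pair


runtime = ToyRuntime(ToyIpam("10.0.0.0/29"))
print(runtime.add("cilium", "container-1", "eth0"))
runtime.delete("cilium", "container-1", "eth0")
```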
  • 18. Cilium CNI plugin control flow
      [Diagram labels: kubectl; Kubernetes API server; kubelet; CRI-containerd; CNI plugin (Cilium), cni-add(); Cilium agent; bpf_syscall(); BPF maps (sample hex contents); BPF hook; Linux kernel network stack; eth0; K8s pod with Container1 and Container2; userspace / kernel boundary]
  • 19. Cilium components with BPF hook points and BPF maps shown in the Linux stack [Diagram label: Orchestrator]
  • 20. [Diagram labels: container A, container B, container C; interfaces eth0, eth0, eth0, lxc0, lxc0, lxc1, eth0, eth0]
  • 21. Networking modes
      ● Encapsulation. Use case: Cilium handling routing between nodes. [Diagram: Node A, Node B, Node C connected via VXLAN]
      ● Direct routing. Use case: using cloud provider routers or a BGP routing daemon. [Diagram: Node A, Node B, Node C connected via cloud or BGP routing]
  • 24. L3 filtering – label based, ingress
      [Diagram labels: pods with labels role=frontend (IP 10.0.0.1 and IP 10.0.0.2); pod without labels (IP 10.0.0.5); pods with labels role=backend (IP 10.0.0.3 and IP 10.0.0.4); allow; deny]
  • 25. 25 L3 filtering – label based, ingress
      apiVersion: "cilium.io/v2"
      kind: CiliumNetworkPolicy
      description: "Allow frontends to access backends"
      metadata:
        name: "frontend-backend"
      spec:
        endpointSelector:
          matchLabels:
            role: backend
        ingress:
        - fromEndpoints:
          - matchLabels:
              role: frontend
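How such a label-based ingress rule is evaluated can be sketched in a few lines. This is a simplified model, not Cilium's implementation: the dictionary layout and function names are hypothetical, and the source selector is assumed to be role=frontend, as in the diagram on the previous slide.

```python
def selector_matches(match_labels, pod_labels):
    """Return True if pod_labels contains every key/value pair in match_labels."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Hypothetical in-memory form of the policy on the slide.
POLICY = {
    "endpoint_selector": {"role": "backend"},
    "from_endpoints": [{"role": "frontend"}],
}


def ingress_allowed(policy, src_labels, dst_labels):
    """Decide whether traffic from src to dst passes this single ingress policy."""
    if not selector_matches(policy["endpoint_selector"], dst_labels):
        # The policy does not select the destination, so it imposes no restriction.
        return True
    # The destination is selected: only the listed sources are allowed.
    return any(selector_matches(sel, src_labels) for sel in policy["from_endpoints"])
```

With this model, a pod labeled role=frontend reaches a backend, while a pod without labels (such as 10.0.0.5 in the diagram) does not.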
  • 26. 26 L3 filtering – CIDR based, egress [Diagram: in Cluster A, a pod labeled role=backend (10.0.0.1) sends traffic to 10.0.1.1 (subnet 10.0.1.0/24) and to 10.0.2.1 (subnet 10.0.2.0/24); arrows marked allow and deny, with "any IP not belonging to 10.0.1.0/24" on the deny side]
  • 27. 27 L3 filtering – CIDR based, egress
      apiVersion: "cilium.io/v2"
      kind: CiliumNetworkPolicy
      description: "Allow backends to access 10.0.1.0/24"
      metadata:
        name: "frontend-backend"
      spec:
        endpointSelector:
          matchLabels:
            role: backend
        egress:
        - toCIDR:
          - 10.0.1.0/24
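The CIDR check behind such an egress rule can be sketched with Python's standard `ipaddress` module. This is a simplified model with hypothetical names; it scans a list of prefixes and says nothing about how the Cilium datapath stores them.

```python
import ipaddress

# Prefixes the backend pods may reach, taken from the slide.
ALLOWED_CIDRS = [ipaddress.ip_network("10.0.1.0/24")]


def egress_allowed(dst_ip, allowed=ALLOWED_CIDRS):
    """Return True if dst_ip falls inside at least one allowed prefix."""
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in allowed)
```

For the addresses in the diagram, `egress_allowed("10.0.1.1")` is true and `egress_allowed("10.0.2.1")` is false.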
  • 28. 28 L4 filtering [Diagram: pod labeled role=backend (10.0.0.1); TCP/80 is allowed, any other port is denied]
  • 29. 29 L4 filtering
      apiVersion: "cilium.io/v2"
      kind: CiliumNetworkPolicy
      description: "Allow to access backends only on TCP/80"
      metadata:
        name: "frontend-backend"
      spec:
        endpointSelector:
          matchLabels:
            role: backend
        ingress:
        - toPorts:
          - ports:
            - port: "80"
              protocol: TCP
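The L4 decision reduces to a membership test on (protocol, port) pairs. The sketch below is a simplified model with hypothetical names, not Cilium's implementation.

```python
# Hypothetical in-memory form of the toPorts rule: a set of (protocol, port) pairs.
ALLOWED_PORTS = {("TCP", 80)}


def l4_ingress_allowed(protocol, port, allowed=ALLOWED_PORTS):
    """Return True if the (protocol, port) pair is explicitly allowed."""
    return (protocol.upper(), int(port)) in allowed
```

Anything not listed, for example TCP/443 or UDP/80, is rejected once the rule selects the endpoint.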
  • 30. 30 L7 filtering – API Aware Security [Diagram: a pod (10.0.0.5) sends GET /articles/{id} and GET /private to a pod labeled role=api (10.0.0.1)]
  • 31. 31 L7 filtering – API Aware Security
      apiVersion: "cilium.io/v2"
      kind: CiliumNetworkPolicy
      description: "L7 policy to restrict access to specific HTTP endpoints"
      metadata:
        name: "frontend-backend"
      spec:
        endpointSelector:
          matchLabels:
            role: backend
        ingress:
        - toPorts:
          - ports:
            - port: "80"
              protocol: TCP
            rules:
              http:
              - method: "GET"
                path: "/article/$"
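An HTTP-aware rule of this kind boils down to matching the request method and path against patterns. The sketch below is a simplified model with hypothetical names. Two assumptions: the path pattern `/articles/[0-9]+` is chosen here to mirror the GET /articles/{id} request in the diagram and is not the value from the slide, and both patterns are matched against the whole string.

```python
import re

# Assumption: illustrative pattern, not the path value shown on the slide.
HTTP_RULES = [{"method": "GET", "path": r"/articles/[0-9]+"}]


def http_allowed(method, path, rules=HTTP_RULES):
    """Return True if any rule matches both the method and the full path."""
    return any(
        re.fullmatch(rule["method"], method) and re.fullmatch(rule["path"], path)
        for rule in rules
    )
```

Under these assumptions, `http_allowed("GET", "/articles/42")` passes, while `http_allowed("GET", "/private")` and `http_allowed("POST", "/articles/42")` do not.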
  • 32. 32 Standalone proxy, L7 filtering [Diagram: Node A (Pod A + BPF) and Node B (Pod B + BPF) connected over VXLAN. Each node runs Envoy, labeled "Generating BPF programs for L7 filtering through libcilium.so", next to a component labeled "Generating BPF programs for L3/L4 filtering"]
  • 34. 34 Cluster Mesh [Diagram: Cluster A and Cluster B, each with nodes (+ BPF) running pods (container, eth0), and an external etcd between the clusters]
  • 35. 35 [Diagram: services communicating through sockets; each path passes through TCP/IP, IPtables and Ethernet layers, over loopback or over eth0 and the network]
  • 36. 36 [Diagram: services communicating through sockets with Cilium CNI; remaining labels: TCP/IP, Ethernet, eth0, Network]
  • 37. 37 [Diagram: Service A, Service B, Service C]
  • 39. 39 [Diagram: Service A, Service B, an external GitHub service, an external cloud network]
  • 40. 40 Kubernetes Services
    BPF, Cilium
    ● Hash table (key → value).
    ● Search O(1), Insert O(1), Delete O(1)
    Iptables, kube-proxy
    ● Linked list (Rule 1, Rule 2, ..., Rule n).
    ● All rules in the chain have to be replaced as a whole.
    ● Search O(n), Insert O(1), Delete O(n)
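The complexity difference on this slide can be illustrated with a toy model: a sequential rule chain versus a hash table keyed by service. The names below are hypothetical, and a Python list and dict stand in for the kernel data structures.

```python
def build_chain(n):
    """Model an iptables-style chain: an ordered list of (match, target) rules."""
    return [(f"svc-{i}", f"backend-{i}") for i in range(n)]


def lookup_chain(chain, key):
    """Walk the chain in order; return (target, number of rules examined)."""
    for examined, (match, target) in enumerate(chain, start=1):
        if match == key:
            return target, examined
    return None, len(chain)


def build_table(chain):
    """Model a BPF-style hash map: service key -> backend value."""
    return dict(chain)


def lookup_table(table, key):
    """Single hashed lookup, independent of the number of services."""
    return table.get(key)
```

With 1000 services, finding the last one in the chain examines all 1000 rules, while the table lookup cost does not grow with the number of entries.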
  • 42. 42 CNI chaining [Diagram, two layers: (1) policy enforcement, load balancing, multi-cluster connectivity; (2) IP allocation, configuring the network interface, encapsulation/routing inside the cluster]
  • 50. 50 Why is Cilium awesome?
    ● It makes the disadvantages of iptables disappear and always gets the best from the Linux kernel.
    ● Cluster Mesh / multi-cluster.
    ● Makes Istio faster.
    ● Offers L7 API-aware filtering as a Kubernetes resource.
    ● Integrates with other popular CNI plugins: Calico, Flannel, Weave, Lyft, AWS CNI.