Kubernetes Robotics Edge Cluster System
- 2. Tomoya Fujita, R&D Center, Sony Group Corporation
Feng Gao, Sony China Limited
- 3. Agenda
• Introduction
• Sony’s Purpose
• Background
• ROS
• Problems
• Goal
• Advantages
• Architecture
• Distributed System
• Security Enclaves
• Device-Plugin
• Cluster Reconfiguration
• Plan
• Questions
- 4. Self-Introduction
• Tomoya Fujita Tomoya.Fujita@sony.com
• Sony R&D Center, Tokyo Lab
• Software Architect & Developer
• ROS TSC (Technical Steering Committee)
• fujitatomoya@github, tomoyafujita@linkedin
• Related work
• ROS-I 2020 Asia Pacific Workshop
• ROSCon2019 Panel Talk
• Feng Gao Feng.Fg.Gao@sony.com
• Sony China Software Center
• Software Developer
• gaofeng1973@github, 15618992861@wechat
• Related work
• Kubernetes
• Multimedia
We are available on Slack #wg-iot-edge!
- 6. General Background
• Edge Devices Are Maturing
• Distributed System
• Connected System
• Circulatory Functioning System
- 7. Robotics Background
• Robotics Orchestration
• Highly task-oriented, more collaborative
• Multiple use cases (factory, logistics, entertainment, rescue, autonomous car, drone)
• Application Lifecycle
• Fleet Management
• Development
• Frequent Upgrades/Downgrades (no downtime preferred)
• Easy, Quick and Efficient for Application Developers
• Maintenance
• Nobody wants to get paged in the night
• Hardware Abstraction
• Application Portability / Modularity
• Platform Agnostic
- 13. Advantages
• Kubernetes is the “mainline”
• Deployment with Policies
• Maintenance
• Rolling Upgrade/Downgrade (no downtime)
• Role-Based Access Control (RBAC)
• Scalability
• Orchestration
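As a minimal sketch of the "no downtime" point above: a Kubernetes Deployment can be asked to replace Pods one at a time and never drop below the desired replica count. The application name `robot-app` and the image `example.com/robot-app:1.1` are hypothetical placeholders, not names from this talk. The manifest is built as a Python dict and printed as JSON, which `kubectl apply -f -` accepts.

```python
import json


def rolling_deployment(name, image, replicas=2):
    """Build an apps/v1 Deployment that replaces Pods one at a time."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Never remove an old Pod before its replacement is ready;
                # allow at most one extra Pod during the rollout.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical application name and image, for illustration only.
    manifest = rolling_deployment("robot-app", "example.com/robot-app:1.1")
    print(json.dumps(manifest, indent=2))
```

Changing the image tag and re-applying the manifest triggers a rolling upgrade; re-applying the previous tag rolls it back the same way.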
- 14. Common Architecture
[Diagram: an edge cluster and a cloud cluster joined by Federation]
• Edge Cluster: Edge Cluster Primary and Edge Nodes on the Edge Cluster Network
• Edge Node: Application Containers (Application, System Ext APIs), System Services, Node Controller, Kernel & Drivers, Output
• Cloud Cluster: Cloud Cluster Primary and Cloud Nodes on the Cloud Cluster Network
• Cloud Node: Application Containers (Application), Node Controller, Kernel & Drivers, GPU Access
• Platforms: x86, arm64
• Other labels: Accelerator, Capabilities, Certificate
- 15. Common Architecture
[Diagram: the same edge/cloud architecture as the previous slide, annotated with four topics]
• Distributed System: Kubernetes with ROS
• Hardware Abstraction via Device-Plugin
• ROS Security Enclaves: Certificate & Key
• Dynamic Cluster Reconfiguration
- 16. Distributed System
[Diagram: three machines on one LAN, connected by CNI – Weave (Layer 2 Emulation)]
• Kubernetes Primary (x86): Kubernetes API Server, Kubelet
• Kubernetes Worker (arm64) ×2: Kubelet
• Application Pods: Dashboard Visualizer, Selector, Face Detection (×2), Eye Detection (×2)
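A mixed x86/arm64 cluster like the one above needs each Pod to land on a node whose architecture matches its image. A minimal sketch, assuming the standard `kubernetes.io/arch` node label is used for placement; the Pod names mirror the diagram labels, and the image names are hypothetical placeholders.

```python
import json


def pod_on_arch(name, image, arch):
    """Build a v1 Pod that is only scheduled onto nodes of one architecture."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # Well-known node label set by the kubelet: "amd64", "arm64", ...
            "nodeSelector": {"kubernetes.io/arch": arch},
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical images; the talk does not name them.
    pods = [
        pod_on_arch("dashboard-visualizer", "example.com/visualizer:latest", "amd64"),
        pod_on_arch("face-detection", "example.com/face-detection:latest", "arm64"),
        pod_on_arch("eye-detection", "example.com/eye-detection:latest", "arm64"),
    ]
    for pod in pods:
        print(json.dumps(pod, indent=2))
```

Which Pod actually runs on which machine in the demo is not recoverable from the slide, so the placement above is an assumption.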
- 17. Security Enclaves
[Diagram: one Primary and two Workers on a LAN]
• Primary: API-Server, kubelet
• Workers: kubelet, App Containers (Apps)
• Actors: Administrator, User
• Arrows: Registration, Access Control, Load, Bind Security Enclaves
• ConfigMap & Secrets for each ROS 2 application
• Security Enclaves: the certificate to join the entire distributed system, and the access permissions for each topic and service
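One way the "Secrets for each ROS 2 application" idea could look in practice is sketched below: the enclave files are stored in a Kubernetes Secret, mounted into the container, and the ROS 2 security environment variables point at the mount. The Secret naming scheme, the `/keystore` mount point, the package and node names, and the image are assumptions for illustration, not details from the talk. The environment variables and the `--enclave` argument are the ones ROS 2 (Foxy and later) uses for security.

```python
import json

KEYSTORE = "/keystore"  # assumed mount point inside the container


def secured_ros2_pod(app, image, ros2_command):
    """Build a Pod that binds one ROS 2 security enclave from a Secret."""
    pod_name = app.replace("_", "-")  # Kubernetes names cannot contain "_"
    enclave = "/" + app
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": pod_name,
                "image": image,
                "command": ros2_command + ["--ros-args", "--enclave", enclave],
                "env": [
                    {"name": "ROS_SECURITY_KEYSTORE", "value": KEYSTORE},
                    {"name": "ROS_SECURITY_ENABLE", "value": "true"},
                    {"name": "ROS_SECURITY_STRATEGY", "value": "Enforce"},
                ],
                "volumeMounts": [{
                    "name": "enclave",
                    # Assumed keystore layout: <keystore>/enclaves/<enclave>
                    "mountPath": KEYSTORE + "/enclaves" + enclave,
                    "readOnly": True,
                }],
            }],
            # Hypothetical Secret name; it would hold the enclave's
            # certificate, key, and signed permission files.
            "volumes": [{"name": "enclave",
                         "secret": {"secretName": pod_name + "-enclave"}}],
        },
    }


if __name__ == "__main__":
    pod = secured_ros2_pod(
        "face_detection",
        "example.com/face-detection:latest",           # hypothetical image
        ["ros2", "run", "face_pkg", "face_node"],      # hypothetical package/node
    )
    print(json.dumps(pod, indent=2))
```

Keeping one Secret per application means a compromised Pod exposes only its own enclave, not the keys of the other nodes in the system.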
- 18. Device-Plugin
• Exposes vendor hardware to Kubernetes as a custom (extended) resource
• Dynamically plugs in vendor hardware and devices
• Application Pods stay agnostic of the underlying hardware
[Diagram: device plugin flow between Primary and Worker]
• K8s system components: Scheduler, API server, kubelet
• Vendor components: Device Plugin (DaemonSets), GPUs
• Extended Resource: vendor.com/gpus
• kubelet and Device Plugin: List/Watch/Allocate
• Flow:
1. Advertise
2. Registration
3. Pod Create
4. Request
5. Allocate
6. Mount
7. Access
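From the application side, the flow above starts with a Pod that simply asks for the extended resource by name. A minimal sketch, using the `vendor.com/gpus` resource from the slide; the Pod name and image are hypothetical.

```python
import json


def pod_with_device(name, image, resource, count=1):
    """Build a Pod whose container requests `count` units of an extended resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Extended resources are requested in whole units via limits.
                "resources": {"limits": {resource: str(count)}},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical Pod name and image; the resource name comes from the slide.
    pod = pod_with_device("gpu-app", "example.com/gpu-app:latest", "vendor.com/gpus")
    print(json.dumps(pod, indent=2))
```

The Pod spec names no device path or driver; the scheduler picks a node that advertises the resource, and the device plugin decides what gets mounted into the container.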
- 19. Device-Plugin
• FPGA, Hardware Acceleration, DSP
• Virtual devices, such as an API to access the host system
• Platform-dependent and platform-specific devices
[Diagram: the same device plugin flow, with a platform-specific plugin]
• K8s system components: Scheduler, API server, kubelet
• Vendor components: Platform Device Plugin, FPGA, DSP, Device API to Host
• Extended Resources: sony.com/fpga, sony.com/dsp, sony.com/apiX
• kubelet and Platform Device Plugin: List/Watch/Allocate
• Flow:
1. Advertise
2. Registration
3. Pod Create
4. Request
5. Allocate
6. Mount
7. Access
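A platform device plugin like the one above is usually rolled out to every worker as a DaemonSet. The sketch below assumes a hypothetical plugin image and name. The one fixed detail is the host directory `/var/lib/kubelet/device-plugins`, where the kubelet exposes its registration socket and where plugins place their own sockets.

```python
import json

DEVICE_PLUGIN_DIR = "/var/lib/kubelet/device-plugins"


def device_plugin_daemonset(name, image, namespace="kube-system"):
    """Build a DaemonSet that runs one device plugin Pod per node."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # The plugin needs the kubelet's socket directory
                        # to register itself and to serve Allocate calls.
                        "volumeMounts": [{"name": "device-plugins",
                                          "mountPath": DEVICE_PLUGIN_DIR}],
                    }],
                    "volumes": [{"name": "device-plugins",
                                 "hostPath": {"path": DEVICE_PLUGIN_DIR}}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical plugin name and image, for illustration only.
    ds = device_plugin_daemonset("platform-device-plugin",
                                 "example.com/platform-device-plugin:latest")
    print(json.dumps(ds, indent=2))
```

A real plugin often needs extra privileges or device mounts to reach the FPGA or DSP; those are omitted here because they depend on the platform.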
- 20. Device-Plugin Open Issue
• No device plugin callback for releasing devices (no counterpart to Allocate)
• Issue
• https://github.com/kubernetes/kubernetes/issues/86539
• KEP
• https://github.com/kubernetes/enhancements/issues/1948
• https://github.com/kubernetes/enhancements/pull/1949
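The open issue can be illustrated with a toy model. This is not the real kubelet or device plugin gRPC API; `ToyKubelet` and `ToyDevicePlugin` are hypothetical classes that only mimic the bookkeeping. The kubelet tracks which Pod holds which device and calls the plugin on allocation, but nothing is called when the Pod goes away.

```python
class ToyDevicePlugin:
    """Plugin-side view: it only ever hears about allocations."""

    def __init__(self, device_ids):
        self.devices = list(device_ids)
        self.busy = set()  # devices the plugin believes are in use

    def allocate(self, device_ids):
        self.busy.update(device_ids)


class ToyKubelet:
    """Kubelet-side view: it knows when Pods start and stop."""

    def __init__(self, plugin):
        self.plugin = plugin
        self.assigned = {}  # pod name -> list of device ids

    def start_pod(self, pod, count):
        used = {d for ids in self.assigned.values() for d in ids}
        free = [d for d in self.plugin.devices if d not in used]
        if len(free) < count:
            raise RuntimeError("not enough free devices")
        chosen = free[:count]
        self.plugin.allocate(chosen)  # the only call the plugin receives
        self.assigned[pod] = chosen
        return chosen

    def stop_pod(self, pod):
        # The kubelet frees its own record, but there is no matching
        # release callback, so the plugin is never told.
        self.assigned.pop(pod)


if __name__ == "__main__":
    plugin = ToyDevicePlugin(["fpga0"])
    kubelet = ToyKubelet(plugin)
    kubelet.start_pod("pod-a", 1)
    kubelet.stop_pod("pod-a")
    print("kubelet view:", kubelet.assigned)  # no devices assigned
    print("plugin view:", plugin.busy)        # still thinks fpga0 is busy
```

In this model the plugin has no point at which it could reset or power down `fpga0` after `pod-a` exits, which is the gap the issue and KEP linked above discuss.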
- 21. Cluster Reconfiguration
• Robot moves
• Wireless Network
• Unstable Network
• Accidental Shutdown
• Battery
• Breaks Down Easily
• Misoperation
• Cost Effective
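One existing Kubernetes knob relevant to robots that drop off the network is how long a Pod tolerates its node being unreachable before it is evicted. The sketch below is an illustration of that knob, not the reconfiguration mechanism described in this talk. By default Kubernetes adds a 300-second toleration; the 30-second value used here is an arbitrary assumption.

```python
import json


def fast_failover_tolerations(seconds):
    """Build tolerations that evict a Pod `seconds` after its node is lost."""
    keys = ("node.kubernetes.io/not-ready", "node.kubernetes.io/unreachable")
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in keys
    ]


if __name__ == "__main__":
    # Fragment to place under a Pod template's `spec`.
    print(json.dumps({"tolerations": fast_failover_tolerations(30)}, indent=2))
```

A shorter value reschedules work sooner when a robot really is gone, at the cost of needless evictions during brief wireless dropouts.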
- 23. Plan
• Redeployment based on Sensing Data
• Edge Distributed System Sidecar
• Micro-Controller Support (e.g., KubeEdge)
• More cost-effective kubelet
• Light-weight container runtime
- 24. SONY is a registered trademark of Sony Corporation.
Names of Sony products and services are the registered trademarks and/or trademarks of Sony Corporation or its Group companies.
Other company names and product names are registered trademarks and/or trademarks of the respective companies.