Copyright 2021 Sony Corporation
Tomoya Fujita, R&D Center, Sony Group Corporation
Feng Gao, Sony China Limited
Kubernetes Robotics Edge Cluster System
Agenda
• Introduction
• Sony’s Purpose
• Background
• ROS
• Problems
• Goal
• Advantages
• Architecture
• Distributed System
• Security Enclaves
• Device-Plugin
• Cluster Reconfiguration
• Plan
• Questions
Self-Introduction
• Tomoya Fujita Tomoya.Fujita@sony.com
• Sony R&D Center, Tokyo Lab
• Software Architect & Developer
• ROS TSC (Technical Steering Committee)
• fujitatomoya@github, tomoyafujita@linkedin
• Related work
• ROS-I 2020 Asia Pacific Workshop
• ROSCon2019 Panel Talk
• Feng Gao Feng.Fg.Gao@sony.com
• Sony China Software Center
• Software Developer
• gaofeng1973@github, 15618992861@wechat
• Related work
• Kubernetes
• Multimedia
We are available on Slack #wg-iot-edge!
Sony Purpose
General Background
• Edge Devices Are Maturing
• Distributed System
• Connected System
• Circulatory Functioning System
Robotics Background
• Robotics Orchestration
• Highly task-oriented, more collaborative
• Multiple use cases (factory, logistics, entertainment, rescue, autonomous car, drone)
• Application Lifecycle
• Fleet Management
• Development
• Frequent Upgrades/Downgrades (no downtime preferred)
• Easy, Quick and Efficient for Application Developers
• Maintenance
• Nobody wants to get paged in the night
• Hardware Abstraction
• Application Portability / Modularity
• Platform Agnostic
ROS
Robotics SDK
ROS
Simulation as the best possible substitute for physical robots
Problems
[Figure: separate stacks for Robots (Robot Apps), Cloud (Cloud Apps), and IoT Devices (Sensor Apps)]
What’s the pain?
• Different architectures for cloud and edge devices.
• Setting up the environment and running an application takes time and effort.
• IoT devices are really static implementations.
Complicated & Single System & Specified
Goal
[Figure: the same Apps running across Cloud, Robots, and IoT Devices]
What we want is…
• Common base architecture for everyone and everywhere.
• Applications can be deployed anywhere.
• Ecosystem for applications.
Simple/Common & Distributed System & Platform Agnostic
Application Friendly
[Figure: stacks of Application / System / Device/Hardware, labeled "System Agnostic" and "Application Agnostic", connected through a Cloud & Edge Common Platform Broker]
Advantage
• Kubernetes is the “mainline”
• Deployment with Policies
• Maintenance
• Rolling upgrade/downgrade (no downtime)
• Role-Based Access Control (RBAC)
• Scalability
• Orchestration
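As a rough illustration of "Deployment with Policies" and upgrades without downtime, the sketch below builds a Deployment manifest with a rolling-update strategy and prints it as JSON, which kubectl accepts alongside YAML. The name `robot-app`, the image, and the replica count are hypothetical placeholders, not taken from the talk.

```python
import json


def rolling_deployment(name: str, image: str, replicas: int = 2) -> dict:
    """Build an apps/v1 Deployment that replaces Pods one at a time."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Never drop below the desired replica count during a rollout:
                # start one extra Pod first, then retire an old one.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, for illustration only.
    manifest = rolling_deployment("robot-app", "example.com/robot-app:1.1")
    print(json.dumps(manifest, indent=2))
```

The output can be piped to `kubectl apply -f -`. Changing the image tag and applying again would start a rolling upgrade, and `kubectl rollout undo` would roll it back.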
Common Architecture
[Figure: an Edge Cluster and a Cloud Cluster joined by Federation. The Edge Cluster has an Edge Cluster Primary and Edge Nodes on an Edge Cluster Network; each Edge Node stacks Kernel & Drivers, a Node Controller, System Services, System Ext APIs, and Application Containers, with Output. The Cloud Cluster has a Cloud Cluster Primary and Cloud Nodes on a Cloud Cluster Network; each Cloud Node stacks Kernel & Drivers, a Node Controller, and Application Containers, with GPU Access to Accelerators. Nodes are x86 or arm64 and carry Capabilities and Certificates.]
Common Architecture
[Figure: the Common Architecture diagram again (Edge Cluster federated with Cloud Cluster, x86 and arm64 nodes, Capabilities, Certificates), annotated with four callouts:]
• Distributed System: Kubernetes with ROS
• Hardware Abstraction via Device-Plugin
• ROS Security Enclaves: Certificate & Key
• Dynamic Cluster Reconfiguration
Distributed System
[Figure: one LAN with a Kubernetes Primary (x86) running the Kubernetes API Server and two Kubernetes Workers (arm64). Every node runs a Kubelet and hosts Application Pods; the nodes are connected through CNI Weave (Layer 2 emulation). Application Pods shown: Dashboard Visualizer, Selector, Face Detection, Eye Detection.]
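A mixed x86/arm64 cluster like this one usually needs Pods pinned to nodes whose architecture matches the container image. The sketch below shows one way to do that with a `nodeSelector` on the well-known `kubernetes.io/arch` node label. The Pod names echo the slide, but the images and the choice of which Pod runs on which architecture are assumptions made for illustration.

```python
import json


def pod_on_arch(name: str, image: str, arch: str) -> dict:
    """Build a Pod manifest that is only schedulable on nodes of one CPU architecture."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # kubernetes.io/arch is set by the kubelet on every node;
            # x86-64 nodes report "amd64", 64-bit Arm nodes report "arm64".
            "nodeSelector": {"kubernetes.io/arch": arch},
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical images and placement, for illustration only.
    pods = [
        pod_on_arch("dashboard-visualizer", "example.com/dashboard:latest", "amd64"),
        pod_on_arch("face-detection", "example.com/face-detection:latest", "arm64"),
    ]
    for pod in pods:
        print(json.dumps(pod, indent=2))
```

Multi-architecture images would make the selector unnecessary, at the cost of building and publishing every image for both platforms.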
Security Enclaves
[Figure: one LAN with a Primary and two Workers, each running a kubelet and App Containers. The Administrator registers ConfigMaps & Secrets for each ROS2 application with the API-Server (Registration, Access Control); the kubelets load them, and each App Container binds its Security Enclave. A User is shown alongside the cluster.]
• Certificate to join the entire distributed system
• Access permissions for each topic and service
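One way to picture "ConfigMap & Secrets for each ROS2 application" is a per-application Secret holding the enclave files, mounted into the Pod at the path where ROS 2 expects the keystore. The sketch below is an assumption about how this could be wired up, not the authors' manifests: the Secret naming scheme, the `/keystore` mount point, and the image are hypothetical. The `ROS_SECURITY_*` environment variables and the `enclaves/` keystore subdirectory follow the ROS 2 security (SROS2) conventions.

```python
import base64
import json


def enclave_secret(app: str, files: dict) -> dict:
    """Wrap an application's enclave files (certificate, key, permissions) in a Secret."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": f"{app}-enclave"},
        "type": "Opaque",
        # Secret.data values must be base64-encoded strings.
        "data": {
            fname: base64.b64encode(content).decode("ascii")
            for fname, content in files.items()
        },
    }


def secured_pod(app: str, image: str, keystore: str = "/keystore") -> dict:
    """Build a Pod that mounts its enclave Secret and switches ROS 2 security on."""
    enclave_dir = f"{keystore}/enclaves/{app}"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": app},
        "spec": {
            "containers": [{
                "name": app,
                "image": image,
                "env": [
                    {"name": "ROS_SECURITY_KEYSTORE", "value": keystore},
                    {"name": "ROS_SECURITY_ENABLE", "value": "true"},
                    # "Enforce" refuses to start a node without valid security files.
                    {"name": "ROS_SECURITY_STRATEGY", "value": "Enforce"},
                ],
                "volumeMounts": [
                    {"name": "enclave", "mountPath": enclave_dir, "readOnly": True}
                ],
            }],
            "volumes": [{"name": "enclave", "secret": {"secretName": f"{app}-enclave"}}],
        },
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for real certificate and key material.
    secret = enclave_secret("talker", {"cert.pem": b"...", "key.pem": b"..."})
    pod = secured_pod("talker", "example.com/talker:latest")
    print(json.dumps([secret, pod], indent=2))
```

Inside the container the node would then be started with `--ros-args --enclave /talker` so that it picks up the mounted enclave. Because each application gets its own Secret, Kubernetes RBAC can restrict who is allowed to read which enclave.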
Device-Plugin
• A Kubernetes mechanism for exposing custom (extended) resources
• Dynamically plugs in vendor hardware and devices
• Hardware details stay hidden from Application Pods
[Figure: K8s system components (Scheduler, API server, kubelet) and vendor components (Device Plugin deployed as DaemonSets, GPUs) on the Primary and Worker. The Device Plugin serves List/Watch/Allocate to the kubelet and exposes the extended resource vendor.com/gpus to Application Pods.]
Steps shown in the figure:
1. Advertise
2. Registration
3. Pod Create
4. Request
5. Allocate
6. Mount
7. Access
Device-Plugin
• FPGA, Hardware Acceleration, DSP
• Virtual devices, such as an API to access the host system
• Platform-dependent and platform-specific devices
[Figure: K8s system components (Scheduler, API server, kubelet) and vendor components (Platform Device Plugin, FPGA, DSP, Device API to Host) on the Primary and Worker. The Platform Device Plugin serves List/Watch/Allocate to the kubelet and exposes the extended resources sony.com/fpga, sony.com/dsp, and sony.com/apiX to Application Pods.]
Steps shown in the figure:
1. Advertise
2. Registration
3. Pod Create
4. Request
5. Allocate
6. Mount
7. Access
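From the application side, the "Request" step comes down to naming the extended resource in the container's resource limits. The sketch below builds such a Pod manifest. The resource name `sony.com/fpga` comes from the slide, while the Pod name, the image, and the helper function are hypothetical.

```python
import json


def pod_with_devices(name: str, image: str, devices: dict) -> dict:
    """Build a Pod whose container requests extended resources by name."""
    for resource, count in devices.items():
        # Extended resources use a vendor-domain prefix and whole-number quantities.
        if "/" not in resource or not isinstance(count, int) or count < 1:
            raise ValueError(f"invalid extended resource request: {resource}={count}")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Setting only limits is enough: the request defaults to the same value.
                "resources": {"limits": dict(devices)},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical Pod name and image, for illustration only.
    pod = pod_with_devices("vision", "example.com/vision:latest", {"sony.com/fpga": 1})
    print(json.dumps(pod, indent=2))
```

The Pod stays Pending until the scheduler finds a node whose device plugin has advertised a free `sony.com/fpga`, so the application needs no knowledge of which node carries the hardware.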
Device-Plugin Open Issue
• No Device Plugin callback for releasing devices (no counterpart to Allocate)
• Issue
• https://github.com/kubernetes/kubernetes/issues/86539
• KEP
• https://github.com/kubernetes/enhancements/issues/1948
• https://github.com/kubernetes/enhancements/pull/1949
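The gap can be pictured with a toy model in which the kubelet tracks which Pod holds which device, while the plugin only ever hears about allocations. The classes below are a deliberately simplified assumption for illustration. They are not the real device plugin gRPC API, and all names are hypothetical.

```python
class ToyDevicePlugin:
    """Vendor side: notified on allocation only, mirroring the open issue."""

    def __init__(self, device_ids):
        self.device_ids = list(device_ids)
        self.in_use = set()  # the plugin's own belief about busy devices

    def allocate(self, device_id):
        # Stand-in for the Allocate call; vendor-specific setup would go here.
        self.in_use.add(device_id)


class ToyKubelet:
    """Kubelet side: owns the bookkeeping of which Pod holds which device."""

    def __init__(self, plugin):
        self.plugin = plugin
        self.assigned = {}  # pod name -> device id

    def start_pod(self, pod):
        busy = set(self.assigned.values())
        free = [d for d in self.plugin.device_ids if d not in busy]
        if not free:
            raise RuntimeError("no free device")
        self.assigned[pod] = free[0]
        self.plugin.allocate(free[0])

    def stop_pod(self, pod):
        # The device becomes assignable again, but nothing is sent to the plugin.
        del self.assigned[pod]


if __name__ == "__main__":
    plugin = ToyDevicePlugin(["fpga0"])
    kubelet = ToyKubelet(plugin)
    kubelet.start_pod("pod-a")
    kubelet.stop_pod("pod-a")
    # The kubelet sees the device as free; the plugin still believes it is busy.
    print("kubelet view:", kubelet.assigned)
    print("plugin view:", plugin.in_use)
```

In this model the plugin cannot reset or power down `fpga0` after `pod-a` exits, because it never learns that the device was released. That is the kind of cleanup a release callback would make possible.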
Cluster Reconfiguration
• Robot moves
• Wireless Network
• Unstable Network
• Accidental Shutdown
• Battery
• Breaks Down Easily
• Mis-Operation
• Cost Effective
Cluster Reconfiguration
[Figure: a Current Primary, two Primary Candidate Nodes, and several Worker Nodes, with Worker Node Discovery]
• Kubernetes Aware
• Robustness
• Primary Election
• Election Consensus
• Service Discovery
• Node Discovery
• Namespace
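The slide does not say how the primary is elected. Purely as an illustration of "Primary Election" with "Election Consensus", the sketch below assumes a simple strict-majority vote: every node votes for the highest-priority Primary Candidate it can still reach, and a candidate wins only with more than half of the cluster behind it. The function name, node names, and voting rule are all assumptions.

```python
def elect_primary(candidates, reachable_from, cluster_size):
    """Return the candidate backed by a strict majority of nodes, or None.

    candidates: candidate node names in priority order.
    reachable_from: maps each voting node to the set of nodes it can reach.
    cluster_size: total number of voting nodes, reachable or not.
    """
    votes = {}
    for voter, visible in reachable_from.items():
        # Each node votes for the first candidate, in priority order, that it can reach.
        choice = next((c for c in candidates if c in visible), None)
        if choice is not None:
            votes[choice] = votes.get(choice, 0) + 1
    for candidate, count in votes.items():
        # A strict majority means at most one candidate can ever win.
        if count > cluster_size // 2:
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical partition: the old primary is gone and cand-1 has drifted
    # out of range together with worker w3.
    view = {
        "cand-1": {"cand-1", "w3"},
        "w3": {"cand-1", "w3"},
        "cand-2": {"cand-2", "w1", "w2"},
        "w1": {"cand-2", "w1", "w2"},
        "w2": {"cand-2", "w1", "w2"},
    }
    print(elect_primary(["cand-1", "cand-2"], view, cluster_size=5))
```

Requiring a strict majority means a minority partition, such as a robot that has moved out of wireless range, cannot elect its own primary and split the cluster.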
Plan
• Redeployment based on Sensing Data
• Edge Distributed System Sidecar
• Micro-Controller Support (e.g., KubeEdge)
• More cost-effective kubelet
• Light-weight container runtime
SONY is a registered trademark of Sony Corporation.
Names of Sony products and services are the registered trademarks and/or trademarks of Sony Corporation or its Group companies.
Other company names and product names are registered trademarks and/or trademarks of the respective companies.
