Edge Computing Architecture using GPUs and Kubernetes
- 2. VirtualTech Japan Inc.
日本仮想化技術株式会社
• Company name: VirtualTech Japan Inc. (called VTJ)
• Address: 1-8-1 Shibuya Shibuya-ku Tokyo
• Founded: Dec 2006
• President and CEO: Toru Miyahara
• Number of employees: 8 (Engineer: 7, Business Development: 1)
• Our service:
• NFV/OpenStack consulting for Japanese telecom companies
• NTT Docomo’s large-scale OpenStack services
• One of NTT West’s management systems for its fixed network service, built on OpenStack
• Planned: consulting on Edge + GPU Computing
• Corporate Web Site: http://virtualtech.jp
Corporate profile
- 3. Our expertise at OpenStack
We are experts in Open Infrastructure, OpenStack and NFV.
2014/11 OpenStack Summit Paris
We shared knowledge and tips on building and operating an OpenStack cloud on 100 physical servers (Neutron HA, VXLAN performance, etc.).
2012/10 OpenStack Summit San Diego
We announced a bare-metal provisioning framework that handles bare-metal machines like virtual machines. It was merged upstream in Grizzly.
2015/10 OpenStack Summit Tokyo
We (NTT West, Canonical and VTJ) presented “Requirements for Providing Telecom Services on OpenStack-based Infrastructure”.
- 4. Definition of words
1. OpenStack
2. Kubernetes
3. Kubernetes on OpenStack
4. OpenStack on Kubernetes
5. Edge Cloud
6. NFV Cloud (NFV: Network Function Virtualization)
- 5. Relation of OpenStack and Kubernetes
[Figure: stack diagrams for 3. Kubernetes on OpenStack and 4. OpenStack on Kubernetes, each drawn on top of Hardware, with the Under Cloud and Over Cloud layers labeled.]
[Figure: “Using Cloud/Container technology at Telco company”, showing 5. Edge Cloud (Kubernetes on Hardware) and 6. NFV Cloud (OpenStack on Hardware) between the Device, the Access Point and the Internet.]
- 6. Questions about “Edge” Computing
We have some questions about “Edge” Computing.
• Can you tell me about your “Edge”?
• What is “Edge” Computing?
• What are the key points of “Edge” Computing?
What’s “Edge” Computing?
- 7. Can you tell me about your “Edge” ?
• The definition of “Edge” differs from person to person.
1. Edge of network nodes
2. Edge of cloud / computing
3. Server side of an IoT application
etc.
• We want “Edge” Computing that can be used in various use cases.
- 8. What’s “Edge” Computing ?
• We want “Edge” Computing that can be used in various use cases.
• I attended OpenStack Summit Vancouver and watched several telco Edge Computing projects (AT&T, China Mobile and Verizon).
• China Mobile’s use cases for “Edge” Computing
From the presentation “Edge TIC – Future edge cloud for China mobile”
• Enterprise private network (similar to SD-WAN)
• CDN deployment
• Live sporting events
• Real-time data backhaul for unmanned aerial vehicles
• V2X service (V stands for Vehicle)
- 9. What’s “Edge” Computing ?
• We want “Edge” Computing that can be used in various use cases.
• AT&T and China Mobile are combining NFV, “Edge” and MANO, and are beginning to create the next-gen network service infrastructure.
[Figure: MANO sits above the NFV and Edge sites and manages them. China Mobile site tiers: Regional (4+), Province (100+), City (600+), County (3000+), AP (100K+).]
MANO: NFV Management and Orchestration
Software
• MANO: ONAP
• NFV: OPNFV (based on OpenStack)
• Edge: Akraino (based on Kubernetes on OpenStack)
The numbers above are China Mobile’s assumed values.
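To get a feel for the scale behind these tiers, the site counts can be summed per layer. The following is only a rough Python sketch. It assumes the split described in the speaker notes (Regional and Province run NFV; City, County and AP run Edge) and treats every “N+” figure as a lower bound. The `Tier` type and `min_sites_by_layer` function are illustrative names, not part of any China Mobile, ONAP or Akraino tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_sites: int  # lower bound taken from the "N+" figure on the slide
    layer: str      # "NFV" or "Edge", as described in the speaker notes


# Assumed values from the slide; the layer assignment follows the notes.
TIERS = [
    Tier("Regional", 4, "NFV"),
    Tier("Province", 100, "NFV"),
    Tier("City", 600, "Edge"),
    Tier("County", 3000, "Edge"),
    Tier("AP", 100_000, "Edge"),
]


def min_sites_by_layer(tiers):
    """Sum the lower-bound site counts for each layer."""
    totals = {}
    for tier in tiers:
        totals[tier.layer] = totals.get(tier.layer, 0) + tier.min_sites
    return totals


if __name__ == "__main__":
    for layer, count in min_sites_by_layer(TIERS).items():
        print(f"{layer}: at least {count} sites")
```

Even with these lower bounds, the Edge layer is about three orders of magnitude larger than the NFV layer, which is why remote management and zero-touch provisioning matter so much at the edge.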
- 10. What are the key points of “Edge” Computing?
• For “Edge” Computing to succeed, it is important to think about both the “technical side” and the “business side”.
• Technical side
• We have to solve the technical problems related to “Edge” Computing.
• Containers, Kubernetes and the Kubernetes ecosystem (including Kubernetes on OpenStack)
• Running Kubernetes in production, logging and troubleshooting
• Business side
• We have to think about the business model for “Edge” Computing.
• We are ready to help you think through services and solutions that leverage “Edge” Computing + GPUs.
• Cost reduction, operation optimization
• Creating new business and new revenue
(e.g. selling edge nodes for advertising items)
- 11. “Edge” Computing + GPUs has a big impact!
• Operation side
Ex. Auto-healing for cloud infrastructure
• Service side
Ex. Live sporting event
[Figure, operation side: NFV and Edge under MANO, with a GPU-backed Big data & Log Streaming system and a Policy Engine. 1. An error occurs in network services. 2. The error is detected from logs. 3. Action (change routing). Note: the OpenStack Auto-healing SIG is alpha.]
[Figure, service side: cameras in a football stadium, Edge nodes with GPUs, and users. 0. Edge nodes are booked. 1. Streaming from cameras. 2. Processing the streaming data on GPUs. 3. Streaming GPU-powered live videos to users.]
- 12. Summary
• The definition of “Edge” differs from person to person.
• We want “Edge” Computing that can be used in various use cases.
• AT&T and China Mobile are combining NFV, “Edge” and MANO, and are beginning to create the next-gen network service infrastructure.
• For “Edge” Computing to succeed, it is important to think about both the “technical side” and the “business side”.
• “Edge” Computing + GPUs has a big impact!
• We are ready to help you think through services and solutions that leverage “Edge” Computing + GPUs.
- 14. Summary of Our Edge Computing POC
• This is a use case from a Japanese telecom company.
• We have started a 5G/Edge Computing POC project using Kubernetes and NVIDIA GPUs.
• This Edge Computing system runs CPUs and GPUs at the edge nodes.
• Kubernetes manages the Edge Computing infrastructure and the GPUs.
• We’re using Canonical Juju/MAAS (provisioning tools) for zero-touch provisioning.
Edge Computing POC
- 16. About Multi-Access Edge Computing (MEC)
External Factor
• The 5G network is ready
• Edge throughput: 100 Mbps
• Latency: 1 ms
• Peak data rate: 20 Gbps
• All telecom companies need to promote the 5G network
Internal Factor
• Cost reduction and productivity improvement
• Production use of next-generation network virtualization (NFV / SD-WAN) is being planned
- 17. Understanding MEC
From AT&T’s MEC POC
5G/MEC use cases
• MBB: Mobile Broadband
• mMTC: massive Machine Type Communications
• Dense Inf Society
• Connected vehicles
• VR office/factory/tactile
[Figure: chart with the axes Throughput, Latency, Reliability, Availability, Energy Efficiency and User/Device density.]
Quoted from “Implications of 5G RAN and IoT on OpenStack based edge computing” [presented by AT&T and Ericsson at OpenStack Summit]
https://www.openstack.org/videos/sydney-2017/implications-of-5g-ran-and-iot-on-openstack-based-edge-computing
- 18. Understanding MEC (cont.)
AT&T’s MEC Architecture
[Figure: 5G Application Ecosystem (IoT, Connected Car, MBB) on top of a Disaggregated RAN and a Disaggregated Core. From the radio side to the core side: macro radio and small cell antennas, 5G base stations, Edge Cloud, Centralized Cloud, Internet. Components shown: RU, DU, CU-CP, CU-UP, UPF, CCF. NFV MANO (Management & Orchestration) sits above the clouds.]
CU: Centralized Unit
CP: Control Plane
UP: User Plane
UPF: User Plane Function
CCF: Core Control Function
RU: Radio Unit
DU: Digital Unit
Quoted from “Implications of 5G RAN and IoT on OpenStack based edge computing”
- 19. Understanding MEC (cont.)
Feedback from AT&T’s MEC project
• Building a Docker / Kubernetes controller
• Zero-touch provisioning is key
• Planning for thousands of locations
• Supporting emerging technology at edge nodes (GPU, SmartNIC, FPGA, etc.)
• Planning collaboration with SDN/NFV and orchestration
- 21. Proof of Concept(POC) #1
The scope of POC#1 is as follows.
• Building the edge controller and container nodes using Kubernetes
• Zero-touch provisioning
• Supporting GPUs at container nodes
The scope of POC#2 is being planned.
- 22. Edge Computing + GPUs Architecture
[Figure: Scope of Edge Cloud, below NFV MANO.
Edge Controllers: Physical Provisioning, Application Provisioning, SDN / SDS, Monitoring / Alerting, Orchestrator.
Container / Compute Nodes, by type: GPU, high-speed networking, general purpose, low energy, high-speed storage.
Hardware shown: GPU servers, servers with SmartNICs, general servers, storage servers, object storage servers.]
- 23. Scope of Edge Computing + GPUs POC#1
[Figure: the same architecture diagram as slide 22 (NFV MANO, Edge Controllers, container nodes, Scope of Edge Cloud), with the scope of POC#1 marked.]
- 24. Components for Edge Computing
Components
• Edge Cloud
• Edge Controllers
• Physical Provisioning: Ubuntu MAAS
• Application Provisioning: Ubuntu Juju
• Orchestrator: Kubernetes
• SDN (Software Defined Network): Flannel (I believe Juniper Contrail needs it)
• Monitoring/Alerting: Prometheus, Grafana
• Container nodes
• GPU Server
• General Purpose Server: Intel and ARM servers
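As an illustration of what GPU support on the container nodes looks like from the workload side, here is a minimal Python sketch that builds a Pod manifest requesting one GPU. It assumes the cluster exposes GPUs as the extended resource `nvidia.com/gpu` through the NVIDIA device plugin; the slides only say that nvidia docker is used, so that is an assumption. The pod name and image are hypothetical placeholders.

```python
import json


def gpu_pod_manifest(name, image, gpus=1):
    """Build a Pod manifest that asks the scheduler for `gpus` NVIDIA GPUs.

    Assumes GPUs are advertised as the extended resource "nvidia.com/gpu".
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested under "limits".
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, for illustration only.
    manifest = gpu_pod_manifest("gpu-smoke-test", "example.invalid/cuda-app:latest")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -`. The scheduler would then place the Pod only on a worker node that still has a free GPU.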
- 25. Questions: VM vs Container
• Existing apps running on VMs will remain on VMs.
(You can migrate VMs to containers, but the cost is not justified.)
• New apps such as IoT, Edge Computing and AI will move forward with containers.
• NFV (service infrastructure such as 5G and fixed services) currently runs on VMs; the next generation will run on containers (as AT&T plans).
• Large (servers > 100): prepare “Kubernetes on OpenStack” and let the user choose VMs or containers.
• Medium (20 < servers < 100): the user chooses “Kubernetes” or “OpenStack”.
• Small (servers < 20): the user chooses “Kubernetes”.
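The sizing rule of thumb above can be written down as a small function. This is only a sketch of the slide’s guidance, not a sizing tool. The slide does not say what to do at exactly 20 or exactly 100 servers, so treating both boundaries as “medium” is an assumption, and the function name is illustrative.

```python
def recommend_platform(servers: int) -> str:
    """Map an edge site's server count to the platform suggested on the slide."""
    if servers < 1:
        raise ValueError("servers must be a positive integer")
    if servers < 20:
        # Small site: Kubernetes alone.
        return "Kubernetes"
    if servers > 100:
        # Large site: offer both VMs and containers.
        return "Kubernetes on OpenStack"
    # Medium site. The boundaries 20 and 100 are assumed to fall here.
    return "Kubernetes or OpenStack"


if __name__ == "__main__":
    for count in (8, 50, 200):
        print(count, "servers ->", recommend_platform(count))
```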
- 26. Kubernetes vs “Kubernetes on OpenStack”
[Figure: left, containers running on Kubernetes; right, containers running on Kubernetes on OpenStack.]
Kubernetes: the good
• Kubernetes is the common way to manage containers
• Lightweight controller
• Auto-healing is very good
Kubernetes: the bad
• No multi-tenancy
• No network policy tied to SDN
• No persistent storage
“Kubernetes on OpenStack” adds the features that Kubernetes is missing.
However, the OpenStack controller is not lightweight, so we have to think carefully before applying it.
- 28. POC#1 environment (H/W)
Edge Controllers
• Physical Provisioning
• Monitoring / Alerting
• Application Provisioning
• Orchestrator
• SDN
(The controllers are assumed not to be made redundant in POC#1.)
Container nodes
• GPU
• General purpose
- 29. POC#1 environment (S/W)
Edge Controllers
• Physical Provisioning / Application Provisioning: MAAS/Juju
• Monitoring / Alerting: Prometheus / Grafana
• Orchestrator: Kubernetes
• SDN: Flannel
(The controllers are assumed not to be made redundant in POC#1.)
Container nodes
• GPU: nvidia docker
• General purpose: docker
- 30. POC#1 environment (Our Testbed)
[Figure: testbed network diagram. Each of the six servers is labeled IPMI.]
・Normal x86_64 server: Juju/MAAS, Prometheus, Grafana, apt local repository
・Normal x86_64 server: Kubernetes master node, docker image pool
・Normal x86_64 server: Kubernetes master node (docker image pool)
・Normal x86_64 server with GPU: Kubernetes worker node
・Normal x86_64 server: load balancer
・ARM64 server: Kubernetes worker node
・Switches: one switch of 1GbE or faster and one 10GbE switch (port VLANs are also acceptable)
・Networks: one for MAAS and maintenance, one for Pod deployment and general traffic
・Work terminal and verification terminal (moved as needed)
※ The idea of separating Pod deployment traffic onto its own network will be examined in later PoCs.
• Ubuntu Server
• Juju/MAAS
• Kubernetes
• GPU Server
• ARM Server
• Flannel
• Prometheus
• Grafana
- 31. Next Step
• Try OSS for Edge Computing + AI/DL
• From AT&T OSS
• Airship: Infrastructure project for OpenStack and Kubernetes
• Akraino: Edge Computing Framework
• Acumos AI: develop ML models for cloud optimization use-cases
• From Kubernetes issues
• Container Network (Calico, Tungsten Fabric, Cilium, etc)
• Container Security (Istio, etc)
• Persistent Storage (Ceph, Rook, etc)
• Application deployment (Spinnaker, etc)
- 32. Summary of Our Edge Computing POC
• This is a use case from a Japanese telecom company.
• We have started a 5G/Edge Computing POC project using Kubernetes and NVIDIA GPUs.
• This Edge Computing system runs CPUs and GPUs at the edge nodes.
• Kubernetes manages the Edge Computing infrastructure and the GPUs.
• We’re using Canonical Juju/MAAS (provisioning tools) for zero-touch provisioning.
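- 35. Reference) Kubernetes and OpenStack
• They sit at different layers, so they are not things to compare directly.
• Kubernetes specializes in managing containers.
• OpenStack manages virtual machines, containers and bare-metal servers.
• There is also K8s on OpenStack, which runs Kubernetes on top of OpenStack.
• The application chooses between containers and virtual machines.
• Applications on virtual machines can be ported to containers, but migrating every application is not realistic.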
- 36. OpenStack Summit Feedback 1)
1) AT&T's "Network AI"
Network AI: AT&T’s Framework for Its Open Source Efforts
That Will Drive our Software-Defined Network in 2018 and
Beyond
http://about.att.com/innovationblog/att_framework
- 38. OpenStack on Kubernetes
[Figure: the same set of servers shown at four deployment stages, for the Under Cloud and for the Over Cloud.
Under Cloud, Kubeadm self-hosted deployment: 1. Single Node Bootstrap, 2. Expand Control Plane, 3. Deploy Additional Masters, 4. Deploy Compute Hosts. Servers become control plane nodes and container nodes.
Over Cloud: servers become control nodes and compute nodes. Services listed: Keystone, Nova, Glance, Heat, Ironic, Ceph. Bare-metal servers are discovered using Ironic.]
OpenStack runs in the Over Cloud; Kubernetes runs in the Under Cloud.
- 39. OpenStack Summit Feedback 2)
2) Acumos AI Project
A federated platform for managing AI and ML applications and sharing AI models. AT&T and Tech Mahindra contributed the initial Acumos code, now freely available for download.
The Linux Foundation Launches Open Source Acumos AI Project
https://www.acumos.org/news/2018/03/26/the-linux-foundation-launches-open-source-acumos-ai-project/
- 40. OpenStack Summit Feedback 3)
3) Telus (a Canadian telecom company): AI Challenge
Telus’s AI Challenge is excellent. You can watch the following video.
I will share an interesting slide by email.
Artificial Intelligence driven Orchestration, Challenges and Opportunities
https://www.openstack.org/videos/vancouver-2018/artificial-intelligence-driven-orchestration-challenges-and-opportunities
Editor's Notes
- Let’s start the presentation.
Today’s agenda has two parts.
The first is “What’s Edge Computing?”
I will talk about the definition and use cases of edge computing.
The second is “Introduction to our Edge Computing POC project”.
I will talk about the Edge Computing POC that NVIDIA and VTJ proposed, built and are now running.
- First, I have some questions for you about edge computing.
The first is “Can you tell me about your ‘Edge’?”
I know that the definition of “edge computing” is different for each person, so I will talk about the definition of edge computing.
The second is “What’s ‘Edge’ Computing?”
I will talk about use cases of edge computing.
Last month I attended the OpenStack Summit. I will share feedback from the summit about other telecom companies, AT&T and China Mobile.
The third is “What are the key points of ‘Edge’ Computing?”
I will talk about what it takes for “Edge” Computing to succeed.
- The definition of “Edge” is different for each person.
The first is the edge of network nodes.
This is what most telco users mean.
The second is the edge of cloud / computing.
This is probably what cloud users mean.
The third is the server side of an IoT application.
This is what IoT application users mean.
In this presentation, the definition of Edge Computing includes all of these.
I will focus on MEC, Multi-access Edge Computing, also known as Mobile Edge Computing.
- Last month, I attended OpenStack Summit Vancouver and watched telco Edge Computing projects (AT&T, China Mobile and Verizon).
China Mobile’s use cases are listed on this slide. You can find the talk on YouTube with the keywords “Edge TIC china mobile”.
- I was surprised by the activities of AT&T and China Mobile.
I am an NFV consultant. I know NFV, and I know MANO, the orchestration for NFV. They talked about combining NFV, MANO and Edge.
In Japan, NFV projects and Edge projects are separate, so I was surprised.
Please look at the picture on the right side of this slide.
This is the China Mobile use case.
NFV runs at the regional and province levels. Edge runs at the city, county and AP levels. MANO manages all of the NFV and Edge sites.
MANO, NFV and Edge all run on OSS. It was the first time I had seen Akraino, the edge computing OSS.
- I was mistaken about some edge computing projects.
I think that for “Edge” Computing to succeed, we have to think about both the “technical side” and the “business side”.
On the technical side, edge computing technology is not mature, and there are many technical problems.
We are solving those technical problems.
On the business side, we have to think together with the LOB and business development.
NVIDIA and VTJ are ready to help you think through your projects.
Cost reduction and operation optimization are fine.
Creating new business, for example selling edge nodes for advertising items, is also fine, though I know it is a big challenge.
- This slide shows use cases of Edge Computing + GPUs.
I usually talk about these use cases.
The left side of the slide is the operation-side use case.
An error occurs in NFV or Edge, and the network goes down or becomes slow.
The big data & log streaming system detects errors from the logs.
We believe that you can find errors efficiently by using GPUs.
The Policy Engine then operates MANO to change the NFV and Edge settings,
for example changing the routing and the bandwidth at NFV and Edge.
The OpenStack Auto-healing SIG is alpha now.
OpenStack Auto-healing covers collecting logs, detecting errors and applying the policy engine.
This is a future function.
The right side of the slide is the service-side use case.
Imagine a live sporting event such as the Tokyo 2020 Olympics.
Streaming data from many cameras is uploaded to edge nodes with GPUs.
The edge nodes process the streaming data into, for example, panorama images and player-view images.
The edge nodes then publish the streaming data to the users.
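The operation-side flow in this note (an error occurs, it is detected from the logs, and the policy engine picks an action) can be sketched in a few lines of Python. This is a CPU-only toy, not the GPU-backed log streaming system or the OpenStack Auto-healing work mentioned in the talk. The log format, the service names, the threshold and the `POLICY` table are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical log format: "<timestamp> <LEVEL> <service> <message>"
ERROR_PATTERN = re.compile(r"\bERROR\b\s+(?P<service>[\w-]+)")

# Hypothetical policy table: failing service -> action to request from MANO.
POLICY = {
    "vrouter": "change-routing",
    "edge-gw": "change-bandwidth",
}


def detect_errors(log_lines, threshold=3):
    """Step 2: return services with at least `threshold` ERROR lines, sorted by name."""
    counts = Counter()
    for line in log_lines:
        match = ERROR_PATTERN.search(line)
        if match:
            counts[match.group("service")] += 1
    return sorted(name for name, hits in counts.items() if hits >= threshold)


def decide_actions(failing_services, policy=POLICY):
    """Step 3: look up an action per service, falling back to a human operator."""
    return [(name, policy.get(name, "notify-operator")) for name in failing_services]


if __name__ == "__main__":
    logs = [
        "2018-06-01T12:00:00 ERROR vrouter link down",
        "2018-06-01T12:00:01 ERROR vrouter link down",
        "2018-06-01T12:00:02 INFO edge-gw heartbeat ok",
        "2018-06-01T12:00:03 ERROR vrouter link down",
    ]
    for service, action in decide_actions(detect_errors(logs)):
        print(f"{service}: {action}")
```

In a real deployment the detection step is where the GPUs would come in, and the action would be sent to MANO rather than printed.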
- This is the summary.
“Edge” Computing can be used in various use cases, and this session focused on MEC.
AT&T and China Mobile are combining NFV, “Edge” and MANO, and are beginning to create the next-gen network service infrastructure.
It’s important to think about both the “technical side” and the “business side”.
If you need help, NVIDIA and VTJ will help with your projects.
- Next is our Edge Computing POC project.
This is a use case from a Japanese customer.
NVIDIA and VTJ are promoting a GPU MEC POC project.
This POC system is running.
- This is a summary of the POC project.
It uses containers, Kubernetes and NVIDIA GPUs.
Kubernetes handles container management and orchestration.
In this project, Kubernetes manages many containers and GPUs.
- Half a year ago, I attended OpenStack Summit Sydney and watched AT&T’s presentation on its Edge Computing project journey.
I was inspired by that material.
- This slide shows the external and internal factors behind MEC.
- This is taken from an AT&T document.
I understood that the 5G network has several use cases.
Each use case has different system requirements.
- This slide shows AT&T’s high-level MEC architecture.
The right side of the slide is the Centralized Cloud, also called the “Core Network”.
The left side of the slide is the radio network for the mobile network.
The center of the slide is the Edge Cloud, which uses Edge Computing.
The upper side of the slide is NFV MANO, the orchestrator for telecom service networks.
NFV MANO manages the Centralized Cloud and the Edge Cloud.
- This slide is feedback from AT&T’s MEC projects.
MEC runs at telco central centers, at city and county sites, and so on.
Space for edge computing servers is limited.
This AT&T project chose containers and Kubernetes.
There are many edge computing nodes, so remote management is necessary.
For edge computing, zero-touch provisioning is a key feature.
- This is an introduction to our POC project.
- The scope of our POC is shown on this slide.
This POC is the first phase.
We proposed it in November of last year, then researched and built the POC system.
This POC system is running now.
- This is the Edge Computing + GPUs architecture.
The upper side of the slide is the Edge Controllers.
The bottom side of the slide is the container nodes / compute nodes.
To cover several edge computing use cases, we will prepare various types of container nodes.
- The first phase of our POC covers the Edge Controllers, GPU servers and general-purpose servers.
- This slide shows the components of our edge computing system.
We use Ubuntu Juju/MAAS for zero-touch provisioning.
For SDN, Flannel is probably good enough.
GPUs are used.
For general-purpose servers, we use Intel servers and ARM servers.
The ARM server is challenging. In this project,
we asked Canonical, and Canonical delivers the Ubuntu Support Service.
We got Kubernetes running on ARM.
- This slide is a question about VMs versus containers.
Generally, users choose VMs or containers according to system requirements.
VMs are already in use, and containers will be used for future application development.
As an NFV consultant, I think containers are not yet mature in virtualized networking and in hardware offload for high-speed networking.
This slide is a question about VMs vs containers in the case of MEC.
This is what I know: MEC has a space limitation problem. I have heard that a small edge node is one or two racks.
In MEC environments with fewer than 20 servers, infrastructure users usually choose containers and Kubernetes.
In MEC environments with more than 100 servers, infrastructure users build VMs and OpenStack and, if needed, install Kubernetes on the VMs. We call this “Kubernetes on OpenStack”.
Application users choose VMs or containers.
- This slide compares Kubernetes with Kubernetes on OpenStack.
Kubernetes is a good solution, with a lightweight controller and built-in auto-healing.
But today’s Kubernetes has no multi-tenancy and no network policy tied to SDN.
If those are needed, we choose Kubernetes on OpenStack.
- This slide shows our POC environment.
- This is our testbed.
Using Juju/MAAS, we can build the same environment again.
- This slide shows the next steps for our POC.
We have many issues, and we will try to solve them.
- This is a summary of the POC project.
It uses containers, Kubernetes and NVIDIA GPUs.
Kubernetes handles container management and orchestration.
In this project, Kubernetes manages many containers and GPUs.
If you are interested in MEC and our MEC projects, please ask NVIDIA and VTJ.