© 2016 Mesosphere, Inc. All Rights Reserved.
BUILDING THE GLUE FOR
SERVICE DISCOVERY &
LOAD BALANCING
MICROSERVICES
@SARGUN
Sargun Dhillon, 2016
WHO AM I?
AGENDA
• How and why did we get here?
• What have I been building in the past year?
• Why?
• How?
• Challenges
• Some of the future
A BRIEF HISTORY
OF NETWORKS
IN THE
DATACENTER
ORGANIZATIONS CIRCA 2007
•Before DevOps was coined
•Clear differentiation of ownership
•The datacenter was owned by the NOC
•Deployment of services was done by sysadmins in the operations group
•Developers operated without access to production systems
•Production deployments gated by QA and Operations
SOFTWARE CIRCA 2007
•Different services glued together via CORBA, XML-RPC, SOAP
•No one was really consciously doing microservices
•Networks were static, giant layer 2 domains
•Load Balancing provided by hardware
•Everyone ran their own datacenter
•EC2 was in its infancy; the term “Cloud” had begun to become popular only a year prior
•Systems statically partitioned
SaaS continued to grow at
an incredible rate
It became a race to ship faster
We kept the software alive
By feeding it
With Sysadmins
We kept the machines alive
By feeding them
With Blood
This wasn’t working
ORGANIZATIONS 2008+
•We began seeing a gradual shift in the industry where lines between QA, Dev, and
Ops were blurring
•The term “DevOps” was coined in 2008; the first DevOpsDays was held in 2009
•Gradual adoption of the cloud, fewer organizations owning their own datacenters
•Either networking was outsourced to the cloud, or typically remained in a small
internal organization
•Needed to reduce ratio of operators to servers
SOFTWARE CIRCA 2008+
•Popularization of Open Source tooling to automate QA and much of traditional operations
•Jenkins / Hudson
•Puppet / Chef
•Capistrano
•Popularization of stacks requiring more complex operational requirements
•Nutch / Hadoop
•NoSQLs
•Machines still statically partitioned
•Networks still sacred territory
CIRCA 2011
•Much of what’s been happening for the past half-decade hits networking
•Much of this falls under the term “SDN” (Software Defined Networking) or “NFV” (Network
Function Virtualization)
•Hastened by the adoption of VMs in the enterprise in the hype cycle
•OpenFlow promises to fix everything
•Major adoptions of the cloud by startups as well as enterprise
•Virtualization begins to become mainstream as a mechanism of consolidating workloads
•The invention of the “private cloud”
•DotCloud / Docker funded by Y Combinator a year earlier
•Term “Microservice” coined
CIRCA 2013
• Docker becomes an instant hit and brings containers to the forefront
• Dynamic partitioning begins to make inroads
• Google releases Omega paper
• Apache Aurora open sourced
• Microservice counts explode, demanding collocation of workloads for efficiency
• Mesosphere Founded
• Site Reliability Engineering becomes popular and further blurs the lines between
Dev, Ops, and QA
Everything was changing
Why?
Business Value
BENEFITS
•Reduction in cost of goods sold
•Smaller engineer to server ratio
•Linear or superlinear growth of the engineering team relative to server count is unsustainable
•Smaller engineer to capability ratio, where capability includes:
•Features
•Throughput
•Better User Experience
•Better availability
•Quicker release of features and fixes
But at what cost?
Complexity
OLD WORLD
NEW WORLD
DIVING DEEPER
Paxos?
Raft?
Ω Failure Detector?
Pods?
Wat?
Sidecars?
Etcd?
Zookeeper?
VXLAN?
Performance
REDIS PERFORMANCE
MySQL Performance with Containers
[Chart: Transactions / Sec, axis from 0 to 300,000, for Container-free, Host Mode, Bridged, and Overlay]
OUR JOURNEY
“Connectivity”
Where are my apps running?
The Old World
Let Mesos* Choose Ports
*The Scheduler
How do you find the tasks?
A Directory?
See: DNS
Solution:
Expose the IP and Port of Tasks
via
DNS SRV records
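
As a rough illustration of what a client does with such records, the sketch below picks one endpoint from a set of already-resolved SRV answers, roughly following the priority and weight rules of RFC 2782. The `SrvRecord` type, the `pick_endpoint` helper, and the host names and ports are hypothetical and only for illustration; the actual lookup would go through whatever resolver library the application uses.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    """One SRV answer: the fields carried in the record data."""
    priority: int
    weight: int
    port: int
    target: str


def pick_endpoint(records, rng=random):
    """Pick one (target, port) pair from SRV answers.

    Lowest priority wins; within that priority the choice is random,
    biased by weight. This is a simplification of RFC 2782 ordering.
    """
    if not records:
        raise LookupError("no SRV records returned")
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        chosen = rng.choice(candidates)  # all weights zero: pick uniformly
    else:
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen.target, chosen.port


# Hypothetical answers for a task whose ports were chosen by the scheduler.
answers = [
    SrvRecord(0, 1, 31005, "agent-1.example.internal"),
    SrvRecord(0, 1, 31877, "agent-2.example.internal"),
    SrvRecord(10, 1, 31002, "agent-3.example.internal"),
]
host, port = pick_endpoint(answers)
```

The point of SRV over a plain A record is that the port travels with the name, which matters once the scheduler, not the operator, chooses ports.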
Everyone has DNS, right?
And glibc even has a bug open for it!
…Opened in 2005
So, we performed an OODA loop
1. Observe
2. Orient
3. Decide
4. Act
OBSERVE: SERVICE DISCOVERY
• Existing Dynamic Service Discovery Solutions:
• Etcd
• Finagle + Zookeeper
• Consul
• Existing Static Service Discovery Solutions:
• Amazon ELB
• Hardware Load Balancers
• Service Discovery is an afterthought
Gathering our data about the field
OBSERVE: NETWORKING
• Everybody assumes IP per application instance
• Everybody assumes reliable DNS
• Some people want to be fast
• Some people want security
• Nobody wants to edit application code
• Nobody wants to talk to their network engineer
Gathering our data about the field
ORIENT:
NETWORKING
CORE TENETS
•We must be agnostic to the underlying environment
•AWS / Azure / GCE / SoftLayer as the lowest common
denominators
•We should require minimal changes to user code in
order to work
•We should provide similar services to existing
environments
•Fixed load balancers
•Security
•IP/Container
•We do not want to require a change in organization
procedures
Act:
What did we build?
Load Balancing:
Minuteman
WHAT WE WANTED
First try:
LD_PRELOAD
How does connect() work?
How does connect() work on
LD_PRELOAD?
…But no
Static Linking
What else is there?
Minuteman:
Flows at a Moment’s notice
MINUTEMAN:
BENEFITS
•Appearance of a fixed load balancer
•Fully distributed
•Other than first packet, the entire lifetime is handled
in kernel space
•Source: github.com/dcos/minuteman
How do we tie it all together?
GLOBAL STATE
• Load Balancer Task Mapping
• DNS Zones
• Virtual Network Routing Tables
• Reachability
• Security ACLs
Computers
Constant Churn
Sometimes
They Break
Sometimes
Many Break
Sometimes
You don’t know
HOW DO YOU
DO
SIGNALING?
Nobody ever got fired for using
Zookeeper
We know about Zookeeper
It works…usually
How else can we do this?
Naively
Massive Amount of Information
Who else has solved this?
Academia
Control Plane:
Lashup
How do we scale the naive approach?
All we need is a sparse, connected graph
But how?
Enter: HyParView
Builds a connected graph (overlay), where the degree is <=K
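
To make the degree bound concrete, here is a heavily simplified sketch of the active-view bookkeeping: each node keeps at most K symmetric links, and when a full node accepts a new peer it demotes a random existing peer to its passive (backup) view. The `Node`, `link`, and `build_overlay` names are hypothetical. The sketch omits the join forwarding, shuffling, and passive-view repair that the real protocol relies on to keep the graph connected, so it only demonstrates the bound, not connectivity.

```python
import random


class Node:
    def __init__(self, name, k):
        self.name = name
        self.k = k            # maximum active-view size (the degree bound K)
        self.active = set()   # peers we hold a live link to
        self.passive = set()  # known peers kept as backups


def link(a, b, rng):
    """Create a symmetric active link, evicting a random peer from a full view."""
    for node, peer in ((a, b), (b, a)):
        if peer in node.active:
            continue
        if len(node.active) >= node.k:
            # Demote a random active peer on both sides of that link.
            victim = rng.choice(sorted(node.active, key=lambda n: n.name))
            node.active.discard(victim)
            victim.active.discard(node)
            node.passive.add(victim)
            victim.passive.add(node)
        node.active.add(peer)
        node.passive.discard(peer)


def build_overlay(n, k, seed=0):
    """Boot n nodes one at a time, each joining via a random earlier node."""
    rng = random.Random(seed)
    nodes = [Node(f"n{i}", k) for i in range(n)]
    for i in range(1, n):
        contact = nodes[rng.randrange(i)]
        link(nodes[i], contact, rng)
    return nodes


nodes = build_overlay(50, k=4)
max_degree = max(len(n.active) for n in nodes)  # never exceeds k
```

Keeping evicted peers in the passive view is what lets a node replace a failed active link without asking a central directory.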
Boot Time
Connected Graph
HyParView
Constant Adaptive
Health Checks*
*Borrowed from SWIM, Gossip Style Failure Detector
Dealing with Failure
HyParView
Routing*
*Borrowed from Dijkstra, OSPF, IS-IS, Perlman
Just Run Dijkstra*
*BFS / DFS
Minimum Spanning Tree, From Sender A
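
A minimal sketch of that idea, assuming every node already knows the overlay's adjacency lists and all links cost the same: a breadth-first search from the sender yields a tree, and each node forwards a message only to its children in that tree. With equal edge costs this is a shortest-path tree, which is why BFS can stand in for Dijkstra here. The graph and the function names are hypothetical.

```python
from collections import deque


def bfs_tree(graph, root):
    """Return {node: parent} for a breadth-first tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in parent:   # first visit fixes the parent
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def forward_targets(parent, node):
    """Children of `node` in the tree: the peers it relays the message to."""
    return sorted(child for child, p in parent.items() if p == node)


# Hypothetical overlay: adjacency lists of active-view links.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
tree = bfs_tree(overlay, "A")
# tree == {"A": None, "B": "A", "C": "A", "D": "B", "E": "D"}
```

Because every node computes the same tree from the same graph, a message from A crosses each tree edge once instead of being flooded over every link.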
A Wild Pokemon Appears!*
*CRDT papers
Database
CRDTs: Semilattices
“Time”
CRDT KV Store
LASHUP KV
•Datatypes:
•Maps
•Composable
•Sets
•Flags
•Counters
•Registers
•Last-write-wins
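
To illustrate why these datatypes converge, the sketch below shows two of them as plain Python values with a merge function each: a grow-only counter and a last-write-wins register. In both cases merge is commutative, associative, and idempotent (a join on a semilattice), so replicas can exchange state in any order and still agree. This is a generic textbook sketch with hypothetical names, not Lashup's implementation.

```python
def merge_gcounter(a, b):
    """Grow-only counter: per-node counts, merged by pointwise maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def gcounter_value(counter):
    """The counter's value is the sum of every node's contribution."""
    return sum(counter.values())


def merge_lww(a, b):
    """Last-write-wins register: (timestamp, node_id, value) tuples.

    The newer timestamp wins; node_id breaks ties deterministically.
    """
    return max(a, b, key=lambda r: (r[0], r[1]))


# Two replicas that diverged and then exchanged state.
x = {"n1": 2, "n2": 1}
y = {"n1": 1, "n3": 5}
merged = merge_gcounter(x, y)       # {"n1": 2, "n2": 1, "n3": 5}
total = gcounter_value(merged)      # 8

old = (10, "n1", "up")
new = (12, "n2", "down")
winner = merge_lww(old, new)        # (12, "n2", "down")
```

Last-write-wins is the one place where "time" leaks in: it trades a possible lost update for a register that never needs coordination.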
Evaluation
Prior to Lashup
After Lashup
PROJECT
LASHUP
•A novel distributed systems SDK that provides:
•Membership
•Multicast Delivery
•Strongly-eventually consistent data storage
•Powers:
•Minuteman VIP dissemination
•Minuteman node liveness checks
•Overlay routing
•DNS Synchronization
•Source: github.com/dcos/lashup
THE FUTURE
Zero Overhead NFV:
Software Defined Systems?
This seems familiar
Something new
Something new(-ish)
(Extended)BPF?
ZERO-OVERHEAD NFV
•Implemented using built-in Linux APIs
•Linux Security Module API
•Linux eBPF
•Allows standard BSD API to work
•Example: getpeername() works
•In preliminary testing: <0.1% overhead
•Challenges
•Upgrading Kernel
IPv6:
Just more addresses?
000.000.000.000
32-bits
IPv4
0000:0000:0000:0000:0000:0000:0000:0000
64 bits Network Part + 64 bits Host Part
IPv6
SLAAC
StateLess Address Auto Configuration
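
As one concrete example of how a host can fill in the 64-bit host part on its own, the sketch below derives an address from an advertised /64 prefix and a MAC address using the classic modified EUI-64 scheme: flip the universal/local bit of the first octet and insert `ff:fe` in the middle. The function name, the documentation prefix `2001:db8::/64`, and the MAC address are hypothetical. Many current operating systems prefer randomized or stable-opaque identifiers instead, so treat this as an illustration of the mechanism rather than what any given host does.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("SLAAC expects a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02                                   # flip the universal/local bit
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return net.network_address + int.from_bytes(eui64, "big")


addr = slaac_address("2001:db8::/64", "00:1a:2b:3c:4d:5e")
# addr == ipaddress.IPv6Address("2001:db8::21a:2bff:fe3c:4d5e")
```

No DHCP server or address database is involved: the router advertises the network part and each host computes the rest.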
IPV6
•2^64 addresses per host
•Automatically configured
•Challenges
•Cloud support
•Organization challenges
What did we learn?
Organizations are weird
We’re probably doing it wrong
(today)
The future looks bright
BUILDING THE GLUE FOR
SERVICE DISCOVERY &
LOAD BALANCING
MICROSERVICES
@SARGUN
