The hardest part of microservices:
Calling your services
Christian Posta (@christianposta)
Chief Architect – Red Hat
Christian Posta
Chief Architect, cloud application development
Twitter: @christianposta
• Author of “Microservices for Java Developers”
• Committer/contributor on lots of open-source projects
• Worked with a large, web-scale, microservices-based unicorn company
• Blogger, speaker

The network… does what it wants.

Microservices are…
distributed systems
As we move to services architectures,
we push the complexity to the space
between our services.
We need guiding “service principles”
Service principles
• Services shall be resilient when communicating
• Failures should not jump boundaries (no cascading failures)
• New releases should not impact production
• Mean time to recover should approach zero
• Security as a first-class citizen
• Policies to eliminate unexpected usage

Have we had to solve for this in the past?
Some drawbacks to this approach?
• Highly centralized == centralized governance
• Inadvertently scattered business logic away from business apps/services
• Scalability issues?
• Not ideal for use in cloud environments
“Microservices” patterns
• Netflix Hystrix (circuit breaking / bulkheading)
• Netflix Zuul (edge router)
• Netflix Ribbon (client-side service discovery / load balancing)
• Netflix Eureka (service discovery registry)
• Brave / Zipkin (tracing)
• Netflix Spectator / Atlas (metrics)
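To make the circuit-breaking pattern in this list concrete, here is a minimal sketch in Python. It is not Hystrix's API or implementation; the class name, the thresholds, and the injectable clock are assumptions chosen for illustration. The idea is only that after a run of consecutive failures the caller stops hitting the backend and fails fast until a cool-down has passed.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the backend."""


class CircuitBreaker:
    """Illustrative breaker: opens after N consecutive failures (hypothetical defaults)."""

    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Every service in every language would need its own equivalent of this wrapper around each outbound call, which is the maintenance burden the following slides describe.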
Now we have “service hurdles”

Some drawbacks to this approach?
• Requires a specific language to bring in new services
• A single language doesn’t fit all use cases
• How do you patch/upgrade/manage lifecycle?
• Need strict control over application library choices
But I’m using Spring!
• spring-cloud-netflix-hystrix
• spring-cloud-netflix-zuul
• spring-cloud-netflix-eureka-client
• spring-cloud-netflix-ribbon
• spring-cloud-netflix-atlas
• spring-cloud-netflix-spectator
• spring-cloud-netflix-hystrix-stream
• …..
• ......
• @Enable....150differentThings
But I’m using Vert.x!
• vertx-circuit-breaker
• vertx-service-discovery
• vertx-dropwizard-metrics
• vertx-zipkin?
• …..
• ......
Screw Java - I’m using Node.js!
JavaScript is for rookies, I use Go!
But Python is so pretty!
I prefer unreadability… Perl for me!

Things you must solve for because…
distributed systems
• Service discovery
• Retries
• Timeouts
• Load balancing
• Rate limiting
• Thread bulkheading
• Circuit breaking
• Routing between services (adaptive, zone-aware)
• Deadlines
• Back pressure
• Outlier detection
• Health checking
• Traffic shaping
• Request shadowing
• Edge/DMZ routing
• Surgical / fine / per-request routing
• A/B rollout
• Internal releases / dark launches
• Fault injection
• Stats and metrics collection
• Logging
• Tracing
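Several items in this list interact, notably retries, timeouts, and deadlines. The sketch below, in Python, shows one way they can fit together: retries with jittered exponential backoff that all share a single overall deadline. The function name, parameters, and defaults are assumptions made for illustration and do not come from any particular library.

```python
import random
import time


class DeadlineExceeded(Exception):
    """Raised when the overall time budget is spent before the call succeeds."""


def call_with_retries(func, max_attempts=3, deadline_s=1.0, base_backoff_s=0.05,
                      clock=time.monotonic, sleep=time.sleep):
    """Call func(remaining_s) until it succeeds, attempts run out, or the deadline passes.

    func receives the remaining budget in seconds so it can set its own
    per-attempt timeout. All names and defaults here are hypothetical.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    deadline = clock() + deadline_s
    for attempt in range(1, max_attempts + 1):
        remaining = deadline - clock()
        if remaining <= 0:
            raise DeadlineExceeded(f"budget spent before attempt {attempt}")
        try:
            return func(remaining)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error unchanged
        # Exponential backoff with full jitter, never sleeping past the deadline.
        backoff = random.uniform(0, base_backoff_s * 2 ** (attempt - 1))
        sleep(min(backoff, max(0.0, deadline - clock())))
```

Passing the remaining budget down to each attempt keeps a slow dependency from consuming more time than the caller was ever willing to wait, which is the point of a deadline as opposed to a per-call timeout.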
Now we have …
An implementation of “service mess”

Now we have …
• 30 different libraries for each of 5 languages, and each with 3 frameworks
• How do we maintain, upgrade, retire?
• classpath/namespace pollution
• increases operational complexity
• forces specific languages
• inconsistency
• correctness
These are all horizontal concerns
and apply to all services regardless
of implementation.
Let’s abstract this functionality to a single
binary and apply to all services.
• Allow heterogeneous architectures
• Remove application-specific implementations of this functionality
• Consistently enforce these properties
• Correctly enforce these properties
• Opt-in as well as safety nets

Evolution of application networking
Meet Envoy Proxy
Envoy is…
• service proxy
• written in C++, highly parallel, non-blocking
• L3/4 network filter
• out-of-the-box L7 filters
• HTTP/2, including gRPC
• baked-in service discovery/health checking
• advanced load balancing
• stats, metrics, tracing
• dynamic configuration through xDS

Envoy implements
• zone-aware, least-request load balancing
• circuit breaking
• outlier detection
• retries, retry policies
• timeouts (including budgets)
• traffic shadowing
• rate limiting
• access logging, statistics collection
• Many other features!
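As an illustration of the least-request idea in this list, here is a small Python sketch that uses the "power of two choices" technique: sample two hosts at random and send the request to the one with fewer requests in flight. This is a simplified stand-in and not Envoy's code; the class and method names are hypothetical, and weights, zones, and health status are ignored.

```python
import random


class LeastRequestBalancer:
    """Illustrative power-of-two-choices balancer (hypothetical names)."""

    def __init__(self, hosts, rng=None):
        # Track the number of in-flight requests per host.
        self.active = {host: 0 for host in hosts}
        self.rng = rng or random.Random()

    def pick(self):
        hosts = list(self.active)
        if len(hosts) == 1:
            choice = hosts[0]
        else:
            # Sample two distinct hosts and keep the less loaded one.
            a, b = self.rng.sample(hosts, 2)
            choice = a if self.active[a] <= self.active[b] else b
        self.active[choice] += 1
        return choice

    def release(self, host):
        # Call when the request finishes so the host's load count drops.
        self.active[host] -= 1
```

Comparing only two random candidates avoids scanning every host on each request while still steering traffic away from hosts that are piling up work.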
As an edge proxy
As a shared proxy
As a service-instance proxy

Service instance proxy AKA Sidecar
Service mesh
“2018 is the year of the service mesh”
Clayton Coleman (@smarterclayton)
Red Hat OpenShift Platform Architect

How do we reason about a fleet of
these service proxies in a large cluster?
Time for definitions:
A service mesh is a decentralized application-networking
infrastructure between your services that provides
resiliency, security, observability, and routing control.
A service mesh is composed of a data plane
and a control plane.
All traffic between our applications flows
through these proxies. The proxies make
up the “data plane”
Meet Istio
A control plane for service proxies

What higher-order cluster semantics
does Istio enable?
• Request-level control
• Graduated deployment and release
• Service observability
• Cluster reliability
• Chaos testing
• Policy enforcement
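Request-level control and graduated release both come down to a routing decision made per request. The Python sketch below illustrates that decision only; it is not Istio's API, since a mesh expresses this as configuration pushed to the proxies and not as application code. The header name `x-version` and the weights are assumptions chosen for the example.

```python
import random


def choose_version(headers, weights, rng=random):
    """Pick a backend version for one request (illustrative only).

    headers: dict of request headers; "x-version" is a hypothetical pin header.
    weights: dict mapping version name to its relative share of traffic.
    """
    # Request-level control: an explicit header pins the request to a version.
    pinned = headers.get("x-version")
    if pinned in weights:
        return pinned
    # Graduated release: everyone else is split according to the weights.
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]
```

With weights such as `{"v1": 90, "v2": 10}`, roughly one request in ten reaches the new version, while testers can still pin themselves to `v2` by sending the header.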

BTW: Hand drawn diagrams made with Paper by 
Twitter: @christianposta


