Best Practices for Inter-process
Communication
Gustavo Garcia
@anarchyco
Tef con2016 (1)
What happens when your application and/or team starts growing?
Disclaimer: I don’t like the word. I’m
not advocating the use of microservices.
Inter-process communication
Once you break a monolithic application into separate pieces – microservices –
the pieces need to speak to each other. And it turns out that you have many
options for inter-process communication.
              1-1                        1-many
SYNCHRONOUS   Request / Response         -
ASYNCHRONOUS  Notification               Publish / Subscribe
              Request / Async Response   Publish / Async Responses
Request / Response (RPC)
Discover -> Format -> Send
Discovery and Load Balancing
When you are writing some code that invokes a service, in order to make a
request, your code needs to know the network location (IP address and port) of
a service instance.
In a modern, cloud-based microservices application, however, this is a much
more difficult problem to solve.
Service instances have dynamically assigned network locations and the set of
service instances changes dynamically because of autoscaling, failures, and
upgrades.
Discovery and Load Balancing
At a high level there are two different approaches:
Client-Side Discovery Pattern: The calling service needs to find the
Server-Side Discovery Pattern: The calling service sends the request to an
intermediary (router/proxy) that is responsible for locating
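To make the client-side pattern concrete, here is a minimal sketch in which the caller asks a registry for the instances of a service and chooses one itself, round-robin. `StaticRegistry`, `RoundRobinClient` and the addresses are hypothetical names invented for this illustration; a real registry would be a networked service, not an in-memory dict.

```python
import itertools


class StaticRegistry:
    """Hypothetical in-memory registry: service name -> list of (host, port)."""

    def __init__(self, instances):
        self._instances = instances

    def lookup(self, service):
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._instances.get(service, []))


class RoundRobinClient:
    """Client-side discovery: the caller looks up instances and picks one itself."""

    def __init__(self, registry, service):
        self._registry = registry
        self._service = service
        self._counter = itertools.count()

    def next_instance(self):
        # Look up on every call so that instances added or removed are noticed.
        instances = self._registry.lookup(self._service)
        if not instances:
            raise LookupError(f"no instances registered for {self._service!r}")
        return instances[next(self._counter) % len(instances)]


registry = StaticRegistry({"users": [("10.0.0.1", 8080), ("10.0.0.2", 8080)]})
client = RoundRobinClient(registry, "users")
picks = [client.next_instance() for _ in range(3)]
```

In the server-side pattern the same lookup-and-pick logic would live in the router/proxy instead of in every caller.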
Discovery and Load Balancing
Ribbon is an Inter-Process Communication (remote procedure calls) library
with built-in software load balancers. The primary usage model involves
REST calls with various serialization scheme support. It is heavily used in
production by Netflix.
Finagle clients come equipped with a load balancer, a pivotal
component in the client stack, whose responsibility is to dynamically
distribute load across a collection of interchangeable endpoints.
Finagle is the core component of the Twitter microservices architecture
and it is used by Foursquare, Tumblr, ING...
“A common anti-pattern used for HTTP microservices is to have a load
balancing service fronting each stateless microservice.” Joyent.
“Generally, the Proxy Model is workable for simple to moderately complex
applications. It’s not the most efficient approach/Model for load
balancing, especially at scale.” Nginx.
Serialization / Formats
Different ways to serialize the information for sending:
- Interface Definition Language (protobuf, thrift, json schema ...)
- Schema-free or “Documentation” based
IDL-based formats are usually binary (but not necessarily) and usually offer
the possibility of auto-generating code.
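As a rough illustration of the trade-off, the sketch below encodes the same message twice: once as schema-free JSON and once with a fixed binary layout agreed on beforehand. The message fields and the `LAYOUT` string are assumptions made for this example, and `struct` stands in for a real IDL toolchain such as protobuf or thrift.

```python
import json
import struct

# Hypothetical message: a user id, a latency in milliseconds, a success flag.
message = {"user_id": 42, "latency_ms": 12.5, "ok": True}

# Schema-free text encoding: the field names travel with every message.
as_json = json.dumps(message).encode("utf-8")

# Schema-based binary encoding: both sides must already agree on the layout.
# "<IdB" = little-endian unsigned 32-bit int, 64-bit float, unsigned byte.
LAYOUT = "<IdB"
as_binary = struct.pack(
    LAYOUT, message["user_id"], message["latency_ms"], message["ok"]
)

# Decoding the binary form only works if the reader knows the same layout.
user_id, latency_ms, ok = struct.unpack(LAYOUT, as_binary)
decoded = {"user_id": user_id, "latency_ms": latency_ms, "ok": bool(ok)}
```

The binary form is smaller, but it is unreadable in a log or a packet capture without the layout, which is the readability cost the comparison refers to.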
Serialization / Formats
                         Binary / Schema   Text / Schema-free
Efficiency               High              Lower
Development speed        Low?              High
Debugging / Readability  Low               High
Robustness               High              Low
Transport
Protocol: HTTP, TCP
Security: SSL, non-SSL
Reusing connections: No reuse, Reusing, Multiplexing
Transport
Good News: HTTP/2
● Efficient, SSL, Multiplexed
● Supported by major libraries: gRPC, Finagle ...
Failures
Applications in complex distributed architectures have dozens of dependencies, each of
which will inevitably fail at some point. If the host application is not isolated from these
external failures, it risks being taken down with them.
For example, for an application that depends on 30 services where each service has
99.99% uptime, here is what you can expect: 99.99%^30 ≈ 99.7% uptime
2+ hours downtime/month even if all dependencies have excellent uptime.
Reality is generally worse.
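The arithmetic can be checked in a few lines. This sketch assumes the 30 dependencies fail independently, that the application needs all of them, and that a month has 30 days; those assumptions are made for this calculation only.

```python
dependencies = 30
uptime_per_service = 0.9999  # 99.99% expressed as a fraction

# All dependencies must be up at once, so the probabilities multiply.
combined_uptime = uptime_per_service ** dependencies

hours_per_month = 30 * 24  # assumes a 30-day month
downtime_hours = (1 - combined_uptime) * hours_per_month

print(f"combined uptime: {combined_uptime:.2%}")
print(f"expected downtime: {downtime_hours:.2f} hours/month")
```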
Engineering for Failure
Detect: How and when to mark a request as a failure
React: What do you do when you detect a failure
Isolate: Minimize the impact in the whole system
Detecting failures
What is the definition of failure?
Connection failures vs HTTP Response Status
Timeouts:
Sometimes this is more difficult than it looks.
Fail Fast
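One way to fail fast is to bound how long the caller waits for a dependency. The sketch below uses a worker thread and a deadline; `call_with_timeout`, `slow_dependency` and `fast_dependency` are hypothetical names used only for illustration. Note that the abandoned call keeps running in its thread, which is part of why timeouts are harder than they look.

```python
import concurrent.futures
import time


def call_with_timeout(fn, timeout_s):
    """Run fn in a worker thread and stop waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return ("ok", future.result(timeout=timeout_s))
    except concurrent.futures.TimeoutError:
        # The caller gives up; the worker thread itself is not interrupted.
        return ("timeout", None)
    finally:
        # Do not block the caller waiting for the abandoned work to finish.
        pool.shutdown(wait=False)


def slow_dependency():
    time.sleep(0.2)  # simulates a dependency that answers too late
    return "late answer"


def fast_dependency():
    return "answer"
```

In practice the timeout is usually set on the socket or HTTP client itself, so that the underlying connection is released as well.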
Reacting to failures
Possible ways to react to failures:
Retry the request, provided it is idempotent
Cache the results and return them if the next request fails (or always)
Fall back to returning something else, or change the logic, when one of the
requests fails (for example, by sending a predefined value)
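A minimal sketch of the retry and fallback reactions, assuming the wrapped call is idempotent and signals failure by raising `ConnectionError`. `call_with_retry` and `FlakyService` are hypothetical names invented for this example.

```python
def call_with_retry(fn, attempts=3, fallback=None):
    """Call fn up to `attempts` times; return `fallback` if every attempt fails.

    Only safe when fn is idempotent, because it may run more than once.
    """
    for _ in range(attempts):
        try:
            return fn()
        except ConnectionError:
            continue  # try again; a real client would also back off here
    return fallback


class FlakyService:
    """Hypothetical dependency that fails a fixed number of times, then works."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def get(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("simulated failure")
        return "fresh value"
```

The fallback value could equally be the last cached result, which covers the caching reaction with the same structure.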
Circuit Breaker
If something is not working, stop trying for a while,
because retrying could make things worse for you or for
them.
It can be a local Circuit Breaker or a global one.
Example of logic
https://github.com/Netflix/Hystrix
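The following is a simplified local circuit breaker, written only to illustrate the idea. It is not Hystrix's implementation or API; the class name, the thresholds and the injectable `clock` parameter are assumptions made for this sketch.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock  # injectable so the timing can be tested
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("circuit open, failing fast")
            # Reset timeout elapsed: let one trial call through (half-open).
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if (self._opened_at is not None
                    or self._failures >= self.failure_threshold):
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A global breaker would share this state between processes, for example through a coordination service, instead of keeping it in one object.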
Bulkhead pattern
A misbehaving service shouldn’t affect the rest of the
services.
Limit the resources that the client for a specific
service can use.
Make sure a client for a specific service cannot
block the whole process.
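One simple way to cap the resources used by the client of a given service is a semaphore that bounds in-flight calls. This is a sketch under that assumption; `Bulkhead` and `BulkheadFullError` are hypothetical names, and thread pools or connection pools per dependency are common alternatives.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when the per-service concurrency limit is already reached."""


class Bulkhead:
    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn):
        # Reject immediately instead of queueing, so a slow dependency
        # cannot tie up every thread in the process.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("too many in-flight calls to this service")
        try:
            return fn()
        finally:
            self._slots.release()
```

Each downstream service would get its own `Bulkhead` instance, so exhausting one limit leaves calls to the other services unaffected.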
Swimlane pattern
Maintain independent full stacks so that even in case of a problem in one of
them there is no full outage.
Back Pressure or Flow Control
When your server is under pressure you should use some counter-measures to
avoid making it worse.
For example: delay accepting new connections, throttle messages, return 503...
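As an illustration of the "return 503" counter-measure, the sketch below puts incoming requests in a bounded queue and rejects new ones once it is full. `BoundedInbox` and the 202/503 return values are assumptions made for this example, not the API of any framework.

```python
import queue


class BoundedInbox:
    """Hypothetical request queue that sheds load when it is full."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)

    def submit(self, request):
        """Return an HTTP-style status: 202 if queued, 503 if saturated."""
        try:
            self._queue.put_nowait(request)
        except queue.Full:
            return 503  # tell the caller to back off instead of piling up work
        return 202

    def take(self):
        # Called by a worker when it is ready for the next request.
        return self._queue.get_nowait()
```

Rejecting early keeps latency bounded for the requests that were accepted, at the cost of pushing the retry decision back to the caller.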
Monitoring and Debugging
Knowing what’s happening in your service, and why latency or failures
are increasing, is harder when you are calling 30 services to process the request.
Monitoring
Debugging
Monitoring
You need to know if any of your requests is taking longer than expected, how
many are failing, queue sizes...
33% HTTP EndPoint
33% Logs
33% No stats
Debugging
Consistency:
It has to be automatic
There have to be some guidelines and you have to be very strict
Traceability:
● Easily find all the requests belonging to the same call flow
● Identify the hierarchy (who is calling who)
sessionId == X OR sessionid == X OR session_id == X
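A sketch of what "automatic and consistent" could look like: one shared helper decides the identifier name, propagates it on outgoing calls and formats log lines, so nobody has to search for three spellings of the same field. The header name `X-Trace-Id` and the function names are hypothetical choices made for this illustration.

```python
import uuid

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name, fixed in one place


def incoming_trace_id(headers):
    """Reuse the caller's trace id, or start a new call flow if there is none."""
    return headers.get(TRACE_HEADER) or uuid.uuid4().hex


def outgoing_headers(trace_id, extra=None):
    """Headers for a downstream call, always carrying the same trace id."""
    headers = dict(extra or {})
    headers[TRACE_HEADER] = trace_id
    return headers


def log_line(trace_id, service, message):
    """Every service logs the id under the same key: trace_id."""
    return f"trace_id={trace_id} service={service} msg={message}"
```

Recording the parent call alongside the trace id would additionally give the hierarchy of who is calling whom.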
Debugging
Trace / Spans
This is
just too hard
Frameworks, Frameworks, Frameworks
DDIY
Boring is Good
Microservices Chassis
“If I have to eat someone else’s shit, I’d rather eat my own”
Wrap Up
“When you move to a microservices architecture, it
comes with this constant tax on your development cycle
that’s going to slow you down from that point on”
Acknowledgements
All the projects collaborating in the survey
References
HOW TO ADOPT MICROSERVICES
https://www.nginx.com/resources/library/oreilly-building-microservices/
Microservices Architecture: The Good, The Bad, and What You Could Be Doing Better
http://nordicapis.com/microservices-architecture-the-good-the-bad-and-what-you-could-be-doing-better/

Editor's Notes

  1. Understand the options you have when you need to communicate. Microservices is also about deployment or how you handle database inconsistencies. Overview of how / what we are doing today. Understand the implications of distributing your logic and how to minimize the problems. Survey, Experiences, Books, Finagle
  2. https://www.joyent.com/blog/container-native-discovery
  3. Compartmentalization
  4. Compartmentalization
  5. https://code.facebook.com/posts/215466732167400/wangle-an-asynchronous-c-networking-and-rpc-library/