Adelina Simion & Nicole Gillett
Form3 & Cloudflare
Sharing our best secrets 🤐
Design a distributed system from scratch 🏗
Hello? Is this thing on? 🎤
Adelina Simion
• Technology Evangelist @ Form3
• Gopher since 2018
• Backend engineer since 2014
• Tweet me @classic_addetz
Hello? Is this thing on? 🎤
Nicole Gillett
• Systems Engineer @ Cloudflare
• Writin’ code since 2019
• Physics and Neuroscience before that
• Tweet me @nictakesnote
We’re taking on system architecture! 💪
Our session will give you:
• A repeatable structured process
• An overview of the requirements and processes you should consider
• A summary of different technologies you should compare
We’re going on an adventure! 🧭
Remember that today it’s all about the journey, not the destination.
The aim is to teach you how to take on the system architecture exercise yourself, not just design TechyNotes.
TechyNotes is only a toy problem, so focus on the process and the discussions we will have.
Session overview
The top secret agenda 🤭
• Introduction to TechyNotes 🆕
• System Interface Definitions
• Databases & Storage 📚
• First Solution 🥇
• Bottlenecks & Scalability 🚀
• Revised Solution 👑
• Conclusions 🏁
Training plan 🎖
In each section we will:
• Introduce technical concepts
• Describe the use of each of them
• Break out into groups for an exercise on worksheets
• Present our own solution for TechyNotes and answer questions
• Repeat for each section
You can find these slides at https://bit.ly/wit-sysdesign
Introduction to TechyNotes 🆕
Super Secret Login
First things first, it’s time to log in to your very own secret store of notes.
Write your secrets 🤐
TechyNotes is a note-taking app with the following flashy features:
● Save your notes
● Add attachments
● Share with your dev friends 😎
Add an attachment 📦
We accept files in a wide variety of shapes: JPEG, TXT, PDF, CSV, etc.
Upload from a file, your email, or even your camera! 📸
Share with friends ❤
Share your note with a list of users, or send them a copy (if you don’t trust their edits!)
If your friends aren’t on TechyNotes yet, send them a link so that they can sign up to the app and see your note 🥳
Organise secrets 🗂
Secrets are saved in folders
Notes that have been shared with you show up in the Shared Notes directory, which is a generated pseudo-directory
Once shared, your conspirators have full access to your notes, just like you do 😎
See your secrets 🙈
See a list of notes in each folder
Notes have a title, creation date and edit date, and are shown with a preview to make selection easier
Breakout 🎳
● What are the main models of the system?
● What functionality do we need to provide? What operations should the user be able to do?
● What requirements can we specify for the system?
● What about constraints/things TechyNotes won’t do?
TechyNotes Functionality - Workbook 📖
TechyNotes will allow users to:
- Create, edit, view and delete
their notes
-
TechyNotes will NOT allow users to:
- Share attachments only
-
TechyNotes Functionality
The main players
From the mockups, we can see that the main models that comprise the TechyNotes system are:
● Users 📱💻
● Folders 🗂
● Notes 📝
● Attachments 🖼
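As a rough illustration, the four models could be sketched as plain data classes. The field names below are assumptions based on the mockup descriptions (title, creation date, edit date, sharing); the IDs and the way the models reference each other are hypothetical, not the workshop's actual design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


def _now() -> datetime:
    """Current time in UTC, used as the default timestamp."""
    return datetime.now(timezone.utc)


@dataclass
class User:
    id: str
    email: str  # users sign up and log in via email


@dataclass
class Folder:
    id: str
    owner_id: str  # refers to User.id
    name: str


@dataclass
class Attachment:
    id: str
    note_id: str  # attachments always belong to an existing note
    filename: str
    content_type: str  # e.g. "image/jpeg" or "application/pdf"


@dataclass
class Note:
    id: str
    folder_id: str  # refers to Folder.id
    owner_id: str  # refers to User.id
    title: str
    body: str = ""
    created_at: datetime = field(default_factory=_now)
    edited_at: datetime = field(default_factory=_now)
    # IDs of users the note is shared with (assumed representation)
    shared_with: List[str] = field(default_factory=list)
```

Writing the models down this early makes the dependencies visible: a note needs a folder and an owner, and an attachment needs a note.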
The plan of attack - Part 1 🚁
TechyNotes will allow users to:
• Sign up and log in to their account via email
• Create, edit, view and delete their notes
• Preview the first 2 lines of their note
• Organise notes in folders
• Add, view and delete attachments from existing notes
• Share notes with other users via email
• View and edit other users’ shared notes
• Notify users via email when a note is shared
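Even a small requirement such as the two-line preview hides a decision or two. A minimal sketch, assuming that blank lines are skipped and that surrounding whitespace is trimmed (neither is specified in the requirements):

```python
def preview(body: str, max_lines: int = 2) -> str:
    """Return the first max_lines non-blank lines of a note body."""
    # Skipping blank lines is an assumption, so that the preview is not empty
    # when a note starts with a few newlines.
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    return "\n".join(lines[:max_lines])


print(preview("Shopping list\n\nMilk\nEggs\nBread"))
```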
The plan of attack - Part 2 🚁
TechyNotes will NOT allow users to:
• View notes unless they are signed in
• View or search through the user list
• Share notes at different privilege levels
• Share entire folders
• Search through notes and folders
• Share attachments only
• Add attachments to notes at the same time as creating them
• See multiple versions of notes
• Restore deleted notes
System Interface Definitions
What is an API? 🤔
API stands for Application Programming Interface
APIs allow applications to talk to each other through predefined contracts
REST 😴
REST is an acronym for Representational State Transfer
• Client-server
• Stateless
• Uniform, easy to understand interface
• Layered system
• Multiple endpoints per model
HTTP verbs
HTTP defines a set of verbs to indicate the desired action to be performed for a given resource
• The GET method requests a representation of the specified resource
• The POST method is used to submit an entity to the specified resource, often causing a change in state or side effects on the server
• The PUT method replaces all current representations of the target resource with the request payload
• The DELETE method deletes the specified resource
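To make the four verbs concrete, here is a minimal sketch that maps them onto an in-memory notes store. The `handle` function, the `notes` dictionary and the choice of returned status codes are assumptions for illustration only; a real service would sit behind an HTTP framework.

```python
import itertools
from typing import Optional

notes: dict = {}  # in-memory stand-in for the notes store
_ids = itertools.count(1)  # hypothetical ID generator


def handle(method: str, note_id: Optional[str] = None, payload: Optional[dict] = None):
    """Return (status_code, body) for a request on /notes or /notes/{note_id}."""
    if method == "POST":  # submit a new entity to the /notes collection
        new_id = str(next(_ids))
        notes[new_id] = dict(payload or {})
        return 201, {"id": new_id, **notes[new_id]}
    if method == "GET":  # fetch a representation of the resource
        return (200, notes[note_id]) if note_id in notes else (404, None)
    if method == "PUT":  # replace the whole resource with the payload
        if note_id not in notes:
            return 404, None
        notes[note_id] = dict(payload or {})
        return 200, notes[note_id]
    if method == "DELETE":  # remove the resource
        return (204, None) if notes.pop(note_id, None) is not None else (404, None)
    return 405, None  # verb not supported by this sketch


status, created = handle("POST", payload={"title": "Shopping list"})
print(status, created)
print(handle("PUT", created["id"], {"title": "Groceries"}))
print(handle("DELETE", created["id"]))
```

Note that PUT replaces the stored note entirely rather than merging fields, which matches the "replaces all current representations" wording above.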
Status codes ✅❌
HTTP response status codes indicate the outcome of an HTTP request
• 2xx: Success, e.g. 200 OK 🎉
• 4xx: Client errors, e.g. unauthorized; not found; too many requests
• 5xx: Server errors
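The status code classes can be explored with Python's standard `http.HTTPStatus` enum. The `status_class` helper below is a hypothetical name introduced for this sketch.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Map a numeric status code to the broad class described above."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"


for status in (
    HTTPStatus.OK,
    HTTPStatus.UNAUTHORIZED,
    HTTPStatus.NOT_FOUND,
    HTTPStatus.TOO_MANY_REQUESTS,
    HTTPStatus.INTERNAL_SERVER_ERROR,
):
    print(status.value, status.phrase, "->", status_class(status.value))
```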
The power of GraphQL 💣
• GraphQL is a query language for your API, and a server-side runtime for executing those queries
• A GraphQL service typically runs at a single URL and receives GraphQL queries there
• Clients take control of the data by requesting precisely what they want, but this does require a defined schema
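The idea of "requesting precisely what data they want" can be illustrated without any GraphQL library. The sketch below is not a GraphQL implementation: it only mimics field selection on a plain dictionary, and the `select` function, the record and the query in the comment are all hypothetical.

```python
def select(record: dict, selection: dict) -> dict:
    """Keep only the requested fields; a nested dict selects sub-fields."""
    result = {}
    for name, sub in selection.items():
        value = record[name]
        if sub and isinstance(value, list):
            result[name] = [select(item, sub) for item in value]
        elif sub:
            result[name] = select(value, sub)
        else:
            result[name] = value
    return result


note = {
    "id": "n1",
    "title": "Shopping list",
    "body": "Milk\nEggs",
    "attachments": [{"id": "a1", "filename": "receipt.pdf", "size": 1024}],
}

# Roughly what a client might express in GraphQL, given a suitable schema:
#   { note(id: "n1") { title attachments { filename } } }
selection = {"title": None, "attachments": {"filename": None}}
print(select(note, selection))
```

With REST, the same client would usually receive the full note representation and discard the fields it does not need.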
Breakout 🎳
● What endpoints do we need to power the functionality we already identified?
● Are there any dependencies between our types?
● How should they be represented in our endpoints?
● Any specific tech we can commit to? REST vs GraphQL?
TechyNotes API - Workbook 📖
TechyNotes API Endpoints
Proposed TechyNotes API 🎯
Choosing the best tool for the job 🛠
• We propose REST endpoints as the domain is limited and the operations are well defined
• All operations in our solution are converted to RESTful endpoints using the domain entities as resources
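The endpoint diagram from the slide is not reproduced in this text, so the route table below is only a guess at what "domain entities as resources" could look like for the Part 1 functionality. Every path, the `ROUTES` table and the `match` helper are hypothetical.

```python
# (verb, path template, action) - all paths are illustrative assumptions
ROUTES = [
    ("POST", "/users", "sign up"),
    ("POST", "/sessions", "log in"),
    ("GET", "/folders/{folder_id}/notes", "list notes in a folder"),
    ("POST", "/folders/{folder_id}/notes", "create a note"),
    ("GET", "/notes/{note_id}", "view a note"),
    ("PUT", "/notes/{note_id}", "edit a note"),
    ("DELETE", "/notes/{note_id}", "delete a note"),
    ("POST", "/notes/{note_id}/attachments", "add an attachment"),
    ("DELETE", "/notes/{note_id}/attachments/{attachment_id}", "delete an attachment"),
    ("POST", "/notes/{note_id}/shares", "share a note by email"),
]


def match(method: str, path: str):
    """Return (action, path_params) for the first matching route, else None."""
    parts = path.strip("/").split("/")
    for verb, template, action in ROUTES:
        template_parts = template.strip("/").split("/")
        if verb != method or len(template_parts) != len(parts):
            continue
        params = {}
        for expected, actual in zip(template_parts, parts):
            if expected.startswith("{") and expected.endswith("}"):
                params[expected[1:-1]] = actual  # capture a path parameter
            elif expected != actual:
                break  # literal segment differs, try the next route
        else:
            return action, params
    return None


print(match("POST", "/notes/42/shares"))
```

Nesting attachments and shares under `/notes/{note_id}` reflects the constraints from Part 2: attachments cannot exist or be shared without a note.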
Databases & Storage 📚
New phone, who dis SQL? 📲
• Relational databases
• Structured queries for looking up data
• Predefined schema for the data
• Data stored once - normalisation
• Usual suspects: Postgres, Oracle, MySQL
SQL - what is it good for?
● Ideal for consistent systems ✅
● Long history of usage and support ✅
● Scalability and sharding difficulties ❌
● Predefined schema can be a constraint ❌
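A predefined schema and normalisation are easy to see with Python's built-in `sqlite3` module. The table and column names below are assumptions for TechyNotes; SQLite stands in for Postgres, Oracle or MySQL only because it needs no server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE notes (
        id       INTEGER PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES users(id),
        title    TEXT NOT NULL
    );
    """
)

# The email is stored once in users; notes only hold a reference to it.
conn.execute("INSERT INTO users (id, email) VALUES (1, 'ada@example.com')")
conn.execute("INSERT INTO notes (owner_id, title) VALUES (1, 'Shopping list')")

rows = conn.execute(
    "SELECT notes.title, users.email "
    "FROM notes JOIN users ON users.id = notes.owner_id"
).fetchall()
print(rows)
```

The schema is both the strength and the constraint: the database rejects a note whose owner does not exist, but every change of shape needs a migration.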
New phone, who dis NoSQL? 📲
• No predefined schema
• Can store much larger amounts of data
• Stores blobs of unstructured data that can be in any format
• Data can be stored multiple times - denormalisation
• Usual suspects: DynamoDB, Redis, Cassandra
NoSQL - what is it good for?
• No limits on types of data to store ✅
• Easier to scale by design ✅
• Excellent for big data analytics ✅
• Less support ❌ & less mature tools ❌
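For contrast, a schemaless store can be imitated with a dictionary of JSON documents. This is only a stand-in for a document or key-value database such as those listed above; the `put` and `get` helpers and the key format are assumptions.

```python
import json

store: dict = {}  # key -> JSON string, a stand-in for a key-value/document store


def put(key: str, document: dict) -> None:
    store[key] = json.dumps(document)


def get(key: str) -> dict:
    return json.loads(store[key])


# The owner's email is copied into every note (stored multiple times),
# and the two documents do not share the same set of fields.
put("note:1", {
    "title": "Shopping list",
    "owner": {"id": "u1", "email": "ada@example.com"},
    "attachments": [{"filename": "receipt.pdf"}],
})
put("note:2", {
    "title": "Ideas",
    "owner": {"id": "u1", "email": "ada@example.com"},
    "tags": ["draft"],
})

print(get("note:2"))
```

Reading a note needs no join, but changing the owner's email now means updating every document that embeds it.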
Cloud storage 🗃
• Store shared files in the cloud
• Save files & metadata together
• Services save & secure your data and you interact with their APIs
• No query engine, no relational data
Breakout 🎳
● What are your main services?
● What kind of database is best? SQL vs NoSQL?
● How will data travel through your system?
● What is the simplest solution we can start out with?
● What functionality will we buy and what will we build?
First solution architecture - Workbook 📖
First solution 🥇
Simple TechyNotes architecture 🎯
Bottlenecks & Scalability 🚀
Scaling your database 🚀
• Sharding to distribute data over multiple databases.
• NoSQL databases are typically designed to scale horizontally, whilst SQL databases are more commonly scaled vertically 💰
• Read replicas for read-heavy applications, which increase read throughput.
• Cold storage solutions for data that has not been accessed in some time, for example as part of an archive functionality for notes.
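Sharding needs a rule that decides which database holds which data. A minimal sketch, assuming we shard by user ID with a hash and a fixed list of shards; the shard names and the choice of key are hypothetical.

```python
import hashlib

SHARDS = ["notes-db-0", "notes-db-1", "notes-db-2"]  # hypothetical database names


def shard_for(user_id: str) -> str:
    """Pick a shard deterministically from the user ID."""
    # A stable hash (rather than Python's per-process hash()) keeps the
    # mapping identical across servers and restarts.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


for user_id in ("user-1", "user-2", "user-3"):
    print(user_id, "->", shard_for(user_id))
```

Sharding by user keeps one user's notes together, but shared notes may then span shards. Simple modulo hashing also remaps most keys when the number of shards changes, which is one reason consistent hashing is often considered instead.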
Load balancer ⚖
Load balancing refers to the process of distributing a set of tasks over a set of resources, with the aim of making their overall processing more efficient.
• We can add load balancing between the client and the application servers, or between the application and the database
• We can start with a simple Round Robin approach, where requests are distributed equally among servers. If a server is dead, the LB will no longer send it any traffic
• Public cloud providers offer load balancers as managed services
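The Round Robin approach, including skipping dead servers, fits in a few lines. The class and method names are assumptions for this sketch; in practice health would be determined by periodic health checks rather than manual calls.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["notes-1", "notes-2", "notes-3"])
print([lb.next_server() for _ in range(4)])
lb.mark_down("notes-2")
print([lb.next_server() for _ in range(4)])
```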
Cache 💸
A cache (pronounced cash) is used to temporarily store data so that it can be accessed quickly.
• If the data is not stored in the cache, a cache miss occurs and the data is fetched from the underlying data store (for us, the database) and stored in the cache.
• If we follow the 80/20 rule, that 80% of our traffic is generated by 20% of our notes, then we want to cache that hot 20% of notes.
• Usually implemented using key-value stores such as Redis and Memcached
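The hit and miss behaviour can be sketched with a small read-through cache that evicts the least recently used entry when full. This is an in-process illustration with assumed names, not a Redis or Memcached client; the `load` callback stands in for a database query.

```python
from collections import OrderedDict


class ReadThroughCache:
    """On a miss, load from the backing store and keep the result."""

    def __init__(self, load, capacity):
        self.load = load  # function that fetches a value from the database
        self.capacity = capacity
        self.items = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.hits += 1
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]
        self.misses += 1
        value = self.load(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
        return value


database = {"n1": "Shopping list", "n2": "Ideas", "n3": "Secrets"}
cache = ReadThroughCache(database.__getitem__, capacity=2)
for note_id in ("n1", "n1", "n2", "n1", "n3"):
    cache.get(note_id)
print("hits:", cache.hits, "misses:", cache.misses)
```

Sizing `capacity` to hold roughly the hot 20% of notes is how the 80/20 rule above would translate into configuration.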
Queue
A message queue is a form of asynchronous service-to-service communication.
• Queues can be used to sync devices that go offline. For example, if an edit is made to a note, that change will stay in the queue until the device comes back online
• Messages are stored on the queue until they are processed and deleted, allowing us to regulate throughput in our system
• Examples include Amazon SQS and RabbitMQ
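The offline-device example can be sketched with an in-memory queue per device. This is only an illustration with assumed names; a managed queue such as the ones listed above would add persistence, retries and delivery guarantees.

```python
from collections import deque


class DeviceQueue:
    """Holds edits for one device until that device is able to process them."""

    def __init__(self):
        self.pending = deque()

    def publish(self, edit: dict) -> None:
        self.pending.append(edit)

    def drain(self, apply) -> int:
        """Deliver queued edits in order; return how many were processed."""
        processed = 0
        while self.pending:
            edit = self.pending[0]
            apply(edit)  # if this raises, the edit stays on the queue
            self.pending.popleft()  # delete only after successful processing
            processed += 1
        return processed


queue = DeviceQueue()
# Edits arrive while the device is offline...
queue.publish({"note_id": "n1", "body": "Milk"})
queue.publish({"note_id": "n1", "body": "Milk, eggs"})

# ...and are applied in order once it comes back online.
applied = []
print(queue.drain(applied.append), "edits applied")
```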
Group discussion 🎳
● Looking at our first solution, what are bottlenecks that wouldn’t scale as TechyNotes goes viral?
● Which strategies can we use to alleviate some of these issues?
● What new technologies should we introduce to our system?
TechyNotes goes viral! 🚀
Revised Solution 👑
TechyNotes architecture 2.0 🎯🎯
● Load balancer to distribute load across multiple instances of Notes Service.
● Read replica to ease the load on the Notes & Users Database.
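A read replica only helps if reads are actually sent to it. A minimal routing sketch, assuming that any statement starting with SELECT is a read; the class name and the connection names are hypothetical.

```python
import itertools


class DatabaseRouter:
    """Send writes to the primary and spread reads over the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(list(replicas) or [primary])

    def target_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


router = DatabaseRouter("notes-primary", ["notes-replica-1", "notes-replica-2"])
print(router.target_for("SELECT title FROM notes WHERE id = 1"))
print(router.target_for("UPDATE notes SET title = 'x' WHERE id = 1"))
```

Replicas usually lag slightly behind the primary, so a user might briefly not see their own latest edit. Reads that must be fresh can be routed to the primary.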
TechyNotes architecture 2.1 🎯🎯
● We could switch to a NoSQL database, which is typically easier to scale horizontally.
TechyNotes architecture 2.2 🎯🎯
● We could add an archive functionality to handle notes that haven’t been accessed in a while. Old attachments could be moved to cold storage.
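Deciding what counts as "not accessed in a while" is a policy choice. The sketch below assumes a one-year threshold and a per-note last-accessed timestamp; both are assumptions, as is the function name.

```python
from datetime import datetime, timedelta, timezone


def split_for_archive(last_accessed: dict, now: datetime,
                      max_idle: timedelta = timedelta(days=365)):
    """Split note IDs into (hot, cold) based on when they were last accessed."""
    hot, cold = [], []
    for note_id, accessed_at in last_accessed.items():
        (cold if now - accessed_at > max_idle else hot).append(note_id)
    return hot, cold


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
last_accessed = {
    "n1": now - timedelta(days=10),
    "n2": now - timedelta(days=400),
}
hot, cold = split_for_archive(last_accessed, now)
print("keep:", hot, "archive:", cold)
```

A background job could run this periodically and move the attachments of cold notes to cheaper storage, at the price of slower access when an old note is reopened.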
A note on system updates 🏰
• Start with a simple solution, understand the usage of your system
• Add complexity and costs where needed
• Design an update strategy that does not require downtime
• Use feature flagging to roll out updates to your system gradually
• Monitoring and logging are your BFFs
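One common way to update slowly with feature flags is a percentage rollout. A minimal sketch, assuming users are bucketed by a hash of the flag name and user ID; the flag name and function are hypothetical, and dedicated feature-flag services offer far more control.

```python
import hashlib


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the flag is on for this user at the given rollout level."""
    # Hashing flag and user together gives each user a stable bucket (0-99)
    # per flag, so raising the percentage only ever adds users.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent


users = [f"user-{i}" for i in range(1000)]
for percent in (0, 10, 50, 100):
    enabled = sum(is_enabled("new-editor", u, percent) for u in users)
    print(f"{percent}% rollout -> {enabled} of {len(users)} users")
```

Watching monitoring and logs at each step before raising the percentage is what turns this into a safe rollout rather than just a slow one.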
Conclusions 🏁
Recap 🏁
• Use this process for your system designs
• Start with your requirements
• Discuss your models and APIs
• Start with a simple solution
• Modify according to what bottlenecks are important to you
• Remember, no system is perfect!
Thanks for coming to our workshop! 🙌
Wireframes by Well Nice Studio