Sharing our best secrets: Design a distributed system from scratch
- 1. Adelina Simion & Nicole Gillett
Form3 & Cloudflare
Sharing our best secrets
Design a distributed system from scratch
- 2. Hello? Is this thing on?
Adelina Simion
• Technology Evangelist @ Form3
• Gopher since 2018
• Backend engineer since 2014
• Tweet me @classic_addetz
- 3. Hello? Is this thing on?
Nicole Gillett
• Systems Engineer @ Cloudflare
• Writin’ code since 2019
• Physics and Neuroscience before that
• Tweet me @nictakesnote
- 4. We’re taking on system architecture!
Our session will give you:
• A repeatable structured process
• An overview of the requirements and processes you should consider
• A summary of different technologies you should compare
- 5. We’re going on an adventure!
Remember that today it’s all about the journey, not the destination
The aim is to teach you how to take on the system architecture exercise
yourself, not just design TechyNotes
TechyNotes is only a toy problem, so focus on the process and the
discussions we will have
- 7. The top secret agenda
• Introduction to TechyNotes
• System Interface Definitions
• Databases & Storage
• First Solution
• Bottlenecks & Scalability
• Revised Solution
• Conclusions
- 8. Training plan
In each section we will:
• Introduce technical concepts
• Describe what each of them is used for
• Break out for a group exercise on worksheets
• Present our own solution for TechyNotes and answer questions
• Repeat for each section
You can find these slides at https://bit.ly/wit-sysdesign
- 10. Super Secret Login
First things first, it’s time to log in to your very own secret store of notes.
- 11. Write your secrets
TechyNotes is a note-taking app with the following flashy features:
• Save your notes
• Add attachments
• Share with your dev friends
- 12. Add an attachment
We accept files in a wide variety of shapes: JPEG, TXT, PDF, CSV etc.
Upload from a file, your email, or even your camera!
- 13. Share with friends
Share your note with a list of users, or send them a copy (if you don’t
trust their edits!)
If your friends aren’t on TechyNotes yet, send them a link so that they
can sign up to the app and see your note
- 14. Organise secrets
Secrets are saved in folders
Notes that have been shared with you show up in the Shared Notes
directory, which is a generated pseudo-directory
Once a note is shared, your conspirators have full access to it, just like
you do
- 15. See your secrets
See a list of notes in each folder
Notes have a title, creation date and edit date, and are shown with a
preview to make selection easier
- 16. Breakout
• What are the main models of the system?
• What functionality do we need to provide? What operations should
the user be able to do?
• What requirements can we specify for the system?
• What about constraints/things TechyNotes won’t do?
- 17. TechyNotes Functionality - Workbook
TechyNotes will allow users to:
- Create, edit, view and delete
their notes
-
TechyNotes will NOT allow users to:
- Share attachments only
-
- 19. The main players
From the mockups, we can see that the main models that comprise the
TechyNotes system are:
• Users
• Folders
• Notes
• Attachments
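One way to pin these four models down is to sketch them as plain data types. This is a minimal sketch, not the speakers' solution: every field name below is an assumption made for illustration, guided only by the mockups (title, creation date, edit date, a two-line preview, sharing with other users).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class User:
    user_id: str
    email: str  # users sign up and log in via email


@dataclass
class Folder:
    folder_id: str
    owner_id: str
    name: str


@dataclass
class Attachment:
    attachment_id: str
    note_id: str
    filename: str
    storage_key: str  # hypothetical pointer to where the file bytes live


@dataclass
class Note:
    note_id: str
    owner_id: str
    folder_id: str
    title: str
    body: str = ""
    created_at: datetime = field(default_factory=_now)
    edited_at: datetime = field(default_factory=_now)
    shared_with: list[str] = field(default_factory=list)  # ids of other users

    def preview(self, lines: int = 2) -> str:
        """Return the first `lines` lines of the body for list views."""
        return "\n".join(self.body.splitlines()[:lines])
```

Writing the models out like this makes the dependencies visible early: an attachment cannot exist without a note, and a note always belongs to one folder and one owner.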
- 20. The plan of attack - Part 1
TechyNotes will allow users to:
• Sign up and log in to their account via email
• Create, edit, view and delete their notes
• Preview the first 2 lines of their note
• Organize notes in folders
• Add, view and delete attachments from existing notes
• Share notes with other users via email
• View and edit other users’ shared notes
• Notify users via email when a note is shared
- 21. The plan of attack - Part 2
TechyNotes will NOT allow users to:
• View notes unless they are signed in
• View or search through the user list
• Share notes at different privilege levels
• Share entire folders
• Search through notes and folders
• Share attachments only
• Add attachments to notes at the same time as creating them
• See multiple versions of notes
• Restore deleted notes
- 23. What is an API?
API stands for Application Programming Interface
APIs allow applications to talk to each other through predefined contracts
- 24. REST
REST is an acronym for Representational State Transfer
• Client-server
• Stateless
• Uniform, easy to understand interface
• Layered system
• Multiple endpoints per model
- 25. HTTP verbs
HTTP defines a set of verbs to indicate the desired action to be performed
for a given resource
• The GET method requests a representation of the specified resource
• The POST method is used to submit an entity to the specified resource,
often causing a change in state or side effects on the server
• The PUT method replaces all current representations of the target
resource with the request payload
• The DELETE method deletes the specified resource
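To make the four verbs concrete, here is a minimal sketch of how they could map onto a notes resource. It is not a real web server: the `handle` function, the `/notes` path and the in-memory dictionary are all assumptions made for illustration, and a real service would sit behind an HTTP framework and a database.

```python
import itertools

_notes = {}                # in-memory stand-in for a real data store
_ids = itertools.count(1)  # hypothetical id generator


def handle(method, path, payload=None):
    """Return (status_code, body) for a request against /notes."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "notes" or len(parts) > 2:
        return 404, None
    note_id = parts[1] if len(parts) == 2 else None

    if note_id is None:  # the collection: /notes
        if method == "GET":
            return 200, [{"id": i, **n} for i, n in _notes.items()]
        if method == "POST":  # submit a new entity, changing server state
            new_id = str(next(_ids))
            _notes[new_id] = dict(payload or {})
            return 201, {"id": new_id, **_notes[new_id]}
        return 405, None

    if note_id not in _notes:  # a single resource: /notes/{id}
        return 404, None
    if method == "GET":
        return 200, {"id": note_id, **_notes[note_id]}
    if method == "PUT":  # replace the whole representation
        _notes[note_id] = dict(payload or {})
        return 200, {"id": note_id, **_notes[note_id]}
    if method == "DELETE":
        del _notes[note_id]
        return 204, None
    return 405, None
```

Note that PUT here replaces the stored note entirely rather than merging fields, which matches the definition above.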
- 26. Status codes
HTTP response status codes indicate the outcome of an HTTP request
• 200 OK
• 4xx: Client errors e.g. unauthorized; not found; too many requests
• 5xx: Server errors
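The status code families can be sketched with Python's standard `http.HTTPStatus` enum. The `status_class` helper is a hypothetical name introduced here; the numeric ranges are the standard HTTP ones.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Classify a status code by its leading digit."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"


for status in (
    HTTPStatus.OK,
    HTTPStatus.UNAUTHORIZED,
    HTTPStatus.NOT_FOUND,
    HTTPStatus.TOO_MANY_REQUESTS,
    HTTPStatus.INTERNAL_SERVER_ERROR,
):
    print(status.value, status.phrase, "->", status_class(status))
```

A client can use the family to decide what to do next: fix the request on a 4xx, or retry later on a 5xx.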
- 27. The power of GraphQL
• GraphQL is a query language for your API, and a server-side
runtime for queries
• Services typically run at a single URL on a web service and receive
GraphQL queries
• Clients take control of the data by requesting precisely the fields
they want, but this does require a defined schema
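The "request precisely what you want" idea can be illustrated without a GraphQL server. The sketch below is not a GraphQL implementation: it only shows an example query a client might send and a hypothetical `select_fields` helper that mimics field selection on a record. The `note` field and its sub-fields are assumptions, not part of any TechyNotes schema.

```python
# An example GraphQL query a client might send. The schema behind it
# (a `note` field with `title` and `editedAt`) is assumed for illustration.
QUERY = """
{
  note(id: "n1") {
    title
    editedAt
  }
}
"""


def select_fields(record: dict, fields: list) -> dict:
    """Return only the requested fields, the way a GraphQL response
    contains only what the query asked for."""
    return {name: record[name] for name in fields if name in record}


stored_note = {
    "id": "n1",
    "title": "First secret",
    "body": "A long body the list view does not need",
    "editedAt": "2022-01-01T10:00:00Z",
}

response = select_fields(stored_note, ["title", "editedAt"])
```

With a REST endpoint the client would usually receive the whole note, body included, and discard what it does not need.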
- 28. Breakout
• What endpoints do we need to power the functionality we
already identified?
• Are there any dependencies between our types?
• How should they be represented in our endpoints?
• Any specific tech we can commit to? REST vs GraphQL?
- 32. Choosing the best tool for the job
• We propose REST endpoints as the domain is limited and the
operations well defined
• All operations in our solution are converted to RESTful endpoints
using the domain entities as resources
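As a rough illustration of "domain entities as resources", the operations could be laid out as a route table like the one below. The exact paths are assumptions for this sketch and may differ from the endpoints presented in the session; `match` is a hypothetical helper that resolves a request to its operation.

```python
# (method, path template, operation). All paths are illustrative assumptions.
ROUTES = [
    ("POST", "/users", "sign up"),
    ("POST", "/sessions", "log in"),
    ("GET", "/folders/{folder_id}/notes", "list notes with previews"),
    ("POST", "/notes", "create a note"),
    ("GET", "/notes/{note_id}", "view a note"),
    ("PUT", "/notes/{note_id}", "edit a note"),
    ("DELETE", "/notes/{note_id}", "delete a note"),
    ("POST", "/notes/{note_id}/attachments", "add an attachment"),
    ("GET", "/notes/{note_id}/attachments/{attachment_id}", "view an attachment"),
    ("DELETE", "/notes/{note_id}/attachments/{attachment_id}", "delete an attachment"),
    ("POST", "/notes/{note_id}/shares", "share a note via email"),
]


def match(method: str, path: str):
    """Return the operation for a request, or None if no route fits."""
    segments = path.strip("/").split("/")
    for verb, template, operation in ROUTES:
        parts = template.strip("/").split("/")
        if verb != method or len(parts) != len(segments):
            continue
        # A "{placeholder}" part matches any segment; others must be equal.
        if all(p.startswith("{") or p == s for p, s in zip(parts, segments)):
            return operation
    return None
```

Nesting attachments and shares under `/notes/{note_id}` reflects the dependency between the types: neither exists without a note.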
- 34. New phone, who dis SQL?
• Relational databases
• Structured queries for looking up data
• Predefined schema for the data
• Data stored once - normalisation
• Usual suspects: Postgres, Oracle, MySQL
- 35. SQL - what is it good for?
• Ideal for consistent systems ✓
• Long history of usage and support ✓
• Scalability and sharding difficulties ✗
• Predefined schema can be a constraint ✗
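A minimal sketch of the relational approach, using Python's built-in `sqlite3` module as a stand-in for Postgres or MySQL. The table and column names are assumptions made for illustration. It shows a predefined schema, data stored once and joined at query time, and the database itself rejecting inconsistent data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce references
conn.executescript(
    """
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE notes (
        id       INTEGER PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES users(id),
        title    TEXT NOT NULL,
        body     TEXT NOT NULL DEFAULT ''
    );
    """
)

conn.execute("INSERT INTO users (id, email) VALUES (1, 'ada@example.com')")
conn.execute("INSERT INTO notes (owner_id, title) VALUES (1, 'First secret')")

# The email is stored once, in users, and joined in when needed.
rows = conn.execute(
    "SELECT users.email, notes.title "
    "FROM notes JOIN users ON users.id = notes.owner_id"
).fetchall()

# A note whose owner does not exist violates the schema and is rejected.
try:
    conn.execute("INSERT INTO notes (owner_id, title) VALUES (99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```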
- 36. New phone, who dis NoSQL?
• No predefined schema
• Can store much larger amounts of data
• Stores blobs of unstructured data that can be in any format
• Data can be stored multiple times
• Usual suspects: DynamoDB, Redis, Cassandra
- 37. NoSQL - what is it good for?
• No limits on types of data to store ✓
• Easier to scale by design ✓
• Excellent for big data analytics ✓
• Less support ✗ & less mature tools ✗
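For contrast, here is a minimal sketch of the document-style approach, with a plain dictionary of JSON strings standing in for a store like DynamoDB. The key format and document shapes are assumptions for illustration. Two notes with different shapes live side by side, and the owner's email is copied into each document rather than stored once.

```python
import json

store = {}  # key -> JSON string, a stand-in for a real key-value/document store


def put(key: str, document: dict) -> None:
    store[key] = json.dumps(document)


def get(key: str):
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None


# No schema: each document carries whatever fields it needs.
put("note:1", {
    "title": "First secret",
    "owner": {"id": "u1", "email": "ada@example.com"},
    "tags": ["go", "distributed"],
})
put("note:2", {
    "title": "Shopping",
    "owner": {"id": "u1", "email": "ada@example.com"},  # duplicated on purpose
    "checklist": [{"item": "milk", "done": False}],
})
```

Reading a note needs a single lookup and no join, but changing the owner's email now means updating every document that carries a copy.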
- 38. Cloud storage
• Store shared files in the cloud
• Save files & metadata together
• Services save & secure your data and you interact with their APIs
• No query engine, no relational data
- 39. Breakout
• What are your main services?
• What kind of database is best? SQL vs NoSQL?
• How will data travel through your system?
• What is the simplest solution we can start out with?
• What functionality will we buy and what will we build?
- 44. Scaling your database
• Sharding to distribute data over multiple databases.
• NoSQL databases are typically designed to scale horizontally (more machines), whilst SQL
databases are traditionally scaled vertically (a bigger machine)
• Read replicas for read-heavy applications, which increase read throughput.
• Cold storage solutions for data that has not been accessed in some time,
for example as part of an archive feature for notes.
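Sharding can be sketched as a function that maps a key to one of several databases. Below is a minimal sketch using a hash of the note id; the `shard_for` helper and the four in-memory dictionaries standing in for database shards are assumptions made for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Four dictionaries stand in for four separate databases.
shards = [dict() for _ in range(4)]


def save_note(note_id: str, note: dict) -> None:
    shards[shard_for(note_id, len(shards))][note_id] = note


def load_note(note_id: str):
    return shards[shard_for(note_id, len(shards))].get(note_id)


for i in range(100):
    save_note(f"note-{i}", {"title": f"Secret {i}"})
```

One caveat of plain modulo hashing is that changing the number of shards moves most keys to a different shard; consistent hashing is a common way to reduce that movement.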
- 45. Load balancer
Load balancing refers to the process of distributing a set of tasks over a set
of resources, with the aim of making their overall processing more efficient.
• We can add load balancing between client and application servers, or
between the application and the database
• We can start with a simple Round Robin approach, where requests are
distributed equally among servers. If a server is dead, the LB will no
longer send it any traffic
• Public cloud providers offer managed load balancers
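The Round Robin approach, including skipping dead servers, can be sketched in a few lines. The class and method names are assumptions for illustration; in practice a health check would call `mark_down` and `mark_up`, and a managed load balancer would do all of this for you.

```python
class RoundRobinBalancer:
    """Hand out servers in turn, skipping any marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["notes-1", "notes-2", "notes-3"])
first_four = [balancer.pick() for _ in range(4)]
```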
- 46. Cache
A cache (pronounced cash) is used to temporarily store data so that it can
be accessed quickly.
• If the data is not stored in the cache, a cache miss occurs and the data is
fetched from the slower backing store (for TechyNotes, the database) and
then stored in the cache.
• If we follow the 80/20 rule, that 80% of our traffic is generated by 20% of
our notes, then we want to cache that hot 20% of notes.
• Usually implemented using key value stores such as Redis and
Memcached
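The hit and miss flow can be sketched with a small in-process cache. This is a minimal sketch assuming a least-recently-used eviction policy, which the slides do not specify; the `NoteCache` name and the `load_from_db` callback are hypothetical, and a real deployment would more likely use Redis or Memcached.

```python
from collections import OrderedDict


class NoteCache:
    """Read-through cache that evicts the least recently used note."""

    def __init__(self, capacity, load_from_db):
        self._capacity = capacity
        self._load = load_from_db  # called only on a cache miss
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, note_id):
        if note_id in self._items:
            self.hits += 1
            self._items.move_to_end(note_id)  # mark as recently used
            return self._items[note_id]
        self.misses += 1
        note = self._load(note_id)            # slow path: go to the database
        self._items[note_id] = note
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # drop the least recently used
        return note


database = {"n1": "First secret", "n2": "Second secret", "n3": "Third secret"}
cache = NoteCache(capacity=2, load_from_db=database.__getitem__)
cache.get("n1")  # miss, loaded from the database
cache.get("n1")  # hit, served from the cache
```

Sizing the capacity at roughly 20% of the notes follows the 80/20 reasoning above.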
- 47. Queue
A message queue is a form of asynchronous service-to-service
communication.
• Queues can be used to sync devices that go offline. For example, if an
edit is made to a note, that change will stay in the queue until the device
comes back online
• Messages are stored on the queue until they are processed and deleted,
allowing us to regulate throughput in our system
• Examples include Amazon SQS and RabbitMQ
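The producer and consumer sides of a queue can be sketched with Python's standard `queue` and `threading` modules. This in-process queue is only a stand-in for a service such as Amazon SQS or RabbitMQ, and the `sync_edits` function and edit format are assumptions for illustration.

```python
import queue
import threading


def sync_edits(pending_edits):
    """Push edits onto a queue and let a worker apply them one at a time."""
    edits = queue.Queue()
    applied = []

    def worker():
        while True:
            edit = edits.get()       # blocks until a message is available
            if edit is None:         # sentinel: no more work
                edits.task_done()
                break
            applied.append(edit)     # "process" the message
            edits.task_done()        # then acknowledge it

    consumer = threading.Thread(target=worker)
    consumer.start()

    for edit in pending_edits:       # the producer does not wait for processing
        edits.put(edit)
    edits.put(None)

    edits.join()                     # wait until every message is acknowledged
    consumer.join()
    return applied


result = sync_edits([{"note_id": "n1", "seq": i} for i in range(3)])
```

Because the producer only enqueues, a slow or temporarily offline consumer delays processing without losing edits, which is the throughput regulation described above.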
- 48. Group discussion
• Looking at our first solution, what are bottlenecks that wouldn’t
scale as TechyNotes goes viral?
• Which strategies can we use to alleviate some of these issues?
• What new technologies should we introduce to our system?
- 51. TechyNotes architecture 2.0
• Load balancer to distribute load across multiple instances of
Notes Service.
• Read replica to ease the load on the Notes & Users Database.
- 53. TechyNotes architecture 2.2
• We could add an archive functionality to handle notes that haven’t been
accessed in a while. Old attachments could be moved to cold storage.
- 54. A note on system updates
• Start with a simple solution, understand the usage of your system
• Add complexity and costs where needed
• Design an update strategy that does not require down time
• Use feature flagging to slowly update your system
• Monitoring and logging are your BFFs
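Feature flagging for a gradual update can be sketched as a percentage rollout. This is one common approach, assumed here for illustration rather than taken from the session; the `is_enabled` helper and the flag name are hypothetical.

```python
import hashlib


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Decide whether a user sees a feature, stable across requests."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < rollout_percent


# Roll the hypothetical "new-editor" feature out to 10% of users first.
users = [f"user-{i}" for i in range(1000)]
early_adopters = [u for u in users if is_enabled("new-editor", u, 10)]
```

Hashing the flag together with the user id keeps each user's experience consistent, and raising the percentage only ever adds users, so nobody loses a feature they already had during the rollout.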
- 56. Recap
• Use this process for your system designs
• Start with your requirements
• Discuss your models and APIs
• Start with a simple solution
• Modify according to what bottlenecks are important to you
• Remember, no system is perfect!