SlideShare a Scribd company logo
How to scale a distributed (file) system
Atin Mukherjee
Gluster Hacker
SSE @ Red Hat
@mukherjee_atin
IRC : atinmu
Agenda
● Consensus in distributed system
● CAP theorem in distributed system
● Different distributed system design approaches
● Design challenges
● RAFT algorithm
● Consistent distributed store
● etcd
● Q & A
Consensus in distributed system
● Consensus – An agrement but for what and
between whom?
● For what → the op/transaction can be
committed or not
● Between whom → Answer is pretty simple, the
nodes forming the distributed system
● Quorum – (n/2) + 1
CAP theorem
● Any two of the following three gurantees
– Consistency (all nodes see the same data at the
same time)
– Availability (a guarantee that every request
receives a response about whether it succeeded or
failed)
– Partition tolerance (the system continues to operate
despite arbitrary message loss or failure of part of
the system)

Recommended for you

Pnuts yahoo!’s hosted data serving platform
Pnuts  yahoo!’s hosted data serving platformPnuts  yahoo!’s hosted data serving platform
Pnuts yahoo!’s hosted data serving platform

This document provides an overview of PNUTS, Yahoo!'s hosted data serving platform. The key points are: 1. PNUTS is a massively parallel and geographically distributed database system that provides a consistency model between full serializability and eventual consistency, through per-record timeline consistency. 2. It uses asynchronous geographic replication to replicas around the world for high availability and low latency updates. Notifications allow subscribers to stream updates. Bulk loading facilitates rapid data ingestion. 3. The architecture involves record-level mastering, storage units, tablet controllers, routers, and a message broker to propagate updates between regions while ensuring consistency.

pnutsyahoo!hosted data serving platform
Os examples scheduling
Os examples schedulingOs examples scheduling
Os examples scheduling

The document discusses scheduling in the Linux, Windows, and Solaris operating systems. In Linux, tasks are divided into scheduling classes and the scheduler selects the highest priority task in the highest priority class to run. The two main classes are the default class which uses Completely Fair Scheduler, and the real-time class. Windows uses a priority-based preemptive scheduler where the highest priority ready thread runs, and a higher priority thread can preempt a lower one. Solaris has six priority-based classes and uses different scheduling algorithms depending on the class, like time-sharing with dynamic time slices or real-time with static highest priority.

Hbase hivepig
Hbase hivepigHbase hivepig
Hbase hivepig

This document discusses NoSQL and big data processing. It provides background on scaling relational databases and introduces key concepts like the CAP theorem. The CAP theorem states that a distributed data store can only provide two out of three guarantees around consistency, availability, and partition tolerance. Many NoSQL systems sacrifice consistency for availability and partition tolerance, adopting an eventual consistency model instead of ACID transactions.

Distributed system design approaches
● No meta data – all nodes share across their
data
● Meta data server – One node holds data where
others fetches from it
So which one is better???
Probably none of them? Ask yourself for a
minute....
Challenges in design of a distributed
system
● No meta data server
– N * N exchange of Network messages
– Not scalable when N is probably in hundreds or
thousands
– Initialization time can be very high
– Can end up in a situation like “whom to believe,
whom not to” - popularly known as split brain
– How to undo a transaction locally
Challenges in design of a distributed
system - 2
● MDS (Meta data server)
– SPOF
Ahh!! so is this the only drawback??
– How about having replicas and then replica count??
– Additional N/W hop, lower performance
RAFT – A consensus algorithm
● Key functions
– Asymmetric – leader based
– Leader election
– Normal operation
– Safety and consistency after leader changes
– Neutralizing old leaders
– Client interactions
– Configuration changes

Recommended for you

Management on Cloud 2011
Management on Cloud 2011Management on Cloud 2011
Management on Cloud 2011

This document discusses session management challenges for web applications in elastic infrastructure as a service (IaaS) systems. It presents two architectural solutions - session monitoring and session migration - to support scaling down operations while maintaining sticky sessions. The authors implemented session monitoring in the open source Eucalyptus IaaS system by adding a load balancer, monitor, and control daemon to track sessions and safely remove servers from the cluster. Future work includes evaluating the prototype's performance.

iaasieee servicesautoscaling
Distributed Mutual exclusion algorithms
Distributed Mutual exclusion algorithmsDistributed Mutual exclusion algorithms
Distributed Mutual exclusion algorithms

1. There are two main approaches to distributed mutual exclusion - token-based and non-token based. Token based approaches use a shared token to allow only one process access at a time, while non-token approaches use message passing to determine access order. 2. A common token based algorithm uses a centralized coordinator process that grants access to the requesting process. Ring-based algorithms pass a token around a logical ring, allowing the process holding it to enter the critical section. 3. Lamport's non-token algorithm uses message passing of requests and timestamps to build identical request queues at each process, allowing the process at the head of the queue to enter the critical section. The Ricart-Agrawala

physical clocklogical clockclock drift
Distributed System
Distributed System Distributed System
Distributed System

- A distributed system is a collection of autonomous computers linked by a network that appear as a single computer. Inter-process communication allows processes running on different computers to exchange data. Common IPC methods include message passing, shared memory, and remote procedure calls. - Marshalling is the process of reformatting data to allow exchange between modules that use different data representations. Remote procedure calls allow a program to execute subroutines in another address space, such as on another computer. The client-server model partitions tasks between service providers (servers) and requesters (clients). - Election algorithms are used in distributed systems to choose a coordinator process from among a group of processes. Examples include the bully algorithm and ring

RAFT : Terms
● Divided into two parts
– Election
– Normal operation
● At most 1 leader per term
● Failed election - split vote
● Each server maintains current term value
● Identify obsolete information
RAFT : Server states
● Server states transition
RAFT : Replicated state machine
● A picture says thousand words...
RAFT : Different RPCs
● RequestVote RPCs – Candidate sends to other
nodes for electing itself as leader
● AppendEntries RPCs – Normal operation
workload
● AppendEntries RPCs with no message - Heart
beat messages – Leader sends to all followers
to make its presence

Recommended for you

Performance Engineering Requirements
Performance Engineering RequirementsPerformance Engineering Requirements
Performance Engineering Requirements

The document outlines the key performance testing requirements when using LoadRunner. It discusses factors that impact throughput like hardware specifications, software processing overhead, and degree of parallelism. It also defines different types of performance tests including load testing to determine maximum load, stress testing under deprived resources, soak testing under prolonged load, and scalability testing to determine if an application can scale. Finally, it lists requirements for scripts, parametrization, correlations, scenarios, think times, and monitoring tools for capacity planning when using LoadRunner for performance testing.

Database replication
Database replicationDatabase replication
Database replication

This is the Complete Information about Data Replication you need, i am focused on these topics: What is replication? Who use it? Types ? Implementation Methods?

data replicationdatabase replicationreplication
Making Automation Work
Making Automation WorkMaking Automation Work
Making Automation Work

StrikrSystemsLLP provides automation services to help identify automation requirements, plan proof of concepts and development work, and drive efficiency. They take a modular approach to design reusable automation solutions across organizations. Their process involves understanding existing systems and scripts, exploring automation opportunities, and designing data-driven solutions without resume-driven frameworks or outdated tools like Excel. They deliver integrated services and aim to eliminate user interaction from automated processes.

fossautomationdevops
RAFT : Leader Election
● current_term++
● Follower->Candidate
● Self vote
● Send request vote RPCs to all other servers, retry until either:
– Receive votes from majority of server
– Receive RPC from valid leader
– Election time out elapses – increment term
● Election properties
– Safety – allow at most one winner per term
– Liveness – some candidate must eventually win
Consistent distributed store
● A common consistent store which can be
shared by different nodes
● In the form of key value pair for ease of use
● Such distributed key value store
implementations are available.
etcd
● Named as /etc distributed
● Open source distributed consistent key value store
● Based on RAFT
● Highly available and reliable
● Sequentially consistent
● Watchable
● Exposed via HTTP
● Runtime reconfigurable (Saling feature)
● Durable (snapshot backup/restore)
● Time to live keys (have a time out)
Why etcd
● Vibrant community
● 500+ applications like kubernetes, cloud
foundry using it
● 150+ developers
● Stable releases

Recommended for you

OS Process and Thread Concepts
OS Process and Thread ConceptsOS Process and Thread Concepts
OS Process and Thread Concepts

This lecture covers process and thread concepts in operating systems including scheduling criteria and algorithms. It discusses key process concepts like process state, process control block and CPU scheduling. Common scheduling algorithms like FCFS, SJF, priority and round robin are explained. Process scheduling queues and the producer-consumer problem are also summarized. Evaluation methods for scheduling algorithms like deterministic modeling, queueing models and simulation are briefly covered.

Presentation on Transaction
Presentation on TransactionPresentation on Transaction
Presentation on Transaction

This presentation discusses database transactions. Key points: 1. A transaction must follow the properties of atomicity, consistency, isolation, and durability (ACID). It accesses and possibly updates data items while preserving a consistent database. 2. Transaction states include active, partially committed, failed, aborted, and committed. Atomicity and durability are implemented using a shadow database with a pointer to the current consistent copy. 3. Concurrent transactions are allowed for better throughput and response time. Concurrency control ensures transaction isolation to prevent inconsistent databases. 4. A schedule specifies the execution order of transaction instructions. A serializable schedule preserves consistency like a serial schedule. Conflict and view serializability are forms

Communication And Synchronization In Distributed Systems
Communication And Synchronization In Distributed SystemsCommunication And Synchronization In Distributed Systems
Communication And Synchronization In Distributed Systems

This document discusses various topics related to communication and synchronization in distributed systems. It covers concepts like communication protocols, remote procedure calls, client-server and peer-to-peer models, blocking vs non-blocking communication, reliability, group communication, message ordering, and synchronization techniques including clock synchronization algorithms, mutual exclusion algorithms, and atomic transactions.

Conclusion
● Use etcd sub cluster to store configuration data
● No burden on application to maintain
consistency
● And that's all!!
References
● https://raftconsensus.github.io/
● https://www.youtube.com/watch?
v=YbZ3zDzDnrw
● https://github.com/coreos/etcd#etcd
Q & A
THANK YOU

Recommended for you

Transaction concurrency control
Transaction concurrency controlTransaction concurrency control
Transaction concurrency control

This document discusses transaction management and concurrency control. It defines a transaction as a logical unit of work that must be completed or aborted with no intermediate states. It describes the ACID properties of atomicity, consistency, isolation, and durability that transactions should have. It also discusses concurrency control techniques like locking and time stamping to ensure transactions execute serially for consistency despite concurrent access.

transaction logacid propertieslock
Chapter 18 - Distributed Coordination
Chapter 18 - Distributed CoordinationChapter 18 - Distributed Coordination
Chapter 18 - Distributed Coordination

Event Ordering Mutual Exclusion Atomicity Concurrency Control Deadlock Handling Election Algorithms Reaching Agreement

distributedcoordinationsystems
Transactions and Concurrency Control
Transactions and Concurrency ControlTransactions and Concurrency Control
Transactions and Concurrency Control

Transactions and Concurrency Control in distributed systems. Transaction properties, classification, and transaction implementation. Flat, Nested, and Distributed transactions. Inconsistent Retrievals, Lost Update, Dirty Read, and Premature Writes Problem

distributed systemsconcurrency controlflat transactions

More Related Content

What's hot

Task migration in os
Task migration in osTask migration in os
Task migration in os
uos lahore pakistan
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systems
Ashwani Priyedarshi
 
Agreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemoryAgreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared Memory
SHIKHA GAUTAM
 
Pnuts yahoo!’s hosted data serving platform
Pnuts  yahoo!’s hosted data serving platformPnuts  yahoo!’s hosted data serving platform
Pnuts yahoo!’s hosted data serving platform
lammya aa
 
Os examples scheduling
Os examples schedulingOs examples scheduling
Os examples scheduling
Dana dia
 
Hbase hivepig
Hbase hivepigHbase hivepig
Hbase hivepig
Radha Krishna
 
Management on Cloud 2011
Management on Cloud 2011Management on Cloud 2011
Management on Cloud 2011
steccami
 
Distributed Mutual exclusion algorithms
Distributed Mutual exclusion algorithmsDistributed Mutual exclusion algorithms
Distributed Mutual exclusion algorithms
MNM Jain Engineering College
 
Distributed System
Distributed System Distributed System
Distributed System
Nitesh Saitwal
 
Performance Engineering Requirements
Performance Engineering RequirementsPerformance Engineering Requirements
Performance Engineering Requirements
srivinayak
 
Database replication
Database replicationDatabase replication
Database replication
Arslan111
 
Making Automation Work
Making Automation WorkMaking Automation Work
Making Automation Work
strikr .
 
OS Process and Thread Concepts
OS Process and Thread ConceptsOS Process and Thread Concepts
OS Process and Thread Concepts
sgpraju
 
Presentation on Transaction
Presentation on TransactionPresentation on Transaction
Presentation on Transaction
Rahul Prajapati
 
Communication And Synchronization In Distributed Systems
Communication And Synchronization In Distributed SystemsCommunication And Synchronization In Distributed Systems
Communication And Synchronization In Distributed Systems
guest61205606
 
Transaction concurrency control
Transaction concurrency controlTransaction concurrency control
Transaction concurrency control
Anand Grewal
 
Chapter 18 - Distributed Coordination
Chapter 18 - Distributed CoordinationChapter 18 - Distributed Coordination
Chapter 18 - Distributed Coordination
Wayne Jones Jnr
 
Transactions and Concurrency Control
Transactions and Concurrency ControlTransactions and Concurrency Control
Transactions and Concurrency Control
Dilum Bandara
 
Process Scheduling
Process SchedulingProcess Scheduling
Process Scheduling
Santhi thi
 
dos mutual exclusion algos
dos mutual exclusion algosdos mutual exclusion algos
dos mutual exclusion algos
Akhil Sharma
 

What's hot (20)

Task migration in os
Task migration in osTask migration in os
Task migration in os
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systems
 
Agreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemoryAgreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared Memory
 
Pnuts yahoo!’s hosted data serving platform
Pnuts  yahoo!’s hosted data serving platformPnuts  yahoo!’s hosted data serving platform
Pnuts yahoo!’s hosted data serving platform
 
Os examples scheduling
Os examples schedulingOs examples scheduling
Os examples scheduling
 
Hbase hivepig
Hbase hivepigHbase hivepig
Hbase hivepig
 
Management on Cloud 2011
Management on Cloud 2011Management on Cloud 2011
Management on Cloud 2011
 
Distributed Mutual exclusion algorithms
Distributed Mutual exclusion algorithmsDistributed Mutual exclusion algorithms
Distributed Mutual exclusion algorithms
 
Distributed System
Distributed System Distributed System
Distributed System
 
Performance Engineering Requirements
Performance Engineering RequirementsPerformance Engineering Requirements
Performance Engineering Requirements
 
Database replication
Database replicationDatabase replication
Database replication
 
Making Automation Work
Making Automation WorkMaking Automation Work
Making Automation Work
 
OS Process and Thread Concepts
OS Process and Thread ConceptsOS Process and Thread Concepts
OS Process and Thread Concepts
 
Presentation on Transaction
Presentation on TransactionPresentation on Transaction
Presentation on Transaction
 
Communication And Synchronization In Distributed Systems
Communication And Synchronization In Distributed SystemsCommunication And Synchronization In Distributed Systems
Communication And Synchronization In Distributed Systems
 
Transaction concurrency control
Transaction concurrency controlTransaction concurrency control
Transaction concurrency control
 
Chapter 18 - Distributed Coordination
Chapter 18 - Distributed CoordinationChapter 18 - Distributed Coordination
Chapter 18 - Distributed Coordination
 
Transactions and Concurrency Control
Transactions and Concurrency ControlTransactions and Concurrency Control
Transactions and Concurrency Control
 
Process Scheduling
Process SchedulingProcess Scheduling
Process Scheduling
 
dos mutual exclusion algos
dos mutual exclusion algosdos mutual exclusion algos
dos mutual exclusion algos
 

Similar to Manging scalability of distributed system

Consensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_systemConsensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_system
Atin Mukherjee
 
Storing the real world data
Storing the real world dataStoring the real world data
Storing the real world data
Athira Mukundan
 
Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemaker
kuchinskaya
 
Coordination in distributed systems
Coordination in distributed systemsCoordination in distributed systems
Coordination in distributed systems
Andrea Monacchi
 
Megastore by Google
Megastore by GoogleMegastore by Google
Megastore by Google
Ankita Kapratwar
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMS
Kathirvel Ayyaswamy
 
Distributed fun with etcd
Distributed fun with etcdDistributed fun with etcd
Distributed fun with etcd
Abdulaziz AlMalki
 
Spark 1.0
Spark 1.0Spark 1.0
Spark 1.0
Jatin Arora
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
Sakari Keskitalo
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
Sakari Keskitalo
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
Codership Oy - Creators of Galera Cluster
 
SVCC-2014
SVCC-2014SVCC-2014
SVCC-2014
John Brinnand
 
Data has a better idea the in-memory data grid
Data has a better idea   the in-memory data gridData has a better idea   the in-memory data grid
Data has a better idea the in-memory data grid
Bogdan Dina
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE Architectures
Alexander Penev
 
Distributed Database Management System - Introduction
Distributed Database Management System - IntroductionDistributed Database Management System - Introduction
Distributed Database Management System - Introduction
geethamarya1
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
LinkedIn
 
M|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for YouM|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for You
MariaDB plc
 
Square's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with RaftSquare's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with Raft
ScyllaDB
 
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
Umair Shahid
 
Multithreaded processors ppt
Multithreaded processors pptMultithreaded processors ppt
Multithreaded processors ppt
Siddhartha Anand
 

Similar to Manging scalability of distributed system (20)

Consensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_systemConsensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_system
 
Storing the real world data
Storing the real world dataStoring the real world data
Storing the real world data
 
Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemaker
 
Coordination in distributed systems
Coordination in distributed systemsCoordination in distributed systems
Coordination in distributed systems
 
Megastore by Google
Megastore by GoogleMegastore by Google
Megastore by Google
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMS
 
Distributed fun with etcd
Distributed fun with etcdDistributed fun with etcd
Distributed fun with etcd
 
Spark 1.0
Spark 1.0Spark 1.0
Spark 1.0
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
 
SVCC-2014
SVCC-2014SVCC-2014
SVCC-2014
 
Data has a better idea the in-memory data grid
Data has a better idea   the in-memory data gridData has a better idea   the in-memory data grid
Data has a better idea the in-memory data grid
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE Architectures
 
Distributed Database Management System - Introduction
Distributed Database Management System - IntroductionDistributed Database Management System - Introduction
Distributed Database Management System - Introduction
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 
M|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for YouM|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for You
 
Square's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with RaftSquare's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with Raft
 
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
 
Multithreaded processors ppt
Multithreaded processors pptMultithreaded processors ppt
Multithreaded processors ppt
 

More from Atin Mukherjee

GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized StoreGlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
Atin Mukherjee
 
Ready to go
Ready to goReady to go
Ready to go
Atin Mukherjee
 
Glusterd_thread_synchronization_using_urcu_lca2016
Glusterd_thread_synchronization_using_urcu_lca2016Glusterd_thread_synchronization_using_urcu_lca2016
Glusterd_thread_synchronization_using_urcu_lca2016
Atin Mukherjee
 
Gluster d2.0
Gluster d2.0Gluster d2.0
Gluster d2.0
Atin Mukherjee
 
Thread synchronization in GlusterD using URCU
Thread synchronization in GlusterD using URCUThread synchronization in GlusterD using URCU
Thread synchronization in GlusterD using URCU
Atin Mukherjee
 
GlusterD - Daemon refactoring
GlusterD - Daemon refactoringGlusterD - Daemon refactoring
GlusterD - Daemon refactoring
Atin Mukherjee
 
Gluster fs architecture_&_roadmap_atin_punemeetup_2015
Gluster fs architecture_&_roadmap_atin_punemeetup_2015Gluster fs architecture_&_roadmap_atin_punemeetup_2015
Gluster fs architecture_&_roadmap_atin_punemeetup_2015
Atin Mukherjee
 

More from Atin Mukherjee (7)

GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized StoreGlusterD 2.0 - Managing Distributed File System Using a Centralized Store
GlusterD 2.0 - Managing Distributed File System Using a Centralized Store
 
Ready to go
Ready to goReady to go
Ready to go
 
Glusterd_thread_synchronization_using_urcu_lca2016
Glusterd_thread_synchronization_using_urcu_lca2016Glusterd_thread_synchronization_using_urcu_lca2016
Glusterd_thread_synchronization_using_urcu_lca2016
 
Gluster d2.0
Gluster d2.0Gluster d2.0
Gluster d2.0
 
Thread synchronization in GlusterD using URCU
Thread synchronization in GlusterD using URCUThread synchronization in GlusterD using URCU
Thread synchronization in GlusterD using URCU
 
GlusterD - Daemon refactoring
GlusterD - Daemon refactoringGlusterD - Daemon refactoring
GlusterD - Daemon refactoring
 
Gluster fs architecture_&_roadmap_atin_punemeetup_2015
Gluster fs architecture_&_roadmap_atin_punemeetup_2015Gluster fs architecture_&_roadmap_atin_punemeetup_2015
Gluster fs architecture_&_roadmap_atin_punemeetup_2015
 

Recently uploaded

Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
welrejdoall
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
BookNet Canada
 
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALLBLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
Liveplex
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
ArgaBisma
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
Larry Smarr
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
SynapseIndia
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
BookNet Canada
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
Larry Smarr
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
Matthew Sinclair
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Bert Blevins
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
Mark Billinghurst
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
ishalveerrandhawa1
 
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Chris Swan
 
UiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs ConferenceUiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs Conference
UiPathCommunity
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Erasmo Purificato
 
7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf
Enterprise Wired
 
What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024
Stephanie Beckett
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Bert Blevins
 

Recently uploaded (20)

Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
 
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALLBLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
 
UiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs ConferenceUiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs Conference
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
 
7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf
 
What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
 

Manging scalability of distributed system

  • 1. How to scale a distributed (file) system Atin Mukherjee Gluster Hacker SSE @ Red Hat @mukherjee_atin IRC : atinmu
  • 2. Agenda ● Consensus in distributed system ● CAP theorem in distributed system ● Different distributed system design approaches ● Design challenges ● RAFT algorithm ● Consistent distributed store ● etcd ● Q & A
  • 3. Consensus in distributed system ● Consensus – An agrement but for what and between whom? ● For what → the op/transaction can be committed or not ● Between whom → Answer is pretty simple, the nodes forming the distributed system ● Quorum – (n/2) + 1
  • 4. CAP theorem ● Any two of the following three gurantees – Consistency (all nodes see the same data at the same time) – Availability (a guarantee that every request receives a response about whether it succeeded or failed) – Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
  • 5. Distributed system design approaches ● No meta data – all nodes share across their data ● Meta data server – One node holds data where others fetches from it So which one is better??? Probably none of them? Ask yourself for a minute....
  • 6. Challenges in design of a distributed system ● No meta data server – N * N exchange of Network messages – Not scalable when N is probably in hundreds or thousands – Initialization time can be very high – Can end up in a situation like “whom to believe, whom not to” - popularly known as split brain – How to undo a transaction locally
  • 7. Challenges in design of a distributed system - 2 ● MDS (Meta data server) – SPOF Ahh!! so is this the only drawback?? – How about having replicas and then replica count?? – Additional N/W hop, lower performance
  • 8. RAFT – A consensus algorithm ● Key functions – Asymmetric – leader based – Leader election – Normal operation – Safety and consistency after leader changes – Neutralizing old leaders – Client interactions – Configuration changes
  • 9. RAFT : Terms ● Divided into two parts – Election – Normal operation ● At most 1 leader per term ● Failed election - split vote ● Each server maintains current term value ● Identify obsolete information
  • 10. RAFT : Server states ● Server states transition
  • 11. RAFT : Replicated state machine ● A picture says thousand words...
  • 12. RAFT : Different RPCs ● RequestVote RPCs – Candidate sends to other nodes for electing itself as leader ● AppendEntries RPCs – Normal operation workload ● AppendEntries RPCs with no message - Heart beat messages – Leader sends to all followers to make its presence
  • 13. RAFT : Leader Election ● current_term++ ● Follower->Candidate ● Self vote ● Send request vote RPCs to all other servers, retry until either: – Receive votes from majority of server – Receive RPC from valid leader – Election time out elapses – increment term ● Election properties – Safety – allow at most one winner per term – Liveness – some candidate must eventually win
  • 14. Consistent distributed store ● A common consistent store which can be shared by different nodes ● In the form of key value pair for ease of use ● Such distributed key value store implementations are available.
  • 15. etcd ● Named as /etc distributed ● Open source distributed consistent key value store ● Based on RAFT ● Highly available and reliable ● Sequentially consistent ● Watchable ● Exposed via HTTP ● Runtime reconfigurable (Saling feature) ● Durable (snapshot backup/restore) ● Time to live keys (have a time out)
  • 16. Why etcd ● Vibrant community ● 500+ applications like kubernetes, cloud foundry using it ● 150+ developers ● Stable releases
  • 17. Conclusion ● Use etcd sub cluster to store configuration data ● No burden on application to maintain consistency ● And that's all!!
  • 19. Q & A