We will show the advantages of having a geo-distributed database cluster and how to create one using Galera Cluster for MySQL. We will also discuss the configuration and status variables that are involved and how to deal with typical situations on the WAN such as slow, untrusted or unreliable links, latency and packet loss. We will demonstrate a multi-region cluster on Amazon EC2 and perform some throughput and latency measurements in real-time (video http://galeracluster.com/videos/using-galera-replication-to-create-geo-distributed-clusters-on-the-wan-webinar-video-3/)
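Below is a minimal monitoring sketch, assuming a reachable cluster node, a monitoring account, and the PyMySQL driver (all placeholders), that polls the Galera status variables most relevant on a WAN: cluster membership, the replication queues, and the fraction of time flow control paused replication because a remote node could not keep up.

```python
# Poll the wsrep status variables that reveal WAN trouble: growing queues or a
# non-zero flow-control fraction usually mean link latency or packet loss is
# throttling the whole cluster. Host and credentials below are placeholders.
import pymysql

WSREP_VARS = (
    "wsrep_cluster_size",         # nodes currently in the cluster
    "wsrep_cluster_status",       # Primary / non-Primary component
    "wsrep_local_recv_queue",     # writesets waiting to be applied locally
    "wsrep_local_send_queue",     # writesets waiting to be replicated out
    "wsrep_flow_control_paused",  # fraction of time replication was paused
)

conn = pymysql.connect(host="10.0.0.11", user="monitor", password="secret")
with conn.cursor() as cur:
    cur.execute(
        "SHOW GLOBAL STATUS WHERE Variable_name IN (%s, %s, %s, %s, %s)",
        WSREP_VARS,
    )
    for name, value in cur.fetchall():
        print(f"{name}: {value}")
conn.close()
```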
MySQL clustering on top of the InnoDB storage engine has matured considerably over the last decade. Galera brought replication to InnoDB early on, and Group Replication arrived later; both feature sets are now rich and robust. This presentation offers a technical comparison of the two.
Storm is a distributed and fault-tolerant realtime computation system. It was created at BackType/Twitter to analyze tweets, links, and users on Twitter in realtime. Storm provides scalability, reliability, and ease of programming, and builds on components such as Zookeeper, ØMQ, and Thrift. A Storm topology defines the flow of data between spouts, which read data, and bolts, which process it. Through its reliability API, Storm guarantees that every tuple is processed, so no data is lost even during failures.
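To make the spout/bolt terminology concrete, here is a toy, plain-Python illustration of that dataflow; it deliberately does not use the real Storm API, it only mimics the roles of a spout emitting tuples and bolts transforming them.

```python
# Toy spout -> bolt pipeline: a spout emits raw tuples (tweets), one bolt splits
# them into words, a second bolt keeps running counts. In Storm the same wiring
# would be declared as a topology and executed across the cluster.
def tweet_spout():
    """Spout: the data source; here just a hard-coded stream of fake tweets."""
    for tweet in ["storm analyzes tweets in realtime", "storm guarantees processing"]:
        yield tweet

def split_bolt(tweets):
    """Bolt: split each incoming tweet into words and emit them downstream."""
    for tweet in tweets:
        for word in tweet.split():
            yield word

def count_bolt(words):
    """Bolt: terminal step of this topology, keeping running word counts."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts

# Wiring: spout -> split bolt -> count bolt.
print(count_bolt(split_bolt(tweet_spout())))
```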
This document introduces HBase, an open-source, non-relational, distributed database modeled after Google's BigTable. It describes what HBase is, how it can be used, and when it is applicable. Key points include that HBase stores data in rows and columns accessed by row key, integrates with Hadoop for MapReduce jobs, and is well suited to large datasets, fast random access, and write-heavy applications. Common use cases include log analytics, real-time analytics, and message-centered systems.
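The row-key access pattern is easy to see from a client: here is a short sketch assuming an HBase Thrift gateway on localhost, the happybase Python client, and a pre-created table named page_views with a column family cf (all of which are assumptions for illustration).

```python
# Every read and write in HBase is addressed by a row key plus
# column-family:qualifier columns; there is no secondary index by default.
import happybase

connection = happybase.Connection("localhost")   # HBase Thrift gateway (assumed)
table = connection.table("page_views")           # assumed to exist with family 'cf'

# Write: one row keyed by user id + timestamp, values in column family 'cf'.
table.put(b"user42#20240101", {b"cf:url": b"/index.html", b"cf:latency_ms": b"123"})

# Read: fast random access straight back by the same row key.
row = table.row(b"user42#20240101")
print(row[b"cf:url"])

connection.close()
```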
The document discusses two MySQL high availability solutions: MySQL InnoDB Cluster and MySQL NDB Cluster. MySQL InnoDB Cluster provides easy high availability built into MySQL with write consistency, read scalability, and application failover using MySQL Router. MySQL NDB Cluster is an in-memory database that provides automatic sharding, native access via several APIs, read/write consistency, and read/write scalability using the NDB storage engine. The document compares the two solutions and discusses their architectures and key features.
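For the InnoDB Cluster side, the workflow looks roughly like the sketch below. It is meant to run inside MySQL Shell's Python mode (mysqlsh --py), where the shell and dba objects are predefined; the hostnames and the clusteradmin account are placeholders, not part of the original document.

```python
# Build a three-node InnoDB Cluster with the AdminAPI; MySQL Router can then be
# bootstrapped against it so applications fail over transparently.
shell.connect("clusteradmin@node1:3306")          # connect to the seed instance
cluster = dba.create_cluster("myCluster")         # make it a one-node cluster
cluster.add_instance("clusteradmin@node2:3306")   # grow to three nodes
cluster.add_instance("clusteradmin@node3:3306")
print(cluster.status())                           # topology and health overview
```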
- Galera is a MySQL clustering solution that provides true multi-master replication with synchronous replication and no single point of failure.
- It allows high availability, data integrity, and elastic scaling of databases across multiple nodes.
- Companies like Percona and MariaDB have integrated Galera to provide highly available database clusters.
MaxScale is a database proxy that provides load balancing, connection pooling, and replication capabilities for MariaDB and MySQL databases. It can be used to scale databases horizontally across multiple servers for increased performance and availability. The document provides an overview of MaxScale concepts and capabilities such as routing, filtering, and security features, and how it can be used for operational tasks like query caching, logging, and data streaming. It also includes instructions on setting up MaxScale, with a basic example of configuring read/write splitting between master and slave database servers.
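From the application's point of view, read/write splitting is largely invisible: the app connects to MaxScale's listener instead of to any one database server. The sketch below assumes a readwritesplit listener on port 4006 and uses the PyMySQL driver; the host, port, schema, and credentials are placeholders that depend on the actual maxscale.cnf.

```python
# The application talks only to MaxScale; MaxScale routes the INSERT to the
# master and the SELECT to a slave behind the scenes.
import pymysql

conn = pymysql.connect(host="maxscale-host", port=4006,
                       user="app", password="secret", database="shop")
with conn.cursor() as cur:
    cur.execute("INSERT INTO orders (item) VALUES ('book')")  # routed to the master
    conn.commit()
    cur.execute("SELECT COUNT(*) FROM orders")                # routed to a slave
    print(cur.fetchone())
conn.close()
```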
Galera Cluster for MySQL, Percona XtraDB Cluster and MariaDB Cluster (the three “flavours” of Galera Cluster) make use of the Galera WSREP libraries to handle synchronous replication. MySQL Cluster is the official clustering solution from Oracle, while Galera Cluster for MySQL is slowly but surely establishing itself as the de facto clustering solution in the wider MySQL ecosystem. In this webinar, we will look at all these alternatives and present an unbiased view of their strengths/weaknesses and the use cases that fit each alternative. This webinar will cover the following:
- MySQL Cluster architecture: strengths and limitations
- Galera architecture: strengths and limitations
- Deployment scenarios
- Data migration
- Read and write workloads (optimistic/pessimistic locking; see the retry sketch below)
- WAN/geographical replication
- Schema changes
- Management and monitoring
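One practical consequence of Galera's optimistic approach is worth a sketch: a transaction that loses certification against a conflicting write-set from another node is rolled back and reported to the client as a deadlock (MySQL error 1213), so applications are expected to retry. The example below assumes the PyMySQL driver and a toy accounts table; both are placeholders.

```python
# Retry loop for Galera certification failures, which surface as deadlock
# errors (1213) at commit time under optimistic locking.
import pymysql

def transfer(conn, amount, retries=3):
    for attempt in range(retries):
        try:
            with conn.cursor() as cur:
                cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = 1", (amount,))
                cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = 2", (amount,))
            conn.commit()
            return
        except pymysql.MySQLError as exc:
            conn.rollback()
            code = exc.args[0] if exc.args else None
            if code != 1213 or attempt == retries - 1:  # not a deadlock, or retries exhausted
                raise
```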
This presentation briefly describes the key features of Apache Cassandra. It was held at the Apache Cassandra Meetup in Vienna in January 2014. You can access the meetup here: http://www.meetup.com/Vienna-Cassandra-Users/
A coordination service like Zookeeper helps distributed applications coordinate by providing common services like synchronization, configuration sharing, naming, and leader election. Zookeeper uses an ensemble of servers running as a cluster. It stores data in a hierarchical namespace of znodes. Clients can read and write znodes, set watches on znodes to get notified of changes, and rely on Zookeeper to handle session and server failures in a transparent way. Some common usage recipes for Zookeeper include barriers for synchronization, cluster management using ephemeral znodes, queues using sequential znodes, locks for mutual exclusion, and leader election.
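A brief sketch of those primitives, assuming a local ZooKeeper ensemble on 127.0.0.1:2181 and the kazoo Python client (the paths and identifiers are made up for illustration):

```python
# Hierarchical znodes, an ephemeral znode for cluster membership, a data watch
# for change notification, and the lock recipe for mutual exclusion.
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Ephemeral + sequential member znode: vanishes if this client's session dies.
zk.ensure_path("/app/workers")
zk.create("/app/workers/worker-", b"host1", ephemeral=True, sequence=True)

# Watch: get called back whenever the configuration znode changes.
zk.ensure_path("/app/config")

@zk.DataWatch("/app/config")
def on_config_change(data, stat):
    print("config changed:", data)

# Lock recipe for mutual exclusion across processes.
with zk.Lock("/app/locks/resource", "worker-1"):
    pass  # critical section

zk.stop()
```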
Nowadays, people are creating, sharing and storing data at a faster pace than ever before, and effective data compression/decompression can significantly reduce the cost of data usage. Apache Spark is a general distributed computing engine for big data analytics, and because it stores and shuffles large amounts of data across the cluster at runtime, the compression/decompression codecs can affect end-to-end application performance in many ways. However, there is a trade-off between storage size and compression/decompression throughput (CPU computation). Balancing compression speed and ratio is an interesting problem, particularly while both the software algorithms and the CPU instruction set keep evolving. Apache Spark provides a very flexible compression codec interface with default implementations such as GZip, Snappy, LZ4 and ZSTD, and the Intel Big Data Technologies team has also implemented additional codecs for Apache Spark based on the latest Intel platform, such as ISA-L (igzip), LZ4-IPP, Zlib-IPP and ZSTD. In this session, we compare the characteristics of those algorithms and implementations by running different micro workloads as well as end-to-end workloads on different generations of Intel x86 platforms and disks. The result is intended as a best-practice guide for big data software engineers choosing the proper compression/decompression codecs for their applications, and we also present methodologies for measuring and tuning the performance bottlenecks of typical Apache Spark workloads.
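As a small illustration of where that choice is actually made, the PySpark sketch below sets the two most common codec knobs; the codec names shown are the stock Spark options, and which one wins depends on the CPU/disk trade-off measured for the workload at hand.

```python
# spark.io.compression.codec covers shuffle/spill/broadcast data;
# spark.sql.parquet.compression.codec covers Parquet output files.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("codec-comparison")
    .config("spark.io.compression.codec", "zstd")              # or lz4, snappy, lzf
    .config("spark.sql.parquet.compression.codec", "snappy")   # or gzip, zstd, ...
    .getOrCreate()
)

df = spark.range(1_000_000)                                    # toy dataset
df.write.mode("overwrite").parquet("/tmp/codec_test")          # compressed output
spark.stop()
```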
Alluxio Day VIII, December 14, 2021 (https://www.alluxio.io/alluxio-day/). Speaker: Ryan Blue, Apache Iceberg.
This document provides an overview of Apache Kafka. It begins by defining Kafka as a distributed streaming platform and messaging system, then lists the agenda: what Kafka is, why it is used, common use cases, major companies that use it, how it achieves high performance, and core concepts. The core concepts explained include topics, partitions, brokers, replication, leaders, and producers and consumers, with examples to illustrate each.
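A minimal producer/consumer sketch ties those concepts together; it assumes a broker on localhost:9092, the kafka-python client, and a page-views topic, all placeholders for illustration.

```python
# A producer writes keyed records to a topic (same key -> same partition);
# a consumer joins a group and reads the records back with partition/offset.
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("page-views", key=b"user-42", value=b"/index.html")
producer.flush()

consumer = KafkaConsumer(
    "page-views",
    bootstrap_servers="localhost:9092",
    group_id="analytics",              # consumers in one group share the partitions
    auto_offset_reset="earliest",
)
for record in consumer:
    print(record.partition, record.offset, record.key, record.value)
    break
```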