The document discusses the challenges of scaling social games to millions of daily active users. It describes how the company scaled from 170,000 daily users to over 1,000,000 by:
1) Moving data and queries from MySQL databases to Redis to improve performance and handle higher volumes.
2) Sharding high-volume tables and migrating data to distribute load across multiple database servers.
3) Using load balancers and adding application servers to scale the architecture horizontally as more users were added.
17. Early August
[Chart: query time in ms, sampled from 6:00 to 8:10, y-axis 0–30 ms]
• The MySQL hiccup
• Every 70 minutes, query time spikes 7x
18. Hiccup causes
Who is periodically blocking MySQL?
• Code (app + plugins + Rails)?
• Some periodic job?
• The devil (AWS)?
19. Hiccup quick fix
• We shard out the top queried table (40% of all queries)
[Diagram: four MySQL shard servers, shard 1 through shard 4]
20. Hiccup quick fix
• We shard out the top queried table (40% of all queries)
[Diagram: each of the four shards holds the top table plus the other tables]
21. Hiccup quick fix
• MySQL likes it
• The “top table” shards will go a long way in the scaling process
[Diagram: each of the four shards holds the top table plus the other tables]
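A minimal sketch of how such shard routing could look, assuming the user id is the shard key with stable modulo routing (the deck does not specify the actual scheme, so the names and logic here are illustrative):

```ruby
# Hypothetical shard routing for the "top table": modulo on the user id,
# so the same user always lands on the same shard.
SHARD_COUNT = 4

def shard_for(user_id)
  "shard_#{user_id % SHARD_COUNT + 1}"
end
```

In an application this would pick the database connection, e.g. `shard_for(123)` routes user 123 to `"shard_4"`.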
22. Hiccup causes
Who is periodically blocking MySQL?
• Code (app + plugins + Rails)?
• Some periodic job?
• The devil (AWS)?
None of the Above
23. Hiccup real cause
• An emergent MySQL internal behavior at high volume
• MySQL periodically flushes its buffer to disk
• Under heavy write IO, the flush blocks queries
24. Hiccup solution
• The Percona MySQL patches (XtraDB) avoid the blocking behavior
• The query time profile gets smooth
• The IO capacity limit now manifests as gradual performance decay instead of spikes
25. Write through cache
• Memcache in front of MySQL
• Evaluated before sharding
• Discarded because of our read/write ratio
27. Write through cache
With our write-heavy ratio, it means that ~90% of the time an operation is:
1. read cache
2. write cache
3. write SQL
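The three steps above can be sketched as a tiny write-through store. The two in-memory hashes here are stand-ins for memcache and MySQL, an assumption made only so the sketch is self-contained:

```ruby
# Write-through cache sketch: every write goes to the cache AND to SQL.
class WriteThroughStore
  def initialize
    @cache = {}   # stands in for memcache
    @sql   = {}   # stands in for MySQL
  end

  # 1. read cache (fill from SQL on a miss)
  def read(key)
    @cache.fetch(key) { @cache[key] = @sql[key] }
  end

  # 2. write cache, then 3. write SQL (synchronous unless made async)
  def write(key, value)
    @cache[key] = value
    @sql[key]   = value
  end
end
```

With a write-heavy ratio, nearly every operation still pays the synchronous SQL write, which is why the approach was discarded.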
28. Write through cache
Performance is bound to:
• Read heavy: memcache performance
• Write heavy: MySQL writes (unless async); is the write-through lib optimized for writes?
29. MySQL
• Sharding SQL is a painful way to scale
• Data migrations at high load imply downtime
• The ACID benefits are all lost, either because of sharding or in the name of performance
30. Redis
• A persistent cache
• Fast: ~60,000 qps on AWS hardware
• Interesting data structures, not only key-value
• Already some small-scale experience in house
31. Redis adoption
• Which data to start from?
• How do we migrate without downtime?
• Which Ruby-object-to-Redis-structure library?
32. Redis adoption
• Which data to start from?
• The best data fit for Redis hashes
• The 3rd most queried table
• A collection of integer fields that only need increment/decrement
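A sketch of that counter pattern: one Redis hash per user, with integer fields bumped via HINCRBY. The `FakeRedis` class below is an in-memory stand-in so the sketch runs without a server; with the redis-rb gem the calls look the same, and the key/field names are illustrative:

```ruby
# In-memory stand-in exposing the two Redis commands the pattern needs.
class FakeRedis
  def initialize
    @data = Hash.new { |h, k| h[k] = Hash.new(0) }
  end

  # Mirrors Redis HINCRBY: atomically increments an integer hash field.
  def hincrby(key, field, amount)
    @data[key][field] += amount
  end

  # Mirrors Redis HGET for a single field.
  def hget(key, field)
    @data[key][field]
  end
end

redis = FakeRedis.new
redis.hincrby("user:123:counters", "coins", 5)
redis.hincrby("user:123:counters", "coins", -2)
```

Because HINCRBY is atomic on the server, concurrent app servers can bump the same counter without a read-modify-write race.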
33. Redis adoption
• How do we migrate without downtime?
• Migrate one user at a time
• Use a Redis set to keep track of migrated / non-migrated users
• No downtime, transparent to users
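The lazy, one-user-at-a-time migration can be sketched like this; Ruby's Set and plain Hashes stand in for the Redis set, the SQL table, and the Redis store, and all names are illustrative:

```ruby
require "set"

migrated = Set.new                         # stands in for the Redis set
mysql    = { 123 => { "coins" => 40 } }    # original rows, keyed by user id
redis    = {}                              # migrated counters

# On each access: migrate the user if the set says they haven't moved yet.
def fetch_counters(user_id, migrated, mysql, redis)
  unless migrated.include?(user_id)        # SISMEMBER in real Redis
    redis[user_id] = mysql.delete(user_id) # read from MySQL, write to Redis
    migrated << user_id                    # SADD in real Redis
  end
  redis[user_id]
end
```

The first access migrates the user transparently; every later access hits Redis only, so users never see downtime.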
34. Redis adoption
• How do we migrate without downtime?
[Diagram: RoR server between MySQL (holding User 123) and Redis]
35. Redis adoption
• How do we migrate without downtime?
[Diagram: the RoR server reads User 123’s original data from MySQL]
36. Redis adoption
• How do we migrate without downtime?
[Diagram: the RoR server writes User 123’s migrated data to Redis]
37. Redis adoption
• How do we migrate without downtime?
• The lazy migration might never complete
• Combine SQL data with the Redis set information to generate a final batch migration
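One way the final batch could be generated, diffing the ids still in SQL against the Redis set; this is an illustrative sketch with made-up data, not the deck's actual code:

```ruby
require "set"

all_user_ids = [1, 2, 3, 4, 5]     # e.g. from a SELECT of ids still in MySQL
migrated     = Set.new([2, 4])     # contents of the Redis "migrated" set

# Users never touched by the lazy migration get moved in one final batch.
remaining = all_user_ids.reject { |id| migrated.include?(id) }
```

`remaining` is then fed to the same per-user migration code, run as a batch job.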
38. Redis 1st result
10% of the query load from 4 MySQL servers is moved to 1 Redis server
Redis server load is 0.05
39. Redis
• Becomes the tool to use
• Migration plan for all write intensive data
• Migrate one “class” at a time
40. Redis honeymoon end
• Memory usage grows faster than the data
• Snapshotting to disk causes spikes in query time
• Starting new slaves eats memory on the master node
41. Redis honeymoon end
Russian Roulette Feeling
• Redis machines sized with overabundant RAM
• A rigorous slave/master starting plan
42. Redis
• The Redis team acknowledges the persistence/replication problems
• The Redis 2.4 diskstore plan starts