Deep Dive into Amazon ElastiCache Architecture and Design Patterns (DAT307) | AWS re:Invent 2013

DAT307 - Deep Dive into Amazon ElastiCache
Architecture and Design Patterns
Nate Wiger, Principal Solutions Architect
November 14, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Contents
• Caching: What’s all this then?
• Amazon ElastiCache
• Laziness, impatience, and hubris
• From one to a dozen nodes
• Memcached vs. Redis showdown

Device Fragmentation
• Phones, tablets, PCs, toasters
• HTML, apps, JSON APIs
• Presentation differs
• Data is the same
• CDN for static images, videos
• Doesn’t help “Welcome Back, Kotter!”

Death By 1000 Queries
• Login, session
• New messages, recent posts
• Calls to Facebook, Twitter APIs
• Your friends love the new Coldplay album!!!
• Sudden viral traffic spikes

cache (noun)
a group of things that have been stored in a secret
place because they are illegal or have been stolen

Typical Web 2.0 App
(Diagram; labels: External APIs, ELB, App)

Amazon ElastiCache
• Managed cache service
• Memcached or Redis
• Launch cluster of nodes
• Scale up / down
• Monitoring + alerts

Memcached
• In-memory
• Slab allocator
• Multithreaded
• No persistence
• Gold standard

Wire It Up
# Ruby
require 'dalli'
# Connect to both cache nodes; Dalli distributes keys across them
cache = Dalli::Client.new([
  'mycache.z2vq55.0001.usw2.cache.amazonaws.com:11211',
  'mycache.z2vq55.0002.usw2.cache.amazonaws.com:11211'
])
cache.set("some_key", "Some value")
value = cache.get("some_key")
cache.set("another_key", 3)
cache.delete("another_key")

Multiple Cache Nodes
(Diagram; labels: External APIs, ELB, App)

Sharding Across Nodes
server_list = [
]
server_index = hash(key) % server_list.length
server = server_list[server_index]

BAD: with modulo sharding, adding or removing a node changes server_list.length, so most keys map to a different server and the cache is effectively emptied.
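To make the problem with the modulo scheme concrete, here is a minimal sketch that counts how many keys change servers when a cluster grows from two nodes to three. The node names and key format are hypothetical, and md5 is used only as a convenient stable hash, not because any particular client library uses it this way.

```python
import hashlib

def stable_hash(key):
    # md5 gives the same value on every run, unlike Python's salted built-in hash()
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

def modulo_server(key, server_list):
    # Same scheme as the slide: hash(key) % number of servers
    return server_list[stable_hash(key) % len(server_list)]

def fraction_remapped(keys, before, after):
    # Share of keys whose server changes between the two server lists
    moved = sum(1 for k in keys if modulo_server(k, before) != modulo_server(k, after))
    return moved / len(keys)

# Hypothetical node names and keys, for illustration only
two_nodes = ["node-1:11211", "node-2:11211"]
three_nodes = two_nodes + ["node-3:11211"]
keys = ["user:%d" % i for i in range(10000)]

moved = fraction_remapped(keys, two_nodes, three_nodes)
print("%.0f%% of keys changed servers" % (moved * 100))
```

Going from two nodes to three, a key keeps its server only when the hash gives the same remainder under both divisors, which happens for roughly one key in three. The rest become cache misses all at once.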

It’s All Been Done Before
• Ruby – Dalli
• Python – HashRing / MemcacheRing
• Node.js – node-memcached
• PHP – libketama or ElastiCache Client
• Java – SpyMemcached or ElastiCache Client
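Client libraries like these typically solve the sharding problem with consistent hashing. The following is a simplified sketch of the idea, not the implementation used by any of the libraries listed: the class name HashRing is reused here only for illustration, and the node names, the md5 hash, and the 100 points per server are all assumptions.

```python
import bisect
import hashlib

def stable_hash(value):
    # Stable across runs, unlike Python's salted built-in hash()
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    """Simplified consistent hash ring (illustrative only)."""

    def __init__(self, servers, points_per_server=100):
        # Each server is placed at many pseudo-random points on the ring
        ring = []
        for server in servers:
            for i in range(points_per_server):
                ring.append((stable_hash("%s#%d" % (server, i)), server))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._servers = [server for _, server in ring]

    def get_server(self, key):
        # A key belongs to the first server point after its own hash,
        # wrapping around to the start of the ring
        idx = bisect.bisect(self._points, stable_hash(key)) % len(self._points)
        return self._servers[idx]

# Hypothetical node names and keys
two_nodes = ["node-1:11211", "node-2:11211"]
three_nodes = two_nodes + ["node-3:11211"]
keys = ["user:%d" % i for i in range(10000)]

before = HashRing(two_nodes)
after = HashRing(three_nodes)
moved_keys = [k for k in keys if before.get_server(k) != after.get_server(k)]
print("%.0f%% of keys changed servers" % (100.0 * len(moved_keys) / len(keys)))
```

When a node is added, the only keys that move are the ones the new node takes over, so the existing nodes keep most of their cached data.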

So Far
• Launched a cache cluster
• Got the node names
• Connected our client
• Figured out sharding

What To Cache?
• Everything!
• Database records
• Full HTML pages
• Page fragments
• Remote API calls

How To Cache It?
• Lazy population
• Write-through
• Timed refresh

Laziness is a Virtue
# Python
def get_user(user_id):
    record = cache.get(user_id)
    if record is None:
        # Run a DB query
        record = db.query("select * from users where id = ?", user_id)
        cache.set(user_id, record)
    return record

# App code
user = get_user(17)

Ship It
• Most data is never accessed
• Ensures cache is filled
• Caches fail and scale
• But cache miss penalty
• Best approach for most data

Foresight is 20-20
# Python
def save_user(user_id, values):
    record = db.query("update users ... where id = ?", user_id, values)
    cache.set(user_id, record)
    return record

# App code
user = save_user(17, {"name": "Nate Dogg"})

Laziness vs. Impatience
• Ensures cache is always current
• Write penalty vs. read penalty
• But missing data on scale up
• Plus excess data / cache churn
• Still need lazy fetch too

Combo Move!
def save_user(user_id, values):
    record = db.query("update users ... where id = ?", user_id, values)
    cache.set(user_id, record, 300)  # ttl
    return record

def get_user(user_id):
    record = cache.get(user_id)
    if record is None:
        record = db.query("select * from users where id = ?", user_id)
        cache.set(user_id, record, 300)  # ttl
    return record

# App code
save_user(17, {"name": "Nate Diddy"})
user = get_user(17)
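The combined pattern can be tried without a cache cluster or database. The sketch below swaps in hypothetical stand-ins: FakeCache is an in-memory dict with TTL support (not a real Memcached client), users_table is a dict playing the role of the users table, and db_reads counts how often the "database" is hit.

```python
import time

class FakeCache:
    """In-memory stand-in for a cache client with get/set and a TTL (hypothetical)."""

    def __init__(self, clock=time.time):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl=0):
        # ttl=0 means "never expires", as with Memcached
        expires = self._clock() + ttl if ttl else None
        self._data[key] = (value, expires)

    def get(self, key):
        value, expires = self._data.get(key, (None, None))
        if expires is not None and self._clock() >= expires:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

TTL = 300
users_table = {17: {"name": "Nate"}}  # stand-in for the users table
cache = FakeCache()
db_reads = 0

def user_key(user_id):
    return "user:%d" % user_id

def get_user(user_id):
    # Lazy population: only touch the database on a cache miss
    global db_reads
    record = cache.get(user_key(user_id))
    if record is None:
        db_reads += 1
        record = dict(users_table[user_id])
        cache.set(user_key(user_id), record, TTL)
    return record

def save_user(user_id, values):
    # Write-through: update the database, then refresh the cache entry
    users_table.setdefault(user_id, {}).update(values)
    record = dict(users_table[user_id])
    cache.set(user_key(user_id), record, TTL)
    return record

first = get_user(17)    # miss, reads the database
second = get_user(17)   # hit
save_user(17, {"name": "Nate Diddy"})
third = get_user(17)    # hit, served from the write-through entry
print(third["name"], "- database reads:", db_reads)
```

The TTL acts as a safety net: even if a write bypasses save_user, a stale entry disappears after five minutes and the next read repopulates it lazily.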

Timed Refresh
• Run job to periodically update cache
• Good for Top-N lists
• Time-intensive rankings
• Trending items
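A timed refresh for a Top-N list might look like the sketch below. Everything here is an assumption for illustration: a plain dict stands in for the cache, the scores dict stands in for an expensive ranking query, and a background thread stands in for whatever scheduler (cron, a worker queue) actually runs the job.

```python
import threading

def compute_top_n(scores, n):
    # Stand-in for a time-intensive ranking query
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]

def refresh_top_n(cache, scores, n=3, key="top_scores"):
    # Overwrite the cached list; readers never pay for the computation
    cache[key] = compute_top_n(scores, n)

def start_periodic_refresh(cache, scores, interval_seconds, stop_event):
    def loop():
        while not stop_event.is_set():
            refresh_top_n(cache, scores)
            # wait() returns early when stop_event is set
            stop_event.wait(interval_seconds)
    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker

cache = {}  # dict stand-in for the cache client
scores = {"Andy": 556, "Barry": 819, "Carl": 105, "Derek": 1312}

refresh_top_n(cache, scores)  # one refresh up front so readers never see an empty list
stop = threading.Event()
worker = start_periodic_refresh(cache, scores, 60, stop)
stop.set()     # shut the job down again for this demo
worker.join()
print(cache["top_scores"])
```

Readers only ever call cache["top_scores"], so the cost of the ranking is paid once per interval rather than once per request.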

Monitoring
• Integration with CloudWatch metrics
• Set up alarms to send via email
• Memory usage
• Evictions
• Which ElastiCache metrics should I monitor?
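As a rough sketch of an eviction alarm, the function below builds the keyword arguments for CloudWatch's put_metric_alarm call. The threshold, period, alarm name, and SNS topic ARN are placeholder assumptions to tune for a real workload; email delivery is assumed to come from an email subscription on that SNS topic.

```python
def eviction_alarm_params(cluster_id, node_id, sns_topic_arn, threshold=100):
    """Keyword arguments for CloudWatch put_metric_alarm (values are illustrative)."""
    return {
        "AlarmName": "%s-%s-evictions" % (cluster_id, node_id),
        "Namespace": "AWS/ElastiCache",
        "MetricName": "Evictions",
        "Dimensions": [
            {"Name": "CacheClusterId", "Value": cluster_id},
            {"Name": "CacheNodeId", "Value": node_id},
        ],
        "Statistic": "Sum",
        "Period": 300,                # seconds per evaluation window (assumed)
        "EvaluationPeriods": 1,
        "Threshold": threshold,       # assumed; pick a value that fits your traffic
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }

# Hypothetical topic ARN; cluster and node IDs follow the naming used in the slides
params = eviction_alarm_params(
    "mycache", "0001", "arn:aws:sns:us-west-2:123456789012:cache-alerts")

# With boto3 installed and credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
print(params["AlarmName"])
```

A sustained rise in evictions usually means the cache is out of memory for its working set, which is a signal to add nodes or move to a larger node type.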

Node Discovery
• Set up an Amazon SNS topic for ElastiCache
• Have app listen for events
– ElastiCache:AddCacheNodeComplete
– ElastiCache:RemoveCacheNodeComplete
• Reconfigure connections
• See Event Notifications and Amazon SNS

Programmable Scaling
(Diagram; labels: External APIs, SNS, Add Node, ELB, App)

Node Auto-Discovery
# PHP
$server_endpoint = "mycache.z2vq55.cfg.usw2.cache.amazonaws.com";
$server_port = 11211;
$cache = new Memcached();
$cache->setOption(
    Memcached::OPT_CLIENT_MODE, Memcached::DYNAMIC_CLIENT_MODE);
# Set config endpoint as only server
$cache->addServer($server_endpoint, $server_port);
# Lib auto-locates nodes
$cache->set("key", "value");

Redis
• Also in-memory
• Advanced data types
• Atomic operations
• Single-threaded
• Persistence
• Read replicas

Leaderboard with Sorted Sets
ZADD leaderboard 556 "Andy"
ZADD leaderboard 819 "Barry"
ZADD leaderboard 105 "Carl"
ZADD leaderboard 1312 "Derek"

ZREVRANGE leaderboard 0 -1
1) "Derek"
2) "Barry"
3) "Andy"
4) "Carl"
ZREVRANK leaderboard "Barry"
(integer) 1
Ranks are zero-based, so Barry is in 2nd place.

Follow the Leader
def save_score(user, score):
    db.query("update users ... where id = ?", user, score)
    # redis-py 3.0+ takes a {member: score} mapping
    redis.zadd("leaderboard", {user: score})

def get_rank(user):
    # ZREVRANK is zero-based, so add 1 for a human-readable rank
    return redis.zrevrank("leaderboard", user) + 1

# App code
save_score("Andy", 556)
save_score("Barry", 819)
save_score("Carl", 105)
save_score("Derek", 1312)
get_rank("Barry")  # 2

Redis Replicas
(Diagram; labels: External APIs, ELB, App, Writes, Reads, Replication Group)
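One way an app might split traffic across a replication group is sketched below: writes go to the primary, reads rotate across the replicas. ReplicatedRedis and FakeNode are hypothetical names. FakeNode is an in-memory stand-in rather than a real Redis connection, and its shared dict pretends replication is instantaneous, which real asynchronous replication is not.

```python
import itertools

class FakeNode:
    """Stand-in for a Redis connection; nodes share one dict to mimic replication."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.calls = []

    def set(self, key, value):
        self.calls.append("set")
        self.store[key] = value
        return True

    def get(self, key):
        self.calls.append("get")
        return self.store.get(key)

class ReplicatedRedis:
    """Send writes to the primary and rotate reads across the replicas."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Fall back to the primary if there are no replicas
        self._readers = itertools.cycle(replicas or [primary])

    def set(self, key, value):
        return self._primary.set(key, value)

    def get(self, key):
        return next(self._readers).get(key)

shared = {}
primary = FakeNode("primary", shared)
replica_a = FakeNode("replica-a", shared)
replica_b = FakeNode("replica-b", shared)

client = ReplicatedRedis(primary, [replica_a, replica_b])
client.set("greeting", "hello")
values = [client.get("greeting"), client.get("greeting")]
print(values, primary.calls, replica_a.calls, replica_b.calls)
```

Because real replicas can lag slightly behind the primary, reads that must see the latest write are safer sent to the primary.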

Redis Sharding
• Same concept as Memcached
• BUT can't shard a single data structure
– Lists
– Sets / sorted sets
– Hashes
• Each requires a single in-memory structure

Anti-Pattern: Dedicated Nodes
• Spawn multiple nodes
• Use for different features
– Leaderboard
– Counters
• Can still shard key-value ops
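The routing this implies could be sketched as follows. The node names and the "feature:" key prefix convention are assumptions for illustration. Modulo hashing is used for the key-value shards only to keep the example short; a real client would use consistent hashing here.

```python
import hashlib

# Hypothetical node names: one dedicated node per feature, plus key-value shards
DEDICATED = {
    "leaderboard": "redis-leaderboard:6379",
    "counters": "redis-counters:6379",
}
SHARDED = ["redis-kv-1:6379", "redis-kv-2:6379"]

def node_for(key):
    # Keys are assumed to look like "feature" or "feature:rest"
    feature = key.split(":", 1)[0]
    if feature in DEDICATED:
        # Whole data structures live on one node so they stay in one piece
        return DEDICATED[feature]
    # Plain key-value entries can be spread across the remaining nodes
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return SHARDED[digest % len(SHARDED)]

for key in ["leaderboard", "counters:page_views", "user:17"]:
    print(key, "->", node_for(key))
```

The sorted set behind the leaderboard always lands on the same node, while ordinary keys such as user:17 are free to spread out.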

Dedicated Redis Nodes
(Diagram; labels: External APIs, ELB, App, Leaderboard, Counters)

Summary
• Caching is good
• Good caching is hard
• ElastiCache eases deployment
• Memcached or Redis
• More to come

