MEMCACHED: WHAT IS IT
AND WHAT DOES IT DO?
             Brian Moon
            dealnews.com
     http://brian.moonspot.net/
@BRIANLMOON
• Senior Web Engineer for
    dealnews.com
• Founder and lead developer of
    Phorum
•   Memcached community member
•   Gearmand contributor
•   PHP internals contributor
•   I used PHP/FI
WHAT IS A CACHE?

   "1 a: a hiding place especially for
concealing and preserving provisions or
implements b: a secure place of storage"


        http://www.merriam-webster.com/dictionary/cache
http://www.flickr.com/photos/simonov/479658875/
WHAT IS A CACHE?
   "...a component that improves performance by
transparently storing data such that future requests for
  that data can be served faster. The data that is stored
     within a cache might be values that have been
computed earlier or duplicates of original values that
are stored elsewhere. If requested data is contained in
   the cache (cache hit), this request can be served by
 simply reading the cache, which is comparably faster.
       Otherwise (cache miss), the data has to be
   recomputed or fetched from its original storage
         location, which is comparably slower."
                 http://en.wikipedia.org/wiki/Cache
WHAT IS MEMCACHED?
memcached is a high-performance, distributed
memory object caching system, generic in nature, but
intended for use in speeding up dynamic web
applications by alleviating database load.
   •   Dumb daemon
   •   It is a generic key/data storage system
   •   Uses libevent and epoll/kqueue
   •   Caches data in memory
   •   Cache is distributed by the smart clients
CLIENT OPTIONS
•   C/C++ - libmemcached
•   PHP - PECL/memcached
•   Perl - Cache::Memcached
•   Python - python-memcached / Python libmemcached
•   Ruby - Ruby MemCache (per Google)
•   Java - spymemcached
•   Plus MySQL UDF, .NET, C#, Erlang, Lua, and more
SIMPLE PHP EXAMPLE
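// Connect to the pool; the client decides which server holds each key
$MEMCACHE = new Memcached();
$MEMCACHE->addServer("192.168.0.1", 11211);
$MEMCACHE->addServer("192.168.0.2", 11211);

$mydata = $MEMCACHE->get("mydata");

// get() returns false on a miss or an error; checking the result code
// avoids regenerating when a literal false was actually stored
if($mydata === false && $MEMCACHE->getResultCode() !== Memcached::RES_SUCCESS){
    $mydata = generate_mydata();
    // Cache the result for one day (86400 seconds)
    $MEMCACHE->set("mydata", $mydata, 86400);
}

echo $mydata;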
http://www.flickr.com/photos/tomharpel/1748935/




Where is my data stored?
WHERE IS MY DATA?
• The client (not server) uses a hashing algorithm to
    determine the storage server
•   Data is sent to only one server
•   Servers do not share data
•   Data is not replicated
•   Two hashing algorithms possible:
    • Traditional
    • “Consistent”
WHERE IS MY DATA?

               Traditional

server = servers[hash(key) % servers.length]
               (eenie meenie miney moe)
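
A minimal sketch of the traditional scheme, to show why changing the server list hurts. Assumptions: crc32() stands in for whatever hash a real client uses, the third server address is made up, and PHP is 64-bit so crc32() is never negative.

```php
<?php
// "Traditional" selection: hash(key) % number of servers.
// crc32() is a stand-in hash, not necessarily what your client uses.
function pick_server(string $key, array $servers): string
{
    return $servers[crc32($key) % count($servers)];
}

// Count how many keys map to a different server after the list changes.
function count_moved(array $keys, array $before, array $after): int
{
    $moved = 0;
    foreach ($keys as $key) {
        if (pick_server($key, $before) !== pick_server($key, $after)) {
            $moved++;
        }
    }
    return $moved;
}

$two   = ["192.168.0.1:11211", "192.168.0.2:11211"];
// Hypothetical third server added to the pool.
$three = ["192.168.0.1:11211", "192.168.0.2:11211", "192.168.0.3:11211"];

$keys = [];
for ($i = 0; $i < 1000; $i++) {
    $keys[] = "key$i";
}

echo count_moved($keys, $two, $three), " of ", count($keys), " keys moved\n";
```

Every key that moves is a cache miss until it is stored again on its new server, so adding one server can empty most of the cache at once.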
WHERE IS MY DATA?
“Consistent”
Each server is allocated
LOTS of numbers on a
“wheel”. The key is
hashed to a number in
that range and the
server assigned the
closest number is used.
Adding/removing
servers from the list
results in less key
reassignment.
http://www.flickr.com/photos/k-bot/2614389196/
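
A rough sketch of the wheel. Assumptions: crc32() as the hash, 100 points per server, and "closest" read as the next point clockwise. Real clients pick their own hash and point count and search the wheel faster than this loop does. The third server address is made up.

```php
<?php
// Build the wheel: each server gets many points, keyed by hash value.
function build_ring(array $servers, int $pointsPerServer = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $pointsPerServer; $i++) {
            $ring[crc32("$server#$i")] = $server;
        }
    }
    ksort($ring); // order the points around the wheel
    return $ring;
}

// Walk clockwise from the key's hash to the next server point.
function ring_lookup(array $ring, string $key): string
{
    $hash = crc32($key);
    foreach ($ring as $point => $server) {
        if ($point >= $hash) {
            return $server;
        }
    }
    return reset($ring); // past the last point: wrap to the first one
}

$before = build_ring(["192.168.0.1:11211", "192.168.0.2:11211"]);
$after  = build_ring(["192.168.0.1:11211", "192.168.0.2:11211", "192.168.0.3:11211"]);

$moved = 0;
for ($i = 0; $i < 1000; $i++) {
    if (ring_lookup($before, "key$i") !== ring_lookup($after, "key$i")) {
        $moved++;
    }
}
echo "$moved of 1000 keys moved\n";
```

In this sketch a key only moves when one of the new server's points lands between the key and its old server, so keys move to the new server and nowhere else.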
What can I store?
                                                How big can it be?




http://www.flickr.com/photos/hshap/469025786/
WHAT CAN I STORE?

•   Server stores blobs of binary data
•   Most clients will serialize non-string data
•   Keys are limited to 250 bytes in length
•   Keys cannot contain spaces or “high” characters. Stick
    with letters, numbers, _ and you are pretty safe.
• Some clients may normalize keys for you, but don’t
    count on it.
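
If you would rather not count on the client, one possible approach is to normalize keys yourself. safe_key() below is a hypothetical helper, not part of any client library: it passes safe keys through untouched and replaces anything else with an md5 of the original.

```php
<?php
// Hypothetical helper: return a key that fits the rules above.
function safe_key(string $key): string
{
    // Letters, numbers and _ only, between 1 and 250 bytes: use as-is.
    if (preg_match('/^[A-Za-z0-9_]{1,250}$/', $key) === 1) {
        return $key;
    }
    // Otherwise hash the original so the result is short and safe.
    return 'md5_' . md5($key);
}

echo safe_key("user_42"), "\n";
echo safe_key("user 42 profile"), "\n";
```

Hashing instead of stripping characters keeps "a b" and "a_b" from landing on the same key. The trade-off is that hashed keys are no longer readable.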
DATA SIZE MATTERS
• Maximum size for one item is 1MB by default (recent
  versions can raise it with the -I option)
• Some clients support compression
• Data is stored in slabs based on size
  • Lots of items of the same size is not optimal
  • Slab size can be customized
  • May not be able to store items when it appears
     there is “free” memory
  • Data can be evicted sooner than expected.
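
A simplified picture of size-based storage. The numbers here (96-byte smallest chunk, growth factor 1.25, 1MB largest) are assumptions for illustration only; the real sizes depend on your version and startup options.

```php
<?php
// Build an illustrative list of chunk sizes that grow geometrically.
function chunk_sizes(int $smallest, float $factor, int $largest): array
{
    $sizes = [];
    for ($size = $smallest; $size < $largest; $size = (int) ceil($size * $factor)) {
        $sizes[] = $size;
    }
    $sizes[] = $largest;
    return $sizes;
}

// Find the smallest chunk that can hold an item; null if it is too big.
function chunk_for(int $itemBytes, array $sizes): ?int
{
    foreach ($sizes as $size) {
        if ($itemBytes <= $size) {
            return $size;
        }
    }
    return null;
}

// Assumed values, chosen only to make the example concrete.
$sizes = chunk_sizes(96, 1.25, 1048576);
$chunk = chunk_for(1000, $sizes);
echo "A 1000-byte item uses a {$chunk}-byte chunk\n";
```

The gap between the item and its chunk is wasted, and memory already handed to one chunk size is not available to another. That is how a server can show "free" memory and still refuse or evict items.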
& Evictions



http://www.flickr.com/photos/aussiegall/322980012/
EVICTION AND EXPIRATION
• Expiration time can be expressed
  as seconds from now or as an
  absolute epoch time.
• Items are not removed from
  memory when they expire
• Items are evicted when newer
  items need to be stored
• Least Recently Used (LRU)
  determines what is evicted
• Eviction is done per slab
http://www.flickr.com/photos/bitchcakes/4410181958/
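
A toy model of these rules, not memcached's actual code. TinyCache is a made-up class that counts items instead of bytes and ignores slabs. The 30-day cutoff between "seconds from now" and "absolute epoch time" follows the memcached protocol. The current time is passed in so the example is repeatable.

```php
<?php
const THIRTY_DAYS = 2592000; // 60 * 60 * 24 * 30

// Turn an expiration argument into an absolute Unix timestamp.
// 0 means "never expires"; values over 30 days are already absolute.
function expires_at(int $exptime, int $now): int
{
    if ($exptime === 0) {
        return 0;
    }
    return $exptime > THIRTY_DAYS ? $exptime : $now + $exptime;
}

// Illustrative cache: lazy expiration plus LRU eviction (capacity >= 1).
class TinyCache
{
    private $items = []; // key => [value, expiresAt], least recently used first
    private $capacity;

    public function __construct(int $capacity)
    {
        $this->capacity = $capacity;
    }

    public function set(string $key, $value, int $exptime, int $now): void
    {
        unset($this->items[$key]);
        if (count($this->items) >= $this->capacity) {
            // Full: evict the least recently used item to make room.
            unset($this->items[array_key_first($this->items)]);
        }
        $this->items[$key] = [$value, expires_at($exptime, $now)];
    }

    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key])) {
            return false;
        }
        [$value, $expiresAt] = $this->items[$key];
        unset($this->items[$key]);
        if ($expiresAt !== 0 && $expiresAt <= $now) {
            return false; // expired: only noticed (and dropped) on access
        }
        $this->items[$key] = [$value, $expiresAt]; // now the most recently used
        return $value;
    }
}

$cache = new TinyCache(2);
$cache->set("a", "A", 10, 100); // expires at time 110
$cache->set("b", "B", 0, 100);  // never expires
$cache->get("a", 105);          // touching "a" makes "b" the LRU item
$cache->set("c", "C", 0, 105);  // cache is full, so "b" is evicted
var_dump($cache->get("b", 105)); // bool(false)
```

Note that "b" never expired. It was evicted only because something newer needed the room.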
How do I know it is working?




http://www.flickr.com/photos/carolinadoug/3932117107/
HOW WELL IS IT WORKING?




    STAT   uptime 9207843
    STAT   cmd_get 66421687
    STAT   cmd_set 10640419
    STAT   get_hits 66421687
    STAT   get_misses 12360549
    STAT   evictions 0

    Hit rate: get_hits / (get_hits + get_misses) = 84%
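
A small sketch of the hit rate arithmetic, using the hit and miss counters from this slide. parse_stats() and hit_rate() are illustrative helpers; in practice the numbers would come from the stats command or your client's stats call rather than a hard-coded string.

```php
<?php
// Parse "STAT name value" lines into an associative array.
function parse_stats(string $text): array
{
    $stats = [];
    foreach (explode("\n", $text) as $line) {
        if (preg_match('/^\s*STAT\s+(\S+)\s+(\S+)/', $line, $m)) {
            $stats[$m[1]] = $m[2];
        }
    }
    return $stats;
}

// Hit rate = hits / (hits + misses), or 0 when nothing was requested yet.
function hit_rate(array $stats): float
{
    $hits   = (int) $stats['get_hits'];
    $misses = (int) $stats['get_misses'];
    $total  = $hits + $misses;
    return $total === 0 ? 0.0 : $hits / $total;
}

$stats = parse_stats(
    "STAT get_hits 66421687\n" .
    "STAT get_misses 12360549\n" .
    "STAT evictions 0\n"
);
printf("hit rate: %.0f%%\n", hit_rate($stats) * 100); // hit rate: 84%
```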
HOW WELL IS IT WORKING?
 • Graph stats from memcached using Cacti/Ganglia, etc.
 • Key stats:
   • Hits/Misses
   • Gets/Sets
   • Evictions
 • Cacti Templates:
   • http://dealnews.com/developers/
   • http://code.google.com/p/mysql-cacti-templates/
There are some things
you think you want to
 do, but you can’t do
them and/or shouldn’t
      do them.


                        http://www.flickr.com/photos/magdalar/4241254141/
HOW DO I SEE THE CACHE?

 • You have no way to see the cached data.
 • You probably don’t need to see it.
 • For memcached to tell you, it would freeze your entire
   caching system
 • There are debug ways to see.
 • DO NOT COMPILE PRODUCTION WITH DEBUG
   BECAUSE YOU ARE A CONTROL FREAK!
HOW DO I BACK IT UP?

YOU DON’T!!!
• If your application requires that, you are using it wrong
• It is a cache, not a data storage system
• Maybe try Tokyo Tyrant, MongoDB or another
  “NOSQL” key/data store
NAMESPACES & TAGGING
• There is no concept of namespaces or tagging built in
  to memcached
• You can simulate them with an extra key storage
• See the FAQ for an example of simulated namespaces
• This of course means there is no mass delete in
  memcached
• There have been patches, but they never performed
  well.
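
One way the extra-key trick can look, as a sketch rather than the FAQ's exact code. FakeCache is an array-backed stand-in so the example runs without a server, and ns_key() and ns_invalidate() are made-up names. The idea is to keep a version number per namespace, put it in every key, and bump it to "delete" the whole namespace.

```php
<?php
// Array-backed stand-in for a memcached client (illustration only).
class FakeCache
{
    private $data = [];

    public function get(string $key)
    {
        return $this->data[$key] ?? false;
    }

    public function set(string $key, $value): void
    {
        $this->data[$key] = $value;
    }
}

// Build the real key from the namespace's current version number.
function ns_key(FakeCache $cache, string $ns, string $key): string
{
    $version = $cache->get("ns_version_$ns");
    if ($version === false) {
        $version = 1;
        $cache->set("ns_version_$ns", $version);
    }
    return "{$ns}_v{$version}_{$key}";
}

// "Delete" a whole namespace by bumping its version.
function ns_invalidate(FakeCache $cache, string $ns): void
{
    $version = $cache->get("ns_version_$ns");
    $cache->set("ns_version_$ns", $version === false ? 1 : $version + 1);
}

$cache = new FakeCache();
$cache->set(ns_key($cache, "users", "42"), "Brian");
echo $cache->get(ns_key($cache, "users", "42")), "\n"; // Brian

ns_invalidate($cache, "users");
var_dump($cache->get(ns_key($cache, "users", "42"))); // bool(false)
```

Nothing is actually deleted. The old items just stop being asked for and get evicted eventually. On a real server the version key itself can be evicted too, so seeding it with something like the current time instead of 1 is a safer starting point.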
MORE THINGS NOT TO DO

•   Use memcached as a locking daemon
•   Use memcached to store data that can’t go away
•   Use it to try to speed up your intranet
•   Store complex data types that the clients have to
    serialize or unserialize
• Complain on the mailing list that you can’t do any of the
    things listed above. =)
QUEUES



JUST DON’T, OK?
   see dormando in #memcached on freenode
REFERENCES

• http://code.google.com/p/memcached/
• http://code.google.com/p/memcached/wiki/Clients

• http://brian.moonspot.net/
• http://dealnews.com/developers/
