Managing Distributed Systems with
              Chef

                Mandi Walls
            mandi@opscode.com
                RiCon 2012
             October 10, 2012
whoami


•   Senior Technical Evangelist (Consultant) at Opscode

•   @lnxchk

•   mandi@opscode.com
Chef


•   Configuration management system built with the cloud in mind

•   http://www.opscode.com
Chef is a Tool




http://www.flickr.com/photos/wessexandy/7690486884/sizes/c/in/pool-96164123@N00/
Quick Chef Bits
•   Resources: things you manage (files, directories, services)

•   Nodes: hosts you manage, run chef-client

•   Recipes: collections of resources

•   Templates: dynamically generate configuration

•   Cookbooks: packages for recipes, usually a functional piece of software

•   Chef Server: stores info, cookbooks, runs an API and a search engine
What is a Distributed System


“A distributed system is a collection of independent computers that
appears to its users as a single coherent system”

                   ~ Tanenbaum and van Steen, Distributed Systems, 2002
What Do You Distribute?

•   Hits: with a load balancer

•   Reads: with some slaves

•   Workload: with some compute nodes

•   Storage: with some storage nodes
Infrastructure Management


•   Complex distributed systems require the proper
    tools to configure them to meet their goals

•   Distributed systems are built by connecting bricks
    together in pleasing landscapes



                                                         http://www.flickr.com/photos/53486765@N02/
How Do You Distribute?

•   Client-Server N:1

•   Master-Slave-Client: 1:N:M

•   Mesh or Cluster: N!

•   omg science
Goals


•   Create system topologies that are as complex as needed to meet the
    requirements of my architecture

•   Allow configurations to dynamically update when nodes join or
    disappear
N:1 Client Server


•   Basic examples

    •   Load balancer finding web nodes

    •   Application servers finding a datastore
example: Load Balancer
•   Using Chef roles

•   Roles are essentially used to create types of nodes

    •   If I want a webserver, I create a webserver role that includes stuff
        like Apache, or nginx, or php, or whatever I need

    •   The webserver I build today using the role will be the same as the
        one I build next week or next month

    •   Roles are searchable!
Chef Roles
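
A minimal plain-Ruby sketch of what a role carries and why two nodes built from it come out the same. The `Role` struct, the `build_node` helper, the recipe names, and the attribute values are all illustrative assumptions; none of this is Chef's own API, it only mirrors the idea.

```ruby
# Simplified, hypothetical model of a role: a name, a run list, and attributes.
# Real roles live on the Chef Server; this struct only mirrors the idea.
Role = Struct.new(:name, :description, :run_list, :default_attributes)

WEBSERVER = Role.new(
  'webserver',
  'Front-end web server',
  ['recipe[apache2]', 'recipe[php]'],          # illustrative recipe names
  { 'apache' => { 'listen_ports' => ['80'] } } # illustrative attributes
)

# Hypothetical helper: "building" a node from a role copies the role's
# run list, so every node of this type starts from the same definition.
def build_node(name, role)
  {
    'name'       => name,
    'roles'      => [role.name],
    'run_list'   => role.run_list.dup,
    'attributes' => role.default_attributes
  }
end

today     = build_node('web1', WEBSERVER)
next_week = build_node('web2', WEBSERVER)
puts today['run_list'] == next_week['run_list'] # prints "true"
```

Because each node records which roles it holds, the role name is also something you can search on later.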
example: Load Balancer
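
A rough sketch of the load balancer pattern: find every node holding the webserver role, then generate the pool from the result. In a real recipe the lookup would be a Chef search such as `search(:node, "role:webserver")`; here an in-memory list and the hypothetical `nodes_with_role` helper stand in for the Chef Server so the example runs on its own. Node names, IP addresses, and the config syntax are made up.

```ruby
# Stand-in for the node index on the Chef Server. Names and IPs are made up.
NODES = [
  { 'name' => 'web1', 'roles' => ['webserver'],    'ipaddress' => '10.0.0.11' },
  { 'name' => 'web2', 'roles' => ['webserver'],    'ipaddress' => '10.0.0.12' },
  { 'name' => 'lb1',  'roles' => ['loadbalancer'], 'ipaddress' => '10.0.0.5'  }
].freeze

# Hypothetical helper standing in for a recipe-side search such as
# search(:node, "role:webserver").
def nodes_with_role(nodes, role)
  nodes.select { |n| n['roles'].include?(role) }
end

# Render a pool definition from whatever the search returned.
# The config syntax is illustrative, loosely modelled on HAProxy.
def render_pool(members)
  lines = members.map { |m| "  server #{m['name']} #{m['ipaddress']}:80 check" }
  (['backend webfarm'] + lines).join("\n") + "\n"
end

puts render_pool(nodes_with_role(NODES, 'webserver'))
```

Nothing in the load balancer's configuration names a web node directly, so a node that joins with the webserver role shows up in the pool the next time the load balancer converges.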
example: App Servers and Data

•   Maybe I don’t want my servers to dynamically go looking for an
    element

•   I want to be able to tell them where to find it

•   Use Chef attributes
Chef Attributes
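
A small sketch of the "tell them where to find it" approach: the location of the datastore is an attribute set on a role, and the recipe side simply reads it. The attribute names (`myapp`, `db_host`, `db_port`), the values, and the `database_url` helper are assumptions for illustration, written as plain Ruby hashes rather than Chef's node object.

```ruby
# Attributes as they might be set on a role. All names and values here
# (myapp, db_host, db_port) are illustrative assumptions.
APPSERVER_ROLE_ATTRIBUTES = {
  'myapp' => { 'db_host' => 'db1.example.com', 'db_port' => 5432 }
}.freeze

# Stand-in for the recipe side: read the attributes the node inherited
# from its role and turn them into a line of application config.
def database_url(node_attributes)
  db = node_attributes.fetch('myapp')
  "postgres://#{db.fetch('db_host')}:#{db.fetch('db_port')}/myapp"
end

puts database_url(APPSERVER_ROLE_ATTRIBUTES)
# prints "postgres://db1.example.com:5432/myapp"
```

Moving the datastore then means editing one attribute on the role instead of touching each app server.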
example: App Servers and Data
example: App Servers and Data
Master:Slaves:Clients


•   Services in complex topologies have more than one access pattern

•   Combinations of Chef Roles and Attributes create more interesting
    relationships
Master Role
Slave Role
Client Role
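
One way to picture the three roles working together, as a plain-Ruby sketch. The role names (`db_master`, `db_slave`, `db_client`), the node names, and the `topology` helper are assumptions; in Chef each lookup would be a role search, and the results would feed each node's configuration.

```ruby
# Made-up inventory. Role names are illustrative assumptions.
NODES = [
  { 'name' => 'db1',     'roles' => ['db_master'] },
  { 'name' => 'db2',     'roles' => ['db_slave'] },
  { 'name' => 'db3',     'roles' => ['db_slave'] },
  { 'name' => 'app1',    'roles' => ['db_client'] },
  { 'name' => 'report1', 'roles' => ['db_client'] },
  { 'name' => 'backup1', 'roles' => ['db_client'] }
].freeze

# Hypothetical stand-in for a role search against the Chef Server.
def names_with_role(nodes, role)
  nodes.select { |n| n['roles'].include?(role) }.map { |n| n['name'] }
end

# Work out what each kind of node should be wired to (1:N:M).
def topology(nodes)
  master  = names_with_role(nodes, 'db_master').first
  slaves  = names_with_role(nodes, 'db_slave')
  clients = names_with_role(nodes, 'db_client')
  {
    'master'  => { 'name' => master, 'replicates_to' => slaves },
    'slaves'  => slaves.map { |s| { 'name' => s, 'replicates_from' => master } },
    'clients' => clients.map { |c| { 'name' => c, 'reads_from' => slaves } }
  }
end

topology(NODES)['clients'].each do |c|
  puts "#{c['name']} reads from #{c['reads_from'].join(', ')}"
end
```

No hostname is hard-coded in the wiring logic; every link is derived from which roles the nodes hold.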
What Was All That?
Clusters


•   Complex topologies in which every component should know about all or
    most of the other components
Clusters Using Environments

•   Chef Environments allow you to logically partition your infrastructure

•   Canonical example is a Dev/Test/Stage/Prod model

•   But!

•   Create a logical group dedicated to your cluster
Chef Environments
Environments in Recipes

•   Nodes belonging to the mob environment will have tony as their boss

•   Nodes belonging to the herd environment will have cowboy as their
    boss

•   The same software can be used to manage these two clusters, and
    their environments keep them together
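
A plain-Ruby sketch of the same logic serving two clusters. The environment names and bosses come from the slides; the node names, the `cluster`/`boss` attribute layout, and the helpers are assumptions. In a real recipe the node's environment is available as `node.chef_environment`; here it is just a hash key.

```ruby
# Per-environment attributes. The "cluster"/"boss" layout is an assumption
# for illustration; the environment and boss names are from the slides.
ENVIRONMENTS = {
  'mob'  => { 'cluster' => { 'boss' => 'tony' } },
  'herd' => { 'cluster' => { 'boss' => 'cowboy' } }
}.freeze

# Made-up nodes, each assigned to one environment.
NODES = [
  { 'name' => 'vinnie', 'chef_environment' => 'mob' },
  { 'name' => 'sal',    'chef_environment' => 'mob' },
  { 'name' => 'bessie', 'chef_environment' => 'herd' }
].freeze

# The same "recipe" logic runs everywhere; the environment picks the answer.
def boss_for(node)
  ENVIRONMENTS.fetch(node['chef_environment'])['cluster']['boss']
end

# Other members of the same cluster, found by matching environment.
def peers_of(node, nodes)
  nodes.select do |n|
    n['chef_environment'] == node['chef_environment'] && n['name'] != node['name']
  end
end

NODES.each { |n| puts "#{n['name']}: boss=#{boss_for(n)}" }
```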
Chef Databags
•   “Bags of Holding”

•   Whatever random kind of stuff you need to share, in JSON

•   Not tied to any one cookbook, role, or recipe: global data

•   Usually you want them to be saved in your source repository

•   Set of directories and files like data_bags/clusters/herd.json and
    data_bags/clusters/mob.json
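
A sketch of what `data_bags/clusters/herd.json` might hold, parsed with Ruby's standard JSON library. Every data bag item carries an `id`; the `boss` key is an assumption chosen to match the cluster example.

```ruby
require 'json'

# What data_bags/clusters/herd.json might contain. The "id" field names the
# item; the "boss" key is an illustrative assumption.
HERD_JSON = <<~JSON
  {
    "id": "herd",
    "boss": "cowboy"
  }
JSON

item = JSON.parse(HERD_JSON)
puts "#{item['id']} is run by #{item['boss']}"
# prints "herd is run by cowboy"
```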
Writing to Databags from Nodes


•   Little bit dangerous

•   Little bit racy




                           http://www.gcpvd.org/2011/09/27/riverzedge-industrial-ball-and-sander-drag-races-october-14/
Clusters Using Databags
•   Update the cluster’s databag in real time from the nodes

•   Let’s say the herd cluster elected sheepdog as its boss instead of
    cowboy

•   We could write a piece of node data and search, or we could abuse a
    databag
Write a Databag Item

•   Write out a new value in the clusters databag, herd item
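
A small simulation of the write, and of why it is racy. The hash below stands in for the clusters databag on the Chef Server, and `read_item` / `save_item` are hypothetical helpers, not Chef's API. The second candidate, `collie`, is made up to show two nodes that both believe they won the election.

```ruby
# In-memory stand-in for the "clusters" databag on the Chef Server.
# This simulates the write pattern; it is not Chef's API.
DATA_BAG = { 'herd' => { 'id' => 'herd', 'boss' => 'cowboy' } }

# Hypothetical helpers: fetch a private copy of an item, and store one back.
def read_item(bag, id)
  bag.fetch(id).dup
end

def save_item(bag, item)
  bag[item['id']] = item
end

# Two nodes both read before either one writes...
sheepdog_copy = read_item(DATA_BAG, 'herd')
collie_copy   = read_item(DATA_BAG, 'herd')

sheepdog_copy['boss'] = 'sheepdog'
collie_copy['boss']   = 'collie'

# ...so whichever save lands last silently wins.
save_item(DATA_BAG, sheepdog_copy)
save_item(DATA_BAG, collie_copy)

puts DATA_BAG['herd']['boss']
# prints "collie"
```

Neither node gets an error here, which is the dangerous part: the loser only finds out when it next reads the item.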
Read Databag Items
•   The rest of the herd will get the new value by reading it out of the
    databag

•   The convergence interval is as long as you wait between chef-client
    executions on the nodes
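
The read side, sketched in plain Ruby. A node keeps whatever boss it saw on its last run until it runs again and re-reads the item. The `CLUSTERS` hash stands in for the databag, and the node state and `converge` helper are hypothetical.

```ruby
# Stand-in for the herd item after the boss has changed.
CLUSTERS = { 'herd' => { 'id' => 'herd', 'boss' => 'sheepdog' } }.freeze

# Hypothetical converge step: re-read the cluster's item and adopt its boss.
def converge(node_state, bag)
  node_state.merge('boss' => bag.fetch(node_state['cluster'])['boss'])
end

# A node that last ran before the change still believes in the old boss.
stale = { 'name' => 'bessie', 'cluster' => 'herd', 'boss' => 'cowboy' }
fresh = converge(stale, CLUSTERS)

puts "#{stale['boss']} -> #{fresh['boss']}"
# prints "cowboy -> sheepdog"
```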
Things to Consider


•   Your nodes converge by running the chef-client agent

•   chef-client can be run on an interval, or on demand
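
A back-of-the-envelope sketch of what the interval means for convergence. Assuming each node runs on a fixed interval from its own start offset, the delay before a node sees a change is the gap until its next run, which is always shorter than one interval. The 30-minute interval, the change time, and the node offsets are made-up numbers, and `next_run_at` is a hypothetical helper.

```ruby
# Time (in seconds) of the first run at or after `change_at`, for a node whose
# runs happen at offset, offset + interval, offset + 2 * interval, ...
def next_run_at(offset, interval, change_at)
  return offset if change_at <= offset
  # Integer ceiling division: number of intervals needed to reach change_at.
  runs_needed = (change_at - offset + interval - 1) / interval
  offset + runs_needed * interval
end

INTERVAL  = 1800 # assumed 30-minute interval between runs
CHANGE_AT = 2000 # assumed moment the new value was written

# Assumed start offsets for three nodes.
{ 'node-a' => 0, 'node-b' => 600, 'node-c' => 1500 }.each do |name, offset|
  lag = next_run_at(offset, INTERVAL, CHANGE_AT) - CHANGE_AT
  puts "#{name} picks up the change after #{lag}s"
end
```

A node that is run on demand right after the change skips the wait entirely.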
Other Chefy Things


•   Start, Report, and Error Handlers

•   Lightweight Resources and Providers

•   http://wiki.opscode.com
Thanks!

•   http://www.opscode.com

•   http://community.opscode.com

•   @lnxchk

•   mandi@opscode.com


Editor's Notes

  2. Basically I get to travel around hanging out with our customers and it’s totally cool.
  3. Learn more about Chef at our website; sign up for a free account.
  5. The essential bits of Chef that we’ll assume you know exist. There’s other stuff, of course, like idempotency and convergence and Ruby and whatever, but let’s handwave some of that for now. Those aren’t the topics we’re looking for.
  8. Yeah, I just belabored that metaphor.
  10. Commodity computing means nodes can be very transient. Need more stuff done about things? Boot some more nodes. Not enough things for every node to do stuff? Shut down some nodes. Glue all the nodes together in interesting ways.
  13. What a role looks like, and what it can tell you about the nodes. Handwave: this is Ruby.
  14. Recipe code from the load balancer recipe, searching for all webserver nodes. Why this is helpful: I can have more than one load balancer serving up the same webfarm, using GSLB for example. I have the ability to create multiple N:1 configuration groups. I bestow on this node the role of load balancer and let there be traffic.
  17. The attributes, defined in a role, and the recipe code that accesses them.
  20. Hey, we can hook up the backup server, too.
  21. Notice that there are multiple server types that can read from these slaves: the “other” servers.
  23. So we just hooked up all these links, without having to know *where* these systems are and *what their names are*.
  27. In this example, I’ve chosen the boss of the clusters. But what if the cluster is able to pick its own boss by an election or some other mechanism?
  29. And by “racy” I mean race condition.
  30. What the file would look like on disk, in your repo.
  31. This is where the race condition comes into play. If it’s possible for two nodes to think they are the boss, they will race to update this info.
  33. How often and how quickly your infrastructure converges to a new configuration depends on how often you run your chef-clients.
  34. Other advanced Chef features that help you build and manage a robust infrastructure.