Divide and conquer: resource segregation in the OpenStack cloud
- 2. Why segregate resources?
● Infrastructure
–Expose logical groupings of infrastructure based on physical
characteristics
–Expose logical groupings of infrastructure based on some abstract
functionality/capability
–“More-massive” horizontal scalability
● Workloads
–Ensure an even spread of a single workload
–Ensure close placement of related workloads
- 4. Segregation in datacenter virtualization
● Infrastructure segregation:
–Logical data center constructs
● Contain some number of logical clusters
● Clusters typically:
–Are relatively small (10s to 100s of nodes per cluster)
–Are tightly coupled to physical storage and network layout
● Workload segregation:
–Host-level affinity/anti-affinity
–CPU-level affinity/anti-affinity (pinning)
- 5. Segregation in an elastic cloud
● Amazon EC2:
–Infrastructure segregation:
● Regions – Separate geographic areas (e.g. us-east-1)
● Availability Zones – Isolated locations within a region (e.g. us-east-1a)
–Workload segregation:
● Placement Groups – Workload affinity within an availability zone
● OpenStack:
–Overloads some of these terms (and more!)
–Application is more flexible for deployers and operators
- 7. Segregation in an elastic cloud
● Wait a second...weren't we moving to the cloud to hide all this
infrastructure stuff from the user?
–Yes!
● Users and applications demand some visibility of:
–Failure domains
–Premium features
● Deployers and operators determine the level of granularity
exposed.
- 10. Segregation in OpenStack
● Infrastructure segregation:
–Regions
–Cells
–Host aggregates
–Availability zones
● Workload segregation:
–Server groups
- 12. Regions
● Complete OpenStack deployments
–Share at least a Keystone and Horizon installation
–Implement their own targetable API endpoints
● In default deployment all services in one region – 'RegionOne'.
● New regions are created using Keystone:
–$ keystone endpoint-create --region "RegionTwo"
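Because every region publishes its own endpoints in the shared Keystone catalog, a client has to choose one by region name. The sketch below illustrates that lookup only; the catalog layout, the example URLs, and the `pick_endpoint` helper are hypothetical and are not the Keystone client API.

```python
from typing import Dict, List, Optional

# Hypothetical, simplified service catalog: one entry per (service, region).
CATALOG: List[Dict[str, str]] = [
    {"service": "compute", "region": "RegionOne", "url": "http://r1.example.com:8774/v2"},
    {"service": "compute", "region": "RegionTwo", "url": "http://r2.example.com:8774/v2"},
    {"service": "identity", "region": "RegionOne", "url": "http://keystone.example.com:5000/v2.0"},
]


def pick_endpoint(catalog: List[Dict[str, str]], service: str,
                  region: Optional[str] = None) -> str:
    """Return the URL for `service`, optionally narrowed to one region."""
    matches = [
        entry for entry in catalog
        if entry["service"] == service
        and (region is None or entry["region"] == region)
    ]
    if not matches:
        raise LookupError(f"no {service} endpoint in region {region!r}")
    if len(matches) > 1:
        # Several regions offer this service and the caller did not choose one.
        raise LookupError(f"ambiguous {service} endpoint; specify a region")
    return matches[0]["url"]
```

With this catalog, asking for `compute` without a region fails as ambiguous, while asking for `compute` in `RegionTwo` returns a single URL.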
- 13. Regions
● Target actions at a region's endpoint (mandatory):
–CLI:
● $ nova --os-region-name "RegionTwo" boot …
–Horizon:
- 17. Cells
● Maintain a single compute endpoint
● Relieve pressure on queues and the database at scale (1000s of nodes)
● Introduce the cells scheduler
- 18. API (parent) cell
● Adds a load balancer in front of
multiple instances of the API service
● Has its own message queue
● Includes a new service, nova-cells
–Handles cell scheduling
–Packaged as openstack-nova-cells
–Required in every cell
- 19. Compute (child) cell
● Each compute cell contains:
–Its own message queue and database
–Its own scheduler, conductor, compute
nodes
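The parent/child split amounts to two scheduling passes: the cells scheduler picks a child cell, then that cell's own scheduler picks a host. The following is a rough sketch of that flow. The topology, the function names, and the "most free RAM" weighting are all assumptions made for illustration; the real schedulers use configurable filters and weights.

```python
from typing import Dict, Tuple

# Hypothetical topology: child cell name -> free RAM (MB) per compute host.
CELLS: Dict[str, Dict[str, int]] = {
    "cell1": {"c1-host1": 2048, "c1-host2": 512},
    "cell2": {"c2-host1": 8192, "c2-host2": 4096},
}


def schedule_cell(cells: Dict[str, Dict[str, int]], ram_mb: int) -> str:
    """First pass (cells scheduler): pick a child cell that can fit the request."""
    candidates = [
        name for name, hosts in cells.items()
        if any(free >= ram_mb for free in hosts.values())
    ]
    if not candidates:
        raise LookupError("no cell can fit the request")
    # Assumed weighting: prefer the cell with the most total free RAM.
    return max(candidates, key=lambda name: sum(cells[name].values()))


def schedule_host(hosts: Dict[str, int], ram_mb: int) -> str:
    """Second pass (the cell's own scheduler): pick a host inside the cell."""
    fitting = {host: free for host, free in hosts.items() if free >= ram_mb}
    return max(fitting, key=fitting.get)


def boot(cells: Dict[str, Dict[str, int]], ram_mb: int) -> Tuple[str, str]:
    """Run both passes and record the RAM consumed on the chosen host."""
    cell = schedule_cell(cells, ram_mb)
    host = schedule_host(cells[cell], ram_mb)
    cells[cell][host] -= ram_mb
    return cell, host
```

The caller only ever talks to `boot`, which mirrors the single compute endpoint that cells maintain.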
- 20. Common cell configuration
● Setup database and message broker for each cell
● Initialize cell database using nova-manage
● Optionally:
–Modify scheduling filter/weight configuration for cells scheduler
–Create a cells JSON file to avoid reloading cell information from the database
- 21. API (parent) cell configuration
● nova.conf:
–Change compute_api_class
–Enable cells
–Name the cell
–Enable and start nova-cells
- 22. Compute (child) cell configuration
● nova.conf
–Disable quota driver
–Enable cells
–Name the cell
–Enable and start nova-cells
- 23. Cells pitfalls
● That all sounds pretty good – sign me up!
● Lack of “cell awareness” in other projects
● Minimal test coverage in the gate
● Some standard functionality currently broken with cells:
–Host aggregates
–Security groups
- 24. So how do they stack up?
Regions
● Supported by all services
● Separate endpoints
● Exist above scheduling
● Linked via REST APIs
Cells
● Supported by compute
● Common endpoint
● Additional scheduling layer
● Linked via RPC
- 26. Host aggregates
● Logical groupings of hosts based on metadata
● Typically metadata describes special capabilities hosts share:
–Fast disks for ephemeral data storage
–Fast network interfaces
–Etc.
● Hosts can be in multiple host aggregates:
–“Hosts that have SSD storage and GPUs”
- 27. Host aggregates
● Implicitly user targetable:
–Admin defines host aggregate with metadata, and a flavor that matches it
–User selects flavor with extra specifications when requesting instance
–Scheduler places instance on a host in a host aggregate that matches
(extra specifications to metadata)
–User explicitly targets a capability, not an aggregate
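The matching step can be pictured as comparing a flavor's extra specifications with the merged metadata of every aggregate a host belongs to. This is a minimal sketch of that idea, not Nova's filter code. The aggregate names, metadata keys, and helper functions are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical aggregates: name -> member hosts and capability metadata.
AGGREGATES = {
    "ssd-hosts": {"hosts": {"host1", "host2"}, "metadata": {"ssd": "true"}},
    "gpu-hosts": {"hosts": {"host2", "host3"}, "metadata": {"gpu": "true"}},
}


def host_metadata(host: str, aggregates: dict) -> Dict[str, Set[str]]:
    """Merge the metadata of every aggregate that contains `host`."""
    merged: Dict[str, Set[str]] = {}
    for aggregate in aggregates.values():
        if host in aggregate["hosts"]:
            for key, value in aggregate["metadata"].items():
                merged.setdefault(key, set()).add(value)
    return merged


def matching_hosts(hosts: Iterable[str], extra_specs: Dict[str, str],
                   aggregates: dict) -> List[str]:
    """Keep hosts whose aggregate metadata satisfies every extra spec."""
    result = []
    for host in hosts:
        metadata = host_metadata(host, aggregates)
        if all(value in metadata.get(key, set())
               for key, value in extra_specs.items()):
            result.append(host)
    return result
```

A flavor carrying both `ssd=true` and `gpu=true` would land only on `host2`, the one host in both aggregates. The user never names an aggregate, only the flavor.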
- 29. Host aggregates (example)
● Create host aggregates:
–$ nova aggregate-create storage-optimized
–$ nova aggregate-create network-optimized
–$ nova aggregate-create compute-optimized
- 40. Availability zones
● Logical groupings of hosts based on arbitrary factors like:
–Location (country, data center, rack, etc.)
–Network layout
–Power source
● Explicitly user targetable:
–$ nova boot --availability-zone "rack1"
● OpenStack Block Storage (Cinder) also has availability zones
- 41. Availability zones
● Host aggregates are made explicitly user targetable by creating
them as an AZ:
–$ nova aggregate-create tier-1 us-east-tier-1
–tier-1 is the aggregate name, us-east-tier-1 is the AZ name
● Host aggregate is the availability zone in this case
–Hosts can not be in multiple availability zones
● Well...sort of.
–Hosts can be in multiple host aggregates
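One way to read the "sort of" caveat: an availability zone is an aggregate whose metadata carries an `availability_zone` key, so a host may sit in many aggregates as long as at most one of them names a zone. The sketch below checks that rule. The aggregate data and the `availability_zone_of` helper are hypothetical, and falling back to a zone called `nova` assumes the stock default is unchanged.

```python
from typing import Dict, List

DEFAULT_AZ = "nova"  # assumption: the deployment keeps Nova's stock default zone

# Hypothetical aggregates; only the first one is exposed as an availability zone.
AGGREGATES: List[Dict] = [
    {"name": "rack1-hosts", "hosts": {"host1"},
     "metadata": {"availability_zone": "rack1"}},
    {"name": "ssd-hosts", "hosts": {"host1", "host2"},
     "metadata": {"ssd": "true"}},
]


def availability_zone_of(host: str, aggregates: List[Dict]) -> str:
    """Derive a host's zone from the aggregates it belongs to."""
    zones = {
        aggregate["metadata"]["availability_zone"]
        for aggregate in aggregates
        if host in aggregate["hosts"]
        and "availability_zone" in aggregate["metadata"]
    }
    if len(zones) > 1:
        # Many aggregates are fine, but only one of them may name a zone.
        raise ValueError(f"{host} is in multiple availability zones: {sorted(zones)}")
    return zones.pop() if zones else DEFAULT_AZ
```

Here `host1` is in two aggregates yet has exactly one zone, `rack1`, while `host2` falls back to the default.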
- 44. So how do they stack up?
Host Aggregates
● Implicitly user targetable
● Hosts can be in multiple
aggregates
● Grouping based on common
capabilities
Availability Zones
● Explicitly user targetable
● Hosts can not be in multiple
zones (see previous disclaimer)
● Grouping based on arbitrary
factors such as location, power,
network
- 46. Server groups
● Policies for defining workload placement rules for a group
–Anti-affinity filter – Grizzly
–Affinity filter – Havana
–API – Icehouse
● Implemented via scheduler filters:
–ServerGroupAffinityFilter
–ServerGroupAntiAffinityFilter
- 47. Server groups
● Affinity:
–Places instances within the group on the same host
● Anti-affinity:
–Places instances within the group on different hosts
● Not equivalent to AWS placement groups (host placement versus
availability zone placement)
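Both policies reduce to filtering candidate hosts against the hosts the group already occupies. The following is a simplified sketch of that logic rather than the actual Nova filters. The function names, the "first passing host wins" choice, and the use of `RuntimeError` to stand in for NoValidHost are assumptions.

```python
from typing import List, Set


def affinity_filter(candidates: List[str], group_hosts: Set[str]) -> List[str]:
    """Keep only hosts the group already uses; an empty group accepts any host."""
    if not group_hosts:
        return list(candidates)
    return [host for host in candidates if host in group_hosts]


def anti_affinity_filter(candidates: List[str], group_hosts: Set[str]) -> List[str]:
    """Keep only hosts the group does not use yet."""
    return [host for host in candidates if host not in group_hosts]


def place(policy: str, candidates: List[str], group_hosts: Set[str]) -> str:
    """Pick a host for the next group member and record it in `group_hosts`."""
    filters = {"affinity": affinity_filter, "anti-affinity": anti_affinity_filter}
    passing = filters[policy](candidates, group_hosts)
    if not passing:
        # Stand-in for the scheduler's NoValidHost error (the policy is hard).
        raise RuntimeError("NoValidHost")
    host = passing[0]  # assumed tie-break: first passing candidate
    group_hosts.add(host)
    return host
```

With two candidate hosts, an anti-affinity group accepts two instances and rejects the third, whereas an affinity group keeps stacking instances on the first host chosen.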
- 48. Server groups
● Create the server group:
–$ nova server-group-create --policy=anti-affinity my_group
–Really defining a policy rather than a group.
● Specify the group UUID or name when launching instances:
–$ nova boot --image ... --flavor … --hint group=group_id
- 51. What next?
● Relevant design sessions:
–Simultaneous Scheduling for Server Groups
● Friday, May 16 • 1:20pm – 2:00pm
–Scheduler hints for VM life cycle
● Friday, May 16 • 2:10pm – 2:50pm
–Nova Dev/Ops Session
● Friday, May 16 • 3:00pm - 3:40pm
- 52. Resources
● Operations Guide – Chapter 5 “Scaling”
–http://docs.openstack.org/trunk/openstack-ops/content/scaling.html
● Configuration Reference Guide – Chapter 2 “Compute”
–http://docs.openstack.org/trunk/config-reference/content/section_compute-cells.html
● OpenStack in Production Blog
–http://openstack-in-production.blogspot.fr/
Editor's Notes
- Role at Red Hat involves talking to customers, primarily about OpenStack compute
Confusion about various options for compute segregation common
Here we are!
I'm a compute guy, so primarily talking about compute
- Physical characteristics might include:
Geographic location (country, state, city, data center, rack)
Power source
Network layout
Anything really, but typically something where a fault would take out the entire unit or it's desirable to upgrade it as one.
- 30 hosts for vSphere/ESXi cluster
~30 clusters for vCenter management
- In EC2, one user's us-east-1a may differ from another's.
- AVAILABLE_REGIONS
Some facility for sharing other facilities between regions by adding endpoints for same IP in both.
Token replication, use memcached to help
- AmbiguousEndpoints
- No facility for setting metadata via Horizon, yet.
- Targeting is not mandatory.
If a default is specified it is used.
If no default is specified
- Not equivalent with current policies anyway.
- Trying to boot more instances in a group with affinity than the hardware allows results in NoValidHost (hard affinity)
- Trying to boot more instances than there are hosts available in a group with anti-affinity results in NoValidHost (hard anti-affinity).