4. Project Goal
Maximize Resource Budget Solution
Only pay for what you need?
Reduce AWS traffic to make the best use of
lower-resource instances
Use AutoScaling to handle huge spikes in traffic
to the web front end
Sporting Events
Promotional Blasts
5. GIT Setup
Branch setup
Development, Staging, Production, Master
Which branch is on which server
Relatively low dev work after launch, no
cross-company collaboration
Start up scripts for the servers to check for code
updates
.gitignore
.htaccess
Settings and files (NFS)
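A minimal sketch of such a start-up script, assuming each server is pinned to one branch. The environment names, the branch names they map to, and the checkout path in the usage comment are illustrative assumptions, not the project's actual layout.

```shell
#!/bin/sh
# Map a server role to the branch it should run (hypothetical mapping).
branch_for_env() {
    case "$1" in
        dev)        echo development ;;
        staging)    echo staging ;;
        production) echo production ;;
        *)          echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# Fast-forward an existing checkout to the tip of its branch.
update_checkout() {
    repo_dir=$1
    branch=$2
    if [ ! -d "$repo_dir/.git" ]; then
        echo "no git checkout at $repo_dir" >&2
        return 1
    fi
    git -C "$repo_dir" fetch --quiet origin "$branch" &&
        git -C "$repo_dir" checkout --quiet "$branch" &&
        git -C "$repo_dir" merge --ff-only --quiet "origin/$branch"
}

# Example call from a boot script (path is an assumption):
#   update_checkout /var/www/site "$(branch_for_env production)"
```

Using a fast-forward-only merge means a server with local edits fails loudly instead of silently merging.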
6. AWS Intro
Command Line Tools
Mainly EC2
More features
.bash_profile
Maintain so you can have multiple clients
8. And again...
# Paths to AWS Tools #
export EC2_HOME=~/ec2-api-tools-1.5.2.2
export AWS_AUTO_SCALING_HOME=~/AutoScaling-1.0.39.0
export AWS_RDS_HOME=~/RDSCli-1.6.001
#PATH=$PATH:$HOME/bin (This is probably your default)
PATH=$PATH:$HOME/bin:${EC2_HOME}/bin:${AWS_ELB_HOME}/bin:${AWS_AUTO_SCALING_HOME}/bin:${AWS_RDS_HOME}/bin
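One way to keep several clients apart in .bash_profile is a small switch function. This is a sketch only: EC2_PRIVATE_KEY and EC2_CERT are read by the EC2 API tools and AWS_CREDENTIAL_FILE by the AutoScaling tools, but the ~/.ec2/<client>/ layout and the file names are assumptions.

```shell
# Switch the credentials the AWS tools pick up to a given client.
# Directory layout and file names below are hypothetical.
use_client() {
    client=${1:-}
    if [ -z "$client" ]; then
        echo "usage: use_client <client-name>" >&2
        return 1
    fi
    export EC2_PRIVATE_KEY="$HOME/.ec2/$client/pk.pem"
    export EC2_CERT="$HOME/.ec2/$client/cert.pem"
    export AWS_CREDENTIAL_FILE="$HOME/.ec2/$client/credentials"
}
```

Running `use_client acme` before a session points every tool at that client's keys.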
9. Base Config (The Tedious Way)
What is this for?
Testing
Growing your AutoScaling Group
Initial configuration of the base instance
Public AMIs
Bootstrap your own Debian Instance
https://github.com/tomheady/ec2debian/wiki/64bit-ebs-ami-pvgrub
Service Basics
Mysql, apache, postfix, users
10. AWS Tools Test
Use the “describe” type commands to see what
info you can pull
grapple:~ greg$ ec2-describe-instances
Unable to find a $JAVA_HOME at "/usr", continuing with system-provided Java...
RESERVATION r-0854b268 109231141564 OIT Dev/Staging
INSTANCE i-615ba304 ami-e00df089 stopped oit 0 m1.medium
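The describe output is easy to script against. A small sketch that pulls the instance id and state out of the INSTANCE lines; it scans for a known state word rather than a fixed column, since the column position is an assumption that may vary with empty fields and tool version.

```shell
# Read ec2-describe-instances output on stdin and print "<id> <state>"
# for each INSTANCE line.
instance_states() {
    awk '$1 == "INSTANCE" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /^(pending|running|shutting-down|terminated|stopping|stopped)$/) {
                print $2, $i
                break
            }
    }'
}

# Typical use:
#   ec2-describe-instances | instance_states
```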
11. Base Config (The Chef Way)
Chef
Initial configuration of the base instance
Recipes
Mysql, apache, postfix, users
Caveats and Bootstrap Usage
Un-bootstrapping
Boot Time PLUS Config Time
12. Manual vs Chef Base Config
Time to learn Chef (who pays for it)
Do you have anything in place you can
replicate?
Does the client/server need any unique config
items?
Who is going to “Own” the Chef Server?
Additional Costs and Time
Time & Cost ruled all on this; understand your
client's needs
13. Manual vs Chef Base Config
Time & Cost Ruled All
Instance spin-up time
Chef Config time
How “Blank” is your base instance?
Overall Trigger to “In Service” Times
Caps Game 7 OT Scenario
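The trigger-to-"In Service" time is just the sum of the delays listed above. A rough worked example; every number here is a made-up placeholder, so measure your own.

```shell
# Sum the delays between a load spike and a new instance taking traffic.
# All arguments are in seconds.
time_to_in_service() {
    trigger_delay=$1   # how long the metric must breach before scaling
    spin_up=$2         # launch request until the instance is "running"
    config_time=$3     # Chef (or other) configuration run on first boot
    elb_health=$4      # ELB health checks until "In Service"
    echo $((trigger_delay + spin_up + config_time + elb_health))
}

# Hypothetical numbers: a fully baked AMI vs. a blank one configured by Chef.
echo "baked AMI: $(time_to_in_service 120 90 0 60)s"
echo "blank AMI: $(time_to_in_service 120 90 300 60)s"
```

With these placeholder figures the blank instance takes 570 seconds against 270 for the baked one, which is the gap that matters in an overtime traffic spike.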
14. AWS Workflow & Infrastructure
Development Instance
Ability to turn them off, keeping the costs low
and on the client side
Dev & Staging Site
Possible updates? Just run chef-client on boot-up
15. Elastic Load Balancer
80 → 80
443 → 443
Keep it simple. If you put your Cert ON the ELB,
instances see the ELB's address, so you'll have
to read the client IP from the X-Forwarded-For
header
Only One Cert per ELB
Multiple ELBs to an instance requires command line
tools
CNAME – Force Traffic to www
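When the ELB proxies the request, the client address arrives in the X-Forwarded-For header rather than as the connection's peer. A sketch of pulling out the address the ELB itself appended, assuming a single ELB in front of the instance; entries further left are client-supplied and easy to spoof.

```shell
# Print the last address in an X-Forwarded-For value, i.e. the entry
# appended by the proxy closest to the instance (here, the ELB).
elb_client_ip() {
    printf '%s\n' "$1" | awk -F',' '{
        ip = $NF
        gsub(/^[ \t]+|[ \t]+$/, "", ip)
        print ip
    }'
}

elb_client_ip "198.51.100.9, 203.0.113.7"   # prints 203.0.113.7
```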
16. Instance Security Groups
What Are They?
Using the Groups
Simple GUI (something actually available in the
console)
What to put in them
SSH, ICMP
jailed to your source
maintainable outside of the instance config
HTTP/HTTPS (but from what source?)
Traffic Flow (amazon-elb/sg-843f59ed)
Add a test source, use your hosts file
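For the hosts-file trick, a tiny helper that prints the line to add. The IP and site names are placeholders; the instance's security group still has to allow your source address.

```shell
# Print a hosts-file line that points a site name at one instance,
# bypassing the ELB. Append it to /etc/hosts by hand and remove it
# after testing.
hosts_entry() {
    ip=$1
    shift
    printf '%s\t%s\n' "$ip" "$*"
}

hosts_entry 203.0.113.10 www.example.com example.com
```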
17. Instance Group Features
Divide Them Up
Few Functions Per SG
WEB, DB, NFS, etc
Public to Specific Type, then link them together inside the Zone
Jail Services to Inside the Zone
NFS
MySQL
sg-504e8f38 (OIT PROD DB)
Even Traffic from the ELB
amazon-elb/sg-843f59ed (amazon-elb-sg)
Accommodates New Instances' Addresses
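A sketch of how the split could look as commands. It only prints the ec2-authorize lines for review rather than running them; the group names, CIDR, and account id are placeholders, and the flags are as commonly documented for the EC2 API tools, so check them against your tool version.

```shell
# Print (not run) the rule commands for a web/db split.
# Usage: print_sg_rules WEB_SG DB_SG ADMIN_CIDR ACCOUNT_ID
print_sg_rules() {
    web_sg=$1; db_sg=$2; admin_cidr=$3; account_id=$4
    # SSH jailed to the admin source only
    echo "ec2-authorize $web_sg -P tcp -p 22 -s $admin_cidr"
    # HTTP/HTTPS only from the Amazon-owned ELB group
    echo "ec2-authorize $web_sg -P tcp -p 80 -o amazon-elb-sg -u amazon-elb"
    echo "ec2-authorize $web_sg -P tcp -p 443 -o amazon-elb-sg -u amazon-elb"
    # MySQL only from members of the web group
    echo "ec2-authorize $db_sg -P tcp -p 3306 -o $web_sg -u $account_id"
}

print_sg_rules WEB DB 203.0.113.0/24 111122223333
```

Because the sources are groups rather than addresses, a freshly launched instance is covered as soon as it joins the WEB group.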
18. AutoScaling Build-Up
Now that you have your Base Instance...
Creating an AutoScaling AMI
$ ec2-create-image -n newoitprod i-258a0f40
Feedback will tell you the AMI to use:
created AMI:
ami-0cfa2965
Careful now, AWS will turn it off to copy it
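To feed the new AMI id straight into the next step, the feedback can be captured in a variable. A sketch that grabs the first thing shaped like an AMI id, so it does not depend on the exact wording of the tool's output.

```shell
# Print the first AMI id found on stdin.
first_ami_id() {
    grep -Eo 'ami-[0-9a-f]+' | head -n 1
}

# Typical use (instance id from the example):
#   ami=$(ec2-create-image -n newoitprod i-258a0f40 | first_ami_id)
echo "IMAGE ami-0cfa2965" | first_ami_id   # prints ami-0cfa2965
```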
19. AutoScaling Infrastructure Details
What is going to Scale?
as-create-launch-config OptionITProd --image-id ami-0cfa2965 \
    --instance-type m1.large --monitoring-enabled --key oit \
    --group sg-f234f29a
OptionITProd – Unique Name you choose
--image-id – Feedback from Prev Step
--instance-type
32/64 available on any type now, woooo!
This gets us better granularity and reduced cost
--group – Your WEB Security Group
20. Defining the Entire Group
You do this for multiple projects...
as-create-auto-scaling-group -z us-east-1b -l OptionITProd \
    -M 20 -m 2 --default-cooldown 180 --desired-capacity 2 \
    --load-balancers OITNewProd \
    --auto-scaling-group OITNewProdASGroup
-l – Again, the Previous Step config
-M/m – Max/Min instances
Setting max and min is great for do-overs
Recommended minimum is 2 because a single instance has no SLA
--default-cooldown – Hysteresis (> 120s)
--load-balancers – This will auto attach
ELB still has to see it as healthy though
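Auto Scaling expects the desired capacity to sit between the minimum and the maximum, so a quick local check before creating or updating a group can save a do-over. A minimal sketch; the function name is made up.

```shell
# Succeed only if min <= desired <= max.
valid_capacity() {
    min=$1; desired=$2; max=$3
    [ "$min" -le "$desired" ] && [ "$desired" -le "$max" ]
}

valid_capacity 2 2 20 && echo "ok: 2 within 2..20"
valid_capacity 2 1 20 || echo "rejected: 1 is below the minimum of 2"
```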
21. Great. HOW do we scale?
$ as-create-or-update-trigger OITCPUTrigger \
    --auto-scaling-group OITNewProdASGroup \
    --namespace "AWS/EC2" --measure CPUUtilization --statistic Average \
    --dimensions "AutoScalingGroupName=OITNewProdASGroup" \
    --period 120 --lower-threshold 20 --upper-threshold 60 \
    --lower-breach-increment=-1 --upper-breach-increment 1 \
    --breach-duration 120
You're Welcome
22. AutoScaling Referrers and Stats
--auto-scaling-group = Name from the
as-create-auto-scaling-group command
--namespace = standard, what AWS feature to
apply this to. For EC2, always pick
“AWS/EC2”
--measure = metric to trigger against. Here it’s
CPU. Can be changed to available storage
space, etc.
--statistic = Metric method. Could be tripped on
an absolute value, average, max, min, etc.
23. Scaling Metrics
--dimensions = Scopes the metric to a subset of
resources. Here it limits the measurement to
instances in the named AutoScaling group,
somewhat of a jail to operate in
--period = for the statistic metric, amount of time
to take the measurement for (seconds)
--lower-threshold = When to scale down the
autogroup. Here it will reduce the size by 1
when the average CPU utilization across the
group, measured over each --period, stays
below 20% for the --breach-duration
24. Scaling Actions
--breach-duration = Amount of time (seconds) the
--statistic has to stay past a threshold to
trigger an autoscaling event (either increase
by 1 or decrease by 1; notice for the lower
breach the number is -1)
How much are we going to Scale
We can't force AWS to go shorter than 2
minutes
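To get a feel for how the thresholds and breach duration interact, here is a simplified local model of the decision. It is an illustration of the rule as described above, not how AWS evaluates the trigger internally; the function name and sample values are made up.

```shell
# Decide the scaling step from consecutive per-period CPU averages.
# Usage: scale_decision LOWER UPPER SAMPLE...
# Prints 1 if every sample is above UPPER, -1 if every sample is below
# LOWER, otherwise 0. Pass (breach-duration / period) integer samples.
scale_decision() {
    lower=$1; upper=$2
    shift 2
    [ "$#" -gt 0 ] || { echo 0; return 0; }
    all_high=1; all_low=1
    for cpu in "$@"; do
        [ "$cpu" -gt "$upper" ] || all_high=0
        [ "$cpu" -lt "$lower" ] || all_low=0
    done
    if [ "$all_high" -eq 1 ]; then printf '%s\n' 1
    elif [ "$all_low" -eq 1 ]; then printf '%s\n' -1
    else printf '%s\n' 0
    fi
}

scale_decision 20 60 75 82   # prints 1
scale_decision 20 60 12 75   # prints 0
scale_decision 20 60 12 9    # prints -1
```

A single sample back inside the 20 to 60 band resets the decision to 0, which is why short spikes do not trigger scaling.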
25. Operation
Nature of the Scaling
Not a “LIFO” scaling model
(get your logs while you can!)
How fast...Really?
2+ Minutes After Trigger
CDN
Reduce Traffic Load from NFS Shares
28. Alarms
CloudWatch
CPU Usage (which leads to scaling events)
DB Instance Network Out
ELB Unhealthy Host
Indicates “Out of Service” Instance
29. Testing
Remember the SG Hole we left?
Direct your computer directly to an Instance
Terminate an Instance
Check mounts
Time responses
Time to reach “running” status
Time to attach to ELB with “In Service” status
Load Testing
Be aware of what you are throwing at it
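Timing the status changes by hand gets old quickly. A sketch of a generic poller that reports how many seconds a command took to print the wanted status; the status command you pass in is up to you, and the one in the usage comment is hypothetical.

```shell
# Poll a command until its output equals WANT, then print elapsed seconds.
# Usage: wait_for_status WANT INTERVAL_SECS MAX_TRIES COMMAND [ARGS...]
# Returns 1 if the status never appears within MAX_TRIES polls.
wait_for_status() {
    want=$1; interval=$2; max_tries=$3
    shift 3
    start=$(date +%s)
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if [ "$("$@" 2>/dev/null)" = "$want" ]; then
            echo $(( $(date +%s) - start ))
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}

# Example (my_instance_state is a hypothetical helper that prints the state):
#   wait_for_status running 5 60 my_instance_state i-258a0f40
```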
30. Making Mods
Removing the Set-Up
$ as-delete-trigger OITCPUTrigger \
    --auto-scaling-group OITNewProdASGroup
$ as-update-auto-scaling-group OITNewProdASGroup \
    --min-size 0 --max-size 0
$ as-delete-auto-scaling-group OITNewProdASGroup
$ as-delete-launch-config OptionITProd
Deleting your AMI (available through the Console)
31. More Modifications
Create a new image of it:
ec2-create-image -n OITProd20120329 i-258a0f40
The -n name is stamped with the current date; it must be
a new, unique name.
Create a new launch config with the new AMI
as-create-launch-config OptionITProd20120329...
Update your current AS Group
as-update-auto-scaling-group OITNewProdASGroup \
    --launch-configuration OptionITProd20120329
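The date-stamped names can be generated instead of typed. A small sketch; the helper name is made up, and the optional second argument exists only so a fixed date can be passed in.

```shell
# Build a unique, date-stamped name such as OITProd20120329.
# The second argument overrides today's date (YYYYMMDD).
dated_name() {
    prefix=$1
    stamp=${2:-$(date +%Y%m%d)}
    echo "${prefix}${stamp}"
}

dated_name OITProd 20120329   # prints OITProd20120329
```

Two images on the same day would still collide, so add a suffix by hand in that case.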
32. PCI Compliance
Understand the nature of the scan
Talk to techs, whatever you have to do
ELB = bad
Create a test instance for the scanning tool
Don't Hit Production, yet
33. Lessons Learned
Use an RDS? What sort of access and tools will
people use? How many Zones are you
operating in?
How to pre-scale to prepare for a flood
Set the Min up a notch or two
Contact Amazon and make sure your account
can scale up to the Max # of Instances
Use Chef for search of Instances and DB
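Setting the Min up a notch before a known event can be scripted. A sketch that works out the new minimum, caps it at the group's maximum, and prints the update command for review instead of running it; the bump size is your call.

```shell
# Raise the group's minimum by BUMP instances without exceeding MAX.
# Usage: prescale_cmd GROUP CURRENT_MIN BUMP MAX
prescale_cmd() {
    group=$1; min=$2; bump=$3; max=$4
    new_min=$((min + bump))
    [ "$new_min" -le "$max" ] || new_min=$max
    echo "as-update-auto-scaling-group $group --min-size $new_min"
}

prescale_cmd OITNewProdASGroup 2 2 20
```

Remember to set the minimum back afterwards, or the extra instances keep running.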
34. More Lessons Learned
Config Rsyslog for remote logging
Understand your client
Scheduling Maintenance
Planning Promotions and Watching Hockey
(literally)
Don't get in a race to the bottom; upscale if
you have a good case for it
Falcons vs. Caps
35. Conclusion
More Chef Control
Auto adding and deleting of nodes, etc
I figured this out in 2 weeks on and off. You can
probably do better.
No ultimate TotalChef solution yet?
Chef controlled cluster
MGMT Software controlling command line
tools