...
Confidence Level TBD: This article has not been reviewed for accuracy, timeliness, or completeness. Check that this information is valid before acting on it.
Highlights
Scale-up is triggered by an auto-scaling metric alarm that checks the queue size.
We want to reduce auto-scaling AZ rebalancing, which results in instance terminations. It is therefore better to keep the fleet balanced evenly across availability zones (AZs).
...
When using a spot fleet, we can ensure that scale-out happens in multiples of the number of AZs.
Set a scaling policy per ASG.
Example: if the alarm metric stays above the threshold of 1 for more than 60 seconds:
Add 1 instance when JobsWaiting-grfn-job_worker-large is in [1, 10)
Add 10 instances when JobsWaiting-grfn-job_worker-large is in [10, ∞)
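The step policy above can be sketched in Python. This is a minimal illustration, not the deployed configuration: the function names are hypothetical, and the alarm threshold of 1 is an assumption taken from the example. The dictionary keys follow the step adjustment fields of an AWS step scaling policy, where interval bounds are expressed relative to the alarm threshold. The last helper shows one way to round a scale-out request up to a multiple of the AZ count, as suggested for spot fleets.

```python
import math

# Absolute metric ranges from the example: (lower, upper, instances_to_add).
# upper=None means the range is unbounded above.
STEPS = [(1, 10, 1), (10, None, 10)]


def build_step_adjustments(threshold, steps):
    """Convert absolute metric ranges into step adjustments whose
    bounds are offsets from the alarm threshold (assumed to be 1 here)."""
    adjustments = []
    for lower, upper, add in steps:
        step = {
            "MetricIntervalLowerBound": lower - threshold,
            "ScalingAdjustment": add,
        }
        if upper is not None:
            step["MetricIntervalUpperBound"] = upper - threshold
        adjustments.append(step)
    return adjustments


def instances_to_add(metric_value, steps):
    """Return how many instances the policy would add for a metric value."""
    for lower, upper, add in steps:
        if metric_value >= lower and (upper is None or metric_value < upper):
            return add
    return 0  # below every range: no scale-out


def round_up_to_az_multiple(count, az_count):
    """Round a scale-out request up to a multiple of the number of AZs."""
    return math.ceil(count / az_count) * az_count
```

The list returned by `build_step_adjustments(1, STEPS)` has the shape expected by the `StepAdjustments` parameter of a step scaling policy; passing it to an actual API call is left out of this sketch.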
Optimization
...
Auto-scaling optimizations
...
The current public batch size is 20 instances per 5-minute cool-down.
...
The default cool-down is 300 seconds.
...
The default internal batch rate is 10 instances per 30 seconds.
AWS ASG will increase our maximum batch rate to 100 instances per 30 seconds.
...
We could manually set the desired group size to 100.
...
Logs indicate that our queue_size metric alarm fires only every 10 minutes.
...
Recommendations
Change the cool-down to 1 minute.
Try a batch size of 100 instances.
Set the custom queue_size CloudWatch metric to be checked every 1 minute.
If we make these recommended changes, we estimate that 1000 instances will take 55 minutes to ramp up.
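For comparing cool-down and batch-size settings, a naive ramp-up model can be sketched as follows. It assumes one batch is launched per cool-down period and counts nothing else, so it ignores alarm evaluation delay, the internal batch rate, and instance launch and boot time. Real ramp-up will take longer than the figure it returns. The function and parameter names are hypothetical.

```python
import math


def estimate_ramp_up_minutes(target_instances, batch_size,
                             cooldown_minutes, current_instances=0):
    """Lower-bound estimate: number of batches times the cool-down.

    Assumes exactly one batch of `batch_size` instances is added per
    cool-down period until the target is reached.
    """
    needed = max(0, target_instances - current_instances)
    batches = math.ceil(needed / batch_size)
    return batches * cooldown_minutes
```

Calling it once with the current settings (batch size 20, cool-down 5 minutes) and once with the recommended ones (batch size 100, cool-down 1 minute) shows how strongly the two settings drive the floor on ramp-up time.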
...
Self-Termination
...
The auto-scaling group's AZ rebalancing process needs to be suspended so that scale-down does not trigger rebalancing:
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/US_SuspendResume.html
...
Command to turn off AZ rebalancing for an ASG:
aws autoscaling suspend-processes --auto-scaling-group-name ${yourASGname} --scaling-processes AZRebalance
This only needs to be run once per ASG. The suspended process will show up in the Details tab of the ASG.
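The same operation can be sketched in Python. This assumes boto3 (the current AWS SDK for Python, not the older boto listed in the references) and an Auto Scaling client passed in by the caller. The helper names are hypothetical. `suspend_processes` and `describe_auto_scaling_groups` are the client calls used.

```python
def suspend_az_rebalance(client, asg_name):
    """Suspend only the AZRebalance process for one ASG.

    `client` is assumed to be a boto3 Auto Scaling client,
    e.g. boto3.client("autoscaling").
    """
    client.suspend_processes(
        AutoScalingGroupName=asg_name,
        ScalingProcesses=["AZRebalance"],
    )


def is_az_rebalance_suspended(client, asg_name):
    """Check whether AZRebalance is listed among the suspended processes."""
    response = client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name]
    )
    groups = response["AutoScalingGroups"]
    if not groups:
        return False  # no ASG with that name
    suspended = groups[0].get("SuspendedProcesses", [])
    return any(p["ProcessName"] == "AZRebalance" for p in suspended)
```

Because the suspension persists on the ASG, a script could call `is_az_rebalance_suspended` first and skip the suspend call when it already returns `True`.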
References
AWS Autoscaling
AWS CloudWatch
boto: Python interface to Amazon Web Services