Highlights

Scale-up is triggered by an auto-scaling metric alarm that checks the queue size.
We want to reduce auto-scaling AZ rebalancing, which results in instance terminations. It is therefore better to keep the fleet balanced evenly across Availability Zones (AZs).

...

When using a spot fleet, we can ensure that scale-out happens in multiples of the number of AZs.
Set scaling policy per ASG
- example: If alarm threshold is greater than 1 for greater than 60 seconds
  - Add 1 instance when JobsWaiting-grfn-job_worker-large is [1,10)
  - Add 10 instances when JobsWaiting-grfn-job_worker-large is [10,∞)
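The example policy above can be sketched in plain Python to check which step applies to a given queue size. This is only an illustration: the function names `instances_to_add` and `round_up_to_az_multiple` are hypothetical, the step bounds are copied from the example, and the rounding helper assumes that "multiples of the number of AZs" means rounding the scale-out count up.

```python
import math

# Step adjustments from the example policy, as
# (lower bound inclusive, upper bound exclusive, instances to add).
STEP_ADJUSTMENTS = [
    (1, 10, 1),
    (10, math.inf, 10),
]


def instances_to_add(jobs_waiting, steps=STEP_ADJUSTMENTS):
    """Return how many instances the step policy would add (hypothetical helper)."""
    for lower, upper, adjustment in steps:
        if lower <= jobs_waiting < upper:
            return adjustment
    return 0  # below the first step: no scale-out


def round_up_to_az_multiple(count, num_azs):
    """Round a scale-out count up to the next multiple of the AZ count.

    Assumption: rounding up (not down) is the desired behavior.
    """
    if num_azs <= 0:
        raise ValueError("num_azs must be positive")
    return math.ceil(count / num_azs) * num_azs
```

For instance, with 3 AZs a 10-instance step would be rounded up to 12 so that each AZ receives the same number of new instances.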

Optimization

...

Auto-scaling optimizations

...

- The current public batch size is 20 instances per 5-minute cool-down

...

- The default cool-down is 300 seconds

...

The default internal batch rate is 10 instances per 30 seconds.
AWS ASG will increase our maximum batch rate to 100 instances per 30 seconds.

...

We could manually set the desired group size to 100.

...

Logs indicate that our queue_size metric alarm is only firing every 10 minutes.

...

Recommendations
- Change the cool-down to 1 minute
- Try a batch size of 100 instances
- Set the custom queue_size CloudWatch metric to be checked every 1 minute
If these recommended changes are made, the estimate is that 1000 instances will take 55 minutes to ramp up.
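To compare cool-down and batch-size settings, a naive lower bound on ramp-up time can be computed from those two numbers alone. This sketch assumes exactly one batch is launched per cool-down period and ignores instance launch latency and alarm evaluation delay, so it does not attempt to reproduce the 55-minute estimate above, whose assumptions are not stated. The function name `estimate_ramp_up_minutes` is hypothetical.

```python
import math


def estimate_ramp_up_minutes(target_instances, batch_size, cooldown_seconds):
    """Naive ramp-up estimate in minutes, assuming one batch per cool-down.

    Assumption: every cool-down period admits exactly one full batch and
    nothing else (launch time, alarm period) adds delay.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if cooldown_seconds < 0:
        raise ValueError("cooldown_seconds must not be negative")
    batches = math.ceil(target_instances / batch_size)
    return batches * cooldown_seconds / 60.0
```

Under this model, 1000 instances at 20 instances per 300-second cool-down need 50 batches, and at 100 instances per 60-second cool-down they need 10 batches.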

...

Self-Termination

...

Need to suspend AZ rebalancing on the auto-scaling group so that scale-down does not trigger rebalancing:
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/US_SuspendResume.html

...

Command for ASG to turn off AZ rebalancing:

aws autoscaling suspend-processes --auto-scaling-group-name ${yourASGname} --scaling-processes AZRebalance

This only needs to be run once per ASG; the suspended process will then show up in the Details tab of the ASG.
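When several ASGs need the same treatment, the command above could be wrapped in a small script. The following is a minimal sketch, assuming the `aws` CLI is installed and configured; the helper names and the `dry_run` flag are hypothetical, and by default nothing is executed.

```python
import subprocess


def suspend_az_rebalance_command(asg_name):
    """Build the argument list for the suspend-processes command shown above."""
    return [
        "aws", "autoscaling", "suspend-processes",
        "--auto-scaling-group-name", asg_name,
        "--scaling-processes", "AZRebalance",
    ]


def suspend_az_rebalance(asg_names, dry_run=True):
    """Return the commands for each ASG and run them unless dry_run is set.

    Assumption: the aws CLI is on PATH and has credentials configured.
    """
    commands = [suspend_az_rebalance_command(name) for name in asg_names]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)  # raises if the CLI call fails
    return commands
```

Calling `suspend_az_rebalance(["my-asg"])` only returns the command list; pass `dry_run=False` to actually invoke the CLI.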

References

AWS Autoscaling

http://boto.readthedocs.org/en/latest/autoscale_tut.html

AWS CloudWatch

boto: Python interface to Amazon Web Services

...

>>> import boto.ec2.cloudwatch
>>> c = boto.ec2.cloudwatch.connect_to_region('us-west-2')
>>> metrics = c.list_metrics()
>>> metrics
[Metric:DiskReadBytes,
 Metric:CPUUtilization,
 Metric:DiskWriteOps,
 Metric:DiskWriteOps,
 Metric:DiskReadOps,
 Metric:DiskReadBytes,
 Metric:DiskReadOps,
 Metric:CPUUtilization,
 Metric:DiskWriteOps,
 Metric:NetworkIn,
 Metric:NetworkOut,
 Metric:NetworkIn,
 Metric:DiskReadBytes,
 Metric:DiskWriteBytes,
 Metric:DiskWriteBytes,
 Metric:NetworkIn,
 Metric:NetworkIn,
 Metric:NetworkOut,
 Metric:NetworkOut,
 Metric:DiskReadOps,
 Metric:CPUUtilization,
 Metric:DiskReadOps,
 Metric:CPUUtilization,
 Metric:DiskWriteBytes,
 Metric:DiskWriteBytes,
 Metric:DiskReadBytes,
 Metric:NetworkOut,
 Metric:DiskWriteOps]

Code

http://gitlab:8000/browser/trunk/HySDS/cluster_fab/aria-jobs-dev/test_autoscale.py

Find an Error?

Is this document outdated or inaccurate? Please contact the assigned Subject Matter Expert:

Hook Hua
