2017-08-31 Death Valley for HySDS v2
(The following was documented on https://github.jpl.nasa.gov/hysds-org/general/issues/456)
Use spot fleet instead of Auto Scaling to also test harikiri at scale.
Depends on “verdi event stream on figaro” https://github.jpl.nasa.gov/hysds-org/general/milestone/55
New features tested:
- job drain
- no-clobber dataset publishing from verdi
- stability checks on compute instances
- docker daemon
- event stream back to mozart
- spot fleet
At around 2000 active worker nodes:
- mozart on r4.8xlarge
  - cpu around 20%
  - network in: 40MB/s
  - network out: 20MB/s
- metrics on r4.4xlarge
  - cpu around 2%
  - network in: 4MB/s
  - network out: 1MB/s
- grq on r4.4xlarge
  - cpu around 70%
  - network in: 4MB/s
  - network out: 2MB/s
- factotum on r4.4xlarge
  - cpu around 1%
  - network in: 1MB/s
  - network out: 2MB/s
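
As a rough way to read the figures above, the following Python sketch stores the observed usage per node and picks out the nodes with the highest CPU load. The numbers are copied from the list. The names `OBSERVED`, `CPU_WARN_PCT`, `busiest_nodes`, and `total_network_mb_s` are hypothetical, and the 50% threshold is an assumption for illustration, not part of the test.

```python
# Observed usage at around 2000 active worker nodes (values from the list above).
OBSERVED = {
    "mozart": {"instance": "r4.8xlarge", "cpu_pct": 20, "net_in_mb_s": 40, "net_out_mb_s": 20},
    "metrics": {"instance": "r4.4xlarge", "cpu_pct": 2, "net_in_mb_s": 4, "net_out_mb_s": 1},
    "grq": {"instance": "r4.4xlarge", "cpu_pct": 70, "net_in_mb_s": 4, "net_out_mb_s": 2},
    "factotum": {"instance": "r4.4xlarge", "cpu_pct": 1, "net_in_mb_s": 1, "net_out_mb_s": 2},
}

# Hypothetical warning threshold (assumption, not taken from the test report).
CPU_WARN_PCT = 50


def busiest_nodes(observed, cpu_warn_pct=CPU_WARN_PCT):
    """Return names of nodes at or above the CPU threshold, busiest first."""
    hot = [
        (name, stats["cpu_pct"])
        for name, stats in observed.items()
        if stats["cpu_pct"] >= cpu_warn_pct
    ]
    hot.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hot]


def total_network_mb_s(observed):
    """Return (total inbound, total outbound) traffic in MB/s across all nodes."""
    total_in = sum(stats["net_in_mb_s"] for stats in observed.values())
    total_out = sum(stats["net_out_mb_s"] for stats in observed.values())
    return total_in, total_out


if __name__ == "__main__":
    print("Nodes at or above threshold:", busiest_nodes(OBSERVED))
    print("Total network in/out (MB/s):", total_network_mb_s(OBSERVED))
```

With the assumed 50% threshold, only grq is flagged, which matches it being the most heavily loaded node in the list above.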
Successfully tested using the following trinity mode configuration:
- mozart (rabbitmq node) => r4.8xlarge
- mozart (ES node) => r4.8xlarge
- mozart (redis node) => r4.8xlarge
- grq => r4.8xlarge
- factotum => r4.4xlarge
- ci => r4.xlarge
metrics node:
grq node:
Note: JPL employees can also get answers to HySDS questions at Stack Overflow Enterprise: