2017-08-31 Death Valley for HySDS v2

(The following was documented on https://github.jpl.nasa.gov/hysds-org/general/issues/456)

Use spot fleet instead of Auto Scaling to also test harikiri at scale.
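
For readers less familiar with spot fleets: capacity is requested as a single configuration with a target size, rather than through an Auto Scaling group. Below is a minimal sketch of such a request body, shaped like the `SpotFleetRequestConfig` argument of boto3's `request_spot_fleet`. Every value in it (role ARN, AMI ID, instance types, price, capacity) is a hypothetical placeholder and not the configuration used in this test; the helper name `build_spot_fleet_config` is also illustrative.

```python
import json


def build_spot_fleet_config(target_capacity, instance_types, ami_id,
                            fleet_role_arn, max_price):
    """Return a dict shaped like boto3's SpotFleetRequestConfig.

    Nothing is sent to AWS here; the function only assembles the request body.
    """
    return {
        "IamFleetRole": fleet_role_arn,
        "TargetCapacity": target_capacity,
        "SpotPrice": max_price,  # maximum price per instance hour, as a string
        "AllocationStrategy": "lowestPrice",
        # One launch specification per candidate instance type.
        "LaunchSpecifications": [
            {"ImageId": ami_id, "InstanceType": itype}
            for itype in instance_types
        ],
    }


# All arguments below are hypothetical placeholders.
config = build_spot_fleet_config(
    target_capacity=2000,
    instance_types=["c4.2xlarge", "c5.2xlarge"],
    ami_id="ami-00000000",
    fleet_role_arn="arn:aws:iam::123456789012:role/example-spot-fleet-role",
    max_price="0.10",
)
print(json.dumps(config, indent=2))
```

Passing this dict as `SpotFleetRequestConfig` to an EC2 client's `request_spot_fleet` would submit the request; that call is left out so the sketch runs without AWS credentials.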

Depends on “verdi event stream on figaro” https://github.jpl.nasa.gov/hysds-org/general/milestone/55

New features tested:

  • job drain
  • no-clobber dataset publishing from verdi
  • stability checks on compute instances
  • docker daemon
  • event stream back to mozart
  • spot fleet


At around 2000 active worker nodes:

  • mozart on r4.8xlarge
  • cpu around 20%
  • network in: 40MB/s
  • network out: 20MB/s

[Screenshot: 2017-08-31 at 2:53:21 pm]

  • metrics on r4.4xlarge
  • cpu around 2%
  • network in: 4MB/s
  • network out: 1MB/s

[Screenshot: metrics, 2017-08-31 at 2:59:35 pm]

  • grq on r4.4xlarge
  • cpu around 70%
  • network in: 4MB/s
  • network out: 2MB/s

[Screenshot: grq, 2017-08-31 at 3:03:00 pm]

  • factotum on r4.4xlarge
  • cpu around 1%
  • network in: 1MB/s
  • network out: 2MB/s

[Screenshot: factotum, 2017-08-31 at 3:01:24 pm]
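
To put these figures in per-worker terms, the arithmetic below divides each node's reported network rate by the approximate worker count. The inputs are the rounded values listed above. The `observed` dict and the helper name are illustrative, 1 MB is taken as 1000 KB, and splitting traffic evenly across workers is a simplifying assumption, since not all traffic necessarily originates from worker nodes.

```python
WORKERS = 2000  # "around 2000 active worker nodes"

# Rounded values as reported above for each node.
observed = {
    "mozart":   {"instance": "r4.8xlarge", "cpu_pct": 20, "net_in_mb_s": 40, "net_out_mb_s": 20},
    "metrics":  {"instance": "r4.4xlarge", "cpu_pct": 2,  "net_in_mb_s": 4,  "net_out_mb_s": 1},
    "grq":      {"instance": "r4.4xlarge", "cpu_pct": 70, "net_in_mb_s": 4,  "net_out_mb_s": 2},
    "factotum": {"instance": "r4.4xlarge", "cpu_pct": 1,  "net_in_mb_s": 1,  "net_out_mb_s": 2},
}


def per_worker_kb_s(rate_mb_s, workers=WORKERS):
    """Convert an aggregate rate in MB/s to KB/s per worker (1 MB = 1000 KB)."""
    return rate_mb_s * 1000 / workers


for node, m in observed.items():
    rate_in = per_worker_kb_s(m["net_in_mb_s"])
    rate_out = per_worker_kb_s(m["net_out_mb_s"])
    print(f"{node:9s} in: {rate_in:5.1f} KB/s per worker   out: {rate_out:5.1f} KB/s per worker")
```

On these numbers mozart receives roughly 20 KB/s per worker, about ten times what grq or metrics receive.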


Successfully tested using the following trinity mode configuration:

  • mozart (rabbitmq node) => r4.8xlarge
  • mozart (ES node) => r4.8xlarge
  • mozart (redis node) => r4.8xlarge
  • grq => r4.8xlarge
  • factotum => r4.4xlarge
  • ci => r4.xlarge

[Figure: mozart trinity network out (blue: rabbitmq, orange: ES, green: redis)]

[Figure: mozart trinity network in (blue: rabbitmq, orange: ES, green: redis)]

[Figure: mozart trinity CPU utilization (blue: rabbitmq, orange: ES, green: redis)]

metrics node:


[Figure: metrics CPU utilization]

[Figure: metrics network out]

[Figure: metrics network in]


grq node:


[Figure: grq CPU utilization]

[Figure: grq network out]

[Figure: grq network in]

