Upgrade
For definitions of terminology used, please refer to our terminology reference.
Latest Releases
The latest releases are here: https://github.com/hysds/hysds-framework/releases. Each was taken from the latest head state of all repos at time of release.
Prerequisite - Graceful Shutdown
To preserve the state of queued/running HySDS jobs on mozart, the HySDS cluster should be brought down gracefully as follows:
Turn off all timer-based job submission scripts (e.g. crontab on factotum)
- Log into your factotum instance
ssh -i <PEM file> ops@<factotum IP>
- Back up your crontab into ~/crontabs and remove it to prevent jobs from being submitted during the upgrade process
mkdir -p ~/crontabs
cd ~/crontabs
crontab -l > crontab.$(date -u -Iseconds)
crontab -r
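The backup step above removes the live crontab, so it can help to confirm that the backup was actually written first. The following is a minimal sketch of that idea, not part of HySDS: the function name `backup_and_clear_crontab` is hypothetical, and it uses a portable `date` format string rather than `-Iseconds`.

```shell
# Back up the current crontab and remove it only if the backup succeeded.
# The backup directory defaults to ~/crontabs, matching the step above.
# (Hypothetical helper, not an sdscli command.)
backup_and_clear_crontab() {
    local backup_dir="${1:-$HOME/crontabs}"
    local backup_file
    mkdir -p "$backup_dir" || return 1
    backup_file="$backup_dir/crontab.$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    # crontab -l exits non-zero when the user has no crontab installed.
    if ! crontab -l > "$backup_file"; then
        rm -f "$backup_file"
        echo "no crontab to back up" >&2
        return 1
    fi
    # Refuse to remove the live crontab if the backup came out empty.
    if [ ! -s "$backup_file" ]; then
        echo "backup $backup_file is empty; leaving crontab in place" >&2
        return 1
    fi
    # Remove the live crontab and report where the backup was written.
    crontab -r && echo "$backup_file"
}
```

Running `backup_and_clear_crontab` on factotum prints the path of the backup file, which is the file to restore after the upgrade.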
Gracefully shut down the workers
- Log into your mozart instance
ssh -i <PEM file> ops@<mozart IP>
- Shut down the syncer processes that update the job queue metrics (i.e. AWS CloudWatch) that trigger autoscaling of verdi workers:
supervisorctl status | grep ^sync_ | awk '{print $1}' | xargs -i -t supervisorctl stop {}
- Cancel consumption of tasks:
sudo rabbitmqctl list_queues 2>&1 | grep -v '^celery' | tail -n +2 | awk '{print $1}' | xargs -i -t celery -A hysds control cancel_consumer {}
- Log into the RabbitMQ admin interface to ensure that all queues show 0 in the Unacked column of the Queues tab. To show only the job/task queues, enter ^(?!celery) in the filter text box and check the Regex checkbox. If there are jobs/tasks currently running, you can either wait for them to complete or kill them manually and retry them after the upgrade. The screenshot below shows the Unacked column with all zeros, signifying that there are no jobs/tasks currently running.
- Log into figaro and ensure that it matches what you see in the RabbitMQ admin interface
- Your cluster is now ready for the upgrade
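The Unacked check described above can also be approximated from the command line. The sketch below assumes the queue listing comes from `sudo rabbitmqctl list_queues name messages_unacknowledged`; the function names `busy_queues` and `cluster_is_idle` are hypothetical helpers, and the admin interface and figaro remain the reference check.

```shell
# Print every non-celery queue that still has unacknowledged messages.
# Reads "name count" lines on stdin, e.g. the output of:
#   sudo rabbitmqctl list_queues name messages_unacknowledged
# Header and banner lines are skipped because their second field is not a number.
busy_queues() {
    awk 'NF == 2 && $2 ~ /^[0-9]+$/ && $1 !~ /^celery/ && $2 > 0 { print $1, $2 }'
}

# Return 0 only when no job/task queue has unacknowledged messages.
cluster_is_idle() {
    local busy
    busy=$(busy_queues)
    if [ -n "$busy" ]; then
        echo "queues still running jobs/tasks:" >&2
        echo "$busy" >&2
        return 1
    fi
}
```

One possible use on mozart is `sudo rabbitmqctl list_queues name messages_unacknowledged 2>&1 | cluster_is_idle && echo "ready for upgrade"`.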
Upgrade
Update HySDS core using hysds-framework and sdscli
- Log into your mozart instance
ssh -i <PEM file> ops@<mozart IP>
- Stop the cluster
sds stop all -f
- Back up your mozart directory
mv ~/mozart ~/mozart.orig
- If you have it, remove the old hysds-framework clone
rm -rf ~/hysds-framework
- Clone the HySDS framework repository and enter it
cd ~
git clone https://github.com/hysds/hysds-framework.git
cd hysds-framework
- Select the HySDS framework release tag you'd like to install for mozart
./install.sh mozart
HySDS install directory set to /home/ops/mozart
New python executable in /home/ops/mozart/bin/python
Installing Setuptools............................................done.
Installing Pip...................................................done.
Created virtualenv at /home/ops/mozart.
[2017-08-09 19:25:37,789: INFO/main] Github repo URL: https://xxxxxxxx@github.com/api/v3/repos/hysds/hysds-framework/releases
[2017-08-09 19:25:37,798: INFO/_new_conn] Starting new HTTPS connection (1): github.com
No release specified. Use -r RELEASE | --release=RELEASE to install a specific release.
Listing available releases:
v2.0.0-beta.3
v2.1.0-beta.0
v2.1.0-beta.1
v2.1.0-beta.2
v2.1.0-beta.3
v2.1.0-beta.4
v2.1.0-beta.5
v2.1.0-beta.6
v2.1.0-beta.7
v2.1.0-beta.8
v2.1.0-rc.0
v2.1.0-rc.1
v2.1.0-rc.2
v2.1.0-rc.3
- Install the latest HySDS release (e.g. v2.1.0-rc.3) for the mozart component
./install.sh mozart -r <release>
e.g.
./install.sh mozart -r v2.1.0-rc.3
You could also install the development version, which pulls the master branch of each HySDS repo:
./install.sh mozart -d
- Restore the non-core repositories from the directory backup under ~/mozart.orig/ops
cd ~/mozart.orig/ops
for i in *; do
  new=~/mozart/ops/$i
  if [ ! -e "$new" ]; then
    cp -rp "$i" "$new"
  fi
done
- Update all HySDS components:
sds update all
If you receive any errors, they will need to be addressed.
- (Optional) Run any adaptation-specific fabric updates (e.g. update_aria_packages)
fab -f ~/.sds/cluster.py -R factotum,verdi update_aria_packages
- Build and ship out updated code/config bundles
sds ship
- Start up the grq component and validate that all services come up fine
sds start grq
sds status grq
- Start up the mozart component and validate that all services come up fine
sds start mozart
sds status mozart
- Start up the metrics component and validate that all services come up fine
sds start metrics
sds status metrics
- During installation, the latest versions of the lightweight-jobs core HySDS package and the verdi docker image were downloaded. If the version has incremented, import the lightweight-jobs package:
cd ~/mozart/pkgs
sds pkg import container-hysds_lightweight-jobs.*.sdspkg.tar
- Copy the verdi docker image to the code bucket (CODE_BUCKET as specified during sds configure). Ensure that the VERDI_PRIMER_IMAGE URL is consistent with this location:
aws s3 cp hysds-verdi-latest.tar.gz s3://<CODE_BUCKET>/hysds-verdi-latest.tar.gz
- Start up the factotum component and validate that all services come up fine
sds start factotum
sds status factotum
- View status of HySDS components and services:
sds status all
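Before or after running the restore loop for the non-core repositories in the steps above, it may be useful to see which entries under ~/mozart.orig/ops are absent from the new ~/mozart/ops. The sketch below only lists them and copies nothing; `list_missing_repos` is a hypothetical helper name, and the default paths are the ones used in this guide.

```shell
# List entries in the backed-up ops directory that are missing from the
# new install, i.e. what the restore loop would copy. Copies nothing.
# (Hypothetical helper, not an sdscli command.)
list_missing_repos() {
    local old_ops="${1:-$HOME/mozart.orig/ops}"
    local new_ops="${2:-$HOME/mozart/ops}"
    local path name
    for path in "$old_ops"/*; do
        [ -e "$path" ] || continue   # empty directory: the glob did not match
        name=$(basename "$path")
        if [ ! -e "$new_ops/$name" ]; then
            echo "$name"
        fi
    done
}
```

An empty result after the restore suggests that every repository from the backup is present in the new install.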
Restore all timer-based job submission scripts (e.g. crontab on factotum)
- Log back into factotum and restore the crontab
ssh -i <PEM file> ops@<factotum IP>
cd ~/crontabs
crontab crontab.<your_last_backup>
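If several backups have accumulated in ~/crontabs, the most recent one can be picked by name, since the timestamp suffix sorts chronologically. This is a minimal sketch under that assumption; `latest_crontab_backup` is a hypothetical helper, not part of HySDS.

```shell
# Print the newest crontab backup in a directory (default ~/crontabs).
# Assumes every backup is named crontab.<timestamp> with the same timestamp
# format, so that plain-text order equals chronological order.
latest_crontab_backup() {
    local backup_dir="${1:-$HOME/crontabs}"
    local newest=""
    local f
    for f in "$backup_dir"/crontab.*; do
        [ -f "$f" ] || continue
        newest="$f"   # globs expand in sorted order, so the last match wins
    done
    # Non-zero exit status when no backup was found.
    [ -n "$newest" ] && echo "$newest"
}
```

A possible restore would then be `backup=$(latest_crontab_backup) && crontab "$backup"`, followed by `crontab -l` to confirm the result.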