Gerald Manipon edited this page on Aug 30, 2018 · 6 revisions
Confidence Level TBD: This article has not been reviewed for accuracy, timeliness, or completeness. Check that this information is valid before acting on it.
Upgrade
For definitions of terminology used, please refer to
...
Latest Releases
The latest releases are here: https://github.com/hysds/hysds-framework/releases. Each was taken from the latest head state of all repos at time of release.
Prerequisite - Graceful Shutdown
To preserve the state of queued/running HySDS jobs on mozart, the HySDS cluster should be brought down gracefully as follows:
Turn off all timer-based job submission scripts (e.g. crontab on factotum)
Log into your factotum instance
Code Block ssh -i <PEM file> ops@<factotum IP>
Back up your crontab and remove it to prevent jobs from being submitted during the upgrade process
Code Block
mkdir -p ~/crontabs
cd ~/crontabs
crontab -l > crontab.$(date -u -Iseconds)
crontab -r
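The backup-and-remove step can be wrapped so that the crontab is removed only after a non-empty backup has been written. This is a minimal sketch and not part of HySDS: the function name `backup_and_clear_crontab` and its optional directory argument are hypothetical, and the timestamp uses a portable `date` format rather than `-Iseconds`.

```shell
# Hypothetical helper (not part of HySDS): back up the current crontab into
# a timestamped file and remove the crontab only if the backup is non-empty.
backup_and_clear_crontab() {
    backup_dir="${1:-$HOME/crontabs}"
    mkdir -p "$backup_dir" || return 1
    backup_file="$backup_dir/crontab.$(date -u +%Y-%m-%dT%H:%M:%SZ)"

    # crontab -l exits non-zero when no crontab is installed
    if ! crontab -l > "$backup_file"; then
        rm -f "$backup_file"
        echo "no crontab to back up" >&2
        return 1
    fi

    # Refuse to remove the crontab if nothing was written to the backup
    if [ ! -s "$backup_file" ]; then
        echo "backup is empty: $backup_file" >&2
        return 1
    fi

    # Remove the crontab and print the path of the backup just written
    crontab -r && echo "$backup_file"
}
```

Calling `backup_and_clear_crontab` with no argument writes the backup under `~/crontabs`, the directory the restore step at the end of this page reads from.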
Gracefully shut down the workers
Log into your mozart instance
Code Block ssh -i <PEM file> ops@<mozart IP>
Shut down the syncer processes that update job queue metrics (i.e. AWS CloudWatch), which trigger autoscaling of verdi workers:
Code Block supervisorctl status | grep ^sync_ | awk '{print $1}' | xargs -i -t supervisorctl stop {}
Cancel consumption of tasks:
Code Block sudo rabbitmqctl list_queues 2>&1 | grep -v '^celery' | tail -n +2 | awk '{print $1}' | xargs -i -t celery -A hysds control cancel_consumer {}
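Before cancelling consumers, it can help to preview which queues the pipeline above selects. The sketch below applies the same filter (drop queues starting with `celery`, skip the header line, keep the first column) but reads the listing from standard input and only prints the commands. The function name `preview_cancel_commands` is hypothetical, and like the original pipeline it assumes a single header line in the `rabbitmqctl` output.

```shell
# Hypothetical helper: read `rabbitmqctl list_queues` output on stdin and
# print (without running) one cancel_consumer command per job/task queue.
# Assumes exactly one header line, as the original pipeline does.
preview_cancel_commands() {
    grep -v '^celery' \
        | tail -n +2 \
        | awk 'NF > 0 { print "celery -A hysds control cancel_consumer " $1 }'
}

# Usage on mozart (prints the commands only):
#   sudo rabbitmqctl list_queues 2>&1 | preview_cancel_commands
```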
Log into the RabbitMQ admin interface to ensure that all queues show 0 in the Unacked column of the Queues tab. To show only the job/task queues, enter ^(?!celery) in the filter text box and check the Regex checkbox. If there are jobs/tasks currently running, you can either wait for them to complete or kill them manually and retry them after the upgrade. The screenshot below shows the Unacked column with all zeros, signifying that no jobs/tasks are currently running.
Log into figaro and ensure that it matches what you see in the RabbitMQ admin interface
Your cluster is now ready for the upgrade
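The Unacked check from the RabbitMQ admin interface can also be approximated on the command line. This is a minimal sketch, assuming `rabbitmqctl list_queues name messages_unacknowledged` is run on mozart and piped in. The function names `busy_queues` and `cluster_is_idle` are hypothetical and not part of HySDS.

```shell
# Hypothetical helper: read the output of
#   rabbitmqctl list_queues name messages_unacknowledged
# on stdin and print every non-celery queue whose unacked count is above zero.
# Header lines are skipped because their second field is not a number.
busy_queues() {
    awk 'NF == 2 && $2 ~ /^[0-9]+$/ && $1 !~ /^celery/ && $2 + 0 > 0 { print $1, $2 }'
}

# Succeeds only when no job/task queue has unacknowledged messages.
cluster_is_idle() {
    [ -z "$(busy_queues)" ]
}

# Usage on mozart:
#   sudo rabbitmqctl list_queues name messages_unacknowledged | cluster_is_idle \
#       && echo "no jobs/tasks running"
```

This complements rather than replaces the visual check in the admin interface and figaro.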
Upgrade
Update HySDS core using hysds-framework and sdscli
Log into your mozart instance
Code Block ssh -i <PEM file> ops@<mozart IP>
Stop the cluster
Code Block sds stop all -f
Back up your mozart directory
Code Block mv ~/mozart ~/mozart.orig
If you have it, remove the old hysds-framework clone
Code Block rm -rf ~/hysds-framework
Clone the HySDS framework repository and enter it
Code Block
cd ~
git clone https://github.com/hysds/hysds-framework.git
cd hysds-framework
Run the installer without a release to list the available HySDS framework release tags, then select the one you'd like to install for mozart
Code Block
./install.sh mozart
HySDS install directory set to /home/ops/mozart
New python executable in /home/ops/mozart/bin/python
Installing Setuptools............................................done.
Installing Pip...................................................done.
Created virtualenv at /home/ops/mozart.
[2017-08-09 19:25:37,789: INFO/main] Github repo URL: https://xxxxxxxx@github.com/api/v3/repos/hysds/hysds-framework/releases
[2017-08-09 19:25:37,798: INFO/_new_conn] Starting new HTTPS connection (1): github.com
No release specified. Use -r RELEASE | --release=RELEASE to install a specific release.
Listing available releases:
v2.0.0-beta.3
v2.1.0-beta.0
v2.1.0-beta.1
v2.1.0-beta.2
v2.1.0-beta.3
v2.1.0-beta.4
v2.1.0-beta.5
v2.1.0-beta.6
v2.1.0-beta.7
v2.1.0-beta.8
v2.1.0-rc.0
v2.1.0-rc.1
v2.1.0-rc.2
v2.1.0-rc.3
Install the latest HySDS release (e.g. v2.1.0-rc.3) for the mozart component
Code Block
./install.sh mozart -r <release>
# e.g. ./install.sh mozart -r v2.1.0-rc.3
You could also install the development version which pulls the master branch of each HySDS repo:
Code Block ./install.sh mozart -d
Restore the non-core repositories from the directory backup under ~/mozart.orig/ops
Code Block
cd ~/mozart.orig/ops
for i in *; do
  new=~/mozart/ops/$i
  if [ ! -e "$new" ]; then
    cp -rp "$i" "$new"
  fi
done
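The restore loop above copies only those entries that the fresh install did not recreate. The same idea can be written as a reusable function that also reports what it copied. This is a sketch: the function name `restore_missing` is hypothetical and not part of HySDS or sdscli.

```shell
# Hypothetical helper: copy every entry of SRC that is missing from DST,
# leaving entries that already exist in DST untouched.
restore_missing() {
    src="$1"
    dst="$2"
    for path in "$src"/*; do
        # When SRC is empty the glob stays literal; skip it
        [ -e "$path" ] || continue
        name=$(basename "$path")
        if [ ! -e "$dst/$name" ]; then
            cp -rp "$path" "$dst/$name"
            echo "restored $name"
        fi
    done
}

# Usage matching the step above:
#   restore_missing ~/mozart.orig/ops ~/mozart/ops
```

Quoting the paths keeps the copy correct for repository directories whose names contain spaces.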
Update all HySDS components:
Code Block sds update all
If you receive any errors, address them before continuing.
(Optional) Run any adaptation-specific fabric updates (e.g. update_aria_packages)
Code Block fab -f ~/.sds/cluster.py -R factotum,verdi update_aria_packages
Build and ship out updated code/config bundles
Code Block sds ship
Start up the grq component and validate that all services come up fine
Code Block
sds start grq
sds status grq
Start up the mozart component and validate that all services come up fine
Code Block
sds start mozart
sds status mozart
Start up the metrics component and validate that all services come up fine
Code Block
sds start metrics
sds status metrics
During installation, the latest versions of the lightweight-jobs core HySDS package and the verdi docker image were downloaded. If the version has incremented, import the lightweight-jobs package:
Code Block
cd ~/mozart/pkgs
sds pkg import container-hysds_lightweight-jobs.*.sdspkg.tar
Copy the verdi docker image to the code bucket (CODE_BUCKET as specified during sds configure). Ensure the VERDI_PRIMER_IMAGE URL is consistent:
Code Block aws s3 cp hysds-verdi-latest.tar.gz s3://<CODE_BUCKET>/hysds-verdi-latest.tar.gz
Start up the factotum component and validate that all services come up fine
Code Block
sds start factotum
sds status factotum
View status of HySDS components and services:
Code Block sds status all
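The repeated start-then-status pattern used for grq, mozart, metrics, and factotum can be sketched as a small wrapper that stops at the first failure. The function name `start_components` is hypothetical. The sketch also assumes that `sds` exits non-zero when a command fails, which this page does not confirm, so the printed status should still be read by eye.

```shell
# Hypothetical wrapper: start each named component and show its status,
# stopping at the first command that fails.
# Assumption: `sds` exits non-zero on failure; this page does not confirm that.
start_components() {
    for component in "$@"; do
        echo "starting $component"
        sds start "$component" || { echo "failed to start $component" >&2; return 1; }
        sds status "$component" || { echo "status check failed for $component" >&2; return 1; }
    done
}

# Usage: start_components grq mozart metrics
```

The lightweight-jobs import and the verdi image copy still sit between metrics and factotum, so factotum would be started by a separate call after those steps.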
Restore all timer-based job submission scripts (e.g. crontab on factotum)
Log back into factotum and restore the crontab
Code Block
ssh -i <PEM file> ops@<factotum IP>
cd ~/crontabs
crontab crontab.<your_last_backup>
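Picking `<your_last_backup>` by hand is easy to get wrong when several backups exist. Because the backup file names carry a UTC ISO 8601 timestamp, the newest one is also the last in plain sort order. The sketch below relies on that; the function name `latest_crontab_backup` is hypothetical.

```shell
# Hypothetical helper: print the newest crontab.<timestamp> file in a
# directory. UTC ISO 8601 timestamps sort chronologically as plain text.
latest_crontab_backup() {
    backup_dir="${1:-$HOME/crontabs}"
    for f in "$backup_dir"/crontab.*; do
        # When there are no backups the glob stays literal; print nothing
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done | sort | tail -n 1
}

# Usage on factotum:
#   crontab "$(latest_crontab_backup)"
```

If no backup exists the function prints nothing, so check its output before passing it to `crontab`.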