When installation of the HySDS framework is complete on your mozart instance (see Installation), we must configure the rest of the cluster instances so that they can talk to each other. We do this using the sdscli command on the mozart instance. The idea is that all code and configuration is centralized on the mozart instance and when ready to deploy updates during the development cycle or when upgrading operations, we can push them easily from a single location.

...

Make sure Elasticsearch is running on the mozart and grq instances. You can check with the following command:

Code Block
curl 'http://<mozart/grq ip>:9200/?pretty'

You should get a response from Elasticsearch similar to this:

Code Block

{
 "status" : 200,
 "name" : "Dweller-in-Darkness",
 "cluster_name" : "resource_cluster",
 "version" : {
   "number" : "1.7.3",
   "build_hash" : "05d4530971ef0ea46d0f4fa6ee64dbc8df659682",
   "build_timestamp" : "2015-10-15T09:14:17Z",
   "build_snapshot" : false,
   "lucene_version" : "4.10.4"
 },
 "tagline" : "You Know, for Search"
}
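If you want to check the response from a script rather than by eye, a minimal sketch is to pull the version number out of the JSON. The function name es_version is hypothetical, and the sed pattern assumes the "number" field appears as in the sample response above.

```shell
# Hypothetical helper: read an Elasticsearch root-endpoint response on stdin
# and print the value of the "number" field found under "version".
es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example, curl -s 'http://<mozart/grq ip>:9200/?pretty' | es_version prints 1.7.3 for the sample response above. Empty output suggests Elasticsearch did not answer as expected.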

If you cannot connect to Elasticsearch, start it on the mozart and grq instances:

Code Block
sudo systemctl start elasticsearch
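Elasticsearch may take a few seconds to accept connections after the service starts, so rerunning the curl check right away can fail. The sketch below is one way to poll until it answers. The retry function is a hypothetical helper, not part of HySDS, and the try count and delay are arbitrary.

```shell
# retry MAX_TRIES DELAY_SECONDS COMMAND...
# Hypothetical helper: run COMMAND until it succeeds or MAX_TRIES is reached.
retry() {
  max_tries="$1"
  delay="$2"
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_tries" ]; then
      return 1   # give up: the command never succeeded
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}
```

A possible use is retry 10 3 curl --silent --fail --output /dev/null 'http://<mozart/grq ip>:9200/', which tries up to ten times with three seconds between attempts.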

Ensure the mozart component can connect to the other components over ssh using the configured KEY_FILENAME. If correctly configured, the sds status all command should show that it was able to ssh into each component and found that the supervisord daemon was not running, as shown below:

Code Block

sds status all
########################################
grq
########################################
[100.64.106.214] Executing task 'status'
Supervisord is not running on grq.
########################################
mozart
########################################
[100.64.106.38] Executing task 'status'
Supervisord is not running on mozart.
########################################
metrics
########################################
[100.64.106.140] Executing task 'status'
Supervisord is not running on metrics.
########################################
factotum
########################################
[100.64.106.64] Executing task 'status'
Supervisord is not running on factotum.
########################################
ci
########################################
[100.64.106.220] Executing task 'status'
Supervisord is not running on ci.
########################################
verdi
########################################
[100.64.106.220] Executing task 'status'
Supervisord is not running on verdi.

Otherwise, if any component shows an error like the following (here for the grq component):

Code Block

########################################
grq
########################################
[100.64.106.214] Executing task 'status'

Fatal error: Needed to prompt for a connection or sudo password (host: 100.64.106.214), but abort-on-prompts was set to True

Aborting.
Needed to prompt for a connection or sudo password (host: 100.64.106.214), but abort-on-prompts was set to True

Fatal error: One or more hosts failed while executing task 'status'

Aborting.
One or more hosts failed while executing task 'status'

then there is an issue with the configured KEY_FILENAME on mozart or the authorized_keys file under the component's ~/.ssh directory for user OPS_USER. Resolve this issue before continuing.
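To narrow down which side is at fault, the following sketch checks the key file on mozart and then probes a component over ssh without prompting. Both function names are hypothetical. The permission check reflects the common cause that ssh refuses private keys readable by group or others, which may or may not be the problem in your cluster.

```shell
# Hypothetical helper: succeed only if the key file exists and grants
# no permissions to group or others (for example mode 600 or 400).
key_perms_ok() {
  [ -f "$1" ] || return 1
  [ "$(ls -ld "$1" | cut -c5-10)" = "------" ]
}

# Hypothetical helper: non-interactive ssh probe. BatchMode=yes makes ssh
# fail instead of prompting, which mirrors the abort-on-prompts behavior.
# Usage: can_ssh <key file> <user>@<host>
can_ssh() {
  ssh -i "$1" -o BatchMode=yes -o ConnectTimeout=5 "$2" true
}
```

With your own values for the KEY_FILENAME path, OPS_USER and the component IP, you might run key_perms_ok <key file> && can_ssh <key file> <OPS_USER>@<grq ip>. If the first check passes and the second fails, the authorized_keys file on the component is the more likely culprit.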

Update all HySDS components:
Code Block
sds update all
If you receive any errors, address them before continuing.
Start up all HySDS components:
Code Block
sds start all
View status of HySDS components and services:
Code Block
sds status all
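The update, start and status steps can be chained so that a failure stops the sequence. This is a sketch only: deploy_all and SDS_CMD are hypothetical names, and it assumes sds reports failure through a non-zero exit status, which you should confirm for your version.

```shell
# SDS_CMD defaults to the real sds CLI; set SDS_CMD=echo for a dry run.
SDS_CMD="${SDS_CMD:-sds}"

# Run update, start and status in order, stopping at the first step that
# exits non-zero. Assumes sds signals errors through its exit status.
deploy_all() {
  for action in update start status; do
    echo ">>> sds $action all"
    "$SDS_CMD" "$action" all || {
      echo "sds $action all failed; resolve the errors before continuing" >&2
      return 1
    }
  done
}
```

Call deploy_all on the mozart instance once the ssh check above succeeds.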
During installation, the latest versions of the lightweight-jobs core HySDS package and the verdi docker image were downloaded. Next we import the lightweight-jobs package:
Code Block
cd ~/mozart/pkgs
Code Block
sds pkg import container-hysds_lightweight-jobs.*.sdspkg.tar
Finally we copy the verdi docker image to the code bucket (CODE_BUCKET as specified during sds configure). Ensure the VERDI_PRIMER_IMAGE URL in your configuration is consistent with this location:
Code Block
aws s3 cp hysds-verdi-latest.tar.gz s3://<CODE_BUCKET>/hysds-verdi-latest.tar.gz
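One way to check the consistency mentioned above is to compare the configured value against the upload location. The sketch assumes VERDI_PRIMER_IMAGE is an s3:// URL of the same form as the copy destination; both function names are hypothetical.

```shell
# Hypothetical helper: build the S3 URL the verdi image is copied to
# for a given CODE_BUCKET name.
verdi_image_url() {
  printf 's3://%s/hysds-verdi-latest.tar.gz\n' "$1"
}

# Succeed when the configured VERDI_PRIMER_IMAGE value ($2) matches the
# upload location derived from CODE_BUCKET ($1).
# Assumes VERDI_PRIMER_IMAGE is written as an s3:// URL.
verdi_url_consistent() {
  [ "$(verdi_image_url "$1")" = "$2" ]
}
```

For example, verdi_url_consistent <CODE_BUCKET> <VERDI_PRIMER_IMAGE value> returns success only when the two agree. You can also confirm the upload itself with aws s3 ls s3://<CODE_BUCKET>/hysds-verdi-latest.tar.gz.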

Next Step

Now that you have your HySDS cluster configured, continue on to Step 5: Running your First "Hello World" Job

Version	Old Version 13	New Version 14
Changes made by	Topher Allen	Topher Allen
Saved on	Jul 14, 2020	Jul 14, 2020
