Gerald Manipon edited this page on May 17, 2018 · 8 revisions

Confidence Level TBD: This article has not been reviewed for accuracy, timeliness, or completeness. Check that this information is valid before acting on it.

What are HySDS jobs?

HySDS jobs are essentially Celery tasks. More specifically, they are Celery tasks that encapsulate the execution of an executable within a docker image. The Celery task callable (hysds.job_worker.run_job) is responsible for setup, execution, and teardown of the job's work environment. In particular, it ensures that:

  • there is enough free space on the root work directory (threshold defaults to 10% free)

    • if there isn't, it cleans out old work directories until the threshold is met

  • the job has a unique work directory to execute in

  • job state is propagated to mozart

  • job metrics are propagated to metrics

  • pre-processing steps are executed

    • the default built-in pre-processing step is hysds.utils.localize_urls, which downloads input data

  • docker parameters such as volume mounts and UID/GID are set according to job specifications (job-spec)

  • executable is run via docker

  • post-processing steps are executed

    • the default built-in post-processing step is hysds.utils.publish_datasets, which searches for and publishes HySDS datasets generated by the executable
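The first two guarantees in the list above (free space on the root work directory, and a unique work directory per job) can be illustrated with a short sketch. This is not the code in hysds.job_worker. The function names (percent_free, clean_old_work_dirs, make_unique_work_dir) are hypothetical, and removing the oldest directories first by modification time is an assumption, since the page only says that old work directories are cleaned out until the threshold is met.

```python
import os
import shutil
import tempfile
from typing import Callable, List


def percent_free(path: str) -> float:
    """Return the percentage of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def clean_old_work_dirs(
    root: str,
    threshold: float = 10.0,  # default of 10% free, as stated above
    free_fn: Callable[[str], float] = percent_free,
) -> List[str]:
    """Delete subdirectories of root, oldest first, until free_fn(root) >= threshold.

    Assumption: "old" means oldest modification time. Returns the removed paths.
    """
    removed: List[str] = []
    subdirs = sorted(
        (e for e in os.scandir(root) if e.is_dir(follow_symlinks=False)),
        key=lambda e: e.stat().st_mtime,
    )
    for entry in subdirs:
        if free_fn(root) >= threshold:
            break  # enough space has been reclaimed
        shutil.rmtree(entry.path, ignore_errors=True)
        removed.append(entry.path)
    return removed


def make_unique_work_dir(root: str, job_id: str) -> str:
    """Create and return a new directory under root that no other job shares."""
    return tempfile.mkdtemp(prefix=f"{job_id}-", dir=root)
```

Passing free_fn as a parameter keeps the cleanup loop testable without filling a real disk. A worker would call clean_old_work_dirs before make_unique_work_dir for each job.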

How do you define a HySDS job?

You define a HySDS job by defining a job-spec and a hysds-io. See Job and HySDS IO Specifications. For a step-by-step example, see Hello World.

What are HySDS Workers?

Workers are Celery workers that run tasks. Since HySDS jobs are Celery tasks, workers also run jobs, and each job is invoked from its own unique working directory on the worker node.

Worker Events

See http://celery.readthedocs.org/en/latest/userguide/monitoring.html#worker-events

worker-online

signature: worker-online(hostname,timestamp,freq,sw_ident,sw_ver,sw_sys)

The worker has connected to the broker and is online.

  • hostname: Hostname of the worker.

  • timestamp: Event timestamp.

  • freq: Heartbeat frequency in seconds (float).

  • sw_ident: Name of worker software (e.g. py-celery).

  • sw_ver: Software version (e.g. 2.2.0).

  • sw_sys: Operating System (e.g. Linux, Windows, Darwin).

worker-heartbeat

signature: worker-heartbeat(hostname,timestamp,freq,sw_ident,sw_ver,sw_sys,active,processed)

Sent every minute. If the worker has not sent a heartbeat in 2 minutes, it is considered to be offline.

  • hostname: Hostname of the worker.

  • timestamp: Event timestamp.

  • freq: Heartbeat frequency in seconds (float).

  • sw_ident: Name of worker software (e.g. py-celery).

  • sw_ver: Software version (e.g. 2.2.0).

  • sw_sys: Operating System (e.g. Linux, Windows, Darwin).

  • active: Number of currently executing tasks.

  • processed: Total number of tasks processed by this worker.

worker-offline

signature: worker-offline(hostname,timestamp,freq,sw_ident,sw_ver,sw_sys)

The worker has disconnected from the broker.
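As a rough illustration of how a consumer might combine the three events above, the sketch below tracks when each worker was last heard from and applies the 2-minute rule. The WorkerLiveness class is hypothetical and is not part of HySDS or Celery. It assumes each event arrives as a dict with type, hostname, and timestamp keys.

```python
import time
from typing import Dict, Optional

HEARTBEAT_TIMEOUT_SECS = 120.0  # the "2 minutes" rule described above


class WorkerLiveness:
    """Hypothetical tracker that records when each worker was last heard from."""

    def __init__(self, timeout: float = HEARTBEAT_TIMEOUT_SECS) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}

    def handle(self, event: dict) -> None:
        """Update state from one worker event (assumed to be a dict)."""
        hostname = event["hostname"]
        if event["type"] in ("worker-online", "worker-heartbeat"):
            self.last_seen[hostname] = event["timestamp"]
        elif event["type"] == "worker-offline":
            # An explicit disconnect: forget the worker right away.
            self.last_seen.pop(hostname, None)

    def is_online(self, hostname: str, now: Optional[float] = None) -> bool:
        """True if the worker was heard from within the timeout window."""
        now = time.time() if now is None else now
        seen = self.last_seen.get(hostname)
        return seen is not None and (now - seen) <= self.timeout
```

A worker that sent worker-online at t=1000 is still reported online at t=1060 but not at t=1121, unless a worker-heartbeat arrives in between.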

Celery Worker Naming Convention

The name of each worker matters because it is parsed so that the worker can be displayed in mozart's faceted search.

Transport

Job events are shipped out to mozart via redis using msgpack.

msgpack

msgpack is fast, small, and has first-class language support. See http://msgpack.org/

PGE handling

Work dir scrubbers

POSIX signal handling for verdi worker

Verdi has Python handlers that capture kill signals sent to the Celery worker. Verdi then emits them as events to mozart via redis.

Supported POSIX signal handling and event emitting from verdi:

  • 1 SIGHUP: Hangup.

  • 2 SIGINT: Terminal interrupt signal.

  • 3 SIGQUIT: Terminal quit signal.

  • 6 SIGABRT: Process abort signal.

  • 9 SIGKILL: Kill (cannot be caught or ignored).

  • 15 SIGTERM: Termination signal.
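A minimal sketch of this pattern, using only Python's standard signal module, is shown below. It is not verdi's actual handler code. The install_signal_emitters function, the emit callback, and the "worker-signal" event shape are all hypothetical. SIGKILL is left out because, as noted in the list, it cannot be caught, so a Python handler cannot be registered for it.

```python
import signal
import time
from typing import Callable, Dict

# Signals from the list above that a process is able to catch.
CATCHABLE_SIGNALS = ("SIGHUP", "SIGINT", "SIGQUIT", "SIGABRT", "SIGTERM")


def install_signal_emitters(emit: Callable[[dict], None]) -> Dict[int, object]:
    """Register a handler per catchable signal that forwards an event to emit.

    emit is a hypothetical callback, e.g. one that ships the event to redis.
    Returns the previous handlers so that a caller can restore them later.
    Must be called from the main thread, as signal.signal requires.
    """
    previous: Dict[int, object] = {}
    for name in CATCHABLE_SIGNALS:
        signum = getattr(signal, name, None)  # some names are absent on Windows
        if signum is None:
            continue

        def handler(received, frame, _name=name):
            # Hypothetical event shape; the real verdi payload may differ.
            emit({
                "type": "worker-signal",
                "signal": _name,
                "signum": int(received),
                "timestamp": time.time(),
            })

        previous[int(signum)] = signal.signal(signum, handler)
    return previous
```

In this sketch the handler only reports the signal. Whether the worker then shuts down, and how, is left to the caller.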

Localize and Publish Data Products

Run in stand-alone test mode

Create the ./work directory and run the following command:

    HYSDS_DATASETS_CFG=~/verdi/ops/hysds/configs/datasets/datasets.json \
    HYSDS_WORKER_CFG=job_worker.json \
    ~/verdi/ops/hysds/scripts/run_job.py test_job.json


