

This article outlines the nominal data flow of a job through HySDS.

{ 6/4: Update the diagram to remove the elements pointing to delete. See the notated image. }

Submit a Job

A job is submitted via On-Demand or a Trigger Rule. Once submitted, it moves to job-queued status.

Job Queued

A queued job is checked to see if it is running(?). If it is, the job is moved to job-started status. If not, the job is revoked and changed to job-revoked status. Revoked jobs are then deleted.

Job Started

Running jobs that have been moved to job-started status are checked for two conditions: whether they have timed out (via the watchdog check for timed-out jobs) and whether they have succeeded (via the exit code).

  • If timed out: Jobs that have timed out are changed to job-offline status and then deleted.

  • If exited with exit code == 0: Successfully completed jobs are updated to job-completed status and then deleted.

  • If exited with exit code != 0: Jobs with non-zero exit codes are treated as failed. Their status is updated to job-failed, and they are then deleted.
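The status transitions described so far can be summarized in a short Python sketch. This is an illustration only, not HySDS code: the `JobStatus` enum and both function names are hypothetical, and giving the timeout check precedence over the exit code is an assumption this article does not confirm.

```python
from enum import Enum
from typing import Optional


class JobStatus(str, Enum):
    """Status names as they appear in this article."""
    QUEUED = "job-queued"
    STARTED = "job-started"
    REVOKED = "job-revoked"
    OFFLINE = "job-offline"
    COMPLETED = "job-completed"
    FAILED = "job-failed"


def next_status_when_queued(is_running: bool) -> JobStatus:
    """A queued job either starts or is revoked."""
    return JobStatus.STARTED if is_running else JobStatus.REVOKED


def next_status_when_started(timed_out: bool,
                             exit_code: Optional[int]) -> JobStatus:
    """Apply the timeout and exit-code checks to a started job.

    Assumption: the timeout check is evaluated before the exit code.
    An exit_code of None means the job has not exited yet.
    """
    if timed_out:
        return JobStatus.OFFLINE
    if exit_code is None:
        return JobStatus.STARTED  # still running, no transition
    return JobStatus.COMPLETED if exit_code == 0 else JobStatus.FAILED
```

For example, `next_status_when_started(False, 1)` yields `JobStatus.FAILED`, matching the non-zero exit code case above.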

Job Deduped

Jobs that have parameters identical to those of another job, or that repeat a job that has already completed successfully, are deduped, and no further processing occurs.
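One common way to detect identical parameters is to hash a canonical form of the job and remember the hashes already seen. The sketch below shows that general technique under stated assumptions; it is not how HySDS itself implements dedup, and `job_fingerprint`, `DedupRegistry`, and the in-memory set are all hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, Set


def job_fingerprint(job_type: str, params: Dict[str, Any]) -> str:
    """Hash the job type and parameters in a key-order-independent way."""
    canonical = json.dumps({"type": job_type, "params": params},
                           sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DedupRegistry:
    """Remembers fingerprints of jobs that were accepted (hypothetical)."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def should_run(self, job_type: str, params: Dict[str, Any]) -> bool:
        """Return False for a duplicate, True (and record it) otherwise."""
        key = job_fingerprint(job_type, params)
        if key in self._seen:
            return False  # deduped: no further processing
        self._seen.add(key)
        return True
```

Sorting the keys before hashing means two submissions with the same parameters in a different order produce the same fingerprint. A real deployment would need shared, persistent storage rather than a per-process set.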

Tracking a Job

Jobs can be tracked through the job lifecycle via the payload ID. In Figaro, operators can facet on a job’s unique ID to monitor its progression through the various stages and to troubleshoot problems.
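Outside the Figaro interface, the same lookup could be expressed as a search query. The sketch below builds an Elasticsearch-style term query as a plain dictionary. It assumes that job status records are stored in a search index and that the payload ID is held in a field named `payload_id`; both the field name and the `payload_query` helper are assumptions, not details confirmed by this article.

```python
from typing import Any, Dict


def payload_query(payload_id: str, size: int = 50) -> Dict[str, Any]:
    """Build a query body matching job records with the given payload ID.

    The "payload_id" field name is an assumption for illustration.
    """
    return {
        "query": {"term": {"payload_id": payload_id}},
        "size": size,
    }
```

The returned dictionary can be passed as the request body to whichever search client is in use.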

Job Completed

The PGE completed successfully.

Job Started

A worker node has started processing a job.

Job Offline

The worker node that was processing the job is offline.
