...
...
...
...
Confidence Level: High. This article has been formally reviewed and signed off on by a relevant subject matter expert.
This article outlines the nominal data flow of a job through HySDS.
{{Include updated diagram here}}
Tracking a Job
...
Submit a Job
A job is submitted via On-Demand or a Trigger Rule. Once submitted, it moves to job-queued status.
Job Queued
A queued job is checked to see whether it is running(?). If it is, the job is moved to job-started status. If not, the job is revoked and its status is changed to job-revoked. Revoked jobs are then deleted.
Job Started
Running jobs that have been moved to job-started status are checked for two conditions: whether they have timed out (via the watchdog check timedout) and whether they have succeeded (via the exit code).
If timedout: Jobs that have timed out are changed to job-offline status and then deleted.
If exit code == 0: Successfully completed jobs are updated to job-completed status. Finally, completed jobs are deleted.
If exit code != 0: Jobs with non-zero exit codes are treated as failed. Their status is updated to job-failed, and they are then deleted.
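The checks on a started job can be sketched as a small decision function. This is a minimal illustration, not HySDS code: the function name `resolve_started_job`, its parameters, and the choice to evaluate the timeout before the exit code are all assumptions made for the example. Only the status names come from this article.

```python
from typing import Optional

# Status names as used in this article.
JOB_STARTED = "job-started"
JOB_OFFLINE = "job-offline"
JOB_COMPLETED = "job-completed"
JOB_FAILED = "job-failed"


def resolve_started_job(timed_out: bool, exit_code: Optional[int]) -> str:
    """Return the next status for a job that is currently job-started.

    Hypothetical helper: `timed_out` stands in for the result of the
    watchdog timedout check, and `exit_code` is None while the job is
    still running.
    """
    if timed_out:
        # Assumption: the timeout check takes precedence over the exit code.
        return JOB_OFFLINE
    if exit_code is None:
        # Still running, so there is nothing to resolve yet.
        return JOB_STARTED
    return JOB_COMPLETED if exit_code == 0 else JOB_FAILED
```

For example, `resolve_started_job(False, 0)` yields `job-completed`, while any non-zero exit code yields `job-failed`.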
...
Job Deduped
...
...
A job whose parameters are identical to those of another job, or that has already completed successfully, is deduped, and no further processing occurs.
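One way to picture the dedup check is to hash a job's type and parameters and look that hash up among jobs already seen. The sketch below is illustrative only: the helper names, the use of SHA-256 over canonical JSON, and the set of statuses that trigger dedup are assumptions, not a description of how HySDS computes or stores its hashes.

```python
import hashlib
import json
from typing import Dict

# Assumption: an identical job that is still in flight, or one that has
# already completed successfully, causes the new job to be deduped.
DEDUP_STATUSES = {"job-queued", "job-started", "job-completed"}


def payload_hash(job_type: str, params: dict) -> str:
    """Hash the job type and parameters in a key-order-independent way."""
    canonical = json.dumps(
        {"type": job_type, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def should_dedup(job_type: str, params: dict, seen: Dict[str, str]) -> bool:
    """Return True if an identical job is already known in a dedup status.

    `seen` is a hypothetical mapping of payload hash to last known status.
    """
    return seen.get(payload_hash(job_type, params)) in DEDUP_STATUSES
```

Sorting the keys before hashing means two submissions with the same parameters in a different order produce the same hash, which is what "identical parameters" requires.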
Tracking a Job
Jobs can be tracked through the job lifecycle via the payload ID. In Figaro, operators can facet on a job’s unique ID to monitor its progression through each stage and to troubleshoot problems.
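The idea of following one job through its stages can be mimicked in memory by filtering status records on a payload ID and ordering them by time. This is a rough sketch and does not use Figaro or any HySDS API: the record layout and the field names `payload_id`, `status`, and `timestamp` are hypothetical, and timestamps are assumed to be ISO 8601 strings in a single format so that they sort lexicographically.

```python
from typing import Dict, List


def status_history(records: List[Dict[str, str]], payload_id: str) -> List[str]:
    """Return the statuses recorded for one payload ID, oldest first."""
    matching = [r for r in records if r["payload_id"] == payload_id]
    # ISO 8601 timestamps in one consistent format sort correctly as strings.
    matching.sort(key=lambda r: r["timestamp"])
    return [r["status"] for r in matching]


# Made-up example records for two jobs.
records = [
    {"payload_id": "abc", "status": "job-started", "timestamp": "2024-01-01T00:05:00Z"},
    {"payload_id": "xyz", "status": "job-queued", "timestamp": "2024-01-01T00:01:00Z"},
    {"payload_id": "abc", "status": "job-queued", "timestamp": "2024-01-01T00:00:00Z"},
    {"payload_id": "abc", "status": "job-completed", "timestamp": "2024-01-01T00:09:00Z"},
]
history = status_history(records, "abc")
```

Here `history` lists the statuses for payload `abc` in the order they occurred, which is the kind of view an operator would use to see where a job stalled.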
Job Completed
The PGE completed successfully.
Job Started
A worker node has started processing a job.
Job Offline
The worker node processing the job is offline.
...
...
...
...