This article outlines the nominal data flow of a job through HySDS.
{{Include updated diagram here}}
Tracking a Job
{{add comment about job tracking through the job lifecycle, facet on payload ID with a screenshot of the payload ID in Figaro for a specific job, highlighted in red. }}
Submit a Job
A job is submitted via On-Demand or a Trigger Rule. Once submitted, it moves to job-queued status.
Job Queued
A queued job is checked to see if it is running(?). If it is, the job is moved to job-started status. If not, the job is revoked and changed to job-revoked status. Revoked jobs are then deleted.
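The queued-job check described above can be sketched as a single transition function. This is only an illustration, not HySDS code: the status strings are the ones named in this article, while the function name next_status_from_queued and the is_running parameter are hypothetical.

```python
# Illustrative sketch only; not taken from the HySDS code base.
# The status strings come from this article; the function and
# parameter names are hypothetical.
def next_status_from_queued(is_running: bool) -> str:
    """Return the status that a job-queued job moves to."""
    # A queued job that is found to be running becomes job-started;
    # otherwise it is revoked (and, per the article, later deleted).
    return "job-started" if is_running else "job-revoked"


if __name__ == "__main__":
    print(next_status_from_queued(True))
    print(next_status_from_queued(False))
```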
Job Started
Running jobs that have been moved to job-started status are checked for two conditions: whether they have timed out (via the watchdog check timedout) and whether they have succeeded (via the exit code).
If timedout: Jobs that have timed out are changed to job-offline status and then deleted.
If finished with exit code == 0: Successfully completed jobs are updated to job-completed status. Finally, completed jobs are deleted.
If finished with exit code != 0: Jobs with non-zero exit codes are treated as failed jobs. Their status is updated to job-failed, and they are then deleted.
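The three outcomes for a started job can be summarized in a small sketch. It is a minimal illustration under stated assumptions, not the HySDS implementation: the function name resolve_started_job and its parameters are hypothetical, and checking the timeout before the exit code is an assumed ordering that this article does not specify.

```python
from typing import Optional


# Illustrative sketch only; not taken from the HySDS code base.
# Status strings come from this article; names are hypothetical.
def resolve_started_job(timed_out: bool, exit_code: Optional[int]) -> Optional[str]:
    """Return the final status of a job-started job, or None if still running."""
    # Assumption: the watchdog timeout is evaluated before the exit code.
    if timed_out:
        return "job-offline"
    # No exit code yet means the job has not finished.
    if exit_code is None:
        return None
    # Exit code 0 means success; any other value means failure.
    return "job-completed" if exit_code == 0 else "job-failed"


if __name__ == "__main__":
    for timed_out, code in [(True, None), (False, 0), (False, 1), (False, None)]:
        print(timed_out, code, resolve_started_job(timed_out, code))
```

Returning None for a job that is still running keeps the function free of side effects; a caller would leave the job in job-started status in that case.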
{{ Deduped ? }}
“In figaro, there are various troubleshooting functions using payload ID to track the job through the system. Include this in the basic flow of jobs through” - Lela