(TODO: Note - JIRA #597: How to facet on all failed jobs with the same failure message, purge them and then re-submit them. Related to JIRA #601)
Question: What are all of the possible precondition failures? (This is mentioned in JIRA #601.)
Rough Draft outline for JIRA #597: How to facet on all failed jobs with the same failure message, purge them and then re-submit them:
1. Navigate to Resource Manager.
2. Facet on jobs and job-failed to see all failed jobs (#1).
3. Next, identify a specific, unique, searchable term shared by all of the jobs you want to purge and resubmit. There is no hard and fast rule for this; it’s somewhat trial and error. In this example (#2), I chose a unique string/phrase/segment of the Job ID.
4. Enclose that search phrase in “quotation marks” and enter it in the search bar along the top to facet on all similar jobs.
5. With this list of all similar failed jobs (304 in this example), click “On Demand” and select “purge” from the drop-down menu to process them. (Unclear whether it’s beneficial to add a unique tag here for later steps; step 7 assumes one was added, “purge_tag_test” in this example.)
6. Leave the other settings unchanged and click “Process Now”.
(From here on, I’m unclear how to retry these jobs; this is my best-guess understanding.)
7. Remove the job-failed facet in the Resource Manager. Then search for the unique tag created when purging the jobs (in “quotation marks”), “purge_tag_test” in this example. This shows all of the similar jobs that failed and have now been purged.
8. Click “On Demand”, add another unique tag, and select “Retry Jobs/Tasks” from the drop-down action menu.
9. This retries the failed and purged jobs. You can remove the “purge_tag_test” facet and add the unique tag created in step 8 to facet on the newly retried similar jobs.
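Possible aid for step 3: finding the shared search term is trial and error, but if the failed Job IDs can be copied into a list by hand, a small script can suggest a candidate phrase. The following is only a rough sketch, not part of the tool. The function name and the sample Job IDs are made up for illustration, and the suggested phrase should still be checked in the search bar to confirm it matches only the jobs you intend to purge.

```python
def longest_common_substring(ids):
    """Return the longest substring that appears in every string in ids."""
    if not ids:
        return ""
    shortest = min(ids, key=len)
    # Try candidate lengths from longest to shortest so the first hit wins.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in job_id for job_id in ids):
                return candidate
    return ""


# Hypothetical Job IDs, invented for this example only.
failed_job_ids = [
    "20240101T01-ingest_batch_A-7f3a",
    "20240102T05-ingest_batch_A-19bc",
    "20240103T11-ingest_batch_A-e402",
]

phrase = longest_common_substring(failed_job_ids)
# Print the phrase wrapped in quotation marks, ready to paste into the search bar.
print(f'"{phrase}"')
```

The brute-force search is adequate for a few hundred short IDs; the result is only a starting point and may need trimming or extending by hand.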
Instructions