Post Processor

This is an optional step. The main task of the post processor is to run any post-processing actions after a PGE executes and to create a pseudo context file named [PGE_type]_context.json.

The Post Processor first checks whether the job status is failed, completed, or deduped. It then performs the actions specific to the type of PGE.

The Post Processor queries:

Mozart, for job information and the products staged.
GRQ, for the product's metadata.

Flow of actions:

If the job completed, or was deduped against a completed job, it determines the ID of the product that was produced.
If the project adaptation needs any actions performed after the PGE run, the post processor handles them.
For example, on missions it updates an Elasticsearch document for product accountability with the job's status, job ID, and product ID (if available).
Finally, it puts together the information required by the next PGE's input preprocessor. It queries Products ES for the product metadata and path, then passes all of this information along to the next PGE's input preprocessor.
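The flow of actions above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the post processor's actual code: the function name post_process, the job dictionary keys ("status", "job_id", "dedup_status"), the status strings, and the three injected callables are all hypothetical stand-ins for the real Mozart, accountability, and Products ES queries.

```python
def post_process(job, find_product_id, update_accountability, get_product_info):
    """Sketch of the post processor flow (all names here are hypothetical).

    job: dict with "status", "job_id" and, for deduped jobs, "dedup_status".
    find_product_id: callable(job_id) -> product ID (stands in for a Mozart query).
    update_accountability: callable(status, job_id, product_id) -> None.
    get_product_info: callable(product_id) -> dict with product metadata and path.
    """
    status = job["status"]
    job_id = job["job_id"]

    # Step 1: only a completed job, or one deduped against a completed job,
    # has a product whose ID can be determined.
    product_id = None
    produced = status == "completed" or (
        status == "deduped" and job.get("dedup_status") == "completed"
    )
    if produced:
        product_id = find_product_id(job_id)

    # Step 2: adaptation-specific action, e.g. updating product accountability.
    update_accountability(status, job_id, product_id)

    # Step 3: gather what the next PGE's input preprocessor needs.
    product_info = get_product_info(product_id) if product_id is not None else None
    return {
        "job_id": job_id,
        "status": status,
        "product_id": product_id,
        "product_info": product_info,
    }
```

Passing the queries in as callables keeps the sketch independent of any particular client library; what the real post processor does for a failed job beyond recording its status is not specified here, so the sketch simply returns no product information in that case.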

Pseudo context file

The file's purpose is to pass metadata from the previous smap_sciflo process (PGE run) to the next one's input preprocessor. It contains information about the product produced and the job status of the PGE run.

The format of the pseudo context file created by the post processor is:

{
   "product_paths":[...],
   "job_id": ID,
   "job_context":{ HySDS context of PGE job},
   "product_metadata":[...]
}
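As a minimal sketch of how a file in this format could be written, the snippet below assembles the four keys and saves them as [PGE_type]_context.json. The helper name write_pseudo_context, its parameters, and the out_dir argument are assumptions made for illustration; they are not taken from the post processor's code.

```python
import json
import os


def write_pseudo_context(pge_type, job_id, job_context, product_paths,
                         product_metadata, out_dir="."):
    """Write [PGE_type]_context.json and return its path (hypothetical helper)."""
    pseudo_context = {
        "product_paths": list(product_paths),        # paths of the products produced
        "job_id": job_id,                            # ID of the PGE job
        "job_context": job_context,                  # HySDS context of the PGE job
        "product_metadata": list(product_metadata),  # metadata for each product
    }
    path = os.path.join(out_dir, "{}_context.json".format(pge_type))
    with open(path, "w") as f:
        json.dump(pseudo_context, f, indent=2)
    return path
```

The lists are copied with list() so that any iterable can be passed in and the caller's objects are not shared with the dictionary being serialized.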

The IPP (input preprocessor) is designed to parse and use the pseudo context file, which lets you create a workflow with multiple PGE executions.
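On the consuming side, reading the file might look like the sketch below. It is not the IPP's actual parser: the function name load_pseudo_context and the choice to raise KeyError on a missing key are assumptions; only the four key names come from the format shown above.

```python
import json

# Keys listed in the pseudo context file format.
REQUIRED_KEYS = ("product_paths", "job_id", "job_context", "product_metadata")


def load_pseudo_context(path):
    """Load a pseudo context file and check that the expected keys are present."""
    with open(path) as f:
        context = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in context]
    if missing:
        raise KeyError("pseudo context file is missing keys: " + ", ".join(missing))
    return context
```

Checking the keys up front means a malformed file from the previous PGE run fails immediately, rather than partway through preparing inputs for the next one.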