HySDS APIs

 


Confidence Level TBD  This article has not been reviewed for accuracy, timeliness, or completeness. Check that this information is valid before acting on it.


Introduction

HySDS provides several APIs that give external users programmatic access to HySDS data and job information.  The APIs are also meant to throttle or jitter requests to avoid hammering the backend servers and datastores.



As HySDS evolves, the most important role of the APIs may be to throttle or jitter requests from HySDS itself, which can otherwise hammer our backend services.



This is an initial attempt to compile information about the existing APIs and to arrive at best practices and designs for new APIs or for evolving existing ones.

  • We tried to build on top of Swagger UI



| API | Purpose | Version | Cognizant developer/expert user | Documentation | Application scenarios |
|---|---|---|---|---|---|
| Pele | External users' access to datasets | | Gerald Manipon, Namrata Malarout | | Osiris (Urgent Response UI). Future: science users, ops |
| Mozart | Lets external users query Mozart for job info or submit jobs to Mozart over REST. Enables CI to register jobs and manage containers. | | Gerald Manipon, Mohammed Karim, Justin Linick | | ASF uses it to submit on-demand jobs. It is a way to submit jobs when running HySDS standalone scripts on-premise. The CI machine uses it to register jobs and Mozart actions, and to manage containers in the cluster. AWS Lambdas use it to trigger job submission (currently used in GRFN for ASF delivery). Could be used for scheduled Lambdas instead of cron jobs on factotum. |
| GRQ | Provides the ability to manage hysds-io on GRQ or add a new dataset index to GRQ | | Gerald Manipon, Mohammed Karim | | This API was meant to be the GRQ equivalent of the Mozart API. The CI machine uses it to manage GRQ actions. |
| Picasso | Powers SMAP's UI | | Sujen Shah | | SMAP User Interface (currently mission-specific). Could this be refactored to use HySDS core APIs? It currently talks to both Mozart and GRQ and a mission-specific database. |
| Any others? | Manage/talk/get status of cloud resources? | | | | |
| OGC WPS | OGC WPS (XML serialization) front-end to Mozart | | Namrata Malarout | | Joint ESA-NASA MAAP |


















Going forward

Even within the SDS, we find a need to have an API or middleware layer to communicate between components, to avoid having every developer come up with their own way to talk to components or the underlying services.



Long-term rewards of investing in APIs now

  • Extracting best practices from existing code that can be reused in a generic way

  • Then refactoring existing code to call the API instead of using custom code



Questions

  • How do we separate traffic from external users from internal SDS machinery?

    • All internal SDS machinery calls would be made as the special "ops" user.

    • In HySDS core there is the "ops" super-user.  Everyone else has limited access, based on access explicitly granted to them.

  • How do we get requirements from Ops and science users?

  • Should everything be a REST API?

    • We use the boto Python library to talk to Amazon.

  • Where do we go with SDS Watch?

    • It should be a product.  It should interface via API.  It may need to interact with Amazon.
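The "ops" super-user rule from the first question can be sketched in a few lines. This is only an illustration: the `grants` mapping and the resource names are hypothetical stand-ins for whatever access-control store a deployment actually uses.

```python
OPS_USER = "ops"  # the super-user named on this page


def can_access(user, resource, grants):
    """Return True if `user` may see `resource`.

    `grants` maps a user name to the set of resources explicitly granted
    to that user. It is a hypothetical stand-in for a real access-control
    store, not an existing HySDS structure.
    """
    if user == OPS_USER:
        return True  # internal SDS machinery runs as "ops" and sees everything
    return resource in grants.get(user, set())


# Hypothetical users and resource names, for illustration only.
grants = {"alice": {"dataset-A"}}
print(can_access("ops", "dataset-B", grants))    # True
print(can_access("alice", "dataset-A", grants))  # True
print(can_access("bob", "dataset-A", grants))    # False
```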





Suggestion:

  • abstract the backend

    • Uniform way of dealing with common issues and scenarios

    • jitter and rate limits to avoid hammering backend services

    • common place to deal with anomalous service responses in a consistent manner

      • distinguish between "no results returned" and "service unavailable"

      • automatic retry with exponential backoff

    • we can write out the JSON query, but parsing Elasticsearch results should be abstracted and handled by the API.

    • We can catch bad ES responses via the API

    • We could sanitize the JSON query against bad requests, but otherwise let the query through.

    • will make it easier to upgrade or swap backend services.

    • encapsulate best practices in a common place

    • No Elasticsearch code entangled with the HySDS and PGE codebases

    • Makes it easier for end users to develop tools and for development of PGEs.

    • ES abstraction layer should have the following characteristics (build upon query-util?)

      • (1) jittering, with automated and finite exponential backoff to the backend (ES)

      • (2) rate limits on the API for throttling to the backend (ES)

      • (3) reuse connections to the backend (ES) to improve performance

  • for public APIs

    • register with email and password

      • helps identify our users for announcements or directed emails.

      • results depend on user access control, i.e. what should be visible to them.

    • programmatic way to refresh token

      • require authentication/token

      • require time limits on tokens.

    • assign limits per user

      • ability to throttle or limit bad end users, so they don't cause denial of service to others.

    • versioning of APIs to provide a path to updated APIs while deprecating old versions

    • collect metrics regarding API use

      • versioning

      • which actions are called the most

      • performance analysis

    • helpful for developers to debug.

    • Also helpful to provide documentation to end user.

    • additional requirements

    • swagger-ui
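The backend-abstraction items above (throttling, finite exponential backoff with jitter, and telling "no results returned" apart from "service unavailable") could look roughly like the sketch below. It is not existing HySDS code. `BackendClient`, `ServiceUnavailable`, and `search_fn` are hypothetical names, and `search_fn` stands in for whatever function actually talks to Elasticsearch. Connection reuse is left out.

```python
import random
import time


class ServiceUnavailable(Exception):
    """Raised when the backend still fails after the final retry."""


class BackendClient:
    """Throttles calls and retries failures with jittered, finite backoff."""

    def __init__(self, search_fn, max_retries=4, base_delay=0.5,
                 max_delay=8.0, min_interval=0.1, sleep=time.sleep):
        self.search_fn = search_fn        # hypothetical callable: query -> iterable of hits
        self.max_retries = max_retries    # retries after the first attempt
        self.base_delay = base_delay      # seconds; doubles on each retry
        self.max_delay = max_delay        # upper bound for any single backoff
        self.min_interval = min_interval  # minimum seconds between backend calls
        self.sleep = sleep                # injectable so tests need not wait
        self._last_call = None

    def _throttle(self):
        """Space out calls made through this client instance."""
        now = time.monotonic()
        if self._last_call is not None:
            wait = self.min_interval - (now - self._last_call)
            if wait > 0:
                self.sleep(wait)
        self._last_call = time.monotonic()

    def search(self, query):
        """Return a list of hits; an empty list means "no results returned"."""
        for attempt in range(self.max_retries + 1):
            self._throttle()
            try:
                return list(self.search_fn(query))
            except ConnectionError as exc:
                if attempt == self.max_retries:
                    # Out of retries: report "service unavailable" explicitly.
                    raise ServiceUnavailable("backend unreachable") from exc
                # Exponential backoff capped at max_delay, with full jitter.
                cap = min(self.max_delay, self.base_delay * 2 ** attempt)
                self.sleep(random.uniform(0, cap))
```

A caller then sees exactly two outcomes for a failed lookup: an empty list when the query matched nothing, or `ServiceUnavailable` when the backend could not be reached. Per-user limits for public APIs would need separate bookkeeping keyed by user, which this sketch does not attempt.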







Pele Links and Examples

Code can be found in hysds/pele

Pele enables us to query datasets. (To do: clean up the terminology. Gerald may be documenting this in the SDD.)

Was developed mainly for Osiris (urgent response UI), based on what Namrata needed during development.

  • Currently, main user is Osiris (urgent response UI)

  • Future users -- science users who want to pull data programmatically, ops reporting

Functionality

  • dataset types

  • datasets

  • list datasets of a certain type.  (are results paginated?)

  • get IDs by dataset (so we can pull out more specific info later)

  • get IDs by type

  • get dataset by ID

  • query for certain fields for a specified type and dataset.

  • list datasets that overlap temporally (by ID) or spatially

    • Used to grab all acquisitions or product types that match the AOI's spatial and temporal extent

  • list overlapping datasets of particular types

    • check for each type whether products exist or not

    • for example, only care about COD or LAR or SLCP

  • Swagger API requires authentication

  • need to authenticate for POST, but not for GET (confirm with Gerald)

  • dataset and dataset type.

  • get dataset type based on dataset
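As an illustration of the temporal and spatial overlap listing described above, here is one way such a lookup could be expressed as an Elasticsearch query body behind the API. It is a sketch only. The function name is hypothetical, and the field names `dataset_type`, `starttime`, `endtime`, and `location` are assumptions about the index mapping that should be checked against the actual dataset indices.

```python
def build_overlap_query(dataset_type, start, end, aoi):
    """Build an Elasticsearch query body for datasets overlapping a window and AOI.

    start and end are ISO 8601 strings; aoi is a GeoJSON geometry dict.
    All field names are assumptions about the index mapping.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"dataset_type": dataset_type}},
                    # Temporal overlap: the dataset starts before the window
                    # ends and ends after the window starts.
                    {"range": {"starttime": {"lte": end}}},
                    {"range": {"endtime": {"gte": start}}},
                    # Spatial overlap: the footprint intersects the AOI.
                    {"geo_shape": {
                        "location": {"shape": aoi, "relation": "intersects"}
                    }},
                ]
            }
        }
    }


# Hypothetical AOI polygon and time window, for illustration only.
aoi = {
    "type": "Polygon",
    "coordinates": [[[-118.5, 33.5], [-117.5, 33.5], [-117.5, 34.5],
                     [-118.5, 34.5], [-118.5, 33.5]]],
}
query = build_overlap_query("acquisition", "2019-01-01T00:00:00Z",
                            "2019-01-31T23:59:59Z", aoi)
```

Keeping the query construction in one helper like this is in line with the suggestion that Elasticsearch details stay inside the API layer rather than in each client.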





Mozart Links and Examples

  • AWS Lambdas use it to trigger job submissions, mostly for ASF delivery.  It will also be used for scheduled Lambdas (instead of cron jobs).

  • ASF has put jobs into our system

  • FEMA could have put processing jobs in our system

  • CI uses it to register jobs (including job specs) into Mozart + GRQ

  • MyJobs????



Functionality

  • list queues

  • manage job specs: add, list, remove, get job type ID (given a job type, returns the actual job spec as JSON)

  • manage jobs: get info by ID, list submitted jobs, get status based on job ID, submit a job

    • Namrata could name the job, but HySDS will append a timestamp to that name.  (This is more common inside HySDS machinery when using hysds utils.)

    • Wish for enhancement to allow job naming when submitting through REST API

    • Wish to query jobs by user

    • Wish to list all known users

  • manage containers -- used mainly by CI

  • hysds_io - manage only Mozart actions

  • event - gives system ability to publish events to log anomalies like spot termination, etc.
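To make "submit a job over REST" concrete, the sketch below builds, but does not send, an HTTP POST for a job submission using only the Python standard library. The `/job/submit` path, the form field names, the example host, and the job type are all assumptions for illustration and are not taken from this page. Check the Swagger UI of your Mozart deployment for the real endpoint and parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_job_submit_request(base_url, job_type, queue, params,
                             tags=(), priority=0):
    """Build a POST request for a job submission without sending it.

    The endpoint path and form field names are hypothetical placeholders.
    """
    fields = {
        "type": job_type,                 # job type as registered by CI
        "queue": queue,                   # target queue (see "list queues")
        "priority": str(priority),
        "tags": json.dumps(list(tags)),   # JSON-encoded list of tags
        "params": json.dumps(params),     # JSON-encoded job parameters
    }
    url = base_url.rstrip("/") + "/job/submit"  # hypothetical path
    body = urlencode(fields).encode("utf-8")
    return Request(url, data=body, method="POST")


# Hypothetical host, job type, and queue, for illustration only.
req = build_job_submit_request(
    "https://mozart.example.com/api",
    job_type="job-example:main",
    queue="example-queue",
    params={"input_id": "example-dataset-id"},
    tags=["on-demand"],
)
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`, plus whatever authentication the deployment requires. A Lambda-triggered or scheduled submission would wrap the same call.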

GRQ Links and Examples

  • used by CI

Functionality

  • register jobs and actions to GRQ

SMAP-API Links and Examples



SMAP-API

  • GitHub-FN



Functionality

  • get product by ID

  • get job by ID

  • get docs in an index

  • list all half orbit statuses using specific metadata

  • list all half orbits

 



Have Questions? Ask a HySDS Developer:

Anyone can join our public Slack channel to learn more about HySDS. JPL employees can join #HySDS-Community

JPLers can also ask HySDS questions at Stack Overflow Enterprise


Subject Matter Expert:

@Dustin Lo

@Hook Hua

@Gerald Manipon

Find an Error?

Is this document outdated or inaccurate? Please contact the assigned Page Maintainer:

@Hook Hua
