Create STAC Documents for HySDS Datasets
WHAT IS STAC?
The SpatioTemporal Asset Catalog (STAC) specification provides a common language to describe a range of geospatial information. A "spatiotemporal asset" is any file that represents information about the Earth captured at a certain place and time. STAC provides a lowest-common-denominator JSON format to wrap around any relevant data about the Earth, so that the data can more easily be indexed and discovered. The goal is for all providers of spatiotemporal assets to expose their data as STAC, so that new code doesn't need to be written whenever a new data set or API is released. Providing STAC JSON documents for data also makes it possible to reuse community-developed open source tools that provide search, visualization, and analytics capabilities.
WHAT IS REQUIRED IN A STAC DOCUMENT?
STAC documents should conform to the JSON Schema for their object type. For a Catalog, the required fields are stac_version, type, id, description, and links. A STAC Catalog is a top-level object that logically groups other Catalog, Collection, and Item objects. A Catalog contains an array of Link objects pointing to these other objects and can include additional metadata describing the objects it contains. It is defined in full in the STAC Catalog Specification.
Catalogs are designed so that a simple file server on the web or object store like Amazon S3 can store JSON that defines a full Catalog. More dynamic services can also return a Catalog structure, and the STAC API specification contains an OpenAPI definition of the standard way to do this, at the endpoint.
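As an illustration of the required Catalog fields listed above, here is a minimal sketch that builds a static Catalog as a Python dict and serializes it to JSON, which is what a file server or object store would hold. The id, description, and href values are placeholders invented for this example, not values from any real HySDS deployment.

```python
import json

# Hypothetical catalog: the id, description, and hrefs are placeholders.
catalog = {
    "type": "Catalog",
    "stac_version": "1.0.0",
    "id": "example-catalog",
    "description": "Example catalog of HySDS products",
    "links": [
        # Each Link object points to another Catalog, Collection, or Item.
        {"rel": "self", "href": "./catalog.json"},
        {"rel": "item", "href": "./example-item/example-item.json"},
    ],
}

# A static catalog is just this JSON text stored as a file.
catalog_json = json.dumps(catalog, indent=2)
print(catalog_json)
```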
REQUIREMENTS
Reference: item.json JSON Schema
FIELDS REQUIRED | FIELDS OPTIONAL |
stac_version | title |
stac_extensions | type (link object) |
id | title (link object) |
links | |
collection | |
geometry | |
bbox | |
properties | |
assets | |
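A quick way to sanity-check a generated document against the required fields in the table above is sketched below. The helper name missing_required_fields is hypothetical, and the check only looks for top-level keys; it is not a substitute for validating against the item.json JSON Schema.

```python
# Required top-level fields, taken from the table above.
REQUIRED_ITEM_FIELDS = (
    "stac_version",
    "stac_extensions",
    "id",
    "links",
    "collection",
    "geometry",
    "bbox",
    "properties",
    "assets",
)


def missing_required_fields(stac_doc):
    """Return the required field names that are absent from stac_doc."""
    return [field for field in REQUIRED_ITEM_FIELDS if field not in stac_doc]


# Usage: an incomplete document reports what still needs to be filled in.
incomplete = {"stac_version": "1.0.0", "id": "example-item", "links": []}
print(missing_required_fields(incomplete))
```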
STAC UTILITY IN HYSDS CORE
Purpose: HySDS core includes a utility that generates STAC-compliant documents for HySDS datasets. The utility is generalized so that any project can use it. It takes product metadata and extracts the information needed for the STAC document.
References:
HOW TO USE STAC UTILITY IN A PROJECT
The following files and code should be added to the project's repository.
Python Code Snippet
Import create_stac_doc from stac_utils in your Python script and call it with the required input parameters.
from hysds_commons.stac_utils import create_stac_doc
# load project configuration files
# load product metadata file
# read in product_path from _context.json
# create list for product lineage
stac_doc = create_stac_doc(product_directory, metadata, mapping, assets_desc, product_type, product_path, lineage)
# write stac_doc to a JSON file
Required Input Parameters Description
create_stac_doc(product_directory, metadata, mapping, assets_desc, product_type, product_path, lineage)
product_directory (str) - Name of product downloaded into work directory
metadata (dict) - .met.json file content of the product
mapping (dict) - file content of stac_mappings.json
assets_desc (dict) - file content of assets_description.json
product_type (str) - product / dataset type
product_path (str) - S3 location or URL of product
lineage (list) - list of URLs pointing to location of every file used as input to generate the product
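The steps outlined in the snippet above could be wired together roughly as follows. This is only a sketch under several assumptions: the function name build_and_write_stac_doc, the config_dir layout, the location of the .met.json file inside the product directory, and the .stac.json output name are all hypothetical. The generator is passed in as create_fn so the sketch runs on its own; in a project you would pass hysds_commons.stac_utils.create_stac_doc.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file and return its parsed content."""
    with open(path) as f:
        return json.load(f)


def build_and_write_stac_doc(create_fn, product_directory, config_dir,
                             product_type, product_path, lineage):
    """Load the inputs, call the generator, and write the result to disk.

    create_fn is expected to take the same seven arguments as
    create_stac_doc. All file locations below are assumptions.
    """
    product_dir = Path(product_directory)
    name = product_dir.name

    # Assumed layout: <product_directory>/<name>.met.json
    metadata = load_json(product_dir / f"{name}.met.json")
    # Assumed layout: both project configuration files live in config_dir.
    mapping = load_json(Path(config_dir) / "stac_mappings.json")
    assets_desc = load_json(Path(config_dir) / "assets_description.json")

    stac_doc = create_fn(product_directory, metadata, mapping, assets_desc,
                         product_type, product_path, lineage)

    # Hypothetical output name; adjust to the project's convention.
    out_path = product_dir / f"{name}.stac.json"
    with open(out_path, "w") as f:
        json.dump(stac_doc, f, indent=2)
    return out_path
```

Passing the generator as an argument also makes the wrapper easy to exercise with a stub before the real product metadata is available.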
PREREQUISITES
Project Configuration Files:
1. STAC Mappings File
This configuration file contains two sections that provide mappings of product metadata to STAC format.
The field_mappings section defines a 1:1 mapping between the expected field name in STAC and the metadata key found in the product's met.json file.
The code_mappings section defines how to derive values for STAC fields that have no straightforward 1:1 mapping. The value provided is a Python code snippet, which is evaluated in the create_stac_doc function.
Template:
{
    "field_mappings": {
        "STAC_field_name1": "product_metadata_key1",
        "STAC_field_name2": "product_metadata_key2"
    },
    "code_mappings": {
        "STAC_field_name": "python code to evaluate to derive value"
    }
}
Example:
{
    "field_mappings": {
        "id": "id",
        "geometry": "Bounding_Polygon"
    },
    "code_mappings": {
        "collection": "'NISAR_{}'.format(metadata.get('product_type'))"
    }
}
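To make the two sections concrete, the sketch below shows one way such a mappings file could be applied to product metadata. It is not the actual create_stac_doc implementation: the apply_mappings helper and the sample metadata values are hypothetical, and the only assumption carried over from the example is that the snippets refer to a variable named metadata.

```python
def apply_mappings(metadata, mapping):
    """Derive STAC field values from product metadata (illustrative only)."""
    stac_fields = {}

    # field_mappings: copy the value stored under the mapped metadata key.
    for stac_field, met_key in mapping.get("field_mappings", {}).items():
        stac_fields[stac_field] = metadata.get(met_key)

    # code_mappings: evaluate each snippet with `metadata` in scope.
    for stac_field, snippet in mapping.get("code_mappings", {}).items():
        stac_fields[stac_field] = eval(snippet, {}, {"metadata": metadata})

    return stac_fields


# The example mappings from above, as a Python dict.
mapping = {
    "field_mappings": {"id": "id", "geometry": "Bounding_Polygon"},
    "code_mappings": {
        "collection": "'NISAR_{}'.format(metadata.get('product_type'))"
    },
}

# Hypothetical met.json content, made up for this example.
metadata = {
    "id": "example_product_001",
    "Bounding_Polygon": {"type": "Polygon", "coordinates": []},
    "product_type": "L0B",
}

result = apply_mappings(metadata, mapping)
print(result["collection"])  # NISAR_L0B
```

Because the snippets are evaluated as Python, the mappings file should be treated as trusted project code rather than user input.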
2. Assets Description File
This configuration file provides information describing the files associated with each product type.
Template:
Example: