(TODO: Review formatting revisions for accuracy/continuity of descriptions, etc)
Dataset Specification:
This page details how to author a new dataset type. In order to for HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions documented on this page must be implemented by the PGE or the PGE wrapper. The dataset conventions include:
Dataset ID:
Each product should have a Dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A Dataset ID is matched against entries found in the datasets.json file to determine its type.
In this example, we shall use the Dataset ID dumby-product-20170101T000000Z-3lx0a.
2. Directory:
Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above):
Code Block | ||||
---|---|---|---|---|
| ||||
$ pwd /data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a $ ls dumby-product-20170101T000000Z-3lx0a.dataset.json dumby-product-20170101T000000Z-3lx0a.met.json dumby-product-20170101T000000Z-3lx0a.prov_es.json dumby-product-20170101T000000Z-3lx0a.h5 pge_output_2.h5 errors.txt other_metadata.xml |
Info |
---|
Note: Any other PGE data files should be placed in the Dataset ID directory, as the whole directory is the dataset. |
HySDS Dataset and Metadata JSON Files:
Dataset JSON file
A product must produce a Dataset ID dataset.json in the Dataset ID directory. This file contains JSON formatted metadata representing the cataloged dataset metadata:
Code Block | ||||
---|---|---|---|---|
| ||||
$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json { "version": "v1.0", "label": "dumby product for 2017-01-01T00:00:00Z", "location": { "type": "polygon", "coordinates": [ [ [-122.9059682940358,40.47090915967475], [-121.6679748715316,37.84406528996276], [-120.7310161872557,38.28728069813177], [-121.7043611684245,39.94137004454238], [-121.9536916840953,40.67097860759095], [-122.3100379696548,40.7267890636145], [-122.7640648263371,40.5457010812299], [-122.9059682940358,40.47090915967475] ] ] }, "starttime": "2017-01-01T00:00:00", "endtime": "2017-01-01T00:05:00" } |
The required fields are:
*version
The optional fields are:
*label
*location (in GeoJSON format)
*starttime
*endtime
2. Metadata JSON file
In addition, other metadata data can be added to a Dataset ID met.json in the Dataset ID directory. As long as the file conforms to the JSON format, the dataset developer has free reign on what goes into this file:
Code Block | ||||
---|---|---|---|---|
| ||||
$ cat dumby-product-20170101T000000Z-3lx0a.met.json { "startingRange": 800026.4431219272, "sensor": "SAR-C Sentinel1", "esd_threshold": 0.85, "tiles": true, "reference": true, "trackNumber": 144, "lookDirection": "right", "beamMode": "IW", "direction": "descending", "inputFile": "sentinel.ini", "polarization": "VV", "imageCorners": { "maxLon": -117.56055555555555, "minLon": -119.06166666666667, "minLat" |
End of formatting revisions.
Start of unrevised content:
Markdown | ||||
---|---|---|---|---|
| ||||
Dataset Specification (how to author a new dataset type): In order to for HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions are documented on this page and must be implemented by the PGE or the PGE wrapper. == Dataset ID == Each product should have a dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A dataset ID is matched against entries found in the <code>datasets.json</code> file to determine its type.<br /> In this example, we shall use the dataset ID <code>dumby-product-20170101T000000Z-3lx0a</code>. == Directory == Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above): <pre>$ pwd /data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a $ ls dumby-product-20170101T000000Z-3lx0a.dataset.json dumby-product-20170101T000000Z-3lx0a.met.json dumby-product-20170101T000000Z-3lx0a.prov_es.json dumby-product-20170101T000000Z-3lx0a.h5 pge_output_2.h5 errors.txt other_metadata.xml </pre> ''Note that any other PGE data files should be placed in the <Dataset ID> directory, as the whole directory is the dataset.'' == HySDS dataset and metadata JSON files == === dataset JSON file === A product must produce a <Dataset ID>.dataset.json in the <Dataset ID> directory. This file contains JSON formatted metadata representing the cataloged dataset metadata: <pre>$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json { "version": "v1.0", "label": "dumby product for 2017-01-01T00:00:00Z", "location": { "type": "polygon", "coordinates": [ [ [-122.9059682940358,40.47090915967475], [-121.6679748715316,37.84406528996276], [-120.7310161872557,38.28728069813177], [-121.7043611684245,39.94137004454238], [-121.9536916840953,40.67097860759095], [-122.3100379696548,40.7267890636145], [-122.7640648263371,40.5457010812299], [-122.9059682940358,40.47090915967475] ] ] }, "starttime": "2017-01-01T00:00:00", "endtime": "2017-01-01T00:05:00" } </pre> The required fields are: * <code>version</code> The optional fields are: * <code>label</code> * <code>location</code> (in GeoJSON format) * <code>starttime</code> * <code>endtime</code> === metadata JSON file === In addition, other metadata data can be added to a <Dataset ID>.met.json in the <Dataset ID> directory. As long as the file conforms to the JSON format, the dataset developer has free reign on what goes into this file: <pre>$ cat dumby-product-20170101T000000Z-3lx0a.met.json { "startingRange": 800026.4431219272, "sensor": "SAR-C Sentinel1", "esd_threshold": 0.85, "tiles": true, "reference": true, "trackNumber": 144, "lookDirection": "right", "beamMode": "IW", "direction": "descending", "inputFile": "sentinel.ini", "polarization": "VV", "imageCorners": { "maxLon": -117.56055555555555, "minLon": -119.06166666666667, "minLat" </pre> |
Wiki Markup |
---|
= Dataset Specification (how to author a new dataset type) = In order to for HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions are documented on this page and must be implemented by the PGE or the PGE wrapper. == Dataset ID == Each product should have a dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A dataset ID is matched against entries found in the <code>datasets.json</code> file to determine its type.<br /> In this example, we shall use the dataset ID <code>dumby-product-20170101T000000Z-3lx0a</code>. == Directory == Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above): <pre>$ pwd /data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a $ ls dumby-product-20170101T000000Z-3lx0a.dataset.json dumby-product-20170101T000000Z-3lx0a.met.json dumby-product-20170101T000000Z-3lx0a.prov_es.json dumby-product-20170101T000000Z-3lx0a.h5 pge_output_2.h5 errors.txt other_metadata.xml </pre> ''Note that any other PGE data files should be placed in the <Dataset ID> directory, as the whole directory is the dataset.'' == HySDS dataset and metadata JSON files == === dataset JSON file === A product must produce a <Dataset ID>.dataset.json in the <Dataset ID> directory. This file contains JSON formatted metadata representing the cataloged dataset metadata: <pre>$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json { "version": "v1.0", "label": "dumby product for 2017-01-01T00:00:00Z", "location": { "type": "polygon", "coordinates": [ [ [-122.9059682940358,40.47090915967475], [-121.6679748715316,37.84406528996276], [-120.7310161872557,38.28728069813177], [-121.7043611684245,39.94137004454238], [-121.9536916840953,40.67097860759095], [-122.3100379696548,40.7267890636145], [-122.7640648263371,40.5457010812299], [-122.9059682940358,40.47090915967475] ] ] }, "starttime": "2017-01-01T00:00:00", "endtime": "2017-01-01T00:05:00" } </pre> The required fields are: * <code>version</code> The optional fields are: * <code>label</code> * <code>location</code> (in GeoJSON format) * <code>starttime</code> * <code>endtime</code> === metadata JSON file === In addition, other metadata data can be added to a <Dataset ID>.met.json in the <Dataset ID> directory. As long as the file conforms to the JSON format, the dataset developer has free reign on what goes into this file: <pre>$ cat dumby-product-20170101T000000Z-3lx0a.met.json { "startingRange": 800026.4431219272, "sensor": "SAR-C Sentinel1", "esd_threshold": 0.85, "tiles": true, "reference": true, "trackNumber": 144, "lookDirection": "right", "beamMode": "IW", "direction": "descending", "inputFile": "sentinel.ini", "polarization": "VV", "imageCorners": { "maxLon": -117.56055555555555, "minLon": -119.06166666666667, "minLat" </pre> |