(TODO: Review formatting revisions for accuracy/continuity of descriptions, etc)
Dataset Specification:
This page details how to author a new dataset type. In order to for HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions documented on this page must be implemented by the PGE or the PGE wrapper. The dataset conventions include:
Dataset ID:
Each product should have a Dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A Dataset ID is matched against entries found in the datasets.json file to determine its type.
In this example, we shall use the Dataset ID dumby-product-20170101T000000Z-3lx0a.
2. Directory:
Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above):
$ pwd /data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a $ ls dumby-product-20170101T000000Z-3lx0a.dataset.json dumby-product-20170101T000000Z-3lx0a.met.json dumby-product-20170101T000000Z-3lx0a.prov_es.json dumby-product-20170101T000000Z-3lx0a.h5 pge_output_2.h5 errors.txt other_metadata.xml
Note: Any other PGE data files should be placed in the Dataset ID directory, as the whole directory is the dataset.
HySDS Dataset and Metadata JSON Files:
Dataset JSON file
A product must produce a Dataset ID dataset.json in the Dataset ID directory. This file contains JSON formatted metadata representing the cataloged dataset metadata:
$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json { "version": "v1.0", "label": "dumby product for 2017-01-01T00:00:00Z", "location": { "type": "polygon", "coordinates": [ [ [-122.9059682940358,40.47090915967475], [-121.6679748715316,37.84406528996276], [-120.7310161872557,38.28728069813177], [-121.7043611684245,39.94137004454238], [-121.9536916840953,40.67097860759095], [-122.3100379696548,40.7267890636145], [-122.7640648263371,40.5457010812299], [-122.9059682940358,40.47090915967475] ] ] }, "starttime": "2017-01-01T00:00:00", "endtime": "2017-01-01T00:05:00" }
The required fields are:
*version
The optional fields are:
*label
*location (in GeoJSON format)
*starttime
*endtime
2. Metadata JSON file
In addition, other metadata data can be added to a Dataset ID met.json in the Dataset ID directory. As long as the file conforms to the JSON format, the dataset developer has free reign on what goes into this file:
$ cat dumby-product-20170101T000000Z-3lx0a.met.json { "startingRange": 800026.4431219272, "sensor": "SAR-C Sentinel1", "esd_threshold": 0.85, "tiles": true, "reference": true, "trackNumber": 144, "lookDirection": "right", "beamMode": "IW", "direction": "descending", "inputFile": "sentinel.ini", "polarization": "VV", "imageCorners": { "maxLon": -117.56055555555555, "minLon": -119.06166666666667, "minLat"
End of formatting revisions.
Start of unrevised content:
= Dataset Specification (how to author a new dataset type) =
In order to for HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions are documented on this page and must be implemented by the PGE or the PGE wrapper.
== Dataset ID ==
Each product should have a dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A dataset ID is matched against entries found in the <code>datasets.json</code> file to determine its type.<br />
In this example, we shall use the dataset ID <code>dumby-product-20170101T000000Z-3lx0a</code>.
== Directory ==
Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above):
<pre>$ pwd
/data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a
$ ls
dumby-product-20170101T000000Z-3lx0a.dataset.json
dumby-product-20170101T000000Z-3lx0a.met.json
dumby-product-20170101T000000Z-3lx0a.prov_es.json
dumby-product-20170101T000000Z-3lx0a.h5
pge_output_2.h5
errors.txt
other_metadata.xml
</pre>
''Note that any other PGE data files should be placed in the <Dataset ID> directory, as the whole directory is the dataset.''
== HySDS dataset and metadata JSON files ==
=== dataset JSON file ===
A product must produce a <Dataset ID>.dataset.json in the <Dataset ID> directory. This file contains JSON formatted metadata representing the cataloged dataset metadata:
<pre>$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json
{
"version": "v1.0",
"label": "dumby product for 2017-01-01T00:00:00Z",
"location":
,
"starttime": "2017-01-01T00:00:00",
"endtime": "2017-01-01T00:05:00"
}
</pre>
The required fields are:
- <code>version</code>
The optional fields are:
- <code>label</code>
- <code>location</code> (in GeoJSON format)
- <code>starttime</code>
- <code>endtime</code>
=== metadata JSON file ===
In addition, other metadata data can be added to a <Dataset ID>.met.json in the <Dataset ID> directory. As long as the file conforms to the JSON format, the dataset developer has free reign on what goes into this file:
<pre>$ cat dumby-product-20170101T000000Z-3lx0a.met.json
{
"startingRange": 800026.4431219272,
"sensor": "SAR-C Sentinel1",
"esd_threshold": 0.85,
"tiles": true,
"reference": true,
"trackNumber": 144,
"lookDirection": "right",
"beamMode": "IW",
"direction": "descending",
"inputFile": "sentinel.ini",
"polarization": "VV",
"imageCorners": {
"maxLon": -117.56055555555555,
"minLon": -119.06166666666667,
"minLat"
</pre>