(TODO: Review formatting revisions for accuracy/continuity of descriptions, etc)
Dataset Specification:
This page details how to author a new dataset type. In order for HySDS to recognize a dataset, the dataset must follow certain conventions. The conventions documented on this page must be implemented by the PGE or the PGE wrapper. The dataset conventions include:
1. Dataset ID:
Each product should have a Dataset ID. This name is used to determine the type of the dataset and to name all the important files for the dataset. A Dataset ID is matched against entries found in the datasets.json file to determine its type.
In this example, we shall use the Dataset ID dumby-product-20170101T000000Z-3lx0a.
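The matching step described above can be sketched roughly as follows. This is only an illustration, not the HySDS implementation: the layout of datasets.json is not shown on this page, so the pattern list `DATASET_PATTERNS`, the type name `dumby-product`, and the function `classify_dataset_id` are all hypothetical names invented for this example.

```python
import re
from typing import Optional

# Hypothetical (regex, type name) pairs standing in for datasets.json entries;
# the real schema of datasets.json is not described on this page.
DATASET_PATTERNS = [
    (r"^dumby-product-\d{8}T\d{6}Z-\w+$", "dumby-product"),
]


def classify_dataset_id(dataset_id: str) -> Optional[str]:
    """Return the type of the first pattern that matches the ID, or None."""
    for pattern, dataset_type in DATASET_PATTERNS:
        if re.match(pattern, dataset_id):
            return dataset_type
    return None


if __name__ == "__main__":
    print(classify_dataset_id("dumby-product-20170101T000000Z-3lx0a"))
```

An ID that matches no entry yields `None` in this sketch, which a caller could treat as "not a recognized dataset type".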
2. Directory:
Any directory that contains the JSON files below and is found within the working directory supplied to the PGE is considered a dataset. This directory must therefore be named with the Dataset ID (see above):
Code Block | ||||
---|---|---|---|---|
| ||||
$ pwd
/data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a
$ ls
dumby-product-20170101T000000Z-3lx0a.dataset.json
dumby-product-20170101T000000Z-3lx0a.met.json
dumby-product-20170101T000000Z-3lx0a.prov_es.json
dumby-product-20170101T000000Z-3lx0a.h5
pge_output_2.h5
errors.txt
other_metadata.xml |
Info |
---|
Note: Any other PGE data files should be placed in the Dataset ID directory, as the whole directory is the dataset. |
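The directory convention above can be checked with a short script. The sketch below is a minimal illustration under one stated assumption: it treats the presence of `<Dataset ID>.dataset.json` alone as the marker of a dataset directory. The function names `is_dataset_dir` and `find_datasets` are hypothetical and are not part of HySDS.

```python
import tempfile
from pathlib import Path
from typing import List


def is_dataset_dir(path: Path) -> bool:
    """True if `path` is a directory holding <Dataset ID>.dataset.json,
    where <Dataset ID> is the directory's own name (assumed marker file)."""
    return path.is_dir() and (path / f"{path.name}.dataset.json").is_file()


def find_datasets(work_dir: Path) -> List[Path]:
    """Return every dataset directory found under the working directory."""
    return sorted(p for p in work_dir.rglob("*") if is_dataset_dir(p))


if __name__ == "__main__":
    # Build a throwaway working directory with one dataset in it.
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp) / "dumby-product-20170101T000000Z-3lx0a"
        dataset.mkdir()
        (dataset / f"{dataset.name}.dataset.json").write_text('{"version": "v1.0"}')
        print([p.name for p in find_datasets(Path(tmp))])
```

A directory whose name does not match the ID used in its JSON file names is not reported, which mirrors the naming requirement stated above.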
HySDS Dataset and Metadata JSON Files:
1. Dataset JSON file
A product must produce a <Dataset ID>.dataset.json file in the <Dataset ID> directory. This file contains JSON-formatted metadata representing the cataloged dataset metadata:
Code Block | ||||
---|---|---|---|---|
| ||||
$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json
{
  "version": "v1.0",
  "label": "dumby product for 2017-01-01T00:00:00Z",
  "location": {
    "type": "polygon",
    "coordinates": [
      [
        [-122.9059682940358,40.47090915967475],
        [-121.6679748715316,37.84406528996276],
        [-120.7310161872557,38.28728069813177],
        [-121.7043611684245,39.94137004454238],
        [-121.9536916840953,40.67097860759095],
        [-122.3100379696548,40.7267890636145],
        [-122.7640648263371,40.5457010812299],
        [-122.9059682940358,40.47090915967475]
      ]
    ]
  },
  "starttime": "2017-01-01T00:00:00",
  "endtime": "2017-01-01T00:05:00"
} |
The required fields are:
* version
The optional fields are:
* label
* location (in GeoJSON format)
* starttime
* endtime
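The field lists above translate directly into a small pre-flight check that a PGE wrapper could run before handing the dataset over. The sketch below assumes only what is stated on this page (one required field, four optional ones). The names `REQUIRED_FIELDS`, `OPTIONAL_FIELDS`, and `missing_required_fields` are hypothetical.

```python
import json
from typing import List

# Field names taken from the lists above.
REQUIRED_FIELDS = {"version"}
OPTIONAL_FIELDS = {"label", "location", "starttime", "endtime"}


def missing_required_fields(dataset_json_text: str) -> List[str]:
    """Return the required fields absent from a dataset.json document."""
    doc = json.loads(dataset_json_text)
    if not isinstance(doc, dict):
        raise ValueError("dataset.json must contain a JSON object")
    return sorted(REQUIRED_FIELDS - set(doc))


if __name__ == "__main__":
    minimal = '{"version": "v1.0"}'
    print(missing_required_fields(minimal))          # nothing missing
    print(missing_required_fields('{"label": "x"}'))  # version is missing
```

The check deliberately ignores fields outside the two lists, since this page does not say whether extra fields are permitted.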
2. Metadata JSON file
In addition, other metadata can be added to a <Dataset ID>.met.json file in the <Dataset ID> directory. As long as the file conforms to the JSON format, the dataset developer has free rein over what goes into this file:
Code Block | ||||
---|---|---|---|---|
| ||||
$ cat dumby-product-20170101T000000Z-3lx0a.met.json
{
"startingRange": 800026.4431219272,
"sensor": "SAR-C Sentinel1",
"esd_threshold": 0.85,
"tiles": true,
"reference": true,
"trackNumber": 144,
"lookDirection": "right",
"beamMode": "IW",
"direction": "descending",
"inputFile": "sentinel.ini",
"polarization": "VV",
"imageCorners": {
"maxLon": -117.56055555555555,
"minLon": -119.06166666666667,
"minLat" |
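Because the only stated constraint on the met.json file is that it be valid JSON, writing it from a PGE wrapper can be as simple as serializing a dictionary. The following is a minimal sketch. The function name `write_met_json` is hypothetical, and the metadata keys in the demo are copied from the example above purely for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_met_json(dataset_dir: Path, metadata: dict) -> Path:
    """Write `metadata` to <Dataset ID>.met.json inside the dataset directory."""
    met_path = dataset_dir / f"{dataset_dir.name}.met.json"
    # Serialize first: json.dumps raises TypeError for unserializable values
    # and, with allow_nan=False, ValueError for NaN/Infinity, so an invalid
    # file is never written.
    text = json.dumps(metadata, indent=2, allow_nan=False)
    met_path.write_text(text)
    return met_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp) / "dumby-product-20170101T000000Z-3lx0a"
        dataset.mkdir()
        path = write_met_json(dataset, {"sensor": "SAR-C Sentinel1", "trackNumber": 144})
        print(path.name)
```

Passing `allow_nan=False` is a design choice of this sketch: Python would otherwise emit `NaN`, which is not part of the JSON format.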