(TODO: Review formatting revisions for accuracy/continuity of descriptions, etc)

Dataset Specification:

This page details how to author a new dataset type. For HySDS to recognize a dataset, the dataset must follow certain conventions. The conventions documented on this page must be implemented by the PGE or the PGE wrapper. The dataset conventions include:

Dataset ID:

Each product should have a Dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A Dataset ID is matched against entries found in the datasets.json file to determine its type.

In this example, we shall use the Dataset ID dumby-product-20170101T000000Z-3lx0a.
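
As a rough illustration of how a Dataset ID might be mapped to a dataset type, the sketch below matches the example ID against a regular expression. The `DATASET_TYPES` table, the type name, and the pattern itself are assumptions made for this example. The real matching rules live in datasets.json, whose schema is not shown on this page.

```python
import re

# Hypothetical type table: type name -> regex that a Dataset ID must match.
# These entries are illustrative and are not taken from a real datasets.json.
DATASET_TYPES = {
    "dumby-product": re.compile(r"^dumby-product-\d{8}T\d{6}Z-[a-z0-9]+$"),
}


def dataset_type(dataset_id):
    """Return the first type whose pattern matches dataset_id, else None."""
    for type_name, pattern in DATASET_TYPES.items():
        if pattern.match(dataset_id):
            return type_name
    return None
```

Calling `dataset_type("dumby-product-20170101T000000Z-3lx0a")` returns the hypothetical type name, while an ID that matches no pattern returns `None`.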

Directory:

Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above):

$ pwd
/data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a
$ ls
dumby-product-20170101T000000Z-3lx0a.dataset.json
dumby-product-20170101T000000Z-3lx0a.met.json
dumby-product-20170101T000000Z-3lx0a.prov_es.json
dumby-product-20170101T000000Z-3lx0a.h5
pge_output_2.h5
errors.txt
other_metadata.xml

Note: Any other PGE data files should be placed in the Dataset ID directory, as the whole directory is the dataset.
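
The directory convention can be checked with a short script. The following is a minimal sketch, assuming a subdirectory counts as a dataset when it contains a `<name>.dataset.json` file named after the subdirectory itself. The function name `find_datasets` is hypothetical, and this is not HySDS's actual detection logic.

```python
from pathlib import Path


def find_datasets(work_dir):
    """Return sorted Dataset IDs for subdirectories that look like datasets.

    A subdirectory qualifies here if it contains <name>.dataset.json, where
    <name> is the subdirectory's own name. This check is an assumption
    based on the convention described above, not HySDS's own code.
    """
    ids = []
    for child in Path(work_dir).iterdir():
        if child.is_dir() and (child / f"{child.name}.dataset.json").is_file():
            ids.append(child.name)
    return sorted(ids)
```

A PGE wrapper could run a check like this on its working directory before exiting, to catch a misnamed directory or a missing dataset JSON file early.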

HySDS Dataset and Metadata JSON Files:

Dataset JSON file

A product must produce a <Dataset ID>.dataset.json file in the <Dataset ID> directory. This file contains the JSON-formatted metadata that is cataloged for the dataset:

$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json
{
  "version": "v1.0",
  "label": "dumby product for 2017-01-01T00:00:00Z",
  "location": {
    "type": "polygon",
    "coordinates": [
      [
        [-122.9059682940358, 40.47090915967475],
        [-121.6679748715316, 37.84406528996276],
        [-120.7310161872557, 38.28728069813177],
        [-121.7043611684245, 39.94137004454238],
        [-121.9536916840953, 40.67097860759095],
        [-122.3100379696548, 40.7267890636145],
        [-122.7640648263371, 40.5457010812299],
        [-122.9059682940358, 40.47090915967475]
      ]
    ]
  },
  "starttime": "2017-01-01T00:00:00",
  "endtime": "2017-01-01T00:05:00"
}

The required fields are:

* version

The optional fields are:

* label
* location (in GeoJSON format)
* starttime
* endtime
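
The field rules above can be checked before a dataset is handed off. The following is a minimal sketch, assuming only that `version` is required. The function name `check_dataset_json` is hypothetical, and the sketch does not validate the optional fields or reject extra ones, since this page does not state those rules.

```python
import json

# Field names taken from the lists above.
REQUIRED_FIELDS = {"version"}


def check_dataset_json(text):
    """Return a list of problems found in the text of a dataset.json file."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        return ["top-level JSON value must be an object"]
    # Report each required field that is absent; optional fields are not checked.
    missing = sorted(REQUIRED_FIELDS - set(doc))
    return [f"missing required field: {name}" for name in missing]
```

An empty list means the required field is present. Malformed JSON raises `json.JSONDecodeError` from `json.loads` rather than being reported in the list.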

Metadata JSON file

In addition, other metadata can be added to a <Dataset ID>.met.json file in the <Dataset ID> directory. As long as the file is valid JSON, the dataset developer has free rein over what goes into it:

$ cat dumby-product-20170101T000000Z-3lx0a.met.json
	{
	  "startingRange": 800026.4431219272,
	  "sensor": "SAR-C Sentinel1",
	  "esd_threshold": 0.85,
	  "tiles": true,
	  "reference": true,
	  "trackNumber": 144,
	  "lookDirection": "right",
	  "beamMode": "IW",
	  "direction": "descending",
	  "inputFile": "sentinel.ini",
	  "polarization": "VV",
	  "imageCorners": {
	    "maxLon": -117.56055555555555,
	    "minLon": -119.06166666666667,
	    "minLat"

End of formatting revisions.

Start of unrevised content:

= Dataset Specification (how to author a new dataset type) =

For HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions are documented on this page and must be implemented by the PGE or the PGE wrapper.

== Dataset ID ==

Each product should have a dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A dataset ID is matched against entries found in the <code>datasets.json</code> file to determine its type.<br />
In this example, we shall use the dataset ID <code>dumby-product-20170101T000000Z-3lx0a</code>.

== Directory ==

Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above):

<pre>$ pwd
/data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a
$ ls
dumby-product-20170101T000000Z-3lx0a.dataset.json
dumby-product-20170101T000000Z-3lx0a.met.json
dumby-product-20170101T000000Z-3lx0a.prov_es.json
dumby-product-20170101T000000Z-3lx0a.h5
pge_output_2.h5
errors.txt
other_metadata.xml
</pre>
''Note that any other PGE data files should be placed in the <Dataset ID> directory, as the whole directory is the dataset.''

== HySDS dataset and metadata JSON files ==

=== dataset JSON file ===

A product must produce a <Dataset ID>.dataset.json file in the <Dataset ID> directory. This file contains the JSON-formatted metadata that is cataloged for the dataset:

<pre>$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json
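{
  "version": "v1.0",
  "label": "dumby product for 2017-01-01T00:00:00Z",
  "location": {
    "type": "polygon",
    "coordinates": [
      [
        [-122.9059682940358, 40.47090915967475],
        [-121.6679748715316, 37.84406528996276],
        [-120.7310161872557, 38.28728069813177],
        [-121.7043611684245, 39.94137004454238],
        [-121.9536916840953, 40.67097860759095],
        [-122.3100379696548, 40.7267890636145],
        [-122.7640648263371, 40.5457010812299],
        [-122.9059682940358, 40.47090915967475]
      ]
    ]
  },
  "starttime": "2017-01-01T00:00:00",
  "endtime": "2017-01-01T00:05:00"
}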
</pre>
The required fields are:

* <code>version</code>

The optional fields are:

* <code>label</code>
* <code>location</code> (in GeoJSON format)
* <code>starttime</code>
* <code>endtime</code>

=== metadata JSON file ===

In addition, other metadata can be added to a <Dataset ID>.met.json file in the <Dataset ID> directory. As long as the file is valid JSON, the dataset developer has free rein over what goes into it:

<pre>$ cat dumby-product-20170101T000000Z-3lx0a.met.json
{
"startingRange": 800026.4431219272,
"sensor": "SAR-C Sentinel1",
"esd_threshold": 0.85,
"tiles": true,
"reference": true,
"trackNumber": 144,
"lookDirection": "right",
"beamMode": "IW",
"direction": "descending",
"inputFile": "sentinel.ini",
"polarization": "VV",
"imageCorners": {
"maxLon": -117.56055555555555,
"minLon": -119.06166666666667,
"minLat"
</pre>
