(TODO: Review formatting revisions for accuracy/continuity of descriptions, etc)

Dataset Specification:

This page details how to author a new dataset type. For HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions, documented on this page, must be implemented by the PGE or the PGE wrapper. The dataset conventions include:

1. Dataset ID:

Each product should have a Dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A Dataset ID is matched against entries found in the datasets.json file to determine its type.

In this example, we shall use the Dataset ID dumby-product-20170101T000000Z-3lx0a.
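The matching step can be pictured with a small sketch. This page does not describe the schema of datasets.json, so the type table below, its name DATASET_TYPES, and the regular expression are all hypothetical stand-ins chosen for illustration; only the example Dataset ID comes from this page.

```python
import re

# Hypothetical type table: the real datasets.json schema is not described
# on this page, so the type name and pattern here are illustrative only.
DATASET_TYPES = {
    "dumby-product": r"^dumby-product-\d{8}T\d{6}Z-[a-z0-9]+$",
}


def dataset_type(dataset_id):
    """Return the first type whose pattern matches the ID, or None."""
    for type_name, pattern in DATASET_TYPES.items():
        if re.match(pattern, dataset_id):
            return type_name
    return None


print(dataset_type("dumby-product-20170101T000000Z-3lx0a"))
print(dataset_type("unrelated-file"))
```

The first call reports the dumby-product type and the second reports no match, which is the behavior to expect when an ID does not correspond to any configured entry.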

2. Directory:

Any directory that is found within the working directory supplied to the PGE and contains the JSON files described below is considered a dataset. This directory must be named with the Dataset ID (see above):

    $ pwd
    /data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a
    $ ls
    dumby-product-20170101T000000Z-3lx0a.dataset.json
    dumby-product-20170101T000000Z-3lx0a.met.json
    dumby-product-20170101T000000Z-3lx0a.prov_es.json
    dumby-product-20170101T000000Z-3lx0a.h5
    pge_output_2.h5
    errors.txt
    other_metadata.xml

Note: Any other PGE data files should be placed in the <Dataset ID> directory, as the whole directory is the dataset.
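The directory convention can be checked with a short script before a job finishes. The following is a minimal sketch, not the HySDS implementation: the function names is_dataset_dir and find_datasets are hypothetical, and it assumes that only the immediate subdirectories of the work directory are inspected and that the presence of <Dataset ID>.dataset.json is the deciding test.

```python
import os
import tempfile


def is_dataset_dir(path):
    """True if `path` is a directory holding <Dataset ID>.dataset.json,
    where <Dataset ID> is the directory's own name."""
    dataset_id = os.path.basename(os.path.normpath(path))
    dataset_json = os.path.join(path, dataset_id + ".dataset.json")
    return os.path.isdir(path) and os.path.isfile(dataset_json)


def find_datasets(work_dir):
    """List the immediate subdirectories of `work_dir` that qualify
    (assumption: nested directories are not searched)."""
    return sorted(
        name for name in os.listdir(work_dir)
        if is_dataset_dir(os.path.join(work_dir, name))
    )


# Demo in a throwaway work directory.
with tempfile.TemporaryDirectory() as work_dir:
    ds_id = "dumby-product-20170101T000000Z-3lx0a"
    os.makedirs(os.path.join(work_dir, ds_id))
    with open(os.path.join(work_dir, ds_id, ds_id + ".dataset.json"), "w") as f:
        f.write('{"version": "v1.0"}')
    os.makedirs(os.path.join(work_dir, "scratch"))  # no JSON file: not a dataset
    found = find_datasets(work_dir)
    print(found)
```

Only the directory named after the Dataset ID is reported; the scratch directory is skipped because it lacks the expected JSON file.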

HySDS Dataset and Metadata JSON Files:

1. Dataset JSON file

A product must produce a <Dataset ID>.dataset.json file in the <Dataset ID> directory. This file contains the JSON-formatted metadata that is cataloged for the dataset:

    $ cat dumby-product-20170101T000000Z-3lx0a.dataset.json
    {
      "version": "v1.0",
      "label": "dumby product for 2017-01-01T00:00:00Z",
      "location": {
        "type": "polygon",
        "coordinates": [
          [
            [-122.9059682940358,40.47090915967475],
            [-121.6679748715316,37.84406528996276],
            [-120.7310161872557,38.28728069813177],
            [-121.7043611684245,39.94137004454238],
            [-121.9536916840953,40.67097860759095],
            [-122.3100379696548,40.7267890636145],
            [-122.7640648263371,40.5457010812299],
            [-122.9059682940358,40.47090915967475]
          ]
        ]
      },
      "starttime": "2017-01-01T00:00:00",
      "endtime": "2017-01-01T00:05:00"
    }

The required fields are:

* version

The optional fields are:

* label
* location (in GeoJSON format)
* starttime
* endtime
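A PGE wrapper could assemble this file programmatically. The sketch below is one possible approach under stated assumptions: the helper name build_dataset_json is hypothetical, and the only rule it enforces is the one stated above, that version is required while the other four fields are optional.

```python
import json


def build_dataset_json(version, label=None, location=None,
                       starttime=None, endtime=None):
    """Assemble the dataset.json content; only `version` is required."""
    if not version:
        raise ValueError("version is required")
    doc = {"version": version}
    optional = {"label": label, "location": location,
                "starttime": starttime, "endtime": endtime}
    # Leave out optional fields that were not supplied.
    doc.update({k: v for k, v in optional.items() if v is not None})
    return doc


doc = build_dataset_json(
    "v1.0",
    label="dumby product for 2017-01-01T00:00:00Z",
    starttime="2017-01-01T00:00:00",
    endtime="2017-01-01T00:05:00",
)
text = json.dumps(doc, indent=2)
print(text)
```

Omitting unset optional fields, rather than writing them as null, keeps the file limited to the keys shown in the example above.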

2. Metadata JSON file

In addition, other metadata can be added to a <Dataset ID>.met.json file in the <Dataset ID> directory. As long as the file conforms to the JSON format, the dataset developer has free rein over what goes into this file:

    $ cat dumby-product-20170101T000000Z-3lx0a.met.json
    {
      "startingRange": 800026.4431219272,
      "sensor": "SAR-C Sentinel1",
      "esd_threshold": 0.85,
      "tiles": true,
      "reference": true,
      "trackNumber": 144,
      "lookDirection": "right",
      "beamMode": "IW",
      "direction": "descending",
      "inputFile": "sentinel.ini",
      "polarization": "VV",
      "imageCorners": {
        "maxLon": -117.56055555555555,
        "minLon": -119.06166666666667,
        "minLat"
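Because the only stated constraint on the met.json file is that it be valid JSON, writing it from a wrapper can be as simple as serializing a dictionary. This is a minimal sketch; the helper name write_met_json is hypothetical, and the two metadata keys used in the demo are taken from the example above.

```python
import json
import os
import tempfile


def write_met_json(dataset_dir, metadata):
    """Write `metadata` to <Dataset ID>.met.json inside `dataset_dir`."""
    dataset_id = os.path.basename(os.path.normpath(dataset_dir))
    met_path = os.path.join(dataset_dir, dataset_id + ".met.json")
    with open(met_path, "w") as f:
        # json.dump raises TypeError for values JSON cannot represent.
        json.dump(metadata, f, indent=2, sort_keys=True)
    return met_path


# Demo in a throwaway work directory.
with tempfile.TemporaryDirectory() as work_dir:
    ds_dir = os.path.join(work_dir, "dumby-product-20170101T000000Z-3lx0a")
    os.makedirs(ds_dir)
    met_path = write_met_json(
        ds_dir, {"sensor": "SAR-C Sentinel1", "trackNumber": 144}
    )
    met_name = os.path.basename(met_path)
    # Read the file back to confirm it parses as JSON.
    with open(met_path) as f:
        loaded = json.load(f)
    print(met_name)
```

Deriving the file name from the directory name keeps the two in step, which matters because the directory is expected to be named with the Dataset ID.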

End of formatting revisions.

Start of unrevised content:

= Dataset Specification (how to author a new dataset type) =

For HySDS to recognize a dataset, the dataset must follow certain conventions. These conventions are documented on this page and must be implemented by the PGE or the PGE wrapper.

== Dataset ID ==

Each product should have a dataset ID. This name is used to determine the type of the dataset and name all the important files for the dataset. A dataset ID is matched against entries found in the <code>datasets.json</code> file to determine its type.<br />
In this example, we shall use the dataset ID <code>dumby-product-20170101T000000Z-3lx0a</code>.

== Directory ==

Any directory containing the below JSON files and found within the working directory supplied to the PGE is considered a dataset. Thus this directory must be named with the dataset's ID (see above):

<pre>$ pwd
/data/work/example_work_dir/dumby-product-20170101T000000Z-3lx0a
$ ls
dumby-product-20170101T000000Z-3lx0a.dataset.json
dumby-product-20170101T000000Z-3lx0a.met.json
dumby-product-20170101T000000Z-3lx0a.prov_es.json
dumby-product-20170101T000000Z-3lx0a.h5
pge_output_2.h5
errors.txt
other_metadata.xml
</pre>
''Note that any other PGE data files should be placed in the <Dataset ID> directory, as the whole directory is the dataset.''

== HySDS dataset and metadata JSON files ==

=== dataset JSON file ===

A product must produce a <Dataset ID>.dataset.json in the <Dataset ID> directory. This file contains JSON formatted metadata representing the cataloged dataset metadata:

<pre>$ cat dumby-product-20170101T000000Z-3lx0a.dataset.json
 {
  "version": "v1.0",
  "label": "dumby product for 2017-01-01T00:00:00Z",
  "location": {
    "type": "polygon",
    "coordinates": [
      [
        [-122.9059682940358,40.47090915967475],
        [-121.6679748715316,37.84406528996276],
        [-120.7310161872557,38.28728069813177],
        [-121.7043611684245,39.94137004454238],
        [-121.9536916840953,40.67097860759095],
        [-122.3100379696548,40.7267890636145],
        [-122.7640648263371,40.5457010812299],
        [-122.9059682940358,40.47090915967475]
      ]
    ]
  },
  "starttime": "2017-01-01T00:00:00",
  "endtime": "2017-01-01T00:05:00"
}
</pre>
The required fields are:

* <code>version</code>

The optional fields are:

* <code>label</code>
* <code>location</code> (in GeoJSON format)
* <code>starttime</code>
* <code>endtime</code>

=== metadata JSON file ===

In addition, other metadata can be added to a <Dataset ID>.met.json in the <Dataset ID> directory. As long as the file conforms to the JSON format, the dataset developer has free rein over what goes into this file:

<pre>$ cat dumby-product-20170101T000000Z-3lx0a.met.json
{
  "startingRange": 800026.4431219272,
  "sensor": "SAR-C Sentinel1",
  "esd_threshold": 0.85,
  "tiles": true,
  "reference": true,
  "trackNumber": 144,
  "lookDirection": "right",
  "beamMode": "IW",
  "direction": "descending",
  "inputFile": "sentinel.ini",
  "polarization": "VV",
  "imageCorners": {
    "maxLon": -117.56055555555555,
    "minLon": -119.06166666666667,
    "minLat"
</pre>

