SciFlo Tutorial 1: Slice and Plot HDF Data Variable


 

Confidence Level TBD  This article has not been reviewed for accuracy, timeliness, or completeness. Check that this information is valid before acting on it.





In this tutorial, you will write a SciFlo using a top-down programming approach. The goal is to slice a data variable from an AIRS HDF data file, extract the pressure levels from the same file, and generate a plot of that slice against the pressure levels.   

  1. Let's generate the SciFlo document for this flow. Start with a skeleton SciFlo document:  

    •    

       $ cd /tmp; mkdir test; cd test
       $ vi test.sf.xml

        

      Copy and paste the following SciFlo XML into the file:  Include(SciFloDocSkeleton)  

  2. Modify the id attribute of the <sf:flow> element and the <sf:description> value:  

    •    

        <sf:flow id="SliceAndPlotAIRSHdf">
          <sf:description>Slice and plot a data variable from an AIRS HDF file.</sf:description>

        

  3. Now we need to set the global inputs of the flow. For the sake of brevity, let's say the only global input we need for this flow is the AIRS HDF file. Modify the top level <sf:inputs> section (not the ones enclosed in the <sf:process> blocks) to this:  

    •     

        <sf:inputs>
          <airsFile>/tmp/test/AIRS.2003.01.01.001.L2.RetStd.v4.0.9.0.G06048023154.hdf</airsFile>
        </sf:inputs>

       

    • As of version 0.9.5, you need to specify the absolute path to any local data files being used as inputs. This will be resolved in subsequent versions.   
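If you are unsure whether a path is absolute, a small helper can normalize it before you paste it into the `<airsFile>` element. This is only a convenience sketch using the Python standard library; `absolute_input_path` is a hypothetical helper name and is not part of SciFlo.

```python
import os


def absolute_input_path(path: str) -> str:
    """Return an absolute, normalized version of a local data file path."""
    # expanduser handles a leading "~"; abspath resolves relative paths
    # against the current working directory.
    return os.path.abspath(os.path.expanduser(path))


# Run this from the directory that holds the HDF file (e.g. /tmp/test).
airs_file = absolute_input_path(
    "AIRS.2003.01.01.001.L2.RetStd.v4.0.9.0.G06048023154.hdf"
)
print(f"<airsFile>{airs_file}</airsFile>")
```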

  4. Now we need to set the global outputs of the flow. For now, we only want the plot of the variable. Modify the top-level <sf:outputs> section (not the ones enclosed in the <sf:process> blocks) to this:  

    •     

        

  5. Now on to the processes (operators) that build this flow. We need to slice the data variable, extract the pressure levels from the HDF file, and plot that slice against the pressure levels. Thus, we need three operators: (a) one to take the HDF file, extract the data variable, and cut a slice from it, (b) one to extract the pressure levels defined in that HDF file, and (c) one to take that slice along with the extracted pressure levels and plot it. Copy the existing <sf:process> block (everything from <sf:process> to </sf:process>; NOT <sf:processes>) and paste the copy directly below the original. Do that again. You should now have three <sf:process> blocks in the <sf:processes> block:  

    •                                            
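To see how the three operators depend on one another, the data flow can be written down as a small dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The process names are assumptions borrowed from the function names used later in this tutorial, and this is not a description of how the SciFlo engine schedules work.

```python
from graphlib import TopologicalSorter

# Each process maps to the set of processes whose outputs it consumes.
# Both extraction operators read only the global input (the AIRS HDF file),
# so they have no upstream processes.
dependencies = {
    "sliceVar": set(),
    "getPressLevels": set(),
    "plotSlice": {"sliceVar", "getPressLevels"},
}

# static_order() yields every process after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

The two extraction operators do not depend on each other, so their relative order is unspecified; the plotting operator always comes last.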

        

  6. Let's modify the first <sf:process> block so that it specifies the data variable slicing operator. Once again, let's simply assume that this operator's interface requires only the AIRS data file as input and outputs only the sliced variable (array). This operator is going to be written in Python, so let's also start the <sf:binding> specification. Modify the first process block to this:  

    •                

       

    • Note that the <airsFile> input gets its value from the global inputs because of the from="@#inputs" attribute and the fact that they both share the same tag name. If the tag name had been something else, e.g. <airsHdfFile>, you would have needed to specify from="@#inputs.airsFile" instead to connect it to the corresponding global input. 

    • The specified binding (python:myModule.py?myModule.sliceVar) tells the SciFlo engine that:  

      1. This is a Python function (python:)  

      2. It needs to stage a Python module file (myModule.py)  

      3. Use the fully qualified function name to access it (myModule.sliceVar)   
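The three parts of a binding string can be pulled apart with plain string operations. The following is an illustrative sketch only: `Binding` and `parse_binding` are hypothetical names, and the SciFlo engine's own parser may behave differently.

```python
from typing import NamedTuple


class Binding(NamedTuple):
    scheme: str       # e.g. "python"
    module_file: str  # file to stage, e.g. "myModule.py"
    function: str     # fully qualified name, e.g. "myModule.sliceVar"


def parse_binding(binding: str) -> Binding:
    """Split '<scheme>:<module file>?<qualified function>' into its parts."""
    scheme, sep, rest = binding.partition(":")
    if not sep:
        raise ValueError(f"missing scheme in binding: {binding!r}")
    module_file, sep, function = rest.partition("?")
    if not sep:
        raise ValueError(f"missing '?' in binding: {binding!r}")
    return Binding(scheme, module_file, function)


parsed = parse_binding("python:myModule.py?myModule.sliceVar")
print(parsed.scheme, parsed.module_file, parsed.function)
```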

  7. Let's modify the second <sf:process> block so that it specifies the pressure level extraction operator. Let's simply assume that this operator's interface requires only the AIRS data file as input and outputs only the pressure levels (array). This operator is going to be written in Python, so let's also start the <sf:binding> specification. Modify the second process block to this:  

    •                

       

    • Note that the <airsFile> input gets its value from the global inputs because of the from="@#inputs" attribute and the fact that they both share the same tag name. If the tag name had been something else, e.g. <airsHdfFile>, you would have needed to specify from="@#inputs.airsFile" instead to connect it to the corresponding global input.  

    • The specified binding (python:myModule.py?myModule.getPressLevels) tells the SciFlo engine that:  

      1. This is a Python function (python:)  

      2. It needs to stage a Python module file (myModule.py)  

      3. Use the fully qualified function name to access it (myModule.getPressLevels)   

  8. Let's modify the third <sf:process> block so that it specifies the plotting operator. This operator's interface requires the sliced array and the pressure levels as inputs and outputs the plot file. This operator is also going to be written in Python, so let's specify its binding. Modify the third process block to this:  

    •                 

       

    • Note that the <varSlice> input gets its value from the output of the sliceVar process because of the from="@#sliceVar" attribute and the fact that they both share the same tag name. If the tag name had been something else, e.g. <dataVarSlice>, you would have needed to specify from="@#sliceVar.varSlice" instead to connect it to the corresponding output of the sliceVar process.  

    • Note that the <presLevs> input gets its value from the output of the previous process because of the from="@#previous" attribute and the fact that they both share the same tag name. If the tag name had been something else, e.g. <pressureLevels>, you would have needed to specify from="@#previous.presLevs" instead to connect it to the corresponding output of the previous process.  

    • The specified binding (python:myModule.py?myModule.plotSlice) tells the SciFlo engine that:  

      1. This is a Python function (python:)  

      2. It needs to stage a Python module file (myModule.py)  

      3. Use the fully qualified function name to access it (myModule.plotSlice)   
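The `from` attribute rules described in steps 6 through 8 (an implicit match on the tag name versus an explicit `scope.name` reference) can be modelled in a few lines. This is a sketch of the rule as the tutorial states it, not the engine's implementation. `resolve_from` and the `scopes` dictionary are hypothetical, and the array values are placeholder strings.

```python
def resolve_from(ref: str, tag: str, scopes: dict):
    """Resolve a from="@#scope" or from="@#scope.name" reference.

    scopes maps a scope name ("inputs", a process id, or "previous")
    to a dict holding that scope's named values.
    """
    if not ref.startswith("@#"):
        raise ValueError(f"unsupported reference: {ref!r}")
    scope_name, _, explicit_name = ref[2:].partition(".")
    # Without an explicit name, fall back to the element's own tag name.
    name = explicit_name or tag
    return scopes[scope_name][name]


airs_path = "/tmp/test/AIRS.2003.01.01.001.L2.RetStd.v4.0.9.0.G06048023154.hdf"
scopes = {
    "inputs": {"airsFile": airs_path},
    "sliceVar": {"varSlice": "<sliced array>"},
    # "previous" stands for whichever process precedes the current one.
    "previous": {"presLevs": "<pressure levels>"},
}

# Same tag name as the global input: the short form is enough.
print(resolve_from("@#inputs", "airsFile", scopes))
# Different tag name: the explicit form names the source value.
print(resolve_from("@#inputs.airsFile", "airsHdfFile", scopes))
```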

  9. Now that we have the <sf:process> block defined for the plotting operator, we can edit the <varPlot> element in the global outputs block so that it knows where to get the output from. Modify the global outputs (not the one in any of the <sf:process> blocks) to this:  

    •     

        

  10. Save the SciFlo file. Here is the SciFlo document in its entirety:  

    •                                                                   

        

  11. Download the AIRS HDF file from here: attachment:AIRS.2003.01.01.001.L2.RetStd.v4.0.9.0.G06048023154.hdf  

    •   

        

  12. Try executing the SciFlo. You should get an error because the SciFlo engine cannot resolve the operators' bindings. Remember that all three of the operators bind to python functions defined in a module named "myModule". Basically, the SciFlo engine is looking for a file named myModule.py in the current working directory but is unable to find it:  

    •                     
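Before running the flow, you can check for yourself whether the module files named in the bindings are present. The sketch below looks for each file in a given directory, mirroring the lookup in the current working directory described above. `missing_module_files` is a hypothetical helper, not a SciFlo tool.

```python
import os


def missing_module_files(bindings, search_dir="."):
    """Return module files named in python bindings that are absent from search_dir."""
    missing = []
    for binding in bindings:
        scheme, _, rest = binding.partition(":")
        if scheme != "python":
            continue  # this sketch only understands python bindings
        module_file = rest.partition("?")[0]
        path = os.path.join(search_dir, module_file)
        if not os.path.isfile(path) and module_file not in missing:
            missing.append(module_file)
    return missing


bindings = [
    "python:myModule.py?myModule.sliceVar",
    "python:myModule.py?myModule.getPressLevels",
    "python:myModule.py?myModule.plotSlice",
]
# An empty list means every referenced module file was found.
print(missing_module_files(bindings))
```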

        

  13. So let's finally write these Python operators.  

    •   

       

    • Let's import the hdfeos, Numeric, and Masked Array Python modules to perform the HDF I/O. We'll also import the matplotlib module to use its plotting functions. NOTE: These modules are already installed with the SciFlo bundle.



       

    • Also, AIRS HDF files define a swath id, so for the time being let's just hard-code that as a global variable in this module.


       

    • Let's define the sliceVar (myModule.sliceVar) function which will simply extract the 'TAirStd' variable (Air Temperature) from the HDF file, slice the first grid from it, and return that slice:       



       

    • Next, let's define the getPressLevels (myModule.getPressLevels) function which will simply return the 'pressStd' variable from the file:     



       

    • Finally, let's define the plotSlice (myModule.plotSlice) function which will take the slice and plot it against the pressure levels. The return value of this function should be the plot file itself:                   



       

    • Here's the content of myModule.py in its entirety:                                  



        

  14. Now let's try to execute the SciFlo:  

    •   

       

    • You should see a status screen similar to the following:  attachment:sflExecStatus.png  

    • If no exception occurred, you should see the following output upon completion of the SciFlo execution:     

       

    • To view the plot:    

       attachment:HDFSlicePlot.png   

That's it. Note that this SciFlo can be made more useful. For example, we could expose the variable name as another input to the sliceVar function so that the user can specify the variable to slice instead of having it hard-coded. 
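One way to expose the variable name is sketched below, under stated assumptions. The `varName` parameter and the `readVariable` callable are hypothetical: `readVariable` stands in for the module's real HDF reading code, which is not reproduced here, and `fakeReader` returns made-up numbers purely so the sketch runs without an HDF file. Taking "the first grid" to mean index 0 along the leading axis is also an assumption.

```python
def sliceVar(airsFile, varName="TAirStd", readVariable=None):
    """Return the first slice of the named variable from an AIRS HDF file.

    readVariable is a hypothetical callable (airsFile, varName) -> nested
    sequence; it stands in for the real HDF reading code.
    """
    if readVariable is None:
        raise ValueError("readVariable must be supplied in this sketch")
    data = readVariable(airsFile, varName)
    # Assumption: "the first grid" is index 0 along the leading axis.
    return data[0]


def fakeReader(airsFile, varName):
    """Stand-in reader returning tiny, made-up 2 x 2 x 3 nested lists."""
    tables = {
        "TAirStd": [
            [[250.0, 240.0, 230.0], [251.0, 241.0, 231.0]],
            [[252.0, 242.0, 232.0], [253.0, 243.0, 233.0]],
        ],
        "otherVar": [
            [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
            [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]],
        ],
    }
    return tables[varName]


# The default keeps the tutorial's behavior; callers may pick another variable.
print(sliceVar("example.hdf", readVariable=fakeReader))
print(sliceVar("example.hdf", varName="otherVar", readVariable=fakeReader))
```

The flow would then also need a matching global input wired to the process so that the engine passes the variable name through.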

 



Subject Matter Experts:

@Gerald Manipon
