Managing Job Workers on Pleiades

Confidence Level TBD  This article has not been reviewed for accuracy, timeliness, or completeness. Check that this information is valid before acting on it.


See background: NASA HECC Pleiades

Setting up Tunnel on Pleiades head node back to PCM in AWS

. The ssh tunnel to Pleiades is established from the Mamba cluster factotum. First log in to the factotum:

. ssh -i ~/.ssh/int-aria.pem hysdsops@<IP of mamba factotum>

. The ssh tunnel configuration is in ~/.ssh/config. To ssh to tpfe2, a Pleiades front-end node, run:

. ssh tpfe2-tunnel (or use the alias pleiades='ssh tpfe2-tunnel')

. Two-factor authentication (an RSA token and a password) is required to log on.
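For orientation, a host entry for tpfe2-tunnel in ~/.ssh/config might look roughly like the sketch below. Only the host alias tpfe2-tunnel comes from this page; every value in angle brackets is a placeholder, and the use of RemoteForward (so that processes on the Pleiades head node can reach PCM services) is an assumption about how the tunnel is built, not a copy of the real configuration.

```
# ~/.ssh/config on the factotum (sketch; all values are placeholders)
Host tpfe2-tunnel
    HostName <Pleiades front-end host name>
    User <NAS userid>
    # Keep the long-lived tunnel session from idling out
    ServerAliveInterval 60
    # Assumed direction: listen on the Pleiades side and forward back to a PCM service
    RemoteForward <port on Pleiades> <PCM service host>:<service port>
```

One RemoteForward line would be needed per forwarded service. Check the actual entry on the factotum before relying on any of these values.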

Check Tunnel Ports are live

. Run hysds_pcm_check_port_forwarded_tunnel_services.sh

. 8 tests should pass; the last one (the email service setup on factotum) may fail.

. The script is in the hysds/hysds-hec-utils GitHub repository (HySDS HEC Utilities).
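To illustrate what a forwarded-port check involves, here is a minimal sketch. It is not the hysds-hec-utils script: the function names, the "label:port" argument format, and the ports in the usage note are all hypothetical. It relies only on bash's /dev/tcp pseudo-device and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Sketch only: NOT hysds_pcm_check_port_forwarded_tunnel_services.sh.
# Function names and the "label:port" argument format are hypothetical.

# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
check_port() {
    local host="$1" port="$2"
    timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Check each "label:port" entry on localhost, print one PASS/FAIL line per
# entry, and return the number of failures as the exit status.
check_tunnel_ports() {
    local failures=0 entry label port
    for entry in "$@"; do
        label="${entry%%:*}"
        port="${entry##*:}"
        if check_port 127.0.0.1 "$port"; then
            echo "PASS ${label} (port ${port})"
        else
            echo "FAIL ${label} (port ${port})"
            failures=$((failures + 1))
        fi
    done
    return "$failures"
}
```

A call such as `check_tunnel_ports serviceA:<port> serviceB:<port>` would print one line per service; substitute the ports that the tunnel actually forwards.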

Start “Auto-Scaling” to PBS

. Run pbs_auto_scale_up.sh

. Adjust the settings in the script's "# input settings" section if necessary.

. The script is in the hysds/hysds-hec-utils GitHub repository (HySDS HEC Utilities).
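The general idea of scaling up to PBS can be sketched as follows. This is not pbs_auto_scale_up.sh: MAX_WORKERS, WORKER_PBS_SCRIPT, and all function names are hypothetical, and counting jobs by matching lines that start with a digit is an assumption about the local qstat output format. Only qstat -u and qsub are standard PBS commands.

```shell
#!/usr/bin/env bash
# Sketch only: NOT pbs_auto_scale_up.sh. All names and values are hypothetical.

# input settings
MAX_WORKERS=10                  # assumed cap on concurrent worker jobs
WORKER_PBS_SCRIPT="worker.pbs"  # hypothetical PBS job script for one worker

# Print how many new jobs to submit: max(0, target - current).
jobs_to_submit() {
    local target="$1" current="$2"
    local deficit=$((target - current))
    if [ "$deficit" -gt 0 ]; then
        echo "$deficit"
    else
        echo 0
    fi
}

# Count this user's queued or running jobs.
# Assumption: each job line in the qstat output starts with a numeric job id.
count_my_jobs() {
    qstat -u "$USER" 2>/dev/null | grep -c '^[0-9]'
}

# Submit enough worker jobs to bring the total up to MAX_WORKERS.
scale_up_once() {
    local current n i=0
    current="$(count_my_jobs || true)"
    n="$(jobs_to_submit "$MAX_WORKERS" "$current")"
    while [ "$i" -lt "$n" ]; do
        qsub "$WORKER_PBS_SCRIPT"
        i=$((i + 1))
    done
}
```

Calling scale_up_once periodically (for example from a loop with sleep) would keep topping up the worker count; the real script's settings and submission logic may differ.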

Job Worker Singularity on Pleiades

  • Location of Work Dirs

    • Different users may be assigned different Lustre file systems; for example, user lpan is on /nobackupp12/lpan. To get Lustre quota information, run: lfs quota -u <userid> /nobackupp??

    • Work dirs are under /nobackupp??/<userid>/worker/$year/$month/$day/

    • The work dir for a PBS job is cleaned up when the PBS job finishes.

  • Location of Job Worker logs

    • Log files are under /nobackupp??/<userid>/worker/logs/$year/$month/$day/

    • Log files are kept even after the corresponding PBS jobs finish. Clean them up manually to stay within the Lustre quota.

  • On exit, each job worker cleans up its own work dirs.
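Because the log files must be cleaned up by hand, a small helper can make that routine. The sketch below is an illustration, not part of HySDS: dated_dir and prune_old_logs are hypothetical names, zero-padded $month and $day are assumed, and it depends on GNU date and GNU find.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; zero-padded month/day is assumed.

# Print the dated sub-directory for a base dir and a date (default: today).
# Example base: /nobackupp12/lpan/worker or /nobackupp12/lpan/worker/logs
dated_dir() {
    local base="$1" when="${2:-today}"
    echo "${base}/$(date -d "$when" +%Y/%m/%d)"
}

# Delete regular files under log_root that were last modified more than
# keep_days days ago (find's -mtime +N semantics), then remove any
# directories left empty.
prune_old_logs() {
    local log_root="$1" keep_days="$2"
    find "$log_root" -type f -mtime +"$keep_days" -delete
    find "$log_root" -mindepth 1 -type d -empty -delete
}
```

For example, `prune_old_logs /nobackupp12/lpan/worker/logs 30` would remove log files older than about a month for user lpan. Running `lfs quota -u <userid> /nobackupp??` before and after shows the effect on the quota.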

Debugging

Have Questions? Ask a HySDS Developer:

Anyone can join our public Slack channel to learn more about HySDS. JPL employees can join #HySDS-Community

JPLers can also ask HySDS questions at Stack Overflow Enterprise

Page Information:

Subject Matter Expert:

@Lei Pan

@Hook Hua

@Marjorie Lucas

Find an Error?

Is this document outdated or inaccurate? Please contact the assigned Page Maintainer:

@Hook Hua
