...
This says that if Pleiades can fit 90-minute jobs, go ahead and dispatch our PBS jobs for job_worker-singularity.sh.
CLI to PBS
Quickly delete all running+queued jobs
qstat -r -q hysds > qstat.txt
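The command above only saves a listing; the delete step itself is not shown here. As a minimal sketch, the saved listing could be turned into `qdel` calls as below. The helper names `extract_job_ids` and `delete_listed_jobs` are hypothetical, and the parsing assumes the job ID is the first column and starts with a digit, which should be checked against the actual qstat output on the system. Note that `qstat -r` lists running jobs only; queued jobs would need their own listing (in standard PBS, `qstat -i`; treat that as an assumption to verify).

```bash
#!/usr/bin/env bash
# Print the job IDs found in a saved qstat listing.
# Assumption: the job ID is the first column and starts with a digit
# (e.g. 1234567.pbspl1), so header and separator lines are skipped.
extract_job_ids() {
    awk '$1 ~ /^[0-9]/ {print $1}' "$1"
}

# Delete every job in the listing. With dry_run=yes (the default) the qdel
# commands are only printed so they can be reviewed first.
delete_listed_jobs() {
    listing=$1
    dry_run=${2:-yes}
    extract_job_ids "${listing}" | while read -r job_id; do
        if [ "${dry_run}" = "yes" ]; then
            echo "qdel ${job_id}"
        else
            qdel "${job_id}"
        fi
    done
}
```

`delete_listed_jobs qstat.txt` prints the commands for review; `delete_listed_jobs qstat.txt no` actually runs them.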
PBS script
#PBS -l select=xx:ncpus=yy:model=zz
...
https://www.nas.nasa.gov/hecc/support/kb/pbs-environment-variables_178.html
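For illustration, a filled-in script might look like the sketch below. The resource values (1 node, 28 CPUs, `model=bro`) are assumptions chosen for the example, not values from this page; the 90-minute walltime and the `hysds` queue follow the notes above. `describe_job` is a hypothetical helper that logs a few of the standard PBS environment variables (`PBS_JOBID`, `PBS_O_WORKDIR`, `PBS_NODEFILE`) described at the link above.

```bash
#!/usr/bin/env bash
#PBS -l select=1:ncpus=28:model=bro
#PBS -l walltime=01:30:00
#PBS -q hysds

# Hypothetical helper: log the PBS-provided context before starting the worker.
# Each variable falls back to "unset" when the script runs outside PBS.
describe_job() {
    echo "job_id=${PBS_JOBID:-unset}"
    echo "submit_dir=${PBS_O_WORKDIR:-unset}"
    echo "nodefile=${PBS_NODEFILE:-unset}"
}

describe_job
# The worker (e.g. job_worker-singularity.sh) would be started here.
```

Logging these values at the top of the job makes it easier to match a worker's log to its PBS job later.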
Enable “auto-scaling”-like behavior with PBS
set_desire_worker.sh
mimics scale-out (scale up)
Code Block |
---|
#!/usr/bin/env bash
DESIRED=$1
echo "# DESIRED: ${DESIRED}"
while true; do
TIMESTAMP=$(date +%Y%m%dT%H%M%S)
echo "$(date) checking qstat on hysds queue..."
# get count of running and queue jobs
TOKENS=$(qstat -q hysds | awk '{if ($1=="hysds") print $6 " " $7}')
IFS=" " read -r RUNNING QUEUED <<< "${TOKENS}"
echo "# RUNNING: ${RUNNING}"
echo "# QUEUED: ${QUEUED}"
RUNNING_QUEUED=$((RUNNING + QUEUED))
echo "# RUNNING_QUEUED: ${RUNNING_QUEUED}"
if [ "${RUNNING_QUEUED}" -lt "${DESIRED}" ]; then
echo "# ---> qsub one more job..."
qsub -q hysds celery.pbs
fi
echo ""
sleep 60
done |
reference: https://www.nas.nasa.gov/hecc/support/kb/commonly-used-pbs-commands_174.html
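The loop above submits at most one job per 60-second pass. As a possible variation, the counting and the decision can be split into two small functions so that the whole shortfall is known in one pass. This is only a sketch: `queue_counts` and `deficit` are hypothetical names, and the column positions (run count in column 6, queued count in column 7) are taken from the awk expression in the script above.

```bash
#!/usr/bin/env bash
# Read `qstat -q` output on stdin and print "<running> <queued>" for one queue.
# Assumes the same layout as the script above: queue name in column 1,
# running count in column 6, queued count in column 7.
queue_counts() {
    awk -v q="$1" '$1 == q {print $6 " " $7}'
}

# Print how many more jobs are needed to reach the desired worker count.
# Missing counts default to 0, and the result is never negative.
deficit() {
    desired=$1
    running=${2:-0}
    queued=${3:-0}
    missing=$((desired - running - queued))
    if [ "${missing}" -gt 0 ]; then
        echo "${missing}"
    else
        echo 0
    fi
}

# Possible use inside the loop (not executed here):
#   read -r RUNNING QUEUED <<EOF
#   $(qstat -q hysds | queue_counts hysds)
#   EOF
#   N=$(deficit "${DESIRED}" "${RUNNING}" "${QUEUED}")
#   then call `qsub -q hysds celery.pbs` N times
```

Submitting the full shortfall at once reaches the desired count faster, at the cost of a burst of `qsub` calls; the one-job-per-minute loop is gentler on the scheduler.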
harikiri on job workers
What happens to the job worker when PBS kills the job?
...