PBS Queue Management

Functions for submitting and monitoring jobs in a PBS queue.

oi_tools.pbs.list_jobs(
users: Literal['me', 'all'] | list[str] = 'me',
*,
limit: int = 20,
completed: bool = False,
) → None

List active and queued PBS jobs in a formatted table.

Parameters:
  • users (Literal['me', 'all'] | list[str]) – Who to show jobs for. "me" shows only the current user’s jobs, "all" shows all users, or a list of usernames to filter by.

  • limit (int) – Maximum number of jobs to display.

  • completed (bool) – If True, include finished/completed jobs alongside active ones.

Return type:

None
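
The users, limit, and completed semantics described above can be illustrated with a small stand-alone filter. This is only a sketch: the Job record and the select_jobs helper are hypothetical names that are not part of oi_tools, and the real function queries PBS and prints a table instead of returning a list.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal


@dataclass
class Job:
    """Hypothetical job record; not an oi_tools type."""

    job_id: str
    user: str
    finished: bool = False


def select_jobs(
    jobs: list[Job],
    users: Literal["me", "all"] | list[str] = "me",
    *,
    current_user: str,
    limit: int = 20,
    completed: bool = False,
) -> list[Job]:
    """Filter job records following the parameter descriptions above."""
    if users == "all":
        wanted = None  # no user filter
    elif users == "me":
        wanted = {current_user}
    else:
        wanted = set(users)
    selected = [
        job
        for job in jobs
        if (wanted is None or job.user in wanted)
        and (completed or not job.finished)
    ]
    return selected[:limit]


jobs = [
    Job("1.cluster", "alice"),
    Job("2.cluster", "bob"),
    Job("3.cluster", "alice", finished=True),
]
mine = select_jobs(jobs, "me", current_user="alice")
everyone = select_jobs(jobs, "all", current_user="alice", completed=True)
```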

oi_tools.pbs.submit_job(
file: str | Path,
args: Sequence[str | Path | float | int] = [],
*,
mem: str | int = '8G',
cpus: int = 4,
wait: bool = False,
log_folder: Path | str | None = PosixPath('logs'),
verbose: bool | None = True,
filetype: Literal['python_script', 'python_module', 'stata_script', 'r_script', 'sas_script'] | None = None,
base_job_name: str | None = None,
python_executable: Path | None = None,
cwd: Path | str | None = None,
env_vars: Mapping[str, str | Path | float | int] | None = None,
) → str

Submit a single script or module as a PBS job.

Logs are written to <log_folder>/<base_job_name>/<args>/, with one file per submission named by today’s date and an incrementing counter (e.g. logs/myscript/2001/2026-01-15-1.log).
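
The naming scheme above can be sketched as follows. The helper next_log_path is hypothetical and is not part of oi_tools; in particular, joining multiple arguments with "_" to form the subfolder name is an assumption, since the text only shows the single-argument case.

```python
from __future__ import annotations

import tempfile
from datetime import date
from pathlib import Path


def next_log_path(
    log_folder: str | Path,
    base_job_name: str,
    args: list[str],
    today: date,
) -> Path:
    """Return the first unused <date>-<counter>.log path (hypothetical helper)."""
    # Assumption: multiple arguments are joined with "_" to name the subfolder.
    folder = Path(log_folder) / base_job_name / "_".join(args)
    folder.mkdir(parents=True, exist_ok=True)
    counter = 1
    while (folder / f"{today.isoformat()}-{counter}.log").exists():
        counter += 1
    return folder / f"{today.isoformat()}-{counter}.log"


with tempfile.TemporaryDirectory() as tmp:
    first = next_log_path(tmp, "myscript", ["2001"], date(2026, 1, 15))
    first.touch()  # simulate the first submission writing its log
    second = next_log_path(tmp, "myscript", ["2001"], date(2026, 1, 15))
    first_parts = first.relative_to(tmp).parts
    first_name, second_name = first.name, second.name
```

Submitting the same script with the same arguments twice on one day would, under these assumptions, produce two log files that differ only in the trailing counter.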

Parameters:
  • file (str | Path) – Path to a Python script (e.g. "code/myscript.py"), a Stata do-file (e.g. "code/myscript.do"), an R script (e.g. "code/myscript.r"), or a fully-qualified Python module name (e.g. "myproject.submodule"). The filetype is inferred from the file suffix; use filetype to override.

  • args (Sequence[str | Path | float | int]) – Command-line arguments to pass to the script or module. Defaults to no arguments.

  • mem (str | int) – Memory to request. Can be an integer (treated as gigabytes) or a string such as "16G". Defaults to "8G" (overridable via OI_TOOLS_PBS_DEFAULT_MEM).

  • cpus (int) – Number of CPU cores to request. Defaults to 4 (overridable via OI_TOOLS_PBS_DEFAULT_CPUS).

  • wait (bool) – Whether to wait for the job to finish before returning (True) or immediately return (False).

  • log_folder (Path | str | None) – Directory in which to create log files. Set to None to discard output. Defaults to "logs" (overridable via OI_TOOLS_PBS_LOG_FOLDER).

  • verbose (bool | None) – Print job details and the generated PBS script before submitting. Defaults to True.

  • filetype (Literal['python_script', 'python_module', 'stata_script', 'r_script', 'sas_script'] | None) – Explicitly set the script type: "python_script", "python_module", "stata_script", "r_script", or "sas_script". If omitted, the type is inferred from the file suffix (.py → python_script, .do → stata_script, .r → r_script). Falls back to "python_module" if the suffix is unrecognized.

  • base_job_name (str | None) – Base name used for the PBS job and the log subdirectory. Defaults to the script path with non-alphanumeric characters replaced by hyphens (e.g. "code/myscript.py" → "code-myscript").

  • python_executable (Path | None) – Path to the Python interpreter to use. Defaults to the first python3 or python found on PATH. Only used for Python jobs.

  • cwd (Path | str | None) – Working directory for the job. Defaults to the current directory at submission time.

  • env_vars (Mapping[str, str | Path | float | int] | None) – Optional dictionary of environment variables to export in the job script (e.g. {"MY_VAR": "value"}). Defaults to None.
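
The default-handling rules for filetype, base_job_name, and mem can be sketched with a few small helpers. These functions are hypothetical illustrations, not the library's implementation. Two details are assumptions: suffix matching is treated as case-insensitive, and a run of several non-alphanumeric characters collapses into a single hyphen.

```python
from __future__ import annotations

import re
from pathlib import Path

# Suffix-to-filetype mapping as listed in the filetype description.
SUFFIX_TO_FILETYPE = {
    ".py": "python_script",
    ".do": "stata_script",
    ".r": "r_script",
}


def infer_filetype(file: str | Path) -> str:
    """Guess the filetype from the suffix, falling back to "python_module"."""
    # Assumption: suffix matching is case-insensitive.
    return SUFFIX_TO_FILETYPE.get(Path(file).suffix.lower(), "python_module")


def default_job_name(script: str | Path) -> str:
    """Drop the suffix, then replace non-alphanumeric runs with hyphens."""
    stem = str(Path(script).with_suffix(""))
    # Assumption: consecutive separators collapse into one hyphen.
    return re.sub(r"[^A-Za-z0-9]+", "-", stem).strip("-")


def normalize_mem(mem: str | int) -> str:
    """Treat a bare integer as gigabytes, e.g. 16 becomes "16G"."""
    return f"{mem}G" if isinstance(mem, int) else mem
```

Note that default_job_name is only meaningful for script paths here; how the library names jobs for dotted module names is not stated above.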

Returns:

The PBS job ID returned by qsub (e.g. "12345.cluster").

Return type:

str

Examples

Submit a Python script with a year argument:

>>> from pathlib import Path
>>> job_id = submit_job(  
...     "code/myscript.py",
...     ["2001"],
...     mem="16G",
...     cpus=8,
...     log_folder="logs",
...     python_executable=Path(".venv/bin/python3"),
...     base_job_name="jobname",
... )

Submit a Stata do-file:

>>> submit_job("code/myscript.do", ["2001"])  

Submit an R script:

>>> submit_job("code/myscript.r", ["2001"])  

Submit a Python module:

>>> submit_job("myproject.submodule", filetype="python_module")  

Pass environment variables to the job:

>>> submit_job(  
...     "code/myscript.py",
...     ["2001"],
...     env_vars={"MY_TOKEN": "abc123", "DATA_DIR": "/scratch/myproject"},
... )

oi_tools.pbs.submit_many_jobs(
file: str | Path,
*,
args: Iterable[Sequence[ArgumentType]] | None = None,
env_vars: Iterable[Mapping[str, ArgumentType]] | None = None,
max_concurrent_jobs: int = 7,
stop_on_error: bool = True,
**kwargs,
) → None

Submit multiple jobs to the PBS queue and block until all finish.

Jobs are submitted in batches to keep at most max_concurrent_jobs running or queued at any time. As each job completes, the next one is submitted automatically. If any job exits with a non-zero status, a BatchJobError is raised (unless stop_on_error=False).
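
The batching behavior described above amounts to a bounded-concurrency loop. The sketch below simulates it with fake submit and wait callables so it runs without a cluster; run_bounded, fake_submit, and fake_wait are hypothetical names, and the real function's internals may differ.

```python
from __future__ import annotations

from collections import deque
from typing import Callable, Iterable


def run_bounded(
    job_specs: Iterable[str],
    submit: Callable[[str], str],
    wait_for_any: Callable[[set[str]], str],
    max_concurrent_jobs: int = 7,
) -> list[str]:
    """Keep at most max_concurrent_jobs in flight; return IDs in finish order."""
    if max_concurrent_jobs < 1:
        raise ValueError("max_concurrent_jobs must be at least 1")
    pending = deque(job_specs)
    in_flight: set[str] = set()
    finished: list[str] = []
    while pending or in_flight:
        # Top up the queue until the concurrency cap is reached.
        while pending and len(in_flight) < max_concurrent_jobs:
            in_flight.add(submit(pending.popleft()))
        # Block until one in-flight job completes, then free its slot.
        done = wait_for_any(in_flight)
        in_flight.remove(done)
        finished.append(done)
    return finished


# Fake scheduler for the demo: the lowest job ID always finishes first.
observed_sizes: list[int] = []


def fake_submit(spec: str) -> str:
    return f"{spec}.cluster"


def fake_wait(job_ids: set[str]) -> str:
    observed_sizes.append(len(job_ids))  # record how many jobs are in flight
    return min(job_ids)


order = run_bounded([str(y) for y in range(2000, 2005)], fake_submit, fake_wait, 3)
peak = max(observed_sizes)
```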

Parameters:
  • file (str | Path) – Path to a Python script, or a module name (e.g. "myproject.submodule"). Passed directly to submit_job().

  • args (Iterable[Sequence[ArgumentType]] | None) – An iterable of argument lists, one per job. Each element is passed as the args parameter of submit_job(). Mutually exclusive with env_vars.

  • env_vars (Iterable[Mapping[str, ArgumentType]] | None) – An iterable of environment variable dicts, one per job. Each element is passed as the env_vars parameter of submit_job(). Mutually exclusive with args.

  • max_concurrent_jobs (int) – Maximum number of jobs to have running or queued at any given time. Defaults to 7 (overridable via OI_TOOLS_PBS_MAX_CONCURRENT_JOBS).

  • stop_on_error (bool) – If True (default), raise BatchJobError as soon as any job fails. If False, continue submitting and waiting for the remaining jobs.

  • **kwargs – Additional keyword arguments forwarded to every submit_job() call (e.g. filetype, mem, cpus, log_folder, etc.).
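
How args, env_vars, and **kwargs combine into individual submit_job() calls can be sketched as below. The helper per_job_kwargs is hypothetical; in particular, returning no jobs when neither iterable is given is an assumption, since the text only states that the two are mutually exclusive.

```python
from __future__ import annotations

from typing import Any, Iterable, Mapping, Sequence


def per_job_kwargs(
    args: Iterable[Sequence[Any]] | None = None,
    env_vars: Iterable[Mapping[str, Any]] | None = None,
) -> list[dict[str, Any]]:
    """Turn one iterable into a list of per-job keyword arguments."""
    if args is not None and env_vars is not None:
        raise ValueError("args and env_vars are mutually exclusive")
    if args is not None:
        return [{"args": list(job_args)} for job_args in args]
    # Assumption: passing neither yields no jobs instead of raising.
    return [{"env_vars": dict(job_env)} for job_env in (env_vars or [])]


shared = {"mem": "16G"}  # forwarded to every call, like **kwargs above
calls = [{**shared, **kw} for kw in per_job_kwargs(args=[["2000"], ["2001"]])]
```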

Return type:

None

Examples

Submit one job per year for 2000–2019, keeping at most 3 running at once, and block until all jobs finish:

>>> many_args = [[str(year)] for year in range(2000, 2020)]
>>> submit_many_jobs(  
...     "code/myscript.py",
...     args=many_args,
...     max_concurrent_jobs=3,
...     mem="16G",
... )

Submit one job per state, passing each state as an environment variable:

>>> many_env_vars = [{"STATE": s} for s in ["MN", "MA", "WY"]]
>>> submit_many_jobs(  
...     "code/myscript.py",
...     env_vars=many_env_vars,
... )

Keep submitting and waiting for the remaining jobs even if some fail:

>>> submit_many_jobs(  
...     "code/myscript.py",
...     args=many_args,
...     stop_on_error=False,
... )

oi_tools.pbs.wait_for_job(
job_ids: str | Collection[str],
*,
stop_on_error: bool = True,
wait_for_all: bool = False,
polling_delay: int | float = 5.0,
) → str

Watch the specified job(s) and return when one (or all) finish.

Parameters:
  • job_ids (str | Collection[str]) – PBS job ID(s) to monitor.

  • stop_on_error (bool) – If True, raise BatchJobError when a job exits with non-zero status.

  • wait_for_all (bool) – If True, block until every job finishes and return the last job ID. If False (default), return as soon as the first job finishes.

  • polling_delay (int | float) – Seconds to wait between status checks. Defaults to 5.0 (overridable via OI_TOOLS_PBS_POLL_DELAY).

Returns:

The job ID of the last job to finish (if wait_for_all=True) or the first job to finish (if wait_for_all=False).

Return type:

str

Raises:
  • ValueError – If job_ids is empty.

  • BatchJobError – If a job exits with non-zero status and stop_on_error is True.
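
The return and error semantics above can be sketched as a simple polling loop. This is an illustration under stated assumptions, not the library's code: poll_until_done and the exit_status callable are hypothetical, and the BatchJobError class here is a local stand-in for the library's exception of the same name.

```python
from __future__ import annotations

import time
from typing import Callable, Collection


class BatchJobError(RuntimeError):
    """Local stand-in for the library's exception of the same name."""


def poll_until_done(
    job_ids: str | Collection[str],
    exit_status: Callable[[str], int | None],
    *,
    stop_on_error: bool = True,
    wait_for_all: bool = False,
    polling_delay: float = 5.0,
) -> str:
    """Poll exit_status(job_id) until one (or every) job has finished."""
    remaining = [job_ids] if isinstance(job_ids, str) else list(job_ids)
    if not remaining:
        raise ValueError("job_ids must not be empty")
    while True:
        for job_id in list(remaining):
            status = exit_status(job_id)  # None means "still running"
            if status is None:
                continue
            if status != 0 and stop_on_error:
                raise BatchJobError(f"{job_id} exited with status {status}")
            remaining.remove(job_id)
            if not wait_for_all or not remaining:
                return job_id
        time.sleep(polling_delay)


# Fake status source: "a" finishes on its 2nd poll, "b" on its 1st.
polls: dict[str, int] = {}
finish_after = {"a.cluster": 2, "b.cluster": 1}


def fake_status(job_id: str) -> int | None:
    polls[job_id] = polls.get(job_id, 0) + 1
    return 0 if polls[job_id] >= finish_after[job_id] else None


ids = ["a.cluster", "b.cluster"]
first = poll_until_done(ids, fake_status, polling_delay=0)
polls.clear()
last = poll_until_done(ids, fake_status, wait_for_all=True, polling_delay=0)
```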

Configuration

Default values for several job submission parameters can be overridden by setting environment variables before running your script or the oi CLI. All variables use the OI_TOOLS_ prefix:

Environment variable               Default   Description
OI_TOOLS_PBS_DEFAULT_MEM           "8G"      Default memory request for submit_job().
OI_TOOLS_PBS_DEFAULT_CPUS          4         Default CPU count for submit_job().
OI_TOOLS_PBS_LOG_FOLDER            "logs"    Default log directory for submit_job().
OI_TOOLS_PBS_POLL_DELAY            5.0       Seconds between polling when waiting for a job to finish.
OI_TOOLS_PBS_MAX_CONCURRENT_JOBS   7         Maximum simultaneous jobs for submit_many_jobs().

Example: set project-wide defaults in a .env file or shell profile:

export OI_TOOLS_PBS_DEFAULT_MEM=16G
export OI_TOOLS_PBS_DEFAULT_CPUS=8
export OI_TOOLS_PBS_LOG_FOLDER=/scratch/logs
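
On the Python side, resolving these variables against their documented defaults might look like the sketch below. The helper load_pbs_defaults is hypothetical and is not an oi_tools function; how the library actually parses the values (for example, whether it validates them) is not specified here.

```python
from __future__ import annotations

import os
from typing import Mapping


def load_pbs_defaults(env: Mapping[str, str] = os.environ) -> dict[str, object]:
    """Read the OI_TOOLS_PBS_* variables, using the documented defaults."""
    return {
        "mem": env.get("OI_TOOLS_PBS_DEFAULT_MEM", "8G"),
        "cpus": int(env.get("OI_TOOLS_PBS_DEFAULT_CPUS", "4")),
        "log_folder": env.get("OI_TOOLS_PBS_LOG_FOLDER", "logs"),
        "polling_delay": float(env.get("OI_TOOLS_PBS_POLL_DELAY", "5.0")),
        "max_concurrent_jobs": int(env.get("OI_TOOLS_PBS_MAX_CONCURRENT_JOBS", "7")),
    }


# Pass an explicit mapping to see the effect without touching the real environment.
overridden = load_pbs_defaults(
    {"OI_TOOLS_PBS_DEFAULT_MEM": "16G", "OI_TOOLS_PBS_DEFAULT_CPUS": "8"}
)
```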