PBS Queue Management

Functions for submitting and monitoring jobs in a PBS queue.

oi_tools.pbs.list_jobs(
users: list[str] = [],
only_me: bool = True,
) → None

List active and queued PBS jobs in a formatted table.

Parameters:
  • users (list[str]) – List of users to filter by when only_me is False. Default is [].

  • only_me (bool) – If True, show only jobs belonging to the current user. Default is True.

Return type:

None
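
The table is assembled from the cluster's `qstat` output. A minimal sketch of the per-user filtering involved (the sample `qstat` column layout and the `filter_jobs` helper below are assumptions for illustration, not the actual implementation):

```python
import getpass

# Sample `qstat` output (two header lines, then one row per job);
# the exact column layout varies by PBS installation -- this format is an assumption.
QSTAT_OUTPUT = """\
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
12345.cluster     jobname-2001     alice             00:01:02 R batch
12346.cluster     jobname-2002     bob               00:00:00 Q batch
"""

def filter_jobs(qstat_text, users=None, only_me=True):
    """Return (job_id, name, user) tuples for the selected users."""
    me = getpass.getuser()
    jobs = []
    for row in qstat_text.splitlines()[2:]:  # skip the two header lines
        job_id, name, user = row.split()[:3]
        if only_me and user != me:
            continue
        if not only_me and users and user not in users:
            continue
        jobs.append((job_id, name, user))
    return jobs

print(filter_jobs(QSTAT_OUTPUT, users=["bob"], only_me=False))
```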

oi_tools.pbs.submit_job(
file: str | Path,
args: Sequence[ArgumentType] = [],
*,
mem: str | int = '8G',
cpus: int = 4,
wait: bool = False,
log_folder: Path | str | None = PosixPath('logs'),
verbose: bool | None = True,
filetype: FileType | None = None,
base_job_name: str | None = None,
python_executable: Path | None = None,
cwd: Path | str | None = None,
env_vars: dict[str, str] | None = None,
) → str

Submit a single script or module as a PBS job.

Logs are written to <log_folder>/<base_job_name>/<args>/, with one file per submission named by today’s date and an incrementing counter (e.g. logs/myscript/2001/2026-01-15-1.log).
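
The naming scheme can be sketched as follows (`next_log_path` is a hypothetical helper shown only to illustrate the documented layout; the actual counter logic may differ):

```python
import datetime
from pathlib import Path

def next_log_path(log_folder: Path, base_job_name: str, args: list[str]) -> Path:
    """Pick the next free log file: <log_folder>/<base_job_name>/<args>/<date>-<n>.log."""
    folder = log_folder / base_job_name / "-".join(args)
    today = datetime.date.today().isoformat()  # e.g. "2026-01-15"
    n = 1
    while (path := folder / f"{today}-{n}.log").exists():
        n += 1  # increment the counter until an unused name is found
    return path

print(next_log_path(Path("logs"), "myscript", ["2001"]))
# e.g. logs/myscript/2001/2026-01-15-1.log when no earlier log exists today
```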

Parameters:
  • file (str | Path) – Path to a Python script (e.g. "code/myscript.py"), a Stata do-file (e.g. "code/myscript.do"), an R script (e.g. "code/myscript.r"), or a fully-qualified Python module name (e.g. "myproject.submodule"). The filetype is inferred from the file suffix; use filetype to override.

  • args (Sequence[ArgumentType]) – Command-line arguments to pass to the script or module. Defaults to no arguments.

  • mem (str | int) – Memory to request. Can be an integer (treated as gigabytes) or a string such as "16G". Defaults to "8G".

  • cpus (int) – Number of CPU cores to request. Defaults to 4.

  • wait (bool) – Whether to wait for the job to finish before returning (True) or return immediately after submission (False). Defaults to False.

  • log_folder (Path | str | None) – Directory in which to create log files. Set to None to discard output. Defaults to "logs".

  • verbose (bool | None) – Print job details and the generated PBS script before submitting. Defaults to True.

  • filetype (FileType | None) – Explicitly set the script type: "python_script", "python_module", "stata_script", or "r_script". If omitted, inferred from the file suffix (.py → "python_script", .do → "stata_script", .r → "r_script"). Falls back to "python_module" if the suffix is unrecognized.

  • base_job_name (str | None) – Base name used for the PBS job and the log subdirectory. Defaults to the script path with non-alphanumeric characters replaced by hyphens (e.g. "code/myscript.py" → "code-myscript").

  • python_executable (Path | None) – Path to the Python interpreter to use. Defaults to the first python3 or python found on PATH. Only used for Python jobs.

  • cwd (Path | str | None) – Working directory for the job. Defaults to the current directory at submission time.

  • env_vars (dict[str, str] | None) – Optional dictionary of environment variables to export in the job script (e.g. {"MY_VAR": "value"}). Defaults to None.
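
The default base_job_name can be reproduced with a substitution roughly like this (a sketch of the documented behaviour, assuming the file suffix is dropped before sanitizing; the actual code may differ):

```python
import re
from pathlib import Path

def default_job_name(file: str) -> str:
    """Replace non-alphanumeric runs in the suffix-less path with hyphens."""
    stem = str(Path(file).with_suffix(""))  # "code/myscript.py" -> "code/myscript"
    return re.sub(r"[^A-Za-z0-9]+", "-", stem)

print(default_job_name("code/myscript.py"))  # code-myscript
```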

Returns:

The PBS job ID returned by qsub (e.g. "12345.cluster").

Return type:

str

Examples

Submit a Python script with a year argument:

>>> from pathlib import Path
>>> job_id = submit_job(  
...     "code/myscript.py",
...     ["2001"],
...     mem="16G",
...     cpus=8,
...     log_folder="logs",
...     python_executable=Path(".venv/bin/python3"),
...     base_job_name="jobname",
... )

Submit a Stata do-file:

>>> submit_job("code/myscript.do", ["2001"])  

Submit an R script:

>>> submit_job("code/myscript.r", ["2001"])  

Submit a Python module:

>>> submit_job("myproject.submodule", filetype="python_module")  

Pass environment variables to the job:

>>> submit_job(  
...     "code/myscript.py",
...     ["2001"],
...     env_vars={"MY_TOKEN": "abc123", "DATA_DIR": "/scratch/myproject"},
... )

oi_tools.pbs.submit_many_jobs(
file: str | Path,
args: Iterable[Sequence[ArgumentType]],
*,
max_concurrent_jobs: int = 7,
stop_on_error: bool = True,
**kwargs,
) → None

Submit multiple jobs to the PBS queue and block until all finish.

Jobs are submitted in batches to keep at most max_concurrent_jobs running or queued at any time. As each job completes, the next one is submitted automatically. If any job exits with a non-zero status, a BatchJobError is raised (unless stop_on_error=False).
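
The batching behaviour can be illustrated with a simplified scheduling loop (`run_batched`, `submit`, and `is_finished` below are stand-ins for the real qsub/qstat machinery, not the library's API):

```python
from collections import deque

def run_batched(all_args, submit, is_finished, max_concurrent_jobs=7):
    """Keep at most max_concurrent_jobs jobs active; submit the next as one finishes."""
    pending = deque(all_args)
    active = []
    while pending or active:
        # Top up to the concurrency limit.
        while pending and len(active) < max_concurrent_jobs:
            active.append(submit(pending.popleft()))
        # Drop finished jobs; the real implementation polls the queue here.
        active = [job for job in active if not is_finished(job)]

submitted = []
run_batched(
    [[str(year)] for year in range(2000, 2005)],
    submit=lambda args: submitted.append(args) or f"job-{args[0]}",
    is_finished=lambda job: True,  # every stub job "finishes" immediately
    max_concurrent_jobs=3,
)
print(submitted)  # all five argument lists, in submission order
```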

Parameters:
  • file (str | Path) – Path to a script, or a fully-qualified Python module name (e.g. "myproject.submodule"). Passed directly to submit_job().

  • args (Iterable[Sequence[ArgumentType]]) – An iterable of argument lists, one per job. Each element is passed as the args parameter of submit_job().

  • max_concurrent_jobs (int) – Maximum number of jobs to have running or queued at any given time. Defaults to 7.

  • stop_on_error (bool) – If True (default), raise BatchJobError as soon as any job fails. If False, continue submitting and waiting for the remaining jobs.

  • **kwargs – Additional keyword arguments forwarded to every submit_job() call (e.g. filetype, mem, cpus, log_folder, env_vars, etc.).

Return type:

None

Examples

Submit one job per year for 2000–2019, keeping at most 3 running at once, and block until all jobs finish:

>>> many_args = [[str(year)] for year in range(2000, 2020)]
>>> submit_many_jobs(  
...     "code/myscript.py",
...     many_args,
...     max_concurrent_jobs=3,
...     mem="16G",
... )

Allow jobs to fail silently:

>>> submit_many_jobs(  
...     "code/myscript.py",
...     many_args,
...     stop_on_error=False,
... )