“Packing” small jobs into larger ones with the multi job launcher

With today’s multiprocessor and multi-node machines, it’s possible to get a lot of computing done quickly by exploiting parallelism. If you have many independent workflows to run, FireWorks makes this process simple and automatic with the multi-job launcher. For example, you might want to run 4 serial workflows simultaneously over 4 cores, or 4 parallel workflows that each use 16 cores over a total of 64 cores.

Parallelizing serial jobs on a single multicore machine

If you have a single machine (e.g. workstation or laptop) with multiple cores, it’s easy to use all your cores to execute your workflows in parallel. Simply add your workflows to the LaunchPad, and then type:

mlaunch <NP>

where <NP> is the number of processing cores. For example, mlaunch 4 would execute 4 workflows in parallel so that each core handles one workflow.

Note

The mlaunch command has several useful options. Type mlaunch -h to see them listed. In particular, the --nlaunches option configures how many jobs are run consecutively in serial per core.

Parallelizing serial jobs over several (interconnected) multicore machines

If you have several interconnected machines, you will need to install MPI to run jobs in parallel. Fortunately, the command to run serial jobs, one per processor, is simple after MPI installation:

<MPIEXEC> -n <NP> rlaunch rapidfire

where <MPIEXEC> is your MPI executable and <NP> is the total number of processors over all machines. Examples might be mpirun -n 128 rlaunch rapidfire, or mpiexec -n 8 rlaunch rapidfire, depending on your flavor of MPI.

If you are familiar with MPI and FireWorks, you will recognize that this mode of operation is nothing special; we are just running the rlaunch rapidfire command on every core using MPI. The rlaunch rapidfire command doesn’t do anything different when run through MPI (it is not itself parallelized). It is the same rlaunch rapidfire from the introductory tutorials, and you can give it any of the same options as normal.

One note about this method is that unlike the special mlaunch command, no attempt is made to minimize database connections or improve database performance by sharing data between processes. So, there may be a fundamental limit to how much you can scale, depending on the performance and settings of your MongoDB server.

Parallelizing parallel jobs over several (interconnected) multicore machines

Your workflow itself might involve executing a parallel code. This means that somewhere in your FireTask, an MPI executable like mpirun, mpiexec, or aprun is being called. In this case, the basic command to type is:

mlaunch <NP/PPJOB>

where <NP/PPJOB> is the total number of processors divided by the number of processors per job. For example, if you have 64 total processors available and each of your workflows involves an mpiexec -n 16 command, you would type mlaunch 4 to set in motion 4 jobs that each will use 16 cores.
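The arithmetic behind <NP/PPJOB> can be sketched in a few lines of Python. This is only an illustration: the helper name mlaunch_job_count is hypothetical and not part of FireWorks, and it assumes that any processors left over after the division simply stay idle.

```python
def mlaunch_job_count(total_procs: int, procs_per_job: int) -> int:
    """Return the <NP/PPJOB> value: how many jobs can run side by side.

    Hypothetical helper for illustration; not part of FireWorks.
    """
    if total_procs <= 0 or procs_per_job <= 0:
        raise ValueError("processor counts must be positive")
    if procs_per_job > total_procs:
        raise ValueError("one job needs more processors than are available")
    # Integer division: any leftover processors are assumed to stay idle.
    return total_procs // procs_per_job


# 64 processors in total, each workflow calls `mpiexec -n 16`
print(f"mlaunch {mlaunch_job_count(64, 16)}")  # prints "mlaunch 4"
```

If the total is not an exact multiple of the per-job count (say 70 processors and 16 per job), this sketch still returns 4 and leaves 6 processors unused.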

Note

The mlaunch command has several useful options. Type mlaunch -h to see them listed. In particular, the --nlaunches option configures how many jobs are run consecutively in serial per parallel process.

Access to nodefile

If you need to access the NODEFILE from within your FireTask, you should modify the command to be:

mlaunch <NP/PPJOB> --ppn <PPN> --nodefile <NODEFILE>

Here, NODEFILE is the location of your NODEFILE (or alternatively the name of an environment variable that points to your NODEFILE), and PPN is the number of processors per node. Then, inside your FireTask you will be able to access the parameters FWData().NODE_LIST and FWData().SUB_NPROCS to design your parallel run.
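As a rough sketch of how these two values might be used, the function below assembles an MPI command line from a node list and a processor count. Several things here are assumptions rather than FireWorks behavior: the helper build_mpi_command is hypothetical, NODE_LIST is assumed to be a list of hostnames and SUB_NPROCS an integer, and the --host flag follows Open MPI conventions (other MPI flavors use different options). Plain variables stand in for FWData().NODE_LIST and FWData().SUB_NPROCS so that the example runs without FireWorks installed.

```python
from typing import List, Optional, Sequence


def build_mpi_command(executable: str,
                      nprocs: int,
                      node_list: Optional[Sequence[str]] = None,
                      launcher: str = "mpirun") -> List[str]:
    """Assemble an MPI command as an argument list (hypothetical helper)."""
    cmd = [launcher, "-n", str(nprocs)]
    if node_list:
        # Assumption: Open MPI-style host selection; adjust for your MPI flavor.
        cmd += ["--host", ",".join(node_list)]
    cmd.append(executable)
    return cmd


# Stand-ins for FWData().NODE_LIST and FWData().SUB_NPROCS inside a FireTask.
node_list = ["node01", "node02"]
sub_nprocs = 16

print(" ".join(build_mpi_command("./my_code", sub_nprocs, node_list)))
# prints "mpirun -n 16 --host node01,node02 ./my_code"
```

Returning a list rather than a single string makes the result easy to hand to subprocess.run without shell quoting problems.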

Using multi job launching with a queue

It is easy to configure your queue script so that each queued job runs multiple FireWorks in parallel. In your my_qadapter.yaml file, modify the rocket_launch key to be one of the mlaunch commands described above (remember to include the number of jobs, e.g. mlaunch 4, as well as any config file paths). Then, when the queued job “wakes up” and starts running, it will execute multiple jobs using mlaunch instead of a single job using the basic rlaunch.
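One way to avoid typos in the rocket_launch value is to generate it. The sketch below composes an mlaunch command string using only options mentioned in this tutorial (-w, --nlaunches, --ppn, --nodefile). The helper mlaunch_command and the file path are hypothetical; run mlaunch -h to confirm the options available in your installed version.

```python
import shlex
from typing import Optional


def mlaunch_command(num_jobs: int,
                    fworker_file: Optional[str] = None,
                    nlaunches: Optional[int] = None,
                    ppn: Optional[int] = None,
                    nodefile: Optional[str] = None) -> str:
    """Compose an mlaunch command line (hypothetical helper, not FireWorks API)."""
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    args = ["mlaunch", str(num_jobs)]
    if fworker_file:
        args += ["-w", fworker_file]
    if nlaunches is not None:
        args += ["--nlaunches", str(nlaunches)]
    if ppn is not None:
        args += ["--ppn", str(ppn)]
    if nodefile:
        args += ["--nodefile", nodefile]
    # shlex.join quotes any argument that contains spaces or shell metacharacters.
    return shlex.join(args)


# The path below is a placeholder; substitute your own FireWorker file.
print("rocket_launch: " + mlaunch_command(4, fworker_file="/path/to/my_fworker.yaml"))
# prints "rocket_launch: mlaunch 4 -w /path/to/my_fworker.yaml"
```

The printed line can then be pasted into my_qadapter.yaml in place of the existing rocket_launch entry.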

A note on “packing” and heterogeneous jobs

The multi job launcher does not actually “pack” jobs the way a queue scheduler does. Rather, it just runs a fixed number of workflows in parallel. In particular, the multi-job launcher is designed to simultaneously run workflows with homogeneous computing requirements. If your workflows are not homogeneous (e.g., some workflows require more processors than others), we suggest you set up your FireWorker for mlaunch so that it only pulls jobs with a fixed computing requirement. The FireWorker can be set using the -w or -c option of the mlaunch command, and the configuration for only pulling certain jobs is described in the control tutorial.