Defining Jobs using FireTasks

This tutorial shows you how to:

  • Run multiple tasks within a single FireWork
  • Run tasks that are defined within a Python function, rather than a shell script

This tutorial can be completed from the command line, but some knowledge of Python is helpful. For simplicity, we will run the examples on the central server. You could just as easily run them on a FireWorker if you’ve set one up.

Introduction to FireTasks

In the quickstart, we ran a simple script that performed echo "howdy, your job launched successfully!" >> howdy.txt. Looking inside fw_test.yaml, recall that the command was defined within a task labeled Script Task:

spec:
  _tasks:
  - _fw_name: Script Task
    script: echo "howdy, your job launched successfully!" >> howdy.txt

The Script Task is one type of FireTask, which is a predefined job template written in Python. The Script Task in particular refers to Python code inside FireWorks that runs an arbitrary shell script that you give it. You can use the Script Task to run almost any job (without worrying that it’s all done within a Python layer). However, you might want to set up custom job templates that are more explicit and reusable. In this section, we’ll demonstrate how to accomplish this with FireTasks, but first we’ll demonstrate the simplest way to run multiple tasks in sequence.
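To make the idea of "Python code that runs a shell script" concrete, here is a minimal sketch of what such a wrapper could look like. It is not the FireWorks implementation: the class name ShellScriptTask and the returned dictionary are hypothetical, and the sketch assumes Python 3.7+ and a POSIX shell.

```python
import subprocess


class ShellScriptTask:
    """Hypothetical stand-in for a script-running task (not the FireWorks class)."""

    def __init__(self, script):
        # The shell command to run, e.g. the 'script' value from the spec
        self.script = script

    def run_task(self, spec):
        # Hand the command to the shell and capture what it prints
        completed = subprocess.run(
            self.script, shell=True, capture_output=True, text=True
        )
        return {"returncode": completed.returncode, "stdout": completed.stdout}


result = ShellScriptTask("echo howdy").run_task({})
print(result["stdout"].strip())
```

The point of the sketch is only that the shell command is data handed to a small Python object, which is why one task type can cover almost any job.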

Running multiple FireTasks

You can run multiple tasks within the same FireWork (it might be helpful to review the Workflow Model diagram). For example, the first step of your FireWork might write an input file that the second step processes. Let’s create a FireWork where the first step writes howdy.txt, and the second step counts the number of words in that file.

  1. Navigate to the tasks tutorial directory on your FireServer:

    cd <INSTALL_DIR>/fw_tutorials/firetask
  2. Look inside the file fw_multi.yaml. You should see two instances of Script Task inside our spec. Remember, our spec contains all the information needed to run our job. The second Script Task runs the wc -w command to count the number of words in howdy.txt and writes the result to words.txt:

    spec:
      _tasks:
      - _fw_name: Script Task
        script: echo "howdy, your job launched successfully!" > howdy.txt
      - _fw_name: Script Task
        script: wc -w < howdy.txt > words.txt
  3. Run this multi-step FireWork on your FireServer:

    lpad reset <TODAY'S DATE>
    lpad add fw_multi.yaml
    rlaunch singleshot

Tip

You can run all three of these commands on a single line by separating them with a semicolon. This will reset the database, insert a FW, and run it within a single command.

You should see two files written out to the system, howdy.txt and words.txt, confirming that you successfully ran a two-step job!

Note

The only way to communicate information between FireTasks within the same FireWork is by writing and reading files, as in our example. If you want to perform more complicated information transfer, you might consider defining a workflow that connects FireWorks instead. You can pass information easily between different FireWorks in a Workflow through the FWAction object, but not between FireTasks within a FireWork (see the Workflow Model).
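The file-based handoff described above can be sketched in plain Python, without FireWorks. The function names write_greeting and count_words are hypothetical; they mirror the two Script Task steps in fw_multi.yaml, with a temporary directory standing in for the launch directory.

```python
import os
import tempfile


def write_greeting(workdir):
    # Step 1: write the file that the next step will read
    with open(os.path.join(workdir, "howdy.txt"), "w") as handle:
        handle.write("howdy, your job launched successfully!\n")


def count_words(workdir):
    # Step 2: read the file from step 1 and record its word count
    with open(os.path.join(workdir, "howdy.txt")) as handle:
        n_words = len(handle.read().split())
    with open(os.path.join(workdir, "words.txt"), "w") as handle:
        handle.write("{}\n".format(n_words))
    return n_words


with tempfile.TemporaryDirectory() as workdir:
    write_greeting(workdir)
    word_count = count_words(workdir)

print(word_count)
```

Neither function receives a value from the other; the only shared state is the file on disk, which is the same constraint that applies to FireTasks within one FireWork.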

Python example (optional)

Here is a complete Python example that runs multiple FireTasks within a single FireWork:

from fireworks.core.firework import FireWork
from fireworks.core.fworker import FWorker
from fireworks.core.launchpad import LaunchPad
from fireworks.core.rocket_launcher import rapidfire
from fireworks.user_objects.firetasks.script_task import ScriptTask

# set up the LaunchPad and reset it
launchpad = LaunchPad()
launchpad.reset('', require_password=False)

# create the FireWork consisting of multiple tasks
firetask1 = ScriptTask.from_str('echo "This is TASK #1"')
firetask2 = ScriptTask.from_str('echo "This is TASK #2"')
firetask3 = ScriptTask.from_str('echo "This is TASK #3"')
fw = FireWork([firetask1, firetask2, firetask3])

# store workflow and launch it locally, rapid-fire
launchpad.add_wf(fw)
rapidfire(launchpad, FWorker())

Creating a custom FireTask

Because the Script Task can run arbitrary shell scripts, it can in theory run any type of job and is an ‘all-encompassing’ FireTask. Script Task also has many additional features that will be covered in a future tutorial.

However, if you are comfortable with some basic Python, it is better to define your own custom FireTasks (job templates) for the codes you run. A custom FireTask can clarify the usage of your code and guard against unintended behavior by restricting the commands that can be executed.

Even if you plan to only use Script Task, we suggest that you still read through the next portion before continuing with the tutorial. We’ll be creating a custom FireTask that adds one or more numbers using Python’s sum() function, and later building workflows using this (and similar) FireTasks.

How FireWorks bootstraps a job

Before diving into an example of a custom FireTask, it is worth understanding how FireWorks bootstraps jobs based on your specification. The basic process looks like this:

FireWorks Bootstrap
  1. The first step of the image just shows how the spec section of the FireWork is structured. There is a section that contains your FireTasks (one or many), as we saw in the previous examples. The spec also allows you to define arbitrary JSON data (labeled input in the diagram) to pass into your FireTasks as input. So far, we haven’t seen an example of this; the only information we gave in the spec in the previous examples was within the _tasks section.
  2. In the second step, FireWorks dynamically loads Python objects based on your specified _tasks. It does this by searching a list of Python packages for Python objects whose _fw_name value matches your setting. When we set a _fw_name of Script Task in the previous examples, FireWorks loaded a Python object with a _fw_name class variable set to Script Task (and passed the script parameter to its constructor). The Script Task is just one type of FireTask that’s built into FireWorks to help you run scripts easily. You can write code for custom FireTasks anywhere in the user_packages directory of FireWorks, and it will automatically be discovered. If you want to place your FireTasks in a package outside of FireWorks, please read the FireWorks configuration tutorial; you will just need to define which Python packages to search for your custom FireTasks.
  3. In the third step, we execute the code of the FireTask we loaded. Specifically, we execute the run_task method which must be implemented for every FireTask. FireWorks passes in the entire spec to the run_task method; the run_task method can therefore modify its behavior based on any input data present in the spec, or by detecting previous or future tasks in the spec.
  4. When the FireTask is done executing, it returns a FWAction object that can modify the workflow (or continue as usual) and pass information to downstream FireWorks.
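The four steps above can be sketched as a small name-based lookup. This is an illustration only: FireWorks searches Python packages for matching objects, whereas this sketch swaps in an explicit dictionary, and the names TASK_REGISTRY, register, EchoTask, and bootstrap are all hypothetical.

```python
# Hypothetical registry mapping a _fw_name string to a task class
TASK_REGISTRY = {}


def register(cls):
    # Index the class under the name it advertises in _fw_name
    TASK_REGISTRY[cls._fw_name] = cls
    return cls


@register
class EchoTask:
    _fw_name = "Echo Task"

    def __init__(self, parameters=None):
        self.parameters = parameters or {}

    def run_task(self, fw_spec):
        # The task sees the entire spec, not just its own entry
        return {"stored_data": {"message": fw_spec.get("message", "")}}


def bootstrap(spec):
    results = []
    for task_entry in spec["_tasks"]:
        # Everything except _fw_name is passed to the constructor
        params = {k: v for k, v in task_entry.items() if k != "_fw_name"}
        task = TASK_REGISTRY[task_entry["_fw_name"]](params)
        results.append(task.run_task(spec))
    return results


spec = {"_tasks": [{"_fw_name": "Echo Task"}], "message": "hello"}
results = bootstrap(spec)
print(results)
```

The design point to notice is that the spec stays plain data: the string in _fw_name is the only link between the spec and the Python code that runs.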

Custom FireTask example: Addition Task

Let’s explore custom FireTasks by writing custom Python code for adding two numbers specified in the spec.

  1. Staying in the firetask tutorial directory, remove any output from the previous step:

    rm howdy.txt FW.json words.txt
  2. Let’s first look at what a custom FireTask looks like in Python. Look inside the file addition_task.py which defines the Addition Task:

    class AdditionTask(FireTaskBase, FWSerializable):
    
        _fw_name = "Addition Task"
    
        def run_task(self, fw_spec):
            input_array = fw_spec['input_array']
            m_sum = sum(input_array)
    
            print("The sum of {} is: {}".format(input_array, m_sum))
    
            return FWAction(stored_data={'sum': m_sum})
    
  3. A few notes about what’s going on (things will be clearer after the next step):

    • In the class definition, we are extending FireTaskBase to tell FireWorks that this is a FireTask.
    • A special parameter named _fw_name is set to Addition Task. This parameter sets what this FireTask will be called by the outside world and is used to bootstrap the object, as described in the previous section.
    • The run_task() method is a special method name that gets called when the task is run. It can take in a FireWork specification (spec) in order to modify its behavior.
    • When executing run_task(), the AdditionTask we defined first reads the input_array parameter of the FireWork’s spec. It then sums all the values in that array using Python’s sum() function. Next, the FireTask prints the inputs and the sum to standard output. Finally, the task returns a FWAction object.
    • We’ll discuss the FWAction object in greater detail in future tutorials. For now, it is sufficient to know that this is an instruction that says we should store the sum we computed in the database (inside the FireWork’s stored_data section).
  4. Now let’s define a FireWork that runs this FireTask to add the numbers 1 and 2. Look inside the file fw_adder.yaml for this new FireWork definition:

    spec:
      _tasks:
      - _fw_name: Addition Task
        parameters: {}
      input_array:
      - 1
      - 2
  5. Let’s match up this FireWork with the code for our custom FireTask:

    • The _fw_name parameter is set to the same value as in our code for the FireTask (Addition Task). This is how FireWorks knows to run your custom FireTask rather than Script Task or some other FireTask.
    • This spec has an input_array field set to the values 1 and 2. Remember that our Python code grabs the values in input_array, sums them, and prints the result to standard output.
  6. When you are comfortable that you roughly understand how a custom FireTask is set up, try running the FireWork on the central server to confirm that the Addition Task works:

    lpad reset <TODAY'S DATE>
    lpad add fw_adder.yaml
    rlaunch --silencer singleshot

    Note

    The --silencer option suppresses log messages.

  7. Confirm that the sum is not only printed to the screen, but also stored in our FireWork in the stored_data section:

    lpad get_fws -i 1 -d all
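To check your understanding of what the Addition Task computes for this spec, its logic can be traced in plain Python without FireWorks. This is a sketch under assumptions: run_addition is a hypothetical stand-in for AdditionTask.run_task, the dictionary is assumed to be how fw_adder.yaml looks once loaded, and a plain dict replaces the FWAction object. It does not reproduce the output format of lpad get_fws.

```python
def run_addition(fw_spec):
    # Same logic as AdditionTask.run_task, minus the FireWorks classes
    input_array = fw_spec["input_array"]
    m_sum = sum(input_array)
    print("The sum of {} is: {}".format(input_array, m_sum))
    # A plain dict stands in for FWAction(stored_data={'sum': m_sum})
    return {"stored_data": {"sum": m_sum}}


# Assumed in-memory form of the spec in fw_adder.yaml
fw_spec = {
    "_tasks": [{"_fw_name": "Addition Task", "parameters": {}}],
    "input_array": [1, 2],
}

action = run_addition(fw_spec)
```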

Python example (optional)

Here is a complete Python example that runs a custom FireTask:

from fireworks.core.firework import FireWork
from fireworks.core.fworker import FWorker
from fireworks.core.launchpad import LaunchPad
from fireworks.core.rocket_launcher import launch_rocket
from fw_tutorials.firetask.addition_task import AdditionTask

# set up the LaunchPad and reset it
launchpad = LaunchPad()
launchpad.reset('', require_password=False)

# create the FireWork consisting of a custom "Addition" task
firework = FireWork(AdditionTask(), spec={"input_array": [1, 2]})

# store workflow and launch it locally
launchpad.add_wf(firework)
launch_rocket(launchpad, FWorker())

Next up: Workflows!

With custom FireTasks, you can go beyond the limitations of running shell commands and execute arbitrary Python code templates. Furthermore, these templates can operate on data from the spec of the FireWork. For example, the Addition Task used the input_array from the spec to decide what numbers to add. By using the same FireWork with different values in the spec (try it!), one could execute a data-parallel application.
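As a rough illustration of the data-parallel idea, the sketch below applies one task function to several specs at once. It is an analogy, not FireWorks code: in FireWorks each spec would belong to its own FireWork, while here a thread pool from the standard library stands in for multiple launches, and add_inputs is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def add_inputs(fw_spec):
    # Same template every time; only the data in the spec changes
    return sum(fw_spec["input_array"])


specs = [
    {"input_array": [1, 2]},
    {"input_array": [10, 20, 30]},
    {"input_array": [0.5, 0.25]},
]

# pool.map returns results in the same order as the input specs
with ThreadPoolExecutor(max_workers=3) as pool:
    sums = list(pool.map(add_inputs, specs))

print(sums)
```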

While one could construct an entire workflow by chaining together multiple FireTasks within a single FireWork, this is often not ideal. For example, we might want to switch between different FireWorkers for different parts of the workflow depending on the computing requirements for each step. Or, we might have a restriction on walltime that necessitates breaking up the workflow into more atomic steps. Finally, we might want to employ complex branching logic or error correction that would be cumbersome to implement within a single FireWork. The next step in the tutorial is to explore connecting FireWorks together into a workflow.