This guide covers in more detail how to write your own FireTasks and assemble them into FireWorks and Workflows. It also covers the FWAction object, passing data, and dynamic workflow actions.
If you’d like to see a “Hello World” Example of a custom FireTask, you can go here.
If you are able to run that example and want more details of how to modify and extend it, read on...
The first thing you should decide is whether to use an existing FireTask or write your own. FireWorks comes pre-packaged with many “default” FireTasks - in particular, the PyTask allows you to call any Python function. There are also existing FireTasks for running scripts, remotely transferring files, etc. The quickest route to getting something running is to use an existing FireTask, e.g., use the PyTask if you want to run a quick script.
Links to documentation on default FireTasks can be found in the main page under the heading “built-in FireTasks”.
A few reasons to not use the default FireTasks are:
The easiest way to understand a FireTask is to look at one; here’s one implementation of a task that archives files:
import shutil

from fireworks.core.firework import FireTaskBase

class ArchiveDirTask(FireTaskBase):
    """
    Wrapper around shutil.make_archive to make tar archives.

    Args:
        base_name (str): Name of the file to create.
        format (str): Optional. One of "zip", "tar", "bztar" or "gztar".
    """
    _fw_name = 'ArchiveDirTask'
    required_params = ["base_name"]
    optional_params = ["format"]

    def run_task(self, fw_spec):
        # Parameters passed to the task are read like dict keys on self
        shutil.make_archive(self["base_name"], format=self.get("format", "gztar"), root_dir=".")
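The call that run_task wraps can be tried on its own, outside FireWorks. The sketch below is illustrative only: the temporary directories, the file name data.txt, and the archive name my_archive are made up for the example. It calls shutil.make_archive with the same default format ("gztar") that ArchiveDirTask falls back to.

```python
import os
import shutil
import tempfile

# Hypothetical scratch directories; ArchiveDirTask itself archives "." instead.
src_dir = tempfile.mkdtemp()
out_dir = tempfile.mkdtemp()
with open(os.path.join(src_dir, "data.txt"), "w") as f:
    f.write("example contents\n")

# base_name is the archive path without its extension; make_archive appends it.
base_name = os.path.join(out_dir, "my_archive")
archive_path = shutil.make_archive(base_name, format="gztar", root_dir=src_dir)

print(os.path.basename(archive_path))  # prints my_archive.tar.gz
```

Writing the archive outside the directory being archived, as done here, avoids the archive trying to include itself.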
You can copy the ArchiveDirTask code to a new location and make the following modifications to write your own FireTask:
Keep the run_task method header intact, but change the definition to your custom operation. Remember that you can access the dict keys of “fw_spec” as well as the dict keys of “self”.
When FireWorks bootstraps your FireTask from a database definition, it needs to know where to look for FireTasks.
First, you need to make sure your FireTask is defined in a file location that can be found by Python, i.e. is within Python’s search path and that you can import your FireTask in a Python shell. If Python cannot import your code (e.g., from the shell), neither can FireWorks. This step usually means either installing the code into your site-packages directory (where many Python tools install code) or modifying your PYTHONPATH environment variable to include the location of the FireTask. You can see the locations where Python looks for code by typing import sys followed by print(sys.path). If you are unfamiliar with this topic, some more details about this process can be found here, or try Googling “how does Python find modules?”
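A quick way to check this is to ask Python directly. The sketch below uses only the standard library; is_importable is a helper written for this guide, and my_firetasks is a hypothetical module name standing in for wherever your FireTask lives.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if Python can locate the named top-level module on its search path."""
    return importlib.util.find_spec(module_name) is not None


# The directories Python searches, in order.
for path in sys.path:
    print(path)

# "my_firetasks" is a hypothetical module name; replace it with your own.
print(is_importable("my_firetasks"))
# A standard-library module should always be found.
print(is_importable("json"))
```

If the check for your module reports False, FireWorks will not be able to load the FireTask either, and the search path needs to be fixed first.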
Second, you must register your FireTask so that it can be found by the FireWorks software. There are a couple of options for registering your FireTask (you only need to do one of the below):
You are now ready to use your FireTask!
In the previous example, the run_task method did not return anything, nor did it pass data to downstream FireTasks or FireWorks. Remember that setting the _pass_job_info key in the Firework spec to True will automatically pass information about the current job to the child job - see reference for more details.
However, one can also return a FWAction object that performs many powerful actions including dynamic workflows.
Here’s an example of a FireTask implementation that includes dynamic actions via the FWAction object:
from fireworks.core.firework import FWAction, Firework, FireTaskBase

class FibonacciAdderTask(FireTaskBase):
    _fw_name = "Fibonacci Adder Task"

    def run_task(self, fw_spec):
        smaller = fw_spec['smaller']
        larger = fw_spec['larger']
        stop_point = fw_spec['stop_point']

        m_sum = smaller + larger
        if m_sum < stop_point:
            print('The next Fibonacci number is: {}'.format(m_sum))
            # create a new Fibonacci Adder to add to the workflow
            new_fw = Firework(FibonacciAdderTask(), {'smaller': larger, 'larger': m_sum, 'stop_point': stop_point})
            return FWAction(stored_data={'next_fibnum': m_sum}, additions=new_fw)
        else:
            print('We have now exceeded our limit; (the next Fibonacci number would have been: {})'.format(m_sum))
            return FWAction()
We discussed running this example in the Dynamic Workflow tutorial - if you have not gone through that tutorial, we strongly suggest you do so now (it also includes an example of message passing).
Note that this example is slightly different than the previous one:
Other than those differences, the code follows the same format as earlier. The dynamism comes only from the FWAction object; next, we will examine this object in more detail.
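Before moving on, the control flow of FibonacciAdderTask can be traced without a LaunchPad. The sketch below is plain Python, not FireWorks code: the function trace_fibonacci_chain is hypothetical and simply replays the decision that run_task makes, treating each loop iteration as one Firework and each new spec dict as the Firework that additions would have created.

```python
def trace_fibonacci_chain(spec):
    """Replay FibonacciAdderTask's logic; return the sums that would be stored."""
    stored = []
    while True:
        m_sum = spec['smaller'] + spec['larger']
        if m_sum >= spec['stop_point']:
            # Corresponds to returning an empty FWAction(): the chain ends here.
            break
        stored.append(m_sum)
        # Corresponds to additions=new_fw: the next Firework gets this spec.
        spec = {'smaller': spec['larger'], 'larger': m_sum, 'stop_point': spec['stop_point']}
    return stored


# Starting values chosen for illustration only.
print(trace_fibonacci_chain({'smaller': 0, 'larger': 1, 'stop_point': 100}))
# prints [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

In the real workflow, each of these steps is a separate Firework, and each sum is recorded as stored_data under the next_fibnum key.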
A FireTask (or a function called by PyTask) can return a FWAction object that can perform many powerful actions. Note that the FWAction is stored in the FW database after execution, so you can always go back and see what actions were returned by different FireTasks. A diagram of the different FWActions is below:
The parameters of FWAction are as follows:
The FWAction thereby allows you to command the workflow programmatically, allowing for the design of intelligent workflows that react dynamically to results.
It is generally not good practice to use the LaunchPad within the FireTask because this makes the task specification less explicit. For example, this could make duplicate checking more problematic. However, if you really need to access the LaunchPad within a FireTask, you can set the _add_launchpad_and_fw_id key of the Firework spec to be True. Then, your tasks will be able to access two new variables, launchpad (a LaunchPad object) and fw_id (an int), as members of your FireTask. One example is shown in the unit test test_add_lp_and_fw_id().
Other than explicitly defining a _fw_name parameter, there are two alternate ways to identify the FireTask:
You can omit the _fw_name parameter altogether, and the code will then use the Class name as the identifier. However, note that this is dangerous as changing your Class name later on can break your code. In addition, if you have two FireTasks with the same name the code will throw an error.
(or) You can omit the _fw_name and add an @explicit_serialize decorator to your Class. This will identify your class by the module name AND class name. This prevents namespace collisions, AND it allows you to skip registering your FireTask! However, the serialization is even more sensitive to refactoring: moving your Class to a different module will break the code, as will renaming it. Here’s an example of how to use the decorator:
from fireworks.core.firework import FireTaskBase
from fireworks.utilities.fw_utilities import explicit_serialize

@explicit_serialize
class PrintFW(FireTaskBase):
    def run_task(self, fw_spec):
        print(str(fw_spec['print']))
In both cases of removing _fw_name, there is still a workaround if you refactor your code. The FW config has a parameter called FW_NAME_UPDATES that allows one to map old names to new ones via a dictionary of {<old name>:<new name>}. This method also works if you need to change your _fw_name for any reason.
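As an illustration of the idea, the sketch below shows what such a mapping might look like and how a lookup through it behaves. This is plain Python written for this guide, not FireWorks's own loading code: the function resolve_name and the task names in the dictionary are hypothetical.

```python
# Hypothetical mapping in the {<old name>: <new name>} form described above.
FW_NAME_UPDATES = {
    "Old Print Task": "Print Task",
}


def resolve_name(stored_name, updates):
    """Return the current name for a stored task name, or the name itself if unmapped."""
    return updates.get(stored_name, stored_name)


print(resolve_name("Old Print Task", FW_NAME_UPDATES))  # prints Print Task
print(resolve_name("Fibonacci Adder Task", FW_NAME_UPDATES))  # prints Fibonacci Adder Task
```

Names that do not appear in the mapping pass through unchanged, so only renamed tasks need an entry.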