FireWorks

Note: FireWorks is under active development. It is currently incomplete and not yet usable as workflow software. However, certain components of the code are available for initial testing.

FireWorks is a code for defining, managing, and executing scientific workflows. It can be used to automate most types of calculations over arbitrary computing resources, including those that have a queueing system.

Features

FireWorks is intended to be a friendly workflow software that is easy to get started with, but flexible enough to handle complicated use cases.

Some (but not all) of its features include:

  • Storage and management of workflows through MongoDB, a noSQL datastore that is flexible and easy to use.
  • Ability to distribute calculations over multiple worker nodes, each of which might use a different queueing system and process a different type of calculation.
  • Support for dynamic workflows that react to results programmatically. You can pre-specify what actions to take depending on the output of a job (e.g., terminate a workflow, add a new step, or completely alter the workflow).
  • Automatic detection of duplicate sub-workflows - skip duplicated portions between two workflows while still running unique sections.
  • Loosely-coupled and modular infrastructure that is intentionally hackable. Use some FireWorks components without using everything, and more easily adapt FireWorks to your application.
  • Plug-and-play on several large supercomputing clusters and queueing systems (future)
  • Monitoring of workflows through a web service (future)
  • Package many small jobs into a single large job (useful for running on HPC machines that prefer a small number of large-CPU jobs). (future)
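
As a rough illustration of the dynamic-workflow idea in the list above, the sketch below uses plain Python and does not touch the FireWorks API. The names Action, run_workflow, and the example jobs are all hypothetical, chosen only for this example. Each job may return an instruction that terminates the workflow or adds a new step to run next.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Action:
    """Hypothetical instruction a job returns to steer the remaining workflow."""
    stop: bool = False                              # terminate the workflow
    additions: list = field(default_factory=list)   # new steps to run next


# A job takes the shared state dict and optionally returns an Action.
Job = Callable[[dict], Optional[Action]]


def run_workflow(jobs, state=None):
    """Run jobs in order, applying any Action a job returns.

    Returns the names of the jobs that actually ran, in execution order.
    """
    state = {} if state is None else state
    pending = deque(jobs)
    executed = []
    while pending:
        job = pending.popleft()
        action = job(state)
        executed.append(job.__name__)
        if action is None:
            continue
        if action.stop:
            break
        # Added steps run immediately after the job that requested them.
        pending.extendleft(reversed(action.additions))
    return executed


def relax(state):
    state["converged"] = False   # pretend the calculation did not converge
    return None


def refine(state):
    state["converged"] = True
    return None


def check(state):
    # React to the earlier result: add a refinement step only when needed.
    if not state["converged"]:
        return Action(additions=[refine])
    return None


def abort(state):
    return Action(stop=True)


def report(state):
    return None


print(run_workflow([relax, check, report]))
print(run_workflow([abort, report]))
```

In the first call, check inspects the result of relax and inserts refine ahead of report. In the second call, abort ends the workflow before report runs.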

Limitations

Some limitations of FireWorks include:

  • FireWorks has not been stress-tested with very large numbers of worker nodes. (If you try this, we’d love to hear the results!)
  • FireWorks has not been stress-tested with hundreds or thousands of jobs in a workflow. (If you try this, we’d love to hear the results!)
  • FireWorks does not automatically optimize the distribution of computing tasks over worker nodes (e.g., to minimize data movement or to match jobs to appropriate hardware); you must define such optimizations yourself.
  • FireWorks has only been tested on Linux and Macintosh machines. (If you are trying to get set up on Windows, please let us know if you encounter problems.)

Is FireWorks for me?

It can be time-consuming to evaluate whether a workflow software will meet your computing needs from documentation alone. If you just want to know whether FireWorks is a potential solution to your workflow problem, one option is to e-mail a description of your problem to the developer at: developer contact

We can tell you if:

  • Your problem is a great match for FireWorks
  • Your problem requires implementing minor extensions or modifications to FireWorks, but FireWorks is still a potential solution
  • Your problem is not easily solved with FireWorks and you should probably look elsewhere!

Getting Started!

To get started with FireWorks, we suggest that you follow our core tutorials. These tutorials will set up a central server as well as worker computers. They will also demonstrate how to define and run basic workflows. We expect that completing all of the core tutorials will take between one and three hours. (You might want to get a snack...)

More!

Depending on your application, you might also be interested in the following tutorials:

Planned future tutorials:

  • Maintaining the FW database and dealing with crashed jobs
  • Detailed tutorial on implementing dynamic jobs
  • Securing the FW database
  • Detailed tutorial on Script Task
  • File movement Task Operations
  • Database Task Operations
  • Assigning specific FireWorkers to run certain jobs
  • Assigning and modifying job priority
  • Automatically prevent duplicate jobs from running twice
  • Using a web interface to monitor FireWorks
  • Checkpoint / restart of jobs
  • Using the QueueLauncher outside of FW
  • Searching for FireWorks and Workflows
  • Logging
  • FW design guide, e.g. FireTasks vs Workflows
  • JSON vs. YAML and serialization of FW objects (including WF serialization as TAR instead of JSON/YAML)

Contributing and Contact

Want to see something added or changed? There are many ways to make that a reality! Some ways to get involved are:

  • Help us improve the documentation - tell us where you got ‘stuck’ and improve the install process for everyone.
  • Let us know if you need support for a queueing system or certain features.
  • Point us to areas of the code that are difficult to understand or use.
  • Share code on how FireWorks was used to solve a specific problem.
  • Get in touch and contribute to the core codebase!

The collaborative way to submit questions, issues, and all other communication is through the FireWorks GitHub page. You can also contact: developer contact

Thank yous

Michael Kocher and Dan Gunter initiated the architecture of a central database with multiple workers that queued ‘placeholder’ scripts responsible for checking out jobs. Some of Michael’s code was refashioned for the QueueLauncher and the PBS QueueAdapter.

Shyue Ping Ong was extremely helpful in providing guidance and feedback, as well as help with the nitty-gritty of getting set up with Sphinx documentation, PyPI, etc. The code for modifying a FireWork specification using a dictionary (DictMod) was adapted from his custodian library. If you are in the market for a free Python materials analysis code, I highly recommend his pymatgen library (which I also sometimes contribute to).

Wei Chen was the first test pilot of FireWorks, and contributed greatly to improving the docs and ensuring that FireWorks installation went smoothly for others. In addition, he made many suggestions to improve the usability of the code.

FireWorks was developed primarily at Lawrence Berkeley National Lab using research funding from Kristin Persson for the Materials Project.

License

FireWorks is developed under the MIT License (a very permissive license), reproduced below:

The MIT License (MIT)
Copyright (c) 2011-2012 LBNL, Anubhav Jain

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
