Taskblaster: a generic framework for automated computational workflows
Abstract
We introduce Taskblaster, a generic and lightweight Python framework for composing, executing, and managing computational workflows with automated error handling. Taskblaster supports dynamic workflows with flow control through branching and iteration, which makes the system Turing-complete. Taskblaster aims to promote modular designs, in which workflows are composed of reusable sub-workflows, and to simplify data maintenance as projects evolve. We discuss the main design elements, including the workflow syntax, a storage model based on intuitively named tasks in a nested directory tree, and command-line tools to automate and control the execution of tasks. Tasks are executed by worker processes that may run directly in a terminal or be submitted through a queueing system, allowing for task-specific resource control. We provide a library (ASR-lib) of workflows for common materials simulations employing the Atomic Simulation Environment and the GPAW electronic structure code, but Taskblaster can equally well be used with other computational codes.
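To make the storage model concrete, the following is a minimal, generic sketch of the idea of named tasks stored in a nested directory tree and executed in dependency order. It does not use Taskblaster's actual API or file layout: the function `run_tasks`, the dictionary `TASKS`, the task names, and the file name `output.json` are all hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def run_tasks(root, tasks):
    """Run tasks in dependency order (illustrative sketch, not Taskblaster's API).

    `tasks` maps a slash-separated task name to a pair
    (function, list of dependency names). Each result is stored as JSON
    in <root>/<task name>/output.json (a hypothetical layout).
    """
    results = {}

    def run(name, visiting=()):
        if name in results:
            return results[name]
        if name in visiting:
            raise ValueError(f"cyclic dependency at {name!r}")
        outfile = Path(root) / name / "output.json"
        if outfile.exists():
            # A stored result is reused instead of being recomputed.
            results[name] = json.loads(outfile.read_text())
            return results[name]
        func, deps = tasks[name]
        # Dependencies run first; their outputs become this task's inputs.
        inputs = [run(dep, visiting + (name,)) for dep in deps]
        results[name] = func(*inputs)
        outfile.parent.mkdir(parents=True, exist_ok=True)
        outfile.write_text(json.dumps(results[name]))
        return results[name]

    for name in tasks:
        run(name)
    return results


# Hypothetical two-task workflow with dummy data.
TASKS = {
    "prepare/numbers": (lambda: [1, 2, 3], []),
    "analysis/total": (lambda numbers: sum(numbers), ["prepare/numbers"]),
}

with tempfile.TemporaryDirectory() as root:
    results = run_tasks(root, TASKS)
    # Relative paths of the stored outputs, one directory per task.
    stored = sorted(
        p.relative_to(root).as_posix() for p in Path(root).rglob("output.json")
    )
```

In this sketch the task name doubles as the directory path, so the results of a project can be browsed with ordinary file tools, and rerunning the workflow in the same directory skips tasks whose output already exists.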