

4.1 process Fields

Both make-process and process accept the same fields, which we describe below.

name

The readable name of the process as a string. This is used for display purposes and to select processes by name. When the process constructor is used, the name field need not be provided explicitly.

version

This field holds an arbitrary version string. This can be used to disambiguate between different implementations of a process when searching by name.

synopsis

A short summary of what this process intends to accomplish.

description

A longer description about the purpose of this process.

packages

This field is used to specify what software packages need to be available when executing the process. Packages can either be Guix package specifications — such as the string "guile@3.0" for Guile version 3.0 — or package variable names. When using package variable names, you need to make sure to import the appropriate Guix module at the top of your workflow file, e.g. (import (gnu packages guile)) for the variable guile.
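As a rough sketch of the two styles described above, the first definition below names the package with a specification string and the second uses the package variable guile. The second form assumes that the import line appears at the top of the workflow file; the remaining fields are elided.

With a package specification string:

process
  packages "guile@3.0"
  …

With a package variable, after importing the module that defines it:

import (gnu packages guile)

process
  packages guile
  …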

The packages field accepts a list of packages as well as multiple values (an “implicit list”). All of the following specifications are valid. A single package:

process
  packages "guile"
  …

More than one package:

process
  packages "guile" "python"
  …

A single list of packages:

process
  packages
    list "guile" "python"
  …

inputs

This field holds inputs to the process. Commonly, this will be a list of file names that the process requires to be present. The GWL can automatically connect processes by matching up their declared inputs and outputs, so that processes generating certain outputs are executed before those that declare the same item as an input.

As with the packages field, the inputs field accepts an “implicit list” of multiple values as well as an explicit list. Additionally, individual inputs can be “tagged” or named by prefixing them with a keyword (see Keywords in GNU Guile Reference Manual). Here’s an example of an implicit list of inputs spread across multiple lines where two inputs have been tagged:

process
  inputs
    . genome: "hg19.fa"
    . "cookie-recipes.txt"
    . samples: "foo.fq"
  …

The leading period is Wisp syntax to continue the previous line. You can, of course, do without the periods, but this may look a little more cluttered:

process
  inputs genome: "hg19.fa" "cookie-recipes.txt" samples: "foo.fq"
  …

Why tag inputs at all? Because you can reference them in other parts of your process definition without having to awkwardly traverse the whole list of inputs. Here is one way to select the first input that was tagged with the genome: keyword:

pick genome: inputs

To select the second item after the tag genome: do this:

pick second genome: inputs

or using a numerical zero-based index:

pick 1 genome: inputs

See Code Snippets for a convenient way to access named items in code snippets without having to define your picks beforehand.
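
To see how a pick fits into a complete definition, here is a minimal sketch in the style of the other examples in this section. The process name and the file names are made up for illustration, and the sketch assumes that inputs can be unquoted inside the procedure field in the same way that outputs is unquoted in the haiku example under the procedure field.

process show-picks
  inputs
    . genome: "hg19.fa"
    . samples: "foo.fq" "bar.fq"
  outputs "picks.txt"
  procedure
    ` with-output-to-file ,outputs
        lambda ()
          display ,(pick genome: inputs)
          newline
          display ,(pick second samples: inputs)

Here pick genome: inputs selects the first item tagged with genome:, and pick second samples: inputs selects the second item after the samples: tag.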

outputs

This field holds a list of outputs that are expected to appear after executing the process. Usually this will be a list of file names. Just like the inputs field, this field accepts a plain list, an implicit list of one or more values, and lists with named items.

The GWL can automatically connect processes by matching up their declared inputs and outputs, so that processes generating certain outputs are executed before those that declare the same item as an input.
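
As an illustration of this matching, the two hypothetical process definitions below share the file name "greeting.txt". Because the first declares it as an output and the second declares it as an input, the first would be executed before the second when both belong to a workflow whose processes are connected automatically. The process names and file names are assumptions made for this sketch, and the remaining fields are elided.

process write-greeting
  outputs "greeting.txt"
  …

process shout-greeting
  inputs "greeting.txt"
  outputs "shout.txt"
  …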

output-path

This is a directory prefix for all outputs.
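
A small sketch of how the prefix might be used, assuming that it is prepended to each declared output. The directory name "results" is made up for illustration.

process
  outputs "haiku.txt"
  output-path "results"
  …

Under that assumption, the process would be expected to produce its output at results/haiku.txt.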

run-time

This field is used to specify run-time resource estimates, such as the memory requirement of the process or the maximum time it should run. This is especially useful when submitting jobs to an HPC cluster scheduler such as Grid Engine, as these schedulers may give higher priority to jobs that declare a short run time.

procedure

This field holds an expression of code that should be run when the process is executed. This is the “work” that a process should perform. By default that’s a quoted Scheme expression, but code snippets in other languages are also supported (see Code Snippets).

Here’s an example of a process with a procedure that writes a haiku to a file:

process haiku
  outputs "haiku.txt"
  synopsis "Write a haiku to a file"
  description
    . "This process writes a haiku by Gary Hotham \
to the file \"haiku.txt\"."
  procedure
    ` with-output-to-file ,outputs
        lambda ()
          display "\
the library book
overdue?
slow falling snow"

The Scheme expression here is quasiquoted (with a leading `) to allow for unquoting (with ,) of variables, such as outputs.

Scheme will not always be the best choice for a process procedure. Sometimes all you want to do is fire off a few shell commands. While this is, of course, possible to express in Scheme, it is admittedly somewhat verbose. For convenience we offer a simple and surprisingly short syntax for this common use case. As a bonus you can even leave off the field name “procedure” and write your code snippet right there. How? See Code Snippets.

