Useful procedures and macros

The (gwl utils) module provides a number of useful helpers that are intended to simplify common tasks when defining processes. The helpers defined by this module are all available by default.

Scheme Procedure: on collection higher proc

The on procedure is an alternative way to express the application of a higher-order function to some collection. Its only purpose is to improve legibility when using Wisp syntax, as it lets you avoid leading dots. The following two expressions are equivalent:

;; With "on"
on numbers map
   lambda (number)
     + number 10

;; Without "on"
map
  lambda (number)
    + number 10
  . numbers
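
For readers more familiar with parenthesized Scheme, the same equivalence can be sketched in plain Guile. The definition of on below is a stand-in written for this illustration, assuming the procedure does nothing more than reorder its arguments; it is not taken from the GWL sources. The list numbers is likewise made up for the example.

```scheme
;; Stand-in for "on" (an assumption): apply HIGHER to PROC and COLLECTION.
(define (on collection higher proc)
  (higher proc collection))

;; Example data, invented for this sketch.
(define numbers '(1 2 3))

;; With "on": the collection comes first.
(define with-on
  (on numbers map (lambda (number) (+ number 10))))

;; Without "on": the collection comes last.
(define without-on
  (map (lambda (number) (+ number 10)) numbers))

;; Both variables now hold the list (11 12 13).
```

In Wisp, putting the collection first means that no line has to start with a dot to mark a bare variable.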

Scheme Macro: file file-name-part

This macro enables you to construct a normalized file name out of any number of file name parts given as arguments. A file name part can either be a string literal or a variable or expression that evaluates to a string.

Directories are separated with a literal slash. This allows you to construct file names where parts of a directory or file name are computed from other values.

define user
  . "rekado"

define my-list
  iota 32

define num
  number->string
    + 10
      length my-list

file / "home" / user / "file_" num ".txt"

=> "/home/rekado/file_42.txt"

Scheme Macro: files file-name-part

Much like the file macro, the files macro enables you to construct multiple normalized file names out of any number of file name parts given as arguments. A file name part can be a string literal, a variable or expression that evaluates to a string, or a variable or expression that evaluates to a list of strings.

Whenever a part is a list of strings, one file name is generated for each element of that list. With several lists, the result contains one file name for every combination of their elements. This is very useful when you need to generate a list of input or output file names.

Directories are separated with a literal slash. This allows you to construct file names where parts of a directory or file name are computed from other values.

define users
  list "rekado" "zimoun"

define projects
  list "foo" "bar"

define extensions
  list "txt" "tar.gz" "scm"

files / "home" / users / "proj_" projects / "file." extensions

=> '("/home/rekado/proj_foo/file.txt"
     "/home/rekado/proj_foo/file.tar.gz"
     "/home/rekado/proj_foo/file.scm"
     "/home/rekado/proj_bar/file.txt"
     "/home/rekado/proj_bar/file.tar.gz"
     "/home/rekado/proj_bar/file.scm"
     "/home/zimoun/proj_foo/file.txt"
     "/home/zimoun/proj_foo/file.tar.gz"
     "/home/zimoun/proj_foo/file.scm"
     "/home/zimoun/proj_bar/file.txt"
     "/home/zimoun/proj_bar/file.tar.gz"
     "/home/zimoun/proj_bar/file.scm")
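
The combinatorial expansion can be sketched in plain Guile as follows. This is only an illustration of the idea, not the implementation of the files macro: the procedure name expand-parts is hypothetical, slashes are written as ordinary strings, and the ordering of the results is an assumption modelled on the example output above.

```scheme
(use-modules (srfi srfi-1))   ; fold-right, append-map

;; Hypothetical helper: PARTS is a list whose elements are either
;; strings or lists of strings.  Returns every combination, with the
;; rightmost list varying fastest.
(define (expand-parts parts)
  (fold-right
   (lambda (part suffixes)
     (append-map (lambda (prefix)
                   (map (lambda (suffix) (string-append prefix suffix))
                        suffixes))
                 ;; Treat a plain string as a one-element list.
                 (if (list? part) part (list part))))
   '("")
   parts))

(define expanded
  (expand-parts '("/home/" ("rekado" "zimoun") "/file." ("txt" "scm"))))

;; expanded is now:
;; ("/home/rekado/file.txt" "/home/rekado/file.scm"
;;  "/home/zimoun/file.txt" "/home/zimoun/file.scm")
```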

Scheme Procedure: pick [n] key collection

This procedure allows you to pick a named item from a collection by looking for the specified keyword key. Optionally, you can provide a selector procedure or index n as the first argument. Without a selector, the first item following the given key is returned. When the selector is *, all items following the key (up to the next tag) are returned. If the selector is a number, it is used as a zero-based index into the list of items following the key. If the selector is a procedure, it is applied to the list of items following the key.

define collection
  list
    . "one"
    . "two"
    . "three"
    . mine: "four"
    . "five"
    . yours: "six"

pick mine: collection

; => "four"

pick * mine: collection

; => '("four" "five")

pick second mine: collection

; => "five"

pick 0 yours: collection

; => "six"

Scheme Syntax: load-workflow file

This macro lets you load a workflow from the given file. The file must evaluate to a workflow value. This macro is useful when you want to extend previously defined workflows. The argument file is expected to be a file name relative to the file invoking load-workflow.

Scheme Procedure: display-file file [max-lines]

This procedure lets you display a file, or the first max-lines lines of a file. This can be used to display a banner when the workflow starts, or to display a text report upon completion.

Scheme Procedure: get collection [#:default default] path

This procedure allows you to select an item from a (potentially nested) collection by traversing the specified path, a sequence of strings or symbols that are keys in the collection. This becomes much clearer with an example:

(define config
  '(("locations"
     . (("input"  . "/home/rekado/foo")
        ("output" . "/dev/null")))
    ("resources"
     . (("R"
         . (("memory" . "2GB")
            ("cores"  . 2)))
        ("samtools"
         . (("memory" . "128kB")
            ("cores"  . 1)))))))

(get config "locations" "output")

; => "/dev/null"

(get config "resources" "R" "cores")

; => 2

The variable config here is a so-called association list that associates string keys with values. Some of these values are again association lists. get simply traverses the provided path of keys and “enters” each specified collection in turn.
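
The traversal itself can be sketched in plain Guile as a fold over the path. The procedure name get-sketch is hypothetical, and the sketch is deliberately simplified: it handles only association lists, ignores the optional default value, and signals a generic error when a key cannot be followed. It is not the implementation used by GWL.

```scheme
(use-modules (srfi srfi-1))   ; fold

;; Hypothetical helper: follow PATH, one key at a time, through
;; nested association lists starting at COLLECTION.
(define (get-sketch collection . path)
  (fold (lambda (key current)
          (let ((entry (and (list? current) (assoc key current))))
            (if entry
                (cdr entry)                ; "enter" the value for KEY
                (error "cannot follow path at key:" key))))
        collection
        path))

;; A reduced version of the configuration shown above.
(define sketch-config
  '(("locations"
     . (("input"  . "/home/rekado/foo")
        ("output" . "/dev/null")))
    ("resources"
     . (("R"
         . (("memory" . "2GB")
            ("cores"  . 2)))))))

(define output (get-sketch sketch-config "locations" "output"))
(define cores  (get-sketch sketch-config "resources" "R" "cores"))

;; output is "/dev/null" and cores is 2.
```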

Association lists are very common in Scheme, and they are also used as an intermediate representation for many parsed files. Here is an example of using get on a parsed JSON file (this depends on the guile-json package):

;; Declare packages
require-packages
  . "guile-json"

;; Load it
import
  json

define config
  json-string->scm "\
{
  \"locations\": {
    \"input\": \"/home/rekado/foo\",
    \"output\": \"/dev/null\"
  },
  \"resources\": {
    \"R\": {
      \"memory\": \"2GB\",
      \"cores\": 2
    },
    \"samtools\": {
      \"memory\": \"128kB\",
      \"cores\": 1
    }
  }
}
"

get config "locations" "output"

; => "/dev/null"

get config "resources" "R" "cores"

; => 2

If the provided path cannot be followed, get raises an error condition. This happens when one or more of the keys do not exist, or when the value found for an intermediate key is not itself a collection. If you want to look up an optional value that may or may not exist in a collection, you can provide a default value to get. That value is then returned instead of raising an error.

;; Declare packages
require-packages
  . "guile-json"

;; Load it
import
  json

define config
  json-string->scm "\
{
  \"locations\": {
    \"input\": \"/home/rekado/foo\",
    \"output\": \"/dev/null\"
  },
  \"resources\": {
    \"R\": {
      \"memory\": \"2GB\",
      \"cores\": 2
    },
    \"samtools\": {
      \"memory\": \"128kB\",
      \"cores\": 1
    }
  }
}
"

get config default: "/tmp" "locations" "temp-directory"

; => "/tmp"