Inspecting Pipelines

Overview

One of Martian’s principal design objectives is to make it easy to develop and debug pipelines. Key to that objective is making it easy to inspect the state of in-progress, completed, and failed pipeline runs. A particular instantiation of a pipeline is referred to as a “pipestance.”

All Martian pipestance state is persisted in the filesystem in an intuitive, human-decipherable layout. There is no opaque database, client, or other mechanism impeding visibility into pipestance state and metadata.

All that is required to inspect a Martian pipestance are cd, ls, and cat. A web UI is available as well.

Pipestance Layout

When mrp starts a pipestance, it converts the MRO definition into a graph representation of the pipeline. It then creates a directory tree on disk matching that graph. Here is an example:

piperun1/
    DUPLICATE_FINDER/
        fork0/
        SORT_ITEMS/
            fork0/
                split/
                chnk0/
                join/
        FIND_DUPLICATES/
            fork0/
                split/
                chnk0/
                join/        

The basic directory hierarchy is pipeline → stage / subpipeline → fork → chunk.

For more information on forks, see Advanced Features: Parameter Sweeping.
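Because the layout is an ordinary directory tree, it can be walked with standard-library tools. The following is a minimal sketch, assuming chunk-level directories are named split, join, or chnk followed by a number, as in the example tree above. The helper name find_chunk_dirs is hypothetical and not part of Martian.

```python
import os
import re

# Assumption: chunk-level directories are named "split", "join", or "chnk<N>",
# and they live directly under a "fork<N>" directory, as in the example tree.
CHUNK_NAME = re.compile(r"^(split|join|chnk\d+)$")
FORK_NAME = re.compile(r"^fork\d+$")


def find_chunk_dirs(pipestance_dir):
    """Return sorted chunk directory paths, relative to pipestance_dir."""
    found = []
    for root, dirs, _files in os.walk(pipestance_dir):
        if not FORK_NAME.match(os.path.basename(root)):
            continue
        for name in dirs:
            if CHUNK_NAME.match(name):
                full = os.path.join(root, name)
                found.append(os.path.relpath(full, pipestance_dir))
    return sorted(found)
```

Calling find_chunk_dirs("piperun1") on the example above would list the split, chnk0, and join directories of both SORT_ITEMS and FIND_DUPLICATES.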

Chunk Metadata

For details on splits, joins, and chunks, see Advanced Features: Parallelization. For simplicity of the discussion here, we will refer to all three, generically, as chunks. A chunk is the most basic unit of execution in Martian: it is what mrp actually schedules to run.

A chunk directory contains a number of metadata files, plus a subdirectory named files/, which contains the output files generated by that chunk’s code. For ease-of-use and human-decipherability, all metadata files are either pretty-printed JSON for structured information, or plain-text line-oriented logs. A chunk directory may contain the following:

File | Format | Description
_args | JSON | Input arguments to the chunk, passed from upstream stages.
_outs | JSON | Output values generated by the chunk upon successful completion, passed to downstream stages.
_jobinfo | JSON | Details on how this chunk job was run, including performance profiling from rusage.
_log | TXT | Timestamped log generated by the chunk’s code via the Martian adapter log API.
_stdout | TXT | Capture of the chunk code’s STDOUT.
_stderr | TXT | Capture of the chunk code’s STDERR.
_errors / _assert | TXT | Created if the chunk fails, and contains the error captured by the Martian adapter, e.g. a stack trace.
_complete | TXT | Created when the chunk completes successfully, and contains a timestamp.
_progress | TXT | An optional string indicating the stage’s current level of progress. When it is updated, this string is bubbled up to mrp’s log.
_chunk_defs / _chunk_outs | JSON | See Advanced Features: Parallelization.

There are several other metadata files which may be present, either as sentinel values for the pipeline runtime, or optional debugging information from stage code, for example from performance profiling tools.
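Since these files are plain text or JSON, a chunk can be inspected with a few lines of code. The sketch below classifies a chunk from the sentinel files described above and loads one of its JSON metadata files. The function names and the three status labels are hypothetical and chosen for illustration. The structure of the JSON content is not assumed.

```python
import json
import os


def chunk_status(chunk_dir):
    """Classify a chunk directory from the sentinel files it contains."""
    def has(name):
        return os.path.exists(os.path.join(chunk_dir, name))

    # _errors or _assert is created if the chunk fails.
    if has("_errors") or has("_assert"):
        return "failed"
    # _complete is created when the chunk completes successfully.
    if has("_complete"):
        return "complete"
    return "incomplete"


def read_json_metadata(chunk_dir, name):
    """Load a JSON metadata file such as _args, _outs, or _jobinfo.

    Returns None if the file does not exist.
    """
    path = os.path.join(chunk_dir, name)
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return json.load(handle)
```

For example, chunk_status("piperun1/DUPLICATE_FINDER/SORT_ITEMS/fork0/chnk0") reports whether that chunk has finished, and read_json_metadata on the same directory with "_args" returns the arguments it was given.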

Pipestance Metadata

mrp also generates metadata at the top level of the pipestance directory. These may include:

File | Format | Description
_invocation | MRO | A preserved copy of the invocation MRO passed to mrp. If a pipestance restart is attempted and the provided invocation MRO does not match the preserved copy, mrp will refuse to proceed.
_mrosource | MRO | A preserved copy of the complete MRO source, after recursive preprocessing, that mrp parsed and ran.
_jobmode | TXT | The job mode mrp ran under, e.g. local, sge. See Advanced Features: Job Management for details.
_log | TXT | Timestamped log generated by mrp. Includes command-line options, environment variables, and job manager settings under which mrp ran. Log entries are the same as what mrp prints to STDOUT.
_timestamp | TXT | Line-oriented file containing timestamps for pipestance start and successful completion.
_uuid | TXT | An RFC 4122 Version 4 randomly generated UUID. Useful when aggregating Martian pipestances from multiple sources where pipestance ID / directory names are not guaranteed to be globally unique.
_versions | JSON | Records the version number of mrp as well as the git commit hash associated with the MRO code.
_tags | JSON | User-defined key-value pairs passed to the --tags option of mrp. These are passed through for the user’s convenience and are not processed by Martian.
_perf | JSON | Pipeline, stage, and chunk performance data generated by mrp. Useful for analysis and optimization.
_finalstate | JSON | Serialized final state of the pipestance’s nodes, written by mrp when the run ends.
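As an illustration of the aggregation use case for _uuid, the sketch below reads the file, checks that it holds a version 4 UUID, and indexes a set of pipestance directories by that value. It assumes the file contains only the UUID string, possibly followed by a newline. Both function names are hypothetical.

```python
import os
import uuid


def read_pipestance_uuid(pipestance_dir):
    """Read the _uuid file and return it as a uuid.UUID (must be version 4)."""
    with open(os.path.join(pipestance_dir, "_uuid")) as handle:
        value = uuid.UUID(handle.read().strip())
    if value.version != 4:
        raise ValueError("expected a version 4 UUID, got version %r" % value.version)
    return value


def index_by_uuid(pipestance_dirs):
    """Map each pipestance UUID to its directory.

    Directory names may collide across sources, so the UUID is used as the key.
    """
    return {read_pipestance_uuid(path): path for path in pipestance_dirs}
```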

The top level of the pipestance directory may also contain two subdirectories journal and tmp.

The journal directory is used as a notification mailbox by mrp to receive asynchronous status updates from running chunks. Its contents should never be modified.

The tmp directory is used as a TMPDIR target for all chunk code. Normally chunk code should simply write to its current working directory, which is set by mrp to be each chunk’s files/ directory. However, some third-party tools used in chunks may unavoidably write to TMPDIR. Therefore, mrp sets each chunk’s TMPDIR to the pipestance’s tmp directory to ensure that the pipeline only writes files within the confines of the pipestance directory. Without this measure, pipelines may sometimes write to /tmp, potentially and unexpectedly filling up the local filesystems of remote execution hosts.

Logging

While running, mrp outputs logging information simultaneously to standard output and the _log file in the top level of the pipestance directory.

The start of the _log file contains information on how mrp was started, including the command-line options used, relevant environment variables, and job manager settings. The balance of the log largely comprises notifications of stage progress. One can use this logging to monitor the progress of mrp through the pipeline’s graph.

Here is an example of the top of a log:

Martian Runtime - 2.2.0
2017-04-09 14:22:43 [cmdline] mrp invoke.mro piperun1 --localcores=16
2017-04-09 14:22:43 [environ] MROFLAGS=--vdrmode=rolling
2017-04-09 14:22:43 [options] --localcores=16
2017-04-09 14:22:43 [environ] MROPATH=/home/user/git/mypipeline/mro
2017-04-09 14:22:43 [version] MRO Version=2.2.0
2017-04-09 14:22:43 [options] --jobmode=local
2017-04-09 14:22:43 [options] --maxjobs=-1
2017-04-09 14:22:43 [options] --jobinterval=-1
2017-04-09 14:22:43 [options] --vdrmode=rolling
2017-04-09 14:22:43 [options] --profile=disable
2017-04-09 14:22:43 [options] --stackvars=false
2017-04-09 14:22:43 [options] --zip=false
2017-04-09 14:22:43 [options] --noexit=false
2017-04-09 14:22:43 [options] --nopreflight=false
2017-04-09 14:22:43 [jobmngr] Job config = /opt//martian-2.2.0/jobmanagers/config.json
2017-04-09 14:22:43 [jobmngr] Using 16 cores, per --localcores option.
2017-04-09 14:22:43 [jobmngr] Using 94 GB, 100% of system memory.
2017-04-09 14:22:44 [webserv] UI disabled.
Running preflight checks (please wait)...

Because the log is mirrored to the _log file, you may detach from a running mrp or redirect its standard output to /dev/null, and instead monitor progress using tail -f _log. Since pipelines are often long-running, and usually launched on remote workstations via an ssh connection, it is recommended to launch mrp as a background process and disown it so that it is not interrupted if the terminal connection is broken.
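The timestamped lines in the example log follow a regular shape: a date, a time, a bracketed tag, and a message. The sketch below parses that shape and collects the [options] entries into a dictionary. It is based only on the sample shown above, so treat the pattern as an assumption rather than a specification of mrp’s log format. The function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the sample log: "YYYY-MM-DD HH:MM:SS [tag] message"
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\] (.*)$")


def parse_log_line(line):
    """Return (timestamp, tag, message), or None for lines without a timestamp."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return stamp, match.group(2), match.group(3)


def collect_options(lines):
    """Gather "[options] --name=value" entries into a {name: value} dict."""
    options = {}
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is not None and parsed[1] == "options":
            name, _, value = parsed[2].partition("=")
            options[name.lstrip("-")] = value
    return options
```

Lines such as the "Martian Runtime" banner carry no timestamp, so parse_log_line returns None for them and collect_options skips them.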

Final Outputs

When a pipestance completes, the formal output files of the pipeline are moved to the top level of the pipestance directory into a subdirectory named outs/. Their original locations inside their respective chunks’ files/ directories are then replaced with symlinks to the moved files in the outs/ directory.
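To trace which chunk produced each final output, one can look for symlinks that resolve into outs/. The following is a minimal sketch, assuming a POSIX filesystem and the layout described above. The function name links_into_outs is hypothetical.

```python
import os


def links_into_outs(pipestance_dir):
    """Map each symlink in the pipestance to the outs/ entry it resolves to.

    Keys are link paths relative to pipestance_dir.
    Values are target paths relative to the outs/ directory.
    """
    outs = os.path.realpath(os.path.join(pipestance_dir, "outs"))
    result = {}
    # os.walk does not follow symlinked directories by default,
    # so each link is reported once and never descended into.
    for root, dirs, files in os.walk(pipestance_dir):
        for name in dirs + files:
            path = os.path.join(root, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            if os.path.commonpath([outs, target]) == outs:
                link = os.path.relpath(path, pipestance_dir)
                result[link] = os.path.relpath(target, outs)
    return result
```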

Remote Command Line Access

mrstat is a basic tool for querying mrp’s state over the API exposed through the UI port (see below). Given a pipestance directory, it returns mrp’s current state information. It also has a --stop flag, which causes mrp to abort the pipestance, if it is running, and shut down.

Web Interface

In addition to console logging, mrp features an embedded webserver that serves an HTML user interface for monitoring progress and a REST API. By default, the mrp webserver listens on a kernel-assigned non-privileged network port and generates an authentication token to pass in the URL. mrp reports the kernel-selected port number and authentication token as a URL on the console and in the log, for example:

Serving UI at http://utopia-planitia.mars.sol:59410?auth=y1ezpEGscyA3jfion7iR58ED-ufs_drYOOKAeIR6GeQ

The URL is also written to the pipestance metadata file named _uiport. To access the UI, simply direct your browser to the specified URL.
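A script that needs the port or token can split the URL with the standard library. The sketch below assumes the URL has the same shape as the example above, with the token in an auth query parameter. The helper name parse_ui_url is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_ui_url(url):
    """Split a UI URL into (hostname, port, auth_token).

    auth_token is None when the URL has no "auth" query parameter.
    """
    parts = urlsplit(url.strip())
    token = parse_qs(parts.query).get("auth", [None])[0]
    return parts.hostname, parts.port, token
```

The same function can be applied to the contents of the _uiport file, on the assumption that the file holds just this URL.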

Alternatively, you may choose a specific port using the --uiport command-line option. By default, if a specific port is chosen, the authentication token is only required for API calls which modify the pipestance state. These default behaviors can be altered with the mrp flags --disable-ui, --disable-auth, --require-auth, and --auth-key.

Normally, when a pipeline run completes, mrp terminates and the user interface is no longer available. If mrp was started with --noexit then it will stay up until mrp is killed (either with an operating system signal or with mrstat --stop). One can also restart mrp in read-only mode with the --inspect option (which implies --noexit).

Pipeline Details

The right pane of the user interface provides general information about the pipestance, including the command line given to mrp, some environment variables, version information, paths, and the invocation MRO. Access to the pipestance _log file is also available through this interface.

Pipeline Graph

The left pane of the UI displays the graph structure of the pipeline.

Each node of the graph represents a pipeline, subpipeline, or stage. Pipelines and subpipelines are displayed as rectangles, while stages are displayed as rounded rectangles. Edges connecting the nodes indicate input/output dependencies between stages and pipelines. To reduce clutter in the graph view, a stage that consumes the outputs of a subpipeline is drawn as depending on the entire subpipeline, but its inputs are actually bound to the outputs of individual stages. A stage that appears to depend on an unfinished subpipeline can therefore start running anyway, as long as the underlying stages it needs have completed.

Nodes are colored grey while they wait for their upstream dependencies to be satisfied. Nodes turn blue when their dependencies are satisfied and they may run, green when they have successfully completed, and red if they fail.

For more details on a particular node, simply click on it.

Node Details

If you click on a node in the graph, the right pane of the user interface replaces the pipestance details with the execution details associated with the selected node. This includes information about the stage code, relevant paths, and then progressive drill-down detail about the real-time status of parameter sweeping, parallelization, and chunk execution. All metadata files in the pipestance are available through the UI.

To get back to the pipeline details view, click the ← button next to the node name at the top.