Inspecting Pipelines
Overview
One of Martian’s principal design objectives is to make it easy to develop and debug pipelines. Key to that objective is making it easy to inspect the state of in-progress, completed, and failed pipeline runs. A particular instantiation of a pipeline is referred to as a “pipestance.”
All Martian pipestance state is persisted in the filesystem in an intuitive, human-decipherable layout. There is no opaque database, client, or other mechanism impeding visibility into pipestance state and metadata.
All that is required to inspect a Martian pipestance are cd, ls, and cat. A web UI is available as well.
Pipestance Layout
When mrp starts a pipestance, it converts the MRO definition into a graph representation of the pipeline. It then creates a directory tree on disk matching that graph. Here is an example:
piperun1/
    DUPLICATE_FINDER/
        fork0/
            SORT_ITEMS/
                fork0/
                    split/
                    chnk0/
                    join/
            FIND_DUPLICATES/
                fork0/
                    split/
                    chnk0/
                    join/
The basic directory hierarchy is pipeline → stage / subpipeline → fork → chunk.
For more information on forks, see Advanced Features: Parameter Sweeping.
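Because the layout is made of ordinary directories, the chunk-level directories can be enumerated with a few lines of scripting. The following is a minimal sketch in Python, assuming that fork directories are named fork followed by a number and that chunk-level directories are named split, join, or chnk followed by a number, as in the example above. The function name find_chunk_dirs is hypothetical and not part of Martian.

```python
import os
import re

# Directory-name patterns taken from the example layout; treat them as
# assumptions, since other layouts may use additional names.
FORK_RE = re.compile(r"^fork\d+$")
CHUNK_RE = re.compile(r"^(split|join|chnk\d+)$")


def find_chunk_dirs(pipestance_dir):
    """Return sorted paths of split/chnkN/join directories found under fork directories."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(pipestance_dir):
        # Only look for chunk-level directories directly inside a fork directory.
        if FORK_RE.match(os.path.basename(dirpath)):
            for name in dirnames:
                if CHUNK_RE.match(name):
                    found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling find_chunk_dirs("piperun1") on the example tree would list the split, chnk0, and join directories of both stages.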
Chunk Metadata
For details on splits, joins, and chunks, see Advanced Features: Parallelization. For simplicity of the discussion here, we will refer to all three, generically, as chunks. A chunk is the most basic unit of execution in Martian that mrp actually schedules to run.
A chunk directory contains a number of metadata files, plus a subdirectory named files/, which contains the output files generated by that chunk’s code.
For ease-of-use and human-decipherability, all metadata files are either
pretty-printed JSON for structured information, or plain-text line-oriented
logs. A chunk directory may contain the following:
File | Format | Description |
---|---|---|
_args | JSON | Input arguments to the chunk, passed from upstream stages. |
_outs | JSON | Output values generated by the chunk upon successful completion, passed to downstream stages. |
_jobinfo | JSON | Details on how this chunk job was run, including performance profiling from rusage. |
_log | TXT | Timestamped log generated by the chunk’s code via the Martian adapter log API. |
_stdout | TXT | Capture of the chunk code’s STDOUT. |
_stderr | TXT | Capture of the chunk code’s STDERR. |
_errors / _assert | TXT | Created if the chunk fails, and contains the error captured by the Martian adapter, e.g. a stack trace. |
_complete | TXT | Created when the chunk completes successfully, and contains a timestamp. |
_progress | TXT | An optional string indicating the stage’s current level of progress. When it is updated, this string is bubbled up to mrp’s log. |
_chunk_defs / _chunk_outs | JSON | See Advanced Features: Parallelization. |
There are several other metadata files which may be present, either as sentinel values for the pipeline runtime, or optional debugging information from stage code, for example from performance profiling tools.
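Since success and failure are signalled by the presence of the files listed in the table, a chunk’s outcome can be read off the filesystem directly. The sketch below is one way to do that in Python; the function names chunk_status and read_failure and the status labels are hypothetical, and the sketch deliberately does not try to tell queued, running, and not-yet-started chunks apart.

```python
import os


def chunk_status(chunk_dir):
    """Classify a chunk directory by which sentinel metadata files exist."""
    def has(name):
        return os.path.exists(os.path.join(chunk_dir, name))

    # _errors or _assert is created if the chunk fails.
    if has("_errors") or has("_assert"):
        return "failed"
    # _complete is created when the chunk completes successfully.
    if has("_complete"):
        return "complete"
    # Neither sentinel exists, so the chunk has not finished.
    return "incomplete"


def read_failure(chunk_dir):
    """Return the text of _errors or _assert, or None if neither file exists."""
    for name in ("_errors", "_assert"):
        path = os.path.join(chunk_dir, name)
        if os.path.exists(path):
            with open(path) as handle:
                return handle.read()
    return None
```

For a failed chunk, read_failure returns whatever the Martian adapter captured, which is usually the quickest starting point for debugging.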
Pipestance Metadata
mrp
also generates metadata at the top level of the pipestance directory. These may include:
File | Format | Description |
---|---|---|
_invocation | MRO | A preserved copy of the invocation MRO passed to mrp. If a pipestance restart is attempted and the provided invocation MRO does not match the preserved copy, mrp will refuse to proceed. |
_mrosource | MRO | A preserved copy of the complete MRO source, after recursive preprocessing, that mrp parsed and ran. |
_jobmode | TXT | The job mode mrp ran under, e.g. local, sge. See Advanced Features: Job Management for details. |
_log | TXT | Timestamped log generated by mrp. Includes command-line options, environment variables, and job manager settings under which mrp ran. Log entries are the same as what mrp prints to STDOUT. |
_timestamp | TXT | Line-oriented file containing timestamps for pipestance start and successful completion. |
_uuid | TXT | An RFC 4122 Version 4 randomly generated UUID. Useful when aggregating Martian pipestances from multiple sources where pipestance ID / directory names are not guaranteed to be globally unique. |
_versions | JSON | Records the version number of mrp as well as the git commit hash associated with the MRO code. |
_tags | JSON | User-defined key-value pairs passed to the --tags option of mrp. These are passed through for the user’s convenience and are not processed by Martian. |
_perf | JSON | Pipeline, stage, and chunk performance data generated by mrp . Useful for analysis and optimization. |
_finalstate | JSON | A snapshot of the final state of the pipestance’s nodes, written by mrp when the pipestance completes. |
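The files in this table can be loaded with nothing more than the standard library. Here is a minimal Python sketch that collects a few of them into a dictionary, using the formats given in the table. The function name read_pipestance_metadata is hypothetical, only a small subset of files is read, and the assumption that the text files hold a single short value applies to _uuid and _jobmode only.

```python
import json
import os

# Formats as listed in the table above; only a small subset is read here.
JSON_FILES = ("_versions", "_tags")
TEXT_FILES = ("_uuid", "_jobmode")


def read_pipestance_metadata(pipestance_dir):
    """Load selected top-level metadata files, skipping any that are absent."""
    meta = {}
    for name in JSON_FILES + TEXT_FILES:
        path = os.path.join(pipestance_dir, name)
        if not os.path.exists(path):
            continue  # every metadata file is optional
        with open(path) as handle:
            if name in JSON_FILES:
                meta[name] = json.load(handle)
            else:
                meta[name] = handle.read().strip()
    return meta
```

Keying pipestances by the value of _uuid rather than by directory name is one way to aggregate runs from multiple sources, as the table suggests.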
The top level of the pipestance directory may also contain two subdirectories, journal and tmp.
The journal directory is used as a notification mailbox by mrp to receive asynchronous status updates from running chunks. Its contents should never be modified.
The tmp directory is used as a TMPDIR target for all chunk code. Normally
chunk code should simply write to its current working directory, which is set by
mrp to be each chunk’s files/ directory. However, some third-party tools
used in chunks may unavoidably write to TMPDIR. Therefore, mrp sets each
chunk’s TMPDIR to the pipestance’s tmp directory to ensure that the
pipeline only writes files within the confines of the pipestance directory.
Without this measure, pipelines may sometimes write to /tmp, potentially and
unexpectedly filling up the local filesystems of remote execution hosts.
Logging
While running, mrp outputs logging information simultaneously to
standard output and the _log file in the top level of the pipestance
directory.
The start of the _log contains information on how mrp was started,
including command-line options used, relevant environment variables, and job
manager settings. The balance of the log largely comprises notifications of
stage progress. One can use this logging to monitor the progress of mrp
through the pipeline’s graph.
Here is an example of the top of a log:
Martian Runtime - 2.2.0
2017-04-09 14:22:43 [cmdline] mrp invoke.mro piperun1 --localcores=16
2017-04-09 14:22:43 [environ] MROFLAGS=--vdrmode=rolling
2017-04-09 14:22:43 [options] --localcores=16
2017-04-09 14:22:43 [environ] MROPATH=/home/user/git/mypipeline/mro
2017-04-09 14:22:43 [version] MRO Version=2.2.0
2017-04-09 14:22:43 [options] --jobmode=local
2017-04-09 14:22:43 [options] --maxjobs=-1
2017-04-09 14:22:43 [options] --jobinterval=-1
2017-04-09 14:22:43 [options] --vdrmode=rolling
2017-04-09 14:22:43 [options] --profile=disable
2017-04-09 14:22:43 [options] --stackvars=false
2017-04-09 14:22:43 [options] --zip=false
2017-04-09 14:22:43 [options] --noexit=false
2017-04-09 14:22:43 [options] --nopreflight=false
2017-04-09 14:22:43 [jobmngr] Job config = /opt//martian-2.2.0/jobmanagers/config.json
2017-04-09 14:22:43 [jobmngr] Using 16 cores, per --localcores option.
2017-04-09 14:22:43 [jobmngr] Using 94 GB, 100% of system memory.
2017-04-09 14:22:44 [webserv] UI disabled.
Running preflight checks (please wait)...
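The timestamped lines in the header above follow a regular pattern of date, time, bracketed tag, and message, which makes them easy to parse. The following Python sketch groups the messages by tag and extracts the [options] entries into a dictionary. The line format is inferred from this one example only, and the function names parse_log_header and parse_options are hypothetical.

```python
import re

# Pattern inferred from the example log: "YYYY-MM-DD HH:MM:SS [tag] message".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\] (.*)$")


def parse_log_header(lines):
    """Group timestamped log messages by their bracketed tag."""
    entries = {}
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # e.g. the version banner or untagged status lines
        _timestamp, tag, message = match.groups()
        entries.setdefault(tag, []).append(message)
    return entries


def parse_options(entries):
    """Turn '--name=value' messages under the 'options' tag into a dict of strings."""
    options = {}
    for message in entries.get("options", []):
        name, _, value = message.lstrip("-").partition("=")
        options[name] = value
    return options
```

Applied to the example, parse_options would map localcores to "16" and jobmode to "local"; values are kept as strings rather than converted.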
Because the log is mirrored to the _log file, you may detach from a running
mrp or redirect its standard output to /dev/null, and instead monitor
progress using tail -f _log. Since pipelines are often long-running, and
usually launched on remote workstations via an ssh connection, it is
recommended to launch mrp as a background process and disown it so that it is
not interrupted if the terminal connection is broken.
Final Outputs
When a pipestance completes, the formal output files of the pipeline are moved into a subdirectory named outs/ at the top level of the pipestance directory. Their original locations inside their respective chunks’ files/ directories are then replaced with symlinks to the corresponding files in outs/.
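To see which chunk outputs were promoted to final outputs, one can look for symlinks in the pipestance that resolve into outs/. The Python sketch below does this with the standard library; the function name links_into_outs is hypothetical, and the sketch assumes only what is described above, namely that the original locations become symlinks pointing into the outs directory.

```python
import os


def links_into_outs(pipestance_dir):
    """Map each symlink under the pipestance to its resolved target inside outs/."""
    outs = os.path.realpath(os.path.join(pipestance_dir, "outs"))
    links = {}
    # os.walk does not follow symlinked directories by default, so
    # symlinks to directories show up in dirnames without being entered.
    for dirpath, dirnames, filenames in os.walk(pipestance_dir):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            if target == outs or target.startswith(outs + os.sep):
                links[path] = target
    return links
```

The keys of the returned dictionary are the original chunk-level paths, and the values are the resolved locations of the corresponding final outputs.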
Remote command line access
The mrstat tool is a basic utility for querying mrp’s state over the API
exposed through the UI port (see below). Given a pipestance directory, it
will return mrp’s current state information. It also has a --stop flag,
which will cause mrp to abort the pipestance if it is running and shut down.
Web Interface
In addition to console logging, mrp features an embedded webserver that
serves an HTML user interface for monitoring progress and a REST API.
By default, the mrp webserver listens on a kernel-assigned non-privileged
network port and generates an authentication token to pass in the URL.
mrp reports the kernel-selected port number and authentication token as a
URL on the console and in the log, for example:
Serving UI at http://utopia-planitia.mars.sol:59410?auth=y1ezpEGscyA3jfion7iR58ED-ufs_drYOOKAeIR6GeQ
The URL is also written to the pipestance metadata file named _uiport.
To access the UI, simply direct your browser to the specified URL.
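Scripts that need the host, port, or token separately can split the URL with the standard library. The Python sketch below does so, assuming the URL has the shape shown in the example (an auth query parameter after the port) and that _uiport contains just that URL. The function names parse_ui_url and read_ui_url are hypothetical.

```python
import os
from urllib.parse import parse_qs, urlsplit


def parse_ui_url(url):
    """Split a UI URL into (hostname, port, auth token); token is None if absent."""
    parts = urlsplit(url.strip())
    token = parse_qs(parts.query).get("auth", [None])[0]
    return parts.hostname, parts.port, token


def read_ui_url(pipestance_dir):
    """Read the URL from _uiport, assuming the file holds only the URL."""
    with open(os.path.join(pipestance_dir, "_uiport")) as handle:
        return handle.read().strip()
```

For the example URL above, parse_ui_url yields the hostname utopia-planitia.mars.sol, the port 59410 as an integer, and the token string that follows auth=.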
Alternatively, you may choose a specific port using the --uiport command-line option.
By default, if a specific port is chosen, the authentication token is only
required for API calls which modify the pipestance state. These default
behaviors can be altered with the mrp flags --disable-ui, --disable-auth,
--require-auth, and --auth-key.
Normally, when a pipeline run completes, mrp terminates and the user interface
is no longer available. If mrp was started with --noexit then it will stay
up until mrp is killed (either with an operating system signal or with
mrstat --stop). One can also restart mrp in read-only mode with the
--inspect option (which implies --noexit).
Pipeline details
The right pane of the user interface provides general information about
the pipestance, including the command line given to mrp, some environment
variables, version information, paths, and the invocation MRO. Access to the
pipestance _log file is also available through this interface.
Pipeline Graph
The left pane of the UI displays the graph structure of the pipeline.
Each node of the graph represents a pipeline, subpipeline, or stage. Pipelines and subpipelines are displayed as rectangles while stages are displayed as rounded rectangles. Edges connecting the nodes indicate input/output dependencies between stages and pipelines. Note that, to reduce clutter in the graph view, a stage that depends on the outputs of a subpipeline is drawn as depending on the entire subpipeline, even though those outputs are bound to the outputs of individual stages within it. Thus a stage which appears to depend on a subpipeline that has not yet completed may start running anyway, as long as the underlying stages it actually depends on have completed.
Nodes are colored grey while they are waiting for upstream dependencies to be satisfied. Nodes turn blue when their dependencies are satisfied and they may run, green when they have successfully completed, and red if they fail.
For more details on a particular node, simply click on it.
Node Details
If you click on a node in the graph, the right pane of the user interface replaces the pipestance details with the execution details associated with the selected node. This includes information about the stage code, relevant paths, and then progressive drill-down detail about the real-time status of parameter sweeping, parallelization, and chunk execution. All metadata files in the pipestance are available through the UI.
To get back to the pipeline details view, click the ← button next to the node name at the top.