Running Pipelines#
Invoking a Pipeline#
Thus far we have shown how to define stages and pipelines in MRO files. To invoke a pipeline, write an MRO file containing a pipeline call
statement with the desired input arguments. This call
statement is called an invocation. To invoke the example pipeline from above:
invoke.mro
@include "pipeline.mro"
call DUPLICATE_FINDER(
unsorted = "/home/duplicator_dave/unsorted.txt",
)
Typically, an invocation MRO file contains a single @include
statement that causes the pipeline definition to be included, and a single call
statement of that pipeline. It is generally discouraged to call
a pipeline in the same file in which it is defined, because then the pipeline definition cannot be easily reused for other invocations with different input arguments.
Running mrp#
mrp
is the runtime executable that runs Martian pipelines. When a pipeline is run, the instantiation of it is called a pipestance, which is a portmanteau of “pipeline” and “instance”. The command-line interface for mrp
is:
$ mrp <invocation_mro> <pipestance_id>
To start a run, provide an invocation MRO file, plus a unique pipestance ID, comprising only numbers, letters, dashes, and underscores. This ID will be the name of the directory containing the pipestance, relative to the current working directory. When running a pipeline multiple times, choose a different pipestance ID for each run.
mrp
features a number of command-line options, which are documented in Advanced Features.
Once mrp
starts, you should see the following output:
$ mrp invoke.mro piperun1
Martian Runtime - 2.2.0
Running preflight checks (please wait)...
2018-01-02 14:23:52 [runtime] (ready) ID.piperun1.DUPLICATE_FINDER.SORT_ITEMS
2018-01-02 14:23:53 [runtime] (split_complete) ID.piperun1.DUPLICATE_FINDER.SORT_ITEMS
2018-01-02 14:23:53 [runtime] (run:local) ID.piperun1.DUPLICATE_FINDER.SORT_ITEMS.fork0.chnk0.main
At a high level, mrp
performs the following to run a pipeline:
- Parse and validate MRO file (e.g. invoke.mro)
- Convert the MRO into a graph representation of the pipeline
- Create a directory for the pipestance named with the pipestance ID provided (e.g. piperun1)
- Begin evaluating dependencies and executing the stages of the pipeline
- Continuously monitor stages and advance through the pipeline graph when dependencies are satisfied
Completion and Failure#
If the pipestance encounters no errors while running, mrp
exits with status 0 and writes a _complete
file in the top level of the pipestance directory.
If the pipestance encounters does an encounter an error, mrp
exits with status 1. The failed stage(s) will contain an _errors
file with information about the error.
For more details about how to examine an in-progress, completed, or failed pipestance, see Inspecting Pipelines.
Restarting#
When a pipestance fails, it can be restarted by running mrp
with the same
arguments as before. mrp
will identify the failed stages, and reset them to
a clean state so that they can run again. Stages that have already completed
successfully will not be reset or re-run. mrp
attempts to verify that no
other instance of mrp
is currently running that pipestance, and that
other settings are compatible with the previous run. Normally retrying will
only re-run failed chunks. If the MRO_FULLSTAGERESET
environment variable
is non-empty, the entire failed stage will be reset.
Stages which failed with error messages that match regular expressions defined
in martian/jobmanagers/retry.json
may be retried automatically. mrp
will
restart itself in such circumstances a number of times configured either from
the command line or in retry.json
giving up. This is useful for error
types which are expected to be transient, such as receiving a signal from the
operating system.
If mrp
is restarted with the --inspect
flag set, it should attempt to read
the pipestance in “read only” mode. In combination with --noexit
this can be
used to open up a user interface for an old pipestance.