Stage Interfaces

Stages are the key components of a Pipeline; they are where the work actually gets done. Today a stage's interface is only partially defined: it is configured by a Policy, accepts a Clipboard as input, and produces a Clipboard as output. The configuring Policy may be constrained by a dictionary, but the stage's manipulation of the Clipboard is unconstrained. We propose to make this interface more rigorous by having stage authors declare which Clipboard items they will use and which they will produce.
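
As a rough illustration of what such a declaration might look like, the sketch below describes a stage's inputs and outputs as plain data. The names ItemSpec, StageInterface, and DETECT_INTERFACE, and the imaginary detection stage itself, are assumptions made for this example; they are not part of any existing pipeline API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemSpec:
    """Hypothetical description of one Clipboard item a stage touches."""
    name: str                 # formal name the stage uses for the item
    item_type: type = object  # expected Python type of the item
    required: bool = True     # False marks an optional item


@dataclass(frozen=True)
class StageInterface:
    """Hypothetical declaration of what a stage reads and what it writes."""
    inputs: tuple = ()
    outputs: tuple = ()


# Declaration for an imaginary source-detection stage (illustrative only).
DETECT_INTERFACE = StageInterface(
    inputs=(
        ItemSpec("exposure", dict),
        ItemSpec("mask", dict, required=False),
    ),
    outputs=(ItemSpec("sources", list),),
)
```

Because the declaration is ordinary data rather than code buried in the stage, it can be inspected without running the stage.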

Use cases

  • Answer the question, "Given a pipeline, which stage put a particular item on the Clipboard?" This could be determined by scanning each stage's declared outputs.
  • Produce better stage documentation, allowing potential stage re-users to understand what inputs they need to provide and what outputs they should expect.
  • Validate that all stages will get the data they expect. It may be difficult to do this statically, but it should be possible to do it at runtime.
  • Optimize the "freeze-drying" (checkpointing) of a pipeline (or portion thereof) for debugging by only saving Clipboard items that will be needed by downstream stages.
  • Support the eventual development of a pipeline construction tool.
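
Several of the use cases above reduce to simple set operations once every stage declares its inputs and outputs. The following is a minimal sketch under that assumption; the pipeline representation (an ordered list of name, inputs, outputs tuples) and the function names are hypothetical, and the checkpoint calculation ignores complications such as items being overwritten by later stages.

```python
# Hypothetical pipeline: ordered (stage name, declared inputs, declared outputs).
PIPELINE = [
    ("load", set(), {"exposure"}),
    ("detect", {"exposure"}, {"sources"}),
    ("measure", {"exposure", "sources"}, {"catalog"}),
]


def producers(pipeline, key):
    """Return the names of the stages that declare `key` as an output."""
    return [name for name, _, outs in pipeline if key in outs]


def missing_inputs(pipeline, initial=frozenset()):
    """Walk the stages in order and report (stage, key) pairs for every
    declared input that no earlier stage has declared as an output."""
    available = set(initial)
    problems = []
    for name, ins, outs in pipeline:
        for key in sorted(ins - available):
            problems.append((name, key))
        available |= outs
    return problems


def items_to_checkpoint(pipeline, after_index):
    """Return the items produced up to and including stage `after_index`
    that some later stage declares as an input."""
    produced = set()
    for _, _, outs in pipeline[: after_index + 1]:
        produced |= outs
    needed = set()
    for _, ins, _ in pipeline[after_index + 1:]:
        needed |= ins
    return produced & needed
```

For example, producers(PIPELINE, "sources") names the "detect" stage, and items_to_checkpoint(PIPELINE, 1) selects only "exposure" and "sources", the items that the remaining "measure" stage declares it needs.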

Requirements

  • Constrain Policy keys
  • Constrain Clipboard keys
    • Support formal/actual argument distinction
    • Support type specification
    • Support required/optional keys?
    • Support keys (actual arguments) generated from combinations of Policy items?
    • Support keys generated from Clipboard items?
    • Support output keys
  • Make it difficult for stage authors to do the wrong thing
    • Automatically pre-populate instance variables with declared input Clipboard items
    • Automatically post-populate the Clipboard with declared output instance variables
    • Do not provide the stage direct access to the Clipboard
  • Allow automated usage of stage interface data
    • Define stage interface in machine-readable form
    • Allow the stage to return extra dynamic interface information via a method?
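
One way the "make it difficult to do the wrong thing" requirements could fit together is sketched below. It is only an illustration under stated assumptions: the Stage base class, the run_stage function, the key_map argument (standing in for a formal-to-actual mapping that would presumably come from the Policy), and the CountSources example are all hypothetical, and the Clipboard is modelled as a plain dict.

```python
class Stage:
    """Hypothetical base class: subclasses declare formal names and types,
    then implement process() using only instance variables."""
    inputs = {}   # formal name -> expected type
    outputs = {}  # formal name -> expected type

    def process(self):
        raise NotImplementedError


def run_stage(stage, clipboard, key_map=None):
    """Run `stage` without ever handing it the Clipboard (a dict here).

    `key_map` maps formal names to actual Clipboard keys; a formal name
    with no entry is used as the actual key unchanged.
    """
    key_map = key_map or {}
    # Pre-populate instance variables from the declared inputs.
    for formal, expected in stage.inputs.items():
        actual = key_map.get(formal, formal)
        if actual not in clipboard:
            raise KeyError(f"missing Clipboard item {actual!r}")
        value = clipboard[actual]
        if not isinstance(value, expected):
            raise TypeError(f"{actual!r} is not a {expected.__name__}")
        setattr(stage, formal, value)
    stage.process()
    # Post-populate a copy of the Clipboard from the declared outputs.
    result = dict(clipboard)
    for formal, expected in stage.outputs.items():
        value = getattr(stage, formal)  # AttributeError if never set
        if not isinstance(value, expected):
            raise TypeError(f"output {formal!r} is not a {expected.__name__}")
        result[key_map.get(formal, formal)] = value
    return result


class CountSources(Stage):
    """Illustrative stage: counts the items in a list of sources."""
    inputs = {"sources": list}
    outputs = {"count": int}

    def process(self):
        self.count = len(self.sources)
```

A caller might then write run_stage(CountSources(), {"detectedSources": [1, 2, 3]}, {"sources": "detectedSources", "count": "nSources"}); the stage sees only self.sources and self.count, while the returned Clipboard gains an "nSources" entry. The class-level inputs and outputs dictionaries double as the machine-readable interface description.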