Last modified on 12/19/2007 07:51:32 PM

Organizing Stage Implementation Files within a Package

from: Testing Pipeline Stages -> Examples -> Usual Location.

This document advises on what to call, and where to store, the many files associated with the implementation of a Stage of a Pipeline. The primary purpose is to provide enough consistency across packages that a stage developer can find the stage implementations in packages other than their own and understand how they work. Apart from the location of the final production stage policy files, the names and locations of these files do not need to support any automated handling beyond what is encoded in the SCons scripts. Thus, the specifications for organization within an application package are merely recommendations.

It is assumed that a stage implementation is developed within the context of an application package: the stage applies algorithmic code provided by that package as a part of a pipeline. Thus, all the files that make up the implementation are checked in and built within that package. We note that this implies a dependency of the application package on the dps package. We would, however, like to be able to use classes from the application package independently of the dps package; thus, this should be set up as an optional dependency such that:

  • the dps package need not be available in order to build the package (when dps is not available, the building of the stage-related classes is skipped).
  • the dps libraries need not be available in order to build applications that use only the non-stage classes.

Some boilerplate SCons code will probably be required to enable this.
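
As a rough illustration of the optional dependency described above, the following is a minimal sketch in plain Python. It is not the actual SCons boilerplate. The helper names, the group names "algorithm" and "stage", and the module name lsst.dps are all assumptions made for the example.

```python
import importlib.util


def module_available(name):
    """Return True if the named module can be found on the import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. no "lsst" at all) raises ImportError.
        return False


def select_build_groups(dps_available):
    """Always build the algorithm code; add the stage classes only with dps."""
    groups = ["algorithm"]
    if dps_available:
        groups.append("stage")
    return groups


# "lsst.dps" is a hypothetical module name for the dps package.
groups = select_build_groups(module_available("lsst.dps"))
```

The point of the sketch is only that the stage-related group is skipped, rather than causing a failure, when dps cannot be found.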

An implementation typically includes:

  • C++ class files that implement an algorithm independent of the Pipeline harness
  • python wrapper classes
  • python implementation of the python Stage class
  • policy file(s) that configures the Stage class

C++ Algorithm Classes and Python wrappers

The locations and names of the algorithm code follow our current guidelines and conventions. The C++ code is organized within the include and src directories according to the namespace-ing conventions, e.g. lsst::pkg going under include/lsst/pkg.

Python Stage Implementation

If there is only one simple stage:

  • place python code in python/lsst/pkg/pipeline.py
  • name the stage class after its functionality, followed by Stage (e.g. ImageSubtractStage)

If you have more than one stage (or if your stage class is very complicated--but typically you should put the guts of the stage code elsewhere):

  • place the python code in the directory python/lsst/pkg/pipeline
  • create one .py file per stage
  • name each stage class after its functionality, followed by Stage (e.g. ImageSubtractStage)
  • create a file named __init__.py that includes all stage classes; thus for each stage use a line such as "from ImageSubtract import ImageSubtractStage".

In either case, one can create a stage via (apppkg here stands for the name of the application package):

   import lsst.apppkg.pipeline
   s = lsst.apppkg.pipeline.ImageSubtractStage()
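
The multi-stage layout can be tried out with a throwaway package. The sketch below builds one in a temporary directory and imports it. The package name demo_pkg is a hypothetical stand-in for lsst.pkg, and the stage class is an empty placeholder rather than a real subclass of the Stage class. Note that the explicit relative import (leading dot) is the Python 3 spelling of the import line quoted above.

```python
import importlib
import os
import sys
import tempfile


def write(path, text):
    """Write text to path, replacing any existing file."""
    with open(path, "w") as f:
        f.write(text)


# Build a throwaway package tree; "demo_pkg" is a hypothetical name.
root = tempfile.mkdtemp()
pipeline_dir = os.path.join(root, "demo_pkg", "pipeline")
os.makedirs(pipeline_dir)
write(os.path.join(root, "demo_pkg", "__init__.py"), "")

# One .py file per stage, named after the functionality.
write(os.path.join(pipeline_dir, "ImageSubtract.py"),
      "class ImageSubtractStage(object):\n    pass\n")

# __init__.py pulls every stage class into the pipeline namespace.
write(os.path.join(pipeline_dir, "__init__.py"),
      "from .ImageSubtract import ImageSubtractStage\n")

# Import the package and create a stage, as in the snippet above.
sys.path.insert(0, root)
importlib.invalidate_caches()
pipeline = importlib.import_module("demo_pkg.pipeline")
s = pipeline.ImageSubtractStage()
```

Because __init__.py re-exports the class, callers do not need to know which file a stage lives in.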

Policy Files

All policy files that are part of the application package go under a top-level directory called pipeline (i.e. at the same level as include, src, SConstruct, etc.).

Dictionary Files

This section is subject to change pending some response to feedback. First, developers expressed interest in checking in some or all dictionary files close to the code that uses them. Second, there was the suggestion that there be machinery available for code to access and load dictionary files stored within a package's installation directory; this would require transparently locating the product directory and so is not yet supported. Third, there was a suggestion that dictionary files be loaded to set default values (as an alternative to hard-coding defaults) that can then be overridden by the Policy object brought in externally.

  • these files document the policy parameters expected by a class.
  • call it classDictionary.fmt, where class is the name of the class that is configured by the set of parameters, and fmt is the policy format extension (json or paf)
  • e.g. ImSubtractStageDictionary.paf
  • place directly under pipeline directory
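
The naming rule above can be written as a small helper. This is only a sketch: the function name dictionary_path and its arguments are hypothetical and not part of any existing package.

```python
import posixpath

# Policy format extensions named in the convention above.
POLICY_FORMATS = ("json", "paf")


def dictionary_path(class_name, fmt="paf", pipeline_dir="pipeline"):
    """Return the conventional location of a class's dictionary file."""
    if fmt not in POLICY_FORMATS:
        raise ValueError("unknown policy format: %r" % (fmt,))
    # classDictionary.fmt, placed directly under the pipeline directory.
    return posixpath.join(pipeline_dir, class_name + "Dictionary." + fmt)


path = dictionary_path("ImSubtractStage")
```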

Sample Policy Files

  • these are examples that can be used for testing or to demonstrate different configurations.
  • place in a directory called pipeline/examples.
  • pipeline/examples can contain other files, too. In particular, it can contain any other scripts and/or (small) data files needed to run an example pipeline.

DC2 Production Policy Files

  • The policy files for each stage that will actually be used in the execution of DC2 will be checked into a special package called dc2pipe.
  • Pipeline policy files will be stored in this package in a directory called pipeline.
  • a high-level pipeline policy will be checked in for each pipeline configuration with names like association.paf, movingobjects.paf.
  • a subdirectory named after the pipeline (e.g. association, movingobjects, etc.) will contain the policy files for each of the stages in that pipeline (along with any lower-level policy files).
  • each pipeline policy file will do "includes" of the component policies in its associated subdirectory.
  • a single nightlypipeline.paf file may be created as a master policy file for the 3 DC2 pipelines.
  • stage developers will check in the initial version of the production policy file, and the developers and integrators will update it as necessary.
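
The directory layout described in this list can be sketched as a helper that computes the expected paths for one pipeline. The function name and the stage policy file name used in the example call are hypothetical; only the pipeline directory, the .paf extension, and the pipeline name association come from the text above.

```python
import posixpath


def production_policy_paths(pipeline_name, stage_policy_files, top="pipeline"):
    """Return (pipeline policy path, list of component policy paths)."""
    # High-level pipeline policy, e.g. pipeline/association.paf
    pipeline_policy = posixpath.join(top, pipeline_name + ".paf")
    # Per-stage policies live in a subdirectory named after the pipeline.
    components = [posixpath.join(top, pipeline_name, name)
                  for name in stage_policy_files]
    return pipeline_policy, components


# "exampleStage.paf" is a made-up stage policy file name.
pipeline_policy, components = production_policy_paths(
    "association", ["exampleStage.paf"])
```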