Last modified on 10/29/2013 12:33:49 PM

Page to collect the thoughts and emails of folks about a redesign of the QA infrastructure in Winter 2014.

Desired Functionality for overall Toolkit (RHL)

  • Tools for image display (replacing "import ds9", although ds9 might be the backend)
  • How to debug algorithms as they run (the current lsstDebug mechanism).
    • How to make it more uniform, and generate documentation as tests are added.
  • Analysis tools for looking at pipeline outputs from FITS files; e.g.
    • 3-colour diagrams for triples of CCDs
    • CM-diagrams for pairs of rafts (with small dithers)
    • Repeatability of psf photometry
    • Repeatability of model colours
    • Comparison with input catalogues (with ability to display outliers and get more information about them)
    • images of deblends of interesting objects (for example, outliers in the 3-colour diagrams)
    • displays of deblends of objects found in child-of-ds9
  • The equivalent on a larger scale (maybe via the DB).
  • How to achieve all this in a way that we can share.
  • The replacement/augmentation of pipeQA
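
As a rough illustration of the "repeatability of psf photometry" item above, the sketch below takes repeated magnitude measurements of the same objects and reports the per-object RMS scatter. The function name and the input layout (a dict mapping an object id to a list of magnitudes) are assumptions made for this example, not part of any existing LSST API.

```python
import math

def repeatability_rms(mags_by_object):
    """Return {object_id: RMS scatter in mag} for objects with >= 2 measurements."""
    result = {}
    for obj_id, mags in mags_by_object.items():
        if len(mags) < 2:
            continue  # scatter is undefined for a single measurement
        mean = sum(mags) / len(mags)
        # Sample variance (n - 1 in the denominator)
        var = sum((m - mean) ** 2 for m in mags) / (len(mags) - 1)
        result[obj_id] = math.sqrt(var)
    return result

# Made-up measurements, purely for illustration
measurements = {
    "star-1": [18.00, 18.02, 17.98],
    "star-2": [20.10, 20.30],
    "star-3": [19.50],
}
rms = repeatability_rms(measurements)
```

The same per-object scatter could then be plotted against mean magnitude to compare against an expected noise floor.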

Thoughts for improving SDQA (RAS)

Of my favorite ideas for improving SDQA functionality, some have been mentioned before. It's worth bearing in mind that the SDQA functionality has high potential appeal for other projects that may consider adopting some LSST code.

  • Improve production efficiency. The production of QA artifacts has to be very efficient, and the raw material (derived quantities, plots, etc.) should be generated without undue computational burden. We have not run PipeQA on CalExps for some time now because of the cost in time and cycles. The suggested rewrite would undoubtedly help a lot, though perhaps the problem can also be ameliorated by lazy generation of QA artifacts (i.e., only generate when needed on a subset of data), or by generating some QA artifacts only when the production is run in some kind of "debug" mode.
  • Improve navigation among QA artifacts. It would be nice (and eventually, critical) to be able to navigate easily among QA artifacts for images that overlap spatially but differ in passband, or visit epoch; or to ascend/descend the data product hierarchy (Raw, CalExps, Co-Adds, or difference images); or to navigate to spatially adjacent regions.
  • Provide direct access to images. Having links to view the image (or a link to download it for viewing by another application) would be very helpful. Sometimes the cause of a QA issue is obvious just by looking at the image.
  • Provide summary QA reports. It would be really handy to visualize the distribution of various QA-related scalar quantities from a production run (PSF size and shape, zero-point, RMS of the WCS solution, etc.), say in the form of a histogram. Better still would be to allow the user to select for review those images that fall outside of a user-defined range.
  • Provide provenance information. While the need may not be urgent (and this is a little tangential to SDQA), this would be useful in the near term for comparing data products from split DRP, or comparing products generated from different software builds.
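
The "summary QA reports" idea above could look something like the following sketch: bin a scalar QA quantity into a histogram and pick out the images that fall outside a user-defined range. The function names, the image ids and the FWHM values are all invented for illustration.

```python
def histogram(values, lo, hi, nbins):
    """Count values in nbins equal-width bins spanning [lo, hi]; out-of-range values are ignored."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / width), nbins - 1)  # put v == hi in the last bin
            counts[idx] += 1
    return counts

def select_outliers(metric_by_image, lo, hi):
    """Return the sorted ids of images whose metric falls outside [lo, hi]."""
    return sorted(k for k, v in metric_by_image.items() if not (lo <= v <= hi))

# Hypothetical PSF FWHM values (arcsec) keyed by a made-up image id
psf_fwhm = {"v1-c1": 0.71, "v1-c2": 0.74, "v2-c1": 1.35, "v2-c2": 0.69, "v3-c1": 0.42}
counts = histogram(psf_fwhm.values(), 0.4, 1.4, 5)
to_review = select_outliers(psf_fwhm, 0.6, 1.0)
```

Here `to_review` is the list a reviewer would be offered for closer inspection; the histogram itself could be drawn by whatever plotting layer is chosen.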

Thoughts on Primitives (JB)

My #1 concern for this task is that these primitives be designed so that they can be used to quickly develop new tools, and that the reliance on SQL be somewhat limited.

  • Bidirectional mappings between SQL databases and afw::table objects; I want to be able to dump SQL queries into afw::table Catalogs, and upload afw::table Catalogs into temporary scratch-space SQL tables. For high-level tests that don't require extremely complicated SQL queries or massive data volumes, I'd like to see this used to base as much of our toolkit as possible on afw::table objects, rather than SQL databases. (I understand that some high-level tools will need a SQL backend; I only ask that we don't require a SQL backend for high-level tools that don't need one.)
    • [KTL] Planning to incorporate some of this functionality into the Butler, or at least a simple driver script around it. Think through the possibility of explicitly using SQLite as an alternate on-disk file format
  • Multi-input spatial matching and *simple* joins and filters for Catalogs. (I don't want to try to reproduce everything SQL can do, but we could do a lot with just a small amount of functionality in this department.)
    • [KTL] I worry that this is a slippery slope.
  • Implicit generation of mag columns from flux columns in Catalog column views (requires attaching Calib objects to Catalogs, which I'd like to see done, but is a bit tricky w.r.t. persistence).
  • Simple GUI primitives for small amounts of interactivity in plots:
    • Scatter plots that let us choose points and inspect them further, either by clicking on individual points or drawing boxes or polygons around them
    • Image overlays that let us select individual sources (or constituent images in a coadd) in a similar manner
    • Displaying images or per-chip summary quantities for a full focal plane, with similar support for retrieving interactively-selected chips.
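
To make the "bidirectional mapping" idea above concrete, here is a minimal sketch using SQLite from the Python standard library. A plain list of dicts stands in for an afw::table Catalog, and the helper names `upload_catalog` and `query_to_catalog` are hypothetical; a real implementation would need to handle schemas, types and the Butler.

```python
import sqlite3

def upload_catalog(conn, table, rows):
    """Create a TEMP table from a non-empty list of dicts sharing the same keys.

    Table and column names are interpolated directly, so they must be trusted.
    """
    columns = list(rows[0].keys())
    col_sql = ", ".join('"%s"' % c for c in columns)
    conn.execute('CREATE TEMP TABLE "%s" (%s)' % (table, col_sql))
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table, placeholders),
        [tuple(r[c] for c in columns) for r in rows],
    )

def query_to_catalog(conn, sql, params=()):
    """Run a query and return the result as a list of dicts (one per row)."""
    cur = conn.execute(sql, params)
    names = [d[0] for d in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]

conn = sqlite3.connect(":memory:")
sources = [
    {"id": 1, "ra": 10.0, "dec": -5.0, "flux": 1200.0},
    {"id": 2, "ra": 10.1, "dec": -5.1, "flux": 80.0},
]
upload_catalog(conn, "scratch_sources", sources)
bright = query_to_catalog(conn, "SELECT id, flux FROM scratch_sources WHERE flux > ?", (100.0,))
```

The point of the sketch is that simple filters can run either in SQL or on the in-memory rows, so a high-level tool need not depend on a database server.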

I think it's also extremely important that our plotting and display primitives be object-oriented in a way that's friendly for interactivity. For instance, I don't want to just send commands to a ds9-like tool; I want to have an object that represents a displayed exposure with features and layers that can be enabled or disabled and have their properties changed (I think this is essentially impossible with ds9's xpa interface, and it's the main reason I'd like to see it replaced).
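
One possible shape for such an object-oriented display handle is sketched below. The class names `DisplayedExposure` and `Layer` are invented for this example and no real viewer backend is attached; the sketch only shows layers as objects whose state can be changed after creation.

```python
class Layer:
    """A named overlay (e.g. mask plane or source markers) with mutable properties."""
    def __init__(self, name, enabled=True, **properties):
        self.name = name
        self.enabled = enabled
        self.properties = dict(properties)

class DisplayedExposure:
    """Client-side handle for one displayed exposure; no real viewer is attached here."""
    def __init__(self, title):
        self.title = title
        self._layers = {}

    def add_layer(self, name, **properties):
        layer = Layer(name, **properties)
        self._layers[name] = layer
        return layer

    def layer(self, name):
        return self._layers[name]

    def visible_layers(self):
        """Names of the layers a backend would currently draw, in insertion order."""
        return [l.name for l in self._layers.values() if l.enabled]

disp = DisplayedExposure("Image to convolve")
disp.add_layer("mask:SAT", color="green")
disp.add_layer("sources", color="red", symbol="+")
disp.layer("mask:SAT").enabled = False               # hide a layer without redrawing others
disp.layer("sources").properties["color"] = "cyan"   # change a property after the fact
```

A backend (ds9, a browser widget, or something else) would then observe these objects and redraw, rather than receiving one-shot drawing commands.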

ds9 (RO and RHL)

The good:

  • Image display with zooming
  • WCS support
  • Mask overlays
  • Ability to draw to a graphics plane
  • Rescaling the displayed image

The bad:

  • Moderately easy to crash
  • Bad network support (the xpans stuff is clumsy and doesn't work well with firewalls)
  • Mask overlays are very slow and incomplete (mtv has to split them into separate mask planes and send each separately)
  • xpa is slow when overlaying glyphs (it's a text format) and fragile --- try sending more than (4096 - delta) bytes in a message
  • Rescaling isn't very good (e.g. they added asinh stretches, but not the ability to control the softening). Using the mouse to restretch modifies the lookup table, not the stretch
  • Lots of bad UIs (e.g. the scale parameters plot: I don't think you can control the range plotted; you used to be able to do so, but it used the same binwidths as the full plot)

  • No real ability to get things back from ds9 into python (e.g. where did I click? What key did I hit?). You *can* do it with a loop and calls to imexam but it's very clunky
  • Only one window at a time is supported
  • It is written in tcl and contains its own weird version of same, so it's a pain to build.
  • It runs poorly on MacOS X (recent versions don't even try to use Aqua).
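
For reference, an asinh stretch with a controllable softening parameter (the control the rescaling complaint above says is missing) can be sketched as follows. The parameterization is one common form chosen for illustration; it is an assumption, not what ds9 implements.

```python
import math

def asinh_stretch(value, vmin, vmax, softening):
    """Map value to [0, 1] with an asinh stretch.

    softening sets where the mapping changes from roughly linear
    (value - vmin much smaller than softening) to roughly logarithmic.
    """
    if softening <= 0 or vmax <= vmin:
        raise ValueError("need softening > 0 and vmax > vmin")
    clipped = min(max(value, vmin), vmax)
    return math.asinh((clipped - vmin) / softening) / math.asinh((vmax - vmin) / softening)

# Same faint pixel value, two softening choices: a smaller softening lifts faint pixels more.
small_softening = asinh_stretch(10.0, 0.0, 1000.0, softening=1.0)
large_softening = asinh_stretch(10.0, 0.0, 1000.0, softening=100.0)
```

Exposing `softening` as a user-adjustable value is what lets the same data range emphasize either faint structure or bright cores.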

Minor quibbles:

  • If you turn on smoothing in ds9 and display a mask, all pixels are masked
  • The only way to put up a title is to set a fits header keyword when loading the image (and the field is small with horizontal layout, which is what fits a screen)
  • axis labels are almost unreadable for "display vertical graph"
  • Its true-colour stuff isn't.
  • You can't toggle the graphics on and off
  • Clicking by default puts up a circle. This is rarely what I want to do.
  • I find it fairly clumsy to use -- I'd like to more easily be able to change scales.
  • No sinh-based grayscale option.

Existing Example Code

  • pipeQA + displayQA

Used for batch processing (retrospective) QA over a full focal plane, or over a set of sky tiles for coadds. Can run off of persisted products (afw::table) or a MySQL database.

The functionality is split into 2 parts: database query plus metric/figure generation (testing_pipeQA) and metric/figure presentation (testing_displayQA). Each metric is implemented as an individual class based off of pipeBase.Task and pipeQA.QaAnalysisTask. The main task methods are:

"get" the necessary data from a shared data object (ButlerQaData for data on disk, DbQaData for database queries). These classes basically fill a sparse pipeQA-specific "source" model. This is someplace where db<->afwTable integration would be highly useful.

"test" by generating metrics based upon these data, and comparing the values of the metrics to baseline values contained in the class (and potentially overridden by its Config).

"plot" a set of figures representing the test

These tests are implemented on a per-CCD basis, as well as on a full-focal plane basis. Because plotting can take a while, the data distilled during the testing and needed to make the plot may be pickled to disk; plot PNGs are then only rendered if they are requested for viewing (the vast majority of images that may be generated are never looked at). Resulting values of the metrics needed to render the pipeQA page by displayQA are written to disk in a SQLite database ([SB] HSC uses postgresql instead of SQLite, which has a much faster cache-building post-process). Each of these persisted data products is written to a subdirectory representing a pipeline_run/visit_test combination.
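
The "get"/"test"/"plot" structure described above can be summarized in a small skeleton. This is a simplified sketch: `QaMetricTask` and `MedianPsfFwhmTask` are hypothetical names, the data source is a plain dict, and nothing here uses the real pipeBase.Task or pipeQA classes.

```python
class QaMetricTask:
    """Hypothetical base class: subclasses implement get(), test() and plot()."""
    def run(self, data_source, data_id):
        data = self.get(data_source, data_id)
        result = self.test(data)
        result["figure"] = self.plot(data, result)
        return result

class MedianPsfFwhmTask(QaMetricTask):
    # Baseline range for the metric; in pipeQA such limits could be overridden by a Config.
    limits = (0.5, 1.2)

    def get(self, data_source, data_id):
        return data_source[data_id]          # here: a plain list of FWHM values

    def test(self, data):
        values = sorted(data)
        n = len(values)
        median = values[n // 2] if n % 2 else 0.5 * (values[n // 2 - 1] + values[n // 2])
        lo, hi = self.limits
        return {"metric": median, "passed": lo <= median <= hi}

    def plot(self, data, result):
        # Stand-in for figure generation: return a description instead of rendering.
        return "histogram of %d FWHM values (median %.2f)" % (len(data), result["metric"])

fake_source = {("visit", 1, "ccd", 3): [0.70, 0.72, 0.68, 0.75]}
outcome = MedianPsfFwhmTask().run(fake_source, ("visit", 1, "ccd", 3))
```

Keeping the three steps separate is what allows the data access layer (disk or database) to be swapped without touching the metric or the figure code.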

An overall pipeQA run specifies on the command line a combination of visit,raft,ccd,QAtest to process, so that this may be easily parallelized.
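
The reason this kind of command-line selection parallelizes easily is that it expands into independent work units. The sketch below shows the idea with argparse; the flag names are assumptions made for illustration and are not pipeQA's actual options.

```python
import argparse
import itertools

def parse_selection(argv):
    parser = argparse.ArgumentParser(description="Select QA work units (illustrative only)")
    parser.add_argument("--visit", nargs="+", required=True)
    parser.add_argument("--raft", nargs="+", default=["*"])
    parser.add_argument("--ccd", nargs="+", default=["*"])
    parser.add_argument("--test", nargs="+", default=["*"])
    return parser.parse_args(argv)

def work_units(args):
    """Expand the selection into independent (visit, raft, ccd, test) tuples."""
    return list(itertools.product(args.visit, args.raft, args.ccd, args.test))

args = parse_selection(["--visit", "100", "101", "--raft", "2,2",
                        "--ccd", "1,1", "--test", "psf", "zeropoint"])
units = work_units(args)
```

Each tuple in `units` could be handed to a separate process or batch job, since no unit depends on another.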

The display of the metrics is handled by PHP, which auto-generates the HTML served to the browser, using the SQLite databases it finds on disk. The PHP also has the ability to execute basic "plotting" scripts if an image is requested that does not exist (will only happen the first time an image is requested), using the pickled data. Each pipeline run is given a summary page; each visit is given its own summary page below this; each test its own summary page showing a full focal plane figure and full focal plane metrics; and each CCD its own summary page showing detailed results of the test for that CCD. The deepest drilldown we have been able to achieve is to have "hover" information (e.g. x,y coords) for data points on these plots.
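
The render-on-first-request behaviour can be sketched in a few lines of Python. The helper names are hypothetical, and a text file stands in for the PNG so that the example needs only the standard library.

```python
import os
import pickle
import tempfile

def save_plot_data(directory, name, data):
    """Pickle the distilled data needed to draw a figure later."""
    path = os.path.join(directory, name + ".pkl")
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return path

def get_figure(directory, name, render):
    """Return the figure path, rendering it from the pickle only on first request."""
    fig_path = os.path.join(directory, name + ".txt")  # a real tool would write a PNG
    if not os.path.exists(fig_path):
        with open(os.path.join(directory, name + ".pkl"), "rb") as f:
            data = pickle.load(f)
        with open(fig_path, "w") as out:
            out.write(render(data))
    return fig_path

calls = []
def render(data):
    calls.append(1)                      # count how often rendering really happens
    return "points: %d" % len(data["x"])

workdir = tempfile.mkdtemp()
save_plot_data(workdir, "zeropoint", {"x": [1, 2, 3], "y": [25.1, 25.0, 25.2]})
first = get_figure(workdir, "zeropoint", render)
second = get_figure(workdir, "zeropoint", render)   # served from disk, no re-render
```

Because most figures are never viewed, deferring the rendering in this way trades a small delay on first view for a large saving at production time.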

  • lsstDebug

Used for streaming (real-time) debugging. The QA code is placed inside the scripts being run, which the user may choose to run in debug mode using "--debug" in the call to a CmdLineTask. The debug settings file must be found in the user's PYTHONPATH; it toggles the different debug options to be run by setting boolean flags. A typical use case looks like:

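    import lsstDebug
    import lsst.afw.display.ds9 as ds9    # ds9 display interface used by mtv() below

    # Look up the debug flags registered for this module
    display = lsstDebug.Info(__name__).display
    displayTemplate = lsstDebug.Info(__name__).displayTemplate
    if display and displayTemplate:
        # templateMaskedImage is defined earlier in the task being debugged
        ds9.mtv(templateMaskedImage, frame=lsstDebug.frame, title="Image to convolve")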

with the relevant parts of the debug settings file looking like:

    import lsstDebug

    print("Importing debug settings...")
    def DebugInfo(name):
        di = lsstDebug.getInfo(name)          # N.b. lsstDebug.Info(name) would call us recursively
        if name == "lsst.ip.diffim.imagePsfMatch":
            di.display = True                 # global
            di.displayTemplate = True         # show full (remapped) template
        return di                             # every module gets an Info object back

    lsstDebug.Info = DebugInfo                # install our function as the lookup hook
Common use cases are to render images in ds9, enable mask planes, and indicate detected or measured objects; and to render and show manually generated QA (non-pipeQA) plots in matplotlib.

[ACB] Ideally, the QA plots would be derived from the same base classes as pipeQA plots. Ideally, there would be interactivity between the images and the plots via shared data (a-la ASCOT). And ideally the overall level of interactivity (image and data) enabled from this script-level debugging would be similar to what is available from our high-level QA interface (i.e. pipeQA replacement).

[PAP] Here's what I want to see done:

  • Switch to using Config (or some debug-tuned subclass). This introduces documentation for each of the options, and defines which options are available.
  • Add support for activating debug options from the command-line.
  • Abstract out ds9.
  • Add support for common high-level operations (e.g., display sources, display matches, pause to allow inspection of an image, hitting 'p' to drop into pdb).

  • RHL toolkit


[KSK] ASCOT is the AStronomical COllaborative Toolkit. It is a widget-based dashboard system intended to allow users to dynamically change the view of their data as they flow from exploration to research to publication. The framework allows for inter-widget communication. This facilitates interactions like: brushing and linking, multi-widget annotation, and simultaneous and automatic update of all browsers viewing the dashboard. It is based on node.js on the server side using ShareJS (from Google Wave) for state operations. The intent is for all data interaction to be in the browser (this means no sending pngs back and forth to a server for viewing images). See this link for more information and links to the development server.

The good:

  • There is a significant trend to move to the browser, so there is a lot of active development for the technologies that ASCOT is based on.
  • The power of the browser is increasing rapidly with HTML5, WebGL, and asm.js.
  • There are several widgets already implemented
    • Sky viewer (Google Earth)
    • SQL interface to the SDSS catalogs
    • Histogram and scatter plots (Highcharts)
    • Dataset viewer (YUI)
    • FITS viewer
    • Interface to sciDB server
  • An 'ASCOT button' API that allows dashboards to be assembled programmatically from information in a web page. This makes it possible to go from a static view of data (pipeQA) to a dynamic one (ASCOT dashboard).
  • Having multiple browsers seeing the same view of the data with all browsers able to modify the view makes the interaction very collaborative.
  • The above also means that you can work on a dashboard in one place and open it up another place with exactly the same view.
  • The framework is simple enough that the learning curve for developing widgets is relatively shallow, so that community contribution is possible.

The bad:

  • Not production ready (but close).
  • Plotting in JavaScript has a limit of a few tens of thousands of points. This limitation is primarily due to the fact that JavaScript plotting libraries typically use SVG.
  • Some aspects are hard (full featured FITS viewer) and ASCOT is resource limited.
  • It's not clear how to generate publication quality plots.
  • Dataset I/O is not completely implemented.
  • There is no API for creating composite/derived columns

My take: This could be a very good tool for data exploration (especially collaboratively). I think it could be very useful in the context of QA. It is almost mature enough for this, but would need some work to get all the way there.


[KSK] TOPCAT is a standalone Java application for plotting and data manipulation. The STILTS package provides much of the same functionality as the TOPCAT GUI via the command line. I have found TOPCAT to be very useful for doing rapid comparisons of specific datasets.

The good:

  • There are extensive tools for creating composite/derived columns as well as sub-selecting the data
    • Tools include a relatively comprehensive set of mathematical functions including some aggregation functions
    • Sub-selection via logical operations on columns as well as by bounding regions (including freehand regions)
    • New datasets can be created from sub-selections.
  • Many plotting options
    • 2D and 3D scatter plots.
    • Histograms and density plots
  • Utilities for matching datasets via indexes or positions with positional errors
  • Built in support for celestial coordinate systems
  • Very mature I/O. Many file types are supported. I find it very useful to be able to load FITS files, for example.

The bad:

  • Native app, so not collaborative
  • I have noticed some bugs with saving sessions and with the density plots. I have not gone far enough to write up "how to reproduce" steps, and it has not been bad enough to make me stop using it.
  • There is no way to make publication quality plots (I don't know how to make Greek letters).
  • It is fairly memory intensive. I have been limited by memory before performance.
  • I have very little experience with the command line tools, but the libraries are only callable from Java and Jython.

My Take: This has been a good tool for doing preliminary exploratory work on relatively large datasets. The data manipulation utilities are especially useful. The ability to make many different sub-selections and plot them all together has also been useful to me. I think it should be viewed as a good model, in terms of functionality, for any plotting/visualization framework the project uses down the road.

  • opSim/calSim metrics (under development)

  • Ginga
    Python-based FITS image viewer