wiki:DC3bDataQuality
Last modified on 08/25/2010 09:35:04 AM

OBSOLETE: DC3b Data Quality Requirements (refer to DocuShare document-9655), from a draft by Tim Axelrod dated Aug 28, 2009.

This document summarizes the data quality requirements for the outputs of DC3b. The requirements are divided into those that are most naturally assessed on the outputs of particular processing stages, and those that are assessed on the integrated science database produced by DC3b. Numerical thresholds are part of most of these requirements. At this stage of requirements development, the actual threshold values should be viewed as provisional and subject to change. The structure of the requirements is what matters most, and the intention is that the structure be fixed early. Note that computing speed requirements are not covered here.

Stage Output Requirements

  1. Instrument Signature Removal (ISR)
    1. For a subset of simulated images without atmospheric effects, it should be possible to test that the output of ISR agrees with an "ideal" simulated image to within a reasonable tolerance; defining that tolerance will require input from the ImSim team. TBD
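The following is a minimal sketch of the kind of comparison this requirement implies, assuming the ISR output and the "ideal" simulated image are available as plain pixel arrays in the same units. The noise model, metric, and tolerance here are placeholders, since the actual definition is TBD and will come from the ImSim team.

{{{
#!python
import numpy as np

def isr_agreement(isr_output, ideal_image, read_noise, tolerance=0.05):
    """Placeholder comparison of an ISR output against an 'ideal' simulated
    image (no atmosphere).  Metric: RMS of the per-pixel difference,
    normalized by a simple noise estimate; values near 1 mean the two
    images differ only at the expected noise level."""
    ideal = np.asarray(ideal_image, dtype=float)
    diff = np.asarray(isr_output, dtype=float) - ideal
    # Assumed noise model: Poisson term from the ideal image plus read noise.
    sigma = np.sqrt(np.clip(ideal, 0.0, None) + read_noise ** 2)
    normalized_rms = float(np.sqrt(np.mean((diff / sigma) ** 2)))
    return normalized_rms, normalized_rms <= 1.0 + tolerance
}}}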

Comment from DAL: In general, comparing known quantities from simulated input to the output of stages run on that simulated data seems like a pipeline validation activity. However, for a number of reasons it is highly desirable for the SDQA system to be able to process simulated data the same way as real data, so supporting some of this type of activity may be a good fit where the known inputs can play the role of expectations we would reasonably have in operations (e.g., an injected PSF vs. DIMM data to compare against the measured PSF).

  1. Image Characterization Pipeline (ICP)
    1. WCS Definition: The error of a WCS is the RMS distance over a set of stars between the position predicted from the WCS and the catalog value of the position. The set of stars shall be all stars in a catalog of bright, isolated stars whose positions are within the image.
      1. Failure rate: A WCS shall be termed a failure either when the stage that produces the WCS raises an exception, or when the error exceeds Pwcs1 arcsec. The failure rate, averaged over all images in the DC3b input set, shall be less than Pwcs2 %.
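A minimal sketch of how the WCS error and failure rate defined above might be computed, assuming the predicted and catalog positions of the bright, isolated stars are already available as (RA, Dec) pairs in degrees; the function names are illustrative, not existing pipeline APIs.

{{{
#!python
import math

def wcs_error_arcsec(predicted, catalog):
    """RMS angular distance (arcsec) between WCS-predicted and catalog
    positions over the set of bright, isolated stars within the image.
    Each argument is a list of (ra, dec) pairs in degrees."""
    sq_sum = 0.0
    for (ra_p, dec_p), (ra_c, dec_c) in zip(predicted, catalog):
        # The small-angle approximation is adequate for sub-arcsecond residuals.
        d_ra = (ra_p - ra_c) * math.cos(math.radians(dec_c))
        d_dec = dec_p - dec_c
        sq_sum += d_ra ** 2 + d_dec ** 2
    return math.sqrt(sq_sum / len(predicted)) * 3600.0

def wcs_is_failure(raised_exception, error_arcsec, pwcs1):
    """A WCS is a failure if the producing stage raised an exception or the
    error exceeds Pwcs1 arcsec."""
    return raised_exception or error_arcsec > pwcs1

def wcs_failure_rate_percent(results, pwcs1):
    """Failure rate (%) over all images; `results` holds one
    (raised_exception, error_arcsec) pair per image in the DC3b input set."""
    failures = sum(wcs_is_failure(exc, err, pwcs1) for exc, err in results)
    return 100.0 * failures / len(results)
}}}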

Comment from DAL: Based on conversations with Gregory, it seems straightforward for SDQA to assume the database will retain a record of expected data products, and that a pipeline stage throwing an exception and bailing out could mark its child products with a QA status equivalent to "I don't exist" or "I know I am garbage". In this case it would be straightforward for SDQA to maintain statistics on the failure rate.

Comment from KTL: As long as the catalog mentioned in this item is available to the WCS determination stage (either because it is used by WCS determination or because it is supplied separately), it is reasonable for the stage to throw an exception or mark its products as garbage. Note that throwing an exception will generally mean that no products other than log messages will be persisted, to the database or otherwise. SDQA could use the log messages to determine the type/cause of the failure.

Comment from Russ Laher: To facilitate post-mortem analysis, it is highly desirable to persist metadata in the database about failed images. Images can fail for different reasons, and it would be useful to define bit-flags that indicate the precise reason for each failure. Regardless of whether an image fails, it is useful for pipeline-tuning purposes to store in the database the WCS solution and other SDQA metadata that resulted from WCS determination.

More comments from Russ Laher: A WCS should also be termed a failure if there are too few reference-catalog matches, because just a few "good" random matches can result in low RMS values. Also, consider tracking and thresholding the RMS values separately along image axes 1 and 2.

      1. Median error: The median WCS error, taken over all images in the DC3b input set that do not result in WCS failures, shall be less than Pwcs3 mas.
    1. Photometric Zeropoint
      1. For simulated images, the photometric zeropoint determined by the ICP shall agree with the zeropoint applied by the simulator, spatially averaged over the image, to within Pzp1 mag.
    1. PSF Determination
      1. Failure rate: PSF determination failure, as signaled by an exception being raised, shall occur in less than Ppsf1 % of the input images.
      2. Average error: For simulated images, the applied PSF shall be stored at least for the center and the four corners of the image. The determined PSF shall agree with the applied PSF at each of these points to within Ppsf2 %, as measured by a metric that is TBD (a sketch of one possible comparison follows this list).
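Since the PSF-agreement metric is TBD, the following is only one possible reading: a sketch that compares the determined and applied PSFs at the image center and four corners, using the RMS pixel difference as a percentage of the applied PSF's peak. The PSFs are assumed to be available as small pixel arrays; nothing here reflects an actual pipeline interface.

{{{
#!python
import numpy as np

def psf_agreement_percent(determined_psf, applied_psf):
    """Placeholder metric: RMS difference between the determined and applied
    PSF images at one sample point, expressed as a percentage of the applied
    PSF's peak value.  Both PSFs are 2-D arrays normalized to unit sum."""
    d = np.array(determined_psf, dtype=float)
    a = np.array(applied_psf, dtype=float)
    d /= d.sum()
    a /= a.sum()
    return 100.0 * float(np.sqrt(np.mean((d - a) ** 2)) / a.max())

def psf_determination_ok(determined_psfs, applied_psfs, ppsf2):
    """Check agreement at each of the five sample points (image center and
    the four corners), in the order the PSFs were stored."""
    return all(psf_agreement_percent(d, a) <= ppsf2
               for d, a in zip(determined_psfs, applied_psfs))
}}}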

Comment from DAL: It will also be interesting to be able to compare the PSF across a CCD, across the field of view, and as a function of time. Self-consistency of the PSF is a good SDQA task.

    1. Background Determination
      1. Average error: For simulated images, ... (is the background well defined for the simulator? TBD)

Question from DAL: In operations what would SDQA use as an expectation to compare the measured background to, analogous to the injected simulated background? Is there an analog?

  1. Produce Coadd for Detection
    1. Registration of images in stack: TBD

Comment from DAL: I expect there are a number of potential self-diagnostics which are more of a debugging exercise. However, SDQA can and probably should compare the measured PSF (and other quality metrics) of the coadd against the expectations based on the properties of the input images.

Comment from KTL: Is someone measuring the PSF of the coadd?
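If the coadd PSF is in fact measured, the self-consistency check suggested above might reduce to something as simple as the following sketch, which only assumes that a FWHM estimate is available for the coadd and for each input image; the margin is an arbitrary placeholder, not a requirement value.

{{{
#!python
def coadd_psf_consistent(coadd_fwhm, input_fwhms, margin=1.05):
    """A coadd built from these inputs should not have a PSF broader than
    the broadest input image by more than a small margin; a violation
    suggests a registration or PSF-measurement problem."""
    return coadd_fwhm <= margin * max(input_fwhms)
}}}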

  1. Produce Coadd for Subtraction Template
    1. Registration of images in stack: TBD

Comment from DAL: I imagine there will be some process for "blessing" and releasing templates, which will likely include hands-on analysis in addition to whatever information SDQA can compile during construction of the template images. The metrics will probably be similar to those for the other coadds.

Comment from KTL: Is this the only place where there will be manual intervention in the Data Release Production, or (at the other extreme) will there need to be manual intervention after every stage that computes a persistent data product?

    1. PSF characteristics:
      1. PSF width, as defined by FWHM, shall be narrower than that of at least Ptempl1 % of the input images in the stack.
      2. PSF eccentricity: TBD
  1. Image Subtraction: The residual image is defined as the subtracted image divided by the square root of the predicted variance image.
    1. Failure rate: A subtraction shall be termed a failure when a stage that produces the subtraction raises an exception, or when the absolute value of the average residual exceeds Psub1. The failure rate, averaged over all images in the DC3b input set that generate a valid WCS, shall be less than Psub2 %.
    2. Residuals in object footprints: For each object in the template image with magnitude < Psub3 (probably need to make this filter dependent), define the footprint pixels to be those where the amplitude of the PSF centered on the object exceeds Psub4 % of its maximum value. For the union of all object footprint pixels in an image, a subtraction is deemed of bad quality if the fraction of those pixels with abs(resid) > Psub5 exceeds Psub6 %. The fraction of bad-quality subtractions shall be less than Psub7 % (a sketch of these checks appears after this list).
    3. Detect Sources in Subtracted Image
      1. Completeness: For simulated input images, a minimum of Pdet1 % of sources that differ from their template flux by at least Pdet2 times the square root of the variance of the sky flux in a PSF, in both exposures of a visit, shall result in valid DIASources. Note: this is much weaker than expected from Gaussian statistics.
      2. False detection rate: For the same values of the detection parameters as chosen to meet the completeness requirement above, and for simulated images, no more than Pdet3 % of the DIASources generated for a visit shall be false, i.e., not resulting from an actual flux change in a simulated source. Again, this is much weaker than Gaussian statistics would suggest (a sketch of both measurements follows the comment from DAL below).
    4. Deep Detection and Measurement
      1. Completeness: for simulated images, TBD
      2. Accuracy of shape measurements: for simulated images, TBD
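Referring back to the Image Subtraction items above, the following sketch spells out the residual-image definition and the footprint quality check. It assumes the subtracted image, predicted variance image, and per-object PSF amplitude images are plain numpy arrays; that representation is an assumption made for illustration only.

{{{
#!python
import numpy as np

def residual_image(subtracted, variance):
    """Residual image: the subtracted image divided by the square root of
    the predicted variance image."""
    return np.asarray(subtracted, dtype=float) / np.sqrt(np.asarray(variance, dtype=float))

def subtraction_is_failure(raised_exception, resid, psub1):
    """A subtraction fails if a producing stage raised an exception or the
    absolute value of the mean residual exceeds Psub1."""
    return raised_exception or abs(float(np.mean(resid))) > psub1

def footprint_union(psf_images, psub4):
    """Union of object footprints: pixels where the PSF amplitude centered
    on a template object brighter than Psub3 exceeds Psub4 % of its peak.
    `psf_images` holds one full-frame PSF-amplitude array per such object."""
    mask = np.zeros_like(np.asarray(psf_images[0]), dtype=bool)
    for psf in psf_images:
        psf = np.asarray(psf, dtype=float)
        mask |= psf > (psub4 / 100.0) * psf.max()
    return mask

def subtraction_is_bad(resid, footprint_mask, psub5, psub6):
    """Bad quality: more than Psub6 % of footprint pixels have
    |residual| > Psub5."""
    bad_fraction = 100.0 * float(np.mean(np.abs(resid[footprint_mask]) > psub5))
    return bad_fraction > psub6
}}}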

Comment from DAL: I have no doubt that SDQA will be able to track information that will be useful in this analysis, but determining completeness and reliability has a lot to do with comparing to known results. In my opinion this is more of a science-team activity, although the same personnel may in fact be able to do it in operations. In DC3b, I doubt we have the resources to own this.
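For whoever does end up owning this analysis, the bookkeeping itself is simple once DIASources have been matched against the simulation truth; the sketch below assumes that matching has already been done and only counts the results (the data structures are illustrative).

{{{
#!python
def completeness_percent(expected_ids, recovered_ids):
    """Detection completeness (%): `expected_ids` is the set of truth-catalog
    sources whose flux changed by at least Pdet2 x the sky noise in a PSF in
    both exposures of the visit; `recovered_ids` is the subset of those
    sources that were matched to a valid DIASource."""
    return 100.0 * len(expected_ids & recovered_ids) / len(expected_ids)

def false_detection_percent(n_diasources, n_matched_to_truth):
    """False detection rate (%): fraction of DIASources in a visit that do
    not correspond to an actual flux change in a simulated source."""
    return 100.0 * (n_diasources - n_matched_to_truth) / n_diasources
}}}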

  1. Day MOPS
    1. Adopt metrics from PS-MOPS
  1. Association (AP)
    1. Accuracy of association of DIASources to Objects: For simulated images, DIASources resulting from actual flux changes shall be correctly associated with their template Object at least Pap1 % of the time on average (see the sketch further below).
    2. Completeness of MOPS matching: For DIASources that fall within the MOPS-predicted error ellipse of a solar system object, the AP shall match the DIASource to that solar system object at least Pap2 % of the time on average.
    3. Classification correctness: Cosmic rays that result in a DIASource in one exposure of a visit shall be correctly classified by the AP at least Pap3 % of the time on average.
  1. Photometric Calibration
    1. Accuracy of photometry (basically, can do SRD tests): TBD
    2. For simulated images, comparison of derived to input atmospheric model: TBD
    3. Limitations on systematics: TBD

Comment from DAL: I have asked IPAC gurus to weigh in on this.
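Returning to the Association (AP) items above, the accuracy measurement against simulation truth is again mostly bookkeeping; the sketch below assumes two mappings keyed by DIASource ID, which is an illustrative representation rather than the actual Association Pipeline output.

{{{
#!python
def association_accuracy_percent(ap_assignments, truth):
    """Fraction (%) of DIASources arising from real flux changes that the AP
    associated with the correct template Object.  `ap_assignments` maps
    DIASource ID -> Object ID chosen by the AP; `truth` maps DIASource ID ->
    the Object that actually varied in the simulation."""
    correct = sum(1 for dia_id, obj_id in truth.items()
                  if ap_assignments.get(dia_id) == obj_id)
    return 100.0 * correct / len(truth)
}}}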

  1. Astrometric Calibration
    1. Accuracy of astrometric models: TBD
    2. Limits on systematics: TBD
  1. Integrated Database Requirements
    1. The Exposure table shall contain an entry for every exposure processed, and each entry shall include a minimal set of valid metadata (TBD).
    2. The database shall pass consistency checks: e.g., no orphaned Sources or DIASources (a sketch of such checks follows).
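A sketch of the kind of consistency checks intended, written against a generic DB-API cursor; the table and column names (DIASource.objectId, Object.objectId, Exposure) are assumptions about the DC3b schema rather than a statement of it.

{{{
#!python
def orphaned_diasource_count(cursor):
    """Count DIASources whose associated Object row does not exist."""
    cursor.execute("""
        SELECT COUNT(*)
          FROM DIASource d
          LEFT JOIN Object o ON d.objectId = o.objectId
         WHERE o.objectId IS NULL
    """)
    return cursor.fetchone()[0]

def exposures_missing_metadata_count(cursor, required_columns):
    """Count Exposure rows missing any of the minimal metadata columns
    (the minimal set itself is TBD in the requirement above)."""
    predicate = " OR ".join("%s IS NULL" % col for col in required_columns)
    cursor.execute("SELECT COUNT(*) FROM Exposure WHERE %s" % predicate)
    return cursor.fetchone()[0]
}}}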