Last modified on 09/20/2011 07:48:05 PM

DM/buildbot/Weekly_Production

Detailed discussion of the Weekly Production Runs and Not-Runs

  • Arranged sequentially with Most Recent first.
  • Codeset: Trunk indicates the trunk was used for the lsst stack. In the future, a tagged set may be specified.
  • Dataset: Full indicates the complete set of data specified in $SVN/DMS/datarel/pipeline/<date>_weekly.input. Short indicates a test run using a small subset of available image data.
  • To check on the status of an in-progress run, determine <date_time current run> from run details below, then:
    % cd /lsst3/weekly/datarel-runs/wp_trunk_<date_time current run> # eg 2011_0507_160801
    % /home/buildbot/slave/trunkVsTrunk_lsst/work/RunStatus.sh
    % /home/buildbot/slave/trunkVsTrunk_lsst/work/FindAllErrors.sh
    % ~jbosch/runStatus.py -a -r .   
    

For a status overview of all daily buildbot runs see DM/buildbot/Daily_Status
Past Months' Daily_Status pages: May June July

20 Sept 2011

  • Why: Test Trunk Run prior to Full Trunk tagging
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline TBD after run:pipeline_wp_trunk_2011_0920_153202
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0920_153202/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0920_153202
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0920_153202
  • pipeQA: /lsst/public_html/pipeQA/html/dev/buildbot_PT1_2_u_wp_trunk_2011_0920_153202
  • Status: In progress - still generating the pipeQA output.
    • Status summary indicates that there were some problems with
      /lsst3/weekly/datarel-runs/wp_trunk_2011_0920_153202
       Data Available: 502
       Jobs Available: 2
       Jobs Possible: 0
       Jobs In Progress: 8
       Jobs Done: 502
      -----------------------------------------------------------------
       icSrc: 481
       psf: 481
       sdqaCcd: 996
       src: 481
       apCorr: 481
       csv-SourceAssoc: 128
       
       Science_Ccd_Exposure.csv: exists
       Science_Ccd_Exposure_Metadata.csv: exists
       Raw_Amp_To_Snap_Ccd_Exposure.csv: exists
       Snap_Ccd_To_Science_Ccd_Exposure.csv: exists
       sdqa_Rating_ForScienceAmpExposure.csv: exists
       sdqa_Rating_ForScienceCcdExposure.csv: exists
      
    • A check of the error summary shows a medley of problems:
    • 10 of:
      ./work/PT1PipeA_1/Slice0.log:harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
      ./work/PT1PipeA_1/Slice0.log:LsstCppException: 0: lsst::pex::exceptions::LengthErrorException thrown at src/image/ImagePca.cc:425 in double lsst::afw::image::<unnamed>::do_updateBadPixels(const lsst::afw::image::detail::MaskedImage_tag&, const typename lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, const std::vector<double, std::allocator<double> >&, const typename lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, long unsigned int, int) [with ImageT = lsst::afw::image::MaskedImage<float, short unsigned int, float>]
      
    • 6 of:
      ./work/PT1PipeA_1/Slice0.log:harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
      ./work/PT1PipeA_1/Slice0.log:LsstCppException: 0: lsst::pex::exceptions::LengthErrorException thrown at src/image/ImagePca.cc:493 in double lsst::afw::image::<unnamed>::do_updateBadPixels(const lsst::afw::image::detail::MaskedImage_tag&, const typename lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, const std::vector<double, std::allocator<double> >&, const typename lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, long unsigned int, int) [with ImageT = lsst::afw::image::MaskedImage<float, short unsigned int, float>]
      
    • 4 of:
      ./work/PT1PipeC_3/Slice0.log:harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
      ./work/PT1PipeC_3/Slice0.log:OperationalError: disk I/O error
      
    • two random errors:
      ./work/PT1PipeA_2/Slice0.log:harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
      ./work/PT1PipeA_2/Slice0.log:    raise RuntimeError("No candidate PSF sources")
      
      ./work/PT1PipeA_2/Slice0.log:RuntimeError: No candidate PSF sources
      
  • Resolution: When the job completes the pipeQA step, please review the output.
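
The "N of:" counts above can be reproduced from grep-style output. What follows is only a minimal sketch, not the FindAllErrors.sh logic: it assumes each input line has the form "<log path>:<ExceptionName>: <message>" as in the excerpts above, and the function name tally_errors is hypothetical.

```python
import re
from collections import Counter

# Assumed line shape: "<log path>:<ExceptionName>: <message>".
ERROR_LINE = re.compile(
    r"^(?P<log>[^:]+):(?P<exc>\w+(?:Error|Exception)): (?P<msg>.*)$"
)
# C++ exceptions also report the class and the source location of the throw.
THROWN_AT = re.compile(r"(?P<cls>[\w:]+) thrown at (?P<where>\S+:\d+)")


def tally_errors(lines):
    """Count errors, keyed by (exception name, throw site or message)."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.match(line.strip())
        if match is None:
            # Traceback headers and "raise ..." source lines are skipped.
            continue
        thrown = THROWN_AT.search(match.group("msg"))
        if thrown:
            # Group C++ exceptions by class and file:line, not full signature.
            key = (thrown.group("cls").split("::")[-1], thrown.group("where"))
        else:
            key = (match.group("exc"), match.group("msg"))
        counts[key] += 1
    return counts
```

Keying on the throw site keeps the ImagePca.cc:425 and ImagePca.cc:493 failures separate even though both raise LengthErrorException.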

10 Sep 2011

13:38

  • Why: New dataset of ~100 different realizations with exactly the same observing conditions (as last run), but with different seeds for the random number generator.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline TBD after run: pipeline_wp_tags_2011_0910_133828
  • Codeset: tags; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0910_133828/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0910_133828
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0910_133828
  • pipeQA: /lsst/public_html/pipeQA/html/dev/buildbot_PT1_2_u_wp_trunk_2011_0910_133828
  • Status: The run processed 101 of 101 records but failed in the first post-processing step, Source Association.
    • Error found in /lsst3/weekly/datarel-runs/wp_tags_2011_0910_133828/SourceAssoc_ImSim.log:
      lsst.ap.cluster.optics: Created k-d tree for 16019 sources
      lsst.ap.cluster.optics: Clustering sources using OPTICS
      lsst.ap.cluster.optics: Produced 286 clusters
      SimpleStageTester.lsst.ap.cluster: Finished clustering sources
      SimpleStageTester.lsst.ap.cluster: Creating good source histogram
      SimpleStageTester.lsst.ap.cluster: Computing source cluster attributes
      Traceback (most recent call last):
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/datarel/4.4.0.11/bin/sst/SourceAssoc_ImSim.py", line 123, in <module>
          main()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/datarel/4.4.0.11/bin/sst/SourceAssoc_ImSim.py", line 120, in main
          lsstSimMain(sourceAssocProcess, "source", "skyTile")
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/datarel/4.4.0.11/python/lsst/datarel/utils.py", line 223, in lsstSimMain
          skyTile=skyTile)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/datarel/4.4.0.11/bin/sst/SourceAssoc_ImSim.py", line 66, in sourceAssocProcess
          clip = sourceAssocPipe(srcList, calexpMdList, skyTile)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/datarel/4.4.0.11/bin/sst/SourceAssoc_ImSim.py", line 108, in sourceAssocPipe
          """, clip)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/datarel/4.4.0.11/python/lsst/datarel/utils.py", line 48, in runStage
          return sst.runWorker(clip)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.4.0.1/python/lsst/pex/harness/simpleStageTester.py", line 189, in runWorker
          stage.applyProcess()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.4.0.1/python/lsst/pex/harness/stage.py", line 353, in applyProcess
          self.process(clipboard)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/python/lsst/ap/cluster/sourceClusterAttributesStage.py", line 113, in process
          gaussianFluxIgnoreMask, ellipticityIgnoreMask)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/python/lsst/ap/cluster/clusterLib.py", line 973, in computeAttributes
          return _clusterLib.SourceClusterAttributes_computeAttributes(*args)
      lsst.pex.exceptions.exceptionsLib.LsstCppException: 0: lsst::pex::exceptions::InvalidParameterException thrown at src/cluster/SourceCluster.cc:1296 in void lsst::ap::cluster::SourceClusterAttributes::setObsTime(double, double, double)
      0: Message: mean observation time is not between earliest and latest observation time
      
  • Resolution:
    • Serge needs to solve this issue and report back on if and when we can continue with the remaining post-processing steps, which load the various databases and run pipeQA.
      • Serge resolved the issue and built new Release and Trunk versions of ap.

9 Sep 2011

21:35

  • Why: New dataset of ~100 different realizations with exactly the same observing conditions (as last run), but with different seeds for the random number generator.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_wp_tags_2011_0909_213525
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_tags_2011_0909_213525/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0909_213525
  • Database: buildbot_PT1_2_u_wp_tags_2011_0909_213525
  • pipeQA: /lsst/public_html/pipeQA/html/dev/buildbot_PT1_2_u_wp_tags_2011_0909_213525
  • Status: Failed; the input data was not found.
  • Resolution: Remove all run detritus.
    • Issue: only E000 raw data was provided, but main-Imsim.paf expects paired visits. The run failed because none of the requests for the second image of a pair could be satisfied.

A sample from work/PT1PipeA_1/Slice0.log:

LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827199, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827203, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827160, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827158, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827183, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827168, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827147, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827198, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827182, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827177, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827209, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827144, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827161, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827212, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827231, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827149, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827229, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827196, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827171, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827169, 'channel': '0,1'}
LsstException: Timed out waiting for dataset raw with keys {'snap': 1, 'raft': '2,2', 'sensor': '1,1', 'visit': 88827164, 'channel': '0,1'}
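
To confirm that only the second snap is missing, the timeout messages can be grouped by their 'snap' key. This is a minimal sketch, assuming the message text looks exactly like the sample above; the function name missing_by_snap is hypothetical and not part of any buildbot script.

```python
import ast
import re
from collections import defaultdict

# Matches the timeout message and captures the printed dictionary of keys.
TIMEOUT = re.compile(
    r"Timed out waiting for dataset (?P<dataset>\w+) with keys (?P<keys>\{.*\})"
)


def missing_by_snap(lines):
    """Map each snap number to the set of visits that timed out for it."""
    missing = defaultdict(set)
    for line in lines:
        match = TIMEOUT.search(line)
        if match is None:
            continue
        # The keys are printed as a Python dict literal, so parse them safely.
        keys = ast.literal_eval(match.group("keys"))
        missing[keys["snap"]].add(keys["visit"])
    return dict(missing)
```

If the result has entries only under snap 1, as in the sample, that is consistent with the second image of each pair being absent from the input.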

8 Sep 2011

17:17

  • Why: Rerun of the One-Off run, which failed with so many errors.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline TBD after run: wp_tags_2011_0908_171709
  • Codeset: tags; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0908_171709/config/weekly.tags
  • Dataset: 999 of the 4500+ records, taken as a test subset
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0908_171709
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0908_171709
  • Status:
    • Pipeline completed, pipeQA still processing. 997 out of 999 processed.
    • Error recorded in:
      • Record: 905078821 1,2 0,2 ------ 80 reads, 2 writes, 0 calexp persisted
      • Info: from work/PT1PipeA_1/Slice0.log: LOOPNUM: 184
        harness.slice.PsfDeterminationStage - parallel: Estimating PSF is in process
          RUNID: wp_tags_2011_0908_171709
          JobId: unknown
          DATE: 2011-09-09T09:22:27.186797
          workerid: -1
          TIMESTAMP: 1315560181186797000
          LEVEL: 0
          SLICEID: 0
          PIPELINE: main-ImSim
        
        harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.4.0.1/python/lsst/pex/harness/Slice.py", line 575, in tryProcess
            stageObject.applyProcess()
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.4.0.1/python/lsst/pex/harness/stage.py", line 353, in applyProcess
            self.process(clipboard)
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_pipeline/4.4.0.1/python/lsst/meas/pipeline/psfDeterminationStage.py", line 72, in process
            psf, psfCellSet = self.psfDeterminer.determinePsf(exposure, psfCandidateList, sdqaRatings)
          File "/lsst/DC3/stacks/gcc443/15oct2010/EupsBuildDir/Linux64/meas_algorithms-4.4.1.1/meas_algorithms-4.4.1.1/python/lsst/meas/algorithms/pcaPsfDeterminer.py", line 352, in determinePsf
          File "/lsst/DC3/stacks/gcc443/15oct2010/EupsBuildDir/Linux64/meas_algorithms-4.4.1.1/meas_algorithms-4.4.1.1/python/lsst/meas/algorithms/pcaPsfDeterminer.py", line 63, in _fitPsf
          File "/lsst/DC3/stacks/gcc443/15oct2010/EupsBuildDir/Linux64/meas_algorithms-4.4.1.1/meas_algorithms-4.4.1.1/python/lsst/meas/algorithms/algorithmsLib.py", line 975, in createKernelFromPsfCandidates
        LsstCppException: 0: lsst::pex::exceptions::LengthErrorException thrown at src/image/ImagePca.cc:493 in double lsst::afw::image::<unnamed>::do_updateBadPixels(const lsst::afw::image::detail::MaskedImage_tag&, const typename lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, const std::vector<double, std::allocator<double> >&, const typename lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, long unsigned int, int) [with ImageT = lsst::afw::image::MaskedImage<float, short unsigned int, float>]
        0: Message: You only have 1 eigen images (you asked for 3)
        
    • Error recorded in:
      • Record: 925722281 4,1 1,0 ------ 80 reads, 2 writes, 0 calexp persisted
      • Info: from work/PT1PipeB_1/Slice0.log : LOOPNUM: 84
        harness.slice.PsfDeterminationStage - parallel: Estimating PSF is in process
          RUNID: wp_tags_2011_0908_171709
          JobId: unknown
          DATE: 2011-09-09T04:13:43.148007
          workerid: -1
          TIMESTAMP: 1315541657148007000
          LEVEL: 0
          SLICEID: 0
          PIPELINE: main-ImSim
        
        harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.4.0.1/python/lsst/pex/harness/Slice.py", line 575, in tryProcess
            stageObject.applyProcess()
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.4.0.1/python/lsst/pex/harness/stage.py", line 353, in applyProcess
            self.process(clipboard)
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_pipeline/4.4.0.1/python/lsst/meas/pipeline/psfDeterminationStage.py", line 72, in process
            psf, psfCellSet = self.psfDeterminer.determinePsf(exposure, psfCandidateList, sdqaRatings)
          File "/lsst/DC3/stacks/gcc443/15oct2010/EupsBuildDir/Linux64/meas_algorithms-4.4.1.1/meas_algorithms-4.4.1.1/python/lsst/meas/algorithms/pcaPsfDeterminer.py", line 265, in determinePsf
        IndexError: invalid index
        
    • Two errors recorded in: ingestSourceAssoc.log
      • lsst.ap.match: Opening position table /lsst3/weekly/datarel-runs/wp_tags_2011_0908_171709/csv-SourceAssoc/objDump.tsv
        lsst.ap.match: Scanning position table to determine min/max epoch of input positions
        lsst.ap.match: Scanned 7472 records
        lsst.ap.match:   - time range is [50508.352, 52897.215] MJD
        lsst.ap.match: Opening reference catalog /lsst3/weekly/datarel-runs/wp_tags_2011_0908_171709/csv-SourceAssoc/refFilt.csv
        lsst.ap.match: Scanning reference catalog to determine time-range and/or maximum velocity/parallax
        lsst.ap.match: Scanned 1490372 records
        lsst.ap.match:   - time range is [51544.500, 51544.500] MJD
        lsst.ap.match:   - max parallax is 33.900 milliarcsec
        lsst.ap.match:   - max angular velocity is 471.345 milliarcsec/yr
        lsst.ap.match:   - read-ahead is 1813.441 milliarcsec
        
        lsst.ap.match: Starting reference catalog to position table match
        Traceback (most recent call last):
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/bin/qa/refPosMatch.py", line 121, in <module>
            main()
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/bin/qa/refPosMatch.py", line 118, in main
            refPolicy, posPolicy, matchPolicy)
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/python/lsst/ap/match/matchLib.py", line 953, in referenceMatch
            return _matchLib.referenceMatch(*args)
        lsst.pex.exceptions.exceptionsLib.LsstCppException: 0: lsst::pex::exceptions::RuntimeErrorException thrown at src/match/ReferenceMatch.cc:949 in const lsst::ap::match::ReferencePosition* lsst::ap::match::<unnamed>::RefReaderBase::_readReferencePosition()
        0: Message: Reference catalog is not sorted by declination
        
        *
        lsst.ap.match: Matching reference catalog to position table...
        lsst.ap.match: Opening position table /lsst3/weekly/datarel-runs/wp_tags_2011_0908_171709/csv-SourceAssoc/srcDump.tsv
        lsst.ap.match: Scanning position table to determine min/max epoch of input positions
        lsst.ap.match: Scanned 1662852 records
        lsst.ap.match:   - time range is [49932.417, 52938.004] MJD
        lsst.ap.match: Opening reference catalog /lsst3/weekly/datarel-runs/wp_tags_2011_0908_171709/csv-SourceAssoc/refFilt.csv
        lsst.ap.match: Scanning reference catalog to determine time-range and/or maximum velocity/parallax
        lsst.ap.match: Scanned 1490372 records
        lsst.ap.match:   - time range is [51544.500, 51544.500] MJD
        lsst.ap.match:   - max parallax is 33.900 milliarcsec
        lsst.ap.match:   - max angular velocity is 471.345 milliarcsec/yr
        lsst.ap.match:   - read-ahead is 2148.148 milliarcsec
        lsst.ap.match: Starting reference catalog to position table match
        Traceback (most recent call last):
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/bin/qa/refPosMatch.py", line 121, in <module>
            main()
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/bin/qa/refPosMatch.py", line 118, in main
            refPolicy, posPolicy, matchPolicy)
          File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ap/4.4.2.0/python/lsst/ap/match/matchLib.py", line 953, in referenceMatch
            return _matchLib.referenceMatch(*args)
        lsst.pex.exceptions.exceptionsLib.LsstCppException: 0: lsst::pex::exceptions::RuntimeErrorException thrown at src/match/ReferenceMatch.cc:949 in const lsst::ap::match::ReferencePosition* lsst::ap::match::<unnamed>::RefReaderBase::_readReferencePosition()
        0: Message: Reference catalog is not sorted by declination
        
  • Resolution:
    • Start next run after clearing the disk of the 2 Sep 2011 failed run.
    • Developers should review the exceptions noted.
      • Serge fixed the ingestSourceAssoc issue on the trunk.
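
Before rerunning the match, a reference catalog can be checked for the ordering that refPosMatch.py complains about. This is a minimal sketch under stated assumptions: the catalog is a headerless CSV, ascending declination is the required order, and the declination column index is supplied by the caller. Neither helper name comes from the ap package, and the real layout of refFilt.csv is not documented on this page.

```python
import csv


def first_unsorted_row(rows, dec_column):
    """Return the 1-based number of the first row whose declination is
    lower than the previous row's, or None if the rows are ascending."""
    previous = None
    for row_number, row in enumerate(rows, start=1):
        dec = float(row[dec_column])
        if previous is not None and dec < previous:
            return row_number
        previous = dec
    return None


def check_catalog(path, dec_column):
    """Stream a CSV catalog from disk and report the first unsorted row."""
    with open(path, newline="") as handle:
        return first_unsorted_row(csv.reader(handle), dec_column)
```

The check streams the file row by row, so it stays cheap in memory even for a catalog of about 1.5 million records like the one scanned above.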

2 Sep 2011

22:25

  • Why: One-off run for Simon regarding Dave Monet's astrometry puzzles
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_wp_tags_2011_0902_222518
  • Codeset: tagged; see /lsst3/weekly/datarel-runs/wp_tags_2011_0902_222518/config/weekly.tags
  • Dataset: one-off of sensors selected by Simon
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0902_222518
  • Database: buildbot_PT1_2_u_wp_tags_2011_0902_222518
  • Status:
    • About 50% of the sensors failed to process. Question 1: why did so many sensors fail when the software should have been the same as for the 3000 runs? I suspect, but have not verified, that the set of Current Released tags on lsst6 has been updated, so the local cluster and TeraGrid cluster software stacks are no longer equivalent.
    • The pipeQA postprocessing run is in progress at this moment: Sun Sep 4 11:20am PT. PipeQA is now automatically invoked by buildbot.
    • The breakdown of the errors is as follows:
      (1) bad_alloc : 1844
       - work/PT1PipeA_2/Slice0.log:Exception: std::bad_alloc
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ip_pipeline/4.4.0.0/python/lsst/ip/pipeline/isrCcdDefectStage.py", line 105, in process
          ipIsr.interpolateDefectList(exposure, defectList, fwhm)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/ip_isr/4.4.0.0+1/python/lsst/ip/isr/isr.py", line 286, in interpolateDefectList
          fallbackValue = afwMath.makeStatistics(mi.getImage(), afwMath.MEANCLIP).getValue()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/afw/4.4.4.0/python/lsst/afw/math/mathLib.py", line 5169, in makeStatistics
          return _mathLib.makeStatistics(*args)
      Exception: std::bad_alloc
      
      (2) MemoryError : 37
       - work/PT1PipeA_2/Slice0.log:MemoryError
      
      (3) IndexError : 1
       - work/PT1PipeC_4/Slice0.log:IndexError: invalid index
      
      (4) RuntimeErrorException : 75 from src/net/GlobalAstrometrySolution.cc:1264
       - work/PT1PipeA_4/Slice0.log:LsstCppException: 0: 
      lsst::pex::exceptions::RuntimeErrorException 
      thrown at src/net/GlobalAstrometrySolution.cc:1264 
      in std::vector<double, std::allocator<double> > 
      lsst::meas::astrom::net::getTagAlongFromIndex
      (index_t*, std::string, int*, int)
      
             : 2 from src/match/ReferenceMatch.cc:949
       - ingestSourceAssoc.log:lsst.pex.exceptions.exceptionsLib.LsstCppException: 
      0: lsst::pex::exceptions::RuntimeErrorException thrown at 
      src/match/ReferenceMatch.cc:949 in const 
      lsst::ap::match::ReferencePosition* 
      lsst::ap::match::<unnamed>::RefReaderBase::_readReferencePosition()
      
      (5) LsstCppException : 61 from include/lsst/afw/image/fits/fits_io_private.h:232
       - work/PT1PipeA_2/Slice0.log:LsstCppException: 0: 
      lsst::pex::exceptions::LogicErrorException 
      thrown at include/lsst/afw/image/fits/fits_io_private.h:232 in 
      lsst::afw::image::detail::fits_file_mgr::fits_file_mgr
      (const std::string&, const std::string&)
       - work/PT1PipeA_2/Slice0.log:0: Message: cfitsio error 
      (/lsst3/weekly/datarel-runs/wp_tags_2011_0902_222518/input/raw/v918388101-fg/
      E001/R14/S11/imsim_918388101_R14_S11_C16_E001.fits.gz): 
      could not open the named file (104)
      
      (6) LsstCppException : 1  from src/image/ImagePca.cc:493
       - work/PT1PipeC_4/Slice0.log:LsstCppException: 0: 
      lsst::pex::exceptions::LengthErrorException 
      thrown at src/image/ImagePca.cc:493 in double 
      lsst::afw::image::<unnamed>::
      do_updateBadPixels(const lsst::afw::image::detail::MaskedImage_tag&, 
      const typename lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, 
      const std::vector<double, 
      std::allocator<double> >&, const typename 
      lsst::afw::image::ImagePca<MaskedImageT>::ImageList&, 
      long unsigned int, int) [with ImageT = 
      lsst::afw::image::MaskedImage<float, short unsigned int, float>]
      
      (7) read_array_into : 69 from fitstable.c:835:read_array_into
       - work/PT1PipeA_4/launch.log:fitstable.c:835:read_array_into: 
      Failed to read column from FITS file
      
      (8) startree_get_data_column : 75 from starkd.c:80:startree_get_data_column
       - work/PT1PipeA_4/launch.log:starkd.c:80:startree_get_data_column: 
      Failed to read tag-along data
      
    • The following Run Status logs/summaries are all rooted at: /lsst3/weekly/datarel-runs/wp_tags_2011_0902_222518/
      • ProcessedRecords.out : Jim Bosch's post-processing summary of FITS data read and written. Note that a successfully processed record should have an entry like: "862826551 4,2 2,2 ------ 80 reads, 4 writes".
      • FindAllErrors_Stripped.out : one line per error statement in the logs. It is just a shell script, so it is only as good as its set of search patterns.
      • Errors/* : errors categorized by type, one file per type. Each entry is a block of ~20 lines from the error log before and after the error report; entries are separated by '--'.
      • RunStatus.out : counts of job office input records and of key output records generated.
  • Resolution: Errors 1-6 are mostly application issues, but items 7 and 8 might need attention from middleware. Ticket #1757 has been generated and given to RHL to distribute as appropriate. Copies to SteveP and Simon.
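
The ProcessedRecords.out line format quoted above lends itself to a quick scan for incomplete records. This is a minimal sketch, not Jim Bosch's runStatus.py: it assumes the three leading columns are visit, raft and sensor, and that 4 writes marks a fully processed record, as the example entry suggests. The function name incomplete_records is hypothetical.

```python
import re

# Assumed layout: "<visit> <raft> <sensor> ------ <N> reads, <M> writes[, ...]".
RECORD = re.compile(
    r"^(?P<visit>\d+)\s+(?P<raft>\d,\d)\s+(?P<sensor>\d,\d)\s+-+\s+"
    r"(?P<reads>\d+) reads, (?P<writes>\d+) writes"
)


def incomplete_records(lines, expected_writes=4):
    """Return (visit, raft, sensor, writes) for records with too few writes."""
    incomplete = []
    for line in lines:
        match = RECORD.match(line.strip())
        if match is None:
            # Ignore anything that does not look like a record line.
            continue
        writes = int(match.group("writes"))
        if writes < expected_writes:
            incomplete.append(
                (int(match.group("visit")), match.group("raft"),
                 match.group("sensor"), writes)
            )
    return incomplete
```

Applied to the two failed records from the 8 Sep 2011 run (80 reads, 2 writes), this would list both, while the example entry with 4 writes would pass.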