Last modified on 06/09/2011 09:19:57 PM

Buildbot Weekly Production Status for May 2011

28 May 2011

2246

  • Why: Test the latest trunk revisions.
    • Note: SFM-sourceMeasure.paf had SHAPELET_MODEL_8 enabled. See the archived pipeline policies at: /lsst3/weekly/datarel-runs/wp_trunk_2011_0528_224638/work/PT1Pipe/*
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0528_224638
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0528_224638/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0528_224638
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0528_224638
  • Status:
    • 266 records processed to completion, 1 failed due to error.
    • Single Memory exception logged in work/PT1PipeA_1/Slice0.log.
      • Data being processed: Processing job: type=calexp sensor=0,1 visit=885335911 raft=1,2
        harness.slice.iostage.output: will load isrExposure01_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,1'}
        harness.slice.iostage.output: will load isrExposure05_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,5'}
        harness.slice.iostage.output: will load isrExposure05_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,5'}
        harness.slice.iostage.output: will load isrExposure04_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,4'}
        harness.slice.iostage.output: will load isrExposure03_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,3'}
        harness.slice.iostage.output: will load isrExposure06_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,6'}
        harness.slice.iostage.output: will load isrExposure06_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,6'}
        harness.slice.iostage.output: will load isrExposure02_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,2'}
        harness.slice.iostage.output: will load isrExposure04_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,4'}
        harness.slice.iostage.output: will load isrExposure01_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,1'}
        harness.slice.iostage.output: will load isrExposure03_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,3'}
        harness.slice.iostage.output: will load isrExposure15_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,5'}
        harness.slice.iostage.output: will load isrExposure15_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,5'}
        harness.slice.iostage.output: will load isrExposure00_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,0'}
        harness.slice.iostage.output: will load isrExposure00_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,0'}
        harness.slice.iostage.output: will load isrExposure14_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,4'}
        harness.slice.iostage.output: will load isrExposure16_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,6'}
        harness.slice.iostage.output: will load isrExposure16_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,6'}
        harness.slice.iostage.output: will load isrExposure14_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,4'}
        harness.slice.iostage.output: will load isrExposure12_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,2'}
        harness.slice.iostage.output: will load isrExposure12_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,2'}
        harness.slice.iostage.output: will load isrExposure17_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,7'}
        harness.slice.iostage.output: will load isrExposure17_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,7'}
        harness.slice.iostage.output: will load isrExposure02_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,2'}
        harness.slice.iostage.output: will load isrExposure10_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,0'}
        harness.slice.iostage.output: will load isrExposure10_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,0'}
        harness.slice.iostage.output: will load isrExposure13_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,3'}
        harness.slice.iostage.output: will load isrExposure13_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,3'}
        harness.slice.iostage.output: will load isrExposure11_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,1'}
        harness.slice.iostage.output: will load isrExposure11_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '1,1'}
        harness.slice.iostage.output: will load isrExposure07_0 from raw with keys {'snap': 0, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,7'}
        harness.slice.iostage.output: will load isrExposure07_1 from raw with keys {'snap': 1, 'raft': '1,2', 'sensor': '0,1', 'visit': 885335911, 'channel': '0,7'}
        
      • Traceback
        harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
          File "/home/buildbot/buildbotSandbox/Linux64/pex_harness/svn20013/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
            stageObject.applyProcess()
          File "/home/buildbot/buildbotSandbox/Linux64/pex_harness/svn20013/python/lsst/pex/harness/stage.py", line 353, in applyProcess
            self.process(clipboard)
          File "/home/buildbot/buildbotSandbox/Linux64/ctrl_sched/svn20014/python/lsst/ctrl/sched/pipeline.py", line 511, in process
            self.tellJobDone(clipboard)
          File "/home/buildbot/buildbotSandbox/Linux64/ctrl_sched/svn20014/python/lsst/ctrl/sched/pipeline.py", line 498, in tellJobDone
            self.tellDataReady(clipboard)
          File "/home/buildbot/buildbotSandbox/Linux64/ctrl_sched/svn20014/python/lsst/ctrl/sched/pipeline.py", line 441, in tellDataReady
            possible = client.tellDataReady(possible, completed)
          File "/home/buildbot/buildbotSandbox/Linux64/ctrl_sched/svn20014/python/lsst/ctrl/sched/pipeline.py", line 207, in tellDataReady
            self.dataSender.createDatasetEvent(self.name, report, fullsuccess))
          File "/home/buildbot/buildbotSandbox/Linux64/ctrl_sched/svn20014/python/lsst/ctrl/sched/utils.py", line 123, in send
            self.trxr.publishEvent(event.create())
          File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_events_21011/python/lsst/ctrl/events/eventsLib.py", line 939, in publishEvent
            return _eventsLib.EventTransmitter_publishEvent(*args)
        LsstCppException: 0: lsst::pex::exceptions::MemoryException thrown at src/Citizen.cc:332 in long unsigned int lsst::daf::base::defaultCorruptionCallback(const lsst::daf::base::Citizen*)
        0: Message: Citizen "4880224: 0x2aaaafd6aa68 lsst::daf::base::PropertySet" is corrupted
        
        
  • Resolution:
    • Ready for Analysts.
    • Needs review of memory exception.
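
The Setup, Output, and Database entries on this page all appear to derive from one run identifier of the form wp_<codeset>_<YYYY>_<MMDD>_<HHMMSS>, and each run's four-digit heading matches the HHMM of that timestamp. The sketch below spells out that convention as inferred from the entries here only; the helper name `describe_run` and the two constants are hypothetical and are not taken from the buildbot scripts.

```python
from datetime import datetime

# Values as they appear under "Output" and "Database" on this page.
OUTPUT_ROOT = "/lsst3/weekly/datarel-runs"
DB_PREFIX = "buildbot_PT1_2_u_"


def describe_run(run_id):
    """Split a run id such as 'wp_trunk_2011_0528_224638' into its parts.

    Hypothetical helper: the pattern is inferred from this page only.
    """
    prefix, codeset, year, monthday, time = run_id.split("_")
    if prefix != "wp":
        raise ValueError("unexpected run id: %r" % run_id)
    started = datetime.strptime(year + monthday + time, "%Y%m%d%H%M%S")
    return {
        "codeset": codeset,                    # 'trunk' or 'tags'
        "started": started,
        "heading": started.strftime("%H%M"),   # four-digit heading used per run
        "output": "%s/%s" % (OUTPUT_ROOT, run_id),
        "database": DB_PREFIX + run_id,
    }


info = describe_run("wp_trunk_2011_0528_224638")
print(info["heading"], info["output"], info["database"])
```

Checking an entry against this pattern is a quick way to spot fields that were copied from a neighbouring run.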

1309

  • Why: Test interaction between recently tagged packages during proto-production run
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0528_130958
  • Codeset: tag; see /lsst3/weekly/datarel-runs/wp_tags_2011_0528_130958/config/weekly.tags
  • Dataset: full, new set of 3 visits, new registry, newish calibration files
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0528_130958
  • Database: buildbot_PT1_2_u_wp_tags_2011_0528_130958
  • Status: 567 records processed successfully. No errors reported.
  • Resolution:
    • Analysts, do your job. Remember this is the tagged stack.
    • Greg -- ensure that you use the datarel package just tagged by Robert so that $DATAREL_DIR/PT1Pipe/SFM-sourceMeasure.paf is updated. I used a backdoor method to effect the required change for the weekly tagged run.

1236

  • Why: Test interaction between recently tagged packages during proto-production run
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0528_123642
  • Codeset: tag; see /lsst3/weekly/datarel-runs/wp_tags_2011_0528_123642/config/weekly.tags
  • Dataset: debug, new set of 3 visits, new registry, newish calibration files
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0528_123642
  • Database: buildbot_PT1_2_u_wp_tags_2011_0528_123642
  • Status: Successful run to completion. 10 records only. No errors noted.
  • Resolution: Will start the full tag run shortly

27 May 2011

2025

  • Why: Test interaction between recently tagged packages during proto-production run
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0527_202528
  • Codeset: tag; see /lsst3/weekly/datarel-runs/wp_tags_2011_0527_202528/config/weekly.tags
  • Dataset: short, debug
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0527_202528
  • Database: buildbot_PT1_2_u_wp_tags_2011_0527_202528
  • Status: Failed to generate any 'src'.
    • The following raw input data is implicated:
      raw visit=886294741 raft=0,2 sensor=0,2
      raw visit=886294741 raft=3,3 sensor=1,0
      raw visit=886294741 raft=3,3 sensor=1,2
      raw visit=886294741 raft=3,3 sensor=2,2
      raw visit=886294741 raft=3,3 sensor=0,0
      raw visit=886294741 raft=3,3 sensor=2,1
      raw visit=886294741 raft=3,3 sensor=1,1
      raw visit=886294741 raft=3,3 sensor=2,0
      raw visit=886294741 raft=3,3 sensor=0,1
      raw visit=886294741 raft=3,3 sensor=0,2
      
    • Errors recorded in files: work/PT1PipeA_2/Slice0.log work/PT1PipeB_3/Slice0.log work/PT1PipeB_2/Slice0.log work/PT1PipeA_1/Slice0.log PT1PipeB_1/Slice0.log. All errors are equivalent to the following:
      harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
          stageObject.applyProcess()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/stage.py", line 353, in applyProcess
          self.process(clipboard)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_pipeline/4.3.1.2/python/lsst/meas/pipeline/sourceMeasurementStage.py", line 89, in process
          sourceSet = srcMeas.sourceMeasurement(exposure, psf, footprintLists, measurePolicy)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_utils/4.3.1.1/python/lsst/meas/utils/sourceMeasurement.py", line 60, in sourceMeasurement
          measureSources = measAlg.makeMeasureSources(exposure, measSourcePolicy)
        File "/lsst/DC3/stacks/gcc443/15oct2010/EupsBuildDir/Linux64/meas_algorithms-4.3.2.0/meas_algorithms-4.3.2.0/python/lsst/meas/algorithms/algorithmsLib.py", line 1833, in makeMeasureSources
      LsstCppException: 0: lsst::pex::exceptions::NotFoundException thrown at /lsst/DC3/stacks/gcc443/15oct2010/Linux64/afw/4.3.3.0/include/lsst/afw/detection/Measurement.h:581 in static std::pair<boost::shared_ptr<T> (*)(typename ImageT::ConstPtr, boost::shared_ptr<const PeakT>, boost::shared_ptr<const lsst::afw::detection::Source>), bool (*)(const lsst::pex::policy::Policy&)> lsst::afw::detection::MeasureQuantity<T, ImageT, PeakT>::_registryWorker(const std::string&, boost::shared_ptr<T> (*)(typename ImageT::ConstPtr, boost::shared_ptr<const PeakT>, boost::shared_ptr<const lsst::afw::detection::Source>), bool (*)(const lsst::pex::policy::Policy&)) [with T = lsst::afw::detection::Photometry, ImageT = lsst::afw::image::Exposure<float, short unsigned int, float>, PeakT = lsst::afw::detection::Peak]
      0: Message: Unknown algorithm SHAPELET_MODEL_8 for image of type lsst::afw::image::Exposure<floatushortfloat>
      
  • Resolution: Ticket #1692 issued; Review policies being used by sourceMeasurement.

1603

  • Why: Test interaction between recently tagged packages during proto-production run
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0527_160316
  • Codeset: tag; see /lsst3/weekly/datarel-runs/wp_tags_2011_0527_202528/config/weekly.tags
  • Dataset: short, debug
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0527_202528
  • Database: buildbot_PT1_2_u_wp_tags_2011_0527_202528
  • Status: Every slice that was started subsequently failed during pipeline setup.
    • Errors recorded in files: work/PT1PipeA_2/launch.log work/PT1PipeB_3/launch.log work/PT1PipeB_2/launch.log work/PT1PipeA_1/launch.log work/PT1PipeB_1/launch.log.
    • Error in each slice was equivalent to the following. No raw input data was involved.
      Exception in thread Thread-2:
      Traceback (most recent call last):
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/external/python/2.5.2/lib/python2.5/threading.py", line 486, in __bootstrap_inner
          self.run()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/SliceThread.py", line 79, in run
          self.pySlice.initializeStages()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/Slice.py", line 374, in initializeStages
          stageObject = StageClass(stagePolicy, self.log, self.eventBrokerHost, sysdata)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/stage.py", line 339, in __init__
          self.setup()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_pipeline/4.3.1.1/python/lsst/meas/pipeline/sourceMeasurementStage.py", line 57, in setup
          self.policy.mergeDefaults(defPolicy.getDictionary())
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_policy/4.3.0.1/python/lsst/pex/policy/policyLib.py", line 1021, in mergeDefaults
          return _policyLib.Policy_mergeDefaults(*args)
      LsstCppException: Validation error (3 errors):
        * measureSources.astrometry: no value available for required parameter
        * measureSources.shape: no value available for required parameter
        * measureSources.source.astrom: no value available for required parameter
      
      
  • Resolution: K-T provided a new and tagged meas_pipeline, Ray installed it.

25 May 2011

2301

  • Why: Do the Tuesday Weekly Production Run with new data, new calibration, and new astrometry catalogs.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_wp_trunk_2011_0525_230118
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0525_230118/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0525_230118
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0525_230118
  • Status: Completed with major caveats:
    • Two complete visits were inaccessible due to unexpected removal of symlink source on disk;
      • they have since been restored by Simon
      • 189 raw images were skipped due to this error
    • Five glibc failures leading to termination of each pipeline slice. Backtraces in logs as follows:
      • work/PT1PipeA_2/launch.log
        • Backtrace starting at line 19 continuing thru line 742
          *** glibc detected *** python: free(): corrupted unsorted chunks: 0x00002aabef423af0 ***
          ======= Backtrace: =========
          /lib64/libc.so.6[0x3c3c47245f]
          /lib64/libc.so.6(cfree+0x4b)[0x3c3c4728bb]
          /home/buildbot/buildbotSandbox/Linux64/meas_multifit/svn21902/python/lsst/meas/multifit/_multifitLib.so(_ZN4lsst4meas8multifit6detail9FrameBaseD2Ev+0x53)[0x2aaab86f5b83]
          
          ...<snip>
          
      • work/PT1PipeB_3/launch.log
        • Backtrace starting at line 19 continuing thru line 771
          *** glibc detected *** python: free(): corrupted unsorted chunks: 0x00002aacd21929f0 ***
          ======= Backtrace: =========
          /lib64/libc.so.6[0x3bc607245f]
          /lib64/libc.so.6(cfree+0x4b)[0x3bc60728bb]
          /home/buildbot/buildbotSandbox/Linux64/meas_multifit/svn21902/python/lsst/meas/multifit/_multifitLib.so(_ZN4lsst4meas8multifit6detail9FrameBaseD2Ev+0x53)[0x2aaab86f5b83]
          
          ...<snip>
          
      • work/PT1PipeB_2/launch.log
        • Backtrace starting at line 19 continuing thru line 774
          *** glibc detected *** python: free(): corrupted unsorted chunks: 0x00002aad13b55720 ***
          ======= Backtrace: =========
          /lib64/libc.so.6[0x3bc607245f]
          /lib64/libc.so.6(cfree+0x4b)[0x3bc60728bb]
          /home/buildbot/buildbotSandbox/Linux64/meas_multifit/svn21902/python/lsst/meas/multifit/_multifitLib.so(_ZN4lsst4meas8multifit6detail9FrameBaseD2Ev+0x53)[0x2aaab86f5b83]
          
          ...<snip>
          
      • work/PT1PipeA_1/launch.log
        • Backtrace starting at line 19 continuing thru line 674
          *** glibc detected *** python: free(): corrupted unsorted chunks: 0x00002aaadfe9fed0 ***
          ======= Backtrace: =========
          /lib64/libc.so.6[0x3c3c47245f]
          /lib64/libc.so.6(cfree+0x4b)[0x3c3c4728bb]
          /home/buildbot/buildbotSandbox/Linux64/meas_multifit/svn21902/python/lsst/meas/multifit/_multifitLib.so(_ZN4lsst4meas8multifit6detail9FrameBaseD2Ev+0x53)[0x2aaab86f5b83]
          ...<snip>
          
      • work/PT1PipeB_1/launch.log
        • Backtrace starting at line 19 continuing thru line 708
          *** glibc detected *** python: free(): corrupted unsorted chunks: 0x000000001c346c00 ***
          ======= Backtrace: =========
          /lib64/libc.so.6[0x3bc607245f]
          /lib64/libc.so.6(cfree+0x4b)[0x3bc60728bb]
          /home/buildbot/buildbotSandbox/Linux64/meas_multifit/svn21902/python/lsst/meas/multifit/_multifitLib.so(_ZN4lsst4meas8multifit6detail9FrameBaseD2Ev+0x53)[0x2aaabc6f5b83]
          ...<snip>
          
  • Resolution:
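
All five backtraces above pass through the same mangled symbol in _multifitLib.so. In practice `c++filt` from binutils decodes such names; the sketch below is only a minimal illustration of how this particular form (a nested name ending in a destructor code) reads under the Itanium C++ ABI mangling scheme. The function names `symbol_from_frame` and `demangle_nested` are hypothetical, and the decoder deliberately handles nothing beyond length-prefixed name components and the D0/D1/D2 destructor codes.

```python
import re

# Destructor codes in the Itanium C++ ABI mangling scheme.
DTOR_CODES = ("D0", "D1", "D2")


def symbol_from_frame(frame):
    """Pull the mangled symbol out of a glibc backtrace frame, if present."""
    found = re.search(r"\((_Z\w+)\+0x[0-9a-f]+\)", frame)
    return found.group(1) if found else None


def demangle_nested(symbol):
    """Decode a simple '_ZN<len><name>...E' nested name; None if unsupported."""
    if not symbol.startswith("_ZN"):
        return None
    pos, parts = 3, []
    while pos < len(symbol) and symbol[pos] != "E":
        digits = re.match(r"\d+", symbol[pos:])
        if digits:
            # A component is written as its length followed by its characters.
            length = int(digits.group())
            start = pos + len(digits.group())
            parts.append(symbol[start:start + length])
            pos = start + length
        elif symbol[pos:pos + 2] in DTOR_CODES and parts:
            # A destructor is named after the enclosing class.
            parts.append("~" + parts[-1])
            pos += 2
        else:
            return None
    if pos >= len(symbol) or not parts:
        return None
    return "::".join(parts)


frame = ("/home/buildbot/buildbotSandbox/Linux64/meas_multifit/svn21902/python/"
         "lsst/meas/multifit/_multifitLib.so"
         "(_ZN4lsst4meas8multifit6detail9FrameBaseD2Ev+0x53)[0x2aaab86f5b83]")
name = demangle_nested(symbol_from_frame(frame))
print(name)
```

Read this way, the frame shared by all five logs is the destructor of lsst::meas::multifit::detail::FrameBase.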

2019

  • Why: Do the Tuesday Weekly Trunk Production Run
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0525_201941
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0525_201941/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0525_201941
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0525_201941
  • Status:
  • Resolution:
    • Killed since Serge provided the last piece enabling a run with new data, new calibration, and new astrometry catalogs.
    • TBD: Remove all detritus, including DB

1724

  • Why: Test correct setup of astrometry_net_data for tagged packages.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0525_172433
  • Codeset: tagged; see /lsst3/weekly/datarel-runs/wp_tags_2011_0525_172433/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0525_172433
  • Database: buildbot_PT1_2_u_wp_tags_2011_0525_172433
  • Status:
  • Resolution:
    • Run failed. TBD: Remove all detritus including DB

20 May 2011

2254

  • Why: Test integration of multifit, use of new calibration data, defaultFwhm: 0.7, postISR and postISRCCD output, plus more...
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0520_225420
  • Codeset: trunk; see $RUN_DIR/config/weekly.tags
  • Dataset: full
  • Output: $RUN_DIR=/lsst3/weekly/datarel-runs/wp_trunk_2011_0520_225420
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0520_225420
  • Status: Processed: 404 out of 506.
    • For stats, see: $RUN_DIR/RunStatus.out
    • For error scan, see: $RUN_DIR/FindAllErrors.out
    • mysql error - 3 occurrences. Encompassing log file is noted below. Review log files for complete error message blocks.
      ./ingestSdqa_ImSim.log:ProcesseERROR 126 (HY000): Incorrect key file for table './mysql/general_log.MYI'; try to repair it
      ./finishDb.log:ERROR 126 (HY000): Incorrect key file for table './mysql/general_log.MYI'; try to repair it
      ./linkDb.log:ERROR 126 (HY000): Incorrect key file for table './mysql/general_log.MYI'; try to repair it
      
    • Corrupted memory error - 3 occurrences {./work/PT1PipeA_2/launch.log ./work/PT1PipeB_2/launch.log ./work/PT1PipeA_1/launch.log}
      *** glibc detected *** python: free(): corrupted unsorted chunks: 0x00002aaaef96d860 ***
      ======= Backtrace: =========
      /lib64/libc.so.6[0x3c3c47245f]
      /lib64/libc.so.6(cfree+0x4b)[0x3c3c4728bb]
      /home/buildbot/buildbotSandbox/Linux64/meas_multifit/svn21902/python/lsst/meas/multifit/_multifitLib.so(_ZN4lsst4meas8multifit6detail9FrameBaseD2Ev+0x53)[0x2aaabc6f4b83]
      
  • Resolution:
    • The backtraces indicate that the failure originates in meas_multifit: the first non-libc frame is the FrameBase destructor in _multifitLib.so.
    • What's causing the repeated DB errors? Did these errors abort the respective processes or did each recover automatically?
    • postISRCCD was not generated for the run. Was the policy only updated for postISR output?
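
The error scan referenced above (FindAllErrors.out) is not reproduced on this page. As a rough illustration of what such a scan involves, the sketch below searches log lines for the failure markers that appear in the excerpts on this page. The marker list and the function name `find_errors` are assumptions for illustration only; the actual scan may look for different patterns.

```python
from collections import Counter

# Markers copied from the log excerpts on this page; the real scan may differ.
MARKERS = {
    "fatal": "FATAL:",
    "glibc": "*** glibc detected ***",
    "mysql": "ERROR 126 (HY000)",
    "cpp_exception": "LsstCppException",
}


def find_errors(lines):
    """Return (line number, kind, text) for every line containing a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for kind, marker in MARKERS.items():
            if marker in line:
                hits.append((number, kind, line.strip()))
    return hits


sample = [
    "harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):",
    "*** glibc detected *** python: free(): corrupted unsorted chunks: 0x00002aaaef96d860 ***",
    "./finishDb.log:ERROR 126 (HY000): Incorrect key file for table './mysql/general_log.MYI'; try to repair it",
    "harness.slice.WcsDeterminationStageParallel DEBUG: Setting numBrightObj",
]
hits = find_errors(sample)
counts = Counter(kind for _, kind, _ in hits)
print(counts)
```

Counting hits per kind gives the "N occurrences" style of summary used in the Status entries.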

1121

  • Why: Test the freshly tagged packages play well together.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0520_112126
  • Codeset: tagged; see /lsst3/weekly/datarel-runs/wp_tags_2011_0520_112126/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0520_112126
  • Database: buildbot_PT1_2_u_wp_tags_2011_0520_112126
  • Status: Aborted early; forgot to run in debug mode. Same problems as before, but this time astrometry_net_data was correctly set; see config/weekly.tags mentioned above.
    • work/PT1PipeA_2/Slice0.log:
      harness.slice.WcsDeterminationStageParallel WARNING: No astrometric solution found, using input WCS
      harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
          stageObject.applyProcess()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/stage.py", line 353, in applyProcess
          self.process(clipboard)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_pipeline/4.3.1.1/python/lsst/meas/pipeline/wcsDeterminationStage.py", line 120, in process
          for m in matchList:
      TypeError: 'NoneType' object is not iterable
      
    • work/PT1PipeB_1/Slice0.log (1 occurrence in 2 logs)
                harness.slice.WcsDeterminationStageParallel DEBUG: Setting numBrightObj
      harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
          stageObject.applyProcess()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/stage.py", line 353, in applyProcess
          self.process(clipboard)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_pipeline/4.3.1.1/python/lsst/meas/pipeline/wcsDeterminationStage.py", line 113, in process
          srcSet, solver=self.solver, log=self.log)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_astrom/4.3.2.0/python/lsst/meas/astrom/determineWcs.py", line 259, in determineWcs
          isSolved = solver.solve(wcsIn)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_astrom/4.3.2.0/python/lsst/meas/astrom/net/netLib.py", line 716, in solve
          return _netLib.GlobalAstrometrySolution_solve(*args)
      LsstCppException: 0: lsst::pex::exceptions::RuntimeErrorException thrown at src/net/GlobalAstrometrySolution.cc:576 in int lsst::meas::astrom::net::GlobalAstrometrySolution::_addSuitableIndicesToSolver(double, double, double, double)
      0: Message: No suitable indices found for given input parameters:Probably the ra/dec range isn't covered
      
      
  • Resolution: Robert, marshal your forces.
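
The first traceback in this run shows the stage iterating over matchList right after the warning that no astrometric solution was found, which suggests matchList is None in that case. The sketch below reproduces the failure and shows one possible guard. It is not the fix applied in meas_pipeline; the helper name `iter_matches` is hypothetical, and treating a missing solution as an empty match list is an assumption about what the downstream code expects.

```python
def iter_matches(match_list):
    """Iterate over matches, treating None (no solution found) as empty."""
    if match_list is None:
        return iter(())
    return iter(match_list)


# Reproduce the failure seen in the log: iterating over None raises TypeError.
try:
    for m in None:
        pass
except TypeError as exc:
    message = str(exc)

# With the guard, a missing match list simply yields nothing.
guarded = [m for m in iter_matches(None)]
normal = [m for m in iter_matches(["match-a", "match-b"])]
print(message, guarded, normal)
```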

19 May 2011

2231

  • Why: Test integration of multifit, use of new calibration data
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0519_223114
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0519_223114/config/weekly.tags
  • Dataset: full, new calibration data
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0519_223114
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0519_223114
  • Status: Success of sorts. The logs indicate some areas for improvement
    • Out of 506 input images, the output stats (/lsst3/weekly/datarel-runs/wp_trunk_2011_0519_223114/RunStatus.out) were:
      icSrc  421
      psf  421
      sdqaCcd  846
      calexp  421
      icMatch  421
      apCorr  421
      src  418
      csv-SourceAssoc  125
      
    • The quick-n-dirty log error extraction (/lsst3/weekly/datarel-runs/wp_trunk_2011_0519_223114/FindAllErrors.out) uncovered:
      • SourceAssoc_ImSim.log shows ~75509 WARNINGs of the form:
        lsst.ap.cluster WARNING: Source 28773560661049345 in exposure 439049692704 has invalid position (NaN or out-of-bounds coordinate value(s))
        
        
      • work/PT1PipeA_2/Slice0.log indicates the Citizen issue is still occurring, although it only happened once:
          File "/home/buildbot/buildbotSandbox/Linux64/ctrl_sched/svn20014/python/lsst/ctrl/sched/utils.py", line 123, in send
            self.trxr.publishEvent(event.create())
          File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_events_21011/python/lsst/ctrl/events/eventsLib.py", line 939, in publishEvent
            return _eventsLib.EventTransmitter_publishEvent(*args)
        LsstCppException: 0: lsst::pex::exceptions::MemoryException thrown at src/Citizen.cc:332 in long unsigned int lsst::daf::base::defaultCorruptionCallback(const lsst::daf::base::Citizen*)
        0: Message: Citizen "118218459: 0x2aaac99ac578 lsst::daf::base::PropertySet" is corrupted
        
      • work/PT1PipeB_2/launch.log indicates a glibc-detected memory corruption resulting in an extensive backtrace dump and termination of 3 different launch processes:
        *** glibc detected *** python: free(): corrupted unsorted chunks: 0x00002aaad07fc700 ***
        ======= Backtrace: =========
        /lib64/libc.so.6[0x3bc607245f]
        ....
        
  • Resolution: There's enough for everyone to have a go.
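
To put the RunStatus.out counts for this run in proportion, the sketch below computes the shortfall against the 506 input images. It assumes that each input image is expected to yield one calexp and one src; that expectation is not stated on this page, and the variable names are illustrative only.

```python
INPUT_IMAGES = 506  # "Out of 506 input images" from the Status entry

# Counts copied from the RunStatus.out excerpt for this run.
STATS = """\
icSrc  421
psf  421
sdqaCcd  846
calexp  421
icMatch  421
apCorr  421
src  418
csv-SourceAssoc  125
"""

counts = {}
for line in STATS.splitlines():
    name, value = line.split()
    counts[name] = int(value)

# Assumption: one calexp and one src are expected per input image.
missing_calexp = INPUT_IMAGES - counts["calexp"]   # inputs with no calexp
missing_src = counts["calexp"] - counts["src"]     # calexps with no src
calexp_fraction = round(counts["calexp"] / INPUT_IMAGES, 3)

print(missing_calexp, missing_src, calexp_fraction)
```

Under that assumption, 85 inputs produced no calexp and a further 3 calexps produced no src, so roughly 83% of the inputs reached calexp.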

1951

  • Why: Test the freshly tagged packages play well together.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0519_195131
  • Codeset: TAGGED; see /lsst3/weekly/datarel-runs/wp_tags_2011_0519_195131/config/weekly.tags
  • Dataset: short
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0519_195131
  • Database: buildbot_PT1_2_u_wp_tags_2011_0519_195131
  • Status: Failed with errors.
    • Possibly due to skypix not tagged Current with the correct version: 3.0.4 (instead of 3.0.3)
    • Possibly due to the error reported for each of 10 images presented:
      harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
          stageObject.applyProcess()
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/pex_harness/4.3.0.0/python/lsst/pex/harness/stage.py", line 353, in applyProcess
          self.process(clipboard)
        File "/lsst/DC3/stacks/gcc443/15oct2010/Linux64/meas_pipeline/4.3.1.1/python/lsst/meas/pipeline/wcsDeterminationStage.py", line 120, in process
          for m in matchList:
      TypeError: 'NoneType' object is not iterable
      
      
  • Resolution:
    • An incorrect astrometry_net_data was used. config/weekly.tags shows,
      • 'astrometry_net_data cfhttemplate Setup' overrode
      • 'astrometry_net_data imsim-2011-05-01-0 Current' which is the default setup.
    • Determined situation and resolved with workaround
      • Something happens during the installation/setup process: datarel/ups/datarel.table is rewritten so that the source's bare declaration 'setupOptional(astrometry_net_data)' gains version constraints and becomes:
        • for the trunk installed version: setupOptional(astrometry_net_data imsim-2011-05-01-1 [>= imsim-2011-05-01-1])
        • for the tagged installed version: setupOptional(astrometry_net_data cfhttemplate [>= cfhttemplate])

1455

  • Why: Test the freshly tagged packages play well together.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline TBD after run: 2011_0519_145524
  • Codeset: tagged; see /lsst3/weekly/datarel-runs/wp_tags_2011_0519_145524/config/weekly.tags
  • Dataset: short
  • Output: /lsst3/weekly/datarel-runs/wp_tags_2011_0519_145524
  • Database: buildbot_PT1_2_u_wp_tags_2011_0519_145524
  • Status: Packages failing to play well together. See Ticket #1686
  • Resolution:

1048

  • Why: Test integration of multifit, use of new calibration data
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0519_104846
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0519_104846/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0519_104846
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0519_104846 Delete Useless DATABASES
  • Status: Killed due to same error afflicting run 18 May 2011 at 13:54
  • Resolution:
    • Back to the developers for repair.
    • Need to delete database: buildbot_PT1_2_u_wp_trunk_2011_0519_104846

18 May 2011

2259

  • Why: Test integration of multifit, use of new calibration data
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0518_225902
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0518_225902/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0518_225902
  • Database: none
  • Status: Aborted before it set up the pipelines. ctrl_orca failed to initialize the DB.
    • See: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0518_225902/unifiedPipeline.log
      Traceback (most recent call last):
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/bin/orca.py", line 100, in <module>
          productionRunManager.runProduction(skipConfigCheck=parser.opts.skipconfigcheck, workflowVerbosity=parser.opts.pipeverb)
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/python/lsst/ctrl/orca/ProductionRunManager.py", line 190, in runProduction
          self.configure(workflowVerbosity)
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/python/lsst/ctrl/orca/ProductionRunManager.py", line 136, in configure
          workflowManagers = self._productionRunConfigurator.configure(workflowVerbosity)
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/python/lsst/ctrl/orca/ProductionRunConfigurator.py", line 120, in configure
          cfg.setup(self._provSetup)
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/python/lsst/ctrl/orca/db/DC3Configurator.py", line 74, in setup
          dbNames = self.setupInternal()
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/python/lsst/ctrl/orca/db/DC3Configurator.py", line 105, in setupInternal
          dbNames = self.prepareForNewRun(self.runid)
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/python/lsst/ctrl/orca/db/DC3Configurator.py", line 153, in prepareForNewRun
          return self.delegate.prepareForNewRun(runName, self.dbUser, self.dbPassword, runType)
        File "/home/buildbot/buildbotSandbox/Linux64/ctrl_orca/svn20067/python/lsst/ctrl/orca/db/MySQLConfigurator.py", line 133, in prepareForNewRun
          self.connect(userName, userPassword, self.globalDbName)
        File "/home/buildbot/buildbotSandbox/Linux64/cat/svn21530/python/lsst/cat/MySQLBase.py", line 72, in connect
          (e.args[0], e.args[1], self.dbHostName, self.dbHostPort, dbUser))
      RuntimeError: DB Error 126: Incorrect key file for table './mysql/general_log.MYI'; try to repair it. host=lsst10.ncsa.uiuc.edu, port=3306, user=buildbot, pass=<hidden>
      
      
  • See logfile: /lsst3/weekly/datarel-runs/wp_trunk_2011_0518_225902/weekly_production_2011_0518.log
    RUNNING
    orca.py -r /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline -e /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline/stack_trunk.sh -V 30 -L 2 weekly_production.paf wp_trunk_2011_0518_225902
    ------------------------------------
    FATAL: Failed in setting up DB access.
    
    
  • Resolution: ctrl_orca / DB developers to examine the failure. It may have occurred during a routine nightly DB process that locks out use.
    • Solved: the cause was a hardware failure, which has since been resolved.

1354

  • Why: Test integration of multifit, use of new calibration data
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0518_135420
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0518_135420/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0518_135420
  • Database: none generated
  • Status: Run killed because no 'src' output was calculated for any of the 119 sensor files.
    • Source measurement failed on every loop. See /lsst3/weekly/datarel-runs/wp_trunk_2011_0518_135420/work/PT1PipeB_1/Slice0.log
      harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
        File "/home/buildbot/buildbotSandbox/Linux64/pex_harness/svn20013/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
          stageObject.applyProcess()
        File "/home/buildbot/buildbotSandbox/Linux64/pex_harness/svn20013/python/lsst/pex/harness/stage.py", line 353, in applyProcess
          self.process(clipboard)
        File "/home/buildbot/buildbotSandbox/Linux64/meas_pipeline/svn21484/python/lsst/meas/pipeline/sourceMeasurementStage.py", line 89, in process
          sourceSet = srcMeas.sourceMeasurement(exposure, psf, footprintLists, measurePolicy)
        File "/home/buildbot/buildbotSandbox/Linux64/meas_utils/svn21777/python/lsst/meas/utils/sourceMeasurement.py", line 61, in sourceMeasurement
          measureSources = measAlg.makeMeasureSources(exposure, measSourcePolicy)
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/meas_algorithms_21727/python/lsst/meas/algorithms/algorithmsLib.py", line 1833, in makeMeasureSources
          return _algorithmsLib.makeMeasureSources(*args)
      Exception: stream error
      
      
  • Resolution:
    • Hand back to developer to fix source measurement issue.
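
Failures such as the one above appear in the Slice logs as a line containing "FATAL: Traceback (most recent call last):", followed by the stack frames and a final exception line. The sketch below pulls out that final line for each FATAL traceback so that repeated failures can be counted quickly. It assumes the raw log is not indented the way the excerpts on this page are, so the exception line is the first line after the marker that starts in column 0. The function name fatal_exceptions is illustrative and is not part of pex_harness.

```python
FATAL_MARKER = "FATAL: Traceback (most recent call last):"


def fatal_exceptions(log_text):
    """Return the final exception line of each FATAL traceback in log_text.

    Assumption: stack frames are indented and the exception line is not,
    as in a standard Python traceback written to an unindented log.
    """
    found = []
    in_traceback = False
    for line in log_text.splitlines():
        if FATAL_MARKER in line:
            in_traceback = True
            continue
        # The first non-blank, unindented line after the marker ends the traceback.
        if in_traceback and line and not line[0].isspace():
            found.append(line.strip())
            in_traceback = False
    return found
```

Applied to the text of a Slice0.log and combined with collections.Counter, this shows at a glance whether every loop failed with the same exception.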

2317

  • Why: Test integration of multifit, use of new calibration data
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0517_231747
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0517_231747/config/weekly.tags
  • Dataset: full
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0517_231747
  • Database: none created
  • Status: Failed due to a Policy problem in source measurement; job killed.
  • Resolution: Developer to fix Policy setup (done: rev 21830)

13 May 2011

2005

  • Why: Use new dataset; fix saturation problem; fix single-pixel problem; fix dangling pointer problem. Uses the same old calibration files.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_2011_0513_200506
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0513_200506/config/weekly.tags
  • Dataset: full, stars and more; 20110511_weekly.input
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0513_200506
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0513_200506
  • Status: Completed. All records processed successfully.
    • Two processing errors noted, as below:
      harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/pex_harness_20013/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
          stageObject.applyProcess()
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/pex_harness_20013/python/lsst/pex/harness/stage.py", line 353, in applyProcess
          self.process(clipboard)
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_sched_20014/python/lsst/ctrl/sched/pipeline.py", line 511, in process
          self.tellJobDone(clipboard)
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_sched_20014/python/lsst/ctrl/sched/pipeline.py", line 498, in tellJobDone
          self.tellDataReady(clipboard)
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_sched_20014/python/lsst/ctrl/sched/pipeline.py", line 441, in tellDataReady
          possible = client.tellDataReady(possible, completed)
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_sched_20014/python/lsst/ctrl/sched/pipeline.py", line 207, in tellDataReady
          self.dataSender.createDatasetEvent(self.name, report, fullsuccess))
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_sched_20014/python/lsst/ctrl/sched/utils.py", line 123, in send
          self.trxr.publishEvent(event.create())
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/ctrl_events_21011/python/lsst/ctrl/events/eventsLib.py", line 939, in publishEvent
          return _eventsLib.EventTransmitter_publishEvent(*args)
      LsstCppException: 0: lsst::pex::exceptions::MemoryException thrown at src/Citizen.cc:332 in long unsigned int lsst::daf::base::defaultCorruptionCallback(const lsst::daf::base::Citizen*)
      0: Message: Citizen "90763939: 0x26b4b6b8 lsst::daf::base::PropertySet" is corrupted
      
  • Resolution:
    • Ready for Developers to review results of their upgrades.
    • RAA to determine why the 'latest' symlink was not established at run completion. The symlink has since been updated.
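
One way to keep a 'latest' symlink from being left missing or stale is to create the new link under a temporary name and rename it over the old one, so the link always points at some complete run. The sketch below shows that approach. It is not the buildbot script's actual logic, and the location of the 'latest' link is not stated on this page, so both paths are passed in as parameters; the function name point_latest_at is illustrative.

```python
import os


def point_latest_at(run_dir, link_path):
    """Make link_path a symlink to run_dir, replacing any existing link.

    The link is created under a temporary name and then renamed into place;
    on POSIX the rename replaces the old link in a single step.
    """
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted earlier call
    os.symlink(run_dir, tmp_link)
    os.replace(tmp_link, link_path)
```

Calling this as the last step of a run, after all output has been written, means a run that is killed early never moves the link.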

10 May 2011

1710

  • Why: RHL PSF overflow issue, KTL sqlite memory issue, SMM new RefSrcMatch table. KTL's linkDB script was not committed to datarel in time for this run; it will be included in the next run and run by hand after this run completes.
  • Setup: /home/buildbot/slave/trunkVsTrunk_lsst/work/weeklyPR/pipeline_20110510_171052
  • Codeset: trunk; see /lsst3/weekly/datarel-runs/wp_trunk_2011_0510_171052/config/weekly.tags
  • Dataset: full, stars + other objects
  • Output: /lsst3/weekly/datarel-runs/wp_trunk_2011_0510_171052
  • Database: buildbot_PT1_2_u_wp_trunk_2011_0510_171052
  • Status: Test complete: all images processed OK.
    • loadDb.py was run on the database
    • Error noted in work/PT1PipeA_2/Slice0.log:
      harness.slice.visit.stage.tryProcess FATAL: Traceback (most recent call last):
        File "/home/buildbot/slave/trunkVsTrunk_lsst/work/svn/pex_harness_20013/python/lsst/pex/harness/Slice.py", line 563, in tryProcess
          stageObject.applyProcess()
        File "/home/buildbot/slave/trunkVsTrunk_lsst