wiki:InfrastructureWGMeetingD20091207
Last modified on 12/09/2009 01:05:56 PM

We will be having our regular bi-weekly Infrastructure WG telecon on Monday, December 7, at 12 Noon CT.

Agenda

  • HPC Allocation Update
    • Current Allocation (as of Dec4)
      • Allocated: 30K SUs on abe; 30K SUs on lincoln (nVidia Tesla GPUs)
      • Used: 9 SUs (abe); 0 SUs (lincoln)
      • Remaining: 29.9K SUs (abe); 30K SUs (lincoln)
      • How to get access: contact MikeF
    • Next TG Allocation Cycle
      • Proposals due Jan15 for Apr1 allocations
  • Debrief of NCSA Meeting on Nov24
    • Initiation of regular ongoing collaborations with NCSA managers on LSST status, current sizing estimates, design and planning, etc.
    • Initiation of bi-weekly meetings with an SME in each of those areas to review our current infrastructure design and cost models.
  • Cost Sheet Update
    • Baseline version is v45
    • Current version now v69
    • Summary of Changes
      • LSST-60 AOS & DPS: don't purchase SW or lease without HW in 2013; CSS don't lease without HW in 2013-2014 (-1372K/-1372K)
      • LSST-57 CSS db drives: why do we have more drives on the floor than spindles required? (0K/-14590K) [answer: math wrong on num spindles purchased wrt replacement policy]
      • LSST-55 How is TFlops per node calculated? How does this evolve over time? [constant TF/core; constant $/node; Moore's law for cores/node (model now more flexible)] ($-451K/$-4096K)
        • Radical reduction in number of compute nodes. Arch from 741->2049 to 99->102; Base from 337->376 to 48->49
        • need feedback/reaction
    • Questions & Notes
      • Ramp up: One of the things in the cost sheet that I wonder about is our "ramp up", i.e. we're currently planning on buying 1/3 of the hardware 3 years early, 2/3 two years early, etc. I wonder if 3 years early is a little too soon.
      • LSST-29 The CSS db excess capacity is about 20x (208TB acquired vs. 11TB required at each site)
      • LSST-58 CSS db drives: validate that spindle count requirements really do go down in years 5, 9 and 10
    • Upcoming Changes
      • LSST-50 Floorspace tab: Floorspace calculation does not take into account the increase in drive capacities over time
      • LSST-10 Update Power & Cooling at Base Site (info already received from RonL)
      • LSST-36 Update Power & Cooling at ArchSite
    • Next steps with cost sheet
      • Full review of each of the elements of the cost sheet (boxes of the mapping document)
        • More readable description of the formulas being used
        • Identification and documentation of assumptions
        • Identification and documentation of external data input
      • Serves two significant purposes
        • Allows for better internal reviews (validation of models and information used)
        • Provides justifications for external reviews
      • Results in an update to (or replacement of) Document-1684 and related documents ("Explanation of Cost Estimates")
  • DC3b Storage Options/Costs?
    • http://dev.lsstcorp.org/trac/wiki/DC3bHardwareRequirements
    • Request in to TG allocations for PT1
    • Request in to NCSA allocations for PT2, PT3
    • Met with Allocations last week
      • Good discussion
      • Will have a response back soon
    • Spinning Disk
      • Should be doable if we stay close to the lower end of the range
    • DB Storage
      • Can use existing SAN; Need to address I/O rates (LSST-54)
    • Tape
      • This is going to be difficult
      • Another source quoted me $62/TB (for single copy) [$25/tape=400GB]; for 300TB is $19K
    • Compute
      • SU being addressed by our first topic above
    • Bottom line: Nothing definitive yet
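
The tape quote in the DC3b item above can be checked with a short calculation. This is only a sketch of the arithmetic: the $25/tape and 400GB/tape figures come from the agenda, while the function names and the use of decimal terabytes (1TB = 1000GB) are assumptions made for illustration.

```python
import math

TAPE_PRICE_USD = 25.0      # from the agenda: $25 per tape
TAPE_CAPACITY_GB = 400.0   # from the agenda: 400GB per tape
GB_PER_TB = 1000.0         # assumption: decimal terabytes


def tape_cost_per_tb(price_usd=TAPE_PRICE_USD, capacity_gb=TAPE_CAPACITY_GB):
    """Media cost in USD per TB for a single copy."""
    return price_usd * GB_PER_TB / capacity_gb


def tape_cost(total_tb, copies=1):
    """Media cost in USD for total_tb of data, buying whole tapes per copy."""
    tapes_per_copy = math.ceil(total_tb * GB_PER_TB / TAPE_CAPACITY_GB)
    return tapes_per_copy * copies * TAPE_PRICE_USD


per_tb = tape_cost_per_tb()           # 62.5 USD/TB, quoted as $62/TB
single_copy = tape_cost(300)          # 750 tapes, 18750 USD, quoted as $19K
dual_copy = tape_cost(300, copies=2)  # a second copy doubles the media cost
```

The sketch covers media only; drives, library slots, and operations are not part of the quoted figure.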

Notes

Attendees: Daniel, K-T, Jacek, RonL, Ray, Mike

  • Compute model currently based on Rpeak. This needs to be changed. Rmax is better, but still not right. What to use? See LSST-71
  • Cost sheet only includes baseline
  • Forced sources not approved for baseline
  • To estimate the cost of forced sources, we'll need two sets of database sheets (storage & diskIO). See LSST-9
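
The node-count sensitivity discussed under LSST-55 and LSST-71 can be illustrated roughly as follows. This is a hedged sketch, not the cost sheet's actual formulas: it assumes constant TF/core, cores/node doubling on a fixed period (a simple reading of Moore's law), and an efficiency factor standing in for the gap between peak (Rpeak) and sustained performance. Every number and function name below is hypothetical.

```python
import math


def cores_per_node_in_year(base_cores, years_elapsed, doubling_years=2.0):
    """Cores per node after years_elapsed, doubling every doubling_years (assumed)."""
    return int(base_cores * 2 ** (years_elapsed / doubling_years))


def nodes_required(required_tflops, tflops_per_core, cores_per_node, efficiency=1.0):
    """Nodes needed to deliver required_tflops.

    efficiency=1.0 sizes against peak performance; a value below 1.0
    sizes against the fraction of peak the application actually sustains.
    """
    delivered_per_node = tflops_per_core * cores_per_node * efficiency
    return math.ceil(required_tflops / delivered_per_node)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
REQUIRED_TF = 100.0
TF_PER_CORE = 0.0625

peak_sized = nodes_required(REQUIRED_TF, TF_PER_CORE, cores_per_node=8)
sustained_sized = nodes_required(REQUIRED_TF, TF_PER_CORE, cores_per_node=8, efficiency=0.5)
later_cores = cores_per_node_in_year(base_cores=8, years_elapsed=4)
later_sized = nodes_required(REQUIRED_TF, TF_PER_CORE, cores_per_node=later_cores)
```

With these made-up inputs, halving the efficiency doubles the node count, while four years of core growth cuts it by a factor of four, which is the kind of swing the revised node counts in the cost sheet reflect.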

Useful Links