wiki:InfrastructureWGMeetingD20100111
Last modified on 01/11/2010 02:39:15 PM

We will be having our regular bi-weekly Infrastructure WG telecon on Monday, January 11, at 12 Noon CT.

Agenda

  • Existing Resource Usage Update
    • TeraGrid Resources
      • Service Units (as of Jan 8)
        • Allocated: 30K SUs on abe; 30K SUs on lincoln (nVidia Tesla GPUs)
        • Used: 123 SUs (abe); 0 SUs (lincoln)
        • Remaining: ~29K SUs (abe); 30K SUs (lincoln)
      • Disk Storage (as of Jan 8)
        • Allocated: 5TB
        • Used: 0TB
        • Remaining: 5TB
      • Tape Storage (as of Jan 8)
        • Allocated: 40TB
        • Used: 0TB
        • Remaining: 40TB
      • Next TG Allocation Cycle
        • Proposals due Jan 15 for Apr 1 allocations
  • Cost Sheet Update
    • Baseline version is v45
    • Current version is now v71
    • Summary of Changes
      • LSST-72 Update PMCS Baseline in Cost Summary Sheet
      • LSST-71 The compute model is currently based on Rpeak, which needs to change. Rmax is better, but still not right. What should we use? Now using an efficiency rating, set to 20%->7% (+9.8M/+19.6M). Nodes: Arch 718->201 (740); Base 330->42 (335).
      • LSST-80 Planning to introduce a "Factor X" for the out years
        • Deferring this idea and closing the ticket. The efficiency rating seems to resolve the concern, at least at a coarse-grained level. We may need to revisit this later.
      • LSST-40 Disk Cost/Capacity Trends (3 yr step, etc.) in tech predictions tab (-689K/-5329K)
    • Questions & Notes
      • Ramp up: One of the things in the cost sheet that I wonder about is our "ramp up", i.e. we're currently planning on buying 1/3 of the hardware 3 years early, 2/3 two years early, etc. I wonder if 3 years early is a little too soon.
    • Upcoming Changes
      • Priority is getting a new BaseFloorSpace.xls to RonL/JeffB, which depends on:
        • LSST-50 Floorspace tab: Floorspace calculation does not take into account the increase in drive capacities over time
      • LSST-78 Move the 3% CPU spare from document 2116 "CPU Sizing" to document 6284 "Cost Estimate"
      • LSST-79 Add tape library replacement to ArchAOS and BaseAOS
      • LSST-10 Update Power & Cooling at Base Site (info already received from RonL)
      • LSST-47 Power Costs at BaseSite: Use Historical Data to Model Future Power Prices
      • LSST-36 Update Power & Cooling at ArchSite
      • LSST-28 Optimal CPU Replacement Policy
      • LSST-36 P&C and Floorspace at PCF (rates, payment approach, green features of PCF)
      • LSST-69 New Model for Floorspace (Lease Costs) at ArchSite
      • LSST-14 Processor Sizing Update (Doc2116 LSST CPU Sizing)
      • LSST-37 Missing controller costs for disk
    • Next steps with cost sheet
      • Full review of each of the elements of the cost sheet (boxes of the mapping document)
        • More readable description of the formulas being used
        • Identification and documentation of assumptions
        • Identification and documentation of external data input
      • Serves two significant purposes
        • Allows for better internal reviews (validation of models and information used)
        • Provides justifications for external reviews
      • Results in an update to (or replacement of) Document-1684 and related documents ("Explanation of Cost Estimates")
  • DC3b Infrastructure Options/Costs for the Performance Tests
    • http://dev.lsstcorp.org/trac/wiki/DC3bHardwareRequirements
    • LSST-11 DC3b Hardware
    • Input Data Missing
      • Added additional space for input data
        • 47TB for ImSim
          • Checking with AndyC whether this is the only copy (and thus needs fault-tolerant storage)
        • 10TB for CFHTLS
          • Q: Does this cover all three PTs?
        • Assuming this is tape storage, not spinning disk
    • Discussion with Allocations
      • Commitment of 10TB of spinning disk from TG (covers PT1)
      • Tape is complicated; they can't commit until ~Jan
        • Old tape system vs. new tape system; tape format compatibilities; funding sources & HPC follow-ons; etc.
    • Spinning Disk
      • Should be doable if we stay close to the lower end of the range
    • DB Storage
      • Can use existing SAN
      • LSST-73 Add additional 3.3TB of our existing SAN allocation to lsst10 /scr
    • Tape
      • This is going to be difficult
      • $62/TB (for a single copy) [$25 per 400GB tape]; 300TB comes to ~$19K
      • Q: Is data loss acceptable? (750 tapes) A: Yes, except for ImSim data. I am still waiting on a reply regarding tape failure rates; however, I expect the failure rate to be acceptably low.
    • Compute
      • Priority is getting the TG proposal written and submitted by this Friday
      • SUs are being addressed under our first topic above
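
The tape figures in the DC3b item above ($25 per 400GB tape, 300TB, 750 tapes, ~$19K) can be checked with a short calculation. This is only a sketch: the function name `tape_cost` and its parameters are hypothetical, and it assumes decimal units (1TB = 1000GB) and whole tapes only.

```python
import math


def tape_cost(total_tb, tape_capacity_gb=400, price_per_tape=25.0, copies=1):
    """Return (number_of_tapes, total_cost_in_dollars).

    Assumptions (not stated in the notes): 1 TB = 1000 GB, and tapes
    are bought whole, so the per-copy tape count is rounded up.
    """
    tapes_per_copy = math.ceil(total_tb * 1000 / tape_capacity_gb)
    tapes = tapes_per_copy * copies
    return tapes, tapes * price_per_tape


tapes, cost = tape_cost(300)
print(tapes, cost)          # 750 tapes, $18,750 (~$19K)
print(cost / 300)           # 62.5 dollars per TB for a single copy
```

Passing `copies=2` doubles both numbers, which may be relevant for the ImSim data if it turns out to be the only copy.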

  • Distributed File Management (REDDNET/Lstore/iRods/DataNet) iff we have the right people on the call
  • InfraWG Ticket Update
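
The "Remaining" figures in the Existing Resource Usage Update are simply allocated minus used. A minimal sketch of that bookkeeping follows, using the Service Unit numbers from the first agenda item; the helper name `remaining` and the dictionary layout are illustrative assumptions, not part of any TeraGrid tool.

```python
def remaining(allocated, used):
    """Return the unused part of an allocation; reject over-use."""
    if used > allocated:
        raise ValueError("usage exceeds allocation")
    return allocated - used


# (allocated, used) in Service Units, as of Jan 8
service_units = {
    "abe": (30_000, 123),
    "lincoln": (30_000, 0),
}

for resource, (allocated, used) in service_units.items():
    print(resource, remaining(allocated, used))
# abe 29877
# lincoln 30000
```

The same helper applies to the disk (5TB, 0TB used) and tape (40TB, 0TB used) allocations.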

Notes

Attendees: Ray, Arun, Jacek, K-T, Daniel, MikeF

  • Action items reflected in JIRA tickets.

Useful Links
