We will be having our regular bi-weekly Infrastructure WG telecon on Monday, January 11, at 12 Noon CT.
Agenda
- Existing Resource Usage Update
- TeraGrid Resources
- Service Units (as of Jan 8)
- Allocated: 30K SUs on abe; 30K SUs on lincoln (nVidia Tesla GPUs)
- Used: 123 SUs (abe); 0 SUs (lincoln)
- Remaining: ~29K SUs (abe); 30K SUs (lincoln)
- Disk Storage (as of Jan 8)
- Allocated: 5TB
- Used: 0TB
- Remaining: 5TB
- Tape Storage (as of Jan 8)
- Allocated: 40TB
- Used: 0TB
- Remaining: 40TB
- Next TG Allocation Cycle
- Proposals due Jan 15 for Apr 1 allocations
- Cost Sheet Update
- Baseline version is v45
- https://www.lsstcorp.org/docushare/dsweb/Get/Version-12185/Infrastructure-Costs-v45.xls
- caveats apply: v45 does not *exactly* match PMCS
- Current version is now v71
- Summary of Changes
- LSST-72 Update PMCS Baseline in Cost Summary Sheet
- LSST-71 Compute model is currently based on Rpeak. This needs to be changed. Rmax is better, but still not right. What to use? Using an efficiency rating. Set to 20%->7% (+9.8M/+19.6M). Nodes are Arch 718->201 (740); Base 330->42 (335).
- LSST-80 Planning to introduce a "Factor X" for the out years
- Deferring this idea; closing ticket. The efficiency rating seems to resolve the concern, at least at a coarse-grained level. May need to revisit this later.
- LSST-40 Disk Cost/Capacity Trends (3 yr step, etc.) in tech predictions tab (-689K/-5329K)
- Questions & Notes
- Ramp up: One of the things in the cost sheet that I wonder about is our "ramp up", i.e. we're currently planning on buying 1/3 of the hardware 3 years early, 2/3 two years early, etc. I wonder if 3 years early is a little too soon.
- Upcoming Changes
- Priority is getting a new BaseFloorSpace.xls to RonL/JeffB, which depends on:
- LSST-50 Floorspace tab: Floorspace calculation does not take into account the increase in drive capacities over time
- LSST-78 Move the 3% CPU spare from document 2116 "CPU Sizing" to document 6284 "Cost Estimate"
- LSST-79 Add tape library replacement to ArchAOS and BaseAOS
- LSST-10 Update Power & Cooling at Base Site (info already received from RonL)
- LSST-47 Power Costs at BaseSite: Use Historical Data to Model Future Power Prices
- LSST-36 Update Power & Cooling at ArchSite
- LSST-28 Optimal CPU Replacement Policy
- LSST-36 P&C and Floorspace at PCF (rates, payment approach, green features of PCF)
- LSST-69 New Model for Floorspace (Lease Costs) at ArchSite
- LSST-14 Processor Sizing Update (Doc2116 LSST CPU Sizing)
- LSST-37 Missing controller costs for disk
- Next steps with cost sheet
- Full review of each of the elements of the cost sheet (boxes of the mapping document)
- More readable description of the formulas being used
- Identification and documentation of assumptions
- Identification and documentation of external data input
- Serves two significant purposes
- Allows for better internal reviews (validation of models and information used)
- Provides justifications for external reviews
- Results in an updated (or replacement of) Document-1684 and related documents ("Explanation of Cost Estimates")
- DC3b Infrastructure Options/Costs for the Performance Tests
- http://dev.lsstcorp.org/trac/wiki/DC3bHardwareRequirements
- LSST-11 DC3b Hardware
- Input Data Missing
- Added additional space for input data
- 47TB for ImSim
- checking with AndyC whether or not this is the only copy (thus the need for fault-tolerant storage)
- 10TB for CFHTLS
- Q: Does this cover all three PTs?
- Assuming this is tape storage, not spinning disk
- Discussion with Allocations
- commitment of 10TB spinning disk from TG (covers PT1)
- tape complicated - can't commit until ~Jan
- old tape system/new tape system; tape format compatibilities; funding sources & HPC followons; etc.
- Spinning Disk
- Should be doable if we stay close to the lower end of the range
- DB Storage
- Can use existing SAN
- LSST-73 Add additional 3.3TB of our existing SAN allocation to lsst10 /scr
- Tape
- This is going to be difficult
- $62/TB for a single copy ($25 per 400GB tape); 300TB comes to ~$19K
- Q: Is data loss acceptable? (750 tapes) A: Yes, except for ImSim data. I am still waiting on a reply regarding tape failure rates; however, I expect the failure rate to be acceptably low.
- Compute
- Priority is getting the TG proposal written and submitted by this Friday
- SU being addressed by our first topic above
- DC3b Infrastructure Options/Costs for Data Serving
- http://dev.lsstcorp.org/trac/wiki/DC3bDataServingRequirements
- LSST-54 Connection Speeds between lsst10 and the SAN Storage. We need 300 MB/s. What are our options?
- Image Retrieval needs are unspecified (servers? spinning disk?)
- Distributed File Management (REDDNET/Lstore/iRods/DataNet) iff we have the right people on the call
- Upcoming Conferences of Interest
- 11th LCI International Conference on High-Performance Clustered Computing
- March 8-11, 2010 Pittsburgh
- http://www.linuxclustersinstitute.org/conferences/
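The tape figures quoted under "DC3b Infrastructure Options/Costs for the Performance Tests" ($25 per 400GB tape, 300TB, 750 tapes) can be cross-checked with a short calculation. The following is only a sketch: the function names, the decimal-unit assumption (1TB = 1000GB), and the `copies` parameter are illustrative and not part of the cost sheet.

```python
import math

# Figures taken from the Tape item in the agenda.
TAPE_PRICE_USD = 25      # price per tape
TAPE_CAPACITY_GB = 400   # capacity per tape
GB_PER_TB = 1000         # assumption: decimal units (1TB = 1000GB)


def cost_per_tb():
    """Media cost in USD for one copy of 1TB."""
    return TAPE_PRICE_USD * GB_PER_TB / TAPE_CAPACITY_GB


def tape_estimate(data_tb, copies=1):
    """Return (tape count, media cost in USD) for data_tb terabytes.

    `copies` is a hypothetical knob for data that must not be lost
    (e.g. ImSim); the agenda itself only prices a single copy.
    """
    tapes = math.ceil(data_tb * GB_PER_TB * copies / TAPE_CAPACITY_GB)
    return tapes, tapes * TAPE_PRICE_USD


if __name__ == "__main__":
    tapes, cost = tape_estimate(300)
    print(f"${cost_per_tb():.2f}/TB, {tapes} tapes, ${cost:,} total")
```

For 300TB at a single copy this gives 750 tapes and $18,750, consistent with the ~$19K and 750 tapes in the agenda. Media cost only; drives, library slots, and handling are not modeled.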
Notes
Attendees: Ray, Arun, Jacek, K-T, Daniel, MikeF
- Action items reflected in JIRA tickets.
Useful Links
- InfraWG Home Page
- InfraWG Tickets (in priority order)
Attachments
- lsst-infrawg-tickets-20100111.jpg (23.1 KB) - added by mfreemon 7 months ago.

