wiki:InfrastructureWGMeetingD20100315
Last modified on 03/17/2010 10:19:49 AM

We will be having our regular bi-weekly Infrastructure WG telecon on Monday, March 15, at 12N CT (10A PT).

Agenda

  • TeraGrid TRAC Allocation for LSST Announcement
    • Full amounts awarded - 1.51M SUs, 400TB tape, 20TB disk
    • Allocation starts Apr 1
  • Existing Resource Usage Update (as of Mar 14)
    • TeraGrid Resources (Startup Allocation)
      • Service Units
        • Abe: Allocated: 30K SUs; Remaining ~29.4K SUs
        • Lincoln (nVidia Tesla GPUs): Allocated: 30K SUs; Remaining 30K SUs
      • Disk Storage
        • Allocated: 5TB; Remaining: 5TB
      • Tape Storage
        • Allocated: 40TB; Remaining: 40TB
      • Authorized Users: GregD, DavidG, K-T, SteveP, RayP, Jonathan Myers
  • DC3b Infrastructure for the Performance Tests
    • http://dev.lsstcorp.org/trac/wiki/DC3bHardwareRequirements
    • LSST-11 DC3b Hardware
    • https://www.lsstcorp.org/docushare/dsweb/Get/Document-8529/LSST-TeraGrid-Proposal.pdf
    • Compute
      • We're good on this. 1.51M SUs awarded from TG.
    • HPC Project Space
      • We're good on this. 20TB awarded from TG.
    • Database Disk
      • We're good on this. 15TB for database on lsst10 as of Feb 22, which covers the requirements for all of DC3b.
    • Tape Storage
      • We're good on this. 400TB of dual copy mass storage awarded from TG. No tape gap. No need to purchase additional tapes. No data loss is expected due to media failures.
    • Disk I/O Bandwidth
      • Additional information has come to light indicating that our previous estimate of ~140MB/s between Abe and its spinning disk (scratch and project space on Abe) is low; we should expect higher throughput than that. I will be working to quantify this (a rough measurement sketch follows below).
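      • A rough way to quantify this -- a minimal sketch of a sequential write/read throughput probe; the test file path and transfer size are placeholders, not the actual Abe scratch or project mount points:
{{{#!python
# Rough sequential-throughput probe for a filesystem (e.g., scratch or
# project space).  Writes and then re-reads a large file and reports MB/s.
# The path and size below are placeholders, not the real Abe mount points.
import os, time

TEST_FILE = "/scratch/placeholder/io_probe.dat"   # hypothetical path
BLOCK = 8 * 1024 * 1024                           # 8 MB per write
N_BLOCKS = 512                                    # ~4 GB total

def mb_per_sec(nbytes, seconds):
    return nbytes / (1024.0 * 1024.0) / seconds

# Sequential write test
buf = "x" * BLOCK
start = time.time()
f = open(TEST_FILE, "wb")
for _ in range(N_BLOCKS):
    f.write(buf)
f.flush()
os.fsync(f.fileno())          # make sure the data actually reaches the disk
f.close()
write_rate = mb_per_sec(BLOCK * N_BLOCKS, time.time() - start)

# Sequential read test (the page cache can inflate this number; a fairer
# test uses a file larger than memory or drops caches between runs)
start = time.time()
f = open(TEST_FILE, "rb")
while f.read(BLOCK):
    pass
f.close()
read_rate = mb_per_sec(BLOCK * N_BLOCKS, time.time() - start)

print "write: %.1f MB/s   read: %.1f MB/s" % (write_rate, read_rate)
os.remove(TEST_FILE)
}}}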
  • DC3b User Access
    • DC3bUserAccess
    • Unique Identifier for Logical Set of Related Files
      • discussion with RHL -- pending feedback from him
      • DC3bUserAccess (the page under discussion, but not yet ready for review)
    • Bulk Upload Into Catalog
      • DC3bUserAccess
      • assuming standard mysql utilities (e.g., mysqlimport or LOAD DATA; see the sketch at the end of this item)
      • assuming storage requirements are not significant
        • MikeF is proposing a statement regarding user expectations for storage
    • Web Data Server update
    • Image Cutout Service update (K-T)
    • Sample Scripts
      • IPAC (Suzy)
    • Web Interface
      • IPAC (Suzy); interface to scripts; reuse existing portals
    • LSST-54 Connection speeds between lsst10 and the SAN storage. We need 300 MB/s. What are our options?
      • Do we really need 300MB/s? (Jacek)
        • Currently lsst10 gets 150MB/s; a scan of the Object table takes 5m, the Source table 1h2m, and ForcedSource 1h36m (dbDC3bHardware). For I/O-bound scans, doubling the throughput would roughly halve these times.
      • The adapter slots on lsst10 will not support an 8Gb HBA
        • in the process of getting price estimates for a new database server
      • A new server to support 300MB/s would cost ~$4K (2 quad-core CPUs, 16GB RAM, Emulex dual-channel 4Gbps PCIe HBA)
    • Database Server(s) at SLAC (Jacek)
    • Database replication strategy (Jacek)
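    • Bulk-upload sketch (for the item above that assumes standard mysql utilities): a minimal example using mysqlimport; the host, database, and file names are placeholders, not the actual DC3b catalog settings, and the real ingest path is still to be decided:
{{{#!python
# Minimal sketch of a bulk upload using a standard MySQL utility
# (mysqlimport).  Host, database, and file names are placeholders.
import subprocess

HOST = "lsst10.example.edu"        # hypothetical host name
DB   = "dc3b_user_sandbox"         # hypothetical user database
CSVS = ["/tmp/MyObject.csv"]       # mysqlimport loads each file into the
                                   # table named after the file (here: MyObject)

for path in CSVS:
    cmd = ["mysqlimport",
           "--local",                       # read the file on the client side
           "--host=%s" % HOST,
           "--fields-terminated-by=,",
           DB, path]
    if subprocess.call(cmd) != 0:
        raise RuntimeError("bulk load failed for %s" % path)
}}}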
  • DC3b User Support
    • Separate item from above
      • User Access is about systems and software; User Support is about receiving questions/problems from human beings
    • Active discussion going on among SuzyD, DickS, MikeF
    • One-line summary: a ticket system would be good, a knowledge base would be good, but no labor resources are available; planning on an email address for now; discussions continue
    • Support email address: dc-support at lsst.org
      • Scope: user support for the data challenges
      • support at lsst.org is too generic
      • all DC issues -- do not require the user to select a "category" of issue
  • ImSim Data Management with iRODS (Arun)
  • Output Data Management with REDDnet
    • http://docs.google.com/View?id=dgvmjj2x_16f4mvfmd6
    • 2x24TB (48TB) going to both NCSA and SLAC; depots exist at Caltech and elsewhere
    • strong focus on monitoring by the team at Vanderbilt
      • perfSONAR suite (I2) (snmp), BWCTL (iperf), MRTG (snmp), Nagios, and custom
      • monitors availability, throughput, latency, general health, alerts
    • single virtual directory structure -- sandbox for lsst created
    • L-Store
      • provides client view / interfaces (get/put/list/etc.)
      • defines the virtual directory structure
    • StorCore
      • partitions REDDnet space into logical volumes (think: LVM)
      • L-Store uses StorCore for resource discovery
    • Web interfaces for both StorCore and L-Store
    • Example code available (contact Mike for a copy)
      • upload.sh file1 file2 dir1 dir2 remotefolder
      • download.sh remotefile1 remotefile2 localdestination
      • ls.sh remotefile or remotedirectory
      • mkdir.sh remotedir1
      • additional commands to "stage" files across depots
  • Update on LSST Database Performance Tests Using SSDs (Arun/Jacek?)
    • LSST expects to manage some 50 billion (50 x 10^9) objects and 150 trillion (150 x 10^12) detections of these objects generated over the lifetime of the survey. This data will be managed through a database. The current baseline system consists of off-the-shelf open source database servers (MySQL) with custom code on top, all running in a shared-nothing MPP architecture.
    • To date, we have run numerous tests with MySQL to project the performance of the query load we expect to see on the production LSST system, including low volume, high volume and super high volume queries (simple queries, full table scans and correlations, respectively). Based on these tests we estimated the hardware needed to support the expected load. All of these tests were done using spinning disks.
    • Having the opportunity to redo these tests with solid-state technology (solid state disks, or SSD) would allow us to understand potential savings and determine whether SSD could help us simplify the overall architecture of the system by approaching things in a “different” way than on spinning disk.
    • The tests we expect to run include:
      • Selecting a small amount of data from a very large table via clustered and non-clustered indexes (this relates to low volume queries).
      • Verifying whether we can achieve speed improvements for full table scans comparable to raw disk speed improvements seen when switching from spinning disk to SSD (this is related to high volume queries).
      • Testing an architecture that makes heavy use of indexes, including composite indexes, instead of full table scans for high volume queries.
      • Executing near-neighbor queries using indexes on subChunkId without explicit sub-partitioning (a sketch of such a query follows at the end of this item).
    • We expect to run these tests using the USNO-B data set, which, including indexes and other overhead, fits in ~200 GB.
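    • Near-neighbor sketch (for the last test case above): a minimal example of a self-join that leans on a subChunkId index; the table and column names (Object, ra, decl), the match radius, and the connection settings are illustrative assumptions rather than the actual DC3b/USNO-B schema, and pairs that straddle sub-chunk boundaries are ignored:
{{{#!python
# Self-join the Object table on subChunkId so an index on (subChunkId, ...)
# can drive the join instead of an explicit sub-partitioning pass.
import time
import MySQLdb                      # MySQL-python driver

RADIUS_DEG = 0.1 / 3600.0           # ~0.1 arcsec box, purely illustrative

NEAR_NEIGHBOR_SQL = """
SELECT o1.objectId, o2.objectId
FROM   Object o1
JOIN   Object o2 ON o1.subChunkId = o2.subChunkId
WHERE  o1.objectId < o2.objectId
  AND  o2.ra   BETWEEN o1.ra   - %(r)s AND o1.ra   + %(r)s
  AND  o2.decl BETWEEN o1.decl - %(r)s AND o1.decl + %(r)s
"""

conn = MySQLdb.connect(host="localhost", db="usnob_test")   # hypothetical DB,
cur = conn.cursor()                                         # credentials omitted
start = time.time()
cur.execute(NEAR_NEIGHBOR_SQL, {"r": RADIUS_DEG})
pairs = cur.fetchall()
print "%d candidate pairs in %.1f s" % (len(pairs), time.time() - start)
conn.close()
}}}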
  • Update on Lawrence Livermore database scalability testing (DanielW)
    • Description: LLNL has provided a number of nodes (currently 25) as a testbed for our scalable query processing system. Testing over many nodes lets us see where our query parallelism model succeeds and fails, and helps us develop a prototype that can handle LSST database query needs (a scatter-gather sketch follows at the end of this item). So far, use of this many-node cluster has uncovered scalability problems in job control, threading, messaging overhead, and queuing, which we have been addressing incrementally in each new iteration (3 so far).
    • Status: developing and testing a new model, since tests in January showed bottlenecks at >4 nodes
    • Hoping to get time on a 64-node cluster at SLAC
    • software will be installed on lsst10 after testing
      • is this qserv?
      • MikeF to get with Jacek on details
    • [Jacek] New resource: a 64-node cluster at SLAC (previously used for PetaCache tests), which we will be able to use for LSST-related scalability tests more or less permanently. Total of 128 CPUs, 1 TB of memory (16 GB per node), and 2 TB of total local storage (34 GB per node).
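    • Scatter-gather sketch (for the query parallelism model described above): a minimal illustration of sending the same sub-query to every worker node and merging the partial results; the worker names, database, and per-chunk table are hypothetical, and this is a sketch of the pattern, not the actual qserv code:
{{{#!python
# Send one sub-query per worker node (one chunk of the partitioned table
# per node) in parallel, then merge the partial results on the client.
import threading
import MySQLdb                       # MySQL-python driver

WORKERS  = ["worker%02d" % i for i in range(1, 26)]   # hypothetical 25 test nodes
SUBQUERY = "SELECT COUNT(*) FROM Object_chunk"        # illustrative per-node chunk table

partials = []                        # list.append is atomic under the GIL

def run_on(host):
    conn = MySQLdb.connect(host=host, db="lsst_test")   # hypothetical database
    cur = conn.cursor()
    cur.execute(SUBQUERY)
    partials.append(cur.fetchone()[0])
    conn.close()

threads = [threading.Thread(target=run_on, args=(h,)) for h in WORKERS]
for t in threads:
    t.start()
for t in threads:
    t.join()

print "total rows across %d chunks: %d" % (len(partials), sum(partials))
}}}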
  • [JeffK] "In Seattle we adopted language in the Science Requirements Document that states that DM will provide Level 1 and Level 2 data products, and APIs for doing Level 3 data products, as well as user-dedicated processing and storage equal to at least 10% of the total DMS system capacities. We should briefly discuss that last item at the next Infrastructure WG meeting."
  • Server Administration at NCSA
    • With the upcoming DC3b runs and the increased need for system reliability once end users have access to our DC data, we are tightening up the processes and procedures for administering the LSST servers at NCSA.
    • New email address: lsst-admin at ncsa.uiuc.edu
      • Scope: technical issues, questions, problems with the servers located at NCSA
      • Helps with communication, coordination, coverage (if someone happens to be unavailable for whatever reason), etc.
  • DNS Names for DC3b Servers
    • Server list
      • Web Data Server
      • Schema Browser
      • Primary Catalog Database
      • iRODS Server
    • Result of discussions: there is no strong motivation to establish a consistent naming scheme for the servers, so we will keep things simple (more intuitive, easier to administer, fewer bugs/problems, etc.): use their "real" names, with no multiple DNS names for the same server.
  • Shared Memory Architectures
    • New Shared Memory HPC Machine at NCSA
      • "Ember will be composed of SGI Altix UV systems with a total of 1,536 processor cores and 8 TB of memory. The system will have 170 TB of storage with 13.5 GB/s I/O bandwidth. Ember will be configured to run applications with moderate to high levels of parallelism (16-384 processors)."
      • http://www.ncsa.illinois.edu/News/10/0302NCSAprovide.html
    • ScaleMP
    • Should we pursue?
  • Skype
    • any interest?

  • Cost Sheet Update
    • Baseline version is v45
    • Current version is now v74
    • Summary of Changes
      • xxx
    • Questions & Notes
      • Ramp-up: one thing in the cost sheet I wonder about is our "ramp up", i.e., we are currently planning to buy 1/3 of the hardware 3 years early, 2/3 two years early, etc. I wonder if 3 years early is a little too soon (a rough cost comparison sketch follows at the end of this item).
    • Upcoming Changes
      • Priority is updating the Power & Cooling estimates
        • LSST-10 Update Power & Cooling at Base Site (info already received from RonL)
        • LSST-47 Power Costs at BaseSite: Use Historical Data to Model Future Power Prices
        • LSST-36 Update Power & Cooling at ArchSite
        • LSST-36 P&C and Floorspace at PCF (rates, payment approach, green features of PCF)
      • LSST-78 Move the 3% CPU spare from document 2116 "CPU Sizing" to document 6284 "Cost Estimate"
      • LSST-79 Add tape library replacement to ArchAOS and BaseAOS
      • LSST-28 Optimal CPU Replacement Policy
      • LSST-14 Processor Sizing Update (Doc2116 LSST CPU Sizing)
      • LSST-37 Missing controller costs for disk
    • Next steps with cost sheet
        • Full review of each of the elements of the cost sheet (boxes of the mapping document)
        • More readable description of the formulas being used
        • Identification and documentation of assumptions
        • Identification and documentation of external data input
      • Serves two significant purposes
        • Allows for better internal reviews (validation of models and information used)
        • Provides justifications for external reviews
      • Results in an update to (or replacement of) Document-1684 and related documents ("Explanation of Cost Estimates")
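    • Ramp-up sketch (for the question above): a back-of-the-envelope comparison of purchase schedules under an assumed annual price decline for fixed capacity; the 20%/yr decline and the reading of the schedule as incremental thirds bought 3, 2, and 1 years early are placeholders, not numbers from the cost sheet:
{{{#!python
# Assumed (placeholder, not from the cost sheet): the same capacity costs
# ANNUAL_DECLINE less each year you wait to buy it.
ANNUAL_DECLINE = 0.20

def relative_cost(schedule):
    """schedule: list of (fraction_of_hardware, years_early) purchases.
    Cost is normalized so buying everything exactly when needed = 1.0."""
    total = 0.0
    for fraction, years_early in schedule:
        price_factor = 1.0 / (1.0 - ANNUAL_DECLINE) ** years_early
        total += fraction * price_factor
    return total

# Current plan, read here as incremental thirds bought 3, 2, and 1 years
# early (adjust to the actual schedule in the cost sheet).
current = [(1/3.0, 3), (1/3.0, 2), (1/3.0, 1)]
# Alternative: shift every purchase one year later.
later   = [(1/3.0, 2), (1/3.0, 1), (1/3.0, 0)]

print "current ramp-up: %.2fx the just-in-time cost" % relative_cost(current)
print "one year later:  %.2fx the just-in-time cost" % relative_cost(later)
}}}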
  • InfraWG Ticket Update

Notes

Attendees: BillB, RobynA, JeffK, JacekB, DanielW, RayP, MikeF

  • TG announcement
    • full award
    • no data loss
  • web data server, iRODS server
    • prep work underway
  • no need for 300MB/s between lsst10 and the SAN; dropping as an agenda item
  • new database servers at SLAC - secondary read-only, qserv, 15-20TB
  • user support
    • use Ephibian?
  • L3 resources to be sized 10% of total DM
  • REDDnet - progressing
  • iRODS - progressing
  • skype - mixed reaction, no strong motivators - dropping
  • Action items reflected in JIRA tickets.
