Fault Tolerance Workshop
A workshop will be held at SLAC on July 15 and 16, 2008, to discuss how to incorporate fault tolerance into the architecture, design, implementation, and schedule of the LSST middleware. It will also cover how the middleware relates to the infrastructure and the applications with regard to fault tolerance.
Location
Kavli Building (051), Room 222.
Dinner location TBD based on interest.
Attendees
- SLAC: Jacek Becla, Gregory Dubois-Felsmann, Kian-Tat Lim, Steffen Luitz
- NCSA: Greg Daues, Ray Plante
- IPAC/Applications: Russ Laher
- Audio-only: Jeff Kantor, Deborah Levine, Francesco Pierfederici
Goal
Prepare a draft document with a fault-tolerant system architecture, high-level designs for fault-tolerant components, and a plan for their implementation. After consultation and review, the document will be presented at PDR, and its plan will be incorporated into the schedules for middleware and application development for DC3 and DC4.
Agenda
| Time (PDT) | Topic | Homework |
|---|---|---|
| July 15, 2008 | | |
| 09:30-09:45 | Welcome and logistics | |
| 09:45-10:30 | Middleware/infrastructure and middleware/application interfaces | Strawman proposal - KTL: FaultToleranceInterfaces |
| 10:30-11:00 | Explicit science requirements | Summary - GDF |
| 11:00-11:15 | DC3 teleconference | |
| 11:15-12:00 | Derived requirements | Summary - GDF |
| 12:00-13:00 | Lunch | |
| 13:00-13:30 | Failure types | Summary - JB: FaultToleranceUseCases |
| 13:30-14:30 | Use cases (specific failure type combinations) | Summary - JB: FaultToleranceUseCases |
| 14:30-14:45 | Break | |
| 14:45-15:45 | "Peer" philosophy and implications | Proposal/Discussion - KTL/RP: FaultToleranceStrategies |
| 15:45-16:45 | "Master" philosophy and implications | Proposal/Discussion - KTL/RP: FaultToleranceStrategies |
| 16:45-17:00 | Summary/Wrap-up | |
| 18:00-20:00 | Dinner | |
| July 16, 2008 | | |
| 09:00-09:15 | Gathering/Review | |
| 09:15-10:30 | Possible combinations of "Peer" and "Master" | Proposal/Discussion - KTL/RP: FaultToleranceStrategies |
| 10:30-10:45 | Break | |
| 10:45-12:00 | Final decisions on architecture/design | |
| 12:00-13:00 | Lunch | |
| 13:00-14:00 | Impact on algorithms | |
| 14:00-14:30 | Impact on pipeline manager/orchestration | |
| 14:30-15:00 | Impact on pipeline harness/framework | |
| 15:00-15:15 | Break | |
| 15:15-15:45 | Impact on pipeline stages | |
| 15:45-16:45 | Implementation plan and schedule | |
| 16:45-17:00 | Summary/Wrap-up/Assignments | |
