Database Scalability in DC3b
Detailed plan
demonstrate near neighbor query on 2-3 nodes (query broken into subqueries, multiple databases, subqueries run in parallel, data partitioned by hand, no cross-server joins, results streamed to client, using xrootd, minimal tools for partitioning). [mostly DW]architecture designrun near neighbor query on lsst10, test performance (includes porting to linux) [JB]research htm/healpix/stomp/dif [SM]research xrootd vs gearman
scalability tests, benchmarking (1st round, 25 nodes)scalability related fixes
- partitioning
implement generic query partitioner [SM]basic, generic query parser in place ("select from where", supports queries that select single row, select from multiple partitions, join with non-partitioned tables) [DW]- 06/15/10 Implement data synthesizer [DW]
- 06/15/10 Demonstrate queries involving partitioned and non-partitioned tables (via shared volume) [DW]
- mysql proxy [JB]
API between the proxy and the master [DW, JB]extracting and passing hintsfetching resultsconcurrency- 06/30/10 ellipse, polygon, circle, "objectId IN" [JB]
- 06/30/10 sql functions to support spatial constraints (ellipses etc) [SM]
DC3b syntax parser and aggregationsyntax support [DW/JB]collating query results [DW]aggregation (sum, average, group by, sort by, ...) [DW]query parser: support aggregation {DW], basic geometry support (bounding box and circle) [SM]
05/15/10 task manager [KT/DW]rewriting backend with new xrootdreceiving and running multiple queries simultaneously
- 07/30/10: data release tools [JB/?]
- complete tools for partitioning data [SM/JB]
- implement admin tools for releasing data (reindexing, marking read only, etc)
- implement monitoring of the system (for troubleshooting/debugging)
- 06/30/10: first release
returning resultsthreading in c++multi-queriesshort queries- 06/15/10 generating objectId,chunkId table from partitioner [SM]
- 07/15/10 preventing obvious killer queries (result size limits) [DW]
- 06/01/10 - 07/31/10 deployment and troubleshooting on lsst10 and 2 slac servers
- documentation
- 06/30/10: for users [DW/JB]
- 07/30/10: for all hands meeting [DW/JB]
- 11/01/10: for PDR [JB/DW]
- 08/31/10 scalability tests and benchmarking (2nd round, 100 nodes)
- including redoing test with 4 and 25 nodes
- cluster management tools
- tests with and without compression
- 08/31/10 adding support for advanced sql syntax
- order by, limit, having etc
- interactions with users
- 09/30/10: basic interactions (query status)
- 12/31/10: task manager (location of query results, query progress/cost)
- 12/31/10 mysql admin commands (for example show processlist)
- 12/31/10 statistics
Beyond DC3b
Tasks left out for DC4:
- 03/31/11 task manager (killing/suspending jobs)
- 06/30/11 shared scans
- inter-node communication (near neighbor query with distance larger than pre-calculated adjacent regions)
- fault tolerance
- multi-front end (for scalability and fault tolerance)
- inserts and deletes (needed by Association Pipeline and unreleased catalog)
After DC4
- semantic analyzer (checking validity of table names, column names etc)
- detecting slow chunk-queries
- sql subqueries
Hardware needed
see dbDC3bHardware
