Last modified on 12/16/2013 02:09:23 PM

Data Placement

This page covers high-level, technology-agnostic aspects of data placement and the associated requirements.

Requirements

  • Must scale to a few hundred nodes, and must also work in a scaled-down version (single machine).
  • Must support data replication.
  • The system should be able to recover from the failure of a replica without disrupting user queries.
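
As a rough illustration of how these requirements interact, here is a minimal sketch, not a proposed design. It assumes data is split into integer-numbered chunks, that replicas are assigned round-robin, and that a set of failed nodes is known from elsewhere; `place_replicas` and `pick_node` are hypothetical names invented for this example.

```python
from typing import Dict, List, Set


def place_replicas(chunk_ids: List[int], nodes: List[str],
                   n_replicas: int) -> Dict[int, List[str]]:
    """Assign each chunk to n_replicas distinct nodes (round-robin; an assumption)."""
    if n_replicas < 1 or n_replicas > len(nodes):
        raise ValueError("n_replicas must be between 1 and the number of nodes")
    placement: Dict[int, List[str]] = {}
    for chunk_id in chunk_ids:
        start = chunk_id % len(nodes)
        # Consecutive nodes (wrapping around) hold the copies of this chunk.
        placement[chunk_id] = [nodes[(start + k) % len(nodes)]
                               for k in range(n_replicas)]
    return placement


def pick_node(chunk_id: int, placement: Dict[int, List[str]],
              failed: Set[str]) -> str:
    """Return the first live node holding the chunk, skipping failed replicas."""
    for node in placement[chunk_id]:
        if node not in failed:
            return node
    raise LookupError(f"no live replica for chunk {chunk_id}")


# Multi-node case: a query for chunk 0 is rerouted when its first replica fails.
placement = place_replicas([0, 1, 2, 3], ["n0", "n1", "n2"], n_replicas=2)
healthy_choice = pick_node(0, placement, failed=set())
failover_choice = pick_node(0, placement, failed={"n0"})

# Scaled-down case: the same code runs on one machine with a single replica.
single = place_replicas([0, 1], ["localhost"], n_replicas=1)
```

The same functions cover both the multi-node and the single-machine configuration. Only the node list and replica count change.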

Open questions

  • Could the number of replicas vary between databases (e.g., L2: 3 replicas, L3: 2 replicas), or should the number of replicas be constant?
  • Should all replicas be the same? Should the replica of node x be distributed across many other nodes?
  • Could we rely on another data center to supply missing data?
  • Could we support load on demand?
    • [from Mario:] Imagine the following scenario: scientist M at university H has access to 200+ nodes with petabytes of storage. They install qserv and configure it in a mode where it initially downloads only the metadata from the LSST Archive (e.g., where the chunks are, how many of them there are, which tables are where, etc.). When a query is fired against that database, it downloads the chunks touched by the query and caches them for future use. Note that if no query ever touches a particular table, its chunks would never get downloaded (e.g., a stellar astronomer would never download the table with bdSamples). After an initial period of downloading/caching, the database would asymptotically approach a steady state where all frequently queried tables are cached locally (thus offloading our database). Bonus points for allowing bootstrapping with mass-downloaded data (or sneakernet-delivered disks), or expiration of rarely accessed chunks. This was a killer feature for LSD [*] (combined with on-the-fly cross-matching against remote catalogs, which is a simple extension). I'm willing to bet people would use it if we supplied it. It may also solve our issue with the number of replicas needed to guard against failure, since we could configure our Archive center database to fetch any chunks that it doesn't have (e.g., because the nodes have failed) from the Chilean or French site. [*] http://mwscience.net/trac/wiki/ReleaseNotes_0_5_4
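
The load-on-demand scenario above can be sketched as a small fetch-through cache. This is only an illustration under stated assumptions: `ChunkCache`, `fetch_remote`, and `preload` are hypothetical names, not part of qserv or LSD. Chunks are keyed by (table, chunk id), and "expiration of rarely accessed chunks" is approximated with least-recently-used eviction.

```python
from collections import OrderedDict
from typing import Callable


class ChunkCache:
    """Local chunk store that downloads a chunk only when a query first touches it."""

    def __init__(self, fetch_remote: Callable[[str, int], bytes], max_chunks: int):
        if max_chunks < 1:
            raise ValueError("max_chunks must be at least 1")
        self._fetch_remote = fetch_remote  # hypothetical call to a remote archive
        self._max_chunks = max_chunks
        self._chunks = OrderedDict()       # (table, chunk_id) -> chunk bytes
        self.downloads = 0                 # number of remote fetches so far

    def preload(self, table: str, chunk_id: int, data: bytes) -> None:
        """Bootstrap the cache with bulk-delivered data (no remote fetch)."""
        self._store((table, chunk_id), data)

    def get(self, table: str, chunk_id: int) -> bytes:
        key = (table, chunk_id)
        if key in self._chunks:
            self._chunks.move_to_end(key)  # mark as recently used
            return self._chunks[key]
        data = self._fetch_remote(table, chunk_id)
        self.downloads += 1
        self._store(key, data)
        return data

    def _store(self, key, data: bytes) -> None:
        self._chunks[key] = data
        self._chunks.move_to_end(key)
        if len(self._chunks) > self._max_chunks:
            self._chunks.popitem(last=False)  # evict least recently used chunk


def fake_archive(table: str, chunk_id: int) -> bytes:
    """Stand-in for the remote archive; returns placeholder bytes."""
    return f"{table}:{chunk_id}".encode()


cache = ChunkCache(fake_archive, max_chunks=2)
cache.get("Object", 1)   # first touch: downloaded
cache.get("Object", 1)   # served from the local cache
cache.get("Object", 2)   # downloaded
cache.get("Source", 3)   # downloaded; evicts ("Object", 1)
cache.get("Object", 1)   # downloaded again after eviction
```

In this sketch a table that is never queried is never fetched. The same fetch path could in principle point at another site to recover chunks lost to node failure, which is the replica-count idea raised in the comment.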