wiki:lsstdb_partImplications
Last modified 12 years ago Last modified on 05/25/2007 06:02:52 PM

Implications of Partitioning

LSST Database Issues to Discuss

We will partition some of the largest tables, including DIASource, Source and Object tables. These tables are expected to be partitioned both at the Base Camp and the Main Archive for efficiency and scalability reasons. Unique ids will be generated in the application code (eg we will not rely on auto_increment), which will quarantee id uniqueness across partitions. We are planning to maintain one table per partition. Main implications:

  • large number of tables
  • joining with these tables becomes more difficult

Large number of tables

The number of partitions at the Base Camp will exceed 300,000 and is optimized for efficient cross-matching and other spatially-localizable tasks. We don't yet know the optimal number of partitions at the Main Archive, but it will likely be measured in thousands. To scale, the tables will be spread across many databases, and in many cases on multiple servers. We will maintain our own mapping between regions and partitions to know which partition to access without scanning all of them.

Joining with partitioned tables

Many LSST tables will be relatively small, and it is cumbersome to partition each of them the same way the biggest tables are partitioned. If we did that, we would end up with millions of tiny tables, plus, some tables simply can't be partitioned the same way the biggest tables can. So we will have to join partitioned tables with the non-partitioned ones. For example we should be able to find difference image sources corresponding to a given alert. To do that, we maintain _DIASource2Alert which binds diaSourceId with alertId. But a given diaSourceId can be in any of the 300,000 + tables spread across multiple databases, so it is no longer a straightforward join to get from Alert to DIASource.

Based on discussion at DataAccWG telecon (2007, May 25), there is not issue at the Base Camp: the only queries will come from Association Pipeline and MOPS, plus QA, but QA will be done during the day and will use non-partitioned DIASource table containing new detections.

One possible solution would be to keep a mapping table of data ids to partitions, or ensure spatial coordinates are in every joined table (assuming the big tables are partitioned spatially). This requires an additional application-side lookup or computation to determine the partition to join to. It is messy for large joins (essentially takes the outer loop out of the database into app code).

Another solution would be to use partitioning offered by RDBMS, for example MySQL partitioning. At the momemt MySQL partitioning is limited to max 1024 partitions and there are no plans to raise this limit.

Another solution would be to embed 20 bit partition id into every id. This add extra complexity, and requires that things are pinned to partitions.

We are planning to come back to this issue (for Main Archive) in the summary or fall of 2007

MOPS and Partitioning

Historical DIASources should only be needed by MOPS for orbit determination, not for orbit prediction, which is instead based on pre-computed orbit parameters. Since Tim has said that orbit determination can likely be done in non-realtime, during the day, and perhaps even at the Archive Center, access to historical DIASources by MOPS at the Base Camp is not likely to be a problem.

We need to cross-match the predicted positions of objects that are within the MOPS search radius of the field of view against the incoming DIASources of positive excursions that have not already been matched with variable objects. There are only a maximum of a few thousand predicted positions, so partitioning of those should not be needed. We just need to union the DIASource partitions.

Now, for orbit determination (i.e., taking three or more DIASource detections and turning them into orbit parameters), it's a different story. Remember that we have been told that this will likely happen at the Archive Center, not at the Base Camp.

There are two cases here: 1) improving the estimate of an orbit for an existing moving object and 2) detecting a brand-new moving object.

For case 1, it's likely simplest to keep a list of DIASources associated with a given moving object (and have a way to get to their partitions). If we only have a pointer from a DIASource to a moving object, we have to search multiple DIASource partitions to find the desired moving object's ID; this is messy. The Object-to-DIASource mapping is simpler, because we can (usually? always?) assume the DIASource will be in the same spatial partition, but keeping a partition id or set of coordinates in the MovingObject?-to-DIASource mapping shouldn't be too hard.

For case 2, MOPS is probably going to have to know all unmatched positive excursion DIASources within a (large) search radius at three (or more) different times and will then try to match them up by fit to an orbit solution, magnitude, shape, etc. The query to obtain this data is complex (at first glance, a three-way join between unmatched DIASource tables for three different views of the same piece of sky), but spatial partitioning can be used to limit the difficulty, if this turns out to be necessary (the number of unmatched DIASources might be small enough that indexing is good enough). There is no requirement for the identical partitioning to be applied to the MOPS orbit catalog, as that is the output.