Last modified on 03/05/2013 12:33:46 PM

Queries in Data Production

This document describes difficult queries we will need to support in data production.

Astrometry

Astrometry can be split into two pieces:

  • Absolute astrometry is the computation of the position of objects with respect to the external reference frame J2000. Until we get a better reference catalog, this is a very difficult procedure that is subject to a large number of error sources and has only a negligible scientific return. For the rest of this discussion, the details of this calculation will be ignored. We presume only a coarse, perhaps 0.1 arcsecond, astrometric calibration of the LSST data, and we already know how to do this.
  • Differential astrometry deals with the computation of the changes in an object's position as a function of time (and other things), and is done in relatively small regions of the sky. It can be done to a much higher accuracy than absolute astrometry. Differential astrometry is sufficient to satisfy most of the goals in the SRD.

The size of the region of sky over which a single differential astrometric solution can be generated is limited by the seeing. Atmospheric seeing is a complicated issue, but the relevant aspect is how the turbulence changes the refraction, which in turn changes the apparent places of the stars. Although we would like to use as large a patch of sky as possible, we are limited by the number of sources in the patch and by the sophistication of the model that characterizes (and removes) the effects of the seeing. Based on simulation, the likely size of the region to be used by LSST is several arcminutes.

So, from the database point of view, differential astrometry needs to scan the Object Catalog, extracting Objects grouped into tiles on the sky as of a common epoch, obtaining each object exactly once. There is no requirement for Objects to be related to nearby Objects except through the fixed tiling scheme. For each Object, the algorithm will need to obtain all observations in the Source Catalog. For each Source, it will need to obtain exposure information from the Exposure Catalog, including the CCD used for the exposure. This assumes that the deep detection pipeline has been able to properly associate Sources with Objects.

Columns fetched from Object catalog:

  • objectId
  • position (ra, decl)
  • motion (muRA, muDecl)
  • parallax
  • refraction (for ra and decl)

Columns fetched from Source Catalog:

  • unique exposure id (to point to metadata: epoch, field rotation, pointing, etc.)
  • column and row position to an accuracy of about 0.001 pixel
  • ccd and raft
  • brightness, but only accurate enough for statistical weighting (perhaps 0.1 mag)

We need to run this once per data release.
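
The access pattern described above can be sketched in Python against an in-memory SQLite database. This is only an illustration, not the production design: the tileId column, the colPix and rowPix column names, and the simplified three-table schema are assumptions made for this example. Note that the inner joins skip Objects that have no associated Sources.

```python
import sqlite3
from collections import defaultdict


def build_demo():
    """Create a tiny, hypothetical schema with a few rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Object (objectId INTEGER PRIMARY KEY, tileId INTEGER,
                             ra REAL, decl REAL);
        CREATE TABLE Exposure (exposureId INTEGER PRIMARY KEY, epoch REAL,
                               ccd INTEGER);
        CREATE TABLE Source (sourceId INTEGER PRIMARY KEY, objectId INTEGER,
                             exposureId INTEGER, colPix REAL, rowPix REAL);
        INSERT INTO Object VALUES (1, 10, 0.1, 0.1), (2, 10, 0.2, 0.1),
                                  (3, 11, 5.0, 0.1);
        INSERT INTO Exposure VALUES (100, 55000.0, 7), (101, 55001.0, 8);
        INSERT INTO Source VALUES (1, 1, 100, 12.001, 40.002),
                                  (2, 1, 101, 12.003, 40.001),
                                  (3, 2, 100, 90.500, 17.250),
                                  (4, 3, 101, 33.125, 61.750);
    """)
    return conn


def scan_by_tile(conn):
    """Yield (tileId, {objectId: [observations]}) one tile at a time.

    Each observation is (exposureId, colPix, rowPix, epoch, ccd).
    Every Object belongs to exactly one tile, so it is seen exactly once.
    """
    query = """
        SELECT o.tileId, o.objectId,
               s.exposureId, s.colPix, s.rowPix, e.epoch, e.ccd
        FROM Object AS o
        JOIN Source AS s ON s.objectId = o.objectId
        JOIN Exposure AS e ON e.exposureId = s.exposureId
        ORDER BY o.tileId, o.objectId, e.epoch
    """
    current_tile = None
    objects = defaultdict(list)
    for tile, object_id, *observation in conn.execute(query):
        if current_tile is not None and tile != current_tile:
            # The tile is complete: hand it to the astrometric solver.
            yield current_tile, dict(objects)
            objects = defaultdict(list)
        current_tile = tile
        objects[object_id].append(tuple(observation))
    if current_tile is not None:
        yield current_tile, dict(objects)
```

A caller would loop with `for tile_id, objects in scan_by_tile(conn):` and fit one differential solution per tile. Ordering by tile keeps memory bounded to a single tile at a time.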

Weak Lensing

[based on input from David Wittman]

For each galaxy:

  • give me all the postage stamp images
  • was it ever split?
  • what is its (mean) magnitude, position, size, ellipticity?

Photometric Calibration

According to Tim Axelrod, photometric calibration will need to iterate first over atmospheric data from the auxiliary telescope, ordered by time, to determine the length of the time windows for which a single atmospheric model is sufficient and begin computing that model. Since the time windows are unlikely to span more than one night, it may be possible to parallelize this by observing night. Queries might look something like SELECT * FROM aux_LIDARshot WHERE time BETWEEN :min AND :max ORDER BY time.
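
As a rough sketch of this first pass, the snippet below runs the time-ordered query and splits the result into per-night groups that could be processed in parallel. The aux_LIDARshot columns (time as MJD in days, transparency), the night_boundary value, and the helper names are assumptions for illustration only.

```python
import math
import sqlite3
from itertools import groupby


def build_demo():
    """Create a hypothetical aux_LIDARshot table with unordered rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE aux_LIDARshot (time REAL, transparency REAL)")
    conn.executemany(
        "INSERT INTO aux_LIDARshot VALUES (?, ?)",
        [(55001.9, 0.90), (55000.9, 0.95), (55001.2, 0.93),
         (55002.1, 0.91), (55001.1, 0.94)],
    )
    return conn


def fetch_shots(conn, t_min, t_max):
    """Return auxiliary-telescope rows in the window, ordered by time."""
    return conn.execute(
        "SELECT * FROM aux_LIDARshot "
        "WHERE time BETWEEN :min AND :max ORDER BY time",
        {"min": t_min, "max": t_max},
    ).fetchall()


def split_by_night(rows, night_boundary=0.5):
    """Group time-ordered rows by observing night.

    night_boundary is a placeholder for the site-dependent fraction of a
    day at which one observing night ends and the next begins.
    """
    def night_of(row):
        return math.floor(row[0] - night_boundary)

    # groupby only needs rows sorted by key, which time ordering guarantees.
    return {night: list(group) for night, group in groupby(rows, key=night_of)}
```

Each value in the returned dictionary is an independent, time-ordered slice that a worker could scan to decide how long a single atmospheric model remains valid.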

It will then iterate over all Exposures within each time window, completing the atmospheric model by using telescope information in the Exposure metadata, magnitudes of some of the Sources obtained from that Exposure, and reference "top of the atmosphere" magnitudes for the Objects corresponding to those Sources. Queries might look like SELECT * FROM Science_FPA_Exposure WHERE time BETWEEN :min AND :max for the Exposures, and SELECT * FROM Source INNER JOIN ReferenceObject USING (objectId) WHERE ampExposureId = :aeID for the Sources of each Exposure.
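
The join between Sources and reference Objects can be illustrated as follows. The mean magnitude offset computed here is a deliberately crude stand-in for the real atmospheric model, and the refMag column and the table layouts are assumptions for this sketch.

```python
import sqlite3


def build_demo():
    """Create hypothetical Source and ReferenceObject tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ReferenceObject (objectId INTEGER PRIMARY KEY, refMag REAL);
        CREATE TABLE Source (sourceId INTEGER PRIMARY KEY, objectId INTEGER,
                             ampExposureId INTEGER, psfMag REAL);
        INSERT INTO ReferenceObject VALUES (1, 18.00), (2, 17.00);
        INSERT INTO Source VALUES (1, 1, 500, 18.30), (2, 2, 500, 17.25),
                                  (3, 9, 500, 20.00), (4, 1, 501, 18.10);
    """)
    return conn


def extinction_offset(conn, amp_exposure_id):
    """Mean (observed - reference) magnitude for one amplifier exposure.

    Sources without a matching ReferenceObject are dropped by the inner
    join. Returns None when no Source has a reference magnitude.
    """
    rows = conn.execute(
        """
        SELECT s.psfMag, r.refMag
        FROM Source AS s
        INNER JOIN ReferenceObject AS r USING (objectId)
        WHERE s.ampExposureId = :aeID
        """,
        {"aeID": amp_exposure_id},
    ).fetchall()
    if not rows:
        return None
    return sum(psf - ref for psf, ref in rows) / len(rows)
```

In the demo data, the Source with objectId 9 has no reference entry, so it does not contribute to the offset for exposure 500.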

The result will be an estimate for the atmospheric extinction by time, position, and wavelength. The calibration algorithm can then iterate through the complete set of Sources within the time window, reversing the extinction to determine "top of the atmosphere" magnitudes for each. Queries here might look like UPDATE Source SET modelMag = SomeComplexFunction(psfMag, :various, :other, :parameters).
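
One way to prototype such an UPDATE is to register the correction as a user-defined SQL function, as in the sketch below using Python's sqlite3 module. The two-argument form of SomeComplexFunction, the simple subtraction it performs, and the restriction to one ampExposureId are assumptions for illustration; the real function would take more parameters.

```python
import sqlite3


def build_demo():
    """Create a hypothetical Source table and register the SQL function."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Source (sourceId INTEGER PRIMARY KEY, objectId INTEGER,
                             ampExposureId INTEGER, psfMag REAL, modelMag REAL);
        INSERT INTO Source VALUES (1, 1, 500, 18.30, NULL),
                                  (2, 2, 500, 17.25, NULL),
                                  (3, 3, 501, 16.00, NULL);
    """)
    # Placeholder correction: subtract the extinction from the observed
    # magnitude to approximate the "top of the atmosphere" value.
    conn.create_function(
        "SomeComplexFunction", 2,
        lambda psf_mag, extinction: psf_mag - extinction,
    )
    return conn


def apply_extinction(conn, amp_exposure_id, extinction):
    """Set modelMag for every Source in one exposure; return rows updated."""
    cursor = conn.execute(
        "UPDATE Source "
        "SET modelMag = SomeComplexFunction(psfMag, :extinction) "
        "WHERE ampExposureId = :aeID",
        {"extinction": extinction, "aeID": amp_exposure_id},
    )
    conn.commit()
    return cursor.rowcount
```

Calling `apply_extinction(conn, 500, 0.275)` updates only the two Sources from exposure 500 and leaves the others untouched. The WHERE clause keeps each UPDATE scoped to the Sources that the extinction estimate actually covers.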

Finally, it will iterate over every Object, analyzing its corresponding set of (now-adjusted) Sources, and identifying variable objects from the light curve.
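
A minimal sketch of this last pass is shown below. It flags an Object as variable when the reduced chi-square of its magnitudes about their weighted mean exceeds a threshold. This test is a common simple choice, not necessarily the algorithm LSST will use, and the modelMagErr column and the threshold value are assumptions.

```python
import sqlite3
from itertools import groupby


def build_demo():
    """Create a hypothetical Source table with calibrated magnitudes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Source (objectId INTEGER, modelMag REAL, modelMagErr REAL)"
    )
    conn.executemany(
        "INSERT INTO Source VALUES (?, ?, ?)",
        [(1, 18.00, 0.02), (1, 18.01, 0.02), (1, 17.99, 0.02),
         (2, 17.00, 0.02), (2, 17.50, 0.02), (2, 17.00, 0.02),
         (3, 16.00, 0.02)],
    )
    return conn


def reduced_chi2(mags, errs):
    """Reduced chi-square of the magnitudes about their weighted mean."""
    weights = [1.0 / e ** 2 for e in errs]
    mean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)


def find_variables(conn, threshold=3.0):
    """Return objectIds whose light curves look inconsistent with constant."""
    rows = conn.execute(
        "SELECT objectId, modelMag, modelMagErr FROM Source ORDER BY objectId"
    )
    variables = []
    for object_id, group in groupby(rows, key=lambda r: r[0]):
        points = list(group)
        if len(points) < 2:
            continue  # a single measurement cannot show variability
        mags = [p[1] for p in points]
        errs = [p[2] for p in points]
        if reduced_chi2(mags, errs) > threshold:
            variables.append(object_id)
    return variables
```

Ordering by objectId lets the scan process one light curve at a time without holding the whole Source table in memory.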

to be continued... not finished