wiki:Winter2014/Bosch/CameraGeom+Wcs
Last modified 5 years ago Last modified on 10/21/2013 11:40:07 AM

Bosch's Refactoring Thoughts: cameraGeom and Wcs

Comments are welcome, but please add them in italics and prefix with your initials or username; insert them directly after the text you want to comment.

RHL: Split on-disk representation from the description of the detectors in the focal plane

The current camGeom mixes the concept of, "Handle the on-disk layout of FITS files" with "describe how the detectors are laid out in the focal plane".  This is very confusing, and should be split.  The as-delivered camera object should know nothing about decisions taken by the camera team.  We do, of course, need access to that on-disk layout.

The Coordinate System Graph

It is useful to view the problem of representing coordinate systems in code from the perspective of a graph: coordinate systems are nodes, while transforms are edges. Any unknown edge on the graph can generally be computed by composing transforms that correspond to known edges as the graph is traversed between the starting and ending nodes. The vast majority of the coordinate systems we deal with are 2-d, and I think it's useful to focus only on these at first. We can then divide coordinate systems into two clear categories: Cartesian coordinate systems and spherical coordinate systems. In this view, a Wcs object is simply a particular kind of edge in the graph: one that connects a Cartesian coordinate system node to a spherical coordinate system node. But there are clearly two other core kinds of edges as well, which are similarly important: Cartesian-Cartesian transforms, and spherical-spherical transforms. In our current stack, the latter are handled by the Coord class constructors, as spherical coordinate systems are represented by a tag (i.e. "Galactic", "ICRS") and possibly an epoch, and given how infrequently our code has to deal with different spherical coordinate systems, I think this is largely sufficient. We also have a relatively new class, afw::geom::XYTransform, that represents a Cartesian-Cartesian transform, and one of my main arguments for cameraGeom and Wcs refactoring is that this class should be used more, and in fact be the main API through which most coordinate system transformations proceed.

Here are a few concrete proposals that flow naturally from this perspective, but we should be on the lookout for more:

  • All APIs that current operate on the composition of two Wcss (with the spherical coordinate system node on the "inside"), such as image warping, should instead take an XYTransform. This is already how the WarpedPsf class works, and we already have an XYTransform subclass that is implemented simply by combining two Wcss in this way.
  • We should either make AffineTransform and LinearTransform subclasses of XYTransform or create XYTransform subclasses that adapt them to the XYTransform interface. I'm not sure at this point which would be the better approach.
  • We should add an XYTransform that simply composes two (or more?) XYTransforms. This should be done through an API that allows XYTransform subclasses to return a something other than the naive composition when possible (i.e. composing two affine transforms should return another affine transform, as this is just a matrix multiplication).
  • We should provide a Wcs subclass that combines a simple, wcslib-provided Cartesian-spherical transform with an arbitrary XYTransform, which could then represent the camera and atmospheric distortions currently in the SIP terms of a TanWcs, but in a more flexible way: we could use polynomials, Chebyshevs, radial analytic functions from cameraGeom, or any combination of these (a Chebyshev perturbation to a cameraGeom-provided radial function seems like a particularly well-motivated choice). Eventually I envision this replacing the TanWcs class.
  • cameraGeom should use XYTransform extensively to define the relationships between its various coordinate systems. Note that the number of coordinate systems in cameraGeom is potentially very large, and in my opinion it should be completely flexible, with a few predefined coordinate systems that will always exist. These include the pixel coordinate systems of each detector and a boresight-centered physical-unit coordinate system in the focal plane. We may also want a coordinate system that represents positions on the filters, on rafts, and different pixel coordinate systems that include or ignore small-scale effects due to electric field variations like "tree rings". I'm not enough of an expert on hardware or astrometry to be able to enumerate all those coordinate systems, but I think it's clear we will have a lot of them, and I think it makes sense to make them as flexible as possible, and to use XYTransform to define the relationships between them.
  • If we intend to rely on algorithms that match astrometry between two images (i.e. as has been at least proposed for image differencing, though I'm not sure if it's still the plan), these algorithms should fit for and return XYTransforms, rather than Wcs objects; we can then create a Wcs if desired by composing this XYTransform with whichever of the two images' absolute astrometry we trust more.

This doesn't necessarily mean we'll want a cameraGeom implementation based on a graph algorithms; it may be better to enumerate a few always-present coordinate systems and require arbitrary coordinate systems to be defined relative to one of these, allowing us to determine the path we'd take between any two nodes on the (logical) coordinate system graph with using general graph traversal algorithms or ever creating a data structure that holds the graph.

cameraGeom Classes

I believe the current cameraGeom classes are a bit too polymorphic, which causes them to require more downcasting than should be necessary and making iteration uglier. Instead, I propose the following hierarchy, which I believe works for all cameras:

  • A single Camera object, which holds one or more Raft objects.
  • Each Raft object holds one or more Detector objects.
  • Camera, Raft, and Detector are all nonpolymorphic, so Camera can provide an iteration over Rafts and Raft can provide iteration over its Detectors, without the user having to cast the result of the iteration from a base class to a derived class.
  • Camera should additionally provide direct iteration over all Detectors of all Rafts; this allows Cameras for which a Raft concept is not useful (i.e. they have only one Raft) to simply ignore the Raft level. Algorithms that don't care about Rafts (most don't!) can do the same.
  • Rafts should provide a back-reference to their Camera, and Detectors should provide a back-reference to their Raft (and hence indirectly, their Camera).
  • Each Detector holds a Sensor object, which is the base class of a polymorphic hierarchy; Ccd would then be a subclass of Sensor. Detector itself would handle all the purely geometric information (i.e. the XYTransforms that map its coordinate systems to the coordinate systems defined by its Raft and Camera), while Sensor and its subclasses would handle electronic information that may be dependent on the type of detector, including subdivisions into amplifiers as appropriate. Algorithms that care about the electronic information would then have to do some downcasting, but I believe this is unavoidable and this design keeps that downcasting to a minimum.
  • A Camera also holds a container of possible Filters (I think it's natural to move Filter definitions from the mappers to cameraGeom), along with a flag that indicates whether the Filter choice is determined by the visit and applies to all detectors (most cameras) or is a static attribute of the Detector (i.e. SDSS). I don't think we need to worry about other possibilities until someone builds such a camera; to my knowledge, no one has (aside from spectrographs).

It may seem a little strange that Detector and Sensor have such different roles, that they have a composition relationship (a Detector has a Sensor), and that a Ccd is a Sensor but is not a Detector. Good code design "is-a" and "has-a" relationships don't have to match the real-world "is-a" and "has-a" relationships, however, and trying to make them match is often a recipe for poor code design (this is the classic "a circle is not an ellipse" problem)

Camera Versioning

One of the biggest requirements for the new cameraGeom is the ability to have multiple versions of the same camera. I believe these should be controlled largely in the same manner as detrend files; a particular camera version is associated with a particular date range, and is automatically attached to the correct exposures using this information. There are almost certainly subtleties to doing this that I'm not aware of, because I'm really not that familiar with how detrends are handled, but I think this is the right broad approach.

Immutability and Copying

I believe it's important that an entire Camera hierarchy be immutable after construction, and moreover that two equivalent Camera hierarchies should never exist in memory at the same time. These two requirements go together; if something is immutable, you never have to clone it to ensure the one you have doesn't change underneath you. Together, these let you compare objects for equivalence simply using pointer equality.

I believe this is important because the coordinate systems a Camera hierarchy contains form a graph, and a Detector on its own is not really a complete object from this perspective; the back-reference to its Raft and Camera are crucial pieces. We don't want to have the same full Camera hierarchy duplicated in memory many times simply because we have many Exposure objects in memory (each of which should have a Detector object), and given the complexity of building a Camera from its on-disk description, we probably don't want to spend time reading it multiple times.

Immutability also matches the actual use of cameraGeom objects: these are generally constructed once and then never modified. Even algorithms that may provide constraints on cameraGeom (i.e. astrometric ubercal) should generally operate by producing a new cameraGeom description, so updates to the camera description can be carefully controlled (see Camera Versioning, above). It does present a bit of a challenge for the camera-building process, however: a camera is a fairly complex hierarchy of classes, and we'd likely want mutable versions of all of the classes involved to use while we're in the process of putting it together, and that'd likely produce a lot of duplication and/or boilerplate. Rather than strict C++ const-based immutability, then, I think we may want to just have a dynamic "freezing" of a camera hierarchy, after which calling a mutators on any member of the hierarchy would throw an exception. We'd thus also want a way to create an unfrozen copy of a camera, which could be modified before it is frozen and/or persisted.

The requirement that we should not unpersist same camera twice is actually already met by our current Mapper classes, as they load the camera description once in advance, and simply attach it to all Exposures loaded thereafter. I don't think things will be quite so simple in the future with versioning, but I think it's still quite doable: the Butler would simply have to save a dictionary of {version:instance} instead of a single Camera instance. In fact, as discussed on my butler/mapper page, I think the Butler may want to have a weak-reference dictionary that caches and avoids re-loading for all unpersisted objects, not just camera geometry.

Serialization and Definition

One of the top priorities in a cameraGeom rewrite is clearly replacing the Policy-file definition format. I don't have a complete proposal for how to do this, but I do have some strong opinions about what it should and should not involve:

  • It should *not* be based on pex_config; data types in pex_config are limited and their implementations complicated by the history-tracking and override support, which is something cameraGeom does not need. That said, we may want to use execed Python files for the cameraGeom, just as pex_config does, but with either Python built-ins (i.e. dictionaries and lists) or custom cameraGeom classes (when you think about it, we wouldn't need that many).
  • The Mapper class for a particular camera also requires a certain amount of camera definition (albeit much less than cameraGeom). If at all possible, we should be able to define this shared definition in only one place, and probably in the code itself (i.e. imported, not `exec'ed Python; git-versioned, not versioned as a calibration data product). At the level the Mapper needs to know about the camera, it really isn't expected to change.
  • We should not expect to be able to fully define a camera in a human-readable text file; rather, we should expect that a camera may need to contain fully polymorphic objects (e.g. XYTransform instances, filter curves) serialized through the same arbitrary-object serialization framework (see my persistence page) we use elsewhere. Some of these (i.e. simple offsets or affine transforms) can be easily defined in a text file, but we may need more complex, camera-specific classes for distortions. Often these will be machined-generated or machine-updated, and I don't believe we gain much by creating a new, human-readable persistence framework just to allow these objects to be put into the text file that defines the overall camera structure; it'd be better for that text file to be allowed to contain links to objects persisted in other files.

Naming, Organization, and Dependencies

I would advocate renaming the afw::cameraGeom package and namespace to simply afw::camera, as the package defines more than just geometry. If at all possible, it should depend only on afw::geom and packages below afw. I suspect it will at least require access to the Mask class from afw::image, but it should avoid depending on Exposure or any of its non-image components. It may also need limited dependendencies on afw::table::io for persistence, unless that can be moved to a package below afw.

I would also like to see the Wcs classes moved out of afw::image and into afw::coord, or possibly afw::geom (in which case we should also move the Coord class hierarchy into afw::geom and remove afw::coord). In any case, Wcs should absolutely be in the same namespace as the Coord classes.

ajc comments:

Integration with other parts of the project

cameraGeom has the potential to work across multiple groups outside of DM (e.g. opsim, phosim etc). We should decide if this is a goal of the rewrite as it could have a number of consequences (though maybe not at the level of hierarchy description here): the raft, sensor, etc will need to have a representation of height and 3-D position including variations in the sensor thickness; tying to the butler could make it more difficult for others to adopt cameraGeom; integrating the filter into the camera may require a representation of the filter geometry; definition of the charge blocks, non square pixels, tree-rings etc should be incorporable