Changes between Version 13 and Version 14 of ImageMetadataProposal


Timestamp: 08/28/2008 12:09:10 AM
Author: ktl
Comment: Updated proposal.

  • ImageMetadataProposal

    v13 v14  
    44 
    55= Proposal for Metadata For Image-Like Objects =  
     6 
     7The main text of this proposal has not been changed significantly.  Some clarifications have been added, particularly with regard to persistence of _extraMetadata items, as a result of the Middleware/Apps Task Force meeting (see [#Extractedfromnotesof26-Aug-2008MiddlewareApplicationsTaskForcetelecon below]). 
    68 
    79== The Present Design == 
     
    6971We should consider eliminating the !MaskedImage class and using Exposure instead. These two classes are very similar, differing mainly in whether they contain exposure-related metadata. Furthermore, the main use case for !MaskedImage is to contain exposure data, so the metadata is wanted. 
    7072 
    71 The safest thing to do for now is probably modify Exposure to subclass MaskedImage rather than contain a MaskedImage. This gives users of Exposure more direct access to the image data and eliminates the need to deal with the MaskedImage class directly. We can then check later to see if there are any users of MaskedImage and if not, we can remove it without modifying any existing code. 
     73The safest thing to do for now is probably modify Exposure to subclass !MaskedImage rather than contain a !MaskedImage. This gives users of Exposure more direct access to the image data and eliminates the need to deal with the !MaskedImage class directly. We can then check later to see if there are any users of !MaskedImage and if not, we can remove it without modifying any existing code. 
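
A minimal sketch of the "subclass rather than contain" arrangement described above. The class bodies, constructor arguments and the `wcs` attribute are hypothetical stand-ins, not the actual LSST classes.

```python
class MaskedImage:
    """Hypothetical stand-in: image, mask and variance planes."""

    def __init__(self, image, mask, variance):
        self.image = image
        self.mask = mask
        self.variance = variance

    def getImage(self):
        return self.image


class Exposure(MaskedImage):
    """An Exposure *is a* MaskedImage and adds exposure-related metadata."""

    def __init__(self, image, mask, variance, wcs=None):
        super().__init__(image, mask, variance)
        self.wcs = wcs  # illustrative metadata item


exposure = Exposure(
    image=[[1, 2], [3, 4]],
    mask=[[0, 0], [0, 1]],
    variance=[[1, 1], [1, 1]],
    wcs="hypothetical WCS",
)
```

With this layout, code written against `MaskedImage` accepts an `Exposure` unchanged, and users of `Exposure` reach the image planes directly instead of going through a contained object.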
    7274 
    7375== Prototype Design == 
     
    119121        borderline metadata/provenance cases: 
    120122   
    121  * RA/Dec and oriention on sky of center of focal plane. I would exclude this from Exposure since WCS contains the same information in a more directly useful form. 
     123 * RA/Dec and orientation on sky of center of focal plane. I would exclude this from Exposure since WCS contains the same information in a more directly useful form. 
    122124 * bad pixel mask ID (int?) 
    123125 * CCD ID or info. This is likely to be a field of !AmpInfo, in which case we have it in Exposure whether we want it or not. But I doubt data processing will have any use for it. 
    124  * ISR otuput that is provenance (an object); defined by Nicole. 
     126 * ISR output that is provenance (an object); defined by Nicole. 
    125127 
    126128!TemplateExposure inherits from Exposure and adds: 
     
    167169== Should _extraMetadata Be Persisted To the Database? == 
    168170 
    169 K-T noted that in the present design, only standard known entries from Image._metaData are persisted to the database; other fields are ignored. (FITS files are different; all entries are persisted to FITS files, if possible). The obvious translation to the new design is that '''no''' entries in _extraMetadata are persisted to the database. This agrees with the design goal that _extraMetadata is not for standard data. If this is too limiting then we can persist such data in a Character Large Object field (CLOB). 
     171K-T noted that in the present design, only standard known entries from Image._metaData are persisted to the database; other fields are ignored. (FITS files are different; all entries are persisted to FITS files, if possible).  In the new design, to enable round-tripping of FITS files from non-LSST sources, we will persist all _extraMetadata values to a separate table, likely as strings, and retrieve them when the object is retrieved. 
     172 
     173------ 
     174== Extracted from notes of 26-Aug-2008 Middleware/Applications Task Force telecon == 
     175 
     176Discussion: 
     177 
     178    * Proposal: Instance variables for most metadata. Persistence Formatter maps file/database form to/from instance variables. !DataProperty for new items or items that are not processed by code. 
     179    * TA agrees that Formatter is the equivalent of the adapter he was seeking. 
     180    * Can't get rid of the !DataProperty; similarly, can't get rid of the instance variables. Do need to enforce consistency between the two unless there is no duplication. 
     181    * Non-LSST metadata needs to be preserved. 
     182    * May need to maintain readability of prior releases' data using later releases' software. 
     183    * Boundary between instance variables and !DataProperty may vary with time. 
     184    * Analysts will want to annotate data arbitrarily at any time; camera may add metadata arbitrarily. 
     185    * Would like to have a single interface for accessing metadata, both from code and from SQL. Looks like it is impossible to do from SQL with an RDBMS. Could do from code accessing metadata in objects (with some difficulty) or from code accessing the database (relatively easily). 
     186    * FITS keywords need not be read into a !DataProperty first, although that may be the simplest implementation, since many will end up there.  
     187 
     188Decisions: 
     189 
     190    * Have instance variables and a separate !DataProperty as in the proposal. These are mutually exclusive; no metadata is in both. 
     191    * Formatter converts to/from FITS keywords and database columns. 
     192    * Round-trip from input FITS to output FITS must be guaranteed not to lose any information (may gain some), even if object is persisted to a database in between. 
     193    * We will investigate whether we should seek to define a "namespace" for LSST-specific keywords in FITS files.  
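
The decisions above can be sketched roughly as follows. This assumes a plain dict stands in for both the FITS header and the !DataProperty. The keyword table, the attribute names and the two functions are hypothetical and are not the real Formatter interface.

```python
# Hypothetical mapping from FITS keyword to instance variable name.
KNOWN_KEYWORDS = {"EXPTIME": "exposureTime", "FILTER": "filterName"}


class Exposure:
    def __init__(self):
        self.exposureTime = None
        self.filterName = None
        self.extraMetadata = {}  # stands in for the DataProperty


def from_fits_header(header):
    """Split a header into instance variables and extra metadata.

    Each keyword lands in exactly one place, so no item is duplicated.
    """
    exposure = Exposure()
    for keyword, value in header.items():
        if keyword in KNOWN_KEYWORDS:
            setattr(exposure, KNOWN_KEYWORDS[keyword], value)
        else:
            exposure.extraMetadata[keyword] = value
    return exposure


def to_fits_header(exposure):
    """Rebuild a header from both kinds of metadata."""
    header = dict(exposure.extraMetadata)
    for keyword, attribute in KNOWN_KEYWORDS.items():
        value = getattr(exposure, attribute)
        if value is not None:
            header[keyword] = value
    return header
```

Because unknown keywords are carried along in `extraMetadata` instead of being dropped, converting a header to an object and back yields the same keywords and values.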
     194 
     195Comment by TA: I agree with all the points above, and offer the following scenario to make explicit some of the concerns surrounding the shifting boundary between instance variables and !DataProperty: 
     196 
     197    * The Camera team decides after the survey has been running for a while that there is a significant parameter they want to add to the image metadata, which they call "TEMP32". They add it, and as a result, there is a new "TEMP32" item in the !DataProperty that goes into the database as the metadata associated with every image. This addition does not require any software change for DM. 
     198    * Data quality analysts create various plots and metrics that utilize "TEMP32", which they access via an SQL query involving the image metadata (no, I don't know exactly how to do that either...) (KTL: Something like "`SELECT CAST(value AS DECIMAL(10,6)) FROM ImageMetadata WHERE key = 'TEMP32' AND imageId = ...`".  We could also provide a view that makes known entries in the !ImageMetadata table look like columns in the view, but this could be expensive.  Finally, we could provide a `getMetadata` stored procedure that would hide the table location.) 
     199    * Six months later, Nicole decides that she can use "TEMP32" to compensate for a CTE effect that has been a problem for the ISR, so she promotes "TEMP32" to be an instance variable of the Image class. She tries it out, and decides to include this in a new release of the Nightly pipeline software.  (KTL: This would also require a new release of the `ImageFormatter`, or `ExposureFormatter`.) 
     200 
     201Now we have a couple of problems: 
     202 
     203    * Since "TEMP32" has been promoted to an instance variable, the DB schema has to include a new column for the Image table (really one of the Exposure tables, if we are sticking to the schema as it is). Somehow we have to accommodate this schema change "on the fly", by which I mean that it happens between data releases. (KTL: This would involve yet another release of the `ExposureFormatter`, a new column in the database, and a query to transform all of the entries in `ImageMetadata` into entries in the new column.) 
     204    * Jacek and KT wave their magic wands and change the schema, but now the DQ analysts, and all those external scientists (who have also gotten addicted to using "TEMP32") find all their software is broken, since "TEMP32" is no longer in the Image metadata.  (KTL: Unless they were using the view or stored procedure with SQL or a similar `getMetadata()` method on Exposure.) 
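
A rough sketch of the `getMetadata` idea and of the promotion step in the scenario above, using Python's `sqlite3` module purely for illustration. The table layouts, the `expTime` column and the helper names `get_metadata` and `promote` are assumptions; the real schema and database engine would differ (SQLite has `REAL` where the query above uses `DECIMAL`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Exposure (imageId INTEGER PRIMARY KEY, expTime REAL);
    CREATE TABLE ImageMetadata (imageId INTEGER, key TEXT, value TEXT);
    INSERT INTO Exposure VALUES (1, 15.0);
    INSERT INTO ImageMetadata VALUES (1, 'TEMP32', '271.300000');
""")


def get_metadata(conn, image_id, key):
    """Return a metadata value as a string, wherever it is stored."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(Exposure)")]
    if key in columns:
        # key matched a real column name, so it is safe to interpolate here.
        row = conn.execute(
            f'SELECT "{key}" FROM Exposure WHERE imageId = ?', (image_id,)
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT value FROM ImageMetadata WHERE key = ? AND imageId = ?",
            (key, image_id),
        ).fetchone()
    return None if row is None or row[0] is None else str(row[0])


def promote(conn, key):
    """Move a key from ImageMetadata into a new REAL column of Exposure.

    Sketch only: key is trusted here and is not sanitized.
    """
    conn.execute(f'ALTER TABLE Exposure ADD COLUMN "{key}" REAL')
    conn.execute(
        f'UPDATE Exposure SET "{key}" = ('
        "SELECT CAST(m.value AS REAL) FROM ImageMetadata m "
        "WHERE m.imageId = Exposure.imageId AND m.key = ?)",
        (key,),
    )
    conn.execute("DELETE FROM ImageMetadata WHERE key = ?", (key,))
```

Callers that go through `get_metadata` keep working after `promote(conn, "TEMP32")`; only code that queries `ImageMetadata` directly breaks, which is the failure described in the last bullet.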
    170205 
    171206------ 
     
    180215Russell wants this proposal discussed and agreed to as soon as possible so we can design the ISR with it in mind. To that end we spent the rest of the meeting discussing it. 
    181216 
    182  * Does it make sense for LSST standard metadata to be represented as metadata-specific member variables, rather than as name:value pairs in a general DataProperty. (Note: name:value DataProperty metadata will still be supported for experimental code, test code, etc.) 
     217 * Does it make sense for LSST standard metadata to be represented as metadata-specific member variables, rather than as name:value pairs in a general !DataProperty? (Note: name:value !DataProperty metadata will still be supported for experimental code, test code, etc.) 
    183218 
    184219Andy Becker worries about needing to modify the class every time we add a bit of metadata. 
     
    211246 
    212247 
    213  * Should we keep MaskedImage or combine it with Exposure into one class? Also, do we want more than one kind of Exposure? 
    214  
    215 Nicole and Andy could not think of any use case for a bare MaskedImage, but neither they nor Russell have carefully looked through the EA model to be sure. 
     248 * Should we keep !MaskedImage or combine it with Exposure into one class? Also, do we want more than one kind of Exposure? 
     249 
     250Nicole and Andy could not think of any use case for a bare !MaskedImage, but neither they nor Russell have carefully looked through the EA model to be sure. 
    216251 
    217252K-T then asked if we should have different kinds of Exposures for different kinds of images, e.g. bias vs science image. This would be sensible if the metadata is very different (especially if the metadata is in instance variables). Andy felt that the different kinds of images weren't different enough to justify this. 
    218253 
    219 Ray pointed out that we can avoid the need for different kinds of Exposures if we use a DataProperty for most metadata (as we do now). In other words, using dynamically typed metadata. 
     254Ray pointed out that we can avoid the need for different kinds of Exposures if we use a !DataProperty for most metadata (as we do now). In other words, using dynamically typed metadata. 
    220255 
    221256However, Andy was pretty sure we only need one kind of Exposure in any case. 
    222257 
    223 The consensus was that we probably only need one kind of Exposure and that we can probably do without MaskedImage. But given the uncertainties it makes sense to change Exposure to inherit from MaskedImage (rather than contain a MaskedImage, as now) and prefer Exposure to maskedImage and see how that works out. If it works well then we can get rid of Exposure with minimal pain later. 
     258The consensus was that we probably only need one kind of Exposure and that we can probably do without !MaskedImage. But given the uncertainties it makes sense to change Exposure to inherit from !MaskedImage (rather than contain a !MaskedImage, as now) and prefer Exposure to !MaskedImage and see how that works out. If it works well then we can get rid of !MaskedImage with minimal pain later. 
    224259 
    225260 * What about obscure metadata that is only needed by one or two tiny bits of code? 
     
    231266 - Get rid of metadata in Image provided Tim agrees. 
    232267 - Avoid duplicate metadata. There should only be one obvious place to look for an item of metadata and it should never be duplicated in a given object. 
    233  - Combine MaskedImage and Exposure only if MaskedImage truly not needed and there is only one kind of Exposure. (Change Exposure to inherit from MaskedImage otherwise). 
     268 - Combine !MaskedImage and Exposure only if !MaskedImage truly not needed and there is only one kind of Exposure. (Change Exposure to inherit from !MaskedImage otherwise). 
    234269 - It is OK to put standard LSST metadata in member variables. But we probably need to add a new lazyMetadata member variable. 
    235270 
     
    240275I think the separation of metadata into class members and a list of DataProperties needlessly complicates a situation which is intrinsically complicated anyway.   It offers yet another opportunity for a given metadata item to be represented in two locations with conflicting values.    Changing the class definition every time we decide we want another metadata item usefully available will lead to a maintenance nightmare, as others have suggested. 
    241276 
    242 It still seems to me that we need a uniform way of making all image metadata available to a class, and a list of DataProperties is not a bad way to do it.   I recognize the convenience of having some frequently used metadata values available as class members.   Perhaps this could be done in a way that avoids the maintenance issues by constructing some sort of cache of these values from the DataProperty list when a class member is constructed.   The cache entries would point to the relevant list members, thus avoiding the need to search the list whenever you need something.   I haven't thought this through, and it has some obvious problems - but perhaps it has a useful germ of an idea. 
     277It still seems to me that we need a uniform way of making all image metadata available to a class, and a list of DataProperties is not a bad way to do it.   I recognize the convenience of having some frequently used metadata values available as class members.   Perhaps this could be done in a way that avoids the maintenance issues by constructing some sort of cache of these values from the !DataProperty list when a class member is constructed.   The cache entries would point to the relevant list members, thus avoiding the need to search the list whenever you need something.   I haven't thought this through, and it has some obvious problems - but perhaps it has a useful germ of an idea. 
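
One possible reading of the cache idea above, as a hedged sketch. The `DataProperty` stand-in, the `FREQUENT` tuple and the `exposureTime` accessor are hypothetical and are not taken from the existing code.

```python
class DataProperty:
    """Hypothetical stand-in for one name:value metadata item."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class Exposure:
    FREQUENT = ("EXPTIME", "FILTER")  # hypothetical choice of cached names

    def __init__(self, properties):
        self.properties = list(properties)
        # Cache references to the list members themselves, not copies,
        # so each value still lives in exactly one place.
        self._cache = {
            p.name: p for p in self.properties if p.name in self.FREQUENT
        }

    @property
    def exposureTime(self):
        # Looks like a class member but reads through to the list entry.
        return self._cache["EXPTIME"].value

    @exposureTime.setter
    def exposureTime(self, value):
        self._cache["EXPTIME"].value = value
```

Adding a new frequently used item then means adding a name to `FREQUENT` and an accessor, and a value cannot disagree with its list entry because there is only one copy.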
    243278 
    244279 * Robert Lupton comments on above 
     
    266301 
    267302==== re: Should the Image and Mask classes have metadata (beyond the bare minimum required for the classes) ==== 
    268 This seems hard to work with.   Consider this scenario:   The camera team creates a Mask that has planes for bad pixels, charge traps, etc. That Mask naturally has metadata, perhaps lots of it, including parameters that were used to run the bad pixel finding algorithms, etc.    I now want to create a MaskedImage from that Mask and a raw camera image, and I reasonably expect to have all the relevant metadata accessible.    How do I do that in the proposed scheme? 
     303This seems hard to work with.   Consider this scenario:   The camera team creates a Mask that has planes for bad pixels, charge traps, etc. That Mask naturally has metadata, perhaps lots of it, including parameters that were used to run the bad pixel finding algorithms, etc.    I now want to create a !MaskedImage from that Mask and a raw camera image, and I reasonably expect to have all the relevant metadata accessible.    How do I do that in the proposed scheme? 
    269304 
    270305==== re: What about obscure metadata that is only needed by one or two tiny bits of code? ====