Ticket #810 (deferred defect)

Opened 10 years ago

Last modified 7 years ago

Provide information about visit/ccd/amp for every log message

Reported by: rhl Owned by: RayPlante
Priority: critical Milestone:
Component: pex_logging Keywords:
Cc: Blocked By:
Blocking: Project: LSST
Version Number: 3.3.4
How to repeat:

Try to find which visit/ccd/amp caused the

Failed to find WCS solution; leaving raw Wcs unchanged

messages in rlp1188

Description

The DC3a logging doesn't provide enough information in the logger table (or the on-disk logs) to associate a given error message with a visit/ccd/amp.

This makes debugging extremely difficult. I understand that Ray has some changes ready for DC3b, but I think that they need to be moved up to DC3a.

Change History

comment:1 follow-up: ↓ 3 Changed 10 years ago by ktl

If we do this, it should be done using setPreambleProperty(). pex_logging internally should not need to know anything about visits/CCDs/amps.

comment:2 Changed 10 years ago by ktl

Although it would be convenient to have this information in every event, I'm worried about bulking up the events and the logs. Another way to handle this would be to maintain this information as state in a (relatively) well-known place. For example, we could log the contents of each event that was received. Any log message after that would correspond to that event.

Other thoughts: the pipeline cannot determine visits/CCDs/amps until an event has arrived. It seems like it would be difficult to have a generic way to call setPreambleProperty() in the pipeline harness. It would be easier to do this in the stage receiving the event, but would that affect other stages' logs? When more than one event is received, how do we decide which event properties should be logged and which should not? For some stages, the properties of one event or the other may be important; for other stages, the properties of both events may be important.

comment:3 in reply to: ↑ 1 Changed 10 years ago by RayPlante

  • Status changed from new to assigned

Replying to ktl:

If we do this, it should be done using setPreambleProperty(). pex_logging internally should not need to know anything about visits/CCDs/amps.

Yes, this is the correct approach. I've been playing with different techniques for capturing extra properties within the harness. I believe I can recommend a practice for application level properties that would include things like avisits, CCDs, and amps.

comment:4 follow-up: ↓ 5 Changed 10 years ago by ktl

It sounds from Ray's comment like it will be up to individual stages to log the application properties they feel are relevant. Is that acceptable to Robert?

comment:5 in reply to: ↑ 4 Changed 10 years ago by RayPlante

Replying to ktl:

It sounds from Ray's comment like it will be up to individual stages to log the application properties they feel are relevant. Is that acceptable to Robert?

Certainly a stage is able to log what ever properties it wants; however, it would be good to have more of some standard-ish properties always recorded as it makes, for example, it easier to grab all messages (in the database) related to the same visit. The visit ID is something that need only be set once--when it is known. We can have, for instance, the input stage set the visitId as log property, but we need to re-use that log instance throughout the Slice. So we can set down some patterns which do this conveniently.

comment:6 follow-up: ↓ 7 Changed 10 years ago by rhl

KT and I discussed this a little offline. I wrote:

I want to identify uniquely the state of a stage so I can comprehend the error conditions. Why can't I require that someone putting an item onto the clipboard also put information about that item, indexed by the same name? If the item's an exposure, they'd provide (visit, e0, c0003, a06) or whatever; that's a python list but a string would be OK too. Let's call it a "signature".

KT responded

The state of a stage is primarily determined by its slice number and the input events available when it executes. All of the information you are interested in is available (though perhaps indirectly) from those events. The problems are that the events are not logged and that tying the events to stages/clipboard items is not yet straightforward.

I don't quite understand this comment. If KT's saying that I can get my signature from information that is, or could be, in the database then I am happy. Is he saying that? I thought that other remarks implied that he didn't think that the desired information was available without inspecting the insides of clipboard items, which we don't want the middleware to do.

How do we decide what goes into that signature, and how do we use it afterwards? this seems to imply that there must be a strong linkage between the signature and the persisted form of the clipboard item (if any exists). Ensuring that may be difficult.

I don't think that we converged. My proposal still makes sense to me; we're not trying to use this signature for provenance or anything like that, just to provide enough context to associate an error message with the data being processed via a join on the stage ID. It's the job of the person putting an item onto the clipboard to ensure that the signature's useful.

comment:7 in reply to: ↑ 6 Changed 10 years ago by ktl

Replying to rhl:

I want to identify uniquely the state of a stage so I can comprehend the error conditions. Why can't I require that someone putting an item onto the clipboard also put information about that item, indexed by the same name?

Uniquely identifying the state of the producer of a clipboard item is somewhat different from uniquely identifying the state of the consuming stage. One is metadata about a particular piece of data (the clipboard item); the other is information about a particular code execution.

we're [...] trying to use this signature to provide enough context to associate an error message with the data being processed via a join on the stage ID.

The error message is produced by application (stage) code, not by data. How is the middleware to know (in a generic way) what data "signatures" should be associated with a given error message? What Ray appears to be proposing is that the item producer can set the "signature" via setPreambleProperty() on a logging instance and that the consuming stage code would have to arrange (manually) to use that same logging instance to generate its error message. Is this sufficient? It's not clear to me if the intersection of Ray's and Robert's suggestions is that each clipboard data item has its own associated logging instance, although that is one possible interpretation.

comment:8 Changed 10 years ago by RayPlante

KT is essentially correct. One application stage close to the loading of the visit data can register the desired properties with a log that can persist across stages. Other stages need to use the same log (or one created from it) to have those same properties attached to their own messages. This is not difficult to do, but it does require a consistent use pattern.

I'll note that I was not planning to fix this before the completion of DC3a because:

  • the relevent visitid, ccd id, and amp id are all discernible from the log messages, though not as conveniently as we would like.
  • the correct solution for this ticket would require combing through a lot of application code to ensure that the required pattern is employed.

When using the log files, figuring out the current data being processed is not difficult; one just scans backwards to the last data loading stage. The inconvenience is more noticeable if one is using the database to select log messages. However, there are consistent patterns that can be leveraged to map slice id and loop number to the data ids:

  • a straight-forward SQL query can list the visit ids in order, giving the mapping from loop number to visit id
  • the same ccd id and amp id is always assigned to the same slice.

hope this helps.

comment:9 Changed 8 years ago by RayPlante

  • Status changed from assigned to inTicketWork

comment:10 Changed 8 years ago by RayPlante

  • Status changed from inTicketWork to deferred

Alas, I cannot confirm that this was finished completely as planned. We should review this ticket again when we review logging and the pipeline harness overall.

comment:11 Changed 7 years ago by robyn

  • Milestone DC3a MW DB MWI Improvements deleted

Milestone DC3a MW DB MWI Improvements deleted

Note: See TracTickets for help on using tickets.