LinkTracklets is the name of the algorithm and software used to build tracks of moving object detections. It was developed as the central item in Jeremy Kubica's PhD thesis. It is so named because it uses tracklets generated by FindTracklets as a tool to speed up the search for tracks.
The following illustrates three tracklets (A, B, C) with implied linear motion being linked into a single track.
Tracks are sets of point detections which follow a quadratic curve (that is, their motion in RA and Dec can each be approximated using initial position, velocity, and acceleration) and which fulfill some additional requirements, if specified. When we talk about the tracks in some data, we mean all sets of detections which fit our criteria; this will include "true" or "correct" tracks which link together only detections of the same object and "false" or "incorrect" tracks, which link together detections attributable to multiple objects. It is important to bear in mind that we will find both, as the latter are generally the overwhelming majority of the output.
Generally, we impose the following requirements on tracks we wish to report:
 They must contain detections from three or more distinct nights (otherwise, we cannot use them for reliable orbit determination)
 Their RA and Dec accelerations must be within some defined limits (specified by the user)
 Their detections must be within some given distance of the bestfit line describing their motion (specified by the user)
Additionally, we usually apply a window to the set of image times within which we search for tracks. Longer time windows allow us to find objects which may have been observed less frequently, but over longer times objects are more significantly displaced by acceleration and their velocity fluctuates more, and these tend to increase the number of tracks which we find by quite a large number.
Note that linkTracklets works purely in sky (RA/Dec) coordinates, and not in spherical coordinates (i.e. the quadratic fit is quadratic in RA/Dec, not on the sphere).
Algorithm
LinkTracklets is implemented mainly through a treewalk on KDTrees. It was originally implemented in C by Jeremy Kubica as part of the Auton package, and has also been partially implemented in LSSTC++.
Outer Loop
Rough psuedocode for linkTracklets would read as follows, given a set of tracklets from various nights:
Let TrackletTrees be an empty map from tracklet start time to a KDTree on (RA0, Dec0, dRA/dt, dDec/dt) Output := {} For each unique tracklet start time, start_time: stTracklets := all tracklets with start time == start_time stKDT := a KDTree of those tracklets represented in (RA0, Dec0, dRA/dt, dDec/dt) TrackletTrees := TrackletTrees + stKDT For each tree in TrackletTrees, startTree: For each later tree in TrackletTrees, endTree: SupportTrees := all trees in TrackletTrees between start_time and end_time Output := Output + findTracks(start_Tree, supportTrees, end_Tree) return Output
We will explain findTracks shortly, but first a few details:
Note that this is quadratic on the number of tracklet start times. To reduce this pain, Jeremy Kubica's software implements "plateing", under which tracklets which are sufficiently close in start time are projected to a common time and an additional error threshold is added in to their approximate location and velocity. This is not yet implemented in the LSSTC++ version.
findTracks: The meat of the algorithm
The findTracks function will use one tree (startTree) to find possible starting tracklets for tracks, a second (endTree) to find possible end tracklets for tracks, and a set of intermediate trees (supportTrees) which are used to find middle points ("support points") for the tracks.
Each KDTree is responsible for holding data from a given region of space; the algorithm here is based mainly on finding regions of (RA0, Dec0, dRA/dt, dDec/dt)space which are "compatible"  that is, such that any object within one space could reach the other space. If two spaces are incompatible, it is possible to ignore the possibility of any track passing from one tree to the second, and thus the searching can be abandoned within those spaces. By performing a mutually recursive treewalk of the start tree and the end tree, with the set of support trees filtered down to only those which are mutually compatible with the start and end trees, it is possible to terminate much unnecessary searching by abandoning sets of trees which are incompatible or for which there exists no possible support.
The following image demonstrates the concept of spatial compatibility, but simplified to (Dec, dDec/dt)space.
In this image, we imagine searching for regions of space which are compatible with region A, which covers a range of declinations (shown by a vertical line) and a range of declination velocities (approximated with arrows.) The blue region of space show the areas which are "compatible" with region A; note that the area of compatibility grows with respect to time as we allow for some acceleration factor. Outside this blue space is region B, which is not compatible because although it contains velocities similar to that of A, its location is too far away; no object in A could reach region B within the change in time. Similarly, though region B is reachable in declination, its velocity range is not feasible  no object in A could have changed velocity so extremely by that time. Only region C is compatible with region A, as its location is reachable and its velocity range consistent.
Note that in practice, unlike this example, we are concerned with whether regions are compatible in all four of RA, Dec, dRA/dt, and dDec/dt.
Using this notion of compatibility, we can explain the findTracks algorithm with the following psuedocode:
findTracks(startTree, supportTrees, endTree): if startTree and endTree are not compatible: # no possible track could start at startTree and end at endTree return {} else: # prune and refine the support trees for s in supportTrees: if compatible(s, startTree) AND compatible(s, endTree): if s has children: remove s from supportTrees add s's children to supportTrees else: do nothing else: remove s from supportTrees if supportTrees < min_support_points: return {} if startTree is a leaf and endTree is a leaf: # terminal case of recursion return findTracksBruteForce(startTree, supportTrees, endTree) else: # recursive case, keep going # choose a node to split and recurse. if startTree > endTree: return findTracks(startTree>leftChild, supportTrees, endTree) + findTracks(startTree>rightChild, supportTrees, endTree) else: return findTracks(startTree, supportTrees, endTree>leftChild) + findTracks(startTree>rightChild, supportTrees, endTree)
findTracksBruteForce: The final terminal case
Once the recursion above (findTracks) finds two leaf nodes and some support points, the actual bruteforce linking is done. This is done quite simply; the important caveat is that we find compatible endpoints in our "starting tracklets" tree and "ending tracklets" tree, and we greedily add every support point which lies along the track.
findTracksBruteForce(startLeafNode, supportTrees, endLeafNode): results = {} for startTracklet in startLeafNode: for endTracklet in endLeafNode: newTrack := [startTracklet + endTracklet] if newTrack has a wellfitting line: for each support point sup in supportTrees: if sup lies along newTrack and is a better fit than the best point at sup.time: newTrack := newTrack + [sup] if newTrack has enough support: results += newTrack return results
Note that this is quadratic on the number of tracklets held in each leaf, but this number is constant. So the real time taken by this function is actually the product of the number of compatible endpoints from the start treenode and the endtree node times the number of support points in all support nodes.
Distributing linkTrackets
There is some ongoing work at the University of Arizona to investigate distribution of the workload for this algorithm. The algorithm consists of two parts: a multiply recursive treewalk (above called findTracks) and a section for linking of individual tracklets and DiaSources?, which is performed when compatible leaf nodes are found (above called findTracksBruteForce).
findTracks examines the tree nodes exclusively, and findTracksBruteForce only uses the tracklets and DiaSources?. This provides a fairly natural separation of the data, and it is also somewhat scalable in that the size of a leaf node is settable at tree creation time, meaning that the trees can actually be shrunk arbitrarily small (though this does affect performance).
The current approach is use a network of processors arranged in a master/slave relationship. The "master" processor is responsible for running the findTracks section and identifying the compatible sets of leaf nodes. Each set of compatible leaf nodes (first endpoint, final endpoint, and support nodes) is called a "work item" because it represents a location of a possible track. These work items are noted and distributed to worker nodes which are responsible for performing the findTracksBruteForce method on each work item and outputting the resulting tracks.
Since a multiply recursive treewalk can be trivially parallelized, if needed it will be possible in the future to use multiple "master" processors a pool of "slave" processors as well.
Attachments

3tracklets.png
(11.9 KB) 
added by jmyers 9 years ago.
An illustration of three tracklets being linked into a track

compatiblespaces.png
(26.2 KB) 
added by jmyers 9 years ago.
An illustration of "compatible spaces" as used in linkTracklets; here simplified to just declination and dec velocity