
Version 2 (modified by dubcovsky, 10 years ago)

--

Third Party Software Proposal - ndarray

Jim Bosch (of UC Davis) has written a C++ template-based N-dimensional array library called ndarray. I'd like to submit ndarray for use in LSST as a third-party package. I would like to use this library in the development of Multifit, but it is a general-purpose library that could be useful in other packages as well.

It has a lot in common with Blitz++ and particularly boost::multi_array, which some of you may be familiar with. However, it is designed to map more easily to numpy arrays in Python, it has some nice optimizations for contiguous arrays, and it uses Eigen to do any mathematical heavy lifting. It also has a fairly complete set of tests, and I think it should fit in well with what already exists in the LSST DM stack, particularly in interfacing with Eigen and FFTW.

The ndarray library has no dependencies on LSST software.
The library includes a set of unit tests, which also serve as examples.
Jim Bosch will continue to support it, should additional work or support be needed. Further support should not be a large concern: this is a very small library, which we could support ourselves if the need arose.

Applications to Multifit

Multifit does a lot of reinterpreting and reshaping of its arrays and subarrays, including changes of dimensionality. The underlying data is just an image, but we often treat it as a vector. Or, rather, it is a collection of differently-sized images that we flatten into one giant vector. That alone is easy enough to support without a complete multidimensional array library, but we will also deal quite a bit with matrices in which the rows or columns correspond to this giant flattened pixel vector and the other dimension corresponds to something else entirely (such as the derivative of a model with respect to some set of parameters).
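To make the flattening concrete, here is a minimal sketch in plain standard C++. It does not use ndarray at all; the names Image, Span, FlattenedPixels, flatten and pixelAt are hypothetical and exist only for this illustration. It assumes row-major pixel storage and shows the bookkeeping needed to find pixel (y, x) of a given image inside the giant vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types for illustration only; not part of ndarray.
struct Image {
    std::size_t height;
    std::size_t width;
    std::vector<double> pixels;  // row-major, size == height * width
};

// Records where one image's pixels live inside the flattened vector.
struct Span {
    std::size_t offset;  // index of the image's first pixel
    std::size_t height;
    std::size_t width;
};

struct FlattenedPixels {
    std::vector<double> data;  // all pixels of all images, concatenated
    std::vector<Span> spans;   // one entry per input image
};

// Concatenate differently-sized images into one vector,
// remembering each image's offset and shape.
FlattenedPixels flatten(const std::vector<Image>& images) {
    FlattenedPixels out;
    for (const Image& im : images) {
        assert(im.pixels.size() == im.height * im.width);
        out.spans.push_back(Span{out.data.size(), im.height, im.width});
        out.data.insert(out.data.end(), im.pixels.begin(), im.pixels.end());
    }
    return out;
}

// Access pixel (y, x) of image i through the flattened vector.
double& pixelAt(FlattenedPixels& flat, std::size_t i,
                std::size_t y, std::size_t x) {
    const Span& s = flat.spans[i];
    assert(y < s.height && x < s.width);
    return flat.data[s.offset + y * s.width + x];
}
```

Writing through pixelAt modifies the flattened vector directly, so the image interpretation and the vector interpretation always agree. A full array library replaces this hand-written index arithmetic with reusable views.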

Because we'll be allocating those as giant vectors and matrices, then setting their coefficients by extracting subarrays, reshaping them to 2 or 3 dimensions, and dealing with them as (y,x) images or (y,x,parameter) 3-tensors, a full multidimensional array library started to sound like a good idea. There are some places where I could envision wanting two different parameter dimensions, leading to 4-tensors (y, x, intrinsic params, psf params), but we might instead be able to combine pairs of dimensions, or dot one of the dimensions with a vector, without ever instantiating the 4-tensor.
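The reshaping described above can be sketched the same way, again in plain standard C++ rather than with ndarray. Tensor3View, viewImageBlock and contractParams are hypothetical names for this illustration. The sketch assumes a row-major matrix whose rows are flattened pixels and whose columns are parameters: it views the rows belonging to one image as a (y,x,parameter) 3-tensor without copying, then dots the parameter dimension with a vector to get a (y,x) image.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view for illustration only; not the ndarray API.
struct Tensor3View {
    double* data;  // first element of the viewed block
    std::size_t height;
    std::size_t width;
    std::size_t nParams;

    // Element (y, x, p): pixel row (y * width + x), parameter column p.
    double& operator()(std::size_t y, std::size_t x, std::size_t p) const {
        assert(y < height && x < width && p < nParams);
        return data[(y * width + x) * nParams + p];
    }
};

// View rows [rowOffset, rowOffset + height * width) of a row-major
// (nPixels x nParams) matrix as a (y, x, parameter) 3-tensor. No copy is made.
Tensor3View viewImageBlock(std::vector<double>& matrix, std::size_t nParams,
                           std::size_t rowOffset,
                           std::size_t height, std::size_t width) {
    assert((rowOffset + height * width) * nParams <= matrix.size());
    return Tensor3View{matrix.data() + rowOffset * nParams,
                       height, width, nParams};
}

// Dot the parameter dimension with a vector, producing a row-major (y, x) image.
std::vector<double> contractParams(const Tensor3View& t,
                                   const std::vector<double>& v) {
    assert(v.size() == t.nParams);
    std::vector<double> image(t.height * t.width, 0.0);
    for (std::size_t y = 0; y < t.height; ++y) {
        for (std::size_t x = 0; x < t.width; ++x) {
            for (std::size_t p = 0; p < t.nParams; ++p) {
                image[y * t.width + x] += t(y, x, p) * v[p];
            }
        }
    }
    return image;
}
```

The view only stores a pointer and a shape, so it stays valid only as long as the underlying matrix is neither resized nor destroyed. The contraction shows the 3-tensor case; the same pattern of summing over one dimension on the fly is what would let a 4-tensor be avoided.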

Uses beyond Multifit

  • The ndarray types map very nicely to numpy arrays, and I have utility functions for converting between ndarray objects and PyObject* that can be used to make nice SWIG typemaps (or the equivalent in another wrapper tool). I've found this particularly useful in testing, when you want to get some data generated in Python into C++ as simply as possible.
  • Shallow Eigen objects can be extracted from arrays and subarrays without copying, which makes it very easy to do optimized linear algebra in-place on general (and potentially large) arrays.

Additional Materials

Doxygen Documentation
