from Software Standards

Unit Testing Policy

Introduction

Unit testing validates the implementation of all objects from the lowest level defined in the detailed design (classes and functions) up to and including the lowest level in the architectural design (robustness diagram). The next layer of testing, known as integration testing, tests the interfaces between the architectural design (robustness diagram) objects. This Policy only addresses Unit Testing.

Refer to "Introduction into testing or why testing is worth the effort" by John R. Phillips for a short but good discussion of Unit Testing and its transition into Integration Testing.

Types of Unit Tests

Unit tests should be developed from the detailed design of the baseline, i.e. from either the structure diagrams or the class/function definitions. The types of tests performed during unit testing include:

  • White-box Tests
    • designed by examining the internal logic of each module and defining the input data sets that force the execution of different paths through the logic. Each input data set is a test case.
  • Black-box Tests
    • designed by examining the specification of each module and defining input data sets that will result in different behavior (e.g. outputs). Black-box tests should be designed to exercise the software for its whole range of inputs. Each input data set is a test case.
  • Performance Tests
    • if the detailed design placed resource constraints on the performance of a module, compliance with these constraints should be tested. Each input data set is a test case.

The collection of a module's white-box, black-box, and performance tests is known as a Unit Test suite. A rough measure of a Unit Test suite's quality is the percentage of the module's code which is exercised when the test suite is executed.
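As an illustration of the three test types, the following is a minimal sketch of a Unit Test suite written with Python's unittest. The function clamp, the test class names, and the one-second time budget are hypothetical and chosen only for this example; a real suite would take its cases and resource limits from the module's detailed design.

```python
import time
import unittest


def clamp(value, low, high):
    """Hypothetical module under test: restrict value to [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


class ClampWhiteBoxTest(unittest.TestCase):
    """One test case per path through clamp's internal logic."""

    def test_below_range_path(self):
        self.assertEqual(clamp(-5, 0, 10), 0)

    def test_above_range_path(self):
        self.assertEqual(clamp(15, 0, 10), 10)

    def test_in_range_path(self):
        self.assertEqual(clamp(5, 0, 10), 5)


class ClampBlackBoxTest(unittest.TestCase):
    """Input sets chosen from the specification, including boundaries."""

    def test_boundaries_are_returned_unchanged(self):
        self.assertEqual(clamp(0, 0, 10), 0)
        self.assertEqual(clamp(10, 0, 10), 10)

    def test_result_always_within_range(self):
        for value in range(-100, 101):
            self.assertTrue(0 <= clamp(value, 0, 10) <= 10)


class ClampPerformanceTest(unittest.TestCase):
    """Checks an assumed resource constraint from the detailed design."""

    MAX_SECONDS = 1.0  # assumed budget, for illustration only

    def test_many_calls_within_time_budget(self):
        start = time.perf_counter()
        for value in range(100000):
            clamp(value, 0, 10)
        self.assertLess(time.perf_counter() - start, self.MAX_SECONDS)


def build_suite():
    """Collect the three kinds of tests into one Unit Test suite."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (ClampWhiteBoxTest, ClampBlackBoxTest, ClampPerformanceTest):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return suite
```

The suite can be executed with unittest.TextTestRunner().run(build_suite()).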

Due to the nature of white-box testing, it is best if the original developer of an object creates the test suite validating that object. During later object modification, the current developer should update the test suite to validate the changed internals.

LSST DM developers should create test suites to unit test all objects they implement. They should also update an object's test suite after any modification of that object.

Testing Frameworks

Test suite execution should be managed by a testing framework, also known as a test harness, which monitors the execution status of individual test cases.

C++: boost.test

LSST DM developers should use the single-header variant of the Boost Unit Test Framework.

It may not be obvious how to test private functions using the Boost unit test macros. Should this come up, consult the standard methods described in private function testing.

Python: unittest

LSST DM developers should use the Python unittest framework. Use the Python inline help feature, for example help(unittest), for more information on the unittest interface.

Java

To be determined, but probably JUnit.

Unit Testing Composite Objects

Data Management uses a bottom-up testing method where validated objects are tested with, then added to, a validated baseline. That baseline, in turn, is used as the new validated baseline for further iterative testing.

When developing test suites for composite objects, the developer should first ensure that adequate test suites exist for the base objects.
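The bottom-up order can be sketched as follows: the base object's suite runs first, and the composite object's suite runs only if the base validated. Point, Segment, their test classes, and run_bottom_up are hypothetical names used only for this illustration, not part of any DM package.

```python
import io
import unittest


class Point:
    """Hypothetical base object."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Segment:
    """Hypothetical composite object built from two Points."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def length(self):
        dx = self.end.x - self.start.x
        dy = self.end.y - self.start.y
        return (dx * dx + dy * dy) ** 0.5


class PointTest(unittest.TestCase):
    def test_coordinates_are_stored(self):
        point = Point(3, 4)
        self.assertEqual((point.x, point.y), (3, 4))


class SegmentTest(unittest.TestCase):
    def test_length(self):
        segment = Segment(Point(0, 0), Point(3, 4))
        self.assertAlmostEqual(segment.length(), 5.0)


def run_bottom_up(ordered_test_cases):
    """Run suites lowest level first and stop at the first failing level.

    Returns the names of the test case classes that validated.
    """
    validated = []
    loader = unittest.TestLoader()
    for test_case in ordered_test_cases:
        suite = loader.loadTestsFromTestCase(test_case)
        # Discard the runner's text report; only pass/fail is needed here.
        runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
        if not runner.run(suite).wasSuccessful():
            break  # do not test composites on an unvalidated baseline
        validated.append(test_case.__name__)
    return validated
```

Calling run_bottom_up([PointTest, SegmentTest]) tests Point before Segment, so a Segment failure is not confused with a defect in Point.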

Verifying Test Quality

Since Unit Tests are used to validate the implementation of detailed design objects through comprehensive testing, it is important to measure the thoroughness of the test suite. Coverage analysis does this by executing instrumented code, which records the complete execution path through the code, and then calculating metrics indicative of the coverage achieved during execution.

Coverage analysis examines the output of code instrumented to record every line executed, every conditional branch taken, and every block executed. Then, using static information such as the total number of lines of code, branches, and blocks, and the lists of functions and class methods, it generates metrics on:

  • % of statements executed
  • % of methods (and/or functions) executed
  • % of conditional branches executed
  • % of a method's (and/or function's) entry/exit branches taken.

The metrics give a general idea of the thoroughness of the unit tests. The most valuable part of a coverage analysis report is the color-coded source listing, in which the statements not exercised and the branches not taken are strikingly evident. These color-coded coverage holes clearly show the developer where unit tests need improvement.

Using the Coverage Analysis reports, the LSST DM developer should determine code segments which have not been adequately tested and should then revise the unit test suite as appropriate.
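To make the percentage metrics concrete, the sketch below derives them from executed and total counts. The CoverageCounts record and the sample numbers are hypothetical; real counts would come from a coverage tool's output.

```python
from dataclasses import dataclass


@dataclass
class CoverageCounts:
    """Executed and total counts for one module (hypothetical record)."""

    statements_executed: int
    statements_total: int
    functions_executed: int
    functions_total: int
    branches_executed: int
    branches_total: int


def percent(executed, total):
    """Return a percentage; an empty category is treated as fully covered."""
    return 100.0 if total == 0 else 100.0 * executed / total


def summarize(counts):
    """Compute the coverage percentages for one module."""
    return {
        "statements": percent(counts.statements_executed, counts.statements_total),
        "functions": percent(counts.functions_executed, counts.functions_total),
        "branches": percent(counts.branches_executed, counts.branches_total),
    }


# Invented sample: 45 of 60 statements, 9 of 10 functions, 14 of 20 branches.
sample = CoverageCounts(45, 60, 9, 10, 14, 20)
summary = summarize(sample)
```

For the invented sample, the summary is 75% of statements, 90% of functions, and 70% of branches, which would point the developer at the untested branches first.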

DM Coverage Analysis Metrics

Refer to "Code Coverage Analysis" by Steve Cornett for an excellent discussion of coverage metrics, and to "Minimum Acceptable Code Coverage", also by Steve Cornett, for the companion discussion on determining 'good-enough' overall test coverage.

No specific metric for lines of code executed or for conditional branches executed is defined for Data Challenge 3. The metrics will first be defined for DC4 and will be incremented for each new baseline until Construction begins.

Using Coverage Analysis Tools

C++

LSST scons builds will automatically instrument all object and link modules with coverage counters when invoked with:

scons profile=gcov

This passes --coverage to all compile and link steps, which is equivalent to -fprofile-arcs -ftest-coverage on compile and -lgcov on link.

Executing the instrumented program causes coverage output to be accumulated. Each instrumented object file has two associated files in the object file's directory: a .gcno file, written when the source is compiled, and a .gcda file, written when the program runs. Successive runs add to the .gcda files, resulting in a cumulative picture of object coverage.

Use one of the following tools to create the coverage analysis reports to verify that your unit testing coverage is adequate. Editor's preference is for either ggcov or tggcov since only the local source files are processed; see below for details.

gcov

gcov is the original coverage analysis tool delivered with the GNU C/C++ compilers. The coverage analysis output is placed in the current directory. The analysis is done on all source and include files to which the tool is directed, so be prepared for reports on all accessed system *.hpp files if you use gcov.

Use the following to generate coverage analysis on the LSST <module>/src directory:

cd <module>
scons profile=gcov
gcov -b -o src/ src/*.cc src.gcov >& src_gcov.log
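The annotated *.gcov files that gcov writes are plain text, with each source line prefixed by an execution count and a line number, and with never-executed lines marked by "#####". The sketch below lists those lines. The function name find_unexecuted_lines and the sample text are hypothetical; the sample is hand-written in the gcov layout, not real tool output.

```python
def find_unexecuted_lines(gcov_text):
    """Return (line_number, source) pairs that gcov marked as never executed."""
    unexecuted = []
    for line in gcov_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) < 3:
            continue  # e.g. the "branch" and "function" lines emitted by -b
        count, number, source = parts
        if count.strip() == "#####" and number.strip().isdigit():
            unexecuted.append((int(number), source.strip()))
    return unexecuted


# Hand-written sample in the gcov annotated-source layout.
SAMPLE = """\
        -:    0:Source:example.cc
        3:    1:int sign(int x) {
        3:    2:    if (x < 0) {
branch  0 taken 33%
branch  1 taken 67%
        1:    3:        return -1;
        -:    4:    }
        2:    5:    if (x > 0) {
    #####:    6:        return 1;
        -:    7:    }
        2:    8:    return 0;
        -:    9:}
"""

holes = find_unexecuted_lines(SAMPLE)
```

Here the only coverage hole is line 6, which tells the developer that no test case passes a positive value to the function.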

ggcov

ggcov is an alternative coverage analysis tool to gcov that uses a GTK+ GUI. ggcov uses the same profiling data generated from gcc-instrumented code but has its own analysis engine.

Use the following to bring up the ggcov GUI.

cd <module>
scons profile=gcov
ggcov -o src/

tggcov

tggcov is the non-graphical interface to ggcov.

tggcov creates its output files in the same directory as the source files. It creates analysis files for only the local source files (i.e. not the system files).

Use the following for a comprehensive coverage analysis. Output files will be in 'src/*.cc.tggcov'.

cd <module>
scons profile=gcov
tggcov -a -B -H -L -N -o src/ src

Python

No recommendations have been made for Python coverage analysis tools. The following are options to explore when time becomes available.

coverage.py

Coverage.py, written by Ned Batchelder, is a Python module that measures code coverage during Python execution. It uses the code analysis tools and tracing hooks provided in the Python standard library to determine which lines are executable and which have been executed.
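The general idea of combining code analysis with a tracing hook can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not coverage.py's actual implementation: dis.findlinestarts approximates the set of executable lines, sys.settrace records the executed ones, and the helper names and the sign function are hypothetical.

```python
import dis
import sys


def executable_lines(func):
    """Line numbers that start at least one bytecode instruction."""
    starts = dis.findlinestarts(func.__code__)
    return {line for _, line in starts if line is not None}


def executed_lines(func, *args):
    """Call func under a trace hook and record the lines it executes."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore frames belonging to other functions
        if event in ("call", "line"):
            seen.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier hook
    return seen


def missing_lines(func, *args):
    """Executable lines that a single call with args did not reach."""
    return executable_lines(func) - executed_lines(func, *args)


def sign(x):
    """Hypothetical function under test."""
    if x < 0:
        return -1
    return 1
```

Calling missing_lines(sign, 5) reports the line holding "return -1", since a positive argument never takes that branch.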

figleaf

figleaf, written by Titus Brown, is a Python code coverage analysis tool, built somewhat on the model of Ned Batchelder's fantastic coverage module. The goal of figleaf is to be a minimal replacement for 'coverage.py' that supports more configurable coverage gathering and reporting.

Java

No options have been researched.

Python & C++ Test Setup

DM developers frequently use the Python unittest framework to exercise C++ methods and functions. This scenario still supports the use of the C++ coverage analysis tools.

As usual, the developer instruments the C++ routines for coverage analysis at compilation time by building with scons profile=gcov. The C++ routines generated from the swig *.i source are also instrumented. Later, when a Python unit test invokes an instrumented C++ routine, the coverage is recorded in the well-known coverage data file <src>.gcda, which sits alongside the <src>.gcno file written at compile time. Post-processing of the coverage data files is done with the developer's choice of C++ coverage analysis tool.

GCOV output files in SVN directories

Gcov coverage output files should be marked as ignored so that svn does not continually report them as unversioned.

The least intrusive method of permanently marking the files requires the svn manager to update the configuration file, <repository>/conf/svnserve.conf, to include:

        *.gcno = svn:ignore
        *.gcda = svn:ignore

However, a comment in the LSST DM svn configuration file states that auto_prop is NOT working. A DM Trac Ticket (#464) has been issued to install this feature.

If the global property setting doesn't work, then each developer should set the svn:ignore property on each directory potentially containing gcov files so that it contains "*.gcno" and "*.gcda".
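For the per-directory approach, the following sketch finds directories under a working copy that currently hold gcov files and builds the matching svn propset argument lists without running them. The function names are hypothetical. Note two assumptions: only directories that already contain .gcno or .gcda files are found, and svn propset replaces any svn:ignore value a directory already has, so existing patterns would need to be merged by hand.

```python
import os

GCOV_SUFFIXES = (".gcno", ".gcda")
IGNORE_VALUE = "*.gcno\n*.gcda"  # one pattern per line, as svn:ignore expects


def directories_with_gcov_files(root):
    """Return the sorted directories under root that contain gcov files."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Subversion's own administrative directories.
        dirnames[:] = [name for name in dirnames if name != ".svn"]
        if any(name.endswith(GCOV_SUFFIXES) for name in filenames):
            found.append(dirpath)
    return sorted(found)


def propset_commands(root):
    """Build one 'svn propset svn:ignore' argument list per directory."""
    return [
        ["svn", "propset", "svn:ignore", IGNORE_VALUE, path]
        for path in directories_with_gcov_files(root)
    ]
```

Each returned list can be reviewed and then passed to subprocess.run from inside the working copy.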