Diagnose CTest failures from logs

CTest automatically logs test outputs to:

${CMAKE_BINARY_DIR}/Testing/Temporary/LastTest.log

To diagnose a test failure, first copy this file somewhere else to work with it, since the next CTest run overwrites it. The log is nicely formatted and stays readable even when many tests are run in parallel.

A simple list of all “failed” and “not run” tests is in:

${CMAKE_BINARY_DIR}/Testing/Temporary/LastTestsFailed.log

“Not run” tests are those with a FIXTURES_REQUIRED fixture that itself failed or did not run.
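The copy-then-inspect step above can be sketched in Python. This is only an illustration: the function names are hypothetical, and the parser assumes each non-empty line of LastTestsFailed.log looks like "<number>:<test name>", which should be checked against a real log before relying on it.

```python
import shutil
from pathlib import Path


def snapshot_ctest_logs(build_dir, dest_dir):
    """Copy CTest's last-run logs out of the build tree so a rerun cannot overwrite them."""
    src = Path(build_dir) / "Testing" / "Temporary"
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in ("LastTest.log", "LastTestsFailed.log"):
        if (src / name).is_file():
            shutil.copy2(src / name, dest / name)
            copied.append(dest / name)
    return copied


def failed_test_names(log_text):
    """Return test names, ASSUMING each line is '<number>:<test name>'."""
    names = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, since test names may contain colons.
        _number, sep, name = line.partition(":")
        names.append(name if sep else line)
    return names
```

Calling snapshot_ctest_logs("build", "saved_logs") right after a failing run keeps a stable copy to work from.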

At the time of running CTest, one can also use the -O option like:

ctest -O test.log

“ctest -O” only logs what is printed to the screen during the CTest run. Unless the “ctest -V” option was also used, the extra information found in LastTest.log, such as the command line that was run, will be missing from “test.log”.

CTest set environment variable

It’s often useful to set per-test environment variables in CMake’s CTest testing frontend. The environment variables appear and disappear with the start and end of each test, in isolation from any other tests that may be running in parallel. This is accomplished via the test property ENVIRONMENT.

Example: set environment variable FOO=1 for a test “bar” like:

set_tests_properties(bar PROPERTIES ENVIRONMENT "FOO=1")

Multiple variables are set with a CMake list (semicolon delimited) like:

set_tests_properties(bar PROPERTIES ENVIRONMENT "FOO=1;BAZ=0")

Here comes an issue. In general, Windows needs DLLs to be in the current working directory or in a directory listed in the environment variable PATH. Since Windows also delimits PATH entries with a semicolon, CMake would split a PATH value into separate list elements, so we need to do a little extra work to append to PATH on Windows for CTest. We handle this with a script that appends to PATH for CTest on Windows.


In Python, likewise set or unset environment variables within tests using the pytest monkeypatch fixture.
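A minimal sketch of the monkeypatch approach follows. The function foo_is_enabled is a hypothetical stand-in for code under test; monkeypatch.setenv and monkeypatch.delenv are the pytest fixture methods, and pytest undoes both changes when each test ends.

```python
import os


def foo_is_enabled():
    """Hypothetical function under test: reads FOO from the environment."""
    return os.environ.get("FOO") == "1"


def test_foo_set(monkeypatch):
    # The variable exists only for the duration of this test.
    monkeypatch.setenv("FOO", "1")
    assert foo_is_enabled()


def test_foo_unset(monkeypatch):
    # raising=False: do not fail if FOO was not defined to begin with.
    monkeypatch.delenv("FOO", raising=False)
    assert not foo_is_enabled()
```

Saved in a file named like test_env.py, these run with a plain "pytest" invocation; pytest supplies the monkeypatch argument itself.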

Rename and cleanup conda Python environment

Rename Python conda environment “old” to “new” by copying the environment and deleting the original environment:

conda create --name new --clone old

conda remove --name old --all

Each Miniconda/Anaconda environment consumes disk space. One may wish to delete old, unused conda environments to free disk space. To check the disk size of a conda environment, first list all environment paths:

conda env list

This will show entries like:

py37 ~/miniconda3/envs/py37

Print the disk size of a conda environment:

  • Linux / MacOS: du -sh ~/miniconda3/envs/py37
  • Windows (Command Prompt): dir /s miniconda3\envs\py37
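As a cross-platform alternative to the commands above, a short Python sketch can total the file sizes under an environment directory. The function name is illustrative. Because conda may hard-link package files between environments, this simple sum can differ from what du reports.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total


# Example use (the path is the one from the listing above):
#   size = dir_size_bytes(os.path.expanduser("~/miniconda3/envs/py37"))
#   print(f"{size / 1e9:.2f} GB")
```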

Linux control groups tips

Linux control groups can limit any user’s CPU, memory or other resource usage. Control groups can be used to test program behavior under constrained resources. Control groups v2 are generally recommended for their newer architecture and better performance. RHEL/CentOS 8 uses cgroups v1 by default, so cgroups v2 needs to be enabled there.

Although setting up persistent control groups is straightforward, it’s also possible to create a transient control group from the command line using systemd-run. This can be good for diagnosing program behavior: for example, does a program’s memory use blow up and then come down faster than “top” might show? An example constraining a program to 2 GB of RAM:

systemd-run --scope -p MemoryMax=2G -p MemorySwapMax=0 ./my_program

The --user flag did not work for us: we still needed to type the sudo password despite running as a standard user.
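When the same limit is applied repeatedly with different values, the command line can be assembled programmatically. The sketch below only builds the argument list shown above; the helper name and its defaults are illustrative, and actually running the command requires a Linux system with systemd.

```python
import shlex


def systemd_run_command(program_args, memory_max="2G", swap_max="0"):
    """Build the systemd-run argument list for a transient, memory-limited scope."""
    return [
        "systemd-run", "--scope",
        "-p", f"MemoryMax={memory_max}",
        "-p", f"MemorySwapMax={swap_max}",
        *program_args,
    ]


cmd = systemd_run_command(["./my_program"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to launch the constrained program.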

Another way to set hardware/firmware-based limits for more intensive benchmarks is to simply use a device with less RAM, edit BIOS/UEFI to only enable a limited amount of RAM, or on Linux use GRUB kernel mem= parameter to constrain the available RAM. Ensure the swap/paging file is turned off.

Clean delete untracked files from Git repo

Sometimes files are accidentally spilled into a Git repo. Until the files are staged with “git add”, they are “untracked”. If the files match a pattern in “.gitignore”, they generally will not appear in Git operations. Untracked, non-ignored files show with:

git status --porcelain

like

?? oops.txt

where the question marks indicate the file is untracked.
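For scripting, the porcelain output can be filtered for the “??” marker. This is a minimal sketch with an illustrative function name; it assumes simple paths, since Git quotes paths containing unusual characters in this output.

```python
def untracked_files(porcelain_text):
    """Return the paths on lines marked '??' in `git status --porcelain` output."""
    return [
        line[3:]
        for line in porcelain_text.splitlines()
        if line.startswith("?? ")
    ]


# Example use from inside a repository:
#   import subprocess
#   out = subprocess.run(["git", "status", "--porcelain"],
#                        capture_output=True, text=True, check=True).stdout
#   print(untracked_files(out))
```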

These files may be interactively removed (deleted) by:

git clean -id

When there are files spilled in multiple directories, the “filter by pattern” option lets you select files to retain. The updated display shows the files to be deleted. When satisfied, select “Clean”. There’s no trivial way to recover those files, so be sure of your choices.

To clean files matching patterns in .gitignore, add the “-x” option like:

git clean -xid

That’s useful for cleaning up after in-source builds, perhaps from a Makefile or LaTeX.

Stop shell script on exception

We often set our shell scripts to stop upon error rather than handling each error manually. This is accomplished in the Linux / MacOS Terminals by setting near the top of the .sh script file:

set -e

On any operating system, stop executing a PowerShell script upon exception by adding near the top of the .ps1 PowerShell script:

$ErrorActionPreference = 'Stop'
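For comparison, a Python script that drives external commands can get similar stop-on-first-failure behavior with subprocess.run(..., check=True), which raises CalledProcessError on a non-zero exit status. This is an analogue, not part of either shell; the helper name is illustrative.

```python
import subprocess
import sys


def run_all(commands):
    """Run commands in order; the first non-zero exit raises and stops the sequence."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all([
        [sys.executable, "-c", "print('step 1')"],
        [sys.executable, "-c", "print('step 2')"],
    ])
```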

Fortran allocate large variable memory

Variables that are larger than a few kilobytes often should be put into heap memory instead of stack memory. In Fortran, compilers typically put variables with the parameter attribute into stack memory. Our practice in Fortran is to put non-trivial arrays intended to be static/unchanged memory into an allocatable, protected array. Example:

module foo

implicit none (type, external)

integer, allocatable, protected :: x(:,:)

contains

subroutine init()
  allocate(x(1024,256))
  !! in real life, this would be some constant data array or
  !! expression filling the "constant" array x.
  x = 1
end subroutine init

end module


program bar

use foo, only : init, x

call init()

if (any(x /= 1)) error stop "did not init"

end program

In this example, x is approximately a one megabyte variable, assuming kind=int32. Even though the compiler may not warn if we instead declare this variable as parameter, it can cause segfaults and other seemingly random runtime errors.
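The size estimate follows from simple arithmetic, shown here as a quick Python check that assumes a 4-byte (int32) default integer:

```python
elements = 1024 * 256            # shape of x(1024,256)
bytes_per_element = 4            # assumes kind=int32
size_bytes = elements * bytes_per_element

print(size_bytes, "bytes =", size_bytes / 2**20, "MiB")
```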

Normally we would use a derived type instead of a bare module, but we did it here for simplicity.

Fortran allocate large variables

If the variable to be allocated is about one gigabyte or larger, sometimes special techniques are needed, even on systems with very large amounts of RAM including HPC. This is especially the case on Windows systems, where even the latest Windows 10 has particular limitations.

The error messages one may get upon allocating large variables in Fortran include:

Error allocating <N> bytes: Not enough space

Segmentation fault (core dumped)

For Windows, a peculiar limitation is that each variable (including allocatable) cannot exceed the virtual paging file size, even if the Windows computer has a large amount of RAM that isn’t being exceeded. The paging file size may be inspected and set under: Control Panel | System and Security | System | Advanced system settings | Advanced | Performance | Settings | Advanced | Virtual memory

In general, the compiler may need its memory model flag set for the situation (for example, the -mcmodel option with GCC). Setting this flag has a set of implications of its own.

MacOS 11 excessive SSD write wear

As noted by Hector Martin and others, MacOS 11 appeared to have a possible kernel bug causing excessive SSD write wear whenever the SSD was in the “on” state. One can use “smartmontools” to check SSD write history:

brew install smartmontools

smartctl --all /dev/disk0

Note that SSD on-state time can be much less than Mac powered-on time, particularly if the Mac is sitting idle. This is especially the case for the Mac Mini, which some users leave powered on but unused for the majority of the time.

Thankfully, as noted by Jonas Ribe, Hector Martin and others, MacOS 11.4 appears to have fixed this SSD write bug.

Thankfully we haven’t seen the 100+ TB of excess SSD wear pre-11.4 that Jonas did. We saw less than 5 TB of excess wear on each of our mostly idle, continuously powered-on Mac Minis.

Again, tentatively this problem is resolved by MacOS 11.4.

CMake CTest subset test label

The CMake test frontend CTest can easily select subsets of tests. While there are more advanced CTest test selection options, the most common and easy-to-use methods are regex selection by test name, selection by label, and exclusion of fixtures. Assuming the project has a meaningful test naming scheme, one may select tests with either or both of the ctest -R and ctest -E flags. For this article, assume the tests are named:

alpha:egg
alpha:bacon
beta:egg
beta:apple
gamma:egg
gamma:orange

where each test was setup in CMakeLists.txt like:

add_executable(alpha_egg alpha_egg.c)

add_test(NAME alpha:egg COMMAND $<TARGET_FILE:alpha_egg>)

The colon has no special meaning, and CTest names may use special characters if desired. When figuring out how to use CTest test selection, it’s very helpful to also add the ctest -N option, so that test names are printed without running the tests. For all examples, we assume the user working directory is PROJECT_BINARY_DIR or is using ctest --test-dir.

One may select all the “egg” tests by:

ctest -R egg

Suppose one wishes to exclude the test named “beta:egg”:

ctest -R egg -E beta

To run all tests except those with “beta” in the name:

ctest -E beta

A more sophisticated test selection scheme requires setting test labels in the respective CMakeLists.txt like:

set_tests_properties(alpha:egg beta:apple gamma:egg PROPERTIES LABELS "unit;gravy")

Combinations of test labels and test names regex can be used to select subsets of tests. For example:

ctest -R egg -L unit

Note also the option ctest -LE, which works like ctest -E for labels.
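To reason about which tests a combination of flags would pick, the selection rules can be mimicked in a few lines of Python. This is a simplified model, not CTest's implementation: the select function and its parameter names are illustrative, with include, exclude, label and label_exclude standing in for -R, -E, -L and -LE, and the labels are the ones assigned in the example above.

```python
import re

# Test names from this article, with the labels assigned above.
TESTS = {
    "alpha:egg": {"unit", "gravy"},
    "alpha:bacon": set(),
    "beta:egg": set(),
    "beta:apple": {"unit", "gravy"},
    "gamma:egg": {"unit", "gravy"},
    "gamma:orange": set(),
}


def select(tests, include=None, exclude=None, label=None, label_exclude=None):
    """Return test names passing every given regex filter."""
    chosen = []
    for name, labels in tests.items():
        if include and not re.search(include, name):
            continue
        if exclude and re.search(exclude, name):
            continue
        if label and not any(re.search(label, lab) for lab in labels):
            continue
        if label_exclude and any(re.search(label_exclude, lab) for lab in labels):
            continue
        chosen.append(name)
    return chosen


print(select(TESTS, include="egg", exclude="beta"))  # like: ctest -R egg -E beta
print(select(TESTS, include="egg", label="unit"))    # like: ctest -R egg -L unit
```

In this model both calls select alpha:egg and gamma:egg. Running the real command with ctest -N confirms the actual selection.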

CTest fixtures can also be excluded with the ctest -FA option. This allows skipping expensive FIXTURES_SETUP tests when they are not needed.


Set labels for all targets and tests in a directory like:

set_directory_properties(PROPERTIES LABELS linalg)

where “linalg” is the desired label(s).