Scientific Computing

WSL cron scheduled tasks

Windows Subsystem for Linux version 0.67.6 and newer can use systemd, which makes using Cron on WSL straightforward. Check the WSL version from Windows Terminal (not WSL):

wsl --version

Check that Cron is running in WSL:

systemctl status cron

Then edit the cron jobs as usual with crontab -e.

Cron log file

Many types of programs can be run via Cron, including CTest scripts that upload to CDash. While CDash gives information once the CMake / CTest processes have started, if the CTest scripts aren’t running, the Cron job may need to be debugged.

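Debugging Cron jobs involves examining Cron logs, but Cron logging is optional. For debugging, it may be easier to redirect the job's stdout and stderr to files by appending redirections to the cron job line. TMPDIR must be defined for the cron job, for example at the top of the crontab, or replaced with a full path: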

$HOME/my_exe 1>${TMPDIR}/out.log 2>${TMPDIR}/err.log

On macOS, Cron logging is disabled by default, but the results of cron jobs are available via system mail, which is stored in /var/mail/$(whoami). Apple launchd can be used instead of Cron via Property List files, but many simply prefer Cron.

On Linux systems, Cron may log to locations like “/var/log/cron”.

For systems that use “rsyslog”, check “/etc/rsyslog.conf” for this line to be uncommented:

cron.*          /var/log/cron.log

conda run non-interactive environment

“conda activate” is intended for interactive shells and therefore doesn’t work properly with Cron jobs and other non-interactive shell environments. Rather than “bash -l”, consider using conda run.

By default, conda run captures stdin/stdout/stderr, so no text input or output is seen until the executable finishes running. For testing scripts, live stdin/stdout/stderr is available with the option:

conda run --no-capture-output <exe>

For example, if a Cron job runs a CTest or CDash script that needs to use a Conda Python environment “myenv”, make a Cron job like:

conda run --name myenv ctest -S memcheck.cmake
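When the cron job is itself a Python script, the same conda run command line can be assembled in Python. This is a minimal sketch rather than part of the recipe above: the names conda_run_cmd and main are hypothetical, and the optional conda_exe argument stands in for the full path to conda when it is not on the PATH that cron provides.

```python
import shutil
import subprocess


def conda_run_cmd(env_name, args, conda_exe=None):
    """Build the argument list for: conda run --name <env_name> <args...>"""
    # conda_exe is a hypothetical override; otherwise look up "conda" on PATH
    exe = conda_exe or shutil.which("conda")
    if exe is None:
        raise FileNotFoundError("conda not found on PATH; pass conda_exe explicitly")
    return [exe, "run", "--name", env_name, *args]


def main():
    """Run the CTest script inside the Conda environment "myenv"."""
    cmd = conda_run_cmd("myenv", ["ctest", "-S", "memcheck.cmake"])
    # return the exit code so cron mail or logs can reflect failures
    return subprocess.run(cmd).returncode
```

A script ending with raise SystemExit(main()) could then be called from a single crontab line.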

Cron jobs are single lines only in “crontab -e”. In general more complex Cron jobs are invoked by calling a script from crontab. For example suppose a CMake project using Conda Python needs to be configured, built and tested. Make a cron job script like:

# So CMake finds Conda Python environment "myenv"
conda run --name myenv \
  cmake -Bbuild

cmake --build build --parallel
# preface with "conda run --name myenv" if Python invoked during the build

conda run --name myenv \
  ctest --test-dir build

NOTE: environment variables defined at the top of the crontab apply to all cron jobs. The “conda” executable must be on the PATH environment variable; otherwise, specify the full path to “conda” in each cron job line.

For example, on macOS the crontab PATH might be like:

PATH=/Users/username/miniconda3/condabin:/opt/homebrew/sbin:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin

In particular, note that environment variables defined in the crontab can’t refer to other environment variables like $HOME. Type the full path on the system instead.

Gfortran type mismatch error flag

Gfortran 10 made argument type mismatches an error by default instead of a warning. Legacy Fortran programs too often omitted explicit procedure interfaces to get implicit polymorphism. The Fortran 2018 standard cleaned up this situation in part with type(*).

Workaround: Gfortran flag -fallow-argument-mismatch can be used to degrade the errors to warnings. It is however strongly recommended to fix the problem in the legacy code, if it’s part of your code ownership.

Possible workaround

For external libraries like MPI-2, the interfaces are intended to be polymorphic but use Fortran 90-style interfaces. The user code can declare an explicit interface.

However, this is not recommended – we have seen intermittent runtime errors with MPI-2 using the technique below, that were entirely fixed by using the “mpi_f08” MPI-3+ interface.

use mpi, only : MPI_STATUS_SIZE

implicit none

interface
!! This avoids GCC >= 10 type mismatch errors for MPI-2
subroutine mpi_send(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
type(*), dimension(..), intent(in) :: BUF
integer, intent(in) ::  COUNT, DATATYPE, DEST, TAG, COMM
integer, intent(out) :: IERROR
end subroutine

subroutine mpi_recv(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)
import MPI_STATUS_SIZE
type(*), dimension(..) :: BUF  !< receive buffer is written by MPI, so not intent(in)
integer, intent(in) ::  COUNT, DATATYPE, SOURCE, TAG, COMM
integer, intent(out) :: STATUS(MPI_STATUS_SIZE), IERROR
end subroutine

end interface

CMake CTest memcheck

Many programming languages lack garbage collection and are prone to memory leaks and related issues like use of uninitialized variables. CMake can use several common tools to check the memory usage of a program. These tools are limited by the code coverage of the tests run.

The basic procedure is to build the project in “Debug” mode, run CTest memcheck task, then browse the results. Then fix the problems and rerun the tests to verify.

The popular “valgrind” memory tester is available on Linux, including Windows Subsystem for Linux (WSL). CMake will usually automatically detect valgrind if the valgrind executable is on the environment variable PATH. That is, if typing “valgrind” at the Terminal runs, then CMake should find and use it.

To use memory checking tools from CMake, the CMakeLists.txt must include the line:

include(CTest)

Here are example CMake commands to check memory of a project.

cmake -Bbuild -DCMAKE_BUILD_TYPE=Debug
cmake --build build

ctest -T memcheck --test-dir build

The tests may run more slowly than usual due to the testing overhead. The CTest terminal output will look like:

MemCheck log files can be found here: (<#> corresponds to test number)
build/Testing/Temporary/MemoryChecker.<#>.log
Memory checking results:
Memory Leak - 8
Potential Memory Leak - 8
Uninitialized Memory Conditional - 3

Each MemCheck CTest number has a log file. The traceback stack to the suspected area (line) of code detected to lead to memory problems is given. Edit the code to fix the problem, then rebuild and rerun memcheck as above.
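With many tests, opening each log by hand gets tedious. As a rough sketch, the Python function below scans the MemoryChecker.<#>.log files under the build directory and counts the lines containing a search string. The function name scan_memcheck_logs and the default search string “definitely lost” (text that valgrind prints for confirmed leaks) are assumptions for illustration; other memory checkers word their reports differently.

```python
from pathlib import Path


def scan_memcheck_logs(build_dir, needle="definitely lost"):
    """Map each MemCheck log file name to the number of lines containing needle.

    Logs with no matching lines are left out of the result.
    """
    log_dir = Path(build_dir) / "Testing" / "Temporary"
    hits = {}
    for log in sorted(log_dir.glob("MemoryChecker.*.log")):
        # tolerate odd bytes in the tool output instead of raising
        text = log.read_text(errors="replace")
        count = sum(1 for line in text.splitlines() if needle in line)
        if count:
            hits[log.name] = count
    return hits
```

For example, scan_memcheck_logs("build") returns a dictionary of the log files worth opening first.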

Configure CTest memcheck

To change from default CTest memcheck settings, create a script memcheck.cmake in the top level of the project (where the top CMakeLists.txt is). Run this script like:

ctest -S memcheck.cmake -V

This program also has an example valgrind suppressions file. The suppressions file is derived from the output of the valgrind option:

valgrind --gen-suppressions=all <executable>

CTest set environment variable

It’s often useful to set per-test environment variables in CMake’s CTest testing frontend. The environment variables appear and disappear with the start and end of each test, in isolation from any other tests that may be running in parallel. This is accomplished via CMake test properties ENVIRONMENT and ENVIRONMENT_MODIFICATION.

Example: set environment variable FOO=1 for a test “bar” like:

set_property(TEST bar PROPERTY ENVIRONMENT "FOO=1")

Multiple variables are set with a CMake list (semicolon-delimited) like:

set_property(TEST bar PROPERTY ENVIRONMENT "FOO=1;BAZ=0")

A common issue arises on Windows, which generally needs DLLs to be in the current working directory or on the environment variable PATH. We handle this with a CMake script that appends to PATH for CTest on Windows:

CMakeLists.txt:

# works for Unix, Windows, etc.
cmake_minimum_required(VERSION 3.22)
project(WindowsPath LANGUAGES C)
find_package(ZLIB REQUIRED)
add_executable(hello hello.c)
target_link_libraries(hello PRIVATE ZLIB::ZLIB)
add_test(NAME unit_hello COMMAND hello)
if(WIN32)
  # append the DLL directory to PATH for this test only
  set_property(TEST unit_hello PROPERTY ENVIRONMENT_MODIFICATION "PATH=path_list_append:/path/to/dlls")
endif()

hello.c:

#include "zlib.h"
int main(void) { return 0; }

In Python likewise set/unset environment variables within tests using PyTest monkeypatch fixture.
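A minimal sketch of the pytest approach, assuming a hypothetical function foo_enabled() that reads the environment variable FOO. Changes made through monkeypatch.setenv and monkeypatch.delenv are undone automatically when each test ends.

```python
import os


def foo_enabled():
    """Hypothetical code under test: True only when FOO=1 is set."""
    return os.environ.get("FOO") == "1"


def test_foo_set(monkeypatch):
    # FOO=1 exists only for the duration of this test
    monkeypatch.setenv("FOO", "1")
    assert foo_enabled()


def test_foo_unset(monkeypatch):
    # raising=False: don't fail if FOO was not set to begin with
    monkeypatch.delenv("FOO", raising=False)
    assert not foo_enabled()
```

Saved as a file such as test_foo.py (name is arbitrary), these tests are discovered and run by the pytest command.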

Pacman clean unused packages

Package managers can detect which packages were manually installed (at user or script explicit request) and which were implicitly installed as a prerequisite. When uninstalling the manually installed package, the prerequisite packages are often not auto-uninstalled. To recover a significant amount of disk space (gigabytes perhaps) from unused packages, an autoremove command is useful.

Pacman is used in numerous Linux distros and in MSYS2. Show the auto-installed prerequisites that no installed package requires any longer:

pacman -Qdtq

This can be piped into the Pacman remove command upon verifying the packages above are indeed OK to remove:

pacman -Qdtq | pacman -Rs -

The package cache can contain gigabytes of files that are no longer needed. Remove cached packages that are not currently installed with:

pacman -Sc

Related: Clean APT and DNF cache

mdadm mount RAID

mdadm is a popular Linux tool to manage RAID arrays. RHEL provides a usage guide.

After creating a RAID array, or to find an existing array that may or may not be mounted, use commands like:

mdadm --detail /dev/md/Volume0_0

Note the individual RAID disks /dev/sd*, which can be further examined like:

mdadm --examine /dev/sda /dev/sdb

Also examine all available devices with:

fdisk --list

To mount the RAID, use commands like:

mkdir /mnt/raid
# arbitrary location you want to mount RAID device at

mount /dev/md/Volume0_0p1 /mnt/raid

There might be extra devices under /dev/md that can’t be mounted, but one of them should be the desired RAID.

Visual Studio /delayload with CMake

CMAKE_LINK_LIBRARY_USING_<FEATURE> doesn’t have a feature for delay-loading DLLs with MSVC-like compilers, which use the linker flag /delayload. Nonetheless, /delayload can be accomplished in a compact way as in the following example:

add_library(example SHARED lib.c)
add_executable(main main.c)

target_link_libraries(main PRIVATE example)

if(MSVC)
  set_property(TARGET example PROPERTY WINDOWS_EXPORT_ALL_SYMBOLS true)
  target_link_libraries(main PRIVATE delayimp)
  target_link_options(main PRIVATE "/DELAYLOAD:$<TARGET_FILE_BASE_NAME:example>.dll")
endif()

Remove audio from video using FFmpeg

FFmpeg can losslessly copy the video stream while removing the audio stream. Use a command like:

ffmpeg -i input.mp4 -an -c:v copy output.mp4

-an: omits the audio stream from the output file
-c:v copy: losslessly copies the video stream without re-encoding it
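
To apply the same command to every MP4 file in a folder, a short Python loop can help. This is a sketch under assumptions: the function names and the “_noaudio” output suffix are hypothetical choices, and ffmpeg is assumed to be on PATH.

```python
import subprocess
from pathlib import Path


def strip_audio_cmd(src, dst):
    """Build the ffmpeg argument list that drops audio and copies video."""
    return ["ffmpeg", "-i", str(src), "-an", "-c:v", "copy", str(dst)]


def output_path(src, suffix="_noaudio"):
    """input.mp4 -> input_noaudio.mp4 in the same folder."""
    src = Path(src)
    return src.with_name(src.stem + suffix + src.suffix)


def strip_audio_all(folder, suffix="_noaudio"):
    """Run ffmpeg on each *.mp4 in folder, skipping earlier outputs."""
    for src in sorted(Path(folder).glob("*.mp4")):
        if src.stem.endswith(suffix):
            continue  # already an output of a previous run
        # check=True raises CalledProcessError if ffmpeg fails
        subprocess.run(strip_audio_cmd(src, output_path(src, suffix)), check=True)
```

The original files are left untouched, since each output is written under a new name.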