Scientific Computing

Keep program running after disconnect

July 4, 2021

Screen is a terminal multiplexer program that allows programs to continue running after a remote user disconnects. Normally, after logging off, one reconnects to a remote session by listing the sessions and reattaching:

screen -list

screen -r <id>

If a remote connection is lost unintentionally, before detaching from screen, screen may not allow reconnection this way by default. In that case you may need the “-x” option:

screen -x <id>

A downside of screen is the difficulty scrolling back in history.

Some prefer tmux, another terminal multiplexer, over screen. Another option is using nohup.
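When the long-running job is launched from a Python script, a similar effect to nohup can be approximated by starting the job in its own session. This is a minimal sketch of that swapped-in technique, not of screen or nohup themselves; the function name run_detached and the log file are hypothetical, and start_new_session only has an effect on POSIX systems.

```python
import subprocess
import sys
from pathlib import Path


def run_detached(cmd, log_file):
    """Start cmd in a new session so a terminal hangup is not delivered to it.

    start_new_session=True makes the child call setsid() (POSIX only).
    Output is appended to log_file, much as nohup writes to nohup.out.
    """
    log = open(log_file, "ab")
    try:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )
    finally:
        # the child keeps its own copy of the file descriptor
        log.close()
```

For example, run_detached([sys.executable, "long_job.py"], "job.log") would return immediately while the hypothetical long_job.py keeps running.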

CMake FetchContent vs. ExternalProject

June 25, 2021

Multiple subprojects can be invoked from a top-level superproject with build systems such as CMake or Meson.

CMake FetchContent and ExternalProject can download subprojects, or the subproject can be included in the top-level project via Git submodule or monorepo approach. This subproject hierarchy works like any other CMake project from the command line or IDEs like Visual Studio.

Meson subproject and CMake ExternalProject keep project namespaces separate. Meson subproject and CMake FetchContent download and configure all projects at configure time. CMake FetchContent commingles the CMake project namespaces. FetchContent can be easier to use than ExternalProject if you control both software projects’ CMake scripts. If you don’t control the “child” project, it may be better to use ExternalProject instead of FetchContent.

For these examples, suppose we have a top-level project “parent” and a “child” project containing a library that is desired in parent. Suppose the child project can be built standalone (by itself) but also may be used directly from other CMake projects.

project	CMAKE_SOURCE_DIR	CMAKE_BINARY_DIR	PROJECT_SOURCE_DIR
parent	~/foo	~/foo/build	~/foo
child: standalone	~/bar	~/bar/build	~/bar
child: CMake ExternalProject	~/foo/build/child-prefix/src/child	~/foo/build/child-prefix/src/child-build	~/foo/build/child-prefix/src/child
child: CMake FetchContent	~/foo	~/foo/build	~/foo/build/_deps/child-src

FetchContent populates content from the other project at configure time. FetchContent populates the “child” project with default values from the “parent” project. Variables set in the “child” project generally do not affect the “parent” project unless specifically used from the “parent” project. CMAKE_ARGS and CMAKE_CACHE_ARGS have no effect with FetchContent_Declare. To set a value in the child project from the parent project, set the variable in the parent project before FetchContent_MakeAvailable().

From “parent” project CMakeLists.txt:

cmake_minimum_required(VERSION 3.14)
project(parent Fortran)

include(FetchContent)
FetchContent_Declare(child
  URL https://github.invalid/username/archive.tar.bz2
)
# it's much better to use a specific Git revision or Git tag for reproducibility

FetchContent_MakeAvailable(child)

# your program
add_executable(myprog main.f90)
target_link_libraries(myprog PRIVATE mylib)  # mylib is from "child"

FetchContent_MakeAvailable: downloads and configures the “child” code, populating variables and targets as if it were part of the “parent” CMake project.

Suppose the “child” project CMakeLists.txt contains:

project(child LANGUAGES Fortran)

include(GNUInstallDirs)

add_library(mylib mylib.f90)

target_include_directories(mylib INTERFACE ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_INSTALL_INCLUDEDIR})

set_property(TARGET mylib PROPERTY Fortran_MODULE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_INSTALL_INCLUDEDIR})

The child project CMAKE_BINARY_DIR and CMAKE_SOURCE_DIR will be those of parent project. That is, if the parent project is in ~/foo and the build directory is ~/foo/build, then the child project in ~/childcode called by FetchContent will also have CMAKE_SOURCE_DIR of ~/foo and CMAKE_BINARY_DIR of ~/foo/build. So be careful in the child project when using such variables that may be defined by parent projects. This is why projects that aren’t specifically designed to work together may be better joined by ExternalProject. A typical technique within the child project that can operate standalone is to refer to CMAKE_CURRENT_SOURCE_DIR instead of CMAKE_SOURCE_DIR as the latter will break when used from FetchContent.

IMPORTANT: When using if() clauses to determine execution of FetchContent, ensure that the FetchContent stanzas are executed each time CMake is run. Otherwise, the FetchContent targets may fail to be available or may have missing target properties on CMake rebuild.

ExternalProject populates content from the other project at build time. This means the other project’s libraries are not visible until the parent project is built. Since ExternalProject does not combine the project namespaces, ExternalProject may be necessary if you don’t control the other projects.

ExternalProject may not activate without the add_dependencies() statement. Upon cmake --build of the parent project, ExternalProject downloads, configures and builds.

From “parent” project CMakeLists.txt:

project(parent LANGUAGES Fortran)

include(GNUInstallDirs)
include(ExternalProject)

set(mylist "a;b;c")
# passing a list to external project is best done via CMAKE_CACHE_ARGS
# CMAKE_ARGS doesn't work correctly for lists

set_property(DIRECTORY PROPERTY EP_UPDATE_DISCONNECTED true)
# skip the update step on each build, which avoids repeatedly rebuilding ExternalProjects.
# directory property scope: CMAKE_CURRENT_SOURCE_DIR and subdirectories

ExternalProject_Add(CHILD
GIT_REPOSITORY https://github.com/scivision/cmake-externalproject
GIT_TAG main
CMAKE_ARGS --install-prefix=${CMAKE_INSTALL_PREFIX}
CMAKE_CACHE_ARGS -Dmyvar:STRING=${mylist}   # need variable type e.g. STRING for this
CONFIGURE_HANDLED_BY_BUILD ON
BUILD_BYPRODUCTS ${CMAKE_INSTALL_FULL_LIBDIR}/${CMAKE_STATIC_LIBRARY_PREFIX}timestwo${CMAKE_STATIC_LIBRARY_SUFFIX}
)

add_library(timestwo STATIC IMPORTED GLOBAL)
set_property(TARGET timestwo PROPERTY IMPORTED_LOCATION ${CMAKE_INSTALL_FULL_LIBDIR}/${CMAKE_STATIC_LIBRARY_PREFIX}timestwo${CMAKE_STATIC_LIBRARY_SUFFIX})
set_property(TARGET timestwo PROPERTY INTERFACE_INCLUDE_DIRECTORIES ${CMAKE_INSTALL_FULL_INCLUDEDIR})

add_executable(test_timestwo test_timestwo.f90)  # your program
add_dependencies(test_timestwo CHILD)  # externalproject won't download without this
target_link_libraries(test_timestwo PRIVATE timestwo)

add_dependencies(): make ExternalProject always update and build first
CONFIGURE_HANDLED_BY_BUILD ON: tells CMake not to reconfigure each build, unless the build system requests configure
BUILD_BYPRODUCTS: necessary for Ninja to avoid the error “ninja: error: 'lib', needed by 'target', missing and no known rule to make it”. Note that we can’t use the ExternalProject BINARY_DIR here, since it’s only available via ExternalProject_Get_Property() after ExternalProject_Add().

The imported library timestwo is used in the “parent” project just like any other library.

“child” project CMakeLists.txt includes:

project(child Fortran)

include(GNUInstallDirs)

add_library(timestwo STATIC timestwo.f90)
set_property(TARGET timestwo PROPERTY Fortran_MODULE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_INSTALL_INCLUDEDIR})

Configure “child” Fortran_MODULE_DIRECTORY so that it’s not necessary for “parent” to introspect “child” directory structure.

We have created live ExternalProject examples:

https://github.com/scivision/fortran2018-examples submodules directory
https://github.com/scivision/sparse-fortran/ mumps directory

CMake can detect whether a project is “top level”, that is, NOT included via FetchContent, by using PROJECT_IS_TOP_LEVEL (CMake 3.21 or newer).

target_link_directories() is generally NOT preferred because library name collisions can occur, particularly with system libraries.

Reference: CMake staff comparison of multiple projects with CMake

WSJT-X LoTW Python upload script

June 20, 2021

WSJT-X puts QSO ADIF files in its log directory. Once a Trusted QSL (TQSL) profile is created and authenticated by ARRL LoTW, and the TQSL application is installed, our LoTW upload Python script can be used to upload the QSO ADIF files to LoTW via TQSL in a second or two.

ARRL LoTW recent users list

June 11, 2021

The ARRL Logbook of the World (LoTW) provides a weekly list of recent users in CSV format. This list is used by other services such as PSK Reporter to show recent users of LoTW with an “L” icon. Note that the callsign is for the current holder of the LoTW account, not necessarily the original licensee.
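A downloaded copy of the list could be filtered for recently active callsigns with a few lines of Python. This is only a sketch: the column layout (callsign, then a YYYY-MM-DD date, then a time) is an assumption to check against the actual file, the function name is hypothetical, and the callsigns in the usage note are placeholders rather than real data.

```python
import csv
import io
from datetime import datetime


def recent_callsigns(csv_text, since):
    """Return the set of callsigns whose last-upload date is on or after `since`.

    Assumed row layout: callsign,YYYY-MM-DD,HH:MM:SS
    Rows that do not match this layout are skipped.
    """
    recent = set()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue
        try:
            last_upload = datetime.strptime(row[1].strip(), "%Y-%m-%d")
        except ValueError:
            continue
        if last_upload >= since:
            recent.add(row[0].strip().upper())
    return recent
```

For example, recent_callsigns("N0CALL,2021-06-01,12:00:00\n", datetime(2021, 1, 1)) yields a set holding the single placeholder callsign.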

Diagnose CTest failures from logs

June 10, 2021

CTest automatically logs test outputs to:

${CMAKE_BINARY_DIR}/Testing/Temporary/LastTest.log

If you have a test failure and want to diagnose, first copy this file somewhere else to work with it, in case it gets overwritten. This file is usually quite useful with nice formatting even when running many tests in parallel.

A simple list of all “failed” and “not run” tests is in:

${CMAKE_BINARY_DIR}/Testing/Temporary/LastTestsFailed.log

“Not run” tests are those with a FIXTURES_REQUIRED fixture whose setup test failed or did not run.

At the time of running CTest, one can also use the -O option like:

ctest -O test.log

“ctest -O” only logs what is printed to the screen during the CTest run. Unless the “ctest -V” option is also used, the useful extra information found in LastTest.log, such as the command line that was run, will be missing from “test.log”.
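To post-process the failed test list in a script, LastTestsFailed.log can be read line by line. The sketch below assumes each line has the form "index:test name", which is worth confirming against your own log; the function name is hypothetical.

```python
def parse_failed_tests(text):
    """Return (index, name) tuples from the content of LastTestsFailed.log.

    Assumes each non-empty line looks like "<index>:<test name>".
    Lines that do not start with a numeric index are skipped.
    """
    tests = []
    for line in text.splitlines():
        index, sep, name = line.strip().partition(":")
        if not sep or not index.isdigit():
            continue
        tests.append((int(index), name))
    return tests
```

The text argument would typically come from reading the copied log file, for example with pathlib.Path(...).read_text().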

Rename and cleanup conda Python environment

June 7, 2021

Rename Python conda environment “old” to “new” by copying the environment and deleting the original environment:

conda create --name new --clone old

conda remove --name old --all

Each Miniconda/Anaconda environment consumes disk space. One may wish to delete old, unused conda environments to free disk space. To check the disk size of conda environments, first list all environment paths:

conda env list

This will show entries like:

py37 ~/miniconda3/envs/py37

Print the disk size of a conda environment:

Linux / macOS: du -sh ~/miniconda3/envs/py37
Windows (Command Prompt): dir /s miniconda3\envs\py37
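A cross-platform alternative is to sum the file sizes from Python. This is a rough sketch with a hypothetical function name. It adds up apparent file sizes, so it may report more than "du" for conda environments, because conda commonly hard-links files shared with its package cache and every link is counted here.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of regular files under path, not following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total
```

For example, dir_size_bytes(os.path.expanduser("~/miniconda3/envs/py37")) / 1e9 gives the approximate size in gigabytes.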

Linux control groups tips

June 5, 2021

Linux control groups can limit any user’s CPU, memory or other resource usage. Control groups can be used to test program behavior under constrained resources. Control groups v2 are recommended in general with a new architecture and better performance. By default with RHEL 8, we need to enable cgroups-v2.

Although setting up persistent control groups is straightforward, it’s also possible to create a transient control group from the command line using systemd-run. This can be good for diagnosing program behavior, for example whether a program’s memory use blows up and then comes down faster than “top” might show. An example constraining a program to 2 GB of RAM:

systemd-run --scope -p MemoryMax=2G -p MemorySwapMax=0 ./my_program

The flag --user did not work–we needed to type the sudo password despite running as the standard user.
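When testing a program under several memory limits, it can help to build the systemd-run command line in a script. This sketch only assembles the argument list, using the same options as the command above; the function name and its defaults are hypothetical.

```python
import shlex


def systemd_run_cmd(program, memory_max="2G", swap_max="0"):
    """Build the systemd-run argument list for a transient, memory-limited scope.

    program is a list such as ["./my_program", "arg1"].
    """
    return [
        "systemd-run",
        "--scope",
        "-p", f"MemoryMax={memory_max}",
        "-p", f"MemorySwapMax={swap_max}",
        *program,
    ]


# show the command that would be run, without running it
print(shlex.join(systemd_run_cmd(["./my_program"])))
```

The resulting list can be passed to subprocess.run() to launch the program under the limit.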

Another way to set hardware/firmware-based limits for more intensive benchmarks is to simply use a device with less RAM, edit BIOS/UEFI to only enable a limited amount of RAM, or on Linux use GRUB kernel mem= parameter to constrain the available RAM. Ensure the swap/paging file is turned off.

Clean delete untracked files from Git repo

June 4, 2021

Sometimes files are accidentally spilled into a Git repo. Until the files are added with “git add”, they are “untracked”. If the files match a pattern in “.gitignore”, they generally will not appear in Git operations. Untracked, non-ignored files show with:

git status --porcelain

?? oops.txt

where the question marks indicate the file is untracked.
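To use the untracked file list in a script, the porcelain output can be filtered for the “??” prefix. A minimal sketch, assuming Git is on PATH; the function names are hypothetical, and paths with unusual characters, which Git quotes in this output, are not handled.

```python
import subprocess


def untracked_files(porcelain_output):
    """Return the paths of lines marked '??' in `git status --porcelain` output."""
    return [
        line[3:]
        for line in porcelain_output.splitlines()
        if line.startswith("?? ")
    ]


def git_untracked(repo="."):
    """Run `git status --porcelain` in repo and return its untracked paths."""
    result = subprocess.run(
        ["git", "-C", repo, "status", "--porcelain"],
        check=True,
        capture_output=True,
        text=True,
    )
    return untracked_files(result.stdout)
```

Reviewing this list before running “git clean” is one way to double-check what would be deleted.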

These files may be interactively removed (deleted) by:

git clean -id

When there are files spilled in multiple directories, the “filter by pattern” option lets you select files to retain. The updated display shows the files to be deleted. When satisfied, select “Clean”. There is no trivial way to recover those files, so be sure of your choices.

To clean files matching patterns in .gitignore, add the “-x” option like:

git clean -xid

That’s useful for cleaning up in-source builds, perhaps from Makefile or LaTeX.

Fortran allocate large variable memory

June 2, 2021

Variables larger than a few kilobytes often should be put into heap memory instead of stack memory. Fortran compilers typically put variables with the parameter attribute into stack memory. A good practice in Fortran is to hold non-trivial arrays that are intended to be constant in an allocatable, protected module array instead. Example:

module foo

implicit none (type, external)

integer, allocatable, protected :: x(:,:)

contains

subroutine init()
  allocate(x(1024,256))
  !! in real life, this would be some constant data array or
  !! expression filling the "constant" array x.
  x = 1
end subroutine init

end module


program bar

use foo, only : init, x

call init()

if (any(x /= 1)) error stop "did not init"

end program

In this example, x is approximately a one megabyte variable, assuming kind=int32. Even though the compiler may not warn if we instead declare this variable as parameter, it can cause segfaults and other seemingly random runtime errors.
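The size estimate is simple arithmetic: the number of elements times the bytes per element. A small Python sketch of the calculation, assuming 4-byte (int32) integers as stated above; the function name is hypothetical.

```python
import math


def array_bytes(shape, bytes_per_element):
    """Memory needed for an array's data: product of the extents times element size."""
    return math.prod(shape) * bytes_per_element


size = array_bytes((1024, 256), 4)  # x(1024,256) with int32 elements
print(size, "bytes =", size / 2**20, "MiB")
```

The same function gives a quick estimate of whether a planned array approaches the gigabyte range, where the special techniques in the next section may be needed.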

Normally we would use a derived type instead of a bare module, but we did it here for simplicity.

Fortran allocate large variables

If the variable to be allocated is about one gigabyte or larger, sometimes special techniques are needed, even on systems with very large amounts of RAM including HPC. This is especially the case on Windows systems.

The error messages one may get upon allocating large variables in Fortran include:

Error allocating <N> bytes: Not enough space

Segmentation fault (core dumped)

For Windows, a peculiar limitation is that each variable (including allocatable) cannot exceed the virtual paging file size, even if the Windows computer has a large amount of RAM that isn’t being exceeded. The paging file size may be inspected and set under: Control Panel | System and Security | System | Advanced system settings | Advanced | Performance | Settings | Advanced | Virtual memory

In general, the compiler may need to have the memory model flag set for the situation. This flag has a set of implications.

Intel Fortran: -mcmodel
GCC Gfortran: -mcmodel

macOS 11 excessive SSD write wear

June 1, 2021

Note: This problem is resolved by macOS 11.4.

As noted by Hector Martin and others, early macOS 11 appeared to have a possible kernel bug causing excessive SSD write wear whenever the SSD was in the “on” state. One can use “smartmontools” to check SSD write history:

brew install smartmontools

smartctl --all /dev/disk0

Note that SSD on state time can be much less than powered-on time. This is especially the case for the Mac mini, which may sit powered on but unused for the majority of the time by some users.
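For NVMe SSDs, the smartctl output typically includes a “Data Units Written” count. To turn that count into terabytes, the sketch below assumes the NVMe convention that one data unit is 1000 blocks of 512 bytes (512,000 bytes); check the bracketed size that smartctl prints next to the count to confirm this on your drive. The function name is hypothetical.

```python
BYTES_PER_DATA_UNIT = 1000 * 512  # assumed NVMe convention: thousands of 512-byte blocks


def data_units_to_terabytes(data_units):
    """Convert an NVMe 'Data Units Written' count to decimal terabytes (1e12 bytes)."""
    return data_units * BYTES_PER_DATA_UNIT / 1e12


# one million data units is 0.512 TB under the assumption above
print(data_units_to_terabytes(1_000_000))
```

Comparing this figure between two dates gives the write volume over that interval.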

As noted by Jonas Ribe, Hector Martin and others, macOS 11.4 appears to have fixed this SSD write bug. Thankfully, we haven’t seen the 100+ TB of excess SSD wear pre-11.4 that Jonas did. We saw less than 5 TB of excess wear on mostly idle, continuously powered-on Minis.