Scientific Computing

CMake ExternalProject ensure same compilers

Build systems commonly use the environment variables CC, CXX, and FC to signal which compiler the user intends, regardless of what appears first in the PATH environment variable. A contradiction can arise for a CMake project using ExternalProject because CMAKE_C_COMPILER et al. are not automatically passed to the ExternalProject. Thus if the environment has “CC=gcc” but the user configures the top-level project with “cmake -DCMAKE_C_COMPILER=icx”, the top-level CMake project uses icx (IntelLLVM) while the subproject uses GCC, which can cause unintended results.

The fix for this issue is to explicitly pass CMAKE_C_COMPILER et al to the ExternalProject CMAKE_ARGS if the subproject is also a CMake project.

set(args
  -DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}
  -DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}
  -DCMAKE_Fortran_COMPILER=${CMAKE_Fortran_COMPILER}
)

ExternalProject_Add(...
  CMAKE_ARGS ${args}
)

For an Autotools subproject, pass the compilers as arguments to the configure script:

set(args
  CC=${CMAKE_C_COMPILER}
  CXX=${CMAKE_CXX_COMPILER}
  FC=${CMAKE_Fortran_COMPILER}
)

ExternalProject_Add(...
  CONFIGURE_COMMAND <SOURCE_DIR>/configure ${args}
)

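For a GNU Make subproject, pass the compilers on the make command line:

set(args
  CC=${CMAKE_C_COMPILER}
  CXX=${CMAKE_CXX_COMPILER}
  FC=${CMAKE_Fortran_COMPILER}
)

# MAKE_EXECUTABLE is not a CMake built-in variable; it is assumed to be
# set earlier in the project, for example by find_program()
ExternalProject_Add(...
  CONFIGURE_COMMAND ""
  BUILD_COMMAND ${MAKE_EXECUTABLE} -j ${args}
)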
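
One way to check that the compilers really match is to compare the compiler recorded in the CMakeCache.txt of each build directory. The following is a minimal shell sketch for a CMake subproject. The helper names cache_value and same_compiler are hypothetical, and the location of the subproject build directory is an assumption that depends on how ExternalProject_Add was configured.

```shell
# cache_value CACHE_FILE KEY
# Print the value of KEY from a CMakeCache.txt, whose entries have the
# form KEY:TYPE=VALUE.
cache_value() {
  sed -n "s|^$2:[A-Z]*=||p" "$1" | head -n 1
}

# same_compiler TOP_BUILD_DIR SUB_BUILD_DIR LANG
# Succeed only if both caches record the same compiler for LANG
# (C, CXX or Fortran).
same_compiler() {
  key="CMAKE_$3_COMPILER"
  top=$(cache_value "$1/CMakeCache.txt" "$key")
  sub=$(cache_value "$2/CMakeCache.txt" "$key")
  [ -n "$top" ] && [ "$top" = "$sub" ]
}
```

This is a plain text comparison, so a mismatch (for example a bare compiler name in one cache and a full path in the other) is a prompt to look closer rather than proof of a problem.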

CMake archive extract syntax

CMake 3.18 and newer provide file(ARCHIVE_EXTRACT), which is more robust and easier to use than the prior approach of running “cmake -E tar” through execute_process().

if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.18)
  file(ARCHIVE_EXTRACT INPUT ${archive} DESTINATION ${out_dir})
else()
  # older, less robust
  file(MAKE_DIRECTORY ${out_dir})

  execute_process(
    COMMAND ${CMAKE_COMMAND} -E tar xf ${archive}
    WORKING_DIRECTORY ${out_dir}
    RESULT_VARIABLE ret
  )
  if(NOT ret EQUAL 0)
    message(FATAL_ERROR "failed to extract ${archive} to ${out_dir}: ${ret}")
  endif()
endif()

CMake print to stdout or stderr

CMake can print cleanly to the “stderr” pipe like:

message(NOTICE "this message is on stderr pipe")

However, printing cleanly to stdout without the leading “-- ” that message(STATUS) adds requires a workaround:

execute_process(COMMAND ${CMAKE_COMMAND} -E echo "this message is on stdout pipe")

NOTE: Another technique that does NOT work in general for stdout is to print the invisible carriage return character. The dashes are hidden only when the output is displayed in a terminal. When CMake stdout is piped to another shell command, the dashes and the carriage return characters are still in the stream.

# don't do this: it only looks right when printed to a terminal, not for a stdout pipe

string(ASCII 13 cr)
message(STATUS "${cr}  ${cr}No dashes visible, but stdout pipe is messed up")
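
To see which pipe a given message lands on, each stream can be captured separately in the shell. A minimal sketch follows. Here emit_both is a hypothetical stand-in for a command that writes to both pipes, such as a CMake script run, so that the sketch works without CMake installed.

```shell
# Hypothetical stand-in for a command that writes to both pipes.
emit_both() {
  echo "this message is on stdout pipe"
  echo "this message is on stderr pipe" >&2
}

# Capture stdout only: stderr is discarded.
only_stdout=$(emit_both 2>/dev/null)

# Capture stderr only: stderr is pointed at the capture pipe first,
# then stdout is discarded.
only_stderr=$(emit_both 2>&1 >/dev/null)

printf 'stdout: %s\n' "$only_stdout"
printf 'stderr: %s\n' "$only_stderr"
```

Replacing emit_both with the real command shows whether stray dashes or carriage return characters reach the stdout pipe.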

Fortran maximum name and line lengths

Ancient Fortran code readability is impacted by the restrictions on name length and line length, which could lead to inscrutable variable and procedure names. The Fortran 2003 standard raised many of these limits to lengths that are likely to matter only for auto-generated code that uses very long internal names. When going beyond the usual name lengths, it’s a good idea to test across the compiler vendors and versions of interest to ensure that they support the proposed name lengths.

We provide code examples verifying that compilers can support 63-character syntax elements (names for modules, submodules, variables), which is the maximum set by the Fortran 2003 standard. The maximum line length is officially 132 characters, but can be much longer depending on the compiler and compiler options.
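
As a rough illustration of such a check (not the examples referred to above), the shell sketch below generates a small Fortran program that declares a variable whose name has a chosen length. The helper names make_name and write_test_program and the output file name are hypothetical.

```shell
# make_name N
# Print an identifier of exactly N characters: a leading letter plus padding.
make_name() {
  n=$1
  name="a"
  while [ "${#name}" -lt "$n" ]; do
    name="${name}x"
  done
  printf '%s\n' "$name"
}

# write_test_program N FILE
# Write a free-form Fortran program that declares, assigns and prints
# an integer variable with an N-character name.
write_test_program() {
  name=$(make_name "$1")
  cat > "$2" <<EOF
program long_name_test
implicit none
integer :: ${name}
${name} = 1
print '(i0)', ${name}
end program
EOF
}
```

For example, “write_test_program 63 long_name.f90” followed by compiling long_name.f90 with each compiler of interest checks the 63-character case. Repeating with 64 would be expected to fail on a compiler that enforces the standard limit.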

WSL cron scheduled tasks

Windows Subsystem for Linux version 0.67.6 and newer can use systemd, which makes using Cron on WSL straightforward. Check the WSL version from Windows Terminal (not WSL):

wsl --version

Check that Cron is running in WSL:

systemctl status cron

Then use crontab -e as usual.

Cron log file

Many types of programs can be run via Cron, including CTest scripts that upload to CDash. While CDash gives information once the CMake / CTest processes have started, if the CTest scripts aren’t running, the Cron job may need to be debugged.

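Debugging Cron jobs involves examining Cron logs, but Cron logging is optional. For debugging, it may be easier to redirect the job’s stdout and stderr to files by appending redirections to the end of the cron job line. TMPDIR may not be defined in the Cron environment; if it is not, define it at the top of the crontab or use an absolute path instead. For example: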

$HOME/my_exe 1>${TMPDIR}/out.log 2>${TMPDIR}/err.log
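
When a job is started from a script rather than directly from the crontab line, the same redirection can be wrapped in a small helper that also records when the job ran and how it exited. This is a minimal sketch. The name run_logged and the log file names are assumptions for illustration.

```shell
# run_logged LOGDIR COMMAND [ARGS...]
# Run COMMAND, appending its stdout to LOGDIR/out.log and its stderr to
# LOGDIR/err.log, with a timestamped start line and the exit code.
run_logged() {
  logdir=$1
  shift
  mkdir -p "$logdir" || return 1
  {
    printf '%s start: %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$*"
    "$@"
    status=$?
    printf '%s exit code: %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$status"
  } >> "$logdir/out.log" 2>> "$logdir/err.log"
  return "$status"
}
```

If out.log has no start line for the expected time, the job never ran, which points to the crontab entry or the Cron service rather than the program itself.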

On macOS, Cron logging is disabled by default, but the results of cron jobs are available via system mail, which is stored in /var/mail/$(whoami). Apple launchd can be used instead of Cron via Property List files, but many simply prefer Cron.

On Linux systems, Cron may log to locations like “/var/log/cron”.

For systems that use “rsyslog”, check “/etc/rsyslog.conf” for this line to be uncommented:

cron.*          /var/log/cron.log

conda run non-interactive environment

“conda activate” is intended for interactive shells and therefore doesn’t work properly with Cron jobs and other non-interactive shell environments. Rather than “bash -l”, consider using conda run.

By default the stdin/stdout/stderr are captured by conda run, so no text input or output is seen until after the executable finishes running. For testing scripts, it’s possible to get live stdin/stdout/stderr with the option:

conda run --no-capture-output <exe>

For example, if a Cron job runs a CTest or CDash script that needs to use a Conda Python environment “myenv”, make a Cron job like:

conda run --name myenv ctest -S memcheck.cmake

Each Cron job in “crontab -e” is a single line. More complex Cron jobs are generally handled by calling a script from the crontab. For example, suppose a CMake project using Conda Python needs to be configured, built and tested. Make a Cron job script like:

# So CMake finds Conda Python environment "myenv"
conda run --name myenv \
  cmake -Bbuild

cmake --build build --parallel
# preface with "conda run --name myenv" if Python invoked during the build

conda run --name myenv \
  ctest --test-dir build

NOTE: environment variables are defined for all cron jobs at the top of the crontab. The “conda” executable must be on PATH environment variable, or specify the full path to “conda” in each cron job line.

For example, on macOS the crontab PATH might be like:

PATH=/Users/username/miniconda3/condabin:/opt/homebrew/sbin:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin

In particular, note that environment variables in the crontab can’t refer to other environment variables like $HOME, so type the full path on the system.
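
Because the crontab needs literal paths, it can help to let an interactive shell do the expansion once and paste the result. A minimal sketch, where the helper name crontab_path_line and the Conda directory are assumptions:

```shell
# crontab_path_line CONDA_BIN_DIR [BASE_PATH]
# Print a fully expanded PATH= line suitable for pasting at the top of
# a crontab. BASE_PATH defaults to the PATH of the current shell.
crontab_path_line() {
  printf 'PATH=%s:%s\n' "$1" "${2:-$PATH}"
}
```

Running “crontab_path_line "$HOME/miniconda3/condabin"” in an interactive shell prints the line with $HOME already expanded by that shell.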

Gfortran type mismatch error flag

Gfortran 10 made type mismatches an error by default instead of a warning. Legacy Fortran programs too often did not explicitly specify the procedure interfaces, in order to allow implicit polymorphism. The Fortran 2018 standard cleaned up this situation in part with type(*).

Workaround: Gfortran flag -fallow-argument-mismatch can be used to downgrade the errors to warnings. It is however strongly recommended to fix the problem in the legacy code, if you own that code.
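
Since the flag exists only in Gfortran 10 and newer, a build script may want to add it conditionally. The following is a minimal shell sketch; the helper name mismatch_flag is hypothetical and it takes the version string as an argument so that it can be tried without Gfortran installed.

```shell
# mismatch_flag VERSION
# Print -fallow-argument-mismatch if the major version of VERSION
# (for example "13" or "9.4.0") is 10 or newer, otherwise print nothing.
mismatch_flag() {
  major=${1%%.*}
  if [ "$major" -ge 10 ] 2>/dev/null; then
    printf '%s\n' "-fallow-argument-mismatch"
  fi
}
```

A possible use is FFLAGS="$(mismatch_flag "$(gfortran -dumpversion)")", which leaves FFLAGS empty for older Gfortran versions.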

Possible workaround

For external libraries like MPI-2, the interfaces are intended to be polymorphic but use Fortran 90-style interfaces. The user code can declare an explicit interface.

However, this is not recommended – we have seen intermittent runtime errors with MPI-2 using the technique below, that were entirely fixed by using the “mpi_f08” MPI-3+ interface.

use mpi, only : MPI_STATUS_SIZE

implicit none

interface
  !! This avoids GCC >= 10 type mismatch errors for MPI-2
  subroutine mpi_send(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
    type(*), dimension(..), intent(in) :: BUF
    integer, intent(in) :: COUNT, DATATYPE, DEST, TAG, COMM
    integer, intent(out) :: IERROR
  end subroutine

  subroutine mpi_recv(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)
    import MPI_STATUS_SIZE
    !! BUF is written by the receive, so it is not intent(in);
    !! type(*) dummies cannot be intent(out), so no intent is given
    type(*), dimension(..) :: BUF
    integer, intent(in) :: COUNT, DATATYPE, SOURCE, TAG, COMM
    integer, intent(out) :: STATUS(MPI_STATUS_SIZE), IERROR
  end subroutine

end interface