Clear temporary scratch files on HPC

Unix-like HPC systems often have shared temporary scratch directories mapped by environment variable $TMPDIR to a directory like “/scratch” or “/tmp”. $TMPDIR may be used for temporary files during build or computation. $TMPDIR is often shared among all users with no expectation of preservation or backup. If user files are left in $TMPDIR, the HPC system may email a periodic alert to the user.

If the user determines that $TMPDIR files aren’t needed after the HPC batch job completes, one can clear $TMPDIR files with a command near the end of the batch job script. Carefully consider whether this is appropriate for the specific use case, as the scratch files will be permanently deleted.

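rm -r -i "$TMPDIR"

Verify that this deletes only the user’s own files, since each user should have permission to delete only their own files. Note that rm writes its “-i” prompts to standard error, so don’t redirect stderr while testing interactively. Once this is established, to use this command in batch scripts replace the “-i” with “-f” to make it non-interactive, and optionally append 2>/dev/null to silence permission errors for other users’ files.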
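
As a more conservative variant, the cleanup can be restricted explicitly to entries owned by the current user. The following is a minimal sketch, assuming a find that supports -mindepth and -delete (GNU and BSD find do); the function name clear_scratch is hypothetical and not part of any HPC tooling.

```shell
#!/bin/sh
# Sketch: delete only entries under a scratch directory that are owned by
# the current user. "clear_scratch" is an illustrative name.
clear_scratch() {
    dir=${1:-}
    # Refuse to run on an empty or missing path.
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "clear_scratch: not a directory: '$dir'" >&2
        return 1
    fi
    # -mindepth 1 keeps the scratch directory itself; -delete works
    # depth-first, so a user's directories go after their contents.
    find "$dir" -mindepth 1 -user "$(id -u)" -delete 2>/dev/null
}

# Near the end of a batch script (commented out because it deletes files):
# clear_scratch "$TMPDIR"
```

The function returns non-zero if find could not delete something, for example a directory that still holds another user’s files.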

CMake TARGET_RUNTIME_DLL_DIRS for CTest

Building Windows shared libraries generally creates DLLs whose directory must be on environment variable PATH when the executable target is run. Windows exit code -1073741515, corresponding to hex code 0xc0000135, is returned when the necessary DLLs are not in the program’s working directory or on PATH. This makes CTest tests fail with error code 135.
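
The decimal and hex forms are the same 32-bit pattern, read once as a signed integer and once as unsigned. A small sketch of the conversion, where ntstatus_hex is a hypothetical helper name, masks the value to 32 bits and prints it in hex:

```shell
#!/bin/sh
# Sketch: show a signed 32-bit Windows exit code as unsigned hex.
ntstatus_hex() {
    printf '0x%08x\n' "$(( $1 & 0xFFFFFFFF ))"
}

ntstatus_hex -1073741515   # prints 0xc0000135
```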

The CMake generator expression TARGET_RUNTIME_DLL_DIRS along with test property ENVIRONMENT_MODIFICATION can be used to set the PATH environment variable for the test, gathering all the directories of the DLLs CMake knows the target needs.

set_property(TEST adder PROPERTY ENVIRONMENT_MODIFICATION PATH=path_list_append:$<TARGET_RUNTIME_DLL_DIRS:main>)

A minimal example CMakeLists.txt uses the properties above so that the test finds its DLLs.

Limit code language standard

C++17 and C++20 standard code is used throughout projects of all sizes, perhaps with limited-feature fallback to older language standards. Some standards certifications require a specific language standard. High reliability and safety-critical projects may require specific language standards. Examples include FACE and MISRA C++.

To limit a project to a specific language standard, consider placing a check like the following in a header used throughout the project. This example limits the language standard to C++14 or earlier by halting the build if a higher standard is detected:

#if __cplusplus >= 201703L
#error "C++14 or earlier required"
#endif

For C code limited to, say, no higher than C99, consider the following in a header used throughout the project, which halts the build if a higher standard is detected:

#if __STDC_VERSION__ >= 201112L
#error "C99 or earlier required"
#endif
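
One way to check that such a guard behaves as intended is to feed it to the compiler under different standard flags. This is a rough sketch, assuming a GCC- or Clang-style driver that accepts -std=, -fsyntax-only and "-x c++ -" for reading from stdin; check_std_guard is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: pipe the C++ guard from the text to a compiler and return the
# compiler's exit status. Usage: check_std_guard <compiler> <std-flag>
check_std_guard() {
    printf '%s\n' \
        '#if __cplusplus >= 201703L' \
        '#error "C++14 or earlier required"' \
        '#endif' \
        'int main() { return 0; }' |
        "$1" "$2" -fsyntax-only -x c++ - 2>/dev/null
}

# Example (only meaningful when g++ is installed):
# check_std_guard g++ -std=c++14 && echo "C++14 accepted"
# check_std_guard g++ -std=c++17 || echo "C++17 rejected"
```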

CMake ignore Anaconda libraries and compilers

With Anaconda Python, “conda activate” puts Conda directories first on environment variable PATH. This leads CMake to prefer Anaconda binaries (find_library, find_program, …) and Anaconda GCC compilers (if installed) over those in later directories on the system PATH. Anaconda libraries and compilers are generally incompatible with the system or desired compiler. For certain libraries like HDF5, Anaconda is particularly prone to interfering with CMake.

Detect Anaconda environment by existence of environment variable CONDA_PREFIX.

Fix by putting code like the following in CMakeLists.txt.

NOTE: CMAKE_IGNORE_PREFIX_PATH does not take effect if set within Find*.cmake.

cmake_minimum_required(VERSION ...)

# ignore Anaconda compilers, which are typically not compatible with the system
if(DEFINED ENV{CONDA_PREFIX})
  set(CMAKE_IGNORE_PATH $ENV{CONDA_PREFIX}/bin)
endif()

project(my LANGUAGES C)

# Optional next two lines if needing Python in CMake project
unset(CMAKE_IGNORE_PATH)
find_package(Python ...)
# end optional lines

# exclude Anaconda directories from search
if(DEFINED ENV{CONDA_PREFIX})
  list(APPEND CMAKE_IGNORE_PREFIX_PATH $ENV{CONDA_PREFIX})
  list(APPEND CMAKE_IGNORE_PATH $ENV{CONDA_PREFIX}/bin)
  # need CMAKE_IGNORE_PATH to ensure system env var PATH
  # doesn't interfere despite CMAKE_IGNORE_PREFIX_PATH
endif()

To totally omit environment variable PATH from CMake find_* use CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH:

set(CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH false)

However, this can be too aggressive, i.e. it might miss other programs on PATH that are actually wanted.
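
A less drastic option outside CMake is to drop only the Conda directories from PATH for a single configure run. The sketch below is an alternative to try, not something the CMake code above requires; strip_prefix_from_path is a hypothetical helper, and it assumes PATH entries contain no glob characters.

```shell
#!/bin/sh
# Sketch: print a PATH-like string ($2) without entries under a prefix ($1).
# The body runs in a subshell, so changing IFS does not leak to the caller.
strip_prefix_from_path() (
    if [ -z "$1" ]; then
        # No prefix given (e.g. CONDA_PREFIX unset): leave PATH unchanged.
        printf '%s\n' "$2"
    else
        out=
        IFS=:
        for dir in $2; do
            case $dir in
                "$1"|"$1"/*) ;;                  # skip entries under the prefix
                *) out=${out:+$out:}$dir ;;
            esac
        done
        printf '%s\n' "$out"
    fi
)

# Example, configuring with Conda directories hidden from CMake:
# PATH=$(strip_prefix_from_path "${CONDA_PREFIX:-}" "$PATH") cmake -B build
```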

Windows host cross-build for Linux target

Visual Studio supports cross-builds on Windows host for Linux targets. This requires either a remote Linux machine connection, or using WSL on the local computer.

A more robust solution without additional setup on developer computers is CI/CD such as GitHub Actions or many other online and offline choices such as Jenkins. When the developer Git pushes, the CD job provides binaries across operating systems.

An example of GitHub Actions CD is the Ninja project, which provides binaries for old (CentOS 7) Linux, macOS and Windows. This could easily be extended to ARM etc.

Black format exclude multiple directories

To exclude multiple directories from the Black Python code formatter, use the multi-line regex format shown below in pyproject.toml. The multi-line format seems to be required; other ways we tried didn’t take effect.

Edit, add, or remove as many directories as desired (indentation is not important). Note the escaping needed for “.” since this is a regex.

This is particularly useful when using Black in a project with Git submodules, so that Black run from the top-level project does not disturb the submodules’ Python code. Likewise, for other tools such as flake8 and mypy, set exclude in their settings for Git submodules.

[tool.black]
force-exclude = '''
/(
\.git
| \.mypy_cache
| \.venv
| _build
| build
| dist
)/
'''
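
For a quick sanity check of which paths such a pattern would select, the alternation can be collapsed to one line and tried with grep -E. This only approximates Black, which evaluates the pattern with Python’s regex engine; the paths below and the helper name is_excluded are hypothetical.

```shell
#!/bin/sh
# Sketch: single-line equivalent of the multi-line pattern above.
pattern='/(\.git|\.mypy_cache|\.venv|_build|build|dist)/'

# Succeeds if the relative path $1 contains one of the excluded directories.
is_excluded() {
    printf '%s\n' "/$1" | grep -E -q -- "$pattern"
}

for p in build/lib/foo.py sub/.venv/lib/a.py src/rebuild/x.py src/pkg/mod.py; do
    if is_excluded "$p"; then
        echo "excluded: $p"
    else
        echo "kept:     $p"
    fi
done
```

The slashes on both sides of the group are what keep a directory such as “rebuild” from matching “build”.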

CMake ExternalProject/FetchContent Git vs. URL archive

CMake ExternalProject and FetchContent can download from Git or URL archive. Archive download is usually much faster, especially for projects with a large number of Git commits. Checksum of the archive can optionally be verified with URL_HASH option.
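
To obtain the value for URL_HASH, the archive can be hashed once after downloading it. A minimal sketch, assuming sha256sum or shasum is installed; url_hash_arg is a hypothetical helper and the archive name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch: print a URL_HASH argument for a previously downloaded archive.
url_hash_arg() {
    if command -v sha256sum >/dev/null 2>&1; then
        digest=$(sha256sum "$1" | cut -d' ' -f1)
    else
        # macOS and some BSDs ship shasum instead of sha256sum.
        digest=$(shasum -a 256 "$1" | cut -d' ' -f1)
    fi
    printf 'URL_HASH SHA256=%s\n' "$digest"
}

# url_hash_arg archive.tar.gz
```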

Example

Git submodule

Git config can enable parallel fetching (fetch.parallel), so at first glance cloning Git submodules in parallel might be something the ExternalProject GIT_CONFIG option could do, but we have not tried this.

GitHub OAuth token

To give secure access to private GitHub repositories on less-trusted systems like CI, HPC, or a shared workstation, consider GitHub OAuth tokens. The OAuth token can give read-only (or other fine-grained) permissions to all or a specific subset of repositories the GitHub account has access to.

Create a GitHub OAuth token with the desired permissions.

For read-only private GitHub repo access, select the “repo” permission group.

Copy the token text string and SSH into the remote system where access is desired. Configure the global user Git config to use the OAuth token for the desired GitHub organization or user.

Suppose a coworker “sara” has a private GitHub repo “myrepo” and has added your GitHub username as a collaborator in the “myrepo” settings. On the remote computer, configure Git to use the OAuth token for the “sara” GitHub user:

git config --global url.https://oauth2:OauthToken@github.com/sara/.insteadOf https://github.com/sara/

A similar syntax is used for GitHub organizations or specific repositories.

The text OauthToken is replaced with the actual OAuth token string from GitHub.
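
To see what the rewrite rule looks like before touching the global config, it can be written to a throwaway config file with git config --file and listed back. This is a sketch only: show_rewrite_rule is a hypothetical helper, and OauthToken remains a placeholder.

```shell
#!/bin/sh
# Sketch: store the url.<base>.insteadOf rule in a temporary config file
# and print it. $1 is the GitHub user or organization, e.g. sara.
show_rewrite_rule() {
    owner=$1
    cfg=$(mktemp) || return 1
    git config --file "$cfg" \
        "url.https://oauth2:OauthToken@github.com/$owner/.insteadOf" \
        "https://github.com/$owner/"
    # List every url.*.insteadOf rule stored in that file.
    git config --file "$cfg" --get-regexp '^url\..*\.insteadof$'
    rm -f "$cfg"
}

# show_rewrite_rule sara
```

Passing an organization name instead of a user name produces the organization form of the same rule.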


Fortran module file format

The Fortran standard does not define a .mod file format, and every compiler has a unique incompatible format. This means that .mod files are not portable between different compilers or even different versions of the same compiler.

Gfortran .mod files are GZIP text. The format is not documented, and it is not recommended to rely on the format. The .mod file format is not guaranteed to be stable between different versions of Gfortran.

Extract the .mod file header like:

gunzip -c file.mod | head

The output starts like:

GFORTRAN module version '15' created from …

Reference

GCC version    module file version
-----------------------------------
up to 4.3.2    unversioned
4.4            0
4.5.1          4
4.6.3          6
4.7.0pre       8
4.7.1          9
4.8.[1-3]      10
4.9.2          12
5.1.0          14
8.1.0          15
9 - 13         15

If you get gzip: not in gzip format then the .mod file is probably from another compiler vendor e.g. Intel oneAPI.
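
The header check and the gzip test can be combined in a small helper. A minimal sketch, where mod_header is a hypothetical name; it reads the file through stdin so the .mod suffix does not matter to gzip.

```shell
#!/bin/sh
# Sketch: print the first header line of a gzip-compressed .mod file,
# or report that the file is not gzip data.
mod_header() {
    if gzip -t < "$1" 2>/dev/null; then
        gunzip -c < "$1" | head -n 1
    else
        echo "$1: not in gzip format (possibly from another compiler)" >&2
        return 1
    fi
}

# mod_header file.mod
```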

GitHub Actions per-job compiler

GitHub Actions workflows can use different compilers per job. This is useful for programs and libraries that need distinct compiler versions. An example of this is Matlab, where each Matlab release has a range of compatible compilers.

  • Matlab R2021a, R2021b: GCC-8
  • Matlab R2022a, R2022b, R2023a, R2023b: GCC-10

Implement in GitHub Actions:

jobs:

  linux:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        release: [R2021b, R2023b]

    steps:

    - name: GCC-8 (Matlab < R2022a)
      if: ${{ matrix.release < 'R2022a' }}
      run: |
        echo "CC=gcc-8" >> $GITHUB_ENV
        echo "CXX=g++-8" >> $GITHUB_ENV
        echo "FC=gfortran-8" >> $GITHUB_ENV

    - name: GCC-10 (Matlab >= R2022a)
      if: ${{ matrix.release >= 'R2022a' }}
      run: |
        echo "CC=gcc-10" >> $GITHUB_ENV
        echo "CXX=g++-10" >> $GITHUB_ENV
        echo "FC=gfortran-10" >> $GITHUB_ENV

    - name: Install MATLAB
      uses: matlab-actions/setup-matlab@v2
      with:
        release: ${{ matrix.release }}
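
The two conditional steps could also be folded into one shell step that maps the release to a GCC version. The following is a sketch based only on the release list above; matlab_gcc_version is a hypothetical helper, and releases outside that list are treated as unknown.

```shell
#!/bin/sh
# Sketch: map a Matlab release name to the GCC major version listed above.
matlab_gcc_version() {
    case $1 in
        R2021a|R2021b) echo 8 ;;
        R2022a|R2022b|R2023a|R2023b) echo 10 ;;
        *) echo "unknown Matlab release: $1" >&2; return 1 ;;
    esac
}

# Inside a workflow "run:" step, where $release holds the matrix value:
# v=$(matlab_gcc_version "$release") || exit 1
# echo "CC=gcc-$v" >> "$GITHUB_ENV"
# echo "CXX=g++-$v" >> "$GITHUB_ENV"
# echo "FC=gfortran-$v" >> "$GITHUB_ENV"
```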