Scientific Computing

Clean up unused files in Linux

Keep at least 10% of drive space free to avoid:

  • SSD wear
  • HDD fragmentation

Find what is using disk space on Linux / macOS / Windows Subsystem for Linux with “ncdu”. ncdu uses an Ncurses terminal interface to quickly show the largest files and directories in the filesystem tree, which is handy for finding large items that may no longer be needed.
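
Where ncdu is not installed, a rough equivalent can be sketched in Python. This is only an illustration of the idea, not how ncdu works internally: the function names tree_size and largest_entries are made up for this example, and the sizes are apparent file sizes rather than the disk blocks actually allocated.

```python
import os


def tree_size(path):
    """Sum apparent sizes (bytes) of regular files under `path`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            fp = os.path.join(dirpath, name)
            try:
                if not os.path.islink(fp):
                    total += os.path.getsize(fp)
            except OSError:
                pass  # unreadable or vanished file: skip it
    return total


def largest_entries(root, n=10):
    """Return the `n` largest immediate children of `root` as (name, bytes) pairs."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            try:
                if entry.is_symlink():
                    continue
                if entry.is_dir():
                    sizes.append((entry.name, tree_size(entry.path)))
                else:
                    sizes.append((entry.name, entry.stat().st_size))
            except OSError:
                pass  # permission denied or similar: skip it
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:n]
```

Calling largest_entries(os.path.expanduser("~")) would list the ten largest items in the home directory, largest first.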

df -h

gives a drive-level summary of disk usage.
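
The 10% guideline above can also be checked from a script. A minimal sketch using the Python standard library function shutil.disk_usage; the helper names and the 0.10 default threshold are assumptions chosen for this example.

```python
import shutil


def free_fraction(path="/"):
    """Return the free fraction (0.0 to 1.0) of the drive holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo-filesystem reporting no capacity
    return usage.free / usage.total


def is_low_on_space(path="/", min_free=0.10):
    """True when less than `min_free` of the drive is free."""
    return free_fraction(path) < min_free


if __name__ == "__main__":
    print(f"{free_fraction('/'):.1%} free on /")
    if is_low_on_space("/"):
        print("Less than 10% free: consider cleaning up.")
```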

Package managers cache downloaded package files in case a reinstall is needed. Since packages can be redownloaded, clearing the cache saves disk space. For APT (common on Debian-based systems):

apt autoclean

or for DNF (Fedora, RHEL, CentOS):

dnf clean packages

Remove unwanted packages

TeX Live documentation can consume a lot of disk space. To clean up the documentation, consider removing packages matching texlive-*doc. This also removes the texlive-full metapackage, without detriment to TeX Live itself.

Packages removed for texlive-doc to save over 1 GB of disk space.


CMake FindOpenSSL hints

For all CMake find_*() commands, including find_package() with FindOpenSSL, the package path can be hinted by setting an appropriate environment variable or CMake variable. This example supposes the Homebrew package manager has installed OpenSSL 1.1, which the user wishes to use in a CMake project. To hint the package path when configuring a CMake project, either set OpenSSL_ROOT as an environment variable:

export OpenSSL_ROOT=$(brew --prefix openssl@1.1)

or directly in the CMake configure command:

cmake -B build -DOpenSSL_ROOT=$(brew --prefix openssl@1.1)

The example CMakeLists.txt:

cmake_minimum_required(VERSION 3.16)

project(f LANGUAGES NONE)

find_package(OpenSSL REQUIRED)

Use the --debug-find CMake option to see the paths CMake is searching.

To disable various search paths, consider the following CMake variables. These are normally only used for debugging or special cases.

set(CMAKE_FIND_USE_CMAKE_PATH false)
set(CMAKE_FIND_USE_CMAKE_SYSTEM_PATH false)
set(CMAKE_FIND_USE_CMAKE_ENVIRONMENT_PATH false)
set(CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH false)

The OpenSSL world is gradually transitioning from OpenSSL 1.1 to 3, and Homebrew uses a separate subdirectory for each version to isolate the OpenSSL installs. CMake does not search recursively, as that would in general have no stopping condition and would at the least significantly slow the search.

GitHub Actions Apple Silicon CPU

GitHub Actions macOS runners can use Apple Silicon CPUs, which is what most Apple users have. Some build problems, including linker errors, have historically been specific to Apple Silicon. Generally it’s good to test on the same CPU architecture as the target platform.

We sometimes find it necessary to select the Xcode version compatible with Homebrew GCC if build errors occur that are not present on a physical Apple Silicon laptop.

jobs:

  mac:
    runs-on: macos-14

    strategy:
      matrix:
        cxx: [g++-14, clang++]

    env:
      HOMEBREW_NO_AUTO_CLEANUP: 1
      CXX: ${{ matrix.cxx }}

    steps:
    - uses: actions/checkout@v4

    - run: sudo xcode-select --switch /Applications/Xcode_15.1.app

    - run: cmake --workflow --preset debug

    - run: cmake --workflow --preset release

In this example Ninja enables quick testing of builds in Debug and Release mode, which is important to catch bugs.

macOS WiFi BSSID scan

The undocumented, discontinued macOS command-line utility airport (not to be confused with the AirPort Utility app) gave detailed information about the current WiFi connection and nearby WiFi APs. This utility was located at /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport.

Since airport was discontinued, obtaining the current BSSID requires the CoreWLAN framework, as demonstrated in the Python project scan-wifi-python.

Apple provides a list of device WiFi support.

Matlab on macOS doesn't source ~/.zshrc

On macOS, Matlab does not source ~/.zshrc. This issue has existed at least since macOS started using ZSH as the default shell.

To work around this issue, particularly when programs from package managers like Homebrew are needed, add a setup script to the Matlab project with content like:

if ~ismac
  return
end

% Add Homebrew to the PATH
[ret, homebrew_prefix] = system('brew --prefix');
if ret == 0
  p = fullfile(strip(homebrew_prefix), "bin");
  if isfolder(p)
    setenv('PATH', append(p, pathsep, getenv('PATH')))
  end
end

Then programs installed by Homebrew like CMake, GCC, etc. will be on the PATH environment variable in Matlab.

Note that the Matlab commands below do not help:

!source ~/.zshrc

system("source ~/.zshrc")

PowerShell tilde expansion

PowerShell tilde expansion was dropped in 7.4.0. Automatic variable $home remains available across operating systems.

ls $home

Tilde expansion was fraught with difficulties, which led the PowerShell maintainers to drop it, at least temporarily.

Note that automatic variables exist only inside PowerShell itself; they are not environment variables. Thus, automatic PowerShell variables are generally not visible to other programs or scripts unless additional steps are taken to expose them, perhaps as a command line argument or environment variable.

Aspell don't backup

Aspell creates backup files with a .bak extension by default. To turn off the backup files, configure Aspell not to create them. Often there is no user configuration file “aspell.conf” present. Even if a config file is present, it can be overridden by the environment variable ASPELL_CONF:

export ASPELL_CONF="dont-backup;"

or do similarly through Control Panel in Windows.

Confirm the setting has taken effect by:

aspell dump config | more

and look for the line backup false.


Related: Aspell dictionary location

Clear temporary scratch files on HPC

Unix-like HPC systems often have shared temporary scratch directories mapped by environment variable $TMPDIR to a directory like “/scratch” or “/tmp”. $TMPDIR may be used for temporary files during build or computation. $TMPDIR is often shared among all users with no expectation of preservation or backup. If user files are left in $TMPDIR, the HPC system may email a periodic alert to the user.

If the user determines that $TMPDIR files aren’t needed after the HPC batch job completes, one can clear $TMPDIR files with a command near the end of the batch job script. Carefully consider whether this is appropriate for the specific use case, as the scratch files will be permanently deleted.

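rm -r -i "$TMPDIR"

Verify that this deletes only the user’s own files; each user normally has write permission only for their own files. Once this is established, replace “-i” with “-f” in batch scripts to make the command non-interactive, and append “2>/dev/null” to silence permission errors for other users’ files. Do not redirect stderr while using “-i”, because rm writes its confirmation prompts to stderr and they would be hidden.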
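
An alternative that checks ownership explicitly, rather than relying on permission errors, can be sketched in Python. This is a minimal illustration under stated assumptions: a Unix-like system (os.getuid is not available on Windows), a hypothetical function name clear_own_files, and a dry-run default so nothing is deleted until the listing has been reviewed. Unlike rm -r, it leaves the top-level scratch directory itself in place.

```python
import os
import stat
from pathlib import Path


def clear_own_files(scratch, dry_run=True):
    """Delete files under `scratch` owned by the current user.

    Returns the matching file paths. With dry_run=True nothing is deleted.
    """
    uid = os.getuid()
    matched = []
    # Walk bottom-up so directories are visited after their contents
    for dirpath, dirnames, filenames in os.walk(scratch, topdown=False):
        for name in filenames + dirnames:
            p = Path(dirpath) / name
            try:
                info = p.lstat()  # lstat: do not follow symlinks
                if info.st_uid != uid:
                    continue  # someone else's file: leave it alone
                if stat.S_ISDIR(info.st_mode):
                    if not dry_run:
                        p.rmdir()  # only succeeds once the directory is empty
                else:
                    if not dry_run:
                        p.unlink()
                    matched.append(p)
            except OSError:
                continue  # e.g. permission denied or directory not empty
    return matched
```

In a batch script, one might first print the result of clear_own_files(os.environ["TMPDIR"]) and, once satisfied, call it again with dry_run=False.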

CMake TARGET_RUNTIME_DLL_DIRS for CTest

Building Windows shared libraries generally creates DLLs whose directory must be on the PATH environment variable when the executable target is run. Windows exit code -1073741515, which corresponds to hex code 0xc0000135, is returned when the necessary DLLs are not in the program’s working directory or on PATH. This makes CTest tests fail with error code 135.
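
The two numbers are the same 32-bit value read as signed versus unsigned. A short Python sketch of the conversion; the helper name to_unsigned32 is made up for this example, and STATUS_DLL_NOT_FOUND is the NTSTATUS name for this code.

```python
STATUS_DLL_NOT_FOUND = 0xC0000135


def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


exit_code = -1073741515  # as reported by some tools on Windows
print(hex(to_unsigned32(exit_code)))  # prints 0xc0000135
print(to_unsigned32(exit_code) == STATUS_DLL_NOT_FOUND)  # prints True
```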

The CMake generator expression TARGET_RUNTIME_DLL_DIRS along with test property ENVIRONMENT_MODIFICATION can be used to set the Path environment variable for the test, gathering all the directories of the DLLs CMake knows the target needs.

set_property(TEST adder PROPERTY ENVIRONMENT_MODIFICATION PATH=path_list_append:$<TARGET_RUNTIME_DLL_DIRS:main>)

This minimal example CMakeLists.txt uses the properties above to work correctly.

Limit code language standard

C++17 and C++20 standard code is used throughout projects of all sizes, perhaps with limited-feature fallback to older language standards. Some standards certifications require a specific language standard. High reliability and safety-critical projects may require specific language standards. Examples include FACE and MISRA C / C++.

To limit a project to a specific language standard, consider a check like the following in a header used throughout the project. This example limits the language standard to C++14 or earlier by halting the build if a higher standard is detected:

#if __cplusplus >= 201703L
#error "C++14 or earlier required"
#endif

For C code limited to, say, no higher than C99, consider the following in a header used throughout the project, which halts the build if a higher standard is detected:

#if __STDC_VERSION__ >= 201112L
#error "C99 or earlier required"
#endif

Related: MSVC __cplusplus macro flag