Scientific Computing

Stop GNU Octave autoloading all packages

Speed up Octave startup by a factor of 10 to 100 by not autoloading packages. In each Octave config file that exists, comment out the autoload line so that it reads:

# pkg ("load", "auto");

Verify GNU Octave speedup fix: in Octave type

pkg list

If all packages are starred, they are being autoloaded, which greatly slows down Octave startup. The speedup was measured with the command

time octave --eval 'exit'

Before the fix, Octave took 4.904 seconds to load; after the fix, 0.072 seconds, a speedup of about 68x.

CMake INACTIVITY_TIMEOUT not foolproof

The CMake file(DOWNLOAD) command has an INACTIVITY_TIMEOUT option that uses curl CURLOPT_LOW_SPEED_TIME to fail much earlier than the TIMEOUT option would when the connection speed is extremely slow. However, some systems intentionally block downloads in a way that fools the underlying curl library into not tripping INACTIVITY_TIMEOUT. Rather than seek a direct fix in curl for this apparently longstanding problem, we instead implement Internet connectivity detection at CMake configure time.

When a CTest test consists of a CMake script, the test can be set up with the FIXTURES_SETUP property so that it runs just once for a set of tests, rather than repeatedly for each test.
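A minimal sketch of the fixture arrangement, assuming testing is already enabled in the project. The script name CheckInternet.cmake, the fixture name internet_fxt, and the executable targets download_a and download_b are hypothetical names chosen for illustration.

```cmake
# CMakeLists.txt fragment; assumes enable_testing() or include(CTest) was already called.

# Setup test: runs a CMake script in script mode.
# "CheckInternet.cmake" is a hypothetical script that fails when offline.
add_test(NAME InternetCheck
  COMMAND ${CMAKE_COMMAND} -P ${CMAKE_CURRENT_SOURCE_DIR}/CheckInternet.cmake
)
set_property(TEST InternetCheck PROPERTY FIXTURES_SETUP internet_fxt)

# Tests that need the Internet; "download_a" and "download_b" are
# hypothetical executable targets defined elsewhere in the project.
add_test(NAME DownloadA COMMAND download_a)
add_test(NAME DownloadB COMMAND download_b)
set_property(TEST DownloadA DownloadB PROPERTY FIXTURES_REQUIRED internet_fxt)
```

With this arrangement CTest runs InternetCheck once before the tests that require the fixture, and does not run those tests if the setup test fails.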

The key technique here is the URL connectivitycheck.gstatic.com/generate_204 that is used by Android devices to check connectivity. Any similar URL with zero or tiny download size would also be suitable.
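One way such a check might look is sketched below. The function name check_internet, the temporary file name, the 5 second limits, and the http:// scheme are assumptions for illustration, not the original project's code.

```cmake
# Sketch of a connectivity check; usable from a CMakeLists.txt or with "cmake -P".
# check_internet() is a hypothetical helper name.
function(check_internet result_var)
  # scheme "http://" is an assumption; the source gives only host and path
  set(url "http://connectivitycheck.gstatic.com/generate_204")
  set(tmpfile "${CMAKE_CURRENT_BINARY_DIR}/connectivity_check.tmp")

  # TIMEOUT bounds the total time, so a blocked connection cannot hang the configure
  file(DOWNLOAD ${url} ${tmpfile}
    TIMEOUT 5
    INACTIVITY_TIMEOUT 5
    STATUS ret
  )
  file(REMOVE ${tmpfile})

  # STATUS is a two-element list: numeric code (0 means no error) and a message
  list(GET ret 0 code)
  if(code EQUAL 0)
    set(${result_var} true PARENT_SCOPE)
  else()
    set(${result_var} false PARENT_SCOPE)
  endif()
endfunction()

check_internet(have_internet)
if(NOT have_internet)
  message(STATUS "Internet not detected: skipping downloads")
endif()
```

Because the response body is empty, the check costs almost no bandwidth, and the short TIMEOUT keeps the failure case fast.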

CMake GCC build

This CMake project builds GCC and high-level prerequisites such as GMP, which avoids needing to manually download and configure each project. It assumes that fundamental prerequisites such as libc, autotools, make, and CMake are present. The 3-stage compiler bootstrap is enabled to help ensure a working, performant GCC.

Obscuring Matlab code for sharing

“Security through obscurity” alone does not actually confer security. Obscuring code simply increases effort for someone who wishes to use code for unauthorized purposes. For Matlab “.m” code, two methods to partially obscure the underlying “.m” code for locally-run Matlab-based algorithms are:

  • compile Matlab code to executable (OS-specific and Matlab version-specific)
  • convert Matlab code to pcode (OS-agnostic and Matlab version-agnostic)

In general, when critical IP needs to be made available for user data while keeping IP non-public, this is done by providing web services. For example, the IP of Google, Bing, Office 365 is usable via web services, but generally the core code remains non-public.

Matlab pcode() can obscure directories or lists of files to “.p” code that runs on Matlab on any supported OS. Newer versions of Matlab use a more powerful obscuration algorithm that is not backward compatible.

conda install tar.bz2 low bandwidth Internet

For sites with a poor Internet connection, conda install mypkg may fail and will not resume a partially downloaded package. Workaround: use the URL from the error message with curl to download the package, then install it directly.

curl --location --remote-name --continue-at - https://conda.anaconda.org/conda-forge/win-64/<package file>

Resume an interrupted download by reissuing the same command. Install from the tar.bz2 file by

conda install <package stem>.tar.bz2

Using Python in CMake script

CMake FindPython prioritizes location over version number when policy CMP0094 is NEW, which is the default for projects requiring CMake 3.15 or newer. Prior to CMake 3.15, even a specified Python_ROOT could be overridden if another Python had a higher version.
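A minimal sketch of steering the search explicitly, assuming CMake 3.15 or newer. Python_FIND_STRATEGY and Python_ROOT_DIR are documented FindPython variables; MY_PYTHON_ROOT and the project name are hypothetical, used here only for illustration.

```cmake
cmake_minimum_required(VERSION 3.15)
project(PyFind LANGUAGES NONE)

# Take the first Python found in the search order, not the newest version.
set(Python_FIND_STRATEGY LOCATION)

# Optional hint: MY_PYTHON_ROOT is a hypothetical user-supplied variable,
# e.g. cmake -DMY_PYTHON_ROOT=/path/to/python ...
if(DEFINED MY_PYTHON_ROOT)
  set(Python_ROOT_DIR "${MY_PYTHON_ROOT}")
endif()

find_package(Python COMPONENTS Interpreter REQUIRED)
message(STATUS "Python ${Python_VERSION}: ${Python_EXECUTABLE}")
```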

Using the Python interpreter in CMake should generally be via ${Python_EXECUTABLE} instead of Python::Interpreter. CMake provides the imported target Python::Interpreter only when the CMAKE_ROLE is PROJECT. This means that Python::Interpreter is not available when using CTest, which is often when using the Python interpreter is desired. Normally, to use Python interpreter from a CMake script, including in execute_process or add_test, use Python_EXECUTABLE.

Example:

find_package(Python COMPONENTS Interpreter REQUIRED)

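# absolute script path, since tests run from the build directory by default
add_test(NAME Foo COMMAND ${Python_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/myscript.py -arg1 value)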
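The same variable works with execute_process. The sketch below is an illustration rather than code from the original post: it queries the interpreter at configure time and checks the exit status, and the variable names py_major and ret are arbitrary.

```cmake
# CMakeLists.txt fragment
find_package(Python COMPONENTS Interpreter REQUIRED)

# Run the interpreter at configure time and capture its output
execute_process(
  COMMAND ${Python_EXECUTABLE} -c "import sys; print(sys.version_info.major)"
  OUTPUT_VARIABLE py_major
  OUTPUT_STRIP_TRAILING_WHITESPACE
  RESULT_VARIABLE ret
)

# RESULT_VARIABLE holds the exit code, or an error string if the process could not start
if(NOT ret EQUAL 0)
  message(FATAL_ERROR "Python check failed: ${ret}")
endif()

message(STATUS "Python major version: ${py_major}")
```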

Install Python package from CMake

For system Python or other cases where “site-packages” is a non-writable directory, the pip --user option is necessary to install a Python package under the user home directory. However, if using a Python virtualenv (with or without conda), the pip --user option is invalid. Environment variables set when the environment is activated indicate whether a virtualenv is currently in use.

“pip” is important for locally installed packages, since pip via pyproject.toml will automatically use the latest setuptools. This is quite important as too many user systems have too-old setuptools. The project’s pyproject.toml file should contain at least:

pyproject.toml:

[build-system]
requires = ["setuptools>=61.0.0", "wheel"]
build-backend = "setuptools.build_meta"

Detect an Anaconda environment by the existence of the environment variable CONDA_PREFIX.

CMakeLists.txt:

find_package(Python COMPONENTS Interpreter REQUIRED)

# detect virtualenv and set Pip args accordingly
if(DEFINED ENV{VIRTUAL_ENV} OR DEFINED ENV{CONDA_PREFIX})
  set(_pip_args)
else()
  set(_pip_args "--user")
endif()

To install a package (named in CMake variable _pypkg) from PyPI:

# install PyPI Python package using pip
execute_process(COMMAND ${Python_EXECUTABLE} -m pip install ${_pip_args} ${_pypkg})

To install a local package in development mode (live changes):

execute_process(COMMAND ${Python_EXECUTABLE} -m pip install ${_pip_args} -e ${CMAKE_CURRENT_LIST_DIR})
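By default, execute_process does not stop the configure step when the command fails, so a failed pip install can go unnoticed. A hedged sketch of checking the result follows; the variable name _pip_ret is arbitrary, and _pypkg is assumed to be set by the caller as described above.

```cmake
# CMakeLists.txt fragment; _pypkg is assumed to hold the PyPI package name
find_package(Python COMPONENTS Interpreter REQUIRED)

# --user is only valid outside a virtualenv / conda environment
if(DEFINED ENV{VIRTUAL_ENV} OR DEFINED ENV{CONDA_PREFIX})
  set(_pip_args)
else()
  set(_pip_args "--user")
endif()

execute_process(
  COMMAND ${Python_EXECUTABLE} -m pip install ${_pip_args} ${_pypkg}
  RESULT_VARIABLE _pip_ret
)

# stop the configure step if pip reported a failure
if(NOT _pip_ret EQUAL 0)
  message(FATAL_ERROR "pip install ${_pypkg} failed: ${_pip_ret}")
endif()
```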

CMake itself detects a Python virtualenv internally, in Modules/FindPython/Support.cmake.

GitHub Actions strategy array exclude

One of the powerful uses of YAML with CI systems such as GitHub Actions (GHA) is the ability to set up job matrices. Job matrices allow deduplication of jobs with terse specification, encouraging better test platform and parameter coverage. The GHA strategy matrix allows excluding jobs, including via exclusion arrays. The example strategy matrix below excludes the shared build on macOS, due to bugs in a third-party library on macOS.

jobs:

  unix:

    strategy:
      matrix:
        shared: [true, false]
        img: [
          {os: ubuntu-latest, fc: gfortran},
          {os: macos-latest, fc: gfortran-13}
        ]
        exclude:
          - shared: true
            img: {os: macos-latest}

    runs-on: ${{ matrix.img.os }}
    env:
      FC: ${{ matrix.img.fc }}

This strategy results in 3 CI jobs:

  • os=ubuntu-latest, shared=true, fc=gfortran
  • os=macos-latest, shared=false, fc=gfortran-13
  • os=ubuntu-latest, shared=false, fc=gfortran

Build Python from CMake

Getting Python can be tricky for license-restricted users, e.g. in government and corporations. Building Python can be an arcane process without the automation of a high-level build system such as CMake. Python uses Autotools on most platforms except Windows, where Visual Studio is used. The prerequisite libraries need to be built with specific options, and the forums are full of suggestions for tweaking Python build scripts.

The CMake project to build Python elides those issues for Linux/macOS platforms at least. It builds basic requirements of Python including expat, ffi, bzip2, xz, readline, zlib and more.

CMake undefined variable compare

In CMake, undefined variables evaluate as false in simple if(x) statements. For comparison operations like LESS and GREATER, however, undefined variables do not behave like “false”.

# x is undefined: the comparison is false, so this branch is never taken
if(x LESS 1)
  message(FATAL_ERROR "not reached when x is undefined")
endif()

No matter what value is compared to undefined variable “x”, the if() statement will not be true. As in most programming languages, a key best practice for CMake is to ensure variables are defined with a proper default value.
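A small sketch of the difference, runnable with cmake -P. The variable names y and x and the default value 0 are arbitrary choices for illustration.

```cmake
# Save as a script and run with: cmake -P <script file>

unset(y)  # make sure y is undefined

# Neither branch is taken: an undefined variable does not parse as a number
if(y LESS 1)
  message(STATUS "y LESS 1 is true")
endif()
if(y GREATER 1)
  message(STATUS "y GREATER 1 is true")
endif()

# Give the variable a default value before comparing
if(NOT DEFINED x)
  set(x 0)
endif()

if(x LESS 1)
  message(STATUS "x=${x} is less than 1")
endif()
```

Only the last branch prints a message, since x now holds a number and the comparison is meaningful.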