Scientific Computing

CMake packaging with CPack example

CMake has a companion meta-packaging system, CPack, which generates configuration files for numerous packaging systems. These packages distribute code, binary executables and libraries to users without requiring them to compile the project. CPack creates binary packages such as Windows .msi, Linux .deb/.rpm and macOS .dmg. CPack also creates traditional source archives like those generated by GitHub Releases, but with fine-grained control of the contents.

Assuming the PROJECT_BINARY_DIR is “build”, CPack generates build/CPackConfig.cmake for binary packages and build/CPackSourceConfig.cmake for source packages. CPackConfig.cmake is generated according to install() commands in the CMakeLists.txt files of the project.

Note that in general “install()” DESTINATION should always use relative paths. CPack ignores install() items with absolute DESTINATION.

CPackSourceConfig.cmake works the opposite way: it includes everything not excluded by CPACK_SOURCE_IGNORE_FILES, so we make a file cmake/.cpack_ignore with regex patterns excluding non-source files. As a last step, at the end of the main CMakeLists.txt after all install() commands, we include cmake/cpack.cmake with include(cmake/cpack.cmake).

As usual:

cmake -B build
cmake --build build

The distribution .zip / .tar.gz package files under build/package are generated by:

cpack --config build/CPackSourceConfig.cmake

cpack --config build/CPackConfig.cmake

These can be built by the CI system and uploaded for distribution on GitHub Releases, etc. by configuring the .github/workflows/ci.yml accordingly.

Check last error code across OS

In general, programs don’t usually print to console the integer return code from the main procedure. The program may well print some message indicating success or failure, but maybe not. When calling executables from a compiled or scripted language such as Fortran, C, Python or Matlab, it’s often vital to know the value of the integer return code as a signal that the program thought it was successful or not. Further, some program crashes do not emit any console text, and could make the user think the program was successful.

To help eliminate doubt, issue a command to print the last error code to console when working with command line programs. The method to print this integer code depends on the shell. From Terminal / Command Prompt, the return code from the last command is printed by:

  • Unix-like shell: echo $?
  • Windows command prompt: echo %errorlevel%
  • PowerShell: echo $LASTEXITCODE
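When the executable is called from a script rather than typed at a shell, the same integer is available programmatically. The following is a minimal Python sketch using the standard library subprocess module; the helper name run_and_report is hypothetical, and the child process is just a Python one-liner that exits with code 3 without printing anything.

```python
import subprocess
import sys


def run_and_report(cmd: list[str]) -> int:
    """Run a command, print its integer return code, and return that code."""
    ret = subprocess.run(cmd)
    print(f"{cmd[0]} exited with return code {ret.returncode}")
    return ret.returncode


# hypothetical stand-in for a real program: exits silently with code 3
code = run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"])
```

A nonzero code here is the only sign of failure, since the child process emits no console text.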

Skip builds on specific CI systems

Most CI systems will not build if a git commit -m message includes a string like [skip ci]. For example:

git commit -m "update install docs [skip ci]"

Some CI systems have additional custom commit keywords that allow skipping only that CI.

Google Shared Drives benefits and caveats

Google Shared Drive is for groups of people with Gmail addresses (corporate or personal). Public folder or limited-access link sharing of folders and files in Shared Drives is possible, as with personal Google Drive. Moving files is done in bundles of files, not by folder. Google Drive for desktop works for Shared Drives like individual Google Drive for read/write access. Google Shared Drives and personal Google Drive can be synced with rclone, which is especially handy for HPC and cloud computing. If using Windows Subsystem for Linux, keep the files on the native Windows filesystem.

Fortran build systems

Two of the most important aspects of a build system are generation speed and build/rebuild speed. For non-trivial projects in any compiled language, build speed can be a significant pain point. Build speed is the time it takes to first build project targets. Rebuild speed is the time taken on subsequent builds of a target, where most build artifacts can be reused, but the dependencies and sources must be scanned for changes since last build.

Even for medium projects, the seconds lost rebuilding add up to lost productivity for developers. CMake and Meson are among the fastest generators, while SCons and Autotools are among the slowest. CMake and Meson have full support for Ninja. Ninja is generally significantly faster than GNU Make, particularly when only a few out of very many files need to be rebuilt.

Meson is a Python program without external dependencies. It’s easy to run the latest development snapshot or track Git changes as a result.

pip install meson ninja

Since Meson generates Ninja build files, we installed Ninja as well. With Meson, the code build process goes like:

meson setup build

meson compile -C build

CMake has existed since the early 2000s, and has broad usage. Build a CMake project like:

cmake -B build

cmake --build build

Python plot HTML browser

HTML plotting is important for communicating key data to colleagues, the general public and policymakers. HTML plots allow easy sharing of interactive data plots to any web browser. There are numerous HTML plotting methods available from Python. These methods are completely open source and work in any web browser. The HTML file can be shared by several means:

  • email attachment
  • embedded in a webpage as an HTML5 iframe
  • IPython notebook
  • hosted plotting service (Plotly, figshare, et al)

Matplotlib HTMLWriter can plot animated sequences. More powerful HTML plotting capability requires mpld3. Virtually any Matplotlib plot can be converted to HTML for display in any web browser. mpld3 uses HTML and D3.js to animate the plots.

from matplotlib.pyplot import figure
import mpld3

fig = figure()
ax = fig.gca()
ax.plot([1,2,3,4])

mpld3.show(fig)

mpld3.show() converts any Matplotlib plot to HTML and opens the figure in the web browser. mpld3.save_html(fig,'myfig.html') method saves figures to HTML files for archiving or uploading as an iframe to a website, etc. D3.js enabled interactive elements are available from the mpld3 API. Note the template_type='simple' keyword for .save_html(), which can increase robustness across web browsers.


The open source Plotly library can plot completely offline, without Plotly servers. Plotly can make offline Python plots that work in any web browser. Plotly offline plots don’t need IPython/Jupyter, although you can certainly use them as well.

Offline plotting is important as it doesn’t rely on external proprietary services that could prevent future users from making plots. The Plotly API is available for Python, R, Matlab, JavaScript, Scala and a growing number of other programming languages. Plotly examples show the simple syntax. When using plain Python, be sure to use plotly.offline.plot(), as plotly.offline.iplot() will silently fail to do anything!

TRENDnet TEG-S80g vs. TEG-S82g

The TRENDnet TEG-S80g versions 3.0 through 4.1 lack LED speed indicators, with only active (plug-in detection) indicators for each port.

Model/Rev | LED speed indicator | metal jacks | buffer memory (kB) | MAC address table (entries)
TEG-S80g Rev 1.0 | yes | yes | 128 | 4K
TEG-S80g Rev 2.1 | yes | yes | 128 | 8K
TEG-S80g Rev 3.0 | no | no | 256 | 8K
TEG-S80g Rev 4.1 | no | no | 192 | 4K
TEG-S82g Rev 2.0 | no | no | 256 | 8K

The TRENDnet TEG-S80g model has been reliable for me across a decade of use in small instrument networks with challenging physical environments.


MyPy PEP 585, 604 support

MyPy supports PEP 585 and 604, bringing concise Python 3.10 type annotation syntax to earlier Python versions. The new type annotation syntax works in all supported Python versions, if each file using it has at the top:

from __future__ import annotations

Separately, NumPy 1.20 made the long-awaited NumPy type hinting a reality.
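A minimal sketch of NumPy type hinting, assuming NumPy 1.20 or newer so that numpy.typing.ArrayLike is available. The function name rms is a hypothetical example, not part of NumPy.

```python
import numpy as np
from numpy.typing import ArrayLike


def rms(x: ArrayLike) -> float:
    """Root-mean-square of anything NumPy can convert to an array."""
    a = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(a**2)))


# a plain list is a valid ArrayLike
value = rms([3, 4])
```

ArrayLike covers lists, tuples, scalars and ndarrays, so callers need not convert inputs before the call.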

MyPy type check quick start

The benefits of Python static type checking have been discussed at length, and type checking has been widely adopted and funded by major tech companies, especially Dropbox. Python static type checking enhances code quality now and in the future by defining (constraining) variables and functions (methods).

Type enforcement can be done with assert. Type hinting is more concise, flexible and readable than assert, with significantly less performance impact. Type hinting is being continually enhanced in CPython, numerous IDEs and type annotation checkers. With type hinting, the hint is right at the variable name (e.g. in the function declaration), while assert must occur in the code body.
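The contrast between the two approaches can be sketched as follows. The function names scale_assert and scale_hint are hypothetical and chosen only for illustration.

```python
def scale_assert(x, factor):
    # runtime enforcement: checks execute on every call, inside the code body
    assert isinstance(x, float), "x must be float"
    assert isinstance(factor, float), "factor must be float"
    return x * factor


def scale_hint(x: float, factor: float) -> float:
    # static hints: declared at the variable names and checked by a tool such
    # as MyPy before the program runs; not enforced when the code executes
    return x * factor


y = scale_hint(2.0, 3.0)
```

Note that the hinted version accepts any argument at runtime; the constraint is only applied when a type checker is run on the code.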

MyPy is installed and upgraded by:

pip install -U mypy

The MyPy static type checker accepts the following numeric promotions as valid due to duck typing (the first type is accepted where the second is expected):

  • int → float
  • float → complex

Note that str is not equivalent to bytes.
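A minimal sketch of these rules, using hypothetical function names: MyPy is expected to accept the first two calls, while passing a str where bytes is declared would be reported as an error.

```python
def half(x: float) -> float:
    return x / 2


def magnitude(z: complex) -> float:
    return abs(z)


def size(data: bytes) -> int:
    return len(data)


a = half(3)  # int passed where float is declared: accepted
b = magnitude(1.5)  # float passed where complex is declared: accepted

# size("abc") would be flagged by MyPy, since str is not bytes.
# Encoding to bytes first satisfies the declared type.
n = size("abc".encode())
```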

Usage

Add to pyproject.toml:

[tool.mypy]
files = ["src"]

assuming Python package files are under “src/”. Then issue the command:

python -m mypy

Note this command checks the package and not the top-level scripts, which must be manually specified. Configure pyproject.toml to eliminate nuisance errors or otherwise configure mypy.

It takes a little practice to understand the messages. Where multiple types are accepted, for example str and pathlib.Path, use typing.Union or the equivalent PEP 604 syntax Path | str. See the examples below.

Examples

Many times a function argument can handle more than one type. This is handled as follows:

from __future__ import annotations
from pathlib import Path


def reader(fn: Path | str) -> str:
    fn = Path(fn).expanduser()

    txt = fn.read_text()

    return txt

Another case is where lists or tuples are used. The types within can optionally be checked:

from __future__ import annotations
from pathlib import Path


def reader(fn: Path | str) -> tuple[float, float]:
    fn = Path(fn).expanduser()

    txt: list[str] = fn.read_text().split(',')

    latlon = (float(txt[0]), float(txt[1]))

    return latlon

Or perhaps dictionaries, where the types within can optionally be checked:

from __future__ import annotations
from pathlib import Path


def reader(fn: Path | str) -> dict[str, float]:
    fn = Path(fn).expanduser()

    txt: list[str] = fn.read_text().split(',')

    params = {'lat': float(txt[0]),
              'lon': float(txt[1])}

    return params

If many value types are in the dictionary, or possibly some types are not yet supported for type hinting, simply use typing.Any e.g.

dict[str, typing.Any]

The default where no type is declared is typing.Any, which basically means “don’t check this variable at this location in the code”.
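A minimal sketch of a dictionary with mixed value types, where the keys are checked as str and the values are left unchecked. The function name and the dictionary contents are hypothetical.

```python
import typing


def summarize(params: dict[str, typing.Any]) -> str:
    """Format each key and value; the values may be of any type."""
    return ", ".join(f"{k}={v!r}" for k, v in params.items())


# float, str and bool values all pass the check
s = summarize({"lat": 65.1, "name": "station", "ok": True})
```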


As in C++, Python can type hint that a function must not return.

import typing


def hello() -> typing.NoReturn:
    print("hello")

Because hello() returns implicitly after printing, MyPy reports:

error: Implicit return in function which does not return

This is used for functions that always raise an error or always exit, and in general to help ensure control flow is not returned.
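For contrast with hello(), a function that does satisfy typing.NoReturn might look like this sketch. The name fail is hypothetical, and the function always raises SystemExit, so control never returns to the caller.

```python
import typing


def fail(msg: str) -> typing.NoReturn:
    """Always raise, so control flow never returns to the caller."""
    raise SystemExit(msg)
```

A type checker can then treat any code following a call to fail() as unreachable.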