Scientific Computing

Python argparse vs. shell glob

Users (or developers!) may not realize that the shell expands an unquoted glob asterisk * before the program ever sees it. This can surprise those unfamiliar with the behavior, for example when using Python argparse with positional arguments. Suppose a user has a single file to process in a directory and doesn’t want to type the long filename, so they type:

python myScript.py ~/data/*.h5 32

Here we assume myScript.py expects two positional arguments, the first being a filename and the second being an integer. If more than one “*.h5” file later exists in that directory when myScript.py is run, the arguments Python actually receives look like:

python myScript.py ~/data/file1.h5 ~/data/file2.h5 32

This makes argparse print a usage error and exit, because the second filename lands in the position where the integer was expected.
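The failure can be reproduced without a shell by handing argparse the already-expanded argument list. This is a minimal sketch, assuming myScript.py defines its two positional arguments roughly as below. The argument names "filename" and "count" and the helper names are hypothetical, not taken from the real script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Assumed interface of myScript.py: one filename, then one integer.
    p = argparse.ArgumentParser(prog="myScript.py")
    p.add_argument("filename", help="data file to process")
    p.add_argument("count", type=int, help="integer parameter (hypothetical name)")
    return p


def exit_status(argv: list) -> int:
    """Return 0 if argv parses, otherwise the exit status argparse requested."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as e:
        # argparse has already printed its usage message to stderr
        return e.code
    return 0


# the glob matched exactly one file: parses fine
print(exit_status(["data/file1.h5", "32"]))
# the glob matched two files: the second filename is not a valid integer
print(exit_status(["data/file1.h5", "data/file2.h5", "32"]))
```

Quoting the glob on the command line, or declaring the filename argument with nargs="+" if several files are acceptable, avoids the surprise.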

To preview what the shell will expand the glob to, type the command and, before pressing Enter, press these keys (default keybindings in Bash and Zsh at least):

Ctrl+x g

CMake ExternalProject verbose progress with Ninja

CMake ExternalProject works for many types of sub-projects across CMake generators. With the Ninja generator, as an implementation detail, ExternalProject by default doesn’t print progress until each ExternalProject step is finished. For large external projects that take several minutes to download and build, users could be confused into thinking CMake has frozen. To make ExternalProject show live progress as it does with Makefile generators, add the USES_TERMINAL_* true arguments to ExternalProject_Add.

ExternalProject_Add(
  BigProject
  ...
  USES_TERMINAL_DOWNLOAD true
  USES_TERMINAL_UPDATE true
  USES_TERMINAL_PATCH true
  USES_TERMINAL_CONFIGURE true
  USES_TERMINAL_BUILD true
  USES_TERMINAL_INSTALL true
  USES_TERMINAL_TEST true
)

“USES_TERMINAL_* true” forces ExternalProject steps to run sequentially. For large projects this is ordinarily not significant.

Pytest importorskip matlab.engine

PyTest can work with Matlab Engine if the Matlab Engine is set up. Wrap the import in try-except so that a non-functioning Matlab Engine causes the test to be skipped rather than fail.

import pytest


def test_me():
    try:
        mateng = pytest.importorskip("matlab.engine")
    except Exception:  # can also get RuntimeError, let's just catch all
        pytest.skip("Matlab engine not available")

    eng = mateng.start_matlab("-nojvm")
    # test code

Python subprocess tee to screen and variable

Python subprocess can be used to run a long-running program, capturing the output to a variable and printing to the screen simultaneously. This gives the user the comfort that the program is working OK and gives program status messages without waiting for the program to finish.

This example demonstrates the “tee” subprocess behavior.
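A minimal sketch of the technique, assuming the child program writes line-oriented text: read the pipe one line at a time, print each line immediately, and keep a copy. The function name tee_run and the stand-in child command are hypothetical.

```python
import subprocess
import sys


def tee_run(cmd: list) -> str:
    """Run cmd, echo each output line as it arrives, and return all output."""
    lines = []
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            print(line, end="", flush=True)  # live echo to the screen
            lines.append(line)  # copy kept for the caller
    # leaving the "with" block waits for the child, so returncode is set
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return "".join(lines)


# stand-in for a long-running program; -u keeps the child's output unbuffered
child_code = "import time\nfor i in range(3):\n    print('step', i)\n    time.sleep(0.2)\n"
captured = tee_run([sys.executable, "-u", "-c", child_code])
```

Here stderr is merged into stdout to keep the sketch to a single pipe. A program that buffers its own output will still appear in bursts regardless of how the parent reads it.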

Python subprocess multi-line Python script

Python subprocess can run inline multi-line Python code. This is useful for cross-platform demonstrations or for production code that needs to launch a new Python instance.

import subprocess
import sys

# the -u is to ensure unbuffered output so that program prints live
cmd = [sys.executable, "-u", "-c", r"""
import sys
import datetime
import time

for _ in range(5):
    print(datetime.datetime.now())
    time.sleep(0.3)
"""]

subprocess.check_call(cmd)

Matlab batch use stdout

The “matlab -batch” command is useful for running Matlab scripts from the command line. When parsing “stdout” text output from Matlab, especially if only a single line is expected, there may be extraneous licensing text printed by Matlab. For example, with a prerelease the command:

matlab -batch "disp(matlabroot)"

outputs to stdout:

    Prerelease License -- for engineering feedback and testing
	purposes only. Not for sale.

/Applications/MATLAB_R2023b.app

A workaround for this in shell scripts is like:

set -e  # stop on error

r=$(matlab -batch "disp(matlabroot)" | tail -n1)

cd "${r}"
# and so on
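The same workaround could be sketched in Python by keeping only the last line of the captured output. The helper names last_line and matlab_root are hypothetical. Unlike “tail -n1”, this sketch also skips trailing blank lines, which is an assumption about what is wanted.

```python
import subprocess


def last_line(text: str) -> str:
    """Return the last non-empty line of text, without surrounding whitespace."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("no output to parse")
    return lines[-1]


def matlab_root(exe: str = "matlab") -> str:
    """Ask Matlab for matlabroot, ignoring any banner printed before it."""
    ret = subprocess.run(
        [exe, "-batch", "disp(matlabroot)"],
        stdout=subprocess.PIPE,
        text=True,
        check=True,  # raise if Matlab exits with an error
    )
    return last_line(ret.stdout)


# the prerelease output shown above, used as sample input
sample = (
    "    Prerelease License -- for engineering feedback and testing\n"
    "\tpurposes only. Not for sale.\n"
    "\n"
    "/Applications/MATLAB_R2023b.app\n"
)
print(last_line(sample))
```

Calling matlab_root() requires a working Matlab on PATH, so only the parsing helper is exercised here.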

Open file in default program from Terminal

It can be convenient to open a file in its default program without leaving the Terminal. For simplicity, we assume the file is named “file.txt”, but this works with any file type that has an associated default program on the computer.

  • macOS: open file.txt
  • Linux: xdg-open file.txt
  • Windows: start file.txt
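From a Python script, the three commands above could be selected by platform. This is a sketch with a hypothetical function name open_command. It returns the command rather than running it, so the choice can be inspected first.

```python
import sys


def open_command(path: str, platform: str = sys.platform) -> list:
    """Return the command that opens path in the default program."""
    if platform == "darwin":
        return ["open", path]
    if platform == "win32":
        # "start" is built into cmd.exe rather than being a standalone program;
        # the empty string fills the window-title argument that start expects
        return ["cmd", "/c", "start", "", path]
    # assumed fallback for Linux and other freedesktop-style systems
    return ["xdg-open", path]


print(open_command("file.txt"))
```

The result can be passed to subprocess.run(open_command("file.txt"), check=True). On Windows, os.startfile is an alternative that avoids cmd.exe.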

rsync private Git avoid sharing credentials

Recommended: rather than using Rsync, it is more convenient to give the remote host read-only Git access via:


Rsync over SSH allows one to edit and update code without putting credentials on the remote host.

Laptop to remote host:

rsync -r -t -v -z --exclude={build/,.git/} ~/myProg login@host:myProg

  • --exclude={build/,.git/}: exclude syncing of Git information and build/ directory, which wastes time and may fail
  • -z --compress: compress data for transfer
  • -r --recursive: recursively sync directories
  • -t --times: preserve modification times
  • -v --verbose: verbose output
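The “{build/,.git/}” braces and the “~” in the command above are expanded by the shell, not by rsync. If the same sync is launched from Python subprocess with an argument list, no shell is involved, so each exclude and the home directory have to be spelled out. A minimal sketch, with the hypothetical function name rsync_command:

```python
import os


def rsync_command(src: str, dest: str, excludes=("build/", ".git/")) -> list:
    """Build the rsync argument list with one --exclude per pattern."""
    cmd = ["rsync", "-r", "-t", "-v", "-z"]
    cmd += [f"--exclude={pattern}" for pattern in excludes]
    # expand a leading "~" ourselves, since no shell will do it
    cmd += [os.path.expanduser(src), dest]
    return cmd


print(rsync_command("~/myProg", "login@host:myProg"))
```

The list can then be passed to subprocess.check_call.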

CMake ExternalProject and Git filters

Git filters may clash with the CMake ExternalProject update step. The “download” step invokes checkout and the “update” step may stash and invoke the Git filters, causing the build to fail.

There is not a straightforward way to turn off Git filters from within CMake.

Solution: Git pre-commit hook instead of Git filters. Users with Git filters need to disable the filters and preferably change the filters to pre-commit hooks if possible.

Things that did not work

For reference, these did not help override Git filters.

ExternalProject_Add(...
  GIT_REMOTE_UPDATE_STRATEGY "CHECKOUT"
  UPDATE_COMMAND ""
)

ExternalProject_Add_Step(MyProj gitOverride
  DEPENDERS update
  COMMAND git -C <SOURCE_DIR> config --local filter.strip-notebook-output.clean cat
  COMMAND git -C <SOURCE_DIR> config --local --list
  COMMENT "CMake ExternalProject: override git config to strip notebook output"
  LOG true
  INDEPENDENT true
)