Scientific Computing

Find files from the command line

One can very rapidly find files with arbitrary criteria on systems with GNU Findutils. This includes Linux, macOS and Windows Subsystem for Linux.

Install Findutils

Linux normally comes with GNU Findutils already installed. Windows users can get it via Windows Subsystem for Linux. macOS users can install GNU Findutils via Homebrew, which provides the command “gfind” in place of “find”.

A wide range of criteria can be used to rapidly find files. If working on a remote filesystem mounted over SSHFS, we suggest SSHing into that system and running the find command there; it will be orders of magnitude faster.

Most examples use home directory ~ as a starting point just for convenience. Appending 2>/dev/null to the end of a command removes nuisance messages about file permissions. If piping the find command, put 2>/dev/null before the pipe.

Find files with “report” in the filename, case-insensitive:

find ~ -iname "*report*"

Suppose ~/data is a symbolic link to another directory, either on your computer, a USB hard drive or a server. By default, find will not search this resource unless you “resolve” the symbolic link to a directory by putting a final / like ~/data/:

find ~/data/ -iname "*report*"

See the findutils manual on symbolic links, in particular the -H and -L options.
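For scripts that cannot rely on GNU find being installed, the same case-insensitive filename search can be approximated in Python. This is a minimal sketch using only the standard library; the function name find_iname and its follow_links parameter are made up for illustration, with follow_links=True roughly playing the role of find -L.

```python
import fnmatch
import os
from pathlib import Path


def find_iname(root, pattern, follow_links=False):
    """Yield paths under root whose name matches pattern, ignoring case.

    Rough analogue of: find ROOT -iname PATTERN
    follow_links=True descends into symlinked directories, similar to find -L.
    """
    pattern = pattern.lower()
    top = Path(root).expanduser()
    # os.walk skips unreadable directories silently by default (onerror=None),
    # much like appending 2>/dev/null to the find command
    for dirpath, dirnames, filenames in os.walk(top, followlinks=follow_links):
        for name in dirnames + filenames:
            if fnmatch.fnmatchcase(name.lower(), pattern):
                yield Path(dirpath) / name
```

For example, list(find_iname("~", "*report*")) collects the same kind of matches as the first find command above, though it will generally be slower than find on large directory trees.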

Python calling Python via subprocess

On Windows, calling Python sys.executable or console scripts run as .exe may fail with

Fatal Python error: _Py_HashRandomization_Init: failed to get random numbers to initialize Python

The issue arises when environment variables are passed in via the env= argument to Python subprocess and essential existing variables are left out. In general, one should add or overwrite variables on top of the existing OS environment in subprocess calls, as in this example that selects the Clang and Flang compilers. os.environ is a mapping (general form of dict) of environment variables.

import os
import subprocess

# get a copy (a plain dict) of all the current environment variables,
# so the changes below don't leak into this Python process
env = os.environ.copy()

# set these to use Clang and Flang compilers
myvar = {'CC': "clang", 'CXX': "clang++", 'FC': "flang"}

# %% This is important--add / overwrite environment variables for this subprocess only.
env.update(myvar)

subprocess.check_call(['meson', 'setup'], env=env)

Whenever passing environment variables to Python subprocess, generally pass in all the existing environment variables, adding or overwriting the specific variables needed. Otherwise, fundamental environment variables will be missing and the subprocess call generally won’t work. By default env=None, which tells subprocess to let the child process inherit the environment of the current Python process.
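To see that the added variable reaches the child while the parent environment stays untouched, one can launch a second Python as the subprocess. This is only a sketch: DEMO_COMPILER is a made-up variable name chosen to avoid colliding with a real one.

```python
import os
import subprocess
import sys

# hypothetical variable name, used only for this demonstration
extra = {"DEMO_COMPILER": "clang"}

# copy the parent environment, then add / overwrite for the child only
env = os.environ.copy()
env.update(extra)

# the child is a second Python interpreter that prints what it sees
child_code = "import os; print(os.environ.get('DEMO_COMPILER'))"
out = subprocess.check_output([sys.executable, "-c", child_code], env=env, text=True)

child_value = out.strip()
# the parent process was not modified, since we updated a copy
parent_value = os.environ.get("DEMO_COMPILER")
```

Because env started as a full copy of os.environ, the child also keeps PATH and the other variables it needs to start up.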

Windows setx environment variables PowerShell

In Windows, the setx command allows storing environment variables in the Windows registry on a per-user or system basis. A problem arises when it is desired to remove such variables entirely. Setting a blank value does not work.

Example: assume one previously set the environment variable CC to the Intel C compiler like:

setx CC icx

Now the problem is, you’d like the system to fall back to using whatever default C compiler is on the Windows system. The existing variable is visible in PowerShell via:

Get-ChildItem Env:CC

But trying to remove the variable with PowerShell command

Remove-Item Env:CC

is only effective for this PowerShell session. The old value of CC comes back upon opening a new PowerShell.

To permanently remove the “setx” variable, we must remove it from where it’s stored in the Windows registry. For this example of wanting to delete environment variable CC, run from PowerShell:

reg delete "HKCU\Environment" /v CC

Optionally, confirm deletion from a new PowerShell window:

Get-ChildItem Env:CC
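The same registry deletion can be scripted from Python with the standard-library winreg module, which may be handy in setup scripts. This is a minimal sketch, assuming the variable was set per-user (HKCU\Environment, as in the reg delete command above); the function name delete_user_env_var is hypothetical.

```python
import sys


def delete_user_env_var(name):
    """Delete a per-user environment variable stored by setx.

    Returns True if the value was deleted, False if it did not exist.
    Only meaningful on Windows; raises OSError elsewhere.
    """
    if sys.platform != "win32":
        raise OSError("the Windows registry is only available on Windows")

    import winreg  # standard library, Windows only

    # per-user variables live under HKEY_CURRENT_USER\Environment
    with winreg.OpenKey(
        winreg.HKEY_CURRENT_USER, "Environment", 0, winreg.KEY_SET_VALUE
    ) as key:
        try:
            winreg.DeleteValue(key, name)
        except FileNotFoundError:
            return False
    return True
```

Calling delete_user_env_var("CC") corresponds to the reg delete command above. As with reg delete, open a new terminal afterwards to check the result.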

Upgrade octave-signal package

GNU Octave remez() is for Parks-McClellan FIR filter coefficient design in the signal package. This is approximately equivalent to Matlab firpm().

From Octave prompt, install the signal package:

pkg install -verbose -forge signal

For more details, including compilers needed, see the GNU Octave pkg install page.

Dual display notes LaTeX Beamer presentation

Pympress can present Beamer talk slides in dual-screen mode. Practice ahead of time with the actual laptop you will present from.

Install the prereqs (on Debian or Ubuntu, the apt packages listed under Notes below), then install Pympress:

pip install pympress

Show the Beamer presentation in dual screen by running:

pympress talk.pdf

Tap the s key to swap screens.


In LaTeX .tex preamble (top of main .tex file), put this code to make dual-screen PDF work:

\usepackage{pgfpages}
\setbeameroption{show notes on second screen}

Notes

To fix ModuleNotFoundError: No module named 'gi', install missing prereqs:

apt install python3-gi python3-cairo gir1.2-poppler

Be sure to use the system Python, not Anaconda Python or other user Python installs. Remove the pympress and pympress*dist-info directories from ~/anaconda3/lib/python*/site-packages/pympress* or wherever else they might be on $PATH.

Then install pympress with SYSTEM Python:

/usr/bin/pip3 install --user pympress
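When it is unclear which Python is actually being used, a quick check can be run with the interpreter in question. This sketch reports the interpreter path and whether the gi and cairo modules (provided by the python3-gi and python3-cairo packages) can be found; the function name check_pympress_prereqs is made up for illustration.

```python
import importlib.util
import sys


def check_pympress_prereqs(modules=("gi", "cairo")):
    """Return (interpreter path, {module name: importable?}) for this Python."""
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    return sys.executable, status


exe, status = check_pympress_prereqs()
print("Python interpreter:", exe)
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

If the interpreter path points into an Anaconda directory, or gi is reported missing, that matches the ModuleNotFoundError situation described above.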

Matlab / GNU Octave "isinteractive" function

It’s useful to know if the Matlab or GNU Octave GUI is open for a number of use cases, including

  • pause for each group of a large set of plots, but only if a user is there to look at them; otherwise save to disk and close thereafter.
  • increase (or decrease) the verbosity of print statements, or choose whether console output is logged, depending on whether it is batch mode or not.

We don’t use the Matlab batchStartupOptionUsed as it doesn’t detect the -nodesktop case often used for unattended batch processing. Save this code to isinteractive.m for your project.

function isinter = isinteractive()
%% tell if the program is being run interactively or not.

% OCTAVE_VERSION is a builtin only in GNU Octave, so this detects Octave
if exist('OCTAVE_VERSION', 'builtin') ~= 0
  isinter = isguirunning;
else
  % matlab, this test doesn't work for Octave
  % don't use batchStartupOptionUsed as it neglects the "-nodesktop" case
  isinter = usejava('desktop');
end

end

Writing multipage TIFF with Python

An easy, robust way to write multipage TIFF on any platform in Python is imageio.

For all examples below, assume a stack of images in a NumPy ndarray imgs, with dimensions given by imgs.shape:

(Nimg, y, x) for monochrome, or (Nimg, y, x, 3) for RGB color.

ImageIO is a library we have contributed code to and recommend in general for Python image IO.

pip install imageio

import imageio

imageio.mimwrite('myimgs.tiff', imgs)

tifffile

ImageIO uses tifffile internally, so most don’t need to use tifffile directly. To use tifffile directly, install tifffile.py:

pip install tifffile

import tifffile

tifffile.imwrite('myimages.tiff', imgs)

tifffile.imwrite() (named imsave() in older tifffile releases) accepts description and tags arguments and can compress losslessly.

Advanced Python TIFF multi-page writing example: archive/old_tiffile_demo.py.

Read TIFF headers

The de facto TIFF header tags can be read from the command line with Perl Image::ExifTool:

apt install libimage-exiftool-perl

exiftool myfile.tif

Note: tiffinfo doesn’t print tiff tags.

Print all TIFF tags from Python using archive/PrintTiffTags.py

Alternative multipage TIFF method using scikit-image and FreeImage (we recommend imageio or tifffile instead):

from skimage.io._plugins import freeimage_plugin as freeimg

freeimg.write_multipage(imgs,'myimages.tiff')

Due to the large number of image libraries invoked, sometimes scikit-image needs a little tweaking for image I/O:

Windows

If you get:

RuntimeError: Could not find a FreeImage library

Fix by downloading the FreeImage binaries and extracting Dist/x64/FreeImage.dll to the directory found by:

$(python -c "import skimage; print(skimage.__path__[0])")/io/_plugins/

Linux

If you get:

freeimage had a problem: Could not find a FreeImage library in any of...

Fix by:

apt install libfreeimage3


Old Numpy

To fix tifffile error caused by too-old Numpy version:

RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa

Install a newer Numpy version:

pip install numpy

pip install tifffile

Since there is a Numpy .whl binary wheel for ARM, the latest Numpy installs quickly on Raspberry Pi.

Install packages in GNU Octave

GNU Octave can install third-party packages in a friendly way, analogous to the Matlab App Store or how Linux repositories work. Regardless of operating system, Octave can install these extension packages from the Octave command line. Some packages require a compiler or libraries. If package install fails, read the log output to see if installing a system library is required.

Packages are installed at the Octave command prompt, and download automatically. Prereqs are not automatically installed, but messages are given telling which package needs to be installed first. signal is a perfect example of this, given below.

“signal” is a popular Octave package, which brings many functions found in Matlab’s DSP and Communications Toolbox. We’ll see that signal needs other packages first; let’s walk through the Octave signal install. All commands are from Octave command prompt.

Try using a command that requires signal

diric(0.2, 5)

warning: the ‘diric’ function belongs to the signal package from Octave Forge which seems to not be installed in your system.

If I had already installed signal, but forgotten to load it since I started Octave, the warning would have been:

warning: the ‘diric’ function belongs to the signal package from Octave Forge which you have installed but not loaded.

Install signal Octave package from Octave prompt:

pkg install -forge signal

The -forge option indicates to use the Octave-Forge repo for automatic download.

This returns a warning saying that control is required.

Install control:

pkg install -forge control

This requires the gfortran compiler. Once control is installed, install signal again:

pkg install -forge signal

Use Octave packages in a Matlab-compatible way simply by enclosing pkg load in a try ... end block:

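function d = twicediric(x, n)
  % in Matlab "pkg" doesn't exist, so the resulting error is silently caught
  try
    pkg load signal
  end

  % diric() takes two arguments, as in diric(0.2, 5) above
  d = 2*diric(x, n);
end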

If the package isn’t installed, the message on reaching the missing function tells which package is needed.

Matplotlib datetime examples

Matplotlib can make many types of plots with a time axis. However, sometimes it takes an additional command or two to make the date/time axis work right in Matplotlib.

As seen in xarray_matplotlib.py, for imshow() datetime64 extent, you need to do something like:

from datetime import datetime

import matplotlib.dates as mdates
import numpy as np

# whatever your time vector is
t = np.arange('2010-05-04T12:05','2010-05-04T12:06', dtype='datetime64[s]').astype(datetime)

mt = mdates.date2num((t[0],t[-1]))

ax.imshow(im, extent=[mt[0],mt[1], y[0],y[-1]], aspect='auto')

In most Matplotlib plotting functions numpy.datetime64 is a first-class citizen, but not yet for imshow() perhaps due to the limits-oriented nature of imshow(). We use pcolormesh() instead of imshow() for datetime-oriented raster data.
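A self-contained version of the imshow() approach might look like the following sketch. The image data, the y vector and the origin="lower" choice are made up for illustration, and the Agg backend is selected only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display needed

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# synthetic data (assumption): 60 one-second time steps by 20 rows
t = np.arange("2010-05-04T12:05", "2010-05-04T12:06", dtype="datetime64[s]")
y = np.linspace(0.0, 100.0, 20)
im = np.random.default_rng(0).random((y.size, t.size))

# imshow() wants plain numbers for extent, so convert the two end points
mt = mdates.date2num(t[[0, -1]])

fig, ax = plt.subplots()
# origin="lower" is an assumption here so that y increases upward
ax.imshow(im, extent=[mt[0], mt[1], y[0], y[-1]], aspect="auto", origin="lower")
ax.xaxis_date()  # label the x-axis ticks as dates rather than floats
fig.autofmt_xdate()  # rotate the date labels so they don't overlap
```

Follow with fig.savefig() or, using an interactive backend, plt.show() to view the result.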

Matlab OpenCV C++/CUDA/MEX support

OpenCV can be used from Matlab via the vision Support Packages, as integrated by the Mathworks using MEX.

Select “Computer Vision System Toolbox OpenCV Interface by MathWorks Computer Vision System Toolbox Team” and install.

Note: the examples require particular compilers depending on Matlab version and operating system.

The examples directory contains Computer Vision Toolbox examples from the Mathworks. To find the Matlab OpenCV example directory, run in Matlab:

fileparts(which('mexOpenCV'))

The examples below assume you’re starting from this directory. See the README.txt in each directory for compilation details. Some examples require a CUDA GPU.

Foreground Detector: build example

cd ForegroundDetector

mexOpenCV backgroundSubtractorOCV.cpp

If the example fails to compile due to compiler mismatch, follow the instructions given in the error message.

Run the OpenCV Matlab demo:

testBackgroundSubtractor

You will see a Video Player window pop up with cars driving by, with the cars detected outlined in white rectangles.