Scientific Computing

Profile and speed up Matlab functions

November 4, 2020

Code profiling and micro-benchmarks can be useful to discover hot spots in code that might be targets for improving efficiency and overall execution time. Matlab has long had an effective profiler built in from the factory.

Commercial and free software often prioritizes quality and correctness over speed. In some cases the Matlab factory functions are simply not optimized due to development time constraints. Yair Altman has expertise with the undocumented scripts inside Matlab that many Matlab users rely on. He has written a three-part series on profiling and optimizing Matlab factory code, which I find worthwhile reading even for general Matlab code optimization.

Speeding up builtin Matlab functions:

macOS Homebrew without sudo / admin

November 2, 2020

Homebrew is a popular framework for quickly installing development tools on macOS, including Gfortran. macOS cloud services such as MacInCloud may say they provide Homebrew, but you may not be able to install packages without a sudo/admin account. To install Homebrew without sudo/admin in the user home directory for cloud or physical hardware, follow these steps:

mkdir ~/homebrew && curl -L https://github.com/Homebrew/brew/tarball/master | tar xz --strip 1 -C ~/homebrew

The brew command is then at ~/homebrew/bin/brew; add ~/homebrew/bin to PATH to use it as "brew". Without sudo / admin access, as is typical with a managed (less expensive) cloud macOS plan, Homebrew may build GCC from source if the installed Xcode is not the appropriate version, which can take tens of minutes. This may occur when running

brew install gcc

Reference: Homebrew without sudo script

Matlab Compiler tbb.dll oneTBB

October 31, 2020

Running compiled M-scripts built with Matlab Compiler deploytool may result in warning messages about tbb.dll. Updating the Threading Building Blocks library may help. Also consider updating the Matlab version if applicable.

Download and install the latest Intel oneTBB. If using a command prompt, make oneTBB available by running the tbbvars command. Then recompile with Matlab deploytool.

Default MyPy type hint checks

October 30, 2020

MyPy is a Python type annotation checker. MyPy recursively checks all files in a Python project by typing:

mypy ~/myproject

We typically use the PyPI MyPy instead of conda to have the most recent MyPy version:

pip install mypy

It’s often useful to have a per-project MyPy configuration file to avoid excessive command line options. Configure MyPy defaults in pyproject.toml:

[tool.mypy]
files = ["src", "Examples"]
ignore_missing_imports = true
strict_optional = false
show_column_numbers = true

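Here “files” is set appropriately for the project. Setting a per-project “files” list is strongly recommended to ensure files aren’t missed in the type check. One can also make a system-wide ~/.mypy.ini. MyPy uses only the first configuration file it finds and does not merge them, so a per-project pyproject.toml with a [tool.mypy] section takes precedence over ~/.mypy.ini.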
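To illustrate what the strict_optional = false setting above changes, here is a minimal sketch. The function find_compiler is a hypothetical helper made up for this example. shutil.which() returns None when the program is not found, so the declared return type "str" is not quite accurate. With strict_optional = false MyPy treats None as compatible with every type and accepts this code; with strict optional checking enabled, MyPy would report an incompatible return value.

```python
import shutil


def find_compiler(name: str) -> str:
    """Hypothetical helper: return the full path of a compiler by name."""
    # shutil.which() returns Optional[str], that is, None if "name" is not on PATH.
    # With strict_optional = false, MyPy accepts returning it where "str" is declared.
    return shutil.which(name)


# At runtime the result can still be None, whatever the annotation says.
print(find_compiler("surely-not-a-real-compiler-xyz"))
```

Relaxing the check is convenient for older code, at the price of MyPy no longer flagging unchecked None values like this one.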

Enhanced mypy usage

Build executable from Python

October 28, 2020

We often use executables from Python with data transfer via:

stdin/stdout (small transfers, less than a megabyte)
temporary files (arbitrarily large data)

This provides a language-agnostic interface that we can use from other scripted languages like Matlab or Julia, future-proofing efforts at the price of some runtime efficiency due to the out-of-core data transfer.
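As a minimal sketch of the two transfer methods, the example below uses the Python interpreter itself as a stand-in for a compiled executable, so that it runs anywhere. The names via_pipes and via_tempfile and the upper-casing "program" are assumptions for illustration only; in real use the command list would name the executable instead.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in "executable": reads stdin, writes the upper-cased text to stdout.
PIPE_CHILD = "import sys; sys.stdout.write(sys.stdin.read().upper())"

# Stand-in "executable": reads the file named by argument 1, writes to the file named by argument 2.
FILE_CHILD = (
    "import sys, pathlib; "
    "pathlib.Path(sys.argv[2]).write_text(pathlib.Path(sys.argv[1]).read_text().upper())"
)


def via_pipes(text: str) -> str:
    """Small transfers: send data on stdin, read the reply from stdout."""
    ret = subprocess.run(
        [sys.executable, "-c", PIPE_CHILD],
        input=text, capture_output=True, text=True, check=True,
    )
    return ret.stdout


def via_tempfile(text: str) -> str:
    """Large transfers: exchange data through files in a temporary directory."""
    with tempfile.TemporaryDirectory() as d:
        infile = Path(d) / "in.txt"
        outfile = Path(d) / "out.txt"
        infile.write_text(text)
        subprocess.run(
            [sys.executable, "-c", FILE_CHILD, str(infile), str(outfile)],
            check=True,
        )
        return outfile.read_text()
```

Both functions raise subprocess.CalledProcessError if the child exits with a non-zero code, because check=True is set.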

Here is a snippet we use to compile a single C code executable from Python (from GeoRINEX program):

"""
save this in say src/mypkg/build.py
then from the code that needs the output executable, say "myprog.bin":

from pathlib import Path
from .build import build
src = Path("src/myprog.c")
exe = src.with_suffix(".bin")
...
if not exe.is_file():
    build(src)
# code that passes data via stdin/stdout and/or files using subprocess.run()

"""
from typing import Optional
import subprocess
import shutil
from pathlib import Path


def build(src: Path, cc: Optional[str] = None) -> int:
    """

    Parameters
    ----------

    src: pathlib.Path
        path to single C source file.
    cc: str, optional
        desired compiler path or name

    Returns
    -------

    ret: int
        return code from compiler (0 is success)
    """
    if cc:
        return do_compile(cc, src)

    compilers = ["cc", "gcc", "clang", "icx", "clang-cl"]
    ret = 1
    for cc in compilers:
        if shutil.which(cc):
            ret = do_compile(cc, src)
            if ret == 0:
                break

    return ret


def do_compile(cc: str, src: Path) -> int:
    if not src.is_file():
        raise FileNotFoundError(src)

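    if cc.endswith("cl"):  # MSVC-like command line (e.g. clang-cl)
        # /Fe: names the output executable, same name as in the other branch
        cmd = [cc, str(src), f"/Fe:{src.with_suffix('.bin')}"]
    else:
        cmd = [cc, str(src), "-O2", f"-o{src.with_suffix('.bin')}"]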

    ret = subprocess.run(cmd).returncode

    return ret

LTE cellular smartwatch RF signal performance

October 27, 2020

LTE smartwatches may get up to 90% of the communications range of a smartphone. Most providers have turned off (or are turning off) 2G and 3G, so coverage may change over time. Generally Bluetooth headsets can be used with LTE smartwatches, which helps call quality as with any phone.

Mobile devices, including smartwatches, may switch frequency bands and network modes when going from idle to a phone call or data usage. The status bar indicator shows the network mode in use:

E: 2G EDGE, the oldest digital network mode still in use, very slow.
H: 3G HSPA/HSPA+, good enough for basic web browsing and email.
4G: really good 3G. Carriers may call their upgraded 3G networks 4G.
LTE: actually using 4G LTE.
5G: not necessarily faster than LTE when in NSA (non-standalone mode), but can be much faster in SA (standalone mode).

The signal bars may jump up/down a few notches when going from idle to active due to the phone band switching e.g. 700 MHz vs. 1900 MHz. Apps like Network Cell Info can help reveal these behaviors.

CI Python package install

October 24, 2020

For continuous integration, it’s important to test the traditional package install

pip install .

along with the more commonly used in situ pip development mode

pip install -e .

Otherwise, the Python package may depend on files not included via the MANIFEST.in file, and so fail for most end users, who don’t use the “pip install -e” option.

A particular failure this catches on Windows CI is a MANIFEST.in line like “graft path/to/”, where the trailing “/” fails on Windows only.

Get CPU count from Python

October 22, 2020

Python psutil gives access to numerous system parameters, including CPU count. We recommend using a recent version of psutil to cover more computing platforms.

import psutil
Ncpu = psutil.cpu_count(logical=False)

usually gives the physical CPU count, and returns None if the count cannot be determined.

psutil uses both Python and compiled C code to determine CPU count; it’s not just a simple Python script.
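Since psutil.cpu_count() may return None, and psutil may not be installed, a defensive wrapper can be handy. This is a minimal sketch, not part of psutil; the name physical_cpu_count and the fallback to os.cpu_count() (which counts logical CPUs) are assumptions for illustration.

```python
import os

try:
    import psutil  # third-party package, assumed installed via "pip install psutil"
except ImportError:
    psutil = None


def physical_cpu_count() -> int:
    """Best-effort physical CPU count, never less than 1."""
    n = None
    if psutil is not None:
        # None is returned on platforms where the count cannot be determined
        n = psutil.cpu_count(logical=False)
    if not n:
        # fallback: logical CPU count, which may overcount physical cores
        n = os.cpu_count()
    return n or 1


print(physical_cpu_count())
```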

Related: Matlab CPU count

CMake RESOURCE_LOCK vs. RUN_SERIAL advantages

October 21, 2020

CMake (via CTest) can run tests in parallel. Some tests must not run in parallel, for example tests using MPI that use lots of CPU cores, tests that use a lot of RAM, or tests that must access a common file or hardware device. We have found that when fixtures are used, RUN_SERIAL makes whole groups of tests run sequentially, instead of just the individual tests. That is, all the FIXTURES_SETUP tests run, then all the FIXTURES_REQUIRED tests that have RUN_SERIAL. This is not necessarily desired, because we had consuming tests that didn’t have to wait for all the fixtures to be set up.

We found that RESOURCE_LOCK does not suffer from this issue, and allows the proper test dependencies and the expected parallelism.

CMake Resource Groups are orthogonal to Resource Locks, and are much more complicated to use. There may be some systems that would benefit from Groups, but many can just use the simple Locks.

For simplicity, this example omits the necessary add_test() and just shows the properties.

The test has an MPI-using quick setup “Quick1” and then a long test “Long1” also using MPI. Finally, we have a quick Python script “Script1” checking the output.

In the real setup, we have Quick1, Quick2, … QuickN and so on. When we used RUN_SERIAL, we had to wait for ALL Quick* before Long* would start. With RESOURCE_LOCK the tests intermingle, making better use of CPU particularly on large CPU count systems, and with lots of tests.

The name “cpu_mpi” is arbitrary like the other names.

set_property(TEST Quick1 PROPERTY RESOURCE_LOCK cpu_mpi)
set_property(TEST Quick1 PROPERTY FIXTURES_SETUP Q1)

set_property(TEST Long1 PROPERTY RESOURCE_LOCK cpu_mpi)
set_property(TEST Long1 PROPERTY FIXTURES_REQUIRED Q1)
set_property(TEST Long1 PROPERTY FIXTURES_SETUP L1)

set_property(TEST Script1 PROPERTY FIXTURES_REQUIRED L1)

Use Python subprocess from Matlab

October 20, 2020

Matlab system() lacks features needed for blackbox interfacing with executables, such as a stdin pipe. matlab-stdlib subprocess_run() can exchange data via stdin, stdout, and stderr pipes, and set cwd, environment variables, and more, using Java ProcessBuilder. Matlab can also call Python subprocess.
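For reference, here is a minimal Python sketch of the subprocess features mentioned above: working directory, environment variables, and separate stdout / stderr capture. The child process is the Python interpreter standing in for an arbitrary executable; the names run_child and MY_VAR are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in child: prints an environment variable and its working directory
# to stdout, and a message to stderr.
CHILD = (
    "import os, sys; "
    "print(os.environ['MY_VAR']); "
    "print(os.getcwd()); "
    "print('warning text', file=sys.stderr)"
)


def run_child(workdir: str) -> subprocess.CompletedProcess:
    """Run the child in "workdir" with one extra environment variable."""
    env = {**os.environ, "MY_VAR": "hello"}  # copy of the parent environment plus one variable
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        cwd=workdir, env=env, capture_output=True, text=True, check=True,
    )


with tempfile.TemporaryDirectory() as d:
    ret = run_child(d)
    print(ret.stdout, ret.stderr)
```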