Profile and speed up Matlab functions

Code profiling and micro-benchmarks can be useful to discover hot spots in code that might be targets for improving efficiency and overall execution time. Matlab has long had an effective profiler built in from the factory.

Commercial and free software often prioritizes quality and correctness over speed as part of a good software product. I have made numerous code contributions to projects such as Numpy and Scipy with regard to performance and robustness. Matlab is no exception: in the early days of my engineering career I used interpolation functions a lot, and realized that by taking advantage of certain characteristics of my problem, I could remove unneeded checks and steps inside Matlab’s factory interpolation functions. Moreover, there are cases where the Matlab factory functions are simply not optimized due to development time constraints.

Yair Altman has expertise with the undocumented scripts inside Matlab that many Matlab users rely on. He has written a three-part series on profiling and optimizing Matlab factory code, which I find to be worthwhile reading even for general Matlab code optimization.

Speeding up builtin Matlab functions

Skip builds on specific CI systems

Most CI systems will not build if a git commit message includes a string like [skip ci]. For example:

git commit -m "update install docs [skip ci]"

Some CI systems also recognize a skip string that applies only to that particular CI, which is useful when trying out a different CI system. For example, when testing out a new AppVeyor setup, skip building on Travis-CI by including

[skip travis]

in the git commit message.

Skip strings

For each CI system, include the corresponding string in the Git commit message given to git commit -m.

Generic skip only

At the time of this writing, these CI systems are known to support only the generic skip strings:

  • CircleCI supports [skip ci]

  • GitLab supports:

    • commit messages containing [skip ci] or [ci skip]
    • git push option:
    git push -o ci.skip
    
  • Bitbucket supports [skip ci]
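
As a rough mental model of how such skip strings are matched, here is a small Python sketch. The function name wants_skip and the marker table are hypothetical and limited to the strings mentioned above; each CI service applies its own matching rules, so this is only an illustration and not how any particular service is implemented.

```python
# Hypothetical marker table, limited to the strings mentioned in this post.
GENERIC_MARKERS = ("[skip ci]",)
SERVICE_MARKERS = {
    "travis": ("[skip travis]",),
    "gitlab": ("[ci skip]",),
}


def wants_skip(message: str, service: str = "") -> bool:
    """Return True if the commit message asks the given CI service to skip the build."""
    markers = GENERIC_MARKERS + SERVICE_MARKERS.get(service, ())
    return any(marker in message for marker in markers)


# The generic string skips everywhere; the Travis-specific string only skips Travis-CI.
print(wants_skip("update install docs [skip ci]"))
print(wants_skip("fix AppVeyor config [skip travis]", "travis"))
print(wants_skip("fix AppVeyor config [skip travis]", "appveyor"))
```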

Travis-CI Quick Start

NOTE: many projects have transitioned to GitHub Actions due to Travis-CI build quotas for open source projects.


This Travis-CI quick start assumes Python on GitHub for simplicity. We assume the project has a minimal setup.cfg.

Create a .travis.yml like:

language: python
group: travis_latest

git:
  depth: 25
  quiet: true

python:
- 3.9
- 3.7

os:
- linux

matrix:
  include:
  - os: linux
    name: PEP8 MyPy Coverage
    python: 3.9
    install: pip install .[tests,cov]
    script:
    - flake8
    - mypy
    after_success:
    - pytest --cov
    - coveralls

install: pip install .[tests]

script: pytest

pytest --cov assumes code coverage settings in .coveragerc.

Now, upon every git push, the Travis-CI dashboard will make the badge red/green depending on whether your tests passed.

  • flake8 tests PEP8 compliance. Try autopep8 -i -r . to quickly fix most minor issues. This setup assumes a file .flake8
  • mypy checks static type annotation and assumes “files” is defined in .mypy.ini

Travis-CI output

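The key point is that Travis CI considers only whether the exit code of each script command is == 0 or != 0 for pass/fail. Text written to stderr does not by itself fail the build:

exit code    outcome
== 0         PASS
!= 0         FAIL

A third case is an ERROR in setup, for example when a prerequisite is missing from setup.cfg.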
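
A minimal Python sketch of the pass/fail rule, assuming a Python interpreter stands in for the commands a CI script step would run. The helper name outcome is hypothetical. It illustrates that the process exit code decides the result, regardless of whether anything was printed to stderr.

```python
import subprocess
import sys

# Child process that writes a warning to stderr but exits with code 0.
noisy_ok = [sys.executable, "-c",
            "import sys; print('warning', file=sys.stderr); sys.exit(0)"]
# Child process that prints nothing but exits with a non-zero code.
quiet_fail = [sys.executable, "-c", "import sys; sys.exit(3)"]


def outcome(cmd) -> str:
    """Map the exit code of a command to PASS or FAIL."""
    ret = subprocess.run(cmd, stderr=subprocess.DEVNULL).returncode
    return "PASS" if ret == 0 else "FAIL"


print(outcome(noisy_ok))    # stderr output, exit code 0
print(outcome(quiet_fail))  # no output, exit code 3
```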

MacOS Homebrew without sudo

Homebrew is a popular framework for quickly installing development tools on MacOS, including Gfortran. MacOS cloud services such as MacInCloud may say they provide Homebrew, but you may not be able to install packages without a sudo/admin account. To install Homebrew without sudo/admin in the user home directory for cloud or physical Mac hardware, follow these steps:

mkdir homebrew && curl -L https://github.com/Homebrew/brew/tarball/master | tar xz --strip 1 -C homebrew

Edit ~/.profile to include

export PATH=$HOME/homebrew/bin:$PATH

Open a new Terminal to use Homebrew.

May need to compile GCC

If one truly doesn’t have sudo/admin access, as is typical with a managed (less-expensive) cloud MacOS plan, and if the installed XCode is not the appropriate version, Homebrew may compile GCC from source, which can take about an hour on a modern quad-core Mac. This may occur when doing

brew install gcc

Reference

Detect CI via environment variable

CI systems typically set the environment variable CI as a de facto standard for easy CI detection. Here are details of several popular CI services:

Detect CI inside Python

Pytest handles conditional tests well. This allows one to test Matplotlib on their local computer, while skipping those plotting tests on CI.

import os
import pytest

CI = os.environ.get('CI') in ('True', 'true')


@pytest.mark.skipif(CI, reason="no plots for CI")
def test_myfun():
    from matplotlib.pyplot import figure, show

    ...

Create fake X11 display on CI

Default MyPy type hint checks with .mypy.ini

MyPy is a Python type annotation checker. MyPy recursively checks all files in a Python project by typing:

mypy ~/myproject

Install

We typically install MyPy from PyPI instead of conda, to have the most recent MyPy version:

pip install mypy

Example config

It’s often useful to have a per-project MyPy configuration file to avoid excessive command line options. Put a file .mypy.ini in each Python project containing:

[mypy]
files = src/, scripts/

ignore_missing_imports = True
strict_optional = False
allow_redefinition = True
show_error_context = False
show_column_numbers = True

Here “files” is set appropriately for your project. Setting a per-project “files” value is strongly recommended to ensure files aren’t missed in the type check. One can also make a system-wide ~/.mypy.ini, which is overridden by the per-project .mypy.ini.
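
To give a feel for what two of these options permit, here is a short Python sketch. The function names scale and to_ints are hypothetical, and the comments reflect my understanding of the MyPy documentation for strict_optional and allow_redefinition; the code runs the same either way, the options only change what the type checker reports.

```python
from typing import List


def scale(values: List[float], factor: float = None) -> List[float]:
    # With strict_optional = False, None is accepted where "float" is annotated.
    if factor is None:
        factor = 1.0
    return [v * factor for v in values]


def to_ints(items: List[str]) -> List[int]:
    # With allow_redefinition = True, "items" may be re-bound to a value of a
    # different type within the same block.
    items = [int(s) for s in items]
    return items


print(scale([1.0, 2.0], 3.0))
print(to_ints(["1", "2"]))
```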

Isolate problem packages

Sometimes an external package adds type hinting that is incompatible with the current MyPy release. This is relatively rare, but was the case with Xarray. To ignore a package’s type hinting, add the following to .mypy.ini, where we assume we want to ignore xarray type checking.

[mypy-xarray]
follow_imports = skip

Notes

enhanced mypy usage http://calpaterson.com/mypy-hints.html

CMake version recommendations and install

CMake ≥ 3.17 is strongly recommended in general for more robust and easier syntax.
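
One way to check the installed CMake against this recommendation is a short Python sketch like the following. The function names and MIN_VERSION are hypothetical helpers for illustration; the sketch assumes the first line of cmake --version output looks like "cmake version 3.19.2".

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

MIN_VERSION = (3, 17)  # recommendation from this post


def parse_cmake_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the version numbers from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(g) for g in match.groups() if g is not None)


def cmake_version() -> Optional[Tuple[int, ...]]:
    """Return the version of the cmake found on PATH, or None if absent."""
    exe = shutil.which("cmake")
    if exe is None:
        return None
    ret = subprocess.run([exe, "--version"], stdout=subprocess.PIPE,
                         universal_newlines=True)
    return parse_cmake_version(ret.stdout)


version = cmake_version()
if version is None:
    print("cmake not found on PATH")
elif version < MIN_VERSION:
    print("cmake", version, "is older than recommended", MIN_VERSION)
else:
    print("cmake", version, "meets the recommendation")
```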

Compile/Install CMake

This will get you the latest release of CMake. For Linux and Mac, admin/sudo is NOT required.

  • Linux: Download/build/install CMake 3 using cmake_setup
  • Mac: brew install cmake or use .dmg
  • Windows: use Windows win64-x64 installer

pip

There is an unofficial PyPI CMake package:

pip install cmake

This often is the quickest cross-platform way to get the current CMake release.

CMake major versions

  • 3.19: added support for ISPC language, JSON parsing, FindPython/find_package version ranges
  • 3.18: CMake Profiler cmake -B build --profiling-output=perf.json --profiling-format=google-trace
  • 3.17: Ninja Multi-Config generator, --debug-find to see what find_package() is trying to do, eliminate Windows “sh.exe is on PATH” error. Recognize that Ninja 1.10.0 correctly works with Fortran.
  • 3.16: Precompiled headers, unity builds, many advanced project features
  • 3.15: CMAKE_GENERATOR environment variable works like -G option, enhanced Python interpreter finding, add “cmake --install” command
  • 3.14: check_fortran_source_runs(), better FetchContent
  • 3.13: ctest --progress, better Matlab compiler support, lots of new linking options, fixes to Fortran submodule bugs, cmake -B build incantation, target_sources() with absolute path
  • 3.12: transitive library specification (out of same directory), full Fortran Submodule support
  • 3.11: specify targets initially w/o sources, FetchContent

Deprecated

These versions of CMake have been deprecated. Setting cmake_policy to accommodate these old versions emits a deprecation warning from CMake.

  • 3.10: added Fortran Flang (LLVM) compiler, extensive MPI features added
  • 3.9: further C# and Cuda support originally added in CMake 3.8.
  • 3.8: Initial Cuda support
  • 3.7: comparing ≤ ≥ , initial Fortran submodule support
  • 3.6: better OpenBLAS support
  • 3.5: Enhanced FindBoost target with auto Boost prereqs
  • 3.4: Limit CPU usage when using ctest -j parallel tests
  • 3.3: List operations such as IN_LIST

Build executable from Python

We often use executables from Python with data transfer via:

  • stdin/stdout (small transfers, less than a megabyte)
  • temporary files (arbitrarily large data)

This provides a language-agnostic interface that we can use from other scripted languages like Matlab or Julia, future-proofing our efforts at the price of some runtime efficiency due to the out-of-core data transfer.
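
The stdin/stdout approach can be sketched as follows. This is an illustration under stated assumptions: a Python interpreter stands in for the compiled executable, and the names child and run_sum are hypothetical. The child reads whitespace-separated numbers from stdin and writes their sum to stdout.

```python
import subprocess
import sys

# Stand-in for a compiled executable: sums the numbers it reads from stdin.
child = [sys.executable, "-c",
         "import sys; print(sum(float(x) for x in sys.stdin.read().split()))"]


def run_sum(values) -> float:
    """Send values to the child via stdin and parse the result from stdout."""
    text = " ".join(str(v) for v in values)
    ret = subprocess.run(child, input=text, stdout=subprocess.PIPE,
                         universal_newlines=True, check=True)
    return float(ret.stdout)


print(run_sum([1.5, 2.5, 3.0]))
```

For larger data, the same pattern applies with the values written to a temporary file and the filename passed as a command line argument instead.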

Here is a snippet we use to compile a single C code executable from Python (from the GeoRINEX program):

"""
save this in say src/mypkg/build.py
then from the code that needs the output executable, say "myprog.bin":

from pathlib import Path
from .build import build
exe = Path("src/myprog.bin")  # build() writes next to the source file
...
if not exe.is_file():
    build(Path("src/myprog.c"))
# code that passes data via stdin/stdout and/or files using subprocess.run()

"""
import subprocess
import shutil
from pathlib import Path
from typing import Optional


def build(src: Path, cc: Optional[str] = None) -> int:
    """

    Parameters
    ----------

    src: pathlib.Path
        path to single C source file.
    cc: str, optional
        desired compiler path or name

    Returns
    -------

    ret: int
        return code from compiler (0 is success)
    """
    if cc:
        return do_compile(cc, src)

    compilers = ["cc", "gcc", "clang", "icc", "icl", "cl", "clang-cl"]
    ret = 1
    for cc in compilers:
        if shutil.which(cc):
            ret = do_compile(cc, src)
            if ret == 0:
                break

    return ret


def do_compile(cc: str, src: Path) -> int:
    if not src.is_file():
        raise FileNotFoundError(src)

    if cc.endswith("cl"):  # msvc-like
        cmd = [cc, str(src), f"/Fe:{src.parent}"]
    else:
        cmd = [cc, str(src), "-O2", f"-o{src.with_suffix('.bin')}"]

    ret = subprocess.run(cmd).returncode

    return ret

Get user home directory in Matlab on Windows

The tilde “~” character is used by most terminal shells as shorthand for the user home directory. The home directory is typically a safer place to write files than the system root directory, so ~ gives a convenient way to refer to an absolute path that is generic across systems. Many programming languages require explicitly expanding the tilde, including Python, C++, Fortran and Matlab. GNU Octave understands that ~ is the user’s home directory on any operating system, even Windows. Matlab does not consistently understand ~ as the user home directory, particularly on Windows.
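
For comparison, this is how the explicit expansion looks in Python with the standard library's pathlib. The path ~/data/myfile.h5 is a hypothetical example used only for illustration.

```python
from pathlib import Path

# "~/data/myfile.h5" is a hypothetical path used only for illustration.
path = Path("~/data/myfile.h5").expanduser()

print(path)                # absolute path under the user home directory
print(path.is_absolute())
```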

Matlab Fix

We created expanduser.m that works on Linux, Mac and Windows.

expanduser('~')

returns the absolute path of the user home directory, for example:

  • Linux: /home/username
  • MacOS: /Users/username
  • Windows: C:\Users\Username

Test python setup.py install on CI

For continuous integration, it’s important to test the traditional package install

pip install .

along with the more commonly used in situ pip development mode

pip install -e .

Otherwise, the Python package install may depend on files not included in the MANIFEST.in file and fail for most end users, who don’t use the “pip install -e” option.