Using Intel oneAPI and MKL with CMake

There can be substantial speed boosts from Intel compilers. Intel oneAPI gives advanced debuggers and performance measurements. Intel oneMKL can give a significant speed boost even to non-Intel compilers for certain math operations.

On any OS, and particularly on Windows, we generally use the CMake Ninja generator. The Visual Studio backend often causes additional difficulties, so unless you know you need Visual Studio, we recommend making Ninja the default CMake generator by setting the environment variable:

CMAKE_GENERATOR=Ninja

or choose Ninja for a single configure by:

cmake -G Ninja

Specify the environment variables CC, CXX, FC to indicate desired compilers.

export FC=ifort CC=icx CXX=icpx

cmake -B build

cmake --build build

For Windows, the procedure is similar:

set FC=ifort
set CC=icx
set CXX=icx

cmake -B build

cmake --build build
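If the configure step reports that a compiler cannot be found, the Intel environment is often not loaded in the current shell. The following is a minimal sketch for Linux or macOS shells that reports which of the requested compilers are missing from PATH. It is not part of CMake or oneAPI, and the function name missing_compilers is hypothetical.

```shell
# Hypothetical helper: print each requested compiler name that is not on PATH.
missing_compilers() {
  for compiler in "$@"; do
    # "command -v" succeeds only if the shell can resolve the name
    command -v "$compiler" > /dev/null 2>&1 || echo "$compiler"
  done
}

# Usage sketch: check the compilers chosen above before running "cmake -B build".
# The fallback names are the ones used in this article.
missing_compilers "${FC:-ifort}" "${CC:-icx}" "${CXX:-icpx}"
```

No output means every name was found. Any name printed suggests loading the Intel environment first.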

Intel oneMKL can be used with any compiler, e.g. ifort or gfortran. Here is an example CMakeLists.txt using the stock FindLAPACK.cmake that ships with CMake:

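cmake_minimum_required(VERSION 3.18)  # LAPACK::LAPACK imported target needs CMake >= 3.18
project(MKLtest Fortran)

# BLA_VENDOR allows selecting parallel or sequential, 32- or 64-bit integers
# This example is sequential MKL with 32-bit integers (LP64)

set(BLA_VENDOR Intel10_64lp_seq)
find_package(LAPACK REQUIRED)

add_executable(mytest main.f90)
target_link_libraries(mytest PRIVATE LAPACK::LAPACK)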

We have created an easier to use FindLAPACK.cmake that handles MKL and non-MKL LAPACK.


To see the compiler commands CMake is issuing, use

cmake --build build -v

To check these link flags against the recommended ones, refer to the Intel Link Line Advisor.


Get runtime confirmation that MKL is being used via MKL_VERBOSE.

  • Linux / MacOS:

    MKL_VERBOSE=1 ./mytest
    
  • Windows

    set MKL_VERBOSE=1
    mytest.exe
    

That gives verbose text output upon use of MKL functions. That runtime option does slow down MKL performance, so normally we don’t use it.
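For scripted checks, the verbose output can be counted instead of read. This is a minimal sketch, assuming each MKL verbose line starts with the text MKL_VERBOSE, which is worth confirming against your MKL version. The function name count_mkl_verbose_lines is hypothetical.

```shell
# Hypothetical helper: count lines on standard input that start with "MKL_VERBOSE".
# Assumption: MKL prefixes each verbose line with that text.
count_mkl_verbose_lines() {
  # grep -c prints the count; "|| true" keeps a zero count from being an error
  grep -c '^MKL_VERBOSE' || true
}

# Usage sketch (mytest is the example program built above):
#   MKL_VERBOSE=1 ./mytest | count_mkl_verbose_lines
```

A count of zero suggests that no MKL function was called, for example because a different LAPACK was linked.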

Intel MPI on Windows

On Windows, the Intel C, C++ and Fortran compilers present Visual Studio-like command line options. The correct version of Visual Studio must be installed on Windows for Intel compilers to work.

The free Intel oneAPI with HPC toolkit includes the Intel MPI library, which provides mpiexec needed to run MPI programs and MPI compiler wrappers.

Loading Intel compiler environment

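Most users use the Intel oneAPI command prompt. Alternatively, run the oneAPI “setvars.bat” script to enable the Intel compilers for each session (older Parallel Studio releases used “compilervars.bat”). “psxevars.bat” is not appropriate for this setup. For convenience, make a batch script like ~/intel.bat containing the following, where “call” lets the script continue to the “set” lines after setvars.bat finishes:

call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"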

set FC=ifort
set CC=icx
set CXX=icx

Intel MPI on Windows is only for Intel compiler

Unlike for Linux Intel MPI, Windows Intel MPI is only for the Intel C, C++ and Fortran compilers and Visual Studio.

Notes

Although not often needed, a separate username can be used for Windows Intel MPI jobs by running from Command Prompt:

runas /user:username cmd

Environment variables are not passed to the new window, so it may be necessary to run the Intel setvars.bat script again. It’s possible to register the user credential in the Windows registry.

Five free C C++ Fortran compiler families

Five modern, currently-supported compiler families are free-to-use for C, C++ and Fortran.

GCC has broad support of modern standards on a very wide range of computing platforms. GCC’s downside in some cases can be slower runtime performance than compilers having less broad language and platform support.

Compiler   Language standard
gcc        C17
g++        C++20
gfortran   Fortran 2018

Intel oneAPI compilers are free to use for any user. The Intel performance libraries like MKL, IPP, and TBB are available at no cost.

Compiler   Language standard
icx        C17
icpx       C++20
ifort      Fortran 2018

LLVM Clang and Flang have significant industry support, including from Nvidia, and are known for high performance.

Compiler   Language standard
clang      C17
clang++    C++20
flang      Fortran 2018 (coming ~2022)

Nvidia HPC SDK is free to use. A key feature of the HPC SDK compilers is intrinsic support for CUDA Fortran.

Compiler   Language standard
nvc        C11
nvc++      C++17
nvfortran  Fortran 2003

IBM XL compilers are currently for POWER CPUs only e.g. ppc64le. IBM XL compilers do not work with a typical x86-based computer. If you have a $3000 Raptor IBM POWER9 desktop, then IBM XL may be for you. IBM is working on LLVM upgrade for its compilers.

The IBM XL compilers are high-performance compilers that have a free community edition. IBM XL Fortran has wide support for Fortran 2008. However, the XL compilers have bugs in newer language support, so be sure to check with another compiler on the IBM system like GCC if a bug is suspected.

Compiler   Language standard
xlc        C11
xlc++      C++14
xlf        Fortran 2008

IBM Fortran 2008 reference

Stop shell script on exception

Shell scripts can stop upon exception rather than handling each one manually. This can significantly reduce logical clutter in a script. Stop a shell script on error for Unix / Linux / MacOS shell by setting near the top of the .sh script file:

set -o errexit

This is a human-readable equivalent to set -e. This works for commonly used Unix shells including Bash and Zsh.

Another useful option is to stop the script if any undefined variable is used:

set -o nounset
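The effect of both options can be seen by running tiny child scripts and checking whether they get past a failure. This is a minimal sketch, assuming a POSIX sh is available on PATH. The command “false” stands in for any failing command, and the variable name not_defined is hypothetical and assumed not to be set in the environment.

```shell
# Each child shell runs a tiny script; "false" stands in for a failing command.

# Without errexit the script continues past the failure and prints "reached".
without_errexit=$(sh -c 'false; echo reached')

# With errexit the script stops at the failure, so nothing is printed.
with_errexit=$(sh -c 'set -o errexit; false; echo reached' || true)

# With nounset, expanding the never-assigned variable "not_defined" stops the script.
if sh -c 'set -o nounset; echo "$not_defined"; echo reached' > /dev/null 2>&1; then
  nounset_result="continued"
else
  nounset_result="stopped"
fi

echo "without errexit: [${without_errexit}]"
echo "with errexit:    [${with_errexit}]"
echo "with nounset:    ${nounset_result}"
```

Only the first child script prints “reached”. The other two stop at the failing line.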

Stop executing a Powershell script upon exception by adding near the top of the .ps1 Powershell script:

$ErrorActionPreference = 'Stop'

Install Windows Subsystem for Linux

Ubuntu LTS releases among other Linux distros are available on the Microsoft Windows Store or command prompt. Install WSL2 by:

wsl --install Ubuntu

WSL images can be switched between WSL1 and WSL2, but for most purposes WSL2 is generally preferred. WSL can use X11 GUI with programs like Spyder.

Check whether a distro is on WSL1 or WSL2 from PowerShell or Command Prompt:

wsl --list --verbose

The result will be like:

  NAME      STATE           VERSION
* Ubuntu    Running         2

Convert an existing WSL1 distro to WSL2 from PowerShell:

wsl --set-version Ubuntu 2

Install, list, and switch the default Linux distro used by bash on Windows from Command Prompt:

wslconfig

configure

Limit the amount of RAM WSL2 can use by editing Windows file ~/.wslconfig to include:

[wsl2]
swap=0GB
memory=4GB  # arbitrary, set to less than your total computer physical RAM to help avoid using Windows swap

A WSL default that is confusing and slows down WSL program-finding is stuffing Windows PATH into WSL PATH. We normally disable Windows PATH injection into WSL, because it also breaks library finding in build systems like CMake. Additionally, we enable filesystem metadata, as weird permission errors can occur, even causing CMake to fail to configure simple projects.

Each Linux distro has its own /etc/wsl.conf. We typically include in our /etc/wsl.conf:

[automount]
enabled = true
options = "metadata"

[interop]
enabled=false
appendWindowsPath=false
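After restarting WSL, one way to confirm that Windows directories are no longer injected is to count PATH entries under /mnt/. This is a minimal sketch, assuming the default automount root /mnt/ (a custom root would need a different pattern). The function name count_windows_path_entries is hypothetical.

```shell
# Hypothetical helper: count entries of a PATH-style string that live under /mnt/,
# where WSL mounts Windows drives by default.
count_windows_path_entries() {
  # split on ":" into one entry per line, then count the matching lines;
  # "|| true" keeps a zero count from being treated as an error
  printf '%s\n' "$1" | tr ':' '\n' | grep -c '^/mnt/' || true
}

# Usage sketch: inspect the PATH of the current shell
count_windows_path_entries "$PATH"
```

A result of 0 inside WSL suggests the appendWindowsPath=false setting has taken effect.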

The Windows file ~/.wslconfig sets parameters for all Linux distros, versus the per-distro /etc/wsl.conf discussed above. To avoid the use of Linux swap and excessive memory thrashing we include in Windows ~/.wslconfig:

[wsl2]
swap=0GB

Run Ubuntu apps from Windows Command Prompt or PowerShell:

wsl ls -l

Run a Windows program from the Ubuntu terminal (this needs WSL interop left enabled, that is, not [interop] enabled=false in /etc/wsl.conf):

/mnt/c/Windows/System32/notepad.exe

Note that capitalization matters and .exe must be at the end.


If necessary to reinstall Ubuntu, copy off your Linux user files as the next step deletes them. From Command Prompt:

Ubuntu clean

Ubuntu


Homebrew binary bottle download

Downloading binary Homebrew bottles without installing Homebrew can be useful to check the location of the bottle contents. This is useful when developing Meson native-files or CMake Find*.cmake modules.

Homebrew distributes bottles from GitHub Packages. For example, HDF5 binary “bottle” may be inspected by:

tar --list -f <filename>

No Homebrew install is necessary for inspection. Using the libraries and binaries is best done by installing Homebrew.
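When writing a Find*.cmake module, the library paths are usually the part of the listing that matters. This is a minimal sketch that filters the archive listing for entries under a lib/ directory. It assumes the bottle file has already been downloaded and that libraries sit under a lib/ directory inside it. The function name list_bottle_libs is hypothetical.

```shell
# Hypothetical helper: list archive entries whose path contains a lib/ directory.
# $1 is the path to an already-downloaded bottle (a tar archive).
list_bottle_libs() {
  # "tar -tf" is the short form of "tar --list -f";
  # "|| true" avoids an error status when nothing matches
  tar -tf "$1" | grep '/lib/' || true
}

# Usage sketch:
#   list_bottle_libs <filename>
```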

GNU Octave font size fix

If GNU Octave plot fonts are too small or the plot lines are too thin, the first of these methods is typically adequate. The others are for reference.

Octave has multiple graphics “toolkits” or “backends”:

  • GNUplot (old, not many features)
  • QT (best support and graphics)

Octave will try to use QT before GNUplot.

If you don’t want the graphical IDE, start Octave with octave --no-gui instead of octave-cli. Using octave-cli disables the QT backend.

Default plot settings are for qt backend, NOT gnuplot! I avoid setting font sizes in the program itself. Other people have different resolutions and PPI, and putting your particular computer display tweaks in your program code may make your plots look awful on other computers.

The “correct” way to scale plot fonts is thus to change your system defaults. Add this to ~/.octaverc instead of ~/Documents/MATLAB/startup.m so that you don’t disturb Matlab’s plotting defaults.

set(0, "defaulttextfontsize", 24)  % title
set(0, "defaultaxesfontsize", 16)  % axes labels

set(0, "defaulttextfontname", "Courier")
set(0, "defaultaxesfontname", "Courier")

set(0, "defaultlinelinewidth", 2)

Adjust 16 to produce the most appealing text labels in:

  • axes tick labels
  • legend key
  • title text

defaultline is the root category for lines, so defaultlinelinewidth is not a typo.


The alternative methods below are not normally needed, but are for reference. PPI adjustments: find your PPI by Internet search or spec sheet. Octave’s PPI estimate is:

get(0, 'screenpixelsperinch')

If Octave’s PPI estimate is too small, this is probably why your plot text is too small: Octave thinks your resolution is much less than it really is.

If still a font size problem, try changing system DPI scaling. On Ubuntu, try Gnome Tweak Tool → Fonts → Scaling Factor.

Octave GUI settings:

  • → General → Interface → Icon Size: large
  • → Editor Styles → Octave: default
  • → Terminal → Font Size

You can also try changing the graphics toolkit. Usually QT is the best, most modern, as it’s QT5 (most likely).

Octave graphics toolkits available:

available_graphics_toolkits()

gnuplot qt

Active graphics toolkit:

graphics_toolkit()

qt

Select graphics toolkit to see if font sizes are better.

graphics_toolkit('gnuplot')
figure()
plot([1,2,3,4])
title('hi there')

Be sure to open a new figure when trying different graphics toolkits.


GNU Octave default settings docs


Related: GNU Octave set defaults

Keep program running after disconnect

Screen is a terminal multiplexer program available for Linux, MacOS, BSD and similar. Screen allows programs to continue running after a remote user disconnects. After logging off normally, reconnect to the remote session by listing the sessions and reattaching:

screen -list

screen -r <id>

If the remote connection was lost unintentionally, before detaching from screen, the session may still be marked as attached and “screen -r” may refuse to reconnect. In that case, you may need the “-x” option:

screen -x <id>

A downside of screen is the difficulty scrolling back in history. Although screen is a mature project, development is still ongoing.

Some prefer tmux, another terminal multiplexer, over screen. Another option is using nohup.
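For a single non-interactive job, nohup can be enough. This is a minimal sketch in which “sleep 3” stands in for the real long-running program and the log file location comes from mktemp. Both are placeholders to replace with your own command and path.

```shell
# "sleep 3" stands in for a long-running program; its output goes to a log file.
log_file=$(mktemp)

# nohup makes the command ignore the hangup signal sent when the session ends;
# "&" puts it in the background so the shell is free for logout.
nohup sleep 3 > "$log_file" 2>&1 &
job_pid=$!

echo "started background job ${job_pid}, logging to ${log_file}"
```

Unlike screen or tmux, nohup gives no way to reattach to the program, so progress has to be followed through the log file.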

Avoid overriding CMake default install prefix

CMake FetchContent is useful to incorporate subprojects at configure time. FetchContent subproject cache variables can override the top-level project cache, which can be confusing. A particular instance we’ve found problematic is overriding the CMake default install prefix. A top level project desiring to install to the CMake default location will get a surprising result if the child project overrides this.

# DON'T DO THIS

if(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)
  # will not take effect without FORCE
  set(CMAKE_INSTALL_PREFIX ${PROJECT_BINARY_DIR} CACHE PATH "Install top-level directory" FORCE)
endif()

Instead, we recommend that projects, including those intended to be consumed via FetchContent, use CMakePresets.json to set the default install directory:

{
  "version": 3,

  "configurePresets": [
    {
      "name": "default",
      "binaryDir": "${sourceDir}/build",
      "installDir": "${sourceDir}/build"
    }
  ]
}

CMake FetchContent vs. ExternalProject

Making multiple software projects work together is readily done by the build system (Meson subproject, CMake ExternalProject, or CMake FetchContent) instead of Git submodule or monorepo.

Meson subproject and CMake ExternalProject keep project namespaces separate. Meson subproject and CMake FetchContent download and configure all projects at configure time. CMake FetchContent commingles the CMake project namespaces. FetchContent can be easier to use than ExternalProject if you control both software projects' CMake scripts. If you don’t control the “child” project, it may be better to use ExternalProject instead of FetchContent.

For these examples, suppose we have a top-level project “parent” and a “child” project containing a library that is desired in parent. Suppose the child project can be built standalone (by itself) but also may be used directly from other CMake projects.

project                       CMAKE_SOURCE_DIR                    CMAKE_BINARY_DIR                          PROJECT_SOURCE_DIR
parent                        ~/foo                               ~/foo/build                               ~/foo
child: standalone             ~/bar                               ~/bar/build                               ~/bar
child: CMake ExternalProject  ~/foo/build/child-prefix/src/child  ~/foo/build/child-prefix/src/child-build  ~/foo/build/child-prefix/src/child
child: CMake FetchContent     ~/foo                               ~/foo/build                               ~/foo/build/_deps/child-src

FetchContent

FetchContent populates content from the other project at configure time. FetchContent populates the “child” project with default values from the “parent” project. Variables set in the “child” project generally do not affect the “parent” project unless specifically used from the “parent” project.

From “parent” project CMakeLists.txt:

cmake_minimum_required(VERSION 3.14)
project(parent Fortran)

include(FetchContent)
FetchContent_Declare(child
  GIT_REPOSITORY https://github.invalid/username/child.git
  GIT_TAG develop   # it's much better to use a specific Git revision or Git tag for reproducibility
)

FetchContent_MakeAvailable(child)

# your program
add_executable(myprog main.f90)
target_link_libraries(myprog mylib)  # mylib is from "child"

  • FetchContent_MakeAvailable(): makes the “child” code configure, populating variables and targets as if it were part of the “parent” CMake project.

Suppose the “child” project CMakeLists.txt contains:

project(child LANGUAGES Fortran)

add_library(mylib mylib.f90)
target_include_directories(mylib INTERFACE ${CMAKE_CURRENT_BINARY_DIR}/include)
set_target_properties(mylib PROPERTIES
  Fortran_MODULE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/include)

The child project CMAKE_BINARY_DIR and CMAKE_SOURCE_DIR will be those of parent project. That is, if the parent project is in ~/foo and the build directory is ~/foo/build, then the child project in ~/childcode called by FetchContent will also have CMAKE_SOURCE_DIR of ~/foo and CMAKE_BINARY_DIR of ~/foo/build. So be careful in the child project when using such variables that may be defined by parent projects. This is why projects that aren’t specifically designed to work together may be better joined by ExternalProject. A typical technique within the child project that can operate standalone is to refer to CMAKE_CURRENT_SOURCE_DIR instead of CMAKE_SOURCE_DIR as the latter will break when used from FetchContent.

IMPORTANT: When using if() clauses to determine execution of FetchContent, ensure that the FetchContent stanzas are executed each time CMake is run. Otherwise, the FetchContent targets may fail to be available or may have missing target properties on CMake rebuild.

ExternalProject

ExternalProject populates content from the other project at build time. This means the other project’s libraries are not visible until the parent project is built. Since ExternalProject does not combine the project namespaces, ExternalProject may be necessary if you don’t control the other projects.

ExternalProject will not activate without the add_dependencies() statement. Upon cmake --build of the parent project, ExternalProject downloads, configures and builds.

From “parent” project CMakeLists.txt:

cmake_minimum_required(VERSION 3.20)  # CONFIGURE_HANDLED_BY_BUILD needs CMake >= 3.20
project(parent LANGUAGES Fortran)

include(ExternalProject)

set(mylist "a;b;c")
# passing a list to external project is best done via CMAKE_CACHE_ARGS
# CMAKE_ARGS doesn't work correctly for lists

set_directory_properties(PROPERTIES EP_UPDATE_DISCONNECTED true)
# don't repeatedly build ExternalProjects.
# directory property scope: CMAKE_CURRENT_SOURCE_DIR and its subdirectories

set(child_ROOT ${PROJECT_BINARY_DIR}/child)

ExternalProject_Add(CHILD
  GIT_REPOSITORY https://github.com/scivision/cmake-externalproject
  GIT_TAG develop  # it's much better to use a specific Git revision or Git tag for reproducibility
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX:PATH=${child_ROOT}
  CMAKE_CACHE_ARGS -Dmyvar:STRING=${mylist}   # need variable type e.g. STRING for this
  CONFIGURE_HANDLED_BY_BUILD ON
  BUILD_BYPRODUCTS ${child_ROOT}/lib/${CMAKE_STATIC_LIBRARY_PREFIX}timestwo${CMAKE_STATIC_LIBRARY_SUFFIX}  # must match IMPORTED_LOCATION below
)

file(MAKE_DIRECTORY ${child_ROOT}/include)  # avoid race condition

add_library(timestwo STATIC IMPORTED GLOBAL)
set_target_properties(timestwo PROPERTIES
  IMPORTED_LOCATION ${child_ROOT}/lib/${CMAKE_STATIC_LIBRARY_PREFIX}timestwo${CMAKE_STATIC_LIBRARY_SUFFIX}
  INTERFACE_INCLUDE_DIRECTORIES ${child_ROOT}/include)

add_executable(test_timestwo test_timestwo.f90)  # your program
add_dependencies(test_timestwo CHILD)  # externalproject won't download without this
target_link_libraries(test_timestwo PRIVATE timestwo)

  • add_dependencies(): makes ExternalProject always update and build first.
  • CONFIGURE_HANDLED_BY_BUILD ON: tells CMake not to reconfigure each build, unless the build system requests configure.
  • BUILD_BYPRODUCTS: necessary for Ninja to not complain about missing targets. Note how we can’t use BINARY_DIR since it’s populated by ExternalProject_Get_Property().

The imported library timestwo is used in the “parent” project just like any other library.


“child” project CMakeLists.txt includes:

project(child Fortran)

add_library(timestwo STATIC timestwo.f90)
set_target_properties(timestwo PROPERTIES
  Fortran_MODULE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/include)

Configure “child” Fortran_MODULE_DIRECTORY so that it’s not necessary for “parent” to introspect “child” directory structure.

We have created live ExternalProject examples:

CMake can detect if a project is “top level”, that is, NOT used via FetchContent, by checking PROJECT_IS_TOP_LEVEL.

cmake_minimum_required(VERSION 3.21)
project(child)

if(PROJECT_IS_TOP_LEVEL)
  message(STATUS "${PROJECT_NAME} directly building, not FetchContent")
endif()

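Note that the PARENT_DIRECTORY directory property and the PROJECT_IS_TOP_LEVEL variable are NOT useful for detecting if the “child” is being used as an ExternalProject, because ExternalProject configures the child in a separate CMake run in which the child is itself the top-level project.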


  • target_link_directories() is generally NOT preferred because library name collisions can occur, particularly with system libraries.

Reference: CMake staff comparison of multiple project with CMake