Scientific Computing

Clang -Wunsafe-buffer-usage tips

The Clang C++ flag -Wunsafe-buffer-usage enables a heuristic that can catch potentially unsafe buffer access. However, this flag is known to emit warnings that are unavoidable, such as when accessing elements of argv beyond argv[0], even via encapsulation such as std::span.

This flag could be used by having a human (or suitably trained AI) occasionally review the warnings. For example, in CMake:

option(warn_dev "Enable warnings that may have false positives" OFF)

if(warn_dev)
  add_compile_options("$<$<COMPILE_LANG_AND_ID:CXX,AppleClang,Clang,IntelLLVM>:-Wunsafe-buffer-usage>")
endif()
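
As a rough illustration of the kind of code this flag reports, the sketch below contrasts raw pointer indexing with access through a standard container. The function names sum_raw and sum_array are hypothetical, and the exact diagnostics depend on the Clang version, so the comments about warnings are an expectation rather than a guarantee.

```cpp
#include <array>
#include <cstddef>

// Raw pointer plus length: subscripting the pointer is the pattern
// that -Wunsafe-buffer-usage is expected to report.
int sum_raw(const int* data, std::size_t n) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += data[i];  // likely flagged as unsafe buffer access
  }
  return total;
}

// Container-based version: the size travels with the data and the
// range-based loop involves no raw pointer arithmetic in user code.
template <std::size_t N>
int sum_array(const std::array<int, N>& data) {
  int total = 0;
  for (const int v : data) {
    total += v;
  }
  return total;
}
```

Both functions return the same sum for the same data; only the first relies on the caller passing a correct length.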

argv general issues

General issues with argv are discussed in C++ proposal P3474R0 std::arguments. An LLVM issue proposed an interim solution roughly like the following, but at the time of writing, this still triggers a warning with -Wunsafe-buffer-usage.

#include <cstddef>
#include <string>

#if __has_include(<span>)
#include <span>
#endif
#if defined(__cpp_lib_span)
#  if __cpp_lib_span >= 202311L
// std::span::at() is available
#    define HAVE_SPAN_AT
#  endif
#endif

int main(int argc, char* argv[]) {

#ifdef HAVE_SPAN_AT
  // wrap argv so that element access is bounds-checked
  const std::span<char*> ARGS(argv, static_cast<std::size_t>(argc));
#endif

  int n = 1000;

  if(argc > 1) {
    n = std::stoi(
#ifdef HAVE_SPAN_AT
      ARGS.at(1)
#else
      argv[1]
#endif
    );
  }

  return 0;
}

MSVC __cplusplus macro

The __cplusplus macro indicates the version of the C++ standard that the compiler claims to implement given the current compiler flags. Some later language standard features like __has_include are available even with earlier language standard settings, which is a great convenience. C++ projects regularly use the __cplusplus macro to conditionally compile code based on the C++ standard version implemented by the compiler in use. This allows adding new optional features while still working with older compilers that do not support them.

Surprisingly, Visual Studio MSVC defines __cplusplus as 199711L by default, which is the C++98 standard. Visual Studio 2017 15.7 added the flag /Zc:__cplusplus to define __cplusplus as the correct value like other compilers.

The Intel oneAPI 2023.1 release adds the MSVC flag /Zc:__cplusplus by default. To see the note, scroll down to the text “oneAPI 2023.1, Compiler Release 2023.1 New in this release” and click the down caret.

Added /Zc:__cplusplus as a default option during host compilation with MSVC.

In CMake, add this flag as needed by checking the MSVC compiler version.

if(CMAKE_CXX_COMPILER_ID STREQUAL "MSVC" AND CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 19.14)
  # MSVC has __cplusplus = 199711L by default, which is C++98!
  # oneAPI since 2023.1 sets __cplusplus to the true value with MSVC by auto-setting this flag.
  add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:/Zc:__cplusplus>")
endif()

Meson build system adds this flag automatically.

Matlab system stdin pipe

Matlab or GNU Octave can call programs and handle arbitrarily large and complex inputs and outputs via stdin, stderr, and stdout command line pipes, as in matlab-stdlib subprocess_run, which works for Matlab or GNU Octave across operating systems. It uses a Java interface via the Matlab external language interface.

stdlib.subprocess_run() overcomes limitations of the factory system() function and works like Python subprocess. “stdout” and “stderr” are returned from stdlib.subprocess_run() separately, and “stdin” can be passed as a string.

stdlib.subprocess_run() can be faster than using temporary files. stdlib.subprocess_run() helps avoid filesystem clashes when running many external processes in parallel or asynchronously.

Across programming languages, calling an external program with pipes avoids the need to write additional code that directly interfaces memory with Fortran or C/C++, by instead using a file-based or pipe-based API for data streaming.


Reference: Python or Java pass stdin from Matlab to executable.

HDF5 / NetCDF4 in GNU Octave

Open data file formats such as HDF5 and NetCDF4 are an excellent way to share and store archival data across computing platforms and software languages. Numerical software such as Matlab, GNU Octave, Python, and many more support these data file formats.

The syntax in the code examples below is exactly the same for Matlab and GNU Octave. Omit the pkg load and pkg install statements in Matlab.

HDF5

HDF5 files in GNU Octave are accessed via hdf5oct in a similar fashion to Matlab.

From the Octave prompt, install the package:

pkg install -forge hdf5oct

Octave program that writes an array to an HDF5 file “example.h5” dataset “/m”:

pkg load hdf5oct

fn = 'example.h5';

h5create (fn, '/m', [3 3]);
h5write (fn, '/m', magic (3));

Observe that the file “example.h5” has been created. If the HDF5 command line tools are installed, the contents can be printed from the system Terminal:

h5ls -v example.h5

In Octave or Matlab, the HDF5 file can be read to an array:

x = h5read (fn, '/m')
8   1   6
3   5   7
4   9   2

NetCDF4

NetCDF4 files in GNU Octave are accessed via the Octave netcdf package. Install the package from the Octave prompt:

pkg install -forge netcdf

Write an array to a NetCDF4 file “example.nc” dataset “m”:

pkg load netcdf

fn = 'example.nc';

nccreate (fn, 'm', "Dimensions", {"x", 3, "y", 3});
% must include dimensions or a scalar dataset will be created

ncwrite (fn, 'm', magic (3));

Read the NetCDF4 file “example.nc” to an array:

x = ncread (fn, 'm')

Reference:

  • oct-hdf5 package: Octave low-level access to HDF5 files.

Eliminate old C-style casts in C++

The C++ named casts such as static_cast, dynamic_cast, and reinterpret_cast are preferred in C++ Core Guideline ES.49 over ambiguous old C-style casts. C++ named casts can help provide type safety by making the intention of the cast explicit / readable. C++ compilers can detect and warn about improper or unsafe casts when using named casts. C-style cast mistakes are more difficult to detect by humans or automated tools.

static_cast is used for conversions between compatible types, such as converting an int to a float or a pointer to a base class to a pointer to a derived class. Another common static_cast use case is interfacing with C functions such as Windows API functions that require specific types less common in pure C++ code.

int a = 10;
float b = static_cast<float>(a);

reinterpret_cast is used for low-level reinterpreting of bit patterns. It casts a type to a completely different type. This cast is not type safe and should be used with caution to avoid undefined behavior. reinterpret_cast is commonly used in low-level programming, such as interfacing with hardware or converting between pointers and integers.

int a = 10;
char* b = reinterpret_cast<char*>(&a);

dynamic_cast is used for safe downcasting of pointers or references to classes in a class hierarchy. It performs a runtime or RTTI check to help ensure that the cast is valid. dynamic_cast is used when you need to convert a pointer or reference to a base class to a pointer or reference to a derived class. static_cast is more common and faster than dynamic_cast, but dynamic_cast is safer when downcasting in a class hierarchy.
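
A minimal sketch of dynamic_cast in both pointer and reference form follows. The Shape, Circle, and Square classes and the two helper functions are hypothetical, made up only to illustrate the runtime check.

```cpp
#include <typeinfo>  // std::bad_cast

// Hypothetical class hierarchy for illustration.
struct Shape {
  // At least one virtual function makes the type polymorphic,
  // which dynamic_cast requires for downcasting.
  virtual ~Shape() = default;
};

struct Circle : Shape {
  double radius = 1.0;
};

struct Square : Shape {
  double side = 2.0;
};

// Pointer form: a failed cast yields nullptr.
// Returns the radius if s points to a Circle, otherwise -1.
double radius_or_negative(const Shape* s) {
  const auto* c = dynamic_cast<const Circle*>(s);
  return c ? c->radius : -1.0;
}

// Reference form: a failed cast throws std::bad_cast.
bool is_square(const Shape& s) {
  try {
    static_cast<void>(dynamic_cast<const Square&>(s));
    return true;
  } catch (const std::bad_cast&) {
    return false;
  }
}
```

The pointer form suits cases where a mismatch is expected and handled inline, while the reference form treats a mismatch as an exceptional condition.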

Detecting Old-Style Casts with GCC or Clang

To ensure that old C-style casts are not used in a codebase, consider the -Wold-style-cast flag with GCC or Clang. This flag generates warnings for any old-style casts found in the code.

In CMake, this flag is applied like:

add_compile_options("$<$<COMPILE_LANG_AND_ID:CXX,AppleClang,Clang,GNU>:-Wold-style-cast>")

If the CMake variable CMAKE_COMPILE_WARNING_AS_ERROR is set true, the old-style cast warnings (and other compile warnings) will be treated as errors.
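
As a small illustration of what the flag reports, the sketch below shows C-style casts only in comments, next to named casts that state the intended conversion. The function names truncate_to_int and address_of are hypothetical.

```cpp
#include <cstdint>

// With -Wold-style-cast, GCC and Clang warn on a line like:
//   int n = (int)x;
// The named cast below performs the same value conversion
// and is not reported by the flag.
int truncate_to_int(double x) {
  return static_cast<int>(x);  // truncates toward zero
}

// A C-style cast such as (std::uintptr_t)p would also compile,
// but it hides that this is a bit-level reinterpretation.
std::uintptr_t address_of(const int* p) {
  return reinterpret_cast<std::uintptr_t>(p);
}
```

Replacing each reported C-style cast with the narrowest named cast that compiles tends to reveal which conversions are plain value conversions and which are reinterpretations that deserve a closer look.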

Matlab / Octave integer representation

For proper integer representation in Matlab / Octave, use an explicit integer type to avoid unwanted casting of integers to “double”.

x = int64(2^63);

Operations involving an explicitly-typed variable will retain that type, assuming implicit casting due to other variables or operations doesn’t occur. Precise string representation of “x” can be done using int2str(), sprintf(), or string():

xc = int2str(x);

xf = sprintf('%d', x);

xs = string(x);

sprintf() gives more control over the string output format, while string() or int2str() are more concise.

ATSC 1.0 MPEG-4 older TVs no video

In 2008 the ATSC ratified the MPEG-4 TV broadcast standard. Numerous ATSC 1.0 TVs were sold before this standard was ratified, and still operate today. TV manufacturers continued to make some non-MPEG-4 TVs for a decade after the standard was ratified. As a practical matter to avoid abandoning viewers with older receivers, ATSC 1.0 broadcasts remain on while implementing ATSC 3.0 broadcasts. This lighthousing of ATSC 1.0 broadcasts leads broadcasters to use MPEG-4 encoding for ATSC 1.0 broadcasts.

MPEG-2 is the legacy encoding standard for ATSC 1.0 broadcasts, which any old DTV can receive. A typical ATSC 1.0 MPEG-2 broadcast channel layout was one 1080i channel and several 480i channels, or 1-2 720p channel(s) with even more 480i channels. ATSC broadcast channel layout is a tradespace between the number of subchannels vs. the bandwidth per subchannel. This database lists the channels available in a given area. Click “Technical Data” to see the resolution and encoding of each channel.

As ATSC 3.0 broadcasts roll out, the number of ATSC 1.0 channels will decrease. A mitigation for broadcasters is to switch to MPEG-4 encoding for the ATSC 1.0 broadcasts, which is more efficient than MPEG-2 and allows packing more channels into the same transmitter bandwidth. This leaves older TVs and receivers with audio-only on MPEG-4 channels. This MPEG-4 list is missing some broadcasters. Note that some ATSC broadcasts have audio-only subchannels.

A solution for the end user lacking an MPEG-4-capable TV is to buy an ATSC receiver box that supports MPEG-4. These can be obtained for less than $50. ATSC 3.0 receivers are available for less than $100 if desired to access ATSC 3.0 broadcasts not available even on some new TVs.

Enthusiasts make their “band scan” data available for TV and FM radio typically using a Raspberry Pi to enjoy and share the hobby of broadcast DXing.

Satellite radio outside North America

SDARS satellite radio broadcasting of music or video to mobile receivers has largely been a North American phenomenon. While there are North American specific satellite TV networks like DirecTV and Dish Network, satellite TV has long been a global phenomenon in certain markets.

Automobiles typically have Bluetooth audio, which may one day be used with 5G broadcast instead of individual mobile data streams. This may stymie the growth of SDARS on other continents. Despite a global need for wide-area and rural broadcast radio coverage, SDARS is only widespread in North America. SiriusXM has made a massive investment in a “long game” with receiver availability just as mobile internet streaming became widely feasible. Other continents’ markets may be too fragmented despite the large population, with not enough intercity user mobility to make the subscriber base big enough, and the cities dense enough to also need hundreds of terrestrial repeaters.

Free 2-D CAD drawing programs

AutoCAD 2-D libre alternatives are available for Linux, macOS and Windows. They generally require retraining for users coming from AutoCAD. Libre 2-D AutoCAD-like choices include FreeCAD, QCAD and LibreCAD.

FreeCAD is a 3-D parametric modeler akin to SolidWorks that can import DXF or DWG. QCAD has distinct paid and Community Edition feature sets, which include DWG and DXF read / write. LibreCAD can read / write DXF and read DWG.

The no-cost ODA File Converter converts DXF to / from DWG.

Install Nvidia HPC C, C++, Fortran compilers

The free-to-use Nvidia HPC SDK offers possible speed improvements and CUDA Fortran. A typical reason for using the Nvidia HPC SDK is the CUDA GPU features. Nvidia HPC compilers support C11, C++23, and partial Fortran 2008 including submodule and error stop.

Download and install Nvidia HPC SDK. Create a script nvidia.sh:

To use NVIDIA HPC SDK, source the script:

source ~/nvidia.sh

CMake

In CMake, set NVIDIA HPC compiler-specific options in CMakeLists.txt like:

if(CMAKE_Fortran_COMPILER_ID STREQUAL "NVHPC")
  add_compile_options("$<$<COMPILE_LANGUAGE:Fortran>:-Mdclchk;-Munixlogical>")
endif()

To use newer language standard features, ensure the underlying GCC toolchain is set to a new-enough compiler as per the Nvidia HPC SDK documentation. The compiler path can be determined on RHEL-like Linux distros like:

scl enable gcc-toolset-12 'dirname $(which g++)'

If using a CMake toolchain file instead of the CXXFLAGS environment variable, one can set:

set(CMAKE_CXX_COMPILER_EXTERNAL_TOOLCHAIN "/opt/rh/gcc-toolset-12/root/usr/")