Scientific Computing

Clang -Wunsafe-buffer-usage tips

The Clang C++ flag -Wunsafe-buffer-usage enables a heuristic that can catch potentially unsafe buffer access. However, this flag is known to emit warnings that are unavoidable, such as when accessing elements of argv beyond argv[0], even via encapsulation such as std::span.

This flag can still be useful if a human (or suitably trained AI) occasionally reviews the warnings. For example, in CMake:

option(warn_dev "Enable warnings that may have false positives" OFF)

if(warn_dev)
  add_compile_options("$<$<COMPILE_LANG_AND_ID:CXX,AppleClang,Clang,IntelLLVM>:-Wunsafe-buffer-usage>")
endif()

argv general issues

General issues with argv are discussed in C++ proposal P3474R0 std::arguments. An LLVM issue proposed an interim solution roughly like the following, but at the time of writing, this still emits a warning with -Wunsafe-buffer-usage.

#include <string>  // std::stoi
#if __has_include(<span>)
#include <span>
#endif
#if defined(__cpp_lib_span)
#  if __cpp_lib_span >= 202311L
#    define HAVE_SPAN_AT
#  endif
#endif

int main(int argc, char* argv[]) {

#ifdef HAVE_SPAN_AT
const std::span<char*> ARGS(argv, argc);
#endif

int n = 1000;

if(argc > 1) {
  n = std::stoi(
#ifdef HAVE_SPAN_AT
  ARGS.at(1)
#else
  argv[1]
#endif
  );
}

return 0;
}

MSVC __cplusplus macro

The __cplusplus macro indicates the version of the C++ standard that the compiler claims to implement given the current compiler flags. Some features from later language standards, such as __has_include, are available even when an earlier standard is selected, which is a great convenience. C++ projects regularly use the __cplusplus macro to conditionally compile code based on the C++ standard version implemented by the compiler in use. This allows adding new optional features while still working with older compilers that do not support them.

Surprisingly, Visual Studio MSVC defines __cplusplus as 199711L by default, which is the C++98 standard. Visual Studio 2017 15.7 added the flag /Zc:__cplusplus to define __cplusplus as the correct value like other compilers.
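
When the compiler flag cannot be controlled, a possible workaround is to consult the MSVC-specific macro _MSVC_LANG, which reports the language standard selected by /std: even without /Zc:__cplusplus. The following is a minimal sketch under that assumption. The names EFFECTIVE_CPLUSPLUS, effective_cplusplus, and standard_name are hypothetical, chosen only for illustration.

```cpp
// Sketch: use _MSVC_LANG when it is defined and larger than __cplusplus
// (MSVC without /Zc:__cplusplus); otherwise use __cplusplus as-is.
// EFFECTIVE_CPLUSPLUS is a hypothetical macro name, not a standard one.
#if defined(_MSVC_LANG) && _MSVC_LANG > __cplusplus
#  define EFFECTIVE_CPLUSPLUS _MSVC_LANG
#else
#  define EFFECTIVE_CPLUSPLUS __cplusplus
#endif

constexpr long effective_cplusplus = EFFECTIVE_CPLUSPLUS;

// Map a __cplusplus-style value to a human-readable label.
constexpr const char* standard_name(long v) {
  return v >= 202302L ? "C++23"
       : v >= 202002L ? "C++20"
       : v >= 201703L ? "C++17"
       : v >= 201402L ? "C++14"
       : v >= 201103L ? "C++11"
                      : "C++98";
}
```

Feature checks can then compare against EFFECTIVE_CPLUSPLUS instead of __cplusplus, for example `#if EFFECTIVE_CPLUSPLUS >= 201703L`.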

The Intel oneAPI 2023.1 release adds the MSVC flag /Zc:__cplusplus by default. To see the release note, scroll down to the text “oneAPI 2023.1, Compiler Release 2023.1 New in this release” and click the down caret.

Added /Zc:__cplusplus as a default option during host compilation with MSVC.

In CMake, add this flag as needed by checking the MSVC compiler version.

if(CMAKE_CXX_COMPILER_ID STREQUAL "MSVC" AND CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 19.14)
  # MSVC has __cplusplus = 199711L by default, which is C++98!
  # oneAPI since 2023.1 sets __cplusplus to the true value with MSVC by auto-setting this flag.
  add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:/Zc:__cplusplus>")
endif()

Meson build system adds this flag automatically.

HDF5 / NetCDF4 in GNU Octave

Open data file formats such as HDF5 and NetCDF4 are an excellent way to share and store archival data across computing platforms and software languages. Numerical software such as Matlab, GNU Octave, Python, and many more support these data file formats.

The syntax in the code examples below is exactly the same for Matlab and GNU Octave. Omit the pkg load and pkg install statements in Matlab.

HDF5

HDF5 files in GNU Octave are accessed via hdf5oct in similar fashion to Matlab.

From Octave prompt, install the package:

pkg install -forge hdf5oct

Octave program that writes an array to an HDF5 file “example.h5” dataset “/m”:

pkg load hdf5oct

fn = 'example.h5';

h5create (fn, '/m', [3 3]);
h5write (fn, '/m', magic (3));

Observe the file “example.h5” has been created. If the HDF5 command line tools are installed, the contents can be printed from system Terminal:

h5ls -v example.h5

In Octave or Matlab, the HDF5 file can be read to an array:

x = h5read (fn, '/m')
8   1   6
3   5   7
4   9   2

NetCDF4

NetCDF4 files in GNU Octave are accessed via the Octave netcdf package. Install the package from the Octave prompt:

pkg install -forge netcdf

Write an array to a NetCDF4 file “example.nc” dataset “m”:

pkg load netcdf

fn = 'example.nc';

nccreate (fn, 'm', "Dimensions", {"x", 3, "y", 3});
% must include dimensions or a scalar dataset will be created

ncwrite (fn, 'm', magic (3));

Read the NetCDF4 file “example.nc” to an array:

x = ncread (fn, 'm')

Reference:

  • oct-hdf5 package: Octave low-level access to HDF5 files.

Eliminate old C-style casts in C++

The C++ named casts such as static_cast, dynamic_cast, and reinterpret_cast are preferred in C++ Core Guideline ES.49 over ambiguous old C-style casts. C++ named casts can help provide type safety by making the intention of the cast explicit / readable. C++ compilers can detect and warn about improper or unsafe casts when using named casts. C-style cast mistakes are more difficult to detect by humans or automated tools.

static_cast is used for conversions between compatible types, such as converting an int to a float or a pointer to a base class to a pointer to a derived class. Another common static_cast use case is interfacing with C functions such as Windows API functions that require specific types less common in pure C++ code.

int a = 10;
float b = static_cast<float>(a);

reinterpret_cast is used for low-level reinterpreting of bit patterns. It casts a type to a completely different type. This cast is not type safe and should be used with caution to avoid undefined behavior. reinterpret_cast is commonly used in low-level programming, such as interfacing with hardware or converting between pointers and integers.

int a = 10;
char* b = reinterpret_cast<char*>(&a);

dynamic_cast is used for safe downcasting of pointers or references to classes in a class hierarchy. It performs a runtime or RTTI check to help ensure that the cast is valid. dynamic_cast is used when you need to convert a pointer or reference to a base class to a pointer or reference to a derived class. static_cast is more common and faster than dynamic_cast, but dynamic_cast is safer when downcasting in a class hierarchy.
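
As a minimal sketch of the runtime check described above, assume a small class hierarchy (Shape, Circle, Square) invented here for illustration. A dynamic_cast on a pointer yields nullptr when the object is not of the requested derived type, so the mismatch can be handled rather than silently producing an invalid pointer as an incorrect static_cast downcast would.

```cpp
// Hypothetical hierarchy for illustration only.
// The base class must be polymorphic (have a virtual function) for dynamic_cast.
struct Shape {
  virtual ~Shape() = default;
};

struct Circle : Shape {
  double radius = 1.0;
};

struct Square : Shape {
  double side = 2.0;
};

// Returns the radius if s is actually a Circle, otherwise -1.
double radius_or_negative(const Shape& s) {
  // The pointer form of dynamic_cast returns nullptr on a type mismatch.
  if (const auto* c = dynamic_cast<const Circle*>(&s)) {
    return c->radius;
  }
  return -1.0;
}
```

Passing a Circle returns its radius, while passing a Square returns -1 because the cast fails at runtime.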

Detecting Old-Style Casts with GCC or Clang

To ensure that old C-style casts are not used in a codebase, consider the -Wold-style-cast flag with GCC or Clang. This flag generates warnings for any old-style casts found in the code.

In CMake, this flag is applied like:

add_compile_options("$<$<COMPILE_LANG_AND_ID:CXX,AppleClang,Clang,GNU>:-Wold-style-cast>")

If the CI system has variable CMAKE_COMPILE_WARNING_AS_ERROR set true, the old-style cast warnings will be treated as errors.
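
For illustration, here is a sketch of the kind of rewrite this warning prompts. The function name ratio is hypothetical. The C-style form appears only in a comment so that the snippet itself compiles cleanly with -Wold-style-cast.

```cpp
// Before (flagged by -Wold-style-cast):
//   return (double)num / den;
//
// After: the named cast states the intent, an arithmetic conversion.
constexpr double ratio(int num, int den) {
  return static_cast<double>(num) / den;
}
```

The behavior is unchanged. Only the spelling of the conversion differs, which also makes it easy to search for in a codebase.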

Matlab / Octave integer representation

For proper integer representation in Matlab / Octave, use an explicit integer type to avoid unwanted casting of integers to “double”.

x = int64(2^63);

Operations involving an explicitly-typed variable will retain that type, assuming implicit casting due to other variables or operations doesn’t occur. Precise string representation of “x” can be done using int2str(), sprintf(), or string():

xc = int2str(x);

xf = sprintf('%d', x);

xs = string(x);

sprintf() gives more control over the string output format, while string() or int2str() are more concise.
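
The reason to avoid “double” for large integers is that an IEEE 754 double has a 53-bit significand, so not every integer above 2^53 can be represented exactly. A small sketch of that effect follows, written in C++ rather than Matlab, with hypothetical variable names.

```cpp
#include <cstdint>
#include <limits>

// This sketch assumes the usual IEEE 754 binary64 double.
static_assert(std::numeric_limits<double>::digits == 53,
              "assumes a 53-bit significand");

// 2^53 + 1 = 9007199254740993 fits in int64 but has no exact double.
constexpr std::int64_t big = (std::int64_t{1} << 53) + 1;

// Converting to double rounds to a neighboring representable value.
constexpr double as_double = static_cast<double>(big);

// Converting back therefore does not recover the original integer.
constexpr std::int64_t round_trip = static_cast<std::int64_t>(as_double);
```

Here round_trip differs from big, which is the kind of silent precision loss that an explicit 64-bit integer type avoids.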

ATSC 1.0 MPEG-4 older TVs no video

In 2008 the ATSC ratified the MPEG-4 TV broadcast standard. Numerous ATSC 1.0 TVs were sold before this standard was ratified, and still operate today. TV manufacturers continued to make some non-MPEG-4 TVs for a decade after the standard was ratified. The need to lighthouse ATSC 1.0 broadcasts while implementing ATSC 3.0 broadcasts has led numerous broadcasters to use MPEG-4 encoding for their ATSC 1.0 broadcasts.

Previously MPEG-2 was the only widespread encoding standard for ATSC 1.0 broadcasts. Until recently, a typical ATSC 1.0 broadcast channel layout was to have one 1080i channel and several 480i channels, or 1-2 720p channel(s) with several more 480i channels. The number of channels depends on the broadcaster’s choice of resolution and the number of subchannels they want to broadcast vs. the bandwidth per channel. This database lists the channels available in a given area. Click “Technical Data” to see the resolution and encoding of each channel.

As ATSC 3.0 broadcasts roll out, the number of ATSC 1.0 channels will decrease. A mitigation for broadcasters is to switch to MPEG-4 encoding for the ATSC 1.0 broadcasts, which is more efficient than MPEG-2 and allows packing more channels into the same transmitter bandwidth. This leaves older TVs and receivers with audio-only on MPEG-4 channels. This MPEG-4 list is missing some broadcasters.

A solution for the end user lacking an MPEG-4-capable TV is to buy an ATSC receiver box that supports MPEG-4. These can be obtained for less than $50. ATSC 3.0 receivers are available for less than $100 for those who also want ATSC 3.0 broadcasts, which even some new TVs cannot receive.

Enthusiasts make their “band scan” data available for TV and FM radio typically using a Raspberry Pi to enjoy and share the hobby of broadcast DXing.

Satellite radio outside North America

SDARS satellite radio broadcasting of music or video to mobile receivers has largely been a North American phenomenon. While there are North American specific satellite TV networks like DirecTV and Dish Network, satellite TV has long been a global phenomenon in certain markets.

Automobiles typically have Bluetooth audio, which may one day be used with 5G broadcast instead of individual mobile data streams. This may stymie the growth of SDARS on other continents. Despite a global need for wide-area and rural broadcast radio coverage, SDARS is only widespread in North America. SiriusXM made a massive investment in a “long game” with receiver availability just as mobile internet streaming became widely feasible. Markets on other continents may be too fragmented despite their large populations, with too little intercity user mobility to make the subscriber base big enough, and with cities dense enough to also need hundreds of terrestrial repeaters.

Visual Studio /utf-8 source files

The MSVC compiler needs the /utf-8 flag when UTF-8 strings are in the source file. If not specified, too many bytes are assigned to each character, leading to incorrect string lengths. This will lead to failures with string width conversions such as WideCharToMultiByte.

Windows Intel oneAPI compiler didn’t need the /utf-8 flag when tested.

In CMake, apply the /utf-8 flag like:

add_compile_options($<$<AND:$<COMPILE_LANGUAGE:CXX>,$<CXX_COMPILER_ID:MSVC>>:/utf-8>)

In Meson, the /utf-8 flag is applied automatically to C++ source files with MSVC compilers.
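
To illustrate why byte counts and character counts differ for UTF-8 text, here is a minimal sketch that counts code points by skipping continuation bytes (those with the bit pattern 10xxxxxx). The names utf8_code_points and word are hypothetical, the input is assumed to be valid UTF-8, and the literal uses hex escapes so that the example does not depend on how the compiler interprets the source file encoding.

```cpp
#include <cstddef>

// Count UTF-8 code points in a NUL-terminated string.
// Assumes valid UTF-8: every byte that is not a continuation byte
// (10xxxxxx) starts a new code point.
constexpr std::size_t utf8_code_points(const char* s) {
  std::size_t n = 0;
  for (; *s != '\0'; ++s) {
    if ((static_cast<unsigned char>(*s) & 0xC0u) != 0x80u) {
      ++n;
    }
  }
  return n;
}

// "naive" with a diaeresis on the i: that letter is the two bytes 0xC3 0xAF.
constexpr char word[] = "na\xC3\xAFve";
```

For this word the array holds 6 bytes plus the terminator, while utf8_code_points(word) counts 5 code points. A check like this can help confirm that string literals were encoded as expected.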

Matlab upgrade from Terminal

Matlab version upgrade can be initiated from the system Terminal or from the Matlab Help menu. Running the upgrade or Add-On install requires a graphical connection to the computer (Remote Desktop, VNC, or similar).

Matlab release upgrade

The bell icon in the upper right corner of the Matlab Desktop typically shows when a Matlab update is available. To force a check for updates, even if the bell is not showing one, look for “MathWorksUpdateInstaller” under the Matlab binary directory:

fullfile(matlabroot, "bin", computer("arch"))

Matlab Add-Ons

Matlab Add-On Explorer is normally launched from the Matlab Desktop. The Add-On Explorer requires a graphical connection. To launch the Add-On Explorer directly, look in the Matlab binary directory for the executable “AddOnInstaller”.

Troubleshooting

If the graphical connection has a problem like:

terminating with uncaught exception of type (anonymous namespace)::DisplayError: No display available.

try for diagnosis:

w = matlab.internal.webwindow('www.mathworks.com');

w.bringToFront();

The first command (object instantiation) should not error. The method “bringToFront” should bring up a window if the connection / system allows.


If on Linux with a graphical connection and errors result, try renaming libcrypto.so.1.1 like:

r=$(matlab -batch "disp(fullfile(matlabroot, 'bin', computer('arch')))" | tail -n1)

mv "$r/libcrypto.so.1.1" "$r/libcrypto.so.1.1.bak"

Related: Matlab install on Linux

Datetime vectors in Matlab / Octave

Generating a range of datetime data is a common data analysis and simulation task across programming languages. Matlab and GNU Octave can also generate datetime vectors.

Matlab datetime supersedes datenum. Generate a sequence of datetime values like:

t0 = datetime('2020-01-05 12:30:00');
t1 = datetime('2020-01-06 18:15:10');
dt = hours(5.5);

t = t0:dt:t1;

disp(t)

GNU Octave can use many datetime features via the tablicious package.

pkg install -forge tablicious

Load in Octave prompt:

pkg load tablicious

Then use the same Matlab code above.