Scientific Computing

FFmpeg optimize YouTube video

FFmpeg can re-encode video from numerous formats for upload to YouTube (or another service). The command below implements YouTube's suggested settings for SDR video. HDR video needs additional settings.

ffmpeg -colorspace bt709 -i in.avi -b:v 8M -bufsize 16M -c:v libx264 -preset slow -c:a aac -b:a 384k -pix_fmt yuv420p -movflags +faststart out.mp4
-colorspace bt709
BT.709 color space for SDR video. HDR should not use this flag.
-i in.avi
input file
-b:v 8M -bufsize 16M
8 Mbps video bitrate with 16 Mbps buffer size. This assumes 1080p input video; adjust it to the actual resolution of the input video, using the bitrate table to choose appropriately.
-c:v libx264
H.264 video codec
-preset slow
better encoding quality at expense of more processing time
-c:a aac
AAC-LC audio codec
-b:a 384k
384 kbps audio bitrate (stereo). This is not the sample rate, which should be 48 kHz.
-pix_fmt yuv420p
YUV 4:2:0 pixel format
-movflags +faststart
places the moov atom at the beginning of the file, so playback can start before the whole file is downloaded
out.mp4
MP4 container as suggested by YouTube

Additional flags that can be set, but require knowing the video parameters like frame rate:

-flags +cgop -g 30
GOP of 30 frames. The GOP should be half the frame rate, so this example assumes a 60 fps video.
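The relationships among these flags can be sketched in a small Python helper that assembles the argument list. This is a minimal sketch, not part of FFmpeg: the function name youtube_sdr_args and its parameters are hypothetical, and it assumes the buffer size is twice the video bitrate (as in the 8M / 16M example) and the GOP is half the frame rate.

```python
import shlex


def youtube_sdr_args(src, dst, video_mbps, fps):
    """Build an ffmpeg argument list for SDR upload using the flags described above.

    Hypothetical helper: video_mbps is the target video bitrate in Mbps,
    fps is the frame rate of the input video.
    """
    gop = round(fps / 2)  # GOP of half the frame rate
    return [
        "ffmpeg",
        "-colorspace", "bt709",
        "-i", src,
        "-b:v", f"{video_mbps}M",
        "-bufsize", f"{2 * video_mbps}M",  # assumed: buffer is twice the bitrate
        "-c:v", "libx264",
        "-preset", "slow",
        "-flags", "+cgop",
        "-g", str(gop),
        "-c:a", "aac",
        "-b:a", "384k",
        "-pix_fmt", "yuv420p",
        "-movflags", "+faststart",
        dst,
    ]


# 1080p, 60 fps input as in the example above
cmd = youtube_sdr_args("in.avi", "out.mp4", video_mbps=8, fps=60)
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems with unusual file names.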

Video only, without audio

ffmpeg -i in.avi -an -c:v libx264 -preset slow -crf 18 -pix_fmt yuv420p -movflags +faststart out.mp4

Reference: FFmpeg for YouTube

GitHub Actions upload CMakeConfigureLog.yaml

When CMake fails on the configure or generate steps in a CI workflow, having CMakeConfigureLog.yaml uploaded as a file can help debug the issue. Add this step to the GitHub Actions workflow YAML file:

  - name: Configure CMake
    run: cmake -B build

# other steps ...

  - name: upload CMakeConfigureLog.yaml
    if: failure() && hashFiles('build/CMakeFiles/CMakeConfigureLog.yaml') != ''
    uses: actions/upload-artifact@v4
    with:
      name: ${{ runner.os }}-${{ env.CC }}-CMakeConfigureLog.yaml
      path: build/CMakeFiles/CMakeConfigureLog.yaml
      retention-days: 5

The “retention-days” parameter is optional. Ensure the “name” parameter is unique to avoid conflicts with other jobs in the workflow. Here we assume that the OS and C compiler are unique between jobs.

Fortran stream buffering

Fortran programs, like programs in most other languages, can access the standard input, output, and error streams, which usually connect to the terminal. Stream buffering generally makes the small bursts of text typical of interactive applications more efficient. Buffering means that data is not written to the terminal or console until the buffer is full or flushed. The Fortran statement to flush is flush(unit), where unit is the unit number of the stream to flush. The print statement flushes the stream automatically. The write statement flushes the stream unless the advance specifier is set to “no”.

Different compilers have distinct default stream buffering behavior.

  • Gfortran: streams buffered by default. Environment variable GFORTRAN_UNBUFFERED_PRECONNECTED can be set to “y” to disable buffering.
  • Intel Fortran “ifx”: streams unbuffered by default. Environment variable FORT_BUFFERED can be set to “TRUE” to enable buffering.
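When a Fortran executable is launched from a script, the environment variables above can be set per process. The Python sketch below builds such an environment. The helper name fortran_stream_env is hypothetical, and it uses only the two values mentioned above (“y” and “TRUE”); for the opposite setting it removes the variable so the compiler default applies.

```python
import os


def fortran_stream_env(compiler, buffered, base=None):
    """Return a copy of the environment requesting buffered or unbuffered streams.

    Hypothetical helper: compiler is "gfortran" or "ifx".
    """
    env = dict(os.environ if base is None else base)
    if compiler == "gfortran":
        # Gfortran buffers by default; "y" disables buffering.
        env.pop("GFORTRAN_UNBUFFERED_PRECONNECTED", None)
        if not buffered:
            env["GFORTRAN_UNBUFFERED_PRECONNECTED"] = "y"
    elif compiler == "ifx":
        # Intel Fortran is unbuffered by default; "TRUE" enables buffering.
        env.pop("FORT_BUFFERED", None)
        if buffered:
            env["FORT_BUFFERED"] = "TRUE"
    else:
        raise ValueError(f"unknown compiler: {compiler}")
    return env


env = fortran_stream_env("gfortran", buffered=False, base={})
print(env)
```

The returned dictionary can be passed as the env argument of subprocess.run when starting the compiled program.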

Matlab reading CDF files

The NASA CDF data file format can be read in many languages, including Matlab. Matlab data file interfaces don’t always yield an obvious error message when something is wrong, such as a data file or a variable within it not existing. For CDF, the message from cdfread for a non-existent variable is like:

Error using matlab.internal.imagesci.cdflib

CDF library failure encountered when executing the CDFclosezVar routine: “ILLEGAL_IN_zMODE: Operation is illegal while in zMode.”

One can use try-catch, or check that the variable name exists in the first column of the Variables field returned by cdfinfo.

Install latest LLVM Clang / Flang on GitHub Actions

The Clang / Flang compiler versions on GitHub Actions might be older than desired. While GCC is usually the latest release on GitHub Actions, LLVM might be a couple of versions behind the latest. This example shows how to install a range of LLVM versions in a GitHub Actions workflow.


jobs:

  linux-flang:
    runs-on: ubuntu-latest
    timeout-minutes: 15

    strategy:
      matrix:
        llvm-version: [20]

    env:
      CC: clang-${{ matrix.llvm-version }}
      CXX: clang++-${{ matrix.llvm-version }}
      FC: flang-${{ matrix.llvm-version }}

    steps:
    - uses: actions/checkout@v4

    - name: Apt LLVM
      run: |
          wget https://apt.llvm.org/llvm.sh
          chmod +x llvm.sh
          sudo ./llvm.sh ${{ matrix.llvm-version }}
          sudo apt-get update

    - name: install Flang
      run: sudo apt install --no-install-recommends clang-${{ matrix.llvm-version }} flang-${{ matrix.llvm-version }}

    # build , test, etc.

Matlab builtin() function overload

Matlab built-in functions can be overloaded in classes or in general. This can be useful when making code compatible with both Matlab and GNU Octave, or to be compatible with older Matlab versions.

For example, the Matlab built-in function strlength gets the length of char and string arrays, but older Matlab and GNU Octave don’t have this function. We made a function in strlength.m in a project’s private/ directory like:

%% STRLENGTH get length of character vector or scalar string

function L = strlength(s)

L = [];

if ischar(s)
  L = length(s);
elseif isstring(s)
  L = builtin('strlength', char(s));
  % char() is workaround for bug described below
end

end

We found the following bug, which was confirmed by Xinyue Xia of MathWorks Technical Support: strlength() works with char, or string scalars or N-D arrays. However, when using

builtin('strlength')

only char is accepted. A scalar or N-D string array unexpectedly raises an error.

Example:

builtin('strlength', ["hi", "there"])

we expect to get [2,5] but instead get:

Error using strlength First argument must be text.

builtin('strlength', "hi")

we expect to get 2, but instead get:

Error using strlength First argument must be text.

Matlab remote desktop plot rendering

Matlab graphics rendering can break with remote desktop via VNC or X11 forwarding. Local Matlab plots can break due to GPU graphics driver problems. Before Matlab R2025a, setting the Matlab figure renderer could work around Matlab plot issues. Consider using Matlab Online for remote development if available.

f = figure;

% Obsolete as of Matlab R2025a
set(f, Renderer='painters', RendererMode='manual')

Matlab rendererinfo() provides extensive details about the rendering backend.

Conda commands fail with Windows PowerShell 7.5

With Conda older than 25.0 on Windows, upon upgrading to PowerShell 7.5, all Conda commands might fail like:

Invoke-Expression: Cannot bind argument to parameter ‘Command’ because it is an empty string.

The underlying issue is the faulty syntax of the Conda scripts.

The easiest solution is to uninstall and reinstall Miniconda or Anaconda.

Year 2038 long int on Windows

POSIX epoch time is the number of seconds elapsed since 00:00:00 Coordinated Universal Time (UTC) on Thursday, 1 January 1970. Operations involving POSIX epoch time, particularly on Windows, may run into Year 2038 problems if the “long” type is used to store the epoch time in seconds. The “long” type is only guaranteed to be at least 32 bits wide, and on Windows it is 32 bits even in 64-bit programs. The maximum value of a 32-bit signed integer is 2,147,483,647, which corresponds to 03:14:07 UTC on Tuesday, 19 January 2038. After this time, a 32-bit “long” value overflows, which can cause unexpected behavior in applications that rely on the epoch time.

To avoid this problem in a cross-platform manner, consider storing the epoch time in the explicit 64-bit integer type int64_t in C and C++ programs, or in the equivalent 64-bit type in other languages. We chose “int64_t” instead of “long long” to be explicit about the integer size needed, which is a general best practice in C / C++.

Example using std::difftime to calculate the difference between two time points:

#include <cstdint>
#include <ctime>
#include <string>
#include <iostream>

int main() {
    std::time_t t1 = 0;
    std::time_t t2 = 2147483648; // 03:14:08 UTC on Tuesday, 19 January 2038
    std::string posix = std::to_string(static_cast<std::int64_t>(std::difftime(t2, t1)));

    std::cout << "POSIX time difference: " << posix << " seconds\n";

    return 0;
}
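The arithmetic behind the Year 2038 limit can be checked with a short Python sketch. The helper names utc_from_seconds and as_int32 are hypothetical. ctypes.c_int32 is used here only to imitate the wraparound commonly seen with a 32-bit “long”; in C and C++ signed overflow is undefined behavior, so the wrapped value is what is typically observed, not a guarantee.

```python
import ctypes
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # 2,147,483,647


def utc_from_seconds(seconds):
    """Convert POSIX epoch seconds to a UTC datetime (negative values allowed)."""
    return EPOCH + timedelta(seconds=seconds)


def as_int32(seconds):
    """Reinterpret an integer as a 32-bit signed value, wrapping on overflow."""
    return ctypes.c_int32(seconds).value


last_ok = utc_from_seconds(INT32_MAX)    # 2038-01-19 03:14:07 UTC
wrapped = as_int32(INT32_MAX + 1)        # -2147483648
after_wrap = utc_from_seconds(wrapped)   # 1901-12-13 20:45:52 UTC

print("last 32-bit second:", last_ok.isoformat())
print("one second later, wrapped:", wrapped, "->", after_wrap.isoformat())
```

A 64-bit value holds INT32_MAX + 1 without wrapping, which is why the C++ example above casts the difference to std::int64_t.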

CMake end of support schedule

CMake 4.0 introduced a formal end of support schedule for older CMake versions. This schedule impacts cmake_minimum_required: a CMake project will not configure if the version it requests is older than the oldest version that the running CMake still supports, unless CMAKE_POLICY_VERSION_MINIMUM is set to a supported version.

In practice, CMake projects too often declare minimum versions that don’t actually work, or that are so old they are difficult to run on modern computers. A good CI test is to run a CMake version that matches cmake_minimum_required(VERSION) to check that the project actually works with that version.
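A CI job can also check the declared minimum before running CMake at all. The Python sketch below extracts the version from the text of a CMakeLists.txt and compares it to a floor. The function names are hypothetical, the default floor of 3.5 is an assumption based on the CMake 4.0 release notes and should be checked against the CMake version actually in use, and the regular expression is a simplification that does not skip commented-out lines.

```python
import re

# Matches cmake_minimum_required(VERSION 3.10) and the range form 3.10...3.28
_MIN_RE = re.compile(
    r"cmake_minimum_required\s*\(\s*VERSION\s+(\d+(?:\.\d+)*)(?:\.\.\.(\d+(?:\.\d+)*))?",
    re.IGNORECASE,
)


def version_tuple(text, width=4):
    """Convert '3.10' to (3, 10, 0, 0) so versions compare numerically."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def minimum_version(cmakelists_text):
    """Return the declared minimum version string, or None if not found."""
    match = _MIN_RE.search(cmakelists_text)
    return match.group(1) if match else None


def meets_floor(cmakelists_text, floor="3.5"):
    """True if the declared minimum is at least the floor (assumed value)."""
    declared = minimum_version(cmakelists_text)
    return declared is not None and version_tuple(declared) >= version_tuple(floor)


sample = "cmake_minimum_required(VERSION 3.10...3.28)\nproject(demo LANGUAGES C)\n"
print(minimum_version(sample), meets_floor(sample))
```

The version string returned by minimum_version could also be used to select which old CMake release the CI job installs for the compatibility test.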

Find the date of a CMake release under Milestones.