Scientific Computing

Extracting a page from PDF

Extracting one or more pages from a PDF file can be done with paid Adobe Acrobat, but not with the free Adobe Reader or Foxit Reader. Although we have an Acrobat license, we use the free Poppler tools instead, because Adobe Acrobat is a large and cumbersome program that installs startup daemons that are not trivially disabled.

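To extract pages 2 to 3 from in.pdf using Poppler:

pdfseparate -f 2 -l 3 in.pdf out-%d.pdf

pdfseparate writes one file per page (here out-2.pdf and out-3.pdf), so the output name must contain %d when more than one page is extracted. Poppler's pdfunite can merge the extracted pages back into a single PDF if needed.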

Note: you must leave a space after -f and -l as shown.
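
When page extraction is scripted, it can help to build the pdfseparate arguments in one place. The following is a minimal sketch, not part of Poppler: the helper name pdfseparate_command and the default pattern out-%d.pdf are assumptions made for illustration. It only builds the argument list; actually running it requires Poppler on the PATH.

```python
import shlex


def pdfseparate_command(pdf_in, first, last, pattern="out-%d.pdf"):
    """Build the argument list for Poppler's pdfseparate (hypothetical helper)."""
    if first < 1 or last < first:
        raise ValueError("need 1 <= first <= last")
    if last > first and "%d" not in pattern:
        # pdfseparate writes one file per page, so the name must vary per page
        raise ValueError("pattern needs %d when extracting more than one page")
    # -f and -l are passed as separate arguments, matching the space in the shell form
    return ["pdfseparate", "-f", str(first), "-l", str(last), pdf_in, pattern]


cmd = pdfseparate_command("in.pdf", 2, 3)
print(shlex.join(cmd))
# To run it for real: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.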

Poppler is installed by:

  • Linux: apt install poppler-utils
  • macOS: brew install poppler
  • Windows: use Poppler from within WSL

Ghostscript, an alternative to Poppler, can also extract pages from a PDF, but the command is more complicated:

gs -sDEVICE=pdfwrite -dFirstPage=2 -dLastPage=3 -dNOPAUSE -dSAFER -dBATCH -sOutputFile=out.pdf in.pdf

Related: extract raw full-quality images from PDF

GitHub Python dependency check

GitHub CodeQL semantically analyzes Python code for security issues. In addition, CVE lists are checked against your GitHub repo’s dependency graph. CodeQL can install the Python package for more fidelity.

This approach finally fixes the concerns we had with the previous implementation that simply did CVE scans versus dependency graphs. The prior method of extracting dependencies did not work for modern Python packages. The new CodeQL method is much more robust and useful.

Logitech on Linux with Solaar

The Solaar program manages connections with the Logitech Unifying receiver on Linux, including pairing/unpairing. This means wireless keyboards and mice, including wireless trackballs, work well on Linux. Logitech wireless firmware updates are provided seamlessly in Linux. The Unifying receiver “just works” on Linux upon plugging in, with trackballs being recognized as HID devices.

Solaar provides a GUI for pairing/unpairing, configuring buttons, monitoring battery level, and checking the firmware version of the Unifying receiver and connected devices. Logitech Unifying receivers can be paired with multiple devices. This allows one to carry a laptop from home to office without dragging the wireless keyboard or mouse along.

scipy.integrate.cumulative_trapezoid vs. Matlab cumtrapz

The 0-based indexing of Python / Numpy versus the 1-based indexing of Matlab is perhaps the most obvious difference when working between the languages. Other, more subtle defaults come into play and may not be immediately caught within functions except by manual checks.

In atmospheric science, among other fields, cumulative trapezoidal numerical integration is a common operation, typically provided by a function named “cumtrapz”. Matlab cumtrapz and scipy.integrate.cumulative_trapezoid differ in their default output length, which is obvious when checked manually but easily missed when the call is buried inside a function. Specifically, scipy.integrate.cumulative_trapezoid needs the initial=0 argument to match Matlab, as the following example shows.

Example: given input:

x = [-2.5, 5, 10]

Matlab (or GNU Octave) outputs:

y = cumtrapz(x)

[0, 1.25, 8.75]

scipy.integrate.cumulative_trapezoid outputs one element fewer than the input by default:

scipy.integrate.cumulative_trapezoid(x)

[1.25, 8.75]

To match Matlab output from SciPy, add the initial=0 argument:

scipy.integrate.cumulative_trapezoid(x, initial=0)

[0, 1.25, 8.75]
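
To see where the extra leading element comes from, here is a minimal pure-Python sketch of cumulative trapezoidal integration with unit sample spacing. It is an illustration written for this post, not SciPy's implementation; the function name and the dx parameter are assumptions. Each output element is the running sum of trapezoid areas, so N samples give N-1 areas, and initial simply prepends a starting value.

```python
def cumulative_trapezoid_sketch(y, dx=1.0, initial=None):
    """Running sum of trapezoid areas between consecutive samples of y."""
    sums = []
    total = 0.0
    for left, right in zip(y, y[1:]):
        # area of one trapezoid of width dx
        total += 0.5 * (left + right) * dx
        sums.append(total)
    if initial is not None:
        # Matlab-like output: same length as the input, starting at `initial`
        return [float(initial)] + sums
    return sums


x = [-2.5, 5, 10]
print(cumulative_trapezoid_sketch(x))             # [1.25, 8.75]
print(cumulative_trapezoid_sketch(x, initial=0))  # [0.0, 1.25, 8.75]
```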

HDF5 CMake build

CMake is recommended to build HDF5 in general. CMake is required to build HDF5 on Windows. Trying to use Autotools to build HDF5 may encounter needless trouble compared to CMake.

HDF5 builds test executables that don’t pick up CMake environment variables. Set compiler variables CC and FC in the shell, particularly when using non-system-default compilers.

Avoid these issues by building HDF5 with CMake.

This creates static and dynamic HDF5 libraries under the user install prefix. The library suffixes are omitted below for simplicity. Note: the *stub files may not be present.

hdf5/lib/libhdf5
hdf5/lib/libhdf5_fortran
hdf5/lib/libhdf5_hl
hdf5/lib/libhdf5_hl_fortran
hdf5/lib/libhdf5_hl_f90cstub
hdf5/lib/libhdf5_f90cstub

The Fortran .mod files that need to be included are under

hdf5/include/

Tell CMake to use this HDF5 from the user project. $HOME is used instead of ~ because the shell does not expand ~ in the middle of an argument:

cmake -B build -DHDF5_ROOT=$HOME/.local/hdf5

WSJT-X operating tips

The WSJT-X program enables popular digital modes like FT8 and FT4 among several other modes. WSJT-X is generally easier to use than FLdigi, another popular digital program that handles a wide range of other ham radio data modes. WSJT-X decodes multiple signals at once, and discriminates between overlapping signals. WSJT-X includes multiple high performance digital modes, each tuned for challenging propagation conditions from LF to microwave.

Control transmitter RF output power with the sound card output level, NOT with radio RF power control. Optimize transmitter RF cleanliness by:

  1. set the RF power control to well above the intended transmit power level (say 6 dB to 10 dB higher)
  2. set microphone gain to about midrange
  3. set computer audio volume to precisely set output RF transmit power

If RF power control is used to limit the transmit level, the audio drive pushes the transmitter into ALC and the radio may splatter, interfering with adjacent frequencies. While some operators use QRP (≤ 5 watts) transmit power for WSJT-X supported modes, many others use 25 to 100 watts. This accounts for the discrepancy one might see in a QSO between the signal strengths received by each operator. Some use a very minimal amount of RF power for a “kilometers per watt” challenge.
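
The decibel figures above translate to watts with simple arithmetic. The sketch below is only a worked example: the 5 watt intended power and the function names are assumptions chosen for illustration. It shows what 6 dB to 10 dB of headroom on the RF power control means, and how large a signal report gap to expect between a 100 watt and a 5 watt station over the same path.

```python
import math


def scale_power(watts, gain_db):
    """Power in watts after applying a gain in dB."""
    return watts * 10 ** (gain_db / 10)


def power_ratio_db(p1_watts, p2_watts):
    """Ratio of two powers expressed in dB."""
    return 10 * math.log10(p1_watts / p2_watts)


intended = 5.0  # watts, example value
low = scale_power(intended, 6)    # about 19.9 W
high = scale_power(intended, 10)  # 50 W
print(f"Set RF power control between {low:.1f} W and {high:.1f} W")

# A 100 W station versus a 5 W station: about 13 dB difference in reports
print(f"{power_ratio_db(100, 5):.1f} dB")
```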

When using CAT control for PTT, RFI (for example, common mode RF interference) can cause the radio to fall out of transmit, even with very low RF transmit power like 100 milliwatts. With the compromise HF antennas enabled by these efficient digital modes, one must be just as mindful of proper grounding and RFI as with a 100 watt SSB or CW station. Diagnosing this problem can be done by setting the transmitter power to as low as measurable, and seeing if the radio still drops out of transmit. If so, then the problem is likely RFI or common mode RF on the USB control cables.

Older radios may have too-narrow receive filters to capture the whole audio passband used by modern data-friendly transceivers. If the radio has “IF shift” or “pass band tuning”, it might receive from 400-2800 Hz. The “split” transmit feature of WSJT-X shifts the radio RF transmit frequency and audio frequency to optimize the transmit audio passband to minimize transmitted audio harmonics.

A radio that receives on multiple HF bands simultaneously greatly increases the data gathering capability for ionospheric studies. SDRs oriented toward “traditional” ham radio use typically receive one or two RF bands at once. Simultaneous multi-band reception can be accomplished with a bank of SDR receivers or a broadband direct-sampling SDR.

Matlab + Octave unit tests quick-start

Matlab unit test framework is well-established and xUnit-like. The Matlab unit test framework is incompatible with the GNU Octave BIST unit testing framework. The Matlab unit test syntax is completely different from the Octave unit test syntax. Nonetheless, a common ground between Matlab and GNU Octave can be found in the easy to use Matlab script-based tests. A Matlab script-based test can be run as an ordinary script with GNU Octave.

A key limitation of Matlab script-based tests is they can’t use the Qualifiable functions (verify*, assert*, assume*). This also means you cannot mark a test as skipped or Incomplete (using assume*) with script-based tests. Basic tests can work without those functions, but more advanced test suites do strongly benefit from those functions.

Matlab or Octave can run a script-based test as a plain script. The downsides of running a script-based test as a plain script are:

  1. there is no decorated TestResult
  2. the first failure ends the whole test

It’s best to put the script-based test scripts in the same directory as the code they’re testing to avoid Matlab path issues.

When using Matlab on this Octave-compatible script-based test, a richer result comes from using Matlab-only runtests(). By default, Matlab runtests() runs the tests in the current working directory; to also search subdirectories, use runtests(pwd, 'IncludeSubfolders', true). Scripts are picked up when the filename starts or ends with case-insensitive “test”. For example, a script testing HDF5 function of a program might be named TestHDF5.m or HDF5test.m. This is similar to PyTest filename-filtering.

Denote each test in the script-based test file with a double percent sign like:

% test_example.m
% setup code goes up top, as each test's variables are isolated from each other
% when using Matlab runtests()

A = magic(2);

%% test_dummy
B = A*2;  % only visible to this test, when using Matlab runtests()
assert(isequal(B, A*2), 'dummy example')

%% test_four
C = A*4; % note that "B" is not visible under runtests()
assert(isequal(C, A*4), 'dummy four')

Run Matlab runtests() from CI command like:

matlab -batch "assertSuccess(runtests)"

Just doing matlab -batch runtests will NOT fail in CI, even with a failing test.

Octave runtests('.') only runs tests in specially formatted comments that Matlab ignores. For projects that need to have tests supporting Octave and Matlab, we generally recommend writing Matlab script-based tests, and manually running each file from Octave.

Linking HDF5 with CMake for C, C++ and Fortran

CMake links HDF5 into the C, C++, or Fortran program with just two lines in CMakeLists.txt. If experiencing trouble finding HDF5 with CMake, try FindHDF5.cmake, which is more up to date than the FindHDF5.cmake included with CMake. An example CMake for writing network data to HDF5 in C: CMakeLists.txt.


We show an example for C and another example for Fortran. “HL” refers to the high-level HDF5 interface that is more convenient and thus commonly used.

Note: if the terminal has the Conda environment loaded and you keep getting the Conda HDF5 library, do first:

conda deactivate

before running the CMake configure command.


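# CMakeLists.txt for a C project
# the HDF5::HDF5 imported target requires CMake >= 3.19
cmake_minimum_required(VERSION 3.19)
project(myproj LANGUAGES C)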

find_package(HDF5 REQUIRED COMPONENTS C HL)

add_executable(myprog myprog.c)
target_link_libraries(myprog PRIVATE HDF5::HDF5)

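# CMakeLists.txt for a Fortran project
# the HDF5::HDF5 imported target requires CMake >= 3.19
cmake_minimum_required(VERSION 3.19)
project(myproj LANGUAGES Fortran)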

find_package(HDF5 REQUIRED COMPONENTS Fortran HL)

add_executable(myprog myprog.f90)
target_link_libraries(myprog PRIVATE HDF5::HDF5)

HDF5 C example

The Fortran HDF5 syntax is quite similar.

#include <stdio.h>  /* printf */
#include "hdf5.h"

int main(void) {

   hid_t       file_id, dataset_id,dataspace_id; /* identifiers */
   herr_t      status;
   int         i, j, dset_data[4][6], read_data[4][6];
   hsize_t     dims[2];

   /* Initialize the dataset. */
   for (i = 0; i < 4; i++)
      for (j = 0; j < 6; j++)
         dset_data[i][j] = i * 6 + j + 1;

   /* Create a new file using default properties. */
   file_id = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

   /* Create the data space for the dataset. */
   dims[0] = 4;
   dims[1] = 6;
   dataspace_id = H5Screate_simple(2, dims, NULL);

   /* Create the dataset. */
   dataset_id = H5Dcreate2(file_id, "/dset", H5T_STD_I32BE, dataspace_id,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

   /* Write the dataset. */
   status = H5Dwrite(dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT,
                     dset_data);

   /* End access to the dataset and release resources used by it. */
   status = H5Dclose(dataset_id);

//------------------------------------------------------

   /* Open an existing dataset. */
   dataset_id = H5Dopen2(file_id, "/dset", H5P_DEFAULT);


   status = H5Dread(dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT,
                    read_data);

   for (i = 0; i < 4; i++)
      for (j = 0; j < 6; j++)
        printf("%d ",read_data[i][j]); // 1-24

   /* Close the dataset. */
   status = H5Dclose(dataset_id);

   /* Close the file. */
   status = H5Fclose(file_id);

   return 0;
}

HDF5 compiler script

As an alternative to CMake, HDF5 compiler script h5cc links HDF5 and necessary libraries:

h5cc myprog.c func.c -lm

The compiler script for each language is:

  • h5cc: C
  • h5c++: C++
  • h5fc: Fortran