Scientific Computing

Viewing FITS image stack

The FITS data file format is used in astronomical imagery. Two of the easiest ways to view FITS file image stacks with standalone programs are HDFView and NASA FV.

While FV has utilities and a UI more oriented to astronomical uses, HDFView is a generally useful tool to view HDF5, HDF4 and FITS files.
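
For a quick look at the shape of an image stack without a viewer, the FITS primary header can be read directly. The following is a minimal sketch using only the Python standard library. It assumes a standard primary header made of 80-character cards in 2880-byte blocks; the function names `read_primary_header` and `stack_shape` are made up for this illustration, and the header is synthetic rather than taken from a real file.

```python
import io

BLOCK = 2880  # FITS files are organized in 2880-byte blocks
CARD = 80     # each header record ("card") is 80 ASCII characters


def read_primary_header(stream):
    """Return the primary header as a dict of keyword -> raw value string."""
    header = {}
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            raise ValueError("truncated FITS header")
        for i in range(0, BLOCK, CARD):
            card = block[i:i + CARD].decode("ascii")
            keyword = card[:8].strip()
            if keyword == "END":
                return header
            if card[8:10] == "= ":
                # drop an optional trailing "/ comment"
                # (string values containing "/" are not handled in this sketch)
                header[keyword] = card[10:].split("/")[0].strip()


def stack_shape(header):
    """Shape as (NAXISn, ..., NAXIS1), slowest-varying axis first."""
    naxis = int(header["NAXIS"])
    return tuple(int(header[f"NAXIS{n}"]) for n in range(naxis, 0, -1))


def _card(keyword, value):
    """Build one synthetic 80-byte header card (illustration only)."""
    return f"{keyword:<8}= {value:>20}".ljust(CARD).encode("ascii")


# synthetic header: a stack of 10 images, each 512 pixels wide and 256 high
cards = [_card("SIMPLE", "T"), _card("BITPIX", 16), _card("NAXIS", 3),
         _card("NAXIS1", 512), _card("NAXIS2", 256), _card("NAXIS3", 10),
         b"END".ljust(CARD)]
raw = b"".join(cards).ljust(BLOCK)  # pad the header to a full block with spaces

shape = stack_shape(read_primary_header(io.BytesIO(raw)))
print(shape)  # (10, 256, 512)
```

For real work, a dedicated FITS library is the better choice; this only shows where the stack dimensions live in the file.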

Recursive latexdiff

The Perl script “latexdiff” generates highlighted differences between two versions of a LaTeX document. Journals often ask for such a marked-up diff when a revised paper is submitted.

  • macOS: brew install latexdiff
  • Linux / WSL: apt install latexdiff

Recursive latexdiff.py processes all .tex files in a directory. This is useful for large multi-file projects such as a Ph.D. thesis or a journal article.
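
The idea can be sketched as follows. This is not the latexdiff.py script itself: it is a minimal illustration that assumes the old and new versions live in two directory trees with the same layout, and the function names `pair_tex_files` and `diff_tree` are hypothetical. It relies only on the documented behavior that `latexdiff old.tex new.tex` writes the marked-up document to standard output.

```python
import subprocess
from pathlib import Path


def pair_tex_files(old_dir, new_dir):
    """Yield (old, new) paths for every .tex file present in both trees."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    for new in sorted(new_dir.rglob("*.tex")):
        old = old_dir / new.relative_to(new_dir)
        if old.is_file():
            yield old, new


def diff_tree(old_dir, new_dir, out_dir):
    """Run latexdiff on each pair, writing the results under out_dir."""
    out_dir = Path(out_dir)
    for old, new in pair_tex_files(old_dir, new_dir):
        out = out_dir / new.relative_to(new_dir)
        out.parent.mkdir(parents=True, exist_ok=True)
        # latexdiff prints the marked-up LaTeX source to stdout
        with out.open("w") as f:
            subprocess.run(["latexdiff", str(old), str(new)],
                           stdout=f, check=True)
```

A call such as `diff_tree("thesis_v1", "thesis_v2", "thesis_diff")` (directory names are placeholders) would then leave one diff file per chapter, ready to compile.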

Platform-independent builds with CMake

A wide variety of programming languages are used by engineers and scientists. Tie them all together (C, C++, C#, CUDA, Fortran, etc.) in a platform-independent and simple way using CMake or Meson. These high-level build systems generate low-level build system backend files for Ninja (strongly recommended) or Make.

Assume a single-file C++ program that uses the Math library and Boost for flexible command-line input of numerous parameters. Turn on additional compiler warnings to help avoid common coding pitfalls. Consider an example CMakeLists.txt for this C++ project, line by line.

Language(s) selection:

project(zakharov CXX)

Naming a project facilitates packaging and installation. CXX is required to enable the hooks for the language(s) used. The most frequently used languages include

tag      language
C        C
CSharp   C#
CXX      C++
Fortran  Fortran

Languages that aren’t built into CMake such as Pascal can be added via custom CMake modules.

Compiler options:

if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
  add_compile_options(-Wall -Warray-bounds)
endif()

-Wall -Warray-bounds
turn on warnings for common programming mistakes
-fexceptions
more detailed debug info with no speed penalty; enabled by default on Clang.

find_package(Boost REQUIRED COMPONENTS filesystem program_options)

We use Boost:

filesystem
directory manipulation
program_options
advanced command-line parsing

add_executable(zakh zakh.cpp)
target_link_libraries(zakh PRIVATE Boost::filesystem Boost::program_options)
target_compile_features(zakh PRIVATE cxx_std_11)

This project requires C++11 features, so CMake will emit a configuration error for an old compiler that does not support C++11.

zakh
the executable created by the build; with the build/ directory used here, run it with ./build/zakh.
zakh.cpp
the source file(s) making up “zakh”

Compiling a simple project with CMake: It’s convenient to use a separate build directory, typically build/ under your main code directory. Say your main code directory is ~/code/zakharov; from that directory, run

# configure
cmake -B build/

# build program
cmake --build build/ --parallel

# run program
./build/zakh

After editing the code, rebuild and run by:

cmake --build build/ --parallel

./build/zakh

Normally you do not need to reconfigure CMake if just editing source code.
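
The configure, build and run cycle above can also be scripted. The following is a minimal sketch, not part of the original project: the helper names `cmake_commands` and `build_and_run` are hypothetical, and it assumes a single-configuration generator (Ninja or Make) so that the executable lands directly in the build directory.

```python
import subprocess
from pathlib import Path


def cmake_commands(source_dir, build_dir="build"):
    """Return the configure and build commands as argument lists."""
    source_dir = Path(source_dir)
    build = source_dir / build_dir
    return [
        ["cmake", "-S", str(source_dir), "-B", str(build)],  # configure
        ["cmake", "--build", str(build), "--parallel"],      # build
    ]


def build_and_run(source_dir, exe_name="zakh"):
    """Configure, build, then run the executable from the build directory."""
    for cmd in cmake_commands(source_dir):
        subprocess.run(cmd, check=True)
    # assumption: single-config generator, so no Debug/ or Release/ subfolder
    exe = Path(source_dir) / "build" / exe_name
    subprocess.run([str(exe)], check=True)
```

Calling `build_and_run(Path.home() / "code" / "zakharov")` would repeat the three steps. Re-running the configure step is harmless; CMake reuses its cache.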


CMake alternatives include Meson.


Use ** instead of pow in Python

In Python, x**y is much faster than pow(x, y), math.pow(x, y) or numpy.power(x, y) for scalars, as the benchmarks in this post show.

Julia is more than 5 times faster than Python at scalar exponentiation, while Go was in-between Python and Julia in performance.

Python

Benchmark results were the same whether the base or exponent was an integer or a float.

Python testing done with:

  • Python 3.7.4
  • IPython 7.8.0
  • NumPy 1.16.5

** operator

The ** operator in Python also has the advantage of returning an int if the inputs are int and the arithmetic result is an integer.
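
A short check of the return types, using only built-ins and the standard math module:

```python
import math

int_result = 2 ** 3           # int inputs, integer result: stays int
neg_result = 2 ** -1          # negative exponent: result is a float
builtin_result = pow(2, 3)    # built-in pow follows the same rules as **
math_result = math.pow(2, 3)  # math.pow always converts to float

print(type(int_result).__name__, int_result)          # int 8
print(type(neg_result).__name__, neg_result)          # float 0.5
print(type(builtin_result).__name__, builtin_result)  # int 8
print(type(math_result).__name__, math_result)        # float 8.0
```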

10**(-3)
8.22 ns ± 0.0182 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

pow(10, -3)
227 ns ± 0.313 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

math.pow(10, -3)
252 ns ± 1.56 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

numpy.power(10., -3)
1.5 µs ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

NumPy is known in general to be slower at scalar operations than Python-native operators and Python built-in math. But of course NumPy is generally a lot faster and easier for N-dimensional array operations.
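
To repeat the comparison outside IPython, a sketch with the standard timeit module is given here. It holds the base and exponent in variables, since CPython can fold an expression made only of literals, such as 10**(-3), into a constant at compile time. The absolute numbers include the overhead of the lambda call and will differ from the figures above; only the relative ordering is of interest.

```python
import math
import timeit

base, exponent = 10.0, -3  # variables, so the power is computed on every call

candidates = {
    "x**y": lambda: base ** exponent,
    "pow(x, y)": lambda: pow(base, exponent),
    "math.pow(x, y)": lambda: math.pow(base, exponent),
}

N = 100_000  # calls per repeat
timings = {}
for name, func in candidates.items():
    # take the best of 5 repeats to reduce noise from other processes
    best = min(timeit.repeat(func, number=N, repeat=5))
    timings[name] = best / N * 1e9  # nanoseconds per call
    print(f"{name:>15}: {timings[name]:.1f} ns per call")
```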

Julia

Julia 1.2.0 was likewise benchmarked under power/ for reference on the same computer.

First we installed Julia BenchmarkTools:

import Pkg
Pkg.add("BenchmarkTools")

As with Python, the Julia wall-clock time for exponentiation was the same for float and int:

3.399 nanoseconds

Go

Go 1.13.1 was benchmarked under power/:

go test -bench=Power
BenchmarkPower-12       33883672                31.8 ns/op

go benchmark reference

Python flatten list of lists into list

For scenarios where a function outputs a list, and that function is in a for loop or asyncio event loop, the final output will be a list of lists, like:

x = [[1,2,3], [4, 5, 6]]

This may be inconvenient for applications where a flattened list is required. The simplest and fastest way to flatten a list of lists is:

import itertools

x = [[1,2,3], [4, 5, 6]]

x_flat = list(itertools.chain(*x))

which results in

[1, 2, 3, 4, 5, 6]
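
Two equivalent spellings are sketched here for comparison. `itertools.chain.from_iterable` takes the outer list directly instead of unpacking it into arguments, and a nested list comprehension needs no import. Both flatten exactly one level of nesting.

```python
import itertools

x = [[1, 2, 3], [4, 5, 6]]

# pass the outer iterable itself, no argument unpacking needed
flat_chain = list(itertools.chain.from_iterable(x))

# nested comprehension: outer loop first, then inner loop
flat_comp = [item for sub in x for item in sub]

print(flat_chain)  # [1, 2, 3, 4, 5, 6]
print(flat_comp)   # [1, 2, 3, 4, 5, 6]
```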

GNSS data abbreviations

Some of the most fundamental GNSS measurements retrieved from GNSS receivers include:

  • CNR: carrier-to-noise density ratio, C/N0. The estimation technique varies between receivers. Typical values are in the range 30 to 50 dB-Hz.
  • PSR: pseudorange [meters]
  • ADR: accumulated Doppler range, the carrier phase measurement [cycles]

Using networks of GNSS receivers along with appropriate post-processing techniques, estimated maps of vertical TEC (total electron content, the integrated electron density) can be derived.
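
The units of these measurements can be related with a couple of one-line conversions. This is a minimal sketch with hypothetical function names; it assumes the GPS L1 carrier (1575.42 MHz) as the default frequency. The sign convention of ADR differs between receivers and is not handled here.

```python
C = 299_792_458.0  # speed of light [m/s]
F_L1 = 1575.42e6   # GPS L1 carrier frequency [Hz]


def adr_cycles_to_meters(cycles, freq_hz=F_L1):
    """Convert carrier phase in cycles to a range in meters."""
    wavelength = C / freq_hz  # about 0.19 m at L1
    return cycles * wavelength


def cn0_dbhz_to_linear(cn0_dbhz):
    """Convert C/N0 from dB-Hz to a linear ratio in Hz."""
    return 10 ** (cn0_dbhz / 10)


print(f"{adr_cycles_to_meters(1.0):.4f} m per L1 cycle")
print(f"{cn0_dbhz_to_linear(40.0):.0f} Hz for 40 dB-Hz")
```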

Reference

Numpy / OpenCV image BGR to RGB

Conversion between any/all of BGR, RGB, and GBR may be necessary when working with

  • Matplotlib pyplot.imshow(): M x N x 3 image, where last dimension is RGB.
  • OpenCV imshow(): M x N x 3 image, where last dimension is BGR
  • Scientific cameras: some output an M x N x 3 image, where the last dimension is GBR

Note: as in any programming language, operations on memory-contiguous arrays are most efficient. In particular, OpenCV in-place operations require a contiguous array from Python to avoid unexpected results. The safest approach is to always make a copy of the array as in the examples below.

Use .copy() to avoid unexpected results if using OpenCV. If just using Matplotlib, .copy() is not necessary, but performance (speed) may benefit from it.


BGR to RGB: OpenCV image to Matplotlib

rgb = bgr[...,::-1].copy()

RGB to BGR: Matplotlib image to OpenCV

bgr = rgb[...,::-1].copy()

RGB to GBR:

gbr = rgb[...,[1,2,0]].copy()

The axis order convention for Python images, with M rows (height) and N columns (width):

  • RGB: M x N x 3, where the last axis is color
  • RGBA: M x N x 4, where the fourth element of the last axis is the alpha channel
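
The channel reordering can be checked on a tiny array. This sketch assumes NumPy is installed and uses a made-up 1 x 2 image in which every pixel is (R, G, B) = (1, 2, 3), so the printed values show directly which channel ended up where. Indices 1, 2, 0 of an RGB array select G, B, R in that order.

```python
import numpy as np

# tiny 1 x 2 "image"; each pixel stores (R, G, B) = (1, 2, 3)
rgb = np.array([[[1, 2, 3], [1, 2, 3]]], dtype=np.uint8)

view = rgb[..., ::-1]             # reversed channel axis, still a view
bgr = view.copy()                 # contiguous BGR copy, safe for OpenCV
gbr = rgb[..., [1, 2, 0]].copy()  # pick G, B, R in that order

print(bgr[0, 0])  # [3 2 1]
print(gbr[0, 0])  # [2 3 1]
print(view.flags["C_CONTIGUOUS"], bgr.flags["C_CONTIGUOUS"])  # False True
```

The last line shows why .copy() matters: the reversed slice is a non-contiguous view of the original memory, while the copy is a contiguous array.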


CMake builds for modern C++

Non-standard language options and incomplete feature support are common among compilers for virtually all programming languages, from BASIC to Fortran to C++. Modern C++ features typically require specific compiler flags, and knowing which flags to set can be confusing for those new to these features. CMake sets the needed C++ compiler flags automatically when the required language standard is declared for a target:

add_executable(filesep_cpp filesep.cpp)
target_compile_features(filesep_cpp PRIVATE cxx_std_17)

C++ fstream allows writing files to disk. Some operations need to manage directory slashes (Windows vs. POSIX). C++ std::filesystem::path::preferred_separator manages platform-agnostic path separators. Akin to Python pathlib, use std::filesystem::path. C++ filesystem works on almost all current C++ compilers.

#include <filesystem>
#include <iostream>

int main() {
    std::cout << std::filesystem::path::preferred_separator << "\n";
    return 0;
}
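
For comparison, the Python pathlib counterpart mentioned above is sketched here. It uses only the standard library; the file names are placeholders. The "pure" path classes make it possible to see both separator styles on any operating system.

```python
import os
from pathlib import Path, PurePosixPath, PureWindowsPath

print(repr(os.sep))  # separator of the platform running this script

# Path joins components with the separator of the current platform
native = Path("data") / "run1" / "out.txt"
print(native)

# pure paths format for a given platform without touching the disk
win = PureWindowsPath("data", "run1", "out.txt")
posix = PurePosixPath("data", "run1", "out.txt")
print(win)    # data\run1\out.txt
print(posix)  # data/run1/out.txt
```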

brew-like Scoop for Windows

Note: Winget might be preferred for Windows packages.


Scoop brings easy installation of the developer programs in its package list to Microsoft Windows, with commands like scoop install gcc. Scoop works from a fresh Windows install, for example a free Windows virtual machine image.

Install Scoop, then install Git via Scoop, so that Scoop can update its recipes:

scoop install git

Common development tools:

  • gcc / gfortran: scoop install gcc
  • make / cmake: scoop install make cmake
  • clang / LLVM: scoop install clang
  • GNU Octave: scoop install octave

From time to time, run scoop update gcc or similar to update individual packages.

Scoop quick start

One-click RDP + SSH Linux to remote Windows PC

This assumes a Linux laptop connecting to a remote Windows PC such as:

Remote PC IP   Remote PC SSH port       Remote PC RDP port
1.2.3.4        22 (open TCP firewall)   3389 (blocked by remote PC firewall)

On the Linux laptop:

apt install freerdp2-x11

Create executable script myrdp.sh on the Linux laptop:

#!/bin/sh

ssh -f -L 4389:localhost:3389 remoteusername@1.2.3.4 sleep 1;

xfreerdp /v:localhost:4389

Running that script on the Linux laptop connects to the Windows computer using RDP over SSH.

Notes

  • Advanced freerdp configuration (e.g. limited bandwidth)
  • xfreerdp command line options
  • If freerdp2-wayland on Wayland doesn’t work, try freerdp2-x11.

Related: Windows to Windows SSH / RDP