Scientific Computing

sed one-liners to clean blanks

Using sed one-liners, recursively clean text files of trailing whitespace and of blank lines at the end of the file.

ℹ️ Note

Ensure the globbing pattern matches only the expected text files. Using just “*” might destroy unwanted matches such as PDF files.

The script below is used like:

./clean.sh ~/my_site "*.md"

clean.sh contains:

#!/usr/bin/env bash

set -o errexit

loc=$1
pat=$2

find "$loc" -not -path "*/.git*" -type f -name "$pat" -execdir sed --in-place 's/[[:space:]]\+$//' {} \+ -execdir sed --in-place -e :a -e '/^\n*$/{$d;N;};/\n$/ba' {} \+

Note that each “-execdir” command is separate. Add more commands or take out what is unwanted.

Use cases include keeping files “Git clean” of trailing spaces and extra lines at end of file. Matlab editor doesn’t autoclean these lines, so use this script for “*.m” files.
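
To see what the two sed expressions do before running them over a whole tree, here is a minimal sketch that applies them to a scratch file. It assumes GNU sed (for --in-place and \+). The function name clean_file and the sample text are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: same sed expressions as clean.sh, applied to one scratch file.
# Assumes GNU sed. "clean_file" is a hypothetical helper name.
set -o errexit

clean_file() {
  # strip trailing whitespace from every line
  sed --in-place 's/[[:space:]]\+$//' "$1"
  # delete blank lines at the end of the file
  sed --in-place -e :a -e '/^\n*$/{$d;N;};/\n$/ba' "$1"
}

demo=$(mktemp)
# two lines with trailing blanks, one interior blank line, two blank lines at end of file
printf 'alpha  \n\nbeta\t\n\n\n' > "$demo"

clean_file "$demo"
cat "$demo"
```

The interior blank line is kept. Only the trailing whitespace and the blank lines at the end of the file are removed, so the scratch file ends up as three lines: alpha, an empty line, and beta.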

Windows SSH server

OpenSSH client and server are built into Windows. The setup procedure is easier than using Cygwin. RDP (Remote Desktop) over SSH can be significantly more secure than RDP alone, assuming SSH is well configured.

Enable OpenSSH Server in Windows Settings → Apps → Apps & features → Optional features → Add a feature → OpenSSH Server. This also sets Windows Firewall to allow inbound SSH TCP connections.

Edit “$Env:ProgramData/ssh/sshd_config” on the OpenSSH server PC. At least set PasswordAuthentication no to require SSH public key for better security.

A minimal SSH keypair can be created for the SSH client by:

ssh-keygen -t ed25519 -f ~/.ssh/my_server

Copy the contents of client laptop file ~/.ssh/my_server.pub to the Windows SSH server computer, creating or adding a line to file ~/.ssh/authorized_keys. The location of this file is defined in sshd_config as AuthorizedKeysFile. Use a unique key for each connecting client–do not reuse SSH keypairs between servers or clients.

If the user is a Windows Administrator on the OpenSSH server computer, add the SSH public key to file “$Env:ProgramData/ssh/administrators_authorized_keys”

Start the SSH server (for this session only) from PowerShell:

Start-Service sshd

To always start OpenSSH on boot, type services.msc and in Properties of OpenSSH server → General set “Startup Type: Automatic”

As on Linux, the “authorized_keys” file must have the correct file permissions ACL. Run this PowerShell script:

The SSH client should be able to connect to the SSH server. If this doesn’t work, try using SSH locally on the OpenSSH server computer to troubleshoot.

To use RDP (remote desktop) over SSH do this one-step setup

Tips:

  • Edit text files from Windows console over SSH in the Terminal by using WSL:

    wsl

    then enter commands like nano foo.txt just like in Linux as it’s the WSL shell.

  • Change the default SSH shell. Assuming you have PowerShell 7 on the SSH server, the commands would be like (from pwsh PowerShell):

    New-ItemProperty -Path "HKLM:\SOFTWARE\OpenSSH" -Name DefaultShell -Value "$Env:ProgramFiles\PowerShell\7\pwsh.exe" -PropertyType String -Force

mpi_f08 Fortran interface

Fortran MPI programs should use the Fortran mpi_f08 interface:

use mpi_f08

Intel MPI supports Fortran mpi_f08 including on Windows using free Intel oneAPI compiler.

MPI constants like mpi_comm_world and mpi_real are of Fortran derived types: type(MPI_Comm) and type(MPI_Datatype) respectively.

For legacy user programs, if needed, access the MPI legacy integer handle via the %mpi_val component.

use mpi_f08

integer :: comm

comm = mpi_comm_world%mpi_val
!! %mpi_val holds the legacy integer handle
!! assigned at run time, since mpi_comm_world need not be a compile-time constant

Fortran MPI examples

Too much data that is still not enough

This example uses the aurora, which is produced around most planetary bodies due to energetic particle kinetics as the particles penetrate the ionosphere. Optical instruments such as cameras give a line integrated measurement for each pixel (angle) of the imagers. This data can be useful for tomographic techniques when the location and orientation of each camera are well known and multiple cameras with overlapping fields of view exist.

However, this rich data can be greatly supplemented and even superseded by other instruments, especially incoherent scatter radar (ISR), where 3-D + time data are available due to volume integrated target returns. Many analyses rely on those thin (~ 0.5 degree FWHM) radar beams to complete an analysis. We rarely know the needed orientation of the radar beams beforehand, and many ISR cannot change the location of their pre-programmed beams, although as AESA (active electronically scanned arrays) they can steer almost instantaneously within the limits of the radar backend processor.

This is just a geospace example of too much data, but not enough to gauge individual analyses without additional processing techniques.

MINGWROOT environment variable

By convention, the environment variable MINGWROOT gives the path to MinGW64 (the directory just above bin/, lib/, include/).

  • MSYS2: MINGWROOT=%SYSTEMDRIVE%\msys64\mingw64

This variable may be needed to modify the GNU Octave PATH on Windows when using “system()” calls with executables compiled by MinGW. A similar issue exists on Windows with Matlab and Parallel Computing Toolbox, which provides its own mpiexec.

We made a function to work around these issues.

Eliminating non-https external links

With a website / blog having thousands of pages and many thousands of external links, it is impractical to check external outbound link quality with any regularity. Informal link checks revealed that non-https:// websites had a substantially higher chance of becoming a defunct site that gets snapped up by spammers and scammers. To help mitigate some of the risk of websites going to unintended destinations, we decided to eliminate almost all non-https external links.

An increasing number of undesired websites are enabling https, both to improve SEO and to trick visitors. However, anecdotally, the external links we’ve seen go bad have so far more rarely been https:// URLs. We have seen https:// sites be replaced by undesired content, but what often happens is that the spammer doesn’t bother to set up the certificates correctly, so either the website won’t load if HSTS was used, or there are prominent warnings that the user has to click through.

There’s nothing to stop spammers from correctly setting certificates, but we feel https-only external links currently afford a meaningful benefit.

WSL2 date time skew error

WSL2 (including with Windows 10 version 2004, 20H1) is known to let the WSL clock fall hours or days behind actual time after the computer is suspended. This issue was not seen in WSL1, but became apparent almost immediately after upgrading to WSL2, and multiple people have reported it. The clock skew causes errors with build systems (including GNU Make and Ninja) and SSL verification, among others.

A workaround, which has to be repeated each time the skew occurs, is to synchronize the software clock to the onboard hardware clock from WSL Terminal:

hwclock -s

or if suitable from Windows Terminal:

wsl --shutdown

If that doesn’t work, try using NTP from WSL Terminal:

ntpdate time.windows.com

This issue has been noted at WSL GitHub Issues:

Other issues are linked from those

Fix BibTeX error with .bbl file

A syntax error in a .bib BibTeX bibliography file can cause cryptic errors that persist even after the .bib syntax is corrected. The fix is often to delete the auto-generated files and rebuild.

Example: top-level LaTeX file “main.tex”. The compilation generates main.bbl and main.aux among several others. Try:

rm main.aux main.bbl

pdflatex main
bibtex main
pdflatex main
pdflatex main

Saving / resurrecting old Fortran programs

A general rule for resurrecting old code in any language is first try to get it working for known input/output via emulation or virtual machine. Many old codes were made before linters and other correctness checks we’ve long taken for granted. The codes may use non-standard tricks no one has thought about for years to squeeze tiny fractions of efficiency that haven’t been significant for decades.

After correct operation is established, Fortran 66 code often needs a few tweaks to work with modern compilers. The Fortran Wiki has an excellent article on modernizing Fortran syntax.

Always make a copy of the file first. Sometimes 10,000+ lines will match so it can be tedious to check. Similar operations may be needed if a mix of tabs and spaces are used.

To move line numbers left, run this regex and replace with null (nothing):

^\s+(?=\d+\s)

To make code start in column 7, recall that fixed format (Fortran <= 77) code has line numbers in columns 1-5 of each line. After the prior operation, push the actual code right of column 6 by replacing this regex with the appropriate number of spaces:

(?<=^\d+\s+)
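
The second regex uses a variable-length lookbehind, which some regex engines reject. As an alternative sketch (a swapped-in technique, not the editor search-and-replace described above), the same two steps can be done from a shell with sed and awk. The function name normalize_labels and the sample lines are made up for illustration. It assumes each line number consists only of digits followed by whitespace, and it leaves lines without a line number untouched.

```shell
#!/usr/bin/env bash
# Sketch only: "normalize_labels" is a hypothetical helper, not part of any tool.
set -o errexit

normalize_labels() {
  # step 1: move a leading line number to column 1
  sed -E 's/^[[:space:]]+([0-9]+[[:space:]])/\1/' |
  # step 2: pad the line number to six columns so the code starts in column 7
  awk '{
    if (match($0, /^[0-9]+[ \t]+/)) {
      label = $1                       # the line number itself
      code = substr($0, RLENGTH + 1)   # everything after the number and its whitespace
      printf "%-6s%s\n", label, code
    } else {
      print                            # no line number: leave the line unchanged
    }
  }'
}

# made-up sample lines: indented number, unnumbered code, number with a short gap
printf '   10 x = 1\n      y = 2\n 200  continue\n' | normalize_labels
```

With the sample input, the lines numbered 10 and 200 come out with the number in column 1 and the code starting in column 7, while the unnumbered line is passed through as-is. As with the regex approach, work on a copy of the file.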

Other compatibility notes:

  • $ can be used like a semicolon in CDC Cyber Fortran
  • if line numbers are duplicated between procedures, put the procedures in separate files.

Delete Git tag / release version

Deleting or updating a GitHub or GitLab release leaves the Git tag, which must also be deleted to reissue a corrected release.

  1. delete the Git tag locally. Here we assume the release tag was v1.2.3

    git tag -d v1.2.3
  2. delete the Git tag on remote site

    git push -d origin v1.2.3

    This puts the release notes from v1.2.3 into “draft” mode–hidden from the public.

  3. Make the necessary changes, then do the release again, reusing the draft release notes.
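
The whole cycle, including re-creating the tag in step 3, can be rehearsed safely in a throwaway repository. This is a minimal sketch: “origin” is a local bare repository standing in for the hosting service, the commit messages and identity are placeholders, and the draft release notes behavior of a real hosting service is not reproduced.

```shell
#!/usr/bin/env bash
# Sketch only: rehearse delete-and-retag in scratch repositories under a temp dir.
set -o errexit

work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"   # stand-in for the remote site
git init --quiet "$work/local"
cd "$work/local"
git remote add origin "$work/remote.git"

# placeholder identity passed inline so the sketch doesn't rely on global Git config
commit() {
  git -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "$1"
}

commit "release with a mistake"
git tag v1.2.3
git push --quiet origin v1.2.3

# steps 1 and 2: delete the tag locally and on the remote
git tag -d v1.2.3
git push --quiet -d origin v1.2.3

# step 3: make the fix, then tag the corrected commit and push the tag again
commit "corrected release"
git tag v1.2.3
git push --quiet origin v1.2.3

# the remote again has exactly one v1.2.3 tag, now pointing at the corrected commit
git ls-remote --tags origin
```

Anyone who already fetched the old tag keeps it locally until they delete it themselves, so reissuing a tag is best reserved for releases that were only briefly public.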

Each Git hosting service has distinct approaches to releases and tags: