Scientific Computing

Fix Git line endings on Windows + Cygwin or WSL

When using Git on Windows with Cygwin or Windows Subsystem for Linux (WSL), CRLF conflicts can make a clean Git repo falsely appear dirty. When line endings clash, “git diff” from Cygwin or WSL shows ^M at the end of each line, and “git pull” fails to merge. This can cause missed code changes or needless commits. We suggest forcing LF line endings regardless of the user’s environment. Even Windows Notepad supports LF line endings.

git config --global core.autocrlf input

git config --global core.eol lf

This tells Git to convert CRLF to LF (\n) when committing files, so the repository always stores LF line endings.
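To see which files in a working tree still contain CRLF before committing, a small check can help. The following is a minimal sketch and not part of Git; the function names has_crlf, to_lf and find_crlf_files, and the default "*.txt" pattern, are illustrative assumptions.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the byte string contains a Windows CRLF line ending."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving lone CR bytes untouched."""
    return data.replace(b"\r\n", b"\n")


def find_crlf_files(root: Path, pattern: str = "*.txt") -> list[Path]:
    """List files under root matching pattern that contain CRLF."""
    return sorted(
        p for p in root.rglob(pattern) if p.is_file() and has_crlf(p.read_bytes())
    )
```

Files are read as bytes on purpose: opening them in text mode would let Python translate the line endings and hide the CRLF.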

To disregard line endings for diff and patch:

diff -Naur --strip-trailing-cr old.txt new.txt
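The same idea can be sketched in Python with the standard difflib module. This is only an illustration of ignoring a trailing carriage return when comparing, not a replacement for diff; the function name diff_ignoring_cr is an assumption.

```python
import difflib


def diff_ignoring_cr(
    old: str, new: str, old_name: str = "old.txt", new_name: str = "new.txt"
) -> list[str]:
    """Unified diff of two texts, ignoring one trailing CR on each line."""
    old_lines = [line.removesuffix("\r") for line in old.split("\n")]
    new_lines = [line.removesuffix("\r") for line in new.split("\n")]
    return list(
        difflib.unified_diff(old_lines, new_lines, old_name, new_name, lineterm="")
    )
```

When reading the two files, pass newline="" to open() so that Python keeps the original line endings instead of translating them. str.removesuffix needs Python 3.9 or newer.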

Fix corrupt UTF8 files with Python

I sometimes find that files included in Python projects, for example Fortran source files, contain corrupted characters that are not valid UTF-8. Maybe it’s a case of bad OCR, which also plagues LaTeX / BibTeX references copied and pasted from journal websites. Thus, this method also applies to BibTeX files.

Pure Python script find_bad_characters.py recursively:

  1. finds such corrupt files
  2. removes the corrupted characters
  3. backs up original file and overwrites if desired
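The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual find_bad_characters.py; the function names, the "*.f90" default pattern and the ".bak" backup suffix are hypothetical.

```python
import shutil
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def strip_invalid_utf8(data: bytes) -> bytes:
    """Drop byte sequences that cannot be decoded as UTF-8."""
    return data.decode("utf-8", errors="ignore").encode("utf-8")


def clean_tree(root: Path, pattern: str = "*.f90", overwrite: bool = False) -> list[Path]:
    """Return files under root with invalid UTF-8; optionally back up and clean them."""
    bad = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if is_valid_utf8(data):
            continue
        bad.append(path)
        if overwrite:
            # keep the original next to the cleaned file (assumed .bak suffix)
            shutil.copy2(path, path.with_name(path.name + ".bak"))
            path.write_bytes(strip_invalid_utf8(data))
    return bad
```

With overwrite=False the function only reports the affected files, which is a safer first pass before modifying anything.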

Install Xrdp for VNC via Windows Remote Desktop

xrdp creates an RDP server on remote Linux PCs.

RDP client on laptop:

  • Windows: factory installed
  • macOS: RDP client
  • Linux: FreeRDP, which provides the xfreerdp command (Debian/Ubuntu package freerdp2-x11 or freerdp3-x11, depending on release)

Set up the Xrdp server on the remote Linux PC by installing Xrdp and the Openbox desktop:

apt install xrdp openbox

Create ~/.xsession containing

exec openbox-session

Restart xrdp to apply the new configuration:

service xrdp restart
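Before opening the RDP client, it can help to confirm from the laptop that the server is reachable. The sketch below assumes xrdp listens on the standard RDP port, TCP 3389, and that the port was not changed in the xrdp configuration; the function name rdp_port_open is illustrative.

```python
import socket


def rdp_port_open(host: str, port: int = 3389, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, unreachable, timed out, or name lookup failed
        return False
```

A False result only says that the TCP connection failed; a firewall between the laptop and the remote PC is as likely a cause as xrdp not running.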

After entering the password at the Xrdp login, Openbox shows a plain grey screen. Right-click to open the menu. If the screen stays grey or black and no menu appears, try editing /etc/xrdp/startwm.sh on the remote PC:

#!/bin/sh

if [ -r /etc/default/locale ]; then
  . /etc/default/locale
  export LANG LANGUAGE
fi

exec openbox-session

Configuring MUMPS for reduced verbosity

By default, the Fortran MPI parallel sparse direct solver library MUMPS is extremely verbose, clogging the terminal or log file with perhaps hundreds of megabytes of text. Disable the log messages by setting ICNTL as in the following MUMPS example. Note: for MUMPS < 5.2, ICNTL(4) does not take effect; that was a known bug.

MUMPS can be installed on Linux systems like:

apt install libmumps-dev

or install MUMPS using CMake.


For ICNTL(1-4), setting the value ≤ 0 suppresses verbose messages. ICNTL(4) allows fine-grained setting of verbosity; see page 54 of the MUMPS User Manual.

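program test_mumps
use, intrinsic :: iso_fortran_env, only: output_unit, error_unit
implicit none (type, external)

include 'mpif.h'
include 'dmumps_struc.h'  ! per MUMPS manual

external :: mpi_init, mpi_finalize, dmumps

type (dMUMPS_STRUC) :: mumps_par
integer :: ierr

call mpi_init(ierr)

! initialize a MUMPS instance; job=-1 sets ICNTL to its defaults
mumps_par%COMM = MPI_COMM_WORLD
mumps_par%JOB = -1
mumps_par%SYM = 0
mumps_par%PAR = 1
call dmumps(mumps_par)

!! ICNTL must be set AFTER the first mumps call that had job=-1
mumps_par%ICNTL(1) = error_unit  ! error messages
mumps_par%ICNTL(2) = output_unit ! diagnostic, statistics, and warning messages
mumps_par%ICNTL(3) = output_unit ! global info, for the host (myid==0)
mumps_par%ICNTL(4) = 1           ! default is 2.  1 is less verbose

mumps_par%JOB = -2               ! release the MUMPS instance
call dmumps(mumps_par)
call mpi_finalize(ierr)

end program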

Open-source license for geospace

This is a brief description of how we choose software licenses for over 100 geospace software projects. The discussion is quite simplified to keep length short. The 2018 NAS report on open source for NASA provides a useful software license survey for geospace science.

One cost of copyleft licensing to the contributing author is possible opportunity cost. Will the other party simply reimplement what you did to avoid your copyleft, so that you lose out on collaboration / consulting? Would you actually have the resources and time to enforce the copyleft license?

A cost of permissive licenses is the lost possibility of licensing fees. If the code was developed for a grant or employer, the possibility of the contributing author actually getting those license fees may be small. Also the chance that anyone would even try to pay for a license rather than reimplement etc. is usually small for geospace programs.

Thus, except for large programs that would have real value to a large company or contractors thereof, we usually use MIT or Apache licenses. Permissive licenses such as MIT / BSD are among the most corporate-friendly and collaborator-friendly. Large companies use these programs and may be interested in consulting work. If the license were too restrictive (copyleft), we might not hear from these companies or possible collaborators. Types of programs for which we often use MIT / BSD permissive-style licenses:

  • one-off scripts that maybe are only useful to a handful of users
  • ~ 1000 line or so programs that didn’t take a lot of effort
  • program has significant public domain / permissive components

For geospace programs and libraries that need to be used by geospace agencies, we typically use Apache. If the license is non-permissive, they might not be allowed to use / modify / distribute the code freely and we might miss collaboration / consulting opportunities. The Apache license’s patent protection provisions read like those of GPLv3.

The Affero GPL is said to provide some cloud / SaaS protections in addition to those of GPLv3, on which it is based. However, there are also said to be “holes” in Affero GPL that have not yet been tested in court.

Copyleft licenses are often shunned by companies. Government agencies typically discourage and/or restrict copyleft usage. The trend from both industry and funding agencies is to support permissive open-source licenses instead of copyleft licenses.

Compile/install Python 3 on Raspberry Pi

Raspberry Pi OS includes Python 3. Here is how to compile the latest Python on the Raspberry Pi.

For apt-installed Python modules that access hardware like GPIO, use the system Python 3 at /usr/bin/python3.

Get prereqs on the ARM device:

apt install libffi-dev libbz2-dev liblzma-dev libsqlite3-dev libncurses5-dev libgdbm-dev zlib1g-dev libreadline-dev libssl-dev tk-dev build-essential libncursesw5-dev libc6-dev openssl git

Extract the latest Python source code

Configure (3 minutes on Raspberry Pi 2):

cd cpython-3.*

./configure --prefix=$HOME/.local --enable-optimizations

Build and install: this step takes 10-40 minutes, depending on Raspberry Pi model. Do not use sudo!

make -j -l 4
make install

Note: don’t omit -l 4, or the Pi will quickly be overwhelmed and the build will fail. This option tells make not to start new jobs while the load average is above 4. Without it, the load average can soar to 100+ (bad).

Add to ~/.profile:

export PATH=$HOME/.local/bin/:$PATH

then open a new Terminal. Check that “which python3” and “which pip3” refer to ~/.local/bin/ instead of /usr/bin. Don’t uninstall the system Python 3 at /usr/bin/python3, because system packages depend on it. The PATH set in ~/.profile above makes the shell find the new Python first.
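The same check can be done from Python itself. This is a small sketch assuming the install prefix is ~/.local as configured above; the helper name is_under is illustrative.

```python
import shutil
import sys
from pathlib import Path


def is_under(executable: str, prefix: Path) -> bool:
    """Return True if the executable path lies inside the prefix directory."""
    try:
        Path(executable).resolve().relative_to(prefix.resolve())
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    local = Path.home() / ".local"
    print("running interpreter:", sys.executable, sys.version.split()[0])
    for name in ("python3", "pip3"):
        found = shutil.which(name)
        if found is None:
            status = "missing"
        elif is_under(found, local):
            status = "ok"
        else:
            status = "not under ~/.local"
        print(f"{name}: {found} ({status})")
```

Run it with python3 from a new Terminal. If python3 is reported as not under ~/.local, the PATH line in ~/.profile has not taken effect yet.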



Setup Azure Pipelines for free open-source CI

Azure Pipelines is free for open source projects. Azure CI images exist for Linux, macOS and Windows. Several images run in parallel, and generally the instances are quite fast, including macOS.

The “Start Pipelines free with GitHub” option makes a one-time connection to GitHub. Azure can also connect to Git repos hosted elsewhere.

Each GitHub repo corresponds to an Azure Project. Many people don’t use all the extra features Azure provides for free, just the Azure Pipelines.

Create a top-level repo file “azure-pipelines.yml”. This can be edited from the Azure webpage, which is why Azure asks for per-repo GitHub write permissions. Do NOT name this file with the .yaml extension: Azure won’t detect it automatically, and you’ll have to manually reconfigure each project where you use .yaml.
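A quick local check can catch the .yaml naming mistake before pushing. The following is a minimal sketch; the function name check_pipeline_file and its return messages are illustrative assumptions, and it only looks at the top level of the repo.

```python
from pathlib import Path


def check_pipeline_file(repo: Path) -> str:
    """Report whether the repo top level has azure-pipelines.yml (not .yaml)."""
    if (repo / "azure-pipelines.yml").is_file():
        return "ok"
    if (repo / "azure-pipelines.yaml").is_file():
        return "rename azure-pipelines.yaml to azure-pipelines.yml"
    return "azure-pipelines.yml not found"
```

For example, check_pipeline_file(Path(".")) run from the repo root returns "ok" when the file is named as Azure expects.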