Scientific Computing

Create GitHub or GitLab hosted website

GitHub, GitLab, Bitbucket and similar services allow free, fast static websites under usage limits. Netlify is recommended for use with any static site generator (SSG). GitHub Pages can use GitHub Actions for Hugo. GitLab Pages may be used with any SSG.

GitHub Pages is noticeably easier to use than GitLab or Bitbucket Pages. GitLab runners are slow and the build quota can run out before the month’s end. Most users should start with GitHub Pages, whatever the size of the website. While we generally recommend Hugo over Jekyll, here’s how to set up a Jekyll website.

The Minimal Mistakes Jekyll template is one of numerous quick-loading Jekyll templates. Forget about AMP: Jekyll with Minimal Mistakes gives fast mobile browsing and good Google PageSpeed scores. This procedure is based on Linux (including Windows Subsystem for Linux).

Install prereqs:

apt install ruby-dev libssl-dev

gem update --system

Configure RubyGems to install without sudo, then install Jekyll and Bundler (without sudo):

gem install jekyll bundler

Download and extract latest Minimal Mistakes release. Install needed Gems:

mv minimal-mistakes username.github.io
cd username.github.io

bundle install

where username is your GitLab or GitHub username. On GitHub/GitLab, create a new blank repository username.github.io (for GitLab, username.gitlab.io).

Edit _config.yml, change the following lines to fit your needs: title, name, description, url, repository

Connect your new website to GitLab/GitHub (swapping gitlab for github as appropriate):

git init
git add .
git commit -am init
git remote add origin https://github.invalid/username/username.github.io
git push -u origin HEAD

Future edits will follow the usual

git commit -am foo
git push

The test page should be live at username.github.io. See the GitHub Pages docs for custom domains and advanced configs.

Tips

Files under _posts with filenames starting with the date appear on the site. Subfolders under _posts are transparently processed. This is useful to organize posts by year, for example, without affecting URL formatting.

Example filename: _posts/2018/2018-09-23-joes-big-vacation.md appears to the public with URL: https://username.github.io/joes-big-vacation
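The filename-to-URL mapping above can be sketched in Python. This is only an illustration, assuming the `permalink: /:title/` setting shown later in these tips; the function name `post_url` is hypothetical and not part of Jekyll.

```python
import re
from pathlib import PurePosixPath

# Jekyll post filenames follow the pattern YYYY-MM-DD-title.ext
POST_NAME = re.compile(r"^\d{4}-\d{2}-\d{2}-(?P<slug>.+)\.(md|markdown|html)$")


def post_url(post_path: str, base_url: str) -> str:
    """Return the public URL of a post, assuming permalink: /:title/"""
    # Only the filename matters: subfolders under _posts do not change the URL
    name = PurePosixPath(post_path).name
    match = POST_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Jekyll post filename: {name}")
    return f"{base_url.rstrip('/')}/{match['slug']}/"
```

For instance, `post_url("_posts/2018/2018-09-23-joes-big-vacation.md", "https://username.github.io")` drops both the year subfolder and the date prefix.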

Enable search icon in _config.yml with key:

search: true

This enables site-wide Lunr instant search as the user types. The search icon is at the upper right corner of the toolbar on top of every page/post. It’s much better/faster than Google-based search of your site! This instant as-you-type search scales well for sites with thousands of pages.

Edit static navigation buttons in _data/navigation.yml. To improve default formatting, copy/paste into _config.yml these lines (anywhere in file):

defaults:
  -
    scope:
      path: ""

    values:
      layout: "single"
      toc: true
      author_profile: false
      read_time: false
      comments: true
      share: true
      related: true

include: ["_pages"]

To configure the banner, add to index.html header (between three dashes) the lines:

header:
    overlay_color: "#000"
    overlay_filter: "0.5"
    overlay_image: /images/header.jpg
excerpt: "text overlaid on banner image"

Remove the author image by:

author_profile: false

Remove the category from permalinks, so that later changing a page’s category does not break your search engine results. In _config.yml:

permalink: /:title/

Control the number of posts per archive page in _config.yml:

paginate: 10 # amount of posts to show

Reference: Jekyll install reference

GitLab Pages vs. GitHub Pages

feature         GitLab   GitHub
site generator  any      any via GitHub Actions

  • GitHub Pages is substantially easier to set up and use, and can handle medium-sized websites getting several million hits per year
  • GitLab Pages has more features and flexibility for advanced users

GitLab Pages quick setup

Create a new GitLab Project named username.gitlab.io (put the GitLab username in for “username”). If you already have a GitHub Pages website, either import it from GitHub or copy the existing static website into the new project. If copying, clone the new project to the PC first.

On the GitLab project page, e.g. https://gitlab.com/username/username.gitlab.io, click “Set up CI” and create the .gitlab-ci.yml using the “Apply a GitLab CI YAML template” option. If coming from GitHub Pages use GitHub Actions for Hugo.

The site is now building as seen with the Pipelines tab of the website project. It takes about 3-4 minutes to install the gems for a Jekyll site, then 2-3 more minutes to complete the build depending on the size of the website. The public URL should be like username.gitlab.io.

If a custom domain was purchased, tie it to GitLab Pages under Project Settings → Pages by adding TWO new domains:

example.invalid
www.example.invalid

Transferring DNS can take the website down, so do this at low-traffic times. Once ready, set up or transfer DNS to GitLab. Suppose the domain is example.invalid, then set the DNS records to

example.invalid CNAME username.gitlab.io
www             CNAME username.gitlab.io

assuming the DNS provider supports CNAME flattening.

SSL Config

GitLab Pages used with, for example, Cloudflare works well to provide HTTPS for your custom domain name as per this procedure. With that procedure, enable SSL “Full (Strict)” security.

Free GitLab accounts have a monthly quota for build “pipeline” time. For a small to moderate size static website it should be enough. Save quota by canceling pipelines / runs for unneeded builds.

For frequently updated, medium sized websites (hundreds or thousands of pages) consider Netlify with GitLab Pages or GitHub Pages.

Both GitLab and GitHub allow the source files (e.g. Markdown) to be private for a public website. Consider a private website repo, otherwise Google may present search results from Markdown code before the actual webpage.

Notes

  • For larger or active websites use Netlify, or build on laptop or cloud service like Wercker with any static generator such as Hugo and push HTML to GitHub Pages
  • Useful Jekyll plugins that GitHub doesn’t allow include jekyll-archives (page per category/tag)
  • GitLab Pages from scratch

Netlify works well with GitHub or GitLab, adding speed and reliability among other benefits

Sphinx + Python on GitHub Pages / Jekyll

Sphinx works great with GitHub Pages. Sphinx requires one-time setup as described below. The URL will be like https://geospace-code.github.io/pymap3d/.

Install Sphinx in a separate environment, otherwise it may downgrade other packages. The environment needs its own Python so that pip installs into it:

conda create -n sphinx python

conda activate sphinx

pip install sphinx

Set up docs using Sphinx Quickstart:

sphinx-quickstart

Most defaults are fine, except:

autodoc: automatically insert docstrings from modules (y/n) [n]: y
mathjax: include math, rendered in the browser by MathJax (y/n) [n]: y
viewcode: include links to the source code of documented Python objects (y/n) [n]: y
githubpages: create .nojekyll file to publish the document on GitHub pages (y/n) [n]: y

Add to .gitignore

doctrees/
.buildinfo

Edit docs/Makefile to include

SOURCEDIR     = .
BUILDDIR      = .

Create empty docs/.nojekyll or else Jekyll will reject all directories starting with _, breaking the Sphinx docs.

Edit docs/index.rst to have entries like

.. automodule:: pymap3d
  :members:

.. automodule:: pymap3d.vincenty
  :members:

Create docs/index.html containing only

<html>
<head>
<meta http-equiv="refresh" content="0; url=html/index.html" />
</head>
<body></body>
</html>
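The two helper files described above (the empty docs/.nojekyll and the redirecting docs/index.html) could also be created by a short script. This is a minimal sketch, assuming the HTML is built into docs/html as configured above; the function name `prepare_docs` is hypothetical.

```python
from pathlib import Path

# Same redirect page as shown above, pointing at the Sphinx HTML output
REDIRECT_HTML = """<html>
<head>
<meta http-equiv="refresh" content="0; url=html/index.html" />
</head>
<body></body>
</html>
"""


def prepare_docs(docs_dir: Path) -> None:
    """Create an empty .nojekyll and the redirecting index.html in docs_dir."""
    docs_dir.mkdir(parents=True, exist_ok=True)
    # An empty .nojekyll stops Jekyll from skipping directories starting with _
    (docs_dir / ".nojekyll").touch()
    (docs_dir / "index.html").write_text(REDIRECT_HTML, encoding="utf-8")
```

Calling `prepare_docs(Path("docs"))` from the repository root is safe to repeat, since it only overwrites index.html with identical content.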

Add docs to branch

Select a branch to use for HTML docs under the repo settings page “GitHub Pages” section. Suppose we use branch “html-docs”:

git switch -c html-docs

git add docs/

git commit -am "add html docs"

git push -u origin html-docs

Related: easier to use pdoc Python autodoc generator

Specify shell script interpreter

In general it is not appropriate to assume the default shell is Bash. Using a generic script shebang:

#!/bin/sh

will either use the default shell or invoke legacy Bourne Shell 1980s compatibility mode. Either way, a shell script using the general #!/bin/sh may fail on other computers. To improve shell script robustness, specify a particular shell with the shebang. Popular shells besides Bash include Dash and Zsh, which is the macOS default. To have even better cross-platform robustness, consider using Python instead of shell scripts.

The shell that runs a script is selected by the shebang on the first line of the script, here “my_script.sh”. For example, to specify the Bash shell, put as the first line:

#!/usr/bin/env bash
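To audit which interpreter existing scripts request, the shebang can be read programmatically. The following is a small sketch in Python (in keeping with the suggestion to prefer Python for cross-platform robustness); the helper name `read_shebang` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_shebang(script: Path) -> Optional[str]:
    """Return the interpreter command from the first line, or None if there is no shebang."""
    with script.open("r", encoding="utf-8", errors="replace") as f:
        first_line = f.readline().rstrip("\r\n")
    if not first_line.startswith("#!"):
        return None
    # Drop the "#!" marker and surrounding spaces, keeping any arguments
    return first_line[2:].strip()
```

A result of `/bin/sh` flags a script whose behavior depends on whichever shell the system provides under that name.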

The user’s login shell is revealed by:

echo $SHELL

The $SHELL variable is not necessarily the shell currently running, for example if you have scripts changing the shell on interactive login. Other users may choose a different default shell.

To run a script in a specific shell, regardless of its shebang:

bash my_script.sh

To permanently change the user’s default shell, use chsh.

sed one-liners to clean blanks

Using sed one-liners, recursively strip trailing whitespace and trailing blank lines from text files.

ℹ️ Note

Ensure the globbing pattern matches only the expected text files. Using just “*” might corrupt binary files such as PDFs.

The script below is used like:

./clean.sh ~/my_site "*.md"

clean.sh contains:

#!/usr/bin/env bash

set -o errexit

loc=$1
pat=$2

find "$loc" -not -path "*/.git*" -type f -name "$pat" -execdir sed --in-place 's/[[:space:]]\+$//' {} \+ -execdir sed --in-place -e :a -e '/^\n*$/{$d;N;};/\n$/ba' {} \+

Note that each “-execdir” command is separate. Add more commands or take out what is unwanted.

Use cases include keeping files “Git clean” of trailing spaces and extra lines at end of file. Matlab editor doesn’t autoclean these lines, so use this script for “*.m” files.
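For systems without GNU sed, roughly the same cleanup can be sketched in Python. This is not a port of the script above, only an approximation under stated assumptions: files are UTF-8 text, only directories named exactly `.git` are skipped (the sed version skips any path matching `*/.git*`), every cleaned file ends with exactly one newline, and line endings are written in the platform’s default form. The names `clean_text` and `clean_tree` are hypothetical.

```python
from pathlib import Path


def clean_text(text: str) -> str:
    """Strip trailing whitespace on each line and blank lines at end of file."""
    lines = [line.rstrip() for line in text.splitlines()]
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


def clean_tree(root: Path, pattern: str) -> int:
    """Clean files matching pattern under root, skipping .git. Return number of files changed."""
    changed = 0
    for path in root.rglob(pattern):
        # Skip directories and anything inside a .git directory
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        old = path.read_text(encoding="utf-8")
        new = clean_text(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

Usage mirrors the shell script, e.g. `clean_tree(Path.home() / "my_site", "*.md")`. The same caution about the pattern applies: a pattern that matches binary files would fail on decoding or corrupt them.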

Windows SSH server

OpenSSH client and server are built into Windows. The setup procedure is easier than using Cygwin. RDP (Remote Desktop) over SSH can be significantly more secure than RDP alone, assuming SSH is well configured.

Enable OpenSSH Server in Windows Settings → Apps → Apps & features → Optional features → Add a feature → OpenSSH Server. This also sets Windows Firewall to allow inbound SSH TCP connections.

Edit “$Env:ProgramData/ssh/sshd_config” on the OpenSSH server PC. At least set PasswordAuthentication no to require SSH public key for better security.

A minimal SSH keypair can be created for the SSH client by:

ssh-keygen -t ed25519 -f ~/.ssh/my_server

Copy the contents of client laptop file ~/.ssh/my_server.pub to the Windows SSH server computer, creating or adding a line to file ~/.ssh/authorized_keys. The location of this file is defined in sshd_config as AuthorizedKeysFile. Use a unique key for each connecting client; do not reuse SSH keypairs between servers or clients.
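Adding the key line by hand is easy to get wrong (duplicate entries, or a key glued to the end of the previous line). A minimal sketch of the append step, run on the server once the .pub file has been copied there, might look like this. The function name `add_authorized_key` is hypothetical, and the sketch does not set file permissions, which still need to be correct as described below.

```python
from pathlib import Path


def add_authorized_key(pubkey_file: Path, authorized_keys: Path) -> bool:
    """Append the public key unless already present. Return True if it was added."""
    key = pubkey_file.read_text(encoding="utf-8").strip()
    content = ""
    if authorized_keys.is_file():
        content = authorized_keys.read_text(encoding="utf-8")
    if key in (line.strip() for line in content.splitlines()):
        return False
    authorized_keys.parent.mkdir(parents=True, exist_ok=True)
    with authorized_keys.open("a", encoding="utf-8") as f:
        # Make sure the new key starts on its own line
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(key + "\n")
    return True
```

For example, `add_authorized_key(Path("my_server.pub"), Path.home() / ".ssh" / "authorized_keys")`.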

If the user is a Windows Administrator on the OpenSSH server computer, add the SSH public key to file “$Env:ProgramData/ssh/administrators_authorized_keys”

Start the SSH server (for this session only) from PowerShell:

Start-Service sshd

To always start OpenSSH on boot, type services.msc and in Properties of OpenSSH server → General set “Startup Type: Automatic”

As on Linux, the “authorized_keys” file must have the correct file permissions ACL. Run this PowerShell script:

The SSH client should be able to connect to the SSH server. If this doesn’t work, try using SSH locally on the OpenSSH server computer to troubleshoot.

To use RDP (remote desktop) over SSH do this one-step setup

Tips:

Edit text files from Windows console over SSH in the Terminal by using WSL. Enter commands like nano foo.txt just like in Linux as it’s the WSL shell.

wsl

Change the default SSH shell. Assuming PowerShell on the SSH server, the commands would be like (from pwsh PowerShell):

New-ItemProperty -Path "HKLM:\SOFTWARE\OpenSSH" -Name DefaultShell -Value "$Env:ProgramFiles\PowerShell\7\pwsh.exe" -PropertyType String -Force

mpi_f08 Fortran interface

Fortran MPI programs should use the Fortran mpi_f08 interface:

use mpi_f08

Intel MPI supports Fortran mpi_f08, including on Windows using the free Intel oneAPI compiler.

MPI constants like mpi_comm_world and mpi_real are Fortran derived types.

If needed for legacy user programs, access the MPI legacy integer handle via the %mpi_val component.

use mpi_f08

integer :: comm

comm = mpi_comm_world%mpi_val
!! %mpi_val emits the legacy integer

Fortran MPI examples

Too much data that is still not enough

This example uses the aurora, which is produced around most planetary bodies due to energetic particle kinetics as the particles penetrate the ionosphere. Optical instruments such as cameras give a line-integrated measurement for each pixel (angle) of the imagers. This data can be useful for tomographic techniques when the location and orientation of each camera are well known and multiple cameras with overlapping fields of view exist.

However, this rich data can be greatly supplemented and even superseded by other instruments, especially incoherent scatter radar (ISR), where 3-D + time data are available due to volume-integrated target returns. Many analyses rely on those thin (~ 0.5 degree FWHM) radar beams to complete an analysis. We rarely know the needed orientation of the radar beams beforehand, and many ISR cannot change the location of their pre-programmed beams, although as AESA they can steer almost instantaneously within the radar backend processor limits.

This is just a geospace example of too much data, but not enough to gauge individual analyses without additional processing techniques.

MINGWROOT environment variable

By convention, the environment variable MINGWROOT tells the path to MinGW64 (just above bin/, lib/, include/)

  • MSYS2: MINGWROOT=%SYSTEMDRIVE%\msys64\mingw64

This variable may be needed to modify the GNU Octave PATH on Windows when using “system()” calls with executables compiled by MinGW. A similar issue exists on Windows with Matlab and Parallel Computing Toolbox, which provides its own mpiexec.

We made a function to work around these issues.
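The idea behind such a workaround can be illustrated in Python. This is a sketch only, not the function mentioned above: it assumes the fix amounts to putting MINGWROOT’s bin/ directory first on PATH before launching the MinGW-compiled executable, and the name `path_with_mingw` is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def path_with_mingw(env: Optional[Mapping[str, str]] = None) -> str:
    """Return a PATH string with MINGWROOT/bin prepended, if MINGWROOT is set."""
    env = os.environ if env is None else env
    path = env.get("PATH", "")
    root = env.get("MINGWROOT")
    if not root:
        return path
    mingw_bin = str(Path(root) / "bin")
    parts = [p for p in path.split(os.pathsep) if p]
    if mingw_bin in parts:
        # Already on PATH: leave the order untouched
        return path
    return os.pathsep.join([mingw_bin, *parts])
```

The result can be passed to a child process, e.g. `subprocess.run(cmd, env={**os.environ, "PATH": path_with_mingw()})`, without changing the PATH of the calling program.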

Eliminating non-https external links

With a website / blog having thousands of pages and many thousands of external links, it is impractical to check external outbound link quality with any regularity. Informal link checks revealed that non-https:// websites had a substantially higher chance of becoming a defunct site that gets snapped up by spammers and scammers. To help mitigate some of the risk of websites going to unintended destinations, we decided to eliminate almost all non-https external links.

An increasing number of undesired websites are enabling https both to improve SEO and trick visitors. However, enabling https adds friction for them, and anecdotally the external links we’ve seen go bad have so far more rarely been https:// URLs. We have seen https:// sites be replaced by undesired content, but what often happens is the spammer doesn’t bother to set up the certificates correctly, so either the website won’t load if HSTS was used, or there are prominent warnings that the user has to click through.

There’s nothing to stop spammers from correctly setting certificates, but we feel https-only external links currently afford a meaningful benefit.
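Finding the remaining non-https links across thousands of pages is a simple text search. A rough sketch follows, assuming the pages are available as Markdown or HTML text; the regular expression is a heuristic rather than a full URL parser, and the name `find_insecure_links` is hypothetical.

```python
import re
from typing import List

# Match http:// URLs up to whitespace or common Markdown/HTML delimiters
HTTP_URL = re.compile(r"http://[^\s<>\"')\]]+")


def find_insecure_links(text: str) -> List[str]:
    """Return the sorted, de-duplicated http:// URLs found in text."""
    return sorted(set(HTTP_URL.findall(text)))
```

Running this over each source file gives a list to review by hand, since some http:// links may be worth keeping or may have an https equivalent.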