Scientific Computing

git push to multiple sites

The October 2018 day-long GitHub outage led many to consider having a live backup of their Git repos. Multi-remote git push is intrinsic to Git itself. Thus, automatic multi-site pushes are easy to configure for an arbitrarily large number of backup sites (GitHub, GitLab, Bitbucket, Dropbox, …).

This article covers the typical case where the main Git repository is on GitHub, and a public backup is desired on GitLab or a similar site. The article assumes an existing, working GitHub repo username/myprog that is already cloned to your computer, and SSH keys registered with both GitHub and GitLab. For best security, use a different SSH key pair for each site.

Back up the GitHub repo username/myprog to GitLab with every git push automatically as in the following section.

GitHub with GitLab backup

  1. Check that the GitHub repo is set up normally on your computer:

    cd myprog
    
    git remote -v

    should be like:

    origin	https://github.invalid/username/myprog (fetch)
    origin	ssh://github.invalid/username/myprog (push)
  2. Create a new repo username/myprog on GitLab or other backup site

  3. Add the GitLab (or other) backup site:

    git remote set-url origin --push --add ssh://github.invalid/username/myprog
    git remote set-url origin --push --add ssh://gitlab.invalid/username/myprog
  4. Verify that BOTH push sites are there (in this example, GitHub and GitLab):

    git remote -v

    should be like:

    origin	https://github.invalid/username/myprog (fetch)
    origin	ssh://gitlab.invalid/username/myprog (push)
    origin	ssh://github.invalid/username/myprog (push)

When pushing, you will see multiple Everything up-to-date – one for each site you’re pushing to.
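The verification in step 4 can also be scripted. The following is a minimal sketch, assuming the `git remote -v` output has the three-column form shown above; the function name `push_urls` and the sample text are illustrative only, not part of Git.

```python
from urllib.parse import urlsplit


def push_urls(remote_v_output, remote="origin"):
    """Return the push URLs listed for `remote` in `git remote -v` output."""
    urls = []
    for line in remote_v_output.splitlines():
        parts = line.split()
        # Each line looks like: <name> <url> (fetch) or <name> <url> (push)
        if len(parts) == 3 and parts[0] == remote and parts[2] == "(push)":
            urls.append(parts[1])
    return urls


# Sample text copied from the expected output in step 4.
sample = (
    "origin\thttps://github.invalid/username/myprog (fetch)\n"
    "origin\tssh://gitlab.invalid/username/myprog (push)\n"
    "origin\tssh://github.invalid/username/myprog (push)\n"
)

hosts = sorted(urlsplit(url).hostname for url in push_urls(sample))
print(hosts)
```

To check a real clone, the text could instead come from `subprocess.run(["git", "remote", "-v"], capture_output=True, text=True).stdout`, run inside the repo directory.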

Multiple forks of GitLab repos

GitLab currently does not have a direct means to fork the same repo multiple times. Users may wish to fork a repo multiple times to develop separate features and merge request each feature separately.

Multiple GitLab repo forks are possible by pasting a simple URL into the web browser as follows:

For main repo at https://gitlab.com/otherusername/repo:

  1. create a New Group with arbitrary name
  2. create a fork into this new group by visiting https://gitlab.com/otherusername/repo/forks/new.
  3. rename this fork, setting the repo URL to be distinct from the original repo name, perhaps repo_feature1.
  4. create an unlimited number of forks of the original repo by repeating steps 1, 2 and 3 for each fork.

Note

For self-hosted GitLab installations (such as https://gitlab.kitware.com) this workaround still requires creating a group, which might not be permitted for every user. In such a situation one might have to do things manually (and ask GitLab to make this simple and very necessary feature addition).

Fortran polymorphism with CMake

Fortran polymorphism and generic programming (sets, lists, generators) are consistently the highest ranked survey feature requests for the next Fortran standard. Fortran programmers introduce polymorphic procedures and variables into modern Fortran by the following methods.

  • C preprocessor: #ifdef and similar directives are the simplest method, and are widely supported and used. By convention, the C preprocessor is invoked when the source file suffix is capitalized, like .F90. Build systems like CMake and Meson introspect Fortran source code, so it’s important to use an uppercase filename suffix if a Fortran source file needs preprocessing.

  • Fortran 2003 static procedure polymorphism: simple to use, and a standard Fortran language feature.

  • Fortran derived types with duplicated procedures for every desired type/kind: more verbose to use, but the most powerful and flexible true Fortran polymorphism.

The preprocessor method might be thought of as compile-time polymorphism. It’s not a perfect solution, and not true polymorphism since each procedure still requires exactly one type/kind per argument. However, this technique combined with static polymorphism is simple to develop and handles many real-life use cases quickly and easily.

Example: compile-time Fortran polymorphic REAL

For each REAL variable and function, make kind=wp. For example polyreal.F90 (notice the capital F90):

program demo
use, intrinsic:: iso_fortran_env
implicit none (type, external)

#if REALBITS==32
integer,parameter :: wp=real32
#elif REALBITS==64
integer,parameter :: wp=real64
#elif REALBITS==128
integer,parameter :: wp=real128
#endif

real(wp) :: pi,b
integer :: i

pi = 4._wp * atan(1._wp)

b = timestwo(pi)

print *,'pi',pi,'2pi',b

contains

elemental real(wp) function timestwo(a) result(b)

real(wp), intent(in) :: a

b = 2*a

end function timestwo

end program

Provide a command-line option, given as -Drealbits=64 or -Drealbits=32 etc., in CMakeLists.txt:

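cmake_minimum_required(VERSION 3.13)
project(realpoly LANGUAGES Fortran)

# default to real64 unless -Drealbits=... is given on the command line
if(NOT realbits)
  set(realbits 64)
endif()

# your modules and programs
# uppercase .F90 suffix so that the source file is preprocessed
add_executable(poly polyreal.F90)
target_compile_definitions(poly PRIVATE REALBITS=${realbits})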

We typically have a comm.f90 that contains various constants including wp used throughout a program.

Generate, then build as usual:

cmake -Drealbits=64 -B build
cmake --build build

Running the resulting poly executable prints:

pi 3.1415926535897931 2pi 6.2831853071795862

That uses double-precision real64 variables and functions. The concept is trivially extensible to large programs consisting of many files and modules.

To then select a different kind and rerun, perhaps to evaluate accuracy vs. runtime tradeoffs (real32 is generally faster than real64, but less accurate):

cmake -Drealbits=32 -B build

pi 3.14159274 2pi 6.28318548

or for quad-precision Fortran real128:

cmake -Drealbits=128 -B build

pi 3.14159265358979323846264338327950280 2pi 6.28318530717958647692528676655900559
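The accuracy difference between kinds can be cross-checked outside Fortran. This is a rough sketch, assuming IEEE 754 binary32 and binary64 correspond to real32 and real64 on the machine in question; the helper `as_real32` is a hypothetical name, and it only mimics rounding to single precision rather than reproducing the Fortran arithmetic.

```python
import math
import struct


def as_real32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


pi64 = 4.0 * math.atan(1.0)  # same formula as the Fortran example
pi32 = as_real32(pi64)       # what survives when stored as real32

print(f"real64 pi {pi64:.16f}")
print(f"real32 pi {pi32:.8f}")
print(f"difference {abs(pi32 - pi64):.2e}")
```

The single-precision value agrees with the real32 output above to the digits shown, and it differs from the double-precision value by roughly 1e-7.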

Using Python3 on ReactOS

At the time of this writing, Python 3.4 is the newest version that can be used with ReactOS, since newer Python versions require Windows NT ≥ 6.0, and ReactOS 0.4.x is NT 5.2 (Windows 2003 / XP). In particular, Miniconda didn’t yet work when tried with the 32-bit installer.

Python 3.4 is installed via ReactOS Application Manager, accessible from Start → Programs.

Fix Ubuntu Desktop Matlab icons and Menu items

Create a start/activities menu icon for Matlab or other programs in Ubuntu. First, download the Matlab icon:

mkdir -p ~/.local/share/icons
curl https://upload.wikimedia.org/wikipedia/commons/2/21/Matlab_Logo.png -o ~/.local/share/icons/matlab.png

Create file: ~/.local/share/applications/matlab.desktop

#!/usr/bin/env xdg-open
[Desktop Entry]
Type=Application
Icon=matlab.png
Name=MATLAB
Exec=matlab -desktop
Categories=Development;


OCR PDF with Tesseract

To use Tesseract-OCR on a PDF, first convert the PDF to TIFF. The same command works for single-page and multipage PDFs:

magick -density 300 in.pdf -depth 1 -strip -background white -alpha off out.tiff

This binary (black or white only) TIFF file is about 1 MB / page. Consider converting groups of pages for large or complicated PDFs. Pages are 0-indexed, so to convert, say, pages 4-7 of the PDF:

magick -density 300 in.pdf[3-6] -depth 1 -strip -background white -alpha off out.tiff
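The off-by-one between human page numbers and ImageMagick's 0-indexed pages is easy to get wrong when scripting batches. A minimal sketch, assuming 1-based inclusive page numbers as input; the function names `page_selector` and `magick_command` are made up for this illustration.

```python
def page_selector(first_page, last_page):
    """Build an ImageMagick page selector from 1-based, inclusive page numbers."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("need 1 <= first_page <= last_page")
    # ImageMagick counts PDF pages from 0, so shift both ends down by one.
    if first_page == last_page:
        return f"[{first_page - 1}]"
    return f"[{first_page - 1}-{last_page - 1}]"


def magick_command(pdf, tiff, first_page, last_page):
    """Assemble the argument list for the conversion command shown above."""
    return [
        "magick", "-density", "300",
        pdf + page_selector(first_page, last_page),
        "-depth", "1", "-strip", "-background", "white", "-alpha", "off",
        tiff,
    ]


command = magick_command("in.pdf", "out.tiff", 4, 7)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)`; passing a list avoids shell quoting problems with the square brackets.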

While at least 300 DPI is recommended, sometimes increasing the resolution can worsen Tesseract performance, particularly for poor quality text. In such cases, it may be better to filter or otherwise process the input imagery before passing it to Tesseract.

Run OCR with the command below. Tesseract can also output PDF or other formats. Be aware that not all documentation/tips on the web address the machine learning models present in Tesseract 4.x.

tesseract out.tiff out

Tesseract processing can be controlled in numerous ways.

  • improving tesseract input

Fix ImageMagick 6 not authorized reading PDF

ImageMagick uses policy.xml to set read/write permissions by file format. When read permissions are disabled for a format such as PDF, ImageMagick operations might fail like:

convert-im6.q16: not authorized

convert-im6.q16: DistributedPixelCache ‘127.0.0.1’

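Fix: find the policy.xml location, which is printed at the top of the output of

magick -list policy

On ImageMagick 6, which does not provide the magick command, use convert -list policy instead. For example, on Linux the file might be at /etc/ImageMagick-*/policy.xml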

Edit this policy.xml to have a line like:

<policy domain="coder" rights="read" pattern="PDF" />
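Before and after editing, it can help to list what the file currently says about PDF. A minimal sketch using only the Python standard library; the function name `pdf_coder_rights` is hypothetical, and treating the pattern as a literal name or a brace-enclosed list is a simplifying assumption (ImageMagick patterns are globs). The sketch only reports the entries and does not attempt to reproduce how ImageMagick resolves multiple matching policies.

```python
import xml.etree.ElementTree as ET


def pdf_coder_rights(policy_xml):
    """Return the rights string of every coder policy whose pattern names PDF."""
    rights = []
    for policy in ET.fromstring(policy_xml).iter("policy"):
        if policy.get("domain") != "coder":
            continue
        # Simplification: treat the pattern as a literal name or a {A,B} list.
        names = policy.get("pattern", "").strip("{}").split(",")
        if "PDF" in (name.strip().upper() for name in names):
            rights.append(policy.get("rights", ""))
    return rights


# Tiny stand-ins for a real policy.xml, before and after the edit.
blocked = '<policymap><policy domain="coder" rights="none" pattern="PDF" /></policymap>'
fixed = '<policymap><policy domain="coder" rights="read" pattern="PDF" /></policymap>'

print(pdf_coder_rights(blocked))
print(pdf_coder_rights(fixed))
```

For a real file, pass the text of the policy.xml located in the previous step. Entries inside XML comments are skipped by the parser, so only active policies are reported.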

Markdown relative links in Readme / docs

Markdown as a de facto documentation syntax has many variants. The relative linking syntax seems to be widely supported by sites including GitHub and GitLab among others. The syntax is simply like:

[TODO list](./TODO.md)

Then, even when the repo is cloned, forked, or renamed, the relative links continue to work.
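Relative links only keep working while their targets exist in the repo, so a small check can be useful. This is a rough sketch rather than a full Markdown parser: it assumes inline links of the form shown above, ignores reference-style links, and the function name `broken_relative_links` is invented for this example.

```python
import re
import tempfile
from pathlib import Path

# Matches inline links like [text](target); reference-style links are ignored.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def broken_relative_links(markdown_file):
    """List relative link targets in a Markdown file that do not exist on disk."""
    md = Path(markdown_file)
    broken = []
    for target in LINK.findall(md.read_text(encoding="utf-8")):
        # Skip absolute URLs (https:, mailto:), in-page anchors, and root paths.
        if SCHEME.match(target) or target.startswith(("#", "/")):
            continue
        path = target.split("#", 1)[0]  # drop a trailing #section anchor
        if not (md.parent / path).exists():
            broken.append(target)
    return broken


# Demonstration in a throwaway directory: one valid link, one broken link.
with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    (Path(tmp) / "TODO.md").write_text("- write docs\n", encoding="utf-8")
    readme.write_text(
        "[TODO list](./TODO.md) and [missing page](./docs/none.md)\n",
        encoding="utf-8",
    )
    result = broken_relative_links(readme)
    print(result)
```

Targets are resolved relative to the directory of the Markdown file itself, which is the usual behavior when such files are rendered in a repo browser.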

Spyder / Jupyter plots in separate window

IPython console in Spyder IDE by default opens non-interactive Matplotlib plots in the same inline “notebook”. Fix this by creating separate windows for interactive figures in Spyder:

Tools → Preferences → IPython Console → Graphics → Graphics Backend → Backend: “automatic”

Interactive figures are more useful in general to probe the figure data and zoom/pan the figure, unlike the static PNGs in the inline notebook.

Jupyter notebooks can also have interactive plots. Instead of static inline notebook plots with

%matplotlib inline

for inline interactive plots in Jupyter:

%matplotlib notebook

Example:

%matplotlib notebook

from matplotlib.pyplot import figure

ax = figure().gca()
ax.plot(range(5))
