Scientific Computing

Matlab array broadcasts like Python

Array broadcasting (no bsxfun() needed) is in Matlab and GNU Octave

Array broadcasting can minimize copying of large arrays in memory or excessive looping over arrays, both of which can be very inefficient. However, there are certain cases particularly with JIT compilers where looping can be faster than array broadcasting. Often the clean, clear code enabled by array broadcasting outweighs those edge-case possible gains. As always, make a simple benchmark if you’re concerned about a particular case (broadcast vs. loop) and for Python try Numba.

bsxfun is no longer needed for:

  • plus
  • minus
  • times
  • rdivide
  • ldivide
  • power

and many more operations.

If A has size 4x5 and B has shape 1x5, A .* B, A + B etc. just work without bsxfun(), provided the N-D array dimension multiplied is of matching shape.

This is important for clarity and conciseness of syntax. bsxfun() is a factor that drove me to using Python over Matlab almost entirely.

A .* B

can result in

error: operator *: nonconformant arguments (op1 is 5x4, op2 is 1x5)

Assuming A and B are compatibly shaped when transposed, try:

A .* B.'

Use .' to transpose because ' means take the Hermetian (complex conjugate) transpose. Remember that Matlab does memory copies (expensive for large arrays) on transpose. Python transposes are “free” O(1) since they’re just a view into the original array (pointer tricks).

Programs earlier than these versions did require bsxfun().

Program minimum version
Matlab R2016b
Octave 3.6.0

Fortran 95 brought function spread() for efficient array broadcasting by replicating the array in memory.

Switch from IDL to Python

Python-based data analysis is dominating many fields. In astronomy and geoscience applications, Python dominance is increasing. Consider a widely valued language like Python inside of a niche, virtually unknown language outside a few specialties.

Installing Python for the first time: Miniconda Python is a 100 MB download–it won’t initially take several gigabytes of hard drive space like Matlab and IDL.

Package distribution can occur via a website or preferably, a centralized conda or pip repo. For example, to install popular data analysis modules:

conda install pandas scipy matplotlib spyder

IDL vs. Python syntax gives NumPy/Python equivalents to common IDL commands

IDL is a bit distinctive in its syntax. IDL and Matlab syntax remind me a bit of Fortran, while Python is C-based, including that Python has 0-based indexing and C-order array axes. Mayavi also gives advanced 3-D visualization.

Jupyter Notebook makes an IDE in your web browser, and you can make remarkable animated, interactive 3-D plots as well with Ipyvolume.

conda install jupyter
pip install ipyvolume
web browser 3-D animation flow field

Ipyvolume interactive flow field.

HDF5 can handle large and small datasets quickly and easily. Assume HDF5 file terabyte.h5 with double-precision float variable X of size 100000x2048x2048 (3.4 TB). Let’s load the first frame of the 2048x2048 image data, and write it variable first to image.h5. The with syntax uses Python’s context manager, which closes the file upon exiting the indented code section under with.

import h5py

with h5py.File('terabyte.h5') as f:
    img = f['X'][0,...]

with h5py.File('image.h5') as f:
    f['first'] = img

IDL supports Python

Call GDL from Python: GDL Gnu Data Language is free open-source alternative to IDL, compatible with most IDL syntax.

from Python, call IDL/GDL functions by simply import GDL. See PYTHON.TXT

Call Matlab from Python and Python from Matlab:

Remove leftmost columns from text file

Removing the left N columns from a text file can be done with the one-liner below. This example trims off the first 2 columns of each row of a text file “old.txt”, resaving as “new.txt”:

cut -c2- > new.txt < old.txt

For example, clean diff output of all the plus/minus signs or eliminate unwanted characters in the first column(s) of a text file.


Related: remove right columns from text file

Trim right columns from text file

Old Fortran or COBOL programs may have columns 73-80 filled with vestigial card sequence numbers. Trim rightmost columns as in this example. This keeps the first 72 columns of text from “ancient.for” on each row, resaving as “ancient.f”.

cut -c-72 > ancient.f < ancient.for

Any trailing whitespace (up to 72 characters for this example) remains.


Related: remove leftmost columns from text file

Linux Wake-on-LAN worldwide

Wake-on-LAN can also work worldwide (over WAN Internet), in case a PC 10,000 km away gets turned off accidentally. To turn on a target PC from anywhere in the world, first setup the target PC and router port forwarding. For simplicity/clarity we assume the following arbitrary parameters, which will be unique for the target Linux PC:

  • target interface: eth0
  • target PC static LAN IP: 192.168.1.100
  • target public WAN IP: 1.2.3.4
  • public WoL port forward: 10009
  • target MAC: 00:11:22:33:44:55

Remote Linux PC: modern Linux distros such as Ubuntu have WakeOnLan configured in NetworkManager.

  1. Settings → Network and select the wired LAN interface to use for WakeOnLan.
  2. under the “Ethernet” tab, seee that “Wake on Lan: Default” is checked. No Wake on LAN password is needed for most use caes.
  3. copy down this “Device” hexadecimal string, this is the MAC address to turn on the PC remotely.

Confirm the correct MAC address by typing in Terminal

ip a

Have someone nearby it in case it doesn’t turn it back on–shutdown remote PC for testing.


On a laptop:

apt install wakeonlan

Wake PC on same LAN

on control PC, on the same LAN

wakeonlan 00:11:22:33:44:55

using the MAC address of the target PC of course.

If PC didn’t turn on, ensure BIOS/UEFI settings have Wake On Lan enabled. Try a discrete Intel Ethernet NIC, some motherboard NICs don’t work for Wake-on-LAN

setup worldwide Wake-on-LAN

To wake-on-LAN from anywhere in the world, using example parameters at top and after doing the procedure above, further do:

  1. Router: port forward 10009 to 192.168.1.100 port 9
  2. To use without Dynamic DNS service, have the PC auto-send email when the public IP address changes.

From the control PC on a different network type in Terminal:

wakeonlan -i 1.2.3.4 -p 10009 00:11:22:33:44:55

and target PC should power on.

Notes

Legacy Linux remote PC setup

This procedure is for older Linux distros that don’t have WakeOnLan config in NetworkManager.

Install ethtool

apt install ethtool

Add the following line to /etc/network/interfaces under the interface to control auto eth0

up ethtool -s eth0 wol g

Edit /etc/init.d/halt, changing text near top to:

NETDOWN=no

instead of yes.

Edit/create /etc/default/tlp with line

WOL_DISABLE=N

Note link/ether for eth0–this is MAC address

ip a

Reboot target PC, then shutdown target PC for testing

New version of GLOW auroral simulation

Stan Solomon and the NCAR team have greatly cleaned up GLOW auroral energy deposition and kinetics model. The new version number is 0.981, and since it’s mostly to Fortran 90/95 standards, it’s easier to use and modify, along with being more stable.

Updated GLOW Python/Matlab code is available for new GLOW 0.981.

GLOW output

Surprising modern Fortran behaviors

Here are a couple things that may trip up a classic or modern Fortran programmer.

Single line IF

The single line IF in both classic and modern Fortran only “captures” the first statement. If you use a semicolon, the code after the semicolon is always executed!

  • Correct:

    if (.true.) x = x+1
  • Incorrect: Here y always is y+1 !!

    if (.true.) x = x+1; y=y+1

MOD() vs. MODULO()

Fortran 77 had mod(), which could output negative numbers. This is often NOT the behavior you want, as other languages’ mod() functions only output positive numbers.

So typically, use Fortran modulo().

FFmpeg Live stream

YouTube Live streaming (optionally including other sites simultaneously) with screensharing is simple and stable with FFmpeg or OBS Studio.

Webcam:

ffmpeg \
-f alsa -ac 2 -i hw:1,0 \
-f v4l2 -r 10 -i /dev/video0 \
-c:v libx264 -pix_fmt yuv420p -preset ultrafast -g 20 -b:v 2500k \
-c:a aac -ar 44100 \
-threads 0 -bufsize 512k \
-f flv rtmp://a.rtmp.youtube.com/live2/YOURSTREAM &> stream.log

FFmpeg audio option -f alsa -ac 2 -i hw:1,0 uses -ac 2 for stereo to hardware address 1,0. For default audio device, use -f alsa -i pulse. Obtain hardware addresses via arecord -l. Note that monoaural -ac 1 doesn’t work on some hardware. -c:a aac -ar 44100 is AAC encode audio at 44.1 kHz sampling frequency (passing audio up to about 22 kHz, CD-quality).

FFmpeg video option -f v4l2 -r 10 -i /dev/video0 say to acquire video from webcam /dev/video0 at 10 fps, using the native resolution of the webcam. Force webcam resolution like -s 640x480. -c:v libx264 -pix_fmt yuv420p -preset ultrafast -g 20 -b:v 2500k encodes video using the H.264 video compression algorithm, prioritizing reduced CPU utilization, in blocks of 20 frames with a target bitrate of 2.5 Mbps.

FFmpeg config -threads 0 -bufsize 512k lets FFmpeg choose the number of CPU threads, with a 512kB buffer. Adjust buffer size as needed–tradeoff between latency and robustness. -f flv rtmp://a.rtmp.youtube.com/live2/YOURSTREAM &> stream.log uses the FLV container format, live stream to YouTube, with stream ID YOURSTREAM. Don’t let others know the Stream ID!.

Screenshare: use the same command as above, swapping -f v4l2 -r 10 -i /dev/video0 with

-f x11grab -r 10 -s 1024x720 -i :0.0+0,24

-f x11grab -r 10 -s 1024x720 -i :0.0+0,24 sets screengrab at 10 fps, starting in the upper left hand corner, send a 1024x720 pixel video. If your screen resolution is more than 1024x720, there would be unsent regions at the bottom and/or right side.

Audio-only:

ffmpeg -f alsa -ac 2 -i pulse -c:a aac -ar 8000 -f rtp rtp://localhost:1234

Connect from the local or remote PC to rtp:://serverip:1234.

Restream.io multisite livestreaming with FFmpeg

Restream.io has a free basic service that streams to over 30 sites, from a single incoming stream from FFmpeg. While FFmpeg can “tee” live-stream to multiple sites, this uses N times the data bandwidth for N sites, which doesn’t scale well. Restream.io takes one incoming stream and sends it to multiple sites, saving the streaming creator’s Internet data bandwidth. This is useful for slow Internet connections or where Internet connection cost is expensive per gigabyte.

The free service from restream.io works for YouTube Live among others.

  • restream.io can send stream notifications to the feed.
  • restream.io sends preview/link to stream
  • use Titles on restream.io to set stream titles all at once (before stream start)

Getting NEXRAD weather radar archive data

NOAA archives NEXRAD and TDWR radar data.

12000x5400 PNGs are available since 2010 (6000x2400 back to 1995). This Python program downloads composite reflectivity data rapidly using multiple download threads.

Interactive HTML5 National Reflectivity Mosaic

Unfortunately a colormap-data table or function isn’t readily available. Of course it’s inside the NEX2IMG program, but I would have to hack this a bit to extract the function. Another approach would be to use the HSV map to bin the PNG data.

Nexrad map

Notes