Scientific Computing

FFmpeg build

Build FFmpeg from source:

git clone https://github.com/FFmpeg/FFmpeg

cd FFmpeg

# Ubuntu
apt install autoconf automake build-essential libass-dev libfreetype6-dev libsdl2-dev libtheora-dev libtool libva-dev libvdpau-dev libvorbis-dev libxcb1-dev libxcb-shm0-dev libxcb-xfixes0-dev pkg-config texinfo zlib1g-dev libfdk-aac-dev libx264-dev yasm libmp3lame-dev libopus-dev libvpx-dev libx265-dev libpulse-dev

# example options
./configure --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree --enable-libpulse

make -j

make install

Related: install FFmpeg

WiFi at very low signal strength

This article covers improvised first-check methods for making “sanity checks” when there is a question of whether something will work or why it’s not working. Live streaming with little ability to buffer needs more WiFi signal strength than general Web browsing or small logging sensors. For good WiFi performance, a starting point is to have at least:

  • 20 dB SNR
  • -70 dBm RSSI (signal strength)

The signal levels mentioned here are relative and conceptual. Don’t design according to this highly simplified discussion alone!

A simple SNR estimate can be made by taking the desired WiFi signal strength and subtracting the next strongest signal on your channel. For example, if your WiFi is -68 dBm and there is another WiFi AP on or overlapping with your channel at -75 dBm, the SNR is no better than -68 - (-75) = 7 dB. This over-simplified calculation has significant error in RF-congested environments, because it does not account for

  • non-WiFi interference (microwave ovens, baby monitors, Bluetooth, etc.), which makes SNR worse
  • channel occupancy time: is the unwanted co-channel WiFi nearly idle, or is it streaming HD video (heavy utilization)? Light usage of the unwanted WiFi makes effective SNR better.

Here we assume a worst-case near-far problem, where CSMA/CA isn’t stopping conflicting transmissions because the unit is in the middle between the desired and undesired transmitters that can’t hear each other.
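The subtraction above extends to several unwanted signals by summing their powers in milliwatts before converting back to dB. Here is a minimal Python sketch, assuming every unwanted signal transmits continuously (the worst case just described). The function names are made up for illustration, and the thresholds are the starting-point values listed earlier.

```python
import math


def dbm_to_mw(dbm):
    """Convert power in dBm to milliwatts."""
    return 10 ** (dbm / 10)


def snr_estimate_db(desired_dbm, unwanted_dbm):
    """Best-case SNR in dB: desired signal vs. the power sum of unwanted signals."""
    total_mw = sum(dbm_to_mw(p) for p in unwanted_dbm)
    return desired_dbm - 10 * math.log10(total_mw)


def meets_starting_point(rssi_dbm, snr_db, min_rssi_dbm=-70, min_snr_db=20):
    """Check against the rule-of-thumb starting point: -70 dBm RSSI and 20 dB SNR."""
    return rssi_dbm >= min_rssi_dbm and snr_db >= min_snr_db


snr = snr_estimate_db(-68, [-75])
print(round(snr, 1))                    # 7.0, same as the worked example
print(meets_starting_point(-68, snr))   # False: RSSI is fine, SNR is not
```

Two equal co-channel signals at -75 dBm would cost a further 3 dB, which simple subtraction of the single strongest signal misses.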

Real-life tests

A good 5 GHz AP was tested in an urban environment, with minimal traffic (late at night). The system configuration was a 40 MHz 5 GHz channel with a few other APs on channel, using Ubiquiti hardware with 20 dBm AP transmit power. The AP heard the phone about 3 dB weaker than the phone heard the AP.

RSSI (dBm)   uplink throughput (Mbps)   downlink throughput (Mbps)
-85          5.0                        30.0
-90          4.2                        7.2
-91          3.6                        4.2

A cheap 2.4 GHz AP was tested under conditions that minimized environmental variation from random human activity such as network loading, Bluetooth, microwave ovens, etc. The AP was in an adjacent office of a commercial building, using a clear channel (no overlapping or co-channel APs in the test area), on a holiday (no microwave ovens, Bluetooth, etc. from employees).

RSSI (dBm)   throughput (Mbps)   MCS / modulation   streaming video resolution (vertical pixels)
-93 to -85   0                   0 / BPSK           N/A
-84          0.5                 2 / QPSK           144
-83          1.3                 ""                 240
-82          1.7                 ""                 ""
-81          2.2                 3 / 16-QAM         360
-80          3.5                 ""                 ""
-79          > 4                 4 / 16-QAM         480
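For a quick first check, the 2.4 GHz measurements above can be turned into a lookup. This is only a sketch of one way to use the table: the function name is hypothetical, the ditto marks are expanded to the value above them, and the "> 4" entry is stored as a floor of 4.0 Mbps. The numbers describe one cheap AP on one quiet day, not WiFi in general.

```python
import bisect

# (RSSI threshold in dBm, throughput in Mbps, vertical resolution in pixels)
MEASURED = [
    (-84, 0.5, 144),
    (-83, 1.3, 240),
    (-82, 1.7, 240),
    (-81, 2.2, 360),
    (-80, 3.5, 360),
    (-79, 4.0, 480),  # table says "> 4 Mbps"; 4.0 is used as a lower bound
]
THRESHOLDS = [row[0] for row in MEASURED]


def expected_video(rssi_dbm):
    """Return the table row at or just below the given RSSI, or None below -84 dBm."""
    i = bisect.bisect_right(THRESHOLDS, rssi_dbm) - 1
    if i < 0:
        return None  # -93 to -85 dBm: connected, but zero measured throughput
    return MEASURED[i]


print(expected_video(-90))    # None
print(expected_video(-81.5))  # (-82, 1.7, 240)
print(expected_video(-60))    # (-79, 4.0, 480)
```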

Important factors not considered in this test include latency, error rate, and increased loading due to retransmissions. In many real-world scenarios with interference, streaming video dropouts may occur at these low signal levels, because people moving about, trees waving, etc. cause signal variability. Log packet errors, transmission drops and signal levels. Rather than trying to get big antennas for one AP, use multiple APs and allow beamforming to work. Whenever possible, use 5 GHz, which is much better than 2.4 GHz for most use cases. Even if you’re at an isolated campground, visitors use WiFi-cellular gateways, baby monitors, microwave ovens and other 2.4 GHz devices, wiping out your 2.4 GHz premises monitoring system.

It’s important for wireless workers at all levels to remember the principles of significant digits, especially since instrument and antenna calibrations have finite accuracy and link budgets need margin built in to ensure sufficient reliability. Precision finer than 0.5 dB for field measurements raises the question of measurement limits. Even 1-2 dB resolution is hard to achieve in real-world environments. The table above is generally indicative of trends, but each device varies by a couple of dB between units of the same model, over temperature, etc. Keep in mind that environmental variability far dominates instrument variability. In urban environments on 2.4 GHz it’s possible to have a -70 dBm signal and only a 5.5 Mbps raw connection speed, due to high activity on the channel.

WiFi noise floor

A fundamental factor for any radio receiver is the Johnson-Nyquist thermal noise floor. Assuming the system is designed so that the first receiver amplifier (typically called the low noise amplifier or LNA) dominates system performance, we can start with a simple system noise model that assumes a 20 MHz receiver bandwidth at 300 K (about 27 degrees Celsius).

P_noise = kTB

k is the Boltzmann constant, approximated as 1.38 x 10^-23 J/K. T is temperature in Kelvin, here taken as 300 K ~ 27°C ~ 80°F. B is the equivalent receiver bandwidth [Hz].

The decibel expression is much more handy for computing ratios of power, which for convenience is expressed as power relative to one milliwatt “dBm”. Many times, just to help show that everyone is using the same reference point and units (dBm vs. dBW, say), kTB will be expressed as a constant in dBm with B=1 Hz. That is, noise power in dBm/Hz.

P_noise,B=1 = 1.38e-23 * 300 = 4.14 zW/Hz

Let’s go from zeptowatts to dBm:

P_noise,B=1 = 10*log10(4.14e-21) + 30 = -173.8 dBm/Hz

where the +30 term converts from dBW to dBm. For an 802.11n 20 MHz channel, we can estimate the receiver thermal noise floor power as:

P_noise = -173.8 + 10*log10(20e6) = -100.8 dBm

This means an ideal WiFi 802.11n receiver will have a noise floor of about -101 dBm. Typical consumer WiFi gear will have a noise figure degrading (raising) this thermal noise floor by several dB.
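The kTB arithmetic above is easy to repeat for other channel widths. This is a minimal Python sketch using the same approximations as the text (k = 1.38e-23 J/K, T = 300 K). The function name and the 40 MHz case are additions for illustration only.

```python
import math

BOLTZMANN_J_PER_K = 1.38e-23  # same approximation as the text


def thermal_noise_dbm(bandwidth_hz, temperature_k=300.0):
    """Thermal noise power kTB in dBm for an ideal (0 dB noise figure) receiver."""
    watts = BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz
    return 10 * math.log10(watts) + 30  # +30 converts dBW to dBm


print(round(thermal_noise_dbm(1), 1))     # -173.8 dBm/Hz
print(round(thermal_noise_dbm(20e6), 1))  # -100.8 dBm in 20 MHz
print(round(thermal_noise_dbm(40e6), 1))  # -97.8 dBm in 40 MHz
```

Doubling the channel width raises the noise floor by 3 dB, one reason wide channels need more signal for the same SNR.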

Minimum WiFi SNR

A fundamental lower limit on minimum SNR for WiFi is set by the modulation type used. Assuming BPSK or QPSK is being used (MCS0, 1, 2) then 1% BER corresponds to about 4 dB Eb/N0. Assuming a 3 dB noise figure for the receiver of the AP and unit, and assuming the transmitter powers are equal, it seems that about

-101 + 3 + 4 = -94 dBm

should be a rough lower limit for establishing a WiFi connection. The throughput as anyone has experienced at that signal level will be nil, but a WiFi connection can be established in the -94 dBm range. This analysis is oversimplified, because WiFi uses OFDM with numerous subcarriers operating in parallel.
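As a rough check on the "1% BER at about 4 dB" figure, the textbook bit error rate for uncoded coherent BPSK in additive white Gaussian noise is 0.5*erfc(sqrt(Eb/N0)). The sketch below assumes that idealized model, which ignores OFDM, coding and fading. The function names are hypothetical.

```python
import math


def bpsk_ber(ebn0_db):
    """Bit error rate of uncoded coherent BPSK (and QPSK per bit) in AWGN."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def min_signal_dbm(noise_floor_dbm, noise_figure_db, required_snr_db):
    """Rough lower limit on received signal: noise floor + noise figure + SNR."""
    return noise_floor_dbm + noise_figure_db + required_snr_db


print(bpsk_ber(4.0) < 0.02)        # True: on the order of 1% at 4 dB
print(min_signal_dbm(-101, 3, 4))  # -94
```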

Making a minimally useful WiFi connection requires a few dB more signal strength. If we believe the receive power indicator, -84 dBm seems to be an approximate lower limit for actually being able to slowly download a website or buffer video.

Factors affecting the minimum viable WiFi signal include:

  • ambient RF noise: microwave oven, Bluetooth, other WiFi traffic, baby monitors
  • amount of other WiFi traffic: after-hours business vs. streaming HD video or torrents
  • environmental variability: signal strengths are never constant. Desired and undesired signal strengths are always changing due to reflections from countless objects.
  • equipment variability: performance of equipment varies from how it’s held, its temperature, etc.
  • interference at one or both ends: If either end is in an interference area, the link is broken as both ends have to hear each other. This and the asymmetric data transfer are some reasons why we set AP transmit power a bit higher than the mobile devices.

With professional WiFi gear in greenfield (no interference) WiFi environments, I’ve seen 360p video streaming work acceptably, but with intermittent interruptions at -87 dBm. Would I ever design a system like that? No! I was merely surprised to see it actually usually working like that, but giving occasional signal loss alarms. The real question to ask when wanting to know if a system application will be viable is what is the minimum viable signal for the required data throughput and reliability factor.

To stream live video with occasional glitches at 360p in a clean RF environment, -77 dBm may be just enough signal allowing 10 dB for environmental variation. I can’t say it would not work, but I would be doubtful about how reliable it would be for clean uninterrupted no-glitch streaming for an hour.

To cover a warehouse on the edge of a rural town for bar code readers or temperature sensors on a tight budget, it’s possible that -77 dBm is just enough, assuming temporary outages don’t impact the business or safety, and that outages are logged. Merely going in and designing a system for -65 dBm everywhere can be too expensive, and your competitor might get the bid instead. Know the characteristics of the system elements you design/support/sell/maintain. When you want to maximize performance vs. cost, you have to measure, experiment and log to verify your designs work in a wide variety of scenarios and to ensure performance is met over time.


YouTube Live bandwidth requirements

Related: Manual WiFi AP config for better performance

Scan / print on WSL

In Windows Subsystem for Linux, printing via CUPS and scanning via xsane work fine for a network-connected printer/scanner. For example, WSL works with networked Brother MFC devices. In general, Brother Linux printer/scanner support is very good. Generally it’s easier to set up a printer/scanner in Linux than in Windows. A scanner that didn’t work on Windows might work from Windows Subsystem for Linux, on the same PC and Windows install.

We did not use DBUS. If you’re having trouble, it might help to try the printer with a computer running Ubuntu natively to see what’s up. I.e. it could be a faulty driver or a configuration step missing from the documents.

Install CUPS:

apt install cups

Then install the Linux printer driver.

For the scanner:

apt install xsane

Then install the Linux scanner driver.

Convert Cinepak videos with FFmpeg for ImageJ

ImageJ cannot read Cinepak codec video files. Convert from Cinepak to popular video formats using FFmpeg.

Motion JPEG is widely compatible with video players including ImageJ.

ffmpeg -i old.avi -c:v mjpeg -q:v 1 out.avi

Uncompressed AVI output file size could be a factor of 10 larger than the Cinepak version. By definition, every video player should be able to play uncompressed AVI, including ImageJ.

ffmpeg -i old.avi -c:v rawvideo out.avi

Lossless FFV1 preserves the original video quality with lossless compression. Many video players can handle FFV1 AVI video.

ffmpeg -i old.avi -c:v ffv1 out.avi

The advantage of using a PNG image stack comes in frame-by-frame analysis of the video.
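When converting many files, the commands above can be scripted. This is a sketch in Python, assuming ffmpeg is on the PATH. The function names and output file naming are hypothetical. The PNG case relies on ffmpeg's numbered image-sequence output pattern; it is my assumption of a suitable command, since no PNG command is given above.

```python
import subprocess
from pathlib import Path

# codec options copied from the ffmpeg commands above
VIDEO_OPTS = {
    "mjpeg": ["-c:v", "mjpeg", "-q:v", "1"],
    "rawvideo": ["-c:v", "rawvideo"],
    "ffv1": ["-c:v", "ffv1"],
}


def ffmpeg_args(src, kind):
    """Build the ffmpeg argument list for one conversion; does not run anything."""
    src = Path(src)
    if kind == "png":
        # one PNG per frame: frame0001.png, frame0002.png, ...
        return ["ffmpeg", "-i", str(src), str(src.with_name("frame%04d.png"))]
    out = src.with_name(f"{src.stem}_{kind}.avi")
    return ["ffmpeg", "-i", str(src), *VIDEO_OPTS[kind], str(out)]


def convert(src, kind):
    """Run the conversion, raising CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_args(src, kind), check=True)


print(ffmpeg_args("old.avi", "ffv1"))
# ['ffmpeg', '-i', 'old.avi', '-c:v', 'ffv1', 'old_ffv1.avi']
```

Passing the arguments as a list avoids shell quoting problems with filenames that contain spaces.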

Consider converting video to HDF5 using dmcutils/avi2hdf5.py for analysis purposes.

NSF Dear Colleague letter on Sondrestrom

The 26 DEC 2017 NSF “Dear Colleague” letter notes an effective shutdown of Sondrestrom ISR on 31 MAR 2018. This is generally following Recommendations 7.2, 7.3, and 9.11 of the 2015 NSF Geospace Section Portfolio Review, with final report issued 14 APR 2016.

We have been running instruments remotely at Sondrestrom since 2012. As soon as the NSF 2016 report was issued, I started to hear murmurs about the publication count from Sondrestrom. I homed in on Recommendation 7.36:

NSF GS should develop a common set of annual metrics from each facility which can be collected year-on-year to provide an underpinning of the next Senior Review. These metrics could include

  • science outputs both from facility staff and external users
  • annual expenditure (capital and resource)
  • data downloads and usage
  • key technical developments (hardware and software).

While noting that NSF GS may not have the informatics systems necessary at present, it seems likely that other NSF directorates or other funding agencies such as NIH may have developed such metrics and they should be employed.

Rationale

One of the key issues acknowledged by many at the CEDAR 2017 workshop “save Sondrestrom ISR” meeting was the relatively low publication count involving Sondrestrom. NSF questioned whether instruments had real-time streaming capability, for support of space weather nowcasting. The internet bandwidth to all of Greenland is limited, and the satellite link to Sondrestrom costs $45/GByte. The throughput of 50 kB/sec led to a number of difficulties and workarounds. Data is mostly transported by mailing/carrying USB hard drives in and out of Sondrestrom.
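To see why hard drives win over the satellite link, a back-of-envelope calculation with the two figures above helps. This sketch assumes decimal units (1 GB = 1e6 kB) and a hypothetical 1 TB dataset. The function name is made up for illustration.

```python
RATE_KB_PER_S = 50     # satellite link throughput from the text
COST_USD_PER_GB = 45   # satellite link cost from the text


def satellite_transfer(size_gb):
    """Return (days, cost in USD) to move size_gb over the satellite link."""
    seconds = size_gb * 1e6 / RATE_KB_PER_S
    return seconds / 86400, size_gb * COST_USD_PER_GB


days, cost = satellite_transfer(1000)  # hypothetical 1 TB of instrument data
print(round(days), cost)  # 231 45000
```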

Given flat (effectively declining) budgets, any program manager looks at the low-hanging fruit to cut. Sondrestrom ISR is unique globally in being the only ISR to run at such a short wavelength (23 cm, 1.29 GHz). The Sondrestrom vacuum-tube (klystron) based transmitter technology presents longevity concerns. The mechanically steered dish limits the spatiotemporal resolution considerably vs. electronically steered ISRs. The ISR is powered by a 600 kW generator, and the station power is provided by two 180 kW generators that cycle periodically to even out wear.

NSF also planned to cut Arecibo’s budget by ~ 75%, so the cost of running large, expensive facilities is something the NSF Geospace directorate seems intent on pushing down. I think we have to acknowledge that something had to change. However, as the assessment below states, it’s not clear that planned economies from accessing EISCAT-3D will come to fruition in the next few years.

Review of the Portfolio Review

An assessment of the Geospace Portfolio Review was conducted in 2016 by the National Academies of Sciences, Engineering, and Medicine. Their assessment report (DOI: 10.17226/24666) made a few critical points. (Note: you can download the 1.1 MB PDF for free as “guest”.) Per the report, the Portfolio Review used per-facility metrics like:

  • Hours of operation per annum
  • Publications for at least 5 years
  • Number of site users (instruments placed) and data users
  • Current state of maintenance
  • Future science and technology plans
  • Sources of funding
  • International agreements
  • Present and future plans in support of the survey

The plan to shift some recovered funds to EISCAT-3D was questioned in Section 5.2.2, as EISCAT-3D is not yet fully funded (isn’t built).

A key takeaway from the two reports is you can’t understand what isn’t measured. Opaque budgets and arbitrary metrics are not a great starting point for any effort. European funding agencies under Horizon 2020 have a mandate for open data, open publication. They accept metadata from repositories like Zenodo to close the loop. While not trivial, the problems have been partially solved in other funding agencies.

Simple AstroPy Python FITS image stack examples

Assume an image stack in file myimg.fits. FITS files do not memory map except in special cases. Usually FITS files are under 2 GB, making it feasible to work with the whole image stack on a modern PC. That is, load the whole image stack and then index the 3-D array in RAM.

from astropy.io import fits

fn = 'myimg.fits'

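with fits.open(fn, mode='readonly') as h:
    # primary HDU data: the image stack as a NumPy array
    img = h[0].data

    # location metadata stored in the primary header
    lat = h[0].header['GLAT']
    lon = h[0].header['GLON']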

The header contained location metadata that we assigned to lat and lon.
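Once the stack is in RAM, individual frames and regions are plain NumPy indexing. The sketch below first writes a small synthetic stack so it is self-contained. The array size, header values and temporary path are placeholders, not real data. It assumes the stack is stored with the frame index as the first NumPy axis.

```python
import tempfile
from pathlib import Path

import numpy as np
from astropy.io import fits

# synthetic stack: 5 frames of 32 rows x 64 columns (placeholder data)
stack = np.arange(5 * 32 * 64, dtype=np.float32).reshape(5, 32, 64)
fn = str(Path(tempfile.mkdtemp()) / "myimg.fits")

hdu = fits.PrimaryHDU(stack)
hdu.header["GLAT"] = 67.0   # placeholder location metadata
hdu.header["GLON"] = -51.0
hdu.writeto(fn)

# memmap=False loads the whole stack into RAM, as described above
with fits.open(fn, mode="readonly", memmap=False) as h:
    img = h[0].data
    lat = h[0].header["GLAT"]

print(img.shape)                        # (5, 32, 64)
frame = img[2]                          # third frame, shape (32, 64)
roi = img[:, 10:20, 30:40]              # same 10 x 10 region from every frame
mean_per_frame = img.mean(axis=(1, 2))  # one mean value per frame
```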

Newer image formats HDF5 and NetCDF4 can have effectively unlimited file sizes, and easily store arbitrary organizations of variables, data, and metadata.


Related: read FITS image stack in Matlab

Identifying file type without extension

I received an email attachment, with no filename extension because of the spam filters in corporate email. I “knew” it was a legitimate file because I had just requested it from a notable researcher. Rather than bother the sender to tell me the original file extension, I determined the file type by:

file emailedfilename

which gave output

gzip compressed data, from Unix

Thus I changed the filename to be emailedfilename.tar.gz since a gzip’d file almost always contains a tar archive, and I could extract the files.

Note that tar is smart enough even with the wrong file extension to work, so I could have instead just done

tar -xf emailedfilename
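The same check can be scripted when the file utility isn't available. gzip files start with the two magic bytes 0x1f 0x8b, and Python's standard tarfile module detects compression on its own. This is a minimal sketch with a hypothetical function name, and it only distinguishes the cases discussed here.

```python
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def guess_type(path):
    """Return 'tar.gz', 'gzip' or 'unknown' based on file content, not its name."""
    with open(path, "rb") as f:
        head = f.read(2)
    if head != GZIP_MAGIC:
        return "unknown"
    # is_tarfile opens with transparent decompression, so it sees through gzip
    if tarfile.is_tarfile(path):
        return "tar.gz"
    return "gzip"


# usage, with the attachment name from the example above:
# print(guess_type("emailedfilename"))
# with tarfile.open("emailedfilename") as t:
#     print(t.getnames())  # list the contents before extracting
```

Listing the member names before extracting is a sensible precaution for anything that arrived by email.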

University / School ham radio club license

This discussion pertains to United States of America Federal Communications Commission Part 97 regulations on Amateur Radio.


FCC 47 CFR § 97.5(b)(2) lays out the structure for a club station license. In practical terms, this gives amateur radio clubs a memorable callsign of note they can rally around, to build branding, etc. The club station license does not confer operating privileges, but the trustee must hold an amateur radio license.

The trustee will receive FCC official mail, and ARRL LoTW (Logbook of the World) will only sign up clubs through the trustee. Thus clubs should ensure the trustee is someone who actually checks their physical mail and is on campus at least several times a month on average. The trustee does not have to be a school employee. However, some club constitutions or bylaws require the trustee to be a school employee.

Control Operator vs. Club Trustee: §97.103(b) notes that by default, the station licensee (here, the school club trustee) is the control operator. Even if another person is the control operator, §97.103(a) holds the trustee and control operator equally responsible. Intuitively, §97.105 states the control operator is responsible for “immediate proper operation of the station, regardless of the type of control.”

Student operating privileges: provided the control operator is in control, anyone operating (including non-licensed persons) may use the control operator’s license privileges. Ideally, the club control operator will have an Extra class ham radio license so that operators get to use maximum privileges. §97.115(b) notes that for third-party communications (someone besides control operator working the radio), the “control operator is present at the control point and is continuously monitoring and supervising the third party’s participation”. Many clubs take §97.115(b) to mean the control operator is physically on site, indeed in the radio room itself.

Reference: FCC § 97.5

AGU FM2017 Python lunch notes

At the meeting, it was mentioned that

  • Juno Waves instrument has used Python from day one
  • other researchers are also using Fortran from Python
  • PyAstro is a useful conference
  • Major packages like NumPy typically lack funding. NumPy got its first funding in 2017!
  • see paper: The AstroPy Problem on funding geoscience software development
  • should make open source software part of CEDAR Decadal Survey
  • Autoplot: one line command to plot many science formats including: CDF, HDF5, NetCDF and many more.

NetCDF4 vs. HDF5 for large datasets

NetCDF4 uses a subset of HDF5 features, and adds some new features. NetCDF4 reads/writes specially structured HDF5 files. Performance of HDF5 and NetCDF4 is highly similar including on supercomputers. The main idea behind NetCDF4 is a simpler API than HDF5, while maintaining the same performance.

Python h5py makes HDF5 read/write very easy. NetCDF4 is a little more complicated to use from Python.

Using HDF5 from low-level languages such as C, C++ and Fortran is somewhat more elaborate than using NetCDF4. Using HDF5 or NetCDF4 from Python is easier, as these examples collapse down to a couple lines of code.
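To give a sense of how short the Python side is, here is a minimal h5py sketch that writes a compressed array with a units attribute and reads back one slice. The variable name, attribute and temporary path are made up for illustration.

```python
import tempfile
from pathlib import Path

import h5py
import numpy as np

fn = Path(tempfile.mkdtemp()) / "example.h5"
data = np.arange(12, dtype=np.float32).reshape(3, 4)

# write: one dataset with gzip compression plus a metadata attribute
with h5py.File(fn, "w") as f:
    dset = f.create_dataset("temperature", data=data, compression="gzip")
    dset.attrs["units"] = "K"

# read: slicing the dataset pulls only the requested part from disk
with h5py.File(fn, "r") as f:
    row = f["temperature"][1, :]
    units = f["temperature"].attrs["units"]

print(row, units)
```

Because slicing happens on the file rather than on an in-memory array, the same pattern works for datasets far larger than RAM.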