The ubiquitous GNU coreutils has long been missing from Windows.
Until now, we invoked these utilities on Windows through WSL, using wsl <coreutils command>.
Microsoft has enhanced Rust-based uutils coreutils to run natively on Windows, and has made it available via WinGet:
winget install --id=Microsoft.Coreutils -e
Close and reopen any Terminal windows to use coreutils. Tool conflict behavior and availability differ between the ComSpec Command Prompt and PowerShell.
Some commands are so POSIX-intrinsic that they are not available or relevant in Microsoft coreutils.
Other coreutils commands overlap so much or conflict with Windows intrinsic commands that they are omitted from Microsoft coreutils.
There are distinctions in command parsing between Microsoft coreutils and standard coreutils to be aware of.
A general issue across systems that use “coreutils”, such as embedded or other minimal systems where not all coreutils are available, is that the developer must handle cases where a coreutils tool isn’t available or is shadowed by something else.
Build systems like CMake also handle these problems, like what exactly is “gcc” when multiple compilers masquerade as “gcc” – CMake inspects the version string to formally ID the compiler vendor.
To use a script consuming coreutils on Windows, the script needs to handle issues like the following, where MSVC “link” is overriding coreutils “link”:
Have a look in Cargo.toml to see the Microsoft coreutils commands.
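One way a script might check which “link” it is actually getting is to inspect the version banner, in the same spirit as CMake inspecting a compiler's version string. The following is a minimal sketch, not a definitive method. The helper name identify_tool and the assumption that a coreutils build mentions “coreutils” in its --version output are illustrative and should be verified against the installed tools.

```python
import shutil
import subprocess


def identify_tool(name, marker="coreutils"):
    """Return (path, looks_like_coreutils) for the first `name` found on PATH.

    `marker` is an assumed substring of the --version banner; adjust it to
    whatever the coreutils build on your system actually prints.
    """
    path = shutil.which(name)
    if path is None:
        return None, False
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            errors="replace",  # tolerate non-UTF-8 banners from other vendors
            timeout=10,
            check=False,  # a foreign tool may reject --version; that is fine
        )
    except (OSError, subprocess.TimeoutExpired):
        return path, False
    banner = (result.stdout + result.stderr).lower()
    return path, marker in banner


if __name__ == "__main__":
    tool_path, is_coreutils = identify_tool("link")
    print(f"link -> {tool_path}, coreutils-like: {is_coreutils}")
```

A script could call this once at startup and fall back to a full path or a different approach when the tool found on PATH is not the expected one.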
We have long augmented CMake projects with Bash and PowerShell scripts to handle tasks too awkward for CMake.
Python usually has enough built-in capability in “os”, “pathlib”, and “shutil” to avoid needing coreutils in Python scripts.
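As a rough illustration of that point, the sketch below performs a few common file operations with only the standard library, with the approximate coreutils equivalent noted in comments. The function name demo and the file names are made up for this example.

```python
import shutil
import tempfile
from pathlib import Path


def demo(root: Path) -> list[str]:
    """Run a few coreutils-like operations under `root` and list the result."""
    src = root / "a" / "b"
    src.mkdir(parents=True, exist_ok=True)  # mkdir -p a/b
    note = src / "note.txt"
    note.write_text("hello\n")  # echo hello > a/b/note.txt
    copy = shutil.copy2(note, root / "copy.txt")  # cp -p a/b/note.txt copy.txt
    shutil.move(copy, root / "moved.txt")  # mv copy.txt moved.txt
    shutil.rmtree(root / "a")  # rm -r a
    return sorted(p.name for p in root.iterdir())  # ls


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(Path(tmp)))
```

Because these calls behave the same on Windows, macOS, and Linux, the script does not depend on which coreutils, if any, are on PATH.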
Could Windows with Microsoft coreutils be considered a GNU / Windows hybrid? Strictly speaking, no: Microsoft coreutils is based on uutils coreutils, which is an MIT-licensed reimplementation of GNU coreutils in Rust.
GNU / Linux is a common term for Linux distributions that include GNU utilities, and Microsoft coreutils brings many of those utilities to Windows.
While it’s not a full GNU environment, it does provide a significant portion of the GNU toolset on Windows, making it a sort of hybrid in terms of command-line utilities.
As background, the core components of a typical GNU/Linux system include:
Linux Kernel: Core of the system, handling hardware and process management
GNU Utilities: Essential tools for file management, text processing, and system administration
Display Server and Desktop Environment: X11 or Wayland for graphics, with desktop environments like GNOME, KDE Plasma, or Xfce
Package Manager: Software installation and updates (e.g., APT, DNF, Pacman).
Shell: Command-line interface for interacting with the system.
On Windows the core components include:
Windows Kernel: Core of the system, handling hardware and process management
Microsoft coreutils: Essential tools for file management, text processing
Display Server and Desktop Environment: Windows GUI for graphics and user interface
Package Manager: WinGet for software installation and updates
PowerShell: Command-line interface for interacting with the system
On macOS the core components include:
XNU Kernel: Core of the system, handling hardware and process management
BSD Utilities: Essential tools for file management, text processing, and system administration
Display Server and Desktop Environment: Quartz for graphics, with the macOS desktop environment
Package Manager: Homebrew for software installation and updates
Shell: Terminal with Zsh for command-line interface
Python has become a dominant language for scientific computing, data analysis, machine learning, and engineering workflows.
Julia offers a modern high-performance syntax specifically designed for numerical and scientific computing.
GNU Octave is an open-source MATLAB alternative with largely MATLAB-compatible syntax.
GNU Octave continues to be developed by John W. Eaton.
Octave is a high-level interpreted language designed for numerical computations.
The community continues to release major versions roughly yearly.
Octave shines when you need:
Near drop-in compatibility with MATLAB .m files (as long as proprietary toolboxes aren’t required).
A quick way to test whether it’s worth porting a MATLAB function or script to Python.
Calling MATLAB/Octave functions directly from Python using Oct2Py.
Octave includes its own growing set of packages (toolboxes) that extend its capabilities in areas like signal processing, control systems, and optimization.
Julia is a modern, high-performance language designed specifically for scientific and numerical computing. It aims to combine the ease of use of Python/MATLAB with the speed of C/Fortran.
Julia excels when:
You need high performance without dropping to lower-level languages (JIT compilation often delivers near-C speeds for numerical loops and linear algebra).
Working on large-scale simulations, differential equations, optimization, or other compute-intensive scientific tasks.
You want a clean, math-friendly syntax with advanced features like multiple dispatch, metaprogramming, and excellent built-in support for parallelism and distributed computing.
Reproducibility and package management are priorities (via its built-in package manager).
Julia has strong libraries for data science, machine learning, visualization, and more, though its overall ecosystem is smaller than Python’s. It’s particularly appealing for researchers writing performance-critical code from scratch.
Vast ecosystem: NumPy, SciPy, Pandas, Matplotlib, scikit-learn, PyTorch/TensorFlow, and thousands of other specialized libraries cover everything from microcontrollers to supercomputers.
Scalability: The same language and core libraries work from embedded devices → Raspberry Pi → laptops → HPC clusters.
Reproducibility: Open-source nature means anyone can run your code with pip install or conda environments—no license server or version-matching headaches.
Embedded / IoT support: Since 2014, MicroPython has brought a capable subset of Python (including exception handling, coroutines, etc.) to low-cost hardware like the Raspberry Pi Pico and many other MCUs/SoCs.
Python’s general-purpose nature also makes it easier to integrate with web apps, databases, GUIs, automation scripts, and version control workflows—areas where Octave is weaker.
Octave for pure MATLAB feel; Python for broader skills; Julia for high-performance numerical work
Large-scale data analysis & ML: Python (mature ecosystem and tooling)
High-performance numerical simulations: Julia or Python + Numba/Cython (Julia for clean high-speed code)
Embedded / low-cost hardware: Python (MicroPython) (much broader hardware support)
Reproducible open research: Python or Julia (no licensing barriers)
Existing large MATLAB codebase: Octave (or Python + oct2py) (minimize immediate rewrite cost)
With Python and Oct2Py, Octave can be a bridge for those transitioning away from MATLAB.
While Python is often a default choice for new projects, Julia can be a compelling alternative for high-performance numerical work.
The default interactive shell for operating systems is typically:
Linux: Bash
macOS: Zsh
Windows: PowerShell
Note that the non-interactive shell may default to a simpler POSIX shell like Dash, so ensure that each script’s shebang line specifies the intended shell.
Each shell vendor has configuration files to change the default shell parameters.
Shells typically have a persistent command history file that stores the commands that have been executed.
This allows users to recall and reuse previous commands.
A very long history may retain mistyped commands or commands that are no longer relevant.
Get the location of the Bash command history file:
echo "${HISTFILE:-$HOME/.bash_history}"
Edit the ~/.bashrc file to include the following settings:
# Number of commands remembered in the current session (in memory)
export HISTSIZE=500
# Number of commands saved to the history file on disk
# Keep at least a little bigger than HISTSIZE to handle duplicates
export HISTFILESIZE=1000
# Ignore consecutive duplicates and commands that start with a space
export HISTCONTROL=ignoredups:ignorespace
Edit the ~/.zshrc file to include the following settings:
# Number of commands remembered in the current session (in memory)
export HISTSIZE=500
# Number of commands saved to the history file on disk
# (Zsh uses SAVEHIST for this; HISTFILESIZE is a Bash variable)
export SAVEHIST=1000
setopt hist_ignore_dups
setopt hist_ignore_space
There isn’t a built-in way to have C++ tell what compiler flags were used to compile a library or executable.
Fortran has the intrinsic function compiler_options(), provided by the iso_fortran_env module, which returns the compiler options used to compile a Fortran program.
This example C++ program demonstrates how to print the compiler flags used to compile the C++ program or library itself.
CMake can be configured to use shorter paths for build paths, which is important for large or complex projects on Windows where the 260 character path limit is a problem for some tools.
This is done via CMAKE_INTERMEDIATE_DIR_STRATEGY, which is a CMake environment variable as well as a CMake command-line option.
The default is to use full paths for human readability, but for those occasions where the path length is a problem, this option can be set to SHORT to use shorter paths.
This example below is contrived to use a long source file path - the problem in practice comes from nested dependencies and build directories, which can easily exceed the 260 character limit on Windows when building a project with CMake.
However, this example still demonstrates the issue and the solution with the shorten build path option.
cmake_minimum_required(VERSION 4.2)
project(soLong LANGUAGES CXX)
# make a long path to demonstrate the issue
set(long_path "${CMAKE_BINARY_DIR}/this/is/a/very/long/path/that/will/exceed/the/260/character/limit/on/windows/when/building/a/project/with/cmake/lets/see/if/it/works/with/the/shorten/build/path/option/just/to/make/sure/it/is/long/enough/to/exceed/the/limit/we/need/to/make/sure/it/is/long/enough/to/exceed/the/limit/")
string(LENGTH "${long_path}" L)
message(STATUS "Long path length: ${L} characters")
message(STATUS "CMAKE_INTERMEDIATE_DIR_STRATEGY: ${CMAKE_INTERMEDIATE_DIR_STRATEGY}")
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
message(STATUS "See file ${CMAKE_BINARY_DIR}/compile_commands.json for the compile commands with the long path")
file(MAKE_DIRECTORY "${long_path}")
file(GENERATE OUTPUT "${long_path}/main.cpp" CONTENT "int main() { return 0; }")
add_executable(soLong "${long_path}/main.cpp")
Compare the -o flag parameter between these two commands.
The output path with SHORT will be much shorter than with FULL.
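To compare output path lengths between two build directories without reading the JSON by eye, a small script can pull the object file paths out of compile_commands.json. This is a minimal sketch under a few assumptions: the “output” key is optional in the compilation database format, so the code falls back to a GCC/Clang-style -o argument in “command”. MSVC-style /Fo flags are not handled, and the function names are illustrative.

```python
import json
import shlex
from pathlib import Path


def output_paths(compile_commands: Path) -> list[str]:
    """Collect the object-file path of each entry in a compilation database."""
    entries = json.loads(compile_commands.read_text())
    outputs = []
    for entry in entries:
        out = entry.get("output")  # optional key in the database format
        if out is None and "command" in entry:
            # posix=False keeps Windows backslashes intact when splitting
            args = shlex.split(entry["command"], posix=False)
            if "-o" in args:
                idx = args.index("-o")
                if idx + 1 < len(args):
                    out = args[idx + 1]
        if out is not None:
            outputs.append(out)
    return outputs


def longest_output(compile_commands: Path) -> int:
    """Length in characters of the longest output path, or 0 if none found."""
    return max((len(p) for p in output_paths(compile_commands)), default=0)


if __name__ == "__main__":
    import sys

    for arg in sys.argv[1:]:
        print(arg, longest_output(Path(arg)))
```

Running it on the compile_commands.json from a FULL build and from a SHORT build gives two numbers that can be compared directly against the 260 character limit.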
CMake can print cache variables during the configuration phase using any of these methods.
The “cmake” command itself can print cache variables to the console.
Variable values may be set by passing -D options to the “cmake” command, or by editing them in the CMake GUI or “ccmake” interface.
cmake -Bbuild -LAH
-L: Print only the variable names and values, without help messages.
-LA: Print all variables, including advanced ones that are not shown by default.
-LAH: Also print the help message for each variable.
The CMake GUI is available if installed and a graphical desktop is available.
Press “Configure” to see the cache variables.
Values may be edited if desired.
cmake-gui -S . -B build
The “ccmake” Curses-based interface is available on non-Windows platforms, which can also edit cache variables.
ccmake -B build
From the “ccmake” interface, press “c” to configure, “t” to toggle visibility of Advanced variables that are not shown by default.
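After a configure step, the cache is also stored in the CMakeCache.txt file in the build directory, with entries of the form NAME:TYPE=VALUE and comment lines starting with “#” or “//”. For scripted inspection, a minimal sketch of a reader is shown here. It is a simplification that assumes variable names do not themselves contain “:” or “=”, and the function name read_cmake_cache is illustrative.

```python
from pathlib import Path


def read_cmake_cache(cache_file: Path) -> dict[str, tuple[str, str]]:
    """Map each cache variable name to a (type, value) pair."""
    cache = {}
    for raw in cache_file.read_text().splitlines():
        line = raw.strip()
        # skip blank lines, '#' comments, and '//' help-text lines
        if not line or line.startswith(("#", "//")):
            continue
        name_and_type, sep, value = line.partition("=")
        if not sep:
            continue  # not a NAME:TYPE=VALUE entry
        name, _, var_type = name_and_type.partition(":")
        cache[name] = (var_type, value)
    return cache


if __name__ == "__main__":
    for name, (var_type, value) in read_cmake_cache(Path("build/CMakeCache.txt")).items():
        print(f"{name} ({var_type}) = {value}")
```

This only reads the file. Changing values is better left to -D options, the CMake GUI, or “ccmake”, as described above.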
Mathworks published a Terminal emulator app for Matlab, which is performant and well-integrated with Matlab.
It does not require any Matlab toolboxes and can be used on all platforms that Matlab supports.
Matlab Terminal supports multiple tabs, customizable themes, and various shell and AI Agent environments.
For those using a separate IDE like VS Code with Matlab, the integrated terminal in VS Code is still a good choice.
For users who prefer to work directly in Matlab, this Matlab Terminal app is a great addition.
Symbolic links are useful in any operating system to shorten long, complicated path names like C:/user/foo/data to just C:/data.
If you encounter problems with user permissions, set user permission to create symbolic links on Windows.
The reparse tag value 0x8000001b is a Windows App Execution Alias IO_REPARSE_TAG_APPEXECLINK.
App Execution Aliases are not symbolic links, but are a way for Windows CreateProcess to find the correct executable to run from a user-friendly name like “wt.exe” or “bash.exe”.
Not every language works with App Execution Aliases at this time. For example, Java io and nio currently don’t work with them.
Python does work with App Execution Aliases, for example:
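A minimal sketch of how a Python script could detect an App Execution Alias is to compare the reparse tag reported by os.lstat() against the value 0x8000001b mentioned above. The st_reparse_tag field exists only on Windows, so the code treats its absence as “not an alias”. The helper name is_app_exec_alias is illustrative, and whether “wt.exe” resolves to an alias depends on the machine.

```python
import os
import shutil

# Reparse tag for Windows App Execution Aliases
IO_REPARSE_TAG_APPEXECLINK = 0x8000001B


def is_app_exec_alias(path) -> bool:
    """Return True if `path` is a Windows App Execution Alias."""
    try:
        # lstat() inspects the reparse point itself instead of following it
        info = os.lstat(path)
    except OSError:
        return False
    # st_reparse_tag is only present on Windows
    return getattr(info, "st_reparse_tag", 0) == IO_REPARSE_TAG_APPEXECLINK


if __name__ == "__main__":
    exe = shutil.which("wt.exe")
    if exe is None:
        print("wt.exe not found on PATH")
    else:
        print(exe, "is an App Execution Alias:", is_app_exec_alias(exe))
```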
GCC / Gfortran 10 and newer warn for arrays too big for the current stack settings.
Arrays that exceed the stack limit may cause unexpected behavior. In general, such arrays should be declared ALLOCATABLE and created with allocate() instead.
Example of improper use of stack memory:
Warning: Array ‘big2’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using ‘-frecursive’, or increase the ‘-fmax-stack-var-size=’ limit, or change the code to use an ALLOCATABLE array. [-Wsurprising]
This is generally a true warning when one has assigned arrays as above too large for the stack.
Simply making the procedure recursive may lead to segfaults.
Windows has particular linking requirements for shared libraries that can become challenging with MSVC-like compilers such as Intel oneAPI when linking Fortran and C libraries together.
In short, the workaround is to use static libraries instead of shared libraries for such cases on Windows.
Example: MUMPS-superbuild project provides MUMPS libraries in the same way as the original MUMPS project’s Makefiles, that is with a library called “mumps_common” and then a library for each of 4 numerical precisions “smumps”, “dmumps”, “cmumps”, and “zmumps” that link against “mumps_common”.
This is quite robust across compilers and linkers - the only issue is on Windows with oneAPI when building shared libraries.
Building shared libraries for MUMPS-superbuild is done by:
cmake --workflow shared
Most unresolved externals with oneAPI on Windows were symbols like:
A first hypothesis was that auto-export was incomplete.
CMake on Windows can auto-export symbols via CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS.
In mixed C and Fortran projects, that is often good enough.
Given the unresolved names, it was reasonable to suspect missing DATA exports from mumps_common.dll.
We ran a few checks to diagnose:
Inspected linker diagnostics to collect unresolved symbol families.
Inspected generated export definition .def files to see what CMake actually exported.
Compared expected names versus actual exports in the produced DLL/import library path.
Tested oneAPI-specific compile/link flag ideas.
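The comparison of unresolved names against the generated .def files can be partly automated. The sketch below is an assumption-laden illustration rather than the tooling used for this investigation: it parses the EXPORTS section of a module-definition file and reports which names from an unresolved list are absent. The symbol names in the usage example are made up.

```python
# Other module-definition statements that end an EXPORTS section
_OTHER_STATEMENTS = {"LIBRARY", "NAME", "HEAPSIZE", "STACKSIZE", "SECTIONS", "STUB", "VERSION"}


def parse_def_exports(def_text: str) -> dict[str, bool]:
    """Map each exported symbol to True if it carries the DATA keyword."""
    exports = {}
    in_exports = False
    for raw in def_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # ';' starts a comment
        if not line:
            continue
        tokens = line.split()
        keyword = tokens[0].upper()
        if keyword == "EXPORTS":
            in_exports = True
            tokens = tokens[1:]  # a definition may follow on the same line
            if not tokens:
                continue
        elif keyword in _OTHER_STATEMENTS:
            in_exports = False
            continue
        if in_exports:
            name = tokens[0].split("=", 1)[0]  # drop any '=internal_name'
            exports[name] = "DATA" in (t.upper() for t in tokens[1:])
    return exports


def missing_exports(unresolved, def_text: str) -> list[str]:
    """Unresolved symbol names that the .def file does not export."""
    exported = parse_def_exports(def_text)
    return sorted(name for name in unresolved if name not in exported)


if __name__ == "__main__":
    # Hypothetical names, for illustration only
    sample = "EXPORTS\n\tfoo_func\n\tFOO_MOD_mp_COUNT DATA\n"
    print(missing_exports({"FOO_MOD_mp_COUNT", "FOO_MOD_mp_OTHER"}, sample))
```

A check like this only shows whether a name was exported. As described below, supplementing the missing exports was not sufficient on its own.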
Auto-export did miss symbols that appeared in unresolved lists.
But even after supplementing exports, unresolved externals persisted.
We tested multiple fixes, including:
Supplemental export definitions for missing DATA symbols.
oneAPI flag experiments intended to improve dynamic common/module handling.
Internal bridge-style link topology changes.
Why these were rejected:
Supplemental exports alone did not clear unresolved module-data references.
oneAPI flag experiments were either ineffective or unstable for this build.
Bridge-link approaches can alter expected import-library behavior and create downstream risk for users linking mumps_common.lib.
The root cause appears to be broader than CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS.
Auto-export can be incomplete, but even with explicit extra exports, oneAPI Fortran on Windows still struggled to resolve cross-DLL module/common data references in this configuration.
To maintain mixed Fortran/C HPC packaging across compilers, validate shared-library topology on every target compiler, especially Windows.
Treat Fortran module/common data across Windows DLL boundaries as a first-class risk.
To keep it simple - use static libraries.