Jupyter notebook outputs can be large (plots, images, etc.), which bloats the Git repository history and slows Git operations as the history grows.
Jupyter notebook outputs can reveal personal information with regard to usernames, Python executable, directory layout, and data outputs.
Strip all Jupyter outputs from Git tracking with a client-side Git pre-commit hook.
We use a Git pre-commit hook because Git filters can interfere with other programs such as CMake ExternalProject.
Configure Git user-wide to use an IPython script that strips Jupyter notebook outputs:
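A minimal sketch of what such a stripping script might look like, assuming the notebooks use the standard nbformat 4 JSON layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"). The function names strip_outputs and strip_file are illustrative, not taken from the original script, and the way the hook passes file names is an assumption.

```python
import json
import sys
from pathlib import Path


def strip_outputs(notebook: dict) -> dict:
    """Clear outputs and execution counts of code cells in place; return the dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_file(path: Path) -> None:
    """Rewrite one .ipynb file in place without outputs."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    text = json.dumps(strip_outputs(notebook), indent=1, ensure_ascii=False)
    path.write_text(text + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Assumption: the pre-commit hook passes staged .ipynb paths as arguments.
    for name in sys.argv[1:]:
        strip_file(Path(name))
```

Markdown cells are left untouched, so only executed results and counters are removed from what Git tracks.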
In some environments, Fortran Package Manager commands like fpm build or fpm test can fail when a dependency (say HDF5) is resolved through pkg-config and the .pc file includes a system include path such as -I/usr/include.
Git does not have a built-in mechanism for recording multiple authors on a single Git commit.
A Git coauthor notation convention has become accepted by major services including GitHub and GitLab.
Git itself can programmatically parse arbitrary Git commit trailers but does not have a built-in notion of coauthors.
Indicate Git coauthor by placing plaintext in the commit message body.
The email address cited must match a registered email with the Git service.
The email can be a working public email or the “fake” noreply email provided by the Git service.
Multiple coauthors each use the same syntax on the same Git commit like:
added foo function to bar.py

Co-authored-by: David <snake@users.noreply.github.com>
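As a sketch of how several coauthors stack on one commit, the helper below appends one trailer line per coauthor. The function name with_coauthors and the second coauthor "Alex" are hypothetical and used only for illustration. Trailers are placed in the last paragraph of the message, separated from the body by a blank line.

```python
def with_coauthors(message: str, coauthors: list[tuple[str, str]]) -> str:
    """Append one Co-authored-by trailer per (name, email) pair."""
    trailers = [f"Co-authored-by: {name} <{email}>" for name, email in coauthors]
    # Trailers form the final paragraph, after a blank line.
    return message.rstrip() + "\n\n" + "\n".join(trailers) + "\n"


if __name__ == "__main__":
    text = with_coauthors(
        "added foo function to bar.py",
        [
            ("David", "snake@users.noreply.github.com"),
            ("Alex", "alex@users.noreply.github.com"),  # hypothetical coauthor
        ],
    )
    print(text, end="")
```

The resulting text can be passed to git commit with the -F option or pasted into the commit editor.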
The coauthor commits do show up in GitHub search under “Commits”.
Caveats: as with regular Git commits, there is no authentication to avoid someone masquerading as someone else with Git coauthor commits.
Git coauthor commits cannot be GPG signed by each coauthor; only the primary Git committer can GPG sign as usual.
Commands like git rebase can use --trailer to add trailers to the Git commit message, for example to indicate who reviewed the rebase:
git rebase --trailer "Reviewed-by: Nobody <nobody@users.noreply.github.com>"
When only a subdirectory of a Git repository is opened in Visual Studio Code, repo-root Copilot customizations like .github/copilot-instructions.md are not discovered by default.
This can make Copilot ignore repository-wide instructions even though they exist at the top of the current Git repository.
Visual Studio Code has a built-in setting that resolves this issue by enabling parent repository discovery for chat customizations.
With this setting enabled, VS Code walks upward from the opened workspace folder until it finds .git.
It then discovers chat customizations between the opened folder and the repository root, including:
.github/copilot-instructions.md
.github/instructions/*.instructions.md
prompt files
agent files such as AGENTS.md
hooks and other chat customizations
This setting is especially useful for monorepos and for workflows that open a focused subdirectory such as content/posts/, src/, or packages/frontend/ instead of the full repository root.
Without parent repository discovery, Copilot can miss repository-specific style and validation rules.
A few conditions apply:
the opened folder must not itself be a separate Git repository (e.g. Git submodule)
a parent folder must contain .git
the parent repository folder must be trusted in VS Code
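The upward walk and the first two conditions can be sketched as follows. This is an illustration of the described behavior, not VS Code's actual implementation; the function name find_parent_repo is hypothetical, and the workspace trust check is not modeled.

```python
from pathlib import Path
from typing import Optional


def find_parent_repo(opened: Path) -> Optional[Path]:
    """Return the nearest ancestor of `opened` that contains .git, or None."""
    opened = opened.resolve()
    if (opened / ".git").exists():
        # The opened folder is itself a repository (or submodule),
        # so parent discovery does not apply.
        return None
    for parent in opened.parents:
        # .git may be a directory, or a file in submodules and worktrees.
        if (parent / ".git").exists():
            return parent
    return None
```

For a repository at /repo opened at /repo/content/posts, the sketch returns /repo, which is where files such as .github/copilot-instructions.md would then be looked up.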
To verify that the repository instructions are in use, inspect the References list on a Copilot Chat response.
If parent discovery is working, the response references typically include the repo-root customization files.
The temp folder on a Linux computer can be purged on a schedule to free up disk space and remove old temporary files.
The programs “tmpwatch” or “tmpreaper” can be used to purge the temp folder on a schedule.
tmpwatch is available on Red Hat-based Linux distributions, while tmpreaper is available on Debian-based Linux distributions.
To do a “dry run” of the purge command to see what files would be deleted, use the “--test” flag:
<tmpwatch|tmpreaper> --test --mtime 7d /tmp
Set the temp path explicitly, especially on HPC systems where scratch space may be under system-specific paths.
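For systems where neither tool is installed, a comparable dry run can be sketched in Python. This is not how tmpwatch or tmpreaper work internally; it only lists files whose modification time is older than a cutoff and deletes nothing. The function name stale_files is hypothetical.

```python
import os
import time
from pathlib import Path


def stale_files(root: Path, max_age_days: float) -> list[Path]:
    """List files under root not modified within max_age_days. Deletes nothing."""
    cutoff = time.time() - max_age_days * 86400
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat: judge a symlink by its own mtime, not its target's.
                if path.lstat().st_mtime < cutoff:
                    found.append(path)
            except OSError:
                continue  # file vanished or is not accessible
    return found
```

Calling stale_files(Path("/tmp"), 7) mirrors the 7-day dry run above; pass the site-specific scratch path explicitly on HPC systems.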
To remove duplicate entries from shell history, so that pressing “up” after repeated commands gives the last non-duplicated command, configure the respective shell as follows.
Bash: “~/.bashrc”: ignores duplicate lines and omits lines that start with a space.
Measuring the peak RAM usage of a process and all its children can be done using various tools and techniques.
OS-dependent tools may be the most accurate, but they can be complex to use.
A simpler approach is to periodically sample the RAM usage of the process and its children, as in these scripts for Linux and macOS that use ps.
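A minimal sketch of the sampling approach, assuming a ps that accepts the POSIX options -A and -o pid=,ppid=,rss= and reports RSS in KiB. The function names are hypothetical, and this is not the script referenced above. Sampling can miss short spikes between samples, and processes that detach from the tree are not counted.

```python
import subprocess
import time
from collections import defaultdict


def tree_rss_kib(ps_text: str, root_pid: int) -> int:
    """Sum RSS (KiB) of root_pid and its descendants from 'pid ppid rss' lines."""
    children = defaultdict(list)
    rss = {}
    for line in ps_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        pid, ppid, kib = map(int, fields)
        rss[pid] = kib
        children[ppid].append(pid)
    total, stack, seen = 0, [root_pid], set()
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue  # guards against a pid listed as its own parent
        seen.add(pid)
        total += rss.get(pid, 0)
        stack.extend(children[pid])
    return total


def peak_rss_kib(command: list[str], interval: float = 0.1) -> int:
    """Run command and return the largest sampled RSS total of its process tree."""
    proc = subprocess.Popen(command)
    peak = 0
    while proc.poll() is None:
        snapshot = subprocess.run(
            ["ps", "-A", "-o", "pid=,ppid=,rss="],
            capture_output=True, text=True, check=True,
        ).stdout
        peak = max(peak, tree_rss_kib(snapshot, proc.pid))
        time.sleep(interval)
    return peak
```

A shorter interval catches more spikes at the cost of more ps invocations.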
On macOS or Linux it is also possible, though less accurate, to use /usr/bin/time.
However, this measures only the peak RAM usage of the largest child process, not the total across all children, so it is unsuitable for multiprocess applications like “mpiexec”.
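The same limitation is visible from Python's standard resource module, which exposes the operating system's per-process accounting. In this sketch (the function name is hypothetical), ru_maxrss for RUSAGE_CHILDREN is the maximum over waited-for descendants of the calling process, not a sum. The unit is KiB on Linux and bytes on macOS.

```python
import resource
import subprocess
import sys


def largest_child_maxrss(command: list[str]) -> int:
    """Run command to completion; return ru_maxrss of the largest waited-for child."""
    subprocess.run(command, check=True)
    # Maximum over all children waited for so far, not their total.
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    print(largest_child_maxrss([sys.executable, "-c", "x = bytearray(50_000_000)"]))
```

Because the value covers every child waited for so far, run one measurement per Python process to avoid mixing results.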
For Linux, a more accurate method uses cgroup v2 memory accounting, as implemented by cgmemtime.
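As a rough illustration of cgroup v2 accounting, the kernel exposes a memory.peak file in each non-root cgroup on Linux 5.19 and newer, recording the peak memory use of the group and its descendants in bytes. The sketch below only reads that file. It assumes the process tree was already placed in its own cgroup, and it is not a description of how cgmemtime works internally. The function name and the example path are hypothetical.

```python
from pathlib import Path


def cgroup_peak_bytes(cgroup_dir: Path) -> int:
    """Read the peak memory (bytes) recorded for a cgroup v2 directory."""
    # memory.peak holds a single integer followed by a newline.
    return int((cgroup_dir / "memory.peak").read_text().strip())
```

For example, cgroup_peak_bytes(Path("/sys/fs/cgroup/bench")) would report the peak for a hypothetical group named bench.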
For macOS, the Instruments tool can be used to measure the RAM usage of a process and its children, but it requires a ‘codesign’d application and is more complex to set up.
xcrun xctrace record --template "Game Memory" --output bench_game.trace --time-limit 30s --launch -- /path/to/application
open bench_game.trace
Git aliases can be used to create compact commands for frequently used Git operations.
These aliases reduce typing (and typos) for frequent operations.
In some cases, aliases can also let older versions of Git accept newer-style syntax (at least for the compatible parts of the command).
CMake cmake_language(TRACE) enables tracing of selected, nestable portions of a CMake script, which is important for debugging CMake projects because the full trace output is generally very large.
The trace output is large because CMake’s platform independence means that numerous checks are performed even for minimal CMake scripts.
This can make it difficult to find the relevant portion of the trace output for debugging.
The cmake_language(TRACE) command allows specification of a named portion of the CMake script to trace, including nested trace regions.
This is a powerful debugging tool because it narrows trace output to the relevant part of the CMake script instead of emitting the entire script trace.
To trace only part of a script, wrap that region with cmake_language(TRACE) as in this CMakeLists.txt example:
Observe that only the trace output for the find_package(Zlib) command is emitted, while the find_package(LAPACK) command and compiler discovery are not traced.