Just know stuffs
This is just a place for my personal use to keep track of what I have learnt, so that I can free up my mental space.
Even fairly good students, when they have obtained the solution of the problem and written down neatly the argument, shut their books and look for something else. Doing so, they miss an important and instructive phase of the work. … A good teacher should understand and impress on his students the view that no problem whatever is completely exhausted. One of the first and foremost duties of the teacher is not to give his students the impression that mathematical problems have little connection with each other, and no connection at all with anything else. We have a natural opportunity to investigate the connections of a problem when looking back at its solution. (George Pólya, “How to Solve It”)
Every composer knows the anguish and despair occasioned by forgetting ideas which one had no time to write down. (Hector Berlioz)
I might also make my work available here.
We must get beyond textbooks, go out into the bypaths … and tell the world the glories of our journey. (John Hope Franklin)
Model Architectures
- Score-based diffusion models. (Yang Song’s blog post, my gh repo)
- How residual networks can be seen as discretised ordinary differential equations. (NeuralODE)
- Learn U-Nets. Build a toy implementation.
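A quick numpy sketch of the ResNet-as-Euler-step idea, using a toy linear vector field of my own (not from the NeuralODE paper):

```python
import numpy as np

# A residual block computes x_{k+1} = x_k + f(x_k): exactly one explicit
# Euler step of the ODE dx/dt = f(x) with step size 1. Shrinking the step
# and stacking more blocks approximates the continuous-time flow.

def f(x):
    # toy "layer": a rotation vector field (stand-in for a learned function)
    A = np.array([[0.0, -1.0], [1.0, 0.0]])
    return A @ x

def residual_step(x, h=1.0):
    return x + h * f(x)  # ResNet update == Euler update with step h

x = np.array([1.0, 0.0])
for _ in range(100):            # 100 small residual blocks ~ flow to t = 1
    x = residual_step(x, h=0.01)
# x should be close to the exact flow of the rotation field: (cos 1, sin 1)
```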
Transformers
- Rotary Embeddings (EleutherAI’s blog post)
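A minimal numpy sketch of rotary embeddings; the shapes and names here are my own simplification, not the blog post's implementation:

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate consecutive pairs of features by position-dependent angles."""
    d = x.shape[-1]                             # must be even
    freqs = base ** (-np.arange(0, d, 2) / d)   # one frequency per pair
    theta = pos * freqs
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin        # 2D rotation of each pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# The point: the score between a rotated query and key depends only on the
# relative offset between their positions, not the absolute positions.
q = np.linspace(1.0, 2.0, 8)
k = np.linspace(0.5, 1.5, 8)
s1 = rope(q, 5) @ rope(k, 7)   # positions 5 and 7 (offset 2)
s2 = rope(q, 2) @ rope(k, 4)   # positions 2 and 4 (offset 2)
```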
Interpretability
- https://leonardbereska.github.io/blog/2024/mechinterpreview/
- https://www.anthropic.com/research/engineering-challenges-interpretability
- Activation Patching, nnsight
- An Adversarial Perspective on “Overinterpretation Reveals Image Classification Model Pathologies”
ML Engineering
- How to do hyperparameter optimisation via Bayesian optimisation. (Optuna Sweeper)
- Forward- and reverse-mode autodifferentiation: I started looking into this when I learned Jax. There are great tutorials on this.
- Zero-order and first-order optimisation techniques (Stanford’s slides). Why do we use first-order optimisation techniques, rather than anything else (such as Gauss–Newton, Newton–Raphson, Levenberg–Marquardt)?
- ML Frameworks:
  - jax.vmap
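The forward-mode half of the autodiff story fits in a tiny dual-number class; this is my own pure-Python toy, not how Jax does it internally:

```python
# Forward mode in ~20 lines: a dual number carries (value, derivative) and
# every operation updates both. Reverse mode instead records a tape and
# sweeps backwards, which is why it wins for many-inputs/one-output losses.

class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def _wrap(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._wrap(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = self._wrap(other)          # product rule below
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def poly(x):
    return 3 * x * x + 2 * x + 1   # derivative: 6x + 2

y = poly(Dual(2.0, 1.0))   # seed dot = 1 to differentiate w.r.t. x
# y.val == poly(2) == 17, y.dot == poly'(2) == 14
```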
Theoretical ML topics
- The manifold hypothesis and its implications in machine learning (Wiki, reddit discussion, Chris Olah’s blog post).
- Neural Tangent Kernel
Elementary
- The Moore–Penrose pseudoinverse of a matrix.
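A quick numpy reminder of what the pseudoinverse buys you, assuming a full-rank tall matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))   # tall full-rank matrix: overdetermined Ax = b
b = rng.normal(size=5)

x = np.linalg.pinv(A) @ b     # Moore–Penrose pseudoinverse solution
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
# pinv(A) @ b is the minimum-norm least-squares solution, so for a
# full-rank tall A it matches lstsq
```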
Software development
- How to pass muster as a junior developer
- Git commands. Don’t just delete it and clone from the remote when you mess up your git repo. (https://xkcd.com/1597/)
- Writing clean code and orthogonal abstractions. When things are messy, be willing to refactor them. Avoid spaghetti code and ravioli code.
- Introduction to writing CUDA kernels and integrating them with PyTorch. (my github repo)
- Jax (some of my repos: Thinking in mazes, gpt2-jax)
- Hydra
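The jax.vmap pattern I keep relooking up, as a toy model of my own (just to remember the in_axes convention):

```python
import jax
import jax.numpy as jnp

def predict(w, x):
    return jnp.dot(w, x)      # per-example toy "model"

w = jnp.ones(3)
xs = jnp.arange(12.0).reshape(4, 3)   # batch of 4 examples

# vmap maps over axis 0 of xs while broadcasting w (in_axes=None)
batched = jax.vmap(predict, in_axes=(None, 0))
ys = batched(w, xs)           # shape (4,): one prediction per example
```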