Just know stuffs
This is just a place for my personal use to keep track of what I have learnt, so that I can free up my mental space.
Even fairly good students, when they have obtained the solution of the problem and written down neatly the argument, shut their books and look for something else. Doing so, they miss an important and instructive phase of the work. … A good teacher should understand and impress on his students the view that no problem whatever is completely exhausted. One of the first and foremost duties of the teacher is not to give his students the impression that mathematical problems have little connection with each other, and no connection at all with anything else. We have a natural opportunity to investigate the connections of a problem when looking back at its solution. (George Pólya, “How to Solve It”)
Every composer knows the anguish and despair occasioned by forgetting ideas which one had no time to write down. (Hector Berlioz)
I might also make my work available here.
We must get beyond textbooks, go out into the bypaths … and tell the world the glories of our journey. (John Hope Franklin)
Model Architectures
- Score-based diffusion models. (Yang Song’s blog post, my gh repo)
- How residual networks can be viewed as discretised ordinary differential equations (Neural ODE); see the sketch after this list.
- Learn U-Nets. Build a toy implementation.
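A quick way to see the ResNet/ODE connection (a minimal sketch; `f` here is a hypothetical stand-in for a residual branch): a residual block computes `x + f(x)`, which is one explicit Euler step of `dx/dt = f(x, t)` with step size 1; shrinking the step size recovers the continuous flow.

```python
import numpy as np

def f(x, t):
    # Hypothetical stand-in for a residual branch / the ODE vector field.
    return np.tanh(x + 0.1 * t)

def resnet_forward(x, num_blocks=10):
    # Each residual block computes x <- x + f(x, t): an explicit Euler step with dt = 1.
    for t in range(num_blocks):
        x = x + f(x, t)
    return x

def euler_ode(x, t0=0.0, t1=10.0, steps=1000):
    # The same update with a small step size dt approximates the ODE dx/dt = f(x, t).
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        x = x + dt * f(x, t)
        t += dt
    return x

x0 = np.array([0.5, -0.2])
print(resnet_forward(x0), euler_ode(x0))
```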
Transformers
- Rotary Embeddings (EleutherAI’s blog post)
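A minimal NumPy sketch of rotary embeddings, using the "rotate half" convention popularised by GPT-NeoX (the function name and choice of base are my assumptions; see the EleutherAI post for the exact formulation and the interleaved variant):

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    half = dim // 2
    # Pair i rotates with angle position * base**(-2i/dim), as in the RoFormer paper.
    freqs = base ** (-np.arange(half) / half)               # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # A 2D rotation applied to each (x1_i, x2_i) pair.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

q = np.random.randn(16, 8)   # (seq_len, head_dim)
q_rot = rotary_embed(q)
```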
Interpretability
- MechInterp Preview
- Engineering Challenges in Interpretability
- Tools and methods: Activation Patching, nnsight (see the sketch after this list)
- An Adversarial Perspective on “Overinterpretation Reveals Image Classification Model Pathologies”
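A minimal sketch of activation patching using plain PyTorch forward hooks on a toy model rather than nnsight (the model, inputs, and layer choice are all hypothetical): cache an activation from a "clean" run, splice it into a "corrupted" run, and see how much of the clean behaviour it restores.

```python
import torch
import torch.nn as nn

# Toy model standing in for a real network (hypothetical).
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
clean_x, corrupt_x = torch.randn(1, 8), torch.randn(1, 8)

# 1) Cache the first layer's activation on the clean input.
cache = {}
def save_hook(module, inputs, output):
    cache["h"] = output.detach()

handle = model[0].register_forward_hook(save_hook)
clean_out = model(clean_x)
handle.remove()

# 2) Re-run on the corrupted input, patching in the cached clean activation.
def patch_hook(module, inputs, output):
    return cache["h"]  # returning a value from a forward hook replaces the output

handle = model[0].register_forward_hook(patch_hook)
patched_out = model(corrupt_x)
handle.remove()

corrupt_out = model(corrupt_x)
# Compare clean_out, corrupt_out, patched_out to see how much this activation matters.
print(clean_out, corrupt_out, patched_out)
```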
ML Engineering
- How to do hyperparameter optimisation via Bayesian optimisation (Optuna Sweeper); see the sketch after this list.
- Forward- and reverse-mode autodifferentiation: I started looking into this when I learned Jax. There are great tutorials on this.
- Zero-order and first-order optimisation techniques (Stanford's slides). Why do we use first-order optimisation techniques rather than anything else (such as Gauss-Newton, Newton-Raphson, Levenberg-Marquardt)?
- ML Frameworks: jax.vmap (see the JAX sketch after this list)
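For the Optuna item, a minimal sketch of what the optimisation loop looks like (the training function and search space are made up; Optuna's default TPE sampler does the Bayesian-style sequential search over the trial suggestions):

```python
import optuna

def train_and_evaluate(lr, width):
    # Hypothetical stand-in for a real training loop; returns a fake validation loss.
    return (lr - 1e-3) ** 2 + 1.0 / width

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    width = trial.suggest_int("width", 32, 512, log=True)
    return train_and_evaluate(lr=lr, width=width)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```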
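And for the autodiff and jax.vmap items, a small JAX sketch with a toy loss and made-up shapes: `jax.grad` gives a reverse-mode gradient for one example, and `jax.vmap` maps that per-example gradient over a batch without an explicit loop.

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # Toy per-example squared error (hypothetical model: a linear predictor).
    return (jnp.dot(w, x) - y) ** 2

w = jnp.ones(3)
xs = jnp.arange(12.0).reshape(4, 3)   # batch of 4 examples
ys = jnp.arange(4.0)

# Reverse-mode gradient w.r.t. w for a single example.
g_single = jax.grad(loss)(w, xs[0], ys[0])

# vmap maps the same per-example gradient over the whole batch.
per_example_grads = jax.vmap(jax.grad(loss), in_axes=(None, 0, 0))(w, xs, ys)
print(g_single.shape, per_example_grads.shape)   # (3,) (4, 3)
```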
Theoretical ML topics
- The manifold hypothesis and its implications in machine learning (Wiki, reddit discussion, Chris Olah’s blog post).
- Neural Tangent Kernel
Elementary
- The Moore–Penrose pseudoinverse of a matrix.
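A small worked example (assuming a tall matrix with full column rank): build the pseudoinverse from the SVD, check it against NumPy's built-in, and use it for the least-squares solution of an overdetermined system.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])          # tall, full column rank

# A = U S V^T  =>  A^+ = V S^+ U^T, inverting only the nonzero singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

assert np.allclose(A_pinv, np.linalg.pinv(A))

# A^+ b is the least-squares solution of A x = b.
b = np.array([1.0, 0.0, 1.0])
print(A_pinv @ b)
```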
Software development
- How to pass muster as a junior developer
- Git commands. Don’t just delete it and clone from the remote when you mess up your git repo. (https://xkcd.com/1597/)
- Writing clean code, orthogonal abstractions. When things are messy, be willing to refactor them. Avoid spaghetti code and ravioli code.
- Introduction to writing CUDA kernels and integration with PyTorch. (my GitHub repo)
- Jax (some of my repos: Thinking in mazes, gpt2-jax)
- Hydra
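A minimal Hydra entry point, assuming a recent Hydra version and a conf/config.yaml sitting next to the script (both are assumptions in this sketch):

```python
import hydra
from omegaconf import DictConfig, OmegaConf

# Assumes conf/config.yaml exists, e.g.:
#   lr: 3e-4
#   batch_size: 64
@hydra.main(config_path="conf", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # Prints the resolved config, including any command-line overrides,
    # e.g. `python train.py lr=1e-3 batch_size=128`.
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    main()
```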
My thoughts on research - v0.5
So I started writing this because I feel a growing uncertainty about how we, as AI/ML researchers, should do research in a world where AI is rapidly improving and beginning to automate AI research and develop its own successor.
This question isn’t entirely new to me. Back in 2021, during a conversation with my former advisor about neural architecture search, we wondered: If AI becomes so good that it can iteratively improve itself to solve specific tasks, what should ML researchers do then? At the time, my answer—from an undergraduate who hadn’t done any research—was that we could try to understand these systems. If machines exhibit intelligence, then studying them might help us understand our own. That belief is what drew me toward machine learning theory.
But there was an implicit assumption in that answer: that “doing research” would remain the last frontier—something machines couldn’t fully take over. I’m no longer sure that’s true.
Today, the question feels more urgent—but my answer hasn’t fundamentally changed. What has changed is the research process. We’re already seeing systems that automate large parts of the research workflow: generating ideas, running experiments, iterating on code, and drafting papers. In this emerging paradigm, researchers propose directions—and AI executes.
This shift brings undeniable gains in productivity, but it also creates a real tension—especially for students. If you don’t adopt these tools, you risk falling behind in speed and output. If you do, you risk outsourcing the very skills that shape your development as a researcher.
So what should we do? I think the answer depends on what you believe the goal of your research is. For me, it remains the same: to understand, to explain, and to create knowledge—not just to produce results. In that sense, the philosophy of research doesn’t change, even if the workflow does.
Take writing as an example. AI can already help generate drafts, refine language, and even produce an entire research paper. But writing isn’t just about producing text—it’s a way of thinking. It forces clarity, exposes gaps in understanding, and shapes the ideas themselves. If we fully delegate that process, we risk losing these “by-products”. The same applies more broadly. If we reduce research to proposing ideas and validating them through automated pipelines, we might become efficient—but also shallow.
This reminds me of what happened when AlphaGo defeated the strongest human Go players. For many top players, it triggered a kind of existential crisis: What is the meaning of playing Go now? But maybe that question was always there. Before AlphaGo, playing Go was about mastery, creativity, and understanding the game deeply. After AlphaGo, those values didn’t disappear—they just shifted. Players began to learn from AI, explore new styles, and engage with the game differently. And importantly, people still play.