Posts
Categories: All (4), Attention (1), GPUs (1), Machine Learning (1), Mathematics (2), Python (1), Transformers (1)
Drilling Down into Multimodal Attention
Transformers, Attention
This post explains how to inspect the attention patterns of vision-language models (VLMs) using a new module I created on a fork of the circuitsviz library. To interact…
Feb 1, 2025
Tomas Ruiz
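As a rough illustration of the kind of data the post above visualizes (not the post's own code): the sketch below pulls per-layer attention tensors out of a Hugging Face VLM by passing `output_attentions=True`. The checkpoint name, prompt template, and blank placeholder image are assumptions made for the sketch; the circuitsviz fork is what would actually render these tensors.

```python
# Hedged sketch: extract attention patterns from a VLM with transformers.
# The checkpoint and prompt format are assumptions, not the post's setup.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed LLaVA-style checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

image = Image.new("RGB", (336, 336), color="white")  # placeholder image
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"
inputs = processor(images=image, text=prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple with one (batch, heads, seq, seq) tensor per
# layer; image patches and text tokens share the same sequence axis.
attn = torch.stack(outputs.attentions)  # (layers, batch, heads, seq, seq)
print(attn.shape)
```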
How Does Tiling Speed Up Matrix Multiplications on GPUs?
Mathematics, GPUs
TL;DR: Tiling is a technique used to reduce the number of memory accesses performed during matrix multiplication. We see how it improves compute intensity and how it speeds…
Dec 23, 2024
Tomas Ruiz
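To make the idea in the post above concrete, here is a minimal NumPy sketch of tiled matrix multiplication. It mirrors on the CPU what shared-memory tiling does on a GPU: each tile of A and B is read once and reused across a whole block of the output, which is what raises compute intensity (FLOPs per byte moved). The tile size and matrix shapes are arbitrary choices for the sketch.

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Compute A @ B by accumulating tile-sized blocks.

    Each (tile x tile) block of A and B is loaded once and reused for a
    whole tile of C; on a GPU these blocks would live in shared memory.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=A.dtype)
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                a_blk = A[i0:i0 + tile, k0:k0 + tile]
                b_blk = B[k0:k0 + tile, j0:j0 + tile]
                C[i0:i0 + tile, j0:j0 + tile] += a_blk @ b_blk
    return C

A = np.random.rand(128, 96).astype(np.float32)
B = np.random.rand(96, 64).astype(np.float32)
assert np.allclose(tiled_matmul(A, B), A @ B, atol=1e-4)
```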
Grokking an Inner Product Inequality With Python on WebAssembly
Mathematics, Python
The purpose of this post is two-fold:
Sep 12, 2024
Tomas Ruiz
A Closed-Form Solution to Linearly Fine-Tune LLMs for Binary Classification
Machine Learning
In this post, I show how to linearly fine-tune a large language model (LLM) using a closed-form solution based on the Moore-Penrose inverse. I will focus on the special case…
Aug 2, 2024
Tomas Ruiz
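As a hedged sketch of the general idea in the post above (not its exact derivation): with features from a frozen LLM, a linear head for binary classification can be fit in closed form as the least-squares solution w = X⁺y, where X⁺ is the Moore-Penrose pseudoinverse. The random feature matrix below is only a stand-in for precomputed LLM hidden states.

```python
import numpy as np

# Stand-in data: X plays the role of frozen LLM features (one row per
# example), y holds binary labels encoded as +/-1.
rng = np.random.default_rng(0)
n, d = 200, 64
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n))

# Closed-form weights via the Moore-Penrose pseudoinverse: w = X^+ y
w = np.linalg.pinv(X) @ y

pred = np.sign(X @ w)
print("train accuracy:", (pred == y).mean())
```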