Constructive gravity beats constructive linguistics over time

Jae L.


JL:About

Interested in building GPU kernels and ML systems.

Cross-substrate work: NVFP4 on NVIDIA's Blackwell, MXFP4 on AMD's MI355X, and a custom MoE Metal kernel on Apple silicon. Working across all three, I built mental models of how each architecture works and where the three diverge and align.

Open-source contributor to SGLang's MLX backend.

JL:Projects

SGLang MLX backend

Lead contributor to the MLX backend. Focused on MoE performance optimization for Apple silicon, with merged kernels for fused SwiGLU and quantized matvec.

GitHub

AMD-MXFP4-MM kernel

A fused Triton quantization and shuffle kernel on MI355X, ranked on the AMD x GPU MODE leaderboard through per-shape tile selection and hardcoded dispatch.

GitHub

Serverless GPU inference

A small inference engine built on consumer hardware, benchmarking the path from cold start to first token.

GitHub

JL:Writing

Notes and reading, kept as a Zettelkasten.

Notes

JL:Contact

GitHub LinkedIn X Email