I’m a low-level, math-driven systems engineer with a PhD in Machine Learning and 8 years of production C++ experience at Apple and NVIDIA. I love the challenge of making models run fast and efficiently on real hardware, and I’ve shipped optimized ML features across multiple platforms.

I have also worked on vision, tracking, and generative AI problems, implementing papers and writing training pipelines (some traces of that era are on my GitHub). My PhD focused on unsupervised and active learning, but also included some reinforcement learning and deep learning.

CUDA Tiled Matmul