David Nguyen

Life notes

A collection of thoughts, experiences, and life updates.

Latest

Designing a Multi-Tenant LLM Inference Platform

Why serving LLMs breaks classic API intuitions, and how to design around the physics: KV cache, continuous batching, placement under uncertainty, and fairness.

More posts
2026