LLM inference engines and servers
Deploying an LLM on your own infrastructure is becoming common, but how does it actually work?
Peeks are simple posts about my current interests.