Overview
Minimal Scale Inference Engine is an inference server built to be both educational and production-ready: it strips away incidental complexity to expose the core algorithms behind modern LLM serving.
Key Features
Continuous batching with annotated implementation
KV-cache management with visual debugging
GPU scheduling with profiling hooks
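
The continuous batching feature listed above can be sketched as follows. This is a minimal illustration, not the engine's actual API: the `Request` and `ContinuousBatcher` names, and the idea of emitting one token per step, are assumptions made for the example. The key point it demonstrates is that new requests join the running batch between decode steps instead of waiting for the whole batch to drain:

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical request: generate up to `max_tokens` tokens for a prompt.
    prompt: str
    max_tokens: int
    generated: int = 0


class ContinuousBatcher:
    """Minimal continuous-batching loop: free batch slots are refilled
    from the waiting queue at every decode step."""

    def __init__(self, max_batch_size: int):
        self.max_batch_size = max_batch_size
        self.waiting: deque[Request] = deque()
        self.running: list[Request] = []
        self.finished: list[Request] = []

    def submit(self, req: Request) -> None:
        self.waiting.append(req)

    def step(self) -> None:
        # Admit waiting requests into any free batch slots.
        while self.waiting and len(self.running) < self.max_batch_size:
            self.running.append(self.waiting.popleft())
        # One decode step: each active request emits one token.
        # (A real engine would run the model forward pass here.)
        for req in self.running:
            req.generated += 1
        # Retire completed requests immediately, freeing their slots
        # for the next step instead of waiting for the batch to drain.
        still_running = []
        for req in self.running:
            if req.generated >= req.max_tokens:
                self.finished.append(req)
            else:
                still_running.append(req)
        self.running = still_running
```

For example, with a batch size of 2 and three submitted requests, the third request is admitted as soon as the first one finishes, rather than after the entire batch completes.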

