The inference serving layer is under attack from every direction at once. DeepSeek compresses the problem from the model side, Cerebras from the hardware side, and open source from the orchestration side. These are only examples. There will be no winner here.
A 130-question benchmark of adult life-management tasks for neurodivergent users, built for small on-device models. Covers the architecture, the reasoning behind the design, and the first data point from LFM2.5-350M.