Bonsai 1.7B ternary model at 442T/s on M4 Max

by Clint Wilderman

May 4, 2026 AI & Machine Learning

ai models autonomous search machine learning optimization

About

We took the recently released Bonsai 1.7B ternary model from PrismML (https://github.com/PrismML-Eng/Bonsai-demo) and ran our agentic evolution search on it for 6 hours to optimize the Metal kernels. The search was fully autonomous.

Measured against unmodified upstream llama.cpp at the same Bonsai/Q2_0 commit, on the same M4 Max:

- tg128 (text generation, 128 tokens): 309.82 → 442.42 t/s (+42.0%)
- pp512 (prompt processing, 512 tokens): 4250.32 → 4622.63 t/s (+8.8%)
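As a quick sanity check, the relative gains can be recomputed from the quoted throughput pairs. The sketch below is illustrative only: the helper name `percent_change` and the `results` dictionary are assumptions introduced here, not part of the project's tooling. With the figures as quoted, pp512 works out to about +8.8%, while tg128 works out to about +42.8%, slightly above the listed +42.0%, so one of the quoted tg128 numbers may be rounded or mistyped.

```python
def percent_change(baseline: float, optimized: float) -> float:
    """Relative change from baseline to optimized, in percent (hypothetical helper)."""
    return (optimized - baseline) / baseline * 100.0


# Throughput figures in tokens/s, copied from the listing above:
# (unmodified upstream llama.cpp, after the kernel search).
results = {
    "tg128": (309.82, 442.42),
    "pp512": (4250.32, 4622.63),
}

for name, (before, after) in results.items():
    gain = percent_change(before, after)
    print(f"{name}: {before:.2f} -> {after:.2f} t/s ({gain:+.1f}%)")
```

The same helper can be pointed at any before/after pair from a benchmark run, which makes it easy to confirm that a reported percentage matches the raw numbers.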
