We ran an experiment over the weekend to explore whether multiple autonomous agents could collaboratively optimize inference on Apple’s Neural Engine (ANE).

Each agent ran locally on a different Mac (M1–M4), repeatedly modifying how a DistilBERT model is executed on the ANE, benchmarking latency, and sharing results and insights with other agents in real time.

Instead of exploring independently, agents could:
- see what others had tried
- reuse working strategies
- avoid known failure modes

Across all tested chips, the agents ended up outperforming Apple’s CoreML baseline, with up to 6.31× lower median inference latency on the same hardware.

An interesting pattern we observed: an agent stuck at ~2.1ms latency on M4 was able to break through after incorporating strategies discovered by agents on different chips (M2, M4 Max), eventually reaching ~1.5ms and surpassing CoreML.

Full write-up: https://x.com/christinetyip/status/2039040161439224157

Detailed results:
https://ensue-network.ai/lab/ane?view=strategies
https://ensue-network.ai/lab/ane

Curious what other optimization problems this kind of setup could be applied to, especially in systems, compilers, or ML infra. Would be interested in exploring similar experiments.
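For readers who want a concrete picture of the loop described above (benchmark, share, reuse, avoid failures), here is a minimal Python sketch. It is not the code the agents actually ran: `SharedBoard`, `median_latency_ms`, `speedup`, and the strategy names are all hypothetical, the board is a plain in-memory stand-in for whatever shared store was really used, and treating a failure on one chip as a failure everywhere is a simplifying assumption.

```python
import statistics
import time
from typing import Callable, Dict, List, Optional, Set


def median_latency_ms(run_once: Callable[[], None], warmup: int = 5, iters: int = 50) -> float:
    """Time run_once repeatedly and return the median latency in milliseconds."""
    for _ in range(warmup):  # warm-up runs are discarded
        run_once()
    samples: List[float] = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times lower the candidate latency is than the baseline."""
    return baseline_ms / candidate_ms


class SharedBoard:
    """Hypothetical in-memory stand-in for the results shared between agents."""

    def __init__(self) -> None:
        self.results: Dict[str, Dict[str, float]] = {}  # chip -> strategy -> median ms
        self.failures: Set[str] = set()  # strategies reported as failing (assumed global)

    def report(self, chip: str, strategy: str, latency_ms: Optional[float]) -> None:
        """Record a benchmark result; latency_ms=None marks a known failure mode."""
        if latency_ms is None:
            self.failures.add(strategy)
            return
        self.results.setdefault(chip, {})[strategy] = latency_ms

    def candidates_for(self, chip: str) -> List[str]:
        """Strategies that worked on other chips, fastest first, skipping
        anything already tried on this chip or known to fail."""
        tried = self.results.get(chip, {})
        best: Dict[str, float] = {}
        for other_chip, strategies in self.results.items():
            if other_chip == chip:
                continue
            for name, ms in strategies.items():
                if name in tried or name in self.failures:
                    continue
                best[name] = min(ms, best.get(name, ms))
        return sorted(best, key=lambda name: best[name])


if __name__ == "__main__":
    board = SharedBoard()
    board.report("M2", "fused_attention", 1.8)  # made-up strategy names and numbers
    board.report("M4 Max", "fp16_io", 1.2)
    board.report("M1", "bad_layout", None)
    board.report("M4", "fp16_io", 2.1)
    print(board.candidates_for("M4"))  # what an M4 agent might try next
    print(median_latency_ms(lambda: sum(range(1000))))  # placeholder workload
```

In this sketch an agent would call `candidates_for` before choosing its next modification, which is one simple way to model "reuse working strategies" and "avoid known failure modes"; the real system may rank or filter shared results quite differently.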