Omni

by Carolanne Treutel


Jun 6, 2026 AI & Machine Learning

embedding, mlx, multimodal, on-device, vector search


About

Finally made something I've always wanted, using the model we built.

• SOTA omni embedding model, fully local; indexes text, PDF, image, audio, and video
• Swift-native app UI + mlx-swift-transformer core. No Python.
• Tested on M3 Pro 18 GB / M3 Ultra 512 GB / M4 Pro 48 GB. All work fine.
• HTTP server exposes search to local agents like OpenClaw & Hermes
− Indexing still feels slow even on the latest M3 Ultra, ranging from 10K tps down to 300 tps depending on file type
− Fans go crazy and power draw is high while indexing
− Search is near-instant. Multimodal relevance is sometimes arguable, but the goal is recall (the agentic LLM takes the results and refines them for the final answer), so maybe that's fine
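To make the recall-first idea concrete, here is a minimal sketch of what an agent-side top-k step could look like once items are embedded. It is not the app's code: the `Item` and `Hit` types, the `cosine` and `topK` functions, and the toy 2-D vectors are all hypothetical, and brute-force cosine similarity is an assumption about the ranking rather than a description of the actual index.

```typescript
// Hypothetical shapes for illustration; the app's real data model is not shown in the post.
type Item = { id: string; vector: number[] };
type Hit = { id: string; score: number };

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force search: score every item, sort best-first, keep the top k.
// A generous k favors recall; a downstream LLM can discard weak candidates.
function topK(query: number[], items: Item[], k: number): Hit[] {
  const hits: Hit[] = items.map((item) => ({
    id: item.id,
    score: cosine(query, item.vector),
  }));
  hits.sort((x, y) => y.score - x.score);
  return hits.slice(0, k);
}

// Toy example: three "files" of different modalities in one shared embedding space.
const items: Item[] = [
  { id: "notes.txt", vector: [1, 0] },
  { id: "photo.jpg", vector: [0.6, 0.8] },
  { id: "clip.mp4", vector: [0, 1] },
];
const candidates: Hit[] = topK([1, 0], items, 2);
console.log(candidates.map((h) => h.id)); // notes.txt first, then photo.jpg
```

Asking for more candidates than strictly needed trades some precision for recall, which matches the post's point: borderline multimodal matches are acceptable as long as the agentic LLM gets to refine them afterward.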