ComingUp
Jun 6, 2026 AI & Machine Learning
embedding mlx multimodal on-device vector search

Omni

About

Finally made something I've always wanted, using the model we built.

• SOTA omni embedding model, fully local; indexes text, PDF, image, audio, and video
• Swift-native app UI + mlx-swift-transformer core. No Python.
• Tested on M3 Pro 18G / M3 Ultra 512G / M4 Pro 48G. All work fine.
• HTTP server exposes search to local agents like OpenClaw & Hermes

− Indexing still feels slow even on the latest M3 Ultra, ranging from 10K tps to 300 tps depending on file type
− Fans go crazy, high power draw while indexing
− Search is near-instant. Multimodal relevance is sometimes arguable, but the idea is recall (the agentic LLM takes the results and refines them for the final answer), so maybe that's fine
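To make the recall-first idea above concrete, here is a minimal sketch of a top-k recall step over precomputed embeddings. This is not the app's actual code: `IndexedItem`, `cosineSimilarity`, `recall`, and the toy 3-dimensional vectors are all hypothetical names and data chosen for illustration, and brute-force cosine similarity is assumed as the ranking method.

```swift
// Hypothetical types and names for illustration; not the app's real code.
struct IndexedItem {
    let path: String        // file the embedding came from
    let embedding: [Float]  // vector produced by some embedding model
}

/// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "vectors must have the same dimension")
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denom = normA.squareRoot() * normB.squareRoot()
    return denom == 0 ? 0 : dot / denom
}

/// Recall stage: score every item against the query and keep the k best,
/// highest score first. A downstream agent would refine these candidates.
func recall(query: [Float], from items: [IndexedItem], k: Int)
    -> [(item: IndexedItem, score: Float)] {
    let scored = items.map { (item: $0, score: cosineSimilarity(query, $0.embedding)) }
    return Array(scored.sorted { $0.score > $1.score }.prefix(max(0, k)))
}

// Toy index mixing modalities (made-up vectors).
let index = [
    IndexedItem(path: "notes.txt", embedding: [1, 0, 0]),
    IndexedItem(path: "talk.mp4", embedding: [0.9, 0.1, 0]),
    IndexedItem(path: "photo.png", embedding: [0, 1, 0]),
]

let candidates = recall(query: [1, 0, 0], from: index, k: 2)
for candidate in candidates {
    print(candidate.item.path, candidate.score)
}
```

The point of the sketch is that the recall step only needs to get plausible candidates into the top k; borderline ordering among them matters less if an agentic LLM re-reads the results before answering.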

Comments (5)

Clare Reinger 4 days ago

how's performance on longer video files

Don Baumbach 3 days ago

fully local huh so my macbook just suffers then

Lowell Skiles 3 days ago

local multimodal embeddings finally shipped

Flavie Marquardt 3 days ago

Five modalities locally indexed is wild, what's the retrieval latency?

Omer Turcotte 1 day ago

how's video embedding speed running locally on mlx