ComingUp

Marlin-2B: a tiny VLM to extract structured information from videos

by Malcolm Bechtelar

May 18, 2026 AI & Machine Learning

deep learning video analysis visual-language-model

About

Marlin-2B is a small vision-language model (VLM) designed to extract structured information from videos. It aligns visual and textual features to support tasks such as video question answering and video captioning. The model is available on Hugging Face for integration into other applications.
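
To illustrate what "structured information" could look like downstream, here is a minimal sketch that parses a JSON response from a video model into typed events. The schema (`label`, `start_s`, `end_s`), the names `VideoEvent` and `parse_events`, and the sample response are all assumptions made for illustration. None of them come from Marlin-2B's documentation, and the model's actual output format may differ.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoEvent:
    """One extracted event. Hypothetical schema, not Marlin-2B's."""
    label: str
    start_s: float
    end_s: float


def parse_events(raw: str) -> list[VideoEvent]:
    """Parse a JSON array of events, skipping malformed entries."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # the response was not valid JSON at all
    if not isinstance(items, list):
        return []
    events = []
    for item in items:
        if not isinstance(item, dict):
            continue
        try:
            event = VideoEvent(
                label=str(item["label"]),
                start_s=float(item["start_s"]),
                end_s=float(item["end_s"]),
            )
        except (KeyError, TypeError, ValueError):
            continue  # missing field or non-numeric timestamp
        # Keep only events with a non-negative, ordered time range.
        if 0 <= event.start_s <= event.end_s:
            events.append(event)
    return events


# Made-up response text standing in for real model output.
sample = (
    '[{"label": "door opens", "start_s": 1.5, "end_s": 2.0},'
    ' {"label": "bad range", "start_s": 5, "end_s": 3}]'
)
events = parse_events(sample)
```

Validating the response before using it means a malformed or partially wrong generation yields fewer events instead of an exception in the calling application.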

Related Products

OpenBrief – Local-first video downloader/summarizer

Nerve – self hosted runtime for AI agents

skills-for-humanity – 171 structured reasoning skills for Claude Code

Bae – AI companion built around persistent memory architecture
