Add TwelveLabs provider for Marengo embeddings #820
The provider-coverage specs enumerate every registered provider: "covers every registered provider" asserts api_base_cases lists them all, and "file protocol resolution" instantiates each via config_for. Adding the :twelvelabs provider without updating these fixtures left it absent from api_base_cases and unconfigured in config_for, so both specs failed (the latter raising ConfigurationError on new). Add the twelvelabs api_base case and config_for branch.
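A minimal sketch of the kind of check the "covers every registered provider" spec performs, in plain Ruby rather than RSpec. The provider list, the URLs, and the `uncovered_providers` helper are hypothetical stand-ins for illustration; only the `api_base_cases` idea and the `:twelvelabs` key come from the commit message above.

```ruby
# Hypothetical stand-ins for the spec fixtures; the real registry and
# api_base_cases live in the RubyLLM spec suite and will differ.
REGISTERED_PROVIDERS = %i[openai anthropic twelvelabs].freeze

API_BASE_CASES = {
  openai: "https://api.openai.example",
  anthropic: "https://api.anthropic.example"
}.freeze

# Returns the registered providers that have no api_base case yet.
def uncovered_providers(registered, cases)
  registered - cases.keys
end

# Before the fix: the new provider is registered but has no fixture entry.
missing_before = uncovered_providers(REGISTERED_PROVIDERS, API_BASE_CASES)

# After the fix: adding the twelvelabs case closes the gap.
fixed_cases = API_BASE_CASES.merge(twelvelabs: "https://api.twelvelabs.example")
missing_after = uncovered_providers(REGISTERED_PROVIDERS, fixed_cases)
```

With these made-up fixtures, `missing_before` contains only `:twelvelabs` and `missing_after` is empty, which mirrors why the spec failed until the fixture was updated.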
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #820      +/-   ##
==========================================
+ Coverage   81.90%   81.94%   +0.04%
==========================================
  Files         169      173       +4
  Lines        7713     7771      +58
  Branches     1284     1288       +4
==========================================
+ Hits         6317     6368      +51
- Misses        884      890       +6
- Partials      512      513       +1
CI is green now (commit The issue-first question still stands: happy to move this to an approved issue or repackage as a
Hi! I'm Mohit, I work at TwelveLabs (@mohit-twelvelabs).
What this does
Adds an opt-in :twelvelabs provider that wires TwelveLabs' Marengo multimodal embedding model into RubyLLM's existing synchronous embed interface.
Marengo produces embeddings in a shared space across video, image, audio, and text, which is useful for video search/RAG built on RubyLLM. This PR adds the text path (synchronous, 512-dim) as the smallest first slice.
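To illustrate why a shared embedding space helps with video search, here is a rough sketch that ranks clips against a text query by cosine similarity. The vectors are made-up 3-dim toys (real Marengo text embeddings are 512-dim), the file names are hypothetical, and `cosine_similarity` is an illustrative helper, not part of RubyLLM or this PR, which only adds the text path.

```ruby
# Illustrative helper (not part of RubyLLM): cosine similarity of two vectors.
def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Toy stand-ins: a text-query embedding and two clip embeddings that are
# assumed to live in the same vector space.
query = [1.0, 0.0, 0.0]
clips = {
  "beach.mp4" => [0.9, 0.1, 0.0],
  "city.mp4" => [0.0, 1.0, 0.0]
}

# Rank clips from most to least similar to the query.
ranked = clips.sort_by { |_name, vec| -cosine_similarity(query, vec) }.map(&:first)
```

Because text and media vectors are comparable in one space, the same ranking step would work whichever modality produced the stored vectors.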
It follows the existing provider conventions exactly: a Provider subclass with a Protocol that includes an Embeddings module (mirroring ChatCompletions::Embeddings), assume_models_exist? since TwelveLabs models aren't in the models.dev registry, and config registered via configuration_options. The TwelveLabs /embed endpoint takes multipart/form-data, so the payload uses Faraday::Multipart::ParamPart (the faraday-multipart dep is already present).
Non-breaking / opt-in: no defaults change, nothing is touched unless you pass provider: :twelvelabs.
Type of change
Scope check
Required for new features
I'm aware of the issue-first policy and the high bar for core providers (and the community-gem path for emerging ones). I'm opening this as a concrete, reviewable reference rather than a large speculative drop. Happy to first move this to an approved issue, or repackage it as a ruby_llm-twelvelabs community gem, whatever you prefer. Please feel free to close if it's not a fit for core.
Quality check
bundle exec rspec spec/ruby_llm/providers/twelve_labs
models.json, aliases.json)
Verified against the live API as well: a marengo3.0 text embedding returns a real 512-dim float vector. Three specs cover the single-vector path, the single-element-array (nested) path, and rejection of multi-input (the endpoint accepts one text per call). rubocop is clean on all changed files.
API changes
You can grab a free API key at https://twelvelabs.io — there's a generous free tier.
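For reviewers, the three spec'd paths mentioned under Quality check can be sketched in plain Ruby as follows. This is not the PR's code: `shape_embedding` and `MultiInputError` are hypothetical names, and the sketch assumes the "nested" path means a one-element array input yields a one-element array of vectors.

```ruby
# Hypothetical error class for this sketch only.
class MultiInputError < ArgumentError; end

# Hypothetical helper, not the PR's implementation.
# text   - a String, or an Array holding exactly one String
# vector - the float vector assumed to come back from the endpoint
def shape_embedding(text, vector)
  inputs = text.is_a?(Array) ? text : [text]
  unless inputs.size == 1
    raise MultiInputError, "expected exactly one text per call, got #{inputs.size}"
  end

  # Mirror the input shape: String -> flat vector, [String] -> [vector].
  text.is_a?(Array) ? [vector] : vector
end

# Stand-in for a real 512-dim result; the values are arbitrary.
vector = Array.new(512) { |i| i / 512.0 }

single = shape_embedding("a dog on a beach", vector)   # flat vector
nested = shape_embedding(["a dog on a beach"], vector) # one-element array of vectors
```

Passing two or more texts raises `MultiInputError` in this sketch, matching the stated one-text-per-call limit.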