[FEATURE] Support Experimental Ollama Image Generation models #776

Description

@SteveAquino

Scope check

  • This is core LLM communication (not application logic)
  • This benefits most users (not just my use case)
  • This can't be solved in application code with current RubyLLM
  • I read the Contributing Guide

Due diligence

  • I searched existing issues
  • I checked the documentation

What problem does this solve?

Ollama recently added new image generation models: https://ollama.com/blog/image-generation

RubyLLM already supports Ollama as a provider and supports image generation through the paint method, but none of the new image models are registered in the model registry, so using them together currently requires a patch.

Currently these models only work on macOS, but they let developers iterate rapidly with powerful free models and avoid the pay-to-play limitations of frontier image generation models.

Proposed solution

Register the new models x/z-image-turbo and x/flux2-klein in the model registry and introduce an adapter that formats the params for the underlying Ollama API call, including the supported additional params (e.g., width, height, and steps). Add defensive checks so that engineers using this on an incompatible platform understand the limitations.
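
To make the proposal a bit more concrete, here is a rough sketch of what the param formatting and the platform guard could look like. Everything in it is an assumption for illustration: `OllamaImageSketch`, `build_payload`, `ensure_supported_platform!`, and `UnsupportedPlatformError` are hypothetical names and not part of RubyLLM, and the payload keys simply mirror the param names mentioned above rather than a confirmed Ollama request schema. It only builds a hash and makes no network call.

```ruby
require "json"
require "rbconfig"

# Hypothetical helper for illustration only; not part of RubyLLM.
module OllamaImageSketch
  # Model names taken from this issue.
  MODELS = %w[x/z-image-turbo x/flux2-klein].freeze

  # Optional params named in this issue. The key names are assumed to be
  # passed through unchanged; the real Ollama schema may differ.
  OPTIONAL_PARAMS = %i[width height steps].freeze

  class UnsupportedPlatformError < StandardError; end

  module_function

  # True when the host OS string looks like macOS (Darwin).
  def macos?(host_os = RbConfig::CONFIG["host_os"])
    host_os.match?(/darwin/i)
  end

  # Raises with a descriptive message on any non-macOS host.
  def ensure_supported_platform!(host_os = RbConfig::CONFIG["host_os"])
    return if macos?(host_os)

    raise UnsupportedPlatformError,
          "Ollama image generation models currently run only on macOS " \
          "(detected #{host_os})"
  end

  # Builds a request hash, rejecting unknown models and unknown params.
  # Params passed as nil are dropped so provider defaults would apply.
  def build_payload(model:, prompt:, **options)
    raise ArgumentError, "unknown image model: #{model}" unless MODELS.include?(model)

    unknown = options.keys - OPTIONAL_PARAMS
    raise ArgumentError, "unsupported params: #{unknown.join(', ')}" unless unknown.empty?

    { model: model, prompt: prompt }.merge(options.compact)
  end
end

payload = OllamaImageSketch.build_payload(
  model: "x/flux2-klein",
  prompt: "a lighthouse at dusk",
  width: 512,
  height: 512,
  steps: 4
)
puts JSON.generate(payload)
```

In an actual adapter the guard would presumably run before the request is sent, so that developers on Linux or Windows get a clear error instead of an opaque provider failure.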

Why this belongs in RubyLLM

RubyLLM acts as an interface between Ruby applications and LLM providers, already supports Ollama for text generation, and already supports image generation for frontier models. Introducing this capability seems like a natural step, and it allows developers to rapidly prototype using free, local models, which should speed up feature development.

Note: I have been using the x/flux2-klein model with a small patch to RubyLLM and have found great results, so I would be open to opening a PR to introduce this enhancement and collaborate on a good solution.

Thanks for your time and consideration - RubyLLM is a great library, so I would be honored to contribute!

Labels: enhancement (New feature or request)