[Demo] Add TUI agent chat command #2158
I kept forgetting which agent did what, so I wrote a TUI version of the chat that shows a lot more detail to help select the agent.
The TUI component rocks. The chat is similar to https://github.com/GromNaN/symfony-tui-games in that it prompts you for a chat agent and returns to that prompt when you're done, like the games menu does.
Nice one 😊
GromNaN
left a comment
Very nice.
Since it's requiring symfony/tui, you can assume symfony/console 8.1 is used and use the fancy features like invokable commands with validation.
…face `CachePlatform` could only build a cache key for `string`, `array` and `MessageBag` inputs and threw on anything else, so it could not decorate platform tasks whose top-level input is a content object — audio transcription (`Audio`), OCR (`DocumentUrl`/`Document`) and single-image vision (`ImageUrl`/`Image`). Add an opt-in `CacheableInputInterface` exposing a stable, cache-key-safe `getCacheKey()`, honored by `CachePlatform` before its `default` throw. `DocumentUrl` and `ImageUrl` key on the URL, `File` (and its `Audio`, `Image`, `Video` and `Document` subclasses) on a hash of its bytes. This is additive: inputs not implementing the interface behave as before, and third-party inputs can opt in without touching the bridge. See symfony#2159
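The opt-in cache key described in this commit can be pictured with a minimal, self-contained sketch. Everything below is a local stand-in written for illustration: the interface mirrors the name and method mentioned above, while `SketchImageUrl`, `SketchFile` and `sketchCacheKey()` are hypothetical and do not reproduce the actual `CachePlatform` code.

```php
<?php

declare(strict_types=1);

// Local stand-in for the opt-in interface named in the commit message.
interface CacheableInputInterface
{
    public function getCacheKey(): string;
}

// Hypothetical URL-based input: the key derives from the URL alone.
final class SketchImageUrl implements CacheableInputInterface
{
    public function __construct(private readonly string $url)
    {
    }

    public function getCacheKey(): string
    {
        return 'url-'.hash('sha256', $this->url);
    }
}

// Hypothetical binary input: the key derives from a hash of the bytes.
final class SketchFile implements CacheableInputInterface
{
    public function __construct(private readonly string $bytes)
    {
    }

    public function getCacheKey(): string
    {
        return 'file-'.hash('sha256', $this->bytes);
    }
}

// Hypothetical key builder: built-in input types first, then the opt-in
// interface, and only then the throw for unsupported inputs.
function sketchCacheKey(mixed $input): string
{
    return match (true) {
        is_string($input) => 'str-'.hash('sha256', $input),
        is_array($input) => 'arr-'.hash('sha256', serialize($input)),
        $input instanceof CacheableInputInterface => $input->getCacheKey(),
        default => throw new \InvalidArgumentException(sprintf('Cannot build a cache key for input of type "%s".', get_debug_type($input))),
    };
}
```

Hashing keeps the key free of characters that cache backends commonly reserve, which is one way to read "cache-key-safe"; the real implementation may normalize keys differently.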
Add a dedicated `Ocr` model with its own `ModelClient` and `ResultConverter` for Mistral's `mistral-ocr-latest` model, which uses the `/v1/ocr` endpoint rather than chat completions. The result is a typed `OcrResult` (pages, layout images with bounding boxes, per-page annotations and usage info) returned via `ObjectResult`/`asObject()`, not a text blob. Document URL, binary PDF and image inputs are supported by widening the existing `Document`, `DocumentUrl` and `ImageUrl` contract normalizers to also accept the `Ocr` model. Catalog entries (`mistral-ocr-latest`, `mistral-ocr-2505`) and a generator rule keep OCR models out of the chat model class. Fix symfony#2072
Add a chat feature that accepts a document URL, extracts its text with Mistral OCR (`mistral-ocr-latest` via the new `ai.platform.mistral` platform) and answers questions about the document. The OCR result is cached per document URL in `OcrExtractor` so the billed OCR endpoint is only ever called once per document.
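As a rough illustration of the per-URL caching described in this commit, the sketch below memoizes an OCR call in memory. `SketchOcrCache` and the injected closure are hypothetical; the demo's `OcrExtractor` may well use a persistent cache and a different API.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory memoizer: the OCR callable runs at most once per URL.
final class SketchOcrCache
{
    /** @var array<string, string> extracted text, keyed by a hash of the URL */
    private array $texts = [];

    /** @param \Closure(string): string $ocr stand-in for the billed OCR call */
    public function __construct(private readonly \Closure $ocr)
    {
    }

    public function extract(string $documentUrl): string
    {
        $key = hash('sha256', $documentUrl);

        // The right-hand side is only evaluated when no entry exists yet.
        return $this->texts[$key] ??= ($this->ocr)($documentUrl);
    }
}
```

Calling `extract()` twice with the same URL invokes the closure only once, which is the property that keeps a billed endpoint from being hit repeatedly.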
    $names = array_map(static fn (object $c): string => strtolower(str_replace('_', '-', $c->name)), $capabilities);

    return implode(', ', $names);
} catch (\Throwable) {
which exception are you expecting here?
ModelNotFound -- I'm fixing the code to reflect that.
$metadata = $result->getMetadata();
$usage = $metadata->get('token_usage');
if (!$usage instanceof TokenUsage) {
we should test for the interface instead
Thanks for the review — those were dev shortcuts, now fixed:
More broadly, the whole Would you be open to |
Add Capability::description() for a human-readable summary of each capability, and Capability::inputContentType() mapping an input capability to the Message\Content class it accepts (e.g. INPUT_PDF -> DocumentUrl). Used by the demo's capability explorer TUI.
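A self-describing enum of this kind might look roughly like the sketch below. The enum, its cases, the description strings and the content classes are all stand-ins invented for illustration; only the two method names and the `INPUT_PDF -> DocumentUrl` mapping come from the commit message.

```php
<?php

declare(strict_types=1);

// Stand-in content classes; the real ones live under Message\Content.
final class SketchDocumentUrl
{
}

final class SketchImageContent
{
}

// Hypothetical subset of capabilities, for illustration only.
enum SketchCapability: string
{
    case INPUT_PDF = 'input-pdf';
    case INPUT_IMAGE = 'input-image';
    case OUTPUT_TEXT = 'output-text';

    // Human-readable summary (wording is invented for this sketch).
    public function description(): string
    {
        return match ($this) {
            self::INPUT_PDF => 'Accepts PDF documents as input',
            self::INPUT_IMAGE => 'Accepts images as input',
            self::OUTPUT_TEXT => 'Produces text output',
        };
    }

    /** @return class-string|null the content class an input capability accepts */
    public function inputContentType(): ?string
    {
        return match ($this) {
            self::INPUT_PDF => SketchDocumentUrl::class,
            self::INPUT_IMAGE => SketchImageContent::class,
            default => null, // output capabilities accept no input content
        };
    }
}
```

An exhaustive `match` in `description()` makes PHP throw an `\UnhandledMatchError` if a new case is added without a summary, which is one reason to keep the text on the enum itself.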
Add app:capabilities (browse models by platform/capability, backed by the new Capability::description()) and app:tools (browse agent tools). Harden ChatTuiCommand: drop method_exists() guards in favour of instanceof/type checks and catch ModelNotFoundException specifically. Pull in symfony/models-dev for a richer model catalog to explore.
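The hardening in this commit (a specific exception instead of `\Throwable`, and a type check against an interface) can be sketched as follows. All class, interface and function names are hypothetical stand-ins, not the demo's actual code.

```php
<?php

declare(strict_types=1);

// Stand-ins for illustration only; the real types live in the library.
final class SketchModelNotFoundException extends \RuntimeException
{
}

interface SketchTokenUsageInterface
{
    public function getTotalTokens(): ?int;
}

final class SketchTokenUsage implements SketchTokenUsageInterface
{
    public function __construct(private readonly ?int $total)
    {
    }

    public function getTotalTokens(): ?int
    {
        return $this->total;
    }
}

/** @param callable(string): list<string> $lookup hypothetical catalog lookup returning capability names */
function describeCapabilities(callable $lookup, string $model): string
{
    try {
        $names = array_map(
            static fn (string $name): string => strtolower(str_replace('_', '-', $name)),
            $lookup($model),
        );

        return implode(', ', $names);
    } catch (SketchModelNotFoundException) {
        // Only the expected failure is handled; anything else propagates.
        return 'unknown model';
    }
}

// Checking the interface accepts any implementation, not one concrete class.
function formatUsage(mixed $usage): string
{
    if (!$usage instanceof SketchTokenUsageInterface) {
        return 'n/a';
    }

    return sprintf('%d tokens', $usage->getTotalTokens() ?? 0);
}
```

Narrowing the `catch` means programming errors such as a `\TypeError` are no longer silently turned into a fallback string.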
Add a curated sample-document picker (with an "I'll provide my own URL" option), open the chat with an agent-generated summary instead of the raw OCR dump, surface fetch/OCR failures to the user, and show a loading state while reading. Sample list and summary prompt move to parameters in services.yaml.
# Conflicts: # src/platform/CHANGELOG.md
# Conflicts: # demo/composer.json # src/platform/CHANGELOG.md # src/platform/src/Bridge/Mistral/CHANGELOG.md
Add replace (all 90 symfony/ai-* sub-packages + symfony/mcp-bundle), autoload (per-component PSR-4, which transitively covers every nested bridge), and extra.branch-alias dev-tac => 0.10.x-dev so consumers can `composer require symfony/ai:dev-tac` from this fork and get every AI component (incl. Mistral OCR, cache key, Capability self-description) resolved transitively. Fork-only convenience commit.
Since replace makes composer ignore the replaced sub-packages' own requires, the umbrella must declare their third-party runtime deps itself. Aggregate the deps of the 7 core components plus the bridges the demo/release use (oskarstark/enum-helper, psr/log, symfony/* core, mcp/sdk, etc.) so `composer require symfony/ai:dev-tac` is self-sufficient. Fork-only convenience commit.
Add a VCS repository for tacman/ai and require symfony/ai:dev-tac (the umbrella), so the demo resolves every AI component from the fork — incl. Mistral OCR, the cache key interface, and Capability self-description — instead of dev-linking local src or pulling vanilla 0.10 from Packagist. Flex registers the ux-typed TypedBundle the demo already uses.
A handwritten 1863 NARA document (multi-page PDF) — a strong OCR demo.
Track composer.lock (the demo is an app) and add a bin/deploy that builds a throwaway git repo from this monorepo subdir and pushes it to Dokku, setting API* config from .env.local. Adds survos/deployment-bundle (dev).
Looks like this branch got mixed up with #2161.
document_annotation is a document-wide field on the OCR response root, not a per-page field. Move it from Page to OcrResult and read it from $data instead of each page.
chr-hertel
left a comment
Needs a rebase on main - merged #2161 and the git history is somehow twisted.

Adds a terminal UI chat command to the demo application so the configured agents can be explored directly from the console.
The command provides an agent picker, streaming chat output, a verbose debug pane for tool calls and token usage, and returns to the picker after leaving a chat session. The demo dependencies are also updated to Symfony 8.1 packages so it can use the new TUI component.
No separate documentation was added because this is a demo-only interactive command and is discoverable via `bin/console list app`.

Validation:
- `composer validate --no-check-publish`
- `php -l demo/src/Tui/Command/ChatTuiCommand.php`
- `git diff --check origin/main..HEAD`
- `php bin/console lint:yaml config --parse-tags`
- `vendor/bin/php-cs-fixer fix --config=.php-cs-fixer.dist.php --dry-run --diff demo/src/Tui/Command/ChatTuiCommand.php`
- `cd demo && vendor/bin/phpstan`
- `cd demo && vendor/bin/phpunit`
- `cd demo && php bin/console list app --format=txt`