Is your feature request related to a problem?
We currently lack proactive monitoring for the llm_call surface area (model, provider combinations). This leaves us unaware of issues until clients report them.
Describe the solution you'd like
- Implement a CRON JOB to perform dummy calls to LLM_CALL with varying payloads across providers, models, and modalities in a round-robin fashion.
- Check the database for failures.
- Report any failures to Sentry so that they surface as alerts on Discord.
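
The steps above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `Target`, `build_matrix`, `pick_batch`, and `run_probe` are hypothetical names, and `call` and `report` are placeholders for the real llm_call client and the Sentry reporting hook. For simplicity it treats a raised exception as a failure instead of querying the database, since the schema is not described here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Target:
    """One (provider, model, modality) combination to probe (hypothetical type)."""
    provider: str
    model: str
    modality: str


def build_matrix(models_by_provider: dict[str, list[str]],
                 modalities: list[str]) -> list[Target]:
    """Expand the provider/model/modality combinations into a flat, ordered list."""
    return [
        Target(provider, model, modality)
        for provider, models in models_by_provider.items()
        for model in models
        for modality in modalities
    ]


def pick_batch(matrix: list[Target], run_index: int, batch_size: int) -> list[Target]:
    """Pick the slice for this cron run, wrapping around so every target gets its turn."""
    if not matrix:
        return []
    start = (run_index * batch_size) % len(matrix)
    count = min(batch_size, len(matrix))
    return [matrix[(start + i) % len(matrix)] for i in range(count)]


def run_probe(batch: list[Target],
              call: Callable[[Target], None],
              report: Callable[[str], None]) -> list[Target]:
    """Send a dummy call per target; report and collect the ones that raise."""
    failures = []
    for target in batch:
        try:
            call(target)  # assumption: the real client raises on an unhealthy target
        except Exception as exc:
            failures.append(target)
            report(f"llm_call probe failed for {target}: {exc}")
    return failures
```

Because each cron invocation is a fresh process, `run_index` has to come from outside the process, for example a persisted counter or the current time divided by the cron interval. Keeping `report` as a plain callable leaves the choice of Sentry integration open.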
Original issue
Describe the current behavior
We are blind to whether the llm_call surface area (model, provider combinations) is healthy. We only start investigating after clients complain.
Describe the enhancement you'd like
CRON JOB --> LLM_CALL with dummy payload (across providers, models, modalities, etc.) in round robin --> Check DB --> if something fails --> shows up in Sentry alerts on Discord.
Why is this enhancement needed?
To proactively keep tabs on the llm/call surface.