Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
Unreleased
[0.5.0] - 2026-04-30
Added
- Merge pull request #412 from majiayu000/feat/provider-model-refresh-2026-04-21
- feat(models): update model catalogs for OpenAI, Anthropic, and Zhipu AI (#388)
- feat(router): add zai prefix alias to zhipu routing
- feat(router): add moonshot/minimax/zhipu dynamic and prefix routing
- feat(router): add atomic routing metrics counters (#376)
- feat(anthropic): add beta headers, structured outputs, and built-in tool types (#324)
- feat(openai): add store/metadata/service_tier params, update image models, mark deprecated (#322)
- feat: replace wildcard re-exports with explicit pub use in lib.rs (#315)
- feat(providers): reject unknown provider type strings with clear error at parse time (#311)
- feat(mistral): add missing params - frequency_penalty, presence_penalty, n, parallel_tool_calls, guardrails (#302)
- feat(core): enable user_management module with stub DB implementations (#296)
- feat: add CI job to compile-check disabled modules (#295)
- feat: enable virtual_keys module with stub database implementations (#292)
- feat(openai): add GPT-5.4 family and fix GPT-4.1 context window (#287)
- feat(gemini): add Gemini 3.1 models, fix systemInstruction and tool call handling (#291)
- feat(mistral): overhaul model catalog with 36+ current models (#290)
- feat: add reasoning_effort parameter and Developer message role for o-series models (#289)
- feat(anthropic): add claude-sonnet-4-6, claude-haiku-4-5; fix opus-4-6 limits and thinking serialization (#288)
- feat(openai-like): forward extra_params to upstream provider (#286)
- feat(config): implement YAML env var substitution in Config::from_file (#285)
- feat(mcp): add lightweight JSON Schema validation for tool arguments (#212) (#232)
- feat(a2a): add periodic health checks and exclude Unknown agents from routing (#213) (#227)
- feat(storage): add cache-aside pattern for API key verification (#207) (#228)
- feat(router): add structured tracing for routing decisions (#229)
- feat(config): add environment variable support for cache/rate-limit/enterprise (#66)
- feat(config): add schema_version field to GatewayConfig (#68)
- feat(examples): add hello example and fix broken bin references (#57)
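The YAML environment variable substitution entry above (#285) can be pictured with a minimal sketch. The `${NAME}` placeholder syntax, the helper name `substitute_env`, and the choice to leave unset variables untouched are all assumptions made for illustration. They are not confirmed details of `Config::from_file`.

```rust
/// Replace every `${NAME}` placeholder in `input` using `lookup`.
/// Hypothetical helper: the placeholder syntax is an assumption, and
/// placeholders whose variable is unset are copied through unchanged.
fn substitute_env<F>(input: &str, lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy everything before the placeholder verbatim.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                match lookup(name) {
                    Some(value) => out.push_str(&value),
                    None => {
                        // Unknown variable: keep the placeholder as written.
                        out.push_str("${");
                        out.push_str(name);
                        out.push('}');
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

To read from the process environment, the lookup could be `|name| std::env::var(name).ok()`. Passing the lookup as a closure keeps the function testable without mutating global state.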
Fixed
- fix(cli): add gateway release entrypoint
- fix(router): execute with capability-aware deployments
- fix(auth): normalize brute-force lockout keys
- Merge pull request #455 from majiayu000/fix/issue-408-rate-limit-stable-client-key
- fix(rate-limit): ignore untrusted auth headers
- Merge pull request #454 from majiayu000/fix/issue-407-cors-validation-gate
- fix(server): fail fast on invalid cors config
- Merge pull request #453 from majiayu000/fix/issue-409-embedding-array-validation
- fix(embeddings): reject non-string array input
- Merge pull request #452 from majiayu000/fix/issue-413-sdk-chat-model
- fix(sdk): preserve explicit chat model
- Merge pull request #451 from majiayu000/fix/issue-414-vertex-gemini3-models
- fix(vertex-ai): route Gemini 3 models
- Merge pull request #450 from majiayu000/fix/issue-424-gemini-thinking-pricing
- fix(pricing): align Gemini thinking cost
- Merge pull request #449 from majiayu000/fix/issue-436-storage-file-config
- fix(storage): honor configured file storage
- Merge pull request #448 from majiayu000/fix/issue-438-streaming-deployment-lifecycle
- fix(ai): hold deployment leases for streams
- Merge pull request #447 from majiayu000/fix/issue-439-openai-error-envelope
- fix(ai): return OpenAI error envelopes
- Merge pull request #446 from majiayu000/fix/issue-433-filtered-key-pagination
- fix(keys): paginate filtered key listings
- Merge pull request #445 from majiayu000/fix/issue-432-key-admin-promotion
- fix(keys): block non-admin management permission grants
- fix(models): align GPT-5.4 Pro token limits
- Merge pull request #418 from majiayu000/fix/gstack-health-2026-04-24
- fix(deps): address TLS migration review
- Merge pull request #441 from majiayu000/fix/issue-434-key-manager
- Merge pull request #444 from majiayu000/fix/issue-440-virtual-key-persistence
- fix(storage): keep virtual key last-used monotonic
- fix: align Homebrew release automation (#429)
- fix(storage): require explicit sqlite fallback (#442)
- fix: add utility pricing for Gemini flash variants (#425)
- fix(storage): preserve virtual key spend during usage updates
- fix(storage): reject placeholder vector backends (#443)
- fix(storage): avoid virtual key usage races
- fix(keys): bound last used cache
- fix(storage): persist virtual keys
- fix(keys): share key manager across requests
- fix(release): publish gateway binary only (#431)
- fix(deps): eliminate vulnerable TLS and YAML chains
- fix(sdk): wire execute_stream_request to provider dispatch (#396) (#402)
- fix(sdk): parse data URI to extract correct media_type for Anthropic multimodal (#401)
- fix(sdk): implement atomic round-robin rotation in LoadBalancer (#397)
- fix(auth): guard is_admin_route() against prefix confusion (SEC-04) (#393)
- fix(auth): replace prefix match with exact equality in is_public_route (#390)
- fix(errors): replace Box with typed errors at trait boundaries (#384)
- fix(a2a): auto-trigger agent health checks before routing (#381)
- fix(mcp): add optional JSON Schema validation for MCP tool parameters (#380)
- fix(storage): wrap multi-step DB operations in SeaORM transactions (#377)
- fix(storage): add cache-aside invalidation on API key usage write (#378)
- fix(streaming): add CancellationToken to cancel provider streams on client disconnect (#379)
- fix(providers): wire 6 unreachable provider types into factory (#374)
- fix(security): redact sensitive fields in Debug impls for config structs (#369)
- fix(auth): tighten password reset rate limit to 5 requests per 15 minutes (#371)
- fix(rust): add rust-toolchain.toml pinning stable channel (#368)
- fix(router): wire min_requests and success_threshold into circuit breaker (#367)
- fix(providers): implement 5 missing from_config_async branches (#365)
- fix(a2a): replace hardcoded request ID=1 with unique IDs (#361)
- fix(streaming): add VecDeque buffer size limit to prevent OOM (#362)
- fix(responses): address 6 correctness issues from code review (#329)
- fix(responses): resolve CI failures in Responses API implementation (#328)
- fix(macros): remove dead helper functions from provider_config! macro (#321)
- fix(dead_code): resolve 55 of 56 dead_code suppressions (#278) (#320)
- fix(lib): restore FunctionCall and ToolCall to public re-exports (#319)
- fix(errors): replace .unwrap() in production hot paths (#261) (#312)
- fix(openai): update capability lists for GPT-5.4, o3, o4-mini (#274) (#306)
- fix(config): replace hardcoded default string comparison in StorageConfig merge logic (#310)
- fix(config): remove dead hot_reload entries from ConfigPresets (#308)
- fix(core): gate user_management behind storage feature flag (#300)
- fix: deep-merge reasoning object and make effort/max_tokens mutually exclusive (#301)
- fix(openrouter): add HTTP-Referer/X-Title headers and wire reasoning param (#299)
- fix: implement user_management DB ops and wire TeamManager to persistent storage (#298)
- fix: gate virtual_keys module behind gateway feature flag (#294)
- fix: resolve critical TODOs in teams, redis pubsub, and monitoring (#293)
- fix(security): migrate API key hashing to HMAC-SHA256 with server secret (#254)
- fix(auth): implement basic RBAC with admin/user roles in check_permission (#242) (#251)
- fix(security): enforce minimum 32-byte JWT secret length (#240) (#250)
- fix(security): reject empty OAuth allowed_origins instead of permitting all (#241) (#247)
- fix(router): add circular alias and fallback cycle detection (#214) (#234)
- fix(streaming): add idle timeout to SSE streams to prevent zombie connections (#205)
- fix(auth): reject empty JWT secret on startup instead of warn (#204)
- fix(auth): separate access and refresh token verification (#203)
- fix(router): use min_requests and success_threshold in circuit breaker (#200)
- fix(streaming): cancel upstream provider stream on client disconnect (#198)
- fix(provider): add missing from_config_async branches for catalog-covered provider types (#197)
- fix(config): change Redis default to enabled=false (#196)
- fix(streaming): handle SSE errors with proper error events instead of HTTP 200 (#185)
- fix(storage): replace relative ./data path with absolute path in local file storage (#184)
- fix(config): fix boolean merge one-way override in CacheConfig (#183)
- fix(a2a): replace hardcoded request ID=1 with atomic counter (#182)
- fix(config): validate port range to reject values >65535 (#181)
- fix(auth): add input validation for API key creation (#180)
- fix(router): rename CostBased strategy to PriorityBased (#178)
- fix(storage): implement 4 unimplemented S3 methods (#177)
- fix(streaming): add VecDeque buffer capacity limit to prevent OOM (#176)
- fix(security): redact secrets in Debug impl for AuthConfig and ProviderConfig (#175)
- fix(auth): add rate limiting to password reset endpoint (#174)
- fix(storage): replace hardcoded relative SQLite path with platform-aware default_sqlite_path() (#156)
- fix(perf): throttle api_key last_used DB writes to every 5 minutes (#153)
- fix(storage): apply max_connections config to Redis connection pool (#148)
- fix(storage): remove dead BatchOperations referencing nonexistent Database enum (#147)
- fix(security): mask usernames in login log messages to prevent PII leak (#146)
- fix(provider): replace from_f64().unwrap() with safe error handling across providers (#130)
- fix(auth): use transactional reset_password_with_token to eliminate TOCTOU race (#129)
- fix(perf): replace blocking parking_lot::Mutex with tokio::sync::Mutex in memory cache (#133)
- fix(api): forward stream_options field in chat completion requests (#131)
- fix: remove unwrap() panic in Mistral transform_request (closes #77) (#127)
- fix: remove unwrap() panics in vertex_ai provider (closes #78) (#128)
- fix: remove unwrap() panic in S3 cache storage_class parse (closes #76) (#126)
- fix: remove unwrap() panics in openai provider (closes #79) (#125)
- fix(server): mount missing auth/keys/teams/budget/health routes in create_app (#112)
- fix(provider): OpenAILikeProvider::name() returns actual provider name (#117)
- fix(middleware): X-Request-ID generated twice and not returned in responses (#111)
- fix(api): unify pricing routes from /api/v1/ to /v1/ prefix (#123)
- fix(cache): log Redis write failure in dual-cache set_with_size (#121)
- fix(middleware): remove no-op CorsMiddleware implementation (#120)
- fix(budget): eliminate TOCTOU race in create_budget() via Entry API (#116)
- fix(error): replace wildcard with explicit match arms for 11 GatewayError variants (#115)
- fix(sync): eliminate read-modify-write race in AtomicValue::update() (#114)
- fix(security): add ownership verification to API key CRUD endpoints (IDOR) (#110)
- fix(security): SSRF protection for custom API endpoint_url (#109)
- fix(auth): add IP-based rate limiting to /auth/login endpoint (#108)
- fix(api): GET /auth/me incorrectly registered as POST method (#113)
- fix(security): CORS empty origins list no longer defaults to wildcard '*' (#107)
- fix(auth): wrap password reset token ops in database transaction (#73)
- fix(server): add X-Forwarded-For trusted proxy validation (#72)
- fix(auth): replace unwrap_or_else with proper error handling in auth middleware (#67)
- fix(config): correct boolean merge logic in config system (#65)
- fix(deps): consolidate reqwest to single version 0.12.x (#48)
- fix(deps): upgrade quinn-proto to fix CVE-2026-0037 (#50)
- fix(deps): upgrade rand from 0.8 to 0.9 (#47)
- fix(lint): resolve 314 collapsible_if warnings for clippy 1.94.0 (#49)
- fix(security): add rate limiting and unify error messages for registration (#42)
- fix(security): reject session auth until proper session store is implemented (#41)
- fix(security): reject refresh tokens in authenticate_jwt (#39)
- fix(security): use SHA-256 for rate limit key hashing (#40)
- fix(ci): pin rust toolchain and add PR guardrails (#34)
- fix(security): consolidated security hardening — audit fixes, auth hash, env validation, route bypass, OAuth, concurrency (#33)
- fix(error): preserve provider identity in map_http_status_to_error (FUT-59) (#22)
- fix: harden boundary guard and stabilize router/error mapping integration (#14)
- fix(sse): map reasoning_content to thinking delta (#11)
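The route-matching fixes above (#390, #393) replace prefix matching with exact equality. A minimal sketch of why that matters follows. The route list and both function names are hypothetical and do not reproduce the gateway's actual `is_public_route` implementation.

```rust
/// Hypothetical list of routes that skip authentication.
const PUBLIC_ROUTES: &[&str] = &["/health", "/auth/login"];

/// Prefix matching: any path that merely starts with a public route
/// is treated as public, which is the confusion the fix addresses.
fn is_public_route_prefix(path: &str) -> bool {
    PUBLIC_ROUTES.iter().any(|route| path.starts_with(*route))
}

/// Exact matching: only the listed paths themselves are public.
fn is_public_route_exact(path: &str) -> bool {
    PUBLIC_ROUTES.iter().any(|route| *route == path)
}
```

With the prefix variant, a path such as `/health/../admin` is classified as public. The exact variant rejects it while still accepting `/health`.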
Changed
- style(config): format serde_norway migration cleanup
- refactor(deps): use explicit maintained crate names
- refactor(router): remove redundant dead-code zai/ prefix check (#405)
- refactor(providers): split LLMProvider into focused sub-traits (#383)
- refactor: split factory.rs into registry, resolver, builder, coordinator modules (#317)
- refactor(config): split gateway.rs tests and fix pricing source path (#318)
- refactor(errors): split utils.rs (1435 lines) into focused sub-modules (#316)
- refactor(deps): replace async-trait with native AFIT in core traits (#246) (#252)
- refactor(provider): eliminate unwrap() in provider request/response paths (#245) (#248)
- refactor(provider): remove associated types from LLMProvider trait (#238)
- refactor: extract test modules from oversized gateway_error files (#221) (#237)
- refactor(config): consolidate default values into single source of truth (#235)
- refactor(error): simplify From to use GatewayError::Provider directly (#233)
- refactor(storage): add transaction wrapping and optimistic locking for DB operations (#206) (#230)
- refactor(config): replace Arc with AtomicValue for atomic hot reload (#209) (#226)
- refactor(provider): consolidate 5 dispatch macros into single parametric macro (#224)
- refactor(quality): eliminate unwrap() calls in auth and security paths (#215) (#231)
- refactor(storage): remove dead legacy migration files (#222)
- refactor(error): consolidate GatewayError from 29 to 15 variants (#160)
- refactor(provider): remove orphan LLMProvider implementations (#159)
- refactor: split openai/transformer.rs into focused sub-modules (#154)
- refactor(provider): remove standalone impls for catalog-covered providers (#151)
- refactor(provider): remove dead Provider enum variants without factory paths (#150)
- refactor(provider): consolidate OpenAI dual LLMProvider implementations (#149)
- refactor: remove deprecated legacy config types (#132)
- refactor: remove duplicate LiteLLMError and OpenAIError type definitions (#124)
- refactor: 3-phase architectural refactoring (God Module, Type, Error) (#58)
- perf(observability): shorten record_request write lock hold time (#19)
- perf(recovery): remove blocking mutexes in circuit breaker async path (#15)
- perf(health): replace std rwlock with async monitor locks (#20)
- perf(cache): remove deep clone in hit path via Arc payload (#16)
- refactor(streaming): dedupe done marker handling for pilot providers (#24)
Removed
- Removed the legacy google-gateway binary from Cargo, release archives, CI artifacts, and Docker images. The published gateway distribution now focuses on the main gateway executable.
[0.4.2] - 2026-02-28
Fixed
- fix(ci): fallback to grep when ripgrep is unavailable
[0.4.1] - 2026-02-28
Fixed
- fix(clippy): satisfy strict lints in audio service and router tests
[0.4.0] - 2026-02-28
Changed
- Provider Infra: BaseConfig::for_provider() now delegates environment loading with the original provider input while keeping normalized default resolution in one place, removing duplicated normalization flow.
- Provider Infra: BaseConfig::provider_env_key() env-key normalization now explicitly covers trimmed/case-variant provider input via regression test.
- Provider Infra: BaseConfig::provider_env_key() now normalizes provider names internally, and from_env() reuses normalized env helpers directly to remove duplicated normalization flow.
- Provider Infra: Centralized provider environment variable key/value resolution in BaseConfig helpers (provider_env_key, env_value) to remove repeated env lookup formatting.
- Provider Infra: Centralized endpoint URL construction in BaseConfig::build_endpoint() and reused it for chat/embeddings endpoints to remove duplicated formatting logic.
- Provider Infra: Centralized default API version assignment in BaseConfig::default_api_version() to remove repeated provider-specific conditionals.
- Provider Infra: BaseConfig::for_provider now normalizes provider names (trim + lowercase) before catalog/fallback resolution to prevent casing/spacing drift.
- Provider Infra: Removed legacy alias fallback in BaseConfig and kept canonical provider-name defaults only to avoid alias drift.
- Provider Infra: Extracted legacy_default_base_url() helper in BaseConfig to isolate non-catalog fallback mapping and simplify maintenance while preserving behavior.
- Provider Infra: BaseConfig::for_provider now consults Tier-1 provider catalog defaults first, reducing duplicated base URL definitions while preserving existing fallback behavior.
- Provider Infra: Removed the unused CommonProviderConfig duplicate from core::providers::shared, keeping provider base config responsibilities centralized in core::providers::base and reducing schema duplication.
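The normalization described in the entries above (trim + lowercase before resolution, then a normalized environment key) can be sketched as follows. These are free functions written for illustration, not the actual BaseConfig methods, and the `<PROVIDER>_<SUFFIX>` key shape is an assumption.

```rust
/// Normalize a provider name: trim surrounding whitespace and lowercase it.
fn normalize_provider_name(input: &str) -> String {
    input.trim().to_lowercase()
}

/// Build an environment variable key such as `OPENAI_API_KEY`.
/// The `<PROVIDER>_<SUFFIX>` shape is an assumption for this sketch.
fn provider_env_key(provider: &str, suffix: &str) -> String {
    format!(
        "{}_{}",
        normalize_provider_name(provider).to_uppercase(),
        suffix
    )
}
```

Because both helpers go through the same normalization, inputs such as `" OpenAI "` and `"openai"` resolve to the same defaults and the same environment key.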
Added
- Provider Tests: Added B1 batch coverage to validate aiml_api, anyscale, bytez, and comet_api selectors and creation paths resolve through Tier-1 catalog to OpenAILike providers.
- Provider Tests: Added B2 batch coverage to validate compactifai, aleph_alpha, yi, and lambda_ai selector and creation paths resolve through Tier-1 catalog to OpenAILike providers.
- Provider Tests: Added B3 batch coverage to validate ovhcloud, maritalk, siliconflow, and lemonade selector and creation paths resolve through Tier-1 catalog to OpenAILike providers.
0.3.0 - 2026-02-05
Added
- Agent Coordinator: New core::agent module for managing concurrent agent lifecycles with cancellation, timeouts, and stats.
- Utilities: Added utils::event publish/subscribe broker and utils::sync concurrent containers.
Changed
- Providers: Migrated ai21, amazon_nova, datarobot, and deepseek to pooled HTTP provider hooks.
- HTTP Client: Standardized pooled client usage and shared client caching across core/providers.
- Routing: Refined provider routing and OpenAI-compatible request/response handling.
Fixed
- Auth Context: Corrected user/api-key context propagation in auth routes and middleware.
- SSRF Validation: DNS resolution failures no longer hard-fail SSRF checks while preserving IP safety.
- Observability: Prometheus label handling now safely maps provider identifiers.
- Concurrency: Event broker handles zero capacity; VersionedMap retry now guarantees progress under contention.
- Packaging: Track core cache sources and add root README for crates.io.
0.1.3 - 2025-09-18
Fixed
- docs.rs Build: Fixed documentation build failure on docs.rs by excluding the vector-db feature
  - Added all-features = false to package.metadata.docs.rs configuration
  - Explicitly listed features that work with docs.rs read-only filesystem
Added
- Internationalization: Translated all Chinese comments and documentation to English
  - Cleaned 40+ files with hundreds of Chinese comments
  - Improved accessibility for international developers
  - Maintained technical accuracy in all translations
Changed
- Configuration: Updated Cargo.toml metadata for better docs.rs compatibility
- Documentation: All code comments are now in English
0.1.1 - 2025-07-28
Fixed
- Security: Excluded sensitive configuration file config/gateway.yaml from published package
- Package: Only include example configuration files (.example, .template) in published crate
- Privacy: Prevent accidental exposure of API keys and secrets in published package
0.1.0 - 2025-07-28
Added
- Initial release of Rust LiteLLM Gateway
- High-performance AI Gateway with OpenAI-compatible APIs
- Intelligent routing and load balancing capabilities
- Support for multiple AI providers (OpenAI, Anthropic, Google, etc.)
- Enterprise features including authentication and monitoring
- Actix-web based web server with async/await support
- PostgreSQL and Redis integration for data persistence and caching
- Comprehensive configuration management via YAML
- Rate limiting and request throttling
- WebSocket support for real-time communication
- Prometheus metrics integration
- OpenTelemetry tracing support
- Vector database integration (Qdrant)
- S3-compatible object storage support
- JWT-based authentication system
- Docker and Kubernetes deployment configurations
- Comprehensive API documentation
- Integration tests and examples
Features
- Core Gateway: OpenAI-compatible API endpoints
- Multi-Provider Support: Seamless integration with various AI providers
- Load Balancing: Intelligent request distribution
- Caching: Redis-based response caching
- Monitoring: Prometheus metrics and OpenTelemetry tracing
- Authentication: JWT-based security
- Rate Limiting: Configurable request throttling
- WebSocket: Real-time streaming support
- Storage: PostgreSQL for persistence, S3 for object storage
- Vector DB: Qdrant integration for embeddings
- Deployment: Docker, Kubernetes, and systemd configurations