Configure rate limits on VirtualMCPServer PR B 1#5276
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5276 +/- ##
==========================================
+ Coverage 68.87% 69.20% +0.32%
==========================================
Files 634 635 +1
Lines 64439 64522 +83
==========================================
+ Hits 44384 44650 +266
+ Misses 16779 16569 -210
- Partials 3276 3303 +27 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
695a96c to
480152f
Compare
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
480152f to
619264d
Compare
619264d to
f3cf1e4
Compare
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
There was a problem hiding this comment.
Pull request overview
This PR wires the VirtualMCPServer.spec.config.rateLimiting field (introduced in PR A) into the vMCP runtime so rate limits are actually enforced. It builds a Redis-backed limiter during server construction, refactors the MCP middleware chain so rate limiting runs after auth + MCP parsing (allowing per-user/per-tool keying), and adds optimizer-aware tool-name resolution for call_tool.
Changes:
- Pass namespace and
RateLimitingconfig throughvmcpserver.Configand initialize a Redis-backed rate-limit middleware (with cleanup on Stop) inNew(). - Refactor the vMCP MCP middleware composition into small
applyXxxhelpers and place rate limiting between MCP parser and audit/discovery; resolvecall_toolto its innertool_namewhen optimizer is active. - Generalize
pkg/ratelimit/middleware.goto exposeNewMiddlewarewith an optionalToolNameResolver, and add unit + focused E2E coverage.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
pkg/vmcp/cli/serve.go |
Plumbs VMCP_NAMESPACE env (defaulting to local) and RateLimiting into vmcpserver.Config. |
pkg/vmcp/cli/serve_test.go |
Adds unit tests for vmcpNamespace() default and env override. |
pkg/vmcp/server/server.go |
Adds Namespace + RateLimiting config fields, initializes rate-limit middleware in New(), and splits MCP middleware chain into helpers with new execution order. |
pkg/vmcp/server/ratelimit.go |
New file: builds Redis client + limiter, validates Redis session storage, returns middleware + cleanup, and resolves optimizer call_tool inner tool name. |
pkg/vmcp/server/ratelimit_test.go |
New unit tests for config validation, per-user/per-tool/optimizer behavior, and chain helpers. |
pkg/ratelimit/middleware.go |
Introduces ToolNameResolver/DefaultToolNameResolver and exported NewMiddleware; CreateMiddleware now uses it. |
pkg/ratelimit/middleware_test.go |
Adds tests for custom resolver, default resolver nil-handling, and end-to-end CreateMiddleware registration. |
test/e2e/thv-operator/virtualmcp/virtualmcp_rate_limiting_test.go |
New focused E2E: deploys Redis + OIDC + backend + vMCP with perUser.maxTokens=1 and asserts second tools/call is rejected. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f3cf1e4 to
589dfe9
Compare
|
@jerm-dro Thanks, that split makes sense. |
jerm-dro
left a comment
There was a problem hiding this comment.
Inline question on the duplication with pkg/ratelimit/middleware.go — blocking until we've decided whether to share or document why we can't.
| return rateLimitHandler(limiter), cleanup, nil | ||
| } | ||
|
|
||
| func rateLimitHandler(limiter ratelimit.Limiter) func(http.Handler) http.Handler { |
There was a problem hiding this comment.
question (blocking): rateLimitHandler, writeRateLimited, and rateLimitedBody here look like close copies of pkg/ratelimit/middleware.go:110-175 — same tools/call early-return, same identity extraction, same JSON-RPC 429 body. The Redis-ping flow above also overlaps with lines 80-95 of that file.
Is there a way to share the rate-limit logic between the two paths, or is there a constraint that makes that worse than it looks? Marking blocking mainly because two copies of the wire contract tend to drift. Open on how to address it.
There was a problem hiding this comment.
I removed the copied vMCP rate-limit handler/429 response/Redis setup logic.
The vMCP factory now delegates to the existing pkg/ratelimit.CreateMiddleware using a small runner adapter that captures the middleware and cleanup. That keeps pkg/ratelimit/middleware.go as the single owner of the tools/call filtering, identity extraction, fail-open behavior, Redis ping/client setup, and JSON-RPC 429 wire contract, while still letting vMCP pass RateLimitMiddleware into server.Config.
d09cb69 to
f5a7973
Compare
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
f5a7973 to
727c50e
Compare
| return runner.middleware.Handler(), cleanup, nil | ||
| } | ||
|
|
||
| type captureRunner struct { |
There was a problem hiding this comment.
suggestion: captureRunner implements the MiddlewareRunner interface only to catch one callback — the other five methods (SetAuthInfoHandler, SetPrometheusHandler, GetConfig, GetUpstreamTokenReader, GetKeyProvider) are dead-weight stubs that will need updating every time the interface grows. There's also a JSON round-trip: this factory marshals MiddlewareParams into MiddlewareConfig.Parameters, then ratelimit.CreateMiddleware immediately unmarshals it back into the same struct in the same process. Both round-trips exist only to fit through an API shaped for a different caller.
The cleaner path is to export a value-returning builder from pkg/ratelimit and let CreateMiddleware become a thin wrapper that calls it:
// In pkg/ratelimit/middleware.go
func NewMiddleware(params MiddlewareParams) (types.Middleware, error) {
if params.RedisAddr == "" {
return nil, fmt.Errorf("rate limit middleware requires a Redis address")
}
client := redis.NewClient(&redis.Options{
Addr: params.RedisAddr,
DB: int(params.RedisDB),
Password: os.Getenv(redisPasswordEnvVar),
})
pingCtx, pingCancel := context.WithTimeout(context.Background(), 5*time.Second)
defer pingCancel()
if err := client.Ping(pingCtx).Err(); err != nil {
_ = client.Close()
return nil, fmt.Errorf("rate limit middleware: failed to connect to Redis at %s: %w", params.RedisAddr, err)
}
limiter, err := NewLimiter(client, params.Namespace, params.ServerName, params.Config)
if err != nil {
_ = client.Close()
return nil, fmt.Errorf("failed to create rate limiter: %w", err)
}
return &rateLimitMiddleware{handler: rateLimitHandler(limiter), client: client}, nil
}
// CreateMiddleware stays as the Runner-API entry point for legacy callers
func CreateMiddleware(config *types.MiddlewareConfig, runner types.MiddlewareRunner) error {
var params MiddlewareParams
if err := json.Unmarshal(config.Parameters, ¶ms); err != nil {
return fmt.Errorf("failed to unmarshal rate limit middleware parameters: %w", err)
}
mw, err := NewMiddleware(params)
if err != nil {
return err
}
runner.AddMiddleware(MiddlewareType, mw)
return nil
}This file then collapses to:
func NewMiddleware(_ context.Context, cfg Config) (func(http.Handler) http.Handler, func(context.Context) error, error) {
if cfg.RateLimiting == nil {
return nil, nil, nil
}
if cfg.SessionStorage == nil || cfg.SessionStorage.Provider != "redis" {
return nil, nil, fmt.Errorf("rate limiting requires Redis session storage")
}
if cfg.SessionStorage.Address == "" {
return nil, nil, fmt.Errorf("rate limiting requires Redis session storage address")
}
mw, err := ratelimit.NewMiddleware(ratelimit.MiddlewareParams{
Namespace: cfg.Namespace,
ServerName: cfg.ServerName,
Config: cfg.RateLimiting,
RedisAddr: cfg.SessionStorage.Address,
RedisDB: cfg.SessionStorage.DB,
})
if err != nil {
return nil, nil, err
}
cleanup := func(context.Context) error { return mw.Close() }
return mw.Handler(), cleanup, nil
}captureRunner and the five stubs go away, the JSON marshal/unmarshal round-trip goes away, and ratelimit.CreateMiddleware is still available for the legacy Runner-based callers. The single keep-this-tradeoff-clear change is one new exported symbol (ratelimit.NewMiddleware).
There was a problem hiding this comment.
Thanks, that make sense, I've added the commit with suggested changes.
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Summary
This PR wires the
VirtualMCPServer.spec.config.rateLimitingconfig into the vMCP runtime for normaltools/callrequests.The change now focuses only on the base rate-limit runtime path:
pkg/ratelimit.NewMiddlewareas the typed, value-returning builder for Redis setup, limiter construction, cleanup, and HTTP wire behavior.pkg/ratelimit.CreateMiddlewareas the legacy runner-based entry point by delegating topkg/ratelimit.NewMiddleware.vmcpserver.ConfigasRateLimitMiddleware.parsed.ResourceIDas the rate-limit tool key.pkg/ratelimit/middleware.goas the single owner of request-time rate-limit behavior.Optimizer-aware
call_toolinner-tool-name resolution is intentionally deferred to a follow-up PR.Fixes: #4552
Type of Change
Test plan
task test)task test-e2e)task lint-fix)Changes
pkg/ratelimit/middleware.goNewMiddlewareas the typed reusable builder and keepsCreateMiddlewareas the runner API wrapper.pkg/vmcp/ratelimit/factory/middleware.gopkg/ratelimit.NewMiddlewaredirectly.pkg/vmcp/cli/serve.gopkg/vmcp/server/server.goRateLimitMiddlewareto server config and inserts it into the MCP middleware chain.pkg/ratelimit/middleware_test.goNewMiddlewarebuilder and preserves runner-wrapper coverage.pkg/vmcp/ratelimit/factory/middleware_test.gotest/e2e/thv-operator/virtualmcp/virtualmcp_rate_limiting_test.goNotes for Reviewers
pkg/ratelimit:NewMiddleware(MiddlewareParams), so vMCP can reuse the rate-limit implementation without a fakeMiddlewareRunneror JSON marshal/unmarshal round-trip.call_tooltool-name extraction.pkg/ratelimit/middleware.goremains the single owner of Redis setup, limiter construction, cleanup, the JSON-RPC 429 response, and request-time rate-limit behavior.Manual Testing
Prerequisites
Use a local kind cluster with locally built operator and vMCP images:
Verify the cluster and operator:
Expected:
Ready.Running.VirtualMCPServerandMCPServerCRDs exist.1. Validate Admission: Redis Is Required
Apply a vMCP with
config.rateLimitingbut without Redis session storage:Expected rejection:
2. Validate Admission: Per-User Requires OIDC
Apply a vMCP with
perUserrate limiting but anonymous auth:Expected rejection:
3. Deploy Redis
4. Verify ConfigMap Serialization
Create a valid vMCP with a shared limit. This uses anonymous auth so setup stays simple; runtime per-user enforcement is validated by the focused E2E test in the next section.
Wait for readiness:
Inspect the generated vMCP config:
Expected
config.yamlcontains:5. Verify Runtime Enforcement
Run the focused E2E. It deploys Redis, a parameterized OIDC server, one backend MCPServer, and a vMCP with
perUser.maxTokens=1. It then sends two authenticatedtools/callrequests from the same user and expects the second call to be rate-limited.Expected:
This validates:
config.rateLimiting.tools/callreceives the expected rate-limit error.6. Cleanup
The focused E2E uses timestamped resources and cleans them up automatically.
Large PR Justification
This is a new feature package with a large test suite, and it needs to land as one coherent phase.