When a hosted agent's LLM call returns HTTP 429 (rate limit exceeded), the error is caught by the generic _HandlerError handler in _endpoint_handler.py and replaced with a hardcoded {"error":{"message":"internal server error","type":"server_error","code":"server_error"}} response. The original error details (429 status, retry-after headers, rate limit message) are dropped from the response and only logged to container stderr, which most users never see.
Observed on: 2026-06-12T07:24:32Z, session 93e0a28c2603bb3e3ab60431a3972efe360a8fcb62e90200c4fec93967d6b5c, eastus2, agent browser-automation-agent-sample-foundry:1
Span
| where TIMESTAMP between (datetime(2026-06-12T07:24:30Z) .. datetime(2026-06-12T07:24:35Z))
| where env_dt_traceId == "3fa039b49b814a369d928eb9b97773a5"
| project TIMESTAMP, name, ['http.response.status_code'], success, statusMessage
| order by TIMESTAMP asc
6/12/2026, 7:24:32.605 AM EnableRequestBufferingAttribute true
6/12/2026, 7:24:32.652 AM rate_limit_request true
6/12/2026, 7:24:32.654 AM xapi.api_method._validate_query_params true
6/12/2026, 7:24:32.654 AM xapi.api_method._validate_path_params true
6/12/2026, 7:24:32.655 AM read_body true
6/12/2026, 7:24:32.656 AM xapi.utils.json_utils.parse_json true
6/12/2026, 7:24:32.665 AM parse_function true
6/12/2026, 7:24:32.668 AM parse_function true
6/12/2026, 7:24:32.67 AM parse_function true
6/12/2026, 7:24:32.673 AM parse_function true
6/12/2026, 7:24:32.675 AM parse_function true
6/12/2026, 7:24:32.677 AM parse_function true
6/12/2026, 7:24:32.679 AM xapi.api_method._validate true
6/12/2026, 7:24:32.679 AM xapi.api_method._validate_body true
6/12/2026, 7:24:32.679 AM validate_body_param true
6/12/2026, 7:24:32.693 AM responsesapi.response_creator_service.ensure_options true
6/12/2026, 7:24:32.715 AM responsesapi.response_creator_service.hydrate_model_info true
6/12/2026, 7:24:32.72 AM rate_limit_tool_call true
6/12/2026, 7:24:32.734 AM get_item true
6/12/2026, 7:24:32.734 AM POST true
6/12/2026, 7:24:32.744 AM get_item true
6/12/2026, 7:24:32.744 AM POST true
6/12/2026, 7:24:32.756 AM POST true
6/12/2026, 7:24:32.757 AM get_item true
6/12/2026, 7:24:32.769 AM POST true
6/12/2026, 7:24:32.77 AM get_item true
6/12/2026, 7:24:32.786 AM Provider.ScopeByoEntityClientProvider.GetClientAsync true
6/12/2026, 7:24:32.788 AM GET dbs/agents/colls/run-state-v1/docs/fc_6a2bb4144b488196828d16db7459156a false 404
6/12/2026, 7:24:32.788 AM GET dbs/agents/colls/run-state-v1/docs/fc_6a2bb420a8e08196a654b3fac45a4ac2 false 404
6/12/2026, 7:24:32.788 AM GET dbs/agents/colls/run-state-v1/docs/fc_6a2bb426d2388196a32b2685f5061066 false 404
6/12/2026, 7:24:32.789 AM GET dbs/agents/colls/run-state-v1/docs/fc_6a2bb41a649c8196baad5b9e3c6b7e9e false 404
6/12/2026, 7:24:32.79 AM Controller.StorageController.RetrieveItemsBatchAsync true
6/12/2026, 7:24:32.791 AM POST agents/v2.0/subscriptions/{subscription}/resourceGroups/{resourceGroup}/providers/Microsoft.MachineLearningServices/workspaces/{workspace}/entities/items/batch true 200
6/12/2026, 7:24:32.793 AM responses_common.services.msft_sediment_hydration_service.hydrate_items true
6/12/2026, 7:24:32.793 AM POST true
6/12/2026, 7:24:32.794 AM build_items true
6/12/2026, 7:24:32.816 AM conversation_context true
6/12/2026, 7:24:32.817 AM responsesapi.response_creator_service.context true
6/12/2026, 7:24:32.817 AM _hydrate_files true
6/12/2026, 7:24:32.818 AM responsesapi.response_creator_service.previous_response_hydration_service true
6/12/2026, 7:24:32.826 AM responsesapi.response_creator_service.create_response true
6/12/2026, 7:24:32.826 AM responsesapi.response_creator_service.enrich_response_tools true
6/12/2026, 7:24:32.827 AM get_hydrated_model_info_config true
6/12/2026, 7:24:32.829 AM responsesapi.create_response_method.parse_request_body true
6/12/2026, 7:24:32.867 AM emitter.emit true
6/12/2026, 7:24:32.874 AM responses_common.persistence.models.chat_conversation true
6/12/2026, 7:24:32.874 AM responsesapi.execute_response.build_conversation true
6/12/2026, 7:24:32.892 AM POST true
6/12/2026, 7:24:32.894 AM emitter.emit true
6/12/2026, 7:24:32.905 AM execute_response_telemetry false TooManyRequests: TooManyRequests[429, internal=Your requests to gpt-5.4 for gpt-5.4 in eastus2 have exceeded rate limit., user=Too Many Requests, internal_extra={'internal_message': 'Your requests to gpt-5.4 for gpt-5.4 in eastus2 have exceeded rate limit.', 'user_message': 'Too Many Requests', 'status': 429, 'headers': {'x-ms-fe-error': 'true'}, 'type': 'too_many_requests', 'code': 'too_many_requests'}]
6/12/2026, 7:24:32.905 AM emitter.emit true
6/12/2026, 7:24:32.923 AM xapi.api_method.endpoint_with_background_tasks true
6/12/2026, 7:24:32.926 AM POST v1/responses true 200
6/12/2026, 7:24:32.927 AM ProxyRequest true
6/12/2026, 7:24:32.928 AM POST /v1/responses true
6/12/2026, 7:24:32.931 AM POST agents/v2.0/subscriptions/{subscription}/resourceGroups/{resourceGroup}/providers/Microsoft.MachineLearningServices/workspaces/{workspace}/openai/responses true 200
6/12/2026, 7:24:32.983 AM AuthorizeFilter true
6/12/2026, 7:24:32.997 AM Provider.ScopeByoEntityClientProvider.GetClientAsync true
6/12/2026, 7:24:33 AM GET dbs/agents/colls/run-state-v1/docs/conv_bhQ2a7EiIxBo8Xfp6tl6bViqIX8c8E9B true 200
6/12/2026, 7:24:33.002 AM Provider.ScopeByoEntityClientProvider.GetClientAsync true
6/12/2026, 7:24:33.005 AM GET dbs/agents/colls/run-state-v1/docs/conv2item_bhQ2a7EiIxBo8Xfp6tl6bViqIX8c8E9B_0 true 200
6/12/2026, 7:24:33.01 AM GET dbs/agents/colls/run-state-v1/docs/conv_bhQ2a7EiIxBo8Xfp6tl6bViqIX8c8E9B true 200
6/12/2026, 7:24:33.021 AM POST dbs/agents/colls/run-state-v1/docs true 200
6/12/2026, 7:24:33.022 AM Provider.ScopeByoEntityClientProvider.GetClientAsync true
6/12/2026, 7:24:33.023 AM Provider.ScopeByoEntityClientProvider.GetClientAsync true
6/12/2026, 7:24:33.026 AM GET dbs/agents/colls/run-state-v1/docs/conv_bhQ2a7EiIxBo8Xfp6tl6bViqIX8c8E9B true 200
6/12/2026, 7:24:33.026 AM Provider.ScopeByoEntityClientProvider.GetClientAsync true
6/12/2026, 7:24:33.03 AM POST dbs/agents/colls/run-state-v1/docs true 201
6/12/2026, 7:24:33.033 AM POST dbs/agents/colls/run-state-v1/docs true 201
6/12/2026, 7:24:33.033 AM Manager.ResponseStorageManager.CreateAsync true
6/12/2026, 7:24:33.035 AM POST agents/v2.0/subscriptions/{subscription}/resourceGroups/{resourceGroup}/providers/Microsoft.MachineLearningServices/workspaces/{workspace}/storage/responses true 201
6/12/2026, 7:24:33.045 AM ContainerProxy.Forward true 200
6/12/2026, 7:24:33.046 AM POST agents/v2.0/subscriptions/{subscription}/resourceGroups/{resourceGroup}/providers/Microsoft.MachineLearningServices/workspaces/{workspace}/agents/{agentName}/endpoint/protocols/openai/responses true 200
Telemetry evidence:
The KQL query and output above are from the Span table in aoaiagents1.westus.kusto.windows.net / prod database, trace ID 3fa039b49b814a369d928eb9b97773a5.
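The masking described in the summary, and one way it could be avoided, can be sketched in Python as follows. This is a minimal illustration, not the actual contents of _endpoint_handler.py: UpstreamError, masked_body, and passthrough_error are hypothetical names, the Retry-After value is invented for the example (the trace only shows an x-ms-fe-error header), and the 500 fallback status is an assumption. The error type, code, and user message are taken from the execute_response_telemetry span.

```python
import json

# Generic body quoted in the report; every failure is currently mapped to it.
GENERIC_BODY = {
    "error": {
        "message": "internal server error",
        "type": "server_error",
        "code": "server_error",
    }
}


class UpstreamError(Exception):
    """Hypothetical stand-in for the exception raised by the failed LLM call."""

    def __init__(self, status, message, code, headers=None):
        super().__init__(message)
        self.status = status
        self.message = message
        self.code = code
        self.headers = dict(headers or {})


def masked_body(exc):
    """Reported behaviour: the details of exc are ignored entirely."""
    return GENERIC_BODY


def passthrough_error(exc):
    """Possible alternative: surface 429s, keep masking everything else.

    Returns (status, headers, body). Only Retry-After is forwarded so that
    internal headers such as x-ms-fe-error do not leak to the caller.
    """
    if isinstance(exc, UpstreamError) and exc.status == 429:
        headers = {
            name: value
            for name, value in exc.headers.items()
            if name.lower() == "retry-after"
        }
        body = {
            "error": {"message": exc.message, "type": exc.code, "code": exc.code}
        }
        return 429, headers, body
    # Assumed fallback status for anything that is not a rate limit.
    return 500, {}, GENERIC_BODY


if __name__ == "__main__":
    exc = UpstreamError(
        429,
        "Too Many Requests",
        "too_many_requests",
        {"Retry-After": "2", "x-ms-fe-error": "true"},  # Retry-After is illustrative
    )
    print(json.dumps(masked_body(exc)))
    status, headers, body = passthrough_error(exc)
    print(status, headers, json.dumps(body))
```

Forwarding only the user-facing message ("Too Many Requests") rather than the internal one keeps deployment details out of the response while still letting clients distinguish a retryable rate limit from a genuine server fault.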