An MCP server that exposes NVIDIA GPU metrics as tools. Any MCP-compatible AI agent (Claude, Goose, Cursor, etc.) can query real-time GPU utilization, memory, temperature, power, and PCIe and NVLink throughput, with no Prometheus or dcgm-exporter required.
Built on the official Go MCP SDK and NVIDIA go-nvml.

| Tool | Description |
|---|---|
| list_gpus | List all GPUs with utilization and memory info |
| get_gpu_metrics | Detailed metrics for a GPU by index or UUID |
| get_gpu_processes | PID-level GPU process attribution |
| gpu_summary | Aggregate stats across all devices |

All tools support MIG (Multi-Instance GPU). MIG instances appear as separate devices and report their parent GPU's shared metrics (temperature, power, PCIe).
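
On the wire, an agent invokes one of these tools with an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The following is a minimal sketch of building such requests with only the Go standard library; `NewToolCall` and the struct names are hypothetical helpers for illustration, not part of this project or of the Go MCP SDK, and an SDK client would normally build these messages for you.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP "tools/call" request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// NewToolCall (hypothetical helper) encodes a single tools/call request.
// A nil args map is omitted from the encoded message.
func NewToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	calls := []struct {
		tool string
		args map[string]any
	}{
		{"list_gpus", nil},
		{"get_gpu_metrics", map[string]any{"index": 0}},
		{"get_gpu_metrics", map[string]any{"uuid": "GPU-aaaa-1111"}},
	}
	for i, c := range calls {
		msg, err := NewToolCall(i+1, c.tool, c.args)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(msg))
	}
}
```

Each printed line is one complete request. A real session also performs the MCP `initialize` handshake before the first tool call, which this sketch leaves out.
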
Each tool returns structured JSON. The examples below show the shape of the data an agent receives from a node with two NVIDIA A100 GPUs.
list_gpus:

    {
      "count": 2,
      "devices": [
        {
          "index": 0,
          "uuid": "GPU-aaaa-1111",
          "name": "NVIDIA A100-SXM4-80GB",
          "gpu_utilization_percent": 85,
          "memory_used_mib": 57344,
          "memory_total_mib": 81920
        },
        {
          "index": 1,
          "uuid": "GPU-bbbb-2222",
          "name": "NVIDIA A100-SXM4-80GB",
          "gpu_utilization_percent": 20,
          "memory_used_mib": 12288,
          "memory_total_mib": 81920
        }
      ]
    }

get_gpu_metrics (with {"index": 0} or {"uuid": "GPU-aaaa-1111"}):

    {
      "index": 0,
      "uuid": "GPU-aaaa-1111",
      "name": "NVIDIA A100-SXM4-80GB",
      "gpu_utilization_percent": 85,
      "memory_utilization_percent": 70,
      "memory_used_mib": 57344,
      "memory_total_mib": 81920,
      "temperature_celsius": 72,
      "power_draw_watts": 300,
      "power_limit_watts": 400,
      "pcie_tx_kbps": 0,
      "pcie_rx_kbps": 0,
      "nvlink_tx_mbps": 0,
      "nvlink_rx_mbps": 0
    }

gpu_summary:

    {
      "device_count": 2,
      "avg_gpu_utilization": 52.5,
      "avg_memory_utilization": 42.5,
      "total_memory_used_mib": 69632,
      "total_memory_total_mib": 163840,
      "max_temperature_celsius": 72,
      "total_power_draw_watts": 375
    }

MIG instances add is_mig, parent_gpu, and mig_profile fields to the get_gpu_metrics and list_gpus payloads.
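
To make the relationship between the payloads concrete, here is a minimal sketch that decodes a list_gpus payload and recomputes the aggregates that gpu_summary reports for the same node. The `Device`, `Totals`, and `Summarize` names are hypothetical, the numeric Go types are assumptions based on the example values, and the MIG fields are kept as raw JSON because their types are not stated here. How the server itself aggregates (for example, whether MIG instances count as devices) is not covered by this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Device mirrors one entry of the list_gpus "devices" array.
type Device struct {
	Index          int     `json:"index"`
	UUID           string  `json:"uuid"`
	Name           string  `json:"name"`
	GPUUtilization float64 `json:"gpu_utilization_percent"`
	MemoryUsedMiB  uint64  `json:"memory_used_mib"`
	MemoryTotalMiB uint64  `json:"memory_total_mib"`

	// Present only on MIG instances. Field types are an assumption,
	// so they are left undecoded.
	IsMIG      json.RawMessage `json:"is_mig,omitempty"`
	ParentGPU  json.RawMessage `json:"parent_gpu,omitempty"`
	MIGProfile json.RawMessage `json:"mig_profile,omitempty"`
}

// Totals holds the aggregates that can be derived from list_gpus alone.
type Totals struct {
	DeviceCount       int
	AvgGPUUtilization float64
	MemoryUsedMiB     uint64
	MemoryTotalMiB    uint64
}

// Summarize decodes a list_gpus payload and sums over its entries.
func Summarize(payload []byte) (Totals, error) {
	var list struct {
		Devices []Device `json:"devices"`
	}
	if err := json.Unmarshal(payload, &list); err != nil {
		return Totals{}, fmt.Errorf("decode list_gpus: %w", err)
	}
	out := Totals{DeviceCount: len(list.Devices)}
	var utilSum float64
	for _, d := range list.Devices {
		utilSum += d.GPUUtilization
		out.MemoryUsedMiB += d.MemoryUsedMiB
		out.MemoryTotalMiB += d.MemoryTotalMiB
	}
	if out.DeviceCount > 0 {
		out.AvgGPUUtilization = utilSum / float64(out.DeviceCount)
	}
	return out, nil
}

func main() {
	// Abbreviated copy of the two-GPU list_gpus example.
	payload := []byte(`{"count":2,"devices":[
	  {"index":0,"uuid":"GPU-aaaa-1111","gpu_utilization_percent":85,"memory_used_mib":57344,"memory_total_mib":81920},
	  {"index":1,"uuid":"GPU-bbbb-2222","gpu_utilization_percent":20,"memory_used_mib":12288,"memory_total_mib":81920}]}`)
	totals, err := Summarize(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", totals)
}
```

For the two-GPU example, this gives an average utilization of 52.5 and 69632 of 163840 MiB in use, which agrees with the gpu_summary example. Temperature and power are not part of list_gpus, so those summary fields cannot be derived this way.
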

    # build (requires CGO + NVML headers on Linux)
    make build
    # run (the server communicates over stdio)
    ./gpu-mcp-server

Add to claude_desktop_config.json:

    {
      "mcpServers": {
        "gpu": {
          "command": "/path/to/gpu-mcp-server"
        }
      }
    }

Add to the Goose configuration file (YAML):

    extensions:
      gpu-metrics:
        type: stdio
        cmd: /path/to/gpu-mcp-server

Add to .cursor/mcp.json for a project, or ~/.cursor/mcp.json for all projects:

    {
      "mcpServers": {
        "gpu": {
          "type": "stdio",
          "command": "/path/to/gpu-mcp-server"
        }
      }
    }

Add to ~/.codeium/windsurf/mcp_config.json:

    {
      "mcpServers": {
        "gpu": {
          "command": "/path/to/gpu-mcp-server"
        }
      }
    }

Requires Go 1.23+, CGO, and NVIDIA drivers on the target machine.

    make build   # compile binary
    make test    # run tests (no GPU needed; uses mock)
    make lint    # golangci-lint
    make docker  # container image

Tests use a mock collector, so they run anywhere, with no GPU hardware required.
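
The mock-collector idea can be illustrated with a small interface seam. The sketch below is an assumption about the general pattern, not this repository's actual types: `Collector`, `MockCollector`, `Metrics`, and `MaxTemperature` are hypothetical names, and the second device's temperature is invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Metrics is a trimmed-down per-device snapshot (hypothetical type).
type Metrics struct {
	Index              int
	TemperatureCelsius int
}

// Collector is a hypothetical seam between tool logic and the metrics
// source. A production implementation would call NVML; tests would not.
type Collector interface {
	Devices() ([]Metrics, error)
}

// MockCollector returns canned data, so it needs no GPU or driver.
type MockCollector struct {
	Data []Metrics
	Err  error
}

// Devices implements Collector by returning the canned values.
func (m MockCollector) Devices() ([]Metrics, error) { return m.Data, m.Err }

// MaxTemperature depends only on the interface, so the same code runs
// against real hardware or a mock.
func MaxTemperature(c Collector) (int, error) {
	devs, err := c.Devices()
	if err != nil {
		return 0, fmt.Errorf("collect devices: %w", err)
	}
	if len(devs) == 0 {
		return 0, errors.New("no devices reported")
	}
	hottest := devs[0].TemperatureCelsius
	for _, d := range devs[1:] {
		if d.TemperatureCelsius > hottest {
			hottest = d.TemperatureCelsius
		}
	}
	return hottest, nil
}

func main() {
	mock := MockCollector{Data: []Metrics{
		{Index: 0, TemperatureCelsius: 72},
		{Index: 1, TemperatureCelsius: 65}, // invented value for illustration
	}}
	hottest, err := MaxTemperature(mock)
	if err != nil {
		panic(err)
	}
	fmt.Println("max temperature:", hottest)
}
```

Because the aggregation logic takes the interface rather than a concrete NVML handle, a test can also inject an error through `MockCollector.Err` to exercise failure paths that are hard to trigger on real hardware.
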

    Agent (Claude/Goose) ─── MCP (stdio) ──→ gpu-mcp-server ──→ NVML ──→ GPU
                                                   │
                                              Tools:
                                              • list_gpus
                                              • get_gpu_metrics
                                              • get_gpu_processes
                                              • gpu_summary

The server runs as a local process alongside the agent. It calls NVML directly through cgo — no sidecar, no network hops, no metric pipeline to configure.
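
The stdio hop in this picture is simple: under MCP's stdio transport, each JSON-RPC message is a single line of JSON terminated by a newline, written to the server's stdin and read back from its stdout. The sketch below shows one request/response round trip under that framing. `RoundTrip` and `fakeServer` are hypothetical helpers; the fake server is an in-memory stand-in that returns a canned reply so the example runs without a GPU or the real binary.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// RoundTrip writes one newline-terminated message and reads one reply line.
func RoundTrip(w io.Writer, r *bufio.Reader, request []byte) (json.RawMessage, error) {
	if _, err := fmt.Fprintf(w, "%s\n", request); err != nil {
		return nil, fmt.Errorf("write request: %w", err)
	}
	line, err := r.ReadBytes('\n')
	if err != nil {
		return nil, fmt.Errorf("read reply: %w", err)
	}
	if !json.Valid(line) {
		return nil, fmt.Errorf("reply is not valid JSON: %q", line)
	}
	return json.RawMessage(bytes.TrimSpace(line)), nil
}

// fakeServer stands in for the server process: it reads one request line
// and answers with the given canned reply.
func fakeServer(reply string) (io.Writer, *bufio.Reader) {
	reqR, reqW := io.Pipe()
	respR, respW := io.Pipe()
	go func() {
		defer respW.Close()
		if _, err := bufio.NewReader(reqR).ReadBytes('\n'); err != nil {
			return
		}
		fmt.Fprintln(respW, reply)
	}()
	return reqW, bufio.NewReader(respR)
}

func main() {
	// Canned reply in the shape of a tools/call result; the text is made up.
	w, r := fakeServer(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{\"device_count\":2}"}]}}`)
	request := []byte(`{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"gpu_summary"}}`)
	reply, err := RoundTrip(w, r, request)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(reply))
}
```

Against the real binary, the writer and reader would instead come from `StdinPipe` and `StdoutPipe` on an `exec.Cmd`, and the session would begin with the MCP `initialize` handshake before any tool call. MCP clients and the Go MCP SDK handle both for you, so this is only meant to show how little sits between the agent and the server.
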
- License: Apache 2.0
- Language: Go
- AAIF project alignment: MCP
- Related: keda-gpu-scaler (GPU autoscaling for Kubernetes)
See ROADMAP.md for the 12-month public roadmap.
See CONTRIBUTING.md for how to get involved.
Thanks to all our contributors! Add yourself via PR.
This project follows Linux Foundation Minimum Viable Governance.
- ROADMAP.md - public roadmap
- GOVERNANCE.md - decision-making process
- DEPENDENCIES.md - external dependencies and licenses
- SECURITY.md - vulnerability reporting
- AGENTS.md - instructions for AI agents working on this repo
- CODE_OF_CONDUCT.md - community standards