pmady/gpu-mcp-server

gpu-mcp-server

An MCP server that exposes NVIDIA GPU metrics as tools. Any MCP-compatible AI agent (Claude, Goose, Cursor, etc.) can query real-time GPU utilization, memory, temperature, power, and PCIe and NVLink throughput, with no Prometheus or dcgm-exporter required.

Built on the official Go MCP SDK and NVIDIA go-nvml.

Tools

| Tool | Description |
| --- | --- |
| list_gpus | List all GPUs with utilization and memory info |
| get_gpu_metrics | Detailed metrics for a GPU by index or UUID |
| get_gpu_processes | PID-level GPU process attribution |
| gpu_summary | Aggregate stats across all devices |

All tools support MIG (Multi-Instance GPU). MIG instances appear as separate devices and report their parent GPU's shared metrics (temperature, power, PCIe).

Sample output

Each tool returns structured JSON. The examples below show the shape of the data an agent receives from a node with two NVIDIA A100 GPUs.

list_gpus:

{
  "count": 2,
  "devices": [
    {
      "index": 0,
      "uuid": "GPU-aaaa-1111",
      "name": "NVIDIA A100-SXM4-80GB",
      "gpu_utilization_percent": 85,
      "memory_used_mib": 57344,
      "memory_total_mib": 81920
    },
    {
      "index": 1,
      "uuid": "GPU-bbbb-2222",
      "name": "NVIDIA A100-SXM4-80GB",
      "gpu_utilization_percent": 20,
      "memory_used_mib": 12288,
      "memory_total_mib": 81920
    }
  ]
}
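
As a rough illustration of consuming this payload, the sketch below decodes the list_gpus sample into Go structs and derives a memory-used percentage per device. The struct names, the helper functions, and the numeric types are assumptions inferred from the sample above, not types taken from this project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Device mirrors one entry of the "devices" array in the list_gpus sample.
// Field types are inferred from the sample values (an assumption).
type Device struct {
	Index          int     `json:"index"`
	UUID           string  `json:"uuid"`
	Name           string  `json:"name"`
	GPUUtilPercent float64 `json:"gpu_utilization_percent"`
	MemoryUsedMiB  uint64  `json:"memory_used_mib"`
	MemoryTotalMiB uint64  `json:"memory_total_mib"`
}

// ListGPUsResult mirrors the top-level list_gpus object.
type ListGPUsResult struct {
	Count   int      `json:"count"`
	Devices []Device `json:"devices"`
}

// parseListGPUs decodes a list_gpus JSON payload.
func parseListGPUs(data []byte) (ListGPUsResult, error) {
	var r ListGPUsResult
	err := json.Unmarshal(data, &r)
	return r, err
}

// memoryUsedPercent derives used/total as a percentage, guarding against
// a zero total.
func memoryUsedPercent(d Device) float64 {
	if d.MemoryTotalMiB == 0 {
		return 0
	}
	return 100 * float64(d.MemoryUsedMiB) / float64(d.MemoryTotalMiB)
}

// sample is the list_gpus example from this README, compacted.
const sample = `{"count":2,"devices":[
 {"index":0,"uuid":"GPU-aaaa-1111","name":"NVIDIA A100-SXM4-80GB",
  "gpu_utilization_percent":85,"memory_used_mib":57344,"memory_total_mib":81920},
 {"index":1,"uuid":"GPU-bbbb-2222","name":"NVIDIA A100-SXM4-80GB",
  "gpu_utilization_percent":20,"memory_used_mib":12288,"memory_total_mib":81920}]}`

func main() {
	r, err := parseListGPUs([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, d := range r.Devices {
		fmt.Printf("GPU %d: %.0f%% util, %.0f%% memory used\n",
			d.Index, d.GPUUtilPercent, memoryUsedPercent(d))
	}
}
```

For the sample data, the derived memory-used figure for GPU 0 is 70%, which matches the memory_utilization_percent value in the get_gpu_metrics sample.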

get_gpu_metrics (with {"index": 0} or {"uuid": "GPU-aaaa-1111"}):

{
  "index": 0,
  "uuid": "GPU-aaaa-1111",
  "name": "NVIDIA A100-SXM4-80GB",
  "gpu_utilization_percent": 85,
  "memory_utilization_percent": 70,
  "memory_used_mib": 57344,
  "memory_total_mib": 81920,
  "temperature_celsius": 72,
  "power_draw_watts": 300,
  "power_limit_watts": 400,
  "pcie_tx_kbps": 0,
  "pcie_rx_kbps": 0,
  "nvlink_tx_mbps": 0,
  "nvlink_rx_mbps": 0
}

gpu_summary:

{
  "device_count": 2,
  "avg_gpu_utilization": 52.5,
  "avg_memory_utilization": 42.5,
  "total_memory_used_mib": 69632,
  "total_memory_total_mib": 163840,
  "max_temperature_celsius": 72,
  "total_power_draw_watts": 375
}

MIG instances add is_mig, parent_gpu, and mig_profile fields to the get_gpu_metrics and list_gpus payloads.
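
A client can tell MIG instances from full GPUs by checking whether these keys are present. The sketch below does so without assuming the value types, which the README does not specify. The function name and the example MIG values (true, 0, "1g.10gb") are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// migKeys are the extra fields the README says MIG instances add.
var migKeys = []string{"is_mig", "parent_gpu", "mig_profile"}

// migFields returns the raw JSON text of any MIG-related fields found in a
// device payload. Values are kept as raw JSON because their types are not
// documented here.
func migFields(payload []byte) (map[string]string, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(payload, &raw); err != nil {
		return nil, err
	}
	found := make(map[string]string)
	for _, k := range migKeys {
		if v, ok := raw[k]; ok {
			found[k] = string(v)
		}
	}
	return found, nil
}

func main() {
	// Hypothetical MIG payload: the three MIG values are illustrative only.
	mig := []byte(`{"index":2,"name":"NVIDIA A100-SXM4-80GB",
	  "is_mig":true,"parent_gpu":0,"mig_profile":"1g.10gb"}`)
	// Non-MIG payload, shaped like the list_gpus sample entries.
	full := []byte(`{"index":0,"name":"NVIDIA A100-SXM4-80GB"}`)

	for _, p := range [][]byte{mig, full} {
		fields, err := migFields(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("MIG fields present: %d\n", len(fields))
	}
}
```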

Quick start

# build (requires CGO + NVML headers on Linux)
make build

# run (the server communicates over stdio)
./gpu-mcp-server

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "gpu": {
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Goose

extensions:
  gpu-metrics:
    type: stdio
    cmd: /path/to/gpu-mcp-server

Cursor

Add to .cursor/mcp.json for a project, or ~/.cursor/mcp.json for all projects:

{
  "mcpServers": {
    "gpu": {
      "type": "stdio",
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "gpu": {
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Build

Requires Go 1.23+, CGO, and NVIDIA drivers on the target machine.

make build       # compile binary
make test        # run tests (no GPU needed; uses a mock)
make lint        # golangci-lint
make docker      # container image

Tests use a mock collector, so they run anywhere; no GPU hardware is required.

Architecture

Agent (Claude/Goose) ─── MCP (stdio) ──→ gpu-mcp-server ──→ NVML ──→ GPU
                                              │
                                         Tools:
                                         • list_gpus
                                         • get_gpu_metrics
                                         • get_gpu_processes
                                         • gpu_summary

The server runs as a local process alongside the agent. It calls NVML directly through cgo — no sidecar, no network hops, no metric pipeline to configure.
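
To make the stdio leg of the diagram concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the tools/call method. The sketch below only builds the request an agent would write to the server's stdin for get_gpu_metrics. The Go type and function names are illustrative, and a real session would first perform the MCP initialize handshake, which is omitted here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a minimal JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// toolCallParams carries the tool name and its arguments for tools/call.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// toolCallRequest encodes a tools/call request as a single JSON line,
// the framing used by the MCP stdio transport.
func toolCallRequest(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	line, err := toolCallRequest(2, "get_gpu_metrics", map[string]any{"index": 0})
	if err != nil {
		panic(err)
	}
	// An agent would write this line, followed by a newline, to the
	// server's stdin and read the JSON-RPC response from its stdout.
	fmt.Println(string(line))
}
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for debugging the server by hand.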

Project info

  • License: Apache 2.0
  • Language: Go
  • AAIF project alignment: MCP
  • Related: keda-gpu-scaler (GPU autoscaling for Kubernetes)

Roadmap

See ROADMAP.md for the 12-month public roadmap.

Contributing

See CONTRIBUTING.md for how to get involved.

Contributors

Thanks to all our contributors! Add yourself via PR.

Governance

This project follows Linux Foundation Minimum Viable Governance.
