
HUD performance audit #7500

Description

@Goober5000

As requested by @wookieejedi and investigated by Claude.

HUD Performance Investigation — Report

Investigation dispatched five parallel subagents across batching/draw-call efficiency, algorithmic hot spots, radar & iteration cost, Lua/scripting overhead, and existing profiling instrumentation. Consolidated findings below.

TL;DR — The Smoking Gun

The graphics layer already ships a 2D batching API designed exactly for this problem, and nothing in the codebase calls it. From code/graphics/render.h:195-211:

"Start buffering 2D rendering operations. This will defer rendering 2D interface elements until gr_2d_stop_buffer is called. This can improve performance when doing a lot of 2D operations since the actual drawing will only be done once."

A grep for gr_2d_start_buffer returns zero callers anywhere: not in the HUD, UI, mission load, or debug overlays. The HUD issues an estimated 75 or more separate renderLine/renderRect/renderCircle calls per frame through the NanoVG path, and each one issues its own beginFrame/endFrame pair (see code/graphics/render.cpp:771,785). Wrapping the gauge loop in hud.cpp:2124-2147 with gr_2d_start_buffer()/gr_2d_stop_buffer() is a roughly two-line change that the engine was designed to accept, and the developers who suspected a batching problem were right to flag it.

The caveat in the header — "might change the drawing order if incompatible rendering commands are executed" — is worth heeding (HUD bitmaps go through a different path than the NanoVG primitives), so this needs validation, not blind enable. But it's the single highest-leverage change identified.
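
As a rough illustration of why the wrap matters, the sketch below models the submission count with and without buffering. It is not engine code: mock_2d_start_buffer, mock_2d_stop_buffer, mock_draw_line, Buffer2DScope and the counters are hypothetical stand-ins for gr_2d_start_buffer/gr_2d_stop_buffer and the NanoVG primitives, and the mock assumes that everything queued while buffering is flushed in a single submission.

```cpp
// Hypothetical stand-ins for the engine's buffering calls. They only count
// submissions; the real behaviour lives in code/graphics/render.cpp.
static bool g_buffering = false;   // mirrors the idea of a buffering flag
static int  g_pending = 0;         // primitives queued while buffering
static int  g_submissions = 0;     // begin/end frame pairs actually issued

void mock_2d_start_buffer() { g_buffering = true; }

void mock_2d_stop_buffer()
{
    if (g_pending > 0) {
        ++g_submissions;           // one flush for everything queued
        g_pending = 0;
    }
    g_buffering = false;
}

// Hypothetical primitive: submits immediately unless buffering is active.
void mock_draw_line()
{
    if (g_buffering)
        ++g_pending;
    else
        ++g_submissions;
}

// RAII guard so an early exit from the gauge loop cannot leave buffering on.
struct Buffer2DScope {
    Buffer2DScope()  { mock_2d_start_buffer(); }
    ~Buffer2DScope() { mock_2d_stop_buffer(); }
    Buffer2DScope(const Buffer2DScope&) = delete;
    Buffer2DScope& operator=(const Buffer2DScope&) = delete;
};

// Draws `count` lines and returns how many submissions that cost.
int render_gauges(int count, bool buffered)
{
    g_submissions = 0;
    if (buffered) {
        Buffer2DScope scope;       // flushes when this block ends
        for (int i = 0; i < count; ++i) mock_draw_line();
    } else {
        for (int i = 0; i < count; ++i) mock_draw_line();
    }
    return g_submissions;
}
```

In this model, render_gauges(75, false) costs 75 submissions and render_gauges(75, true) costs one. The real saving depends on how often bitmap or string draws interleave with the NanoVG primitives, which is exactly what the draw-order caveat is about.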

Architecture Summary

The HUD render pipeline per frame:

  1. hud_render_preprocess() (hud.cpp:1929) — targeting, navigation, brackets, missile tracking
  2. hud_render_all() (hud.cpp:2077) → hud_render_gauges() iterates every gauge in the ship's hud_gauges vector (typically 50+), calling preprocess() → onFrame() → setupRenderCanvas() → canRender() → render() on each
  3. Each gauge's render() calls primitives that route to gr_line, gr_rect, gr_string, gr_bitmap, etc.

Batching status by primitive:

| Primitive | Batched? | Notes |
| --- | --- | --- |
| renderLine / renderRect / renderCircle / renderGradientLine | No (per-call beginFrame/endFrame) | NanoVG path; would be batched if buffering_nanovg=true |
| renderBitmap / gr_aabitmap | No (immediate material submission per call) | Separate path; not affected by gr_2d_start_buffer |
| renderString (VFNT fonts) | Partially (chars batched into 300-vertex CPU buffer, GPU submission is immediate) | One submission per string |
| renderString (TTF/NVG fonts) | No equivalent CPU buffering | One draw per glyph in the worst case |

Algorithmic Hot Spots (file:line)

Multiple full-list walks per frame (none individually O(n²), but they stack):

Redundant per-frame recomputation in the targetbox (hudtargetbox.cpp:1700-1873): 20+ sprintf/snprintf calls and 4+ gr_get_string_size calls per frame for hull %, subsystem names, weapon names, AI mode — with no caching even when the target hasn't changed. Subsystem name pipe-tokenization via strtok happens per frame at line 1841.
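
A minimal sketch of the kind of caching this points at, assuming the formatted text depends only on the target's signature and the displayed value. HullTextCache and its fields are hypothetical names, and the "Hull: %d%%" format is illustrative rather than the string the targetbox actually prints.

```cpp
#include <cstdio>
#include <string>

// Hypothetical cache for one formatted targetbox line.
struct HullTextCache {
    int signature = -1;     // object signature of the cached target
    int hull_pct = -1;      // hull percentage the text was built for
    std::string text;       // formatted result
    int rebuilds = 0;       // how many times the text was regenerated

    // Reformats only when the target or the displayed value changes.
    const std::string& get(int target_signature, int pct)
    {
        if (target_signature != signature || pct != hull_pct) {
            char buf[32];
            std::snprintf(buf, sizeof(buf), "Hull: %d%%", pct);
            text = buf;
            signature = target_signature;
            hull_pct = pct;
            ++rebuilds;
        }
        return text;
    }
};
```

Calling get() every frame with an unchanged target and hull value leaves rebuilds at 1. The same key could also guard the cached gr_get_string_size result, since the measured width changes only when the text does.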

Math hot spots:

  • hudtarget.cpp:3981 polish_predicted_target_pos: vm_vec_dist_quick inside an iterative loop (multiple sqrts per lead-indicator calculation)
  • 14 distance-calculation sites in hudtarget.cpp alone, several inside ship/missile loops
  • radardradis.cpp:118: vm_vec_normalize per blip in the render path, when blip position is already known at plot time (a single cache slot would fix it)
  • hudshield.cpp:671 — generated 3D shield icons use the full g3_start_frame / matrix / projection pipeline per quadrant when most missions could use baked textures
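
For the DRADIS item, the idea of a single cache slot could look roughly like the following. Vec3, MockBlip, plot_blip and blip_render_dir are hypothetical; the engine's own vector type and blip struct would be used in practice, and this assumes the blip position does not change between plot and render.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical blip record with one extra cached field.
struct MockBlip {
    Vec3 position;
    Vec3 unit_dir;   // normalized position, filled in at plot time
};

// Normalizes v; returns a zero vector if v has near-zero length.
Vec3 normalized(const Vec3& v)
{
    float mag = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (mag < 1e-6f)
        return Vec3{0.0f, 0.0f, 0.0f};
    return Vec3{v.x / mag, v.y / mag, v.z / mag};
}

// Plot step: the only place the square root is paid.
MockBlip plot_blip(const Vec3& pos)
{
    return MockBlip{pos, normalized(pos)};
}

// Render step: reads the cached direction, no sqrt.
Vec3 blip_render_dir(const MockBlip& b) { return b.unit_dir; }
```

The trade is 12 extra bytes per blip for one square root saved on each render of that blip, which pays off mainly when more than one gauge draws the same blip list.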

Wasted gauge work (hud.cpp:2124-2147): preprocess() and onFrame() are called on every gauge before canRender() is checked. Gauges that are off-screen, configured off, or popup-only-and-not-popped pay full preprocessing cost.
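
A sketch of the reordered loop, using a hypothetical MockGauge in place of the real gauge class. It rests on one assumption that would need checking against the actual code: that canRender() does not read state which preprocess() or onFrame() updates in the same frame.

```cpp
#include <vector>

// Hypothetical stand-in for a HUD gauge; only the call order matters here.
struct MockGauge {
    bool visible = true;
    int preprocess_calls = 0;
    int render_calls = 0;

    bool canRender() const { return visible; }
    void preprocess() { ++preprocess_calls; }
    void onFrame() {}
    void render() { ++render_calls; }
};

// Checks canRender() first so hidden gauges skip preprocess() and onFrame().
void render_gauges_gated(std::vector<MockGauge>& gauges)
{
    for (auto& g : gauges) {
        if (!g.canRender())
            continue;          // hidden gauge pays nothing this frame
        g.preprocess();
        g.onFrame();
        g.render();
    }
}
```

If some gauges do rely on preprocess() running while hidden (for example to advance a popup timer), those would need a cheaper always-run hook rather than being skipped outright.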

Shared coordinate transforms are not shared. The same target may be g3_rotate_vertex/g3_project_vertex-ed by hud_show_targeting_gauges, hud_show_selection_set, individual targeting gauges, and bracket drawing — no per-frame projection cache keyed by object signature.
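
One possible shape for such a cache is sketched below. ProjectionCache and ScreenPos are hypothetical, and the caller-supplied `project` callable stands in for the g3_rotate_vertex/g3_project_vertex pair; the sketch assumes the view matrix is fixed for the duration of a HUD frame.

```cpp
#include <unordered_map>

struct ScreenPos { float x, y; };

// Hypothetical per-frame cache: object signature -> projected position.
class ProjectionCache {
public:
    // Call once at the start of each HUD frame to drop stale entries.
    void begin_frame() { cache_.clear(); }

    // Returns the cached projection, computing it via `project` on a miss.
    template <typename ProjectFn>
    ScreenPos get(int signature, ProjectFn project)
    {
        auto it = cache_.find(signature);
        if (it != cache_.end())
            return it->second;
        ScreenPos p = project();
        cache_.emplace(signature, p);
        return p;
    }

private:
    std::unordered_map<int, ScreenPos> cache_;
};
```

With this in place, the targeting gauges, selection set, and bracket drawing would each call get() with the same signature and only the first caller in a frame would pay for the transform.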

What's Actually Fine

  • Radar blip generation (radarsetup.cpp). Plotted once per object at post-move time and shared across all radar gauges via global blip lists. No per-gauge re-iteration. This is the right pattern; the rest of the HUD should learn from it.
  • Scripting overhead is well-optimized for the unhooked common case — ActiveActions hash lookup is the cost. Heavily-scripted HUDs would benefit from frame-constant Lua-value caching (Player.Position, etc.) but vanilla missions pay essentially nothing.
  • Mission parsing (hudparse.cpp's 5729 lines) is load-time only, not per-frame.

What's Already Instrumented

Only three TRACE_SCOPE points currently emit usable data: RenderHUDGauge (hud.cpp:2142,2162), RenderTargetingBracket (hudbrackets.cpp:396), RenderNavBracket (hudbrackets.cpp:509). The categories RenderMainFrame, RenderHUD, RenderHUDHook are declared in tracing/categories.h but never emit events — instrumenting them first would let you verify these recommendations against real numbers rather than estimates.
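
To show the general pattern without guessing at the engine's tracing macros, here is a generic RAII scope timer built on std::chrono. ScopedTimer, ScopeStats, g_scope_stats and render_hud_traced are hypothetical and are not the TRACE_SCOPE implementation; only the scope names are borrowed from the categories listed above.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical accumulator: total nanoseconds and hit count per scope name.
struct ScopeStats { long long total_ns = 0; int hits = 0; };
static std::map<std::string, ScopeStats> g_scope_stats;

// RAII timer: records elapsed time under its name when it goes out of scope.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer()
    {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        ScopeStats& s = g_scope_stats[name_];
        s.total_ns +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        ++s.hits;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// One outer scope for the whole HUD, one inner scope per gauge.
void render_hud_traced(int gauge_count)
{
    ScopedTimer hud("RenderHUD");
    for (int i = 0; i < gauge_count; ++i) {
        ScopedTimer gauge("RenderHUDGauge");
        // gauge rendering would go here
    }
}
```

Comparing the outer total against the sum of the inner scopes gives a first estimate of how much HUD time is spent outside individual gauges, such as in hud_render_preprocess().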

Recommendations (Prioritized by Estimated Impact ÷ Effort)

  1. Wire up the existing gr_2d_start_buffer/gr_2d_stop_buffer around the gauge loop in hud.cpp:2124-2147. Two lines plus validation that draw order isn't disturbed by the mixed bitmap/NanoVG paths. Highest-leverage single change.

  2. Reorder the gauge-render loop so canRender() is checked before preprocess() and onFrame(). Off-screen/disabled gauges should pay nothing. ~5-line change in hud.cpp:2124-2147.

  3. Add proper tracing first — instrument the declared-but-unused RenderMainFrame, RenderHUD categories and add per-gauge-class scopes. This stops being a guessing game once you have numbers. Without it, the rest of these recommendations are educated estimates.

  4. Cache targetbox strings keyed on target signature + last-changed timestamp (hudtargetbox.cpp:1700-1873). 20+ sprintfs and 4+ string-size measurements per frame collapse to ~0 when target/state hasn't changed. Subsystem-name tokenization should be done once at target acquisition, not per frame.

  5. Merge the two Missile_obj_list walks at hudtarget.cpp:3199 and 3369 into a single pass that handles both homing-missile tracking and remote-detonate brackets.

  6. Temporal cache for hud_show_hostile_triangle (hudtarget.cpp:3694) — the "current top threat" object rarely changes between frames; recompute only on a fixed interval or when invalidated by death/IFF change.

  7. Coalesce bracket lines (hudtarget.cpp:2842-2843, 6041-6114). Bracket corners are currently four separate renderLine calls. The line_draw_list machinery already exists (draw_brackets_square_quick uses it), so audit which call sites still bypass it.

  8. Cache vm_vec_normalize result at plot time for DRADIS blips (radardradis.cpp:118) — one float[3] added to the blip struct.

  9. Cache escort list and invalidate on ship birth/death events rather than rebuilding from Ship_obj_list each frame (hudescort.cpp:608).

  10. Lower-priority: consider whether polish_predicted_target_pos (hudtarget.cpp:3953-3990) needs as many iterations as it does, and whether the generated 3D shield icons (hudshield.cpp:671) could be baked to textures at mission load for the common case.
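
As an illustration of item 5, the sketch below gathers both results in one walk. MockMissile, MissilePassResult and scan_missiles are hypothetical; a std::vector stands in for Missile_obj_list, and the two boolean fields stand in for whatever tests the real walks perform.

```cpp
#include <vector>

// Hypothetical missile record; the real list is Missile_obj_list.
struct MockMissile {
    bool homing_on_player;
    bool remote_detonate;
};

struct MissilePassResult {
    int homing_count = 0;               // feeds the missile-tracking display
    std::vector<int> bracket_indices;   // missiles that need a bracket
};

// One walk that gathers what two separate walks would otherwise collect.
MissilePassResult scan_missiles(const std::vector<MockMissile>& missiles)
{
    MissilePassResult r;
    for (int i = 0; i < static_cast<int>(missiles.size()); ++i) {
        const MockMissile& m = missiles[i];
        if (m.homing_on_player)
            ++r.homing_count;
        if (m.remote_detonate)
            r.bracket_indices.push_back(i);
    }
    return r;
}
```

The result would be computed once per frame and handed to both consumers. Whether that is practical depends on whether the two existing walks run at compatible points in the frame.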

Honest Caveats

  • The "75+ render calls per frame" and "85-95% reduction" figures from the batching investigation are estimates from reading code, not measured. Before doing big work on item 1, instrument and measure (item 3) so you have a before/after.
  • This investigation was read-only — no code was modified.
  • Several recommendations (especially temporal caching of hostile triangle and escort list) need careful invalidation logic; the "save once on event" pattern is correct but bug-prone if any event source is missed.
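
The invalidation risk in the last caveat can be made concrete with a small dirty-flag cache. EscortCache is hypothetical, ship ids are plain ints, and the even-id filter is only a placeholder for the real escort test.

```cpp
#include <vector>

// Hypothetical escort cache: rebuilt lazily, invalidated by explicit events.
class EscortCache {
public:
    // Every event source that can change the list must call this.
    void invalidate() { dirty_ = true; }

    // Returns the cached list, rebuilding from `all_ships` only if dirty.
    const std::vector<int>& get(const std::vector<int>& all_ships)
    {
        if (dirty_) {
            list_.clear();
            for (int id : all_ships)
                if (id % 2 == 0)        // placeholder escort test
                    list_.push_back(id);
            dirty_ = false;
            ++rebuilds_;
        }
        return list_;
    }

    int rebuilds() const { return rebuilds_; }

private:
    std::vector<int> list_;
    bool dirty_ = true;                 // first get() always rebuilds
    int rebuilds_ = 0;
};
```

If a ship is added and no one calls invalidate(), get() keeps returning the old list. That is the failure mode to guard against, for example with a debug-only check that rebuilds every N frames and compares the result with the cache.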

Metadata

Labels: HUD, gameplay
