Skip to content

perf(broadcast): build PreparedMessage once per broadcast, not per re…#153

Open
tomi77 wants to merge 4 commits into
zishang520:devfrom
tomi77:perf/broadcast-prepared-frame
Open

perf(broadcast): build PreparedMessage once per broadcast, not per re…#153
tomi77 wants to merge 4 commits into
zishang520:devfrom
tomi77:perf/broadcast-prepared-frame

Conversation

@tomi77

@tomi77 tomi77 commented May 1, 2026

Copy link
Copy Markdown
Contributor

…cipient

Today every recipient of a broadcast pays the cost of ws.NewPreparedMessage / webtransport.NewPreparedMessage even though the frame bytes are identical for the entire broadcast. With N matching sockets that is N redundant frame allocations and an N-fold loss of the per-frame cache that gorilla/websocket keeps inside PreparedMessage.

Introduce an opaque, transport-keyed, build-once cache (packet.PreparedFrame). The default adapter allocates one cache per broadcast in _encode (next to where it sets WsPreEncodedFrame) and threads it through Options. The websocket and webtransport transports build their PreparedMessage on first send and reuse the cached value for every remaining recipient.

Backward compatible: when PreparedFrame is nil (unicast emits, custom adapters that don't set it, anything outside the broadcast path) the behavior is identical. The cache is allocated only inside the existing single-string fast path, so binary and polling paths are unaffected.

tomi77 and others added 2 commits May 1, 2026 18:30
…cipient

Today every recipient of a broadcast pays the cost of
ws.NewPreparedMessage / webtransport.NewPreparedMessage even though
the frame bytes are identical for the entire broadcast. With N
matching sockets that is N redundant frame allocations and an N-fold
loss of the per-frame cache that gorilla/websocket keeps inside
PreparedMessage.

Introduce an opaque, transport-keyed, build-once cache
(packet.PreparedFrame). The default adapter allocates one cache per
broadcast in _encode (next to where it sets WsPreEncodedFrame) and
threads it through Options. The websocket and webtransport transports
build their PreparedMessage on first send and reuse the cached value
for every remaining recipient.

Backward compatible: when PreparedFrame is nil (unicast emits, custom
adapters that don't set it, anything outside the broadcast path) the
behavior is identical. The cache is allocated only inside the existing
single-string fast path, so binary and polling paths are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hot path

Two follow-ups to the per-broadcast PreparedMessage cache that target
the remaining per-recipient allocations on the broadcast hot path:

1. types.Buffer.View() — a sibling of Clone() that shares the
   underlying byte slice and only allocates the small wrapper struct.
   The adapter's broadcast path is read-only over the encoded payload,
   so an independent read offset is enough to keep recipients isolated.

2. Use it from socket.io's Client.WriteToEngine when WsPreEncodedFrame
   is set (i.e., the adapter has pre-encoded a shared payload). Unicast
   emits and buffers without a View() implementation keep the existing
   Clone behavior.

3. Drop the defensive Options clone-and-normalize in engine.io's
   sendPacket. Options is read-only downstream — every transport
   already derives the effective compression flag from a possibly-nil
   Compress pointer. Polling's send loop is updated to treat a nil
   Options / Compress as the documented default (true) so behavior is
   preserved end-to-end.

In broadcast benchmarks against a real fpserver feed (see profile in
the upstream PR description), this removes ~17% of broadcast-path
alloc_space (Buffer.Clone) and ~15% of alloc_objects (sendPacket
Options clone), which translates to noticeably smoother CPU under
fan-out bursts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tomi77 tomi77 force-pushed the perf/broadcast-prepared-frame branch from 8cc900d to 88d572f Compare May 2, 2026 05:34
@zishang520 zishang520 changed the base branch from main to dev May 4, 2026 04:23
@zishang520

Copy link
Copy Markdown
Owner

I need more time to evaluate this PR. Please be patient.

@zishang520 zishang520 force-pushed the dev branch 2 times, most recently from ef405a9 to 88db3de Compare May 7, 2026 10:42

@zishang520 zishang520 left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking forward to your response.

Comment thread pkg/types/buffer-ext.go
Comment on lines +61 to +80
// View returns a Buffer that shares the underlying byte slice with b
// but has independent read state (off / lastRead). It is intended for
// hot paths like broadcast send, where the same encoded payload is
// consumed by many recipients concurrently.
//
// Callers MUST treat the returned view (and b itself, while any view
// is in flight) as read-only with respect to the underlying bytes —
// any write/grow/truncate on either side will race. Reads through
// independent views are safe because each view has its own off.
func (b *Buffer) View() *Buffer {
if b == nil {
return nil
}
return &Buffer{
buf: b.buf,
off: b.off,
lastRead: b.lastRead,
}
}

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View() returns a writable *Buffer sharing mutable state with no synchronization — the fix is to return a ReadableBuffer interface (read-only subset) so the compiler enforces the read-only contract instead of relying on documentation.

Comment thread servers/engine/socket.go
Comment 1 — View() return type (buffer-ext.go):
Introduce ReadableBuffer, a read-only subset of BufferInterface containing
only the read-side methods (io.Reader/Seeker/WriterTo/ByteScanner/
RuneScanner, Bytes, Stringer, Peek, Len, Size, Cap, Available, Next,
ReadBytes, ReadString). BufferInterface now embeds ReadableBuffer and adds
the write-side methods on top.

(*Buffer).View(), (*BytesBuffer).View(), and (*StringBuffer).View() now
return ReadableBuffer instead of *Buffer / BufferInterface. The compiler
therefore prevents callers from invoking write/mutate methods (Write,
Reset, Truncate, Grow, …) on a view whose underlying byte slice is shared
with the original — enforcing the read-only contract at compile time
rather than through documentation alone.

bufferViewer in client.go is updated to match, and the local data variable
in WriteToEngine is narrowed to io.Reader (which both ReadableBuffer and
BufferInterface satisfy), so no type assertion is needed at the call site.

Comment 2 — nil Options in sendPacket (socket.go):
Expand the comment to name the exact nil-guard in each transport:
  websocket / webtransport: `if packet.Options != nil` before every field
    access, so nil passes through safely.
  polling: `opt == nil || opt.Compress == nil || *opt.Compress` in send(),
    and `options != nil && options.Compress != nil` in DoWrite() —
    both short-circuit correctly on nil.
The previous defensive clone-and-normalize (always producing a non-nil
Options with a non-nil Compress) is therefore no longer needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants