From ecd625c94bf788e65999b9e0b8a2044a3eaa2bd5 Mon Sep 17 00:00:00 2001
From: Mateusz Jakub Fila <mateusz.jakub.fila@cern.ch>
Date: Mon, 22 Jun 2026 14:45:38 +0200
Subject: [PATCH] P4251: clarify CERN project description

---
 source/2026-07-july/d4251-ioawaitable-cuda.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/source/2026-07-july/d4251-ioawaitable-cuda.md b/source/2026-07-july/d4251-ioawaitable-cuda.md
index 18aac76..d6eb5d5 100644
--- a/source/2026-07-july/d4251-ioawaitable-cuda.md
+++ b/source/2026-07-july/d4251-ioawaitable-cuda.md
@@ -606,7 +606,7 @@ Several independent projects have arrived at the same design: coroutine-based as
 
 **cuda-oxide (NVIDIA Labs, Rust).**<sup>[35]</sup> NVIDIA's own research lab implemented the same mechanism in Rust. Their `DeviceFuture` submits GPU work, enqueues a `cuLaunchHostFunc` callback that sets an `AtomicBool` and wakes a Tokio `Waker`, and the async runtime resumes the task on the next poll. Zero busy-wait. The three-state machine (Idle, Executing, Complete) is structurally identical to a network socket future. When NVIDIA's own research lab arrives at the same `cudaLaunchHostFunc`-to-async-runtime pattern independently, in a different language, the convergence is a data point about where the pattern fits naturally.
 
-**CERN wp1.7-coroutine-tests.**<sup>[34]</sup> The ATLAS and LHCb experiments at CERN are evaluating C++20 coroutine patterns for task scheduling, including a Gaudi-framework-inspired coroutine hierarchy and CUDA examples. The project's [`StreamIoAwaitable`](https://github.com/cern-nextgen/wp1.7-coroutine-tests/blob/5049a37d7e74b6e2241b39dca5c81ff3aaece0e3/examples/capy_stream_await.hpp) is built directly on Capy's IoAwaitable protocol: `await_suspend(std::coroutine_handle<>, boost::capy::io_env const*)` enqueues a `cudaLaunchHostFunc` callback that, on CUDA-stream completion, posts the coroutine handle back to `env->executor` - the same `cudaLaunchHostFunc`-to-coroutine resumption described here, implemented independently against Capy's `io_env`.
+**CERN wp1.7-coroutine-tests.**<sup>[34]</sup> The CERN Next Generation Triggers project is evaluating C++20 coroutine patterns for task scheduling in CPU-GPU computing systems for experimental high-energy physics. The repository contains demonstrations of selected notification mechanisms and libraries. The example [`StreamIoAwaitable`](https://github.com/cern-nextgen/wp1.7-coroutine-tests/blob/5049a37d7e74b6e2241b39dca5c81ff3aaece0e3/examples/capy_stream_await.hpp) is built directly on Capy's IoAwaitable protocol: `await_suspend(std::coroutine_handle<>, boost::capy::io_env const*)` enqueues a `cudaLaunchHostFunc` callback that, on CUDA-stream completion, posts the coroutine handle back to `env->executor` - the same `cudaLaunchHostFunc`-to-coroutine resumption described here, implemented independently against Capy's `io_env`.
 
 **Taro (University of Wisconsin-Madison).**<sup>[36]</sup> A C++20 coroutine task-graph system for CPU-GPU workloads. GPU tasks suspend the CPU thread via coroutines when waiting for GPU completion, allowing other tasks to run. Uses `cudaLaunchHostFunc` for the callback. Published at Euro-Par 2024 and presented at CppCon 2023. Reported 40-80% speedup over blocking approaches.
 
@@ -817,7 +817,7 @@ Eric Niebler, Micha&lstrok; Dominiak, Lewis Baker, Lucian Radu Teodorescu, Lee H
 
 Richard Smith and Gor Nishanov for P0981R0 (HALO analysis). Chuanqi Xu for the `[[clang::coro_await_elidable]]` attribute and P2477R3 (coroutine allocation elision). Dietmar K&uuml;hl and Maikel Nadolski for P3552R3 (`std::execution::task`). Lewis Baker for cppcoro, the operator `co_await` and symmetric transfer blog posts, and P3425R1 (operation-state sizes). Michael Wong for P4029R0 (SG14 priority list).
 
-Michael Garland and the NVIDIA stdexec team for the nvexec GPU schedulers and the Maxwell FDTD benchmark. The CERN wp1.7 team for their C++20 coroutine task-scheduling experiments and the Capy IoAwaitable integration. Dian-Lun Lin (University of Wisconsin-Madison) for Taro and its CppCon 2023 presentation. The NVIDIA Labs team for cuda-oxide. Jiqun Tu (NVIDIA) and Ellery Russell (Schr&ouml;dinger) for the Desmond coroutine integration presented at GTC 2024. The TTG/PaRSEC team for demonstrating coroutine-based heterogeneous GPU dispatch at DOE Exascale scale.
+Michael Garland and the NVIDIA stdexec team for the nvexec GPU schedulers and the Maxwell FDTD benchmark. The CERN Next Generation Triggers project for its C++20 coroutine task-scheduling experiments and the Capy IoAwaitable integration. Dian-Lun Lin (University of Wisconsin-Madison) for Taro and its CppCon 2023 presentation. The NVIDIA Labs team for cuda-oxide. Jiqun Tu (NVIDIA) and Ellery Russell (Schr&ouml;dinger) for the Desmond coroutine integration presented at GTC 2024. The TTG/PaRSEC team for demonstrating coroutine-based heterogeneous GPU dispatch at DOE Exascale scale.
 
 This paper was generated with AI assistance (Claude, via Cursor).