feat[gpu]: arrow device array stream support #8483

Draft
0ax1 wants to merge 23 commits into develop from ad/arrow-device-array-stream

Wrap Arrow device stream comments to 100 columns

b89a5c9
CodSpeed HQ / CodSpeed Performance Analysis failed Jun 18, 2026

6 benchmarks regressed

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 7 improved benchmarks
❌ 6 regressed benchmarks
✅ 1568 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | take_10k_random | 197.9 µs | 255.8 µs | -22.61% |
| Simulation | take_10k_contiguous | 218.5 µs | 276.3 µs | -20.93% |
| Simulation | patched_take_10k_contiguous_patches | 232.2 µs | 290.9 µs | -20.18% |
| Simulation | patched_take_10k_random | 244.2 µs | 303 µs | -19.4% |
| Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] | 178 µs | 213.8 µs | -16.78% |
| Simulation | chunked_varbinview_opt_into_canonical[(1000, 10)] | 193.4 µs | 229.6 µs | -15.78% |
| Simulation | chunked_bool_canonical_into[(1000, 10)] | 34.9 µs | 20.3 µs | +71.95% |
| Simulation | chunked_varbinview_canonical_into[(1000, 10)] | 198.5 µs | 162.2 µs | +22.35% |
| Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 214.4 µs | 178 µs | +20.48% |
| WallTime | cuda/bitpacked_u8/unpack/3bw[100M] | 352.6 µs | 298.7 µs | +18.04% |
| Simulation | chunked_varbinview_canonical_into[(100, 100)] | 308.7 µs | 273.1 µs | +13.02% |
| Simulation | chunked_varbinview_into_canonical[(100, 100)] | 367.7 µs | 332.8 µs | +10.48% |
| Simulation | eq_i64_constant | 322.6 µs | 292.8 µs | +10.21% |
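
For reading the Efficiency column: the reported values appear consistent with `BASE / HEAD - 1`, so a negative value means HEAD is slower than BASE. This formula is inferred from the numbers above, not taken from CodSpeed documentation, and the helper name `efficiency_pct` is hypothetical. A minimal sketch that recomputes the column from the displayed timings:

```python
def efficiency_pct(base_us: float, head_us: float) -> float:
    """Relative change in percent, assuming efficiency = BASE / HEAD - 1.

    The formula is an assumption inferred from the reported table;
    negative means HEAD is slower than BASE (a regression).
    """
    return (base_us / head_us - 1.0) * 100.0


# (benchmark, BASE µs, HEAD µs, reported efficiency %) copied from the table.
rows = [
    ("take_10k_random", 197.9, 255.8, -22.61),
    ("chunked_bool_canonical_into[(1000, 10)]", 34.9, 20.3, 71.95),
    ("cuda/bitpacked_u8/unpack/3bw[100M]", 352.6, 298.7, 18.04),
    ("eq_i64_constant", 322.6, 292.8, 10.21),
]

for name, base, head, reported in rows:
    computed = efficiency_pct(base, head)
    # Displayed timings are rounded, so expect small differences.
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

The recomputed values differ from the reported ones by a few hundredths of a percentage point, which is plausibly due to the BASE and HEAD timings being rounded for display.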

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing ad/arrow-device-array-stream (b89a5c9) with develop (d020924)
