waynehacking8

WEI CHENG CHIU waynehacking8

Master's program in Computer Science at National Taiwan University of Science and Technology. Engaged in research on machine learning and information security.

2 followers · 8 following

Pinned

federated-learning-lab Public

From-scratch federated learning: FedAvg / FedProx / SCAFFOLD, DP-SGD & secure aggregation, plus FedPer / Byzantine-robust / FedAdam / FedLoRA. 33/33 tests, literature-cross-validated, with honest n…

Python
nccl-collectives-bench Public

NCCL collective benchmarks on an 8×H100 NVSwitch host — busbw vs link budget, NVLS/Ring/Tree, small-message latency floors (eager vs CUDA Graph vs symmetric memory), and the TP-decode comms ceiling…

Python
nim-agent-blueprint Public

Agentic RAG hallucination evaluation on adversarial SQuAD 2.0 (N=200) — nine gate methods compared (self/cross-family/70B judges, PoLL, CoT, MiniCheck, semantic entropy): grounding beats capacity, …

Python
trtllm-triton-serving Public

TensorRT-LLM vs vLLM controlled head-to-head on H100 — 12 studies including a knob-by-knob waterfall reproducing NVIDIA's published 27.7k tok/s (100.3%) and attributing the gap to real serving, plu…

Python
blackwell-tensorcore-kernels Public

Hand-written CUDA Tensor Core GEMM kernels on Blackwell (sm_120) and Hopper (sm_90) — raw mma.sync reaching 106% of the cuBLAS-TC kernel on sm_120, CUTLASS 3.x wgmma at 85.5% of nvjet on H100, and …

Cuda
physgate Public

Validate LLM-generated robot plans in GPU physics simulation — best-of-N plan selection against the highest-quality verifier (Isaac Lab + ROS2 + MCP). Research prototype.

Python
