|
|
9fe9f061f8
|
fix(nccl-tests): set LIBRARY_PATH so ld finds libnccl.so in nccl cache
|
2026-03-26 23:59:06 +03:00 |
|
|
|
837a1fb981
|
fix(nccl-tests): pin /usr/local/cuda→12.8 symlink, auto-detect gencode by nvcc version
|
2026-03-26 23:54:07 +03:00 |
|
|
|
1f43b4e050
|
fix(nccl-tests): pass NCCL_LIB from nccl cache to fix -lnccl link error
|
2026-03-26 23:52:25 +03:00 |
|
|
|
83bbc8a1bc
|
fix(nccl-tests): upgrade to cuda-nvcc-12-8, add sm_100 (Blackwell B100/B200)
|
2026-03-26 23:51:26 +03:00 |
|
|
|
896bdb6ee8
|
fix(nccl-tests): use cuda-nvcc-12-6 to support Ampere/Volta (sm_70..sm_90)
|
2026-03-26 23:50:36 +03:00 |
|
|
|
5407c26e25
|
fix(nccl-tests): CUDA 13.0 supports only sm_90+ (Hopper/H100)
|
2026-03-26 23:49:45 +03:00 |
|
|
|
4fddaba9c5
|
fix(nccl-tests): limit CUDA gencode to sm_70+ (CUDA 13 dropped Pascal)
|
2026-03-26 23:48:40 +03:00 |
|
|
|
d2f384b6eb
|
fix(nccl-tests): use plain make instead of non-existent all_reduce_perf target
|
2026-03-26 23:47:49 +03:00 |
|
|
|
5644231f9a
|
feat(nccl): add nccl-tests all_reduce_perf for GPU bandwidth testing
- Dockerfile: install cuda-nvcc-13-0 from NVIDIA repo for compilation
- build-nccl-tests.sh: downloads libnccl-dev for nccl.h, builds all_reduce_perf
- build.sh: runs nccl-tests build, injects binary into /usr/local/bin/
- platform: RunNCCLTests() auto-detects GPU count, runs all_reduce_perf
- TUI: NCCL bandwidth test entry in Burn-in Tests screen [N] hotkey
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-03-26 23:22:19 +03:00 |
|
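Commit 837a1fb981 mentions auto-detecting the gencode list from the nvcc version. The following is a minimal sketch of how such a selection could look, not the repository's actual script. The function names `parse_nvcc_release` and `gencode_flags` are hypothetical, and the architecture list per release (sm_70, sm_80, sm_90, plus sm_100 from 12.8, and sm_90+ only from 13.0) is an assumption that simply mirrors the ranges named in the commit messages above rather than NVIDIA's official support matrix.

```python
import re


def parse_nvcc_release(version_output: str) -> tuple[int, int]:
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    m = re.search(r"release (\d+)\.(\d+)", version_output)
    if m is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(m.group(1)), int(m.group(2))


def gencode_flags(release: tuple[int, int]) -> str:
    """Build a -gencode flag string for the given nvcc release.

    Assumption: the per-release architecture ranges follow the commit
    messages in this log, not an authoritative compatibility table.
    """
    archs = [70, 80, 90]            # sm_70..sm_90, as in commit 896bdb6ee8
    if release >= (12, 8):
        archs.append(100)           # sm_100 (Blackwell), as in commit 83bbc8a1bc
    if release >= (13, 0):
        # sm_90+ only, as stated in commit 5407c26e25
        archs = [a for a in archs if a >= 90]
    return " ".join(f"-gencode=arch=compute_{a},code=sm_{a}" for a in archs)


# Example input: the release line as nvcc typically prints it.
sample = "Cuda compilation tools, release 12.8, V12.8.61"
print(gencode_flags(parse_nvcc_release(sample)))
```

The resulting string could then be handed to the build, for example through a make variable; which variable the project's build script actually uses is not stated in the log.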