Demystifying NCCL(2507) - An In-depth Analysis of GPU Communication Protocols and Algorithms
Data-Transfer Methods and Transport Layer
| | Intra-Node | Inter-Node |
|---|---|---|
| Transport | P2P (`p2p.cc`), SHM (`shm.cc`), NVLS (`nvls.cc`) | NET (`net_ib.cc`, `net_socket.cc`), COLLNET (`coll_net.cc`) |
| Physical Interconnect | NVLink, PCIe | InfiniBand, RoCE, TCP/IP (Socket) |
| Optimizations | GPUDirect P2P, P2P_DIRECT | GPUDirect RDMA |
Note that RDMA can be used for intra-node communication, and NVLink can be used for inter-node communication.
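To make the table concrete, here is a minimal sketch of a priority-ordered transport choice (P2P, then SHM, then NET). It is a toy model built only from the table above, not NCCL's actual selection code: the `Peer` class, the `pick_transport` function, and the `p2p_ok` / `shm_ok` flags are hypothetical names introduced for illustration, and NVLS and COLLNET are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """Hypothetical description of one GPU endpoint."""
    node: str  # host name the GPU lives on
    gpu: int   # local GPU index on that host


def pick_transport(a: Peer, b: Peer, p2p_ok: bool = True, shm_ok: bool = True) -> str:
    """Return the first usable transport in the assumed order P2P > SHM > NET.

    p2p_ok / shm_ok stand in for whatever capability checks a real
    implementation would run (assumption: simple booleans here).
    """
    same_node = a.node == b.node
    if same_node and p2p_ok:
        return "P2P"   # direct GPU-to-GPU over NVLink or PCIe
    if same_node and shm_ok:
        return "SHM"   # staged through host shared memory
    # Fallback for every remaining case, including same-node peers
    # that can use neither P2P nor SHM.
    return "NET"       # InfiniBand / RoCE / sockets


if __name__ == "__main__":
    g0, g1, remote = Peer("node0", 0), Peer("node0", 1), Peer("node1", 0)
    print(pick_transport(g0, g1))                              # same node, P2P usable
    print(pick_transport(g0, g1, p2p_ok=False))                # same node, falls back to SHM
    print(pick_transport(g0, g1, p2p_ok=False, shm_ok=False))  # same node, still ends up on NET
    print(pick_transport(g0, remote))                          # different nodes
```

The third call shows why the intra-node and inter-node columns are not strict: in this sketch two GPUs on the same host still end up on the NET transport when neither P2P nor SHM is usable.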
Intra-node Data Transfer

Figure 1: Illustration of intra-node data transfer paths in NCCL.
Inter-node Data Transfer

Figure 2: Illustration of inter-node data transfer paths in NCCL.