OpenAI, together with AMD, Broadcom, Intel, Microsoft, and NVIDIA, has developed a new networking protocol called MRC (Multipath Reliable Connection). It is designed to make data transfer between GPUs in large AI supercomputers faster and more stable and, in particular, to improve AI training
...