AMD Collaborates with OpenAI and Microsoft to Bring MRC AI Networking Protocol to Open Ecosystem

NDM News Network

AMD, in collaboration with OpenAI, Microsoft, and other industry leaders, announced that it is contributing Multipath Reliable Connection (MRC) to the Open Compute Project (OCP), making this new network protocol available to the broader ecosystem. As a long-standing contributor to open ecosystems helping advance Ethernet for the era of AI, AMD is helping transform AI networking into an open, programmable, production-ready foundation for customers building AI infrastructure.

For AMD, and the industry at large, MRC represents more than a new networking protocol for frontier-scale supercomputers. It is an important step toward a more open, programmable, and resilient foundation for AI infrastructure. As customers build larger AI clusters across cloud, enterprise, research, and sovereign AI environments, the industry needs networks that are not only fast in ideal conditions, but consistent, adaptive, and operationally practical in real-world deployments.

MRC: Built for AI Networking at Scale

MRC is designed specifically for large-scale AI training environments where traditional single-path networking models struggle. These workloads require continuous, high-speed communication, and even brief disruptions can impact overall system progress.

Instead of sending traffic along a single path, MRC distributes packets across multiple paths simultaneously. This reduces congestion hotspots and limits latency variation that can slow synchronized training. When failures inevitably occur, MRC adapts quickly and allows traffic to reroute in near real-time, avoiding the delays associated with traditional network recovery.
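To make the idea concrete, here is a minimal, hypothetical sketch of the multipath behavior described above. This is not the MRC specification or AMD's implementation; the `MultipathSender` class and its methods are illustrative names invented for this example, showing packets being spread across healthy paths and traffic shifting away from a path when it fails.

```python
class MultipathSender:
    """Illustrative sketch only, not the MRC spec: spread packets
    across several network paths and reroute around failed paths."""

    def __init__(self, num_paths):
        # All paths start out healthy and usable.
        self.healthy = set(range(num_paths))

    def mark_failed(self, path):
        # Reroute: stop assigning traffic to this path.
        self.healthy.discard(path)

    def mark_recovered(self, path):
        # Bring a repaired path back into rotation.
        self.healthy.add(path)

    def send(self, packets):
        """Assign each packet to a healthy path round-robin style,
        so no single path becomes a congestion hotspot."""
        if not self.healthy:
            raise RuntimeError("no healthy paths available")
        paths = sorted(self.healthy)
        return {i: paths[i % len(paths)] for i, _ in enumerate(packets)}


sender = MultipathSender(4)
plan = sender.send(range(8))      # packets spread across paths 0-3
sender.mark_failed(2)
plan_after = sender.send(range(8))  # path 2 receives no traffic
```

A real transport layer would additionally track per-path congestion signals and reliability state; the point of the sketch is only that path selection is per-packet and failure recovery is a local bookkeeping update, not a full connection reset.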

In practical terms, MRC helps turn the network into a shock absorber for AI infrastructure. Instead of forcing every event to become a disruption, MRC gives the network a way to adapt locally and quickly so workloads can continue making progress. That matters because performance at AI scale is not defined by peak bandwidth alone. It is defined by how much useful accelerator capacity remains productive under real-world conditions.

AMD Contributions: From Development to Deployment

AMD played a formative role in shaping how MRC works today. AMD co-led authorship of the MRC specification that defines next-generation AI networking and contributed advanced congestion control technology to improve performance under real-world conditions.

More importantly, this isn’t theoretical. AMD has implemented and deployed MRC, combined with AMD networking technology, at scale in test clusters with a leading cloud provider. This validation means the design reflects how networks actually perform under sustained AI workloads.

“As GPUs and CPUs continue to drive compute, the real bottleneck in scaling AI is the network. AMD, alongside OpenAI and Microsoft, announced MRC, marking a major step forward for the industry. The programmability from AMD enables us to rapidly turn innovations like this into real-world performance at scale, where consistent, resilient throughput matters more than theoretical peak bandwidth.” - Krishna Doddapaneni, CVP, Engineering, NTSG, AMD

Programmability remains a key differentiator for AMD, which offers one of the only networking solutions that combines full hardware and software programmability with proven deployments, allowing networks to adapt as workloads evolve. Before the development of the MRC specification, AMD had a pre-standard implementation of an improved RoCEv2 transport protocol, which evolved into today's MRC standard.

This was possible because of the open programmability of the AMD Pensando™ Pollara 400 AI NIC, which provided the flexibility needed for early validation. As one of the first and only companies to implement MRC on a 400G NIC, AMD can accelerate a seamless transition to the AMD Pensando “Vulcano” 800G AI NIC, which also supports the MRC transport protocol.

This combination of a defined specification, contributed technology, and implementation in testing positions AMD at the forefront of deploying MRC in real-world AI infrastructure.

Redefining Performance for AI Infrastructure

For AI at scale, performance is defined by how systems behave under real conditions, not peak bandwidth. What matters is consistent throughput, effective congestion handling, and quick recovery from failures, all while keeping GPUs synchronized and productive. MRC can improve training efficiency and helps make the networking that connects large GPU clusters highly reliable.

By helping define, develop, and contribute to MRC, AMD, in collaboration with OpenAI, Broadcom, Intel, and Microsoft, is advancing AI networking from concept to practical, production-ready infrastructure. 
