
Among Nvidia Rubin's first cloud service providers: Nebius (NBIS.US) will launch the Vera Rubin NVL72 computing power cluster in the second half of 2026

Zhitong Finance · 01/06/2026 13:41:13

The Zhitong Finance App learned that Nebius (NBIS.US) said it will offer Nvidia (NVDA.US)'s Vera Rubin NVL72 graphics processing units (GPUs) in the US and Europe starting in the second half of 2026.

The Netherlands-based AI infrastructure provider said it will deploy the Nvidia Rubin platform through Nebius AI Cloud and Nebius Token Factory, making it one of the first AI cloud vendors to offer the computing platform.

Nebius said it will integrate the Vera Rubin NVL72 into its full-stack infrastructure in US and European data centers, enabling customers to build next-generation AI applications with regional availability and control.

“By integrating Vera Rubin into the Nebius AI Cloud and our inference platform, Nebius Token Factory, we are giving AI innovators and businesses the infrastructure to build agentic and reasoning AI systems faster and more efficiently,” said Arkady Volozh, founder and CEO of Nebius.

According to the company, Nebius Token Factory is an inference and post-training platform for enterprises.

Nebius indicated that the Rubin accelerated computing platform will complement its existing Nvidia GB200 NVL72 and Nvidia Grace Blackwell Ultra NVL72 capacity and broaden the range of platform choices available to customers.

On Monday, Nvidia CEO Jensen Huang said that Vera Rubin, the next-generation computing platform and successor to Grace Blackwell, has now entered full mass production.

Dion Harris, Nvidia's senior director of high-performance computing and AI infrastructure solutions, described Vera Rubin as “an AI supercomputer with six chips.” The platform consists of six core components: Vera CPU, Rubin GPU, sixth-generation NVLink switching chip, ConnectX-9 network card, BlueField 4 DPU, and Spectrum-X 102.4T CPO. It targets next-generation AI workloads in the cloud and large data centers.

Among these components, the Rubin GPU features a third-generation Transformer engine and delivers 50 PFLOPS of NVFP4 inference compute, five times that of Nvidia's previous-generation Blackwell GPU. At the architecture level, Nvidia says the Vera Rubin platform can train large-scale Mixture-of-Experts (MoE) models in the same training time with only a quarter as many GPUs, cutting the training cost per token to one-seventh of the original. Nvidia also emphasized that Vera Rubin will support third-generation confidential computing and will be the industry's first rack-level trusted computing platform, aimed at AI scenarios with strict requirements for secure isolation, data privacy, and multi-tenancy.
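Taken at face value, the ratios quoted above allow a back-of-envelope cross-check. This is only a sketch under the article's stated figures (the 5x multiplier and the one-quarter/one-seventh ratios); it is not an Nvidia benchmark, and real deployments will vary.

```python
# Back-of-envelope arithmetic from the article's quoted figures.
rubin_nvfp4_pflops = 50.0        # Rubin GPU NVFP4 inference throughput (quoted)
speedup_vs_blackwell = 5.0       # quoted generational speedup

# Implied per-GPU NVFP4 throughput of the prior Blackwell generation.
implied_blackwell_pflops = rubin_nvfp4_pflops / speedup_vs_blackwell

gpu_count_ratio = 1 / 4          # GPUs needed vs. prior generation (quoted)
token_cost_ratio = 1 / 7         # training cost per token vs. prior (quoted)

print(implied_blackwell_pflops)  # → 10.0
```

The gap between the one-quarter GPU count and the one-seventh per-token cost suggests the cost claim also folds in savings beyond raw GPU count (power, networking, rack overhead), though the article does not break this down.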