Chinese AI firms mixing different GPUs inside individual AI servers to combat GPU shortages from US sanctions

DigiTimes Asia reports that Chinese AI firms are adopting a multi-chip hybrid approach to improve their AI capabilities. The approach brings several advantages: faster LLM training through multi-GPU parallel training, the ability to process more data simultaneously with better memory utilization, and lower costs from not relying solely on expensive Nvidia chips.

Chinese tech companies are bundling GPUs from different suppliers for their AI training needs to circumvent American sanctions limiting their access to advanced hardware. With the White House taking active measures to stop U.S.-made tech from entering China, such as revoking eight licenses for exports to Huawei in 2024, obtaining the data center GPUs required for advanced AI processing is getting more complicated in the East Asian country.

While local semiconductor companies are trying to fill the market gap, Huawei's Ascend AI processor is the only viable AI chip readily available in the country. However, the company is reportedly having yield issues with the Ascend 910B, meaning these chips could cost more and take longer to produce.

The GPU situation in China has become so bad since the U.S.-China chip war began that there's already an underground smuggling network focusing primarily on Nvidia's AI GPUs. Nevertheless, the limited supply and exorbitant prices of these black-market Nvidia cards mean that tech companies must find ways to supplement their American GPU supply with local chips (like those from Huawei) or unsanctioned hardware, like older-generation GPUs.

To make this possible, Chinese firms have started developing 'multi-chip hybrid' technologies that would allow them to combine different chips into a single training cluster. For example, Baidu announced during its 2024 earnings call that it could combine GPUs from various vendors and use them for AI training. Another major Chinese tech company, Alibaba, has worked on a 'one cloud, multiple chips' solution since 2021.

Using different GPUs in a single AI server has challenges, such as the need for a high-speed fabric like Nvidia's NVLink so that disparate accelerators can communicate efficiently. However, Chinese tech companies are innovating here as well, with Alibaba Cloud ditching NVLink in favor of its Ethernet-based High-Performance Network.

The White House's sanctions on Beijing severely limit AI development in China. However, this doesn't mean that Chinese progress will stop. Even though many experts say that the country is at least ten years behind the U.S. in several critical technology areas, it will still move forward, and tech companies will try to find a way to flourish despite the geopolitical hindrances placed on them.
