Chinese GPU maker Moore Threads' MTLink fabric tech challenges Nvidia's NVLink, can now scale to 10,000 GPUs for AI clusters

The Moore Threads MTT S4000 graphics card.
(Image credit: Moore Threads)

One of Nvidia's advantages in the data center is that it not only offers leading-edge GPUs for AI and HPC but can also scale its processors across a data center effectively using its own hardware and software. How do you compete with Nvidia if your GPUs are slower and your software stack is not as pervasive as CUDA? One answer is to expand your own scale-out capabilities. That is exactly what Chinese GPU maker Moore Threads has done, according to a South China Morning Post report.

Moore Threads has upgraded its KUAE data center server for AI so that up to 10,000 GPUs can be connected in a single cluster. Each KUAE server integrates eight MTT S4000 GPUs interconnected using the proprietary MTLink technology, and the platform is designed specifically for training and running large language models (LLMs). The GPUs are based on the MUSA architecture and feature 128 tensor cores and 48 GB of GDDR6 memory with 768 GB/s of bandwidth. A 10,000-GPU cluster would wield 1,280,000 tensor cores, but its actual performance is unknown, as performance scaling depends on numerous factors.
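
As a back-of-the-envelope check, the per-GPU figures quoted above can be multiplied out to cluster scale. The short sketch below assumes a cluster at the full quoted size of 10,000 GPUs, built entirely from eight-GPU KUAE servers; the variable names are illustrative, and the totals are theoretical aggregates, not measured performance.

```python
# Per-GPU and per-server figures as quoted in the article.
GPUS_PER_SERVER = 8
TENSOR_CORES_PER_GPU = 128
MEMORY_GB_PER_GPU = 48

# Assumption: the cluster is populated to the maximum quoted size.
CLUSTER_GPUS = 10_000

servers = CLUSTER_GPUS // GPUS_PER_SERVER            # eight-GPU servers needed
tensor_cores = CLUSTER_GPUS * TENSOR_CORES_PER_GPU   # aggregate tensor cores
memory_tb = CLUSTER_GPUS * MEMORY_GB_PER_GPU / 1000  # aggregate GDDR6, in TB

print(f"Servers:      {servers:,}")
print(f"Tensor cores: {tensor_cores:,}")
print(f"GDDR6 total:  {memory_tb:,.0f} TB")
```

The tensor-core total matches the 1,280,000 figure above. None of these sums says anything about how efficiently work is spread across 1,250 servers, which is where interconnects such as MTLink matter.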

This move highlights Moore Threads' efforts to boost its data center AI capabilities despite being on the U.S. Department of Commerce's Entity List. Moore Threads' products, of course, lag behind Nvidia's GPUs in terms of performance. Even Nvidia's A100 80 GB GPU, introduced in 2020, offers significantly greater compute performance than the MTT S4000 (624/1,248 INT8 TOPS vs. 200 INT8 TOPS). Yet there are claims that the MTT S4000 is competitive against unspecified Nvidia GPUs.
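
To put the INT8 gap in numbers, the sketch below divides the quoted peak figures. It reads the A100's two numbers the way Nvidia's datasheet lists them (624 TOPS dense, 1,248 TOPS with structured sparsity). The final line assumes perfect linear scaling across 10,000 GPUs, which no real cluster achieves, so treat it as an upper bound for illustration only.

```python
# Peak INT8 throughput in TOPS, as quoted in the article.
A100_INT8_DENSE = 624    # A100 80 GB, dense
A100_INT8_SPARSE = 1248  # A100 80 GB, with structured sparsity
S4000_INT8 = 200         # MTT S4000

dense_ratio = A100_INT8_DENSE / S4000_INT8
sparse_ratio = A100_INT8_SPARSE / S4000_INT8

# Hypothetical: 10,000 S4000s with ideal (100%) scaling efficiency,
# expressed as an equivalent number of A100s at dense INT8 peak.
CLUSTER_GPUS = 10_000
a100_equivalents = CLUSTER_GPUS * S4000_INT8 / A100_INT8_DENSE

print(f"A100 vs S4000 (dense):  {dense_ratio:.2f}x")
print(f"A100 vs S4000 (sparse): {sparse_ratio:.2f}x")
print(f"Ideal-scaling A100 equivalents: {a100_equivalents:,.0f}")
```

On peak numbers alone, one A100 is worth roughly three to six S4000s, which is why scaling out to larger GPU counts is the lever Moore Threads is pulling.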

Moore Threads, which was founded in 2020 by a former Nvidia China executive, does not have access to leading-edge process technologies under U.S. export rules, as it was blacklisted by the Biden administration. However, the company is developing new GPUs for gaming (these graphics cards aren't on our list of the best graphics cards) and is pushing forward in the AI sector despite significant obstacles.

So far, Moore Threads has forged strategic partnerships with major state-run telecom operators, including China Mobile and China Unicom, as well as China Energy Engineering Corp. and Guilin Huajue Big Data Technology. These collaborations aim to develop three new computing cluster projects, further advancing China's AI capabilities.

Moore Threads recently completed a financing round, raising up to 2.5 billion yuan (approximately US$343.7 million). This influx of funds is expected to support its ambitious expansion plans and technology advancements. However, without access to advanced process technologies offered by TSMC, Intel Foundry, and Samsung Foundry, the firm faces numerous challenges on the path to developing next-gen GPUs. 

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • eichwana
    Yet, there are claims that the MTT S4000 is competitive against unknown Nvidia GPUs.

    What, like a GTX 1050?
  • ivan_vy
    eichwana said:
    What, like a GTX 1050?
    if you can't get the best you go for the next (best?) thing, also enterprise GPGPU tends to have bad graphics performance, software is another thing to consider, maybe the card was never meant to be a graphical powerhouse but a product for enterprise clients.
  • RyderXx
    China likes to make big claims, but they are years if not decades behind. If they can't even match Nvidia's A100 80GB, then no, while Nvidia is already onto Blackwell GB200.
  • zsydeepsky
    RyderXx said:
    China likes to make big claims, but they are years if not decades behind. If they can't even match Nvidia's A100 80GB, then no, while Nvidia is already onto Blackwell GB200.

    Nvidia would beg to differ.

    Nvidia names Huawei a top competitor in major areas including AI chips: Nvidia has named Huawei a top competitor in a number of areas, including in the crucial production of processors that power artificial intelligence (AI) systems.

    Nvidia's AI Chip Has A Tough Competitor From China - Huawei's Latest Ascend 910B: Nvidia Corp faces competition from Huawei Technologies Co’s latest AI chip, the Ascend 910B, which a Huawei executive claims matches or surpasses Nvidia’s A100 in some tests.
  • alan.campbell99
    Using GDDR6. Are they restricted from using HBM (I presume GDDR is not)?
  • zsydeepsky
    alan.campbell99 said:
    Using GDDR6. Are they restricted from using HBM (I presume GDDR is not)?
    yes, all sanctioned Chinese companies have no access to HBM.
    but Moore Threads doesn't really need HBM (for now): their position in the Chinese market is "gaming GPU", for which GDDR is sufficient, especially considering their current performance barely matches a GTX 1060.

    on the other hand, no access to HBM is indeed a problem for Huawei, so I saw some rumors about them also developing their own HBM.