Big Tech, VC firms pump $1B into ML data darling Scale AI

Be careful not to over-inflate, you may burst your bubble

Scale AI's valuation soared to nearly $14 billion on Tuesday after the startup revealed it has raked in a billion dollars in venture capital. The late-stage funding round was led by VC house Accel, with support from industry titans including Nvidia, Amazon, and Meta.

While Nvidia made its fortune selling the hardware on which OpenAI, Anthropic, Meta and others rely, Scale's claim to fame is furnishing the data required to actually train those models. And, as we've previously discussed, modern models require a lot of data. Just to train a relatively small model like Llama 3, Meta says it used 15 trillion tokens — the scraps of words and punctuation that make up prose and speech.

Founded in 2016 by Alexandr Wang, Scale bills itself as a "data foundry" that's had a hand in powering "nearly every leading AI model" out there. This includes working directly with OpenAI on GPT-2 and InstructGPT as well as several programs run by the US Department of Defense.

In addition to furnishing model builders with massive quantities of meticulously labeled data, Scale also provides services to help its partners fine-tune their existing datasets.

To say sourcing enough data to build ever more capable models has proven problematic would be an understatement. This issue is at the center of numerous lawsuits brought by artists, newspapers, photographers, and authors, which allege that OpenAI and others violated creators' copyright by using their works to train machine learning models.

And it seems the problem isn't going to get any easier as model builders continue to push the envelope of what's possible with transformer models.

"The scaling laws imply an exponentially growing need for data as models get bigger, which raises a key question: will we run out of data," Wang said in a corporate statement Tuesday.

If artificial general intelligence has any hope of becoming a reality, Scale argues, an abundance of data is going to be required, and it needs to be of high enough quality to actually contribute to more capable models.

Finally, Scale makes the case for a measurement and evaluation system to determine whether those models can be trusted enough for wide-scale adoption.

Scale is far from the only startup that has seen its valuation skyrocket in the wake of the generative AI boom. Earlier this month, GPU bit barn CoreWeave pulled in $1.1 billion in new funding, pushing its valuation to $19 billion. Just a few weeks later, the datacenter operator revealed it'd secured $7.5 billion in debt financing from Blackstone, BlackRock, and others to furnish yet more datacenters with GPUs.

Meanwhile, less established AI upstarts have found success bringing in seed funding. Alongside Scale's massive funding round, French AI startup H, formerly Holistic AI, on Tuesday scored $220 million to accelerate development of multi-agent foundation models for generative AI apps. ®
