True to his word, billionaire multi-company leader Elon Musk’s startup xAI today made its first large language model (LLM) Grok open source.

The move, which Musk had previously said would happen this week, lets any entrepreneur, programmer, company, or individual take Grok’s weights and associated documentation and use a copy of the model however they like, including in commercial applications. The weights are the learned numerical values that set the strength of connections between the model’s artificial “neurons,” the software units that turn text inputs into text outputs.

“We are releasing the base model weights and network architecture of Grok-1, our large language model,” the company announced in a blog post. “Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.”

Those interested can download the code for Grok from its GitHub page or via a torrent link. Hugging Face has also added a fast download option.

What Grok’s open sourcing means

Parameters are the weights and biases that govern the model. Generally, the more parameters a model has, the more advanced, complex and performant it is. At 314 billion parameters, Grok is well ahead of open source competitors such as Meta’s Llama 2 (70 billion parameters) and Mistral’s Mixtral 8x7B (about 47 billion total parameters, of which roughly 13 billion are active for any given token).

Grok was open sourced under the Apache License 2.0, which permits commercial use, modification, and distribution, though it grants no trademark rights and comes with no liability or warranty. In addition, anyone who redistributes the model must reproduce the original license and copyright notice and state any changes they have made.

Grok-1 was trained with a custom training stack built on JAX and Rust, and xAI says the pre-training phase concluded in October 2023. As a Mixture-of-Experts model, it uses only 25% of its weights for a given token, which reduces the compute needed per token compared with a dense model of the same total size.
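To make the 25% figure concrete, here is a minimal sketch of top-k routing in a Mixture-of-Experts layer. It is an illustration only, not xAI's code: the expert count (8), the number of experts chosen per token (2), the toy scalar "experts," and every function name are assumptions picked so that 2 of 8 experts gives the 25% ratio. It also ignores weights that all tokens share, such as attention layers.

```python
import math
import random

NUM_EXPERTS = 8  # assumption for illustration; not stated in the article
TOP_K = 2        # assumption: 2 of 8 experts matches the 25% figure


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )
    chosen = ranked[:top_k]
    # Renormalize the gate weights over the chosen experts only.
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(x, experts, router_logits, top_k=TOP_K):
    """Weighted sum of the chosen experts' outputs; other experts are never called."""
    out = 0.0
    for idx, weight in route(router_logits, top_k):
        out += weight * experts[idx](x)
    return out


# Toy experts: expert i multiplies a scalar input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, NUM_EXPERTS + 1)]

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router scores
selected = route(logits)
active_fraction = len(selected) / NUM_EXPERTS
y = moe_layer(1.0, experts, logits)
print(active_fraction)  # 0.25
```

In a real model the router scores come from a learned layer applied to each token, and each expert is a full feedforward network rather than a scalar multiplier. The point is the same either way: only the selected experts run for a given token.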

Grok was initially released as a proprietary or “closed source” model back in November 2023 and it was, until now, accessible only on Musk’s separate but related social network X (formerly Twitter), specifically through the X Premium+ paid subscription service, which costs $16 per month or $168 per year.

However, Grok’s release does not include the full corpus of its training data. That does not matter for using the model, since it has already been trained, but it means users cannot see what it learned from, which presumably includes user text posts on X. The xAI blog post says only, “Base model trained on a large amount of text data, not fine-tuned for any particular task.”

It also does not include any hookup to the realtime information available on X, which Musk initially touted as a major attribute of Grok over other LLMs. For that, users will still need to subscribe to the paid version on X.

More than just a tech move — a business and PR strategy

Designed to rival ChatGPT made by OpenAI, the company Musk co-founded and broke from acrimoniously in 2018 and now competes with, Grok is named after the slang term that means “understanding,” and is described as “an AI modeled after the Hitchhiker’s Guide to the Galaxy,” the seminal 1970s radio drama and satirical sci-fi book series by UK author Douglas Adams (it was adapted into a major movie in 2005).

Musk has positioned Grok as a more humorous and uncensored alternative to ChatGPT and other leading LLMs, a stance that has gained appeal among users in light of complaints about AI censorship and Google Gemini’s embarrassing, racially confused image generations and questionable ideological stances (Gemini suggested in at least one example that Musk’s tweets were possibly as bad for society as Nazi leader Adolf Hitler). Gemini has been resoundingly criticized by Musk and other influential tech leaders, including a16z co-founder and web pioneer Marc Andreessen.

The open sourcing of Grok also clearly supports Musk’s ideological position in his lawsuit against and general criticism of OpenAI, which he sued recently, accusing his former company of abandoning its “founding agreement” to operate as a non-profit. OpenAI responded, at least in the court of public opinion, by releasing emails indicating that Musk was aware of, and possibly supportive of, its move toward proprietary, for-profit technology.

Already, the AI community on X has reacted to the release with curiosity and excitement. Notably, the technical community has pointed out the model’s use of GeGLU in feedforward layers and its approach to normalization, with a nod to the intriguing sandwich norm technique. Even employees of OpenAI have posted about their interest in the model.

Grok weights are out under Apache 2.0: https://t.co/9K4IfarqXK

It's more open source than other open weights models, which usual come with usage restrictions.

It's less open source than Pythia, Bloom, and OLMo, which come with training code and reproducible datasets. https://t.co/kxu2anrNiP pic.twitter.com/UeNew30Lzn
— Sebastian Raschka (@rasbt) March 17, 2024

Few comments on Grok-1 code release in JAX! https://t.co/FpDCrCgz3l

Looking quickly:
– model nicely written
– partition rules for sharding follow the old style of t5x
– they used haiku but it wouldn't be too hard to update to flax
– they use shard_map on the MoE layers for…
— Boris Dayma (@borisdayma) March 17, 2024

lfg https://t.co/PLkUfQLXnL
— will depue (@willdepue) March 17, 2024

As such, the release of Grok is likely to put pressure on all other LLM providers, especially other rival open source ones, to justify to users how they are superior.
