
Google takes aim at OpenAI with launch of powerful new AI model Gemini

Credit: VentureBeat made with Midjourney



Google has finally unveiled Gemini, its much-hyped new artificial intelligence (AI) model that experts say could become a key asset in the race for AI supremacy against rivals OpenAI, Microsoft, Meta, and Amazon.

Gemini, which has been anticipated for months as Google’s next big AI breakthrough, represents the tech giant’s largest and most ambitious AI model release yet.

According to CEO Sundar Pichai, Gemini brings the company significantly closer to creating a multifaceted AI assistant that can understand and reason about the world like a human.

The new model is, in part, Google’s answer to the rising demand for enterprise AI products that can analyze and generate text, images, audio, video and other data formats.

As VentureBeat reported, 60% of employees are expected to use their own AI tools at work in 2024, according to a Forrester Research study. Additionally, many businesses are already seeing an average 3.5x return on their AI investments, according to an IDC report.

Engineered for sophisticated reasoning

Google says that Gemini is its most flexible AI model yet. It can efficiently run in the cloud at large data centers and also locally on mobile devices.

The company optimized Gemini in three different sizes:

  • Gemini Ultra is the largest version, aimed at highly complex tasks like scientific research and data analysis. It is the most capable and most compute-intensive of the three.
  • Gemini Pro is designed for scaling across a wide range of applications. It will be used in Google products like the Bard conversational AI and to power new Pixel smartphone features.
  • Gemini Nano is a lightweight on-device model that can run locally on smartphones and other devices.

Google says Gemini was built intentionally from the ground up as a multimodal model, which means it can seamlessly combine different modalities of information (such as video, photos, audio, and text) and perform sophisticated reasoning and problem-solving tasks in each of those different formats.

Google says Gemini has been rigorously tested. According to the company, Gemini Ultra exceeded the previous state-of-the-art results on 30 of 32 widely used AI benchmarks and is the first model to outperform human experts on the Massive Multitask Language Understanding (MMLU) benchmark, which evaluates abilities across 57 subjects ranging from mathematics and physics to history, law, and ethics.

The model can also be used as the engine for advanced coding systems, such as AlphaCode 2, which Google says can solve competitive programming problems that involve complex math and theoretical computer science.

Game-changer for developers and businesses

Google will begin rolling out Gemini today across a wide range of products and platforms, starting with Bard, its conversational AI chatbot that launched earlier this year to widespread disappointment. Bard will use a fine-tuned version of Gemini Pro for more advanced capabilities, such as generating poems, stories, essays, songs and more.

Gemini will also power new features on the Pixel 8 Pro smartphone, such as the “Summarize” feature in the Recorder app and the “Smart Reply” feature in Gboard. In the coming months, Gemini will be available in more Google products and services, such as Search, Ads, Chrome and Duet AI, a new AI-powered collaboration platform.

Evaluating the impact

The implications of Gemini’s arrival on the AI stage are significant. For developers and enterprise customers, Gemini’s capabilities could change how they build and scale with AI, offering new and improved tools for their arsenal.

Moreover, the model’s native multimodality and advanced reasoning abilities could transform industries that rely heavily on multi-format data analysis, such as healthcare, entertainment and autonomous driving.

In the realm of coding, Gemini’s prowess could be transformative. It can understand, explain and generate high-quality code in popular programming languages and shows promise in solving complex programming problems. This could significantly streamline the software development process and lead to more sophisticated and efficient software solutions.

Google comes out swinging with Gemini

The competition between Google and its peers like Meta, Microsoft, and OpenAI continues to heat up as advanced models like Gemini prove their ability to automate new tasks and create novel content. Google seems determined to prove it can set the pace.

But the race for AI superiority is still in its early days. While models like GPT-4 and Gemini point to a future powered by intelligent machines, experts say we have only begun to scratch the surface of what artificial intelligence is capable of.

If Gemini works as promised, Google may have just positioned itself as the frontrunner in the AI race. But the long game to develop artificial general intelligence remains wide open.