Google recently released a video showcasing the capabilities of its artificial intelligence (AI) model, Gemini, garnering significant attention with over 2 million views on YouTube. However, upon closer inspection, it appears that the demo may not be as straightforward as it initially seemed.
Content Analysis and Clarifications
The video appears to show Gemini responding in real time to spoken prompts and live video. In the video description, Google acknowledged speeding up responses for the demonstration, and it later clarified in a blog post that the AI was not actually responding to voice or video input.
Google acknowledged that the video was made by feeding the AI still image frames from the footage and prompting it through text. So although the video claims to show real prompts and outputs from Gemini, those outputs were not produced through the live voice-and-video interaction that the video depicts.
Examples from the video, such as identifying objects and performing a magic trick, involved showing the AI a series of still images rather than responding to real-time video stimuli. Google explained that the prompts were generated by capturing footage and testing Gemini’s capabilities on various challenges.
Furthermore, a segment in which the AI supposedly invents a game called “Guess the Country” was not a spontaneous creation. Instead, the AI was given specific instructions through a text prompt that outlined the rules of the game and provided examples of correct and incorrect answers.
Comparison with OpenAI’s GPT-4
Despite the impressive capabilities displayed in the video, the reliance on still images and text-based prompts invites comparison with OpenAI’s GPT-4, which can also be prompted with images and text. The similarity in capabilities suggests that Google’s Gemini may not significantly surpass its counterparts.
While Google’s Gemini AI exhibits commendable capabilities, transparency about the methods used in the demo is essential. The video, although showcasing the potential of AI, should be viewed with an understanding that certain elements were staged using still images and carefully crafted text prompts.
The competition between Google’s Gemini and OpenAI’s GPT-4 for AI supremacy is intensifying, with the former playing catch-up as OpenAI works on the next iteration of its AI, as revealed by CEO Sam Altman. The question of which model is more advanced remains unanswered, but it’s clear that AI is rapidly evolving.