Google Exposes 100,000-Prompt Attack on Gemini, Revealing AI Cloning Attempt
Google has revealed an attempt to hack its artificial intelligence chatbot, Gemini, through a coordinated attack involving thousands of specially designed prompts engineered to expose the chatbot’s internal logic and reasoning processes.
The attack was a reverse-engineering effort aimed at replicating the proprietary AI model's behaviour so it could be cloned. Google disclosed the incident in a report discussing various harmful activities and misuse of Gemini.
Rather than exploiting security vulnerabilities, the attackers leveraged legitimate access through Gemini’s API, which Google provides to software developers for building chatbot-based applications. This approach allowed the perpetrators to interact with the AI model through ordinary developer channels whilst gradually mapping its response patterns and internal logic.
Under normal circumstances, Gemini displays only final answers to users without revealing the underlying reasoning process. However, the attackers attempted to coerce the model into exposing its full reasoning through carefully constructed and repetitive prompts.
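The extraction pattern described above can be illustrated with a heavily simplified sketch. Everything here is a hypothetical stand-in: `query_black_box` is not Google's API, and the probe prompts are invented for illustration, not the attackers' actual tooling.

```python
# Hypothetical sketch of a model-extraction loop: an attacker with
# ordinary API access repeatedly queries a black-box model and records
# prompt/response pairs as training data for a would-be clone.

def query_black_box(prompt: str) -> str:
    """Stand-in for an API call to the proprietary model."""
    return f"response to: {prompt}"

def collect_extraction_data(prompts: list[str]) -> list[dict]:
    """Record every prompt/response pair for later study or distillation."""
    dataset = []
    for p in prompts:
        # Repetitive, carefully varied prompts probe the model's behaviour.
        dataset.append({"prompt": p, "response": query_black_box(p)})
    return dataset

# Illustrative probes attempting to elicit the model's reasoning process.
probes = [f"Explain step {i} of your reasoning." for i in range(3)]
dataset = collect_extraction_data(probes)
```

At scale, tens of thousands of such recorded pairs could serve as supervision for training a substitute model, which is why providers treat this pattern as model extraction rather than ordinary API use.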
One identified prompt instructed Gemini to ensure the language used in its reasoning content was entirely consistent with the user’s primary language, a technique designed to extract more detailed internal workings.
Google categorised the prompt flooding activity as a form of intellectual property theft, noting that it violated Gemini’s Terms of Service. The company stated it has the right to terminate access for users proven to be involved in such violations. Google also warned other AI developers to remain vigilant against similar model extraction attacks.
Although the immediate impact remains unclear, Google assessed that such attacks could be exploited for commercial gain or harmful cyber activity, and emphasised that even where direct damage is not apparent, the long-term risks pose significant threats to proprietary AI systems.