State-backed hackers weaponize AI models for attacks
Enterprise AI

State-sponsored hackers from Iran, North Korea, China, and Russia are weaponizing AI models like Gemini for reconnaissance, social engineering, and malware development.

4 min read
ai-security · state-sponsored-attacks · gemini · llm-threats · enterprise-ai · ai-malware · model-extraction

State-sponsored threat actors are increasingly leveraging commercial AI models to enhance their cyberattack capabilities. Google's Threat Intelligence Group has documented how government-backed hackers from Iran, North Korea, China, and Russia are exploiting large language models like Gemini for reconnaissance, social engineering, and malware development.

This marks a significant shift in the threat landscape. Unlike traditional attacks that relied purely on human expertise, these operations now combine human tradecraft with AI-powered automation to scale their campaigns and increase their sophistication.

AI-Enhanced Social Engineering Operations

Iranian threat group APT42 has weaponized Gemini to conduct targeted reconnaissance and social engineering at scale. The group leverages AI to generate legitimate-seeming email addresses and research credible pretexts for approaching high-value targets.

The AI assistance extends beyond basic automation. APT42 uses language models to craft personas and scenarios specifically designed to elicit target engagement:

  • Language translation — Converting phishing content across multiple languages
  • Native phrasing — Generating natural-sounding text that bypasses traditional grammar-based detection
  • Cultural adaptation — Tailoring communication styles to specific regional contexts
  • Pretext generation — Creating believable scenarios for initial contact

North Korean group UNC2970 has similarly integrated Gemini into its targeting workflow. The group focuses on defense contractors and uses AI to profile high-value targets across major cybersecurity and defense companies.

Model Extraction and IP Theft

Model extraction attacks represent a new frontier in AI-targeted threats. These "distillation attacks" aim to steal intellectual property from proprietary models by reverse-engineering their capabilities.

One campaign targeted Gemini's reasoning abilities using over 100,000 specially crafted prompts. The attack attempted to coerce the model into revealing its internal reasoning processes across multiple languages and task domains.

Google's defense systems detected these extraction attempts in real-time. The company deployed countermeasures to protect internal reasoning traces while maintaining legitimate model access:

  • Real-time detection — Identifying suspicious query patterns automatically
  • Access controls — Limiting API access for accounts exhibiting extraction behaviors
  • Response filtering — Preventing models from exposing internal reasoning traces
  • Account monitoring — Tracking usage patterns across commercial API endpoints
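The real-time detection described above can be illustrated with a simple defender-side heuristic. The sketch below is hypothetical (the function names, thresholds, and log format are assumptions, not Google's actual tooling): it flags accounts that submit very large volumes of near-identical, templated prompts, the signature of the 100,000-prompt distillation probing described earlier.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical thresholds; a real system would tune these against live traffic.
VOLUME_THRESHOLD = 1000      # prompts per account per monitoring window
SIMILARITY_THRESHOLD = 0.8   # near-duplicate ratio suggesting templated probing


def similarity(a: str, b: str) -> float:
    """Cheap structural similarity between two prompt strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a, b).ratio()


def flag_extraction_suspects(prompt_log: list[tuple[str, str]]) -> set[str]:
    """Flag accounts whose prompt stream looks like templated extraction probing.

    prompt_log: (account_id, prompt_text) pairs from one monitoring window.
    """
    by_account: dict[str, list[str]] = defaultdict(list)
    for account, prompt in prompt_log:
        by_account[account].append(prompt)

    suspects = set()
    for account, prompts in by_account.items():
        if len(prompts) < VOLUME_THRESHOLD:
            continue  # low-volume accounts are not extraction candidates
        # Compare consecutive prompts: templated extraction queries vary only
        # slightly (e.g. a substituted task or language per request).
        pairs = list(zip(prompts, prompts[1:]))[:200]
        near_dupes = sum(similarity(a, b) > SIMILARITY_THRESHOLD for a, b in pairs)
        if near_dupes / len(pairs) > 0.5:
            suspects.add(account)
    return suspects
```

Production systems would layer this with embedding-based clustering and per-key rate limits rather than raw string similarity, but the core signal is the same: volume plus structural repetition.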

Commercial Sector Involvement

The threat extends beyond state actors. Private sector entities and researchers globally have attempted similar model extraction attacks against frontier AI systems. This suggests both commercial espionage motivations and academic research interests in reverse-engineering proprietary model capabilities.

AI-Generated Malware Frameworks

Threat actors are integrating AI directly into malware architecture. HONESTCUE represents a new class of AI-powered malware that uses Gemini's API to generate functionality on-demand.

The malware operates as a downloader and launcher framework. It sends prompts to Gemini's API and receives C# source code as responses, then compiles and executes payloads directly in memory:

  • Fileless execution — No disk artifacts for forensic analysis
  • Dynamic generation — Payload functionality generated at runtime
  • Multi-layer obfuscation — AI-generated code complicates static analysis
  • Network evasion — API calls blend with legitimate traffic patterns

COINBAIT represents another evolution: a phishing kit likely accelerated by AI code generation tools. The kit masquerades as a major cryptocurrency exchange and was built using the Lovable AI platform for rapid development.

Social Engineering via AI Platforms

A novel attack vector exploits the public sharing features of generative AI services. Threat actors manipulate AI models to create realistic-looking instructions for common computer tasks, embedding malicious command-line scripts as "solutions."

The technique affects multiple platforms including Gemini, ChatGPT, Copilot, DeepSeek, and Grok. By creating shareable links to AI chat transcripts, attackers use trusted domains to host their initial attack stages.

This approach distributes ATOMIC malware targeting macOS systems. The technique is particularly effective because users trust content hosted on legitimate AI platform domains.

Underground AI Tool Markets

Criminal forums show persistent demand for AI-enabled attack tools. However, most threat actors lack the resources to develop custom models and instead rely on stolen credentials to access commercial platforms.

The toolkit "Xanthorox" exemplifies this trend. Advertised as a custom AI for autonomous malware generation, investigation revealed it actually leverages multiple commercial AI products including Gemini through stolen API keys.

Defensive Countermeasures

Google has implemented several defensive measures against identified threat actors. The company disables accounts and assets associated with malicious activity while strengthening model classifiers to refuse assistance with similar attacks.

Key defensive strategies include:

  • Account termination — Disabling projects and accounts linked to malicious activity
  • Model hardening — Training models to recognize and refuse malicious requests
  • API monitoring — Tracking usage patterns for suspicious behavior
  • Intelligence integration — Applying threat intelligence to improve detection capabilities
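The API-monitoring strategy above can be sketched as a per-account volume baseline. This is a hypothetical simplification (the class name, window sizes, and spike factor are assumptions): an account whose request volume suddenly jumps far above its own historical norm is flagged for review, which would catch both extraction campaigns and stolen-key abuse like the Xanthorox case.

```python
from dataclasses import dataclass, field


@dataclass
class AccountBaseline:
    """Rolling per-account request baseline over recent monitoring windows."""
    history: list[int] = field(default_factory=list)

    def update_and_check(self, requests_this_window: int,
                         spike_factor: float = 5.0,
                         min_history: int = 3) -> bool:
        """Return True if this window's volume spikes far above the account's norm."""
        suspicious = False
        if len(self.history) >= min_history:
            baseline = sum(self.history) / len(self.history)
            suspicious = requests_this_window > spike_factor * max(baseline, 1.0)
        self.history.append(requests_this_window)
        self.history = self.history[-24:]  # retain only the last 24 windows
        return suspicious
```

Flagged accounts would then feed the other countermeasures listed above: tightened access controls, response filtering, and ultimately account termination.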

Why It Matters

While no APT or information operations actors have achieved breakthrough capabilities that fundamentally alter the threat landscape, the integration of AI into attack workflows represents a significant evolution. Enterprise security teams must enhance defenses against AI-augmented social engineering and reconnaissance operations.

The democratization of sophisticated attack capabilities through commercial AI platforms lowers barriers for both state and criminal actors. Organizations should expect increasingly sophisticated phishing campaigns and reconnaissance operations powered by AI assistance.