TY - JOUR
T1 - Patterns of LLM Weaponization: A Comparative Analysis of Exploitation Incidents Across Commercial AI Systems
AU - Antoniou, George
PY - 2025/12/31
Y1 - 2025/12/31
N2 - This comparative study examines patterns of Large Language Model (LLM) weaponization through systematic analysis of four major exploitation incidents spanning 2023-2025. While existing research focuses on isolated incidents or theoretical vulnerabilities, this study provides one of the first comprehensive comparative frameworks analyzing exploitation patterns across state-sponsored cyber-espionage (Anthropic Claude incident), academic security research (GPT-4 autonomous privilege escalation), social engineering platforms (SpearBot phishing framework), and underground criminal commoditization (WormGPT/FraudGPT ecosystem). Through comparative analysis across eight dimensions (adversary sophistication, target selection, exploitation techniques, autonomy levels, detection evasion, attribution challenges, defensive gaps, and capability democratization), this research identifies critical cross-case patterns informing defensive prioritization. Findings reveal three universal exploitation mechanisms transcending adversary types: autonomous goal decomposition via chain-of-thought reasoning (present in all four cases), dynamic tool invocation and code generation (3/4 cases), and adaptive social engineering (4/4 cases). Analysis demonstrates progressive capability democratization: state-level sophistication (Claude: 80-90% autonomy) transitioning to academic accessibility (GPT-4: 33-83% success rates), specialized criminal tooling (SpearBot: generative-critique architecture), and mass commoditization (WormGPT: $200-1,700/year subscriptions). Comparative findings identify four cross-cutting defensive imperatives applicable regardless of adversary type: multi-turn conversational context monitoring, behavioral fingerprinting distinguishing legitimate from malicious complex workflows, federated threat intelligence enabling rapid cross-organizational learning, and capability-based access controls proportional to LLM reasoning sophistication.
AB - This comparative study examines patterns of Large Language Model (LLM) weaponization through systematic analysis of four major exploitation incidents spanning 2023-2025. While existing research focuses on isolated incidents or theoretical vulnerabilities, this study provides one of the first comprehensive comparative frameworks analyzing exploitation patterns across state-sponsored cyber-espionage (Anthropic Claude incident), academic security research (GPT-4 autonomous privilege escalation), social engineering platforms (SpearBot phishing framework), and underground criminal commoditization (WormGPT/FraudGPT ecosystem). Through comparative analysis across eight dimensions (adversary sophistication, target selection, exploitation techniques, autonomy levels, detection evasion, attribution challenges, defensive gaps, and capability democratization), this research identifies critical cross-case patterns informing defensive prioritization. Findings reveal three universal exploitation mechanisms transcending adversary types: autonomous goal decomposition via chain-of-thought reasoning (present in all four cases), dynamic tool invocation and code generation (3/4 cases), and adaptive social engineering (4/4 cases). Analysis demonstrates progressive capability democratization: state-level sophistication (Claude: 80-90% autonomy) transitioning to academic accessibility (GPT-4: 33-83% success rates), specialized criminal tooling (SpearBot: generative-critique architecture), and mass commoditization (WormGPT: $200-1,700/year subscriptions). Comparative findings identify four cross-cutting defensive imperatives applicable regardless of adversary type: multi-turn conversational context monitoring, behavioral fingerprinting distinguishing legitimate from malicious complex workflows, federated threat intelligence enabling rapid cross-organizational learning, and capability-based access controls proportional to LLM reasoning sophistication.
KW - Large language models
KW - Comparative analysis
KW - Cyber exploitation patterns
KW - LLM weaponization
KW - Autonomous agents
KW - Capability democratization
KW - Underground AI
KW - Defensive frameworks
U2 - 10.55549/ijasse.50
DO - 10.55549/ijasse.50
M3 - Article
SN - 3023-5510
VL - 3
SP - 125
EP - 146
JO - International Journal of Academic Studies in Science and Education (IJASSE)
JF - International Journal of Academic Studies in Science and Education (IJASSE)
IS - 2
ER -