What goes wrong with AI agents.
A public log of AI agent failures in production. Curated by AgentShield, open to submissions. Every incident is tagged with categories from the AAS Framework.
Report an incident
AgentReport documents AI agent failures in production: cases where an autonomous AI took an action that caused real-world harm. Every submission requires a source URL and a verified email, and is reviewed by the editorial team before publishing.
Do not submit speculation, opinions, unsourced claims, or non-agent incidents. Submissions without working source URLs will be rejected automatically.
Does this incident qualify? (5-criterion AGENT TEST)
For an incident to be published, ALL 5 criteria must be true. If any one is FALSE, the editorial team will reject the submission:
- Autonomous action. The AI executed code/commands/API calls or modified real-world state WITHOUT per-operation human confirmation.
- Agent scope. The AI was operating with multi-step task scope (autonomous loop, scheduled job, named role like "code reviewer" / "deployment agent"). Not a 1-question-1-answer chat.
- Real consequence. The action produced concrete material harm (data deleted, system offline, money lost, legal ruling, real person harmed).
- Agent language in source. The article/report explicitly identifies the AI as "agent" / "coding agent" / "autonomous agent" / "agentic AI" / "AI-powered automation". Not just "AI" / "chatbot" / "ML model".
- Agent failed its mission. The agent took an unintended, unauthorized, or out-of-scope action while doing the task it was deployed to do. Out of scope: the agent SUCCEEDED at an offensive mission (a red team compromising a target, i.e. "AI as weapon"), or the agent was the TARGET of an attack (platform breached or jailbroken by humans, i.e. "AI as victim").
NOT incident-worthy: chatbot conversational harm (e.g. Character.AI tragedies), human misuse of an AI tool (e.g. a lawyer using fake citations from ChatGPT), an AI platform data breach without agent involvement (e.g. the OpenAI Redis cache bug), and generic ML bias or fairness incidents. These belong in a different aggregator.
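
For submitters who want to self-check before filing, the 5-criterion AGENT TEST above can be sketched as a simple all-or-nothing check. This is only an illustration: the class name `AgentTest`, its field names, and the helper functions are hypothetical and are not part of any AgentReport or AgentShield API. The editorial team's actual review process is not described here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AgentTest:
    """Hypothetical record of the five criteria; field names are illustrative."""
    autonomous_action: bool         # acted without per-operation human confirmation
    agent_scope: bool               # multi-step task scope, not a one-off chat
    real_consequence: bool          # concrete material harm occurred
    agent_language_in_source: bool  # source explicitly calls the AI an "agent"
    agent_failed_mission: bool      # unintended or out-of-scope action on its own task


def failed_criteria(test: AgentTest) -> list[str]:
    """Return the names of every criterion that is False."""
    return [f.name for f in fields(test) if not getattr(test, f.name)]


def qualifies(test: AgentTest) -> bool:
    """An incident qualifies only if all five criteria are True."""
    return not failed_criteria(test)


# Hypothetical submission: real harm, but it came from a single chat exchange.
candidate = AgentTest(
    autonomous_action=True,
    agent_scope=False,
    real_consequence=True,
    agent_language_in_source=True,
    agent_failed_mission=True,
)
print(qualifies(candidate))        # False
print(failed_criteria(candidate))  # ['agent_scope']
```

Listing the failed criteria by name, rather than returning only a yes/no answer, makes it easier to see which part of a submission falls short before sending it in.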
Don't end up on this page.
AgentShield's 57 adversarial scenarios cover every AAS category. Catch the next incident before it goes public.
Test your agent →