Zeitgeist / strand dossier

Slop to Retrenchment to Legitimacy

The field is not rejecting machines. It is rejecting undifferentiated machine effort and inflated security theater. Legitimacy is being rebuilt through gates, evidence, scoped claims, and selective trust.

field → phase → rail → receipt

same-strand line contract

The dossier now descends in the same order as the landing page. Hold the strand fixed, then read posture, dated turn, dominant rail, and receipt before widening back out to signals and source artifacts.

same strand, one line, deeper evidence
  1. 01 · field · field → phase

    active

    What posture is this strand holding right now?

    The field is not rejecting machines. It is rejecting undifferentiated machine effort and inflated security theater. Legitimacy is being rebuilt through gates, evidence, scoped claims, and selective trust.

  2. 02 · phase · phase → rail

    Verification replaces goodwill

    What dated threshold says the posture actually changed?

    Oct 2, 2025

  3. 03 · rail · rail → receipt

    Institutional closure

    What force is bending the line hardest?

    82% intensity

  4. 04 · receipt · receipt → continuity

    Powerful AI-enabled security research capability is being normalized under selective, institutional access rather than open release.

    What receipt earns the read instead of merely styling it?

    Apr 7, 2026 · Anthropic

field summary

What posture this strand is holding

question · What posture is this strand holding right now?

answer now

active

why this posture holds

Synthetic abundance changes intake economics first, then credibility rules, then the social meaning of machine-assisted security work.

confidence · 86%

phase ledger

How the strand turned

2 turns
  1. 01 · Jun 18, 2025

    Noise becomes policy pressure

    Submission fatigue exits anecdote and starts to shape public posture.

  2. 02 · Oct 2, 2025

    Verification replaces goodwill

    Programs tighten around reproduction and evidence instead of assuming human effort implies care.

pressure rail

Forces bending this strand

dominant rail · Institutional closure

Institutional closure

Open intake remains visible, but selective gates are tightening around credibility and access.

82% intensity

Disclosure friction

Submission volume and quality pressure are changing what counts as effort worth reading.

74% intensity

Machine legitimacy

Machine assistance is moving from embarrassment to differentiated craft.

67% intensity

receipt ledger

Source-bearing observations

Close the strand on receipts, not vibes. Each observation resolves to a public artifact and keeps the interpretive claim inspectable.

23 linked observations

Apr 7, 2026 · capability gating

Powerful AI-enabled security research capability is being normalized under selective, institutional access rather than open release.

81% confidence

Frontier security tooling was presented as useful but too sensitive for broad public availability.

source artifact

Project Glasswing: Securing critical software for the AI era

Anthropic · Apr 7, 2026

Frontier offensive security capability was framed as useful but too sensitive for open release.

Apr 7, 2026 · legitimacy shift

Selective access to high-end offensive model capability is starting to gain public legitimacy when framed as time-buying for shared infrastructure defense.

79% confidence

Willison accepts Anthropic's restricted rollout as a reasonable trade-off because modern models are already credibly finding and exploiting serious bugs at scale.

source artifact

Anthropic’s Project Glasswing—restricting Claude Mythos to security researchers—sounds necessary to me

Simon Willison · Apr 7, 2026

A high-trust interpreter treats restricted release as a credible response to autonomous exploit capability, making selective access legible as prudence rather than mere hype.

Mar 31, 2026 · workflow shift

As frontier coding agents surface real kernel vulnerabilities at scale, the bottleneck is shifting from bug discovery to human validation and responsible reporting capacity.

87% confidence

Lynch recounts Carlini finding multiple remotely exploitable Linux bugs with Claude Code while withholding hundreds more candidate crashes until humans can verify them, explicitly to avoid sending maintainers slop.

source artifact

Claude Code Found a Linux Vulnerability Hidden for 23 Years

mtlynch.io · Mar 31, 2026

A close read of Nicholas Carlini's results argues that frontier coding agents are now surfacing remotely exploitable Linux kernel bugs faster than experts can manually validate and responsibly report them.

Mar 26, 2026 · legitimacy shift

AI-assisted findings became credible enough to destabilize the assumption that machine-generated submissions are intrinsically low-signal.

86% confidence

The public conversation moved from junk-report panic to acknowledgment of real utility.

source artifact

AI bug reports went from junk to legit overnight, says Linux kernel czar

The Register · Mar 26, 2026

Machine-assisted findings were increasingly seen as credible rather than merely noisy.

Mar 25, 2026 · capability shift

Leading researchers increasingly describe AI vulnerability research as producing bug-finding leverage that exceeds prior human-only workflows.

85% confidence

Nicholas Carlini and peers discussed modern models as capable of changing the economics and ceiling of vulnerability research rather than merely assisting routine work.

source artifact

AI Finds Vulns You Can’t With Nicholas Carlini

Security Cryptography Whatever · Mar 25, 2026

Credible offensive researchers described AI vulnerability research as crossing from hype into materially new bug-finding capability.

Feb 11, 2026 · evaluation shift

Evaluation legitimacy in AI safety and security remains frame-sensitive, with small prompt changes able to relabel the same trace as a safety issue without adding new evidence.

76% confidence

Across four models, prepending “You are a safety researcher” shifted 25 to 55 percent of classifications from execution or role failures into safety codes, often at the classification stage alone.

source artifact

Under a safety frame: Exploring classification shifts in LLM-as-a-Judge evaluation

lab.fukami.eu · Feb 11, 2026

Adding a five-word safety frame shifted 25–55% of judge classifications into safety codes, suggesting that evaluative legitimacy can move with framing even when traces stay fixed.

Jan 15, 2026 · policy change

Platform operators converted informal frustration about AI slop into explicit procedural gatekeeping.

88% confidence

Submission rules and quality thresholds became more formalized.

source artifact

Bugcrowd Policy Changes to Address AI Slop Submissions

Bugcrowd · Jan 15, 2026

Triage platforms began tightening submission rules in response to low-signal AI-generated volume.

Jan 1, 2026 · workflow shift

Trusted machine-assisted security work is moving beyond detection into bounded remediation loops where agents validate, patch, and confirm findings before humans exercise final judgment.

83% confidence

Ramp describes a detector-manager-validator-fixer pipeline that surfaced novel issues, rejected many false positives, wrote tests, and opened patches with human involvement concentrated at review and merge time.

source artifact

We proactively fixed ~100 security issues in 6 days with 0 humans

Ramp Builders · Jan 1, 2026

Ramp claims agentic security workflows can find, validate, patch, and confirm large batches of real vulnerabilities with human review only at the pull-request boundary, shifting trust toward supervised machine remediation loops.

Jan 1, 2026 · institutionalization shift

AI-led security research is being formalized into institutions that credential machine authorship, require disclosure of human help, and treat reproducible agent workflows as publishable method.

81% confidence

SynSec creates tracks for fully automated papers, mandates human-contribution appendices, publishes AI and human reviews side by side, and makes machine-readable submission infrastructure part of the venue itself.

source artifact

Conference of Synthetic Security Research

SynSec · Jan 1, 2026

A new security venue treats AI-led research as a first-class submission category, requires explicit human contribution disclosure, and even centers AI agents in the formal review loop.

Jan 1, 2026 · institutionalization shift

AI security is consolidating into a practitioner field that explicitly demands shared technical ground between model builders and defenders.

72% confidence

The summit says AI is shipping faster than security teams can evaluate it and positions itself as a place for people building AI and people securing it to work on the same problems with shared context, not generic thought leadership.

source artifact

AI Security Summit - 2026 Season

AI Security Summit · Jan 1, 2026

A dedicated AI security summit argues that builders and defenders now need shared technical ground, treating AI security as an operational field rather than a thought-leadership side topic.

Dec 21, 2025 · market correction

Adversarial-robustness rhetoric is undergoing a credibility correction as prominent practitioners argue that automated red-teaming and guardrail products routinely overclaim what they can secure.

79% confidence

Schulhoff argues that jailbreak discovery is effectively guaranteed on transformer systems and that enterprise value lies more in scoped action design than in promises of broad prompt-injection prevention.

source artifact

The AI Security Industry is Bullshit

Substack · Dec 21, 2025

A leading prompt-injection researcher argues that automated red-teaming and guardrail vendors overstate adversarial-robustness progress, pushing real value toward scoped system design instead of universal jailbreak prevention.

Oct 1, 2025 · exposure shift

Mass-market vibe-coding is already producing a large, patterned vulnerability surface, which means machine-assisted software loses legitimacy quickly when security discipline is absent at the scaffold level.

82% confidence

Escape says it analyzed 5,600 public applications and found over 2,000 vulnerabilities, 400-plus exposed secrets, 175 PII exposures, and recurring Supabase and frontend trust-boundary mistakes concentrated among non-expert builders.

source artifact

Methodology: How we discovered over 2k high-impact vulnerabilities in apps built with vibe coding platforms

Escape · Oct 1, 2025

A large-scale survey of 5,600 public vibe-coded apps found more than 2,000 vulnerabilities, hundreds of exposed secrets, and repeated Supabase and frontend trust-boundary failures, making machine-assisted app generation legible as a mass security surface rather than a novelty.

Aug 16, 2025 · attack surface shift

Coding agents intensify the security stakes of model unreliability because hidden prompt channels become materially more dangerous once the model can write files, execute commands, and act across public repositories on the operator's behalf.

80% confidence

Marcus and Hamiel synthesize Black Hat research showing prompt injection hidden in repositories, README files, rules files, whitespace, and invisible characters, then argue that auto-run agent settings convert those prompt channels into realistic remote-code-execution and supply-chain compromise paths.

source artifact

LLMs + Coding Agents = Security Nightmare

Substack · Aug 16, 2025

A high-visibility warning argues that coding agents turn prompt injection and hidden prompt channels into operational system compromise once they gain tool authority and auto-run privileges.

Aug 1, 2025 · security maturity shift

AI coding-agent security is hardening into an exploit-and-disclosure discipline, where legitimacy now depends on whether vendors can absorb concrete prompt-injection and tool-abuse reports instead of hand-waving them away as abstract safety concerns.

81% confidence

Rehberger says every reviewed coding agent exposed reportable vulnerabilities, notes that some vendors fixed issues rapidly while others stalled or stopped responding, and frames the field as needing concrete exploit publication to cut through insecure marketing claims.

source artifact

The Month of AI Bugs 2025

Embrace The Red · Aug 1, 2025

A month-long responsible-disclosure campaign argues that major coding agents already contain concrete prompt-injection and tool-abuse flaws, while vendors vary widely in how seriously they treat those reports.

Apr 4, 2025 · productization shift

Major labs are turning AI security from a generic assistant promise into a specialized, benchmarked product category with tighter institutional framing and narrower trust claims.

74% confidence

Google presented Sec-Gemini v1 as an experimental cybersecurity model, signaling that credible machine security work is increasingly packaged as scoped expert infrastructure rather than open-ended general chat.

source artifact

Google announces Sec-Gemini v1, a new experimental cybersecurity model

Google Security Blog · Apr 4, 2025

Google is explicitly packaging a frontier model as a bounded cybersecurity specialist, making security competence legible as a productized, institutionally mediated capability rather than a generic chatbot side effect.

Mar 19, 2025 · productization shift

AI security is being formalized as a segmented operational market, where credibility comes from scoped workflows and specialist surfaces rather than generic assistant rhetoric.

75% confidence

Menlo breaks the sector into eight concrete categories including pen testing, anomaly detection, code review, dependency management, compliance automation, and synthetic-content verification, which treats AI security as deployable infrastructure instead of a vague platform promise.

source artifact

AI for Security: Eight Areas of Opportunity

Menlo Ventures · Mar 19, 2025

A major venture firm frames AI security as a set of discrete operational markets, from pen testing and anomaly detection to compliance automation and synthetic-content verification.

Jan 1, 2025 · institutional reaction

Public vulnerability intake channels became materially harder to operate under AI-assisted submission flood conditions.

84% confidence

Low-quality machine-generated bug reports imposed triage costs that changed maintainer posture.

source artifact

Death by a thousand slops

daniel.haxx.se · Jan 1, 2025

Public vulnerability intake was strained by a flood of low-quality AI-assisted submissions.

Jan 1, 2025 · governance model

Cybersecurity is being used as a source domain for formalizing AI governance, testing language, and differentiated evaluation standards.

76% confidence

Microsoft framed cyber risk assessment, public-private coordination, and red-team practice as templates for AI testing and evaluation.

source artifact

AI Testing and Evaluation: Learnings from cybersecurity

Microsoft Research · Jan 1, 2025

Microsoft positioned cybersecurity testing practice as a governance template for AI evaluation, risk language, and public-private coordination.

Jan 1, 2025 · institutionalization shift

AI security is consolidating into a self-conscious practitioner culture that tries to credential legitimacy through demos, short-form technical talks, and explicit rejection of marketing language rather than through generic AI thought leadership.

78% confidence

The conference frames itself as a gathering for professionals actually doing the work, invites offense, threat hunting, program building, and policy, encourages demos over slideware, and says it exists to take AI back from the marketers.

source artifact

[un]prompted - The AI Security Practitioner Conference

Luma · Jan 1, 2025

A volunteer-run AI security conference defines itself against marketing fluff and around sharp talks, real demos, and practitioners actually doing the work across offense, defense, program building, and policy.

Jan 1, 2025 · risk horizon shift

Agentic AI security appears to be in a pre-breach window where exploit classes are already reproducible, but the worst public failures are lagging behind because deployment value and tool integration have not yet become widespread enough to fully attract attackers.

77% confidence

Willison says researchers keep finding new versions of the same agentic vulnerabilities, predicts a headline-grabbing breach within months, and argues that the main reason it has not happened yet is that most developers have not wired these systems into economically valuable projects deeply enough to make exploitation worthwhile.

source artifact

AI's Security Crisis: Why Your Assistant Might Betray You

Screaming in the Cloud · Jan 1, 2025

Simon Willison argues that agentic AI security failures are already easy for researchers to keep rediscovering, but mass exploitation lags until more economically valuable systems are wired deeply enough for attackers to care.

Oct 30, 2024 · capability shift

Serious AI vulnerability research is becoming legible as preemptive defensive infrastructure, where agents find exploitable bugs in real software before release rather than merely generating speculative security output.

84% confidence

Google says Big Sleep found an exploitable SQLite memory-safety flaw before release, frames variant analysis as a promising route to bugs fuzzing misses, and explicitly casts the upside as asymmetric advantage for defenders.

source artifact

From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code

Google Project Zero · Oct 30, 2024

Google Project Zero and DeepMind present agentic variant analysis as a defensive path to finding exploitable real-world bugs before release, including a previously unknown SQLite memory-safety issue missed by existing testing.

Aug 7, 2023 · institutional reaction

Major labs began channeling cybersecurity collaboration through bounded defensive grant programs instead of open-ended offensive capability support.

78% confidence

OpenAI offered rolling support for defensive projects while explicitly excluding offensive-security work and prioritizing public-benefit distribution.

source artifact

OpenAI Cybersecurity Grant Program

OpenAI · Aug 7, 2023

OpenAI framed defensive cybersecurity support as a grant-mediated, explicitly non-offensive channel for public-benefit work.

May 4, 2023 · abundance shift

Open model ecosystems were already compressing frontier advantage, making capability diffusion and cheap iteration central pressures rather than side effects of the leading labs.

83% confidence

The memo argues that open-source communities were matching major labs on customization, privacy, speed, and iteration cadence with dramatically lower cost and weaker release constraints.

source artifact

Google "We Have No Moat, And Neither Does OpenAI"

SemiAnalysis · May 4, 2023

A leaked internal memo argues that cheap open-source iteration is collapsing proprietary advantage and making unrestricted model access harder to contain behind lab-scale capital and policy.
