Audn.AI — CISO Handbook
A field guide for CISOs navigating the AI-agent decade. 28 pages · A4 · print-ready.
Audn.AI
FOR THE OFFICE OF THE CISO
Autonomous Adversarial Validation · AAV · Edition 1

Peace of mind,
in the era of
AI agents.

A field guide for CISOs navigating the AI-agent decade. Twenty-eight pages on where autonomous adversarial validation fits inside your existing stack — and how to prove what attackers can actually do, before a regulator, an auditor, or a headline forces the question.

Volume: Handbook · 28 pages
Published: April 2026
Classification: CISO · Architect · Board
AUDN.AI
FOREWORD
CISO HANDBOOK · EDITION 1
02 / 28
Foreword·01

Your stack tells you what might be wrong.
It cannot tell you what an attacker actually does.

Every CISO we speak with has the same shape of problem. The tooling stack has never been better resourced. Scanners run nightly. Cloud posture is continuously monitored. Detection has fused across endpoints, identity, and cloud. Alerts are routed, enriched, and ticketed.

And yet the question the audit committee asks — "if an adversary attacked us today, what would actually break?" — is still answered with a slide from last quarter's pentest.

That gap is not a tooling failure. It is a category gap. The stack was built to find and detect. Nothing in it was built to prove.

The premise of this handbook

A new layer is forming between Detect & Correlate and Respond & Remediate — a validation layer that converts noisy findings into a defensible, prioritized plan.

Analysts call it Autonomous Adversarial Validation. We'll show you where it lives, why it is not BAS, and how to run it in production today.

What you'll get from this book
1. A Gartner-style category map, end-to-end.
2. Twelve cybersecurity categories, defined clearly.
3. AAV vs BAS on nine concrete dimensions.
4. A six-item failure-mode taxonomy for AI agents.
5. A forensic-readiness audit you can run next week.
The biggest obstacle to investigating an AI failure is not finding the root cause — it is discovering you never captured the data needed to reconstruct what happened.
Chapter 3 · Forensic Readiness
Contents

Twenty-eight pages, eight chapters.

Designed to be read front-to-back in an hour, or lifted section-by-section straight into a board pack.

PART I · LANDSCAPE
01 · Cover
02 · Foreword
03 · Contents (this page)
04 · The modern security stack
05 · Where Audn.AI fits — category map
06 · Gartner-style quadrant
PART II · THE TWELVE CATEGORIES
07 · CSAM · EASM · CSPM · CNAPP
08 · CWPP · CIEM · SSPM · CDR
09 · SIEM · XDR · EDR · SOAR
10 · The stack as five questions
PART III · THE NEW CATEGORY
11 · What is AAV?
12 · Why this category is different
13 · AAV vs BAS — the nine dimensions
14 · How Audn.AI works
15 · Where Audn.AI delivers value
16 · Six proof points from the field
PART IV · GOVERNANCE
17 · Six failure modes every AI-agent CISO should know
18 · Forensic readiness — the seven evidence gates
19 · Regulatory cross-walk (EU AI Act · ISO 42001 · NIST)
20 · Board-ready talking points
PART V · OPERATIONALIZE
21–25 · Rollout plan · Questionnaire · Interview checklist · Document request list · Metrics
26–28 · References · Glossary · Colophon
PART I · LANDSCAPE
Chapter 1·Layer 4 of 5

The modern security stack, in five questions.

Every layer of the enterprise security stack answers one question the one below could not. Read from the bottom up; the questions get harder, and the answers get more expensive to get wrong.

05 · "What should we fix?"
Respond & Remediate · SOAR · ITSM. Automate response and remediation.

04 · "What is actually exploitable?"
Prove & Prioritize · AAV — Audn.AI. You are here.

03 · "What is happening?"
Detect & Correlate · SIEM · XDR · CDR · NDR. Detect and correlate security events.

02 · "What could be wrong?"
Find & Assess · CNAPP · CSPM · EASM · Vulnerability Mgmt. Identify potential risk.

01 · "What assets do we have?"
Know Your Assets · CSAM · ITAM · Asset Inventory. Discover and inventory.
Read the stack as a conversation. Each layer is a question you pay a tool to answer. Layer 04 is the question no category in your stack answers today — which is why this handbook exists.
Chapter 1·Category Map

Where Audn.AI fits in the modern security landscape.

A four-box reading of the stack. Everyone else finds issues. AAV proves which ones matter.

CATEGORY
Prevent & Protect
Identify assets, misconfigurations, vulnerabilities, and exposures.
CNAPP · Palo Alto | CSPM · Wiz | CIEM · MS Entra | EASM · Censys | SSPM · Adaptive Shield | CSAM · Qualys
CATEGORY
Detect & Respond
Monitor, detect, and respond to threats across environments.
SIEM · Splunk | XDR · Defender XDR | EDR · CrowdStrike | SOAR · Cortex XSOAR | CDR · Lacework | NDR · Darktrace
CATEGORY
Assess & Prioritize
Assess risk, prioritize findings, surface potential issues.
VM · Tenable | RISK · BitSight | ASM · Outpost24 | TI · Recorded Future
THE NEW CATEGORY
Validate & Prove Exploitability
Simulate real adversary behavior to validate exploitability, remove false positives, and prioritize what matters.
AAV — AUDN.AI · Autonomous Adversarial Validation
Attack path discovery · Black-box or guided · Exploit, pivot, validate · Continuous proof · False-positive filter · Noise → signal
Chapter 1·Gartner-style Quadrant

Leaders are not the only ones who define a category.

This is a framing device, not analyst placement. It reads the same way every practitioner reads a quadrant: execution maturity on the Y, strategic impact on the X.

[Quadrant figure: execution maturity (Y) vs strategic impact (X). Niche players: TI platforms, VM scanners, compliance, CWPP. Challengers: traditional BAS, deception. Visionaries: EASM, ASM, DSPM, AI-SPM. Leaders: CNAPP, XDR, SIEM, SOAR, and Audn.AI (AAV). Legend: Audn.AI (AAV) · Established leaders · Visionaries · Challengers.]
Visionaries are betting on new categories (DSPM, AI-SPM, EASM) but lack proof density.
Challengers (traditional BAS, deception) execute reliably but have flat strategic ceilings.
Audn.AI sits in leaders because it both executes continuously and answers a question nobody else does.
PART II · THE TWELVE CATEGORIES
Chapter 2·Foundation & Cloud Posture

The twelve cybersecurity categories, defined clearly.

Four per page, for three pages. Each card: what it is, an example vendor, how it differs from AAV.

CSAM
Cybersecurity Asset Management

A system for discovering, inventorying and managing all IT assets — devices, software, cloud resources — to understand exposure and risk.

Example: Qualys CSAM
Answers: What do we have?
vs AAV: Tells you what exists; AAV tells you what's exploitable.
EASM
External Attack Surface Management

Identifies and monitors internet-facing assets so you know what attackers can see and target from the outside.

Example: Censys
Answers: What can attackers see from outside?
vs AAV: EASM maps surface; AAV validates the attack path from that surface.
CSPM
Cloud Security Posture Management

Continuously monitors cloud environments to detect misconfigurations, compliance risks and drift from policy baselines.

Example: Wiz
Answers: Where are my cloud misconfigurations?
vs AAV: Posture-oriented. AAV is exploit-validation oriented.
CNAPP
Cloud-Native Application Protection Platform

Integrated security platform protecting cloud-native applications across development and runtime — combines CSPM, CWPP and CIEM.

Example: Prisma Cloud (Palo Alto)
Answers: What cloud risks and weaknesses exist?
vs AAV: Identifies possible issues; AAV proves which can actually be exploited.
Chapter 2·Workloads, Identity, SaaS

Protection layers beneath the cloud platform.

Workloads run the code. Identities grant the access. SaaS holds the data. Each needs its own posture story.

CWPP
Cloud Workload Protection Platform

Protects workloads — virtual machines, containers, and serverless functions — at runtime, complementing CSPM's static posture view.

Example: Trend Micro Cloud One
Answers: What's happening inside my running workloads?
vs AAV: Runtime telemetry and blocking; AAV drives the attack the workload has to defend.
CIEM
Cloud Infrastructure Entitlement Management

Manages identities and permissions in cloud environments. The goal is least-privilege at scale, and surfacing risky or dormant entitlements.

Example: Microsoft Entra Permissions Mgmt
Answers: Who has access to what — and should they?
vs AAV: Governs entitlements; AAV tests adversarial paths through those entitlements.
SSPM
SaaS Security Posture Management

Secures SaaS applications — Microsoft 365, Salesforce, Workday — by finding misconfigurations, over-permissions and risky third-party integrations.

Example: Adaptive Shield
Answers: Are my SaaS apps securely configured?
vs AAV: Identifies SaaS posture risk; AAV proves attack feasibility and impact.
CDR
Cloud Detection and Response

Detects and responds to threats inside cloud environments using monitoring, behavioural analytics and cloud-native log sources.

Example: Lacework
Answers: What active threats are occurring in my cloud?
vs AAV: Cloud-focused detection; AAV validates exploitability end-to-end.
Chapter 2·Detection & Response

The layers that see what's happening.

Endpoints, logs, cross-domain fusion, orchestrated response. Every CISO owns these four — and every one of them produces alerts AAV validates.

EDR
Endpoint Detection and Response

Real-time monitoring, detection, and response for endpoint devices — laptops, servers, workstations — with rich forensic telemetry.

Example: CrowdStrike Falcon
Answers: What is happening on my endpoints?
vs AAV: Endpoint-scoped; AAV is cross-domain attack validation.
XDR
Extended Detection and Response

Extends EDR by correlating signals across endpoints, identity, cloud, email and network — a unified detection and investigation plane.

Example: Microsoft Defender XDR
Answers: What threats span my domains?
vs AAV: Correlates events; AAV proves the path those events would chain into.
SIEM
Security Information and Event Management

Aggregates and analyses logs from every system in scope — a central nervous system for detection, investigation and compliance reporting.

Example: Splunk Enterprise Security
Answers: What security-relevant events are happening?
vs AAV: Detects signals; AAV validates which of them chain into impact.
SOAR
Security Orchestration, Automation and Response

Automates security workflows and incident-response actions, pulling in data from SIEM, XDR, ticketing and threat-intel tools.

Example: Cortex XSOAR
Answers: How do I operationalize response?
vs AAV: Acts after the decision; AAV sharpens which decisions are worth acting on.
Chapter 2·Recap

Five questions, twelve categories, one gap.

A chart you can paste into a board deck. Read left-to-right; read the gap top-to-bottom.

01 · FOUNDATION · Know assets (CSAM · EASM). Asks "What do we have?" Produces an asset list.
02 · PREVENT · Find & assess (CNAPP · CSPM · CIEM · SSPM · VM). Asks "What could be wrong?" Produces risk findings.
03 · DETECT · Detect & correlate (SIEM · XDR · EDR · CDR · NDR). Asks "What is happening?" Produces alerts & events.
04 · PROVE · AAV (Audn.AI). Asks "What is exploitable?" Produces exploit proof, prioritized.
05 · ACT · Respond (SOAR · ITSM). Asks "Fix what?" Produces tickets and patches.
Noise accumulates from 01 through 03; signal is resolved at 04 and 05.
The gap between 03 Detect and 05 Respond is where exhausted SOC teams live. AAV is the step that converts volume into decisions.
PART III · THE NEW CATEGORY
Chapter 3·Definition

Autonomous Adversarial Validation.

A cybersecurity category that uses AI-driven adversarial simulation to validate what is actually exploitable, reduce false positives, and prioritize real risk — continuously, without needing source code or full internal access.

The mechanic

AI agents think and act like real attackers. They probe, pivot, chain findings, and attempt exploitation end-to-end — black-box (no insider knowledge) or guided (purple-team mode, with environmental priors from your team).

The system does not report "control A blocked technique B." It reports "here is the sequence that would have succeeded, here is the evidence, here is the business impact, and here is why two of your previous CVE findings do not matter."

The reframe
BAS asks "could this be attacked?"
AAV asks "what is exploitable right now, and what is the real impact?"

Six capabilities

01
Autonomous adversary simulation

Black-box or guided, no scripts, no prior pentest handoff.

02
Attack path discovery

Dynamic, end-to-end, not a predetermined playbook.

03
Exploit & pivot validation

Proves what chains together, at what depth, to reach which crown jewel.

04
Hallucination & FP filtering

Only validated exploits make it to your ticket queue.

05
Business-impact correlation

Findings are tagged to data, customers, revenue, or compliance scope.

06
Continuous proof

Always-on, not a quarterly pentest window.

Chapter 3·Why AAV is distinct

Five readings you can send straight into a board pack.

VS CNAPP / CSPM
They identify potential issues.
We prove exploitability.
VS SIEM / XDR
They detect events.
We validate attack paths.
VS BAS (TRADITIONAL)
They run predefined scripts.
We adapt like real attackers.
VS PENTESTING
Humans. Periodic. Expensive.
We are AI-driven, continuous, scalable.
VS VULNERABILITY SCANNERS
They generate lists.
We tell you what actually matters.
Most security programs generate thousands of findings. Very few tell the team what is actually exploitable right now. AAV is the layer that closes that gap.
Chapter 3·Head-to-head

AAV vs traditional Breach & Attack Simulation.

BAS told you whether a scripted technique ran. AAV shows what an adversary would chain together if they had tools, reasoning, and a motive.

Dimension | Traditional BAS | Audn.AI · AAV | Key difference
Objective | Simulate known techniques and test controls. | Prove what attackers can actually exploit. | Validation vs Simulation
Approach | Predefined scripts, TTP libraries, limited logic. | Autonomous AI agents, black-box or guided. | Autonomous vs Scripted
Coverage | Limited paths, predetermined scope. | Full attack surface, dynamic path discovery. | Full vs Limited
Adaptability | Static. Executes what is designed. | Adaptive. Learns, pivots, chains attacks. | Adaptive vs Static
Validation | Success/fail by rule outcome. | Real exploit validation with business-impact context. | Proof vs Rule Match
Output | High false positives, shallow context. | Actionable, prioritized, low false-positive. | Actionable vs Noisy
Human loop | High: setup, tuning, analysis. | Low: human-in-the-loop for strategic guidance. | Assist vs Heavy
Cadence | Periodic (weekly / monthly / quarterly). | Continuous, always-on validation. | Continuous vs Periodic
AI-native | No: rule / logic-based. | Yes: AI-driven planning, execution, evaluation. | AI-Native vs Not
Nine dimensions. One consistent pattern: BAS optimizes for coverage of known techniques. AAV optimizes for coverage of actual exploitability.
9 / 9 DIMENSIONS MOVE TOWARD AAV
Chapter 3·Architecture

How Audn.AI works.

Plug in a live endpoint. Attacker-AI runs end-to-end. Evidence comes back signed.

01
Connect

Point Audn at a live AI endpoint — voice agent, chat agent, web agent, MCP-enabled workflow. No source code required. No agent in your VPC.

BLACK-BOX OR GUIDED
02
Reconnaissance

Attacker models map the surface — intents, tools, guardrails, authority boundaries. Trained on Audn's proprietary 10B+ black-box interaction corpus.

10B+ ATTACK ATTEMPTS
03
Exploit & chain

Autonomous agents try prompt injection, tool misuse, data exfiltration, authority escalation, RAG poisoning. Every chain is recorded end-to-end.

ADAPTIVE PATH DISCOVERY
04
Signed exploit report

Proof-of-exploit with reproduction steps, business-impact correlation, and a cryptographic signature for audit, regulator, and customer security review.

ACCELERATE SECURITY APPROVALS
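The verification side of step 04 can be sketched with standard-library primitives. This is an illustration only: the report fields and the use of an HMAC shared secret are assumptions made for a self-contained example, not Audn.AI's actual format; a production scheme would use asymmetric signatures so an auditor can verify a report without holding the signing key.

```python
import hashlib
import hmac
import json

def canonical_bytes(report: dict) -> bytes:
    """Serialize the report deterministically so the signature is reproducible."""
    return json.dumps(report, sort_keys=True, separators=(",", ":")).encode()

def sign_report(report: dict, key: bytes) -> str:
    """Produce a hex signature over the canonical report bytes."""
    return hmac.new(key, canonical_bytes(report), hashlib.sha256).hexdigest()

def verify_report(report: dict, signature: str, key: bytes) -> bool:
    """Constant-time check that the report has not been tampered with."""
    return hmac.compare_digest(sign_report(report, key), signature)

# Hypothetical report shape, mirroring the chapter's four steps.
report = {
    "endpoint": "https://example.com/voice-agent",
    "finding": "prompt-injection -> tool misuse -> data exfiltration",
    "business_impact": "customer PII in scope",
}
key = b"demo-shared-secret"  # illustration only; real reports would use asymmetric keys
sig = sign_report(report, key)
assert verify_report(report, sig, key)
assert not verify_report({**report, "finding": "tampered"}, sig, key)
```

Any edit to the report body invalidates the signature, which is what makes the evidence usable in an audit or customer security review.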
Coverage

Voice · chat · web agents. Prompt injection, data exfiltration, network misconfig, app vuln, agent misbehavior.

Differentiator

Unlike open-source cyber models (e.g. GPT-5.4-cyber, Claude Mythos), Audn fills the black-box data gap.

Outcome

Signed reports that accelerate enterprise security sign-off and unlock deals with enterprise buyers such as global food-delivery platforms.

Chapter 3·Where Audn.AI delivers value

Six outcomes that make the business case.

01
Reduce false positives

Only validated exploits reach your SOC. Analysts stop triaging list items that were never real.

–90% · noise in the exploit queue
02
Prove real exploitability

Signed exploit reports, reproduction steps, and evidence an auditor can accept without argument.

1 : 1 · finding → proof
03
Prioritize what matters

Business-impact correlation — revenue, data, compliance scope — over generic CVSS gut-check.

Top 5 · exploits per week, not 500
04
Bridge blue team & attacker mindset

Purple-team cadence by default. Guided mode lets your team inject priors and steer depth.

Purple · on a daily loop
05
Continuous validation

Post-deploy, post-update, post-config-change. Not a quarterly window that misses 90% of change events.

24 / 7 · always-on
06
Improve security ROI

Fewer tools wasted on theoretical findings. Faster enterprise deals unlocked by signed reports.

procurement velocity
Move from theoretical security to proven security.
Chapter 3·Field telemetry

Six proof points from the field.

Audn has been in production since January 2026. Here's what the attacker-AI telemetry looks like so far.

10B+
Attack attempts

Autonomous red-team probes executed across voice, chat and web AI endpoints in production.

~40%
Month-over-month growth

Revenue since first B2B sale in January 2026, driven by enterprise security approvals.

Freya
First B2B customer · YC S25

Voice AI startup. First signed exploit report closed procurement in a single week.

Top 5
Global food-delivery CISO

Committed design partner. Validating customer-facing voice agents at multi-region scale.

Zero
Source code required

Black-box by design. Audn attacks from where attackers operate — the outside.

$2M
Pre-seed in progress

Scaling attacker models, expanding enterprise adoption, and shipping the defensive layer.

Positioning, in one line

Audn is the system of record for "what is actually exploitable" in AI — a new category, Autonomous Adversarial Validation, beyond detection and prevention.

PART IV · GOVERNANCE
Chapter 4·Taxonomy

Six failure modes every AI-agent CISO should know.

LLMs fail non-deterministically. They report success when they are wrong. None of the traditional indicators of compromise surface cleanly. This is the taxonomy you need at hand.

01
Hallucination

Model invents ungrounded facts. Silent failure.

How you investigate: Semantic-entropy sampling · context-grounding verification · golden-dataset regression.
02
RAG retrieval failure

Pipeline returns the wrong or missing documents; the LLM answers from a vacuum.

How you investigate: Similarity auditing · chunk boundary analysis · index-version diff.
03
Model drift

Gradual degradation from distribution shift. Yesterday's agent is not today's agent.

How you investigate: Golden-dataset regression · temporal correlation with pipeline changes.
04
Prompt injection

Malicious input bypasses guardrails, leaks system prompts, invokes out-of-policy tools.

How you investigate: Boundary testing · guardrail-bypass replay · system-prompt leakage audit.
05
Guardrail failure

Safety filter does not catch problematic output. Model was fine; filter was wrong.

How you investigate: Rule auditing · adversarial edge-case replay · policy-gap analysis.
06
Agent reasoning failure

Autonomous agent chooses the wrong tool or delegation path. Executes with authority.

How you investigate: Decision-chain reconstruction · tool-invocation audit · authority analysis.
Source frame: The OECD AI Incidents Monitor logged 108 new incidents between November 2025 and January 2026. Six failure modes account for most of them. None is covered by traditional IOC taxonomy.
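One investigation technique named above, golden-dataset regression, is simple enough to sketch. The dataset, the exact-match scoring, and the drift threshold below are all illustrative stand-ins; a production check would use semantic similarity or an LLM judge rather than substring matching.

```python
from typing import Callable

# A golden dataset: prompts with known-good reference answers.
GOLDEN = [
    ("What is our refund window?", "30 days"),
    ("Which plan includes SSO?", "Enterprise"),
]

def regression_pass_rate(agent: Callable[[str], str]) -> float:
    """Fraction of golden prompts the agent still answers correctly.

    Substring match is a stand-in here; real checks would score
    semantic equivalence."""
    hits = sum(1 for prompt, expected in GOLDEN if expected in agent(prompt))
    return hits / len(GOLDEN)

def drift_alarm(rate: float, baseline: float = 1.0, tolerance: float = 0.1) -> bool:
    """Flag drift when the pass rate falls more than `tolerance` below baseline."""
    return rate < baseline - tolerance

# A toy agent that has silently drifted on one answer.
drifted = {
    "What is our refund window?": "We offer a 30 days refund window.",
    "Which plan includes SSO?": "SSO ships in the Pro plan.",
}
rate = regression_pass_rate(lambda p: drifted[p])
assert rate == 0.5
assert drift_alarm(rate)
```

Run the same golden set on a schedule and after every pipeline change; the temporal correlation the taxonomy mentions falls out of comparing pass rates across deploy timestamps.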
Chapter 5·Seven evidence gates

The forensic-readiness checklist.

EU AI Act, Article 73 requires serious-incident reporting within 15 days and a full investigation — with fines up to €15M or 3% of worldwide turnover. Obligations take effect August 2026. If the data does not exist, the methodology cannot save you.

The test

Can you, right now, reconstruct a single inference from last Tuesday — with full prompt chain, retrieval context, agent reasoning trace, and guardrail state? If the answer is no, the audit is the place to start.

01 · Full input / output pairs with timestamps, model version, temperature and token params.
02 · Complete prompt chains — system prompts, user turns, intermediate reasoning steps.
03 · Retrieval context, similarity scores, and source query for every RAG call.
04 · Embedding model version at the moment of retrieval.
05 · Agent action logs: tool calls, reasoning traces, full delegation chain.
06 · Model metadata: fine-tuning provenance, guardrail configuration, deployment version.
07 · User session context: identity, permissions, application context at the moment of call.
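A minimal sketch of an inference record that carries all seven gates, plus a completeness check that reports which gates are still open. The field names are illustrative, not a prescribed schema; the point is that gate coverage becomes a mechanical query once the record exists.

```python
from dataclasses import dataclass, asdict

@dataclass
class InferenceRecord:
    io_pair: dict          # gate 01: input/output, timestamps, model + params
    prompt_chain: list     # gate 02: system prompt, turns, reasoning steps
    retrieval: list        # gate 03: RAG context, similarity scores, query
    embedding_version: str # gate 04: embedding model at retrieval time
    agent_actions: list    # gate 05: tool calls, delegation chain
    model_metadata: dict   # gate 06: provenance, guardrails, deploy version
    session_context: dict  # gate 07: identity, permissions, app context

def open_gates(record: InferenceRecord) -> list:
    """Return the gates for which no evidence was captured."""
    return [name for name, value in asdict(record).items() if not value]

rec = InferenceRecord(
    io_pair={"in": "hi", "out": "hello", "model": "m-1", "temp": 0.2},
    prompt_chain=["system", "user"],
    retrieval=[],            # gate 03 missing: no RAG context logged
    embedding_version="",    # gate 04 missing
    agent_actions=[{"tool": "lookup"}],
    model_metadata={"deploy": "v12"},
    session_context={"user": "u-42", "perms": ["read"]},
)
assert open_gates(rec) == ["retrieval", "embedding_version"]
```

If last Tuesday's inference cannot be rehydrated into a record like this with zero open gates, that is the audit finding.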
Chapter 5·Obligations

The regulatory cross-walk.

Four frameworks. One evidence substrate. If you can pass Audn.AI's forensic checklist, you can satisfy all four.

EU · REGULATION
EU AI Act
Article 73: Serious-incident reporting · 15-day window.
Article 15: Accuracy, robustness, cybersecurity throughout lifecycle.
Article 12: Logging of events enabling traceability.
Penalty: Up to €15M or 3% of worldwide turnover.
Effective: August 2026 for high-risk systems.
ISO · STANDARD
ISO / IEC 42001
Clause 8: Operational controls for AI systems.
Clause 9: Performance evaluation and internal audit.
A.6.2: AI system impact assessment.
A.8.3: Data quality for AI systems.
Scope: AI management system certification.
US · FRAMEWORK
NIST AI RMF 1.0
Govern: Context, authority, accountability.
Map: Risks and impacts in context.
Measure: Trustworthy characteristics & performance.
Manage: Prioritize and respond to mapped risk.
Use: Voluntary · referenced by US federal agencies and 40+ states.
PRACTICE · INDUSTRY
MITRE ATLAS + OWASP LLM Top 10
ATLAS: Adversarial threat landscape for AI systems.
OWASP LLM01: Prompt injection.
OWASP LLM06: Sensitive information disclosure.
OWASP LLM08: Excessive agency.
Fit: Directly attacker-validated by Audn.AI.
The consolidation: every framework above asks the same practical question — can you reconstruct what happened, prove what could have happened, and show what you did about it? That is the AAV evidence substrate in three clauses.
Chapter 5·Language for the board

Board-ready talking points.

Five sentences per category. Written to be paraphrased, not read aloud.

WHAT WE ARE DOING

"We've added a validation layer between detection and response. It continuously simulates real adversary behavior against our AI agents and cloud surfaces, and returns signed evidence of what would actually succeed — not what might succeed."

WHY NOW

"The EU AI Act takes effect in August 2026 with fines up to 3% of turnover. The OECD logged 108 AI incidents in three months. Our current stack can detect events; it cannot prove exploitability, which is what a regulator, an auditor, or a customer security review will ask for."

HOW IT DIFFERS FROM OUR PENTEST BUDGET

"Pentesting is humans, periodic, and narrow in scope. AAV is AI-driven, continuous, and scopes to the full attack surface. Both stay. AAV fills the 49 weeks a year pentesting doesn't."

WHAT WE EXPECT IN THE FIRST 90 DAYS

"A drop in the false-positive queue of roughly ninety percent. A prioritized top-five list of exploitable paths per week, tied to revenue, data, or compliance scope. And a signed exploit report substrate we can hand to customers and regulators."

THE BOTTOM LINE

"We are moving from theoretical security — a long list of maybe-bad-things — to proven security: a short list of definitely-bad-things we are fixing now. That's the commitment."

PART V · OPERATIONALIZE
Chapter 6·90-day rollout

A ninety-day plan you can start on Monday.

01
Days 1–30 · Baseline
  • Pick one customer-facing AI agent — highest business impact, not deepest tech.
  • Connect the live endpoint to Audn in black-box mode.
  • Run the forensic-readiness audit (Ch. 5). Close any open gates.
  • Agree a top-5 crown-jewel list with product and legal.
  • Define a signed report → ticket pipeline into Jira / ServiceNow.
Output: first signed exploit report delivered to SOC.
02
Days 31–60 · Purple loop
  • Switch to guided mode; inject priors from red / blue team.
  • Wire findings into SIEM and XDR as correlation rules.
  • Add two more agents: one internal copilot, one voice.
  • Stand up the board-level exploitability metric (see Ch. 7).
  • Run first tabletop with Audn evidence on the table.
Output: SOC triages exploit-validated findings only.
03
Days 61–90 · Continuous
  • Move Audn into CI/CD: every model or prompt change triggers validation.
  • Expose signed reports in vendor security portal for customer review.
  • Map evidence to EU AI Act Article 73 reporting workflow.
  • Retire one duplicate BAS / scanner contract; use the savings to fund the program.
  • Present the 90-day delta to the board.
Output: audit-ready, continuously validated.
30d · First signed report
60d · SOC on exploit-only queue
90d · CI/CD integration
≤6mo · One retired tool funds the program
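The CI/CD step in days 61–90 can be sketched as a simple gate: the pipeline fails when a validation run returns any exploit-validated, blocking-severity path for the changed agent. The finding shape and severity labels here are assumptions for illustration, not Audn.AI's actual API.

```python
# Hypothetical finding shape returned by a validation run.
FINDINGS = [
    {"path": "prompt-injection -> tool misuse", "validated": True,  "severity": "high"},
    {"path": "verbose error message",           "validated": False, "severity": "low"},
]

BLOCKING = {"high", "critical"}

def gate(findings: list) -> int:
    """Return a CI exit code: nonzero if any validated, blocking-severity
    exploit path exists; unvalidated findings never block a deploy."""
    blockers = [f for f in findings
                if f["validated"] and f["severity"] in BLOCKING]
    for f in blockers:
        print(f"BLOCKED: {f['path']} ({f['severity']})")
    return 1 if blockers else 0

assert gate(FINDINGS) == 1   # one validated high-severity path blocks
assert gate([]) == 0         # clean run passes
```

Wire the return value into the pipeline's exit status so a model or prompt change cannot ship past a proven exploit path.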
Chapter 7·Self-assessment

Twelve questions to score your current state.

Circle 0–5 for each. Total > 40 = audit-ready. 25–40 = gaps. < 25 = start the 90-day plan Monday.

01 · I can list every AI agent in production today. (0 1 2 3 4 5)
02 · I know which of those agents can invoke tools with authority. (0 1 2 3 4 5)
03 · My SOC can tell exploitable from theoretical findings. (0 1 2 3 4 5)
04 · We log full prompt chains for every production inference. (0 1 2 3 4 5)
05 · We log RAG retrieval context and embedding model version. (0 1 2 3 4 5)
06 · We log full agent tool-call and delegation chains. (0 1 2 3 4 5)
07 · We test for prompt injection on every deploy. (0 1 2 3 4 5)
08 · We test for data exfiltration on every deploy. (0 1 2 3 4 5)
09 · We have an answer for EU AI Act Article 73 timelines. (0 1 2 3 4 5)
10 · We can reconstruct a single inference from last Tuesday. (0 1 2 3 4 5)
11 · Customer security reviews take <5 days, not >5 weeks. (0 1 2 3 4 5)
12 · Our exploitability story is boardable without a slide edit. (0 1 2 3 4 5)
Total: ___ / 60
0–24 · Start 90-day plan
25–40 · Close specific gates
41–60 · Audit-ready
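Scoring is mechanical enough to sketch; the bands below mirror the three on this page.

```python
def band(total: int) -> str:
    """Map a 0-60 self-assessment total to the handbook's three bands."""
    if not 0 <= total <= 60:
        raise ValueError("total must be between 0 and 60")
    if total <= 24:
        return "Start 90-day plan"
    if total <= 40:
        return "Close specific gates"
    return "Audit-ready"

scores = [5, 4, 3, 4, 2, 3, 5, 4, 3, 4, 5, 3]  # one 0-5 score per question
total = sum(scores)
assert total == 45
assert band(total) == "Audit-ready"
assert band(24) == "Start 90-day plan"
assert band(40) == "Close specific gates"
```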
Chapter 7·Four roles, twelve questions

The interview checklist.

Print the sheet. Ask everyone the same thing. Write down where the answers disagree. That's where the program actually needs work.

SECURITY ARCHITECT
  1. Draw the stack. Where is the validation layer?
  2. Which tools overlap? Which contract expires first?
  3. What is your plan for AI agent forensics?
SOC LEAD
  1. How many alerts last week? How many were exploitable?
  2. Rank your top 3 sources of false positives.
  3. If AAV fed you exploit-validated findings only, what would you do differently?
AI / ML LEAD
  1. Which production agents have tool authority?
  2. Show me last Tuesday's inference. Full chain.
  3. What's your response if I ask for prompt-injection test results?
LEGAL · COMPLIANCE
  1. Are we in scope for EU AI Act Article 73?
  2. What's our 15-day incident report workflow?
  3. Which customer contracts require signed security evidence?
What "good" looks like
All four agree on scope, the answer to the "last Tuesday" question is yes, and the number of exploit-validated findings per week is small and going down.
Chapter 7·Discovery

Document request list.

Hand this to your GRC team. If they can produce all twenty-one artefacts within a week, you're ahead of the curve.

INVENTORY & SCOPE
01 · AI agent inventory with business owner, tool authority, data scope.
02 · Data-flow diagrams for every customer-facing agent.
03 · Model cards and fine-tune provenance.
04 · Third-party component inventory (embeddings, vector DBs, guardrails).
ARCHITECTURE
05 · System prompts (versioned, dated, owner).
06 · Guardrail configuration and rule set.
07 · Tool catalog with authority matrix.
08 · RAG pipeline spec (chunking, retrieval, re-ranking).
09 · Identity & session-context contract.
10 · Deployment topology (where inference runs, where logs land).
EVIDENCE & TELEMETRY
11 · Prompt-chain log sample (24 h).
12 · RAG retrieval log sample with similarity scores.
13 · Tool-invocation audit log sample.
14 · Guardrail block/allow log sample.
15 · Last 3 incidents: full reconstruction.
GOVERNANCE
16 · AI governance policy.
17 · Incident-response runbook incl. AI-specific flow.
18 · EU AI Act Article 73 escalation map.
19 · Last pentest & red-team reports.
20 · Tabletop exercises from the last 12 months.
21 · Signed exploit reports from AAV · last 30 days.
Chapter 7·What to measure

The six metrics that matter.

Three operational, three board-level. Every one should be trending the right way within a quarter.

Ops
FP-R
False-positive rate

Target ↓ 50% in 60 days.

Ops
MTTR-X
Mean time to remediate an exploitable finding

Target ↓ 40% vs generic MTTR.

Ops
COV
Agent coverage

% of prod agents with continuous AAV. Target 100%.

Board
EXR
Exploitable-risk ratio

Exploitable / total findings. Trend flat or down.

Board
TSR
Time to signed report

Hours from validation to customer-deliverable evidence.

Board
SRV
Security-review velocity

Median days to pass a customer security review.

Expected EXR curve · first 90 days
[Chart: exploitable / total findings, Day 0 through Day 90. Before: all findings, ~46%. After Audn: exploitable only, ~8%.]
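The board-level ratios above are simple computations over a findings log. A sketch, assuming an illustrative record shape (the field names are not a prescribed schema):

```python
findings = [
    {"exploitable": True,  "remediation_hours": 18},
    {"exploitable": False, "remediation_hours": 72},
    {"exploitable": True,  "remediation_hours": 30},
    {"exploitable": False, "remediation_hours": 120},
]

def exr(findings: list) -> float:
    """EXR: exploit-validated findings as a fraction of all findings."""
    return sum(f["exploitable"] for f in findings) / len(findings)

def mttr_exploitable(findings: list) -> float:
    """MTTR-X: mean hours to remediate exploit-validated findings only."""
    hours = [f["remediation_hours"] for f in findings if f["exploitable"]]
    return sum(hours) / len(hours)

def coverage(agents_with_aav: int, total_agents: int) -> float:
    """COV: fraction of production agents under continuous validation."""
    return agents_with_aav / total_agents

assert exr(findings) == 0.5
assert mttr_exploitable(findings) == 24.0
assert coverage(6, 8) == 0.75
```

Recompute weekly; the EXR trend line is the chart the board sees.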
References·What we read

Further reading — and what to cite in a board deck.

GARTNER
Top Security & Risk Management Trends

Why the category is in motion; framing for the board. Pair with the BAS Market Guide for continuity.

FORRESTER
The Rise of Autonomous Security Validation

The clearest third-party articulation of the move from simulation to validation.

MITRE
The Value of Adversarial Emulation in Security · MITRE ATT&CK · MITRE ATLAS

ATLAS is the AI-specific adversary model; map your Audn findings directly onto it.

SANS
Why Breach & Attack Simulation Needs to Evolve · Purple Team Guide

The practitioner case for AAV. Good quotes for procurement pushback.

NIST
AI Risk Management Framework 1.0 · CSF 2.0

The cross-walk you'll need for US federal and most state-level procurement.

EU · OECD
EU AI Act · ISO/IEC 42001 · OECD AI Incidents Monitor

Regulation, management-system standard, incident base-rate. Use all three together.

OWASP
LLM Top 10 · Agent Threat Model

The concrete vulnerability taxonomy Audn attacks against; CISO-readable.

AUDN.AI
Signed exploit reports · customer security portal

Ask for a signed sample report before your next procurement cycle. This is the substrate.

A caveat we'd rather say ourselves: "AAV" is currently a narrative category defined by this handbook and adjacent practitioner literature; it is not yet an established analyst-published category. Treat the frame as a way to organize the category's emerging shape, not as third-party endorsement.
Glossary·Terms of art

Plain-English glossary.

Every acronym and term used in this handbook, defined one way — the way we use it.

AAV
Autonomous Adversarial Validation. Our category.
Agent (AI)
An LLM-driven process with tool authority.
ATLAS
MITRE's adversary model for AI systems.
ATT&CK
MITRE's adversary-technique knowledge base.
BAS
Breach & Attack Simulation. Scripted.
Black-box
Validation with no source or insider access.
CDR
Cloud Detection and Response.
CIEM
Cloud Infrastructure Entitlement Mgmt.
CNAPP
Cloud-Native Application Protection Platform.
CSAM
Cybersecurity Asset Management.
CSPM
Cloud Security Posture Management.
CWPP
Cloud Workload Protection Platform.
EASM
External Attack Surface Management.
EDR
Endpoint Detection and Response.
Exploit chain
A sequence of techniques that together produce impact.
Exploitability
Whether a finding can actually be leveraged end-to-end.
Guardrail
Safety filter applied to an AI output.
Guided (purple)
Validation with priors injected by the blue team.
Hallucination
Ungrounded, confidently stated model output.
IOC
Indicator of Compromise.
ISO 42001
AI management-system standard.
MCP
Model Context Protocol. The agent-tool substrate.
NDR
Network Detection and Response.
NIST AI RMF
US AI Risk Management Framework.
OWASP LLM
Top-10 taxonomy for LLM app risks.
Prompt injection
Attack that bypasses guardrails via input.
RAG
Retrieval-Augmented Generation.
SIEM
Security Info & Event Management.
Signed report
Cryptographically signed exploit evidence.
SOAR
Security Orchestration, Automation, Response.
SSPM
SaaS Security Posture Management.
TTP
Tactic, Technique, Procedure.
VM
Vulnerability Management.
XDR
Extended Detection and Response.
The ask·Close

Prove what
attackers actually
can do.

Audn.AI is the reality layer in cybersecurity. We validate what attackers can do, so your team can fix what matters — continuously, signed, audit-ready.

Start a conversation
audn.ai

Ask for a signed sample exploit report. Share one agent endpoint; see AAV in action within a week.

Next step for CISOs
Run the Chapter 5 audit

Seven evidence gates. If you can answer "yes" to all seven, you are already ahead of EU AI Act obligations.

© Audn.AI · All rights reserved.
audn.ai