Audn.AI

FOR THE OFFICE OF THE CISO

Autonomous Adversarial Validation·AAV·Edition 1

Peace of mind,
in the era of
AI agents.

A field guide for CISOs navigating the AI-agent decade. Twenty-eight pages on where autonomous adversarial validation fits inside your existing stack — and how to prove what attackers can actually do, before a regulator, an auditor, or a headline forces the question.

Volume

Handbook · 28 pages

Published

April 2026

Classification

CISO · Architect · Board

AUDN.AI

FOREWORD

CISO HANDBOOK · EDITION 1

02 / 28

Foreword·01

Your stack tells you what might be wrong.
It cannot tell you what an attacker actually does.

Every CISO we speak with has the same shape of problem. The tooling stack has never been better resourced. Scanners run nightly. Cloud posture is continuously monitored. Detection has fused across endpoints, identity, and cloud. Alerts are routed, enriched, and ticketed.

And yet the question the audit committee asks — "if an adversary attacked us today, what would actually break?" — is still answered with a slide from last quarter's pentest.

That gap is not a tooling failure. It is a category gap. The stack was built to find and detect. Nothing in it was built to prove.

The premise of this handbook

A new layer is forming between Detect & Correlate and Respond & Remediate — a validation layer that converts noisy findings into a defensible, prioritized plan.

Analysts call it Autonomous Adversarial Validation. We'll show you where it lives, why it is not BAS, and how to run it in production today.

What you'll get from this book

1A Gartner-style category map, end-to-end. 2Twelve cybersecurity categories, defined clearly. 3AAV vs BAS on nine concrete dimensions. 4A six-item failure-mode taxonomy for AI agents. 5A forensic-readiness audit you can run next week.

“The biggest obstacle to investigating an AI failure is not finding the root cause — it is discovering you never captured the data needed to reconstruct what happened.”

Chapter 3 · Forensic Readiness

AUDN.AI

CONTENTS

CISO HANDBOOK · EDITION 1

03 / 28

Contents

Twenty-eight pages, eight chapters.

Designed to be read front-to-back in an hour, or lifted section-by-section straight into a board pack.

PART I · LANDSCAPE

01Cover01

02Foreword02

03Contents (this page)03

04The modern security stack04

05Where Audn.AI fits — category map05

06Gartner-style quadrant06

PART II · THE TWELVE CATEGORIES

07CSAM · EASM · CSPM · CNAPP07

08CWPP · CIEM · SSPM · CDR08

09SIEM · XDR · EDR · SOAR09

10The stack as five questions10

PART III · THE NEW CATEGORY

11What is AAV?11

12Why this category is different12

13AAV vs BAS — the nine dimensions13

14How Audn.AI works14

15Where Audn.AI delivers value15

16Six proof points from the field16

PART IV · GOVERNANCE

17Six failure modes every AI-agent CISO should know17

18Forensic readiness — the seven evidence gates18

19Regulatory cross-walk (EU AI Act · ISO 42001 · NIST)19

20Board-ready talking points20

PART V · OPERATIONALIZE

21–25Rollout plan · Questionnaire · Interview checklist · Document request list · Metrics21–25

26–28References · Glossary · Colophon26–28

AUDN.AI

PART I · LANDSCAPE

CHAPTER 1 · THE MODERN STACK

04 / 28

Chapter 1·Layer 4 of 5

The modern security stack, in five questions.

Every layer of the enterprise security stack answers one question the one below could not. Read from the bottom up; the questions get harder, and the answers get more expensive to get wrong.

05

"What should we fix?"

Respond & Remediate · SOAR · ITSM

Automate response and remediation

04

"What is actually exploitable?"

Prove & Prioritize · AAV — Audn.AI

You are here

03

"What is happening?"

Detect & Correlate · SIEM · XDR · CDR · NDR

Detect and correlate security events

02

"What could be wrong?"

Find & Assess · CNAPP · CSPM · EASM · Vulnerability Mgmt

Identify potential risk

01

"What assets do we have?"

Know Your Assets · CSAM · ITAM · Asset Inventory

Discover and inventory

Read the stack as a conversation. Each layer is a question you pay a tool to answer. Layer 04 is the question no category in your stack answers today — which is why this handbook exists.

AUDN.AI

PART I · LANDSCAPE

CHAPTER 1 · CATEGORY MAP

05 / 28

Chapter 1·Category Map

Where Audn.AI fits in the modern security landscape.

A four-box reading of the stack. Everyone else finds issues. AAV proves which ones matter.

Leaders are not the only ones who define a category.

This is a framing device, not analyst placement. It reads the same way every practitioner reads a quadrant: execution maturity on the Y, strategic impact on the X.

Audn.AI (AAV) Established leaders Visionaries Challengers

Visionaries are betting on new categories (DSPM, AI-SPM, EASM) but lack proof density.

Challengers (traditional BAS, deception) execute reliably but have flat strategic ceilings.

Audn.AI sits in leaders because it both executes continuously and answers a question nobody else does.

AUDN.AI

PART II · THE TWELVE CATEGORIES

CHAPTER 2 · CATEGORIES 1–4

07 / 28

Chapter 2·Foundation & Cloud Posture

The twelve cybersecurity categories, defined clearly.

Four per page, for three pages. Each card: what it is, who it is, how it differs.

CSAM

Cybersecurity Asset Management

A system for discovering, inventorying and managing all IT assets — devices, software, cloud resources — to understand exposure and risk.

ExampleQualys CSAM

AnswersWhat do we have?

vs AAVTells you what exists; AAV tells you what's exploitable.

EASM

External Attack Surface Management

Identifies and monitors internet-facing assets so you know what attackers can see and target from the outside.

ExampleCensys

AnswersWhat can attackers see from outside?

vs AAVEASM maps surface; AAV validates the attack path from that surface.

CSPM

Cloud Security Posture Management

Continuously monitors cloud environments to detect misconfigurations, compliance risks and drift from policy baselines.

ExampleWiz

AnswersWhere are my cloud misconfigurations?

vs AAVPosture-oriented. AAV is exploit-validation oriented.

CNAPP

Cloud-Native Application Protection Platform

Integrated security platform protecting cloud-native applications across development and runtime — combines CSPM, CWPP and CIEM.

ExamplePrisma Cloud (Palo Alto)

AnswersWhat cloud risks and weaknesses exist?

vs AAVIdentifies possible issues; AAV proves which can actually be exploited.

AUDN.AI

PART II · THE TWELVE CATEGORIES

CHAPTER 2 · CATEGORIES 5–8

08 / 28

Chapter 2·Workloads, Identity, SaaS

Protection layers beneath the cloud platform.

Workloads run the code. Identities grant the access. SaaS holds the data. Each needs its own posture story.

CWPP

Cloud Workload Protection Platform

Protects workloads — virtual machines, containers, and serverless functions — at runtime, complementing CSPM's static posture view.

ExampleTrend Micro Cloud One

AnswersWhat's happening inside my running workloads?

vs AAVRuntime telemetry and blocking; AAV drives the attack the workload has to defend.

CIEM

Cloud Infrastructure Entitlement Management

Manages identities and permissions in cloud environments. The goal is least-privilege at scale, and surfacing risky or dormant entitlements.

ExampleMicrosoft Entra Permissions Mgmt

AnswersWho has access to what — and should they?

vs AAVGoverns entitlements; AAV tests adversarial paths through those entitlements.

SSPM

SaaS Security Posture Management

Secures SaaS applications — Microsoft 365, Salesforce, Workday — by finding misconfigurations, over-permissions and risky third-party integrations.

ExampleAdaptive Shield

AnswersAre my SaaS apps securely configured?

vs AAVIdentifies SaaS posture risk; AAV proves attack feasibility and impact.

CDR

Cloud Detection and Response

Detects and responds to threats inside cloud environments using monitoring, behavioural analytics and cloud-native log sources.

ExampleLacework

AnswersWhat active threats are occurring in my cloud?

vs AAVCloud-focused detection; AAV validates exploitability end-to-end.

AUDN.AI

PART II · THE TWELVE CATEGORIES

CHAPTER 2 · CATEGORIES 9–12

09 / 28

Chapter 2·Detection & Response

The layers that see what's happening.

Endpoints, logs, cross-domain fusion, orchestrated response. Every CISO owns these four — and every one of them produces alerts AAV validates.

EDR

Endpoint Detection and Response

Real-time monitoring, detection, and response for endpoint devices — laptops, servers, workstations — with rich forensic telemetry.

ExampleCrowdStrike Falcon

AnswersWhat is happening on my endpoints?

vs AAVEndpoint-scoped; AAV is cross-domain attack validation.

XDR

Extended Detection and Response

Extends EDR by correlating signals across endpoints, identity, cloud, email and network — a unified detection and investigation plane.

ExampleMicrosoft Defender XDR

AnswersWhat threats span my domains?

vs AAVCorrelates events; AAV proves the path those events would chain into.

SIEM

Security Information and Event Management

Aggregates and analyses logs from every system in scope — a central nervous system for detection, investigation and compliance reporting.

ExampleSplunk Enterprise Security

AnswersWhat security-relevant events are happening?

vs AAVDetects signals; AAV validates which chains into impact.

SOAR

Security Orchestration, Automation and Response

Automates security workflows and incident-response actions, pulling in data from SIEM, XDR, ticketing and threat-intel tools.

ExampleCortex XSOAR

AnswersHow do I operationalize response?

vs AAVActs after the decision; AAV sharpens which decisions are worth acting on.

AUDN.AI

PART II · THE TWELVE CATEGORIES

CHAPTER 2 · RECAP

10 / 28

Chapter 2·Recap

Five questions, twelve categories, one gap.

A chart you can paste into a board deck. Read left-to-right; read the gap top-to-bottom.

The gap between 03 Detect and 05 Respond is where exhausted SOC teams live. AAV is the step that converts volume into decisions.

AUDN.AI

PART III · THE NEW CATEGORY

CHAPTER 3 · WHAT IS AAV

11 / 28

Chapter 3·Definition

Autonomous Adversarial Validation.

A cybersecurity category that uses AI-driven adversarial simulation to validate what is actually exploitable, reduce false positives, and prioritize real risk — continuously, without needing source code or full internal access.

The mechanic

AI agents think and act like real attackers. They probe, pivot, chain findings, and attempt exploitation end-to-end — black-box (no insider knowledge) or guided (purple-team mode, with environmental priors from your team).

The system does not report "control A blocked technique B." It reports "here is the sequence that would have succeeded, here is the evidence, here is the business impact, and here is why two of your previous CVE findings do not matter."

The reframe

BAS asks "could this be attacked?"
AAV asks "what is exploitable right now, and what is the real impact?"

Six capabilities

01

Autonomous adversary simulation

Black-box or guided, no scripts, no prior pentest handoff.

02

Attack path discovery

Dynamic, end-to-end, not a predetermined playbook.

03

Exploit & pivot validation

Proves what chains together, at what depth, to reach what crown-jewel.

04

Hallucination & FP filtering

Only validated exploits make it to your ticket queue.

05

Business-impact correlation

Findings are tagged to data, customers, revenue, or compliance scope.

06

Continuous proof

Always-on, not a quarterly pentest window.

AUDN.AI

PART III · THE NEW CATEGORY

CHAPTER 3 · WHY DIFFERENT

12 / 28

Chapter 3·Why AAV is distinct

Five readings you can send straight into a board pack.

VS CNAPP / CSPM

They identify potential issues.
We prove exploitability.

VS SIEM / XDR

They detect events.
We validate attack paths.

VS BAS (TRADITIONAL)

They run predefined scripts.
We adapt like real attackers.

VS PENTESTING

Humans. Periodic. Expensive.
We are AI-driven, continuous, scalable.

VS VULNERABILITY SCANNERS

They generate lists.
We tell you what actually matters.

“Most security programs generate thousands of findings. Very few tell the team what is actually exploitable right now. AAV is the layer that closes that gap.”

AUDN.AI

PART III · THE NEW CATEGORY

CHAPTER 3 · AAV vs BAS

13 / 28

Chapter 3·Head-to-head

AAV vs traditional Breach & Attack Simulation.

BAS told you whether a scripted technique ran. AAV shows what an adversary would chain together if they had tools, reasoning, and a motive.

Dimension	Traditional BAS	Audn.AI · AAV	Key difference
Objective	Simulate known techniques and test controls.	Prove what attackers can actually exploit.	Validation vs Simulation
Approach	Predefined scripts, TTP libraries, limited logic.	Autonomous AI agents, black-box or guided.	Autonomous vs Scripted
Coverage	Limited paths, predetermined scope.	Full attack surface, dynamic path discovery.	Full vs Limited
Adaptability	Static. Executes what is designed.	Adaptive. Learns, pivots, chains attacks.	Adaptive vs Static
Validation	Success/fail by rule outcome.	Real exploit validation with business-impact context.	Proof vs Rule Match
Output	High false positives, shallow context.	Actionable, prioritized, low false-positive.	Actionable vs Noisy
Human loop	High — setup, tuning, analysis.	Low — human-in-the-loop for strategic guidance.	Assist vs Heavy
Cadence	Periodic (weekly / monthly / quarterly).	Continuous, always-on validation.	Continuous vs Periodic
AI-native	No — rule / logic-based.	Yes — AI-driven planning, execution, evaluation.	AI-Native vs Not

Nine dimensions. One consistent pattern: BAS optimizes for coverage of known techniques. AAV optimizes for coverage of actual exploitability.

9 / 9 CATEGORIES MOVE TOWARD AAV

AUDN.AI

PART III · THE NEW CATEGORY

CHAPTER 3 · HOW IT WORKS

14 / 28

Chapter 3·Architecture

How Audn.AI works.

Plug in a live endpoint. Attacker-AI runs end-to-end. Evidence comes back signed.

01

Connect

Point Audn at a live AI endpoint — voice agent, chat agent, web agent, MCP-enabled workflow. No source code required. No agent in your VPC.

BLACK-BOX OR GUIDED

02

Reconnaissance

Attacker models map the surface — intents, tools, guardrails, authority boundaries. Trained on Audn's proprietary 10B+ black-box interaction corpus.

10B+ ATTACK ATTEMPTS

03

Exploit & chain

Autonomous agents try prompt injection, tool misuse, data exfiltration, authority escalation, RAG poisoning. Every chain is recorded end-to-end.

ADAPTIVE PATH DISCOVERY

04

Signed exploit report

Proof-of-exploit with reproduction steps, business-impact correlation, and a cryptographic signature for audit, regulator, and customer security review.

ACCELERATE SECURITY APPROVALS

Coverage

Voice · chat · web agents. Prompt injection, data exfiltration, network misconfig, app vuln, agent misbehavior.

Differentiator

Unlike open-source cyber models (e.g. GPT-5.4-cyber, Claude Mythos), Audn fills the black-box data gap.

Outcome

Signed reports that accelerate enterprise security sign-off and unlock deals with the likes of food-delivery CISOs.

AUDN.AI

PART III · THE NEW CATEGORY

CHAPTER 3 · VALUE

15 / 28

Chapter 3·Where Audn.AI delivers value

Six outcomes that make the business case.

01

Reduce false positives

Only validated exploits reach your SOC. Analysts stop triaging list items that were never real.

–90%noise in the exploit queue

02

Prove real exploitability

Signed exploit reports, reproduction steps, and evidence an auditor can accept without argument.

1 : 1finding → proof

03

Prioritize what matters

Business-impact correlation — revenue, data, compliance scope — over generic CVSS gut-check.

Top 5exploits per week, not 500

04

Bridge blue team & attacker mindset

Purple-team cadence by default. Guided mode lets your team inject priors and steer depth.

Purpleon a daily loop

05

Continuous validation

Post-deploy, post-update, post-config-change. Not a quarterly window that misses 90% of change events.

24 / 7always-on

06

Improve security ROI

Fewer tools wasted on theoretical findings. Faster enterprise deals unlocked by signed reports.

↑procurement velocity

Move from theoretical security → proven security.

AUDN.AI

PART III · THE NEW CATEGORY

CHAPTER 3 · PROOF POINTS

16 / 28

Chapter 3·Field telemetry

Six proof points from the field.

Audn has been in production since January 2026. Here's what twelve months of attacker-AI telemetry looks like.

10B+

Attack attempts

Autonomous red-team probes executed across voice, chat and web AI endpoints in production.

~40%

Month-over-month growth

Revenue since first B2B sale in January 2026, driven by enterprise security approvals.

Freya

First B2B customer · YC S25

Voice AI startup. First signed exploit report closed procurement in a single week.

Top 5

Global food-delivery CISO

Committed design partner. Validating customer-facing voice agents at multi-region scale.

Zero

Source code required

Black-box by design. Audn attacks from where attackers operate — the outside.

$2M

Pre-seed in progress

Scaling attacker models, expanding enterprise adoption, and shipping the defensive layer.

Positioning, in one line

Audn is the system of record for "what is actually exploitable" in AI — a new category, Autonomous Adversarial Validation, beyond detection and prevention.

AUDN.AI

PART IV · GOVERNANCE

CHAPTER 4 · FAILURE MODES

17 / 28

Chapter 4·Taxonomy

Six failure modes every AI-agent CISO should know.

LLMs fail non-deterministically. They report success when they are wrong. None of the traditional indicators of compromise surface cleanly. This is the taxonomy you need at hand.

01

Hallucination

Model invents ungrounded facts. Silent failure.

How you investigate Semantic-entropy sampling · context-grounding verification · golden-dataset regression.

02

RAG retrieval failure

Pipeline returns the wrong or missing documents; the LLM answers from vacuum.

How you investigate Similarity auditing · chunk boundary analysis · index-version diff.

03

Model drift

Gradual degradation from distribution shift. Yesterday's agent is not today's agent.

How you investigate Golden-dataset regression · temporal correlation with pipeline changes.

04

Prompt injection

Malicious input bypasses guardrails, leaks system prompts, invokes out-of-policy tools.

How you investigate Boundary testing · guardrail-bypass replay · system-prompt leakage audit.

05

Guardrail failure

Safety filter does not catch problematic output. Model was fine; filter was wrong.

How you investigate Rule auditing · adversarial edge-case replay · policy-gap analysis.

06

Agent reasoning failure

Autonomous agent chooses the wrong tool or delegation path. Executes with authority.

How you investigate Decision-chain reconstruction · tool-invocation audit · authority analysis.

Source frame: The OECD AI Incidents Monitor logged 108 new incidents between Nov-25 and Jan-26. Six failure modes account for most of them. None is covered by traditional IOC taxonomy.

AUDN.AI

PART IV · GOVERNANCE

CHAPTER 5 · FORENSIC READINESS

18 / 28

Chapter 5·Seven evidence gates

The forensic-readiness checklist.

EU AI Act, Article 73 requires serious-incident reporting within 15 days and a full investigation — with fines up to €15M or 3% of worldwide turnover. Obligations take effect August 2026. If the data does not exist, the methodology cannot save you.

The test

Can you, right now, reconstruct a single inference from last Tuesday — with full prompt chain, retrieval context, agent reasoning trace, and guardrail state? If the answer is no, the audit is the place to start.

01

Full input / output pairs with timestamps, model version, temperature and token params.

◯

02

Complete prompt chains — system prompts, user turns, intermediate reasoning steps.

◯

03

Retrieval context, similarity scores, and source query for every RAG call.

◯

04

Embedding model version at the moment of retrieval.

◯

05

Agent action logs: tool calls, reasoning traces, full delegation chain.

◯

06

Model metadata: fine-tuning provenance, guardrail configuration, deployment version.

◯

07

User session context: identity, permissions, application context at the moment of call.

◯

AUDN.AI

PART IV · GOVERNANCE

CHAPTER 5 · REGULATORY CROSS-WALK

19 / 28

Chapter 5·Obligations

The regulatory cross-walk.

Four frameworks. One evidence substrate. If you can pass Audn.AI's forensic checklist, you can satisfy all four.

EU · REGULATION

EU AI Act

Article 73Serious-incident reporting · 15-day window.

Article 15Accuracy, robustness, cybersecurity throughout lifecycle.

Article 12Logging of events enabling traceability.

PenaltyUp to €15M or 3% of worldwide turnover.

EffectiveAugust 2026 for high-risk systems.

ISO · STANDARD

ISO / IEC 42001

Clause 8Operational controls for AI systems.

Clause 9Performance evaluation and internal audit.

A.6.2AI system impact assessment.

A.8.3Data quality for AI systems.

ScopeAI management system certification.

US · FRAMEWORK

NIST AI RMF 1.0

GovernContext, authority, accountability.

MapRisks and impacts in context.

MeasureTrustworthy characteristics & performance.

ManagePrioritize and respond to mapped risk.

UseVoluntary · referenced by US federal and 40+ states.

PRACTICE · INDUSTRY

MITRE ATLAS + OWASP LLM Top 10

ATLASAdversarial threat landscape for AI systems.

OWASP LLM01Prompt injection.

OWASP LLM06Sensitive information disclosure.

OWASP LLM08Excessive agency.

FitDirectly attacker-validated by Audn.AI.

The consolidation: every framework above asks the same practical question — can you reconstruct what happened, prove what could have happened, and show what you did about it? That is the AAV evidence substrate in three clauses.

AUDN.AI

PART IV · GOVERNANCE

CHAPTER 5 · BOARD DECK

20 / 28

Chapter 5·Language for the board

Board-ready talking points.

Five sentences per category. Written to be paraphrased, not read aloud.

WHAT WE ARE DOING

"We've added a validation layer between detection and response. It continuously simulates real adversary behavior against our AI agents and cloud surfaces, and returns signed evidence of what would actually succeed — not what might succeed."

WHY NOW

"The EU AI Act takes effect in August 2026 with fines up to 3% of turnover. The OECD logged 108 AI incidents in three months. Our current stack can detect events; it cannot prove exploitability, which is what a regulator, an auditor, or a customer security review will ask for."

HOW IT DIFFERS FROM OUR PENTEST BUDGET

"Pentesting is humans, periodic, and narrow in scope. AAV is AI-driven, continuous, and scopes to the full attack surface. Both stay. AAV fills the 49 weeks a year pentesting doesn't."

WHAT WE EXPECT IN THE FIRST 90 DAYS

"A drop in the false-positive queue of roughly ninety percent. A prioritized top-five list of exploitable paths per week, tied to revenue, data, or compliance scope. And a signed exploit report substrate we can hand to customers and regulators."

THE BOTTOM LINE

"We are moving from theoretical security — a long list of maybe-bad-things — to proven security: a short list of definitely-bad-things we are fixing now. That's the commitment."

AUDN.AI

PART V · OPERATIONALIZE

CHAPTER 6 · ROLLOUT

21 / 28

Chapter 6·90-day rollout

A ninety-day plan you can start on Monday.

01

Days 1–30 · Baseline

Pick one customer-facing AI agent — highest business impact, not deepest tech.
Connect the live endpoint to Audn in black-box mode.
Run the forensic-readiness audit (Ch. 5). Close any gates at ◯.
Agree a top-5 crown-jewel list with product and legal.
Define a signed report → ticket pipeline into Jira / ServiceNow.

Output: first signed exploit report delivered to SOC.

02

Days 31–60 · Purple loop

Switch to guided mode; inject priors from red / blue team.
Wire findings into SIEM and XDR as correlation rules.
Add two more agents: one internal copilot, one voice.
Stand up the board-level exploitability metric (see Ch. 7).
Run first tabletop with Audn evidence on the table.

Output: SOC triages exploit-validated findings only.

03

Days 61–90 · Continuous

Move Audn into CI/CD: every model or prompt change triggers validation.
Expose signed reports in vendor security portal for customer review.
Map evidence to EU AI Act Article 73 reporting workflow.
Retire one duplicate BAS / scanner contract; fund it.
Present the 90-day delta at board.

Output: audit-ready, continuously validated.

30dFirst signed report

60dSOC on exploit-only queue

90dCI/CD integration

≤6moOne retired tool funds program

AUDN.AI

PART V · OPERATIONALIZE

CHAPTER 7 · QUESTIONNAIRE

22 / 28

Chapter 7·Self-assessment

Twelve questions to score your current state.

Circle 0–5 for each. Total > 40 = audit-ready. 25–40 = gaps. < 25 = start the 90-day plan Monday.

01I can list every AI agent in production today.0 1 2 3 4 5

02I know which of those agents can invoke tools with authority.0 1 2 3 4 5

03My SOC can tell exploitable from theoretical findings.0 1 2 3 4 5

04We log full prompt chains for every production inference.0 1 2 3 4 5

05We log RAG retrieval context and embedding model version.0 1 2 3 4 5

06We log full agent tool-call and delegation chains.0 1 2 3 4 5

07We test for prompt injection on every deploy.0 1 2 3 4 5

08We test for data exfiltration on every deploy.0 1 2 3 4 5

09We have an answer for EU AI Act Article 73 timelines.0 1 2 3 4 5

10We can reconstruct a single inference from last Tuesday.0 1 2 3 4 5

11Customer security reviews take <5 days, not >5 weeks.0 1 2 3 4 5

12Our exploitability story is boardable without a slide edit.0 1 2 3 4 5

Total

___ / 60

0–24

Start 90-day plan

25–40

Close specific gates

41–60

Audit-ready

AUDN.AI

PART V · OPERATIONALIZE

CHAPTER 7 · INTERVIEW CHECKLIST

23 / 28

Chapter 7·Four roles, twelve questions

The interview checklist.

Print the sheet. Ask everyone the same thing. Write down where the answers disagree. That's where the program actually needs work.

SECURITY ARCHITECT

Draw the stack. Where is the validation layer?
Which tools overlap? Which contract expires first?
What is your plan for AI agent forensics?

SOC LEAD

How many alerts last week? How many were exploitable?
Rank your top 3 sources of false positives.
If AAV fed you exploit-validated findings only, what would you do differently?

AI / ML LEAD

Which production agents have tool authority?
Show me last Tuesday's inference. Full chain.
What's your response if I ask for prompt-injection test results?

LEGAL · COMPLIANCE

Are we in scope for EU AI Act Article 73?
What's our 15-day incident report workflow?
Which customer contracts require signed security evidence?

What "good" looks like

All four agree on scope, the answer to the "last Tuesday" question is yes, and the number of exploit-validated findings per week is small and going down.

AUDN.AI

PART V · OPERATIONALIZE

CHAPTER 7 · DOCUMENT REQUEST LIST

24 / 28

Chapter 7·Discovery

Document request list.

Hand this to your GRC team. If they can produce all twenty-one artefacts within a week, you're ahead of the curve.

INVENTORY & SCOPE

01AI agent inventory with business owner, tool authority, data scope.

02Data-flow diagrams for every customer-facing agent.

03Model cards and fine-tune provenance.

04Third-party component inventory (embeddings, vector DBs, guardrails).

ARCHITECTURE

05System prompts (versioned, dated, owner).

06Guardrail configuration and rule set.

07Tool catalog with authority matrix.

08RAG pipeline spec (chunking, retrieval, re-ranking).

09Identity & session-context contract.

10Deployment topology (where inference runs, where logs land).

EVIDENCE & TELEMETRY

11Prompt-chain log sample (24 h).

12RAG retrieval log sample with similarity scores.

13Tool-invocation audit log sample.

14Guardrail-block/allow log sample.

15Last 3 incidents: full reconstruction.

GOVERNANCE

16AI governance policy.

17Incident-response runbook incl. AI-specific flow.

18EU AI Act Article 73 escalation map.

19Last pentest & red-team reports.

20Tabletop exercises from the last 12 months.

21Signed exploit reports from AAV · last 30 days.

AUDN.AI

PART V · OPERATIONALIZE

CHAPTER 7 · METRICS

25 / 28

Chapter 7·What to measure

The six metrics that matter.

Three operational, three board-level. Every one should be trending the right way within a quarter.

Ops

FP-R

False-positive rate

Target ↓ 50% in 60 days.

Ops

MTTR-X

Mean time to remediate
exploitable finding

Target ↓ 40% vs generic MTTR.

Ops

COV

Agent coverage

% of prod agents with continuous AAV. Target 100%.

Board

EXR

Exploitable-risk ratio

Exploitable / total findings. Trend flat or down.

Board

TSR

Time to signed report

Hours from validation to customer-deliverable evidence.

Board

SRV

Security-review velocity

Median days to pass a customer security review.

Expected EXR curve · first 90 days

EXPLOITABLE / TOTAL FINDINGS

AUDN.AI

REFERENCES

CHAPTER 8 · REFERENCES

26 / 28

References·What we read

Plain-English glossary.

Every acronym and term used in this handbook, defined one way — the way we use it.

AAV

Autonomous Adversarial Validation. Our category.

Agent (AI)

An LLM-driven process with tool authority.

ATLAS

MITRE's adversary model for AI systems.

ATT&CK

MITRE's adversary-technique knowledge base.

BAS

Breach & Attack Simulation. Scripted.

Black-box

Validation with no source or insider access.

CDR

Cloud Detection and Response.

CIEM

Cloud Infrastructure Entitlement Mgmt.

CNAPP

Cloud-Native Application Protection Platform.

CSAM

Cybersecurity Asset Management.

CSPM

Cloud Security Posture Management.

CWPP

Cloud Workload Protection Platform.

EASM

External Attack Surface Management.

EDR

Endpoint Detection and Response.

Exploit chain

A sequence of techniques that together produce impact.

Exploitability

Whether a finding can actually be leveraged end-to-end.

Guardrail

Safety filter applied to an AI output.

Guided (purple)

Validation with priors injected by the blue team.

Hallucination

Ungrounded, confidently stated model output.

IOC

Indicator of Compromise.

ISO 42001

AI management-system standard.

MCP

Model Context Protocol. The agent-tool substrate.

NDR

Network Detection and Response.

NIST AI RMF

US AI Risk Management Framework.

OWASP LLM

Top-10 taxonomy for LLM app risks.

Prompt injection

Attack that bypasses guardrails via input.

RAG

Retrieval-Augmented Generation.

SIEM

Security Info & Event Management.

Signed report

Cryptographically signed exploit evidence.

SOAR

Security Orchestration, Automation, Response.

SSPM

SaaS Security Posture Management.

TTP

Tactic, Technique, Procedure.

VM

Vulnerability Management.

XDR

Extended Detection and Response.

AUDN.AI

CLOSE

CISO HANDBOOK · EDITION 1

28 / 28

The ask·Close

Prove what
attackers actually
can do.

Audn.AI is the reality layer in cybersecurity. We validate what attackers can do, so your team can fix what matters — continuously, signed, audit-ready.

Start a conversation

audn.ai

Ask for a signed sample exploit report. Share one agent endpoint; see AAV in action within a week.

Next step for CISOs

Run the Chapter 5 audit

Seven evidence gates. If you can answer "yes" to all seven, you are already ahead of EU AI Act obligations.

audn.ai

Peace of mind,in the era ofAI agents.

Your stack tells you what might be wrong. It cannot tell you what an attacker actually does.

Twenty-eight pages, eight chapters.

The modern security stack, in five questions.

Where Audn.AI fits in the modern security landscape.

Leaders are not the only ones who define a category.

The twelve cybersecurity categories, defined clearly.

Protection layers beneath the cloud platform.

The layers that see what's happening.

Five questions, twelve categories, one gap.

Autonomous Adversarial Validation.

The mechanic

Six capabilities

Five readings you can send straight into a board pack.

AAV vs traditional Breach & Attack Simulation.

How Audn.AI works.

Six outcomes that make the business case.

Six proof points from the field.

Six failure modes every AI-agent CISO should know.

The forensic-readiness checklist.

The regulatory cross-walk.

Board-ready talking points.

A ninety-day plan you can start on Monday.

Twelve questions to score your current state.

The interview checklist.

Document request list.

The six metrics that matter.

Further reading — and what to cite in a board deck.

Plain-English glossary.

Prove what attackers actually can do.

Peace of mind,
in the era of
AI agents.

Your stack tells you what might be wrong.
It cannot tell you what an attacker actually does.

Prove what
attackers actually
can do.