ADR-004: AI Validation Layer for Scanner Findings

Status: ✅ Accepted
Date: 2026-03-30
Decision Makers: Tilak Kumar


Context

Traditional DAST (Dynamic Application Security Testing) scanners have notoriously high false positive rates: 30–60% is common in the industry. Every false positive wastes a security engineer's time and erodes trust in the tool.

ThreatWeaver's 59-agent scanner generates raw findings based on heuristics. Without a second-pass validation, many of these findings would be false positives: payloads that returned a 200 status code but weren't actually exploitable, or endpoints that looked vulnerable based on response patterns but weren't.

The options were to build better heuristics (time-consuming, brittle) or use an AI model to reason about evidence.


Decision

Add a Claude-powered AI validation layer between raw finding detection and final finding storage.

Every raw finding goes through an AI validation step before being surfaced to the user:

  1. Agent detects a potential vulnerability (e.g., IDOR: user A can access user B's resource)
  2. Evidence is assembled: request/response pairs, status codes, response diff, business impact
  3. Claude is called with the evidence and asked to rate confidence (0–10) and provide reasoning
  4. Threshold applied: findings below the confidence threshold are discarded or marked as FP
  5. Validated findings are written to the Blackboard and stored in PostgreSQL
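The five steps above can be sketched as a single validation function. This is a minimal illustration, not ThreatWeaver's actual code: the prompt template, the threshold value, and the `call_model` parameter (standing in for the real Claude API call) are all assumptions.

```python
import json
from dataclasses import dataclass

@dataclass
class RawFinding:
    kind: str        # e.g. "IDOR"
    evidence: dict   # request/response pairs, status codes, response diff

# Hypothetical prompt template; the real prompts are tuned per agent type.
PROMPT = (
    "You are validating a potential {kind} vulnerability.\n"
    "Evidence: {evidence}\n"
    'Reply with JSON: {{"confidence": 0-10, "reasoning": "..."}}'
)

CONFIDENCE_THRESHOLD = 7  # assumed value; the real threshold is tunable

def validate(finding: RawFinding, call_model) -> dict:
    """Assemble evidence, ask the model for a confidence score,
    and apply the threshold (steps 2-4 of the pipeline)."""
    prompt = PROMPT.format(kind=finding.kind,
                           evidence=json.dumps(finding.evidence))
    verdict = json.loads(call_model(prompt))
    verdict["is_false_positive"] = verdict["confidence"] < CONFIDENCE_THRESHOLD
    return verdict

# Stubbed model call for illustration; production code would hit the Claude API.
def fake_model(prompt: str) -> str:
    return json.dumps({"confidence": 9,
                       "reasoning": "Response diff shows user B's record."})

finding = RawFinding("IDOR", {"status": 200, "diff": "user B record returned"})
result = validate(finding, fake_model)
```

Findings that pass the threshold (like the one above) would then be written to the Blackboard and PostgreSQL; discarded findings can still be logged for FP-rate tracking.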

Claude models used:

  • Haiku: fast pre-filter for obvious false positives (cheap, low latency)
  • Sonnet: standard validation for most finding types
  • Opus: reserved for critical/complex chains (SSRF, auth bypass chains, BFLA)
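The three-tier model split can be expressed as a small routing function. The model identifiers and the exact set of "complex" finding kinds below are placeholders, not the production configuration:

```python
# Assumed set of finding kinds routed to the strongest model.
COMPLEX_KINDS = {"SSRF", "AUTH_BYPASS_CHAIN", "BFLA"}

def pick_model(kind: str, prefilter: bool = False) -> str:
    """Choose a model tier: Haiku for the cheap pre-filter pass,
    Opus for critical/complex chains, Sonnet for everything else."""
    if prefilter:
        return "claude-haiku"    # fast first pass over obvious false positives
    if kind in COMPLEX_KINDS:
        return "claude-opus"     # deep reasoning for critical chains
    return "claude-sonnet"       # standard validation
```

Routing this way keeps the expensive model off the common path: most findings only ever touch Haiku and Sonnet.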

Consequences

Positive:

  • False positive rate reduced from ~40% (industry average) to ~15% in testing
  • AI provides human-readable reasoning for each finding, so developers understand why it's a vulnerability
  • AI can reason about business context (e.g., "this endpoint is expected to return data for any user" is not a BOLA)
  • Confidence scores give security engineers a triage priority signal
  • Opus audit of scan logic (used during development) caught systemic bugs before they shipped

Negative / Trade-offs:

  • Each scan incurs AI API costs (Claude API pricing per token)
  • AI validation adds latency per finding (100–800ms depending on model)
  • AI can be wrong: hallucinated "reasoning" for borderline cases must be reviewed
  • Prompt engineering for each agent type requires ongoing maintenance
  • AI API rate limits can slow scans when finding volume is high
  • Cost tracking is required to prevent runaway billing on large scans
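The last trade-off, guarding against runaway billing, can be handled with a per-scan budget cap. A minimal sketch (the per-million-token prices here are illustrative, not actual Claude pricing):

```python
class CostTracker:
    """Accumulate estimated token spend for one scan and enforce a hard budget."""

    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, input_tokens: int, output_tokens: int,
               in_price: float, out_price: float) -> None:
        # Prices are USD per million tokens; caller supplies the current rates.
        self.spent_usd += (input_tokens / 1e6) * in_price \
                        + (output_tokens / 1e6) * out_price

    @property
    def exhausted(self) -> bool:
        return self.spent_usd >= self.budget_usd

# One validation call: 200k input tokens, 50k output tokens at assumed rates.
tracker = CostTracker(budget_usd=5.0)
tracker.charge(200_000, 50_000, in_price=3.0, out_price=15.0)
```

When `exhausted` flips to true mid-scan, the validator can fall back to a cheaper tier or queue remaining findings rather than silently overspending.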

Alternatives Considered

| Option | Why Rejected |
| --- | --- |
| Better heuristics only | Heuristics are brittle: every new application framework requires new rules. High ongoing maintenance. FP rate remains high. |
| Human review of all findings | Doesn't scale. 200+ raw findings per scan × multiple scans per day is an unsustainable analyst workload. |
| Third-party AI validation (e.g., OpenAI) | Claude has superior instruction-following for structured JSON output and security reasoning. Anthropic's model cards align with responsible security use. |
| Rule-based FP suppression lists | Covers known patterns but misses novel FPs. Requires constant updating. |
| No validation (ship everything) | Tested during early rounds. Security engineers rejected the tool due to noise. Trust recovery is very hard once lost. |