
Findings Validation

Every finding generated by any of the 58 scanning agents passes through a five-layer validation pipeline before it is reported. This pipeline is what keeps the true positive rate above 94%.

Validation Architecture

Layer 1: Always-Reliable Bypass

Certain vulnerability types have deterministic detection logic that almost never produces false positives. These types bypass the LLM validation entirely to save cost and avoid incorrect LLM rejections:

jwt_none_algorithm -- Algorithm set to "none", server accepted the forged token
jwt_algorithm_confusion -- HS256/RS256 confusion, server accepted the wrong algorithm
jwt_weak_secret -- Dictionary attack found the signing secret
sqli_error -- Database engine error message in response body
sensitive_file_exposed -- Known sensitive file (.env, .git/config, etc.) returned with content

All other finding types proceed to heuristic filtering.
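
The routing decision in this layer can be pictured as a simple set lookup. The following is a minimal sketch, not the actual implementation: the five type names come from the list above, while the constant and function names are hypothetical.

```python
# Finding types listed above as deterministic enough to skip LLM validation.
# The names ALWAYS_RELIABLE_TYPES and bypasses_llm_validation are
# illustrative assumptions, not identifiers from the real code base.
ALWAYS_RELIABLE_TYPES = frozenset({
    "jwt_none_algorithm",
    "jwt_algorithm_confusion",
    "jwt_weak_secret",
    "sqli_error",
    "sensitive_file_exposed",
})


def bypasses_llm_validation(finding_type: str) -> bool:
    """Return True when a finding type is on the always-reliable list."""
    return finding_type in ALWAYS_RELIABLE_TYPES
```

A finding of type `jwt_weak_secret` would skip the LLM call under this sketch, while any type not in the set would continue to heuristic filtering.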

Layer 2: Heuristic Filters (H3-H19)

Heuristic filters are deterministic rules that automatically reject known false positive (FP) patterns. Each rule targets a specific FP pattern identified through analysis of scan results.

Rule	Name	What It Rejects
H3	NoSQL Target SQLi	SQL injection findings when the target uses MongoDB, DynamoDB, or other NoSQL databases -- SQL payloads are meaningless against document databases
H4	Undeclared Params	SQLi findings on parameters that do not exist in the OpenAPI specification -- if the spec declares no such parameter, the server likely ignores it
H4b	Type Validation Errors	SQLi error-based findings caused by framework type validation (e.g., `MethodArgumentTypeMismatchException`, `Invalid UUID`) rather than actual SQL engine errors
H4c	SQLi on File Upload	`sqli_auth_bypass` findings on file-upload endpoints or PUT method -- file operations produce status codes that mimic auth bypass patterns
H4d	Credential Auth-Init	`sensitive_data_credential` findings on authentication-initiation endpoints (login, register, token) -- credentials are expected on these endpoints
H4e	Non-Transactional Race	`race_double_submit` findings on endpoints without transactional side effects -- race conditions only matter on state-changing operations
H4f	Non-LDAP LDAPi	`ldap_injection` findings on targets without LDAP stack presence -- LDAP payloads are meaningless without an LDAP directory
H5	GET Race Conditions	Race condition findings on GET endpoints -- read-only operations have no exploitable side effects
H6	IDOR Collection	IDOR findings on GET endpoints that return all records regardless of the parameter value -- the endpoint is a collection endpoint, not an individual resource
H7	Default Data Exposure	`default_data_exposure` findings triggered only by adding query parameters -- not a real exposure
H8	File Upload on JSON	`file_upload` findings on endpoints that expect JSON or XML bodies rather than multipart form data
H9	Business Logic JWT	Business logic findings on JWT-related endpoints -- these are auth operations, not business logic flaws
H10	Low-Confidence XSS	XSS findings with very low confidence where the agent itself flagged the finding as unverified
H11	Auth Bypass Non-Admin	`AUTH_BYPASS` findings on non-admin endpoints -- auth bypass is only meaningful on privileged resources
H12	Cache Poison No-Cache	`cache_poisoning` findings without evidence that the target actually uses caching (no `Cache-Control`, `X-Cache`, or `Age` headers)
H13	IDOR Non-Exploitable	IDOR variant findings on endpoints where the response data does not change with different IDs
H14	Content Type Confusion	`content_type_confusion` findings when the server returned an error or the content type is expected
H15	Non-Callback Endpoints	`callback_manipulation` on endpoints that do not handle callbacks or webhooks
H16	File Upload Error	`unrestricted_file_upload` findings when the server returned an error response (upload was rejected)
H17	Standard API Response	`default_data_exposure` on standard API response patterns (pagination, metadata, etc.)
H18	XSS in JSON	XSS findings where the server response content type is `application/json` -- JSON is not an HTML execution context
H19	Auth Bypass 401/403	Auth bypass findings where the attack response status code is 401 or 403 -- the auth check actually worked
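
To make the table concrete, three of the simpler rules (H5, H18, H19) can be expressed as predicates over a finding. This is a sketch under stated assumptions: the `Finding` fields, the `category` values, and the rule ordering are hypothetical, and the real rules may inspect more evidence than shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    # Hypothetical shape; the real finding record is not described in this page.
    category: str               # e.g. "race", "xss", "auth_bypass"
    method: str                 # HTTP method of the attack request
    response_status: int        # status code of the attack response
    response_content_type: str  # Content-Type of the attack response


def h5_get_race(f: Finding) -> bool:
    """H5: race conditions on GET endpoints have no exploitable side effects."""
    return f.category == "race" and f.method.upper() == "GET"


def h18_xss_in_json(f: Finding) -> bool:
    """H18: a JSON response is not an HTML execution context."""
    return (f.category == "xss"
            and f.response_content_type.lower().startswith("application/json"))


def h19_auth_bypass_denied(f: Finding) -> bool:
    """H19: a 401 or 403 response means the auth check actually worked."""
    return f.category == "auth_bypass" and f.response_status in (401, 403)


RULES: List[Tuple[str, Callable[[Finding], bool]]] = [
    ("H5", h5_get_race),
    ("H18", h18_xss_in_json),
    ("H19", h19_auth_bypass_denied),
]


def first_rejecting_rule(f: Finding) -> Optional[str]:
    """Return the ID of the first rule that rejects the finding, or None."""
    for rule_id, rejects in RULES:
        if rejects(f):
            return rule_id
    return None
```

Returning the rule ID rather than a bare boolean keeps each rejection traceable to the FP pattern that triggered it.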

Layer 3: Multi-Probe Confirmation

Findings that pass heuristic filtering are replayed with payload variations to confirm exploitability:

Original payload replay -- Resend the exact attack to verify the behavior is reproducible
Payload variation -- Send modified payloads to confirm the vulnerability is not a one-off response quirk
Baseline comparison -- Compare attack response against the stored baseline response to verify differential behavior
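
The three probes above can be sketched as one confirmation function. This is an illustrative simplification: the `send` callable, the `(status, body)` response shape, and the exact-inequality comparison are assumptions, and the page does not state how many variations must succeed or how responses are actually compared.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical response shape: (status code, response body).
Response = Tuple[int, str]


def confirm_finding(
    send: Callable[[str], Response],
    original_payload: str,
    variations: Sequence[str],
    baseline: Response,
) -> bool:
    """Replay the original payload, then variations, against the baseline.

    The finding counts as confirmed here only if the original replay still
    differs from the baseline AND at least one variation also differs.
    That threshold is an assumption made for this sketch.
    """
    # Original payload replay: is the behavior reproducible?
    replay = send(original_payload)
    if replay == baseline:
        return False

    # Payload variation plus baseline comparison: rule out a one-off quirk.
    return any(send(variant) != baseline for variant in variations)
```

In practice a stricter or fuzzier response comparison (status class, body length, error signatures) would likely replace the plain inequality used here.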

Layer 4: AI Validation

The `findingValidator` sends surviving findings to an LLM (Claude or GPT) for semantic analysis. The LLM receives:

The attack request (what payload was sent)
The attack response (what the server returned)
A baseline response (what the server normally returns)
The vulnerability type and a true-positive description explaining what a real positive looks like
Application context from the blackboard (tech stack, target type)

The LLM returns:

Confidence adjustment (may increase or decrease the agent's original confidence)
FP flag (whether the LLM considers this a false positive)
Reasoning (explanation for the decision)

Performance constraints:

Only validates medium, high, and critical severity findings -- low/info findings are auto-passed
Timeout of 25 seconds per validation -- if the LLM times out, the finding passes through unchanged
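
The severity gate and the fail-open timeout can be sketched as follows. The 25-second limit and the validated severities come from the constraints above; the `Finding` and `Verdict` shapes, the `ask_llm` callable, and the clamping of confidence to 0-100 are assumptions made for illustration.

```python
import concurrent.futures
from dataclasses import dataclass, replace
from typing import Callable

LLM_TIMEOUT_SECONDS = 25.0
VALIDATED_SEVERITIES = {"medium", "high", "critical"}


@dataclass(frozen=True)
class Finding:
    # Hypothetical record shape.
    vuln_type: str
    severity: str
    confidence: int
    false_positive: bool = False
    reasoning: str = ""


@dataclass(frozen=True)
class Verdict:
    # Mirrors the three values the LLM returns; field names are assumed.
    confidence_delta: int
    false_positive: bool
    reasoning: str


def validate(
    finding: Finding,
    ask_llm: Callable[[Finding], Verdict],
    timeout: float = LLM_TIMEOUT_SECONDS,
) -> Finding:
    """Apply the LLM verdict, or pass the finding through unchanged."""
    # Low and info findings are auto-passed without an LLM call.
    if finding.severity.lower() not in VALIDATED_SEVERITIES:
        return finding

    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        verdict = pool.submit(ask_llm, finding).result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Fail open: a timed-out validation leaves the finding untouched.
        return finding
    finally:
        # Do not block on a still-running LLM call.
        pool.shutdown(wait=False)

    # Clamping to 0-100 is an assumption based on the confidence score range.
    adjusted = max(0, min(100, finding.confidence + verdict.confidence_delta))
    return replace(
        finding,
        confidence=adjusted,
        false_positive=verdict.false_positive,
        reasoning=verdict.reasoning,
    )
```

Failing open on timeout trades some precision for completeness: a slow LLM response cannot silently drop a potentially real finding.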

Layer 5: Deduplication

The `findingDeduplicator` removes duplicate findings at two levels:

APP_WIDE Deduplication -- For vulnerability types that apply to the entire application (e.g., missing security headers), only one finding is kept per application
Endpoint-Level Deduplication -- For endpoint-specific findings, semantic similarity is used to merge findings that describe the same vulnerability on the same endpoint from different agents
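
The two levels can be illustrated with a key-based sketch. Note the substitution: the endpoint-level step described above uses semantic similarity, whereas this example swaps in an exact `(vuln_type, endpoint)` key to stay short. The type name in `APP_WIDE_TYPES`, the `Finding` fields, and the keep-highest-confidence policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical example of an application-wide vulnerability type.
APP_WIDE_TYPES = {"missing_security_headers"}


@dataclass(frozen=True)
class Finding:
    vuln_type: str
    endpoint: str
    agent: str
    confidence: int


def dedup_key(f: Finding) -> Tuple[str, str]:
    """App-wide types collapse to one key; others are keyed per endpoint."""
    if f.vuln_type in APP_WIDE_TYPES:
        return (f.vuln_type, "")
    return (f.vuln_type, f.endpoint)


def deduplicate(findings: Iterable[Finding]) -> List[Finding]:
    """Keep one finding per key, preferring the highest confidence."""
    best: Dict[Tuple[str, str], Finding] = {}
    for f in findings:
        key = dedup_key(f)
        if key not in best or f.confidence > best[key].confidence:
            best[key] = f
    return list(best.values())
```

With this sketch, two agents reporting the same type on the same endpoint collapse into one finding, and an app-wide type reported on several endpoints is kept once.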

Confidence Tiers

Final findings are assigned a confidence tier based on their score:

Tier	Score Range	Meaning
Confirmed	90-100	Exploitability verified with evidence
High	70-89	Strong indicators, high likelihood of true positive
Medium	50-69	Moderate evidence, may need manual review
Low	30-49	Weak indicators, likely requires manual verification
Informational	0-29	Observations that may warrant investigation
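
The table maps directly onto a threshold lookup. A minimal sketch, assuming integer scores from 0 to 100; the function and constant names are illustrative.

```python
# (lower bound, tier name) pairs from the table above, highest first.
TIERS = [
    (90, "Confirmed"),
    (70, "High"),
    (50, "Medium"),
    (30, "Low"),
    (0, "Informational"),
]


def confidence_tier(score: int) -> str:
    """Map a 0-100 confidence score to its tier name."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, name in TIERS:
        if score >= lower_bound:
            return name
    raise AssertionError("unreachable: the last tier starts at 0")
```

For example, a score of 89 falls in the High tier and a score of 90 in Confirmed.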

Related Pages

Scanner Agents Catalog -- Complete list of all 58 agents
Phase Pipeline -- How agents are orchestrated across six phases
AppSec Overview -- Module overview and architecture
