OWASP Top 10 2025 vs real web pentest findings

The OWASP Foundation published its Top 10:2025 list in late 2025, drawing on more than 2.8 million applications and 175,000 CVE records, the largest dataset the project has ever assembled. For most CISOs and IT directors, that list arrives at the same moment as a question from the board or an external auditor: “We are aligned to the OWASP Top 10, so why does our last pentest still surface things that are not on it?” Reading the list through both lenses, the strategic ranking and what testers actually find, helps answer that question without rewriting the list itself.

This is not a recap of what changed between 2021 and 2025. It is a buyer’s view: where the list and a real engagement disagree, why they disagree, and what that means when scoping a web application security assessment.

What OWASP Top 10:2025 is actually measuring

The Top 10 is a strategic risk-ranking built from aggregate data across millions of applications. Categories are weighted by exploitability, detectability, technical impact, and prevalence — averaged across the entire dataset. The output is a list of categories of risk that the industry as a whole should address, not a list of findings any single application is guaranteed to contain.

That distinction matters. A CISO reading OWASP’s A01 — Broken Access Control page sees the line: “100% of the applications tested were found to have some form of broken access control.” Universal does not mean identical. Broken access control in a banking portal is a horizontal IDOR on a transaction history endpoint; in a health-records SaaS, it is a vertical privilege bypass on an admin role; in a marketing CMS, it is an unauthenticated draft-content leak. The category is always present, but the specific finding is never the same finding twice, which is why the list informs a test; it does not replace one. That is the short answer to “what is the difference between OWASP Top 10 and a web application pentest”: the Top 10 ranks classes of risk across the industry; a pentest characterises how those classes manifest in a specific application’s logic, authentication model, and data flows.

A pentest, by contrast, is an active, attacker-simulated exercise against one application. The output is a list of findings — concrete, exploitable issues — sorted by what the tester could actually do, not by what the industry as a whole tends to suffer.
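To make the horizontal IDOR example concrete, here is a minimal Python sketch. Everything in it (the Transaction type, the in-memory TRANSACTIONS store, and both function names) is hypothetical and stands in for whatever data layer a real banking portal would use. It only illustrates the shape of the flaw a tester looks for: a lookup keyed on a record ID with no ownership check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    tx_id: int
    owner_id: int
    amount_cents: int


# Hypothetical in-memory store standing in for a database table.
TRANSACTIONS = {
    101: Transaction(tx_id=101, owner_id=1, amount_cents=2500),
    102: Transaction(tx_id=102, owner_id=2, amount_cents=9900),
}


def get_transaction_vulnerable(current_user_id: int, tx_id: int) -> Transaction:
    """Looks the record up by ID only, so any logged-in user can read any record."""
    return TRANSACTIONS[tx_id]


def get_transaction_checked(current_user_id: int, tx_id: int) -> Transaction:
    """Same lookup, but rejects records the caller does not own."""
    tx = TRANSACTIONS[tx_id]
    if tx.owner_id != current_user_id:
        raise PermissionError("transaction does not belong to the current user")
    return tx
```

In this sketch, user 1 requesting transaction 102 succeeds against the first function and raises PermissionError against the second. A scanner sees two endpoints that both return well-formed responses; only someone who knows which user owns which record can tell them apart.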

Where the tactical finding frequency diverges

This is where most reports surprise the buyer. Cobalt published a comparison of the OWASP Top 10:2025 with real-world pentest data in November 2025, drawn from thousands of engagements on its platform. The headline: “Cross-site scripting (XSS) remains our number one most frequent finding, accounting for 18.4% of all web vulnerabilities.”

XSS sits inside A05 (Injection) on the 2025 strategic list: important, but not at the top. The tactical frequency table flips the order. Why does XSS appear more often in real pentests than the OWASP ranking suggests? OWASP weights by impact times prevalence across all applications; XSS tops the per-engagement frequency table because it is exploitable in almost every application that handles user input, even when its per-incident impact is moderate. Broken access control behaves the same way: A01 is everywhere because every application enforces some access boundary, and every application gets some part of it wrong.
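A short sketch shows why XSS turns up wherever user input is echoed back. The two render functions below are hypothetical and not taken from any framework; the only real library call is html.escape from the Python standard library. Most real applications would rely on a template engine's auto-escaping rather than escaping by hand, so treat this as an illustration of the principle only.

```python
import html


def render_search_unsafe(query: str) -> str:
    """Echoes the query into HTML as-is; markup in the query becomes live markup."""
    return f"<p>Results for: {query}</p>"


def render_search_escaped(query: str) -> str:
    """Escapes HTML metacharacters so the query is rendered as text."""
    return f"<p>Results for: {html.escape(query)}</p>"


payload = "<script>alert(1)</script>"
unsafe_page = render_search_unsafe(payload)
escaped_page = render_search_escaped(payload)
```

Every place an application builds output the first way is a candidate finding, which is one reason the per-engagement count for XSS runs high even when each individual instance is moderate in impact.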

Read together, the implications for a buyer are practical:

The list tells you which categories matter strategically. Cryptographic failures, supply-chain weaknesses, software integrity — these belong on a strategic risk register because their impact is severe and they are systemic rather than per-application.
The pentest tells you which findings exist in your code today. XSS, broken access control, and authentication weaknesses dominate the actual finding count because they live in application-specific code and business logic, and both vary application by application.
A finding that is rare in the frequency data can still be the highest-severity issue in your specific report. A single broken access control flaw on an admin endpoint outranks a dozen reflected XSS issues, even though the latter is more “frequent” overall.
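The last point above can be sketched in a few lines of Python. The findings and scores below are invented for illustration and are not drawn from Cobalt, OWASP, or any real report; the function names are likewise hypothetical. The sketch ranks the same finding list two ways: by how often a category appears, and by the worst score seen in each category.

```python
from collections import Counter

# Hypothetical findings as (category, CVSS-style score). Not real engagement data.
FINDINGS = [
    ("reflected XSS", 5.4),
    ("reflected XSS", 5.4),
    ("reflected XSS", 6.1),
    ("broken access control", 9.1),
    ("weak session timeout", 3.7),
]


def rank_by_frequency(findings):
    """Orders categories by how many findings fall into each one."""
    counts = Counter(category for category, _ in findings)
    return [category for category, _ in counts.most_common()]


def rank_by_severity(findings):
    """Orders categories by the highest score observed in each one."""
    worst = {}
    for category, score in findings:
        worst[category] = max(score, worst.get(category, 0.0))
    return sorted(worst, key=worst.get, reverse=True)
```

With this sample data the frequency ranking starts with reflected XSS and the severity ranking starts with broken access control. A report ordered the first way and a report ordered the second way describe the same engagement but imply different remediation priorities.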

What human testers find that the list does not rank

The OWASP categories are deliberately broad. They have to be, to cover millions of applications. That breadth is also why a list-driven scan often misses the year’s most interesting attack primitives.

PortSwigger’s Top 10 Web Hacking Techniques of 2025 catalogues the research the practitioner community judged most novel last year: 63 nominated pieces, narrowed to a ranked top 10. The standout classes for 2025 included blind server-side template injection surfaced through “successful errors”, SSRF redirect-loop chains that defeat allow-list filters, and ORM-layer leaking that exposes data through query reflection rather than direct injection.

None of those classes is a new OWASP category. Each maps loosely into one or two existing ones (injection, security misconfiguration, or broken access control, which absorbed SSRF in the 2025 list), but the technique is what makes the finding exploitable, and techniques are what automated scanners miss. A scanner looks for known signatures; a human tester reads the application, builds a hypothesis about how the framework, the ORM, or the SSO flow behaves under unusual input, and then probes. The Cobalt frequency data and the PortSwigger technique catalogue describe the two halves of the same gap: scanners find the common categories cheaply; humans find the exploitable specifics that move a finding from “informational” to “critical”.

That is the case for human-led penetration testing in plain terms – not “our methodology is best”, just: someone has to read your application, not just scan it, if the goal is to know what an attacker would actually do.

What the gap means when scoping a web app engagement

The practical question for a buyer is no longer “does our pentest cover the Top 10”. It is “does our web app pentest scope match how this application is actually built and used”. Three scoping decisions follow directly from the gap:

Tie the scope to the application’s authentication and authorisation model, not to a generic checklist. A pentest that allocates a fixed percentage of effort to each OWASP category misses the point. If the application has three user roles and a complex sharing model, A01 (broken access control) deserves a disproportionate share of test time. The Cobalt frequency data points the same way: XSS is the single most frequent finding in real engagements, and access control flaws surface wherever an application enforces a boundary.
Re-test on the architecture change, not just on the calendar. How often should a web application be pentested? Annual is the floor most regulators expect, but the OWASP data is aggregate, and your application’s risk profile shifts with each significant release or architecture change. A re-test scoped to the deltas — new endpoints, new auth flows, new third-party integrations — is more useful than a yearly full-scope replay.
Make remediation outputs developer-shaped. The strongest engagements pair the finding list with secure code review inputs that developers can act on in the same sprint. A pentest that ends at a PDF and a CVSS score rarely closes the gap; one that hands engineering reproducible test cases and remediation guidance does. This is also the bridge between PCI DSS web application testing obligations and engineering cadence — the auditor wants the test, the engineering team needs the patch path.
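The second scoping decision, re-testing the deltas, can be sketched as a comparison of two endpoint inventories. This is a minimal illustration under stated assumptions: the inventory format (a mapping from "METHOD /path" to an auth requirement label) and the function name retest_scope are both hypothetical, and a real inventory would more likely come from an OpenAPI document or route table than from hand-written dictionaries.

```python
def retest_scope(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compares two endpoint inventories and lists what a delta re-test should cover."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    # Endpoints present in both releases whose auth requirement has changed.
    auth_changed = sorted(
        ep for ep in current.keys() & previous.keys() if current[ep] != previous[ep]
    )
    return {"added": added, "auth_changed": auth_changed, "removed": removed}


# Hypothetical inventories for two releases of the same application.
release_1 = {
    "GET /accounts": "user",
    "GET /admin/users": "admin",
    "POST /login": "none",
}
release_2 = {
    "GET /accounts": "user",
    "GET /admin/users": "user",    # auth requirement loosened
    "POST /login": "none",
    "POST /shares": "user",        # new endpoint
}

scope = retest_scope(release_1, release_2)
```

Here the re-test scope contains the new POST /shares endpoint and the GET /admin/users endpoint whose requirement changed, while the unchanged endpoints drop out. A diff like this only narrows where to look; deciding whether the change is exploitable remains the tester's job.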

The same logic applies to a buyer reviewing a vendor’s draft scope of work, or scoping a security assessment for a new third-party application before it goes into production. A scope built around a checklist tells you the categories were considered. A scope built around the application’s actual logic tells you the report will be useful.

Where this leaves the buyer

The OWASP Top 10:2025 is a better strategic list than its predecessors — broader data, clearer category definitions, a more honest treatment of supply-chain and integrity risks. It belongs on the audit checklist, on the risk register, and in every developer onboarding. It is not, on its own, an answer to “Is this application safe to deploy?”.

CTDefense delivers web application penetration testing for organisations in finance, technology, and other regulated sectors whose teams have read the list and want the version of the test that matches their application — not the version that matches the checklist. Similar organisations are encouraged to look at the gap between their last pentest’s findings and the strategic list as a useful signal, not as a contradiction. Both views matter. A well-scoped human-led engagement is what closes the distance between them.