    Fixing vulnerability data quality requires fixing the architecture first

    By admin, April 13, 2026

    In this Help Net Security interview, Art Manion, Deputy Director at Tharros, examines why vulnerability data across repositories stays inconsistent and hard to trust.

    The problem starts with systems that were not designed to collect or manage that data well. Manion introduces the idea of Minimum Viable Vulnerability Enumeration (MVVE), a minimum set of assertions needed to confirm that two systems describe the same vulnerability, and reports that no true minimum exists: the assertions needed vary by case and change over time. Manion argues that before writing new specifications or building new tools, the community needs shared terms and principles, and that metrics like CVSS scores often distract from the harder work of assessing actual risk in context.

    When two repositories disagree about whether a patch fixes a vulnerability, is that a data quality problem, a governance problem, or a definitional problem? And does the distinction matter for how we fix it?

    This is likely all three problems, in some combination, and with some degree of overlap. The set of vulnerable and fixed software products and versions may be inaccurate or incomplete. Governance may not adequately detect and resolve such inaccuracies. Definitions, vocabulary, and grammar are neither strict enough nor shared widely enough. One of the principles we’re proposing is that vulnerability record quality is an architecture problem before it is a data problem. We should not expect high-quality data from a system that is not designed to collect, manage, and convey it in the first place.

    Producers and consumers of vulnerability information have a variety of skills, experience, and knowledge. Perhaps more importantly, both producers and consumers have unequal access to information, and both the information and access to it change over time. Another principle is that vulnerability records must be managed over time. The system must accommodate and even encourage change so that records evolve with our understanding. We must accept that incomplete information and legitimate disagreement are permanent features of the landscape and manage records accordingly.

    What is the minimal set of assertions that would allow two independent systems to confirm they are talking about the same vulnerability, without either system having to trust the other’s authority?

    We set out to define this minimal set (MVVE) and found that there likely is no such minimum. There are shared elements, such as specifying an affected (software) product, identifying the conditions under which exploitation would succeed, and naming one or more compromised security properties. Beyond that, the number and type of assertions needed to deduplicate and disambiguate a vulnerability vary.

    We expect a variable set of assertions, over time, both within and across repositories. But sorting out lists and types of assertions is secondary. As we’ve been researching the problems with vulnerability data, we found that we first need a better foundation of shared terms and concepts; only then can we construct conventions for proper assertions.

    If you strip away the severity scores, the advisory prose, the CWE assignments, and the affected product lists, what is left? Is what remains sufficient to anchor cross-repository deduplication, or does stripping it down expose a void?

    There is no doubt that our existing record formats have opportunities for improvement. We did not start out with “What do we have?” but instead asked “What do we need?” This led to “What vulnerability management tasks and decisions do vulnerability records support?” One of the most critical initial tasks is identification of the affected software products. Recent research showed that “Naming inconsistencies were identified in 50.18% of vendor names used in CPEs within the official NVD database.” If we can’t accurately identify affected products, the rest doesn’t matter. (CPE isn’t unique in this failure mode; other software identification systems suffer similarly.)

    One challenge in measuring the quality of something like a CPE assertion is that it may look perfectly reasonable to a human, but it fails to properly identify software, especially in terms of automation and machine usability. Telling the difference between identifiers and software products requires significant manual effort. Reducing or even eliminating that manual effort starts in the architecture and design of the identification system and its assertions.
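
To make the point about machine usability concrete, here is a minimal Python sketch of how vendor naming variants might be surfaced from a list of CPE 2.3 formatted strings. The sample CPE names, the `normalize` heuristic, and the helper names are all hypothetical and are not taken from the interview or from NVD tooling. The parser is deliberately naive: it splits on `:` and does not handle escaped colons.

```python
import re
from collections import defaultdict


def cpe_vendor(cpe: str) -> str:
    """Return the vendor field of a CPE 2.3 formatted string.

    Assumption: a naive split on ':'; escaped colons are not handled.
    """
    parts = cpe.split(":")
    if len(parts) < 5 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return parts[3]


def normalize(vendor: str) -> str:
    """Hypothetical heuristic: lowercase and drop everything but letters and digits."""
    return re.sub(r"[^a-z0-9]", "", vendor.lower())


def vendor_variants(cpes):
    """Group vendor spellings that collapse to the same normalized key.

    Only keys with more than one distinct spelling are returned.
    """
    groups = defaultdict(set)
    for cpe in cpes:
        vendor = cpe_vendor(cpe)
        groups[normalize(vendor)].add(vendor)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Made-up sample data for illustration only.
sample = [
    "cpe:2.3:a:example_corp:widget:1.0:*:*:*:*:*:*:*",
    "cpe:2.3:a:examplecorp:widget:1.1:*:*:*:*:*:*:*",
    "cpe:2.3:a:other_vendor:tool:3.0:*:*:*:*:*:*:*",
]

print(vendor_variants(sample))
```

Each string in the sample looks reasonable to a human reader, yet an exact-match lookup would treat `example_corp` and `examplecorp` as different vendors. A heuristic like this only flags candidates; deciding whether two spellings really name the same vendor is the manual effort the passage describes.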

    Once we have designed to capture the right level of detail, we then need to focus on our ability to trust the information. Another principle we will be discussing: “Every assertion in a record must be able to answer for itself.” So when we do record assertions, they need to be simple, precise, observable, useful, and include provenance. The new vulnerability record is a collection of assertions, growing (and changing) over time, bound to a vulnerability identifier, and machine-usable. These assertions describe the vulnerability, enabling identification, deduplication, and vulnerability management, and they need to be independently verifiable and refutable.
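
One way to picture a record as a growing collection of assertions with provenance is the following Python sketch. It is an illustration under stated assumptions, not a format proposed in the interview: the class names, field names, the identifier `EXAMPLE-0001`, and the sources are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Assertion:
    subject: str           # what the claim is about, e.g. "fixed_version"
    value: str             # the claimed value
    source: str            # provenance: who makes the claim
    evidence: str          # provenance: what a third party could check
    asserted_at: datetime  # when the claim was made


@dataclass
class VulnerabilityRecord:
    vuln_id: str
    assertions: list[Assertion] = field(default_factory=list)

    def add(self, assertion: Assertion) -> None:
        # Append only: earlier assertions are kept as history, never overwritten.
        self.assertions.append(assertion)

    def latest(self) -> dict[tuple[str, str], Assertion]:
        """Most recent assertion for each (subject, source) pair."""
        newest: dict[tuple[str, str], Assertion] = {}
        for a in self.assertions:
            key = (a.subject, a.source)
            if key not in newest or a.asserted_at > newest[key].asserted_at:
                newest[key] = a
        return newest

    def disagreements(self) -> dict[str, set[str]]:
        """Subjects on which the sources' latest assertions differ."""
        values: dict[str, set[str]] = {}
        for (subject, _source), a in self.latest().items():
            values.setdefault(subject, set()).add(a.value)
        return {s: v for s, v in values.items() if len(v) > 1}


t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2026, 2, 1, tzinfo=timezone.utc)

record = VulnerabilityRecord("EXAMPLE-0001")
record.add(Assertion("fixed_version", "2.1", "vendor-advisory", "release notes", t0))
record.add(Assertion("fixed_version", "2.2", "independent-researcher", "retest of 2.1", t0))
before = record.disagreements()  # the two sources disagree on the fix

record.add(Assertion("fixed_version", "2.2", "vendor-advisory", "revised advisory", t1))
after = record.disagreements()   # the vendor's newer assertion resolves it
```

The design choice worth noting is that disagreement is represented rather than hidden: both claims stay in the record with their sources, and a later assertion supersedes an earlier one from the same source without deleting it.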

    Metrics-driven incentives, whether response time, coverage counts, or CVSS throughput, have a distorting effect on record quality. What specific distortions do you observe most frequently, and are any of them invisible to the people producing them?

    Quantity is not quality. That is not to say that coverage and counts aren’t useful, but having a CVSS Base score or a CWE ID in a vulnerability record, for example, doesn’t mean that information is accurate, precise, or even useful. There’s a tendency to focus on the things you are measuring, simply because you are measuring them. The measurements themselves can become the goal.

    Consider CVSS. Vulnerability repositories typically provide CVSS Base scores and vectors, which are intended to convey proximate technical severity. There’s nothing superficially wrong with this. Repositories cannot easily provide consumers with local, context-dependent information needed to assess risk. Consumers need to supply this context, which is probably more important than proximate technical severity. But attention spent on relatively inexpensive CVSS Base scores (“watch out, 9.8!”) distracts from the work of determining context and more holistically assessing risk. Counting vulnerability records with CVSS scores or measuring distributions of CVSS Base scores is possible, but does it help?

    Different language games, definitions, and overly abstract metrics also lead to distortion. How can two different but similarly qualified analysts looking at the same vulnerability information come up with different CWE IDs or CVSS Attack Complexity vectors? Many of the common assertions we use today fail the tests of atomicity and observability. Disagreements naturally occur, and metrics based on these assertions will be distorted.
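
The kind of analyst disagreement described above can be made visible with a small metric-by-metric comparison of two CVSS vector strings. This Python sketch assumes the standard `CVSS:3.1/AV:N/...` vector string layout; the two vectors and the function names are made up for illustration, and the sketch does not validate metric names or compute scores.

```python
def parse_vector(vector: str) -> dict[str, str]:
    """Split a CVSS vector string into a {metric: value} mapping.

    Assumption: the string starts with a 'CVSS:<version>' prefix followed
    by '/'-separated 'METRIC:VALUE' pairs. No validation beyond that.
    """
    prefix, *metrics = vector.split("/")
    if not prefix.startswith("CVSS:"):
        raise ValueError(f"missing CVSS prefix: {vector!r}")
    return dict(m.split(":", 1) for m in metrics)


def diff_vectors(a: str, b: str) -> dict[str, tuple]:
    """Return the metrics on which two vectors differ, as (a_value, b_value)."""
    ma, mb = parse_vector(a), parse_vector(b)
    return {
        metric: (ma.get(metric), mb.get(metric))
        for metric in sorted(ma.keys() | mb.keys())
        if ma.get(metric) != mb.get(metric)
    }


# Hypothetical assessments of the same vulnerability by two analysts.
analyst_a = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
analyst_b = "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H"

print(diff_vectors(analyst_a, analyst_b))
```

Here the two vectors differ only in Attack Complexity, the metric named in the passage. A diff like this shows where analysts disagree, but it cannot say who is right; that depends on whether the underlying assertion is observable at all.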

    The security community has a long history of producing elegant specifications that get adopted selectively and implemented inconsistently. What is different about this proposal that would prevent that outcome?

    We can’t guarantee the outcome, but the current state of the practice is untenable. We’re not about to draft a new specification, elegant or otherwise. A new format, a new specification, or a handful of new fields isn’t going to resolve the foundational and philosophical problems we are facing. The premise of our work is that we first need to develop a set of principles and requirements with which to design and build better vulnerability repositories. Maybe then it will be time to write a specification, build a new repository, or make changes to existing repositories. But first we need a solid foundation.
