
Applying Generative AI for CVE Analysis at an Enterprise Scale

Animation: Agent Morpheus is asked, "Check if the code base contains the urllib.parse function," and responds, "The code base does not contain the urllib.parse function."

The software development and deployment process is complex. Modern enterprise applications are built on an interconnected web of software dependencies that provides unprecedented functionality, but at the cost of rapidly increasing complexity.

Patching software security issues is becoming progressively more challenging, as the number of reported security flaws in the common vulnerabilities and exposures (CVE) database hit a record high in 2022.

With over two hundred thousand cumulative vulnerabilities reported as of the end of 2023, it’s clear that a traditional approach to scanning and patching has become unmanageable. Using generative AI, it’s possible to improve vulnerability defense while decreasing the load on security teams.

Organizations have already begun to look at generative AI to help automate this process. However, doing so at an enterprise scale requires the collection, comprehension, and synthesis of many pieces of information. 

AI agents and retrieval-augmented generation add intelligence to CVE analysis

The first step in the process of detecting and remediating CVEs is to scan the software for signatures linked to the presence of known CVEs from a database. But the process shouldn’t stop there. 

A logical next step could be to remediate the CVE by generating a pull request that bumps the package to a patched or fixed version. However, requiring a package upgrade for every detected CVE is unrealistic and does not work well for enterprise-scale software publishing, especially as new CVEs are often discovered before package upgrades become available.

Due to the complexity of software dependencies, upgrading to a non-vulnerable version of a package is not always possible even when a patched version exists. Investigating CVEs to determine the best path forward is incredibly labor-intensive.

Generative AI agents enable a more sophisticated response. They perform the extensive research and investigation that a human security analyst would conduct into a CVE and the scanned software container to determine whether an upgrade is required, but significantly faster (Figure 1).

Figure 1. Improved efficiency of the investigation stage using Agent Morpheus

In this CVE analysis AI workflow example, we have done just that. Our generative AI application, called "Agent Morpheus" for the purposes of this post, takes additional steps: it determines whether a vulnerability actually exists, generates a checklist of tasks to properly and thoroughly investigate the CVE, and, most importantly, determines whether it is exploitable.

Automated remediation is unrealistic for enterprise-scale software publishing

Enterprise software may not always require updating for every detected CVE to remain secure, and it is not always feasible to do so. One reason may be that updated or fixed versions of vulnerable packages are unavailable from the maintainers. Another challenge is that the dependency chain of modern software projects is so complex that updating one package could lead to compatibility issues with other dependencies and break the software. 

Should the software remain unpublished until every last CVE is fixed? Clearly not, but publishers should be confident that their software does not contain critical- or high-severity CVEs that could be exploited during its use.

It’s important to differentiate between a container being vulnerable (a CVE is present) and being exploitable (the vulnerability can actually be executed and abused). 

CVEs may not be exploitable in a container for many reasons. Some justifications are straightforward, such as a false positive, where the CVE signature matched incorrectly and the vulnerable library is not actually present in the container.

Other reasons a CVE may not require patching can be more complex, such as the vulnerable library requiring a specific runtime dependency that is not present. An example is a CVE in a library's .jar file inside a container that has no Java Runtime Environment (JRE) installed. The .jar file can't be executed without a JRE, so the CVE is not exploitable due to the missing dependency.

Another justification for a non-exploitable CVE is when the vulnerable code or function in a library is simply not used by or accessible to the software, or when mitigating conditions are present.

The exact method to determine the exploitability of each CVE is a unique process based on the specific vulnerability. It requires an analyst to synthesize the CVE information from a variety of intelligence sources and apply it to the container or software project in question. This is a manual process that can be incredibly tedious and time-consuming.

Obtaining additional context, reasoning, and a standard security justification without human prompting

Agent Morpheus takes a different approach by combining retrieval-augmented generation (RAG) and AI agents in an event-driven workflow for data retrieval, synthesis, planning, and higher-order reasoning. 

The workflow is connected to multiple vulnerability databases and threat intelligence sources, as well as assets and data related to the specific software project, such as source code, software bills of materials (SBOMs), documentation, and general internet search tools.

The workflow uses four distinct Llama3 large language models (LLMs), three of which are LoRA fine-tuned for their specific tasks (a sketch of this flow follows the list):

  • A planning stage that generates a unique checklist of investigation tasks for each CVE
  • An AI agent stage that executes the checklist items within the context of a specific software project
  • A summarization stage that combines the results of all the checklist items
  • A justification stage that standardizes the findings for non-exploitable CVEs into the common machine-readable and distributable VEX format
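
Concretely, that staged flow might look like the following minimal Python sketch. The model names, prompts, and OpenAI-compatible client wiring are illustrative assumptions, not the actual Agent Morpheus implementation.

    # A minimal sketch of the four-stage flow. Model names, prompts, and the
    # OpenAI-compatible endpoint are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI(base_url="http://nim-llm:8000/v1", api_key="not-needed")

    def run_stage(model: str, prompt: str) -> str:
        """Send one prompt to an OpenAI-compatible LLM endpoint."""
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content

    def analyze_cve(cve_intel: str, project_context: str) -> str:
        # 1. Planning: a LoRA-tuned model drafts a CVE-specific checklist.
        checklist = run_stage("llama3-planner-lora",
                              f"Plan checks to investigate:\n{cve_intel}")
        # 2. Agent: execute each checklist item against the project context.
        findings = [
            run_stage("llama3-agent", f"{item}\nContext:\n{project_context}")
            for item in checklist.splitlines() if item.strip()
        ]
        # 3. Summarization: combine all the item results into one narrative.
        summary = run_stage("llama3-summarizer-lora", "\n".join(findings))
        # 4. Justification: standardize the verdict into a VEX-style statement.
        return run_stage("llama3-vex-lora", summary)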

Because the workflow generates a checklist and the AI agent works through it autonomously, effectively talking to itself, the process can proceed without prompting from a human analyst. This makes the process more efficient: the analyst is engaged only when sufficient information is available to make a decision on the next steps.

A vulnerability scan event triggers the workflow by passing on a list of CVEs detected in the container. These results are combined with up-to-date vulnerability and threat intelligence to provide the workflow with real-time information on the specific CVEs and their current exploits. 

This information is added to the prompt of an LLM that is LoRA fine-tuned for the specific task of producing a unique plan, or checklist, for determining whether the CVE is exploitable. For instance, a checklist for the previous .jar file example might include an item like, "Check if the software project has the JRE required to execute the vulnerable .jar file." The checklist items are passed to an AI agent that retrieves the necessary information and performs the tasks autonomously.

The AI agent has access to many assets related to the software project and container to effectively execute the checklist items and make decisions. For example, it can search for JRE in the software bill of materials and source code of the project and conclude that the environment cannot run a .jar file, the CVE is not exploitable, and the container does not require immediate patching (Figure 2).
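
As an illustration, one such tool could be a simple SBOM lookup. The function name, CycloneDX-style JSON layout, and keyword matching below are assumptions for the example.

    # Sketch of one agent tool: check an SBOM for a Java runtime. The
    # function name, SBOM layout, and keyword match are assumptions.
    import json

    def sbom_contains(sbom_path: str, keyword: str) -> bool:
        """Return True if any SBOM component name contains the keyword."""
        with open(sbom_path) as f:
            sbom = json.load(f)  # assumes a CycloneDX-style JSON SBOM
        return any(
            keyword.lower() in component.get("name", "").lower()
            for component in sbom.get("components", [])
        )

    # Checklist item: does the container have a JRE to execute the .jar file?
    if not sbom_contains("container-sbom.json", "jre"):
        print("No JRE in the SBOM: the .jar file cannot run, so this CVE is "
              "likely not exploitable in this container.")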

Figure 2. NVIDIA AI workflow for security vulnerability and exploitability analysis with event-driven RAG

In addition to data sources, the AI agent also has access to tools that help it overcome some of the current limitations of LLMs. 

For example, a common weakness of LLMs is their difficulty with performing mathematical calculations. This can be overcome by giving LLMs access to calculator tools. For our workflow, we found that the model struggled to compare package version numbers such as version 1.9.1 coming before 1.10. We built a version comparison tool that the agent uses to determine the relationship between package versions. 
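
A minimal sketch of such a version comparison tool, built here on the Python packaging library (the production tool's interface is not described in the post):

    # A minimal version comparison tool using the `packaging` library
    # (pip install packaging); the real tool's interface is not public.
    from packaging.version import Version

    def compare_versions(a: str, b: str) -> str:
        """Describe how two package versions relate, in plain text the
        agent can consume, e.g. that 1.9.1 comes before 1.10."""
        va, vb = Version(a), Version(b)
        if va < vb:
            return f"{a} is older than {b}"
        if va > vb:
            return f"{a} is newer than {b}"
        return f"{a} and {b} are the same version"

    print(compare_versions("1.9.1", "1.10"))  # -> 1.9.1 is older than 1.10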

Figure 3 shows a printout for a single CVE example.

Figure 3. Example printout of the Agent Morpheus workflow, showing the checklist model, agent model, tools, summary model, and VEX justification

Supercharge software delivery with event-driven RAG

With Agent Morpheus, organizations can reduce the time it takes to triage software for vulnerabilities from hours or days to seconds. It can perceive, reason, and act independently, without prompting or assistance from a human analyst. When it is finished with its analysis, Agent Morpheus presents a summary of findings to the human analyst who can then determine the best course of action. 

Any human-approved patching exemptions or changes the analyst makes to the Agent Morpheus summary are fed back into the LLM fine-tuning datasets to continually improve the models based on analyst feedback.

Figure 4. Data flow diagram of Agent Morpheus and its connected services

Agent Morpheus is fully integrated with our container registry and internal security tools to completely automate the entire process from container upload to the creation of the final VEX document.

  1. The process is triggered by a container upload event that occurs whenever a new container is pushed to the registry by a user.
  2. When the container is uploaded, it is immediately scanned using a traditional CVE scanner such as Anchore. The results of this scan are passed to the Agent Morpheus service (see the sketch after this list).
  3. Agent Morpheus retrieves the necessary intelligence for the listed CVEs and prepares any agent tools.
  4. The Agent Morpheus models and agents are run, generating a final summary and classification for each CVE.
  5. The final summary and classification for each CVE are then sent to the security analyst dashboard for review. Analysts review the original container scan report, the improved summary, and the justification from Agent Morpheus, and make a final recommendation for each CVE.
  6. The recommendation is sent for peer review. Any changes that must be made are returned to the analyst.
  7. After the VEX document has completed peer review, the final document is published and distributed with the container.
  8. Any changes in the summary or exemptions from the analyst are compiled into a new training dataset, which is used to continually retrain the models and automatically improve the system using the analyst’s output.
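
To make the front of this flow concrete, here is a minimal sketch of how steps 1 through 3 might be wired together, using a Flask webhook and Anchore's grype CLI. The endpoint paths, payload fields, and service hostnames are assumptions for illustration.

    # Hedged sketch of steps 1-3: a registry webhook triggers a scan and
    # forwards the results. Paths, fields, and hostnames are assumptions.
    import json
    import subprocess

    import requests
    from flask import Flask, request

    app = Flask(__name__)

    @app.post("/registry-webhook")
    def on_container_upload():
        image = request.get_json()["image"]  # e.g. "registry.example.com/app:1.2"
        # Step 2: scan the container with a traditional CVE scanner
        # (here, Anchore's grype CLI with JSON output).
        scan = subprocess.run(["grype", image, "-o", "json"],
                              capture_output=True, text=True, check=True)
        matches = json.loads(scan.stdout).get("matches", [])
        # Step 3: hand the detected CVEs to the Agent Morpheus service.
        requests.post("http://agent-morpheus:8080/scan",
                      json={"image": image, "matches": matches})
        return {"status": "accepted"}, 202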

Agent Morpheus uses NVIDIA NIM inference microservices to accelerate time to deployment and inference speed. NIM microservices are used to execute all LLM queries and are integral to the workflow for several reasons. NVIDIA NIM makes it easy to spin up your own LLM service compatible with the OpenAI API specification. 

Agent Morpheus uses three LoRA customized versions of the Llama3 model and one Llama3 base model, which are all hosted using a single NIM container that dynamically loads the LoRA adapters as needed. 
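
Because NIM exposes an OpenAI-compatible API, switching between the base model and a LoRA adapter is just a change of the model name in the request. A brief sketch, with a hypothetical endpoint and model names:

    # Sketch of querying one NIM endpoint that serves the Llama3 base model
    # plus LoRA adapters, selected per request by model name (hypothetical).
    from openai import OpenAI

    client = OpenAI(base_url="http://nim-llm:8000/v1", api_key="not-needed")

    for model in ("llama3-base", "llama3-planner-lora",
                  "llama3-summarizer-lora", "llama3-vex-lora"):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
        )
        print(model, "->", resp.choices[0].message.content[:60])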

NIM can also handle the bursty nature of LLM requests that are generated by the tool. On average, Agent Morpheus requires about 41 LLM queries per CVE! As container scans can generate dozens of CVEs per container, the number of outstanding LLM requests can easily reach into the thousands for a single container. NIM can handle this type of variable workload and eliminate the need to develop a custom LLM service.

Unlike traditional chatbot pipelines, the Agent Morpheus event-driven workflow is not limited by the time it takes a human to respond. Instead, an accelerated workflow can run through all the CVEs or events using the same parallelization and optimization techniques that are cornerstones of traditional machine learning pipelines. 

Using the Morpheus cybersecurity framework, we built a pipeline that orchestrates the large number of LLM requests and enables executing the LLM requests asynchronously and in parallel. The checklist items for each CVE and the CVEs themselves are completely independent of each other and can be run in parallel. 
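
The concurrency pattern can be illustrated outside of Morpheus with plain asyncio; the model name and endpoint below are hypothetical.

    # Illustrative asyncio sketch: every CVE and every checklist item is
    # independent, so all LLM calls can be issued concurrently.
    import asyncio

    from openai import AsyncOpenAI

    client = AsyncOpenAI(base_url="http://nim-llm:8000/v1",
                         api_key="not-needed")

    async def run_item(item: str) -> str:
        resp = await client.chat.completions.create(
            model="llama3-agent",  # hypothetical model name
            messages=[{"role": "user", "content": item}],
        )
        return resp.choices[0].message.content

    async def analyze_container(checklists: dict[str, list[str]]) -> dict:
        # Fan out one task per checklist item across every CVE, then gather.
        tasks = {cve: [asyncio.create_task(run_item(item)) for item in items]
                 for cve, items in checklists.items()}
        return {cve: await asyncio.gather(*pending)
                for cve, pending in tasks.items()}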

When run serially, processing a container with 20 CVEs can take 2842.35 seconds. When run in parallel using Morpheus, that same container takes 304.72 seconds, a 9.3x speedup!
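
The reported speedup follows directly from those timings:

    # Quick check of the reported speedup.
    print(f"{2842.35 / 304.72:.1f}x")  # -> 9.3x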

Morpheus streamlines integrating the tool with the container registry and security dashboard service by turning the pipeline into a microservice using HttpServerSourceStage. With this source stage, Agent Morpheus is truly event-driven and is triggered automatically when each container is uploaded to the registry, enabling it to keep up with the extremely high demand of enterprise software vulnerability management. 
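
In outline, such a pipeline might be assembled as follows; the exact module paths and stage parameters vary across Morpheus releases, so treat this as a sketch rather than a tested configuration.

    # Skeletal Morpheus pipeline fronted by HttpServerSourceStage. Module
    # paths and stage parameters may differ across Morpheus versions.
    from morpheus.config import Config
    from morpheus.pipeline.linear_pipeline import LinearPipeline
    from morpheus.stages.input.http_server_source_stage import HttpServerSourceStage

    config = Config()
    pipeline = LinearPipeline(config)

    # Each POST to /scan (e.g., from the registry webhook) becomes a message
    # that flows into the downstream Agent Morpheus stages.
    pipeline.set_source(HttpServerSourceStage(config, bind_address="0.0.0.0",
                                              port=8080, endpoint="/scan"))
    # ... add the checklist, agent, summarization, and VEX stages here ...
    pipeline.run()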

Learn more

Sign up to be notified when you can download the security vulnerability analysis AI workflow through a free 90-day trial of NVIDIA AI Enterprise. 
For more information, see Bartley Richardson’s GTC session, How to Apply Generative AI to Improve Cybersecurity.
