Shaded Vulnerability Detection Data

rspooner · June 7, 2024, 6:57pm

NOTE: This is an ongoing release. Stay tuned to this community post for updates and educational materials.

The initial release of the highest severity vulnerabilities with a CVSS of 10 will begin on Monday, September 9, 2024. A table outlining the updated drip schedule is available at this link.

Shaded Vulnerability Detection

As we shared in a press release, Sonatype’s new Shaded Vulnerability Detection capability has identified 4.5 million new open-source vulnerabilities, including 336,000 previously undetectable “Critical” open source vulnerabilities.

This industry-first data enhancement comes from a novel, Sonatype-created algorithm capable of detecting vulnerabilities in “shaded” open source files—a technique in which original code is repackaged, often making detection by traditional means impossible.
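
To make the problem concrete, here is a minimal sketch of why relocated ("shaded") code can slip past a simple name-based check. It is an illustration only and is not Sonatype's algorithm; the package prefixes and the `relocate` and `naive_name_match` helpers are hypothetical.

```python
# Illustration only: NOT Sonatype's detection algorithm.
# All package names and helper functions below are hypothetical.

def relocate(entry: str, original_prefix: str, shaded_prefix: str) -> str:
    """Mimic what a shading tool does to one class path inside a jar."""
    if entry.startswith(original_prefix):
        return shaded_prefix + entry[len(original_prefix):]
    return entry


def naive_name_match(jar_entries, known_prefix: str) -> bool:
    """Name-based check: does any entry start with the known library prefix?"""
    return any(e.startswith(known_prefix) for e in jar_entries)


# Class files as they appear in the original library.
original = [
    "com/example/lib/Parser.class",
    "com/example/lib/util/Strings.class",
]

# The same class files after being repackaged into another project.
shaded = [
    relocate(e, "com/example/lib/", "org/acme/app/shaded/lib/")
    for e in original
]

print(shaded[0])                                      # org/acme/app/shaded/lib/Parser.class
print(naive_name_match(original, "com/example/lib/")) # True
print(naive_name_match(shaded, "com/example/lib/"))   # False
```

The code is identical in both jars, but once the package prefix changes, a lookup keyed on the original name no longer finds it.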

Data regarding these vulnerabilities will be introduced to the Sonatype Platform starting September 09, 2024.

What’s changing?

When we release the first “drip” of this data in September, you will likely notice some changes:

To start, most Sonatype Lifecycle and SBOM Manager customers will see new critical violations in their reports.
Depending on policy configuration, Sonatype Lifecycle and Sonatype Repository Firewall customers may also receive notifications, have builds blocked, or have components quarantined.

It’s important to note that this is a new capability within our vulnerability detection, providing you with better, more comprehensive data. It is NOT a change in how IQ Server scans your applications.

Who do these changes impact?

This change will primarily impact:

Products: Sonatype Lifecycle, Sonatype Repository Firewall, and SBOM Manager

Deployment Types: This change will impact customers of all deployment types (Self-hosted, Private Cloud, SaaS, and Sonatype Air-Gapped Environment (SAGE))

Ecosystems: Maven, PyPI, and RubyGems

In addition, packages in the following ecosystems also have a low chance of showing a new vulnerability:

Cargo
Composer
CRAN
npm
NuGet
RPM

How should I prepare for this change?

To prepare for the release of this new capability, we recommend following these three key steps to determine how the rollout of this data enhancement will impact your organization:

Understand the Incremental Release of Data
Review your Risk: Shaded Vulnerability Detection Dashboard
Create a Proactive Remediation Plan

Here is a little more about each step:

1. Understand the Incremental Release of Data (a.k.a., Data “drip”)

Due to the large scale of data uncovered by our Shaded Vulnerability Detection capability, we will release it in small increments (a.k.a., “drips”) starting on Monday, September 09, 2024.

The initial release will target the highest severity vulnerabilities, with a CVSS (Common Vulnerability Scoring System) of 10.

While this approach may take a bit longer, it allows you to better manage any policy violations that may arise, providing you time to review your risk, create remediation plans, and communicate with stakeholders about this new data.
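
If you keep your own list of expected findings, a small filter like the one below can preview which of them fall into the initial release. This is a sketch under assumptions: the `Finding` record, the application names, and the component coordinates are hypothetical, and only the CVSS 10 criterion comes from this post. Later drips follow the linked schedule, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical record layout, for illustration only.
    application: str
    component: str
    cvss: float


def first_drip(findings):
    """Return findings with a CVSS score of 10, the target of the initial release."""
    return [f for f in findings if f.cvss == 10.0]


# Made-up sample data.
findings = [
    Finding("billing-api", "org.example:outer-lib:1.2.0", 10.0),
    Finding("billing-api", "org.example:other-lib:3.1.4", 7.5),
    Finding("web-portal", "org.example:outer-lib:1.2.0", 9.8),
]

for f in first_drip(findings):
    print(f.application, f.component)  # billing-api org.example:outer-lib:1.2.0
```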

2. Review your Risk: Shaded Vulnerability Detection Dashboard

The best way to prepare for this change is to use the Shaded Vulnerability Detection Dashboard—found within the Integrated Enterprise Reporting area of IQ Server—to estimate the impact on your organization. This help doc details everything you need to know about the dashboard.

This dashboard summarizes the quantity and severity of the new violations you’ll see when this data is released starting on September 09, 2024.

NOTE: Accessing the dashboard requires IQ Server version 177.

If you can’t upgrade directly to 177, spin up a test instance and copy data from your production instance. If that’s not possible, don’t hesitate to contact your assigned CSE or CSA, if applicable.

3. Create a Proactive Remediation Plan

Once you have access to the Shaded Vulnerability Detection dashboard, we encourage you to review this 10-minute lesson from the Sonatype Learn team, which will:

Explain what this data means for your organization in simple terms.
Suggest some best practices for handling this data.
Give you advice on how to prepare.

The lesson breaks down the following steps we recommend in understanding what Shaded Vulnerability Detection means for you and how to prepare:

Gauge the data impact: Once you have access to the dashboard, note the total number of new vulnerabilities expected and the apps in which these vulnerabilities appear. Use these two data points to estimate the total impact of this change on your organization.
Plan for potential disruption: Alert stakeholders, like developers and project managers, of possible disruptions in their development and build/release processes. Remember, based on these new vulnerabilities, your policies can fail builds or quarantine components.
Address disruption and investigate vulnerabilities: Set aside time to deal with these disruptions adequately. At the very least, you must be prepared to investigate new vulnerabilities and waive restrictions to get developers unblocked.
Be strategic: Ensure stakeholders know about remediation tools like Automatic Pull Requests. Also, take a moment to refamiliarize yourself with Advanced Search and Waiver Best Practices.
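
The "gauge the data impact" step can be sketched as a small aggregation over whatever data you pull from the dashboard. The row layout `(application, component, severity)` and the sample values are assumptions for illustration; this post does not describe the dashboard's export format.

```python
from collections import defaultdict

# Hypothetical rows: (application, component, severity).
# The real export format may differ; adjust the unpacking accordingly.
rows = [
    ("billing-api", "org.example:outer-lib:1.2.0", "Critical"),
    ("billing-api", "org.example:other-lib:3.1.4", "Critical"),
    ("web-portal", "org.example:outer-lib:1.2.0", "Critical"),
]


def summarize(rows):
    """Return the total number of findings and the affected components per application."""
    per_app = defaultdict(set)
    for app, component, _severity in rows:
        per_app[app].add(component)
    return len(rows), {app: sorted(components) for app, components in per_app.items()}


total, per_app = summarize(rows)
print(f"{total} new findings across {len(per_app)} applications")
for app, components in sorted(per_app.items()):
    print(app, "->", ", ".join(components))
```

The two numbers it prints (total findings and affected applications) are the data points suggested above for estimating overall impact, and the per-application listing shows which teams to alert first.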

Where can I ask additional questions?

Reply directly to this post. If you are not already registered with the Sonatype Community, you will be prompted to create an account. This will empower you to create and reply to other threads initiated by both the Sonatype team and your community peers.

Notifications can be easily configured to ensure you are aware of updates for a specific thread and/or important announcements within the Community.

mfrost · June 10, 2024, 7:18pm

mfrost · June 11, 2024, 1:26pm

mfrost · June 11, 2024, 1:59pm

popo · June 11, 2024, 4:51pm

Pretty cool feature if it works as advertised and doesn’t produce too many false positives.

I’ve read all the docs available and one thing I’m left wondering is how exactly this works and who does the heavy lifting. Fingerprinting will probably not be sufficient, so (sticking with Java for the moment) does the algorithm decompile the bytecode and perform static analysis? If so, I would expect IQ scans to take significantly longer than before.
Or is some kind of abstract sent to the Nexus IQ server and is the analysis done server-side?
Again, I would assume a pretty massive CPU usage increase.

I welcome this feature, but it has me somewhat worried in terms of performance for our CI pipelines. Particularly on our 300 pom.xml 200k LoC repo.

Can you help me understand how the feature works?

edit: Or am I thinking about this all wrong, and has Sonatype simply scanned all the dependencies itself, so the vulnerabilities DB is going to grow significantly without any additional scanning being done on-premise?

mfrost · June 12, 2024, 2:26pm

Hi @popo! There is no performance concern. It’s all precomputed, so the vulnerabilities from a jar shaded into another jar are associated with the outer jar on a scan.
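
As a rough mental model of "precomputed", the scan-time work reduces to a lookup rather than any analysis of the jar. The sketch below is an assumption-laden illustration, not the actual data service: the `SHADED_INTO` and `VULNS` tables, the coordinates, and the vulnerability IDs are all placeholders.

```python
# Illustrative model only. These tables stand in for data computed ahead of time;
# the coordinates and vulnerability IDs are placeholders, not real records.
SHADED_INTO = {
    # outer component -> components found shaded inside it
    "org.example:outer-lib:1.2.0": ["com.example:inner-lib:2.0.0"],
}
VULNS = {
    # component -> known vulnerability IDs
    "com.example:inner-lib:2.0.0": ["EXAMPLE-0001"],
}


def vulnerabilities_for(component: str):
    """Return the component's own vulnerabilities plus those of anything shaded into it."""
    direct = VULNS.get(component, [])
    inherited = [
        vuln
        for inner in SHADED_INTO.get(component, [])
        for vuln in VULNS.get(inner, [])
    ]
    return sorted(set(direct) | set(inherited))


print(vulnerabilities_for("org.example:outer-lib:1.2.0"))  # ['EXAMPLE-0001']
print(vulnerabilities_for("org.example:unrelated:0.1.0"))  # []
```

Because both tables exist before the scan runs, identifying the outer jar is enough to surface the inner jar's vulnerabilities, which is why no extra scan time is expected.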

popo · June 12, 2024, 6:52pm

Thanks for the clarification Maura.
So this means that this algorithm cannot be used locally to scan arbitrary libraries. It only applies to libraries that are known to Sonatype and have been actively scanned beforehand for shaded dependencies.

popo · June 12, 2024, 7:24pm

After playing around with it for a while, I gotta say: the implementation leaves a lot to be desired and feels very beta.
Translating the “Shaded Vulnerability Detection” dashboard to actual remediation actions is next to impossible.
We have 500 applications onboarded in IQ and the only thing the new dashboard tells me is: Application X has 2 shaded criticals. That’s it. I cannot see which components are the culprits for that specific application.

There is no relation between the “Highest Risk By Application” and “Highest Risk By Component” tables, both of which are 50 rows long.

Downloading the raw data and trying to find some way to link the findings doesn’t provide any additional insights (or I’m looking at the wrong data).

So now I only know that there are vulnerabilities, but not where. Locating all the apps and their shaded vulnerabilities will be a pretty big undertaking.

I would have expected a table that shows the actual results (app+component), or better yet: integrate the “shaded vulnerabilities” findings inside the regular scan reports but have them earmarked as “shaded vulnerabilities (preview/beta)” and not have them trigger the violation policies (yet).

hatkar · June 13, 2024, 8:04am

Hi @popo, you are correct: this is not a new way for Lifecycle to scan local packages that may contain shaded dependencies. This is a change to Sonatype’s data services to detect shaded dependencies in open source packages on public repositories. As a result, some of these open source packages may be implicated by vulnerabilities that were previously undetectable, and if you are using one of the implicated packages, you will receive new policy violations.

hatkar · June 13, 2024, 8:06am

Thanks for your feedback on the dashboard @popo , I will take this back to the team.

popo · June 13, 2024, 9:42am

@hatkar I appreciate that.
The announcement and the related documentation just led me to believe that we would get actionable data leading up to the “drips” stages.

As it stands right now, I don’t see how we can realistically (given our scale) use the results and we will just have to wait for July 8th to see what breaks.

hatkar · June 21, 2024, 9:20am

Hi @popo, we have now updated the Dashboard based on your feedback. You will now be able to see the “Vulnerable Components per Application” table, so you can correlate which components are affected in each application and act accordingly.

popo · July 1, 2024, 10:49pm

@hatkar this is perfect. Thank you!