Scoring
lockwarden scores in two layers. Layer 1 is structural — it analyses execution surface and version deltas, works on day zero with zero network and zero advisory data, and is the primary detection. Layer 2 is a known-bad overlay — feed-based matching that can only ever confirm what someone already reported.
Layer 1 — execution-surface signals
Section titled “Layer 1 — execution-surface signals”Each signal carries two weights: an absolute weight (the signal exists in the tree)
and a delta weight (the signal newly appeared in this version). Delta weights
dominate by design: legitimate native packages carry binding.gyp forever, but attacks
introduce it — a new hook in a version bump is the 2026 attack signature.
| Signal | Absolute weight | Delta weight (newly appeared this version) |
|---|---|---|
| Lifecycle install script | Low–Med | Critical |
binding.gyp / node-gyp hook | Low | Critical |
| AI-agent hook / MCP manifest | Med | Critical |
| IDE task / folder-open file | Med | High |
| Main-file size anomaly (>5x) | — | High |
| New transitive dep in a patch release | — | High |
| Obfuscation markers in install-path files | Med | High |
| Phantom dependency | Med | — |
Delta weights apply only in --diff / --deep modes (which fetch previous tarballs for
comparison); absolute weights always apply.
Grades A–F
Section titled “Grades A–F”- Each package receives a grade A–F from its combined signals.
- The project rollup is the worst grade in the tree plus a count summary per grade.
--threshold <grade>(default:high) sets the severity at which findings flip the exit code to1— grades below the threshold are reported but don’t fail the run.
Layer 2 — known-bad overlay
Section titled “Layer 2 — known-bad overlay”- Sources: a vendored OSV.dev npm snapshot, npm advisory data, and vendored incident IOC bundles — all shipped inside the npm package and refreshed each release. No feed is fetched at runtime.
- Matching: resolved
name@versionfrom the lockfile. - Any Layer-2 hit is Critical, regardless of the package’s Layer-1 score.
Layer 2 exists because confirming a known incident should be instant and unambiguous. It is deliberately secondary: the 2026 worm waves outran every known-bad database, which is exactly why Layer 1 doesn’t depend on one.
SARIF mapping
Section titled “SARIF mapping”| Grade | SARIF level |
|---|---|
| Critical | error |
| High | warning |
| Med | note |
| Low | suppressed by default (--verbose to include) |
SARIF 2.1.0 output (--sarif) uploads directly to the GitHub Security tab — the
GitHub Action wires this up automatically.
Calibration: weights are gated on a corpus
Section titled “Calibration: weights are gated on a corpus”All Layer-1 weights are provisional until corpus calibration shows clean separation
between the top ~500 benign npm packages and the confirmed 2026 malicious set (the Axios
plain-crypto-js phantom dep, the node-ipc versions, the autotel family, and the Miasma
@redhat-cloud-services versions). A security tool that flags every legitimate native
package trains its users to ignore it — noise is a bug of the same severity as a miss,
and the calibration harness is the gate that keeps thresholds honest before they ship.