Scoring

lockwarden scores in two layers. Layer 1 is structural — it analyses execution surface and version deltas, works on day zero with zero network and zero advisory data, and is the primary detection. Layer 2 is a known-bad overlay — feed-based matching that can only ever confirm what someone already reported.

Each signal carries two weights: an absolute weight (the signal exists in the tree) and a delta weight (the signal newly appeared in this version). Delta weights dominate by design: legitimate native packages carry binding.gyp forever, but attacks introduce it — a new hook in a version bump is the 2026 attack signature.

| Signal | Absolute weight | Delta weight (newly appeared this version) |
| --- | --- | --- |
| Lifecycle install script | Low–Med | Critical |
| binding.gyp / node-gyp hook | Low | Critical |
| AI-agent hook / MCP manifest | Med | Critical |
| IDE task / folder-open file | Med | High |
| Main-file size anomaly (>5x) | | High |
| New transitive dep in a patch release | | High |
| Obfuscation markers in install-path files | Med | High |
| Phantom dependency | Med | |

Delta weights apply only in --diff / --deep modes (which fetch previous tarballs for comparison); absolute weights always apply.
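
The interaction between the two weights can be sketched in a few lines of Python. This is a minimal illustration, not lockwarden's implementation: the signal keys (`binding_gyp`, `agent_hook`, and so on) are hypothetical identifiers, and taking the maximum across signals is an assumption about how weights combine. Only rows with both weights given in the table are included.

```python
from enum import IntEnum


class Severity(IntEnum):
    NONE = 0
    LOW = 1
    MED = 2
    HIGH = 3
    CRITICAL = 4


# (absolute weight, delta weight) per signal, copied from the table.
# The keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "binding_gyp": (Severity.LOW, Severity.CRITICAL),
    "agent_hook": (Severity.MED, Severity.CRITICAL),
    "ide_task": (Severity.MED, Severity.HIGH),
    "obfuscation": (Severity.MED, Severity.HIGH),
}


def package_severity(present, newly_appeared, diff_mode):
    """Return the worst severity across a package's signals.

    present: signals found in the current tree (absolute weights always apply).
    newly_appeared: signals absent from the previous version; consulted
        only when diff_mode is True, mirroring --diff / --deep.
    Combining by maximum is an assumption made for illustration.
    """
    worst = Severity.NONE
    for signal in present:
        absolute, delta = WEIGHTS[signal]
        worst = max(worst, absolute)
        if diff_mode and signal in newly_appeared:
            worst = max(worst, delta)
    return worst
```

Under these assumptions, a long-standing binding.gyp scores `Severity.LOW`, while the same file appearing for the first time in a version bump scores `Severity.CRITICAL`, but only when a diff mode supplies the previous version to compare against.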

  • Each package receives a grade A–F from its combined signals.
  • The project rollup is the worst grade in the tree plus a count summary per grade.
  • --threshold <grade> (default: high) sets the severity at which findings flip the exit code to 1; findings below the threshold are still reported but don’t fail the run.

Layer 2, the known-bad overlay, works as follows:

  • Sources: a vendored OSV.dev npm snapshot, npm advisory data, and vendored incident IOC bundles, all shipped inside the npm package and refreshed each release. No feed is fetched at runtime.
  • Matching: resolved name@version from the lockfile.
  • Any Layer-2 hit is Critical, regardless of the package’s Layer-1 score.
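
The project rollup and threshold rules described above can be sketched as follows. This is an illustrative sketch only: it assumes grades can be expressed with the severity names used elsewhere on this page and ordered low < med < high < critical, and the `rollup` function and its return shape are hypothetical.

```python
from collections import Counter

# Assumed ordering, least to most severe.
ORDER = ["low", "med", "high", "critical"]


def rollup(package_grades, threshold="high"):
    """Summarise per-package grades into a project result.

    package_grades: dict mapping "name@version" to a grade in ORDER.
    threshold: the grade at or above which the run fails (exit code 1).
    """
    counts = Counter(package_grades.values())
    # Worst grade in the tree, or None for an empty tree.
    worst = max(package_grades.values(), key=ORDER.index, default=None)
    fails = worst is not None and ORDER.index(worst) >= ORDER.index(threshold)
    return {
        "worst": worst,
        "counts": {grade: counts.get(grade, 0) for grade in ORDER},
        "exit_code": 1 if fails else 0,
    }
```

With the default threshold, a tree containing one `high` package returns exit code 1; raising the threshold to `critical` still reports that package in the counts but returns 0.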

Layer 2 exists because confirming a known incident should be instant and unambiguous. It is deliberately secondary: the 2026 worm waves outran every known-bad database, which is exactly why Layer 1 doesn’t depend on one.
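
A rough sketch of the Layer-2 lookup, assuming the vendored data has been flattened into a set of exact `name@version` strings. That storage format, the function names, and the package names in the usage note are all hypothetical; real advisory data often expresses affected versions as ranges, which this sketch does not handle.

```python
def layer2_hits(lockfile_packages, known_bad):
    """Return the "name@version" keys that appear in the known-bad set.

    lockfile_packages: iterable of (name, version) pairs resolved from
        the lockfile.
    known_bad: set of "name@version" strings (assumed snapshot format).
    """
    hits = []
    for name, version in lockfile_packages:
        key = f"{name}@{version}"
        if key in known_bad:
            hits.append(key)
    return hits


def final_severity(layer1_severity, has_layer2_hit):
    """Any Layer-2 hit is critical, whatever Layer 1 reported."""
    return "critical" if has_layer2_hit else layer1_severity
```

For example, with `known_bad = {"example-pkg@1.2.3"}`, a lockfile resolving `example-pkg` to `1.2.3` produces one hit and the package is reported as critical even if Layer 1 scored it low; the same package at `1.2.4` produces no hit and keeps its Layer-1 result.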

| Grade | SARIF level |
| --- | --- |
| Critical | error |
| High | warning |
| Med | note |
| Low | suppressed by default (--verbose to include) |

SARIF 2.1.0 output (--sarif) can be uploaded directly to the GitHub Security tab; the GitHub Action wires this up automatically.
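
To make the mapping concrete, here is a minimal sketch that turns findings into a SARIF 2.1.0 document. The shape of the input findings (`rule`, `grade`, `message` keys) and the rule IDs are hypothetical, and the page does not say which level a Low finding receives under --verbose, so `note` is an assumption here.

```python
# Grade to SARIF level, as in the table above.
SARIF_LEVEL = {"critical": "error", "high": "warning", "med": "note"}


def to_sarif(findings, verbose=False):
    """Build a minimal SARIF 2.1.0 document from a list of findings.

    findings: list of dicts with "rule", "grade" and "message" keys
        (a hypothetical shape chosen for this sketch).
    verbose: include low-grade findings, which are otherwise suppressed.
    """
    results = []
    for finding in findings:
        grade = finding["grade"]
        if grade == "low":
            if not verbose:
                continue  # suppressed by default
            level = "note"  # assumption: level for Low is not documented
        else:
            level = SARIF_LEVEL[grade]
        results.append(
            {
                "ruleId": finding["rule"],
                "level": level,
                "message": {"text": finding["message"]},
            }
        )
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": "lockwarden"}},
                "results": results,
            }
        ],
    }
```

The returned dictionary can be serialised with `json.dumps(to_sarif(findings), indent=2)` from the standard library.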

Calibration: weights are gated on a corpus

All Layer-1 weights are provisional until corpus calibration shows clean separation between the top ~500 benign npm packages and the confirmed 2026 malicious set (the Axios plain-crypto-js phantom dep, the node-ipc versions, the autotel family, and the Miasma @redhat-cloud-services versions). A security tool that flags every legitimate native package trains its users to ignore it — noise is a bug of the same severity as a miss, and the calibration harness is the gate that keeps thresholds honest before they ship.
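
One way such a gate could be expressed is as a pair of error rates measured at a candidate threshold. The sketch below is an assumption about what "clean separation" might mean in practice: the numeric scores, the function name, and the default rate limits are all hypothetical and are not taken from lockwarden's harness.

```python
def calibration_gate(benign_scores, malicious_scores, threshold,
                     max_fp_rate=0.01, max_miss_rate=0.0):
    """Check whether a threshold separates the two corpora.

    benign_scores: numeric scores for the benign corpus (non-empty).
    malicious_scores: numeric scores for the confirmed malicious set
        (non-empty).
    threshold: scores at or above this value count as flagged.
    max_fp_rate / max_miss_rate: hypothetical limits for this sketch.
    """
    false_positives = sum(score >= threshold for score in benign_scores)
    misses = sum(score < threshold for score in malicious_scores)
    fp_rate = false_positives / len(benign_scores)
    miss_rate = misses / len(malicious_scores)
    return {
        "fp_rate": fp_rate,
        "miss_rate": miss_rate,
        "passes": fp_rate <= max_fp_rate and miss_rate <= max_miss_rate,
    }
```

Treating the false-positive rate and the miss rate as equal partners in the pass condition reflects the point above: a threshold that catches every malicious package but also flags benign ones fails the gate just as a threshold that misses an attack does.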