Diet Command

← Back to README.md

Overview

uzomuzo diet analyzes your project's dependencies and produces a prioritized "diet plan" — ranking dependencies by removal impact, coupling effort, and health risk.

It answers: which dependencies should I remove first, and how hard will it be?

Diet is the automated stage of the scan → diet → LLM → remove pipeline. After diet ranks your dependencies, Claude Code skills use an LLM to assess risk, plan removal, and execute the change. See Diet Workflow below for the full pipeline.

Architecture

uzomuzo diet is distributed as a separate binary (uzomuzo-diet) because it uses tree-sitter (CGo) for multi-language source analysis. The main uzomuzo binary stays Pure Go and delegates to uzomuzo-diet transparently.

$ uzomuzo diet --sbom bom.json    # delegates to uzomuzo-diet on PATH

See ADR-0014 for the full architectural decision record.

Installation

# Install both binaries
go install github.com/future-architect/uzomuzo-oss/cmd/uzomuzo@latest
go install github.com/future-architect/uzomuzo-oss/cmd/uzomuzo-diet@latest

Note: uzomuzo-diet requires a C compiler (gcc/clang) for tree-sitter CGo compilation.

Usage

Prerequisites

Generate a CycloneDX SBOM for your project. The recommended tool depends on your ecosystem:

Go / Python / JavaScript / TypeScript

# Using syft (recommended)
syft . --source-name myproject -o cyclonedx-json > bom.json

# Using trivy
trivy fs . --format cyclonedx -o bom.json

# Using cdxgen
cdxgen -o bom.json

Note: For JavaScript/TypeScript projects, a lockfile (package-lock.json or yarn.lock) is required for dependency graph resolution.

Java (Maven)

Static SBOM tools (syft, Trivy) cannot resolve Maven's transitive dependency graph without running Maven. Use the CycloneDX Maven Plugin instead:

# Generate SBOM with full dependency resolution
mvn org.cyclonedx:cyclonedx-maven-plugin:2.9.1:makeBom \
  -DoutputFormat=json \
  -DoutputName=bom \
  -Dcyclonedx.skipNotDeployed=false

# The SBOM is generated at target/bom.json
uzomuzo diet --sbom target/bom.json --source .

Java (Gradle)

Similarly, use the CycloneDX Gradle Plugin:

// build.gradle
plugins {
    id 'org.cyclonedx.bom' version '2.2.0'
}

Then generate the SBOM and run diet:

gradle cyclonedxBom
uzomuzo diet --sbom build/reports/bom.json --source .

Basic Usage

# Table output (default)
uzomuzo diet --sbom bom.json

# With source coupling analysis
uzomuzo diet --sbom bom.json --source .

# Pipe from trivy (no intermediate file)
trivy fs . --format cyclonedx | uzomuzo diet --sbom - --source .

# JSON output (for CI/LLM consumption)
uzomuzo diet --sbom bom.json --source . --format json

# Detailed per-dependency breakdown
uzomuzo diet --sbom bom.json --source . --format detailed

Flags

| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| --sbom | Yes | | Path to CycloneDX SBOM JSON, or - for stdin |
| --source | No | . | Root directory for source coupling analysis |
| --format, -f | No | table | Output format: json, table, detailed |

⚠️ --source must point to the same project root that was used to generate the SBOM. If it points to the wrong directory, dependencies will appear "unused" even when they are actually used — because the scanner cannot find the import statements.

Common mistakes:

  • Subdirectory: --source ./src misses files outside src/, causing false "unused" results.
  • Wrong project: Using an SBOM from project A with --source pointing to project B produces meaningless output.
  • Monorepo: Point --source to the specific subproject root that matches the SBOM, not the repo root.

If Phase 2 reports no imports matched any dependency, double-check your --source path.

Analysis Pipeline

The diet command runs a 4-phase pipeline:

Phase 1: Dependency Graph (SBOM)

Parses the CycloneDX SBOM to build a dependency DAG. For each direct dependency, computes:

  • Exclusive transitive count — dependencies removed only if this dep is removed
  • Shared transitive count — dependencies shared with other direct deps
  • Total transitive count — all reachable transitive dependencies
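
The relationship between the three counts can be sketched as follows. This is a minimal illustration, assuming the SBOM has already been reduced to an adjacency map; the graph, the function names, and the rule that anything reachable from another direct dependency counts as shared are assumptions made for the example, not uzomuzo's actual implementation.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from start, excluding start itself."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    seen.discard(start)
    return seen


def transitive_counts(graph, direct_deps):
    """Split each direct dependency's transitive closure into exclusive and shared."""
    direct = set(direct_deps)
    reach = {dep: reachable(graph, dep) for dep in direct_deps}
    counts = {}
    for dep, nodes in reach.items():
        # Everything that stays in the tree even if `dep` is removed.
        kept_by_others = set(direct - {dep})
        for other, other_nodes in reach.items():
            if other != dep:
                kept_by_others |= other_nodes
        exclusive = nodes - kept_by_others
        counts[dep] = {
            "exclusive": len(exclusive),
            "shared": len(nodes) - len(exclusive),
            "total": len(nodes),
        }
    return counts


# Hypothetical graph: "a" and "b" are the project's direct dependencies.
graph = {"a": ["x", "y"], "b": ["y", "z"], "x": ["w"]}
print(transitive_counts(graph, ["a", "b"]))
```

In this toy graph, removing "a" also removes "x" and "w" (exclusive: 2), while "y" survives because "b" still needs it (shared: 1).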

Phase 2: Source Coupling (tree-sitter)

Analyzes your source code to measure how deeply each dependency is integrated:

  • Import file count — number of files importing the dependency
  • Call site count — total usage sites across all files
  • API breadth — number of distinct APIs used from the dependency

Supported languages: Go, Python, JavaScript/TypeScript, Java.

Phase 2.5: Wheel-based PyPI fallback

For Python (PyPI) packages where Phase 2 finds zero import matches, diet automatically downloads the smallest wheel file and extracts the actual import names from top_level.txt, RECORD, or a directory listing of __init__.py files. This resolves common PyPI name mismatches (e.g., beautifulsoup4 imports as bs4, pyyaml as yaml). Wheels larger than 5 MB are skipped.
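
A rough sketch of the extraction step, assuming the wheel is already in memory. It covers only two of the three sources named above (top_level.txt, then top-level directories containing __init__.py) and leaves out RECORD parsing, downloading, and the size limit. The function names and the fake wheel are hypothetical, built only so the example runs offline.

```python
import io
import zipfile


def import_names_from_wheel(wheel_bytes):
    """Return the top-level import names declared or implied by a wheel."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as wheel:
        members = wheel.namelist()

        # Preferred source: <dist>.dist-info/top_level.txt, one name per line.
        for member in members:
            if member.endswith(".dist-info/top_level.txt"):
                lines = wheel.read(member).decode("utf-8").splitlines()
                names = sorted({line.strip() for line in lines if line.strip()})
                if names:
                    return names

        # Fallback: top-level directories that contain an __init__.py.
        packages = set()
        for member in members:
            parts = member.split("/")
            if len(parts) == 2 and parts[1] == "__init__.py":
                packages.add(parts[0])
        return sorted(packages)


def build_fake_wheel(files):
    """Build an in-memory zip so the sketch runs without network access."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


# Fake wheel standing in for a distribution whose import name differs.
wheel = build_fake_wheel({
    "bs4/__init__.py": "",
    "beautifulsoup4-0.0.0.dist-info/top_level.txt": "bs4\n",
})
print(import_names_from_wheel(wheel))  # ['bs4']
```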

Phase 3: Health Signals (API)

Reuses the existing uzomuzo scan infrastructure to fetch:

  • Lifecycle status (Active, Stalled, EOL)
  • OpenSSF Scorecard score
  • Known vulnerabilities (advisories)

Phase 4: Scoring

Combines all signals into a priority score:

PriorityScore = GraphImpact × HealthRisk × (1 - CouplingEffort)

| Score | Range | Meaning |
| --- | --- | --- |
| GraphImpact | 0–1 | How much the dependency tree shrinks |
| HealthRisk | 0–1 | How risky keeping this dependency is |
| CouplingEffort | 0–1 | How hard it is to remove from code |

Difficulty labels:

| Label | CouplingEffort | Meaning |
| --- | --- | --- |
| trivial | 0.0 | Unused (0 imports): just delete the dependency declaration |
| easy | < 0.25 | 1–2 files, few call sites |
| moderate | 0.25–0.59 | Several files, moderate API usage |
| hard | ≥ 0.60 | Deeply integrated |
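
The formula and the label thresholds can be sketched in a few lines. This assumes the three component scores have already been computed elsewhere (how diet derives them is not covered here), and it reads the moderate band as everything from 0.25 up to, but not including, 0.60. The function names are illustrative.

```python
def priority_score(graph_impact, health_risk, coupling_effort):
    """PriorityScore = GraphImpact x HealthRisk x (1 - CouplingEffort)."""
    for name, value in (
        ("graph_impact", graph_impact),
        ("health_risk", health_risk),
        ("coupling_effort", coupling_effort),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {value}")
    return graph_impact * health_risk * (1.0 - coupling_effort)


def difficulty_label(coupling_effort):
    """Map CouplingEffort to the difficulty labels listed above."""
    if coupling_effort == 0.0:
        return "trivial"
    if coupling_effort < 0.25:
        return "easy"
    if coupling_effort < 0.60:
        return "moderate"
    return "hard"


print(priority_score(1.0, 0.5, 0.25))  # 0.375
print(difficulty_label(0.25))          # moderate
```

Because the score is a product, a dependency with CouplingEffort of 1.0 always scores 0, regardless of how risky it is or how much of the tree it would remove.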

Output Examples

Table

── Diet Plan (8 direct dependencies) ─────────────────────────

  Unused (0 imports):  4
  Quick wins:          2  (trivial/easy + high impact)

RANK  SCORE  EFFORT    PURL                                REMOVES  IMPORTS  CALLS  STATUS
────  ─────  ──────    ────                                ───────  ───────  ─────  ──────
1     0.48   easy      github.com/joho/godotenv            0        1        1      Active
2     0.40   trivial   github.com/tree-sitter/go-tree-sitter 0        0        0    Active
3     0.08   trivial   gopkg.in/yaml.v3                    0        0        0      Active
...

JSON

{
  "summary": {
    "total_direct": 8,
    "total_transitive": 0,
    "unused_direct": 4,
    "easy_wins": 2,
    "actionable_direct": 6,
    "transitive_only_by_one": 0
  },
  "dependencies": [
    {
      "rank": 1,
      "purl": "pkg:golang/github.com/joho/godotenv@v1.5.1",
      "name": "github.com/joho/godotenv",
      "priority_score": 0.48,
      "difficulty": "easy",
      "transitive_only_by_one": 0,
      "import_file_count": 1,
      "call_site_count": 1,
      "lifecycle": "Active"
    }
  ]
}
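
For CI use, the JSON output can be filtered with a short script. The sketch below relies only on fields that appear in the example above (rank, purl, difficulty); the function name is hypothetical, and filtering by difficulty alone is a simplification of the "quick wins" notion, which also considers impact.

```python
import json

# Trimmed copy of the example output shown above.
SAMPLE = """
{
  "summary": {"total_direct": 8, "unused_direct": 4, "easy_wins": 2},
  "dependencies": [
    {"rank": 1, "purl": "pkg:golang/github.com/joho/godotenv@v1.5.1",
     "priority_score": 0.48, "difficulty": "easy", "import_file_count": 1}
  ]
}
"""


def low_effort_deps(plan, labels=("trivial", "easy")):
    """Return dependencies whose difficulty is in labels, ordered by rank."""
    deps = plan.get("dependencies", [])
    picked = [dep for dep in deps if dep.get("difficulty") in labels]
    return sorted(picked, key=lambda dep: dep.get("rank", 0))


plan = json.loads(SAMPLE)
for dep in low_effort_deps(plan):
    print(dep["rank"], dep["difficulty"], dep["purl"])
```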

Diet Workflow: scan → diet → LLM → remove

The diet family of tools forms a pipeline from detection to removal:

uzomuzo scan           "This dep is risky"           Always in CI/CD
        ↓
uzomuzo diet           "Remove in this order"        Quarterly review
        ↓
/diet-assess-risk      "Risk of keeping it"          For EOL + hard deps
/diet-evaluate-removal "Cost-benefit of removal"     When unsure about moderate deps
        ↓
/diet-remove           "Remove it safely"            Actual removal work

| Tool | Role | Scope | When |
| --- | --- | --- | --- |
| uzomuzo scan | Detect: find EOL/Stalled deps | All deps, automated | Every CI build |
| uzomuzo diet | Prioritize: rank by removability | All deps, automated | Quarterly review |
| /diet-assess-risk | Assess risk: trace data flows, attack scenarios | One dep, LLM-powered | EOL deps with non-trivial coupling |
| /diet-evaluate-removal | Plan removal: 6-axis evaluation, replacement options | One dep, LLM-powered | When unsure if removal is worth the effort |
| /diet-remove | Execute: safe removal with verification | One dep, LLM-powered | Actual removal work |

Typical workflow

# Step 1: Generate the priority ranking
syft . --source-name myapp -o cyclonedx-json > bom.json
uzomuzo diet --sbom bom.json --source . --format json > diet.json

# Step 2: Trivial dependencies (0 imports) — just remove them
# No LLM needed. Delete the entry from go.mod or package.json; for Go modules, run `go mod tidy` afterwards.

# Step 3: EOL/Stalled deps with source coupling — assess risk first
/diet-assess-risk pkg:golang/github.com/foo/bar@v1.0.0

# Step 4: Moderate deps you're unsure about — evaluate removal cost
/diet-evaluate-removal github.com/foo/bar

# Step 5: Execute the removal with safety checks
/diet-remove github.com/foo/bar
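
The triage in steps 2 to 4 could be scripted against the JSON output. The sketch below is one possible reading of those steps, not a rule set that diet itself applies: it uses the difficulty and lifecycle fields from the JSON example, and the function name and the fall-through message are hypothetical.

```python
def suggested_next_step(dep):
    """Suggest a workflow step for one entry of the diet plan."""
    difficulty = dep.get("difficulty")
    lifecycle = dep.get("lifecycle")
    if difficulty == "trivial":
        # Step 2: no imports found, so no LLM assistance is needed.
        return "remove directly"
    if lifecycle in ("EOL", "Stalled"):
        # Step 3: unhealthy dependency that is still coupled to the code.
        return "/diet-assess-risk"
    if difficulty == "moderate":
        # Step 4: worth a cost-benefit evaluation before committing.
        return "/diet-evaluate-removal"
    return "no rule in this sketch"


for dep in (
    {"name": "unused-lib", "difficulty": "trivial", "lifecycle": "Active"},
    {"name": "old-lib", "difficulty": "hard", "lifecycle": "EOL"},
    {"name": "mid-lib", "difficulty": "moderate", "lifecycle": "Active"},
):
    print(dep["name"], "->", suggested_next_step(dep))
```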

JSON output for LLM consumption

# Feed diet plan to Claude Code for batch replacement suggestions
uzomuzo diet --sbom bom.json --source . --format json > diet.json
claude "Based on this diet plan, suggest code changes to remove the top 3 dependencies: $(cat diet.json)"

Understanding "Unused" Dependencies

Diet reports dependencies as "unused" when no import statement is found in source code. However, not all "unused" dependencies are removable. There are five common patterns:

1. Dev/build dependencies included in SBOM

SBOM tools may include devDependencies, test dependencies, and build tools alongside production dependencies. These are genuinely unused in production source code:

  • Linters and formatters (eslint, mypy, black)
  • Test frameworks (jest, pytest, vitest)
  • Documentation tools (sphinx, mkdocs)
  • Build tools (webpack, rollup, conventional-changelog-cli)

These are often the best candidates for removal from production SBOMs, as they inflate the dependency tree without contributing to runtime. See SBOM Tool Comparison for how different tools handle this.

2. Config-driven / runtime-loaded dependencies

Some dependencies are used via configuration files, annotations, or runtime class loading rather than explicit import statements:

  • Spring Boot starters — auto-configured via spring.factories, not imported directly
  • JDBC drivers (postgresql, mysql-connector-j) — loaded by URL string
  • Cache providers (caffeine) — specified in application.properties
  • Template engines (thymeleaf) — resolved by Spring MVC at runtime

These show 0 files / 0 calls in the coupling analysis, which is expected behavior, not a false positive. Diet still ranks them correctly: config-driven deps are easy to swap (low coupling) but may bring many transitive deps (high graph impact).

Java runtime-loaded dependency detection

Java has the highest density of runtime-loaded dependencies among supported languages. Diet uses a layered detection strategy:

| Layer | Mechanism | Coverage | Limitations |
| --- | --- | --- | --- |
| SBOM scope (not yet implemented) | CycloneDX scope field from cdxgen / CycloneDX Maven Plugin | optional (provided) and excluded (test) deps | Cannot distinguish compile from runtime: both map to CycloneDX required. Trivy/syft do not populate this field. See #303. |
| Runtime whitelist | Hardcoded mavenRuntimeDeps list of known groupId/artifactId coordinates | JDBC drivers, SLF4J logging backends, WebJars | Not exhaustive: unknown runtime deps will still be flagged as unused. Whitelist is expanded empirically via diet-fuzz testing. |
| Tree-sitter AST | Static analysis of import statements and call sites | Compile-time couplings (method calls, constructors, annotations, type declarations, generics, casts) | Cannot detect reflection (Class.forName(var)), ServiceLoader, Spring classpath scanning. Only 4.6% of reflection call sites are Class.forName calls with a string literal, which is the only form the AST can resolve. |

What is fundamentally undetectable:

  • Spring Boot autoconfiguration: @SpringBootApplication triggers classpath scanning. No static analysis can determine which dependencies are activated by Spring's spring.factories / AutoConfiguration.imports mechanism.
  • Variable-based reflection: Class.forName(driverName), where the class name comes from XML config, properties files, or method parameters.
  • ServiceLoader: java.util.ServiceLoader.load(Interface.class) discovers implementations via META-INF/services/ at runtime.

See ADR-0015 for the full investigation across 6 Java OSS projects.

3. Provided-scope dependencies (Maven provided, Gradle compileOnly)

Dependencies declared as provided scope compile against the API but are not bundled — the runtime environment (application server, container) supplies them at deploy time. Common examples:

  • javax.servlet-api / jakarta.servlet-api — provided by Tomcat/Jetty
  • lombok — annotation processor, removed at compile time
  • javax.annotation-api — provided by the JEE container

These typically DO have source-level imports (unlike runtime-loaded deps), so coupling analysis produces accurate data for them. When the SBOM tool populates the CycloneDX scope field with optional, this information could be used to annotate them in the diet plan (see #303 for the design).

4. Test-scope dependencies leaked into the SBOM

Some SBOM tools include test-scope dependencies (junit, mockito, testcontainers) alongside production dependencies. These appear as "unused" — which is correct (they have no production imports) but not actionable. CycloneDX SBOMs from cdxgen mark these with scope: "excluded", which could be used to filter them automatically (see #303). For now, use Trivy (which excludes test deps by default) or configure your SBOM tool to omit test-scope dependencies.

5. Leftover dependencies (genuine waste)

Dependencies that were once used but whose import was removed without cleaning up package.json / go.mod / pom.xml. These are the most valuable findings — they can be removed immediately with zero code changes.

SBOM Tool Comparison

The quality of diet analysis depends heavily on what the SBOM tool includes. Different tools handle development dependencies very differently:

| Tool | Dev deps included? | Scope metadata? | Notes |
| --- | --- | --- | --- |
| syft | Yes (all) | No | Includes everything: devDependencies, test deps, build tools. No way to filter. |
| Trivy | No (default) | No | Excludes dev deps by default. Use --include-dev-deps to include them. |
| cdxgen | Yes (all) | Yes (scope field) | Includes all deps but marks them as required, optional, or excluded. Diet does not yet use scope for filtering (see #303). |
| CycloneDX Maven Plugin | Configurable | Yes (scope field) | Respects Maven scopes. Note: both compile and runtime map to CycloneDX required, so scope alone cannot distinguish them. |

Real-world impact (Vue.js core)

| Tool | Components | Notes |
| --- | --- | --- |
| syft | 723 | All deps, no scope info |
| Trivy (default) | 34 | Dev deps excluded |
| Trivy (--include-dev-deps) | 684 | All deps included |
| cdxgen | 698 | All deps, with scope (required: 38, optional: 645) |

Recommendations

  • For accurate production dependency analysis: Use Trivy (default mode) or configure CycloneDX Maven/Gradle plugins to exclude test scope
  • For comprehensive diet analysis (including dev dep cleanup): Use syft or cdxgen to capture everything, then use diet's coupling analysis to distinguish genuinely unused deps from dev tools
  • For the most actionable results: Run diet twice — once with production-only SBOM (Trivy default) and once with full SBOM (syft) — to see both perspectives

Supported Languages

All languages use tree-sitter for AST-based analysis. Files larger than 1 MB are skipped. Test-data and build-output directories (testdata, __pycache__, target) and vendored code (vendor, node_modules) are excluded.

Go

| Capability | Details |
| --- | --- |
| Import syntaxes | import "pkg", import alias "pkg", grouped imports |
| Blank imports | import _ "pkg": detected, marked as side-effect (no call site tracking) |
| Dot imports | import . "pkg": detected, marked as uncountable (symbols callable without prefix) |
| Call sites | Selector expressions (pkg.Func), qualified types (pkg.Type) |
| Ecosystem features | Go tool directives (go.mod tool, Go 1.24+) excluded from unused detection |
| Import path handling | Strips major version suffixes (/v2), gopkg.in version suffixes (.v3), hyphenated package aliases (go-loser -> loser, geoip2-golang -> geoip2) |
| Limitations | Dot-imported symbols cannot be attributed to a specific dependency |
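
The import path handling described above amounts to guessing the package identifier that appears in source code from the module path. The sketch below is a rough approximation: the exact rules diet applies are not spelled out here, so the go- prefix and -golang suffix handling are assumptions derived from the two examples in the table, and the function name is hypothetical.

```python
import re


def guess_go_package_name(import_path):
    """Guess the identifier used in source code for a Go import path."""
    parts = import_path.split("/")
    last = parts[-1]
    # Major version suffix: github.com/foo/bar/v2 -> bar
    if len(parts) > 1 and re.fullmatch(r"v\d+", last):
        last = parts[-2]
    # gopkg.in version suffix: yaml.v3 -> yaml
    last = re.sub(r"\.v\d+$", "", last)
    # Hyphenated repository names (assumed rules): go-loser -> loser
    if last.startswith("go-"):
        last = last[len("go-"):]
    # geoip2-golang -> geoip2
    if last.endswith("-golang"):
        last = last[: -len("-golang")]
    return last


for path in (
    "github.com/foo/bar/v2",
    "gopkg.in/yaml.v3",
    "github.com/bboreham/go-loser",
    "github.com/oschwald/geoip2-golang",
):
    print(path, "->", guess_go_package_name(path))
```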

Python

| Capability | Details |
| --- | --- |
| Import syntaxes | import mod, import mod as alias, from mod import name, from mod import * |
| Wildcard imports | from x import *: detected, marked as uncountable |
| Type-checking imports | if TYPE_CHECKING: blocks are skipped entirely (no runtime coupling) |
| Try/except imports | try: import x except ImportError: is marked as feature-detection (blank import) |
| Call sites | obj.attr, bare function calls, decorator usage (@fixture) |
| Ecosystem features | Wheel-based import resolution (Phase 2.5 fallback): downloads smallest wheel to extract actual import names when heuristic paths match nothing. Strips common prefixes (python-, py-) from distribution names. |
| Limitations | Wildcard-imported names cannot be attributed. Type-checking blocks fully ignored. |
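
The prefix-stripping heuristic can be illustrated as below. This is a sketch under assumptions: lowercasing and replacing hyphens with underscores are added here for the example, and the function name is hypothetical. It also shows why the wheel fallback exists, since no name-based rule turns beautifulsoup4 into bs4.

```python
def candidate_import_names(distribution):
    """List import names to try for a PyPI distribution name."""
    normalized = distribution.lower().replace("-", "_")
    candidates = [normalized]
    # Strip the common python- / py- prefixes (already underscore-normalized).
    for prefix in ("python_", "py_"):
        if normalized.startswith(prefix) and len(normalized) > len(prefix):
            candidates.append(normalized[len(prefix):])
    return candidates


print(candidate_import_names("python-dateutil"))  # ['python_dateutil', 'dateutil']
print(candidate_import_names("beautifulsoup4"))   # ['beautifulsoup4']
```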

JavaScript / TypeScript / TSX

| Capability | Details |
| --- | --- |
| Import syntaxes | ESM (import, import { }, import * as), CommonJS (require()), dynamic import(), re-exports (export { } from, export * from) |
| Side-effect imports | import 'pkg', bare require('x'): marked as blank import |
| Type-only imports | TypeScript import type { }: skipped (no runtime coupling) |
| Call sites | Member expressions (obj.method), bare calls, constructors (new Foo), computed properties, JSX elements (<Component />), constant-only patterns |
| Framework detection | Angular: decorator array identifiers (@NgModule, @Component imports/declarations). Vue: defineComponent({ components: {} }) shorthand properties. |
| CJS destructuring | const { X } = require('pkg'): each destructured name tracked as alias |
| Limitations | Dynamic import bindings not tracked through await/.then(). JSX patterns only for .js/.jsx/.tsx (not .ts). Bare identifier matching may false-positive on shadowed locals. |

Java

| Capability | Details |
| --- | --- |
| Import syntaxes | import com.example.Class, import static com.example.Class.method, import com.example.* |
| Wildcard imports | import pkg.*, import static pkg.*: detected, marked as uncountable |
| Static imports | Last component registered as bare alias (assertEquals from import static org.junit.Assert.assertEquals) |
| Call sites | Method invocations, constructors (incl. generics new Foo<T>()), annotations, inheritance (extends/implements), type declarations, instanceof/casts, method references (Foo::bar), field access |
| Ecosystem features | Maven package overrides (~40 entries): maps groupId/artifactId to actual Java packages where they differ (Guava, Jackson, commons-*, Spring Boot starters). Runtime whitelist: JDBC drivers, logging backends, WebJars marked as ScopeRuntime. Spring Boot starter heuristics: derives package prefix from starter suffix. |
| Limitations | Reflection (Class.forName(var)), ServiceLoader, and Spring Boot autoconfiguration are fundamentally undetectable by static analysis. CycloneDX scope field not yet used (#303). See ADR-0015. |
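
The idea behind the Maven package overrides can be sketched as a lookup table with a fallback. The two override entries below are well-known coordinate-to-package mappings, but the table contents, the fallback to the groupId as a package prefix, and the function names are assumptions for illustration and do not reproduce diet's actual override list.

```python
# Illustrative overrides: Maven coordinates whose Java packages differ
# from the groupId. Not the tool's real table.
PACKAGE_OVERRIDES = {
    ("com.google.guava", "guava"): ["com.google.common"],
    ("com.fasterxml.jackson.core", "jackson-databind"): [
        "com.fasterxml.jackson.databind"
    ],
}


def package_prefixes(group_id, artifact_id):
    """Return package prefixes for a coordinate, defaulting to the groupId."""
    return PACKAGE_OVERRIDES.get((group_id, artifact_id), [group_id])


def import_matches(import_name, group_id, artifact_id):
    """Check whether a Java import belongs to the given Maven coordinate."""
    return any(
        import_name == prefix or import_name.startswith(prefix + ".")
        for prefix in package_prefixes(group_id, artifact_id)
    )


print(import_matches("com.google.common.collect.ImmutableList",
                     "com.google.guava", "guava"))  # True
print(import_matches("com.google.gson.Gson",
                     "com.google.guava", "guava"))  # False
```

Without the override, the Guava import would be compared against com.google.guava and never match, which is exactly the false "unused" result such a table is meant to prevent.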