Devour/AGENTS.md
2026-02-24 12:10:13 +01:00



name: desloppify
description: >
  Codebase health scanner and technical debt tracker. Use when the user asks
  about code quality, technical debt, dead code, large files, god classes,
  duplicate functions, code smells, naming issues, import cycles, or coupling
  problems. Also use when asked for a health score, what to fix next, or to
  create a cleanup plan. Supports 28 languages.
allowed-tools: Bash(devour quality*, devour review*, devour scorecard*)

Desloppify

Devour delegates quality/review operations to desloppify. Use Devour entrypoints (devour quality ..., devour review ..., devour scorecard) so users stay in one CLI surface.

1. Your Job

Improve code quality by fixing findings and maximizing the strict score honestly. Never hide debt with suppression patterns just to improve the lenient score. After every scan, show the user ALL scores:

| What | How |
| --- | --- |
| Overall health | lenient + strict |
| 5 mechanical dimensions | File health, Code quality, Duplication, Test health, Security |
| 7 subjective dimensions | Naming Quality, Error Consistency, Abstraction Fit, Logic Clarity, AI Generated Debt, Type Safety, Contract Coherence |

Never skip scores. The user tracks progress through them.

2. Core Loop

scan → follow the tool's strategy → fix or wontfix → rescan
  1. devour quality scan --path . — this delegates to desloppify scan; follow the output INSTRUCTIONS FOR AGENTS exactly.
  2. Fix the issue the tool recommends.
  3. devour quality resolve fixed "<id>" — or if it's intentional/acceptable: devour quality resolve wontfix "<id>" --note "reason why"
  4. Rescan with devour quality scan --path . to verify the fix.

Wontfix is not free. It lowers the strict score. The gap between lenient and strict IS wontfix debt. Call it out when:

  • Wontfix count is growing — challenge whether past decisions still hold
  • A dimension has been stuck for 3+ scans: suggest a different approach
  • Auto-fixers exist for open findings — ask why they haven't been run
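The first two call-outs above can be checked mechanically from scan history. The following is a minimal sketch, assuming a hypothetical per-scan summary record: `ScanRecord`, its fields, and `callouts` are illustrative names, not part of desloppify's state format, and "stuck" is taken here to mean an unchanged score.

```python
from dataclasses import dataclass, field


@dataclass
class ScanRecord:
    """Hypothetical summary of one scan; not desloppify's real state schema."""
    lenient: float
    strict: float
    wontfix_count: int
    dimensions: dict[str, float] = field(default_factory=dict)


def callouts(history: list[ScanRecord], stuck_after: int = 3) -> list[str]:
    """Return warnings to raise with the user; `history` is ordered oldest first."""
    notes: list[str] = []
    if not history:
        return notes
    latest = history[-1]

    # The lenient/strict gap is the wontfix debt described above.
    gap = latest.lenient - latest.strict
    if gap > 0:
        notes.append(f"wontfix debt: strict trails lenient by {gap:.1f} points")

    # Growing wontfix count: compare against the previous scan.
    if len(history) >= 2 and latest.wontfix_count > history[-2].wontfix_count:
        notes.append("wontfix count is growing: re-check past decisions")

    # A dimension counts as stuck if its score is identical over the last N scans.
    recent = history[-stuck_after:]
    if len(recent) == stuck_after:
        for name, score in latest.dimensions.items():
            if all(r.dimensions.get(name) == score for r in recent):
                notes.append(f"{name} stuck for {stuck_after}+ scans: try a different approach")
    return notes
```

Treating "stuck" as an exactly unchanged score is a simplification; a tolerance band may fit real score noise better.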

3. Commands

devour quality scan --path src/                          # full scan
devour quality scan --path src/ --reset-subjective      # reset subjective baseline to 0, then scan
devour quality next --count 5                            # top priorities
devour quality show <pattern>                            # filter by file/detector/ID
devour quality plan                                      # prioritized plan
devour quality fix <fixer> --dry-run                     # auto-fix (dry-run first!)
devour quality move <src> <dst> --dry-run                # move + update imports
devour quality resolve fixed|wontfix|false_positive "<pat>"   # classify finding outcome
devour review --prepare                                  # generate subjective review data
devour review --import file.json                         # import review results
devour review                                            # default batch run (codex+parallel+scan-after-import)

4. Subjective Reviews (biggest score lever)

Score = 40% mechanical + 60% subjective. Subjective starts at 0% until reviewed.
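As a worked example of the weighting above, here is a small sketch. It assumes both halves are already aggregated to a 0-100 scale; how desloppify aggregates the individual dimensions within each half is not specified here, and `overall_score` is an illustrative name.

```python
def overall_score(mechanical: float, subjective: float) -> float:
    """Blend the two halves with the 40/60 weights stated above (0-100 scale)."""
    return (40 * mechanical + 60 * subjective) / 100


# Strong mechanical health, but no subjective review imported yet:
print(overall_score(90, 0))   # 36.0
# The same codebase after a moderate subjective review:
print(overall_score(90, 70))  # 78.0
```

Under this weighting an unreviewed codebase cannot exceed 40, which is why subjective review is the biggest lever.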

  1. devour review --prepare — delegates to desloppify review --prepare and writes dimension definitions and codebase context to query.json.

  2. Review each dimension independently. For best results, review dimensions in isolation so scores don't bleed across concerns. If your agent supports parallel execution, use it — your agent-specific overlay (appended below, if installed) has the optimal approach. Each reviewer needs:

    • The codebase path and the dimensions to score
    • What each dimension means (from query.json's dimension_prompts)
    • The output format (below)
    • Nothing else — let them decide what to read and how
  3. Merge assessments (average scores if multiple reviewers cover the same dimension) and findings, then import:

    devour review --import findings.json
    

    Required output format per reviewer:

    {
      "assessments": { "naming_quality": 75.0, "logic_clarity": 82.0 },
      "findings": [{
        "dimension": "naming_quality",
        "identifier": "short_id",
        "summary": "one line",
        "related_files": ["path/to/file.py"],
        "evidence": ["specific observation"],
        "suggestion": "concrete action"
      }]
    }
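The merge step can be sketched as follows, assuming each reviewer returns a dict in the format above. `merge_reviews` and `write_findings` are illustrative helper names, and de-duplicating findings on (dimension, identifier) is an assumption of this sketch rather than documented desloppify behavior.

```python
import json
from collections import defaultdict
from pathlib import Path


def merge_reviews(reviews: list[dict]) -> dict:
    """Average assessments per dimension and combine findings from all reviewers."""
    scores: dict[str, list[float]] = defaultdict(list)
    findings: list[dict] = []
    seen: set[tuple[str, str]] = set()

    for review in reviews:
        for dimension, score in review.get("assessments", {}).items():
            scores[dimension].append(float(score))
        for finding in review.get("findings", []):
            # Assumption: two findings with the same dimension and identifier are duplicates.
            key = (finding["dimension"], finding["identifier"])
            if key not in seen:
                seen.add(key)
                findings.append(finding)

    assessments = {dim: sum(vals) / len(vals) for dim, vals in scores.items()}
    return {"assessments": assessments, "findings": findings}


def write_findings(reviews: list[dict], out_path: str = "findings.json") -> None:
    """Write the merged result to the file passed to `devour review --import`."""
    Path(out_path).write_text(json.dumps(merge_reviews(reviews), indent=2))
```

After `write_findings(...)`, the result can be imported with `devour review --import findings.json`.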
    

Need a clean subjective rerun from zero? Run devour quality scan --path src/ --reset-subjective before preparing/importing fresh review data.

Even moderate scores (60-80) dramatically improve overall health.

Integrity safeguard:

  • If one subjective dimension lands exactly on the strict target, the scanner warns and asks for re-review.
  • If two or more subjective dimensions land on the strict target in the same scan, those dimensions are auto-reset to 0 for that scan and must be re-reviewed/imported.
  • Reviewers should score from evidence only (not from target-seeking).
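The safeguard rule above can be illustrated with a short sketch. This is a model of the described behavior, not the scanner's actual code; `apply_integrity_safeguard` is a hypothetical name and the strict target is passed in because its value is not stated here.

```python
def apply_integrity_safeguard(
    assessments: dict[str, float], strict_target: float
) -> tuple[dict[str, float], list[str]]:
    """Return (adjusted assessments, warnings) following the rule described above."""
    on_target = sorted(d for d, s in assessments.items() if s == strict_target)
    adjusted = dict(assessments)
    warnings: list[str] = []

    if len(on_target) == 1:
        # One exact hit: keep the score, but ask for a re-review.
        warnings.append(f"{on_target[0]} landed exactly on the strict target; re-review it")
    elif len(on_target) >= 2:
        # Two or more exact hits: reset those dimensions to 0 for this scan.
        for dimension in on_target:
            adjusted[dimension] = 0.0
        warnings.append(
            "reset to 0 and must be re-reviewed/imported: " + ", ".join(on_target)
        )
    return adjusted, warnings
```

For example, with a hypothetical target of 95, two dimensions scored at exactly 95.0 would both be reset, while a single one would only trigger the warning.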

5. Quick Reference

  • Tiers: T1 auto-fix, T2 quick manual, T3 judgment call, T4 major refactor
  • Zones: production/script (scored), test/config/generated/vendor (not scored). Reassign a misclassified file with zone set.
  • Auto-fixers (TS only): unused-imports, unused-vars, debug-logs, dead-exports, etc.
  • query.json: After any command, has narrative.actions with prioritized next steps.
  • --skip-slow skips duplicate detection for faster iteration.
  • --lang python, --lang typescript, or --lang csharp to force language.
  • C# defaults to --profile objective; use --profile full to include subjective review.
  • Score can temporarily drop after fixes (cascade effects are normal).
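For the query.json note above, an agent could read the suggested next steps as sketched below. This assumes `narrative.actions` is a JSON list already in priority order; the shape of each entry and the file's location are not specified here, so the path is a parameter and entries are returned untouched. `next_actions` is an illustrative name.

```python
import json
from pathlib import Path


def next_actions(query_path: str = "query.json", limit: int = 5) -> list:
    """Return up to `limit` entries from narrative.actions, or [] if absent."""
    data = json.loads(Path(query_path).read_text())
    # Assumption: "narrative" is an object and "actions" is a list within it.
    actions = data.get("narrative", {}).get("actions", [])
    return actions[:limit]
```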

6. Escalate Tool Issues Upstream

When desloppify itself appears wrong or inconsistent:

  1. Capture a minimal repro (command, path, expected, actual).
  2. Open a GitHub issue in peteromallet/desloppify.
  3. If you can fix it safely, open a PR linked to that issue.
  4. If unsure whether it is a tool bug or a user workflow problem, open the issue first and the PR second.

Prerequisite

command -v desloppify >/dev/null 2>&1 && echo "desloppify: installed" || echo "NOT INSTALLED — run: pip install --upgrade \"desloppify[full]\""

Install-and-Run Prompt (Copy/Paste)

Install or update desloppify, then scan this codebase for code quality issues
(dead code, duplication, complexity, smells). It tracks findings persistently
across sessions and reports a health score.

Requires Python 3.11+. Install from PyPI:
pip install --upgrade desloppify
# Recommended for deepest coverage (tree-sitter + Python security adapter):
pip install --upgrade "desloppify[full]"
desloppify update-skill codex    # pick yours: claude, cursor, codex, copilot, windsurf, gemini

Use Devour wrappers (delegates to desloppify):
devour quality scan --path .
devour quality status
devour quality next

Direct equivalent commands:
desloppify scan --path .
desloppify status
desloppify next

--path is the directory to scan (use "." for the whole project, or "src/" etc).
Language is auto-detected. To override: desloppify --lang python scan --path .
(note: --lang goes BEFORE the subcommand)

Fix what it finds, then:
desloppify resolve fixed <id> --note "what changed" --attest "I have actually [DESCRIBE THE CONCRETE CHANGE YOU MADE] and I am not gaming the score by resolving without fixing."

For findings that are intentional or false positives:
desloppify resolve wontfix <id> --note "reason" --attest "I have actually verified this is intentional/false-positive and I am not gaming the score by resolving without fixing."

If subjective scores feel stale or inflated:
desloppify scan --path . --reset-subjective

Because state persists, run regularly (for example before each push).
If anything seems wrong/confusing in desloppify itself, capture a repro and ask
whether to log an upstream issue.

If you need to debug desloppify internals:
git clone https://github.com/peteromallet/desloppify.git /tmp/desloppify

Codex Overlay

This is the canonical Codex overlay used by the README install command.

  1. Prefer first-class batch runs: devour review (default) or devour review --run-batches --runner codex --parallel.
  2. The command writes immutable packet snapshots under .desloppify/review_packets/holistic_packet_*.json; use those for reproducible retries.
  3. Keep reviewer input scoped to the immutable packet and the source files named in each batch.
  4. Do not use prior chat context, score history, narrative summaries, issue labels, or target-threshold anchoring while scoring.
  5. Assess every dimension listed in query.dimensions; never drop a requested dimension. If evidence is weak/mixed, score lower and explain uncertainty in findings.
  6. Return machine-readable JSON only for review imports:
{
  "assessments": {
    "<dimension_from_query>": 0
  },
  "findings": []
}
  7. Keep findings schema compatible with query.system_prompt.
  8. If a batch fails, retry only that slice with devour review --run-batches --packet <packet.json> --only-batches <idxs>.
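Before importing, a reviewer's JSON can be checked against the rules above. The sketch below assumes the requested dimensions are available as a list of names (for example, read from query.dimensions) and that scores fall in a 0-100 range; both are assumptions, and `validate_review` is a hypothetical helper, not a desloppify API.

```python
def validate_review(review: dict, requested: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the review looks importable."""
    problems: list[str] = []

    assessments = review.get("assessments")
    if not isinstance(assessments, dict):
        return ["'assessments' must be an object mapping dimension to score"]

    # Every requested dimension must be assessed; none may be dropped.
    for dimension in requested:
        if dimension not in assessments:
            problems.append(f"missing assessment for {dimension}")

    for dimension, score in assessments.items():
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        # Assumption: scores are on a 0-100 scale.
        if not is_number or not 0 <= score <= 100:
            problems.append(f"invalid score for {dimension}: {score!r}")

    if not isinstance(review.get("findings"), list):
        problems.append("'findings' must be a list (use [] when there are none)")
    return problems
```

Running this on each batch result before merging makes it easier to retry only the slice that failed.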