mirror of
https://github.com/Dvorinka/Devour.git
synced 2026-06-03 20:13:03 +00:00
update
@@ -0,0 +1,433 @@
{
  "assessments": {
    "abstraction_fitness": {
      "score": 40.7,
      "components": [
        "Abstraction Leverage",
        "Indirection Cost",
        "Interface Honesty"
      ],
      "component_scores": {
        "Abstraction Leverage": 60.6,
        "Indirection Cost": 71.4,
        "Interface Honesty": 63.6
      }
    },
    "cross_module_architecture": 68.3,
    "design_coherence": 40.9,
    "error_consistency": 45.2,
    "test_strategy": 46.4
  },
  "dimension_notes": {
    "cross_module_architecture": {
      "evidence": [
        "All assigned `internal/quality/*.go` files stay within a single package boundary (`package quality`) and only import standard library packages (`time`, `testing`, `strings`), with no cross-package dependency fan-out from this slice.",
        "`pkg/rustdocs/parser_test.go` is isolated to `package rustdocs` and imports only stdlib plus `github.com/PuerkitoBio/goquery`; it does not couple into `internal/quality` types or helpers.",
        "No `init()` functions, package-level mutable singleton wiring, or import-time execution patterns were found in the reviewed files; behavior is test-function scoped and constructor-invoked (`NewParser`, `NewScorer`, `NewNarrativeGenerator`).",
        "Type declarations in `internal/quality/types.go` and `internal/quality/enhanced_types.go` are cohesive data-model definitions within one module boundary rather than cross-module shims or compatibility layers."
      ],
      "impact_scope": "local",
      "fix_scope": "single_edit",
      "confidence": "high",
      "unreported_risk": "This batch covers only five files; architectural hotspots could still exist in non-assigned packages (e.g., runtime wiring or broader dependency graph) outside this evidence window."
    },
    "abstraction_fitness": {
      "evidence": [
        "Language-to-doc behavior is spread across multiple large switches: URL construction in cmd/get.go:78-173, type mapping in cmd/get.go:175-205, and term derivation in cmd/ask.go:205-260+.",
        "External scraper implementations repeat the same transport/change-detection scaffold (config+parser+http client fields, URL check, fetchPage, generateHash, DetectChanges) across multiple files, e.g. internal/scraper/external/godocs.go:17-121, internal/scraper/external/javadocs.go:16-115, internal/scraper/external/nuxtdocs.go:16-120, internal/scraper/external/cloudflaredocs.go:16-105.",
        "Vector store abstraction exposes implementations that are selected by default config but intentionally unimplemented: internal/config/config.go:121-125 defaults to chromem, while internal/vector/store.go:221-243 returns \"chromem store not implemented\" for all operations.",
        "Configuration defaults are duplicated in two representations (typed defaults and hand-written YAML template), increasing drift risk: cmd/init.go:92-149 and internal/config/config.go:104-160."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change",
      "confidence": "high",
      "unreported_risk": "",
      "sub_axes": {
        "abstraction_leverage": 62.0,
        "indirection_cost": 71.0,
        "interface_honesty": 60.0
      }
    },
    "test_strategy": {
      "evidence": [
        "`internal/quality/narrative_test.go` validates exact headline/action prose and directly tests internal helper behavior (e.g., `determinePhase`, `generateHeadline`, `classifyDimension`), creating high implementation-coupling.",
        "`internal/quality/scoring_test.go` similarly focuses on exact internal scoring details and string key literals, which makes refactors noisy and discourages safe design changes.",
        "`pkg/rustdocs/parser_test.go` is heavily happy-path: it checks successful parses and minimal field presence but has no malformed-input/error-path cases for parser resilience.",
        "`README.md` marks parts of the CLI as unstable/stubbed, but assigned tests do not provide cross-module contract/integration safety nets for those runtime boundaries."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor",
      "confidence": "high",
      "unreported_risk": ""
    },
    "design_coherence": {
      "evidence": [
        "Parallel implementations of the same scorecard pipeline exist in `cmd/devour_scorecard.py` and `cmd/scorecard_generator.py` with near-identical function layouts (`ScorecardData`, `score_color`, `draw_left_panel`, `draw_right_panel`, `generate_scorecard`, `main`) and only minor line-level differences.",
        "Three variants of enhanced generator (`cmd/devour_enhanced.py`, `cmd/devour_enhanced_fixed.py`, `cmd/devour_enhanced_v2.py`) repeat almost the full rendering stack (`draw_header_section`, `draw_enhanced_left_panel`, `draw_enhanced_right_panel`, `draw_trends_section`, `load_enhanced_devour_data`), creating branch-by-copy evolution.",
        "Scraper adapters across providers (`internal/scraper/external/astrodocs.go`, `internal/scraper/external/cloudflaredocs.go`, `internal/scraper/external/reactdocs.go`) duplicate fetch/hash/change-detection and document assembly patterns with provider-specific data glued inline, indicating repeated structural pattern without shared orchestration abstraction.",
        "Within `cmd/devour_lighthouse.py`, `load_font` is defined twice (once near top and again later), showing local design drift and utility ownership ambiguity."
      ],
      "impact_scope": "codebase",
      "fix_scope": "architectural_change",
      "confidence": "high",
      "unreported_risk": ""
    },
    "error_consistency": {
      "evidence": [
        "Raw error passthrough is common in core flows (e.g., `return nil, err` in `internal/search/engine.go:114`, `internal/search/engine.go:122`, `internal/scraper/openapi.go:45`, `internal/scraper/openapi.go:50`) while nearby code wraps with operation context (e.g., `internal/search/engine.go:111`, `internal/scraper/openapi.go:153`).",
        "Failure handling style diverges between aborting, propagating, and suppressing in similar backend paths: `panic(...)` in `internal/quality/plugins/go/plugin.go:363`, warning print-and-continue in `internal/indexer/indexer.go:239`, and plain returns in `cmd/scrape.go:90`/`cmd/get.go:59`.",
        "Some call paths lose caller context at command boundaries (`cmd/scrape.go:90`, `cmd/scrape.go:125`, `cmd/get.go:59`) despite contextual wrapping being used in other command-layer branches (`cmd/scrape.go:131`, `cmd/scrape.go:145`)."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor",
      "confidence": "high",
      "unreported_risk": ""
    }
  },
  "findings": [
    {
      "dimension": "abstraction_fitness",
      "identifier": "language_catalog_scattered_switches",
      "summary": "Language routing logic is duplicated across CLI flows instead of one catalog abstraction",
      "related_files": [
        "cmd/get.go",
        "cmd/ask.go"
      ],
      "evidence": [
        "cmd/get.go:78-173 defines a large language switch for URL building; cmd/get.go:175-205 defines a second switch for source type mapping.",
        "cmd/ask.go:205-260+ adds a third language switch for term heuristics, creating three independent sources of truth for one domain model."
      ],
      "suggestion": "Introduce a single `LanguageSpec` registry (aliases, source type, URL builder, optional query-term strategy) in one package and have both `get` and `ask` consume it; keep per-language behavior as data/functions attached to that registry.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "external_scraper_scaffold_duplication",
      "summary": "External scraper adapters reimplement the same transport/hash lifecycle repeatedly",
      "related_files": [
        "internal/scraper/external/godocs.go",
        "internal/scraper/external/javadocs.go",
        "internal/scraper/external/nuxtdocs.go",
        "internal/scraper/external/cloudflaredocs.go"
      ],
      "evidence": [
        "Each file defines near-identical struct fields (`config`, `parser`, `client`), constructor wiring, URL-required guard, `fetchPage`, `generateHash`, and `DetectChanges` flow (e.g., godocs.go:17-121 and javadocs.go:16-115).",
        "Duplication scales linearly with each new source adapter, increasing edit surface for cross-cutting behavior (timeouts, headers, error mapping)."
      ],
      "suggestion": "Extract a shared `HTTPDocScraperBase` (or composable helper functions) for request execution, status handling, hashing, and change detection; keep each adapter focused on parser invocation and domain-specific document mapping.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "default_selects_unimplemented_store",
      "summary": "Store interface contract is dishonest because default backend is not operational",
      "related_files": [
        "internal/vector/store.go",
        "internal/config/config.go"
      ],
      "evidence": [
        "internal/config/config.go:121-125 sets default vector DB type to `chromem`.",
        "internal/vector/store.go:221-243 returns `chromem store not implemented` for all `Store` operations after `NewStore` can select that backend (store.go:63-72)."
      ],
      "suggestion": "Either implement `ChromemStore` before exposing it as default, or switch default to a working backend and gate chromem behind explicit opt-in plus capability check at startup.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "config_defaults_double_encoded",
      "summary": "Initialization defaults are encoded twice with different abstractions",
      "related_files": [
        "cmd/init.go",
        "internal/config/config.go"
      ],
      "evidence": [
        "cmd/init.go:92-149 hardcodes YAML defaults as a template string.",
        "internal/config/config.go:104-160 hardcodes defaults again in typed structs, requiring synchronized updates across two representations."
      ],
      "suggestion": "Generate init YAML from `config.Default()` via marshal + small post-processing/comments, or maintain a single canonical defaults schema consumed by both loader and init command.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "cross_module_architecture",
      "identifier": "status_contract_string_map_boundary",
      "summary": "Scorecard state uses string keys instead of shared Status type, weakening module contracts.",
      "related_files": [
        "internal/quality/types.go",
        "internal/quality/scoring_test.go",
        "README.md"
      ],
      "evidence": [
        "`internal/quality/types.go` defines `Status` constants but `Scorecard.StatusByType` is `map[string]int`.",
        "`internal/quality/scoring_test.go` asserts `card.StatusByType[\"open\"]` and `card.StatusByType[\"fixed\"]` directly.",
        "README promises resolution-state tracking, but this boundary is not type-safe."
      ],
      "suggestion": "Change `Scorecard.StatusByType` to `map[Status]int` (or a dedicated typed struct), update serialization adapters if needed, and update tests to assert using `StatusOpen`/`StatusFixed` constants.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "test_strategy",
      "identifier": "brittle_private_and_copy_assertions_in_quality_tests",
      "summary": "Quality tests are tightly coupled to private helpers and exact copy text, reducing refactor safety.",
      "related_files": [
        "internal/quality/narrative_test.go",
        "internal/quality/scoring_test.go"
      ],
      "evidence": [
        "`narrative_test.go` directly asserts exact strings for generated headlines/actions and tests helper internals rather than stable external behavior.",
        "`scoring_test.go` anchors on specific internal weighting outputs and literal status strings, which can fail on benign internal redesigns."
      ],
      "suggestion": "Shift to contract-level tests against exported APIs with invariant assertions (phase category, presence of required fields, monotonic score behavior), and keep only a small set of snapshot/copy tests for user-facing text.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "test_strategy",
      "identifier": "rust_parser_missing_negative_and_boundary_cases",
      "summary": "Rust parser tests miss malformed-input and degradation-path coverage.",
      "related_files": [
        "pkg/rustdocs/parser_test.go",
        "README.md"
      ],
      "evidence": [
        "`parser_test.go` cases are successful parses with valid fixture HTML and only basic assertions.",
        "No tests verify behavior for malformed HTML, missing selectors, empty documents, or unsupported result rows.",
        "README positions docs ingestion as core functionality, so parser failure behavior is a critical path."
      ],
      "suggestion": "Add table-driven negative tests for malformed/partial HTML, empty search blocks, and missing headings; assert stable fallback behavior (explicit error or safe zero-value output) for each parser entrypoint.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "single_edit"
    },
    {
      "dimension": "design_coherence",
      "identifier": "scorecard_variant_sprawl",
      "summary": "Scorecard generation is maintained as multiple copy-variants instead of one composable pipeline.",
      "related_files": [
        "cmd/devour_scorecard.py",
        "cmd/scorecard_generator.py",
        "cmd/devour_enhanced.py",
        "cmd/devour_enhanced_fixed.py",
        "cmd/devour_enhanced_v2.py"
      ],
      "evidence": [
        "Both `cmd/devour_scorecard.py` and `cmd/scorecard_generator.py` declare the same major functions and data model in the same order with only minor stylistic deltas.",
        "Enhanced variants repeat the same section render functions and data loading flow, then diverge by ad-hoc edits, increasing change fan-out for any layout or scoring rule update."
      ],
      "suggestion": "Extract a shared rendering core module (palette/fonts/layout primitives + data normalization), keep one canonical CLI entrypoint, and convert variant behavior into explicit theme/feature flags rather than duplicated files.",
      "confidence": "high",
      "impact_scope": "codebase",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "design_coherence",
      "identifier": "external_scraper_template_duplication",
      "summary": "Provider scrapers repeat the same orchestration flow with per-provider copy/paste adapters.",
      "related_files": [
        "internal/scraper/external/astrodocs.go",
        "internal/scraper/external/cloudflaredocs.go",
        "internal/scraper/external/reactdocs.go",
        "internal/scraper/external/godocs.go",
        "internal/scraper/external/vuedocs.go"
      ],
      "evidence": [
        "Each scraper reimplements nearly identical `Scrape`, `DetectChanges`, `fetchPage`, and `generateHash` scaffolding, then inlines provider-specific conversion methods.",
        "The repeated constructor/client/parser wiring pattern appears across multiple files, indicating systemic pattern duplication rather than isolated differences."
      ],
      "suggestion": "Introduce a shared `DocAdapter` contract and a generic `HTTPDocScraper` that owns fetch/hash/change-detect; keep provider files focused on mapping parsed domain objects to `Document`.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "design_coherence",
      "identifier": "utility_ownership_drift_in_lighthouse_script",
      "summary": "Duplicate utility definition in one file shows mixed responsibility boundaries.",
      "related_files": [
        "cmd/devour_lighthouse.py",
        "cmd/devour_enhanced.py"
      ],
      "evidence": [
        "`cmd/devour_lighthouse.py` defines `load_font` twice with effectively the same fallback behavior, creating hidden override risk and unclear source of truth.",
        "Comparable font utility exists in other renderer scripts, reinforcing that shared utility concerns are spread instead of centralized."
      ],
      "suggestion": "Remove the duplicate in `cmd/devour_lighthouse.py` and move font-loading helpers into a shared module imported by all renderer scripts.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "error_consistency",
      "identifier": "mixed_error_wrapping_in_scrape_and_search_paths",
      "summary": "Related scrape/search paths mix raw passthrough and contextual wrapping.",
      "related_files": [
        "internal/search/engine.go",
        "internal/scraper/openapi.go",
        "internal/scraper/localsearch.go",
        "cmd/scrape.go"
      ],
      "evidence": [
        "`internal/search/engine.go` frequently returns raw errors (`:114`, `:117`, `:122`, `:170`) but also uses contextual errors (`:111`, `:230`).",
        "`internal/scraper/openapi.go` propagates raw errors from `readSpec`/`parseOpenAPISpec` (`:45`, `:50`, `:123`, `:141`, `:149`, `:157`, `:164`) while also defining wrapped errors (`:135`, `:153`, `:217`).",
        "`internal/scraper/localsearch.go` returns raw errors from helper boundaries (`:79`, `:164`, `:191`, `:222`) mixed with rich wrapped messages in the same workflow (`:196`, `:203`, `:209`, `:217`)."
      ],
      "suggestion": "Define a package-level rule: public methods must wrap downstream errors with operation context (using `%w`), and helper internals may return raw errors. Apply this consistently to `Rebuild/EnsureIndexed`, `OpenAPIScraper.Scrape/DetectChanges/readSpec`, and `LocalSearchScraper` methods.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "error_consistency",
      "identifier": "inconsistent_failure_channel_panic_vs_error_vs_warning",
      "summary": "Failure signaling varies between panic, error return, and warning-only logging.",
      "related_files": [
        "internal/quality/plugins/go/plugin.go",
        "internal/indexer/indexer.go",
        "cmd/scrape.go"
      ],
      "evidence": [
        "`internal/quality/plugins/go/plugin.go:363` panics on plugin registration failure.",
        "`internal/indexer/indexer.go:239` prints a warning and suppresses deletion errors instead of returning them.",
        "`cmd/scrape.go` is structured around returned errors (`:131`, `:145`, `:207`) and has no panic-based handling, creating inconsistent contracts across subsystems."
      ],
      "suggestion": "Standardize on explicit error returns for recoverable startup/runtime failures; replace plugin `panic` with registration error propagation or controlled process-exit at the command entrypoint, and make indexer deletion behavior explicit (either aggregate and return partial-failure errors or document/encode best-effort mode).",
      "confidence": "high",
      "impact_scope": "codebase",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "error_consistency",
      "identifier": "command_boundary_context_loss",
      "summary": "CLI command boundaries sometimes return raw errors without command context.",
      "related_files": [
        "cmd/get.go",
        "cmd/scrape.go",
        "internal/config/config.go"
      ],
      "evidence": [
        "`cmd/get.go:59` and `cmd/scrape.go:90`/`:125` return raw errors directly from downstream calls.",
        "Other branches in the same command wrap with explicit context (`cmd/scrape.go:131`, `cmd/scrape.go:145`, `cmd/scrape.go:154`).",
        "Config layer already emits contextual wrapped errors (`internal/config/config.go:177`, `internal/config/config.go:181`), so command-layer inconsistency creates uneven user-facing diagnostics."
      ],
      "suggestion": "At CLI entrypoints, wrap all returned downstream errors with command/action context (e.g., `run get`, `load config`, `scrape source`) and preserve root cause with `%w`; keep user-readable validation errors as direct messages.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "cross_module_architecture",
      "identifier": "init_side_effect_registration_coupling",
      "summary": "Scraper registration depends on import-time side effects and global mutable registry state.",
      "related_files": [
        "cmd/root.go",
        "internal/scraper/external/register.go",
        "internal/scraper/registry_simple.go"
      ],
      "evidence": [
        "Blank import in root command triggers registration implicitly rather than explicit bootstrap wiring.",
        "Registration happens in `init()` and mutates shared global registry."
      ],
      "suggestion": "Replace import-time registration with explicit bootstrap registration (e.g., `RegisterExternalScrapers()` called from startup), and pass registry instances through constructors to remove hidden global coupling.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "error_consistency",
      "identifier": "mixed_process_termination_and_error_propagation",
      "summary": "Error handling mixes panic/log.Fatal/os.Exit with returned errors across adjacent layers.",
      "related_files": [
        "cmd/root.go",
        "cmd/scorecard.go",
        "internal/quality/plugins/go/plugin.go",
        "cleanup_unused.go"
      ],
      "evidence": [
        "`Execute()` exits process directly; scorecard helper exits inside utility flow; plugin registration panics on failure.",
        "Most other command paths return wrapped errors, creating inconsistent failure semantics."
      ],
      "suggestion": "Standardize on returning errors from library/command internals and only perform process exit in one top-level entrypoint; replace panic/log.Fatal in shared code with typed/wrapped errors.",
      "confidence": "high",
      "impact_scope": "codebase",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "external_scraper_boilerplate_without_shared_base",
      "summary": "External scraper implementations duplicate fetch/hash/error/document plumbing instead of sharing a base abstraction.",
      "related_files": [
        "internal/scraper/external/godocs.go",
        "internal/scraper/external/rustdocs.go",
        "internal/scraper/external/reactdocs.go",
        "internal/scraper/external/nuxtdocs.go",
        "internal/scraper/external/cloudflaredocs.go",
        "internal/scraper/external/types.go"
      ],
      "evidence": [
        "Each scraper repeats `fetchPage`, status code checks, hash generation, and near-identical scrape control flow.",
        "Alias-only types file adds indirection without behavior."
      ],
      "suggestion": "Introduce a shared external-scraper base/helper for HTTP fetch, retries, hashing, and common error mapping; keep only parser-specific extraction and document shaping in each language scraper.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "test_strategy",
      "identifier": "untested_unimplemented_runtime_paths",
      "summary": "Core runtime paths are both under-tested and partially stubbed, leaving high-risk behavior unvalidated.",
      "related_files": [
        "internal/server/server.go",
        "pkg/client/client.go",
        "internal/vector/store.go",
        "internal/search/engine.go",
        "internal/indexer/indexer.go"
      ],
      "evidence": [
        "Server/client/store contain TODO or not-implemented branches without direct tests.",
        "No direct test files exist for several core modules that govern querying, indexing, and serving."
      ],
      "suggestion": "Add table-driven tests for client/server/store/indexer contracts first (error behavior and non-nil results), then implement missing paths behind those tests; prioritize integration tests that exercise scrape->index->query flow.",
      "confidence": "high",
      "impact_scope": "codebase",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "design_coherence",
      "identifier": "command_files_mix_multiple_responsibilities",
      "summary": "Large CLI command files blend orchestration, domain logic, persistence, and formatting concerns.",
      "related_files": [
        "cmd/quality.go",
        "cmd/scrape.go",
        "cmd/ask.go"
      ],
      "evidence": [
        "`cmd/quality.go` combines scan setup, scoring/status persistence, resolve/fix/review workflows.",
        "`cmd/scrape.go` combines config parsing, source detection/profiling, scrape execution, indexing, and source-state updates.",
        "`cmd/ask.go` includes query derivation, source URL heuristics, ranking, summarization, and output formatting in one command module."
      ],
      "suggestion": "Split command files into focused packages (transport/CLI binding vs service layer vs persistence helpers) and keep Cobra handlers as thin adapters invoking composable use-case functions.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    }
  ],
  "review_quality": {
    "batch_count": 6,
    "dimension_coverage": 0.367,
    "evidence_density": 2.167,
    "high_score_without_risk": 0,
    "finding_pressure": 49.296,
    "dimensions_with_findings": 5
  }
}
@@ -0,0 +1,50 @@
You are a focused subagent reviewer for a single holistic investigation batch.

Repository root: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour
Immutable packet: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour/.desloppify/review_packets/holistic_packet_20260223_100953.json
Batch index: 1
Batch name: Architecture & Coupling
Batch dimensions: cross_module_architecture
Batch rationale: god modules, import-time side effects

Files assigned:
- internal/quality/enhanced_types.go
- internal/quality/narrative_test.go
- internal/quality/scoring_test.go
- internal/quality/types.go
- pkg/rustdocs/parser_test.go

Task requirements:
1. Read the immutable packet and follow `system_prompt` constraints exactly.
2. Evaluate ONLY listed files and ONLY listed dimensions for this batch.
3. Return 0-10 high-quality findings for this batch (empty array allowed).
4. Score/finding consistency is required: broader or more severe findings MUST lower dimension scores.
5. Every finding must include `related_files` with at least 2 files when possible.
6. Every finding must include `impact_scope` and `fix_scope`.
7. Every scored dimension MUST include dimension_notes with concrete evidence.
8. If a dimension score is >85, include `unreported_risk` in dimension_notes.
9. Use exactly one decimal place for every assessment and abstraction sub-axis score.
10. Do not edit repository files.
11. Return ONLY valid JSON, no markdown fences.

Scope enums:
- impact_scope: "local" | "module" | "subsystem" | "codebase"
- fix_scope: "single_edit" | "multi_file_refactor" | "architectural_change"

Output schema:
{
  "batch": "Architecture & Coupling",
  "batch_index": 1,
  "assessments": {"<dimension>": <0-100 with one decimal place>},
  "dimension_notes": {
    "<dimension>": {
      "evidence": ["specific code observations"],
      "impact_scope": "local|module|subsystem|codebase",
      "fix_scope": "single_edit|multi_file_refactor|architectural_change",
      "confidence": "high|medium|low",
      "unreported_risk": "required when score >85",
      "sub_axes": {"abstraction_leverage": 0-100 with one decimal place, "indirection_cost": 0-100 with one decimal place, "interface_honesty": 0-100 with one decimal place} // required for abstraction_fitness when evidence supports it
    }
  },
  "findings": []
}
@@ -0,0 +1,77 @@
You are a focused subagent reviewer for a single holistic investigation batch.

Repository root: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour
Immutable packet: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour/.desloppify/review_packets/holistic_packet_20260223_100953.json
Batch index: 2
Batch name: Abstractions & Dependencies
Batch dimensions: abstraction_fitness
Batch rationale: abstraction hotspots (wrappers/interfaces/param bags), dep cycles

Files assigned:
- cmd/scrape.go
- internal/quality/plugins/go/analyzers/detectors.go
- internal/quality/plugins/go/analyzers/advanced.go
- internal/scraper/web.go
- internal/quality/plugins/go/plugin.go
- internal/scheduler/scheduler.go
- cmd/init.go
- internal/scraper/localsearch_test.go
- internal/config/config.go
- internal/ai/openai.go
- cmd/get.go
- cmd/get_test.go
- internal/quality/analyzers/controlflow.go
- internal/vector/store.go
- cmd/ask.go
- examples/demo_scrapers.go
- internal/indexer/indexer.go
- internal/scraper/openapi.go
- pkg/pythondocs/parser.go
- internal/quality/analyzers/dataflow.go
- internal/quality/scanner_test.go
- internal/server/server.go
- internal/scraper/localsearch.go
- internal/scraper/external/nuxtdocs.go
- internal/quality/plugins/go/analyzers/test_coverage.go
- internal/search/engine.go
- internal/scraper/external/astrodocs.go
- internal/scraper/external/cloudflaredocs.go
- internal/scraper/external/dockerdocs.go
- internal/scraper/external/godocs.go
- internal/scraper/external/javadocs.go
- internal/scraper/external/mcpdocs.go

Task requirements:
1. Read the immutable packet and follow `system_prompt` constraints exactly.
2. Evaluate ONLY listed files and ONLY listed dimensions for this batch.
3. Return 0-10 high-quality findings for this batch (empty array allowed).
4. Score/finding consistency is required: broader or more severe findings MUST lower dimension scores.
5. Every finding must include `related_files` with at least 2 files when possible.
6. Every finding must include `impact_scope` and `fix_scope`.
7. Every scored dimension MUST include dimension_notes with concrete evidence.
8. If a dimension score is >85, include `unreported_risk` in dimension_notes.
9. Use exactly one decimal place for every assessment and abstraction sub-axis score.
10. Do not edit repository files.
11. Return ONLY valid JSON, no markdown fences.

Scope enums:
- impact_scope: "local" | "module" | "subsystem" | "codebase"
- fix_scope: "single_edit" | "multi_file_refactor" | "architectural_change"

Output schema:
{
  "batch": "Abstractions & Dependencies",
  "batch_index": 2,
  "assessments": {"<dimension>": <0-100 with one decimal place>},
  "dimension_notes": {
    "<dimension>": {
      "evidence": ["specific code observations"],
      "impact_scope": "local|module|subsystem|codebase",
      "fix_scope": "single_edit|multi_file_refactor|architectural_change",
      "confidence": "high|medium|low",
      "unreported_risk": "required when score >85",
      "sub_axes": {"abstraction_leverage": 0-100 with one decimal place, "indirection_cost": 0-100 with one decimal place, "interface_honesty": 0-100 with one decimal place} // required for abstraction_fitness when evidence supports it
    }
  },
  "findings": []
}
@@ -0,0 +1,51 @@
You are a focused subagent reviewer for a single holistic investigation batch.

Repository root: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour
Immutable packet: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour/.desloppify/review_packets/holistic_packet_20260223_100953.json
Batch index: 3
Batch name: Governance & Contracts
Batch dimensions: cross_module_architecture, test_strategy
Batch rationale: architecture contracts, compatibility policy, docs-vs-runtime scope, and quality-gate coverage

Files assigned:
- README.md
- internal/quality/enhanced_types.go
- internal/quality/narrative_test.go
- internal/quality/scoring_test.go
- internal/quality/types.go
- pkg/rustdocs/parser_test.go

Task requirements:
1. Read the immutable packet and follow `system_prompt` constraints exactly.
2. Evaluate ONLY listed files and ONLY listed dimensions for this batch.
3. Return 0-10 high-quality findings for this batch (empty array allowed).
4. Score/finding consistency is required: broader or more severe findings MUST lower dimension scores.
5. Every finding must include `related_files` with at least 2 files when possible.
6. Every finding must include `impact_scope` and `fix_scope`.
7. Every scored dimension MUST include dimension_notes with concrete evidence.
8. If a dimension score is >85, include `unreported_risk` in dimension_notes.
9. Use exactly one decimal place for every assessment and abstraction sub-axis score.
10. Do not edit repository files.
11. Return ONLY valid JSON, no markdown fences.

Scope enums:
- impact_scope: "local" | "module" | "subsystem" | "codebase"
- fix_scope: "single_edit" | "multi_file_refactor" | "architectural_change"

Output schema:
{
  "batch": "Governance & Contracts",
  "batch_index": 3,
  "assessments": {"<dimension>": <0-100 with one decimal place>},
  "dimension_notes": {
    "<dimension>": {
      "evidence": ["specific code observations"],
      "impact_scope": "local|module|subsystem|codebase",
      "fix_scope": "single_edit|multi_file_refactor|architectural_change",
      "confidence": "high|medium|low",
      "unreported_risk": "required when score >85",
      "sub_axes": {"abstraction_leverage": 0-100 with one decimal place, "indirection_cost": 0-100 with one decimal place, "interface_honesty": 0-100 with one decimal place} // required for abstraction_fitness when evidence supports it
    }
  },
  "findings": []
}
@@ -0,0 +1,196 @@
You are a focused subagent reviewer for a single holistic investigation batch.

Repository root: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour
Immutable packet: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour/.desloppify/review_packets/holistic_packet_20260223_100953.json
Batch index: 4
Batch name: Design Coherence — Mechanical Concern Signals
Batch dimensions: design_coherence
Batch rationale: mechanical detectors identified structural patterns needing judgment; concern types: duplication_design, mixed_responsibilities, systemic_pattern

Files assigned:
- .desloppify/query.json
- .github/workflows/ci.yml
- AGENTS.md
- cmd/devour_enhanced.py
- cmd/devour_enhanced_fixed.py
- cmd/devour_enhanced_v2.py
- cmd/devour_lighthouse.py
- cmd/devour_scorecard.py
- cmd/quality.go
- cmd/scorecard_generator.py
- desloppify/desloppify/desloppify/app/commands/_show_terminal.py
- desloppify/desloppify/desloppify/app/commands/fix/apply_flow.py
- desloppify/desloppify/desloppify/app/commands/issues_cmd.py
- desloppify/desloppify/desloppify/app/commands/next.py
- desloppify/desloppify/desloppify/app/commands/resolve/selection.py
- desloppify/desloppify/desloppify/app/commands/scan/scan_reporting_llm.py
- desloppify/desloppify/desloppify/app/commands/status_parts/render.py
- desloppify/desloppify/desloppify/app/output/scorecard_parts/projection.py
- desloppify/desloppify/desloppify/engine/detectors/security/rules.py
- desloppify/desloppify/desloppify/engine/scoring_internal/subjective/core.py
- desloppify/desloppify/desloppify/engine/state_internal/resolution.py
- desloppify/desloppify/desloppify/intelligence/review/__init__.py
- desloppify/desloppify/desloppify/intelligence/review/context_internal/structure.py
- desloppify/desloppify/desloppify/intelligence/review/dimensions/data.py
- desloppify/desloppify/desloppify/intelligence/review/importing/holistic.py
- desloppify/desloppify/desloppify/languages/_shared/phases_common.py
- desloppify/desloppify/desloppify/languages/_shared/review_data/dimensions.json
- desloppify/desloppify/desloppify/languages/_shared/scaffold_detect_commands.py
- desloppify/desloppify/desloppify/languages/csharp/_parse_helpers.py
- desloppify/desloppify/desloppify/languages/csharp/commands.py
- desloppify/desloppify/desloppify/languages/csharp/deps/cli.py
- desloppify/desloppify/desloppify/languages/csharp/deps/fallback.py
- desloppify/desloppify/desloppify/languages/csharp/detectors/deps.py
- desloppify/desloppify/desloppify/languages/csharp/phases.py
- desloppify/desloppify/desloppify/languages/csharp/test_coverage.py
- desloppify/desloppify/desloppify/languages/dart/__init__.py
- desloppify/desloppify/desloppify/languages/dart/commands.py
- desloppify/desloppify/desloppify/languages/dart/detectors/deps.py
- desloppify/desloppify/desloppify/languages/dart/extractors.py
- desloppify/desloppify/desloppify/languages/dart/move.py
- desloppify/desloppify/desloppify/languages/framework/commands_base.py
- desloppify/desloppify/desloppify/languages/gdscript/__init__.py
- desloppify/desloppify/desloppify/languages/gdscript/detectors/deps.py
- desloppify/desloppify/desloppify/languages/python/__init__.py
- desloppify/desloppify/desloppify/languages/python/commands.py
- desloppify/desloppify/desloppify/languages/python/detectors/security.py
- desloppify/desloppify/desloppify/languages/python/detectors/smells.py
- desloppify/desloppify/desloppify/languages/python/move.py
- desloppify/desloppify/desloppify/languages/python/phases.py
- desloppify/desloppify/desloppify/languages/python/test_coverage.py
- desloppify/desloppify/desloppify/languages/python/tests/test_py_facade.py
- desloppify/desloppify/desloppify/languages/typescript/detectors/_smell_detectors.py
- desloppify/desloppify/desloppify/languages/typescript/detectors/_smell_effects.py
- desloppify/desloppify/desloppify/languages/typescript/detectors/deps.py
- desloppify/desloppify/desloppify/languages/typescript/detectors/exports.py
- desloppify/desloppify/desloppify/languages/typescript/detectors/react.py
- desloppify/desloppify/desloppify/languages/typescript/detectors/unused.py
- desloppify/desloppify/desloppify/languages/typescript/fixers/common.py
- desloppify/desloppify/desloppify/languages/typescript/fixers/if_chain.py
- desloppify/desloppify/desloppify/languages/typescript/fixers/logs.py
- desloppify/desloppify/desloppify/languages/typescript/tests/test_ts_concerns.py
- desloppify/desloppify/desloppify/languages/typescript/tests/test_ts_deprecated.py
- desloppify/desloppify/desloppify/languages/typescript/tests/test_ts_deps.py
- desloppify/desloppify/desloppify/languages/typescript/tests/test_ts_exports.py
- desloppify/desloppify/desloppify/languages/typescript/tests/test_ts_fixers.py
- desloppify/desloppify/desloppify/languages/typescript/tests/test_ts_logs.py
- desloppify/desloppify/desloppify/languages/typescript/tests/test_ts_react.py
- desloppify/desloppify/desloppify/tests/commands/fix/test_cmd_fix_review.py
- desloppify/desloppify/desloppify/tests/commands/test_cmd_detect.py
- desloppify/desloppify/desloppify/tests/commands/test_cmd_fix.py
- desloppify/desloppify/desloppify/tests/commands/test_cmd_next.py
- desloppify/desloppify/desloppify/tests/commands/test_cmd_scan.py
- desloppify/desloppify/desloppify/tests/commands/test_cmd_show.py
- desloppify/desloppify/desloppify/tests/commands/test_config_cmd.py
- desloppify/desloppify/desloppify/tests/detectors/test_architecture_boundaries.py
- desloppify/desloppify/desloppify/tests/detectors/test_complexity.py
- desloppify/desloppify/desloppify/tests/detectors/test_coupling.py
- desloppify/desloppify/desloppify/tests/detectors/test_gods.py
- desloppify/desloppify/desloppify/tests/detectors/test_naming.py
- desloppify/desloppify/desloppify/tests/detectors/test_orphaned.py
- desloppify/desloppify/desloppify/tests/lang/common/test_lang_contract_validation.py
- desloppify/desloppify/desloppify/tests/lang/csharp/test_csharp_deps.py
- desloppify/desloppify/desloppify/tests/lang/csharp/test_csharp_scan.py
- desloppify/desloppify/desloppify/tests/lang/dart/test_dart_deps.py
- desloppify/desloppify/desloppify/tests/review/test_review_coverage.py
- desloppify/desloppify/desloppify/tests/review/test_review_dimensions_direct.py
- desloppify/desloppify/desloppify/tests/review/test_work_queue.py
- desloppify/desloppify/desloppify/tests/scan/test_flat_dirs.py
- desloppify/desloppify/desloppify/tests/scan/test_scan_reporting_direct.py
- desloppify/desloppify/desloppify/tests/scan/test_scan_workflow_wontfix_direct.py
- desloppify/desloppify/desloppify/tests/scoring/test_scorecard.py
- desloppify/desloppify/desloppify/tests/scoring/test_scorecard_draw_direct.py
- desloppify/desloppify/desloppify/tests/snapshots/cli_smoke/state-python.json
- desloppify/desloppify/desloppify/tests/state/test_state.py
- desloppify/desloppify/desloppify/tests/state/test_state_internal_direct.py
- devour_data/docs/docker_compose_-_ask_me_about_docker_1.md
- devour_data/docs/docker_compose_-_browse_common_faqs_10.md
- devour_data/docs/docker_compose_-_docker_compose_2.md
- devour_data/docs/docker_compose_-_explore_the_compose_file_referenc_8.md
- devour_data/docs/docker_compose_-_how_compose_works_4.md
- devour_data/docs/docker_compose_-_install_compose_5.md
- devour_data/docs/docker_compose_-_use_compose_bridge_9.md
- internal/ai/openai.go
- internal/quality/analyzers/dataflow.go
- internal/quality/detector_test.go
- internal/quality/detectors/complexity.go
- internal/quality/languages.go
- internal/quality/languages_test.go
- internal/quality/narrative_test.go
- internal/quality/plugins/go/analyzers/deadcode.go
- internal/quality/plugins/go/analyzers/detectors.go
- internal/quality/plugins/go/analyzers/security.go
- internal/quality/plugins/go/analyzers/test_coverage.go
- internal/quality/plugins/go/fixers/advanced_fixers.go
- internal/quality/plugins/go/fixers/fixers.go
- internal/quality/scoring_test.go
- internal/quality/state_test.go
- internal/scraper/external/astrodocs.go
- internal/scraper/external/cloudflaredocs.go
- internal/scraper/external/godocs.go
- internal/scraper/external/javadocs.go
- internal/scraper/external/nuxtdocs.go
- internal/scraper/external/pythondocs.go
- internal/scraper/external/reactdocs.go
- internal/scraper/external/rustdocs.go
- internal/scraper/external/springdocs.go
- internal/scraper/external/vuedocs.go
- internal/scraper/localsearch_test.go
- internal/scraper/web.go
- internal/scraper/web_integration_test.go
- landing/dist/index.html
- landing/src/components/sections/Footer.tsx
- landing/src/index.css
- pkg/astrodocs/parser.go
- pkg/astrodocs/parser_test.go
- pkg/cloudflaredocs/parser.go
- pkg/cloudflaredocs/parser_test.go
- pkg/dockerdocs/parser.go
- pkg/godocs/parser.go
- pkg/godocs/parser_test.go
- pkg/javadocs/parser.go
- pkg/javadocs/parser_test.go
- pkg/nuxtdocs/parser.go
- pkg/nuxtdocs/parser_test.go
- pkg/nuxtdocs/types.go
- pkg/pythondocs/parser.go
- pkg/pythondocs/parser_test.go
- pkg/reactdocs/parser.go
- pkg/rustdocs/parser.go
- pkg/springdocs/parser.go
- pkg/vuedocs/parser.go

Task requirements:
1. Read the immutable packet and follow `system_prompt` constraints exactly.
2. Evaluate ONLY listed files and ONLY listed dimensions for this batch.
3. Return 0-10 high-quality findings for this batch (empty array allowed).
4. Score/finding consistency is required: broader or more severe findings MUST lower dimension scores.
5. Every finding must include `related_files` with at least 2 files when possible.
6. Every finding must include `impact_scope` and `fix_scope`.
7. Every scored dimension MUST include dimension_notes with concrete evidence.
8. If a dimension score is >85, include `unreported_risk` in dimension_notes.
9. Use exactly one decimal place for every assessment and abstraction sub-axis score.
10. Do not edit repository files.
11. Return ONLY valid JSON, no markdown fences.

Scope enums:
- impact_scope: "local" | "module" | "subsystem" | "codebase"
- fix_scope: "single_edit" | "multi_file_refactor" | "architectural_change"

Output schema:
{
  "batch": "Design Coherence — Mechanical Concern Signals",
  "batch_index": 4,
  "assessments": {"<dimension>": <0-100 with one decimal place>},
  "dimension_notes": {
    "<dimension>": {
      "evidence": ["specific code observations"],
      "impact_scope": "local|module|subsystem|codebase",
      "fix_scope": "single_edit|multi_file_refactor|architectural_change",
      "confidence": "high|medium|low",
      "unreported_risk": "required when score >85",
      "sub_axes": {"abstraction_leverage": 0-100 with one decimal place, "indirection_cost": 0-100 with one decimal place, "interface_honesty": 0-100 with one decimal place} // required for abstraction_fitness when evidence supports it
    }
  },
  "findings": []
}
@@ -0,0 +1,125 @@
You are a focused subagent reviewer for a single holistic investigation batch.

Repository root: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour
Immutable packet: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour/.desloppify/review_packets/holistic_packet_20260223_100953.json
Batch index: 5
Batch name: Cross-cutting Sweep
Batch dimensions: error_consistency
Batch rationale: selected dimensions had no direct batch mapping; review representative cross-cutting files

Files assigned:
- internal/quality/enhanced_types.go
- internal/quality/narrative_test.go
- internal/quality/scoring_test.go
- internal/quality/types.go
- pkg/rustdocs/parser_test.go
- cmd/scrape.go
- internal/quality/plugins/go/analyzers/detectors.go
- internal/quality/plugins/go/analyzers/advanced.go
- internal/scraper/web.go
- internal/quality/plugins/go/plugin.go
- internal/scheduler/scheduler.go
- cmd/init.go
- internal/scraper/localsearch_test.go
- internal/config/config.go
- internal/ai/openai.go
- cmd/get.go
- cmd/get_test.go
- internal/quality/analyzers/controlflow.go
- internal/vector/store.go
- cmd/ask.go
- examples/demo_scrapers.go
- internal/indexer/indexer.go
- internal/scraper/openapi.go
- pkg/pythondocs/parser.go
- internal/quality/analyzers/dataflow.go
- internal/quality/scanner_test.go
- internal/server/server.go
- internal/scraper/localsearch.go
- internal/scraper/external/nuxtdocs.go
- internal/quality/plugins/go/analyzers/test_coverage.go
- internal/search/engine.go
- internal/scraper/external/astrodocs.go
- internal/scraper/external/cloudflaredocs.go
- internal/scraper/external/dockerdocs.go
- internal/scraper/external/godocs.go
- internal/scraper/external/javadocs.go
- internal/scraper/external/mcpdocs.go
- README.md
- .desloppify/query.json
- .github/workflows/ci.yml
- AGENTS.md
- cmd/devour_enhanced.py
- cmd/devour_enhanced_fixed.py
- cmd/devour_enhanced_v2.py
- cmd/devour_lighthouse.py
- cmd/devour_scorecard.py
- cmd/quality.go
- cmd/scorecard_generator.py
- desloppify/desloppify/desloppify/app/commands/_show_terminal.py
- desloppify/desloppify/desloppify/app/commands/fix/apply_flow.py
- desloppify/desloppify/desloppify/app/commands/issues_cmd.py
- desloppify/desloppify/desloppify/app/commands/next.py
- desloppify/desloppify/desloppify/app/commands/resolve/selection.py
- desloppify/desloppify/desloppify/app/commands/scan/scan_reporting_llm.py
- desloppify/desloppify/desloppify/app/commands/status_parts/render.py
- desloppify/desloppify/desloppify/app/output/scorecard_parts/projection.py
- desloppify/desloppify/desloppify/engine/detectors/security/rules.py
- desloppify/desloppify/desloppify/engine/scoring_internal/subjective/core.py
- desloppify/desloppify/desloppify/engine/state_internal/resolution.py
- desloppify/desloppify/desloppify/intelligence/review/__init__.py
- desloppify/desloppify/desloppify/intelligence/review/context_internal/structure.py
- desloppify/desloppify/desloppify/intelligence/review/dimensions/data.py
- desloppify/desloppify/desloppify/intelligence/review/importing/holistic.py
- desloppify/desloppify/desloppify/languages/_shared/phases_common.py
- desloppify/desloppify/desloppify/languages/_shared/review_data/dimensions.json
- desloppify/desloppify/desloppify/languages/_shared/scaffold_detect_commands.py
- desloppify/desloppify/desloppify/languages/csharp/_parse_helpers.py
- desloppify/desloppify/desloppify/languages/csharp/commands.py
- desloppify/desloppify/desloppify/languages/csharp/deps/cli.py
- desloppify/desloppify/desloppify/languages/csharp/deps/fallback.py
- desloppify/desloppify/desloppify/languages/csharp/detectors/deps.py
- desloppify/desloppify/desloppify/languages/csharp/phases.py
- desloppify/desloppify/desloppify/languages/csharp/test_coverage.py
- desloppify/desloppify/desloppify/languages/dart/__init__.py
- desloppify/desloppify/desloppify/languages/dart/commands.py
- desloppify/desloppify/desloppify/languages/dart/detectors/deps.py
- desloppify/desloppify/desloppify/languages/dart/extractors.py
- desloppify/desloppify/desloppify/languages/dart/move.py
- desloppify/desloppify/desloppify/languages/framework/commands_base.py
- desloppify/desloppify/desloppify/languages/gdscript/__init__.py

Task requirements:
1. Read the immutable packet and follow `system_prompt` constraints exactly.
2. Evaluate ONLY listed files and ONLY listed dimensions for this batch.
3. Return 0-10 high-quality findings for this batch (empty array allowed).
4. Score/finding consistency is required: broader or more severe findings MUST lower dimension scores.
5. Every finding must include `related_files` with at least 2 files when possible.
6. Every finding must include `impact_scope` and `fix_scope`.
7. Every scored dimension MUST include dimension_notes with concrete evidence.
8. If a dimension score is >85, include `unreported_risk` in dimension_notes.
9. Use exactly one decimal place for every assessment and abstraction sub-axis score.
10. Do not edit repository files.
11. Return ONLY valid JSON, no markdown fences.

Scope enums:
- impact_scope: "local" | "module" | "subsystem" | "codebase"
- fix_scope: "single_edit" | "multi_file_refactor" | "architectural_change"

Output schema:
{
  "batch": "Cross-cutting Sweep",
  "batch_index": 5,
  "assessments": {"<dimension>": <0-100 with one decimal place>},
  "dimension_notes": {
    "<dimension>": {
      "evidence": ["specific code observations"],
      "impact_scope": "local|module|subsystem|codebase",
      "fix_scope": "single_edit|multi_file_refactor|architectural_change",
      "confidence": "high|medium|low",
      "unreported_risk": "required when score >85",
      "sub_axes": {"abstraction_leverage": 0-100 with one decimal place, "indirection_cost": 0-100 with one decimal place, "interface_honesty": 0-100 with one decimal place} // required for abstraction_fitness when evidence supports it
    }
  },
  "findings": []
}
@@ -0,0 +1,158 @@
You are a focused subagent reviewer for a single holistic investigation batch.

Repository root: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour
Immutable packet: /home/tdvorak/Desktop/PROG_projekty/GOLANG/Devour/.desloppify/review_packets/holistic_packet_20260223_100953.json
Batch index: 6
Batch name: Full Codebase Sweep
Batch dimensions: cross_module_architecture, error_consistency, abstraction_fitness, test_strategy, design_coherence
Batch rationale: thorough default: evaluate cross-cutting quality across all production files

Files assigned:
- cleanup_unused.go
- cmd/ask.go
- cmd/demo.go
- cmd/devour/main.go
- cmd/generate_scorecards/main.go
- cmd/get.go
- cmd/init.go
- cmd/languages.go
- cmd/push.go
- cmd/quality.go
- cmd/query.go
- cmd/realtest/main.go
- cmd/root.go
- cmd/runtime_helpers.go
- cmd/scorecard.go
- cmd/scrape.go
- cmd/serve.go
- cmd/status.go
- cmd/sync.go
- examples/demo_scrapers.go
- internal/ai/ai.go
- internal/ai/openai.go
- internal/config/config.go
- internal/indexer/indexer.go
- internal/markdown/formatter.go
- internal/projectstate/state.go
- internal/quality/analyzers/controlflow.go
- internal/quality/analyzers/dataflow.go
- internal/quality/analyzers/practices.go
- internal/quality/detector.go
- internal/quality/detectors/complexity.go
- internal/quality/detectors/duplication.go
- internal/quality/detectors/naming.go
- internal/quality/enhanced_types.go
- internal/quality/languages.go
- internal/quality/narrative.go
- internal/quality/plugins/go/analyzers/advanced.go
- internal/quality/plugins/go/analyzers/deadcode.go
- internal/quality/plugins/go/analyzers/detectors.go
- internal/quality/plugins/go/analyzers/security.go
- internal/quality/plugins/go/analyzers/test_coverage.go
- internal/quality/plugins/go/fixers/advanced_fixers.go
- internal/quality/plugins/go/fixers/fixers.go
- internal/quality/plugins/go/plugin.go
- internal/quality/plugins/plugin.go
- internal/quality/plugins/registry.go
- internal/quality/review/packet.go
- internal/quality/scanner.go
- internal/quality/scoring.go
- internal/quality/state.go
- internal/quality/types.go
- internal/scheduler/scheduler.go
- internal/scraper/external/astrodocs.go
- internal/scraper/external/cloudflaredocs.go
- internal/scraper/external/dockerdocs.go
- internal/scraper/external/godocs.go
- internal/scraper/external/javadocs.go
- internal/scraper/external/mcpdocs.go
- internal/scraper/external/nuxtdocs.go
- internal/scraper/external/pythondocs.go
- internal/scraper/external/reactdocs.go
- internal/scraper/external/register.go
- internal/scraper/external/rustdocs.go
- internal/scraper/external/springdocs.go
- internal/scraper/external/tsdocs.go
- internal/scraper/external/types.go
- internal/scraper/external/vuedocs.go
- internal/scraper/github.go
- internal/scraper/local.go
- internal/scraper/localsearch.go
- internal/scraper/normalize.go
- internal/scraper/openapi.go
- internal/scraper/register_core.go
- internal/scraper/registry_simple.go
- internal/scraper/scraper.go
- internal/scraper/web.go
- internal/scraper/wrapper.go
- internal/search/engine.go
- internal/server/server.go
- internal/storage/writer.go
- internal/ui/banner.go
- internal/ui/character.go
- internal/vector/store.go
- main.go
- pkg/astrodocs/parser.go
- pkg/astrodocs/types.go
- pkg/client/client.go
- pkg/cloudflaredocs/parser.go
- pkg/cloudflaredocs/types.go
- pkg/dockerdocs/parser.go
- pkg/dockerdocs/types.go
- pkg/godocs/parser.go
- pkg/godocs/types.go
- pkg/javadocs/parser.go
- pkg/javadocs/types.go
- pkg/mcpdocs/parser.go
- pkg/mcpdocs/types.go
- pkg/nuxtdocs/parser.go
- pkg/nuxtdocs/types.go
- pkg/parserutil/url.go
- pkg/pythondocs/parser.go
- pkg/pythondocs/types.go
- pkg/reactdocs/parser.go
- pkg/reactdocs/types.go
- pkg/rustdocs/parser.go
- pkg/rustdocs/types.go
- pkg/springdocs/parser.go
- pkg/springdocs/types.go
- pkg/tsdocs/parser.go
- pkg/tsdocs/types.go
- pkg/types/types.go
- pkg/vuedocs/parser.go
- pkg/vuedocs/types.go

Task requirements:
1. Read the immutable packet and follow `system_prompt` constraints exactly.
2. Evaluate ONLY listed files and ONLY listed dimensions for this batch.
3. Return 0-10 high-quality findings for this batch (empty array allowed).
4. Score/finding consistency is required: broader or more severe findings MUST lower dimension scores.
5. Every finding must include `related_files` with at least 2 files when possible.
6. Every finding must include `impact_scope` and `fix_scope`.
7. Every scored dimension MUST include dimension_notes with concrete evidence.
8. If a dimension score is >85, include `unreported_risk` in dimension_notes.
9. Use exactly one decimal place for every assessment and abstraction sub-axis score.
10. Do not edit repository files.
11. Return ONLY valid JSON, no markdown fences.

Scope enums:
- impact_scope: "local" | "module" | "subsystem" | "codebase"
- fix_scope: "single_edit" | "multi_file_refactor" | "architectural_change"

Output schema:
{
  "batch": "Full Codebase Sweep",
  "batch_index": 6,
  "assessments": {"<dimension>": <0-100 with one decimal place>},
  "dimension_notes": {
    "<dimension>": {
      "evidence": ["specific code observations"],
      "impact_scope": "local|module|subsystem|codebase",
      "fix_scope": "single_edit|multi_file_refactor|architectural_change",
      "confidence": "high|medium|low",
      "unreported_risk": "required when score >85",
      "sub_axes": {"abstraction_leverage": 0-100 with one decimal place, "indirection_cost": 0-100 with one decimal place, "interface_honesty": 0-100 with one decimal place} // required for abstraction_fitness when evidence supports it
    }
  },
  "findings": []
}
@@ -0,0 +1 @@
{"batch":"Architecture & Coupling","batch_index":1,"assessments":{"cross_module_architecture":100.0},"dimension_notes":{"cross_module_architecture":{"evidence":["All assigned `internal/quality/*.go` files stay within a single package boundary (`package quality`) and only import standard library packages (`time`, `testing`, `strings`), with no cross-package dependency fan-out from this slice.","`pkg/rustdocs/parser_test.go` is isolated to `package rustdocs` and imports only stdlib plus `github.com/PuerkitoBio/goquery`; it does not couple into `internal/quality` types or helpers.","No `init()` functions, package-level mutable singleton wiring, or import-time execution patterns were found in the reviewed files; behavior is test-function scoped and constructor-invoked (`NewParser`, `NewScorer`, `NewNarrativeGenerator`).","Type declarations in `internal/quality/types.go` and `internal/quality/enhanced_types.go` are cohesive data-model definitions within one module boundary rather than cross-module shims or compatibility layers."],"impact_scope":"local","fix_scope":"single_edit","confidence":"high","unreported_risk":"This batch covers only five files; architectural hotspots could still exist in non-assigned packages (e.g., runtime wiring or broader dependency graph) outside this evidence window."}},"findings":[]}
@@ -0,0 +1,97 @@
{
  "batch": "Abstractions & Dependencies",
  "batch_index": 2,
  "assessments": {
    "abstraction_fitness": 68.0
  },
  "dimension_notes": {
    "abstraction_fitness": {
      "evidence": [
        "Language-to-doc behavior is spread across multiple large switches: URL construction in cmd/get.go:78-173, type mapping in cmd/get.go:175-205, and term derivation in cmd/ask.go:205-260+.",
        "External scraper implementations repeat the same transport/change-detection scaffold (config+parser+http client fields, URL check, fetchPage, generateHash, DetectChanges) across multiple files, e.g. internal/scraper/external/godocs.go:17-121, internal/scraper/external/javadocs.go:16-115, internal/scraper/external/nuxtdocs.go:16-120, internal/scraper/external/cloudflaredocs.go:16-105.",
        "Vector store abstraction exposes implementations that are selected by default config but intentionally unimplemented: internal/config/config.go:121-125 defaults to chromem, while internal/vector/store.go:221-243 returns \"chromem store not implemented\" for all operations.",
        "Configuration defaults are duplicated in two representations (typed defaults and hand-written YAML template), increasing drift risk: cmd/init.go:92-149 and internal/config/config.go:104-160."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change",
      "confidence": "high",
      "sub_axes": {
        "abstraction_leverage": 62.0,
        "indirection_cost": 71.0,
        "interface_honesty": 60.0
      }
    }
  },
  "findings": [
    {
      "dimension": "abstraction_fitness",
      "identifier": "language_catalog_scattered_switches",
      "summary": "Language routing logic is duplicated across CLI flows instead of one catalog abstraction",
      "related_files": [
        "cmd/get.go",
        "cmd/ask.go"
      ],
      "evidence": [
        "cmd/get.go:78-173 defines a large language switch for URL building; cmd/get.go:175-205 defines a second switch for source type mapping.",
        "cmd/ask.go:205-260+ adds a third language switch for term heuristics, creating three independent sources of truth for one domain model."
      ],
      "suggestion": "Introduce a single `LanguageSpec` registry (aliases, source type, URL builder, optional query-term strategy) in one package and have both `get` and `ask` consume it; keep per-language behavior as data/functions attached to that registry.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "external_scraper_scaffold_duplication",
      "summary": "External scraper adapters reimplement the same transport/hash lifecycle repeatedly",
      "related_files": [
        "internal/scraper/external/godocs.go",
        "internal/scraper/external/javadocs.go",
        "internal/scraper/external/nuxtdocs.go",
        "internal/scraper/external/cloudflaredocs.go"
      ],
      "evidence": [
        "Each file defines near-identical struct fields (`config`, `parser`, `client`), constructor wiring, URL-required guard, `fetchPage`, `generateHash`, and `DetectChanges` flow (e.g., godocs.go:17-121 and javadocs.go:16-115).",
        "Duplication scales linearly with each new source adapter, increasing edit surface for cross-cutting behavior (timeouts, headers, error mapping)."
      ],
      "suggestion": "Extract a shared `HTTPDocScraperBase` (or composable helper functions) for request execution, status handling, hashing, and change detection; keep each adapter focused on parser invocation and domain-specific document mapping.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "default_selects_unimplemented_store",
      "summary": "Store interface contract is dishonest because default backend is not operational",
      "related_files": [
        "internal/vector/store.go",
        "internal/config/config.go"
      ],
      "evidence": [
        "internal/config/config.go:121-125 sets default vector DB type to `chromem`.",
        "internal/vector/store.go:221-243 returns `chromem store not implemented` for all `Store` operations after `NewStore` can select that backend (store.go:63-72)."
      ],
      "suggestion": "Either implement `ChromemStore` before exposing it as default, or switch default to a working backend and gate chromem behind explicit opt-in plus capability check at startup.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "config_defaults_double_encoded",
      "summary": "Initialization defaults are encoded twice with different abstractions",
      "related_files": [
        "cmd/init.go",
        "internal/config/config.go"
      ],
      "evidence": [
        "cmd/init.go:92-149 hardcodes YAML defaults as a template string.",
        "internal/config/config.go:104-160 hardcodes defaults again in typed structs, requiring synchronized updates across two representations."
      ],
      "suggestion": "Generate init YAML from `config.Default()` via marshal + small post-processing/comments, or maintain a single canonical defaults schema consumed by both loader and init command.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    }
  ]
}
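The `language_catalog_scattered_switches` suggestion in the file above can be sketched roughly as follows. This is a minimal illustration only: the `LanguageSpec` field names, the `Lookup` helper, the alias lists, and the example URLs are all assumptions made for the sketch, not code taken from `cmd/get.go` or `cmd/ask.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// LanguageSpec is a hypothetical single source of truth for one language.
// Field names are assumptions; the review only names the concept.
type LanguageSpec struct {
	Name       string
	Aliases    []string
	SourceType string
	BuildURL   func(query string) string
}

// specs is an illustrative catalog; entries and URLs are examples only.
var specs = []LanguageSpec{
	{
		Name:       "go",
		Aliases:    []string{"golang"},
		SourceType: "godocs",
		BuildURL:   func(q string) string { return "https://pkg.go.dev/" + q },
	},
	{
		Name:       "rust",
		Aliases:    []string{"rs"},
		SourceType: "rustdocs",
		BuildURL:   func(q string) string { return "https://docs.rs/" + q },
	},
}

// Lookup resolves a name or alias (case-insensitive) to its spec.
func Lookup(name string) (LanguageSpec, bool) {
	key := strings.ToLower(strings.TrimSpace(name))
	for _, s := range specs {
		if s.Name == key {
			return s, true
		}
		for _, a := range s.Aliases {
			if a == key {
				return s, true
			}
		}
	}
	return LanguageSpec{}, false
}

func main() {
	if spec, ok := Lookup("Golang"); ok {
		fmt.Println(spec.SourceType, spec.BuildURL("fmt"))
	}
}
```

With a catalog like this, both commands would call `Lookup` instead of keeping their own switch, so adding a language becomes a single new entry rather than three coordinated edits.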
@@ -0,0 +1,88 @@
{
  "batch": "Governance & Contracts",
  "batch_index": 3,
  "assessments": {
    "cross_module_architecture": 82.0,
    "test_strategy": 74.0
  },
  "dimension_notes": {
    "cross_module_architecture": {
      "evidence": [
        "`internal/quality/types.go` defines a typed status contract (`type Status string` with constants like `StatusOpen`, `StatusFixed`, `StatusWontfix`).",
        "`internal/quality/types.go` also defines `Scorecard.StatusByType map[string]int`, which bypasses the typed status contract at the module boundary.",
        "`internal/quality/scoring_test.go` asserts raw string keys (`\"open\"`, `\"fixed\"`) instead of using `Status` constants, reinforcing stringly-typed cross-component coupling.",
        "`README.md` claims quality features include tracking resolution states, but the in-code state transport for scorecards is weakly typed."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor",
      "confidence": "high"
    },
    "test_strategy": {
      "evidence": [
        "`internal/quality/narrative_test.go` validates exact headline/action prose and directly tests internal helper behavior (e.g., `determinePhase`, `generateHeadline`, `classifyDimension`), creating high implementation-coupling.",
        "`internal/quality/scoring_test.go` similarly focuses on exact internal scoring details and string key literals, which makes refactors noisy and discourages safe design changes.",
        "`pkg/rustdocs/parser_test.go` is heavily happy-path: it checks successful parses and minimal field presence but has no malformed-input/error-path cases for parser resilience.",
        "`README.md` marks parts of the CLI as unstable/stubbed, but assigned tests do not provide cross-module contract/integration safety nets for those runtime boundaries."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor",
      "confidence": "high"
    }
  },
  "findings": [
    {
      "dimension": "cross_module_architecture",
      "identifier": "status_contract_string_map_boundary",
      "summary": "Scorecard state uses string keys instead of shared Status type, weakening module contracts.",
      "related_files": [
        "internal/quality/types.go",
        "internal/quality/scoring_test.go",
        "README.md"
      ],
      "evidence": [
        "`internal/quality/types.go` defines `Status` constants but `Scorecard.StatusByType` is `map[string]int`.",
        "`internal/quality/scoring_test.go` asserts `card.StatusByType[\"open\"]` and `card.StatusByType[\"fixed\"]` directly.",
        "README promises resolution-state tracking, but this boundary is not type-safe."
      ],
      "suggestion": "Change `Scorecard.StatusByType` to `map[Status]int` (or a dedicated typed struct), update serialization adapters if needed, and update tests to assert using `StatusOpen`/`StatusFixed` constants.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "test_strategy",
      "identifier": "brittle_private_and_copy_assertions_in_quality_tests",
      "summary": "Quality tests are tightly coupled to private helpers and exact copy text, reducing refactor safety.",
      "related_files": [
        "internal/quality/narrative_test.go",
        "internal/quality/scoring_test.go"
      ],
      "evidence": [
        "`narrative_test.go` directly asserts exact strings for generated headlines/actions and tests helper internals rather than stable external behavior.",
        "`scoring_test.go` anchors on specific internal weighting outputs and literal status strings, which can fail on benign internal redesigns."
      ],
      "suggestion": "Shift to contract-level tests against exported APIs with invariant assertions (phase category, presence of required fields, monotonic score behavior), and keep only a small set of snapshot/copy tests for user-facing text.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "test_strategy",
      "identifier": "rust_parser_missing_negative_and_boundary_cases",
      "summary": "Rust parser tests miss malformed-input and degradation-path coverage.",
      "related_files": [
        "pkg/rustdocs/parser_test.go",
        "README.md"
      ],
      "evidence": [
        "`parser_test.go` cases are successful parses with valid fixture HTML and only basic assertions.",
        "No tests verify behavior for malformed HTML, missing selectors, empty documents, or unsupported result rows.",
        "README positions docs ingestion as core functionality, so parser failure behavior is a critical path."
      ],
      "suggestion": "Add table-driven negative tests for malformed/partial HTML, empty search blocks, and missing headings; assert stable fallback behavior (explicit error or safe zero-value output) for each parser entrypoint.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "single_edit"
    }
  ]
}
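The `status_contract_string_map_boundary` suggestion above (keying the scorecard by `Status` rather than `string`) might look like the sketch below. The `Scorecard` here is a reduced, hypothetical stand-in for the real struct in `internal/quality/types.go`; the `"open"` and `"fixed"` values come from the quoted test evidence, while the `"wontfix"` value and the `CountStatuses` helper are assumptions.

```go
package main

import "fmt"

// Status is the typed status contract named in the finding.
type Status string

const (
	StatusOpen    Status = "open"
	StatusFixed   Status = "fixed"
	StatusWontfix Status = "wontfix" // assumed string value
)

// Scorecard is a reduced, hypothetical version keyed by Status.
type Scorecard struct {
	StatusByType map[Status]int
}

// CountStatuses (hypothetical helper) tallies findings per status.
func CountStatuses(statuses []Status) Scorecard {
	card := Scorecard{StatusByType: make(map[Status]int)}
	for _, s := range statuses {
		card.StatusByType[s]++
	}
	return card
}

func main() {
	card := CountStatuses([]Status{StatusOpen, StatusFixed, StatusOpen})
	// Assertions read the constants instead of raw string literals.
	fmt.Println(card.StatusByType[StatusOpen], card.StatusByType[StatusFixed])
}
```

One caveat on the design: an untyped literal such as `card.StatusByType["open"]` still compiles because Go converts untyped constants implicitly, so the typed key mainly stops `string` variables from being used as keys without an explicit conversion. Tests would still need to switch to the constants by convention.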
@@ -0,0 +1,79 @@
{
  "batch": "Design Coherence — Mechanical Concern Signals",
  "batch_index": 4,
  "assessments": {
    "design_coherence": 66.0
  },
  "dimension_notes": {
    "design_coherence": {
      "evidence": [
        "Parallel implementations of the same scorecard pipeline exist in `cmd/devour_scorecard.py` and `cmd/scorecard_generator.py` with near-identical function layouts (`ScorecardData`, `score_color`, `draw_left_panel`, `draw_right_panel`, `generate_scorecard`, `main`) and only minor line-level differences.",
        "Three variants of enhanced generator (`cmd/devour_enhanced.py`, `cmd/devour_enhanced_fixed.py`, `cmd/devour_enhanced_v2.py`) repeat almost the full rendering stack (`draw_header_section`, `draw_enhanced_left_panel`, `draw_enhanced_right_panel`, `draw_trends_section`, `load_enhanced_devour_data`), creating branch-by-copy evolution.",
        "Scraper adapters across providers (`internal/scraper/external/astrodocs.go`, `internal/scraper/external/cloudflaredocs.go`, `internal/scraper/external/reactdocs.go`) duplicate fetch/hash/change-detection and document assembly patterns with provider-specific data glued inline, indicating repeated structural pattern without shared orchestration abstraction.",
        "Within `cmd/devour_lighthouse.py`, `load_font` is defined twice (once near top and again later), showing local design drift and utility ownership ambiguity."
      ],
      "impact_scope": "codebase",
      "fix_scope": "architectural_change",
      "confidence": "high"
    }
  },
  "findings": [
    {
      "dimension": "design_coherence",
      "identifier": "scorecard_variant_sprawl",
      "summary": "Scorecard generation is maintained as multiple copy-variants instead of one composable pipeline.",
      "related_files": [
        "cmd/devour_scorecard.py",
        "cmd/scorecard_generator.py",
        "cmd/devour_enhanced.py",
        "cmd/devour_enhanced_fixed.py",
        "cmd/devour_enhanced_v2.py"
      ],
      "evidence": [
        "Both `cmd/devour_scorecard.py` and `cmd/scorecard_generator.py` declare the same major functions and data model in the same order with only minor stylistic deltas.",
        "Enhanced variants repeat the same section render functions and data loading flow, then diverge by ad-hoc edits, increasing change fan-out for any layout or scoring rule update."
      ],
      "suggestion": "Extract a shared rendering core module (palette/fonts/layout primitives + data normalization), keep one canonical CLI entrypoint, and convert variant behavior into explicit theme/feature flags rather than duplicated files.",
      "confidence": "high",
      "impact_scope": "codebase",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "design_coherence",
      "identifier": "external_scraper_template_duplication",
      "summary": "Provider scrapers repeat the same orchestration flow with per-provider copy/paste adapters.",
      "related_files": [
        "internal/scraper/external/astrodocs.go",
        "internal/scraper/external/cloudflaredocs.go",
        "internal/scraper/external/reactdocs.go",
        "internal/scraper/external/godocs.go",
        "internal/scraper/external/vuedocs.go"
      ],
      "evidence": [
        "Each scraper reimplements nearly identical `Scrape`, `DetectChanges`, `fetchPage`, and `generateHash` scaffolding, then inlines provider-specific conversion methods.",
        "The repeated constructor/client/parser wiring pattern appears across multiple files, indicating systemic pattern duplication rather than isolated differences."
      ],
      "suggestion": "Introduce a shared `DocAdapter` contract and a generic `HTTPDocScraper` that owns fetch/hash/change-detect; keep provider files focused on mapping parsed domain objects to `Document`.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "design_coherence",
      "identifier": "utility_ownership_drift_in_lighthouse_script",
      "summary": "Duplicate utility definition in one file shows mixed responsibility boundaries.",
      "related_files": [
        "cmd/devour_lighthouse.py",
        "cmd/devour_enhanced.py"
      ],
      "evidence": [
        "`cmd/devour_lighthouse.py` defines `load_font` twice with effectively the same fallback behavior, creating hidden override risk and unclear source of truth.",
        "Comparable font utility exists in other renderer scripts, reinforcing that shared utility concerns are spread instead of centralized."
      ],
      "suggestion": "Remove the duplicate in `cmd/devour_lighthouse.py` and move font-loading helpers into a shared module imported by all renderer scripts.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    }
  ]
}
@@ -0,0 +1,79 @@
{
  "batch": "Cross-cutting Sweep",
  "batch_index": 5,
  "assessments": {
    "error_consistency": 71.0
  },
  "dimension_notes": {
    "error_consistency": {
      "evidence": [
        "Raw error passthrough is common in core flows (e.g., `return nil, err` in `internal/search/engine.go:114`, `internal/search/engine.go:122`, `internal/scraper/openapi.go:45`, `internal/scraper/openapi.go:50`) while nearby code wraps with operation context (e.g., `internal/search/engine.go:111`, `internal/scraper/openapi.go:153`).",
        "Failure handling style diverges between aborting, propagating, and suppressing in similar backend paths: `panic(...)` in `internal/quality/plugins/go/plugin.go:363`, warning print-and-continue in `internal/indexer/indexer.go:239`, and plain returns in `cmd/scrape.go:90`/`cmd/get.go:59`.",
        "Some call paths lose caller context at command boundaries (`cmd/scrape.go:90`, `cmd/scrape.go:125`, `cmd/get.go:59`) despite contextual wrapping being used in other command-layer branches (`cmd/scrape.go:131`, `cmd/scrape.go:145`)."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor",
      "confidence": "high"
    }
  },
  "findings": [
    {
      "dimension": "error_consistency",
      "identifier": "mixed_error_wrapping_in_scrape_and_search_paths",
      "summary": "Related scrape/search paths mix raw passthrough and contextual wrapping.",
      "related_files": [
        "internal/search/engine.go",
        "internal/scraper/openapi.go",
        "internal/scraper/localsearch.go",
        "cmd/scrape.go"
      ],
      "evidence": [
        "`internal/search/engine.go` frequently returns raw errors (`:114`, `:117`, `:122`, `:170`) but also uses contextual errors (`:111`, `:230`).",
        "`internal/scraper/openapi.go` propagates raw errors from `readSpec`/`parseOpenAPISpec` (`:45`, `:50`, `:123`, `:141`, `:149`, `:157`, `:164`) while also defining wrapped errors (`:135`, `:153`, `:217`).",
        "`internal/scraper/localsearch.go` returns raw errors from helper boundaries (`:79`, `:164`, `:191`, `:222`) mixed with rich wrapped messages in the same workflow (`:196`, `:203`, `:209`, `:217`)."
      ],
      "suggestion": "Define a package-level rule: public methods must wrap downstream errors with operation context (using `%w`), and helper internals may return raw errors. Apply this consistently to `Rebuild/EnsureIndexed`, `OpenAPIScraper.Scrape/DetectChanges/readSpec`, and `LocalSearchScraper` methods.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "error_consistency",
      "identifier": "inconsistent_failure_channel_panic_vs_error_vs_warning",
      "summary": "Failure signaling varies between panic, error return, and warning-only logging.",
      "related_files": [
        "internal/quality/plugins/go/plugin.go",
        "internal/indexer/indexer.go",
        "cmd/scrape.go"
      ],
      "evidence": [
        "`internal/quality/plugins/go/plugin.go:363` panics on plugin registration failure.",
        "`internal/indexer/indexer.go:239` prints a warning and suppresses deletion errors instead of returning them.",
        "`cmd/scrape.go` is structured around returned errors (`:131`, `:145`, `:207`) and has no panic-based handling, creating inconsistent contracts across subsystems."
      ],
      "suggestion": "Standardize on explicit error returns for recoverable startup/runtime failures; replace plugin `panic` with registration error propagation or controlled process-exit at the command entrypoint, and make indexer deletion behavior explicit (either aggregate and return partial-failure errors or document/encode best-effort mode).",
      "confidence": "high",
      "impact_scope": "codebase",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "error_consistency",
      "identifier": "command_boundary_context_loss",
      "summary": "CLI command boundaries sometimes return raw errors without command context.",
      "related_files": [
        "cmd/get.go",
        "cmd/scrape.go",
        "internal/config/config.go"
      ],
      "evidence": [
        "`cmd/get.go:59` and `cmd/scrape.go:90`/`:125` return raw errors directly from downstream calls.",
        "Other branches in the same command wrap with explicit context (`cmd/scrape.go:131`, `cmd/scrape.go:145`, `cmd/scrape.go:154`).",
        "Config layer already emits contextual wrapped errors (`internal/config/config.go:177`, `internal/config/config.go:181`), so command-layer inconsistency creates uneven user-facing diagnostics."
      ],
      "suggestion": "At CLI entrypoints, wrap all returned downstream errors with command/action context (e.g., `run get`, `load config`, `scrape source`) and preserve root cause with `%w`; keep user-readable validation errors as direct messages.",
      "confidence": "high",
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor"
    }
  ]
}
@@ -0,0 +1,167 @@
{
  "batch": "Full Codebase Sweep",
  "batch_index": 6,
  "assessments": {
    "cross_module_architecture": 74.0,
    "error_consistency": 68.0,
    "abstraction_fitness": 62.0,
    "test_strategy": 55.0,
    "design_coherence": 64.0
  },
  "dimension_notes": {
    "cross_module_architecture": {
      "evidence": [
        "`cmd/root.go` relies on blank import `_ \"github.com/yourorg/devour/internal/scraper/external\"` to activate runtime registration side effects.",
        "`internal/scraper/external/register.go` mutates global scraper registry in `init()` for all language scrapers.",
        "`internal/scraper/registry_simple.go` uses global mutable registry (`globalRegistry`) shared process-wide, increasing hidden coupling and order sensitivity."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change",
      "confidence": "high"
    },
    "error_consistency": {
      "evidence": [
        "`cmd/root.go` and `cmd/scorecard.go` terminate with `os.Exit`, while most command flows return wrapped errors.",
        "`internal/quality/plugins/go/plugin.go` panics during plugin registration (`panic(fmt.Sprintf(...))`) instead of surfacing an error contract.",
        "`cleanup_unused.go` uses `log.Fatal` and mixed logging/exit style unlike the rest of the codebase's `fmt.Errorf(...%w...)` propagation."
      ],
      "impact_scope": "codebase",
      "fix_scope": "multi_file_refactor",
      "confidence": "high"
    },
    "abstraction_fitness": {
      "evidence": [
        "Language-specific scraper implementations (`internal/scraper/external/godocs.go`, `internal/scraper/external/rustdocs.go`, `internal/scraper/external/reactdocs.go`, and peers) repeat near-identical HTTP fetch/hash/error scaffolding with only parser/document mapping differences.",
        "`internal/scraper/external/types.go` is a thin alias layer over `internal/scraper` types and does not enforce additional policy or invariants.",
        "High repeated constructor/scrape boilerplate across external scrapers indicates abstraction cost is paid repeatedly without shared leverage."
      ],
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change",
      "confidence": "high",
      "sub_axes": {
        "abstraction_leverage": 58.0,
        "indirection_cost": 72.0,
        "interface_honesty": 70.0
      }
    },
    "test_strategy": {
      "evidence": [
        "Critical runtime surfaces have no direct tests: `internal/server/server.go`, `pkg/client/client.go`, `internal/vector/store.go`, `internal/search/engine.go`, `internal/indexer/indexer.go`.",
        "`pkg/client/client.go` has TODO stubs returning `nil, nil` for `Query` and `Status`, but there are no tests asserting failure behavior or contract correctness.",
        "`internal/server/server.go` Start methods are TODO/no-op `nil` returns and remain unvalidated by tests, creating false-green behavior for unimplemented paths."
      ],
      "impact_scope": "codebase",
      "fix_scope": "multi_file_refactor",
      "confidence": "high"
    },
    "design_coherence": {
      "evidence": [
        "`cmd/quality.go` (695 LOC) mixes CLI wiring, scan orchestration, status persistence, scoring output, resolution updates, fixer execution, and review import/export in one file.",
        "`cmd/scrape.go` (444 LOC) combines source parsing, source-type inference, profile heuristics, scrape orchestration, persistence, and source-state hashing.",
        "These large command files show recurring multi-responsibility patterns rather than cohesive command/use-case units."
      ],
      "impact_scope": "module",
      "fix_scope": "multi_file_refactor",
      "confidence": "high"
    }
  },
  "findings": [
    {
      "dimension": "cross_module_architecture",
      "identifier": "init_side_effect_registration_coupling",
      "summary": "Scraper registration depends on import-time side effects and global mutable registry state.",
      "related_files": [
        "cmd/root.go",
        "internal/scraper/external/register.go",
        "internal/scraper/registry_simple.go"
      ],
      "evidence": [
        "Blank import in root command triggers registration implicitly rather than explicit bootstrap wiring.",
        "Registration happens in `init()` and mutates shared global registry."
      ],
      "suggestion": "Replace import-time registration with explicit bootstrap registration (e.g., `RegisterExternalScrapers()` called from startup), and pass registry instances through constructors to remove hidden global coupling.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "error_consistency",
      "identifier": "mixed_process_termination_and_error_propagation",
      "summary": "Error handling mixes panic/log.Fatal/os.Exit with returned errors across adjacent layers.",
      "related_files": [
        "cmd/root.go",
        "cmd/scorecard.go",
        "internal/quality/plugins/go/plugin.go",
        "cleanup_unused.go"
      ],
      "evidence": [
        "`Execute()` exits process directly; scorecard helper exits inside utility flow; plugin registration panics on failure.",
        "Most other command paths return wrapped errors, creating inconsistent failure semantics."
      ],
      "suggestion": "Standardize on returning errors from library/command internals and only perform process exit in one top-level entrypoint; replace panic/log.Fatal in shared code with typed/wrapped errors.",
      "confidence": "high",
      "impact_scope": "codebase",
      "fix_scope": "multi_file_refactor"
    },
    {
      "dimension": "abstraction_fitness",
      "identifier": "external_scraper_boilerplate_without_shared_base",
      "summary": "External scraper implementations duplicate fetch/hash/error/document plumbing instead of sharing a base abstraction.",
      "related_files": [
        "internal/scraper/external/godocs.go",
        "internal/scraper/external/rustdocs.go",
        "internal/scraper/external/reactdocs.go",
        "internal/scraper/external/nuxtdocs.go",
        "internal/scraper/external/cloudflaredocs.go",
        "internal/scraper/external/types.go"
      ],
      "evidence": [
        "Each scraper repeats `fetchPage`, status code checks, hash generation, and near-identical scrape control flow.",
        "Alias-only types file adds indirection without behavior."
      ],
      "suggestion": "Introduce a shared external-scraper base/helper for HTTP fetch, retries, hashing, and common error mapping; keep only parser-specific extraction and document shaping in each language scraper.",
      "confidence": "high",
      "impact_scope": "subsystem",
      "fix_scope": "architectural_change"
    },
    {
      "dimension": "test_strategy",
      "identifier": "untested_unimplemented_runtime_paths",
      "summary": "Core runtime paths are both under-tested and partially stubbed, leaving high-risk behavior unvalidated.",
      "related_files": [
        "internal/server/server.go",
        "pkg/client/client.go",
        "internal/vector/store.go",
        "internal/search/engine.go",
        "internal/indexer/indexer.go"
      ],
      "evidence": [
        "Server/client/store contain TODO or not-implemented branches without direct tests.",
        "No direct test files exist for several core modules that govern querying, indexing, and serving."
      ],
      "suggestion": "Add table-driven tests for client/server/store/indexer contracts first (error behavior and non-nil results), then implement missing paths behind those tests; prioritize integration tests that exercise scrape->index->query flow.",
      "confidence": "high",
      "impact_scope": "codebase",
|
||||||
|
"fix_scope": "multi_file_refactor"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"dimension": "design_coherence",
|
||||||
|
"identifier": "command_files_mix_multiple_responsibilities",
|
||||||
|
"summary": "Large CLI command files blend orchestration, domain logic, persistence, and formatting concerns.",
|
||||||
|
"related_files": [
|
||||||
|
"cmd/quality.go",
|
||||||
|
"cmd/scrape.go",
|
||||||
|
"cmd/ask.go"
|
||||||
|
],
|
||||||
|
"evidence": [
|
||||||
|
"`cmd/quality.go` combines scan setup, scoring/status persistence, resolve/fix/review workflows.",
|
||||||
|
"`cmd/scrape.go` combines config parsing, source detection/profiling, scrape execution, indexing, and source-state updates.",
|
||||||
|
"`cmd/ask.go` includes query derivation, source URL heuristics, ranking, summarization, and output formatting in one command module."
|
||||||
|
],
|
||||||
|
"suggestion": "Split command files into focused packages (transport/CLI binding vs service layer vs persistence helpers) and keep Cobra handlers as thin adapters invoking composable use-case functions.",
|
||||||
|
"confidence": "high",
|
||||||
|
"impact_scope": "module",
|
||||||
|
"fix_scope": "multi_file_refactor"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
@@ -0,0 +1,158 @@
|
|||||||
|
<!-- desloppify-begin -->
|
||||||
|
<!-- desloppify-skill-version: 1 -->
|
||||||
|
---
|
||||||
|
name: desloppify
|
||||||
|
description: >
|
||||||
|
Codebase health scanner and technical debt tracker. Use when the user asks
|
||||||
|
about code quality, technical debt, dead code, large files, god classes,
|
||||||
|
duplicate functions, code smells, naming issues, import cycles, or coupling
|
||||||
|
problems. Also use when asked for a health score, what to fix next, or to
|
||||||
|
create a cleanup plan. Supports 28 languages.
|
||||||
|
allowed-tools: Bash(desloppify *)
|
||||||
|
---
|
||||||
|
|
||||||
|
# Desloppify
|
||||||
|
|
||||||
|
## 1. Your Job
|
||||||
|
|
||||||
|
**Improve code quality by fixing findings and maximizing strict score honestly.**
|
||||||
|
Never hide debt with suppression patterns just to improve lenient score. After
|
||||||
|
every scan, show the user ALL scores:
|
||||||
|
|
||||||
|
| What | How |
|
||||||
|
|------|-----|
|
||||||
|
| Overall health | lenient + strict |
|
||||||
|
| 5 mechanical dimensions | File health, Code quality, Duplication, Test health, Security |
|
||||||
|
| 7 subjective dimensions | Naming Quality, Error Consistency, Abstraction Fit, Logic Clarity, AI Generated Debt, Type Safety, Contract Coherence |
|
||||||
|
|
||||||
|
Never skip scores. The user tracks progress through them.
|
||||||
|
|
||||||
|
## 2. Core Loop
|
||||||
|
|
||||||
|
```
|
||||||
|
scan → follow the tool's strategy → fix or wontfix → rescan
|
||||||
|
```
|
||||||
|
|
||||||
|
1. `desloppify scan --path .` — the scan output ends with **INSTRUCTIONS FOR AGENTS**. Follow them. Don't substitute your own analysis.
|
||||||
|
2. Fix the issue the tool recommends.
|
||||||
|
3. `desloppify resolve fixed "<id>"` — or if it's intentional/acceptable:
|
||||||
|
`desloppify resolve wontfix "<id>" --note "reason why"`
|
||||||
|
4. Rescan to verify.
|
||||||
|
|
||||||
|
**Wontfix is not free.** It lowers the strict score. The gap between lenient and strict IS wontfix debt. Call it out when:
|
||||||
|
- Wontfix count is growing — challenge whether past decisions still hold
|
||||||
|
- A dimension is stuck 3+ scans — suggest a different approach
|
||||||
|
- Auto-fixers exist for open findings — ask why they haven't been run
|
||||||
|
|
||||||
|
## 3. Commands
|
||||||
|
|
||||||
|
```bash
|
||||||
|
desloppify scan --path src/ # full scan
|
||||||
|
desloppify scan --path src/ --reset-subjective # reset subjective baseline to 0, then scan
|
||||||
|
desloppify next --count 5 # top priorities
|
||||||
|
desloppify show <pattern> # filter by file/detector/ID
|
||||||
|
desloppify plan # prioritized plan
|
||||||
|
desloppify fix <fixer> --dry-run # auto-fix (dry-run first!)
|
||||||
|
desloppify move <src> <dst> --dry-run # move + update imports
|
||||||
|
desloppify resolve fixed|wontfix|false_positive "<pat>" # classify finding outcome
|
||||||
|
desloppify review --prepare # generate subjective review data
|
||||||
|
desloppify review --import file.json # import review results
|
||||||
|
```
|
||||||
|
|
||||||
|
## 4. Subjective Reviews (biggest score lever)
|
||||||
|
|
||||||
|
Score = 40% mechanical + 60% subjective. Subjective starts at 0% until reviewed.
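As a rough illustration of the weighting above, here is a small sketch. It assumes both inputs are on a 0-100 scale and that the overall score is a plain weighted sum; the tool's real computation may include details not described here, and `overallScore` is a hypothetical name.

```go
package main

import "fmt"

const (
	mechanicalWeight = 0.40 // share of the overall score from mechanical checks
	subjectiveWeight = 0.60 // share of the overall score from subjective review
)

// overallScore combines two 0-100 scores using the stated 40/60 split.
func overallScore(mechanical, subjective float64) float64 {
	return mechanicalWeight*mechanical + subjectiveWeight*subjective
}

func main() {
	// Before any review, the subjective part contributes nothing.
	fmt.Printf("unreviewed: %.1f\n", overallScore(80, 0))
	// The same mechanical score after a moderate subjective review.
	fmt.Printf("reviewed:   %.1f\n", overallScore(80, 70))
}
```

Under these assumptions a mechanical score of 80 alone yields 32 overall, and adding a subjective score of 70 raises it to 74.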
|
||||||
|
|
||||||
|
1. `desloppify review --prepare` — writes dimension definitions and codebase context
|
||||||
|
to `query.json`.
|
||||||
|
|
||||||
|
2. **Review each dimension independently.** For best results, review dimensions in
|
||||||
|
isolation so scores don't bleed across concerns. If your agent supports parallel
|
||||||
|
execution, use it — your agent-specific overlay (appended below, if installed)
|
||||||
|
has the optimal approach. Each reviewer needs:
|
||||||
|
- The codebase path and the dimensions to score
|
||||||
|
- What each dimension means (from `query.json`'s `dimension_prompts`)
|
||||||
|
- The output format (below)
|
||||||
|
- Nothing else — let them decide what to read and how
|
||||||
|
|
||||||
|
3. Merge assessments (average scores if multiple reviewers cover the same dimension)
|
||||||
|
and findings, then import:
|
||||||
|
```bash
|
||||||
|
desloppify review --import findings.json
|
||||||
|
```
|
||||||
|
|
||||||
|
Required output format per reviewer:
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"assessments": { "naming_quality": 75.0, "logic_clarity": 82.0 },
|
||||||
|
"findings": [{
|
||||||
|
"dimension": "naming_quality",
|
||||||
|
"identifier": "short_id",
|
||||||
|
"summary": "one line",
|
||||||
|
"related_files": ["path/to/file.py"],
|
||||||
|
"evidence": ["specific observation"],
|
||||||
|
"suggestion": "concrete action"
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
```
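Before importing, a reviewer's output can be sanity-checked against the format above. The sketch below is an assumption-laden illustration, not part of desloppify: it assumes flat numeric assessments as in the example (a per-dimension object would need a different type), assumes a 0-100 score range, and treats `dimension`, `identifier`, and `summary` as required. `reviewOutput`, `finding`, and `validateReview` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// finding mirrors the fields shown in the example output format.
type finding struct {
	Dimension    string   `json:"dimension"`
	Identifier   string   `json:"identifier"`
	Summary      string   `json:"summary"`
	RelatedFiles []string `json:"related_files"`
	Evidence     []string `json:"evidence"`
	Suggestion   string   `json:"suggestion"`
}

// reviewOutput assumes flat numeric scores per dimension.
type reviewOutput struct {
	Assessments map[string]float64 `json:"assessments"`
	Findings    []finding          `json:"findings"`
}

// validateReview parses raw JSON and applies the assumed checks.
func validateReview(raw []byte) error {
	var out reviewOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return fmt.Errorf("parse review: %w", err)
	}
	for dim, score := range out.Assessments {
		if score < 0 || score > 100 {
			return fmt.Errorf("assessment %q: score %v outside 0-100", dim, score)
		}
	}
	for i, f := range out.Findings {
		if f.Dimension == "" || f.Identifier == "" || f.Summary == "" {
			return fmt.Errorf("finding %d: dimension, identifier and summary are required", i)
		}
	}
	return nil
}

func main() {
	raw := []byte(`{"assessments":{"naming_quality":75.0,"logic_clarity":82.0},"findings":[]}`)
	if err := validateReview(raw); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("review output looks importable")
}
```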
|
||||||
|
|
||||||
|
Need a clean subjective rerun from zero? Run `desloppify scan --path src/ --reset-subjective` before preparing/importing fresh review data.
|
||||||
|
|
||||||
|
Even moderate scores (60-80) dramatically improve overall health.
|
||||||
|
|
||||||
|
Integrity safeguard:
|
||||||
|
- If one subjective dimension lands exactly on the strict target, the scanner warns and asks for re-review.
|
||||||
|
- If two or more subjective dimensions land on the strict target in the same scan, those dimensions are auto-reset to 0 for that scan and must be re-reviewed/imported.
|
||||||
|
- Reviewers should score from evidence only (not from target-seeking).
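The safeguard rules above can be read as a small decision procedure. The sketch below restates them in Go purely for illustration; desloppify's actual implementation is not shown in this document, exact equality is assumed for "lands exactly on", and `applyIntegritySafeguard` and `safeguardResult` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// safeguardResult reports what the check did for one scan.
type safeguardResult struct {
	Warned []string // dimensions flagged for re-review (score kept)
	Reset  []string // dimensions whose score was reset to 0
}

// applyIntegritySafeguard mutates scores in place when two or more
// dimensions sit exactly on the strict target.
func applyIntegritySafeguard(scores map[string]float64, strictTarget float64) safeguardResult {
	var onTarget []string
	for dim, score := range scores {
		if score == strictTarget {
			onTarget = append(onTarget, dim)
		}
	}
	sort.Strings(onTarget) // stable order for reporting

	var res safeguardResult
	switch {
	case len(onTarget) == 1:
		// One match: warn only, keep the score.
		res.Warned = onTarget
	case len(onTarget) >= 2:
		// Two or more matches: reset those dimensions for this scan.
		for _, dim := range onTarget {
			scores[dim] = 0
		}
		res.Reset = onTarget
	}
	return res
}

func main() {
	scores := map[string]float64{"naming_quality": 95, "logic_clarity": 95, "type_safety": 71.5}
	res := applyIntegritySafeguard(scores, 95)
	fmt.Println("warned:", res.Warned, "reset:", res.Reset)
}
```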
|
||||||
|
|
||||||
|
## 5. Quick Reference
|
||||||
|
|
||||||
|
- **Tiers**: T1 auto-fix, T2 quick manual, T3 judgment call, T4 major refactor
|
||||||
|
- **Zones**: production/script (scored), test/config/generated/vendor (not scored). Fix a misclassified zone with `zone set`.
|
||||||
|
- **Auto-fixers** (TS only): `unused-imports`, `unused-vars`, `debug-logs`, `dead-exports`, etc.
|
||||||
|
- **query.json**: After any command, it contains `narrative.actions` with prioritized next steps.
|
||||||
|
- `--skip-slow` skips duplicate detection for faster iteration.
|
||||||
|
- `--lang python`, `--lang typescript`, or `--lang csharp` to force language.
|
||||||
|
- C# defaults to `--profile objective`; use `--profile full` to include subjective review.
|
||||||
|
- Score can temporarily drop after fixes (cascade effects are normal).
|
||||||
|
|
||||||
|
## 6. Escalate Tool Issues Upstream
|
||||||
|
|
||||||
|
When desloppify itself appears wrong or inconsistent:
|
||||||
|
|
||||||
|
1. Capture a minimal repro (`command`, `path`, `expected`, `actual`).
|
||||||
|
2. Open a GitHub issue in `peteromallet/desloppify`.
|
||||||
|
3. If you can fix it safely, open a PR linked to that issue.
|
||||||
|
4. If unsure whether it is a tool bug or a user workflow problem, open the issue first and the PR second.
|
||||||
|
|
||||||
|
## Prerequisite
|
||||||
|
|
||||||
|
`command -v desloppify >/dev/null 2>&1 && echo "desloppify: installed" || echo "NOT INSTALLED — run: pip install --upgrade git+https://github.com/peteromallet/desloppify.git"`
|
||||||
|
|
||||||
|
<!-- desloppify-end -->
|
||||||
|
|
||||||
|
## Codex Overlay
|
||||||
|
|
||||||
|
This is the canonical Codex overlay used by the README install command.
|
||||||
|
|
||||||
|
1. Prefer first-class batch runs: `desloppify review --run-batches --runner codex --parallel`.
|
||||||
|
2. The command writes immutable packet snapshots under `.desloppify/review_packets/holistic_packet_*.json`; use those for reproducible retries.
|
||||||
|
3. Keep reviewer input scoped to the immutable packet and the source files named in each batch.
|
||||||
|
4. Do not use prior chat context, score history, narrative summaries, issue labels, or target-threshold anchoring while scoring.
|
||||||
|
5. Assess every dimension listed in `query.dimensions`; never drop a requested dimension. If evidence is weak/mixed, score lower and explain uncertainty in findings.
|
||||||
|
6. Return machine-readable JSON only for review imports:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"assessments": {
|
||||||
|
"<dimension_from_query>": 0
|
||||||
|
},
|
||||||
|
"findings": []
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
7. Keep `findings` schema compatible with `query.system_prompt`.
|
||||||
|
8. If a batch fails, retry only that slice with `desloppify review --run-batches --packet <packet.json> --only-batches <idxs>`.
|
||||||
|
|
||||||
|
<!-- desloppify-overlay: codex -->
|
||||||
|
<!-- desloppify-end -->
|
||||||
+98
-20
@@ -1,33 +1,111 @@
|
|||||||
# Devour
|
# Contributing to Devour
|
||||||
|
|
||||||
Devour is a context ingestion and management system for AI that scrapes, indexes, and serves documentation from multiple sources.
|
Thanks for considering a contribution.
|
||||||
|
|
||||||
## Installation
|
Devour is moving fast, and practical contributions (bug fixes, docs cleanup, tests, stability work) are especially valuable right now.
|
||||||
|
|
||||||
|
## Before You Start
|
||||||
|
|
||||||
|
- Check existing issues/PRs to avoid duplicate work.
|
||||||
|
- If the change is non-trivial, open an issue first to align on direction.
|
||||||
|
- Keep PRs focused. Small, reviewable changes get merged faster.
|
||||||
|
|
||||||
|
## Local Setup
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
go install github.com/yourorg/devour/cmd/devour@latest
|
git clone <your-fork-or-repo-url>
|
||||||
|
cd devour
|
||||||
|
go mod download
|
||||||
|
go build -o devour ./cmd/devour
|
||||||
```
|
```
|
||||||
|
|
||||||
## Quick Start
|
Optional sanity check:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Initialize
|
./devour --help
|
||||||
devour init
|
|
||||||
|
|
||||||
# Scrape documentation
|
|
||||||
devour scrape https://docs.example.com
|
|
||||||
|
|
||||||
# Query indexed docs
|
|
||||||
devour query "authentication flow"
|
|
||||||
|
|
||||||
# Start MCP server
|
|
||||||
devour serve
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Documentation
|
## Branch and Commit Workflow
|
||||||
|
|
||||||
See [README.md](README.md) for full documentation.
|
1. Fork the repo.
|
||||||
|
2. Create a branch from `main`:
|
||||||
|
`git checkout -b feat/short-description`
|
||||||
|
3. Make your changes.
|
||||||
|
4. Run tests/checks.
|
||||||
|
5. Commit with a clear message.
|
||||||
|
6. Open a PR.
|
||||||
|
|
||||||
## License
|
Commit style suggestion:
|
||||||
|
|
||||||
MIT License
|
- `feat: add xyz`
|
||||||
|
- `fix: resolve panic in abc`
|
||||||
|
- `docs: clarify quick start`
|
||||||
|
- `test: add coverage for module`
|
||||||
|
|
||||||
|
## What To Test
|
||||||
|
|
||||||
|
At minimum, run:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
go test ./...
|
||||||
|
```
|
||||||
|
|
||||||
|
If your changes affect CLI behavior, also run relevant commands directly (for example `devour get`, `devour ask`, `devour quality status`).
|
||||||
|
|
||||||
|
If you touch docs, verify commands and paths are real.
|
||||||
|
|
||||||
|
## Current Known Limitations
|
||||||
|
|
||||||
|
Please keep these in mind when proposing fixes or documentation:
|
||||||
|
|
||||||
|
- Remote workflows are still experimental:
|
||||||
|
- `devour serve --remote` is available, but local stdio JSON-RPC is the primary stable mode.
|
||||||
|
- remote `push` flows are intentionally disabled in stable behavior.
|
||||||
|
- Live scraping quality can vary with upstream site changes; use `devour verify smoke` before releases.
|
||||||
|
- Language support is broad, but extraction quality differs by source type and site structure.
|
||||||
|
- Quality analyzers are strongest for Go code; cross-language analysis remains limited.
|
||||||
|
|
||||||
|
PRs that improve these areas are highly appreciated.
|
||||||
|
|
||||||
|
## Pull Request Checklist
|
||||||
|
|
||||||
|
Before opening a PR, confirm:
|
||||||
|
|
||||||
|
- [ ] Change is scoped and clearly described.
|
||||||
|
- [ ] Tests pass locally (`go test ./...`).
|
||||||
|
- [ ] New behavior includes tests when feasible.
|
||||||
|
- [ ] Docs updated for user-facing changes.
|
||||||
|
- [ ] No unrelated refactors mixed in.
|
||||||
|
|
||||||
|
## Coding Guidelines
|
||||||
|
|
||||||
|
- Follow existing Go style in the touched package.
|
||||||
|
- Prefer clear, simple control flow over clever abstractions.
|
||||||
|
- Add comments only when they explain non-obvious intent.
|
||||||
|
- Preserve current CLI UX unless there is a strong reason to change it.
|
||||||
|
|
||||||
|
## Reporting Bugs
|
||||||
|
|
||||||
|
When filing a bug, include:
|
||||||
|
|
||||||
|
- Devour command used
|
||||||
|
- exact error output
|
||||||
|
- OS and Go version
|
||||||
|
- minimal reproduction steps
|
||||||
|
|
||||||
|
Bug reports with reproducible steps are fixed much faster.
|
||||||
|
|
||||||
|
## Documentation Contributions
|
||||||
|
|
||||||
|
Docs improvements are first-class contributions.
|
||||||
|
|
||||||
|
Useful docs PRs include:
|
||||||
|
|
||||||
|
- fixing incorrect command examples
|
||||||
|
- clarifying status of in-progress features
|
||||||
|
- adding troubleshooting notes
|
||||||
|
- improving onboarding for first-time contributors
|
||||||
|
|
||||||
|
## Questions
|
||||||
|
|
||||||
|
If something is unclear, open an issue and ask directly. It is better to align early than rework a large PR later.
|
||||||
|
|||||||
@@ -1,593 +1,155 @@
|
|||||||
<p align="center">
|
<p align="center">
|
||||||
<img src="devour_logo.svg" alt="Devour Logo" width="300">
|
<img src="devour_logo.svg" alt="Devour Logo" width="220">
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<h1 align="center">Devour</h1>
|
<h1 align="center">Devour</h1>
|
||||||
|
<p align="center">Feed your AI real docs so it stops repeating the same mistakes.</p>
|
||||||
|
|
||||||
<p align="center">
|
## Why Devour exists
|
||||||
<strong>Context Ingestion & Management for AI</strong>
|
I built Devour because AI tools kept drifting away from official docs, then repeating wrong patterns in later prompts.
|
||||||
</p>
|
|
||||||
|
|
||||||
<p align="center">
|
Devour fixes that loop with a local-first workflow:
|
||||||
<a href="#features">Features</a> •
|
1. scrape official docs,
|
||||||
<a href="#installation">Installation</a> •
|
2. keep them in your project,
|
||||||
<a href="#quick-start">Quick Start</a> •
|
3. search and ask against that local corpus,
|
||||||
<a href="#architecture">Architecture</a> •
|
4. sync updates when sources change.
|
||||||
<a href="#cli-reference">CLI Reference</a> •
|
|
||||||
<a href="#configuration">Configuration</a>
|
|
||||||
</p>
|
|
||||||
|
|
||||||
---
|
## What works today
|
||||||
|
- `devour init` creates a complete local workspace
|
||||||
|
- `devour scrape` supports URL, language-specific docs, local files, GitHub repos, OpenAPI specs, and `--sources`
|
||||||
|
- `devour get` supports language/framework shortcuts (now including Next.js, Svelte, Angular, Remix, SolidJS, Express)
|
||||||
|
- `devour query` is now functional (local lexical index)
|
||||||
|
- `devour ask` is hybrid local-first with live fallback when local relevance is weak
|
||||||
|
- `devour sync` updates configured sources and reindexes
|
||||||
|
- `devour status` reports real local health/statistics
|
||||||
|
- `devour push <path>` imports local docs into the workspace and reindexes
|
||||||
|
- `devour serve` local mode runs JSON-RPC over stdio (`devour_query`, `devour_status`, `devour_scrape`, `devour_ask`, `devour_sync`)
|
||||||
|
- `devour auto` routes natural-language intent to the right command
|
||||||
|
- `devour verify smoke` runs opt-in live checks and writes reports
|
||||||
|
|
||||||
## What is Devour?
|
## Experimental
|
||||||
|
- `devour serve --remote` is available as an experimental HTTP RPC mode
|
||||||
Devour is a **context ingestion and management system** designed to feed structured, relevant context to AI models for generating accurate, fully working code.
|
- Remote push workflows are intentionally not yet enabled as stable behavior
|
||||||
|
|
||||||
It scrapes, indexes, and serves documentation from multiple sources:
|
|
||||||
- GitHub repositories
|
|
||||||
- OpenAPI/Swagger specifications
|
|
||||||
- Markdown/HTML documentation sites
|
|
||||||
- JSON/YAML schemas
|
|
||||||
- Local project files
|
|
||||||
|
|
||||||
### Two Modes of Operation
|
|
||||||
|
|
||||||
| Mode | Description | Use Case |
|
|
||||||
|------|-------------|----------|
|
|
||||||
| **Local** | Runs as an OpenCode skill on your machine | Single developer, offline work |
|
|
||||||
| **Remote** | MCP server hosted on your infrastructure | Teams, multi-project support |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Features
|
|
||||||
|
|
||||||
### 🕷️ Multi-Source Scraping
|
|
||||||
- **GitHub** - Clone and parse repos, extract README, docs, code structure
|
|
||||||
- **OpenAPI** - Parse Swagger specs into structured endpoints
|
|
||||||
- **Web Docs** - Crawl documentation sites with Colly
|
|
||||||
- **Local Files** - Index your project's docs folder
|
|
||||||
|
|
||||||
### 🧠 Intelligent Indexing
|
|
||||||
- Vector embeddings via OpenAI (text-embedding-3-small/large)
|
|
||||||
- Semantic similarity search for context retrieval
|
|
||||||
- Metadata tracking (source, timestamp, file type)
|
|
||||||
|
|
||||||
### 🔄 Automatic Updates
|
|
||||||
- Configurable scheduler (default: every 3 days)
|
|
||||||
- Content hash comparison for change detection
|
|
||||||
- Automatic re-indexing on updates
|
|
||||||
|
|
||||||
### 🔌 MCP Integration
|
|
||||||
- Exposes context via Model Context Protocol
|
|
||||||
- **Local mode**: stdio transport (OpenCode skill)
|
|
||||||
- **Remote mode**: HTTP/SSE transport (MCP server)
|
|
||||||
|
|
||||||
### 💾 Flexible Storage
|
|
||||||
```
|
|
||||||
devour_data/
|
|
||||||
├── docs/ # Raw scraped documents
|
|
||||||
├── index/ # Vector embeddings
|
|
||||||
└── metadata/ # Source tracking & timestamps
|
|
||||||
```
|
|
||||||
|
|
||||||
### 📊 Quality Scorecard
|
|
||||||
|
|
||||||
Devour includes a built-in code quality analysis system that generates comprehensive scorecards for your project.
|
|
||||||
|
|
||||||
#### Three Scorecard Versions
|
|
||||||
|
|
||||||
**1. Compact Scorecard** - Quick overview with 3 key metrics
|
|
||||||

|
|
||||||
|
|
||||||
**2. Detailed Scorecard** - Comprehensive breakdown with charts and analytics
|
|
||||||

|
|
||||||
|
|
||||||
**3. Original Scorecard** - Classic balanced view
|
|
||||||

|
|
||||||
|
|
||||||
#### Usage
|
|
||||||
|
|
||||||
|
## Quick start
|
||||||
|
### 1) Build
|
||||||
```bash
|
```bash
|
||||||
# Run quality analysis with default (original) scorecard
|
|
||||||
devour quality scan
|
|
||||||
|
|
||||||
# Generate compact scorecard
|
|
||||||
devour quality scan --badge-path scorecard_compact.png --format compact
|
|
||||||
|
|
||||||
# Generate detailed scorecard
|
|
||||||
devour quality scan --badge-path scorecard_detailed.png --format detailed
|
|
||||||
|
|
||||||
# Generate dark theme versions
|
|
||||||
devour quality scan --badge-path scorecard_dark.png --theme dark
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Features
|
|
||||||
- **Multi-theme support** - Light and dark themes
|
|
||||||
- **Three formats** - Compact, detailed, and original layouts
|
|
||||||
- **Real scan data** - Analyzes actual code quality issues
|
|
||||||
- **Multi-language support** - Go, Python, JavaScript, TypeScript, Java, Rust
|
|
||||||
- **Severity-based scoring** - T1 (auto-fixable) to T4 (major refactor)
|
|
||||||
- **Technical debt tracking** - Track improvements over time
|
|
||||||
- **Comprehensive metrics** - Complexity, duplication, security, coverage, and more
|
|
||||||
|
|
||||||
#### Score Metrics
|
|
||||||
|
|
||||||
- **Overall Score** - General code health (0-100%)
|
|
||||||
- **Strict Score** - Conservative scoring ignoring quick wins
|
|
||||||
- **Grade** - Letter grade (A-F) based on overall score
|
|
||||||
- **Findings by Type** - Issues grouped by category
|
|
||||||
- **Findings by Severity** - Issues grouped by impact level
|
|
||||||
- **Dimension Breakdown** - Detailed analysis per quality dimension
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
### Prerequisites
|
|
||||||
- Go 1.22+
|
|
||||||
- OpenAI API key (for embeddings)
|
|
||||||
|
|
||||||
### From Source
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Clone the repository
|
|
||||||
git clone https://github.com/yourorg/devour.git
|
|
||||||
cd devour
|
|
||||||
|
|
||||||
# Install dependencies
|
|
||||||
go mod download
|
go mod download
|
||||||
|
|
||||||
# Build
|
|
||||||
go build -o devour ./cmd/devour
|
go build -o devour ./cmd/devour
|
||||||
|
|
||||||
# Install globally
|
|
||||||
go install ./cmd/devour
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Quick Install
|
### 2) Initialize
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
go install github.com/yourorg/devour/cmd/devour@latest
|
./devour init
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
### 3) Get docs fast
|
||||||
|
|
||||||
## Quick Start
|
|
||||||
|
|
||||||
### 1. Initialize a Project
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Create devour config in current directory
|
./devour get go net/http --format markdown
|
||||||
devour init
|
./devour get nextjs routing
|
||||||
|
./devour get express middleware
|
||||||
# Or specify a path
|
|
||||||
devour init ./my-project
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Get Documentation (NEW!)
|
### 4) Search locally
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Quick access to popular language/framework docs
|
./devour query "json unmarshal"
|
||||||
devour get go http # Go HTTP package
|
|
||||||
devour get python asyncio # Python asyncio module
|
|
||||||
devour get react hooks # React Hooks documentation
|
|
||||||
devour get docker compose # Docker Compose docs
|
|
||||||
devour get rust tokio # Rust Tokio crate
|
|
||||||
devour get spring boot # Spring Boot framework
|
|
||||||
|
|
||||||
# Enhanced markdown output
|
|
||||||
devour get go http --format markdown
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**Supported Languages:**
|
### 5) Ask a docs-grounded question
|
||||||
- `go`, `golang` - Go packages (pkg.go.dev)
|
|
||||||
- `rust` - Rust crates (docs.rs)
|
|
||||||
- `python`, `py` - Python modules (docs.python.org)
|
|
||||||
- `java` - Java packages (docs.oracle.com)
|
|
||||||
- `spring` - Spring Boot (docs.spring.io)
|
|
||||||
- `typescript`, `ts` - TypeScript (typescriptlang.org)
|
|
||||||
- `react` - React (react.dev)
|
|
||||||
- `vue` - Vue.js (vuejs.org)
|
|
||||||
- `nuxt` - Nuxt (nuxt.com)
|
|
||||||
- `docker` - Docker (docs.docker.com)
|
|
||||||
- `cloudflare`, `cf` - Cloudflare (developers.cloudflare.com)
|
|
||||||
- `astro` - Astro (docs.astro.build)
|
|
||||||
|
|
||||||
### 3. Scrape Documentation
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Scrape from a URL
|
./devour ask --lang go "how to parse json" --format text
|
||||||
devour scrape https://docs.example.com
|
|
||||||
|
|
||||||
# Scrape a GitHub repo
|
|
||||||
devour scrape https://github.com/org/repo
|
|
||||||
|
|
||||||
# Scrape local docs
|
|
||||||
devour scrape ./docs
|
|
||||||
|
|
||||||
# Multiple sources
|
|
||||||
devour scrape --sources sources.yaml
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### 4. Query Context
|
### 6) Let Devour route intent automatically
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Search indexed docs
|
./devour auto "how to use useEffect in react"
|
||||||
devour query "How do I authenticate with the API?"
|
./devour auto "https://pkg.go.dev/net/http"
|
||||||
|
|
||||||
# With options
|
|
||||||
devour query "authentication" --limit 5 --format json
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### 5. Start the Server
|
## Core commands
|
||||||
|
| Command | Purpose |
|
||||||
|
| --- | --- |
|
||||||
|
| `devour init [path]` | Create config + local storage directories |
|
||||||
|
| `devour get <language> <keyword>` | Shortcut fetch from curated official docs |
|
||||||
|
| `devour scrape <source>` | Scrape one source directly |
|
||||||
|
| `devour scrape --sources sources.yaml` | Scrape multiple configured sources |
|
||||||
|
| `devour query <text>` | Search local indexed docs |
|
||||||
|
| `devour ask --lang <lang> <question>` | Structured answer using local-first + live fallback |
|
||||||
|
| `devour sync` | Sync configured sources and rebuild index |
|
||||||
|
| `devour status` | Show docs/index/source health |
|
||||||
|
| `devour push <path>` | Import local docs into workspace |
|
||||||
|
| `devour serve` | Start local stdio JSON-RPC server |
|
||||||
|
| `devour auto "<intent>"` | Auto-route intent to command |
|
||||||
|
| `devour verify smoke` | Live smoke verification report |
|
||||||
|
| `devour quality ...` | Code quality scan/triage/fixes |
|
||||||
|
|
||||||
|
## Supported `get` / `ask` languages and frameworks
|
||||||
|
- Go (`go`, `golang`)
|
||||||
|
- Rust (`rust`)
|
||||||
|
- Python (`python`, `py`)
|
||||||
|
- Java (`java`)
|
||||||
|
- Spring (`spring`)
|
||||||
|
- TypeScript (`typescript`, `ts`)
|
||||||
|
- React (`react`)
|
||||||
|
- Vue (`vue`)
|
||||||
|
- Nuxt (`nuxt`)
|
||||||
|
- Docker (`docker`)
|
||||||
|
- Cloudflare (`cloudflare`, `cf`)
|
||||||
|
- Astro (`astro`)
|
||||||
|
- C# (`csharp`, `cs`)
|
||||||
|
- Kotlin (`kotlin`, `kt`)
|
||||||
|
- PHP (`php`)
|
||||||
|
- Ruby (`ruby`, `rb`)
|
||||||
|
- Elixir (`elixir`, `ex`)
|
||||||
|
- Next.js (`next`, `nextjs`)
|
||||||
|
- Svelte (`svelte`)
|
||||||
|
- Angular (`angular`, `ng`)
|
||||||
|
- Remix (`remix`)
|
||||||
|
- Solid (`solid`, `solidjs`) via `github.com/solidjs/solid-docs`
|
||||||
|
- Express (`express`, `expressjs`)
|
||||||
|
|
||||||
|
Run `devour languages` for examples, or `devour languages --format json` for automation.
|
||||||
|
|
||||||
|
## Config
|
||||||
|
Devour reads `devour.yaml` (or `--config`).
|
||||||
|
|
||||||
|
New additive sections:
|
||||||
|
- `indexing`: local lexical index defaults
|
||||||
|
- `verification`: live smoke timeout defaults
|
||||||
|
|
||||||
|
Starter config: `devour.example.yaml`.
|
||||||
|
|
||||||
|
## Real-world verification
|
||||||
|
Run live smoke checks (opt-in):
|
||||||
```bash
|
```bash
|
||||||
# Local MCP server (stdio transport)
|
./devour verify smoke
|
||||||
devour serve
|
|
||||||
|
|
||||||
# Remote MCP server (HTTP)
|
|
||||||
devour serve --remote --port 8080
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### 6. Check Status
|
Reports are saved to `devour_data/verify/smoke-<timestamp>.json`.
|
||||||
|
|
||||||
```bash
|
## JSON-RPC local server
|
||||||
devour status
|
`devour serve` (local mode) accepts JSON-RPC 2.0 methods:
|
||||||
```
|
- `devour_query`
|
||||||
|
- `devour_status`
|
||||||
---
|
- `devour_scrape`
|
||||||
|
- `devour_ask`
|
||||||
## Enhanced Features
|
- `devour_sync`
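For orientation, a JSON-RPC 2.0 request for one of the methods listed above could be built as follows. This is a sketch under assumptions: the envelope fields (`jsonrpc`, `id`, `method`, `params`) come from the JSON-RPC 2.0 specification, but the `query` parameter name is hypothetical, and how `devour serve` frames messages on stdio is not described here. `rpcRequest` and `newRequest` are illustrative names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is the standard JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string      `json:"jsonrpc"`
	ID      int         `json:"id"`
	Method  string      `json:"method"`
	Params  interface{} `json:"params,omitempty"`
}

// newRequest marshals a request; params may be nil for parameterless calls.
func newRequest(id int, method string, params interface{}) ([]byte, error) {
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: method, Params: params})
}

func main() {
	// The "query" parameter name is an assumption, not a documented field.
	payload, err := newRequest(1, "devour_query", map[string]string{"query": "json unmarshal"})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(payload))
}
```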
|
||||||
|
|
||||||
### 🎯 Simplified Language Interface
|
|
||||||
|
|
||||||
The new `devour get` command provides instant access to documentation for popular languages and frameworks without needing to remember full URLs:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Instead of: devour scrape https://pkg.go.dev/net/http
|
|
||||||
devour get go http
|
|
||||||
|
|
||||||
# Instead of: devour scrape https://react.dev/reference/react/hooks
|
|
||||||
devour get react hooks
|
|
||||||
|
|
||||||
# Instead of: devour scrape https://docs.docker.com/compose
|
|
||||||
devour get docker compose
|
|
||||||
```
|
|
||||||
|
|
||||||
### 📝 Rich Markdown Output
|
|
||||||
|
|
||||||
Enable enhanced markdown formatting for beautiful, structured documentation:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
devour get go http --format markdown
|
|
||||||
```
|
|
||||||
|
|
||||||
**Features:**
|
|
||||||
- 📋 Document metadata tables
|
|
||||||
- 📑 Auto-generated table of contents
|
|
||||||
- 🎨 Enhanced typography with emoji indicators
|
|
||||||
- 🔗 Automatic link conversion
|
|
||||||
- 📚 Structured content sections
|
|
||||||
- 🏷️ Source attribution and timestamps
|
|
||||||
|
|
||||||
### 🧠 Smart Content Enhancement
|
|
||||||
|
|
||||||
The markdown formatter automatically:
|
|
||||||
- Converts plain URLs to clickable links
|
|
||||||
- Adds visual indicators for examples, notes, and warnings
|
|
||||||
- Fixes code block formatting
|
|
||||||
- Generates proper heading structure
|
|
||||||
- Creates document metadata tables
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
```
|
|
||||||
┌─────────────────────────────────────────────────────────────────┐
|
|
||||||
│ Devour System │
|
|
||||||
├─────────────────────────────────────────────────────────────────┤
|
|
||||||
│ │
|
|
||||||
│ ┌─────────┐ ┌──────────┐ ┌───────────┐ ┌──────────┐ │
|
|
||||||
│ │ Scraper │───▶│ Indexer │───▶│ Storage │───▶│ Server │ │
|
|
||||||
│ └─────────┘ └──────────┘ └───────────┘ └──────────┘ │
|
|
||||||
│ │ │ │ │ │
|
|
||||||
│ ▼ ▼ ▼ ▼ │
|
|
||||||
│ ┌─────────┐ ┌──────────┐ ┌───────────┐ ┌──────────┐ │
|
|
||||||
│ │ GitHub │ │ OpenAI │ │ Vector DB │ │ MCP │ │
|
|
||||||
│ │ Web │ │ Embeds │ │ (chromem) │ │ Protocol │ │
|
|
||||||
│ │ Local │ │ │ │ │ │ │ │
|
|
||||||
│ └─────────┘ └──────────┘ └───────────┘ └──────────┘ │
|
|
||||||
│ │
|
|
||||||
│ ┌─────────────────────────────────────────────────────────┐ │
|
|
||||||
│ │ Scheduler │ │
|
|
||||||
│ │ (Auto-update every 3 days, configurable) │ │
|
|
||||||
│ └─────────────────────────────────────────────────────────┘ │
|
|
||||||
│ │
|
|
||||||
└─────────────────────────────────────────────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
### Data Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
User Query → Devour Server → Embedding Generation → Vector Search
|
|
||||||
│
|
|
||||||
▼
|
|
||||||
AI Response ← Context Chunks ← Top-K Relevant Docs ←───┘
|
|
||||||
```
|
|
||||||
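The vector search step in the flow above can be sketched in a few lines of Go. This is a minimal in-memory illustration, not Devour's actual implementation: the `Doc`, `Hit`, `cosine`, and `topK` names are hypothetical, and a real deployment would delegate the search to the configured vector database. The `k` and `threshold` parameters correspond to the `--limit` and `--threshold` query flags.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Doc is a hypothetical indexed chunk: its text plus an embedding vector.
type Doc struct {
	Content string
	Vector  []float64
}

// Hit pairs a document with its similarity to the query.
type Hit struct {
	Doc   Doc
	Score float64
}

// cosine returns the cosine similarity of a and b, or 0 when the lengths
// differ or either vector has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns at most k documents whose similarity to query is at least
// threshold, ordered from most to least similar.
func topK(query []float64, docs []Doc, k int, threshold float64) []Hit {
	if k < 0 {
		k = 0
	}
	hits := make([]Hit, 0, len(docs))
	for _, d := range docs {
		if s := cosine(query, d.Vector); s >= threshold {
			hits = append(hits, Hit{Doc: d, Score: s})
		}
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	docs := []Doc{
		{"auth guide", []float64{1, 0}},
		{"rate limits", []float64{0, 1}},
		{"login flow", []float64{0.9, 0.1}},
	}
	for _, h := range topK([]float64{1, 0}, docs, 5, 0.7) {
		fmt.Printf("%.2f %s\n", h.Score, h.Doc.Content)
	}
}
```

Filtering by threshold before truncating to `k` means a query with no sufficiently similar documents returns an empty slice rather than `k` weak matches.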
|
|
||||||
---
|
|
||||||
|
|
||||||
## CLI Reference
|
|
||||||
|
|
||||||
### Commands
|
|
||||||
|
|
||||||
| Command | Description |
|
|
||||||
|---------|-------------|
|
|
||||||
| `devour init [path]` | Initialize Devour for a project |
|
|
||||||
| `devour get <language> <keyword>` | **NEW** Quick docs fetch for popular languages |
|
|
||||||
| `devour scrape <source>` | Scrape docs from URL, repo, or path |
|
|
||||||
| `devour serve` | Start MCP server (local or remote) |
|
|
||||||
| `devour query <text>` | Search indexed documentation |
|
|
||||||
| `devour status` | Show index stats and last update |
|
|
||||||
| `devour sync` | Fetch updates from all sources |
|
|
||||||
| `devour push <path>` | Push docs to remote MCP server |
|
|
||||||
|
|
||||||
### Flags
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Global flags
|
|
||||||
--config, -c Config file path (default: ./devour.yaml)
|
|
||||||
--verbose, -v Enable verbose logging
|
|
||||||
--quiet, -q Suppress non-error output
|
|
||||||
|
|
||||||
# scrape flags
|
|
||||||
--sources, -s YAML file with source definitions
|
|
||||||
--format, -f Output format: json, markdown (default: json)
|
|
||||||
--concurrency Parallel scraping workers (default: 10)
|
|
||||||
|
|
||||||
# serve flags
|
|
||||||
--remote Run as remote HTTP server
|
|
||||||
--port, -p HTTP port (default: 8080)
|
|
||||||
--host HTTP host (default: localhost)
|
|
||||||
|
|
||||||
# query flags
|
|
||||||
--limit, -l Max results (default: 5)
|
|
||||||
--format, -f Output: json, text, markdown
|
|
||||||
--threshold Similarity threshold (default: 0.7)
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Configuration
|
|
||||||
|
|
||||||
### devour.yaml
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
# Devour Configuration
|
|
||||||
|
|
||||||
# Storage paths
|
|
||||||
storage:
|
|
||||||
docs_dir: ./devour_data/docs
|
|
||||||
index_dir: ./devour_data/index
|
|
||||||
metadata_dir: ./devour_data/metadata
|
|
||||||
|
|
||||||
# Embedding settings
|
|
||||||
embeddings:
|
|
||||||
provider: openai # openai, local
|
|
||||||
model: text-embedding-3-small
|
|
||||||
api_key: ${OPENAI_API_KEY} # Env var reference
|
|
||||||
|
|
||||||
# Vector database
|
|
||||||
vector_db:
|
|
||||||
type: chromem # chromem, weaviate, faiss
|
|
||||||
persist: true
|
|
||||||
|
|
||||||
# Scraping settings
|
|
||||||
scraper:
|
|
||||||
user_agent: "Devour/1.0"
|
|
||||||
timeout: 30s
|
|
||||||
retry_count: 3
|
|
||||||
concurrency: 10
|
|
||||||
rate_limit: 500ms
|
|
||||||
|
|
||||||
# Scheduler
|
|
||||||
scheduler:
|
|
||||||
enabled: true
|
|
||||||
interval: 72h # Every 3 days
|
|
||||||
check_method: hash # hash, timestamp
|
|
||||||
|
|
||||||
# Server settings
|
|
||||||
server:
|
|
||||||
mode: local # local, remote
|
|
||||||
port: 8080
|
|
||||||
host: localhost
|
|
||||||
|
|
||||||
# Sources (for sync)
|
|
||||||
sources:
|
|
||||||
- name: project-docs
|
|
||||||
type: url
|
|
||||||
url: https://docs.example.com
|
|
||||||
include: ["**/*.md", "**/*.html"]
|
|
||||||
exclude: ["**/api/**"]
|
|
||||||
|
|
||||||
- name: api-spec
|
|
||||||
type: openapi
|
|
||||||
url: https://api.example.com/openapi.json
|
|
||||||
|
|
||||||
- name: github-repo
|
|
||||||
type: github
|
|
||||||
repo: org/repo
|
|
||||||
branch: main
|
|
||||||
paths: ["docs/", "README.md"]
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## API Reference
|
|
||||||
|
|
||||||
### MCP Tools (when running as server)
|
|
||||||
|
|
||||||
#### `devour_query`
|
|
||||||
Search indexed documentation for relevant context.
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"query": "How do I authenticate?",
|
|
||||||
"limit": 5,
|
|
||||||
"threshold": 0.7
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
#### `devour_add`
|
|
||||||
Add documents to the index.
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"documents": [
|
|
||||||
{
|
|
||||||
"content": "Document text...",
|
|
||||||
"metadata": {
|
|
||||||
"source": "https://...",
|
|
||||||
"type": "markdown"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
#### `devour_status`
|
|
||||||
Get indexing status and statistics.
|
|
||||||
|
|
||||||
### REST API (remote mode)
|
|
||||||
|
|
||||||
```
|
|
||||||
GET /health # Health check
|
|
||||||
GET /status # Index statistics
|
|
||||||
POST /query # Search documents
|
|
||||||
POST /documents # Add documents
|
|
||||||
GET /documents # List documents
|
|
||||||
DELETE /documents/:id # Delete document
|
|
||||||
POST /sync # Trigger sync
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Integration Examples
|
|
||||||
|
|
||||||
### With OpenCode (Local Mode)
|
|
||||||
|
|
||||||
Add to your OpenCode skills:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
# ~/.opencode/skills.yaml
|
|
||||||
skills:
|
|
||||||
- name: devour
|
|
||||||
path: /path/to/devour
|
|
||||||
commands:
|
|
||||||
- devour serve
|
|
||||||
```
|
|
||||||
|
|
||||||
Then in OpenCode:
|
|
||||||
```
|
|
||||||
/devour query "authentication flow"
|
|
||||||
```
|
|
||||||
|
|
||||||
### With AI Applications
|
|
||||||
|
|
||||||
```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/yourorg/devour/pkg/client"
)

func main() {
	ctx := context.Background()
	c := client.New("http://localhost:8080")

	results, err := c.Query(ctx, "How do I use the API?", 5)
	if err != nil {
		log.Fatal(err)
	}

	for _, r := range results {
		// Print at most the first 100 bytes so short results do not panic.
		preview := r.Content
		if len(preview) > 100 {
			preview = preview[:100]
		}
		fmt.Printf("Score: %.2f - %s\n", r.Score, preview)
	}
}
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Development
|
## Development
|
||||||
|
### Test
|
||||||
### Project Structure
|
|
||||||
|
|
||||||
```
|
|
||||||
devour/
|
|
||||||
├── cmd/devour/ # CLI entrypoint
|
|
||||||
│ └── main.go
|
|
||||||
├── internal/
|
|
||||||
│ ├── scraper/ # Scraping logic
|
|
||||||
│ ├── indexer/ # Embedding generation
|
|
||||||
│ ├── server/ # MCP server
|
|
||||||
│ ├── scheduler/ # Background updates
|
|
||||||
│ └── ai/ # AI integrations
|
|
||||||
├── pkg/
|
|
||||||
│ ├── client/ # Go client library
|
|
||||||
│ └── types/ # Shared types
|
|
||||||
├── devour_data/ # Default data directory
|
|
||||||
├── go.mod
|
|
||||||
├── Makefile
|
|
||||||
└── README.md
|
|
||||||
```
|
|
||||||
|
|
||||||
### Building
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Development build
|
|
||||||
go build -o devour ./cmd/devour
|
|
||||||
|
|
||||||
# Production build
|
|
||||||
CGO_ENABLED=0 go build -ldflags="-s -w" -o devour ./cmd/devour
|
|
||||||
|
|
||||||
# Run tests
|
|
||||||
go test ./...
|
go test ./...
|
||||||
|
|
||||||
# Run with coverage
|
|
||||||
go test -cover ./...
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Makefile Targets
|
### Typical integration loop
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
make build # Build binary
|
./devour init
|
||||||
make test # Run tests
|
./devour scrape https://pkg.go.dev/net/http --type godocs
|
||||||
make lint # Run linter
|
./devour query "http client"
|
||||||
make docker # Build Docker image
|
./devour ask --lang go "timeout example"
|
||||||
make install # Install locally
|
./devour sync
|
||||||
|
./devour status
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Roadmap
|
|
||||||
|
|
||||||
- [ ] Local LLM support (Ollama, LocalAI)
|
|
||||||
- [ ] Multi-tenant support for remote mode
|
|
||||||
- [ ] Web UI for document management
|
|
||||||
- [ ] Git-based versioning for docs
|
|
||||||
- [ ] Plugin system for custom scrapers
|
|
||||||
- [ ] Reranking with cross-encoders
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Contributing
|
|
||||||
|
|
||||||
Contributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) for details.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
MIT (`LICENSE`).
|
||||||
MIT License - see [LICENSE](LICENSE) for details.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
<p align="center">
|
|
||||||
<sub>Built with ❤️ for better AI context</sub>
|
|
||||||
</p>
|
|
||||||
|
|||||||
@@ -1,14 +1,10 @@
|
|||||||
---
|
---
|
||||||
name: devour
|
name: devour
|
||||||
description: >
|
description: >
|
||||||
Context ingestion and management system for AI. Scrapes, indexes, and serves
|
Use this skill for Devour CLI workflows: scrape docs, get language docs,
|
||||||
documentation from GitHub repos, OpenAPI specs, web docs, and local files.
|
query local index, ask docs-grounded questions, sync sources, run quality
|
||||||
Provides semantic search via vector embeddings to feed relevant context to
|
triage, and verify live smoke checks. Trigger on: "devour", "docs to ai",
|
||||||
AI models. Runs in local mode (stdio) or remote mode (HTTP MCP server).
|
"scrape docs", "ask docs", "query docs", "sync docs", "quality scan".
|
||||||
Supports automatic updates via configurable scheduler. Integrates with
|
|
||||||
OpenAI for embeddings and LLM context injection. Triggers on: "devour",
|
|
||||||
"scrape docs", "index documentation", "context for AI", "vector search docs",
|
|
||||||
"semantic search", "ingest documentation", "documentation to AI".
|
|
||||||
allowed-tools:
|
allowed-tools:
|
||||||
- Read
|
- Read
|
||||||
- Write
|
- Write
|
||||||
@@ -19,637 +15,88 @@ allowed-tools:
|
|||||||
- WebFetch
|
- WebFetch
|
||||||
---
|
---
|
||||||
|
|
||||||
# Devour — Context Ingestion Skill
|
# Devour Skill
|
||||||
|
|
||||||
Comprehensive documentation scraping, indexing, and retrieval system for
|
Use this skill when a task is explicitly about Devour CLI operations or troubleshooting Devour workflows.
|
||||||
feeding structured context to AI models. Orchestrates 5 specialized modules
|
|
||||||
and supports both local (stdio) and remote (HTTP) MCP modes.
|
## What Devour now supports
|
||||||
|
|
||||||
## Quick Reference
|
- `devour init`
|
||||||
|
- `devour get`
|
||||||
| Command | What it does |
|
- `devour scrape`
|
||||||
|---------|-------------|
|
- `devour scrape --sources ...`
|
||||||
| `/devour init [path]` | Initialize Devour for a project |
|
- `devour query`
|
||||||
| `/devour get <language> <keyword>` | **NEW** Quick docs fetch for popular languages |
|
- `devour ask`
|
||||||
| `/devour scrape <source>` | Scrape docs from URL, GitHub, or local path |
|
- `devour sync`
|
||||||
| `/devour serve` | Start MCP server (local or remote) |
|
- `devour status`
|
||||||
| `/devour query <text>` | Search indexed documentation |
|
- `devour push <path>` (local ingest)
|
||||||
| `/devour status` | Show index stats and health |
|
- `devour serve` (local stdio JSON-RPC)
|
||||||
| `/devour sync` | Fetch updates from all configured sources |
|
- `devour auto`
|
||||||
| `/devour push <path>` | Push docs to remote MCP server |
|
- `devour verify smoke`
|
||||||
| `/devour sources` | Manage documentation sources |
|
- `devour quality ...`
|
||||||
| `/devour quality scan [path]` | **NEW** Run code quality analysis |
|
|
||||||
| `/devour quality status` | **NEW** Show quality metrics and trends |
|
Remote server/push workflows are experimental.
|
||||||
| `/devour quality next` | **NEW** Show next priority issue to fix |
|
|
||||||
|
## Fast routing
|
||||||
## Orchestration Logic
|
|
||||||
|
1. User gives URL/source: use `devour scrape`.
|
||||||
When the user invokes `/devour get <language> <keyword>`:
|
2. User gives language+topic: use `devour get`.
|
||||||
|
3. User asks a question: use `devour ask --lang ...`.
|
||||||
1. **Map language to base URL**:
|
4. User wants local search: use `devour query`.
|
||||||
- `go http` → `https://pkg.go.dev/net/http`
|
5. User wants updates from config: use `devour sync`.
|
||||||
- `python asyncio` → `https://docs.python.org/3/library/asyncio.html`
|
6. User wants automatic intent routing: use `devour auto`.
|
||||||
- `react hooks` → `https://react.dev/reference/react/hooks`
|
7. User wants confidence check: use `devour verify smoke`.
|
||||||
- `docker compose` → `https://docs.docker.com/compose`
|
|
||||||
|
## Reliable workflow
|
||||||
2. **Auto-detect source type** based on language:
|
|
||||||
- Go → `godocs` parser
|
|
||||||
- Python → `pythondocs` parser
|
|
||||||
- React → `reactdocs` parser
|
|
||||||
- Docker → `dockerdocs` parser
|
|
||||||
|
|
||||||
3. **Execute enhanced scrape** with pre-configured parameters:
|
|
||||||
- Automatic language-specific parsing
|
|
||||||
- Enhanced markdown formatting (if requested)
|
|
||||||
- Metadata extraction and enrichment
|
|
||||||
|
|
||||||
4. **Return structured documentation**:
|
|
||||||
- Rich markdown with TOC (if `--format markdown`)
|
|
||||||
- JSON with full metadata (default)
|
|
||||||
- Ready for AI context injection
|
|
||||||
|
|
||||||
When the user invokes `/devour scrape`:
|
|
||||||
|
|
||||||
1. **Detect source type** from URL/path:
|
|
||||||
- GitHub: `github.com/org/repo` → Clone, extract docs
|
|
||||||
- OpenAPI: Ends in `.json`/`.yaml` with OpenAPI spec → Parse endpoints
|
|
||||||
- Web: HTTP/HTTPS URL → Crawl with Colly
|
|
||||||
- Local: File path → Scan directory
|
|
||||||
|
|
||||||
2. **Scrape with appropriate parser**:
|
|
||||||
- Extract content (markdown, HTML, code structure)
|
|
||||||
- Clean and normalize text
|
|
||||||
- Extract metadata (title, headings, code blocks)
|
|
||||||
|
|
||||||
3. **Generate embeddings**:
|
|
||||||
- Chunk content appropriately (512-1024 tokens)
|
|
||||||
- Call OpenAI embedding API
|
|
||||||
- Store in vector database
|
|
||||||
|
|
||||||
4. **Update metadata**:
|
|
||||||
- Track source, timestamp, content hash
|
|
||||||
- Enable future update detection
|
|
||||||
|
|
||||||
When the user invokes `/devour query`:
|
|
||||||
|
|
||||||
1. Generate embedding for query text
|
|
||||||
2. Perform vector similarity search
|
|
||||||
3. Return top-K results with metadata
|
|
||||||
4. Optionally inject into AI context
|
|
||||||
|
|
||||||
## Enhanced Features
|
|
||||||
|
|
||||||
### 🎯 Language-Aware Documentation Access
|
|
||||||
|
|
||||||
The `devour get` command provides intelligent, language-specific documentation retrieval:
|
|
||||||
|
|
||||||
**Supported Languages & Mappings:**
|
|
||||||
- `go`, `golang` → Go packages (pkg.go.dev)
|
|
||||||
- `rust` → Rust crates (docs.rs)
|
|
||||||
- `python`, `py` → Python modules (docs.python.org)
|
|
||||||
- `java` → Java packages (docs.oracle.com)
|
|
||||||
- `spring` → Spring Boot (docs.spring.io)
|
|
||||||
- `typescript`, `ts` → TypeScript (typescriptlang.org)
|
|
||||||
- `react` → React (react.dev)
|
|
||||||
- `vue` → Vue.js (vuejs.org)
|
|
||||||
- `nuxt` → Nuxt (nuxt.com)
|
|
||||||
- `docker` → Docker (docs.docker.com)
|
|
||||||
- `cloudflare`, `cf` → Cloudflare (developers.cloudflare.com)
|
|
||||||
- `astro` → Astro (docs.astro.build)
|
|
||||||
|
|
||||||
**Usage Examples:**
|
|
||||||
```bash
|
|
||||||
/devour get go http # Go HTTP package docs
|
|
||||||
/devour get python asyncio # Python asyncio module
|
|
||||||
/devour get react hooks # React Hooks reference
|
|
||||||
/devour get docker compose # Docker Compose guide
|
|
||||||
/devour get rust tokio # Rust Tokio crate docs
|
|
||||||
```
|
|
||||||
|
|
||||||
### 📝 Rich Markdown Enhancement
|
|
||||||
|
|
||||||
When using `--format markdown`, Devour automatically enhances documentation:
|
|
||||||
|
|
||||||
**Auto-Generated Structure:**
|
|
||||||
- 📋 Document metadata tables (source, type, timestamp)
|
|
||||||
- 📑 Table of contents from headings
|
|
||||||
- 🎨 Visual indicators for important content
|
|
||||||
- 🔗 Automatic URL-to-link conversion
|
|
||||||
- 📚 Proper heading hierarchy
|
|
||||||
|
|
||||||
**Content Enhancement:**
|
|
||||||
- `Example:` → 💡 **Example:**
|
|
||||||
- `Note:` → 📝 **Note:**
|
|
||||||
- `Warning:` → ⚠️ **Warning:**
|
|
||||||
- `Important:` → ❗ **Important:**
|
|
||||||
- `TODO:` → 📋 **TODO:**
|
|
||||||
|
|
||||||
**Example Output Structure:**
|
|
||||||
```markdown
|
|
||||||
# Package Name
|
|
||||||
|
|
||||||
## 📋 Document Information
|
|
||||||
| Property | Value |
|
|
||||||
|----------|-------|
|
|
||||||
| **Source** | https://pkg.go.dev/net/http |
|
|
||||||
| **Type** | `godocs` |
|
|
||||||
| **Scraped** | 2026-02-19 12:30:00 |
|
|
||||||
|
|
||||||
## 📑 Table of Contents
|
|
||||||
- [Functions](#functions)
|
|
||||||
- [Types](#types)
|
|
||||||
- [Examples](#examples)
|
|
||||||
|
|
||||||
## 📚 Content
|
|
||||||
# Functions
|
|
||||||
|
|
||||||
💡 **Example:** Usage example here...
|
|
||||||
```
|
|
||||||
|
|
||||||
## Source Type Detection
|
|
||||||
|
|
||||||
| Pattern | Type | Parser |
|
|
||||||
|---------|------|--------|
|
|
||||||
| `github.com/*/*` | GitHub | Git clone + markdown parser |
|
|
||||||
| `*.json` + OpenAPI keys | OpenAPI | Swagger parser |
|
|
||||||
| `http://*`, `https://*` | Web | Colly crawler |
|
|
||||||
| `./path`, `/path` | Local | Directory scanner |
|
|
||||||
| `*.md`, `*.rst`, `*.txt` | File | Direct parse |
|
|
||||||
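The detection table above can be read as an ordered set of checks. The Go sketch below is one possible reading, not Devour's actual code: `detectSourceType` is a hypothetical name, the precedence between the `File` and `Local` rows is an assumption, and the "OpenAPI keys" check is reduced to a boolean the caller supplies after inspecting the document.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// detectSourceType classifies src using the patterns from the table above.
// hasOpenAPIKeys reports whether the document contains OpenAPI top-level
// keys; the caller is assumed to have checked that already.
func detectSourceType(src string, hasOpenAPIKeys bool) string {
	trimmed := strings.TrimPrefix(strings.TrimPrefix(src, "https://"), "http://")
	isURL := trimmed != src

	// github.com/<org>/<repo>, with or without a scheme.
	if strings.HasPrefix(trimmed, "github.com/") {
		rest := strings.TrimPrefix(trimmed, "github.com/")
		parts := strings.Split(strings.Trim(rest, "/"), "/")
		if len(parts) >= 2 && parts[0] != "" && parts[1] != "" {
			return "github"
		}
	}
	if strings.HasSuffix(src, ".json") && hasOpenAPIKeys {
		return "openapi"
	}
	if isURL {
		return "web"
	}
	// Assumption: a single documentation file wins over the directory rule.
	switch path.Ext(src) {
	case ".md", ".rst", ".txt":
		return "file"
	}
	if strings.HasPrefix(src, "./") || strings.HasPrefix(src, "/") {
		return "local"
	}
	return "unknown"
}

func main() {
	for _, s := range []string{
		"https://github.com/org/repo",
		"https://docs.example.com",
		"./docs",
		"README.md",
	} {
		fmt.Printf("%-30s %s\n", s, detectSourceType(s, false))
	}
}
```

Checking the GitHub pattern before the generic URL rule matters, because a GitHub repository URL also matches `https://*`.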
|
|
||||||
## Module Reference
|
|
||||||
|
|
||||||
### 1. Scraper Module (`internal/scraper`)
|
|
||||||
|
|
||||||
Responsible for fetching and parsing content from various sources.
|
|
||||||
|
|
||||||
**Supported sources:**
|
|
||||||
- GitHub repositories (clone, extract docs/, README.md)
|
|
||||||
- OpenAPI/Swagger specs (parse endpoints, schemas)
|
|
||||||
- HTML documentation sites (crawl, extract content)
|
|
||||||
- Markdown files (parse structure, code blocks)
|
|
||||||
- JSON/YAML configuration files
|
|
||||||
|
|
||||||
**Output format:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"id": "doc-uuid",
|
|
||||||
"source": "https://...",
|
|
||||||
"type": "markdown",
|
|
||||||
"title": "Document Title",
|
|
||||||
"content": "Extracted text...",
|
|
||||||
"metadata": {
|
|
||||||
"headings": ["H1", "H2"],
|
|
||||||
"code_blocks": ["go", "bash"],
|
|
||||||
"links": ["url1", "url2"]
|
|
||||||
},
|
|
||||||
"timestamp": "2025-01-15T10:00:00Z"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### 2. Indexer Module (`internal/indexer`)
|
|
||||||
|
|
||||||
Converts documents into vector embeddings for semantic search.
|
|
||||||
|
|
||||||
**Features:**
|
|
||||||
- OpenAI embedding integration (text-embedding-3-small/large)
|
|
||||||
- Intelligent chunking (512-1024 tokens, respect boundaries)
|
|
||||||
- Metadata preservation
|
|
||||||
- Batch processing for efficiency
|
|
||||||
|
|
||||||
**Chunking strategy:**
|
|
||||||
```go
|
|
||||||
type Chunk struct {
|
|
||||||
ID string
|
|
||||||
DocID string
|
|
||||||
Content string
|
|
||||||
Vector []float32
|
|
||||||
Metadata map[string]any
|
|
||||||
Position int // Position in original doc
|
|
||||||
}
|
|
||||||
```
|
|
||||||
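A boundary-respecting chunker that produces values shaped like the `Chunk` struct above might look like the following. This is a sketch under stated assumptions rather than the indexer's real logic: `chunkDocument` is a hypothetical name, paragraphs (blank-line separated) stand in for "boundaries", and tokens are approximated by whitespace-separated words instead of a model tokenizer.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk mirrors a subset of the fields shown above.
type Chunk struct {
	ID       string
	DocID    string
	Content  string
	Position int // index of the chunk within the original document
}

// chunkDocument packs whole paragraphs into chunks of at most maxTokens
// tokens. Tokens are approximated by whitespace-separated words, which is an
// assumption; a real tokenizer would count differently. A single paragraph
// longer than maxTokens is kept whole rather than split mid-paragraph.
func chunkDocument(docID, text string, maxTokens int) []Chunk {
	var chunks []Chunk
	var current []string
	count := 0

	// flush emits the paragraphs gathered so far as one chunk.
	flush := func() {
		if len(current) == 0 {
			return
		}
		pos := len(chunks)
		chunks = append(chunks, Chunk{
			ID:       fmt.Sprintf("%s-%d", docID, pos),
			DocID:    docID,
			Content:  strings.Join(current, "\n\n"),
			Position: pos,
		})
		current, count = nil, 0
	}

	for _, para := range strings.Split(text, "\n\n") {
		para = strings.TrimSpace(para)
		if para == "" {
			continue
		}
		n := len(strings.Fields(para))
		if count+n > maxTokens {
			flush()
		}
		current = append(current, para)
		count += n
	}
	flush()
	return chunks
}

func main() {
	text := "one two three\n\nfour five\n\nsix seven eight nine"
	for _, c := range chunkDocument("doc", text, 5) {
		fmt.Printf("%s (pos %d): %q\n", c.ID, c.Position, c.Content)
	}
}
```

With the 512-1024 token range quoted above, `maxTokens` would be set to the upper bound; the small value in `main` only keeps the demonstration readable.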
|
|
||||||
### 3. Server Module (`internal/server`)
|
|
||||||
|
|
||||||
Exposes context via MCP protocol.
|
|
||||||
|
|
||||||
**Local mode (stdio):**
|
|
||||||
```
|
|
||||||
STDIN → JSON-RPC → Handler → Response → STDOUT
|
|
||||||
```
|
|
||||||
|
|
||||||
**Remote mode (HTTP):**
|
|
||||||
```
|
|
||||||
HTTP Request → Handler → Response → HTTP Response
|
|
||||||
```
|
|
||||||
|
|
||||||
**MCP Tools exposed:**
|
|
||||||
- `devour_query` - Semantic search
|
|
||||||
- `devour_add` - Add documents
|
|
||||||
- `devour_status` - Get stats
|
|
||||||
- `devour_sync` - Trigger update
|
|
||||||
|
|
||||||
**MCP Resources:**
|
|
||||||
- `devour://documents` - All indexed docs
|
|
||||||
- `devour://sources` - Configured sources
|
|
||||||
- `devour://stats` - Index statistics
|
|
||||||
|
|
||||||
### 4. Scheduler Module (`internal/scheduler`)
|
|
||||||
|
|
||||||
Manages automatic updates from configured sources.
|
|
||||||
|
|
||||||
**Default schedule:** Every 72 hours (3 days)
|
|
||||||
|
|
||||||
**Change detection methods:**
|
|
||||||
- Content hash comparison (default)
|
|
||||||
- Last-Modified timestamp
|
|
||||||
- ETag header
|
|
||||||
- Git commit hash (for repos)
|
|
||||||
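Content hash comparison, the default method listed above, can be illustrated with a short Go sketch. The hash function is not specified in this document, so SHA-256 is an assumption here, and `contentHash` and `hasChanged` are hypothetical names rather than scheduler APIs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentHash returns the hex-encoded SHA-256 digest of content.
func contentHash(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// hasChanged reports whether content differs from what was last recorded for
// source. stored maps a source name to the hash saved at the previous sync
// and is updated in place when a change is detected.
func hasChanged(stored map[string]string, source string, content []byte) bool {
	h := contentHash(content)
	if stored[source] == h {
		return false
	}
	stored[source] = h
	return true
}

func main() {
	stored := map[string]string{}
	fmt.Println(hasChanged(stored, "project-docs", []byte("v1"))) // true: first sync
	fmt.Println(hasChanged(stored, "project-docs", []byte("v1"))) // false: unchanged
	fmt.Println(hasChanged(stored, "project-docs", []byte("v2"))) // true: content edited
}
```

Hashing requires downloading the content on every check; the timestamp, ETag, and commit-hash methods in the list avoid that cost when the source exposes them.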
|
|
||||||
**Configuration:**
|
|
||||||
```yaml
|
|
||||||
scheduler:
|
|
||||||
enabled: true
|
|
||||||
interval: 72h
|
|
||||||
check_method: hash
|
|
||||||
retry_count: 3
|
|
||||||
retry_delay: 1h
|
|
||||||
```
|
|
||||||
|
|
||||||
### 5. AI Module (`internal/ai`)
|
|
||||||
|
|
||||||
Handles AI integrations for embeddings and context injection.
|
|
||||||
|
|
||||||
**Supported providers:**
|
|
||||||
- OpenAI (primary)
|
|
||||||
- Ollama (local, planned)
|
|
||||||
- Custom endpoints
|
|
||||||
|
|
||||||
**Context injection format:**
|
|
||||||
```go
|
|
||||||
type Context struct {
|
|
||||||
Query string
|
|
||||||
Results []SearchResult
|
|
||||||
SystemPrompt string
|
|
||||||
}
|
|
||||||
|
|
||||||
func (c *Context) ToPrompt() string {
|
|
||||||
// Format for LLM consumption
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Configuration Schema
|
|
||||||
|
|
||||||
### devour.yaml
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
# Core configuration
|
|
||||||
version: 1
|
|
||||||
|
|
||||||
# Storage paths
|
|
||||||
storage:
|
|
||||||
docs_dir: ./devour_data/docs
|
|
||||||
index_dir: ./devour_data/index
|
|
||||||
metadata_dir: ./devour_data/metadata
|
|
||||||
|
|
||||||
# Embedding configuration
|
|
||||||
embeddings:
|
|
||||||
provider: openai
|
|
||||||
model: text-embedding-3-small
|
|
||||||
dimensions: 1536
|
|
||||||
api_key: ${OPENAI_API_KEY}
|
|
||||||
batch_size: 100
|
|
||||||
|
|
||||||
# Vector database
|
|
||||||
vector_db:
|
|
||||||
type: chromem # chromem, weaviate, faiss
|
|
||||||
persist: true
|
|
||||||
similarity_metric: cosine
|
|
||||||
|
|
||||||
# Scraping configuration
|
|
||||||
scraper:
|
|
||||||
user_agent: "Devour/1.0 (+https://github.com/yourorg/devour)"
|
|
||||||
timeout: 30s
|
|
||||||
retry_count: 3
|
|
||||||
retry_delay: 5s
|
|
||||||
concurrency: 10
|
|
||||||
rate_limit: 500ms
|
|
||||||
max_depth: 3
|
|
||||||
cache_dir: ./devour_data/cache
|
|
||||||
|
|
||||||
# Scheduler configuration
|
|
||||||
scheduler:
|
|
||||||
enabled: true
|
|
||||||
interval: 72h
|
|
||||||
check_method: hash
|
|
||||||
on_startup: false
|
|
||||||
|
|
||||||
# Server configuration
|
|
||||||
server:
|
|
||||||
mode: local # local, remote
|
|
||||||
transport: stdio # stdio, http
|
|
||||||
host: localhost
|
|
||||||
port: 8080
|
|
||||||
cors:
|
|
||||||
enabled: false
|
|
||||||
origins: []
|
|
||||||
|
|
||||||
# Source definitions
|
|
||||||
sources:
|
|
||||||
- name: example-docs
|
|
||||||
type: url
|
|
||||||
url: https://docs.example.com
|
|
||||||
include:
|
|
||||||
- "**/*.md"
|
|
||||||
- "**/*.html"
|
|
||||||
exclude:
|
|
||||||
- "**/api/**"
|
|
||||||
- "**/legacy/**"
|
|
||||||
schedule: 24h # Override global schedule
|
|
||||||
|
|
||||||
- name: api-spec
|
|
||||||
type: openapi
|
|
||||||
url: https://api.example.com/openapi.json
|
|
||||||
schedule: 168h # Weekly
|
|
||||||
|
|
||||||
- name: internal-repo
|
|
||||||
type: github
|
|
||||||
repo: myorg/myrepo
|
|
||||||
branch: main
|
|
||||||
paths:
|
|
||||||
- docs/
|
|
||||||
- README.md
|
|
||||||
auth_token: ${GITHUB_TOKEN}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Environment Variables
|
|
||||||
|
|
||||||
| Variable | Description | Default |
|
|
||||||
|----------|-------------|---------|
|
|
||||||
| `OPENAI_API_KEY` | OpenAI API key | Required |
|
|
||||||
| `DEVOUR_CONFIG` | Config file path | `./devour.yaml` |
|
|
||||||
| `DEVOUR_DATA_DIR` | Data directory | `./devour_data` |
|
|
||||||
| `GITHUB_TOKEN` | GitHub auth token | Optional |
|
|
||||||
| `DEVOUR_LOG_LEVEL` | Log level (debug, info, warn, error) | `info` |
|
|
||||||
| `DEVOUR_PORT` | Server port | `8080` |
|
|
||||||
|
|
||||||
## Code Quality Analysis
|
|
||||||
|
|
||||||
Devour includes comprehensive code quality analysis with three scorecard formats:
|
|
||||||
|
|
||||||
### Scorecard Types
|
|
||||||
|
|
||||||
**Compact Scorecard** - Quick overview with 3 circular metrics:
|
|
||||||
- Overall score (0-100%)
|
|
||||||
- Strict score (conservative metric)
|
|
||||||
- Letter grade (A-F)
|
|
||||||
|
|
||||||
**Detailed Scorecard** - Comprehensive breakdown featuring:
|
|
||||||
- Score breakdown by dimension with progress bars
|
|
||||||
- Findings grouped by type with visual charts
|
|
||||||
- Severity distribution with percentage circles
|
|
||||||
- Project metadata and timestamps
|
|
||||||
|
|
||||||
**Original Scorecard** - Balanced view with:
|
|
||||||
- Left panel: Project info and main scores
|
|
||||||
- Right panel: Dimension metrics in two-column layout
|
|
||||||
|
|
||||||
### Quality Commands
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Basic quality scan (generates original scorecard)
|
devour init
|
||||||
devour quality scan
|
devour get go net/http
|
||||||
|
devour query "http client timeout"
|
||||||
# Generate specific scorecard formats
|
devour ask --lang go "how to parse json"
|
||||||
devour quality scan --format compact --badge-path compact.png
|
devour sync
|
||||||
devour quality scan --format detailed --badge-path detailed.png
|
devour status
|
||||||
|
|
||||||
# Dark theme support
|
|
||||||
devour quality scan --theme dark --badge-path dark_scorecard.png
|
|
||||||
|
|
||||||
# Quality status and trends
|
|
||||||
devour quality status
|
|
||||||
|
|
||||||
# Show next priority issue
|
|
||||||
devour quality next
|
|
||||||
|
|
||||||
# Export findings as JSON
|
|
||||||
devour quality scan --format json > findings.json
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Quality Metrics
|
## Key behavior notes
|
||||||
|
|
||||||
**Dimensions Analyzed:**
|
- `ask` is hybrid local-first with targeted live fallback.
|
||||||
- Complexity - Nested loops, excessive function calls
|
- `query` is local lexical index; no API key required.
|
||||||
- Duplication - Code clones and near-duplicates
|
- `scrape` fails by default when 0 docs are extracted (unless `--allow-empty`).
|
||||||
- Security - Vulnerabilities and anti-patterns
|
- `serve` local mode uses JSON-RPC over stdio.
|
||||||
- Test Coverage - Unit test coverage analysis
|
|
||||||
- Dead Code - Unused functions, variables, imports
|
|
||||||
- Coupling - High coupling between modules
|
|
||||||
- Naming - Inconsistent naming conventions
|
|
||||||
|
|
||||||
**Severity Levels:**
|
## Supported language aliases
|
||||||
- **T1** - Auto-fixable (unused imports, debug logs)
|
|
||||||
- **T2** - Quick manual fixes (unused vars, dead exports)
|
|
||||||
- **T3** - Requires judgment (near-dupes, single-use abstractions)
|
|
||||||
- **T4** - Major refactor needed (god components, mixed concerns)
|
|
||||||
|
|
||||||
**Scoring:**
|
- `go`, `golang`
|
||||||
- **Overall Score** - General code health (0-100%)
|
- `rust`
|
||||||
- **Strict Score** - Conservative scoring ignoring quick wins
|
- `python`, `py`
|
||||||
- **Grade** - Letter grade based on score ranges (A: 90-100%, B: 70-89%, etc.)
|
- `java`
|
||||||
|
- `spring`
|
||||||
|
- `typescript`, `ts`
|
||||||
|
- `react`
|
||||||
|
- `vue`
|
||||||
|
- `nuxt`
|
||||||
|
- `docker`
|
||||||
|
- `cloudflare`, `cf`
|
||||||
|
- `astro`
|
||||||
|
- `csharp`, `cs`
|
||||||
|
- `kotlin`, `kt`
|
||||||
|
- `php`
|
||||||
|
- `ruby`, `rb`
|
||||||
|
- `elixir`, `ex`
|
||||||
|
- `next`, `nextjs`
|
||||||
|
- `svelte`
|
||||||
|
- `angular`, `ng`
|
||||||
|
- `remix`
|
||||||
|
- `solid`, `solidjs`
|
||||||
|
- `express`, `expressjs`
|
||||||
|
|
||||||
### Multi-Language Support
|
## Response expectations
|
||||||
|
|
||||||
- **Go** - Full AST analysis with go/parser
|
When reporting command results:
|
||||||
- **Python** - AST analysis with ast module
|
|
||||||
- **JavaScript/TypeScript** - ESLint integration
|
|
||||||
- **Java** - JavaParser integration
|
|
||||||
- **Rust** - Synth integration (planned)
|
|
||||||
|
|
||||||
### Integration Examples
|
1. Show exact command(s) run.
|
||||||
|
2. Summarize key output.
|
||||||
```bash
|
3. Show output file locations.
|
||||||
# CI/CD Pipeline Integration
|
4. Call out limitations/experimental behavior.
|
||||||
devour quality scan --format json --threshold 70
|
5. Give the next command to continue.
|
||||||
if [ $? -ne 0 ]; then
|
|
||||||
echo "Quality gate failed - score below threshold"
|
|
||||||
exit 1
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Generate all scorecard versions for documentation
|
|
||||||
devour quality scan --format original --badge-path docs/scorecard.png
|
|
||||||
devour quality scan --format compact --badge-path docs/scorecard_compact.png --theme light
|
|
||||||
devour quality scan --format detailed --badge-path docs/scorecard_detailed.png --theme dark
|
|
||||||
|
|
||||||
# Weekly quality tracking
|
|
||||||
devour quality scan --format json > weekly_$(date +%Y%m%d).json
|
|
||||||
```
|
|
||||||
|
|
||||||
## Quality Gates
|
|
||||||
|
|
||||||
Built-in validation rules:
|
|
||||||
|
|
||||||
- ⚠️ **WARNING** if document count < 10 (may be incomplete scrape)
|
|
||||||
- ⚠️ **WARNING** if average chunk size < 100 tokens (over-fragmented)
|
|
||||||
- 🛑 **HARD STOP** if embedding API fails (cannot index without vectors)
|
|
||||||
- 🛑 **HARD STOP** if storage is not writable (cannot persist)
|
|
||||||
|
|
||||||
## Output Formats
|
|
||||||
|
|
||||||
### Query Results (JSON)
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"query": "authentication",
|
|
||||||
"results": [
|
|
||||||
{
|
|
||||||
"id": "chunk-uuid",
|
|
||||||
"document_id": "doc-uuid",
|
|
||||||
"content": "Relevant text excerpt...",
|
|
||||||
"score": 0.89,
|
|
||||||
"source": "https://docs.example.com/auth",
|
|
||||||
"metadata": {
|
|
||||||
"title": "Authentication Guide",
|
|
||||||
"section": "Getting Started"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"total": 15,
|
|
||||||
"took_ms": 45
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Status Output
|
|
||||||
```
|
|
||||||
Devour Status
|
|
||||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
||||||
Index Health: ✅ Healthy
|
|
||||||
Documents: 1,247 indexed
|
|
||||||
Chunks: 8,392 total
|
|
||||||
Vector Dimension: 1536
|
|
||||||
Last Updated: 2025-01-15 10:30:00
|
|
||||||
Storage Used: 124 MB
|
|
||||||
|
|
||||||
Sources (3):
|
|
||||||
✅ example-docs (234 docs, synced 2h ago)
|
|
||||||
✅ api-spec (12 docs, synced 1d ago)
|
|
||||||
⚠️ internal-repo (pending first sync)
|
|
||||||
|
|
||||||
Next Scheduled Sync: 2025-01-18 10:30:00
|
|
||||||
```
|
|
||||||
|
|
||||||
## Integration Patterns
|
|
||||||
|
|
||||||
### With OpenCode
|
|
||||||
|
|
||||||
```
|
|
||||||
# In OpenCode session
|
|
||||||
> /devour init
|
|
||||||
> /devour scrape https://docs.myframework.com
|
|
||||||
> /devour serve
|
|
||||||
|
|
||||||
# In another terminal or session
|
|
||||||
> /devour query "how to handle authentication"
|
|
||||||
# Returns relevant context for AI
|
|
||||||
```
|
|
||||||
|
|
||||||
### With AI Assistant
|
|
||||||
|
|
||||||
```go
// AI assistant queries Devour automatically.
// QueryResponse and formatContextForAI are assumed to be defined elsewhere.
func getRelevantContext(query string) (string, error) {
	// Marshal the request so quotes in the query cannot break the JSON body.
	body, err := json.Marshal(map[string]string{"query": query})
	if err != nil {
		return "", err
	}
	resp, err := http.Post("http://localhost:8080/query",
		"application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var result QueryResponse
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return "", err
	}

	// Inject into prompt
	return formatContextForAI(result.Results), nil
}
```
|
|
||||||
|
|
||||||
### As MCP Tool
|
|
||||||
|
|
||||||
```jsonc
|
|
||||||
// AI calls via MCP
|
|
||||||
{
|
|
||||||
"method": "tools/call",
|
|
||||||
"params": {
|
|
||||||
"name": "devour_query",
|
|
||||||
"arguments": {
|
|
||||||
"query": "API rate limiting",
|
|
||||||
"limit": 5
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
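
MCP messages are JSON-RPC 2.0, so a complete request also carries `jsonrpc` and `id` fields that the abbreviated example omits. The Go sketch below builds such a request; the struct names and `encodeDevourQuery` are hypothetical, and the `devour_query` tool name and its arguments are taken from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryArgs holds the tool arguments shown in the example.
type queryArgs struct {
	Query string `json:"query"`
	Limit int    `json:"limit"`
}

type callParams struct {
	Name      string    `json:"name"`
	Arguments queryArgs `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 envelope around a tools/call request.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// encodeDevourQuery returns the serialized request (hypothetical helper).
func encodeDevourQuery(id int, query string, limit int) (string, error) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: callParams{
			Name:      "devour_query",
			Arguments: queryArgs{Query: query, Limit: limit},
		},
	}
	b, err := json.Marshal(req)
	return string(b), err
}

func main() {
	out, err := encodeDevourQuery(1, "API rate limiting", 5)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```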
|
|
||||||
|
|
||||||
## Sub-Skills
|
|
||||||
|
|
||||||
This skill can delegate to specialized modules:
|
|
||||||
|
|
||||||
1. **devour-scrape** — Scraping operations
|
|
||||||
2. **devour-index** — Indexing and embeddings
|
|
||||||
3. **devour-query** — Search and retrieval
|
|
||||||
4. **devour-sync** — Synchronization tasks
|
|
||||||
5. **devour-serve** — Server management
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
| Error | Cause | Resolution |
|
|
||||||
|-------|-------|------------|
|
|
||||||
| `E001` | OpenAI API error | Check API key, rate limits |
|
|
||||||
| `E002` | Source unreachable | Verify URL, check network |
|
|
||||||
| `E003` | Storage write failure | Check permissions, disk space |
|
|
||||||
| `E004` | Invalid source type | Use supported: url, github, openapi, local |
|
|
||||||
| `E005` | Index corruption | Rebuild index with `devour sync --rebuild` |
|
|
||||||
|
|
||||||
## Performance Tuning
|
|
||||||
|
|
||||||
### Scraping
|
|
||||||
```yaml
|
|
||||||
scraper:
|
|
||||||
concurrency: 20 # Parallel workers
|
|
||||||
rate_limit: 200ms # Between requests
|
|
||||||
timeout: 60s # Per request
|
|
||||||
```
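
How these three settings might interact is sketched below in Go. It assumes `rate_limit` is a global gap between request starts (not per worker) and that `timeout` is passed to each fetch; `scraperConfig`, `fetchAll`, and `fakeFetch` are hypothetical names, not Devour's implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// scraperConfig mirrors the three YAML keys above (hypothetical struct).
type scraperConfig struct {
	Concurrency int
	RateLimit   time.Duration
	Timeout     time.Duration
}

// fakeFetch stands in for a real HTTP fetch that honors the timeout.
func fakeFetch(url string, timeout time.Duration) error { return nil }

// fetchAll runs fetch over urls with at most cfg.Concurrency workers and
// dispatches at most one URL per cfg.RateLimit interval.
func fetchAll(cfg scraperConfig, urls []string, fetch func(string, time.Duration) error) map[string]error {
	if cfg.Concurrency < 1 {
		cfg.Concurrency = 1 // avoid a pool with no workers
	}
	if cfg.RateLimit <= 0 {
		cfg.RateLimit = time.Nanosecond // NewTicker requires a positive interval
	}
	results := make(map[string]error, len(urls))
	var mu sync.Mutex
	var wg sync.WaitGroup
	jobs := make(chan string)

	for i := 0; i < cfg.Concurrency; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range jobs {
				err := fetch(u, cfg.Timeout)
				mu.Lock()
				results[u] = err
				mu.Unlock()
			}
		}()
	}

	ticker := time.NewTicker(cfg.RateLimit)
	defer ticker.Stop()
	for _, u := range urls {
		<-ticker.C // pace dispatches globally
		jobs <- u
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	cfg := scraperConfig{Concurrency: 2, RateLimit: 10 * time.Millisecond, Timeout: time.Second}
	urls := []string{"https://docs.example.com/a", "https://docs.example.com/b"}
	fmt.Println(len(fetchAll(cfg, urls, fakeFetch)), "URLs processed")
}
```

With a global ticker, raising `concurrency` only helps when individual fetches take longer than `rate_limit`; otherwise the pacing is the bottleneck.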
|
|
||||||
|
|
||||||
### Indexing
|
|
||||||
```yaml
|
|
||||||
embeddings:
|
|
||||||
batch_size: 200 # API batch size
|
|
||||||
vector_db:
|
|
||||||
index_type: hnsw # Fast similarity search
|
|
||||||
m: 16 # HNSW connectivity
|
|
||||||
```
|
|
||||||
|
|
||||||
### Querying
|
|
||||||
```yaml
|
|
||||||
query:
|
|
||||||
ef_search: 64 # HNSW search depth
|
|
||||||
limit: 10 # Default result count
|
|
||||||
```
|
|
||||||
|
|
||||||
## Troubleshooting
|
|
||||||
|
|
||||||
### Common Issues
|
|
||||||
|
|
||||||
**Slow queries:**
|
|
||||||
- Lower `ef_search` (higher values improve recall but slow each query)
|
|
||||||
- Use smaller `limit` values
|
|
||||||
- Consider index type (HNSW vs Flat)
|
|
||||||
|
|
||||||
**API rate limits:**
|
|
||||||
- Reduce `batch_size`
|
|
||||||
- Add delays between batches
|
|
||||||
- Use caching
|
|
||||||
|
|
||||||
**Memory usage:**
|
|
||||||
- Reduce `concurrency`
|
|
||||||
- Process in smaller batches
|
|
||||||
- Use disk-backed storage
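
The advice on smaller batches and delays between them can be sketched in Go as follows. `splitBatches` and `embedInBatches` are hypothetical helpers, and the `embed` callback stands in for whatever embedding call the indexer makes.

```go
package main

import (
	"fmt"
	"time"
)

// splitBatches cuts items into consecutive slices of at most size elements.
func splitBatches(items []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var out [][]string
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

// embedInBatches calls embed once per batch, sleeping for pause between
// batches to stay under an API rate limit.
func embedInBatches(items []string, size int, pause time.Duration, embed func([]string) error) error {
	for i, batch := range splitBatches(items, size) {
		if i > 0 {
			time.Sleep(pause)
		}
		if err := embed(batch); err != nil {
			return fmt.Errorf("batch %d: %w", i, err)
		}
	}
	return nil
}

func main() {
	chunks := []string{"a", "b", "c", "d", "e"}
	err := embedInBatches(chunks, 2, 50*time.Millisecond, func(b []string) error {
		fmt.Println("embedding", len(b), "chunks")
		return nil
	})
	if err != nil {
		fmt.Println("failed:", err)
	}
}
```

Processing one batch at a time also bounds memory, since only `size` chunks are held by the embedding call at once.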
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
*Devour: Feed your AI the context it craves.*
|
|
||||||
|
|||||||
@@ -0,0 +1,7 @@
|
|||||||
|
version: 1
|
||||||
|
display_name: Devour Docs Ops
|
||||||
|
short_description: Route and execute Devour docs/query/ask/sync workflows.
|
||||||
|
default_prompt: |
|
||||||
|
Use Devour commands to fetch official docs, maintain a local docs index,
|
||||||
|
answer questions against docs, sync sources, run quality triage, and report
|
||||||
|
exact commands/results with file outputs.
|
||||||
+144
@@ -0,0 +1,144 @@
|
|||||||
|
package cmd
|
||||||
|
|
||||||
|
import (
|
||||||
|
"strings"
|
||||||
|
"testing"
|
||||||
|
|
||||||
|
"github.com/yourorg/devour/internal/scraper"
|
||||||
|
)
|
||||||
|
|
||||||
|
func TestDeriveSearchTerms(t *testing.T) {
|
||||||
|
terms := deriveSearchTerms("go", "how to regex match http path")
|
||||||
|
|
||||||
|
if len(terms) == 0 {
|
||||||
|
t.Fatal("expected at least one derived search term")
|
||||||
|
}
|
||||||
|
|
||||||
|
joined := strings.Join(terms, ",")
|
||||||
|
if !strings.Contains(joined, "regexp") {
|
||||||
|
t.Fatalf("expected regexp term in %v", terms)
|
||||||
|
}
|
||||||
|
if !strings.Contains(joined, "net/http") {
|
||||||
|
t.Fatalf("expected net/http term in %v", terms)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestScoreDocument(t *testing.T) {
|
||||||
|
query := "regex match in go"
|
||||||
|
docTitleMatch := &scraper.Document{
|
||||||
|
Title: "Package regexp",
|
||||||
|
Content: "Use MustCompile and MatchString to match values.",
|
||||||
|
Type: "go-package",
|
||||||
|
URL: "https://pkg.go.dev/regexp",
|
||||||
|
}
|
||||||
|
docNoMatch := &scraper.Document{
|
||||||
|
Title: "Package archive/tar",
|
||||||
|
Content: "Read and write tar archives.",
|
||||||
|
Type: "go-package",
|
||||||
|
URL: "https://pkg.go.dev/archive/tar",
|
||||||
|
}
|
||||||
|
|
||||||
|
if scoreDocument(query, docTitleMatch) <= scoreDocument(query, docNoMatch) {
|
||||||
|
t.Fatal("expected regex-related document to have a higher score")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestExtractRecommendedAPI(t *testing.T) {
|
||||||
|
docs := []rankedDoc{
|
||||||
|
{
|
||||||
|
doc: &scraper.Document{
|
||||||
|
Title: "regexp.func MustCompile ¶",
|
||||||
|
URL: "https://pkg.go.dev/regexp",
|
||||||
|
Content: "re := regexp.MustCompile(`\\\\d+`)\nif re.MatchString(input) { fmt.Println(\"ok\") }",
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
apis := extractRecommendedAPI(docs)
|
||||||
|
if len(apis) == 0 {
|
||||||
|
t.Fatal("expected API extraction to return at least one call")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestExtractSnippet(t *testing.T) {
|
||||||
|
content := "The regexp package implements regular expression search. Use MustCompile for fixed patterns."
|
||||||
|
snippet := extractSnippet(content, []string{"regexp"})
|
||||||
|
if snippet == "" {
|
||||||
|
t.Fatal("expected non-empty snippet")
|
||||||
|
}
|
||||||
|
if !strings.Contains(strings.ToLower(snippet), "regexp") {
|
||||||
|
t.Fatalf("snippet should mention regexp, got: %q", snippet)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestCandidateDocURLs_FrameworkFallbacks(t *testing.T) {
|
||||||
|
next, err := candidateDocURLs("nextjs", "routing")
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("candidateDocURLs(nextjs) error: %v", err)
|
||||||
|
}
|
||||||
|
if len(next) < 2 {
|
||||||
|
t.Fatalf("expected fallback URLs for nextjs, got %v", next)
|
||||||
|
}
|
||||||
|
if next[0] != "https://nextjs.org/docs/app/building-your-application/routing" {
|
||||||
|
t.Fatalf("unexpected primary nextjs URL: %q", next[0])
|
||||||
|
}
|
||||||
|
|
||||||
|
remix, err := candidateDocURLs("remix", "routes")
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("candidateDocURLs(remix) error: %v", err)
|
||||||
|
}
|
||||||
|
if len(remix) == 0 || remix[0] != "https://v2.remix.run/docs/file-conventions/routes" {
|
||||||
|
t.Fatalf("unexpected remix candidate URLs: %v", remix)
|
||||||
|
}
|
||||||
|
|
||||||
|
solid, err := candidateDocURLs("solid", "router")
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("candidateDocURLs(solid) error: %v", err)
|
||||||
|
}
|
||||||
|
if len(solid) == 0 || !strings.Contains(solid[0], "github.com/solidjs/solid-docs") {
|
||||||
|
t.Fatalf("unexpected solid candidate URLs: %v", solid)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestPrimaryQueryTokenSkipsQuestionWords(t *testing.T) {
|
||||||
|
token := primaryQueryToken("what does routing do in remix")
|
||||||
|
if token == "" {
|
||||||
|
t.Fatal("expected non-empty token")
|
||||||
|
}
|
||||||
|
if token == "what" || token == "does" {
|
||||||
|
t.Fatalf("expected informative token, got %q", token)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestDeriveSearchTermsSolidRouting(t *testing.T) {
|
||||||
|
terms := deriveSearchTerms("solid", "how to do routing in solid")
|
||||||
|
joined := strings.Join(terms, ",")
|
||||||
|
if !strings.Contains(joined, "solid-router") {
|
||||||
|
t.Fatalf("expected solid-router term in %v", terms)
|
||||||
|
}
|
||||||
|
if strings.Contains(joined, "signals") {
|
||||||
|
t.Fatalf("did not expect signals default for routing question, got %v", terms)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestShouldFallbackToLive(t *testing.T) {
|
||||||
|
strong := []rankedDoc{
|
||||||
|
{
|
||||||
|
doc: &scraper.Document{Title: "Routing Guide", Content: "routing with file based routes", URL: "https://nextjs.org/docs/routing"},
|
||||||
|
score: 2.2,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
if shouldFallbackToLive(strong, []string{"routing"}) {
|
||||||
|
t.Fatal("expected strong local match to skip live fallback")
|
||||||
|
}
|
||||||
|
|
||||||
|
weak := []rankedDoc{
|
||||||
|
{
|
||||||
|
doc: &scraper.Document{Title: "Misc", Content: "unrelated", URL: "https://example.com"},
|
||||||
|
score: 0.1,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
if !shouldFallbackToLive(weak, []string{"routing"}) {
|
||||||
|
t.Fatal("expected weak local match to trigger live fallback")
|
||||||
|
}
|
||||||
|
}
|
||||||
+181
@@ -0,0 +1,181 @@
|
|||||||
|
package cmd
|
||||||
|
|
||||||
|
import (
|
||||||
|
"encoding/json"
|
||||||
|
"fmt"
|
||||||
|
"net/url"
|
||||||
|
"os"
|
||||||
|
"os/exec"
|
||||||
|
"sort"
|
||||||
|
"strings"
|
||||||
|
"unicode"
|
||||||
|
|
||||||
|
"github.com/spf13/cobra"
|
||||||
|
)
|
||||||
|
|
||||||
|
var (
|
||||||
|
autoDryRun bool
|
||||||
|
autoJSON bool
|
||||||
|
autoLang string
|
||||||
|
)
|
||||||
|
|
||||||
|
var autoCmd = &cobra.Command{
|
||||||
|
Use: "auto <intent>",
|
||||||
|
Short: "Route natural-language intent to the best Devour command",
|
||||||
|
Long: `Auto-classify intent and run the best matching command (get/scrape/ask/quality).
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
devour auto "how to parse json in go"
|
||||||
|
devour auto "https://pkg.go.dev/net/http"
|
||||||
|
devour auto "check code quality" --dry-run
|
||||||
|
devour auto "what is useEffect" --lang react`,
|
||||||
|
Args: cobra.MinimumNArgs(1),
|
||||||
|
RunE: runAuto,
|
||||||
|
}
|
||||||
|
|
||||||
|
func init() {
|
||||||
|
autoCmd.Flags().BoolVar(&autoDryRun, "dry-run", false, "print selected command without executing")
|
||||||
|
autoCmd.Flags().BoolVar(&autoJSON, "json", false, "output route decision as JSON")
|
||||||
|
autoCmd.Flags().StringVar(&autoLang, "lang", "", "optional language override for ask/get routes")
|
||||||
|
}
|
||||||
|
|
||||||
|
type autoDecision struct {
|
||||||
|
Intent string `json:"intent"`
|
||||||
|
Route string `json:"route"`
|
||||||
|
Reason string `json:"reason"`
|
||||||
|
Command []string `json:"command"`
|
||||||
|
}
|
||||||
|
|
||||||
|
func runAuto(cmd *cobra.Command, args []string) error {
|
||||||
|
intent := strings.TrimSpace(strings.Join(args, " "))
|
||||||
|
if intent == "" {
|
||||||
|
return fmt.Errorf("intent is required")
|
||||||
|
}
|
||||||
|
|
||||||
|
decision, err := classifyIntent(intent, strings.TrimSpace(autoLang))
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
if autoJSON {
|
||||||
|
enc := json.NewEncoder(cmd.OutOrStdout())
|
||||||
|
enc.SetIndent("", " ")
|
||||||
|
return enc.Encode(decision)
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Printf("Route: %s\n", decision.Route)
|
||||||
|
fmt.Printf("Reason: %s\n", decision.Reason)
|
||||||
|
fmt.Printf("Command: devour %s\n", strings.Join(decision.Command, " "))
|
||||||
|
|
||||||
|
if autoDryRun {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
exe, err := os.Executable()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
run := exec.Command(exe, decision.Command...)
|
||||||
|
run.Stdout = cmd.OutOrStdout()
|
||||||
|
run.Stderr = cmd.ErrOrStderr()
|
||||||
|
return run.Run()
|
||||||
|
}
|
||||||
|
|
||||||
|
func classifyIntent(intent, langOverride string) (*autoDecision, error) {
|
||||||
|
lower := strings.ToLower(intent)
|
||||||
|
trimmed := strings.TrimSpace(intent)
|
||||||
|
|
||||||
|
if u, err := url.Parse(trimmed); err == nil && (u.Scheme == "http" || u.Scheme == "https") {
|
||||||
|
route := []string{"scrape", trimmed}
|
||||||
|
return &autoDecision{Intent: intent, Route: "scrape", Reason: "detected URL input", Command: route}, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
if strings.Contains(lower, "quality") || strings.Contains(lower, "technical debt") || strings.Contains(lower, "lint") || strings.Contains(lower, "code smell") {
|
||||||
|
route := []string{"quality", "status"}
|
||||||
|
if strings.Contains(lower, "scan") {
|
||||||
|
route = []string{"quality", "scan", "."}
|
||||||
|
}
|
||||||
|
return &autoDecision{Intent: intent, Route: "quality", Reason: "detected quality-analysis intent", Command: route}, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
language := strings.TrimSpace(langOverride)
|
||||||
|
if language == "" {
|
||||||
|
language = inferLanguageFromText(lower)
|
||||||
|
}
|
||||||
|
if language != "" {
|
||||||
|
if canonical, ok := normalizeLanguage(language); ok {
|
||||||
|
language = canonical
|
||||||
|
} else {
|
||||||
|
language = ""
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if strings.Contains(lower, "?") || strings.Contains(lower, "how") || strings.Contains(lower, "why") || strings.Contains(lower, "what") {
|
||||||
|
if language == "" {
|
||||||
|
language = "go"
|
||||||
|
}
|
||||||
|
route := []string{"ask", "--lang", language, intent, "--format", "text"}
|
||||||
|
return &autoDecision{Intent: intent, Route: "ask", Reason: "question-style intent", Command: route}, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
if language == "" {
|
||||||
|
language = "go"
|
||||||
|
}
|
||||||
|
keyword := inferKeyword(intent)
|
||||||
|
if canonical, ok := normalizeLanguage(keyword); ok && canonical == language {
|
||||||
|
keyword = "overview"
|
||||||
|
}
|
||||||
|
route := []string{"get", language, keyword}
|
||||||
|
return &autoDecision{Intent: intent, Route: "get", Reason: "default docs retrieval route", Command: route}, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func inferLanguageFromText(text string) string {
|
||||||
|
text = strings.ToLower(text)
|
||||||
|
if strings.Contains(text, "c#") {
|
||||||
|
return "csharp"
|
||||||
|
}
|
||||||
|
if strings.Contains(text, "next.js") {
|
||||||
|
return "nextjs"
|
||||||
|
}
|
||||||
|
|
||||||
|
tokens := strings.FieldsFunc(text, func(r rune) bool {
|
||||||
|
return !(unicode.IsLetter(r) || unicode.IsDigit(r))
|
||||||
|
})
|
||||||
|
tokenSet := make(map[string]bool, len(tokens))
|
||||||
|
for _, tok := range tokens {
|
||||||
|
if tok != "" {
|
||||||
|
tokenSet[tok] = true
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
aliases := make([]string, 0, len(languageAliases()))
|
||||||
|
for alias := range languageAliases() {
|
||||||
|
aliases = append(aliases, alias)
|
||||||
|
}
|
||||||
|
sort.Slice(aliases, func(i, j int) bool {
|
||||||
|
return len(aliases[i]) > len(aliases[j])
|
||||||
|
})
|
||||||
|
|
||||||
|
for _, alias := range aliases {
|
||||||
|
if tokenSet[alias] {
|
||||||
|
return alias
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return ""
|
||||||
|
}
|
||||||
|
|
||||||
|
func inferKeyword(intent string) string {
|
||||||
|
words := strings.Fields(strings.ToLower(intent))
|
||||||
|
stop := map[string]bool{
|
||||||
|
"get": true, "docs": true, "documentation": true, "about": true, "for": true, "on": true,
|
||||||
|
"the": true, "a": true, "an": true, "show": true, "me": true, "please": true,
|
||||||
|
}
|
||||||
|
for _, w := range words {
|
||||||
|
w = strings.Trim(w, ",.!?;:")
|
||||||
|
if w == "" || stop[w] || len(w) < 2 {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
return w
|
||||||
|
}
|
||||||
|
return "overview"
|
||||||
|
}
|
||||||
@@ -0,0 +1,31 @@
|
|||||||
|
package cmd
|
||||||
|
|
||||||
|
import "testing"
|
||||||
|
|
||||||
|
func TestInferLanguageFromText_UsesTokenBoundaries(t *testing.T) {
|
||||||
|
if got := inferLanguageFromText("get nextjs docs"); got != "nextjs" {
|
||||||
|
t.Fatalf("inferLanguageFromText matched %q, want %q", got, "nextjs")
|
||||||
|
}
|
||||||
|
if got := inferLanguageFromText("read docs for architecture"); got != "" {
|
||||||
|
t.Fatalf("inferLanguageFromText should not infer language from plain docs text, got %q", got)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestClassifyIntent_GetRouteKeywordFallback(t *testing.T) {
|
||||||
|
decision, err := classifyIntent("get nextjs docs", "")
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("classifyIntent returned error: %v", err)
|
||||||
|
}
|
||||||
|
if decision.Route != "get" {
|
||||||
|
t.Fatalf("expected get route, got %q", decision.Route)
|
||||||
|
}
|
||||||
|
if len(decision.Command) != 3 {
|
||||||
|
t.Fatalf("expected 3 command args, got %v", decision.Command)
|
||||||
|
}
|
||||||
|
if decision.Command[1] != "nextjs" {
|
||||||
|
t.Fatalf("expected language nextjs, got %q", decision.Command[1])
|
||||||
|
}
|
||||||
|
if decision.Command[2] != "overview" {
|
||||||
|
t.Fatalf("expected keyword overview, got %q", decision.Command[2])
|
||||||
|
}
|
||||||
|
}
|
||||||
+124
-36
@@ -14,6 +14,7 @@ import argparse
|
|||||||
class ModernBannerGenerator:
|
class ModernBannerGenerator:
|
||||||
def __init__(self, data):
|
def __init__(self, data):
|
||||||
self.data = data
|
self.data = data
|
||||||
|
self.fonts = self._init_fonts()
|
||||||
|
|
||||||
# Devour brand colors - consistent with Go theme
|
# Devour brand colors - consistent with Go theme
|
||||||
self.colors = {
|
self.colors = {
|
||||||
@@ -57,6 +58,49 @@ class ModernBannerGenerator:
|
|||||||
'severity_t4': (248, 113, 113), # #f87171 - bright red
|
'severity_t4': (248, 113, 113), # #f87171 - bright red
|
||||||
}
|
}
|
||||||
|
|
||||||
|
def _init_fonts(self):
|
||||||
|
"""Initialize font candidates and cache."""
|
||||||
|
# Prefer widely-available fonts on Linux/macOS/Windows.
|
||||||
|
font_candidates = {
|
||||||
|
"regular": [
|
||||||
|
"arial.ttf",
|
||||||
|
"/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",
|
||||||
|
"/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf",
|
||||||
|
"/System/Library/Fonts/Supplemental/Arial.ttf",
|
||||||
|
"/Library/Fonts/Arial.ttf",
|
||||||
|
],
|
||||||
|
"bold": [
|
||||||
|
"arialbd.ttf",
|
||||||
|
"/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf",
|
||||||
|
"/usr/share/fonts/truetype/liberation/LiberationSans-Bold.ttf",
|
||||||
|
"/System/Library/Fonts/Supplemental/Arial Bold.ttf",
|
||||||
|
"/Library/Fonts/Arial Bold.ttf",
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
return {
|
||||||
|
"candidates": font_candidates,
|
||||||
|
"cache": {},
|
||||||
|
}
|
||||||
|
|
||||||
|
def get_font(self, size, weight="regular"):
|
||||||
|
"""Get a cached font or fall back to the default."""
|
||||||
|
key = (size, weight)
|
||||||
|
if key in self.fonts["cache"]:
|
||||||
|
return self.fonts["cache"][key]
|
||||||
|
|
||||||
|
for path in self.fonts["candidates"].get(weight, []):
|
||||||
|
try:
|
||||||
|
font = ImageFont.truetype(path, size)
|
||||||
|
self.fonts["cache"][key] = font
|
||||||
|
return font
|
||||||
|
except OSError:
|
||||||
|
continue
|
||||||
|
|
||||||
|
font = ImageFont.load_default()
|
||||||
|
self.fonts["cache"][key] = font
|
||||||
|
return font
|
||||||
|
|
||||||
def get_score_color(self, score, muted=False):
|
def get_score_color(self, score, muted=False):
|
||||||
if score >= 90:
|
if score >= 90:
|
||||||
return self.colors['score_a_muted'] if muted else self.colors['score_a']
|
return self.colors['score_a_muted'] if muted else self.colors['score_a']
|
||||||
@@ -90,6 +134,22 @@ class ModernBannerGenerator:
|
|||||||
for x in range(width):
|
for x in range(width):
|
||||||
img.putpixel((x, y), (r, g, b))
|
img.putpixel((x, y), (r, g, b))
|
||||||
|
|
||||||
|
# Add subtle radial glows for depth
|
||||||
|
self.draw_glow(img, width * 0.15, height * 0.2, 220, (71, 85, 105), 40)
|
||||||
|
self.draw_glow(img, width * 0.85, height * 0.75, 260, (251, 146, 60), 35)
|
||||||
|
|
||||||
|
def draw_glow(self, img, cx, cy, radius, color, max_alpha):
|
||||||
|
"""Draw a soft radial glow."""
|
||||||
|
draw = ImageDraw.Draw(img, "RGBA")  # RGBA mode so the fill alpha blends into the image
|
||||||
|
steps = 12
|
||||||
|
for i in range(steps):
|
||||||
|
r = radius - (radius * i / steps)
|
||||||
|
alpha = int(max_alpha * (1 - i / steps))
|
||||||
|
draw.ellipse(
|
||||||
|
[(cx - r, cy - r), (cx + r, cy + r)],
|
||||||
|
fill=(*color, alpha),
|
||||||
|
)
|
||||||
|
|
||||||
def draw_glass_card(self, draw, x, y, width, height, border_radius=12, use_alt=False):
|
def draw_glass_card(self, draw, x, y, width, height, border_radius=12, use_alt=False):
|
||||||
"""Draw glass morphism card with enhanced effects"""
|
"""Draw glass morphism card with enhanced effects"""
|
||||||
card_color = self.colors['card_alt'] if use_alt else self.colors['card']
|
card_color = self.colors['card_alt'] if use_alt else self.colors['card']
|
||||||
@@ -142,19 +202,20 @@ class ModernBannerGenerator:
|
|||||||
# Draw progress arc
|
# Draw progress arc
|
||||||
start_angle = -90
|
start_angle = -90
|
||||||
end_angle = start_angle + (360 * percentage)
|
end_angle = start_angle + (360 * percentage)
|
||||||
arc_width = 8 if is_primary else 6
|
arc_width = 9 if is_primary else 6
|
||||||
|
|
||||||
draw.arc([(cx-radius+4, cy-radius+4), (cx+radius-4, cy+radius-4)],
|
draw.arc([(cx-radius+4, cy-radius+4), (cx+radius-4, cy+radius-4)],
|
||||||
start_angle, end_angle,
|
start_angle, end_angle,
|
||||||
fill=score_color, width=arc_width)
|
fill=score_color, width=arc_width)
|
||||||
|
|
||||||
|
# Inner glow ring
|
||||||
|
if is_primary:
|
||||||
|
draw.arc([(cx-radius+10, cy-radius+10), (cx+radius-10, cy+radius-10)],
|
||||||
|
start_angle, end_angle, fill=score_color, width=2)
|
||||||
|
|
||||||
# Enhanced typography
|
# Enhanced typography
|
||||||
try:
|
font_large = self.get_font(34 if is_primary else 28, weight="bold")
|
||||||
font_large = ImageFont.truetype("arial.ttf", 32 if is_primary else 28)
|
font_small = self.get_font(11, weight="regular")
|
||||||
font_small = ImageFont.truetype("arial.ttf", 11)
|
|
||||||
except:
|
|
||||||
font_large = ImageFont.load_default()
|
|
||||||
font_small = ImageFont.load_default()
|
|
||||||
|
|
||||||
# Score text
|
# Score text
|
||||||
score_text = f"{int(score)}%"
|
score_text = f"{int(score)}%"
|
||||||
@@ -189,10 +250,7 @@ class ModernBannerGenerator:
|
|||||||
6, fill=grade_color, outline=self.colors['border'])
|
6, fill=grade_color, outline=self.colors['border'])
|
||||||
|
|
||||||
# Grade text with better typography
|
# Grade text with better typography
|
||||||
try:
|
font = self.get_font(18, weight="bold")
|
||||||
font = ImageFont.truetype("arial.ttf", 18)
|
|
||||||
except:
|
|
||||||
font = ImageFont.load_default()
|
|
||||||
|
|
||||||
bbox = draw.textbbox((0, 0), grade, font=font)
|
bbox = draw.textbbox((0, 0), grade, font=font)
|
||||||
text_width = bbox[2] - bbox[0]
|
text_width = bbox[2] - bbox[0]
|
||||||
@@ -201,15 +259,14 @@ class ModernBannerGenerator:
|
|||||||
draw.text((x + badge_width//2 - text_width//2, y + badge_height//2 - text_height//2 + 1),
|
draw.text((x + badge_width//2 - text_width//2, y + badge_height//2 - text_height//2 + 1),
|
||||||
grade, fill=(255, 255, 255), font=font)
|
grade, fill=(255, 255, 255), font=font)
|
||||||
|
|
||||||
def draw_text(self, draw, text, x, y, size=14, color=None, centered=False):
|
def draw_text(self, draw, text, x, y, size=14, color=None, centered=False, max_width=None, min_size=9, weight="regular"):
|
||||||
"""Draw enhanced text with better typography"""
|
"""Draw enhanced text with better typography"""
|
||||||
if color is None:
|
if color is None:
|
||||||
color = self.colors['text']
|
color = self.colors['text']
|
||||||
|
|
||||||
try:
|
font = self.get_font(size, weight=weight)
|
||||||
font = ImageFont.truetype("arial.ttf", size)
|
if max_width is not None:
|
||||||
except:
|
font = self.fit_font(draw, text, font, max_width, min_size=min_size, weight=weight)
|
||||||
font = ImageFont.load_default()
|
|
||||||
|
|
||||||
if centered:
|
if centered:
|
||||||
bbox = draw.textbbox((0, 0), text, font=font)
|
bbox = draw.textbbox((0, 0), text, font=font)
|
||||||
@@ -218,15 +275,42 @@ class ModernBannerGenerator:
|
|||||||
|
|
||||||
draw.text((x, y), text, fill=color, font=font)
|
draw.text((x, y), text, fill=color, font=font)
|
||||||
|
|
||||||
|
def fit_font(self, draw, text, font, max_width, min_size=9, weight="regular"):
|
||||||
|
"""Shrink font until text fits max width."""
|
||||||
|
if font == ImageFont.load_default():
|
||||||
|
return font
|
||||||
|
size = font.size if hasattr(font, "size") else min_size
|
||||||
|
current = font
|
||||||
|
while size > min_size:
|
||||||
|
bbox = draw.textbbox((0, 0), text, font=current)
|
||||||
|
if (bbox[2] - bbox[0]) <= max_width:
|
||||||
|
return current
|
||||||
|
size -= 1
|
||||||
|
current = self.get_font(size, weight=weight)
|
||||||
|
return current
|
||||||
|
|
||||||
|
def truncate_text(self, draw, text, font, max_width):
|
||||||
|
"""Truncate text with ellipsis to fit width."""
|
||||||
|
if max_width <= 0:
|
||||||
|
return ""
|
||||||
|
if draw.textbbox((0, 0), text, font=font)[2] <= max_width:
|
||||||
|
return text
|
||||||
|
ellipsis = "..."
|
||||||
|
for i in range(len(text), 0, -1):
|
||||||
|
candidate = text[:i] + ellipsis
|
||||||
|
if draw.textbbox((0, 0), candidate, font=font)[2] <= max_width:
|
||||||
|
return candidate
|
||||||
|
return ellipsis
|
||||||
|
|
||||||
def draw_metric_card(self, draw, x, y, width, height, title, value, color):
|
def draw_metric_card(self, draw, x, y, width, height, title, value, color):
|
||||||
"""Draw metric card"""
|
"""Draw metric card"""
|
||||||
self.draw_glass_card(draw, x, y, width, height)
|
self.draw_glass_card(draw, x, y, width, height)
|
||||||
|
|
||||||
# Title
|
# Title
|
||||||
self.draw_text(draw, title, x + 15, y + 15, size=12, color=self.colors['text_muted'])
|
self.draw_text(draw, title, x + 15, y + 14, size=12, color=self.colors['text_muted'])
|
||||||
|
|
||||||
# Value
|
# Value
|
||||||
self.draw_text(draw, value, x + 15, y + 40, size=20, color=color)
|
self.draw_text(draw, value, x + 15, y + 38, size=20, color=color, weight="bold")
|
||||||
|
|
||||||
def draw_severity_bars(self, draw, x, y, width, height, find_by_tier):
|
def draw_severity_bars(self, draw, x, y, width, height, find_by_tier):
|
||||||
"""Draw enhanced severity bars"""
|
"""Draw enhanced severity bars"""
|
||||||
@@ -313,14 +397,14 @@ class ModernBannerGenerator:
|
|||||||
# Enhanced header section
|
# Enhanced header section
|
||||||
header_y = content_y + 20
|
header_y = content_y + 20
|
||||||
self.draw_text(draw, "DEVOUR SCORE", content_x + content_width//2, header_y,
|
self.draw_text(draw, "DEVOUR SCORE", content_x + content_width//2, header_y,
|
||||||
size=20, color=self.colors['text'], centered=True)
|
size=20, color=self.colors['text'], centered=True, weight="bold")
|
||||||
|
|
||||||
# Project info
|
# Project info
|
||||||
project_name = self.data['project_name']
|
project_name = self.data['project_name']
|
||||||
version_text = f"v{self.data['version']}" if self.data['version'] else "latest"
|
version_text = f"v{self.data['version']}" if self.data['version'] else "latest"
|
||||||
project_text = f"{project_name} {version_text}"
|
project_text = f"{project_name} {version_text}"
|
||||||
self.draw_text(draw, project_text, content_x + content_width//2, header_y + 25,
|
self.draw_text(draw, project_text, content_x + content_width//2, header_y + 25,
|
||||||
size=14, color=self.colors['text_muted'], centered=True)
|
size=14, color=self.colors['text_muted'], centered=True, max_width=content_width - 120)
|
||||||
|
|
||||||
# Timestamp
|
# Timestamp
|
||||||
time_text = self.data.get('timestamp', 'Today')
|
time_text = self.data.get('timestamp', 'Today')
|
||||||
@@ -347,19 +431,19 @@ class ModernBannerGenerator:
|
|||||||
|
|
||||||
# Total findings
|
# Total findings
|
||||||
self.draw_text(draw, str(findings_total), col_x + col_width//2, metrics_y,
|
self.draw_text(draw, str(findings_total), col_x + col_width//2, metrics_y,
|
||||||
size=18, color=self.colors['text'], centered=True)
|
size=18, color=self.colors['text'], centered=True, weight="bold")
|
||||||
self.draw_text(draw, "TOTAL", col_x + col_width//2, metrics_y + 22,
|
self.draw_text(draw, "TOTAL", col_x + col_width//2, metrics_y + 22,
|
||||||
size=10, color=self.colors['text_muted'], centered=True)
|
size=10, color=self.colors['text_muted'], centered=True)
|
||||||
|
|
||||||
# Open findings
|
# Open findings
|
||||||
self.draw_text(draw, str(findings_open), col_x + col_width + col_width//2, metrics_y,
|
self.draw_text(draw, str(findings_open), col_x + col_width + col_width//2, metrics_y,
|
||||||
size=18, color=self.colors['orange'], centered=True)
|
size=18, color=self.colors['orange'], centered=True, weight="bold")
|
||||||
self.draw_text(draw, "OPEN", col_x + col_width + col_width//2, metrics_y + 22,
|
self.draw_text(draw, "OPEN", col_x + col_width + col_width//2, metrics_y + 22,
|
||||||
size=10, color=self.colors['text_muted'], centered=True)
|
size=10, color=self.colors['text_muted'], centered=True)
|
||||||
|
|
||||||
# Resolved findings
|
# Resolved findings
|
||||||
self.draw_text(draw, str(findings_closed), col_x + 2*col_width + col_width//2, metrics_y,
|
self.draw_text(draw, str(findings_closed), col_x + 2*col_width + col_width//2, metrics_y,
|
||||||
size=18, color=self.colors['score_a'], centered=True)
|
size=18, color=self.colors['score_a'], centered=True, weight="bold")
|
||||||
self.draw_text(draw, "RESOLVED", col_x + 2*col_width + col_width//2, metrics_y + 22,
|
self.draw_text(draw, "RESOLVED", col_x + 2*col_width + col_width//2, metrics_y + 22,
|
||||||
size=10, color=self.colors['text_muted'], centered=True)
|
size=10, color=self.colors['text_muted'], centered=True)
|
||||||
|
|
||||||
@@ -379,7 +463,7 @@ class ModernBannerGenerator:
|
|||||||
# Header section
|
# Header section
|
||||||
header_y = 30
|
header_y = 30
|
||||||
self.draw_text(draw, f"{self.data['project_name']} Quality Report",
|
self.draw_text(draw, f"{self.data['project_name']} Quality Report",
|
||||||
width//2, header_y, size=28, color=self.colors['text'], centered=True)
|
width//2, header_y, size=28, color=self.colors['text'], centered=True, weight="bold", max_width=width - 80)
|
||||||
|
|
||||||
version_text = f"v{self.data['version']}" if self.data['version'] else "latest"
|
version_text = f"v{self.data['version']}" if self.data['version'] else "latest"
|
||||||
self.draw_text(draw, version_text, width//2, header_y + 35,
|
self.draw_text(draw, version_text, width//2, header_y + 35,
|
||||||
@@ -400,7 +484,7 @@ class ModernBannerGenerator:
|
|||||||
score_details_y = score_y + 100
|
score_details_y = score_y + 100
|
||||||
self.draw_text(draw, f"Overall: {int(self.data['overall_score'])}%",
|
self.draw_text(draw, f"Overall: {int(self.data['overall_score'])}%",
|
||||||
score_x, score_details_y, size=20,
|
score_x, score_details_y, size=20,
|
||||||
color=self.get_score_color(self.data['overall_score']), centered=True)
|
color=self.get_score_color(self.data['overall_score']), centered=True, weight="bold")
|
||||||
self.draw_text(draw, f"Strict: {int(self.data['strict_score'])}%",
|
self.draw_text(draw, f"Strict: {int(self.data['strict_score'])}%",
|
||||||
score_x, score_details_y + 25, size=16,
|
score_x, score_details_y + 25, size=16,
|
||||||
color=self.get_score_color(self.data['strict_score'], muted=True), centered=True)
|
color=self.get_score_color(self.data['strict_score'], muted=True), centered=True)
|
||||||
@@ -419,7 +503,7 @@ class ModernBannerGenerator:
 
         # Column 1 Header
         self.draw_text(draw, "Score Breakdown", col1_x + col_width//2, grid_start_y + 20,
-            size=18, color=self.colors['text'], centered=True)
+            size=18, color=self.colors['text'], centered=True, weight="bold")
 
         # Column 1 Data
         score_data = [
@@ -439,7 +523,7 @@ class ModernBannerGenerator:
 
             # Value
             self.draw_text(draw, value, col1_x + col_width//2, data_y + 35,
-                size=24, color=color, centered=True)
+                size=24, color=color, centered=True, weight="bold")
 
             data_y += 80
 
@@ -449,15 +533,19 @@ class ModernBannerGenerator:
 
         # Column 2 Header
         self.draw_text(draw, "Findings by Type", col2_x + col_width//2, grid_start_y + 20,
-            size=18, color=self.colors['text'], centered=True)
+            size=18, color=self.colors['text'], centered=True, weight="bold")
 
         # Column 2 Data - Top finding types
         type_data_y = grid_start_y + 60
         type_items = list(self.data['find_by_type'].items())[:6]  # Top 6 types
+        max_type_count = max(self.data['find_by_type'].values()) if self.data['find_by_type'] else 1
 
+        if not type_items:
+            self.draw_text(draw, "No findings", col2_x + col_width//2, grid_start_y + 110,
+                size=14, color=self.colors['text_dim'], centered=True)
         for issue_type, count in type_items:
             # Type bar
-            bar_width = int((col_width - 40) * (count / max(self.data['find_by_type'].values())))
+            bar_width = int((col_width - 40) * (count / max_type_count))
             bar_height = 22
 
             # Bar background
@@ -469,9 +557,9 @@ class ModernBannerGenerator:
                 4, fill=self.colors['orange'])
 
             # Type label
-            label_text = f"{issue_type}"
-            if len(label_text) > 20:
-                label_text = label_text[:17] + "..."
+            label_text = f"{issue_type}".replace("_", " ")
+            font_label = self.get_font(11, weight="regular")
+            label_text = self.truncate_text(draw, label_text, font_label, col_width - 90)
             self.draw_text(draw, label_text, col2_x + 25, type_data_y + 2,
                 size=11, color=self.colors['text_muted'])
 
@@ -487,7 +575,7 @@ class ModernBannerGenerator:
 
         # Column 3 Header
         self.draw_text(draw, "Issues by Severity", col3_x + col_width//2, grid_start_y + 20,
-            size=18, color=self.colors['text'], centered=True)
+            size=18, color=self.colors['text'], centered=True, weight="bold")
 
         # Column 3 Data - Severity breakdown
         severity_data_y = grid_start_y + 60
@@ -510,11 +598,11 @@ class ModernBannerGenerator:
 
             # Severity name
             self.draw_text(draw, severity_name, col3_x + 50, severity_data_y + 15,
-                size=14, color=self.colors['text'])
+                size=14, color=self.colors['text'], max_width=col_width - 70)
 
             # Count
             self.draw_text(draw, f"{count} issues", col3_x + 50, severity_data_y + 35,
-                size=16, color=color)
+                size=16, color=color, weight="bold")
 
             severity_data_y += 70
 
@@ -539,7 +627,7 @@ class ModernBannerGenerator:
 
             # Value
             self.draw_text(draw, value, metric_x + metrics_width//2, summary_y + 10,
-                size=18, color=color, centered=True)
+                size=18, color=color, centered=True, weight="bold")
 
             # Label
             self.draw_text(draw, label, metric_x + metrics_width//2, summary_y + 30,
@@ -24,7 +24,6 @@ This command will:
 }
 
 func init() {
-	rootCmd.AddCommand(demoCmd)
 }
 
 func runDemo(cmd *cobra.Command, args []string) error {
Binary file not shown.
@@ -0,0 +1,5 @@
+{
+  "version": "1",
+  "built_at": "2026-02-23T11:19:21.65415175+01:00",
+  "docs": []
+}
@@ -0,0 +1,7 @@
+{
+  "version": "1",
+  "built_at": "2026-02-23T11:19:21.65415175+01:00",
+  "docs_dir": "./devour_data/docs",
+  "source_file_hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
+  "doc_count": 0
+}
@@ -1,3 +1,5 @@
+//go:build ignore
+
 package main
 
 import (
@@ -5,7 +7,6 @@ import (
 	"time"
 
 	"github.com/yourorg/devour/internal/quality"
-	"github.com/yourorg/devour/internal/quality/scorecard"
 )
 
 func main() {
+226
-58
@@ -2,6 +2,7 @@ package cmd
 
 import (
 	"fmt"
+	"sort"
 	"strings"
 
 	"github.com/spf13/cobra"
@@ -11,112 +12,210 @@ var getCmd = &cobra.Command{
 	Use:   "get <language> <keyword>",
 	Short: "Get documentation for a language/framework",
 	Long: `Quickly fetch documentation for popular languages and frameworks.
-This command automatically maps language+keyword combinations to their official documentation sites.
+This command maps language+keyword combinations to official documentation sources.
 
-Supported languages:
-  go, golang - Go documentation (pkg.go.dev)
-  rust - Rust documentation (docs.rs)
-  python, py - Python documentation (docs.python.org)
-  java - Java documentation (docs.oracle.com)
-  spring - Spring Boot documentation (docs.spring.io)
-  typescript, ts - TypeScript documentation (typescriptlang.org)
-  react - React documentation (react.dev)
-  vue - Vue.js documentation (vuejs.org)
-  nuxt - Nuxt documentation (nuxt.com)
-  docker - Docker documentation (docs.docker.com)
-  cloudflare, cf - Cloudflare documentation (developers.cloudflare.com)
-  astro - Astro documentation (docs.astro.build)
-
 Examples:
-  devour get go http # Go HTTP package documentation
-  devour get python asyncio # Python asyncio module
-  devour get react hooks # React Hooks documentation
-  devour get docker compose # Docker Compose docs
-  devour get rust tokio # Rust Tokio crate`,
+  devour get go http
+  devour get python asyncio
+  devour get react hooks
+  devour get nextjs routing
+  devour get express middleware`,
 	Args: cobra.ExactArgs(2),
 	RunE: runGet,
 }
 
 func init() {
-	// Add flags that can override defaults
 	getCmd.Flags().StringVarP(&scrapeFormat, "format", "f", "json", "output format (json, markdown)")
-	getCmd.Flags().StringVarP(&scrapeOutput, "output", "o", "", "output directory (default: devour_data/docs)")
+	getCmd.Flags().StringVarP(&scrapeOutput, "output", "o", "", "output directory (default: configured docs dir)")
 	getCmd.Flags().IntVar(&scrapeConcurrency, "concurrency", 10, "parallel scraping workers")
 }
 
 func runGet(cmd *cobra.Command, args []string) error {
-	language := strings.ToLower(args[0])
-	keyword := strings.ToLower(args[1])
+	langIn := strings.ToLower(strings.TrimSpace(args[0]))
+	keyword := strings.TrimSpace(args[1])
+	if keyword == "" {
+		return fmt.Errorf("keyword is required")
+	}
+
+	language, ok := normalizeLanguage(langIn)
+	if !ok {
+		return fmt.Errorf("unsupported language: %s. Supported: %s", langIn, strings.Join(supportedLanguages(), ", "))
+	}
 
-	// Map language to base URL and construct full URL
 	url, err := constructDocURL(language, keyword)
 	if err != nil {
 		return err
 	}
 
-	// Set the scrape type based on language
 	sourceType := mapLanguageToType(language)
-
-	// Reuse the existing scrape logic with pre-determined values
-	scrapeType = string(sourceType)
-	sourceURL := url
+	scrapeType = sourceType
 
 	fmt.Printf("Getting docs for: %s %s\n", language, keyword)
-	fmt.Printf("URL: %s\n", sourceURL)
-	fmt.Printf("Type: %s\n", sourceType)
-	fmt.Println()
+	fmt.Printf("URL: %s\n", url)
+	fmt.Printf("Type: %s\n\n", sourceType)
 
-	// Call the existing scrape logic
-	return runScrape(cmd, []string{sourceURL})
+	return runScrape(cmd, []string{url})
 }
 
 func constructDocURL(language, keyword string) (string, error) {
+	language = strings.ToLower(strings.TrimSpace(language))
+	keyword = strings.TrimSpace(keyword)
+	lowerKeyword := strings.ToLower(keyword)
+
 	switch language {
-	case "go", "golang":
-		return fmt.Sprintf("https://pkg.go.dev/%s", keyword), nil
+	case "go":
+		return fmt.Sprintf("https://pkg.go.dev/%s", lowerKeyword), nil
 	case "rust":
-		return fmt.Sprintf("https://docs.rs/%s/latest/%s/", keyword, keyword), nil
-	case "python", "py":
-		if keyword == "stdlib" || keyword == "standard" {
+		return fmt.Sprintf("https://docs.rs/%s/latest/%s/", lowerKeyword, lowerKeyword), nil
+	case "python":
+		if lowerKeyword == "stdlib" || lowerKeyword == "standard" {
 			return "https://docs.python.org/3/library/", nil
 		}
-		return fmt.Sprintf("https://docs.python.org/3/library/%s.html", keyword), nil
+		return fmt.Sprintf("https://docs.python.org/3/library/%s.html", lowerKeyword), nil
 	case "java":
-		return fmt.Sprintf("https://docs.oracle.com/javase/8/docs/api/%s.html", keyword), nil
+		return fmt.Sprintf("https://docs.oracle.com/javase/8/docs/api/%s.html", lowerKeyword), nil
 	case "spring":
-		return fmt.Sprintf("https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#%s", keyword), nil
-	case "typescript", "ts":
-		return fmt.Sprintf("https://www.typescriptlang.org/docs/handbook/%s.html", keyword), nil
+		if lowerKeyword == "mcp" || lowerKeyword == "mcp-overview" {
+			return "https://docs.spring.io/spring-ai/reference/api/mcp/mcp-overview.html", nil
+		}
+		return fmt.Sprintf("https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#%s", lowerKeyword), nil
+	case "typescript":
+		return fmt.Sprintf("https://www.typescriptlang.org/docs/handbook/%s.html", lowerKeyword), nil
 	case "react":
-		return fmt.Sprintf("https://react.dev/reference/react/%s", keyword), nil
+		if lowerKeyword == "hooks" {
+			return "https://react.dev/reference/react", nil
+		}
+		return fmt.Sprintf("https://react.dev/reference/react/%s", lowerKeyword), nil
 	case "vue":
-		return fmt.Sprintf("https://vuejs.org/guide/%s.html", keyword), nil
+		if strings.Contains(lowerKeyword, "api") {
+			return "https://vuejs.org/api/", nil
+		}
+		return fmt.Sprintf("https://vuejs.org/guide/%s.html", lowerKeyword), nil
 	case "nuxt":
-		return fmt.Sprintf("https://nuxt.com/docs/guide/%s", keyword), nil
+		return fmt.Sprintf("https://nuxt.com/docs/guide/%s", lowerKeyword), nil
 	case "docker":
-		return fmt.Sprintf("https://docs.docker.com/%s", keyword), nil
-	case "cloudflare", "cf":
-		return fmt.Sprintf("https://developers.cloudflare.com/%s", keyword), nil
+		return fmt.Sprintf("https://docs.docker.com/%s", lowerKeyword), nil
+	case "cloudflare":
+		return fmt.Sprintf("https://developers.cloudflare.com/%s", lowerKeyword), nil
 	case "astro":
-		return fmt.Sprintf("https://docs.astro.build/en/guides/%s", keyword), nil
-	default:
-		return "", fmt.Errorf("unsupported language: %s. Supported languages: go, rust, python, java, spring, typescript, react, vue, nuxt, docker, cloudflare, astro", language)
+		path := lowerKeyword
+		switch lowerKeyword {
+		case "components":
+			path = "basics/astro-components"
+		case "api":
+			path = "reference/api-reference"
+		case "install", "setup", "getting-started":
+			path = "install-and-setup"
+		default:
+			if !strings.Contains(lowerKeyword, "/") {
+				path = "guides/" + lowerKeyword
+			}
+		}
+		return fmt.Sprintf("https://docs.astro.build/en/%s/", path), nil
+	case "csharp":
+		lowerKeyword = strings.TrimPrefix(lowerKeyword, "/")
+		if strings.Contains(lowerKeyword, "regex") || strings.Contains(lowerKeyword, "regular-expression") {
+			return "https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions", nil
+		}
+		return fmt.Sprintf("https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/%s", lowerKeyword), nil
+	case "kotlin":
+		lowerKeyword = strings.TrimPrefix(lowerKeyword, "/")
+		if lowerKeyword == "regex" || lowerKeyword == "regexp" {
+			lowerKeyword = "strings"
+		}
+		if strings.HasSuffix(lowerKeyword, ".html") {
+			return fmt.Sprintf("https://kotlinlang.org/docs/%s", lowerKeyword), nil
+		}
+		return fmt.Sprintf("https://kotlinlang.org/docs/%s.html", lowerKeyword), nil
+	case "php":
+		lowerKeyword = strings.TrimPrefix(lowerKeyword, "/")
+		if strings.HasSuffix(lowerKeyword, ".php") || strings.Contains(lowerKeyword, "function.") || strings.Contains(lowerKeyword, "book.") {
+			return fmt.Sprintf("https://www.php.net/manual/en/%s", lowerKeyword), nil
+		}
+		return fmt.Sprintf("https://www.php.net/manual/en/book.%s.php", lowerKeyword), nil
+	case "ruby":
+		keyword = strings.TrimPrefix(keyword, "/")
+		switch strings.ToLower(keyword) {
+		case "regex", "regexp":
+			keyword = "Regexp"
+		case "string":
+			keyword = "String"
+		case "array":
+			keyword = "Array"
+		default:
+			if !strings.Contains(keyword, "::") && len(keyword) > 0 {
+				keyword = strings.ToUpper(keyword[:1]) + strings.ToLower(keyword[1:])
+			}
+		}
+		return fmt.Sprintf("https://ruby-doc.org/core/%s.html", keyword), nil
+	case "elixir":
+		keyword = strings.TrimPrefix(keyword, "/")
+		switch strings.ToLower(keyword) {
+		case "regex":
+			keyword = "Regex"
+		case "string":
+			keyword = "String"
+		case "enum":
+			keyword = "Enum"
+		default:
+			if len(keyword) > 0 {
+				keyword = strings.ToUpper(keyword[:1]) + strings.ToLower(keyword[1:])
+			}
+		}
+		return fmt.Sprintf("https://hexdocs.pm/elixir/%s.html", keyword), nil
+	case "nextjs":
+		if strings.Contains(lowerKeyword, "routing") {
+			return "https://nextjs.org/docs/app/building-your-application/routing", nil
+		}
+		if strings.Contains(lowerKeyword, "data") || strings.Contains(lowerKeyword, "fetch") {
+			return "https://nextjs.org/docs/app/building-your-application/data-fetching", nil
+		}
+		return "https://nextjs.org/docs", nil
+	case "svelte":
+		if strings.Contains(lowerKeyword, "kit") {
+			return "https://svelte.dev/docs/kit", nil
+		}
+		return "https://svelte.dev/docs/svelte/overview", nil
+	case "angular":
+		if strings.Contains(lowerKeyword, "http") {
+			return "https://angular.dev/guide/http", nil
+		}
+		return "https://angular.dev/guide/components", nil
+	case "remix":
+		if strings.Contains(lowerKeyword, "route") {
+			return "https://v2.remix.run/docs/file-conventions/routes", nil
+		}
+		return "https://v2.remix.run/docs", nil
+	case "solid":
+		// Solid docs are published from this repository and include solid-router content.
+		return "https://github.com/solidjs/solid-docs", nil
+	case "express":
+		if strings.Contains(lowerKeyword, "routing") {
+			return "https://expressjs.com/en/guide/routing.html", nil
+		}
+		if strings.Contains(lowerKeyword, "middleware") {
+			return "https://expressjs.com/en/guide/using-middleware.html", nil
+		}
+		return "https://expressjs.com/en/guide/writing-middleware.html", nil
+	default:
+		return "", fmt.Errorf("unsupported language: %s. Supported: %s", language, strings.Join(supportedLanguages(), ", "))
 	}
 }
 
 func mapLanguageToType(language string) string {
+	language, _ = normalizeLanguage(language)
 	switch language {
-	case "go", "golang":
+	case "go":
 		return "godocs"
 	case "rust":
 		return "rustdocs"
-	case "python", "py":
+	case "python":
 		return "pythondocs"
 	case "java":
 		return "javadocs"
 	case "spring":
 		return "springdocs"
-	case "typescript", "ts":
+	case "typescript":
 		return "tsdocs"
 	case "react":
 		return "reactdocs"
@@ -126,11 +225,80 @@ func mapLanguageToType(language string) string {
 		return "nuxtdocs"
 	case "docker":
 		return "dockerdocs"
-	case "cloudflare", "cf":
+	case "cloudflare":
 		return "cloudflaredocs"
 	case "astro":
 		return "astrodocs"
+	case "csharp", "kotlin", "php", "ruby", "elixir", "nextjs", "svelte", "angular", "remix", "express":
+		return "url"
+	case "solid":
+		return "github"
 	default:
-		return "web"
+		return ""
 	}
 }
+
+func normalizeLanguage(language string) (string, bool) {
+	language = strings.ToLower(strings.TrimSpace(language))
+	if language == "" {
+		return "", false
+	}
+	if canonical, ok := languageAliases()[language]; ok {
+		return canonical, true
+	}
+	return "", false
+}
+
+func languageAliases() map[string]string {
+	return map[string]string{
+		"go":         "go",
+		"golang":     "go",
+		"rust":       "rust",
+		"python":     "python",
+		"py":         "python",
+		"java":       "java",
+		"spring":     "spring",
+		"typescript": "typescript",
+		"ts":         "typescript",
+		"react":      "react",
+		"vue":        "vue",
+		"nuxt":       "nuxt",
+		"docker":     "docker",
+		"cloudflare": "cloudflare",
+		"cf":         "cloudflare",
+		"astro":      "astro",
+		"csharp":     "csharp",
+		"cs":         "csharp",
+		"kotlin":     "kotlin",
+		"kt":         "kotlin",
+		"php":        "php",
+		"ruby":       "ruby",
+		"rb":         "ruby",
+		"elixir":     "elixir",
+		"ex":         "elixir",
+		"next":       "nextjs",
+		"nextjs":     "nextjs",
+		"svelte":     "svelte",
+		"angular":    "angular",
+		"ng":         "angular",
+		"remix":      "remix",
+		"solid":      "solid",
+		"solidjs":    "solid",
+		"express":    "express",
+		"expressjs":  "express",
+	}
+}
+
+func supportedLanguages() []string {
+	seen := map[string]bool{}
+	out := make([]string, 0)
+	for _, canonical := range languageAliases() {
+		if seen[canonical] {
+			continue
+		}
+		seen[canonical] = true
+		out = append(out, canonical)
+	}
+	sort.Strings(out)
+	return out
+}
+121
@@ -0,0 +1,121 @@
+package cmd
+
+import "testing"
+
+func TestConstructDocURL_SupportedLanguages(t *testing.T) {
+	tests := []struct {
+		language string
+		keyword  string
+		wantURL  string
+	}{
+		{"go", "net/http", "https://pkg.go.dev/net/http"},
+		{"rust", "tokio", "https://docs.rs/tokio/latest/tokio/"},
+		{"python", "asyncio", "https://docs.python.org/3/library/asyncio.html"},
+		{"java", "java/util/list", "https://docs.oracle.com/javase/8/docs/api/java/util/list.html"},
+		{"spring", "mcp", "https://docs.spring.io/spring-ai/reference/api/mcp/mcp-overview.html"},
+		{"typescript", "utility-types", "https://www.typescriptlang.org/docs/handbook/utility-types.html"},
+		{"react", "hooks", "https://react.dev/reference/react"},
+		{"vue", "essentials/reactivity-fundamentals", "https://vuejs.org/guide/essentials/reactivity-fundamentals.html"},
+		{"nuxt", "directory-structure", "https://nuxt.com/docs/guide/directory-structure"},
+		{"docker", "compose", "https://docs.docker.com/compose"},
+		{"cloudflare", "workers", "https://developers.cloudflare.com/workers"},
+		{"astro", "components", "https://docs.astro.build/en/basics/astro-components/"},
+		{"csharp", "regex", "https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions"},
+		{"kotlin", "regex", "https://kotlinlang.org/docs/strings.html"},
+		{"php", "pcre", "https://www.php.net/manual/en/book.pcre.php"},
+		{"ruby", "Regexp", "https://ruby-doc.org/core/Regexp.html"},
+		{"elixir", "String", "https://hexdocs.pm/elixir/String.html"},
+		{"nextjs", "routing", "https://nextjs.org/docs/app/building-your-application/routing"},
+		{"svelte", "kit", "https://svelte.dev/docs/kit"},
+		{"angular", "http", "https://angular.dev/guide/http"},
+		{"remix", "routes", "https://v2.remix.run/docs/file-conventions/routes"},
+		{"solid", "signals", "https://github.com/solidjs/solid-docs"},
+		{"express", "routing", "https://expressjs.com/en/guide/routing.html"},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.language+"_"+tt.keyword, func(t *testing.T) {
+			got, err := constructDocURL(tt.language, tt.keyword)
+			if err != nil {
+				t.Fatalf("constructDocURL(%q, %q) returned error: %v", tt.language, tt.keyword, err)
+			}
+			if got != tt.wantURL {
+				t.Fatalf("constructDocURL(%q, %q) = %q, want %q", tt.language, tt.keyword, got, tt.wantURL)
+			}
+		})
+	}
+}
+
+func TestConstructDocURL_UnsupportedLanguage(t *testing.T) {
+	if _, err := constructDocURL("haskell", "regex-tdfa"); err == nil {
+		t.Fatal("constructDocURL should return an error for unsupported language")
+	}
+}
+
+func TestMapLanguageToType(t *testing.T) {
+	tests := []struct {
+		language string
+		wantType string
+	}{
+		{"go", "godocs"},
+		{"golang", "godocs"},
+		{"rust", "rustdocs"},
+		{"python", "pythondocs"},
+		{"py", "pythondocs"},
+		{"java", "javadocs"},
+		{"spring", "springdocs"},
+		{"typescript", "tsdocs"},
+		{"ts", "tsdocs"},
+		{"react", "reactdocs"},
+		{"vue", "vuedocs"},
+		{"nuxt", "nuxtdocs"},
+		{"docker", "dockerdocs"},
+		{"cloudflare", "cloudflaredocs"},
+		{"cf", "cloudflaredocs"},
+		{"astro", "astrodocs"},
+		{"csharp", "url"},
+		{"kotlin", "url"},
+		{"php", "url"},
+		{"ruby", "url"},
+		{"elixir", "url"},
+		{"nextjs", "url"},
+		{"next", "url"},
+		{"svelte", "url"},
+		{"angular", "url"},
+		{"ng", "url"},
+		{"remix", "url"},
+		{"solidjs", "github"},
+		{"expressjs", "url"},
+		{"unknown", ""},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.language, func(t *testing.T) {
+			got := mapLanguageToType(tt.language)
+			if got != tt.wantType {
+				t.Fatalf("mapLanguageToType(%q) = %q, want %q", tt.language, got, tt.wantType)
+			}
+		})
+	}
+}
+
+func TestNormalizeLanguage(t *testing.T) {
+	tests := []struct {
+		in   string
+		want string
+		ok   bool
+	}{
+		{"go", "go", true},
+		{"golang", "go", true},
+		{"next", "nextjs", true},
+		{"solidjs", "solid", true},
+		{"expressjs", "express", true},
+		{"unknown", "", false},
+	}
+	for _, tt := range tests {
+		got, ok := normalizeLanguage(tt.in)
+		if got != tt.want || ok != tt.ok {
+			t.Fatalf("normalizeLanguage(%q) = (%q,%v), want (%q,%v)", tt.in, got, ok, tt.want, tt.ok)
+		}
+	}
+}
+5
-61
@@ -6,6 +6,7 @@ import (
 	"path/filepath"
 
 	"github.com/spf13/cobra"
+	appconfig "github.com/yourorg/devour/internal/config"
 )
 
 var initCmd = &cobra.Command{
@@ -53,7 +54,10 @@ func runInit(cmd *cobra.Command, args []string) error {
 	}
 
 	// Create default config
-	config := generateDefaultConfig(initRemote)
+	config, err := appconfig.RenderInitYAML(initRemote)
+	if err != nil {
+		return fmt.Errorf("failed to render default config: %w", err)
+	}
 	if err := os.WriteFile(configPath, []byte(config), 0644); err != nil {
 		return fmt.Errorf("failed to write config: %w", err)
 	}
@@ -82,63 +86,3 @@ func runInit(cmd *cobra.Command, args []string) error {
 
 	return nil
 }
-
-func generateDefaultConfig(remote bool) string {
-	mode := "local"
-	if remote {
-		mode = "remote"
-	}
-
-	return fmt.Sprintf(`# Devour Configuration
-version: 1
-
-# Storage paths
-storage:
-  docs_dir: ./devour_data/docs
-  index_dir: ./devour_data/index
-  metadata_dir: ./devour_data/metadata
-
-# Embedding settings
-embeddings:
-  provider: openai
-  model: text-embedding-3-small
-  dimensions: 1536
-  api_key: ${OPENAI_API_KEY}
-  batch_size: 100
-
-# Vector database
-vector_db:
-  type: chromem
-  persist: true
-  similarity_metric: cosine
-
-# Scraping settings
-scraper:
-  user_agent: "Devour/1.0"
-  timeout: 30s
-  retry_count: 3
-  concurrency: 10
-  rate_limit: 500ms
-  max_depth: 3
-  cache_dir: ./devour_data/cache
-
-# Scheduler
-scheduler:
-  enabled: true
-  interval: 72h
-  check_method: hash
-
-# Server settings
-server:
-  mode: %s
-  port: 8080
-  host: localhost
-
-# Sources (add your own)
-sources: []
-# - name: example-docs
-#   type: url
-#   url: https://docs.example.com
-#   include: ["**/*.md", "**/*.html"]
-`, mode)
-}
+67
-95
@@ -1,118 +1,90 @@
 package cmd
 
 import (
+	"encoding/json"
 	"fmt"
+	"io"
 	"strings"
 
 	"github.com/spf13/cobra"
 )
 
+var languagesFormat string
+
 var languagesCmd = &cobra.Command{
 	Use:   "languages",
-	Short: "Show supported languages and their mappings",
-	Long: `Display all supported languages for the 'devour get' command
-along with their base URLs and examples.
-
-This helps you discover what documentation sources are available
-and how to reference them quickly.`,
+	Short: "Show supported languages and aliases",
+	Long: `Display all supported languages for 'devour get' and 'devour ask'
+with aliases and starter examples.`,
 	RunE: runLanguages,
 }
 
 func init() {
-	rootCmd.AddCommand(languagesCmd)
+	languagesCmd.Flags().StringVar(&languagesFormat, "format", "text", "output format (text, json)")
+}
+
+type languageInfo struct {
+	Canonical string   `json:"canonical"`
+	Aliases   []string `json:"aliases"`
+	Example   string   `json:"example"`
+	Source    string   `json:"source"`
 }
 
 func runLanguages(cmd *cobra.Command, args []string) error {
-	fmt.Println("🌐 Devour Supported Languages")
-	fmt.Println("═══════════════════════════════════════════════════════════════")
-	fmt.Println()
+	rows := []languageInfo{
+		{Canonical: "go", Aliases: []string{"go", "golang"}, Example: "devour get go http", Source: "pkg.go.dev"},
+		{Canonical: "rust", Aliases: []string{"rust"}, Example: "devour get rust tokio", Source: "docs.rs"},
+		{Canonical: "python", Aliases: []string{"python", "py"}, Example: "devour get python asyncio", Source: "docs.python.org"},
+		{Canonical: "java", Aliases: []string{"java"}, Example: "devour get java string", Source: "docs.oracle.com"},
+		{Canonical: "spring", Aliases: []string{"spring"}, Example: "devour get spring mcp", Source: "docs.spring.io"},
+		{Canonical: "typescript", Aliases: []string{"typescript", "ts"}, Example: "devour get ts interfaces", Source: "typescriptlang.org"},
+		{Canonical: "react", Aliases: []string{"react"}, Example: "devour get react hooks", Source: "react.dev"},
+		{Canonical: "vue", Aliases: []string{"vue"}, Example: "devour get vue reactivity", Source: "vuejs.org"},
+		{Canonical: "nuxt", Aliases: []string{"nuxt"}, Example: "devour get nuxt routing", Source: "nuxt.com"},
+		{Canonical: "docker", Aliases: []string{"docker"}, Example: "devour get docker compose", Source: "docs.docker.com"},
+		{Canonical: "cloudflare", Aliases: []string{"cloudflare", "cf"}, Example: "devour get cloudflare workers", Source: "developers.cloudflare.com"},
+		{Canonical: "astro", Aliases: []string{"astro"}, Example: "devour get astro components", Source: "docs.astro.build"},
|
{Canonical: "csharp", Aliases: []string{"csharp", "cs"}, Example: "devour get csharp regex", Source: "learn.microsoft.com"},
|
||||||
|
{Canonical: "kotlin", Aliases: []string{"kotlin", "kt"}, Example: "devour get kotlin strings", Source: "kotlinlang.org"},
|
||||||
|
{Canonical: "php", Aliases: []string{"php"}, Example: "devour get php pcre", Source: "php.net"},
|
||||||
|
{Canonical: "ruby", Aliases: []string{"ruby", "rb"}, Example: "devour get ruby Regexp", Source: "ruby-doc.org"},
|
||||||
|
{Canonical: "elixir", Aliases: []string{"elixir", "ex"}, Example: "devour get elixir String", Source: "hexdocs.pm"},
|
||||||
|
{Canonical: "nextjs", Aliases: []string{"next", "nextjs"}, Example: "devour get nextjs routing", Source: "nextjs.org"},
|
||||||
|
{Canonical: "svelte", Aliases: []string{"svelte"}, Example: "devour get svelte kit", Source: "svelte.dev"},
|
||||||
|
{Canonical: "angular", Aliases: []string{"angular", "ng"}, Example: "devour get angular http", Source: "angular.dev"},
|
||||||
|
{Canonical: "remix", Aliases: []string{"remix"}, Example: "devour get remix routes", Source: "v2.remix.run"},
|
||||||
|
{Canonical: "solid", Aliases: []string{"solid", "solidjs"}, Example: "devour get solid router", Source: "github.com/solidjs/solid-docs"},
|
||||||
|
{Canonical: "express", Aliases: []string{"express", "expressjs"}, Example: "devour get express middleware", Source: "expressjs.com"},
|
||||||
|
}
|
||||||
|
|
||||||
languages := []struct {
|
switch strings.ToLower(strings.TrimSpace(languagesFormat)) {
|
||||||
langs []string
|
case "json":
|
||||||
url string
|
out := struct {
|
||||||
examples []string
|
Count int `json:"count"`
|
||||||
|
Languages []languageInfo `json:"languages"`
|
||||||
}{
|
}{
|
||||||
{
|
Count: len(rows),
|
||||||
langs: []string{"go", "golang"},
|
Languages: rows,
|
||||||
url: "https://pkg.go.dev/{package}",
|
|
||||||
examples: []string{"devour get go http", "devour get go fmt", "devour get golang json"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"rust"},
|
|
||||||
url: "https://docs.rs/{crate}/latest/{crate}/",
|
|
||||||
examples: []string{"devour get rust tokio", "devour get rust serde", "devour get rust clap"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"python", "py"},
|
|
||||||
url: "https://docs.python.org/3/library/{module}.html",
|
|
||||||
examples: []string{"devour get python asyncio", "devour get py requests", "devour get python stdlib"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"java"},
|
|
||||||
url: "https://docs.oracle.com/javase/8/docs/api/{package}.html",
|
|
||||||
examples: []string{"devour get java string", "devour get java arraylist"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"spring"},
|
|
||||||
url: "https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#{section}",
|
|
||||||
examples: []string{"devour get spring boot", "devour get spring testing"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"typescript", "ts"},
|
|
||||||
url: "https://www.typescriptlang.org/docs/handbook/{topic}.html",
|
|
||||||
examples: []string{"devour get typescript interfaces", "devour get ts decorators"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"react"},
|
|
||||||
url: "https://react.dev/reference/react/{feature}",
|
|
||||||
examples: []string{"devour get react hooks", "devour get react components", "devour get react state"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"vue"},
|
|
||||||
url: "https://vuejs.org/guide/{topic}.html",
|
|
||||||
examples: []string{"devour get vue components", "devour get vue reactivity"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"nuxt"},
|
|
||||||
url: "https://nuxt.com/docs/guide/{topic}",
|
|
||||||
examples: []string{"devour get nuxt routing", "devour get nuxt middleware"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"docker"},
|
|
||||||
url: "https://docs.docker.com/{topic}",
|
|
||||||
examples: []string{"devour get docker compose", "devour get docker build", "devour get docker networking"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"cloudflare", "cf"},
|
|
||||||
url: "https://developers.cloudflare.com/{topic}",
|
|
||||||
examples: []string{"devour get cloudflare workers", "devour get cf pages", "devour get cloudflare dns"},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
langs: []string{"astro"},
|
|
||||||
url: "https://docs.astro.build/en/guides/{topic}",
|
|
||||||
examples: []string{"devour get astro routing", "devour get astro components"},
|
|
||||||
},
|
|
||||||
}
|
}
|
||||||
|
enc := json.NewEncoder(cmd.OutOrStdout())
|
||||||
for _, lang := range languages {
|
enc.SetIndent("", " ")
|
||||||
fmt.Printf("🔷 %s\n", strings.Join(lang.langs, ", "))
|
return enc.Encode(out)
|
||||||
fmt.Printf(" URL: %s\n", lang.url)
|
case "text", "":
|
||||||
fmt.Printf(" Examples:\n")
|
printLanguagesText(cmd.OutOrStdout(), rows)
|
||||||
for _, example := range lang.examples {
|
|
||||||
fmt.Printf(" • %s\n", example)
|
|
||||||
}
|
|
||||||
fmt.Println()
|
|
||||||
}
|
|
||||||
|
|
||||||
fmt.Println("💡 Pro Tips:")
|
|
||||||
fmt.Println(" • Use 'devour get <language> help' for language-specific help")
|
|
||||||
fmt.Println(" • Add --format markdown for enhanced documentation")
|
|
||||||
fmt.Println(" • Most languages support common aliases (e.g., py → python)")
|
|
||||||
fmt.Println()
|
|
||||||
fmt.Println("🚀 Quick Start:")
|
|
||||||
fmt.Println(" devour get go http --format markdown")
|
|
||||||
fmt.Println(" devour get python asyncio")
|
|
||||||
fmt.Println(" devour get react hooks")
|
|
||||||
|
|
||||||
return nil
|
return nil
|
||||||
|
default:
|
||||||
|
return fmt.Errorf("unsupported format: %s", languagesFormat)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func printLanguagesText(out io.Writer, rows []languageInfo) {
|
||||||
|
_, _ = fmt.Fprintln(out, "Devour Supported Languages")
|
||||||
|
_, _ = fmt.Fprintln(out, "============================================")
|
||||||
|
_, _ = fmt.Fprintln(out)
|
||||||
|
for _, row := range rows {
|
||||||
|
_, _ = fmt.Fprintf(out, "- %s (%s)\n", row.Canonical, strings.Join(row.Aliases, ", "))
|
||||||
|
_, _ = fmt.Fprintf(out, " source: %s\n", row.Source)
|
||||||
|
_, _ = fmt.Fprintf(out, " example: %s\n\n", row.Example)
|
||||||
|
}
|
||||||
|
_, _ = fmt.Fprintln(out, "Tip: use 'devour get <language> <keyword> --format markdown' for readable output.")
|
||||||
}
|
}
|
||||||
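For scripting against the new `--format json` output, a consumer could decode the payload roughly as sketched below. The JSON field names (`count`, `languages`, `canonical`, `aliases`, `example`, `source`) come from the struct tags in this diff, and the two sample entries copy rows from `runLanguages`. The names `languagesPayload`, `decodeLanguages`, `findByAlias`, and `samplePayload` are hypothetical and exist only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// languageInfo mirrors the JSON tags on the struct introduced in this diff.
type languageInfo struct {
	Canonical string   `json:"canonical"`
	Aliases   []string `json:"aliases"`
	Example   string   `json:"example"`
	Source    string   `json:"source"`
}

// languagesPayload mirrors the anonymous struct encoded by the "json" case.
type languagesPayload struct {
	Count     int            `json:"count"`
	Languages []languageInfo `json:"languages"`
}

// samplePayload is hand-written sample data (assumption), not captured output.
const samplePayload = `{
  "count": 2,
  "languages": [
    {"canonical": "go", "aliases": ["go", "golang"], "example": "devour get go http", "source": "pkg.go.dev"},
    {"canonical": "python", "aliases": ["python", "py"], "example": "devour get python asyncio", "source": "docs.python.org"}
  ]
}`

// decodeLanguages parses a JSON document shaped like the command's output.
func decodeLanguages(raw string) (languagesPayload, error) {
	var p languagesPayload
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

// findByAlias is a hypothetical helper that resolves an alias such as "py"
// to its entry, ignoring case and surrounding whitespace.
func findByAlias(p languagesPayload, alias string) (languageInfo, bool) {
	alias = strings.ToLower(strings.TrimSpace(alias))
	for _, l := range p.Languages {
		for _, a := range l.Aliases {
			if a == alias {
				return l, true
			}
		}
	}
	return languageInfo{}, false
}

func main() {
	p, err := decodeLanguages(samplePayload)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if l, ok := findByAlias(p, "py"); ok {
		fmt.Println(l.Canonical, l.Source)
	}
}
```

Decoding into a named struct rather than `map[string]any` keeps the lookup type-safe and lets unknown future fields be ignored silently.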
|
|||||||
@@ -0,0 +1,63 @@
+package cmd
+
+import (
+	"bytes"
+	"encoding/json"
+	"strings"
+	"testing"
+)
+
+func TestLanguagesJSONFormat(t *testing.T) {
+	prev := languagesFormat
+	defer func() { languagesFormat = prev }()
+	languagesFormat = "json"
+
+	var buf bytes.Buffer
+	languagesCmd.SetOut(&buf)
+
+	if err := runLanguages(languagesCmd, nil); err != nil {
+		t.Fatalf("runLanguages returned error: %v", err)
+	}
+
+	var payload struct {
+		Count     int `json:"count"`
+		Languages []struct {
+			Canonical string   `json:"canonical"`
+			Aliases   []string `json:"aliases"`
+		} `json:"languages"`
+	}
+	if err := json.Unmarshal(buf.Bytes(), &payload); err != nil {
+		t.Fatalf("invalid json output: %v", err)
+	}
+	if payload.Count == 0 || len(payload.Languages) == 0 {
+		t.Fatalf("expected non-empty languages payload, got %+v", payload)
+	}
+
+	foundNext := false
+	for _, l := range payload.Languages {
+		if l.Canonical == "nextjs" {
+			foundNext = true
+			break
+		}
+	}
+	if !foundNext {
+		t.Fatalf("expected nextjs in JSON payload, got %+v", payload.Languages)
+	}
+}
+
+func TestLanguagesTextFormat(t *testing.T) {
+	prev := languagesFormat
+	defer func() { languagesFormat = prev }()
+	languagesFormat = "text"
+
+	var buf bytes.Buffer
+	languagesCmd.SetOut(&buf)
+
+	if err := runLanguages(languagesCmd, nil); err != nil {
+		t.Fatalf("runLanguages returned error: %v", err)
+	}
+	out := buf.String()
+	if !strings.Contains(out, "Devour Supported Languages") {
+		t.Fatalf("unexpected text output: %q", out)
+	}
+}
+78
-26
@@ -1,25 +1,32 @@
|
|||||||
package cmd
|
package cmd
|
||||||
|
|
||||||
import (
|
import (
|
||||||
|
"context"
|
||||||
"fmt"
|
"fmt"
|
||||||
|
"net/url"
|
||||||
|
"os"
|
||||||
|
"strings"
|
||||||
|
|
||||||
"github.com/spf13/cobra"
|
"github.com/spf13/cobra"
|
||||||
|
"github.com/yourorg/devour/internal/scraper"
|
||||||
|
"github.com/yourorg/devour/internal/search"
|
||||||
|
"github.com/yourorg/devour/internal/storage"
|
||||||
)
|
)
|
||||||
|
|
||||||
var pushCmd = &cobra.Command{
|
var pushCmd = &cobra.Command{
|
||||||
Use: "push <path>",
|
Use: "push <path>",
|
||||||
Short: "Push documents to remote MCP server",
|
Short: "Import local documents into Devour storage/index",
|
||||||
Long: `Push local documents to a remote Devour MCP server.
|
Long: `Push local documents into your Devour local workspace.
|
||||||
|
|
||||||
Useful for:
|
Current stable behavior:
|
||||||
- Syncing local documentation to a shared server
|
- local ingest into docs storage
|
||||||
- Backing up indexed content
|
- local reindex for query/ask/status
|
||||||
- Contributing to a team knowledge base
|
|
||||||
|
Remote push is experimental and not enabled by default.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
devour push ./docs
|
devour push ./docs
|
||||||
devour push ./docs --server http://devour.company.com
|
devour push ./docs --project my-project`,
|
||||||
devour push ./docs --server http://localhost:8080 --project my-project`,
|
|
||||||
Args: cobra.ExactArgs(1),
|
Args: cobra.ExactArgs(1),
|
||||||
RunE: runPush,
|
RunE: runPush,
|
||||||
}
|
}
|
||||||
@@ -30,33 +37,78 @@ var (
|
|||||||
)
|
)
|
||||||
|
|
||||||
func init() {
|
func init() {
|
||||||
pushCmd.Flags().StringVar(&pushServer, "server", "", "remote Devour server URL")
|
pushCmd.Flags().StringVar(&pushServer, "server", "", "remote Devour server URL (experimental)")
|
||||||
pushCmd.Flags().StringVarP(&pushProject, "project", "p", "", "project name on remote server")
|
pushCmd.Flags().StringVarP(&pushProject, "project", "p", "", "project name label")
|
||||||
}
|
}
|
||||||
|
|
||||||
func runPush(cmd *cobra.Command, args []string) error {
|
func runPush(cmd *cobra.Command, args []string) error {
|
||||||
path := args[0]
|
path := args[0]
|
||||||
|
if _, err := os.Stat(path); err != nil {
|
||||||
if pushServer == "" {
|
return fmt.Errorf("path does not exist: %s", path)
|
||||||
// Try to get from config
|
|
||||||
pushServer = "http://localhost:8080"
|
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Printf("📤 Pushing to: %s\n", pushServer)
|
cfg, err := loadAppConfig()
|
||||||
fmt.Printf(" Path: %s\n", path)
|
if err != nil {
|
||||||
if pushProject != "" {
|
return err
|
||||||
fmt.Printf(" Project: %s\n", pushProject)
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// TODO: Implement actual push logic
|
server := strings.TrimSpace(pushServer)
|
||||||
// 1. Scan path for documents
|
if server != "" && !isLocalServer(server) {
|
||||||
// 2. Connect to remote server
|
return fmt.Errorf("remote push is experimental and not enabled in this build; use local push without --server")
|
||||||
// 3. Upload documents
|
}
|
||||||
// 4. Wait for indexing confirmation
|
|
||||||
|
|
||||||
fmt.Println()
|
projectName := strings.TrimSpace(pushProject)
|
||||||
fmt.Println("⚠️ Push functionality not yet implemented")
|
if projectName == "" {
|
||||||
fmt.Println(" Remote server support coming soon")
|
projectName = "local-push"
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Printf("📤 Ingesting local docs from: %s\n", path)
|
||||||
|
fmt.Printf(" Project: %s\n", projectName)
|
||||||
|
fmt.Printf(" Target docs dir: %s\n", cfg.Storage.DocsDir)
|
||||||
|
|
||||||
|
s := scraper.NewScraper(scraper.SourceTypeLocal, toScraperConfig(cfg, 0))
|
||||||
|
if s == nil {
|
||||||
|
return fmt.Errorf("local scraper not available")
|
||||||
|
}
|
||||||
|
|
||||||
|
docs, err := s.Scrape(context.Background(), &scraper.Source{
|
||||||
|
Name: projectName,
|
||||||
|
Type: scraper.SourceTypeLocal,
|
||||||
|
Path: path,
|
||||||
|
Include: []string{`.*`},
|
||||||
|
})
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("local ingest failed: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
saved, err := storage.SaveDocuments(docs, storage.SaveOptions{
|
||||||
|
Format: "json",
|
||||||
|
OutputDir: cfg.Storage.DocsDir,
|
||||||
|
AllowEmpty: false,
|
||||||
|
PrintWriter: nil,
|
||||||
|
})
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("save docs failed: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
engine := search.NewEngine(cfg)
|
||||||
|
stats, err := engine.Rebuild(context.Background())
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("reindex failed: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Println("\n✓ Push complete")
|
||||||
|
fmt.Printf(" Documents imported: %d\n", saved.Count)
|
||||||
|
fmt.Printf(" Index docs: %d\n", stats.Documents)
|
||||||
|
fmt.Printf(" Index path: %s\n", stats.IndexPath)
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func isLocalServer(raw string) bool {
|
||||||
|
u, err := url.Parse(raw)
|
||||||
|
if err != nil {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
host := strings.ToLower(u.Hostname())
|
||||||
|
return host == "" || host == "localhost" || host == "127.0.0.1"
|
||||||
|
}
|
||||||
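The `isLocalServer` check in this diff compares the parsed hostname against `localhost` and `127.0.0.1` only. As one possible alternative (not what this commit does), the loopback test could be delegated to `net.ParseIP(...).IsLoopback()`, which also accepts `::1` and other addresses in `127.0.0.0/8`. The name `isLoopbackServer` is hypothetical; the empty-host case is kept as in the diff.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// isLoopbackServer is a hypothetical variant of isLocalServer from the diff.
// It treats an empty host and "localhost" as local, and otherwise asks the
// standard library whether the host is a loopback IP address.
func isLoopbackServer(raw string) bool {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	if host == "" || host == "localhost" {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	servers := []string{
		"http://localhost:8080",
		"http://127.0.0.1:8080",
		"http://[::1]:8080",
		"http://devour.company.com",
	}
	for _, s := range servers {
		fmt.Printf("%-28s local=%v\n", s, isLoopbackServer(s))
	}
}
```

Whether IPv6 loopback should count as local for `devour push --server` is a product decision; the sketch only shows that the standard library can answer the question without a hand-maintained host list.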
|
|||||||
@@ -6,6 +6,7 @@ import (
 	"fmt"
 	"os"
 	"path/filepath"
+	"strings"
 	"time"
 
 	"github.com/spf13/cobra"
@@ -218,6 +219,7 @@ func runQualityScan(cmd *cobra.Command, args []string) error {
 	if err != nil {
 		return fmt.Errorf("scan failed: %w", err)
 	}
+	result.Findings = quality.AttachDocsEvidence(lang, result.Findings)
 
 	return outputScanResult(result, qualityFormat)
 }
@@ -256,9 +258,11 @@ func runQualityStatus(cmd *cobra.Command, args []string) error {
 		return json.NewEncoder(os.Stdout).Encode(scorecard)
 	case "strict":
 		fmt.Println(scorer.FormatStrictScorecard(findings, lastScan))
+		printQualityEvidenceSummary(findings)
 		return nil
 	default:
 		fmt.Println(scorer.FormatScorecard(scorecard))
+		printQualityEvidenceSummary(findings)
 		return nil
 	}
 }
@@ -318,6 +322,17 @@ func runQualityNext(cmd *cobra.Command, args []string) error {
 	fmt.Printf("Score: %d\n", next.Score)
 	fmt.Printf("ID: %s\n", next.ID)
 	fmt.Printf("\nDescription:\n%s\n", next.Description)
+	if next.Metadata != nil {
+		if urls := strings.TrimSpace(next.Metadata["docs_evidence_urls"]); urls != "" {
+			fmt.Printf("\nEvidence Docs:\n%s\n", urls)
+		}
+		if rationale := strings.TrimSpace(next.Metadata["docs_evidence_rationale"]); rationale != "" {
+			fmt.Printf("\nRationale:\n%s\n", rationale)
+		}
+		if confidence := strings.TrimSpace(next.Metadata["docs_evidence_confidence"]); confidence != "" {
+			fmt.Printf("Evidence confidence: %s\n", confidence)
+		}
+	}
 
 	if explain {
 		fmt.Printf("\nExplanation:\n")
@@ -693,3 +708,27 @@ func importReviewResponses(dataDir string, filename string) error {
 
 	return nil
 }
+
+func printQualityEvidenceSummary(findings []quality.Finding) {
+	totalWithEvidence := 0
+	for _, f := range findings {
+		if f.Metadata != nil && strings.TrimSpace(f.Metadata["docs_evidence_urls"]) != "" {
+			totalWithEvidence++
+		}
+	}
+	if totalWithEvidence == 0 {
+		return
+	}
+	fmt.Printf("\nEvidence-linked findings: %d/%d\n", totalWithEvidence, len(findings))
+	for _, f := range findings {
+		if f.Metadata == nil {
+			continue
+		}
+		urls := strings.TrimSpace(f.Metadata["docs_evidence_urls"])
+		if urls == "" {
+			continue
+		}
+		fmt.Printf(" • %s:%d - %s\n %s\n", filepath.Base(f.File), f.Line, f.Title, urls)
+		break
+	}
+}
|
|||||||
+100
-18
@@ -1,9 +1,14 @@
|
|||||||
package cmd
|
package cmd
|
||||||
|
|
||||||
import (
|
import (
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
"fmt"
|
"fmt"
|
||||||
|
"strings"
|
||||||
|
|
||||||
"github.com/spf13/cobra"
|
"github.com/spf13/cobra"
|
||||||
|
appconfig "github.com/yourorg/devour/internal/config"
|
||||||
|
"github.com/yourorg/devour/internal/search"
|
||||||
)
|
)
|
||||||
|
|
||||||
var queryCmd = &cobra.Command{
|
var queryCmd = &cobra.Command{
|
||||||
@@ -29,32 +34,109 @@ var (
|
|||||||
)
|
)
|
||||||
|
|
||||||
func init() {
|
func init() {
|
||||||
queryCmd.Flags().IntVarP(&queryLimit, "limit", "l", 5, "maximum number of results")
|
queryCmd.Flags().IntVarP(&queryLimit, "limit", "n", 5, "maximum number of results")
|
||||||
queryCmd.Flags().StringVarP(&queryFormat, "format", "f", "text", "output format (text, json, markdown)")
|
queryCmd.Flags().StringVarP(&queryFormat, "format", "f", "text", "output format (text, json, markdown)")
|
||||||
queryCmd.Flags().Float64Var(&queryThreshold, "threshold", 0.7, "similarity threshold (0-1)")
|
queryCmd.Flags().Float64Var(&queryThreshold, "threshold", 0, "minimum lexical score threshold")
|
||||||
}
|
}
|
||||||
|
|
||||||
func runQuery(cmd *cobra.Command, args []string) error {
|
func runQuery(cmd *cobra.Command, args []string) error {
|
||||||
query := args[0]
|
query := strings.TrimSpace(strings.Join(args, " "))
|
||||||
if len(args) > 1 {
|
if query == "" {
|
||||||
query = fmt.Sprintf("%s", args)
|
return fmt.Errorf("query cannot be empty")
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Printf("Searching: %q\n", query)
|
cfg, err := loadAppConfig()
|
||||||
fmt.Printf(" Limit: %d\n", queryLimit)
|
if err != nil {
|
||||||
fmt.Printf(" Threshold: %.2f\n", queryThreshold)
|
return err
|
||||||
fmt.Println()
|
}
|
||||||
|
|
||||||
// TODO: Implement actual query logic
|
engine := search.NewEngine(cfg)
|
||||||
// 1. Generate embedding for query
|
results, stats, err := engine.Search(context.Background(), query, search.SearchOptions{
|
||||||
// 2. Search vector database
|
Limit: queryLimit,
|
||||||
// 3. Format and return results
|
Threshold: queryThreshold,
|
||||||
|
})
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("query failed: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
// Placeholder results
|
switch strings.ToLower(queryFormat) {
|
||||||
fmt.Println("Results:")
|
case "json":
|
||||||
fmt.Println("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
|
resp := map[string]any{
|
||||||
fmt.Println("⚠️ Query functionality not yet implemented")
|
"query": query,
|
||||||
fmt.Println(" Index some documents first with 'devour scrape'")
|
"limit": queryLimit,
|
||||||
|
"threshold": queryThreshold,
|
||||||
|
"count": len(results),
|
||||||
|
"results": results,
|
||||||
|
"indexed_at": stats.LastIndexedAt,
|
||||||
|
"documents": stats.Documents,
|
||||||
|
}
|
||||||
|
enc := json.NewEncoder(cmd.OutOrStdout())
|
||||||
|
enc.SetIndent("", " ")
|
||||||
|
return enc.Encode(resp)
|
||||||
|
case "markdown":
|
||||||
|
return printQueryMarkdown(cmd, query, cfg, results, stats)
|
||||||
|
case "text":
|
||||||
|
return printQueryText(cmd, query, cfg, results, stats)
|
||||||
|
default:
|
||||||
|
return fmt.Errorf("unsupported format: %s (supported: text, json, markdown)", queryFormat)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func printQueryText(cmd *cobra.Command, query string, cfg *appconfig.Config, results []search.Result, stats *search.IndexStats) error {
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "Searching: %q\n", query)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Limit: %d\n", queryLimit)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Threshold: %.2f\n", queryThreshold)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Indexed docs: %d\n", stats.Documents)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Docs dir: %s\n\n", cfg.Storage.DocsDir)
|
||||||
|
|
||||||
|
if len(results) == 0 {
|
||||||
|
fmt.Fprintln(cmd.OutOrStdout(), "No results found.")
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Fprintln(cmd.OutOrStdout(), "Results:")
|
||||||
|
fmt.Fprintln(cmd.OutOrStdout(), "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
|
||||||
|
for i, r := range results {
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "%d. %s\n", i+1, r.Title)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Score: %.3f | Type: %s | Source: %s\n", r.Score, r.Type, defaultSource(r.Source))
|
||||||
|
if r.URL != "" {
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " URL: %s\n", r.URL)
|
||||||
|
}
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Snippet: %s\n\n", r.Snippet)
|
||||||
|
}
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func printQueryMarkdown(cmd *cobra.Command, query string, cfg *appconfig.Config, results []search.Result, stats *search.IndexStats) error {
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "# Query Results\n\n")
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Query: `%s`\n", query)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Limit: `%d`\n", queryLimit)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Threshold: `%.2f`\n", queryThreshold)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Indexed docs: `%d`\n", stats.Documents)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Docs dir: `%s`\n\n", cfg.Storage.DocsDir)
|
||||||
|
|
||||||
|
if len(results) == 0 {
|
||||||
|
fmt.Fprintln(cmd.OutOrStdout(), "_No results found._")
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
for i, r := range results {
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "## %d. %s\n\n", i+1, r.Title)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Score: `%.3f`\n", r.Score)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Type: `%s`\n", r.Type)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- Source: `%s`\n", defaultSource(r.Source))
|
||||||
|
if r.URL != "" {
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "- URL: %s\n", r.URL)
|
||||||
|
}
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "\n%s\n\n", r.Snippet)
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func defaultSource(source string) string {
|
||||||
|
source = strings.TrimSpace(source)
|
||||||
|
if source == "" {
|
||||||
|
return "unknown"
|
||||||
|
}
|
||||||
|
return source
|
||||||
|
}
|
||||||
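One behavioral fix in the `runQuery` hunk is easy to miss: the removed code built multi-word queries with `fmt.Sprintf("%s", args)`, which formats the slice itself and so keeps the surrounding brackets, while the new code joins the arguments with spaces. The sketch below isolates both versions; `oldQuery` and `newQuery` are hypothetical wrapper names around the expressions shown in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// oldQuery reproduces the removed logic: with more than one argument the
// whole slice is formatted with %s, which yields "[a b]" rather than "a b".
func oldQuery(args []string) string {
	query := args[0]
	if len(args) > 1 {
		query = fmt.Sprintf("%s", args)
	}
	return query
}

// newQuery reproduces the added logic: join all arguments with single
// spaces and trim surrounding whitespace.
func newQuery(args []string) string {
	return strings.TrimSpace(strings.Join(args, " "))
}

func main() {
	args := []string{"http", "client"}
	fmt.Printf("%q\n", oldQuery(args)) // "[http client]"
	fmt.Printf("%q\n", newQuery(args)) // "http client"
}
```

With the old form, the literal brackets would have been passed on as part of the search text; the joined form also lets an all-whitespace query be rejected by the `query == ""` check.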
|
|||||||
@@ -6,6 +6,7 @@ import (
 	"time"
 
 	"github.com/yourorg/devour/internal/scraper"
+	_ "github.com/yourorg/devour/internal/scraper/external"
 )
 
 func main() {
@@ -90,6 +91,7 @@ func main() {
 		scraper.SourceTypeGitHub,
 		scraper.SourceTypeOpenAPI,
 		scraper.SourceTypeLocal,
+		scraper.SourceTypeLocalSearch,
 		scraper.SourceTypeGoDocs,
 		scraper.SourceTypeRustDocs,
 		scraper.SourceTypePythonDocs,
|
|||||||
@@ -6,6 +6,7 @@ import (
 
 	"github.com/spf13/cobra"
 	"github.com/spf13/viper"
+	_ "github.com/yourorg/devour/internal/scraper/external"
 	"github.com/yourorg/devour/internal/ui"
 )
 
@@ -34,6 +35,7 @@ Runs in two modes:
 - Local mode: OpenCode skill running entirely on your machine
 - Remote mode: MCP server for multi-user/team access`,
 	Version: "1.0.0",
+	SilenceUsage: true,
 }
 
 func Execute() {
@@ -53,6 +55,7 @@ func init() {
 	rootCmd.AddCommand(initCmd)
 	rootCmd.AddCommand(scrapeCmd)
 	rootCmd.AddCommand(getCmd)
+	rootCmd.AddCommand(askCmd)
 	rootCmd.AddCommand(languagesCmd)
 	rootCmd.AddCommand(demoCmd)
 	rootCmd.AddCommand(serveCmd)
@@ -62,6 +65,8 @@ func init() {
 	rootCmd.AddCommand(pushCmd)
 	rootCmd.AddCommand(logoCmd)
 	rootCmd.AddCommand(scorecardCmd)
+	rootCmd.AddCommand(autoCmd)
+	rootCmd.AddCommand(verifyCmd)
 }
 
 // logoCmd displays the Devour character
|
|||||||
@@ -0,0 +1,31 @@
+package cmd
+
+import "testing"
+
+func TestRootCommandsAreUnique(t *testing.T) {
+	seen := map[string]bool{}
+	for _, c := range rootCmd.Commands() {
+		name := c.Name()
+		if seen[name] {
+			t.Fatalf("duplicate root command registered: %s", name)
+		}
+		seen[name] = true
+	}
+}
+
+func TestQueryLimitShorthandIsN(t *testing.T) {
+	flag := queryCmd.Flags().Lookup("limit")
+	if flag == nil {
+		t.Fatal("query --limit flag not found")
+	}
+	if flag.Shorthand != "n" {
+		t.Fatalf("expected query --limit shorthand to be n, got %q", flag.Shorthand)
+	}
+}
+
+func TestRootExecuteQueryNoPanic(t *testing.T) {
+	rootCmd.SetArgs([]string{"query", "http client", "--limit", "1"})
+	if _, err := rootCmd.ExecuteC(); err != nil {
+		t.Fatalf("query execution should not panic; got error: %v", err)
+	}
+}
@@ -0,0 +1,81 @@
+package cmd
+
+import (
+	"fmt"
+	"path/filepath"
+	"strings"
+	"time"
+
+	appconfig "github.com/yourorg/devour/internal/config"
+	"github.com/yourorg/devour/internal/scraper"
+)
+
+func loadAppConfig() (*appconfig.Config, error) {
+	cfg, err := appconfig.Load(cfgFile)
+	if err != nil {
+		return nil, err
+	}
+	if err := cfg.EnsureStorageDirs(); err != nil {
+		return nil, fmt.Errorf("ensure storage dirs: %w", err)
+	}
+	return cfg, nil
+}
+
+func toScraperConfig(c *appconfig.Config, concurrencyOverride int) *scraper.Config {
+	sc := &scraper.Config{
+		UserAgent:   c.Scraper.UserAgent,
+		Timeout:     c.Scraper.Timeout,
+		RetryCount:  c.Scraper.RetryCount,
+		RetryDelay:  c.Scraper.RetryDelay,
+		Concurrency: c.Scraper.Concurrency,
+		RateLimit:   c.Scraper.RateLimit,
+		MaxDepth:    c.Scraper.MaxDepth,
+		CacheDir:    c.Scraper.CacheDir,
+	}
+	if concurrencyOverride > 0 {
+		sc.Concurrency = concurrencyOverride
+	}
+	if sc.Timeout <= 0 {
+		sc.Timeout = 30 * time.Second
+	}
+	if sc.RetryCount <= 0 {
+		sc.RetryCount = 3
+	}
+	if sc.RetryDelay <= 0 {
+		sc.RetryDelay = 1 * time.Second
+	}
+	if sc.Concurrency <= 0 {
+		sc.Concurrency = 10
+	}
+	if sc.MaxDepth <= 0 {
+		sc.MaxDepth = 2
+	}
+	return sc
+}
+
+func sourceFromConfig(s appconfig.SourceConfig) *scraper.Source {
+	return &scraper.Source{
+		Name:        strings.TrimSpace(s.Name),
+		Type:        scraper.SourceType(strings.TrimSpace(s.Type)),
+		URL:         strings.TrimSpace(s.URL),
+		Query:       strings.TrimSpace(s.Query),
+		ResultLimit: s.ResultLimit,
+		Domains:     append([]string(nil), s.Domains...),
+		Repo:        strings.TrimSpace(s.Repo),
+		Branch:      strings.TrimSpace(s.Branch),
+		Path:        strings.TrimSpace(s.Path),
+		Include:     append([]string(nil), s.Include...),
+		Exclude:     append([]string(nil), s.Exclude...),
+		Schedule:    strings.TrimSpace(s.Schedule),
+	}
+}
+
+func resolveOutputDir(c *appconfig.Config, override string) string {
+	if strings.TrimSpace(override) != "" {
+		return override
+	}
+	if strings.TrimSpace(c.Storage.DocsDir) != "" {
+		return c.Storage.DocsDir
+	}
+	return filepath.Join("devour_data", "docs")
+}
@@ -37,7 +37,6 @@ Examples:
 }
 
 func init() {
-	rootCmd.AddCommand(scorecardCmd)
 	scorecardCmd.Flags().BoolVar(&scorecardCompact, "compact", false, "Generate compact banner only")
 	scorecardCmd.Flags().BoolVar(&scorecardDetailed, "detailed", false, "Generate detailed banner only")
 	scorecardCmd.Flags().StringVarP(&scorecardOutput, "output", "o", "lighthouse_scorecard", "Output filename prefix")
|
|||||||
+291
-87
@@ -2,17 +2,23 @@ package cmd
|
|||||||
|
|
||||||
import (
|
import (
|
||||||
"context"
|
"context"
|
||||||
"encoding/json"
|
"crypto/sha256"
|
||||||
|
"encoding/hex"
|
||||||
"fmt"
|
"fmt"
|
||||||
"net/url"
|
"net/url"
|
||||||
"os"
|
"os"
|
||||||
"path/filepath"
|
"path/filepath"
|
||||||
|
"sort"
|
||||||
"strings"
|
"strings"
|
||||||
"time"
|
"time"
|
||||||
|
|
||||||
"github.com/spf13/cobra"
|
"github.com/spf13/cobra"
|
||||||
"github.com/yourorg/devour/internal/markdown"
|
appconfig "github.com/yourorg/devour/internal/config"
|
||||||
|
"github.com/yourorg/devour/internal/projectstate"
|
||||||
"github.com/yourorg/devour/internal/scraper"
|
"github.com/yourorg/devour/internal/scraper"
|
||||||
|
"github.com/yourorg/devour/internal/search"
|
||||||
|
"github.com/yourorg/devour/internal/storage"
|
||||||
|
"gopkg.in/yaml.v3"
|
||||||
)
|
)
|
||||||
|
|
||||||
var scrapeCmd = &cobra.Command{
|
var scrapeCmd = &cobra.Command{
|
||||||
@@ -34,13 +40,17 @@ Supported source types:
|
|||||||
- dockerdocs: Docker (docs.docker.com)
|
- dockerdocs: Docker (docs.docker.com)
|
||||||
- cloudflaredocs: Cloudflare (developers.cloudflare.com)
|
- cloudflaredocs: Cloudflare (developers.cloudflare.com)
|
||||||
- astrodocs: Astro (docs.astro.build)
|
- astrodocs: Astro (docs.astro.build)
|
||||||
|
- localsearch: Self-hosted search API returning JSON results
|
||||||
- url: Generic web pages
|
- url: Generic web pages
|
||||||
- github: GitHub repositories
|
- github: GitHub repositories
|
||||||
|
- openapi: OpenAPI/Swagger specs
|
||||||
|
- local: Local files/directories
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
devour scrape https://pkg.go.dev/net/http --type godocs
|
devour scrape https://pkg.go.dev/net/http --type godocs
|
||||||
devour scrape https://react.dev/reference/react --type reactdocs
|
devour scrape https://react.dev/reference/react --type reactdocs
|
||||||
devour scrape https://developers.cloudflare.com/ --type cloudflaredocs
|
devour scrape https://developers.cloudflare.com/ --type cloudflaredocs
|
||||||
|
devour scrape http://127.0.0.1:8080/search --type localsearch --search-query "golang http client"
|
||||||
devour scrape --sources sources.yaml`,
|
devour scrape --sources sources.yaml`,
|
||||||
Args: cobra.MaximumNArgs(1),
|
Args: cobra.MaximumNArgs(1),
|
||||||
RunE: runScrape,
|
RunE: runScrape,
|
||||||
@@ -52,126 +62,261 @@ var (
|
|||||||
scrapeOutput string
|
scrapeOutput string
|
||||||
scrapeConcurrency int
|
scrapeConcurrency int
|
||||||
scrapeType string
|
scrapeType string
|
||||||
|
scrapeSearchQuery string
|
||||||
|
scrapeSearchLimit int
|
||||||
|
scrapeSearchDomains []string
|
||||||
|
scrapeInclude []string
|
||||||
|
scrapeExclude []string
|
||||||
|
scrapeAllowEmpty bool
|
||||||
)
|
)
|
||||||
|
|
||||||
func init() {
|
func init() {
|
||||||
scrapeCmd.Flags().StringVarP(&scrapeFormat, "format", "f", "json", "output format (json, markdown)")
|
scrapeCmd.Flags().StringVarP(&scrapeFormat, "format", "f", "json", "output format (json, markdown)")
|
||||||
scrapeCmd.Flags().StringVarP(&scrapeSources, "sources", "s", "", "YAML file with source definitions")
|
scrapeCmd.Flags().StringVarP(&scrapeSources, "sources", "s", "", "YAML file with source definitions")
|
||||||
scrapeCmd.Flags().StringVarP(&scrapeOutput, "output", "o", "", "output directory (default: devour_data/docs)")
|
scrapeCmd.Flags().StringVarP(&scrapeOutput, "output", "o", "", "output directory (default: configured docs dir)")
|
||||||
scrapeCmd.Flags().IntVar(&scrapeConcurrency, "concurrency", 10, "parallel scraping workers")
|
scrapeCmd.Flags().IntVar(&scrapeConcurrency, "concurrency", 10, "parallel scraping workers")
|
||||||
scrapeCmd.Flags().StringVarP(&scrapeType, "type", "t", "", "source type (auto-detected if not specified)")
|
scrapeCmd.Flags().StringVarP(&scrapeType, "type", "t", "", "source type (auto-detected if not specified)")
|
||||||
|
scrapeCmd.Flags().StringVar(&scrapeSearchQuery, "search-query", "", "search query for --type localsearch")
|
||||||
|
scrapeCmd.Flags().IntVar(&scrapeSearchLimit, "search-limit", 8, "max result URLs to scrape for --type localsearch")
|
||||||
|
scrapeCmd.Flags().StringSliceVar(&scrapeSearchDomains, "search-domain", nil, "restrict localsearch results to these domains (repeatable)")
|
||||||
|
scrapeCmd.Flags().StringSliceVar(&scrapeInclude, "include", nil, "include URL/file regex patterns (repeatable)")
|
||||||
|
scrapeCmd.Flags().StringSliceVar(&scrapeExclude, "exclude", nil, "exclude URL/file regex patterns (repeatable)")
|
||||||
|
scrapeCmd.Flags().BoolVar(&scrapeAllowEmpty, "allow-empty", false, "allow success when no documents were extracted")
|
||||||
}
|
}
|
||||||
|
|
||||||
func runScrape(cmd *cobra.Command, args []string) error {
|
func runScrape(cmd *cobra.Command, args []string) error {
|
||||||
|
cfg, err := loadAppConfig()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
if scrapeSources != "" {
|
if scrapeSources != "" {
|
||||||
return scrapeFromConfig(scrapeSources)
|
return scrapeFromConfig(cmd, cfg, scrapeSources)
|
||||||
}
|
}
|
||||||
|
|
||||||
if len(args) == 0 {
|
if len(args) == 0 {
|
||||||
return fmt.Errorf("source argument required when not using --sources flag")
|
return fmt.Errorf("source argument required when not using --sources flag")
|
||||||
}
|
}
|
||||||
|
|
||||||
sourceURL := args[0]
|
sourceURL := strings.TrimSpace(args[0])
|
||||||
|
|
||||||
config := &scraper.Config{
|
|
||||||
UserAgent: "Devour/1.0 (Documentation Scraper)",
|
|
||||||
Timeout: 30 * time.Second,
|
|
||||||
RetryCount: 3,
|
|
||||||
RetryDelay: 1 * time.Second,
|
|
||||||
Concurrency: scrapeConcurrency,
|
|
||||||
}
|
|
||||||
|
|
||||||
sourceType := scraper.SourceType(scrapeType)
|
sourceType := scraper.SourceType(scrapeType)
|
||||||
if sourceType == "" {
|
if sourceType == "" {
|
||||||
sourceType = detectSourceType(sourceURL)
|
sourceType = detectSourceType(sourceURL)
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Printf("Scraping: %s\n", sourceURL)
|
|
||||||
fmt.Printf(" Type: %s\n", sourceType)
|
|
||||||
fmt.Printf(" Concurrency: %d\n", scrapeConcurrency)
|
|
||||||
fmt.Println()
|
|
||||||
|
|
||||||
s := scraper.NewScraper(sourceType, config)
|
|
||||||
if s == nil {
|
|
||||||
return fmt.Errorf("unsupported source type: %s", sourceType)
|
|
||||||
}
|
|
||||||
|
|
||||||
source := &scraper.Source{
|
source := &scraper.Source{
|
||||||
Name: extractName(sourceURL),
|
Name: extractName(sourceURL),
|
||||||
Type: sourceType,
|
Type: sourceType,
|
||||||
URL: sourceURL,
|
URL: sourceURL,
|
||||||
|
Query: strings.TrimSpace(scrapeSearchQuery),
|
||||||
|
ResultLimit: scrapeSearchLimit,
|
||||||
|
Domains: append([]string(nil), scrapeSearchDomains...),
|
||||||
|
Include: append([]string(nil), scrapeInclude...),
|
||||||
|
Exclude: append([]string(nil), scrapeExclude...),
|
||||||
|
}
|
||||||
|
if sourceType == scraper.SourceTypeLocal {
|
||||||
|
source.Path = sourceURL
|
||||||
|
}
|
||||||
|
applySourceProfile(source)
|
||||||
|
|
||||||
|
outputDir := resolveOutputDir(cfg, scrapeOutput)
|
||||||
|
count, err := scrapeOne(cmd, cfg, source, outputDir)
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
}
|
}
|
||||||
|
|
||||||
ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
|
if cfg.Indexing.Enabled {
|
||||||
|
engine := search.NewEngine(cfg)
|
||||||
|
if _, err := engine.Rebuild(context.Background()); err != nil {
|
||||||
|
return fmt.Errorf("reindex after scrape: %w", err)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Printf("\n✓ Scraping complete!\n")
|
||||||
|
fmt.Printf(" Output: %s\n", outputDir)
|
||||||
|
fmt.Printf(" Documents: %d\n", count)
|
||||||
|
fmt.Println(" Run 'devour status' to inspect local index health")
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func scrapeFromConfig(cmd *cobra.Command, cfg *appconfig.Config, configPath string) error {
|
||||||
|
raw, err := os.ReadFile(configPath)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("read sources file: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
var list []appconfig.SourceConfig
|
||||||
|
if err := yaml.Unmarshal(raw, &list); err != nil || len(list) == 0 {
|
||||||
|
var wrapped struct {
|
||||||
|
Sources []appconfig.SourceConfig `yaml:"sources"`
|
||||||
|
}
|
||||||
|
if wrapErr := yaml.Unmarshal(raw, &wrapped); wrapErr != nil {
|
||||||
|
return fmt.Errorf("parse sources file: %w", wrapErr)
|
||||||
|
}
|
||||||
|
list = wrapped.Sources
|
||||||
|
}
|
||||||
|
if len(list) == 0 {
|
||||||
|
return fmt.Errorf("sources file contains no sources")
|
||||||
|
}
|
||||||
|
|
||||||
|
sort.Slice(list, func(i, j int) bool {
|
||||||
|
return list[i].Name < list[j].Name
|
||||||
|
})
|
||||||
|
|
||||||
|
outputDir := resolveOutputDir(cfg, scrapeOutput)
|
||||||
|
success := 0
|
||||||
|
failures := 0
|
||||||
|
totalDocs := 0
|
||||||
|
for _, srcCfg := range list {
|
||||||
|
source := sourceFromConfig(srcCfg)
|
||||||
|
if source.Type == "" {
|
||||||
|
if source.URL != "" {
|
||||||
|
source.Type = detectSourceType(source.URL)
|
||||||
|
} else if source.Path != "" {
|
||||||
|
source.Type = scraper.SourceTypeLocal
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if source.Name == "" {
|
||||||
|
source.Name = extractName(source.URL)
|
||||||
|
if source.Name == "unknown" && source.Path != "" {
|
||||||
|
source.Name = filepath.Base(source.Path)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
applySourceProfile(source)
|
||||||
|
|
||||||
|
fmt.Printf("\n=== Source: %s (%s) ===\n", source.Name, source.Type)
|
||||||
|
count, srcErr := scrapeOne(cmd, cfg, source, outputDir)
|
||||||
|
if srcErr != nil {
|
||||||
|
failures++
|
||||||
|
fmt.Printf("✗ %s failed: %v\n", source.Name, srcErr)
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
totalDocs += count
|
||||||
|
success++
|
||||||
|
}
|
||||||
|
|
||||||
|
if cfg.Indexing.Enabled {
|
||||||
|
engine := search.NewEngine(cfg)
|
||||||
|
if _, err := engine.Rebuild(context.Background()); err != nil {
|
||||||
|
return fmt.Errorf("reindex after scrape sources: %w", err)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Printf("\nSummary: %d succeeded, %d failed, %d docs written\n", success, failures, totalDocs)
|
||||||
|
if failures > 0 {
|
||||||
|
return fmt.Errorf("one or more sources failed")
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func scrapeOne(cmd *cobra.Command, cfg *appconfig.Config, source *scraper.Source, outputDir string) (int, error) {
|
||||||
|
if source == nil {
|
||||||
|
return 0, fmt.Errorf("source is required")
|
||||||
|
}
|
||||||
|
if source.Type == "" {
|
||||||
|
return 0, fmt.Errorf("source type is required")
|
||||||
|
}
|
||||||
|
|
||||||
|
if source.Type == scraper.SourceTypeLocalSearch && strings.TrimSpace(source.Query) == "" {
|
||||||
|
return 0, fmt.Errorf("search query is required for localsearch sources")
|
||||||
|
}
|
||||||
|
|
||||||
|
scraperConfig := toScraperConfig(cfg, scrapeConcurrency)
|
||||||
|
s := scraper.NewScraper(source.Type, scraperConfig)
|
||||||
|
if s == nil {
|
||||||
|
return 0, fmt.Errorf("unsupported source type: %s", source.Type)
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Printf("Scraping: %s\n", chooseSourceLabel(source))
|
||||||
|
fmt.Printf(" Type: %s\n", source.Type)
|
||||||
|
fmt.Printf(" Concurrency: %d\n", scraperConfig.Concurrency)
|
||||||
|
if source.Type == scraper.SourceTypeLocalSearch {
|
||||||
|
fmt.Printf(" Search query: %s\n", source.Query)
|
||||||
|
fmt.Printf(" Search limit: %d\n", source.ResultLimit)
|
||||||
|
if len(source.Domains) > 0 {
|
||||||
|
fmt.Printf(" Search domains: %s\n", strings.Join(source.Domains, ", "))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
fmt.Println()
|
||||||
|
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), scraperConfig.Timeout*2)
|
||||||
defer cancel()
|
defer cancel()
|
||||||
|
|
||||||
docs, err := s.Scrape(ctx, source)
|
docs, err := s.Scrape(ctx, source)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return fmt.Errorf("scraping failed: %w", err)
|
return 0, fmt.Errorf("scraping failed: %w", err)
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Printf("✓ Scraped %d documents\n\n", len(docs))
|
save, err := storage.SaveDocuments(docs, storage.SaveOptions{
|
||||||
|
Format: scrapeFormat,
|
||||||
if scrapeOutput == "" {
|
OutputDir: outputDir,
|
||||||
scrapeOutput = "devour_data/docs"
|
AllowEmpty: scrapeAllowEmpty,
|
||||||
}
|
PrintWriter: func(format string, args ...any) {
|
||||||
|
_, _ = fmt.Printf(format, args...)
|
||||||
if err := os.MkdirAll(scrapeOutput, 0755); err != nil {
|
},
|
||||||
return fmt.Errorf("failed to create output directory: %w", err)
|
})
|
||||||
}
|
|
||||||
|
|
||||||
for i, doc := range docs {
|
|
||||||
var filename string
|
|
||||||
var content []byte
|
|
||||||
|
|
||||||
if scrapeFormat == "markdown" {
|
|
||||||
filename = fmt.Sprintf("%s_%d.md", sanitizeFilename(doc.Title), i)
|
|
||||||
|
|
||||||
// Create enhanced markdown document
|
|
||||||
markdownDoc := &markdown.Document{
|
|
||||||
ID: doc.ID,
|
|
||||||
Source: doc.Source,
|
|
||||||
Type: string(doc.Type),
|
|
||||||
Title: doc.Title,
|
|
||||||
Content: doc.Content,
|
|
||||||
URL: doc.URL,
|
|
||||||
Metadata: doc.Metadata,
|
|
||||||
Hash: doc.Hash,
|
|
||||||
Timestamp: doc.Timestamp,
|
|
||||||
}
|
|
||||||
|
|
||||||
formatter := markdown.NewFormatter()
|
|
||||||
content = []byte(formatter.FormatWithTOC(markdownDoc))
|
|
||||||
} else {
|
|
||||||
filename = fmt.Sprintf("%s_%d.json", sanitizeFilename(doc.Title), i)
|
|
||||||
content, err = json.MarshalIndent(doc, "", " ")
|
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return fmt.Errorf("failed to marshal document: %w", err)
|
return 0, err
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
filePath := filepath.Join(scrapeOutput, filename)
|
fmt.Printf("✓ Scraped %d documents\n", save.Count)
|
||||||
if err := os.WriteFile(filePath, content, 0644); err != nil {
|
|
||||||
return fmt.Errorf("failed to write document: %w", err)
|
if err := updateSourceState(cfg, source, docs); err != nil {
|
||||||
|
return save.Count, fmt.Errorf("update source state: %w", err)
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Printf(" 📄 %s (%s)\n", filename, doc.Type)
|
return save.Count, nil
|
||||||
}
|
|
||||||
|
|
||||||
fmt.Printf("\n✓ Scraping complete!\n")
|
|
||||||
fmt.Printf(" Output: %s\n", scrapeOutput)
|
|
||||||
fmt.Println(" Run 'devour status' to see indexed documents")
|
|
||||||
|
|
||||||
return nil
|
|
||||||
}
|
}
|
||||||
|
|
||||||
func scrapeFromConfig(configPath string) error {
|
func updateSourceState(cfg *appconfig.Config, source *scraper.Source, docs []*scraper.Document) error {
|
||||||
return fmt.Errorf("scraping from config file not yet implemented")
|
state, err := projectstate.LoadSourceState(cfg.Storage.MetadataDir)
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
key := source.Name
|
||||||
|
if key == "" {
|
||||||
|
key = chooseSourceLabel(source)
|
||||||
|
}
|
||||||
|
|
||||||
|
h := sha256.New()
|
||||||
|
for _, d := range docs {
|
||||||
|
if d == nil {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
fmt.Fprintf(h, "%s|%s|%s\n", d.ID, d.Hash, d.URL)
|
||||||
|
}
|
||||||
|
state.Sources[key] = &projectstate.SourceState{
|
||||||
|
Name: source.Name,
|
||||||
|
Type: string(source.Type),
|
||||||
|
URL: source.URL,
|
||||||
|
Hash: hex.EncodeToString(h.Sum(nil)),
|
||||||
|
LastSync: time.Now(),
|
||||||
|
DocCount: len(docs),
|
||||||
|
}
|
||||||
|
|
||||||
|
return projectstate.SaveSourceState(cfg.Storage.MetadataDir, state)
|
||||||
|
}
|
||||||
|
|
||||||
|
func chooseSourceLabel(source *scraper.Source) string {
|
||||||
|
if strings.TrimSpace(source.URL) != "" {
|
||||||
|
return source.URL
|
||||||
|
}
|
||||||
|
if strings.TrimSpace(source.Path) != "" {
|
||||||
|
return source.Path
|
||||||
|
}
|
||||||
|
if strings.TrimSpace(source.Repo) != "" {
|
||||||
|
return source.Repo
|
||||||
|
}
|
||||||
|
return source.Name
|
||||||
}
|
}
|
||||||
|
|
||||||
func detectSourceType(sourceURL string) scraper.SourceType {
|
func detectSourceType(sourceURL string) scraper.SourceType {
|
||||||
u, err := url.Parse(sourceURL)
|
u, err := url.Parse(sourceURL)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
|
if sourceURL != "" && !strings.HasPrefix(sourceURL, "http://") && !strings.HasPrefix(sourceURL, "https://") {
|
||||||
|
return scraper.SourceTypeLocal
|
||||||
|
}
|
||||||
return scraper.SourceTypeWeb
|
return scraper.SourceTypeWeb
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -208,6 +353,11 @@ func detectSourceType(sourceURL string) scraper.SourceType {
|
|||||||
return scraper.SourceTypeAstroDocs
|
return scraper.SourceTypeAstroDocs
|
||||||
case host == "github.com":
|
case host == "github.com":
|
||||||
return scraper.SourceTypeGitHub
|
return scraper.SourceTypeGitHub
|
||||||
|
case strings.HasSuffix(path, ".json") || strings.HasSuffix(path, ".yaml") || strings.HasSuffix(path, ".yml"):
|
||||||
|
if strings.Contains(strings.ToLower(path), "openapi") || strings.Contains(strings.ToLower(path), "swagger") {
|
||||||
|
return scraper.SourceTypeOpenAPI
|
||||||
|
}
|
||||||
|
return scraper.SourceTypeWeb
|
||||||
default:
|
default:
|
||||||
return scraper.SourceTypeWeb
|
return scraper.SourceTypeWeb
|
||||||
}
|
}
|
||||||
@@ -216,27 +366,81 @@ func detectSourceType(sourceURL string) scraper.SourceType {
|
|||||||
func extractName(sourceURL string) string {
|
func extractName(sourceURL string) string {
|
||||||
u, err := url.Parse(sourceURL)
|
u, err := url.Parse(sourceURL)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
|
if strings.TrimSpace(sourceURL) != "" {
|
||||||
|
return filepath.Base(sourceURL)
|
||||||
|
}
|
||||||
return "unknown"
|
return "unknown"
|
||||||
}
|
}
|
||||||
|
|
||||||
parts := strings.Split(strings.Trim(u.Path, "/"), "/")
|
parts := strings.Split(strings.Trim(u.Path, "/"), "/")
|
||||||
if len(parts) > 0 {
|
if len(parts) > 0 && strings.TrimSpace(parts[len(parts)-1]) != "" {
|
||||||
return parts[len(parts)-1]
|
return parts[len(parts)-1]
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if strings.TrimSpace(u.Host) != "" {
|
||||||
return u.Host
|
return u.Host
|
||||||
|
}
|
||||||
|
return "unknown"
|
||||||
}
|
}
|
||||||
|
|
||||||
func sanitizeFilename(name string) string {
|
func applySourceProfile(source *scraper.Source) {
|
||||||
name = strings.ToLower(name)
|
if source == nil {
|
||||||
name = strings.ReplaceAll(name, " ", "_")
|
return
|
||||||
name = strings.ReplaceAll(name, "/", "_")
|
}
|
||||||
name = strings.ReplaceAll(name, ":", "_")
|
if source.Type != scraper.SourceTypeWeb && source.Type != scraper.SourceTypeLocalSearch {
|
||||||
name = strings.ReplaceAll(name, ".", "_")
|
return
|
||||||
|
}
|
||||||
if len(name) > 50 {
|
if strings.TrimSpace(source.URL) == "" {
|
||||||
name = name[:50]
|
return
|
||||||
}
|
}
|
||||||
|
|
||||||
return name
|
u, err := url.Parse(source.URL)
|
||||||
|
if err != nil {
|
||||||
|
return
|
||||||
|
}
|
||||||
|
host := strings.ToLower(u.Host)
|
||||||
|
if host == "" {
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
// Preserve explicit user-provided patterns.
|
||||||
|
if len(source.Include) > 0 || len(source.Exclude) > 0 {
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
switch {
|
||||||
|
case strings.Contains(host, "learn.microsoft.com"):
|
||||||
|
source.Include = []string{`/dotnet/`, `/csharp/`, `/base-types/`}
|
||||||
|
source.Exclude = []string{`/previous-versions/`, `/answers/`, `/support/`, `/training/`, `/events/`, `/products/`}
|
||||||
|
case strings.Contains(host, "kotlinlang.org"):
|
||||||
|
source.Include = []string{`/docs/`}
|
||||||
|
source.Exclude = []string{`/community/`, `/api/`, `/releases/`}
|
||||||
|
case strings.Contains(host, "php.net"):
|
||||||
|
source.Include = []string{`/manual/en/`}
|
||||||
|
source.Exclude = []string{`/manual/(de|fr|es|ja|ru|pt)/`, `/downloads.php`, `/bugs.php`}
|
||||||
|
case strings.Contains(host, "ruby-doc.org"):
|
||||||
|
source.Include = []string{`/core/`}
|
||||||
|
source.Exclude = []string{`/stdlib/`, `/gems/`}
|
||||||
|
case strings.Contains(host, "hexdocs.pm"):
|
||||||
|
source.Include = []string{`/elixir/`}
|
||||||
|
source.Exclude = []string{`/phoenix/`, `/ecto/`}
|
||||||
|
case strings.Contains(host, "nextjs.org"):
|
||||||
|
source.Include = []string{`/docs/`}
|
||||||
|
source.Exclude = []string{`/showcase`, `/blog`, `/learn/`, `/pricing`}
|
||||||
|
case strings.Contains(host, "svelte.dev"):
|
||||||
|
source.Include = []string{`/docs/`}
|
||||||
|
source.Exclude = []string{`/playground`, `/tutorial`, `/blog`}
|
||||||
|
case strings.Contains(host, "angular.dev"):
|
||||||
|
source.Include = []string{`/guide/`, `/api/`, `/tutorials/`}
|
||||||
|
source.Exclude = []string{`/resources/`, `/playground`}
|
||||||
|
case strings.Contains(host, "remix.run"):
|
||||||
|
source.Include = []string{`/docs/`}
|
||||||
|
source.Exclude = []string{`/blog`, `/conf`, `/merch`}
|
||||||
|
case strings.Contains(host, "solidjs.com"):
|
||||||
|
source.Include = []string{`/docs/`}
|
||||||
|
source.Exclude = []string{`/community`, `/showcase`, `/blog`}
|
||||||
|
case strings.Contains(host, "expressjs.com"):
|
||||||
|
source.Include = []string{`/en/(guide|api|advanced)/`}
|
||||||
|
source.Exclude = []string{`/en/starter/`, `/cn/`, `/fr/`, `/es/`, `/de/`}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
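Review note on updateSourceState: the per-source hash is a SHA-256 digest over one "ID|Hash|URL" line per document, with nil entries skipped. The sketch below isolates that step so its properties are easier to see. The doc type and the fingerprint helper are hypothetical stand-ins for scraper.Document and the inline loop in the diff, not names from the repository.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// doc is a hypothetical stand-in for scraper.Document, reduced to the
// three fields that feed the fingerprint.
type doc struct {
	ID, Hash, URL string
}

// fingerprint writes one "ID|Hash|URL" line per non-nil document, in slice
// order, into a SHA-256 hash and returns the hex-encoded digest.
func fingerprint(docs []*doc) string {
	h := sha256.New()
	for _, d := range docs {
		if d == nil {
			continue
		}
		fmt.Fprintf(h, "%s|%s|%s\n", d.ID, d.Hash, d.URL)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := &doc{ID: "1", Hash: "aa", URL: "https://example.com/a"}
	b := &doc{ID: "2", Hash: "bb", URL: "https://example.com/b"}

	// Nil entries do not contribute to the digest.
	fmt.Println(fingerprint([]*doc{a, b}) == fingerprint([]*doc{a, nil, b}))
	// The digest depends on document order.
	fmt.Println(fingerprint([]*doc{a, b}) == fingerprint([]*doc{b, a}))
}
```

Because the digest is order-sensitive, two scrapes that return the same documents in a different order would produce different hashes. If the stored hash is meant to detect content changes only, sorting the documents by ID before hashing would be one way to make it stable.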
@@ -0,0 +1,56 @@
|
|||||||
|
package cmd
|
||||||
|
|
||||||
|
import (
|
||||||
|
"net/http"
|
||||||
|
"net/http/httptest"
|
||||||
|
"os"
|
||||||
|
"path/filepath"
|
||||||
|
"strings"
|
||||||
|
"testing"
|
||||||
|
|
||||||
|
appconfig "github.com/yourorg/devour/internal/config"
|
||||||
|
)
|
||||||
|
|
||||||
|
func TestScrapeFromConfig(t *testing.T) {
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
w.Header().Set("Content-Type", "text/html")
|
||||||
|
_, _ = w.Write([]byte("<html><head><title>Docs</title></head><body><main>" + strings.Repeat("docs content ", 30) + "</main></body></html>"))
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
tmp := t.TempDir()
|
||||||
|
cfg := appconfig.Default()
|
||||||
|
cfg.Storage.DocsDir = filepath.Join(tmp, "docs")
|
||||||
|
cfg.Storage.IndexDir = filepath.Join(tmp, "index")
|
||||||
|
cfg.Storage.MetadataDir = filepath.Join(tmp, "metadata")
|
||||||
|
cfg.Storage.CacheDir = filepath.Join(tmp, "cache")
|
||||||
|
if err := cfg.EnsureStorageDirs(); err != nil {
|
||||||
|
t.Fatal(err)
|
||||||
|
}
|
||||||
|
|
||||||
|
sourcesPath := filepath.Join(tmp, "sources.yaml")
|
||||||
|
yaml := "- name: demo\n type: url\n url: " + srv.URL + "\n"
|
||||||
|
if err := os.WriteFile(sourcesPath, []byte(yaml), 0o644); err != nil {
|
||||||
|
t.Fatal(err)
|
||||||
|
}
|
||||||
|
|
||||||
|
oldFormat, oldOutput, oldAllow := scrapeFormat, scrapeOutput, scrapeAllowEmpty
|
||||||
|
scrapeFormat = "json"
|
||||||
|
scrapeOutput = cfg.Storage.DocsDir
|
||||||
|
scrapeAllowEmpty = false
|
||||||
|
defer func() {
|
||||||
|
scrapeFormat, scrapeOutput, scrapeAllowEmpty = oldFormat, oldOutput, oldAllow
|
||||||
|
}()
|
||||||
|
|
||||||
|
if err := scrapeFromConfig(nil, cfg, sourcesPath); err != nil {
|
||||||
|
t.Fatalf("scrapeFromConfig failed: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
entries, err := os.ReadDir(cfg.Storage.DocsDir)
|
||||||
|
if err != nil {
|
||||||
|
t.Fatal(err)
|
||||||
|
}
|
||||||
|
if len(entries) == 0 {
|
||||||
|
t.Fatal("expected scraped files")
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,39 @@
|
|||||||
|
package cmd
|
||||||
|
|
||||||
|
import (
|
||||||
|
"testing"
|
||||||
|
|
||||||
|
"github.com/yourorg/devour/internal/scraper"
|
||||||
|
)
|
||||||
|
|
||||||
|
func TestDetectSourceType(t *testing.T) {
|
||||||
|
tests := []struct {
|
||||||
|
url string
|
||||||
|
wantType scraper.SourceType
|
||||||
|
}{
|
||||||
|
{"https://pkg.go.dev/net/http", scraper.SourceTypeGoDocs},
|
||||||
|
{"https://docs.rs/tokio/latest/tokio/", scraper.SourceTypeRustDocs},
|
||||||
|
{"https://docs.python.org/3/library/asyncio.html", scraper.SourceTypePythonDocs},
|
||||||
|
{"https://docs.oracle.com/javase/8/docs/api/java/util/List.html", scraper.SourceTypeJavaDocs},
|
||||||
|
{"https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/", scraper.SourceTypeSpringDocs},
|
||||||
|
{"https://www.typescriptlang.org/docs/handbook/2/basic-types.html", scraper.SourceTypeTSDocs},
|
||||||
|
{"https://react.dev/reference/react", scraper.SourceTypeReactDocs},
|
||||||
|
{"https://vuejs.org/guide/introduction.html", scraper.SourceTypeVueDocs},
|
||||||
|
{"https://nuxt.com/docs/guide/directory-structure", scraper.SourceTypeNuxtDocs},
|
||||||
|
{"https://docs.docker.com/compose", scraper.SourceTypeDockerDocs},
|
||||||
|
{"https://hub.docker.com/mcp/server/github", scraper.SourceTypeMCPDocs},
|
||||||
|
{"https://developers.cloudflare.com/workers", scraper.SourceTypeCloudflareDocs},
|
||||||
|
{"https://docs.astro.build/en/guides/components/", scraper.SourceTypeAstroDocs},
|
||||||
|
{"https://github.com/yourorg/devour", scraper.SourceTypeGitHub},
|
||||||
|
{"https://example.com/docs", scraper.SourceTypeWeb},
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, tt := range tests {
|
||||||
|
t.Run(tt.url, func(t *testing.T) {
|
||||||
|
got := detectSourceType(tt.url)
|
||||||
|
if got != tt.wantType {
|
||||||
|
t.Fatalf("detectSourceType(%q) = %q, want %q", tt.url, got, tt.wantType)
|
||||||
|
}
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
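Review note: the same commit changes extractName to fall back from the last path segment to the host and then to "unknown", but no test in this diff covers it. The sketch below reproduces the function body from the new side of the scrape.go diff and exercises that fallback order. The sample inputs are illustrative and not taken from the repository.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
	"strings"
)

// extractName is reproduced from the new side of the scrape.go diff.
// It prefers the last non-empty path segment, then the host, then "unknown".
func extractName(sourceURL string) string {
	u, err := url.Parse(sourceURL)
	if err != nil {
		if strings.TrimSpace(sourceURL) != "" {
			return filepath.Base(sourceURL)
		}
		return "unknown"
	}

	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) > 0 && strings.TrimSpace(parts[len(parts)-1]) != "" {
		return parts[len(parts)-1]
	}

	if strings.TrimSpace(u.Host) != "" {
		return u.Host
	}
	return "unknown"
}

func main() {
	// Illustrative inputs: a URL with a path, a bare host, a relative
	// local path, and an empty string.
	for _, in := range []string{
		"https://pkg.go.dev/net/http",
		"https://example.com/",
		"./docs/guide",
		"",
	} {
		fmt.Printf("%q -> %q\n", in, extractName(in))
	}
}
```

In the repository these cases would fit a table-driven test next to TestDetectSourceType, using the same t.Run pattern.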
+185
-27
@@ -1,25 +1,29 @@
|
|||||||
package cmd
|
package cmd
|
||||||
|
|
||||||
import (
|
import (
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
"fmt"
|
"fmt"
|
||||||
|
"strings"
|
||||||
|
|
||||||
"github.com/spf13/cobra"
|
"github.com/spf13/cobra"
|
||||||
|
"github.com/yourorg/devour/internal/projectstate"
|
||||||
|
"github.com/yourorg/devour/internal/scraper"
|
||||||
|
"github.com/yourorg/devour/internal/search"
|
||||||
|
"github.com/yourorg/devour/internal/server"
|
||||||
)
|
)
|
||||||
|
|
||||||
var serveCmd = &cobra.Command{
|
var serveCmd = &cobra.Command{
|
||||||
Use: "serve",
|
Use: "serve",
|
||||||
Short: "Start the MCP server",
|
Short: "Start the local Devour RPC server",
|
||||||
Long: `Start the Devour MCP server.
|
Long: `Start the Devour RPC server.
|
||||||
|
|
||||||
In local mode (default), the server communicates via stdio, making it
|
Local mode (default): JSON-RPC over stdin/stdout for agent/skill integration.
|
||||||
suitable for use as an OpenCode skill.
|
Remote mode (--remote): experimental HTTP RPC endpoint at /rpc.
|
||||||
|
|
||||||
In remote mode (--remote flag), the server listens on HTTP and exposes
|
|
||||||
a REST API for multi-user access.
|
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
devour serve # Local mode (stdio)
|
devour serve
|
||||||
devour serve --remote # Remote mode on default port
|
devour serve --remote
|
||||||
devour serve --remote --port 3000`,
|
devour serve --remote --port 3000`,
|
||||||
RunE: runServe,
|
RunE: runServe,
|
||||||
}
|
}
|
||||||
@@ -31,31 +35,185 @@ var (
|
|||||||
)
|
)
|
||||||
|
|
||||||
func init() {
|
func init() {
|
||||||
serveCmd.Flags().BoolVar(&serveRemote, "remote", false, "run as remote HTTP server")
|
serveCmd.Flags().BoolVar(&serveRemote, "remote", false, "run as remote HTTP server (experimental)")
|
||||||
serveCmd.Flags().IntVarP(&servePort, "port", "p", 8080, "HTTP port (remote mode only)")
|
serveCmd.Flags().IntVarP(&servePort, "port", "p", 8080, "HTTP port (remote mode only)")
|
||||||
serveCmd.Flags().StringVar(&serveHost, "host", "localhost", "HTTP host (remote mode only)")
|
serveCmd.Flags().StringVar(&serveHost, "host", "localhost", "HTTP host (remote mode only)")
|
||||||
}
|
}
|
||||||
|
|
||||||
func runServe(cmd *cobra.Command, args []string) error {
|
func runServe(cmd *cobra.Command, args []string) error {
|
||||||
if serveRemote {
|
if _, err := loadAppConfig(); err != nil {
|
||||||
fmt.Printf("🚀 Starting Devour server in remote mode\n")
|
return err
|
||||||
fmt.Printf(" Host: %s\n", serveHost)
|
|
||||||
fmt.Printf(" Port: %d\n", servePort)
|
|
||||||
fmt.Printf(" URL: http://%s:%d\n", serveHost, servePort)
|
|
||||||
|
|
||||||
// TODO: Start HTTP MCP server
|
|
||||||
return fmt.Errorf("remote mode not yet implemented")
|
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Println("🚀 Starting Devour server in local mode (stdio)")
|
srvCfg := &server.Config{
|
||||||
fmt.Println(" Communicating via JSON-RPC over stdin/stdout")
|
Mode: "local",
|
||||||
|
Transport: "stdio",
|
||||||
|
Host: serveHost,
|
||||||
|
Port: servePort,
|
||||||
|
Handler: func(ctx context.Context, method string, params json.RawMessage) (any, error) {
|
||||||
|
return handleServeMethod(ctx, method, params)
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
// TODO: Start stdio MCP server
|
if serveRemote {
|
||||||
// Should handle JSON-RPC messages for:
|
srvCfg.Mode = "remote"
|
||||||
// - devour_query
|
fmt.Printf("🚀 Starting Devour RPC server in remote experimental mode\n")
|
||||||
// - devour_add
|
fmt.Printf(" URL: http://%s:%d/rpc\n", serveHost, servePort)
|
||||||
// - devour_status
|
} else {
|
||||||
// - devour_sync
|
fmt.Println("🚀 Starting Devour RPC server in local mode (stdio)")
|
||||||
|
fmt.Println(" Protocol: JSON-RPC 2.0 over stdin/stdout")
|
||||||
|
}
|
||||||
|
|
||||||
return fmt.Errorf("local mode not yet implemented")
|
srv := server.NewServer(srvCfg)
|
||||||
|
return srv.Start(context.Background())
|
||||||
|
}
|
||||||
|
|
||||||
|
func handleServeMethod(ctx context.Context, method string, params json.RawMessage) (any, error) {
|
||||||
|
// The method implementation needs full typed config. Load per-call to avoid stale state.
|
||||||
|
loadedCfg, err := loadAppConfig()
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
|
||||||
|
switch strings.TrimSpace(method) {
|
||||||
|
case "devour_query":
|
||||||
|
var req struct {
|
||||||
|
Query string `json:"query"`
|
||||||
|
Limit int `json:"limit"`
|
||||||
|
Threshold float64 `json:"threshold"`
|
||||||
|
}
|
||||||
|
if len(params) > 0 {
|
||||||
|
_ = json.Unmarshal(params, &req)
|
||||||
|
}
|
||||||
|
engine := search.NewEngine(loadedCfg)
|
||||||
|
results, stats, err := engine.Search(ctx, req.Query, search.SearchOptions{Limit: req.Limit, Threshold: req.Threshold})
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
return map[string]any{"query": req.Query, "count": len(results), "results": results, "indexed": stats.Documents}, nil
|
||||||
|
|
||||||
|
case "devour_status":
|
||||||
|
docsStats, err := projectstate.CollectDocsStats(loadedCfg.Storage.DocsDir)
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
state, _ := projectstate.LoadSourceState(loadedCfg.Storage.MetadataDir)
|
||||||
|
engine := search.NewEngine(loadedCfg)
|
||||||
|
idxStats, _ := engine.EnsureIndexed(ctx)
|
||||||
|
return map[string]any{
|
||||||
|
"documents": docsStats.DocumentCount,
|
||||||
|
"storage_bytes": docsStats.StorageBytes,
|
||||||
|
"last_updated": docsStats.LastUpdated,
|
||||||
|
"sources": state.Sources,
|
||||||
|
"indexed_docs": idxStats.Documents,
|
||||||
|
"index_timestamp": idxStats.LastIndexedAt,
|
||||||
|
}, nil
|
||||||
|
|
||||||
|
case "devour_scrape":
|
||||||
|
var req struct {
|
||||||
|
Source string `json:"source"`
|
||||||
|
Type string `json:"type"`
|
||||||
|
Format string `json:"format"`
|
||||||
|
Output string `json:"output"`
|
||||||
|
Query string `json:"query"`
|
||||||
|
ResultLimit int `json:"result_limit"`
|
||||||
|
Domains []string `json:"domains"`
|
||||||
|
Include []string `json:"include"`
|
||||||
|
Exclude []string `json:"exclude"`
|
||||||
|
}
|
||||||
|
if err := json.Unmarshal(params, &req); err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
if strings.TrimSpace(req.Source) == "" {
|
||||||
|
return nil, fmt.Errorf("source is required")
|
||||||
|
}
|
||||||
|
st := scraper.SourceType(req.Type)
|
||||||
|
if st == "" {
|
||||||
|
st = detectSourceType(req.Source)
|
||||||
|
}
|
||||||
|
source := &scraper.Source{
|
||||||
|
Name: extractName(req.Source),
|
||||||
|
Type: st,
|
||||||
|
URL: req.Source,
|
||||||
|
Query: strings.TrimSpace(req.Query),
|
||||||
|
ResultLimit: req.ResultLimit,
|
||||||
|
Domains: append([]string(nil), req.Domains...),
|
||||||
|
Include: append([]string(nil), req.Include...),
|
||||||
|
Exclude: append([]string(nil), req.Exclude...),
|
||||||
|
}
|
||||||
|
if st == scraper.SourceTypeLocal {
|
||||||
|
source.Path = req.Source
|
||||||
|
}
|
||||||
|
applySourceProfile(source)
|
||||||
|
prevFormat := scrapeFormat
|
||||||
|
prevOutput := scrapeOutput
|
||||||
|
prevAllowEmpty := scrapeAllowEmpty
|
||||||
|
scrapeFormat = coalesceString(req.Format, "json")
|
||||||
|
scrapeOutput = req.Output
|
||||||
|
scrapeAllowEmpty = false
|
||||||
|
count, err := scrapeOne(nil, loadedCfg, source, resolveOutputDir(loadedCfg, req.Output))
|
||||||
|
scrapeFormat = prevFormat
|
||||||
|
scrapeOutput = prevOutput
|
||||||
|
scrapeAllowEmpty = prevAllowEmpty
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
return map[string]any{"source": req.Source, "type": st, "documents": count}, nil
|
||||||
|
|
||||||
|
case "devour_ask":
|
||||||
|
var req struct {
|
||||||
|
Question string `json:"question"`
|
||||||
|
Limit int `json:"limit"`
|
||||||
|
}
|
||||||
|
if err := json.Unmarshal(params, &req); err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
if strings.TrimSpace(req.Question) == "" {
|
||||||
|
return nil, fmt.Errorf("question is required")
|
||||||
|
}
|
||||||
|
limit := req.Limit
|
||||||
|
if limit <= 0 {
|
||||||
|
limit = 5
|
||||||
|
}
|
||||||
|
engine := search.NewEngine(loadedCfg)
|
||||||
|
results, _, err := engine.Search(ctx, req.Question, search.SearchOptions{Limit: limit})
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
summary := "No relevant docs found."
|
||||||
|
if len(results) > 0 {
|
||||||
|
summary = results[0].Snippet
|
||||||
|
}
|
||||||
|
return map[string]any{"question": req.Question, "summary": summary, "sources": results}, nil
|
||||||
|
|
||||||
|
case "devour_sync":
|
||||||
|
prevForce, prevRebuild, prevSource := syncForce, syncRebuild, syncSource
|
||||||
|
var req struct {
|
||||||
|
Source string `json:"source"`
|
||||||
|
Force bool `json:"force"`
|
||||||
|
Rebuild bool `json:"rebuild"`
|
||||||
|
}
|
||||||
|
if len(params) > 0 {
|
||||||
|
_ = json.Unmarshal(params, &req)
|
||||||
|
}
|
||||||
|
syncForce = req.Force
|
||||||
|
syncRebuild = req.Rebuild
|
||||||
|
syncSource = req.Source
|
||||||
|
err := runSync(nil, nil)
|
||||||
|
syncForce, syncRebuild, syncSource = prevForce, prevRebuild, prevSource
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
return map[string]any{"ok": true}, nil
|
||||||
|
|
||||||
|
default:
|
||||||
|
return nil, fmt.Errorf("unknown method: %s", method)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func coalesceString(primary, fallback string) string {
|
||||||
|
if strings.TrimSpace(primary) != "" {
|
||||||
|
return primary
|
||||||
|
}
|
||||||
|
return fallback
|
||||||
}
|
}
|
||||||
|
|||||||
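Review note: the `devour_scrape` and `devour_sync` cases in `handleServeMethod` override package-level flag variables (`scrapeFormat`, `scrapeOutput`, `scrapeAllowEmpty`, `syncForce`, ...) and restore them by hand after the call. The following is a minimal sketch, not the project's code, of the same save/override/restore pattern using `defer`, so the previous value comes back even if the wrapped call panics. `withScrapeFormat` is a hypothetical helper introduced only for illustration, and the `scrapeFormat` variable here is a stand-in for the real one.

```go
package main

import "fmt"

// scrapeFormat stands in for a package-level flag variable like the one the
// handler overrides; it is redefined here only so the sketch is self-contained.
var scrapeFormat = "markdown"

// withScrapeFormat (hypothetical helper) sets scrapeFormat for the duration
// of fn and restores the previous value when fn returns or panics.
func withScrapeFormat(format string, fn func() error) error {
	prev := scrapeFormat
	scrapeFormat = format
	defer func() { scrapeFormat = prev }()
	return fn()
}

func main() {
	err := withScrapeFormat("json", func() error {
		fmt.Println("during call:", scrapeFormat)
		return fmt.Errorf("simulated scrape failure")
	})
	fmt.Println("after call:", scrapeFormat, "err:", err)
}
```

Like the original code, this sketch is not safe if two requests run concurrently, because both would write the same global. Passing the options as arguments would avoid that, but it would require changing the signatures of the functions being called.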
+86
-17
@@ -1,10 +1,13 @@
|
|||||||
package cmd
|
package cmd
|
||||||
|
|
||||||
import (
|
import (
|
||||||
|
"context"
|
||||||
"fmt"
|
"fmt"
|
||||||
"time"
|
"time"
|
||||||
|
|
||||||
"github.com/spf13/cobra"
|
"github.com/spf13/cobra"
|
||||||
|
"github.com/yourorg/devour/internal/projectstate"
|
||||||
|
"github.com/yourorg/devour/internal/search"
|
||||||
"github.com/yourorg/devour/internal/ui"
|
"github.com/yourorg/devour/internal/ui"
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -23,39 +26,105 @@ Shows:
|
|||||||
}
|
}
|
||||||
|
|
||||||
func runStatus(cmd *cobra.Command, args []string) error {
|
func runStatus(cmd *cobra.Command, args []string) error {
|
||||||
|
cfg, err := loadAppConfig()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
// Print the small character mascot
|
// Print the small character mascot
|
||||||
ui.PrintCharacterSmall()
|
ui.PrintCharacterSmall()
|
||||||
fmt.Println()
|
fmt.Println()
|
||||||
|
|
||||||
ui.PrintHeader("Devour Status")
|
ui.PrintHeader("Devour Status")
|
||||||
|
|
||||||
// TODO: Implement actual status check
|
docsStats, err := projectstate.CollectDocsStats(cfg.Storage.DocsDir)
|
||||||
// Check:
|
if err != nil {
|
||||||
// - Index existence and health
|
return err
|
||||||
// - Document count
|
}
|
||||||
// - Vector count
|
|
||||||
// - Last sync time
|
|
||||||
// - Source status
|
|
||||||
|
|
||||||
// Placeholder status
|
engine := search.NewEngine(cfg)
|
||||||
ui.PrintKeyValue("Index Health", "⚠️ Not initialized")
|
indexStats, indexErr := engine.EnsureIndexed(context.Background())
|
||||||
ui.PrintKeyValue("Documents", "0 indexed")
|
indexHealth := "✓ Healthy"
|
||||||
ui.PrintKeyValue("Chunks", "0 total")
|
if indexErr != nil {
|
||||||
ui.PrintKeyValue("Vector Dimension", "1536")
|
if docsStats.DocumentCount == 0 {
|
||||||
ui.PrintKeyValue("Last Updated", "Never")
|
indexHealth = "⚠️ No docs indexed yet"
|
||||||
ui.PrintKeyValue("Storage Used", "0 MB")
|
} else {
|
||||||
|
indexHealth = "✗ Index error"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
lastUpdated := "Never"
|
||||||
|
if !docsStats.LastUpdated.IsZero() {
|
||||||
|
lastUpdated = docsStats.LastUpdated.Format(time.RFC3339)
|
||||||
|
}
|
||||||
|
|
||||||
|
chunks := 0
|
||||||
|
if indexStats != nil {
|
||||||
|
chunks = indexStats.Documents
|
||||||
|
}
|
||||||
|
|
||||||
|
ui.PrintKeyValue("Index Health", indexHealth)
|
||||||
|
ui.PrintKeyValue("Documents", fmt.Sprintf("%d indexed", docsStats.DocumentCount))
|
||||||
|
ui.PrintKeyValue("Chunks", fmt.Sprintf("%d total", chunks))
|
||||||
|
ui.PrintKeyValue("Vector Dimension", fmt.Sprintf("%d", cfg.Embeddings.Dimensions))
|
||||||
|
ui.PrintKeyValue("Last Updated", lastUpdated)
|
||||||
|
ui.PrintKeyValue("Storage Used", humanSize(docsStats.StorageBytes))
|
||||||
|
|
||||||
fmt.Println()
|
fmt.Println()
|
||||||
ui.PrintSection("Sources")
|
ui.PrintSection("Sources")
|
||||||
ui.PrintInfo(" None configured")
|
state, stateErr := projectstate.LoadSourceState(cfg.Storage.MetadataDir)
|
||||||
|
if stateErr != nil || len(state.Sources) == 0 {
|
||||||
|
ui.PrintInfo(" None tracked yet")
|
||||||
|
} else {
|
||||||
|
keys := make([]string, 0, len(state.Sources))
|
||||||
|
for k := range state.Sources {
|
||||||
|
keys = append(keys, k)
|
||||||
|
}
|
||||||
|
sortStrings(keys)
|
||||||
|
for _, k := range keys {
|
||||||
|
s := state.Sources[k]
|
||||||
|
last := "never"
|
||||||
|
if !s.LastSync.IsZero() {
|
||||||
|
last = s.LastSync.Format("2006-01-02 15:04:05")
|
||||||
|
}
|
||||||
|
fmt.Printf(" • %s (%s): %d docs, last sync %s\n", s.Name, s.Type, s.DocCount, last)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
fmt.Println()
|
fmt.Println()
|
||||||
ui.PrintSection("Next Steps")
|
ui.PrintSection("Next Steps")
|
||||||
fmt.Println(" 1. Run 'devour init' to initialize")
|
if docsStats.DocumentCount == 0 {
|
||||||
fmt.Println(" 2. Run 'devour scrape <source>' to index documents")
|
fmt.Println(" 1. Run 'devour scrape <source>' to index documentation")
|
||||||
|
fmt.Println(" 2. Run 'devour query \"<topic>\"' to search indexed docs")
|
||||||
|
} else {
|
||||||
|
fmt.Println(" 1. Run 'devour query \"<topic>\"' for local docs search")
|
||||||
|
fmt.Println(" 2. Run 'devour ask --lang <lang> \"<question>\"' for structured answers")
|
||||||
|
}
|
||||||
|
if indexErr != nil {
|
||||||
|
fmt.Printf(" ⚠️ Index note: %v\n", indexErr)
|
||||||
|
}
|
||||||
|
|
||||||
// Show when check happened
|
// Show when check happened
|
||||||
fmt.Printf("\nStatus as of: %s\n", time.Now().Format(time.RFC3339))
|
fmt.Printf("\nStatus as of: %s\n", time.Now().Format(time.RFC3339))
|
||||||
|
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func humanSize(b int64) string {
|
||||||
|
const mb = 1024 * 1024
|
||||||
|
if b < mb {
|
||||||
|
return fmt.Sprintf("%d KB", b/1024)
|
||||||
|
}
|
||||||
|
return fmt.Sprintf("%.2f MB", float64(b)/float64(mb))
|
||||||
|
}
|
||||||
|
|
||||||
|
func sortStrings(values []string) {
|
||||||
|
if len(values) < 2 {
|
||||||
|
return
|
||||||
|
}
|
||||||
|
for i := 1; i < len(values); i++ {
|
||||||
|
for j := i; j > 0 && values[j] < values[j-1]; j-- {
|
||||||
|
values[j], values[j-1] = values[j-1], values[j]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|||||||
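Review note: `sortStrings` in this file is a hand-written insertion sort over the source keys. The sketch below shows the same key-collection step with the standard library's `sort.Strings`, which produces the same ascending order. `sortedKeys` and the `map[string]int` value type are illustrative assumptions; the real map holds `*projectstate.SourceState` values.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys (hypothetical helper) returns the keys of m in ascending
// lexical order, mirroring the key collection done before printing sources.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	sources := map[string]int{"react": 3, "go": 1, "python": 2}
	fmt.Println(sortedKeys(sources)) // prints [go python react]
}
```

Sorting the keys matters because Go randomizes map iteration order, so without it the source list could print in a different order on each run.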
+157
-17
@@ -1,9 +1,18 @@
|
|||||||
package cmd
|
package cmd
|
||||||
|
|
||||||
import (
|
import (
|
||||||
|
"context"
|
||||||
|
"crypto/sha256"
|
||||||
|
"encoding/hex"
|
||||||
"fmt"
|
"fmt"
|
||||||
|
"strings"
|
||||||
|
"time"
|
||||||
|
|
||||||
"github.com/spf13/cobra"
|
"github.com/spf13/cobra"
|
||||||
|
"github.com/yourorg/devour/internal/projectstate"
|
||||||
|
"github.com/yourorg/devour/internal/scraper"
|
||||||
|
"github.com/yourorg/devour/internal/search"
|
||||||
|
"github.com/yourorg/devour/internal/storage"
|
||||||
)
|
)
|
||||||
|
|
||||||
var syncCmd = &cobra.Command{
|
var syncCmd = &cobra.Command{
|
||||||
@@ -12,7 +21,7 @@ var syncCmd = &cobra.Command{
|
|||||||
Long: `Fetch updates from all configured sources.
|
Long: `Fetch updates from all configured sources.
|
||||||
|
|
||||||
Checks each source for changes (using hash or timestamp comparison)
|
Checks each source for changes (using hash or timestamp comparison)
|
||||||
and updates the index accordingly.
|
and updates the local docs + index accordingly.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
devour sync # Sync all sources
|
devour sync # Sync all sources
|
||||||
@@ -34,29 +43,160 @@ func init() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
func runSync(cmd *cobra.Command, args []string) error {
|
func runSync(cmd *cobra.Command, args []string) error {
|
||||||
|
cfg, err := loadAppConfig()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
if syncRebuild {
|
if syncRebuild {
|
||||||
fmt.Println("🔄 Rebuilding index from all sources...")
|
fmt.Println("🔄 Rebuilding local index from configured sources...")
|
||||||
} else {
|
} else {
|
||||||
fmt.Println("🔄 Syncing with configured sources...")
|
fmt.Println("🔄 Syncing configured sources...")
|
||||||
}
|
}
|
||||||
|
|
||||||
if syncSource != "" {
|
if len(cfg.Sources) == 0 {
|
||||||
fmt.Printf(" Source: %s\n", syncSource)
|
fmt.Println("No sources configured. Add sources in devour.yaml first.")
|
||||||
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
// TODO: Implement actual sync logic
|
state, err := projectstate.LoadSourceState(cfg.Storage.MetadataDir)
|
||||||
// 1. Load sources from config
|
if err != nil {
|
||||||
// 2. For each source:
|
return err
|
||||||
// a. Check for changes (hash/timestamp)
|
}
|
||||||
// b. If changes detected or --force:
|
|
||||||
// - Scrape updated content
|
|
||||||
// - Re-generate embeddings
|
|
||||||
// - Update index
|
|
||||||
// 3. Update metadata
|
|
||||||
|
|
||||||
fmt.Println()
|
updated := 0
|
||||||
fmt.Println("⚠️ Sync functionality not yet implemented")
|
skipped := 0
|
||||||
fmt.Println(" Configure sources in devour.yaml first")
|
failed := 0
|
||||||
|
totalDocs := 0
|
||||||
|
|
||||||
|
for _, srcCfg := range cfg.Sources {
|
||||||
|
if syncSource != "" && srcCfg.Name != syncSource {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
source := sourceFromConfig(srcCfg)
|
||||||
|
if source.Type == "" {
|
||||||
|
if source.URL != "" {
|
||||||
|
source.Type = detectSourceType(source.URL)
|
||||||
|
} else if source.Path != "" {
|
||||||
|
source.Type = scraper.SourceTypeLocal
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if source.Name == "" {
|
||||||
|
source.Name = extractName(source.URL)
|
||||||
|
}
|
||||||
|
applySourceProfile(source)
|
||||||
|
|
||||||
|
fmt.Printf("\n• %s (%s)\n", source.Name, source.Type)
|
||||||
|
s := scraper.NewScraper(source.Type, toScraperConfig(cfg, 0))
|
||||||
|
if s == nil {
|
||||||
|
failed++
|
||||||
|
fmt.Printf(" ✗ unsupported source type: %s\n", source.Type)
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
key := source.Name
|
||||||
|
if key == "" {
|
||||||
|
key = chooseSourceLabel(source)
|
||||||
|
}
|
||||||
|
lastHash := ""
|
||||||
|
if prev := state.Sources[key]; prev != nil {
|
||||||
|
lastHash = prev.Hash
|
||||||
|
}
|
||||||
|
|
||||||
|
needsUpdate := syncForce || syncRebuild
|
||||||
|
newHash := lastHash
|
||||||
|
if !needsUpdate {
|
||||||
|
changed, hash, detectErr := s.DetectChanges(context.Background(), source, lastHash)
|
||||||
|
if detectErr != nil {
|
||||||
|
fmt.Printf(" ⚠ change detection failed (%v), scraping anyway\n", detectErr)
|
||||||
|
needsUpdate = true
|
||||||
|
} else {
|
||||||
|
needsUpdate = changed
|
||||||
|
newHash = hash
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if !needsUpdate {
|
||||||
|
skipped++
|
||||||
|
fmt.Println(" ✓ no changes")
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
docs, scrapeErr := s.Scrape(context.Background(), source)
|
||||||
|
if scrapeErr != nil {
|
||||||
|
failed++
|
||||||
|
fmt.Printf(" ✗ scrape failed: %v\n", scrapeErr)
|
||||||
|
state.Sources[key] = &projectstate.SourceState{
|
||||||
|
Name: source.Name,
|
||||||
|
Type: string(source.Type),
|
||||||
|
URL: source.URL,
|
||||||
|
Hash: lastHash,
|
||||||
|
LastSync: time.Now(),
|
||||||
|
DocCount: 0,
|
||||||
|
LastError: scrapeErr.Error(),
|
||||||
|
}
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
saved, saveErr := storage.SaveDocuments(docs, storage.SaveOptions{
|
||||||
|
Format: "json",
|
||||||
|
OutputDir: cfg.Storage.DocsDir,
|
||||||
|
AllowEmpty: false,
|
||||||
|
PrintWriter: nil,
|
||||||
|
})
|
||||||
|
if saveErr != nil {
|
||||||
|
failed++
|
||||||
|
fmt.Printf(" ✗ save failed: %v\n", saveErr)
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
if newHash == "" {
|
||||||
|
h := sha256.New()
|
||||||
|
for _, d := range docs {
|
||||||
|
if d == nil {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
fmt.Fprintf(h, "%s|%s|%s\n", d.ID, d.Hash, d.URL)
|
||||||
|
}
|
||||||
|
newHash = hex.EncodeToString(h.Sum(nil))
|
||||||
|
}
|
||||||
|
|
||||||
|
state.Sources[key] = &projectstate.SourceState{
|
||||||
|
Name: source.Name,
|
||||||
|
Type: string(source.Type),
|
||||||
|
URL: source.URL,
|
||||||
|
Hash: newHash,
|
||||||
|
LastSync: time.Now(),
|
||||||
|
DocCount: saved.Count,
|
||||||
|
LastError: "",
|
||||||
|
}
|
||||||
|
|
||||||
|
updated++
|
||||||
|
totalDocs += saved.Count
|
||||||
|
fmt.Printf(" ✓ updated (%d docs)\n", saved.Count)
|
||||||
|
}
|
||||||
|
|
||||||
|
if err := projectstate.SaveSourceState(cfg.Storage.MetadataDir, state); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
if syncRebuild || updated > 0 {
|
||||||
|
engine := search.NewEngine(cfg)
|
||||||
|
if _, err := engine.Rebuild(context.Background()); err != nil {
|
||||||
|
return fmt.Errorf("rebuild index: %w", err)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fmt.Printf("\nSync summary: updated=%d skipped=%d failed=%d docs=%d\n", updated, skipped, failed, totalDocs)
|
||||||
|
if failed > 0 {
|
||||||
|
return fmt.Errorf("sync completed with failures")
|
||||||
|
}
|
||||||
|
if syncSource != "" && updated == 0 && skipped == 0 && failed == 0 {
|
||||||
|
return fmt.Errorf("source %q not found in config", syncSource)
|
||||||
|
}
|
||||||
|
if strings.TrimSpace(syncSource) != "" {
|
||||||
|
fmt.Printf("Synced source: %s\n", syncSource)
|
||||||
|
}
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|||||||
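Review note: when `DetectChanges` yields no hash, `runSync` derives one by feeding `ID|Hash|URL` lines for every scraped document into a single SHA-256. The sketch below isolates that fallback so its properties are easier to see. The `doc` struct and `aggregateHash` name are assumptions for illustration; only the three fields used by the original `Fprintf` call are modeled.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// doc models only the fields the fallback hash reads (illustrative type).
type doc struct {
	ID, Hash, URL string
}

// aggregateHash (hypothetical name) folds every non-nil document into one
// SHA-256 digest, using the same "%s|%s|%s\n" line format as runSync.
func aggregateHash(docs []*doc) string {
	h := sha256.New()
	for _, d := range docs {
		if d == nil {
			continue // nil entries are skipped, as in the original loop
		}
		fmt.Fprintf(h, "%s|%s|%s\n", d.ID, d.Hash, d.URL)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := &doc{ID: "1", Hash: "aaa", URL: "https://example.com/a"}
	b := &doc{ID: "2", Hash: "bbb", URL: "https://example.com/b"}
	fmt.Println(aggregateHash([]*doc{a, b}) == aggregateHash([]*doc{a, nil, b}))
	fmt.Println(aggregateHash([]*doc{a, b}) == aggregateHash([]*doc{b, a}))
}
```

The digest depends on document order, so a scraper that returns the same documents in a different order would produce a different hash. Sorting by ID before hashing would be one way to make it order-independent, if that is the desired behavior.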
+169
@@ -0,0 +1,169 @@
|
|||||||
|
package cmd
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
|
"fmt"
|
||||||
|
"os"
|
||||||
|
"path/filepath"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"github.com/spf13/cobra"
|
||||||
|
"github.com/yourorg/devour/internal/scraper"
|
||||||
|
)
|
||||||
|
|
||||||
|
var (
|
||||||
|
verifyFormat string
|
||||||
|
verifyTimeout int
|
||||||
|
)
|
||||||
|
|
||||||
|
var verifyCmd = &cobra.Command{
|
||||||
|
Use: "verify",
|
||||||
|
Short: "Run Devour verification suites",
|
||||||
|
Long: `Run deterministic and live verification suites for Devour commands and scrapers.`,
|
||||||
|
}
|
||||||
|
|
||||||
|
var verifySmokeCmd = &cobra.Command{
|
||||||
|
Use: "smoke",
|
||||||
|
Short: "Run live docs scraping smoke checks",
|
||||||
|
Long: `Run an opt-in live network smoke suite and persist a machine-readable report under devour_data/verify/.`,
|
||||||
|
RunE: runVerifySmoke,
|
||||||
|
}
|
||||||
|
|
||||||
|
func init() {
|
||||||
|
verifyCmd.AddCommand(verifySmokeCmd)
|
||||||
|
verifySmokeCmd.Flags().StringVar(&verifyFormat, "format", "text", "output format (text, json)")
|
||||||
|
verifySmokeCmd.Flags().IntVar(&verifyTimeout, "timeout", 90, "timeout per smoke case in seconds")
|
||||||
|
}
|
||||||
|
|
||||||
|
type verifyCase struct {
|
||||||
|
Name string `json:"name"`
|
||||||
|
Type scraper.SourceType `json:"type"`
|
||||||
|
URL string `json:"url"`
|
||||||
|
Passed bool `json:"passed"`
|
||||||
|
Docs int `json:"docs"`
|
||||||
|
Error string `json:"error,omitempty"`
|
||||||
|
TookMs int64 `json:"took_ms"`
|
||||||
|
}
|
||||||
|
|
||||||
|
type verifyReport struct {
|
||||||
|
CreatedAt time.Time `json:"created_at"`
|
||||||
|
Duration string `json:"duration"`
|
||||||
|
Passed int `json:"passed"`
|
||||||
|
Failed int `json:"failed"`
|
||||||
|
Cases []verifyCase `json:"cases"`
|
||||||
|
}
|
||||||
|
|
||||||
|
func runVerifySmoke(cmd *cobra.Command, args []string) error {
|
||||||
|
cfg, err := loadAppConfig()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
if verifyTimeout <= 0 {
|
||||||
|
verifyTimeout = 90
|
||||||
|
}
|
||||||
|
|
||||||
|
cases := []verifyCase{
|
||||||
|
{Name: "Go net/http", Type: scraper.SourceTypeGoDocs, URL: "https://pkg.go.dev/net/http"},
|
||||||
|
{Name: "Python asyncio", Type: scraper.SourceTypePythonDocs, URL: "https://docs.python.org/3/library/asyncio.html"},
|
||||||
|
{Name: "React reference", Type: scraper.SourceTypeReactDocs, URL: "https://react.dev/reference/react"},
|
||||||
|
{Name: "TypeScript handbook", Type: scraper.SourceTypeTSDocs, URL: "https://www.typescriptlang.org/docs/handbook/2/basic-types.html"},
|
||||||
|
{Name: "Next.js docs", Type: scraper.SourceTypeWeb, URL: "https://nextjs.org/docs"},
|
||||||
|
{Name: "Svelte docs", Type: scraper.SourceTypeWeb, URL: "https://svelte.dev/docs/kit"},
|
||||||
|
{Name: "Angular guide", Type: scraper.SourceTypeWeb, URL: "https://angular.dev/guide/http"},
|
||||||
|
{Name: "Remix docs", Type: scraper.SourceTypeWeb, URL: "https://v2.remix.run/docs"},
|
||||||
|
{Name: "Solid docs repo", Type: scraper.SourceTypeGitHub, URL: "https://github.com/solidjs/solid-docs"},
|
||||||
|
{Name: "Express guide", Type: scraper.SourceTypeWeb, URL: "https://expressjs.com/en/guide/routing.html"},
|
||||||
|
}
|
||||||
|
|
||||||
|
startAll := time.Now()
|
||||||
|
passed := 0
|
||||||
|
failed := 0
|
||||||
|
|
||||||
|
for i := range cases {
|
||||||
|
c := &cases[i]
|
||||||
|
caseStart := time.Now()
|
||||||
|
s := scraper.NewScraper(c.Type, toScraperConfig(cfg, 4))
|
||||||
|
if s == nil {
|
||||||
|
c.Error = "scraper not available"
|
||||||
|
c.Passed = false
|
||||||
|
failed++
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(verifyTimeout)*time.Second)
|
||||||
|
docs, err := s.Scrape(ctx, &scraper.Source{Name: c.Name, Type: c.Type, URL: c.URL})
|
||||||
|
cancel()
|
||||||
|
c.TookMs = time.Since(caseStart).Milliseconds()
|
||||||
|
|
||||||
|
if err != nil {
|
||||||
|
c.Error = err.Error()
|
||||||
|
c.Passed = false
|
||||||
|
failed++
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
c.Docs = len(docs)
|
||||||
|
if len(docs) == 0 {
|
||||||
|
c.Error = "0 documents"
|
||||||
|
c.Passed = false
|
||||||
|
failed++
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
c.Passed = true
|
||||||
|
passed++
|
||||||
|
}
|
||||||
|
|
||||||
|
report := verifyReport{
|
||||||
|
CreatedAt: time.Now(),
|
||||||
|
Duration: time.Since(startAll).String(),
|
||||||
|
Passed: passed,
|
||||||
|
Failed: failed,
|
||||||
|
Cases: cases,
|
||||||
|
}
|
||||||
|
|
||||||
|
rootDataDir := filepath.Dir(cfg.Storage.DocsDir)
|
||||||
|
verifyDir := filepath.Join(rootDataDir, "verify")
|
||||||
|
if err := os.MkdirAll(verifyDir, 0o755); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
filename := fmt.Sprintf("smoke-%s.json", time.Now().Format("20060102-150405"))
|
||||||
|
reportPath := filepath.Join(verifyDir, filename)
|
||||||
|
b, err := json.MarshalIndent(report, "", " ")
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
if err := os.WriteFile(reportPath, b, 0o644); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
switch verifyFormat {
|
||||||
|
case "json":
|
||||||
|
enc := json.NewEncoder(cmd.OutOrStdout())
|
||||||
|
enc.SetIndent("", " ")
|
||||||
|
if err := enc.Encode(report); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
default:
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), "Smoke verification complete\n")
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Passed: %d\n", report.Passed)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Failed: %d\n", report.Failed)
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " Report: %s\n", reportPath)
|
||||||
|
for _, c := range report.Cases {
|
||||||
|
status := "PASS"
|
||||||
|
if !c.Passed {
|
||||||
|
status = "FAIL"
|
||||||
|
}
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " - [%s] %s (%d docs, %dms)", status, c.Name, c.Docs, c.TookMs)
|
||||||
|
if c.Error != "" {
|
||||||
|
fmt.Fprintf(cmd.OutOrStdout(), " error=%s", c.Error)
|
||||||
|
}
|
||||||
|
fmt.Fprintln(cmd.OutOrStdout())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if report.Failed > 0 {
|
||||||
|
return fmt.Errorf("smoke verification completed with failures")
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
+39
-9
@@ -8,8 +8,9 @@ storage:
|
|||||||
docs_dir: ./devour_data/docs
|
docs_dir: ./devour_data/docs
|
||||||
index_dir: ./devour_data/index
|
index_dir: ./devour_data/index
|
||||||
metadata_dir: ./devour_data/metadata
|
metadata_dir: ./devour_data/metadata
|
||||||
|
cache_dir: ./devour_data/cache
|
||||||
|
|
||||||
# Embedding settings
|
# Embedding settings (optional for lexical search; required for future embedding flows)
|
||||||
embeddings:
|
embeddings:
|
||||||
provider: openai # openai, mock
|
provider: openai # openai, mock
|
||||||
model: text-embedding-3-small
|
model: text-embedding-3-small
|
||||||
@@ -17,7 +18,7 @@ embeddings:
|
|||||||
api_key: ${OPENAI_API_KEY} # Use environment variable
|
api_key: ${OPENAI_API_KEY} # Use environment variable
|
||||||
batch_size: 100
|
batch_size: 100
|
||||||
|
|
||||||
# Vector database
|
# Vector database (optional in current local-first pipeline)
|
||||||
vector_db:
|
vector_db:
|
||||||
type: memory # memory, chromem
|
type: memory # memory, chromem
|
||||||
persist: true
|
persist: true
|
||||||
@@ -28,7 +29,7 @@ scraper:
|
|||||||
user_agent: "Devour/1.0 (+https://github.com/yourorg/devour)"
|
user_agent: "Devour/1.0 (+https://github.com/yourorg/devour)"
|
||||||
timeout: 30s
|
timeout: 30s
|
||||||
retry_count: 3
|
retry_count: 3
|
||||||
retry_delay: 5s
|
retry_delay: 1s
|
||||||
concurrency: 10
|
concurrency: 10
|
||||||
rate_limit: 500ms
|
rate_limit: 500ms
|
||||||
max_depth: 3
|
max_depth: 3
|
||||||
@@ -44,9 +45,22 @@ scheduler:
|
|||||||
# Server settings
|
# Server settings
|
||||||
server:
|
server:
|
||||||
mode: local # local, remote
|
mode: local # local, remote
|
||||||
|
transport: stdio # stdio, http
|
||||||
port: 8080
|
port: 8080
|
||||||
host: localhost
|
host: localhost
|
||||||
|
|
||||||
|
# Local lexical indexing defaults
|
||||||
|
indexing:
|
||||||
|
enabled: true
|
||||||
|
auto_reindex: true
|
||||||
|
snippet_length: 220
|
||||||
|
max_docs: 10000
|
||||||
|
|
||||||
|
# Verification defaults
|
||||||
|
verification:
|
||||||
|
enabled: true
|
||||||
|
timeout: 90s
|
||||||
|
|
||||||
# Example sources
|
# Example sources
|
||||||
sources:
|
sources:
|
||||||
# Web documentation
|
# Web documentation
|
||||||
@@ -67,17 +81,14 @@ sources:
|
|||||||
url: https://api.example.com/openapi.json
|
url: https://api.example.com/openapi.json
|
||||||
schedule: 168h # Weekly
|
schedule: 168h # Weekly
|
||||||
|
|
||||||
# GitHub repository
|
# GitHub repository docs
|
||||||
- name: github-repo
|
- name: github-repo
|
||||||
type: github
|
type: github
|
||||||
repo: org/repository
|
repo: org/repository
|
||||||
branch: main
|
branch: main
|
||||||
include:
|
include:
|
||||||
- "docs/.*"
|
- "(?i)(^|/)README\\.md$"
|
||||||
- "README.md"
|
- "(?i)(^|/)docs?/"
|
||||||
exclude:
|
|
||||||
- "docs/internal/.*"
|
|
||||||
# auth_token: ${GITHUB_TOKEN} # Optional for private repos
|
|
||||||
|
|
||||||
# Local files
|
# Local files
|
||||||
- name: local-docs
|
- name: local-docs
|
||||||
@@ -86,3 +97,22 @@ sources:
|
|||||||
include:
|
include:
|
||||||
- ".*\\.md"
|
- ".*\\.md"
|
||||||
- ".*\\.txt"
|
- ".*\\.txt"
|
||||||
|
|
||||||
|
# Self-hosted search API (e.g. SearxNG) with no API key
|
||||||
|
- name: local-searxng-go
|
||||||
|
type: localsearch
|
||||||
|
url: http://127.0.0.1:8080/search
|
||||||
|
query: golang http client
|
||||||
|
result_limit: 8
|
||||||
|
domains:
|
||||||
|
- pkg.go.dev
|
||||||
|
- go.dev
|
||||||
|
|
||||||
|
# New framework examples
|
||||||
|
- name: nextjs-docs
|
||||||
|
type: url
|
||||||
|
url: https://nextjs.org/docs
|
||||||
|
|
||||||
|
- name: express-docs
|
||||||
|
type: url
|
||||||
|
url: https://expressjs.com/en/guide/routing.html
|
||||||
|
|||||||
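Review note: the GitHub example source now uses anchored, case-insensitive include patterns (`(?i)(^|/)README\.md$` and `(?i)(^|/)docs?/` once YAML unescapes the double-quoted strings). The sketch below shows what those two expressions match, assuming they are applied to repository-relative paths with Go's `regexp` package; how Devour actually applies them is not shown in this diff. `included` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// includePatterns holds the two patterns from the example config,
// written as Go raw strings (so a single backslash escapes the dot).
var includePatterns = []*regexp.Regexp{
	regexp.MustCompile(`(?i)(^|/)README\.md$`),
	regexp.MustCompile(`(?i)(^|/)docs?/`),
}

// included (hypothetical helper) reports whether any pattern matches path.
func included(path string) bool {
	for _, re := range includePatterns {
		if re.MatchString(path) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{
		"README.md", "pkg/readme.md", "docs/intro.md",
		"src/doc/api.md", "mydocs/x.md", "src/main.go",
	} {
		fmt.Printf("%-16s %v\n", p, included(p))
	}
}
```

The `(^|/)` prefix keeps the match on path-segment boundaries, which is why `mydocs/x.md` is rejected while `src/doc/api.md` is accepted.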
+11
@@ -40,9 +40,20 @@ scheduler:
|
|||||||
# Server settings
|
# Server settings
|
||||||
server:
|
server:
|
||||||
mode: local
|
mode: local
|
||||||
|
transport: stdio
|
||||||
port: 8080
|
port: 8080
|
||||||
host: localhost
|
host: localhost
|
||||||
|
|
||||||
|
indexing:
|
||||||
|
enabled: true
|
||||||
|
auto_reindex: true
|
||||||
|
snippet_length: 220
|
||||||
|
max_docs: 10000
|
||||||
|
|
||||||
|
verification:
|
||||||
|
enabled: true
|
||||||
|
timeout: 90s
|
||||||
|
|
||||||
# Sources (add your own)
|
# Sources (add your own)
|
||||||
sources: []
|
sources: []
|
||||||
# - name: example-docs
|
# - name: example-docs
|
||||||
|
|||||||
Binary file not shown.
Some files were not shown because too many files have changed in this diff