Tomas Dvorak 55885a0e8f first commit
2026-02-22 10:42:17 +01:00


Devour

Context Ingestion & Management for AI

Features · Installation · Quick Start · Architecture · CLI Reference · Configuration


What is Devour?

Devour is a context ingestion and management system that feeds structured, relevant context to AI models so they can generate more accurate, working code.

It scrapes, indexes, and serves documentation from multiple sources:

  • GitHub repositories
  • OpenAPI/Swagger specifications
  • Markdown/HTML documentation sites
  • JSON/YAML schemas
  • Local project files

Two Modes of Operation

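| Mode   | Description                               | Use Case                       |
|--------|-------------------------------------------|--------------------------------|
| Local  | Runs as an OpenCode skill on your machine | Single developer, offline work |
| Remote | MCP server hosted on your infrastructure  | Teams, multi-project support   |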

Features

🕷️ Multi-Source Scraping

  • GitHub - Clone and parse repos, extract README, docs, code structure
  • OpenAPI - Parse Swagger specs into structured endpoints
  • Web Docs - Crawl documentation sites with Colly
  • Local Files - Index your project's docs folder

🧠 Intelligent Indexing

  • Vector embeddings via OpenAI (text-embedding-3-small/large)
  • Semantic similarity search for context retrieval
  • Metadata tracking (source, timestamp, file type)
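
Semantic similarity search usually comes down to comparing a query embedding with stored document embeddings. The sketch below shows cosine similarity, a common metric for this. It is illustrative only: the README does not say which metric Devour or its vector database uses, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two vectors.
// It returns 0 when the lengths differ, the input is empty, or either
// vector has zero magnitude. (Hypothetical helper, not Devour's API.)
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	query := []float64{1, 0, 1}
	sameDirection := []float64{2, 0, 2}
	orthogonal := []float64{0, 1, 0}

	fmt.Printf("same direction: %.2f\n", cosineSimilarity(query, sameDirection))
	fmt.Printf("orthogonal:     %.2f\n", cosineSimilarity(query, orthogonal))
}
```

Scores close to 1 mean the two texts point in nearly the same direction in embedding space; scores near 0 mean they are unrelated.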

🔄 Automatic Updates

  • Configurable scheduler (default: every 3 days)
  • Content hash comparison for change detection
  • Automatic re-indexing on updates
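
Hash-based change detection can be sketched as follows: hash the freshly scraped content and compare it with the hash stored from the previous run. SHA-256 and both function names are assumptions made for illustration; the README does not state which hash function Devour uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentHash returns the hex-encoded SHA-256 digest of a document body.
// (SHA-256 is an assumed choice, not confirmed by the README.)
func contentHash(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

// hasChanged reports whether freshly scraped content differs from the hash
// recorded on the previous run. An empty stored hash is treated as
// "never indexed", so it always counts as changed.
func hasChanged(storedHash string, body []byte) bool {
	return storedHash == "" || storedHash != contentHash(body)
}

func main() {
	stored := contentHash([]byte("# Guide\nversion 1"))

	fmt.Println(hasChanged(stored, []byte("# Guide\nversion 1"))) // same content
	fmt.Println(hasChanged(stored, []byte("# Guide\nversion 2"))) // edited content
}
```

Only sources for which the comparison reports a change would need to be re-embedded, which avoids paying for embeddings of unchanged documents.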

🔌 MCP Integration

  • Exposes context via Model Context Protocol
  • Local mode: stdio transport (OpenCode skill)
  • Remote mode: HTTP/SSE transport (MCP server)

💾 Flexible Storage

devour_data/
├── docs/           # Raw scraped documents
├── index/          # Vector embeddings
└── metadata/       # Source tracking & timestamps

📊 Quality Scorecard

Devour includes a built-in code quality analysis system that generates a comprehensive scorecard for your project.

# Run quality analysis
devour quality scan

# Generate a visual scorecard badge
devour quality scan --badge-path scorecard.png

Features:

  • Multi-language support (Go, Python, JavaScript, etc.)
  • Severity-based scoring (T1-T4 tiers)
  • Technical debt tracking
  • Automated code review integration

Installation

Prerequisites

  • Go 1.22+
  • OpenAI API key (for embeddings)

From Source

# Clone the repository
git clone https://github.com/yourorg/devour.git
cd devour

# Install dependencies
go mod download

# Build
go build -o devour ./cmd/devour

# Install globally
go install ./cmd/devour

Quick Install

go install github.com/yourorg/devour/cmd/devour@latest

Quick Start

1. Initialize a Project

# Create devour config in current directory
devour init

# Or specify a path
devour init ./my-project

2. Get Documentation (NEW!)

# Quick access to popular language/framework docs
devour get go http              # Go HTTP package
devour get python asyncio      # Python asyncio module  
devour get react hooks         # React Hooks documentation
devour get docker compose      # Docker Compose docs
devour get rust tokio          # Rust Tokio crate
devour get spring boot         # Spring Boot framework

# Enhanced markdown output
devour get go http --format markdown

Supported Languages:

  • go, golang - Go packages (pkg.go.dev)
  • rust - Rust crates (docs.rs)
  • python, py - Python modules (docs.python.org)
  • java - Java packages (docs.oracle.com)
  • spring - Spring Boot (docs.spring.io)
  • typescript, ts - TypeScript (typescriptlang.org)
  • react - React (react.dev)
  • vue - Vue.js (vuejs.org)
  • nuxt - Nuxt (nuxt.com)
  • docker - Docker (docs.docker.com)
  • cloudflare, cf - Cloudflare (developers.cloudflare.com)
  • astro - Astro (docs.astro.build)

3. Scrape Documentation

# Scrape from a URL
devour scrape https://docs.example.com

# Scrape a GitHub repo
devour scrape https://github.com/org/repo

# Scrape local docs
devour scrape ./docs

# Multiple sources
devour scrape --sources sources.yaml

4. Query Context

# Search indexed docs
devour query "How do I authenticate with the API?"

# With options
devour query "authentication" --limit 5 --format json

5. Start the Server

# Local MCP server (stdio transport)
devour serve

# Remote MCP server (HTTP)
devour serve --remote --port 8080

6. Check Status

devour status

Enhanced Features

🎯 Simplified Language Interface

The devour get command fetches documentation for popular languages and frameworks by name, so you don't need to remember full URLs:

# Instead of: devour scrape https://pkg.go.dev/net/http
devour get go http

# Instead of: devour scrape https://react.dev/reference/react/hooks  
devour get react hooks

# Instead of: devour scrape https://docs.docker.com/compose
devour get docker compose

📝 Rich Markdown Output

Pass --format markdown to get structured Markdown output:

devour get go http --format markdown

Features:

  • 📋 Document metadata tables
  • 📑 Auto-generated table of contents
  • 🎨 Enhanced typography with emoji indicators
  • 🔗 Automatic link conversion
  • 📚 Structured content sections
  • 🏷️ Source attribution and timestamps

🧠 Smart Content Enhancement

The markdown formatter automatically:

  • Converts plain URLs to clickable links
  • Adds visual indicators for examples, notes, and warnings
  • Fixes code block formatting
  • Generates proper heading structure
  • Creates document metadata tables
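
As an illustration of the first item above, here is a minimal sketch of turning plain URLs into Markdown autolinks. The regular expression, the function name, and the choice of `<url>` autolink syntax are all assumptions; Devour's formatter may work differently. The sketch does not detect URLs that are already part of a link.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// urlPattern matches http(s) URLs up to the next whitespace or bracket.
var urlPattern = regexp.MustCompile(`https?://[^\s<>()\[\]]+`)

// linkify wraps each plain URL in Markdown autolink syntax (<url>).
// Trailing sentence punctuation is kept outside the link.
// (Hypothetical helper; it does not skip URLs that are already linked.)
func linkify(text string) string {
	return urlPattern.ReplaceAllStringFunc(text, func(u string) string {
		trimmed := strings.TrimRight(u, ".,;:!?")
		return "<" + trimmed + ">" + u[len(trimmed):]
	})
}

func main() {
	fmt.Println(linkify("See https://pkg.go.dev/net/http for details."))
	fmt.Println(linkify("Docs live at https://react.dev/reference/react/hooks."))
}
```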

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Devour System                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐    ┌──────────┐    ┌───────────┐    ┌──────────┐  │
│  │ Scraper │───▶│ Indexer  │───▶│  Storage  │───▶│  Server  │  │
│  └─────────┘    └──────────┘    └───────────┘    └──────────┘  │
│       │              │               │                │        │
│       ▼              ▼               ▼                ▼        │
│  ┌─────────┐    ┌──────────┐    ┌───────────┐    ┌──────────┐  │
│  │ GitHub  │    │ OpenAI   │    │ Vector DB │    │   MCP    │  │
│  │ Web     │    │ Embeds   │    │ (chromem) │    │ Protocol │  │
│  │ Local   │    │          │    │           │    │          │  │
│  └─────────┘    └──────────┘    └───────────┘    └──────────┘  │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                     Scheduler                            │   │
│  │         (Auto-update every 3 days, configurable)         │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Data Flow

User Query → Devour Server → Embedding Generation → Vector Search
                                                           │
                                                           ▼
    AI Response ← Context Chunks ← Top-K Relevant Docs ←───┘
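
The "Top-K Relevant Docs" step in the flow above can be sketched as a filter, sort, and truncate over scored chunks. The type and function names are hypothetical, and treating the threshold as an inclusive lower bound is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// scoredChunk pairs a chunk of text with its similarity to the query.
// (Hypothetical type for illustration.)
type scoredChunk struct {
	Content string
	Score   float64
}

// topK keeps chunks scoring at or above threshold, sorts them by
// descending score, and returns at most limit of them.
// A negative limit disables truncation.
func topK(chunks []scoredChunk, limit int, threshold float64) []scoredChunk {
	kept := make([]scoredChunk, 0, len(chunks))
	for _, c := range chunks {
		if c.Score >= threshold {
			kept = append(kept, c)
		}
	}
	sort.SliceStable(kept, func(i, j int) bool {
		return kept[i].Score > kept[j].Score
	})
	if limit >= 0 && len(kept) > limit {
		kept = kept[:limit]
	}
	return kept
}

func main() {
	chunks := []scoredChunk{
		{Content: "OAuth setup", Score: 0.91},
		{Content: "Changelog", Score: 0.42},
		{Content: "API keys", Score: 0.83},
		{Content: "Token refresh", Score: 0.76},
	}
	for _, c := range topK(chunks, 2, 0.7) {
		fmt.Printf("%.2f %s\n", c.Score, c.Content)
	}
}
```

The limit and threshold arguments correspond to the --limit and --threshold query flags described in the CLI reference.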

CLI Reference

Commands

| Command                           | Description                                    |
|-----------------------------------|------------------------------------------------|
| `devour init [path]`              | Initialize Devour for a project                |
| `devour get <language> <keyword>` | (NEW) Quick docs fetch for popular languages   |
| `devour scrape <source>`          | Scrape docs from URL, repo, or path            |
| `devour serve`                    | Start MCP server (local or remote)             |
| `devour query <text>`             | Search indexed documentation                   |
| `devour status`                   | Show index stats and last update               |
| `devour sync`                     | Fetch updates from all sources                 |
| `devour push <path>`              | Push docs to remote MCP server                 |

Flags

# Global flags
--config, -c     Config file path (default: ./devour.yaml)
--verbose, -v    Enable verbose logging
--quiet, -q      Suppress non-error output

# scrape flags
--sources, -s    YAML file with source definitions
--format, -f     Output format: json, markdown (default: json)
--concurrency    Parallel scraping workers (default: 10)

# serve flags
--remote         Run as remote HTTP server
--port, -p       HTTP port (default: 8080)
--host           HTTP host (default: localhost)

# query flags
--limit, -l      Max results (default: 5)
--format, -f     Output: json, text, markdown
--threshold      Similarity threshold (default: 0.7)

Configuration

devour.yaml

# Devour Configuration

# Storage paths
storage:
  docs_dir: ./devour_data/docs
  index_dir: ./devour_data/index
  metadata_dir: ./devour_data/metadata

# Embedding settings
embeddings:
  provider: openai           # openai, local
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY} # Env var reference

# Vector database
vector_db:
  type: chromem              # chromem, weaviate, faiss
  persist: true

# Scraping settings
scraper:
  user_agent: "Devour/1.0"
  timeout: 30s
  retry_count: 3
  concurrency: 10
  rate_limit: 500ms

# Scheduler
scheduler:
  enabled: true
  interval: 72h              # Every 3 days
  check_method: hash         # hash, timestamp

# Server settings
server:
  mode: local                # local, remote
  port: 8080
  host: localhost

# Sources (for sync)
sources:
  - name: project-docs
    type: url
    url: https://docs.example.com
    include: ["**/*.md", "**/*.html"]
    exclude: ["**/api/**"]
    
  - name: api-spec
    type: openapi
    url: https://api.example.com/openapi.json
    
  - name: github-repo
    type: github
    repo: org/repo
    branch: main
    paths: ["docs/", "README.md"]
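
Two value formats in the configuration above map directly onto the Go standard library: `${OPENAI_API_KEY}` style references can be expanded with `os.ExpandEnv`, and strings such as `72h`, `30s`, and `500ms` parse with `time.ParseDuration`. The sketch below shows both; whether Devour resolves its config this way is an assumption, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// resolveEnv expands ${VAR} and $VAR references from the process
// environment. Unset variables expand to an empty string.
func resolveEnv(value string) string {
	return os.ExpandEnv(value)
}

// parseInterval converts a duration string such as "72h" or "500ms"
// into a time.Duration.
func parseInterval(value string) (time.Duration, error) {
	return time.ParseDuration(value)
}

func main() {
	// Placeholder value for the demo only; never hard-code real keys.
	os.Setenv("OPENAI_API_KEY", "sk-example")
	fmt.Println(resolveEnv("${OPENAI_API_KEY}"))

	interval, err := parseInterval("72h")
	if err != nil {
		fmt.Println("invalid interval:", err)
		return
	}
	fmt.Println("scheduler interval in hours:", interval.Hours())
}
```

Note that `time.ParseDuration` has no day unit, which is consistent with the three-day interval being written as `72h`.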

API Reference

MCP Tools (when running as server)

devour_query

Search indexed documentation for relevant context.

{
  "query": "How do I authenticate?",
  "limit": 5,
  "threshold": 0.7
}
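
MCP is built on JSON-RPC 2.0, and tools are invoked with the `tools/call` method, so the arguments above would travel inside an envelope like the one this sketch builds. The struct and function names are hypothetical; only the argument fields come from this README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryArgs holds the devour_query arguments shown in the README.
type queryArgs struct {
	Query     string  `json:"query"`
	Limit     int     `json:"limit"`
	Threshold float64 `json:"threshold"`
}

// toolCallParams is the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string    `json:"name"`
	Arguments queryArgs `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// newQueryCall encodes a tools/call request for the devour_query tool.
// (Hypothetical helper for illustration.)
func newQueryCall(id int, query string, limit int, threshold float64) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: toolCallParams{
			Name:      "devour_query",
			Arguments: queryArgs{Query: query, Limit: limit, Threshold: threshold},
		},
	})
}

func main() {
	payload, err := newQueryCall(1, "How do I authenticate?", 5, 0.7)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(payload))
}
```

In practice an MCP client library would build this envelope for you; the sketch only shows where the tool arguments sit in the message.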

devour_add

Add documents to the index.

{
  "documents": [
    {
      "content": "Document text...",
      "metadata": {
        "source": "https://...",
        "type": "markdown"
      }
    }
  ]
}

devour_status

Get indexing status and statistics.

REST API (remote mode)

GET  /health              # Health check
GET  /status              # Index statistics
POST /query               # Search documents
POST /documents           # Add documents
GET  /documents           # List documents
DELETE /documents/:id     # Delete document
POST /sync                # Trigger sync
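
A call to `POST /query` from Go might look like the sketch below, which uses only the standard library. The request body reuses the `devour_query` fields and the response is decoded as a JSON array of content and score pairs. Both shapes are assumptions, since the README does not document the REST payloads, and all type and function names are hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// queryRequest reuses the devour_query fields (assumed REST body shape).
type queryRequest struct {
	Query     string  `json:"query"`
	Limit     int     `json:"limit"`
	Threshold float64 `json:"threshold"`
}

// queryResult is an assumed response item.
type queryResult struct {
	Content string  `json:"content"`
	Score   float64 `json:"score"`
}

// buildQueryRequest encodes q as JSON and prepares a POST to baseURL/query.
func buildQueryRequest(baseURL string, q queryRequest) (*http.Request, error) {
	body, err := json.Marshal(q)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/query", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// postQuery sends the request and decodes a JSON array of results.
func postQuery(ctx context.Context, baseURL string, q queryRequest) ([]queryResult, error) {
	req, err := buildQueryRequest(baseURL, q)
	if err != nil {
		return nil, err
	}
	resp, err := http.DefaultClient.Do(req.WithContext(ctx))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("query failed: %s", resp.Status)
	}
	var results []queryResult
	if err := json.NewDecoder(resp.Body).Decode(&results); err != nil {
		return nil, err
	}
	return results, nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	q := queryRequest{Query: "authentication", Limit: 5, Threshold: 0.7}
	results, err := postQuery(ctx, "http://localhost:8080", q)
	if err != nil {
		fmt.Println("query error:", err)
		return
	}
	for _, r := range results {
		fmt.Printf("%.2f %s\n", r.Score, r.Content)
	}
}
```

The base URL matches the default host and port from the serve flags; adjust it to wherever the remote server is running.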

Integration Examples

With OpenCode (Local Mode)

Add to your OpenCode skills:

# ~/.opencode/skills.yaml
skills:
  - name: devour
    path: /path/to/devour
    commands:
      - devour serve

Then in OpenCode:

/devour query "authentication flow"

With AI Applications

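package main

import (
    "context"
    "fmt"
    "log"

    "github.com/yourorg/devour/pkg/client"
)

func main() {
    ctx := context.Background()

    // Use a name other than "client" so the package is not shadowed.
    c := client.New("http://localhost:8080")

    results, err := c.Query(ctx, "How do I use the API?", 5)
    if err != nil {
        log.Fatal(err)
    }

    for _, r := range results {
        // Truncate safely: slicing [:100] panics on shorter content.
        preview := r.Content
        if len(preview) > 100 {
            preview = preview[:100]
        }
        fmt.Printf("Score: %.2f - %s\n", r.Score, preview)
    }
}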

Development

Project Structure

devour/
├── cmd/devour/           # CLI entrypoint
│   └── main.go
├── internal/
│   ├── scraper/          # Scraping logic
│   ├── indexer/          # Embedding generation
│   ├── server/           # MCP server
│   ├── scheduler/        # Background updates
│   └── ai/               # AI integrations
├── pkg/
│   ├── client/           # Go client library
│   └── types/            # Shared types
├── devour_data/          # Default data directory
├── go.mod
├── Makefile
└── README.md

Building

# Development build
go build -o devour ./cmd/devour

# Production build
CGO_ENABLED=0 go build -ldflags="-s -w" -o devour ./cmd/devour

# Run tests
go test ./...

# Run with coverage
go test -cover ./...

Makefile Targets

make build        # Build binary
make test         # Run tests
make lint         # Run linter
make docker       # Build Docker image
make install      # Install locally

Roadmap

  • Local LLM support (Ollama, LocalAI)
  • Multi-tenant support for remote mode
  • Web UI for document management
  • Git-based versioning for docs
  • Plugin system for custom scrapers
  • Reranking with cross-encoders

Contributing

Contributions are welcome! Please read our Contributing Guide for details.


License

MIT License - see LICENSE for details.


Built with ❤️ for better AI context