<p align="center">
  <img src="devour_logo.svg" alt="Devour Logo" width="300">
</p>

<h1 align="center">Devour</h1>

<p align="center">
  <strong>Context Ingestion & Management for AI</strong>
</p>

<p align="center">
  <a href="#features">Features</a> •
  <a href="#installation">Installation</a> •
  <a href="#quick-start">Quick Start</a> •
  <a href="#architecture">Architecture</a> •
  <a href="#cli-reference">CLI Reference</a> •
  <a href="#configuration">Configuration</a>
</p>

---

## What is Devour?

Devour is a **context ingestion and management system** designed to feed structured, relevant context to AI models for generating accurate, fully working code.

It scrapes, indexes, and serves documentation from multiple sources:
- GitHub repositories
- OpenAPI/Swagger specifications
- Markdown/HTML documentation sites
- JSON/YAML schemas
- Local project files

### Two Modes of Operation

| Mode | Description | Use Case |
|------|-------------|----------|
| **Local** | Runs as an OpenCode skill on your machine | Single developer, offline work |
| **Remote** | MCP server hosted on your infrastructure | Teams, multi-project support |

---

## Features

### 🕷️ Multi-Source Scraping
- **GitHub** - Clone and parse repos, extract README, docs, code structure
- **OpenAPI** - Parse Swagger specs into structured endpoints
- **Web Docs** - Crawl documentation sites with Colly
- **Local Files** - Index your project's docs folder

### 🧠 Intelligent Indexing
- Vector embeddings via OpenAI (text-embedding-3-small/large)
- Semantic similarity search for context retrieval
- Metadata tracking (source, timestamp, file type)

### 🔄 Automatic Updates
- Configurable scheduler (default: every 3 days)
- Content hash comparison for change detection
- Automatic re-indexing on updates
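
The hash-based change detection listed above can be sketched roughly as follows. This is an illustration only, not Devour's actual implementation: the choice of SHA-256 and the helper names `contentHash` and `needsReindex` are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentHash returns the hex-encoded SHA-256 digest of a document body.
// SHA-256 is an assumption; the README only says "content hash".
func contentHash(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

// needsReindex (hypothetical helper) reports whether freshly scraped content
// differs from the hash recorded at the previous sync. An empty stored hash
// means the source has never been indexed.
func needsReindex(storedHash string, body []byte) bool {
	return storedHash == "" || storedHash != contentHash(body)
}

func main() {
	old := []byte("# API Guide\nUse token auth.")
	stored := contentHash(old)

	fmt.Println(needsReindex(stored, old))                                   // same content as last sync
	fmt.Println(needsReindex(stored, []byte("# API Guide\nUse OAuth 2.0."))) // edited content
}
```

Comparing digests instead of full documents keeps the metadata small: only one short string per source has to be stored between syncs.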

### 🔌 MCP Integration
- Exposes context via Model Context Protocol
- **Local mode**: stdio transport (OpenCode skill)
- **Remote mode**: HTTP/SSE transport (MCP server)

### 💾 Flexible Storage
```
devour_data/
├── docs/           # Raw scraped documents
├── index/          # Vector embeddings
└── metadata/       # Source tracking & timestamps
```

### 📊 Quality Scorecard

Devour includes a built-in code quality analysis system that generates a comprehensive scorecard for your project.

```bash
# Run quality analysis
devour quality scan

# Generate a visual scorecard badge
devour quality scan --badge-path scorecard.png
```

**Features:**
- Multi-language support (Go, Python, JavaScript, etc.)
- Severity-based scoring (T1-T4 tiers)
- Technical debt tracking
- Automated code review integration

---

## Installation

### Prerequisites
- Go 1.22+
- OpenAI API key (for embeddings)

### From Source

```bash
# Clone the repository
git clone https://github.com/yourorg/devour.git
cd devour

# Install dependencies
go mod download

# Build
go build -o devour ./cmd/devour

# Install globally
go install ./cmd/devour
```

### Quick Install

```bash
go install github.com/yourorg/devour/cmd/devour@latest
```

---

## Quick Start

### 1. Initialize a Project

```bash
# Create devour config in current directory
devour init

# Or specify a path
devour init ./my-project
```

### 2. Get Documentation (NEW!)

```bash
# Quick access to popular language/framework docs
devour get go http              # Go net/http package
devour get python asyncio       # Python asyncio module
devour get react hooks          # React Hooks documentation
devour get docker compose       # Docker Compose docs
devour get rust tokio           # Rust Tokio crate
devour get spring boot          # Spring Boot framework

# Enhanced markdown output
devour get go http --format markdown
```

**Supported Languages:**
- `go`, `golang` - Go packages (pkg.go.dev)
- `rust` - Rust crates (docs.rs)
- `python`, `py` - Python modules (docs.python.org)
- `java` - Java packages (docs.oracle.com)
- `spring` - Spring Boot (docs.spring.io)
- `typescript`, `ts` - TypeScript (typescriptlang.org)
- `react` - React (react.dev)
- `vue` - Vue.js (vuejs.org)
- `nuxt` - Nuxt (nuxt.com)
- `docker` - Docker (docs.docker.com)
- `cloudflare`, `cf` - Cloudflare (developers.cloudflare.com)
- `astro` - Astro (docs.astro.build)
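
To make the alias list above concrete, here is a minimal sketch of how an alias could be resolved to its documentation host. The aliases and hosts come from the list; the `docHosts` table, the `resolveHost` helper, and the case-insensitive matching are assumptions about one possible implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// docHosts maps each alias from the supported-languages list to its
// documentation host.
var docHosts = map[string]string{
	"go":         "pkg.go.dev",
	"golang":     "pkg.go.dev",
	"rust":       "docs.rs",
	"python":     "docs.python.org",
	"py":         "docs.python.org",
	"java":       "docs.oracle.com",
	"spring":     "docs.spring.io",
	"typescript": "typescriptlang.org",
	"ts":         "typescriptlang.org",
	"react":      "react.dev",
	"vue":        "vuejs.org",
	"nuxt":       "nuxt.com",
	"docker":     "docs.docker.com",
	"cloudflare": "developers.cloudflare.com",
	"cf":         "developers.cloudflare.com",
	"astro":      "docs.astro.build",
}

// resolveHost (hypothetical helper) looks up the documentation host for a
// language alias, ignoring case and surrounding whitespace.
func resolveHost(alias string) (string, bool) {
	host, ok := docHosts[strings.ToLower(strings.TrimSpace(alias))]
	return host, ok
}

func main() {
	for _, alias := range []string{"golang", "TS", "cobol"} {
		if host, ok := resolveHost(alias); ok {
			fmt.Printf("%s -> https://%s\n", alias, host)
		} else {
			fmt.Printf("%s -> unsupported\n", alias)
		}
	}
}
```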

### 3. Scrape Documentation

```bash
# Scrape from a URL
devour scrape https://docs.example.com

# Scrape a GitHub repo
devour scrape https://github.com/org/repo

# Scrape local docs
devour scrape ./docs

# Multiple sources
devour scrape --sources sources.yaml
```

### 4. Query Context

```bash
# Search indexed docs
devour query "How do I authenticate with the API?"

# With options
devour query "authentication" --limit 5 --format json
```

### 5. Start the Server

```bash
# Local MCP server (stdio transport)
devour serve

# Remote MCP server (HTTP)
devour serve --remote --port 8080
```

### 6. Check Status

```bash
devour status
```

---

## Enhanced Features

### 🎯 Simplified Language Interface

The `devour get` command fetches documentation for popular languages and frameworks by name, so you don't need to remember full URLs:

```bash
# Instead of: devour scrape https://pkg.go.dev/net/http
devour get go http

# Instead of: devour scrape https://react.dev/reference/react/hooks
devour get react hooks

# Instead of: devour scrape https://docs.docker.com/compose
devour get docker compose
```

### 📝 Rich Markdown Output

Pass `--format markdown` to get structured, richly formatted Markdown output:

```bash
devour get go http --format markdown
```

**Features:**
- 📋 Document metadata tables
- 📑 Auto-generated table of contents
- 🎨 Enhanced typography with emoji indicators
- 🔗 Automatic link conversion
- 📚 Structured content sections
- 🏷️ Source attribution and timestamps

### 🧠 Smart Content Enhancement

The markdown formatter automatically:
- Converts plain URLs to clickable links
- Adds visual indicators for examples, notes, and warnings
- Fixes code block formatting
- Generates proper heading structure
- Creates document metadata tables
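
The first item, converting plain URLs to clickable links, might look roughly like the sketch below. It is not Devour's formatter: the `linkify` helper, the regular expression, and the choice of Markdown autolinks (`<url>`) are all assumptions. URLs that already sit inside `<...>` or `[text](...)` are left alone; as a known limitation, URLs that themselves contain parentheses are cut short.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// urlRe matches http(s) URLs up to whitespace, angle brackets, parentheses,
// or square brackets.
var urlRe = regexp.MustCompile(`https?://[^\s<>()\[\]]+`)

// linkify (hypothetical helper) wraps bare URLs in Markdown autolink syntax.
func linkify(text string) string {
	var b strings.Builder
	last := 0
	for _, loc := range urlRe.FindAllStringIndex(text, -1) {
		start, end := loc[0], loc[1]
		// Keep trailing sentence punctuation outside the link.
		for end > start && strings.ContainsRune(".,;:!?", rune(text[end-1])) {
			end--
		}
		b.WriteString(text[last:start])
		if start > 0 && (text[start-1] == '<' || text[start-1] == '(') {
			// Already an autolink or the target of [text](url): copy as-is.
			b.WriteString(text[start:end])
		} else {
			b.WriteString("<" + text[start:end] + ">")
		}
		last = end
	}
	b.WriteString(text[last:])
	return b.String()
}

func main() {
	fmt.Println(linkify("See https://pkg.go.dev/net/http."))
	fmt.Println(linkify("Read the [docs](https://react.dev) first."))
}
```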

---

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                          Devour System                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐    ┌──────────┐    ┌───────────┐    ┌──────────┐   │
│  │ Scraper │───▶│ Indexer  │───▶│  Storage  │───▶│  Server  │   │
│  └─────────┘    └──────────┘    └───────────┘    └──────────┘   │
│       │              │                │               │         │
│       ▼              ▼                ▼               ▼         │
│  ┌─────────┐    ┌──────────┐    ┌───────────┐    ┌──────────┐   │
│  │ GitHub  │    │ OpenAI   │    │ Vector DB │    │   MCP    │   │
│  │ Web     │    │ Embeds   │    │ (chromem) │    │ Protocol │   │
│  │ Local   │    │          │    │           │    │          │   │
│  └─────────┘    └──────────┘    └───────────┘    └──────────┘   │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                        Scheduler                        │   │
│   │        (Auto-update every 3 days, configurable)         │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### Data Flow

```
User Query → Devour Server → Embedding Generation → Vector Search
                                                       │
                                                       ▼
AI Response ← Context Chunks ← Top-K Relevant Docs ←───┘
```
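
The vector search and top-K steps in this flow can be illustrated with plain cosine similarity over in-memory embeddings. This is a sketch under assumptions: the `Chunk` and `Scored` types and the `topK` helper are hypothetical, the three-dimensional vectors are toy data, and a real deployment would delegate the search to the vector database.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Chunk is a hypothetical indexed document chunk with its embedding.
type Chunk struct {
	Text      string
	Embedding []float64
}

// Scored pairs a chunk's text with its similarity to the query.
type Scored struct {
	Text  string
	Score float64
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// are empty, mismatched in length, or zero-valued.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK keeps chunks scoring at or above threshold and returns the k best,
// highest score first.
func topK(query []float64, chunks []Chunk, k int, threshold float64) []Scored {
	var hits []Scored
	for _, c := range chunks {
		if s := cosine(query, c.Embedding); s >= threshold {
			hits = append(hits, Scored{Text: c.Text, Score: s})
		}
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	chunks := []Chunk{
		{Text: "auth guide", Embedding: []float64{0.9, 0.1, 0.0}},
		{Text: "rate limits", Embedding: []float64{0.1, 0.9, 0.2}},
		{Text: "token refresh", Embedding: []float64{0.8, 0.3, 0.1}},
	}
	query := []float64{1, 0, 0} // toy embedding of the user query
	for _, h := range topK(query, chunks, 2, 0.7) {
		fmt.Printf("%.2f %s\n", h.Score, h.Text)
	}
}
```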

---

## CLI Reference

### Commands

| Command | Description |
|---------|-------------|
| `devour init [path]` | Initialize Devour for a project |
| `devour get <language> <keyword>` | **NEW** Quick docs fetch for popular languages |
| `devour scrape <source>` | Scrape docs from URL, repo, or path |
| `devour serve` | Start MCP server (local or remote) |
| `devour query <text>` | Search indexed documentation |
| `devour status` | Show index stats and last update |
| `devour sync` | Fetch updates from all sources |
| `devour push <path>` | Push docs to remote MCP server |

### Flags

```bash
# Global flags
--config, -c     Config file path (default: ./devour.yaml)
--verbose, -v    Enable verbose logging
--quiet, -q      Suppress non-error output

# scrape flags
--sources, -s    YAML file with source definitions
--format, -f     Output format: json, markdown (default: json)
--concurrency    Parallel scraping workers (default: 10)

# serve flags
--remote         Run as remote HTTP server
--port, -p       HTTP port (default: 8080)
--host           HTTP host (default: localhost)

# query flags
--limit, -l      Max results (default: 5)
--format, -f     Output: json, text, markdown
--threshold      Similarity threshold (default: 0.7)
```
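
To show what a `--concurrency` setting typically controls, here is a minimal worker-pool sketch. It is not Devour's scraper: `scrapeAll` is a hypothetical helper and the `fetch` callback stands in for the real scraping step.

```go
package main

import (
	"fmt"
	"sync"
)

// scrapeAll (hypothetical helper) runs fetch over every source using at most
// `concurrency` workers and returns the results in input order.
func scrapeAll(sources []string, concurrency int, fetch func(string) string) []string {
	if concurrency < 1 {
		concurrency = 1
	}
	results := make([]string, len(sources))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < concurrency; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handed to exactly one worker, so the
				// writes to results never overlap.
				results[i] = fetch(sources[i])
			}
		}()
	}

	for i := range sources {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	sources := []string{
		"https://docs.example.com",
		"https://github.com/org/repo",
		"./docs",
	}
	// Stand-in fetch function; a real one would download and parse the source.
	fetch := func(src string) string { return "scraped: " + src }

	for _, r := range scrapeAll(sources, 2, fetch) {
		fmt.Println(r)
	}
}
```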

---

## Configuration

### devour.yaml

```yaml
# Devour Configuration

# Storage paths
storage:
  docs_dir: ./devour_data/docs
  index_dir: ./devour_data/index
  metadata_dir: ./devour_data/metadata

# Embedding settings
embeddings:
  provider: openai  # openai, local
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY}  # Env var reference

# Vector database
vector_db:
  type: chromem  # chromem, weaviate, faiss
  persist: true

# Scraping settings
scraper:
  user_agent: "Devour/1.0"
  timeout: 30s
  retry_count: 3
  concurrency: 10
  rate_limit: 500ms

# Scheduler
scheduler:
  enabled: true
  interval: 72h  # Every 3 days
  check_method: hash  # hash, timestamp

# Server settings
server:
  mode: local  # local, remote
  port: 8080
  host: localhost

# Sources (for sync)
sources:
  - name: project-docs
    type: url
    url: https://docs.example.com
    include: ["**/*.md", "**/*.html"]
    exclude: ["**/api/**"]

  - name: api-spec
    type: openapi
    url: https://api.example.com/openapi.json

  - name: github-repo
    type: github
    repo: org/repo
    branch: main
    paths: ["docs/", "README.md"]
```

---

## API Reference

### MCP Tools (when running as server)

#### `devour_query`
Search indexed documentation for relevant context.

```json
{
  "query": "How do I authenticate?",
  "limit": 5,
  "threshold": 0.7
}
```

#### `devour_add`
Add documents to the index.

```json
{
  "documents": [
    {
      "content": "Document text...",
      "metadata": {
        "source": "https://...",
        "type": "markdown"
      }
    }
  ]
}
```

#### `devour_status`
Get indexing status and statistics.

### REST API (remote mode)

```
GET    /health           # Health check
GET    /status           # Index statistics
POST   /query            # Search documents
POST   /documents        # Add documents
GET    /documents        # List documents
DELETE /documents/:id    # Delete document
POST   /sync             # Trigger sync
```

---

## Integration Examples

### With OpenCode (Local Mode)

Add to your OpenCode skills:

```yaml
# ~/.opencode/skills.yaml
skills:
  - name: devour
    path: /path/to/devour
    commands:
      - devour serve
```

Then in OpenCode:
```
/devour query "authentication flow"
```

### With AI Applications

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/yourorg/devour/pkg/client"
)

func main() {
	ctx := context.Background()

	// Named c so the variable does not shadow the client package.
	c := client.New("http://localhost:8080")

	results, err := c.Query(ctx, "How do I use the API?", 5)
	if err != nil {
		log.Fatal(err)
	}

	for _, r := range results {
		// Truncate long content for display; short results are printed whole.
		preview := r.Content
		if len(preview) > 100 {
			preview = preview[:100]
		}
		fmt.Printf("Score: %.2f - %s\n", r.Score, preview)
	}
}
```

---

## Development

### Project Structure

```
devour/
├── cmd/devour/          # CLI entrypoint
│   └── main.go
├── internal/
│   ├── scraper/         # Scraping logic
│   ├── indexer/         # Embedding generation
│   ├── server/          # MCP server
│   ├── scheduler/       # Background updates
│   └── ai/              # AI integrations
├── pkg/
│   ├── client/          # Go client library
│   └── types/           # Shared types
├── devour_data/         # Default data directory
├── go.mod
├── Makefile
└── README.md
```

### Building

```bash
# Development build
go build -o devour ./cmd/devour

# Production build
CGO_ENABLED=0 go build -ldflags="-s -w" -o devour ./cmd/devour

# Run tests
go test ./...

# Run with coverage
go test -cover ./...
```

### Makefile Targets

```bash
make build      # Build binary
make test       # Run tests
make lint       # Run linter
make docker     # Build Docker image
make install    # Install locally
```

---

## Roadmap

- [ ] Local LLM support (Ollama, LocalAI)
- [ ] Multi-tenant support for remote mode
- [ ] Web UI for document management
- [ ] Git-based versioning for docs
- [ ] Plugin system for custom scrapers
- [ ] Reranking with cross-encoders

---

## Contributing

Contributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) for details.

---

## License

MIT License - see [LICENSE](LICENSE) for details.

---

<p align="center">
  <sub>Built with ❤️ for better AI context</sub>
</p>