mirror of
https://github.com/Dvorinka/Trackeep.git
synced 2026-06-04 12:32:58 +00:00
b083dac3f0
Add real API support in demo mode with credential checking, implement build-time version injection from package.json, and refactor update checking with 24-hour caching. Migrate landing page from Vue to Astro with comprehensive UI components including Hero, Features, Benefits, and Tech Stack sections. Update CI/CD workflow with expanded cache paths and security scanner version pinned.
212 lines
5.4 KiB
Markdown
212 lines
5.4 KiB
Markdown
# Enhanced Favicon Fetching System
|
|
|
|
This document describes the enhanced favicon fetching system implemented to address issues with favicons not being found when they're located in non-standard paths.
|
|
|
|
## Overview
|
|
|
|
The enhanced favicon fetching system (`favicon_fetcher.go`) provides comprehensive favicon detection by:
|
|
|
|
1. **HTML Head Parsing**: Extracts favicon URLs from the `<head>` section of HTML pages
|
|
2. **Multiple Pattern Matching**: Supports various favicon declaration formats
|
|
3. **Common Location Checking**: Tests standard favicon file paths
|
|
4. **Fallback Services**: Uses Google's favicon service as a reliable fallback
|
|
|
|
## Key Features
|
|
|
|
### Comprehensive Pattern Detection
|
|
|
|
The system detects favicons declared in multiple ways:
|
|
|
|
```html
|
|
<!-- Standard favicon -->
|
|
<link rel="icon" href="/favicon.ico">
|
|
<link rel="shortcut icon" href="/favicon.ico">
|
|
|
|
<!-- Apple touch icons -->
|
|
<link rel="apple-touch-icon" href="/apple-touch-icon.png">
|
|
<link rel="apple-touch-icon-precomposed" href="/apple-touch-icon-precomposed.png">
|
|
|
|
<!-- Android icons -->
|
|
<link rel="android-chrome-192x192" href="/android-chrome-192x192.png">
|
|
|
|
<!-- Microsoft tiles -->
|
|
<meta name="msapplication-TileImage" content="/mstile-144x144.png">
|
|
|
|
<!-- Open Graph images (fallback) -->
|
|
<meta property="og:image" content="/logo.png">
|
|
|
|
<!-- Twitter images (fallback) -->
|
|
<meta name="twitter:image" content="/logo.png">
|
|
```
|
|
|
|
### Common Location Testing
|
|
|
|
The system tests these common favicon paths:
|
|
|
|
- `/favicon.ico`, `/favicon.png`, `/favicon.svg`
|
|
- `/apple-touch-icon.png`, `/apple-touch-icon-precomposed.png`
|
|
- `/android-chrome-192x192.png`
|
|
- `/icon.png`, `/icon.svg`, `/logo.png`, `/logo.svg`
|
|
- `/assets/favicon.ico`, `/static/favicon.ico`, `/images/favicon.ico`
|
|
- And more...
|
|
|
|
### Smart URL Resolution
|
|
|
|
The system properly handles:
|
|
- Absolute URLs (`https://cdn.example.com/favicon.ico`)
|
|
- Protocol-relative URLs (`//cdn.example.com/favicon.ico`)
|
|
- Root-relative URLs (`/favicon.ico`)
|
|
- Relative URLs (`../favicon.ico`, `favicon.ico`)
|
|
|
|
## Usage
|
|
|
|
### Basic Usage
|
|
|
|
```go
|
|
import "github.com/trackeep/backend/services"
|
|
|
|
// Get the best available favicon
|
|
favicon, err := services.GetFavicon("https://example.com")
|
|
if err != nil {
|
|
log.Printf("Error fetching favicon: %v", err)
|
|
}
|
|
fmt.Printf("Favicon: %s\n", favicon)
|
|
```
|
|
|
|
### Multiple Candidates
|
|
|
|
```go
|
|
// Get multiple favicon candidates
|
|
favicons := services.GetAllFavicons("https://example.com", 5)
|
|
for i, favicon := range favicons {
|
|
fmt.Printf("%d. %s\n", i+1, favicon)
|
|
}
|
|
```
|
|
|
|
### Advanced Usage
|
|
|
|
```go
|
|
fetcher := services.NewFaviconFetcher()
|
|
favicon, err := fetcher.FetchFavicon("https://example.com")
|
|
if err != nil {
|
|
// Handle error
|
|
}
|
|
```
|
|
|
|
## Integration with Metadata Service
|
|
|
|
The favicon fetcher is integrated into the existing metadata service:
|
|
|
|
```go
|
|
metadata, err := services.FetchWebsiteMetadata("https://example.com")
|
|
if err == nil {
|
|
fmt.Printf("Favicon: %s\n", metadata.Favicon)
|
|
}
|
|
```
|
|
|
|
## Performance Considerations
|
|
|
|
- **Timeout**: 10 seconds per HTTP request
|
|
- **Caching**: Consider implementing caching for frequently accessed favicons
|
|
- **Concurrency**: The system is designed to be thread-safe
|
|
- **Verification**: Each favicon candidate is verified with a HEAD request
|
|
|
|
## Testing
|
|
|
|
### CLI Testing Tool
|
|
|
|
Use the CLI tool to test favicon fetching:
|
|
|
|
```bash
|
|
cd backend
|
|
go run tools/favicon_cli.go https://github.com
|
|
```
|
|
|
|
### Unit Tests
|
|
|
|
Run the unit tests:
|
|
|
|
```bash
|
|
cd backend
|
|
go test ./services -v -run TestFaviconFetcher
|
|
```
|
|
|
|
### Benchmark Tests
|
|
|
|
Run benchmark tests:
|
|
|
|
```bash
|
|
cd backend
|
|
go test ./services -bench=BenchmarkFaviconFetch
|
|
```
|
|
|
|
## Error Handling
|
|
|
|
The system gracefully handles:
|
|
|
|
- Network timeouts and connection errors
|
|
- Invalid URLs
|
|
- HTTP errors (4xx, 5xx responses)
|
|
- Malformed HTML
|
|
- Missing favicons (falls back to Google's service)
|
|
|
|
## Fallback Strategy
|
|
|
|
1. **HTML Extraction**: Try to extract from HTML head
|
|
2. **Common Paths**: Test standard favicon locations
|
|
3. **Google Service**: Use `https://www.google.com/s2/favicons?domain=example.com&sz=128`
|
|
|
|
## Configuration
|
|
|
|
The system uses these default settings:
|
|
|
|
- HTTP Timeout: 10 seconds
|
|
- User Agent: Chrome browser string
|
|
- Max Results: 5 (for multiple favicon fetching)
|
|
- Google Favicon Size: 128px
|
|
|
|
## Troubleshooting
|
|
|
|
### Common Issues
|
|
|
|
1. **Rate Limiting**: Some sites may block frequent requests
|
|
2. **HTTPS Issues**: Certificate validation problems
|
|
3. **Redirects**: The system follows redirects automatically
|
|
4. **Large Pages**: Only the head section is parsed for efficiency
|
|
|
|
### Debug Mode
|
|
|
|
Enable debug logging by modifying the log level in the fetcher:
|
|
|
|
```go
|
|
fetcher := services.NewFaviconFetcher()
|
|
// Add debug logging as needed
|
|
```
|
|
|
|
## Future Improvements
|
|
|
|
Potential enhancements:
|
|
|
|
1. **Persistent Caching**: Implement Redis or database caching
|
|
2. **Async Fetching**: Background favicon updates
|
|
3. **Image Processing**: Size optimization and format conversion
|
|
4. **Domain Whitelisting**: Prioritize certain domains
|
|
5. **Machine Learning**: Predict best favicon based on site structure
|
|
|
|
## Migration from Old System
|
|
|
|
The new system is backward compatible. Simply replace calls to the old favicon extraction:
|
|
|
|
```go
|
|
// Old way (still works but less comprehensive)
|
|
metadata.Favicon = extractFavicon(content, parsedURL)
|
|
|
|
// New way (recommended)
|
|
favicon, err := GetFavicon(targetURL)
|
|
if err == nil {
|
|
metadata.Favicon = favicon
|
|
}
|
|
```
|
|
|
|
The enhanced system provides significantly better favicon detection rates, especially for modern web applications that use non-standard favicon locations.
|