Add real API support in demo mode with credential checking, implement build-time version injection from package.json, and refactor update checking with 24-hour caching. Migrate landing page from Vue to Astro with comprehensive UI components including Hero, Features, Benefits, and Tech Stack sections. Update CI/CD workflow with expanded cache paths and security scanner version pinned.
5.4 KiB
Enhanced Favicon Fetching System
This document describes the enhanced favicon fetching system implemented to address issues with favicons not being found when they're located in non-standard paths.
Overview
The enhanced favicon fetching system (favicon_fetcher.go) provides comprehensive favicon detection by:
- HTML Head Parsing: Extracts favicon URLs from the
<head>section of HTML pages - Multiple Pattern Matching: Supports various favicon declaration formats
- Common Location Checking: Tests standard favicon file paths
- Fallback Services: Uses Google's favicon service as a reliable fallback
Key Features
Comprehensive Pattern Detection
The system detects favicons declared in multiple ways:
<!-- Standard favicon -->
<link rel="icon" href="/favicon.ico">
<link rel="shortcut icon" href="/favicon.ico">
<!-- Apple touch icons -->
<link rel="apple-touch-icon" href="/apple-touch-icon.png">
<link rel="apple-touch-icon-precomposed" href="/apple-touch-icon-precomposed.png">
<!-- Android icons -->
<link rel="android-chrome-192x192" href="/android-chrome-192x192.png">
<!-- Microsoft tiles -->
<meta name="msapplication-TileImage" content="/mstile-144x144.png">
<!-- Open Graph images (fallback) -->
<meta property="og:image" content="/logo.png">
<!-- Twitter images (fallback) -->
<meta name="twitter:image" content="/logo.png">
Common Location Testing
The system tests these common favicon paths:
/favicon.ico,/favicon.png,/favicon.svg/apple-touch-icon.png,/apple-touch-icon-precomposed.png/android-chrome-192x192.png/icon.png,/icon.svg,/logo.png,/logo.svg/assets/favicon.ico,/static/favicon.ico,/images/favicon.ico- And more...
Smart URL Resolution
The system properly handles:
- Absolute URLs (
https://cdn.example.com/favicon.ico) - Protocol-relative URLs (
//cdn.example.com/favicon.ico) - Root-relative URLs (
/favicon.ico) - Relative URLs (
../favicon.ico,favicon.ico)
Usage
Basic Usage
import "github.com/trackeep/backend/services"
// Get the best available favicon
favicon, err := services.GetFavicon("https://example.com")
if err != nil {
log.Printf("Error fetching favicon: %v", err)
}
fmt.Printf("Favicon: %s\n", favicon)
Multiple Candidates
// Get multiple favicon candidates
favicons := services.GetAllFavicons("https://example.com", 5)
for i, favicon := range favicons {
fmt.Printf("%d. %s\n", i+1, favicon)
}
Advanced Usage
fetcher := services.NewFaviconFetcher()
favicon, err := fetcher.FetchFavicon("https://example.com")
if err != nil {
// Handle error
}
Integration with Metadata Service
The favicon fetcher is integrated into the existing metadata service:
metadata, err := services.FetchWebsiteMetadata("https://example.com")
if err == nil {
fmt.Printf("Favicon: %s\n", metadata.Favicon)
}
Performance Considerations
- Timeout: 10 seconds per HTTP request
- Caching: Consider implementing caching for frequently accessed favicons
- Concurrency: The system is designed to be thread-safe
- Verification: Each favicon candidate is verified with a HEAD request
Testing
CLI Testing Tool
Use the CLI tool to test favicon fetching:
cd backend
go run tools/favicon_cli.go https://github.com
Unit Tests
Run the unit tests:
cd backend
go test ./services -v -run TestFaviconFetcher
Benchmark Tests
Run benchmark tests:
cd backend
go test ./services -bench=BenchmarkFaviconFetch
Error Handling
The system gracefully handles:
- Network timeouts and connection errors
- Invalid URLs
- HTTP errors (4xx, 5xx responses)
- Malformed HTML
- Missing favicons (falls back to Google's service)
Fallback Strategy
- HTML Extraction: Try to extract from HTML head
- Common Paths: Test standard favicon locations
- Google Service: Use
https://www.google.com/s2/favicons?domain=example.com&sz=128
Configuration
The system uses these default settings:
- HTTP Timeout: 10 seconds
- User Agent: Chrome browser string
- Max Results: 5 (for multiple favicon fetching)
- Google Favicon Size: 128px
Troubleshooting
Common Issues
- Rate Limiting: Some sites may block frequent requests
- HTTPS Issues: Certificate validation problems
- Redirects: The system follows redirects automatically
- Large Pages: Only the head section is parsed for efficiency
Debug Mode
Enable debug logging by modifying the log level in the fetcher:
fetcher := services.NewFaviconFetcher()
// Add debug logging as needed
Future Improvements
Potential enhancements:
- Persistent Caching: Implement Redis or database caching
- Async Fetching: Background favicon updates
- Image Processing: Size optimization and format conversion
- Domain Whitelisting: Prioritize certain domains
- Machine Learning: Predict best favicon based on site structure
Migration from Old System
The new system is backward compatible. Simply replace calls to the old favicon extraction:
// Old way (still works but less comprehensive)
metadata.Favicon = extractFavicon(content, parsedURL)
// New way (recommended)
favicon, err := GetFavicon(targetURL)
if err == nil {
metadata.Favicon = favicon
}
The enhanced system provides significantly better favicon detection rates, especially for modern web applications that use non-standard favicon locations.