mirror of
https://github.com/Dvorinka/facr-scraper.git
synced 2026-06-03 20:12:57 +00:00
feat(scraper): implement CloakBrowser support and enhance request stealth
Integrate CloakBrowser to improve success rates against Cloudflare challenges and implement more robust request handling in the Go backend.

- Add CloakBrowser integration to Dockerfile and requirements
- Implement domain-specific request semaphores in Go to prevent rate-limiting
- Add shared HTTP client with cookie jar and header preservation for better session management
- Enhance request headers in Go to include modern client hints (Sec-Ch-Ua)
- Add benchmarking scripts to compare fetch methods (urllib vs Scrapling vs CloakBrowser)
- Update docker-compose to support CloakBrowser environment variables
- Optimize Docker image by pre-downloading patched Chromium binaries
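The domain-specific semaphores mentioned in the message live in the Go backend, which this page does not show. As a rough illustration of the idea only, here is a minimal Python sketch that caps concurrent requests per host. The class name `DomainLimiter`, the `slot` method, and the default limit of 2 are hypothetical and are not taken from the commit.

```python
import threading
from contextlib import contextmanager
from urllib.parse import urlsplit

# Assumed default; the commit does not state the real per-domain limit.
MAX_PER_DOMAIN = 2


class DomainLimiter:
    """Hypothetical helper: one bounded semaphore per hostname."""

    def __init__(self, limit: int = MAX_PER_DOMAIN) -> None:
        self._limit = limit
        self._lock = threading.Lock()  # guards the semaphore table
        self._semaphores: dict[str, threading.BoundedSemaphore] = {}

    def _semaphore_for(self, url: str) -> threading.BoundedSemaphore:
        host = urlsplit(url).hostname or ""
        with self._lock:
            sem = self._semaphores.get(host)
            if sem is None:
                sem = threading.BoundedSemaphore(self._limit)
                self._semaphores[host] = sem
            return sem

    @contextmanager
    def slot(self, url: str):
        """Block until a request slot for the URL's host is free."""
        sem = self._semaphore_for(url)
        sem.acquire()
        try:
            yield
        finally:
            sem.release()
```

A caller would wrap each fetch in `with limiter.slot(url): ...`, so requests to one host queue up while requests to other hosts proceed independently.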
@@ -82,9 +82,11 @@ def scrapling_fetch(url: str, referer: str = "", timeout_ms: int = 30000, wait_m
     if referer:
         extra_headers["Referer"] = referer

+    # Increase challenge-solving timeout; network_idle can interfere with
+    # ongoing Cloudflare polling so we disable it.
     fetch_kwargs = {
         "headless": True,
-        "network_idle": True,
+        "network_idle": False,
         "google_search": False,
         "solve_cloudflare": True,
         "timeout": timeout_ms,
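To make the effect of the hunk easier to see in isolation, the following is a minimal sketch of the resulting option dictionary. The helper name `build_fetch_kwargs` is hypothetical, and passing the headers under an `extra_headers` key is an assumption, since the hunk is cut off before the dictionary is closed or used.

```python
def build_fetch_kwargs(referer: str = "", timeout_ms: int = 30000) -> dict:
    """Hypothetical helper mirroring the options visible in the hunk."""
    extra_headers = {}
    if referer:
        extra_headers["Referer"] = referer

    fetch_kwargs = {
        "headless": True,
        # Disabled so that waiting for network idle does not interfere
        # with ongoing Cloudflare polling (per the comment in the diff).
        "network_idle": False,
        "google_search": False,
        "solve_cloudflare": True,
        "timeout": timeout_ms,
    }
    # Assumption: the real function may attach headers differently.
    if extra_headers:
        fetch_kwargs["extra_headers"] = extra_headers
    return fetch_kwargs
```

For example, `build_fetch_kwargs(referer="https://example.com/")` yields a dictionary with `network_idle` set to `False` and the referer carried in `extra_headers`.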