mirror of
https://github.com/Dvorinka/MyClubServer.git
synced 2026-06-04 02:32:57 +00:00
205 lines
5.4 KiB
Markdown
# Logo API Request Storm Fix

## Problem Summary

The admin page triggered a request storm of **13,000+ opponent PNG requests** to `logoapi.sportcreative.eu`, causing performance issues and potential API abuse.

## Root Cause Analysis

### Primary Cause
The **Teams Admin page** (`/admin/teams`) was fetching logos for ALL teams from ALL competitions simultaneously without any rate limiting:

```typescript
// BEFORE: Uncontrolled concurrent requests
await Promise.all(
  teamIds.map(async (id) => {
    const url = await fetchLogoFromLogoAPI(id); // Individual request per team
    if (url) logos[id] = url;
  })
);
```

### Contributing Factors
1. **No Rate Limiting**: All requests fired simultaneously via `Promise.all()`
2. **Large Competition Data**: Multiple competitions with hundreds of teams each
3. **Ineffective Caching**: The cache was bypassed or cleared frequently
4. **No Volume Monitoring**: Request counts were not tracked
5. **No Circuit Breaker**: Requests kept firing even after repeated failures

## Implemented Solutions

### 1. Rate Limiting & Batching
```typescript
// AFTER: Controlled batch processing
const BATCH_SIZE = 10;
const RATE_LIMIT_DELAY = 100; // ms between batches

for (let i = 0; i < teamIds.length; i += BATCH_SIZE) {
  const batch = teamIds.slice(i, i + BATCH_SIZE);
  // Process batch concurrently (at most BATCH_SIZE requests in flight)
  await Promise.all(
    batch.map(async (id) => {
      const url = await fetchLogoFromLogoAPI(id);
      if (url) logos[id] = url;
    })
  );
  // Pause before starting the next batch
  await new Promise(resolve => setTimeout(resolve, RATE_LIMIT_DELAY));
}
```

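The batching loop can also be packaged as a reusable helper. The following is a minimal sketch rather than the code in `sportLogosAPI.ts`: the names `fetchLogosInBatches`, `LogoFetcher`, and `sleep` are hypothetical, and the fetcher is passed in as a parameter so the helper can be exercised without touching the real API.

```typescript
// Hypothetical type: any function that resolves a team ID to a logo URL (or null).
type LogoFetcher = (id: string) => Promise<string | null>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: fetches logos in fixed-size batches with a pause between batches.
async function fetchLogosInBatches(
  teamIds: string[],
  fetchLogo: LogoFetcher,
  batchSize: number = 10,
  delayMs: number = 100,
): Promise<Record<string, string>> {
  const logos: Record<string, string> = {};
  for (let i = 0; i < teamIds.length; i += batchSize) {
    const batch = teamIds.slice(i, i + batchSize);
    // Only the requests of the current batch run concurrently
    await Promise.all(
      batch.map(async (id) => {
        try {
          const url = await fetchLogo(id);
          if (url) logos[id] = url;
        } catch {
          // A failed logo is skipped so it does not reject the whole batch
        }
      }),
    );
    // No pause is needed after the final batch
    if (i + batchSize < teamIds.length) await sleep(delayMs);
  }
  return logos;
}
```

Catching errors per item is a design choice in this sketch: with a bare `Promise.all`, one rejected request would discard the results of the whole batch.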
### 2. Safety Limits
```typescript
// Prevent excessive requests
const MAX_TEAM_IDS = 500;
if (teamIds.size > MAX_TEAM_IDS) {
  console.warn(`Too many team IDs (${teamIds.size}). Limiting to ${MAX_TEAM_IDS}.`);
  // Limit to first 500 IDs
}
```

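The snippet above leaves the actual truncation as a comment. One way it could be done is sketched below, assuming `teamIds` is a `Set<string>` (as the `.size` check suggests); the helper name `capTeamIds` is hypothetical and not taken from `TeamsAdminPage.tsx`.

```typescript
const MAX_TEAM_IDS = 500;

// Hypothetical helper: returns at most `limit` IDs, preserving the Set's insertion order.
function capTeamIds(teamIds: Set<string>, limit: number = MAX_TEAM_IDS): string[] {
  const ids = Array.from(teamIds);
  if (ids.length > limit) {
    console.warn(`Too many team IDs (${ids.length}). Limiting to ${limit}.`);
    return ids.slice(0, limit);
  }
  return ids;
}
```

Returning an array keeps the result directly usable by a batching loop that relies on `slice()`.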
### 3. Circuit Breaker Pattern
```typescript
// Circuit breaker state
let circuitBreakerState = {
  failures: 0,
  maxFailures: 5,
  resetTimeout: 60000, // 1 minute
};

// Skip requests if circuit breaker is open
if (isCircuitBreakerOpen()) {
  return null;
}
```

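The excerpt does not show how `isCircuitBreakerOpen()` decides. A minimal sketch of one possible implementation follows. The `openedAt` field, the `recordFailure`/`recordSuccess` helpers, and the injectable `now` parameter are assumptions added for illustration; only `failures`, `maxFailures`, and `resetTimeout` come from the state object above.

```typescript
interface CircuitBreakerState {
  failures: number;
  maxFailures: number;
  resetTimeout: number; // ms
  openedAt: number | null; // hypothetical field: timestamp at which the breaker tripped
}

const circuitBreakerState: CircuitBreakerState = {
  failures: 0,
  maxFailures: 5,
  resetTimeout: 60000, // 1 minute
  openedAt: null,
};

function isCircuitBreakerOpen(now: number = Date.now()): boolean {
  if (circuitBreakerState.openedAt === null) return false;
  if (now - circuitBreakerState.openedAt >= circuitBreakerState.resetTimeout) {
    // Reset window elapsed: close the breaker and start counting afresh
    circuitBreakerState.failures = 0;
    circuitBreakerState.openedAt = null;
    return false;
  }
  return true;
}

function recordFailure(now: number = Date.now()): void {
  circuitBreakerState.failures += 1;
  const tripped = circuitBreakerState.failures >= circuitBreakerState.maxFailures;
  // Trip only once, so late failures do not extend the open period
  if (tripped && circuitBreakerState.openedAt === null) {
    circuitBreakerState.openedAt = now;
  }
}

function recordSuccess(): void {
  // Only consecutive failures count, so any success clears the counter
  circuitBreakerState.failures = 0;
}
```

A caller would check `isCircuitBreakerOpen()` before each request and call `recordFailure()` or `recordSuccess()` once the request settles.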
### 4. Request Monitoring
```typescript
// Track request volume
let requestStats = {
  totalRequests: 0,
  windowMs: 60000, // 1 minute window
};

// Skip if volume too high
if (requestStats.totalRequests > 200) {
  console.warn(`Request volume too high. Skipping batch fetch.`);
  return {};
}
```

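For the counter to act as a per-minute limit, it has to reset when the window rolls over, which the excerpt does not show. The following sketch uses a simple fixed window. The `windowStart` field, the `tryRecordRequest` name, and the `MAX_REQUESTS_PER_WINDOW` constant are assumptions for illustration; the 200-request threshold and the 60-second window come from the snippet above.

```typescript
const MAX_REQUESTS_PER_WINDOW = 200;

const requestStats = {
  totalRequests: 0,
  windowStart: 0, // hypothetical field: timestamp at which the current window began
  windowMs: 60000, // 1 minute window
};

// Counts one request; returns false once the current window's budget is spent.
function tryRecordRequest(now: number = Date.now()): boolean {
  if (now - requestStats.windowStart >= requestStats.windowMs) {
    // Window rolled over: start a fresh count
    requestStats.windowStart = now;
    requestStats.totalRequests = 0;
  }
  if (requestStats.totalRequests >= MAX_REQUESTS_PER_WINDOW) return false;
  requestStats.totalRequests += 1;
  return true;
}
```

A fixed window is the simplest option, but it can admit a burst of up to twice the limit across a window boundary. A sliding window would be stricter at the cost of storing timestamps.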
### 5. Enhanced Caching
- Cache-first approach with `getCachedLogo()` check
- Automatic cache cleanup after 30 days
- IndexedDB for persistent storage
- 7-day cache duration

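The cache-first flow can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: an in-memory `Map` stands in for IndexedDB, the signatures of `getCachedLogo`, `setCachedLogo`, and `cleanupLogoCache` are guesses, and the sketch assumes the two durations mean "treat entries older than 7 days as stale, delete entries older than 30 days".

```typescript
const CACHE_DURATION_MS = 7 * 24 * 60 * 60 * 1000; // 7-day cache duration
const CLEANUP_AGE_MS = 30 * 24 * 60 * 60 * 1000; // cleanup after 30 days

interface CachedLogo {
  url: string;
  storedAt: number; // timestamp in ms
}

// In-memory stand-in for the IndexedDB store (assumption for this sketch)
const logoStore = new Map<string, CachedLogo>();

// Returns the cached URL if it exists and is still fresh, otherwise null.
function getCachedLogo(teamId: string, now: number = Date.now()): string | null {
  const entry = logoStore.get(teamId);
  if (!entry) return null;
  return now - entry.storedAt < CACHE_DURATION_MS ? entry.url : null;
}

function setCachedLogo(teamId: string, url: string, now: number = Date.now()): void {
  logoStore.set(teamId, { url, storedAt: now });
}

// Deletes entries older than the cleanup age; returns how many were removed.
function cleanupLogoCache(now: number = Date.now()): number {
  let removed = 0;
  logoStore.forEach((entry, teamId) => {
    if (now - entry.storedAt > CLEANUP_AGE_MS) {
      logoStore.delete(teamId);
      removed += 1;
    }
  });
  return removed;
}
```

With this shape, a fetch routine would call `getCachedLogo()` first and only hit the API (followed by `setCachedLogo()`) on a miss.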
### 6. Request Timeouts
```typescript
// 5-second timeout per request
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), 5000);
```

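The timeout only takes effect if the controller's signal is handed to the request and the timer is cleared afterwards. A minimal sketch of that wiring follows; the wrapper name `withTimeout` is hypothetical, and the operation is passed in as a callback so the same wrapper works with `fetch` or with a stub.

```typescript
// Hypothetical wrapper: runs `operation` with an AbortSignal that fires after `timeoutMs`.
async function withTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number = 5000,
): Promise<T> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await operation(controller.signal);
  } finally {
    // Clear the timer so it does not linger once the request settles
    clearTimeout(timeoutId);
  }
}

// Usage with fetch (logoUrl is a placeholder):
//   const response = await withTimeout((signal) => fetch(logoUrl, { signal }));
```

When the signal aborts, `fetch` rejects, so callers should treat the rejection as a failed request.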
## Files Modified

### Frontend
- `/frontend/src/utils/sportLogosAPI.ts`
  - Added rate limiting and batching
  - Implemented circuit breaker pattern
  - Added request monitoring
  - Enhanced caching logic
  - Added request timeouts

- `/frontend/src/pages/admin/TeamsAdminPage.tsx`
  - Added safety limits (`MAX_TEAM_IDS = 500`)
  - Added request statistics display
  - Enhanced error handling

## Performance Improvements

### Before Fix
- **13,000+ concurrent requests**
- **No rate limiting**
- **No failure protection**
- **Cache bypass issues**

### After Fix
- **Maximum 10 concurrent requests**
- **100ms delay between batches**
- **Circuit breaker after 5 failures**
- **200 requests/minute hard limit**
- **500 team IDs maximum per fetch**
- **5-second timeout per request**

## Monitoring Features

### Request Statistics
The admin page now shows:
- Number of logos fetched
- Current request volume (requests/minute)
- Cache hit ratio

### Console Warnings
- High request volume alerts (>100/min)
- Circuit breaker activation
- Team ID limiting warnings
- Cache miss warnings

## Prevention Measures

### Automatic Protections
1. **Rate Limiting**: 10 concurrent requests max
2. **Volume Limits**: 200 requests/minute max
3. **Circuit Breaker**: Stops after 5 consecutive failures
4. **Safety Caps**: Maximum 500 team IDs per operation
5. **Timeouts**: 5-second limit per request

### Admin Visibility
- Real-time request statistics in Teams Admin page
- Console warnings for abnormal activity
- Clear error messages for failures

## Testing Recommendations

### Load Testing
To test with large competition data:

1. Navigate to `/admin/teams`
2. Monitor the network tab for request patterns
3. Verify that request volume stays under the limits
4. Check the console for warnings

### Circuit Breaker Testing
To simulate API failures:

1. Block `logoapi.sportcreative.eu` at the network level
2. Trigger a logo fetch in the admin page
3. Verify that the circuit breaker opens after 5 failures
4. Verify that requests stop while the breaker is open
5. Verify that the breaker resets after 1 minute

## Future Enhancements

### Backend Caching (Recommended)
- Implement server-side logo caching
- Use CDN for logo distribution
- Add logo pre-fetching during off-peak hours

### Advanced Monitoring
- Add Prometheus metrics for request tracking
- Implement alerting for high request volumes
- Add request pattern analysis

### User Experience
- Add manual refresh controls
- Implement progressive loading
- Add request queue status indicator

## Conclusion

The implemented fixes provide multiple layers of protection against request storms:

1. **Rate Limiting** prevents overwhelming the API
2. **Circuit Breaker** stops cascading failures
3. **Volume Monitoring** provides visibility and control
4. **Enhanced Caching** reduces unnecessary requests
5. **Safety Limits** prevent extreme scenarios

The system is now resilient against similar request storms while maintaining good performance for normal usage.