mirror of
https://github.com/Dvorinka/MyClubServer.git
synced 2026-06-03 18:22:57 +00:00
hot fix #1
@@ -0,0 +1,204 @@
# Logo API Request Storm Fix

## Problem Summary

The admin page experienced a massive request storm with **13,000+ opponent PNG requests** to `logoapi.sportcreative.eu`, causing performance issues and potential API abuse.

## Root Cause Analysis

### Primary Cause

The **Teams Admin page** (`/admin/teams`) was fetching logos for ALL teams from ALL competitions simultaneously without any rate limiting:

```typescript
// BEFORE: Uncontrolled concurrent requests
await Promise.all(
  teamIds.map(async (id) => {
    const url = await fetchLogoFromLogoAPI(id); // Individual request per team
    if (url) logos[id] = url;
  })
);
```

### Contributing Factors

1. **No Rate Limiting**: All requests fired simultaneously via `Promise.all()`
2. **Large Competition Data**: Multiple competitions with hundreds of teams each
3. **Ineffective Caching**: The cache was bypassed or cleared frequently
4. **No Volume Monitoring**: Request counts were not tracked
5. **No Circuit Breaker**: Requests kept being sent even while the API was failing

## Implemented Solutions

### 1. Rate Limiting & Batching

```typescript
// AFTER: Controlled batch processing
const BATCH_SIZE = 10;
const RATE_LIMIT_DELAY = 100; // ms between batches

for (let i = 0; i < teamIds.length; i += BATCH_SIZE) {
  const batch = teamIds.slice(i, i + BATCH_SIZE);
  // Process batch concurrently (request code omitted here)
  // Then pause before starting the next batch
  await new Promise(resolve => setTimeout(resolve, RATE_LIMIT_DELAY));
}
```

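
The batching loop above leaves out the per-batch request code. A minimal sketch of how the full loop could look is given here. The `LogoFetcher` type, the `fetchLogosInBatches` name, and the decision to swallow individual failures are assumptions made for illustration, not the code from `sportLogosAPI.ts`.

```typescript
// Hypothetical signature: resolves to a logo URL, or null when none exists.
type LogoFetcher = (teamId: string) => Promise<string | null>;

const BATCH_SIZE = 10;
const RATE_LIMIT_DELAY = 100; // ms between batches

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function fetchLogosInBatches(
  teamIds: string[],
  fetchLogo: LogoFetcher,
  batchSize: number = BATCH_SIZE,
  delayMs: number = RATE_LIMIT_DELAY,
): Promise<Record<string, string>> {
  const logos: Record<string, string> = {};

  for (let i = 0; i < teamIds.length; i += batchSize) {
    const batch = teamIds.slice(i, i + batchSize);

    // At most `batchSize` requests are in flight at any time.
    await Promise.all(
      batch.map(async (id) => {
        try {
          const url = await fetchLogo(id);
          if (url) logos[id] = url;
        } catch {
          // Assumption: a single failed logo should not abort the batch.
        }
      }),
    );

    // Pause only when another batch follows.
    if (i + batchSize < teamIds.length) {
      await sleep(delayMs);
    }
  }

  return logos;
}
```

Passing the fetcher in as a parameter keeps the pacing logic testable without touching the real API; in the application it would presumably be `fetchLogoFromLogoAPI`.
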
### 2. Safety Limits

```typescript
// Prevent excessive requests
const MAX_TEAM_IDS = 500;

if (teamIds.size > MAX_TEAM_IDS) {
  console.warn(`Too many team IDs (${teamIds.size}). Limiting to ${MAX_TEAM_IDS}.`);
  // Limit to first 500 IDs
}
```

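
The snippet above only hints at the truncation step. One way it might be done, assuming `teamIds` is a `Set<string>` (suggested by the use of `.size`) and using a hypothetical helper name `limitTeamIds`:

```typescript
const MAX_TEAM_IDS = 500;

// Hypothetical helper: returns at most `max` IDs, keeping the Set's
// insertion order so the same teams are chosen on every run.
function limitTeamIds(teamIds: Set<string>, max: number = MAX_TEAM_IDS): string[] {
  if (teamIds.size > max) {
    console.warn(`Too many team IDs (${teamIds.size}). Limiting to ${max}.`);
  }
  return Array.from(teamIds).slice(0, max);
}
```

Teams beyond the cap simply get no logo in this sketch; whether that is acceptable or whether they should be loaded later is a product decision the write-up does not settle.
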
### 3. Circuit Breaker Pattern

```typescript
// Circuit breaker state
let circuitBreakerState = {
  failures: 0,
  maxFailures: 5,
  resetTimeout: 60000, // 1 minute
};

// Skip requests if circuit breaker is open
if (isCircuitBreakerOpen()) {
  return null;
}
```

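
The snippet does not show how `isCircuitBreakerOpen()` decides. A minimal sketch of one possible implementation follows; the `openedAt` field, the `recordFailure` and `recordSuccess` helpers, and the injectable `now` parameter are assumptions added for illustration.

```typescript
interface CircuitBreakerState {
  failures: number;
  maxFailures: number;
  resetTimeout: number; // ms
  openedAt: number | null; // hypothetical: timestamp when the breaker tripped
}

const circuitBreakerState: CircuitBreakerState = {
  failures: 0,
  maxFailures: 5,
  resetTimeout: 60000, // 1 minute
  openedAt: null,
};

function isCircuitBreakerOpen(now: number = Date.now()): boolean {
  const state = circuitBreakerState;
  if (state.openedAt === null) return false;

  if (now - state.openedAt >= state.resetTimeout) {
    // Reset timeout elapsed: close the breaker and allow requests again.
    state.failures = 0;
    state.openedAt = null;
    return false;
  }
  return true;
}

// Call after every failed request.
function recordFailure(now: number = Date.now()): void {
  const state = circuitBreakerState;
  state.failures += 1;
  if (state.failures >= state.maxFailures && state.openedAt === null) {
    state.openedAt = now;
  }
}

// Call after every successful request so only consecutive failures count.
function recordSuccess(): void {
  circuitBreakerState.failures = 0;
}
```

This version reopens fully after the timeout. A stricter variant would allow a single trial request first (a "half-open" state), which the write-up does not mention.
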
### 4. Request Monitoring

```typescript
// Track request volume
let requestStats = {
  totalRequests: 0,
  windowMs: 60000, // 1 minute window
};

// Skip if volume too high
if (requestStats.totalRequests > 200) {
  console.warn(`Request volume too high. Skipping batch fetch.`);
  return {};
}
```

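
For the check above to mean "per minute", the counter has to be reset when the window expires. A sketch of a fixed-window counter that does this is shown below; `windowStart`, `rollWindow`, `recordRequest`, `isVolumeTooHigh`, and the `MAX_REQUESTS_PER_WINDOW` constant are hypothetical names, and the real module may use a different windowing scheme.

```typescript
const MAX_REQUESTS_PER_WINDOW = 200;

const requestStats = {
  totalRequests: 0,
  windowStart: 0, // hypothetical: start of the current window (ms epoch)
  windowMs: 60000, // 1 minute window
};

// Start a fresh window once the current one has expired.
function rollWindow(now: number): void {
  if (now - requestStats.windowStart >= requestStats.windowMs) {
    requestStats.windowStart = now;
    requestStats.totalRequests = 0;
  }
}

// Call once per outgoing logo request.
function recordRequest(now: number = Date.now()): void {
  rollWindow(now);
  requestStats.totalRequests += 1;
}

// True when the current window has exceeded the limit.
function isVolumeTooHigh(now: number = Date.now()): boolean {
  rollWindow(now);
  return requestStats.totalRequests > MAX_REQUESTS_PER_WINDOW;
}
```

A fixed window is the simplest option but allows a burst at the boundary between two windows; a sliding window would be smoother at the cost of storing timestamps.
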
### 5. Enhanced Caching

- Cache-first approach with `getCachedLogo()` check
- Automatic cache cleanup after 30 days
- IndexedDB for persistent storage
- 7-day cache duration

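
The list above names two different periods. One plausible reading, assumed here and not confirmed by the source, is that an entry counts as fresh for 7 days and is physically deleted once it is older than 30 days. The sketch below illustrates that reading with an in-memory `Map` standing in for IndexedDB; the synchronous `getCachedLogo` signature, `setCachedLogo`, and `cleanupCache` are assumptions (an IndexedDB-backed version would be asynchronous).

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const CACHE_DURATION_MS = 7 * DAY_MS; // assumed freshness period
const CLEANUP_AGE_MS = 30 * DAY_MS; // assumed deletion threshold

interface CachedLogo {
  url: string;
  storedAt: number; // ms epoch
}

// In-memory stand-in for the persistent IndexedDB store.
const logoCache = new Map<string, CachedLogo>();

function setCachedLogo(teamId: string, url: string, now: number = Date.now()): void {
  logoCache.set(teamId, { url, storedAt: now });
}

// Cache-first lookup: returns null when the entry is missing or stale,
// which tells the caller to fall back to a network request.
function getCachedLogo(teamId: string, now: number = Date.now()): string | null {
  const entry = logoCache.get(teamId);
  if (!entry || now - entry.storedAt > CACHE_DURATION_MS) return null;
  return entry.url;
}

// Deletes entries older than the cleanup threshold; returns how many were removed.
function cleanupCache(now: number = Date.now()): number {
  let removed = 0;
  logoCache.forEach((entry, teamId) => {
    if (now - entry.storedAt > CLEANUP_AGE_MS) {
      logoCache.delete(teamId);
      removed += 1;
    }
  });
  return removed;
}
```
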
### 6. Request Timeouts

```typescript
// 5-second timeout per request
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), 5000);
```

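
The two lines above only arm the timer. The signal still has to reach the request, and the timer should be cleared once the request settles. A sketch of a small wrapper that does both is shown here; the `withTimeout` name and its return-`null`-on-timeout behaviour are assumptions for illustration.

```typescript
const REQUEST_TIMEOUT_MS = 5000;

// Hypothetical helper: runs `task` with an AbortSignal that fires after
// `timeoutMs`. Resolves to null on timeout and rethrows any other error.
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number = REQUEST_TIMEOUT_MS,
): Promise<T | null> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);

  try {
    return await task(controller.signal);
  } catch (error) {
    if (controller.signal.aborted) return null; // timed out
    throw error;
  } finally {
    // Always clear the timer so it cannot fire after the task settles.
    clearTimeout(timeoutId);
  }
}
```

With `fetch`, usage would look like `withTimeout((signal) => fetch(url, { signal }))`, since `fetch` accepts an `AbortSignal` through the `signal` option.
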
## Files Modified

### Frontend

- `/frontend/src/utils/sportLogosAPI.ts`
  - Added rate limiting and batching
  - Implemented circuit breaker pattern
  - Added request monitoring
  - Enhanced caching logic
  - Added request timeouts

- `/frontend/src/pages/admin/TeamsAdminPage.tsx`
  - Added safety limits (`MAX_TEAM_IDS = 500`)
  - Added request statistics display
  - Enhanced error handling

## Performance Improvements

### Before Fix

- **13,000+ concurrent requests**
- **No rate limiting**
- **No failure protection**
- **Cache bypass issues**

### After Fix

- **Maximum 10 concurrent requests**
- **100ms delay between batches**
- **Circuit breaker after 5 failures**
- **200 requests/minute hard limit**
- **500 team IDs maximum per fetch**
- **5-second timeout per request**

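
To put the pacing numbers in perspective, the enforced delay alone can be estimated from the batch size and the inter-batch pause. The helper below is a back-of-the-envelope sketch with a hypothetical name; it assumes a pause follows every batch and ignores the time the requests themselves take.

```typescript
interface PacingEstimate {
  batches: number;
  minDelayMs: number;
}

// Lower bound on the time spent waiting between batches.
function estimatePacing(
  teamCount: number,
  batchSize: number = 10,
  delayMs: number = 100,
): PacingEstimate {
  const batches = Math.ceil(teamCount / batchSize);
  return { batches, minDelayMs: batches * delayMs };
}

// A fetch capped at 500 team IDs: 50 batches, 5000 ms of enforced delay.
console.log(estimatePacing(500));
```

So a maximal fetch is spread over at least five seconds instead of being fired in a single burst.
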
## Monitoring Features

### Request Statistics

The admin page now shows:

- Number of logos fetched
- Current request volume (requests/minute)
- Cache hit ratio

### Console Warnings

- High request volume alerts (>100/min)
- Circuit breaker activation
- Team ID limiting warnings
- Cache miss warnings

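
The cache hit ratio shown on the admin page can be derived from two counters. A minimal sketch, with the `LogoFetchStats` shape and field names assumed for illustration:

```typescript
// Hypothetical counters maintained by the logo-fetching code.
interface LogoFetchStats {
  cacheHits: number;
  cacheMisses: number;
}

// Fraction of lookups served from the cache, in the range 0 to 1.
// Returns 0 when no lookups have happened yet, avoiding a division by zero.
function cacheHitRatio(stats: LogoFetchStats): number {
  const total = stats.cacheHits + stats.cacheMisses;
  return total === 0 ? 0 : stats.cacheHits / total;
}
```

A ratio that stays low across repeated page loads would point back at the cache-bypass problem listed among the contributing factors.
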
## Prevention Measures

### Automatic Protections

1. **Rate Limiting**: 10 concurrent requests max
2. **Volume Limits**: 200 requests/minute max
3. **Circuit Breaker**: Stops after 5 consecutive failures
4. **Safety Caps**: Maximum 500 team IDs per operation
5. **Timeouts**: 5-second limit per request

### Admin Visibility

- Real-time request statistics in Teams Admin page
- Console warnings for abnormal activity
- Clear error messages for failures

## Testing Recommendations

### Load Testing

Test with large competition data:

1. Navigate to `/admin/teams`
2. Monitor network tab for request patterns
3. Verify request volume stays under limits
4. Check console for warnings

### Circuit Breaker Testing

Simulate API failures:

1. Block `logoapi.sportcreative.eu` in network
2. Trigger logo fetch in admin
3. Verify circuit breaker opens after 5 failures
4. Verify requests stop during breaker open state
5. Verify breaker resets after 1 minute

## Future Enhancements

### Backend Caching (Recommended)

- Implement server-side logo caching
- Use CDN for logo distribution
- Add logo pre-fetching during off-peak hours

### Advanced Monitoring

- Add Prometheus metrics for request tracking
- Implement alerting for high request volumes
- Add request pattern analysis

### User Experience

- Add manual refresh controls
- Implement progressive loading
- Add request queue status indicator

## Conclusion

The implemented fixes provide multiple layers of protection against request storms:

1. **Rate Limiting** prevents overwhelming the API
2. **Circuit Breaker** stops cascading failures
3. **Volume Monitoring** provides visibility and control
4. **Enhanced Caching** reduces unnecessary requests
5. **Safety Limits** prevent extreme scenarios

The system is now resilient against similar request storms while maintaining good performance for normal usage.