SpotifyRecAlg Backend
Go recommendation API for music catalogs. It combines the approaches from project.md and the included papers:
- content-based exploration over normalized audio features
- weighted Spotify-style audio similarity over fixed feature ranges
- metadata affinity for genre/artist fallback when audio features are missing
- collaborative exploitation using Pearson-style neighborhood scores
- seed-track and manual feature targeting
- explicit user controls for hidden tracks, genres, artists, and explicit content
- popularity dampening, safety penalties, constrained commercial boosts, and diversity reranking
- response explanations so clients can show why a track was recommended
Authentication Options
Option 1: Auth-free (default) - Native Go webplayer client
No Spotify API credentials are needed. The backend includes a native webplayer client that generates TOTP tokens (the same method as the official Web Player) to obtain anonymous access. No external services are required.
cd apps/backend
STORE_DRIVER=memory SEED_DEMO_DATA=true go run ./cmd/api
Option 2: Official Spotify API - Set credentials
export SPOTIFY_CLIENT_ID=...
export SPOTIFY_CLIENT_SECRET=...
cd apps/backend && go run ./cmd/api
The backend automatically falls back to the native webplayer client if Spotify credentials are not configured.
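The fallback rule can be sketched as a small selection function. This is an illustration only: `ProviderMode`, `SelectMode`, and the mode names are hypothetical, and treating `SPOTIFY_BEARER_TOKEN` as equivalent to configured client credentials is an assumption based on the production notes in this README.

```go
package main

import (
	"fmt"
	"os"
)

// ProviderMode names which Spotify client the backend would use (hypothetical type).
type ProviderMode string

const (
	ModeOfficialAPI ProviderMode = "official-web-api"
	ModeWebplayer   ProviderMode = "native-webplayer"
)

// SelectMode picks the official API when credentials are present and falls
// back to the native webplayer client otherwise. It takes a lookup function
// so it can be exercised without touching the process environment.
func SelectMode(getenv func(string) string) ProviderMode {
	hasClientCreds := getenv("SPOTIFY_CLIENT_ID") != "" && getenv("SPOTIFY_CLIENT_SECRET") != ""
	hasBearer := getenv("SPOTIFY_BEARER_TOKEN") != "" // assumption: a bearer token also counts
	if hasClientCreds || hasBearer {
		return ModeOfficialAPI
	}
	return ModeWebplayer
}

func main() {
	fmt.Println(SelectMode(os.Getenv))
}
```

Requiring both the client ID and the secret avoids selecting the official API with a half-configured environment.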
Run Locally
Memory mode, with demo data:
cd apps/backend
STORE_DRIVER=memory SEED_DEMO_DATA=true go run ./cmd/api
Postgres mode:
docker compose -f infra/docker-compose.yml up postgres -d
cd apps/backend
go install github.com/pressly/goose/v3/cmd/goose@latest
goose -dir migrations postgres "postgres://spotify:spotify@localhost:5432/spotifyrec?sslmode=disable" up
go run ./cmd/api
Request recommendations:
curl -s http://localhost:8080/v1/recommendations \
-H 'Content-Type: application/json' \
-d '{"user_id":"demo-user","limit":5,"mode":"balanced"}'
Import one Spotify track (works with either the native webplayer client or the official API):
curl -s http://localhost:8080/v1/providers/spotify/import \
-H 'Content-Type: application/json' \
-d '{"source":{"type":"url","value":"https://open.spotify.com/track/3n3Ppam7vgaVa1iaRUc9Lp"}, "market":"US", "enrich_musicbrainz":true, "persist":true}'
API Surface
| Endpoint | Description |
| --- | --- |
| POST /v1/tracks | upsert one track |
| PUT /v1/tracks/batch | upsert up to 1000 tracks |
| POST /v1/interactions | ingest play, skip, like, dislike, save, or hide events |
| POST /v1/recommendations | create explainable ranked recommendations |
| GET /v1/users/{user_id}/taste-profile | inspect the computed profile |
| GET /v1/users/{user_id}/controls | read taste and safety controls |
| PUT /v1/users/{user_id}/controls | update controls |
| POST /v1/providers/spotify/import | import Spotify track, album, playlist, or artist tracks |
| POST /v1/providers/spotify/search | search Spotify tracks with limit capped at 10 |
| POST /v1/providers/musicbrainz/enrich | enrich existing tracks by ISRC or title/artist search |
| GET /v1/providers/status | inspect provider configuration, availability, and cache stats |
| GET /healthz | liveness |
| GET /readyz | storage readiness |
See docs/openapi.yaml for the contract.
Architecture
The HTTP layer depends on a small storage interface and the recommendation engine depends only on a snapshot provider. That keeps this service wireable to another backend: you can replace Postgres with your own catalog, data lake, event stream, or user service without changing the scorer.
Core scoring:
final =
    content_weight * weighted_audio_and_metadata_similarity
  + collaborative_weight * overlap_shrunk_neighbor_score
  + popularity_weight * mode_aware_popularity_fit
  + exploration_weight * target_distance_score
  + constrained_commercial_boost
final *= safety_score * negative_feedback_penalty
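The formula translates directly into code. The sketch below assumes each signal has already been computed and normalized elsewhere; the `Weights` and `Signals` structs and the `FinalScore` function are hypothetical names, and the example numbers are made up.

```go
package main

import "fmt"

// Weights holds the four blend weights from the scoring formula.
type Weights struct {
	Content       float64
	Collaborative float64
	Popularity    float64
	Exploration   float64
}

// Signals holds the per-track inputs to the formula.
type Signals struct {
	Similarity              float64 // weighted_audio_and_metadata_similarity
	NeighborScore           float64 // overlap_shrunk_neighbor_score
	PopularityFit           float64 // mode_aware_popularity_fit
	TargetDistance          float64 // target_distance_score
	CommercialBoost         float64 // constrained_commercial_boost
	Safety                  float64 // safety_score multiplier
	NegativeFeedbackPenalty float64 // negative_feedback_penalty multiplier
}

// FinalScore applies the additive blend, then the two multiplicative penalties.
func FinalScore(w Weights, s Signals) float64 {
	base := w.Content*s.Similarity +
		w.Collaborative*s.NeighborScore +
		w.Popularity*s.PopularityFit +
		w.Exploration*s.TargetDistance +
		s.CommercialBoost
	return base * s.Safety * s.NegativeFeedbackPenalty
}

func main() {
	w := Weights{Content: 0.5, Collaborative: 0.3, Popularity: 0.1, Exploration: 0.1}
	s := Signals{
		Similarity: 0.8, NeighborScore: 0.6, PopularityFit: 0.5, TargetDistance: 0.4,
		CommercialBoost: 0.02, Safety: 1.0, NegativeFeedbackPenalty: 0.5,
	}
	// Blend: 0.40 + 0.18 + 0.05 + 0.04 + 0.02 = 0.69, then halved by the penalty.
	fmt.Printf("%.3f\n", FinalScore(w, s)) // prints 0.345
}
```

Because the safety score and the negative feedback penalty are multipliers, a value of 0 for either one removes the track regardless of how well it scores on the other signals.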
Candidates are then reranked with a Maximal Marginal Relevance style diversity pass so the top results are not duplicates of the same audio neighborhood.
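A compact sketch of such a pass is shown below. It is a textbook greedy Maximal Marginal Relevance loop using cosine similarity over feature vectors, not necessarily the variant this engine implements; `Candidate`, `RerankMMR`, and the `lambda` trade-off parameter are illustrative names.

```go
package main

import (
	"fmt"
	"math"
)

// Candidate is a scored track with a feature vector used to measure redundancy.
type Candidate struct {
	ID       string
	Score    float64
	Features []float64
}

// cosine returns the cosine similarity of two equal-length vectors, or 0 if either is all zeros.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / math.Sqrt(na*nb)
}

// RerankMMR greedily picks up to k candidates, each time maximizing
// lambda*score - (1-lambda)*maxSimilarityToAlreadySelected.
// Negative similarities are treated as 0, so they never earn a bonus.
func RerankMMR(cands []Candidate, lambda float64, k int) []Candidate {
	remaining := append([]Candidate(nil), cands...) // copy so the input is not mutated
	var selected []Candidate
	for len(selected) < k && len(remaining) > 0 {
		bestIdx, bestVal := -1, math.Inf(-1)
		for i, c := range remaining {
			maxSim := 0.0
			for _, s := range selected {
				if sim := cosine(c.Features, s.Features); sim > maxSim {
					maxSim = sim
				}
			}
			if val := lambda*c.Score - (1-lambda)*maxSim; val > bestVal {
				bestIdx, bestVal = i, val
			}
		}
		selected = append(selected, remaining[bestIdx])
		remaining = append(remaining[:bestIdx], remaining[bestIdx+1:]...)
	}
	return selected
}

func main() {
	cands := []Candidate{
		{ID: "a", Score: 0.90, Features: []float64{1, 0}},
		{ID: "b", Score: 0.85, Features: []float64{1, 0.05}}, // near-duplicate of a
		{ID: "c", Score: 0.60, Features: []float64{0, 1}},
	}
	for _, c := range RerankMMR(cands, 0.5, 3) {
		fmt.Println(c.ID) // prints a, c, b on separate lines
	}
}
```

In the example, track b scores higher than c but sits in the same audio neighborhood as a, so the reranker promotes c ahead of it.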
Production Notes
- Set API_KEYS for backend-to-backend API key protection.
- Set SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET for Spotify client credentials auth, or SPOTIFY_BEARER_TOKEN for a short-lived externally managed bearer token.
- Set SPOTIFY_MARKET to the default two-letter market, for example US.
- Set MUSICBRAINZ_APP_NAME and MUSICBRAINZ_CONTACT; MusicBrainz requires an identifying User-Agent.
- Set PROVIDER_CACHE_TTL_HOURS to control provider payload cache freshness. Expired cache entries may be used as stale fallback when an upstream provider fails.
- Keep user authentication in the parent product and pass stable opaque user_id values to this service.
- Run goose migrations before starting Postgres mode.
- Use bulk ingestion for catalog updates and append-only interaction events.
- For large catalogs, replace full snapshots with vector indexes or precomputed candidate sets while keeping the same engine contract.
- When Spotify API credentials are provided, the backend uses the official Web API. Otherwise, it uses the native Go webplayer client, which generates TOTP tokens for anonymous access (no user account required).