Expand the automated detection layer beyond internal duplicate detection to cover the commercial music catalog and detect sophisticated copies. Integrate external fingerprint databases (AcoustID / MusicBrainz API, Audd.io API) and AI similarity detection.
## Phase 4: Advanced Detection (External DBs + AI Similarity)

**Parent Epic:** #404
**Priority:** P2
**Depends on:** Phase 1 (#405)
**Estimated Effort:** 5-7 sprints

### Scope

Expand the automated detection layer beyond internal duplicate detection to cover the commercial music catalog and detect sophisticated copies.

### Tasks

- [ ] **External fingerprint database integration**
  - AcoustID / MusicBrainz API (open, ~45M tracks): primary
  - Audd.io API (commercial catalog): secondary, paid per query
  - Batch comparison for new uploads + periodic re-scan of existing content
  - Confidence threshold tuning (avoid false positives on original works)
- [ ] **ISRC / ISWC metadata cross-referencing**
  - If uploader provides ISRC → verify against International ISRC Agency
  - Cross-reference title + artist combinations against MusicBrainz
  - Flag mismatches (e.g., ISRC belongs to a different artist)
- [ ] **AI-based similarity detection**
  - Train embedding model on internal corp