Modern RAG systems pair single-shot top-k vector search with a cross-encoder reranker that re-orders results by semantic relevance. Adding one should improve retrieval quality: on privacy_qa, the single-shot probe scores `probe_recall_at_10` = 0.22, while the agent's iterative retrieval reaches `citation_span_overlaps_gold` = 0.60.
## Context

Single-shot top-k vector search is the default retrieval primitive in OpenContracts (`CoreAnnotationVectorStore.search`). Modern RAG systems pair that with a cross-encoder reranker so the top-k returned to the agent is re-ordered by semantic relevance, not just vector similarity.

Benchmark data from the LegalBench-RAG harness (#1239) on privacy_qa:

- `probe_recall_at_10` = 0.22
- `citation_span_overlaps_gold` = 0.60 (agent iterative retrieval; much better)

The gap between the probe and the agent is roughly the value a reranker would add to the probe, because the agent is already performing a rough manual rerank via tool-use iteration.

## Proposal

Add an optional post-retrieval reranker stage to `CoreAnnotationVectorStore` (or a wrapper). Candidates:

- **BAAI/bge-reranker-v2-m3**: open weights, CPU-usable, ~568M params
- **cohere-rerank-3**: hosted API, excellent quality, adds latency + cost
- **jina-reranker-v2**: open weights, multilingual

Implementation sketch:
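
As a library-agnostic illustration of the post-retrieval stage, the following minimal sketch re-scores first-stage candidates with a pluggable pair scorer and keeps the top k. `Candidate`, `PairScorer`, `rerank`, and `token_overlap_scorer` are hypothetical names, not OpenContracts APIs, and the token-overlap scorer is only a toy stand-in for a real cross-encoder.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Candidate:
    """Hypothetical container for one first-stage retrieval hit."""
    text: str
    vector_score: float  # similarity reported by the vector search


# Assumed scorer shape: (query, passage) pairs in, one relevance score per pair out.
PairScorer = Callable[[Sequence[Tuple[str, str]]], List[float]]


def rerank(
    query: str,
    candidates: Sequence[Candidate],
    score_pairs: PairScorer,
    top_k: int,
) -> List[Candidate]:
    """Re-order candidates by scorer relevance and return the best top_k."""
    if not candidates:
        return []
    scores = score_pairs([(query, c.text) for c in candidates])
    # sorted() is stable, so ties keep their original vector-search order.
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in ranked[:top_k]]


def token_overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Toy stand-in for a cross-encoder: share of query tokens present in the passage."""
    scores = []
    for query, passage in pairs:
        query_tokens = set(query.lower().split())
        passage_tokens = set(passage.lower().split())
        scores.append(len(query_tokens & passage_tokens) / max(len(query_tokens), 1))
    return scores
```

In a real deployment the scorer would presumably wrap one of the candidate models. For example, the `sentence-transformers` `CrossEncoder.predict` method accepts a list of (query, passage) pairs and returns one score per pair, which matches the `PairScorer` shape assumed here. Keeping the scorer as a plain callable also leaves room for a hosted API behind the same interface.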