Show HN: Pyversity – Fast Result Diversification for Retrieval and RAG

Fast Diversification for Search & Retrieval

Pyversity is a fast, lightweight library for diversifying retrieval results. Retrieval systems often return highly similar items. Pyversity efficiently re-ranks these results to encourage diversity, surfacing items that remain relevant but less redundant.

It implements several popular diversification strategies such as MMR, MSD, DPP, and Cover with a clear, unified API. More information about the supported strategies can be found in the supported strategies section. The only dependency is NumPy, making the package very lightweight.
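To make the idea of diversification concrete, here is a minimal NumPy sketch of greedy MMR (Maximal Marginal Relevance). It is an illustration only, not Pyversity's implementation: the function name `mmr_sketch` is hypothetical, and the weighting `(1 - diversity) * relevance - diversity * similarity` is one common formulation that is assumed here rather than taken from the library.

```python
import numpy as np


def mmr_sketch(embeddings, scores, k, diversity=0.5):
    """Greedy MMR sketch (hypothetical helper, not part of pyversity)."""
    embeddings = np.asarray(embeddings, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    k = min(k, n)
    if k <= 0:
        return np.array([], dtype=int)

    # Normalize rows so that dot products are cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    # The first pick is simply the most relevant item.
    first = int(np.argmax(scores))
    selected = [first]
    available = np.ones(n, dtype=bool)
    available[first] = False

    # max_sim[i]: highest cosine similarity between item i and any selected item.
    max_sim = unit @ unit[first]

    while len(selected) < k:
        # Reward relevance, penalize redundancy with what is already selected.
        mmr = (1.0 - diversity) * scores - diversity * max_sim
        mmr = np.where(available, mmr, -np.inf)
        best = int(np.argmax(mmr))
        selected.append(best)
        available[best] = False
        max_sim = np.maximum(max_sim, unit @ unit[best])

    return np.array(selected)
```

With `diversity=0.0` the sketch reduces to plain top-k by score; raising `diversity` makes near-duplicates of already-selected items progressively less likely to be picked.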

Quickstart

Install pyversity with:

pip install pyversity

Diversify retrieval results:

import numpy as np
from pyversity import diversify, Strategy

# Define embeddings and scores (e.g. cosine similarities of a query result)
embeddings = np.random.randn(100, 256)
scores = np.random.rand(100)

# Diversify the result
diversified_result = diversify(
    embeddings=embeddings,
    scores=scores,
    k=10,  # Number of items to select
    strategy=Strategy.MMR,  # Diversification strategy to use
    diversity=0.5,  # Diversity parameter (higher values prioritize diversity)
)

# Get the indices of the diversified result
diversified_indices = diversified_result.indices

The returned DiversificationResult exposes the diversified indices, the selection_scores of the selected strategy, and other useful information. The strategies are fast and scalable: this example runs in milliseconds.
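In practice the indices are usually mapped back to the original documents. The sketch below assumes the indices are integer positions into the input arrays; the `indices` array is a made-up stand-in for `diversified_result.indices`, and `documents` is a hypothetical list aligned with the embeddings.

```python
import numpy as np

# Hypothetical retrieval results, aligned position-by-position with the scores.
documents = ["doc A", "doc B", "doc C", "doc D", "doc E"]
scores = np.array([0.91, 0.88, 0.75, 0.60, 0.42])

# Stand-in for `diversified_result.indices`; the values are made up for illustration.
indices = np.array([0, 3, 2])

# Look up the selected documents and their original relevance scores,
# preserving the order in which they were selected.
diversified_docs = [documents[i] for i in indices]
diversified_scores = scores[indices]
```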