Learning to rank

Learning to rank is handled by various classes. Some are located in the learning module.


XPM Config xpmir.letor.learner.ValidationListener(*, id, metrics, dataset, retriever, warmup, validation_interval, early_stop, hooks)[source]

Bases: LearnerListener

Submit type: Any

Validation and early stopping during learning

Computes a validation metric and stores the best result. If early_stop is set (> 0), then it signals to the learner that the learning process can stop.

id: str

Unique ID to identify the listener (ignored for signature)

metrics: Dict[str, bool] = {'map': True}

Dictionary whose keys are the metrics to record ([parseable by ir-measures](https://ir-measur.es/)) and whose boolean values indicate whether the best-performing checkpoint should be kept for the associated metric

dataset: datamaestro_text.data.ir.Adhoc

The dataset to use

retriever: xpmir.rankers.Retriever

The retriever for validation

warmup: int = -1

How many epochs before actually computing the metric

bestpath: Path (generated)

Path to the best checkpoints

info: Path (generated)

Path to the JSON file that contains the metric values at each epoch

validation_interval: int = 1

Epochs between each validation

early_stop: int = 0

Number of epochs without improvement after which we stop learning. Should be a multiple of validation_interval or 0 (no early stopping)

hooks: List[xpmir.learning.context.ValidationHook] = []

The list of the hooks during the validation
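
The interaction between validation_interval and early_stop can be sketched in a few lines of plain Python. This is an illustration of the documented behaviour only; the function and variable names below are hypothetical, not the actual xpmir implementation:

```python
def should_stop(epochs_without_improvement: int, early_stop: int) -> bool:
    # Stop once there has been no improvement for `early_stop` validations;
    # early_stop == 0 disables early stopping entirely.
    return early_stop > 0 and epochs_without_improvement >= early_stop

# Toy validation metric values, one per validation step
best, since_best, stopped_at = float("-inf"), 0, None
for epoch, value in enumerate([0.20, 0.25, 0.24, 0.25, 0.23], start=1):
    if value > best:
        best, since_best = value, 0   # a new best checkpoint would be kept here
    else:
        since_best += 1
    if should_stop(since_best, early_stop=2):
        stopped_at = epoch
        break
```

With early_stop=2, learning halts after two validations without improvement; note that early_stop should be a multiple of validation_interval since the check only happens at validation time.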


Scorers are able to give a score to a (query, document) pair. Among the scorers, some have learnable parameters.

XPM Config xpmir.rankers.Scorer[source]

Bases: Config, Initializable, EasyLogger, ABC

Submit type: xpmir.rankers.Scorer

Query-document scorer

A model able to give a score to a list of documents given a query


Put the model in inference/evaluation mode

getRetriever(retriever: Retriever, batch_size: int, batcher: Batcher = Config[xpmir.learning.batchers.batcher], top_k=None, device=None)[source]

Returns a two-stage re-ranker built from the given retriever and this scorer

  • device – Device for the ranker or None if no change should be made

  • batch_size – The number of documents in each batch

  • top_k – Number of documents to re-rank (or None for all)

initialize(*args, **kwargs)

Main initialization

Calls __initialize__() once


Move the scorer to another device

XPM Config xpmir.rankers.RandomScorer(*, random)[source]

Bases: Scorer

Submit type: xpmir.rankers.RandomScorer

A random scorer

random: xpmir.learning.base.Random

The random number generator
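
A random scorer of this kind is mainly useful as a lower-bound baseline. Its behaviour can be sketched in plain Python; the class name and method below are illustrative stand-ins, not the xpmir API:

```python
import random

class RandomBaselineScorer:
    # Sketch of a random scorer: each (query, document) pair receives a
    # score drawn uniformly at random from a seeded, reproducible generator.
    def __init__(self, seed: int):
        self.random = random.Random(seed)

    def score(self, query: str, documents: list) -> list:
        return [self.random.random() for _ in documents]

scores = RandomBaselineScorer(seed=0).score("a query", ["d1", "d2", "d3"])
```

Seeding the generator makes the baseline reproducible across runs, which is why the real class takes an explicit random configuration object.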

XPM Config xpmir.rankers.AbstractModuleScorer[source]

Bases: Scorer, Module

Submit type: xpmir.rankers.AbstractModuleScorer

Base class for all learnable scorers

This class provides a compute method that calls the forward method.

XPM Config xpmir.rankers.LearnableScorer[source]

Bases: AbstractModuleScorer

Submit type: xpmir.rankers.LearnableScorer

Learnable scorer

A scorer with parameters that can be learnt


XPM Config xpmir.rankers.adapters.ScorerTransformAdapter(*, scorer, adapter)[source]

Bases: Scorer

Submit type: xpmir.rankers.adapters.ScorerTransformAdapter

Transforms the topic and/or the documents before they are re-scored by the scorer

scorer: xpmir.rankers.Scorer

The original scorer, whose inputs are to be transformed

adapter: xpmir.letor.samplers.hydrators.SampleTransform

The sample transform to apply to the inputs

Utility functions

xpmir.rankers.scorer_retriever(documents: Documents, *, retrievers: RetrieverFactory, scorer: Scorer, **kwargs)[source]

Helper function that returns a two-stage retriever. This is useful with partial application (e.g. functools.partial) when the scorer is not yet known.

  • documents – The document collection

  • retrievers – A retriever factory

  • scorer – The scorer


A retriever, obtained by calling scorer.getRetriever()


Scorers can be used as retrievers through a xpmir.rankers.TwoStageRetriever
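
The two-stage pattern itself is simple: a cheap first-stage retriever produces candidates, and the scorer re-scores and re-sorts them. The sketch below uses hypothetical helper names and toy data; the real retriever is configured through getRetriever() with its batching and device parameters:

```python
def two_stage_retrieve(first_stage, scorer, query, top_k):
    # Stage 1: cheap retrieval of top_k candidates.
    candidates = first_stage(query)[:top_k]
    # Stage 2: re-score every candidate and sort by descending score.
    rescored = [(doc, scorer(query, doc)) for doc in candidates]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

docs = {"d1": "a b c", "d2": "a b", "d3": "a"}
first_stage = lambda query: ["d1", "d2", "d3"]   # toy first-stage ranking
scorer = lambda query, doc: -len(docs[doc])      # toy scorer: shorter is better
ranking = two_stage_retrieve(first_stage, scorer, "a", top_k=2)
```

Because only the top_k candidates reach the (expensive) scorer, the cost of the second stage is bounded regardless of collection size.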


Samplers provide samples in the form of records. They all inherit from:

class xpmir.letor.samplers.SerializableIterator[source]

Bases: Iterator[T], Generic[T, State]

An iterator that can be serialized through state dictionaries.

This is used when saving the sampler state
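
The idea can be sketched with a minimal iterator whose position is captured in a state dictionary and restored later. The class below is an illustration of the contract, not the xpmir implementation:

```python
class SerializableRangeIterator:
    # Sketch of a serializable iterator: the position `i` is the whole state,
    # so saving and restoring it resumes iteration exactly where it left off.
    def __init__(self, n: int):
        self.n, self.i = n, 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        self.i += 1
        return self.i - 1

    def state_dict(self) -> dict:
        return {"i": self.i}

    def load_state_dict(self, state: dict) -> None:
        self.i = state["i"]

it = SerializableRangeIterator(5)
first = [next(it), next(it)]            # consume two items
state = it.state_dict()                 # checkpoint the sampler state
restored = SerializableRangeIterator(5)
restored.load_state_dict(state)         # resume from the checkpoint
rest = list(restored)
```

This is what allows a training job to be interrupted and resumed without re-serving samples that were already consumed.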

XPM Config xpmir.letor.samplers.ModelBasedSampler(*, dataset, retriever)[source]

Bases: Sampler

Submit type: xpmir.letor.samplers.ModelBasedSampler

Base class for retriever-based samplers

dataset: datamaestro_text.data.ir.Adhoc

The IR adhoc dataset

retriever: xpmir.rankers.Retriever

A retriever to sample negative documents
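
A common way such a sampler uses its retriever is to take top-ranked documents that are not judged relevant as negatives. The sketch below illustrates that assumed logic with toy data; it is not the exact xpmir code:

```python
def sample_negatives(retrieved, relevant, k):
    # Retrieval-based negatives: keep the top-ranked documents
    # that are not assessed as relevant for the query.
    return [doc for doc in retrieved if doc not in relevant][:k]

retrieved = ["d3", "d1", "d7", "d2"]   # ranking produced by the retriever
relevant = {"d1"}                      # judged relevant in the dataset
negatives = sample_negatives(retrieved, relevant, k=2)
```

Such "hard" negatives, being highly ranked yet non-relevant, are generally more informative for training than documents drawn uniformly from the collection.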

Records for training

class xpmir.letor.records.PairwiseRecord(query: Record, positive: Record, negative: Record)[source]

Bases: object

A pairwise record is composed of a query, a positive and a negative document

class xpmir.letor.records.PointwiseRecord(topic: Record, document: Record, relevance: float | None = None)[source]

Bases: object

A record from a pointwise sampler
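
The two record types can be mirrored with plain dataclasses. Here plain strings stand in for the Record type used by xpmir, and the names carry a Sketch suffix to make clear these are illustrations, not the real classes:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PairwiseRecordSketch:
    # A query with one positive and one negative document.
    query: str
    positive: str
    negative: str

@dataclass
class PointwiseRecordSketch:
    # A (topic, document) pair with an optional relevance label.
    topic: str
    document: str
    relevance: Optional[float] = None

pair = PairwiseRecordSketch("a query", "relevant doc", "irrelevant doc")
point = PointwiseRecordSketch("a query", "some doc")
```

Pairwise records feed losses that compare the positive against the negative, while pointwise records feed losses that regress or classify a single document's relevance.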

Document samplers

Useful for pre-training or when learning index parameters (e.g. for FAISS).

XPM Config xpmir.documents.samplers.DocumentSampler(*, documents)[source]

Bases: Config, ABC

Submit type: xpmir.documents.samplers.DocumentSampler

How to sample from a document store

documents: datamaestro_text.data.ir.DocumentStore

XPM Config xpmir.documents.samplers.HeadDocumentSampler(*, documents, max_count, max_ratio)[source]

Bases: DocumentSampler

Submit type: xpmir.documents.samplers.HeadDocumentSampler

A basic sampler that iterates over the first documents

If max_count is 0, it iterates over all documents

documents: datamaestro_text.data.ir.DocumentStore

max_count: int = 0

Maximum number of documents (if 0, no limit)

max_ratio: float = 0

Maximum ratio of documents (if 0, no limit)
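
The interplay of the two limits can be sketched as follows. The exact combination rule is an assumption (here, the smaller resulting limit wins); the function name is hypothetical:

```python
def head_sample(documents, max_count=0, max_ratio=0.0):
    # Keep the head of the collection; 0 means "no limit" for both parameters,
    # and when both are set the stricter (smaller) limit applies.
    limit = len(documents)
    if max_count > 0:
        limit = min(limit, max_count)
    if max_ratio > 0:
        limit = min(limit, int(max_ratio * len(documents)))
    return documents[:limit]

docs = [f"d{i}" for i in range(10)]
```

For example, head_sample(docs, max_count=3) yields the first three documents, while head_sample(docs, max_ratio=0.5) yields the first half of the collection.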

XPM Config xpmir.documents.samplers.RandomDocumentSampler(*, documents, max_count, max_ratio, random)[source]

Bases: DocumentSampler

Submit type: xpmir.documents.samplers.RandomDocumentSampler

A basic sampler that samples documents at random

Either max_count or max_ratio should be non-zero

documents: datamaestro_text.data.ir.DocumentStore

max_count: int = 0

Maximum number of documents (if 0, no limit)

max_ratio: float = 0

Maximum ratio of documents (if 0, no limit)

random: xpmir.learning.base.Random

Random sampler
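
A plain-Python sketch of this sampler, including the constraint that one of the two limits must be non-zero (function name and seeding convention are assumptions for the sake of illustration):

```python
import random

def random_sample(documents, max_count=0, max_ratio=0.0, seed=0):
    # Either max_count or max_ratio must be non-zero, mirroring the
    # constraint stated above; the seed makes the draw reproducible.
    if max_count == 0 and max_ratio == 0:
        raise ValueError("either max_count or max_ratio should be non-zero")
    k = max_count if max_count > 0 else int(max_ratio * len(documents))
    return random.Random(seed).sample(documents, k)

docs = [f"d{i}" for i in range(100)]
sampled = random_sample(docs, max_count=5)
```

Unlike the head sampler, this draws documents from anywhere in the collection, which avoids biases when the collection is ordered (e.g. by source or date).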


XPM Config xpmir.letor.samplers.hydrators.SampleTransform[source]

Bases: Config, ABC

Submit type: xpmir.letor.samplers.hydrators.SampleTransform

XPM Config xpmir.letor.samplers.hydrators.SampleHydrator(*, documentstore, querystore)[source]

Bases: SampleTransform

Submit type: xpmir.letor.samplers.hydrators.SampleHydrator

Base class for document/topic hydrators

documentstore: datamaestro_text.data.ir.DocumentStore

The store for document texts if needed

querystore: xpmir.datasets.adapters.TextStore

The store for query texts if needed

XPM Config xpmir.letor.samplers.hydrators.SamplePrefixAdding(*, query_prefix, document_prefix)[source]

Bases: SampleTransform

Submit type: xpmir.letor.samplers.hydrators.SamplePrefixAdding

Transforms the query and documents by adding a prefix

query_prefix: str

The prefix for the query

document_prefix: str

The prefix for the document
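
The transform itself is a simple string prepend. The sketch below uses "query: " and "document: " as example prefix values (such role-marking prefixes are used by some dense retrieval models); the function name is hypothetical:

```python
def add_prefixes(query, documents,
                 query_prefix="query: ", document_prefix="document: "):
    # Prepend a fixed prefix to the query and to every document text,
    # marking the role of each input for the downstream model.
    return query_prefix + query, [document_prefix + doc for doc in documents]

prefixed_query, prefixed_docs = add_prefixes("what is IR?", ["doc one", "doc two"])
```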

XPM Config xpmir.letor.samplers.hydrators.SampleTransformList(*, adapters)[source]

Bases: SampleTransform

Submit type: xpmir.letor.samplers.hydrators.SampleTransformList

A class that groups a list of sample transforms

adapters: List[xpmir.letor.samplers.hydrators.SampleTransform]

The list of sample transforms to apply
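
Composing transforms amounts to applying them in order, each one mapping a sample to a sample. A minimal sketch, with hypothetical names and a dict standing in for the sample type:

```python
def apply_transforms(sample, transforms):
    # Apply each transform in order; the output of one is the input of the next.
    for transform in transforms:
        sample = transform(sample)
    return sample

hydrate = lambda s: {**s, "text": f"text of {s['id']}"}     # hydrator: fetch the text
prefix = lambda s: {**s, "text": "document: " + s["text"]}  # prefix-adding transform
out = apply_transforms({"id": "d1"}, [hydrate, prefix])
```

Order matters: here the hydrator must run before the prefix transform, since the latter needs the text to already be present.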