Batchwise

Trainer

XPM Config xpmir.letor.trainers.batchwise.BatchwiseTrainer(*, hooks, model, batcher, sampler, batch_size, lossfn)

Bases: LossTrainer

Submit type: xpmir.letor.trainers.batchwise.BatchwiseTrainer

Batchwise trainer

Arguments:

lossfn: The loss function to use

sampler: A batchwise sampler

hooks: List[xpmir.learning.context.TrainingHook] = []

Hooks for this trainer: this includes the losses, but can be adapted for other uses. The specific list of hooks depends on the specific trainer.

model: xpmir.learning.optim.Module

If the model to optimize is different from the model passed to Learn, this parameter can be used; initialization is still expected to be done at the learner level

batcher: xpmir.learning.batchers.Batcher = xpmir.learning.batchers.Batcher.XPMValue()

How to batch samples together

sampler: xpmir.letor.samplers.BatchwiseSampler

A batch-wise sampler

batch_size: int = 16

Number of samples per batch

lossfn: xpmir.letor.trainers.batchwise.BatchwiseLoss

A batchwise loss function

Losses

XPM Config xpmir.letor.trainers.batchwise.BatchwiseLoss(*, weight)

Bases: Config

Submit type: xpmir.letor.trainers.batchwise.BatchwiseLoss

weight: float = 1.0

The weight of this loss

XPM Config xpmir.letor.trainers.batchwise.CrossEntropyLoss(*, weight)

Bases: BatchwiseLoss

Submit type: xpmir.letor.trainers.batchwise.CrossEntropyLoss

weight: float = 1.0

The weight of this loss

XPM Config xpmir.letor.trainers.batchwise.SoftmaxCrossEntropy(*, weight)

Bases: BatchwiseLoss

Submit type: xpmir.letor.trainers.batchwise.SoftmaxCrossEntropy

weight: float = 1.0

The weight of this loss
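As an illustration of what a batchwise softmax cross-entropy typically computes, here is a minimal pure-Python sketch. It is not the xpmir implementation: it assumes the common in-batch negatives setting, where `scores[i][j]` is the score of query `i` against document `j` and document `i` is the positive for query `i`. The function name `softmax_cross_entropy` and the example scores are hypothetical; `weight` mirrors the parameter documented above.

```python
import math


def softmax_cross_entropy(scores, weight=1.0):
    """Mean softmax cross-entropy over a square query x document score matrix.

    Assumption (not taken from the xpmir source): the positive document
    for query i sits on the diagonal, i.e. at scores[i][i].
    """
    total = 0.0
    for i, row in enumerate(scores):
        # Log-sum-exp with the max subtracted for numerical stability
        m = max(row)
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        # Negative log-probability of the positive document
        total += log_z - row[i]
    # Average over queries, then scale by the loss weight
    return weight * total / len(scores)


# Hypothetical scores for two queries and two documents
scores = [[2.0, 0.0], [0.0, 2.0]]
loss = softmax_cross_entropy(scores)
half_loss = softmax_cross_entropy(scores, weight=0.5)
```

With several losses attached to a trainer, a per-loss `weight` of this kind lets each term be scaled before the terms are summed.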

Samplers

XPM Config xpmir.letor.samplers.BatchwiseSampler

Bases: Sampler

Submit type: xpmir.letor.samplers.BatchwiseSampler

Base class for batchwise samplers, which provide a list of documents for each question

XPM Config xpmir.documents.samplers.RandomSpanSampler(*, documents, max_spansize)

Bases: BatchwiseSampler, PairwiseSampler

Submit type: xpmir.documents.samplers.RandomSpanSampler

This sampler uses positive samples coming from the same documents and negative ones coming from others

Allows (pre-)training as in coCondenser:

L. Gao and J. Callan, “Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval,” arXiv:2108.05540 [cs], Aug. 2021. Accessed: Sep. 17, 2021. [Online]. Available: http://arxiv.org/abs/2108.05540

documents: datamaestro_text.data.ir.DocumentStore

The document store to use

max_spansize: int = 1000

Maximum span size in number of characters
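The sampling idea described for this sampler (two spans from the same document form a positive pair, spans from other documents act as negatives) can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: documents are plain strings, span length and position are drawn uniformly, and the helper names `sample_span` and `sample_batch` are hypothetical. Only `max_spansize` corresponds to a documented parameter.

```python
import random


def sample_span(text, max_spansize, rng):
    """Return a random non-empty span of at most max_spansize characters."""
    size = rng.randint(1, min(max_spansize, len(text)))
    start = rng.randint(0, len(text) - size)
    return text[start:start + size]


def sample_batch(documents, batch_size, max_spansize, rng):
    """Draw two spans from each of batch_size distinct documents.

    The two spans of a pair come from the same document (positives);
    spans from the other pairs in the batch can serve as negatives.
    """
    chosen = rng.sample(documents, batch_size)
    return [
        (sample_span(d, max_spansize, rng), sample_span(d, max_spansize, rng))
        for d in chosen
    ]


# Hypothetical toy corpus; real documents would come from a document store
rng = random.Random(0)
docs = ["alpha " * 50, "beta " * 50, "gamma " * 50]
batch = sample_batch(docs, batch_size=2, max_spansize=20, rng=rng)
```

Each element of `batch` is a `(span, span)` pair; scoring every first span against every second span yields the square matrix that a batchwise loss operates on.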