Batchwise learning
In batchwise learning, the loss is computed over entire batches of documents rather than individual pairs. This enables losses such as in-batch contrastive learning or listwise softmax cross-entropy.
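As an illustration of what a batchwise loss computes, here is a minimal pure-Python sketch of in-batch softmax cross-entropy. It is not the library's implementation. The function name `in_batch_softmax_loss` and the convention that document i is the positive for query i are assumptions made for this example.

```python
import math


def in_batch_softmax_loss(scores):
    """Mean softmax cross-entropy over a square score matrix.

    scores[i][j] is the score of query i against document j.
    Assumption: document i is the positive for query i, and every
    other document in the batch acts as a negative.
    """
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the row maximum for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_z - row[i]  # negative log-probability of the positive
    return total / len(scores)


# Toy batch of three queries and three documents (made-up scores)
scores = [
    [5.0, 1.0, 0.0],
    [0.5, 4.0, 1.0],
    [0.0, 0.0, 3.0],
]
loss = in_batch_softmax_loss(scores)
print(f"in-batch loss: {loss:.4f}")
```

Because the whole score matrix enters the loss, each query is contrasted with every other document in the batch, which a pairwise loss cannot do.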
Trainer
- XPM Config xpmir.letor.trainers.batchwise.BatchwiseTrainer(*, hooks, model, sampler, batch_size, num_workers, lossfn)
Bases: LossTrainer
Batchwise trainer
Arguments:
lossfn: The loss function to use
sampler: A batchwise sampler
- hooks: List[xpm_torch.trainers.context.TrainingHook] = []
Hooks for this trainer. This includes the losses, but can be adapted for other uses. The specific list of hooks depends on the trainer.
- model: xpm_torch.module.Module
If the model to optimize is different from the model passed to Learn, this parameter can be used. Initialization is still expected to be done at the learner level.
- batcher: xpm_torch.batchers.Batcher (generated)
How to batch samples together
- sampler: xpm_torch.base.Sampler
The sampler to use
- lossfn: xpm_torch.losses.batchwise.BatchwiseLoss
A batchwise loss function
Samplers
- XPM Config xpmir.documents.samplers.RandomSpanSampler(*, documents, max_spansize)
Bases: Sampler
This sampler uses positive samples coming from the same document and negative ones coming from other documents.
It allows (pre-)training as in coCondenser:
L. Gao and J. Callan, “Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval,” arXiv:2108.05540 [cs], Aug. 2021. Accessed: Sep. 17, 2021. [Online]. Available: http://arxiv.org/abs/2108.05540
- documents: datamaestro_ir.data.DocumentStore
The document store to use
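The sampling idea described above (positives from the same document, negatives from other documents) can be sketched in plain Python. This is a hedged illustration, not the code of `RandomSpanSampler`. The helpers `sample_span` and `sample_batch` are hypothetical, documents are assumed to be non-empty strings, and `max_span_size` is assumed to be a maximum length in characters, which may differ from how the library interprets `max_spansize`.

```python
import random


def sample_span(text, max_span_size, rng):
    """Return a random contiguous span of at most max_span_size characters."""
    size = rng.randint(1, min(max_span_size, len(text)))
    start = rng.randint(0, len(text) - size)
    return text[start:start + size]


def sample_batch(documents, max_span_size, batch_size, rng):
    """Draw two spans from each of batch_size distinct documents.

    Row i holds an (anchor, positive) pair taken from the same document.
    Spans in the other rows come from different documents and can serve
    as in-batch negatives for row i.
    """
    chosen = rng.sample(documents, batch_size)
    return [
        (sample_span(d, max_span_size, rng), sample_span(d, max_span_size, rng))
        for d in chosen
    ]


# Toy corpus (made-up texts standing in for a document store)
documents = [
    "batchwise losses compare every query with every document",
    "random spans of one text form a positive pair",
    "spans taken from other texts act as negatives",
]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
batch = sample_batch(documents, max_span_size=20, batch_size=2, rng=rng)
for anchor, positive in batch:
    print(repr(anchor), "<->", repr(positive))
```

No relevance labels are needed, since the pairing comes from document identity alone. This is what makes the scheme usable for unsupervised pre-training.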