Data Generation
Synthetic topics
- XPM Task xpmir.letor.samplers.synthetic.SyntheticQueryGeneration(*, model, batchsize, num_qry_per_doc, sampler, device, batcher, hooks)
Bases: Task
Submit type: Any
- model: xpmir.neural.generative.hf.T5ConditionalGenerator
The model we use to generate the queries
- batchsize: int = 128
Batch size when computing negatives
- num_qry_per_doc: int = 5
Number of synthetic queries to generate per document
- sampler: xpmir.documents.samplers.DocumentSampler
Document sampler used to iterate over the corpus
- device: xpmir.learning.devices.Device = xpmir.learning.devices.Device.XPMValue()
The device used by the encoder
- batcher: xpmir.learning.batchers.Batcher = xpmir.learning.batchers.Batcher.XPMValue()
The way to prepare batches of documents
- synthetic_samples: Path (generated)
Path to store the generated queries
- hooks: List[xpmir.context.Hook] = []
Global learning hooks
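
To illustrate how batchsize and num_qry_per_doc might interact, here is a minimal standalone sketch in plain Python. It does not use xpmir and is not the task's actual implementation. The helper names (batched, generate_synthetic_queries, toy_generate) are hypothetical, and the generate callable stands in for the query generation model. It also assumes that each call to the generator returns one query per document in the batch, which the reference above does not state.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def batched(items: Iterable[str], batchsize: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `batchsize` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, batchsize)):
        yield batch


def generate_synthetic_queries(
    documents: Iterable[str],
    generate: Callable[[List[str]], List[str]],
    batchsize: int = 128,
    num_qry_per_doc: int = 5,
) -> Iterator[Tuple[str, str]]:
    """Yield (document, query) pairs: num_qry_per_doc queries per document."""
    for batch in batched(documents, batchsize):
        for _ in range(num_qry_per_doc):
            # Assumption: one call returns one query per document in the batch
            queries = generate(batch)
            yield from zip(batch, queries)


def toy_generate(batch: List[str]) -> List[str]:
    """Hypothetical stand-in for a generative model (not a real T5 call)."""
    return [f"what is {doc.split()[0]}?" for doc in batch]


pairs = list(
    generate_synthetic_queries(
        ["alpha beta", "gamma delta", "epsilon"],
        toy_generate,
        batchsize=2,
        num_qry_per_doc=3,
    )
)
```

With three documents and num_qry_per_doc=3, the sketch yields nine (document, query) pairs. A real task would presumably write such pairs to the synthetic_samples path rather than collect them in memory.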