Evaluation
- XPM Task xpmir.evaluation.BaseEvaluation(*, measures)[source]
Bases: Task
Submit type: Any
Base class for evaluation tasks
- measures: List[xpmir.measures.Measure] (default: a list of five Config[xpmir.measures.measure])
List of metrics
- aggregated: Path (generated)
Path for aggregated results
- detailed: Path (generated)
Path for detailed results
- XPM Task xpmir.evaluation.RunEvaluation(*, measures, run, assessments)[source]
Bases: BaseEvaluation, Task
Submit type: Any
Evaluate a run (see the example below)
- measures: List[xpmir.measures.Measure] (default: a list of five Config[xpmir.measures.measure])
List of metrics
- aggregated: Path (generated)
Path for aggregated results
- detailed: Path (generated)
Path for detailed results
- assessments: datamaestro_text.data.ir.AdhocAssessments
The relevance assessments used for evaluation
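A minimal sketch of evaluating an existing run. Here my_run (a run configuration) and qrels (a datamaestro_text AdhocAssessments configuration) are hypothetical placeholders built elsewhere, and the task is assumed to be submitted within an experimaestro experiment:

from xpmir.evaluation import RunEvaluation
from xpmir.measures import AP, nDCG, RR

# my_run and qrels are hypothetical placeholders defined elsewhere:
# the run to score and the relevance assessments for its topics
evaluation = RunEvaluation(
    run=my_run,
    assessments=qrels,
    measures=[AP, nDCG@10, RR@10],
)
# as an experimaestro task, it is submitted inside an experiment context
# (e.g. evaluation.submit()); results are written to the generated
# aggregated and detailed paths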
- XPM Task xpmir.evaluation.Evaluate(*, measures, dataset, retriever, topic_wrapper)[source]
Bases: BaseEvaluation, Task
Submit type: Any
Evaluate a retriever directly, without generating the run explicitly (see the example below)
- measures: List[xpmir.measures.Measure] (default: a list of five Config[xpmir.measures.measure])
List of metrics
- aggregated: Path (generated)
Path for aggregated results
- detailed: Path (generated)
Path for detailed results
- dataset: datamaestro_text.data.ir.Adhoc
The dataset for retrieval
- retriever: xpmir.rankers.Retriever
The retriever to evaluate
- topic_wrapper: datamaestro_text.transforms.ir.TopicWrapper
Topic extractor
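A minimal sketch of evaluating a retriever directly. Here dataset (a datamaestro_text Adhoc configuration) and bm25 (an xpmir Retriever configuration) are hypothetical placeholders prepared elsewhere; topic_wrapper is left out, although some datasets may require it:

from xpmir.evaluation import Evaluate
from xpmir.measures import AP, P, nDCG, RR

# dataset and bm25 are hypothetical placeholders: an Adhoc dataset
# configuration and a Retriever configuration built elsewhere
evaluate = Evaluate(
    dataset=dataset,
    retriever=bm25,
    measures=[AP, P@20, nDCG@10, RR@10],
)
# like RunEvaluation, this is an experimaestro task; once submitted,
# the aggregated and detailed result files are generated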
- class xpmir.evaluation.Evaluations(dataset: Adhoc, measures: List[Measure], *, topic_wrapper: TopicWrapper | None = None)[source]
Bases: object
Holds experiment results for several models on one dataset
- class xpmir.evaluation.EvaluationsCollection(**collection: Evaluations)[source]
Bases: object
A collection of evaluations
This is useful to group all the evaluations to be conducted, and then to call evaluate_retriever() to evaluate a retriever on each of them (see the sketch below).
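A minimal sketch of grouping evaluations and evaluating one retriever on all of them. The dataset configurations (trec_covid, msmarco_dev) and the retriever (bm25) are hypothetical placeholders, and evaluate_retriever() may accept additional arguments (such as a launcher) depending on the version:

from xpmir.evaluation import Evaluations, EvaluationsCollection
from xpmir.measures import AP, nDCG, RR

# trec_covid, msmarco_dev (Adhoc datasets) and bm25 (a Retriever) are
# hypothetical placeholders defined elsewhere
tests = EvaluationsCollection(
    trec_covid=Evaluations(trec_covid, [AP, nDCG@10]),
    msmarco_dev=Evaluations(msmarco_dev, [RR@10]),
)

# evaluates the retriever on every dataset of the collection
tests.evaluate_retriever(bm25)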
Metrics
Metrics are backed by the ir_measures module
- XPM Config xpmir.measures.Measure(*, identifier, rel, cutoff)[source]
Bases: Measure
Mirrors the ir_measures metric object (see the sketch below)
- identifier: str
main identifier
- rel: int = 1
minimum relevance score to be considered relevant (inclusive)
- cutoff: int
Cutoff value
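As an illustration of how these fields line up with an ir_measures metric, here is a sketch assuming keyword construction as with other experimaestro configs; in practice the predefined measures below are used together with the @ operator:

from xpmir.measures import Measure

# roughly the equivalent of nDCG@10: identifier names the ir_measures
# metric, rel is the minimum (inclusive) relevance grade, and cutoff
# is the rank cutoff
ndcg_at_10 = Measure(identifier="nDCG", rel=1, cutoff=10)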
List of defined measures
- xpmir.measures.AP = Config[xpmir.measures.measure]
Average precision metric
- xpmir.measures.P = Config[xpmir.measures.measure]
Precision at rank
- xpmir.measures.R = Config[xpmir.measures.measure]
Recall at rank
- xpmir.measures.RR = Config[xpmir.measures.measure]
Reciprocal rank
- xpmir.measures.Success = Config[xpmir.measures.measure]
1 if a document with at least rel relevance is found in the first cutoff documents, else 0.
- xpmir.measures.nDCG = Config[xpmir.measures.measure]
Normalized Discounted Cumulated Gain
Measures can be parameterized with a cutoff using the @ operator. Example:
from xpmir.measures import AP, P, nDCG, RR
from xpmir.evaluation import Evaluate
Evaluate(measures=[AP, P@20, nDCG, nDCG@10, nDCG@20, RR, RR@10], ...)