Optimization

Modules

XPM Config xpmir.learning.optim.Module[source]

Bases: Config, Initializable, TorchModule

A module contains parameters

XPM Config xpmir.learning.optim.ModuleList(*, sub_modules)[source]

Bases: Module, Initializable

Groups different models together, to be used within the Learner

sub_modules: List[xpmir.learning.optim.Module]
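
For illustration, and assuming the usual experimaestro style of building configurations with keyword arguments, grouping two hypothetical sub-modules could look as follows:

    from xpmir.learning.optim import Module, ModuleList

    # Hypothetical sub-modules (e.g. an encoder and a scorer) defined elsewhere
    encoder: Module = ...
    scorer: Module = ...

    # Group them so that the Learner can handle them as a single module
    modules = ModuleList(sub_modules=[encoder, scorer])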

The module loader can be used to load a checkpoint

XPM Config xpmir.learning.optim.ModuleLoader(*, value, path)[source]

Bases: PathSerializationLWTask

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data
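
A minimal sketch of such a loader, assuming an existing module configuration and a hypothetical checkpoint path:

    from pathlib import Path

    from xpmir.learning.optim import Module, ModuleLoader

    # Hypothetical module configuration defined elsewhere in the experiment
    my_module: Module = ...

    # Task that restores the module parameters from a serialized checkpoint
    loader = ModuleLoader(value=my_module, path=Path("checkpoints/model.pth"))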

Optimizers

XPM Config xpmir.learning.optim.Optimizer[source]

Bases: Config

XPM Config xpmir.learning.optim.SGD(*, lr, weight_decay)[source]

Bases: Optimizer

Wrapper for the SGD optimizer in PyTorch

lr: float = 1e-05

Learning rate

weight_decay: float = 0.0

Weight decay (L2)

XPM Config xpmir.learning.optim.Adam(*, lr, weight_decay, eps)[source]

Bases: Optimizer

Wrapper for Adam optimizer in PyTorch

lr: float = 0.001

Learning rate

weight_decay: float = 0.0

Weight decay (L2)

eps: float = 1e-08

XPM Config xpmir.learning.optim.AdamW(*, lr, weight_decay, eps)[source]

Bases: Optimizer

Adam optimizer with decoupled weight decay regularization

See the PyTorch documentation

lr: float = 0.001

Learning rate

weight_decay: float = 0.01

Weight decay (L2)

eps: float = 1e-08
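
For illustration, the optimizer configurations above can be instantiated as follows (the hyper-parameter values are arbitrary examples, not recommendations):

    from xpmir.learning.optim import SGD, Adam, AdamW

    # Plain SGD with a small amount of L2 regularization
    sgd = SGD(lr=1e-3, weight_decay=1e-4)

    # Adam with its default epsilon
    adam = Adam(lr=1e-3, eps=1e-8)

    # AdamW decouples the weight decay from the gradient update
    adamw = AdamW(lr=3e-5, weight_decay=0.01)
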
XPM Config xpmir.learning.optim.ParameterOptimizer(*, optimizer, scheduler, module, filter)[source]

Bases: Config

Associates an optimizer with a list of parameters to optimize

optimizer: xpmir.learning.optim.Optimizer

The optimizer

scheduler: xpmir.learning.schedulers.Scheduler

The optional scheduler

module: xpmir.learning.optim.Module

The module from which parameters should be extracted

filter: xpmir.learning.optim.ParameterFilter = xpmir.learning.optim.ParameterFilter()

How parameters should be selected for this optimizer (by default, all parameters are used)

XPM Config xpmir.learning.optim.ParameterFilter[source]

Bases: Config

Abstract parameter filter; this base class performs no filtering (all parameters are kept)

XPM Config xpmir.learning.optim.RegexParameterFilter(*, includes, excludes)[source]

Bases: ParameterFilter

Filters parameters according to their name, using regular expressions. Precondition: exactly one of includes and excludes must be None

includes: List[str]

Regular expressions matching the names of parameters to include

excludes: List[str]

Regular expressions matching the names of parameters to exclude
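
Putting these pieces together, here is a sketch of a ParameterOptimizer that only trains the parameters matching a regular expression and uses a warmup scheduler; the module, the regex and the hyper-parameter values are illustrative assumptions:

    from xpmir.learning.optim import (
        AdamW,
        Module,
        ParameterOptimizer,
        RegexParameterFilter,
    )
    from xpmir.learning.schedulers import LinearWithWarmup

    # Hypothetical module whose parameters should be optimized
    model: Module = ...

    param_optimizer = ParameterOptimizer(
        optimizer=AdamW(lr=3e-5),
        scheduler=LinearWithWarmup(num_warmup_steps=1000),
        module=model,
        # Only optimize parameters whose name matches one of the patterns;
        # exactly one of `includes`/`excludes` is given
        filter=RegexParameterFilter(includes=[r"encoder\."]),
    )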

Parameters

During learning, some parameter-specific treatments can be applied (e.g. freezing).

Selecting

The classes below allow selecting a subset of parameters.

XPM Config xpmir.learning.parameters.InverseParametersIterator(*, iterator)[source]

Bases: ParametersIterator

Inverts the selection of a parameter iterator

iterator: xpmir.learning.parameters.ParametersIterator

The parameter iterator whose selection is inverted

XPM Config xpmir.learning.parameters.ParametersIterator[source]

Bases: Config, ABC

Iterator over module parameters

This can be useful to freeze some layers, or perform any other parameter-wise operation

XPM Config xpmir.learning.parameters.SubParametersIterator(*, model, iterator, default)[source]

Bases: ParametersIterator

Wraps a parameter iterator defined over a sub-part of a global model; parameters outside the sub-model are assigned the default value

model: xpmir.learning.optim.Module

The model from which the parameters should be gathered

iterator: xpmir.learning.parameters.ParametersIterator

The sub-model iterator

default: bool

Default value for parameters not within the sub-model

XPM Config xpmir.learning.parameters.RegexParametersIterator(*, regex, model)[source]

Bases: ParametersIterator

Iterator over all the parameters that match the given regex

regex: str

The regular expression

model: xpmir.learning.optim.Module

The model we want to select the parameters from
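
As an illustrative sketch (the parameter names used in the regex are assumptions that depend on the actual model):

    from xpmir.learning.optim import Module
    from xpmir.learning.parameters import (
        InverseParametersIterator,
        RegexParametersIterator,
    )

    # Hypothetical module
    model: Module = ...

    # Iterate over all parameters belonging to the embedding layers...
    embeddings = RegexParametersIterator(regex=r"embeddings\.", model=model)

    # ...or, conversely, over every parameter except the embeddings
    not_embeddings = InverseParametersIterator(iterator=embeddings)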

Freezing

XPM Config xpmir.learning.hooks.LayerFreezer(*, selector)[source]

Bases: InitializationTrainingHook

This training hook class can be used to freeze some of the transformer layers

selector: xpmir.learning.parameters.ParametersIterator

How to select the layers to freeze
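
For instance, to freeze the lower layers of a transformer, LayerFreezer can be combined with a RegexParametersIterator; the regex below is an assumption that depends on the model's actual parameter names:

    from xpmir.learning.hooks import LayerFreezer
    from xpmir.learning.optim import Module
    from xpmir.learning.parameters import RegexParametersIterator

    # Hypothetical module to be trained
    model: Module = ...

    # Freeze the parameters of layers 0 to 5
    freezer = LayerFreezer(
        selector=RegexParametersIterator(
            regex=r"encoder\.layer\.[0-5]\.", model=model
        )
    )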

Loading

XPM Config xpmir.learning.parameters.NameMapper[source]

Bases: Config, ABC

Changes the names of parameters

XPM Config xpmir.learning.parameters.PrefixRenamer(*, model, data)[source]

Bases: NameMapper

Changes the names of parameters

model: str

Prefix in model

data: str

Prefix in data

XPM Config xpmir.learning.parameters.PartialModuleLoader(*, value, path, selector, mapper)[source]

Bases: PathSerializationLWTask

Allows loading only a part of the parameters

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data

selector: xpmir.learning.parameters.ParametersIterator

The selector gives the list of parameters for which the loaded parameters should be used

mapper: xpmir.learning.parameters.NameMapper

Maps parameter names so that they match the saved ones

XPM Config xpmir.learning.parameters.SubModuleLoader(*, value, path, selector, saved_value)[source]

Bases: PathSerializationLWTask

Allows loading only a part of the parameters (with automatic renaming)

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data

selector: xpmir.learning.parameters.ParametersIterator

The selector gives the list of parameters for which the loaded parameters should be used

saved_value: xpmir.learning.optim.Module

The original module that is being loaded (optional; allows mapping names)
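
A sketch of a partial load that restores only the encoder weights and renames prefixes between the saved data and the model; the prefixes, the regex and the checkpoint path are illustrative assumptions:

    from pathlib import Path

    from xpmir.learning.optim import Module
    from xpmir.learning.parameters import (
        PartialModuleLoader,
        PrefixRenamer,
        RegexParametersIterator,
    )

    # Hypothetical module configuration
    model: Module = ...

    loader = PartialModuleLoader(
        value=model,
        path=Path("checkpoints/pretrained.pth"),
        # Restore only the parameters matching this selector
        selector=RegexParametersIterator(regex=r"encoder\.", model=model),
        # The checkpoint uses a "bert." prefix where the model uses "encoder."
        mapper=PrefixRenamer(model="encoder.", data="bert."),
    )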

Batching

XPM Config xpmir.learning.batchers.Batcher[source]

Bases: Config

Responsible for micro-batching when the batch does not fit in memory

The base class just does nothing (no adaptation)

XPM Config xpmir.learning.batchers.PowerAdaptativeBatcher[source]

Bases: Batcher

Starts with the provided batch size, then divides it by 2, 3, etc. until there are no more out-of-memory (OOM) errors
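
The idea behind this strategy can be sketched as follows; this is a simplified illustration of the principle, not the actual implementation:

    import torch

    def run_with_adaptive_batching(batch, process_fn, max_splits=8):
        """Run `process_fn` over `batch`, splitting it into 1, 2, 3, ... chunks
        until no CUDA out-of-memory error is raised (the whole batch is retried
        from scratch after each failure)."""
        for splits in range(1, max_splits + 1):
            chunk_size = max(1, len(batch) // splits)
            try:
                for start in range(0, len(batch), chunk_size):
                    process_fn(batch[start : start + chunk_size])
                return
            except torch.cuda.OutOfMemoryError:
                torch.cuda.empty_cache()  # release cached memory before retrying
        raise RuntimeError("Batch does not fit in memory even when split")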

Devices

The device configurations allow selecting both the device to use for computation and the way to use it (e.g. multi-GPU settings).

XPM Config xpmir.learning.devices.Device[source]

Bases: Config

Device to use, as well as specific options (e.g. parallelism)

XPM Config xpmir.learning.devices.CudaDevice(*, gpu_determ, cpu_fallback, distributed)[source]

Bases: Device

CUDA device

gpu_determ: bool = False

Sets the deterministic flag for GPU computation

cpu_fallback: bool = False

Fallback to CPU if no GPU is available

distributed: bool = False

Flag for using DistributedDataParallel: when True and the number of GPUs is greater than one, torch.nn.parallel.DistributedDataParallel is used; when False, torch.nn.DataParallel is used
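
Illustrative device configurations:

    from xpmir.learning.devices import CudaDevice, Device

    # Use a GPU if available, otherwise fall back to the CPU
    device: Device = CudaDevice(cpu_fallback=True)

    # Multi-GPU training with DistributedDataParallel
    distributed_device = CudaDevice(distributed=True)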

Schedulers

XPM Config xpmir.learning.schedulers.Scheduler[source]

Bases: Config

Base class for all optimizer schedulers

XPM Config xpmir.learning.schedulers.CosineWithWarmup(*, num_warmup_steps, num_cycles)[source]

Bases: Scheduler

Cosine schedule with warmup

Uses the implementation of the transformers library

https://huggingface.co/docs/transformers/main_classes/optimizer_schedules#transformers.get_cosine_schedule_with_warmup

num_warmup_steps: int

Number of warmup steps

num_cycles: float = 0.5

Number of cycles

XPM Config xpmir.learning.schedulers.LinearWithWarmup(*, num_warmup_steps, min_factor)[source]

Bases: Scheduler

Linear warmup followed by decay

num_warmup_steps: int

Number of warmup steps

min_factor: float = 0.0

Minimum multiplicative factor
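
Illustrative scheduler configurations (the step counts are arbitrary):

    from xpmir.learning.schedulers import CosineWithWarmup, LinearWithWarmup

    # Linear warmup for 1,000 steps, then half a cosine cycle down to zero
    cosine = CosineWithWarmup(num_warmup_steps=1000, num_cycles=0.5)

    # Linear warmup for 1,000 steps, then linear decay
    linear = LinearWithWarmup(num_warmup_steps=1000, min_factor=0.0)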

Base classes

XPM Config xpmir.learning.base.Random(*, seed)[source]

Bases: Config

Random configuration

seed: int = 0

The seed to use so the random process is deterministic

XPM Config xpmir.learning.base.Sampler[source]

Bases: Config, EasyLogger

Abstract data sampler

XPM Config xpmir.learning.trainers.Trainer(*, hooks, model)[source]

Bases: Config, EasyLogger

Generic trainer

hooks: List[xpmir.learning.context.TrainingHook] = []

Hooks for this trainer: this includes the losses, but can be adapted for other uses. The specific list of hooks depends on the specific trainer

model: xpmir.learning.optim.Module

If the model to optimize is different from the model passed to the Learner, this parameter can be used; initialization is still expected to be done at the learner level