Optimization

Modules

XPM Config xpmir.learning.optim.Module[source]

Bases: Config, Initializable, TorchModule

A module contains parameters

XPM Config xpmir.learning.optim.ModuleList(*, sub_modules)[source]

Bases: Module, Initializable

Groups different models together, to be used within the Learner

sub_modules: List[xpmir.learning.optim.Module]
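
For illustration, and assuming the usual experimaestro style of building configurations with keyword arguments, grouping two hypothetical sub-modules could look as follows:

    from xpmir.learning.optim import Module, ModuleList

    # Hypothetical sub-modules (e.g. an encoder and a scorer) defined elsewhere
    encoder: Module = ...
    scorer: Module = ...

    # Group them so that the Learner can handle them as a single module
    modules = ModuleList(sub_modules=[encoder, scorer])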

The module loader can be used to load a checkpoint

XPM Config xpmir.learning.optim.ModuleLoader(*, value, path)[source]

Bases: PathSerializationLWTask

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data
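
A minimal sketch of such a loader, assuming an existing module configuration and a hypothetical checkpoint path:

    from pathlib import Path

    from xpmir.learning.optim import Module, ModuleLoader

    # Hypothetical module configuration defined elsewhere in the experiment
    my_module: Module = ...

    # Task that restores the module parameters from a serialized checkpoint
    loader = ModuleLoader(value=my_module, path=Path("checkpoints/model.pth"))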

Optimizers

XPM Config xpmir.learning.optim.Optimizer[source]

Bases: Config

XPM Config xpmir.learning.optim.SGD(*, lr, weight_decay)[source]

Bases: Optimizer

Wrapper for the SGD optimizer in PyTorch

lr: float = 1e-05

Learning rate

weight_decay: float = 0.0

Weight decay (L2)

XPM Config xpmir.learning.optim.Adam(*, lr, weight_decay, eps)[source]

Bases: Optimizer

Wrapper for Adam optimizer in PyTorch

lr: float = 0.001

Learning rate

weight_decay: float = 0.0

Weight decay (L2)

eps: float = 1e-08

XPM Config xpmir.learning.optim.AdamW(*, lr, weight_decay, eps)[source]

Bases: Optimizer

Adam optimizer with decoupled weight decay regularization

See the PyTorch documentation

lr: float = 0.001

Learning rate

weight_decay: float = 0.01

Weight decay (L2)

eps: float = 1e-08
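
For illustration, the optimizer configurations above can be instantiated as follows (the hyper-parameter values are arbitrary examples, not recommendations):

    from xpmir.learning.optim import SGD, Adam, AdamW

    # Plain SGD with a small amount of L2 regularization
    sgd = SGD(lr=1e-3, weight_decay=1e-4)

    # Adam with its default epsilon
    adam = Adam(lr=1e-3, eps=1e-8)

    # AdamW decouples the weight decay from the gradient update
    adamw = AdamW(lr=3e-5, weight_decay=0.01)
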
XPM Config xpmir.learning.optim.ParameterOptimizer(*, optimizer, scheduler, module, filter)[source]

Bases: Config

Associates an optimizer with a list of parameters to optimize

optimizer: xpmir.learning.optim.Optimizer

The optimizer

scheduler: xpmir.learning.schedulers.Scheduler

The optional scheduler

module: xpmir.learning.optim.Module

The module from which parameters should be extracted

filter: xpmir.learning.optim.ParameterFilter = xpmir.learning.optim.ParameterFilter()

How parameters should be selected for this optimizer (by default, all parameters are used)

XPM Config xpmir.learning.optim.ParameterFilter[source]

Bases: Config

Abstract parameter filter; this base class performs no filtering (all parameters are kept)

XPM Config xpmir.learning.optim.RegexParameterFilter(*, includes, excludes)[source]

Bases: ParameterFilter

Filters parameters according to their name, using regular expressions. Precondition: exactly one of includes and excludes must be None

includes: List[str]

Regular expressions matching the names of parameters to include

excludes: List[str]

Regular expressions matching the names of parameters to exclude
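
Putting these pieces together, here is a sketch of a ParameterOptimizer that only trains the parameters matching a regular expression and uses a warmup scheduler; the module, the regex and the hyper-parameter values are illustrative assumptions:

    from xpmir.learning.optim import (
        AdamW,
        Module,
        ParameterOptimizer,
        RegexParameterFilter,
    )
    from xpmir.learning.schedulers import LinearWithWarmup

    # Hypothetical module whose parameters should be optimized
    model: Module = ...

    param_optimizer = ParameterOptimizer(
        optimizer=AdamW(lr=3e-5),
        scheduler=LinearWithWarmup(num_warmup_steps=1000),
        module=model,
        # Only optimize parameters whose name matches one of the patterns;
        # exactly one of `includes`/`excludes` is given
        filter=RegexParameterFilter(includes=[r"encoder\."]),
    )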

Parameters

During learning, some parameter-specific treatments can be applied (e.g. freezing).

Selecting

The classes below allow selecting a subset of parameters.

XPM Config xpmir.learning.parameters.InverseParametersIterator(*, iterator)[source]

Bases: ParametersIterator

Inverts the selection of a parameter iterator

iterator: xpmir.learning.parameters.ParametersIterator

The parameter iterator whose selection is inverted

XPM Config xpmir.learning.parameters.ParametersIterator[source]

Bases: Config, ABC

Iterator over module parameters

This can be useful to freeze some layers, or perform any other parameter-wise operation

XPM Config xpmir.learning.parameters.SubParametersIterator(*, model, iterator, default)[source]

Bases: ParametersIterator

Wraps a parameter iterator defined over a sub-part of a global model; parameters outside the sub-model are assigned the default value

model: xpmir.learning.optim.Module

The model from which the parameters should be gathered

iterator: xpmir.learning.parameters.ParametersIterator

The sub-model iterator

default: bool

Default value for parameters not within the sub-model

XPM Config xpmir.learning.parameters.RegexParametersIterator(*, regex, model)[source]

Bases: ParametersIterator

Iterator over all the parameters that match the given regex

regex: str

The regular expression

model: xpmir.learning.optim.Module

The model we want to select the parameters from
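
As an illustrative sketch (the parameter names used in the regex are assumptions that depend on the actual model):

    from xpmir.learning.optim import Module
    from xpmir.learning.parameters import (
        InverseParametersIterator,
        RegexParametersIterator,
    )

    # Hypothetical module
    model: Module = ...

    # Iterate over all parameters belonging to the embedding layers...
    embeddings = RegexParametersIterator(regex=r"embeddings\.", model=model)

    # ...or, conversely, over every parameter except the embeddings
    not_embeddings = InverseParametersIterator(iterator=embeddings)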

Freezing

XPM Config xpmir.learning.hooks.LayerFreezer(*, selector)[source]

Bases: InitializationTrainingHook

This training hook class can be used to freeze some of the transformer layers

selector: xpmir.learning.parameters.ParametersIterator

How to select the layers to freeze
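
For instance, to freeze the lower layers of a transformer, LayerFreezer can be combined with a RegexParametersIterator; the regex below is an assumption that depends on the model's actual parameter names:

    from xpmir.learning.hooks import LayerFreezer
    from xpmir.learning.optim import Module
    from xpmir.learning.parameters import RegexParametersIterator

    # Hypothetical module to be trained
    model: Module = ...

    # Freeze the parameters of layers 0 to 5
    freezer = LayerFreezer(
        selector=RegexParametersIterator(
            regex=r"encoder\.layer\.[0-5]\.", model=model
        )
    )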

Loading

XPM Config xpmir.learning.parameters.NameMapper[source]

Bases: Config, ABC

Changes the names of parameters

XPM Config xpmir.learning.parameters.PrefixRenamer(*, model, data)[source]

Bases: NameMapper

Changes the names of parameters

model: str

Prefix in model

data: str

Prefix in data

XPM Config xpmir.learning.parameters.PartialModuleLoader(*, value, path, selector, mapper)[source]

Bases: PathSerializationLWTask

Allows loading only a part of the parameters

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data

selector: xpmir.learning.parameters.ParametersIterator

The selector gives the list of parameters for which the loaded parameters should be used

mapper: xpmir.learning.parameters.NameMapper

Maps parameter names so that they match the saved ones

XPM Config xpmir.learning.parameters.SubModuleLoader(*, value, path, selector, saved_value)[source]

Bases: PathSerializationLWTask

Allows loading only a part of the parameters (with automatic renaming)

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data

selector: xpmir.learning.parameters.ParametersIterator

The selector gives the list of parameters for which the loaded parameters should be used

saved_value: xpmir.learning.optim.Module

The original module that is being loaded (optional; allows mapping names)
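
A sketch of a partial load that restores only the encoder weights and renames prefixes between the saved data and the model; the prefixes, the regex and the checkpoint path are illustrative assumptions:

    from pathlib import Path

    from xpmir.learning.optim import Module
    from xpmir.learning.parameters import (
        PartialModuleLoader,
        PrefixRenamer,
        RegexParametersIterator,
    )

    # Hypothetical module configuration
    model: Module = ...

    loader = PartialModuleLoader(
        value=model,
        path=Path("checkpoints/pretrained.pth"),
        # Restore only the parameters matching this selector
        selector=RegexParametersIterator(regex=r"encoder\.", model=model),
        # The checkpoint uses a "bert." prefix where the model uses "encoder."
        mapper=PrefixRenamer(model="encoder.", data="bert."),
    )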

Batching

XPM Config xpmir.learning.batchers.Batcher[source]

Bases: Config

Responsible for micro-batching when the batch does not fit in memory

The base class just does nothing (no adaptation)

XPM Config xpmir.learning.batchers.PowerAdaptativeBatcher[source]

Bases: Batcher

Starts with the provided batch size, then divides it by 2, 3, etc. until there are no more out-of-memory (OOM) errors
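
The idea behind this strategy can be sketched as follows; this is a simplified illustration of the principle, not the actual implementation:

    import torch

    def run_with_adaptive_batching(batch, process_fn, max_splits=8):
        """Run `process_fn` over `batch`, splitting it into 1, 2, 3, ... chunks
        until no CUDA out-of-memory error is raised (the whole batch is retried
        from scratch after each failure)."""
        for splits in range(1, max_splits + 1):
            chunk_size = max(1, len(batch) // splits)
            try:
                for start in range(0, len(batch), chunk_size):
                    process_fn(batch[start : start + chunk_size])
                return
            except torch.cuda.OutOfMemoryError:
                torch.cuda.empty_cache()  # release cached memory before retrying
        raise RuntimeError("Batch does not fit in memory even when split")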

Devices

The device configurations allow selecting both the device to use for computation and the way to use it (e.g. multi-GPU settings).

XPM Config xpmir.learning.devices.Device[source]

Bases: Config

Device to use, as well as specific options (e.g. parallelism)

XPM Config xpmir.learning.devices.CudaDevice(*, gpu_determ, cpu_fallback, distributed)[source]

Bases: Device

CUDA device

gpu_determ: bool = False

Sets the deterministic flag for GPU computation

cpu_fallback: bool = False

Fallback to CPU if no GPU is available

distributed: bool = False

Flag for using DistributedDataParallel: when True and the number of GPUs is greater than one, torch.nn.parallel.DistributedDataParallel is used; when False, torch.nn.DataParallel is used
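
Illustrative device configurations:

    from xpmir.learning.devices import CudaDevice, Device

    # Use a GPU if available, otherwise fall back to the CPU
    device: Device = CudaDevice(cpu_fallback=True)

    # Multi-GPU training with DistributedDataParallel
    distributed_device = CudaDevice(distributed=True)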

Schedulers

XPM Config xpmir.learning.schedulers.Scheduler[source]

Bases: Config

Base class for all optimizer schedulers

XPM Config xpmir.learning.schedulers.CosineWithWarmup(*, num_warmup_steps, num_cycles)[source]

Bases: Scheduler

Cosine schedule with warmup

Uses the implementation of the transformers library

https://huggingface.co/docs/transformers/main_classes/optimizer_schedules#transformers.get_cosine_schedule_with_warmup

num_warmup_steps: int

Number of warmup steps

num_cycles: float = 0.5

Number of cycles

XPM Config xpmir.learning.schedulers.LinearWithWarmup(*, num_warmup_steps, min_factor)[source]

Bases: Scheduler

Linear warmup followed by decay

num_warmup_steps: int

Number of warmup steps

min_factor: float = 0.0

Minimum multiplicative factor
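
Illustrative scheduler configurations (the step counts are arbitrary):

    from xpmir.learning.schedulers import CosineWithWarmup, LinearWithWarmup

    # Linear warmup for 1,000 steps, then half a cosine cycle down to zero
    cosine = CosineWithWarmup(num_warmup_steps=1000, num_cycles=0.5)

    # Linear warmup for 1,000 steps, then linear decay
    linear = LinearWithWarmup(num_warmup_steps=1000, min_factor=0.0)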

Base classes

XPM Config xpmir.learning.base.Random(*, seed)[source]

Bases: Config

Random configuration

seed: int = 0

The seed to use so the random process is deterministic

XPM Config xpmir.learning.base.Sampler[source]

Bases: Config, EasyLogger

Abstract data sampler

XPM Config xpmir.learning.trainers.Trainer(*, hooks, model)[source]

Bases: Config, EasyLogger

Generic trainer

hooks: List[xpmir.learning.context.TrainingHook] = []

Hooks for this trainer: this includes the losses, but can be adapted for other uses. The specific list of hooks depends on the specific trainer

model: xpmir.learning.optim.Module

If the model to optimize is different from the model passed to the Learner, this parameter can be used; initialization is still expected to be done at the learner level