Optimization

Modules

XPM Configxpmir.learning.optim.Module[source]

Bases: Config, Initializable, TorchModule

Submit type: xpmir.learning.optim.Module

A module contains parameters

XPM Configxpmir.learning.optim.ModuleList(*, sub_modules)[source]

Bases: Module, Initializable

Submit type: xpmir.learning.optim.ModuleList

Groups different models together, to be used within the Learner

sub_modules: List[xpmir.learning.optim.Module]

The module loader can be used to load a checkpoint

XPM Configxpmir.learning.optim.ModuleLoader(*, value, path)[source]

Bases: PathSerializationLWTask

Submit type: xpmir.learning.optim.ModuleLoader

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data
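A hedged sketch of grouping modules and loading a checkpoint, assuming the usual experimaestro keyword-argument construction; `encoder` and `scorer` stand for Module configurations defined elsewhere, and the checkpoint path is hypothetical:

from pathlib import Path
from xpmir.learning.optim import ModuleList, ModuleLoader

# Group two (hypothetical) modules so they are handled together by the Learner
model = ModuleList(sub_modules=[encoder, scorer])

# Restore the grouped parameters from a previously saved checkpoint
loader = ModuleLoader(value=model, path=Path("checkpoints/model"))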

Optimizers

XPM Configxpmir.learning.optim.Optimizer[source]

Bases: Config

Submit type: xpmir.learning.optim.Optimizer

XPM Configxpmir.learning.optim.SGD(*, lr, weight_decay)[source]

Bases: Optimizer

Submit type: xpmir.learning.optim.SGD

Wrapper for the SGD optimizer in PyTorch

lr: float = 1e-05

Learning rate

weight_decay: float = 0.0

Weight decay (L2)

XPM Configxpmir.learning.optim.Adam(*, lr, weight_decay, eps)[source]

Bases: Optimizer

Submit type: xpmir.learning.optim.Adam

Wrapper for Adam optimizer in PyTorch

lr: float = 0.001

Learning rate

weight_decay: float = 0.0

Weight decay (L2)

eps: float = 1e-08

Epsilon added to the denominator for numerical stability

XPM Configxpmir.learning.optim.AdamW(*, lr, weight_decay, eps)[source]

Bases: Optimizer

Submit type: xpmir.learning.optim.AdamW

Adam optimizer with decoupled weight decay regularization

See the PyTorch documentation

lr: float = 0.001

Learning rate

weight_decay: float = 0.01

Weight decay (L2)

eps: float = 1e-08

Epsilon added to the denominator for numerical stability

XPM Configxpmir.learning.optim.Adafactor(*, lr, weight_decay, relative_step)[source]

Bases: Optimizer

Submit type: xpmir.learning.optim.Adafactor

Wrapper for the Adafactor optimizer in the Transformers library

See transformers.optimization.Adafactor for full documentation

lr: float

Learning rate

weight_decay: float = 0.0

Weight decay (L2)

relative_step: bool = True

If True, a time-dependent learning rate is computed instead of using the external learning rate
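As a minimal sketch, assuming the usual experimaestro keyword-argument construction of XPM configurations, the optimizers above are declared by setting the parameters they document:

from xpmir.learning.optim import SGD, Adam, AdamW

# Plain SGD with a small amount of L2 regularization
sgd = SGD(lr=1e-3, weight_decay=1e-4)

# Adam with its default epsilon
adam = Adam(lr=1e-4)

# AdamW decouples the weight decay from the gradient update
adamw = AdamW(lr=3e-5, weight_decay=0.01, eps=1e-8)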

XPM Configxpmir.learning.optim.ParameterOptimizer(*, optimizer, scheduler, module, filter)[source]

Bases: Config

Submit type: xpmir.learning.optim.ParameterOptimizer

Associates an optimizer with a list of parameters to optimize

optimizer: xpmir.learning.optim.Optimizer

The optimizer

scheduler: xpmir.learning.schedulers.Scheduler

The optional scheduler

module: xpmir.learning.optim.Module

The module from which parameters should be extracted

filter: xpmir.learning.optim.ParameterFilter = xpmir.learning.optim.ParameterFilter.XPMValue()

How parameters should be selected for this optimizer (by default, use them all)

XPM Configxpmir.learning.optim.ParameterFilter[source]

Bases: Config

Submit type: xpmir.learning.optim.ParameterFilter

Abstract parameter filter; the base class performs no filtering

XPM Configxpmir.learning.optim.RegexParameterFilter(*, includes, excludes)[source]

Bases: ParameterFilter

Submit type: xpmir.learning.optim.RegexParameterFilter

Filters parameters by name using regular expressions. Precondition: exactly one of includes and excludes must be None

includes: List[str]

Regular expressions matching the names of the parameters to include

excludes: List[str]

Regular expressions matching the names of the parameters to exclude
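A hedged sketch combining ParameterOptimizer and RegexParameterFilter (the regular expressions are hypothetical, and the module is assumed to default to the one passed to the Learner), so that bias and LayerNorm parameters are optimized without weight decay while the remaining parameters use it:

from xpmir.learning.optim import AdamW, ParameterOptimizer, RegexParameterFilter

# Parameters whose names match these (hypothetical) patterns get no weight decay
no_decay = RegexParameterFilter(includes=[r"\.bias$", r"LayerNorm\."])
# All other parameters are decayed normally
decay = RegexParameterFilter(excludes=[r"\.bias$", r"LayerNorm\."])

param_optimizers = [
    ParameterOptimizer(optimizer=AdamW(lr=3e-5, weight_decay=0.0), filter=no_decay),
    ParameterOptimizer(optimizer=AdamW(lr=3e-5, weight_decay=0.01), filter=decay),
]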

XPM Configxpmir.learning.optim.OptimizationHook[source]

Bases: Hook

Submit type: xpmir.learning.optim.OptimizationHook

Base class for all optimization hooks

Hooks

XPM Configxpmir.learning.optim.GradientHook[source]

Bases: OptimizationHook

Submit type: xpmir.learning.optim.GradientHook

Hooks that are called when the gradient is computed

The gradient is guaranteed to be unscaled in this case.

XPM Configxpmir.learning.optim.GradientClippingHook(*, max_norm)[source]

Bases: GradientHook

Submit type: xpmir.learning.optim.GradientClippingHook

Gradient clipping

max_norm: float

Maximum norm for gradient clipping
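A minimal sketch of declaring the hook (how it is registered with the learner is not shown here):

from xpmir.learning.optim import GradientClippingHook

# Clip the unscaled gradient to a maximum norm of 1.0
clipping = GradientClippingHook(max_norm=1.0)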

XPM Configxpmir.learning.optim.GradientLogHook(*, name)[source]

Bases: GradientHook

Submit type: xpmir.learning.optim.GradientLogHook

Logs the gradient norm

name: str = gradient_norm

Parameters

During learning, some parameter-specific treatments can be applied (e.g. freezing).

Selecting

The classes below allow selecting a subset of parameters.

XPM Configxpmir.learning.parameters.InverseParametersIterator(*, iterator)[source]

Bases: ParametersIterator

Submit type: xpmir.learning.parameters.InverseParametersIterator

Inverts the selection of a parameter iterator

iterator: xpmir.learning.parameters.ParametersIterator

XPM Configxpmir.learning.parameters.ParametersIterator[source]

Bases: Config, ABC

Submit type: xpmir.learning.parameters.ParametersIterator

Iterator over module parameters

This can be useful to freeze some layers, or perform any other parameter-wise operation

XPM Configxpmir.learning.parameters.SubParametersIterator(*, model, iterator, default)[source]

Bases: ParametersIterator

Submit type: xpmir.learning.parameters.SubParametersIterator

Wraps a parameter iterator over a global model and a selector over a subpart of the model

model: xpmir.learning.optim.Module

The model from which the parameters should be gathered

iterator: xpmir.learning.parameters.ParametersIterator

The sub-model iterator

default: bool

Default value for parameters not within the sub-model

XPM Configxpmir.learning.parameters.RegexParametersIterator(*, regex, model)[source]

Bases: ParametersIterator

Submit type: xpmir.learning.parameters.RegexParametersIterator

Iterator over all the parameters that match the given regex

regex: str

The regular expression

model: xpmir.learning.optim.Module

The model from which the parameters are selected

Freezing

XPM Configxpmir.learning.hooks.LayerFreezer(*, selector)[source]

Bases: InitializationTrainingHook

Submit type: xpmir.learning.hooks.LayerFreezer

This training hook class can be used to freeze some of the transformer layers

selector: xpmir.learning.parameters.ParametersIterator

How to select the layers to freeze
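A hedged sketch combining the two, where `encoder` stands for a Module configuration defined elsewhere and the regular expression is hypothetical:

from xpmir.learning.hooks import LayerFreezer
from xpmir.learning.parameters import RegexParametersIterator

# Select all parameters whose names contain "embeddings." (hypothetical pattern)
selector = RegexParametersIterator(regex=r"embeddings\.", model=encoder)

# Freeze the selected parameters during training
freezer = LayerFreezer(selector=selector)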

Loading

XPM Configxpmir.learning.parameters.NameMapper[source]

Bases: Config, ABC

Submit type: xpmir.learning.parameters.NameMapper

Changes the names of parameters

XPM Configxpmir.learning.parameters.PrefixRenamer(*, model, data)[source]

Bases: NameMapper

Submit type: xpmir.learning.parameters.PrefixRenamer

Changes the names of parameters

model: str

Prefix in model

data: str

Prefix in data

XPM Configxpmir.learning.parameters.PartialModuleLoader(*, value, path, selector, mapper)[source]

Bases: PathSerializationLWTask

Submit type: xpmir.learning.parameters.PartialModuleLoader

Allows loading only a subset of the parameters

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data

selector: xpmir.learning.parameters.ParametersIterator

The selector gives the list of parameters for which loaded parameters should be used

mapper: xpmir.learning.parameters.NameMapper

Maps parameter names so that they match the saved ones

XPM Configxpmir.learning.parameters.SubModuleLoader(*, value, path, selector, saved_value)[source]

Bases: PathSerializationLWTask

Submit type: xpmir.learning.parameters.SubModuleLoader

Allows loading only a subset of the parameters (with automatic renaming)

value: experimaestro.core.objects.Config

The configuration that will be serialized

path: Path

Path containing the data

selector: xpmir.learning.parameters.ParametersIterator

The selector gives the list of parameters for which loaded parameters should be used

saved_value: xpmir.learning.optim.Module

The original module that is being loaded (optional; allows mapping names)
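A hedged sketch of a partial load where the checkpoint stores parameters under a "bert." prefix while the current model uses "encoder." (both prefixes, the path, and `model` are hypothetical):

from pathlib import Path
from xpmir.learning.parameters import (
    PartialModuleLoader,
    PrefixRenamer,
    RegexParametersIterator,
)

# Only parameters of the encoder sub-module are restored (hypothetical pattern)
selector = RegexParametersIterator(regex=r"^encoder\.", model=model)
# Rename "encoder." (model side) to "bert." (checkpoint side) when matching names
mapper = PrefixRenamer(model="encoder.", data="bert.")

loader = PartialModuleLoader(
    value=model,
    path=Path("checkpoints/pretrained"),
    selector=selector,
    mapper=mapper,
)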

Batching

XPM Configxpmir.learning.batchers.Batcher[source]

Bases: Config

Submit type: xpmir.learning.batchers.Batcher

Responsible for micro-batching when the batch does not fit in memory

The base class just does nothing (no adaptation)

XPM Configxpmir.learning.batchers.PowerAdaptativeBatcher[source]

Bases: Batcher

Submit type: xpmir.learning.batchers.PowerAdaptativeBatcher

Starts with the provided batch size, then divides it by 2, 3, etc. until there are no more out-of-memory (OOM) errors

Devices

The device configurations allow selecting both the device to use for computation and the way to use it (e.g. multi-GPU settings).

XPM Configxpmir.learning.devices.Device[source]

Bases: Config

Submit type: xpmir.learning.devices.Device

Device to use, as well as specific options (e.g. parallelism)

XPM Configxpmir.learning.devices.CudaDevice(*, gpu_determ, cpu_fallback, distributed)[source]

Bases: Device

Submit type: xpmir.learning.devices.CudaDevice

CUDA device

gpu_determ: bool = False

Sets deterministic mode for GPU computation

cpu_fallback: bool = False

Fallback to CPU if no GPU is available

distributed: bool = False

Flag for using DistributedDataParallel: when True and the number of GPUs is greater than 1, use torch.nn.parallel.DistributedDataParallel; when False, use torch.nn.DataParallel
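A minimal sketch of a CUDA device configuration that falls back to the CPU when no GPU is available and enables distributed training:

from xpmir.learning.devices import CudaDevice

device = CudaDevice(
    cpu_fallback=True,   # run on the CPU if no GPU is available
    distributed=True,    # use DistributedDataParallel when more than one GPU is present
)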

Schedulers

XPM Configxpmir.learning.schedulers.Scheduler[source]

Bases: Config

Submit type: xpmir.learning.schedulers.Scheduler

Base class for all optimizer schedulers

XPM Configxpmir.learning.schedulers.CosineWithWarmup(*, num_warmup_steps, num_cycles)[source]

Bases: Scheduler

Submit type: xpmir.learning.schedulers.CosineWithWarmup

Cosine schedule with warmup

Uses the implementation from the Transformers library

https://huggingface.co/docs/transformers/main_classes/optimizer_schedules#transformers.get_cosine_schedule_with_warmup

num_warmup_steps: int

Number of warmup steps

num_cycles: float = 0.5

Number of cycles

XPM Configxpmir.learning.schedulers.LinearWithWarmup(*, num_warmup_steps, min_factor)[source]

Bases: Scheduler

Submit type: xpmir.learning.schedulers.LinearWithWarmup

Linear warmup followed by decay

num_warmup_steps: int

Number of warmup steps

min_factor: float = 0.0

Minimum multiplicative factor
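A hedged sketch attaching a scheduler to a ParameterOptimizer (the step count is hypothetical, and the module is assumed to default to the Learner's model):

from xpmir.learning.optim import AdamW, ParameterOptimizer
from xpmir.learning.schedulers import LinearWithWarmup

# Linear warmup over the first 1000 steps, then linear decay
scheduler = LinearWithWarmup(num_warmup_steps=1000)

param_optimizer = ParameterOptimizer(
    optimizer=AdamW(lr=3e-5),
    scheduler=scheduler,
)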

Base classes

XPM Configxpmir.learning.base.Random(*, seed)[source]

Bases: Config

Submit type: xpmir.learning.base.Random

Random configuration

seed: int = 0

The seed to use so the random process is deterministic

XPM Configxpmir.learning.base.Sampler[source]

Bases: Config, EasyLogger

Submit type: xpmir.learning.base.Sampler

Abstract data sampler

XPM Configxpmir.learning.base.BaseSampler[source]

Bases: Sampler, Generic[T], ABC

Submit type: xpmir.learning.base.BaseSampler

XPM Configxpmir.learning.trainers.Trainer(*, hooks, model)[source]

Bases: Config, EasyLogger

Submit type: xpmir.learning.trainers.Trainer

Generic trainer

hooks: List[xpmir.learning.context.TrainingHook] = []

Hooks for this trainer: this includes the losses, but can be adapted for other uses. The specific list of hooks depends on the specific trainer

model: xpmir.learning.optim.Module

If the model to optimize is different from the model passed to the Learner, this parameter can be used; initialization is still expected to be done at the learner level