Optimization
Modules
- XPM Config xpmir.learning.optim.Module
Bases: Config, Initializable, TorchModule
Submit type: xpmir.learning.optim.Module
A module contains parameters
- XPM Config xpmir.learning.optim.ModuleList(*, sub_modules)
Bases: Module, Initializable
Submit type: xpmir.learning.optim.ModuleList
Groups different models together, to be used within the Learner
- sub_modules: List[xpmir.learning.optim.Module]
The ModuleLoader class below can be used to load a checkpoint.
- XPM Config xpmir.learning.optim.ModuleLoader(*, value, path)
Bases: PathSerializationLWTask
Submit type: xpmir.learning.optim.ModuleLoader
- value: experimaestro.core.objects.config.Config
The configuration that will be serialized
- path: Path
Path containing the data
Optimizers
- XPM Config xpmir.learning.optim.Optimizer
Bases: Config
Submit type: xpmir.learning.optim.Optimizer
- XPM Config xpmir.learning.optim.SGD(*, lr, weight_decay)
Bases: Optimizer
Submit type: xpmir.learning.optim.SGD
Wrapper for the SGD optimizer in PyTorch
- lr: float = 1e-05
Learning rate
- weight_decay: float = 0.0
Weight decay (L2)
- XPM Config xpmir.learning.optim.Adam(*, lr, weight_decay, eps)
Bases: Optimizer
Submit type: xpmir.learning.optim.Adam
Wrapper for the Adam optimizer in PyTorch
- lr: float = 0.001
Learning rate
- weight_decay: float = 0.0
Weight decay (L2)
- eps: float = 1e-08
- XPM Config xpmir.learning.optim.AdamW(*, lr, weight_decay, eps)
Bases: Optimizer
Submit type: xpmir.learning.optim.AdamW
Adam optimizer that takes into account the regularization
See the PyTorch documentation
- lr: float = 0.001
- weight_decay: float = 0.01
- eps: float = 1e-08
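The difference between the L2 weight decay used with Adam and the decoupled decay of AdamW can be illustrated on a single scalar parameter. The following is a minimal sketch of the update rules as described in the PyTorch documentation, not the xpmir wrapper itself; `adam_step` and its `decoupled` flag are hypothetical names introduced for illustration.

```python
import math

def adam_step(p, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=False):
    """One Adam update on a scalar parameter p with gradient g.

    decoupled=False: weight decay is added to the gradient (L2 penalty, Adam).
    decoupled=True:  the parameter is shrunk directly (AdamW).
    `adam_step` is an illustrative helper, not part of xpmir or PyTorch.
    """
    if decoupled:
        p = p * (1.0 - lr * weight_decay)
    else:
        g = g + weight_decay * p
    # Exponential moving averages of the gradient and its square
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    # Bias correction for step t (t starts at 1)
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v

# With a zero gradient, only the weight decay moves the parameter
p_adamw, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.01, decoupled=True)
p_adam, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.01, decoupled=False)
```

In the decoupled case the shrinkage is proportional to `lr * weight_decay`, whereas in the L2 case the decay term passes through the adaptive normalization like any other gradient.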
- XPM Config xpmir.learning.optim.Adafactor(*, lr, weight_decay, relative_step)
Bases: Optimizer
Submit type: xpmir.learning.optim.Adafactor
Wrapper for the Adafactor optimizer from the Transformers library
See transformers.optimization.Adafactor for full documentation
- lr: float
Learning rate
- weight_decay: float = 0.0
Weight decay (L2)
- relative_step: bool = True
If true, a time-dependent learning rate is computed instead of using the external learning rate
- XPM Config xpmir.learning.optim.ParameterOptimizer(*, optimizer, scheduler, module, filter)
Bases: Config
Submit type: xpmir.learning.optim.ParameterOptimizer
Associates an optimizer with a list of parameters to optimize
- optimizer: xpmir.learning.optim.Optimizer
The optimizer
- scheduler: xpmir.learning.schedulers.Scheduler
The optional scheduler
- module: xpmir.learning.optim.Module
The module from which parameters should be extracted
- filter: xpmir.learning.optim.ParameterFilter = xpmir.learning.optim.ParameterFilter.XPMValue()
How parameters should be selected for this optimizer (by default, all of them are used)
- XPM Config xpmir.learning.optim.ParameterFilter
Bases: Config
Submit type: xpmir.learning.optim.ParameterFilter
Abstract class that does not filter anything
- XPM Config xpmir.learning.optim.RegexParameterFilter(*, includes, excludes)
Bases: ParameterFilter
Submit type: xpmir.learning.optim.RegexParameterFilter
Filters the parameters of the model by name
Precondition: only one of includes and excludes can be None
- includes: List[str]
The strings identifying the parameters to include from the model
- excludes: List[str]
The strings identifying the parameters to exclude from the model
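The effect of includes and excludes on parameter names can be sketched in plain Python. This is an illustration only: `filter_parameter_names` is a hypothetical helper, and the matching semantics (a name is kept if it matches at least one include pattern, when given, and no exclude pattern, using `re.search`) are assumptions, not the documented behaviour of RegexParameterFilter.

```python
import re
from typing import Iterable, List, Optional

def filter_parameter_names(
    names: Iterable[str],
    includes: Optional[List[str]] = None,
    excludes: Optional[List[str]] = None,
) -> List[str]:
    """Keep the names accepted by the include/exclude patterns (assumed semantics)."""
    kept = []
    for name in names:
        # When includes is given, at least one pattern must match
        if includes is not None and not any(re.search(p, name) for p in includes):
            continue
        # When excludes is given, no pattern may match
        if excludes is not None and any(re.search(p, name) for p in excludes):
            continue
        kept.append(name)
    return kept

names = ["encoder.layer.0.weight", "encoder.layer.0.bias", "classifier.weight"]
without_bias = filter_parameter_names(names, excludes=[r"\.bias$"])
only_classifier = filter_parameter_names(names, includes=[r"^classifier\."])
```

A typical use of such a split is to give two ParameterOptimizer instances different settings, for example no weight decay on bias terms.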
- XPM Config xpmir.learning.optim.OptimizationHook
Bases: Hook
Submit type: xpmir.learning.optim.OptimizationHook
Base class for all optimization hooks
Hooks
- XPM Config xpmir.learning.optim.GradientHook
Bases: OptimizationHook
Submit type: xpmir.learning.optim.GradientHook
Hooks that are called when the gradient is computed
The gradient is guaranteed to be unscaled in this case.
- XPM Config xpmir.learning.optim.GradientClippingHook(*, max_norm)
Bases: GradientHook
Submit type: xpmir.learning.optim.GradientClippingHook
Gradient clipping
- max_norm: float
Maximum norm for gradient clipping
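Clipping by a maximum norm can be sketched without any tensor library. The code below follows the usual global-norm formulation (the one documented for `torch.nn.utils.clip_grad_norm_`); whether GradientClippingHook relies on that function is an assumption, and `clip_by_global_norm` is a hypothetical helper working on plain lists of floats.

```python
import math
from typing import List

def clip_by_global_norm(grads: List[List[float]], max_norm: float,
                        eps: float = 1e-6) -> float:
    """Rescale all gradients in place so that their global L2 norm is at most
    max_norm, and return the norm measured before clipping."""
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The coefficient is capped at 1 so small gradients are left untouched
    coef = min(1.0, max_norm / (total_norm + eps))
    for grad in grads:
        for i in range(len(grad)):
            grad[i] *= coef
    return total_norm

grads = [[3.0, 0.0], [0.0, 4.0]]  # global norm is 5
norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for grad in grads for g in grad))
```

All gradients are scaled by the same coefficient, so the direction of the update is preserved and only its length changes.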
- XPM Config xpmir.learning.optim.GradientLogHook(*, name)
Bases: GradientHook
Submit type: xpmir.learning.optim.GradientLogHook
Log the gradient norm
- name: str = gradient_norm
Parameters
During learning, some parameter-specific treatments can be applied (e.g. freezing).
Selecting
The classes below select a subset of parameters.
- XPM Config xpmir.learning.parameters.InverseParametersIterator(*, iterator)
Bases: ParametersIterator
Submit type: xpmir.learning.parameters.InverseParametersIterator
Inverts the selection of a parameter iterator
- XPM Config xpmir.learning.parameters.ParametersIterator
Bases: Config, ABC
Submit type: xpmir.learning.parameters.ParametersIterator
Iterator over module parameters
This can be useful to freeze some layers, or perform any other parameter-wise operation
- XPM Config xpmir.learning.parameters.SubParametersIterator(*, model, iterator, default)
Bases: ParametersIterator
Submit type: xpmir.learning.parameters.SubParametersIterator
Wraps a parameter iterator over a global model and a selector over a subpart of the model
- model: xpmir.learning.optim.Module
The model from which the parameters should be gathered
- iterator: xpmir.learning.parameters.ParametersIterator
The sub-model iterator
- default: bool
Default value for parameters not within the sub-model
- XPM Config xpmir.learning.parameters.RegexParametersIterator(*, negative_regex, regex, model)
Bases: ParametersIterator
Submit type: xpmir.learning.parameters.RegexParametersIterator
Iterator over all the parameters which match the given regex
- negative_regex: str
The negative regex expression (should not match)
- regex: str
The regex expression
- model: xpmir.learning.optim.Module
The model we want to select the parameters from
Freezing
- XPM Config xpmir.learning.hooks.LayerFreezer(*, selector)
Bases: InitializationTrainingHook
Submit type: xpmir.learning.hooks.LayerFreezer
This training hook class can be used to freeze a subset of model parameters
- selector: xpmir.learning.parameters.ParametersIterator
How to select the layers to freeze
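How a regex-based selection, its inversion and freezing fit together can be sketched with stand-in objects. Everything here is hypothetical: `Param` replaces a real torch parameter, and the assumed semantics are that a name is selected when it matches `regex` (if given) and does not match `negative_regex` (if given). The actual xpmir classes may behave differently.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class Param:
    """Stand-in for a torch parameter; only the requires_grad flag matters here."""
    requires_grad: bool = True

def regex_selection(params: Dict[str, Param], regex: Optional[str] = None,
                    negative_regex: Optional[str] = None) -> Dict[str, bool]:
    """Map each parameter name to True when it is selected (assumed semantics)."""
    selection = {}
    for name in params:
        selected = regex is None or re.search(regex, name) is not None
        if negative_regex is not None and re.search(negative_regex, name):
            selected = False
        selection[name] = selected
    return selection

def invert(selection: Dict[str, bool]) -> Dict[str, bool]:
    """Select exactly the parameters that were not selected."""
    return {name: not selected for name, selected in selection.items()}

def freeze(params: Dict[str, Param], selection: Dict[str, bool]) -> None:
    """Disable gradient computation for the selected parameters."""
    for name, selected in selection.items():
        if selected:
            params[name].requires_grad = False

params = {
    "encoder.embeddings.weight": Param(),
    "encoder.layer.0.weight": Param(),
    "head.weight": Param(),
}
selection = regex_selection(params, regex=r"^encoder\.", negative_regex=r"embeddings")
freeze(params, selection)
trainable = sorted(name for name, p in params.items() if p.requires_grad)
```

Passing `invert(selection)` to `freeze` instead would freeze everything except the selected encoder layers.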
Loading
- XPM Config xpmir.learning.parameters.NameMapper
Bases: Config, ABC
Submit type: xpmir.learning.parameters.NameMapper
Changes name of parameters
- XPM Config xpmir.learning.parameters.PrefixRenamer(*, model, data)
Bases: NameMapper
Submit type: xpmir.learning.parameters.PrefixRenamer
Changes name of parameters
- model: str
Prefix in model
- data: str
Prefix in data
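Prefix renaming amounts to rewriting the keys of a saved state dictionary. The sketch below assumes the mapping goes from the prefix used in the saved data to the prefix used in the model, and that keys without the data prefix are left unchanged; `rename_prefix` is a hypothetical helper and both points are assumptions rather than documented behaviour.

```python
from typing import Any, Dict

def rename_prefix(state: Dict[str, Any], model_prefix: str,
                  data_prefix: str) -> Dict[str, Any]:
    """Return a copy of `state` where keys starting with `data_prefix`
    start with `model_prefix` instead (other keys are kept as they are)."""
    renamed = {}
    for key, value in state.items():
        if key.startswith(data_prefix):
            key = model_prefix + key[len(data_prefix):]
        renamed[key] = value
    return renamed

saved = {"bert.embeddings.weight": 1, "pooler.weight": 2}
renamed = rename_prefix(saved, model_prefix="encoder.model.", data_prefix="bert.")
```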
- XPM Config xpmir.learning.parameters.PartialModuleLoader(*, value, path, selector, mapper)
Bases: PathSerializationLWTask
Submit type: xpmir.learning.parameters.PartialModuleLoader
Loads only a part of the parameters
- value: experimaestro.core.objects.config.Config
The configuration that will be serialized
- path: Path
Path containing the data
- selector: xpmir.learning.parameters.ParametersIterator
The selectors gives the list of parameters for which some
- mapper: xpmir.learning.parameters.NameMapper
Maps parameter names so that they match the saved ones
- XPM Config xpmir.learning.parameters.SubModuleLoader(*, value, path, selector, saved_value)
Bases: PathSerializationLWTask
Submit type: xpmir.learning.parameters.SubModuleLoader
Loads only a part of the parameters (with automatic renaming)
- value: experimaestro.core.objects.config.Config
The configuration that will be serialized
- path: Path
Path containing the data
- selector: xpmir.learning.parameters.ParametersIterator
The selector gives the list of parameters for which loaded parameters should be used
- saved_value: xpmir.learning.optim.Module
The original module that is being loaded (optional, allows names to be mapped)
Batching
- XPM Config xpmir.learning.batchers.Batcher
Bases: Config
Submit type: xpmir.learning.batchers.Batcher
Responsible for micro-batching when the batch does not fit in memory
The base class just does nothing (no adaptation)
- XPM Config xpmir.learning.batchers.PowerAdaptativeBatcher
Bases: Batcher
Submit type: xpmir.learning.batchers.PowerAdaptativeBatcher
Starts with the provided batch size, and then divides it by 2, 3, etc. until no out-of-memory (OOM) error occurs
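The retry strategy can be sketched as a loop that increases the divisor after each failure. This is a simplified illustration, not the xpmir implementation: `process_with_adaptive_batching` is a hypothetical helper, Python's `MemoryError` stands in for a GPU out-of-memory error, and results of a failed attempt are simply recomputed.

```python
import math
from typing import Callable, List, Sequence, Tuple

def process_with_adaptive_batching(
    batch: Sequence, process: Callable[[Sequence], object]
) -> Tuple[List[object], int]:
    """Process `batch` in micro-batches, dividing the batch size by 1, 2, 3...
    until no MemoryError is raised. Returns the results and the divisor used."""
    divisor = 1
    while True:
        size = max(1, math.ceil(len(batch) / divisor))
        try:
            results = [process(batch[i:i + size]) for i in range(0, len(batch), size)]
            return results, divisor
        except MemoryError:
            if size == 1:
                raise  # cannot split any further
            divisor += 1

def fake_process(micro_batch: Sequence[int]) -> int:
    """Pretend that more than 3 items do not fit in memory."""
    if len(micro_batch) > 3:
        raise MemoryError("simulated OOM")
    return sum(micro_batch)

results, divisor = process_with_adaptive_batching(list(range(8)), fake_process)
```

In a training loop the per-micro-batch results would typically be gradients accumulated before a single optimizer step, so the effective batch size stays the same.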
Devices
The device configuration selects both the device to use for computation and the way to use it (i.e. multi-GPU settings).
- XPM Config xpmir.learning.devices.Device
Bases: Config
Submit type: xpmir.learning.devices.Device
Device to use, as well as specific options (e.g. parallelism)
- XPM Config xpmir.learning.devices.CudaDevice(*, gpu_determ, cpu_fallback, distributed)
Bases: Device
Submit type: xpmir.learning.devices.CudaDevice
CUDA device
- gpu_determ: bool = False
Sets the deterministic
- cpu_fallback: bool = False
Fallback to CPU if no GPU is available
- distributed: bool = False
Flag for using DistributedDataParallel. When the number of GPUs is greater than one, torch.nn.parallel.DistributedDataParallel is used if distributed is True, and torch.nn.DataParallel is used if it is False.
- XPM Config xpmir.learning.devices.BestDevice
Bases: Device
Submit type: xpmir.learning.devices.BestDevice
Tries to use a GPU device if one exists, falls back to CPU otherwise
To be used when debugging
Schedulers
- XPM Config xpmir.learning.schedulers.Scheduler
Bases: Config
Submit type: xpmir.learning.schedulers.Scheduler
Base class for all optimizer schedulers
- XPM Config xpmir.learning.schedulers.CosineWithWarmup(*, num_warmup_steps, num_cycles)
Bases: Scheduler
Submit type: xpmir.learning.schedulers.CosineWithWarmup
Cosine schedule with warmup
Uses the implementation from the Transformers library
- num_warmup_steps: int
Number of warmup steps
- num_cycles: float = 0.5
Number of cycles
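The multiplicative factor applied to the learning rate can be written down directly. The sketch below reproduces the formula used by `get_cosine_schedule_with_warmup` in the Transformers library; `num_training_steps` is not a parameter of this configuration and is assumed to be supplied by the training loop, and `cosine_with_warmup_factor` is a hypothetical name.

```python
import math

def cosine_with_warmup_factor(step: int, num_warmup_steps: int,
                              num_training_steps: int,
                              num_cycles: float = 0.5) -> float:
    """Learning-rate multiplier at `step`: linear warmup from 0 to 1,
    then cosine decay over the remaining steps."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))

# Factor at a few steps for 10 warmup steps out of 110 training steps
factors = [cosine_with_warmup_factor(s, 10, 110) for s in (0, 5, 10, 60, 110)]
```

With the default `num_cycles` of 0.5, the factor follows half a cosine period and reaches zero at the last training step.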
- XPM Config xpmir.learning.schedulers.LinearWithWarmup(*, num_warmup_steps, min_factor)
Bases: Scheduler
Submit type: xpmir.learning.schedulers.LinearWithWarmup
Linear warmup followed by decay
- num_warmup_steps: int
Number of warmup steps
- min_factor: float = 0.0
Minimum multiplicative factor
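A linear warmup followed by a linear decay bounded below by min_factor can be sketched as follows. The exact shape of the decay in xpmir is an assumption here (a straight line reaching zero at the last step, clipped at min_factor), as are the `num_training_steps` argument and the name `linear_with_warmup_factor`.

```python
def linear_with_warmup_factor(step: int, num_warmup_steps: int,
                              num_training_steps: int,
                              min_factor: float = 0.0) -> float:
    """Learning-rate multiplier at `step`: linear warmup from 0 to 1,
    then linear decay that never goes below `min_factor` (assumed shape)."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    remaining = (num_training_steps - step) / max(1, num_training_steps - num_warmup_steps)
    return max(min_factor, remaining)

# Factor at a few steps for 10 warmup steps out of 110 training steps
linear_factors = [linear_with_warmup_factor(s, 10, 110, min_factor=0.1)
                  for s in (5, 10, 60, 110)]
```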
Base classes
- XPM Config xpmir.learning.base.Random(*, seed)
Bases: Config
Submit type: xpmir.learning.base.Random
Random configuration
- seed: int = 0
The seed to use so the random process is deterministic
- XPM Config xpmir.learning.base.Sampler
Bases: Config, EasyLogger
Submit type: xpmir.learning.base.Sampler
Abstract data sampler
- XPM Config xpmir.learning.base.BaseSampler
Bases: Sampler, SampleIterator[T], ABC
Submit type: xpmir.learning.base.BaseSampler
A serializable sampler iterator
- XPM Config xpmir.learning.trainers.Trainer(*, hooks, model)
Bases: Config, EasyLogger
Submit type: xpmir.learning.trainers.Trainer
Generic trainer
- hooks: List[xpmir.learning.context.TrainingHook] = []
Hooks for this trainer: this includes the losses, but can be adapted for other uses. The specific list of hooks depends on the specific trainer.
- model: xpmir.learning.optim.Module
If the model to optimize is different from the model passed to Learn, this parameter can be used. Initialization is still expected to be done at the learner level.
- XPM Config xpmir.learning.base.SampleIterator
Bases: Config, Iterable[T], ABC
Submit type: xpmir.learning.base.SampleIterator
Generic class to iterate over items or batches of items