Optimization
Modules
- XPM Config xpmir.learning.optim.Module
Bases: Config, Initializable, TorchModule
Submit type: xpmir.learning.optim.Module
A module contains parameters
- XPM Config xpmir.learning.optim.ModuleList(*, sub_modules)
Bases: Module, Initializable
Submit type: xpmir.learning.optim.ModuleList
Groups different models together, to be used within the Learner
- sub_modules: List[xpmir.learning.optim.Module]
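A minimal sketch of grouping modules, following the documented signature; encoder and scorer are placeholders for Module configurations defined elsewhere in an experiment.

```python
from xpmir.learning.optim import ModuleList

# `encoder` and `scorer` stand for Module configurations defined elsewhere
modules = ModuleList(sub_modules=[encoder, scorer])
```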
The module loader can be used to load a checkpoint
- XPM Config xpmir.learning.optim.ModuleLoader(*, value, path)
Bases: PathSerializationLWTask
Submit type: xpmir.learning.optim.ModuleLoader
- value: experimaestro.core.objects.Config
The configuration that will be serialized
- path: Path
Path containing the data
Optimizers
- XPM Config xpmir.learning.optim.Optimizer
Bases: Config
Submit type: xpmir.learning.optim.Optimizer
- XPM Config xpmir.learning.optim.SGD(*, lr, weight_decay)
Bases: Optimizer
Submit type: xpmir.learning.optim.SGD
Wrapper for the SGD optimizer in PyTorch
- lr: float = 1e-05
Learning rate
- weight_decay: float = 0.0
Weight decay (L2)
- XPM Config xpmir.learning.optim.Adam(*, lr, weight_decay, eps)
Bases: Optimizer
Submit type: xpmir.learning.optim.Adam
Wrapper for the Adam optimizer in PyTorch
- lr: float = 0.001
Learning rate
- weight_decay: float = 0.0
Weight decay (L2)
- eps: float = 1e-08
- XPM Config xpmir.learning.optim.AdamW(*, lr, weight_decay, eps)
Bases: Optimizer
Submit type: xpmir.learning.optim.AdamW
Adam optimizer with decoupled weight decay regularization (AdamW)
See the PyTorch documentation
- lr: float = 0.001
- weight_decay: float = 0.01
- eps: float = 1e-08
- XPM Config xpmir.learning.optim.Adafactor(*, lr, weight_decay, relative_step)
Bases: Optimizer
Submit type: xpmir.learning.optim.Adafactor
Wrapper for the Adafactor optimizer in the Transformers library
See transformers.optimization.Adafactor for full documentation
- lr: float
Learning rate
- weight_decay: float = 0.0
Weight decay (L2)
- relative_step: bool = True
If true, time-dependent learning rate is computed instead of external learning rate
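As an illustration, the optimizer configurations above can be instantiated with the documented keyword parameters; values that differ from the documented defaults are arbitrary choices, not recommendations.

```python
from xpmir.learning.optim import SGD, Adam, AdamW

sgd = SGD(lr=1e-5, weight_decay=0.0)              # documented defaults
adam = Adam(lr=1e-3, weight_decay=0.0, eps=1e-8)  # documented defaults
adamw = AdamW(lr=3e-5, weight_decay=1e-2)         # lr chosen arbitrarily
```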
- XPM Config xpmir.learning.optim.ParameterOptimizer(*, optimizer, scheduler, module, filter)
Bases: Config
Submit type: xpmir.learning.optim.ParameterOptimizer
Associates an optimizer with a list of parameters to optimize
- optimizer: xpmir.learning.optim.Optimizer
The optimizer
- scheduler: xpmir.learning.schedulers.Scheduler
The optional scheduler
- module: xpmir.learning.optim.Module
The module from which parameters should be extracted
- filter: xpmir.learning.optim.ParameterFilter = ParameterFilter()
How parameters should be selected for this optimizer (by default, use them all)
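A minimal sketch of wiring an optimizer, a scheduler and a module together; model stands for a Module configuration defined elsewhere, and the hyper-parameter values are illustrative.

```python
from xpmir.learning.optim import AdamW, ParameterOptimizer
from xpmir.learning.schedulers import LinearWithWarmup

param_optimizer = ParameterOptimizer(
    optimizer=AdamW(lr=3e-5),
    scheduler=LinearWithWarmup(num_warmup_steps=1000),
    module=model,  # a Module configuration defined elsewhere
)
```

Leaving filter unset keeps the default ParameterFilter, i.e. all parameters of the module are optimized.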
- XPM Config xpmir.learning.optim.ParameterFilter
Bases: Config
Submit type: xpmir.learning.optim.ParameterFilter
Base class for parameter filters; performs no filtering (all parameters are selected)
- XPM Config xpmir.learning.optim.RegexParameterFilter(*, includes, excludes)
Bases: ParameterFilter
Submit type: xpmir.learning.optim.RegexParameterFilter
Filters parameters by name using regular expressions. Precondition: exactly one of includes and excludes should be None
- includes: List[str]
The regular expressions for parameter names to include
- excludes: List[str]
The regular expressions for parameter names to exclude
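A sketch of a regex-based filter; the patterns are illustrative, and following the precondition above only includes is given.

```python
from xpmir.learning.optim import RegexParameterFilter

# Only select parameters whose name matches one of these (illustrative) patterns;
# excludes is left unset, per the precondition above
bias_and_norm = RegexParameterFilter(includes=[r"\.bias$", r"LayerNorm"])
```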
- XPM Config xpmir.learning.optim.OptimizationHook
Bases: Hook
Submit type: xpmir.learning.optim.OptimizationHook
Base class for all optimization hooks
Hooks
- XPM Config xpmir.learning.optim.GradientHook
Bases: OptimizationHook
Submit type: xpmir.learning.optim.GradientHook
Hooks that are called when the gradient is computed
The gradient is guaranteed to be unscaled in this case.
- XPM Config xpmir.learning.optim.GradientClippingHook(*, max_norm)
Bases: GradientHook
Submit type: xpmir.learning.optim.GradientClippingHook
Gradient clipping
- max_norm: float
Maximum norm for gradient clipping
- XPM Config xpmir.learning.optim.GradientLogHook(*, name)
Bases: GradientHook
Submit type: xpmir.learning.optim.GradientLogHook
Log the gradient norm
- name: str = gradient_norm
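A short sketch instantiating the gradient hooks above; the clipping norm is an illustrative value, and registering these among the learner's optimization hooks is assumed rather than shown.

```python
from xpmir.learning.optim import GradientClippingHook, GradientLogHook

hooks = [
    GradientClippingHook(max_norm=1.0),  # clip gradients to a maximum norm of 1.0
    GradientLogHook(),                   # logs under the default name "gradient_norm"
]
```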
Parameters
During learning, some parameter-specific treatments can be applied (e.g. freezing).
Selecting
The classes below allow selecting a subset of parameters.
- XPM Config xpmir.learning.parameters.InverseParametersIterator(*, iterator)
Bases: ParametersIterator
Submit type: xpmir.learning.parameters.InverseParametersIterator
Inverts the selection of a parameter iterator
- XPM Config xpmir.learning.parameters.ParametersIterator
Bases: Config, ABC
Submit type: xpmir.learning.parameters.ParametersIterator
Iterator over module parameters
This can be useful to freeze some layers, or perform any other parameter-wise operation
- XPM Config xpmir.learning.parameters.SubParametersIterator(*, model, iterator, default)
Bases: ParametersIterator
Submit type: xpmir.learning.parameters.SubParametersIterator
Wraps a parameter iterator over a global model and a selector over a subpart of the model
- model: xpmir.learning.optim.Module
The model from which the parameters should be gathered
- iterator: xpmir.learning.parameters.ParametersIterator
The sub-model iterator
- default: bool
Default value for parameters not within the sub-model
- XPM Config xpmir.learning.parameters.RegexParametersIterator(*, negative_regex, regex, model)
Bases: ParametersIterator
Submit type: xpmir.learning.parameters.RegexParametersIterator
Iterator over all the parameters that match the given regex
- negative_regex: str
The negative regex expression (should not match)
- regex: str
The regex expression
- model: xpmir.learning.optim.Module
The model we want to select the parameters from
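A sketch of selecting parameters by regex and inverting the selection; model is a placeholder for a Module configuration defined elsewhere, the regex is illustrative, and negative_regex is assumed to be optional.

```python
from xpmir.learning.parameters import (
    InverseParametersIterator,
    RegexParametersIterator,
)

# Parameters whose name matches "embeddings" ...
embeddings = RegexParametersIterator(regex=r"embeddings", model=model)
# ... and, by inversion, all the other parameters
non_embeddings = InverseParametersIterator(iterator=embeddings)
```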
Freezing
- XPM Config xpmir.learning.hooks.LayerFreezer(*, selector)
Bases: InitializationTrainingHook
Submit type: xpmir.learning.hooks.LayerFreezer
This training hook class can be used to freeze a subset of model parameters
- selector: xpmir.learning.parameters.ParametersIterator
How to select the layers to freeze
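Combining the selectors above with LayerFreezer, a hedged sketch for freezing part of a model; model and the regex are illustrative.

```python
from xpmir.learning.hooks import LayerFreezer
from xpmir.learning.parameters import RegexParametersIterator

# Freeze every parameter whose name matches the (illustrative) regex
freezer = LayerFreezer(
    selector=RegexParametersIterator(regex=r"embeddings", model=model)
)
```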
Loading
- XPM Config xpmir.learning.parameters.NameMapper
Bases: Config, ABC
Submit type: xpmir.learning.parameters.NameMapper
Changes the names of parameters
- XPM Config xpmir.learning.parameters.PrefixRenamer(*, model, data)
Bases: NameMapper
Submit type: xpmir.learning.parameters.PrefixRenamer
Changes the names of parameters by mapping prefixes
- model: str
Prefix in model
- data: str
Prefix in data
- XPM Config xpmir.learning.parameters.PartialModuleLoader(*, value, path, selector, mapper)
Bases: PathSerializationLWTask
Submit type: xpmir.learning.parameters.PartialModuleLoader
Allows loading only a part of the parameters
- value: experimaestro.core.objects.Config
The configuration that will be serialized
- path: Path
Path containing the data
- selector: xpmir.learning.parameters.ParametersIterator
The selector gives the list of parameters for which loaded parameters should be used
- mapper: xpmir.learning.parameters.NameMapper
Maps parameter names so they match the saved ones
- XPM Config xpmir.learning.parameters.SubModuleLoader(*, value, path, selector, saved_value)
Bases: PathSerializationLWTask
Submit type: xpmir.learning.parameters.SubModuleLoader
Allows loading only a part of the parameters (with automatic renaming)
- value: experimaestro.core.objects.Config
The configuration that will be serialized
- path: Path
Path containing the data
- selector: xpmir.learning.parameters.ParametersIterator
The selector gives the list of parameters for which loaded parameters should be used
- saved_value: xpmir.learning.optim.Module
The original module that is being loaded (optional, allows to map names)
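A hedged sketch of a partial load: model, checkpoint_path, the regex and the prefixes are placeholders for objects produced elsewhere in an experiment, not values prescribed by the library.

```python
from xpmir.learning.parameters import (
    PartialModuleLoader,
    PrefixRenamer,
    RegexParametersIterator,
)

# Load only the "encoder.*" parameters of `model` from `checkpoint_path`;
# the mapper declares that the "encoder." prefix used in the model corresponds
# to the "bert." prefix in the saved data (all names here are hypothetical)
loader = PartialModuleLoader(
    value=model,
    path=checkpoint_path,
    selector=RegexParametersIterator(regex=r"^encoder\.", model=model),
    mapper=PrefixRenamer(model="encoder.", data="bert."),
)
```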
Batching
- XPM Config xpmir.learning.batchers.Batcher
Bases: Config
Submit type: xpmir.learning.batchers.Batcher
Responsible for micro-batching when the batch does not fit in memory
The base class just does nothing (no adaptation)
- XPM Config xpmir.learning.batchers.PowerAdaptativeBatcher
Bases: Batcher
Submit type: xpmir.learning.batchers.PowerAdaptativeBatcher
Starts with the provided batch size, then divides it by 2, 3, etc. until no out-of-memory error occurs
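PowerAdaptativeBatcher takes no parameters; a minimal sketch follows (passing the batcher on to a trainer or learner configuration is assumed rather than shown).

```python
from xpmir.learning.batchers import PowerAdaptativeBatcher

# Recovers from GPU out-of-memory errors by splitting the batch
# into progressively smaller micro-batches
batcher = PowerAdaptativeBatcher()
```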
Devices
The device configuration allows selecting both the device to use for computation and the way to use it (e.g. multi-GPU settings).
- XPM Config xpmir.learning.devices.Device
Bases: Config
Submit type: xpmir.learning.devices.Device
Device to use, as well as specific option (e.g. parallelism)
- XPM Config xpmir.learning.devices.CudaDevice(*, gpu_determ, cpu_fallback, distributed)
Bases: Device
Submit type: xpmir.learning.devices.CudaDevice
CUDA device
- gpu_determ: bool = False
Sets deterministic mode for GPU computation
- cpu_fallback: bool = False
Fallback to CPU if no GPU is available
- distributed: bool = False
Flag for using DistributedDataParallel: when True and the number of GPUs is greater than 1, use torch.nn.parallel.DistributedDataParallel; when False, use torch.nn.DataParallel
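A sketch of a CUDA device configuration using the documented flags:

```python
from xpmir.learning.devices import CudaDevice

# Fall back to CPU when no GPU is available; with more than one GPU,
# use DistributedDataParallel (distributed=True)
device = CudaDevice(cpu_fallback=True, distributed=True)
```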
Schedulers
- XPM Config xpmir.learning.schedulers.Scheduler
Bases: Config
Submit type: xpmir.learning.schedulers.Scheduler
Base class for all optimizer schedulers
- XPM Config xpmir.learning.schedulers.CosineWithWarmup(*, num_warmup_steps, num_cycles)
Bases: Scheduler
Submit type: xpmir.learning.schedulers.CosineWithWarmup
Cosine schedule with warmup
Uses the implementation from the Transformers library
- num_warmup_steps: int
Number of warmup steps
- num_cycles: float = 0.5
Number of cycles
- XPM Config xpmir.learning.schedulers.LinearWithWarmup(*, num_warmup_steps, min_factor)
Bases: Scheduler
Submit type: xpmir.learning.schedulers.LinearWithWarmup
Linear warmup followed by decay
- num_warmup_steps: int
Number of warmup steps
- min_factor: float = 0.0
Minimum multiplicative factor
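A sketch of the two schedulers above; the warmup length and minimum factor are illustrative values.

```python
from xpmir.learning.schedulers import CosineWithWarmup, LinearWithWarmup

cosine = CosineWithWarmup(num_warmup_steps=1000)                  # 0.5 cycles by default
linear = LinearWithWarmup(num_warmup_steps=1000, min_factor=0.1)  # factor never drops below 0.1
```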
Base classes
- XPM Config xpmir.learning.base.Random(*, seed)
Bases: Config
Submit type: xpmir.learning.base.Random
Random configuration
- seed: int = 0
The seed to use so the random process is deterministic
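A minimal example; fixing the seed keeps the experiment's random processes deterministic.

```python
from xpmir.learning.base import Random

random = Random(seed=0)  # the documented default seed
```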
- XPM Config xpmir.learning.base.Sampler
Bases: Config, EasyLogger
Submit type: xpmir.learning.base.Sampler
Abstract data sampler
- XPM Config xpmir.learning.base.BaseSampler
Bases: Sampler, SampleIterator[T], ABC
Submit type: xpmir.learning.base.BaseSampler
A serializable sampler iterator
- XPM Config xpmir.learning.trainers.Trainer(*, hooks, model)
Bases: Config, EasyLogger
Submit type: xpmir.learning.trainers.Trainer
Generic trainer
- hooks: List[xpmir.learning.context.TrainingHook] = []
Hooks for this trainer: this includes the losses, but can be adapted for other uses. The specific list of hooks depends on the specific trainer
- model: xpmir.learning.optim.Module
If the model to optimize is different from the model passed to the Learner, this parameter can be used; initialization is still expected to be done at the learner level