scitex_ml.optim.Ranger_Deep_Learning_Optimizer.ranger.rangerqh
Classes
RangerQH | Implements the QHAdam optimization algorithm (Ma and Yarats, 2019), combined with Hinton/Zhang Lookahead.
- class scitex_ml.optim.Ranger_Deep_Learning_Optimizer.ranger.rangerqh.RangerQH(*args: Any, **kwargs: Any)
Implements the QHAdam optimization algorithm (Ma and Yarats, 2019), combined with Hinton/Zhang Lookahead.
- Parameters:
params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – learning rate (\(\alpha\) from the paper) (default: 1e-3)
betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
nus (Tuple[float, float], optional) – immediate discount factors used to estimate the gradient and its square (default: (1.0, 1.0))
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) – weight decay (default: 0.0)
decouple_weight_decay (bool, optional) – whether to decouple the weight decay from the gradient-based optimization step (default: False)
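To make the roles of betas, nus, eps and the two weight decay modes concrete, the following is a minimal scalar sketch of one QHAdam update as described by Ma and Yarats (2019). It is not the code of RangerQH itself: the function name `qhadam_step`, the `state` dictionary layout, and the use of plain Python floats instead of tensors are assumptions made for illustration, and the Lookahead part is left out.

```python
import math


def qhadam_step(theta, grad, state, lr=1e-3, betas=(0.9, 0.999),
                nus=(1.0, 1.0), eps=1e-8, weight_decay=0.0,
                decouple_weight_decay=False):
    """One QHAdam update for a single scalar parameter (illustrative only)."""
    beta1, beta2 = betas
    nu1, nu2 = nus

    if weight_decay != 0.0:
        if decouple_weight_decay:
            # Decoupled: shrink the parameter directly, outside the gradient step.
            theta = theta * (1.0 - lr * weight_decay)
        else:
            # Coupled: fold the decay term into the gradient.
            grad = grad + weight_decay * theta

    state["t"] += 1
    t = state["t"]

    # Running averages of the gradient and of its square.
    state["g"] = beta1 * state["g"] + (1.0 - beta1) * grad
    state["s"] = beta2 * state["s"] + (1.0 - beta2) * grad * grad

    # Bias-corrected estimates.
    g_hat = state["g"] / (1.0 - beta1 ** t)
    s_hat = state["s"] / (1.0 - beta2 ** t)

    # nus interpolate between the immediate gradient and the running averages.
    numer = (1.0 - nu1) * grad + nu1 * g_hat
    denom = math.sqrt((1.0 - nu2) * grad * grad + nu2 * s_hat) + eps
    return theta - lr * numer / denom


state = {"t": 0, "g": 0.0, "s": 0.0}
new_theta = qhadam_step(1.0, 2.0, state)
```

With the default nus=(1.0, 1.0) the numerator and denominator reduce to the bias-corrected averages, so the step matches Adam; setting nus=(0.0, 0.0) uses only the immediate gradient and its square.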
Example
>>> optimizer = qhoptim.pyt.QHAdam(
...     model.parameters(),
...     lr=3e-4, nus=(0.8, 1.0), betas=(0.99, 0.999))
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
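
The class description mentions Lookahead, which this page does not otherwise explain. As a rough sketch of the idea, Lookahead keeps a slow copy of the weights and, every k inner optimizer steps, moves it a fraction alpha toward the fast weights and then resets the fast weights to it. The helper name `lookahead_sync`, the list-of-floats representation, and the values k=6 and alpha=0.5 are assumptions for illustration; this page does not state which names or defaults RangerQH uses.

```python
def lookahead_sync(fast, slow, step, k=6, alpha=0.5):
    """Return (fast, slow) after the Lookahead synchronization check at `step`.

    fast: weights updated by the inner optimizer on every step.
    slow: weights updated only once every k steps.
    """
    if step % k != 0:
        # Not a synchronization step: leave both sets of weights untouched.
        return fast, slow
    # Move the slow weights a fraction alpha toward the fast weights.
    new_slow = [s + alpha * (f - s) for f, s in zip(fast, slow)]
    # Restart the fast weights from the updated slow weights.
    return list(new_slow), new_slow


fast, slow = lookahead_sync([1.0, -2.0], [0.0, 0.0], step=6)
```

In a training loop this check would run after each inner optimizer step, with `step` counting those steps from 1.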