Commit e2aef3a7 authored by Sean Bell, committed by Facebook GitHub Bot

Add exp_decay learning rate policy

Summary:
This adds an exponential learning rate schedule, parameterized by `GAMMA`, which specifies the ratio between the final and initial learning rates.  This is a more natural parameterization than `A ^ t` or `exp(A * t)`.

Since most jobs drop the learning rate by ~3 orders of magnitude over the course of training, `GAMMA = 1e-3` should work for many scenarios.
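
For reference, the formula is equivalent to lr = SOLVER.BASE_LR * SOLVER.GAMMA ** (cur_iter / SOLVER.MAX_ITER), so the rate starts at BASE_LR and decays smoothly to GAMMA * BASE_LR at the final iteration. A minimal standalone sketch of the schedule (the BASE_LR and MAX_ITER values here are illustrative assumptions, not taken from this diff):

    import numpy as np

    # Illustrative values only; in Detectron these come from the SOLVER config
    BASE_LR = 0.1
    GAMMA = 1e-3      # ratio between final and initial learning rate
    MAX_ITER = 90000

    def exp_decay_lr(cur_iter):
        # same math as the new lr_func_exp_decay below
        iter_frac = float(cur_iter) / MAX_ITER
        return BASE_LR * np.exp(iter_frac * np.log(GAMMA))

    print(exp_decay_lr(0))              # 0.1      (BASE_LR)
    print(exp_decay_lr(MAX_ITER // 2))  # ~0.00316 (BASE_LR * sqrt(GAMMA))
    print(exp_decay_lr(MAX_ITER))       # 0.0001   (BASE_LR * GAMMA)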

Reviewed By: ashwinb, rbgirshick

Differential Revision: D14654419

fbshipit-source-id: 8dabbc8df28d32e4fd748b61bc32efaf3935d244
parent 3dc87ab5
@@ -583,9 +583,13 @@ __C.SOLVER.LR_POLICY = 'step'
# lr = LRS[current_step]
# 'cosine_decay'
# lr = SOLVER.BASE_LR * (cos(PI * cur_iter / SOLVER.MAX_ITER) * 0.5 + 0.5)
# 'exp_decay'
# lr smoothly decays from SOLVER.BASE_LR to SOLVER.GAMMA * SOLVER.BASE_LR
# lr = SOLVER.BASE_LR * exp(np.log(SOLVER.GAMMA) * cur_iter / SOLVER.MAX_ITER)
# Hyperparameter used by the specified policy
# For 'step', the current LR is multiplied by SOLVER.GAMMA at each step
# For 'exp_decay', SOLVER.GAMMA is the ratio between the final and initial LR.
__C.SOLVER.GAMMA = 0.1
# Uniform step size for 'steps' policy
@@ -99,6 +99,15 @@ def lr_func_cosine_decay(cur_iter):
    return cfg.SOLVER.BASE_LR * cos_frac


def lr_func_exp_decay(cur_iter):
    """For cfg.SOLVER.LR_POLICY = 'exp_decay'
    """
    # GAMMA is final/initial learning rate ratio
    iter_frac = float(cur_iter) / cfg.SOLVER.MAX_ITER
    exp_frac = np.exp(iter_frac * np.log(cfg.SOLVER.GAMMA))
    return cfg.SOLVER.BASE_LR * exp_frac
# ---------------------------------------------------------------------------- #
# Helpers
# ---------------------------------------------------------------------------- #
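
A hypothetical usage sketch (not part of this diff) showing how the new policy might be enabled by mutating the global cfg; the import path and the specific values are assumptions:

    from detectron.core.config import cfg  # import path is an assumption

    cfg.SOLVER.LR_POLICY = 'exp_decay'
    cfg.SOLVER.BASE_LR = 0.01
    cfg.SOLVER.MAX_ITER = 90000
    cfg.SOLVER.GAMMA = 1e-3  # final LR will be BASE_LR * GAMMA = 1e-5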