Commit e2aef3a7 authored by Sean Bell, committed by Facebook GitHub Bot

Add exp_decay learning rate policy

Summary:
This adds an exponential learning rate schedule, parameterized by `GAMMA`, which specifies the ratio between the final and initial learning rates.  This is a more natural parameterization than `A ^ t` or `exp(A * t)`.

Since most jobs drop the learning rate by ~3 orders of magnitude over the course of training, `GAMMA = 1e-3` should work for many scenarios.
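
For reference, the formula is equivalent to lr = SOLVER.BASE_LR * SOLVER.GAMMA ** (cur_iter / SOLVER.MAX_ITER), so the rate starts at BASE_LR and decays smoothly to GAMMA * BASE_LR at the final iteration. A minimal standalone sketch of the schedule (the BASE_LR and MAX_ITER values here are illustrative assumptions, not taken from this diff):

    import numpy as np

    # Illustrative values only; in Detectron these come from the SOLVER config
    BASE_LR = 0.1
    GAMMA = 1e-3      # ratio between final and initial learning rate
    MAX_ITER = 90000

    def exp_decay_lr(cur_iter):
        # same math as the new lr_func_exp_decay below
        iter_frac = float(cur_iter) / MAX_ITER
        return BASE_LR * np.exp(iter_frac * np.log(GAMMA))

    print(exp_decay_lr(0))              # 0.1      (BASE_LR)
    print(exp_decay_lr(MAX_ITER // 2))  # ~0.00316 (BASE_LR * sqrt(GAMMA))
    print(exp_decay_lr(MAX_ITER))       # 0.0001   (BASE_LR * GAMMA)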

Reviewed By: ashwinb, rbgirshick

Differential Revision: D14654419

fbshipit-source-id: 8dabbc8df28d32e4fd748b61bc32efaf3935d244
parent 3dc87ab5
@@ -583,9 +583,13 @@ __C.SOLVER.LR_POLICY = 'step'
# lr = LRS[current_step]
# 'cosine_decay'
# lr = SOLVER.BASE_LR * (cos(PI * cur_iter / SOLVER.MAX_ITER) * 0.5 + 0.5)
# 'exp_decay'
# lr smoothly decays from SOLVER.BASE_LR to SOLVER.GAMMA * SOLVER.BASE_LR
# lr = SOLVER.BASE_LR * exp(np.log(SOLVER.GAMMA) * cur_iter / SOLVER.MAX_ITER)
# Hyperparameter used by the specified policy
# For 'step', the current LR is multiplied by SOLVER.GAMMA at each step
# For 'exp_decay', SOLVER.GAMMA is the ratio between the final and initial LR.
__C.SOLVER.GAMMA = 0.1
# Uniform step size for 'steps' policy
@@ -99,6 +99,15 @@ def lr_func_cosine_decay(cur_iter):
    return cfg.SOLVER.BASE_LR * cos_frac


def lr_func_exp_decay(cur_iter):
    """For cfg.SOLVER.LR_POLICY = 'exp_decay'
    """
    # GAMMA is final/initial learning rate ratio
    iter_frac = float(cur_iter) / cfg.SOLVER.MAX_ITER
    exp_frac = np.exp(iter_frac * np.log(cfg.SOLVER.GAMMA))
    return cfg.SOLVER.BASE_LR * exp_frac
# ---------------------------------------------------------------------------- #
# Helpers
# ---------------------------------------------------------------------------- #
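
A hypothetical usage sketch (not part of this diff) showing how the new policy might be enabled by mutating the global cfg; the import path and the specific values are assumptions:

    from detectron.core.config import cfg  # import path is an assumption

    cfg.SOLVER.LR_POLICY = 'exp_decay'
    cfg.SOLVER.BASE_LR = 0.01
    cfg.SOLVER.MAX_ITER = 90000
    cfg.SOLVER.GAMMA = 1e-3  # final LR will be BASE_LR * GAMMA = 1e-5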