Davis King authored
when it determines that there have been a lot of steps without progress and shrinks the learning rate. Instead, it removes only the oldest 100. The problem with the old way of removing all the loss values in the history was that, if you set the steps-without-progress threshold to a really high number, you would often observe that the last few learning rate values were obviously not making progress. However, since all the previous loss values had been forgotten, the trainer needed to fully repopulate its loss history from scratch before it could figure this out. This new style keeps the trainer from wasting time on excessive optimization over obviously useless mini-batches.
dd62b0e2
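
The difference between the two trimming policies in the commit message can be illustrated with a minimal sketch. This is not dlib's code: the function names, the `deque`-based history, and the threshold of 2000 are all assumptions made for illustration, and only the figure of 100 comes from the commit message.

```python
from collections import deque


def clear_history(losses):
    """Old policy from the commit message: forget every recorded loss."""
    losses.clear()


def drop_oldest(losses, count=100):
    """New policy from the commit message: forget only the `count` oldest losses."""
    for _ in range(min(count, len(losses))):
        losses.popleft()


# Hypothetical steps-without-progress threshold (not a dlib default).
threshold = 2000

# Two identical loss histories, each already holding `threshold` entries.
old_style = deque(range(threshold))
new_style = deque(range(threshold))

# Apply each policy at the moment the learning rate is shrunk.
clear_history(old_style)
drop_oldest(new_style)

# Number of new mini-batch losses needed before the history is full again.
refill_old = threshold - len(old_style)
refill_new = threshold - len(new_style)
print(refill_old, refill_new)
```

Under these assumptions, the old policy has to collect a full threshold's worth of new losses after every learning rate change, while the new policy needs only 100 more, which is the saving the commit message describes.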
Name | Last commit | Last update |
---|---|---|
dlib | ||
docs | ||
examples | ||
python_examples | ||
tools | ||
.gitignore | ||
.hgignore | ||
.hgtags | ||
.travis.yml | ||
CMakeLists.txt | ||
MANIFEST.in | ||
README.md | ||
appveyor.yml | ||
setup.py | ||