Commit 5f5c46f4 authored by Davis King

Made loss layers output the gradients by assigning them to the output rather
than adding them.  This way, the gradient buffer can be used as scratch space
during the loss computation.
parent e2a67dec
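
The scratch-space motivation in the message can be made concrete with a minimal,
self-contained sketch (a hypothetical function, not dlib's actual API): a softmax
log-loss can stage the exponentiated scores in the gradient buffer g and then
overwrite them in place with the final gradient.  Under the old += contract this
staging would have corrupted whatever the caller had accumulated in g.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch only: g doubles as scratch space because the loss function now owns
// its contents and assigns the final gradient instead of accumulating into it.
double softmax_log_loss(
    const std::vector<double>& scores,  // network outputs for one sample
    std::size_t truth,                  // index of the correct class
    std::vector<double>& g              // gradient buffer, same size as scores
)
{
    const double m = *std::max_element(scores.begin(), scores.end());
    double sum = 0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        g[i] = std::exp(scores[i] - m);  // g used as scratch for exp(scores)
        sum += g[i];
    }
    for (std::size_t i = 0; i < scores.size(); ++i)
        g[i] = g[i]/sum - (i == truth ? 1.0 : 0.0);  // overwrite with gradient
    return std::log(sum) - (scores[truth] - m);      // -log softmax(truth)
}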
@@ -77,7 +77,7 @@ namespace dlib
                 if (temp > 0)
                 {
                     loss += scale*temp;
-                    g[i] += -scale*y;
+                    g[i] = -scale*y;
                 }
             }
             return loss;
...
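
The hinge-loss hunk above only shows the changed line.  One consequence of the
new assignment semantics, shown in this self-contained sketch (illustrative
names, not dlib's signatures; the else branch is an assumption that follows from
the contract), is that the loss layer must now write every element of g itself
rather than relying on a pre-zeroed buffer:

#include <cstddef>
#include <vector>

double hinge_loss_and_gradient(
    const std::vector<double>& scores,  // network outputs, one per sample
    const std::vector<double>& labels,  // each label is +1 or -1
    std::vector<double>& g              // gradient buffer, same size as scores
)
{
    const double scale = 1.0/scores.size();
    double loss = 0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        const double y = labels[i];
        const double temp = 1 - y*scores[i];
        if (temp > 0)
        {
            loss += scale*temp;
            g[i] = -scale*y;  // assignment, as in the diff above
        }
        else
        {
            g[i] = 0;         // assumed: with assignment the layer must also
                              // clear the elements it previously left alone
        }
    }
    return loss;
}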
@@ -110,9 +110,9 @@ namespace dlib
                   of sub matches the expected labels given by truth.  Let's write the loss
                   function as L(input_tensor, truth, sub).
                 - Then compute_loss() computes the gradient of L() with respect to the
-                  outputs in sub.  Specifically, compute_loss() adds the gradients into sub
-                  by performing the following tensor additions, for all valid i:
-                    - layer<i>(sub).get_gradient_input() += the gradient of
+                  outputs in sub.  Specifically, compute_loss() assigns the gradients into
+                  sub by performing the following tensor assignments, for all valid i:
+                    - layer<i>(sub).get_gradient_input() = the gradient of
                       L(input_tensor,truth,sub) with respect to layer<i>(sub).get_output().
                 - returns L(input_tensor,truth,sub)
         !*/
...