Made loss layers output the gradients by assigning them to the output rather
than adding them. This way, the gradient buffer can be used as scratch space during the loss computation.
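
A minimal sketch of the difference, assuming a mean-squared-error loss. The function name `mse_loss_assign` and the plain-list buffers are hypothetical and for illustration only; they are not this project's API. Because the loss overwrites the buffer instead of accumulating into it, any stale contents are irrelevant, and the buffer can hold intermediate residuals before it holds the final gradient.

```python
def mse_loss_assign(outputs, targets, grad):
    """Return the mean squared error and overwrite `grad` with d(loss)/d(outputs)."""
    n = len(outputs)
    # Use the gradient buffer as scratch space: store the residuals in it.
    for i in range(n):
        grad[i] = outputs[i] - targets[i]
    # Compute the loss from the residuals currently held in the buffer.
    loss = sum(r * r for r in grad) / n
    # Convert the residuals into the final gradient in place.
    for i in range(n):
        grad[i] *= 2.0 / n
    return loss


# The buffer starts with stale values from an earlier pass; they are simply overwritten.
grad = [99.0, -7.0]
loss = mse_loss_assign([1.0, 3.0], [0.0, 1.0], grad)
```

With accumulate semantics (`grad[i] += ...`), the caller would have to zero the buffer first and the loss would need separate storage for the residuals.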