Commit 27a52936 authored by Matthias Stauber, committed by Davis E. King

Added weighted labels to loss_binary_log layer. (#1538)

* Added weighted labels to loss_binary_log

* Added weighted labels to loss_binary_log

* Clarified docs.

* clarified docs
parent 6747122c
...
@@ -195,19 +195,19 @@ namespace dlib
             for (long i = 0; i < output_tensor.num_samples(); ++i)
             {
                 const float y = *truth++;
-                DLIB_CASSERT(y == +1 || y == -1, "y: " << y);
+                DLIB_CASSERT(y != 0, "y: " << y);
                 float temp;
                 if (y > 0)
                 {
                     temp = log1pexp(-out_data[i]);
-                    loss += scale*temp;
-                    g[i] = scale*(g[i]-1);
+                    loss += y*scale*temp;
+                    g[i] = y*scale*(g[i]-1);
                 }
                 else
                 {
                     temp = -(-out_data[i]-log1pexp(-out_data[i]));
-                    loss += scale*temp;
-                    g[i] = scale*g[i];
+                    loss += -y*scale*temp;
+                    g[i] = -y*scale*g[i];
                 }
             }
             return loss;
...
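Taken together, the change turns the label into a per-sample weight: the sign of y still selects the class, while its magnitude |y| now scales both the loss term and the gradient g[i]. Below is a minimal standalone sketch of the per-sample loss math; the helper names are hypothetical and this is not dlib's internal code (dlib's log1pexp() is the numerically stable form of log(1+exp(x)) used in the diff).

#include <cassert>
#include <cmath>

// Numerically safe log(1 + exp(x)); the naive std::log1p(std::exp(x))
// overflows for large x, which is the case dlib's log1pexp() handles.
double log1pexp(double x)
{
    return x > 30 ? x : std::log1p(std::exp(x));
}

// Per-sample weighted binary log loss: loss_i = |y| * -log(p_correct),
// where out is the raw network output fed through an implicit sigmoid.
double weighted_log_loss(double y, double out)
{
    assert(y != 0);  // any nonzero label is valid; |y| is the sample weight
    if (y > 0)
        return  y * log1pexp(-out);          // |y| * -log(sigmoid(out))
    else
        return -y * (out + log1pexp(-out));  // |y| * -log(1 - sigmoid(out))
}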
...
@@ -232,10 +232,16 @@ namespace dlib
         WHAT THIS OBJECT REPRESENTS
             This object implements the loss layer interface defined above by
             EXAMPLE_LOSS_LAYER_. In particular, it implements the log loss, which is
-            appropriate for binary classification problems. Therefore, the possible
-            labels when using this loss are +1 and -1. Moreover, it will cause the
-            network to produce outputs > 0 when predicting a member of the +1 class and
-            values < 0 otherwise.
+            appropriate for binary classification problems. Therefore, there are two
+            possible classes of labels when using this loss: positive (> 0) and
+            negative (< 0). The absolute value of the label represents its weight.
+            Putting a larger weight on a sample increases the importance of getting
+            its prediction correct during training. A good rule of thumb is to use
+            weights with absolute value 1 unless you have a very unbalanced training
+            dataset; in that case, give larger weight to the class with fewer
+            training examples.
+
+            This loss will cause the network to produce outputs > 0 when predicting a
+            member of the positive class and values < 0 otherwise.

             To be more specific, this object contains a sigmoid layer followed by a
             cross-entropy layer.
...
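For context, here is how the weighted labels might be used end to end. The toy network, synthetic data, and trainer settings below are illustrative assumptions, not part of the commit; only the convention that |label| is the sample weight comes from the change above.

#include <dlib/dnn.h>
#include <vector>
using namespace dlib;

// Assumed tiny network: one input feature through a single fully connected
// output, trained with the weighted binary log loss from this commit.
using net_type = loss_binary_log<fc<1, input<matrix<float,1,1>>>>;

int main()
{
    std::vector<matrix<float,1,1>> samples;
    std::vector<float> labels;
    for (int i = 0; i < 200; ++i)
    {
        matrix<float,1,1> x;
        x = (i % 4 == 0) ? 1.0f : -1.0f;  // positives are 1 in 4 samples
        samples.push_back(x);
        // Upweight the rare positive class: |label| is the sample weight,
        // so +3 counts a positive three times as heavily as a -1 negative.
        labels.push_back((i % 4 == 0) ? +3.0f : -1.0f);
    }

    net_type net;
    dnn_trainer<net_type> trainer(net);
    trainer.set_learning_rate(0.1);
    trainer.set_max_num_epochs(50);
    trainer.train(samples, labels);  // outputs > 0 predict the positive class
}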