Added the following functions:

- make_uniform_lbp_image()
- extract_histogram_descriptors()
- extract_uniform_lbp_descriptors()
- extract_highdim_face_lbp_descriptors()
- compute_lda_transform()
- compute_equal_error_rate()

Commit 36d49e67 authored Sep 15, 2014 by Davis King
6 changed files
--- a/dlib/image_transforms.h
+++ b/dlib/image_transforms.h
@@ -17,6 +17,7 @@
 #include "image_transforms/segment_image.h"
 #include "image_transforms/interpolation.h"
 #include "image_transforms/fhog.h"
+#include "image_transforms/lbp.h"

 #endif // DLIB_IMAGE_TRANSFORMs_

--- a/dlib/image_transforms/lbp.h
+++ b/dlib/image_transforms/lbp.h
--- a/dlib/image_transforms/lbp_abstract.h
+++ b/dlib/image_transforms/lbp_abstract.h
+// Copyright (C) 2014  Davis E. King (davis@dlib.net)
+// License: Boost Software License   See LICENSE.txt for the full license.
+#undef DLIB_LBP_ABSTRACT_Hh_
+#ifdef DLIB_LBP_ABSTRACT_Hh_
+
+#include "../image_processing/generic_image.h"
+#include "../pixel.h"
+
+namespace dlib
+{
+
+// ----------------------------------------------------------------------------------------
+
+    template <
+        typename image_type,
+        typename image_type2
+        >
+    void make_uniform_lbp_image (
+        const image_type& img,
+        image_type2& lbp
+    );
+    /*!
+        requires
+            - image_type == an image object that implements the interface defined in
+              dlib/image_processing/generic_image.h 
+            - image_type2 == an image object that implements the interface defined in
+              dlib/image_processing/generic_image.h 
+            - image_type2 should contain a grayscale pixel type such as unsigned char.
+        ensures
+            - #lbp.nr() == img.nr()
+            - #lbp.nc() == img.nc()
+            - This function extracts the uniform local-binary-pattern feature at every pixel
+              and stores it into #lbp.  In particular, we have the following for all valid 
+              r and c:
+                - #lbp[r][c] == the uniform LBP for the 3x3 pixel window centered on img[r][c].  
+                  In particular, this is a value in the range 0 to 58 inclusive. 
+            - We use the idea of uniform LBPs from the paper: 
+                Face Description with Local Binary Patterns: Application to Face Recognition
+                by Ahonen, Hadid, and Pietikainen.
+    !*/
+
+// ----------------------------------------------------------------------------------------
+
+    template <
+        typename image_type,
+        typename T
+        >
+    void extract_histogram_descriptors (
+        const image_type& img,
+        const point& loc,
+        std::vector<T>& histograms,
+        const unsigned int cell_size = 10,
+        const unsigned int block_size = 4,
+        const unsigned int max_val = 58
+    );
+    /*!
+        requires
+            - image_type == an image object that implements the interface defined in
+              dlib/image_processing/generic_image.h 
+            - image_type contains unsigned char valued pixels.
+            - All pixel values in img are <= max_val
+            - cell_size >= 1
+            - block_size >= 1
+            - max_val < 256
+        ensures
+            - This function extracts histograms of pixel values from block_size*block_size
+              windows in the area in img immediately around img[loc.y()][loc.x()].  The
+              histograms are appended onto the end of #histograms.  Each window is
+              cell_size pixels wide and tall.  Moreover, the windows do not overlap.
+            - #histograms.size() == histograms.size() + block_size*block_size*(max_val+1)
+    !*/
+
+// ----------------------------------------------------------------------------------------
+
+    template <
+        typename image_type,
+        typename T
+        >
+    void extract_uniform_lbp_descriptors (
+        const image_type& img,
+        std::vector<T>& feats,
+        const unsigned int cell_size = 10
+    );
+    /*!
+        requires
+            - cell_size >= 1
+        ensures
+            - Extracts histograms of uniform local-binary-patterns from img.  The
+              histograms are densely tiled windows that are cell_size pixels wide and tall.
+              The windows do not overlap and cover all of img.
+            - #feats.size() == 59*(number of windows that fit into img)
+              (i.e. #feats contains the LBP histograms)
+            - We will have taken the square root of all the histogram elements.  That is,
+              #feats[i] is the square root of the number of LBPs that appeared in its
+              corresponding window.
+    !*/
+
+// ----------------------------------------------------------------------------------------
+
+    template <
+        typename image_type,
+        typename T
+        >
+    void extract_highdim_face_lbp_descriptors (
+        const image_type& img,
+        const full_object_detection& det,
+        std::vector<T>& feats
+    );
+    /*!
+        requires
+            - det.num_parts() == 68
+        ensures
+            - This function extracts the high-dimensional LBP feature described in the
+              paper:
+                Blessing of Dimensionality: High-dimensional Feature and Its Efficient
+                Compression for Face Verification by Dong Chen, Xudong Cao, Fang Wen, and
+                Jian Sun
+            - #feats == the high-dimensional LBP descriptor.  It is the concatenation of
+              many LBP histograms, each extracted from different scales and from different
+              windows around different face landmarks.  We also take the square root of
+              each histogram element before storing it into #feats.
+            - #feats.size() == 99120
+            - This function assumes img has already been aligned and normalized to a
+              standard size.
+            - This function assumes det contains a human face detection with face parts
+              annotated using the annotation scheme from the iBUG 300-W face landmark
+              dataset.  This means that det.part(i) gives the locations of different face
+              landmarks according to the iBUG 300-W annotation scheme.
+    !*/
+
+// ----------------------------------------------------------------------------------------
+
+}
+
+#endif // DLIB_LBP_ABSTRACT_Hh_
+
--- a/dlib/statistics.h
+++ b/dlib/statistics.h
@@ -11,6 +11,7 @@
 #include "statistics/cca.h"
 #include "statistics/average_precision.h"
 #include "statistics/vector_normalizer_frobmetric.h"
+#include "statistics/lda.h"

 #endif // DLIB_STATISTICs_H_ 


--- a/dlib/statistics/lda.h
+++ b/dlib/statistics/lda.h
+// Copyright (C) 2014  Davis E. King (davis@dlib.net)
+// License: Boost Software License   See LICENSE.txt for the full license.
+#ifndef DLIB_LDA_Hh_
+#define DLIB_LDA_Hh_
+
+#include "lda_abstract.h"
+#include "../algs.h"
+#include <algorithm>  // std::sort, std::min
+#include <map>
+#include "../matrix.h"
+#include <vector>
+
+namespace dlib
+{
+
+// ----------------------------------------------------------------------------------------
+
+    namespace impl
+    {
+
+        inline std::map<unsigned long,unsigned long> make_class_labels(
+            const std::vector<unsigned long>& row_labels
+        )
+        {
+            std::map<unsigned long,unsigned long> class_labels;
+            for (unsigned long i = 0; i < row_labels.size(); ++i)
+            {
+                const unsigned long next = class_labels.size();
+                if (class_labels.count(row_labels[i]) == 0)
+                    class_labels[row_labels[i]] = next;
+            }
+            return class_labels;
+        }
+
+    // ------------------------------------------------------------------------------------
+
+        template <
+            typename T
+            >
+        matrix<T,0,1> center_matrix (
+            matrix<T>& X
+        )
+        {
+            matrix<T,1> mean;
+            for (long r = 0; r < X.nr(); ++r)
+                mean += rowm(X,r);
+            mean /= X.nr();
+
+            for (long r = 0; r < X.nr(); ++r)
+                set_rowm(X,r) -= mean;
+
+            return trans(mean);
+        }
+    }
+
+// ----------------------------------------------------------------------------------------
+
+    template <
+        typename T
+        >
+    void compute_lda_transform (
+        matrix<T>& X,
+        matrix<T,0,1>& mean,
+        const std::vector<unsigned long>& row_labels,
+        unsigned long lda_dims = 500,
+        unsigned long extra_pca_dims = 200
+    )
+    {
+        std::map<unsigned long,unsigned long> class_labels = impl::make_class_labels(row_labels);
+        // LDA can only give out at most class_labels.size()-1 dimensions so don't try to
+        // compute more than that.
+        lda_dims = std::min<unsigned long>(lda_dims, class_labels.size()-1);
+
+        // make sure requires clause is not broken
+        DLIB_CASSERT(class_labels.size() > 1,
+            "\t void compute_lda_transform()"
+            << "\n\t You can't call this function if the number of distinct class labels is less than 2."
+            );
+        DLIB_CASSERT(X.size() != 0 && (long)row_labels.size() == X.nr() && lda_dims != 0,
+            "\t void compute_lda_transform()"
+            << "\n\t Invalid inputs were given to this function."
+            << "\n\t X.size():          " << X.size()
+            << "\n\t row_labels.size(): " << row_labels.size()
+            << "\n\t lda_dims:          " << lda_dims
+            );
+
+
+        mean = impl::center_matrix(X);
+        // Do PCA to reduce dims
+        matrix<T> pu,pw,pv;
+        svd_fast(X, pu, pw, pv, lda_dims+extra_pca_dims, 4);
+        pu.set_size(0,0); // free RAM, we don't need pu.
+        X = X*pv;
+
+
+        matrix<T> class_means(class_labels.size(), X.nc());
+        class_means = 0;
+        matrix<T,0,1> class_counts(class_labels.size());
+        class_counts = 0;
+
+        // First compute the means of each class
+        for (unsigned long i = 0; i < row_labels.size(); ++i)
+        {
+            const unsigned long class_idx = class_labels[row_labels[i]];
+            set_rowm(class_means,class_idx) += rowm(X,i);
+            class_counts(class_idx)++;
+        }
+        class_means = inv(diagm(class_counts))*class_means;
+        // subtract means from the data
+        for (unsigned long i = 0; i < row_labels.size(); ++i)
+        {
+            const unsigned long class_idx = class_labels[row_labels[i]];
+            set_rowm(X,i) -= rowm(class_means,class_idx);
+        }
+
+        // Note that we are using the formulas from the paper Using Discriminant
+        // Eigenfeatures for Image Retrieval by Swets and Weng.
+        matrix<T> Sw = trans(X)*X;
+        matrix<T> Sb = trans(class_means)*class_means;
+        matrix<T> A, H;
+        matrix<T,0,1> W;
+        svd3(Sw, A, W, H);
+        W = sqrt(W);
+        W = reciprocal(round_zeros(W,max(W)*1e-5));
+        A = trans(H*diagm(W))*Sb*H*diagm(W);
+        matrix<T> v,s,u;
+        svd3(A, v, s, u);
+        matrix<T> tform = H*diagm(W)*u;
+        // Keep only the lda_dims columns with the largest singular values.  If tform
+        // has fewer than lda_dims columns, leave it unchanged.
+        if ((long)lda_dims <= tform.nc())
+        {
+            rsort_columns(tform, s);
+            tform = colm(tform, range(0, lda_dims-1));
+        }
+
+        X = trans(pv*tform);
+        mean = X*mean;
+    }
+
+// ----------------------------------------------------------------------------------------
+
+    inline std::pair<double,double> compute_equal_error_rate (
+        const std::vector<double>& low_vals,
+        const std::vector<double>& high_vals 
+    )
+    {
+        std::vector<std::pair<double,int> > temp;
+        temp.reserve(low_vals.size()+high_vals.size());
+        for (unsigned long i = 0; i < low_vals.size(); ++i)
+            temp.push_back(std::make_pair(low_vals[i], -1));
+        for (unsigned long i = 0; i < high_vals.size(); ++i)
+            temp.push_back(std::make_pair(high_vals[i], +1));
+
+        std::sort(temp.begin(), temp.end());
+
+        if (temp.size() == 0)
+            return std::make_pair(0,0);
+
+        double thresh = temp[0].first;
+
+        unsigned long num_low_wrong = low_vals.size();
+        unsigned long num_high_wrong = 0;
+        double low_error = num_low_wrong/(double)low_vals.size();
+        double high_error = num_high_wrong/(double)high_vals.size();
+        for (unsigned long i = 0; i < temp.size() && high_error < low_error; ++i)
+        {
+            thresh = temp[i].first;
+            if (temp[i].second > 0)
+            {
+                num_high_wrong++;
+                high_error = num_high_wrong/(double)high_vals.size();
+            }
+            else
+            {
+                num_low_wrong--;
+                low_error = num_low_wrong/(double)low_vals.size();
+            }
+        }
+
+        return std::make_pair((low_error+high_error)/2, thresh);
+    }
+
+// ----------------------------------------------------------------------------------------
+
+}
+
+#endif // DLIB_LDA_Hh_
+
--- a/dlib/statistics/lda_abstract.h
+++ b/dlib/statistics/lda_abstract.h
+// Copyright (C) 2014  Davis E. King (davis@dlib.net)
+// License: Boost Software License   See LICENSE.txt for the full license.
+#undef DLIB_LDA_ABSTRACT_Hh_
+#ifdef DLIB_LDA_ABSTRACT_Hh_
+
+#include <map>
+#include "../matrix.h"
+#include <vector>
+
+namespace dlib
+{
+
+// ----------------------------------------------------------------------------------------
+
+    template <
+        typename T
+        >
+    void compute_lda_transform (
+        matrix<T>& X,
+        matrix<T,0,1>& M,
+        const std::vector<unsigned long>& row_labels,
+        unsigned long lda_dims = 500,
+        unsigned long extra_pca_dims = 200
+    );
+    /*!
+        requires
+            - X.size() != 0
+            - row_labels.size() == X.nr()
+            - The number of distinct values in row_labels > 1
+            - lda_dims != 0
+        ensures
+            - We interpret X as a collection of X.nr() input vectors, where each row of X
+              is one of the vectors.
+            - We interpret row_labels[i] as the label of the vector rowm(X,i).
+            - This function performs the dimensionality-reducing version of linear
+              discriminant analysis.  That is, you give it a set of labeled vectors and it
+              returns a linear transform that maps the input vectors into a new space that
+              is good for distinguishing between the different classes.  In particular,
+              this function finds matrices Z and M such that:
+                - Given an input vector x, Z*x-M is the transformed version of x.  That is,
+                  Z*x-M maps x into a space where x vectors that share the same class label
+                  are near each other. 
+                - Z*x-M results in the transformed vectors having zero expected mean.
+                - Z.nr() <= lda_dims
+                  (it might be less than lda_dims if there are not enough distinct class
+                  labels to support lda_dims dimensions).
+                - Z.nc() == X.nc()
+                - We overwrite the input matrix X and store Z in it.  Therefore, the
+                  outputs of this function are in X and M.
+            - In order to deal with very high dimensional inputs, we perform PCA internally
+              to map the input vectors into a space of at most lda_dims+extra_pca_dims
+              prior to performing LDA.
+    !*/
+
+// ----------------------------------------------------------------------------------------
+
+    std::pair<double,double> compute_equal_error_rate (
+        const std::vector<double>& low_vals,
+        const std::vector<double>& high_vals 
+    );
+    /*!
+        ensures
+            - This function finds a threshold T that best separates the elements of
+              low_vals from high_vals by selecting the threshold with equal error rate.  In
+              particular, we try to pick a threshold T such that:
+                - for all valid i:
+                    - high_vals[i] >= T
+                - for all valid i:
+                    - low_vals[i] < T
+              Where the best T is determined such that the fraction of low_vals >= T is the
+              same as the fraction of high_vals < T.
+            - Let ERR == the equal error rate.  I.e. the fraction of times low_vals >= T
+              and high_vals < T.  Note that 0 <= ERR <= 1.
+            - returns make_pair(ERR,T) 
+    !*/
+
+// ----------------------------------------------------------------------------------------
+
+}
+
+#endif // DLIB_LDA_ABSTRACT_Hh_
+
+