Commit 36d49e67 authored by Davis King

Added the following functions:

    make_uniform_lbp_image()
    extract_histogram_descriptors()
    extract_uniform_lbp_descriptors()
    extract_highdim_face_lbp_descriptors()
    compute_lda_transform()
    compute_equal_error_rate()
parent 5a69878b
@@ -17,6 +17,7 @@
#include "image_transforms/segment_image.h"
#include "image_transforms/interpolation.h"
#include "image_transforms/fhog.h"
#include "image_transforms/lbp.h"
#endif // DLIB_IMAGE_TRANSFORMs_
// Copyright (C) 2014 Davis E. King (davis@dlib.net)
// License: Boost Software License See LICENSE.txt for the full license.
#undef DLIB_LBP_ABSTRACT_Hh_
#ifdef DLIB_LBP_ABSTRACT_Hh_
#include "../image_processing/generic_image.h"
#include "../pixel.h"
namespace dlib
{
// ----------------------------------------------------------------------------------------
template <
typename image_type,
typename image_type2
>
void make_uniform_lbp_image (
const image_type& img,
image_type2& lbp
);
/*!
requires
- image_type == an image object that implements the interface defined in
dlib/image_processing/generic_image.h
- image_type2 == an image object that implements the interface defined in
dlib/image_processing/generic_image.h
- image_type2 should contain a grayscale pixel type such as unsigned char.
ensures
- #lbp.nr() == img.nr()
- #lbp.nc() == img.nc()
- This function extracts the uniform local-binary-pattern feature at every pixel
and stores it into #lbp. In particular, we have the following for all valid
r and c:
- #lbp[r][c] == the uniform LBP for the 3x3 pixel window centered on img[r][c].
In particular, this is a value in the range 0 to 58 inclusive.
- We use the idea of uniform LBPs from the paper:
Face Description with Local Binary Patterns: Application to Face Recognition
by Ahonen, Hadid, and Pietikainen.
!*/
// ----------------------------------------------------------------------------------------
template <
typename image_type,
typename T
>
void extract_histogram_descriptors (
const image_type& img,
const point& loc,
std::vector<T>& histograms,
const unsigned int cell_size = 10,
const unsigned int block_size = 4,
const unsigned int max_val = 58
);
/*!
requires
- image_type == an image object that implements the interface defined in
dlib/image_processing/generic_image.h
- image_type contains unsigned char valued pixels.
- All pixel values in img are <= max_val
- cell_size >= 1
- block_size >= 1
- max_val < 256
ensures
- This function extracts histograms of pixel values from block_size*block_size
windows in the area in img immediately around img[loc.y()][loc.x()]. The
histograms are appended onto the end of #histograms. Each window is
cell_size pixels wide and tall. Moreover, the windows do not overlap.
- #histograms.size() == histograms.size() + block_size*block_size*(max_val+1)
!*/
// ----------------------------------------------------------------------------------------
template <
typename image_type,
typename T
>
void extract_uniform_lbp_descriptors (
const image_type& img,
std::vector<T>& feats,
const unsigned int cell_size = 10
);
/*!
requires
- cell_size >= 1
ensures
- Extracts histograms of uniform local-binary-patterns from img. The
histograms are densely tiled windows that are cell_size pixels wide and tall.
The windows do not overlap and cover all of img.
- #feats.size() == 59*(number of windows that fit into img)
(i.e. #feats contains the LBP histograms)
- We will have taken the square root of all the histogram elements. That is,
#feats[i] is the square root of the number of LBPs that appeared in its
corresponding window.
!*/
// ----------------------------------------------------------------------------------------
template <
typename image_type,
typename T
>
void extract_highdim_face_lbp_descriptors (
const image_type& img,
const full_object_detection& det,
std::vector<T>& feats
);
/*!
requires
- det.num_parts() == 68
ensures
- This function extracts the high-dimensional LBP feature described in the
paper:
Blessing of Dimensionality: High-dimensional Feature and Its Efficient
Compression for Face Verification by Dong Chen, Xudong Cao, Fang Wen, and
Jian Sun
- #feats == the high-dimensional LBP descriptor. It is the concatenation of
many LBP histograms, each extracted from different scales and from different
windows around different face landmarks. We also take the square root of
each histogram element before storing it into #feats.
- #feats.size() == 99120
- This function assumes img has already been aligned and normalized to a
standard size.
- This function assumes det contains a human face detection with face parts
annotated using the annotation scheme from the iBUG 300-W face landmark
dataset. This means that det.part(i) gives the locations of different face
landmarks according to the iBUG 300-W annotation scheme.
!*/
// ----------------------------------------------------------------------------------------
}
#endif // DLIB_LBP_ABSTRACT_Hh_
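
The spec for make_uniform_lbp_image() above says each output pixel is a value from 0 to 58. A minimal sketch of where that range can come from, assuming the usual definition of a uniform pattern (at most two 0/1 transitions around the 8-neighbor circle): there are 58 such patterns, and all remaining patterns share one extra label. The label ordering, the `>=` comparison, the zero padding at the image border, and the helper names `circular_transitions`, `make_uniform_lbp_table`, and `raw_lbp` are assumptions made for illustration. They are not taken from the collapsed lbp.h diff.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Number of 0->1 and 1->0 transitions when the 8 bits are read as a circle.
inline int circular_transitions(std::uint8_t pattern)
{
    int count = 0;
    for (int i = 0; i < 8; ++i)
    {
        const int a = (pattern >> i) & 1;
        const int b = (pattern >> ((i + 1) % 8)) & 1;
        count += (a != b);
    }
    return count;
}

// Maps each raw 8-bit pattern to a label in 0..58. Uniform patterns get
// labels 0..57 in increasing order of raw value (an assumed ordering);
// every non-uniform pattern shares label 58.
inline std::array<std::uint8_t, 256> make_uniform_lbp_table()
{
    std::array<std::uint8_t, 256> table{};
    std::uint8_t next = 0;
    for (int p = 0; p < 256; ++p)
    {
        if (circular_transitions(static_cast<std::uint8_t>(p)) <= 2)
            table[p] = next++;
        else
            table[p] = 58;
    }
    assert(next == 58); // exactly 58 uniform 8-bit patterns exist
    return table;
}

// Raw LBP for the 3x3 window centered on img[r][c]: one bit per neighbor,
// set when neighbor >= center. Neighbors are visited clockwise from the
// top-left corner. Out-of-image neighbors count as 0 (an assumption).
inline std::uint8_t raw_lbp(
    const std::vector<std::vector<unsigned char>>& img, long r, long c)
{
    static const int dr[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
    static const int dc[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    const long nr = static_cast<long>(img.size());
    const long nc = nr ? static_cast<long>(img[0].size()) : 0;
    assert(0 <= r && r < nr && 0 <= c && c < nc);
    std::uint8_t pattern = 0;
    for (int i = 0; i < 8; ++i)
    {
        const long rr = r + dr[i], cc = c + dc[i];
        const bool inside = rr >= 0 && rr < nr && cc >= 0 && cc < nc;
        const int neighbor = inside ? img[rr][cc] : 0;
        if (neighbor >= img[r][c])
            pattern |= static_cast<std::uint8_t>(1u << i);
    }
    return pattern;
}
```

Under these assumptions, `make_uniform_lbp_table()[raw_lbp(img, r, c)]` would play the role of `#lbp[r][c]`, and a histogram over the labels has 59 bins, which matches the `max_val = 58` default and the `59*(number of windows)` size stated above.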
@@ -11,6 +11,7 @@
#include "statistics/cca.h"
#include "statistics/average_precision.h"
#include "statistics/vector_normalizer_frobmetric.h"
#include "statistics/lda.h"
#endif // DLIB_STATISTICs_H_
// Copyright (C) 2014 Davis E. King (davis@dlib.net)
// License: Boost Software License See LICENSE.txt for the full license.
#ifndef DLIB_LDA_Hh_
#define DLIB_LDA_Hh_
#include "lda_abstract.h"
#include "../algs.h"
#include <map>
#include "../matrix.h"
#include <vector>
namespace dlib
{
// ----------------------------------------------------------------------------------------
namespace impl
{
inline std::map<unsigned long,unsigned long> make_class_labels(
const std::vector<unsigned long>& row_labels
)
{
std::map<unsigned long,unsigned long> class_labels;
for (unsigned long i = 0; i < row_labels.size(); ++i)
{
const unsigned long next = class_labels.size();
if (class_labels.count(row_labels[i]) == 0)
class_labels[row_labels[i]] = next;
}
return class_labels;
}
// ------------------------------------------------------------------------------------
template <
typename T
>
matrix<T,0,1> center_matrix (
matrix<T>& X
)
{
matrix<T,1> mean;
for (long r = 0; r < X.nr(); ++r)
mean += rowm(X,r);
mean /= X.nr();
for (long r = 0; r < X.nr(); ++r)
set_rowm(X,r) -= mean;
return trans(mean);
}
}
// ----------------------------------------------------------------------------------------
template <
typename T
>
void compute_lda_transform (
matrix<T>& X,
matrix<T,0,1>& mean,
const std::vector<unsigned long>& row_labels,
unsigned long lda_dims = 500,
unsigned long extra_pca_dims = 200
)
{
std::map<unsigned long,unsigned long> class_labels = impl::make_class_labels(row_labels);
// LDA can only give out at most class_labels.size()-1 dimensions so don't try to
// compute more than that.
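// The explicit template argument keeps this compiling on platforms where
// size_t and unsigned long are different types.
lda_dims = std::min<unsigned long>(lda_dims, class_labels.size()-1);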
// make sure requires clause is not broken
DLIB_CASSERT(class_labels.size() > 1,
"\t void compute_lda_transform()"
<< "\n\t You can't call this function if the number of distinct class labels is less than 2."
);
DLIB_CASSERT(X.size() != 0 && (long)row_labels.size() == X.nr() && lda_dims != 0,
"\t void compute_lda_transform()"
<< "\n\t Invalid inputs were given to this function."
<< "\n\t X.size(): " << X.size()
<< "\n\t row_labels.size(): " << row_labels.size()
<< "\n\t lda_dims: " << lda_dims
);
mean = impl::center_matrix(X);
// Do PCA to reduce dims
matrix<T> pu,pw,pv;
svd_fast(X, pu, pw, pv, lda_dims+extra_pca_dims, 4);
pu.set_size(0,0); // free RAM, we don't need pu.
X = X*pv;
matrix<T> class_means(class_labels.size(), X.nc());
class_means = 0;
matrix<T,0,1> class_counts(class_labels.size());
class_counts = 0;
// First compute the means of each class
for (unsigned long i = 0; i < row_labels.size(); ++i)
{
const unsigned long class_idx = class_labels[row_labels[i]];
set_rowm(class_means,class_idx) += rowm(X,i);
class_counts(class_idx)++;
}
class_means = inv(diagm(class_counts))*class_means;
// subtract means from the data
for (unsigned long i = 0; i < row_labels.size(); ++i)
{
const unsigned long class_idx = class_labels[row_labels[i]];
set_rowm(X,i) -= rowm(class_means,class_idx);
}
// Note that we are using the formulas from the paper Using Discriminant
// Eigenfeatures for Image Retrieval by Swets and Weng.
matrix<T> Sw = trans(X)*X;
matrix<T> Sb = trans(class_means)*class_means;
matrix<T> A, H;
matrix<T,0,1> W;
svd3(Sw, A, W, H);
W = sqrt(W);
W = reciprocal(round_zeros(W,max(W)*1e-5));
A = trans(H*diagm(W))*Sb*H*diagm(W);
matrix<T> v,s,u;
svd3(A, v, s, u);
matrix<T> tform = H*diagm(W)*u;
// Keep only the lda_dims columns with the largest values in s. If tform has
// fewer than lda_dims columns, keep all of them unchanged.
if ((long)lda_dims <= tform.nc())
{
rsort_columns(tform, s);
tform = colm(tform, range(0, lda_dims-1));
}
X = trans(pv*tform);
mean = X*mean;
}
// ----------------------------------------------------------------------------------------
inline std::pair<double,double> compute_equal_error_rate (
const std::vector<double>& low_vals,
const std::vector<double>& high_vals
)
{
std::vector<std::pair<double,int> > temp;
temp.reserve(low_vals.size()+high_vals.size());
for (unsigned long i = 0; i < low_vals.size(); ++i)
temp.push_back(std::make_pair(low_vals[i], -1));
for (unsigned long i = 0; i < high_vals.size(); ++i)
temp.push_back(std::make_pair(high_vals[i], +1));
std::sort(temp.begin(), temp.end());
if (temp.size() == 0)
return std::make_pair(0,0);
double thresh = temp[0].first;
unsigned long num_low_wrong = low_vals.size();
unsigned long num_high_wrong = 0;
double low_error = num_low_wrong/(double)low_vals.size();
double high_error = num_high_wrong/(double)high_vals.size();
for (unsigned long i = 0; i < temp.size() && high_error < low_error; ++i)
{
thresh = temp[i].first;
if (temp[i].second > 0)
{
num_high_wrong++;
high_error = num_high_wrong/(double)high_vals.size();
}
else
{
num_low_wrong--;
low_error = num_low_wrong/(double)low_vals.size();
}
}
return std::make_pair((low_error+high_error)/2, thresh);
}
// ----------------------------------------------------------------------------------------
}
#endif // DLIB_LDA_Hh_
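
compute_lda_transform() above ends with `X = trans(pv*tform); mean = X*mean;`, so after the call X holds the transform and mean holds the offset to subtract. A minimal sketch of applying that result to a new vector, written with plain std::vector so it stands alone. The helper name `apply_lda` and the vector-of-rows layout are assumptions for illustration and are not part of this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns Z*x - M, where Z is stored as a vector of
// rows (one row per output dimension, each row as long as x).
inline std::vector<double> apply_lda(
    const std::vector<std::vector<double>>& Z,
    const std::vector<double>& M,
    const std::vector<double>& x)
{
    assert(Z.size() == M.size());
    std::vector<double> y(Z.size(), 0.0);
    for (std::size_t r = 0; r < Z.size(); ++r)
    {
        assert(Z[r].size() == x.size());
        for (std::size_t c = 0; c < x.size(); ++c)
            y[r] += Z[r][c] * x[c];   // row r of Z dotted with x
        y[r] -= M[r];                 // subtract the transformed mean
    }
    return y;
}
```

With dlib matrices the same step would presumably be written as `X*x - mean`, where x is a column vector with X.nc() rows.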
// Copyright (C) 2014 Davis E. King (davis@dlib.net)
// License: Boost Software License See LICENSE.txt for the full license.
#undef DLIB_LDA_ABSTRACT_Hh_
#ifdef DLIB_LDA_ABSTRACT_Hh_
#include <map>
#include "../matrix.h"
#include <vector>
namespace dlib
{
// ----------------------------------------------------------------------------------------
template <
typename T
>
void compute_lda_transform (
matrix<T>& X,
matrix<T,0,1>& M,
const std::vector<unsigned long>& row_labels,
unsigned long lda_dims = 500,
unsigned long extra_pca_dims = 200
);
/*!
requires
- X.size() != 0
- row_labels.size() == X.nr()
- The number of distinct values in row_labels > 1
- lda_dims != 0
ensures
- We interpret X as a collection of X.nr() input vectors, where each row of X
is one of the vectors.
- We interpret row_labels[i] as the label of the vector rowm(X,i).
- This function performs the dimensionality reducing version of Linear
discriminant analysis. That is, you give it a set of labeled vectors and it
returns a linear transform that maps the input vectors into a new space that
is good for distinguishing between the different classes. In particular,
this function finds matrices Z and M such that:
- Given an input vector x, Z*x-M is the transformed version of x. That is,
Z*x-M maps x into a space where x vectors that share the same class label
are near each other.
- Z*x-M results in the transformed vectors having zero expected mean.
- Z.nr() <= lda_dims
(it might be less than lda_dims if there are not enough distinct class
labels to support lda_dims dimensions).
- Z.nc() == X.nc()
- We overwrite the input matrix X and store Z in it. Therefore, the
outputs of this function are in X and M.
- In order to deal with very high dimensional inputs, we perform PCA internally
to map the input vectors into a space of at most lda_dims+extra_pca_dims
dimensions prior to performing LDA.
!*/
// ----------------------------------------------------------------------------------------
std::pair<double,double> compute_equal_error_rate (
const std::vector<double>& low_vals,
const std::vector<double>& high_vals
);
/*!
ensures
- This function finds a threshold T that best separates the elements of
low_vals from high_vals by selecting the threshold with equal error rate. In
particular, we try to pick a threshold T such that:
- for all valid i:
- high_vals[i] >= T
- for all valid i:
- low_vals[i] < T
Where the best T is determined such that the fraction of low_vals >= T is the
same as the fraction of high_vals < T.
- Let ERR == the equal error rate. I.e. the fraction of times low_vals >= T
and high_vals < T. Note that 0 <= ERR <= 1.
- returns make_pair(ERR,T)
!*/
// ----------------------------------------------------------------------------------------
}
#endif // DLIB_LDA_ABSTRACT_Hh_
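
The spec for compute_equal_error_rate() above defines the two error fractions in terms of a threshold T. A small sketch that evaluates those two fractions for a threshold supplied by the caller may make the definition concrete. The helper name `error_fractions` is hypothetical, and the sketch does not reproduce the threshold search that compute_equal_error_rate() performs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: for a caller-supplied threshold T, returns
// (fraction of low_vals >= T, fraction of high_vals < T).
inline std::pair<double, double> error_fractions(
    const std::vector<double>& low_vals,
    const std::vector<double>& high_vals,
    double T)
{
    // Both sets must be non-empty, otherwise the fractions are undefined.
    assert(!low_vals.empty() && !high_vals.empty());
    std::size_t low_wrong = 0, high_wrong = 0;
    for (double v : low_vals)
        if (v >= T) ++low_wrong;    // a low value that lands on the high side
    for (double v : high_vals)
        if (v < T) ++high_wrong;    // a high value that lands on the low side
    return std::make_pair(low_wrong / static_cast<double>(low_vals.size()),
                          high_wrong / static_cast<double>(high_vals.size()));
}
```

For example, with low_vals = {0.1, 0.2, 0.3, 0.6}, high_vals = {0.5, 0.7, 0.8, 0.9}, and a hand-picked T = 0.6, one of the four low values is >= T and one of the four high values is < T, so both fractions are 0.25 and the two error rates are equal at that threshold.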