Commit ecfa5a8d authored by Davis King

Updated this example to use the newer and easier-to-use wrapper function for rank_features().

--HG--
extra : convert_revision : svn%3Afdd8eb12-d10e-0410-9acb-85c331704f74/trunk%403235
parent a01f495a
// The contents of this file are in the public domain. See LICENSE_FOR_EXAMPLE_PROGRAMS.txt // The contents of this file are in the public domain. See LICENSE_FOR_EXAMPLE_PROGRAMS.txt
/* /*
This is an example illustrating the use of the rank_features() function This is an example illustrating the use of the feature ranking
from the dlib C++ Library. tools from the dlib C++ Library.
This example creates a simple set of data and then shows This example creates a simple set of data and then shows
you how to use the rank_features() function to find a good you how to use the feature ranking function to find a good
set of features (where "good" means the feature set will probably set of features (where "good" means the feature set will probably
work well with a classification algorithm). work well with a classification algorithm).
...@@ -14,7 +14,7 @@ ...@@ -14,7 +14,7 @@
from the origin are labeled +1 and all other points are labeled from the origin are labeled +1 and all other points are labeled
as -1. Note that this data is conceptually 2 dimensional but we as -1. Note that this data is conceptually 2 dimensional but we
will add two extra features for the purpose of showing what will add two extra features for the purpose of showing what
the rank_features() function does. the feature ranking function does.
*/ */
...@@ -55,7 +55,7 @@ int main() ...@@ -55,7 +55,7 @@ int main()
samp(1) = y; samp(1) = y;
// This is a worthless feature since it is just random noise. It should // This is a worthless feature since it is just random noise. It should
// be indicated as worthless by the rank_features() function below. // be indicated as worthless by the feature ranking below.
samp(2) = rnd.get_random_double(); samp(2) = rnd.get_random_double();
// This is a version of the y feature that is corrupted by random noise. It // This is a version of the y feature that is corrupted by random noise. It
...@@ -85,45 +85,31 @@ int main() ...@@ -85,45 +85,31 @@ int main()
for (unsigned long i = 0; i < samples.size(); ++i) for (unsigned long i = 0; i < samples.size(); ++i)
samples[i] = pointwise_multiply(samples[i] - m, sd); samples[i] = pointwise_multiply(samples[i] - m, sd);
// This is another thing that is often good to do from a numerical stability point of view. // This is another thing that is often good to do from a numerical stability point of view.
// However, in our case it doesn't really matter. // However, in our case it doesn't really matter.
randomize_samples(samples,labels); randomize_samples(samples,labels);
// This is a typedef for the type of kernel we are going to use in this example. // Finally we get to the feature ranking. Here we call verbose_rank_features_rbf() with
// In this case I have selected the radial basis kernel that can operate on our // the samples and labels we made above. The 20 is a measure of how much memory and CPU
// 4D sample_type objects. In general, I would suggest using the same kernel for // resources the algorithm should use. Generally bigger values give better results but
// classification and feature ranking. // take longer to run.
typedef radial_basis_kernel<sample_type> kernel_type; cout << verbose_rank_features_rbf(samples, labels, 20) << endl;
// Here we declare an instance of the kcentroid object. It is used by rank_features()
// to represent the centroids of the two classes. The kcentroid has 3 parameters
// you need to set. The first argument to the constructor is the kernel we wish to
// use. The second is a parameter that determines the numerical accuracy with which
// the object will perform part of the ranking algorithm. Generally, smaller values
// give better results but cause the algorithm to attempt to use more support vectors
// (and thus run slower and use more memory). The third argument, however, is the
// maximum number of support vectors a kcentroid is allowed to use. So you can use
// it to control the runtime complexity.
kcentroid<kernel_type> kc(kernel_type(0.05), 0.001, 25);
// And finally we get to the feature ranking. Here we call rank_features() with the kcentroid we just made,
// the samples and labels we made above, and the number of features we want it to rank.
cout << rank_features(kc, samples, labels) << endl;
// The output is: // The output is:
/* /*
1 0.514254 0 0.810087
0 0.810668 1 1
3 1 3 0.873991
2 0.994169 2 0.668913
*/ */
// The first column is a list of the features in order of decreasing goodness. So the rank_features() function // The first column is a list of the features in order of decreasing goodness. So the feature ranking function
// is telling us that the samples[i](0) and samples[i](1) (i.e. the x and y) features are the best two. Then // is telling us that the samples[i](0) and samples[i](1) (i.e. the x and y) features are the best two. Then
// after that the next best feature is the samples[i](3) (i.e. the y corrupted by noise) and finally the worst // after that the next best feature is the samples[i](3) (i.e. the y corrupted by noise) and finally the worst
// feature is the one that is just random noise. So in this case rank_features did exactly what we would // feature is the one that is just random noise. So in this case the feature ranking did exactly what we would
// intuitively expect. // intuitively expect.
...@@ -132,10 +118,10 @@ int main() ...@@ -132,10 +118,10 @@ int main()
// indicate a larger separation. // indicate a larger separation.
// So to break it down a little more. // So to break it down a little more.
// 1 0.514254 <-- class separation of feature 1 all by itself // 0 0.810087 <-- class separation of feature 0 all by itself
// 0 0.810668 <-- class separation of feature 1 and 0 // 1 1 <-- class separation of feature 0 and 1
// 3 1 <-- class separation of feature 1, 0, and 3 // 3 0.873991 <-- class separation of feature 0, 1, and 3
// 2 0.994169 <-- class separation of feature 1, 0, 3, and 2 // 2 0.668913 <-- class separation of feature 0, 1, 3, and 2
} }
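The comments above read the output as a greedy ranking: each row adds one feature, and the second column scores the class separation of all features chosen so far. The sketch below illustrates that idea using only the standard library. It is not dlib's implementation: `separation()`, `greedy_rank()`, and `make_toy_data()` are hypothetical names, the score is a simple centroid-distance-over-spread ratio rather than dlib's kernel-based measure, the scores are not normalized, and the toy data is linearly separable (unlike the circular data in the example) so that a linear score is meaningful.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using sample = std::vector<double>;

// Hypothetical separation score (NOT dlib's kernel-based measure): distance
// between the two class centroids divided by the sum of the per-class RMS
// spreads, computed using only the features listed in `subset`.
double separation(const std::vector<sample>& x, const std::vector<int>& y,
                  const std::vector<std::size_t>& subset)
{
    assert(x.size() == y.size());
    const std::size_t d = subset.size();
    std::vector<double> mean_pos(d, 0.0), mean_neg(d, 0.0);
    double n_pos = 0, n_neg = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::vector<double>& m = (y[i] > 0) ? mean_pos : mean_neg;
        ((y[i] > 0) ? n_pos : n_neg) += 1;
        for (std::size_t k = 0; k < d; ++k) m[k] += x[i][subset[k]];
    }
    assert(n_pos > 0 && n_neg > 0);

    double dist2 = 0;  // squared distance between the class centroids
    for (std::size_t k = 0; k < d; ++k) {
        mean_pos[k] /= n_pos;
        mean_neg[k] /= n_neg;
        const double diff = mean_pos[k] - mean_neg[k];
        dist2 += diff * diff;
    }

    double ss_pos = 0, ss_neg = 0;  // summed squared distances to own centroid
    for (std::size_t i = 0; i < x.size(); ++i) {
        const std::vector<double>& m = (y[i] > 0) ? mean_pos : mean_neg;
        double& ss = (y[i] > 0) ? ss_pos : ss_neg;
        for (std::size_t k = 0; k < d; ++k) {
            const double diff = x[i][subset[k]] - m[k];
            ss += diff * diff;
        }
    }
    const double spread = std::sqrt(ss_pos / n_pos) + std::sqrt(ss_neg / n_neg);
    return std::sqrt(dist2) / (spread + 1e-12);  // small term avoids division by zero
}

// Greedy forward selection: at each step add the unused feature that gives
// the highest separation together with the features already chosen.
// Returns (feature index, separation of the cumulative feature set) pairs.
std::vector<std::pair<std::size_t, double>>
greedy_rank(const std::vector<sample>& x, const std::vector<int>& y)
{
    if (x.empty()) return {};
    const std::size_t nfeat = x[0].size();
    std::vector<bool> used(nfeat, false);
    std::vector<std::size_t> chosen;
    std::vector<std::pair<std::size_t, double>> ranking;
    for (std::size_t step = 0; step < nfeat; ++step) {
        std::size_t best = nfeat;
        double best_score = -1.0;
        for (std::size_t f = 0; f < nfeat; ++f) {
            if (used[f]) continue;
            chosen.push_back(f);
            const double s = separation(x, y, chosen);
            chosen.pop_back();
            if (s > best_score) { best_score = s; best = f; }
        }
        used[best] = true;
        chosen.push_back(best);
        ranking.emplace_back(best, best_score);
    }
    return ranking;
}

// Hypothetical toy data: feature 0 separates the classes cleanly, feature 1
// has the same mean in both classes (noise), feature 2 separates them but
// with a larger spread than feature 0.
std::pair<std::vector<sample>, std::vector<int>> make_toy_data()
{
    std::vector<sample> x = {
        { 1.0,  0.5,  1.5}, { 1.2, -0.5,  0.5}, { 0.8,  0.5,  1.5}, { 1.0, -0.5,  0.5},
        {-1.0,  0.5, -1.5}, {-1.2, -0.5, -0.5}, {-0.8, -0.5, -0.5}, {-1.0,  0.5, -1.5}};
    std::vector<int> y = {+1, +1, +1, +1, -1, -1, -1, -1};
    return {x, y};
}
```

On this toy data, `greedy_rank` picks feature 0 first, then feature 2, and leaves the noise feature 1 for last. The score drops at each step because every added feature is less informative than the ones already chosen, which mirrors how the second column in the output above falls once the noisy features are added.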