Commit c5d4b20c authored by Davis King

Cleaned up a few things

--HG--
extra : convert_revision : svn%3Afdd8eb12-d10e-0410-9acb-85c331704f74/trunk%403355
parent e50e904a
@@ -13,7 +13,7 @@
 This means they are all simple linear algorithms that have been formulated such
 that the only way they look at the data given by a user is via dot products between
 the data samples. These algorithms are made more useful via the application of the
-so called kernel trick. This trick is to replace the dot product with a user
+so-called kernel trick. This trick is to replace the dot product with a user
 supplied function which takes two samples and returns a real number. This function
 is the kernel that is required by so many algorithms. The most basic kernel is the
 linear_kernel which is simply a normal dot product. However, more interesting
@@ -47,14 +47,14 @@
 The empirical_kernel_map is useful because it is often difficult to formulate an
 algorithm in a way that uses only dot products. So the empirical_kernel_map lets
-non-experts effectively kernelize any algorithm they like by using this object
-during a preprocessing step. However, it should be noted that the algorithm
-is only practical when used with at most a few thousand basis samples. Fortunately,
-most datasets live in subspaces that are relatively low dimensional. So for these
-datasets, using the empirical_kernel_map is practical assuming a reasonable set of
-basis samples can be selected by the user. To help with this dlib supplies the
-linearly_independent_subset_finder. Some people also find that just picking a random
-subset of their data and using that as a basis set is fine as well.
+us easily kernelize any algorithm we like by using this object during a preprocessing
+step. However, it should be noted that the algorithm is only practical when used
+with at most a few thousand basis samples. Fortunately, most datasets live in
+subspaces that are relatively low dimensional. So for these datasets, using the
+empirical_kernel_map is practical assuming an appropriate set of basis samples can be
+selected by the user. To help with this dlib supplies the linearly_independent_subset_finder.
+Some people also find that just picking a random subset of their data and using that
+as a basis set is fine as well.
@@ -91,6 +91,8 @@ void generate_concentric_circles (
     const int num_points
 );
 /*!
+    requires
+        - num_points > 0
     ensures
         - generates two circles centered at the point (0,0), one of radius 1 and
           the other of radius 5. These points are stored into samples. labels will
@@ -130,7 +132,7 @@ int main()
     // Here we create an empirical_kernel_map using all of our data samples as basis samples.
     cout << "\n\nBuilding an empirical_kernel_map " << samples.size() << " basis samples." << endl;
     ekm.load(kern, samples);
-    cout << "Test the empirical_kernel_map when it is loaded with every sample." << endl;
+    cout << "Test the empirical_kernel_map when loaded with every sample." << endl;
     test_empirical_kernel_map(samples, labels, ekm);
@@ -153,7 +155,7 @@ int main()
     // selected using the linearly_independent_subset_finder.
     cout << "\n\nBuilding an empirical_kernel_map with " << lisf.dictionary_size() << " basis samples." << endl;
     ekm.load(kern, lisf.get_dictionary());
-    cout << "Test the empirical_kernel_map when it is loaded with samples from the lisf object." << endl;
+    cout << "Test the empirical_kernel_map when loaded with samples from the lisf object." << endl;
     test_empirical_kernel_map(samples, labels, ekm);
@@ -189,28 +191,29 @@ void test_empirical_kernel_map (
     const matrix<double> new_kernel_matrix = kernel_matrix(linear_kernel<sample_type>(), projected_samples);
     cout << "Max kernel matrix error: " << max(abs(normal_kernel_matrix - new_kernel_matrix)) << endl;
-    cout << "mean kernel matrix error: " << mean(abs(normal_kernel_matrix - new_kernel_matrix)) << endl;
+    cout << "Mean kernel matrix error: " << mean(abs(normal_kernel_matrix - new_kernel_matrix)) << endl;
     /*
         Example outputs from these cout statements.
         For the case where we use all samples as basis samples:
             Max kernel matrix error: 2.73115e-14
-            mean kernel matrix error: 6.19125e-15
+            Mean kernel matrix error: 6.19125e-15
         For the case where we use only 20 samples as basis samples:
             Max kernel matrix error: 0.0154466
-            mean kernel matrix error: 0.000753427
+            Mean kernel matrix error: 0.000753427
-        Note that if we use enough basis samples to perfectly span the space of input samples
-        then we get errors that are essentially just rounding noise (Moreover, using all the
-        samples is always enough since they are always within their own span). Once we start
-        to use fewer basis samples we begin to get approximation error since the data doesn't
-        really lay exactly in a 20 dimensional subspace. But it is pretty close.
+        Note that if we use enough basis samples we can perfectly span the space of input samples.
+        In that case we get errors that are essentially just rounding noise (Moreover, using all the
+        samples is always enough since they are always within their own span). Once we start
+        to use fewer basis samples we may begin to get approximation error. In the second case we
+        used 20 and we can see that the data doesn't really lay exactly in a 20 dimensional subspace.
+        But it is pretty close.
     */

-    // Now lets do something more interesting. The below loop finds the centroids
+    // Now lets do something more interesting. The following loop finds the centroids
     // of the two classes of data.
     sample_type class1_center(ekm.out_vector_size());
     sample_type class2_center(ekm.out_vector_size());
...