Made extract_fhog_features() faster by using SIMD instructions. Also added an
option to zero pad the borders of the output so it's easier to filter.

Commit cbc79bcf authored Nov 07, 2013 by Davis King

Showing with 65 additions and 15 deletions

fhog.h dlib/image_transforms/fhog.h +0 -0

fhog_abstract.h dlib/image_transforms/fhog_abstract.h +65 -15

--- a/dlib/image_transforms/fhog.h
+++ b/dlib/image_transforms/fhog.h
--- a/dlib/image_transforms/fhog_abstract.h
+++ b/dlib/image_transforms/fhog_abstract.h
@@ -20,11 +20,15 @@ namespace dlib
    void extract_fhog_features(
        const image_type& img, 
        array2d<matrix<T,31,1>,mm>& hog, 
-        int cell_size = 8
+        int cell_size = 8,
+        int filter_rows_padding = 1,
+        int filter_cols_padding = 1
    );
    /*!
        requires
            - cell_size > 0
+            - filter_rows_padding > 0
+            - filter_cols_padding > 0
            - in_image_type  == is an implementation of array2d/array2d_kernel_abstract.h
            - img contains some kind of pixel type. 
              (i.e. pixel_traits<typename in_image_type::type> is defined)
@@ -40,11 +44,29 @@ namespace dlib
            - The input image is broken into cells that are cell_size by cell_size pixels
              and within each cell we compute a 31 dimensional FHOG vector.  This vector
              describes the gradient structure within the cell.  
-            - #hog.nr() is approximately equal to img.nr()/cell_size.
-            - #hog.nc() is approximately equal to img.nc()/cell_size.
+            - A common task is to convolve each channel of the hog image with a linear
+              filter.  This is made more convenient if the contents of #hog includes extra
+              rows and columns of zero padding along the borders.  This extra padding
+              allows for more efficient convolution code since the code does not need to
+              perform expensive boundary checking.  Therefore, you can set
+              filter_rows_padding and filter_cols_padding to indicate the size of the
+              filter you wish to use and this function will ensure #hog has the appropriate
+              extra zero padding along the borders.  In particular, it will include the
+              following extra padding:
+                - (filter_rows_padding-1)/2 extra rows of zeros on the top of #hog.
+                - (filter_cols_padding-1)/2 extra columns of zeros on the left of #hog.
+                - filter_rows_padding/2 extra rows of zeros on the bottom of #hog.
+                - filter_cols_padding/2 extra columns of zeros on the right of #hog.
+              Therefore, the extra padding is done such that functions like
+              spatially_filter_image() apply their filters to the entire content containing
+              area of a hog image (note that you should use the following planar version of
+              extract_fhog_features() instead of the interlaced version if you want to use
+              spatially_filter_image() on a hog image).
+            - #hog.nr() is approximately equal to img.nr()/cell_size + filter_rows_padding-1.
+            - #hog.nc() is approximately equal to img.nc()/cell_size + filter_cols_padding-1.
            - for all valid r and c:
-                - #hog[r][c] == the FHOG vector describing the cell centered at the pixel
-                  location fhog_to_image(point(c,r),cell_size) in img.
+                - #hog[r][c] == the FHOG vector describing the cell centered at the pixel location 
+                  fhog_to_image(point(c,r),cell_size,filter_rows_padding,filter_cols_padding) in img.
    !*/

 // ----------------------------------------------------------------------------------------
@@ -58,11 +80,15 @@ namespace dlib
    void extract_fhog_features(
        const image_type& img, 
        dlib::array<array2d<T,mm1>,mm2>& hog, 
-        int cell_size = 8
+        int cell_size = 8,
+        int filter_rows_padding = 1,
+        int filter_cols_padding = 1
    );
    /*!
        requires
            - cell_size > 0
+            - filter_rows_padding > 0
+            - filter_cols_padding > 0
            - in_image_type  == is an implementation of array2d/array2d_kernel_abstract.h
            - img contains some kind of pixel type. 
              (i.e. pixel_traits<typename in_image_type::type> is defined)
@@ -83,11 +109,15 @@ namespace dlib

    inline point image_to_fhog (
        point p,
-        int cell_size = 8
+        int cell_size = 8,
+        int filter_rows_padding = 1,
+        int filter_cols_padding = 1
    );
    /*!
        requires
            - cell_size > 0
+            - filter_rows_padding > 0
+            - filter_cols_padding > 0
        ensures
            - When using extract_fhog_features(), each FHOG cell is extracted from a
              certain region in the input image.  image_to_fhog() returns the identity of
@@ -98,56 +128,75 @@ namespace dlib
              might not have corresponding feature locations.  E.g. border points or points
              outside the image.  In these cases the returned point will be outside the
              input image.
+            - Note that you should use the same values of cell_size, filter_rows_padding,
+              and filter_cols_padding that you used with extract_fhog_features().
    !*/

 // ----------------------------------------------------------------------------------------

    inline rectangle image_to_fhog (
        const rectangle& rect,
-        int cell_size = 8
+        int cell_size = 8,
+        int filter_rows_padding = 1,
+        int filter_cols_padding = 1
    );
    /*!
        requires
            - cell_size > 0
+            - filter_rows_padding > 0
+            - filter_cols_padding > 0
        ensures
            - maps a rectangle from image space to fhog space.  In particular this function returns:
-              rectangle(image_to_fhog(rect.tl_corner(),cell_size), image_to_fhog(rect.br_corner(),cell_size))
+              rectangle(image_to_fhog(rect.tl_corner(),cell_size,filter_rows_padding,filter_cols_padding), 
+                        image_to_fhog(rect.br_corner(),cell_size,filter_rows_padding,filter_cols_padding))
    !*/

 // ----------------------------------------------------------------------------------------

    inline point fhog_to_image (
        point p,
-        int cell_size = 8
+        int cell_size = 8,
+        int filter_rows_padding = 1,
+        int filter_cols_padding = 1
    );
    /*!
        requires
            - cell_size > 0
+            - filter_rows_padding > 0
+            - filter_cols_padding > 0
        ensures
            - Maps a pixel in a FHOG image (produced by extract_fhog_features()) back to the
              corresponding original input pixel.  Note that since FHOG images are
              spatially downsampled by aggregation into cells the mapping is not totally
              invertible.  Therefore, the returned location will be the center of the cell
              in the original image that contained the FHOG vector at position p.  Moreover,
-              cell_size should be set to the value used by the call to extract_fhog_features().
+              cell_size, filter_rows_padding, and filter_cols_padding should be set to the
+              values used by the call to extract_fhog_features().
            - Mapping from fhog space to image space is an invertible transformation.  That
-              is, for any point P we have P == image_to_fhog(fhog_to_image(P,cell_size),cell_size).
+              is, for any point P we have P == image_to_fhog(fhog_to_image(P,cell_size,filter_rows_padding,filter_cols_padding),
+                                                             cell_size,filter_rows_padding,filter_cols_padding).
    !*/

 // ----------------------------------------------------------------------------------------

    inline rectangle fhog_to_image (
        const rectangle& rect,
-        int cell_size = 8
+        int cell_size = 8,
+        int filter_rows_padding = 1,
+        int filter_cols_padding = 1
    );
    /*!
        requires
            - cell_size > 0
+            - filter_rows_padding > 0
+            - filter_cols_padding > 0
        ensures
            - maps a rectangle from fhog space to image space.  In particular this function returns:
-              rectangle(fhog_to_image(rect.tl_corner(),cell_size), fhog_to_image(rect.br_corner(),cell_size))
+              rectangle(fhog_to_image(rect.tl_corner(),cell_size,filter_rows_padding,filter_cols_padding), 
+                        fhog_to_image(rect.br_corner(),cell_size,filter_rows_padding,filter_cols_padding))
            - Mapping from fhog space to image space is an invertible transformation.  That
-              is, for any rectangle R we have R == image_to_fhog(fhog_to_image(R,cell_size),cell_size).
+              is, for any rectangle R we have R == image_to_fhog(fhog_to_image(R,cell_size,filter_rows_padding,filter_cols_padding),
+                                                                 cell_size,filter_rows_padding,filter_cols_padding).
    !*/

 // ----------------------------------------------------------------------------------------
@@ -187,6 +236,7 @@ namespace dlib
    /*!
        requires
            - cell_draw_size > 0
+            - hog.size() == 31
        ensures
            - This function just converts the given hog object into an array<array2d<T>>
              and passes it to the above draw_fhog() routine and returns the results.
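
The padding rule documented for extract_fhog_features() in this change can be checked with a little integer arithmetic. The following is only a sketch of that arithmetic, not dlib's implementation: the fhog_padding struct and compute_fhog_padding() are hypothetical names introduced here for illustration, and only the four formulas come from the spec text above.

```cpp
#include <cassert>

// Hypothetical helper type (not part of dlib): the number of zero rows and
// columns the spec says are added on each side of #hog.
struct fhog_padding
{
    int top;
    int left;
    int bottom;
    int right;
};

// Hypothetical helper (not part of dlib): evaluates the padding formulas
// listed in the extract_fhog_features() spec.  Integer division truncates,
// so an even filter size puts the extra row or column on the bottom/right.
inline fhog_padding compute_fhog_padding (
    int filter_rows_padding,
    int filter_cols_padding
)
{
    // Same preconditions as the "requires" clause in the spec.
    assert(filter_rows_padding > 0);
    assert(filter_cols_padding > 0);

    fhog_padding pad;
    pad.top    = (filter_rows_padding-1)/2;
    pad.left   = (filter_cols_padding-1)/2;
    pad.bottom = filter_rows_padding/2;
    pad.right  = filter_cols_padding/2;
    return pad;
}
```

For a filter with 5 rows and 4 columns this gives 2 rows on top, 2 on the bottom, 1 column on the left, and 2 on the right. The totals are filter_rows_padding-1 and filter_cols_padding-1, which matches the extra terms in the documented #hog.nr() and #hog.nc(). With the default value of 1 for both arguments no padding is added.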