Commit cbc79bcf authored by Davis King

Made extract_fhog_features() faster by using SIMD instructions. Also added an
option to zero pad the borders of the output so it's easier to filter.
parent 2895d425
@@ -20,11 +20,15 @@ namespace dlib
 void extract_fhog_features(
 const image_type& img,
 array2d<matrix<T,31,1>,mm>& hog,
-int cell_size = 8
+int cell_size = 8,
+int filter_rows_padding = 1,
+int filter_cols_padding = 1
 );
 /*!
 requires
 - cell_size > 0
+- filter_rows_padding > 0
+- filter_cols_padding > 0
 - in_image_type == is an implementation of array2d/array2d_kernel_abstract.h
 - img contains some kind of pixel type.
 (i.e. pixel_traits<typename in_image_type::type> is defined)
@@ -40,11 +44,29 @@ namespace dlib
 - The input image is broken into cells that are cell_size by cell_size pixels
 and within each cell we compute a 31 dimensional FHOG vector. This vector
 describes the gradient structure within the cell.
-- #hog.nr() is approximately equal to img.nr()/cell_size.
-- #hog.nc() is approximately equal to img.nc()/cell_size.
+- A common task is to convolve each channel of the hog image with a linear
+filter. This is made more convenient if the contents of #hog includes extra
+rows and columns of zero padding along the borders. This extra padding
+allows for more efficient convolution code since the code does not need to
+perform expensive boundary checking. Therefore, you can set
+filter_rows_padding and filter_cols_padding to indicate the size of the
+filter you wish to use and this function will ensure #hog has the appropriate
+extra zero padding along the borders. In particular, it will include the
+following extra padding:
+- (filter_rows_padding-1)/2 extra rows of zeros on the top of #hog.
+- (filter_cols_padding-1)/2 extra columns of zeros on the left of #hog.
+- filter_rows_padding/2 extra rows of zeros on the bottom of #hog.
+- filter_cols_padding/2 extra columns of zeros on the right of #hog.
+Therefore, the extra padding is done such that functions like
+spatially_filter_image() apply their filters to the entire content containing
+area of a hog image (note that you should use the following planar version of
+extract_fhog_features() instead of the interlaced version if you want to use
+spatially_filter_image() on a hog image).
+- #hog.nr() is approximately equal to img.nr()/cell_size + filter_rows_padding-1.
+- #hog.nc() is approximately equal to img.nc()/cell_size + filter_cols_padding-1.
 - for all valid r and c:
-- #hog[r][c] == the FHOG vector describing the cell centered at the pixel
-location fhog_to_image(point(c,r),cell_size) in img.
+- #hog[r][c] == the FHOG vector describing the cell centered at the pixel location
+fhog_to_image(point(c,r),cell_size,filter_rows_padding,filter_cols_padding) in img.
 !*/
 // ----------------------------------------------------------------------------------------
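The padding rule documented in this hunk can be checked with a small standalone sketch. It does not use dlib: `grid`, `hog_padding`, `compute_padding`, `zero_pad`, and `filter_valid` are hypothetical names introduced only for illustration, and a single feature plane is modeled as a vector of vectors. Assuming the padding amounts listed in the comment, a filter of the declared size can slide over the padded plane without any boundary checks and still produce one output per original cell.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one planar HOG channel; not a dlib type.
using grid = std::vector<std::vector<float>>;

struct hog_padding { int top, left, bottom, right; };

// Padding amounts exactly as listed in the documentation comment above.
inline hog_padding compute_padding(int filter_rows_padding, int filter_cols_padding)
{
    return { (filter_rows_padding - 1) / 2, (filter_cols_padding - 1) / 2,
             filter_rows_padding / 2,       filter_cols_padding / 2 };
}

// Surround a non-empty rectangular plane with zeros.
inline grid zero_pad(const grid& content, const hog_padding& p)
{
    const std::size_t nr = content.size(), nc = content[0].size();
    grid out(nr + p.top + p.bottom, std::vector<float>(nc + p.left + p.right, 0.0f));
    for (std::size_t r = 0; r < nr; ++r)
        for (std::size_t c = 0; c < nc; ++c)
            out[r + p.top][c + p.left] = content[r][c];
    return out;
}

// "Valid" cross-correlation: the filter window never leaves the padded plane,
// so the inner loops need no boundary tests. Assumes filt fits inside padded.
inline grid filter_valid(const grid& padded, const grid& filt)
{
    const std::size_t fr = filt.size(), fc = filt[0].size();
    const std::size_t nr = padded.size() - fr + 1, nc = padded[0].size() - fc + 1;
    grid out(nr, std::vector<float>(nc, 0.0f));
    for (std::size_t r = 0; r < nr; ++r)
        for (std::size_t c = 0; c < nc; ++c)
            for (std::size_t i = 0; i < fr; ++i)
                for (std::size_t j = 0; j < fc; ++j)
                    out[r][c] += padded[r + i][c + j] * filt[i][j];
    return out;
}
```

With integer division, the top and bottom amounts always sum to filter_rows_padding-1 (and likewise for columns), which matches the "+ filter_rows_padding-1" term in the documented output size. For example, a 2x3 plane padded for a 3x4 filter becomes 4x6, and filtering it returns a 2x3 result again.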
@@ -58,11 +80,15 @@ namespace dlib
 void extract_fhog_features(
 const image_type& img,
 dlib::array<array2d<T,mm1>,mm2>& hog,
-int cell_size = 8
+int cell_size = 8,
+int filter_rows_padding = 1,
+int filter_cols_padding = 1
 );
 /*!
 requires
 - cell_size > 0
+- filter_rows_padding > 0
+- filter_cols_padding > 0
 - in_image_type == is an implementation of array2d/array2d_kernel_abstract.h
 - img contains some kind of pixel type.
 (i.e. pixel_traits<typename in_image_type::type> is defined)
@@ -83,11 +109,15 @@ namespace dlib
 inline point image_to_fhog (
 point p,
-int cell_size = 8
+int cell_size = 8,
+int filter_rows_padding = 1,
+int filter_cols_padding = 1
 );
 /*!
 requires
 - cell_size > 0
+- filter_rows_padding > 0
+- filter_cols_padding > 0
 ensures
 - When using extract_fhog_features(), each FHOG cell is extracted from a
 certain region in the input image. image_to_fhog() returns the identity of
@@ -98,56 +128,75 @@ namespace dlib
 might not have corresponding feature locations. E.g. border points or points
 outside the image. In these cases the returned point will be outside the
 input image.
+- Note that you should use the same values of cell_size, filter_rows_padding,
+and filter_cols_padding that you used with extract_fhog_features().
 !*/
 // ----------------------------------------------------------------------------------------
 inline rectangle image_to_fhog (
 const rectangle& rect,
-int cell_size = 8
+int cell_size = 8,
+int filter_rows_padding = 1,
+int filter_cols_padding = 1
 );
 /*!
 requires
 - cell_size > 0
+- filter_rows_padding > 0
+- filter_cols_padding > 0
 ensures
 - maps a rectangle from image space to fhog space. In particular this function returns:
-rectangle(image_to_fhog(rect.tl_corner(),cell_size), image_to_fhog(rect.br_corner(),cell_size))
+rectangle(image_to_fhog(rect.tl_corner(),cell_size,filter_rows_padding,filter_cols_padding),
+image_to_fhog(rect.br_corner(),cell_size,filter_rows_padding,filter_cols_padding))
 !*/
 // ----------------------------------------------------------------------------------------
 inline point fhog_to_image (
 point p,
-int cell_size = 8
+int cell_size = 8,
+int filter_rows_padding = 1,
+int filter_cols_padding = 1
 );
 /*!
 requires
 - cell_size > 0
+- filter_rows_padding > 0
+- filter_cols_padding > 0
 ensures
 - Maps a pixel in a FHOG image (produced by extract_fhog_features()) back to the
 corresponding original input pixel. Note that since FHOG images are
 spatially downsampled by aggregation into cells the mapping is not totally
 invertible. Therefore, the returned location will be the center of the cell
 in the original image that contained the FHOG vector at position p. Moreover,
-cell_size should be set to the value used by the call to extract_fhog_features().
+cell_size, filter_rows_padding, and filter_cols_padding should be set to the
+values used by the call to extract_fhog_features().
 - Mapping from fhog space to image space is an invertible transformation. That
-is, for any point P we have P == image_to_fhog(fhog_to_image(P,cell_size),cell_size).
+is, for any point P we have P == image_to_fhog(fhog_to_image(P,cell_size,filter_rows_padding,filter_cols_padding),
+cell_size,filter_rows_padding,filter_cols_padding).
 !*/
 // ----------------------------------------------------------------------------------------
 inline rectangle fhog_to_image (
 const rectangle& rect,
-int cell_size = 8
+int cell_size = 8,
+int filter_rows_padding = 1,
+int filter_cols_padding = 1
 );
 /*!
 requires
 - cell_size > 0
+- filter_rows_padding > 0
+- filter_cols_padding > 0
 ensures
 - maps a rectangle from fhog space to image space. In particular this function returns:
-rectangle(fhog_to_image(rect.tl_corner(),cell_size), fhog_to_image(rect.br_corner(),cell_size))
+rectangle(fhog_to_image(rect.tl_corner(),cell_size,filter_rows_padding,filter_cols_padding),
+fhog_to_image(rect.br_corner(),cell_size,filter_rows_padding,filter_cols_padding))
 - Mapping from fhog space to image space is an invertible transformation. That
-is, for any rectangle R we have R == image_to_fhog(fhog_to_image(R,cell_size),cell_size).
+is, for any rectangle R we have R == image_to_fhog(fhog_to_image(R,cell_size,filter_rows_padding,filter_cols_padding),
+cell_size,filter_rows_padding,filter_cols_padding).
 !*/
 // ----------------------------------------------------------------------------------------
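The round-trip property stated in these comments (fhog to image and back is the identity, while image to fhog and back is not) can be illustrated with a toy one-dimensional model. This is not dlib's implementation: the real functions live in the collapsed part of this diff and apply border offsets that are not visible here. `floor_div`, `toy_image_to_cell`, and `toy_cell_to_image` are hypothetical names, and the sketch only assumes that cells are cell_size pixels wide and that (filter_padding-1)/2 padding cells precede the content.

```cpp
// Toy 1-D model of the cell <-> pixel mapping. NOT dlib's implementation.

// Integer division that rounds toward negative infinity, so pixels left of
// the origin land in negative cell indices instead of collapsing onto cell 0.
inline long floor_div(long a, long b)
{
    long q = a / b;
    if ((a % b != 0) && ((a < 0) != (b < 0)))
        --q;
    return q;
}

// Pixel coordinate -> cell index, shifted by the leading zero padding.
inline long toy_image_to_cell(long x, int cell_size, int filter_padding)
{
    return floor_div(x, cell_size) + (filter_padding - 1) / 2;
}

// Cell index -> pixel coordinate at the center of that cell.
inline long toy_cell_to_image(long c, int cell_size, int filter_padding)
{
    return (c - (filter_padding - 1) / 2) * cell_size + cell_size / 2;
}
```

Because every cell center falls inside its own cell, mapping a cell index to a pixel and back returns the same index. Going the other way loses information: with cell_size 8 and filter_padding 3, pixel 5 maps to cell 1, whose center is pixel 4. The same values of cell_size and the padding arguments have to be used in both directions, as the documentation notes.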
@@ -187,6 +236,7 @@ namespace dlib
 /*!
 requires
 - cell_draw_size > 0
+- hog.size() == 31
 ensures
 - This function just converts the given hog object into an array<array2d<T>>
 and passes it to the above draw_fhog() routine and returns the results.