Commit ff27f789, authored Jun 08, 2013 by Davis King

Added svm-rank python example

parent 2f34594f

Showing 1 changed file with 110 additions and 0 deletions.

python_examples/svm_rank.py (new file, mode 100755)
#!/usr/bin/python
# The contents of this file are in the public domain. See LICENSE_FOR_EXAMPLE_PROGRAMS.txt
#
#
# This is an example illustrating the use of the SVM-Rank tool from the dlib C++
# Library. This is a tool useful for learning to rank objects. For example,
# you might use it to learn to rank web pages in response to a user's query.
# The idea is to rank the most relevant pages higher than non-relevant pages.
#
# In this example, we will create a simple test dataset and show how to learn a
# ranking function from it. The purpose of the function will be to give
# "relevant" objects higher scores than "non-relevant" objects. The idea is
# that you use this score to order the objects so that the most relevant objects
# come to the top of the ranked list.
#
# COMPILING THE DLIB PYTHON INTERFACE
# You need to compile the dlib python interface before you can use this file.
# To do this, run compile_dlib_python_module.bat. This should work on any
# operating system so long as you have CMake and boost-python installed. On
# Ubuntu, this can be done easily by running the command:
# sudo apt-get install libboost-python-dev cmake
import dlib
# Now let's make some testing data. To make it really simple, let's suppose that
# we are ranking 2D vectors and that vectors with positive values in the first
# dimension should rank higher than other vectors. So what we do is make
# examples of relevant (i.e. high ranking) and non-relevant (i.e. low ranking)
# vectors and store them into a ranking_pair object like so:
data = dlib.ranking_pair()
data.relevant.append(dlib.vector([1, 0]))
data.nonrelevant.append(dlib.vector([0, 1]))
# Now that we have some data, we can use a machine learning method to learn a
# function that will give high scores to the relevant vectors and low scores to
# the non-relevant vectors.
trainer = dlib.svm_rank_trainer()
# Note that the trainer object has some parameters that control how it behaves.
# For example, since this is the SVM-Rank algorithm it has a C parameter that
# controls the trade-off between trying to fit the training data exactly or
# selecting a "simpler" solution which might generalize better.
trainer.c = 10
# So let's do the training.
rank = trainer.train(data)
# Now if you call rank on a vector it will output a ranking score. In
# particular, the ranking score for relevant vectors should be larger than the
# score for non-relevant vectors.
print "ranking score for a relevant vector: ", rank(data.relevant[0])
print "ranking score for a non-relevant vector: ", rank(data.nonrelevant[0])
# These output the following:
# ranking score for a relevant vector: 0.5
# ranking score for a non-relevant vector: -0.5
# If we want an overall measure of ranking accuracy we can compute the ordering
# accuracy and mean average precision values by calling test_ranking_function().
# Here, the ordering accuracy tells us how often a relevant vector was ranked
# ahead of a non-relevant vector. The call returns 1 for both metrics,
# indicating that the rank function outputs a perfect ranking.
print dlib.test_ranking_function(rank, data)
# We can also see the ranking weights:
print "weights: \n", rank.weights
# In this case they are:
# 0.5
# -0.5
# In the above example, our data contains just two sets of objects: the
# relevant set and the non-relevant set. The trainer is attempting to find a
# ranking function that gives every relevant vector a higher score than every
# non-relevant vector. Sometimes what you want to do is a little more complex
# than this.
#
# For example, in the web page ranking example we have to rank pages based on a
# user's query. In this case, each query will have its own set of relevant and
# non-relevant documents. What might be relevant to one query may well be
# non-relevant to another. So in this case we don't have a single global set of
# relevant web pages and another set of non-relevant web pages.
#
# To handle cases like this, we can simply give multiple ranking_pair instances
# to the trainer. Each ranking_pair then represents the
# relevant/non-relevant sets for a particular query. An example is shown below
# (for simplicity, we reuse our data from above to make 4 identical "queries").
queries = dlib.ranking_pairs()
queries.append(data)
queries.append(data)
queries.append(data)
queries.append(data)
# We can train just as before.
rank = trainer.train(queries)
# Now that we have multiple ranking_pair instances, we can also use
# cross_validate_ranking_trainer(). This performs cross-validation by splitting
# the queries up into folds. That is, it lets the trainer train on a subset of
# ranking_pair instances and tests on the rest. It does this over 4 different
# splits and returns the overall ranking accuracy based on the held out data.
# Just like test_ranking_function(), it reports both the ordering accuracy and
# mean average precision.
print "cross validation results: ", dlib.cross_validate_ranking_trainer(trainer, queries, 4)
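
# A side note on the weights printed above: SVM-Rank learns a linear function,
# so the ranking score is just the dot product of the learned weight vector
# with the input vector. The following is a minimal standalone sketch (plain
# Python, no dlib) showing how the reported weights [0.5, -0.5] reproduce the
# scores seen earlier; linear_rank_score is a hypothetical helper, not part of
# the dlib API.

```python
def linear_rank_score(weights, x):
    # Score a vector as the dot product with the learned weight vector.
    return sum(w * xi for w, xi in zip(weights, x))

weights = [0.5, -0.5]   # the weights reported by the example above
relevant = [1, 0]       # the "relevant" training vector
nonrelevant = [0, 1]    # the "non-relevant" training vector

print(linear_rank_score(weights, relevant))     # 0.5
print(linear_rank_score(weights, nonrelevant))  # -0.5
```

# As expected, the relevant vector scores higher than the non-relevant one,
# matching the output of rank() in the example.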