Lecture 5 - Compound Descriptors and Metrics
Terms in this set (18)
How do Deformable Part Models work?
1) A coarse global model
2) A fixed number of part models with a flexible spatial arrangement
• Detection is done on a coarse pattern
• Constellations are used as a verification step
How does Bag-of-features work?
Bag-of-features is a vector of occurrence counts of a vocabulary of local image features.
What are visual words and what are they used for?
• Quickly index large datasets
• Completely disregard spatial relationships among features
How do visual words work?
1) Extract local patches of the images in the training set
2) Create descriptor vectors of the patches
3) Cluster the descriptor vectors. The clustering is done in whitened space.
4) Give each cluster a "visual word"; it is not an English word but a word for the image.
5) When a new image is checked, extract patches, create descriptors, and assign each descriptor to a cluster. The assignments are then counted (bag-of-words) to form a histogram.
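Step 5 above can be sketched as follows. This is a minimal illustration, assuming the vocabulary (the cluster centres from step 3) has already been computed; the function names and the toy data are hypothetical and not part of the lecture.

```python
import numpy as np

def assign_to_words(descriptors, vocabulary):
    """Return the index of the nearest visual word for each descriptor."""
    # Squared Euclidean distance between every descriptor and every cluster centre
    diffs = descriptors[:, None, :] - vocabulary[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    return np.argmin(sq_dists, axis=1)

def bag_of_words(descriptors, vocabulary):
    """Count how often each visual word occurs among the descriptors."""
    words = assign_to_words(descriptors, vocabulary)
    return np.bincount(words, minlength=len(vocabulary))

# Hypothetical toy data: 3 visual words in a 2-D descriptor space
vocabulary = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [4.8, 5.1], [5.2, 4.9]])
histogram = bag_of_words(descriptors, vocabulary)  # one count per visual word
```

The histogram has one entry per visual word, so images with different numbers of patches still yield vectors of the same length.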
Explain Spatial pyramids
• Very similar to bag-of-words, but instead of computing one histogram for the whole image, the image is split into grids of several different cell sizes and a histogram is computed per cell. This adds coarse spatial information.
• Larger grid cells are down-weighted to compensate for the higher likelihood of matches there (divide by width)
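A rough sketch of a spatial pyramid over visual words. It assumes feature positions normalised to [0, 1), grids of 1x1, 2x2, 4x4 cells, and a weighting that divides each level's counts by the cell width, which is one reading of the "divide by width" note; all names here are hypothetical.

```python
import numpy as np

def spatial_pyramid(positions, words, vocab_size, levels=3):
    """Concatenate per-cell word histograms over grids of 1x1, 2x2, 4x4, ...

    positions: (N, 2) array of (x, y) in [0, 1); words: (N,) integer word indices.
    """
    positions = np.asarray(positions, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for level in range(levels):
        cells = 2 ** level              # cells per side at this level
        cell_width = 1.0 / cells
        # Grid cell of each feature along x and y
        cx = np.minimum((positions[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((positions[:, 1] * cells).astype(int), cells - 1)
        hist = np.zeros((cells, cells, vocab_size))
        np.add.at(hist, (cy, cx, words), 1)  # accumulate counts per (cell, word)
        # Assumed weighting: divide by cell width, so large cells count less
        parts.append(hist.ravel() / cell_width)
    return np.concatenate(parts)

# Two features in opposite corners, one of each visual word
pyramid = spatial_pyramid([[0.1, 0.1], [0.9, 0.9]], [0, 1], vocab_size=2, levels=2)
```

With two levels and a vocabulary of two words, the result has 2 * (1 + 4) = 10 entries: the global histogram followed by the four cell histograms.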
What is Spatial pyramid pooling used for?
To avoid the need to resample the input image.
How does Spatial pyramid pooling work?
1) The output of the last convolutional layer is input to the spatial pyramid pooling layer.
2) The feature maps from the last convolutional layer are then split into multiple copies, where each copy is divided into a predefined number of cells, a.k.a. bins.
3) For each copy (level of the spatial pyramid), each bin is max-pooled and the results are concatenated into a vector. The next copy (level) is then concatenated onto the same vector. This gives control over the output dimensions of the spatial pyramid pooling, regardless of the input size.
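The three steps above can be sketched in NumPy. This is an illustration only, assuming feature maps of shape (channels, height, width) and pyramid levels of 1x1, 2x2, and 4x4 bins; the bin-edge rounding is one possible choice, not necessarily the one used in the lecture.

```python
import numpy as np

def spatial_pyramid_pool(feature_maps, bins_per_level=(1, 2, 4)):
    """Max-pool (C, H, W) feature maps into a fixed-length vector."""
    channels, height, width = feature_maps.shape
    pooled = []
    for bins in bins_per_level:
        # Bin edges are chosen so the bins cover the whole map for any H and W
        row_edges = np.linspace(0, height, bins + 1)
        col_edges = np.linspace(0, width, bins + 1)
        for i in range(bins):
            r0, r1 = int(np.floor(row_edges[i])), int(np.ceil(row_edges[i + 1]))
            for j in range(bins):
                c0, c1 = int(np.floor(col_edges[j])), int(np.ceil(col_edges[j + 1]))
                # One max value per channel for this bin
                pooled.append(feature_maps[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)

# Two inputs with different spatial sizes give vectors of the same length
small = np.arange(2 * 6 * 8, dtype=float).reshape(2, 6, 8)
large = np.arange(2 * 13 * 9, dtype=float).reshape(2, 13, 9)
vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
```

The output length is channels * (1 + 4 + 16) regardless of height and width, which is why the input image does not need to be resampled.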
If you have a descriptor q of a query image, how do you know which prototype in memory is most likely to correspond to the same world object?
• By using Descriptor distances
List some descriptor distance metrics for histogram-based features
1) Chi-2 distance
2) Square root matching
3) Histogram intersection
4) Earth Mover's distance
5) Pyramid Match Kernel
What discrete distribution is most typical for histograms?
• Poisson distribution
• It can be approximated with a continuous Gaussian distribution if the expected value is large (e.g. 1000)
Explain some properties of Square root matching
• Close approximation to Chi-2
• Faster if SQRT is pre-computed
Explain some properties of Histogram intersection
• Is a similarity metric
• Assumes independence between bins
• Is an approximation
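As a small illustration, histogram intersection is commonly computed as the sum of the bin-wise minima. The sketch below assumes both histograms are normalised to sum to 1, so the score lies between 0 and 1; the function name is hypothetical.

```python
import numpy as np

def histogram_intersection(p, q):
    """Similarity between two histograms: sum of bin-wise minima."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.minimum(p, q).sum())

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.1, 0.6, 0.3])
similarity = histogram_intersection(p, q)
```

Each bin contributes on its own, which reflects the independence assumption listed above. A larger value means more similar histograms, so the best prototype is the one with the highest score.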
Explain some properties of Earth Mover's Distance
Distance = Cost of moving values in p to q
Cost = Amount*Distance
• In histograms, neighbouring bins are typically correlated
• Distance between two probability distributions over a region D.
• Think of the distributions as two different ways of piling up a certain amount of dirt over the region D. The Earth Mover's Distance is the minimum cost of turning one pile into the other.
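For the special case of two 1-D histograms with equal total mass and unit distance between neighbouring bins, the minimum moving cost has a closed form: the sum of absolute differences of the cumulative sums. The sketch below covers only that case; general ground distances require solving a transport problem.

```python
import numpy as np

def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms of equal mass."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if not np.isclose(p.sum(), q.sum()):
        raise ValueError("histograms must have the same total mass")
    # The running surplus at each bin is the amount of dirt carried to the next bin
    return float(np.abs(np.cumsum(p - q)).sum())

near = emd_1d([1, 0, 0], [0, 1, 0])  # mass moved one bin
far = emd_1d([1, 0, 0], [0, 0, 1])   # mass moved two bins
```

Moving mass to a neighbouring bin is cheaper than moving it further away, which is how the distance takes the correlation between neighbouring bins into account.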
Explain some properties of Pyramid Match Kernel
• Approximation of Earth Mover's Distance
Explain some properties of Ratio Score
• If we have best matches for descriptors q1 and q2 in the image, we need to know which one is best.
• Incorporates the risk of misclassification
• Ratio score scores the match of q1 as the ratio between the best match and the second best match. If the value is large, there is a large distance between the best and the second best match, so the match is more certain than if the ratio were small.
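A minimal sketch of a ratio score, assuming Euclidean descriptor distances and defining the score as the second-best distance divided by the best distance, so that a large value means a confident match as described above. Lowe's ratio test uses the inverse ratio; the convention and names here are assumptions.

```python
import numpy as np

def ratio_score(query, prototypes):
    """Return (index of best prototype, second-best distance / best distance)."""
    dists = np.linalg.norm(np.asarray(prototypes) - np.asarray(query), axis=1)
    order = np.argsort(dists)
    best, second = order[0], order[1]
    # Guard against division by zero for an exact match
    return int(best), float(dists[second] / max(dists[best], 1e-12))

prototypes = np.array([[1.0, 0.0], [4.0, 0.0], [0.0, 2.0]])
best_index, score = ratio_score(np.array([0.0, 0.0]), prototypes)
```

To choose between the matches for q1 and q2, compare their scores and prefer the one with the larger value.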
What is metric learning used for?
• To normalize feature dimensions
• Find more compact descriptors, e.g. with PCA
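Both uses can be illustrated with PCA whitening: project the descriptors onto the top k principal components (more compact) and rescale each component to unit variance (normalised dimensions). This is a sketch of one simple technique, not necessarily the method covered in the lecture; the function name and random data are hypothetical.

```python
import numpy as np

def pca_whiten(X, k):
    """Project rows of X onto the top-k principal components with unit variance."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # centre the data
    # Rows of Vt are principal directions; S holds the singular values
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scale = S[:k] / np.sqrt(len(X) - 1)      # standard deviation along each component
    return (Xc @ Vt[:k].T) / scale

rng = np.random.default_rng(0)
# Hypothetical descriptors whose dimensions have very different scales
X = rng.normal(size=(200, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])
Z = pca_whiten(X, k=2)
```

After the transform, plain Euclidean distance between rows of Z no longer lets the large-scale dimensions dominate.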
What is the equation for Chi-2 distance?
\chi^2(q, p_k) = \sum_{l=1}^{D} \frac{(p_{kl} - q_{l})^2}{p_{kl} + q_{l}}
What is the equation for Square root matching?
d_{1/2}(q, p_k)^2 = \sum_{l=1}^{D} (\sqrt{p_{kl}} - \sqrt{q_{l}})^2
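The Chi-2 and square root matching distances can be sketched as below, assuming non-negative histograms q and p_k of equal length. Skipping bins that are empty in both histograms and the `eps` threshold are choices made here, not part of the lecture. The square root version takes already square-rooted histograms, which is where the speed-up from pre-computation comes from.

```python
import numpy as np

def chi2_distance(q, p, eps=1e-12):
    """Chi-2 distance: sum over bins of (p - q)^2 / (p + q)."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    denom = p + q
    mask = denom > eps  # skip bins that are empty in both histograms
    return float(np.sum((p[mask] - q[mask]) ** 2 / denom[mask]))

def sqrt_matching_distance(sqrt_q, sqrt_p):
    """Squared Euclidean distance between pre-computed square-rooted histograms."""
    diff = np.asarray(sqrt_p, dtype=float) - np.asarray(sqrt_q, dtype=float)
    return float(np.sum(diff ** 2))

q = np.array([0.25, 0.75, 0.0])
p_k = np.array([0.5, 0.5, 0.0])
chi2 = chi2_distance(q, p_k)
d_sqrt = sqrt_matching_distance(np.sqrt(q), np.sqrt(p_k))
```

In a retrieval setting the square roots of all stored prototypes would be computed once, leaving only a subtraction, a square, and a sum per comparison.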