Nearest Neighbor
STUDY
Flashcards
Learn
Write
Spell
Test
PLAY
Match
Gravity
Terms in this set (22)
eager learning
builds an explicit description of the target function from the whole training set
- pretty much all methods except nearest neighbor are eager
lazy learner
nearest neighbor (aka instance-based)
lazy learner learning phase
storing all training instances
-no real work done
lazy learner classification phase
computing distances and determining classification
rote learner
memorizes the entire training data and performs classification only if the attributes of a record exactly match one of the training examples - no generalization
nearest neighbor
uses k "closest" points for performing classification
- generalizes
NN requires three things
- a set of stored records
- a distance metric to compute distances between records
- the value of k, the number of nearest neighbors to retrieve
to classify unknown records
- compute the distance to the training records
- identify the k nearest neighbors
- use the class labels of the nearest neighbors to determine the class label (majority vote)
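The three steps above can be sketched in Python (toy data; the function and variable names are illustrative, not from the card set):

```python
# Minimal k-NN classification sketch: distance, k nearest, majority vote.
from collections import Counter
import math

def euclidean(a, b):
    # distance metric: straight-line distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(records, labels, query, k=3):
    # 1. compute the distance from the query to every stored record
    order = sorted(range(len(records)), key=lambda i: euclidean(records[i], query))
    # 2. identify the k nearest neighbors
    nearest = order[:k]
    # 3. majority vote over their class labels
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

records = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8)]
labels  = ["A", "A", "B", "B"]
print(knn_classify(records, labels, (1.1, 1.0), k=3))  # -> A
```

Note that no work happens until `knn_classify` is called - the "learning phase" is just storing `records` and `labels`, exactly as the lazy-learner cards describe.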
nearest neighbor can be used for classification and
regression - for regression, just compute a number from the values associated with the nearest neighbors (e.g. their average)
Strengths
- quick build time
- comprehensible - easy to explain a prediction
- robust to noisy data by averaging over the k nearest neighbors (if k is not small)
- the distance function can be tailored using domain knowledge
- can learn complex decision boundaries - much more expressive than linear models and decision trees
weaknesses
- needs lots of space to store the training instances
- takes much more time to classify a new example - its distance to all stored records must be computed
- the distance function must be designed carefully with domain knowledge - missing values and irrelevant features are a problem
similarity problem
similarity is not easy to compute because various factors can interfere
standard distance metrics weight each feature equally when determining similarity - problem with this
irrelevant features get the same weight as relevant ones, so they can dominate the distance
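A quick sketch of how an irrelevant feature can dominate the distance (invented toy values for illustration):

```python
# Sketch: an irrelevant feature on a large scale dominates Euclidean distance.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Feature 0 is relevant (class A near 1.0, class B near 5.0);
# feature 1 is irrelevant noise on a much larger scale.
a = (1.0, 900.0)      # class A instance
b = (5.0, 120.0)      # class B instance
query = (1.1, 100.0)  # truly class A by the relevant feature

print(euclidean(query, a))  # ~800, dominated by the irrelevant feature
print(euclidean(query, b))  # ~20,  so b wrongly looks far closer
```

Because both features count equally, the query's true nearest neighbor by the relevant feature ends up the most distant point - which is why feature scaling and pruning matter.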
voronoi diagram
division of the instance space into regions, each containing the points closest to one training instance
the nearest neighbor algorithm is sensitive to
outliers
generalize NN algorithm to K-NN
find the k closest instances and let them vote; k is normally an odd number (to avoid ties)
curse of dimensionality
many attributes, with only 2 relevant in determining the classification of the target function
- instances that have identical values for the 2 relevant attributes might be distant from one another in the high-dimensional instance space
how to mitigate irrelevant features
- use more training instances - makes it harder for irrelevant features to obscure the patterns
-use statistical tests (prune irrelevant features)
-search over feature subsets
how to use nearest neighbor for regression
take the average value of the k nearest neighbors, or
weight each nearest neighbor's vote by the inverse square of its distance from the test instance
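Both regression variants above can be sketched in one function (hypothetical 1-D data; names are illustrative):

```python
# Sketch of k-NN regression: plain average vs. inverse-square distance weighting.
def knn_regress(xs, ys, query, k=3, weighted=False):
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))
    nearest = order[:k]
    if not weighted:
        # plain average of the neighbors' target values
        return sum(ys[i] for i in nearest) / k
    # weight each vote by the inverse square of its distance (eps avoids /0
    # when the query coincides with a training point)
    eps = 1e-9
    w = [1.0 / ((xs[i] - query) ** 2 + eps) for i in nearest]
    return sum(wi * ys[i] for wi, i in zip(w, nearest)) / sum(w)

xs = [1.0, 2.0, 3.0, 10.0]
ys = [1.0, 2.0, 3.0, 10.0]
print(knn_regress(xs, ys, 2.1, k=3))                 # plain average -> 2.0
print(knn_regress(xs, ys, 2.1, k=3, weighted=True))  # pulled toward the closest point x=2
```

Inverse-square weighting lets close neighbors dominate the prediction, which reduces the influence of distant (and likely less similar) instances.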
case based reasoning
similar to instance-based learning, but may reason about the differences between the new example and the matched example
pros
no learning time
highly expressive
noise can be mitigated via the choice of k
easy to explain
cons
relatively long evaluation time
no model to provide high-level insight
very sensitive to irrelevant/redundant features
good distance measures are required to get good results