Statistical Learning - Fundamental concepts
Terms in this set (11)
Statistical Learning/Pattern Recognition
An approach to machine intelligence which is based on statistical modeling of data. With a statistical model in hand, one applies probability theory and decision theory to get an algorithm. This is opposed to using training data merely to select among different algorithms or using heuristics/"common sense" to design an algorithm.
Features
The measurements which represent the data. The statistical model one uses is crucially dependent on the choice of features. Hence it is useful to consider alternative representations of the same measurements (i.e. different features), for example different representations of the color values in an image. General techniques for finding new representations include discriminant analysis, principal component analysis, and clustering.
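As a concrete illustration of finding a new representation, here is a minimal principal component analysis sketch on a made-up 2-D dataset (points lying roughly along y = x), using the closed-form eigendecomposition of a 2x2 covariance matrix; the data values are purely illustrative.

```python
import math

# Toy dataset: points roughly along the line y = x, so the first
# principal component should point in the (1, 1) direction.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

n = len(data)
mx = sum(x for x, _ in data) / n
my = sum(y for _, y in data) / n

# 2x2 sample covariance matrix [[a, b], [b, c]].
a = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
c = sum((y - my) ** 2 for _, y in data) / (n - 1)
b = sum((x - mx) * (y - my) for x, y in data) / (n - 1)

# Closed-form eigendecomposition of a symmetric 2x2 matrix.
mean_diag = (a + c) / 2
spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
lam1 = mean_diag + spread          # largest eigenvalue = variance captured
vx, vy = b, lam1 - a               # corresponding eigenvector (unnormalized)
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)       # new 1-D feature: projection onto pc1

print("first principal direction:", pc1)
```

Projecting each point onto `pc1` replaces the two raw measurements with a single feature that retains most of the variance.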
Classification
Assigning a class to a measurement, or equivalently, identifying the probabilistic source of a measurement. The only statistical model that is needed is the conditional model of the class variable given the measurement. This conditional model can be obtained from a joint model or it can be learned directly. The former approach is generative since it models the measurements in each class. It is more work, but it can exploit more prior knowledge, needs less data, is more modular, and can handle missing or corrupted data. Methods include mixture models and Hidden Markov Models. The latter approach is discriminative since it focuses only on discriminating one class from another. It can be more efficient once trained and requires fewer modeling assumptions. Methods include logistic regression, generalized linear classifiers, and nearest-neighbor. See "Discriminative vs Informative Learning".
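The generative/discriminative distinction can be sketched in a few lines. Assuming two classes with known 1-D Gaussian class-conditional densities and equal priors (all numbers illustrative), the generative route applies Bayes' rule to get the class posterior; the discriminative route writes that same posterior directly as a logistic function of the measurement:

```python
import math

# Generative model: p(x|class 0) = N(0, 1), p(x|class 1) = N(3, 1), equal priors.
def gauss(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def posterior_class1(x):
    # Bayes' rule: p(1|x) = p(x|1)p(1) / [p(x|0)p(0) + p(x|1)p(1)]
    p0, p1 = gauss(x, 0.0, 1.0), gauss(x, 3.0, 1.0)
    return p1 / (p0 + p1)

# Discriminative model: for equal-variance Gaussians the posterior is
# exactly a logistic function of x; w and b follow from the means/variance.
def logistic_posterior(x, w=3.0, b=-4.5):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

print(posterior_class1(2.0), logistic_posterior(2.0))  # the two agree
```

The discriminative model needs only the weights `w, b`, not the full densities, which is why it requires fewer modeling assumptions.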
Regression
Predicting the value of a random variable y from a measurement x. For example, predicting engine efficiency based on oil pressure. Regression generalizes classification since y can be any quantity, including a class index. Many classification algorithms can be understood as thresholding the output of a regression. Like classification, one can obtain the conditional model of y from a joint model (which includes a model of x) or it can be learned directly. Curve fitting is the common special case where y is assumed to be a deterministic function of x, plus additive noise (usually Gaussian). Methods for curve fitting include radial basis functions, feed-forward neural networks, and mixtures of experts.
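The curve-fitting special case can be made concrete: if y = w*x + b plus Gaussian noise, the maximum-likelihood fit is ordinary least squares, which has a closed form for a line. A minimal sketch on made-up data:

```python
# Curve fitting as regression: assume y = w*x + b + Gaussian noise.
# Under that model the maximum-likelihood fit is ordinary least squares.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]   # illustrative data, roughly y = 2x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares slope and intercept.
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x

print(f"fit: y = {w:.2f}*x + {b:.2f}")
```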
Nonparametric regression/density estimation
An approach to regression/density estimation that requires little prior knowledge but a large amount of data. For regression, it includes nearest-neighbor, weighted average, and locally weighted regression. For density estimation, it includes histograms, kernel smoothing, and nearest-neighbor.
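The weighted-average idea can be sketched as Nadaraya-Watson kernel regression: the prediction at a query point is an average of training targets weighted by a Gaussian kernel on distance. The data and bandwidth below are illustrative choices:

```python
import math

# Nadaraya-Watson kernel regression: predict y at a query point as a
# weighted average of training targets, with weights from a Gaussian kernel.
train = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2)]

def kernel_regress(x, bandwidth=0.5):
    weights = [math.exp(-((x - xi) ** 2) / (2 * bandwidth ** 2))
               for xi, _ in train]
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / sum(weights)

print(kernel_regress(1.5))  # interpolates between the targets at x=1 and x=2
```

No parametric form is assumed; all the "knowledge" lives in the stored data and the bandwidth, which is the sense in which the method is nonparametric.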
Parameter estimation
Density estimation when the density is assumed to be in a specific parametric family. Special cases include maximum likelihood, maximum a posteriori, unbiased estimation, and predictive estimation. See the section on Parameter estimation techniques.
Model selection
Choosing the parametric family to use for density estimation. This is harder than parameter estimation since you have to take into account every member of each family in order to choose the best family. Considering only the best member of each family is not sufficient (one would tend to choose the biggest family). See the section on Model selection techniques.
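One standard way to avoid always choosing the biggest family is to penalize complexity before comparing fits. A minimal sketch using the Bayesian Information Criterion (one of several such criteria) to compare a constant model against a line on made-up, clearly linear data:

```python
import math

# BIC = n*ln(RSS/n) + k*ln(n): the fit term always improves with more
# parameters, so a complexity penalty k*ln(n) is added before comparing.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.2, 1.1, 1.9, 3.2, 3.9, 5.1]   # illustrative, clearly linear data

def bic(rss, n, k):
    return n * math.log(rss / n) + k * math.log(n)

# Family 1: constant model y = c (best member: the mean).
mean_y = sum(ys) / len(ys)
rss0 = sum((y - mean_y) ** 2 for y in ys)

# Family 2: line y = w*x + b (best member: least squares).
mean_x = sum(xs) / len(xs)
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x
rss1 = sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))

print("BIC constant:", bic(rss0, len(ys), 1))
print("BIC line:    ", bic(rss1, len(ys), 2))  # lower BIC wins
```

Here the line wins despite its extra parameter because the fit improvement outweighs the penalty; on noise-only data the penalty would favor the constant model.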
Graphical models
A graphical way of expressing the conditional independence relationships among a set of random variables. They cannot encode every possible form of conditional independence, but they go a long way toward this end. They are also called "Bayesian networks." See "Independence Diagrams", A Brief Introduction to Graphical Models and Bayesian Networks, Course Notes on Bayesian Networks, and Pearl.
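A tiny worked example of the encoding: in a three-node chain A -> B -> C, the joint factorizes as p(a,b,c) = p(a) p(b|a) p(c|b), which implies A and C are independent given B. All probability values below are made up for illustration:

```python
from itertools import product

# Chain A -> B -> C: joint p(a, b, c) = p(a) * p(b|a) * p(c|b).
pA = {0: 0.6, 1: 0.4}
pB_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # pB_A[a][b] = p(b|a)
pC_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # pC_B[b][c] = p(c|b)

joint = {(a, b, c): pA[a] * pB_A[a][b] * pC_B[b][c]
         for a, b, c in product((0, 1), repeat=3)}

# Check the encoded independence: p(c | a, b) should not depend on a.
def p_c_given_ab(c, a, b):
    num = joint[(a, b, c)]
    den = sum(joint[(a, b, cc)] for cc in (0, 1))
    return num / den

print(p_c_given_ab(1, a=0, b=1), p_c_given_ab(1, a=1, b=1))  # equal
```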
Active learning
Determining the optimal measurements to make under a cost constraint. A measurement is "optimal" when it is expected to give the most new information about the parameters of a model. Active learning is thus an application of decision theory to the process of learning. It is also known as experiment design. See "Employing EM in Pool-Based Active Learning for Text Classification", "Selective sampling using the Query by Committee algorithm", "Reinforcement Learning: A Survey", Box & Draper, and Raiffa & Schlaifer.
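A common pool-based heuristic for "most new information" is uncertainty sampling: query the unlabeled point whose predicted class probability is closest to 0.5. This is a rough proxy for the decision-theoretic criterion, not the full expected-information computation; the model weights and pool below are illustrative:

```python
import math

# Uncertainty sampling: among a pool of unlabeled points, query the one
# the current classifier is least sure about (posterior closest to 0.5).
def p_class1(x, w=2.0, b=-3.0):          # current model: a logistic classifier
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

pool = [0.0, 1.0, 1.4, 2.0, 3.0]
query = min(pool, key=lambda x: abs(p_class1(x) - 0.5))
print("query point:", query)  # 1.4, closest to the decision boundary at x=1.5
```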
Reinforcement learning
Learning how to act optimally in a given environment, especially with delayed and nondeterministic rewards. It is equivalent to adaptive control. There are two interleaved tasks: modeling the environment and making optimal decisions based on the model. The first task is a statistical modeling problem and is handled using the techniques listed in this glossary. The second task is a decision theory problem: converting the expectation of delayed reward into an immediate action. Since reinforcement learning requires exploration, it is often combined with active learning, though this is not essential. Most learning problems that humans face are reinforcement learning problems, e.g. deciding which melon to buy, which coat to wear outside today, or which friends to have. See "Reinforcement Learning: A Survey" and "Reinforcement Learning: A Tutorial".
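The conversion of delayed reward into immediate action can be sketched with tabular Q-learning on a made-up 4-state chain world (reward only at the far end), one of many possible RL algorithms:

```python
import random

# Tabular Q-learning on a 4-state chain: states 0..3, actions left/right,
# reward 1 only on reaching terminal state 3. The learned Q-values convert
# the expectation of delayed reward into an immediate greedy action.
random.seed(0)
n_states, actions = 4, (-1, +1)          # -1 = step left, +1 = step right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.2        # illustrative hyperparameters

for _ in range(500):                     # episodes
    s = 0
    while s != 3:
        # epsilon-greedy exploration (the active-learning flavor of RL)
        a = random.choice(actions) if random.random() < eps \
            else max(actions, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == 3 else 0.0
        target = r if s2 == 3 else r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

policy = [max(actions, key=lambda a: Q[(s, a)]) for s in range(3)]
print(policy)  # the learned policy moves right in every non-terminal state
```

Here the "model" is implicit in the Q-values; model-based variants would estimate the environment's dynamics explicitly and plan against them.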
No free lunch
The point that all statistical models are necessarily biased in one way or another, and that no single bias is globally optimal. Mitchell, and later Wolpert, emphasized this point in order to stop useless comparisons between learning algorithms that were using different priors (like Euclidean nearest neighbor vs. axis-parallel decision trees). The real way to evaluate algorithms is how well they can utilize prior knowledge given to them, i.e. how well they can approximate Bayesian learning. See "The Need for Biases in Learning Generalizations" in Readings in Machine Learning, "The lack of a priori distinctions between learning algorithms", "Bayesian regression filters and the issue of priors" (the "issue of priors" part), and Cross-validation.