Predictive Analytics_Exam I
Specific procedure used to implement a predictive analytic technique (e.g. cluster analysis, regression)
One complete set of variable values; considered to be the unit of analysis
Conditional probability that an outcome will be realized IF others are realized.
Global summary of a data set, or a description of relationships between or among variables; allows you to make statements about any point in the observation space.
Holdout Sample/Validation Set
Sample of data not used in fitting a model, but used to test the performance of the model. "Stress testing" to SELECT 'BEST' MODEL
Makes statements only about restricted regions of the observation space.
Refers to the predicted value, class, or outcome
Process by which an algorithm 'learns' how to predict values for new cases based on known output values
Analysis done to learn something about data other than to predict outcome variables (e.g. clustering)
Portion of data used to fit/develop a model - CREATES model CANDIDATES
Set of data used as an unbiased assessment in measuring the accuracy of predictions
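The three roles described above (a portion for fitting model candidates, a holdout portion for selecting the best model, and a final portion for an unbiased accuracy check) can be sketched as a simple random partition. This is a minimal illustration only: the function name `split_data`, the 60/20/20 proportions, and the sample rows are assumptions, not part of the original notes.

```python
import random

def split_data(rows, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle rows and partition them into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                   # used to fit model candidates
    valid = shuffled[n_train:n_train + n_valid]  # used to select the best candidate
    test = shuffled[n_train + n_valid:]          # used once, for the final accuracy check
    return train, valid, test

rows = list(range(100))  # hypothetical observations
train, valid, test = split_data(rows)
print(len(train), len(valid), len(test))
```

Each observation lands in exactly one of the three sets, so the data used to assess accuracy never influences fitting or model selection.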
Outcome of interest. "What do I want to predict?"
Basic form of data analysis used to arrange observations in classes.
WITHIN group VARIANCE = LOW
BETWEEN group VARIANCE = HIGH
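The two conditions above can be checked numerically by splitting the total sum of squares into a within-group part and a between-group part. The sketch below is illustrative only; the function name `group_sums_of_squares` and the two sample groupings are assumptions.

```python
from statistics import mean

def group_sums_of_squares(groups):
    """Return (within-group SS, between-group SS) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Within: how far each value is from its own group mean.
    within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Between: how far each group mean is from the grand mean, weighted by group size.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return within, between

# Hypothetical groupings of six values each.
tight = [[1.0, 1.2, 0.8], [9.0, 9.1, 8.9]]  # similar values grouped together
loose = [[1.0, 9.0, 5.0], [0.8, 9.1, 5.2]]  # dissimilar values mixed together

tight_within, tight_between = group_sums_of_squares(tight)
loose_within, loose_between = group_sums_of_squares(loose)
print(tight_within < tight_between)  # well-separated groups
print(loose_within > loose_between)  # poorly separated groups
```

A good grouping shows a small within-group value relative to the between-group value; the second grouping shows the opposite pattern.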
Similar to classification, except the aim here is to predict the value of a numerical variable.
New data for which the target value will be predicted
Affinity Analysis (Association Rules)
Rules developed and used to relate observations
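A rule of the form "IF antecedent THEN consequent" is commonly scored by two standard measures: support (how often the items occur together) and confidence (the conditional probability of the consequent given the antecedent). The sketch below is a minimal illustration; the function names and the basket data are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical market baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_support, round(rule_confidence, 3))
```

Here bread appears in three of the four baskets and bread with milk in two, so the rule "IF bread THEN milk" holds in two of the three baskets that contain bread. The sketch assumes the antecedent occurs at least once; otherwise the confidence is undefined.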
Predictive Analytics/Technical Descriptions
Combined use of classification, prediction and affinity analysis
Examining and inferring ideas from the data
Data exploration by way of graphical analysis
Consolidates a large number of variables into a smaller set (e.g. PCA or Neural Nets)
Only (numerical) order matters
Difference between values is meaningful. There is no ABSOLUTE 0 (i.e. a '0' value does not mean there is none of that variable)
Similar to interval, but has an absolute 0 value (e.g. the Kelvin temperature scale)
Used when variables with the largest scales would dominate and skew results.
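One common way to put variables on a comparable scale is z-score standardization: subtract the mean and divide by the standard deviation. The sketch below assumes that approach (others, such as min-max rescaling, exist); the function name `standardize` and the sample columns are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical columns on very different scales.
income = [30000, 45000, 60000, 120000]
age = [25, 32, 47, 51]

income_z = standardize(income)
age_z = standardize(age)
print([round(z, 2) for z in income_z])
print([round(z, 2) for z in age_z])
```

After rescaling, both columns have the same spread, so a distance or variance calculation is no longer dominated by the income column simply because its raw numbers are larger.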
Used with time-series data, which are values that vary within time intervals
Compares a single stat across groups - height of bar corresponds to focal stat
Used to study the association between the values of numerical variables
Useful for comparing subgroups and seeing distributions over time side-by-side
Useful to indicate where transformations are required due to skewness in outcome variables ("Frequency Chart")
Measure strength of linear relationship between two variables
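The measure described above is usually computed as the Pearson correlation coefficient, which ranges from -1 to +1. The sketch below writes it out by hand for illustration; the function name `pearson_r` and the sample data are assumptions.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1, 2, 3, 4]
r_up = pearson_r(x, [2, 4, 6, 8])    # perfectly increasing line
r_down = pearson_r(x, [8, 6, 4, 2])  # perfectly decreasing line

# A strong but non-linear relationship (y = x squared) can still give r near 0.
u = [-2, -1, 0, 1, 2]
r_curve = pearson_r(u, [v ** 2 for v in u])
print(round(r_up, 3), round(r_down, 3), round(r_curve, 3))
```

The last case is a reminder that the coefficient captures only the linear part of a relationship.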
Principal Component Analysis (PCA)
Procedure for transforming correlated variables into a new set of uncorrelated variables (principal components), each a linear combination of the original variables. Removes overlap of information
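For just two variables, the transformation can be sketched without any linear algebra library: center the data, then rotate the axes by the angle that makes the covariance of the new variables zero. This is a simplified illustration, not a general PCA implementation; the function names and the sample data are assumptions.

```python
from math import atan2, cos, sin
from statistics import mean

def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def pca_2d(x, y):
    """Return the two principal component scores for a pair of variables."""
    mx, my = mean(x), mean(y)
    sxx, syy, sxy = covariance(x, x), covariance(y, y), covariance(x, y)
    # Rotation angle that zeroes the covariance between the new axes.
    theta = 0.5 * atan2(2 * sxy, sxx - syy)
    pc1 = [(a - mx) * cos(theta) + (b - my) * sin(theta) for a, b in zip(x, y)]
    pc2 = [-(a - mx) * sin(theta) + (b - my) * cos(theta) for a, b in zip(x, y)]
    return pc1, pc2

# Illustrative, strongly correlated data.
x = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

pc1, pc2 = pca_2d(x, y)
share_pc1 = covariance(pc1, pc1) / (covariance(pc1, pc1) + covariance(pc2, pc2))
print(round(covariance(pc1, pc2), 6), round(share_pc1, 3))
```

Each new variable is a linear combination of the centered originals, the two new variables are uncorrelated, and the total variance is unchanged. When the first component carries most of the variance, the second can be dropped with little loss of information.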
Procedure for identifying a function that relates variables to each other
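One common instance of such a procedure is fitting a straight line by ordinary least squares. The sketch below assumes a single predictor; the function name `fit_line` and the sample data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data generated from y = 1 + 2x.
x_vals = [1, 2, 3, 4]
y_vals = [3, 5, 7, 9]

intercept, slope = fit_line(x_vals, y_vals)
predicted = intercept + slope * 5  # prediction for a new case with x = 5
print(intercept, slope, predicted)
```

Once the function is identified, it can be applied to new cases whose outcome value is unknown.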
Procedure related to dimension reduction. Select/reject variables based on predefined criteria.
Part of dimension reduction:
1) Create a new variable by changing the form of the given variable;
2) Replace the observed set of variables with a smaller set or combination of variables