Terms in this set (50)
CBR or Case-Based Reasoning model solves problems by
retrieving stored cases describing similar old problems and adapting their solutions to fit the new problem
In supervised learning the training data provides ... and ...
examples (inputs), correct outcomes (labels)
In supervised learning the machine learns to predict the outcome of ... data based on the ...
new, past examples
In supervised learning, if the objective is categorical the model is ...
classification
In supervised learning, if the objective is numeric the model is ...
regression
The algorithms that can be used for supervised learning are
1. Decision Tree 2. Random Forest 3. Logistic Regression 4. Ensembles 5. Deep Learning
In unsupervised learning the training data provides ... but no specific ...
examples (inputs), outcomes (labels)
In unsupervised learning the machine tries to find interesting ... in the data
structures (patterns)
The algorithms that can be used in unsupervised are
1. Clustering 2. Anomaly detection 3. Association discovery 4. Topic models
A filter bubble
Personalized web experience based on metrics about your history and web searches
The echo chamber effect
reinforces beliefs through repetition inside a closed system
The simplest method to build a decision tree is to use the ...
divide and conquer method
A metric used to train decision trees and to measure the quality of a split with respect to the prediction target is ..., and is calculated using the entropy values of the parent and child nodes
Information Gain (IG)
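A minimal sketch of the entropy and information-gain calculation described above, in plain Python (the function names and the tiny example dataset are illustrative, not from this card set):

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(parent, children):
    """IG = entropy(parent) minus the size-weighted entropy of the children.
    `children` is one list of labels per branch of the split."""
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

# Illustrative example: a perfectly separating split of a 50/50 parent node
parent = ["yes"] * 4 + ["no"] * 4
print(information_gain(parent, [["yes"] * 4, ["no"] * 4]))  # 1.0
```

A split that leaves the children pure gains the full parent entropy (1 bit here); a split that leaves them as mixed as the parent gains nothing.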
Entropy is a measure of the homogeneity of a dataset and can be roughly thought of as how much ... the data has
disorder (impurity)
Disadvantages of decision trees are
they easily lead to overfitting and do not generalize well
When a model learns from only some part of the training data it is ...
underfitting
When the model relies too much on the training set it is ...
overfitting
Is the target variable counted as one of the predictors?
No, the target variable is not counted as a predictor.
The precision formula is:
TP / (TP + FP)
Is the ordering of predictors in a decision tree the same as the overall importance?
No; each split only picks the best predictor at that node, so the ordering does not directly reflect overall importance.
The recall (or sensitivity) formula is:
TP / (TP + FN)
The Accuracy formula is:
(TP + TN) / All instances
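The precision, recall, and accuracy formulas above (plus specificity, which appears later in this set) can be checked with a small Python helper; the confusion-matrix counts below are made up for illustration:

```python
def confusion_metrics(tp, fp, fn, tn):
    """Classification metrics from raw confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),           # TP / (TP + FP)
        "recall": tp / (tp + fn),              # TP / (TP + FN), a.k.a. sensitivity
        "specificity": tn / (tn + fp),         # TN / (TN + FP)
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts for a binary classifier
m = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
print(m["precision"], m["recall"], m["accuracy"])  # 0.8 0.6666666666666666 0.7
```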
Normalization is important before clustering as:
Euclidean distance is very sensitive to differences in scale
It ensures that no single feature's numerical range disproportionately dominates the others
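One common way to do this is min-max normalization, sketched below; the two feature columns are invented for illustration:

```python
def min_max_normalize(column):
    """Rescale a numeric column to [0, 1] so that no feature's raw range
    dominates the Euclidean distance used by clustering."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical features on very different scales
income = [30_000, 60_000, 90_000]   # range spans tens of thousands
age = [20, 40, 60]                  # range spans tens
print(min_max_normalize(income))    # [0.0, 0.5, 1.0]
print(min_max_normalize(age))       # [0.0, 0.5, 1.0]
```

After scaling, a unit difference in income and a unit difference in age contribute equally to the distance.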
Accuracy is
the total number of instances you correctly labeled as positive or negative over all instances
Precision is
the percentage of correctly predicted instances over all instances predicted for the positive class
Recall (or sensitivity) is
the percentage of correctly predicted instances over the total actual instances for the positive class
The F1 score is
the balanced harmonic mean between precision and recall (or sensitivity)
The correlation coefficient between predicted and actual values
The Precision of a class define how trustable is the model when
predicting an instance to that class.
The Recall (or sensitivity) of a class expresses how
well the model is able to detect that class.
high recall (sensitivity) + high precision:
the class is perfectly handled by the model.
low recall (sensitivity) + high precision:
the model can't detect the class well but is highly trustable when it does.
high recall ( or sensitivity) + low precision:
the class is well detected but the model also includes points of other classes in it.
low recall ( or sensitivity) + low precision:
the class is poorly handled by the model.
In imbalanced datasets, the precision and recall (or sensitivity) for the positive class (minority) are very ..., while the precision and recall (or sensitivity) for the negative class (majority) are very ... .
The accuracy is also very ... (the Accuracy Paradox).
low, high, high
MAE (Mean Absolute Error)
the mean of the absolute errors when comparing the actual y with the predicted one
The problem is that large errors count only as much as small errors (they receive no extra penalty)
MSE (mean squared error)
Is the mean of the squared errors when comparing the actual y to the predicted one
More complicated than MAE, but larger errors carry more weight
R² (coefficient of determination)
Compares the MSE of the model with respect to the MSE of the mean model:
0 = model is no better than the mean
< 0 = model is worse than the mean
1 = model fits the data perfectly
Confidence of a prediction is a combination of
the probability of the prediction and the "amount" of data in that node.
Bagging: individual models are built separately, and equal weight is given to all models.
Boosting: each new model is influenced by the performance of those built previously; it weights a model's contribution by its performance.
Support Vector Machine
Supervised learning tool that seeks a dividing hyperplane in any number of dimensions; can be used for regression or classification
What model assigns a discrete variable and which assigns continuous?
Classification = Discrete
Regression = Continuous
Specificity = TN / (TN + FP)
The specificity of a test is its ability to designate an individual who does not have a disease as negative
Sensitivity vs. Specificity
Sensitivity refers to a test's ability to designate an individual with disease as positive. A highly sensitive test means that there are few false negative results, and thus fewer cases of disease are missed.
The specificity of a test is its ability to designate an individual who does not have a disease as negative.
In essence, statistical learning refers to a set of approaches for estimating ...
f
And the two reasons for estimating f are
prediction and inference
Variability associated with ε also affects the accuracy of our predictions. This is known as the ..., because no matter how well we estimate f, we cannot reduce the error introduced by ε.
the irreducible error
Questions asked when we're looking for inference rather than predictions could be:
Which predictors are associated with the response?
What is the relationship between the response and each predictor?
Can the relationship between Y and each predictor be adequately summarized using a linear equation, or is the relationship more complicated?
inference vs prediction, for which do non-linear approaches work better than linear approaches.
Linear better for inference
Non-linear better for prediction
parametric; it reduces the problem of estimating f down to one of estimating a set of ...
parameters