Lecture 3 - Convolutional Neural Networks
Terms in this set (31)
What are the three types of machine learning problems?
1) Supervised learning
- Regression
- Classification
2) Unsupervised learning
- Clustering
- Topic models
3) Reinforcement learning
- Motor babbling
- Bootstrapping
What is the equation for mean squared error?
J(𝜽) = (1/(2N)) * \sum_{i=1}^{N} (h(x_{i}) - y_{i})^2
h(x_{i}): Predicted value for x_i
y_{i}: Target value for x_{i}
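A minimal sketch of this cost in plain Python, assuming the predictions h(x_{i}) have already been computed. The function name and the sample numbers are illustrative only.

```python
def mean_squared_error(predictions, targets):
    """J = (1/(2N)) * sum_i (h(x_i) - y_i)^2 for two equal-length sequences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    n = len(predictions)
    return sum((h - y) ** 2 for h, y in zip(predictions, targets)) / (2 * n)


# Only the third prediction is off (by 2), so J = 2^2 / (2 * 3).
print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```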
What are the steps of gradient descent?
Step 0:
Start with an arbitrary value for 𝜽
Step 1:
Compute the gradient ∇𝐽(𝜽)
Step 2:
Update 𝜽 by adding the (scaled) negative gradient
𝜽 ≔ 𝜽 − 𝛼 ∇𝐽(𝜽)
Step 3:
Repeat step 1 and step 2 until the error is sufficiently low.
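The steps can be sketched for a single scalar parameter as follows. The function name, the default step size, the tolerance, and the example cost J(𝜽) = (𝜽 - 3)^2 are all assumptions chosen for illustration.

```python
def gradient_descent(cost, grad, theta, alpha=0.1, tol=1e-12, max_iters=10_000):
    """Minimise cost(theta) for a scalar theta; names and defaults are illustrative."""
    for _ in range(max_iters):
        if cost(theta) < tol:                # Step 3: stop once the error is low enough
            break
        theta = theta - alpha * grad(theta)  # Steps 1 and 2: gradient, then update
    return theta


# Example cost J(theta) = (theta - 3)^2 with gradient 2 * (theta - 3).
# Step 0: start from the arbitrary value theta = 0.
theta_min = gradient_descent(lambda t: (t - 3) ** 2, lambda t: 2 * (t - 3), theta=0.0)
print(theta_min)  # close to 3
```

Stopping on a cost threshold only works when the minimum cost is known to be near zero; in practice a limit on iterations or on the gradient size is often used instead.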
What is the generic gradient descent update rule for linear regression?
𝜽 ≔ 𝜽 − (𝛼/N) * \sum_{i=1}^{N} (𝜽_t * x_{i} - y_{i}) * x_{i}
In vector form:
𝜽 ≔ 𝜽 − (𝛼/N) * X_t (X*𝜽 - y)
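As a rough sketch, one update of this rule can be written with plain lists, where each row of X is one example x_{i} and a leading 1 is included as a bias feature. The function name, the toy data, the step size, and the iteration count are assumptions made for the example.

```python
def linear_regression_step(theta, X, y, alpha):
    """One update: theta := theta - (alpha/N) * X_t (X*theta - y)."""
    n = len(X)
    # Residuals (theta_t * x_i - y_i) for every example.
    residuals = [sum(t * xj for t, xj in zip(theta, x)) - yi for x, yi in zip(X, y)]
    # Gradient component j: (1/N) * sum_i residual_i * x_ij.
    grad = [sum(r * x[j] for r, x in zip(residuals, X)) / n for j in range(len(theta))]
    return [t - alpha * g for t, g in zip(theta, grad)]


# Toy data generated from y = 1 + 2x; the first column is the constant bias feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = [0.0, 0.0]
for _ in range(5000):
    theta = linear_regression_step(theta, X, y, alpha=0.1)
print(theta)  # approaches [1, 2]
```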
What are some variants of gradient descent?
1) Minibatch gradient descent
- Update 𝜽 based on a small batch of the training data
- Can increase the speed of convergence
2) Stochastic gradient descent
- Update 𝜽 based on a random training instance
- Can prevent the learning process from getting stuck in local optima
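Both variants can be sketched with one routine for linear regression: each update uses only a small batch of examples, and a batch size of 1 gives stochastic gradient descent. The function name, the defaults, and the toy data are assumptions, and shuffling once per epoch is just one common way to pick random instances.

```python
import random


def minibatch_gradient_descent(X, y, theta, alpha=0.05, batch_size=2, epochs=2000, seed=0):
    """Linear-regression updates computed from small random batches of the data."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    theta = list(theta)
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the training data in a new random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad = [0.0] * len(theta)
            for i in batch:
                residual = sum(t * xj for t, xj in zip(theta, X[i])) - y[i]
                for j, xj in enumerate(X[i]):
                    grad[j] += residual * xj / len(batch)
            theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Toy data generated from y = 1 + 2x; the first column is a constant bias feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta_minibatch = minibatch_gradient_descent(X, y, [0.0, 0.0], batch_size=2)
theta_sgd = minibatch_gradient_descent(X, y, [0.0, 0.0], batch_size=1)
print(theta_minibatch, theta_sgd)
```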
What is overfitting?
When the trained model fits the training data too well and doesn't generalize to other data from the same distribution
What is underfitting?
When the trained model hasn't learnt to model the data distribution well enough
How does an artificial neuron work?
1) Inputs x_{i} get multiplied with the parameters 𝜽_{i}.
2) All of them get summed up
3) An activation function f is applied to the sum
h(x) = f(x_t * 𝜽)
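The three steps map directly onto a few lines of Python. This is only a sketch: the function names and the sample inputs are made up, and the logistic function is used as one possible choice of f.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, theta, f):
    """h(x) = f(x_t * theta): weighted sum of the inputs, then the activation."""
    z = sum(xi * ti for xi, ti in zip(x, theta))  # steps 1 and 2: multiply and sum
    return f(z)                                   # step 3: apply the activation


# The weighted sum is 1.0 * 0.5 + 2.0 * (-0.25) = 0, and logistic(0) = 0.5.
out = neuron([1.0, 2.0], [0.5, -0.25], logistic)
print(out)
```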
List some activation functions
1) Logistic function
2) Softmax layer
3) Hyperbolic tangent
4) Rectified linear units
5) Softplus
What is the equation for a general logistic function?
f(z) = 1/(1 + e^(-z))
What is the derivative of the general logistic function?
df/dz = f(z) * (1 - f(z))
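One way to convince yourself of this identity is to compare it with a finite-difference estimate. The sketch below does that at a few arbitrary points; the helper names and the step size eps are assumptions.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_derivative(z):
    """df/dz = f(z) * (1 - f(z))."""
    fz = logistic(z)
    return fz * (1.0 - fz)


def numerical_derivative(f, z, eps=1e-6):
    """Central-difference estimate of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)


for z in (-2.0, 0.0, 1.5):
    print(z, logistic_derivative(z), numerical_derivative(logistic, z))
```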
What is the equation for a Softmax layer?
y_i = exp(z_i)/[\sum_{j} exp(z_j)]
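A small sketch of the softmax in plain Python. Subtracting the largest input before exponentiating is a common numerical-stability trick that is not part of the formula on the card; it leaves the result unchanged because the common factor cancels in the ratio.

```python
import math


def softmax(z):
    """y_i = exp(z_i) / sum_j exp(z_j), computed on shifted inputs to avoid overflow."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs, sum(probs))  # the outputs are positive and sum to 1 (up to rounding)
```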
What is the equation for a Hyperbolic tangent?
f(z) = [e^z - e^(-z)]/[e^z + e^(-z)]
What is the derivative of the Hyperbolic tangent?
df/dz = 1 - f^2(z)
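The definition and its derivative can be written out directly, as in the sketch below. The function names are illustrative, and this literal form overflows for large |z|, so real code would normally call math.tanh instead.

```python
import math


def tanh(z):
    """f(z) = (e^z - e^(-z)) / (e^z + e^(-z)), written exactly as in the definition."""
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))


def tanh_derivative(z):
    """df/dz = 1 - f(z)^2."""
    return 1.0 - tanh(z) ** 2


print(tanh(0.5), math.tanh(0.5))  # the two values agree up to rounding
print(tanh_derivative(0.0))       # the slope is steepest at z = 0
```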
What is the equation for the Rectified Linear Units (ReLUs)?
f(z) = 0, if z <= 0
f(z) = z, if z > 0
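The piecewise definition translates to a one-line function. The name relu is just a conventional label used for this sketch.

```python
def relu(z):
    """f(z) = 0 if z <= 0, else z."""
    return z if z > 0 else 0.0


print([relu(z) for z in (-2.0, 0.0, 2.5)])
```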