- find a line that separates two classes
- find the line that maximizes M (the minimum perpendicular distance between each point and the separating line)
- this line is called the "optimal separating hyperplane"
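
a minimal sketch of the margin idea above, in plain Python. the toy points, labels and the two candidate lines are made up for illustration; a real SVM solver searches over all possible lines instead of comparing two hand-picked ones.

```python
import math

def margin(w, b, points, labels):
    """M for the line w.x + b = 0: the smallest signed perpendicular
    distance from any point to the line (positive only if every point
    lies on the side matching its label)."""
    norm = math.hypot(*w)
    return min(
        y * (w[0] * x[0] + w[1] * x[1] + b) / norm
        for x, y in zip(points, labels)
    )

# toy data (hypothetical): labels are +1 / -1
points = [(1.0, 1.0), (2.0, 0.0), (4.0, 4.0), (5.0, 3.0)]
labels = [-1, -1, 1, 1]

# two hypothetical candidate lines, stored as (w, b); both separate the classes
candidates = {
    "x1 + x2 = 5": ((1.0, 1.0), -5.0),
    "x1 = 3": ((1.0, 0.0), -3.0),
}

# compute M for each candidate and keep the one with the larger margin
margins = {name: margin(w, b, points, labels) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(margins, best)
```

both lines separate the classes, but the one with the larger M leaves more room on each side, which is why it is preferred.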

to transform data, we choose a _____ function
Kernel

common kernel functions
- linear
- polynomial
- radial basis
- sigmoid

pros of support vector regression (?)
- effective in high dimensional spaces (many variables)
- uses a subset of training points in the decision function (called support vectors), so it's also memory efficient

cons of support vector regression (?)
- sensitive to noise: a small number of mislabeled examples can dramatically decrease the performance
- selecting the right kernel is not an easy task
- training gets slower as the dataset grows (training time typically grows faster than linearly with the number of training points)
- it classifies through geometry, whereas for a lot of classification problems a probabilistic approach gives better results
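
for reference, the four common kernel functions named in these notes (linear, polynomial, radial basis, sigmoid) can be sketched in plain Python. the parameter names `degree`, `gamma` and `coef0` and their default values are assumptions chosen for illustration; they follow a common convention but are not taken from these notes.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    # K(u, v) = u . v
    return dot(u, v)

def polynomial_kernel(u, v, degree=3, gamma=1.0, coef0=1.0):
    # K(u, v) = (gamma * u . v + coef0) ** degree
    return (gamma * dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=1.0):
    # K(u, v) = exp(-gamma * ||u - v||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    # K(u, v) = tanh(gamma * u . v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)

u, v = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(u, v), polynomial_kernel(u, v, degree=2),
      rbf_kernel(u, v), sigmoid_kernel(u, v))
```

each function takes two points and returns a similarity score; choosing between them (and tuning their parameters) is the "selecting the right kernel" difficulty listed in the cons.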