busmgt 2320 final
Terms in this set (64)
regression analysis
technique used to build an equation that can be used to estimate or predict the value of one variable through its relationship with one or more other variables
the response variable being predicted or estimated: Y
one or more predictor/explanatory/independent variables: x1, x2, etc
advantage of multiple regression
potentially increased explanatory power for the response variable as a result of including more information
disadvantage of multiple regression
more complicated model
increased commitment of resources to collecting information and maintaining the database
greater risk of misuse of the model; extrapolation is more difficult to recognize
mathematical problems like multicollinearity
goal of multiple regression
parsimony: develop a model that allows us to predict the response variable accurately but that is as simple as possible
Yi = B0 + B1x1i + B2x2i + ... + Bpxpi + Ei
B0, B1, ..., Bp are the population regression coefficients
Ei is the population error term
sigma is the standard deviation of the population model
yhati = b0 + b1x1i + ... + bpxpi
b0, b1, ..., bp are the sample regression coefficients
ei is the sample residual or error term
s is the standard error of the model
residuals
the regression surface (a plane when there are 2 predictors) is created by yhat = b0 + b1x1 + ... + bpxp
residual = ei = yi - yhati
least squares criterion
b0, b1, ..., bp are once again determined by minimizing the SSE
the minimization requires a separate partial differential equation for each parameter to be estimated
this produces a set of p+1 equations with p+1 unknowns where p is the number of predictors in the model. a matrix algebra solution process is used to solve the set of simultaneous linear equations
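The matrix-algebra solution of the normal equations can be sketched with NumPy; the data below is made up for illustration:

```python
import numpy as np

# hypothetical data: n = 6 observations, p = 2 predictors
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
              [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# add a column of 1s so the intercept b0 is estimated too
Xd = np.column_stack([np.ones(len(y)), X])

# solve the p + 1 = 3 simultaneous normal equations (X'X)b = X'y
b = np.linalg.solve(Xd.T @ Xd, Xd.T @ y)

# residuals and the SSE that least squares minimizes
e = y - Xd @ b
sse = float(e @ e)
```

The same coefficients come out of any least-squares routine; the explicit solve just mirrors the p+1-equations description above.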
multiple regression required conditions
Ei (error terms) terms are normally distributed
homogeneity of variance for the Ei
The Ei terms are independent
average of (Ei) = 0
we can use residual plots to validate the assumptions
multiple regression precautions
regression coefficients: each bj describes the change in yhat when one predictor changes while all the others are held constant, so interpret by changing only one variable at a time; be on the watch for multicollinearity, which occurs when the predictor variables are highly related to each other
F-test of MODEL significance vs t test of PREDICTOR significance
R^2 vs adjusted R^2
interpreting multiple regression coefficients
When Xj is a quantitative variable, the coefficient bj should be interpreted as the estimated increase in Y when Xj is increased 1 unit and all other predictors in the equation are held constant
Predictors that are strongly correlated to each other may have unstable coefficients - multicollinearity
F test for model significance
Ho: B1 = B2 = ... = Bp = 0
Ha: Not all Bj = 0
test statistic = Fobs = MSR/MSE
~ F(p, n-p-1)
reject Ho if p value = P(F>=Fobs) < alpha
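Fobs = MSR/MSE can be computed directly from the ANOVA sums of squares; a quick sketch with hypothetical values:

```python
# hypothetical ANOVA quantities for a model with p = 2 predictors, n = 20
ssr, sse, n, p = 480.0, 120.0, 20, 2

msr = ssr / p            # mean square regression, dfR = p
mse = sse / (n - p - 1)  # mean square error, dfE = n - p - 1
f_obs = msr / mse        # compare to the F(p, n - p - 1) distribution
```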
t test for predictor significance
a general goal of regression model building is parsimony
we want to identify variables that are not significant in the model, and consider removing any that aren't
a predictor is not significant if its population coefficient is 0
the test is identical to the t test in simple regression
Ho: Bj = 0 (insignificant)
Ha: Bj =/ 0 (significant) do this for each predictor
test statistic = tobs = (bj - 0)/sbj
~t(dfE)
reject Ho if |tobs| > t*(alpha/2, dfE); the critical value uses alpha/2 because the test is two-sided
multiple regression goodness of fit
standard error of model s=sqrt(MSE)
coefficient of determination R^2 = SSR/SST = 1 - SSE/SST
adjusted coefficient of determination R^2 - adj = 1 - (SSE/dfE)/(SST/dfT)
flaw in R^2
can be inflated by the use of many predictors but each predictor has a price - a degree of freedom paid by the error source of variation
but adding more predictors will never lead to a decrease in R^2
purpose of R^2 adjusted
the purpose behind the adjusted R^2 is to account for the number of predictors
this is particularly relevant when n is small
it helps identify over fitting in regression models
the adjusted coefficient normally gives a more appropriate estimate of the proportion of variance in the dependent variable that is explained when the multiple predictors are used
always smaller than regular R^2
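A quick numeric sketch of R^2 vs adjusted R^2 (the sums of squares here are hypothetical):

```python
# hypothetical sums of squares: n = 25 observations, p = 4 predictors
sst, sse = 200.0, 50.0
n, p = 25, 4

r2 = 1 - sse / sst                              # = SSR/SST
r2_adj = 1 - (sse / (n - p - 1)) / (sst / (n - 1))  # penalizes df spent on predictors
```

With these numbers the penalty drops the adjusted value below the regular R^2, illustrating why it is the better guard against overfitting.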
multiple regression qualitative predictors
Indicator (dummy) variables used to model a qualitative X variable with 2 or more categories
categories are coded 0 or 1 with indicator variables
the required number of indicator variables is 1 less than number of categories
k-1 dummy variables for k categories; a kth dummy variable can be formed, but using it makes the predictors perfectly collinear and creates serious computational problems
may be combined with quantitative variables
lines parallel on scatterplot
no interaction
partial f test purpose
used to conduct a significance test on some (more than 1) but not all predictors simultaneously
particularly useful when using qualitative predictors that have more than 2 categories
partial f test hypotheses
the null hypothesis states that a collection of q explanatory variables all have coefficients equal to 0
the alternative hypothesis states that not all predictors have coefficients equal to 0
Ho: B3 = B4 = 0
Ha: not both B3 and B4 are 0
partial f test test statistic
let C stand for the model that uses all relevant predictors (C=complete)
let R stand for the model that omits the q specific predictors (R=reduced)
partial F = [(R^2C - R^2R)/q] / [(1 - R^2C)/dfEc]
~F(q, DFEc)
use regular R^2
to reject partial f test
when calculated partial f is more than the critical f value, reject
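The partial F statistic can be sketched numerically; all of the values below are hypothetical:

```python
# hypothetical: complete-model R^2 = 0.80, reduced-model R^2 = 0.72,
# q = 2 predictors dropped, dfE of the complete model = 40
r2_c, r2_r, q, dfe_c = 0.80, 0.72, 2, 40

# partial F = [(R^2C - R^2R)/q] / [(1 - R^2C)/dfEc], compared to F(q, dfEc)
partial_f = ((r2_c - r2_r) / q) / ((1 - r2_c) / dfe_c)
```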
time series definition
set of observations on a variable measured at successive points in time or successive periods of time at regular intervals
time series purpose
predict future values of the time series
- time series methods: use clues provided by past values of the time series itself
- causal forecasting methods: use cause-and-effect relationships with one or more other variables to predict future values of the time series
cross sectional data
data collected at the same or approximately the same point in time, often multiple variables. "snapshot"
time series data
data collected at more than one period of time, sequentially over several periods at regular intervals
time series plot
scatterplot of Yt against t
successive points are connected by line segments
trend
gradual, persistent tendency of the time series to increase or decrease over many time periods
cyclical
measured in years
clusters of 2+ years alternating above and below the trend
our economy
difficult to forecast because they are not regular or repeating
Seasonal
regular and repeating patterns
each iteration of the pattern is completed within 1 year
random
or irregular
unpredictable; a catch-all for any deviations from expected that aren't accounted for by the others Tt, Ct, and/or St
stationary time series
one that exhibits no notable patterns, trends, or seasonal variations in particular
only random
naive forecasting methods
used for stationary time series
-lags
-simple moving average MA
-weighted moving average
-single (simple) exponential smoothing
lag
lag1 = one period lag
yhatt=yt-1
forecast for week 2 is the sales in week 1 and so on
moving average
MA, k-period
using 3 as k for this example
yhatt=(yt-1 + yt-2 +yt-3)/3
take the average of the past 3 weeks of data
larger k results in smoother forecast
choosing k is based off of trial and error
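The k-period moving average above can be sketched in Python; the weekly sales numbers are made up for illustration:

```python
# hypothetical weekly sales series
sales = [20.0, 21.0, 19.0, 23.0, 22.0, 24.0]

def ma_forecast(series, k):
    """k-period simple moving average: average of the most recent k values."""
    return sum(series[-k:]) / k

# forecast for week 7 using k = 3: average of weeks 4-6
f7 = ma_forecast(sales, 3)
```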
weighted MA
assign weights a1, a2, ..., ak to the most recent k observations. the weights are chosen by the user, and usually assign increasing weight to more recent observations
yhatt = (a1yt-1 + a2yt-2 + ... + akyt-k) / (a1 + a2 + ... + ak)
simple exponential smoothing
the forecast for time period t is a weighted average of the most recent prior observation in the time series yt-1 and the forecast of that observation yhatt-1
yhatt=wyt-1 + (1-w)yhatt-1
w (omega)
smoothing constant used in simple exponential smoothing
0<=w<=1
smaller values of this result in smoother forecasts
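The smoothing recursion yhatt = wyt-1 + (1-w)yhatt-1 can be sketched as follows; the series, the value of w, and the choice to initialize the first forecast at the first observation are all illustrative assumptions:

```python
def ses(series, w):
    """Simple exponential smoothing: yhat_t = w*y_{t-1} + (1-w)*yhat_{t-1}.
    Returns the one-step-ahead forecast for the period after the series ends.
    The first forecast is initialized to the first observation (a common choice)."""
    yhat = series[0]
    for y_prev in series:
        yhat = w * y_prev + (1 - w) * yhat
    return yhat

f_next = ses([20.0, 21.0, 19.0, 23.0], w=0.4)
```

A smaller w leans more heavily on the prior forecast, which is why it produces a smoother forecast series.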
forecast error
for a given time period t, the forecast error is the difference between the actual value of the time series and the value forecasted for that time period
et = yt-yhatt
essentially same as a residual
forecast accuracy measures
mean square error: MSE = SSE/n = sum(et^2)/n
mean absolute deviation: MAD = sum(|et|)/n
mean absolute percentage error: MAPE = sum(|et/yt|)/n x 100
units of MSE and MAD and MAPE
the denominator is n. this is the number of errors (et) used to obtain the numerator
MSE is measured in the time series units^2
MAD is measured in the units of the time series
The MAPE as a percentage has no units
smaller is better
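The three accuracy measures can be computed together; a sketch with made-up actuals and forecasts:

```python
# hypothetical actual values and forecasts for n = 4 periods
actual   = [100.0, 110.0, 105.0, 120.0]
forecast = [ 98.0, 112.0, 103.0, 121.0]

errors = [y - f for y, f in zip(actual, forecast)]  # e_t = y_t - yhat_t
n = len(errors)

mse  = sum(e ** 2 for e in errors) / n                            # units^2
mad  = sum(abs(e) for e in errors) / n                            # original units
mape = sum(abs(e / y) for e, y in zip(errors, actual)) / n * 100  # percent, unitless
```

For all three, smaller is better when comparing competing forecasting methods on the same series.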
effect of dummy variable
to change the intercept for each group. this is a powerful result: it allows us to fit a separate regression line simultaneously to each group
can account for differences in the intercepts of different groups but not the slopes of groups
categories with different slopes
to adjust slope, add an interaction term
transformations
can often be quite effective for correcting violations of data requirements
making the distribution of a variable more symmetric
making the spread of the distribution of a variable or of several groups more homogeneous
quantitative and quantitative
regression
quantitative and categorical
1 sample t/z test for means
2 sample t/z test for means
anova
regression
categorical and categorical
1 sample z test for proportions
2 sample z test for proportions
chi square
two way anova flow - is there an interaction
yes: interpret the means plot (and/or do multiple comparison tests)
no: are there main effects? - yes: do multiple comparison tests
competing goals of model selection
high adjusted r squared
small s
significant predictors
least squares regression
finds the sample coefficients b0 and b1 to define yhat = b0 + b1x by minimizing sum of ei^2
correlation
measures strength of linear association between two quantitative variables
r= Sxy/SxSy
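r = Sxy/(SxSy) can be computed directly from the definitions; a sketch with made-up data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# sample covariance S_xy divided by the product of the sample std devs
sxy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
r = sxy / (x.std(ddof=1) * y.std(ddof=1))
```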
coefficient of determination
R^2 = SSR/SST
proportion of variability in Y that is explained by the regression
simple linear regression
1. define the two relevant quantitative variables = dependent Y and predictor X
2. start with a scatterplot and make note of direction, form, strength (correlation) and unusual features
3. least squares regression
4. assess goodness of fit of the equation to the data: SST = SSR + SSE, coefficient of determination, standard error of the regression equation s = sqrt(MSE), where MSE = SSE/(n-k-1) and k = number of predictors
test significance of regression equation
Ho: B1 = 0 and Ha: B1=/0
t test: if |tobs| = |(b1 - 0)/sb1| > t*(n-k-1), then reject Ho
confidence interval estimates for the coefficients
bj +/- t*sbj
confidence interval estimates for muY|x (the mean of Y at a given x)
yhat +/- t*SE(muhat)
prediction interval for y given x
yhat +/- t*SE(yhat)
the standard error for the prediction of y is always larger than the standard error for the estimate of the mean at the same x
the intervals are narrowest at xbar
multiple regression
1. identify relevant variables, response and potential predictors
2. gather data: experiment or observation/survey
3. assess relationships among variables using scatterplots or correlational analysis (look for multicollinearity)
4. create beginning model
5. assess strength of model = measure goodness of fit using coefficient of determination and standard error of the estimate
6. validate model assumptions before performing classical inference (residual analysis as in SLR)
7. test model significance: Ho: B1 = B2 = ... = Bk = 0 and Ha: not all Bj = 0, using the F statistic
8. test individual predictor significance: Ho: Bj = 0 for each quantitative predictor using the t statistic; goal is parsimony
multicollinearity
occurs when two or more predictor X variables are highly correlated to each other
coefficients are poorly estimated, reflected by large standard errors
essentially some variables are redundant
can see conflicting results between the F test of model significance and the t tests of predictor significance
measure degree of multicollinearity with VIFj = 1/(1 - R^2j), where R^2j comes from regressing Xj on the other predictors
VIF > 5 causes concern; VIF > 10 indicates severe multicollinearity
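The VIF for each predictor can be sketched by regressing it on the others; the data below is simulated so that x1 and x2 are nearly collinear while x3 is independent:

```python
import numpy as np

def vif(X, j):
    """VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j of X on all the other predictor columns (with intercept)."""
    n = X.shape[0]
    others = np.delete(X, j, axis=1)
    Xd = np.column_stack([np.ones(n), others])
    b, *_ = np.linalg.lstsq(Xd, X[:, j], rcond=None)
    resid = X[:, j] - Xd @ b
    r2 = 1 - (resid @ resid) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
    return 1 / (1 - r2)

# simulated predictors: x2 is nearly a copy of x1, x3 is unrelated
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(scale=0.1, size=50)
x3 = rng.normal(size=50)
X = np.column_stack([x1, x2, x3])
```

Here vif(X, 0) and vif(X, 1) come out large (x1 and x2 are redundant) while vif(X, 2) stays near 1.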
nonlinear pattern between Y and X
option 1: model a polynomial equation to match the nonlinear pattern, this will be the better option if the curve is not monotonic
option 2: transform one or both variables in an attempt to remove the curvature and produce a linear pattern; the curvature must be monotonic increasing or monotonic decreasing; use the Tukey transformation circle for guidance on the type of transformation (up the ladder increases power, down the ladder decreases power)
tukey transformation circle
left up: x down y up
left down: x and y down
right up: x and y up
right down: x up y down
up the ladder: square, cube
down the ladder: sqrt, log, ln, negative reciprocal (-1/x)
up the ladder increases power down the ladder decreases power
interaction
between an indicator variable and quantitative predictor is used to create a different slope on the quantitative predictor for two or more categories of the categorical predictor
create an interaction term by multiplying the value of the predictor and the value of an indicator
the coefficient on the interaction term represents how much faster y is estimated to change for a 1-unit increase in x in the category that has the interaction than in the base category
indicator (dummy) variables
used to code a categorical predictor for use in a regression equation
given c categories define c-1 indicator variables
defined for a specific category: takes the value 1 if the observation is in that category and 0 otherwise
category not included in the regression equation is the base category that is defined in the equation by the value 0 being substituted for each of the indicators
the regression coefficient on a dummy variable identifies the difference in the average response variable for that category and the base category (represents change in the intercept)
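Coding c - 1 indicators for a c-category predictor can be sketched as follows; the category labels are hypothetical:

```python
# hypothetical categorical predictor with c = 3 categories;
# "A" is chosen as the base category, so c - 1 = 2 indicators are needed
region = ["A", "B", "C", "B", "A", "C"]

d_b = [1 if r == "B" else 0 for r in region]  # indicator for category B
d_c = [1 if r == "C" else 0 for r in region]  # indicator for category C
# the base category "A" is represented by d_b = 0 and d_c = 0
```

The coefficients on d_b and d_c would then estimate how the average response in B and C differs from the base category A.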
autoregression
forecasting model
the independent variables are the time lagged values of the dependent variable