POLS 306 Final
A __________ is one that is so great that it has a low probability of having occurred by chance.
significant difference
The __________ is the area or portion in a sampling distribution that contains all the values that allow you to reject the null hypothesis.
critical region
Which of the following is not one of the critical values commonly used in applied statistics?
.95
If we fail to reject a false null hypothesis, we have committed a __________.
type II error
The normal curve is a __________; it gives you a statement of the probabilities associated with various portions of the curve.
probabilistic distribution
What does an extreme case in a probability distribution mean?
A low probability of occurrence
What is the denominator of a t-statistic for the difference between a sample mean and a population mean?
The Standard Error of the Mean
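As a rough illustration of the card above, the t-statistic divides the gap between the sample mean and the population mean by the standard error of the mean. The sketch below uses made-up scores and a hypothetical population mean of 50; the function name `one_sample_t` is an assumption for this example, not something from the course text.

```python
import math
import statistics


def one_sample_t(sample, mu):
    """Return t = (sample mean - population mean) / standard error of the mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample standard deviation (divides by n - 1)
    standard_error = s / math.sqrt(n)      # the denominator of the t-statistic
    return (sample_mean - mu) / standard_error


# Made-up data for illustration only.
scores = [52, 55, 49, 58, 61, 54, 50, 57]
t_stat = one_sample_t(scores, mu=50)
print(f"t = {t_stat:.3f} with {len(scores) - 1} degrees of freedom")
```

The computed t would then be compared with the critical value for n - 1 degrees of freedom.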
We use a __________ because we want to estimate the value of a population parameter, but we can use a __________ to test hypotheses.
confidence interval, t-test
Which of the following is not one of the ways Caldwell describes a null hypothesis?
A statement of uncertainty
Which of the following is not one of the steps required to test the difference between a sample mean and the population mean?
Calculate a confidence interval for the mean
In the test for the difference of means, independent sample design, the number of cases in each sample must be equal.
False
In the matched samples design (the test involving the mean difference) the number of cases in each sample must be equal.
True
Which of the following is not one of the assumptions necessary for a difference of means test?
The independent variable causes the dependent variable.
Which t-test is used for the following research question:
Participants in a study are randomized into two groups. One group of people reads a story about a terrorist attack and another reads a control story about a sporting event.
Independent Samples
__________ are samples involving cases or subjects that share certain characteristics in common.
Matched samples
What does D with a bar over it represent?
The mean of the differences between two matched groups
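A minimal sketch of the matched samples test, assuming the usual form t = D-bar / (s_D / sqrt(n)), where s_D is the standard deviation of the paired differences. The before and after scores are invented for illustration. Because the scores are paired case by case, both lists must be the same length.

```python
import math
import statistics

# Made-up before/after scores for five matched cases.
before = [4, 5, 6, 3, 7]
after = [6, 5, 8, 4, 9]

# One difference per matched pair.
differences = [a - b for a, b in zip(after, before)]

d_bar = statistics.mean(differences)            # D-bar: the mean of the differences
s_d = statistics.stdev(differences)             # standard deviation of the differences
se_d = s_d / math.sqrt(len(differences))        # standard error of the mean difference
t_stat = d_bar / se_d

print(f"D-bar = {d_bar:.2f}, t = {t_stat:.2f}, df = {len(differences) - 1}")
```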
Which one of the following is not part of the formula for the standard error of the difference of independent means?
The difference between the sample means
Which t-test is used for the following research question:
Independent samples are drawn from two populations, and then individuals are paired based on a series of demographic characteristics including sex, age, and race. Within each pair, one member is randomly assigned a treatment story to read and the other member receives a control story.
Matched Samples
__________ are samples selected in such a way that the selection of cases or subjects included in one sample has no connection to or influence on the selection of cases or subjects in the other sample.
Independent samples
Which t-test is used for the following research question:
People are asked for their opinion of Sweden, then asked to read a story about Sweden, and then asked for their opinion of Sweden again.
Matched Samples
Which of the following hypothesis test types is most common in applied research?
Two-tailed tests
Which of the following is not one of the ways we can increase the power of a statistical test?
We can choose a different distribution.
What type of hypothesis is the following hypothesis?
Economic sanctions are related to presidential approval.
non-directional hypothesis
What type of hypothesis is the following hypothesis?
Democracies are less likely to fight wars with one another than non-democracies.
directional hypothesis
The __________ of a statistical test is the ability of the test to reject a false null.
power
When we're working with a non-directional hypothesis, we're in a __________ scenario.
two-tailed test
Which of the following is a type II error?
The null hypothesis is false and we fail to reject the null.
Why do researchers prefer two-tailed tests?
The tests are more conservative because it's harder to reject the null.
The _________ is a statement of equality (no difference) or chance.
null hypothesis
Which of the following is not one of the ways a directional hypothesis changes the way we approach hypothesis testing?
We use different distributions
The __________ is calculated by finding the deviation of each score in a sample from the sample mean, squaring the deviations, adding the squared deviations for each sample, and summing across all the samples.
SSW
When do we reject the null in an ANOVA?
When the between group variation exceeds the within group variation.
The __________ is an expression of the amount of deviation of sample scores from sample means.
within group variation
Where can the critical values for an ANOVA test be found in Caldwell (2013)?
Appendix D
The __________ is calculated by finding the deviation of each sample mean from the grand mean, squaring the deviation, weighting the squared deviation for each sample, and summing across the samples.
SSB
What best describes the F-ratio?
A measure of the relative variation between groups over the variation within groups.
The __________ is an expression of the amount of deviation of sample means from the grand mean.
between group variation
Which of the following is not one of the steps necessary to calculate an F statistic?
Calculate the standard deviations for each group.
Which of the following is not a possible F-statistic?
-10
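The ANOVA cards above can be tied together in a short sketch: SSW sums squared deviations of scores from their own sample mean, SSB weights each sample mean's squared deviation from the grand mean by the sample size, and F is the ratio of the two mean squares. Because both quantities are built from squares, F cannot be negative. The three groups below are made-up data, and the function name `f_ratio` is an assumption for this example.

```python
def f_ratio(groups):
    """Return (SSW, SSB, F) for a list of samples."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Within-group sum of squares: each score against its own sample mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Between-group sum of squares: each sample mean against the grand mean,
    # weighted by the number of cases in that sample.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    msb = ssb / df_between
    msw = ssw / df_within
    return ssw, ssb, msb / msw


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # made-up data
ssw, ssb, f_stat = f_ratio(groups)
print(f"SSW = {ssw:.2f}, SSB = {ssb:.2f}, F = {f_stat:.2f}")
```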
When should ANOVA be used?
When there are more than two groups and a single continuous variable.
A Chi-square test can tell us ___________.
whether two variables are associated.
Where can you find the critical values for the chi-square test in the Caldwell (2013) text?
Appendix H
Which of the following is not one of the steps necessary to calculate a chi-square test?
calculate the means for each variable
Which of the following types of data would we not use a chi-squared test for?
Continuous Data
A __________ is a classification tool that reveals the various possibilities in the comparison of variables.
contingency table
How does a large difference between an expected and observed frequency influence the outcome of a chi-square test?
Increases the likelihood of rejecting a null.
The __________ are the frequencies that would occur by chance in each cell of a contingency table, given the marginal totals.
expected frequencies
How do you calculate the degrees of freedom for a chi-square test?
(r - 1) x (c - 1)
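The chi-square cards above can be sketched as follows: expected frequencies come from the marginal totals (row total times column total, divided by N), each cell contributes (observed - expected)^2 / expected, and the degrees of freedom are (r - 1) x (c - 1). The 2 x 2 table is made-up data, and the function name `chi_square` is an assumption for this example.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Frequency expected by chance, given the marginal totals.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


observed = [[10, 20], [30, 40]]   # made-up contingency table
chi2, df = chi_square(observed)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

Larger gaps between observed and expected frequencies push the statistic up, which makes rejecting the null more likely.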
It is important to remember that __________ is something that largely exists in our minds - it's a model or an explanation that we sometimes mistakenly impose on our data or results. Except in highly controlled experimental research situations, it's difficult to make legitimate claims about __________.
causation
Which of the following is not a value that a chi-square statistic could take?
-10
Who developed correlation analysis?
Karl Pearson
What word describes the following relationship?
As the values increase along the x-axis of the scatter plot,
the values increase along the y-axis and then decrease along the y-axis.
Curvilinear
What word describes the following relationship?
As the values increase along the x-axis of the scatter plot,
the values decrease along the y-axis of the scatter plot.
Negative
Which of the following is not one of the things that a scatter plot can tell us about a relationship?
The importance of the relationship
Which statistic that we have discussed previously is closely related to correlation analysis?
Z
A __________ allows us to simultaneously view the values of two variables on a case-by-case basis.
scatter plot
Which one of the following is not one of the steps required to calculate r?
Square the Z values for each value of X and Y
The __________ is a measure of the explained variance - the amount of variation in one variable that is attributable to variation in another variable.
coefficient of determination
What is the independent variable commonly referred to in statistical models?
X
Which of the following is not a value that r can take?
2
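A minimal sketch of the Z-score route to r, assuming sample standard deviations and a divisor of n - 1 (some texts use population standard deviations and divide by n; the two conventions give the same r when applied consistently). The data are made up. Since r is bounded by -1 and 1, a value such as 2 is impossible, and squaring r gives the coefficient of determination.

```python
import statistics


def pearson_r(xs, ys):
    """Pearson's r computed from the cross-products of Z scores."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    # Multiply the paired Z scores, sum the products, divide by n - 1.
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)


x = [1, 2, 3, 4, 5]   # made-up data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r ** 2    # coefficient of determination
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```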
__________ occurs when the stochastic terms for any two or more cases are systematically related to each other.
Autocorrelation
In two-variable regression, we use information from the __________ to make inferences about the unseen __________.
sample regression model, population regression model
The __________ are our best guesses of the unseen population parameters in the regression model.
parameter estimates
The basic idea of two-variable regression is that __________.
we are fitting the "best" line through a scatter plot of data.
The __________ the value of the standard error of the regression, the __________ the values of the variance and the standard error of the slope parameter.
larger, larger
For a given covariance between X and Y, the __________ spread out X is, the __________ steep the estimated slope of the regression line.
more, less
The __________ ranges between zero and one, indicating the proportion of the variation in the dependent variable that is accounted for by the model.
R-Squared
The __________ is equal to the difference between the actual value of the dependent variable and the predicted value of the dependent variable from our regression model.
estimated stochastic component
What is the method one uses to draw the line of best fit when estimating an ordinary least squares regression model?
Minimizing the sum of the squared residuals
The __________ provides a measure of the average accuracy of the model in the metric of the dependent variable.
Root Mean-Squared Error
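The two-variable regression cards above can be sketched in a few lines: the slope is the covariation of X and Y divided by the variation in X, residuals are actual minus predicted values, and R-squared and the root mean-squared error summarize fit. The data are made up, and the function name `fit_line` is an assumption for this example. The root MSE here divides by n - 2 (the residual degrees of freedom), which is one common convention; some texts divide by n instead.

```python
import math


def fit_line(xs, ys):
    """Fit Y = a + bX by least squares; return (slope, intercept, R-squared, root MSE)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)

    slope = sxy / sxx                        # more spread in X means a flatter slope
    intercept = mean_y - slope * mean_x      # the line passes through the two means

    # Residual: actual value minus predicted value.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ssr = sum(e ** 2 for e in residuals)     # the quantity least squares minimizes
    sst = sum((y - mean_y) ** 2 for y in ys)

    r_squared = 1 - ssr / sst
    root_mse = math.sqrt(ssr / (n - 2))      # assumption: n - 2 divisor
    return slope, intercept, r_squared, root_mse


x = [1, 2, 3, 4, 5]   # made-up data
y = [2, 4, 5, 4, 5]
slope, intercept, r_squared, root_mse = fit_line(x, y)
print(f"Y-hat = {intercept:.2f} + {slope:.2f}X, R-squared = {r_squared:.2f}, root MSE = {root_mse:.3f}")
```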
What is the essential incongruity between political phenomena and the statistical models we use to study them?
Most political phenomena have more than one cause but our causal models focus on a single variable.
How can we make the relationships between variables bigger?
None of the above
One can calculate a __________ for an independent variable by multiplying a __________ by the ratio of the standard deviation of the independent variable to the standard deviation of the dependent variable.
standardized coefficient, unstandardized coefficient
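A small sketch of the conversion described above, using made-up data. The unstandardized slope of 0.6 is the least squares slope for these particular values; the function name `standardized_coefficient` is an assumption for this example.

```python
import statistics


def standardized_coefficient(b, xs, ys):
    """Multiply an unstandardized coefficient by sd(X) / sd(Y)."""
    return b * statistics.stdev(xs) / statistics.stdev(ys)


x = [1, 2, 3, 4, 5]   # made-up independent variable
y = [2, 4, 5, 4, 5]   # made-up dependent variable
b = 0.6               # unstandardized slope from regressing y on x

beta = standardized_coefficient(b, x, y)
print(f"standardized coefficient = {beta:.3f}")
```

With a single independent variable, the standardized coefficient equals the correlation between X and Y.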
What does the new formula for beta-hat in Chapter 10 not account for?
The effects of variables not included in the model.
Looking at Table 10.1, what proportion of the variation in vote percentage for the incumbent party is explained by the growth rate of the U.S. economy and good news?
44 percent
If a scatter plot in two dimensions (X and Y) suggests a formula for a __________, then adding a third dimension suggests the formula for a __________.
line, plane
What happens when we fail to include a relevant cause of Y in our regression model?
Bad things
__________ is a judgement about whether or not a ___________ relationship is "large" or "small" in terms of its real world impact.
Substantive significance, statistical significance
What we call __________ is not as strong as __________.
statistical control, experimental control
When do you automatically control for alternative possible causes of your outcome?
When you conduct an experiment
If, hypothetically, we erased the circle for Z from Figure 10.1, we would (incorrectly) attribute all of the area of __________ to X.
B, D
Looking at Figure 10.1, what areas represent the covariation between X and Y?
B, D
When is omitted variable bias likely to be a big problem?
The parameter for Z is big and the correlation between X and Z is strong.
The specific type of bias that results from the failure to include a variable that belongs in our regression model is called __________.
omitted variable bias
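A deterministic sketch of omitted variable bias, using made-up data with no noise. It assumes a hypothetical true model Y = 1 + 2X + 3Z. Regressing Y on X alone gives a slope that absorbs part of Z's effect: the bias equals the parameter for Z times the slope from regressing Z on X, so it grows when the parameter for Z is large and X and Z are strongly related.

```python
def slope(xs, ys):
    """Least squares slope from regressing ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up data: X and Z are positively related.
x = [1, 2, 3, 4, 5]
z = [1, 1, 2, 3, 3]

# Hypothetical true model with no stochastic term: Y = 1 + 2X + 3Z.
true_beta_x, true_beta_z = 2, 3
y = [1 + true_beta_x * xi + true_beta_z * zi for xi, zi in zip(x, z)]

short_slope = slope(x, y)                # slope on X when Z is left out
bias = true_beta_z * slope(x, z)         # parameter for Z times the X-Z relationship

print(f"true slope = {true_beta_x}, estimated slope without Z = {short_slope:.2f}, bias = {bias:.2f}")
```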
__________ occurs when the expected value of the parameter estimate that we obtain from a sample is not equal to the true population parameter.
Bias
Which of the following is not one of the reasons that Kellstedt and Whitten highlight for why you should not just copy results from your preferred software program?
You may be judged for the software you use.
What are the consequences of perfect multicollinearity?
The model cannot be estimated.
In a regression of Y on X and Z, the plane of the regression travels through the means of which variables?
All of the above
Which of the following is not one of the things that Kellstedt and Whitten (2018) suggest should be included in your regression results?
The interpretation of the results
When does omitting Z not cause bias problems?
When X and Z are not related
Categorical independent variables that take on one of two possible values are commonly referred to as __________.
dummy variables
Which of the following is not an example of a dummy variable?
A variable that can take values between 0 and 1
How do you correctly include a categorical variable with multiple levels in an OLS regression model?
Code the categories as a series of dummy variables and include all but one.
The value of the independent variable for which we do not include a dummy variable is known as the __________.
reference category
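A minimal sketch of the coding rule above: a categorical variable with k levels becomes k - 1 dummy variables, and the level left out is the reference category. The party labels, the `is_` prefix, and the function name `dummy_code` are all assumptions made for this example.

```python
def dummy_code(values, reference):
    """Code a categorical variable as 0/1 dummies, leaving out the reference category."""
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Made-up categorical variable with three levels.
party = ["Democrat", "Republican", "Independent", "Democrat"]
rows = dummy_code(party, reference="Independent")

for label, row in zip(party, rows):
    print(label, row)
```

Cases in the reference category score 0 on every dummy, so the coefficient on each dummy is read relative to that category. Including a dummy for every level alongside an intercept would create perfect multicollinearity.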
A __________ is a survey respondent's answer to a question about how they feel toward particular individuals or groups on a scale that typically runs from 0 to 100.
thermometer rating
Which of the following is true about the category from a categorical variable for which there is not a variable included in the model?
All of the above
To avoid the dummy-variable trap, we have to __________.
omit one of our dummy variables
How do we interpret the intercept in Model 1 of Table 11.1?
The estimated value of the dependent variable for a low-income man.
In an ordinary least squares regression model, we interpret each parameter estimate as the estimated effect of a one-unit increase in that particular independent variable on the dependent variable, __________.
while controlling for the effects of all other independent variables.
Looking at the example regression shown in Figure 11.1, why is the "female" variable "omitted" in the regression output?
All of the above
Which of the following is not one of the consequences of high multicollinearity?
Bias
__________ occurs when two or more of the independent variables in the model are extremely highly correlated.
High multicollinearity
Which of the following is not one of the causes of perfect multicollinearity highlighted by Kellstedt and Whitten (2018)?
Omitted variables
__________ are calculated as the change in the parameter estimate when each case is removed, divided by the standard error of the original parameter estimate.
DFBeta scores
What should you do if you find evidence of multicollinearity?
Collect more data
A case has large __________ when it has unusual independent variable values.
leverage
The higher the _________ value, or the lower the __________, the higher will be the estimated variance of the coefficient for the variable being evaluated in the model.
VIF, tolerance index
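A short sketch of how the two diagnostics relate, assuming the standard definitions: the tolerance for a variable is 1 minus the R-squared from regressing that variable on the other independent variables, and the VIF is the reciprocal of the tolerance. The auxiliary R-squared values below are made up, and the function names are assumptions for this example.

```python
def tolerance(aux_r_squared):
    """Tolerance: share of a variable's variation not explained by the other predictors."""
    return 1 - aux_r_squared


def vif(aux_r_squared):
    """Variance inflation factor: reciprocal of the tolerance."""
    return 1 / tolerance(aux_r_squared)


# Made-up auxiliary R-squared values, from low to high collinearity.
for aux in (0.0, 0.5, 0.9):
    print(f"auxiliary R-squared = {aux:.1f}: tolerance = {tolerance(aux):.2f}, VIF = {vif(aux):.2f}")
```

As the auxiliary R-squared approaches 1, the tolerance approaches 0 and the VIF grows without bound, which matches the card: a higher VIF or a lower tolerance means a larger estimated variance for that coefficient.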
Which cases are highly influential?
Cases with high residual and high leverage
An __________ is an extreme value relative to the other values for that variable.
outlier
Adding more data will alleviate __________, but not __________.
multicollinearity, omitted variable bias
