Statistics final
The Alternative Hypothesis
A statistical hypothesis that states there is a difference between a parameter and a specific value, or between two parameters. Example: H1: p1 < p2
Type I error
Rejecting the null hypothesis when it is true
Type II error
The null hypothesis is not rejected when it is false
The critical region
The range of test values that indicates that there is a significant difference and that the null hypothesis should be rejected
A one-tailed test
Is used when the null hypothesis should be rejected if the test value is in the critical region on one side of the mean
A two-tailed test
is used when the null hypothesis should be rejected if the test value is in the critical region on either side of the mean
Alternative hypothesis
The research hypothesis
A statistical test
uses data obtained from a sample to decide whether or not to reject the null hypothesis
Statistical evidence can be used to reject a claim if the claim is the null hypothesis.
Reject the null hypothesis
The level of significance
is the maximum probability of committing a type I error. Symbolized by α
The numerical value obtained from a statistical test
test value or test statistic
Noncritical or nonrejection region
the range of test values that indicates that the difference was probably due to chance and that the null hypothesis should not be rejected
No error is committed when the null hypothesis is rejected when it is false
True
For the t test, use "s" (the sample standard deviation) instead of
σ
Rejecting the null hypothesis when it is true
Type I error
A conjecture about a population parameter is called a
statistical hypothesis
To test the claim that the mean is greater than 87
use a right-tailed test
The degrees of freedom for the t test are
n-1
The z test
A statistical test for the mean of a population. Used when σ is known and either n ≥ 30 or the population is normally distributed
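A minimal sketch of the z test for a mean, assuming the usual test-value formula z = (sample mean − hypothesized mean) / (σ / √n). The function name and the sample figures are hypothetical, chosen only to illustrate a right-tailed test of the claim that the mean is greater than 87.

```python
from math import sqrt
from statistics import NormalDist


def z_test_statistic(sample_mean, mu0, sigma, n):
    """Test value for the z test: (sample mean - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# Hypothetical data: H0: mu = 87, H1: mu > 87 (right-tailed test)
z = z_test_statistic(sample_mean=89.0, mu0=87.0, sigma=6.0, n=36)

# Right-tailed P-value: area under the standard normal curve to the right of z
p_value = 1.0 - NormalDist().cdf(z)

print(round(z, 2), round(p_value, 4))
```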
The P-value (or probability value)
the probability of getting a sample statistic (such as the mean), or a more extreme sample statistic, in the direction of the alternative hypothesis when the null hypothesis is true.
P-Value rule
If P-value ≤ α, reject the null hypothesis.
If P-value > α, do not reject the null hypothesis.
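The P-value rule can be sketched as a small function. The name `decide` and the example values are hypothetical; only the comparison with α comes from the rule itself.

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the P-value rule: reject H0 when P-value <= alpha."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"


# Hypothetical examples
print(decide(0.0048, 0.01))
print(decide(0.0800, 0.05))
```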
T-test
is a statistical test for the mean of a population and is used when the population is normally or approximately normally distributed and σ is unknown.
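A minimal sketch of the t test value, assuming the usual formula t = (sample mean − hypothesized mean) / (s / √n) with n − 1 degrees of freedom. The function name and the five data values are hypothetical. The Python standard library has no t distribution, so the critical value would still come from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def t_test_statistic(sample, mu0):
    """Return (t, degrees of freedom) for a one-sample t test."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean(sample) - mu0) / (s / sqrt(n))
    return t, n - 1


# Hypothetical sample, testing H0: mu = 87
t, df = t_test_statistic([88, 92, 85, 91, 89], mu0=87)
print(round(t, 3), df)
```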
z-test for a proportion
The sample is a random sample.
The conditions for a binomial experiment are satisfied.
np ≥ 5 and nq ≥ 5.
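A short sketch of the np ≥ 5 and nq ≥ 5 check, followed by the test value for a proportion, assuming the usual formula z = (p̂ − p) / √(pq / n). The function names and the figures (46 successes in 100 trials, hypothesized p = 0.40) are hypothetical.

```python
from math import sqrt


def conditions_met(n, p):
    """Check the requirement np >= 5 and nq >= 5, where q = 1 - p."""
    q = 1.0 - p
    return n * p >= 5 and n * q >= 5


def proportion_z(x, n, p):
    """Test value for a proportion: (p_hat - p) / sqrt(p * q / n)."""
    p_hat = x / n
    q = 1.0 - p
    return (p_hat - p) / sqrt(p * q / n)


# Hypothetical data
if conditions_met(n=100, p=0.40):
    print(round(proportion_z(x=46, n=100, p=0.40), 2))
```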
t-test
used to test the difference between two means when the population standard deviations are not known
use the t-test for two independent means when
The samples are random samples.
The sample data are independent of one another.
When the sample sizes are less than 30, the populations must be normally or approximately normally distributed
Independent samples when using the z test
the subjects selected for the first sample in no way influence the way the subjects are selected in the second sample.
Dependent samples
when the selection of subjects for the first group in some way influenced the selection of the subjects for the other group
q=
1-p
When the test value falls in the critical region reject the null hypothesis
There is enough evidence to reject the claim
P-value for a two-tailed test
Table area for the z test value = 0.9976
1.0000 - 0.9976 = 0.0024
2(0.0024) = 0.0048, doubled because the test is two-tailed
P-value = 0.0048
Reject the null hypothesis because the P-value 0.0048 is less than α
α = 0.01
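The arithmetic in this worked example can be checked with a few lines. The variable names are illustrative; the numbers are the ones in the card. Note that the value compared with α is the doubled tail area, 0.0048.

```python
area_left = 0.9976   # table area for the z test value
alpha = 0.01

tail_area = 1.0 - area_left   # area in one tail
p_value = 2 * tail_area       # doubled because the test is two-tailed

print(round(tail_area, 4), round(p_value, 4))
print("reject H0" if p_value <= alpha else "do not reject H0")
```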
Symmetric distribution
A distribution in which the data values are evenly distributed on both sides of the mean
Negatively or left-skewed distribution
A distribution in which the majority of the data values fall to the right of the mean
Positively or right-skewed distribution
A distribution in which the majority of the data values fall to the left of the mean
Normal distribution
A continuous, symmetric, bell-shaped distribution of a variable used to study approximately normal variables
Shape and position of a normal distribution curve
Depend on the mean and the standard deviation
Properties of the theoretical normal distribution
The curve is bell-shaped
The mean, median and mode are equal and located at the center of the distribution
Is unimodal
Is symmetric about the mean
Is continuous
The curve never touches the x-axis
The total area under a normal distribution curve is equal to 1.00 or 100%
The area under the part of a normal curve that lies within 1 standard deviation of the mean is approximately 68%; within 2 standard deviations, about 95%; and within 3 standard deviations, about 99.7%.
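The 68%, 95% and 99.7% figures can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8+). The helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```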
Standard normal distribution
A normal distribution for which the mean is equal to 0 and the standard deviation is equal to 1.
Find the area to the left of 1.23
Use table E
Find 1.2 in the left column and 0.03 in the top row. They meet at the area 0.8907
Find the area to the right of z=-0.32
Use table E
Look for -0.3 in the left column and 0.02 in the top row.
They meet at 0.3745
Subtract from 1 to get 0.6255
Find the probability of P(z<1.23)
Same as find the area to the left of 1.23
In the standard normal distribution table =0.8907 or 89.07%
Find the probability of P(z>-0.32)
Same as the area to the right of z=-0.32
subtract the area from 1 to get 0.6255 or 62.55%
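As a cross-check on the table lookups for z = 1.23 and z = -0.32, the same areas can be computed with `statistics.NormalDist` (Python 3.8+). This is a sketch that stands in for Table E, not a replacement for the table method.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(z < 1.23): area to the left of 1.23
area_left = std_normal.cdf(1.23)

# P(z > -0.32): 1 minus the area to the left of -0.32
area_right = 1.0 - std_normal.cdf(-0.32)

# Areas and probabilities are rounded to four decimal places
print(round(area_left, 4), round(area_right, 4))
```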
Find the z value that corresponds to the area 0.4175
0.5-0.4175=0.0825
the closest value is 0.0823
z=-1.39
Find the z value that corresponds to the area 0.4066
0.4066+0.5=0.9066
z=1.32
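The reverse lookups can be sketched with `NormalDist.inv_cdf`, which returns the z value for a given area to the left. The sketch assumes, as the two worked answers imply, that 0.4175 is an area between z and the mean on the left side and 0.4066 is an area between the mean and z on the right side.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Area 0.4175 between z and the mean, on the left: area to the left of z is 0.5 - 0.4175
z_left = std_normal.inv_cdf(0.5 - 0.4175)

# Area 0.4066 between the mean and z, on the right: area to the left of z is 0.5 + 0.4066
z_right = std_normal.inv_cdf(0.5 + 0.4066)

# z values are rounded to two decimal places
print(round(z_left, 2), round(z_right, 2))
```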
Find the z value for 0.0239
Rounding rules
Always round z values to two decimal places, area and/or probability to four decimal places
Sampling distribution of sample means
A distribution obtained by using the means computed from all possible random samples of a specific size taken from a population
Sampling error
The difference between the sample measure and the population measure because the sample is not a perfect representation of the population
Properties of the distribution of sample means:
The mean of the sample means will be the same as the population mean.
The standard deviation of the sample means will be smaller than the standard deviation of the population. It is equal to the population standard deviation divided by the square root of the sample size (σ/√n).
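The second property can be illustrated with a one-line computation. The function name and the values σ = 15 and n = 25 are hypothetical.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard deviation of the sample means: sigma / sqrt(n)."""
    return sigma / sqrt(n)


# Hypothetical population standard deviation and sample size
print(standard_error(sigma=15.0, n=25))  # smaller than sigma itself
```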
Standard error of the mean
The standard deviation of the sample means taken from the same population
The Central Limit Theorem
As the sample size n increases without limit, the shape of the distribution of the sample means taken with replacement from a population with mean μ and standard deviation σ will approach a normal distribution.
Central limit theorem for individual data and a sample mean
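This card has no answer in the set. A sketch of the usual distinction, stated as an assumption: an individual value is standardized with σ, while a sample mean is standardized with the standard error σ/√n. The function names and the figures (μ = 100, σ = 15, n = 25) are hypothetical.

```python
from math import sqrt


def z_individual(x, mu, sigma):
    """z value for a single data value: (x - mu) / sigma."""
    return (x - mu) / sigma


def z_sample_mean(x_bar, mu, sigma, n):
    """z value for a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))


# The same distance from the mean gives a larger z value for a sample mean
print(round(z_individual(106, mu=100, sigma=15), 2))
print(round(z_sample_mean(106, mu=100, sigma=15, n=25), 2))
```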
Rounding rules for the confidence interval
Round to one more decimal place than the sample mean
Point estimate
A specific numerical value estimate of a parameter.
Properties of a good estimator
Unbiased
Consistent
Efficient (smallest variance)
Interval estimate
A range of values used to estimate a parameter
Confidence Level
The probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and the estimation process is repeated. With confidence...
Confidence interval
A specific interval estimate of a parameter.
Rounding rules for sample size
Round up to the next whole number
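A sketch of the round-up rule, assuming the common sample-size formula for estimating a mean, n = (z · σ / E)², which is not stated in this set. The function name and the values σ = 5, E = 2 and 95% confidence are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def minimum_sample_size(sigma, margin_of_error, confidence):
    """Smallest whole-number n with n >= (z * sigma / E)^2, always rounding up."""
    # Two-tailed critical value: area to the left is 1 - (1 - confidence) / 2
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    n_exact = (z * sigma / margin_of_error) ** 2
    return ceil(n_exact)


# The exact value here is just over 24, so the answer rounds up to the next whole number
print(minimum_sample_size(sigma=5.0, margin_of_error=2.0, confidence=0.95))
```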
the t distribution
It is bell shaped
Symmetric about the mean
Mean, median and mode are equal to 0 and located at the center of the distribution
The curve never touches the x-axis
The variance is greater than 1
It is a family of curves based on the concept of degrees of freedom
As the sample size increases, the t distribution approaches the standard normal distribution
Degrees of freedom
The number of values that are free to vary after a sample statistic has been computed.
A proportion
A part of a whole, represented by a fraction, a decimal or a percentage. p is the population proportion.
Scatter Plot
A graph of ordered pairs (x,y) of numbers consisting of the independent variable and the dependent variable
Correlation Coefficient
A statistic or parameter that measures the strength and direction of a linear relationship of two quantitative variables
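A minimal sketch of the linear correlation coefficient r, computed from sums of deviations about the means. The function name and the five data pairs are hypothetical.

```python
from math import sqrt


def correlation(xs, ys):
    """Linear correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: a positive r indicates a positive linear relationship
r = correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3))
```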
A lurking variable
Regression Line
The line of best fit of the data.
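A sketch of fitting the line of best fit by least squares, assuming the form y' = a + bx with slope b = Sxy / Sxx and intercept a = mean(y) − b · mean(x). The function name and the data pairs are hypothetical.

```python
def regression_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y' = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data
a, b = regression_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(a, 2), round(b, 2))
```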
Influential Observation
An influential point pulls the regression line toward the point itself.
