387 ch. 13
Key Concepts:
Statistics
The branch of math that collects, analyzes, interprets, and presents numerical data in terms of samples and populations
statistic
The numerical outcomes and probabilities derived from calculations of data. Collection of methods for planning experiments, obtaining data, organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on data.
descriptive statistics
Collection and presentation of data that explain characteristics of variables found in the sample. Mathematical procedures for organizing collections of data based on characteristics of variables found in the sample, such as determining the mean, the median, the range, the variance, and the correlation coefficient. ex. 60% of subjects in the experimental group had fewer colds than those in the placebo group
Inferential statistics
Analysis of data as the basis for prediction related to the phenomenon of interest. Numerical methods used to determine whether research data support a hypothesis or whether results were due to chance. Draw conclusions about the population based on the sample and develop population parameters. ex. statistical significance of the study
population parameters
characteristics of a population that are inferred from characteristics of a sample
sample statistics
numerical data describing characteristics of the sample
univariate analysis
the use of statistical tests to provide information about one variable.
The analysis of a single variable for purposes of description. Examples: frequency distribution, averages, and measures of dispersion; Gender: The number of men in a sample/population and the number of women in a sample/population
Bivariate analysis
relationship between two variables
multivariate analysis
relationship among 3 or more variables
Frequency
how often a variable is found to occur in either grouped or ungrouped data
Ungrouped data
primarily used to present nominal or ordinal data where the raw data represents some characteristic of the variable. rarely used when reporting continuous variables like age, scales, time, physiologic variables. But if rearranged into a frequency on a graph, you can see a distribution that makes more sense
Grouped - Interval and ratio level data
raw data is collapsed into smaller classifications to make data easier to interpret. there is no overlap of categories. may be easier to understand but results in the loss of some information
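Collapsing raw interval-level data into non-overlapping classes can be sketched as follows. The ages, the class width of 5, and the function name `group_into_classes` are illustrative assumptions, not values from the chapter.

```python
from collections import Counter

def group_into_classes(values, width):
    """Count integer values per class interval of the given width.

    Each value maps to exactly one interval, so classes never overlap.
    """
    counts = Counter()
    for value in values:
        lower = (value // width) * width          # start of the interval
        counts[f"{lower}-{lower + width - 1}"] += 1
    return dict(counts)

# Hypothetical ages (assumed data for illustration only).
ages = [18, 19, 21, 22, 25, 27, 31, 34]
grouped = group_into_classes(ages, width=5)
print(grouped)  # {'15-19': 2, '20-24': 2, '25-29': 2, '30-34': 2}
```

The grouped table is easier to read, but the individual ages can no longer be recovered from it, which is the loss of information the card mentions.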
percentage distribution
descriptive statistics used to group data to make results more comprehensible; calculated by dividing the frequency of an event by the total number of events. ex. 3 18-yo boys represent 15% of the study and this provides another way to group information
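The calculation described above (frequency of an event divided by the total number of events) might look like this in code. The sample of 20 participants is assumed so that 3 boys work out to the 15% in the card's example; the function name is hypothetical.

```python
from collections import Counter

def percentage_distribution(values):
    """Return each category's share of all observations, as a percentage."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * count / total for category, count in counts.items()}

# Hypothetical sample: 20 participants, 3 of whom are 18-year-old boys.
sample = ["18-yo boy"] * 3 + ["other"] * 17
dist = percentage_distribution(sample)
print(dist["18-yo boy"])  # 15.0
```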
Frequency distributions are an effective way to present inferential stats?
False.
Categories that are grouped must be mutually exclusive.
True
Percentages are often used to describe characteristics of samples
true
The total number of subjects in a sample is represented by the symbol n.
false. N represents this.
Measures of central tendency
mean, median, mode
Mean
continuous-level data, can be rounded to nearest number
median
continuous-level data, can be rounded to nearest number
mode
used for both continuous and nominal-level data and is never rounded because it is an actual data point. most frequently occurring value. not affected by extreme values
modality
the number of modes found in a data distribution
amodal
without a mode
unimodal
with one mode. ex. age 21. in a symmetrical unimodal distribution (a bell-shaped curve), the mean, median, and mode are equal
bimodal
with two modes. ex. 18-19 years old
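A small sketch of how modality could be determined from raw data. Treating data in which no value repeats as amodal, and the label "multimodal" for more than two modes, are assumptions made for this example; the ages are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:  # assumption: all-unique data is treated as amodal
        return []
    return sorted(v for v, c in counts.items() if c == highest)

def modality(values):
    """Name the distribution by its number of modes."""
    labels = {0: "amodal", 1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes(values)), "multimodal")

print(modality([18, 19, 20]))          # amodal
print(modality([20, 21, 21, 22]))      # unimodal
print(modality([18, 18, 19, 19, 20]))  # bimodal
```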
median
the point at the center of a data set
position of the median
calculated by using the formula (n+1)/2, where n is the number of data values in the set. in groups, the median is determined using cumulative frequencies.
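The position formula (n+1)/2 can be applied to ungrouped data as in the sketch below. When the position falls between two values (an even n), the example averages the two neighbors, which is the usual convention; the ages are hypothetical.

```python
import math

def median_position(n):
    """1-based position of the median in an ordered data set of n values."""
    return (n + 1) / 2

def median(values):
    ordered = sorted(values)
    position = median_position(len(ordered))
    lower = ordered[math.floor(position) - 1]  # value just at or below the position
    upper = ordered[math.ceil(position) - 1]   # value just at or above the position
    return (lower + upper) / 2

print(median_position(5), median([21, 18, 19, 22, 20]))      # 3.0 20.0
print(median_position(6), median([18, 19, 20, 21, 22, 30]))  # 3.5 20.5
```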
Mean
the mathematical average calculated by adding all values and then dividing by the total number of values. greatly affected by outliers. best measure of central tendency when there are no outliers and the most stable number during re-testing
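To see how strongly an outlier pulls the mean compared with the median, consider this short example. The ages and the outlier of 80 are made-up values.

```python
import statistics

ages = [18, 19, 20, 21, 22]
with_outlier = ages + [80]  # one extreme value added

mean_before = statistics.mean(ages)             # 20
median_before = statistics.median(ages)         # 20
mean_after = statistics.mean(with_outlier)      # 30
median_after = statistics.median(with_outlier)  # 20.5

print(mean_before, median_before)
print(mean_after, median_after)
```

A single extreme value moves the mean by 10 years but the median by only half a year.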
normal distribution
A function that represents the distribution of variables as a symmetrical bell-shaped graph, symmetric about the mean
skewed
an asymmetrical distribution of data. the peak of the data is not in the center; skew is generally discussed in terms of direction
negatively skewed
distribution when the mean is less than the median and the mode; the longer tail points to the left
positively skewed
distribution when the mean is greater than the median and the mode and the longer tail is pointing to the right
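The mean-versus-median comparison in the two cards above can be turned into a rough check of skew direction. This is only a heuristic sketch, not a formal skewness coefficient; the function name and data sets are assumptions.

```python
import statistics

def skew_direction(values):
    """Guess the skew direction by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positively skewed"  # longer tail to the right
    if mean < median:
        return "negatively skewed"  # longer tail to the left
    return "no skew detected"

print(skew_direction([1, 2, 2, 3, 10]))  # positively skewed
print(skew_direction([1, 8, 9, 9, 10]))  # negatively skewed
print(skew_direction([1, 2, 3, 4, 5]))   # no skew detected
```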
kurtosis
the peakedness or flatness of a distribution of data
measures of variability
measures providing information about differences among data within a set; also called measures of dispersion because they describe how data are dispersed around the mean.
homogeneous data
less variable, easier to combine, easier to show significant difference. little variability and many common characteristics.
heterogeneous data
wide variability and the degree to which elements are diverse or not alike.
measures of variation
range, semiquartile range, percentile, standard deviation, z-score, variance
range
Distance between highest and lowest scores in a set of data. unstable measure of variability
semiquartile range
the range of the middle 50% of the data. The difference between the third and first quartile values. found by taking the median of the lower 50% of the data and the median of the upper 50% of the data.
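The procedure on this card (median of the lower half, median of the upper half, then the difference) can be sketched as below. Leaving the middle value out of both halves when n is odd is an assumed convention, and some texts call this difference the interquartile range and halve it to get a semi-interquartile range. The data are hypothetical.

```python
import statistics

def quartile_range(values):
    """Difference between the third and first quartiles (needs at least 2 values)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]   # lower 50% of the data
    upper_half = ordered[-half:]  # upper 50%; skips the middle value if n is odd
    q1 = statistics.median(lower_half)
    q3 = statistics.median(upper_half)
    return q3 - q1

print(quartile_range([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.0
print(quartile_range([1, 2, 3, 4, 5, 6, 7]))     # 4
```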
percentile
a number that tells us what percent of the total number of data values lie at or below a certain level
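One simple way to compute the percent of values at or below a given level is shown here. The function name `percentile_rank` and the test scores are assumptions for illustration.

```python
def percentile_rank(values, level):
    """Percent of data values that lie at or below the given level."""
    at_or_below = sum(1 for v in values if v <= level)
    return 100 * at_or_below / len(values)

# Hypothetical test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank = percentile_rank(scores, 85)
print(rank)  # 70.0
```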
standard deviation
a measure of variability used to determine the number of data values falling within a specific interval in a normal distribution. based on deviations from the mean of the data; roughly, the average amount by which values deviate from the mean.
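A step-by-step sketch of the sample standard deviation, built from deviations from the mean. Dividing by n - 1 (the degrees of freedom) is the usual choice for a sample; the data values are hypothetical.

```python
import math

def sample_standard_deviation(values):
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    variance = sum(squared_deviations) / (n - 1)  # n - 1 degrees of freedom
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]
sd = sample_standard_deviation(data)
print(round(sd, 3))  # 2.138
```

The variance is the value computed just before the square root is taken.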
Z-score
standardizes units so that data gathered using different measurement scales can be compared. describes the distance a score is from the mean in SD units. ex. comparing age and weight
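The age-and-weight comparison could look like the following, using z = (score - mean) / SD. All means, SDs, and scores here are made up for illustration.

```python
def z_score(value, mean, sd):
    """Distance of a value from the mean, in standard deviation units."""
    return (value - mean) / sd

# Hypothetical participant: age 30 (group mean 25, SD 5),
# weight 150 lb (group mean 170, SD 10).
age_z = z_score(30, mean=25, sd=5)
weight_z = z_score(150, mean=170, sd=10)
print(age_z, weight_z)  # 1.0 -2.0
```

Although age and weight use different units, the z-scores show this participant is 1 SD above the mean age and 2 SDs below the mean weight.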
coefficient of variation
a percentage used to compare standard deviations when the units of measure are different or when the means of the distributions being compared are far apart.
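As a sketch, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The age and weight figures below are assumed values.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return 100 * sd / mean

# Hypothetical groups measured in different units.
age_cv = coefficient_of_variation(sd=5, mean=25)       # years
weight_cv = coefficient_of_variation(sd=10, mean=170)  # pounds
print(round(age_cv, 2), round(weight_cv, 2))  # 20.0 5.88
```

Even though the weight SD is larger in raw units, age is relatively more variable in this example.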
tailedness
the degree to which the tail in the distribution is pulled right or left
rule of 68-95-99.7
rule stating that for normally distributed data about 68% of the data will fall within 1 SD of the mean, about 95% within 2 SDs, and about 99.7% within 3 SDs
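These percentages can be checked against a normal distribution with Python's `statistics.NormalDist`. The helper name `share_within` is an assumption for this sketch.

```python
from statistics import NormalDist

def share_within(k, dist=NormalDist()):
    """Proportion of a normal distribution within k SDs of its mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```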
correlation coefficients
a statistic used in bivariate analysis to describe the relationship between 2 variables, ranging from -1.0 to +1.0. when used as an estimate of the reliability of an instrument, it ranges from 0.0 to +1.0.
direction
the way 2 variables covary. positive correlation = increase in both variables or decrease in both variables. Negative = covary inversely, one decreases and the other increases
magnitude
the strength of the relationship between 2 variables. a correlation of zero means there is no relationship between the variables.
0.1-0.3 = weak correlation
0.3-0.5 = moderate
>0.5 = strong
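Direction and magnitude can both be read from a Pearson correlation coefficient, sketched below. The cutoffs follow the ranges listed above; assigning exactly 0.3 to "moderate" and labeling values under 0.1 "negligible" are assumptions, since the card leaves those cases open. The data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

def magnitude(r):
    """Label the strength of a correlation, ignoring its sign."""
    size = abs(r)
    if size > 0.5:
        return "strong"
    if size >= 0.3:  # assumption: 0.3 itself counts as moderate
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "negligible"  # assumed label for values under 0.1

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # both increase: positive direction
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # one decreases: negative direction
print(round(r_up, 2), round(r_down, 2), magnitude(r_down))  # 1.0 -1.0 strong
```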
confidence intervals
ranges established around means that estimate the probability of being correct. tell whether findings can be applied to a population or if they actually test the hypothesis. an estimate of the degree of confidence one can have about the inferences. ex. 95-99%
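One common way to build such a range around a sample mean is mean ± z × (SD / √n). The sketch below assumes a normal approximation; with small samples a t distribution is usually preferred. The function name and data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for a mean (normal approximation)."""
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * standard_error
    return mean - margin, mean + margin

scores = [2, 4, 4, 4, 5, 5, 7, 9]
low95, high95 = mean_confidence_interval(scores, 0.95)
low99, high99 = mean_confidence_interval(scores, 0.99)
print(round(low95, 2), round(high95, 2))  # 3.52 6.48
```

Asking for 99% confidence instead of 95% widens the interval.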
probability
chance that an event will occur in a situation
sampling error
error resulting when elements in the sample do not adequately represent the population
nominal data
This type of data is also called categorical data and includes values assigned to name-specific categories:
ordinal data
Type of ranked data in which neither the respondents nor the researcher can state with certainty whether the intervals between values are equal.
statistically significant
when findings are unlikely to have happened by chance alone
nonsignificant
supports the null hypothesis
type I error
An error that occurs when a researcher concludes that the independent variable had an effect on the dependent variable, when no such relation exists; a "false positive" (Source: CHH, 2 Ed). The researcher rejects the null hypothesis when it should be accepted. Associated with sampling bias and measurement error.
type II error
An error that occurs when a researcher concludes that the independent variable had no effect on the dependent variable, when in truth it did; a "false negative" (Source: CHH, 2 Ed). The researcher accepts the null hypothesis when it should be rejected; the result is attributed to error or chance when it is actually significant.
alpha level
The ____________ is the probability of making a type I error; it is designated at the end of the tail in a distribution at 0.01 or 0.05
parametric
inferential statistical tests involving interval- or ratio-level data to make inferences about the population
nonparametric
inferential statistics involving nominal- or ordinal-level data to make inferences about the population
degrees of freedom
a statistical concept used to refer to the number of sample values that are free to vary; n-1
sampling distribution
a theoretical distribution representing an infinite number of samples that can be drawn from a population
chi square
tests for differences between groups using nonparametric data
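A minimal sketch of the chi-square statistic for a contingency table of counts: each observed count is compared with the count expected if the groups did not differ. The 2x2 table is made-up data and the function name is an assumption.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical counts: rows are groups, columns are outcome categories.
observed = [[10, 20],
            [20, 10]]
statistic, df = chi_square(observed)
print(round(statistic, 2), df)  # 6.67 1
```

With 1 degree of freedom, the critical value at an alpha level of 0.05 is 3.84, so a statistic of 6.67 would be statistically significant.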
t test
has independent and correlated variations
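The two variations might be computed as follows: an independent t statistic for two separate groups (using a pooled variance, which assumes the groups have similar variability) and a correlated t statistic for paired measurements on the same subjects. Function names and data are assumptions for illustration.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """t statistic and degrees of freedom for two separate groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

def correlated_t(first, second):
    """t statistic and degrees of freedom for paired measurements."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

t_ind, df_ind = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
t_cor, df_cor = correlated_t([10, 12, 9, 11, 13], [12, 13, 11, 12, 15])
print(round(t_ind, 2), df_ind)  # -2.0 8
print(round(t_cor, 2), df_cor)  # 6.53 4
```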
ANOVA
uses the F statistic
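For a one-way ANOVA, the F statistic is the ratio of between-group variability to within-group variability. The sketch below assumes three hypothetical groups; the function name is also an assumption.

```python
def one_way_f(groups):
    """F statistic for a one-way ANOVA on a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    ss_within = sum(
        (x - sum(group) / len(group)) ** 2
        for group in groups
        for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

f_value = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_value, 2))  # 13.0
```

A large F means the group means differ more than would be expected from the spread within each group.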
Pearson's R
tests for the significance of the relationship between 2 variables
multiple regression
tests the significance of relationships among three or more variables
a researcher is studying the relationship between the amount of time intensive care unit patients spend lying on their backs and urine output. what test should be used to analyze the data?
Pearson's R.
Which of the following is calculated by dividing the frequency of an event by the total number of events?
a. Frequency distribution
b. mode
c. median
d. percentage distribution
D. percentage distribution
The semiquartile range is the range of the middle 50% of the data.
True or False.
True
Bivariate analysis is performed to describe the relationship between 2 variables that can be expressed in contingency tables or with other statistical tests.
True or False
True
To maintain ethical integrity, researchers should select statistical tests and alpha levels in advance but should ignore incidental findings.
False
The Z score is:
a. the degree to which a tail in a distribution is pulled to the left or to the right
b. used to describe the distance a score is away from the mean per standard deviation
c. used to compare standard deviations when the units of measure are different or when the means of distributions being compared are far apart.
d. the way two variables co-vary
b. used to describe the distance a score is away from the mean per standard deviation
Inferential statistical test used when the level of measurement is interval or ratio and more than two groups are being compared.
a. t statistic
b. pearson's R
c. Chi square
d. ANOVA
d. ANOVA
a statistical range is considered to be an unstable measure of variability because
a. the calculation of range is simple.
b. range is not specific to the sample
c. range is very sample specific
d. number of values that can be included has some limits
C. range is very sample specific
Type I errors occur when:
a. researchers reject the null hypothesis when it should have been accepted
b. the opportunity to implement an effective treatment or claim the discovery of a relationship has been missed
c. practice does not change when it should be changed
d. researchers accept the null hypothesis when it should have been rejected
A. researchers reject the null hypothesis when it should have been accepted
The type of statistical analysis used by researchers to study the relationship of many independent variables on one dependent variable is:
a. multiple regression
b. correlated t test
c. independent t test
d. MANOVA
A. multiple regression
When data distribution is skewed, ________________________.
a. the peak of the data is not at the center of the distribution
b. both tails (of distribution) are of equal length.
c. the peak of the data is at the center of the distribution
d. the mean will be equal to the median and mode.
a. the peak of the data is not at the center of the distribution
