DSST - Statistics
Terms in this set (85)
Inference
Refers to the key concept in statistics in which we draw a conclusion from available evidence
Thomas Bayes
Used known probabilities of the past to predict the future
W. Edwards Deming
Merged science and statistics and promoted the use of statistics in driving a process
Descriptive statistics
Summarize or display data so we can quickly obtain an overview
Inferential statistics
Make claims about a population based on sample data
Quantitative data
Uses numerical values to describe something of interest
Qualitative data
Descriptive terms to measure or classify something of interest
Nominal level of measurement
Qualitative data only. When measuring using a nominal scale, one simply names or categorizes responses. Gender, handedness, favorite color, and religion are examples of variables measured on a nominal scale. The essential point about nominal scales is that they do not imply any ordering among the responses. For example, when classifying people according to their favorite color, there is no sense in which green is placed "ahead of" blue. Responses are merely categorized. Nominal scales embody the lowest level of measurement.
Ordinal
Ability to rank/order values from highest to lowest. Does not allow us to measure the differences between categories
Interval level
Strictly quantitative data; supports the mathematical operations of addition and subtraction, but has no true zero point
Ratio level
All mathematical operations on data that has a true zero point
True zero point
Indicates the complete absence of the object being measured
Classes
Intervals in a frequency distribution, such as the range 0.00-0.45. Classes are considered mutually exclusive when each observation can fall into only one class. For example, the gender classes "male" and "female" are mutually exclusive because a person cannot belong to both.
Relative frequency distributions
Display the percentage of observations of each class relative to the total number of observations
Cumulative frequency distribution
Totals the percentage of observations as you move down the frequency table
Histogram
A bar graph showing the number of observations in each class as the height of each bar
Stem and leaf display
Splits the data values into stems (the leading digit) and leaves (the trailing digits), e.g., 8|00012338
Central Tendency
Describes the center point of a data set with a single value. (Most common measure is mean or average).
Dispersion
Describes how far individual data values have strayed from the mean.
Mean
Is the most common measure of central tendency and is calculated by adding all the values in our data set and then dividing this result by the number of observations.
Weighted mean
An average in which each quantity to be averaged is assigned a weight. These weightings determine the relative importance of each quantity on the average
Median
The value in the data where half the values are higher than it and half are lower. With an even number of data points, the median is the average of the two center values.
Mode
Simply the observation in the data set that occurs the most frequently.
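The three measures of central tendency above can be sketched with Python's standard `statistics` module (the data values here are hypothetical):

```python
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7, 10]   # hypothetical sample
print(mean(data))    # (2+3+3+5+7+10)/6 = 5
print(median(data))  # even count, so average of middle two: (3+5)/2 = 4
print(mode(data))    # 3 occurs most frequently
```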
Range
Calculated by subtracting the smallest measurement from the largest measurement
Variance
A measure of dispersion that describes the relative distance between the data points in the set and the mean of the data set. This measure is widely used in inferential statistics.
Standard deviation
Square root of the variance. How far, on average, the data points in a population or sample are from the mean (average)
Empirical rule
Tells us that approximately 68% of the data values will be within one standard deviation from the mean, 95% will be within 2, and 99.7% of values will fall within 3 standard deviations
Chebyshev's Theorem
For any number k greater than 1, at least (1 - 1/k^2) * 100 percent of the values will fall within k standard deviations of the mean. For example, setting k = 2 shows that at least 75% of the data values will fall within two standard deviations of the mean.
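As a quick sketch (not part of the original cards), Chebyshev's bound can be computed directly from the formula:

```python
def chebyshev_bound(k):
    # Minimum fraction of values within k standard deviations of the mean,
    # valid for any distribution and any k > 1
    return 1 - 1 / k**2

print(chebyshev_bound(2))  # 0.75 -> at least 75% within 2 std devs
print(chebyshev_bound(3))  # about 0.889 -> at least 88.9% within 3
```

Unlike the empirical rule, this bound makes no assumption that the data are normally distributed.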
Quartiles
Divide the data set into four equal segments after it has been arranged in ascending order. Approximately 25% of the data points fall below the first quartile, Q1, 50% below Q2, and 75% below Q3.
IQR
Interquartile range. Q3-Q1. Any values greater than Q3 + 1.5(IQR) or less than Q1 - 1.5(IQR) are considered outliers
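A minimal sketch of the IQR outlier fences using Python's standard library, with hypothetical data (note that different quartile conventions give slightly different fence values; `statistics.quantiles` defaults to the exclusive method):

```python
from statistics import quantiles

data = [1, 3, 4, 6, 7, 8, 9, 30]   # hypothetical values; 30 looks suspect
q1, q2, q3 = quantiles(data, n=4)  # the three quartiles
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]
print(outliers)  # [30]
```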
Experiment
The process of measuring or observing an activity for the purpose of collecting data. An example is rolling a pair of dice.
Outcome
A particular result of an experiment. An example is rolling a pair of threes with the dice.
Sample space
All the possible outcomes of the experiment. (Statistics people like to put {} around the sample space values because they think it looks cool.)
Event
One or more outcomes that are of interest for the experiment and is/are a subset of the sample space. An example is rolling a total of 2,3,4,5 with two dice.
Classical probability
refers to a situation when we know the number of possible outcomes of the event of interest and can calculate the probability of that event with the following equation.
P[A] = (number of possible outcomes in which Event A occurs) / (total number of possible outcomes in the sample space)
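As an illustrative sketch of classical probability, the full sample space for two dice can be enumerated and the equation applied directly:

```python
from itertools import product

# Sample space for rolling two dice: 36 equally likely outcomes
space = list(product(range(1, 7), repeat=2))
# Event A: the two dice sum to 7
event = [o for o in space if sum(o) == 7]
p = len(event) / len(space)
print(p)  # 6/36 = 1/6
```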
Empirical Probability
This type of probability observes the number of occurrences of an event through an experiment and calculates the probability from a relative frequency distribution. P[A] = Frequency in which Event A occurs/Total number of observations
Law of large numbers
States that when an experiment is conducted a large number of times, the empirical probabilities of the process will converge to the classical probabilities.
Subjective Probability
Probability that relies on experience and intuition to estimate the probabilities.
Simple Probability
Probability of a single event
Contingency table
Indicates the number of observations that are classified according to two variables. The intersection of Events A and B represents the number of instances where Events A and B occur at the same time.
Joint Probability
The intersection of Events A and B(A n B) represents the number of instances where Events A and B occur at the same time.
Union
Represents all the instances where either Event A or Event B or both occur, and is denoted as A U B
Conditional probability
Probability of Event A occurring given that Event B has already occurred, written P[A|B] (also known as a posterior probability)
Prior probabilities
Probabilities that are derived only from information that is currently available.
Mutually Exclusive
Events that cannot occur at the same time during the experiment.
Addition rule of probability
Used to calculate the probability of the union of events--that is, the probability that either Event A or Event B will occur. For mutually exclusive events: P[A or B] = P[A]+P[B]. Non mutually exclusive events: P[A or B] = P[A] + P[B] - P[A and B].
Multiplication rule of probability
If the two events are dependent: P[A and B] = P[A|B] x P[B]; if the events are independent: P[A and B] = P[A] x P[B]
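Both rules can be checked by brute force on the two-dice sample space; a sketch (the events chosen here are my own examples):

```python
from itertools import product

space = list(product(range(1, 7), repeat=2))
n = len(space)

# A: first die shows 6; B: second die shows 6 (independent events)
p_a = sum(1 for a, b in space if a == 6) / n
p_b = sum(1 for a, b in space if b == 6) / n
p_a_and_b = sum(1 for a, b in space if a == 6 and b == 6) / n
p_a_or_b = sum(1 for a, b in space if a == 6 or b == 6) / n

# Multiplication rule for independent events: P[A and B] = P[A] x P[B]
assert abs(p_a_and_b - p_a * p_b) < 1e-12
# Addition rule, not mutually exclusive: P[A or B] = P[A] + P[B] - P[A and B]
assert abs(p_a_or_b - (p_a + p_b - p_a_and_b)) < 1e-12
```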
Fundamental counting principle
If one event can occur in m ways and a second event can occur in n ways, the total number of ways both events can occur together is m*n ways. And we can extend this principle to more than two events.
Permutations
Number of different ways in which objects can be arranged in order. The number of arrangements of n distinct objects is n! = n*(n-1)*(n-2)*...*2*1. By definition, 0! = 1.
Combinations
Number of different ways in which objects can be arranged without regard to order.
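Both counts are available directly in Python's `math` module; a quick sketch:

```python
import math

# Permutations: arrangements of 3 objects chosen from 5, order matters
print(math.perm(5, 3))      # 5!/(5-3)! = 60
# Combinations: selections of 3 from 5, order irrelevant
print(math.comb(5, 3))      # 5!/(3! * 2!) = 10
# 0! = 1 by definition
print(math.factorial(0))    # 1
```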
Random variable
Is an outcome that takes on a numerical value as a result of an experiment.
Continuous random variable
A random variable is ____ if it can assume any numerical value within an interval as a result of measuring the outcome of an experiment.
Discrete random variable
A random variable is ____ if it is limited to assuming only specific integer values as a result of counting the outcome of an experiment.
Expected value
Is the mean of a probability distribution
Binomial experiment
Has the following characteristics: (1) the experiment consists of a fixed number of trials denoted by n; (2) each trial has only two possible outcomes, a success or a failure; (3) the probability of success and the probability of failure are constant throughout the experiment; (4) each trial is independent of any other trial in the experiment (e.g., a coin toss). AKA a Bernoulli process
Poisson Process
Has the following characteristics: (1) the experiment consists of counting the number of occurrences of an event over a period of time, area, distance, or any other type of measurement; (2) the mean of the Poisson distribution has to be the same for each interval of measurement; (3) the number of occurrences during one interval is independent of the number of occurrences in any other interval.
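As a sketch, the probability mass functions for both processes can be computed straight from their standard formulas (the function names below are my own):

```python
import math

def binomial_pmf(k, n, p):
    # P(exactly k successes in n independent trials, success probability p)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    # P(exactly k occurrences when the mean rate per interval is lam)
    return lam**k * math.exp(-lam) / math.factorial(k)

print(binomial_pmf(2, 3, 0.5))  # exactly 2 heads in 3 fair coin tosses: 0.375
print(poisson_pmf(0, 2.0))      # zero occurrences when the mean is 2
```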
Normal Probability
The mean, median, and mode are the same value. The distribution is bell-shaped and symmetrical around the mean. The total area under the curve is equal to 1. The left and right sides of the normal probability distribution extend indefinitely, never quite touching the horizontal axis.
Standard normal distribution
is a normal distribution with a mean equal to 0 and a standard deviation equal to 1.0.
Z-score
Tells how many standard deviations a value is from the mean: z = (x - μ)/σ. Z-scores themselves follow the standard normal distribution, with a mean of 0 and a standard deviation of 1.
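A minimal sketch of the z-score calculation (the IQ-style numbers are hypothetical):

```python
def z_score(x, mu, sigma):
    # Number of standard deviations x lies from the mean mu
    return (x - mu) / sigma

print(z_score(130, 100, 15))  # 2.0 -> two standard deviations above the mean
```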
Simple Random sample
Is a sample in which every member of the population has an equal chance of being chosen
Systematic sampling
Every kth member of the population is chosen for the sample, with the value of k being approximately N/n
Cluster sample
A cluster sample is a simple random sample of groups, or clusters, of the population. Each member of the chosen clusters would be part of the final sample.
Stratified sample
Is obtained by dividing the population into mutually exclusive groups, or strata, and randomly sampling from each of these groups.
Sampling distribution of the mean
Refers to the pattern of sample means that will occur as samples are drawn from the population at large
Discrete uniform probability distribution
A distribution that assigns the same probability to each discrete event(and is discrete if it is countable)
Central limit theorem
As the sample size, n, gets larger, the sample means tend to follow a normal probability distribution and tend to cluster around the true population mean. This holds true regardless of the distribution of the population from which the sample was drawn.
Standard error of the mean
Is the standard deviation of sample means. According to the central limit theorem, the standard error of the mean can be determined by σx̄ = σ/√n
Theoretical sampling distribution of the mean
Displays all the possible sample means along with their classical probabilities.
Standard error of the proportion
Standard deviation of the sample proportions and can be calculated by σp = √(p(1-p)/n)
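Both standard errors are one-line formulas; a sketch with hypothetical values (function names are my own):

```python
import math

def se_mean(sigma, n):
    # Standard error of the mean: sigma / sqrt(n)
    return sigma / math.sqrt(n)

def se_proportion(p, n):
    # Standard error of the proportion: sqrt(p(1-p)/n)
    return math.sqrt(p * (1 - p) / n)

print(se_mean(10, 25))          # 10/5 = 2.0
print(se_proportion(0.5, 100))  # sqrt(0.0025) = 0.05
```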
Correlation coefficient
A statistical index of the relationship between two things (from -1 to +1)
Point Estimate
Is a single value that best describes the population of interest, the sample mean being the most common.
Confidence level
Is the probability that the interval estimate will include the population parameter, such as the mean. A parameter is data that describes a characteristic about a population.
Interval estimate
Provides a range of values that best describes the population.
Confidence Interval
Is a range of values used to estimate a population parameter and is associated with a specific confidence level
Margin of Error
E, determines the width of the confidence interval and is calculated as E = zc * σx̄ (the critical z-score multiplied by the standard error of the mean)
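A sketch of a confidence interval for a population mean when σ is known, using the familiar critical value 1.96 for a 95% confidence level (the sample figures are hypothetical):

```python
import math

x_bar, sigma, n = 50.0, 10.0, 100   # hypothetical sample mean, pop. std dev, n
z_c = 1.96                          # critical z for 95% confidence

margin = z_c * sigma / math.sqrt(n)           # E = zc * (sigma / sqrt(n))
interval = (x_bar - margin, x_bar + margin)
print(interval)  # (48.04, 51.96)
```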
Level of significance(a)
The probability of making a Type I error
Degrees of freedom
The number of values that are free to vary when information such as the sample mean is known.
Null hypothesis
Denoted by H0, represents the status quo and involves stating the belief that the mean of the population is ≤, =, or ≥ a specific value.
Alternative hypothesis
Denoted by H1, represents the opposite of the null hypothesis and holds true if the null hypothesis is found to be false.
Two-tail hypothesis test
Is used whenever the alternative hypothesis is expressed as ≠
One-tail hypothesis test
Used when the alternative hypothesis is being stated as < or >.
Type I error
Occurs when the null hypothesis is rejected when in reality it is true.
Type II error
Occurs when we fail to reject the null hypothesis when in reality it is false.
Correlation coefficient
Sometimes referred to as the PPMCC, PCC, or Pearson's r; a measure of the linear correlation (dependence) between two variables X and Y, giving a value between +1 and −1 inclusive, where 1 is total positive correlation, 0 is no correlation, and −1 is total negative correlation.
p-value(observed level of significance)
The smallest level of significance at which the null hypothesis(H0) will be rejected, assuming the null hypothesis is true.