### Distribution (of a variable)

Refers to its pattern of variation. With a categorical variable, distribution means the variable's possible categories and the proportion of responses in each.

### Dot plot

Useful for displaying the distribution of a relatively small data set of a quantitative variable

### Variability

Phenomenon of a variable taking on different values or categories from observational unit to observational unit.

### experimental probability

the ratio of the number of times an outcome occurs to the total number of trials performed

### Independent events

events for which the occurrence of one has no impact on the occurrence of the other

### tree diagram

a tree-shaped diagram that illustrates sequentially the possible outcomes of a given event

### binomial distribution

a theoretical distribution of the number of successes in a finite set of independent trials with a constant probability of success
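
As a rough illustration of this definition, the sketch below computes binomial probabilities with Python's standard library. The function name `binomial_pmf` and the example values (3 trials, success probability 0.5) are illustrative assumptions, not part of the original notes.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with the same success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative values: 3 fair coin tosses, counting heads as successes.
probabilities = [binomial_pmf(k, 3, 0.5) for k in range(4)]
print(probabilities)
```

The four values correspond to 0, 1, 2, and 3 successes and add up to 1.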

### causation

A cause and effect relationship in which one variable controls the changes in another variable.

### central limit theorem

Regardless of the shape of the population distribution, the sampling distribution of the sample mean is approximately normal if the sample size n is large enough (a common rule of thumb is n > 30).

### cluster sampling

divide population into sections then randomly select some of those clusters and then choose ALL members from selected clusters

### confounding

a situation where the effect of one variable on the response variable cannot be separated from the effect of another variable on the response variable.

### correlation

measuring the strength and direction of the relationship between two numerical variables

### degrees of freedom

A concept used in tests of statistical significance; the number of observations that are free to vary to produce a known outcome.

### double blind experiments

experiments in which neither the participants nor the people analyzing the results know who is in the control group

### Empirical Rule

The rule gives the approximate % of observations w/in 1 standard deviation (68%), 2 standard deviations (95%) and 3 standard deviations (99.7%) of the mean when the histogram is well approximated by a normal curve
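
The three percentages can be checked against an exact normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `within` is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.4f}")
```

The exact proportions are slightly different from the rounded 68%, 95%, and 99.7% figures, which is why the rule is described as approximate.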

### geometric distribution

Success / Failure, trials continue until successful, each outcome is independent, constant probability of success

### influential observations

Individual points that substantially change the regression line. Often outliers in the x direction, but they do not necessarily have large residuals.

### law of large numbers

as an experiment is repeated over and over, the empirical probability of an event approaches the actual probability of the event

### lurking variable

A lurking variable is a variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables.

### margin of error

The +- value added to and subtracted from a point estimate in order to develop an interval estimate of a population parameter

### matched pairs design

A matched pairs design is a special case of the randomized block design. It is used when the experiment has only two treatment conditions; and subjects can be grouped into pairs, based on some blocking variable. Then, within each pair, subjects are randomly assigned to different treatments.

### mean

an average of n numbers computed by adding some function of the numbers and dividing by some function of n

### normal distribution

A function that represents the distribution of variables as a symmetrical bell-shaped graph.

### randomization

the best defense against bias, in which each individual is given a fair, random chance of selection

### residual

the difference between the observed value and the predicted value of a regression equation; y - y-hat

### response bias

people answer questions the way they think you want them answered. There are some questions they simply don't want to answer truthfully.

### sampling distribution

a distribution of statistics obtained by selecting all the possible samples of a specific size from a population

### scatterplots

a graphed cluster of dots, each of which represents the values of two variables. The slope of the points suggests the direction of the relationship between the two variables. The amount of scatter suggests the strength of the correlation.

### simple random sample

abbreviated SRS, this requires that every item in the population has an equal chance to be chosen and that every possible combination of items has an equal chance to exist. No grouping can be involved.

### simpson's paradox

conclusions drawn from two or more separate crosstabulations that can be reversed when the data are aggregated into a single crosstabulation

### single blind experiments

an experiment in which the participants are unaware of which participants received the treatment

### skewed

a distribution is this if it's not symmetric and one tail stretches out farther than the other

### standard deviation

a measure of variability that describes an average distance of every score from the mean

### standard normal curve

A normal distribution with mean of zero and standard deviation of one. Probabilities are given in Table A for values of the standard Normal variable.

### statistical significance

said to exist when the probability that the observed findings are due to chance is very low

### stratified random sample

a sample in which the population is first divided into similar, nonoverlapping groups. A simple random sample is then selected from each of the groups

### symmetric

a distribution is this if the two halves on either side of the center look approximately like mirror images of each other

### Type I Error

The error that is committed when a true null hypothesis is rejected erroneously. The probability of a Type I Error is abbreviated with the lowercase Greek letter alpha.

### Type II Error

the error of failing to reject a null hypothesis when in fact it is false (also called a "false negative"). the probability of a Type II error is commonly denoted β and depends on the effect size.

### unbiased estimator

a statistic whose sampling distribution is centered over the population parameter

### undercoverage

occurs when some groups in the population are left out of the process of choosing the sample

### voluntary response

Individuals with strong feelings about a subject are more likely than others to respond. Such a study is interesting but not reflective of the population.

### Conditional Distribution

the distribution of a variable restricting the who to consider only a smaller group of individuals

### General Addition Rule

For any two events (meaning disjoint or not disjoint), A and B, the probability of A or B is:

P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

### Independence (Casually)

Two events are independent if knowing whether one event occurs does not alter the probability that the other event occurs

### Tree Diagram

a diagram used to show the total number of possible outcomes in a probability experiment

### Addition and Subtraction

-Mean, Median, and Mode are affected

-Cannot subtract SD; Only square, add, and square root

-Range is NOT affected

### Continuous Random Variable

-Random variable that assumes values associated with one or more intervals on the number line

### Independent Events

-If the knowledge of one event having occurred does not change the probability that the other event occurs

### Law of Large Numbers

-States that the proportion of successes in the simulation should become, over time, close to the true proportion in population

### Probability Distribution for a Discrete Random Variable

-Possible values of the discrete random variable together with their respective probabilities

### Probability Distribution for a Random Variable

-Possible values of the random variable X together with the probabilities corresponding to those values

### Random Phenomenon

-An activity whose outcome we can observe or measure but we do not know how it will turn out on any single trial

### Bayes's Theorem

Suppose that A₁, A₂, ... Ak are disjoint events whose probabilities are not 0 and add to exactly 1, i.e. any outcome must be exactly one of those events. Then, if B is any other event whose probability is not 0 or 1,

P(Ai|B) = P(B|Ai)P(Ai) / [P(B|A₁)P(A₁) + ... + P(B|Ak)P(Ak)]

### Disjoint ≠ Independent

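If two events are disjoint, then the occurrence of one would mean the non-occurrence of the other. If events are independent, then the occurrence or non-occurrence of one does not change the probability of the other. Disjoint events with nonzero probabilities are therefore never independent.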

### General Addition Rule for Any Two Events

For any two events A and B,

P(A or B) = P(A) + P(B) - P(A and B)

### General Multiplication Rule for Any Two Events

The probability that both of two events A and B happen together can be found by

P(A and B) = P(A)P(B|A)

### Independence Definition

When the outcome of one event cannot influence the outcome of a second event.

### Outcomes for a diagnostic test

There are four possible outcomes:

- true positive

- true negative

- false positive

- false negative

### Tree Diagrams

Diagrams that will show P(A) as independent branches, then P(B|A) as branches coming off those branches, etc. until a final event is reached. The probability of any one event occurring can be calculated by multiplying the probabilities of each branch along the way.

### Venn Diagram

A diagram showing a sample space S and events as areas within S. Overlaps indicate non-disjoint events.

### When P(A)>0, the conditional probability of event B occurring given A occurs is

P(B|A) = P(A and B) / P(A)

### Addition Rule 1

When two events A and B are mutually exclusive, the probability that A or B will occur is

P(A or B) = P(A) + P(B)

### Classical Probability

P(E) = (# of ways E can occur) / (total # of outcomes in the sample space)

Use whenever all outcomes in the sample space are equally likely.

### Combinations Rule

Used when selecting a smaller number from a larger number but the order is NOT important.

nCr = n! / (r! (n-r)!)

n = number of objects available, r = number of objects selected

On calculator: enter amount(n), math, PRB, 3, enter amount(r), enter
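
For readers without the calculator, the same count can be sketched in Python. The helper `combinations_by_formula` and the values n = 10, r = 3 are hypothetical, chosen only for illustration; `math.comb` is the standard-library equivalent of the calculator's nCr key.

```python
from math import comb, factorial

def combinations_by_formula(n, r):
    """nCr computed directly from the factorial formula n! / (r! (n-r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

n, r = 10, 3  # illustrative values, not from the notes
print(combinations_by_formula(n, r), comb(n, r))
```

Both expressions give the same count, since order does not matter.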

### Complement Rule

The complement of E, written Ē, is the set of outcomes in the sample space that are not included in the outcomes of E. Its probability is

P(Ē) = 1 - P(E)

### Conditional Probability

The probability that the second event B occurs given that the first event A has occurred can be found by dividing the probability that both events occurred by the probability that the first event has occurred. The formula is

P(B|A) = P(A and B) / P(A)

### Dependent Events

When the outcome or occurrence of the first event affects the outcome or occurrence of the second event in such a way that the probability is changed.

*without replacement = dependent events

### Empirical Probability

P(E) = (frequency for the class) / (total frequencies in the distribution) = f / n

Relies on actual experience to determine the likelihood of outcomes.

### Factorial Rule

Use this when you have "n" objects and you want to know how many different ways they can be arranged.

n!

On calculator: enter amount, math, arrow left to PRB, 4, enter

### Fundamental Counting Rule

Use this when you have different positions and you want to know how many options there are within those positions.

___ x ___ x ___ x ___ x ___ x ___ x ___ x ___ x ___ (nine positions with 2 options each) = 2^9 = 512

### Independent Events

Two events A and B are independent events if the fact that A occurs does NOT affect the probability of B occurring.

*with replacement = independent events

### Multiplication Rule 1

When two events are independent, the probability of both occurring is

P(A and B) = P(A) * P(B)

### Multiplication Rule 2

When two events are dependent, the probability of both occurring is

P(A and B) = P(A) * P(B|A)

### Mutually Exclusive Events

Two events that cannot occur at the same time (i.e., they have no outcomes in common).

### Permutations Rule

Used when selecting a smaller group from a larger group and you put them in a specific order.

*ORDER IS IMPORTANT

nPr = n! / (n-r)!

n = number of objects available, r = number of objects selected

On calculator: enter amount(n), math, PRB, 2, enter amount(r), enter
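
A minimal Python sketch of the same count, assuming illustrative values n = 10 and r = 3 (not from the notes). `math.perm` plays the role of the calculator's nPr key, and the helper `permutations_by_formula` is a hypothetical name.

```python
from math import factorial, perm

def permutations_by_formula(n, r):
    """nPr computed directly from the factorial formula n! / (n-r)!."""
    return factorial(n) // factorial(n - r)

n, r = 10, 3  # illustrative values, not from the notes
print(permutations_by_formula(n, r), perm(n, r))
```

Because order is important, the count is larger than the corresponding number of combinations by a factor of r!.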

### Subjective Probability

Uses a probability value based on an educated guess or estimate, employing opinions and inexact information.

### density curve

the overall pattern of a distribution, areas underneath give proportions of observations for the distribution

### distribution

the distribution of a variable tells us what values it takes and how often it takes these values

for a categorical variable, it gives either the count or the percent of individuals that fall in each category

### histogram

breaks the range of values of a variable into classes and displays only the count or percent of the observations that fall into each class; there is no space between the bars

### linear transformations

changes the original variable x into the new variable x(new) given by the equation ***

### normal quantile plot

a pattern on such a plot that deviates substantially from a straight line indicates that the data are not normal

### quantitative variable

numerical values for which arithmetic operations such as adding and averaging make sense

### resistant measure

any aspect of a distribution that is relatively unaffected by changes in the numerical value of a small proportion of the total number of observations, no matter how large these changes are

### splitting stem/ trim

terms to slim down the size of your stem plot. helpful when you have large sets of data

### stem plot

gives a quick picture of the shape of a distribution while including the actual numerical values in the graph

### cluster sampling

divide population into pre-existing segments; select random clusters; include every member of each selected cluster

### completely randomized experiment

one in which a random process is used to assign each individual to one of the treatments

### confounded

when the effects of one of the two variables cannot be distinguished from the effects of the other

### control group

receives a dummy treatment, enabling the researchers to control for the placebo effect; used to account for the influence of other known or unknown variables that might be an underlying cause of a change in response in the experimental group

### convenience sampling

create a sample by using data from population members that are readily available

### Descriptive statistics

involves methods of organizing, picturing, and summarizing information from samples or populations

### experiment

a treatment is deliberately imposed on the individuals in order to observe a possible change in the response or variable being measured

### inferential statistics

involves methods of using information from a sample to draw conclusions regarding the population

### interval level of measurement

applies to data that can be arranged in order; differences are meaningful

### lurking variable

one for which no data have been collected but that nevertheless has influence on other variables in the study

### multistage sampling

use a variety of sampling methods to create successively smaller groups at each stage. The final sample consists of clusters

### nominal level of measurement

applies to data that consist of names, labels, or categories; cannot be ordered

### nonsampling error

the result of poor sample design, sloppy data collection, faulty measuring instruments, bias in questionnaires, and so on.

### observational study

observations and measurements of individuals are conducted in a way that doesn't change the response or the variable being measured

### ordinal level of measurement

applies to data that can be arranged in order; differences between data are meaningless

### placebo effect

occurs when a subject receives no treatment but (incorrectly) believes he is receiving treatment and responds favorably

### randomization

used to assign individuals to the two treatment groups; helps prevent bias in selecting group members

### randomized block experiment

individuals are first sorted into blocks, and then a random process is used to assign each individual in the block to one of the treatments

### ratio level of measurement

applies to data that can be arranged in order; differences are meaningful; true zero

### replication

reduces the possibility that the differences in pain relief for the two groups occurred by chance alone

### sampling error

the difference between measurements from a sample and corresponding measurements from the respective population; caused by the fact that the sample does not perfectly represent the population

### simple random sample

a subset of the population selected in a manner such that every sample of size n from the population has an equal chance of being selected

### statistics

the study of how to collect, organize, analyze, and interpret numerical information from data

### stratified sampling

divide the entire population into distinct subgroups called strata. The strata are based on a specific characteristic. All members of a stratum share the specific charactersitic. Draw random samples from each stratum

### systematic sampling

number all members of the population then from a random starting point, select every kth member

### Bell-Shaped Distribution

Has a single peak, tapers off at either end, and is approximately symmetric

### Categorical Frequency Distribution

Used for data that can be placed into specific categories, such as nominal or ordinal level data

### Class Boundaries

Used to separate the classes so that there are no gaps in the frequency distribution

### Class Width

Found by subtracting the lower class limit of one class from the lower class limit of the next class; can also be used with upper limits

### Frequency Distribution

The organization of raw data in table form; consists of classes and frequencies

### Frequency Polygon

Graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes

### Grouped Frequency Distribution

Used when the range of the data is large and the data must be grouped into classes that are more than one unit in width.

### Histogram

A graph that displays the data by using adjacent vertical bars of various heights to represent the frequencies of the classes

### Positively Skewed

When the peak of the distribution is to the left and the data values taper off to the right

### Raw Data

When data are in their original form; little information can be obtained from looking at this

### 4.3 random variables

-random variable is a variable that assigns a number to each outcome of an experiment. This is not to be confused with an algebraic variable.

-the probability distribution of a random variable is a listing of each possible value of the random variable together with that value's probability

-X: X1, X2, X3...

-P(X): P1,P2,P3...

example: toss a coin 3 times. Let X=the number of heads

-X:0,1,2,3

-P(X): 1/8,3/8,3/8,1/8
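
The coin-toss distribution above can be reproduced by listing the sample space and counting heads. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of tossing a coin 3 times.
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each value of X as an exact fraction.
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}
for x, p in distribution.items():
    print(x, p)
```

Each probability is the number of outcomes with that many heads divided by the size of the sample space.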

### 4.4 properties of random variables

definitions:

expected value (or mean) of a random variable: this is denoted E(X)

Variance of a random variable: this is denoted V(X)

### at a hospital, the probability of a patient having surgery is 12%, and obstetric treatment 16% and the probability of both is 2%. What is the probability that a patient will have neither treatment?

.74

### benford's law, also called the first-digit law

states that for certain kinds of data, the first digit in each data value has a curious frequency

-this can be used to assess the legitimacy of certain data

-for appropriate data, first digits have the following distribution (with the last value missing)

-first digit: 1 2 3 4 5 6 7 8 9

-Probability: .301 .176 .125 .097 .079 .067 .058 .051 ?

-1. what is the probability that the first digit is 9? The probabilities must sum to 1, so 1-.954=.046

-2. What is the probability that the first digit is at least 2? Use the complement: 1-P(first digit is 1)=1-.301=.699

### Better example

-woman visits her doc and gets tested for rare disease

-doc indicates that the test is 99% accurate (false positive=1%)

-woman tests positive, she concludes there is a 99% chance she has the disease

-this is a rare disease, suppose the incidence in the population is 1 in 50,000.

-if 50,000 people are tested, we would expect 500 to test positive even though only one person has the disease

-thus, even after testing positive, she only has a 1 in 500 chance of having the disease
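
The same conclusion can be reached with Bayes's theorem. The sketch below assumes, as a reading of "99% accurate", that the test detects 99% of true cases and has a 1% false positive rate; the function name `posterior` and its parameter names are illustrative.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(disease | positive test) by Bayes's theorem."""
    # Total probability of a positive test: true positives plus false positives.
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_positive

# Assumed inputs: incidence 1 in 50,000, 99% detection, 1% false positives.
p = posterior(prior=1 / 50_000, true_positive_rate=0.99, false_positive_rate=0.01)
print(f"P(disease | positive) = {p:.5f}, about 1 in {1 / p:.0f}")
```

The result is close to the 1 in 500 figure from the counting argument, because nearly all positive tests come from the healthy majority.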

### Calculating probability: Roll a die twice, what is the probability that the sum of the faces will be 8?

P(Sum=8)=5/36

### Caution

a random variable does not share the same properties as an algebraic variable

-for an algebraic variable X: X+X+X=3X

-for a random variable, each X may turn out differently, so X+X+X does not equal 3X

-this distinction matters when calculating variance.

-X+X+X should really be written X1+X2+X3

### Class Problem: A card is drawn from a deck of 52 cards.

-what is the probability that it is neither a diamond nor an ace?

-What is the probability that it is either not a diamond or it is not an ace?

-13 cards are diamonds and 3 more are aces, that leaves 36 cards, so 36/52= .6923

-there is only one card that doesn't fit either category-the ace of diamonds, so 51/52= .9808

### Class problem: employee bonuses are awarded at the end of the year. Thomas realizes it is possible for him to get a $5000 bonus, but it is unlikely. He is twice as likely to get a $2000 bonus, seven times as likely to get a $1000 bonus, and ten times as likely to get a $500 bonus.

-construct the probability distribution for Thomas's bonus (first call the probability of getting a $5000 bonus p)

Bonus: 5000 2000 1000 500

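probability: p 2p 7p 10p

The probabilities must sum to 1, so 20p = 1 and p = .05, giving probabilities .05, .10, .35, and .5.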

###
E(X)=(5000)(.05)+(2000)(.10)+(1000)(.35)+(500)(.5)=1050

V(X)=(5000-1050)^2(.05)+(2000-1050)^2(.10)+(1000-1050)^2(.35)+(500-1050)^2(.5)=1,022,500
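
The expected value and variance of Thomas's bonus can be checked with exact arithmetic. This is a sketch using `fractions.Fraction`; the variable names are illustrative.

```python
from fractions import Fraction

values = [5000, 2000, 1000, 500]   # possible bonuses
weights = [1, 2, 7, 10]            # probabilities as multiples of p

# The probabilities must sum to 1, so 20p = 1.
p = Fraction(1, sum(weights))
probs = [w * p for w in weights]

# E(X) is the probability-weighted average of the values.
mean = sum(x * q for x, q in zip(values, probs))

# V(X) is the probability-weighted average squared distance from the mean.
variance = sum((x - mean) ** 2 * q for x, q in zip(values, probs))

print(mean, variance, float(variance) ** 0.5)
```

The last printed value is the standard deviation, the square root of the variance.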

...

### CLASS PROBLEM: In real estate ads it is found that 64% of homes have garages, 9% have pools, and 28% have a finished basement. 5% have a garage and a pool, 19% have a garage and a basement, 4% have a basement and a pool, and 2% have all three. What percentage of homes do not have any of these three?

G=64-2=62-3-17=42

P=9-2=7-3-2=2

B=28-2=26-17-2=7

G&P=5-2=3

G&B=19-2=17

B&P=4-2=2

All=2

100-42-2-7-3-17-2-2=25
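
The region-by-region count above can be cross-checked with the inclusion-exclusion formula for three events. The sketch below works directly with the percentages from the problem; the variable names are illustrative.

```python
# Percentages from the problem statement.
garage, pool, basement = 64, 9, 28
garage_pool, garage_basement, basement_pool = 5, 19, 4
all_three = 2

# Inclusion-exclusion: add the singles, subtract the pairwise overlaps,
# then add back the triple overlap that was subtracted once too often.
at_least_one = (garage + pool + basement
                - garage_pool - garage_basement - basement_pool
                + all_three)

none = 100 - at_least_one
print(at_least_one, none)
```

This agrees with the Venn diagram regions, which sum to 75% with at least one feature.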

### Class problem: John is suing his landlord. If he wins, he will be awarded $6000 and will not have to pay any court costs. If he loses, he will have to pay court fees totaling $200.

-john has found a lawyer that will represent him for $1200. If he hires this lawyer, there is an 80% chance he will win, and if he represents himself there is only a 60% chance that he will win.

-should john hire this lawyer? (calculate his expected net winnings using the lawyer and his expected net winnings not using the lawyer)

With lawyer: 4800 -1400

P(X): .8 .2

-E(X)= (4800)(.8)+(-1400)(.2)=3560

Without lawyer: 6000 -200

P(X): .6 .4

-E(X)=(6000)(.6)+(-200)(.4)=3520

### Class problem: K, A, and M have completed several relay triathlons. K-swimming, A-bikes, M-runs. Their respective completion times (in hours) have means .77, 1.33, and .9, and their respective standard deviations are .05, .08, and .06.

a) what is their expected team finish time?

b) what is the standard deviation of the team finish time?

c) assume their team finish times are normally distributed. What is the probability that they finish the triathlon 15 minutes earlier than usual?

a)E(K+A+M)=E(K)+E(A)+E(M)=.77+1.33+.9=3

b)V(K+A+M)=V(K)+V(A)+V(M)=.0025+.0064+.0036=.0125

σ(K+A+M)=square root of .0125=.1118

c) T is N(3, .1118), so P(T<2.75)=P(Z<(2.75-3)/.1118)=P(Z<-2.236)=0.0127
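
The three parts of this problem can be sketched in Python with `statistics.NormalDist`. The sketch assumes the three legs are independent (which the variance addition relies on) and that the team time is normal, as the problem states; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

means = [0.77, 1.33, 0.90]   # swim, bike, run (hours)
sds = [0.05, 0.08, 0.06]

# a) Means add.
team_mean = sum(means)

# b) Variances add for independent legs; standard deviations do not.
team_sd = sqrt(sum(s ** 2 for s in sds))

# c) Probability of finishing at least 15 minutes (0.25 hours) early.
team = NormalDist(team_mean, team_sd)
p_early = team.cdf(team_mean - 0.25)

print(round(team_mean, 2), round(team_sd, 4), round(p_early, 4))
```

Note that the z-score for finishing early is negative, since 2.75 hours is below the mean of 3 hours.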

### Class Problem: Out of 125 students surveyed, 12 were accounting majors, 24 were business majors, and 34 were either an accounting major or business major (or both). Draw and label a Venn Diagram

Acc 10 Both 2 Bus 22

### Class problem: Roll a die twice, what is the probability that the number on the second cast is greater than the one on the first cast?

P(2nd>1st) = 15/36=5/12
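
Two-dice questions like this one can be checked by listing all 36 equally likely rolls. The sketch below is a brute-force count using the Python standard library; the helper name `probability` is illustrative. It also checks the earlier sum-of-8 question.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first, second) rolls of a die.
rolls = list(product(range(1, 7), repeat=2))

def probability(event):
    """Classical probability: favorable outcomes over total outcomes."""
    favorable = sum(1 for roll in rolls if event(roll))
    return Fraction(favorable, len(rolls))

p_second_greater = probability(lambda roll: roll[1] > roll[0])
p_sum_8 = probability(lambda roll: sum(roll) == 8)
print(p_second_greater, p_sum_8)
```

The count of 15 comes from symmetry: 6 of the 36 rolls are ties, and the remaining 30 split evenly between "second greater" and "first greater".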

### CLASS PROBLEM: The probability of encountering heavy traffic on a Monday is 0.8, and the probability of encountering heavy traffic on a Tuesday is 0.6

1. someone claims the probability of heavy traffic occurring both days is .3, why is this impossible?

2. the person retracts their claim, but insists that Monday and Tuesday are independent of each other. What is the probability of encountering heavy traffic on Monday or Tuesday?

3. What is the probability of encountering heavy traffic at least one Tuesday in a Month? (successive Tuesdays are independent)

1. By the addition rule, P(M or T) would be 0.8+0.6-0.3=1.1, and a probability cannot be greater than 1

2. P(M or T)= P(M)+P(T)-P(M and T)= 0.8+0.6-(0.8)(0.6)=0.92

3. P(equal to or greater than 1)=1-P(none)=1-(0.4)^4=0.9744

### CLASS PROBLEM: Toss a coin, if it lands heads, roll a die once. If it lands tails, flip the coin one more time. What is the sample space, and what is the size of the sample space?

S={(H,1), (H,2),...(H,6), (T,H), (T,T)}, |S|=8

### Events

-An event is some set of outcomes from the sample space. events are denoted by capital letters A,B,C....

### -Two events are Independent if the probability of one occurring is not influenced by the other occurring

...

### -disjoint events are sometimes called mutually exclusive, since the occurrence of one excludes the possibility of the other occurring

...

### Example:

S = {0,1,2,3,4,5,6,7,8}

A = {2,3,6,7} B = {0,3,6,8}

A and B = {3,6}

A or B = {0,2,3,6,7,8}

A^c and B = {0,8}

A^c or B^c = {0,1,2,4,5,7,8}

(A and B)^c = {0,1,2,4,5,7,8}

(A or B)^c = {1,4,5}

A and A^c = ø (the empty set)
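
These set operations map directly onto Python's built-in `set` type, which makes the example easy to verify. In this sketch, `&` stands for "and", `|` for "or", and subtracting from S gives the complement; the names `A_c` and `B_c` are illustrative.

```python
S = set(range(9))          # sample space {0, 1, ..., 8}
A = {2, 3, 6, 7}
B = {0, 3, 6, 8}

# Complements are taken relative to the sample space.
A_c = S - A
B_c = S - B

print("A and B      =", A & B)
print("A or B       =", A | B)
print("A^c and B    =", A_c & B)
print("A^c or B^c   =", A_c | B_c)
print("(A and B)^c  =", S - (A & B))
print("(A or B)^c   =", S - (A | B))
print("A and A^c    =", A & A_c)
```

The fourth and fifth results are the same set, which illustrates that the complement of "A and B" equals "A^c or B^c".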

### Example: the American Veterinary Association claims that the annual cost of medical care for dogs averages $100 with a standard deviation of $30, and the annual cost of medical care for cats averages $120 with a standard deviation of $35

a) what's the expected difference in cost between cats and dogs?

b) what's the standard deviation of the difference between cats and dogs?

c) if the difference in costs is normally distributed, what's the probability that the medical expenses for a woman's dog are greater than those for her cat?

a) E(C-D)=E(C)-E(D)=120-100=$20

b) V(C-D)=V(C)+V(D)=1225+900=2125, so σ(C-D)=square root of 2125=$46.1

c) we are told the difference is normal, and we already found the center and spread: the difference is N(20,46.1)

P(difference<0)=P(Z<(0-20)/46.1)=P(Z<-.4338)=.3322

### Example:

suppose X and Y are independent, and E(X)=120 σx=12 E(Y)=300 σy=16

Find the mean and standard deviation of 2X-5Y

E(2X-5Y)=2E(X)-5E(Y)=2(120)-5(300)=-1260

V(2X-5Y)=V(2X)+V(5Y)=4V(X)+25V(Y)=4(144)+25(256)=6976, so σ(2X-5Y)=square root of 6976=83.522

### Examples:

calculate the mean and standard deviation of the following random variable:

-X: -2 3 7

-P(X): .3 .1 .6

-E(X)= (-2)(.3)+(3)(.1)+7(.6)=3.9

-V(X)=(-2-3.9)^2(.3)+(3-3.9)^2(.1)+(7-3.9)^2(.6)=16.29, so the standard deviation is the square root of 16.29=4.036

### Examples:

In a game, a die is thrown. Alan pays Sally $1 if the die falls 1,2, or 3, and $3 if the die falls 4 or 5. If the die falls 6, Sally has to pay Alan $8. What is the expected value and standard deviation of the amount Sally wins?

Winnings X: 1 3 -8

P(X): 0.5 0.333 0.1667

-E(X)=(1)(1/2)+(3)(1/3)+(-8)(1/6)=1/6, or about $0.1667 (using the rounded probabilities .333 and .1667 instead gives a slightly different value, $0.1654)

-V(X)=(1-.1667)^2(0.5)+(3-.1667)^2(.333)+(-8-.1667)^2(.1667)=14.14, so the standard deviation is the square root of 14.14, about $3.76

### Gambler's fallacy, or "law of averages"

psychological prejudice that assumes observations will behave as expected much sooner than necessary.

In other words, thinking an event is "due" or "not due"

-playing a different lottery number than last week's winning number because the chances it would come up twice in a row are so small.

-building your home in the exact spot that a meteor struck, reasoning it would be almost impossible for a meteor to strike in the same place twice.

-a man brings a bomb on a plane. he reasons "the chances of there being a bomb on a plane are so small, so the chances of there being another one are almost zero"

### INDEPENDENCE:

two events A and B are independent if P(A and B)=P(A)*P(B)

Example: P(A)= .3 P(B)= .5 P(A and B)= .10

P(A)*P(B)=(.3)(.5)=.15, which does not equal .10, so A and B are not independent

### Example: P(A)=.2 P(B)= .6 P(A or B)= .68

are A and B independent? (first use addition rule)

By the addition rule, P(A and B)=.2+.6-.68=.12, and (.2)(.6)=.12, so A and B are independent.

...

### -do not confuse independence with disjoint. Independence cannot be illustrated on a Venn diagram.

...

### Law of Large Numbers

states that as an experiment is repeated over and over, the observed frequency of an outcome gets closer to its expected frequency.

### probability

the probability of an outcome is the proportion of times that it would occur over many repetitions.

-often, people expect the outcomes to settle into some regularity much sooner than they actually do.

### Properties of Mean and Variance

E(c)=c V(c)=0 E(X+/-Y)=E(X)+/-E(Y)

E(cX)=cE(X) V(cX)=c^2V(X)

if X and Y are independent: V(X+/-Y)=V(X)+V(Y)

### Prosecutor's fallacy

-a man is on trial for a crime, and forensic evidence is found at the scene which implicates him.

-a prosecutor has an expert witness testify that the probability of finding this forensic evidence is 1 in 20,000 if the person is innocent

-by itself, this argument is misleading...

-the defense counters that there are 1,000,000 ppl in this city and so there are 50 people who could have left this evidence.

-thus there is still only a 1 in 50 chance that the defendant is the one that left this evidence

-the prosecutor would have to make an argument that significantly narrows down this pool of 50 people, like additional evidence.

-this is tantamount to someone winning the lottery, and the prosecutor charging them of cheating because the odds of winning were so low.

### Random

a phenomenon is random if any individual outcome is unpredictable, but the distribution of outcomes over many repetitions is known

example: toss a coin. no flip is predictable, but many flips will result in approximately half heads and half tails

-remember that random does not mean that each outcome is equally likely, it only means that a particular outcome cannot be predicted with certainty

### Record the number of people that walk into a post office each day.

a) what is the sample space?

b) How do you think the outcomes will be distributed (what shape)

a) S={0,1,2,3,...}, |S|=infinity

b) skewed-right

### Rules of Thumb (1)

1) "and" means multiply when the events are independent

-toss a coin three times. what is the probability of all three being tails?

-that is, tails first AND tails second AND tails third

-since coin flips are independent we multiply, .5x.5x.5=.125

### Rules of Thumb (2)

2) "or" means add when the events are disjoint

-roll two dice. What is the probability that the sum of the faces is 5 or 11?

-since the sum cannot be 5 and 11 at the same time, these are disjoint outcomes, so we add: P(sum=5)+P(sum=11)=4/36+2/36=1/6

### Rules of Thumb (3 continued)

-if there are 23 people in a room, what is the probability that at least two of them have the same b-day?

-P(at least 2) = 1-P(all different) = 1-(# of ways 23 people can have all different b-days)/(# of possible b-day assignments for 23 people) = 1-(365 x 364 x 363 x ... x 343)/(365 x 365 x 365 x ... x 365) = 1-.4927 = .5073, or 50.73%
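
The long product is tedious by hand, so here is a minimal Python sketch of the complement calculation. It assumes 365 equally likely birthdays (no leap years); the function name `p_shared_birthday` is illustrative.

```python
def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), via the complement."""
    p_all_different = 1.0
    for i in range(n):
        # Person i+1 must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1 - p_all_different

print(round(p_shared_birthday(23), 4))
```

With 23 people the probability first passes one half, which is the usual statement of the birthday problem.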

### Rules of Thumb (3)

3) for any probability question, first decide whether it is easier to calculate it directly, or easier to calculate the opposite and subtract from 1.

-a coin is tossed 7 times, what is the probability of tails occurring at least once?

-easier to answer the opposite: P(tails at least once)=1-P(no tails)

-"no tails" means "heads first AND heads second AND..."

P(no tails)=.5 x .5 x .5 x .5 x .5 x .5 x .5= .0078

P(at least once)=1-P(no tails)=1-0.0078=.9922

### sample space

the sample space is the set of all possible outcomes, denoted S

example: toss a coin three times. The sample space is ... S={HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

-the size of S is denoted |S|.

-example: toss a die twice. The sample space is... S={(1,1), (1,2),...(1,6), (2,1), (2,2),...(2,6),...(6,6)}

example: pull two cards from a well-shuffled deck. How many elements are in the sample space?
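The coin and die sample spaces listed above can be generated mechanically. This sketch uses `itertools.product`; the variable names are illustrative.

```python
from itertools import product

# Three coin tosses: every ordered sequence of H and T.
coin_space = ["".join(seq) for seq in product("HT", repeat=3)]

# Two rolls of a die: every ordered pair of faces.
dice_space = list(product(range(1, 7), repeat=2))

print(coin_space)
print(len(coin_space))  # 8
print(len(dice_space))  # 36
```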

### The Addition Rule

P(A or B)=P(A) + P(B)-P(A and B)

Overlap counted twice, subtract out once.

In an office building of 80 people, 28 work on Saturday, 11 work on Sunday, and 3 people work on both Sunday and Saturday. What is the probability that a person in this office works at least one of these days?

P(Sat or Sun)= P(Sat) + P(Sun) - P(Both) = 28/80+11/80-3/80=.45
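The office example can be reproduced with exact fractions. A small sketch; `prob_a_or_b` is a made-up helper name.

```python
from fractions import Fraction

def prob_a_or_b(p_a, p_b, p_both):
    """Addition rule: subtract the overlap that was counted twice."""
    return p_a + p_b - p_both

p = prob_a_or_b(Fraction(28, 80), Fraction(11, 80), Fraction(3, 80))
print(p, float(p))  # 9/20 0.45
```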

### Basic Rules for Computing Probability (Rule 1) - Relative Frequency Approximation of Probability

P(A) = # of times A occurred / # of times procedure was repeated

### Basic Rules for Computing Probability (Rule 2) - Classical Approach to Probability

(Requires Equally Likely Outcomes)

### Basic Rules for Computing Probability (Rule 3) - Subjective Probabilities

P(A), the probability of event A, is estimated by using knowledge of the relevant circumstances.

### Combinations Rule

Requirements:

There are n different items available.

We select r of the n items (without replacement).

We consider rearrangements of the same items to be the same. (The combination of ABC is the same as CBA.)

### Complementary Events

The complement of event A, denoted by Ā, consists of all outcomes in which the event A does not occur

### Complements: The Probability of "At Least One"

"At least one" is equivalent to "one or more."

### The complement of getting at least one item of a particular type is that you get no items of that type.

...

### Compound Event

P(A or B) = P (in a single trial, event A occurs or event B occurs or they both occur)

### Conditional probability

Find the probability of an event when we have additional information that some other event has already occurred.

### Confusion of the Inverse

To incorrectly believe that P(A|B) and P(B|A) are the same, or to incorrectly use one value for the other, is often called confusion of the inverse.

### Dependent and Independent

Two events A and B are independent if the occurrence of one does not affect the probability of the occurrence of the other. (Several events are similarly independent if the occurrence of any does not affect the probabilities of the occurrence of the others.) If A and B are not independent, they are said to be dependent.

### Disjoint or Mutually Exclusive

Events A and B are disjoint (or mutually exclusive) if they cannot occur at the same time. (That is, disjoint events do not overlap.)

### Factorial Rule

A collection of n different items can be arranged in order n! different ways. (This factorial rule reflects the fact that the first item may be selected in n different ways, the second item may be selected in n - 1 ways, and so on.)

### where P(A and B) denotes the probability that A and B both occur at the same time as an outcome in a trial of a procedure.

...

### Fundamental Counting Rule

For a sequence of two events in which the first event can occur m ways and the second event can occur n ways, the events together can occur a total of m * n ways.

### General Rule for a Compound Event

When finding the probability that event A occurs or event B occurs, find the total number of ways A can occur and the number of ways B can occur, but find that total in such a way that no outcome is counted more than once

### Intuitive Approach to Conditional Probability

The conditional probability of B given A can be found by assuming that event A has occurred, and then calculating the probability that event B will occur.

### Law of Large Numbers

As a procedure is repeated again and again, the relative frequency probability of an event tends to approach the actual probability.
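A quick simulation illustrates the idea. This is a sketch only: it assumes a fair coin, and the seed and toss counts are arbitrary choices. The relative frequency of heads should drift toward 0.5 as the number of tosses grows.

```python
import random

def relative_frequency_of_heads(tosses, seed=0):
    """Simulate `tosses` fair coin flips and return the share of heads."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses

for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```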

### Notation for Probabilities

P - denotes a probability.

A, B, and C - denote specific events.

P(A) - denotes the probability of event A occurring.

### Notation for Conditional Probability

P(B|A) represents the probability of event B occurring after it is assumed that event A has already occurred (read B|A as "B given A.")

### Permutations Rule (when items are all different)

Requirements:

There are n different items available. (This rule does not apply if some of the items are identical to others.)

### We consider rearrangements of the same items to be different sequences. (The permutation of ABC is different from CBA and is counted separately.)

...

### Permutations Rule (when some items are identical to others)

Requirements:

There are n items available, and some items are identical to others.

### Permutations versus Combinations

When different orderings of the same items are to be counted separately, we have a permutation problem, but when different orderings are not to be counted separately, we have a combination problem.
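Python's standard library covers these counting rules directly (`math.perm` and `math.comb` need Python 3.8 or later). The values n = 5 and r = 3 are illustrative only.

```python
import math

n, r = 5, 3

print(math.factorial(n))  # 120: orderings of all 5 distinct items
print(math.perm(n, r))    # 60: ordered selections of 3 from 5
print(math.comb(n, r))    # 10: unordered selections of 3 from 5

# Each combination of r items can be arranged in r! orders.
print(math.perm(n, r) == math.comb(n, r) * math.factorial(r))  # True
```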

###
For any event A, the probability of A is between 0 and 1 inclusive.

That is, 0 <= P(A) <= 1

...

### Probability of "at least one"

Find the probability that among several trials, we get at least one of some specified event.

### Rare Event Rule for Inferential Statistics

If, under a given assumption, the probability of a particular observed event is extremely small, we conclude that the assumption is probably not correct.

### Rounding Off Probabilities

When expressing the value of a probability, either give the exact fraction or decimal or round off final decimal results to three significant digits. (Suggestion: When a probability is not a simple fraction such as 2/3 or 5/9, express it as a decimal so that the number can be better understood.)

### Sample Space

for a procedure consists of all possible simple events; that is, the sample space consists of all outcomes that cannot be broken down any further

### The 5% Guideline for Cumbersome Calculations

If a sample size is no more than 5% of the size of the population, treat the selections as being independent (even if the selections are made without replacement, so they are technically dependent).

### exhaustive events

If one includes all the possible outcomes, then A and A' are exhaustive because one of them must happen.

### frequentist interpretation of probability

probability of an event proportional to number of times event occurs in a large number of repetitions of the experiment

### joint probabilities

probabilities that correspond to the events represented in the cells of the contingency table

### marginal probabilities

probabilities that correspond to the events represented in the margin of the contingency table.

### prior probability

initial probability of a state of nature before sample information is used with Bayes Theorem

### probability model

a mathematical description of a random phenomenon consisting of a sample space and a way of assigning probabilities to events

### outcome

the result of a single performance of an experiment; a set of outcomes is denoted with braces {}

### probability model

a table or listing of all the possible outcomes of an experiment, together with the probability of each outcome; must follow the Rules of Probability

### probability

of an outcome is defined as the long-term proportion of times the outcome occurs; a number that indicates how likely the particular outcome is

### simulation

uses methods such as rolling dice or computer generation of random numbers to generate results from an experiment.

### Benford's Law

Mathematical algorithm that accurately predicts that, for many data sets, the first digit of each group of numbers in a random sample will begin with 1 more than a 2, a 2 more than a 3, a 3 more than a 4, and so on. Predicts the percentage of time each digit will appear in a sequence of numbers.
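The card gives no formula. The usual statement of Benford's Law, added here and not taken from the card, is P(d) = log10(1 + 1/d) for leading digit d = 1, ..., 9. The helper name is illustrative.

```python
import math

def benford_probability(digit):
    """Benford's Law probability that the leading digit equals `digit`."""
    if digit not in range(1, 10):
        raise ValueError("leading digit must be 1 through 9")
    return math.log10(1 + 1 / digit)

for d in range(1, 10):
    print(d, round(benford_probability(d), 3))
```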

### complement of an event

the probability that an event does not occur; all outcomes in a sample space that are not outcomes in the event

### contingency table

A table that relates two categories of data; two-way table. Variables are placed in rows and columns; each intersection of variables is a cell in the table.

### equation for approximating probabilities using the empirical approach

P(E) ≈ relative frequency of E =

(frequency of E)/(number of trials of experiment)

### equation for computing probability using the classical method

P(E) = (number of ways that E can occur)/ (number of possible outcomes) = m/n

### factorial symbol (n!)

if n ≥ 0 is an integer, the factorial symbol, n!, is defined as follows:

n! = n ∗ (n-1) ∗ ⋅⋅⋅ ∗ 3 ∗ 2 ∗ 1 for n ≥ 1, and 0! = 1

### multiplication rule of counting

If a task consists of a sequence of choices in which there are p selections for the first choice, q selections for the second choice, r selections for the third choice, etc., then the task of making these selections can be done in

p∗q∗r∗⋅⋅⋅ ways

### number of combinations of n distinct objects taken r at a time

The number of different arrangements of n objects using r ≤ n of them, in which

1. the n objects are distinct

2. repetition of objects is not allowed

3. order is not important

### number of permutations of distinct objects in groups

The number of arrangements of r objects chosen from n objects in which

1. the n objects are distinct

2. repetition of objects is not allowed

3. order is important

### permutation

an arrangement in which r objects are chosen from n distinct objects, repetition is not allowed, and order is important.

### probability model

lists the possible outcomes of a probability experiment and each outcome's probability

### Rules of probability

1. The probability of any event must be between 0 and 1, inclusive. 0 ≤ P(E) ≤ 1.

2. The sum of the probabilities of all outcomes must equal 1.

3. If E and F are disjoint events, then P(E or F) = P(E) + P(F). If E and F are not disjoint events, then P(E or F) = P(E) + P(F) - P(E and F)

4. If E represents any event and Ec represents the complement of E, then P(Ec) = 1 - P(E)

5. If E and F are independent events, then P(E and F) = P(E)∗P(F)

### the law of large numbers

as the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the probability of the outcome

### tree diagram

a diagram to determine a sample space that lists the equally likely outcomes of an experiment

### Venn diagram

A diagram that uses circles contained within a rectangle to display elements of different sets. The rectangle represents the sample space, and circles represent events.

### simulation

the imitation of chance behavior, based on a model that accurately reflects the phenomenon under consideration

### random

when individual outcomes are uncertain but there is a regular distribution of outcomes in a large number of repetitions

### probability model

a mathematical description of a random phenomenon consisting of a sample space and a way of assigning probabilities to events

### Multiplication Principle

If you can do one task in n1 ways and a second task in n2 ways, then both tasks can be done in n1*n2 ways

### Area and Probability

Because the total area under the density curve is equal to 1, there is a correspondence between area and probability.

### Binomial Probability Distribution

1.The procedure must have a fixed number of trials.

2. The trials must be independent.

3. Each trial must have all outcomes classified into two categories (commonly, success and failure).

4.The probability of success remains the same in all trials

### Central Limit Theorem - continued

Conclusions:

1. The distribution of sample means xbar will, as the sample size increases, approach a normal distribution.

2. The mean of the sample means is the population mean µ.

3. The standard deviation of all sample means is σ / (n)^(1/2)
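Conclusions 2 and 3 can be checked by simulation. The sketch below assumes a uniform population on [0, 1], which is not normal and has mean 0.5 and standard deviation (1/12)^(1/2). The sample size, number of samples, and seed are arbitrary choices.

```python
import math
import random
import statistics

def simulate_sample_means(sample_size, num_samples, seed=0):
    """Draw repeated samples from Uniform(0, 1) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

n = 36
means = simulate_sample_means(sample_size=n, num_samples=5000)
population_mean = 0.5
population_sd = math.sqrt(1 / 12)

# Mean of the sample means vs. the population mean.
print(statistics.fmean(means), population_mean)
# Standard deviation of the sample means vs. sigma / n^(1/2).
print(statistics.stdev(means), population_sd / math.sqrt(n))
```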

### Central Limit Theorem Description

for a population with any distribution, the distribution of the sample means approaches a normal distribution as the sample size increases.

### Central Limit Theorem Requirements

Given:

1. The random variable x has a distribution (which may or may not be normal) with mean µ and standard deviation σ

2. Simple random samples all of size n are selected from the population. (The samples are selected so that all possible samples of the same size n have the same chance of being selected.)

### continuity correction

When we use the normal distribution (which is a continuous probability distribution) as an approximation to the binomial distribution (which is discrete), a continuity correction is made to a discrete whole number x in the binomial distribution by representing the discrete whole number x by the interval from

x - 0.5 to x + 0.5

(that is, adding and subtracting 0.5).

### Density Curve

A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

1. The total area under the curve must equal 1.

2. Every point on the curve must have a vertical height that is 0 or greater. (That is, the curve cannot fall below the x-axis.)

### Helpful Hints

1. Don't confuse z scores and areas. z scores are points along the horizontal scale, but areas are regions under the normal curve.

2. Choose the correct (right/left) side of the graph.

3. A z score must be negative whenever it is located in the left half of the normal distribution.

4. Areas (or probabilities) are positive or zero values, but they are never negative.

### Practical Rules Commonly Used

1. For samples of size n larger than 30, the distribution of the sample means can be approximated reasonably well by a normal distribution. The approximation gets closer to a normal distribution as the sample size n becomes larger.

2. If the original population is normally distributed, then for any sample size n, the sample means will be normally distributed (not just the values of n larger than 30).

Standard deviation of the sample mean (the standard error of the mean): σ(xbar) = σ / (n)^(1/2)

### Standard Normal Distribution

The standard normal distribution is a normal probability distribution with μ = 0 and σ = 1. The total area under its density curve is equal to 1.

### Uniform Distribution

A continuous random variable has a uniform distribution if its values are spread evenly over the range of possibilities. The graph of a uniform distribution results in a rectangular shape.

### using a normal distribution as an approximation to the binomial probability distribution.

If the conditions of np ≥ 5 and nq ≥ 5 are both satisfied, then probabilities from a binomial probability distribution can be approximated well by using a normal distribution with mean μ = np and standard deviation σ = (n * p * q)^(1/2)
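To see how close the approximation is, the sketch below compares an exact binomial probability with the normal approximation, using a continuity correction of 0.5. The values n = 100, p = 0.5, and k = 45 are illustrative, and `binomial_cdf` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial(n, p) random variable."""
    return sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

n, p, k = 100, 0.5, 45
q = 1 - p
if n * p < 5 or n * q < 5:
    raise ValueError("np >= 5 and nq >= 5 are required for the approximation")

exact = binomial_cdf(k, n, p)
# Normal with mu = np and sigma = (npq)^(1/2); k + 0.5 is the continuity correction.
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q)).cdf(k + 0.5)
print(exact, approx)
```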

### Chi-Square Distribution

In a normally distributed population with variance σ^2, assume that we randomly select independent samples of size n and, for each sample, compute the sample variance s^2 (which is the square of the sample standard deviation s). The sample statistic χ^2 = (n - 1) * s^2 / σ^2 (pronounced chi-square) has a sampling distribution called the chi-square distribution.

### Choosing the Appropriate Distribution

Use the normal (z) distribution

If σ known and normally distributed population or σ known and n > 30

###
Use t distribution

if σ not known and normally distributed population or σ not known and n > 30

...

###
Use a nonparametric method or bootstrapping

If Population is not normally distributed and n ≤ 30

...

### confidence interval (or interval estimate)

range (or an interval) of values used to estimate the true value of a population parameter. A confidence interval is sometimes abbreviated as CI.

### Confidence Interval for Estimating a Population Mean (with σ Known)

1. The sample is a simple random sample. (All samples of the same size have an equal chance of being selected.)

2. The value of the population standard deviation σ is known.

3. Either or both of these conditions is satisfied: The population is normally distributed or n > 30.

### Confidence Interval for Estimating a Population Mean (with σ Known)

xbar - E < μ < xbar + E

or

xbar +/- E

or

(xbar - E, xbar + E)

where E = z(α/2) * ( σ / (n)^(1/2) )
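The interval can be computed with the standard library alone, using `statistics.NormalDist` in place of a z table. This is a sketch; the inputs (xbar = 100, σ = 15, n = 36) are made-up numbers and the function name is illustrative.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for mu when sigma is known: xbar +/- E."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z(alpha/2)
    margin = z * sigma / math.sqrt(n)        # E = z(alpha/2) * sigma / n^(1/2)
    return xbar - margin, xbar + margin

low, high = mean_confidence_interval(xbar=100, sigma=15, n=36)
print(round(low, 2), round(high, 2))  # 95.1 104.9
```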

### Confidence Interval for Estimating a Population Proportion p notation

p̂ - E < p < p̂ + E

p̂ +/- E

(p̂ - E, p̂ + E)

### Confidence Interval for Estimating a Population Proportion p

p̂ - E < p < p̂ + E

where

E = z(α/2) * ( [p̂ * q̂] / n )^(1/2)
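A minimal sketch of the proportion interval, assuming the normal approximation applies. The sample values (p̂ = 0.4, n = 600) are invented for illustration, as is the function name.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Confidence interval for a population proportion: p_hat +/- E."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z(alpha/2)
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_confidence_interval(p_hat=0.4, n=600)
print(round(low, 3), round(high, 3))  # 0.361 0.439
```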

### Confidence Interval for Estimating a Population Standard Deviation or Variance

( [n - 1] * s^2 ) / χ(R)^2 < σ^2 < ( [n - 1] * s^2 ) / χ(L)^2

### Confidence Intervals for Comparing Data Caution

Confidence intervals can be used informally to compare the variation in different data sets, but the overlapping of confidence intervals should not be used for making formal and final conclusions about equality of variances or standard deviations.

### confidence level, degree of confidence, or the confidence coefficient.

is the probability 1 - α (often expressed as the equivalent percentage value) that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times.

### Critical Value

A critical value is the number on the borderline separating sample statistics that are likely to occur from those that are unlikely to occur.

### degrees of freedom

The number of degrees of freedom for a collection of sample data is the number of sample values that can vary after certain restrictions have been imposed on all data values. The degree of freedom is often abbreviated df.

degrees of freedom = n - 1 in this section

### Finding the Point Estimate and E from a Confidence Interval

Point estimate of µ:

xbar = (upper confidence limit + lower confidence limit) / 2

### Finding the Point Estimate and E from a Confidence Interval

Point estimate of p:

p̂ = (upper confidence limit + lower confidence limit) / 2

Margin of error E = (upper confidence limit - lower confidence limit) / 2

### Important Properties of the Student t Distribution

1. The Student t distribution is different for different sample sizes (for example, the distributions for n = 3 and n = 12 differ).

2. The Student t distribution has the same general symmetric bell shape as the standard normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples.

3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).

4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has a σ = 1).

5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution.

### Point Estimate of the Population Mean

The sample mean xbar is the best point estimate of the population mean µ.

### Procedure for Constructing a Confidence Interval for p

1.Verify that the required assumptions are satisfied. (The sample is a simple random sample, the conditions for the binomial distribution are satisfied, and the normal distribution can be used to approximate the distribution of sample proportions because np >= 5, and nq >= 5 are both satisfied.)

2. Refer to Table A-2 and find the critical value z(α/2) that corresponds to the desired confidence level.

3. Evaluate the margin of error

4. Using the calculated margin of error E and the value of the sample proportion p̂, find the values of p̂ - E and p̂ + E. Substitute those values in the general format for the confidence interval: p̂ - E < p < p̂ + E

5. Round the resulting confidence interval limits to three significant digits.

### Procedure for Constructing a Confidence Interval for µ (With σ Unknown)

1. Verify that the requirements are satisfied.

2. Using n - 1 degrees of freedom, refer to Table A-3 or use technology to find the critical value t(α/2) that corresponds to the desired confidence level.

3. Evaluate the margin of error E = t(α/2) * [ s / n^(1/2) ].

4. Find the values of xbar - E and xbar + E. Substitute those values in the general format for the confidence interval: xbar - E < μ < xbar + E

5. Round the resulting confidence interval limits

### Procedure for Constructing a Confidence Interval for σ or σ^2

1. Verify that the required assumptions are satisfied.

2. Using n - 1 degrees of freedom, refer to Table A-4 or use technology to find the critical values χ(R)^2 and χ(L)^2 that correspond to the desired confidence level.

3. Evaluate the upper and lower confidence interval limits using this format of the confidence interval:

( [n - 1] * s^2 ) / χ(R)^2 < σ^2 < ( [n - 1] * s^2 ) / χ(L)^2

4. If a confidence interval estimate of σ is desired, take the square root of the upper and lower confidence interval limits and change σ^2 to σ.

5. Round the resulting confidence interval limits. If using the original set of data to construct a confidence interval, round the confidence interval limits to one more decimal place than is used for the original set of data. If using the sample standard deviation or variance, round the confidence interval limits to the same number of decimal places.

### Procedure for Constructing a Confidence Interval for µ (with Known σ)

1. Verify that the requirements are satisfied.

2. Refer to Table A-2 or use technology to find the critical value z(α/2) that corresponds to the desired confidence level

3. Evaluate the margin of error E = z(α/2) * ( σ / (n)^(1/2) )

4. Find the values of xbar - E and xbar + E. Substitute those values in the general format of the confidence interval

5. Round using the confidence intervals round-off rules.

### Properties of the Distribution of the Chi-Square Statistic

1. The chi-square distribution is not symmetric, unlike the normal and Student t distributions.

As the number of degrees of freedom increases, the distribution becomes more symmetric.

2. The values of chi-square can be zero or positive, but they cannot be negative.

3. The chi-square distribution is different for each number of degrees of freedom, which is df = n - 1. As the number of degrees of freedom increases, the chi-square distribution approaches a normal distribution.

In Table A-4, each critical value of χ^2 corresponds to an area given in the top row of the table, and that area represents the cumulative area located to the right of the critical value.

### Round-Off Rule for Confidence Intervals Used to Estimate µ

When using the original set of data, round the confidence interval limits to one more decimal place than used in original set of data.

When the original set of data is unknown and only the summary statistics (n, xbar, s) are used, round the confidence interval limits to the same number of decimal places used for the sample mean.

### Round-Off Rule for Determining Sample Size

If the computed sample size n is not a whole number, round the value of n up to the next larger whole number.

### Round-Off Rule for Sample Size n

If the computed sample size n is not a whole number, round the value of n up to the next larger whole number.

### Sample Mean

1. For all populations, the sample mean xbar is an unbiased estimator of the population mean μ, meaning that the distribution of sample means tends to center about the value of the population mean μ.

2. For many populations, the distribution of sample means xbar tends to be more consistent (with less variation) than the distributions of other sample statistics.

### Sample Size for Estimating Proportion p

When an estimate p̂ of p is known:

n = (z(α/2)^2 * p̂ * q̂) / E^2

When no estimate of p is known:

n = (z(α/2)^2 * 0.25) / E^2
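Both sample-size formulas, together with the round-up rule, fit in one small function. This is a sketch; the margin of error 0.03 and the prior estimate 0.2 are illustrative values, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p_hat=None):
    """Required n for estimating p; uses 0.25 when no estimate of p is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z(alpha/2)
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Round up to the next whole number, per the round-off rule.
    return math.ceil(z**2 * pq / margin**2)

print(sample_size_for_proportion(0.03))             # 1068
print(sample_size_for_proportion(0.03, p_hat=0.2))  # 683
```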

### Student t Distribution

If the distribution of a population is essentially normal, then the distribution of

t = (xbar - μ) / [ s / n^(1/2) ]

is a Student t distribution for all samples of size n, with n - 1 degrees of freedom.

### Addition rule for disjoint events

P(A or B) = P(A) + P(B)

If two events are disjoint, the probability of getting one or the other is the sum of their individual probabilities.

### Continuous Probability Model

A probability model that assigns probabilities as areas under a density curve; the probability of any event is the area under the curve and above the values on the horizontal axis that make up the event.

### Density Curve

A curve that is on or above the horizontal axis and has an area of exactly 1 underneath it. It describes the overall pattern of a distribution. Merely a model - no set of real data is exactly described by a density curve.

### Discrete/Categorical Probability Model

A probability model with a sample space made up of a finite list of individual outcomes.

### Finite Sample Space

A sample space dealing with either discrete or categorical variables that can take on only certain values.

### Intervals and Areas of Density Curves

For continuous probability models using a density curve, events are defined over intervals of values, and probability is computed as areas under the density curve.

### Mean "mu" of a density curve

The point at which the density curve would be balanced, if it were physical.

### Odds

The ratio of the probability of an outcome of a random phenomenon over the probability of that outcome not occurring.

### P(A does not occur) = 1 - P(A)

The probability of an event not occurring is equal to 1 minus the probability of the event happening.

### Personal/Subjective Probability

A number between 0 and 1 that expresses an individual's judgement of how likely the outcome is.

### Probability Distribution

The distribution of a random variable X that tells us what values X can take and how to assign probabilities to those values.

### Probability Model

A mathematical description of a random phenomenon consisting of two parts: a sample space S, and a way of assigning probabilities to events.

### Random Phenomenon

When the individual outcomes of a phenomenon are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions.

### Random samples eliminate bias from the act of choosing a sample, but they can still be wrong because of...

... the variability that results when one chooses at random.

### The idea of probability

Chance behavior is unpredictable short-term, but has a regular and predictable pattern in the long run.

### The outcome of a single individual outcome for a continuous probability model

All continuous probability models assign a probability of 0 to any individual outcome; only intervals of values can have positive probability.

### Bayes's Theorem: P[A|B] =

(P[B|A] * P[A]) / (P[B|A] * P[A] + P[B|A'] * P[A'])
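A worked instance helps show why P[A|B] and P[B|A] can differ sharply. The numbers below are hypothetical, not from the source: a condition with 1% prevalence, and a test that is positive 95% of the time when the condition is present and 5% of the time when it is absent.

```python
def bayes_posterior(p_b_given_a, p_a, p_b_given_not_a):
    """P[A|B] from P[B|A], the prior P[A], and P[B|A']."""
    p_not_a = 1 - p_a
    numerator = p_b_given_a * p_a
    denominator = numerator + p_b_given_not_a * p_not_a  # total P[B]
    return numerator / denominator

# A = has the condition, B = tests positive (hypothetical values).
posterior = bayes_posterior(p_b_given_a=0.95, p_a=0.01, p_b_given_not_a=0.05)
print(round(posterior, 3))  # 0.161
```

Here P[B|A] is 0.95 but P[A|B] is only about 0.161, because the prior P[A] is small.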

### Binomial Distribution with Parameters n & p:

p(x) is the probability that there will be exactly x successes in the n trials

### Binomial Distribution: p, q, x, n =

p = probability success, q = probability failure, x = successes, n = independent trials
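With this notation, the standard binomial formula is p(x) = C(n, x) * p^x * q^(n - x). The formula is not stated on the card and is added here. The sketch uses `math.comb`, and the function name is illustrative.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    return math.comb(n, x) * p**x * q ** (n - x)

# Three tails in three fair coin tosses.
print(binomial_pmf(3, 3, 0.5))  # 0.125
```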

### Cumulative Distribution Function:

F(x) = P[X ≤ x] (Probability to the left of, and including, the point x)

### DeMorgan's Laws:

(A or B)' = A' & B' ; (A & B)' = A' or B'

### Exhaustive Outcomes:

Combine to the entire probability space. Or, one of the outcomes must occur whenever the experiment is performed.

### How do you Standardize a Normal Distribution: P[r < X < s]

Z = (X-u)/std, so P[r < X < s] = P[(r-u)/std < Z < (s-u)/std]
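The standardization step can be sketched with `statistics.NormalDist`. The example values (u = 100, std = 15, r = 85, s = 115) are illustrative, chosen so that the interval is one standard deviation on each side of the mean.

```python
from statistics import NormalDist

def prob_between(r, s, mu, std):
    """P[r < X < s] for a normal X, computed from standardized z scores."""
    z_low = (r - mu) / std
    z_high = (s - mu) / std
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(round(prob_between(85, 115, mu=100, std=15), 4))  # 0.6827
```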

### Joint Distribution of Random Variables:

The Probability of two or more random variables together as a joint distribution

### P[A or B or C] =

P[A] + P[B] + P[C] - P[A & B] - P[A & C] - P[B & C] + P[A & B & C]

### Poisson distribution used as a model for:

Counting the number of events of a certain type that occur in a certain period of time

### boxplot

displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values

### conditional probability

The conditional probability of B given A, written P(B|A), is the probability that event B will occur given that event A has occurred

### dependent events

Two events such that the occurrence of one event affects the occurrence of the other event

### five- number summary

A list of numbers that lists the minimum, first quartile, median, third-quartile, and the maximum of a data set.

### mean absolute deviation

the average of the absolute deviations for all the data values in the sample

### mutually exclusive event

events that have no common outcome; two events that cannot occur at the same time

### quartile

when data in a set are arranged in order, quartiles are the numbers that split the data into quarters or fourths

### random sample

A sample in which every member of the population has an equal chance of being selected

### brood

(n.) a family of young animals, especially birds; any group having the same nature or origin; (v.) to think over in a worried, unhappy way.

### cater

(v.) to satisfy the needs of, try to make things easy and pleasant; to supply food and service.

### drone

(n.) a loafer, idler; a buzzing or humming sound; a remote-control device; a male bee. (v.) to make a buzzing sound; to speak in a dull tone of voice.

### literate

(adj.) able to read and write; showing an excellent educational background; having knowledge or training.

### plague

(n.) an easily spread disease causing a large number of deaths; a widespread evil; (v.) to annoy or bother

### transparent

(adj.) allowing light to pass through; easily recognized or understood; easily seen through or detected

### Descriptive Statistics

data is summarized using numerical and graphical techniques in some useful way

### Experiment

deliberately imposes some treatment on individuals in order to observe their responses. Used to study whether the treatment causes a change in the response

### Individual

object described by a set of data. Individuals may be people, but they may also be animals or things

### Interval Level

like ordinal, but differences between values make sense; data does not have a natural zero or starting point

### Line Graph

used to indicate a trend over time. Horizontal axis = time. Vertical axis = observed numerical data. Look for: overall pattern or trend, deviations, seasonal variations, pay specific attention to vertical scale

### Observational Study

observes individuals and measures variables of interest but does not attempt to influence the responses. Purpose of study is to describe some group or situation

### Parameter

a number describing or calculated from a population, usually the actual numerical value is unknown and we must describe the parameter in words

### Pictogram

uses pictures as part of the representation. Pictures are not often to scale, should not be used

### Pie Chart

divides the data up into slices, where each slice represents one category. The size of each slice is determined by the relative frequency of each category. Used only when data represents parts of one whole

### Quantitative Data

numerical variable for which it makes sense to do arithmetic operations; measurements

### Sample

the part of the population from which we actually collect information and is used to draw conclusions about the whole

### Statistic

a number describing or calculated from a sample, usually the actual numerical value is known

### Classical (or theoretical) Probability

is used when each outcome in a sample space is equally likely to occur

### Complement of Event E

the set of all outcomes in a sample space that are not included in event E. The complement of event E is denoted by E' and is read as "E prime"

### Law of Large Numbers

as you increase the number of times a probability experiment is repeated, the empirical probability (relative frequency) of an event approaches the theoretical probability of the event

### "something has to happen rule"

the sum of the probabilities of all possible outcomes must be 1

### complement rule

the probability of one event occurring is 1 minus the probability that it does NOT occur.

P(A)= 1-P(A') A' read as A complement

### disjoint (mutually exclusive)

Two events share NO outcomes in common. As a matter of fact, knowing that A occurs tells us that B CANNOT occur.

### event

a collection of outcomes, usually identified to attach probabilities to them; denoted by capital letters such as A,B, or C.

### General Addition Rule

For any two events, A and B, the probability of A or B is P(A∪B)=P(A)+P(B)-P(A∩B)
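
The rule can be checked by direct counting. The sketch below assumes one roll of a fair six-sided die; the events `A` and `B` and the helper `prob` are hypothetical names chosen for illustration:

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event: the roll is even
B = {4, 5, 6}  # event: the roll is greater than 3


def prob(event):
    """Classical probability: favourable outcomes over equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


lhs = prob(A | B)                      # P(A∪B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)  # P(A) + P(B) - P(A∩B)
print(lhs, rhs)
```

Subtracting P(A∩B) removes the outcomes 4 and 6, which would otherwise be counted twice.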

### General Multiplication Rule

For any two events A and B, the probability of A and B is P(A∩B) = P(A) × P(B|A)

### independence (informally)

this holds between two events when knowing whether or not one event occurs does NOT alter the probability that the other event occurs

### law of large numbers

states that the long-run relative frequency of repeated independent events settles down to the TRUE relative frequency as the number of trials increases.

### legitimate probability assignment

each probability is between 0 and 1 (inclusive) and the sum of the probabilities is 1

### multiplication rule

If A and B are independent events, then the probability of A and B is

P(A∩B) = P(A) × P(B)

### outcome

an individual result of a component of a simulation; the value measured, observed, or reported or an individual instance of the trial

### probability of an event

a number between 0 and 1 that reports the likelihood of the event's occurrence; can be derived from equally likely outcomes, from the long-run relative frequency of the event's occurrence, or from known probabilities. We write P(A) for the probability of an event A

### random event

an event where we know what outcomes could happen, but not which particular values will happen

### response variable

a record of the resulting values from each trial that corresponds to what we were interested in

### simulation

models random events by using random numbers to specify event outcomes with relative frequencies that correspond to the true real-world relative frequencies we are trying to model

### tree diagram

a display of conditional events or probabilities that is helpful in thinking through conditioning.

### Experimental Study

the factors whose effects are to be assessed are manipulated appropriately by devising a suitable design

-conceptual

-data creates background

### Observational Study

a survey of an existing population carried out by adopting a sample procedure (pre-existing/ in place)

### QuaLitative Variable

can be identified by noting its presence; describes an observation as belonging to one of a set of categories

### Reasons for Sampling 8

1. lower cost

2. less time

3. provides relevant information

4. population might be destroyed

5. population size might be infinite

6. population might not be available

7. Risk factor

8. Avoid administrative problems

### Statistics

A science that deals with methods of collecting, organizing, and summarizing data in such a way that valid conclusions can be drawn from them

### complement rule

the probability of an event occurring is 1 minus the probability that it doesn't occur

### probability

the proportion of times the event occurs in many repeated trials of a random phenomenon (the long-term relative frequency of an event)

### random phenomena

the rules and concepts of probability that give us a language to talk and think about ________

### the law of large numbers

the long run relative frequency of repeated independent events settles down to the true probability as the number of trials increases

### 1st Quartile

median of the portion of the entire data set that lies at or below the median of the entire data set

### 3rd Quartile

median of the portion of the entire data set that lies at or above the median of the entire data set

### 68.26% - 95.44% - 99.74% Rule

1) 68.26% of all observations lie within one standard deviation to either side of the mean

2) 95.44% of all observations lie within two standard deviations

3) 99.74% of all observations lie within three standard deviations

### Boxplot

A plot of data based on the five number summary. A line is drawn from the minimum observation to Q1; a box is drawn from Q1 to Q3 with a vertical line at the median and a line is drawn from Q3 to the maximum observation.

### Discrete Random Variable

random variable whose possible values form a finite or countably infinite set of numbers

### Discrete Variable

quantitative variable whose possible values form a finite (or countably infinite) set of numbers

### Distribution of a Data Set

a table, graph, or formula that provides the values of the observations and how often they occur

### Independent Events

two events in which the outcome of one event does not affect the outcome of the other event.

### Inferential Statistics

Consists of methods for drawing and measuring the reliability of conclusions about a population based on info obtained from a sample of the population

### Observational Study compared to a Designed Experiment

Designed experiment - treatments are imposed and experiment is controlled

Observational study - experiment is only observed, no treatments imposed

### Outliers

lower limit = Q1 - 1.5 x IQR

upper limit = Q3 + 1.5 x IQR

if data falls below lower limit or above upper limit, it is an outlier
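
A short Python sketch of these limits. The quartiles are passed in by the caller because textbooks compute Q1 and Q3 with slightly different conventions; the function names and sample values are illustrative assumptions:

```python
def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """Return the values that fall below the lower limit or above the upper limit."""
    lower, upper = outlier_limits(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical data with Q1 = 10 and Q3 = 20, so IQR = 10.
print(outlier_limits(10, 20))
print(find_outliers([3, 12, 15, 18, 41, -7], 10, 20))
```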

### Representative Sample

sample that reflects as closely as possible the relevant characteristics of the population under consideration

### Simple Random Sampling

sampling procedure for which each possible sample of a given size is equally likely to be the one obtained

### Three Conditions for Bernoulli Trials

1) each trial has two possible outcomes: success (with probability p) and failure (with probability q = 1 - p)

2) trials are independent

3) probability of a success remains the same from trial to trial

### Three Principles of Experimental Design - Control

control effects due to factors other than ones of primary interest

### Three Principles of Experimental Design - Randomization

divide into groups to avoid unintentional selection bias

### Three Principles of Experimental Design - Replication

ensure randomization creates groups that resemble each other, and increase the chances of detecting differences among treatments

### Three Standard Deviation Rule

for any data set almost all data values fall within three standard deviations of the mean

### Unimodal, Bimodal, & Multimodal Distributions

unimodal - has one peak

bimodal - has two peaks

multimodal - has three or more peaks

### 5-Number Summary

For a set of data, the 5-number summary consists of the minimum value; the first quartile Q1; the median (or second quartile Q2); the third quartile, Q3; and the maximum value.

### Arithmetic Mean (Mean)

the measure of center obtained by adding the values and dividing the total by the number of values

### Boxplot skeletal (or regular)

A boxplot (or box-and-whisker-diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.

### Coefficient of Variation

The coefficient of variation (or CV) for a set of nonnegative sample or population data, expressed as a percent, describes the standard deviation relative to the mean.

Sample

CV = s/xbar * 100%

Population

CV = σ / µ * 100%
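
As a rough illustration, both versions can be computed with Python's standard `statistics` module. The function names and the data set are hypothetical, and the sketch assumes a nonzero mean:

```python
import statistics


def sample_cv(data):
    """Sample coefficient of variation, s / xbar, as a percent."""
    return statistics.stdev(data) / statistics.mean(data) * 100


def population_cv(data):
    """Population coefficient of variation, sigma / mu, as a percent."""
    return statistics.pstdev(data) / statistics.mean(data) * 100


values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data: mean 5, population sd 2
print(population_cv(values))
print(sample_cv(values))
```

Because the CV has no units, it can be used to compare variation between data sets measured on different scales.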

### Comparing Variation in Different Samples

It's a good practice to compare two sample standard deviations only when the sample means are approximately the same.

When comparing variation in samples with very different means, it is better to use the coefficient of variation, which is defined later in this section.

### Converting from the kth Percentile to the Corresponding Data Value flowchart

1) Sort the data

2) L = (k/100) * n

3) Is L a whole number?

Yes) The value of the kth percentile is midway between the Lth value and the next value in the sorted set of data. Find P(k) by adding the Lth value and the next value and dividing by two

No) Change L by rounding it up to the next larger whole number

The value of P(k) is the Lth value, counting from the lowest.
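
The flowchart can be sketched in Python as follows. This assumes k is a whole number from 1 to 99; the function name `kth_percentile` is an illustrative choice, and integer arithmetic is used so that the "is L a whole number?" check is not affected by floating-point rounding:

```python
def kth_percentile(data, k):
    """Return the kth percentile using the sort / locate / round-up procedure."""
    values = sorted(data)                  # step 1: sort the data
    n = len(values)
    whole, remainder = divmod(k * n, 100)  # step 2: L = (k/100) * n
    if remainder == 0:
        # L is a whole number: average the Lth value and the next value.
        # Positions count from 1, list indexes count from 0.
        return (values[whole - 1] + values[whole]) / 2
    # L is not whole: round up, then take the value at that position.
    return values[whole]


scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(kth_percentile(scores, 25))  # L = 2.5, rounds up to the 3rd value
print(kth_percentile(scores, 50))  # L = 5, midway between 5th and 6th values
```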

### Description of mean

Advantages - Is relatively reliable, means of samples drawn from the same population don't vary as much as other measures of center. Takes every data value into account

Disadvantage - Is sensitive to every data value; one extreme value can affect it dramatically. It is not a resistant measure of center

### Description of Midrange

Sensitive to extremes, because it uses only the maximum and minimum values, so rarely used

Redeeming features:

(1) very easy to compute

(2) reinforces that there are several ways to define the center

(3) avoids confusion with the median

### Description of Range

It is very sensitive to extreme values; therefore not as useful as other measures of variation.

### Empirical (or 68-95-99.7) Rule

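For data sets having a distribution that is approximately bell shaped, the following properties apply:

About 68% of all values fall within 1 standard deviation of the mean.

About 95% of all values fall within 2 standard deviations of the mean.

About 99.7% of all values fall within 3 standard deviations of the mean.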

### Finding the Median

First sort the values (arrange them in order), then follow one of these:

1. If the number of data values is odd, the median is the number located in the exact middle of the list.

2. If the number of data values is even, the median is found by computing the mean of the two middle numbers.

### Important Principles of Outliers

An outlier can have a dramatic effect on the mean.

An outlier can have a dramatic effect on the standard deviation.

An outlier can have a dramatic effect on the scale of the histogram so that the true nature of the distribution is totally obscured.

### Mode

the value that occurs with the greatest frequency

Data set can have one, more than one, or no mode

### The solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.

...

### Outliers

An outlier is a value that lies very far away from the vast majority of the other values in a data set.

### Percentiles

are measures of location. There are 99 percentiles denoted P1, P2, . . . P99, which divide a set of data into 100 groups with about 1% of the values in each group.

### Values close together have a small standard deviation, but values with much more variation have a larger standard deviation

...

### For many data sets, a value is unusual if it differs from the mean by more than two standard deviations

...

### Compare standard deviations of two different data sets only if they use the same scale and units, and they have means that are approximately the same

...

### Range Rule of Thumb

is based on the principle that for many data sets, the vast majority (such as 95%) of sample values lie within two standard deviations of the mean.

### Rationale for using n - 1 versus n

There are only n - 1 independent values. With a given mean, only n - 1 values can be freely assigned any number before the last value is determined.

Dividing by n - 1 yields better results than dividing by n. It causes s^2 to target σ^2, whereas division by n causes s^2 to underestimate σ^2.

### Round-off Rule for Measures of Center

Carry one more decimal place than is present in the original set of values.

### Round-Off Rule for Measures of Variation

When rounding the value of a measure of variation, carry one more decimal place than is present in the original set of data.

### Skewed to the left

(also called negatively skewed) have a longer left tail, mean and median are to the left of the mode

### Skewed to the right

(also called positively skewed) have a longer right tail, mean and median are to the right of the mode

### Skewed

distribution of data is skewed if it is not symmetric and extends more to one side than the other

### Standard Deviation - Important Properties

The standard deviation is a measure of variation of all values from the mean.

The value of the standard deviation s is usually positive.

The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).

The units of the standard deviation s are the same as the units of the original data values.

### standard deviation

The standard deviation of a set of sample values, denoted by s, is a measure of variation of values about the mean.

Also known as the square root of the variance:

s = ( { ∑(x-xbar)^2 } / (n-1) )^(1/2)

### Symmetric

distribution of data is symmetric if the left half of its histogram is roughly a mirror image of its right half

### Unbiased Estimator

The sample variance s^2 is an unbiased estimator of the population variance σ^2, which means values of s^2 tend to target the value of σ^2 instead of systematically tending to overestimate or underestimate σ^2.

### Variance

The variance of a set of values is a measure of variation equal to the square of the standard deviation.

### z Score (or standardized value)

the number of standard deviations that a given value x is above or below the mean
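
In code this is a one-line calculation. The function name and the numbers below are illustrative assumptions (an exam score of 85 where the mean is 74.5 and the standard deviation is 7):

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


print(z_score(85, 74.5, 7))  # above the mean, so positive
print(z_score(60, 74.5, 7))  # below the mean, so negative
```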

### Cluster Sampling

Divide the entire population into pre-existing segments or clusters. The clusters are often geographic. Make a random selection of clusters. Include every member of each selected cluster in the sample.

### Completely Randomized Experiment

One in which a random process is used to assign each individual to one of the treatments.

### Confounding Variable

When the effects of one [variable] cannot be distinguished from the effects of the other. [ ] variables may be part of the study, or they may be outside lurking variables.

### Control Group

This group received a dummy treatment, enabling the researchers to control for the placebo effect. In general, a [ ] group is used to account for the influence of other known or unknown variables that might be an underlying cause of a change in response in the experimental group.

### Descriptive Statistics

Organizing, picturing, and summarizing information from samples or populations

### Double-Blind

Neither the individuals in the study nor the observers know which subjects are receiving the treatment.

### Experiment

A treatment is deliberately imposed on the individuals in order to observe a possible change in the response or variable being measured.

### Hidden Bias

The question may be worded in such a way as to elicit a specific response. The order of questions might lead to biased responses. Also, the number of responses on a Likert scale may force responses that do not reflect the respondent's feelings or experience.

### Interviewer Influence

Factors such as tone of voice, body language, dress, gender, authority, and ethnicity of the interviewer might influence responses.

### Lurking Variable

One [variable] for which no data have been collected but that nevertheless had influence on other variables in the study.

### Multistage Sampling

Use a variety of sampling methods to create successively smaller groups at each stage. The final sample consists of clusters.

### Nonresponse

Individuals either cannot be contacted or refuse to participate. [ ] can result in significant undercoverage of a population.

### Observational Study

Observations and measurements of individuals are conducted in a way that doesn't change the response or the variable being measured.

### Placebo Effect

Occurs when a subject receives no treatment but (incorrectly) believes he or she is, in fact, receiving treatment and responds favorably.

### Randomization

Used to assign the individuals to the two treatment groups. This helps prevent bias in selecting members for each group.

### Replication

[ ] of the experiment on many patients reduces the possibility that the differences in pain relief for the two groups occurred by chance alone.

### Sampling Error

Difference between measurement from a sample and the population because the sample does not perfectly represent the population

### Simple Random Samples

Take 'n' measurements from a population so that every sample of size 'n' has an equal chance of being selected and every individual has an equal chance of being included (use the random number table!)

### Statistics

The study of how to collect, organize, analyze and interpret numerical information from data

### Stratified Sampling

Divide the entire population into distinct subgroups called strata. The strata are based on a specific characteristic such as age, income, education level, and so on. All members of a stratum share the specific characteristic. Draw random samples from each stratum.

### Systematic Random Sample

Select every 'nth' subject (can be problematic if the subject is cyclical)

### Systematic Sampling

Number all members of the population sequentially. Then, from a starting point selected at random, include every Nth member of the population in the sample.

### Vague Wording

Words such as "often", "seldom", and "occasionally" mean different things to different people.

### residual plot

scatterplot of the (x, y) values after each of the y-coordinate values has been replaced by the residual value y - ŷ (where ŷ denotes the predicted value of y). That is, a residual plot is a graph of the points (x, y - ŷ).

### Using the Regression Equation for Predictions cont

3. Use the regression line for predictions only if the data do not go much beyond the scope of the available sample data. (Predicting too far beyond the scope of the available sample data is called extrapolation, and it could result in bad predictions.)

4. If the regression equation does not appear to be useful for making predictions, the best predicted value of a variable is its point estimate, which is its sample mean.

### Using the Regression Equation for Predictions

1.Use the regression equation for predictions only if the graph of the regression line on the scatterplot confirms that the regression line fits the points reasonably well.

2. Use the regression equation for predictions only if the linear correlation coefficient r indicates that there is a linear correlation between the two variables (as described in Section 10-2).

### Comparing Variation in Two Samples Requirements

1. The two populations are independent.

2. The two samples are simple random samples.

3. The two populations are each normally distributed. This requirement is strict: it applies even when the sample sizes are greater than 30.

### Confidence Interval Estimate of μ(1) - μ(2): Independent Samples

( xbar(1) - xbar(2) ) - E < ( µ(1) - µ(2) ) < ( xbar(1) - xbar(2) ) + E

### Hypothesis Test Statistic for Two Means: Independent Samples

t = ( ( xbar(1) - xbar(2) ) - ( μ(1) - μ(2) ) ) / ( (s^2(1) / n(1)) + (s^2(2) / n(2)) )^(1/2)
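
A minimal sketch of this test statistic in Python, assuming the usual unpooled form in which the denominator is the square root of s^2(1)/n(1) + s^2(2)/n(2). The function and parameter names are hypothetical, and `hypothesized_diff` stands for μ(1) - μ(2) under the null hypothesis (often 0):

```python
import math


def two_sample_t(xbar1, xbar2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """t statistic for two independent sample means (variances not pooled)."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / standard_error


# Illustrative numbers: means 10 and 8, both standard deviations 2, both n = 8.
print(two_sample_t(10, 8, 2, 2, 8, 8))
```

The sketch only computes the statistic; finding the degrees of freedom and the P-value is a separate step.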

### independent

the sample values selected from one population are not related to or somehow paired or matched with the sample values from the other population.

### Properties of the F Distribution - continued

If the two populations do have equal variances, then F = s^2(1) / s^2(2) will be close to 1 because s^2(1) and s^2(2) are close in value.

### Test Statistic for Two Proportions - cont

p(1) - p(2) = 0 (assumed in the null hypothesis)

phat(1) = x(1) / n(1)

phat(2) = x(2) / n(2)

### Test Statistic for Two Proportions

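z = ( ( phat(1) - phat(2) ) - ( p(1) - p(2) ) ) / ( (pbar * qbar / n(1)) + (pbar * qbar / n(2)) )^(1/2)

where pbar = ( x(1) + x(2) ) / ( n(1) + n(2) ) is the pooled sample proportion and qbar = 1 - pbar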
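
A hedged Python sketch of this statistic, assuming the standard pooled form: the two samples are combined into one pooled proportion, and p(1) - p(2) = 0 under the null hypothesis. The names `two_proportion_z`, `p_bar` and `q_bar` are illustrative:

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for testing p(1) = p(2), using the pooled sample proportion."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion of successes
    q_bar = 1 - p_bar
    standard_error = math.sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p_hat1 - p_hat2) / standard_error


# Illustrative counts: 30 successes out of 100 versus 20 successes out of 100.
print(two_proportion_z(30, 100, 20, 100))
```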

### alternative hypothesis

The alternative hypothesis (denoted by H1 or Ha or HA) is the statement that the parameter has a value that somehow differs from the null hypothesis.

The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.

### Conclusions in Hypothesis Testing

We always test the null hypothesis. The initial conclusion will always be one of the following:

1. Reject the null hypothesis.

2. Fail to reject the null hypothesis.

### critical region (or rejection region)

is the set of all values of the test statistic that cause us to reject the null hypothesis.

### critical value

A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis. The critical values depend on the nature of the null hypothesis, the sampling distribution that applies, and the significance level 𝞪

### Decision Criterion

P-value method:

Using the significance level 𝞪:

If P-value <= 𝞪 , reject H0.

If P-value > 𝞪 , fail to reject H0.

### hypothesis test (or test of significance)

is a standard procedure for testing a claim about a property of a population.

### null hypothesis (denoted by H0)

The null hypothesis (denoted by H0) is a statement that the value of a population parameter (such as proportion, mean, or standard deviation) is equal to some claimed value.

We test the null hypothesis directly.

Either reject H0 or fail to reject H0.

### P-Value

The P-value (or p-value or probability value) is the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true.

### Requirements for Testing Claims About a Population Mean (with σ Not Known)

1) The sample is a simple random sample.

2) The value of the population standard deviation σ is not known.

3) Either or both of these conditions is satisfied: The population is normally distributed or n > 30.

### Requirements for Testing Claims About a Population Mean (with σ Known)

1) The sample is a simple random sample.

2) The value of the population standard deviation σ is known.

3) Either or both of these conditions is satisfied: The population is normally distributed or n > 30.

### Requirements for Testing Claims About a Population Proportion p

1) The sample observations are a simple random sample.

3) The conditions np >= 5 and nq >= 5 are both satisfied, so the binomial distribution of sample proportions can be approximated by a normal distribution with µ = np and σ = (npq)^(1/2). Note: p is the assumed proportion, not the sample proportion.

### significance level (denoted by 𝞪)

is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. This is the same 𝞪 introduced in Section 7-2. Common choices for 𝞪 are 0.05, 0.01, and 0.10.

### Test Statistic for Testing a Claim About a Mean (with σ Not Known)

t = (xbar - μ(xbar) ) / (s / (n)^(1/2) )

### Test Statistic for Testing a Claim About a Mean (with σ Known)

z = (xbar - μ(xbar) ) / (σ / (n)^(1/2) )

### test statistic

The test statistic is a value used in making a decision about the null hypothesis, and is found by converting the sample statistic to a score with the assumption that the null hypothesis is true.

### Type I Error

A Type I error is the mistake of rejecting the null hypothesis when it is actually true.

### Type II Error

A Type II error is the mistake of failing to reject the null hypothesis when it is actually false.

### binomial probability distribution

1. The procedure has a fixed number of trials

2. The trials must be independent. (The outcome of any individual trial doesn't affect the probabilities in the other trials.)

3. Each trial must have all outcomes classified into two categories (commonly referred to as success and failure).

4. The probability of a success remains the same in all trials.

### Continuous random variable

infinitely many values, and those values can be associated with measurements on a continuous scale without gaps or interruptions

### Discrete random variable

either a finite number of values or countable number of values, where "countable" refers to the fact that there might be infinitely many values, but they result from a counting process

### expected value

The expected value of a discrete random variable is denoted by E, and it represents the mean value of the outcomes. It is obtained by finding the value of ∑[x • P(x)].

### Identifying Unusual Results Range Rule of Thumb

According to the range rule of thumb, most values should lie within 2 standard deviations of the mean.

We can therefore identify "unusual" values by determining if they lie outside these limits:

Maximum usual value = μ + 2σ

Minimum usual value = μ - 2σ
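
These two limits translate directly into code. The sketch below uses hypothetical function names and illustrative values (µ = 50, σ = 5):

```python
def usual_range(mu, sigma):
    """Return (minimum usual value, maximum usual value) by the range rule of thumb."""
    return mu - 2 * sigma, mu + 2 * sigma


def is_unusual(x, mu, sigma):
    """True if x lies more than 2 standard deviations from the mean."""
    low, high = usual_range(mu, sigma)
    return x < low or x > high


print(usual_range(50, 5))
print(is_unusual(62, 50, 5), is_unusual(55, 50, 5))
```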

### Mean of a Probability Distribution

µ = ∑[x • P(x)]
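
As a small worked example, the formula can be applied to a fair six-sided die. The dictionary layout (value mapped to probability) and the function name are assumptions made for illustration:

```python
from fractions import Fraction


def mean_of_distribution(distribution):
    """Sum of x * P(x) over every value x in the distribution."""
    return sum(x * p for x, p in distribution.items())


# Each face of a fair die has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(mean_of_distribution(fair_die))  # prints 7/2
```

The mean, 3.5, is not a value the die can actually show; it is the long-run average over many rolls.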

Methods for Finding Probabilities - Method 1: Using the Binomial

### Probability Distributions

describe what will probably happen instead of what actually did happen, and they are often given in the format of a graph, table, or formula.

### Probability distribution

a description that gives the probability for each value of the random variable; often expressed in the format of a graph, table, or formula

### Random variable

a variable (typically represented by x) that has a single numerical value, determined by chance, for each outcome of a procedure

### Rare Event Rule for Inferential Statistics

If, under a given assumption (such as the assumption that a coin is fair), the probability of a particular observed event (such as 992 heads in 1000 tosses of a coin) is extremely small, we conclude that the assumption is probably not correct.

### Requirements for Probability Distribution

∑P(x) = 1

where x assumes all possible values.

0 <= P(x) <= 1

for every individual value of x.

### Roundoff Rule for μ, σ, σ^2

Round results by carrying one more decimal place than the number of decimal places used for the random variable x.

If the values of x are integers, round µ, σ, and σ^2 to one decimal place.

### association

Although there may be a strong ________________ between variables this does not necessarily imply there is causation.

### block design

the random assignment of units to treatments is carried out separately within each section.

### causation

Although there may be a strong association between variables this does not necessarily imply there is this

### completely randomized design

when all experimental units are allocated at random among all treatments

### correlation

measures the direction and strength of the linear relationship between two quantitative variables.

### extrapolation

the use of a regression line to make predictions for the values outside the range of x.

### informed consent

All individuals who are subjects in a study must give this before data is collected

### institutional review board

protects the rights and welfare of human subjects participating in research activities

### normal quantile plot

a pattern on such a plot that deviates substantially from a straight line indicates that the data are not Normal.

### placebo effect

an improvement in health not due to any treatment, but only to the patient's belief that he or she will improve.

### regression line

a line that describes how the response variable changes as the explanatory variable changes.

### residual

the difference between an observed value of the response variable and the value predicted by the regression line

### transformation

if the data does not appear linearly distributed it may be necessary to do this on the data to change the distribution to a linear distribution

### Voluntary Response

Individuals with strong feelings about a subject are more likely than others to respond. Such a study is interesting but not reflective of the population.

### 74.5

A statistics student receives a score of 85 on a statistics midterm. If the corresponding z-score equals 1.5 and the standard deviation equals 7, the average score on this exam is __________.

### asymptotic

__________ means that the normal curve gets closer and closer to the x-axis but never actually touches it.

### continuous probability distribution

What type of probability distribution is the normal distribution?

### less than

A negative z-score indicates that the corresponding value in the original distribution is __________ the mean.

### negative

For a normal distribution curve, the z value for an x value that is less than µ is always __________.

### normal probability distribution

the most widely used continuous probability distribution, which plays a central role in statistical inference; can be used to describe many phenomena in real-life situations

### standard normal distribution

If we convert values of a normal distribution to a distribution that has a mean of 0 and a standard deviation of 1, this probability distribution is called __________.

### standard normal distribution

Normal distribution with a mean µ=0 and a standard deviation σ=1 is known as __________.

### z-score

A __________ is the distance between a selected value (x) and the population mean (µ) divided by the population standard deviation (σ).

### z = (x - µ)/σ

The formula to convert any normal distribution to the standard normal distribution is __________.

### "Something has to happen rule"

The sum of the probabilities of all possible outcomes of a trial must be 1.

### Addition Rule "or"

If A and B are disjoint events, then the probability of A or B is P(A U B) = P(A) + P(B).

### Complement Rule or "at least one"

The probability of an event occurring is 1 minus the probability it doesn't occur. P(A) = 1 - P(A^c), where A^c is the event that A doesn't occur.

### Conditional Probability

P(B|A) = P(A and B)/P(A). P(B|A) is read "the probability of B given A."
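
The formula can be checked by counting outcomes. This sketch assumes one roll of a fair six-sided die; the events and the helper `prob` are hypothetical names used only for illustration:

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair die (assumption)
A = {4, 5, 6}                    # event: the roll is greater than 3
B = {2, 4, 6}                    # event: the roll is even


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


# P(B given A) = P(A and B) / P(A)
p_b_given_a = prob(A & B) / prob(A)
print(p_b_given_a)
```

Knowing that the roll is greater than 3 leaves three possible outcomes, two of which are even.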

### Disjoint (mutually exclusive)

Two events are disjoint if they share no outcomes in common. If A and B are disjoint, then knowing that A occurs tells us that B cannot occur. Disjoint events are also called "mutually exclusive."

### Event

A collection of outcomes. Usually, we identify events in order to attach probabilities to them. We denote events with bold capital letters such as A, B, or C.

### General Addition Rule

For any two events, A and B, the probability of A or B is... P(A U B) = P(A) + P(B) - P(A and B).

### General Multiplication Rule

For any two events, A and B, the probability of A and B is P(A and B) = P(A) x P(B|A).

### Independence (informally)

Two events are independent if knowing whether one event occurs does not alter the probability of the other event occurring.

### Independence (used casually)

Two events are independent if knowing whether one event occurs does not alter the probability that the other event occurs.

### Law of Large Numbers

States that the long-run relative frequency of repeated independent events gets closer and closer to the true relative frequency as the number of trials increases.

### Legitimate probability assignment

An assignment of probabilities to outcomes is legitimate if...

a) each probability is between 0 and 1 (inclusive).

b) the sum of the probabilities is 1.

### Multiplication Rule, Upside down U

If A and B are independent events, then the probability of A and B is P(A and B) = P(A) x P(B).

### Outcome

The outcome of a trial is the value measured, observed, or reported for an individual instance of that trial.

Outcomes are considered to be either

a) discrete if they have distinct values such as heads or tails (even if the values are numerals)

b) continuous if they take on numeric values in some range of possible values

### Probability

The probability of an event is a number between 0 and 1 that reports the likelihood of the event's occurrence. A probability can be derived from equally likely outcomes, from the long-run proportion of the event's occurrence, or from known proportions. We write P(A) for the probability of the event A.

### Random Phenomenon

A phenomenon is random if we know what outcomes could happen, but not which particular values did or will happen.

### Sample Space

The collection of all possible outcome values. The sample space has a probability of 1.

### Tree Diagram

A display of conditional events or probabilities that is helpful in thinking through conditioning.

### binomial probability distribution function

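The probability of obtaining x successes in n independent trials of a binomial experiment is given by

P(x) = nCx • p^x • (1 - p)^(n - x), for x = 0, 1, 2, ..., n

where p is the probability of success on each trial.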
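
A minimal Python sketch, assuming the standard binomial formula P(x) = nCx • p^x • (1 - p)^(n - x). The function name `binomial_pmf` is an illustrative choice; `math.comb` supplies the number of combinations nCx:

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


# Probability of exactly 2 heads in 3 tosses of a fair coin.
print(binomial_pmf(2, 3, 0.5))

# The probabilities for x = 0, 1, ..., n add up to (approximately) 1.
print(sum(binomial_pmf(x, 10, 0.3) for x in range(11)))
```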

### continuous random variable

Has infinitely many values. The values can be plotted on a line in an uninterrupted fashion.

### criteria for a binomial probability experiment

1. The experiment is performed a fixed number of times. Each repetition is called a trial.

2. The trials are independent. The outcome of one trial will not affect the outcome of the other trials.

3. For each trial, there are two mutually exclusive (disjoint) outcomes: success or failure.

4. The probability of success is the same for each trial of the experiment.

### discrete random variable

Has either a finite or countable number of values. The values can be plotted on a number line with space between each point.

### interpretation of the mean of a discrete random variable

Suppose an experiment is repeated n independent times and the value of the random variable X is recorded each time. As the number of repetitions increases, the mean of the n recorded values approaches μₓ, the mean of the random variable X.

x̄ = (x₁ + x₂ + ⋯ + xₙ)/n

The difference between x̄ and μₓ gets closer to 0 as n increases.
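This long-run behavior can be illustrated with a small simulation. The sketch below assumes a fair six-sided die (mean 3.5), which is not part of the source, and uses a fixed seed so the run is repeatable. For any single run the gap is not guaranteed to shrink at every step, only to tend toward 0.

```python
import random

MU_X = 3.5  # mean of one roll of a fair six-sided die (assumed example)

def sample_mean(n: int, seed: int = 0) -> float:
    """Mean of n simulated die rolls."""
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return sum(rolls) / n

# Gap between the sample mean and the true mean for growing n.
for n in (10, 1_000, 100_000):
    print(n, abs(sample_mean(n) - MU_X))
```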

### mean and standard deviation of a binomial random variable

a binomial experiment with n independent trials and probability of success p has a mean and standard deviation given by the formulas

μₓ = np and σₓ = √(np(1 − p))
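A short sketch of these two formulas. The function name `binomial_mean_sd` and the values n = 100, p = 0.5 are assumptions chosen for illustration.

```python
from math import sqrt

def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Mean and standard deviation of a binomial random variable."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))  # the square root covers the whole product
    return mean, sd

# 100 tosses of a fair coin: the expected number of heads and its spread.
mean, sd = binomial_mean_sd(100, 0.5)
print(mean, sd)
```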

### where x is the value of the random variable and P(x) is the probability of observing the variable x.

...

### notation used in binomial probability distribution

1. There are n independent trials of the experiment.

2. p denotes the probability of success for each trial so that 1-p is the probability of failure for each trial.

3. X denotes the number of successes in n independent trials of the experiment, so 0 ≤ x ≤ n.

### probability distribution

Provides the possible values of the random variable and their corresponding probabilities. Can be in the form of a table, graph, or mathematical formula.

### probability histogram

a histogram in which the horizontal axis corresponds to the value of the random variable and the vertical axis represents the probability of each value of the random variable

### random variable

A numerical measure of the outcome of a probability experiment, so its value is determined by chance. Random variables are typically denoted using capital letters such as X.

### rules for a discrete probability distribution

Let P(x) denote the probability that the random variable X equals x. Then

1. ∑P(x) = 1

2. 0 ≤ P(x) ≤ 1
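Both rules can be checked mechanically. In the sketch below, the helper name `is_valid_distribution` is hypothetical, and `math.isclose` is used for the sum because floating-point probabilities rarely add to exactly 1.

```python
from math import isclose

def is_valid_distribution(probs) -> bool:
    """True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    return in_range and isclose(sum(probs), 1.0)

print(is_valid_distribution([0.2, 0.5, 0.3]))   # satisfies both rules
print(is_valid_distribution([0.6, 0.6]))        # sums to more than 1
print(is_valid_distribution([-0.1, 1.1]))       # values outside [0, 1]
```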

### If you can do one task n number of ways and a second m number of ways, then both tasks can be done in n*m ways.

Multiplication principle

### mean of X = multiply each possible value by its probability and then add it up

Mean of a discrete random variable
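As a sketch of this calculation, the distribution below is the number of heads in two fair coin tosses, an assumed example. The name `discrete_mean` is illustrative.

```python
def discrete_mean(dist: dict) -> float:
    """Sum of value * probability over all possible values."""
    return sum(x * p for x, p in dist.items())

# Number of heads in two tosses of a fair coin: value -> probability.
heads = {0: 0.25, 1: 0.5, 2: 0.25}
mean_heads = discrete_mean(heads)
print(mean_heads)
```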

### random variable with either a finite or a countable number of possible values

Discrete random variable

### takes all values in an interval of numbers, described by a density curve.

Continuous random variable

### Two events A and B are disjoint if they have no outcomes in common. P(A or B)=P(A)+P(B)

Probability rule 3

### Disjoint/mutually exclusive

two events that have no outcomes in common (Cannot occur simultaneously)

### Probability Model

a mathematical description of a random phenomenon consisting of a sample space and a way of assigning probabilities

### The Multiplication Principle

if event A has a possible outcomes and event B has b possible outcomes, then BOTH events considered together have (a*b) outcomes.
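The count can be confirmed by listing every pairing. The sketch below uses `itertools.product` on two made-up lists (3 shirts and 2 pants, not from the source).

```python
from itertools import product

shirts = ["red", "blue", "green"]   # a = 3 possible outcomes
pants = ["jeans", "khakis"]         # b = 2 possible outcomes

# Every (shirt, pants) pairing, listed explicitly.
outfits = list(product(shirts, pants))
print(len(outfits), len(shirts) * len(pants))
```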

### How do you standardize a normal distribution: P(r < X < s)

Z = (X − μ)/σ, so P(r < X < s) = P((r − μ)/σ < Z < (s − μ)/σ)
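A minimal sketch of this standardization, using `statistics.NormalDist` from the Python standard library for the standard normal CDF. The function name and the values (mean 100, standard deviation 15) are assumptions for illustration.

```python
from statistics import NormalDist

def normal_interval_prob(r: float, s: float, mu: float, sigma: float) -> float:
    """P(r < X < s) for X normal with mean mu and standard deviation sigma."""
    z_r = (r - mu) / sigma          # standardized lower bound
    z_s = (s - mu) / sigma          # standardized upper bound
    std_normal = NormalDist()       # mean 0, standard deviation 1
    return std_normal.cdf(z_s) - std_normal.cdf(z_r)

# Within one standard deviation of the mean: roughly 68%.
p = normal_interval_prob(85, 115, mu=100, sigma=15)
print(round(p, 4))
```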

### Joint Distribution Cumulative Distribution:

$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(s, t)\, dt\, ds$

### Negative Binomial Distribution:

Independent trials are performed repeatedly until the r-th success; X is the number of failures observed before that success.
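With X counting failures before the r-th success and p the success probability on each independent trial, the usual probability function is P(X = k) = C(k + r − 1, k) · p^r · (1 − p)^k. The sketch below assumes that convention; other texts count total trials instead. The function name is illustrative.

```python
from math import comb

def neg_binomial_pmf(k: int, r: int, p: float) -> float:
    """P(exactly k failures occur before the r-th success)."""
    return comb(k + r - 1, k) * p**r * (1 - p)**k

# With r = 1 this reduces to the geometric case: k failures, then a success.
print(neg_binomial_pmf(2, 1, 0.5))   # 0.5**3
```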

### blinding

is a technique where the subject does not know whether he or she is receiving a treatment or a placebo.(1.3)

### cluster sample

divide the population into groups, called clusters, and select all of the members in one or more (but not all) of the clusters.(1.3)

### completely randomized design

subjects are assigned to different treatment groups through random selection.(1.3)

### confounding variable

occurs when an experimenter cannot tell the difference between the effects of different factors on a variable.(1.3)

### cumulative frequency

is the sum of the frequency for that class and all previous classes. The cumulative frequency of the last class is equal to the sample size n.(2.1)
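A running total does exactly this. The class frequencies below are made up for illustration.

```python
from itertools import accumulate

frequencies = [3, 7, 5, 5]                  # assumed class frequencies
cumulative = list(accumulate(frequencies))  # running totals, class by class

# The last cumulative frequency equals the sample size n.
print(cumulative, sum(frequencies))
```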

### descriptive statistics

is the branch of statistics that involves the organization, summarization, and display of data.(1.1)

### double-blind experiment

neither the subject nor the experimenter knows if the subject is receiving a treatment or a placebo. The experimenter is informed after all the data have been collected. This type of experimental design is preferred by researchers.(1.3)

### frequency distribution

is a table that shows classes or intervals of data entries with a count of the number of entries in each class.(2.1)

### inferential statistics

is the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability.(1.1)

### interval level of measurement

data at this measurement can be ordered, and you can calculate meaningful differences between data entries. At the interval level, a zero entry simply represents a position on a scale; the entry is not an inherent zero.(1.2)

### matched-pairs design

where subjects are paired up according to a similarity. One subject in the pair is randomly selected to receive one treatment while the other subject receives a different treatment.(1.3)

### midpoint

is the sum of the lower and upper limits of the class divided by two. It is sometimes called the class mark and is calculated as [(lower class limit)+(upper class limit)]/2.(2.1)

### nominal level of measurement

data at this level are qualitative only. Data at this level are categorized using names, labels, or qualities. No mathematical computations can be made at this level.(1.2)

### observational study

a researcher observes and measures characteristics of interest of part of a population but does not change existing conditions.(1.3)

### ordinal level of measurement

data at this level are qualitative or quantitative. Data at this level can be arranged in order, or ranked, but differences between data entries are not meaningful.(1.2)

### placebo effect

occurs when a subject reacts favorably to a placebo when in fact, he or she has been given no medicated treatment at all.(1.3)

### population

is the collection of all outcomes, responses, measurements, or counts that are of interest.(1.1)

### random sample

is one in which every member of the population has an equal chance of being selected.(1.3)

### randomized block design

divide subjects with similar characteristics into blocks, and then within each block, randomly assign subjects to treatment groups.(1.3)

### ratio level of measurement

data at this measurement are similar to data at the interval level, with the added property that a zero entry is an inherent zero. A ratio of two data values can be formed so that one data value can be meaningfully expressed as a multiple of another.(1.2)

### relative frequency

is the portion or percentage of the data that falls in that class. To find the relative frequency of a class, divide the frequency f by the sample size n.(2.1)
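As a brief sketch, using made-up class frequencies:

```python
frequencies = [3, 7, 5, 5]   # assumed class frequencies f
n = sum(frequencies)         # sample size

# Relative frequency of each class: f / n.
relative = [f / n for f in frequencies]
print(relative)
```

The relative frequencies of all classes add up to 1 (or 100%).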

### sampling

is a count or measure of a part of a population, and is more commonly used in statistical studies.(1.3)

### simple random sample

is a sample in which every possible sample of the same size has the same chance of being selected.(1.3)

### simulation

is the use of a mathematical or physical model to reproduce the conditions of a situation or process.(1.3)

### Statistics

is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.(1.1)

### stratified sample

members of the population are divided into two or more subsets, called strata, that share a similar characteristic such as age, gender, ethnicity or even political preference. A sample is then randomly selected from each of the strata, which ensures that each segment of the population is represented.(1.3)

### systematic sample

is a sample in which each member of the population is assigned a number. The members of the population are ordered in some way, a starting number is randomly selected, and then sample members are selected at regular intervals from the starting number.(1.3)
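A minimal sketch of the procedure, assuming a population of 100 numbered members and an interval of 10. The function name `systematic_sample` and its parameters are hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k members, then every k-th member."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random starting position, 0 to k - 1
    return population[start::k]

members = list(range(1, 101))  # members numbered 1 to 100
sample = systematic_sample(members, k=10, seed=1)
print(sample)
```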

### Complement Rule

the probability of an event occurring is 1 minus the probability that it doesn't occur
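The rule is most useful for "at least one" questions. The sketch below uses an assumed example, four rolls of a fair die, and finds the probability of at least one six by subtracting the probability of no sixes from 1.

```python
from fractions import Fraction

# Probability that none of four independent rolls shows a six.
p_no_six = Fraction(5, 6) ** 4

# Complement rule: P(at least one six) = 1 - P(no six).
p_at_least_one_six = 1 - p_no_six
print(p_at_least_one_six)
```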

### Law of Large Numbers

a statistical law stating that as the number of trials increases, the observed relative frequency of an outcome will more closely approach the theoretical probability of that outcome.

### Probability

the likelihood that a possible future event will occur in any given instance of the event; as a mathematical ratio: (number of outcomes you want to happen) / (total number of outcomes that could happen)

### Bar Graph

A type of graph which uses bars to show the differences or similarities in different sets of data.

### Boxplot

displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values

### Dotplot

A type of graph where dots are used to represent data. The distribution of the dots can highlight similarities or differences in the data.

### Five number summary

The smallest observation, the first quartile, the median, the third quartile, and the largest observation. Usually written from smallest to largest.

### Median

The number in a set of data that half the observations are smaller than and the other half of the observations are larger than.

### Overall pattern

The overall pattern of a graph is the basic trend the graph shows and can lead to a general explanation of the data presented.

### Pie Chart

A type of graph which uses a 2 dimensional circle. Sections of the circle are filled out to show differences or similarities in data.

### Quartiles

The medians of the two halves of a set of data after the median of the entire set is found.
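The "medians of the two halves" idea can be sketched as follows. The code assumes one common convention: when the number of observations is odd, the overall median is left out of both halves. Other conventions, including those used by some software, give slightly different quartiles. The function name is illustrative and the data are made up.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = xs[half + 1:] if len(xs) % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

summary = five_number_summary([1, 3, 5, 7, 9, 11, 13])
print(summary)
```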