stats 121

Created by finedinebizzle77 


44 terms · test 2

Bivariate data

measurements of two variables on each individual in a study (an explanatory variable and a response variable)

Direction of relationship

positive if Y increases as X increases
negative if Y decreases as X increases

Explanatory variable

denoted as X; it is used to predict Y

Response variable

measures outcome on each individual, denoted as Y

Form of relationship

strength
direction
form (linear or non-linear)

Correlation coefficient

denoted by r, a number that gives the direction and strength of a linear relationship between two quantitative variables

properties for r

both variables must be quantitative
sign of r denotes direction
r is between -1 and +1
no unit of measure
is affected by outliers
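
The properties above can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `correlation` and the hours/scores data are made up for illustration, and r is computed as the average product of standardized x and y values (dividing by n - 1).

```python
import math

def correlation(xs, ys):
    """Sample correlation r between two quantitative variables."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Average product of the standardized values.
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 75]

r = correlation(hours, scores)
# Changing units (hours -> minutes) should leave r unchanged.
r_minutes = correlation([h * 60 for h in hours], scores)
# One extreme point can change r a lot.
r_with_outlier = correlation(hours + [6], scores + [10])

print(f"r = {r:.3f}, r (minutes) = {r_minutes:.3f}, r with outlier = {r_with_outlier:.3f}")
```

The sign of `r` matches the direction of the pattern, the value stays between -1 and +1, and rescaling x does not change it because r has no unit of measure.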

Statistical model

an equation that fits the pattern between a response variable and explanatory variable, accounting for deviations in the model

statistical equation

Y-hat=a+bx
a=intercept
b=slope

residuals

prediction errors
y-(y-hat)=prediction error
vertical distance from observed y to the line
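
As a small illustration of the definition, the sketch below computes residuals for a handful of points. The data and the line (a = 2.2, b = 0.6) are hypothetical values chosen only for this example.

```python
# Hypothetical data and a hypothetical fitted line y-hat = a + b*x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # assumed intercept and slope, for illustration only

def residuals(xs, ys, a, b):
    """Return observed y minus predicted y-hat for each point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

res = residuals(xs, ys, a, b)
for x, y, e in zip(xs, ys, res):
    print(f"x={x}  y={y}  y-hat={a + b * x:.1f}  residual={e:+.1f}")
```

A positive residual means the observed point lies above the line; a negative residual means it lies below.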

least-squares regression line

the line for which the sum of squared errors (SSE) is the smallest:
SSE = Σ(y - y-hat)^2

a=

(y-bar) - b(x-bar)

b=

r(standard deviation of y / standard deviation of x)
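
The two formulas for a and b can be sketched in code as follows. This is an illustrative example under stated assumptions: the function names `least_squares` and `sse` and the data are made up here, and sample standard deviations (dividing by n - 1) are used.

```python
import math

def least_squares(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r*(s_y/s_x), a = y-bar - b*x-bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b

def sse(xs, ys, a, b):
    """Sum of squared prediction errors for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x, SSE = {sse(xs, ys, a, b):.2f}")
```

Nudging either `a` or `b` away from the returned values and recomputing `sse` gives a larger sum, which is what "least squares" refers to.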

correlation

measures direction and strength of linear association between X and Y

regression line

models the linear relationship and can be used to make predictions for y values

facts about regression line

a change of one standard deviation in x corresponds to a change of r standard deviations in the predicted y
regression line passes through the point (x-bar, y-bar)

r^2

it tells us the percentage of variation in Y that is explained by the least-squares regression line...
or.... it is a measure of how successfully the regression explains the response y
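
One way to see r^2 as "percentage of variation explained" is to compare the variation left over after fitting the line (SSE) with the total variation of y around y-bar. The sketch below assumes hypothetical data and uses a = 2.2, b = 0.6, which is the least-squares line for these particular points; the name `r_squared` is made up for this example.

```python
def r_squared(xs, ys, a, b):
    """Fraction of the variation in y explained by the line y-hat = a + b*x."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation in y
    return 1 - sse / sst

# Hypothetical data with its least-squares line (a = 2.2, b = 0.6).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
explained = r_squared(xs, ys, 2.2, 0.6)
print(f"r^2 = {explained:.2f}")
```

For a least-squares line this value equals the square of the correlation coefficient r.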

residual plot

a scatterplot of the residuals

residual plot diagnostics

smile or frown shape: means there is a non-linear relationship
megaphone: indicates non-constant variation (variation in y is dependent on x)
shoebox: a point outside the box indicates an outlier in the x or y direction

influential observation

an observation that if removed would change the regression line slope and y-intercept noticeably
-outliers in the x direction are often influential
-influential observations may have small residuals
-not all outliers are influential observations

extrapolation

predicting y for an x value that is outside the range of observed x values

drawbacks of observational studies

-cannot systematically change x to observe y
-cannot randomize
-cannot establish causation; only correlation or association

to display categorical data

use a table
-explanatory variable is the row
-response variable is the column

margins of a table

show the totals for each column and each row
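
As an illustration, the margins of a two-way table can be computed like this. The table layout (rows for the explanatory variable, columns for the response) follows the cards above; the category names and counts are hypothetical.

```python
# Hypothetical counts: rows = explanatory variable, columns = response variable.
table = {
    "treatment": {"improved": 30, "not improved": 20},
    "placebo":   {"improved": 18, "not improved": 32},
}

# Row margins: total for each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column margins: total for each column.
col_totals = {}
for cells in table.values():
    for col, count in cells.items():
        col_totals[col] = col_totals.get(col, 0) + count

grand_total = sum(row_totals.values())
print(row_totals, col_totals, grand_total)
```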

Simpson's paradox

demonstrates that a great deal of care has to be taken when combining small data sets into a large one. Sometimes the conclusions from the large data set are exactly the opposite of the conclusions from the smaller sets. Unfortunately, the conclusions from the large set are also usually wrong.
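
A small numeric sketch can make the reversal concrete. The counts below are invented purely to produce the effect; they do not come from any real study. Option A has the higher success rate within each group, yet option B has the higher rate once the groups are combined.

```python
# Hypothetical (successes, trials) counts, made up to show the reversal.
groups = {
    "group 1": {"A": (9, 10),   "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    """Success proportion."""
    return successes / trials

# Compare A and B within each small data set.
for name, g in groups.items():
    print(f"{name}: A = {rate(*g['A']):.2f}, B = {rate(*g['B']):.2f}")

# Combine the small data sets into one large one.
combined = {}
for option in ("A", "B"):
    successes = sum(g[option][0] for g in groups.values())
    trials = sum(g[option][1] for g in groups.values())
    combined[option] = (successes, trials)

print(f"combined: A = {rate(*combined['A']):.2f}, B = {rate(*combined['B']):.2f}")
```

The reversal happens because the two options are spread unevenly across the groups: most of A's trials fall in the group with low success rates, and most of B's fall in the group with high ones.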

rules of data analysis

-always plot your data
-always describe shape, center, spread of distributions

measures affected by outliers

-mean
-standard deviation
-correlation
-r^2
-slope and y-intercept

random phenomenon

the outcome of one play is unpredictable, but the outcome of many plays forms a distribution and then we can make a prediction

probability of an outcome

proportion of times an outcome occurs over many repetitions of plays (out of the total number of plays)

Random doesn't mean haphazard

fact!

probability =

(# of outcomes in the event of interest)/(count of outcomes in sample space)
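
This counting formula applies when every outcome in the sample space is equally likely. A minimal sketch, assuming two fair six-sided dice and the event "the sum is 7" (an example chosen here, not taken from the cards):

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered pair of faces from two dice (assumed equally likely).
sample_space = list(product(range(1, 7), repeat=2))

# Event of interest: the two faces sum to 7.
event = [outcome for outcome in sample_space if sum(outcome) == 7]

p = Fraction(len(event), len(sample_space))
print(f"{len(event)} / {len(sample_space)} = {p}")
```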

probability rules

-every probability is between 0 and 1
-sum of all probabilities must equal 1
-the probability that an event will NOT occur = 1 - the probability that it will occur
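
The three rules can be written as simple checks. This is only a sketch: the helper names `is_valid_distribution` and `complement` are made up for illustration, and a fair die is assumed as the example. Exact fractions are used so that the "sum equals 1" check is not affected by rounding.

```python
from fractions import Fraction

def is_valid_distribution(probs):
    """Rules 1 and 2: each probability is in [0, 1] and they sum to 1."""
    values = list(probs.values())
    return all(0 <= p <= 1 for p in values) and sum(values) == 1

def complement(p):
    """Rule 3: probability that the event does NOT occur."""
    return 1 - p

# Assumed example: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(is_valid_distribution(die))
p_not_six = complement(die[6])
print(p_not_six)
```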

disjoint events

two events that have no outcomes in common and, thus, cannot both occur simultaneously.

you cannot state a parameter in statistics without saying mean or proportion

fact 2

parameter

a number describing a characteristic of a population

statistic

a number computed from sample data, estimating an unknown parameter

Law of large numbers

If a population has a finite mean mu and x-bar is used to estimate mu,
then....
as sample size increases, x-bar gets closer to mu
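
A quick simulation can illustrate the idea. The sketch below assumes rolls of a fair six-sided die (so mu = 3.5) and a fixed random seed for repeatability; the function name `running_means` is made up for this example, and the exact values printed depend on the seed.

```python
import random

def running_means(n_rolls, checkpoints, seed=0):
    """Roll a fair die n_rolls times and record x-bar at the given sample sizes."""
    rng = random.Random(seed)
    total = 0
    means = {}
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        if i in checkpoints:
            means[i] = total / i
    return means

checkpoints = {10, 100, 1_000, 10_000, 100_000}
means = running_means(100_000, checkpoints)
for n in sorted(means):
    print(f"n = {n:>6}: x-bar = {means[n]:.3f}  (mu = 3.5)")
```

For small n the sample mean can wander noticeably; by the largest sample size it should sit very close to 3.5.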

sampling distribution of x-bar

the distribution of all x-bar values from all possible samples of the same size from a population
-mean of x-bar = mu
-standard deviation of x-bar = standard deviation of pop/square root of n
-standard deviation of x-bar is always less than pop standard deviation when n>1
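
The square-root rule is easy to tabulate. A minimal sketch, assuming a hypothetical population standard deviation of 10; the function name `sd_of_x_bar` is made up for this example.

```python
import math

def sd_of_x_bar(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 10  # hypothetical population standard deviation
for n in (1, 4, 25, 100):
    print(f"n = {n:>3}: sd of x-bar = {sd_of_x_bar(sigma, n):.1f}")
```

Quadrupling the sample size only halves the standard deviation of x-bar, because n sits under a square root.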

Central limit theorem

if you take a large SRS of size n from any population, the sampling distribution of x-bar is approximately normal; its shape gets more normal as n increases

for x-bar... Z=

(x-bar - mu)/(standard deviation of x-bar)
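
A worked example of this standardization, as a sketch: the numbers (mu = 100, population standard deviation 15, n = 9, observed x-bar = 110) are hypothetical, the function name `z_for_x_bar` is made up, and the tail probability assumes the sampling distribution of x-bar is approximately normal.

```python
import math
from statistics import NormalDist

def z_for_x_bar(x_bar, mu, sigma, n):
    """Standardize a sample mean using the standard deviation of x-bar."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

# Hypothetical values: sd of x-bar = 15 / sqrt(9) = 5, so z = (110 - 100) / 5.
z = z_for_x_bar(x_bar=110, mu=100, sigma=15, n=9)

# Probability of a sample mean at least this large, assuming normality.
p_above = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P(x-bar >= 110) is about {p_above:.4f}")
```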

control limits =

mu plus or minus 3 × (standard deviation of x-bar)

center line =

mu

out-of-control signals

-one point above or below control limits
-9 points in a row on the same side of the centerline
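
The control limits and the two out-of-control signals can be sketched together. The function names, the process values (mu = 50, population standard deviation 4, samples of size 4), and the sample means below are all hypothetical. The run rule here reports the point at which a run first reaches 9; that is one reasonable reading of the rule rather than a standard fixed by the cards.

```python
import math

def control_limits(mu, sigma, n):
    """Lower and upper control limits: mu -/+ 3 * (sigma / sqrt(n))."""
    se = sigma / math.sqrt(n)
    return mu - 3 * se, mu + 3 * se

def out_of_control_signals(x_bars, mu, sigma, n, run_length=9):
    """Return (index, reason) pairs for the two signals described above."""
    lower, upper = control_limits(mu, sigma, n)
    signals = []
    run_side, run_count = 0, 0
    for i, x in enumerate(x_bars):
        # Signal 1: a point above or below the control limits.
        if x < lower or x > upper:
            signals.append((i, "outside control limits"))
        # Signal 2: run_length points in a row on the same side of the center line.
        side = (x > mu) - (x < mu)  # +1 above, -1 below, 0 exactly on the line
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side, run_count = side, (1 if side != 0 else 0)
        if run_count == run_length:
            signals.append((i, f"{run_length} in a row on one side"))
    return signals

# Hypothetical process: limits are 50 -/+ 3 * (4 / sqrt(4)) = 44 and 56.
print(control_limits(50, 4, 4))
print(out_of_control_signals([51, 49, 57, 48], 50, 4, 4))
print(out_of_control_signals([51] * 9, 50, 4, 4))
```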

random variable

is a variable whose value is a numerical outcome of a random phenomenon
