# Econ Stat First exam

## 58 terms

### Statistics

The science of collecting, organizing, analyzing, interpreting, and presenting data.

### Statistic

A single measure used to summarize a sample data set.

### Descriptive Statistics

The collection, organization, presentation, and summary of data.

### Inferential Statistics

Generalizing from a sample to a population: estimating unknown parameters, drawing conclusions, and making decisions.

### Empirical Data

Data collected through observation and experiments.

### Pitfall 1

Making conclusions about a large population from a small sample

### Pitfall 2

Interpreting a correlation as a specific causal link, when the possibilities include A causes B, B causes A, or some third factor causes both.

### Pitfall 3

Generalization to individuals: applying a generalization about a group to any one individual in it.

### Pitfall 4

Significance versus importance: a statistically significant effect may be too small to matter in practice.

### Data Set

A particular collection of data values as a whole.

### Subject (or individual)

An item for study (e.g., an employee in your company).

### Variable

A characteristic of the subject or individual (e.g., an employee's income).

each data value

### Univariate

One variable. (Histograms, descriptive statistics, frequency tallies)

### Bivariate

Two variables. (Scatter plots, correlations, regression modeling)

### Multivariate

More than two variables. (Multiple regression, data mining, econometric modeling)

### Time Series Data

Each observation in the sample represents a different equally spaced point in time.

### Cross Sectional Data

Each observation represents a different individual unit (e.g., a person) at the same point in time.

### Nominal Measurement

Qualitative, attribute, categorical, or classification data, which can be coded numerically (e.g., 1 = Apple, 2 = Compaq, 3 = Dell, 4 = HP).

### Ordinal Measurement

Ordinal data codes can be ranked (e.g., 1 = Frequently, 2 = Sometimes, 3 = Rarely, 4 = Never).

### Interval Measurement

Data can not only be ranked but also have meaningful intervals between scale points (e.g., the difference between 60 and 70 is the same as the difference between 20 and 30).

### Ratio Measurement

Has all the properties of the other three data types, plus a meaningful zero point, so ratios of values are meaningful.

### How stats are computed

From a sample of n items, chosen from a population of N items

### Simple Random Sample

Every item in the population of N items has the same chance of being chosen in the sample of n items.
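
A minimal sketch of drawing a simple random sample, assuming the population can be held in a list. The function name `simple_random_sample`, the `seed` parameter, and the population of 100 item IDs are made up for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; each item is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the illustration repeatable
    return rng.sample(population, n)


# Hypothetical population of N = 100 item IDs.
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```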

### Systematic Sampling

Sample by choosing every kth item from a list, starting from a randomly chosen entry on the list.
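
A small sketch of systematic sampling, assuming the list fits in memory and the random start is picked among the first entries. The step is called `k` here; the function name and the list of 100 IDs are hypothetical.

```python
import random


def systematic_sample(items, k, seed=None):
    """Pick a random start among the first k entries, then take every kth item."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random index in 0 .. k-1
    return items[start::k]


# Hypothetical list of 100 IDs, sampled with a step of 10.
items = list(range(1, 101))
sample = systematic_sample(items, k=10, seed=1)
print(sample)
```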

### Stratified Sampling

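Utilizes prior information about the population: the population is divided into relatively homogeneous subgroups (strata), and each subgroup is sampled separately.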

### Stratum

A relatively homogeneous subgroup of the population. Within each stratum, a simple random sample of the desired size is taken.
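
One way to picture stratified sampling is to take a separate simple random sample inside each stratum. The sketch below assumes equal sample sizes per stratum, which is only one possible allocation; the strata names and sizes are invented.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }


# Hypothetical strata: employee IDs grouped by department.
strata = {
    "sales": list(range(0, 50)),
    "engineering": list(range(50, 80)),
}
picked = stratified_sample(strata, n_per_stratum=5, seed=7)
print(picked)
```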

### One Stage Cluster

Sample consists of all elements in each of k randomly chosen sub regions (clusters)

### Two Stage Cluster

First choose k sub regions (clusters), then choose a random sample of elements within each cluster.

### Judgement Sample

A non-probability sampling method that relies on expertise of the sampler to choose items that are representative of the population.

### Quota Sampling

A special kind of judgement sampling in which the interviewer chooses a certain number of people in each category.

### Convenience Sample

Take advantage of whatever sample is available at that moment. Quick way to sample.

### Focus Group

A panel of individuals chosen to be representative of a wider population.

### Frequency Distribution

A table formed by classifying n data values into k classes (bins)

### Bin Limits

Define the values to be included in each bin. Widths must all be the same except when we have open ended bins.

### Frequencies

The number of observations within each bin.
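
The bins and frequencies described above can be sketched as a small tally. This assumes equal-width bins that include their lower limit and exclude their upper limit; the data values and the function name are illustrative only.

```python
def frequency_distribution(values, lower, width, k):
    """Count observations in k equal-width bins [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * k
    for v in values:
        i = int((v - lower) // width)  # index of the bin that v falls into
        if 0 <= i < k:                 # values outside all bins are ignored
            counts[i] += 1
    return counts


# Hypothetical data: n = 10 values classified into k = 4 bins of width 10.
data = [12, 15, 21, 22, 29, 33, 35, 38, 41, 47]
counts = frequency_distribution(data, lower=10, width=10, k=4)
print(counts)  # [2, 3, 3, 2]
```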

### Sturges Rule

k = 1 + 3.3 × log10(n), where n is the number of observations and k is the suggested number of bins.
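
A quick sketch of the rule, assuming the logarithm is base 10 and that the result is rounded up to a whole number of bins (rounding to a nearby convenient value is also common).

```python
import math


def sturges_k(n):
    """Suggested number of bins before rounding: 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)


def sturges_bins(n):
    """Round the suggestion up to a whole number of bins (an assumption of this sketch)."""
    return math.ceil(sturges_k(n))


print(sturges_k(100))     # about 7.6
print(sturges_bins(100))  # 8
```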

### Chebyshev's Theorem

For any population with mean μ and standard deviation σ, the percentage of observations that lie within k standard deviations of the mean must be at least 100[1 - 1/k^2], for k > 1.
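
As a worked illustration, the bound can be evaluated for a few values of k. The function name is made up; the formula is the one stated above.

```python
def chebyshev_min_percent(k):
    """Minimum percent of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 100 * (1 - 1 / k**2)


print(chebyshev_min_percent(2))  # 75.0
print(chebyshev_min_percent(3))  # about 88.9
```

So at least 75% of any data set lies within two standard deviations of its mean, whatever the shape of the distribution.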

### Empirical Rule

States that for data from a normal distribution, we expect the interval μ ± kσ to contain a known percentage of the data: about 68% for k = 1, 95% for k = 2, and 99.7% for k = 3.
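
The percentages behind the rule can be checked against the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name is hypothetical.

```python
from statistics import NormalDist


def normal_percent_within(k):
    """Percent of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return 100 * (z.cdf(k) - z.cdf(-k))


for k in (1, 2, 3):
    print(k, round(normal_percent_within(k), 2))
```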

### Outliers

Observations whose distance from the mean is at least three times as large as the standard deviation.

### Standardized Variable

Redefines each observation in terms of its distance from the mean in "standard deviations": z = (x - mean) / standard deviation.
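
A short sketch of standardizing a sample, assuming the sample mean and sample standard deviation are used. The data values are invented.

```python
from statistics import mean, stdev


def z_scores(data):
    """Express each observation as its distance from the mean in standard deviations."""
    m, s = mean(data), stdev(data)  # sample mean and sample standard deviation
    return [(x - m) / s for x in data]


# Hypothetical sample with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)
print([round(v, 2) for v in z])
```

An observation equal to the mean gets z = 0, and the standardized values themselves have mean 0 and standard deviation 1.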

### Percentiles

Data that have been divided into 100 groups.

### Deciles

Data that has been divided into 10 groups.

### Quintiles

Data that has been divided into 5 groups.

### Quartiles

Data that has been divided into 4 groups.
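
Percentiles, deciles, quintiles, and quartiles differ only in the number of groups. A sketch with `statistics.quantiles`, which returns the cut points between groups (one fewer than the number of groups); note that textbooks and software use slightly different interpolation methods, so exact cut points can vary. The data are made up.

```python
from statistics import median, quantiles

# Hypothetical sample of 11 observations.
data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 1, 9]

quartile_cuts = quantiles(data, n=4)    # 3 cut points, 4 groups
quintile_cuts = quantiles(data, n=5)    # 4 cut points, 5 groups
decile_cuts = quantiles(data, n=10)     # 9 cut points, 10 groups

print(quartile_cuts)
print(median(data))  # the second quartile is the median
```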

### Random experiment

An observational process whose results cannot be known in advance.

### Event

Any subset of outcomes in the sample space.

### Elementary Event

A single outcome.

### Union of 2 Events

Consists of all outcomes in the sample space S that are contained either in event A or in event B or both.

P(A or B) = P(A) + P(B) - P(A and B)
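
The addition rule can be checked on a small example. This sketch assumes one roll of a fair six-sided die, with A = "even" and B = "greater than 3"; the events are chosen only for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space: one roll of a fair die


def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))


A = {2, 4, 6}  # even
B = {4, 5, 6}  # greater than 3

direct = prob(A | B)                          # count the union directly
by_rule = prob(A) + prob(B) - prob(A & B)     # P(A) + P(B) - P(A and B)
print(direct, by_rule)  # 2/3 2/3
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.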

### Mutually Exclusive Events

Events A and B do not intersect.

In the case of mutually exclusive events, P(A or B) = P(A) + P(B).

### Conditional Probability

The probability of event A given that event B has occurred.

### Multiplication Law for Independent Events

The probability of n independent events occurring simultaneously is P(A1 and A2 and ... and An) = P(A1) P(A2) ... P(An).
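
A minimal numeric illustration, assuming three independent components that each work with probability 0.9 (made-up figures):

```python
import math


def prob_all_occur(probabilities):
    """Probability that all of several independent events occur: the product of their probabilities."""
    return math.prod(probabilities)


# Hypothetical system: three independent components, each working with probability 0.9.
p_all_work = prob_all_occur([0.9, 0.9, 0.9])
print(round(p_all_work, 3))  # 0.729
```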

### Coefficient of Variation

Compares dispersion in data sets with different units of measurement or different means.
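
The usual formula is CV = 100 × (standard deviation / mean), which gives a unit-free percentage. The sketch below assumes the sample standard deviation and uses invented heights to show that changing units leaves the CV unchanged.

```python
from statistics import mean, stdev


def coefficient_of_variation(data):
    """Standard deviation as a percent of the mean (unit-free)."""
    return 100 * stdev(data) / mean(data)


# Hypothetical heights, first in centimeters, then converted to inches.
heights_cm = [150, 160, 170, 180, 190]
heights_in = [x / 2.54 for x in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_in = coefficient_of_variation(heights_in)
print(round(cv_cm, 2), round(cv_in, 2))  # same value in both units
```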

### Trimmed Mean

n = number of observations
k = percent to trim from each end; n × k = x. Remove the x smallest and the x largest observations, then take the mean of what remains.
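
A sketch of the trimming procedure, assuming `k` is given as a fraction trimmed from each end and that n × k is rounded down to a whole number of observations. The data are made up to include one extreme value.

```python
def trimmed_mean(data, k):
    """Mean after removing the x = int(n * k) smallest and x largest observations."""
    n = len(data)
    x = int(n * k)  # number of observations to drop from each end (rounded down)
    if 2 * x >= n:
        raise ValueError("trimming would remove every observation")
    kept = sorted(data)[x:n - x]
    return sum(kept) / len(kept)


# Hypothetical data with one extreme value; trim 10% from each end.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(sum(data) / len(data))     # ordinary mean: 14.5
print(trimmed_mean(data, 0.10))  # trimmed mean: 5.5
```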

### Midrange

Mr = (xMin + xMax)/2
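
A one-line illustration of the formula, with invented data. Because only the two extreme values are used, a single unusual observation moves the midrange a great deal.

```python
def midrange(data):
    """Average of the smallest and largest observations."""
    return (min(data) + max(data)) / 2


print(midrange([1, 2, 3, 4, 5, 6, 7, 8, 9]))       # 5.0
print(midrange([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))  # 50.5
```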

### Conditional Probability

P(A given B) = (P(A and B)) / P(B)
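
The formula can be worked through on a small example. This sketch assumes one roll of a fair six-sided die, with A = "even" and B = "greater than 3"; both events are illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space: one roll of a fair die


def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))


def conditional(a, b):
    """P(A given B) = P(A and B) / P(B); requires P(B) > 0."""
    return prob(a & b) / prob(b)


A = {2, 4, 6}  # even
B = {4, 5, 6}  # greater than 3
p_a_given_b = conditional(A, B)
print(p_a_given_b)  # 2/3
```

Knowing that B occurred shrinks the sample space to {4, 5, 6}, and two of those three outcomes are even.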