a set of procedures used by social scientists to organize, summarize and communicate information
information represented by numbers, which can be the subject of statistical analysis
Research Process or Social Research
a set of activities in which social scientists engage to answer questions, examine ideas, or test theories.
Ex: Has drug abuse increased during the last decade? What factors influence the economic mobility of female workers?
Research based on evidence that can be verified by using our direct experience.
an elaborate explanation of the relationship between two or more observable attributes of individuals or groups.
a tentative answer to a research problem.
A characteristic that differs or varies from one individual to another. The variable can be social class, monthly income, religion, gender, age, education, race, ethnicity, etc.
Unit of analysis
The level of social life on which social scientists focus. Examples of different levels are individuals (How old are you?) and groups (How many children are in the family?).
Dependent variable (Y)
the variable to be explained (the "effect"). It is always the property that you are trying to explain; it is always the object of research (the output we want to explain). Always what is being measured in response to what you change. Think about the "Why?"
Independent Variable (X)
the variable expected to account for ("the cause of") the dependent variable. It is always what is being manipulated or changed (the input). Ex: Race, Sex or Gender, Political affiliation, Religion, Income, Education, Poverty, location, social statistics.
Level of measurement
Nominal, Ordinal, Interval, Ratio - Four levels of measurement, each containing a differing amount of information, as follows: Nominal = category, Ordinal = rank, Interval = equal distance, Ratio = all of these plus a true 0 point
Involves naming, labeling, or classifying the observations. Ex. Gender, Political Party, Race, Religion.
Involves ranking ordered categories from low to high. Ex: Social class (upper, middle, lower); Strongly agree ... Strongly disagree.
Involves ordering and exact distance. Ex: Income, SAT scores, dollars, degrees, pounds, temperature
same as interval but includes an absolute or true zero point (age, hours worked, re-arrests)
The total set of individuals, objects, groups, or events in which research is interested.
a relatively small subset selected from a population.
Procedures that help organize and describe data collected from either a sample or a population. (Census, survey, administrative data)
Used to make predictions or inferences about a population from observations and analyses of a sample. Shows cause and effect relationships.
A table listing all categories (nominal/ordinal) or observed scores (interval/ratio) and the frequency (f) of each category or observed score. Tells me how many cases I have for each category I am interested in. (a way to see each variable in a category)
A relative frequency obtained by dividing the frequency in each category by the total number of cases. P=f/N
* When working with proportions, you cannot get a # higher than 1. If that happens, manipulate the highest # of cases. ** Only work with 1 decimal point.
a table showing the percentage of observations falling into each category of the variable. (Shows relative size)
a relative frequency obtained by dividing the frequency in each category by the total number of cases and multiplying by 100. % = (f/N) x 100
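A minimal Python sketch of the frequency, proportion, and percentage formulas above. The sample (party affiliation of 10 respondents) and the name `frequency_table` are made up for illustration and are not from the notes.

```python
from collections import Counter

# Hypothetical sample: political party of 10 respondents (a nominal variable)
responses = ["Dem", "Rep", "Dem", "Ind", "Dem", "Rep", "Ind", "Dem", "Rep", "Dem"]

def frequency_table(values):
    """Return {category: (f, proportion, percentage)} for a list of observations."""
    n = len(values)                      # N = total number of cases
    counts = Counter(values)             # f = frequency of each category
    return {cat: (f, f / n, f / n * 100) for cat, f in counts.items()}

table = frequency_table(responses)
for cat, (f, p, pct) in table.items():
    print(f"{cat}: f={f}, P={p:.2f}, %={pct:.1f}")
```

The proportions across all categories should add up to 1 and the percentages to 100, apart from rounding.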
Cumulative frequency distribution
a distribution showing the frequency at or below each category (class interval or score) of the variable.
Cumulative Percentage Distribution
A distribution showing the percentage at or below each category (class interval or score) of the variable. c%=(100)cf/N
**cf =sum of frequencies in that category + all lower category frequencies**
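A short Python sketch of cf and c% as defined above, assuming the categories are already ordered from lowest to highest. The social class counts are hypothetical.

```python
from itertools import accumulate

# Hypothetical ordinal data, listed from the lowest to the highest category
categories = ["Lower", "Middle", "Upper"]
frequencies = [20, 50, 30]

n = sum(frequencies)                      # N = total number of cases
cf = list(accumulate(frequencies))        # cf = f of this category + all lower categories
c_pct = [100 * c / n for c in cf]         # c% = (100)cf/N

for cat, f, c, cp in zip(categories, frequencies, cf, c_pct):
    print(f"{cat}: f={f}, cf={c}, c%={cp:.1f}")
```

The last cf always equals N, and the last c% always equals 100.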
Rates and Ratios
a number obtained by dividing the number of actual occurrences in a given time period by the number of possible occurrences.
**birthrate, unemployment rate, poverty rate, and marriage rate**
*Rate= f actual cases/f potential cases x k (100, 1,000, ...)
Ratio= f1/f2 *comparison of one category to another
Ex: College Graduation Rate = # Students who graduated / # of students x 100
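A minimal sketch of the rate and ratio formulas above in Python. The function names and all the numbers are hypothetical, chosen only to show the arithmetic.

```python
def rate(actual, potential, k=100):
    """Rate = (f actual cases / f potential cases) x k."""
    return k * actual / potential

def ratio(f1, f2):
    """Ratio = f1 / f2, comparing one category to another."""
    return f1 / f2

# Hypothetical: 450 of 1,000 students graduated
graduation_rate = rate(450, 1000, k=100)      # per 100 students

# Hypothetical: 12 births in a town of 800 people, expressed per 1,000
birth_rate = rate(12, 800, k=1000)

# Hypothetical: 300 full-time students compared to 150 part-time students
full_to_part = ratio(300, 150)

print(graduation_rate, birth_rate, full_to_part)
```

The choice of k only changes the base the rate is expressed in (per 100, per 1,000, and so on), not the underlying comparison.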
shows the differences in frequencies or percentages among the categories of a NOMINAL or an ORDINAL variable.
(Ex: Feelings about computer and technology)
a graph showing the differences in frequencies or percentages among the categories of a NOMINAL or an ORDINAL variable. The categories are displayed as rectangles of equal width with their height proportional to the frequency or percentage of the category.
(Ex: Graduation rates at a 4-year university)
a graph showing the differences in frequencies or percentages among categories of an INTERVAL-RATIO variable. The categories are displayed as contiguous bars, with width proportional to the width of the category (x-axis) and height proportional to the frequency or percentage of that category (y-axis).
(Ex: Public Assistance income in the past 12 months)
A graphic display of a frequency distribution in which the frequency of each score is plotted on the vertical axis, with the plotted points connected by straight lines. Most useful when the data represent INTERVAL-RATIO variables. (Ex: student examination grades)
*Best suited to show continuity rather than differences*
It shows the differences in frequencies or percentages among categories of an INTERVAL-RATIO variable.
(Ex: Graduation Rates by gender for a 4-year university)
Displays geographic variations in variables
Shading represents different frequencies or percentages
Ex: % of people 5 years and over who speak Spanish at Home, 2007
Measures of central tendency
Categories or scores that describe what is average or typical of the distribution. Mean, Median, and Mode
The category or score with the highest frequency (cases) or percentage in the distribution. Can be used with nominal, ordinal, or interval-ratio level variables but is the only MCT appropriate to describe NOMINAL variables.
two data values occur with the same greatest frequency
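A small Python sketch of finding the mode, including the case where two values tie for the highest frequency. The helper name `modes` and the sample data are illustrative, and the function assumes a non-empty list.

```python
from collections import Counter

def modes(values):
    """Return every category or score that has the highest frequency (sorted)."""
    counts = Counter(values)
    top = max(counts.values())            # the highest frequency in the distribution
    return sorted(cat for cat, f in counts.items() if f == top)

# Hypothetical nominal variable: one mode
religion = ["Catholic", "Protestant", "Catholic", "Jewish"]
print(modes(religion))

# Hypothetical scores where two values tie for the greatest frequency
scores = [1, 2, 2, 3, 3]
print(modes(scores))
```

Returning a list rather than a single value keeps the tie visible instead of silently picking one of the two modes.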
The middle score of a distribution. The score that divides the distribution into two equal parts (50%) so that half the cases are above it and half below it. Can be used with ORDINAL or INTERVAL-RATIO variables. (Extremely useful when the distribution of scores is skewed)
To calculate the Median
-order the scores from lowest to highest
-if N is odd, the median will be an actual score (the middle score) in the distribution
-if N is even, the median is located between the two middle scores and is calculated by taking the average of those two scores
-to find the middle score:
Position of the Median = (N + 1) / 2
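The steps above can be sketched in Python as follows. This is a minimal illustration that assumes a non-empty list of numeric scores; the function name and the example scores are made up.

```python
def median(scores):
    """Median by the steps above: sort, locate position (N + 1) / 2, average if N is even."""
    ordered = sorted(scores)              # order the scores from lowest to highest
    n = len(ordered)
    position = (n + 1) / 2                # 1-based position of the median
    if n % 2 == 1:
        # Odd N: the position is a whole number, so the median is an actual score
        return ordered[int(position) - 1]
    # Even N: the position falls between the two middle scores; average them
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median([3, 1, 7, 5, 9]))   # odd N: position 3, median is 5
print(median([4, 8, 1, 6]))      # even N: position 2.5, median is (4 + 6) / 2 = 5.0
```

Python's standard library also provides `statistics.median`, which gives the same result for numeric data.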
A measure of central tendency that is obtained by adding up all the scores and dividing by the total number of scores. It is the arithmetic average score of a distribution. Typically used with INTERVAL-RATIO variables.
The frequencies at the right and left of the distribution are identical; each half of the distribution is the mirror image of the other.
a distribution with a few extreme values on one side of the distribution.
Negatively skewed distribution
a distribution with a few extremely low values (tilts right; the tail points left)
Positively skewed distribution
a distribution with a few extremely high values (tilts left; the tail points right)
Mean (Y bar)= ΣY/N
*see notes for examples*
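A minimal Python sketch of the mean formula, with the median computed alongside it to show how a few extreme values pull the mean. The incomes are hypothetical, and comparing the mean to the median is only a rule of thumb for spotting skew, not a formal test.

```python
from statistics import median

def mean(scores):
    """Mean (Y bar) = sum of all scores / N."""
    return sum(scores) / len(scores)

# Hypothetical incomes in thousands of dollars, with one extremely high value
incomes = [20, 22, 25, 27, 30, 150]

y_bar = mean(incomes)
md = median(incomes)

print(f"mean = {y_bar:.1f}, median = {md:.1f}")
if y_bar > md:
    print("Mean is above the median: suggests a few extremely high values (positive skew)")
elif y_bar < md:
    print("Mean is below the median: suggests a few extremely low values (negative skew)")
else:
    print("Mean equals the median: consistent with a symmetrical distribution")
```

Here the single income of 150 pulls the mean well above the median, which is why the median is often preferred for skewed distributions.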
Factors in choosing a measure of central tendency