Lecture 3: Descriptive Statistics: Graphical
bar graph
Used for qualitative/categorical data. A bar graph represents the frequency or relative frequency of occurrence of each
category of the data, with the value represented by the height of the bar. Side-by-side bar graphs can be used for
comparative analysis. A Pareto graph is a bar graph whose bars are ordered from the most frequent category to the least frequent.
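The quantities a bar graph displays can be computed directly from the raw categories. The sketch below is only an illustration: the sample data and the helper names `frequency_table` and `text_bar_graph` are assumptions made for this example, not part of the lecture. It counts each category, derives the relative frequency, and lists the categories from most to least frequent (the Pareto ordering).

```python
from collections import Counter

# Hypothetical categorical sample (illustrative values only).
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O", "A", "O"]

def frequency_table(values):
    """Return (category, frequency, relative frequency) rows in Pareto order."""
    counts = Counter(values)
    total = sum(counts.values())
    # Sort by descending frequency; ties are broken alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(category, freq, freq / total) for category, freq in ordered]

def text_bar_graph(values):
    """Render one line per category, with bar length equal to the frequency."""
    return [f"{category:>3} | {'#' * freq} {freq} ({rel:.0%})"
            for category, freq, rel in frequency_table(values)]

for line in text_bar_graph(blood_types):
    print(line)
```

The relative frequencies always sum to 1, so the same bars can be labelled either with counts or with percentages without changing their relative heights.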
graphical summaries
Enable us to visually assess the center (mean, median, mode: the measures of central tendency), the spread
(standard deviation/dispersion) and the shape (skewed or bell shaped) of the distribution.
contingency table
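Is used to numerically summarize categorical data: it tabulates the number of observations in each combination of categories of two categorical variables.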
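As a rough illustration of how such a table is built, the sketch below cross-tabulates two categorical variables using only the Python standard library. The survey records and the function name `contingency_table` are hypothetical, chosen just for this example.

```python
from collections import Counter

# Hypothetical survey records: (group, preferred drink). Illustrative only.
records = [
    ("Male", "Tea"), ("Male", "Coffee"), ("Female", "Tea"),
    ("Female", "Tea"), ("Male", "Coffee"), ("Female", "Coffee"),
    ("Female", "Tea"), ("Male", "Tea"),
]

def contingency_table(pairs):
    """Count observations for every (row category, column category) pair."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table

rows, cols, table = contingency_table(records)

# Print the table with row totals and column totals (the margins).
print(f"{'':8}" + "".join(f"{c:>8}" for c in cols) + f"{'Total':>8}")
for r in rows:
    print(f"{r:8}" + "".join(f"{table[r][c]:>8}" for c in cols)
          + f"{sum(table[r].values()):>8}")
col_totals = [sum(table[r][c] for r in rows) for c in cols]
print(f"{'Total':8}" + "".join(f"{t:>8}" for t in col_totals)
      + f"{sum(col_totals):>8}")
```

The individual cells are the heights one would draw in a clustered bar graph, and the row totals are the full bar heights in a stacked bar graph.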
simple bar graph
Clustered Bar Graph
Useful for comparing two categorical variables; often used in conjunction with cross tabulations.
stacked bar graph
A bar graph that compares the same categories for different groups and shows category totals.
This graph provides a visual summary that can be used to compare data across classes as well as data within classes.
PIE CHARTS
Used to visualize qualitative/categorical data. A pie chart represents the percentage of each category as a slice of a
circle; the angle of each slice is proportional to the category's percentage. The percentages in a pie chart must total 100%.
A pie chart should be used when each category is necessarily a part of a whole, whereas bar charts are used
to represent quantities that are measured in the same units but may not be parts of a whole.
In the example chart, it can be concluded that IT is the maximum-profit sector for the company.
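The link between percentages and slice angles can be sketched in a few lines. The profit figures below are invented for illustration (they are not the values from the lecture's chart), and `pie_slices` is a hypothetical helper name. Each category's share of the total is converted to a percentage and to an angle out of 360 degrees.

```python
# Hypothetical profit figures by sector (illustrative, not the lecture's data).
profits = {"IT": 45, "Retail": 25, "Finance": 20, "Other": 10}

def pie_slices(values):
    """Return {category: (percentage, angle in degrees)} for a pie chart."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    return {name: (100 * v / total, 360 * v / total)
            for name, v in values.items()}

slices = pie_slices(profits)
for name, (pct, angle) in slices.items():
    print(f"{name:8} {pct:5.1f}%  {angle:6.1f} degrees")

# The largest slice identifies the dominant category.
largest = max(slices, key=lambda name: slices[name][0])
print("Largest slice:", largest)
```

Because every slice is a share of the same total, the percentages add up to 100 and the angles add up to 360 degrees, which is why a pie chart only makes sense for parts of a whole.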
TIME SERIES
Used to visualize quantitative data. A time series plot shows a sequence of data points consisting of successive measurements
made over a continuum of time intervals. It reveals the trend and cyclic patterns of the data.
HISTOGRAM
A graphical display used for quantitative data. A rectangle is used to represent the frequency of observations in each
interval. For a histogram, the y axis must start at zero; if it does not, the graph may give a
misleading view of the data. If the x axis values do not start at zero, a break symbol should be used.
Disadvantage: a histogram by itself does not preserve the raw data.
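To make the binning step concrete, here is a minimal sketch that counts observations per interval and prints a text histogram. The exam scores, the bin settings, and the name `histogram_counts` are assumptions made for this example.

```python
# Hypothetical exam scores (illustrative values only).
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 79, 82, 85, 91]

def histogram_counts(data, start, width, n_bins):
    """Count observations in n_bins equal-width intervals [lo, lo + width)."""
    counts = [0] * n_bins
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < n_bins:  # values outside the range are ignored
            counts[index] += 1
    return counts

start, width, n_bins = 50, 10, 5
counts = histogram_counts(scores, start, width, n_bins)

# Text rendering: every bar starts at zero, so bar lengths are comparable.
for i, count in enumerate(counts):
    lo = start + i * width
    print(f"[{lo:3}, {lo + width:3}) | {'#' * count} {count}")
```

Only the counts per interval survive this step. The individual scores cannot be recovered from `counts`, which is the disadvantage noted above.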
Quantitative
Histograms, Stem & Leaf, Boxplot
Qualitative
Bar graph, Pie Charts
stem and leaf plot
A method of graphing a collection of numbers by placing the "stem" digits (or initial digits) in one column and the "leaf" digits (or remaining digits) out to the right.
Used for displaying quantitative data. It was popularized in the 1970s by John Tukey, a mathematician working at Princeton.
He is also credited with the first published use of the word "software" and with coining the term "bit"
(short for binary digit).
Histograms do not retain the original values; to preserve them we use a stem and leaf plot. It can
also be used to compare two distributions. Because of space constraints, a stem and leaf plot is practical only for
smaller data sets, whereas a histogram can summarize a large data set.
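A stem and leaf plot is simple enough to build by hand or in a few lines of code. The sketch below assumes non-negative two-digit integers, uses the tens digit as the stem and the units digit as the leaf, and relies on made-up scores; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical exam scores (illustrative values only).
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 79, 82, 85, 91]

def stem_and_leaf(data):
    """Map each stem (tens and above) to its sorted leaves (units digit).

    Assumes non-negative integers.
    """
    plot = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)
        plot[stem].append(leaf)
    # Include empty stems so gaps in the distribution stay visible.
    return {stem: plot.get(stem, [])
            for stem in range(min(plot), max(plot) + 1)}

plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem:2} | {' '.join(str(leaf) for leaf in leaves)}")

# The original values can be rebuilt from the plot, unlike with a histogram.
recovered = [stem * 10 + leaf
             for stem, leaves in plot.items() for leaf in leaves]
```

Each row grows by one character per observation, which shows why the display becomes unwieldy for large data sets.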
