Quantitative versus Qualitative Data
Quantitative data, or numerical data, are data measured or identified on a numerical scale. Qualitative
data, or categorical data, are data that can be classified into a group.
Examples of Quantitative (Numerical) Data: The heights of students in an AP Statistics class;
the number of freckles on the face of a redhead; the average speed on a busy expressway; the
scores on a final exam; the concentration of DDT in a creek; the daily temperatures in Death
Valley; the number of people jailed for marijuana possession each year.
Examples of Qualitative (Categorical) Data: Gender; political party preference; eye color;
ethnicity; level of education; socioeconomic level; birth order of a person (first-born, second-
born, etc.)
At times the distinction between quantitative and qualitative data is somewhat less clear
than in the examples above. For example, we could view the variable “family size” as a categorical
variable if we were labeling a person based on the size of his or her family. That is, a woman would go in
category “TWO” if she were married with no children. Another woman would be in category
“FOUR” if she were married and had two children. On the other hand, “family size” would be a
quantitative variable if we were observing families and recording the number of people in each family (2,
4, ...). In situations like this, the context will make it clear whether we are dealing with quantitative or
qualitative data.
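The "family size" example can be sketched in a few lines of Python. The data values and the names family_sizes, labels, category_counts, and average_size are made up for illustration; the point is only that the same recorded values can be treated as category labels (and counted) or as numbers (and averaged).

```python
from collections import Counter
from statistics import mean

# Hypothetical family sizes for six people (made-up illustration data).
family_sizes = [2, 4, 4, 3, 2, 4]

# Categorical view: each size is only a label, so we count how many
# people fall into each category.
labels = {2: "TWO", 3: "THREE", 4: "FOUR"}
category_counts = Counter(labels[size] for size in family_sizes)

# Quantitative view: each size is a number, so arithmetic such as
# taking a mean is meaningful.
average_size = mean(family_sizes)

print(category_counts)
print(average_size)
```

In the categorical view the only sensible summary is a count per category; in the quantitative view a mean family size makes sense.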
Discrete and Continuous Data
Quantitative data can be either discrete or continuous. Discrete data are data whose possible values can be
listed or counted. Continuous data are measured and can take on any value in an interval. The number of heads
we get on 20 flips of a coin is discrete; the time of day is continuous. We will see more about discrete and
continuous data later on.
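As a rough sketch of the two examples just given, the following Python simulates 20 coin flips and picks a random time of day. The variable names and the use of the random module are assumptions made for illustration only.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Discrete: the number of heads in 20 flips can only be one of the
# listable values 0, 1, 2, ..., 20.
flips = [random.choice("HT") for _ in range(20)]
heads = flips.count("H")
possible_head_counts = list(range(21))

# Continuous: a time of day, in hours after midnight, can be any
# value in the interval from 0 to 24.
time_of_day = random.uniform(0, 24)

print(heads, heads in possible_head_counts)
print(time_of_day)
```

The count of heads always lands on one of 21 listable values, while the time of day can fall anywhere in an interval.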
Descriptive versus Inferential Statistics
Statistics has two primary functions: to describe data and to make inferences from data. Descriptive
statistics is often referred to as exploratory data analysis (EDA). The components of EDA are
analytical and graphical. When we have collected some one-variable data, we can examine these data
in a variety of ways: look at measures of center for the distribution (such as the mean and median); look at
measures of spread (variance, standard deviation, range, interquartile range); graph the data to identify
features such as shape and whether or not there are clusters or gaps (using dotplots, boxplots, histograms,
and stemplots).
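The analytical measures listed above can be computed with Python's standard statistics module. This is a minimal sketch on made-up exam scores; the variable names are hypothetical, and the quartiles come from statistics.quantiles with its default settings, which is one of several conventions for computing quartiles and may not match a given calculator or textbook on every data set.

```python
from statistics import mean, median, variance, stdev, quantiles

# Hypothetical exam scores (made-up illustration data).
scores = [62, 70, 75, 75, 81, 88, 94]

# Measures of center.
center_mean = mean(scores)
center_median = median(scores)

# Measures of spread (variance and stdev here are the sample versions).
spread_variance = variance(scores)
spread_stdev = stdev(scores)
data_range = max(scores) - min(scores)

# Quartiles under the module's default convention; IQR = Q3 - Q1.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

print(center_mean, center_median)
print(spread_variance, spread_stdev, data_range, iqr)
```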
With two-variable data, we look for relationships between variables and ask questions like: “Are
these variables related to each other, and, if so, what is the nature of that relationship?” Here we consider
such analytical ideas as correlation and regression, and graphical techniques such as scatterplots.
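For two-variable data, the correlation coefficient and the least-squares regression line can be sketched directly from their usual formulas. The paired values below are made up, and the names x, y, r, slope, and intercept are chosen only for this illustration.

```python
from math import sqrt
from statistics import mean

# Hypothetical paired data (made-up illustration values).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Sums of squared deviations and of cross-products of deviations.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Correlation coefficient: strength and direction of the linear relationship.
r = s_xy / sqrt(s_xx * s_yy)

# Least-squares regression line: predicted y = intercept + slope * x.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(r, slope, intercept)
```

A scatterplot of the same pairs would be the graphical counterpart to these analytical summaries.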
Chapters 6 and 7 of this book are primarily concerned with exploratory data analysis.
Procedures for collecting data are discussed in Chapter 8. Chapters 9 and 10 are concerned with the
probabilistic underpinnings of inference.
Inferential statistics involves using data from a sample to make inferences about the population from
which the sample was drawn. If we are interested in the average height of students at a local community
college, we could select a random sample of the students and measure their heights. Then we could use