GLOSSARY
Alternative hypothesis —the theory that the researcher hopes to confirm by rejecting the null hypothesis
Association —when some of the variability in one variable can be accounted for by the other
Bar graph —graph in which the frequencies of categories are displayed with bars; analogous to a
histogram for numerical data
Bimodal —distribution with two (or more) most common values; see mode
Binomial distribution —probability distribution for a random variable X in a binomial setting:
P(X = x) = C(n, x) p^x (1 - p)^(n - x),
where n is the number of independent trials, p is the probability of success on each trial, x is the
count of successes out of the n trials, and C(n, x) is the number of ways to choose x of the n trials
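As a rough illustration of the definition above, the sketch below computes binomial probabilities with Python's standard library. The function name `binomial_pmf` and the values n = 10, p = 0.5 are assumptions chosen for the example, not part of the glossary.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical setting: 10 trials with success probability 0.5.
n, p = 10, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# The probabilities over x = 0, 1, ..., n should add up to 1
# (up to floating-point rounding).
total = sum(probs)
print(f"P(X = 5) = {binomial_pmf(5, n, p):.4f}, total = {total:.4f}")
```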
Binomial setting (experiment) —when each of a fixed number, n, of observations either succeeds or
fails, independently, with probability p
Bivariate data —having to do with two variables
Block —a group of experimental units thought to be homogeneous with respect to the response variable
Block design —procedure by which experimental units are put into homogeneous groups in an attempt to
reduce the variability in the response variable that is due to differences between groups
Blocking —see block design
Boxplot (box and whisker plot) —graphical representation of the five-number summary of a dataset.
Each value in the five-number summary is located over its corresponding value on a number line. A
box is drawn that ranges from Q1 to Q3, and “whiskers” extend from Q1 to the minimum value and
from Q3 to the maximum value.
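The five-number summary behind a boxplot can be sketched in a few lines of Python. This version assumes one common textbook convention for quartiles (Q1 and Q3 are the medians of the lower and upper halves, with the overall median left out when the number of values is odd); other software may use slightly different rules. The function name and the data are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumed convention: Q1 and Q3 are the medians of the lower and
    upper halves; the middle value is excluded when len(data) is odd.
    """
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 else s[half:]
    return (s[0], median(lower), median(s), median(upper), s[-1])

# Hypothetical dataset of nine values.
summary = five_number_summary([2, 4, 4, 5, 7, 8, 9, 10, 12])
print(summary)
```

The box would span the second and fourth values of the result, with whiskers reaching out to the first and last.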
Categorical data —see qualitative data
Census —attempt to contact every member of a population
Center —the “middle” of a distribution; either the mean or the median
Central limit theorem —theorem that states that the sampling distribution of a sample mean becomes
approximately normal when the sample size is large, regardless of the shape of the population distribution
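A small simulation can make the theorem concrete. The sketch below is only an illustration under assumed settings: the population is exponential with mean 1 (strongly right-skewed), the sample size is 50, and 2000 samples are drawn. None of these choices come from the glossary.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50                  # sample size (hypothetical choice)
reps = 2000             # number of simulated samples (hypothetical choice)

# Population: exponential with mean 1 and standard deviation 1.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)
]

center = mean(sample_means)   # should be close to the population mean, 1
spread = stdev(sample_means)  # should be close to 1 / sqrt(n)

# For a normal distribution, about 68% of values fall within one
# standard deviation of the mean.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / reps

print(f"center={center:.3f}, spread={spread:.3f}, within 1 SD={within_one_sd:.3f}")
```

Even though individual observations are far from normal, the simulated sample means should cluster symmetrically around 1 with a spread near 1/sqrt(50).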
Chi-square (χ^2) goodness-of-fit test —compares a set of observed categorical values to a set of
expected values under a set of hypothesized proportions for the categories
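The usual statistic for this test adds up (observed - expected)^2 / expected over the categories, where each expected count is the total sample size times the hypothesized proportion. The sketch below computes only that statistic; the function name, the counts, and the proportions are hypothetical, and finding a P-value would additionally need the chi-square distribution with (number of categories - 1) degrees of freedom.

```python
def chi_square_statistic(observed, proportions):
    """Sum of (observed - expected)^2 / expected over all categories.

    observed    -- list of observed counts, one per category
    proportions -- hypothesized proportions, same order, summing to 1
    """
    total = sum(observed)
    expected = [total * p for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 100 observations over four categories that are
# hypothesized to be equally likely (expected count 25 each).
observed = [18, 22, 20, 40]
stat = chi_square_statistic(observed, [0.25, 0.25, 0.25, 0.25])
degrees_of_freedom = len(observed) - 1
print(f"chi-square = {stat:.2f} with {degrees_of_freedom} degrees of freedom")
```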
Cluster sample —the population is first divided into sections or “clusters.” Then we randomly select an
entire cluster, or clusters, and include all of the members of the cluster(s) in the sample.
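The selection step can be sketched as follows. The classroom data and the function name `cluster_sample` are made up for illustration; the point is that randomness is applied to whole clusters, and every member of a chosen cluster enters the sample.

```python
import random

def cluster_sample(clusters, k, rng):
    """Pick k whole clusters at random; return the chosen cluster
    names and every member of those clusters."""
    chosen = rng.sample(sorted(clusters), k)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical population: students grouped by classroom.
classrooms = {
    "A": ["Ana", "Ben", "Cy"],
    "B": ["Dee", "Eli"],
    "C": ["Fay", "Gus", "Hal", "Ivy"],
}

chosen, members = cluster_sample(classrooms, 2, random.Random(1))
print(chosen, members)
```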
Coefficient of determination (r^2) —measures the proportion of variation in the response variable