The Essentials of Biostatistics for Physicians, Nurses, and Clinicians

Solutions to Selected Exercises

(e) Nonlinear regression
(f) Scatter plot
(g) Slope of the regression line in simple linear regression
(a) Association is a general term for a relationship between two variables. It
includes the Pearson correlation coefficient, Kendall's tau, and Spearman's rho,
among others.
(b) The correlation coefficient usually refers to the Pearson product moment cor-
relation, which is a measure of the strength of the linear association between
two variables. Sometimes Kendall's tau and Spearman's rho are also called
correlations. Spearman's rank correlation measures the degree to which Y
increases as X increases, or Y decreases as X increases. It is 1 when Y is exactly
a monotonically increasing function of X, and −1 when Y is exactly a mono-
tonically decreasing function of X.
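The distinction between linear and monotone association in (b) can be made concrete with a short sketch. The functions below implement the standard Pearson formula and Spearman's rho as the Pearson correlation of the ranks; the data values are invented for illustration, and tie handling is omitted for brevity.

```python
# Sketch: Pearson r vs. Spearman's rho, from the standard formulas.
# Illustrative data only; this ranking helper does not handle ties.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(v):
    # Assign rank 1..n by sorted order (no tie correction in this sketch)
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(x, y):
    # Spearman's rho is the Pearson correlation of the ranks
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]            # y = x**2: monotone but not linear
print(round(pearson(x, y), 3))   # strong but below 1 (association not linear)
print(spearman(x, y))            # exactly 1.0 (perfectly monotone)
```

Because y is a monotonically increasing function of x, Spearman's rho is exactly 1 even though Pearson's r is below 1.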
(c) Simple linear regression is the curve relating two variables X and Y when
Y = af(X) + b + e, where a and b are the parameters, and e represents a random
noise component. In this formulation, Y is a linear function of the parameters
a and b, and f is any function of X. The regression function is af(X) + b. If
f(X) = X, Y is linear in X also, but f(X) could also be X^2 or log(X).
(d) Multiple linear regression is similar to simple linear regression except that Y is a
function of two or more variables X1, X2, ..., Xn, where n ≥ 2.
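The least squares fit in (d) can be sketched in a few lines by forming and solving the normal equations (X'X)b = X'Y, which (f) and (g) below note is the closed-form linear regression solution. The data here are made up so that Y = 1 + 2*X1 + 3*X2 exactly, so the fit should recover those coefficients; the naive Gaussian elimination is for illustration only.

```python
# Sketch: multiple linear regression with two predictors, fit by solving
# the normal equations (X'X) b = X'Y. Data invented so Y = 1 + 2*X1 + 3*X2.

def solve(A, b):
    # Naive Gaussian elimination with partial pivoting (small systems only)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit(rows, y):
    # rows: predictor tuples (X1, X2, ...); a leading 1 gives the intercept
    X = [(1.0,) + tuple(r) for r in rows]
    p = len(X[0])
    XtX = [[sum(X[i][j] * X[i][k] for i in range(len(X))) for k in range(p)]
           for j in range(p)]
    Xty = [sum(X[i][j] * y[i] for i in range(len(X))) for j in range(p)]
    return solve(XtX, Xty)

rows = [(0, 1), (1, 0), (2, 2), (3, 1)]
y = [4, 3, 11, 10]                       # exactly 1 + 2*X1 + 3*X2
b0, b1, b2 = fit(rows, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))  # recovers 1, 2, 3
```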
(e) Nonlinear regression can have Y be a function of one or more variables. It
differs from linear regression in that the model must be nonlinear in the parameters.
So, for example, Y could be exp(b)X^a, or some other complicated expression.
Y = X^a + e is nonlinear. But if the noise term were multiplicative, that is,
Y = X^a * e, then it is transformable to a linear regression, since
ln(Y) = ln(e) + a ln(X). In this case, we can solve by least squares with a zero
intercept restriction. ln(e) is the additive noise term, and Z = ln(Y) has a linear
regression Z = aW + δ, where W = ln(X) and δ = ln(e). The only parameter now
is a, and Z is a linear function of the parameter a. Usually, in nonlinear regression,
iterative procedures are needed for the solution, while in linear regression, the
least squares solution is obtained in closed form by solving equations that are
called the normal equations.
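The log transform described in (e) can be sketched directly: with W = ln(X) and Z = ln(Y), the zero-intercept least squares slope is sum(W*Z)/sum(W*W). The data here are noise-free (e = 1, so ln(e) = 0) and invented purely to show that the transform recovers the exponent.

```python
import math

# Sketch of the transform in (e): if Y = X**a * e with multiplicative noise,
# then ln(Y) = a*ln(X) + ln(e), linear in the single parameter a.
# Zero-intercept least squares slope: sum(W*Z) / sum(W*W).
# Illustrative, noise-free data (e = 1).

def fit_power_exponent(xs, ys):
    ws = [math.log(x) for x in xs]    # W = ln(X)
    zs = [math.log(y) for y in ys]    # Z = ln(Y)
    return sum(w * z for w, z in zip(ws, zs)) / sum(w * w for w in ws)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]             # true exponent a = 2
print(round(fit_power_exponent(xs, ys), 6))  # recovers 2.0
```

With real multiplicative noise, the same formula gives the least squares estimate of a on the log scale, as the answer above describes.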
(f) A scatter plot is a graph of pairs (X, Y) that graphically shows the degree of
relationship between the variables X and Y and is often the first step toward
fitting a model of Y as a function of X.
(g) In simple linear regression, where Y = af(X) + b + e, the parameter a is called the
slope of the regression line. When f(X) = X, the least squares regression line is fit
through the scatter plot of the data. The closer the data points fall to the least
squares line, the higher the correlation between X and Y, and the better the linear
regression line fits the data. The slope of that regression line is the least squares
estimate of a, and the Y intercept of the line is the least squares estimate of b.
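The closed-form estimates in (g) can be sketched from the normal equations for the case f(X) = X: slope = Sxy/Sxx and intercept = mean(Y) − slope*mean(X). The data values below are invented, chosen to lie roughly on y = 2x.

```python
# Sketch: closed-form least squares slope and intercept for simple linear
# regression with f(X) = X. Illustrative data, roughly y = 2x.

def least_squares_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx                  # least squares estimate of a
    return slope, my - slope * mx      # intercept: estimate of b

xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
a, b = least_squares_line(xs, ys)
print(round(a, 3), round(b, 3))        # slope near 2, intercept near 0
```

Because these estimates come out in closed form, no iteration is needed, which is exactly the contrast with nonlinear regression drawn in (e).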


  1. What is logistic regression? How is it different from ordinary linear
    regression?
    Logistic regression involves a response variable that is binary. The predictor vari-
    ables can be continuous or discrete or a combination of both. Call Y the binary


