Wine Chemistry and Biochemistry


P.J. Martín-Álvarez


13.2 Bivariate Statistical Techniques


In this section we consider two statistical techniques, correlation and regression
analysis, used to study the interrelationship between two continuous random variables
$(X_1, X_2)$ from the information supplied by a sample of $n$ pairs of observations
$\{(x_{1,1}, x_{1,2}), (x_{2,1}, x_{2,2}), \ldots, (x_{n,1}, x_{n,2})\}$ drawn from a
population $W$. In correlation analysis we assume that the sample has been obtained at
random. In regression analysis (linear or nonlinear) we assume that the values of one
of the variables are not subject to error (the independent variable, $X = X_1$) and
that the dependent variable ($Y = X_2$) is related to the independent variable through
a mathematical model, $Y = f(X) + \varepsilon$.


The mean and standard deviation of each variable ($\bar{x}_1, s_1, \bar{x}_2, s_2$) can be
calculated, and the scatterplot of the $n$ points can be used to see the form
of the association between the two variables. For random samples, and assuming
a bivariate normal distribution, the 95% confidence ellipse

$$
(x_1 - \bar{x}_1,\; x_2 - \bar{x}_2)
\begin{pmatrix} s_1^2 & s_{12} \\ s_{12} & s_2^2 \end{pmatrix}^{-1}
\begin{pmatrix} x_1 - \bar{x}_1 \\ x_2 - \bar{x}_2 \end{pmatrix}
\frac{n(n-2)}{2(n^2-1)} = F_{1-\alpha,\,2,\,n-2},
$$

which can be used to detect outliers, can also be included in the scatterplot. The
covariance, $s_{12} = \sum_{i=1}^{n} (x_{i,1} - \bar{x}_1)(x_{i,2} - \bar{x}_2)/(n-1)$,
and the correlation coefficient, $r = s_{12}/(s_1 s_2)$, which take into account the
joint variation of both variables, can also be calculated (Afifi and Azen 1979;
Jobson 1991).
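
The descriptive quantities and the ellipse criterion described here can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names and the toy data are hypothetical, and the F quantile uses a closed form that is valid only because the numerator has 2 degrees of freedom, $F_{1-\alpha,2,m} = (m/2)(\alpha^{-2/m} - 1)$.

```python
import math

def bivariate_summary(x1, x2):
    """Means, standard deviations, covariance s12 and correlation r."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    # Sample variances and covariance use the n - 1 divisor, as in the text.
    v1 = sum((a - m1) ** 2 for a in x1) / (n - 1)
    v2 = sum((b - m2) ** 2 for b in x2) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
    return {"n": n, "mean1": m1, "mean2": m2,
            "s1": math.sqrt(v1), "s2": math.sqrt(v2),
            "s12": s12, "r": s12 / math.sqrt(v1 * v2)}

def f_quantile_2(alpha, m):
    """F_{1-alpha, 2, m}; closed form that holds only for 2 numerator df."""
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)

def ellipse_statistics(x1, x2):
    """Left-hand side of the ellipse equation at each sample point."""
    s = bivariate_summary(x1, x2)
    n, v1, v2, s12 = s["n"], s["s1"] ** 2, s["s2"] ** 2, s["s12"]
    det = v1 * v2 - s12 ** 2  # determinant of the 2x2 covariance matrix
    scale = n * (n - 2) / (2.0 * (n * n - 1))
    stats = []
    for a, b in zip(x1, x2):
        d1, d2 = a - s["mean1"], b - s["mean2"]
        # Quadratic form d' S^{-1} d written out for a 2x2 matrix.
        q = (v2 * d1 * d1 - 2.0 * s12 * d1 * d2 + v1 * d2 * d2) / det
        stats.append(q * scale)
    return stats

def outside_ellipse(x1, x2, alpha=0.05):
    """True for each point lying outside the 100(1 - alpha)% ellipse."""
    limit = f_quantile_2(alpha, len(x1) - 2)
    return [t > limit for t in ellipse_statistics(x1, x2)]

# Made-up toy values, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 7.0, 5.0, 8.0, 6.0]
print(bivariate_summary(x1, x2))
print(outside_ellipse(x1, x2))
```

Points flagged `True` are candidate outliers under the bivariate normal assumption. The covariance matrix must be non-singular, so the sketch fails for perfectly collinear data.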


13.2.1 Correlation Analysis


Assuming a bivariate normal distribution, Pearson's correlation coefficient, defined by

$$
r = \frac{\sum_{i=1}^{n} (x_{i,1} - \bar{x}_1)(x_{i,2} - \bar{x}_2)}
{\sqrt{\sum_{i=1}^{n} (x_{i,1} - \bar{x}_1)^2 \, \sum_{i=1}^{n} (x_{i,2} - \bar{x}_2)^2}},
$$

is the estimator of the population correlation coefficient $\rho$ and measures the
strength of the linear relation between the two variables $X_1$ and $X_2$. It is
possible to calculate the $100(1-\alpha)\%$ confidence interval for $\rho$, and/or to
test the null hypothesis $H_0 \equiv \rho = 0$ against the alternative
$H_1 \equiv \rho \neq 0$, by means of the statistic

$$
t_{cal} = r \sqrt{\frac{n-2}{1-r^2}},
$$

which has a t-distribution with $n-2$ df under $H_0$. If
$|t_{cal}| > t_{1-\alpha/2,\,n-2}$, or if the associated probability is less than
$\alpha$, $H_0$ is rejected and $\rho \neq 0$ is accepted.
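
As a rough illustration of this test, the sketch below computes $r$ and $t_{cal}$ for a small made-up sample and compares $|t_{cal}|$ with a tabulated critical value. The function names and data are hypothetical. The text does not say how the confidence interval for $\rho$ is obtained, so the interval based on Fisher's z-transformation used here is an assumption, chosen as one common approximate method.

```python
import math
from statistics import NormalDist

def pearson_r(x1, x2):
    """Pearson's correlation coefficient from sums of squares and products."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    sxx = sum((a - m1) ** 2 for a in x1)
    syy = sum((b - m2) ** 2 for b in x2)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t_cal = r * sqrt((n - 2) / (1 - r^2)); undefined when |r| = 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def fisher_ci(r, n, alpha=0.05):
    """Approximate CI for rho via Fisher's z (assumed method; needs n > 3)."""
    z = math.atanh(r)
    half = NormalDist().inv_cdf(1.0 - alpha / 2.0) / math.sqrt(n - 3)
    return math.tanh(z - half), math.tanh(z + half)

# Made-up toy values (n = 10), for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
x2 = [2.0, 1.0, 4.0, 3.0, 7.0, 5.0, 8.0, 6.0, 10.0, 9.0]

r = pearson_r(x1, x2)
t_cal = t_statistic(r, len(x1))
T_CRIT = 2.306  # tabulated t_{0.975, 8}, i.e. alpha = 0.05 and n - 2 = 8 df
print("r =", r, "t_cal =", t_cal, "reject H0:", abs(t_cal) > T_CRIT)
print("approximate 95% CI for rho:", fisher_ci(r, len(x1)))
```

The critical value is hard-coded because the Python standard library has no t-distribution quantile function. For other values of $n$ or $\alpha$ it has to be replaced by the matching tabulated value.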


If normality of the data cannot be accepted, Spearman's correlation coefficient
and its corresponding nonparametric test can be used for the null hypothesis
$H_0 \equiv \rho = 0$.
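
Spearman's coefficient can be computed as Pearson's coefficient applied to the ranks of the observations. The sketch below follows that definition, assigning tied values the mean of their ranks. The function names and data are hypothetical, and the accompanying significance test is not implemented here.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_r(x1, x2):
    """Pearson's correlation coefficient computed on the ranks."""
    r1, r2 = average_ranks(x1), average_ranks(x2)
    n = len(r1)
    m1, m2 = sum(r1) / n, sum(r2) / n
    sxy = sum((a - m1) * (b - m2) for a, b in zip(r1, r2))
    sxx = sum((a - m1) ** 2 for a in r1)
    syy = sum((b - m2) ** 2 for b in r2)
    return sxy / math.sqrt(sxx * syy)

# Made-up toy values: a monotonic but nonlinear relation.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 4.0, 9.0, 16.0, 25.0]
print(spearman_r(x1, x2))
```

Because only the ordering of the values matters, any strictly increasing relation between the two variables gives a coefficient of 1, whether or not it is linear.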


13.2.1.1 Applications


As an example, correlation analysis has been applied: to confirm the correlation


between biogenic amine formation and disappearance of their corresponding amino
