(i.e., selection of exclusion regions, binning, and
normalization).
- Mean-center the data: subtract the mean value of each variable
(bin) from the original data of that bin.
3.4 Statistical Analysis of NMR Data
- Reduce the data by using Principal Component Analysis. This
process assigns to each sample a score on each extracted
component (Principal Component, PC). The extracted components
are uncorrelated with each other by construction, so they
represent non-overlapping features of the studied system. Use
the component scores to plot PC maps of the samples, choosing
the components that best indicate the differences between the
classes (healthy or disease groups) in terms of metabolic
similarity.
- Carry out separate inferential statistics (t-test) on the different
component scores to check the statistical significance of the
between-group differences.
- Compare the metabolic profiles and the clinical features of each
patient by Pearson's correlation and/or ANOVA, with the
components as dependent variables and potential modulating or
confounding factors as regressors (sources of variation).
- After verifying the absence of potentially confounding factors
on the PCs, apply a linear discriminant analysis (LDA) to the
components to develop a predictive model for classifying
patients into healthy or disease groups.
- If confounding factors have statistically significant effects on
the discriminant components, correct for them (covariance
analysis or partial correlation analysis). This procedure allows
the actual degree of association between the DA-based class
membership probability and clinical status to be estimated.
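The last steps above can be illustrated with a minimal Python sketch. It is not the procedure or software used in this chapter: the data are randomly generated, the single covariate `age`, the helper names `residualize` and `fisher_lda`, and the choice of five components are assumptions made for illustration, and a two-class Fisher discriminant written with NumPy stands in for a full LDA implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inputs (all simulated): binned spectra X, class labels y
# (0 = healthy, 1 = disease), and one covariate (age in years).
n, n_bins = 60, 25
y = np.repeat([0, 1], n // 2)
age = rng.uniform(2, 16, size=n)
X = rng.normal(size=(n, n_bins))
X[:, :4] += 2.0 * y[:, None]                     # class effect on four bins
X[:, 4:8] += 0.4 * (age - age.mean())[:, None]   # age effect on four other bins

# Mean-center each bin, then extract PC scores via SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
n_pc = 5                                         # assumed number of components
scores = U[:, :n_pc] * s[:n_pc]

# Pearson correlation of each component score with the covariate.
r_age = np.array([np.corrcoef(scores[:, k], age)[0, 1] for k in range(n_pc)])


def residualize(v, covariate):
    """Remove the linear effect of one covariate from v by least squares."""
    A = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ beta


def fisher_lda(S, labels):
    """Two-class Fisher discriminant: direction w and midpoint cut-off."""
    S0, S1 = S[labels == 0], S[labels == 1]
    m0, m1 = S0.mean(axis=0), S1.mean(axis=0)
    Sw = (S0 - m0).T @ (S0 - m0) + (S1 - m1).T @ (S1 - m1)  # within-class scatter
    w = np.linalg.solve(Sw, m1 - m0)
    return w, w @ (m0 + m1) / 2


# Discriminant analysis on the component scores.
w, cut = fisher_lda(scores, y)
d = scores @ w                                   # discriminant score per sample
pred = (d > cut).astype(int)
accuracy = (pred == y).mean()                    # resubstitution accuracy only

# Partial correlation between the discriminant score and clinical status,
# controlling for the covariate.
partial_r = np.corrcoef(residualize(d, age),
                        residualize(y.astype(float), age))[0, 1]

print("r(PC, age):", np.round(r_age, 2))
print("accuracy:", accuracy, "partial r:", round(float(partial_r), 2))
```

The accuracy computed here is measured on the same samples used to fit the discriminant, so it is optimistic; a real analysis would estimate it by cross-validation or on an independent test set.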
We investigated the NMR data by Principal Component Analysis
(PCA) carried out on samples from young patients with cystic
fibrosis (CF) and healthy children. Five components explain 40%
of the variance in the metabolic data. The score plot in
Fig. 1-NMR shows a clear separation between the CF and healthy
children along PC1 (p = 0.001 by t-test) and PC4 (p < 0.0001 by
t-test).
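As an illustration of how such figures are obtained, the sketch below computes the cumulative explained variance and a per-component Welch t statistic on simulated data. The data, the group sizes, and the helper `welch_t` are assumptions; the 40% threshold is taken from the text only as an example cut-off, and the numbers printed have no relation to the results reported here. A p-value for each statistic could be obtained in practice from a routine such as `scipy.stats.ttest_ind`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 20 "healthy" and 20 "CF" spectra with 30 bins each;
# a shift is added to five bins of the second group.
n_per_group, n_bins = 20, 30
y = np.repeat([0, 1], n_per_group)
X = rng.normal(size=(2 * n_per_group, n_bins))
X[y == 1, :5] += 2.0

# PCA on the mean-centered matrix via SVD.
Xc = X - X.mean(axis=0)
U, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)          # variance fraction per component
cumulative = np.cumsum(explained)

# Smallest number of components whose cumulative variance reaches 40%.
n_pc = int(np.searchsorted(cumulative, 0.40) + 1)
scores = U[:, :n_pc] * s[:n_pc]


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va, vb = a.var(ddof=1) / a.size, b.var(ddof=1) / b.size
    return (a.mean() - b.mean()) / np.sqrt(va + vb)


# One t statistic per retained component (healthy vs. CF scores).
t_stats = np.array([welch_t(scores[y == 0, k], scores[y == 1, k])
                    for k in range(n_pc)])

print("components retained:", n_pc)
print("t statistics:", np.round(t_stats, 2))
```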
In this study, since the metabolic status of the CF patients could
be influenced by several variables such as age, gender, and antibiotic
and probiotic intake, we assessed whether any of these factors
could influence the separation between CF patients and healthy
children. To address age and gender as potential confounding
factors, the metabolic profiles and the clinical features of each
child were compared by Pearson's correlation, while to assess the
antibiotic and probiotic intake variables, the metabolic
332 Luca Casadei et al.