Systems Biology (Methods in Molecular Biology)

3.3.3 NMR Data Postprocessing

NMR data postprocessing is a necessary step of the metabolomics pipeline for extracting useful information about the state of the biological system. This step reduces unwanted sources of variation in the data, such as dilution effects and subtle changes in chemical shifts, line widths, and baseline across a series of spectra, which can interfere with the statistical analysis and lead to false conclusions. NMR data postprocessing usually includes exclusion of non-informative regions, binning, normalization, scaling, and data export for subsequent multivariate statistical analysis.

Remove the regions in the spectra that contain only noise and/or
exogenous peaks. To do so, exclude the spectral regions outside
the 0.5–9.0 ppm window (the excluded part includes the TSP signal)
and those containing the residual water (δ 4.7–5.0 ppm) and drug peaks.
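The exclusion step above can be sketched as a simple filter over the chemical-shift axis. This is a minimal illustration, not the authors' software: the function name `keep_informative` and its parameters are hypothetical, the spectrum is assumed to be a 1D projection stored as parallel lists of ppm values and intensities, and drug-peak regions (which are sample specific) would have to be supplied by the user as extra excluded intervals.

```python
def keep_informative(ppm, intensities, window=(0.5, 9.0), excluded=((4.7, 5.0),)):
    """Return (ppm, intensity) pairs inside `window` and outside every
    interval listed in `excluded` (hypothetical helper for illustration)."""
    low, high = window
    kept = []
    for shift, value in zip(ppm, intensities):
        if not (low <= shift <= high):
            continue  # outside the 0.5-9.0 ppm window (drops TSP at 0 ppm)
        if any(start <= shift <= end for start, end in excluded):
            continue  # residual water or a user-supplied drug-peak region
        kept.append((shift, value))
    return kept


# Toy spectrum: TSP point, a metabolite point, water, another metabolite, noise
ppm_axis = [0.0, 0.6, 4.8, 5.5, 9.5]
signal = [50.0, 3.0, 900.0, 2.0, 0.1]
filtered = keep_informative(ppm_axis, signal)
```

Additional intervals, for example a known drug resonance, can be passed as `excluded=((4.7, 5.0), (start, end))`.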

Reduce the dimensionality of the data by splitting the p-JRES spectra
into small segments (bins or buckets) with variable widths
ranging from 0.01 to 0.04 ppm to ensure that each bin contains
the same signals throughout all the spectra. If local peak
shifts across the series of spectra are still observed, compress groups
of bins into single bins or align the spectra. Then,
integrate the signal within each bin (see Note 5).
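As a rough sketch of the binning step, the code below integrates a spectrum over variable-width bins and shows how neighbouring bins could be compressed into one when a peak drifts across a bin boundary. The helper names `integrate_bins` and `merge_bins` are hypothetical, the bins are assumed to be sorted, contiguous `(start, end)` intervals in ppm, and the integral is approximated by summing the data points in each bin, which assumes evenly spaced points.

```python
def integrate_bins(ppm, intensities, bins):
    """Sum the intensities falling in each half-open bin [start, end)."""
    areas = []
    for start, end in bins:
        areas.append(sum(v for s, v in zip(ppm, intensities) if start <= s < end))
    return areas


def merge_bins(bins, first, last):
    """Compress bins[first..last] (inclusive) into a single wider bin."""
    merged = (bins[first][0], bins[last][1])
    return bins[:first] + [merged] + bins[last + 1:]


# Three variable-width bins (0.02, 0.02 and 0.04 ppm wide)
bins = [(1.00, 1.02), (1.02, 1.04), (1.04, 1.08)]
ppm_axis = [1.00, 1.01, 1.03, 1.05]
signal = [1, 2, 3, 4]

areas = integrate_bins(ppm_axis, signal, bins)
# A peak shifting between the first two bins: merge them before integrating
wider = merge_bins(bins, 0, 1)
areas_merged = integrate_bins(ppm_axis, signal, wider)
```

The same bin list has to be used for every spectrum in the series so that each variable refers to the same signals.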

Normalize the binned spectra by applying the Probabilistic
Quotient Normalization (PQN) [10, 11] method to make
spectra comparable:
(a) Set the total spectral area of every spectrum to 100.
(b) Calculate the reference spectrum as the median spectrum
(median of each variable/bin area) of the healthy group
samples.
(c) Calculate the quotient between the area of each spectral
bin of the considered spectrum and that of the
corresponding bin in the reference spectrum.
(d) Calculate the median of all the quotients.
(e) Divide all the variables of the considered spectrum by the
median quotient.
(f) Repeat steps c–e for all spectra.
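Steps a–f can be sketched in a few lines of standard-library Python. This is an illustrative implementation under stated assumptions, not the authors' code: each spectrum is a list of bin areas, `reference_indices` (a hypothetical parameter) marks which spectra belong to the healthy group, and bins whose reference area is zero are skipped when computing the quotients, a choice made here only to avoid division by zero.

```python
from statistics import median


def total_area_normalize(spectrum, total=100.0):
    """Step a: rescale a binned spectrum so that its areas sum to `total`."""
    area = sum(spectrum)
    return [total * value / area for value in spectrum]


def pqn(spectra, reference_indices):
    """Probabilistic quotient normalization of a list of binned spectra."""
    scaled = [total_area_normalize(s) for s in spectra]            # step a
    n_bins = len(scaled[0])
    reference = [median(scaled[i][j] for i in reference_indices)   # step b
                 for j in range(n_bins)]
    normalized = []
    for spectrum in scaled:                                        # step f
        quotients = [v / r for v, r in zip(spectrum, reference) if r > 0]  # step c
        q = median(quotients)                                      # step d
        normalized.append([v / q for v in spectrum])               # step e
    return normalized


# Second spectrum is a twofold "concentrated" copy of the first
spectra = [[10.0, 20.0, 70.0], [20.0, 40.0, 140.0]]
result = pqn(spectra, reference_indices=[0])
```

In this toy example the two spectra differ only by a dilution factor, so both end up identical after normalization.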

Scale the data by applying the generalized log (g-log) transformation
[12, 13] to make the variables within spectra
comparable:
(a) Estimate the g-log transformation parameter (λ) by the
maximum likelihood method using a set of five replicate
measurements.
(b) Obtain these five replicates from a single homogeneous
pool of fecal water samples from healthy and pathological
patients. Process the replicate spectra as described above

Metabolomics and Clinical Needs 331
