- Extraction of the principal components of interest: an
eigenvalue is associated with each eigenvector, and the
eigenvectors in the transformation matrix are sorted in
descending order of their eigenvalues. In this way, the first
eigenvector of the transformation matrix, that is, the first
principal component, corresponds to the direction of largest
variance (largest-amplitude fluctuations), the second to the
direction of second-largest variance, and so on.
- Dimensionality reduction: the last components, which are the
least significant, are discarded.
- Calculation of the new coordinate matrix: the original matrix of
atomic coordinates is projected onto the PCA space to derive the
new matrix of atomic coordinates.
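The three steps above can be sketched in base R on a small synthetic coordinate matrix. This is only an illustrative sketch, not the protocol used in this chapter: the matrix `xyz`, its size, and the number of retained components `n_keep` are assumptions made for the example.

```r
# Synthetic coordinate matrix (assumption for illustration only):
# 200 frames (rows) x 6 Cartesian coordinates (columns).
set.seed(1)
n_frames <- 200
xyz <- matrix(rnorm(n_frames * 6), nrow = n_frames)
xyz[, 1] <- xyz[, 1] * 5  # give coordinate 1 a large-amplitude fluctuation
xyz[, 2] <- xyz[, 2] * 2  # and coordinate 2 a smaller one

# Center each coordinate on its mean and compute the covariance matrix.
centered <- scale(xyz, center = TRUE, scale = FALSE)
cov_mat <- cov(centered)

# Extraction: each eigenvector has an associated eigenvalue; sort the
# eigenvectors in descending order of eigenvalue (explicit here, although
# eigen() already returns them in that order for symmetric matrices).
eig <- eigen(cov_mat, symmetric = TRUE)
ord <- order(eig$values, decreasing = TRUE)
eigenvalues <- eig$values[ord]
eigenvectors <- eig$vectors[, ord, drop = FALSE]

# Dimensionality reduction: keep only the first n_keep components.
n_keep <- 2
pcs <- eigenvectors[, seq_len(n_keep), drop = FALSE]

# New coordinate matrix: project the centered coordinates onto the PCs.
scores <- centered %*% pcs

# Fraction of the total variance carried by each component.
explained <- eigenvalues / sum(eigenvalues)
```

The variance of the first column of `scores` equals the first eigenvalue, which is what "direction of largest variance" means in practice. Inspecting `explained` is one common way to decide how many components to keep.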
In our example, we use PCA to compare the conformational sampling
efficiency of aMD with that of cMD, and to assess the ability of
aMD to retrieve the experimental crystal structures available in
the PDB.
Trajectory Preparation
Using the cpptraj module of AMBER, we:
- Create a new trajectory by concatenating the 200 ns aMD
trajectory, the 200 ns cMD trajectory, and all gap-free DFG-in
structures of p38 available in the RCSB database.
- Remove water molecules and counterions from the trajectory.
For the analysis, we also remove hydrogens to reduce the size of
the trajectory and avoid memory saturation.
- Align the conformations of the trajectory on the C alpha atoms
of the backbone so that all frames (snapshots) share the same
reference frame.
- Export the new aligned trajectory, together with a single frame
(snapshot) in PDB format that supplies the topology information
for the subsequent analysis.
Principal Component Analysis of the Trajectory
In PCA, a set of atoms is selected to describe the conformation of
the protein during the trajectory. Here, the atoms of the backbone
are selected to describe residue positions and the corresponding
trajectory constitutes the initial matrix of atomic coordinates (see
Note 10). The PCA was performed with the Bio3D package in R
(see Note 11) [38]. The topology is read from the PDB file. The
trajectory can be read in DCD format (format used by CHARMM,
NAMD, and X-PLOR) or NetCDF AMBER format after installing
the ncdf4 package. Since the trajectory has already been aligned
on the backbone, PCA is applied directly to the initial matrix of
atomic coordinates.
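A minimal sketch of this workflow with Bio3D is given below, assuming the aligned trajectory and the single PDB frame were exported as described above. The function name `run_backbone_pca` and its arguments are hypothetical and introduced only for illustration; the Bio3D calls used are `read.pdb`, `read.ncdf`, `read.dcd`, `atom.select`, and `pca.xyz`.

```r
# Hypothetical helper (name and arguments are assumptions for illustration).
# pdb_file:  single frame in PDB format, used as topology
# traj_file: aligned trajectory in NetCDF (AMBER) or DCD format
run_backbone_pca <- function(pdb_file, traj_file) {
  if (!requireNamespace("bio3d", quietly = TRUE)) {
    stop("The bio3d package is required for this sketch.")
  }

  # Read the topology from the PDB frame.
  pdb <- bio3d::read.pdb(pdb_file)

  # Read the trajectory; NetCDF input additionally needs the ncdf4 package.
  if (grepl("\\.dcd$", traj_file, ignore.case = TRUE)) {
    trj <- bio3d::read.dcd(traj_file)
  } else {
    trj <- bio3d::read.ncdf(traj_file)
  }

  # The PDB frame and the trajectory must describe the same atoms.
  stopifnot(ncol(trj) == length(pdb$xyz))

  # Select the backbone atoms and keep only their coordinate columns.
  bb <- bio3d::atom.select(pdb, "backbone")

  # The trajectory is already aligned, so PCA is applied directly.
  bio3d::pca.xyz(trj[, bb$xyz])
}
```

Calling the function with the two exported files returns a Bio3D `pca` object. Its element `L` holds the eigenvalues and `z` holds the projection of each frame onto the principal components, so the first two columns of `z` give the coordinates of every snapshot in the PC1/PC2 plane.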
414 Sonia Ziada et al.