Personalized Medicine: A New Medical and Social Challenge


weights. If a graph is undirected, its adjacency matrix is symmetric. A major
computational disadvantage of this representation is its memory consumption. There-
fore, alternatives for graph representation include the adjacency list and the object-
oriented representation.^94 However, many data integration methodologies require
data on molecular networks to be in matrix form (see Sect. 4.3). Since biological
networks are sparse, it is common practice in the implementation of these
methodologies to use a sparse matrix representation to save memory space.
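The savings can be illustrated directly. The following sketch, assuming SciPy's CSR format and a hypothetical random sparse network (not a real biological network), compares the memory footprint of the dense and sparse forms of the same adjacency matrix:

```python
# A minimal sketch of the memory savings from a sparse adjacency matrix.
# The graph here is a hypothetical random sparse network; SciPy's CSR
# format is one common sparse representation.
import numpy as np
from scipy import sparse

n = 1000  # number of nodes
upper = sparse.random(n, n, density=0.005, format="csr", random_state=0)
A = (upper + upper.T).tocsr()  # symmetrize the nonzero pattern
A.data[:] = 1                  # binarize: 0/1 adjacency entries

A_dense = A.toarray()
dense_bytes = A_dense.nbytes
sparse_bytes = A.data.nbytes + A.indices.nbytes + A.indptr.nbytes
print(dense_bytes, sparse_bytes)  # the CSR form is far smaller
```

For a network with ~0.5 % of possible edges, the dense matrix stores every zero explicitly, while CSR stores only the nonzero entries plus index arrays.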
The Laplacian matrix is a transformed version of the graph's adjacency matrix,
which is often used in spectral graph theory. It is defined as follows: let D be the
diagonal degree matrix of a graph, where the elements on the diagonal are node degrees
(i.e., row summation of the adjacency matrix: D(j, j) = ∑_i A(i, j)) and all other entries
are zero. The Laplacian matrix is defined as L = D − A (see Fig. 2 (b, c)). This matrix
plays an important role in constrained clustering problems. Constrained clustering
is a semi-supervised^95 technique that aims at incorporating prior knowledge, in the


Fig. 2 (a) (Above) Graphlets containing 2–5 nodes, denoted by G0, G1, ..., G29, and their
automorphic orbits i ∈ {0, 1, ..., 72}. Nodes within a graphlet belonging to the same orbit (i.e.,
invariant under the symmetry group) are represented by the same shade. (Below) An example of a
GDV vector for a node v. The node touches the G0 graphlet (orbit 0) two times; the end node of the G1
graphlet (orbit 1) one time and the middle node of the same graphlet (orbit 2) one time; and orbit 5 of
the graphlet G3 one time. (b) An example of a graph containing 6 nodes, along with the GDV
vectors of nodes A and B. Identical GDV vectors of nodes A and B indicate that these two nodes
have the same local topology even though they are not directly connected. (c) Adjacency and
Laplacian matrices for the graph shown in panel (b). Zero entries in red indicate that nodes A and
B are not connected
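The construction L = D − A described above can be verified on a small example. This is a minimal sketch using NumPy and a hypothetical 4-node path graph (not the 6-node graph of Fig. 2):

```python
# A minimal sketch of the Laplacian construction L = D - A for a
# hypothetical undirected 4-node path graph.
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

D = np.diag(A.sum(axis=0))  # diagonal degree matrix: D(j, j) = sum_i A(i, j)
L = D - A
print(L)
```

For an undirected graph, L is symmetric and every row of L sums to zero, since each diagonal degree entry exactly cancels the off-diagonal −1 entries in its row; these properties underlie the use of L in spectral graph theory.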


^94 West (2000).
^95 Semi-supervised learning is a class of machine learning techniques that uses small amounts of
labeled data (prior information about the data) to train an algorithm to perform a specific task.
For example, in clustering tasks, traditional clustering algorithms use only unlabeled data to
Computational Methods for Integration of Biological Data 155
