Nature - USA (2020-06-25)

Nature | Vol 582 | 25 June 2020 | 595

Overall, 38.4% of the identified proteins did not have any functional
annotation for the biological processes, and interestingly this was
true even for 22.9% of the 100 most highly abundant proteins of each
species at the biological-process level, and for 10% when considering
protein functional domains (Extended Data Fig. 7 and Supplementary
Table 6). Thus, our data point to a very large number of highly
expressed proteins without any functional annotation or sequence
homology to proteins with known gene ontology terms. Exploration
of this part of the ‘dark proteome’ would be attractive: these proteins
may indicate essential but unique features in the evolutionary
development of these organisms that may be of biological or
biotechnological interest.

[Figure 3, panels a to d. Labels recoverable from the image:
Panel a axes: abundance-ranked proteins (%) against portion of protein
mass (%); legend: minimum to maximum; median with 25–75% percentile.
Graph data model labels: Project, Taxonomy, Annotation, Protein,
Functional region; edge types: functional annotation, homology;
organisms: Glycine max, Vitis vinifera.
Labelled proteins:
(i) Lon protease homologue, mitochondrial
(ii) ATP-dependent Clp protease proteolytic subunit
(iii) Lon protease homologue 2, peroxisomal
(iv) ATP-dependent Clp protease proteolytic subunit
Rank plots: rank against log10-transformed intensity; legend: all
proteins; protein quality control for misfolded or incompletely
synthesized proteins.
Enriched functions (sample sizes in parentheses); legend: P < 0.05,
P < 0.01:
(i) Respiratory electron transport chain (8)
(ii) Protein quality control for misfolded or incompletely
synthesized proteins (15)
(iii) Post-translational protein targeting to membrane,
translocation (5)
(iv) Ribosomal RNA processing (63)
(v) Cell wall organization (58)
(vi) Protein phosphorylation (376)
(vii) Peptidyl-threonine phosphorylation (7)]

Fig. 3 | Organism-resolved integration of proteome data into a global
analysis. a, Cumulative protein intensities (ranked by abundance; x axis) and
their contribution to total protein mass (y axis) across all organisms (n = 100
organisms). b, Exemplified structure from the data model of the graph
database, illustrating the connection between two homologous proteins of
G. max and V. vinifera, and related annotations. c, All quantified proteins
from G. max are displayed, plotting their intensities against their rank in the
dynamic range. All proteins for which the functions are associated with
‘protein quality control for misfolded or incompletely synthesized proteins’
are highlighted. d, Significantly enriched functions (grey circles, P < 0.05;
red circles, P < 0.01) within the proteome of G. max (with seven specific
examples) and their distribution across the dynamic range (sample sizes in
parentheses; one-sided Mann–Whitney U-test to the mean functional
expression level). Error bars represent minimum to maximum values, and
boxes show 10–90% percentiles.
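
The cumulative curve described for panel a can be illustrated with a minimal sketch. It is not the authors' code. It assumes linear-scale (not log10-transformed) intensities for a single organism, and the function name and example values are hypothetical. Proteins are ranked from most to least abundant, and each rank is paired with the share of total protein mass accumulated up to that rank.

```python
def cumulative_mass_curve(intensities):
    """Return (rank_percent, mass_percent) pairs for abundance-ranked proteins.

    Assumes linear-scale intensities; log10-transformed values would have
    to be converted back (10 ** value) before being summed.
    """
    if not intensities:
        raise ValueError("at least one intensity is required")
    ranked = sorted(intensities, reverse=True)  # most abundant first
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    n = len(ranked)
    curve = []
    running = 0.0
    for i, value in enumerate(ranked, start=1):
        running += value
        # x: percentage of ranked proteins covered; y: percentage of total mass
        curve.append((100.0 * i / n, 100.0 * running / total))
    return curve


# Hypothetical intensities for five proteins (illustrative values only).
example = [300.0, 1000.0, 50.0, 500.0, 150.0]
curve = cumulative_mass_curve(example)
for rank_pct, mass_pct in curve:
    print(f"{rank_pct:5.1f}% of proteins -> {mass_pct:5.1f}% of protein mass")
```

In this toy input the most abundant 20% of proteins account for half of the summed intensity. Repeating the calculation per organism and summarizing each rank percentile across organisms (median, quartiles, minimum and maximum) would give a plot of the kind the panel a legend describes.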