- EXECUTIVE SUMMARY
- 1 INTRODUCTION
- 1.1 Excitement at the Interface of Computing and Biology,
- 1.2 Perspectives on the BioComp Interface,
- 1.2.1 From the Biology Side,
- 1.2.2 From the Computing Side,
- 1.2.3 The Role of Organization and Culture,
- 1.3 Imagine What’s Next,
- 1.4 Some Relevant History in Building the Interface,
- 1.4.1 The Human Genome Project,
- 1.4.2 The Computing-to-Biology Interface,
- 1.4.3 The Biology-to-Computing Interface,
- 1.5 Background, Organization, and Approach of This Report,
- 2 21st CENTURY BIOLOGY
- 2.1 What Kind of Science?,
- 2.1.1 The Roots of Biological Culture,
- 2.1.2 Molecular Biology and the Biochemical Basis of Life,
- 2.1.3 Biological Components and Processes in Context, and Biological Complexity,
- 2.2 Toward a Biology of the 21st Century,
- 2.3 Roles for Computing and Information Technology in Biology,
- 2.3.1 Biology as an Information Science,
- 2.3.2 Computational Tools,
- 2.3.3 Computational Models,
- 2.3.4 A Computational Perspective on Biology,
- 2.3.5 Cyberinfrastructure and Data Acquisition,
- 2.4 Challenges to Biological Epistemology,
- 3 ON THE NATURE OF BIOLOGICAL DATA
- 3.1 Data Heterogeneity,
- 3.2 Data in High Volume,
- 3.3 Data Accuracy and Consistency,
- 3.4 Data Organization,
- 3.5 Data Sharing,
- 3.6 Data Integration,
- 3.7 Data Curation and Provenance,
- 4 COMPUTATIONAL TOOLS
- 4.1 The Role of Computational Tools,
- 4.2 Tools for Data Integration,
- 4.2.1 Desiderata,
- 4.2.2 Data Standards,
- 4.2.3 Data Normalization,
- 4.2.4 Data Warehousing,
- 4.2.5 Data Federation,
- 4.2.6 Data Mediators/Middleware,
- 4.2.7 Databases as Models,
- 4.2.8 Ontologies,
- 4.2.8.1 Ontologies for Common Terminology and Descriptions,
- 4.2.8.2 Ontologies for Automated Reasoning,
- 4.2.9 Annotations and Metadata,
- 4.2.10 A Case Study: The Cell Centered Database,
- 4.2.11 A Case Study: Ecological and Evolutionary Databases,
- 4.3 Data Presentation,
- 4.3.1 Graphical Interfaces,
- 4.3.2 Tangible Physical Interfaces,
- 4.3.3 Automated Literature Searching,
- 4.4 Algorithms for Operating on Biological Data,
- 4.4.1 Preliminaries: DNA Sequence as a Digital String,
- 4.4.2 Proteins as Labeled Graphs,
- 4.4.3 Algorithms and Voluminous Datasets,
- 4.4.4 Gene Recognition,
- 4.4.5 Sequence Alignment and Evolutionary Relationships,
- 4.4.6 Mapping Genetic Variation Within a Species,
- 4.4.7 Analysis of Gene Expression Data,
- 4.4.8 Data Mining and Discovery,
- 4.4.8.1 The First Known Biological Discovery from Mining Databases,
- 4.4.8.2 A Contemporary Example: Protein Family Classification and Data Integration for Functional Analysis of Proteins,
- 4.4.9 Determination of Three-dimensional Protein Structure,
- 4.4.10 Protein Identification and Quantification from Mass Spectrometry,
- 4.4.11 Pharmacological Screening of Potential Drug Compounds,
- 4.4.12 Algorithms Related to Imaging,
- 4.4.12.1 Image Rendering,
- 4.4.12.2 Image Segmentation,
- 4.4.12.3 Image Registration,
- 4.4.12.4 Image Classification,
- 4.5 Developing Computational Tools,
- 5 COMPUTATIONAL MODELING AND SIMULATION AS ENABLERS FOR BIOLOGICAL DISCOVERY
- 5.1 On Models in Biology,
- 5.2 Why Biological Models Can Be Useful,
- 5.2.1 Models Provide a Coherent Framework for Interpreting Data,
- 5.2.2 Models Highlight Basic Concepts of Wide Applicability,
- 5.2.3 Models Uncover New Phenomena or Concepts to Explore,
- 5.2.4 Models Identify Key Factors or Components of a System,
- 5.2.5 Models Can Link Levels of Detail (Individual to Population),
- 5.2.6 Models Enable the Formalization of Intuitive Understandings,
- 5.2.7 Models Can Be Used as a Tool for Helping to Screen Unpromising Hypotheses,
- 5.2.8 Models Inform Experimental Design,
- 5.2.9 Models Can Predict Variables Inaccessible to Measurement,
- 5.2.10 Models Can Link What Is Known to What Is Yet Unknown,
- 5.2.11 Models Can Be Used to Generate Accurate Quantitative Predictions,
- 5.2.12 Models Expand the Range of Questions That Can Meaningfully Be Asked,
- 5.3 Types of Models,
- 5.3.1 From Qualitative Model to Computational Simulation,
- 5.3.2 Hybrid Models,
- 5.3.3 Multiscale Models,
- 5.3.4 Model Comparison and Evaluation,
- 5.4 Modeling and Simulation in Action,
- 5.4.1 Molecular and Structural Biology,
- 5.4.1.1 Predicting Complex Protein Structures,
- 5.4.1.2 A Method to Discern a Functional Class of Proteins,
- 5.4.1.3 Molecular Docking,
- 5.4.1.4 Computational Analysis and Recognition of Functional and Structural Sites in Protein Structures,
- 5.4.2 Cell Biology and Physiology,
- 5.4.2.1 Cellular Modeling and Simulation Efforts,
- 5.4.2.2 Cell Cycle Regulation,
- 5.4.2.3 A Computational Model to Determine the Effects of SNPs in Human Pathophysiology of Red Blood Cells,
- 5.4.2.4 Spatial Inhomogeneities in Cellular Development,
- 5.4.2.4.1 Unraveling the Physical Basis of Microtubule Structure and Stability,
- 5.4.2.4.2 The Movement of Listeria Bacteria,
- 5.4.2.4.3 Morphological Control of Spatiotemporal Patterns of Intracellular Signaling,
- 5.4.3 Genetic Regulation,
- 5.4.3.1 Cis-regulation of Transcription Activity as Process Control Computing,
- 5.4.3.2 Genetic Regulatory Networks as Finite-state Automata,
- 5.4.3.3 Genetic Regulation as Circuits,
- 5.4.3.4 Combinatorial Synthesis of Genetic Networks,
- 5.4.3.5 Identifying Systems Responses by Combining Experimental Data with Biological Network Information,
- 5.4.4 Organ Physiology,
- 5.4.4.1 Multiscale Physiological Modeling,
- 5.4.4.2 Hematology (Leukemia),
- 5.4.4.3 Immunology,
- 5.4.4.4 The Heart,
- 5.4.5 Neuroscience,
- 5.4.5.1 The Broad Landscape of Computational Neuroscience,
- 5.4.5.2 Large-scale Neural Modeling,
- 5.4.5.3 Muscular Control,
- 5.4.5.4 Synaptic Transmission,
- 5.4.5.5 Neuropsychiatry,
- 5.4.6 Virology,
- 5.4.7 Epidemiology,
- 5.4.8 Evolution and Ecology,
- 5.4.8.1 Commonalities Between Evolution and Ecology,
- 5.4.8.2 Examples from Evolution,
- 5.4.8.2.1 Reconstruction of the Saccharomyces Phylogenetic Tree,
- 5.4.8.2.2 Modeling of Myxomatosis Evolution in Australia,
- 5.4.8.2.3 The Evolution of Proteins,
- 5.4.8.2.4 The Emergence of Complex Genomes,
- 5.4.8.3 Examples from Ecology,
- 5.4.8.3.1 Impact of Spatial Distribution in Ecosystems,
- 5.4.8.3.2 Forest Dynamics,
- 5.5 Technical Challenges Related to Modeling,
- 6 A COMPUTATIONAL AND ENGINEERING VIEW OF BIOLOGY
- 6.1 Biological Information Processing,
- 6.2 An Engineering Perspective on Biological Organisms,
- 6.2.1 Biological Organisms as Engineered Entities,
- 6.2.2 Biology as Reverse Engineering,
- 6.2.3 Modularity in Biological Entities,
- 6.2.4 Robustness in Biological Entities,
- 6.2.5 Noise in Biological Phenomena,
- 6.3 A Computational Metaphor for Biology,
- 7 CYBERINFRASTRUCTURE AND DATA ACQUISITION
- 7.1 Cyberinfrastructure for 21st Century Biology,
- 7.1.1 What Is Cyberinfrastructure?
- 7.1.2 Why Is Cyberinfrastructure Relevant?
- 7.1.3 The Role of High-performance Computing,
- 7.1.4 The Role of Networking,
- 7.1.5 An Example of Using Cyberinfrastructure for Neuroscience Research,
- 7.2 Data Acquisition and Laboratory Automation,
- 7.2.1 Today’s Technologies for Data Acquisition,
- 7.2.2 Examples of Future Technologies,
- 7.2.3 Future Challenges,
- 8 BIOLOGICAL INSPIRATION FOR COMPUTING
- 8.1 The Impact of Biology on Computing,
- 8.1.1 Biology and Computing: Promise and Skepticism,
- 8.1.2 The Meaning of Biological Inspiration,
- 8.1.3 Multiple Roles: Biology for Computing Insight,
- 8.2 Examples of Biology as a Source of Principles for Computing,
- 8.2.1 Swarm Intelligence and Particle Swarm Optimization,
- 8.2.2 Robotics 1: The Subsumption Architecture,
- 8.2.3 Robotics 2: Bacterium-inspired Chemotaxis in Robots,
- 8.2.4 Self-Healing Systems,
- 8.2.5 Immunology and Computer Security,
- 8.2.5.1 Why Immunology Might Be Relevant,
- 8.2.5.2 Some Possible Applications of Immunology-based Computer Security,
- 8.2.5.3 Immunological Design Principles for Computer Security,
- 8.2.5.4 An Example: Immunology and Intruder Detection,
- 8.2.5.5 Interesting Questions and Challenges,
- 8.2.5.5.1 Definition of Self,
- 8.2.5.5.2 More Immunological Mechanisms,
- 8.2.5.6 Some Possible Difficulties with an Immunological Approach,
- 8.2.6 Amorphous Computing,
- 8.3 Biology as Implementer of Mechanisms for Computing,
- 8.3.1 Evolutionary Computation,
- 8.3.1.1 What Is Evolutionary Computation?
- 8.3.1.2 Suitability of Problems for Evolutionary Computation,
- 8.3.1.3 Correctness of a Solution,
- 8.3.1.4 Solution Representation,
- 8.3.1.5 Selection of Primitives,
- 8.3.1.6 More Evolutionary Mechanisms,
- 8.3.1.6.1 Coevolution,
- 8.3.1.6.2 Development,
- 8.3.1.7 Behavior of Evolutionary Processes,
- 8.3.2 Robotics 3: Energy and Compliance Management,
- 8.3.3 Neuroscience and Computing,
- 8.3.3.1 Neuroscience and Architecture in Broad Strokes,
- 8.3.3.2 Neural Networks,
- 8.3.3.3 Neurally Inspired Sensors,
- 8.3.4 Ant Algorithms,
- 8.3.4.1 Ant Colony Optimization,
- 8.3.4.2 Other Ant Algorithms,
- 8.4 Biology as Physical Substrate for Computing,
- 8.4.1 Biomolecular Computing,
- 8.4.1.1 Description,
- 8.4.1.2 Potential Application Domains,
- 8.4.1.3 Challenges,
- 8.4.1.4 Future Directions,
- 8.4.2 Synthetic Biology,
- 8.4.2.1 An Engineering Approach to Building Living Systems,
- 8.4.2.2 Cellular Logic Gates,
- 8.4.2.3 Broader Views of Synthetic Biology,
- 8.4.2.4 Applications,
- 8.4.2.5 Challenges,
- 8.4.3 Nanofabrication and DNA Self-Assembly,
- 8.4.3.1 Rationale,
- 8.4.3.2 Applications,
- 8.4.3.3 Prospects,
- 8.4.3.4 Hybrid Systems,
- 9 ILLUSTRATIVE PROBLEM DOMAINS AT THE INTERFACE OF COMPUTING AND BIOLOGY
- 9.1 Why Problem-focused Research?
- 9.2 Cellular and Organismal Modeling,
- 9.3 A Synthetic Cell with Physical Form,
- 9.4 Neural Information Processing and Neural Prosthetics,
- 9.5 Evolutionary Biology,
- 9.6 Computational Ecology,
- 9.7 Genome-enabled Individualized Medicine,
- 9.7.1 Disease Susceptibility,
- 9.7.2 Drug Response and Pharmacogenomics,
- 9.7.3 Nutritional Genomics,
- 9.8 A Digital Human on Which a Surgeon Can Operate Virtually,
- 9.9 Computational Theories of Self-assembly and Self-modification,
- 9.10 A Theory of Biological Information and Complexity,
- 10 CULTURE AND RESEARCH INFRASTRUCTURE
- 10.1 Setting the Context,
- 10.2 Organizations and Institutions,
- 10.2.1 The Nature of the Community,
- 10.2.2 Education and Training,
- 10.2.2.1 General Considerations,
- 10.2.2.2 Undergraduate Programs,
- 10.2.2.3 The BIO2010 Report,
- 10.2.2.3.1 Engineering,
- 10.2.2.3.2 Quantitative Training,
- 10.2.2.3.3 Computer Science,
- 10.2.2.4 Graduate Programs,
- 10.2.2.5 Postdoctoral Programs,
- 10.2.2.5.1 The Sloan/DOE Postdoctoral Awards for Computational Molecular Biology,
- 10.2.2.5.2 The Burroughs-Wellcome Career Awards at the Scientific Interface,
- 10.2.2.5.3 Keck Center for Computational and Structural Biology: The Research Training Program,
- 10.2.2.6 Faculty Retraining in Midcareer,
- 10.2.3 Academic Organizations,
- 10.2.4 Industry,
- 10.2.4.1 Major IT Corporations,
- 10.2.4.2 Major Life Science Corporations,
- 10.2.4.3 Start-up and Smaller Companies,
- 10.2.5 Funding and Support,
- 10.2.5.1 General Considerations,
- 10.2.5.1.1 The Role of Funding Institutions,
- 10.2.5.1.2 The Review Process,
- 10.2.5.2 Federal Support,
- 10.2.5.2.1 National Institutes of Health,
- 10.2.5.2.2 National Science Foundation,
- 10.2.5.2.3 Department of Energy,
- 10.2.5.2.4 Defense Advanced Research Projects Agency,
- 10.3 Barriers,
- 10.3.1 Differences in Intellectual Style,
- 10.3.1.1 Historical Origins and Intellectual Traditions,
- 10.3.1.2 Different Approaches to Education and Training,
- 10.3.1.3 The Role of Theory,
- 10.3.1.4 Data and Experimentation,
- 10.3.1.5 A Caricature of Intellectual Differences,
- 10.3.2 Differences in Culture,
- 10.3.2.1 The Nature of the Research Enterprise,
- 10.3.2.2 Publication Venue,
- 10.3.2.3 Organization of Human Resources,
- 10.3.2.4 Devaluing the Contributions of the Other,
- 10.3.2.5 Attitudinal Issues,
- 10.3.3 Barriers in Academia,
- 10.3.3.1 Academic Disciplines and Departmental Structure,
- 10.3.3.2 Structure of Educational Programs,
- 10.3.3.3 Coordination Costs,
- 10.3.3.4 Risks of Retraining and Conversion,
- 10.3.3.5 Rapid But Uneven Changes in Biology,
- 10.3.3.6 Funding Risk,
- 10.3.3.7 Local Cyberinfrastructure,
- 10.3.4 Barriers in Commerce and Business,
- 10.3.4.1 Importance Assigned to Short-term Payoffs,
- 10.3.4.2 Reduced Workforces,
- 10.3.4.3 Proprietary Systems,
- 10.3.4.4 Cultural Differences Between Industry and Academia,
- 10.3.5 Issues Related to Funding Policies and Review Mechanisms,
- 10.3.5.1 Scope of Supported Work,
- 10.3.5.2 Scale of Supported Work,
- 10.3.5.3 The Review Process,
- 10.3.6 Issues Related to Intellectual Property and Publication Credit,
- 11 CONCLUSIONS AND RECOMMENDATIONS
- 11.1 Disciplinary Perspectives,
- 11.1.1 The Biology-Computing Interface,
- 11.1.2 Other Emerging Fields at the BioComp Interface,
- 11.2 Moving Forward,
- 11.2.1 Building a New Community,
- 11.2.2 Core Principles for Practitioners,
- 11.2.3 Core Principles for Research Institutions,
- 11.3 The Special Significance of Educational Innovation at the BioComp Interface,
- 11.3.1 Content,
- 11.3.2 Mechanisms,
- 11.4 Recommendations for Research Funding Agencies,
- 11.4.1 Core Principles for Funding Agencies,
- 11.4.2 National Institutes of Health,
- 11.4.3 National Science Foundation,
- 11.4.4 Department of Energy,
- 11.4.5 Defense Advanced Research Projects Agency,
- 11.5 Conclusions Regarding Industry,
- 11.6 Closing Thoughts,
- APPENDIXES
- A The Secrets of Life: A Mathematician’s Introduction to Molecular Biology
- B Challenge Problems in Bioinformatics and Computational Biology from Other Reports
- C Biographies of Committee Members and Staff
- D Workshop Participants
- What Is CSTB?