Science - USA (2019-01-18)


beyond the range of selectivities in the training
data, the models quantitatively underpredicted
these reactions. Using Keras (49), a Python package that
facilitates deep learning, with the Theano backend, we generated
a deep feed-forward network. Grid-based hyperparameter
optimization was used with linear, relu, elu, and selu activation
functions; dropout rates of 0.05, 0.1, and 0.2 on the layers;
4, 40, 400, and 4000 nodes per layer; and 0 to 6 hidden layers.
All optimizers available in Keras were also tested. This grid
search was very time intensive, and we strongly recommend that
practitioners instead use Bayesian optimization of
hyperparameters. Attempts to use Bayesian optimization and more
modern machine learning methods are currently under way.
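To give a sense of why the grid search was time intensive, the following minimal sketch enumerates the hyperparameter grid described above. It is an illustration only and not the code used in this study: the function name `build_grid` and the single placeholder optimizer are assumptions, and the text does not state how many optimizers were tested or how many models were actually trained.

```python
from itertools import product

# Values taken from the text.
ACTIVATIONS = ("linear", "relu", "elu", "selu")
DROPOUTS = (0.05, 0.1, 0.2)
NODES_PER_LAYER = (4, 40, 400, 4000)
HIDDEN_LAYERS = tuple(range(0, 7))  # 0 to 6 hidden layers, inclusive


def build_grid(optimizers=("adam",)):
    """Return the full Cartesian product of settings as a list of dicts.

    `optimizers` is a hypothetical argument; the default is a placeholder,
    not the set of optimizers used in the study.
    """
    keys = ("activation", "dropout", "nodes", "hidden_layers", "optimizer")
    axes = (ACTIVATIONS, DROPOUTS, NODES_PER_LAYER, HIDDEN_LAYERS,
            tuple(optimizers))
    return [dict(zip(keys, combo)) for combo in product(*axes)]


grid = build_grid()
# 4 activations * 3 dropouts * 4 widths * 7 depths * 1 optimizer.
# This is an upper bound on distinct models: with 0 hidden layers the
# per-layer settings make no difference.
print(len(grid))
```

Under these assumptions, a single optimizer already gives 336 configurations, and each additional optimizer adds another 336, which is consistent with the recommendation to prefer a sample-efficient search such as Bayesian optimization.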


REFERENCES AND NOTES



  1. T. Engel, Basic overview of chemoinformatics. J. Chem. Inf.
    Model. 46, 2267–2277 (2006). doi:10.1021/ci600234z;
    pmid: 17125169

  2. P. Willett, Chemoinformatics: A history. Wiley Interdiscip.
    Rev. Comput. Mol. Sci. 1, 46–56 (2011). doi:10.1002/wcms.1

  3. D. K. Agrafiotis, D. Bandyopadhyay, J. K. Wegner, H. van Vlijmen,
    Recent advances in chemoinformatics. J. Chem.
    Inf. Model. 47, 1279–1293 (2007). doi:10.1021/ci700059g;
    pmid: 17511441

  4. E. A. Feigenbaum, B. G. Buchanan, DENDRAL and Meta-DENDRAL:
    Roots of knowledge systems and expert system applications.
    Artif. Intell. 59, 233–240 (1993).
    doi:10.1016/0004-3702(93)90191-D

  5. M. H. S. Segler, M. Preuss, M. P. Waller, Planning chemical
    syntheses with deep neural networks and symbolic AI.
    Nature 555, 604–610 (2018). doi:10.1038/nature25978;
    pmid: 29595767

  6. S. Szymkuć et al., Computer-assisted synthetic planning:
    The end of the beginning. Angew. Chem. Int. Ed. Engl.
    55, 5904–5937 (2016). doi:10.1002/anie.201506101;
    pmid: 27062365

  7. J. N. Wei, D. Duvenaud, A. Aspuru-Guzik, Neural networks for the
    prediction of organic chemistry reactions. ACS Cent. Sci. 2,
    725–732 (2016). doi:10.1021/acscentsci.6b00219;
    pmid: 27800555

  8. C. W. Coley, R. Barzilay, T. S. Jaakkola, W. H. Green,
    K. F. Jensen, Prediction of organic reaction outcomes using
    machine learning. ACS Cent. Sci. 3, 434–443 (2017).
    doi:10.1021/acscentsci.7b00064; pmid: 28573205

  9. H. Chen, O. Engkvist, Y. Wang, M. Olivecrona, T. Blaschke,
    The rise of deep learning in drug discovery. Drug Discov. Today
    23, 1241–1250 (2018). doi:10.1016/j.drudis.2018.01.039;
    pmid: 29366762

  10. J. Ma, R. P. Sheridan, A. Liaw, G. E. Dahl, V. Svetnik, Deep
    neural nets as a method for quantitative structure-activity
    relationships. J. Chem. Inf. Model. 55, 263–274 (2015).
    doi:10.1021/ci500747n; pmid: 25635324

  11. S. E. Denmark, N. D. Gould, L. M. Wolf, A systematic
    investigation of quaternary ammonium ions as asymmetric
    phase-transfer catalysts. Synthesis of catalyst libraries
    and evaluation of catalyst activity. J. Org. Chem. 76,
    4260–4336 (2011). doi:10.1021/jo2005445; pmid: 21446721

  12. S. E. Denmark, N. D. Gould, L. M. Wolf, A systematic
    investigation of quaternary ammonium ions as asymmetric
    phase-transfer catalysts. Application of quantitative structure
    activity/selectivity relationships. J. Org. Chem. 76, 4337–4357
    (2011). doi:10.1021/jo2005457; pmid: 21446723

  13. R. Gómez-Bombarelli et al., Automatic chemical design
    using a data-driven continuous representation of molecules.
    ACS Cent. Sci. 4, 268–276 (2018).
    doi:10.1021/acscentsci.7b00572; pmid: 29532027

  14. P. Raccuglia et al., Machine-learning-assisted materials
    discovery using failed experiments. Nature 533, 73–76 (2016).
    doi:10.1038/nature17439; pmid: 27147027

  15. A. P. Bartók, M. J. Gillan, F. R. Manby, G. Csányi,
    Machine-learning approach for one- and two-body corrections to
    density functional theory: Applications to molecular and
    condensed water. Phys. Rev. B 88, 054104 (2013).
    doi:10.1103/PhysRevB.88.054104

  16. Z. Zhou, X. Li, R. N. Zare, Optimizing chemical reactions with
    deep reinforcement learning. ACS Cent. Sci. 3, 1337–1344
    (2017). doi:10.1021/acscentsci.7b00492; pmid: 29296675

  17. J. B. O. Mitchell, Machine learning methods in
    chemoinformatics. Wiley Interdiscip. Rev. Comput. Mol. Sci. 4,
    468–481 (2014). doi:10.1002/wcms.1183; pmid: 25285160

  18. K. B. Lipkowitz, M. Pradhan, Computational studies of chiral
    catalysts: A comparative molecular field analysis of an
    asymmetric Diels-Alder reaction with catalysts containing
    bisoxazoline or phosphinooxazoline ligands. J. Org. Chem. 68,
    4648–4656 (2003). doi:10.1021/jo0267697; pmid: 12790567

  19. M. C. Kozlowski, S. L. Dixon, M. Panda, G. Lauri, Quantum
    mechanical models correlating structure with selectivity:
    Predicting the enantioselectivity of β-amino alcohol catalysts in
    aldehyde alkylation. J. Am. Chem. Soc. 125, 6614–6615 (2003).
    doi:10.1021/ja0293195; pmid: 12769554

  20. J. L. Melville, B. I. Andrews, B. Lygo, J. D. Hirst, Computational
    screening of combinatorial catalyst libraries. Chem. Commun.
    2004, 1410–1411 (2004). doi:10.1039/b402378a;
    pmid: 15179489

  21. S. Sciabola et al., Theoretical prediction of the enantiomeric
    excess in asymmetric catalysis. An alignment-independent
    molecular interaction field based approach. J. Org. Chem. 70,
    9025–9027 (2005). doi:10.1021/jo051496b; pmid: 16238344

  22. K. C. Harper, M. S. Sigman, Predicting and optimizing
    asymmetric catalyst performance using the principles of
    experimental design and steric parameters. Proc. Natl. Acad.
    Sci. U.S.A. 108, 2179–2183 (2011).
    doi:10.1073/pnas.1013331108; pmid: 21262844

  23. K. C. Harper, M. S. Sigman, Three-dimensional correlation of
    steric and electronic free energy relationships guides
    asymmetric propargylation. Science 333, 1875–1878 (2011).
    doi:10.1126/science.1206997; pmid: 21960632

  24. M. S. Sigman, K. C. Harper, E. N. Bess, A. Milo, The
    development of multidimensional analysis tools for asymmetric
    catalysis and beyond. Acc. Chem. Res. 49, 1292–1301 (2016).
    doi:10.1021/acs.accounts.6b00194; pmid: 27220055

  25. K. C. Harper, E. N. Bess, M. S. Sigman, Multidimensional steric
    parameters in the analysis of asymmetric catalytic reactions.
    Nat. Chem. 4, 366–374 (2012). doi:10.1038/nchem.1297;
    pmid: 22522256

  26. Y. Park, Z. L. Niemeyer, J.-Q. Yu, M. S. Sigman, Quantifying
    structural effects of amino acid ligands in Pd(II)-catalyzed
    enantioselective C–H functionalization reactions.
    Organometallics 37, 203–210 (2018).
    doi:10.1021/acs.organomet.7b00751

  27. D. T. Ahneman, J. G. Estrada, S. Lin, S. D. Dreher, A. G. Doyle,
    Predicting reaction performance in C-N cross-coupling
    using machine learning. Science 360, 186–190 (2018).
    doi:10.1126/science.aar5169; pmid: 29449509

  28. M. K. Nielsen, D. T. Ahneman, O. Riera, A. G. Doyle,
    Deoxyfluorination with sulfonyl fluorides: Navigating reaction
    space with machine learning. J. Am. Chem. Soc. 140,
    5004–5008 (2018). doi:10.1021/jacs.8b01523;
    pmid: 29584953

  29. L. Breiman, Random forests. Mach. Learn. 45, 5–32 (2001).
    doi:10.1023/A:1010933404324

  30. G. Skoraczyński et al., Predicting the outcomes of organic
    reactions via machine learning: Are current descriptors
    sufficient? Sci. Rep. 7, 3582 (2017).
    doi:10.1038/s41598-017-02303-0; pmid: 28620199

  31. In this case, meaning as many as can be accessed by the
    synthesis of fragments that require no more than four
    well-established synthetic steps before being combined with a
    common scaffold.

  32. D. Parmar, E. Sugiono, S. Raja, M. Rueping, Addition and
    correction to complete field guide to asymmetric
    BINOL-phosphate derived Brønsted acid and metal catalysis:
    History and classification by mode of activation; Brønsted
    acidity, hydrogen bonding, ion pairing, and metal phosphates.
    Chem. Rev. 117, 10608–10620 (2017).
    doi:10.1021/acs.chemrev.7b00197; pmid: 28737901

  33. K. Roy, S. Kar, R. N. Das, in Understanding the Basics of QSAR
    for Applications in Pharmaceutical Sciences and Risk
    Assessment, K. Roy, S. Kar, R. N. Das, Eds. (Academic Press,
    2015), pp. 291–317.

  34. V. L. Cruz, S. Martinez, J. Ramos, J. Martinez-Salazar, 3D-QSAR
    as a tool for understanding and improving single-site
    polymerization catalysts. A review. Organometallics 33,
    2944–2959 (2014). doi:10.1021/om400721v

  35. P. Braiuca, K. Lorena, V. Ferrario, C. Ebert, L. Gardossi,
    A three-dimensional quantitative structure-activity relationship
    (3D-QSAR) model for predicting the enantioselectivity of
    Candida antarctica lipase B. Adv. Synth. Catal. 351, 1293–1302
    (2009). doi:10.1002/adsc.200900009


  36. C. L. Senese, J. Duca, D. Pan, A. J. Hopfinger, Y. J. Tseng,
    4D-fingerprints, universal QSAR and QSPR descriptors.
    J. Chem. Inf. Comput. Sci. 44, 1526–1539 (2004).
    doi:10.1021/ci049898s; pmid: 15446810

  37. J. L. Melville et al., Exploring phase-transfer catalysis with
    molecular dynamics and 3D/4D quantitative
    structure-selectivity relationships. J. Chem. Inf. Model. 45,
    971–981 (2005). doi:10.1021/ci050051l; pmid: 16045291

  38. R. E. Bellman, Dynamic Programming (Princeton Univ.
    Press, 1957).

  39. K. Pearson, LIII. On lines and planes of closest fit to systems of
    points in space. London Edinb. Dublin Philos. Mag. J. Sci. 2,
    559–572 (1901). doi:10.1080/14786440109462720

  40. R. W. Kennard, L. A. Stone, Computer aided design of
    experiments. Technometrics 11, 137–148 (1969).
    doi:10.1080/00401706.1969.10490666

  41. G. K. Ingle, M. G. Mormino, L. Wojtas, J. C. Antilla, Chiral
    phosphoric acid-catalyzed addition of thiols to N-acyl imines:
    Access to chiral N,S-acetals. Org. Lett. 13, 4822–4825 (2011).
    doi:10.1021/ol201899c; pmid: 21842841

  42. I. Steinwart, D. Hush, C. Scovel, Learning from dependent
    observations. J. Multivar. Anal. 100, 175–194 (2009).
    doi:10.1016/j.jmva.2008.04.001

  43. L. Simón, J. M. Goodman, Theoretical study of the mechanism
    of Hantzsch ester hydrogenation of imines catalyzed by chiral
    BINOL-phosphoric acids. J. Am. Chem. Soc. 130, 8741–8747
    (2008). doi:10.1021/ja800793t; pmid: 18543923

  44. S. E. Wheeler, K. N. Houk, Through-space effects of
    substituents dominate molecular electrostatic potentials of
    substituted arenes. J. Chem. Theory Comput. 5, 2301–2312
    (2009). doi:10.1021/ct900344g; pmid: 20161573

  45. C. Hansch, A. Leo, R. W. Taft, A survey of Hammett substituent
    constants and resonance and field parameters. Chem. Rev.
    91, 165–195 (1991). doi:10.1021/cr00002a004

  46. M. Valiev et al., NWChem: A comprehensive and scalable
    open-source solution for large scale molecular simulations.
    Comput. Phys. Commun. 181, 1477–1489 (2010).
    doi:10.1016/j.cpc.2010.04.018

  47. F. Pedregosa et al., Scikit-learn: Machine learning in Python.
    J. Mach. Learn. Res. 12, 2825–2830 (2011).

  48. Denmark Lab Chemoinformatics, ccheminfolib, Project ID
    8113486, GitLab (2018);
    https://gitlab.com/SEDenmarkLab/ccheminfolib.

  49. F. Chollet, Keras: Deep learning for humans, GitHub;
    https://github.com/fchollet/keras.


ACKNOWLEDGMENTS
We thank K. A. Robb and Z. Wickenhauser for experimental
assistance and N. Russell for informative discussions about
machine learning. We are also grateful for the support services of
the NMR, mass spectrometry, and microanalytical laboratories of
the University of Illinois at Urbana-Champaign. Funding: We are
grateful for generous financial support from the W. M. Keck
Foundation. A.F.Z. is grateful to the University of Illinois for
graduate fellowships. Y.W. thanks Janssen Research & Development,
San Diego, CA, for a postdoctoral fellowship. Author
contributions: A.F.Z. contributed to catalyst synthesis, acquisition
of experimental selectivity data, and computer modeling and
composed the manuscript. J.J.H. contributed to creating
ccheminfolib, designing and implementing the ASO descriptors,
and revising the manuscript. B.T.R., Y.W., and W.T.D. contributed
to catalyst synthesis. S.E.D. secured funding, supervised the
project, analyzed data, and revised the manuscript. Competing
interests: The authors declare no competing interests. Data and
materials availability: Full experimental procedures,
characterization data, and copies of ¹H, ¹³C, ³¹P, and ¹⁹F spectra
can be found in the supplementary materials, along with analytical
supercritical fluid chromatography traces of all products. The
computer code used in these studies is available in GitLab (48).

SUPPLEMENTARY MATERIALS
http://www.sciencemag.org/content/363/6424/eaau5631/suppl/DC1
Materials and Methods
Supplementary Text
Figs. S1 to S10
Table S1
References (50–86)
Data S1 to S3
22 June 2018; accepted 3 December 2018
10.1126/science.aau5631

Zahrt et al., Science 363, eaau5631 (2019) 18 January 2019

