beyond the range of selectivities in the training
data, the models quantitatively underpredicted
these reactions. Using Keras (49), a Python package that
facilitates deep learning, with the Theano backend, we
generated a deep feed-forward network. A grid-based
hyperparameter optimization was performed over linear, relu,
elu, and selu activation functions; dropout rates of 0.05, 0.1,
and 0.2 on the layers; 4, 40, 400, and 4000 nodes per layer;
and 0 to 6 hidden layers. In addition, all optimizers available
in Keras were tested. This method of hyperparameter
optimization was very time intensive, and it is strongly
recommended that practitioners instead use Bayesian
optimization of hyperparameters. Attempts to use this kind of
optimization and more modern machine learning methods are
currently under way.
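To illustrate why an exhaustive grid of this kind becomes expensive, the following minimal sketch enumerates the search space described above. It is not the authors' code: the names `build_grid`, `ACTIVATIONS`, `DROPOUTS`, `NODES_PER_LAYER`, and `HIDDEN_LAYERS` are hypothetical, and the optimizer dimension is left out because the text does not list the optimizers tested. Model construction and training are omitted; only the enumeration is shown.

```python
from itertools import product

# Hyperparameter values taken from the text above.
ACTIVATIONS = ["linear", "relu", "elu", "selu"]
DROPOUTS = [0.05, 0.1, 0.2]
NODES_PER_LAYER = [4, 40, 400, 4000]
HIDDEN_LAYERS = range(0, 7)  # 0 to 6 hidden layers, inclusive


def build_grid():
    """Return every hyperparameter combination as a dict (hypothetical helper)."""
    keys = ("activation", "dropout", "nodes", "hidden_layers")
    combos = product(ACTIVATIONS, DROPOUTS, NODES_PER_LAYER, HIDDEN_LAYERS)
    return [dict(zip(keys, combo)) for combo in combos]


if __name__ == "__main__":
    grid = build_grid()
    # Each entry would correspond to one network to build and train.
    print(len(grid), "configurations before varying the optimizer")
    print(grid[0])
```

Every additional optimizer multiplies the number of networks to train, which is consistent with the observation that the grid search was very time intensive.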
REFERENCES AND NOTES
1. T. Engel, Basic overview of chemoinformatics. J. Chem. Inf.
Model. 46, 2267–2277 (2006). doi:10.1021/ci600234z;
pmid: 17125169
2. P. Willett, Chemoinformatics: A history. Wiley Interdiscip.
Rev. Comput. Mol. Sci. 1, 46–56 (2011). doi:10.1002/wcms.1
3. D. K. Agrafiotis, D. Bandyopadhyay, J. K. Wegner, H. van Vlijmen,
Recent advances in chemoinformatics. J. Chem.
Inf. Model. 47, 1279–1293 (2007). doi:10.1021/ci700059g;
pmid: 17511441
4. E. A. Feigenbaum, B. G. Buchanan, DENDRAL and Meta-DENDRAL:
Roots of knowledge systems and expert system
applications. Artif. Intell. 59, 233–240 (1993).
doi:10.1016/0004-3702(93)90191-D
5. M. H. S. Segler, M. Preuss, M. P. Waller, Planning chemical
syntheses with deep neural networks and symbolic AI.
Nature 555, 604–610 (2018). doi:10.1038/nature25978;
pmid: 29595767
6. S. Szymkuć et al., Computer-assisted synthetic planning:
The end of the beginning. Angew. Chem. Int. Ed. Engl.
55, 5904–5937 (2016). doi:10.1002/anie.201506101;
pmid: 27062365
7. J. N. Wei, D. Duvenaud, A. Aspuru-Guzik, Neural networks for the
prediction of organic chemistry reactions. ACS Cent. Sci. 2, 725–732
(2016). doi:10.1021/acscentsci.6b00219; pmid: 27800555
8. C. W. Coley, R. Barzilay, T. S. Jaakkola, W. H. Green,
K. F. Jensen, Prediction of organic reaction outcomes using
machine learning. ACS Cent. Sci. 3, 434–443 (2017).
doi:10.1021/acscentsci.7b00064; pmid: 28573205
9. H. Chen, O. Engkvist, Y. Wang, M. Olivecrona, T. Blaschke,
The rise of deep learning in drug discovery. Drug Discov. Today
23, 1241–1250 (2018). doi:10.1016/j.drudis.2018.01.039;
pmid: 29366762
10. J. Ma, R. P. Sheridan, A. Liaw, G. E. Dahl, V. Svetnik, Deep
neural nets as a method for quantitative structure-activity
relationships. J. Chem. Inf. Model. 55, 263–274 (2015).
doi:10.1021/ci500747n; pmid: 25635324
11. S. E. Denmark, N. D. Gould, L. M. Wolf, A systematic
investigation of quaternary ammonium ions as asymmetric
phase-transfer catalysts. Synthesis of catalyst libraries
and evaluation of catalyst activity. J. Org. Chem. 76,
4260–4336 (2011). doi:10.1021/jo2005445; pmid: 21446721
12. S. E. Denmark, N. D. Gould, L. M. Wolf, A systematic
investigation of quaternary ammonium ions as asymmetric
phase-transfer catalysts. Application of quantitative structure
activity/selectivity relationships. J. Org. Chem. 76, 4337–4357
(2011). doi:10.1021/jo2005457; pmid: 21446723
13. R. Gómez-Bombarelli et al., Automatic chemical design
using a data-driven continuous representation of molecules.
ACS Cent. Sci. 4, 268–276 (2018).
doi:10.1021/acscentsci.7b00572; pmid: 29532027
14. P. Raccuglia et al., Machine-learning-assisted materials
discovery using failed experiments. Nature 533, 73–76 (2016).
doi:10.1038/nature17439; pmid: 27147027
15. A. P. Bartók, M. J. Gillan, F. R. Manby, G. Csányi,
Machine-learning approach for one- and two-body corrections to density
functional theory: Applications to molecular and condensed
water. Phys. Rev. B 88, 054104 (2013).
doi:10.1103/PhysRevB.88.054104
16. Z. Zhou, X. Li, R. N. Zare, Optimizing chemical reactions with
deep reinforcement learning. ACS Cent. Sci. 3, 1337–1344
(2017). doi:10.1021/acscentsci.7b00492; pmid: 29296675
17. J. B. O. Mitchell, Machine learning methods in
chemoinformatics. Wiley Interdiscip. Rev. Comput. Mol. Sci. 4,
468–481 (2014). doi:10.1002/wcms.1183; pmid: 25285160
18. K. B. Lipkowitz, M. Pradhan, Computational studies of chiral
catalysts: A comparative molecular field analysis of an
asymmetric Diels-Alder reaction with catalysts containing
bisoxazoline or phosphinooxazoline ligands. J. Org. Chem. 68,
4648–4656 (2003). doi:10.1021/jo0267697; pmid: 12790567
19. M. C. Kozlowski, S. L. Dixon, M. Panda, G. Lauri, Quantum
mechanical models correlating structure with selectivity:
Predicting the enantioselectivity of β-amino alcohol catalysts in
aldehyde alkylation. J. Am. Chem. Soc. 125, 6614–6615 (2003).
doi:10.1021/ja0293195; pmid: 12769554
20. J. L. Melville, B. I. Andrews, B. Lygo, J. D. Hirst, Computational
screening of combinatorial catalyst libraries. Chem. Commun.
2004, 1410–1411 (2004). doi:10.1039/b402378a;
pmid: 15179489
21. S. Sciabola et al., Theoretical prediction of the enantiomeric
excess in asymmetric catalysis. An alignment-independent
molecular interaction field based approach. J. Org. Chem. 70,
9025–9027 (2005). doi:10.1021/jo051496b; pmid: 16238344
22. K. C. Harper, M. S. Sigman, Predicting and optimizing
asymmetric catalyst performance using the principles of
experimental design and steric parameters. Proc. Natl. Acad.
Sci. U.S.A. 108, 2179–2183 (2011).
doi:10.1073/pnas.1013331108; pmid: 21262844
23. K. C. Harper, M. S. Sigman, Three-dimensional correlation of
steric and electronic free energy relationships guides
asymmetric propargylation. Science 333, 1875–1878 (2011).
doi:10.1126/science.1206997; pmid: 21960632
24. M. S. Sigman, K. C. Harper, E. N. Bess, A. Milo, The
development of multidimensional analysis tools for asymmetric
catalysis and beyond. Acc. Chem. Res. 49, 1292–1301 (2016).
doi:10.1021/acs.accounts.6b00194; pmid: 27220055
25. K. C. Harper, E. N. Bess, M. S. Sigman, Multidimensional steric
parameters in the analysis of asymmetric catalytic reactions.
Nat. Chem. 4, 366–374 (2012). doi:10.1038/nchem.1297;
pmid: 22522256
26. Y. Park, Z. L. Niemeyer, J.-Q. Yu, M. S. Sigman, Quantifying
structural effects of amino acid ligands in Pd(II)-catalyzed
enantioselective C–H functionalization reactions.
Organometallics 37, 203–210 (2018).
doi:10.1021/acs.organomet.7b00751
27. D. T. Ahneman, J. G. Estrada, S. Lin, S. D. Dreher, A. G. Doyle,
Predicting reaction performance in C-N cross-coupling
using machine learning. Science 360, 186–190 (2018).
doi:10.1126/science.aar5169; pmid: 29449509
28. M. K. Nielsen, D. T. Ahneman, O. Riera, A. G. Doyle,
Deoxyfluorination with sulfonyl fluorides: Navigating reaction
space with machine learning. J. Am. Chem. Soc. 140,
5004–5008 (2018). doi:10.1021/jacs.8b01523;
pmid: 29584953
29. L. Breiman, Random forests. Mach. Learn. 45, 5–32 (2001).
doi:10.1023/A:1010933404324
30. G. Skoraczyński et al., Predicting the outcomes of organic
reactions via machine learning: Are current descriptors
sufficient? Sci. Rep. 7, 3582 (2017).
doi:10.1038/s41598-017-02303-0; pmid: 28620199
31. In this case, meaning as many as can be accessed by the
synthesis of fragments that require no more than four well-
established synthetic steps before being combined with a
common scaffold.
32. D. Parmar, E. Sugiono, S. Raja, M. Rueping, Addition and
correction to complete field guide to asymmetric BINOL-
phosphate derived Brønsted acid and metal catalysis: History
and classification by mode of activation; Brønsted acidity,
hydrogen bonding, ion pairing, and metal phosphates.
Chem. Rev. 117, 10608–10620 (2017).
doi:10.1021/acs.chemrev.7b00197; pmid: 28737901
33. K. Roy, S. Kar, R. N. Das, in Understanding the Basics of QSAR
for Applications in Pharmaceutical Sciences and Risk
Assessment, K. Roy, S. Kar, R. N. Das, Eds. (Academic Press,
2015), pp. 291–317.
34. V. L. Cruz, S. Martinez, J. Ramos, J. Martinez-Salazar, 3D-QSAR
as a tool for understanding and improving single-site
polymerization catalysts. A review. Organometallics 33,
2944–2959 (2014). doi:10.1021/om400721v
35. P. Braiuca, K. Lorena, V. Ferrario, C. Ebert, L. Gardossi,
A three-dimensional quantitative structure-activity relationship
(3D-QSAR) model for predicting the enantioselectivity of
Candida antarctica lipase B. Adv. Synth. Catal. 351, 1293–1302
(2009). doi:10.1002/adsc.200900009
36. C. L. Senese, J. Duca, D. Pan, A. J. Hopfinger, Y. J. Tseng,
4D-fingerprints, universal QSAR and QSPR descriptors.
J. Chem. Inf. Comput. Sci. 44, 1526–1539 (2004).
doi:10.1021/ci049898s; pmid: 15446810
37. J. L. Melville et al., Exploring phase-transfer catalysis with
molecular dynamics and 3D/4D quantitative structure-
selectivity relationships. J. Chem. Inf. Model. 45, 971–981
(2005). doi:10.1021/ci050051l; pmid: 16045291
38. R. E. Bellman, Dynamic Programming (Princeton Univ.
Press, 1957).
39. K. Pearson, LIII. On lines and planes of closest fit to systems of
points in space. London Edinb. Dublin Philos. Mag. J. Sci. 2,
559–572 (1901). doi:10.1080/14786440109462720
40. R. W. Kennard, L. A. Stone, Computer aided design of
experiments. Technometrics 11, 137–148 (1969).
doi:10.1080/00401706.1969.10490666
41. G. K. Ingle, M. G. Mormino, L. Wojtas, J. C. Antilla, Chiral
phosphoric acid-catalyzed addition of thiols to N-acyl imines:
Access to chiral N,S-acetals. Org. Lett. 13, 4822–4825 (2011).
doi:10.1021/ol201899c; pmid: 21842841
42. I. Steinwart, D. Hush, C. Scovel, Learning from dependent
observations. J. Multivar. Anal. 100, 175–194 (2009).
doi:10.1016/j.jmva.2008.04.001
43. L. Simón, J. M. Goodman, Theoretical study of the mechanism
of Hantzsch ester hydrogenation of imines catalyzed by chiral
BINOL-phosphoric acids. J. Am. Chem. Soc. 130, 8741–8747
(2008). doi:10.1021/ja800793t; pmid: 18543923
44. S. E. Wheeler, K. N. Houk, Through-space effects of
substituents dominate molecular electrostatic potentials of
substituted arenes. J. Chem. Theory Comput. 5, 2301–2312
(2009). doi:10.1021/ct900344g; pmid: 20161573
45. C. Hansch, A. Leo, R. W. Taft, A survey of Hammett substituent
constants and resonance and field parameters. Chem. Rev.
91, 165–195 (1991). doi:10.1021/cr00002a004
46. M. Valiev et al., NWChem: A comprehensive and scalable
open-source solution for large scale molecular simulations.
Comput. Phys. Commun. 181, 1477–1489 (2010).
doi:10.1016/j.cpc.2010.04.018
47. F. Pedregosa et al., Scikit-learn: Machine learning in Python.
J. Mach. Learn. Res. 12, 2825–2830 (2011).
48. Denmark Lab Chemoinformatics, ccheminfolib, Project ID
8113486, GitLab (2018);
https://gitlab.com/SEDenmarkLab/ccheminfolib.
49. F. Chollet, Keras: Deep learning for humans, GitHub;
https://github.com/fchollet/keras.
ACKNOWLEDGMENTS
We thank K. A. Robb and Z. Wickenhauser for experimental
assistance and N. Russell for informative discussions about
machine learning. We are also grateful for the support services of
the NMR, mass spectrometry, and microanalytical laboratories of
the University of Illinois at Urbana-Champaign. Funding: We are
grateful for generous financial support from the W. M. Keck
Foundation. A.F.Z. is grateful to the University of Illinois for
graduate fellowships. Y.W. thanks Janssen Research & Development,
San Diego, CA, for a postdoctoral fellowship. Author
contributions: A.F.Z. contributed to catalyst synthesis, acquisition
of experimental selectivity data, and computer modeling and
composed the manuscript. J.J.H. contributed to creating
ccheminfolib, designing and implementing the ASO descriptors,
and revising the manuscript. B.T.R., Y.W., and W.T.D. contributed
to catalyst synthesis. S.E.D. secured funding, supervised the
project, analyzed data, and revised the manuscript. Competing
interests: The authors declare no competing interests. Data and
materials availability: Full experimental procedures,
characterization data, and copies of ¹H, ¹³C, ³¹P, and ¹⁹F spectra
can be found in the supplementary materials, along with analytical
supercritical fluid chromatography traces of all products. The
computer code used in these studies is available in GitLab (48).
SUPPLEMENTARY MATERIALS
http://www.sciencemag.org/content/363/6424/eaau5631/suppl/DC1
Materials and Methods
Supplementary Text
Figs. S1 to S10
Table S1
References (50–86)
Data S1 to S3
22 June 2018; accepted 3 December 2018
10.1126/science.aau5631
Zahrt et al., Science 363, eaau5631 (2019) 18 January 2019 11 of 11
RESEARCH | RESEARCH ARTICLE