Nature - USA (2019-07-18)

RESEARCH ARTICLE


We hypothesized that a series of focused correlations, coupled with an
evaluation of the overall trends, might reveal fundamental features
of the systems. To this end, we divided the dataset into subsets
categorized by imine transition-state geometry (E or Z), as determined by
the relative sign of the enantiomeric excess defined previously, because
the two geometries are hypothesized to lead to structurally distinct
interactions with the other reaction components. This organizational
scheme was viewed as a means of facilitating the identification of
catalyst features that affect particular mechanistic pathways and,
therefore, reactant combinations (and vice versa). Linear regression
algorithms were then applied to each subset to identify correlations
between molecular structure and the experimentally determined
enantioselectivity. Subsequently, analysis and refinement of the
resulting models were used to produce explicit mechanistic hypotheses
(Fig. 3).
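The partition-then-regress workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, ordinary least squares stands in for whatever regression procedure was actually used, and the mapping of positive e.e. to the E-imine subset is an assumption read off the axis ranges in Fig. 3.

```python
import numpy as np

def split_by_ee_sign(ee_values):
    """Return (e_idx, z_idx): indices of entries with positive / negative e.e.

    Assumption: positive sign -> E-imine pathway, negative -> Z-imine pathway.
    Entries with e.e. exactly zero fall into neither subset.
    """
    ee = np.asarray(ee_values, dtype=float)
    e_idx = np.flatnonzero(ee > 0)
    z_idx = np.flatnonzero(ee < 0)
    return e_idx, z_idx

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (intercept, coefficients)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend a column of ones
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]
```

Each subset would then be fitted separately, for example `fit_linear(X[e_idx], ddg[e_idx])`, where `X` is a hypothetical matrix of molecular descriptors and `ddg` the measured ΔΔG‡ values.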
The correlation depicted in Fig. 3 was identified from a set of 204
reactions (evenly split into training and validation sets) that proceed via
the E-imine transition state. The relationship includes two solvent, two
imine, one nucleophile and three catalyst terms. Overall, the statistical
model suggests a mechanistic scenario in which the imine adopts an
arrangement that minimizes energetically penalizing repulsive interactions
with reasonably large catalyst substituents^31. Perhaps most telling
is that the steric profile of the nucleophile has little effect
on the stereoselectivity, despite the large structural variance.
The included parameters (LUMO and the P‒O asymmetric stretching
intensity, iPOas) suggest that hydrogen-bonding contacts between
catalyst and nucleophile play a minor part, and that almost any
nucleophile should be compatible with the reaction if the imine and
catalyst are matched.
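The even training/validation split mentioned for the 204 E-imine reactions, together with the kind of goodness-of-fit and leave-one-out (LOO) scores reported in the Fig. 3 caption, can be illustrated as follows. This is a sketch under stated assumptions: the random split, the seed, the helper names and the particular definition of the LOO score (R^2 computed on held-out predictions) are illustrative choices, not details confirmed by the text.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with intercept; returns beta = [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Evaluate the linear model for each row of X."""
    return beta[0] + np.asarray(X, dtype=float) @ beta[1:]

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def even_split(n, seed=0):
    """Randomly split indices 0..n-1 into two halves (assumed split strategy)."""
    perm = np.random.default_rng(seed).permutation(n)
    return perm[: n // 2], perm[n // 2 :]

def loo_q2(X, y):
    """One common LOO score: refit without each point, predict it, then take R^2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        preds[i] = predict(fit_ols(X[keep], y[keep]), X[i : i + 1])[0]
    return r_squared(y, preds)
```

With 204 entries, `even_split(204)` yields two disjoint sets of 102 indices; the model is fitted on the first and `r_squared` is evaluated on the second.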
In the model for Z-imines, derived from 147 reactions, a
number of overlapping terms reinforce the notion that similar interactions
between catalyst and substrates are retained across the two geometric
imine stereoisomers. Two of these terms (the size of the catalyst aryl
substituent, as measured by the Sterimol B1 term, and the imine NBO
parameter) essentially describe the repulsive interactions between
proximal sterics and the imine N-substituent, a critical
catalyst–substrate interaction common to both transition-state imine

[Fig. 3, panels a–d. Panels a and c are scatter plots of predicted versus
measured ΔΔG‡ (kcal mol–1), with training-set and validation-set points.]

a, E-imine model:
ΔΔG‡ = 1.73 + 0.40CI + 0.17PEOE5 + 0.20B5PG + 0.11Ll
(sign illegible) 0.25LUMO + 0.45Lcat – 0.45iPOas + 0.10sin(AREA)

b, Interpretation of model terms.
Steric contributors: enantioselectivity is dependent on reasonably large
imine N-substituent, imine C-substituent and catalyst substituents.
Electronic contributors: H-bonding effects mainly originate from choice of
solvent (competing H-bonding) and nucleophile and catalyst (H-bonding
distances).

c, Z-imine model:
ΔΔG‡ = –1.47 – 0.15NBOH + 0.32NBOPG + 0.50Ls – 0.79B5Nu – 0.33B1cat

d, Panel annotations: captures beneficial effect of large proximal
substituents; large nucleophiles improve e.e.; NBO terms capture the
beneficial effect of aryl over benzyl substituents.

Fig. 3 | Development of focused correlations. a, Regression E-imine
model containing 204 entries data-mined from nine literature sources
(see the Supplementary Information for references). ‘CI’ and ‘PEOE5’ are
solvent descriptors, ‘B5PG’ and Ll are the imine steric descriptors, LUMO
is the lowest unoccupied molecular orbital energy describing the
nucleophile, Lcat is the length of the catalyst 2-substituent, ‘iPOas’ is the
P‒O asymmetric stretching intensity and ‘AREA’ is a remote environment
angle. The line is a fit, y = 0.80x + 0.35. The LOO cross-validation score
is 0.76; the average k-fold (here, fourfold) cross-validation score is 0.74;
R^2 is 0.80; the predicted R^2 is 0.73. b, Interpretation of E-imine model
terms. The model emphasizes the importance of both steric and electronic
factors. Reasonably large catalyst and imine substituents lead to high levels
of enantioselectivity; if these two components are matched, any nucleophile
should be compatible. c, Regression Z-imine model containing 147
entries data-mined from eight literature sources (see the Supplementary
Information for references). ‘NBOH’ and ‘NBOPG’ are the imine natural
bond orbital parameters; Ls is a steric descriptor of the smallest imine
substituent; ‘B5Nu’ is the nucleophile steric descriptor and ‘B1cat’ is the
Sterimol B1 term. The line is a fit, y = 0.83x − 0.24. The LOO
cross-validation score is 0.80; the average k-fold (here, fourfold)
cross-validation score is 0.79; R^2 is 0.83; the predicted R^2 is 0.80.
d, Interpretation of Z-imine model terms. Overlapping steric terms
describing the catalyst and
the imine reinforce the notion that similar interactions remain within the
two geometric imine stereoisomers. However, this model emphasizes the
importance of steric contributions predominantly from the nucleophile for
high enantioselectivities.
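To make the Z-imine regression in Fig. 3c concrete, the sketch below evaluates the published linear equation for a set of descriptor values and converts the resulting ΔΔG‡ into an enantiomeric excess through the standard relation e.e. = tanh(ΔΔG‡ / 2RT). The coefficients are copied from the figure; everything else is an assumption made for illustration: the function names, the default temperature of 298.15 K, and the premise that descriptor values are supplied on whatever scale (possibly normalized) the original fit used.

```python
import math

R_KCAL = 1.98720425864083e-3  # gas constant in kcal mol^-1 K^-1

# Coefficients of the Z-imine model as printed in Fig. 3c.
Z_IMINE_INTERCEPT = -1.47
Z_IMINE_COEFS = {
    "NBOH": -0.15,
    "NBOPG": 0.32,
    "Ls": 0.50,
    "B5Nu": -0.79,
    "B1cat": -0.33,
}

def predict_ddg_z(descriptors):
    """Predicted ΔΔG‡ (kcal/mol) from a dict of the five descriptor values.

    Assumption: values are on the same scale used when the model was fitted.
    """
    return Z_IMINE_INTERCEPT + sum(
        coef * descriptors[name] for name, coef in Z_IMINE_COEFS.items()
    )

def ddg_to_ee(ddg_kcal, temperature_k=298.15):
    """Signed e.e. in percent, from er = exp(ΔΔG‡/RT) and ee = (er - 1)/(er + 1).

    The temperature default is an illustrative assumption, not a reported value.
    """
    return 100.0 * math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature_k))
```

A call such as `ddg_to_ee(predict_ddg_z(values))`, with `values` a hypothetical dict keyed by the five descriptor names, returns a signed e.e.; the sign tracks the sign of ΔΔG‡, which is the quantity used in the text to assign the E or Z pathway.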

346 | Nature | Vol 571 | 18 July 2019
