Enrico Berardo a, Rebecca L. Greenaway b, Marcin Miklitz a, Andrew I. Cooper b and Kim E. Jelfs *a
aDepartment of Chemistry, Imperial College London, Molecular Sciences Research Hub, White City Campus, Wood Lane, London W12 0BZ, UK. E-mail: k.jelfs@imperial.ac.uk; Tel: +44 (0)2075943438
bDepartment of Chemistry and Materials Innovation Factory, University of Liverpool, 51 Oxford Street, Liverpool L7 3NY, UK
First published on 13th August 2019
Supramolecular self-assembly has allowed the synthesis of beautiful and complex molecular architectures, such as cages, macrocycles, knots, catenanes, and rotaxanes. We focus here on porous organic cages, which are molecules that have an intrinsic cavity and multiple windows. These cages have been shown to be highly effective at molecular separations and encapsulations. We investigate the possibility of complexes where one cage sits within the cavity of another. We term this a ‘nested cage’ complex. The design of such complexes is highly challenging, so we use computational screening to explore 8712 different pair combinations, running almost 0.5 million calculations to sample the phase space of the cage conformations. Through analysing the binding energies of the assemblies, we identify highly energetically favourable pairs of cages in nested cage complexes. The vast majority of the most favourable complexes include the large imine cage reported by Gawroński and co-workers using a [8 + 12] reaction of 4-tert-butyl-2,6-diformylphenol and cis,cis-1,3,5-triaminocyclohexane. The most energetically favourable nested cage complex combines the Gawroński cage with a dodecaamide cage that has six vertices, which can sit in the six windows of the larger cage. We also identify cages that have favourable binding energies for self-catenation.
Design, System, Application
The ultimate goal of chemists is to be able to design and then control the assembly of complex molecules and architectures. Computer-aided design allows us to screen hundreds to millions of possibilities, a scale that is not feasible in the laboratory, even using high-throughput automation. Here, we use simulations to focus on a specific type of supramolecular material, porous organic cages, which are discrete organic molecules that contain an internal cavity. These cavities, and the multiple entry and exit routes to these cavities, allow for potential application of these systems in catalysis, separation and encapsulation. In this paper, we aim to control the encapsulation such that we form nested cages where one cage sits within the internal cavity of a larger cage. To do this, we use simulations to trial 8712 different combinations of cage pairs, to discover systems that have highly favourable binding energies and are therefore the most likely to form. We are thus able to predict not only the most promising nested cage pair systems, but also the systems that have a preference for self-catenation. These predictions can be used for future synthetic realisation.
Porous organic cages (POCs) are an example of a porous molecular material where the porosity of the material originates in the intrinsic cavity of the cage molecule. A POC can be defined as having an internal cavity with multiple entry and exit routes through which guests can access the central cavity.3 POCs have been reported in a variety of sizes, shapes and topologies (see examples in Fig. 1), although the total number reported is only on the order of 200 molecules. The majority of POCs are made through the use of dynamic covalent chemistry (DCC), in particular imine condensation reactions, the reversible nature of which allows for error correction during the synthesis of these high symmetry products, often in high yield. There are also examples of POCs catenating to form interlocked structures with another molecule of the same cage species.21–25 This catenation can typically be controlled by the functionalisation of the cage; for instance, the position of the alkyl groups, which can introduce multiple pores of different sizes within a single complex.22 Potential applications of POCs include their use as encapsulants,26 in catalysis,27 molecular separations,28–31 as sensors,32,33 and in porous liquids.34
Fig. 1 Examples of previously reported porous organic cages in a variety of sizes, shapes, and topologies: (top left) cryptophane;35 (top centre) CC3;36 (top right) ExCage;37 (bottom left) a giant boronate cage;22 and (bottom right) C26.23
It is possible to use computer simulations to assist in the design and discovery of porous organic cages.38 First, starting from the precursors of the cage synthesis reaction, the outcome in terms of the molecular mass and topology of the cage can be hard to predict, since small changes in the precursors are known to have led to large changes in topology and consequently the properties of the cages.39 It has been shown that it is possible to predict the topology by examining the relative energies of the different possible assemblies,40,41 or further, by considering the formation mechanisms of the cage products.42 While most studies to date have focused upon a posteriori rationalisation of previously reported systems, we have recently shown it is possible to identify trends in the reaction outcomes and therefore assist in the discovery process during a larger-scale robotic screening of 78 potential cage reactions.23 It is also possible to predict shape persistency, the ability for a cage to maintain an internal void in the absence of solvent.43,44 The solid-state structure of materials and their properties can be predicted from molecular structures using crystal structure prediction (CSP) techniques.45–49 Molecular-level calculations of binding energies of dimer pairs can also assist in predicting the preferential binding modes in the solid-state, for example the preference to racemise or form enantiopure structures.50 The properties of the materials can be also understood or predicted a priori once the structures are known.38,51,52
We report here a computational screening study where we conduct almost 0.5 million calculations in order to search for elusive nested organic cage complexes. Through an examination of binding energies, we identify the most promising candidate cages for forming such complexes and analyse which systems are the most suitable for targeting for synthesis. Alongside this, we compare the competing pathways of self-catenation of the cages, which further identifies promising candidates for that type of assembly.
The cage structures were taken from reported crystal structures, or constructed manually for those structures where no X-ray diffraction structure was reported. The manual construction involved a short molecular dynamics (MD) simulation that sampled several hundred conformations and selected the lowest energy conformation for further simulations. For a few cases where more than one enantiomer of the cage is possible, we used only the enantiomer that was reported in the crystal structure. If there was positional isomerism, for example in methyl position for CC2, a random isomer was constructed to keep the number of calculations required feasible. The structural properties of the cages and their voids and windows are given in Table S2.† All void sizes and window sizes were calculated with our pywindow software, with void sizes calculated as the diameter of the largest sphere that can fit in the cavity and window size as the diameter of the largest circle that can fit in a window.78 The maximum diameter of a molecule was defined as the distance between the edges of the van der Waals spheres of the two atoms at the greatest distance from each other in the molecule. The average diameter of a molecule was determined as a mean distance from the centre of mass of a molecule to its van der Waals surface. The latter value can match the experimentally determined solvodynamic diameters.23
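The maximum-diameter definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pywindow implementation: the function name `max_diameter`, the input format, and the van der Waals radii in `VDW_RADII` are all chosen here for illustration and are not taken from the paper.

```python
import math
from itertools import combinations

# Illustrative van der Waals radii in angstroms (assumed values, not from the paper).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def max_diameter(atoms):
    """Return the maximum molecular diameter in angstroms.

    `atoms` is a list of (element, x, y, z) tuples. The pair of atoms with the
    largest centre-to-centre separation is located, and the van der Waals radii
    of both atoms are added so that the distance runs from edge to edge.
    """
    if len(atoms) < 2:
        raise ValueError("need at least two atoms")
    # Brute-force search over all atom pairs; fine for single cage molecules.
    (el_a, *pos_a), (el_b, *pos_b) = max(
        combinations(atoms, 2),
        key=lambda pair: math.dist(pair[0][1:], pair[1][1:]),
    )
    return math.dist(pos_a, pos_b) + VDW_RADII[el_a] + VDW_RADII[el_b]


# Toy three-atom example (hypothetical coordinates in angstroms).
toy = [("C", 0.0, 0.0, 0.0), ("C", 3.0, 0.0, 0.0), ("H", 1.0, 0.0, 0.0)]
toy_diameter = max_diameter(toy)
```

For the toy input, the two carbon atoms are the most distant pair, so the result is their 3.0 Å separation plus two carbon radii.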
Pairing each molecule with every other molecule in our data set, including the self-catenation combination, made a total of 8712 pair combinations to be considered. To sample the possible relative orientations of each molecular pair, we considered 56 different but evenly spaced relative orientations by carrying out rotations of the polar and azimuthal angles of one cage while keeping the other in a fixed position. Each of these structures was geometry optimised so that the lowest energy orientation could be analysed further. Thus, we conducted a total of 487872 calculations (8712 molecule pairs, each in 56 orientations). For all forcefield calculations, we used the OPLS3 forcefield,79 which we have previously shown to predict effectively the structure of flexible porous imine cages,41 and which is designed to be transferable to new organic systems. The calculations were carried out in Macromodel, using conjugate gradient minimisation with a convergence criterion of a root mean square force below 0.05 kJ mol−1 Å−1. The individual cage structures were geometry optimised in isolation with the same setup. The binding energy for the lowest energy conformation of each cage pairing was then calculated as:
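The orientation sampling and the total calculation count can be illustrated with a short Python sketch. The paper does not state how the 56 orientations were distributed over the two angles, so the 7 × 8 grid below is purely an assumption made because it yields 56 points; the function names `orientation_grid` and `rotate` are likewise hypothetical.

```python
import math

N_PAIRS = 8712  # pair combinations stated in the text


def orientation_grid(n_polar=7, n_azimuthal=8):
    """Return (polar, azimuthal) angle pairs in radians on a regular grid.

    The 7 x 8 split is an assumption chosen only because it gives 56 points.
    """
    grid = []
    for i in range(n_polar):
        # Mid-point sampling avoids the degenerate poles at 0 and pi.
        polar = math.pi * (i + 0.5) / n_polar
        for j in range(n_azimuthal):
            azimuthal = 2.0 * math.pi * j / n_azimuthal
            grid.append((polar, azimuthal))
    return grid


def rotate(point, polar, azimuthal):
    """Rotate a 3D point about the y axis by `polar`, then about the z axis by `azimuthal`."""
    x, y, z = point
    x1 = x * math.cos(polar) + z * math.sin(polar)
    z1 = -x * math.sin(polar) + z * math.cos(polar)
    x2 = x1 * math.cos(azimuthal) - y * math.sin(azimuthal)
    y2 = x1 * math.sin(azimuthal) + y * math.cos(azimuthal)
    return (x2, y2, z1)


orientations = orientation_grid()
total_calculations = N_PAIRS * len(orientations)  # 8712 * 56
```

In such a scheme, `rotate` would be applied to every atom of the mobile cage (after centring it on the fixed cage) to generate each starting structure for geometry optimisation.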
$$E_\mathrm{b} = E_\mathrm{cage\ pair} - E_\mathrm{cage1} - E_\mathrm{cage2} \qquad (1)$$
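As a minimal sketch of how eqn (1) might be applied in practice, the Python function below takes the optimised energies of all sampled orientations of a pair, selects the lowest, and subtracts the energies of the two isolated cages. The function name and the numerical values are hypothetical and serve only to show the arithmetic.

```python
def binding_energy(pair_energies, e_cage1, e_cage2):
    """Binding energy (kJ/mol) from eqn (1), using the lowest-energy orientation."""
    if not pair_energies:
        raise ValueError("need at least one optimised pair energy")
    return min(pair_energies) - e_cage1 - e_cage2


# Hypothetical energies in kJ/mol, for illustration only.
e_b = binding_energy([-1480.0, -1525.0, -1390.0], e_cage1=-600.0, e_cage2=-500.0)
```

A negative result, as here, corresponds to an energetically favourable pairing relative to the two separated cages.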
In theory, a favourable binding energy should be found in cases where there is a good match of the void size of the larger cage in the pairing with the dimensions of the smaller cage. However, in Fig. S1,† a heat-map of the difference in the void size of the larger cage to the maximum dimension of the smaller cage does not show any correlation with the binding energies in Fig. 2. Similarly, there appears to be no correlation between the difference in those sizes and the binding energy (Fig. S2†). This suggests that there are many other factors, for instance symmetry and the intermolecular bonding available for a given cage pairing, that are influencing the binding energy.
Whenever two different cages are paired, formation of a nested cage complex competes with self-catenation of each individual cage, which could be thermodynamically favoured instead. We therefore compared the nested cage binding energies against the corresponding self-catenation energies, and in Fig. 3A we plot a heat-map that shows which pairings would energetically favour self-catenation (red) and which would favour forming a nested cage complex (blue). We find that all the pairings where self-catenation is preferred correspond to regions in Fig. 2 where the binding energy for a nested cage pairing was unfavourable (pink). Therefore, if we replot Fig. 2 with any pairings that would prefer self-catenation shown as energetically unfavourable, there is no visual difference in the heat-map. Further, many of the pairings that might favour self-catenation over a nested cage complex are instances where even the self-catenation is energetically unfavourable. In Fig. 3B, we show a heat-map of the pairings that truly prefer self-catenation (yellow); i.e., those for which self-catenation both has an energetically favourable binding energy and is energetically preferred over a nested cage complex. It is clear that it is very rare for the pairings to prefer self-catenation.
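The two-stage comparison behind Fig. 3A and 3B can be sketched as a simple classification. This is an illustration under assumptions: the function name and labels are hypothetical, and comparing the nested energy against the more favourable of the two self-catenation energies is one plausible reading of the text rather than a documented detail of the authors' workflow.

```python
def classify_pairing(e_nested, e_self_1, e_self_2):
    """Classify a cage pairing from binding energies in kJ/mol.

    e_nested  binding energy of the nested cage complex
    e_self_1  self-catenation binding energy of the first cage
    e_self_2  self-catenation binding energy of the second cage
    """
    # Assumption: the competing pathway is the more favourable self-catenation.
    best_self = min(e_self_1, e_self_2)
    if e_nested <= best_self:
        return "nested cage complex"
    if best_self < 0:
        # Self-catenation is both favourable and preferred (as in Fig. 3B).
        return "self-catenation"
    # Self-catenation beats the nested complex, but neither is favourable.
    return "self-catenation (unfavourable)"


# -688 and -505 kJ/mol are reported for the 117-in-81 complex and for cage 81
# self-catenation; the -300 value for cage 117 is hypothetical.
label = classify_pairing(e_nested=-688.0, e_self_1=-505.0, e_self_2=-300.0)
```

The third label captures the case noted in the text where self-catenation wins the comparison even though its own binding energy is unfavourable.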
Rank | Cage number | Binding energy (kJ mol−1) | Remaining cavity diameter (Å) | Number of windows | Degree of interlocking |
---|---|---|---|---|---|
1 | 81 | −505 | 10.5 | 8 | 4 |
2 | 52 | −477 | 11.8 | 6 | 6 |
3 | 78 | −473 | 9.5 | 6 | 6 |
4 | 89 | −453 | 2.1 | 3 | 2 |
5 | 26 | −453 | 12.1 | 6 | 6 |
6 | 25 | −427 | 9.1 | 6 | 6 |
7 | 51 | −387 | 8.7 | 6 | 6 |
8 | 77 | −377 | 9.9 | 6 | 6 |
9 | 15 | −365 | 8.3 | 4 | 3 |
10 | 71 | −346 | 10.8 | 4 | 3 |
11 | 18 | −345 | 9.8 | 4 | 2 |
12 | 69 | −333 | 8.0 | 4 | 3 |
13 | 97 | −325 | 15.7 | 6 | 6 |
14 | 20 | −325 | 7.5 | 4 | 3 |
15 | 68 | −320 | 8.2 | 4 | 3 |
16 | 45 | −311 | 11.3 | 4 | 3 |
17 | 70 | −309 | 7.8 | 4 | 3 |
18 | 19 | −306 | 12.2 | 4 | 3 |
19 | 43 | −303 | 8.7 | 4 | 3 |
20 | 41 | −291 | 8.6 | 4 | 3 |
In all cases, the self-catenated complex is left with a considerable internal cavity, which could still host further guests. The cavities range from 7.5 to 15.7 Å in diameter, with the exception of cage 89, which has a much smaller cavity of 2.1 Å when self-catenated. We can examine the extent to which the structures are interlocked and compare this to the number of windows (Table 1). The number of windows in these best self-catenation pairs ranges from 3 to 8. In theory, a self-catenation that is maximally interlocked will have the same degree of interlocking as the number of windows, and this is also likely to lead to a more symmetric structure. This maximal interlocking occurs in 6 of the 20 best self-catenation pairs, and in all cases, these are instances where the individual cage has six windows.
The self-catenated structures of the three most energetically favoured combinations are shown in Fig. 4, along with the chemical structures of the precursors for those cages. The images for the remaining structures in the top 20 are shown in Fig. S4.† All three of the best self-catenating cages approximate to (truncated) tetrahedra, with 81 being a TCC1[6+12] cage reported by Stackhouse et al.,54 and 52 and 78 being two [4 + 4] cages recently reported by Greenaway et al.23 The latter two cages differ only in that the aromatic triamine of 52 is decorated with three methyl groups, whilst that of 78 is decorated with three ethyl groups. Only in the case of 78 are all of the windows interlocked. This difference in interlocking between 52 and 78 may indicate that, despite our best efforts across the large number of structures, we have not been able to fully sample the potential energy surface for every pair. However, we still believe the sampling is sufficient to identify favourable pairings (such as cage 97). We also note the large size of all of the ‘best’ cages for self-catenation; naturally, with additional atoms, they are likely to have more favourable binding energies than smaller molecules. Indeed, this is the case for three smaller cages previously reported to self-catenate experimentally, cages 102 (imine cage CC2), 110 (imine cage CC1) and 112 (imine cage CC4),21 which all had much less favourable binding energies for self-catenation than the top 20 reported in Table 1.
The binding energies and structural features of the top 20 nested cage complexes are shown in Table 2. In none of these instances is self-catenation energetically competitive. The most favourable binding energy for a nested cage complex was −1023 kJ mol−1, found for combining cages 111 and 117, which were also the two cages most frequently found in energetically favourable pairings. This pairing is 148 kJ mol−1 more favourable than the next best pairing, and by the 20th best pairing the binding energy has fallen to −660 kJ mol−1, although that is still more favourable than the best self-catenation pairing (−505 kJ mol−1). There are only 3 of the top 20 pairings where cage 117 is not involved. In the majority of cases, the space filling of a smaller cage inside a larger cage is relatively efficient, and only a small cavity remains in the complex, typically below 2.5 Å. However, there are exceptions; for example, the pairing of 97 and 117, which has a remaining cavity of 6.7 Å, where 117 is now the inner cage.
Rank | Inner cage | Outer cage | Binding energy (kJ mol−1) | Remaining cavity diameter (Å) | List 1 | List 2 | High symmetry |
---|---|---|---|---|---|---|---|
1 | 111 | 117 | −1023 | 2.2 | ✓ | ✓ | ✓ |
2 | 103 | 117 | −875 | 2.5 | ✓ | ✓ | |
3 | 4 | 117 | −862 | 0.7 | ✓ | ||
4 | 114 | 117 | −852 | 2.0 | ✓ | ✓ | ✓ |
5 | 117 | 97 | −807 | 6.7 | ✓ | ✓ | ✓ |
6 | 91 | 117 | −782 | 0.0 | |||
7 | 116 | 117 | −781 | 2.1 | ✓ | ✓ | ✓ |
8 | 86 | 117 | −775 | 1.4 | |||
9 | 113 | 117 | −732 | 2.4 | ✓ | ✓ | |
10 | 89 | 117 | −718 | 1.8 | ✓ | ||
11 | 56 | 117 | −714 | 0.5 | ✓ | ✓ | |
12 | 26 | 117 | −714 | 3.9 | ✓ | ||
13 | 71 | 117 | −690 | 0.3 | ✓ | ||
14 | 117 | 81 | −688 | 6.6 | ✓ | ||
15 | 30 | 52 | −678 | 1.0 | ✓ | ✓ | |
16 | 90 | 117 | −674 | 2.2 | ✓ | ✓ | |
17 | 111 | 50 | −666 | 2.5 | ✓ | ✓ | ✓ |
18 | 48 | 117 | −664 | 3.0 | ✓ | ||
19 | 111 | 52 | −661 | 2.4 | ✓ | ✓ | ✓ |
20 | 28 | 117 | −660 | 0.0 | ✓ | ✓ |
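The summary statistics quoted for Table 2 can be checked directly from its first four columns. The short Python sketch below transcribes the inner cage, outer cage and binding energy of each row and recomputes the energy gap between the two best pairings and the number of pairings that do not involve cage 117; the variable names are illustrative only.

```python
from collections import Counter

# (inner cage, outer cage, binding energy in kJ/mol), transcribed from Table 2.
TOP_20 = [
    (111, 117, -1023), (103, 117, -875), (4, 117, -862), (114, 117, -852),
    (117, 97, -807), (91, 117, -782), (116, 117, -781), (86, 117, -775),
    (113, 117, -732), (89, 117, -718), (56, 117, -714), (26, 117, -714),
    (71, 117, -690), (117, 81, -688), (30, 52, -678), (90, 117, -674),
    (111, 50, -666), (48, 117, -664), (111, 52, -661), (28, 117, -660),
]

# Energy gap between the best and second-best pairings.
gap = TOP_20[1][2] - TOP_20[0][2]

# Pairings in which cage 117 is neither the inner nor the outer cage.
without_117 = [(i, o) for i, o, _ in TOP_20 if 117 not in (i, o)]

# Pairings in which cage 117 is the inner rather than the outer cage.
inner_117 = [(i, o) for i, o, _ in TOP_20 if i == 117]

# How often each cage appears anywhere in the top 20.
appearances = Counter(cage for i, o, _ in TOP_20 for cage in (i, o))
most_common_two = appearances.most_common(2)
```

Within the top 20, cage 117 appears in 17 pairings and cage 111 in three, which is in line with the text's observation that these two cages occur most frequently among the favourable pairings.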
The structures of the five most energetically favoured nested cage complexes are shown in Fig. 6, with the rest of the top 20 shown in Fig. S6.† All of the top 5 complexes involve cage 117, which is shown in purple in the figures. It is likely that 117 is found in so many favourable complexes due to the large internal cavity of the cage (diameter 12.9 Å), and the fact that it has six relatively large windows (diameter 8.7 Å). The large size of the window diameter means that many of the inner cages can have their vertices aligned so as to sit inside or through the window, forming favourable intermolecular interactions with the windows of 117. The fact that 117 has six windows is also significant, as 45% of the cages in our data set are formed from [4 + 6] reactions into structures that have six vertices. Each of these vertices can then sit in one of the six windows of 117, as is the case in many of the best structures, including the top two hits (Fig. 6).
As our goal in making these computational predictions is the eventual synthetic realisation of the nested cage complexes, we now consider the synthetic route to realisation of the top 20 pairings. This will allow us to suggest the most promising targets for synthesis. We considered both the availability and ease of synthesis of the precursors, and how readily and with what yield the cages have been reported to be synthesised. With that in mind, we considered whether the cages met two separate sets of criteria. The first set of criteria includes cages that could be synthesised from commercially available precursors or precursors that are readily synthesisable in a reasonable number of steps (“List 1”). This does not consider how readily and cleanly cage formation has been reported, allowing us to include cages from our recent high-throughput screen23 that were attempted but either did not form or formed mixtures experimentally. We still include them here because synthesis within a nested cage complex could template a cage that did not previously form. The second set of criteria is more stringent, using all the initial criteria with the additional requirement that the cages have been reported to be synthesised readily and cleanly, eliminating all failed syntheses, reports of mixtures forming, and low yielding reactions (“List 2”). The complete classification of each individual cage against these criteria is given in Table S3,† and the list criteria met by each of the top 20 nested cage complexes are given in Table 2. We also visually inspected the complexes for their symmetry, denoting the complexes as high symmetry if there were similar arrangements of the molecules at each window of the outer cage (Table 2). We would expect that high symmetry complexes would be particularly favoured and therefore might have a higher chance of synthetic realisation.
The vast majority of the nested cage complexes meet the initial criteria (95%), whereas only 56% of the complexes meet the more stringent criteria. Only 40% of the cage pairings are deemed to be high symmetry, and overall only 6 of the complexes (30%) both meet the stringent criteria and are high symmetry. We would expect these six complexes to be the most promising for synthetic realisation of a nested cage complex. These complexes were ranked 1st, 4th, 5th, 7th, 17th, and 19th purely based on binding energies. Therefore, the most promising nested cage complex remains the combination of cages 111 and 117. Further, the best four of these six complexes all include 117 (as the outer cage in three of them and as the inner cage in the pairing with 97), whereas the last two complexes include cage 111 as the inner cage with two different outer cages.
The three most favourable cages for self-catenation were found to be a TCC1[6+12] large truncated tetrahedron (cage 81),54 and two related large tetrahedral cages formed from a [4 + 4] imine reaction that differ only in the alkyl functionalisation of the triamine vertex (52 and 78).23 Of these, cages 52 and 78 would be the most promising for synthesis because they were formed cleanly and in good yield. The nested cage complexes were found to be energetically much more favourable than the self-catenated complexes, and for all of the top nested cage complexes, the alternative self-catenation was never found to be energetically competitive. The large [8 + 12] imine cage from Gawroński and co-workers (cage 117)75 was found to be involved in the largest number of favourable nested cage complexes by a considerable margin, typically as the outer cage. The frequency with which 117 was involved in favourable nested cage complexes can in large part be attributed to its six large windows, which are a good symmetry match for encapsulating inner cages with six vertices, each of which can sit in one of the windows of 117; this is the case for 45% of the cages in our data set. The next most frequently found cage in the complexes was cage 111, which is a dodecaamide cage that has six vertices. The pairing with the most favourable binding energy involves cages 111 and 117, in an arrangement where pairs of naphthalene arms at each vertex of 111 sit in the windows of 117, with intermolecular interactions that make the binding energy particularly favourable.
We can now suggest the most promising routes for synthetic realisation of a nested cage complex. The most promising complex is that containing 111 and 117. Given the irreversible nature of the inner cage 111, there are two viable approaches to synthesising the nested cage complex with 117. The first would be to use the inner cage 111 as a template and attempt a one-pot reaction with the precursors for the outer cage 117 or, alternatively, the reversible outer cage could be formed separately and then mixed with the inner cage and allowed to equilibrate. A range of different solvents and additives could be trialled for both of these approaches, as well as recrystallisation screens, which is how previous organic cage catenanes were initially discovered.
However, given the high frequency with which 117 occurred in energetically favourable complexes, we would also suggest that it is worth screening for complexes combining 117 with a wider range of potential cages, particularly smaller cages with six vertices, which we find to be good partner cages for 117. It will be important to take into consideration the reversible nature of alternative organic cage partners, particularly those formed through imine condensations, where competing reactions to form new species, such as scrambled statistical distributions81 or socially self-sorted structures,82 may occur in preference to forming a nested cage complex. Furthermore, a similar screen could also be attempted with cage 111, which was found to partner favourably with many cages. We hope that by narrowing down the thousands of possibilities for nested cage complexes to just a handful, this computational study stimulates synthetic work and assists in the realisation of the first organic nested cage complex.
Footnote
† Electronic supplementary information (ESI) available: Additional figures and tables. The structures of the individual cage molecules and the best self-catenation and nested cage complexes are provided at https://github.com/JelfsMaterialsGroup/Nested-cages. See DOI: 10.1039/c9me00085b
This journal is © The Royal Society of Chemistry 2020