Predicting solvent effects on the structure of porous organic molecules

The influence of solvents on the structure and properties of biological and chemical systems is diﬃcult both to determine and to predict. In simulations the solvent is frequently either ignored or treated as a continuous medium. However, we know that solvents can play vital roles in biology and chemistry and that tools that could predict solvent eﬀects would have impacts in fields as diverse as materials and drug discovery. Here we develop an approach for the specific problem of predicting the voids formed within the molecular structure of porous organic molecules in solution or in the solvate solid state. Porous organic molecules are an alternative class of porous materials to extended networks such as zeolites and metal organic frameworks (MOFs). They lack chemical bonding in 3-dimensions and instead, as a result of internal cavities, have pores even in solution, which can then be assembled in the solid state. 1–3

The influence of solvents on the structure and properties of biological and chemical systems is difficult both to determine and to predict.In simulations the solvent is frequently either ignored or treated as a continuous medium.However, we know that solvents can play vital roles in biology and chemistry and that tools that could predict solvent effects would have impacts in fields as diverse as materials and drug discovery.Here we develop an approach for the specific problem of predicting the voids formed within the molecular structure of porous organic molecules in solution or in the solvate solid state.Porous organic molecules are an alternative class of porous materials to extended networks such as zeolites and metal organic frameworks (MOFs).][3] These materials have some potential advantages over porous network materials, particularly in terms of their solubility.There are processing advantages and their inherently modular nature allows for property tuning through varying the ratios of different 'modules'. 4Examples include calixarenes, cyclodextrins, cryptophanes, cucurbiturils and 'cages'.Together these show promise for applications in catalysis, 5 sensing 6,7 and separations. 8,9s attempts are made to synthesise new porous organic molecules, in particularly those with larger pores, an increasing problem is a lack of 'shape persistence'.In the absence of solvent, the large number of flexible torsions allow the molecule to undergo collapse, which is often associated with the complete loss of the pore. 10Fig. 1 shows an example of this for the porous organic imine cage CC8, which can be synthesised through imine condensation of tris(4-formylphenyl)amine and (R,R)-1,2cyclohexanediamine (see Fig. S1 for all reaction schemes, ESI †). 10 The collapse process is important as, if solvent is deleted in silico from a solvate crystal structure, then materials may superficially appear highly porous. 11However, this of course neglects the structural collapse or phase change these molecules will frequently undergo upon desolvation.Whilst these nonshape persistent materials might seem of little interest, they could have attractive properties in solution, such as encapsulation, separation 8 or optical sensing. 7For such molecules, prediction of solvate molecular structures would allow for rational design of synthetic modifications to prevent their loss of porosity. 12This prediction thus forms an integral part of any in silico molecular design process. 13It has previously been shown that it is possible to use simulations to predict the odd-even effect of precursor alkane chain length upon the molecular mass of the resultant porous organic imine cages 14 and furthermore that, if the two-dimensional chemical structure is known, the 3-dimensional crystal structure can be predicted using polymorph prediction techniques. 15,16Thus far, however, solvent effects have not been incorporated into these prediction processes despite examples where both molecular mass 17 and polymorph 18 have been affected by solvent choice.Prediction of structural and property changes upon solvation may also open the door to prediction of stimuli-response materials.Successful predictions of the solvate stoichiometry and structure of pharmaceuticals have been performed, although including solvent considerably increases the computational expense of these methods. 19Furthermore, whilst these simulations involve 1 or 2 solvent molecules per 'host' molecule, for porous molecular systems there can be more than 70 solvent molecules per host, 10 which would represent a significant challenge.For zeolitic materials, de novo design of templates to guide towards desired topologies has had some success. 20,21However, these successes have not been replicated for self-assembled systems such as MOFs and porous molecular materials.In addition, many MOFs exhibit phase changes between 'open' and 'closed' structures in response to guest-loading, temperature or pressure, the archetypal example being MIL-53. 22If the 'open' and 'closed' structures are known, then the external conditions, such as the pressure, required for the transition to occur can be predicted through a thermodynamic framework. 23Issues with MOF stability with respect to structural collapse are well known and, if forcefields are available, the lack of structural stability can be predicted using atomistic simulations that start from an initially 'open' phase. 24Predicting the structures of 'open' phases that form in response to guest loading is, by contrast, more difficult. 25e demonstrate that constrained molecular dynamics (MD) simulations can predict the 'open' form of several non-shape persistent porous organic molecules by reproducing reported solvate structures.We restrict ourselves to porous organic molecules that retain a defined cavity in the presence of guest molecules, whereby this open conformation is metastable with respect to the collapsed conformations and not significantly affected by crystal packing forces.We investigate several reported porous imine cages (Fig. S1, ESI †), including CC8, 10 the endo-functionalised cage of Mastalerz and co-workers, 26 the fluorescent cage of Mukherjee and co-workers 7 and two hypothetical imine cage molecules; the [6+9] and [8+12] versions of the reported [4+6] CC3 molecule. 27In addition, we look at two other classes of porous organic molecules; the glycolurilbased band-like macrocycles known as cucurbiturils, 28 and cryptophanes, 29 formed from two interconnected cyclotriveratrylenes cups.We include the 'parent' cucurbituril, CB6, 30 bis-nor-sec-CB10 (CB10ns), which lacks two methylene bridges compared to CB10, resulting in two interconnected cavities rather than one, 31 and cryptophane-A, where the single-crystal X-ray diffraction (SCXRD) structures of Xe and water-loaded conformations and the collapsed conformation have recently been reported. 32The underlying topology of these molecules covers a range of polyhedra, including tetrahedron, cube and triangular prism (Fig. S2, ESI †).None of these molecules retain a large cavity in their low energy structures in the absence of solvent, although the Mastalerz cage and CB6 retain a smaller void.
Conformer search algorithms can not discover the open conformations of the porous organic molecular systems, as they typically lie 4100 kJ mol À1 above the low energy collapsed conformations and these high energy regions of configuration space are not adequately sampled, if at all.As detailed in the ESI † (Section 1.8), we tried a number of different methods before settling on a simple approach that involved applying an energetic penalty if any atom from the molecule was within a certain distance of the molecule's centre of mass, x c .In doing this we calculate a smoothed version of the minimum distance, s, between each atomic position, x i , and x c using: where b ensures the function's derivatives with respect to the atomic positions are continuous.A harmonic restraint: with force constant k was added to the Hamiltonian for the system to encourage the minimum distance, s, to take on particular values.This procedure is conceptually equivalent to inserting a spherical probe, of radius s 0 , into the centre of the molecule and requiring that no atom enters this region.This method was successful when combined with simulated annealing (SA) to allow the system to sample configuration space adequately.By repeating the simulations with multiple different s 0 values, typically integer values between 3-8 Å, we were able find the lowest energy conformation.Our approach thus allows us to predict the conformation the molecules adopt in solution or in the solvate solid state ab initio, that is in the absence of any experimental input.The simulations were typically started from a collapsed conformation, although the starting point was not important due to the SA procedure.The OPLS all-atom forcefield 33 was used, which we have previously used for the prediction of conformer energetics of porous organic imine cages. 14All MD simulations were performed with DL_POLY2.20, 34the velocity verlet algorithm, a time step of 0.7 fs and an intermolecular cutoff of 12 Å.PLUMED2 35 was used to apply the constraints.During the first 14 ps of a 10 ns MD simulation, the force constant was scaled from 0 to 50 000 kJ mol À1 nm À2 .These calculations were run for a range of constraint sizes and sampled every 0.35 ps.The resulting B30 000 structures were geometry optimised using MacroModel (see ESI †).Only those with a spherical void (between 2-4 Å, dependent on the system) were retained.Duplicates were removed and whilst some inflated structures would collapse during the minimisation, typically B30% were found to be local open minima.Finally, SA was applied on the constraint size that gave the lowest energy conformation from the 300 K simulations.This involved 500 ps simulations at each temperature, with increments of 50 K between 300 and 500 K.This was repeated a minimum of 10 times, with structures sampled during the 300 K simulations.
We first examine systems where we can make a comparison to the solvate crystal structures.CC8 is the largest system, with 272 non-hydrogen atoms.The low energy structure does not contain a void because it has vertices folded inwards.Experiments have thus shown that the desolvated material is non-porous. 10Constrained MD with a probe radius of 6 Å was found to give the lowest energy open conformation (Fig. S3, ESI †).This configuration had a final void size of 4.6 Å, which is smaller than the probe radius because we subtract the van der Waals radius when calculating this quantity, but not when calculating the constraint.The low energy, open conformation is 4113 kJ mol À1 higher in energy than the collapsed conformation (Fig. 1) and has pseudo-octahedral symmetry.This is a good match to the experimental structure, which is also pseudo-octahedral, but not completely symmetric, presumably because of the high degree of solvation (B70 solvent molecules per cage).The similarity index between the simulated and experimental structure, calculated via the radial distribution function of the interatomic distances of all nonhydrogen atoms, is 58%, for further details refer to Section 1.7 of the ESI.† As shown in the overlay in Fig. 2, the main difference is that the simulated structure has vertex pairs that are contracted towards each other by B2 Å.Interestingly, when we geometry optimise the experimental solvate conformation, we see the same contraction of the vertices (Fig. 2) and a similarity of 98% to our structure, suggesting the difference is a crystal packing effect we can not expect to account for here.Nevertheless, the prediction of conformation, shape and void size is in good agreement with experiment.
The Mastalerz imine cage has tetrahedral topology from a [6+4] reaction.We found in MD simulations that the tetrahedral symmetry was not maintained and there was a partial collapse of the void, although a smaller void of 2.1 Å radius is retained (compared to B5 Å in the solvate crystal structure).This partial collapse of the intrinsic pore in the absence of solvent may explain why the reported Brunauer-Emmett-Teller (BET) surface area of 1377 m 2 g À1 for this system 36 is so much smaller than the computed solvent accessible surface area of 44000 m 2 g À1 for a N 2 probe (this holds for radii of 1.55-1.82Å, spanning the van der Waals and kinetic radius of N 2

37
).When the probe radius was set to 7 Å or below, the lowest energy open conformation was found.For larger probe radii, higher energy structures were found (Fig. S4, ESI †).The open conformation is 4158 kJ mol À1 above the partially collapsed conformation and has a void radius of 5.2 Å (Fig. 3).This conformation is in good agreement with the solvate crystal structure (Fig. S5, ESI †), with a similarity index of 85% and endo-functionalisation of the alcohol groups.The very minor torsional differences could be accounted for through either crystal packing effects or minor forcefield inaccuracies.
The Mukherjee cage is a [3+2] cage with phenyl groups pendant on each of the 3 vertices, and 2 trimethylbenzene faces.In the absence of solvent we found this amine cage to completely collapse during the MD simulations.However, we found an open, highly symmetric structure with a void radius of 3.6 Å for all probe sizes between 2-8 Å (Fig. 3).The lowest energy structure has a cage with the same core as the solvate crystal structure (Fig. S5, ESI †), but with the pendant vertices arranged more symmetrically, towards a C 3 symmetry -the solvate crystal structure has the vertices arranged in a T-shape.These small differences result in a similarity index of only 89%.We expect the lower symmetry crystal structure is due to packing forces and that our more symmetric conformation would be dominant in solution, where this molecule acts as a chemical sensor for the explosive picric acid. 7These conformational changes between the collapsed, solution and solvate solid state structure undoubtedly have the potential to impact upon chemical sensing.
CB6 is known to have potential inversion of glycoluril units, 38 in fact we find that every other glycoluril is inverted in the lowest energy 'collapsed' conformation of CB6 (Fig. 3).For constrained simulations with probes above 2.5 Å, we find exclusively the open conformation without any glycoluril inversion, which is an excellent match to the SCXRD structure (similarity 99.8%), Fig. 3 and Fig. S6 (ESI †).For CB10-ns, we find the collapsed conformation to have no internal cavity.The SCXRD solvate structure has two distinct cavities, which are not at the molecule's centre of mass.Nevertheless, if we use large probe sizes (16 Å), we exclusively find the open conformations to have the reported dual symmetrical interconnected cavities, Fig. 3 and Fig. S6 (ESI †).The similarity is 93%, but as with CC8, this increases to 99.7% if we compare our conformation to the geometry optimised SCXRD structure.Cryptophane-A is the only system studied here where a collapsed SCXRD structure has been reported. 32Our constrained calculations correctly predict the open anti conformation of cyclotriveratrylenes (Fig. 3 and Fig. S6, ESI †).We also found different probe sizes can match to the Xe-loaded or water-loaded SCXRD structures, which vary in cavity size, with matches of 94% (Xe) and 91% (water), which increase upon geometry optimisation (96% and 96%, respectively).The change in cavity size upon loading with  different guests 32 is understandable given we found conformations with a range of spherical cavities (radius 2.0-2.5 Å).
Following success with the observed molecules, we looked at two hypothetical imine cages.These are formed from the same chemistry as the reported tetrahedral [4+6] CC3 cage, but they have cubic, [8+12], and trigonal prismatic, [6+9], topologies.For [8+12] we found different structures with different probe sizes (Fig. 4 and Fig. S7, ESI †).Smaller probes gave energetically unfavourable partially collapsed structures, while larger probes gave 'overinflated' high energy conformations.Between 4.5-5.5 Å, we found low energy open conformations, with pseudo-octahedral symmetries that were similar in shape to the CC8 molecule.The [6+9] molecule had the potential to adopt a triangular prism shape, so we first used a cylindrical rather than a spherical probe, but found this was not necessary to find prismatic structures (Fig. S7 and S8, ESI †).We consistently found that very large spherical probes (49 Å) gave highly overinflated structures, but that they can subsequently minimise to symmetric open conformations that lack spherical cavities, albeit with poor sampling.
In conclusion, we have developed an approach through which the open, metastable, conformations of porous molecules in solution or the solvate solid state can be predicted.The approach is guided through the determination of the lowest energy structure and thus requires no experimental input.Alternative shape probes can also allow for the opening of different geometry pore systems.The method has the potential to be used more broadly in the prediction of guest responsive materials, in particular for porous materials such as MOFs.For porous organic molecules our results demonstrate that the solvent is acting chiefly as a 'scaffold', whose effects can be reproduced without any specific chemical interactions.The prediction of the open molecular conformations is a step towards predicting the properties of these systems in solution, for example the chemical sensing of the Mukherjee cage.This capability is also a significant development for the in silico design of these materials, as successful prediction of the reaction outcome is possible through knowledge of the solution structures.

Fig. 1 A
Fig.1A comparison of (left) low energy, voidless, conformation of CC8 and (right) higher energy structure with a void of B5 Å radius in grey.

Fig. 2 (
Fig. 2 (left) Overlay of the CC8 computed open structure (blue) and solvate crystal structure (red).(right) Overlay of the CC8 computed open structure (blue) and the geometry optimized solvate crystal structure (dark red).Hydrogens are not shown.

Fig. 3
Fig. 3 The minimum energy collapsed structures (left) and lowest energy open conformations (right) found here for reported systems.