Hydrogen bond architecture in crystal structures of N-alkylated hydrophobic amino acids

CrystEngComm. This journal is © The Royal Society of Chemistry 2014. Cite this: CrystEngComm, 2014, 16, 9631. Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N0315 Oslo, Norway. E-mail: c.h.gorbitz@kjemi.uio.no † Electronic supplementary information (ESI) available: Additional illustrations of hydrogen bonding and crystal packing, synthetic details. CCDC 985300–985304. For ESI and crystallographic data in CIF or other electronic format see DOI: 10.1039/c4ce01412j


Introduction
The crystal structures of hydrophobic amino acids, as chiral substances or racemates, permit systematic investigations of recurring hydrogen bonding patterns, because they lack side chain functional groups that could otherwise interfere with the interactions between the charged amino and carboxylate groups. All crystal structures of hydrophobic amino acids reported to date have the same basic layout: the side chains form distinct hydrophobic layers and the polar heads form hydrophilic layers, Fig. 1a. The latter can in turn be divided into two distinct sheets, each incorporating two head-to-tail hydrogen-bonded chains. A total of five different types of sheets have been identified. 1 The L1 sheet illustrated in Fig. 1b occurs in crystal structures of enantiomerically pure amino acids, but is also observed in structures of racemates, where it is paired with its mirror-image D1 sheet in the L1-D1 layer shown in Fig. 1a. The L2 sheet in Fig. 1c, with Z' = 2, is favored for L-amino acids, while the LD sheet in Fig. 1d is reserved for racemates and quasiracemates, 2 as it contains amino acid molecules with both L- and D-configuration (thus the name).
The third amino H atom, which is not involved in hydrogen bonding within the sheet, serves to connect two adjacent sheets into a hydrophilic layer, Fig. 1a. 3 Accordingly, alkyl substitution of this hydrogen could potentially result in a series of structures with a single-sheet architecture. Surprisingly, there have been no systematic investigations into the structures of such N-alkylated hydrophobic amino acids. An overview of N-substituted amino acids in the Cambridge Structural Database (CSD, version 5.35 of November 2013) 4 is given in Table 1. Special attention has been given to structures where neither the N-substituent nor the regular side chain participates in strong hydrogen bonds. There are only five such entries in the CSD: N-methylglycine (sarcosine, CSD refcode YIHHON), 5 N-methyl-L-tryptophan (WAJBIS), 6 which is a naturally occurring substance called abrine, N-(2-pyrimidylmethyl)-L-alanine (QURSIG), 7 Nα,Nε,Nε-tri(cyanoethyl)-L-lysine (VEQZIB), 8 which is a low molecular weight dendrimer with potential antimicrobial properties, and tritylglycine (LEFPUH), 9 see Fig. 2.

Fig. 1 (a) General packing arrangement exemplified by DL-Val. 3 The C-atoms of the L-enantiomers are coloured in light grey while dark grey is used for the D-enantiomers. L1 and D1 denote the two sheets of opposite chirality constituting a hydrophilic L1-D1 layer in the structure (dashed box). Small arrows indicate the direction of the N-H bond vector in C(4) hydrogen-bonded chains 13 towards carboxylate syn lone pairs. Side chains in orange make up a hydrophobic layer. (b) Detail of a L1 sheet. A 16-membered second level $R_4^4(16)$ ring system 13 is highlighted in (a) and (b). (c) L2 layer with Z' = 2 observed for most enantiomerically pure amino acids. (d) LD sheet with amino acids of both L- and D-configuration, as observed for racemates and 1:1 complexes between closely related amino acids (quasiracemates). Orange spheres represent side chains in (b), (c) and (d).

Fig. 2 N-alkylated amino acids discussed in this paper. Previously investigated molecules 5-12 have been identified by their CSD 4 refcodes.
To evaluate the hydrogen bonding properties of more general N-alkylated hydrophobic amino acids, we have synthesized five new substances (shown in Fig. 2). These were selected to explore the effect of variable side chain bulk, including an interchange of hydrophobic groups between the amino group and the Cα carbon atom [N-isopropyl-L-phenylalanine (NiPrF) and N-benzyl-L-valine (NBzV)], and the effect of going from N-isopropyl-L-valine (NiPrV) to its racemate N-isopropyl-DL-valine (NiPrVR) as well as to the higher analogue N-isopropyl-L-leucine (NiPrL).

There are 13 plain (NiHB) structures with an endocyclic N-atom in Table 1. Three of them proved to be of interest for the present investigation: L-proline itself (PROLIN) 10 and the two derivatives GEVMOK 11 and GULCUM 12 (included in Fig. 2).

Results and discussion
Crystallographic data are listed in Table 2, while the molecular structures of NiPrV, NiPrVR, NiPrL, NiPrF and NBzV are shown in Fig. 3. Torsion angles are listed in Table 3. For NiPrL and NiPrF there are two molecules in the asymmetric unit. These differ primarily with regard to the orientation of the regular side chain. In the crystal structure of NiPrL the side chain of molecule A adopts the most common conformation for the side chain of L-Leu, with N1A-C2A-C3A-C4A (χ1) = trans and C2A-C3A-C4A-C5A/C6A (χ2,1/χ2,2) = trans/gauche+, as seen for 14 out of 15 zwitterionic L-Leu molecules (after inversion of some D-Leu molecules) in the CSD. Molecule B, on the other hand, has a unique conformation, χ1 = trans, χ2,1/χ2,2 = trans/gauche−, not previously observed for unprotected, zwitterionic L-Leu. Large deviations from the ideal staggered geometry (dihedral angles of ±60° and 180°) point to a strained molecular geometry caused by packing interactions, as will be discussed in more detail below. The two molecules in NiPrF adopt the same overall conformation, but with a >30° difference between molecules A and B for the dihedral angle C2-C3-C4-C5. The Val residues in NiPrV, NiPrVR and NBzV share a conformation with N1-C2-C3-C4/C5 (χ1,1/χ1,2) = trans/gauche+, which in crystal structures of L-Val occurs with about the same frequency as trans/gauche− (eight and seven observations, respectively). The third rotamer, gauche+/gauche−, is less common with three previous observations. The N-isopropyl group has the same orientation in all molecules in Fig. 3 except NiPrV.
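The rotamer labels used above can be illustrated with a minimal Python sketch. It computes a signed torsion angle from four atomic positions and assigns it to the nearest ideal staggered value. The sketch assumes that gauche+ denotes a torsion near +60° and gauche− one near −60° (some literature uses the opposite sign convention), and the function names `dihedral` and `classify` are hypothetical helpers, not part of any crystallographic package used in this work.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))

# Ideal staggered values (assumed labelling: gauche+ = +60, gauche- = -60).
IDEAL = {"gauche+": 60.0, "gauche-": -60.0, "trans": 180.0}

def _wrap(d):
    """Map an angular difference onto the interval [-180, 180)."""
    return (d + 180.0) % 360.0 - 180.0

def classify(angle):
    """Return (rotamer label, signed deviation from the ideal value in degrees)."""
    label = min(IDEAL, key=lambda k: abs(_wrap(angle - IDEAL[k])))
    return label, _wrap(angle - IDEAL[label])

if __name__ == "__main__":
    # Hypothetical coordinates chosen only to give a torsion of about +70 degrees.
    phi = math.radians(70.0)
    atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
             (math.cos(phi), math.sin(phi), 1.0)]
    print(classify(dihedral(*atoms)))
```

The second element of the returned pair is the deviation from ideal staggered geometry, which is the quantity the text uses as an indicator of strain.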
The unit cells and crystal packing arrangements of NiPrV, NiPrVR, NiPrL, and NiPrF are depicted in Fig. 4. Together with WAJBIS, QURSIG and VEQZIB these compounds form the anticipated molecular monolayers. In principle, one of the familiar hydrogen bonded sheets seen in crystal structures of the regular amino acids, such as L1 and L2 illustrated in Fig. 1, could have been retained for their N-alkylated counterparts, which would have led to molecular monolayers with the regular side chains and N-alkyl groups positioned on opposing faces, Fig. 5. Instead, we find that a slightly modified type of sheet is used in these structures, Fig. 6. Compared to the sheets in Fig. 1, every second N-H···O (syn) head-to-tail chain is here flipped in the opposite direction. This has the effect of positioning both side chains and N-alkyl groups on alternating sides of the hydrogen-bonded sheet, unlike the model shown in Fig. 5. At the same time, not only are the two chains with first level graph set C(4) 13 retained, but also the second level $R_4^4(16)$ sixteen-membered rings, which are related but not identical to those of the L1 sheet. As in the L1 and L2 sheets, N-H···O (anti) head-to-tail chains are parallel in Fig. 6. This means that all such chains are in fact running in the same direction in the monoclinic structures of NiPrV and NiPrL in Fig. 4, while in the orthorhombic structure of NiPrF directions are opposite in adjacent layers.
NiPrL and NiPrF both have Z' = 2, but display the same hydrogen bonding pattern as NiPrV and other structures in the group. The difference between a L1 layer with Z' = 1 and a L2 layer with Z' = 2 for regular amino acids in Fig. 1b and 1c is thus not reproduced here. This means that the increase from Z' = 1 to Z' = 2 for NiPrL and NiPrF is due to better stacking of hydrophobic groups rather than an improved hydrogen bonding arrangement. Upon closer inspection, the patterns of A and B molecules in the crystal structures of NiPrL and NiPrF in Fig. 6c and 6d are dissimilar, so these molecules adapt to the challenges of stacking their side chains in slightly different manners, in addition to the shift from a monoclinic to an orthorhombic space group.
Although we had expected NBzV to belong to the same structural family as the other four substances investigated here, its crystallization behavior (see below) already suggested that this was not the case. The crystal structure shown in Fig. 7a may at first seem to share some features with e.g. its "retroanalogue" NiPrF, but in fact there is no hydrogen bonded sheet in NBzV, only a hydrogen bonded tape composed of a series of fused 11-membered rings, Fig. 7b. The density of NBzV, 1.193 g/cm³, is slightly lower than the 1.234 g/cm³ for NiPrF, so the shift of hydrogen bonding pattern is apparently not driven by a more efficient crystal packing arrangement. In this connection, compounds with endocyclic N-atoms in five-membered rings provide interesting, additional information. Together with GEVMOK 11 and GULCUM, 12 L-proline (PROLIN) 10 takes on the same hydrogen bonding pattern as the acyclic group, Fig. 6f. If, however, the -CγH2- group is substituted with -S- (thioproline, NELSEC), 14 the pattern is shifted to that of NBzV, apparently the only other occurrence of this kind of tape together with (2R,1'S)-2-(1'-benzyl-2'-hydroxyethylamino)-4-phenylbutanoic acid (AZABEI). 15 Methylene and sulfur are of comparable sizes, showing that minute changes to molecular structure can have a profound impact on the overall crystal packing arrangement.
For PROLIN, GEVMOK and GULCUM we furthermore note that the ring system forces the side chain and the N-alkyl substituent of an individual molecule to be on the same side of a hydrogen-bonded sheet, Fig. 8e, and not on opposite sides as in Fig. 8b-8d. We believe this inherent rigidity is responsible for the fact that 10 out of 13 compounds with an endocyclic N-atom in Table 1 form other types of hydrogen bonding patterns. Two previous structures identified in Table 1 also do not share the hydrogen bonding pattern shown in Fig. 6. Sarcosine (YIHHON), 5 which has a single methyl group only, is able to generate a high density crystal structure (1.309 g/cm³) with a three-dimensional hydrogen bonding pattern. LEFPUH, 9 on the other hand, has three bulky phenyl groups that are incompatible with formation of even a 2-D hydrogen bonded sheet. Hydrogen bonding is in this case reduced to a single 1-D chain that leaves one of the two NH hydrogen bond donors unused (instead, the aromatic groups serve as acceptors in weaker interactions).
The hydrogen bonding pattern in Fig. 6 is very flexible and can accommodate a wide range of side chains and N-substituents. The periodicity along the N-H···O (syn) interaction is almost constant, with a very limited 0.06 Å range for the values listed in Table 4, but the periodicity along the N-H···O (anti) interaction varies considerably over a 1.19 Å range. Obviously, hydrogen bond lengths in the two perpendicular directions are affected, as seen from the list of N···O distances in Table 4 (complete hydrogen bond geometries are available as supplementary material). The explanation for this variability lies in the size and shape of the hydrophobic moieties. WAJBIS 6 and QURSIG 7 are interesting in that both compounds have methyl groups. For regular, hydrophobic amino acids methyl groups are usually too small to participate in layer formation. Our own efforts to prepare quasiracemic L:D amino acid complexes with Ala as one of the components were uniformly unsuccessful except for L-Ile:D-Ala (FITHIZ), 16 where a nice fit is observed between surfaces where Ile ridges fit into Ala grooves, Fig. 8a. For WAJBIS, which represents an extreme value for the periodicity along the N-H···O (anti) interaction at 8.595(1) Å, Fig. 6e, the methyl groups serve only to fill voids close to the polar heads, the surface of a molecular monolayer essentially being formed by the bulky Trp indole groups, Fig. 8b. For QURSIG, Fig. 8c, the situation is rather similar, although the roles of the two hydrophobic groups have been reversed. The side chain methyl group is, however, more exposed than the N-methyl group of WAJBIS and forms intermolecular contacts across the hydrophobic layer as in the L-Ile:D-Ala complex.
Color coding as in Figure 3. The C-atoms in the ring system of L-proline appear in light green.

The isopropyl groups of NiPrV evidently build considerable strain into the network, as the periodicity along the N-H···O (anti) interaction reaches the maximum value of 9.788(4) Å, Fig. 6a. A further increase in side chain bulk to isobutyl in NiPrL or benzyl in NiPrF forces a Z' = 2 system. VEQZIB, 8 also with Z' = 2, illustrates how the separation between hydrophilic layers can be increased dramatically when the combined size of the two hydrophobic groups in the molecule becomes very large, Fig. 8d.
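As a quick consistency check on the figures quoted in the text, the spread of the N-H···O (anti) periodicity can be worked out from the two extreme values. This assumes that the WAJBIS value of 8.595(1) Å is the lower extreme and the NiPrV value of 9.788(4) Å the upper extreme among the entries in Table 4, which is not reproduced here:

```latex
\[
\Delta_{\mathrm{anti}} = 9.788\ \text{\AA} - 8.595\ \text{\AA} = 1.193\ \text{\AA} \approx 1.19\ \text{\AA}
\]
```

Under that assumption the result agrees with the 1.19 Å range stated for this direction, roughly twenty times the 0.06 Å spread along the N-H···O (syn) direction.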
A most interesting comparison can be made between the structure of NiPrV and its racemate NiPrVR. Although examples exist, such as L-Ala and DL-Ala, 17 it is very uncommon for a racemate to share its hydrogen bonding pattern and general packing arrangement with the corresponding enantiomerically pure compound. This is the situation for NiPrV and NiPrVR, as can be seen from Fig. 4a and 4b and Fig. 6a and 6b, respectively. Compared to the structure of NiPrV, the roles of the side chain and the N-isopropyl group with respect to space filling are reversed in the structure of NiPrVR, rendered possible also by a change in N-isopropyl conformation as described above. It follows that the display of isopropyl groups on both sides of a molecular monolayer gives two very similar surfaces, Fig. 9a and 9b. NiPrL is quite similar. The transition to Z' = 2 can be seen (Fig. 9c) as the result of an alternating set of conformations for the side chains. The surface of NiPrV is also closely reminiscent of the hydrophobic surface of a molecular double-layer in DL-Val, Fig. 1a and 9d, while the corresponding surface for L-Val 18 shows yet another way of arranging these groups (illustration available as supplementary material).

Experimental
NiPrV, NiPrL and NiPrF were prepared by reductive amination of acetone with the corresponding proteinogenic amino acids, using sodium cyanoborohydride (NaBH3CN) as reducing agent. NiPrVR was prepared by mixing equimolar amounts of L-NiPrV and D-NiPrV. NBzV was prepared by reductive amination of benzaldehyde with L-Val. See the supplementary material for experimental details. Single crystals of NiPrV, NiPrL and NBzV were obtained by dissolving about 0.2-0.5 mg of each compound in 30 ml of water in a small test tube, which was subsequently sealed with Parafilm®. After pricking a hole in the film with a needle, the tube was set to equilibrate inside a larger tube filled with about 1 ml of acetonitrile. The same method was used for NiPrF, but with hexafluoro-2-propanol as the solvent and water as the precipitating agent. Well-shaped crystals formed within a week for all substances except NBzV, which yielded long (several mm) but exceedingly thin needles that easily bent and fractured. These also had a pronounced right-handed twist. One needle was cut with a scalpel to 0.30 mm length for collection of X-ray data. X-ray data were collected on a Bruker SMART APEX II CCD diffractometer equipped with an Oxford Cryostream low temperature device. Data integration/reduction and absorption correction were carried out by the programs SAINT and SADABS, 19 respectively, while SHELXTL 20 was used for refinements. No structural disorder was discovered. Heavy atoms were refined with anisotropic displacement parameters, whereas all hydrogen atoms were constrained to theoretical positions. Some restraints were used in the refinement of NBzV; details are in the SHELXL .res file which is part of the submitted CIF. Molecular images and packing graphics were generated using the program Mercury. 4,21

Conclusions
In summary, the available set of ten related structures, including six compounds studied previously and four out of the five studied here, demonstrates a remarkably robust and versatile hydrogen bonding arrangement for N-alkylated amino acids that is compatible with a wide range of hydrophobic side chains and N-alkyl groups.