Anton I. Isakov a, Heike Lorenz *b, Andrey A. Zolotarev Jr a and Elena N. Kotelnikova a
aDepartment of Crystallography, Saint Petersburg State University, Universitetskaya emb. 7/9, 199034 Saint Petersburg, Russia
bMax Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, 39106 Magdeburg, Germany. E-mail: lorenz@mpi-magdeburg-mpg.de; Tel: +49 391 6110 293
First published on 14th January 2020
A classification of discrete compounds in binary chiral systems of single and different substances (homo- and heteromolecular compounds) is presented. It considers both chemical and crystallographic characteristics of such compounds. Using amino acids as chiral model systems, features of crystal structures of equimolar and non-equimolar heteromolecular compounds are reviewed. In this connection, the concept of homo- and heteromolecular dimers in compounds formed by amino acids is introduced and analyzed. As a result, a correlation between the molecular dimer type and the side chain structure (linear or branched) and the conformation (extended or folded) of the relevant compounds' molecules is derived. Two non-equimolar discrete heterocompounds discovered in the chiral systems L-valine–L-isoleucine and L-valine–L-leucine are discussed using the concepts proposed. The previously studied first system features a 2:1 (Val:Ile) compound. A newly investigated second system is found to form a 3:1 (Val:Leu) compound, and its crystal structure is described and evaluated.
Size and shape of molecules and geometry of intermolecular hydrogen bonds are general factors influencing the molecular packing in crystal structures of organic substances.2,3 In the case of chiral substances, the molecule configuration becomes the main additional factor for molecular packing.
Enantiomers are widely used in pharmaceuticals, the food industry and electronics. For example, in 2004, nine of the ten best-selling drugs contained chiral active ingredients.4 Furthermore, in the same year, 13 of the 16 newly approved synthetic drugs were chiral, all of them single enantiomers.5 However, non-stereoselective industrial synthesis yields mixtures of both enantiomers. Therefore, the applicability of chiral substances (for example, in API production) is closely connected to the problem of resolving enantiomeric mixtures. The most profitable resolution methods are crystallization methods, which require an understanding of the phase equilibria in a system and the plotting of its phase diagram.6
Most of the published information on chiral systems refers to enantiomers of a single substance, while reports on enantiomeric systems of different substances are rather rare. Amino acids are suitable model compounds for investigating binary chiral systems of the latter type. Owing to their great diversity and the relatively simple structures of their molecules, amino acids have already served as model compounds, for example, for the determination of hydrogen bond lengths in proteins and other biopolymers.7,8
The vast variety of proteins is a result of the combination of twenty proteinogenic α-amino acids, nineteen of which are chiral compounds. Among them, seventeen molecules contain only one asymmetric carbon atom and, consequently, exist as two enantiomers. Molecules of the remaining two amino acids, threonine and isoleucine, contain two asymmetric carbon atoms and thus exist as four stereoisomers, which form two enantiomer pairs and four diastereomer pairs. Eight proteinogenic α-amino acids are essential amino acids, which cannot be produced by the human body and, therefore, must be consumed. The subjects of our study, the aliphatic amino acids valine (C5H11NO2), leucine (C6H13NO2), and isoleucine (C6H13NO2), belong to this group.
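The pair counts quoted above for threonine and isoleucine can be checked by simple enumeration. The following is a minimal sketch, assuming that every asymmetric centre is independent and that no meso forms occur (which holds for these two amino acids); the function name `stereoisomer_pairs` and the R/S labelling are illustrative choices, not part of the original work.

```python
from itertools import combinations, product


def stereoisomer_pairs(n_centres):
    """Count stereoisomers, enantiomer pairs and diastereomer pairs.

    Assumption: all centres are inequivalent, so no meso forms exist.
    """
    # Each stereoisomer is labelled by the descriptors of its centres.
    isomers = list(product("RS", repeat=n_centres))

    def mirror(isomer):
        # The mirror image inverts every centre.
        return tuple("S" if c == "R" else "R" for c in isomer)

    enantiomer_pairs = 0
    diastereomer_pairs = 0
    for first, second in combinations(isomers, 2):
        if mirror(first) == second:
            enantiomer_pairs += 1
        else:
            diastereomer_pairs += 1
    return len(isomers), enantiomer_pairs, diastereomer_pairs


one_centre = stereoisomer_pairs(1)   # e.g. valine, leucine
two_centres = stereoisomer_pairs(2)  # e.g. threonine, isoleucine
```

For one centre the sketch gives two stereoisomers forming a single enantiomer pair; for two centres it gives four stereoisomers, two enantiomer pairs and four diastereomer pairs.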
In the present work, we (1) introduce a classification of discrete compounds formed in chiral systems (section 2), (2) discuss the chiral molecular packing in crystal structures of amino acids and their equimolar heteromolecular compounds (sections 3 and 4), and (3) present detailed studies of mainly two levorotatory amino acid systems, L-valine–L-leucine and L-valine–L-isoleucine, as examples of binary chiral systems forming non-equimolar heteromolecular compounds (section 5).
The revised classification is shown as a flow diagram in Fig. 1. We implemented significant changes: we incorporated the proposed concepts of homo- and heteromolecular dimers (see sections 4.1 and 4.2) and treated diastereomers as different substances, since they differ in molecular structure and physical characteristics. However, the key change is an improved order of the basic chemical and crystallographic classifying features that define the hierarchy of chiral compounds.
Fig. 1 Classification of discrete compounds in binary systems of chiral molecules. Corresponding related and historical terms6,10–17 are given with references.
The chemical composition of a compound serves as the first classifying feature, and the molar ratio of its components as the second. Consequently, in Fig. 1, compounds formed in binary chiral systems are first divided into two main types: homomolecular and heteromolecular discrete compounds. Each type is then split into equimolar and non-equimolar compounds.
In the case of homomolecular compounds (homocompounds), the third classifying feature is the character of the molecular packing in the crystal structure. Accordingly, equimolar homocompounds are divided into compounds with symmetry-related molecular components (centrosymmetric and non-centrosymmetric ones) and compounds with symmetrically independent molecular components. Examples of centrosymmetric and non-centrosymmetric equimolar homocompounds are the two polymorphs of the true racemate of malic acid, RSI and RSII, respectively.9 An example of a homocompound containing symmetrically independent molecules is DL-allylglycine.18
Non-equimolar homocompounds, in turn, can be divided into compounds having ordered and disordered molecular positions. An exemplary compound with ordered molecular positions is the non-equimolar discrete compound S3R of malic acid.9,19 To the present authors' knowledge, there is no available information in the literature about any non-equimolar homocompounds with disordered molecular positions.
In the case of heteromolecular compounds (heterocompounds), the third classifying feature is the chirality of the molecular components. Accordingly, equimolar heterocompounds are divided into compounds composed of molecular components with different chiralities and ones with the same chirality. The crystal structure of heterocompounds of either type can consist of homomolecular or heteromolecular dimers.
In other words, systems with components having different chiralities can produce compounds containing homomolecular dimers (e.g., L-valine–D-norvaline, see section 4.1) and compounds containing heteromolecular dimers (e.g., D-valine–L-leucine, see section 4.2). Heterocompounds with homomolecular dimers are found not only among amino acids but also among other chiral organic compounds.20–25 Likewise, systems of components having the same chirality can form compounds containing heteromolecular dimers (e.g., L-malic acid–L-tartaric acid26) and, possibly, compounds consisting of homomolecular dimers (no examples have been found yet in the published literature).
Similarly, non-equimolar heterocompounds are divided into compounds composed of molecular components with the same or different chiralities. The latter case is found in heterocompounds formed in the (+)-2,4-dimethylglutaric acid–(−)-dilactic acid and (−)-2,4-dimethylglutaric acid–(+)-2-methylglutaric acid systems.14 In contrast, the molecular components of non-equimolar heterocompounds found by the present authors in the systems of L-valine–L-isoleucine9,27 and L-valine–L-leucine (see section 5) have the same chirality.
For the description of the crystal structures of amino acids, the terms “dimer” and “molecular layer”9,27,36–38 will be used. The enantiomer L-valine is used here as a representative example of a typical aliphatic amino acid. Fig. 2a shows the projection of the L-valine crystal structure onto the ac plane. In the crystal structure, it is possible to distinguish dimers consisting of two levorotatory valine molecules interlinked with hydrogen bonds (Fig. 2b). Pairing of molecules with formation of a dimer is typical for hydrophobic amino acids.7 Each dimer is bonded to neighboring dimers in the ab plane via hydrogen bonds. Thus, the thickness of the resulting molecular layer in the direction of the c axis is equal to one dimer molecule (two valine molecules)‡ (Fig. 2a). The layers are interlinked via van der Waals bonds.
Fig. 2 Projection of the L-valine crystal structure on the ac plane of the monoclinic cell (a) and its dimer molecule (b). Hereinafter, dotted lines are the hydrogen bonds. Images are constructed in the program Mercury40 using the structural data CSD LVALIN01.33

Fig. 3 Homomolecular dimers of L-valine (a) and D-norvaline (b) and projection on the ac plane of the crystal structure of the equimolar discrete compound in the L-valine–D-norvaline system (c). Images are constructed in the program Mercury40 using the structural data CSD BERQEU.31

Fig. 4 L-Leucine–D-valine dimer molecule (a) and projection on the bc plane of the crystal structure of the equimolar heterocompound in the D-valine–L-leucine system (b). Images are constructed in the program Mercury40 using the structural data CSD BERPET.31
1b. Different types of side chains in molecular components

No. | Branched molecule (component 1) | Linear molecule (component 2) | Ref. |
---|---|---|---|
1 | L-Valine | D-α-Aminobutyric acid | 31 |
2 | L-Valine | D-Norvaline | 31 |
3 | L-Valine | D-Methionine | 31 |
4 | L-Valine | D-Norleucine | 16 |
5 | L-Isoleucine | D-Alanine | 29 |
6 | L-Isoleucine | D-Norleucine | 29 |
7 | L-Isoleucine | D-Norvaline | 29 |
8 | L-Isoleucine | D-Methionine | 29 |
9 | L-Isoleucine | D-α-Aminobutyric acid | 29 |
10 | L-allo-Isoleucine | D-Norleucine | 16 |

2. Equimolar heterocompounds of amino acids with heteromolecular dimers

2a. Same type of side chains in molecular components

No. | Branched molecule (component 1) | Branched molecule (component 2) | Ref. |
---|---|---|---|
1 | L-Leucine | D-Valine | 31 |
2 | L-Leucine | D-allo-Isoleucine | 7 |
3 | L-Isoleucine | D-allo-Isoleucine | 30 |
4 | L-Isoleucine | D-Valine | 29 |
5 | L-Isoleucine | D-Leucine | 29 |
Amino acid | Conformation in enantiomer L | Conformation in true racemate LD | Conformation in heterocompound LD′ |
---|---|---|---|
Valine | Extended and folded | Folded | Extended |
Isoleucine | Extended and folded | Folded | Extended |
allo-Isoleucine | Extended and folded | No data | Extended |
In Table 1, the L and D components of a heterocompound are arranged in separate columns. The equimolar heterocompounds of aliphatic α-amino acids with homo- and heteromolecular dimers are listed in parts 1 and 2 of Table 1, respectively. Each part comprises two categories, distinguished in accordance with the structure of radical R or the structure of the side chain of the corresponding α-amino acid (see Fig. 5 and 6). The side chain can be branched or linear. For example, molecules of valine (C5H11NO2), leucine (C6H13NO2), and isoleucine (C6H13NO2) (Fig. 6a) have a branched side chain, while molecules of norvaline (C5H11NO2), norleucine (C6H13NO2), and methionine (C5H11NO2S) (Fig. 6b) possess a linear one.
Fig. 6 Structural formulae of certain aliphatic α-amino acids with branched (a) and linear (b) side chains (colored in green).
In the heterocompounds with homomolecular dimers (Table 1, part 1), category 1a includes two compounds, wherein both molecular components are linear. Category 1b comprises ten compounds with the L component being branched and the D component linear. In the heterocompounds with heteromolecular dimers (Table 1, part 2), category 2a includes five compounds, wherein both constituents are branched. Category 2b comprises four compounds wherein the L component is branched, while the D component is linear.
This method of examining the equimolar heterocompounds of amino acids reveals a connection between the type of molecular dimers (homo- or heteromolecular) and the side chain structure of the molecular components (branched or linear). As seen in Table 1, all the heterocompounds composed of molecular components with linear side chains belong to group 1, i.e. have homomolecular dimers. In turn, all the heterocompounds consisting of components with branched side chains belong to group 2, i.e. feature heteromolecular dimers.
More complex is the case when one of the heterocompound components is characterized by a branched molecule and the other by a linear molecule. Compounds of such a type are present in both parts in Table 1. Affinity of a particular compound toward part 1 or 2 likely depends on the nature of the branched molecular component. It turned out that if the branched component is valine, isoleucine, or allo-isoleucine (part 1b, Table 1), then the heterocompound has homomolecular dimers. In turn, if the branched component is leucine (part 2b, Table 1), then the resulting heterocompound has heteromolecular dimers.
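The empirical trend described above for aliphatic amino acids can be written down as a small lookup. This is only a sketch of the observed correlation, not a predictive model: the function name `dimer_type` and the two name sets are illustrative, the side-chain assignments follow Table 1 and Fig. 6, and the rule covers only the compounds surveyed here.

```python
# Side-chain types as assigned in the text (Table 1 and Fig. 6).
BRANCHED = {"valine", "leucine", "isoleucine", "allo-isoleucine"}
LINEAR = {"alanine", "alpha-aminobutyric acid", "norvaline",
          "norleucine", "methionine"}


def dimer_type(component_1, component_2):
    """Return the dimer type observed for an equimolar L/D heterocompound.

    Encodes the empirical trend for aliphatic amino acids only.
    """
    names = (component_1.lower(), component_2.lower())
    for name in names:
        if name not in BRANCHED and name not in LINEAR:
            raise ValueError(f"side-chain type not tabulated: {name}")

    branched = [name for name in names if name in BRANCHED]
    if len(branched) == 0:
        # Both components linear: homomolecular dimers (group 1).
        return "homomolecular"
    if len(branched) == 2:
        # Both components branched: heteromolecular dimers (group 2).
        return "heteromolecular"
    # Mixed case: the outcome depends on the branched component.
    return "heteromolecular" if branched[0] == "leucine" else "homomolecular"


examples = {
    ("valine", "norvaline"): dimer_type("valine", "norvaline"),
    ("leucine", "valine"): dimer_type("leucine", "valine"),
    ("leucine", "norvaline"): dimer_type("leucine", "norvaline"),
}
```

Here the L-valine–D-norvaline pair maps to homomolecular dimers, while both pairs containing leucine map to heteromolecular dimers, in line with parts 1b, 2a and 2b of Table 1.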
A probable explanation is based on the following. There are two conformations of molecules in the crystal structures of valine, isoleucine, and allo-isoleucine (Table 2). Relative to each other, one of the conformations can be considered as “extended” (Fig. 7a) and the other one as “folded” (Fig. 7b). The differences between the extended and folded conformations are clearly seen from the comparison of the corresponding torsion angles in Fig. 7.
Fig. 7 Extended (a) and folded (b) conformations of valine, isoleucine and allo-isoleucine molecules. Images are constructed in the program Mercury40 using the structural data CSD LVALIN01, LISLEU02 and DAILEU01, respectively.
In the case of the equimolar homocompounds (true racemates LD, Fig. 1), molecules of valine, isoleucine, and allo-isoleucine adopt the folded conformation. In contrast, in the case of the equimolar heterocompounds (“quasiracemates” LD′, Fig. 1), they have the extended conformation. This is confirmed by the examples of four heterocompounds containing valine, five heterocompounds containing isoleucine, and one heterocompound comprising allo-isoleucine (see Tables 1 and 2). Another situation occurs in the case of leucine. All the molecules in the crystal structure of L-leucine are identical and, consequently, there is no need to attribute a conformation. It was found that all the heterocompounds containing leucine and linear molecules of other amino acids form heteromolecular dimers (see 2b in Table 1).
Table 3 summarizes the equimolar heterocompounds in which the L component is the aromatic α-amino acid phenylalanine (C9H11NO2) and the D component is an aliphatic amino acid. Similar to Table 1, Table 3 consists of two parts comprising heterocompounds with homo- and heteromolecular dimers, respectively. The phenylalanine molecule (component 1) is considered as branched, since it contains a phenyl group. The molecules of the aliphatic amino acids (component 2) include both branched and linear molecules.
Heterocompounds with homomolecular dimers (Table 3, part 1) are represented exclusively by compounds in which both components have branched molecules, one component being aromatic and the other aliphatic. It should be noted that when both branched components are aliphatic molecules (see 2a in Table 1), the resulting heterocompound, in contrast, is composed of heteromolecular dimers.
Heterocompounds with heteromolecular dimers (Table 3, part 2) are represented solely by compounds in which one component has a branched molecule (the aromatic component), while the other has a linear molecule (the aliphatic component). Thus, as in the case of heterocompounds composed of two aliphatic amino acids, the structure of the molecular side chain greatly affects the structure of the equimolar heterocompounds containing an aromatic phenylalanine molecule.
Structural analysis of a V3L single crystal was performed using an Agilent Technologies SuperNova diffractometer with CuKα radiation at a temperature of 100 K. The structure was solved by direct methods and refined by means of the SHELX program incorporated in the OLEX2 program package. The carbon- and nitrogen-bound H atoms were placed in calculated positions and included in the refinement in the ‘riding’ model approximation. The experimental conditions applied and the crystal structure parameters of the non-equimolar heterocompound V3L are summarized in Table 5. Fig. 9 shows the projection of the crystal structure of compound V3L onto the ac plane of the monoclinic cell (Fig. 9a) and the V3L dimer molecule (Fig. 9b).
Parameter | Value |
---|---|
Empirical formula | C21H46N4O8 |
Formula weight | 482.62 |
Temperature/K | 100 |
Crystal system | Monoclinic |
Space group | P21 |
a/Å | 9.6267(7) |
b/Å | 5.2704(2) |
c/Å | 13.8290(17) |
α/° | 90 |
β/° | 109.943(11) |
γ/° | 90 |
Volume/Å3 | 659.56(11) |
Z | 4 |
ρ calc/g cm−3 | 1.215 |
μ/mm−1 | 0.764 |
F(000) | 264.0 |
Radiation | CuKα (λ = 1.54184 Å) |
2θ range for data collection/° | 6.8–139.962 |
Index ranges | −11 ≤ h ≤ 11, −6 ≤ k ≤ 6, −16 ≤ l ≤ 16 |
Reflections collected | 5431 |
Independent reflections | 2475 [Rint = 0.0515, Rsigma = 0.0500] |
Data/restraints/parameters | 2475/1/189 |
Goodness-of-fit on F2 | 1.025 |
Final R indices [I ≥ 2σ(I)] | R1 = 0.0687, wR2 = 0.1864 |
Final R indices [all data] | R1 = 0.0829, wR2 = 0.2039 |
Largest diff. peak/hole / e Å−3 | 0.35/−0.26 |
Flack parameter | 0.0(2) |
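The cell volume and calculated density listed in Table 5 can be cross-checked from the cell parameters. The sketch below uses the standard monoclinic volume formula V = a·b·c·sin β and assumes that the tabulated formula weight (482.62) corresponds to the full content of one unit cell (three valine molecules and one leucine molecule); the function names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def monoclinic_volume(a, b, c, beta_deg):
    """Cell volume in cubic angstroms for a monoclinic cell (alpha = gamma = 90 deg)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def calculated_density(mass_per_cell, volume_a3):
    """Density in g cm^-3 from the molar mass of the cell content (g/mol)
    and the cell volume (cubic angstroms; 1 A^3 = 1e-24 cm^3)."""
    return mass_per_cell / (AVOGADRO * volume_a3 * 1e-24)


# Cell parameters of V3L from Table 5 (standard uncertainties omitted).
volume = monoclinic_volume(9.6267, 5.2704, 13.8290, 109.943)

# Assumption: 482.62 g/mol is the mass of the whole cell content.
density = calculated_density(482.62, volume)

print(f"V = {volume:.2f} A^3, rho = {density:.3f} g cm^-3")
```

Under this assumption, the computed values agree with the tabulated volume and density within the stated precision.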
Fig. 9 (a) Projection of the crystal structure of non-equimolar discrete heterocompound V3L on the ac plane of the monoclinic cell.42 (b) Dimer molecule of heterocompound V3L; occupation degree of the left position is 100% Val; occupancy of the right position is mixed: 50% Val and 50% Leu. Images are constructed in the program Mercury40 using the structural data CCDC 1903257.
Heterocompound V3L contains the molecules of valine and leucine in a ratio of 3:1. Two out of four molecular positions in the unit cell (Z = 4) are independent. One of the independent positions is occupied by a valine molecule (1 Val) (Fig. 9b, left part). The other independent position is disordered, i.e. it is characterized by mixed occupation. It can be occupied with equal probability by either a valine or a leucine molecule (1/2 Val + 1/2 Leu) (Fig. 9b, right part). Therefore, the total number of valine molecules in the unit cell of the compound is (1 + 1/2) × 2 = 3, while the total number of leucine molecules is 1/2 × 2 = 1. Consequently, the heterocompound has a general formula of V3L. Since one of the independent positions has a mixed occupation in the crystal structure, two kinds of dimers exist: homomolecular Val–Val and heteromolecular Val–Leu ones. The dimers are connected to each other via hydrogen bonds in the bc plane and, therefore, form a layer with a thickness of one dimer molecule. The layers, in turn, are interlinked via van der Waals bonds (Fig. 9a).
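The occupancy arithmetic above can be reproduced with exact fractions. The following is a minimal sketch: the helper `cell_contents` and the dictionary layout are illustrative, and the multiplicity of 2 reflects the statement that two of the four molecular positions in the cell are independent.

```python
from fractions import Fraction


def cell_contents(site_occupancies, multiplicity):
    """Sum the occupancies of each species over all independent positions
    and scale by the number of symmetry-equivalent copies per position."""
    totals = {}
    for site in site_occupancies:
        if sum(site.values()) != 1:
            raise ValueError("occupancies of a position must sum to 1")
        for species, occupancy in site.items():
            totals[species] = totals.get(species, 0) + occupancy * multiplicity
    return totals


# Independent positions of V3L as described in the text.
v3l_sites = [
    {"Val": Fraction(1)},                          # ordered position
    {"Val": Fraction(1, 2), "Leu": Fraction(1, 2)},  # disordered position
]

contents = cell_contents(v3l_sites, multiplicity=2)
print(contents)  # Val : Leu content per unit cell
```

The result is three valine molecules and one leucine molecule per unit cell, i.e. the 3:1 ratio expressed by the formula V3L.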
Fig. 10 Projections of Leu, Val, Ile, V3L and V2I crystal structures on the corresponding planes of their monoclinic cells.
Compound | Space group | a/Å | b/Å | c/Å | β/° | V/Å3 | Ref. |
---|---|---|---|---|---|---|---|
L-Ile | P21 | 9.75(2) | 5.32(2) | 14.12(2) | 95.8(2) | 723 | 43 |
L-Val | P21 | 9.71(1) | 5.27(2) | 12.06(2) | 90.8(2) | 617.07 | 33 |
L-Leu | P21 | 9.562(2) | 5.301(1) | 14.519(3) | 94.20(2) | 733.965 | 35 |
V3L | P21 | 9.6267(7) | 5.2704(2) | 13.829(2) | 109.94(1) | 659.6 | This work |
V2I | C2 | 25.7697(14) | 5.2445(2) | 9.6681(6) | 97.215(5) | 1296.29 | 27 |
As shown, the crystal structure of compound V3L closely resembles the structures of Leu, Val, and Ile enantiomers (Fig. 10). In the four compounds, the monoclinic cell comprises four molecules, and its asymmetric unit includes two molecules. Linear parameters a, b, and c and volume V of the monoclinic cells have close values as well. The angular parameter β of heterocompound V3L is notably larger than those of the Leu, Val, and Ile enantiomers (Fig. 10 and Table 6).
The molecular components of both non-equimolar heterocompounds V3L and V2I have the same (L) chirality, and both compounds have disordered molecular positions. Nevertheless, a comprehensive analysis of the crystal structures of V3L and V2I revealed substantial differences. First, in compound V3L, only half of the molecular positions show mixed occupation (Fig. 11a), while in compound V2I, all the molecular positions exhibit mixed occupation (Fig. 11b). Second, the monoclinic cell of V3L comprises four molecules, while that of V2I includes eight molecules, i.e. it is doubled in the direction of the longest axis of the cell.27
Fig. 11 Occupancy of molecular positions in the dimers of non-equimolar heterocompounds V3L (a) and V2I (b). Ellipsoids reflect the 20% and 50% probabilities in the case of V3L and V2I, respectively. Images are constructed in the program OLEX2.44
The conformations of the molecular components of non-equimolar heterocompounds V3L and V2I are discussed below. For convenience, the extended conformation of molecules is marked with superscript e (Vale and Ilee) and the folded conformation with superscript f (Valf and Ilef). As already mentioned, no conformations are attributed to Leu molecules.
As described above, heterocompound V3L is characterized by one ordered and one disordered position in the asymmetric unit. A 50/50 statistically mixed population of the disordered position implies that, on average, in every monoclinic cell one dimer is composed solely of Val molecules (homomolecular dimer) and the other dimer is composed of both Val and Leu molecules (heteromolecular dimer) (Fig. 11a). One of the valine molecules in the homomolecular dimer has extended conformation Vale, while the other molecule has folded conformation Valf (dimer Vale–Valf). The valine molecule in the heteromolecular dimer shows extended conformation Vale (dimer Vale–Leu). This means that the ordered position is occupied by valine molecules having the extended conformation only, while the disordered position is populated with valine molecules having the folded conformation and leucine molecules (Table 7).
System | Heterocompound | Dimer types | Number of independent molecular positions | Number of ordered molecular positions | Character of molecule in ordered position | Number of disordered molecular positions | Character of molecules in disordered position |
---|---|---|---|---|---|---|---|
Val–Leu | V3L | Vale–Valf, Vale–Leu | 2 | 1 | Vale | 1 | Valf, Leu |
Val–Ile | V2I | Vale–Valf, Ilee–Ilef, Vale–Ilef, Valf–Ilee | 2 | 0 | — | 2 | Ilee, Ilef, Vale, Valf |
The monoclinic cell of heterocompound V2I has two independent molecular positions and both are disordered, i.e. characterized by mixed occupation. Molecules occupying one of these independent positions have extended conformation, while molecules present in the other independent position have folded conformation. Consequently, the crystal structure contains dimers of four types, namely, homomolecular dimers Vale–Valf and Ilee–Ilef and heteromolecular dimers Vale–Ilef and Valf–Ilee (Table 7).
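The dimer types listed in Table 7 follow from pairing every molecule allowed in one independent position with every molecule allowed in the other. The sketch below enumerates these combinations, assuming that each dimer contains exactly one molecule from each independent position; the function name `dimer_types` and the (molecule, conformation) tuples are illustrative notation.

```python
from itertools import product


def dimer_types(position_a, position_b):
    """List all dimers formed by one molecule from each independent position.

    Each molecule is given as (name, conformation), with conformation
    'e' (extended), 'f' (folded) or '' (not attributed, as for Leu).
    """
    dimers = []
    for (mol_a, conf_a), (mol_b, conf_b) in product(position_a, position_b):
        kind = "homomolecular" if mol_a == mol_b else "heteromolecular"
        dimers.append((f"{mol_a}{conf_a}-{mol_b}{conf_b}", kind))
    return dimers


# V3L: one ordered position (Vale) and one disordered position (Valf or Leu).
v3l = dimer_types([("Val", "e")], [("Val", "f"), ("Leu", "")])

# V2I: both positions disordered, one extended and one folded.
v2i = dimer_types([("Val", "e"), ("Ile", "e")], [("Val", "f"), ("Ile", "f")])

for label, kind in v3l + v2i:
    print(label, kind)
```

This yields two dimer types for V3L and four for V2I, two homomolecular and two heteromolecular in the latter case, matching the entries of Table 7.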
Therefore, it can be concluded that the currently known non-equimolar heterocompounds of amino acids can be divided into two groups. V3L as the compound of the first group contains two independent molecular positions and only one of them is disordered, while V2I as the compound belonging to the second group has all its molecular positions disordered. It should be noted that in the L-isoleucine–L-leucine system, the recently found compound I3L belongs, apparently, to the first group. The results of its investigation were presented at a conference45 and will be published later.
Equimolar heterocompounds of amino acids having different chiralities are known from the published literature. Their crystallochemical characteristics were analyzed based on the data available from the CSD and other sources. The proposed concept of homo- and heteromolecular dimers allowed the division of the known equimolar heterocompounds into two groups: those with homo- and those with heteromolecular dimers. It was found that the molecules of the same substance have different conformations in the homo- and heteromolecular dimers. In the case of the equimolar homocompounds (true racemates), the branched side chain molecules have a folded conformation, while in the case of the equimolar heterocompounds, they are characterized by an extended conformation. Equimolar heterocompounds of amino acids having the same chirality have not yet been found in the published literature.
Non-equimolar heterocompounds are very rare. Only three examples of such compounds of chiral substances have been reported in the literature. Two compounds consist of molecules having different chiralities, and their crystal structures are unknown.14 The third compound was described by the present authors27 and is an example of a non-equimolar heterocompound composed of molecules of the same chirality. The crystal structure of this compound, V2I, formed in the system L-Val–L-Ile, is characterized by two disordered molecular positions in the asymmetric unit of its monoclinic cell (S. G. C2). A recently revealed further example is compound V3L in the system L-Val–L-Leu. Its crystal structure exhibits one ordered and one disordered molecular position in the asymmetric unit of its monoclinic cell (S. G. P21), thus differing from the crystal structure of heterocompound V2I. The crystal structures of V2I and V3L were analyzed using the concepts of homo- and heteromolecular dimers, side chain types (linear or branched) and molecular conformations (extended or folded). Further assessment of the presented structural trends requires more data. Thus, in view of the small number of known crystal structures of non-equimolar compounds, there is an obvious need to continue their investigation.
Footnotes
† CCDC 1903257. For crystallographic data in CIF or other electronic format see DOI: 10.1039/c9ce01333d
‡ C. H. Görbitz7,39 uses the term “bilayer” to define such layers.
§ Hereinafter, international abbreviations of the amino acid names are used: Val for valine, Ile for isoleucine, and Leu for leucine.
This journal is © The Royal Society of Chemistry 2020