Aalae Alkhalil, Jagadeesh Babu Nanubolu and Jonathan C. Burley*
Boots Science Building, School of Pharmacy, University of Nottingham, NG7 2RD, UK. E-mail: jonathan.burley@nottingham.ac.uk; Tel: +44 (0)115 8468357; Fax: +44 (0) 115 951 5102
First published on 1st November 2011
The efficacy of phonon-mode spectral data (20–400 cm−1) in identifying and characterising phase transitions is for the first time compared directly with traditional “fingerprint” intra-molecular spectral data (400–3800 cm−1) for a model molecular system, using a range of statistical approaches and algorithms. Both data sets were collected in the same experiment, allowing a direct comparison. We find that phonon-mode data offer a reliable method of identifying phase transitions, whereas the intra-molecular data are inherently unsuitable. Our results are likely to apply widely to solid–solid transformations.
Of the three main vibrational spectroscopies, INS is unsuitable for general applications due to limited access to neutron sources, and also the often prohibitively long data acquisition times. Infra-red methods have the benefit of short acquisition times but collection of data across a wide spectral range requires the use of several dedicated instruments (due to the requirement in IR spectroscopy for a radiation source to have the same energy as the vibration being probed). For example, to collect THz-frequency data a THz spectrometer is required, whereas mid-IR data require an entirely separate mechanism of generating the incident radiation and therefore an entirely separate data acquisition instrument. In order to directly compare the efficacy of different data collection strategies for characterisation of materials, specifically the use of different spectral windows (the aim of this work), it is clearly a pre-requisite to collect comparable data. We therefore employ Raman spectroscopy, in which a very wide spectral window data set can be collected under identical conditions on the same instrument, in order to compare quantitatively the use of THz-frequency (phonon-mode, i.e. intermolecular bands) data with mid-IR frequency (intramolecular bands) for characterisation of phase transitions in a model molecular system.
There is a relatively limited amount of published work to date dealing with phonon-mode Raman spectroscopy, in large part due to the fact that the improvement in standard Rayleigh rejection filters is very recent. This can be contrasted with the body of work on THz infra-red spectroscopy.13 It has been stated by several researchers (including one of the current authors) that the phonon-mode data—whether in Raman or THz infra-red—from 10–400 cm−1 are more sensitive to inter-molecular interactions and crystalline forms (polymorph, solvate, etc) than the data from 400–3800 cm−1. In crystalline materials the phonon-mode bands are quantised and thus yield relatively sharp peaks; in amorphous materials they are not quantised and instead a broad feature is observed over this region (known as the boson peak14–18). Examples from the field of Raman spectroscopy include the use of phonon-mode data to distinguish between polymorphs.19–21 Although the enhanced sensitivity of the phonon-mode (THz frequency) data to solid-state information when compared with “molecular fingerprint” (mid-IR) data is intuitive,22,23 to the best of our knowledge there exist no studies which address this through a direct comparison of the two spectral windows in a thorough and statistically rigorous manner. For a direct comparison the phonon-mode and intra-molecular data should ideally be collected on the same instrument, and thus Raman spectroscopy is the obvious (and indeed the only) viable technique. This direct comparison of phonon-mode and intra-molecular spectra using Raman spectroscopy therefore forms the basis of the present study. For this comparison we employ a model compound, paracetamol (acetaminophen), which is known to exhibit phase transitions between various solid forms. 
We investigate and compare the efficacy of mid-IR frequency data (predominantly probing intra-molecular bands) with THz-frequency data (predominantly probing inter-molecular bands) for the spectral classification of the different solid forms.
Paracetamol is a very common analgesic and a well-characterised model system, for which both intra-molecular and phonon-mode data have been reported. In the solid state it can adopt three polymorphs and an amorphous form, and has been well characterised by several researchers including ourselves. Crystal structures of the three polymorphs are available and indicate that the molecular conformation is relatively invariant.24–26 The melting temperature of form I is 169–170 °C, form II melts at 154–157 °C and the melting point of form III is 143 °C.27–30 The glass transition occurs in the region of 25 °C. On heating the glass, it is possible to isolate all three polymorphs for certain experimental configurations. It therefore forms an excellent model system which undergoes several successive phase transformations on heating. Full Raman spectral data, including phonon-mode data, have been presented for all solid forms.20 A previous paper by Kauffman et al. reported the application of simultaneous differential scanning calorimetry and Raman spectroscopy to this system, including principal component analysis of the Raman data.31 The Raman data of Kauffman et al., and their analysis by PCA, are relevant to the work reported below. The study of Kauffman et al. covered the spectral range 350–4000 cm−1, which is the range traditionally accessible using a standard Raman spectrometer. We are fortunate in that our Raman spectrometer can collect meaningful data over the spectral range 20–4000 cm−1, which includes the phonon-mode spectral window, and we are thereby able to directly compare the use of phonon-mode and intra-molecular Raman spectroscopy in characterising the transformations between different forms in this model system.
The different forms are generated as per several literature reports, namely through melt-quenching of liquid paracetamol to generate the amorphous form, followed by slow heating (−100 to 180 °C) to drive the system from a high-energy amorphous state to the lowest energy crystalline state, following Ostwald's rule of stages.20,27–30 A spectrum is collected every 1 °C (total 281 spectra), and these spectra are analysed as outlined below.
Our study is aimed at investigating the utility of low-wavenumber (less than 400 cm−1) Raman data in characterisation of phase transitions in organic solids, and in particular comparing these low-wavenumber data with data from the more traditional mid-IR frequency range (400–4000 cm−1). Our work represents the first direct comparison of these two spectral regions and is intended as a general guide to experimental design for future researchers who may consider employing vibrational spectroscopy for characterisation of molecular solids. A model and well-characterised polymorphic pharmaceutical system is employed.
The as-received crystalline powder sample was placed onto a glass microscope slide, a cover slip was placed on top of the sample, and the sample was loaded into a Linkam hot-stage (model LTS350, with TMS94 temperature controlling programmer, LNP94 cooling system and a 2 litre dewar for liquid nitrogen). Temperature control and data collection were computer-controlled, and the sample stage was adjusted for optimal height automatically prior to each measurement. The hot-stage was flushed with nitrogen gas throughout the experiment. There is some evidence for a small discrepancy between the recorded and actual temperatures within the hot-stage,§ but this effect is small and does not affect our analysis.
The sample was heated to 180 °C (above the melting point of 169 °C), held at this temperature for 5 min, and cooled at a rate of 30 °C min−1 to −100 °C to isolate a purely amorphous sample. Raman data were then collected on heating from −100 to 180 °C at 1 °C increments. A heating rate of 1 °C min−1 was employed between individual temperatures. Temperature was allowed to stabilise prior to each data collection so the overall (underlying) heating rate is significantly lower than 1 °C min−1.
The signal:noise ratio of the spectra improves gradually from −100 to −10 °C. This may be intrinsic, but is thought to be due to the formation of ice on the hot-stage windows at these low temperatures, which subsequently melts on heating. The signal:noise ratio also drops above the melting point (169 °C); this is likely due to flow of the liquid out of the sampling volume of the spectrometer and is largely unavoidable in the current experimental configuration. With the exception of these two temperature windows, the quality of the spectra (signal:noise ratio) is excellent across the vast majority of the temperature range. Example spectra are given in the supplementary information, Fig. S1.†
Data were analysed visually (necessarily subjectively) and using a variety of statistical approaches. Prior to statistical analyses, the data (entire spectra) were subject to background subtraction which was performed using the LabSpec software with a second-order polynomial. For analysis the spectral range was divided into phonon-mode (20–400 cm−1) and intra-molecular (400–3800 cm−1) spectral regions, the former containing a total of 73060 data points and the latter a total of 651920 data points for all experiments.
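The data-point totals quoted above can be cross-checked against the number of spectra. The short sketch below uses only figures stated in the text (one spectrum per 1 °C from −100 to 180 °C, and the two totals); the average point spacing it derives assumes the points are spread evenly across each window, which is an assumption made here for illustration rather than a detail given in the paper.

```python
# Figures quoted in the text.
temps = list(range(-100, 181))      # one spectrum per 1 degC, -100 to 180 degC inclusive
n_spectra = len(temps)

phonon_total = 73060                # 20-400 cm^-1 window, summed over all spectra
intra_total = 651920                # 400-3800 cm^-1 window, summed over all spectra

# Points per individual spectrum in each window (remainders should be zero).
phonon_per_spectrum, phonon_rem = divmod(phonon_total, n_spectra)
intra_per_spectrum, intra_rem = divmod(intra_total, n_spectra)

# Average spacing between points, ASSUMING an even spread across each window.
phonon_step = (400 - 20) / phonon_per_spectrum
intra_step = (3800 - 400) / intra_per_spectrum

print(n_spectra, phonon_per_spectrum, intra_per_spectrum)
print(round(phonon_step, 2), round(intra_step, 2))
```

Both totals divide exactly by the number of spectra, and the two windows come out with a similar average spacing of roughly 1.5 cm−1 per point, so the intra-molecular window contributes about nine times as many variables per spectrum as the phonon-mode window.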
Statistical analyses were performed within the R software package,33 which is open-source, freely available and fully documented. For principal component analysis the separate pcaMethods library34 was employed (routine “pca”); the default of singular value decomposition was used to generate the components. The first two principal components are reported for the PCA (the first twenty were calculated). Data were either employed raw or scaled. Where scaling was performed, all spectra were scaled for intensity prior to the analysis by dividing the mean-centred data by their root-mean-square using the standard “scale” function within R; otherwise all parameters employed were the defaults for the particular software/statistics routine.
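The scaling and decomposition steps described above can be illustrated outside R. The following is a minimal sketch in Python with NumPy, not the authors' R script: it assumes the data are arranged with one spectrum per row and one wavenumber channel per column, and that scaling acts column-wise as R's "scale" does by default. The function names `autoscale` and `pca_scores` are hypothetical.

```python
import numpy as np

def autoscale(X):
    """Mean-centre each column and divide by its sample standard deviation,
    i.e. the root-mean-square of the centred column with an n-1 denominator."""
    centred = X - X.mean(axis=0)
    sd = centred.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0               # constant channels stay at zero
    return centred / sd

def pca_scores(X, n_components=2, scaled=True):
    """Scores on the leading principal components, obtained by SVD."""
    Z = autoscale(X) if scaled else X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

# Synthetic stand-in for a (spectra x channels) matrix; not real Raman data.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
scores = pca_scores(X)              # column 0 = PC1, column 1 = PC2
```

Plotting the first column of `scores` against the second gives a scores plot, and plotting each column against temperature gives the score-versus-temperature view.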
The hierarchical agglomerative clustering is implemented within the default installation of R (routine “hclust”). For the hierarchical agglomerative clustering four and five clusters were defined, with a distance matrix being calculated using the Euclidean distance measure. A total of seven separate clustering algorithms were employed in the hierarchical agglomerative clustering, in order to examine whether the choice of clustering algorithm affected the clusters formed.
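To make the comparison of clustering algorithms concrete, the sketch below implements agglomerative clustering in plain Python using the Lance–Williams update, which expresses the seven linkage rules named after those offered by R's "hclust" as different coefficient choices. It is an illustrative re-implementation under stated assumptions, not the routine used in the paper: the update is applied directly to Euclidean distances, whereas the "ward", "centroid" and "median" rules are formally defined on squared distances, and the toy data are invented.

```python
import math

def lance_williams(method, ni, nj, nk):
    """Coefficients (alpha_i, alpha_j, beta, gamma) for merging clusters i and j,
    as seen from a third cluster k; ni, nj, nk are cluster sizes."""
    n = ni + nj
    if method == "single":
        return 0.5, 0.5, 0.0, -0.5
    if method == "complete":
        return 0.5, 0.5, 0.0, 0.5
    if method == "average":
        return ni / n, nj / n, 0.0, 0.0
    if method == "mcquitty":
        return 0.5, 0.5, 0.0, 0.0
    if method == "centroid":
        return ni / n, nj / n, -ni * nj / n ** 2, 0.0
    if method == "median":
        return 0.5, 0.5, -0.25, 0.0
    if method == "ward":
        t = n + nk
        return (ni + nk) / t, (nj + nk) / t, -nk / t, 0.0
    raise ValueError(f"unknown method: {method}")

def agglomerate(points, k, method):
    """Merge the closest pair of clusters repeatedly until k clusters remain."""
    members = {i: [i] for i in range(len(points))}
    d = {(i, j): math.dist(points[i], points[j])
         for i in range(len(points)) for j in range(i + 1, len(points))}

    def pair_key(a, b):
        return (a, b) if a < b else (b, a)

    while len(members) > k:
        i, j = min(d, key=d.get)                 # closest pair of clusters
        dij = d[(i, j)]
        ni, nj = len(members[i]), len(members[j])
        for m in members:
            if m == i or m == j:
                continue
            ai, aj, beta, gamma = lance_williams(method, ni, nj, len(members[m]))
            dmi, dmj = d[pair_key(m, i)], d[pair_key(m, j)]
            d[pair_key(m, i)] = ai * dmi + aj * dmj + beta * dij + gamma * abs(dmi - dmj)
        members[i].extend(members.pop(j))        # cluster i absorbs cluster j
        d = {p: v for p, v in d.items() if j not in p}

    labels = [0] * len(points)
    for label, idx in enumerate(sorted(members.values(), key=min)):
        for p in idx:
            labels[p] = label
    return labels

METHODS = ["ward", "single", "complete", "average", "mcquitty", "median", "centroid"]
# Invented, well-separated one-dimensional data standing in for spectra.
toy = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,), (10.0,), (10.1,), (10.2,)]
results = {m: agglomerate(toy, 3, m) for m in METHODS}
all_agree = len({tuple(v) for v in results.values()}) == 1
```

For well-separated data such as this toy set, every linkage rule returns the same partition; disagreement between rules on real data is the signal that the clusters are not robust.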
All raw data and details of the statistical analyses performed (R scripts) are available in the supplementary material for information and reference, and it should be possible to reproduce all of the results in this paper from the information given therein.
Fig. 1 Scaled experimental Raman spectra as a function of temperature for spectral windows: a) 20–3800 cm−1; b) 20–400 cm−1; c) 1200–1350 cm−1; d) 1450–1700 cm−1.
From simple inspection of Fig. 1, it can be seen that the spectra can be quickly classified into five main regions, separated sequentially by temperature. Comparison with previous work27–30,35 indicates that these correspond to: amorphous (−100 to 69 °C); form III (70 to 110 °C); form II (121 to 140 °C), reached via a slow transition in the range 112–120 °C; form I (141 to 165–168 °C); and the final melt (169 to 180 °C). The transformations are more visually apparent in the phonon-mode spectral range (20–400 cm−1) than in the intra-molecular spectral range (400–3800 cm−1); this has been previously noted and commented on in detail.20 There is also some evidence for the presence of a glass transition (amorphous solid → supercooled liquid) around 35 °C. Overall the data presented in Fig. 1 agree very well with previous literature and therefore form a suitable model data set with which to investigate and directly compare the utility of phonon-mode and intra-molecular Raman data for characterising the phase transitions and spectrally classifying the various forms of paracetamol. Note that the transformation II → I did not occur in a very similar study by Kauffman et al.31 The transformation is clear and unambiguous in our data, and it is equally clear from the data of Kauffman et al. that it did not occur in their experiment (their Fig. 3 can be directly compared with our Fig. S3 to further illustrate this). The reason for the slight difference in crystallisation pathways is almost certainly the effect of nucleation, which is known to be a highly stochastic phenomenon.36 In the work of Kauffman et al. form I did not nucleate following the melt of form II (despite form I being the thermodynamically stable form in the temperature range 156–169 °C), whereas in our experiment form II underwent a solid–solid polymorphic transformation to form I at 140 °C.
In the context of the statistical analyses which will be reported below, it is important to note at this point that any meaningful and reliable statistical analysis must at the very least be able to reproduce the majority of the observations discussed above, and that any analysis which is not in agreement with the visual observations is almost certainly unreliable. Statistical analysis of the data may of course reveal new details about the experiments which have not been noted in the (subjective) discussion above, but an agreement with the visual observations is a minimum criterion for physically meaningful results.
The PCA results are presented as scores plots (PC1 against PC2) in Fig. 2, and the variation in the scores as a function of temperature in Fig. 3. The four panels in Fig. 2 present data for the intra-molecular spectral range (400–3800 cm−1) and the phonon-mode spectral range (20–400 cm−1), and illustrate the effect of pre-scaling the data compared with using raw, unscaled data. In PCA it is generally important to pre-scale data before analysis (see standard textbooks, for example “Multivariate data analysis: in practice”38), especially in cases where the variance is not constant across the data sets.
Fig. 2 Scores plots for molecular-mode and phonon-mode data, scaled and unscaled as labelled. Colour codes: black = −100 to 69 °C; blue = 70 to 111 °C; indigo = 112–120 °C; green = 121–140 °C; orange = 141–161 °C; red = 162–180 °C.
Focussing therefore on the pre-scaled data in Fig. 2, the key observation, which may be made from visual inspection of the plots, is that from the intra-molecular data no clear clustering of spectra is observed (and therefore the phase transformations are not obvious), whereas from the analysis of the phonon-mode data, four clear and obvious clusters are formed. Correlation of the data points with temperature (and the visual inspection of the spectra outlined earlier) indicates that the four clusters correspond to: 1) amorphous solid and melt (black data points); 2) crystalline form III (blue data points); 3) crystalline form II (green data points); and 4) crystalline form I (orange data points).
Fig. 3 presents the same pre-scaled PCA results as Fig. 2, but with PC1 and PC2 plotted as a function of temperature. It is immediately clear that analysis of the intra-molecular spectral window (Fig. 3a) allows the glass transition, the crystallisation of the supercooled liquid, and the melting point to be identified. The various solid–solid polymorphic transformations, however, are not clear from these data. In marked contrast, for the phonon-mode spectral window (Fig. 3b), the PCA results clearly and unambiguously identify all transitions expected (glass, crystallisation of form III, the various solid → solid transformations, and the melt). These results are in complete agreement with the scores plots presented in Fig. 2 and discussed briefly above.
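Abrupt transitions in a score-versus-temperature trace of this kind can also be located numerically. The sketch below is not part of the original analysis, in which transformation temperatures were noted by eye; it simply ranks the step changes between consecutive spectra. The function name `largest_jumps` and the piecewise-constant toy trace are hypothetical.

```python
def largest_jumps(temps, scores, n):
    """Temperatures of the n largest changes in score between consecutive spectra.
    Each jump is reported at the temperature of the later spectrum."""
    steps = [(abs(b - a), t) for t, a, b in zip(temps[1:], scores, scores[1:])]
    top = sorted(steps, reverse=True)[:n]
    return sorted(t for _, t in top)

def toy_score(t):
    """Invented PC score: flat plateaus with a small ripple and three sharp steps."""
    if t < 70:
        level = 0.0
    elif t < 141:
        level = 5.0
    elif t < 169:
        level = -4.0
    else:
        level = 0.5
    return level + 0.01 * (t % 3)

temps = list(range(-100, 181))
scores = [toy_score(t) for t in temps]
jumps = largest_jumps(temps, scores, 3)
print(jumps)
```

A single-step criterion of this kind finds only abrupt changes. A transition spread over several degrees, such as the gradual change reported between 112 and 120 °C, would need a windowed or cumulative measure instead.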
Fig. 3 Variation in scores as a function of temperature for spectral ranges: a) intra-molecular 400–3800 cm−1; b) phonon-mode 20–400 cm−1. PC 1 in black, PC 2 in red. The transformation temperatures noted by eye are indicated by dashed vertical lines. SCL = super-cooled liquid.
From an initial inspection of Fig. 2 and 3, we can therefore immediately conclude that the phonon-mode data are far more suitable for the study and characterisation of phase transformations than the intra-molecular data.
Considering Figs. 2 and 3 in more detail, the temperatures at which transitions occur between the clusters derived from the phonon-mode data correspond extremely well with the various phase transformations expected (and which were noted earlier from the visual inspection of the data). The transformation from amorphous (black data points) to form III (blue data points) corresponds to crystallisation from the supercooled liquid (it occurs at 69–70 °C which is well above the glass transition temperature of 25 °C, but below the melting point of form I at 169 °C). This transformation is instantaneous on our experimental time-scale—there are no experimental points which link the amorphous cluster with the form III cluster. The form III cluster exists until 110 °C, after which a slow transition (mainly on PC2) occurs (indigo data points), until by 120 °C a new cluster is formed. This cluster (green data points) corresponds to crystalline form II, which is stable until 139 °C, at which point another abrupt transition occurs. At 140 °C a new cluster (orange data points), corresponding to form I, is evident. Form I is stable until melting occurs. The melting point of form I has been repeatedly determined to be at 169 °C—in the current experiment the transformation from form I to the melt seems to occur gradually, with several points (in red) linking the form I cluster and the amorphous/liquid cluster. This is curious—one might expect the melting to occur sharply. From inspection of the raw data (Fig. S2 in supplementary information†) it seems that the melting of form I is a rather gradual process, in which the intensity of the Raman signal decreases steadily in the temperature range 161–169 °C. One possible explanation would be a temperature gradient across the sample, however given the abruptness of (for example) the SCL → III and the II → I transformations, it seems that this is unlikely. 
At the present moment it is not clear why the melting transformation should appear gradual in our data, but it seems likely that this is an experimental artefact which results from sample movement in the stage as the melting point is approached, rather than anything which is intrinsic to the melting of paracetamol.
The relatively diffuse nature of the amorphous/liquid cluster, compared to the tight definition of the crystalline clusters, is in full accord with glassy materials exhibiting a range of relaxation states, whereas crystalline materials possess a single thermodynamic ground state. The amorphous and liquid states are separated only by the glass transition, in which symmetry-breaking does not occur (unlike, for example, glass to crystal, crystal to liquid etc). Thus it is reasonable that the glass and the liquid define the same cluster, and that this cluster should be more diffuse than any of the clusters formed from the crystalline phases.
Our key conclusion from the PCA results (via consideration of Fig. 2 and 3) is that phonon-mode data are suitable for clear and unambiguous differentiation between solid forms of materials (specifically paracetamol), whereas intra-molecular data are not. This stands in some contrast to the conclusions of Kauffman et al.,31 who undertook an essentially identical experiment (albeit with access to data in the 350–4000 cm−1 range only, and data collected every 3 °C rather than every 1 °C as in the present work) and concluded that data in the intra-molecular spectral window are suitable for differentiating between various forms of paracetamol. Direct comparison of our work and that of Kauffman is difficult for two reasons: i) the raw data of Kauffman et al., and their numerical routines, are not publicly available; ii) in our experiment a transformation II → I occurred at 140 °C, whereas in the experiments of Kauffman et al. this did not occur and their sample melted at the melting point of form II (156 °C) as discussed earlier. Their assignment of three rather than four clusters was therefore reasonable for their data, as their experiments isolated the amorphous form, plus crystalline forms III and II. In contrast, a full description of our data requires four clusters, with form I being required in addition to those observed by Kauffman et al.
Returning to whether or not the intra-molecular data are suitable for classifying spectra according to the phase present, it is important to note that in the work of Kauffman et al., data pre-scaling was not applied (see their experimental section p1312). As outlined earlier pre-scaling of data is typically essential for a robust and reliable statistical analysis. To allow a direct assessment of whether spectral classification is possible using unscaled data (the Kauffman procedure) we present in Fig. 2 the results of PCA of our data with no pre-scaling applied. For the intra-molecular data it is immediately apparent that the separation of the various physical forms is not very distinct at all for this analysis. The majority of the variation in both PC1 and PC2 occurs for the amorphous/liquid spectra. Forms III and II are very poorly separated. Form I is hardly distinct from the liquid melt. For the unscaled phonon-mode data the separation of the different forms is again very indistinct.
Overall therefore we can state that regardless of the exact nature of the statistical routine applied to the data, the intra-molecular spectra in the range 400–3800 cm−1 are not sufficiently different for the various forms of paracetamol to allow reliable spectral classification. The phonon-mode data in contrast offer a clear and reliable differentiation of the forms, if the usual and recommended practice of pre-scaling38,39 is applied to the data prior to analysis.
Although the enhanced sensitivity to these polymorphic transformations of the phonon-mode data over the intra-molecular data is intuitive, it is at first sight rather puzzling that the limited range intra-molecular data presented in Fig. 1c (1200–1350 cm−1) and Fig. 1d (1450–1700 cm−1) clearly show the transformations even from simple visual inspection, whereas the PCA of the entire intra-molecular data (pre-scaled) in Fig. 2 (400–3800 cm−1) does not. To clarify this apparent disparity, PCA was performed on the data shown in Fig. 1d, i.e. the limited spectral range 1450–1700 cm−1. These results are presented in Fig. S4a (supporting information†). Obvious and physically meaningful clustering is observed, which corresponds directly with both the visual inspection of the data (Fig. 1) and the phonon-mode PCA (Fig. 2b and 3b). Overall, there are (visually) more similarities than differences in the entire intra-molecular data (400–3800 cm−1) between the different solid forms, and it is therefore reasonable that PCA is unable to reliably assign the spectra to the various polymorphs of paracetamol. However, careful selection of a limited spectral region in which clear visual differences are present (1450–1700 cm−1), and subsequent analysis of that spectral region by PCA, allows the spectra to be assigned correctly.
We can therefore conclude that for the current model system, a limited sub-set of the intra-molecular data can in certain cases discriminate between polymorphs, whereas use of the entire intra-molecular data range does not. We note that it is not apparent from the outset which limited spectral range to use: for example, employing data in the range 2800–3200 cm−1 does not lead to any clustering (Fig. S4b, supporting information†). These results again support our observation that the phonon-mode data are reliable for discriminating between polymorphs, whereas the intra-molecular data are not reliable.
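Restricting an analysis to a chosen spectral window amounts to selecting the matching columns of the data matrix before any statistics are run. The sketch below shows one way to do this in plain Python; the window limits are those quoted in the text, while the evenly spaced wavenumber axis, the half-open interval convention and the function names are assumptions made for illustration.

```python
def window_indices(wavenumbers, low, high):
    """Indices of channels with low <= wavenumber < high (half-open interval,
    so a channel at exactly 400 cm^-1 falls in one window only)."""
    return [i for i, w in enumerate(wavenumbers) if low <= w < high]

def crop(spectra, idx):
    """Keep only the selected channels of every spectrum (rows = spectra)."""
    return [[row[i] for i in idx] for row in spectra]

# Hypothetical axis: 20 to just below 3800 cm^-1 in 1.5 cm^-1 steps.
axis = [20 + 1.5 * i for i in range(2520)]

WINDOWS = {
    "phonon": (20, 400),
    "intra": (400, 3800),
    "fig1c": (1200, 1350),
    "fig1d": (1450, 1700),
    "ch_stretch": (2800, 3200),
}
selected = {name: window_indices(axis, lo, hi) for name, (lo, hi) in WINDOWS.items()}

# Two invented spectra, cropped to the Fig. 1d window.
spectra = [[float(i) for i in range(len(axis))], [0.0] * len(axis)]
fig1d_block = crop(spectra, selected["fig1d"])
```

Each cropped block can then be passed to PCA or clustering in place of the full intra-molecular range, which is how a sub-window comparison of the kind described above would be set up.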
The extreme difference between the PCA results for the intra-molecular (Fig. 2, 3a) and phonon-mode (Fig. 2, 3b) spectral data is noteworthy, and illustrates the strongly enhanced sensitivity of the phonon-mode data to solid-state forms. If only intra-molecular data are available (as is often the case with older generation Raman spectrometers, for example, and with all mid-IR systems), great care must be taken both in data selection and data processing when investigating physical transformations between solids.
We now turn to an entirely separate statistical technique to assess the relative reliability of phonon-mode and intra-molecular Raman data for the study of phase transitions, in order to further validate the results outlined above.
The basic premise employed in the current work is as follows: for data which allow reliable clustering, the choice of HA clustering algorithm should not materially affect the results of the clustering; whereas for data which do not allow reliable clustering, the choice of algorithm may change the clustering observed. As with the PCA, any reliable clustering should lead to physically meaningful clusters. In the context of the current work two useful rules of thumb are: 1) clusters should be separated at the known transition temperatures between the various forms of paracetamol; 2) clustered spectra should be spread sequentially in temperature.
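The two rules of thumb can be turned into simple checks on a list of cluster labels ordered by temperature. The sketch below is an illustration rather than the procedure used in the paper, and all function names are hypothetical. Because the amorphous solid and the melt may legitimately share a cluster, it reports how many runs exceed the number of distinct clusters instead of demanding strictly one run per cluster.

```python
def runs(labels):
    """Collapse a label sequence into [label, start_index, length] runs."""
    out = []
    for i, lab in enumerate(labels):
        if out and out[-1][0] == lab:
            out[-1][2] += 1
        else:
            out.append([lab, i, 1])
    return out

def boundaries(temps, labels):
    """Temperature of the first spectrum in every run after the first."""
    return [temps[start] for _, start, _ in runs(labels)[1:]]

def fragmentation(labels):
    """Runs in excess of the number of distinct clusters (0 = fully sequential)."""
    return len(runs(labels)) - len(set(labels))

def near_known(found, known, tol=2):
    """True if every known transition has a cluster boundary within tol degrees."""
    return all(any(abs(f - k) <= tol for f in found) for k in known)

def toy_label(t):
    """Invented labelling that follows the visually assigned temperature ranges."""
    if t < 70 or t >= 169:
        return 0          # amorphous/liquid
    if t < 121:
        return 1          # form III (including the slow transition)
    if t < 141:
        return 2          # form II
    return 3              # form I

temps = list(range(-100, 181))
labels = [toy_label(t) for t in temps]
found = boundaries(temps, labels)
known = [70, 121, 141, 169]   # read from the visual inspection described in the text
print(found, fragmentation(labels), near_known(found, known))
```

A clustering that scatters one label across many separate temperature runs gives a large fragmentation value and fails the second rule of thumb, while boundaries far from the known transition temperatures fail the first.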
Results of seven different hierarchical clustering analyses are presented in Fig. 4a for the phonon-mode data, and in Fig. 4b for the intra-molecular data. Four clusters were requested as output from the analysis. The known transition temperatures between the various forms of paracetamol are also shown in Fig. 4.
Fig. 4 Results of HA clustering analyses for spectral ranges a) 20–400 cm−1; b) 400–3800 cm−1. The algorithm used is given at the left, and clusters are indicated by colours. Note that in this case the colours are arbitrary and do not relate directly to the various physical forms.
The first point to note from analysis of the phonon-mode data (Fig. 4a) is that all seven clustering algorithms produce similar (albeit not identical) clustering results. The four clusters in general correspond well with: 1) amorphous/liquid; 2) form III; 3) form II; 4) form I. The only exception is for the “single” algorithm, which places forms III and II in the same cluster. The glass transition is not detected by any of the algorithms (even when five clusters are requested, data not shown). However all of the other transitions (crystallisation; form III; form II; form I; melt) are clearly defined, and occur at physically meaningful temperatures which correspond well with those deduced in the earlier analyses.
Overall the clustering of the spectra, using the phonon-mode data as input, appears to be reliable and robust, with all of the physical transformations of paracetamol assigned, with the exception of the glass transition. The inability of HA clustering to detect the glass transition is most likely due to the very similar spectra of the amorphous solid and the supercooled liquid. However, it is of note that PCA unambiguously identified the glass transition temperature whereas HA clustering did not.
The results of the HA clustering analyses for the spectral window 400–3800 cm−1 are given in Fig. 4b, where it is clear that the seven different algorithms do not yield similar clusters. Five of the seven algorithms appear totally insensitive to the different solid physical forms of paracetamol, with the “single”, “median”, “McQuitty”, “centroid” and “average” algorithms clustering all data from −100 to 168 °C together. Not a single algorithm (applied to this data range) is capable of providing useful information on the various transformations, despite the presence of several signature peaks in the molecular region (Fig. 1, and as discussed earlier). It appears that despite the minor differences in peak positions in this spectral region between the different forms, the patterns are overall sufficiently similar that the clustering algorithms are unable to distinguish the different solid forms present in this experiment.
The results from the HA clustering analyses clearly and unambiguously indicate that the phonon-mode data are highly suitable for differentiating the various forms of paracetamol encountered in our experiment. In contrast, the intra-molecular data (including the traditional “fingerprint” region) are not. The use of seven different clustering algorithms for each analysis provides confidence that this difference between the two data ranges is not an artefact of our methodology. In the context of assigning the different physical forms of paracetamol, the results mirror those obtained from PCA, in which the phonon-mode data were demonstrably superior to the intra-molecular data for the characterisation of phase transformations.
The work outlined above has a number of potential applications. First, it points the way to the development of more appropriate spectroscopic instrumentation for materials analysis, in that any attempts to extend the wavenumber range available should focus primarily on the low-wavenumber capabilities. Second, it suggests that online Raman monitoring of processes in which solid–solid phase changes are of importance should, where possible, employ low-wavenumber data. Third, the statistical approaches outlined above can readily be applied to situations in which automated classification of materials is important, for example in manufacturing plants and in further in situ monitoring. Finally, it demonstrates that automated screening for polymorphism in pharmaceutical materials can be readily achieved by online monitoring of recrystallisation from the glass state as a material is heated; sample requirements are of the order of mg or less.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c1ra00422k
‡ For the sake of simplicity we include all data collected in our statistical analysis. It is trivial to demonstrate using the raw data and numerical routines provided that changing the lowest wavenumber cut-off for our data makes no real difference to our results. A demonstration of this is provided in the supplementary information.†
§ For example, the glass transition seems to occur at 35 °C in our data set, whereas it is very well established through DSC that 25 °C is a more appropriate value. However the melting point observed in our work seems reasonable, as do the temperatures of the other phase transitions. As the various transitions of paracetamol are very well documented (and are in any case subject to the stochastic nature of nucleation), and as the main thrust of this work is to classify the different forms, this is not an issue of any real consequence.
This journal is © The Royal Society of Chemistry 2012