Bozena M. Lukasiak a, Rita Faria b, Simeone Zomer a, Richard G. Brereton *a and John C. Duncan b
a Centre for Chemometrics, School of Chemistry, University of Bristol, Cantocks Close, Bristol, UK BS8 1TS
b Triton Technology Ltd., 3 The Courtyard, Main Street, Keyworth, Nottinghamshire, UK NG12 5AW
First published on 16th November 2005
A new method of polymer classification is described involving dynamic mechanical analysis of polymer properties as temperature is changed. The method is based on the chemometric analysis of the damping factor (tan δ) as a function of temperature. In this study four polymer groups, namely polypropylene, low density polyethylene, polystyrene and acrylonitrile-butadiene-styrene, each characterised by different grades, were studied. The aim is to distinguish polymer groups from each other. The polymers were studied over a temperature range from −50 °C until the minimum stiffness was reached, and tan δ values were recorded approximately every 1.5 °C. Principal components analysis was performed to visualise groupings and also for feature reduction prior to classification and clustering. Several clustering and classification methods were compared, including k-means clustering, hierarchical cluster analysis, linear discriminant analysis, k-nearest neighbours, and class distances using both Euclidean and Mahalanobis measures. It is demonstrated that thermal analysis together with chemometrics provides excellent discrimination, representing a new approach for characterisation of polymers.
Thermal analysis has been employed much less frequently by analytical chemists, but involves studying the change in physical properties as materials are heated. The properties of plastics change as they are heated, usually progressing from solid through to glass to liquid. These properties can be measured in a variety of ways, for example physical properties (e.g. resistance to stress), density, calorimetry and size/shape. In this study we employ dynamic mechanical analysis (DMA) which is one of the most widespread methods for finding the glass transition and other viscoelastic relaxations in materials.10 This method yields the stiffness (modulus) and damping factor (tan δ), which can be measured as temperature is changed. These parameters allow the complete characterisation and qualification of viscoelastic materials11–14 (amorphous and semi-crystalline polymers15,16).
Most materials are not perfectly elastic and have some viscous or liquid nature in their response to an applied stress. In DMA experiments, the oscillating strain lags behind the applied oscillating stress by a phase difference which is defined by δ, where δ = 0° for elastic deformation and δ = 90° for viscous deformation. The modulus is complex, with an in-phase component and an out-of-phase component.11 The stress/resultant strain ratio can be resolved into an in-phase component, or storage modulus, E′ proportional to the elastic contribution of the sample's response, and an out-of-phase component, or loss modulus, E″, proportional to the viscous contribution.10 The ratio of these parameters, (E″/E′) gives the damping parameter, tan δ, which is the ratio of dissipated energy to stored energy for each cycle. In practice, E′ and tan δ are the parameters most often used to characterise amorphous polymers as a function of temperature.17
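The relationship between the phase lag and the two modulus components can be illustrated with a minimal sketch. It assumes the standard relations E′ = |E*| cos δ and E″ = |E*| sin δ, with |E*| taken as the ratio of stress amplitude to strain amplitude; the function name and the numerical inputs are hypothetical and are not taken from the measurements in this study.

```python
import math

def dynamic_moduli(stress_amplitude, strain_amplitude, delta_deg):
    """Return (storage modulus E', loss modulus E'', tan delta) for a phase lag in degrees."""
    delta = math.radians(delta_deg)
    magnitude = stress_amplitude / strain_amplitude  # |E*|, magnitude of the complex modulus
    e_storage = magnitude * math.cos(delta)          # in-phase (elastic) component
    e_loss = magnitude * math.sin(delta)             # out-of-phase (viscous) component
    tan_delta = math.tan(delta)                      # equals E''/E' for delta below 90 degrees
    return e_storage, e_loss, tan_delta

# Hypothetical inputs: 2 MPa stress amplitude, 0.5% strain, 10 degree phase lag
e_storage, e_loss, tan_delta = dynamic_moduli(2.0e6, 0.005, 10.0)
print(e_storage, e_loss, tan_delta)
```

A perfectly elastic response (δ = 0°) gives E″ = 0 and tan δ = 0, while tan δ grows without bound as δ approaches 90°.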
The classification of amorphous and semi-crystalline polymers depends on the polymer structure and intermolecular forces. Amorphous materials are characterized by having no long range order since the molecules do not arrange themselves into a regular habit, i.e. they do not crystallise. Semi-crystalline polymers consist of two phases: a crystalline phase and an amorphous phase.15 If the molecules have time to organise themselves into a regular habit, i.e. crystals, then a highly crystalline polymer will result. If, on the other hand, there is insufficient time to organize into crystals the resulting structure will be mainly amorphous.18
Dynamic mechanical methods, like other relaxation techniques, such as dielectric measurements and broad band NMR, provide a sensitive way of measuring the glass transition associated with amorphous materials. This is due to the large changes observed in the mechanical properties of an amorphous material as it passes through the glass to rubber transition Tg.10,11
In this paper we measure the properties of various polymeric materials using DMA as temperature is changed. Temperature profiles are characteristic of different materials, and so can be treated as multivariate data, with a temperature profile generated for each material. In this way we can employ chemometrics methods to study these profiles and so determine the nature of a plastic. In this paper we use only tan δ measurements for our analyses, although other physical parameters could potentially be employed.
Chemometric methods are especially important for aiding the non-expert. Whereas an expert can often deduce information about the characteristic of a compound, for example, by analysing a spectrum, in situations where high throughput automated quality control or characterisation is required it is often impracticable to employ an expert. In this paper we restrict the application of our methodology to four groups of polymers, for simplicity. It is hard to manually distinguish, for example, 10 or more groups using visual approaches without the aid of multivariate chemometric tools.
The aim of this paper is to demonstrate the applicability of multivariate chemometric techniques in a new area, namely the thermal analysis of materials, where hitherto more traditional approaches based primarily on empirical parameters requiring expert physical chemical knowledge have been employed.
Fig. 1 Triton Tritec Dynamic Mechanical Analyser.
The samples were tested at a heating rate of 7.1 °C min⁻¹, a frequency of 10 Hz and a displacement of 0.050 mm (≈0.5% strain), with a maximum data collection rate of one point per second. The analysis took 30 min on average. The temperature range studied was from −50 °C until the stiffness of the polymer reached a minimum value of 2 × 10² N m⁻¹. At this point the material is in a totally viscous condition and no further useful information can be obtained. Because of their different melting/softening temperatures, the polymers were analysed over different temperature ranges. Measurements were made approximately every 1.5 °C. The smallest temperature difference between data points is 0.88 °C and the largest is 3.67 °C. In total each curve contains between 99 and 197 data points (see Table 1). There were no significant differences in the average physical rate of data acquisition between polymer groups.
Polymer name | Polypropylene (PP) | |
---|---|---|---
Grade name | HA507MO | RD204CF | HL512FB
No. of repetitions | 9 | 7 | 9
Sample numbers in the training set | 1 to 7 | 10 to 14 | 17 to 23
Sample numbers in the test set | 8 and 9 | 15 and 16 | 24 and 25
Average temperature range, beginning (°C) | −65.4 | −62.9 | −63.6
Average temperature range, end (°C) | 181.9 | 160.2 | 168.6
Average no. of data points | 157 | 149 | 154

Polymer name | Low density polyethylene (LDPE) | |
---|---|---|---
Grade name | FB4230 | FA3223 | “Jim”
No. of repetitions | 9 | 7 | 9
Sample numbers in the training set | 26 to 32 | 35 to 39 | 42 to 48
Sample numbers in the test set | 33 and 34 | 40 and 41 | 49 and 50
Average temperature range, beginning (°C) | −57.9 | −59.7 | −56.4
Average temperature range, end (°C) | 137.2 | 121 | 121.6
Average no. of data points | 130 | 102 | 109

Polymer name | Polystyrene (PS) | |
---|---|---|---
Grade name | SR550 | N2560 | R850E
No. of repetitions | 9 | 7 | 9
Sample numbers in the training set | 51 to 57 | 60 to 64 | 67 to 73
Sample numbers in the test set | 58 and 59 | 65 and 66 | 74 and 75
Average temperature range, beginning (°C) | −59.1 | −55.5 | −66.5
Average temperature range, end (°C) | 168.4 | 168.1 | 169.3
Average no. of data points | 138 | 155 | 141

Polymer name | Acrylonitrile-butadiene-styrene (ABS) | |
---|---|---|---
Grade name | B432/E | F332 | JG
No. of repetitions | 9 | 7 | 9
Sample numbers in the training set | 76 to 82 | 85 to 89 | 92 to 98
Sample numbers in the test set | 83 and 84 | 90 and 91 | 99 and 100
Average temperature range, beginning (°C) | −58.8 | −60.3 | −60.7
Average temperature range, end (°C) | 168.8 | 168.4 | 185.9
Average no. of data points | 139 | 136 | 147
Each analysis was carried out on a fresh moulding made from the same batch of material.
Fig. 3 Tan δ plots of amorphous polymers between 70 and 170 °C; each polymer is represented by 3 different grades, and one replicate of each grade is illustrated.
[Equation (1)]
Interpolated values of tan δ at the same temperatures for all the polymers were calculated for the range −48 °C to 118.5 °C, with an interpolated temperature interval of exactly 1.5 °C, resulting in 112 data points equidistant in temperature. Before interpolation, the last data point of the shortest polymer curve (sample no. 40) was at 119.28 °C. In order to retain the most useful information at high temperatures, one more (113th) data point was added for all the samples: the tan δ value interpolated at 119.28 °C.
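The construction of the common temperature axis can be sketched as follows. This is an illustration only: it assumes simple linear interpolation between neighbouring measured points, which may differ from the interpolation formula actually used, and the function names are hypothetical.

```python
from bisect import bisect_right

def interpolate_linear(temps, values, t):
    """Linearly interpolate values at temperature t; temps must be strictly ascending."""
    if not temps[0] <= t <= temps[-1]:
        raise ValueError("t lies outside the measured temperature range")
    j = bisect_right(temps, t)
    if j == len(temps):  # t coincides with the last measured temperature
        return values[-1]
    t0, t1 = temps[j - 1], temps[j]
    v0, v1 = values[j - 1], values[j]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def common_grid(start=-48.0, stop=118.5, step=1.5, extra=119.28):
    """Equidistant temperatures from start to stop, plus one extra high-temperature point."""
    n = round((stop - start) / step) + 1
    grid = [start + k * step for k in range(n)]
    grid.append(extra)
    return grid

grid = common_grid()
print(len(grid))  # 113
```

Each measured curve would then be evaluated at every temperature in `grid`, giving one row of 113 values per sample.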
A matrix X was obtained, consisting of 113 columns, containing the 113 interpolated tan δ values and 100 rows corresponding to 100 polymer samples. Data were mean-centered along the columns.
The whole dataset was divided into a training set containing data for 76 polymer samples and a test set containing data for 24 polymer samples (Table 1). Principal components analysis (PCA) and the choice of the optimal number of components (Section 3.2) are performed on the training set. Outlier detection (Section 3.3) is performed on the training and test sets together. Unsupervised clustering (Section 3.5) using linkage and k-means is performed on the entire dataset, whereas classification methods (Section 3.6) are performed on the training set and validated using the test set.
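The division into training and test sets follows directly from the sample numbering in Table 1, and can be reproduced with a short sketch. The variable and function names are hypothetical; the counts per grade are those listed in Table 1.

```python
POLYMERS = ["PP", "LDPE", "PS", "ABS"]
# (training, test) replicate counts for the three grades of each polymer (Table 1)
GRADE_LAYOUT = [(7, 2), (5, 2), (7, 2)]

def build_split():
    """Return training sample numbers, test sample numbers and a sample -> polymer map."""
    train, test, polymer_of = [], [], {}
    sample = 1  # samples are numbered from 1 to 100 in Table 1
    for polymer in POLYMERS:
        for n_train, n_test in GRADE_LAYOUT:
            for target, count in ((train, n_train), (test, n_test)):
                for _ in range(count):
                    target.append(sample)
                    polymer_of[sample] = polymer
                    sample += 1
    return train, test, polymer_of

train, test, polymer_of = build_split()
print(len(train), len(test))  # 76 24
```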
$$\mathbf{X} = \mathbf{T}\mathbf{P} + \mathbf{E} \qquad (2)$$
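The decomposition in eqn (2) can be illustrated with a minimal sketch, assuming that PCA is computed by singular value decomposition of the column mean-centred matrix using NumPy. This is one common way of obtaining scores and loadings and is not necessarily the algorithm used in this study; the function names `pca` and `project` are hypothetical.

```python
import numpy as np

def pca(X, n_components):
    """Return column means, scores T, loadings P and residuals E, with X - mean = T P + E."""
    mean = X.mean(axis=0)
    Xc = X - mean                                # mean-centre along the columns
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # scores (one row per sample)
    P = Vt[:n_components]                        # loadings (one row per component)
    E = Xc - T @ P                               # residual not described by the model
    return mean, T, P, E

def project(X_new, mean, P):
    """Scores of new samples (e.g. a test set) using the training mean and loadings."""
    return (X_new - mean) @ P.T
```

Keeping `mean` and `P` from the training set allows test samples to be projected onto the same components without refitting the model.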
When using PCA as a graphical method for visualising data, many different plots can be constructed once PCA has been performed.1,18,20 In this paper two are employed: scores vs. scores and loadings vs. temperature.
Scores vs. scores plots (scores from one component against scores from another component) relate the different objects to each other. Clustering of points represents clusters in real data. In this paper these plots are used to determine which plastics are more similar to each other, and so visualise groupings.
Loadings vs. temperature plots (loadings from one component against temperature) illustrate which variables the corresponding principal component mainly accounts for. Temperatures at which the loadings of a given PC are far from zero are strongly associated with this PC.1,2
In addition to its use for data visualisation, PCA can also be employed for variable reduction prior to clustering or classification. Datasets in which the number of variables is higher than the number of objects, such as our dataset, produce singular variance–covariance matrices1 due to the correlation existing between variables.2,21 Since the calculation of the Mahalanobis distance uses the variance–covariance matrix, a method of removing the correlation between variables, such as PCA, is required. For consistency a similar approach is employed when using Euclidean distance measures. The multidimensionality of the data (in total each curve contains between 99 and 197 measurements) may also cause problems for k-NN classifiers, because redundant information in the training sets influences their classification ability.22
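The singularity of the variance–covariance matrix when variables outnumber objects can be checked numerically. The sketch below uses random numbers in place of the tan δ data, purely as an illustration; the function name is hypothetical.

```python
import numpy as np

def covariance_rank(n_samples, n_variables, seed=0):
    """Shape and numerical rank of the variance-covariance matrix of random data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_variables))
    C = np.cov(X, rowvar=False)  # n_variables x n_variables matrix
    return C.shape, int(np.linalg.matrix_rank(C))

# Same dimensions as the data matrix described above: 100 samples, 113 variables
shape, rank = covariance_rank(100, 113)
print(shape, rank)
```

Because the rank cannot exceed the number of samples minus one, a 113 × 113 matrix estimated from 100 samples cannot be inverted, which is why the Mahalanobis distance is computed on a small number of PC scores instead.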
It should be noted that there are alternative approaches for classification apart from using the Mahalanobis distance measure, such as SIMCA or RDA, that are also applicable to matrices where the number of variables exceeds the number of samples. For brevity, in this paper, we restrict the number of methods we employ, but there are alternative approaches that have been developed specifically for these types of problems, which are often encountered in chemometrics.
Because the aim of this work is not curve reconstruction but the classification of polymers, the quality of classification provides us with insight as to the number of significant components. The method used here to determine the optimum number of components for classification incorporates the leave one out technique (LOO), where each sample is removed once from the training set and classified using a model constructed from the remainder.25–27 Each polymer sample is assigned to one of the 4 polymer groups on the basis of the group membership of its nearest neighbours (k-nearest neighbour method).
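The leave one out procedure can be sketched as below. Two simplifying assumptions are made for illustration: the PC scores are computed once and simply truncated to the first few components (rather than recomputing PCA each time a sample is removed), and k defaults to 1. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_label(query, points, labels, k=1):
    """Majority label among the k nearest points (Euclidean distance)."""
    order = sorted(range(len(points)), key=lambda j: math.dist(query, points[j]))
    votes = Counter(labels[j] for j in order[:k])
    return votes.most_common(1)[0][0]

def loo_correct(scores, labels, n_components, k=1):
    """Number of samples classified correctly by leave-one-out k-NN on the first n_components scores."""
    reduced = [row[:n_components] for row in scores]
    correct = 0
    for i, query in enumerate(reduced):
        rest = reduced[:i] + reduced[i + 1:]          # model built without sample i
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_label(query, rest, rest_labels, k) == labels[i]:
            correct += 1
    return correct

# Toy scores: the first component separates the classes, the second is noise
scores = [[0.0, 5.0], [0.1, -5.0], [0.2, 5.0], [1.0, -5.0], [1.1, 5.0], [1.2, -5.0]]
labels = ["A", "A", "A", "B", "B", "B"]
for n in (1, 2):
    print(n, loo_correct(scores, labels, n))
```

In this toy example one component classifies all six samples correctly, whereas adding the second, uninformative component reduces the count to four; the number of components giving the best leave one out result would be retained.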
The generalized distance DiA of the polymer sample i to the centre of cluster containing samples of group A is calculated as:1,2
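$$D_{iA} = \sqrt{(\mathbf{x}_i - \bar{\mathbf{x}}_A)\,\mathbf{C}_A^{-1}\,(\mathbf{x}_i - \bar{\mathbf{x}}_A)'} \qquad (3)$$
where $\mathbf{x}_i$ is the row vector of variables for sample i, $\bar{\mathbf{x}}_A$ is the centroid of group A and $\mathbf{C}_A$ is the variance–covariance matrix of group A.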
An unknown polymer was classified into the group whose distance to the relevant centroid was least. The models were set up using the training set and then validated using the test set.
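Classification by distance to class centroids can be sketched as follows. The sketch assumes the usual forms of the two measures: the Euclidean distance to the class mean, and the Mahalanobis distance computed with the variance–covariance matrix of each class; whether a class-specific or a pooled matrix was used in the study is an assumption here. The function names and the toy data are hypothetical.

```python
import numpy as np

def class_distances(x, class_scores, metric="euclidean"):
    """Distance from sample x to the centroid of every class.

    class_scores maps a class name to an (n_samples, n_components) array of
    training scores, with at least two components.
    """
    distances = {}
    for name, S in class_scores.items():
        diff = x - S.mean(axis=0)
        if metric == "mahalanobis":
            C = np.cov(S, rowvar=False)          # class variance-covariance matrix
            d2 = diff @ np.linalg.inv(C) @ diff
        else:
            d2 = diff @ diff
        distances[name] = float(np.sqrt(d2))
    return distances

def classify(x, class_scores, metric="euclidean"):
    """Assign x to the class whose centroid is nearest."""
    d = class_distances(x, class_scores, metric)
    return min(d, key=d.get)

# Toy data: one widely dispersed class and one compact class
class_scores = {
    "wide": np.array([[-4.0, 0.0], [4.0, 0.0], [0.0, -4.0], [0.0, 4.0]]),
    "tight": np.array([[5.0, 0.0], [5.5, 0.5], [6.0, 0.0], [5.5, -0.5]]),
}
x = np.array([3.5, 0.0])
print(classify(x, class_scores, "euclidean"), classify(x, class_scores, "mahalanobis"))
```

In the toy data the sample is nearer to the compact class in Euclidean terms, but the Mahalanobis measure assigns it to the dispersed class, because distances are scaled by the spread of each class.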
Scores of the first two significant PCs were used for the classification.
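$$f_{AB} = (\bar{\mathbf{x}}_A - \bar{\mathbf{x}}_B)\,\mathbf{C}_{AB}^{-1}\,\mathbf{x}_i' - \tfrac{1}{2}(\bar{\mathbf{x}}_A - \bar{\mathbf{x}}_B)\,\mathbf{C}_{AB}^{-1}\,(\bar{\mathbf{x}}_A + \bar{\mathbf{x}}_B)' \qquad (4)$$
where $\bar{\mathbf{x}}_A$ and $\bar{\mathbf{x}}_B$ are the centroids of groups A and B, $\mathbf{C}_{AB}$ is the pooled variance–covariance matrix of the two groups and $\mathbf{x}_i$ is the row vector of variables for sample i.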
The whole procedure was performed in 2 steps: first a polymer was classified as semi-crystalline or amorphous, then these 2 groups were divided into PP or LDPE, and into PS or ABS, respectively. Two separate classifications were performed: the training set classification and the test set classification. In the test set classification all the samples from the test set were classified using weights calculated from the training set.
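The two-step procedure can be sketched as follows, assuming the common two-class linear discriminant in which the weights are the difference between the class centroids multiplied by the inverse of the pooled variance–covariance matrix, and a sample is assigned to the first class when the function is positive. The function names are hypothetical, and `train` stands for PC scores of the training set.

```python
import numpy as np

def fit_lda(A, B):
    """Weights and threshold of a two-class linear discriminant from training arrays A and B."""
    mean_a, mean_b = A.mean(axis=0), B.mean(axis=0)
    pooled = ((len(A) - 1) * np.cov(A, rowvar=False)
              + (len(B) - 1) * np.cov(B, rowvar=False)) / (len(A) + len(B) - 2)
    weights = np.linalg.solve(pooled, mean_a - mean_b)
    threshold = 0.5 * weights @ (mean_a + mean_b)
    return weights, threshold

def discriminant(x, model):
    """Positive values indicate the first class, negative values the second."""
    weights, threshold = model
    return float(weights @ x - threshold)

def two_step_classify(x, train):
    """train maps 'PP', 'LDPE', 'PS' and 'ABS' to arrays of training scores."""
    semi = np.vstack([train["PP"], train["LDPE"]])      # semi-crystalline polymers
    amorphous = np.vstack([train["PS"], train["ABS"]])  # amorphous polymers
    if discriminant(x, fit_lda(semi, amorphous)) > 0:
        return "PP" if discriminant(x, fit_lda(train["PP"], train["LDPE"])) > 0 else "LDPE"
    return "PS" if discriminant(x, fit_lda(train["PS"], train["ABS"])) > 0 else "ABS"
```

The weights depend only on the training arrays, so test samples are classified without being used to build the model.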
Fig. 4 Scores plot, whole data set, PC2 against PC1, polymers: PP (□), LDPE (○), PS (◇), ABS (△). Some of the samples which are described in the article text are numbered in the plot.
The loadings plot (Fig. 5) should be compared with the tan δ plots (Fig. 2 and Fig. 3). The first PC is characterised mainly by the region between 100 °C and 120 °C. This PC distinguishes LDPE very well from the other polymers, and PP from ABS. It also characterises the division between two very similar polymers, PS and ABS, which is reflected in the scores plot (Fig. 4). The second PC has characteristic intensity mainly in the regions around 100 °C and 120 °C. These temperatures are most important for characterising the division between LDPE and ABS. The loadings for the third PC are high in the regions around 90 °C and 110 °C, and low in between. This PC helps to distinguish between LDPE and ABS and also highlights a small separation between PP and LDPE.
Fig. 5 Loadings plot for the first three principal components.
Preliminary classification using the k-NN method, as described below, gives very good results even in the presence of outliers. This implies that these suspect samples do not influence the result and there is probably no need to remove them. However, classification using distance measures becomes worse in the presence of outliers, as these samples have a significant influence on the centroids of the clusters.
After removal of the outlier, PCA is performed once again on the data.
Fig. 6 Dendrogram for average linkage method.
Although the random initial choice of centroids used in the k-means method gives different results for each repetition of the procedure, a general pattern can be found. Often one ABS grade clusters together with the PS polymers (apart from one sample). Furthermore, two other incorrect situations can occur: LDPE and PP polymers appear in the same cluster and one PS grade forms a separate cluster, or the same PS grade clusters together with the LDPE polymer. The last situation is surprising, as these two polymers lie quite far apart.
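The dependence of k-means on its starting point can be seen in a minimal sketch of the standard algorithm, in which the initial centroids are drawn at random from the samples and then refined iteratively. The function names are hypothetical, and the handling of empty clusters (keeping the previous centroid) is an assumption made for illustration.

```python
import math
import random

def random_start(points, k, seed):
    """Draw k distinct samples at random to serve as initial centroids."""
    return random.Random(seed).sample(points, k)

def k_means(points, initial_centroids, n_iter=100):
    """Iterate assignment and centroid update until the assignment stops changing."""
    centroids = [list(c) for c in initial_centroids]
    k = len(centroids)
    assignment = None
    for _ in range(n_iter):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
for seed in range(3):
    assignment, _ = k_means(points, random_start(points, 2, seed))
    print(seed, assignment)
```

Repeating the procedure with several seeds and comparing the resulting partitions is one simple way of finding the general pattern referred to above.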
Method | Training set | Test set
---|---|---
Euclidean distance | 74/75 (99%) | 24/24 (100%)
Mahalanobis distance | 74/75 (99%) | 23/24 (96%)
k-NN (k = 1) | 75/75 (100%) | 24/24 (100%)
Linear discriminant analysis | 75/75 (100%) | 24/24 (100%)
One sample from the training set (no. 68, from the PS group) was incorrectly classified as ABS using the Euclidean distance measure, so the classification is 99% correct. Surprisingly, a better result was obtained with the test set, as no sample was misclassified (see Table 2), although sample 74 was very close to being misclassified.
This result may be improved after removal of sample 54, which also may be considered an outlier. This sample has considerable influence on the PS centroid (see Fig. 4), which makes the distance between the cluster centroid and sample 68 higher.
Using the Mahalanobis distance measure, one sample of the ABS polymer (no. 98) is closer to the PS centroid than to the ABS centroid. This result is a consequence of the characteristic of the Mahalanobis distance, which takes into account the data distribution, unlike the Euclidean distance.2 The mean distance of the PS polymer samples from the centre of their cluster is quite high. Because the Mahalanobis distance accounts for the dispersion of different groups, samples which are far from the PS polymer centre in Euclidean space can have a relatively low Mahalanobis distance from this class. The result is that samples belonging to other classes (for example sample 98) are incorporated into the PS polymer class.
This result is surprising, because one expects sample 68, which is furthest to the right of all the PS samples in the plot of Fig. 4, and which is misclassified using Euclidean distance, to be misclassified also using the Mahalanobis distance. But the large distance of this sample from the cluster centre is corrected for by the high variance of both PC1 and PC2 in the PS class which results in correct classification.
The classification rate was similar for the test set: one sample was misclassified, namely sample 99, which is the closest to sample 98, itself misclassified in the training set.
The k-NN method gives the best result for k = 1 if clusters are disjoint,22 so we use this number of nearest neighbours. Using this method, all samples in both the training set and the test set are correctly classified.
With LDA, 100% of the samples are correctly classified in both the training set and the test set (see Table 2).
The characteristics of different grades of the same polymer differ. As a result of these differences, clustering methods such as k-means, which are based on the assumption that one polymer group forms only one cluster, are not able to find the correct polymer clusters. Supervised classification methods (class distances using Euclidean or Mahalanobis measures, LDA and k-NN) perform well in our case, correctly classifying between 96% and 100% of the samples (see Table 2).
The combination of thermal analysis and chemometrics appears to have significant promise for the characterisation of materials.
This journal is © The Royal Society of Chemistry 2006