Pattern recognition for the analysis of polymeric materials

Bozena M. Lukasiak a, Rita Faria b, Simeone Zomer a, Richard G. Brereton *a and John C. Duncan b
a Centre for Chemometrics, School of Chemistry, University of Bristol, Cantocks Close, Bristol, UK BS8 1TS
b Triton Technology Ltd., 3 The Courtyard, Main Street, Keyworth, Nottinghamshire, UK NG12 5AW

Received 25th July 2005, Accepted 25th October 2005

First published on 16th November 2005


Abstract

A new method of polymer classification is described involving dynamic mechanical analysis of polymer properties as temperature is changed. The method is based on the chemometric analysis of the damping factor (tan δ) as a function of temperature. In this study four polymer groups, namely polypropylene, low density polyethylene, polystyrene and acrylonitrile-butadiene-styrene, each represented by different grades, were studied. The aim is to distinguish polymer groups from each other. The polymers were studied from −50 °C until the minimum stiffness was reached, and tan δ values were recorded approximately every 1.5 °C. Principal components analysis was performed to visualise groupings and also for feature reduction prior to classification and clustering. Several clustering and classification methods were compared including k-means clustering, hierarchical cluster analysis, linear discriminant analysis, k-nearest neighbours, and class distances using both Euclidean and Mahalanobis measures. It is demonstrated that thermal analysis together with chemometrics provides excellent discrimination, representing a new approach for characterisation of polymers.


1. Introduction

Most analytical methods for the characterisation of chemicals involve either spectroscopy or chromatography or a combination of techniques. Chemometrics methods have been developed over the past two decades to assist in this procedure,1–3 usually by performing multivariate analysis on the resultant profiles. Both exploratory methods, such as principal components analysis,4–6 and classification methods7–9 are commonly employed.

Thermal analysis has been employed much less frequently by analytical chemists, but involves studying the change in physical properties as materials are heated. The properties of plastics change as they are heated, usually progressing from solid through to glass to liquid. These properties can be measured in a variety of ways, for example physical properties (e.g. resistance to stress), density, calorimetry and size/shape. In this study we employ dynamic mechanical analysis (DMA) which is one of the most widespread methods for finding the glass transition and other viscoelastic relaxations in materials.10 This method yields the stiffness (modulus) and damping factor (tan δ), which can be measured as temperature is changed. These parameters allow the complete characterisation and qualification of viscoelastic materials11–14 (amorphous and semi-crystalline polymers15,16).

Most materials are not perfectly elastic and have some viscous or liquid nature in their response to an applied stress. In DMA experiments, the oscillating strain lags behind the applied oscillating stress by a phase difference which is defined by δ, where δ = 0° for elastic deformation and δ = 90° for viscous deformation. The modulus is complex, with an in-phase component and an out-of-phase component.11 The stress/resultant strain ratio can be resolved into an in-phase component, or storage modulus, E′ proportional to the elastic contribution of the sample's response, and an out-of-phase component, or loss modulus, E″, proportional to the viscous contribution.10 The ratio of these parameters, (E″/E′) gives the damping parameter, tan δ, which is the ratio of dissipated energy to stored energy for each cycle. In practice, E′ and tan δ are the parameters most often used to characterise amorphous polymers as a function of temperature.17
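
The relationship between the phase lag δ, the two modulus components and tan δ can be illustrated with a short numerical sketch. The function name and the stress and strain amplitudes used in the usage note are hypothetical and are not taken from the instrument software; the sketch only assumes the standard resolution of the complex modulus into E′ = |E*| cos δ and E″ = |E*| sin δ.

```python
import math

def dynamic_moduli(stress_amplitude, strain_amplitude, phase_lag_deg):
    """Resolve a sinusoidal stress/strain response into E', E'' and tan delta.

    stress_amplitude and strain_amplitude are the peak values of the applied
    stress and the resulting strain; phase_lag_deg is the phase lag delta in
    degrees (0 for purely elastic, 90 for purely viscous deformation).
    """
    delta = math.radians(phase_lag_deg)
    magnitude = stress_amplitude / strain_amplitude  # |E*|
    e_storage = magnitude * math.cos(delta)          # in-phase part, E'
    e_loss = magnitude * math.sin(delta)             # out-of-phase part, E''
    tan_delta = e_loss / e_storage                   # damping factor
    return e_storage, e_loss, tan_delta
```

For example, `dynamic_moduli(2.0e6, 0.005, 45.0)` describes a hypothetical sample in which the stored and dissipated contributions are equal, so that tan δ is 1, whereas a phase lag of 0° gives tan δ = 0.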

The classification of amorphous and semi-crystalline polymers depends on the polymer structure and intermolecular forces. Amorphous materials are characterized by having no long range order since the molecules do not arrange themselves into a regular habit, i.e. they do not crystallise. Semi-crystalline polymers consist of two phases: a crystalline phase and an amorphous phase.15 If the molecules have time to organise themselves into a regular habit, i.e. crystals, then a highly crystalline polymer will result. If, on the other hand, there is insufficient time to organize into crystals the resulting structure will be mainly amorphous.18

Dynamic mechanical methods, like other relaxation techniques, such as dielectric measurements and broad band NMR, provide a sensitive way of measuring the glass transition associated with amorphous materials. This is due to the large changes observed in the mechanical properties of an amorphous material as it passes through the glass to rubber transition Tg.10,11

In this paper we measure the properties of various polymeric materials using DMA as temperature is changed. Temperature profiles are characteristic of different materials, and so can be treated as multivariate data, with a temperature profile generated for each material. In this way we can employ chemometrics methods to study these profiles and so determine the nature of a plastic. In this paper we use only tan δ measurements for our analyses, although other physical parameters could potentially be employed.

Chemometric methods are especially important for aiding the non-expert. Whereas an expert can often deduce information about the characteristic of a compound, for example, by analysing a spectrum, in situations where high throughput automated quality control or characterisation is required it is often impracticable to employ an expert. In this paper we restrict the application of our methodology to four groups of polymers, for simplicity. It is hard to manually distinguish, for example, 10 or more groups using visual approaches without the aid of multivariate chemometric tools.

The aim of this paper is to demonstrate the applicability of multivariate chemometric techniques in a new area, namely the thermal analysis of materials, where hitherto more traditional approaches based primarily on empirical parameters requiring expert physical chemical knowledge have been employed.

2. Experimental

2.1. Equipment

The equipment used in these analyses was the Tritec Dynamic Mechanical Analyser (Triton Technology Ltd., Keyworth, UK) (Fig. 1). This instrument can apply a sinusoidal load of ±10 N at a specified frequency between 0.001 and 600 Hz, measure displacements between 0.0001 mm and 1 mm, and heat and cool at rates from 0.1 °C min−1 to 20 °C min−1 over the temperature range −150 °C to 400 °C. The drive head can be operated vertically downwards, horizontally, and vertically upwards. This allows samples to be mounted in the most convenient way and also allows immersion in fluid. In this study single cantilever clamped bending was employed.
Fig. 1 Triton Tritec Dynamic Mechanical Analyser.

2.2. Sample treatment

Polymer samples were compression moulded from polymer granules in a heated press to form bars. All the bars were measured in single cantilever bending geometry. Fig. 1 shows a schematic of this. Samples had the following dimensions: free length (L) of 10 mm between the clamps, width (W) of 5 mm and thickness (T) of 3–4 mm.

The samples were tested at a heating rate of 7.1 °C min−1, a frequency of 10 Hz and a displacement of 0.050 mm (≈0.5% strain), with a maximum data collection rate of one point per second. The analysis took 30 min on average. The temperature range studied was from −50 °C until the stiffness of the polymer fell to a minimum value of 2 × 10² N m−1. At this point the material is in a totally viscous condition and no further useful information can be obtained. Due to the different melting/softening temperatures, polymers were analysed over different temperature ranges. Measurements were made approximately every 1.5 °C. The smallest temperature difference between data points is 0.88 °C and the largest is 3.67 °C. In total each curve contains between 99 and 197 data points (see Table 1). There were no significant differences in the average physical rate of data acquisition between polymer groups.

Table 1 Training and test sets

Semi-crystalline polymers

Polymer name: Polypropylene (PP)
Grade name                           HA507MO    RD204CF    HL512FB
No. of repetitions                   9          7          9
Sample numbers in the training set   1–7        10–14      17–23
Sample numbers in the test set       8, 9       15, 16     24, 25
Average start temperature/°C         −65.4      −62.9      −63.6
Average end temperature/°C           181.9      160.2      168.6
Average no. of data points           157        149        154

Polymer name: Low density polyethylene (LDPE)
Grade name                           FB4230     FA3223     “Jim”
No. of repetitions                   9          7          9
Sample numbers in the training set   26–32      35–39      42–48
Sample numbers in the test set       33, 34     40, 41     49, 50
Average start temperature/°C         −57.9      −59.7      −56.4
Average end temperature/°C           137.2      121        121.6
Average no. of data points           130        102        109

Amorphous polymers

Polymer name: Polystyrene (PS)
Grade name                           SR550      N2560      R850E
No. of repetitions                   9          7          9
Sample numbers in the training set   51–57      60–64      67–73
Sample numbers in the test set       58, 59     65, 66     74, 75
Average start temperature/°C         −59.1      −55.5      −66.5
Average end temperature/°C           168.4      168.1      169.3
Average no. of data points           138        155        141

Polymer name: Acrylonitrile-butadiene-styrene (ABS)
Grade name                           B432/E     F332       JG
No. of repetitions                   9          7          9
Sample numbers in the training set   76–82      85–89      92–98
Sample numbers in the test set       83, 84     90, 91     99, 100
Average start temperature/°C         −58.8      −60.3      −60.7
Average end temperature/°C           168.8      168.4      185.9
Average no. of data points           139        136        147


2.3. Polymers

The materials used in the analyses were supplied by Borealis (Linz, Austria) and Polimeri (Milan, Italy). The dataset involves a total of 100 analyses, originating from 4 main polymer groups: polypropylene (PP), low density polyethylene (LDPE), polystyrene (PS) and acrylonitrile-butadiene-styrene (ABS), each in turn consisting of 3 grades. The number of analyses performed for each grade is shown in Table 1. Note that the PS grades SR550 and R850E contain a high impact additive, so these grades might not cluster very well with the remaining PS grade.

Each analysis was carried out on a fresh moulding made from the same batch of material.

2.4. Graphical representation

Typical DMA traces are presented in Fig. 2 and Fig. 3. Fig. 2 shows traces for PP and LDPE, which are semi-crystalline materials, whereas Fig. 3 shows traces for PS and ABS, which are totally amorphous polymers. These graphs are used as raw data for chemometric analysis.
Fig. 2 Tan δ plots of the semi-crystalline polymers PP and LDPE.

Fig. 3 Tan δ plots of amorphous polymers between 70 and 170 °C, each polymer is represented by 3 different grades; one replicate of each grade is illustrated.

2.5. Software

DMA control and data acquisition were performed using software based in Excel, written in Visual Basic. Subsequent data processing and visualisation were performed using Matlab 6 and Excel 2002. Some functions were downloaded from http://www.cis.hut.fi/projects/somtoolbox/, written by Johan Himberg, Helsinki University of Technology, Finland.

3. Data analysis

3.1. Data preprocessing

Because the measurements of different polymer samples were performed at different temperatures and were not equally spaced, data were linearly interpolated, using an equation of the form:

$$\tan\delta = \tan\delta_1 + \frac{(T - T_1)(\tan\delta_2 - \tan\delta_1)}{T_2 - T_1} \qquad (1)$$

where tan δ is the new, interpolated value at a temperature T lying between the recorded temperatures T1 and T2, and tan δ1 and tan δ2 are the respective recorded measurements.

Interpolated values of tan δ at the same temperatures for all the polymers were calculated for the range −48 °C to 118.5 °C, with an interpolated temperature interval of exactly 1.5 °C, resulting in 112 equally spaced data points. Before interpolation, the last data point of the shortest polymer curve (sample no. 40) was at 119.28 °C. To retain as much information as possible at high temperatures, one more (113th) data point was added for every sample by interpolating the tan δ value at 119.28 °C.
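
The interpolation onto the common temperature grid can be sketched as follows. This is a minimal illustration rather than the code used in the study (which was run in Matlab and Excel); the function names are hypothetical, and the sketch assumes that the recorded temperatures of each curve are strictly ascending and cover the whole grid.

```python
def interpolate_linear(temps, values, t_new):
    """Linearly interpolate `values`, recorded at ascending `temps`, at t_new."""
    if not temps[0] <= t_new <= temps[-1]:
        raise ValueError("t_new lies outside the measured temperature range")
    for i in range(len(temps) - 1):
        t1, t2 = temps[i], temps[i + 1]
        if t1 <= t_new <= t2:
            y1, y2 = values[i], values[i + 1]
            # straight line through the two neighbouring measurements
            return y1 + (t_new - t1) * (y2 - y1) / (t2 - t1)

def common_grid(start=-48.0, stop=118.5, step=1.5, extra=119.28):
    """112 equally spaced temperatures plus the extra 113th point."""
    n = int(round((stop - start) / step)) + 1
    return [start + k * step for k in range(n)] + [extra]

def resample(temps, values, grid):
    """Return one row of the data matrix: tan delta at every grid temperature."""
    return [interpolate_linear(temps, values, t) for t in grid]
```

Calling `resample` once per polymer curve with the grid returned by `common_grid()` would give rows of equal length (113 values) that can be stacked into a single matrix.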

A matrix X was obtained, consisting of 113 columns, containing the 113 interpolated tan δ values and 100 rows corresponding to 100 polymer samples. Data were mean-centered along the columns.

The whole dataset was divided into a training set containing data for 76 polymer samples and a test set containing data for 24 polymer samples (Table 1). Principal components analysis (PCA) and the choice of the optimal number of components (Sections 3.2 and 3.4) are performed on the training set. Outlier detection (Section 3.3) is performed on both training and test sets together. Unsupervised clustering (Section 3.5) using linkage and k-means is performed on the entire dataset, whereas classification methods (Section 3.6) are performed on the training set and validated using the test set.

3.2. Principal components analysis

PCA is a common method for multidimensional data visualisation and reduction of dimensions.1,2,4–6 PCA projects the data matrix X (100 × 113 in our case) into new orthogonal subspaces (called principal components or PCs). The decomposition of the data matrix can be defined as

$$X = TP + E \qquad (2)$$

where T is the scores matrix (dimensions 100 × K) whose rows correspond to objects in X (in this case polymer samples), P is the loadings matrix (K × 113) which represents trends across the columns (variables or temperatures) of X, and E is the residual matrix (100 × 113). K, the rank of the data matrix, is the number of significant components retained (see Section 3.4).
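
As an illustration of eqn (2), the decomposition can be sketched with the NIPALS algorithm, which extracts one component at a time. The choice of NIPALS is an assumption made for this sketch, since the algorithm used in the original Matlab calculations is not stated, and all function names are hypothetical. The sketch also assumes that the column-centred matrix has rank of at least K.

```python
def mean_centre(X):
    """Subtract the column means from every row of X."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X], means

def nipals_pca(X, n_components, tol=1e-12, max_iter=1000):
    """Return scores T (n x K), loadings P (K x m) and residuals E with X = TP + E."""
    E = [row[:] for row in X]
    n, m = len(E), len(E[0])
    score_cols, loading_rows = [], []
    for _ in range(n_components):
        # initial guess for the scores: the column with the largest sum of squares
        j0 = max(range(m), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[j0] for row in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
            norm = sum(v * v for v in p) ** 0.5
            p = [v / norm for v in p]                      # unit-length loadings
            t_new = [sum(E[i][j] * p[j] for j in range(m)) for i in range(n)]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        # deflation: remove the modelled component from the residuals
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        score_cols.append(t)
        loading_rows.append(p)
    T = [list(row) for row in zip(*score_cols)]
    return T, loading_rows, E

def explained_fraction(T, X):
    """Fraction of the total sum of squares of X captured by each component."""
    total = sum(v * v for row in X for v in row)
    return [sum(row[k] ** 2 for row in T) / total for k in range(len(T[0]))]
```

Applied to a mean-centred matrix, `nipals_pca(Xc, 2)` would return the scores used for the scores plots and the loadings plotted against temperature, and the running sum of `explained_fraction(T, Xc)` corresponds to the cumulative variance discussed in Section 4.3.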

When using PCA as a graphical method for visualising data, many different plots can be constructed once PCA has been performed.1,18,20 In this paper two are employed: scores vs. scores and loadings vs. temperature.

Scores vs. scores plots (scores from one component against scores from another component) relate the different objects to each other. Clustering of points represents clusters in real data. In this paper these plots are used to determine which plastics are more similar to each other, and so visualise groupings.

Loadings vs. temperature plots (loadings from one component against temperature) illustrate which variable the corresponding principal component mainly accounts for. Temperatures at which the loadings of a defined PC are far from zero are strongly associated with this PC.1,2

In addition to using PCA as a method for data visualisation it can also be employed for variable reduction prior to clustering or classification. Datasets characterised by a number of variables higher than the number of objects, such as our dataset, produce singular variance–covariance matrices1 due to the correlation existing between variables.2,21 Since the calculation of the Mahalanobis distance uses the variance–covariance matrix, a method of removing the correlation between variables, such as PCA, is required. For consistency a similar approach is employed when using Euclidean distance measures. The multidimensionality of the data (in total each curve contains between 99 and 197 measurements) may also cause problems for k-NN classifiers, because redundant information used in the training sets influences their classification ability.22

It should be noted that there are alternative approaches for classification apart from using the Mahalanobis distance measure, such as SIMCA or RDA, that are also applicable to matrices where the number of variables exceeds the number of samples. For brevity, in this paper, we restrict the number of methods we employ, but there are alternative approaches that have been developed specifically for these types of problems that are often encountered in chemometrics.

3.3. Outliers

Because distance measures are very sensitive to outliers, one needs to reject the outlying samples before classification. Several tests for outlier detection have been developed.2,23 Due to the characteristics of the data (see Section 4.2), in our case visual investigation of the scores plot of the first two PCs was employed. For a fully automated procedure, methods for outlier detection such as those using the Mahalanobis distance24 could be employed. However, outlier detection is mainly important in the training set, which is assembled only once. If an outlier is found in the test set, it is simply rejected as not being a member of any predefined group, which is probably a realistic result.

3.4. Determining the number of significant components

An early step in data analysis using PCA is the choice of the optimal number of components to use in the classification (K, the rank of the data matrix). It is assumed that a certain number of components model data, whereas the residuals model mainly noise.1,2,18

Because the aim of this work is not curve reconstruction but the classification of polymers, the quality of classification provides us with insight as to the number of significant components. The method used here to determine the optimum number of components for classification incorporates the leave-one-out (LOO) technique, where each sample is removed once from the training set and classified using a model constructed from the remainder.25–27 Each polymer sample is assigned to one of the 4 polymer groups on the basis of the group membership of its nearest neighbours (k-nearest neighbour method).
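
The leave-one-out procedure with a k-nearest neighbour classifier can be sketched as below. The function names are hypothetical. One deliberate simplification is made: ties between classes are resolved in favour of the tied class whose member lies nearest to the sample, so that the sketch is reproducible, whereas Section 3.6.2 describes a random choice.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=1):
    """Assign `query` to the class most represented among its k nearest neighbours."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    nearest = order[:k]
    votes = Counter(train_labels[i] for i in nearest)
    top = max(votes.values())
    tied = {label for label, count in votes.items() if count == top}
    # tie-break (assumption): take the tied class with the closest neighbour
    for i in nearest:
        if train_labels[i] in tied:
            return train_labels[i]

def loo_accuracy(points, labels, k=1):
    """Leave each sample out once and classify it from the remaining samples."""
    hits = 0
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest_points, rest_labels, points[i], k) == labels[i]
    return hits / len(points)

def accuracy_by_n_components(scores, labels, max_components, k=1):
    """LOO accuracy when only the first n score columns are used, n = 1..max."""
    return {n: loo_accuracy([list(row[:n]) for row in scores], labels, k)
            for n in range(1, max_components + 1)}
```

The number of components would then be chosen by inspecting the dictionary returned by `accuracy_by_n_components` for the training-set scores and keeping the smallest number of components that gives the best classification rate.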

3.5. Clustering methods

Clustering was performed on the PC1 and PC2 scores of the whole data set. Because clustering is an unsupervised method for grouping samples, the data were not divided into test and training sets.
3.5.1. Agglomerative hierarchical clustering (unweighted average linkage). In this method a similarity matrix is calculated, consisting of the Euclidean distances between the PC scores of all pairs of samples.1,2,28 Samples are then joined together on the basis of their similarity. After joining the two samples (or clusters) separated by the lowest distance to create a new cluster, the similarity matrix is recalculated. The distance to the new cluster is the average distance to all the samples it contains.
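
The merging procedure can be sketched as follows. This is a simple illustration with a hypothetical function name; for clarity it recomputes the average pairwise distances at every step rather than updating a stored matrix, which is adequate for around 100 samples but is not an efficient implementation.

```python
import math

def average_linkage(points, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    The distance between two clusters is the unweighted average of the
    Euclidean distances between all pairs of their members.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    merges = []  # record of (members of first, members of second, distance)
    while len(clusters) > n_clusters:
        pairs = [(cluster_distance(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        d, a, b = min(pairs)
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters], merges
```

The list of merges, with the distance at which each was made, contains the information needed to draw a dendrogram such as Fig. 6.
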
3.5.2. k-means clustering. k-means clustering (also called c-means clustering) is a non-hierarchical clustering method, which divides all samples into a defined number of disjoint clusters.2,29 The algorithm starts with randomly chosen samples as an initial guess for the cluster centres. Each sample is then assigned to the nearest cluster on the basis of its distance to the cluster centre. For each cluster a new centroid is calculated, and the distance of every sample to the new centroids is calculated again. The iterations reduce the distance of each sample to its cluster centre. The algorithm stops when a further iteration no longer changes the centroids.
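
A minimal sketch of the iteration described above is given below. The function name, the `seed` argument (added so that a run can be repeated) and the handling of an empty cluster (its centre is simply kept) are assumptions of this sketch and are not taken from the toolbox used in the study.

```python
import math
import random

def k_means(points, k, seed=0, max_iter=100):
    """Return (labels, centres) after iterating until the centroids stop moving."""
    rng = random.Random(seed)
    # initial guess: k randomly chosen samples
    centres = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # assign every sample to its nearest centre
        labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                  for p in points]
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres
```

Running the function with different values of `seed` shows how the random starting centres can lead to different final partitions.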

3.6. Supervised classification methods

Supervised methods for classification can be validated by dividing the data into training and test sets.
3.6.1. Classification using distance measures. Euclidean and Mahalanobis distances to the four polymer groups were calculated using scores of the significant PCs of the training set. Note that because PCA was performed on the entire training set, whereas the Mahalanobis distance measures were computed separately for each class, the scores matrices for the individual classes were not orthogonal. Centroids of the four polymer groups were calculated using only the training set. All the samples from the training set were assigned to one of the four polymers on the basis of their distance from the calculated centroids. Samples from the test set were then classified on the basis of their distance to the centroids of the training set.

The generalized distance DiA of polymer sample i to the centre of the cluster containing samples of group A is calculated as:1,2

$$D_{iA} = \sqrt{(t_i - \bar{t}_A)\, W\, (t_i - \bar{t}_A)'} \qquad (3)$$

where ti and t̄A are row vectors of scores on the significant PCs for the ith polymer sample and for the centre of the cluster of polymer A, respectively, and W is a weighting matrix: for Euclidean distance W = I, and for Mahalanobis distance21 W = CA−1, where CA is the variance–covariance matrix for group A.

An unknown polymer was classified into the group whose distance to the relevant centroid was least. The models were set up using the training set and then validated using the test set.

Scores of the first two significant PCs were used for the classification.
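
A sketch of classification by class distance for two-component scores is given below. The function names and the data in the usage note are hypothetical; the restriction to two dimensions is made only so that the variance–covariance matrix can be inverted in closed form, and the sample covariance (divisor n − 1) is an assumption of the sketch.

```python
import math

def centroid(rows):
    """Mean score vector of a class."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def covariance_2d(rows):
    """2 x 2 sample variance-covariance matrix of (PC1, PC2) scores."""
    n = len(rows)
    mx, my = centroid(rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def generalised_distance(t, centre, W):
    """sqrt((t - centre) W (t - centre)') for two-element vectors."""
    d0, d1 = t[0] - centre[0], t[1] - centre[1]
    q = d0 * (W[0][0] * d0 + W[0][1] * d1) + d1 * (W[1][0] * d0 + W[1][1] * d1)
    return math.sqrt(q)

def classify(t, training, metric="euclidean"):
    """training maps a class name to its list of (PC1, PC2) training scores."""
    best_name, best_dist = None, None
    for name, rows in training.items():
        if metric == "euclidean":
            W = [[1.0, 0.0], [0.0, 1.0]]              # W = I
        else:
            W = inverse_2x2(covariance_2d(rows))      # W = inverse class covariance
        dist = generalised_distance(t, centroid(rows), W)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

With a compact class centred at the origin and a widely dispersed class centred at (10, 0), a sample at (3, 0) is assigned to the compact class by the Euclidean measure but to the dispersed class by the Mahalanobis measure, which illustrates how the two measures can disagree when the classes have different spreads.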

3.6.2. k-nearest neighbours (k-NN). The k-NN method assigns an unknown object to the class most heavily represented among its k nearest neighbours.1,2 When the same number of neighbours belong to different clusters, the sample is assigned randomly to one of the two nearest clusters. This procedure is performed for both the training and test sets. For the training set, the aim was to determine the optimum number of PCs required for the model. For the test set, all the samples were classified using nearest neighbours from the training set and the PC model obtained from the training set.
3.6.3. Linear discriminant analysis. Linear discriminant analysis (LDA) is a method of projecting the objects on to a line perpendicular to the line discriminating between two classes,2 according to the following equation:1

$$f_{AB} = (\bar{x}_A - \bar{x}_B)\, C_{AB}^{-1}\, x' - 0.5\,(\bar{x}_A - \bar{x}_B)\, C_{AB}^{-1}\, (\bar{x}_A + \bar{x}_B)' \qquad (4)$$

where x is a vector of measurements for any given polymer sample, x̄A is the centroid of scores from the significant components for all the samples from class A (and x̄B likewise for class B), and CAB is the pooled variance–covariance matrix. Objects corresponding to values of fAB greater than zero belong to polymer group A, whereas objects with values less than zero belong to group B.
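
Eqn (4) can be sketched for two-component scores as follows. The function names are hypothetical, and the pooled variance–covariance matrix is assumed here to be the sum of the two within-class scatter matrices divided by nA + nB − 2, a common definition that is not spelled out in the text.

```python
def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def scatter_2d(rows, mean):
    """Sum of squares and cross-products about the class mean (2 x 2)."""
    sxx = sum((x - mean[0]) ** 2 for x, _ in rows)
    syy = sum((y - mean[1]) ** 2 for _, y in rows)
    sxy = sum((x - mean[0]) * (y - mean[1]) for x, y in rows)
    return [[sxx, sxy], [sxy, syy]]

def pooled_covariance(class_a, class_b):
    """Pooled covariance (assumed definition, see the lead-in)."""
    sa = scatter_2d(class_a, mean_vector(class_a))
    sb = scatter_2d(class_b, mean_vector(class_b))
    dof = len(class_a) + len(class_b) - 2
    return [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]

def lda_function(class_a, class_b):
    """Return f(x) of eqn (4); f(x) > 0 indicates class A, f(x) < 0 class B."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    (a, b), (c, d) = pooled_covariance(class_a, class_b)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # row vector (mean_A - mean_B) multiplied by the inverse pooled covariance
    w = [diff[0] * inv[0][0] + diff[1] * inv[1][0],
         diff[0] * inv[0][1] + diff[1] * inv[1][1]]
    offset = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return lambda x: w[0] * x[0] + w[1] * x[1] - offset
```

A function built from the training scores of two classes can then be evaluated for any test sample; a two-step scheme would apply one such function to separate amorphous from semi-crystalline polymers and a second one within each of these groups.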

The whole procedure was performed in 2 steps: first classifying a polymer as amorphous or semi-crystalline, and then dividing these 2 groups into PS or ABS, or into PP or LDPE, respectively. Two separate classifications were performed: the training set classification and the test set classification. In the test set classification all the samples from the test set were classified using weights calculated from the training set.

4. Results and discussion

4.1. Visualization using principal components analysis

The scores plot (Fig. 4) shows that the 4 groups of polymers appear well distinguished. All four groups can be divided into subgroups, which are polymers of different grades. Differences between polymer grades for amorphous polymers are quite high and are illustrated in Fig. 3. The PP polymer, which produces a highly clustered group to the right, is very well distinguished from the rest of the polymers. In addition the LDPE polymer (circles in Fig. 2) can be easily distinguished. The two amorphous polymers, PS and ABS, each consist of three different grades (see Fig. 3). Some problems in the classification may appear if the difference between two grades of different amorphous polymers is less than the difference between two grades of the same polymer. This situation is reflected in the scores plot (Fig. 4). Although one subgroup of the PS polymer is very close to the ABS polymer group, these two groups of polymers are not mixed, and supervised methods such as k-NN and LDA are expected to work well in this situation.
Fig. 4 Scores plot, whole data set, PC2 against PC1, polymers: PP (□), LDPE (○), PS (◇), ABS (△). Some of the samples which are described in the article text are numbered in the plot.

The loadings plot (Fig. 5) should be compared to the tan δ plots (Fig. 2 and Fig. 3). The first PC is characterised mainly by the region between 100 °C and 120 °C. This PC clearly distinguishes LDPE from the rest of the polymers, and PP from ABS. It also characterises the division of two very similar polymers, PS and ABS, which is reflected in the scores plot (Fig. 4). The second PC has characteristic intensity mainly in the regions around 100 °C and 120 °C. These temperatures are the most important for characterising the division of LDPE and ABS. The loadings for the third PC are high in the regions around 90 °C and 110 °C, and low in between. This PC helps to distinguish between LDPE and ABS and also highlights a small distance between PP and LDPE.


Fig. 5 Loadings plot for three first principal components.

4.2. Outliers

In the scores plot the polymer groups do not form dense clusters, but rather aggregations of smaller clusters. Visual investigation of the scores plot indicates that sample no. 55 clearly is an outlier (see Fig. 4).

Results of preliminary classification as described below using the k-NN method are very good in the presence of outliers. This implies that these risky samples do not influence the result and probably there is no need to remove them. However, results of classification using distance measures are made worse in the presence of outliers, as these samples have significant influence on the centroids of the clusters.

After removal of the outlier, PCA is performed once again on the data.

4.3. Determination of the number of significant components

The cumulative variance accounted for by the first two eigenvalues is 98.1%, by 3 eigenvalues 99.6% and by 4 eigenvalues 99.9%. Inspection of the cumulative eigenvalues suggests that two PCs are sufficient for describing the data adequately. The results of preliminary classification show 100% correct classification for every number of nearest neighbours (k = 1, 2, 3, 4, 5) using scores for the first 2 PCs. When one more component is included, samples 68, 71 and 98 are incorrectly classified. Thus the optimal number of components for polymer classification was set to 2. It is not the primary purpose of this paper to compare methods for determining the number of significant components, which are discussed in depth elsewhere.7,19

4.4. Clustering techniques

Unsupervised clustering methods which are based on density, such as k-means and agglomerative hierarchical clustering (average linkage), are not able to correctly separate the 4 polymer groups in our data set, because polymer groups from different suppliers do not always form regular, dense clusters. The result of average linkage clustering shown in Fig. 6 demonstrates this. Two of the PS grades and two of the ABS grades appear to cluster together as the same polymer, whereas the third PS grade clusters separately, as do the rest of the ABS samples. The fourth cluster contains all the samples of the PP and LDPE polymers.
Fig. 6 Dendrogram for average linkage method.

Although the random initial choice of centroids used in the k-means method gives different results for each repetition of the procedure, a general pattern can be found. Often one ABS grade clusters together with the PS polymers (apart from one sample). Furthermore, two other incorrect situations can occur: either the LDPE and PP polymers appear in the same cluster and one PS grade forms a separate cluster, or the same PS grade clusters together with the LDPE polymer. The last situation is surprising, as these two polymers lie quite far apart.

4.5. Classification using supervised techniques

Semi-crystalline polymers are classified perfectly using Euclidean as well as Mahalanobis distance. However, the division of amorphous polymers is not perfect (summarised in Table 2).

Table 2 Number and percentage of polymers classified correctly using different classification methods
Method                          Training set    Test set
Euclidean distance              74/75 (99%)     24/24 (100%)
Mahalanobis distance            74/75 (99%)     23/24 (96%)
k-NN (k = 1)                    75/75 (100%)    24/24 (100%)
Linear discriminant analysis    75/75 (100%)    24/24 (100%)


One sample from the training set (no. 68, from the PS group) was incorrectly classified as ABS using the Euclidean distance measure. The classification is 99% correct. Surprisingly, a better result was obtained with the test set, as no sample was misclassified (see Table 2), although sample 74 was very close to being misclassified.

This result may be improved after removal of sample 54, which also may be considered an outlier. This sample has considerable influence on the PS centroid (see Fig. 4), which makes the distance between the cluster centroid and sample 68 higher.

Using the Mahalanobis distance measure, one sample of the ABS polymer (no. 98) is closer to the PS centroid than to the ABS centroid. This result is a consequence of the characteristic of the Mahalanobis distance, which takes into account the data distribution, unlike the Euclidean distance.2 The mean distance of the PS polymer samples from the centre of their cluster is quite high. Because the Mahalanobis distance accounts for the dispersion of each group, samples which are far from the PS centre in Euclidean space can have a relatively low Mahalanobis distance from this class. The result is that samples belonging to other classes (for example sample 98) are incorporated into the PS polymer class.

This result is surprising, because one expects sample 68, which is furthest to the right of all the PS samples in the plot of Fig. 4, and which is misclassified using Euclidean distance, to be misclassified also using the Mahalanobis distance. But the large distance of this sample from the cluster centre is corrected for by the high variance of both PC1 and PC2 in the PS class which results in correct classification.

The classification rate was similar for the test set: one sample was misclassified, namely sample 99, which is the closest to sample 98, the sample misclassified in the training set.

The k-NN method gives the best result for k = 1 if clusters are disjoint;22 thus we use this number of nearest neighbours. Using this method, both the training set and the test set are correctly classified in all cases.

For LDA 100% are correctly classified for both the training set and the test set (see Table 2).

5. Conclusions

A new method of polymer classification using DMA has been proposed. The method is based on the chemometric analysis of the damping factor (tan δ) as a function of temperature. It appears that this new type of data benefits from a new approach for data analysis. Conventionally univariate methods have been applied for the analysis of thermal analysis data and this paper is a first attempt to employ a multivariate chemometrics approach. PCA was performed: 2 components account for 98.1% of data variance and the same 2 components are enough to reach 100% correct classification.

Characteristics of polymer grades belonging to the same polymer differ. As a result of these differences, clustering methods based on the assumption that one polymer group forms only one cluster, such as k-means, are not able to find the correct polymer clusters. Among the supervised classification methods, LDA and k-NN classify all samples correctly in our case, and class distances using Euclidean or Mahalanobis measures misclassify at most one sample in each of the training and test sets (see Table 2).

The combination of thermal analysis and chemometrics appears to have significant promise for the characterisation of materials.

Acknowledgements

We thank the EPSRC KTP scheme for finance (grant 4413).

References

  1. R. G. Brereton, Chemometrics: Data Analysis for the Laboratory and Chemical Plant, John Wiley & Sons, Chichester, 2003.
  2. B. G. M. Vandeginste, D. L. Massart, L. M. C. Buydens, S. De Jong, P. J. Lewi and J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics, Part B, Elsevier, Amsterdam, 1998.
  3. D. L. Massart and L. Kaufman, Interpretation of Analytical Chemical Data by the Use of Cluster Analysis (Chemical Analysis), John Wiley & Sons, New York, 1983.
  4. S. Wold, K. Esbensen and P. Geladi, Chemom. Intell. Lab. Syst., 1987, 2, 37–52.
  5. I. T. Jolliffe, Principal Components Analysis, Springer, New York, 1987.
  6. K. V. Mardia, J. T. Kent and J. M. Bibby, Multivariate Analysis, Academic Press, London, 1979.
  7. R. G. Brereton, Multivariate Pattern Recognition in Chemometrics, Elsevier, Amsterdam, 1992.
  8. H. Haario and V. Taavitsainen, Chemom. Intell. Lab. Syst., 1998, 44, 77–98.
  9. L. Coma, M. Breitman and S. Ruiz-Moreno, J. Cult. Heritage, 2000, 1, S273–S276.
  10. J. C. Duncan, in Mechanical Properties and Testing of Polymers, ed. G. M. Swallowe, Kluwer Academic Publishers, Dordrecht, 1999, ch. 12, pp. 43–48.
  11. N. G. McCrum, B. E. Read and G. Williams, Anelastic and Dielectric Effects in Polymeric Solids, John Wiley & Sons, New York, 1967, ch. 8–14, pp. 238–574.
  12. B. E. Read, G. D. Dean and J. C. Duncan, in Physical Methods of Chemistry Volume 7, ed. B. W. Rossiter and R. G. Baetzold, John Wiley & Sons, Chichester, 1991, ch. 1, pp. 1–70.
  13. J. J. Aklonis and W. J. MacKnight, Introduction to Polymer Viscoelasticity, John Wiley & Sons, New York, 1983, pp. 73–82.
  14. B. E. Read and G. D. Dean, The Determination of Dynamic Properties of Polymers and Composites, Adam Hilger, Bristol, 1978, pp. 27–44.
  15. R. H. Boyd, Polymer, 1985, 26, 323–347.
  16. R. H. Boyd, Polymer, 1985, 26, 1123–1133.
  17. R. E. Wetton, in Polymer Characterization, ed. B. J. Hunt and M. I. James, Blackie Academic & Professional, 1993, ch. 7, pp. 178–220.
  18. I. M. Ward, Mechanical Properties of Solid Polymers, John Wiley & Sons, New York, 2nd edn, 1983, pp. 8–14.
  19. M. Wasim and R. G. Brereton, Chemom. Intell. Lab. Syst., 2004, 72, 133–151.
  20. P. Geladi, M. Manley and T. Lestander, J. Chemom., 2003, 17, 503–511.
  21. R. De Maesschalck, D. Jouan-Rimbaud and D. L. Massart, Chemom. Intell. Lab. Syst., 2000, 50, 1–18.
  22. M. O'Farrell, E. Lewis, C. Flanagan, W. Lyons and N. Jackman, Sens. Actuators, B: Chem., 2005, 107, 104–112.
  23. J. N. Miller and J. C. Miller, Statistics and Chemometrics for Analytical Chemistry, Pearson Education Limited, Essex, 4th edn, 2000, pp. 54–57.
  24. J. A. F. Pierna, F. Wahl, O. E. de Noord and D. L. Massart, Chemom. Intell. Lab. Syst., 2002, 63, 27–39.
  25. N. Glick, Pattern Recognit., 1978, 10, 211–222.
  26. P. A. Lachenbruch, Biometrics, 1967, 23, 639–645.
  27. K. Komuro, M. Tada, E. Tamoto, A. Kawakami, A. Matsunaga, K. Teramoto, G. Shindoh, M. Takada, K. Murakawa, M. Kanai, N. Kobayashi, Y. Fujiwara, N. Nishimura, J. Hamada, A. Ishizu, H. Ikeda, S. Kondo, H. Katoh, T. Moriuchi and T. Yoshiki, J. Surg. Res., 2005, 124, 216–224.
  28. S. Hirano, X. Sun and S. Tsumoto, Inf. Sci., 2004, 159, 155–165.
  29. S. S. Khan and A. Ahmad, Pattern Recognit. Lett., 2004, 25, 1293–1302.

This journal is © The Royal Society of Chemistry 2006