A study into the ageing and dating of blue ball tip inks on paper using in situ visible spectroscopy with chemometrics

This paper presents a study into the potential of visible spectroscopy with chemometrics as an approach to dating blue ball tip inks on paper documents. Analysis of six inks left under various conditions found that the majority of those kept in the dark could still be matched to the source pen after 32 months of ageing. Conversely, the majority of those exposed to light exhibited rapid spectral changes that continued throughout the 32 month period. Partial least squares regression (PLSR) was used to generate dating models for inks aged with exposure to light. Evaluation using an external test set found absolute dating to be challenging for these ink deposits within the first 2–6 months of ageing. However, predictive accuracy was found to improve for long-term ageing, with two-year old samples yielding age estimates with a maximum error of 6 months. This rapid, non-destructive methodology could assist document examiners in the relative ageing or approximate age determination of questioned documents, as well as the identification of document alterations.


Introduction
Civil and criminal cases involving fraud or forgery remain a significant concern to society. The Western Australia Police identified fraud as the largest and most expensive category of federal offences from Australian courts in 2015, despite only an estimated 25% of cases being reported. 1 Such offences may involve the falsification or alteration of handwritten documents, which remain prevalent in financial, legal or identity matters. 2 Ink analysis can play an important role in the forensic investigation of such cases. Modern writing inks are complex mixtures consisting of multiple dyes or pigments suspended in a solvent, along with various additives designed to regulate properties such as viscosity or adhesion. [3][4][5] The most common ink formulations in use today are those associated with ball tip pens, namely ballpoint, rollerball and gel inks.
Contemporary ballpoint inks usually consist of synthetic dyes dissolved in a glycol- or benzyl alcohol-based solvent. 6,7 This results in a high viscosity formulation that is fast-drying, water resistant and provides a controlled ink flow. In contrast, rollerball inks are typically water or xylene based, the latter being water resistant. 8 These inks are attractive due to their smoother flow and rich colour saturation compared to ballpoint inks, although the lower viscosity makes them more susceptible to smudging or bleeding. Gel inks combine characteristics of both ballpoint and rollerball inks by using a water-based gel rather than a liquid solvent. 5 This results in a thixotropic ink that is semi-solid until liquefied by the movement of the ball tip. Gel inks are usually pigment-based to increase colour permanence and opacity, although newer gel formulations may be dye-based. 9,10 More recently, 'low viscosity' gel inks have also been developed as a hybrid between ballpoint and typical gel inks.
The complexity and variety of modern ink formulations, whilst challenging for analysis, potentially allow for discrimination based on their distinctive compositions. The chemical and physical examination of ink components can therefore provide valuable information regarding document authenticity. These analyses can be used to determine the source of an unknown ink, to identify additions or alterations made to handwritten ink entries, or to estimate the date at which an entry was produced.
Ink dating in particular is a highly complex challenge in questioned document examination. Once deposited on a substrate, inks can undergo rapid compositional changes due to factors such as solvent evaporation, chemical interactions with the substrate, or dye degradation. [11][12][13] The exact rates at which these processes occur vary according to several factors, including the initial ink composition, paper properties and the storage environment. [14][15][16] Several studies have attempted to utilise the changes observed in ink deposits over time to establish a reliable means of ink dating, as reviewed by Ezcurra et al. 7 These studies can be broadly divided into three approaches: those based on the evolution of resins, [17][18][19] loss of volatile components, 15,20,21 or variations in the dyes. 12,22,23 The latter is of specific interest as many inks contain unique combinations of dyes and pigments, making colour a highly distinguishing feature.
More recent studies have investigated the practicality of ink dating using various approaches to chemical analysis in combination with chemometrics. Senior et al. used principal component analysis (PCA) with ultraviolet-visible (UV-vis) spectroscopy and high performance thin layer chromatography to study the ageing of ten blue ballpoint inks up to 18 months after deposition on paper. 24 The aged deposits were found to produce significantly different PCA scores, which could be correlated with time using multiple linear correlation analysis. The resulting equations were used to estimate the age of three samples aged for four months, with two samples yielding predictions accurate to within ten days.
Sharma and Kumar later demonstrated the use of UV-vis with multiple linear regression (MLR) to model the degradation of blue ballpoint inks aged up to approximately nine months. 25 The final model was able to estimate the age of tested ink entries to within 14 days. Díaz-Santana et al. similarly employed gas chromatography-mass spectrometry and high performance liquid chromatography in conjunction with MLR, modelling the concentration changes in solvents or dyes from blue and black ballpoint inks over 45 months. The resulting models provided dating estimates with a maximum error of 4–7 months. 26

Although promising, it should be noted that the methodologies described above required extraction of the ink into a solvent, rather than analysis in situ on the paper substrate. Given the legal or financial nature of documents that may be submitted for examination, non-destructive testing is vital in order to preserve the document's integrity. Additionally, these previous studies utilised the same ink deposits for both the construction of the models and their validation. This fails to consider possible heterogeneity in the ink composition or variations in the ageing process between ink deposits and may therefore result in an over-estimation of the model's capabilities.
Most recently, Ortiz-Herrero et al. presented a non-invasive methodology for ink dating based on UV-vis-NIR reflectance spectra. 27 Forty-eight samples of a commercial ink were artificially aged for up to 11 days, and the resulting data used to construct regression models using partial least squares analysis. These models were tested using a selection of five inks aged under the same conditions for up to nine days, achieving relative standard deviations as low as 16%. Additional samples of these inks were also exposed to natural ageing for up to 45 days in order to establish a correlation between natural and artificial ageing.
This approach addresses many of the limitations associated with previous studies as described above. However, as noted by the authors, artificial ageing does not necessarily provide a complete representation of ink ageing processes under natural conditions. Additionally, as the regression model was constructed using a single brand of commercial ink, predictions were found to be inaccurate for inks of substantially different composition. The authors thus acknowledge the need to develop separate predictive models for each class of ink in order for this method to be more broadly applicable.
In previous research by the authors Sauzier et al., diffuse reflectance visible spectroscopy was utilised with chemometrics to examine the classification and chemical changes upon ageing of blue ball tip inks on paper. 28 As highlighted by Ortiz-Herrero et al., 27 diffuse reflectance spectroscopy is highly advantageous for this purpose as it is rapid, requires minimal sample preparation and is non-destructive when applied in situ. A discriminant model was constructed from a sample set of 35 inks using PCA and linear discriminant analysis (LDA), then evaluated using an external validation set of 12 inks. The resultant model was able to assign 71.7% of spectra to the correct pen model, and a further 16.7% to the correct supplier.
Six of these inks were then selected for ageing under various office conditions over a two-month period. It was found that ink deposits stored in a dark environment (either open to air or in polypropylene sleeves) remained chemically stable, whereas those exposed to light could undergo significant spectral changes in as little as one week. It was proposed that these changes were due to the photodegradation of triarylmethane dyes in the ink, a process further accelerated by the presence of titanium dioxide in the paper substrate. 29,30 However, this study was focussed solely on how ageing could affect the classification of inks using the chemometric model, rather than establishing a means of age determination. Additionally, this work was limited to the ageing of ink deposits over two months. As questioned documents encountered in forensic casework may be several months or years old, it is of relevance to investigate what additional changes occur over a long-term ageing period.
This study expands upon the previous work by examining the continued natural ageing of blue ball tip ink deposits from 2 to 32 months, and how this affects their classification using the established predictive model. Partial least squares regression (PLSR) was subsequently used to establish individualised dating curves for each of the light-exposed ink deposits, and these models were evaluated using a separate validation set aged for up to 24 months.

Sample collection and preparation
In previous work by the authors, 35 blue ball tip pens were obtained from existing stationery supplies and Western Australian retailers (Table S1†). 28 The sample set was intended to consist solely of ballpoint inks, as ballpoint pens are generally considered to be the most common writing implement in current use. 31 However, during the course of this study it was found that one ink, originating from a Pilot G-2 05 retractable pen, was in fact a gel formulation. It was decided to retain this ink in the sample set in order to compare the results obtained using different ink types. Six of the above pens, including the Pilot G-2 05 gel pen, were chosen for ageing purposes (Table S1†). Selection was made of inks that were clearly discriminated in the initial PCA model, to ensure that any changes in classification could be attributed to ageing rather than overlap between similar ink formulations. Three sets of samples were prepared for each ink by depositing 25 mm × 25 mm squares of parallel lines on white copy paper (Fuji Xerox Professional Carbon Neutral, 80 g m⁻²), as shown in Fig. S1.† The deposits were then left to age in three locations representative of typical office environments as per below, with spectra previously recorded at various intervals up until two months. 28 The present manuscript is a continuation of that study utilising the same ink deposits, with additional spectra recorded at 4, 6, 8, 10, 13, 14, 19, 24, 28 and 32 months after ink deposition.
(i) On an office shelf, exposed to ambient light (on a diurnal cycle) and air.
(ii) In an office drawer away from light but exposed to air.
(iii) In an office drawer away from light and stored in transparent polypropylene archive sleeves (Ditto A4 reinforced sheet protectors).
A further set of 25 mm × 25 mm deposits was prepared from the six ageing pens (above) as an external validation set for ink dating purposes. These deposits were stored on an office shelf exposed to light and air, with spectra obtained at various intervals following deposition: 6 days; 3, 6, 14 and 21 weeks; and 18, 21, 22 and 24 months. All experiments were carried out in a building with controlled air-conditioning. Data collected using a Digitech QP-6013 data logger found that the temperature over this period remained reasonably constant at 24 ± 1 °C, whilst relative humidity varied between 20% and 70%.

Visible spectroscopy
Ink deposits were analysed by diffuse reflectance visible spectroscopy in the same manner described in previous work. 28 Five replicate spectra were obtained for each sample using a Cary 4000 UV-Visible spectrophotometer with a DRA-900 internal diffuse reflectance accessory (Fig. S1†). Baseline scans were taken using an empty sample holder and mounted halon reference plate prior to sample measurement. The instrument was operated in double beam mode with reduced slit height, and data acquisition carried out using Cary WinUV Bio v. 4.20. All spectra were recorded over the 400–700 nm range, with a scan interval of 1 nm and scan speed of 600 nm min⁻¹.

Data analysis
All data pre-processing and analysis was performed using the Unscrambler® X 10.4 software (Camo Software AS, Oslo, Norway). Previous work investigated the use of the Kubelka-Munk (K-M) conversion, 28 as this is routinely applied to reflectance spectra in order to facilitate quantitative analysis. 32,33 However, it was found that the K-M function did not improve overall discrimination, and so this function was omitted in the present study. Spectra were thus baseline offset to 0% reflectance to account for background scattering effects, then unit vector normalised to remove variation caused by the sample surface texture.
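The two pre-processing steps can be sketched as follows. This is a minimal illustration rather than the Unscrambler implementation: it assumes that the baseline offset subtracts each spectrum's minimum value and that unit vector normalisation divides by the Euclidean norm. The function names and the four-point spectrum are hypothetical.

```python
import math


def baseline_offset(spectrum):
    """Shift a spectrum so that its lowest reflectance value sits at zero."""
    lowest = min(spectrum)
    return [value - lowest for value in spectrum]


def unit_vector_normalise(spectrum):
    """Scale a spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(value * value for value in spectrum))
    if norm == 0.0:
        raise ValueError("cannot normalise an all-zero spectrum")
    return [value / norm for value in spectrum]


def preprocess(spectrum):
    """Apply the baseline offset first, then the unit vector normalisation."""
    return unit_vector_normalise(baseline_offset(spectrum))


# Hypothetical %R values at four wavelengths (not data from this study).
raw_spectrum = [12.0, 15.0, 16.0, 12.0]
processed = preprocess(raw_spectrum)
```

After both steps every spectrum has a minimum of zero and unit length, so differences between spectra reflect band shape rather than overall reflectance level.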
Selected spectra from the aged deposits were projected onto the previously established PCA training model, derived from 175 spectra from 35 inks. The resulting scores plots were used to visualise their distribution compared to fresh ink deposits. The corresponding LDA model, employing the first four PCs (accounting for 98.0% of variation in the dataset) and Mahalanobis distance measure, was then used to predict the source pen model for each of the aged sample spectra. These predictions were compared to the expected classifications to assess any ongoing changes in the deposited inks over time.
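In general terms, the projection and classification steps can be written as below. The notation is introduced here for illustration only and is an assumption about the usual formulation, not a description of the software's internals: x is a pre-processed aged spectrum (row vector), x̄ the mean spectrum of the training set, P_4 the loadings of the first four PCs, μ_k the mean score vector of pen model k, and S a pooled within-class covariance matrix of the training scores.

```latex
\begin{align}
\mathbf{t} &= (\mathbf{x} - \bar{\mathbf{x}})\,\mathbf{P}_4, \\
d_k^2 &= (\mathbf{t} - \boldsymbol{\mu}_k)\,\mathbf{S}^{-1}\,(\mathbf{t} - \boldsymbol{\mu}_k)^{\mathsf{T}}, \\
\hat{k} &= \arg\min_k \; d_k^2 .
\end{align}
```

Under this formulation an aged spectrum is assigned to the pen model whose class centre lies nearest in Mahalanobis distance, so a drift in scores changes the classification only once it carries the spectrum closer to another class centre.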
Separate dating models were constructed for each of the light-exposed inks using the spectra collected over the 32 month period. All spectra were mean-centred and subjected to PLSR using the non-linear iterative partial least squares (NIPALS) algorithm. The resulting regression models were used to predict the age of the validation deposits (exposed for up to 24 months), with the actual and predicted ages compared to evaluate the efficacy of the model.
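A compact sketch of single-response PLS regression with NIPALS-style deflation is given below to make the workflow concrete. It is a dependency-free illustration under stated assumptions, not the Unscrambler routine: the function names are hypothetical, and the toy spectra are synthetic (generated to change linearly with age) rather than measured.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(spectra, ages, n_factors):
    """Fit a mean-centred PLS1 model; returns (x_mean, y_mean, factors)."""
    n = len(spectra)
    x_mean = [sum(col) / n for col in zip(*spectra)]
    y_mean = sum(ages) / n
    X = [[v - m for v, m in zip(row, x_mean)] for row in spectra]
    y = [a - y_mean for a in ages]
    factors = []
    for _ in range(n_factors):
        # Weights: wavelength direction with maximal covariance with age.
        w = [dot(col, y) for col in zip(*X)]
        w_norm = dot(w, w) ** 0.5
        if w_norm < 1e-12:
            break  # no age-related variation left to model
        w = [v / w_norm for v in w]
        t = [dot(row, w) for row in X]             # sample scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in zip(*X)]  # spectral loadings
        q = dot(y, t) / tt                         # age loading
        # Deflate: remove the modelled part before the next factor.
        X = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def predict_age(model, spectrum):
    """Predict age by applying the stored factors to a new spectrum."""
    x_mean, y_mean, factors = model
    x = [v - m for v, m in zip(spectrum, x_mean)]
    age = y_mean
    for w, p, q in factors:
        t = dot(x, w)
        age += q * t
        x = [v - t * pj for v, pj in zip(x, p)]
    return age


# Synthetic example: three wavelengths that change linearly with age (months).
base = [0.50, 0.40, 0.30]
change_per_month = [0.01, 0.00, -0.01]
training_ages = [0.0, 6.0, 12.0, 24.0]
training_spectra = [
    [b + a * c for b, c in zip(base, change_per_month)] for a in training_ages
]
model = fit_pls1(training_spectra, training_ages, n_factors=1)
unknown = [b + 18.0 * c for b, c in zip(base, change_per_month)]
predicted_age = predict_age(model, unknown)
```

Because the synthetic spectra are noise-free and exactly linear in age, one factor recovers the age of the unseen spectrum. Real deposits carry replicate-to-replicate variation, which is why validation against separately prepared deposits matters.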

Characterisation of aged samples
Deposits of six inks were left to age in various office environments, with analysis at intervals ranging from 1 day to 32 months after deposition. Previous work found that although inks exposed to light underwent spectral changes attributed to dye fading in as little as one week, those stored in darkness (either open to air or in polypropylene sleeves) remained chemically stable up until at least two months of ageing. 28 Over the extended 32 month period, deposits stored in polypropylene sleeves away from light continued to show little change, with the LDA model correctly predicting the source pen for the majority of these spectra. These findings are consistent with previous work by Andrasko, who found that ballpoint inks underwent slower compositional changes when stored in the dark. 34 Similarly, it can be expected that ageing in gel ink formulations would occur at a slower rate when deposits are stored away from light, thus minimising photodegradation.
It was noted that individual spectra for certain deposits were misclassified at certain intervals, as seen in Table 1. However, this was largely limited to single replicates, and the misclassifications did not remain consistent across the ageing period. The overall classification of the inks across the ageing period thus remained unchanged, with the exception of the pen 11 (Office Basics) ink, which yielded three misclassified spectra at 13 months. This caused the ink to be wrongly assigned as belonging to a J.Burrows ballpoint pen. The following month, the ink was again correctly assigned with only one misclassified replicate. The previous result is hence likely to have been caused by instrumental variation such as lamp power fluctuations or drift, rather than an actual compositional change in the ink.
Inks stored open to air but protected from light remained similarly unchanged for the first 25 months, with any misclassifications limited to single replicates (Table 2). However, from 28 months of exposure, the Celco Retractable ink was misclassified as originating from a J.Burrows ballpoint pen. This indicates that despite the lack of exposure to light, the dyes in this ink are still undergoing a gradual degradation process, likely due to oxidation. Aginsky found that even in the absence of light, triarylmethane dyes can undergo an oxidative process with atmospheric oxygen to form diphenylmethane derivatives and phenol. 35 This theory is supported by the fact that these changes were not observed for the deposits stored in polypropylene sleeves with limited air exposure.
The classication results were corroborated through projection of the aged ink deposits onto the original PCA model, which revealed the samples to remain largely clustered with their equivalent fresh deposits (Fig. 1). Exceptions were noted for single replicates of the Pentel Rolly ink stored in polypropylene at 32 months, and Office Basics ink kept open to air at 14 months. Visual inspection revealed these spectra to signicantly differ from the typical spectra of their respective deposits, and so these replicates were identied as outliers.
Interestingly, spectra from the Celco Retractable ink at 32 months showed little apparent shift from the fresh deposits, despite being misclassified by the LDA model. It should be noted that the remaining 29 inks in the population, though not shown in the projection plot, may lie close to inks subjected to ageing. As a result, even seemingly minor changes in sample distribution may produce incorrect classifications. However, these misclassified samples can be identified as atypical through inspection of their discriminant values, as discussed in previous research. 28 It can be seen from Fig. 1 that the Celco, PaperMate and Keji inks exposed to air displayed a greater spread compared to those in polypropylene sleeves, reinforcing the idea that these inks are vulnerable to oxidative dye degradation. Conversely, the Office Basics ink shows a greater shift when stored in polypropylene sleeves. In this instance, it is possible that the solvent trapped within the polypropylene sleeve is undergoing a chemical interaction with the dyes, as previously suggested by Grim et al. 14 In both cases however, the variation between spectra obtained at different ageing periods was no greater than the variation between spectra collected on the same date. As the overall classifications of these samples remain unchanged using LDA, it cannot be stated with certainty whether these variations are indicative of a current compositional change, or rather due to instrumental variation and inhomogeneity within the ink deposits.
Inks exposed to light exhibited varying misclassifications throughout the ageing period (Table 3). The Pentel Rolly remained correctly classified until 25 months of exposure, indicating that this formulation is relatively slow-ageing. This suggests that the Pentel Rolly ink formulation uses a colourant with greater photostability, such as copper phthalocyanine, in order to provide increased fade resistance. In contrast, the Pilot G-2 05 gel ink exhibited misclassifications within the first four months of ageing, suggesting that this formulation is dye- rather than pigment-based and thus vulnerable to photodegradation.
Notably, the classications of the inks change several times, indicating that the ageing process is still ongoing. This is evident in the PCA projection plot (Fig. 2), where the aged deposits continue to dri further from the baseline samples with time. The direction of this shi is consistent with the short-term ageing results previously reported, with samples attaining more positive scores along PC1 and negative scores along PC3. 28 Though compositional changes pose a challenge in the identication of an unknown ink formulation, they may facilitate the identication of altered documents, as discussed in initial research. 28 The recognition of handwritten document alterations may be difficult if changes have been made using the same pen as the original entry. However, this research demonstrates that handwritten ink entries completed using the same pen at different times may be readily differentiated if the document has been exposed to light.
Examination of the spectra revealed an increasing red reflectance and decreasing blue reflectance over time, consistent with preliminary findings that shifts along PC1 and PC3 are due to changes in these regions. 28 This spectral shift can also be related to the anticipated degradation pathway of the ink dyes. One of the main proposed mechanisms for triarylmethane dye degradation is oxidative N-demethylation, wherein methyl groups are successively substituted by hydrogen atoms. 36,37 The degree of methylation of triarylmethane dyes affects their shade, with greater methylation resulting in blue-violets and lower methylation leading to red-violets. 38 The red shift observed in the spectra of the aged inks is therefore consistent with dye degradation occurring via N-demethylation. Interestingly, the spectral changes and PCA shifts observed in the Pilot G-2 05 ink were comparable to those in the ballpoint deposits. These results again suggest a dye-based gel formulation, containing the same or structurally similar dyes as the ballpoint inks.

Table 1 Number of misclassified spectra for inks stored in polypropylene archive sleeves in the dark, at various intervals following ink deposition. Labels in brackets indicate assigned groups. Inks that did not yield any misclassified spectra throughout the ageing period have been omitted for clarity
Although the ageing process appears to still be ongoing, analysis of the spectra from a single ink deposit indicates that the rate of ageing is slowing. Fig. 3 shows the distribution of spectra obtained from the Celco Retractable ink at each analysis interval. It is evident that the shift in sample scores between intervals becomes smaller as the deposits age, with overlap observed for spectra obtained after 8–10 months. This follows the expectation that the rate of dye degradation would be most rapid in the immediate months following exposure to light, with the rate of change then decreasing over time. These findings are also consistent with literature suggestions that the degradation of triarylmethane dyes can be detected up to approximately 30 months of ageing. 37

Development of dating models for light-exposed samples
The results obtained from both PCA and LDA indicate that the ageing of ink deposits under ambient light is a dynamic process, with changes continuing to occur over at least a 32 month period. These changes could potentially be used to develop chemometric models for the age estimation of unknown ink entries. With this in mind, calibration curves were generated for the inks using partial least squares regression (PLSR). PLSR is a multivariate regression method that first reduces the original predictor set into a lesser number of orthogonal factors, in a similar fashion to PCA. 39,40 These factors are calculated in such a way that covariance between the predictor and response variables is maximised. 41,42 That is, greater weight is applied to wavelengths that display variation highly correlated with sample age, under the assumption that these will be the most accurate for predictive purposes.
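For a single response such as sample age, the covariance criterion is commonly expressed as below. The symbols are notation assumed here for illustration: X_a is the mean-centred (and, for later factors, deflated) matrix of spectra, y the mean-centred vector of ages, and w_a the weight vector of factor a.

```latex
\begin{align}
\mathbf{w}_a &= \arg\max_{\lVert \mathbf{w} \rVert = 1} \; \operatorname{cov}\!\left(\mathbf{X}_a \mathbf{w},\, \mathbf{y}\right)^2
\quad\Longrightarrow\quad
\mathbf{w}_a = \frac{\mathbf{X}_a^{\mathsf{T}} \mathbf{y}}{\lVert \mathbf{X}_a^{\mathsf{T}} \mathbf{y} \rVert}, \\
\mathbf{t}_a &= \mathbf{X}_a \mathbf{w}_a .
\end{align}
```

Each element of w_a is proportional to the covariance between one wavelength and age, which is the sense in which wavelengths varying strongly with age receive greater weight.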
Separate PLSR models were constructed for each of the light-exposed inks using the spectra acquired over the 32 month ageing period. For each model, a scree plot was used to select the optimal number of factors to retain for regression, as summarised in Table 4 below. Dispersion graphs of the actual and predicted ageing intervals for the calibration set are provided in the ESI (Fig. S2†). Excellent agreement was observed between the actual and estimated values, with correlation coefficients greater than 0.97 obtained for all models including the Pentel Rolly ink. Based on this high correlation, the resultant models would be expected to demonstrate good predictive capability in estimating the age of an unknown ink deposit.
The regression models were evaluated using a separate validation set of ink deposits aged for up to 24 months. As these deposits were prepared approximately seven months after the initial ageing samples, baseline spectra from each deposit were first predicted using the discriminant model, in order to determine whether any significant compositional changes in the inks had occurred whilst in storage. All spectra were correctly assigned, indicating the inks to have remained chemically stable within the pen cartridges during storage. These results are consistent with research by Andrasko and Kunicki, who found no evidence of dye degradation in ink entries produced from the same pen up to 6 years apart. 43 Grim et al. similarly found that dyes within the ink cartridges of pens stored for up to 20 years remained predominantly unchanged, although a small number of inks exhibited ageing at a greater rate than when deposited on paper. 44

Age estimations for the validation samples are summarised in Table 5. The predicted and actual ages for deposits within the first 2–6 months of ageing were substantially different, with some inks obtaining negative age estimations. Closer inspection of the data revealed a high level of variability between replicate spectra of any given deposit acquired on a single analysis date, resulting in poor predictive performance. This was evidenced by the high relative standard deviations (RSDs) associated with the dating estimates, which in several instances exceeded the actual age of the ink (Table S2†). It is likely that the rapid chemical changes occurring during this initial ageing period are too variable to be precisely modelled for dating purposes.
Interestingly, model performance was observed to improve with increasing sample age. In particular, 24-month-old deposits were predicted with an absolute error of 1–6 months. The RSDs associated with these predictions (Table S2†) were also noted to decrease with time. These results suggest that age estimation using this approach is more readily achieved based on subtle changes occurring during mid- to long-term ageing. Ortiz-Herrero et al. similarly observed that one ink analysed yielded lower RSDs over time, although other inks displayed no clear correlation between ageing period and predictive accuracy. 27

Closer inspection of the scores plots for each ink found that although there is a clear change in scores over time, the shift between spectra recorded at different time intervals is similar to that amongst spectra obtained on the same date. The similar magnitudes of intra-group and inter-group variance are likely to have reduced the discrimination between deposits of different ages. Additionally, the calibration and validation samples were separately prepared approximately seven months apart. It is thus possible that heterogeneity between deposits or uncontrolled environmental variations (e.g. ambient light intensity or humidity) may have caused the ageing process to occur at different rates.
A further gure of merit is the root mean square error of calibration (RMSEC), which measures the level of dispersion about the regression line. Table 6 shows the RMSEC values for each model to be in the range of 1-2 months, leading to a disproportionately high prediction error for inks within the rst six months of ageing.
Despite these limitations, the regression models demonstrate potential in establishing the approximate age of an unknown ink entry, which may be sufficient to prove or disprove a point of contention in court. Additionally, these results indicate that the models can distinguish relatively fresh deposits (within six months of ageing) from those that are at least a year old, even when the same ink has been used in both instances. This approach could therefore still be applicable to the relative ageing of two or more ink entries of the same formulation. This could be valuable in determining the order in which two or more documents were written, or again to identify whether an alteration has been made to a document using the same ink formulation at a later date. Furthermore, the comparable results obtained for both the ballpoint and gel inks demonstrate the potential applicability of this approach to multiple ink types, such as gel, rollerball or hybrid formulations.

Conclusions
This study demonstrates the utility of diffuse reflectance visible spectroscopy for the analysis of inks in situ on paper, providing rapid and probative information without the need for destructive analysis. Analysis of six inks stored in darkness found them to remain chemically stable for at least 32 months when stored in polypropylene sleeves, and at least 28 months when open to air. Those exposed to ambient light were found to misclassify throughout the ageing period, with the spectra exhibiting a shift in relative reflectance between the blue and red regions. This supports the theory that these changes are due to the photodegradation of triarylmethane dyes via oxidative N-demethylation. Furthermore, the continual change in classifications indicates that ageing is still continuing (though at a slower rate) even after 32 months of light exposure.
PLSR models constructed for the light-exposed inks performed well for mid- to long-term ageing, although short-term ageing within the first six months proved challenging. Two-year-old samples of both ballpoint and gel deposits could be predicted with a maximum error of six months, comparable with recent work by Díaz-Santana et al. 26 The predictive accuracy obtained was less than that achieved by either Senior et al. or Sharma and Kumar over 9–18 months, 24,25 although this was expected due to the more limited information derived from in situ analysis and the longer ageing period examined.
The established models could potentially be used to estimate the approximate age of an unknown ink entry, thus assisting to prove or disprove an issue of contention in court, such as whether a will was signed relatively recently (within six months) or many years ago. The ability of the model to distinguish between 'fresh' and 'old' deposits could also allow for the relative ageing of two or more ink entries on a single document. This could assist document examiners in determining the sequence in which two or more documents were written, or alternatively to identify alterations made to a written document following its initial generation.
It must be noted that the ink deposit sizes used in this study are not reflective of 'real' samples submitted for document examination. The transition of this approach to casework would necessitate instrumentation with a smaller sampling aperture, such as a microspectrophotometer or video spectral comparator (VSC). Future work could also examine the use of micro-FTIR or micro-Raman spectroscopy to characterise non-colourant components of the ink, or the use of alternative pre-processing and regression techniques to improve model performance on the collected data.
Finally, it must be considered that optimal application of these models would require knowledge of the source of the ink as well as the storage history of the document, which is not feasible in a casework scenario. It is hence important to consider not only alternate approaches to developing chemometric dating models, but the extent to which these models are applicable within an operational context.

Conflicts of interest
There are no conicts to declare.