Masahiko Taniguchi* and Jonathan S. Lindsey
Department of Chemistry, North Carolina State University, Raleigh, NC 27695-8204, USA. E-mail: mtanigu@ncsu.edu
First published on 16th December 2024
The field of photochemistry underpins broad scientific endeavors, encompasses diverse molecular substances, and incorporates descriptions of qualitative and quantitative properties, all of which together may be representative of many scientific disciplines. Yet finding absorption and fluorescence spectra along with companion values of the molar absorption coefficient (ε) and fluorescence quantum yield (Φf) for a given compound is an arduous task even with the most advanced search methods. To gauge whether chatbots could be used to reliably search the literature, the absorption and fluorescence spectra and quantitative parameters (ε and Φf) for 16 popular dyes and fluorophores were sought using ChatGPT 3.5, ChatGPT 4o, Microsoft Copilot, Google Gemini, Gemini advanced, and Meta AI. In most cases, the values of ε and Φf returned by the chatbots agreed with known values from established resources, whereas the retrieval of spectra was only marginally successful. The chatbots were further challenged to find data for fictive compounds (e.g., rhodamine 7G). The results from each chatbot were categorized as follows: “fabricated” (provides numbers that do not exist in the context queried), “fooled” (mis-identifies the compound but does not return any data), “feigned” (acts as if the fictive compound is real but does not provide any data), or “faithful” (responds that the compound is not known or is not available). In summary, the present shortcomings should not cloud the view that chatbots – judiciously used – already provide a valuable resource for the challenging scientific task of finding granular data and, to a lesser degree, spectral traces for known compounds.
Over the years, we have been working to assemble a curated database of absorption and fluorescence spectra along with computational modules for carrying out quantitative evaluations commonly encountered in the field.1–4 The term “curated” refers to the presence of considered spectral traces including values of ε and Φf (where available), solvent information, and references to the originating literature. Spectra databases have been prepared that include 339 common compounds,1,2,4 12 natural porphyrins,5 150 chlorophylls,6 14 tolyporphins,7 324 synthetic chlorins,8 73 phyllobilins,9 177 flavonoids10 and 220 bilins;11 altogether for the 1309 compounds there are >2000 absorption and fluorescence spectra in the databases.
The accumulation of curated databases has been a tedious task because the existing search methods are woefully inadequate for finding spectral traces and companion values of ε and Φf.12 A further challenge is assessing the appropriateness of values reported in the published literature. As one example, the reported values of ε and Φf for the benchmark compounds zinc(II)tetraphenylporphyrin and free base tetraphenylporphyrin are known to vary widely among hundreds of published papers.13 Another area of concern is whether light-scattering corrections have been applied upon acquisition of spectra.14 After appropriate spectra are identified in the existing literature, digitization is required to generate the requisite XY dataset of intensity versus wavelength (or wavenumber) that describes a spectrum.15 Collections of spectral traces are more valuable than tabulations of wavelength maxima12,15–17,19 in enabling important assessments such as molecular brightness18 and calculation of the spectral overlap term19 in Förster resonance energy transfer (FRET) processes.20
An alternative to seeking spectral traces is to calculate spectral properties. The in silico prediction of properties of organic molecules is not yet fully satisfactory. For example, density functional theory (DFT) with use of appropriate basis sets and parameters (typically divined by testing against a battery of known members in the target family) can now provide deep insight into electronic structure and the origin of molecular transitions as well as reasonably accurate excitation and emission energies, but not the bandwidths, vibronic progressions, and tails that are part and parcel of spectra of organic compounds in the condensed phase. Knowledge of the full spectra – not merely tabulated wavelengths – is essential for the creation and understanding of photoactive materials; the imaginative design of zero-overlap fluorophores by Flood and coworkers21 and the identification of fascinating pigments in plants by Bastos and coworkers22 may comprise ideal examples. The de novo prediction of a value for the fluorescence quantum yield, which is a consequence of competitive photophysical relaxation processes, is generally beyond the scope of present calculational methods. The question arises, however, concerning the extent to which prediction of spectra of organic molecules can be achieved by artificial intelligence (AI) technologies. For instance, natural language processing (NLP) text mining techniques can be utilized to “scrape” a massive amount of absorption and fluorescence spectral data from the literature.23,24 On the basis of acquired experimental data,25–31 in conjunction with experimental and DFT calculated data32–34 or solely DFT calculated data,35–41 machine learning (deep learning) has been used to predict spectra for organic molecules. 
The significant challenges to mining the extraordinary wealth of information in the chemistry literature, and possible resolutions to present limitations, have been articulated by Risko and coworkers, taking the venerable task of laboratory recrystallization as a case study.42
Chatbots, of which ChatGPT is perhaps the most popular and representative, are human-like, conversational-style AI software packages that rely on large language models and are integrated into web-based graphical user interfaces. Here, we report the capability of six chatbots for (i) data retrieval of absorption and fluorescence spectral parameters, and (ii) finding absorption and fluorescence spectral traces. The chatbots are ChatGPT (versions 3.5 and 4o, OpenAI), Copilot (Microsoft), Gemini and Gemini advanced (Google), and Meta AI (Meta). The spectral traces and data are sought for compounds that are well-known in the fields of photochemistry and photobiology. The core issue is whether chatbots can ameliorate the tedious tasks of finding the spectral traces and critical companion granular information of ε and Φf for specific well-known compounds and thereby accelerate the assembly of curated spectral databases (Fig. 1). A surprising outcome is that regardless of the present shortcomings, we likely are standing at the dawn of chatbots, which already comprise innovative tools for appropriately chosen applications. The integration of AI technologies into photochemistry research should accelerate development of organic molecule-based dyes and fluorophores.
Fig. 1 The development of curated databases of spectra (e.g., PhotochemCAD) has required meticulous searching in the literature, which may be ameliorated through the use of chatbots.
Q1 | What is the molar absorption coefficient of 10,10-diphenylanthracene? |
Q2 | What is the molar absorption coefficient of coumarin 808? |
Q3 | What is the molar absorption coefficient of chlorophyll k? |
Q4 | What is the fluorescence quantum yield of Lucifer Red? |
Q5 | What is the fluorescence quantum yield of rhodamine 7G? |
Q6 | What is the fluorescence quantum yield of Alexa Fluor 850? |
Q7 | Please display the absorption spectrum of beta-carotene |
Q8 | Please display the absorption spectrum of tetraphenylporphyrin |
Q9 | Please display the fluorescence spectrum of chlorophyll a |
Q10 | Please display the spectral overlap integral of the absorption spectrum of Nile Blue green and the fluorescence spectrum of fluorescein |
Data from reliable sources are used for comparison with the responses from the chatbots. The data sources cited by the chatbots in response to the questions are categorized into four groups: (i) databases freely accessible on the internet,46,47 (ii) data from chemical vendors,48–57 (iii) miscellaneous websites,58–60 and (iv) published journal articles.61–88 Many chatbots refer to the PhotochemCAD databases hosted at the Oregon Medical Laser Center (OMLC)46 as the source. The program PhotochemCAD and accompanying spectral database of 125 compounds were conceived around 1980 at The Rockefeller University by one of us (J. S. L.),12,15 originated and developed by our group in the mid-to-late 1980s at Carnegie Mellon University, and first published in 1998 (ref. 1) following a move to NC State University in the mid-1990s.3 The goal has always been that the spectral data can be freely downloaded for use by others. Sometime thereafter, the near-entirety of the PhotochemCAD spectral data and companion references were republished on a website at the OMLC. The chatbots may access the latter but not download the spectral data from the original PhotochemCAD site, hence the often-incomplete referencing by the chatbots concerning the true origin of the data. (The PhotochemCAD spectral database was expanded to 150 compounds and published in 2005 as PhotochemCAD 2 (ref. 2). The PhotochemCAD 3 database was expanded to 336 compounds in 2018 (ref. 4) and expanded further to >2000 absorption and fluorescence spectra, as stated in the Introduction.)
Compound | ChatGPT 3.5 | ChatGPT 4o | Copilot | Gemini | Gemini advanced | Meta AI | Data from reliable sourcesb |
---|---|---|---|---|---|---|---|
a Reported values are significantly different from the authentic, generally accepted data. b Data from PhotochemCAD are curated, and original literature sources are provided therein. | |||||||
Naphthalene | 15![]() |
23![]() |
600046 | 600046 | 6000 (275 nm)46 | 600046 | 133![]() |
6000 (275 nm)47 | |||||||
See ref. 61–63 and 65 | |||||||
1,8-ANS | 5000 (375 nm) | 4950 (350 nm) | 4.9a![]() |
NA | 4950 (350 nm)48 | NA | 4950 (350 nm)66 |
8000 (375 nm)49 | 4000 (375 nm)47 | ||||||
Anthracene | 10![]() |
8600a (252 nm) | 970046 | 970046 | 9700 (356 nm)46 | 9700 (356.2 nm)46 | 180![]() |
400a (350 nm) | 9700 (356 nm)47 | ||||||
9,10-DPA | 30![]() |
22![]() |
14![]() |
NA | 14![]() |
14![]() |
4000 (338 nm) |
8740 (354 nm) | |||||||
14![]() |
|||||||
9200 (392 nm)47 | |||||||
Quinine | 11![]() |
5810 (347 nm) | 5700 (349 nm)46 | 5700 (347.5 nm)46 | 5700 (347.5 nm)46 | 5700 (349 nm)46 | 5700 (349 nm)47 |
Acridine orange | 40![]() |
70![]() |
27![]() |
27![]() |
27![]() ![]() |
27![]() |
27![]() |
See ref. 67–76 | |||||||
Coumarin 1 | 26![]() |
29![]() |
23![]() |
23![]() |
23![]() |
23![]() |
23![]() |
Fluorescein | 80![]() |
83![]() |
70![]() |
70![]() |
92![]() |
92![]() |
92![]() |
Rhodamine 6G | 108![]() |
116![]() |
116![]() |
116![]() |
116![]() |
116![]() |
116![]() |
Chlorophyll a | 75![]() |
117![]() |
71![]() |
117![]() |
120![]() |
117![]() |
117![]() |
23![]() |
86![]() |
86![]() |
|||||
Chlorophyll b | 45![]() |
54![]() |
159![]() |
62![]() |
56![]() |
159![]() |
159![]() |
22![]() |
40![]() |
46![]() |
57![]() |
||||
Chlorophyll d | NA | 63![]() |
63![]() |
63![]() |
63![]() |
NA | 45![]() |
21![]() |
44![]() |
||||||
63![]() |
|||||||
Chlorophyll f | NA | 71![]() |
71![]() |
57![]() |
71![]() |
NA | 66![]() |
48![]() |
56![]() |
71![]() |
|||||
TPP | 250![]() |
530![]() |
4450a (532 nm)a![]() |
18![]() |
480![]() |
18![]() |
443![]() |
4450a (532 nm)a![]() |
18![]() |
||||||
ICG | 100![]() ![]() |
136![]() |
NA | 230![]() |
78![]() |
230![]() |
194![]() |
Alexa 488 | 76![]() |
71![]() |
73![]() |
73![]() |
91![]() |
73![]() |
73![]() |
Each chatbot generally retrieved a value for ε, but with accuracy dependent on the given chatbot and particular compound. The values in question have been flagged in Table 3. Chatbots recognize two chief elements of absorption spectra: (i) the ε value depends on the wavelength, and (ii) the absorption spectrum may consist of multiple peaks. Most of the responses from chatbots include information on the wavelength (at λmax) as well as values at other wavelengths (e.g., for multi-banded spectra), whenever applicable, without a specific request in the questions. In general, the data accuracies of ChatGPT 3.5 and 4o (9 out of 16 for both) were inconsistent and hence the two chatbots were unreliable as a sole source of a value of ε. On the other hand, Copilot, Gemini, Gemini advanced and Meta AI were surprisingly reliable. In particular, Copilot and Meta AI did not afford apparently fabricated data; indeed, those cases with wildly disparate data were reported faithfully from the cited source, but the data in the source itself were incorrect. The retrieval results are reported in detail along with our analyses as described below:
(i) The absorption spectra of (polycyclic) aromatic hydrocarbons typically exhibit strong ethylenic bands (E1 and E2 bands) and weak benzenoid bands (B bands).61,63 For example, the absorption spectrum of benzene consists of an E1 band (∼180 nm, ε = 60 000), an E2 band (∼200 nm, ε = 8000), and a tiny B band (255 nm, ε = 215).61 Although the E bands are documented in classical articles,61,63–65 such absorption features are often omitted in modern articles and authoritative treatises.62 The rationale for the omission is perhaps not that the absorption lies at <250 nm, but that the E bands arise from an S0 → S2 transition. Immediate relaxation occurs therefrom to the S1 excited state; hence, E bands do not contribute directly to fluorescence. Some chatbots chose the E1 band as the wavelength for the ε value. Therefore, spectral traces of polycyclic aromatic hydrocarbons (naphthalene, anthracene, 1,8-ANS, and 9,10-DPA) were freshly measured here and are displayed from 200 nm to capture the E bands (Fig. 2). For the spectra in Fig. 2, the ε values of naphthalene and anthracene were taken from representative literature values,64 while those of 1,8-ANS and 9,10-DPA were redetermined herein.
The absorption spectrum of naphthalene comprises a strong E1 band (221 nm, ε = 133 000), multiple E2 bands (∼275 nm, ε = ∼6000), and a weak B band (311 nm, ε = ∼300) (Fig. 2, panel a). ChatGPT 3.5 and 4o chose the E1 band (216 and 220 nm, respectively) for the wavelength; however, the corresponding ε values were ∼5 to ∼9 times lower than the actual values. All other chatbots selected the E2 band maxima and quoted 6000 as the ε value of naphthalene.
The absorption spectrum of 1,8-ANS exhibits solvent effects; for example, the reported ε values in water and ethanol vary from 4000 to 8000 depending on the source.47–49,66 The new measurements here give ε = 3780 at 354 nm in water and 5810 at 375 nm in ethanol (Fig. 2, panel b). ChatGPT 3.5 applied the data in ethanol (375 nm),66 while ChatGPT 4o adopted the data in water (4950 at 350 nm).48,64 The molar absorption coefficient is listed in EmM units (equal to cm−1 mM−1) in catalogues from the dye vendors,48,49 which requires conversion to cm−1 M−1. Gemini advanced successfully converted the value and responded in proper units, whereas Copilot did not perform the unit conversion, in which case the value is listed as is.
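The unit conversion at issue is a factor of 1000, because a coefficient expressed per millimolar must be multiplied by 1000 to be expressed per molar. A minimal sketch follows; the function name `emm_to_molar` and the catalogue-style entry of 4.95 are assumptions made for illustration (the entry was chosen to correspond to the 4950 M−1 cm−1 value quoted above) and are not taken from a vendor document.

```python
def emm_to_molar(epsilon_emm: float) -> float:
    """Convert a molar absorption coefficient from mM^-1 cm^-1 (EmM) to M^-1 cm^-1."""
    # 1 M = 1000 mM, so the coefficient per molar is 1000 times the per-millimolar one.
    return epsilon_emm * 1000.0


# Illustrative catalogue-style entry (assumed value, not quoted from a vendor).
epsilon_molar = emm_to_molar(4.95)
print(f"{epsilon_molar:.0f} M^-1 cm^-1")
```

A response that reports the catalogue number unchanged while labeling it in M−1 cm−1 would therefore be low by three orders of magnitude.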
The absorption spectrum of anthracene also comprises a strong E1 band (252 nm, ε = 180 000), multiple E2 bands (maximum at 356 nm, ε = 7400 measured here; 9700 in the literature47), and a B band that is submerged in the region of the E2 bands (Fig. 2, panel c). ChatGPT 3.5 picked the longest-wavelength E2 band (374 nm, ε = 10 300), and the value is in an acceptable range. ChatGPT 4o denoted the positions of the E1 (252 nm) and E2 (350 nm) bands correctly; however, the ε values were completely unreasonable. All other chatbots properly employed the data from literature values.47
9,10-DPA exhibits four absorption peaks at wavelengths greater than 300 nm: 338 nm (ε = 4000), 354 nm (ε = 8740), 373 nm (ε = 14000), and 392 nm (ε = 9200) (Fig. 2, panel d). These values are based on literature data,47 whereas the data shown in the figure are different by as much as 20%. ChatGPT 3.5 denoted the first peak (333 nm) whereas ChatGPT 4o picked the second peak (354 nm); however, the ε values were overestimated. All other chatbots properly employed data from literature values.47
(ii) All chatbots responded quite well to the questions for the long-established fluorophores quinine, fluorescein, and rhodamine 6G.
(iii) The absorption spectrum of acridine orange is drastically altered by protonation/deprotonation70 and is concentration-dependent due to monomeric and dimeric forms.68,70,75 As determined here, the absorption maximum of acridine orange in ethanol (490 nm, ε = 48 600) is shifted hypsochromically and hypochromically in basic ethanol (431 nm, ε = 21 900) (Fig. 3).
Fig. 3 Absorption spectrum of acridine orange in ethanol (solid line) and basic ethanol (dotted line).
The ε values reported in the literature67–76 for acridine orange are summarized together with the data measured herein, as shown in Table 4.
λ max (nm) | ε (M−1 cm−1) | Solvent | Reference |
---|---|---|---|
490 | 82![]() |
Ethanol | 67 |
490 | 75![]() |
Ethanol | 68 |
492 | 63![]() |
0.001 M HCl aq | 75 |
491.5 | 58![]() |
Ethanol | 69 |
492 | 55![]() |
Ethanol | 76 |
493 | 53![]() |
Ethanol with H2O and CO2 gas | 70 |
490 | 48![]() |
Ethanol | This work |
492 | 32![]() |
Aqueous solution below pH 2 | 72 |
489 | 31![]() |
Aqueous solution | 73 |
496 | 22![]() |
SDS buffer | 71 |
420 | 59![]() |
Aqueous solution pH 7 | 74 |
432 | 27![]() |
Basic ethanol | 70 |
431 | 21![]() |
Basic ethanol | This work |
The responses from ChatGPT 3.5 and 4o were based on data in ethanol solution and are in a reasonable range (∼495 nm, ε = 40 000 or 70 000). The responses from Copilot and Meta AI were values in basic ethanol that originate from literature data (431 nm, ε = 27 000).47 The response from Gemini advanced was affected by the propagation of misplaced values from the commercial vendor's data,50 which combine the molar absorption coefficient in basic ethanol (ε = 27 000) with the wavelength maximum (492 nm) in ethanol.
(iv) The ε values of coumarin 1 from ChatGPT 3.5 (ε = 26 000) and 4o (ε = 29 000) were close to the value from PhotochemCAD (ε = 23 500); however, the corresponding absorption wavelength (λmax = 350 nm) is different from the value from PhotochemCAD (λmax = 375 nm). On the other hand, all other chatbots provided the values from PhotochemCAD. In the current study, the ChatGPT versions tended to estimate and give approximate values, which may stem from characteristic features of ChatGPT.
(v) The absorption spectra of chlorophylls a, b, d, and f are displayed in Fig. 4 to visually guide the comparison described below. The spectra are drawn from a comprehensive database of chlorophyll spectra6 that have been included in PhotochemCAD.
Fig. 4 Absorption spectra of chlorophylls a (green, in diethyl ether), b (black, in diethyl ether), d (blue, in methanol), and f (red, in methanol).6
The ε value of chlorophyll a from ChatGPT 3.5 was 0.25–0.5 times that of the values from PhotochemCAD, whereas the values from ChatGPT 4o were highly likely taken directly from PhotochemCAD data. The ε values of chlorophyll b from ChatGPT 3.5, ChatGPT 4o and Gemini advanced were also far less than the widely accepted value.
(vi) The ε value of chlorophyll d from ChatGPT 4o is given for the peak position for the Q band (662 nm); however, 662 nm is not a peak maximum. The correct value is 697 nm, and the given molar absorption coefficient (21 000) is 0.3 times that of the accepted value (63 680). Discrepancies of 35 nm in peak position and 3-fold in peak intensity are profound errors in the context of function in a photosynthetic apparatus as well as in many other systems. Gemini advanced returned a source journal reference for chlorophyll d and f wherein the title, journal name, year, and volume were correct; the list of authors was only partially correct (key author Blankenship was excluded whereas the estimable chlorophyll scientist Scheer was erroneously included); and the journal page number was wrong. Such errors were easy to spot. Such errors in this field are referred to as fabrication.
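The size of the chlorophyll d discrepancy can be checked with simple arithmetic. The sketch below uses only the four numbers quoted in the preceding paragraph; the function names and the flagging thresholds (`max_fold`, `max_shift_nm`) are hypothetical choices for illustration and do not represent the criteria used to flag values in the tables.

```python
def compare_to_reference(reported_eps, reference_eps, reported_nm, reference_nm):
    """Return (intensity ratio, wavelength offset in nm) of a reported value vs. a reference."""
    ratio = reported_eps / reference_eps
    shift_nm = reference_nm - reported_nm
    return ratio, shift_nm


def is_flagged(ratio, shift_nm, max_fold=1.25, max_shift_nm=5.0):
    """Flag a response whose epsilon or peak position deviates beyond the given limits.

    Both limits are hypothetical, illustrative thresholds.
    """
    fold = max(ratio, 1.0 / ratio)  # fold difference, irrespective of direction
    return fold > max_fold or abs(shift_nm) > max_shift_nm


# Chlorophyll d: ChatGPT 4o response (21 000 at 662 nm) vs. accepted value (63 680 at 697 nm).
ratio, shift_nm = compare_to_reference(21000, 63680, 662, 697)
print(f"ratio = {ratio:.2f}, shift = {shift_nm} nm, flagged = {is_flagged(ratio, shift_nm)}")
```

The ratio of about 0.33 and the offset of 35 nm reproduce the "0.3 times" and "35 nm" figures stated in the text.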
(vii) The response for chlorophyll f from ChatGPT 4o included a fabricated additional peak (740 nm) that is non-existent. The values for chlorophyll f from Gemini (ε = 57 500 at 705 nm) and Gemini advanced (ε = 71 000 at 437 nm and ε = 56 800 at 706 nm) also were fabricated, even though the values were close to those reported in the reference specified by Gemini and Gemini advanced.79 On top of that, the absorption maximum for the near-ultraviolet absorption band (termed the B band) from Gemini advanced (437 nm) was completely incorrect; the correct value is 406.5 nm.
(viii) The ε value of TPP from both Copilot and Meta AI was incorrect (ε = 4450 at 532 nm). The sources were from different web-based homework helpers for students: Bartleby for Copilot and Chegg for Meta AI. The original text material displayed in Bartleby59 was identical with that in Chegg60 as shown in Fig. 5 (also see the ESI† for screenshots of the web site).
Fig. 5 Image of (erroneous) parameter values in a source material59,60 cited by a chatbot.
The absorption spectrum of TPP is comprised of a strong band in the blue region (denoted as B) and a set of comparatively weaker bands in the green-red region (denoted as Q); however, no peak exists at 532 nm. The absorption spectrum of TPP is shown in Fig. 6.8 The source of the material shown in Fig. 5, presumably drawn from a textbook, could not be located regardless of further internet searches or examination of diverse printed materials. The origin of the inaccurate data (peak at 532 nm) is unknown. This subtle but non-negligible incident exemplifies how the propagation on the internet of pernicious errors concerning properties of even the most common benchmark materials can damage scientific research and corrode understanding.
Fig. 6 Absorption spectrum of TPP in toluene at room temperature.8
(ix) The ε value of ICG from Gemini advanced (ε = 78000) was provided without any sources and was dissimilar to the values from other retrieved data or known from other sources.
(x) The ε value of Alexa 488 from Gemini advanced (ε = 91400) was slightly different from other retrieved data or known from other sources. The disparity appears to be due to the different environment (the value was for Alexa 488 conjugated to secondary antibodies, but no sources were provided).
It is noteworthy that all chatbots provided the wavelength (nm) together with the ε value even though in most cases the question requested only the latter parameter. The ε value varies depending on the wavelength and is senseless without a specified wavelength. An absorption spectrum typically consists of multiple peaks due to the presence of distinct electronic levels often each accompanied by a manifold of vibrational energy levels; therefore, it is not easy to describe a spectrum solely by tabulated numbers without the spectral trace, as shown by the material in this section. The availability of spectra is an essential matter in the photosciences.12,15–17,19
Compound | ChatGPT 3.5 | ChatGPT 4o | Copilot | Gemini | Gemini advanced | Meta AI | Data from reliable sourcesb |
---|---|---|---|---|---|---|---|
a Reported values are significantly different from the authentic, generally accepted data. b Data from PhotochemCAD are curated, and original literature sources are provided therein. | |||||||
Naphthalene | 0.25 | 0.23 | 0.23 (ref. 46) | 0.23 (ref. 46) | 0.23 (ref. 46) | 0.23 (ref. 46) | 0.23 (ref. 47) |
1,8-ANS | 0.28 to 0.33 | 0.001 in H2O | 0.2 to 0.3 (ref. 80) | Low | 0.004 in H2Oa![]() |
0.003 in H2Oa | 0.24 (ref. 47) |
0.154 in ethylene glycola![]() |
0.004 in H2O (ref. 82) | ||||||
Anthracene | 0.28 | 0.27 | 0.36 (ref. 46) | 0.36 (ref. 46) | 0.36 (ref. 46) | 0.36 (ref. 46) | 0.36 (ref. 47) |
9,10-DPA | 0.98 | 0.90 to 0.95 | 1 (ref. 46) | 0.8–1 (ref. 46) | 1 (ref. 46) | 1 (ref. 46) | 1 (ref. 47) |
Quinine | 0.54 to 0.58 | 0.54 | 0.546 (ref. 46) | 0.546 (ref. 46) | 0.546 (ref. 46) | 0.546 (ref. 46) | 0.546 (ref. 47) |
Acridine orange | 0.7 to 0.85a | 0.3 to 0.4 | 0.2 (ref. 46) | 0.2 (ref. 46) | 0.2 (ref. 46) | 0.2 (ref. 46) | 0.2 (ref. 47) |
Coumarin 1 | 0.15 to 0.40a | 0.73 | 0.5, 0.73 (ref. 46) | 0.5, 0.73 (ref. 46) | 0.73 (ref. 46) | NA | 0.5 (ref. 47) |
Fluorescein | 0.85 to 0.90 | 0.92 | 0.79 (ref. 54) | 0.925 (ref. 83) | 0.925 (ref. 83) | 0.97 (ref. 46) | 0.97 (ref. 47) |
Rhodamine 6G | 0.95 to 0.99 | 0.95 | 0.95 (ref. 46) | 0.95 (ref. 83) | 0.95 (ref. 83) | 0.95 (ref. 46) | 0.95 (ref. 47) |
Chlorophyll a | 0.001 to 0.01a | 0.3 | 0.25 (ref. 84) | 0.32 (ref. 46) | 0.32 (ref. 46) | 0.01 to 0.06 deep water88![]() |
0.32 (ref. 47) |
Chlorophyll b | 0.003 to 0.01a | 0.16 | 0.117 (ref. 46) | 0.06 to 0.11 (ref. 84) | 0.117 (ref. 46) | 0.117 (ref. 46) | 0.117 (ref. 47) |
Chlorophyll d | 0.001 to 0.003a | 0.1 | NA | NA | NA | NA | 0.36 (ref. 86) |
Chlorophyll f | 0.001 to 0.003a | 0.1 | 0.16 (ref. 85) | NA | NA | NA | 0.39 (ref. 86) |
TPP | 0.15 to 0.25a | 0.11 | 0.11 (ref. 46) | 0.03a to 0.11 (ref. 46) | 0.11 (ref. 46) | 0.11 (ref. 46) | 0.11 (ref. 47) |
ICG | 0.13 to 0.16a | 0.02 | 0.025 (ref. 87) | 0.09 (ref. 57) | 0.09 (ref. 57) | 0.04 (ref. 52) | 0.05 (ref. 47) |
Alexa 488 | 0.92 | 0.92 | 0.92 (ref. 56) | 0.92 (ref. 56) | 0.92 (ref. 56) | 0.92 (ref. 56) | 0.92 (ref. 56) |
The responses from ChatGPT 3.5 and 4o differ markedly: dramatic improvements in the accuracy of data retrieval were observed for ChatGPT 4o compared to ChatGPT 3.5. The responses from ChatGPT 3.5 were unreliable; indeed, the responses for the Φf values of chlorophylls indicated each was a weakly fluorescent compound. A major drawback of the ChatGPT family is the absence of reported data sources (e.g., web links or research articles). ChatGPT 4o performed quite well for the 16 compounds listed here, but of course that does not imply that ChatGPT 4o will afford reliable results for the Φf values of other compounds. Indeed, for a non-expert, the absence of annotation of sources presents a situation where the retrieved values must be taken on faith.
The following are notable points.
(i) The Φf value of 1,8-ANS exhibits a strong solvent dependence and ranges from 0.004 in water to 0.63 in n-octanol.82 The Φf values of 1,8-ANS retrieved by Gemini advanced and Meta AI were actually for 1,8-ANS derivatives – not 1,8-ANS itself – and although those values were very close by coincidence,81 the data retrieval by Gemini advanced and Meta AI is judged as unsuccessful.
(ii) The web links of PubMed are often embedded as sources (especially by Gemini), and in most of the cases, the links are valid; however, an invalid (fabricated) PubMed ID was provided for the Φf value of acridine orange.
(iii) The initial response for chlorophyll a from Meta AI was the Φf value of oceanic phytoplankton, not that of the molecule chlorophyll a. While phytoplankton likely contain chlorophyll a, the former is a living organism whereas the latter is a molecule; the chatbot mixup is non-trivial. A more specific question “What is the fluorescence quantum yield of the chlorophyll a molecule?” to Meta AI generated reasonable answers (0.32 and 0.25).
(iv) The response for TPP from Gemini reflects a serious problem: generative AI is often incapable of distinguishing chemical derivatives. The response included not only the Φf value of TPP (0.11) but also that of the zinc chelate of TPP, namely Zn-TPP (0.03). This problem is common not only to all chatbots but also to the results from search engines. The responses for TPP also reflect a longstanding problem perhaps appreciated only by the photosciences aficionado – that values of Φf depend on a number of experimental conditions, including whether the solution is aerated or deaerated, and even if aeration is controlled and specified, the reported values can span a distressingly large range.13 The recent consensus values for Φf of TPP are 0.090 in deaerated toluene versus 0.070 in toluene in air,13 replacing a longstanding reliance on the generic value of 0.11 for the Φf of TPP in toluene. The passage of time – and perhaps the advent of more powerful chatbots – may be required for the new values to supplant the old.
(v) To our knowledge, there is only one reported value for the Φf of chlorophyll d [0.36 in benzene],86 and only two values for chlorophyll f [0.39 in benzene86 and 0.16 in pyridine85]. The latter two values were recorded by different research groups and could reflect different experimental methods or true solvent effects. Thus, it is quite understandable that most chatbots have trouble retrieving the data. Note that ChatGPT 4o quoted the Φf value of both chlorophyll d and f as 0.1, which must originate by estimations from other related compounds.
(vi) For chlorophyll f, ChatGPT 3.5 gave two results: one was the fabricated value of 0.001 to 0.003, whereas the other was ‘there is no widely accepted value’.
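Values of ε and Φf are commonly combined into the molecular brightness, taken here as the product ε × Φf. The following is a minimal sketch using three pairs of reference-source values that are legible in the tables above; the dictionary and function names are illustrative, and the sketch assumes that each ε and Φf pair refers to comparable conditions, which would need to be confirmed against the original sources.

```python
# compound: (epsilon in M^-1 cm^-1 at the tabulated band, fluorescence quantum yield)
# Values are the reference-source entries listed in the tables above.
REFERENCE_VALUES = {
    "naphthalene": (6000, 0.23),
    "anthracene": (9700, 0.36),
    "quinine": (5700, 0.546),
}


def brightness(epsilon: float, phi_f: float) -> float:
    """Molecular brightness as the product of epsilon and the fluorescence quantum yield."""
    return epsilon * phi_f


brightness_by_compound = {
    name: brightness(eps, phi) for name, (eps, phi) in REFERENCE_VALUES.items()
}

# List the compounds from brightest to dimmest.
for name, value in sorted(brightness_by_compound.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.0f} M^-1 cm^-1")
```

Because the product depends on both parameters, an error in either retrieved value propagates directly into the brightness.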
Question | ChatGPT 3.5 | ChatGPT 4o | Copilot | Gemini | Gemini advanced | Meta AI |
---|---|---|---|---|---|---|
Q1 | Fabricated | Fabricated | Faithful | Fabricated | Faithful | Faithful |
Q2 | Fabricated | Feigned | Faithful | Faithful | Feigned | Faithful |
Q3 | Feigned | Faithful | Faithful | Faithful | Faithful | Faithful |
Q4 | Fabricated | Faithful | Feigned | Faithful | Fooled | Fooled |
Q5 | Fabricated | Fabricated | Faithful | Fabricated | Faithful | Faithful |
Q6 | Faithful | Fabricated | Faithful | Faithful | Faithful | Faithful |
(Q1) 10,10-Diphenylanthracene: Gemini regarded 10,10-diphenylanthracene as equal to 9,10-diphenylanthracene. Copilot, Gemini advanced, and Meta AI reported that data for 10,10-diphenylanthracene were not readily available. No chatbot pointed out that the 10,10-diphenyl substitution is chemically wrong and that 10,10-diphenylanthracene is a non-existent compound.
(Q2) Coumarin 808: ChatGPT 3.5 and 4o presumed that coumarin 808 is a known coumarin derivative (which is a typical behavior of ChatGPT) and provided values similar to those of other coumarin derivatives. Gemini advanced reported that “coumarin 808 absorbs light in the near-infrared range,” most likely due to the beguiling number of 808. Such a labeling scheme is common, as exemplified by commercial dyes (e.g., DyLight 800).
(Q3) Chlorophyll k: ChatGPT 3.5 defined chlorophyll k as a recently discovered pigment; otherwise, all other chatbots skipped this booby trap.
(Q4) Lucifer Red: ChatGPT 4o declared that Lucifer Red is not readily available. Copilot regarded Lucifer Red as a compound similar to Lucifer Yellow, but did not fabricate any data. Gemini stated that “There isn't a well-established fluorophore called ‘Lucifer Red’.” On the other hand, Gemini advanced concluded that Lucifer Red is a red-emitting luciferin analog used in bioluminescence imaging. Meta AI deduced that Lucifer Red is a derivative of rhodamine (Lucifer Yellow is an amino-naphthalimide derivative).
(Q5) Rhodamine 7G: ChatGPT 3.5, ChatGPT 4o, and Gemini regarded rhodamine 7G as a synonym of other rhodamine derivatives. Conversely, Copilot, Gemini advanced, and Meta AI recognized that rhodamine 7G is a non-existing fluorophore.
(Q6) Alexa Fluor 850: all chatbots other than ChatGPT 4o clearly discerned that Alexa Fluor 850 is not a valid dye and does not exist.
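The categorizations for the fictive compounds can be tallied per chatbot to give a quick overview. The sketch below transcribes the six rows of the categorization table for Q1 to Q6; the variable and function names are illustrative only.

```python
from collections import Counter

CHATBOTS = ["ChatGPT 3.5", "ChatGPT 4o", "Copilot", "Gemini", "Gemini advanced", "Meta AI"]

# Rows transcribed from the categorization table (one entry per chatbot, in the order above).
RESULTS = {
    "Q1": ["Fabricated", "Fabricated", "Faithful", "Fabricated", "Faithful", "Faithful"],
    "Q2": ["Fabricated", "Feigned", "Faithful", "Faithful", "Feigned", "Faithful"],
    "Q3": ["Feigned", "Faithful", "Faithful", "Faithful", "Faithful", "Faithful"],
    "Q4": ["Fabricated", "Faithful", "Feigned", "Faithful", "Fooled", "Fooled"],
    "Q5": ["Fabricated", "Fabricated", "Faithful", "Fabricated", "Faithful", "Faithful"],
    "Q6": ["Faithful", "Fabricated", "Faithful", "Faithful", "Faithful", "Faithful"],
}


def tally(results, chatbots):
    """Count how often each category was assigned to each chatbot."""
    counts = {bot: Counter() for bot in chatbots}
    for row in results.values():
        for bot, category in zip(chatbots, row):
            counts[bot][category] += 1
    return counts


counts = tally(RESULTS, CHATBOTS)
for bot in CHATBOTS:
    print(f"{bot}: {dict(counts[bot])}")
```

With only six questions, such counts are descriptive rather than statistically meaningful.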
Again, the intent in posing these tricky questions was not to trick the chatbots or to depreciate their value, but rather to show the capabilities and consequences even with zero-shot prompts. By (i) adding appropriate additional prompts (e.g., if you cannot find the relevant information, please say “I don't know”) or (ii) using few-shot prompts (first ask whether the titled compounds exist, then pose additional questions), chatbots should be able to respond honestly.
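The two prompting strategies can be sketched as plain string handling around a generic question-answering function. Everything here is hypothetical: `ask` stands for whatever means is used to submit a prompt to a chatbot, `stub_chatbot` is a canned stand-in rather than a real chatbot, and the wording of the existence question is an assumption.

```python
from typing import Callable, Optional

DONT_KNOW_CLAUSE = 'If you cannot find the relevant information, please say "I don\'t know".'


def guarded_prompt(question: str) -> str:
    """Strategy (i): append an explicit permission to decline."""
    return f"{question} {DONT_KNOW_CLAUSE}"


def existence_prompt(compound: str) -> str:
    """Strategy (ii), first turn: ask whether the compound exists (assumed wording)."""
    return f"Is {compound} a known compound? Answer yes or no."


def ask_in_two_steps(ask: Callable[[str], str], compound: str, question: str) -> Optional[str]:
    """Pose the data question only if the chatbot affirms that the compound exists."""
    reply = ask(existence_prompt(compound)).strip().lower()
    if not reply.startswith("yes"):
        return None  # no data question is asked for an unrecognized compound
    return ask(guarded_prompt(question))


def stub_chatbot(prompt: str) -> str:
    """Canned stand-in for a chatbot (hypothetical), used only to exercise the logic."""
    if prompt.startswith("Is rhodamine 7G"):
        return "No, I am not aware of such a compound."
    return "I don't know"


result = ask_in_two_steps(
    stub_chatbot, "rhodamine 7G", "What is the fluorescence quantum yield of rhodamine 7G?"
)
print(result)
```

Whether a given chatbot actually declines under such prompts was not tested here and would have to be established empirically.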
Question | ChatGPT 3.5 | ChatGPT 4o | Copilot | Gemini | Gemini advanced | Meta AI |
---|---|---|---|---|---|---|
Q7 | ASCII art | Gaussian | — | Spectrum | Spectrum | Tabulated |
Q8 | ASCII art | Gaussian | — | Spectrum | Spectrum | Tabulated |
Q9 | ASCII art | Gaussian | — | Spectrum | Spectrum | Tabulated |
Q10 | — | Gaussian | — | — | Spectra (false) | Tabulated |
ChatGPT 3.5 tried to display absorption and fluorescence spectra by ASCII art, yet the generated graphics were nonsensical and unsatisfactory. ChatGPT 4o created spectral traces by applying a Gaussian distribution, which was a good upgrade from that of ChatGPT 3.5. Nonetheless, the spectra were of limited use due to lack of information such as the full-width-at-half-maximum (fwhm), the intensity ratio of each peak, and the inclusion of multiple peaks. Copilot and Meta AI lack capabilities for drawing spectra in graphics. Gemini and Gemini advanced pulled out the corresponding spectra in graphical form together with web links of sources, a function equivalent to that of a Google image search.
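To illustrate why a Gaussian construction needs a bandwidth, relative intensities, and multiple peaks, the sketch below builds a trace as a sum of Gaussian bands. The peak positions and ε values are those quoted above for 9,10-DPA; the 12 nm fwhm is an assumed, illustrative bandwidth, and treating each band as a Gaussian in wavelength is a simplification that ignores the asymmetry and tails of real condensed-phase spectra. This is not a reconstruction of the method used by ChatGPT 4o.

```python
import math


def gaussian_band(wavelength_nm, center_nm, height, fwhm_nm):
    """Gaussian band with a given peak height and full-width-at-half-maximum."""
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return height * math.exp(-((wavelength_nm - center_nm) ** 2) / (2.0 * sigma ** 2))


def synthetic_spectrum(wavelengths_nm, bands):
    """Sum of Gaussian bands; each band is (center_nm, height, fwhm_nm)."""
    return [sum(gaussian_band(w, c, h, f) for c, h, f in bands) for w in wavelengths_nm]


# Peak positions and epsilon values for 9,10-DPA as quoted in the text;
# the 12 nm fwhm is an assumption for illustration.
DPA_BANDS = [(338, 4000, 12.0), (354, 8740, 12.0), (373, 14000, 12.0), (392, 9200, 12.0)]

wavelengths = [300 + i for i in range(121)]  # 300 to 420 nm in 1 nm steps
spectrum = synthetic_spectrum(wavelengths, DPA_BANDS)
peak_index = max(range(len(spectrum)), key=spectrum.__getitem__)
print(f"maximum of synthetic trace at {wavelengths[peak_index]} nm")
```

Changing the assumed fwhm alters how strongly neighboring bands overlap, which is one reason a Gaussian trace without a sourced bandwidth is of limited use.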
Question 10 requested the spectral overlap integral derived from the absorption spectrum of Nile Blue and the fluorescence spectrum of fluorescein. Such an overlap entails the unusual reverse FRET, in other words, uphill energy transfer from fluorescein to Nile Blue. The question is demanding in that it requires highly specific data that may be neither published nor otherwise available. No chatbot could provide a suitable response. The attempt by Gemini advanced displayed the homo overlap of the absorption and fluorescence spectra of rhodamine 6G, and hence was a failure.
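For reference, the spectral overlap integral in Förster theory is commonly written as J = ∫ F_D(λ) ε_A(λ) λ⁴ dλ, where F_D is the donor fluorescence spectrum normalized to unit area and ε_A is the molar absorption coefficient of the acceptor. The sketch below evaluates this expression by trapezoidal integration, assuming both spectra are sampled on the same wavelength grid. The function names are introduced here for illustration, and the toy spectra are made-up placeholders rather than data for Nile Blue or fluorescein.

```python
import math
from typing import Sequence


def trapezoid(x: Sequence[float], y: Sequence[float]) -> float:
    """Trapezoidal integral of y over x (x need not be evenly spaced)."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def overlap_integral(wavelengths_nm: Sequence[float],
                     donor_emission: Sequence[float],
                     acceptor_epsilon: Sequence[float]) -> float:
    """J in M^-1 cm^-1 nm^4 for epsilon in M^-1 cm^-1 and wavelength in nm."""
    area = trapezoid(wavelengths_nm, donor_emission)
    if area <= 0.0:
        raise ValueError("donor emission must have positive area")
    # Normalize the donor emission to unit area, then weight by epsilon and lambda^4.
    integrand = [
        (f / area) * e * w ** 4
        for w, f, e in zip(wavelengths_nm, donor_emission, acceptor_epsilon)
    ]
    return trapezoid(wavelengths_nm, integrand)


# Toy spectra (made up): a donor emission band and an acceptor absorption band.
grid = [float(w) for w in range(450, 701)]
donor = [math.exp(-(((w - 520.0) / 15.0) ** 2)) for w in grid]
acceptor = [5.0e4 * math.exp(-(((w - 560.0) / 20.0) ** 2)) for w in grid]
j_value = overlap_integral(grid, donor, acceptor)
```

Because the donor spectrum is normalized inside the function, only its shape matters, whereas the acceptor spectrum must carry absolute ε values. This is one reason the question requires both a spectral trace and a quantitative parameter.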
(1) It is known that Gemini, Meta AI, and Copilot leverage both search engine results and LLMs (thereby accessing a wide range of literature), whereas ChatGPT 3.5 and ChatGPT 4o rely solely on their training data. An alternative means of comparison could rely on use of an application programming interface (API), which enables local operation of the chatbots independent of servers.89 Accessing chatbots via an API can automate tasks by batch processing, which is more efficient than use of a web user interface (WUI), where reliance on manual input is time consuming.90,91 Similar accuracy rates were typically obtained by both WUI and API approaches for GPT-4V,92 yet data extraction from scientific graphs via an API can be 30 times faster than via a WUI.92 With speed comes cost, however: an API can cost ∼6 times more ($125 per month) than a WUI ($20 per month).92 The tasks examined herein were relatively simple given that limited numbers of photophysical parameters were examined; thus, WUI-based chatbots were employed. Bulk extraction of photophysical parameters may benefit from use of an API in the future.
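The contrast between manual WUI entry and batch processing can be sketched as follows. This is an assumption-laden outline rather than a working client: `ask` is a hypothetical callable standing in for whichever API client would be used, `placeholder_ask` returns no real data, and the question wording is illustrative.

```python
from typing import Callable, Dict, Iterable


def build_question(compound: str, solvent: str) -> str:
    """Compose one question about the two photophysical parameters."""
    return (
        "What are the molar absorption coefficient and the fluorescence "
        f"quantum yield of {compound} in {solvent}?"
    )


def run_batch(compounds: Iterable[str], solvent: str,
              ask: Callable[[str], str]) -> Dict[str, str]:
    """Send one question per compound and collect the raw text answers."""
    answers: Dict[str, str] = {}
    for compound in compounds:
        answers[compound] = ask(build_question(compound, solvent))
    return answers


def placeholder_ask(prompt: str) -> str:
    """Stand-in for a real API call; it returns no data."""
    return "I don't know"


answers = run_batch(["fluorescein", "rhodamine 6G"], "ethanol", placeholder_ask)
```

Passing the client in as a function keeps the batch logic independent of any particular vendor, so the same loop could be pointed at several chatbots for comparison.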
(2) Chatbots are known to generate different answers when the identical question is repeated.93,94 To test this, an identical question on each of three subjects was posed to GPT 4.1 five times (see the ESI†). For photophysical parameter retrieval tasks that are relatively straightforward, almost identical responses were obtained. Thus, each question was asked only once thereafter.
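One simple way to quantify such repeatability, once a numerical value has been parsed from each repeated response, is the relative spread of the values. The sketch below is illustrative only: the 5% tolerance is an arbitrary assumption, the function names are introduced here, and the numbers are made up rather than taken from the ESI.

```python
from statistics import mean
from typing import Sequence


def relative_spread(values: Sequence[float]) -> float:
    """(max - min) / |mean| of the values parsed from repeated responses."""
    if not values:
        raise ValueError("need at least one value")
    center = mean(values)
    if center == 0:
        raise ValueError("mean is zero; relative spread is undefined")
    return (max(values) - min(values)) / abs(center)


def is_repeatable(values: Sequence[float], tolerance: float = 0.05) -> bool:
    """True if the repeated answers agree to within the chosen tolerance."""
    return relative_spread(values) <= tolerance


# Made-up numbers standing in for five parsed answers to one question.
trial_values = [0.90, 0.90, 0.91, 0.90, 0.90]
print(is_repeatable(trial_values))
```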
(3) Chatbots are known to be sensitive to how a prompt is formulated.94 A prompt can be engineered to improve the quality of the responses from chatbots without tedious fine-tuning of the training data.95 For example, repeatedly modifying the questions can engender responses in a desired format.96 Examples in this domain include adding conditions, pinpointing solvents, indicating the environment (pH, bound to protein), limiting the phase (solid or liquid or gas), specifying the composition (e.g., molecules), and so forth.
(4) During the preparation and review of this manuscript, a new version of ChatGPT (GPT 4.1) was released. Are the results presented herein already out of date? Some domain specific tasks may enjoy greater benefits due to the features of GPT 4.1; however, a foray using GPT 4.1 revealed little improvement for the retrieval of values for ε and Φf. The results are summarized in the ESI (Tables S1 and S2†). GPT 4.1 is not immune to questions involving fictive dyes and fluorophores. GPT 4.1 does exhibit notable improvements for the graphical display of spectra, albeit utilizing Gaussian distributions.
It is clear that the present chatbots are of limited reliability for tasks that require broad processing, such as unguided education in the photosciences.97 The results shown in Fig. 5 herein and in the accompanying text substantiate this conclusion. On the other hand, extant chatbots are of significant benefit already for punctate and singular tasks such as identification of the values of photophysical parameters that otherwise are often buried deeply in the vast scientific literature.
What is the significant difference between chatbots and extant search engines? Chatbots provide clear-cut definitive answers (a double-edged sword), which can save time and accelerate acquisition of desired information; in this regard, chatbots surpass present search engines in terms of efficiency. Chatbots do not generate novel data from scratch, but compose sentences by successively adding the most suitable words according to patterns of natural language learned from their training data. The knowledge that chatbots acquire through training relies heavily on internet resources, and hence overlaps heavily with web resources. Thus, the information accessible to chatbots and to search engines is partly shared. The well-deserved criticisms leveled at chatbots may reflect shortcomings that are not entirely idiosyncratic to chatbots, but rather arise from the nature of the source materials, particularly internet web resources. As an example, an erroneous value of a parameter in a textbook (due to author error or publication production error) could enter a website or scientific publication, and then be accessed by a chatbot from the textbook or from either of the latter sources. The thread-like lineage of such values is often frayed if not clipped. In other words, the propagation of mistakes and typographical errors across the internet may originate in decades-old (if not centuries-old) traditional printed materials, not in chatbots.
Chatbots may not always provide accurate information; thus, the domain-specific expert needs to inspect the answers carefully. With the spectral databases of PhotochemCAD in hand, evaluations of ε and Φf are straightforward tasks for compounds represented therein. An outcome of the present study is the importance of using multiple chatbots to elicit results, followed by evaluation by the domain-specific expert. Multiple chatbots are readily accessible, and others are under development. The possibility exists of course that all chatbots return identical, incorrect responses; however, the current study already demonstrates the diversity of responses from chatbots, at least in the photosciences field. Even inaccurate or incorrect results often have subtleties that may warrant further scrutiny. In summary, the chatbots examined here are quite effective (but not universally so) for retrieval of granular data (ε and Φf) of considerable importance in the photosciences, are only marginally effective for finding spectral traces, and can be susceptible to inquiries concerning (intentionally or inadvertently) fictive compounds. Molecular design in the photosciences can make use of information beyond absorption and fluorescence spectra; for example, a database of the yield of intersystem crossing, phosphorescence spectrum, and triplet state lifetime would enable design of molecules for diverse photoprocesses.98–100 Chatbots would appear to be in their infancy, yet if judiciously applied, already offer a valuable means for searching the scientific literature, for which new strategies are urgently required.
Footnote
† Electronic supplementary information (ESI) available: Retrieval of the molar absorption coefficient (ε) and the fluorescence quantum yield (Φf). Questions involving fictive dyes and fluorophores. Retrieval of absorption and fluorescence spectral traces. Comparison of results from ChatGPT 4o and GPT 4.1. See DOI: https://doi.org/10.1039/d4dd00255e
This journal is © The Royal Society of Chemistry 2025