Do Llamas understand the periodic table?
Abstract
Large Language Models (LLMs) demonstrate remarkable abilities in synthesizing scientific knowledge, yet their limitations, particularly with basic arithmetic, raise questions about their reliability. As materials science increasingly employs LLMs for tasks like hypothesis generation, understanding how these models encode specialized knowledge becomes crucial. Here, we investigate how the open-source Llama series of LLMs represents the periodic table of elements. We observe a 3D spiral structure in the hidden states of LLMs that aligns with the conceptual structure of the periodic table, suggesting that LLMs can recover the geometric organization of scientific concepts from text alone. Linear probing reveals that middle layers encode continuous, overlapping attributes that enable indirect recall, while deeper layers sharpen categorical distinctions and incorporate linguistic context. These findings suggest that LLMs represent symbolic knowledge not as isolated facts, but as structured geometric manifolds that intertwine semantic information across layers. We hope this work inspires further exploration of LLM interpretability in chemistry and materials science, enhancing trust in model reliability, guiding model optimization and tool design, and promoting mutual innovation between science and AI.
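
To make the linear-probing idea concrete, the sketch below extracts hidden states for element-name prompts from a middle layer of a Llama checkpoint and fits a linear classifier on a periodic-table attribute. This is a minimal illustration, not the paper's exact protocol: the checkpoint name, prompt template, probe layer, and the period labels are assumptions chosen for demonstration.

```python
# Minimal linear-probing sketch (illustrative; not the paper's exact setup).
# Assumes access to a Hugging Face Llama checkpoint; the model name, prompt
# template, probe layer, and labels below are assumptions for demonstration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

MODEL = "meta-llama/Llama-2-7b-hf"  # hypothetical choice of checkpoint
LAYER = 16                          # a middle layer, per the abstract's claim

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16)
model.eval()

# Small demo set: element name -> period (row of the periodic table).
elements = {"hydrogen": 1, "helium": 1, "lithium": 2, "carbon": 2,
            "sodium": 3, "chlorine": 3, "potassium": 4, "iron": 4,
            "silver": 5, "iodine": 5, "gold": 6, "lead": 6}

features, labels = [], []
with torch.no_grad():
    for name, period in elements.items():
        inputs = tokenizer(f"The chemical element {name}", return_tensors="pt")
        out = model(**inputs, output_hidden_states=True)
        # Hidden state of the final token at the chosen middle layer.
        features.append(out.hidden_states[LAYER][0, -1].float().numpy())
        labels.append(period)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```

The same feature matrix can be reduced to three components (e.g., with PCA) to inspect low-dimensional geometric structure such as the spiral arrangement described above; a larger element set and held-out evaluation across layers would be needed for any substantive claim.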
