This Open Access Article is licensed under a Creative Commons Attribution 3.0 Unported Licence.

Jeweler-in-the-loop: personalized alloy color optimization via preference-based BO

Chase Katz*, Ting-Yu Yang, Parker King, Md Shafiqul Islam, Brent Vela and Raymundo Arróyave
Department of Materials Science and Engineering, Texas A&M University, College Station, TX 77843, USA. E-mail: ichasekatz@icloud.com

Received 9th January 2026, Accepted 2nd March 2026

First published on 5th March 2026


Abstract

Many materials design attributes that are central to adoption, such as aesthetics, perceived quality, or user-specific preferences, are difficult to quantify directly, making preference feedback a practical proxy for optimization. Here we present a preference-driven Bayesian optimization framework and demonstrate it using ring color as a concrete case study. This case study simulates a scenario in which a jeweler synthesizes a ring with a given chemistry and presents it to a stakeholder, who expresses a preference relative to an incumbent ring color. Our approach, preferential Bayesian optimization (PBO), learns a latent utility function over alloy composition space, with perceived color obtained via a Thermo-Calc optical forward model. Using preference feedback, the framework iteratively proposes new alloy chemistries predicted to better align with user aesthetic preferences. After a limited number of proposed alloys and their associated colors, the user selects the most desirable option, guiding the search toward optimal aesthetic outcomes. We then evaluate the cost of the proposed alloys and identify a cost-aesthetic Pareto front, enabling informed trade-offs between affordability and visual appeal. The proposed framework is readily applicable to materials design problems in which subjective or hard-to-measure attributes play a dominant role in decision-making.


1. Introduction

Rings have long served as powerful symbols across cultures, embodying ideas such as commitment, heritage, status, and personal identity.1–3 The art of metallurgy, dating back over 6000 years, enabled ancient civilizations to work copper, gold, and bronze into functional and symbolic objects, often signifying wealth, power, or spiritual meaning.3,4 Today, the jewelry industry continues to thrive, with the global market valued at approximately $366.8 billion in 2024 and projected to grow to $578.5 billion by 2033, driven by rising disposable incomes, evolving fashion trends, and increasing demand for personalized accessories.5

This scale presents a significant financial opportunity for jewelers. In an ideal scenario with unlimited resources, a custom jeweler could fabricate every possible alloy within a given chemical design space and present the full palette of ring colors to the user for selection. However, exhaustive fabrication to explore alloy compositions for a preferred color is costly and time-consuming. The expense is particularly acute because precious metals such as gold and platinum can cost over $130 and $50 per gram,6,7 respectively, meaning that even a handful of prototype rings represents a substantial investment.

Traditional alloy design for rings and other jewelry has historically relied on trial-and-error synthesis and subjective evaluation of properties such as color. Moreover, aesthetic preferences are inherently personal and difficult to quantify, making systematic optimization of ring color for individual users challenging.8 Consequently, there is a need for methods that can efficiently identify optimal aesthetic outcomes while minimizing the number of prototypes required.

More broadly, many materials design problems are ill-posed as the optimization of a fully specified scalar or even a fixed multi-objective function. Key attributes that affect ring preference (aesthetics, tactile feel, surface finish, perceived quality, or designer intent) are not directly measurable, vary across stakeholders, and evolve as candidates are explored. In these settings, the objective is better treated as a latent preference model rather than a known function. Stakeholders can often provide reliable pairwise judgments (A vs. B) even when absolute scoring is inconsistent, making preference data a natural feedback channel for learning and optimization. In this context, “jeweler-in-the-loop” refers to a human-in-the-loop design workflow in which a jeweler or designer provides iterative pairwise preference judgments over candidate ring colors, guiding the optimization process without explicitly specifying a quantitative objective function.

Preference-based optimization formalizes this process by combining a latent-utility model with an acquisition strategy that selects the next comparison to maximize expected utility improvement or information gain.9 This yields an iterative loop that can efficiently explore high-dimensional composition spaces while remaining faithful to subjective or context-dependent criteria. The approach is compatible with conventional objectives and constraints, enabling joint optimization (e.g., preference-informed utility vs. cost, processability, or compliance) within a multi-objective PBO framework. In this work, alloy color is a concrete example of a hard-to-quantify attribute, but the method is intended as a general recipe for materials design problems where preference signals are the most reliable data available.

In this work, we present a preference-driven Bayesian optimization framework to identify ring colors that best match an individual user's aesthetic preferences.8,9 Preferential Bayesian optimization learns a latent utility function over alloy composition space, with color induced by the Thermo-Calc optical forward model, and proposes new alloy chemistries predicted to align with the user. We implement this as an iterative loop in which each query renders two candidate alloy colors and elicits a pairwise preference, using that feedback to update the utility model and drive subsequent proposals.

To demonstrate the proposed preference-based Bayesian optimization method, we perform an in silico study using Thermo-Calc's optical properties module within the Noble Metal Alloys Database (TCNOBL4) as the ground truth.10,11 Each query, which is analogous to fabricating a physical prototype, retrieves a predicted color from Thermo-Calc's optical model. The user is then presented with two candidate designs and selects the preferred option. After a fixed number of iterations, the resulting set of candidate alloys is analyzed in a multi-objective setting to identify the aesthetic-cost trade-off frontier, which serves as a decision aid for selecting a composition that balances visual appeal with affordability.

Preference-based Bayesian optimization has been widely studied, including multi-objective and scalable variants.8,9,12–17 Prior work has addressed preferences over explicit objectives, such as preference-order constraints within multi-objective BO frameworks that bias sampling toward preferred regions of the Pareto front,13 as well as extensions of preferential BO to multi-objective settings by scalarized Thompson sampling.12 Other studies have focused on algorithmic improvements, including preference exploration strategies for efficient multi-outcome optimization,15 decision-theoretic acquisition functions such as qEUBO,17 and improved inference schemes for preferential Gaussian processes that account for posterior skewness.16 Applications of preferential BO to subjective design problems, such as additive manufacturing parameter tuning, have also been demonstrated.14

These prior works primarily focus on developing new acquisition strategies, inference schemes, or scalable variants of preferential Bayesian optimization. In contrast, this work does not introduce a new preferential BO algorithm. Instead, we adopt the established pairwise Gaussian-process preference model and deploy it within a materials-design setting, demonstrating how preference-based BO can be embedded within alloy composition design using a CALPHAD-based optical model.

The novelty of this work lies in the application and integration of established preference-based optimization within a materials-design workflow. To our knowledge, this is one of the first applied design studies to deploy Thermo-Calc's optical properties module within TCNOBL4 for alloy color prediction in an iterative design setting. This deployment enables a preference-driven, human-in-the-loop Bayesian optimization workflow for ring design that integrates a physically grounded optical forward model, human preference feedback, and cost trade-off analysis within alloy composition space. Ring color serves as a concrete demonstration of how preference-based optimization can be used in materials problems where key attributes are subjective or difficult to quantify directly. Together, these contributions establish a practical use case for the Thermo-Calc optical model and a generalizable template for preference-based materials design.

2. Methods

2.1. Modeling latent color-preferences: pairwise Gaussian processes

We summarize the standard pairwise Gaussian-process preference learning formulation for completeness and to make the manuscript self-contained; this work adopts the established model without modification and focuses on its application to materials design.8,9

We consider a preference-learning setting in which each candidate alloy is represented by a composition vector image file: d6dd00009f-t1.tif, where each component of x corresponds to the fractional concentration of a constituent element e in the alloy design space. There are d + 1 elements in the alloy system, with x constrained to the composition simplex (non-negative fractions summing to a fixed total). The goal is to infer a latent utility function f(x) from noisy pairwise ring comparisons.9 The training set consists of a collection of items $X = \{x_1, \ldots, x_n\}$ together with m observed preferences, where the k-th preference states that $x_{i_k}$ is preferred to $x_{j_k}$.

Pairwise comparisons are encoded through a design matrix $D \in \mathbb{R}^{m \times n}$, where each row corresponds to a single incumbent–challenger preference $(i_k, j_k)$ and contains a +1 in the column of the preferred item and a −1 in the column of the non-preferred item, with zeros elsewhere, as shown in Table 1. Let $\mathbf{f} = (f_1, \ldots, f_n)^\top \in \mathbb{R}^{n}$ denote the latent utilities at the n candidate alloys, with $f_i = f(x_i)$. The design matrix maps utilities to comparison differences via

image file: d6dd00009f-t4.tif

Table 1 Illustrative design matrix D for pairwise comparisons in the preference-learning model
image file: d6dd00009f-u1.tif
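
To make the encoding concrete, the following minimal sketch builds a design matrix from a list of (preferred, non-preferred) index pairs and applies it to a utility vector. The function names and the four-alloy example are hypothetical and are not taken from the authors' repository.

```python
def design_matrix(n_items, preferences):
    """Build the m x n pairwise design matrix D.

    preferences: list of (winner_index, loser_index) pairs, one per comparison.
    Row k holds +1 in the winner's column, -1 in the loser's column, 0 elsewhere.
    """
    D = []
    for winner, loser in preferences:
        if winner == loser:
            raise ValueError("a comparison needs two distinct items")
        row = [0] * n_items
        row[winner] = 1
        row[loser] = -1
        D.append(row)
    return D


def utility_differences(D, f):
    """Return D f, i.e. f[winner] - f[loser] for every comparison."""
    return [sum(d * fi for d, fi in zip(row, f)) for row in D]


# Hypothetical example: 4 alloys, 3 comparisons
# (alloy 1 beats alloy 0, alloy 1 beats alloy 2, alloy 3 beats alloy 1).
D = design_matrix(4, [(1, 0), (1, 2), (3, 1)])
diffs = utility_differences(D, [0.1, 0.7, 0.3, 0.9])
```

A positive entry of `diffs` means the assumed utilities agree with the recorded preference for that comparison.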


We use a Gaussian process (GP) prior to model the latent utility:

$$f(x) \sim \mathcal{GP}\big(m(x),\, k_\theta(x, x')\big).$$

The prior mean m(x) is taken to be constant, providing an uninformative, non-biasing baseline. The covariance function is chosen as a scaled radial basis function kernel,

$$k_\theta(x, x') = s^2 \exp\!\left(-\frac{1}{2}\sum_{i}\frac{(x_i - x_i')^2}{\ell_i^2}\right),$$
where $\ell_i > 0$ are lengthscale parameters (optionally distinct across dimensions) and $s^2 > 0$ is an output scale. Kernel hyperparameters were restricted to positive values through the default parameterization and constraints in GPyTorch/BoTorch. The multi-objective model used the BoTorch PairwiseGP, which applies default weakly informative priors to the kernel lengthscale and output scale (the prior distributions and parameters are specified in the SI). The single-objective study used a GPyTorch RBF kernel with a ScaleKernel and no explicit hyperparameter priors. Alloy compositions were represented as normalized fractions in [0, 1], and no manual hyperparameter tuning was performed. Implementation details and hyperparameter settings are described in the SI and documented in the repository.
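
As an illustration of the kernel described above, the sketch below evaluates a scaled RBF kernel with per-dimension lengthscales in plain Python. It assumes the standard squared-exponential form; the study itself relies on the GPyTorch/BoTorch implementations, and the function names and numerical values here are hypothetical.

```python
import math


def scaled_rbf(x, y, lengthscales, outputscale):
    """Scaled RBF kernel: outputscale * exp(-0.5 * sum(((x_i - y_i) / l_i)^2))."""
    sq = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, lengthscales))
    return outputscale * math.exp(-0.5 * sq)


def kernel_matrix(X, lengthscales, outputscale):
    """Kernel (Gram) matrix K with K[i][j] = k(x_i, x_j)."""
    return [[scaled_rbf(xi, xj, lengthscales, outputscale) for xj in X] for xi in X]


# Hypothetical compositions given as normalized fractions (Au, Ag, Cu, Pt, Al).
X = [
    (0.80, 0.05, 0.05, 0.05, 0.05),
    (0.40, 0.30, 0.20, 0.05, 0.05),
]
K = kernel_matrix(X, lengthscales=[0.3] * 5, outputscale=2.0)
```

The diagonal of `K` equals the output scale, and off-diagonal entries decay as compositions move apart relative to the lengthscales.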

Observed comparisons are modeled through a probit likelihood.9 Conditional on latent utilities at the compared items, the probability that $x_{i_k}$ is preferred to $x_{j_k}$ is

image file: d6dd00009f-t8.tif
where Φ(·) is the standard normal cumulative distribution function. Assuming independent comparisons given f, the log likelihood is
image file: d6dd00009f-t9.tif
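
A minimal sketch of a probit preference likelihood is given below. It assumes the preference probability is the standard normal CDF of the utility difference divided by a noise scale; the `noise_scale` parameter is an assumption (some formulations include an additional factor of √2), so the exact scaling may differ from the expressions used in this work.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def preference_probability(f_winner, f_loser, noise_scale=1.0):
    """P(winner preferred to loser | utilities) under a probit link."""
    return _PHI((f_winner - f_loser) / noise_scale)


def log_likelihood(f, preferences, noise_scale=1.0):
    """Sum of log-probabilities over independent (winner, loser) index pairs."""
    return sum(
        math.log(preference_probability(f[i], f[j], noise_scale))
        for i, j in preferences
    )


# Hypothetical utilities for three alloys and two observed comparisons.
utilities = [0.2, 1.0, -0.5]
ll = log_likelihood(utilities, [(1, 0), (0, 2)])
```

Equal utilities give a preference probability of 0.5, and larger utility gaps push the probability toward 0 or 1.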

Because the probit likelihood yields a non-Gaussian posterior, inference proceeds via a Laplace approximation centered at the maximum a posteriori (MAP) utilities fMAP.9 Define the negative log posterior,

image file: d6dd00009f-t10.tif
where m is the prior mean vector evaluated at X and K is the kernel matrix with entries $K_{ij} = k_\theta(x_i, x_j)$. The MAP estimate satisfies $\nabla S(f) = 0$ and is obtained using Newton–Raphson iterations that exploit analytic expressions for the gradient and Hessian of S(f). The Hessian takes the form
$$\nabla^2 S(f) = K^{-1} + W,$$
where W is the negative Hessian of the log likelihood evaluated at the current utilities. The resulting Laplace posterior is a Gaussian approximation,
image file: d6dd00009f-t11.tif
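
As a worked sketch of the Newton–Raphson step, assume the negative log posterior has the standard form $S(f) = -\log p(C \mid f) + \tfrac{1}{2}(f - m)^\top K^{-1}(f - m)$ up to an additive constant, which is consistent with the Hessian stated above. Under that assumption the gradient and the update read:

```latex
\nabla S(f) = K^{-1}(f - m) - \nabla_f \log p(C \mid f),
\qquad
\nabla^2 S(f) = K^{-1} + W,
\\[4pt]
f^{(\mathrm{new})} = f - \left(K^{-1} + W\right)^{-1} \nabla S(f).
```

Iterating this update until the gradient vanishes gives the MAP utilities at which the Laplace approximation is centered.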

To learn the kernel hyperparameters $\theta$, we maximize the Laplace-approximated marginal log likelihood (evidence).9 Because the probit likelihood renders the exact evidence intractable, we expand the log posterior around the MAP estimate $f_{\mathrm{MAP}}$. Let $W = -\nabla^2 \log p(C \mid f)\big|_{f_{\mathrm{MAP}}}$ denote the negative Hessian of the log likelihood evaluated at the MAP point, and let K be the prior covariance matrix. The Laplace approximation yields

image file: d6dd00009f-t13.tif
This expression balances data fit at the MAP utilities, the GP prior regularization, and the Occam factor $\log|K^{-1} + W|$, which penalizes models that are unnecessarily flexible. Maximizing this approximate marginal likelihood provides a principled and tractable way to learn the hyperparameters for pairwise-probit Gaussian process preference models.

2.2. Prediction with pairwise Gaussian processes

For a set of test points $X_*$, let $k_* = k(X_*, X)$ and $K_{**} = k(X_*, X_*)$. Under the Laplace approximation, the posterior predictive distribution of $f_* = f(X_*)$ is Gaussian with mean
$$\mu_* = m(X_*) + k_* K^{-1}\big(f_{\mathrm{MAP}} - m(X)\big),$$
and covariance
image file: d6dd00009f-t14.tif
This form reduces to standard GP prediction when W = 0 and incorporates preference information through the curvature of the likelihood at the MAP point.

Given the Gaussian predictive distribution for the latent utilities, the model can evaluate the probability that one candidate is preferred over another. For two test compositions $a, b \in X_*$, let $\mu_a$, $\mu_b$ denote their predictive means and let

$$\sigma_{ab}^2 = \Sigma_{*,aa} + \Sigma_{*,bb} - 2\,\Sigma_{*,ab}$$
be the predictive variance of their utility difference. Under the probit likelihood, the preference probability is
image file: d6dd00009f-t15.tif
where Φ(·) is the standard normal cumulative distribution function. Thus, the utility posterior induces a predictive distribution over pairwise preferences. This quantity is used implicitly in the acquisition function, since the posterior means µ* enter the surrogate model and the posterior covariances Σ* govern uncertainty in the latent utility.
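
The sketch below shows one way to turn predictive means and covariances into a preference probability. It assumes unit-variance comparison noise added to the variance of the utility difference (the `noise_var` parameter is hypothetical), so the scaling may differ from the expression used in this work.

```python
import math
from statistics import NormalDist


def difference_variance(cov_aa, cov_bb, cov_ab):
    """Predictive variance of f(a) - f(b) from posterior covariance entries."""
    return cov_aa + cov_bb - 2.0 * cov_ab


def predictive_preference(mu_a, mu_b, cov_aa, cov_bb, cov_ab, noise_var=1.0):
    """P(a preferred to b) after integrating out the Gaussian utility posterior.

    noise_var is an assumed comparison-noise variance (hypothetical default).
    """
    var = difference_variance(cov_aa, cov_bb, cov_ab)
    return NormalDist().cdf((mu_a - mu_b) / math.sqrt(noise_var + var))


# Hypothetical posterior summaries for two candidate alloys.
p_ab = predictive_preference(mu_a=0.9, mu_b=0.4, cov_aa=0.2, cov_bb=0.3, cov_ab=0.05)
```

Greater posterior uncertainty in the utility difference pulls the predicted preference probability toward 0.5.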

2.3. Acquisition function: expected improvement (single objective)

Because the pairwise GP provides a posterior distribution over the latent utility f(x), this latent function can be optimized directly using standard acquisition criteria within a PBO framework. In the single-objective setting, we select new candidates by maximizing Expected Improvement (EI).18 We implemented the optimization loop in BoTorch.19

Let image file: d6dd00009f-t16.tif denote the Laplace-approximated posterior at iteration t. Let

image file: d6dd00009f-t17.tif
be the current best predicted utility among evaluated designs. The improvement achieved by evaluating a new candidate x is
$$I(x) = \max\big(0,\, f(x) - f_t^{+}\big),$$
which is a random variable under the posterior. The Expected Improvement is the posterior expectation of I(x),
image file: d6dd00009f-t18.tif

Because the posterior over f(x) is Gaussian, EI admits a closed form. Define the standardized improvement

$$z(x) = \frac{\mu_t(x) - f_t^{+}}{\sigma_t(x)},$$
and denote by Φ(·) and ϕ(·) the standard normal cumulative distribution function and probability density function. Then
$$\alpha_{\mathrm{EI}}(x) = \big(\mu_t(x) - f_t^{+}\big)\,\Phi\big(z(x)\big) + \sigma_t(x)\,\phi\big(z(x)\big).$$
In this formulation, the latent utility GP plays the same role as an ordinary regression GP: its posterior mean $\mu_t(x)$ reflects how promising a candidate is according to inferred preferences, while the posterior variance $\sigma_t^2(x)$ quantifies uncertainty due to limited or ambiguous comparisons. Thus EI naturally balances exploitation of compositions predicted to yield high latent utility with exploration of regions where uncertainty in f(x) remains large, even though the model is trained only on pairwise preference data.
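
The closed-form EI expression can be evaluated directly from a posterior mean, a posterior standard deviation, and the incumbent value. The sketch below is a plain-Python illustration rather than the BoTorch routine used in this work; the handling of zero posterior uncertainty is an added convention.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def expected_improvement(mu, sigma, best):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2) and incumbent `best`."""
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic (added convention).
        return max(0.0, mu - best)
    z = (mu - best) / sigma
    return (mu - best) * _STD_NORMAL.cdf(z) + sigma * _STD_NORMAL.pdf(z)


# Hypothetical candidates: (posterior mean, posterior std) of latent utility.
candidates = {"A": (0.50, 0.05), "B": (0.40, 0.40)}
best_so_far = 0.55
scores = {name: expected_improvement(m, s, best_so_far) for name, (m, s) in candidates.items()}
```

In this hypothetical example candidate B has the lower mean but the higher EI, because its larger uncertainty leaves more room for improvement over the incumbent.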

2.4. Acquisition function: expected area improvement (bi-objective)

In this work, we consider the design space evaluated according to two conflicting objectives: (i) a latent utility function u(x) describing color preference (to be maximized), inferred through pairwise comparison experiments, and (ii) a cost function c(x) (to be minimized).20 Following standard practice, we convert cost to a maximization objective by working with the transformed value
f2(x) = −c(x),
so that the two-objective vector is
f(x) = (f1(x), f2(x)) = (u(x), −c(x)).

Each objective is modeled using an independent GP. The GP posterior at any unobserved point x provides predictive distributions

image file: d6dd00009f-t20.tif
where independence between the two models is assumed for analytical tractability.

Let image file: d6dd00009f-t21.tif denote the current set of non-dominated (Pareto optimal) points in the bi-objective space. We choose a reference point r = (r1, r2) that is dominated by all feasible objective values. The dominated hypervolume (in this case, area) is defined as

image file: d6dd00009f-t22.tif
where λ(·) denotes Lebesgue measure (area).20,21

If a new point f(x) is added, the hypervolume improvement is

image file: d6dd00009f-t23.tif

Since f1(x) and f2(x) are uncertain under the GP posteriors, we evaluate the Expected Hypervolume Improvement (EHVI):21

image file: d6dd00009f-t24.tif

For the two-objective case, the Pareto frontier can be written as an ordered sequence of points

$$p^{(1)},\, p^{(2)},\, \ldots,\, p^{(K)},$$
sorted such that $p_1^{(k)}$ increases and $p_2^{(k)}$ decreases. These points divide the objective space into disjoint “cells” $A_1, \ldots, A_{K+1}$.21 Let
$$a_k = p_1^{(k)}, \qquad b_k = p_2^{(k-1)},$$
with the conventions
$$p^{(0)} = \big(r_1,\, p_2^{(1)}\big), \qquad p^{(K+1)} = \big(p_1^{(K)},\, r_2\big).$$

A newly sampled point f(x) contributes hypervolume improvement only if it falls inside one of these cells. In such a case, the improvement is exactly the area of the rectangle:

$$\mathrm{HVI}(x) = \big(f_1(x) - a_k\big)\big(b_k - f_2(x)\big), \quad \text{for } f(x) \in A_k.$$
In this application, the first objective models a latent preference derived from pairwise comparisons over alloy colors, while the second objective captures the manufacturing cost. EHVI evaluates each candidate design x by the expected increase in the area dominated by the Pareto frontier. This leads to principled exploration of alloy compositions that jointly improve color aesthetics and reduce cost. Because the hypervolume measures the trade-off surface between objectives, maximizing EHVI efficiently pushes the frontier outward, ensuring consistent progress toward discovering superior alloy designs.
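
For a deterministic illustration of the hypervolume improvement, the sketch below computes the dominated area of a two-objective front (both objectives maximized) and obtains the improvement as the difference in area before and after adding a candidate. This difference-based calculation is a swapped-in simplification, not the cell-based analytical EHVI used in this work, and the example front is hypothetical.

```python
def pareto_front(points):
    """Non-dominated subset for two maximized objectives, sorted by ascending f1."""
    front = []
    best_f2 = float("-inf")
    # Sweep from the largest f1 down; keep a point only if its f2 beats all seen so far.
    for f1, f2 in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if f2 > best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front[::-1]


def dominated_area(points, ref):
    """Area dominated by the front and bounded below by the reference point."""
    r1, r2 = ref
    area = 0.0
    prev_f2 = r2
    # Largest f1 first: each front point adds a horizontal strip of new height.
    for f1, f2 in reversed(pareto_front(points)):
        if f1 > r1 and f2 > prev_f2:
            area += (f1 - r1) * (f2 - prev_f2)
            prev_f2 = f2
    return area


def hypervolume_improvement(points, candidate, ref):
    """Increase in dominated area when `candidate` is added to `points`."""
    return dominated_area(list(points) + [candidate], ref) - dominated_area(points, ref)


# Hypothetical front in (utility, negative cost) space with reference point (0, 0).
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
gain = hypervolume_improvement(front, (2.5, 2.5), ref=(0.0, 0.0))
```

A candidate that is dominated by the existing front yields zero improvement, which is why EHVI concentrates on candidates likely to extend the front.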

In our preference-based bi-objective Bayesian optimization framework, the acquisition function (EHVI) is defined over individual candidate points, while the latent utility model requires pairwise comparisons for learning. Thus, at each iteration, we first identify the candidate that maximizes the analytical expected hypervolume improvement, selecting

image file: d6dd00009f-t25.tif

Because the latent utility is inferred from pairwise feedback, this EHVI-selected challenger is not evaluated alone, but is compared against the current incumbent x, defined as the design with the highest posterior mean utility under the PairwiseGP model,

image file: d6dd00009f-t26.tif

The next experiment therefore consists of presenting the pair (x, xn+1) to the user, who indicates which of the two is preferred. This comparison updates the latent utility posterior, while the cost of the challenger is evaluated directly. This incumbent–challenger mechanism allows EHVI to guide exploration in the multi-objective space, while remaining consistent with the structure of pairwise preference learning.

2.5. Ground truth for alloy color: Thermo-Calc optical model

Alloy color was computed using the Thermo-Calc optical properties model, which predicts visible appearance of metals from equilibrium phase constitution and first-principles-informed dielectric data.10,11 For each composition x at the specified thermodynamic condition, the workflow proceeds as follows.

The electronic-structure origins of the dielectric response for individual elements (e.g., Pt or Al) are not analyzed explicitly here; instead, we treat the Thermo-Calc optical properties model as a ground-truth oracle that deterministically maps alloy composition to visible color for the purposes of preference-based optimization.

The equilibrium phases $\{\alpha_i\}_{i=1}^{N}$ present at a given alloy composition x and their corresponding volume fractions $\{v_i\}_{i=1}^{N}$ were obtained from the Equilibrium Calculator. Here, N denotes the number of thermodynamically stable phases at composition x, $\alpha_i$ labels the i-th equilibrium phase, and $v_i$ is its volume fraction, satisfying $\sum_{i=1}^{N} v_i = 1$. These phase identities and volume fractions constitute the phase-resolved inputs required for subsequent optical mixing calculations.

For each phase $\alpha_i$, the complex dielectric function $\varepsilon_i(\lambda) = \varepsilon_{1,i}(\lambda) + i\,\varepsilon_{2,i}(\lambda)$ was taken from the Thermo-Calc noble-metal optical database as a function of wavelength λ.11,22 For multiphase alloys, these were combined using the generalized Bruggeman effective-medium model to obtain an effective dielectric function $\varepsilon_{\mathrm{eff}}(\lambda)$ satisfying

 
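$$\sum_{i=1}^{N} v_i\, \frac{\varepsilon_i(\lambda) - \varepsilon_{\mathrm{eff}}(\lambda)}{\varepsilon_i(\lambda) + 2\,\varepsilon_{\mathrm{eff}}(\lambda)} = 0, \tag{1}$$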
where vi and εi are the volume fraction and dielectric function of phase αi, respectively.23,24
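
For two phases, an effective-medium condition of this kind can be solved in closed form. The sketch below assumes the common Bruggeman form for spherical inclusions, $\sum_i v_i (\varepsilon_i - \varepsilon_{\mathrm{eff}})/(\varepsilon_i + 2\varepsilon_{\mathrm{eff}}) = 0$, which reduces to a quadratic in the effective dielectric function. The root-selection rule and the dielectric values are hypothetical, and the Thermo-Calc implementation may differ.

```python
import cmath


def bruggeman_two_phase(eps1, eps2, v1):
    """Effective dielectric function of a two-phase mixture (assumed Bruggeman form).

    Solves 2*e^2 - b*e - eps1*eps2 = 0 with b = (3*v1 - 1)*eps1 + (3*v2 - 1)*eps2.
    """
    if not 0.0 <= v1 <= 1.0:
        raise ValueError("volume fraction must lie in [0, 1]")
    v2 = 1.0 - v1
    b = (3.0 * v1 - 1.0) * eps1 + (3.0 * v2 - 1.0) * eps2
    root = cmath.sqrt(b * b + 8.0 * eps1 * eps2)
    candidates = [(b + root) / 4.0, (b - root) / 4.0]
    # Heuristic: keep the root with the larger imaginary part (absorbing medium).
    return max(candidates, key=lambda e: e.imag)


def bruggeman_residual(eps_eff, phases):
    """Left-hand side of the mixing condition for (eps_i, v_i) pairs."""
    return sum(v * (e - eps_eff) / (e + 2.0 * eps_eff) for e, v in phases)


# Hypothetical dielectric values at a single wavelength.
eps_a, eps_b = complex(-10.0, 1.5), complex(-3.0, 0.8)
eps_mix = bruggeman_two_phase(eps_a, eps_b, v1=0.4)
```

The residual function gives a quick check that the selected root satisfies the assumed mixing condition.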

The effective complex refractive index ñ(λ) = n(λ) + ik(λ) was computed from εeff(λ) via

 
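$$n(\lambda) = \sqrt{\frac{\sqrt{\varepsilon_1^2 + \varepsilon_2^2} + \varepsilon_1}{2}}, \tag{2}$$
 
$$k(\lambda) = \sqrt{\frac{\sqrt{\varepsilon_1^2 + \varepsilon_2^2} - \varepsilon_1}{2}}, \tag{3}$$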
where ε1 and ε2 are the real and imaginary parts of εeff.

Assuming light incident from air onto a semi-infinite alloy surface, the wavelength-dependent reflectance R(λ) was calculated using the Fresnel equations for s- and p-polarized light at incidence angle θ:

 
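$$R_s(\lambda) = \left|\frac{\tilde n_1\cos\theta - \tilde n_2\cos\theta_t}{\tilde n_1\cos\theta + \tilde n_2\cos\theta_t}\right|^2, \tag{4}$$
 
$$R_p(\lambda) = \left|\frac{\tilde n_1\cos\theta_t - \tilde n_2\cos\theta}{\tilde n_1\cos\theta_t + \tilde n_2\cos\theta}\right|^2, \tag{5}$$
with $\tilde n_1 = 1$ (air), $\tilde n_2 = \tilde n(\lambda)$ (alloy), and $\theta_t$ determined by Snell's law $\tilde n_1 \sin\theta = \tilde n_2 \sin\theta_t$. The unpolarized reflectance was taken as $R(\lambda) = \tfrac{1}{2}\big[R_s(\lambda) + R_p(\lambda)\big]$.24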
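
The chain from dielectric function to reflectance can be sketched with complex arithmetic from the standard library. The code below assumes the standard Fresnel amplitude coefficients for an air/metal interface and uses the principal complex square root for the refractive index; the dielectric value is hypothetical and the function names are illustrative.

```python
import cmath
import math


def refractive_index(eps):
    """Complex refractive index n + ik as the principal square root of eps."""
    return cmath.sqrt(eps)


def fresnel_reflectance(n2, theta_deg=0.0, n1=1.0):
    """Unpolarized reflectance for light incident from medium n1 onto medium n2."""
    theta = math.radians(theta_deg)
    cos_i = math.cos(theta)
    sin_t = n1 * math.sin(theta) / n2          # Snell's law with a complex index
    cos_t = cmath.sqrt(1.0 - sin_t * sin_t)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n1 * cos_t - n2 * cos_i) / (n1 * cos_t + n2 * cos_i)
    return 0.5 * (abs(r_s) ** 2 + abs(r_p) ** 2)


# Hypothetical effective dielectric function at one wavelength.
n_alloy = refractive_index(complex(-3.75, 2.0))   # equals 0.5 + 2.0j
R_normal = fresnel_reflectance(n_alloy)           # normal incidence
R_oblique = fresnel_reflectance(n_alloy, theta_deg=45.0)
```

At normal incidence the result reduces to the familiar expression ((n − 1)² + k²)/((n + 1)² + k²).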

Visible color was obtained by integrating the reflectance spectrum over the visible range (400–700 nm) in the CIE XYZ framework:25

 
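$$X = k\int_{400\,\mathrm{nm}}^{700\,\mathrm{nm}} S(\lambda)\,R(\lambda)\,\bar{x}(\lambda)\,\mathrm{d}\lambda, \tag{6}$$
 
$$Y = k\int_{400\,\mathrm{nm}}^{700\,\mathrm{nm}} S(\lambda)\,R(\lambda)\,\bar{y}(\lambda)\,\mathrm{d}\lambda, \tag{7}$$
 
$$Z = k\int_{400\,\mathrm{nm}}^{700\,\mathrm{nm}} S(\lambda)\,R(\lambda)\,\bar{z}(\lambda)\,\mathrm{d}\lambda, \tag{8}$$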
where S(λ) is the spectral power distribution of the chosen standard illuminant (default D65), $\bar{x}$, $\bar{y}$, $\bar{z}$ are the CIE color-matching functions for the selected standard observer (2° or 10°), and k is a normalization constant such that Y = 100 for a perfectly reflecting surface.25 The resulting tristimulus vector (X, Y, Z) was then converted to the reported color space (CIE Lab or sRGB) using standard CIE transforms.
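
The integration and color-space conversion can be sketched as follows. The example assumes trapezoidal integration over tabulated spectra and the standard linear XYZ-to-sRGB matrix with the sRGB transfer function for a D65 white point. The illuminant and color-matching arrays must be supplied from CIE tables and are not included here; the function names are hypothetical.

```python
def trapezoid(values, wavelengths):
    """Trapezoidal rule for samples on a (possibly non-uniform) wavelength grid."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(values) - 1)
    )


def tristimulus(wavelengths, reflectance, illuminant, xbar, ybar, zbar):
    """CIE XYZ with k chosen so that Y = 100 for a perfect reflector."""
    k = 100.0 / trapezoid([s * y for s, y in zip(illuminant, ybar)], wavelengths)

    def integrate(cmf):
        integrand = [s * r * c for s, r, c in zip(illuminant, reflectance, cmf)]
        return k * trapezoid(integrand, wavelengths)

    return integrate(xbar), integrate(ybar), integrate(zbar)


def xyz_to_srgb(X, Y, Z):
    """XYZ (Y scaled to 100) to gamma-encoded sRGB in [0, 1], D65 white point."""
    x, y, z = X / 100.0, Y / 100.0, Z / 100.0
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def encode(c):
        c = min(max(c, 0.0), 1.0)  # clip out-of-gamut values
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055

    return tuple(encode(c) for c in linear)


# Placeholder flat spectra (NOT real CIE data), used only to exercise the code.
grid = [400.0, 500.0, 600.0, 700.0]
flat = [1.0] * len(grid)
X, Y, Z = tristimulus(grid, [0.5] * len(grid), flat, flat, flat, flat)
```

With real color-matching functions and a computed reflectance spectrum, `xyz_to_srgb(X, Y, Z)` returns the color that would be rendered for a candidate alloy.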

2.6. Definition of design space

It is of interest to assess how many alloy compositions within this high-dimensional design space can be evaluated using the Thermo-Calc optical properties model. The alloy design space, denoted image file: d6dd00009f-t37.tif, was defined as a quinary composition manifold spanned by {Au, Ag, Cu, Pt, Al}, with five composition variables constrained to sum to 100 at%, and sampled on a uniform grid in 5 at% increments. Only quinary alloys were considered; that is, each candidate composition contained at least 5 at% of every element, which substantially increases the combinatorial complexity of the color-composition mapping and provides a stringent test of the Thermo-Calc noble-metal optical database. Enumeration of this constrained 5 at% grid yielded 3876 distinct compositions. For each image file: d6dd00009f-t38.tif, the Thermo-Calc optical properties-noble model was used to compute equilibrium phase fractions and wavelength-resolved optical properties.11 Due to database coverage limitations, complete dielectric and optical solutions were successfully obtained for 933 of the 3876 quinary compositions, as shown in Fig. 1. These 933 alloys constitute the design space that is to be explored with optimization in search of optimal color.
image file: d6dd00009f-f1.tif
Fig. 1 Affine projection of calculated colors using Thermo-Calc's optical property module.
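
The size of the constrained grid can be checked by direct enumeration. The sketch below lists every five-component composition on a 5 at% grid in which each element is present at 5 at% or more and the total is 100 at%; the function name is hypothetical.

```python
from itertools import product

ELEMENTS = ("Au", "Ag", "Cu", "Pt", "Al")


def enumerate_grid(n_elements=5, step=5, total=100, minimum=5):
    """All compositions (in at%) on the grid with every element >= minimum."""
    levels = range(minimum, total + 1, step)
    grid = []
    for head in product(levels, repeat=n_elements - 1):
        last = total - sum(head)  # the final element closes the balance
        if last >= minimum:
            grid.append(head + (last,))
    return grid


compositions = enumerate_grid()
n_compositions = len(compositions)  # 3876, matching the count reported above
```

The count agrees with the combinatorial result for distributing twenty 5 at% increments over five elements with at least one increment each, C(19, 4) = 3876.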

2.7. Preference-based Bayesian optimization of ring color

We iteratively learned an individual user's color preference over the quinary composition space using preference-based Bayesian optimization. Each alloy composition x induces a visible ring color via the optical forward model described above; in an ideal experimental workflow this color would be obtained from a synthesized ring, but here it is computed in silico. Preference learning is formulated through a latent scalar utility function f(x) that represents the user's (unobserved) subjective valuation of ring color. The objective is to identify compositions that maximize f(x) while minimizing the number of user queries.

At iteration t, the user is presented with two candidate rings having compositions $x_t^{(a)}$ and $x_t^{(b)}$. For each candidate, the corresponding color is predicted by the Thermo-Calc optical model and rendered onto a ring visualization to facilitate realistic comparison. The user provides a binary preference,

image file: d6dd00009f-t39.tif
which is recorded as a noisy observation of the ordering of latent utilities. Collecting mt such comparisons yields a dataset image file: d6dd00009f-t40.tif.

At each iteration, we select the next candidate by

image file: d6dd00009f-t41.tif
where image file: d6dd00009f-t42.tif is the candidate composition set. The newly selected composition is paired with a reference candidate (e.g., the current incumbent best) to form a two-ring query shown to the user; the resulting preference is added to image file: d6dd00009f-t43.tif, the Pairwise GP posterior is updated, and the process repeats. In this way, user feedback on rendered ring colors is converted into pairwise constraints on f(x), allowing the model to efficiently converge to compositions that best match the user's aesthetic preference.

3. Results and discussion

3.1. Optimizing color

In this case study, we conducted a color-preference-based ring optimization campaign for 25 iterations. In practice, a user would be presented with two color options and would pick the option they preferred. However, because color preference is inherently subjective and varies across individuals, we adopted an in silico benchmark to enable controlled evaluation. Specifically, to have a “simulated user” select its preferred ring colors, we defined a synthetic ground-truth latent preference function whose optimum corresponds to a canonical gold appearance, i.e., the alloy with the highest Au content among the 933 candidates for which the optical model converged, namely Ag05Al05Au80Cu05Pt05. The “gold likeness” of a given color is an equally weighted combination of the color distance in HSV space and the difference in a hue-based warmth metric between the candidate and the optimum color. This gives the simulated user a quantitative stand-in for what is normally a qualitative human judgment. The utility itself was never exposed to the optimizer; each query returned only a pairwise choice consistent with this latent “gold” utility, thereby emulating an internal but undisclosed aesthetic target.
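
A simulated user of this kind can be sketched as below. The equal weighting follows the description above, but the specific HSV distance, the warmth metric (cosine closeness of hue to an assumed warm reference hue), and the target RGB value are all hypothetical stand-ins rather than the definitions used in this work.

```python
import colorsys
import math

WARM_HUE = 45.0 / 360.0          # assumed "warm" reference hue (hypothetical)
TARGET_RGB = (1.0, 0.84, 0.0)    # placeholder gold color, not the Thermo-Calc value


def hsv_distance(rgb_a, rgb_b):
    """Euclidean distance in HSV, treating hue as a circular coordinate."""
    ha, sa, va = colorsys.rgb_to_hsv(*rgb_a)
    hb, sb, vb = colorsys.rgb_to_hsv(*rgb_b)
    dh = min(abs(ha - hb), 1.0 - abs(ha - hb))
    return math.sqrt(dh ** 2 + (sa - sb) ** 2 + (va - vb) ** 2)


def warmth(rgb):
    """Hue-based warmth in [-1, 1]; equals 1 when the hue matches WARM_HUE."""
    hue = colorsys.rgb_to_hsv(*rgb)[0]
    return math.cos(2.0 * math.pi * (hue - WARM_HUE))


def gold_likeness(rgb, target=TARGET_RGB):
    """Latent utility: 0 at the target color, more negative further away."""
    return -(0.5 * hsv_distance(rgb, target) + 0.5 * abs(warmth(rgb) - warmth(target)))


def simulated_choice(rgb_a, rgb_b, target=TARGET_RGB):
    """Return 0 if the first color is preferred, 1 otherwise."""
    return 0 if gold_likeness(rgb_a, target) >= gold_likeness(rgb_b, target) else 1


# Hypothetical query: a gold-like color versus a silver-like color.
choice = simulated_choice((0.90, 0.75, 0.30), (0.75, 0.75, 0.78))
```

Only the returned choice would be passed to the preference model, so the optimizer never sees the utility values themselves.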

Starting from an initial set of candidate rings, the campaign proceeded by iteratively proposing two color-rendered options, eliciting a preference, and updating the pairwise Gaussian process posterior over latent utility. Over 25 iterations, the model progressively concentrated sampling near compositions predicted to yield higher gold-likeness, producing a clear trajectory toward the gold-dominant region of the design space, as seen in Fig. 2. To complement this visualization, the SI reports the optimization quantitatively, showing the R, G, and B channel values of sampled alloys moving toward those of the target gold color over successive iterations. This convergence demonstrates that preference-based Bayesian optimization can efficiently infer and exploit a user's latent aesthetic utility using only pairwise feedback.8,9


image file: d6dd00009f-f2.tif
Fig. 2 Progress of evaluated alloy colors along with their associated preference ranking.

These results indicate that the framework converges, as queried colors become increasingly gold-like over successive iterations. To assess the robustness of this behavior, we performed additional runs under varied hyperparameter and initialization settings; the resulting convergence trends were consistent and are reported in the SI. We further evaluated robustness to surrogate-model choices by repeating the optimization with alternative Gaussian-process kernels and a random sampling baseline, with results provided in the SI.

3.2. Optimizing color and cost

In the previous case study, the user's selections drove the preference-based optimization toward increasingly gold-like colors, reflecting the latent aesthetic utility learned during the campaign. However, in practical jewelry design, cost is a dominant and often competing consideration: Au is a particularly expensive alloying element, and if a user has a higher preference for gold colors, that usually entails higher material cost.

In this case study, we demonstrate a bi-objective optimization over ring color and price. As before, we model the color objective with a pairwise GP, while price is modeled with a standard GP regressor. Because price is relatively easy to predict from rule-of-mixtures estimates of elemental costs, we train the price GP once on all candidate alloys upfront; only the preference GP for color is updated iteratively as new comparisons are collected. At each iteration, the EHVI acquisition function combines predictions from both GPs to propose new alloys that are expected to improve the current color-price Pareto front, i.e., alloys whose colors the user is likely to prefer at an attractive price point. We initialize the optimization with two data points, the minimum a pairwise GP requires to form an initial preference, and then run the campaign for 25 iterations.
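
A rule-of-mixtures price estimate can be sketched as follows. The example assumes that prices per gram are weighted by mass fraction, obtained from at% using standard atomic masses. The Au and Pt prices reuse the approximate figures quoted in the Introduction, while the Ag, Cu, and Al prices are hypothetical placeholders; the weighting actually used in this work may differ.

```python
ATOMIC_MASS = {"Au": 196.97, "Ag": 107.87, "Cu": 63.546, "Pt": 195.08, "Al": 26.982}  # g/mol

# Price per gram in USD. Au and Pt follow the rough figures in the Introduction;
# Ag, Cu, and Al are hypothetical placeholder values.
PRICE_PER_GRAM = {"Au": 130.0, "Pt": 50.0, "Ag": 1.0, "Cu": 0.01, "Al": 0.003}


def mass_fractions(at_percent):
    """Convert a composition in at% to mass fractions."""
    masses = {el: at * ATOMIC_MASS[el] for el, at in at_percent.items()}
    total = sum(masses.values())
    return {el: m / total for el, m in masses.items()}


def price_per_gram(at_percent):
    """Mass-weighted rule-of-mixtures price of the alloy per gram."""
    weights = mass_fractions(at_percent)
    return sum(w * PRICE_PER_GRAM[el] for el, w in weights.items())


# Composition of the target alloy named in Section 3.1, in at%.
target = {"Ag": 5, "Al": 5, "Au": 80, "Cu": 5, "Pt": 5}
target_price = price_per_gram(target)
```

Because the price is a deterministic function of composition, it can be tabulated for every candidate before the preference campaign begins.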

To analyze the results of this optimization campaign, we visualize the preference-rank-versus-price Pareto front, as shown in Fig. 3. As expected, more metallic silver-like colors tend to have lower costs, whereas more gold-like colors (i.e., those closer to the preferred target color) are generally more expensive. A key advantage of this approach is that it presents users with a menu of Pareto-efficient options: they can select alloys along the color-price front according to their own willingness to pay for a given color when choosing candidates for further scale-up.20


Fig. 3 Pareto front of preference rank versus price per gram, showing the trade-off between a dummy user's learned preferred ring color and the cost of the ring.
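A Pareto-efficient menu of the kind shown in Fig. 3 can be extracted from evaluated candidates with a simple non-dominated filter. The following is a minimal sketch, assuming each candidate carries a learned utility score (higher is preferred) and a price per gram (lower is preferred). The candidate labels and values are hypothetical and are not results from this study.

```python
def pareto_front(candidates):
    """Return the candidates that are not dominated in (higher utility, lower price).

    Each candidate is a tuple (label, utility, price_per_gram).
    """
    front = []
    for label, u, p in candidates:
        # A candidate is dominated if another one is at least as good in both
        # objectives and strictly better in at least one of them.
        dominated = any(
            (u2 >= u and p2 <= p) and (u2 > u or p2 < p)
            for _, u2, p2 in candidates
        )
        if not dominated:
            front.append((label, u, p))
    # Sort by price so the result reads as a menu from cheapest to most expensive.
    return sorted(front, key=lambda c: c[2])


# Hypothetical (label, utility, price per gram) values for illustration only.
candidates = [
    ("A", 0.9, 100.0),
    ("B", 0.5, 40.0),
    ("C", 0.4, 60.0),  # dominated by B: lower utility at a higher price
    ("D", 0.1, 5.0),
]
for label, u, p in pareto_front(candidates):
    print(label, u, p)
```

Sorting the front by price lets a user read off how much additional utility each increment in cost buys.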

4. Conclusions

This work demonstrates a preference-based, human-in-the-loop Bayesian optimization framework for ring alloy design that unites materials science, statistical modeling, and jewelry customization. Using Thermo-Calc's optical model and noble-metal databases, we constructed a quinary Au–Ag–Cu–Pt–Al design space, computed equilibrium phase constitutions, and mapped each composition to a predicted visible color. These computations provide a physically grounded link between alloy chemistry and ring appearance.

On top of this forward model, we implemented PBO to infer a latent utility function over alloy composition from pairwise comparisons. In a simulated-user setting, the framework efficiently retrieved target colors within a limited number of queries, illustrating that pairwise feedback alone is sufficient to drive the search toward compositions that match an individual aesthetic preference. Extending this approach to a bi-objective setting, we incorporated a cost model and used expected hypervolume improvement to construct a color-cost Pareto front. This front provides users or designers with a transparent set of trade-off options between visual appeal and material expense.

Although the present study relies on in silico data and an idealized user model, the results highlight a practical path toward interactive, data-driven alloy design for jewelry. Future work will focus on experimental validation of selected compositions, integrating real user studies, and developing a more accessible graphical interface that connects preference learning, Thermo-Calc simulations, and visualization of color-cost trade-offs. Taken together, these developments point toward a customizable ring design workflow in which users can explore personalized aesthetic choices within physically realistic and economically informed alloy spaces.

Author contributions

Chase Katz: writing – original draft, writing – review & editing, visualization, software, investigation, formal analysis, data curation. Ting-Yu Yang: writing – original draft, writing – review & editing, visualization, software, investigation, formal analysis, data curation. Parker King: writing – original draft, writing – review & editing, visualization, software, investigation, formal analysis, data curation. Md Shafiqul Islam: writing – original draft, writing – review & editing, visualization, software, investigation, formal analysis, data curation, supervision. Brent Vela: writing – original draft, writing – review & editing, visualization, software, investigation, formal analysis, supervision. Raymundo Arróyave: writing – review & editing, supervision, funding acquisition.

Conflicts of interest

There are no conflicts of interest.

Data availability

The full implementation of the preferential Bayesian optimization color selection model is archived on Zenodo. The repository contains the complete source code and interactive notebook used in this study, including alloy color visualization, Gaussian process-based preference modeling, interactive user preference updates, and visualization of aesthetic-cost trade-offs for tested alloys.

The code supporting this article can be accessed at: https://doi.org/10.5281/zenodo.18164808.

Optical color outputs shown in this article were generated using Thermo-Calc with a licensed database that cannot be redistributed publicly. To enable transparency and reuse, the workflow reports and visualizes obfuscated output data, included in the repository. Reproducing the full workflow requires access to Thermo-Calc.

Supplementary information (SI): additional figures, tables, computational details, model parameters, and supporting data related to the preference-based Bayesian optimization workflow and alloy color optimization results. See DOI: https://doi.org/10.1039/d6dd00009f.

Acknowledgements

We acknowledge the support from the U.S. Department of Energy (DOE) ARPA-E ULTIMATE Program through Project DE-AR0001427 and ARPA-E CHADWICK Program through Project DE-AR0001988. RA also acknowledges the Army Research Laboratory (ARL) for support through Cooperative Agreement Number W911NF-22-2-0106, as part of the High-throughput Materials Discovery for Extreme Conditions (HTMDEC) program as supported by the BIRDSHOT Center at Texas A&M University. BV acknowledges the support of NSF through Grant No. 1746932 (GRFP) and 1545403 (NRT-D3EM). The authors gratefully acknowledge Thermo-Calc Software AB for providing access to Thermo-Calc and associated optical-property databases used in this work.

References

  1. M. Cifarelli, Adornment, identity, and authenticity: Ancient jewelry in and out of context, Am. J. Archaeol., 2010, 114(1), 1–9, DOI:10.3764/ajaonline114.1.Cifarelli, https://ajaonline.org/online-museum-review/368/.
  2. P. Kiernan and K.-P. Henz, Rings from the Forbidden Forest: The function and meaning of Roman trinket rings, J. Roman Archaeol., 2023, 36(1), 73–95, DOI:10.1017/S1047759423000211.
  3. M. Abram, A new look at the Mesopotamian rod and ring: Emblems of time and eternity, Stud. Antiq., 2011, 10(1), 15–36, https://scholarsarchive.byu.edu/studiaantiqua/vol10/iss1/5/.
  4. M. Thoury, B. Mille, T. Séverin-Fabiani, L. Robbiola, M. Réfrégiers, J.-F. Jarrige and L. Bertrand, High spatial dynamics–photoluminescence imaging reveals the metallurgy of the earliest lost-wax cast object, Nat. Commun., 2016, 7, 13356, DOI:10.1038/ncomms13356.
  5. Grand View Research, Jewelry market size, share & trends analysis report, 2024, https://www.grandviewresearch.com/industry-analysis/jewelry-market.
  6. Bullion-Rates, Gold prices – US dollars (USD) – November 2025, accessed November 22, 2025, https://www.bullion-rates.com/gold/USD/2025-11-history.htm.
  7. Bullion-Rates, Platinum prices – US dollars (USD) – November 2025, accessed November 22, 2025, https://www.bullion-rates.com/platinum/USD/2025-11-history.htm.
  8. E. Brochu, V. M. Cora and N. de Freitas, A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning, arXiv, 2010, preprint, arXiv:1012.2599, DOI:10.48550/arXiv.1012.2599.
  9. W. Chu and Z. Ghahramani, Preference learning with Gaussian processes, Proceedings of the 22nd international conference on Machine learning, ACM, 2005.
  10. J.-O. Andersson, et al., Thermo-Calc and DICTRA, computational tools for materials science, Calphad, 2002, 26, 273–312.
  11. Thermo-Calc Software AB, Noble metal based alloys database, 2025, https://thermocalc.com/products/databases/noble-metal-based-alloys/.
  12. R. Astudillo, K. Li, M. Tucker, C. X. Cheng, A. D. Ames and Y. Yue, Preferential multi-objective Bayesian optimization, arXiv, 2024, preprint, arXiv:2406.14699, DOI:10.48550/arXiv.2406.14699.
  13. M. Abdolshah, A. Shilton, S. Rana, S. Gupta and S. Venkatesh, Multi-objective Bayesian optimisation with preferences over objectives, arXiv, 2019, preprint, arXiv:1902.04228, DOI:10.48550/arXiv.1902.04228.
  14. J. R. Deneault, W. Kim, J. Kim, Y. Gu, J. Chang, B. Maruyama, J. I. Myung and M. A. Pitt, Preferential Bayesian optimization improves the efficiency of printing objects with subjective qualities, Digital Discovery, 2025, 4(3), 723–737, DOI:10.1039/D4DD00320A.
  15. Z. J. Lin, R. Astudillo, P. I. Frazier and E. Bakshy, Preference exploration for efficient Bayesian optimization with multiple outcomes, arXiv, 2022, preprint, arXiv:2203.11382, DOI:10.48550/arXiv.2203.11382.
  16. S. Takeno, M. Nomura and M. Karasuyama, Towards practical preferential Bayesian optimization with skew Gaussian processes, arXiv, 2023, preprint, arXiv:2302.01513, DOI:10.48550/arXiv.2302.01513.
  17. R. Astudillo, Z. J. Lin, E. Bakshy and P. I. Frazier, qEUBO: A decision-theoretic acquisition function for preferential Bayesian optimization, arXiv, 2023, preprint, arXiv:2303.15746, DOI:10.48550/arXiv.2303.15746.
  18. D. R. Jones, M. Schonlau and W. J. Welch, Efficient global optimization of expensive black-box functions, J. Global Optim., 1998, 13, 455–492.
  19. M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson and E. Bakshy, BoTorch: a framework for efficient Monte Carlo Bayesian optimization, Adv. Neural Inf. Process. Syst., 2020, 33, 21524–21538.
  20. M. T. M. Emmerich and A. H. Deutz, A tutorial on multiobjective optimization: fundamentals and evolutionary methods, Nat. Comput., 2018, 17(3), 585–609, DOI:10.1007/s11047-018-9685-y.
  21. M. Emmerich and J.-W. Klinkenberg, The computation of the expected hypervolume improvement in multiobjective optimization, Operations Research Proceedings, 2011.
  22. P. B. Johnson and R. W. Christy, Optical constants of the noble metals, Phys. Rev. B, 1972, 6, 4370–4379.
  23. D. A. G. Bruggeman, Berechnung verschiedener physikalischer Konstanten von heterogenen Substanzen. I. Dielektrizitätskonstanten und Leitfähigkeiten der Mischkörper aus isotropen Substanzen, Ann. Phys., 1935, 416, 636–664.
  24. C. F. Bohren and D. R. Huffman, Absorption and Scattering of Light by Small Particles, Wiley, 1983.
  25. International Commission on Illumination (CIE), Colorimetry, CIE Publication 15:2004, Commission Internationale de l'Eclairage, Vienna, Austria, 3rd edn, 2004.

This journal is © The Royal Society of Chemistry 2026