A data-driven perspective on the colours of metal–organic frameworks

Colour is at the core of chemistry and has fascinated humans since ancient times. It is also a key descriptor of the optoelectronic properties of materials and is often used to assess the success of a synthesis. However, predicting the colour of a material based on its structure is challenging. In this work, we leverage subjective and categorical human assignments of colours to build a model that can predict the colour of compounds on a continuous scale. In the process of developing the model, we also uncover inadequacies in current reporting practices. For example, we show that the majority of colour assignments are subject to a perceptive spread that would not comply with common printing standards. To remedy this, we suggest and implement an alternative way of reporting colour, and chemical data in general. All data is captured in an objective and standardised form in an electronic lab notebook and subsequently exported automatically to a repository in open formats, from where it can be interactively explored by other researchers. We envision this to be key for a data-driven approach to chemical research.


Exploratory data analysis of the online survey
We invited our followers on Twitter to participate in the survey, and also shared the link to the survey with all members of the schools of basic science and engineering at EPFL. In total, 4184 colours were picked by the participants. Participants could take the survey as often as they wished.
In Supplementary Figure 1 we show the distribution of the time users took to pick a colour in our survey. The distribution is skewed, with a mean of 34.8 s and a median of 21.6 s. The maximum is 4113 s and the minimum 0.76 s. From the analysis, we eliminated 309 entries for which the users took less than 5 s (ca. 2.6 %) or more than 80 s (ca. 4.9 %).
It is interesting to analyse for which colour strings the users took the most or least time. These are shown in Figure 2, where there is a striking spread in the median time participants took to select a colour depending on the colour name. For use in scholarly communication, the name must be unambiguous. As a surrogate for the ambiguity we use the variance of the colour picks in RGB space. Care has to be taken here, as the sensitivity of the human eye is not uniform across the colour space; we therefore also calculated standard deviations that attempt to take this into account by weighting the channels with the luma weights, i.e., the red channel with 0.299, the green channel with 0.587, and the blue channel with 0.114. We furthermore removed outliers using a z = 2.5 threshold on each colour channel.
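As a concrete sketch of this cleaning step (the picks below are made up; only the luma weights and the z = 2.5 threshold are taken from the text), the outlier removal and the unweighted and luma-weighted standard deviations could be computed as follows:

```python
import numpy as np

# Hypothetical picks (8-bit RGB) for one colour term; the last row is an outlier.
picks = np.array(
    [[250, 200, 10]] * 5 + [[248, 202, 14]] * 4 + [[120, 60, 200]],
    dtype=float,
)

LUMA = np.array([0.299, 0.587, 0.114])  # Rec. 601 luma weights


def remove_outliers(picks, z=2.5):
    """Drop picks whose z-score exceeds the threshold on any channel."""
    mu, sigma = picks.mean(axis=0), picks.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # guard against constant channels
    keep = (np.abs(picks - mu) / sigma < z).all(axis=1)
    return picks[keep]


cleaned = remove_outliers(picks)
unweighted_std = float(cleaned.std(axis=0).mean())       # plain RGB spread
weighted_std = float((cleaned.std(axis=0) * LUMA).sum())  # luma-weighted spread
```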
We show the colours with the maximum and minimum standard deviations in RGB space in Supplementary Figure 3. Both in the unweighted and in the weighted measurement, yellowish colours show a high standard deviation. One can get a good intuition for the choices of the participants by showing the choices for each colour term next to each other, as in Supplementary Figures 4-6. Interestingly, some less widely used colour names, such as amber, show a high variance. This is likely due to a linguistic problem. But also for more common colour names, such as "yellow red" or "deep yellow", we observe a considerable variation in the colours that were picked by the users. Another interesting question is whether there is a relationship between the variance and the time the users take to select a colour, which might indicate that there are some colours for which observers are generally more uncertain. The alternative would be that for some colours there is simply a wide range in perception that is independent of how certain the participants are, i.e., how long they take to pick a colour. Notably, we observe no correlation (cf. Supplementary Figure 7), especially for the unweighted variance.
The use of our survey data to encode the colour strings is much more useful if our results are representative. A good way to estimate whether this is the case is to compare our results with the ones from the xkcd survey, where nearly half a million users were asked the reverse question, i.e., to give a name to a colour which they were shown. The comparison is shown in Supplementary Figure 8: the median colour of our cleaned survey data is generally close to the result from the xkcd survey. We find a mean ∆E*ab of 8.4 and a median ∆E*ab of 6.9; these values are smaller than the average variance in our survey. The comparison is continued in Supplementary Figure 9, where we aggregated our data using the mean of the cleaned data.

Perceptive spread in colours
Supplementary Figure 10 shows the cumulative distribution of the difference between the colours that were picked in our survey for a given colour string.
To reflect that the use of colour strings is not uniform in the Cambridge Structural Database (CSD), we also weight those mean differences by the frequency of the colour string in the metal-organic framework (MOF) subset of the CSD. This plot shows that, for most colour strings, the spread in the colours that people pick for the same string is so large that it would not comply with common colour reproduction standards, potentially also limiting how well the reproducibility of syntheses can be assessed.
Importantly, this plot also shows that we cannot choose too tight a cutoff on the variance of the colours, as we would otherwise limit the size of our training set too drastically. To trade off the size of the training set against the variance of the perception, we chose to only include colours in our training set for which the mean pairwise ∆E*ab is below 16 (vertical line in Supplementary Figure 10).
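A minimal sketch of this filtering step, assuming the CIE76 definition of ∆E*ab (Euclidean distance in CIELAB after an sRGB-to-Lab conversion with a D65 white point; the study may have used a different ∆E variant or reference white):

```python
import numpy as np
from itertools import combinations


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    rgb = np.asarray(rgb, dtype=float) / 255.0
    # Undo the non-linear sRGB encoding (decoding CCTF)
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear sRGB -> XYZ (D65)
    M = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = M @ lin
    white = np.array([0.95047, 1.0, 1.08883])
    t = xyz / white
    f = np.where(t > (6 / 29) ** 3, np.cbrt(t), t / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[1] - 16
    a = 500 * (f[0] - f[1])
    b = 200 * (f[1] - f[2])
    return np.array([L, a, b])


def delta_e_ab(rgb1, rgb2):
    """CIE76 colour difference: Euclidean distance in CIELAB."""
    return float(np.linalg.norm(srgb_to_lab(rgb1) - srgb_to_lab(rgb2)))


def mean_pairwise_delta_e(picks):
    return float(np.mean([delta_e_ab(p, q) for p, q in combinations(picks, 2)]))


def passes_threshold(picks, cutoff=16.0):
    """Keep only colour terms whose survey picks agree well enough."""
    return mean_pairwise_delta_e(picks) < cutoff
```

Black and white differ by ∆E*ab = 100 (the full lightness range), which gives a useful sanity check for the conversion.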

MOF structures and colour labels
For this study, we used a subset of structures from the Computation-Ready, Experimental (CoRE)-2019 MOF database, 1 for which the colour was deposited in the CSD version 5.4 (November 2019). 2 We used the CSD Python API to retrieve the colour attribute of the CSD entries. We restricted ourselves to structures from the CoRE-MOF database as the code that generates the revised autocorrelation (RAC) fingerprints assumes that the structures are non-disordered. For 9525 structures in the CoRE database we could generate RAC fingerprints (we reused the ones we generated for the work by Moosavi et al. 3 ). For 8632 of the structures in the CoRE database we found a colour in the CSD. After filtering out structures for which the featurisation failed or for which the colour label showed too high a variance in our survey, we ended up with 6423 structures, from which we dropped 590 duplicates to avoid data leakage and biases.

Model architecture
During the development of the model we experimented with many model architectures, ranging from Bayesian neural networks (BNNs) over Gaussian process regression (GPR) (with coregionalised kernels) to gradient-boosted decision trees (GBDTs). Many of our development attempts are tracked on comet.ml (https://www.comet.ml/kjappelbaum/color-ml?shareable=jfE6okDmxlnYimYFFnsJcMCO6) and wandb 4 (https://app.wandb.ai/kjappelbaum/colorml/), and the corresponding code is also available on GitHub. Finally, we decided to use the LightGBM 5 implementation of GBDTs as it implements the quantile loss, which can be used to predict estimated prediction intervals. 6 The quantiles of a conditional distribution P(Y|x) estimate fα(x) = y subject to P(Y < y|x) = α (α ∈ (0, 1)). For example, the 0.5 quantile is the median and corresponds to minimising the mean absolute error, which is symmetric for over- and underprediction. The loss for other quantiles penalises negative errors more (for higher quantiles) or less (for lower quantiles) than positive errors. In this work, we trained GBDTs to predict the median as well as the 10th and 90th percentiles of the colour channels, which allows us to provide estimated prediction intervals.
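The quantile-regression idea can be sketched as follows. For a self-contained example we use scikit-learn's GradientBoostingRegressor, which exposes the same quantile loss, and synthetic data in place of the MOF features; with LightGBM the analogous configuration would be approximately LGBMRegressor(objective="quantile", alpha=q):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 3))
y = 100 * X[:, 0] + rng.normal(0, 10, size=500)  # synthetic "colour channel"

# One model per quantile: 10th percentile, median, 90th percentile
models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q, random_state=0).fit(X, y)
    for q in (0.1, 0.5, 0.9)
}

X_new = rng.uniform(0, 1, size=(50, 3))
lower = models[0.1].predict(X_new)
median = models[0.5].predict(X_new)
upper = models[0.9].predict(X_new)  # [lower, upper] is the estimated interval
```

Note that one independent model is fitted per quantile, so the predicted quantiles are not guaranteed to be monotonically ordered for every single sample.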

Revised autocorrelation functions (RACs)
Difference RACs are computed as

    P_d^diff = Σ_i^start Σ_j^scope (P_i − P_j) δ(d_ij, d),

where the atomic property P of atom i (part of the start atom list) is correlated with atom j (part of the scope atom list) when they are separated by d bonds (δ(d_ij, d) = 1 if the bondwise distance d_ij equals d, and 0 otherwise). Analogously, product RACs are defined as

    P_d^prod = Σ_i^start Σ_j^scope P_i P_j δ(d_ij, d).

For this work we considered a maximum depth of three. More details about the implementation can be found in Moosavi et al. 3
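A toy illustration of these definitions (not the actual RAC implementation referenced above; the adjacency matrix, property values, and atom lists are made up):

```python
from collections import deque


def bond_distances(adjacency, start):
    """Bondwise (graph) distances from `start`, via breadth-first search."""
    n = len(adjacency)
    dist = [-1] * n
    dist[start] = 0
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if adjacency[u][v] and dist[v] == -1:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def racs(adjacency, prop, start_atoms, scope_atoms, depth):
    """Product and difference RACs of atomic property `prop` at a given depth."""
    prod = diff = 0.0
    for i in start_atoms:
        dist = bond_distances(adjacency, i)
        for j in scope_atoms:
            if dist[j] == depth:  # delta(d_ij, d)
                prod += prop[i] * prop[j]
                diff += prop[i] - prop[j]
    return prod, diff


# Toy linear "molecule" 0-1-2 with electronegativities as the atomic property
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
chi = [2.55, 3.44, 2.55]  # e.g. C, O, C
prod, diff = racs(adj, chi, start_atoms=[0], scope_atoms=[0, 1, 2], depth=1)
```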

Additional linker features
Using the SMILES extracted with the MOFid code, 7 we computed counts of chemical features of the linkers. We considered both the sum and the average of those counts over all linkers in a primitive cell of a MOF as descriptors.
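A sketch of the aggregation step, assuming per-linker counts (e.g., of double bonds or aromatic rings) have already been computed from the MOFid SMILES, for example with RDKit; the dictionary values below are made up:

```python
import numpy as np

# Hypothetical per-linker feature counts for one MOF's primitive cell
linker_counts = [
    {"double_bonds": 2, "aromatic_rings": 1},
    {"double_bonds": 4, "aromatic_rings": 2},
]


def linker_descriptors(linker_counts):
    """Sum and average of each count over all linkers in the primitive cell."""
    keys = sorted({k for counts in linker_counts for k in counts})
    mat = np.array(
        [[counts.get(k, 0) for k in keys] for counts in linker_counts], dtype=float
    )
    sums = dict(zip((f"sum_{k}" for k in keys), mat.sum(axis=0)))
    means = dict(zip((f"mean_{k}" for k in keys), mat.mean(axis=0)))
    return {**sums, **means}


descriptors = linker_descriptors(linker_counts)
```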

Hyperparameter optimization
We performed a hyperparameter optimisation over a wide range of parameters using the Bayesian optimiser (using Gaussian processes as surrogate models) and hyperband algorithm implemented in wandb. 8 We used the 5-fold cross-validated error as the empirical error estimate.
For efficiency reasons, the estimators for the different colour channels shared hyperparameters. The ranges we considered and the hyperparameters which we used in the final model are listed in Supplementary Table 1. The influence of the different hyperparameters is shown as a parallel coordinates plot in Supplementary Figure 11.

Colourspaces
There are different spaces in which one can define colours, between which one can convert with (non-)linear transformations. The most commonly used colourspace is the sRGB colourspace, but other colour spaces such as Hue, Saturation and Luminance (HSL) are often found to give better performance in some applications. Generally, one can distinguish between human- (e.g., HSL), hardware- (e.g., RGB) and instrument-oriented (e.g., CIE 1976 L*a*b* (CIELAB)) colour spaces. 9;10 In initial experiments, we varied the colourspace between RGB, HSL and CIELAB. We performed the transformation between the different colour spaces using the Colour Python package. 11 Typically, we found the RGB colourspace to perform best, which is why we used it in the final model.
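For illustration, the RGB-HSL conversion can also be done with Python's standard-library colorsys module (note that colorsys uses the HLS component order); the study itself used the Colour package:

```python
import colorsys


def rgb255_to_hsl(r, g, b):
    """Convert 8-bit RGB to HSL (all components in [0, 1])."""
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)  # note: HLS order
    return h, s, l


def hsl_to_rgb255(h, s, l):
    """Convert HSL back to 8-bit RGB."""
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return round(r * 255), round(g * 255), round(b * 255)
```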

Sensitivity to the cutoff on the variance in perceptive spread and using all datapoints from the survey as noisy labels

Following the analysis of Supplementary Figure 10, we investigated how the predictive performance of our model depends on the cutoff we choose for the variance in the perception of the colours, i.e., the mean pairwise ∆E*ab between the colours picked by the participants of our survey (Supplementary Figure 12). To allow for a fair comparison, we first split off a holdout set from the full database (after the preprocessing steps described in section 1) and then applied the threshold on the perceptive variance to the training set. We considered thresholds of 5 and 16, as well as no threshold at all, on the mean pairwise ∆E*ab between the survey responses for any given colour. Moreover, one could use all the data from the survey by presenting the same MOF to the model multiple times with the different colours our survey participants picked for a given colour string. That is, one MOF feature vector would be mapped to multiple RGB values during training. Those values might be quite similar (for the low threshold on the in-survey variance) or dissimilar (if we do not apply a threshold). This is similar to the addition of noise to labels that is sometimes used to reduce overfitting. To understand the influence of this effect, we trained our model either only on the medians of the survey results (one MOF mapped to one RGB value during training) or on all colour labels (one MOF mapped to multiple RGB values during training) with 50 different train/test splits and measured the ∆E*ab on the two different test sets. For efficiency reasons (and also following the low sensitivity to the hyperparameter settings that is evident from Supplementary Figure 11) we chose the same hyperparameters for all training sets.
We observe that if we apply a threshold, the use of all responses from the survey (i.e., showing the same structure with different RGB values to the model) tends to lead to overfitting (as expected, we observe the best performance on the test set drawn from the same distribution) and poor transfer. If we train on all colours without a threshold, this approach also leads to poor generalisation on a test set drawn from the same distribution. Analysing the results we obtain from training only on the medians, a threshold of 16 seems to be least prone to overfitting. It is easy to understand why the tight threshold of 5 performs badly: we drastically limit the number of training points and the chemical space our model sees during training. In contrast, for the case without a threshold, there are more responses with a high variance that are prone to be wrong, and the data is more difficult for the model to learn.
For these reasons we trained our models on the medians of colours with an in-survey mean pairwise colour distance of ∆E*ab < 16.
Performance measurement

Representative predictions
The predictions for 100 random structures of the test and train set are shown in Supplementary Figure 13 and Supplementary Figure 14.

Test on experimental compounds
To test our model, we took photos of some structures from our experimental colleagues and excluded from our training set structures whose descriptors are closer (Manhattan distance, p = 1) than 0.02 to one of these test structures. All structures except Sion-17 have been prepared according to published procedures. Sion-17 is a MOF that has not been reported so far. We chose to include it in our test set due to its rather unusual chemistry with a Pd porphyrin.
For UiO-66-NH2(Nb) we used the samples prepared for the work reported by Syzgantseva et al. 12

UiO-66-NH2
UiO-66-NH2 was synthesised by a method adapted from Cavka et al. 13 Briefly, stock solutions of the zirconium precursor (320 mg of zirconium(IV) chloride (ZrCl4, Aldrich), sonicated in 20 mL of DMF) and ligand (240 mg of 2-aminoterephthalic acid (H2BDC-NH2, Aldrich), dissolved in 20 mL of DMF) were prepared. 2 mL of each solution was pipetted into 10 × 12 mL glass reactor vials. The reactor vials were heated to 120 °C for 48 hours, then cooled at a rate of 0.2 °C min−1 to room temperature. The resulting pale-yellow material was combined and washed by centrifugation three times with DMF (40 mL) and twice with methanol (40 mL), and left overnight to dry at room temperature.

UiO-NDC
UiO-66-NDC was synthesised by a method adapted from Cavka et al. 13 25 mg of zirconium(IV) chloride (ZrCl4, Aldrich) and 25 mg of 1,4-naphthalenedicarboxylic acid (H2NDC, Aldrich) were added to a 12 mL reactor vial containing 4 mL of DMF. The solution was sonicated for 5 minutes to afford a clear, yellow-tinged solution, and then 0.1 mL of acetic acid was added. The vial was then sealed, heated to 120 °C for 48 hours, and then cooled at a rate of 0.2 °C min−1 to room temperature. The resulting material was washed by filtration with 3 × 10 mL of DMF and 1 × 10 mL of acetone, and left overnight to dry at room temperature.

Mg-MOF-74
Mg-MOF-74 was synthesised by a method adapted from Millward and Yaghi. 14 125 mg of magnesium(II) nitrate hexahydrate (Mg(NO3)2·6H2O, Aldrich) and 100 mg of 2,5-dihydroxyterephthalic acid (H2HBSC) were added to a 25 mL glass reactor containing 10 mL of DMF. After sonicating for 10 min, 0.5 mL of 1-propanol and 0.5 mL of water were added to the solution. The reactor was sealed, heated to 100 °C for 20 hours, and then cooled at a rate of 0.1 °C min−1 to room temperature. The resulting crystals were washed with DMF by syphoning off the mother liquor with a pipette and replacing it with 10 mL of DMF. After gently shaking the vial, the DMF was again syphoned off and replaced with fresh DMF. This procedure was repeated 5 times. The crystals were then filtered and allowed to dry overnight on the filter paper at room temperature.

Co-MOF-74
A mixture of cobalt(II) nitrate hexahydrate (970 mg, 4.61 mmol), 2,5-dihydroxyterephthalic acid (198 mg, 999 µmol), ethanol (27 mL), N,N-dimethylformamide (27 mL, 349 mmol, 1 eq.) and water (27 mL, 1.5 mol, 4.29 eq.) was transferred into a 250 mL Pyrex jar. The jar was placed for 10 min in an ultrasonic bath until a transparent solution was obtained. The glass jar was sealed and kept at 100 °C in an oven for 24 h. After that, the dark red crystals were filtered and washed with ethanol three times. Then, the product was allowed to air dry.

Zn-MOF-74
A mixture of 2,5-dihydroxyterephthalic acid (240 mg, 1.21 mmol), zinc nitrate hexahydrate (720 mg, 2.42 mmol), water (3 mL) and N,N-dimethylformamide (27 mL) was transferred into a 100 mL Pyrex jar. The jar was placed for 10 minutes in an ultrasonic bath until a transparent solution was obtained. The glass jar was then sealed and kept at 120 °C in an oven for 3 days. After that, the yellow crystals were filtered and washed with ethanol three times. Then, the product was allowed to air dry.

Sion-17 (first reported in this work)
10 mg of Pd(II) meso-tetra(4-carboxyphenyl)porphine (H4-TpCPP-Pd, Frontier Scientific) was placed into a 12 mL glass reactor, in addition to 5 mg of zinc(II) nitrate hexahydrate (Zn(NO3)2·6H2O) and 1 mg of adenine (C5H5N5, Alfa Aesar) to act as a mediator. To this, 2.5 mL of DMF was added, and the solution was sonicated for 10 min. After sonication, 0.5 mL of water was added, and then the solution was acidified with 400 µL of 1 M nitric acid. The reactor was then sealed and heated to 120 °C for 72 hours, then cooled to room temperature at a rate of 0.2 °C min−1. The resulting crystals were washed with DMF by syphoning off the mother liquor with a pipette and replacing it with 5 mL of DMF. After gently shaking the vial, the DMF was again syphoned off and replaced with fresh DMF. This procedure was repeated 5 times. The crystals were then filtered and allowed to dry overnight on the filter paper at room temperature.

HKUST-1

1,3,5-Benzenetricarboxylic acid (210 mg, 999 µmol) and copper(II) nitrate hemi(pentahydrate) (348 mg, 1.5 mmol) were added to a 12 mL microwave vial. To this, N,N-dimethylformamide (2.5 mL), ethanol (2 mL) and finally water (500 µL) were added. The vial was then capped, crimped, and sonicated until the material dissolved (approx. 2 min). A clear, pale blue solution was formed.
The vial was placed in a microwave reactor and heated at a power of 200 W to 140 °C for 20 min. The microwave was then cooled for 8 min to 40 °C.
A blue powder was formed. The material was washed by centrifugation (5 x 30 mL N,N-dimethylformamide) in a 50 mL size centrifuge tube at 4000 rpm for 5 min. The solvent was removed from the centrifuge tube, and the material was then allowed to dry in the centrifugation tube at room temperature over 2 days.
Approximately 100 mg of material was activated for 24 h at 200 °C under vacuum in a Schlenk tube. The blue colour darkened.

Cu-TDPAT
2,4,6-Tris(3,5-dicarboxylphenylamino)-1,3,5-triazine (422 mg, 682 µmol), copper(II) nitrate trihydrate (2.31 g, 11 mmol), N,N-dimethylacetamide (30 mL), 1,4-dioxane (30 mL), water (1.5 mL) and fluoroboric acid (13.5 mL) were mixed and stirred for 5 minutes in a 250 mL jar. The solution was then placed in an oven at 85 °C for 3 days and afterwards taken out of the oven. After decanting the hot mother liquor, the crystals were rinsed with DMAc and methanol and dried. To activate the sample, a solvent exchange has to be performed first. To do this, and also to ensure the extraction of any unreacted ligand, the material was inserted into a thimble and washed extensively with methanol in a Soxhlet apparatus overnight. The solvent-exchanged sample was then activated by heating at 150 °C for 12 hours under vacuum. The sealed, activated sample was then transferred into the glove box.

Al-PMOF
Al-PMOF was synthesised by a method adapted from Fateeva et al. 15 100 mg of meso-tetra(4-carboxyphenyl)porphine (H4-TpCPP-H2, Frontier Scientific) and 60 mg of aluminium(III) chloride hexahydrate (AlCl3·6H2O) were added to a 23 mL PTFE reactor. To this, 8 mL of Millipore water and 2 mL of DMF were added. The reactor was placed into a Parr vessel and heated to 180 °C for 16 hours, then cooled to room temperature at a rate of 1.5 °C min−1. The product was washed by centrifugation five times with DMF and twice with acetone, and left to dry overnight at room temperature.

Examples of errors
In the following, we discuss some cases where our prediction is distant (∆E*ab > 30) from the colour reported in the CSD and for which we found a report of the colour in the original journal article. We see that in some instances the colour deposited in the CSD is not the same as the one reported in the original reference. Of course, sometimes our model is also simply wrong. Values in parentheses give the RGB coordinates; names are typically the closest HTML colour names.
• IZOWEA is reported as red in the CSD and as dark red in the paper. 17 We predict a burgundy colour (99, 36, 64), whose closest HTML name is pale violet red.
• YORLEY is reported in the CSD as yellowish green crystals that turn orange within a week. 18 We predict a rosy brown (215, 150, 143).
• PUPYAA is reported in the CSD and the paper 19 as yellow. We predict plum (181, 123, 170).
• PULDOQ is reported in the CSD as yellow and as blue block crystals in the paper. 20 We predict a pale violet red (162, 73, 105).
• GAMXAV is reported in the CSD and the paper 21 as red. We predict a dark blue colour (14, 0, 144).
• NEGMUI is reported in the CSD and the paper 22 as red. We predict a peru colour (235, 175, 72).
• FIFMIS is reported in the CSD and the paper 23 as red. We predict a medium violet red (223, 41, 143).
• AVUCEA is deposited in the CSD as red and also described as such in the paper. 24 We predict a medium violet red (204, 34, 192).
• GIVHIC is reported in the CSD and the paper 25 as red. We predict a plum colour (169, 110, 148).
• MAGBUT01 is reported in the CSD and the paper 26 as yellow. We predict a peru colour (214, 142, 58).
• POHWIU is reported as red in the CSD and as dark red in the paper. 27 We predict a dark magenta (180, 12, 182).
• FOLLEZ is reported as pink in the CSD and as colorless in the paper. 28 We predict alice blue (224, 230, 240).
• GARMAQ is reported in the CSD as black and as brown in the paper. 29 We predict rosy brown (164, 127, 114).
• DOZCEC is reported in the CSD as brown and in the paper as brownish red. 30 We predict pale violet red (213, 120, 140).
• WOPWOO is deposited in the CSD as dark red and as dark violet in the paper. 31 We predict a light pink (216, 147, 178).
• YICGOI is deposited as orange in the CSD and reported as such in the paper. 32 We predict a medium violet red (218, 36, 147).

Baseline models
Before building the GBDT models, we evaluated the performance of some simple baseline models. This is important for understanding the minimum performance one can expect. Some metrics are summarised in Supplementary Table 2 and examples of the predictions on a test set are shown in Supplementary Figures 16 and 16.

Permutation test
One method to assess whether the model learned a meaningful association between features and target is to perform a permutation test. 33 For efficiency reasons, we performed this test with only 500 training points and 100 permutations, using the permutation_test_score function in sklearn (see Supplementary Figure 19).
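A minimal sketch of such a permutation test with scikit-learn, using synthetic data and a generic regressor instead of the actual MOF features, colour labels, and GBDT model:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import permutation_test_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 50 * X[:, 0] + rng.normal(scale=5, size=200)  # learnable synthetic target

# Score on the true labels vs. scores on label-permuted copies of the data;
# a small p-value indicates the model learned a real feature-target association.
score, perm_scores, p_value = permutation_test_score(
    RandomForestRegressor(n_estimators=50, random_state=0),
    X,
    y,
    cv=5,
    n_permutations=30,
    random_state=0,
)
```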

Learning curve
Learning curves for the GBDT model are shown in Supplementary Figure 20. We observe that the curves did not saturate, i.e., that we could improve our model if we had more data and that our representation allows the model to learn. (For example, for a non-unique representation one would expect a flat learning curve. 34 ) For the learning curves, we sampled ten different training sets, trained the models, tested on a test set, and show the standard deviation between the ten runs as error bars.

Feature importance analysis
For the blue colour channel, we find a high importance (higher than, e.g., the number of double bonds or aromatic rings) of the octanol/water partition coefficient lg P of the linker, which is typically used to quantify hydrophobicity/lipophilicity. To understand the importance of the features better, it can be instructive to analyse some ligands in our dataset that minimise or maximise the features. Obviously, using lg P alone is not predictive, but it can help our model to make good first splits and then refine those based on other features (cf. Supplementary Figure 21, which shows some examples of ligands that minimise and maximise lg P; the numbers report the sum of lg P across all unique ligands in the MOFs).

SHAP interaction values
One feature of the SHAP analysis is that it can provide insights into feature interactions. In Supplementary Figures 22-24 we show the strongest absolute interaction values. We can see that the metal-ligand interactions are strongest for the blue colour channel but play a role for every colour channel. Interaction values are shown for a random subset of 1000 structures from the training set.

Colour calibration
We used the Datacolor SpyderCheckr 24 colour rendition chart with the accompanying software to create a correction profile for Adobe Lightroom, which we used for initial comparisons. All processing of the photos was performed in Adobe Lightroom, ImageJ or macOS Preview.

Semi-automatic colour calibration
Using a colour rendition chart, one can calibrate the colours, either manually with photo-editing software like Lightroom or Photoshop, or programmatically. The most important step in this process is the correction of the white balance.
We perform the neutralisation before the colour calibration by dividing the image by the observed colour of the grey neutral 5 (.70 D) swatch and then multiplying it by the swatch's reference colour.
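A minimal sketch of this neutralisation step (toy values; in practice the observed colour would be the mean over the grey-swatch ROI of a linear-RGB image, and the reference colour would come from the chart vendor):

```python
import numpy as np


def neutralise(image, grey_observed, grey_reference):
    """Scale each channel so the grey patch matches its reference colour.

    image: float array (H, W, 3) in linear RGB
    grey_observed: observed mean colour of the neutral grey swatch, shape (3,)
    grey_reference: the swatch's reference colour, shape (3,)
    """
    gain = np.asarray(grey_reference, dtype=float) / np.asarray(
        grey_observed, dtype=float
    )
    return image * gain  # divide by observed, multiply by reference


# A toy image with a uniform blue colour cast
image = np.full((4, 4, 3), [0.40, 0.42, 0.55])
corrected = neutralise(
    image, grey_observed=[0.40, 0.42, 0.55], grey_reference=[0.47, 0.47, 0.47]
)
```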

Caveats
Colour calibration is extremely sensitive to spotlight effects, i.e., when one part of the image is much brighter than another (see also a corresponding issue on GitHub https://github.com/danforthcenter/plantcv/issues/254). To detect whether this is a problem, one can analyse the image profile, for example using ImageJ (see Supplementary Figure 25). In practice, spotlight effects can be avoided by means of a diffuser and by taking the photographs from some distance. We used a setup as shown in Supplementary Figure 28, where a lightbox reduces shadows and ensures homogeneous illumination.
Additionally, for colour calibration, we have to keep in mind that the illuminant typically changes and is different from the reference values we use in the software (CIE D50). We are planning to investigate approaches in which the illuminant is inferred from images using machine learning (ML). 35;36 Note also that we took some photos through the glass of a vial (e.g., when we wanted to measure the colour of an activated material). In this case, the glass can slightly distort the colour measurement. Also note that for powders the result will depend on the powder density, i.e., the air-to-grain ratio. For this reason, some recommend measuring the powder in the form of a tablet. 37 To minimise the effects of light trapping 38 between the particles and to increase reproducibility, we found that the sample thickness should be at least half a centimetre.
Furthermore, we observed a significant variability of colour for different batches of the same material. This is also reflected in the literature, for example for Ni-MOF-74, for which both yellow-green 39 and yellow-brown 40 have been reported.

Python implementation
For the Python implementation we built a Dash app and use the implementation in colour-checker-detection for detection of the colour rendition card. 41 We offer the user a selection of calibration algorithms (polynomial expansions using the Vandermonde method, the method proposed by Cheung et al. 42 and by Finlayson et al. 43 ), which are implemented in the colour Python package. 44 By default we use the algorithm proposed by Finlayson et al. Operations on the colours are performed in linear sRGB space (applying the decoding colour component transfer function (CCTF) to the non-linearly encoded images). The code is, together with a Dockerfile which we use for deployment, available on GitHub (https://github.com/kjappelbaum/colorcalibrator).
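As a sketch of the simplest member of this family of calibration methods, one can fit a 3 × 3 colour-correction matrix to the chart patches by linear least squares (the Cheung and Finlayson methods used in the app add polynomial or root-polynomial terms; the patch values below are synthetic):

```python
import numpy as np


def fit_colour_correction(observed, reference):
    """Fit a 3x3 matrix M (least squares) such that observed @ M.T ~ reference.

    observed, reference: (N, 3) linear-RGB patch colours, e.g. the 24 patches
    of a colour rendition chart.
    """
    M, *_ = np.linalg.lstsq(observed, reference, rcond=None)
    return M.T


def apply_correction(image, M):
    """Apply the correction matrix to (..., 3) linear-RGB data."""
    return image @ M.T


rng = np.random.default_rng(0)
reference = rng.uniform(0, 1, size=(24, 3))  # "true" patch colours
true_M = np.array([[1.10, 0.05, 0.00],
                   [0.02, 0.90, 0.03],
                   [0.00, 0.04, 1.20]])
observed = reference @ np.linalg.inv(true_M).T  # simulated mis-calibration
M = fit_colour_correction(observed, reference)
```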

JavaScript Implementation
We are currently implementing semi-automatic colour calibration in JavaScript (ES2015). The source code is available on GitHub (https://github.com/kjappelbaum/colorcal). In general, our implementation follows the algorithms described by Sunoj et al. 45 For the patch selection, the user selects the edges of a colour calibration card. Using geometric reasoning, we determine the coordinates of the 24 patches of the colour calibration card, even in the case of moderate tilt or bad alignment. We only assume that the full card is visible in the image. For each patch, we select a rectangular region of interest (ROI) in which we compute the average RGB colour.
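The patch-location step can be sketched as follows (in Python rather than JavaScript, for brevity). Bilinear interpolation from the four selected corners is an approximation that works for moderate tilt, whereas strong perspective distortion would require a full homography; the 4 × 6 grid assumes a 24-patch chart such as the SpyderCheckr 24:

```python
import numpy as np


def patch_centres(corners, rows=4, cols=6):
    """Estimate the centres of a rows x cols patch grid from the card's four
    corners (given clockwise from top-left), using bilinear interpolation."""
    tl, tr, br, bl = (np.asarray(c, dtype=float) for c in corners)
    centres = []
    for i in range(rows):
        v = (i + 0.5) / rows  # fractional position down the card
        for j in range(cols):
            u = (j + 0.5) / cols  # fractional position across the card
            top = tl + u * (tr - tl)
            bottom = bl + u * (br - bl)
            centres.append(top + v * (bottom - top))
    return np.array(centres)


# Axis-aligned card from (0, 0) to (600, 400): 24 evenly spaced centres
centres = patch_centres([(0, 0), (600, 0), (600, 400), (0, 400)])
```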

Telegram chatbot
To facilitate the upload of photos of the synthesised compounds into the ELN, 46 we developed a prototype of a Telegram chatbot using the PyTelegramBotAPI Python package. 47 The code of the prototype is available on GitHub (https://github.com/kjappelbaum/elnbot). An example from the first interaction (which asks for the EPFL username) to the upload of an image is shown in Supplementary Figure 29.
The ELN employed by us already supports automatic upload from several instruments via samba shares. For the chatbot we simply save an image with the correct filename (including the username, sample name, and batch number) on a samba share from which a script takes over to attach the image to the correct entry in the ELN via the CouchDB.

Web app
We implemented the web app (see Supplementary Figure 31) for the colour prediction using Dash 48 and use crystaltoolkit 49 to visualise the structures. On the GitHub repository (https://github.com/kjappelbaum/mofcolorizer) we provide a Dockerfile that allows building a Docker image that can be used to run the app on any platform. For example, we deploy the image on a Dokku instance (https://go.epfl.ch/mofcolorizer). 50