de novo generated combinatorial library design

Artificial intelligence (AI) contributes new methods for designing compounds in drug discovery, ranging from de novo design models that suggest new molecular structures or optimize existing leads to predictive models that evaluate their toxicological properties. However, a limiting factor for the effectiveness of AI methods in drug discovery is the lack of access to high-quality data sets, which has led to a focus on approaches that optimize data generation. Combinatorial library design is a popular approach for bioactivity testing, as a large number of molecules can be synthesized from a limited number of building blocks. We propose a framework for designing combinatorial libraries from de novo generated building blocks using k-Determinantal Point Processes and Gibbs sampling. We explore optimization of biological activity, Quantitative Estimate of Drug-likeness (QED) and diversity, and the trade-offs between them, in both single-objective and multi-objective library design settings. Using retrosynthesis models to estimate building block availability, the proposed framework can explore the prospective benefit of expanding a stock of available building blocks by synthesis or purchase of the preferred building blocks before designing a library. In simulation experiments with building block collections from all available commercial vendors, near-optimal libraries could be found without synthesis of additional building blocks; in other simulation experiments we showed that even one synthesis step to increase the number of available building blocks could improve library designs when starting from an in-house building block collection of reasonable size.


Introduction
AI and AI-assisted tools have seen a rapid increase in popularity in cheminformatics over the past decade. In drug discovery, these tools have impacted bioactivity prediction 1,2 , de novo molecular design 3-7 , synthesis prediction 8-12 and toxicology prediction 13 . In turn, the demand for high-quality data has grown beyond the extent of existing data sources 14 , and there is a need to facilitate a larger number of informative experiments that generate data in a standardized format. Combinatorial chemistry is a popular method for producing large collections of compounds, motivated by material efficiency and more sustainable chemistry 15,16 : synthesizing 100 molecules using two building blocks per molecule could in the worst case require 200 different building blocks, whereas a library of the same size built with combinatorial chemistry would use 20 building blocks in a 10 × 10 design.
Library design has traditionally aimed to optimize the selection of molecules either for molecular diversity 17-19 or for molecular properties such as high activity towards a target or reduced lipophilicity, i.e., a focused library design 20-24 . A diverse library design provides larger coverage of the chemical space and is often viewed as more 'informative', since similar molecules would hypothetically provide redundant information 17,25 . Focused libraries, on the other hand, might aim to optimize a selected lead compound 26,27 by lowering the structural diversity and exploring structures similar to the lead in order to improve a specific property.
The space of synthetically feasible molecules is estimated to be of size 10^60 28 , whereas traditional high-throughput screening (HTS) has the capability to physically test approximately 10^6 compounds. Consequently, virtual compound libraries became the focus as computational resources became large enough to store their chemical structures 16,29,30 . The virtual library CH/PMUNK 31 consists of 95 million compounds obtained by enumerating products of common reactions from combinatorial chemistry. The virtual library REAL 32 has over 6 × 10^9 molecules for virtual screening that obey Lipinski's rule of 5 33 . The GDB-17 library of small molecules enumerated by Ruddigkeit et al. 34 contains 160 billion virtual compounds with up to 17 heavy atoms. Additionally, compound suppliers offer "synthesis on demand" building blocks, of which the largest catalogue is MADE 35 , containing 770 million building blocks that can be ordered and made with "over 76% success rate".
Generative models for de novo design offer an alternative to virtual screening or HTS by instead generating smaller, focused selections 3,36 . Several deep learning models have been proposed to generate chemical libraries in a focused manner, in particular by decorating a scaffold 37 , i.e., suggesting which building blocks to attach to it. The MolGPT model showed the capability to both optimize a lead and decorate a scaffold 38 . STRIFE emphasized pharmacophoric information to decorate fragments and optimize them against protein targets 39 . Domenico et al. adapted the REINVENT 3 architecture to create focused libraries towards inhibiting NA, AChE and SARS-CoV-2 40 . LibINVENT 41 uses reinforcement learning to generate reaction-constrained decorations of input scaffolds. These methods can generate building blocks for combinatorial library design, but they do not inherently provide an optimized combinatorial selection. Given a limited experimental budget, there is motivation to develop workflows that optimize the combinatorial design over novel de novo generated building blocks.
Methods that simultaneously optimize both the diversity and the molecular properties of a library have been used in several previous studies, using, for example, simulated annealing (SA) 42 or genetic algorithms (GA) 43-45 . These approaches optimize over lists of supplied building blocks or virtual libraries, but they cannot determine whether novel generated building blocks can actually be acquired or whether they are only hypothetical structures that are impossible to synthesize in practice. As such, a design produced by these models from de novo generated building blocks is limited by the "synthesis on demand" success rate.
A model that has proven to perform well for modelling the trade-off between quality and diversity is the Determinantal Point Process (DPP) 46-48 . DPPs are probabilistic models that have been argued to represent repulsion between items 49 . They are used in other application areas for text summarization 48 , pose estimation 47 and diverse image selection 46 , but have not yet been investigated for library design. While common methods for selecting for diversity maximize the sum of pairwise distances 17,45 or minimize the average pairwise similarity 44 , the determinant of the similarity matrix captures the interaction between multiple molecules simultaneously 50 . Additionally, the max-sum and min-average methods scale quadratically in time with the number of building blocks in the optimization space. While the DPP has cubic scaling, it depends on the size of the sampled library rather than on the number of options.
We propose a library optimization workflow that recombines de novo generated building blocks in a combinatorial fashion 51,52 . Using LibINVENT 41 , we generate and filter building blocks that can attach to an example scaffold via specified reactions. We then use the computer-aided synthesis prediction (CASP) tool AiZynthFinder 12 to evaluate, for every generated building block, whether it is available in the eMolecules building block platform 53 of purchasable building blocks, or to estimate the number of reaction steps needed to synthesize it using template-based retrosynthesis prediction 8,9 . We simultaneously explore and optimize the library selection for Quantitative Estimate of Drug-likeness (QED) 54 , Quantitative Structure-Activity Relationship (QSAR) score 1,36,55,56 and structural diversity (ECFP6) 57 using Gibbs sampling 58 , conditioned on a constant library size, thus sampling from a determinantal point process of constant size k (k-DPP) 59 . The workflow is model-agnostic and can be applied to any list of building blocks and any CASP tool that breaks the building blocks down into stock-available precursors. We apply this workflow to optimize a library from all available building blocks in eMolecules 53 . We also simulate an in-house building block store by optimizing over a subset of the available building blocks, and explore the differences between libraries optimized from this limited stock and libraries optimized from all commercially available building blocks.
The main contributions of this framework are as follows. We
• extend combinatorial library design to score de novo designed building blocks,
• propose the use of DPPs, in particular k-DPPs, to sample libraries that optimize the trade-off between quality and diversity, and
• estimate the difference in score between libraries using only available building blocks and libraries drawing on the total pool of generated reactants, thereby estimating the potential gain from expanding the set of available building blocks.

Methods
The framework (see Figure 1) consists of the generation of building blocks, followed by the use of retrosynthesis prediction models to estimate whether the building blocks are available in a defined stock data set, or whether they could be produced from this stock through synthesis. While the implementation here [https://github.com/SeemonJ/combinatorial-library-designdpp] is specifically made to work with the open source versions of LibINVENT 60 and AiZynthFinder 61 , the framework itself can be adapted to work with any choice of metrics.

Application example
The scaffold displayed in Figure 2 is adapted from the original LibINVENT publication 41 . The reactions used are Buchwald-Hartwig 62 for the left attachment point and primary amide coupling 63 for the right one. We refer to these reactions as BH and AC, respectively, in the following.

Target activity model
The QSAR model is a random forest model 64 built using Scikit-learn 0.21.3 65 with 50 estimators. The training data are all DRD2 data available in ExcapeDB 66 , with an active/inactive pXC50 threshold of 6. Compounds from HTS assays in ChEMBL 67 without pXC50 data were assigned as inactive. With these definitions of activity, the data set contained 6,304 active compounds and 344,905 inactive compounds. The compounds were represented by the extended connectivity fingerprint with 2,048 bits and radius 3 (ECFP6). The model was trained using an 80%/20% training/test data split. The data are imbalanced, with most training points labelled as inactive, and the model reaches an AUC-ROC score of 0.995 with a pessimistic bias. This model was used both as part of the LibINVENT reinforcement learning run and during library selection.
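For illustration, a minimal sketch of how such a model can be assembled with RDKit and scikit-learn is shown below. The SMILES strings and labels are toy placeholders rather than the DRD2 data described above; only the fingerprint settings (2,048 bits, radius 3) and the number of estimators (50) follow the text.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

def ecfp6(smiles, n_bits=2048):
    """ECFP6 = Morgan fingerprint with radius 3, folded to 2,048 bits."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=3, nBits=n_bits)
    return np.array(fp, dtype=np.uint8)

# Toy stand-ins for the DRD2 training data (1 = active at pXC50 >= 6).
smiles_list = ["CCO", "c1ccccc1N", "CC(=O)Oc1ccccc1C(=O)O", "c1ccc2[nH]ccc2c1"]
labels = [0, 1, 0, 1]

X = np.vstack([ecfp6(s) for s in smiles_list])
y = np.array(labels)

qsar = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# The probability of the active class is used as the QSAR score of a product.
print(qsar.predict_proba([ecfp6("c1ccc2[nH]c(C(=O)O)cc2c1")])[:, 1])
```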

Building block generation using LibINVENT
The building blocks were generated using the pre-trained prior model of LibINVENT 60 . The reinforcement learning was run for 1,000 epochs with a batch size of 128 and a learning rate of 5 × 10^-6. The default diversity filter, which penalizes previously sampled building blocks, and the custom alerts for non-drug-like groups were included during training. Reaction filters for the BH and AC reactions were applied, which penalize building blocks that do not match the reaction SMARTS 68 .
A total of 104,991 unique molecules (82%) were generated, of which 94,808 (74%) matched the reaction filters. All molecules to which the QSAR model assigned a probability of being active lower than 0.8 were removed in post-processing. This yielded 45,928 remaining products, from which the building blocks were extracted. 32,159 unique carboxylic acids and 2,084 unique aromatic halides were identified, corresponding to the AC and BH reactions, respectively. The runtime was approximately 2 hours on an Nvidia 2080Ti.
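As a simplified illustration of this filtering and extraction step, the snippet below classifies molecules by functional group with RDKit substructure matching. The SMARTS patterns are generic carboxylic acid and aryl halide queries chosen for this sketch; the actual LibINVENT reaction filters use full reaction SMARTS for the AC and BH reactions.

```python
from rdkit import Chem

# Generic functional-group queries (illustrative only; not the exact
# reaction SMARTS used by the LibINVENT reaction filters).
CARBOXYLIC_ACID = Chem.MolFromSmarts("[CX3](=O)[OX2H1]")
ARYL_HALIDE = Chem.MolFromSmarts("c[Cl,Br,I]")

def classify_building_block(smiles):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    if mol.HasSubstructMatch(CARBOXYLIC_ACID):
        return "AC"   # candidate for the amide-coupling attachment point
    if mol.HasSubstructMatch(ARYL_HALIDE):
        return "BH"   # candidate for the Buchwald-Hartwig attachment point
    return None

for smi in ["OC(=O)c1ccccc1", "Brc1ccccn1", "CCO"]:
    print(smi, classify_building_block(smi))
```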

Building block availability
The public version of AiZynthFinder 61 was used to check which building blocks were available directly 'in stock' and which building blocks would require synthesis. The baseline stock consists of purchasable building blocks from eMolecules 53 and contains approximately 1.5 million building blocks (including 227K carboxylic acids and 444K aromatic halides). AiZynthFinder was set to a maximum search time of 5 minutes and a maximum of 10 reaction steps for identifying a synthetic route. AiZynthFinder was run in batches across multiple CPUs of varying models, since analysing ~34K building blocks for up to 5 minutes each would, in the worst case of no building blocks being available directly in stock, require ~2,800 CPU hours. This analysis was performed both for the baseline stock and for five limited-availability subsets used to simulate an internal stock. The limited-availability subsets were sampled uniformly without replacement from the baseline stock and were chosen to be 3% of the size of the baseline stock (~45K building blocks).
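For reference, a single building block can be analysed with the public AiZynthFinder Python interface roughly as sketched below. The pattern follows the documented usage, but the configuration file, the stock and policy keys, and the exact statistics fields depend on the local installation and version, and should be treated as assumptions here.

```python
from aizynthfinder.aizynthfinder import AiZynthFinder

# 'config.yml' points to local expansion policy and stock files
# (here assumed to define an "emolecules" stock); adjust to your setup.
finder = AiZynthFinder(configfile="config.yml")
finder.stock.select("emolecules")
finder.expansion_policy.select("uspto")

finder.target_smiles = "OC(=O)c1ccc(Br)cc1"   # one generated building block
finder.tree_search()
finder.build_routes()

stats = finder.extract_statistics()
# Typical fields report whether a route was solved and how many steps it
# needs; exact key names may vary between AiZynthFinder versions.
print(stats.get("is_solved"), stats.get("number_of_steps"))
```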
The parameters chosen for both generative modelling and retrosynthesis let the models run for longer than in previous uses of the same architectures 12,41 : 1,000 epochs compared to 100 for generation, and 5 minutes instead of 2 for retrosynthesis evaluation. This yields more output building blocks and solves more routes than in previously demonstrated studies, and potentially includes LibINVENT output that results from over-exploiting the QSAR model. This was done intentionally to increase the size of the search space and to provide building blocks with a larger spread in quality properties, in order to showcase the effect of the different strategies.

Determinantal Point Processes
In library design, diversity is often computed between compounds through the matrix of pairwise distances. When optimizing a library, the most common approaches maximize the sum of distances, maximize the minimum distance, or maximize the average distance to the nearest neighbour 17,44,45 . This captures the distance between a pair of molecules well, but does not capture the relationships between multiple molecules simultaneously 50 .
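A small numerical example (constructed for this illustration, not taken from the study) shows the difference: a set containing two near-duplicate molecules can still rank highly on summed pairwise distances, whereas the determinant of the similarity matrix penalizes the redundancy of the whole set jointly.

```python
import numpy as np

def sum_dist(S):
    # Sum of pairwise distances, taking distance = 1 - similarity.
    return np.sum(1.0 - S) / 2

# Set A: two near-duplicates plus one distinct molecule.
S_a = np.array([[1.0, 0.95, 0.1],
                [0.95, 1.0, 0.1],
                [0.1, 0.1, 1.0]])
# Set B: three moderately similar molecules.
S_b = np.array([[1.0, 0.4, 0.4],
                [0.4, 1.0, 0.4],
                [0.4, 0.4, 1.0]])

print(sum_dist(S_a), sum_dist(S_b))            # 1.85 vs 1.80: max-sum slightly prefers A
print(np.linalg.det(S_a), np.linalg.det(S_b))  # ~0.097 vs ~0.648: the determinant prefers B
```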
Discrete DPPs are probability distributions first used by Macchi to model fermions 69 , and have become increasingly popular within machine learning for capturing the trade-off between diversity and quality 46 . Let L ∈ ℝ^{n×n} be a positive semi-definite (PSD) matrix. A discrete DPP with kernel L is a probability distribution P: 2^[n] → ℝ_+ defined by

$$P(Y) \propto \det(L_Y), \qquad (1)$$

where L_Y is the principal submatrix of L indexed by the elements of Y. If each row of the matrix is a feature vector representing an item, then the probability of a set of items is proportional to the squared volume spanned by the corresponding vectors; a selection that is diverse in the given features corresponds to a larger volume. In this study, the feature representation used to describe the products of a selection is ECFP6, as in the QSAR model, and the similarity measure is the Tanimoto index 70 (also known as the Jaccard index). This is well suited for use in DPPs, as the matrix of pairwise Tanimoto similarities S is a typical kernel 46 .
Kulesza and Taskar 46 demonstrate that item quality can be incorporated into DPPs by decomposing the kernel as

$$L_{ij} = q_i \, S_{ij} \, q_j, \qquad (2)$$

where S_ij represents the similarity between items i and j, and q_i is a measure of the quality of item i. This extends to multiple quality measures. Inserting equation 2 into the definition of the DPP yields the probability of observing the set Y while sampling the DPP,

$$P(Y) \propto \Big(\prod_{i \in Y} q_i^{2}\Big) \det(S_Y). \qquad (3)$$
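A minimal numpy sketch of this decomposition is shown below: it builds a Tanimoto similarity matrix from binary fingerprints, combines it with per-item qualities, and evaluates the logarithm of the right-hand side of equation 3, log P(Y) = 2 Σ_i log q_i + log det(S_Y) + const. The random fingerprints and quality values are placeholders for the ECFP6 vectors and QSAR/QED scores used in this work.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder data: binary "fingerprints" and per-item qualities in (0, 1].
fps = rng.integers(0, 2, size=(96, 2048)).astype(float)   # stand-in for ECFP6
q = rng.uniform(0.5, 1.0, size=96)                         # stand-in for QSAR/QED scores

def tanimoto_kernel(fps):
    inter = fps @ fps.T                                    # pairwise bit intersections
    counts = fps.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    return inter / np.maximum(union, 1e-12)                # Tanimoto similarity matrix

S = tanimoto_kernel(fps)

# log P(Y) up to a constant: 2 * sum(log q_i) + log det(S_Y).
# slogdet keeps the computation stable when det(S_Y) underflows double precision.
sign, logdet = np.linalg.slogdet(S + 1e-8 * np.eye(len(S)))
log_score = 2.0 * np.log(q).sum() + logdet
print(sign, logdet, log_score)
```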

Sampling process
Evaluating the determinant of all possible products at once may introduce practical problems, since a naive implementation of the determinant calculation is O(n^3); this naive implementation is what most numerical libraries use. Thanks to parallelization over smaller blocks of submatrices across multiple threads, it is possible to compute determinants of matrices with n > 10,000 within minutes. For the sampled number of possible products, 32,159 × 6,213 = 199,803,867, it is computationally infeasible to evaluate all subsets, let alone optimize across all possible selections. For scenarios such as ours, however, the only selections of relevance are sets of practical size, such as common screening-plate sizes, i.e., 96, 384 or 1,536. k-DPPs are an extension of general DPPs that are conditioned on selected sets of size exactly k. Gharan and Rezaei 71 introduced a computationally efficient method for sampling k-DPPs using a Gibbs sampling scheme shown to have fast mixing properties. Here, the proposal distribution draws suggestions only from exchange operations between one element and one non-element of the current k-set, which ensures that the size of the selection always remains constant. Moreover, at time step t the sampler only requires computation of the transition probability of the proposed exchange, which depends on the quality terms for each parameter in the set Q of included quality parameters, their respective tuneable weights w_q(·), and the determinant of the similarity matrix of the proposed selection.

To give equal importance to the QSAR value, the QED score and diversity, we set w_QSAR = w_QED = w_Div = 0.33 as constants. At each step t, this results in two determinant computations of complexity O(k^3). The following sampling scheme was implemented for selecting n and m building blocks from the respective sets A and B of available building blocks for the two attachment points:

Algorithm 1.
1. Initialize the selection with n and m building blocks drawn at random from A and B, respectively
2. Create the n × m matrix of products Y_0 and denote this matrix the active set Y
3. Compute the quality values q_{Y_0} and the matrix of pairwise similarities S_{Y_0}
4. Compute P(Y_0)
5. Select a new building block from either A or B uniformly
6. Compute the new matrix Y_1 and the corresponding values q_{Y_1}, S_{Y_1}
7. Calculate the transition probability P_acc(Y_{t+1}) from P(Y_0) and P(Y_1), where β is a tuneable parameter on the acceptance probability
8. Move to the new state Y = Y_1 with probability P_acc(Y_{t+1}) or stay with Y = Y_0 with probability 1 − P_acc(Y_{t+1})
9. Repeat steps 5-8 until termination.
Since the pairwise similarity values in S_Y all lie in [0,1], the determinants may become too small for double precision for relevant choices of k. For numerical stability, the logarithm of the right-hand side of equation 3 is used in step 7. The logarithm of the determinant is then non-positive, with a value closer to 0 representing a more diverse set. In the numerical experiments we let n = 12 and m = 8, corresponding to the selected carboxylic acid and aromatic halide building blocks, respectively, and used k = 96 as it is a common plate size.
We chose to conduct experiments with β = 0, such that only strict improvements are accepted (hill climbing, i.e., a greedy search). The selections of the model for the different optimization strategies were examined, see Table 1. To account for the mixing time, the termination criterion was set as a patience parameter: the distribution was sampled until 10,000 consecutive samples were drawn without finding a better solution. We compare the results against the average of 100 random selections and against the top 96 compounds cherry-picked by QSAR value from the LibINVENT run.
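To make the sampling loop concrete, the following is a minimal sketch of the exchange-based search with β = 0 (accepting only strict improvements) and a patience-based stop. It assumes a generic score(selection) function, for example the weighted combination of average QSAR, average QED and log-determinant described above, and is an illustration of the procedure rather than the released implementation.

```python
import random

def hill_climb_exchange(acids, halides, score, n=12, m=8, patience=10_000, seed=0):
    """Greedy exchange search over an n x m combinatorial selection (beta = 0)."""
    rng = random.Random(seed)
    sel_a = rng.sample(range(len(acids)), n)       # indices of selected carboxylic acids
    sel_b = rng.sample(range(len(halides)), m)     # indices of selected aromatic halides
    best = score(sel_a, sel_b)
    since_improvement = 0

    while since_improvement < patience:
        # Propose swapping one selected block for one unselected block.
        if rng.random() < 0.5:
            pool, sel = range(len(acids)), sel_a
        else:
            pool, sel = range(len(halides)), sel_b
        out_idx = rng.randrange(len(sel))
        candidate = rng.choice([i for i in pool if i not in sel])
        old = sel[out_idx]
        sel[out_idx] = candidate

        new = score(sel_a, sel_b)
        if new > best:                              # beta = 0: strict improvement only
            best = new
            since_improvement = 0
        else:
            sel[out_idx] = old                      # revert the exchange
            since_improvement += 1
    return sel_a, sel_b, best

# Toy usage with a dummy score (replace with the weighted QSAR/QED/log-det score):
acids, halides = list(range(100)), list(range(50))
dummy_score = lambda a, b: -abs(sum(a) / len(a) - sum(b) / len(b))
print(hill_climb_exchange(acids, halides, dummy_score, patience=1_000)[2])
```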

Results
In this section, we first show the results of processing the generated building blocks from LibINVENT through AiZynthFinder, to give a measure of the selection space for the framework. We then present the average results of each optimization strategy for different levels of availability, expressed as the required number of reaction steps. Next, we show optimization results for a simulated scenario of limited stock building block availability. Finally, we discuss the computational performance of the model when scaling up to a larger selection space.
The 32,159 unique carboxylic acids and 2,084 unique aromatic halides generated through LibINVENT were analysed using AiZynthFinder. The retrosynthetic prediction found that 88.7% of the generated carboxylic acids and 98.3% of the aromatic halides could be synthesized within 2 reaction steps from the base eMolecules stock. Of the building blocks, 6,203 carboxylic acids (19.3% of the generated building blocks) and 763 aromatic halides (36.6%) were directly available in stock, i.e., required no synthesis. The full distribution of reaction availability can be seen in Figure 3.
The compound selection was performed using the criteria of only QSAR, only QED, only diversity, and all metrics simultaneously with equal weight. For the rest of this section, we refer to the strategy of optimizing all metrics simultaneously as Simultaneous Optimization (SO). The single-objective strategies were obtained by setting the weights w_q in Algorithm 1 to 0 for the ignored metrics. This was performed for building blocks available within 0-4 reaction steps, as extending the search to the remaining compounds added few additional options (see Figure 3). At each step, the new building blocks were added to the existing pool of available blocks, modelling the marginal gain for the chemist of performing synthesis to acquire new building blocks. We repeated 10 runs of Algorithm 1 for each reaction-step level, starting from different randomized initializations.
The results for single-objective search, cf. Table 1, show that the average QSAR values obtained while optimizing for the other objectives tended to stay between 0.6 and 0.7, indicating that an arbitrary recombination of building blocks from LibINVENT compounds with high QSAR values does not always result in a product that also has a high QSAR value.
Expanding the search to building blocks available within 1-4 reaction steps resulted in samples of slightly lower diversity, as the average QSAR value went from very close to 1.0 to selections in which every compound scored exactly 1.0. Optimizing for diversity maintained the average QSAR value in the observed selections. The results of SO did not improve as the number of available building blocks increased, indicating that the set of purchasable building blocks that is already available covers optimal solutions given our scoring parameters. For the single-objective optimization strategies, the QED value tended to decrease as the size of the search space increased. A possible explanation is that building blocks requiring several reaction steps are more complex, which tends to have a negative effect on the QED value 54 . The difference between selections from the baseline available building blocks and selections including building blocks one reaction step away represents the largest change in QED score, while further expansions of the building block availability resulted in much smaller or no changes for all metrics. This observation is likely explained by the distribution of building blocks previously observed in Figure 3: one reaction step represents a change from a space of 6,203 × 763 products to a space of 23,034 × 1,926, almost ten times larger. The next reaction steps increase the size of the product space relative to the previous step by only 31.7% and 4.9%, respectively. The sampling process thus selects building blocks from pools that are very similar between these three selections, and as such the resulting distributions are similar.
The top 96 compounds by QSAR value generated by LibINVENT had an average QSAR value of 1.0 and an average QED of 0.43. While these compounds are more diverse than any selection found in our combinatorial search, they achieve this by breaking the combinatorial constraint: the selection contained 96 different carboxylic acids and only 3 different aromatic halides. 95 of the carboxylic acids were evaluated by AiZynthFinder to be synthesizable in at most four reaction steps, and the 3 aromatic halides were all available directly in stock.
To compare these results against random selection, we sampled 100 combinatorial selections of size 12 × 8, where each building block for the respective AC and BH reactions was sampled with equal probability. This was repeated for building block availability at each level of reaction steps, up to 4 reaction steps from the stock. The random selections consistently had worse QSAR and QED values than SO, while having diversity values that were not noticeably different from the optimized selections. The average QED value among the random selections is <0.25, which is considerably lower than the average of an "attractive drug" 54 . In addition, the average QSAR value is lower than 0.8, which means that many products in the selection are not very likely to be bioactive. This validates the need for optimizing these selections. The selected products of the single-objective optimizations as well as of SO were also compared visually. Figure 4 shows a small sample of 2 × 2 combinatorial examples from the different selections for visual clarity. The single-objective selections leave plenty of room for improvement. QED-optimized and diversity-optimized selections both have QSAR values around 0.7, but while the QED-optimized compounds are small, the diversity-optimized compounds favour larger building blocks with several rings and side chains. QSAR-optimized selections have the lowest diversity and cover a range of low QED scores, favouring building blocks with 1-2 rings each that are generally still too large to be drug-like. It is likely that the QSAR score of 1.0 indicates that LibINVENT finds exactly which bits in the fingerprint representation exploit the QSAR model. SO yielded a balanced selection of smaller building blocks that still gave a high average QSAR value of ~0.848.
To evaluate the selection strategies in a more practically relevant setting, we restricted the building block stock to a subset of 3% of the original size (~45K building blocks), simulating the building block availability of a typical pharmaceutical company. The distribution of solved retrosynthesis routes for the building block subsets is shown in Figure 5. On average, 26,504 routes (standard deviation 526.6) and 1,072 routes (standard deviation 132.9) were unsolved for the AC and BH reactions, respectively. It is noteworthy that the proportion of building blocks added per reaction step, relative to the currently available size, is larger for these limited-availability subsets: 1,745 and 385 building blocks are added for AC and BH after one reaction, compared to 16,831 and 1,163 building blocks added for the full stock. The general trend continues as the selection space is expanded to more reaction steps, and within the first four reaction steps almost half of the total number of aromatic halides and more than half of the carboxylic acids become available.

Table 1. Summary of average metrics across all selection strategies used. LogDet is the logarithm of the determinant of the kernel matrix, i.e., the matrix of all pairwise Tanimoto similarities in the current selection. A value closer to 0 is more diverse. Random selection shows the average values of 100 combinations selected for each reaction-step availability. For each optimization strategy, we show the results for stock-available building blocks (0 reaction steps) and for building blocks up to 4 reaction steps away. The overall average results are denoted by Σ.
The same four selection strategies were used for building blocks available within 0-4 reaction steps, with ten randomized initializations each. The selections from stock-available building blocks (zero reaction steps), shown in Table 2, show that the highest achievable values are drastically lower than after acquiring more building blocks by synthesis. For this smaller space, the algorithm is likely to reach the same optimum for the given stock from multiple initializations.
The results show that the optimized selections approach their respective values from the full eMolecules availability already after extending the selection space to building blocks available within one reaction step, and that the stock-available selections score similarly in average QSAR and diversity to the random selections of the previous experiment. There are smaller improvements for selections with building blocks available within two reaction steps and no improvements with further reactions. We can draw parallels between the distribution of available building blocks in Figure 5 and the corresponding distribution of the previous experiment (Figure 3), and note that the improvements occur when a relatively large number of new building blocks is added to the selection space. When the relative expansion of the space is small, the probability of finding a new, improved solution is also small.
Unlike in the previous experiment, however, the QED score remains at a similar level or, in some cases, improves as the number of reaction steps increases. It is likely that fewer of the building blocks added through reactions are "too complex" in this experiment.
Comparing the optimization results between two different stocks of availability in this way might be useful for estimating the prospective gain from synthesizing new building blocks, relative to buying available compounds or simply using the current stock. This can assist the decision-maker in designing efficient libraries in a combinatorial manner. The number of building blocks estimated to be available through synthesis shows a substantial increase in search space as the number of reaction steps increases. In practice, often only stock-available building blocks or building blocks that can be synthesized in one reaction step will be used. Alternatively, one could introduce a constraint on the total number of reaction steps used for the selected library, which could be accounted for using, e.g., reaction sampling.

Computational time
During selection, we opted for relatively small selection dimensions to limit the computational time to less than ten hours per run, since we performed 12 optimizations for 10 splits and 5 different building block availabilities, for a total of 600 selections. The observed runs drew approximately 20,000-100,000 samples depending on the selection space, initialization and number of metrics, which could take between 20 minutes and 4 hours on a single CPU, with the QSAR model being the biggest bottleneck. However, since the evaluation of a random forest model is linear in the number of new products between two samples (12 or 8, depending on the exchanged building block) and determinant calculations have time complexity O(k^3) in the total number of products, the method will eventually be limited by evaluations of diversity rather than QSAR. This still appears feasible for a library of size 1,536, where k = n × m and the kernel matrix has k^2 = (n × m)^2 entries. The termination criterion of 10,000 samples without improvement was chosen after some initial experimentation. For larger library dimensions, more samples may be needed to reach convergence, as the increased number of building blocks to choose from results in more decision variables to determine for an optimal solution. Additionally, larger dimensions generally mean that the marginal effect of exchanging one building block on the average values of the selection is smaller, which implies that the acceptance ratio becomes closer to 1. On an Intel Xeon W-2125 CPU @ 4.00GHz with 8 threads, the 12 × 8 configuration required approximately 0.11 s for the QSAR computations and 0.04 s for computing diversity per sample, while a 48 × 32 configuration required 0.14 s for QSAR and 4.0 s for diversity. A full exhaustive search was never considered, even for the smallest subsets, as, e.g., the average 3% subset at stock availability in a 12 × 8 configuration yields ~2 × 10^27 possible combinations. For the same reasons, hyperparameter optimization of β and the weights w_q was not performed, since this scaffold is hypothetical and a marginally better selection would not lead to generalizable guidelines for these parameters.

Conclusions
We present a framework for combinatorial library design, evaluated using available public data and open source software to allow reproducibility. The framework can be controlled by specifying both the importance of the different evaluation metrics and the acceptance parameter β. Our experimental results show that it is possible to perform multi-objective optimization towards both quality and diversity for our example library. The results show that our framework can navigate the search space of combinatorial library design and find selections with high (>0.8) QSAR values while retaining good (>0.7) QED values and high diversity. The trade-offs between the different objectives were investigated, and it was found that the multi-objective optimization maintained a QED relatively close to the maximum possible while optimizing QSAR and diversity. Building blocks selected at random showed on average low (<0.25) QED values and a lower QSAR value (~0.78) than the quality-focused optimization strategies. Our experiments indicate that the set of all purchasable building blocks requires minimal extra synthesis to reach the highest observed scores, while simulated scenarios of limited stock benefit greatly, reaching comparable score levels, from single-step synthesis of building blocks. The latter scenario might be useful in practice for a larger company with a sizable building block store: it might be faster and cheaper to synthesize the needed building blocks for the combinatorial library design in one step than to purchase additional building blocks. It was also shown that synthesizing building blocks in more than one step was not attractive given the size of the internal building block store; for an institution with a very small internal building block store, it might be favourable to synthesize the needed building blocks for the libraries in more than one step.

Figure 1. Flowchart of the methods used for the combinatorial library design.

Figure 2. Scaffold used as input for the generation of building blocks. This figure is adapted from 41 .

Figure 3. Distribution of the number of reaction steps needed for the generated building blocks from the entire eMolecules stock. Building blocks for which a retrosynthetic route could not be found are denoted with '-'.

Figure 4. Sampled compounds using the selection strategies Max QSAR, Max QED, Max diversity and Simultaneous Optimization of all three criteria. The examples shown use building blocks available in the eMolecules stock.

Figure 5. Distribution of the average number of reaction steps needed for the generated building blocks when using a 3% subset of the stock. Error bars show the standard deviation across the 5 splits. The number of unsolved routes is omitted from the figure for visual clarity.

Table 2. Summary of average metrics across all selection strategies used when optimizing over the smaller (3%) subsets of available building blocks. LogDet is the logarithm of the determinant of the kernel matrix, i.e., the matrix of all pairwise Tanimoto similarities in the current selection. A value closer to 0 is more diverse.