Evolving Concept of Activity Cliffs

Activity cliffs (ACs) are generally defined as pairs or groups of structurally similar compounds that are active against the same target but have large differences in potency. Accordingly, ACs capture chemical modifications that strongly influence biological activity. Therefore, they are of particular interest in structure–activity relationship (SAR) analysis and compound optimization. The AC concept is much more complex than it may appear at first glance, especially if one aims to represent ACs computationally and identify them systematically. To these ends, molecular similarity and potency difference criteria must be carefully considered for AC assessment. Furthermore, ACs are often perceived differently in medicinal and computational chemistry, depending on whether they are studied on a case-by-case basis or systematically. For practical applications, intuitive access to AC information plays a major role. Over the years, the AC concept has been further refined and extended. Herein, we review the evolution of the AC concept, emphasizing new analysis schemes and findings that help to better understand ACs and extract SAR knowledge from them.


INTRODUCTION
Activity cliffs (ACs) have been discussed in computational and medicinal chemistry for nearly three decades. 1−4 The first mention of the term probably dates back to a book chapter published by Michael Lajiness in 1991. 1 ACs were generally defined as pairs of structurally similar active compounds with a large difference in potency. 1,2 As such, ACs received strong attention in computational chemistry and drug design because they represented instances of structure−activity relationship (SAR) discontinuity that were detrimental to quantitative SAR (QSAR) modeling. 2 In QSAR, AC compounds were often falsely considered outliers, as pointed out in a milestone commentary by Gerry Maggiora that raised awareness of ACs in the field. 2 However, SAR discontinuity also translates into high SAR information content, and ACs identify small chemical modifications that determine the potency of active compounds. This explains the strong interest in ACs in medicinal chemistry to aid in compound optimization. 3,4 ACs and the SAR discontinuity they capture are usually focal points during early stages of compound optimization efforts when potency must be improved. By contrast, encountering ACs is less desirable during late stages when multiple compound properties must be balanced. 5 This is the case because ACs and the underlying "steep" SARs complicate multiple property optimization while attempting to retain sufficiently high compound potency. 5 In medicinal chemistry and structure-based compound design, individual ACs are frequently found, 6,7 including instances revealing dramatic effects of minute chemical modifications such as "magic methyl" effects. 7 Moreover, large volumes of SAR information become available when ACs are systematically identified across bioactive compounds. 8 This is an area where computational and medicinal chemistry meet, providing a large knowledge base for compound optimization. 4,5
While ACs might be intuitively appreciated from a chemical viewpoint, by subjectively considering compound series and judging SARs, the systematic study of ACs requires careful consideration of two criteria that are essential for defining ACs: the similarity criterion and the potency difference criterion. 3,4 Investigating these criteria and their interplay has largely determined the way in which the AC concept has evolved over the years and continues to evolve, 5 as discussed in the following.

SIMILARITY
2.1. Molecular Graph-Based Similarity. Compound similarity can be assessed in a variety of ways. For computational representation of ACs, molecular fingerprints (bit string representations of chemical structure and/or properties) were originally used as descriptors to calculate Tanimoto similarity, 3 a standard procedure in chemoinformatics. Tanimoto coefficient (Tc)-based ACs are shown in Figure 1. While convenient computationally, the use of the Tc, which ranges from 0 (no fingerprint overlap) to 1 (identical fingerprints), requires the choice of a threshold value for classifying compounds as similar, which is largely subjective. 3 Furthermore, different fingerprints produce different Tc values, which changes similarity criteria. Accordingly, attempts have been made to identify ACs that would be formed independently of different molecular representations and similarity calculations, leading to the search for "consensus ACs". 9 Moreover, although straightforward to calculate for computational screening of compounds, Tc values are sometimes difficult to interpret from a chemical perspective because they are calculated as a whole-molecule similarity measure and do not take any substituent rules or reaction information into account. Therefore, as an alternative to calculated similarity values, substructure-based similarity criteria can be applied, which lead to a binary assessment of similarity, 4 i.e., two compounds either share a predefined substructure and are classified as similar or they do not. Substructure-based similarity can also be assessed in different ways, for example, by using formally defined molecular scaffolds (core structures) and distinguishing different scaffold/substituent (R-group) relationships, 10 as illustrated in Figure 1.
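The fingerprint-based AC criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any cited study. Fingerprints are represented as sets of "on" bit positions, and the function names, the similarity threshold of 0.55, and the toy compounds are assumptions introduced here. The potency difference threshold of 2 pKi units corresponds to a 100-fold difference.

```python
from itertools import combinations


def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto coefficient for fingerprints given as sets of 'on' bit positions."""
    if not fp_a and not fp_b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def find_tc_cliffs(compounds, tc_threshold=0.55, delta_pki_threshold=2.0):
    """Return compound pairs meeting both the similarity and potency criteria.

    compounds: dict mapping a compound name to (fingerprint, pKi).
    tc_threshold is an illustrative, subjective choice (see text).
    """
    cliffs = []
    for (name_a, (fp_a, pki_a)), (name_b, (fp_b, pki_b)) in combinations(
        compounds.items(), 2
    ):
        tc = tanimoto(fp_a, fp_b)
        delta = abs(pki_a - pki_b)
        if tc >= tc_threshold and delta >= delta_pki_threshold:
            cliffs.append((name_a, name_b, tc, delta))
    return cliffs


# Hypothetical toy data: bit positions and pKi values are invented for illustration.
toy_compounds = {
    "cpd_1": (frozenset({1, 2, 3, 4, 5, 6, 7, 8}), 8.5),
    "cpd_2": (frozenset({1, 2, 3, 4, 5, 6, 7, 9}), 6.0),
    "cpd_3": (frozenset({20, 21, 22}), 5.0),
}
toy_cliffs = find_tc_cliffs(toy_compounds)
```

Because both the fingerprint and the Tc threshold are free parameters, the same compound set can yield different AC populations under different settings, which is the motivation for the consensus and substructure-based criteria discussed in the text.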
The substructure-based classification scheme depicted in Figure 1 establishes different categories of ACs that are chemically intuitive, including "chirality cliffs" where compounds differ only by the configuration at a single stereocenter but display large potency differences. The study of such chirality cliffs (or chiral cliffs) capturing subtle structural differences was recently further extended by systematically evaluating whether pairs of enantiomers tested in the same assay formed an AC or not. 11 In this study, chiral cliffs were represented by a variety of chirality-sensitive descriptors and subjected to machine learning to predict chiral cliffs and distinguish them from pairs of enantiomers that did not form ACs. 11 Another substructure-based similarity criterion for ACs is provided by the formation of matched molecular pairs (MMPs), 12 i.e., pairs of compounds that are only distinguished by a chemical modification at a single site. 13 MMPs can be efficiently identified algorithmically in large compound databases. 13 The size of chemical modifications can be restricted to mostly limit compounds forming MMPs to structural analogs typically generated in medicinal chemistry. 12 Application of this similarity criterion led to the introduction of MMP-based ACs that were termed "MMP-cliffs". 12 For MMP generation, algorithmic fragmentation of exocyclic single bonds can also be replaced with bond fragmentation according to retrosynthetic rules, giving rise to the formation of retrosynthetic MMPs (RMMPs) 14 and "RMMP-cliffs" (in analogy to MMP-cliffs), as further discussed below. By definition, MMPs and RMMPs are characterized by substitutions at a single site. Hence, MMP- and RMMP-cliffs cannot contain different R-groups at multiple sites, although analogs in lead optimization series often differ at multiple sites. Recently, a computational method has been developed for systematically identifying analog series with single or multiple substitution sites in compound data sets. 
This approach, termed the compound-core relationship (CCR) method, 15 can be used to enumerate analog pairs (APs) from a given series with single or multiple substitution sites, hence providing an extension of the (R)MMP formalism to multiple points of chemical modification (R-group replacements) 15 and another similarity criterion for AC formation.

ACS Omega
Mini-Review
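The single-site similarity criterion underlying MMP-cliffs can be illustrated with a small indexing sketch. It assumes that each compound has already been fragmented into (core, substituent) pairs by cutting single bonds, a step that requires a cheminformatics toolkit and is not shown. Compounds sharing a core but carrying different substituents form an MMP, and an MMP exceeding the potency difference threshold forms an MMP-cliff. The function name, fragment strings, and potency values are hypothetical, and the size restriction on exchanged substituents mentioned in the text is omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations


def find_mmp_cliffs(fragmentations, potencies, delta_pki_threshold=2.0):
    """Identify MMP-cliffs from precomputed single-cut fragmentations.

    fragmentations: iterable of (compound_id, core, substituent) tuples.
    potencies: dict mapping compound_id to pKi.
    Returns a sorted list of (compound_a, compound_b, core, delta_pKi).
    """
    # Index compounds by their core so that MMP partners share a key.
    by_core = defaultdict(list)
    for compound_id, core, substituent in fragmentations:
        by_core[core].append((compound_id, substituent))

    cliffs = {}
    for core, members in by_core.items():
        for (cpd_a, sub_a), (cpd_b, sub_b) in combinations(members, 2):
            if cpd_a == cpd_b or sub_a == sub_b:
                continue  # same compound or no structural change at this site
            delta = abs(potencies[cpd_a] - potencies[cpd_b])
            if delta >= delta_pki_threshold:
                pair = tuple(sorted((cpd_a, cpd_b)))
                # Keep the first core found if a pair is reachable via several cuts.
                cliffs.setdefault(pair, (pair[0], pair[1], core, delta))
    return sorted(cliffs.values())


# Hypothetical toy data for illustration only.
toy_fragmentations = [
    ("cpd_A", "[*]c1ccc(C(N)=O)cc1", "[*]C"),
    ("cpd_B", "[*]c1ccc(C(N)=O)cc1", "[*]Cl"),
    ("cpd_C", "[*]c1ccc(C(N)=O)cc1", "[*]OC"),
    ("cpd_D", "[*]c1ccncc1", "[*]C"),
]
toy_potencies = {"cpd_A": 8.7, "cpd_B": 6.1, "cpd_C": 8.2, "cpd_D": 5.0}
toy_mmp_cliffs = find_mmp_cliffs(toy_fragmentations, toy_potencies)
```

In this toy set, cpd_D shares no core with the other compounds and therefore cannot form an MMP, regardless of its potency.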
2.2. From Two to Three Dimensions. ACs can not only be defined using molecular graph-based (2D) similarity but also on the basis of three-dimensional (3D) structures. To these ends, bound ligands in complex structures with a given target protein must be spatially aligned based upon protein superposition and 3D similarity is calculated taking conformational and positional differences between ligands into account. 16 For this purpose, numerical 3D similarity functions are available. 4 Then, if ligands with similar binding modes exceed a predefined potency difference threshold, "3D-cliffs" are obtained. As illustrated in Figure 2, compounds forming 3D-cliffs often (but not always) display characteristic interaction differences in experimental structures that are likely to explain differences in their potency. As such, 3D-cliffs provide informative examples for structure-based compound design and excellent test cases for the calibration of computational methods to calculate binding energies.
In 2015, a systematic analysis of publicly available X-ray structures of ligands in complex with human targets identified 630 3D-cliffs for which high confidence activity data were available in the medicinal chemistry literature. 16 These ACs involved 61 human targets. MMP search in ChEMBL, 17 the major public repository of compounds and activity data for medicinal chemistry, identified a total of 1980 structural analogs of 268 3D-cliff compounds with activity against 50 human targets. 16 These analogs complemented 414 3D-cliffs and hence bridged between 3D- and 2D-ACs, providing ample opportunities for advanced SAR analysis.
3D-cliffs were also used to investigate the ability of different docking and scoring approaches to predict large potency differences between similar compounds. 18 From 158 complex X-ray structures of nine target proteins, 146 3D-cliffs were isolated and subjected to docking calculations. A target-based scoring scheme was established to evaluate 3D-cliffs and predict cliff partners for compounds with known or modeled binding modes. 18 Another study combined the analysis of corresponding 2D- and 3D-ACs using a database of more than 1700 X-ray structures of inhibitor complexes of 190 kinases. 19 Compound similarity was assessed on the basis of 2D fingerprints and similarity of crystallographic inhibitor-kinase interactions using molecular interaction fingerprints (IFPs), hence providing a complementary assessment of molecular and interaction similarity, leading to the introduction of "interaction cliffs". 19 Only ∼25% of 2D-ACs identified in this study also qualified as interaction cliffs, due to overall low interaction similarity detected with IFPs. However, interaction cliffs revealed ligand−target interaction "hot spots" and were proposed to aid in the design of new inhibitors, 19 thus widening the spectrum of 3D-cliffs.

CONCERTED FORMATION OF ACTIVITY CLIFFS
Originally, ACs were investigated on the basis of compound pairs, 2,3 consistent with how ACs were typically encountered during sequential compound optimization efforts. 5 However, in compound data sets, ACs are rarely formed by "isolated" pairs of compounds (i.e., pairs without structural neighbors forming ACs). Rather, most ACs (>90%) are formed by groups of structural analogs with varying potency, involving compounds in multiple ACs. 20 The formation of these "coordinated" ACs can be rationalized using AC network representations in which nodes represent compounds and edges represent pairwise AC relationships. 20 Figure 3 shows a representative example. In networks, coordinated ACs give rise to clusters of varying size and topology. These clusters reveal more SAR information than ACs considered as isolated pairs. AC clusters are frequently formed by a highly potent compound and multiple weakly potent analogs or vice versa, resulting in densely connected central nodes or "hubs" in AC networks. 20 ACs can also be visualized in "activity landscapes", which are generally defined as graphical representations that integrate compound similarity and potency relationships. 4 Hubs in AC networks correspond to "AC generators" in activity landscape models, which were defined as compounds forming ACs with high frequency. 21 AC generators identified using fingerprint Tanimoto similarity can be organized on the basis of scaffolds, and enrichment factors can be calculated that account for the proportion of ACs containing a particular scaffold, 21 thereby combining numerical and substructure-based similarity assessment.

Figure 3. Activity cliff network. The RMMP-based AC (RMMP-cliff) network for adenosine A1 receptor ligands is shown. Nodes represent compounds and edges represent pairwise ACs. Highly and weakly potent cliff partners are colored green and red, respectively, while yellow nodes represent compounds that are highly and weakly potent partners in different ACs. For two exemplary clusters (I, II) from the network, compound structures are displayed, and pKi values are reported. Structural differences between AC-forming compounds are highlighted using orange circles. For ACs, a general potency difference threshold of ΔpKi ≥ 2 was applied.

Figure 4. Potency difference criterion for target set-dependent activity cliffs. For an exemplary target set (serine/threonine-protein kinase PIM2 inhibitors), the potency value (pKi) distribution of all compounds (CPDs) is shown on the left (each dot represents a CPD). In addition, potency differences (ΔpKi) between CPDs forming analog pairs (APs) are reported (middle, each triangle represents an AP). The red dashed line indicates the potency difference threshold value for the target set-dependent AC definition (ΔpKi ≥ 2.26). This threshold is calculated as the mean plus two times the standard deviation (SD, δ) of the ΔpKi distribution. Red triangles represent target set-dependent ACs. On the right, two exemplary ACs are shown.
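The network view of coordinated ACs can be sketched as follows. Compounds are nodes, pairwise ACs are edges, AC clusters correspond to connected components, and hubs are nodes with many AC partners. This is a minimal sketch in plain Python. The function names and the degree cutoff used to flag hubs are assumptions made here for illustration, and the compound identifiers are hypothetical.

```python
from collections import defaultdict


def build_ac_network(ac_pairs):
    """Build an undirected adjacency map from pairwise AC relationships."""
    adjacency = defaultdict(set)
    for cpd_a, cpd_b in ac_pairs:
        adjacency[cpd_a].add(cpd_b)
        adjacency[cpd_b].add(cpd_a)
    return adjacency


def ac_clusters(adjacency):
    """Return AC clusters as connected components (sets of compounds)."""
    visited = set()
    clusters = []
    for start in adjacency:
        if start in visited:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        visited |= component
        clusters.append(component)
    return clusters


def ac_hubs(adjacency, min_degree=3):
    """Return compounds participating in at least min_degree ACs.

    min_degree is an illustrative cutoff, not a value taken from the literature.
    """
    return {
        cpd: len(partners)
        for cpd, partners in adjacency.items()
        if len(partners) >= min_degree
    }


# Hypothetical example: one coordinated cluster around cpd_1 and one isolated pair.
toy_pairs = [("cpd_1", "cpd_2"), ("cpd_1", "cpd_3"), ("cpd_1", "cpd_4"), ("cpd_5", "cpd_6")]
toy_network = build_ac_network(toy_pairs)
toy_clusters = ac_clusters(toy_network)
toy_hubs = ac_hubs(toy_network)
```

A cluster of size two with no further neighbors corresponds to an "isolated" AC in the sense used above, whereas larger components represent coordinated ACs.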

RELEVANT POTENCY DIFFERENCES
While similarity criteria for AC formation have been intensely investigated, as discussed above, comparably little attention has been paid to determining the most relevant potency differences. In general, as potency measurements for AC assessment, assay-independent equilibrium constants (Ki values) are strongly preferred over assay-dependent measurements (such as IC50 values). 4 Although it is possible to represent ACs as a continuum of compound pairs with increasing potency differences, 22 most studies have used a predefined and constant potency difference threshold across different compound activity classes (also termed target sets). 4,5 A potency difference threshold of 2 orders of magnitude (100-fold) has often been applied, 12,18 although a threshold of only 1 order of magnitude (10-fold) has also been used. 11,19 A constantly applied potency difference threshold is convenient for computational identification and analysis of ACs across different target sets but does not take target set-dependent differences in potency value distributions into account. 23 However, these distributions vary significantly across target sets, in which AC formation is strongly influenced by the interplay between potency value distributions and structural relationships. 23 Therefore, ACs have also been defined on the basis of variable target set-dependent potency difference thresholds. 24 For a given target set, it is first investigated whether compound potency variations are large enough to yield ACs; 23,24 then, all qualifying RMMPs or APs are identified and their potency differences determined. Finally, the set-dependent potency difference threshold for AC formation is calculated as the mean of the compound pair-based potency difference distribution plus two standard deviations. 24 Figure 4 provides a representative example.
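The set-dependent threshold described above (mean of the pair-based potency difference distribution plus two standard deviations) can be computed directly. The sketch below uses the population standard deviation from the Python standard library. Whether a population or sample estimate should be used is an assumption made here, as are the function names and the toy values.

```python
from statistics import mean, pstdev


def set_dependent_threshold(delta_pki_values):
    """Mean plus two standard deviations of the pairwise delta-pKi distribution.

    Assumption: the population standard deviation (pstdev) is used.
    """
    return mean(delta_pki_values) + 2 * pstdev(delta_pki_values)


def set_dependent_cliffs(pair_deltas):
    """Return (threshold, set of pairs whose delta-pKi meets the threshold).

    pair_deltas: dict mapping a compound pair (RMMP or AP) to its delta-pKi.
    """
    threshold = set_dependent_threshold(list(pair_deltas.values()))
    cliffs = {pair for pair, delta in pair_deltas.items() if delta >= threshold}
    return threshold, cliffs


# Hypothetical target set: nine pairs with small and one pair with large potency difference.
toy_pair_deltas = {("cpd_0", f"cpd_{i}"): 0.5 for i in range(1, 10)}
toy_pair_deltas[("cpd_0", "cpd_10")] = 3.0
toy_threshold, toy_set_cliffs = set_dependent_cliffs(toy_pair_deltas)
```

For the toy values, the mean is 0.75 and the standard deviation is 0.75, giving a threshold of 2.25 pKi units, so only the pair with a difference of 3.0 qualifies. A narrower potency difference distribution would lower the threshold, which is how this criterion adapts to individual target sets.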
In a systematic analysis of ChEMBL (release 23), 212 target sets with available Ki measurements and qualifying potency value distributions yielded a total of 16 096 set-dependent RMMP-cliffs. 24 Most target set-dependent potency difference thresholds fell into the range 1 ≤ ΔpKi ≤ 2.5. For comparison, when a general potency difference criterion of ΔpKi ≥ 2 was applied, 11 773 RMMP-cliffs originating from 195 target sets were obtained. 24 Hence, the target set-dependent potency difference criterion yielded more ACs in more target sets than a generally applied potency difference threshold of comparable magnitude, as well as a more balanced distribution of ACs across target sets. The increase in the number of set-dependent ACs was often due to AC formation in different compound subsets, which also further increased the amount of AC-associated SAR information.

Figure 5. Evolution of activity cliff definitions. Exemplary first, second, and third generation ACs are shown. In each case, the similarity criterion for AC definition is given (fingerprint-based, formation of (R)MMPs or APs) and the applied method is specified (Tc, random or rule-based fragmentation, CCR). Compound potency (pKi) values are provided and structural modifications in ACs are highlighted in red.

DIFFERENT GENERATIONS OF ACTIVITY CLIFFS
On the basis of alternative similarity and potency difference criteria, as discussed above, we distinguish between different generations of ACs, as illustrated in Figure 5, which reflects the evolution of the AC concept in our work. The first generation of ACs was based upon fingerprint descriptors, numerical Tanimoto similarity, and predefined potency difference thresholds. 3 The transition to second generation ACs was marked by the application of substructure-based similarity criteria such as the formation of MMPs 12 or RMMPs 23 and the use of general or target set-dependent potency difference criteria, leading to the introduction of MMP-cliffs 12 and RMMP-cliffs, 24 respectively. These types of ACs contained a single substitution site, consistent with the underlying MMP formalism. Compared to numerical similarity measures, the use of MMPs/RMMPs often increased the interpretability and medicinal chemistry relevance of ACs and set-dependent potency difference criteria further increased their SAR information content.
Third generation ACs, as introduced recently, 25 are exclusively based on variable target set-dependent potency difference thresholds and depend on APs enumerated from computationally identified or already available analog series 15 such that single and multiple substitution sites are taken into account, as illustrated in Figure 6. Thus, application of this similarity criterion has led to the introduction of single-site and multisite ACs. A systematic search in ChEMBL (release 24.1) revealed that only 25.6% of currently available third generation ACs are multisite cliffs. The 4205 multisite ACs include 3805 (90.5%) dual-site ACs, 25 as depicted in Figure 6.
For 297 dual-site ACs, pairs of single-site analogs were identified that contained the individual substitutions of the dual-site ACs. Comparison of the potency of dual-site AC partners with single-site analogs identified redundant dual-site ACs, which were represented by single-site cliffs, as well as the presence of additive, synergistic, and compensatory potency effects in confirmed dual-site ACs. 25 Figure 6 illustrates a synergistic effect of individual substitutions.
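The comparison of dual-site AC partners with their single-site analogs can be illustrated with a simple four-compound scheme. The classification rules below are one plausible reading of the terms redundant, additive, synergistic, and compensatory. They are assumptions made for this sketch and are not the criteria of the cited study. The tolerance parameter, function name, and potency values are likewise hypothetical.

```python
def classify_dual_site_effect(pki_start, pki_site1, pki_site2, pki_both,
                              threshold, tolerance=0.3):
    """Classify the potency effect of two substitutions in a dual-site AC.

    pki_start: pKi of the weakly potent dual-site AC partner.
    pki_site1 / pki_site2: pKi of analogs carrying only one of the two substitutions.
    pki_both: pKi of the highly potent partner carrying both substitutions.
    threshold: potency difference threshold for AC formation.
    tolerance: assumed margin (in pKi units) within which effects count as additive.
    """
    d1 = pki_site1 - pki_start
    d2 = pki_site2 - pki_start
    d12 = pki_both - pki_start
    if d12 < threshold:
        return "not a dual-site AC"
    # Assumption: the dual-site AC is redundant if one substitution alone forms a cliff.
    if abs(d1) >= threshold or abs(d2) >= threshold:
        return "redundant"
    # Assumption: opposing single-site effects are labeled compensatory.
    if d1 * d2 < 0:
        return "compensatory"
    expected = d1 + d2
    if abs(d12 - expected) <= tolerance:
        return "additive"
    if d12 > expected:
        return "synergistic"
    return "less than additive"


# Hypothetical potency values with a set-dependent threshold of 1.49 pKi units.
toy_effect = classify_dual_site_effect(6.0, 6.5, 6.4, 8.0, threshold=1.49)
```

In the toy case, each substitution alone increases potency by about half a pKi unit, whereas both together increase it by two units, so the combined effect clearly exceeds the sum of the individual contributions.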

MONITORING ACTIVITY CLIFF POPULATIONS ON A TIME SCALE
Some new findings are reported to illustrate how AC populations evolve over time. In ChEMBL (release 25), 96 target sets were identified that grew over a period of 10 years (2009−2018) and contained at least 100 APs from different analog series in 2009. For each year, the number of third generation ACs per set was determined. Figure 7 shows the cumulative growth in the number of compounds, APs, and ACs. While the total number of third generation ACs available in all sets increased from 5563 to 17 539 over 10 years, the proportion of APs forming target set-dependent ACs remained essentially constant at 4.9%. Over time, there was only a slight increase in the proportion of dual-site and multisite ACs with three or more substitution sites relative to single-site ACs. Hence, ACs were formed at a constant rate over time in different target sets in the presence of stable compound structure-potency relationships, and single-site ACs dominated AC populations.
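The yearly monitoring of AC populations can be sketched for a single target set as follows. For each year, all APs available up to that year are collected, the set-dependent threshold (mean plus two standard deviations) is recalculated, and the proportion of APs forming ACs is reported. Recalculating the threshold for every year is an assumption made here, since this detail is not specified in the text. The function name and the toy records are hypothetical.

```python
from statistics import mean, pstdev


def yearly_ac_statistics(ap_records, years):
    """Track cumulative APs, ACs, and the AC proportion per year for one target set.

    ap_records: iterable of (year the AP became available, delta-pKi).
    years: ordered iterable of years to evaluate.
    Returns a list of (year, n_APs, n_ACs, proportion of APs forming ACs).
    """
    records = list(ap_records)
    rows = []
    for year in years:
        deltas = [delta for available, delta in records if available <= year]
        if len(deltas) < 2:
            rows.append((year, len(deltas), 0, 0.0))  # too few APs for a distribution
            continue
        # Assumption: threshold is recomputed from all APs available in that year.
        threshold = mean(deltas) + 2 * pstdev(deltas)
        n_acs = sum(delta >= threshold for delta in deltas)
        rows.append((year, len(deltas), n_acs, n_acs / len(deltas)))
    return rows


# Hypothetical records: each year contributes nine small and one large potency difference.
toy_records = [(2009, 0.5)] * 9 + [(2009, 3.0)] + [(2010, 0.5)] * 9 + [(2010, 3.0)]
toy_rows = yearly_ac_statistics(toy_records, [2009, 2010])
```

In this toy set, the number of APs and ACs doubles from one year to the next while the proportion of APs forming ACs stays the same, which mirrors the type of observation reported above in a purely schematic manner.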

CONCLUSIONS AND OUTLOOK
Herein, foundations of the AC concept and its evolution have been reviewed. ACs are of interest in both computational and medicinal chemistry. They are explored on a case-by-case basis, especially during early stages of compound optimization, as well as on a large scale through systematic computational analysis of compound collections and activity data. In computational chemistry, there is also interest in predicting ACs using machine learning, and a few models have been reported so far, yielding promising prediction accuracy. 11,26,27 Most of our current knowledge about ACs has originated from compound data analysis. In growing compound data sets, ACs are formed at an essentially constant rate. Large numbers of ACs have been identified across current target sets, providing a wealth of SAR information for medicinal chemistry. In addition to applying molecular graph-based similarity measures, ACs can also be defined on the basis of compound binding modes and 3D similarity, providing interesting examples and test cases for drug design.

Figure 6. Single- and dual-site activity cliffs. Shown are third generation ACs with a structural modification at a single site (single-site AC, left) or modifications at two sites (dual-site AC, right) that are formed by kinase inhibitors given a target set-dependent ΔpKi threshold of 1.49. For the dual-site AC, single-site analogs were identified that revealed a synergistic effect of the two substitutions contributing to AC formation, as illustrated by the potency value comparison of the four analogs at the right.
Several factors influence AC analysis. First and foremost, compound similarity and potency difference criteria are of critical importance for the definition, perception, and utilization of ACs. Clearly, alternative AC definitions change AC populations and SAR information associated with them. Another critical aspect of AC analysis is that experimental accuracy and measurement characteristics must be carefully considered. For example, AC analysis is not reliable if incompatible measurement types such as Ki and IC50 values are used in combination. In general, high-confidence activity data are strongly preferred.
A recent conceptual advance has been the introduction of target set-dependent ACs, which take differences in potency value distributions into account and are based on set-dependent potency difference thresholds. These ACs best reflect SAR characteristics of different compound classes and data sets. The evolution of the AC concept is well illustrated by distinguishing different generations of ACs and their characteristics, as discussed herein.
The AC concept is subject to further extensions. For example, an interesting task will be the evaluation of ACs during sequential optimization efforts, considering alternative compound series in parallel. Since most large-scale analyses of ACs have focused on data sets from heterogeneous sources, the frequency with which ACs occur during series-based compound optimization is currently unknown. A thorough analysis of this topic has thus far been precluded by limited public availability of "real life" lead optimization series from drug discovery. Another interesting area for future research will be the use of the AC knowledge base to better understand structural features that determine compound potency across different target sets. This also provides further opportunities for machine learning and significance analysis of molecular features that determine successful predictions. Clearly, despite progress made in rationalizing ACs over the years, we are just beginning to explore their potential for practical applications. There is more to come.