Machine-Learning-Assisted Construction of Ternary Convex Hull Diagrams

In the search for novel intermetallic ternary alloys, much of the effort goes into performing a large number of ab initio calculations covering a wide range of compositions and structures. These are essential to building a reliable convex hull diagram. While density functional theory (DFT) provides accurate predictions for many systems, its computational overheads set a throughput limit on the number of hypothetical phases that can be probed. Here, we demonstrate how an ensemble of machine-learning (ML) spectral neighbor-analysis potentials (SNAPs) can be integrated into a workflow for the construction of accurate ternary convex hull diagrams, highlighting regions that are fertile for materials discovery. Our workflow relies on using available binary-alloy data both to train the SNAP models and to create prototypes for ternary phases. From the prototype structures, all unique ternary decorations are created and used to form a pool of candidate compounds. The SNAPs ensemble is then used to prerelax the structures and screen the most favorable prototypes before using DFT to build the final phase diagram. As constructed, the proposed workflow relies on no extra first-principles data to train the ML surrogate model and yields a DFT-level accurate convex hull. We demonstrate its efficacy by investigating the Cu–Ag–Au and Mo–Ta–W ternary systems.


I. INTRODUCTION
Systematic materials design aims to develop methods that can help to accelerate the discovery of compounds with tailor-made properties, fit for certain applications.The large investment in the area, not least through the materials genome initiative [1], underpins the importance of searching for novel compounds to bolster technological progress.Atomistic simulations provide a suitable pathway to achieve this goal, since the search can be performed systematically, at low cost and with a complete control over structure and composition.Density functional theory (DFT) calculations are notably used to predict material properties in silico, such as material stability or elastic responses.By performing property predictions across a large range of prototype structures, in the form of high-throughput studies [2], novel magnetic [3], high-hardness [4] and battery materials [5] have been discovered.Extensive databases, grouping large numbers of such calculations, have been created and are open to the community.These include AFLOWlib [6], Materials Project [7], OQMD [8] and NOMAD [9].While such studies remain faster than experimental investigations, the composition and structural spaces to be searched are incredibly large, limiting the scope of application of pure DFT workflows.Importantly, such limitation in sampling capacity becomes increasingly critical as the number of elements per compound grows, despite the anticipation that a majority of newly discovered compounds being highly multi-elemental [10].In order to address this issue and to harness the data available from existing ab-initio calculations, machine learning (ML) has proved to be a very powerful tool, as it typically comes at a fraction of the DFT computational cost.
The first step in high-throughput computational studies consists in identifying stable compounds by finding a stoichiometry and an associated structure that can be formed.In order to assess the stability of a given structure, the appropriate convex hull diagram needs to be calculated.The proximity between a compound's enthalpy of formation, ∆H f , and the closest tie-line on the convex hull serves as a criterion for evaluating its stability.Lower values indicate a higher likelihood of stability.Threshold values, typically up to ∼100 meV/atom, are used as stability cut-offs [11].In order to speed up electronic structure methods such as DFT, one possibility is to predict this quantity directly by using ML models, where compounds' compositional and structural information is encoded and mapped directly onto ∆H f .This is otherwise known as composition prediction, as it is used to identify which stoichiometries are stable by fixing structural variations.The ML models typically used include neural networks, kernel ridge regression and random trees, while the training data are often taken from OQMD, AFLOWlib or Materials Project.For instance, models where the feature vector is only based on compositional information have been used to predict the stability of compounds forming a set prototype structures (elpasolites, perovskites, heuslers, etc.), which is fixed for the compounds in the training set [11][12][13][14].Including structural information in the definition of a model mainly improves the predictions, if large training datasets (>10 5 data points) are used [15].Graph convolutional neural networks [16][17][18] have notably been used to predict convex hull distances accurately and benefit greatly from structural features [19,20].Note that these can also be constructed with compositional information only [21,22].One downside to the inclusion of structural information in the models is that the optimised structure is not known prior to the search, so that data for unrelaxed structures has to be used.This can notably be corrected by using machine-learning interatomic potentials (MLIAPs), that are capable of performing relaxations.
MLIAPs combine atomic fingerprints, representing individual atomic environments in the form of feature vectors, with ML algorithms, and effectively map the potential energy surface of a collection of atoms [23].The last decade has seen an immense expansion of the development and application of such potentials [24][25][26][27][28][29].When trained using active learning, MLIAPs have most notably been able to extend the length and timescales of ab-initio molecular dynamic simulations by several orders of magnitude [30][31][32].Such potentials have been successfully applied to predict the energy and forces of alloys [33], and have been used to accelerate and assist the construction or further exploration of binary and ternary convex hulls.Workflows built on these potentials use MLIAPs as surrogate models to first relax and then make energy predictions on a large library of prototype structures.The lowest energy structures are then compared to a reference convex hull obtained from DFT calculations.This process allows one to improve the reference convex hull diagram by identifying structures lying below it.The training of such potentials is crucial for adequate performance, and studies insist on using high-energy structures for the relaxations to be reliable.
Work in this area has broadly been split into two categories.In the first, specific MLIAPs are trained for a given system [34][35][36][37][38][39], typically using active learning.In the other, MLIAPs are trained on large generic databases, and are used to scan over many phases [40,41].The former is more accurate than the latter, but it is not transferable to other phases.Due to their higher accuracy, phase-specific MLIAPs can also be regarded as global structure optimisers, in that not only they can be used to identify specific stable compositions, but they can accurately predict their structure as well.Many other ML global structure optimisers, exist, either in the form of novel workflows [42][43][44][45] or by inserting MLIAPs into the pre-existing state-of-the-art global structure optimisers [46][47][48].
In this work, we demonstrate how a MLIAP can be trained on data readily available on a mainstream repository, such as AFLOWlib [6], and used to screen a library of ternary-alloy prototypes constructed from their associated binary systems.Recently, we have shown [49] that an ensemble of spectral neighbor-analysis potentials (SNAPs) [26] models, trained on the energy data of the three binary subsystems associated with a ternary one, was able to predict the energies of ternary compounds with a mean absolute error (MAE) of ∼30 meV/atom, as long as the structures were fully relaxed.This, not only provides a fast energy-screening tool for ternary compounds, which only requires existing ab-initio data on binary structures, but it also gives the valuable insight that chemical environments within binary and ternary transition-metal alloys are similar.Such observation is at the heart of the workflow introduced here.A selection of binary structures, those close to their respective convex hull tie-plane, are selected as templates for ternary alloys.In a high-throughput set up, these are screened using an ensemble of SNAP models, trained on binaries.The lowest-energy compounds are then selected as the most promising candidates, and their energies are calculated using high-fidelity DFT.The ternary convex hull is thus updated.
What makes this workflow different to tailor-made MLIAPs used for convex hull construction is that all the data, both for the prototypes generation and for training the SNAPs, are taken from the relevant binary phases of the AFLOWlib database [6].In other words, there is no need to generate any new data for the purpose of training the MLIAPs.Despite its training database not being specifically made, either by including important configurations through physical intuition or through active learning, it still has a low enough error on energy predictions to enable a high-throughput search of novel alloys.This is because stable binary and ternary phases, at least for the materials class of transition-metal intermetallics investigated here, share similar local atomic environments.In some sense, the workflow enables an interpolation of the data already available in AFLOWlib, to scan ternary convex hulls and identify stable compositions.Since only a few high-energy structures and no outof-equilibrium configurations are included in the SNAP training dataset, additional features are introduced in the workflow to increase the robustness of the predictions.These include constraints on the SNAP-driven relaxation (constant volume and the inclusion of a maximum number of steps) as well as using an ensemble of models.
In this paper, the workflow used to generate novel ternary compounds is presented.The methodology Section II details how ternary prototype structures are built from their binary counterparts, and how binary compound data from AFLOWlib [6] is used to train an ensemble of SNAP [26] models.Such SNAPs are used to relax and screen the ternary prototypes.Then, the results section III presents how the workflow is used to find stable phases for the Cu-Ag-Au and Mo-Ta-W ternary systems.The so-constructed convex hull diagrams are finally compared with their available AFLOWlib counterparts, and conclusions are drawn in Section IV.

II. METHODS
The general methodology of our workflow, schematically introduced in Figure 1, is here described in detail.From the AFLOWlib database of binary compounds and their associated DFT-computed energies, an ensemble of SNAP models are trained.A subsection of these structures, the ones with lowest enthalpy of formation, are also used as parent structures and form a library of prototypes.Note that here, as is standard practice in many DFT-based convex hull constructions, a compound's enthalpy of formation is approximated solely by its groundstate DFT energy, so that the terms enthalpy and energy will be considered equivalent throughout the rest of the manuscript.Candidate ternaries are then created, by generating all possible and unique derivative structures of such prototypes [50] at a fixed composition [51].These two parts are then combined as the ensemble of SNAP models are used to relax the candidate ternaries.The final energies are predicted from cross-validation among the ensemble of models, while the standard deviation of the predictions is also used to detect and remove geometries for which the relaxation has failed.The resulting structures with lowest energy and standard-deviation are selected as best candidates (closest to the convex hull).Finally, full ab-initio relaxation is performed for these.The ternary system Cu-Ag-Au is used to develop the methodology and is hence employed here as an example in each subsection.
FIG. 1.The stable ternary compounds search workflow implemented in this work.Data available from the three binary systems associated with a ternary one (box in the upper left corner) is used for two tasks: i) the training of an ensemble of SNAP models, and ii) the construction of a library of parent prototype structures.Derivative structures of the latter are created and all possible ternary decorations of these are produced.Each of these are then relaxed with the SNAPs model and the lowest-enthalpy structures are screened.

A. Generating Prototypes
The first step of the workflow consists in creating a suitable library of ternary prototype materials.The driving idea of this work is that the local atomic environments seen in binary intermetallic alloys are similar to those in the associated ternaries, especially for structures close to equilibrium [49].This leads to choosing the binary structures as prototypes for the ternaries.More specifically, the bottom of the three binary convex hulls associated to a ternary system (in our example Ag-Au, Cu-Ag and Cu-Au for Cu-Ag-Au) are scanned to select the lowest-enthalpy compounds.Those within a certain energy range from the convex hull are then selected.All binary structures considered here are taken from the AFLOWlib database [6].The threshold energy selected differs depending on the system at hand, such as to ensure that roughly the same number of structures are taken from each binary diagram.For instance, in our test system Cu and Ag are immiscible [52].Therefore, all binaries have positive enthalpy of formation and lie far from the Cu-Ag tie line between the two elementary phases (fcc Cu and Ag).As a consequence, the energy window above the hull for this binary is larger than that of the other two.The convex hulls of Ag-Au and Cu-Ag are compared in Figure 2 and Table I gives the energy window used as well as the number of structures selected for each binary system.Here we also report the energy window, ∆E, above the convex hull used for the selection.
Once the prototypes are selected, the constituent atoms are stripped of their chemical identity and all structures are compared using the AFLOWlib symmetry tool [53] in order to curate redundancies.This is necessary, since certain structure types (such as fcc or bcc) may be present several times in the collected database, but may be "decorated" in several different ways for different stoichiometries and binaries.All structures are reduced to their primitive cell at this stage.It is also important to note that single-element structures are also included in this analysis.This leads to a library of unique, undecorated prototypes, taken from the binary convex hulls.For the Cu-Ag-Au system, this results in 40 prototypes.

FIG. 2. Convex hull diagrams of the binary Ag-Au (left) and
Cu-Ag (right) binary systems.The convex hull is defined by the bottom black lines, while the green shadowed regions up to the upper full lines show the energy windows chosen to select the binary structures.
From this set of prototypes, ternary alloys are generated.This task is performed at a fixed stoichiometry and for cells up to a maximum number of atoms, N max .For all the prototypes with a number of atoms compatible with the fixed stoichiometry, the set of all unique derivative structures are created, by following the procedure introduced in references [50,51,54] and by using the associated open source Enumlib code.The initial implementation of the algorithm begins from a parent lattice and uses group theory to efficiently enumerate all the unique ways to occupy the sites of supercells constructed from that lattice [50].Further modifications of the scheme allow for the starting structures to be defined by a lattice and an atomic basis (multi-lattices) [54] and for the generation of derivative geometries at a fixed stoichiometry [51].This completes the first step of the workflow (bottom branch in Figure 1) and leads to a set of unique ternary compounds, inspired by the structures of the binaries.The energy of these is then screened using a MLIAP.

B. Ensemble of SNAP models
MLIAPs typically assume that the total energy, E, of a N -atom system defined by the coordinates r N can be written as a sum of atomic energies E i , Such a partition, first proposed by Behler and Parinello [25], is based on the principle of near-sightedness [55,56].
The MLIAP of choice for this work is SNAP [26], which has been proved to perform well regardless of the nature of the chemical bond [57].In this model, the total energy of a compound is written as a sum of linear combinations of the feature vectors describing the chemical environments of each atom i of type α i in the system.SNAP then takes the bispectrum components, B αi i , as feature vectors.The function, E SNAP , that returns the SNAP-predicted total energy is thus defined as where β αi 0 and β αi are the species-dependent linear coefficients of the ML model.Further details on this potential can be found in Section V. SNAP's linear form allows one to obtain good performance with a small number of features, 56 per species in our case, and when trained on small datasets (≤ 10 3 data points) [58].Furthermore, the SNAP hyperparameters are easy to optimise, since the range of optimal values for J max (the maximum angular momentum of the bispectrum) and r cut (the cut-off radius) is wide and consistent for accurate performance [26,33,59].In our experience, the optimisation of the atomic weights, although generally useful, only leads to modest improvements [49].
As for our previous study, an ensemble of SNAP models is used to increase the robustness of the predictions.Furthermore, this provides a mean of estimating the prediction uncertainty [49].The ensemble is defined as a set of K functions, E k SNAP K k=1 , where each SNAP model, E k SNAP , is trained differently, and hence has different linear coefficients.The predicted energy of a new system with atomic positions, r N , is defined as the mean prediction of the models, Ē, and its uncertainty is estimated from the standard deviation, σ, of Ē namely, The training data only consists of binary alloys obtained from the AFLOWlib database, whose energy have been re-computed for consistency by single-point DFT calculation (no further relaxation is performed).Differently from what was done when generating the prototypes, here all binaries, no matter their distance from the convex hull, are included in the training dataset.The same workflow is also used for Mo-Ta-W (the results are described in section III), for which we demonstrate that the energy values taken directly from AFLOWlib are suitable for training the SNAP models.The full details on the Cu-Ag-Au binary subsystems can be found in reference [49].
Previously, 10 different SNAP models within the ensemble were obtained by training on different subsets of the same size of the binary-alloy database [49].For this work, 5 models are trained on the full database, but for each one, a different set of atomic weights for Cu, Ag and Au are used to compute the bispectrum components.This difference is motivated by the need to distinguish compounds with identical site positions in their structure (e.g. the sites of a bcc supercell), but different atomic site occupations.If the atomic weights for all species are identical, for some of these structures, notably the high symmetry ones, SNAP will predict identical energies for different compounds.This is illustrated in Figure 3 for two distinct compounds obtained as bcc derivative structures with Cu 1 Ag 1 Au 2 composition.Prototypes A and B only differ by a permutation of the Ag and Cu atoms.Hence, SNAP models using identical weights for these two atomic types will fail to predict different energies for the compounds.Therefore, by construction, in the ensemble created the two elements in each pair of atomic types (e.g.Cu and Ag in Cu-Ag) have different weights in at least one model.
Before selecting the values for the atomic weights, r cut and J max are optimised manually and independently by using 10-fold Monte-Carlo cross-validation for fixed identical weights, {1, 1, 1} and thus find: r cut = 3.5Å and J max = 4.For these values, the optimal atomic weights are set by performing a grid search, with the same crossvalidation method, and where all three atomic weights are varied from -5 to 5 in steps of 1. Provided the condition above is fulfilled, the sets of weights used for the SNAP models in the ensemble are chosen to minimise the cross-validation root-mean squared error (RMSE).The training and cross-validation errors for each model of the ensemble are given in Table II.The ensemble is then used to predict which of the prototype structures have the lowest enthalpy.

C. Energy Screening
The final aim of the workflow is to suggest low-energy ternary structures at a given stoichiometry.Since many compounds with a large energy spread are screened, the suggestions made need to be accurate (must include lowenergy structures) and reliable (must not include highenergy and unphysical systems).While the energy error of the SNAP surrogate model is low, it is still prone to making poor predictions on new systems that do not resemble the structures seen in training.As a result, the construction of the workflow highly focuses on the robustness of the final predictions made.Note that choosing parent prototypes from binary compounds already increases the reliability of the predictions.
The first step in the energy-screening process consists in setting the compounds' unit-cell volume.This is chosen by taking the weighted average of the elemental volumes of the constituent atoms, an approximation that reproduces the results from ab-initio-relaxed compounds quite well, as is illustrated in Figure 4.Then, the volume and all lattice parameters are kept fixed during any relax-  ation driven by the SNAP models.This is because, while the training database includes a diverse set of structures, they are all at equilibrium, namely their forces and stresstensor elements are close to zero.Therefore, no configurations are strongly compressed or expanded, a fact that causes the SNAP models to perform poorly on the prediction of equilibrium volumes and lattice parameters.
The volume is only allowed to change for the final, most promising, structures selected for the DFT relaxation.Each of the available K SNAP models are used to drive an ionic relaxation, with a maximum of N s steps, for all of the prototypes, leading to K differently relaxed structures per prototype.A "cross-validated" energy prediction is given for each relaxed structure.Given a candidate obtained by relaxing a prototype with the k-th SNAP model, the energy prediction is made with K-1 models, namely all SNAPs bar the one used for the relaxation of the candidate at hand.The mean and standard deviation of the energy predictions of the K-1 models are then saved.For every prototype, one of the K relaxed structures obtained is selected, namely the one with lowest "cross-validated" standard deviation.This is the structure whose final total energy has received the largest consensus among the SNAP models.Therefore, there is only one relaxed structure per prototype.
The reason why this process is not a single ionic relaxation stems from the drive towards robustness of the predictions.Without the inclusion of the N s iteration cutoff, some of the relaxations would lead to structures that are trapped in unphysical local minima of the potential energy surface (PES) of the driving SNAP model.By stopping the relaxation process at a low number of steps (N s = 10 in this study), this effect is mostly avoided, as the structures cannot change too drastically.For the relaxations that are accurately driven by SNAP, the largest drop in energy typically occurs during the first few steps of the relaxation process.While accurate relaxations are also cut before convergence, as they are not distinguished from the inaccurate ones, the final structures are still lower in energy than the initial prototypes.This reduces the likelihood of obtaining high-energy structures and the total run time of the workflow remains modest.
Using "cross-validated" energy predictions of the re- laxed structures helps to remove the bias of specific SNAP models.The SNAP driving the relaxation usually predicts the final structure to be at an energy lower than the initial one, since it moves the geometric configuration to, at least, a local minimum of that particular SNAP's PES.If the relaxation is inaccurate, the resulting structure will be, in fact, high in energy (as predicted by DFT).The relaxation-driving SNAP model is, therefore, not used for energy predictions.Instead, the other models of the ensemble are, since they are less likely to present a bias towards that relaxed structure and therefore to predict it being low in energy.Note that since the structures used as training data are the same for all the K models, they could also lead to biases for the same structure.This is accounted for by using the "cross-validated" standard deviation, rather than the mean energy, to select the "best" structure out of the K relaxed ones.Indeed, even if several/all SNAP models are biased towards a particular structure and collectively predict it to have a low energy, the inaccurate predictions of each model, will be different.This is because they are inaccurate, extrapolated predictions.Hence, while the mean prediction of the ensemble may give a low energy value, the standard deviation will be large.Finally the "cross-validated" standard deviation prediction for all structures must be lower than a cut-off value, σ cut , to be considered for the final energy screening.This typically removes structures with a low SNAPpredicted energy but which DFT predicts to be high energy, as well as structures with high SNAP-predicted energy.From the sample of structures selected, the ones with the lowest "cross-validated" mean energy are chosen and relaxed with DFT.In this study, 15 structures per stoichiometry are selected through such process.
In summary, the workflow described creates a set of prototypes and uses an ensemble of ML potentials to relax and screen the structures, which are most likely to have low energy.This is done for a fixed stoichiometry.The final selected compounds are then recomputed with full DFT relaxation.The workflow, therefore, allows one to perform all the computationally intensive DFT calculations only on the most promising candidates.In the following section, this workflow will be used to reconstruct the ternary-alloy convex of Cu-Ag-Au and Mo-Ta-W.

III. RESULTS
This section, which is structured into two subsections, presents the key outcomes of our method.Firstly, we examine the performance of our workflow against the wellestablished and extensively studied Cu-Ag-Au phase diagram [60].Then, we provide a comparison between our results and those of one of the better-characterised phase diagrams available in AFLOWlib, specifically Mo-Ta-W.By benchmarking the workflow phase diagram predictions to the DFT created ones available in AFLOWlib, we gain valuable insights into the effectiveness of our approach.
In order to accurately evaluate the stability of our predicted prototypes and ensure consistency in our analysis, we have used the QHull [61] library to calculate the convex hulls presented in this work.The data used to construct the reference convex hull (a subset of the full database) was downloaded from the ground-state compounds calculated by AFLOWlib.In the case of the Cu-Ag-Au ternary system, in order to guarantee consistency, we have re-calculated the energies of these compounds with the Vienna Ab initio Simulation Package (VASP) [62].Throughout the entire process, we have strictly followed the AFLOWlib standards as outlined in reference [63], with an energy cut-off of 600 eV to ensure tight convergence.More information regarding the DFT calculations can be found in Section V A. In contrast, for Mo-Ta-W we directly use the AFLOWlib pre-computed energies.

A. Cu-Ag-Au ternary convex hull
In order to evaluate the performance of our workflow, it is essential to select a well-studied phase diagram that meets specific requirements.One key consideration is the availability of sufficient data to train an accurate MLIAP.In order to facilitate the identification and correction of any errors in the initial implementation of the workflow, it is also beneficial to choose a phase diagram that is relatively simple.With these criteria in mind, we chose the Cu-Ag-Au ternary system, a choice further supported by the fact that the MLIAPs for this phase diagram have already been optimized and trained in our previous work.[49].
As a proof of concept, we have focused on the equiatomic Cu 1 Ag 1 Au 1 ternary phase as well as phases with stoichiometric ratios of 2-1-1 and 2-2-1.The reason for this choice is that data at these stoichiometries are available in AFLOWlib for comparison.The results of the workflow are presented in Figure 5 and Table III.In order to quantitatively assess the stability of the structures proposed by the workflow, we use the distance, δ, from the reference convex hull (AFLOWlib).A negative value indicates that the predicted structure lies below the calculated convex hull, establishing its stability as an intermetallic compound.Then, the convex hull needs to be recalculated and corrected by taking into account the newly predicted stable structure.In contrast, a positive distance from the convex hull provides a criterion for assessing whether the structure is metastable or unstable.In Table III values predicted by the workflow (AFLOWlib) are labelled as δ WP (δ AFLOW ).FIG. 5. Workflow predictions for the Cu-Ag-Au ternary system across different stoichiometries.The graph presents the different compositions and their corresponding distance from the convex hull, ∆H f .The blue points are associated with the predictions from the proposed workflow, whereas the orange ones represent the lowest-energy AFLOWlib points.The dashed line (CH) marks the tie-plane position of the convex hull.The unit cell of the newly discovered crystal structure on the convex hull is presented as well.Here, Au atoms are in gold, Ag in silver, and Cu in bronze.The proposed workflow manages to identify one stable intermetallic phase among these, namely Cu1Ag1Au2.Furthermore, it manages to outperform the AFLOW dictionary method in all of the presented cases.
The scalability and speed of the algorithm allow us, in principle, to investigate more regions of the phase diagram, in a single study, than a pure DFT phase diagram construction scheme.This is exemplified by using the proposed workflow to predict structures that are not in AFLOWlib's database, namely compounds with 3-1-1 stoichiometry.The results of the benchmark are presented in Table III, alongside the crystal structure of the new stable phase, Cu 1 Ag 1 Au 3 , in Figure 6.
Our approach outperforms the AFLOW dictionary method in all cases, demonstrating a better predictive ca-  Workflow predictions for the Cu-Ag-Au ternary system with 1-2-2 and 1-1-3 compositions.The stoichiometries and their corresponding distance from the convex hull, δ WP , are presented.For the 1-2-2 compounds, the distance from the convex hull of the phases available in the AFLOWlib database, δ AFLOW , are also given.Note that for all materials, the distance from the AFLOWlib convex hull tie plane is used as reference.A new gold-heavy intermetallic, namely Cu1Ag1Au3 is predicted as stable (see Fig. 6).
FIG. 6.The unit cell of the 1-1-3 crystal structure found on the convex hull, namely Cu1Ag1Au3, is presented in both a top-view with respect to the z-axis (left), a side-view along the x-axis (middle) and a tilted view (right).In this structure, Au atoms are colored in gold color, Ag atoms in silver color, and copper Cu atoms in bronze.
pability, which arises from the exploration of a larger pool of prototypes.Interestingly, the structures predicted by the proposed workflow are consistently closer to the convex hull than those predicted by the AFLOW dictionary method.This is to be expected, since the workflow effectively selects the relevant structures for creating the pool of ternary candidates.Furthermore, our model consistently predicts structures with negative or almost negative (< 10 meV/atom) enthalpy of formation, a fact that gives us confidence in the reliability of the predicted structures.Notably, we have been able to identify two new gold-heavy stable phases, namely Cu 1 Ag 1 Au 2 and Cu 1 Ag 1 Au 3 .This indicates that stable intermetallic phases may exist on the gold side of the phase diagram.
We have confidence in our prediction, given the fact that the dictionary method structure for Cu 1 Ag 1 Au 2 is within 3 meV/atom of the convex hull, suggesting the possibility of the existence of a stable phase.This is consistent with the formation of the solid solutions in the gold-rich region of the experimental phase diagram [60].The rest of the structures are considered to be potentially metastable, with an average distance from the convex hull of around 30 meV/atom [64].Overall, our analysis demonstrates the ability of the workflow introduced here to predict structures closer to the convex hull than those from the state-of-the-art dictionary method and possibly uncover novel phases, should these exist.

B. Mo-Ta-W ternary convex hull
As a second benchmark, we wish to explore a phase diagram that exhibits a variety of stable phases.Thus, the major criterion for our selection, among all the possible transition-metal ternary combinations, is the total number of stable compounds.The Mo-Ta-W ternary system emerged as a good candidate, based on a search run with the AFLOW REST API [65].In fact, it exhibits the highest number of stable ternary phases of the entire database of transition metal alloys.In order to compare our proposed workflow with the dictionary method, we have made predictions corresponding to the same stoichiometries presented in the previous section.Furthermore, we have used our method to explore areas of the phase diagram poorly covered by AFLOWlib.
We now perform a similar analysis as that described in the previous section.The structure prototypes used for the element decoration are extracted from those of the binaries closest to their respective convex hulls.Then, an ensemble of ML models relax the created structures and order them based on their predicted energy.A set of 15 structures for each stoichiometry, corresponding to those with the lowest predicted energies, is sampled and proceeds to the next stage.This consists in performing a DFT relaxation and a static calculation for each one of these predictions.A significant difference with respect to the Cu-Ag-Au system is that we now use AFLOWlib's database to train the models without any further re-calculation.The AFLOW REST-API is used to download the energies and the crystal structures for the three binary convex hulls (Mo-W, Ta-W, and Mo-Ta).The models are trained as explained in the Methods section (see Section II).Recycling data already available on AFLOWlib allows us to avoid about 1,500 DFT relaxation calculations, some of them for cells up to 46 atoms, just for the training of the model.The results for the 1-1-1 and 1-1-2 compositions, those heavily populated in AFLOWlib, are presented first in Figure 7.
In this case as well we predict a new stable intermetallic phase, Mo 1 Ta 2 W 1 .However, this time our workflow is not consistently outperforming the dictionary method.In fact, for two out of the four stoichiometries investigated in Figure 7, we obtain compounds with energies similar to the ones already present in AFLOWlib, while for one, Mo 2 Ta 1 W 1 , our search delivers a compound with a higher energy.Interestingly, in this last case our newly found structure and the original one, contained in AFLOWlib, belong to different space groups.The AFLOW-predicted one has space group 107 (tetragonal), while our scheme finds a low-stmmetry monoclinic crystal structure with space group 9.The final geometries are not equivalent as determined by the AFLOW-SYM tool [53].Nevertheless, the compound discovered with the workflow only has an enthalpy of formation 14.91 meV/atom higher than the AFLOWlib compound.
As a force-field approach, our workflow gets better when the MLIAP improves.In this case, we have extracted the data used to train the SNAPs from the AFLOWlib repository, a detail that led to a less accurate force field than the one used for the Cu-Ag-Au system.In fact, minor inconsistencies in the energy data may generate errors in the force-field [66,67].That being understood, we have still demonstrated that new phases can be predicted by an almost DFT-free workflow, since our initial data for model training are readily available in the AFLOWlib database.The workflow systematically assesses a wide range of compositions and potential compounds.Specifically, it involves the evaluation of 331,734 ternaries based on their calculated SNAPs energies.Following this, the 15 lowest-enthalpy structures, for each stoichiometry, undergo relaxation through DFT.Interestingly, the DFT analysis reveals that, on average, the most stable compound ranks 7 th among the suggested options.Additionally, the ab-initio computations are shortened since all compounds move closer to their equilibrium geometry after the SNAP-guided relaxation, in contrast to their fully unrelaxed counterparts.
Perhaps a more accurate force field would also be able to find the AFLOWlib minimum for Mo 2 Ta 1 W 1 (see Fig. 7).Nevertheless, our workflow is already able to identify the majority of the structures close to the convex hull.It should also be noted that this is the phase diagram for which AFLOWlib's dictionary method works best, as it is able to detect four intermetallic phases, more than any other transition metal alloy phase diagram.
Then, we move to analyse stoichiometries poorly explored in AFLOWlib, namely 1-2-2 and 1-1-4.In Table IV we provide a comparison of the distance from the convex hull for the structures predicted with our method, δ WP , and the ones from AFLOWlib, δ AFLOW .For these compositions, the AFLOWlib compounds are unstable as they all have a positive enthalpy of formation.In contrast, those found by our workflow all have a negative enthalpy of formation and are found near or at the convex hull.These results provide a comparison between our method and AFLOWlib for structures predicted as unstable by the latter.Workflow predictions for the Mo-Ta-W ternary system with 1-2-2 and 1-1-4 compositions.The stoichiometries and their corresponding distance from the convex hull, δ, are presented (δ WP is for compounds generated by our workflow, while δ AFLOW is for the AFLOWlib compounds).Three intermetallic phases are predicted as stable and two others metastable.Surprisingly, our algorithm is able to find structures with an energy of up to 1 eV/atom lower than those identified by the dictionary method of AFLOWlib.
The ability of our workflow to consistently predict structures that are (i) close to the convex hull and (ii) have a negative enthalpy of formation is thus demonstrated.The former point means that we have an effective algorithm to use for structure search in regions of interest.The latter validates our physical intuition behind the assumption that the crystal structures of the binary alloys close to the convex hull can be used as a template for atomic decoration in the search for ternary phases.This approach has allowed us to identify three new intermetallic compounds, see Table IV, namely Mo 2 Ta 2 W 1 , Mo 1 Ta 2 W 2 , and Mo 1 Ta 1 W 4 .Such positive results demonstrate the value of the enhanced freedom in the structure search provided by our algorithm with respect to dictionary methods.
Finally, following the same spirit as for the analysis of the Cu-Ag-Au system, we now look into previously unexplored areas of the ternary convex hull.Our results for the 1-2-3 and 1-1-3 compositions are shown in Figure 8.As one can observe, together with structures away from the tie-plane, we also find two new stable compounds, namely Mo 1 Ta 2 W 3 and Mo 1 Ta 3 W 1 .Such new phases, together with the low-energy ones previously discussed, call for a modification of the ternary convex hull that exists in AFLOWlib.The new diagram is here presented in the top panel of Figure 9.In order to facilitate the comparison, the lower panel of the same figure shows the difference between the AFLOW-and our workflowpredicted convex hulls (positive values mean that our predicted convex hull is lower in energy than the original AFLOWlib one).
The new convex hull returns a picture, where most of the stable ternary structures identified belong to the Ta-W heavy area, and only one intermetallic alloy exists in the Mo-rich region of the compositional space.The latter is the compound found on AFLOWlib.Interestingly, the new phases predicted by our workflow undercut Mo 1 Ta 1 W 1 , Mo 1 Ta 2 W 1 and Mo 1 Ta 1 W 2 , the other intermetallic alloys initially predicted as stable by AFLOWlib.These are now 2.59, 10.95 and 1.31 meV/atom, respectively, above their associated tie-planes and have to be considered metastable.Experimentally, there is evidence that the Mo-Ta-W system forms a ternary solid solution [68] at finite temperature across the entire phase diagram.It should be noted that the Mo-Ta binary space is far better sampled by AFLOWlib than the Ta-W and Mo-W ones.This could imply that it is more difficult to reach the convex hull close to such facet of the diagram.In contrast, the Mo-W system only displays small enthalpies of formation for the stable binary phases, implying that both Mo and W form more stable phases with Ta than amongst themselves.These two reasons could explain why it is more difficult to find stable intermetallic phases in the Mo-rich part of the composition space.
FIG. 9. Workflow-computed convex hull for the Mo-Ta-W system (upper panel).The color heat map corresponds to the calculated enthalpy of formation at a given stoichiometry.In the lower panel, we present the difference between the convex hull of AFLOWlib (reference) and that computed by our workflow.Black crosses for the ternary region symbolize the newly predicted intermetallic phases, and the red cross denotes the only stable intermetallic originally predicted by the AFLOW dictionary method.

IV. CONCLUSION
We have developed a workflow that predicts the crystal structure and assesses the stability of ternary compounds of a particular stoichiometry.A library of prototype structures is formed from the lowest-enthalpy alloys of the associated binary subsystems.From this database, derivative ternary structures are generated by site decoration.Then, an ensemble of SNAP force-fields is used to select the most promising structures among them, bypassing the majority of the ab-initio calculations.Therefore, the proposed workflow highly increases the throughput in the crystal-structure search without compromising the quality of the predictions.This is used here to map the ternary convex hull of transition-metal alloys.The crucial aspect of the proposed scheme is that both the training of the force fields and the creation of the prototype ternary structures are based solely on the knowledge of the binary phases.As such, no additional DFT calculations are required, since both the structures and their corresponding energies are readily available on the AFLOWlib database.Employing ab-initio calculations solely in the final stage of the workflow and focusing them on the most promising candidates, allows us to perform a comprehensive exploration of the phase diagram of a ternary system with only a few hundred DFT calculations.This enables us to map previously unexplored portions of the ternary space and to identify regions of interest, thus driving the discovery of novel compounds.
We have demonstrated that the proposed workflow is able to predict crystal structures with negative enthalpy of formation and effectively identify the stable intermetallics, should they exist.In particular, we have used the Cu-Ag-Au and Mo-Ta-W ternary systems as an example.In the first case, we have predicted several new phases that, although not all thermodynamically stable, have an enthalpy of formation lower than those found by the AFLOW dictionary method.In addition, we have identified an Au-rich composition region, where stable intermetallic phases are expected, in accordance with the location of solid solutions in the experimental phase diagram [60].Interestingly, in the case of Mo-Ta-W, one of the ternary systems with the largest number of stable intermetallics in AFLOWlib, our method is capable of identifying a plethora of new phases, resulting in the correction of the original DFT-calculated convex hull proposed by AFLOW.
In summary, we have developed a novel way to integrate machine learning to accelerate a DFT workflow.Although the ML model introduced here does not perform as well as force-fields with tailor-made databases, its construction requires no new DFT calculations and simply recycles pre-existing results, already present on large-scale databases.This represents an example of how machine-learning interatomic potentials can be seamlessly integrated into a materials design pipeline without the need to generate ad hoc large training sets.

V. COMPUTATIONAL METHODS
The details of the computational methods are presented in this section.The parameters used for the DFT calculations run with VASP [62] are first discussed.A brief presentation of the SNAP [26] is then given, along with details of the implementations used for the current work.

A. DFT Calculations
All DFT calculations are performed using the Vienna Ab initio Simulation Package (VASP) [62], version 5.4.4.Projector augmented wave (PAW) pseudopotentials are used for each element together with the Perdew-Burke-Ernzerhof (PBE) functional [69].A plane wave cutoff of 600 eV is used for all calculations.The energy convergence criterion for each self-consistent cycle is of 10 −4 eV.Full atomic relaxations are performed (update of atomic positions, cell volume and lattice parameters) with a stopping criterion on the forces of 10 −3 eV/Å.A Fermi-Dirac smearing of 0.2 eV is chosen for all calculations.
For the k-point sampling, a Gamma-centered mesh is employed for all calculations.The density of the mesh and the spacing between k-points is chosen based on AFLOWlib's convergence criteria [63].The mesh is system specific and determined from the N KPPRA (number of k-points per reciprocal atom).The number of sampling points along each direction is proportional to the norm of the corresponding reciprocal lattice vector.The total number of sampling points per reciprocal atom is then minimised and N KPPRA is used as a lower bound.Values of 10 × 10 3 and 6 × 10 3 are used for static calculations and relaxations respectively.

B. Spectral Analysis Neighbor Potential
The Spectral Analysis Neighbor Potential (SNAP) [26] is used as an energy predictor.As described in section II B, an ensemble of models is employed for predictions.Equation ( 2) defines the expression of the function, E SNAP , and combined with equation (1), gives the energy of a system with N atoms.The atomic fingerprints that define the chemical environments of each atom i in the system, belonging to species α i , are the bispectrum components [23].These are used to represent configurations instead of seemingly more obvious choices (e.g.atomic Cartesian coordinates), as they are invariant upon rotation and permutations of identical atoms.Note that invariance with respect to translations is guaranteed by Eq. (1).For each atom, the vector B αi i , which collects the first components up to a maximum index, is taken as a feature for the machine-learning model (ridge regression in the case of SNAP).A short description of the bispectrum components is given below.
The neighborhood of an atom i atom can be described by a density function, ρ i , centered at that atom with delta functions at the sites of surrounding atoms, within a sphere of radius r cut .It is defined in three dimensions as where the sum is over all atoms within r cut from the central atom.Here, r i is the position of atom i, r ij = |r i − r j |, w αj is the specie-specific weight of atom j and f c is a cut-off function that smoothly runs to zero as r ij approaches r cut , as defined in [25].In order to represent this density distribution as a vector, it is expanded in a suitable basis.Atomic positions are first mapped onto the 4D sphere, by switching to polar coordinates (θ, ϕ, r) and by defining a third polar angle, θ 0 , from the radial coordinate (see Ref. [23] for details).The density function is then expanded in terms of hyperspherical harmonics U J m ′ ,m , the natural basis for expansion on the 4D sphere.
Dropping the atomic index, ρ is written as The hyperspherical harmonic index J runs in half-integer steps, while m and m ′ run between −J and J in integer steps.The outer sum is truncated in practice at a value J max , treated as a hyperparameter.The expansion coefficients, c J m,m ′ , cannot be used as descriptors, since they are complex and are not invariant under system rotation.From them, however, the rotationally-invariant and realvalued bispectrum components B J,J1,J2 are constructed Here, C J,J1,J2 mm1m2 and C J,J1,J2 m ′ m ′ 1 m ′ 2 are the Clebsch-Gordan coefficients, which possess the same symmetry invariances as the system.After taking the non-zero and unique distinct components, the bispectrum vector is formed, denoted B αi i with atomic and specie indices.The bispectrum components are a highly non-linear representation of the local atomic coordinates and account for up to four-body interactions.Their complexity is what makes it possible for them to be effectively used together with a simple regressor in SNAP to accurately map structures to energies.
The fitting, testing and predictions of the SNAP models used are performed using an in-house python library built with scikit-learn [70] and the Atomic Simulation Environment (ase [71]) python libraries.The bispectrum components are computed using lammps [72].The pipeline is built in python to perform the API download of binary structures and energies from the AFLOWlib [73] database and to generate derivative structures from the prototypes using enumlib [50].DFT calculations are managed by using a combination of ase [71] and pymatgen [74].

X
− Y N struct ∆E (meV) Ag

FIG. 3 .
FIG.3.SNAP performance for two structurally identical prototypes.The upper two panels show two different possible site occupations for a 3 × 1 × 1 bcc derivative structure with Cu1Ag1Au2 stoichiometry.Here we show the z-axis view, with brown, gray and yellow spheres representing Cu, Ag and Au respectively.The table shows the SNAP-predicted energies for the two prototypes, when the SNAP is trained with different atomic weights, {wα}, as indicated in the first column.Note that when the Cu and Ag weights are identical the two energies coincide by construction.

FIG. 4 .
FIG.4.Plot showing the initial unrelaxed volumes, V unrelax , and relaxed equilibrium ones, V relax , of a set of ternary prototypes at the stoichiometry Cu2Ag1Au1.The unrelaxed volumes are chosen from the volume of the binary associated with each ternary compounds' structure.The dashed line indicates the mean equilibrium volumes for these compounds, Vrelax , while the full line shows the volume predicted by the weighted average of the elemental volumes, V pred .

FIG. 7 .
FIG. 7. Workflow predictions for the Mo-Ta-W ternary system across different stoichiometries, 1-1-1 and 1-1-2.The graph presents the different compositions and their corresponding distance from the convex hull, δ.The blue points are associated to the predictions from the proposed workflow, whereas the orange ones represent the lowest-energy AFLOWlib data.The dashed line marks the tie-plane position of the convex hull (CH).The proposed workflow has managed to identify one previously unknown intermetallic phase, namely Mo1Ta2W1, whose unit cell is shown as an inset.Here Mo atoms are in purple, Ta in gold, and W in silver.

FIG. 8 .
FIG. 8. Workflow predictions (blue points) of the enthalpy of formation for the Mo-Ta-W ternary system across the 1-2-3 and 3-1-1 compositions.The enthalpy of formation for each composition at the appropriate the convex hull tie-plane is shown as a dashed line.The unit cell of the crystal structures found on the convex hull are presented as well.Here, Mo atoms are in purple, Ta in gold, and W in silver.Two new intermetallics alloys have been identified, namely Mo1Ta2W3 and Mo1Ta3W1.

TABLE I .
Number of structures, Nstruct, selected from each binary system, X − Y , to construct the ternary prototypes.