The Future of Ligand Engineering in Colloidal Semiconductor Nanocrystals

■ CONSPECTUS
Next-generation colloidal semiconductor nanocrystals featuring enhanced optoelectronic properties and processability are expected to arise from complete mastering of the nanocrystals’ surface characteristics, attained by a rational engineering of the passivating ligands. This aspect is highly challenging, as it requires a detailed understanding of the critical chemical processes that occur at the nanocrystal–ligand–solvent interface, a task that is prohibitive because of the limited number of nanocrystal syntheses that can be tried in the lab, where only a few dozen of the commercially available starting ligands can actually be explored. However, this challenging goal can be addressed nowadays by combining experiments with atomistic calculations and machine learning algorithms. In the last decades we have indeed witnessed major advances in the development and application of computational software dedicated to the solution of the electronic structure problem as well as the expansion of tools to improve the sampling and analysis in classical molecular dynamics simulations. More recently, this progress has also embraced the integration of machine learning in computational chemistry and in the discovery of new drugs. We expect that soon this plethora of computational tools will also have a formidable impact in the field of colloidal semiconductor nanocrystals. In this Account, we present some of the most recent developments in the atomistic description of colloidal nanocrystals. In particular, we show how our group has been developing a set of programs interfaced with available computational chemistry software packages that allow the thermodynamic controlling factors in the nanocrystal surface chemistry to be captured atomistically by including explicit solvent molecules, ligands, and nanocrystal sizes that match the experiments.
At the same time, we are also setting up an infrastructure to automate the efficient execution of thousands of calculations that will enable the collection of sufficient data to be processed by machine learning. To fully capture the power of these computational tools in the chemistry of colloidal nanocrystals, we decided to embed the thermodynamics behind the dissolution/precipitation of nanocrystal–ligand complexes in organic solvents and the crucial process of binding/detachment of ligands at the nanocrystal surface into a unique chemical framework. We show that formalizing this mechanism with a computational bird’s eye view helps in deducing the critical factors that govern the stabilization of colloidal dispersions of nanocrystals in an organic solvent as well as the definition of those key parameters that need to be calculated to manipulate surface ligands. This approach has the ultimate goal of engineering surface ligands in silico, anticipating and driving the experiments in the lab.


■ INTRODUCTION
Colloidal semiconductor nanocrystals (NCs), also called quantum dots, can be regarded as hybrid inorganic−organic materials consisting of a nanosized inorganic semiconducting core surrounded by an outer shell of organic ligands, as illustrated in Figure 1. The NC inorganic core is the optically active part and plays a role in defining the optoelectronic characteristics of the overall material, while the primary function of the ligand shell is to stabilize the NC in several types of organic solvents. Surface ligands, however, are essential in several other aspects of the colloidal nanocrystal chemistry: (1) at high surface coverages, they suppress nonradiative recombination centers (defects); 5−9 (2) in conventional II−VI and IV−VI nanocrystals, they can shift the energetic position of the conduction and valence band edges, 5,10−12 a feature that is still to be proven for other semiconductor nanocrystals; and (3) they can be exchanged with other types of ligands, in particular with shorter ligands that can act as electronic bridges between NCs, for example to improve electron−hole extraction when the solution is cast into films. 5,13−15 This last step is crucial in engineering and optimizing efficient photoactivated conductive devices. From this set of characteristics, it becomes clear that the surface chemistry of NCs, if deeply understood, can be used as a platform for the rational design of ligands, which in our view represent the most critical feature of colloidal NCs.
Unsurprisingly, the successful synthesis of a given NC material is invariably followed by a plethora of experimental reports that propose new surface ligands to improve the nanocrystals' stability and enhance their optoelectronic properties. 16−18 These experimental attempts usually employ the candidate ligands either in postsynthesis ligand exchange procedures or directly during the synthesis of the NCs. To date, the search for optimal ligands is mostly based on an inefficient empirical approach, in which the number of syntheses can grow drastically before a ligand that fits perfectly with the NC core surface is identified. On the positive side, this problem can nowadays be tackled atomistically by carrying out classical molecular dynamics (MD) simulations on NC−ligand−solvent systems with realistic NC sizes and ligands and explicit solvent molecules, thus providing an atomistic picture of the dynamical nanocrystal surface region. With these recently developed tools in hand, our group is planning to investigate the role of parameters such as ligand entropy, ligand−ligand steric hindrance, and ligand−core interactions in the processes of ligand binding/displacement and NC dissolution/precipitation, ultimately providing a strategy to maximize the surface coverage and boost both the colloidal stability and efficiency of the NCs. Unfortunately, we are also aware that this type of study, which is already by itself very challenging from a computational standpoint, allows the exploration of only a limited part of the ligand chemical space. To put things in perspective, according to the Generated DataBase, also called GDB-17, created by Reymond et al. in the drug discovery field, the total number of possible organic ligands with up to 17 atoms (excluding hydrogens) that satisfy simple chemical and synthesis rules is in fact 166 billion. 19 This is about 3 orders of magnitude larger than the number of ligands that have been synthesized to date, which is about 169 million structures according to the CAS Registry. 20 This suggests that the synthetic effort can be rationalized and minimized by following the example set in the drug discovery field, where the search for the best drugs nowadays relies on an efficient integration of computational chemistry tools and machine learning models. We believe that finding a pathway to rapidly compute the ligand properties that best describe ligands at the NC surface is a strategy that can be successful in the near future. We can expect that when a data set of ligand properties becomes sufficiently large for machine learning algorithms to be predictive, this high-throughput approach will be able to suggest optimal ligand candidates to be assessed in the experiments.

OF TODAY
In recent years, our group has developed automated tools to obtain highly accurate force field parameters for nanocrystal−ligand−solvent systems. 1,21 In practice, we are able to describe with chemical accuracy the interactions between the inorganic nanocrystal core and surface ligands as well as those within the nanocrystal core through a simple combination of Lennard-Jones potentials 22,23 and Coulombic electrostatic interactions. The unknown parameters (the depth of the potential well as the ε parameter, the size of the ion as the σ parameter, and the atomic charges) are fitted against density functional theory 24 (DFT) reference data, ultimately attaining a DFT-quality description of the overall system. Because of the good transferability of the parameters to large sizes and the possibility of also including the solvent explicitly, we are now able to perform classical MD simulations for realistic NC−ligand−solvent interfaces with up to 1 million atoms in the simulation box. 4 This opens the way, for the first time, to the possibility of using MD methodologies that have mostly been applied in biochemistry and drug discovery also in the colloidal nanochemistry field. Classical MD simulations can be used to study NC−ligand−solvent systems featuring different surface coverages to unveil the dynamical behavior of the ligands at the nanocrystal surface, revealing for example how ligands (i) cluster via dispersive (i.e., non-covalent) forces or prefer a random distribution, (ii) diffuse at the surface, and (iii) undergo attachment and displacement events. These atomistic studies can be performed by taking advantage of the abundance of analysis tools offered by classical MD simulation packages. 25−27 Most importantly, such packages usually also enable calculation of the free energy of complexation of ligands to a receptor and the free energy of solvation of ligands in a solvent, features that are particularly attractive to us in evaluating the relevant energetic terms for the main chemical processes occurring at the nanocrystal surface.
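The functional form described above (a Lennard-Jones term plus a Coulomb term per atom pair) can be illustrated with a minimal Python sketch. The function names, the unit choice (kJ/mol, nm, elementary charges), and the numerical parameters are assumptions made for illustration only; they are not the fitted parameters or the fitting code of the authors.

```python
# Electric conversion factor in kJ mol^-1 nm e^-2 (assumed unit system).
COULOMB_CONSTANT = 138.935458


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy (kJ/mol) for a pair at distance r (nm).

    epsilon is the depth of the potential well, sigma the distance at
    which the potential crosses zero.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q_i, q_j):
    """Coulomb energy (kJ/mol) between point charges q_i and q_j (in e)."""
    return COULOMB_CONSTANT * q_i * q_j / r


def pair_energy(r, epsilon, sigma, q_i, q_j):
    """Total nonbonded pair energy: Lennard-Jones plus Coulomb."""
    return lennard_jones(r, epsilon, sigma) + coulomb(r, q_i, q_j)


# Hypothetical parameters for one cation-anion pair (not fitted values).
EPSILON, SIGMA = 0.5, 0.30
r_min = 2.0 ** (1.0 / 6.0) * SIGMA  # location of the Lennard-Jones minimum
print(lennard_jones(r_min, EPSILON, SIGMA))
print(pair_energy(0.26, EPSILON, SIGMA, +0.8, -0.8))
```

In a fitting procedure of the kind described in the text, ε, σ, and the charges would be the adjustable quantities, tuned until energies of this form reproduce the DFT reference data.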
To obtain a comprehensive picture of the NC surface chemistry, one could aim at computing these factors for a set of ligands that intuitively spans the chemical space of molecular surfactants. Unfortunately, although this approach would be very valuable in obtaining fundamental insights into NC surface chemistry, it has the drawback of being computationally demanding. If we wish to explore a much larger set of ligands, our main task will boil down to finding and computing specific descriptors, i.e., physicochemical properties of the ligands that obey specific rules of thumb that we will define later in the text. The best approach to compute these descriptors will likely require a better understanding of the NC surface chemistry, although we expect that they will be computed either by using first-principles calculations, like DFT, 24 or by defining features that can be straightforwardly computed, like the volume occupied by the ligands, the polar surface area, the synthesizability score, etc. Irrespective of the methodology employed to compute these properties, it is expected that a massive number of calculations must be performed, and it is thus paramount that such calculations are carried out seamlessly using quantum chemistry software packages and algorithms that are fast, efficient, and well integrated with one another. For this purpose, we have recently developed QMflows, 2 a Python-based library that seamlessly executes complex workflows consisting of a large number of calculations with complicated dependencies. It also allows the processing of an enormous number of inputs/outputs and postprocessing data collection and analysis. Ultimately, we will be able to write a series of composite workflows that will enable us to compute a vast number of ligand and ligand−NC related properties, store them in databases, and finally process them with machine learning algorithms.
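The idea of executing many calculations with dependencies between them can be sketched with the Python standard library alone. The sketch below does not use or reproduce the QMflows API; the task names, the placeholder return values, and the `run_workflow` helper are all hypothetical and only illustrate dependency-ordered execution.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies):
    """Run each task once all the tasks it depends on have finished.

    tasks maps a task name to a callable that receives the dictionary of
    results produced so far; dependencies maps a task name to the set of
    task names that must run before it.
    """
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        results[name] = tasks[name](results)
    return results


# Placeholder tasks standing in for real calculations (hypothetical values).
tasks = {
    "optimize_geometry": lambda done: {"energy": -1.0},
    "compute_volume": lambda done: 250.0,
    "collect_descriptors": lambda done: {
        "energy": done["optimize_geometry"]["energy"],
        "volume": done["compute_volume"],
    },
}
dependencies = {
    "optimize_geometry": set(),
    "compute_volume": {"optimize_geometry"},
    "collect_descriptors": {"optimize_geometry", "compute_volume"},
}

results = run_workflow(tasks, dependencies)
print(results["collect_descriptors"])
```

A production workflow engine would add what this sketch leaves out, such as parallel execution, caching of finished jobs, and storage of the outputs in a database.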
■ REVISITING THE CHEMICAL EQUILIBRIA AT NC SURFACES
To address the above challenges atomistically and set the base for a comprehensive description of the NC surface chemistry, it is paramount for us to formalize the mechanism of ligand binding/displacement and NC dissolution/precipitation in a unique thermodynamic framework that is suitable to be addressed with atomistic calculations (Figure 2).
In this respect, we start from the work of Peng et al., 28 who demonstrated that identifying the factors that govern the dissolution/precipitation equilibrium of nanocrystal−ligand complexes in organic solvents by means of a quantitative thermodynamic model enables the design of surface ligands that maximize the NCs' solubility, thus mastering the processability challenge of colloidal nanocrystals. In particular, their thermodynamic analysis revealed the decisive role of enthalpy and entropy changes during dissolution of nanocrystals coated with n-alkanoate ligands, namely, (i) a large enthalpy cost for dismantling the strong ligand−ligand interdigitation between adjacent particles during the precipitation step and (ii) a massive increase in intramolecular entropy related to the ligands' C−C σ-bond rotation and skeletal bending in solution. Subsequently, Peng et al. 29 introduced the so-called "entropic ligands" concept to interrupt crystalline ligand−ligand packing in the solid but harvest conformational entropy in solution: for example, they demonstrated an increase in the solubility of various nanocrystals by several orders of magnitude by just replacing n-alkanoate ligands with irregularly branched ones.
Although this aspect is crucial to improve the solubility of colloidal nanocrystals, there is a second and perhaps even more important challenge for colloidal nanocrystal chemistry, which is to achieve complete passivation of the NC surface by ligands to maximize the surface coverage, a key aspect for the elimination of surface defects and for the improvement of the NCs' optoelectronic characteristics.
As anticipated at the beginning of this section, it is thus critical to place on the same ground the process of binding/displacement of ligands at the nanocrystal surface in solution and the process of precipitation/dissolution of nanocrystal−ligand complexes in organic solvents (see Figure 2). In this picture, ideal ligands should favor the intermediate state by simultaneously pushing the first equilibrium to the right (maximizing the surface coverage) and the second equilibrium to the left (maximizing the solubility, similarly to the entropic ligands). For practical computational purposes, we propose the thermodynamic cycle sketched in Figure 3 that transfers the most critical processes from the solvent to the gas phase, thus providing a substantial computational advantage.
In this cycle, NC is the unpassivated nanocrystal, L is the ligand, and [NC−L$_n$] is the NC capped with n surface ligands L. According to the thermodynamic cycle above, we can express the total free energy of the process from the separate fragments to NC precipitation as

$$\Delta G^{\mathrm{solv}}_{\mathrm{tot}} = \Delta G^{\mathrm{solv}}_{\mathrm{binding}} + \Delta G^{\mathrm{solv}}_{\mathrm{agg}} \tag{1}$$

Since our aim is to shift the equilibrium toward the fully passivated nanocrystals in solution, the free energy of ligand binding, $\Delta G^{\mathrm{solv}}_{\mathrm{binding}}$, should be minimized (i.e., made more exergonic), while that of NC agglomeration, $\Delta G^{\mathrm{solv}}_{\mathrm{agg}}$, should be maximized (i.e., made more endergonic). For practical purposes, however, it is more convenient to study the two processes separately, as will be done from now on. It should be noted that in typical apolar solvents, the ligands are not attached or displaced from the nanocrystal surface alone but as neutral ion pairs or as a Z-type ligand, composed of the anionic/cationic ligand bound to a metal ion/anion from the NC, respectively. The symbol L, employed for clarity, actually represents the overall ion pair, thus ensuring the consistency of the binding process under examination with the chemistry of precursors used in the NC synthesis.

Accounts of Chemical Research pubs.acs.org/accounts Article

Step 1: Binding/Displacement of Ligands at the NC Surface
As sketched in Figure 3, the separate fragments (the bare NC and the n ligands L that are finally present at its surface) are considered as the starting point in the binding process, thus immediately and explicitly taking into account the expected dependence of the binding free energy on the surface coverage, expressed as the fraction of surface sites occupied by ligands.
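In this framework, the ligand binding process is described as

$$\Delta G^{\mathrm{solv}}_{\mathrm{binding}} = \Delta G^{\mathrm{gas}}_{\mathrm{binding}} + \left(\Delta G^{\mathrm{solv}}_{[\mathrm{NC{-}L}_n]} - \Delta G^{\mathrm{solv}}_{\mathrm{NC}} - n\,\Delta G^{\mathrm{solv}}_{\mathrm{L}}\right) \tag{2}$$

where we can group the terms in parentheses by defining a solvation energy term $\Delta\Delta G^{\mathrm{solv}}$ that represents the gain/loss in free energy obtained by solvating the overall [NC−L$_n$] system compared with the separate components at infinite dilution:

$$\Delta\Delta G^{\mathrm{solv}} = \Delta G^{\mathrm{solv}}_{[\mathrm{NC{-}L}_n]} - \Delta G^{\mathrm{solv}}_{\mathrm{NC}} - n\,\Delta G^{\mathrm{solv}}_{\mathrm{L}} \tag{3}$$

The next term to be discussed in detail is the total binding free energy of surface ligands to the generic nanocrystal in the gas phase, $\Delta G^{\mathrm{gas}}_{\mathrm{binding}}$. As in many quantum-mechanical approaches, 30 we can conveniently decompose the $\Delta G^{\mathrm{gas}}_{\mathrm{binding}}$ term as the following sum:

$$\Delta G^{\mathrm{gas}}_{\mathrm{binding}} = \Delta G^{\mathrm{gas}}_{\mathrm{prep}} + \Delta G^{\mathrm{gas}}_{\mathrm{int}} \tag{4}$$

The first term of this decomposition, $\Delta G^{\mathrm{gas}}_{\mathrm{prep}}$, is the preparation (deformation) energy to bring the starting fragments NC and L from their configuration at infinite distance from each other to the configuration they assume in the final [NC−L$_n$] system. The second term of the sum, $\Delta G^{\mathrm{gas}}_{\mathrm{int}}$, represents the local interaction and relaxation energy between the as-prepared fragments. A detailed view of the ligand binding process in the gas phase using this approach is sketched in Scheme 1.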
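The bookkeeping of this cycle (a gas-phase binding term split into preparation and interaction parts, plus a solvation correction relative to the separate components) can be written as a short Python sketch. The function names and all numerical values are hypothetical and serve only to show how the terms combine; they are not computed data.

```python
def solvation_correction(dg_solv_complex, dg_solv_nc, dg_solv_ligand, n_ligands):
    """Solvation free energy of the capped NC relative to the bare NC
    and n separate ligands at infinite dilution (all in kJ/mol)."""
    return dg_solv_complex - dg_solv_nc - n_ligands * dg_solv_ligand


def binding_free_energy_in_solution(dg_prep_gas, dg_int_gas, ddg_solv):
    """Binding free energy in solution: gas-phase preparation and
    interaction terms plus the solvation correction."""
    return dg_prep_gas + dg_int_gas + ddg_solv


# Illustrative numbers only (kJ/mol), for a hypothetical NC with 10 ligands.
ddg_solv = solvation_correction(
    dg_solv_complex=-120.0, dg_solv_nc=-20.0, dg_solv_ligand=-8.0, n_ligands=10
)
dg_binding = binding_free_energy_in_solution(
    dg_prep_gas=150.0, dg_int_gas=-400.0, ddg_solv=ddg_solv
)
print(ddg_solv, dg_binding)
```

With these made-up inputs the strongly exergonic interaction term outweighs the preparation cost, which is the situation the text describes as favorable for passivation.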
We can expect the ligands to undergo a large reorganization, both enthalpic and entropic, from the starting configuration to that found at the NC surface, 3 similar to what was observed by Peng et al. during the precipitation of nanocrystal−ligand complexes in organic solvents. In other words, we can expect the $\Delta G^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)$ term, corresponding to the ligand preparation energy, to be very large. On the other hand, we can assume that the core of the NC remains substantially unchanged during the preparation process, meaning that $\Delta G^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{NC}) \approx 0$:

$$\Delta G^{\mathrm{gas}}_{\mathrm{prep}} = \Delta G^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n) + \Delta G^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{NC}) \approx \Delta G^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n) \tag{5}$$

Here we can further decompose the first term to capture the enthalpic and entropic contributions in the reorganization of ligands at the NC surface, thus obtaining:

$$\Delta G^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n) = \Delta H^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n) - T\Delta S^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n) \tag{6}$$

The free energy for binding of surface ligands to a generic NC in a given solvent can ultimately be expressed as

$$\Delta G^{\mathrm{solv}}_{\mathrm{binding}} = \Delta G^{\mathrm{gas}}_{\mathrm{prep}} + \Delta G^{\mathrm{gas}}_{\mathrm{int}} + \Delta\Delta G^{\mathrm{solv}} \tag{7}$$

or equivalently as

$$\Delta G^{\mathrm{solv}}_{\mathrm{binding}} = \Delta G^{\mathrm{gas}}_{\mathrm{int}} + \Delta H^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n) - T\Delta S^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n) + \Delta\Delta G^{\mathrm{solv}} \tag{8}$$

Importantly, we can expect that for a given ligand, the binding free energy is the largest in absolute value (i.e., most exergonic) at a certain surface coverage, $\theta_c$, which is simply defined by the formula

$$\theta_c = \frac{n}{n_{\mathrm{tot}}} \tag{9}$$

where $n$ is the number of surface sites effectively occupied by the ligands at a given surface coverage and $n_{\mathrm{tot}}$ is the total number of available surface sites. In our picture, the minimum of free energy, obtained at the equilibrium value of the surface coverage, accounts for the dependence of all of the terms in eq 8 on $\theta_c$. We can highlight this dependence by rewriting eq 8 as

$$\Delta G^{\mathrm{solv}}_{\mathrm{binding}}(\theta_c) = \Delta G^{\mathrm{gas}}_{\mathrm{int}}(\theta_c) + \Delta H^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)(\theta_c) - T\Delta S^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)(\theta_c) + \Delta\Delta G^{\mathrm{solv}}(\theta_c) \tag{10}$$

Each term in eq 10 has an intuitive and relevant chemical meaning:
• $\Delta G^{\mathrm{gas}}_{\mathrm{int}}(\theta_c)$ is the actual interaction between the NC and the surface ligands L (or, more precisely, between the NC and the metal−ligand or anion−ligand ion pairs), and it is always exergonic (i.e., $\Delta G^{\mathrm{gas}}_{\mathrm{int}}(\theta_c)$ is negative). It mostly depends on the particular interaction between the anchoring group of the ligand and the NC, and thus, we expect its strength to widely fluctuate according to the type of NC material considered and the type of functional group anchored to the surface (−COO, −S, −O, −PO(OH)O, etc.).
• $\Delta H^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)(\theta_c)$ is an enthalpic term that evaluates the agglomeration interaction among the ligands, and its sign is strongly dependent on the surface coverage. At some coverages, ligands can favorably pack through weak dispersive ligand−ligand bonding interactions (i.e., $\Delta H^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)(\theta_c)$ is negative). At high coverages, however, the repulsive steric interactions (i.e., $\Delta H^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)(\theta_c)$ is positive) would probably dominate. 4
• $\Delta S^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)(\theta_c)$ is the entropic term and is reminiscent of the entropic ligand definition by Peng et al. It favors ligands as detached from the surface, and thus, it is always negative, making the $-T\Delta S^{\mathrm{gas}}_{\mathrm{prep}}(\mathrm{L}_n)$ energy term positive. This is always true, as the number of configurations that ligands can assume is much higher at infinite dilution than on the surface, where the translational, rotational, and conformational mobility is reduced. At maximum packing, the entropic term is likely to be at its maximum because we can expect the ligand mobility to be at its minimum.
• $\Delta\Delta G^{\mathrm{solv}}(\theta_c) = \Delta G^{\mathrm{solv}}_{[\mathrm{NC{-}L}_n]}(\theta_c) - \Delta G^{\mathrm{solv}}_{\mathrm{NC}} - n\,\Delta G^{\mathrm{solv}}_{\mathrm{L}}$ is the gain/loss in free energy resulting from solvation of the system relative to its components at infinite dilution. It is possible that the ligand-passivated NC presents a solubility/solvation free energy that is larger than the sum of those for its components, similar to what has been reported for micelles. This aspect is still unknown and has not been explored before, either computationally or experimentally. Moreover, the sign of this term is expected to vary strongly with the type of solvent.

Scheme 1. Thermodynamic Cycle for the Binding Interaction between the Ligands and the NC in the Gas Phase

For a given ligand, the total binding free energy can be determined as the sum of the contributions of all of the above factors, which act differently as functions of the surface coverage (dashed red line in Figure 4a). Ideally, ligands will perform best when the surface coverage is near unity because all of the surface vacancies are expected to be passivated (surface traps suppressed) (Figure 4b). Even though a rigorous quantitative analysis is needed, we can intuitively expect the best ligands to match the following characteristics: (1) they have a functional group that binds strongly to the NC surface; (2) they are small enough to avoid steric hindrance at the NC surface; (3) they favor interligand packing via weak (e.g., van der Waals) bonding interactions at the NC surface; (4) they have low entropy changes; and (5) they are soluble enough in the solvents used in the synthesis to stabilize the colloidal dispersion of nanocrystals.
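To make the coverage dependence of eq 10 concrete, the following Python sketch sums four toy contributions and scans the number of bound ligands for the coverage that minimizes the total. The functional forms and every coefficient are invented for illustration (an always-exergonic interaction term, a packing enthalpy that turns repulsive at high coverage, an always-negative entropy change, and a mildly favorable solvation correction); in practice each term would come from MD or DFT calculations.

```python
def coverage(n_bound, n_sites):
    """Surface coverage: occupied sites over total available sites."""
    if not 0 <= n_bound <= n_sites:
        raise ValueError("n_bound must lie between 0 and n_sites")
    return n_bound / n_sites


def binding_free_energy(n_bound, n_sites, temperature=298.15):
    """Toy total binding free energy (kJ/mol) at a given coverage.

    All functional forms below are hypothetical placeholders.
    """
    theta = coverage(n_bound, n_sites)
    dg_int = -60.0 * n_bound                                  # always exergonic
    dh_prep = n_bound * (-10.0 * theta + 120.0 * theta ** 8)  # packing, then sterics
    ds_prep = -0.05 * n_bound * (1.0 + theta)                 # kJ/(mol K), always negative
    ddg_solv = -5.0 * n_bound                                 # solvation correction
    return dg_int + dh_prep - temperature * ds_prep + ddg_solv


N_SITES = 100
best_n = min(range(N_SITES + 1), key=lambda n: binding_free_energy(n, N_SITES))
best_theta = coverage(best_n, N_SITES)
print(best_n, best_theta, binding_free_energy(best_n, N_SITES))
```

With these made-up coefficients the minimum falls at an intermediate coverage rather than at full coverage, because the steric and entropic penalties grow faster than the interaction gain. Shifting that minimum toward unity is what the design rules listed above aim for.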

Step 2. Precipitation/Dissolution Equilibrium of NCs
The ligand binding model constructed in the previous section already helps in defining some rules of thumb to identify surface ligands that fit best to a given NC material, i.e., that maximize the surface coverage by pushing the binding/displacement equilibrium of Figure 3 toward the passivated NC in solution. We will now consider the precipitation/dissolution equilibrium, which should be oriented toward dissolution of the NCs by maximizing the free energy of agglomeration. According to Figure 3, the latter can be written in a tractable form by introducing two approximations. The first one is justified by the fact that, as a first approximation, the precipitated nanocrystal−ligand complex interacts locally only with its nearest neighbors, with the solvent playing only a negligible role. This is key to simplify the interaction of the NC with its surroundings to an effective one-body problem, although more sophisticated models of the agglomerate should also include the presence of solvent trapped between nanocrystals. The second approximation stems from the idea that the interactions between [NC−L$_n$] systems in the solid state preferentially occur via facet-to-facet interactions. Although this is better justified during the formation of superlattices, we can practically assume that something similar takes place during the precipitation of NCs. It should be noted that we are attempting to simplify a complex problem to make it tractable for computational purposes. On the basis of these arguments, the agglomeration term becomes

$$\Delta G^{\mathrm{solv}}_{\mathrm{agg}} = n_{\mathrm{facet}}\,\Delta G^{\mathrm{gas}}_{\mathrm{facet-facet}} - \Delta G^{\mathrm{solv}}_{[\mathrm{NC{-}L}_n]} \tag{12}$$

In the first term, $\Delta G^{\mathrm{gas}}_{\mathrm{facet-facet}}$ represents the local pair interaction between a given facet of the [NC−L$_n$] system and the adjacent facet of one of its nearest neighbors in the precipitate.
We can further decompose $\Delta G^{\mathrm{gas}}_{\mathrm{facet-facet}}$ in terms of ligand−ligand interactions, as expressed by the term $\Delta G^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)}$, which corresponds to the free energy for the interaction of the $m$ ligands on a facet ($m = n/n_{\mathrm{facet}}$) with the $m$ ligands on the adjacent facet. Although the theoretical model proposed by Peng et al. basically neglects the NCs' core−core interactions in the solid state, the experimental measurements presented in the same paper highlight the importance of this term in the dissolution enthalpy of nanocrystal−ligand complexes. On the basis of this evidence, we take this interaction term into account. On the other hand, we can assume that the interaction between the core of a NC and the ligands of the neighboring facet in the precipitate is negligible. With these considerations, the facet-to-facet pair interaction in the solid state can be ultimately expressed as

$$\Delta G^{\mathrm{gas}}_{\mathrm{facet-facet}} = \Delta G^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)} + \Delta G^{\mathrm{gas}}_{\mathrm{NC-NC}} \tag{13}$$

where $\Delta G^{\mathrm{gas}}_{\mathrm{NC-NC}}$ is the nanocrystal-to-nanocrystal local interaction. Here again, we can further decompose this expression to capture the enthalpic and entropic contributions to the precipitation of the NCs, thus obtaining

$$\Delta G^{\mathrm{gas}}_{\mathrm{facet-facet}} = \Delta H^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)} - T\Delta S^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)} + \Delta H^{\mathrm{gas}}_{\mathrm{NC-NC}} - T\Delta S^{\mathrm{gas}}_{\mathrm{NC-NC}} \tag{14}$$

The free energy of precipitation for a generic [NC−L$_n$] complex in a given solvent in eq 12 can now be equivalently expressed as

$$\Delta G^{\mathrm{solv}}_{\mathrm{agg}} = n_{\mathrm{facet}}\left[\Delta H^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)} - T\Delta S^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)} + \Delta H^{\mathrm{gas}}_{\mathrm{NC-NC}} - T\Delta S^{\mathrm{gas}}_{\mathrm{NC-NC}}\right] - \Delta G^{\mathrm{solv}}_{[\mathrm{NC{-}L}_n]} \tag{15}$$

As in the case of the ligand binding free energy, we can expect each term in eq 15 to be strongly dependent on the surface coverage $\theta_c$. Consequently, the largest (i.e., most endergonic) value of the agglomeration free energy, resulting in the maximal solubility of nanocrystal−ligand complexes, should be obtained at a certain surface coverage. We can highlight this dependence by rewriting eq 15 as

$$\Delta G^{\mathrm{solv}}_{\mathrm{agg}}(\theta_c) = n_{\mathrm{facet}}\left[\Delta H^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)}(\theta_c) - T\Delta S^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)}(\theta_c) + \Delta H^{\mathrm{gas}}_{\mathrm{NC-NC}}(\theta_c) - T\Delta S^{\mathrm{gas}}_{\mathrm{NC-NC}}(\theta_c)\right] - \Delta G^{\mathrm{solv}}_{[\mathrm{NC{-}L}_n]}(\theta_c) \tag{16}$$

Each term in this formula also has an intuitive chemical meaning, and we can expect the following dependences on $\theta_c$:
• $\Delta H^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)}(\theta_c)$ is the enthalpic term that represents the facet-to-facet agglomeration interaction among the ligands in the precipitate and should be strongly dependent on the surface coverage. Below a certain coverage, the ligand spacing allows favorable interdigitation through nonbonding ligand−ligand interactions (i.e., $\Delta H^{\mathrm{gas}}_{\mathrm{L}(m)-\mathrm{L}(m)}(\theta_c)$ is negative).
• $\Delta H^{\mathrm{gas}}_{\mathrm{NC-NC}}(\theta_c)$ is the enthalpic term that describes the core−core interactions in the precipitate. This agglomeration term should always be negative and is expected to dominate at low surface coverage (corresponding to solid-state packing between NCs), where it acts as a driving force for precipitation of the nanocrystals.
• $\Delta S^{\mathrm{gas}}_{\mathrm{NC-NC}}(\theta_c)$ is the entropic term that accounts for the locking of rotational and translational degrees of freedom of the nanocrystal core during agglomeration of [NC−L$_n$] complexes. This term is always negative, and therefore, the $-T\Delta S^{\mathrm{gas}}_{\mathrm{NC-NC}}(\theta_c)$ energy term in eq 16 is positive. We can expect this entropy variation to be almost independent of the surface coverage.
• $\Delta G^{\mathrm{solv}}_{[\mathrm{NC{-}L}_n]}(\theta_c)$ is the free energy of solvation for ligand-passivated NCs, which was already encountered in the previous section. Here again, the behavior of this term with the surface coverage is difficult to predict and should widely fluctuate with the type of solvent.

For a given ligand−nanocrystal−solvent combination, the total free energy of agglomeration of the nanocrystal−ligand complexes can be obtained simply by adding the contributions of all of the above factors and, like its component terms, is a function of the surface coverage (dashed red line in Figure 5a). In our framework, the entropic ligands defined by Peng et al. can now be seen as the ligands that are able to maximize the free energy of precipitation, thus favoring the dissolution process and boosting the NCs' solubility. In fact, Peng et al. recommended minimization of the interligand packing for destruction of the crystalline chain−chain interactions in solids and maximization of the ligands' intramolecular entropy (Figure 5b) as a universal strategy to battle the processability challenge of colloidal nanocrystals.
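The way the agglomeration terms combine can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements of the source: the solvation free energy of the capped NC is taken to enter with a minus sign (the complex is desolvated on going from solution to the precipitate), and the facet-pair term is counted once per facet. Function names and numbers are hypothetical.

```python
def facet_pair_free_energy(dh_ll, ds_ll, dh_cc, ds_cc, temperature):
    """Gas-phase free energy (kJ/mol) for one facet-to-facet contact.

    dh_ll, ds_ll: ligand-ligand enthalpy and entropy changes.
    dh_cc, ds_cc: core-core enthalpy and entropy changes.
    Entropies are in kJ/(mol K).
    """
    ligand_part = dh_ll - temperature * ds_ll
    core_part = dh_cc - temperature * ds_cc
    return ligand_part + core_part


def agglomeration_free_energy(n_facets, dg_facet_pair, dg_solv_complex):
    """Agglomeration free energy in solution (kJ/mol).

    Assumption: the solvation free energy of the capped NC is subtracted,
    since the complex loses its solvation shell when it precipitates.
    """
    return n_facets * dg_facet_pair - dg_solv_complex


# Illustrative values only.
dg_pair = facet_pair_free_energy(
    dh_ll=-40.0, ds_ll=-0.10, dh_cc=-5.0, ds_cc=-0.02, temperature=300.0
)
dg_agg = agglomeration_free_energy(
    n_facets=6, dg_facet_pair=dg_pair, dg_solv_complex=-80.0
)
print(dg_pair, dg_agg)
```

In this made-up case the facet contacts are slightly favorable, but a strongly negative solvation free energy makes the overall agglomeration endergonic, so the complexes stay dispersed.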
Finally, it is important to notice that mixing of different types of passivating ligands at the nanocrystal surface has also been successfully used to enhance both the nanocrystal optoelectronic properties and solubility. For example, Peng et al. reported an increase of several orders of magnitude in the solubility of CdSe nanocrystals by simple addition of a short-chain ligand (hexanoate) to a pure shell of a long-chain ligand. 32 The search for a multicomponent ligand shell that matches, as a whole, the above characteristics represents an interesting and flexible alternative to the search for single ideal ligands and should be included in the above thermodynamic framework in the future.
■ A BRIGHT FUTURE FOR ENGINEERING OF SURFACE LIGANDS
At this point, we have a comprehensive picture in which all of the terms in the thermodynamic cycle of Figure 3 can be calculated with standard classical and quantum chemistry tools.
Unfortunately, some features that help the ligands bind to the NC surface (proposed in step 1) are in sharp contrast with the definition of entropic ligands proposed by Peng et al. to maximize the solubility of the capped NCs in organic solvents. To better explain this apparent contradiction, we can look back at the unique equilibrium that includes both binding/displacement of ligands at the NC surface in solution and precipitation/dissolution of nanocrystal−ligand complexes, as illustrated in Figure 2. In this unified picture, promoting the ligand−ligand packing at the NC surface helps to maximize the surface coverage (first equilibrium pushed to the right) but could also favor crystalline ligand−ligand interdigitation in the precipitate, thus dramatically decreasing the NC solubility. Similarly, ligands with a huge conformational entropy ("entropic ligands") greatly stabilize the colloidal dispersion (second equilibrium pushed to the left) but could assist the displacement of ligands from the NC surface, thus deteriorating their optoelectronic properties by introducing surface defects.
Overall, ideal ligands are those that favor the intermediate state, maximizing both NC surface coverage (efficiency) and NC solubility. We should thus balance our rules of thumb for the rational design of surface ligands accordingly in order to take into account both of these critical processes. The ligand anchoring group plays a critical role in minimizing the ligand binding energy and should ensure that the binding strength to the NC surface is as high as possible. The ligand backbone should be engineered in order to (1) favor ligand−ligand packing at the NC surface (i.e., in its initial segment, close to the anchoring group), (2) prevent ligand−ligand interdigitation between nanocrystal−ligand complexes (e.g., by adding some branched alkyl chains in its terminal segment), (3) have a balanced entropy change (especially in the case of weak binding), and (4) be soluble enough in the solvents used in the synthesis to stabilize the colloidal dispersion of nanocrystals. An example of a possible candidate is sketched in Figure 6.
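Because the two objectives (a more exergonic binding free energy and a more endergonic agglomeration free energy) can conflict, one simple way to shortlist candidates from a table of computed values is to keep only the non-dominated ones, i.e., a Pareto front. The sketch below is one possible illustration of that trade-off, not a procedure taken from the source; the ligand names and free energies are hypothetical.

```python
from typing import NamedTuple


class Ligand(NamedTuple):
    name: str
    dg_binding: float  # kJ/mol, lower (more exergonic) is better
    dg_agg: float      # kJ/mol, higher (more endergonic) is better


def dominates(a, b):
    """True if a is at least as good as b on both criteria and
    strictly better on at least one."""
    no_worse = a.dg_binding <= b.dg_binding and a.dg_agg >= b.dg_agg
    better = a.dg_binding < b.dg_binding or a.dg_agg > b.dg_agg
    return no_worse and better


def pareto_front(ligands):
    """Return the ligands that no other candidate dominates."""
    return [c for c in ligands if not any(dominates(o, c) for o in ligands)]


# Hypothetical candidates with made-up free energies.
candidates = [
    Ligand("L1", -300.0, 5.0),
    Ligand("L2", -250.0, 30.0),
    Ligand("L3", -200.0, 20.0),
    Ligand("L4", -320.0, -10.0),
    Ligand("L5", -240.0, 10.0),
]
front = pareto_front(candidates)
print([ligand.name for ligand in front])
```

Candidates on the front represent different compromises between coverage and solubility; choosing among them still requires the chemical judgment outlined in the design rules above.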
In reality, an enormous number of structures (ligands) can be considered and characterized to find candidate ligands with these finely tuned properties. Thus, here we arrive at the future challenges of this field, which include the application of advanced statistical analysis tools like machine learning. The identification of optimal ligands that demonstrate improved NC optoelectronic qualities (photoluminescence, color purity, etc.) and processability will indeed require the construction of a large data set of computed ligand properties. The most demanding task will be to build a robust training set of thousands of structures in a practical amount of time, which will boil down to being able to compute and store the properties of interest for each ligand in a very fast and efficient way. As pointed out at the beginning of this Account, this challenge is now within reach thanks to the set of computational tools that our group has recently developed.

The manuscript was written through contributions of both authors. Both authors approved the final version of the manuscript.