Designing Algorithms To Aid Discovery by Chemical Robots

Recently, automated robotic systems have become very efficient, thanks to improved coupling between sensor systems and algorithms, of which the latter have been gaining significance thanks to the increase in computing power over the past few decades. However, intelligent automated chemistry platforms for discovery orientated tasks need to be able to cope with the unknown, which is a profoundly hard problem. In this Outlook, we describe how recent advances in the design and application of algorithms, coupled with the increased amount of chemical data available, and automation and control systems may allow more productive chemical research and the development of chemical robots able to target discovery. This is shown through examples of workflow and data processing with automation and control, and through the use of both well-used and cutting-edge algorithms illustrated using recent studies in chemistry. Finally, several algorithms are presented in relation to chemical robots and chemical intelligence for knowledge discovery.

An algorithm is a set of rules that determines the execution of a sequence of operations. Because algorithms are fundamental theoretical constructs, they are widely useful, and the earliest recorded algorithms detailing procedures to solve mathematical problems date back almost 4000 years. 1 In the field of chemistry, the desire for repeatability, control, and correlation of sensor outputs with inputs exemplifies the need for well-defined control and decision-making systems. Algorithms in chemistry are often implemented in real-world chemical systems, and so their development is affected by hardware, physical and computational resources, as well as chemical handling constraints. This leads to new technologies being quickly utilized for chemical purposes. An early case is the use of punch cards at the advent of digital computing for analysis of mass spectra. 2 With the increase of computing power at an ever-diminishing cost, chemistry has gained much from new instrumentation, data collection and analysis, better scientific communication, and many other avenues of improvement. In recent years there have been breakthroughs in the ability of computers to complete tasks that once seemed the exclusive purview of humans, such as image processing 3 and playing games. 4 In this Outlook we describe, through real-case examples, how algorithms could assist in current chemical research through increased productivity and also how the proper use of algorithms coupled with integrated platforms can expand the ability to search for new chemical knowledge.
Current Uses of Algorithms in Chemistry. Algorithms for use in chemistry can be separated into three classes: menial, assistive, and enabling. The menial are mainly low-level algorithms such as those controlling syringe pumps for liquid handling, whose primary purpose is to replace manual technical work. Other algorithms that belong to this group are higher-level algorithms for monitoring and control. The assistive class primarily improves the intellectual productivity of the human chemist; fundamentally, these algorithms reduce the cognitive load on the user. A common usage case is in the evaluation and processing of analytical measurements, 5 for example, using wavelet transforms to treat and extract data from spectra. 6 In this case, an algorithm interprets the data and assigns peaks based on the available database. Other algorithms help to visualize, manipulate, and extract chemical information from representations of molecules. 7,8 The integration of these algorithms allows for sophisticated platforms to be built which perform chemistry without human intervention; 9−11 a plot of a simulated optimization sequence undergone in such a system is shown in Figure 1. The optimization algorithm used is called the Stable Noisy Optimization by Branch and Fit (SNOB-FIT). 11 It combines both local and global searching to find the maximal value in the available search space in the most efficient manner. In this example the maximal value sought was the highest yield, the search space defined over ranges of concentration, pH, and temperature, and efficiency in this case is conducting the least amount of experiments. The enabling algorithms are the most powerful as they accomplish tasks that humans are incapable of. This is often due to the amount of chemical data available reaching levels beyond the ability of any human to process (e.g., chemical databases such as Reaxys). 
Therefore, many algorithms are being designed or co-opted to deal with such a large wealth of information and data.
Big Data and Automatic Data Analysis Including Feature Extraction. "Big data" is a growing area of science with great significance in the field of chemistry (i.e., drug discovery). 12−14 Big data means not only a large amount of data but also usually more varied data. The Web provides access to a rich selection of diverse chemical data sources (some of the most common can be found in Table 1 or in the literature 15 ). A crucial factor is the availability of representations of chemical data, predominantly molecular structure; notably simplified molecular input line entry specification (SMILES), 8 a line notation for molecules; Mol, 16 property information about atoms, bonds, and connectivity of molecules; structure data format (SDF), text format representing multiple chemical structures; and many more as described in the literature. 17,18 The fundamental benefits of using such databases are the huge number of samples presented in a consistent manner and scalable with clear barriers to access, if any. An important caveat is that the quality of the data can vary greatly as most of the data is a collection of reported results, most of which are not independently verified. Because of the large amounts of available data, scientists must identify which data to mine and how to preprocess it for their research purposes. In addition to existing stored data, the combination of experimental chemical platforms with digitization produces large amounts of new data with the potential to promote cooperation with business and academia on the characterization and interpretation of the data. 19 The tasks have growing significance for computational and statistical analysis arising from the size, complexity, and heterogeneity of available data sets, and could be aided using adequate algorithms. 20 One such common task in databases is knowledge discovery which can refer to the use of methodologies from virtual screening, machine learning, statistics, and pattern recognition. 
For example, the retrosynthetic software Chematica uses chemical reaction information to search for new synthesis reaction routes. 19,21 A different approach for the same task has recently also shown inference of chemical reactivity from knowledge graphs. 22 One class of algorithms that can process so-called "big data" are neural nets. A neural net is made of highly connected nonlinear logical units where each connection has a parameter that is adjusted as part of the training phase. The number of connections, and therefore also parameters, can reach into the thousands. Following a period of training, in which the network is taught a known relation between inputs and outputs, it can be used to make predictions on new inputs. This approach allows the algorithm to implement mathematical operations such as classification of chemicals based on their chemical structure/behavior; modeling of relationships between different structures; and storage and retrieval of given information. Indeed, chemists have been working with neural nets for decades, 30 and with the recent resurgence of neural nets in deep learning, 31 new prospects and applications are again gaining traction. 32,33 Large amounts of data improve the ability of neural nets, and so the growing amount of available chemical information allows researchers to construct new ways of performing and analyzing chemistry. 34,35 Some algorithms even build up their own information about the space of chemistry from first principles with little guidance from established chemical knowledge. 15 One of the major uses of big data-driven chemistry is virtual screening (VS), which describes the usage of computational algorithms and models for identification of bioactive molecules. Generally, compounds with common physicochemical properties are combined into assembled libraries/databases.
This allows for classification of big data sets of chemical compounds according to their probability to match a criterion, for example, bioactivity where top performing compounds are   36 Most VS approaches depend on the application of descriptors of molecular structure and properties. The accumulated knowledge from VS techniques can be used to propose many possible molecules according to chosen criteria. VS has been successfully applied together with high-throughput screening (HTS). HTS allows for more cost-effective research and development in chemical laboratories by running a large number of experiments. 37,38 Combining chemical experiments with virtual screening allows for a more targeted and efficient use of the large number of experiments that can be conducted by HTS. Nevertheless, as vast areas of chemical compound space, which is the relevant search space, do not contain useful molecules, it is vital to filter chemical space in order to identify the molecules with a high likelihood of selectivity. Filtering out molecules that are not likely to be of use can be achieved by a similarity search. In this process, defined search criteria allow for the identification of compounds that are similar in their required properties to those stored in a database. Other methods that could expedite and increase the efficiency and accuracy of screening include the following: privileged structures, 39 fingerprints, 40,41 single similarity measure, 42 pharmacophore-based methods (centered on geometric and topological constraints), 43 quantitative structure−activity relationship (QSAR), 44 "forward" and "backward" filtering as described by Klebe, 45 and many more as described in refs 46−48.
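As an illustration of the similarity search described above, the following minimal sketch ranks library compounds by the Tanimoto coefficient between binary fingerprints. The fingerprints, compound names, and threshold are hypothetical toy values chosen for illustration; in practice the fingerprints would come from a cheminformatics toolkit, and other similarity measures could be substituted.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(query, library, threshold=0.5):
    """Return (name, score) pairs with score >= threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy fingerprints: each integer is the index of a set bit.
library = {
    "compound_A": frozenset({1, 4, 7, 9}),
    "compound_B": frozenset({1, 4, 8}),
    "compound_C": frozenset({2, 3, 5}),
}
query = frozenset({1, 4, 7})
hits = similarity_search(query, library, threshold=0.5)
print(hits)
```

Compounds below the threshold are filtered out before any experiment is run, which is the sense in which a similarity search narrows the chemical space handed to HTS.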
One of the objectives of chemical research is to produce reliable data to enable knowledge discovery. The main challenges to achieving this goal are validating the data and giving a statistically significant interpretation. For the former, using data of bad quality will at best yield nothing and at worst produce an erroneous result. The latter is important since in chemistry the analysis of data is in service of increased understanding, which must rely on statistically significant results. Substantial work on these issues is being done in the field of chemometrics. 49,50 This discipline utilizes statistical approaches to demonstrate, interpret, and rationalize the results of measurements of chemical data. 51 Various multivariate data analysis (MVA) or pattern recognition 52 algorithms are covered by chemometrics, which can be divided into two groups: unsupervised, which allows searching for hidden structures from unlabeled data, and supervised, which mainly focuses on classification or prediction of new samples based on categorized samples. These algorithms can assist in interpreting the outputs at various stages of processing pipelines (Figure 2), thereby making it easier for the user to focus on a higher level of abstraction. 6,53 Chemometrics approaches such as principal component analysis (PCA), cluster analysis, 54 multidimensional scaling (MDS), and partial least-squares (PLS) 5,55 allow chemists to recognize potential outliers and specify whether there are any patterns or trends in the data. All these methods reframe the space representation of the data according to criteria which are different for each method. PCA finds the directions that capture the most variance in the data; MDS rearranges the data by similarity; and PLS finds a linear relation between the input and output variables. Furthermore, methods like PCA and MDS can be used for feature selection and dimensionality reduction of large and complex data sets.
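To make the role of PCA in dimensionality reduction concrete, the sketch below estimates the first principal component of a small data matrix by power iteration on the sample covariance matrix. This is only one simple way to obtain the leading component, written in plain Python for self-containment; the data values are hypothetical, and a production analysis would normally use an established numerical library and inspect several components.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the first principal component by power iteration.

    rows: list of equal-length lists of floats (samples x variables).
    Returns (unit direction vector, score of each sample along it).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    v = [1.0] * d  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]

    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical data: two strongly correlated measurements and one nearly constant one.
data = [
    [1.0, 2.1, 0.50],
    [2.0, 3.9, 0.52],
    [3.0, 6.2, 0.49],
    [4.0, 7.8, 0.51],
    [5.0, 10.1, 0.50],
]
direction, scores = first_principal_component(data)
print(direction, scores)
```

Here the single score per sample replaces three original variables, and the nearly constant third variable contributes almost nothing to the direction, which is the behavior the text describes when reducing large data sets to a few informative dimensions.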
Alternatively, regression algorithms such as principal component regression, ridge regression, stepwise regression, robust regression, and partial least-squares regression, 56 which deal with outputs that are continuous, could be helpful in decision making involving online monitoring or in process control of a given metric. 5 A major focus in this area is on feature extraction. Feature extraction is a critical step in knowledge discovery. For this process, a variety of algorithms are used to transform a large data set into reduced features called "latent variables". A selection of latent variables is expected to cover essential information derived from the original data, so that the chosen goal can be achieved by using the reduced representation of the original data set (Figure 3). In other words, the process reduces the influence of certain parameters/variables and focuses on those that provide most of the information captured by the first several latent variables. The automatic or manual mining of features can represent the conclusion of the research question or a processing step in understanding the observed chemical system. 57 When the data is labeled, the chemical classification problem can be solved by application of supervised methods which cover traditional discriminatory algorithms [linear discriminant analysis (LDA), partial least-squares-discriminant analysis (PLS-DA)] and various machine learning methods (e.g., support vector machines, random forests). 58−61 Other knowledge discovery algorithms successfully applied in chemistry include k-nearest neighbors, neural networks, 62 genetic algorithms, 63,64 Gaussian mixture models, and many more as described in refs 65−67. Additionally, the subject has been repeatedly reported in the literature. 68,69
Automation and Control. The advantages of automating chemical processes are numerous.
They include a substantial increase in scale, improved precision, a reduction in the amount and effect of uncontrollable variables, better reproducibility, and continuous feedback. The desirability of these traits has brought investment from large pharmaceutical companies to build highly automated systems. 70,71 Automating chemical processes is also prominent in chemical research, enabling faster and more precise scientific inquiries. 72,73 The abilities gained by automation lend themselves to be combined with statistical methods for optimization of chosen chemical parameters in chemical space.
Figure 2. Diagram demonstrating a standard chemometrics workflow including data processing. Different data inputs are first preprocessed into compatible data matrices, followed by specific problem-related algorithms that are applied for data modeling and validation. At the end of a given analysis, the results go through interpretation followed by decision making.
The complex tasks of identifying significant parameters for optimizing outcomes and exploring regions of interest in chemical space are required for effective experiments and knowledge discovery. In essence a given chemical space is being searched either to find an optimal point or to gather more information about the areas of interest in the space. A tool for that task is design of experiments (DoE), which helps in recognizing the most relevant parameters. The numerous statistical methods in use today for DoE are linked to the work of R. A. Fisher starting from 1935. Fisher demonstrated the importance of effective randomization, repetition, blocking, orthogonality, and factorial experiments in order to increase the sensitivity of designed experiments. Fisher indicated that the key factor in DoE is to apply valid and efficient experiments that will produce quantitative results to support decision making. 74 One of the biggest advantages of DoE is that it allows researchers to decide which reactions and conditions to focus on. This can be achieved through the generation of a mathematical model/design space which exposes a relationship between factors affecting a process and the output of that process. In other words, DoE (Figure 4) could reveal which factors impact the outcome and determine optimal parameters (time, temperature, quantity, pH, etc.). 69 However, one also needs to take into account that, in DoE, no one method offers a complete solution, and significant work is needed to find the many factors required for discovery. Hence, the algorithms used for searching the space may be simple (e.g., screening design of experiment such as a fractional factorial design) or exhaustive (e.g., full factorial design). 75 A good DoE will allow for the robust comparison of experimental outputs and provide good sample size requirements. Various DoE algorithms have been applied in chemistry such as 2-level factorial, Plackett-Burman, full factorial, Box−Behnken, Doehlert, Mixture, and many more. 74 A selection of other search algorithms such as simplex, multidirectional search, parallel simplex search, and more are described in a report by Dixon and Lindsey. 76 The report also shows that such approaches have been used effectively in chemistry-related studies to maximize the output of information with a minimal amount of computing power and experimental resources. Performing experiments in a given chemical space and validating the results can benefit greatly from using DoE techniques.
Figure 4. [...] This allows for estimating factor directions (right-hand side figure), which facilitate the use and interpretation of multivariate statistical models. The important impacts from single factors and relations between factors can subsequently be estimated. As more data is collected the model becomes more precise.
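The difference between a full factorial design and a fractional factorial screening design can be sketched in a few lines. The factor names and levels below are hypothetical. The half-fraction shown is the standard two-level construction in coded units (−1/+1), in which the last factor is set to the product of the others; it is meant as an illustration of the idea rather than a recommendation of a particular design.

```python
from itertools import product


def full_factorial(levels):
    """Every combination of factor levels. levels: dict of factor name -> list of levels."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


def half_fraction_two_level(factors):
    """2^(k-1) fractional factorial in coded units (-1, +1).

    The last factor is aliased with the product of all the others.
    """
    *base, last = factors
    runs = []
    for combo in product((-1, 1), repeat=len(base)):
        run = dict(zip(base, combo))
        sign = 1
        for value in combo:
            sign *= value
        run[last] = sign
        runs.append(run)
    return runs


# Hypothetical factors and levels, for illustration only.
full = full_factorial({
    "temperature_C": [25, 50, 75],
    "pH": [4, 7],
    "time_min": [10, 30],
})
half = half_fraction_two_level(["temperature", "pH", "time"])
print(len(full), "runs in the full design;", len(half), "runs in the half fraction")
```

The half fraction needs half the runs of the corresponding two-level full design, at the cost of confounding the main effect of the last factor with the interaction of the others, which is why such designs are typically used for screening.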
Chemical Robots. Recent advances in the design and application of algorithms, big data, and automation and control systems may allow the development of intelligent chemical robots that can target discovery. A "chemical robot" can be defined as any controllable agent capable of performing chemistry. Under this definition, there are several different types of systems that fall into this category. This would include simple systems that are static, yet offer the capability of using their inherent properties to modify the chemical system by performing the experiments in designed 3D printed devices. 77,78 More complex systems use integration of analytical instruments into the experimental platform at the cost of requiring bespoke fabrication and construction. 79 At the other end of the spectrum, there are many different commercial systems available today 80,81 which offer modularity, reliability and ease of use, at the cost of high expense and lack of integrability. However, most systems in use in research are built in-house to avoid these shortcomings. They offer flexibility and a focus on making a robot that is as close as possible to the right tool for the job; there are no superfluous abilities or complexity, as that would waste resources. An example model of such a system for flow chemistry can be seen in Figure 5. Figure 6 shows a scheme for an automated system for the exploration of an inorganic polyoxometalate chemical space involving many possible input materials. 81 The computer controls the pumps dispensing the starting materials and so can perform an array of reactions with different starting material ratios which resulted in the discovery of several new inorganic compounds. The drawbacks of the systems 81−84 include technical expense and numerous engineering challenges. Beyond solving the specific problems required by the various chemical operations, a major hurdle is the difficulty in integration of the various kinds of subsystems. 
Many subsystems, such as analytical tools and material handling, do not offer an industry-wide standard for control or even a physical interface. Thus, much work is required to integrate these devices into a larger system, especially across different vendors. It is hoped that, with time, the demand for simpler subsystems with the ability to easily integrate between vendors, as well as different kinds of modules, will result in more integration-focused products with cross-industry standardization.
A lot of work is being done to develop robots with ever-increasing complexity and ability. Recently systems with differing modules of chemical operations have been created 38,85,86 that enable several automated chemical reactions, including workup of products. Some robots are even able to conduct end-to-end pharmaceutical processes, including purification and formulation, 87−89 as depicted in Figure 7. Despite the high level of engineering in these systems and their expense, they have a lot of potential if they could be generalized. However, looking past the improvements in engineering, many systems are not reaching the fullest potential of chemical robots. Robots should not be merely a combination of modules that perform chemistry. They can be enablers of improved chemistry, which in turn can enable better chemical robots. 90 If fit-for-purpose chemistry can be coupled to enhanced capabilities of the robotic system, then the capabilities of chemical robots can be advanced. Instead of performing the same chemistry, only in an automated manner, the chemistry can be adapted to the abilities of the chemical robots, thereby acting as a multiplier for their effectiveness.
Chemical Intelligence. There is an ongoing drive toward improved automation. On one front, systems are becoming cheaper, more common, and easier to use. On a different front, researchers are working to extend the capabilities of such automated systems. 91 Beyond the engineering effort going into this field there is a more profound enhancement that automated systems require: autonomy. In addition to the layers of systems, components, and algorithms capable of automatic operation there is scope to add another layer of algorithms that will give the overall system the ability to decide on its own which experiments to execute once it is set in motion.
An obvious approach to introducing autonomy is by giving the system some level of chemical understanding. To do so, first the standard chemical representation of information needs to be digitized. Efforts to standardize this fundamental requirement have produced several previously mentioned widely used representations such as SMILES, 8 InChI, 92,93 and Mol. 16 Once this information is digitized, it becomes possible to use supervised learning for prediction. In chemistry, many types of systems, also called expert systems, use accumulated knowledge to evaluate the likely outcomes of human or computer generated hypotheses. A recent example of this approach is to use a large database of experimental results along with digital features of the chemicals involved to predict possible reactions; 94 this work is also noteworthy for using data about negative results as well as positive results. Another clear yet difficult usage of these techniques is in retrosynthesis, finding synthesis routes that match given criteria. These efforts began many decades ago 95 and are still ongoing. 96,97 The operations performed by these systems are computationally intensive and therefore are often considered independently of a running experimental system. 19 We, however, envision the use of these systems in close coupling with a running system in real-time so that the theoretical predictions are used to direct experiments, and the feedback from real-world data can be used to give fine-grained information for the expert system to improve its output.
Figure 5. Representation of a typical gas−liquid photoredox continuous flow system for gas−liquid photocatalytic transformations. The system starts from the top with a reactant gas with a mass flow controller. The gas enters a mixing zone before entering a photomicroreactor, often assembled from a coiled PFA capillary with an LED array as the light source. After the reaction quenching zone the product solution can be obtained.
However, not every chemical system is reliable. This is particularly true for scientific research as there cannot be experimental information for chemicals, reactions, and methods created as part of the research. In fact, the push to expand scientific understanding demands that we investigate systems with partial or no information. In that case, understanding of the chemical system is comparable to conducting a search within the accessible chemical space with no prior knowledge. We can define the parameters of a chemical system as a set of input parameters and relate them to the resulting state of the chemical system. This allows us to map any set of input parameters of a given size. Different sets of input variables can have the same output, yet the reverse is not allowed; there cannot be more than one output from the same set of input values. All the states and the definition of their inputs comprise a space which can be viewed as a surface (see left plot in Figure 8), for which each point has an outcome associated with it, which is the chemical and physical state. The point on the surface is the chemical space resulting from performing an experiment with specific input parameters.
The outcome of any experiment in a chemical system is the physical and chemical state of all the constituent parts of the system from the lowest level of molecules up to clusters, micelles, and any other compound structure. The full richness and information about these systems often cannot be evaluated exactly. First, there is a matter of output variability, as even conducting a repetition of an experiment with the same input parameter values will likely yield an outcome that is within a distribution of outcomes. Second, the chemical and physical state of a system is difficult to know exactly down to the individual molecule level, thus introducing experimental uncertainty. Although the entire complete chemical state of a system is likely hard to measure, there is a practical level of knowledge that can be reached. For a desired level of knowledge about the chemistry, there is undoubtedly a set of measurements that contain the relevant information about the state. The measurements represent the real outcome by a mapping function. This mapping function relates the results of the measurements to the desired information about the outcome. A schematic example of the results from different utility functions can be seen in Figure 8. When the input parameters are designed or otherwise known, understanding the chemical system is the same as learning these two functions: the space function which would give the results of measurements for any given input point, and the mapping function which ties the measurements to the representative chemical outcome. Presenting the experimental chemical system in this way is a prerequisite for an autonomous system to be able to conduct experiments that improve the chemical understanding of the system, especially when aiming at discovery.
Figure 8. Surface of a model function with two continuous input parameters. The left shows the real space where the outcome is the chemical system, and on the right are three different plots originating from approximating the real system with different utility functions M1−M3, where M1 is the difference between peaks in the mass spectrum, M2 is the amplitude of the UV/vis spectrum at a given wavelength, and M3 is a combination of the former two.
Algorithm Design for Chemical Discovery. The choice of algorithms to use for discovery in chemical systems can lead in many different directions with many forks in the road. By understanding the characteristics of the different chemical spaces and algorithms we can make the selection easier. Organic chemical space is vast: for molecular weights up to 500, it is estimated that more than 10^60 molecules 98 might be stable. Because reactions between these molecules occur only under a limited range of conditions, the space is in essence extremely sparse. This stems from a basic truth that most molecules, under most conditions, do not react with most other molecules. This leaves many possible combinations of reaction conditions and starting materials empty. The main problem with sparsity is that it becomes difficult to get statistically significant understanding about the space with which to make better decisions. An additional problem in this type of space is that for chemical systems we have additional constraints such as time, expense, and availability. Given that the clear majority of chemical experiments are destructive to the starting materials, this forms a hard limit on the total number of experiments that can be done. If possible, the design of the chemistry to use in a system should use heuristics to focus on the options that reduce sparsity. In fact, in most chemical systems this is an intuitive method. A chemist uses their knowledge of chemical reactivity to choose a set of chemicals and conditions that constitute a portion of the chemical space that is dense. In experimental terms, that means that a significant portion of the possible reactions produce a measurable result. Some spaces, however, either cannot be designed or cannot be guaranteed to be dense. Fortunately, there are also some systems that are not only dense, but also convex, or in other words, the space function has a single global maximum.
A common case would be the yield of a reaction as a function of temperature; from the peak of yield at a certain temperature the yield will decrease continuously in both directions. These types of systems lend themselves easily to optimization, and it is common in chemistry to solve these problems with various DoE algorithms, as discussed earlier. However, most interesting scientific problems stand to be more complex than that. For instance if there are a number of combinations of variables that lead to high yields then it is not trivial to find which of these regions is the best without measuring the entire space.
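For the single-peak case described above, a very small number of experiments can locate the optimum. The sketch below uses golden-section search, which is not one of the DoE methods named in the text but is a standard technique for a unimodal function of one variable. The yield curve is a hypothetical noise-free model with its peak at 60 °C; real measurements are noisy, so in practice replicates or a noise-tolerant method would be needed.

```python
import math


def golden_section_max(f, low, high, tol=1e-3):
    """Locate the maximum of a unimodal function f on [low, high].

    Returns (estimated position of the maximum, number of evaluations of f).
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = low, high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    evaluations = 2
    while (b - a) > tol:
        if fc > fd:
            # The maximum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
        evaluations += 1
    return (a + b) / 2, evaluations


def simulated_yield(temperature_c):
    """Hypothetical yield (%) with a single peak at 60 degrees Celsius."""
    return 90.0 * math.exp(-((temperature_c - 60.0) / 15.0) ** 2)


best_t, n_experiments = golden_section_max(simulated_yield, 20.0, 100.0, tol=0.5)
print(best_t, n_experiments)
```

Each evaluation of `simulated_yield` stands in for one experiment. The approach breaks down as soon as the curve has more than one peak, which is the situation the text turns to next.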
A simple way to tackle the search problem is the application of random experiments in order to explore the space. This has proven to be useful in combination with clever heuristics to improve search efficiency. 99,100 However, the process is not robust, and it is hard to statistically validate the outcome since it would require many repetitions of different sets of random experiments over the same space. On the other hand, brute-force algorithms cover the entire possible space. 101 This reduces the odds of missing interesting outcomes but would be impractical for many systems given resource constraints. A comparison between random and brute-force algorithms can be seen in Figure 9A,B. Many optimization algorithms for solving complicated systems are instead stochastic. These are divided into two classes of algorithm: instance-based and model-based. Both classes of algorithm choose the next experiment based on the previously performed experiments. This means they use closed-loop feedback: experiments are performed to gather more information, which is used to choose the next experiments, and so on. For instance-based algorithms such as simulated annealing, 102,103 particle swarm optimization, 98 and genetic algorithms, 63,99 the sequence of chosen experiments aims to follow a general direction of improvement of the outcomes, yet there is no model being constructed or updated. On the other hand, a model-based algorithm builds a model from the experiments performed so far and updates it as new results arrive. The model can be seen as an approximation of the space function. This function can be constructed using an additional algorithm such as support vector machines, 101,102 self-organizing maps, 104 and kriging. 105 As the models built during the search are closely related to the surface function, they are more useful in terms of discovery.
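The closed-loop, instance-based search described above can be sketched with simulated annealing on a one-dimensional space. Everything specific here is an assumption made for illustration: the two-peak outcome function, the linear cooling schedule, the step size, and the experiment budget. The `run_experiment` callable stands in for a real experiment on a platform.

```python
import math
import random


def simulated_annealing(run_experiment, low, high, n_experiments=60,
                        step=0.1, start_temp=1.0, seed=0):
    """Maximize run_experiment(x) on [low, high] with a fixed experiment budget.

    Returns (best x, best outcome, history of all (x, outcome) pairs).
    """
    rng = random.Random(seed)
    current_x = rng.uniform(low, high)
    current_y = run_experiment(current_x)
    best_x, best_y = current_x, current_y
    history = [(current_x, current_y)]

    for i in range(1, n_experiments):
        temp = start_temp * (1 - i / n_experiments)  # linear cooling, stays > 0
        proposal = current_x + rng.gauss(0, step * (high - low))
        candidate_x = min(high, max(low, proposal))  # keep inside the bounds
        candidate_y = run_experiment(candidate_x)
        history.append((candidate_x, candidate_y))

        # Always accept improvements; accept worse results with a
        # probability that shrinks as the temperature drops.
        delta = candidate_y - current_y
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current_x, current_y = candidate_x, candidate_y
        if candidate_y > best_y:
            best_x, best_y = candidate_x, candidate_y

    return best_x, best_y, history


def two_peak_outcome(x):
    """Hypothetical outcome surface with a minor peak at 0.2 and a major one at 0.7."""
    return 0.6 * math.exp(-((x - 0.2) / 0.05) ** 2) + math.exp(-((x - 0.7) / 0.1) ** 2)


best_x, best_y, history = simulated_annealing(two_peak_outcome, 0.0, 1.0)
print(best_x, best_y, len(history))
```

Note that the algorithm keeps only the current and best points; no approximation of the space function is built, which is what distinguishes it from the model-based class.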
Discovery does not mean that the chemical system is described in its entirety by a model. Rather, it is the new information gained from a new experiment. In other words, a discovery occurs when the model needs to be updated by a substantial amount to better match the real space function. Finding new results that differ from previous data in a statistically significant way is called outlier or anomaly detection. It is an area of significant research 106,107 as in many settings it is important to know when new data is different enough to merit special attention. Figure 9C,D shows examples of outliers. Outliers indicate a statistical difference from expectation and as such can indicate either a positive discovery or a worsening of the outcome. It is the mapping function that must be able to distinguish between these possibilities. Such a mapping function should assign a high value to an outlier that represents a real discovery, whereas an outlier with a negative outcome should receive a low value. Both positive and negative values should represent a significant deviation from expectation, which means that they both add substantially more information about the chemical space.
Figure 9. Model of the first five experiments conducted in a 1D system, whose surface is the red line, chosen randomly (A) and with a brute-force approach (B). (C, D) Examples of outliers: in (C) the outcome is a statistical outlier from the three experiments, and in (D) it is an outlier due to a deviation from the real outcome surface.
When the input parameters are designed or otherwise known, understanding the chemical system is the same as learning these two functions: the space function which would give the results of measurements for any given input point, and the mapping function which ties the measurements to the representative chemical outcome.
Performing experiments to completely understand the function describing a chemical system is in many cases impossible. Even if it is possible, it may be impractical, and even if practical, it is likely to be inefficient. The shape of the model that any algorithm would be able to produce depends on the mapping function. Even for the same surface function, if the desired outcome from the mapping function is changed, the shape of the surface changes as well, as depicted in Figure 8. Therefore, even if the space is fully explored, the shape of the resulting function may not match the real system, as the mapping function must always be an approximation. Furthermore, using a static mapping function will block an avenue of discovery and limit the possible discoveries to the shape of the surface exclusively. It can therefore be useful for discovery if the algorithm used to understand the space function and the algorithm used to define the mapping function are connected and coevolve. As the exploration of the system progresses, the mapping function needs to be updated as well, so both move together to gain a better understanding of the system and the outliers that should be of most interest.
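As a minimal sketch of the outlier test discussed above, the following example scores a new outcome against earlier outcomes with a z-score and labels it as a positive outlier, a negative outlier, or an expected result. The signed score is only a simple stand-in for a mapping function; the function names, the threshold of 3.0, and the yield values are hypothetical and are not taken from the cited work.

```python
import statistics

def outlier_score(previous, new):
    """Signed z-score of a new outcome relative to earlier outcomes."""
    mean = statistics.mean(previous)
    spread = statistics.stdev(previous)  # needs at least two earlier outcomes
    if spread == 0:
        raise ValueError("earlier outcomes are identical; z-score undefined")
    return (new - mean) / spread

def classify(previous, new, threshold=3.0):
    """Label a result; the threshold of 3.0 is an assumed, arbitrary cutoff."""
    z = outlier_score(previous, new)
    if z >= threshold:
        return "positive outlier"   # candidate discovery: high value
    if z <= -threshold:
        return "negative outlier"   # worsened outcome: low value
    return "expected"

# Hypothetical yields (%) from three earlier experiments.
earlier = [41.0, 43.0, 42.0]
for new_yield in (78.0, 5.0, 43.5):
    print(new_yield, classify(earlier, new_yield))
```

Both outlier labels mark results that deviate strongly from expectation and so carry new information, while the sign of the score separates an improvement from a worsening. In a coevolving scheme, the threshold and the scoring rule would themselves be revised as more of the space is explored.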

■ CONCLUSIONS
While algorithms are very widely used in the chemical sciences, there is potential to expand their use beyond data processing to decision making and active searching of chemical space. 108 By exploring the types of algorithms that are needed to accomplish different goals, it is possible to build on those used in standard chemical work as well as on classes of algorithms that extend research into areas that would otherwise not be accessible. The key excitement lies in the potential of these developments for chemical discovery. By explaining the inherent problems of conducting research in the scope of chemical space, we have shown that such scientific problems can be related to optimization and searching methods. 109 We have shown the importance of the definition of the space function and the utility function. Finally, we have explained how the coupled exploration of the space and utility functions might assist in real discovery, and this might also be applicable to more complex chemical systems. 110 As such, we feel there are two directions of development for the use of algorithms in chemistry. The first is employing algorithms in standard chemical science. An increasing selection of algorithms is finding use in chemical research at different levels of operation. However, these algorithms need suitable frameworks and software foundations for integration in chemical systems, so that they can be used as tools by nonexperts. The second direction is improving the algorithms used to develop systems capable of new discoveries. Here, new algorithms are being implemented, and existing algorithms are being modified to suit the chemical world. Many of these algorithms will be used for discovery and for expanding chemical space to search for new, undiscovered possibilities. Finally, the use of algorithms helps scientists to set up entirely new models of interactions, behaviors, and expectations of discovery.
Consequently, this allows us to define a new area of chemistry, that of "meta-chemistry". This might be compared to "meta-physics", whereby radical new models of reality emerge from making logical arguments with existing data.