Structure and Pore Size Distribution in Nanoporous Carbon

We study the structural and mechanical properties of nanoporous (NP) carbon materials by extensive atomistic machine-learning (ML) driven molecular dynamics (MD) simulations. To this end, we retrain a ML Gaussian approximation potential (GAP) for carbon by recalculating the a-C structural database of Deringer and Cs\'anyi [Phys. Rev. B 2017, 95, 094203] adding van der Waals interactions. Our GAP enables a notable speedup and improves the accuracy of energy and force predictions. We use the GAP to thoroughly study the atomistic structure and pore-size distribution in computational NP carbon samples. These samples are generated by a melt-graphitization-quench MD procedure over a wide range of densities (from 0.5 to 1.7 g/cm$^3$) with structures containing 131,072 atoms. Our results are in good agreement with experimental data for the available observables, and provide a comprehensive account of structural (radial and angular distribution functions, motif and ring counts, X-ray diffraction patterns, pore characterization) and mechanical (elastic moduli and their evolution with density) properties. Our results show relatively narrow pore-size distributions, where the peak position and width of the distributions are dictated by the mass density of the materials. Our data allow further work on computational characterization of NP carbon materials, in particular for energy-storage applications, as well as suggest future experimental characterization of NP carbon-based materials.


Introduction
Nanoporous materials, where the pore-size distribution can be tuned according to growth conditions, can be engineered for specific applications, such as ionic and molecular transport, 1 biosensors, 2 air and water purification, 3 or energy storage. 4 In particular, nanopores in disordered graphitic carbons arise due to the misalignment and local curvature of the graphenelike sheets that make up these materials. These pores are defined as any interstitial space between graphene planes larger than the typical interlayer spacing in graphite (≈ 3.5Å). 4 When the pore sizes are of the order of a few nanometers, the graphitic carbons are referred to as "nanoporous" (NP) carbons. Nanopores significantly increase the effective surface area of NP carbons and confers this class of carbon materials with the ability to intercalate a number of small particles. 5 Most notably, the nanopores can accommodate ions which can then migrate through the material or intercalate/deintercalate upon application of an external electric field, a mechanism exploited, for instance, in Li-ion batteries. 4 In addition to the existence and utility of pores in NP carbons, graphitic carbons in general display many attractive features that make them versatile materials for a number of technological and industrial applications. They are chemically stable, thanks to the large cohesive energy of the sp 2 bonds in elemental carbon. This makes them naturally resistant to corrosion and other degradation processes affecting material lifetimes in devices where electrochemical reactions are taking place, such as electrochemical sensors and batteries. 2,4 Graphitic carbons also display some astonishing mechanical, thermal and electronic properties, with graphene or carbon nanotubes (CNTs) topping the charts of known materials in terms of some of their physical properties. [6][7][8] At the same time, carbon materials can be chemically functionalized to alter their reactivity, an important example being graphene oxide (GO). 9,10 While graphite itself shows formidable inplane strength, it can be easily deformed along the direction of stacking of its constituent graphene monolayers. On the other hand, disordered graphitic carbons, such as NP and glassy carbons (GCs), can show isotropic mechanical response to dimensional changes in all spatial directions even in the absence of noticeable sp 3 bonding. This happens whenever the graphitic planes are randomly oriented such that there is no preferential direction of stacking. These graphitic carbons are known as "non-graphitizing", because it is not possible to experimentally transform them into crystalline graphite even when applying large amounts of pressure and/or heat. 4,11 For these reasons, NP carbons have attracted a great deal of attention in recent years and in particular because of their potential for electrochemical and electrocatalytical applications, and for energy storage. Unfortunately, as is often the case for disordered materials, experimental characterization of the structure of NP carbons is hindered by the lack of mid-and long-range order, which renders common characterization techniques inconclusive. The poresize distribution in NP carbons is a critical parameter to assess their potential for ion intercalation. To this end, in this work we computationally examine the structure and pore-size distribution in NP carbons over a wide range of mass densities using state-of-the-art machinelearning-driven molecular dynamics (MLMD) simulations. 12 Overcoming typical shortcomings of accurate but expensive density functional theory (DFT), on the one hand, and cheap but inaccurate empirical interatomic potentials, on the other, MLMD grants us access to accurate atomistic simulation on time and length scales previously out of reach. 13,14 In particular, we carry out computational generation of NP structures, with approximately 130,000 atoms each, in the mass density range from very low (0.5 g/cm 3 ) to relatively high (1.7 g/cm 3 ). The large number of atoms allows us to simulate pores whose size is not limited by the dimensions of the simulation box. On these structural models, we then perform a detailed characterization of the atomistic structure, pore-size distribution and elastic properties. We expect that these results will be useful in planning and interpretation of experimental studies of NP carbons targeting specific pore sizes, as well as to achieve a better fundamental understanding of the structure of this important class of carbon materials.

Methodology
Nanoporous carbons can be experimentally synthesized by high-temperature heating ( 1800 K) of a carbon-rich precursor of either organic or inorganic origin. 4 At these high temperatures, the volatile molecules and elements in the precursor compound other than carbon will evaporate away, leaving behind the carbonrich graphitic material. Since this is a very complicated process to simulate directly (even with MLMD) due to the complex chemistry and time scales involved, in this study we pursue a different route. We prepare a highly disordered (liquid) pure carbon precursor at very high temperature (9000 K) and then graphitize this system at 3500 K using MLMD at different mass densities. This approach, 15,16 which we review in detail in this section, allows us to model the process of graphitization within accessible simulation times and leads to the generation of NP carbons with a wide range of pore sizes.

GAP potential for carbon
Our MLMD method of choice is the Gaussian approximation potential (GAP) framework. 17,18 GAP is a kernel-based ML method for interpolation of atomic potential energy surfaces, i.e., interatomic energies and forces. To realistically simulate carbon graphitization we have retrained the GAP interatomic potential for a-C of Deringer and Csányi 19 (GAP17) for improved accuracy and computational performance. We used the same structural database of Ref. 19 as a baseline, but incorporated additional graphite and dimer structures, and recomputed it at the PBE+MBD level of theory 20,21 as implemented in VASP. [22][23][24] The GAP17 database contains dimers, amorphous and liquid carbon structures, amorphous surfaces, diamond, and graphite configurations. The GAP potential itself uses kernel-based regression 17 with two-body (2b), three-body (3b) and many-body (mb) descriptors. Within this framework, a local atomic energy is predicted by the following expression: where k(d * , d s ) is a kernel, measuring the similarity (with a score from 0 to 1) between the atomic descriptor of the test configuration, d * , and the atomic descriptors in the database, . Each atomic environment, denoted with a star (*), has a set of 2b, 3b and mb descriptors associated to it. Typically, this comprises a 2b descriptor for each atom pair it belongs to, a 3b descriptor for each atomic triplet it belongs to, and one mb descriptor. The prime symbol indicates that the sums over neighbors are carried out up to a certain cutoff. The cutoff for the mb descriptor is embedded directly into it, since the mb descriptor itself is constructed using truncated sums over neighbor pairs. Descriptors of different types use different kernel functions, and have different sets of fitting coefficients {α s } Nsparse s=1 (obtained during training) and energy scale parameter δ associated to them. Finally, N sparse , which is related 1 to the number of training configurations in the database, is also descriptor specific. Forces can be obtained analytically from the gradients of Eq. (1). A detailed account of this approach has been given by Bartók and Csányi 18 and in the recent review by Deringer et al . 25 The retrained GAP uses the same 2b and 3b descriptors as the original GAP17 19 but replaces the SOAP mb descriptors 26 by soap turbo descriptors. 27 soap turbo offers computational and accuracy 1 We differentiate between the training set and the sparse set. The training set is all the data (atomic structures and target properties) used to train the model. The sparse set is a small subset of the descriptors found in the training set, which is used to construct the kernel functions.
advantages, such as faster computation of descriptors, smoother radial basis functions and compression (a dimensionality reduction of the descriptor resulting in significant speedup).
The longest range for interatomic interactions that are accounted for by this GAP is 3.7Å, as in GAP17. Van der Waals interactions are also explicitly incorporated, in a limited way, via a tabulated long-range pairwise interaction, which is optimized to reproduce the binding curve of graphite, as previously done in Ref. 28. We also used the tabulated potential to model the strong exchange repulsion taking place at very short interatomic distances; our "core" potential is fitted to reproduce the dimer curve of carbon up to 1.8 keV (or, equivalently, down to 0.1Å separation). The van der Waals and corepotential corrections were absent from GAP17, and therefore we also expect this GAP to improve these two aspects, relevant for graphitic carbons and high-temperature or collision simulations, respectively. Training of the new GAP was carried out with the QUIP and GAP codes (with soap turbo libraries). All testing and production MD simulations were done with the TurboGAP interface. 29 The new GAP is freely available online. 30 To ensure that the new GAP can accurately simulate the graphitization process, we performed some benchmarks as summarized in Fig. 1. We note that the root mean-square errors (RMSEs) on the "split test set" configurations (atomic configurations generated similarly to those used for training but that are excluded from the fit and reserved for testing) are smaller than those of the original GAP17. On the same test set, the new GAP reduces the error by 26% and 9% on energies and forces, respectively, as compared to GAP17. A more application-specific accuracy test was run with the new GAP for the vacancy defect formation energies in a graphene sheet. An accurate determination of these energies is critical because the graphitization process consists essentially of rearrangement of C atoms preferentially into 6-membered rings, with any deviation constituting a defect and thus incurring an energy penalty. The resulting ring and defect distribution in our porous carbon will thus be highly  sensitive to the GAP's ability to accurately reproduce these energies. In Fig. 1 c) we show how this GAP indeed succeeds at predicting the relative vacancy formation energies of a series of representative defects in close agreement with DFT. This is a stringent transferability test since structures similar to these are not included in the training set. The RMSE for defect formation energy in graphene shown in Fig. 1 c) are 0.52 eV/defect for the new GAP and 1.41 eV/defect for GAP17. Based on our tests with this and other GAPs, we ascribe this improved transferability to the use of the new soap turbo descriptor, which provides smoother  kernels.
The GAP also reproduces the cohesive energies ( Fig. 2), structural and elastic properties (Table 1) of the three most important carbon allotropes: diamond, graphite and graphene. We highlight the accurate description of interlayer vdW interaction in graphite, with the only significant deviation being the C 33 elastic constant. This is a particularly positive feature of this potential given the notorious difficulty encountered by carbon force fields to reproduce vdW binding; 31 PBE-DFT itself fails to bind graphite at all in the absence of explicit vdW corrections.
A crucially important aspect of the MLMD approach concerns its computational efficiency.
Here we performed a comparison between the GAP17 running with LAMMPS 32,33 and the new GAP running with TurboGAP for a test configuration containing 1000 C atoms (simple cubic lattice with 2 g/cm 3 density and random initial velocities). We ran 1000 MD steps on a 40 CPU-core machine (the same hardware for both codes). The TurboGAP calculation ran in 0.47 CPUh or 1.7 ms/atom/MD step (42 s or 42 µs/atom/MD step of walltime) whereas the LAMMPS calculation ran in 3.3 CPUh or 12 ms/atom/MD step (297 s or 297 µs/atom/MD step of walltime), i.e., the new GAP runs about seven times faster and is more accurate. There are three main reasons for this significant speedup. First, the soap turbo descriptors are intrinsically faster than the original SOAP because of more efficient iterative algorithms. 27 Second, soap turbo implements compression, which in itself can be the single largest contribution to computational speedup in kernel-based ML potentials. 34 Third, TurboGAP implements efficient polynomial 3b kernels, which are also significantly faster to compute than the squared-exponential 3b kernels used in GAP17.
The result of adding the vdW correction is a computational overhead that strongly depends on the cutoff for the vdW interactions. For this benchmark comparison, a 10Å vdW cutoff, the same as for the graphitization simulations in the next section, led to ≈ 13 % slower TurboGAP execution (48 s, 0.53 CPUh or 1.9 ms/atom/MD step). Therefore, with the new GAP and using the TurboGAP interface, it is possible to tackle much larger systems than before which is of critical importance to achieve realistic pore sizes in the present work.

MD simulation details
We carried out all the MD simulations with the new GAP interatomic potential for carbon and the TurboGAP code. 29 For a comprehensive characterization of NP carbon, we performed simulations over a wide range of mass densities: 0.5, 0.9, 1.3 and 1.7 g/cm 3 . For each density value, a cubic simulation box with 131, 072 carbon atoms is generated with randomized initial atomic positions. The side lengths of the cubic boxes are 17.36, 14.27, 13.62, and 11.54 nm for the four densities considered above, respectively. The simulation box dimensions give an idea of the maximum pore sizes that can be modeled.
The protocol for NP carbon generation is as follows [see Fig. 3(a)]. First, we hold the highly disordered structure at a constant high temperature (9000 K) in the liquid state for 20 ps. Second, we quench the liquid with a linear decaying temperature profile for 5.5 ps down to 3500 K. The structure is then kept at 3500 K for 200 ps. This is the graphitization step, 15,16 and the time has been specifically optimized to ensure that no significant further graphitization takes place after this period by monitoring the creation of 6-membered rings as a function of time, which is negligible after 200 ps at 3500 K [see Fig. 3(b)]. Finally, the graphitized sample is quenched down to 300 K over 3.2 ps, and annealed at 300 K for 10 ps to obtain the room-temperature structures. The velocity-rescaling Berendsen thermostat 35 with time constant τ t = 100 fs was used to control the temperature, and all GAP-MD simulations were performed under periodic boundary conditions within fixed box dimensions, with a Verlet integration time step of 1 fs.
For each density we performed six independent MD simulations as described above to assess the effect of individual random-seed structures on the final results. This is not needed for studying short-range structural properties such as coordination statistics, but is needed for accurate calculations of medium-and long-range structural properties such as ring statistics and pore-size distribution, as well as elasticity.

Structural characterization
After obtaining the MD trajectories we used OVITO 36 to calculate the radial distribution function (RDF) and the angular distribution function (ADF). The cutoff distance here was chosen as 2.0, 1.9, and 1.8Å at 9000, 3500, and 300 K, respectively. This software was also used to render the structural images and calculate the coordination numbers of the sp, sp 2 and sp 3 structural motifs using a cutoff distance of 1.8Å, which fully encloses the first-neighbors shell for NP carbon. We used a single frame to calculate the RDF and ADF which is sufficient 37,38 for systems with a large number of atoms.
The reciprocal-space diffraction scattering patterns, I (Q), were calculated using the Debye scattering equation as implemented in the Debyer code: 39 Here, Q = 4π sin(θ)/λ is the scattering parameter, where θ is the diffraction half-angle, λ is the wavelength (1.5406Å for the commonly used Cu K-α), r ij is the distance between atoms i and j, and f is the atomic scattering factor for a single carbon atom. As noted in Ref. 37, the minimum image convention causes problems when using the Debye equation, and therefore it is necessary to map the atoms into the primitive cell and treat the structure as an isolated cluster. The coordination numbers only account for short-range structural properties. To study the medium-range structural properties, we calculated ring statistics, using the shortest-path algorithm of Franzblau 40 as implemented in the I.S.A.A.C.S. code. 41 We used the same cutoff distances for nearest neighbors as in the calculation of the ADF.
To quantitatively characterize the nanopore structures arising in our samples, we calculated the size and shape distributions of the pores using the zeo++ code. 42 For the pore size, first the three-dimensional Voronoi network 43 is computed to establish the void space accessibility within the pores, using the Voronoi decomposition method implemented in the Voro++ library. 44 We further take Monte Carlo samples to determine the accessible volume, and eventually determine the radius of the largest sphere that encloses the sampled point without overlapping any of the C atoms in the NP structures. As a complement to pore size, we also present corresponding pore-shape information using a stochastic ray-tracing algorithm 45 that shoots random rays through the accessible volume of the Voronoi cell until they hit the wall of constituent C atoms. We take a probe radius of 0.7Å (half of the typical bond length in the NP carbon samples) and 130,000 points for Monte Carlo sampling.

Short-range order in liquid and NP carbon
To characterize the short-range order, we present the RDFs for the liquid (9000 K), graphitized (3500 K) and NP (300 K) structures generated during the melt-quench process in Fig. 4. We consider the product of RDF g(r) and density of particles n to highlight the comparison between the different density systems. The RDFs converge to their respective average density of particles n (0.025, 0.045, 0.065 and 0.085Å −3 for 0.5, 0.9, 1.3 and 1.7 g/cm 3 , re-  spectively) as the interatomic distance r tends to infinity.
Similar to the results obtained by GAP17, 19 the liquid structures in Fig. 4(a) are less strongly ordered and more diffuse, showing a nonzero minimum at ≈ 2.0Å, whereas the solid phases in Fig. 4(b) and (c) are more ordered even in the medium-range regions, exhibiting more individual peaks. These solid phases, especially at 300 K, display a clear gap between their first (r 1 ≈ 1.42Å) and second (r 2 ≈ 2.46 A) peaks. Compared to the solid structures, in the high-temperature liquid in Fig. 4(a) the peak shifts to a smaller r (≈ 1.35Å), which is also associated with the presence of a bimodal angular distribution in the corresponding ADFs of Fig. 5(a).
After the 3500 K structure is quenched down to room temperature, the peak at r 3 ≈ 2.84Å becomes clearly noticeable in Fig. 4(c), which corresponds to the third nearest-neighbor distance in a graphite hexagon. That the first three peak positions show a close correspon-dence to the first three neighbor distances of graphite implies 6-membered rings are the dominant short-range structural motif in these NP structures. Also, in the locally magnified view of Fig. 4(c), a small shoulder at ≈ 2.32Å (corresponding to the second nearest-neighbor distance in a carbon pentagon) appears in the early second nearest-neighbor region, a signature of the presence of some 5-membered rings.
To accompany the discussion with a coordination analysis of the first nearest-neighbor regions of RDF above, we give ADFs in Fig. 5. At 9000 K, in addition to the main peak at ≈ 120 • , an additional smaller one appears at ≈ 53 • (close to the 60 • in an equilateral triangle, Fig. 5(a)), which means there are some irregular triplets in existence in the liquid state. These irregular triplets might be the main cause for the shifted RDF peak in 4(a). We also note sizable differences in their ADFs across the four sampled densities in the liquid state. This is in contrast to the graphitized [ Fig. 5(b)] and room-temperature NP [ Fig. 5(c)] samples, for which the coordination statistics show very little dependence on the sample density. This is a clear sign that the short-range order in NP carbon is rather independent of mass density.
For the liquid samples, the ADF values all tend to zero at 180 • , indicative that there is almost no presence of flat linear sp-bonded chains. The most stable linear sp chains in solid carbon are expected to show bond angles in the vicinity of 155 • , 46 where the ADF in [ Fig. 5(a)] is still sizable. Our present result is inconsistent with previous findings for simulated liquid carbon, 19,47 in which small-size systems containing 216 atoms were modeled under period boundary conditions. Our tests indicate that this discrepancy is due to small inaccuracies at very high temperature for the energetics of linear chains as predicted by our GAP, which is biased towards the more stable bent chains (∼ 155 • ) observed at low temperature.
During the graphitization step, in Fig. 5(b), unstable triplets rearrange into more stable motifs centered at ≈ 120 • . We can clearly see the signature of near-complete graphitization, which mirrors the main structural feature of graphite hexagons. Notably, in quenched NP Morphology and short-and medium-range characterization of NP structures Figure 6 shows the cross-sectional morphology of our quenched NP carbon samples for the four densities studied. Clearly, most of the carbon in these samples makes up curved graphene fragments with predominantly sp 2 bonding, quantified in Table 2. These fragments assemble into three-dimensional networks, where edges contain sp motifs and planes can become interlinked through sp 3 motifs (cf. Fig. 6(eh)), which is the same role played by interstitial defects in graphite. 49 One of the distinctive characteristics of NP carbon is the typical nanostructuring of the various tangled graphitic ribbons and graphene sheets, as shown in Fig. 6. Obviously, the mass density is the main factor determining the pore sizes, with large open pores observed at low density and tightly packed graphitic planes interlinked to each other at high density. Due to their im-portance for emerging and established applications, pore-size distributions merit further discussion. We give an in-depth analysis of pore morphology later in this section. For all densities, 6-, 5-and 7-membered rings, in that order, are the main repeat units making up the individual graphene layers as quantified in Fig. 7 through ring statistics. Ring statistics as a measure of the structural topology is useful to characterize medium-range order and even to reflect the curvature of the graphitic planes. Figure 7 shows the number of rings normalized per atom, from triangles to dodecagons, for all four densities studied, in both high-temperature graphitized and roomtemperature NP carbon samples. As a reference point, ideal graphene or graphite has 0.5 hexagons per atom and zero curvature. Consistent with the extremely high sp 2 fraction reported in Table 2, our results show a predominance of hexagons over other rings, with an average count of 0.304 (0.299 at 3500 K). The second most common ring motif is pentagons with 0.088 rings/atom (0.086 at 3500 K), followed by heptagons with 0.054 rings/atom (0.053 at 3500 K). 6-, 5-and 7-membered rings are practically the only ring motifs contributing to the morphology of NP carbon, with the next most common motif (octagons) over two orders of magnitude less abundant than hexagons. Many of the unstable defect rings that remain after 200 ps of graphitization heal after quenching to 300 K, in particular the 3-membered rings. Ring distributions are very similar across all four densities except for larger rings, where differences are only noticeable because of the logarithmic scale used in Fig. 7. We also note that the small standard deviation (computed from the different random seeds employed at each density) confirms the robustness of the simulation protocol. Therefore, we conclude from our results that ring statistics have no density dependence.
In order to ensure the validity of the NP carbon structures obtained here, it is of great importance to assemble the evidence by calculating several key structural and mechanical indicators and comparing them with experimental data. Among them, diffraction patterns and  Figure 8 presents (a) the intensity of X-ray diffraction (XRD) and (b) the G(r) of the reference experimental samples and our simulated NP samples. We take the 1.7 g/cm 3 model structure as the closest to HTW2500, because of the similarity in mass density. The labeled peaks in Fig. 8 The interlayer separation can be estimated from Bragg's law according to the relation for the lth-order reflection: where d 00l gives the interplane separation, λ is the wavelength of the X-ray light source and θ 00l is the diffraction half angle at which the {00l} peak is observed. The shift toward smaller scattering angles in Fig. 8(a) where K ≈ 1 is the Scherrer constant and B(2θ) is the FWHM (in radians) of a given peak on the XRD pattern. The size of in-plane "crystalline" pockets can be inferred from the data for all densities, and it is L a ≈ 4Å for all of them. This number gives an idea of the typical regions in a graphitic sheet free of defects, i.e., made of hexagonal rings only. This quantity is in good agreement with the experimental porous sample which we compare to ours in Fig. 8. The size of crystallites along the direction of stacking can only be inferred for our 1.7 g/cm 3 NP sample, as L c ≈ 9Å. Therefore, in this sample the graphitic sheets are stacked locally similarly to a graphite crystal within length scales encompassing about three layers.

Pore-size distribution
Porosity is one of the most significant properties of NP carbons and the one that determines their usefulness for many technological applications. Therefore, the characterization of porosity and pore structures in our computational samples is a central part of this work. The normalized pore-size distribution (PSD) and ray-trace histogram are both plotted in Fig. 9. Comparison to experimental observations is challenging because of the difficulty in directly measuring PSD in experiments. In Fig. 9(a), the PSD histograms behave similarly to normal distributions. We can clearly see, as expected, that the PSDs shift to smaller diameters with increasing density, and their peaks appear, in decreasing order of density, at 0.47, 0.84, 1.37 and 2.68 nm for 1.7, 1.3, 0.9 and 0.5 g/cm 3 , respectively. As shown in Fig. 6, a more open structure means more available space between graphitic planes and, consequently, the presence of larger pores. At the same time, as the density increases the PSDs become sharper (i.e., the individual pores have less variation in size). It should be pointed out that our PSD analysis is based on a spherical approximation to pores and highly sensitive to small changes in the pore diameter. It does not reflect subtle changes in surface texture features such as pore shape. 42 To this end, stochastic ray tracing analysis, as complementary information for pore structure determination, is also shown in Fig. 9(b). Following the trend in PSDs, sharper peaks appear at smaller ray length for denser samples. In general, the low density of NP carbon explicitly gives a long trailing ray length, which is mainly caused by open structures, where some rays can travel freely through continuously connected pores until reaching the wall of a pore. Their peaks appear, in decreasing order of density, at 0.31, 0.84, 1.37 and 2.68 nm ray length for 1.7, 1.3, 0.9 and 0.5 g/cm 3 , respectively. Especially for lower densities, the fact that the PSD peak position is almost the same as that of ray trace indicates that a localized enclosed spherical pore curvature is the dominant morphological feature in the samples modeled. This observation agrees with the experimental prediction for the existence of spherical rather than channel-like or cylindrical-like void regions in GC. 50 This result might be related to the excess of 5-membered rings as compared to their 7-membered counterparts, because the former leads to positive curvature preferring to form closed spheres. Note that one of the most common crystallographic defects in graphene, the Stone-Wells defect, 53 involves the direct substitution of four 6-membered rings with two 5-membered and two 7-membered rings. 5-and 7-member rings arising in pairs allow to maintain the planarity of the graphitic sheets, therefore, an excess of either type of ring will lead to curving of the graphitic planes. An extreme limit of this is the C 60 fullerene, where only 5-and 6-membered rings are present. Moreover, the peak of ray distribution for the 1.7 g/cm 3 sample shifts slightly (by 1.6Å) to its corresponding PSD value, which is due to the local interstitial space between graphitic planes. This suggests the presence of some pockets of crystallites, i.e., oriented stacked graphitic sheets, in the 1.7 g/cm 3 sample. This is also evidenced by the {002} reflection in the XRD patterns and its size evaluation via the Scherrer equation above.
Finally, we note that, as in the case for ring counts, the good agreement (standard deviations visible only at the peak of the PSDs) between the six individual random-seed runs for each density gives confidence in the robustness of the simulation protocol.

Elastic properties
Under realistic conditions materials are subjected to different levels of mechanical stress. If the material's response to stress is poorly understood, this can lead to unexpected or unwanted effects, or even to catastrophic consequences such as material failure. Thus, understanding the mechanical properties of disordered graphitic carbons is of central importance with regard to their technological and industrial applications. Therefore, we have evaluated the elastic properties of our simulated NP carbon samples and characterized the density dependence of bulk, shear and Young's moduli. Six independent random seeds for each mass density were used to obtain good statistics for the elastic moduli of NP carbon. To assess the degree of isotropy in our samples, we computed the triclinic stiffness tensor of the material, which has 36 independent components, and then performed an isotropic projection. This projection is carried out according to the Moakher and Norris (MN) method 54 with rotation optimization, as implemented in the Mattpy code. [55][56][57] In addition to the ability to carry out a projection, the MN method provides a quantitative measure of the similarity between the original tensor and its projection based on Euclidean distances. In this way, we can establish the degree to which finite-size effects, on the one hand, and anisotropy, on the other, are present in our samples. Finite-size effects are a consequence of the limited number of atoms in the simulation, whereas elastic anisotropy in graphitic carbons car arise if there is a preferential orientation of the graphitic planes, as occurs in crystalline graphite.
The relations between the elastic constants that are fulfilled by isotropic materials are: These relations are only valid for threedimensional isotropy. When a material displays only in-plane isotropy, only the shear, axial and biaxial elastic constants along that particular plane satisfy this relation. For instance, in crystalline graphite the elastic isotropy relations for c-plane isotropy are expressed by C 11 = C 22 and C 66 = (C 11 − C 12 )/2. The bulk (B), shear (G) and Young's (E) elastic moduli, which are more commonly reported in the materials and engineering literature than the C ij , are given for isotropic materials as a function of the elastic constants as follows: B = (C 11 + 2C 12 )/3; Normalized Euclidean distance between Cij and C proj ij Figure 10: Two examples for 0.5 g/cm 3 (a) and 1.7 g/cm 3 (b) samples for the calculation of C 11 , given by the relation between the stress along x, σ 1 , and the strain along the same axis, 1 . (c) Isotropic projection (which gives C 11 , C 12 and C 44 ) of the 36-component C ij tensor computed for all computational NP samples. (d) Normalized distance between the triclinic stiffness tensor of the samples and the isotropic (iso), cubic (cub), hexagonal (hex) and orthorhombic (ort) projections; lines connect the data corresponding to the same sample. and where the compliance matrix S is obtained by inverting the 6 × 6 matrix of elastic constants C.
Because of the complexity of these materials the calculation of the stiffness tensor is not straightforward. We proceed in the following way. First, the 300 K structures are further quenched down to close to 0 K to avoid the presence of thermal noise when computing the stress tensor. Second, we slowly deform the simulation box by applying a total of ±0.5 % Figure 11: Average Young's (E), bulk (B) and shear (G) moduli as a function of density for NP carbons. Standard deviations are given over six random seeds.
strains over 1 ps in each direction for all six independent strains, j = 1 , ..., 6 . This is followed by an energy minimization with a gradient descent algorithm at the strain end points. We monitor the six independent components of the stress tensor σ i = σ 1 , ..., σ 6 as a function of these deformations. This allows us to compute the elastic constants from the relation σ i = j C ij j by assuming a linear evolution as given by the end points. Two examples of these calculations are given in Fig. 10 (a,b). As can be observed, there is a delayed elastic response in time, indicated by the non-linear dependence of the stress on the strain. The vertical lines at the end points show how the stress is relaxed at fixed strain during the energy minimization step. The thick straight lines connect the end points of the stress-strain curve, and give the time-independent elastic response of the material. In the limit of an infinitely slow deformation the non-linear curves will approximately converge towards the straight lines. Third, to estimate the isotropic elastic constants, we carry out an isotropic projection of the triclinic stiffness tensor by minimizing the Euclidean distance between the two, constrained to the symmetry of the isotropic stiffness tensor, 54-56 which yields the following ap- Note that a naive averaging of C 11 , C 12 and C 44 does not generally produce a tensor with the correct symmetry. All the results are shown in Fig. 10 (c). Fourth and last, we compute the Euclidean distance between the triclinic tensor and tensors of isotropic, cubic, hexagonal and orthorhombic symmetry (listed in decreasing order of symmetry), with results shown in Fig. 10 (d). This approach allows us to establish the best candidate for the underlying symmetry of each sample. From Fig. 10 (c), we can observe the smooth evolution of the elastic constants as a function of mass density, and also appreciate the degree of variability across samples with the same density, which is small (circa 10 %) thanks to the large number of atoms included in these calcu-lations. Figure 11 shows the average computed for each density, presented in this case in the form of elastic moduli. All of these values are also summarized in Table 3.
Going back to Fig. 10 (d) we observe the general expected trend that as the symmetry of the projection is lowered, the original triclinic tensor can be better accommodated (the distance between C ij and C proj ij decreases). These results also tell us that there is, loosely speaking, between 5 % (one of the samples at 1.3 g/cm 3 ) and 23 % (one of the samples at 1.7 g/cm 3 ) discrepancy between the triclinic tensor and its isotropic projection. Some valuable information can be extracted by comparing the hexagonal projection to both cubic (or even isotropic) and orthorhombic. If the hexagonal projection is very close to the orthorhombic projection and they are both relatively far away from the cubic one, there is indication of an underlying hexagonal elastic response. As we have discussed earlier, the most characteristic feature in terms of elasticity of hegaxonal lattices is the existence of an isotropy plane. In other words, these NP samples would show preferential stacking of graphitic planes along a particular axis. We can see in Fig. 10 (d) that at low densities the transition from cubic to hexagonal to orthorhombic projections is smooth, whereas at higher densities some of the samples show an abrupt transition. This effect can be further quantified by computing the ratio C hex 11 /C hex 33 , where, as this quantity deviates from unity (usually towards larger values), the existence of a preferential direction is further emphasized. This effect is quantified in Table 3, where the highest density samples show C hex 11 approximately 40 % larger than C hex 33 . This is indicative of a small degree of "graphite-likeness" in these samples, which is expected to increase further with density. We note, however, that for our computational NP carbon samples this ratio is still much smaller (by two orders of magnitude) than that of crystalline graphite. In addition, the effect is likely due to finite-size effects, and will probably become even less pronounced for bigger systems. We therefore conclude that these computational samples are for most practical purposes elastically isotropic.

Conclusions
In summary, we have carried out a comprehensive computational study of the atomistic structure, pore-size distribution and elastic properties of NP carbons in the mass density range from 0.5 g/cm 3 to 1.7 g/cm 3 . Realistic simulations of these NP carbon structures, including the modeling of nanopore sizes up to 40Å in diameter, have been made possible through the combination of large simulation cells containing 131,072 atoms each, and the utilization of an accurate and fast GAP force field for carbon including van der Waals interactions. The new GAP potential and the generated NP carbon structures are freely available to the community. 30,58 The structural analysis reveals that at all densities studied these NP carbons are made up mostly of graphitic planes, with circa 97 % of all atoms being sp 2 bonded. Graphitic planes are interlinked via sp 3 motifs, whose concentration increases twofold from 0.5 g/cm 3 (0.5 %) to 1.7 g/cm 3 (1.1 %). Some of the graphitic planes are terminated with edges involving the presence of sp motifs, whose concentration follows the opposite trend to that of sp 3 motifs (from 3 % at 0.5 g/cm 3 down to 1.4 % at 1.7 g/cm 3 ). The samples are highly graphitized, with a topology clearly dominated by 6-membered rings, followed in smaller quantities by 5-and 7-membered rings, in that order. By comparison, the presence of smaller or larger rings is negligible, with the next most common ring motif, 8-membered rings, being more that two orders of magnitude rarer than the 6-membered rings.
The radial distribution functions for all densities show well-defined peaks for first, second and third nearest neighbors corresponding to those of graphite, with the most remarkable feature being a sizable broadening of the secondneighbors peak. This corresponds to the presence of 5-membered rings in the samples. The angular distribution function correspondingly shows a bimodal distribution, with the largest peak centered around 120 • (hexagonal motifs) and a second smaller peak centered around 108 • (pentagonal motifs). A small tail also appears for angles larger than 120 • (heptagons and beyond). The calculated XRD patterns show the expected trends with density and offer a direct route for comparison with experiment.
The pore-size analysis reveals bell-shaped distributions whose center and width shift towards larger values as the density decreases. Typical nanopore diameters increase from 5Å at 1.7 g/cm 3 up to 27Å at 0.5 g/cm 3 . Interestingly, small nanopores are very rare in the lowdensity NP carbon samples. This is indicative of homogeneous nanopore size distributions for a sample of a given density.
Finally, our analysis of elasticity showed that the samples behave mostly isotropically from a mechanical perspective, with the degree of elastic isotropy slowly decreasing with density. The elastic moduli of the material increase monotonically with density, and the Young's modulus increases more rapidly than bulk or shear moduli. The elastic response of these NP carbons increased by roughly an order of magnitude from 0.5 g/cm 3 to 1.7 g/cm 3 , with the elastic constants of the latter still more than an order of magnitude smaller than those of diamond.
We hope that the detailed insight into the atomic structure of NP carbon materials provided in this work will inspire and guide future experimental efforts, especially for energystorage applications. We also expect that the library of atomic structures provided to the research community will serve as a starting point for future computational studies on the elec-tronic, structural and mechanical properties of NP carbon.