Topological Learning for the Classification of Disorder: An Application to the Design of Metasurfaces

Structural disorder can improve the optical properties of metasurfaces, whether it emerges from large-scale fabrication methods or is explicitly designed and built lithographically. For example, correlated disorder, induced by a minimum inter-nanostructure distance or by hyperuniformity properties, is particularly beneficial for light extraction. Inspired by topology, we introduce numerical descriptors that provide quantitative measures of disorder with universal properties, suitable for treating both uncorrelated and correlated disorder at all length scales. The accuracy of these topological descriptors is illustrated both theoretically and experimentally by using them to design plasmonic metasurfaces with controlled disorder, which we then correlate to the strength of their surface lattice resonances. These descriptors are an example of topological tools that can be used for the fast and accurate design of disordered structures, or as an aid in improving their fabrication methods.

wavefront shaping, 24,25 improve light absorption, 26,27 e.g. for solar cells, 12 or light extraction. 5,28 For example, coating the air-LED interface with disordered nanostructures provides a broadband coupling between what would otherwise be internally trapped photons and the external radiation, making LEDs more energy efficient. 2 In some of these fields, correlated disorder seems to be particularly important. Indeed, a correlation length, induced either by a minimum distance between the nanostructures or by stealthy hyperuniformity properties, helps to create metasurfaces with broader absorption bands, 26 broader diffusive properties, 19 or reduced light trapping between nanostructures for more efficient light extraction. 28

The different applications of disordered metasurfaces have led to more recent efforts to tailor disorder for specific desired optical properties, 29-32 for example using inverse design methods 33,34 based on machine learning 35,36 or topology optimisation. 37,38 Indeed, by combining disorder engineering and topology optimisation, one can build metasurfaces with selective light polarisation conversion while minimising the in-plane phase fluctuation. 39 While these methods directly generate optimised disordered patterns, they can be time consuming and computationally expensive to implement. In some cases, knowing the link between disorder and the optical properties of a metasurface could significantly speed up the design process by restricting the optimisation to the degree of disorder of the metasurface. However, although many methods exist to quantify disorder, each has its strengths and weaknesses and is only relevant for specific applications. 32

In this work, we present new topology-inspired numerical tools suitable for the characterisation of disordered metasurfaces. Their universality makes them useful for characterising both correlated and uncorrelated disorder; they can be used either to characterise disordered metasurfaces built with techniques similar to those mentioned above, or for the fast and accurate design of metasurfaces with specific disorder levels. We demonstrate the relevance of these tools by designing and fabricating metasurfaces made of plasmonic nanostructures embedded in dielectric media, whose structural correlated disorder is related to the strength of their surface lattice resonances (SLRs).

Results and discussion
We first present a generalised model of disorder generation. We show that a large correlation length may lead to ambiguous designs, where the degree of disorder is poorly represented by the generative/statistical parameters, hinting at the need for better disorder descriptors. We then introduce the field of Topological Data Analysis (TDA) and the tools required to characterise metasurfaces. Using them, we show that correlated disordered metasurfaces are poorly represented by their generative parameters but suitably described by these topological descriptors. We then demonstrate the characterisation accuracy and predictive properties of these tools by designing metasurfaces with specific disorder levels, first theoretically and then experimentally.

Models of correlated and uncorrelated disorder
with $r_{ij}$ the distance between nanostructures $i$ and $j$, including the uncorrelated disorder applied to $\vec{r}_i$ and $\vec{r}_j$. The correlation length is given by the full width at half maximum of $C_{ij}$, which is equal to $2\sqrt{2\ln 2}\,L_c P$ and is therefore proportional to the non-dimensional parameter $L_c$. The total disorder perturbation is summarised in expression (2). Using expression (2), we generated lattices with varying values of $L_c$ and $S_d$, see figure 1.
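The quoted full width at half maximum is what one obtains for a Gaussian correlation function. As a sketch only, and assuming (the explicit form is not reproduced in the text above) that $C_{ij}$ is a Gaussian in $r_{ij}$ with a standard deviation written here as the hypothetical symbol $\sigma$, the half-maximum condition gives:

```latex
% Assumption: Gaussian correlation function of standard deviation \sigma
% (\sigma and r_{1/2} are illustrative symbols, not taken from the text).
\begin{aligned}
C(r) &= \exp\!\left(-\frac{r^{2}}{2\sigma^{2}}\right), \\
C(r_{1/2}) = \tfrac{1}{2} \;&\Rightarrow\; r_{1/2} = \sigma\sqrt{2\ln 2}, \\
\mathrm{FWHM} = 2\,r_{1/2} &= 2\sqrt{2\ln 2}\,\sigma .
\end{aligned}
```

Under that assumption, matching the result to $2\sqrt{2\ln 2}\,L_c P$ would correspond to $\sigma = L_c P$.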
In the left column of figure 1, one can visually appreciate that the strength of uncorrelated disorder, for $L_c = 0$, is well represented by $S_d$. Adding a correlation length, as in the middle and right columns of figure 1, makes the lattices' distortion smoother. However, although one can still guess that the bottom lattice of the middle column is more disordered than the top one, as they were generated with $S_d = 0.4$ and $S_d = 0.2$ respectively, this is not systematically the case. Indeed, the lattices in the right column were generated with the same parameters $L_c$ and $S_d$ as those in the middle column, yet the relative disorder strength of these two lattices is ambiguous.
While a correlation length makes the generative parameter $S_d$ a less accurate representation of the positional disorder of a lattice, it also destroys the information about the original regular lattice by inducing collective movements of the lattice's points. This makes a statistical description of correlated disordered lattices much harder, as there is no reference lattice to compare them to. To circumvent these constraints, we introduce topology-inspired numerical tools that allow lattices to be compared with each other in a way that highlights the influence of a correlation length and provides a more accurate measure of disorder than $S_d$. Topological Data Analysis (TDA) is a collection of tools inspired by topology and geometry, designed to provide qualitative and quantitative descriptors of structures in datasets.

Topological characterisation of disorder
3][44][45][46] One of its sub-fields, persistent homology, is particularly efficient at recovering the scale of topological features. We provide here a brief description of the procedure. Interested readers can find detailed introductory notes, 47,48 and the whole process can be executed using standard libraries such as GUDHI 49 or Ripser. 50 Starting from a point cloud such as that in panel a of figure 1, we build a collection of topological spaces called Rips simplicial complexes, indexed by a real number $r$. For a given value of $r$, the complex is constructed as follows. A ball of radius $r$ is drawn around each point of the point cloud. If two balls intersect, we add a link between their respective centres. Similarly, higher order links are added to the complex upon the intersection of three or more balls. Restricting ourselves to two dimensions, which is the relevant case for flat metasurfaces, we draw circles of radius $r$ around each point and only consider connections between pairs and triplets of points. The topological properties of each simplicial complex, namely the number of connected components and the number of loops in two dimensions, can be directly computed using algebraic topology. Tracking the evolution of these topological features for different values of $r$ provides useful insight into their scale. These features can be summarised in a persistence diagram in which each feature, indexed by the integer $i$, is represented by two coordinates, its "birth" $b_i$ and its "death" $d_i$, which are the values of $2r$ at which it appears and disappears.
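For the connected components alone, the construction can be illustrated without a dedicated library. The following is a minimal sketch, not the Ripser or GUDHI computation: it swaps in a union-find (Kruskal) pass over the sorted pairwise distances, which is enough for degree 0 because two circles of radius $r$ intersect when their centres are at most $2r$ apart. Loops (degree 1) are not handled and would need a full persistent homology library. The function name `h0_persistence` and the four-point square input are illustrative assumptions.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the finite (birth, death) pairs of the degree-0 features.

    Every component is born at 0; a death is the centre-to-centre distance
    (the value of 2r) at which two components merge.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this link merges two components
            parent[root_i] = root_j
            pairs.append((0.0, length))
    # The last surviving component never dies and is left out.
    return pairs


# Illustrative input: four points on a square of side 500.
square = [(0, 0), (500, 0), (500, 500), (0, 500)]
diagram_h0 = h0_persistence(square)
```

For the square, three components die when the circle diameter reaches the side length, so `diagram_h0` holds three pairs with death 500.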
For example, if we consider a simple point cloud such as the square of side 500 in figure 2a, we see the birth of a loop when the diameter of the circles is equal to the side length of the square, figure 2b. This loop dies when the diameter is equal to the diagonal of the square, figure 2c, and the birth and death of this loop are represented as the point at coordinates (500, 707) in the persistence diagram, figure 2d.

The second row of figure 2 illustrates how TDA can be used for the clustering analysis of point clouds. The first step is to measure the "distance" between point clouds (figure 2e). We do this by measuring the distance between their corresponding persistence diagrams. Several metrics can be defined over the space of persistence diagrams, and we have chosen the Wasserstein distance 51 for its simplicity of use (figure 2f and 2g). If several point clouds are considered, one can build a geometrical embedding, for example via classical multidimensional scaling, 52 in which each point cloud is represented as one point and the distance between two points is given by the distance between their respective persistence diagrams (figure 2h). This provides a visual representation of the configuration space of the different point clouds and can be used to detect clustering. For example, we considered two sets of four point clouds made of either triangles or squares, such as represented in figure 2e. Upon computing their persistence diagrams, figure 2f, and the distance matrix between them, figure 2g, we observe that a square is more similar, or closer, to the other squares than to the triangles. This can be directly visualised in their embedding in figure 2h, where we observe two clusters corresponding to the sets of squares and triangles.
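To make the notion of a distance between persistence diagrams concrete, here is a minimal sketch of a Wasserstein distance for very small diagrams. It is not the GUDHI routine: it brute-forces every matching, and the order (1) and ground metric (L-infinity) are assumptions, since the text does not state which were used. Points may be matched to the diagonal $b = d$, which is what allows diagrams with different numbers of points to be compared. The function name `wasserstein_1` is hypothetical.

```python
from itertools import permutations


def wasserstein_1(diagram_a, diagram_b):
    """Order-1 Wasserstein distance between two small persistence diagrams.

    Assumptions: L-infinity ground metric, unmatched points are sent to the
    diagonal, and all matchings are enumerated (a handful of points only).
    """
    def to_diagonal(point):
        # L-infinity distance from (b, d) to the diagonal b = d.
        return (point[1] - point[0]) / 2.0

    def cost(p, q):
        if p is None and q is None:
            return 0.0  # diagonal matched with diagonal
        if p is None:
            return to_diagonal(q)
        if q is None:
            return to_diagonal(p)
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    # Pad each diagram with one diagonal slot (None) per point of the other.
    left = list(diagram_a) + [None] * len(diagram_b)
    right = list(diagram_b) + [None] * len(diagram_a)
    return min(
        sum(cost(p, q) for p, q in zip(left, perm))
        for perm in permutations(right)
    )


# Two illustrative one-point diagrams (values chosen for the example only).
loop_a = [(500.0, 707.0)]
loop_b = [(500.0, 600.0)]
distance = wasserstein_1(loop_a, loop_b)
```

Filling a matrix with such pairwise distances gives the input of the embedding step. For diagrams with hundreds of points an optimal-transport solver is needed instead of the enumeration used here.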
In order to visualise the space of configurations obtained from the definition of correlated disordered lattices in (2), we performed the embedding of 1203 lattices generated for three different lattice periods, 500, 600 and 700 nm, and five different values of $S_d \in [0, 0.4]$. We repeated this for uncorrelated disorder, $L_c = 0$, weakly correlated disorder, $L_c = 2$, and strongly correlated disorder, $L_c = 8$ (panels a, b and c of figure 3). We see in figure 3a an unambiguous clustering of the uncorrelated disorder lattices generated with the same parameters, i.e. $S_d$ and the original lattice period. For a fixed value of the period, the lattices appear to live on a simple curve on which we see five separate clusters of points corresponding to the five values of $S_d$ considered. As can be seen in figure 3b, a correlation length increases the size of each cluster, allowing them to overlap. For a large correlation length, as in figure 3c, each cluster is so large that they form one common cluster across the three different periodicities and the different values of $S_d$. This is in agreement with what we presented in the previous section.
A non-zero correlation length allows the lattices to explore a broader configuration space. Therefore, one can generate lattices that differ more from each other despite using the same parameters, and the overall size of the clusters, related to the maximum distance between the persistence diagrams of the generated lattices, is larger. This expansion of the lattices' configuration space leads to situations where two lattices generated with different parameters may actually be very similar to each other, which is represented by the overlap between the clusters in figure 3b. Eventually, a large enough correlation length makes the clusters big enough to render the generative parameters obsolete, which is what we observe in figure 3c. Indeed, even if one can see some general trend between the overall position of the lattices and the value of $S_d$, one cannot accurately recover the value of $S_d$ from the position of a lattice. This can lead to situations where a lattice generated with a high amount of disorder, i.e. a large value of $S_d$, may be as ordered as, or more ordered than, a lattice generated with a small amount of disorder, as represented in the right column of figure 1.
Using TDA, we were able to show the limitation of the parameter $S_d$ for characterising generated lattices. One can build several metrics to describe persistence diagrams, which can be used as simpler descriptors of the topology of datasets, or as inputs of more refined machine learning based models. 53,54 In this work, we use two statistical descriptors based on the lattices' persistence diagrams in order to describe both the typical distance between the points of the lattices and their positional disorder. The first numerical descriptor is the normalised structural heterogeneity of degree 0 ($nSH_0$), which is the sum of the lifetimes of the topological features of degree 0, the connected components, 55 divided by the number of points of the lattice, $N$:

$$nSH_0 = \frac{1}{N} \sum_{(b,d) \in H_0(D)} (d - b),$$

with $b$ and $d$ the birth and death of each topological feature of degree 0, $H_0$, of the persistence diagram $D$. As the death of the topological features of degree 0 is proportional to the distance between the points of the lattices, $nSH_0$ can be directly related to the average nearest neighbour distance between the nanostructures. If we colour the embeddings of the uncorrelated and strongly correlated, $L_c = 8$, lattices of figure 3 according to the value of $nSH_0$ of each lattice, we see in figure 3d that this quantity almost perfectly recovers the periodicity of the lattice for uncorrelated disorder, which confirms our interpretation of the topological features of degree 0. When applied to strongly correlated disordered lattices, figure 3e, $nSH_0$ provides a smooth ordering of the lattices, following a trend similar to that of uncorrelated disordered lattices.
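As a small worked instance of this descriptor, the sketch below evaluates $nSH_0$ for the four-point square of side 500 used earlier as an example. It assumes that only the finite degree-0 pairs enter the sum, i.e. that the one component that never dies is excluded; how that component is treated in the original computation is not stated here. The function name `nsh0` is hypothetical.

```python
def nsh0(h0_pairs, n_points):
    """Sum of the lifetimes (d - b) of degree-0 features, divided by N."""
    return sum(death - birth for birth, death in h0_pairs) / n_points


# Square of side 500: three components die at 2r = 500 (finite pairs only,
# an assumption about how the surviving component is handled).
square_h0 = [(0.0, 500.0), (0.0, 500.0), (0.0, 500.0)]
value = nsh0(square_h0, n_points=4)

# Doubling every distance doubles the descriptor, as expected for a length.
doubled = nsh0([(b, 2 * d) for b, d in square_h0], n_points=4)
```

Here `value` is 3 × 500 / 4 = 375, of the order of the nearest neighbour distance, and it scales linearly with the size of the point cloud.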
We also introduce a new descriptor that we call Topological Disorder ($TD$), inspired by the persistent entropy ($PE$). 56-58 $PE$ is defined as

$$PE = -\sum_{(b,d) \in D} \frac{d-b}{\Lambda} \log\frac{d-b}{\Lambda}, \qquad \Lambda = \sum_{(b,d) \in D} (d-b).$$

$PE$ is maximal for $d - b = l$ (constant) $\forall (b,d) \in D$, and is then equal to $\log \Omega$, with $\Omega$ the total number of topological features in $D$. Therefore, $PE$ is maximal for regular, periodic lattices and measures how ordered lattices are. In order to avoid the counter-intuitive association of a highly ordered lattice with a high persistent entropy, and to define a measure of disorder independent of the lattice's size, which modifies the number of topological features $\Omega$, we define $TD$ with the computation split over the degrees $i$ of the topological features, in order to capture the fundamental differences between topological features of different degrees. Indeed, one can see in figure 2d that, despite the regularity of the dataset in figure 2a, the topological features in the persistence diagram are located in different places, which would artificially increase the value of $TD$ if the degrees were not treated separately.
While the example in figure 2a is simple, this remains the case for ordered lattices. By construction, $TD$ is invariant under a rescaling of the typical length of the lattices, making it an orthogonal descriptor of the lattices with respect to $nSH_0$. $TD$ is also minimal, and equal to 0, for ordered lattices, and is independent of the number of points of the lattices. Therefore, it can be used as a universal measure of disorder, not only for point clouds perturbed from different periodic lattice arrays, but also for point clouds without any inherent order, such as in self-assembled systems. If we colour the embeddings of the uncorrelated and strongly correlated lattices of figure 3 according to their $TD$, we see in figure 3f that $TD$ perfectly recovers the strength of the uncorrelated disorder, regardless of the lattices' periodicity, which confirms that $TD$ is indeed a measure of the lattices' disorder. When applied to strongly correlated disordered lattices, figure 3g, $TD$ provides another smooth ordering of the lattices, orthogonal to the one given by $nSH_0$.
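The properties listed above can be checked numerically on the persistent entropy itself. The sketch below computes the standard persistent entropy of a list of (birth, death) pairs and, purely as an illustration, a normalised quantity `entropy_deficit` equal to $1 - PE/\log\Omega$. That second function is a hypothetical stand-in chosen because it vanishes for equal lifetimes and is unchanged by rescaling; it is an assumption and should not be read as the definition of $TD$ used in this work.

```python
import math


def persistent_entropy(pairs):
    """PE = -sum p log p, with p the lifetime divided by the total lifetime."""
    lifetimes = [death - birth for birth, death in pairs if death > birth]
    total = sum(lifetimes)
    return -sum((life / total) * math.log(life / total) for life in lifetimes)


def entropy_deficit(pairs):
    """Hypothetical normalisation 1 - PE / log(Omega); NOT the paper's TD."""
    omega = sum(1 for birth, death in pairs if death > birth)
    if omega < 2:
        return 0.0  # log(1) = 0, nothing to normalise
    return 1.0 - persistent_entropy(pairs) / math.log(omega)


# Equal lifetimes (ordered case) versus unequal lifetimes (illustrative values).
regular = [(0.0, 500.0)] * 3
perturbed = [(0.0, 350.0), (0.0, 500.0), (0.0, 650.0)]
rescaled = [(b, 2 * d) for b, d in perturbed]

pe_regular = persistent_entropy(regular)
deficit_regular = entropy_deficit(regular)
deficit_perturbed = entropy_deficit(perturbed)
deficit_rescaled = entropy_deficit(rescaled)
```

With equal lifetimes `pe_regular` reaches $\log 3$ and the deficit is zero; unequal lifetimes give a positive deficit that does not change when all distances are doubled.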
These observations suggest that $TD$ and $nSH_0$ are two topologically inspired descriptors that can be used to quantify, respectively, the positional disorder and the typical distance between the points of a dataset. Being, by construction, independent of any reference dataset, these tools are suitable for classifying datasets that are not easily described using classical statistical methods, such as correlated disorder point clouds or self-assembled systems.

Tailored metasurface design, fabrication and spectroscopy
We demonstrate the accuracy of $TD$ by using it to design, and subsequently build, plasmonic metasurfaces with a specific degree of disorder, which we relate to the strength of their SLRs. We first investigate the link between $TD$ and the strength of the SLRs theoretically, using the discrete dipole approximation. 59 We randomly generated lattices of 25×25 points with $L_c = 8$ and $S_d = 0.3$, starting from a square lattice of period 500 nm, where each point represents the position of a plasmonic nanostructure. Filtering the point clouds using $nSH_0$, we restrict ourselves to metasurfaces of similar nearest neighbour distance. From these point clouds, we pick those with the highest, lowest and median value of $TD$ (figure 4a, b and c). We consider each nanostructure to be a gold nanocylinder of height 50 nm and diameter 120 nm whose optical properties, under the dipole approximation, are fully determined by its polarisability. The gold nanocylinders are assumed to be embedded in a homogeneous glass-like dielectric layer of refractive index 1.41. We numerically compute the reflectance of the three metasurfaces under illumination by a circularly polarised plane wave at normal incidence, figure 4d. As predicted, the higher the topological disorder, the weaker the SLRs are. Indeed, we can see in figure 4d that the amplitude of the SLR dip decreases as $TD$ increases. Similarly, the quality factors of these resonances decrease with $TD$: they are 8.2, 7.5 and 6.5 for the lowest, median and highest $TD$ respectively.

Experimental verification of the $TD$-SLR link
We additionally confirmed the link between $TD$ and the strength of SLRs experimentally, by designing metasurfaces built using Focused Ion Beam (FIB) lithography. Using three different correlation lengths $L_c \in \{6, 8, 10\}$ and starting from a regular square lattice of period 500 nm, we generated several hundred lattices for two values of $S_d$: 0.2 and 0.4. For each value of $L_c$, we selected two lattices to compare with each other: the one with the highest value of $TD$ among those generated with $S_d = 0.2$ and the one with the lowest value of $TD$ among those generated with $S_d = 0.4$. As in the previous section, we used $nSH_0$ to select lattices of similar nearest neighbour distances. We built two sets of seven metasurfaces: three pairs, one for each value of $L_c$, and one reference square lattice of period 500 nm. The two sets only differ in the size of the nanostructures, which in both cases were elongated, 50 nm thick gold nanodisks. The top nanodisk cross-sections are elliptical, with x- and y-axes of size (160, 180) nm and (120, 140) nm for the first and second set respectively. The resonant wavelength of the SLRs depends both on the distance between the nanostructures and on their polarisability. The latter is strongly affected by the shape of the nanostructures, and their anisotropy induces a shift of the SLR wavelength of up to 60 nm depending on the polarisation of the exciting light. We therefore report the optical properties of the metasurfaces excited under normal incidence light for two linear polarisations: along the y-direction, parallel to the nanostructures' long axis, and along the x-direction, perpendicular to the nanostructures' long axis. SEM images of the first set, as well as their transmittance spectra compared to that of the square lattice, are in figure 5. The results for the second set of metasurfaces, the comparison of these experimental results to the dipolar model and close-up SEM images are in the supplementary information.
The three columns of figure 5 contain, for each $L_c$, the SEM images of the designed pair of metasurfaces (first row) and their transmittance spectra upon excitation by light polarised parallel to the nanostructures' long axis (second row) and perpendicular to it (third row). The transmittance spectrum of a periodic metasurface with the same pitch is added for comparison (black lines). We report in table 1 the quality factors of all the SLRs shown in figure 5, as well as the $TD$ of the corresponding metasurfaces.
As can be seen in figure 5 and table 1, in five configurations out of six, the SLRs of the metasurfaces designed with a high $S_d$ but a low $TD$ are stronger and have a larger quality factor than those of the metasurfaces designed with a low $S_d$ but a high $TD$. This demonstrates that $TD$ is a more accurate measure of the positional disorder of these metasurfaces than $S_d$: in all of the cases reported here, the metasurface that should have been the most ordered, generated with the lowest value of $S_d$, is actually at least as disordered as the metasurface that should have been the most disordered, generated with the highest value of $S_d$. Indeed, while a correlation length, induced by $L_c \neq 0$, made $S_d$ a more ambiguous descriptor of the disorder of the metasurfaces, $TD$ was able to accurately select lattices of chosen disorder, which we experimentally probed via the quality factor of their SLRs, despite the non-unique relationship between $TD$ and the SLR quality factors.

Conclusion
We have shown how Topological Data Analysis and persistent homology can be used to classify both correlated and uncorrelated disordered metasurfaces via their topological disorder. In particular, we showed that for correlated disorder, topological disorder is a significantly more accurate measure of disorder than generative probabilistic parameters. We proved,

Experimental / Method
The metasurfaces have a lateral size of approximately 12 × 12 µm and were fabricated in a 50 nm thick Au film coating a glass substrate, using a focused ion beam facility, Helios Nanolab 600 from FEI ThermoFisher Scientific. The metasurfaces were then spin-coated with IC1-200, whose refractive index is similar to that of the glass substrate.
The spectral characterisation was performed in transmittance at normal incidence using a microspectrophotometer (CRAIC Technologies) equipped with a tungsten-halogen light source and cooled CCD array.
The persistent homology of all lattices was computed using the Ripser python package. 50 The computation for each lattice, made of 625 nanostructures, was done in a fraction of a second. The distances between the lattices' persistence diagrams considered for figure 3 were computed using the Wasserstein distance from the GUDHI python package. 49 Embeddings were obtained from the distance matrices by using classical multidimensional scaling. We projected the embeddings in two dimensions for the visual representations in figure 3. In general, such embeddings live in a very high dimensional, not necessarily Euclidean, space, and a projection to a two-dimensional flat space can lead to distortions. However, the magnitude of these distortions can be estimated in classical multidimensional scaling by considering the relative absolute value of the eigenvalues of the embedding in each dimension. 60 For the embedding represented in figure 3, the eigenvalues of the two largest dimensions, used to represent the embedding in 2D, are respectively 278 and 40 times larger than the largest negative eigenvalue, showing that an embedding in a Euclidean space is a good approximation. Similarly, the eigenvalues of the two largest dimensions are respectively 22 and 3 times larger than the third largest positive eigenvalue, suggesting that a projection in 2D is an accurate visual representation of the embedding.
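The eigenvalue check described above can be sketched on a toy distance matrix. The code below double-centres the squared distances, as classical multidimensional scaling does, and inspects the spectrum: negative eigenvalues signal that the distances cannot be realised exactly in a Euclidean space. It is a dependency-free illustration, with a small Jacobi eigenvalue routine standing in for a library call such as `numpy.linalg.eigh`; the function names and the 3 × 3 matrix are assumptions made for the example, not data from this work.

```python
import math


def double_centre(dist):
    """Gram matrix B = -1/2 J D^2 J of classical multidimensional scaling."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # D is symmetric, so the column means equal the row means.
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


def symmetric_eigenvalues(matrix, sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], kept within 45 degrees.
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Toy distances that violate the triangle inequality (5 > 1 + 1).
non_euclidean = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]
eigenvalues = symmetric_eigenvalues(double_centre(non_euclidean))
ratio = eigenvalues[0] / abs(eigenvalues[-1])
```

In this toy case the spectrum contains one negative eigenvalue, and `ratio` compares the leading positive eigenvalue with it, in the same spirit as the factors quoted above for figure 3.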
The numerical simulations of the metasurfaces' optical properties were done using the discrete dipole approximation, 59 where each nanostructure is modelled as a dipole of the same polarisability. We assumed that the nanostructures were located in a homogeneous dielectric medium of refractive index $n = 1.41$, which is a good approximation of the refractive index of the glass substrate and of the IC1 layer. The reflectance was obtained by computing the electromagnetic flux in the direction perpendicular to the surface, assuming a numerical aperture of 0.28 to match the experimental setup. The nanostructures' polarisability was computed by simulating the optical response of an isolated nanostructure upon excitation by plane waves of different polarisations, 17 which we performed using the electromagnetic waves, frequency domain interface of the optics module of COMSOL 5.6, solved with a direct solver. 61

Figure 1: Examples of generated disordered lattices. The top row corresponds to lattices generated with $S_d = 0.2$ and the bottom row corresponds to lattices generated with $S_d = 0.4$. The left column represents uncorrelated disordered lattices, $L_c = 0$, while the middle and right columns represent correlated disordered lattices, $L_c = 6$.

Figure 2: Examples of the two key TDA processes used in this paper. The top row represents the computation of persistent homology, from the dataset (a) to its representation in a persistence diagram (d). Panels (b) and (c) show the circles whose diameters correspond respectively to the birth and death of the loop of this dataset (single $H_1$ point in the persistence diagram in panel d). The bottom row represents the computation of the embedding of datasets (e) in a two-dimensional space (h) via the computation of their persistence diagrams (f) and the distance between them (g). Datasets of the same type are clustered in the embedding space (panel h).

In the persistence diagram of figure 2d, the loop is represented by the point at coordinates (500, 707). Additionally, four connected components are born at $r = 0$, and three of them die when the circles intersect in figure 2b, such that only one remains. Therefore, three (overlapping) points at coordinates (0, 500) are represented in figure 2d. The last connected component remains for $r \to \infty$. As we stopped the computation of persistent homology at $r = 400$, we assign to this point the coordinates (0, 800).

Figure 3: Scatter plots of the two-dimensional embedding of three sets of generated lattices with uncorrelated (a), weakly correlated (b) and strongly correlated (c) disorder. Each set was generated from an original square lattice of period 500, 600 and 700 nm (left to right in panel a) and with $S_d \in [0, 0.4]$. In the absence of correlation, lattices with different values of the period and of $S_d$ are well clustered. In the inset of panel a we adapted the size of the points to illustrate how clustered the lattices are. The clustering is lost in the presence of correlations, panels b and c. Panels d and f, and panels e and g, are equivalent to panels a and c respectively, with colour coding based on the value of $nSH_0$ in panels d and e and of $TD$ in panels f and g. In both cases the colour gradient is not significantly affected by correlation.

Figure 4: Theoretical investigation of the correlation between $TD$ and the strength of SLRs. Panels a, b and c represent respectively the generated metasurfaces of lowest, median and highest $TD$. Their computed reflectance spectra, in arbitrary units, are represented in panel d.

Figure 5: SEM images of the experimental samples (top row) and their transmittance spectra under normal incidence light linearly polarised parallel (middle row) or perpendicular (bottom row) to the long axis of the nanostructures. Each plot displays the spectra of a low and a high $TD$ metasurface, dashed pink and solid red respectively, and of an ordered metasurface with the same pitch (black). The columns correspond, from left to right, to the metasurfaces generated with $L_c = 6$, 8 and 10.

T
figure 6. We see a decreasing trend of the quality factor in terms of $TD$ despite outliers, such

Figure 6: Graph of the quality factors of the SLRs reported in figure 5 in terms of the $TD$ of the metasurfaces under normal incidence light linearly polarised parallel (green dots) and perpendicular (red crosses) to the long axis of the nanostructures. Two lines are added to represent the trend of the quality factors in terms of $TD$ for the parallel polarisation, in green, and the perpendicular polarisation, in red.
theoretically and experimentally, this accuracy by correlating topological disorder to the strength of the surface lattice resonances of metasurfaces made of plasmonic nanostructures, despite the global definition of topological disorder being sensitive to large-scale distortion, while surface lattice resonances are not. We argue that the universality, accuracy and computational speed of topological disorder make it an advantageous tool to characterise and tune the fabrication methods of self-assembled disordered metasurfaces, as well as to help design metasurfaces with a specific degree of disorder, for example to enhance light extraction for more efficient LEDs or light absorption for improved solar cells.

Table 1: $TD$ of the metasurfaces reported in figure 5 and the corresponding quality factors ($Q$) of their SLRs for parallel and perpendicular polarisation of the exciting light.