Hostname: page-component-78c5997874-lj6df Total loading time: 0 Render date: 2024-11-16T17:00:00.422Z Has data issue: false hasContentIssue false

Unsupervised classification of fully kinetic simulations of plasmoid instability using self-organizing maps (SOMs)

Published online by Cambridge University Press:  29 May 2023

Sophia Köhne
Affiliation:
Institut für Theoretische Physik, Ruhr-Universität Bochum, Universitätstraße 150, 44801 Bochum, Germany
Elisabetta Boella
Affiliation:
Physics Department, Lancaster University, Bailrigg, Lancaster LA11NN, UK Cockcroft Institute, Sci-Tech Daresbury, Warrington WA44AD, UK
Maria Elena Innocenti*
Affiliation:
Institut für Theoretische Physik, Ruhr-Universität Bochum, Universitätstraße 150, 44801 Bochum, Germany
*
Email address for correspondence: [email protected]
Rights & Permissions [Opens in a new window]

Abstract

The growing amount of data produced by simulations and observations of space physics processes encourages the use of methods rooted in machine learning for data analysis and physical discovery. We apply a clustering method based on self-organizing maps to fully kinetic simulations of plasmoid instability, with the aim of assessing their suitability as a reliable analysis tool for both simulated and observed data. We obtain clusters that map well, a posteriori, to our knowledge of the process; the clusters clearly identify the inflow region, the inner plasmoid region, the separatrices and regions associated with plasmoid merging. Self-organizing map-specific analysis tools, such as feature maps and the unified distance matrix, provide us with valuable insights into both the physics at work and specific spatial regions of interest. The method appears as a promising option for the analysis of data, both from simulations and from observations, and could also potentially be used to trigger the switch to different simulation models or resolution in coupled codes for space simulations.

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

1 Introduction

In recent years, space physics research based on observations and numerical experiments has converged under several points of view. First, both spacecraft and numerical simulations have nowadays easy access to kinetic-scale processes. Recent missions, such as the Magnetospheric MultiScale (MMS) (Burch et al. Reference Burch, Moore, Torbert and Giles2016) spacecraft, Parker Solar Probe PSP (Fox et al. Reference Fox, Velli, Bale, Decker, Driesman, Howard, Kasper, Kinnison, Kusterer and Lario2016) and Solar Orbiter (Müller et al. Reference Müller, St. Cyr, Zouganelis, Gilbert, Marsden, Nieves-Chinchilla, Antonucci, Auchère, Berghmans and Horbury2020), routinely sample ion and electron scales. Similarly, fully kinetic codes are employed for the study of ion- and electron-scale processes embedded in increasingly large spatial and temporal domains. This is the case in particular for semi-implicit particle-in-cell (PIC) (Hockney & Eastwood Reference Hockney and Eastwood2021), codes, e.g. Markidis et al. (Reference Markidis, Lapenta and Rizwan-uddin2010), Innocenti et al. (Reference Innocenti, Johnson, Markidis, Amaya, Deca, Olshevsky and Lapenta2017b) and Lapenta, Gonzalez-Herrero & Boella (Reference Lapenta, Gonzalez-Herrero and Boella2017), where advanced numerical techniques may also be used to enable the simulation of large domains, either within the fully kinetic description, e.g. Innocenti et al. (Reference Innocenti, Lapenta, Markidis, Beck and Vapirev2013, Reference Innocenti, Beck, Ponweiser, Markidis and Lapenta2015b) and Innocenti, Tenerani & Velli (Reference Innocenti, Tenerani and Velli2019b), or by coupling the kinetic and fluid descriptions, e.g. Daldorff et al. (Reference Daldorff, Tóth, Gombosi, Lapenta, Amaya, Markidis and Brackbill2014), Ashour-Abdalla et al. (Reference Ashour-Abdalla, Lapenta, Walker, El-Alaoui and Liang2015) and Lautenbach & Grauer (Reference Lautenbach and Grauer2018). Second, both spacecraft observations and numerical simulations produce an increasing amount of data. As an example, MMS collects a combined volume of $\sim$100 gigabits per day of particle and field data, of which only a fraction can be transmitted to the ground due to downlink limitations (Baker et al. Reference Baker, Riesberg, Pankratz, Panneton, Giles, Wilder and Ergun2016). At the same time, a single PIC simulation can produce $\sim$ tens of terabytes of output, depending on the frequency at which field and particle information is saved. Such a huge amount of data leads naturally to the usage of machine learning (ML, Bishop Reference Bishop2006; Goodfellow, Bengio & Courville Reference Goodfellow, Bengio and Courville2016)-based methods for data analysis and investigation. A review of recent applications of ML techniques to the space physics domain is found in Camporeale (Reference Camporeale2019). Particularly useful in this context are classification or clustering techniques that can be used for the identification of ‘similar’ regions, either sampled or simulated, and hence for the detection of boundary crossing. Recently, supervised ML techniques have been used for the classification of large-scale magnetospheric regions and detection of boundary crossing, chiefly bow-shock and magnetopause crossings, e.g. in Argall et al. (Reference Argall, Small, Piatt, Breen, Petrik, Kokkonen, Barnum, Larsen, Wilder and Oka2020), Breuillard et al. (Reference Breuillard, Dupuis, Retino, Le Contel, Amaya and Lapenta2020), da Silva et al. (Reference da Silva, Barrie, Shuster, Schiff, Attie, Gershman and Giles2020), Olshevsky et al. (Reference Olshevsky, Khotyaintsev, Lalti, Divin, Delzanno, Anderzén, Herman, Chien, Avanov and Dimmock2021), Lalti et al. (Reference Lalti, Khotyaintsev, Dimmock, Johlander, Graham and Olshevsky2022) and Nguyen et al. (Reference Nguyen, Aunai, Michotte de Welle, Jeandet, Lavraud and Fontaine2022). ML techniques, either supervised, e.g. Camporeale, Carè & Borovsky (Reference Camporeale, Carè and Borovsky2017) and Li et al. (Reference Li, Wang, Tu and Xu2020), or unsupervised, e.g. Amaya et al. (Reference Amaya, Dupuis, Innocenti and Lapenta2020) and Roberts et al. (Reference Roberts, Karimabadi, Sipes, Ko and Lepri2020), have been used for the classification/clustering of solar wind states. Unsupervised techniques, namely Gaussian mixture models, have recently proven quite effective in PIC simulations, either for the identification of regions of interest (Dupuis et al. Reference Dupuis, Goldman, Newman, Amaya and Lapenta2020), where the particle distribution functions deviate from Maxwellian, or to encode particle information for later resampling during simulation restarts (Chen, Chacón & Nguyen Reference Chen, Chacón and Nguyen2021). In this paper, we will use an unsupervised clustering technique based on self-organizing maps (SOMs, Kohonen Reference Kohonen1982) to cluster simulated data points obtained from a PIC simulation. The aim is to verify whether this procedure can be used for two purposes, namely simulation pre-processing and scientific investigation. The same procedure (with minimal variations) has already given satisfactory results on data of rather different origin, obtained from both observations and simulations. In Amaya et al. (Reference Amaya, Dupuis, Innocenti and Lapenta2020), it has been applied to 14 years of Advanced Composition Explorer (ACE, Stone et al. Reference Stone, Frandsen, Mewaldt, Christian, Margolies, Ormes and Snow1998) solar wind measurements. In Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021), it has been used to cluster simulated data, and specifically data from a global magnetospheric simulation. The code used there was the magneto-hydro-dynamic (MHD)-based code OpenGGCM-CTIM-RCM (Raeder Reference Raeder2003), which targets large-scale processes originating from the interaction of the solar wind with the magnetosphere–ionosphere–thermosphere system. Quite satisfactorily, the clustering procedure used there was able to cluster simulated points into regions associated, a posteriori, with the pristine solar wind, the magnetosheath (divided into three clusters, just downstream and further away from the bow shock), the lobes, the inner magnetosphere and boundary layers. Verification and validation activities were conducted to ascertain the dependence of the obtained clusters on several hyper-parameters, including the features used for the clustering, the number of $k$-means clusters used for the classification of the trained weights, and SOM-related hyper-parameters, such as the number of nodes in the SOM, the learning rate and the initial lattice neighbourhood width (see § 2). Robustness of the proposed clustering to temporal variation was also investigated.

One rather fundamental question left open in Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021) was whether a similar clustering procedure would produce equally meaningful results when applied to smaller-scale processes, kinetic in nature. The MHD simulations intend to reproduce plasma behaviour at scales large enough that certain assumptions can be considered satisfied. Among these assumptions, we list quasi-neutrality, the presence of thermal (Maxwellian) and isotropic velocity distribution functions, the possibility of ignoring the non-ideal terms in Ohm's law and finite gyroradius effects (Ledvina, Ma & Kallio Reference Ledvina, Ma and Kallio2008). These assumptions are obviously not respected in most heliospheric plasmas, especially at the very small and fast scales sampled by recent magnetospheric and solar wind missions, such as the above-mentioned MMS, PSP and Solar Orbiter. Hence, the need arises to verify if the clustering procedure described in Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021) is robust to using simulation approaches, such as PIC methods, that deliver results directly comparable to observations (Innocenti et al. Reference Innocenti, Norgren, Newman, Goldman, Markidis and Lapenta2016). This work intends to address this question. The aim of this work is twofold: on one hand, we intend to verify if this procedure is useful in the post-processing of large-throughput numerical simulations. On the other hand, this analysis constitutes a necessary first step to validate the method before applying it to spacecraft observations. We apply the clustering procedure from Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021) to a fully kinetic PIC simulation of the plasmoid instability. The plasmoid instability is a fast instability that breaks down current sheets into multiple magnetic islands, which later undergo nonlinear evolution. We refer the reader to Loureiro & Uzdensky (Reference Loureiro and Uzdensky2015) and Pucci et al. (Reference Pucci, Velli, Shi, Singh, Tenerani, Alladio, Ambrosino, Buratti, Fox and Jara-Almonte2020) for a review of recent developments in plasmoid instability research both in the collisional and collisionless regimes. The plasmoid instability results in fast, spontaneous magnetic reconnection in plasmas and, as such, is deeply connected to the fundamental topic of particle heating and acceleration in space and astrophysical plasmas. A number of recent PIC simulations have focused specifically on the role of magnetic reconnection triggered by plasmoid instability in electron heating and acceleration. Processes observed in plasmoid instability simulations that result in particle heating and acceleration are acceleration by the reconnection electric field (Li et al. Reference Li, Guo, Li and Li2017), Fermi acceleration (Guo, Jokipii & Kota Reference Guo, Jokipii and Kota2010) and plasmoid merging (Drake et al. Reference Drake, Swisdak, Che and Shay2006; Petropoulou et al. Reference Petropoulou, Christie, Sironi and Giannios2018). The efficiency of these processes has been observed to vary according to the plasma beta (Li et al. Reference Li, Guo, Li and Li2015) and magnetization (Guo et al. Reference Guo, Li, Li, Daughton, Zhang, Lloyd-Ronning, Liu, Zhang and Deng2016). In simulations of plasmoid instability evolution, different regions are immediately distinguishable with the naked eye: an inflow region, separatrices, the plasmoid themselves and the plasmoid merging regions. We aim at understanding if our unsupervised clustering method is capable of comparable region identification.

This paper is organized as follows: in § 2 we describe SOMs and our clustering procedure, in § 3 the simulation used. In § 4 we describe our results: preliminary data inspection, scaling experiments, SOM training and the $k$-means clustering of trained SOM nodes (§ 4). An a posteriori analysis of clustering results and physical insights obtained from them are described in § 5. Discussions and conclusions follow. In the Appendix we comment on the robustness to hyper-parameter choice of our clustering procedure.

2 Self-organizing maps: a summary

Self-organizing maps (Kohonen Reference Kohonen1982; Villmann & Claussen Reference Villmann and Claussen2006) can be viewed as both a clustering and a dimensionality reduction procedure (Kohonen Reference Kohonen2014). The aim is to represent a large set of possibly high-dimensional data as a (usually) two-dimensional (2-D) ordered lattice composed of $y \times x= q$ nodes/units/neurons, with $y$ and $x$ respectively the number of rows and columns, and $q$ the total number of nodes. Each node $i$ is characterized by a position in the 2-D lattice and by a weight $\boldsymbol {w}_i \in \mathbb {R}^n$, where $n$ is the number of features associated with each data point, and hence with each weight. Before the training, the weights are initialized randomly so as to span the parameter space covered by the first two principal components (Shlens Reference Shlens2014) of the input data, to make training faster (Kohonen Reference Kohonen2014). At the end of the training, the node weights will have been modified so that the nodes (a) represent local averages of the input data that maps to them, and (b) are topographically ordered according to their similarity relation. Nearby nodes should be more ‘similar’ to each other than far away nodes. To define similarity a distance metric is needed. Common choices are the Euclidean, sum of squares, Manhattan and Tanimoto distances (Xu & Tian Reference Xu and Tian2015).

The training is unsupervised, meaning that training data are not labelled. Each data point $\boldsymbol {x}_\tau$ is presented to the map multiple times. On each occasion the following procedure, a mix of competition and collaboration, is repeated:

  1. (i) competition: the best matching unit (BMU) of the input data $\boldsymbol {x}_\tau$ is identified, by selecting the node whose weight $\boldsymbol {w}_s$, among the set $\boldsymbol {W}$ of all the weights in the map, has minimum distance ($\|\cdot \|$) with respect to $\boldsymbol {x}_\tau$

    (2.1)\begin{equation} \boldsymbol{w}_s = \underset{\boldsymbol{w}_i \in \boldsymbol{W}}{\arg \min} \left( \lVert \boldsymbol{x}_\tau - \boldsymbol{w}_i\rVert \right); \end{equation}
  2. (ii) collaboration: to obtain an ordered map, not only the BMU but also some neighbouring nodes are tagged for update at each iteration. Such neighbours are selected through a lattice neighbourhood function $h_\sigma (\tau,i,s)$

    (2.2)\begin{equation} h_\sigma(\tau,i,s) = \exp\left({-\frac{\lVert \boldsymbol{p}_i - \boldsymbol{p}_s \rVert^2}{2\sigma(\tau)^2}}\right), \end{equation}
    where $s$ is the BMU index and $\boldsymbol {p}_i \in \mathbb {R}^2$ the position of node $i$ in the 2-D map; $\sigma (\tau )$ is the iteration($\tau$)-dependent lattice neighbourhood width that determines the extent around the BMU of the update introduced by the new input;
  3. (iii) weight update: the weights of the selected nodes are updated, to make them become more similar to the input data. The magnitude of the correction $\Delta \boldsymbol {w}_i$ for the node $i$ is calculated as follows:

    (2.3)\begin{equation} \Delta\boldsymbol{w}_i = \epsilon(\tau)h_\sigma(\tau_i,i,s)(\boldsymbol{x}_\tau-\boldsymbol{w}_i). \end{equation}
    The correction depends on the distance of the nodes from the BMU and on the learning rate $\epsilon (\tau )$, which is also iteration dependent.

Both the lattice neighbourhood width and the learning rate decrease with increasing iteration number, according to predetermined rules. The initial lattice neighbour width and learning rate, $\sigma (\tau =0)$ and $\eta (\tau =0)$, are labelled $\sigma _0$ and $\eta _0$, respectively. The rationale for this iteration-dependent decrease is the following: at the beginning of the training far-reaching, large-magnitude updates of the node weights are needed, since the node weights are very different from the input data and the topology of the data has to be enforced on the map. After several iterations, when the map already resembles the data, the node updates become targeted in position and smaller in magnitude. The overall convergence of the map is, therefore, separable into two phases: first the topographic ordering of the nodes and then the convergence of the weight values according to the quantization error (Van Hulle Reference Van Hulle2012).

The quantization error, $Q_E$, defined as

(2.4)\begin{equation} Q_E = \frac{1}{m} \sum_{i=1}^m \lVert \boldsymbol{x}_i - \boldsymbol{w}_{s\,|\,\boldsymbol{x}_i} \rVert, \end{equation}

measures the average distance between each of the $m$ entry data points $\boldsymbol {x}_i$ and their BMU, $\boldsymbol {w}_{s\,|\,\boldsymbol {x}_i}$. As such, it can be used to measure how closely the map reflects the training data distribution. The quantization error is expected to decrease and finally plateau during map training.

The SOMs are at the core of the clustering procedure we use in this work, which has already been used (with minimal variation) and described in Amaya et al. (Reference Amaya, Dupuis, Innocenti and Lapenta2020) and Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021). We briefly describe the method here for the reader's convenience:

  1. (i) preliminary data inspection and data pre-processing: we describe in § 4 our experiments with different data scalers;

  2. (ii) SOM training, as described above;

  3. (iii) $k$-means clustering (Lloyd Reference Lloyd1982) of the trained SOM nodes. After SOM training, the data points are clustered in $q$ clusters, with $q$ the number of SOM nodes; $q$ is typically too high for meaningful results inspections. Hence, we further cluster the trained SOM nodes into a lower number of $k$-means clusters. The number of clusters is determined through an unsupervised procedure, the Kneedle cluster number determination (Satopaa et al. Reference Satopaa, Albrecht, Irwin and Raghavan2011);

  4. (iv) clustering of the simulated data points, based on the $k$-means cluster of their BMU. At this stage, we can inspect our clustering results and use the result of the clustering to obtain physical insights on our data, see §§ 4 and 5.

In Amaya et al. (Reference Amaya, Dupuis, Innocenti and Lapenta2020) and Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021) the serial SOM implementation from Vettigli (Reference Vettigli2018) was used. Here, due to the large volume of data points to cluster, we move to the parallel SOM implementation using CUDA (NVIDIA, Vingelmann & Fitzek Reference Vingelmann and Fitzek2020) and C++ by Mistri (Reference Mistri2018). The algorithm is the same as in the serial implementation, but the distance calculations between the data samples and the neuron weights are parallelized. The parallel implementation allows us to choose, for the training phase, between online and batch learning. In the online version of the algorithm, the data samples are each processed according to the scheme described above. In this case, the updates of the weights are determined by the impact of the individual samples. The batch algorithm, on the other hand, finds the BMU for each sample and then sums up all data samples mapping to one node at the beginning of a training cycle to form node sums. Then, during training, a neighbourhood set for each of the nodes is found and all the nodes sums are summed up to a neighbourhood sum, which is divided by the total number of samples contributing, to form a neighbourhood mean. In the end, the weights are updated in one operation over all nodes of the SOM, where the values of the weights are replaced by those neighbourhood means (Kohonen Reference Kohonen2014). The batch algorithm does not need a learning rate. We chose here the online training algorithm. In table 1 we compare the average run time per epoch, in seconds, of the serial and parallel SOM implementations, as a function of the number of samples $m$. One epoch corresponds to presenting all the samples to the map once. In the last column we record the ratio of the execution times of the parallel and serial implementations as a function of the number of samples. We observe that using the parallel implementation becomes increasingly convenient with increasing sample number.

Table 1. Comparison of the execution times of the parallel CUDA/C++ implementation CUDA-SOM (Mistri Reference Mistri2018) and the serial Python implementation miniSom (Vettigli Reference Vettigli2018), both trained for 5 epochs with different sample sizes $m$. The data points are extracted from the upper current sheet of the simulation described in § 3. The SOM parameters used for the training are initial learning rate $\eta _0=0.5$ and initial neighbourhood radius $\sigma _0=0.2 \times \max (x,y)$.

3 Simulation description

Self-organizing maps are used to cluster results of a 2-D kinetic simulation of plasmoid instability. The simulation was carried out with the semi-implicit PIC code ECsim (Lapenta Reference Lapenta2017; Lapenta et al. Reference Lapenta, Gonzalez-Herrero and Boella2017; Gonzalez-Herrero, Boella & Lapenta Reference Gonzalez-Herrero, Boella and Lapenta2018). In the PIC algorithm, a statistical approach is adopted and plasmas are described using computational particles or macroparticles representative of several real plasma particles. These computational particles interact via the electromagnetic fields that they themselves produce. The fields are computed on a fixed grid by solving Maxwell's equations. ECsim implements a semi-implicit algorithm, which makes the code stable and accurate over different spatial and temporal resolutions. As a consequence, the code resolution can be tuned to the physics of interest, rather than to the smallest scales of the problem (Micera et al. Reference Micera, Boella, Zhukov, Shaaban, López, Lazar and Lapenta2020). In the simulation, two oppositely directed force-free current sheets are used as initial condition. The sheets sustain a magnetic field $\boldsymbol {B}= B_x(y) \hat {\boldsymbol {x}} + B_z(y) \hat {\boldsymbol {z}}$ with

(3.1)\begin{equation} B_x(y) = B_{0,x} \left[{-}1 + \tanh \left(\frac{y-0.25 L_y}{\delta} \right) + \tanh \left(-\frac{y-0.75 L_y}{\delta} \right) \right], \end{equation}

and

(3.2)\begin{equation} B_z(y)=B_{0,x} \sqrt{ \operatorname{sech}^2 \left(\frac{y-0.25 L_y}{\delta} \right) + \operatorname{sech}^2 \left(\frac{y-0.75 L_y}{\delta} \right) + B_g^2 }. \end{equation}

Here, $L_y$ is the transverse size of the simulation box, $\delta = d_i$ is the current sheet half-thickness, $d_i = c/\omega _{pi}$ is the ion inertial length, $\omega _{pi} = \sqrt {4 {\rm \pi}\,{\rm e}^2 n_{0i}/m_i}$ the plasma frequency for ions of density $n_{0i}$ and mass $m_i$, $e$ is the elementary charge and $B_g$ the magnitude of the guide field. While the upper current sheet is unperturbed and the tearing instability is seeded from numerical noise, the magnetic field of the lower current sheet was perturbed with a long wave perturbation (Birn et al. Reference Birn, Drake, Shay, Rogers, Denton, Hesse, Kuznetsova, Ma, Bhattacharjee and Otto2001) to trigger the instability on faster time scales. The simulation box is filled with a plasma composed of electrons and ions with mass ratio $m_i/m_e = 25$, where $m_e$ is the electron mass, having uniform density $n_{0i} = n_{0e} = n_0$ and temperature $T_{0i} = T_{0e} = T_0$. Electrons are initialized with a drift velocity $\boldsymbol {v}_e = v_{e,x}(y) \hat {\boldsymbol {x}} + v_{e,z}(y) \hat {\boldsymbol {z}}$, such that $\boldsymbol {\nabla } \times \boldsymbol {B} = 4 {\rm \pi}\boldsymbol {J}_e/ c$ is satisfied, with $\boldsymbol {J}_e$ the electron current density. In our simulation, we set $v_A = 0.2 c$, $\omega _{pe} / \varOmega _{ce} = 1$, $\beta _e = \beta _i = 0.02$ and $B_g = 0.03 B_{0,x}$. Here, $v_A = B_{0,x} / (4 {\rm \pi}n_0 m_i)$ is the Alfvén velocity, $\omega _{pe} = \sqrt {4 {\rm \pi}\,{\rm e}^2 n_{0}/m_e}$ is the electron plasma frequency, $\varOmega _{ce} = e B_{0,x}/(m_e c)$ is the electron cyclotron frequency, $\beta _{e,i} = 8 {\rm \pi}n_0 T_0/ B_{0,x}^2$ is the ratio between electron/ion pressure and magnetic pressure and $c$ is the speed of light in vacuum. These dimensionless parameters are similar to those used in Li et al. (Reference Li, Guo, Li and Li2017) and correspond to plasma conditions typical of the solar corona and the accretion disk corona. We used a simulation box with longitudinal and transverse sizes $L_x = L_y = 200 d_i$, discretized with $2128 \times 2128$ cells to resolve $c/\omega _{pe}$ twice. To model the plasma dynamics accurately, 64 particles per cell per species were employed. Particles were pushed for more than 7000 iterations with a temporal time step of $0.16 \varOmega _{ci}^{-1}$, with $\varOmega _{c,i}$ ion cyclotron frequency. Periodic boundary conditions for fields and particles were adopted. We performed a detailed convergence study to ensure that the chosen numerical parameters do not affect the physics under investigation. Figure 1 displays the out-of-plane magnetic field component ($B_z$, a), one diagonal ($p_{xx,e}$, b) and one non-diagonal, ($p_{xz,e}$, c), electron pressure term, and the out-of-plane electron current ($J_{z,e}$, d) at $\varOmega _{ci}t=320$. Field lines are superimposed on each panel. At this time the tearing instability is in its nonlinear phase in both current sheets. In the upper current sheet, we observed the formation of small magnetic islands that grew with time and merged to produce the shown configuration. In the lower current sheet, where a perturbation was originally present, single X point reconnection developed initially. Later, plasmoids formed in the current sheet and plasmoid merging was also observed. Signatures associated with plasmoid merging are described e.g. in Cazzola et al. (Reference Cazzola, Innocenti, Markidis, Goldman, Newman and Lapenta2015, Reference Cazzola, Innocenti, Goldman, Newman, Markidis and Lapenta2016). At the location of the X points ($x/d_i \simeq 70$ and $170$ in the upper current sheet and $x/d_i \simeq 80$ in the lower current sheet), the out-of-plane magnetic field component $B_z$ shows the typical quadrupolar structure associated with collisionless Hall reconnection. The $xx$ component of the electron pressure tensor is low at the X points and high at the periphery of magnetic islands, as noted for the electron temperature in Lu et al. (Reference Lu, Angelopoulos, Artemyev, Pritchett, Liu, Runov, Tenerani, Shi and Velli2019). A similar behaviour is observed for the $xz$ pressure tensor component, which is higher in absolute value at the border of magnetic islands with respect to other regions. It is interesting to notice that $p_{xz,e}$ changes polarity within a magnetic island. The role of the pressure tensor off-diagonal components in causing fast reconnection in collisionless plasmas has been studied in a number of previous works, e.g. Hesse & Winske (Reference Hesse and Winske1998) and Ricci et al. (Reference Ricci, Brackbill, Daughton and Lapenta2004).

Figure 1. Out-of-plane magnetic field component ($B_z$, a), one diagonal ($p_{xx,e}$, b) and one non-diagonal, ($p_{xz,e}$, c), electron pressure term, and the out-of-plane electron current ($J_{z,e}$, d) at $\varOmega _{ci}t=320$. Magnetic field lines are superimposed in black. Electromagnetic fields, pressure components and currents are in units of $m_i c \omega _{pi} / e$, $n_0 m_i c^2$ and $e n_0 c$, respectively.

4 Clustering results

The data points we cluster are from the upper current sheet of the simulation described in § 3, at time $\varOmega _{ci}t=320.$ As is standard practice in clustering procedures based on distance, the data are scaled to prevent the features with larger magnitude from dominating the clustering (Angelis & Stamelos Reference Angelis and Stamelos2000). Scaling the data is the transformation of the feature values according to a defined rule. This is necessary to assure that all features have the same (or rather a proportional) degree of influence in the evaluation (Huang, Li & Xie Reference Huang, Li and Xie2015). If the feature values range for one feature in the tens and for another in the thousands, a ML algorithm will always be influenced more by the feature ranging in the thousands, if scaling is not used. We tested three different scalers from the scikit-learn library (Pedregosa et al. Reference Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Prettenhofer, Weiss and Dubourg2011): the so-called MinMax, Standard and Robust scalers. The first scales each feature in the dataset to a fixed interval, here $[0,1]$, the second to zero mean and unit variance. The third removes the median and scales the data according to a quantile range, here the interquartile range between the first and third quartiles. In figure 2 we depict the violin plots for the distributions of an outlier-poor feature, $B_y$, and an outlier-rich one, $J_{z,e}$, after scaling them with the three scalers; we can observe that the data range after scaling changes dramatically for the outlier-rich feature. Since $J_{z,e}$ is strongly associated with reconnection X-points, we can expect that the choice of scaler will have an effect on clustering results.

Figure 2. Violin plots of the distribution, after scaling, of an outlier-poor ($B_y$) and an outlier-rich ($J_z,e$) feature. The MinMax, Standard and Robust scalers are used.

For each set of scaled data, we proceed to train a SOM. The number $q$ of nodes of each map has been determined using the following rule of thumb (Kohonen Reference Kohonen2014):

(4.1)\begin{equation} q \approx 5\times\sqrt{m}, \end{equation}

where $q$ is approximated to an integer and $m$ is the number of samples. The ratio of the side lengths $x$, $y$ of the map is set to match the ratio of the largest two principal components (Shlens Reference Shlens2014) of the training data. We notice that, since the principal component analysis is performed on scaled data, the number of rows and columns in the map may differ in the different cases. The initial neighbourhood radius is then chosen to cover 20 % of the larger side length of the map

(4.2)\begin{equation} \sigma_0 = 0.2\times\max{(x,y)}. \end{equation}

For the MinMax and the Standard scalers this resulted in $x=y=82$ and $\sigma _0=16$ and for the Robust scaler in $x=93$, $y=71$ and $\sigma _0=19$.

The initial learning rate was kept constant at $\eta _0 = 0.5$ and the weights were initialized using random samples from the training data. The maps shown in figure 3 have all been trained for five epochs. We remark that, with this relatively low number of epochs, the map cannot be expected to have converged, as is often the case with SOMs. However, the results that we obtain are remarkably stable to all parameters, including the number of epochs, as shown in the Appendix. There, we study how varying the SOM hyper-parameters and the seed for the random initialization of the weights influences clustering results. In table 2 we report all hyper-parameters used for SOM training.

Figure 3. Trained SOM nodes coloured according to their $k$-means clusters (a,c,e) and UDM maps (b,d,f) for the data scaled with MinMaxScaler (a,b), StandardScaler (c,d), RobustScaler (e,f). Cluster boundaries are drawn in black in (b,d,f).

Table 2. The CUDA-SOM hyperparameters used in SOM training.

In figures 3 and 4 we show the clustering results obtained with the three scalers. The trained SOM nodes have been further clustered with $k$-means into larger-scale regions, as described in § 2. The optimal $k$-means cluster number, $k=5$, has been selected with the Kneedle cluster number determination (Satopaa et al. Reference Satopaa, Albrecht, Irwin and Raghavan2011).

Figure 4. Upper current sheet simulated data at $\varOmega _{ci} t=320$, coloured according to the $k$-means cluster the BMU of each point is clustered into. Data scaled with MinMaxScaler, StandardScaler and RobustScaler are depicted in (a), (b) and (c) respectively.

In figure 3, first column, we see the trained SOM nodes obtained with the three different scalers (MinMaxScaler, StandardScaler and RobustScaler, (a,c,e), respectively) and coloured according to the $k$-means cluster they are clustered into. In the second column (b,d,f), we depict the unified distance matrices (UDMs) associated with the three cases. In the UDM representation each neuron is coloured according to its normalized distance with respect to its nearest neighbour (Kohonen Reference Kohonen2014); ‘darker’ neurons are less similar to their neighbour than ‘lighter’ neurons. We can observe a similar pattern in the three UDMs: a wide, light area composed of similar neurons is mapped to the same $k$-means cluster, the 0, blue cluster. A darker area is present in one of the map corners, not necessarily the same one in the three plots because, in SOMs, the information of interest is not the absolute position but the relative position with respect to the neighbouring nodes. This darker area maps to further four clusters. Even darker areas may be present within clusters or at the boundary between clusters, see e.g. the dark area in figure 3(f), at the intersection between cluster 3, red, 0, blue and 2, green.

In figure 4 we depict the simulated data points at time $\varOmega _{ci}t= 320$, see figure 1 in § 3, coloured according to the $k$-means cluster they map to, for the three different scalers. We can now map the neurons from figure 3 to the physical regions in figure 4.

In all three cases, the blue cluster from figure 3 maps to the plasma outside of the plasmoid region. We can call this region, slightly oversimplifying, the inflow region. The plasmoid region is clustered differently in the three cases; we have here a proof of the importance of pre-processing activities (in this case, the choice of scaler) in clustering results. With MinMaxScaler, figure 4(a), the outer plasmoid region neighbouring the inflow region is divided into two clusters, cluster 3, red and 4, purple. The inner plasmoid region, readily identifiable ‘by eye’ in the plots of § 3, is mapped to cluster 2, green. Intermediate plasmoid regions are mapped to cluster 1, orange. With StandardScaler, (b) in figure 4, the outer plasmoid region is assigned to a single cluster, cluster 3, red. The inner plasmoid region is again mapped to cluster 2, green. The remaining plasmoid regions are clustered into two clusters, cluster 1, orange, and 4, purple. The blue and green clusters obtained with the RobustScaler are very similar to those obtained with MinMaxScaler and StandardScaler. ‘Walking’ in figure 4(c), from the inflow region towards the inner plasmoid region we encounter cluster 3, red, 1, orange and 4, purple. Incidentally, we notice that we can similarly walk from cluster 0 to 3, 1, 4 and finally 2 also in the map depicted in figure 3(e), a confirmation that neighbourhood in the SOM derives from feature similarity.

When confronted with different clustering results, we have to identify criteria that allow us to prefer one clustering method over another. This is usually quite a daunting task for unsupervised methods, where the ‘ground truth’ is not known. Luckily, here, as already in Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021), we are in a rather fortunate situation; we are clustering simulated data, hence we have complete information about all features as a function of space and time. Furthermore, we have previous knowledge of the process we are analysing. Hence, we can determine that the clustering obtained with the RobustScaler is the most useful for our purposes, because it separates regions where we can expect different physical processes to take place.

It is instructive to speculate on the reasons why regions that we may want to see clustered together, based on the physics that occur there, are assigned to different clusters. This is the case, for example, for cluster 3, red and 4, purple, in figure 4(a), depicting data scaled with MinMaxScaler. Analysing the feature values of the points in the two clusters, we realize that the main difference between the two sets of points is the sign of the $y$ component of the magnetic field. Most other features, including features that we consider of particular relevance in plasmoid instability/magnetic reconnection simulations (non-diagonal pressure terms, out-of-plane electron current $\ldots$) present quite similar values in the two clusters. Looking at $B_y$ and $J_{z,e}$ after scaling with MinMaxScaler in figure 2(a,d), we realize that the differences in the outlier-poor $B_y$ for the two clusters are at the two extremes of the value range, here 0 and 1, while the similarities in the outlier-rich $J_{z,e}$ are compressed towards 0.5; the differences in the former ‘weigh’ more in the clustering than the similarities in the latter. With RobustScaler, instead, $B_y$ spans ${\sim }[-6; 6]$, while $J_{z,e}$ varies in the range ${\sim }[-25; 42]$; the $B_y$ values become significantly less relevant in the clustering than the $J_{z,e}$ values, and, according to the results of figure 4, are not capable anymore of driving the formation of cluster 3 vs cluster 4.

We consider this quite a significant demonstration of the importance of accurate pre-processing before clustering activities; data should be inspected before and after scaling to assess if scaling has preserved important characteristics of the data we may want to rely upon during clustering (e.g. outliers). Furthermore, the results of clustering activities when using different scalers should be compared.

5 The a posteriori analysis of clustering results

In § 4 we identified the results obtained using the RobustScaler as the most useful for our purposes, since the points clustered together are the ones where we can expect similar processes to occur. This determination is made on the base of our previous knowledge of plasmoid instability evolution.

Now, we intend to examine these specific results in more detail.

First, we focus on the UDM depicted in figure 3(f). The area associated with the inflow region, cluster 0, blue in figure 4(c), is lighter in colour with respect to the other two cases, showing minimal differences between the nodes. We observe that a darker UDM region encloses a ‘walled in’ area in the lower right corner of the map associated with cluster 2, green. The nodes there are quite similar to each other, as seen from their light colour. They are, however, very different from the ones in the inflow region and also quite different from the other nodes mapping to plasmoids, as seen from the dark colour of the nodes at the boundary between clusters. Comparing figures 3(f) and 4(c), we see that the ‘walled in’ region in the UDM maps to the inner plasmoid region. The features in this regions are, in fact, quite distinct with respect to the neighbouring points, as confirmed by violin plot analysis of the data clustered in each cluster and by figure 9 below. Moving from cluster 2 in figure 4 towards the inflow region, cluster 0, blue, we encounter cluster 4, purple, 1, orange, and finally cluster 3, red, at the boundary between the inflow region and the plasmoid. We can broadly identify cluster 3, red, as the ‘separatrix’ cluster.

In figure 5 we depict the variation across the map of $B_z$, $p_{xx,e}$, $p_{xz,e}$, $J_{z,e}$. The boundaries between the $k$-means clusters are depicted as black lines. The feature values have been scaled back to the normalized values obtained as simulation results. Before going into a detailed analysis of figure 5, we notice several sub-structures in the clusters depicted. This is especially the case for cluster 4, at the right edge of the map. It is therefore no surprise that cluster 4 breaks into 2 and 3 clusters mapping the larger-scale sub-structures with $k=6$ and $k=7$, respectively.

Figure 5. The $B_z$ (a), $p_{xx,e}$ (b), $p_{xz,e}$ (c) and $J_{z,e}$ (d) values associated with the trained SOM nodes, for data points scaled with the RobustScaler. Cluster boundaries are drawn in black.

We compare figure 5 with figures 3 and 4 to highlight important characteristics of the different simulated regions. We observe that high-magnitude, positive values of $B_z$ (figure 5a), are concentrated in the inner plasmoid region, cluster 2, green. Lower, positive values are associated with cluster 4, purple. Most negative $B_z$ values are found in clusters 1, orange and 3, red. In the latter, both positive and negative $B_z$ values are found, most probably associated with the quadrupolar structure in the out-of-plane magnetic field associated with collisionless reconnection. In figure 5(b,c), we depict $p_{xx,e}$ and $p_{xz,e}$. We notice in (b) low $p_{xx,e}$ values in the inflow region, and higher pressure values in correspondence with the plasmoids. We further observe that $p_{xx,e}$ exhibits rather high values in the upper right corner of the map, associated with negative values of $p_{xz,e}$ (c) and rather high positive value of the out-of-plane electron current, $J_{z,e}$, depicted in (d); this interesting region deserves further consideration. We highlight in figure 6 some of these nodes. The nodes themselves are depicted in black in figure 6(a), and the associated points are depicted in the same colour in the 2-D simulated plane in (b). We observe that these nodes correspond to very specific regions in cluster 4 characterized by signatures associated with plasmoid merging.

Figure 6. Identification in the simulated plane of regions of interest in the feature maps; the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

Going back to figure 5, we see that some of the nodes below the region just analysed are characterized both by large positive values of $p_{xx,e}$ and high, negative values of $J_{z,e}$. Again, in figure 7, we highlight these nodes in (a) and check which data points they correspond to in (b). Also in these cases, we obtain specific regions in cluster 4, purple, strongly associated with plasmoid merging. We can therefore reaffirm that cluster 4, purple, is the one more directly associated with plasmoid merging, and that specific nodes in the SOMs map to rather localized regions in space where merging-specific signatures are particularly strong.

Figure 7. Identification in the simulated plane of regions of interest in the feature maps: the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

The variability of $J_{z,e}$ over the map (figure 5d) is extremely interesting to examine. First of all, we notice that $J_{z,e}$ in the feature map varies over values which seem quite different from those observed in figure 4. This may occur in SOM, and it has already been commented upon e.g. in Innocenti et al. (Reference Innocenti, Amaya, Raeder, Dupuis, Ferdousi and Lapenta2021); the feature value associated with a node may be quite different from those of the data points associated with it, especially, as in this case, for outlier-rich features. We can identify in (d) specific patterns; for example, positive and negative values of $J_{z,e}$ are associated with the plasmoid merging regions examined in figures 6 and 7. A further pattern of interest in the $J_{z,e}$ feature map is the high-value region at the intersection between clusters 3, 2 and 1. Figure 8 shows that also these nodes map to a rather small region in space, namely smaller-scale plasmoids developing at the X-line, a feature commonly observed in PIC simulations e.g. in Innocenti et al. (Reference Innocenti, Goldman, Newman, Markidis and Lapenta2015a), Innocenti et al. (Reference Innocenti, Cazzola, Mistry, Eastwood, Goldman, Newman, Markidis and Lapenta2017a) and Li et al. (Reference Li, Guo, Li and Li2017).

Figure 8. Identification in the simulated plane of regions of interest in the feature maps; the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

In figure 9 we depict the position of data points as a function of the electron parallel plasma beta $\beta _{\parallel, e}$, $x$ axis, and of the electron perpendicular to parallel temperature ratio, $T_{\perp,e}/ T_{\parallel,e}$, $y$ axis. In (a) we depict all points at initialization, in (bf) the points associated with clusters 0, 1, 2, 3 and 4, respectively, at time $\varOmega _{ci}t= 320$. The solid, dashed and dotted lines in the $T_{\perp,e}/ T_{\parallel,e}< 1$ semiquadrant are the isocontours of growth rates $\gamma /\varOmega _{ce} = 0.001, 0.1, 0.2$ of the resonant electron firehose instability from Gary & Nishimura (Reference Gary and Nishimura2003). The solid and dotted lines in the $T_{\perp,e}/ T_{\parallel,e}> 1$ semiquadrant are the isocontours of growth rates $\gamma /\varOmega _{ce} = 0.01, 0.1$ for the whistler temperature anisotropy instability (Gary & Wang Reference Gary and Wang1996). The colours depict the number of points per pixel. This visualization is quite common for both electrons and ions in both the solar wind (e.g. Štverák et al. Reference Štverák, Trávníček, Maksimovic, Marsch, Fazakerley and Scime2008; Innocenti et al. Reference Innocenti, Tenerani, Boella and Velli2019a; Micera et al. Reference Micera, Zhukov, López, Boella, Tenerani, Velli, Lapenta and Innocenti2021) and magnetospheric (e.g. Alexandrova et al. Reference Alexandrova, Retino, Divin, Matteini, Contel, Breuillard, Catapano, Cozzani, Zaitsev and Deca2020) environment. When plotting electron quantities, we can quickly identify firehose- (bottom right corner) or whistler- (upper right corner) unstable plasma parcels. The area in between the two families of isocontour lines is stable to both instabilities. In this case, we can use this visualization to quickly appreciate the differences between the plots associated with the different clusters. In all panels the simulated data points sit in the stability region bounded by the stability thresholds. At initialization, (a), the data points are well into the stable region. In (b) we see that the majority of data points in cluster 0, inflow cluster, still sit quite close to initialization values at $\varOmega _{ci}t= 320$, as expected from regions of space barely influenced by the development of the plasmoid instability. A small number of points (darker colour points) have moved to larger parallel beta, still within the stable region. In all the clusters associated with plasmoids the majority of points (lighter colour points) have moved to larger parallel electron beta with respect to initialization, due to the larger electron pressure within the plasmoids with respect to the inflow region (see figure 3). Comparing (cf), we notice immediately the difference between (d) (cluster 2, green, inner plasmoid region) and the others. In the inner plasmoid region the spread of the points around the core of the distribution is quite small with respect to the other clusters, and all points are quite far from instability thresholds. ‘Walking’ from the plasmoid interior towards the inflow region, and crossing from cluster 2, green, (d), to cluster 4, purple, (f), to cluster 1, orange, (c), and finally to cluster 3, red, (e), we notice that the data distribution progressively moves towards higher electron parallel beta, still within the stable region. At the separatrix cluster, (e), the data points have spread into the narrow stable region at high parallel beta.

Figure 9. Data point distribution in the $\beta _{\parallel, e}$ vs $T_{\perp,e}/ T_{\parallel,e}$ plane. (a) All upper current sheet simulated points at initialization. (bf) Points associated with clusters 0, 1, 2, 3 and 4 at $\varOmega _{ci}t= 320$. The colours highlight the number of points per pixel. The isocontours of growth rates $\gamma /\varOmega _{ce} = 0.001, 0.1, 0.2$ for the resonant electron firehose instability and of growth rates $\gamma /\varOmega _{ce} = 0.01, 0.1$ for the whistler temperature anisotropy instability are depicted in the upper and lower semiquadrants.

6 Discussion and conclusion

In this paper, we have applied an unsupervised clustering method based on SOMs to data points obtained from a fully kinetic simulation of plasmoid instability. The clustering method and the simulation used for the clustering experiment are described in §§ 2 and 3. Pre-processing activities and clustering results obtained with three different types of data scaling are described in § 4. We remark on the fundamental role of scalers in determining clustering results. Our dataset includes outlier-poor and outlier-rich features, with the latter being particularly critical in the identification of magnetic reconnection regions. In this situation, a scaler designed to be robust to the presence of outliers, such as RobustScaler, delivers results that match better with our a priori knowledge of the process under investigation. The data scaled with RobustScaler are clustered in an unsupervised fashion into clusters that we identify, a posteriori, as an inflow cluster (cluster 0, blue in figure 4c), an inner plasmoid cluster (cluster 2, green), a separatrix cluster (cluster 3, red), a plasmoid merging cluster (cluster 4, purple) and a further cluster for regions intermediate between the separatrices and the area where plasmoid merging signatures are stronger (cluster 1, orange).

In § 5 we further examine our classification results in light of the analysis tools that SOMs provide us with, mainly feature maps, figure 5, and the UDM, figure 3. We recover from this analysis valuable information on the physical processes occurring in each cluster. In our case, where we are already acquainted with the dataset under investigation and with the physical results of interest, this a posteriori analysis results in an analysis procedure where information is made readily available in an easy-to-grasp 2-D representation, thus simplifying further investigation. Furthermore, we see the potential of using feature and UDM maps as a tool for scientific discovery when applied to datasets encoding unknown processes. We consider of particular interest the analysis of UDM patterns illustrated in figures 6 to 8; we show that darker patterns in the UDM (meaning, nodes ‘significantly’ different from their neighbours) map to small-scale regions of interest, such as regions in the 2-D simulated plane where plasmoid merging signatures are particularly strong, or where smaller-scale plasmoids develop in the current sheet in between larger-scale plasmoids. Also here, we see the double potential of the method, as a fast tool for identifying already known regions of interest in a potentially very large dataset, and as a tool of scientific discovery when applied to an unknown dataset.

This work is a follow up of previous activities, where a similar clustering method had been applied to MHD simulations. We verify here that this clustering technique delivers excellent and insightful results also with fully kinetic simulations, when it is fed data (moments separated by species, parallel and perpendicular pressure terms, electric field ‘including’ non-ideal processes, $\ldots$) comparable, in temporal and spatial resolution and in the nature of the processes of interest, to observed data from solar wind and magnetospheric missions.

Spacecraft observations are thus a natural field of future applicability for this method. Additionally, we envision a role of this and similar clustering methods in model development. An open issue in multi-physics and coupled methods (e.g. the above-mentioned Daldorff et al. Reference Daldorff, Tóth, Gombosi, Lapenta, Amaya, Markidis and Brackbill2014; Ashour-Abdalla et al. Reference Ashour-Abdalla, Lapenta, Walker, El-Alaoui and Liang2015; Innocenti et al. Reference Innocenti, Beck, Ponweiser, Markidis and Lapenta2015b; Lautenbach & Grauer Reference Lautenbach and Grauer2018) is how to decide when to switch between numerical methods (e.g. MHD to PIC, or 10-moment method to Vlasov) or between lower and higher resolution. This choice is often made somehow empirically, e.g. based on the expected location in space of target processes, or on thresholds of specific quantities. We envision the possibility of using SOMs trained on reference simulations to decide where to perform this switch; specific SOM nodes or specific clusters of SOM nodes could be associated with a certain model or resolution. We remark that, while training a SOM is (moderately) time consuming, BMU evaluation is in fact fast enough to be embedded into a run-time code without significant performance degradation. Preliminary and yet unpublished tests confirm that SOMs are robust to being deployed in simulations performed with a different physical model with respect to the one of training.

Acknowledgements

We thank G. Lapenta and the ECsim development team at KULeuven for early access to the code, the AIDA WP7 team (J. Amaya, J. Reader, B. Ferdousi and J. ‘Andy’ Edmond) for long discussions on the methodology (https://www.aida-space.eu) and R. Sydora for useful feedback on the draft.

Editor Antoine C. Bret thanks the referees for their advice in evaluating this article.

Funding

This work was funded by the German Science Foundation DFG within the Collaborative Research Center SFB1491. We gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for providing computing time on the GCS Supercomputer SuperMUC-NG at Leibniz Supercomputing Centre (www.lrz.de). This work also used the DiRAC Extreme Scaling service at the University of Edinburgh, operated by the Edinburgh Parallel Computing Centre on behalf of the STFC DiRAC HPC Facility (www.dirac.ac.uk). This equipment was funded by BEIS capital funding via STFC capital grant ST/R00238X/1 and STFC DiRAC Operations grant ST/R001006/1. DiRAC is part of the UK National e-Infrastructure.

Declaration of interests

The authors report no conflict of interest.

Data availability statement

The data that support the findings of this study are openly available in Zenodo at https://zenodo.org/record/7463339#.Y6R2GhWZPEb, reference number 7463339 (Koehne, Boella & Innocenti Reference Koehne, Boella and Innocenti2022a). The scripts are available on GitHub at https://github.com/SophiaKoehne/unsupervised-classification (Koehne, Boella & Innocenti Reference Koehne, Boella and Innocenti2022b).

Author contribution

M.E.I. performed the simulations. S.K. performed the clustering. All authors contributed to analysing data and reaching conclusions, and in writing the paper.

Appendix. Robustness tests

In this Section we illustrate several tests run with the objective of assessing the robustness of our clustering procedure. First, we want to understand how dependent our clustering results are on the SOM hyper-parameters, see § 2. In table 3 we list several SOM training experiments, run using as input the upper current sheet data scaled with RobustScaler. We change the number of epochs, the initial learning rate $\eta _0$, the initial neighbourhood width $\sigma _0$ (calculated as a difference percentage of the largest map side), the random seed used for random weight initialization and the number of SOM nodes $q$. We then list, in the last column, the percentage of points classified in the same $k$-means cluster as with a 5 epoch-long training, $\eta _0 = 0.5$, $\sigma _0 = 0.2\times \max (x,y)$, random initialization seed. The $k$-means clustering is done afresh on the newly trained map. Since the cluster number in $k$-means is assigned arbitrarily, clusters which are assigned the same number in different clustering experiments do not necessarily correspond to similar regions of simulated space. For this reason, before calculating $R$, we reassign cluster numbers, so that the clusters compared during the $R$ calculation indeed match to similar regions of physical space in the simulations.

Table 3. Percentage $R$ of samples classified in the same $k$-means cluster as with a map trained for five epochs, initial learning rate $\eta _0 = 0.5$, initial neighbourhood radius $\sigma _0 = 0.2\times \max (x,y)$, random initialization seed 42 and number of nodes $q= 71\times 93$ when the number of epochs, $\eta _0, \sigma _0$, random initialization seed and $q$ are varied. In all cases, the scaler used is RobustScaler. The inflow cluster, cluster 0, is excluded in the calculation of the matching factor, since we are primarily interested in clusters mapping to the plasmoid region.

In calculating the matching factor $R$ we exclude the cluster mapping to the inflow region (cluster 0, blue), which stays essentially constant notwithstanding the SOM hyper-parameters, since we are primarily interested in clusters mapping to the plasmoid region. Without excluding the inflow cluster, the matching factors increase significantly in all cases, simply due to the amount of data points falling into the inflow region.

We notice that the matching factors are very high for all cases in table 3; our classification procedure is significantly stable to the choice of SOM hyper-parameters. The high matching factors can be explained noticing that map nodes may individually change with different hyper-parameters, but not so much as to give rise to significantly different $k$-means clustering results. The parameter that gives rise to the most significant differences is the initial neighbourhood radius. In deciding which initial neighbourhood radius to use in the body of the manuscript, we followed the rule of thumb prescription from Kohonen (Reference Kohonen2014). We see here that, with smaller neighbourhood radius and training on the same number of epochs, the map may have more difficulties reflecting the true structure and topology of the data.

It is interesting to identify which points in the simulation change clusters ‘more easily’. In figure 10 we plot the clustered points obtained from the SOM trained with the hyper-parameters listed in red and blue in table 3 compared with the reference SOM in (a). We notice that the regions of space which are more affected by the choice of hyper-parameters are those associated with cluster 1, orange and cluster 4, purple, i.e. the intermediate cluster region. This is not a surprise; the two clusters are the most similar and in fact tend to merge when the cluster number is reduced from $k=5$ to $k=4$. Cluster 2, inner plasmoid region, and cluster 3, separatrix cluster, are instead significantly different from the intermediate plasmoid region. The individual matching factors for each cluster in the case with the lowest matching factor $R = 85.78\,\%$ are: orange(1): 86.45 %, green(2): 99.4 %, red(3): 91.79 %, purple (4): 64.37 %. We see that results for clusters 0, 3, 2 (the clusters that physics tell us should be clearly distinct) are stable with all hyper-parameters, even in the worst case scenario, as we expect from the physics of the problem. From this point of view, the fact that a small subset of simulated points tends to be assigned, in different clustering experiments, to either one of the two most similar clusters is not considered a cause for concern.

Figure 10. Clustering of the upper current sheet simulated data, with the SOM used as a reference in table 3 in (a) and the SOMs trained with the hyper-parameters listed in the lines coloured in blue and red in (b,c).

We also want to check the robustness of our clustering procedure to temporal variation in the simulation. To do this, data samples from an earlier time in the simulation, $\varOmega _{ci}t= 160$, were classified using the map and $k$-means centroids described in §§ 4 and 5, trained on the data from the time step $\varOmega _{ci}t= 320$ scaled with RobustScaler. The BMUs of the data samples from the time step $\varOmega _{ci}t= 160$ were found in that trained map and coloured according to the $k$-means clusters visible in figure 3(e). The out-of-plane electron current and the resulting classification are shown in figure 11(a,b).

Figure 11. Out-of-plane electron current, $J_{z,e}$, (a), and clustering of the upper current sheet data points, (b), at $\varOmega _{ci}t=160$. The SOM used for the clustering has been trained at $\varOmega _{ce}t= 320$.

We immediately notice that, at $\varOmega _{ci}t= 160$, the simulation is at a quite different stage. More, smaller plasmoids are present, since the merging process is just at the beginning. This is reflected in the clustering results.

In figure 11(b), we observe that the inflow and plasmoid region are very well separated. Inside the plasmoid, the inner plasmoid region (cluster 2, green) occupied a larger percentage of the plasmoid area with respect to later times. The clusters associated with plasmoid merging, cluster 4, purple, and cluster 1, orange, have shrunk accordingly, possibly due to a combination of two factors; smaller plasmoid size (the plasmoid size increases after coalescing) and the fact that plasmoid merging is just at the beginning.

References

Alexandrova, A., Retino, A., Divin, A., Matteini, L., Contel, O.L., Breuillard, H., Catapano, F., Cozzani, G., Zaitsev, I. & Deca, J. 2020 In situ evidence of firehose instability in multiple reconnection. arXiv:2004.08280.CrossRefGoogle Scholar
Amaya, J., Dupuis, R., Innocenti, M.E. & Lapenta, G. 2020 Visualizing and interpreting unsupervised solar wind classifications. Front. Astron. Space Sci. 7, 66.CrossRefGoogle Scholar
Angelis, L. & Stamelos, I. 2000 A simulation tool for efficient analogy based cost estimation. Empir. Softw. Engng 5, 3568.CrossRefGoogle Scholar
Argall, M.R., Small, C.R., Piatt, S., Breen, L., Petrik, M., Kokkonen, K., Barnum, J., Larsen, K., Wilder, F.D., Oka, M., et al. 2020 MMS SITL ground loop: automating the burst data selection process. Front. Astron. Space Sci. 7, 54.CrossRefGoogle ScholarPubMed
Ashour-Abdalla, M., Lapenta, G., Walker, R.J., El-Alaoui, M. & Liang, H. 2015 Multiscale study of electron energization during unsteady reconnection events. J. Geophys. Res. 120 (6), 47844799.CrossRefGoogle Scholar
Baker, D.N., Riesberg, L., Pankratz, C.K., Panneton, R.S., Giles, B.L., Wilder, F.D. & Ergun, R.E. 2016 Magnetospheric multiscale instrument suite operations and data system. Space Sci. Rev. 199 (1–4), 545575.CrossRefGoogle Scholar
Birn, J., Drake, J.F., Shay, M.A., Rogers, B.N., Denton, R.E., Hesse, M., Kuznetsova, M., Ma, Z.W., Bhattacharjee, A., Otto, A., et al. 2001 Geospace environmental modeling (GEM) magnetic reconnection challenge. J. Geophys. Res. 106 (A3), 37153719.CrossRefGoogle Scholar
Bishop, C.M. 2006 Pattern recognition. Mach. Learn. 128 (9).Google Scholar
Breuillard, H., Dupuis, R., Retino, A., Le Contel, O., Amaya, J. & Lapenta, G. 2020 Automatic classification of plasma regions in near-earth space with supervised machine learning: application to magnetospheric multi scale 2016–2019 observations. Front. Astron. Space Sci. 7, 55.CrossRefGoogle Scholar
Burch, J.L., Moore, T.E., Torbert, R.B. & Giles, B.L. 2016 Magnetospheric multiscale overview and science objectives. Space Sci. Rev. 199 (1–4), 521.CrossRefGoogle Scholar
Camporeale, E. 2019 The challenge of machine learning in space weather: nowcasting and forecasting. Space Weather 17 (8), 11661207.CrossRefGoogle Scholar
Camporeale, E., Carè, A. & Borovsky, J.E. 2017 Classification of solar wind with machine learning. J. Geophys. Res. 122 (11), 10910.CrossRefGoogle Scholar
Cazzola, E., Innocenti, M.E., Goldman, M.V., Newman, D.L., Markidis, S. & Lapenta, G. 2016 On the electron agyrotropy during rapid asymmetric magnetic island coalescence in presence of a guide field. Geophys. Res. Lett. 43 (15), 78407849.CrossRefGoogle Scholar
Cazzola, E., Innocenti, M.E., Markidis, S., Goldman, M.V., Newman, D.L. & Lapenta, G. 2015 On the electron dynamics during island coalescence in asymmetric magnetic reconnection. Phys. Plasmas 22 (9), 092901.CrossRefGoogle Scholar
Chen, G., Chacón, L. & Nguyen, T.B. 2021 An unsupervised machine-learning checkpoint-restart algorithm using Gaussian mixtures for particle-in-cell simulations. J. Comput. Phys. 436, 110185.CrossRefGoogle Scholar
Daldorff, L.K.S., Tóth, G., Gombosi, T.I., Lapenta, G., Amaya, J., Markidis, S. & Brackbill, J.U. 2014 Two-way coupling of a global hall magnetohydrodynamics model with a local implicit particle-in-cell model. J. Comput. Phys. 268, 236254.CrossRefGoogle Scholar
da Silva, D., Barrie, A., Shuster, J., Schiff, C., Attie, R., Gershman, D.J. & Giles, B. 2020 Automatic region identification over the MMS orbit by partitioning NT space. arXiv:2003.08822.Google Scholar
Drake, J.F., Swisdak, M., Che, H. & Shay, M.A. 2006 Electron acceleration from contracting magnetic islands during reconnection. Nature 443 (7111), 553556.CrossRefGoogle ScholarPubMed
Dupuis, R., Goldman, M.V., Newman, D.L., Amaya, J. & Lapenta, G. 2020 Characterizing magnetic reconnection regions using Gaussian mixture models on particle velocity distributions. Astrophys. J. 889 (1), 22.CrossRefGoogle Scholar
Fox, N.J., Velli, M.C., Bale, S.D., Decker, R., Driesman, A., Howard, R.A., Kasper, J.C., Kinnison, J., Kusterer, M., Lario, D., et al. 2016 The solar probe plus mission: humanity's first visit to our star. Space Sci. Rev. 204 (1–4), 748.CrossRefGoogle Scholar
Gary, S.P. & Nishimura, K. 2003 Resonant electron firehose instability: particle-in-cell simulations. Phys. Plasmas 10 (9), 35713576.CrossRefGoogle Scholar
Gary, S.P. & Wang, J. 1996 Whistler instability: electron anisotropy upper bound. J. Geophys. Res. 101 (A5), 1074910754.CrossRefGoogle Scholar
Gonzalez-Herrero, D., Boella, E. & Lapenta, G. 2018 Performance analysis and implementation details of the energy conserving semi-implicit method code (ECsim). Comput. Phys. Commun. 229, 162169.CrossRefGoogle Scholar
Goodfellow, I., Bengio, Y. & Courville, A. 2016 Deep Learning. MIT Press.Google Scholar
Guo, F., Jokipii, J.R. & Kota, J. 2010 Particle acceleration by collisionless shocks containing large-scale magnetic-field variations. Astrophys. J. 725 (1), 128.CrossRefGoogle Scholar
Guo, F., Li, X., Li, H., Daughton, W., Zhang, B., Lloyd-Ronning, N., Liu, Y.-H., Zhang, H. & Deng, W. 2016 Efficient production of high-energy nonthermal particles during magnetic reconnection in a magnetically dominated ion–electron plasma. Astrophys. J. Lett. 818 (1), L9.CrossRefGoogle Scholar
Hesse, M. & Winske, D. 1998 Electron dissipation in collisionless magnetic reconnection. J. Geophys. Res. 103, 26479.CrossRefGoogle Scholar
Hockney, R.W. & Eastwood, J.W. 2021 Computer Simulation using Particles. CRC Press.CrossRefGoogle Scholar
Huang, J., Li, Y.-F. & Xie, M. 2015 An empirical analysis of data preprocessing for machine learning-based software cost estimation. Inf. Softw. Technol. 67, 108127.CrossRefGoogle Scholar
Innocenti, M.E., Amaya, J., Raeder, J., Dupuis, R., Ferdousi, B. & Lapenta, G. 2021 Unsupervised classification of simulated magnetospheric regions. Ann. Geophys. 39 (5), 861881.CrossRefGoogle Scholar
Innocenti, M.E., Beck, A., Ponweiser, T., Markidis, S. & Lapenta, G. 2015 b Introduction of temporal sub-stepping in the multi-level multi-domain semi-implicit particle-in-cell code Parsek2D-MLMD. Comput. Phys. Commun. 189, 4759.CrossRefGoogle Scholar
Innocenti, M.E., Cazzola, E., Mistry, R., Eastwood, J.P., Goldman, M.V., Newman, D.L., Markidis, S. & Lapenta, G. 2017 a Switch-off slow shock/rotational discontinuity structures in collisionless magnetic reconnection: what to look for in satellite observations. Geophys. Res. Lett. 44 (8), 34473455.CrossRefGoogle Scholar
Innocenti, M.E., Goldman, M., Newman, D., Markidis, S. & Lapenta, G. 2015 a Evidence of magnetic field switch-off in collisionless magnetic reconnection. Astrophys. J. Lett. 810 (2), L19.CrossRefGoogle Scholar
Innocenti, M.E., Johnson, A., Markidis, S., Amaya, J., Deca, J., Olshevsky, V. & Lapenta, G. 2017 b Progress towards physics-based space weather forecasting with exascale computing. Adv. Engng Softw. 111, 317.CrossRefGoogle Scholar
Innocenti, M.E., Lapenta, G., Markidis, S., Beck, A. & Vapirev, A. 2013 A multi level multi domain method for particle in cell plasma simulations. J. Comput. Phys. 238, 115140.CrossRefGoogle Scholar
Innocenti, M.E., Norgren, C., Newman, D., Goldman, M., Markidis, S. & Lapenta, G. 2016 Study of electric and magnetic field fluctuations from lower hybrid drift instability waves in the terrestrial magnetotail with the fully kinetic, semi-implicit, adaptive multi level multi domain method. Phys. Plasmas 23 (5), 052902.CrossRefGoogle Scholar
Innocenti, M.E., Tenerani, A., Boella, E. & Velli, M. 2019 a Onset and evolution of the oblique, resonant electron firehose instability in the expanding solar wind plasma. Astrophys. J. 883 (2), 146.CrossRefGoogle Scholar
Innocenti, M.E., Tenerani, A. & Velli, M. 2019 b A semi-implicit particle-in-cell expanding box model code for fully kinetic simulations of the expanding solar wind plasma. Astrophys. J. 870 (2), 66.CrossRefGoogle Scholar
Koehne, S., Boella, E. & Innocenti, M.E. 2022 a https://zenodo.org/record/7463339#.Y6Rvwh WZPEa.Google Scholar
Koehne, S., Boella, E. & Innocenti, M.E. 2022 b https://github.com/SophiaKoehne/unsupervised-classification.Google Scholar
Kohonen, T. 1982 Self-organized formation of topologically correct feature maps. Biol. Cybern. 43 (1), 5969.CrossRefGoogle Scholar
Kohonen, T. 2014 Matlab Implementations and Applications of the Self-Organizing Map. Unigrafia Oy.Google Scholar
Lalti, A., Khotyaintsev, Y.V., Dimmock, A.P., Johlander, A., Graham, D.B. & Olshevsky, V. 2022 A database of MMS bow shock crossings compiled using machine learning. J. Geophys. Res. 127 (8), e2022JA030454.CrossRefGoogle Scholar
Lapenta, G. 2017 Exactly energy conserving semi-implicit particle in cell formulation. J. Comput. Phys. 334, 349366.CrossRefGoogle Scholar
Lapenta, G., Gonzalez-Herrero, D. & Boella, E. 2017 Multiple-scale kinetic simulations with the energy conserving semi-implicit particle in cell method. J. Plasma Phys. 83, 705830205.CrossRefGoogle Scholar
Lautenbach, S. & Grauer, R. 2018 Multiphysics simulations of collisionless plasmas. Front. Phys. 6, 113.CrossRefGoogle Scholar
Ledvina, S.A., Ma, Y.-J. & Kallio, E. 2008 Modeling and simulating flowing plasmas and related phenomena. Space Sci. Rev. 139, 143189.CrossRefGoogle Scholar
Li, H., Wang, C., Tu, C. & Xu, F. 2020 Machine learning approach for solar wind categorization. Earth Space Sci. 7 (5), e2019EA000997.CrossRefGoogle Scholar
Li, X., Guo, F., Li, H. & Li, G. 2015 Nonthermally dominated electron acceleration during magnetic reconnection in a low-$\beta$ plasma. Astrophys. J. Lett. 811 (2), L24.CrossRefGoogle Scholar
Li, X., Guo, F., Li, H. & Li, G. 2017 Particle acceleration during magnetic reconnection in a low-beta plasma. Astrophys. J. 843 (1), 21.CrossRefGoogle Scholar
Lloyd, S. 1982 Least squares quantization in PCM. IEEE Trans. Inf. Theory 28 (2), 129137.CrossRefGoogle Scholar
Loureiro, N.F. & Uzdensky, D.A. 2015 Magnetic reconnection: from the sweet–Parker model to stochastic plasmoid chains. Plasma Phys. Control. Fusion 58 (1), 014021.CrossRefGoogle Scholar
Lu, S., Angelopoulos, V., Artemyev, A.V., Pritchett, P.L., Liu, J., Runov, A., Tenerani, A., Shi, C. & Velli, M. 2019 Turbulence and particle acceleration in collisionless magnetic reconnection: effects of temperature inhomogeneity across pre-reconnection current sheet. Astrophys. J. 878, 109.CrossRefGoogle Scholar
Markidis, S., Lapenta, G. & Rizwan-uddin, . 2010 Multi-scale simulations of plasma with iPIC3D. Math. Comput. Simul. 80 (7), 15091519.CrossRefGoogle Scholar
Micera, A., Boella, E., Zhukov, A.N., Shaaban, S.M., López, R.A., Lazar, M. & Lapenta, G. 2020 Particle-in-cell simulations of the parallel proton firehose instability influenced by the electron temperature anisotropy in solar wind conditions. Astrophys. J. 893, 130.CrossRefGoogle Scholar
Micera, A., Zhukov, A.N., López, R.A., Boella, E., Tenerani, A., Velli, M., Lapenta, G. & Innocenti, M.E. 2021 On the role of solar wind expansion as a source of whistler waves: scattering of suprathermal electrons and heat flux regulation in the inner heliosphere. Astrophys. J. 919 (1), 42.CrossRefGoogle Scholar
Mistri, M. 2018 CUDA-SOM. https://github.com/mistrello96/CUDA-SOM/releases, blacklast accessed: Mar 22, 2023 (bugfix in src/utility_functions.cu in line 49: change “i<nELements;” to “j<nElements;”).Google Scholar
Müller, D., St. Cyr, O.C., Zouganelis, I., Gilbert, H.R., Marsden, R., Nieves-Chinchilla, T., Antonucci, E., Auchère, F., Berghmans, D., Horbury, T.S., et al. 2020 The solar orbiter mission-science overview. Astron. Astrophys. 642, A1.CrossRefGoogle Scholar
Nguyen, G., Aunai, N., Michotte de Welle, B., Jeandet, A., Lavraud, B. & Fontaine, D. 2022 Massive multi-mission statistical study and analytical modeling of the earth's magnetopause. 1. A gradient boosting based automatic detection of near-earth regions. J. Geophys. Res. 127 (1), e2021JA029773.Google Scholar
Olshevsky, V., Khotyaintsev, Y.V., Lalti, A., Divin, A., Delzanno, G.L., Anderzén, S., Herman, P., Chien, S.W.D., Avanov, L., Dimmock, A.P., et al. 2021 Automated classification of plasma regions using 3D particle energy distributions. J. Geophys. Res. 126 (10), e2021JA029620.CrossRefGoogle Scholar
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al. 2011 Scikit-learn: machine learning in Python. J. Machine Learning Res. 12, 28252830.Google Scholar
Petropoulou, M., Christie, I.M., Sironi, L. & Giannios, D. 2018 Plasmoid statistics in relativistic magnetic reconnection. Mon. Not. R. Astron. Soc. 475 (3), 37973812.CrossRefGoogle Scholar
Pucci, F., Velli, M., Shi, C., Singh, K.A.P., Tenerani, A., Alladio, F., Ambrosino, F., Buratti, P., Fox, W., Jara-Almonte, J., et al. 2020 Onset of fast magnetic reconnection and particle energization in laboratory and space plasmas. J. Plasma Phys. 86 (6).CrossRefGoogle Scholar
Raeder, J. 2003 Global Magnetohydrodynamics—A Tutorial, 212246. Springer.Google Scholar
Ricci, P., Brackbill, J.U., Daughton, W. & Lapenta, G. 2004 Collisionless magnetic reconnection in the presence of a guide field. Phys. Plasmas 11, 4102.CrossRefGoogle Scholar
Roberts, D.A., Karimabadi, H., Sipes, T., Ko, Y.-K. & Lepri, S. 2020 Objectively determining states of the solar wind using machine learning. Astrophys. J. 889 (2), 153.CrossRefGoogle Scholar
Satopaa, V., Albrecht, J., Irwin, D. & Raghavan, B. 2011 Finding a “kneedle” in a haystack: Detecting knee points in system behavior. In 2011 31st International Conference on Distributed Computing Systems Workshops, pp. 166–171. doi:10.1109/ICDCSW.2011.20.CrossRefGoogle Scholar
Shlens, J. 2014 A tutorial on principal component analysis. arXiv:1404.1100.Google Scholar
Stone, E.C., Frandsen, A.M., Mewaldt, R.A., Christian, E.R., Margolies, D., Ormes, J.F. & Snow, F. 1998 The advanced composition explorer. Space Sci. Rev. 86 (1–4), 122.CrossRefGoogle Scholar
Štverák, Š., Trávníček, P., Maksimovic, M., Marsch, E., Fazakerley, A.N. & Scime, E.E. 2008 Electron temperature anisotropy constraints in the solar wind. J. Geophys. Res. 113 (A3).Google Scholar
Van Hulle, M.M. 2012 Self-Organizing Maps, 585622. Springer.Google Scholar
Vettigli, G. 2018 Minisom: minimalistic and numpy-based implementation of the self organizing map. GitHub.[Online]. https://github.com/JustGlowing/minisom/.Google Scholar
Villmann, T. & Claussen, J.C. 2006 Magnification control in self-organizing maps and neural gas. Neural Comput. 18 (2), 446469.CrossRefGoogle ScholarPubMed
NVIDIA, Vingelmann, P. & Fitzek, F.H.P. 2020 CUDA, release: 10.2.89.Google Scholar
Xu, D. & Tian, Y. 2015 A comprehensive survey of clustering algorithms. Ann. Data Sci. 2.CrossRefGoogle Scholar
Figure 0

Table 1. Comparison of the execution times of the parallel CUDA/C++ implementation CUDA-SOM (Mistri 2018) and the serial Python implementation miniSom (Vettigli 2018), both trained for 5 epochs with different sample sizes $m$. The data points are extracted from the upper current sheet of the simulation described in § 3. The SOM parameters used for the training are initial learning rate $\eta _0=0.5$ and initial neighbourhood radius $\sigma _0=0.2 \times \max (x,y)$.

Figure 1

Figure 1. Out-of-plane magnetic field component ($B_z$, a), one diagonal ($p_{xx,e}$, b) and one non-diagonal, ($p_{xz,e}$, c), electron pressure term, and the out-of-plane electron current ($J_{z,e}$, d) at $\varOmega _{ci}t=320$. Magnetic field lines are superimposed in black. Electromagnetic fields, pressure components and currents are in units of $m_i c \omega _{pi} / e$, $n_0 m_i c^2$ and $e n_0 c$, respectively.

Figure 2

Figure 2. Violin plots of the distribution, after scaling, of an outlier-poor ($B_y$) and an outlier-rich ($J_z,e$) feature. The MinMax, Standard and Robust scalers are used.

Figure 3

Figure 3. Trained SOM nodes coloured according to their $k$-means clusters (a,c,e) and UDM maps (b,d,f) for the data scaled with MinMaxScaler (a,b), StandardScaler (c,d), RobustScaler (e,f). Cluster boundaries are drawn in black in (b,d,f).

Figure 4

Table 2. The CUDA-SOM hyperparameters used in SOM training.

Figure 5

Figure 4. Upper current sheet simulated data at $\varOmega _{ci} t=320$, coloured according to the $k$-means cluster the BMU of each point is clustered into. Data scaled with MinMaxScaler, StandardScaler and RobustScaler are depicted in (a), (b) and (c) respectively.

Figure 6

Figure 5. The $B_z$ (a), $p_{xx,e}$ (b), $p_{xz,e}$ (c) and $J_{z,e}$ (d) values associated with the trained SOM nodes, for data points scaled with the RobustScaler. Cluster boundaries are drawn in black.

Figure 7

Figure 6. Identification in the simulated plane of regions of interest in the feature maps; the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

Figure 8

Figure 7. Identification in the simulated plane of regions of interest in the feature maps: the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

Figure 9

Figure 8. Identification in the simulated plane of regions of interest in the feature maps; the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

Figure 10

Figure 9. Data point distribution in the $\beta _{\parallel, e}$ vs $T_{\perp,e}/ T_{\parallel,e}$ plane. (a) All upper current sheet simulated points at initialization. (bf) Points associated with clusters 0, 1, 2, 3 and 4 at $\varOmega _{ci}t= 320$. The colours highlight the number of points per pixel. The isocontours of growth rates $\gamma /\varOmega _{ce} = 0.001, 0.1, 0.2$ for the resonant electron firehose instability and of growth rates $\gamma /\varOmega _{ce} = 0.01, 0.1$ for the whistler temperature anisotropy instability are depicted in the upper and lower semiquadrants.

Figure 11

Table 3. Percentage $R$ of samples classified in the same $k$-means cluster as with a map trained for five epochs, initial learning rate $\eta _0 = 0.5$, initial neighbourhood radius $\sigma _0 = 0.2\times \max (x,y)$, random initialization seed 42 and number of nodes $q= 71\times 93$ when the number of epochs, $\eta _0, \sigma _0$, random initialization seed and $q$ are varied. In all cases, the scaler used is RobustScaler. The inflow cluster, cluster 0, is excluded in the calculation of the matching factor, since we are primarily interested in clusters mapping to the plasmoid region.

Figure 12

Figure 10. Clustering of the upper current sheet simulated data, with the SOM used as a reference in table 3 in (a) and the SOMs trained with the hyper-parameters listed in the lines coloured in blue and red in (b,c).

Figure 13

Figure 11. Out-of-plane electron current, $J_{z,e}$, (a), and clustering of the upper current sheet data points, (b), at $\varOmega _{ci}t=160$. The SOM used for the clustering has been trained at $\varOmega _{ce}t= 320$.