Active-learning-based nonintrusive model order reduction

Qinyu Zhuang; Dirk Hartmann; Hans-J. Bungartz; Juan M. Lorenzi

doi:10.1017/dce.2022.39

Active-learning-based nonintrusive model order reduction

Published online by Cambridge University Press: 09 January 2023

Hans-J. Bungartz and

Qinyu Zhuang*: Affiliation:
Department of Informatics, Technical University of Munich, 85748 Garching, Germany Modeling, Simulation & Optimization, Siemens AG, 81739 Munich, Germany
Dirk Hartmann: Affiliation:
Company Core Technology Simulation and Digital Twin, Siemens Digital Industries Software GmbH, 81739 Munich, Germany
Hans-J. Bungartz: Affiliation:
Department of Informatics, Technical University of Munich, 85748 Garching, Germany
Juan M. Lorenzi: Affiliation:
Modeling, Simulation & Optimization, Siemens AG, 81739 Munich, Germany
*: *Corresponding author. E-mail: [email protected]

Article contents

Abstract
Impact Statement
Introduction
ML-Based MOR
AL for ML-Based MOR
Numerical Examples
Summary and Conclusions
Author Contributions
Competing Interests
Data Availability Statement
Ethics Statement
Funding Statement
Footnotes
References

Abstract

Model order reduction (MOR) can provide low-dimensional numerical models for fast simulation. Unlike intrusive methods, nonintrusive methods are attractive because they can be applied even without access to full order models (FOMs). Since nonintrusive MOR methods strongly rely on snapshots of the FOMs, constructing good snapshot sets becomes crucial. In this work, we propose a novel active-learning-based approach for use in conjunction with nonintrusive MOR methods. It is based on two crucial novelties. First, our approach uses joint space sampling to prepare a data pool of the training data. The training data are selected from the data pool using a greedy strategy supported by an error estimator based on Gaussian process regression. Second, we introduce a case-independent validation strategy based on probably approximately correct learning. While the methods proposed here can be applied to different MOR methods, we test them here with artificial neural networks and operator inference.

Keywords

Active learning joint space sampling machine learning model order reduction probably approximately correct learning

Type: Research Article
Information: Data-Centric Engineering , Volume 4 , 2023 , e2

DOI: https://doi.org/10.1017/dce.2022.39 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices: Open data
Copyright: © The Author(s), 2023. Published by Cambridge University Press

Impact Statement

Nonintrusive model order reduction methods draw much attention in the industry for their easy implementation. However, insufficient accuracy and stability hinder broader applications. Large amounts of training data are needed to ensure their accuracy and stability. In this article, we propose an approach to actively collect the training data. The reduced order models built in such a way have not only much better performance than those built in conventional ways but also users can know their expected accuracy and confidence before deploying them. With little required user interaction, the proposed method offers a huge potential for industrial usage to create the so-called digital twin.

1. Introduction

Model order reduction (MOR) (Willcox and Peraire, Reference Willcox and Peraire2002; Antoulas, Reference Antoulas2005; Volkwein, Reference Volkwein2013), as a key component of the concept of digital twin (DT) (Hartmann et al., Reference Hartmann, Herz and Wever2018; Rasheed et al., Reference Rasheed, San and Kvamsdal2019), is increasingly attracting attention from different research fields. The traditional ways of reduced modeling, such as the Krylov subspace method (Heres, Reference Heres2005), reduced basis method (Haasdonk, Reference Haasdonk2017), proper orthogonal decomposition (POD) (Lu et al., Reference Lu, Jin, Chen, Yang, Hou, Zhang, Li and Fu2019), and the discrete empirical interpolation method (Chaturantabut and Sorensen, Reference Chaturantabut and Sorensen2010) require detailed knowledge of full order models (FOMs) and are known as intrusive reduction methods. The applicability of intrusive methods is limited in industrial engineering simulation and commercial finite element method (FEM) software. Recently, the development of so-called nonintrusive reduction has attracted much attention.

Nonintrusive MOR methods use data generated by FOMs to build surrogate models that can accurately reproduce the physical behavior encoded in the data. Currently, most data-driven MOR methods are based on machine learning (ML). For example, Feed-forward Neural Network (Regazzoni et al., Reference Regazzoni, Dedè and Quarteroni2019), operator inference (OpInf) (Peherstorfer and Willcox, Reference Peherstorfer and Willcox2016), Long-short-term-memory (LSTM) neural network (Mohan and Gaitonde, Reference Mohan and Gaitonde2018), recurrent neural network (RNN) (Kani and Elsheikh, Reference Kani and Elsheikh2017; Wang et al., Reference Wang, Ripamonti and Hesthaven2020; Wu and Noels, Reference Wu and Noels2022), deep learning (Fresca and Manzoni, Reference Fresca and Manzoni2022), sparse identification of nonlinear dynamics (SINDy) (Champion et al., Reference Champion, Lusch, Kutz and Brunton2019), and Runge–Kutta neural network (Zhuang et al., Reference Zhuang, Lorenzi, Bungartz and Hartmann2021) have been tested for reduced modeling.

One of the concerning points of data-driven MOR techniques is collecting enough training data, as computing the FOM to generate the training data is very time-consuming, especially when looking at large 3D simulation models used productively in the industry. Therefore, collecting the training data smartly becomes one of the focuses of the data-driven MOR techniques.

We orient our discussion by focusing on two aspects relevant to snapshot collection: data quality and data quantity. The data quality could describe many aspects, for example, the distribution of the data and the noise in the data. Unlike other applications, the data in the MOR field has fewer corruptions since it is usually obtained directly from the FOM solver, and not, for example, from potentially noisy sensors. Therefore, the main focus will be improving the training data distribution. There are some investigations (Batista et al., Reference Batista, Prati and Monard2004; Gyori et al., Reference Gyori, Palombo, Clark, Zhang and Alexander2022) discussing how the distribution of the data could influence the ML process itself. However, this issue is less prevalent within MOR research. To minimize the amount of training data, we employ the concept of active learning (AL). The concept of AL is widely used in the world of intrusive MOR (Bui-Thanh et al., Reference Bui-Thanh, Willcox and Ghattas2008; Haasdonk et al., Reference Haasdonk, Dihlmann and Ohlberger2011; Lappano et al., Reference Lappano, Naets, Desmet, Mundo and Nijman2016). However, the data-driven nonintrusive MOR technique faces extra challenges without intrusiveness. One challenge is finding a good stopping criteria. The stopping criteria are usually needed to validate the reduced order model (ROM) and compute the accuracy. However, currently, most validation (Przekop et al., Reference Przekop, Guo and Rizzi2012; Perez et al., Reference Perez, Wang and Mignolet2014) is test-case-dependent, which implies that the general performance of the ROM cannot be evaluated without bias. To address this problem, a load-independent metric is proposed for a particular mechanical problem in Kuether and Allen (Reference Kuether and Allen2016). The second difficulty is the absence of an error estimator, which can decide the data increment strategy. Such an error estimator is available in an analytical form for the intrusive MOR methods but lacking in the nonintrusive case.

In this article, we propose an alternative approach to tackle the problems mentioned above. To improve the snapshot distribution, we propose a new method for sampling the training data, and we call it joint space sampling (JSS). With this method, we collect the training data from a joint space consisting of an estimated reduced solution space and a parameter space. Besides, we propose to use a validator based on probably approximately correct (PAC) learning theory (Haussler, Reference Haussler1990) to decide when to stop the AL iteration. The validated ROMs are useful for deploying in real industrial use cases where complex time-dependent inputs can be expected. Furthermore, the error estimator to speed up the convergence is constructed in a data-driven way. Among different existing approaches (Paul-Dubois-Taine and Amsallem, Reference Paul-Dubois-Taine and Amsallem2015; Guo and Hesthaven, Reference Guo and Hesthaven2018; Xiao, Reference Xiao2019), we employ the Kriging interpolation (Krige, Reference Krige1951) introduced in Guo and Hesthaven (Reference Guo and Hesthaven2018) as the error estimator to decide the data increment strategy.

The whole article is structured as follows: in Section 2, MOR based on ML is introduced. In Section 3, the key components and the whole workflow of the new AL-based MOR algorithm are described in detail. Numerical tests used to validate this approach are presented in Section 4. Finally, the conclusions are presented in Section 5.

2. ML-Based MOR

The offline phase of ML-based MOR generally consists of two steps: constructing a low-dimensional (reduced) space and identifying the ROM in the reduced space by ML-based approach. In the online phase, the identified ML model will be used as a surrogate model to predict the evolution of the simulated dynamic system at a real-time speed.

2.1. Reduced space construction

Without loss of generality, we can assume the FOM is governed by the equation:

(1)

$$ \dot{\boldsymbol{y}}(t)=\boldsymbol{f}\left(\boldsymbol{y};\boldsymbol{\mu} \right), $$

where $ \boldsymbol{y}\in {\mathrm{\mathbb{R}}}^N $ is the state of the FOM and $ N $ is normally very large. $ \boldsymbol{f} $ is the right-hand side (RHS) function configured by the parameter vector $ \boldsymbol{\mu} =\left[{\mu}_1\hskip0.5em {\mu}_2\hskip0.5em \dots \hskip0.5em {\mu}_{N_m}\right]\in {\mathrm{\mathbb{R}}}^{N_m} $ .

The first step of generating a ROM is to construct the reduced space by finding a low-dimensional space defined by a projection matrix $ \boldsymbol{V}\in {\mathrm{\mathbb{R}}}^{N\times n} $ , such that the reduced state $ {\boldsymbol{y}}_r\in {\mathrm{\mathbb{R}}}^n $ is given by projecting FOM state $ \boldsymbol{y} $ onto the reduced space

(2)

$$ {\boldsymbol{y}}_r={\boldsymbol{V}}^{\boldsymbol{T}}\boldsymbol{y}. $$

2.1.1. Proper orthogonal decomposition

POD is a widely used method to construct the reduced space. Plenty of introductory literature for POD method is available (e.g., Lu et al., Reference Lu, Jin, Chen, Yang, Hou, Zhang, Li and Fu2019). Here we will just briefly introduce it. The main ingredient of POD is a series of observations of the FOM states, which are usually called snapshots and take the form of

(3)

$$ \boldsymbol{Y}=\left[\begin{array}{cccc}{\boldsymbol{y}}^{(1)}& {\boldsymbol{y}}^{(2)}& \dots & {\boldsymbol{y}}^{\left({N}_s\right)}\end{array}\right], $$

where $ {N}_s $ is the number of the available snapshots and $ {\boldsymbol{y}}^{(i)} $ is the $ i $ th column in the snapshot matrix $ \boldsymbol{Y} $ .

Applying singular value decomposition (SVD) to $ \boldsymbol{Y} $ yields

(4)

$$ \boldsymbol{Y}=\boldsymbol{V}\varSigma {\boldsymbol{U}}^{\boldsymbol{T}}. $$

The left singular vectors $ \boldsymbol{V}\in {\mathrm{\mathbb{R}}}^{N\times k} $ can be used as the reduced space basis. This space will minimize the $ {L}^2 $ approximation error for the given snapshots. We can truncate $ \boldsymbol{V} $ at its $ {N}_r $ th column. By doing this, the size of the reduced space ( $ {N}_r $ ) can be chosen freely. A large space can retain more information in the original full space, but it will make the ROM run slower during the online phase.

2.1.2. Conventional snapshot sampling

To increase the diversity of the observed FOM states in the snapshots, usually a hypercubic parameter space is predefined as

(5)

$$ \mathcal{M}=\left[{\mu}_{1,\min },{\mu}_{1,\max}\right]\times \left[{\mu}_{2,\min },{\mu}_{2,\max}\right]\times \dots \times \left[{\mu}_{N_m,\min },{\mu}_{N_m,\max}\right], $$

where $ {\mu}_{i,\min } $ and $ {\mu}_{i,\max } $ is the upper and the lower limit of the $ i $ th component of the system parameter vector $ \boldsymbol{\mu} $ . The parameter vectors $ \boldsymbol{\mu} $ will be randomly selected from $ \mathcal{M} $ and different initial value problems (IVPs) can be constructed based on them. In this case, FOM’s states under different input parameters can be observed.

Next, we introduce two strategies for the sampling parameter values, static parameter sampling (SPS) and dynamic parameter sampling (DPS), respectively. For SPS, to construct $ m $ IVPs, $ m $ parameter vectors will be sampled from $ \mathcal{M} $ . For the $ i $ th IVP, we have

(6)

$$ \dot{\boldsymbol{y}}(t)=\boldsymbol{f}\left(\boldsymbol{y};{\boldsymbol{\mu}}^{(i)}\right), $$

where $ {\boldsymbol{\mu}}^{(i)} $ is the $ i $ th parameter samples.

With DPS, to construct $ m $ IVPs where each IVP has $ K $ time steps, $ mK $ parameter vectors will be sampled from $ \mathcal{M} $ . Here, in this article, we use Hammersley Low Discrepancy Set (Hammersley and Handscomb, Reference Hammersley and Handscomb1964) to take samples. For the $ i $ th IVP, we have

(7)

$$ \dot{\boldsymbol{y}}(t)=\boldsymbol{f}\left(\boldsymbol{y};{\boldsymbol{\mu}}^{(i)}(t)\right), $$

where

(8)

$$ {\boldsymbol{\mu}}^{(i)}(t)=\mathrm{Interp}\left(\left\{\left({t}_0,{\boldsymbol{\mu}}^{(i0)}\right),\left({t}_1,{\boldsymbol{\mu}}^{(i1)}\right),\dots, \left({t}_K,{\boldsymbol{\mu}}^{(iK)}\right)\right\}\right). $$

In equation (8), $ \mathrm{Interp}\left(\cdot \right) $ represent a linear interpolation of the points $ {t}_j,{\mu}^{(ij)} $ . Let $ \delta t $ be the time step size, $ {t}_i={t}_0+ i\delta t $ . As we can see that DPS is $ K $ times more efficient in exploring the parameter space $ \mathcal{M} $ , which will make the constructed ROM perform better facing complex input parameters in its online phase.

2.1.2.1. Example

Here, we use an example to show the sample distributions of the SPS and DPS methods, respectively. We consider an IVP,

(9)

$$ \dot{y}=k(t)y+{e}^{a(t)}, $$

where $ t\in \left(0,1\right) $ . $ k(t) $ and $ a(t) $ are two input parameters of the system, and they are time-dependent. Their ranges are $ \left[\mathrm{0.1,1}\right] $ and $ \left[1,2\right] $ , respectively. We first use Hammersley Low Discrepancy Set to take samples in the parameter space $ \mathcal{M} $ . Then, we use the SPS and DPS methods to create the IVPs. By solving the IVPs, we obtain the solution snapshots. For each IVP, we collect 50 snapshots evenly on the time grid.

In Figures 1 and 2, we show the sample distributions in the parameter space and the solution space for the SPS and the DPS method, respectively. As observed, we can have a good sample distribution in the solution space with the SPS method. However, we have much fewer samples in the parameter space. In contrast, much more samples are taken in the parameter space with the DPS method. Nevertheless, the diversity of the sample system state is not optimal.

Figure 1. The sample distribution in the parameter space $ \mathcal{M} $ .

Figure 2. The sample distribution in the solution space.

Through this toy example, we can see the advantage and disadvantage of the conventional sampling approaches for MOR. Later in this contribution, we propose a new sampling approach to ensure good sample distribution/diversity in both the parameter space and the solution space.

2.2. ROM identification

As introduced, the second step of ML-based MOR is model identification, which means finding the governing equation in the reduced space. The reduced governing equations are expected to have the form

(10)

$$ {\dot{\boldsymbol{y}}}_{\boldsymbol{r}}(t)={\boldsymbol{f}}_{\boldsymbol{r}}\left({\boldsymbol{y}}_r;\boldsymbol{\mu} \right), $$

where $ {\boldsymbol{f}}_{\boldsymbol{r}} $ is the low-dimensional representation of $ \boldsymbol{f} $ .

Different methods can be used to achieve this. In this article, artificial neural network (ANN) and OpInf will be briefly introduced.

2.2.1. Artificial neural network

The ROM we employ in this article is a dynamic model evolving on a discrete-time grid from $ {t}_0 $ to $ {t}_{\mathrm{end}} $ with time step $ \delta t $ :

(11)

$$ \frac{{\boldsymbol{y}}_{r,i+1}-{\boldsymbol{y}}_{r,i}}{\delta t}=\boldsymbol{g}\left({\boldsymbol{y}}_{r,i},{\mu}_i\right), $$

where $ {\boldsymbol{y}}_{r,i} $ means $ {\boldsymbol{y}}_{\boldsymbol{r}}\left(t={t}_i\right) $ , and $ \boldsymbol{g}\left(\cdot \right) $ is the approximation to the unknown reduced RHS $ {\boldsymbol{f}}_{\boldsymbol{r}}\left(\cdot \right) $ .

To learn the mapping between states of two consecutive time steps, a multilayer perceptron (MLP) is used as $ \boldsymbol{g}\left(\cdot \right) $ in equation (11). In this case, the inputs to the neural network are the current state of the system $ {\boldsymbol{y}}_{r,i} $ and $ {\boldsymbol{\mu}}_i $ , and the predicted reduced state $ {\hat{\boldsymbol{y}}}_{r,i+1} $ is calculated as

(12)

$$ {\hat{\boldsymbol{y}}}_{r,i+1}={\boldsymbol{y}}_{r,i}+\delta t{\boldsymbol{g}}_{\mathrm{ee}}\left({\boldsymbol{y}}_{r,i},{\boldsymbol{\mu}}_i\right). $$

This is similar to an Explicit Euler integrator, so we denote the MLP’s function as $ {\boldsymbol{g}}_{\mathrm{ee}}\left({\boldsymbol{y}}_{r,i},{\mu}_i\right) $ . We call such a network explicit Euler neural network (EENN) (Pan and Duraisamy, Reference Pan and Duraisamy2018; Regazzoni et al., Reference Regazzoni, Dedè and Quarteroni2019; Zhuang et al., Reference Zhuang, Lorenzi, Bungartz and Hartmann2021). From equation (12), we know that an EENN is essentially a variant of residual network (He et al., Reference He, Zhang, Ren and Sun2016) which learns the increment between two system states instead of learning to map the new state directly from the old state. This leads to a potential advantage of EENNs that we can use the deeper network for approximating more complex nonlinearity without suffering too much from network degeneration.

The algorithm for training such an EENN is provided in Algorithm 1. In this contribution, all MLPs consist of one input layer, two hidden layers, and one output layer. The activation function between each two consecutive layers is rectified linear unit (ReLU) function (Glorot et al., Reference Glorot, Bordes and Bengio2011). We use Pytorch (Paszke et al., Reference Paszke, Gross, Chintala, Chanan, Yang, DeVito, Lin, Desmaison, Antiga and Lerer2017) as the platform to build and train the networks. The learning rate decay and Early Stopping (Prechelt, Reference Prechelt1998) are applied to facilitate the training.

Algorithm 1. Training of EENN

Require: $ {\mathbf{y}}_{r,i},{\boldsymbol{\mu}}_i,{\mathbf{y}}_{r,i+1} $

1: while $ \mathrm{loss}>{\mathrm{loss}}_{\mathrm{tol}} $ do

2: $ {\hat{\boldsymbol{y}}}_{r,i+1}={\boldsymbol{y}}_{r,i}+\delta t{\boldsymbol{g}}_{\mathrm{ee}}\left({\boldsymbol{y}}_{r,i},{\boldsymbol{\mu}}_i\right) $

3: $ \mathrm{loss}=\mathrm{MSE}\left({\hat{\boldsymbol{y}}}_{r,i+1},{\boldsymbol{y}}_{r,i+1}\right) $

4: Use backpropagation to update the weights and biases of the network $ {\mathrm{g}}_{\mathrm{ee}} $

5: end while

2.2.2. Operator inference

Besides the purely data-driven ROM identification method introduced in Section 2.2.1, we briefly introduce a physics-informed ROM identification method, OpInf (Peherstorfer and Willcox, Reference Peherstorfer and Willcox2016).

The OpInf method uses a hypothetical form for the ROM’s governing equation:

(13)

$$ {\dot{\boldsymbol{y}}}_r={\boldsymbol{F}}_{\boldsymbol{1}}{\boldsymbol{y}}_r+{\boldsymbol{F}}_{\boldsymbol{B}}\boldsymbol{\mu} +{\boldsymbol{F}}_{\boldsymbol{C}}+{\boldsymbol{F}}_{\boldsymbol{2}}{\boldsymbol{y}}_r\otimes {\boldsymbol{y}}_r+{\boldsymbol{F}}_{\boldsymbol{N}}\boldsymbol{\mu} \otimes {\boldsymbol{y}}_r, $$

where $ {\boldsymbol{F}}_{\boldsymbol{1}},{\boldsymbol{F}}_{\boldsymbol{2}},{\boldsymbol{F}}_{\boldsymbol{B}},{\boldsymbol{F}}_{\boldsymbol{C}} $ and $ {\boldsymbol{F}}_{\boldsymbol{N}} $ are operators to be inferred. The operation $ \otimes $ is the Kronecker product:

(14)

$$ {\boldsymbol{y}}_r\otimes {\boldsymbol{y}}_r=\left[\begin{array}{c}{y}_r^{(1)}\\ {}{y}_r^{(2)}\\ {}\vdots \\ {}{y}_r^{\left({N}_r\right)}\end{array}\right]\otimes \left[\begin{array}{c}{y}_r^{(1)}\\ {}{y}_r^{(2)}\\ {}\vdots \\ {}{y}_r^{\left({N}_r\right)}\end{array}\right]=\left[\begin{array}{c}{y}_r^{(1)}{y}_r^{(1)}\\ {}{y}_r^{(1)}{y}_r^{(2)}\\ {}\vdots \\ {}{y}_r^{(1)}{y}_r^{\left({N}_r\right)}\\ {}{y}_r^{(2)}{y}_r^{(2)}\\ {}{y}_r^{(2)}{y}_r^{(3)}\\ {}\vdots \\ {}{y}_r^{\left({N}_r\right)}{y}_r^{\left({N}_r\right)}\end{array}\right]. $$

As we see, OpInf assumes a quadratic dependency of RHS on the system state. OpInf can also be applied to systems that are not quadratic, as long as they can be mapped to a quadratic form through variable transformations (see Qian et al., Reference Qian, Kramer, Peherstorfer and Willcox2020).

To infer the unknown operators, we first construct the matrices

(15)

$$ {\displaystyle \begin{array}{c}\hskip2em {\boldsymbol{Y}}_{r,\mathrm{in}}=\left[\begin{array}{cccc}{\boldsymbol{y}}_{r,0}& {\boldsymbol{y}}_{r,1}& \dots & {\boldsymbol{y}}_{r,K-1}\end{array}\right],\\ {}{\boldsymbol{Y}}_{r,\mathrm{out}}=\left[\begin{array}{cccc}{\boldsymbol{y}}_{r,1}& {\boldsymbol{y}}_{r,2}& \dots & {\boldsymbol{y}}_{r,K}\end{array}\right],\\ {}\hskip-1em {\boldsymbol{U}}_{\mathrm{in}}=\left[\begin{array}{cccc}{\boldsymbol{\mu}}_1& {\boldsymbol{\mu}}_2& \dots & {\boldsymbol{\mu}}_K\end{array}\right],\end{array}} $$

where $ K $ is the number of time steps. Using the matrices and the assumed RHS, we can construct a least square problem,

(16)

$$ \arg \underset{\boldsymbol{O}}{\min}\parallel \boldsymbol{DO}-\boldsymbol{R}{\parallel}_2, $$

where

(17)

$$ {\displaystyle \begin{array}{l}\boldsymbol{R}=\frac{{\boldsymbol{Y}}_{r,\mathrm{out}}^T-{\boldsymbol{Y}}_{r,\mathrm{in}}^T}{\delta t},\\ {}\boldsymbol{D}=\left[\begin{array}{ccccc}{\boldsymbol{Y}}_{r,\mathrm{in}}^T& {\boldsymbol{U}}_{\mathrm{in}}^T& \boldsymbol{I}& {\left({\boldsymbol{Y}}_{r,\mathrm{in}}\otimes {\boldsymbol{Y}}_{r,\mathrm{in}}\right)}^T& {\left({\boldsymbol{U}}_{\mathrm{in}}\otimes {\boldsymbol{Y}}_{r,\mathrm{in}}\right)}^T\end{array}\right].\end{array}} $$

By solving the least square problem (equation (16)), we get the matrix $ \boldsymbol{O} $ . We can unpack the matrix $ \boldsymbol{O} $ as

(18)

$$ \boldsymbol{O}=\left[\begin{array}{ccccc}{\boldsymbol{F}}_1^T& {\boldsymbol{F}}_B^T& {\boldsymbol{F}}_C^T& {\boldsymbol{F}}_2^T& {\boldsymbol{F}}_N^T\end{array}\right]. $$

In practice, we solve the least square problem with regularization to produce a stable ROM, and for more details see McQuarrie et al. (Reference McQuarrie, Huang and Willcox2021).

3. AL for ML-Based MOR

As introduced in Section 2.1.2, for the conventional ML-based MOR approaches, snapshots are collected up-front. This could cause the problem that not all important information is captured in the prepared snapshots that are used to construct the ROM. We propose a solution based on the concept of the pool-based AL (Settles, Reference Settles2009). The solution mainly contains the following steps: (a) we first use the JSS method to prepare a large data pool containing diverse joint samples, (b) the current ROM is validated, and the validation result is used to select the samples that are considered most valuable for the current ROM, (c) new training data are generated by the FOM and added to the current training data, and (d) the ROM is updated. The corresponding steps in the whole workflow are sketched in Figure 3.

Figure 3. The workflow of the AL-based MOR.

3.1. Joint space sampling

In this section, we will introduce the new sampling method JSS. The proposed method will be used to create the data pool and the validation samples, which are necessary preparation for the AL algorithm.

3.1.1. Estimating the reduced solution space

In order to take the joint samples, we need first to estimate a reduced solution space. With the predefined parameter space $ \mathcal{M} $ , the boundaries of the reduced solution space will be estimated as follows:

1. Use SPS to create $ m $ IVPs for the FOM. In this article, Hammersley Set (Hammersley and Handscomb, Reference Hammersley and Handscomb1964) is used to take samples in the parameter space. Each IVP is integrated from time $ {t}_0 $ to time $ {t}_{\mathrm{end}} $ with $ K $ time steps.
2. Solving all the FOM problems will produce a snapshot matrix $ {\boldsymbol{Y}}_{\mathrm{estimate}}=\left[\begin{array}{cccc}{\boldsymbol{y}}^{(1)}& {\boldsymbol{y}}^{(2)}& \dots & {\boldsymbol{y}}^{\left(m\times K\right)}\end{array}\right] $ containing $ m\times K $ FOM state vectors.
3. Apply POD on $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ to construct the reduced space.
4. Project $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ onto the reduced space to get its low-dimensional representation:

(19)

$$ {\boldsymbol{Y}}_{r,\mathrm{estimate}}={\boldsymbol{V}}^T{\boldsymbol{Y}}_{\mathrm{estimate}}. $$

5. Find the minimum and the maximum entries for each POD component in $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ and denote them as:

(20)

$$ \left\{{y}_{r,\min}^{(1)},{y}_{r,\max}^{(1)},{y}_{r,\min}^{(2)},{y}_{r,\max}^{(2)},\dots, {y}_{r,\min}^{\left({N}_r\right)},{y}_{r,\max}^{\left({N}_r\right)}\right\}, $$

where $ {N}_r $ is the size of the ROM.

6. The estimated space $ \mathcal{Y} $ is:

(21)

$$ \mathcal{Y}=\left[{y}_{r,\min}^{(1)},{y}_{r,\max}^{(1)}\right]\times \left[{y}_{r,\min}^{(2)},{y}_{r,\max}^{(2)}\right]\times \dots \times \left[{y}_{r,\min}^{\left({N}_r\right)},{y}_{r,\max}^{\left({N}_r\right)}\right]. $$

In the workflow, $ m $ can be selected as any positive integer. Since the reduced basis $ \boldsymbol{V} $ computed in the step will be used as the initial reduced basis later, the greater $ m $ is, the better the initial reduced basis will be. As explained in Section 2.1.2, we can have a relatively good observation in the solution space with the SPS method. Therefore, we use the SPS-sampled state to estimate the reduced solution space. Besides, one can also use minimal intrusiveness to the full-order problem to improve the estimate. Specifically, the parameter configurations that most likely drive the system to its extreme status (boundaries of the solution space) can be included in the samples.

A joint sample $ {\boldsymbol{J}}_{\mathrm{sample}} $ taken from the joint space $ \mathcal{J}=\mathcal{Y}\times \mathcal{M} $ has the form of $ {\boldsymbol{J}}_{\mathrm{sample}}=\left[\begin{array}{cc}{\boldsymbol{y}}_{r,\mathrm{sample}}& {\boldsymbol{\mu}}_{\mathrm{sample}}\end{array}\right] $ . We can use $ \boldsymbol{J} $ , $ \boldsymbol{V} $ and the FOM solver to perform a one-step simulation whose initial condition is $ {\boldsymbol{y}}_{r,\mathrm{sample}} $ and step size is $ \delta t $ . The simulation result will form a one-step trajectory in $ \mathcal{Y} $ . We can use such joint samples to prepare the data pool and the validation samples needed for the AL algorithm.

3.1.2. Loosen and trim the estimated solution space

The joint space $ \mathcal{J} $ as created in Section 3.1 is a hyper-cubic space (see example in Figure 4a). We will loosen and trim $ \mathcal{Y} $ to make it closer to the true reduced solution space. The first step is loosening. For this purpose, we introduce an expansion ratio $ \beta $ :

(22)

$$ {\displaystyle \begin{array}{c}\Delta {y}_r^{(i)}={y}_{r,\max}^{(i)}-{y}_{r,\min}^{(i)},\\ {}{\overset{\smile }{y}}_{r,\min}^{(i)}={y}_{r,\min}^{(i)}-\beta \Delta {y}_r^{(i)},\\ {}{\overset{\smile }{y}}_{r,\max}^{(i)}={y}_{r,\max}^{(i)}+\beta \Delta {y}_r^{(i)}.\end{array}} $$

Figure 4. The evolution of the joint space.

We use $ {\overset{\smile }{y}}_{r,\min}^{(i)} $ and $ {\overset{\smile }{y}}_{r,\max}^{(i)} $ as the lower and upper limits for the loosened space $ \overset{\smile }{\mathcal{Y}} $ . In our practice, $ \beta \in \left(0,1\right) $ can be selected to loosen the initial estimation. To trim the space, we apply two conditions to a reduced state $ {\boldsymbol{y}}_r $ sampled from $ \overset{\smile }{\mathcal{Y}} $ :

(23)

$$ {\displaystyle \begin{array}{l}\operatorname{MAX}\left({\boldsymbol{Vy}}_r\right)<{y}_{\mathrm{max}},\\ {}\operatorname{MIN}\left({\boldsymbol{Vy}}_r\right)>{y}_{\mathrm{min}},\end{array}} $$

where $ \operatorname{MAX}\left(\cdot \right) $ and $ \operatorname{MIN}\left(\cdot \right) $ means the maximum and minimum entry of a vector/matrix. The value $ {y}_{\mathrm{min}} $ and $ {y}_{\mathrm{max}} $ can be either defined by empirical values of the physics quantity $ y $ or by $ \operatorname{MAX}\left({\boldsymbol{Y}}_{\mathrm{estimate}}\right) $ and $ \operatorname{MIN}\left({\boldsymbol{Y}}_{\mathrm{estimate}}\right) $ . In this article, $ \operatorname{MAX}\left({\boldsymbol{Y}}_{\mathrm{estimate}}\right) $ and $ \operatorname{MIN}\left({\boldsymbol{Y}}_{\mathrm{estimate}}\right) $ is used. We here denote the final feasible sampling space for reduced states as $ {\mathcal{Y}}^{\ast } $ , and the final joint space as $ {\mathcal{J}}^{\ast }={\mathcal{Y}}^{\ast}\times \mathcal{M} $ . A graphical example for joint space $ {\mathcal{J}}^{\ast } $ is given in Figure 4b.

Finally, we summarize how to collect $ {N}_s $ samples with the proposed JSS method in Algorithm 2.

Algorithm 2. Generation of a set of random joint samples

Require: $ \mathcal{M},{N}_s,m $ , FOM

1: Create a sample set $ M=\left\{{\boldsymbol{\mu}}_1,{\boldsymbol{\mu}}_2,\dots, {\boldsymbol{\mu}}_m\right\} $ in $ \mathcal{M} $

2: Perform SPS with $ m $ parameter samples and FOM to get $ {\boldsymbol{Y}}_{\mathrm{estimate}} $

3: Apply POD to $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ to get $ \boldsymbol{V} $ and $ {\boldsymbol{X}}_{\mathrm{estimate}} $

4: Find the upper and lower limits for each POD component

5: Create hyper-cubic solution space $ \mathcal{J} $

6: Loosen the space $ \mathcal{J} $ to get $ \overset{\smile }{\mathcal{J}} $

7: Determine $ {y}_{\mathrm{max}} $ and $ {y}_{\mathrm{min}} $

8: $ i=0,\boldsymbol{I}=\left[\right] $

9: while $ i<{N}_s $ do

10: Generate a random sample $ {\boldsymbol{y}}_{r,\mathrm{sample}} $ in $ \overset{\smile }{\mathcal{Y}} $

11: if $ \operatorname{MAX}\left({\boldsymbol{Vy}}_{r,\mathrm{sample}}\right)<{y}_{\mathrm{max}} $ and $ \operatorname{MIN}\left({\boldsymbol{Vy}}_{r,\mathrm{sample}}\right)>{y}_{\mathrm{min}} $ then

12: Generate a random sample $ {\mu}_{\mathrm{sample}} $ in $ \mathcal{M} $

$$ \hskip-41em {\boldsymbol{J}}_{\mathrm{sample}}=\left[\begin{array}{c}{\boldsymbol{\mu}}_{\mathrm{sample}}\\ {}{\boldsymbol{y}}_{r,\mathrm{sample}}\end{array}\right] $$

$$ \hskip-45.5em \boldsymbol{I}=\boldsymbol{I}\cup {\boldsymbol{J}}_{\mathrm{sample}} $$

$$ \hskip-51em i++ $$

13: end if

14: end while

15: return $ I $

3.2. ROM validation

Finding adequate metrics to validate a ROM is challenging. When the training and test data sets are collected up front, the ROM could be biased toward the data set and lead to a lack of generalization for the model. To make our data-driven ROM more reliable, we must design a more generalized validating procedure.

3.2.1. Accuracy

Here, we define the relative error of the ROM in a one-step prediction as $ e $ :

(24)

$$ e=\frac{\left\Vert \boldsymbol{y}-\hat{\boldsymbol{y}}\right\Vert }{\left\Vert \boldsymbol{y}\right\Vert }. $$

We furthermore define ROM error as $ e $ . However, the ROM is very likely to be used for a long-term prediction and such a one-step accuracy still cannot reflect the performance of the ROM. Therefore, we will introduce the confidence of the ROM.

3.2.2. Confidence

If the ROM error is $ \tau $ at confidence level $ \eta $ , we can construct such an inequality:

(25)

$$ \Pr \left(e\le \tau \right)>\eta, $$

that is, the probability that the error $ e $ smaller than $ \tau $ is greater than $ \eta $ , where $ \eta $ and $ \tau $ is a pair of confidence and accuracy indicator that we are seeking.

We assume we have $ s $ independent tests in the validator. Then we can calculate the observed confidence $ p\left({\tau}^{\ast}\right) $ of a given ROM error $ {\tau}^{\ast } $ using:

(26)

$$ {\displaystyle \begin{array}{l}p\left({\tau}^{\ast}\right)=\frac{1}{s}\sum \limits_{i=1}^sL\left({e}_i\le {\tau}^{\ast}\right),\\ {}L\left({e}_i\le {\tau}^{\ast}\right)=\left\{\begin{array}{cc}1& {e}_i\le {\tau}^{\ast}\\ {}0& {e}_i>{\tau}^{\ast}\end{array}\right..\end{array}} $$

However, we can only include a limited number of independent tests into the validator. This means, with a given $ {\tau}^{\ast } $ , the $ p\left({\tau}^{\ast}\right) $ computed from the validator should also have deviation $ \epsilon $ and its corresponding confidence $ 1-\sigma $ from the true confidence $ \hat{p}\left({\tau}^{\ast}\right) $ . Thus, we obtain another inequality:

(27)

$$ \Pr \left(|p\left({\tau}^{\ast}\right)-\hat{p}\left({\tau}^{\ast}\right)|<\epsilon \right)>1-\sigma, $$

where $ \hat{p}\left({\tau}^{\ast}\right) $ is the true confidence. If we predefine the deviation $ \epsilon $ and the confidence $ 1-\sigma $ of the validator, the smallest number of the independent tests needed by the validator can be computed using Chernoff’s inequality (Alippi, Reference Alippi2016):

(28)

$$ s\ge {s}_{\mathrm{PAC}}=\frac{1}{2{\epsilon}^2}\mathit{\ln}\left(\frac{2}{\sigma}\right). $$

Let $ {\overline{\tau}}^{\ast } $ be the smallest $ {\tau}^{\ast } $ satisfying:

(29)

$$ p\left({\overline{\tau}}^{\ast}\right)=1. $$

Since Inequality 27 holds for all $ {\tau}^{\ast } $ , it also holds for $ {\overline{\tau}}^{\ast } $ :

(30)

$$ \Pr \left(|1-\hat{p}\left({\overline{\tau}}^{\ast}\right)|<\epsilon \right)>1-\sigma . $$

That is the inequality:

(31)

$$ \mid 1-\hat{p}\left({\overline{\tau}}^{\ast}\right)\mid <\epsilon $$

holds with the probability $ 1-\sigma $ . We further get:

(32)

$$ 1-\epsilon <\hat{p}\left({\overline{\tau}}^{\ast}\right)<1 $$

holds with probability $ 1-\sigma $ .

Finally, the ROM error $ \tau $ and its corresponding confidence $ \eta $ are therefore $ {\overline{\tau}}^{\ast } $ and $ 1-\epsilon $ , respectively. Since Inequality 32 holds with probability $ 1-\sigma $ and $ {S}_{\mathrm{PAC}} $ is negatively correlative to $ \sigma $ , we know that the more test samples we have, the more reliable the computed true confidence will be.

We denote such a validator as:

(33)

$$ p\left({\tau}^{\ast}\right)=\mathrm{PAC}\left({\tau}^{\ast };\mathrm{ROM}\right). $$

We can predefine a variable $ {\overline{\tau}}_{\mathrm{design}}^{\ast } $ and compute $ p\left({\overline{\tau}}_{\mathrm{design}}^{\ast}\right) $ in iterations for convergence check.

3.3. Error estimator

The ROM will go through all tests in the validator constructed as described in Section 3.2. We denote the ROM error snapshots as:

(34)

$$ \boldsymbol{e}=\left[\begin{array}{cccc}{e}^{(1)}& {e}^{(2)}& \dots & {e}^{(s)}\end{array}\right], $$

where $ {e}^{(i)} $ is $ e $ computed in the $ i $ th test. We assume there is a function:

(35)

$$ {e}^{(i)}=\mathrm{\mathcal{E}}\left({\boldsymbol{J}}^{(i)}\right). $$

Such a function can be used as the error estimator to estimate the ROM error with a new input $ {\boldsymbol{J}}^{\ast } $ which is excluded from the test inputs $ \boldsymbol{I}=\left\{{\boldsymbol{J}}^{(1)},{\boldsymbol{J}}^{(2)},..,{\boldsymbol{J}}^{(s)}\right\} $ . Here, we use GPR (Williams and Rasmussen, Reference Williams and Rasmussen1996) for interpolation, and the input of the interpolator is the joint input $ \boldsymbol{J} $ of the ROM and the output is the estimated error. In our problem, our training dataset is the test data in the PAC validator:

(36)

$$ \boldsymbol{D}=\left(\boldsymbol{I},\boldsymbol{e}\right)=\left\{\left({\boldsymbol{J}}^{(i)},{e}^{(i)}\right)|i=1,2,\dots, s\right\}. $$

According to the theory of GPR, the predicted ROM error $ {\hat{e}}^{\ast } $ for the new joint input $ {\boldsymbol{J}}^{\ast } $ is:

(37)

$$ {\hat{e}}^{\ast }=\boldsymbol{K}\left({\boldsymbol{J}}^{\ast },\boldsymbol{I}\right)\boldsymbol{K}{\left(\boldsymbol{I},\boldsymbol{I}\right)}^{-1}\boldsymbol{e}, $$

where $ \boldsymbol{K} $ is a covariance matrix whose entries can be computed by a preselected covariance function $ \kappa \left({\boldsymbol{J}}^{\ast },{\boldsymbol{J}}_i\right) $ . Here we use radial-basis function (RBF), which is also called squared-exponential function or Gaussian kernel and is one of the most popular choices for the covariance function:

(38)

$$ \kappa \left({\boldsymbol{J}}^{\ast },{\boldsymbol{J}}^{(i)}\right)={\sigma}^2\exp \left(-\frac{1}{2{l}^2}{\left\Vert {\boldsymbol{J}}^{\ast }-{\boldsymbol{J}}^{(i)}\right\Vert}^2\right), $$

where the amplitude $ \sigma $ and the lengthscale $ l $ are the hyperparameters which can be tuned. RBF is appropriate when the simulated function is expected to be smooth.

Besides GPR, other interpolators or approximators, such as Spline Interpolation (Habermann and Kindermann, Reference Habermann and Kindermann2007), Radial-basis-function Interpolation (Broomhead and Lowe, Reference Broomhead and Lowe1988), and ANN can be used to construct the estimator.

3.4. AL algorithm

Until now, the necessary theory and components of the AL algorithm have been fully introduced. In this section, we summarize the workflow algorithmically.

First, we need to prepare the initial data pool $ {\boldsymbol{I}}_{\mathrm{all}} $ , the PAC validator containing $ {\boldsymbol{I}}_{\mathrm{PAC}} $ and $ {\boldsymbol{O}}_{\mathrm{PAC}} $ using Algorithm 2 and the FOM. With the data pool and the PAC validator on hands, the AL algorithm can be designed as Algorithm 3. Besides the algorithm, the flow chart of the proposed workflow is given in Figure 5.

Algorithm 3. AL for the ROM

Require: $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ , $ \boldsymbol{V} $ , $ {\overline{\tau}}_{\mathrm{design}}^{\ast } $ , $ \Delta {\overline{\tau}}_{\mathrm{tol}}^{\ast } $ , $ {\boldsymbol{I}}_{\mathrm{all}} $ , $ \mathrm{PAC}\left({\tau}^{\ast },\mathrm{ROM}\right) $ , $ {\boldsymbol{I}}_{\mathrm{PAC}} $ , $ {\boldsymbol{O}}_{\mathrm{PAC}} $

1: Define the snapshot increment $ \Delta s $

2: Randomly select $ \Delta s $ samples from $ {\boldsymbol{I}}_{\mathrm{all}} $ as the initial training input samples $ {\boldsymbol{I}}_{\mathrm{train}} $ and generate the training output samples $ {\boldsymbol{O}}_{\mathrm{train}} $ using the FOM solver

3: Train the ROM using $ {\boldsymbol{I}}_{\mathrm{train}} $ , $ {\boldsymbol{O}}_{\mathrm{train}} $ and $ \boldsymbol{V} $

4: $ p\left({\overline{\tau}}_{\mathrm{design}}^{\ast}\right)=\mathrm{PAC}\left({\overline{\tau}}_{\mathrm{design}}^{\ast };\mathrm{ROM}\right) $ and $ {\overline{\tau}}_{\mathrm{old}}^{\ast }={\overline{\tau}}^{\ast } $

5: Compute the error snapshots $ \boldsymbol{e} $ using $ {\boldsymbol{O}}_{\mathrm{PAC}} $

6: while $ p\left({\overline{\tau}}_{\mathrm{design}}^{\ast}\right)<1 $ do

7: Fit the GPR estimator $ \mathrm{\mathcal{E}} $ using $ \boldsymbol{e} $ and $ {\boldsymbol{I}}_{\mathrm{PAC}} $

8: Evaluate $ \mathrm{\mathcal{E}} $ on $ {\boldsymbol{I}}_{\mathrm{all}}\backslash {\boldsymbol{I}}_{\mathrm{train}} $ , denote the estimation as $ {\boldsymbol{e}}_{\mathrm{pool}} $

9: Find $ \Delta s $ largest $ {e}_i $ and their corresponding input samples $ \Delta \boldsymbol{I}=\left\{{\boldsymbol{J}}_{\mathrm{max}}^{(1)},{\boldsymbol{J}}_{\mathrm{max}}^{(2)},\dots, {\boldsymbol{J}}_{\mathrm{max}}^{\left(\Delta s\right)}\right\} $

10: $ {\boldsymbol{I}}_{\mathrm{train}}={\boldsymbol{I}}_{\mathrm{train}}\cup \Delta \boldsymbol{I} $

11: Update $ {\boldsymbol{O}}_{\mathrm{train}} $ using the enriched $ {\boldsymbol{I}}_{\mathrm{train}} $ and the FOM

12: Update $ \boldsymbol{V} $ using $ {\boldsymbol{Y}}_{\mathrm{estimate}}\cup {\boldsymbol{O}}_{\mathrm{train}} $

13: Train the ROM using $ {\boldsymbol{I}}_{\mathrm{train}} $ , $ {\boldsymbol{O}}_{\mathrm{train}} $ and $ \boldsymbol{V} $

14: $ p\left({\overline{\tau}}_{\mathrm{design}}^{\ast}\right)=\mathrm{PAC}\left({\overline{\tau}}_{\mathrm{design}}^{\ast };\mathrm{ROM}\right) $ and compute $ {\overline{\tau}}^{\ast } $

15: Update the error snapshots $ \boldsymbol{e} $

16: $ {\overline{\tau}}_{\mathrm{old}}^{\ast }={\overline{\tau}}^{\ast } $

17: if $ \mid {\overline{\tau}}^{\ast }-{\overline{\tau}}_{\mathrm{old}}^{\ast}\mid <\Delta {\overline{\tau}}_{\mathrm{tol}}^{\ast } $ then

18: Break

19: end if

20: end while

Figure 5. The flow charts of the preparation and iteration phase of the AL-based MOR. The line numbers in (b) match those in Algorithm 3.

Regarding Algorithm 3, we would like to clarify that the snapshots in $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ are not used for the model identification step, only for the construction of the initial projection matrix $ \boldsymbol{V} $ . The neural network is therefore trained exclusively with one-step trajectories. We notice that the inclusion of $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ does not improve the ROM’s quality but considerably increases the training time for the neural network.

As seen in the algorithm, the reduced basis $ \boldsymbol{V} $ is recomputed during the iterations. In principle, this should influence the estimate of the reduced solution space and require rebuilding the PAC validator using the FOM solver. However, in practice, the initial basis $ \boldsymbol{V} $ produced by $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ should already have a relatively low projection error. The refined reduced basis only slightly changes the boundary of the reduced solution space. We can still use $ \mathcal{Y} $ as the space for sampling the reduced state.

In Algorithm 3, we see another convergence check $ \mid {\overline{\tau}}^{\ast }-{\overline{\tau}}_{\mathrm{old}}^{\ast}\mid <\Delta {\overline{\tau}}_{\mathrm{tol}}^{\ast } $ besides $ p\left({\overline{\tau}}_{\mathrm{design}}^{\ast}\right) $ . Additionally, we check the improvement for the best ROM accuracy between two iterations. If the difference between them is already smaller than the predefined value, the algorithm will be considered to be in a low-efficiency status and should be stopped. Since there are many factors influencing the ROM accuracy other than the amount of training data, for example, ROM structure, ROM dimension, and the way of collecting training data, other possible modification should be considered and adopted to improve the ROM accuracy.

4. Numerical Examples

In this section, three numerical models will be used to evaluate the performance of the proposed algorithm against the conventional ML-based MOR. We integrate ANN-MOR and OpInf-MOR into the proposed method. The gained improvement is shown respectively. In the rest of the article, we use the following notation:

• ML-ROM: the ROM is constructed in the conventional workflow of ML-based MOR. The training data are collected by the DPS method, and the ROM is constructed by either POD + ANN or POD + OpInf.
• AL-ROM: the ROM is constructed by the proposed AL algorithm in Algorithm 3, where a large data pool is firstly prepared by the JSS method, then the algorithm selectively generates the training data. The ROM is constructed by either POD + ANN or POD + OpInf.

4.1. 2-D linear thermal block

The governing equation of the test system is:

(39)

$$ {\displaystyle \begin{array}{l}\frac{\partial T\left(x,y,t\right)}{\partial t}={k}_x\frac{\partial^2T\left(x,y,t\right)}{\partial {x}^2}+{k}_y\frac{\partial^2T\left(x,y,t\right)}{\partial {y}^2},\\ {}T\left(0,y,t\right)={T}_{\mathrm{left}},\hskip2em T\left(10,y,t\right)={T}_{\mathrm{right}},\\ {}T\left(x,-10,t\right)={T}_{\mathrm{down}},\hskip2em T\left(x,10,t\right)={T}_{\mathrm{up}},\\ {}T\left(x,y,0\right)=20,\end{array}} $$

where $ x\in \left(0,10\right),y\in \left(-10,0\right),t\in \left(0,2\right] $ and the static parameters are given by $ {k}_x={k}_y=1 $ . The controllable parameters $ {T}_{\mathrm{left}},{T}_{\mathrm{right}},{T}_{\mathrm{down}} $ and $ {T}_{\mathrm{up}} $ ( $ {}^{\mathrm{o}}\mathrm{C} $ ) have all the same feasible range $ \left[20,1,000\right] $ . Thus, for this problem $ \mathcal{M}=\left[20,1,000\right]\times \left[20,1,000\right]\times \left[20,1,000\right]\times \left[20,1,000\right] $ . The full-order model is discretized by the finite difference method. The square region is divided into $ 50\times 50=2,500 $ cells. We consider 100-time steps uniformly distributed.

For the AL algorithm, we first need to estimate the reduced solution space. Thus, $ m=20 $ samples are taken from $ \mathcal{M} $ . The 20 constant-input IVPs are constructed with the sampled parameters. $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ has totally 2,020 state vectors. The initial reduced space is constructed by 20 POD basis using $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ . The low-dimensional projection of those state vectors is then used to perform the space estimate. After estimating, loosening with $ \beta =10\% $ and trimming, we create the joint space $ {\mathcal{J}}^{\ast }={\mathcal{Y}}^{\ast}\times \mathcal{M} $ and use random sampling to get 40,000 joint samples for $ {\boldsymbol{I}}_{\mathrm{all}} $ .

We define $ \varepsilon =3\% $ and $ \sigma =1\% $ for the validation step. Based on equation (28), we know 2,944 samples are needed by the PAC validator. The confidence level that will be investigated is $ {\eta}_{\mathrm{design}}=97\% $ . We pick $ {\overline{\tau}}_{\mathrm{design}}^{\ast }=1\% $ and $ \Delta {\overline{\tau}}_{\mathrm{tol}}^{\ast }=0.01\% $ . In conclusion, we are looking for a ROM whose $ 97\% $ -confident ROM error is $ 1\% $ .

4.1.1. AL for ANN-ROM

In this section, we present the investigation for enhancing the ANN-MOR with our methods. The ROM has an architecture of EENN, whose MLP has [24, 80, 80, 20] neurons.

As seen in Table 1, the AL algorithm does not converge to the prescribed accuracy after 10 iterations. However, it meets the other stopping criteria $ \Delta {\overline{\tau}}_{\mathrm{tol}}^{\ast}\le 0.01\% $ . As we described before, when $ \Delta {\overline{\tau}}_{\mathrm{tol}}^{\ast}\le 0.01\% $ , we consider that continuing the iteration will not refine the ROM efficiently.

Table 1. PAC scores in the AL algorithm.

Note. The ROM is constructed with POD + ANN.

For the conventional workflow, we use DPS to create 50 IVPs, and they produce 50 solution trajectories. A total of 101 outputs from each IVP can build 100 input–output samples, therefore, we totally have 5,000 training samples. To construct the POD space, besides the sampled 5,000 snapshots, we also include $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ .

Once the ROMs are built, we use another PAC validator to compute the $ 97\% $ -confident ROM error of the AL-ROM and ML-ROM. The results are given in Table 2.

Table 2. $ 97\% $ -confident ROM error ( $ {\overline{\tau}}^{\ast } $ ) by another PAC validator.

Note. The ROMs are constructed with POD + ANN.

It is observed that with the same amount of training samples, the two performance indicators $ {\overline{\tau}}^{\ast } $ and $ p\left({\overline{\tau}}_{\mathrm{design}}^{\ast}\right) $ of the conventional ML ROM are both much worse than the ROM trained in the AL algorithm. Based on the PAC theory, the ROM constructed with the conventional approach is not expected to have a better performance than the AL-ROM.

To further validate the ROMs, we designed two additional test cases. In Test 1, four boundary temperatures are described by four different time-dependent functions, respectively. The function curves are given in Figure 6. We pick four different elements as the observation positions. In Figure 7a,b, we present the predicted trajectories by the compared ROMs. In the trajectory figures, it is observed that both ROMs’ predictions are very close to the reference. In Figure 8, we show the error fields at the final time step. To compute the error field, we first lift the reduced solution to the full space using $ \boldsymbol{V} $ and calculate the absolute error between the lifted solution and the FOM solution. We can see that the AL-ROM’s error is lower overall than ML-ROM’s.

Figure 6. Time-dependent inputs to the 2-D thermal problem for Test 1.

Figure 7. Linear thermal block: the FOM and ROM solution in Test 1. The ROM is constructed with POD + ANN.

Figure 8. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 1. Left: AL. Right: ML. The ROMs are constructed with POD + ANN.

In Test 2, the four boundary temperatures are assigned with the same constant value $ {T}_{\mathrm{left}}={T}_{\mathrm{right}}={T}_{\mathrm{down}}={T}_{\mathrm{up}}=1,{000}^{\mathrm{o}}\mathrm{C} $ . The trajectories are presented in Figure 9a,b. The error fields are shown in Figure 10. In Test 2, it is observed that the trajectories predicted by the conventional ML ROM have a large difference from the reference, while the AL-ROM’s prediction still matches the reference. If we check the error fields at the last time step, the AL-ROM’s advantage is even more remarkable. Here a possible reason can be explained in Figure 11.

Figure 9. Linear thermal block: the FOM and ROM solution in Test 2. The ROMs are constructed with POD + ANN.

Figure 10. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 2. Left: AL. Right: ML. The ROMs are constructed with POD + ANN.

Figure 11. The trajectories of the first POD coefficient formed by: random-input samples (blue), extreme-input samples (green), and mean-input samples (dashed red). Here the extreme inputs are $ \left[1,000,1,000,1,000,1,000\right] $ for the upper green curve and $ \left[\mathrm{20,20,20,20}\right] $ for the lower green curve. The mean input is $ \left[\mathrm{510,510,510,510}\right] $ .

In Figure 11, we present the time evolution of the first POD coefficient of $ {\boldsymbol{y}}_r $ . We create an IVP with the maximum inputs $ {T}_{\mathrm{left}}={T}_{\mathrm{right}}={T}_{\mathrm{down}}={T}_{\mathrm{up}}=1,000 $ . Similarly, another trajectory is produced with minimal input parameters $ {T}_{\mathrm{left}}={T}_{\mathrm{right}}={T}_{\mathrm{down}}={T}_{\mathrm{up}}=20 $ . These trajectories are marked with green solid lines in Figure 11. These two trajectories are called “extreme trajectories.” Since our full-order problem is continuous, it is reasonable to assume that any trajectory produced by any parameter combination in $ \mathcal{M} $ should be between these two green boundaries. Then we pick a parameter combination $ \left[\mathrm{510,510,510,510}\right] $ , which can be considered as the barycenter of $ \mathcal{M} $ , and the corresponding red-dashed trajectory is called “barycentric trajectory.” Additionally, we plot the trajectories of the DPS-samples in light blue color. It is observed that these trajectories form a thin band whose center is approximately the barycentric trajectory. This means that assigning randomized parameter combinations to the IVP does not produce a wide distribution of the solution trajectory. A result of this narrow distribution is that the ROM will perform badly beyond the barycentric trajectory. This is reflected in Test 2. In the AL workflow, different $ {\boldsymbol{y}}_r $ are randomly sampled in the estimated space $ {\mathcal{Y}}^{\ast } $ and used as the initial states for the one-step snapshots. This distributes the training samples in a wider region in the reduced solution space, and the ROM trained by such a dataset should also be more reliable.

4.1.2. AL for OpInf

In this section, we perform the same tests as in Section 4.1.1, except that we use OpInf instead of ANN. The convergence of the AL algorithm is shown in Table 3. The algorithm uses 150 training samples for convergence, which is remarkably fewer than those needed for constructing the ANN-ROM. Besides, according to Table 4, the OpInf-ROM constructed using DPS-sampled data also has a confident ROM error that is very close to the OpInf-ROM built by the AL algorithm.

Table 3. PAC scores in the AL algorithm.

Note. The ROM is constructed with POD + OpInf.

Table 4. $ 97\% $ -confident ROM error ( $ {\overline{\tau}}^{\ast } $ ) by another PAC validator.

Note. The ROMs are constructed with POD + OpInf.

We show the same trajectory comparison for the OpInf-ROM. In Figures 12a,b and 14a,b, we can see that both ROMs have similar performance, even for Test 2 where the maximal input parameters are used in the online prediction. In the error fields presented in Figures 13 and 15, it is observed that the overall accuracy of the OpInf-ROM constructed by the AL algorithm is still better. But the performance of the OpInf-ROM constructed by the conventional approach is remarkably improved compared to the ANN-ROM, especially while predicting with the maximal parameters.

Figure 12. Linear thermal block: the FOM and ROM solution in Test 1. The ROMs are constructed with POD + OpInf.

Figure 13. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 1. Left: AL. Right: ML. The ROMs are constructed with POD + OpInf.

Figure 14. The FOM and ROM solution in Test 2. The ROMs are constructed with POD + OpInf.

Figure 15. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 2. Left: AL. Right: ML. The ROMs are constructed with POD + OpInf.

Through this experiment, we draw two conclusions. The first conclusion is that no matter with what kind of training data, building an OpInf-ROM with the same confident ROM error requires less training data compared to building an ANN-ROM. The second conclusion is that the OpInf-ROMs constructed in both ways (the proposed AL algorithm and the conventional approach) have similar qualities. This can be confirmed by both case-independent and case-dependent validation.

These conclusions match OpInf’s feature of being physics-informed. While designing the architecture of the OpInf model, we have a hypothetical form for the governing differential equation of the FOM. By employing the hypothetical form, we reduce the flexibility of the ROM, which enables us to use fewer data to fit the ROM. Besides, the physics information restricts the extrapolated prediction, which is why even using the DPS-sampled data, the OpInf-ROM can still have good accuracy in Test 2.

4.2. Nonlinear RC ladder

The second test model is a nonlinear RC circuit model. This benchmark model is provided by The MORwiki Community (2018). It’s different variants are widely used to test MOR methods, such as in Chen (Reference Chen1999), Rewienski (Reference Rewienski2003), Gu (Reference Gu2011), and Kawano and Scherpen (Reference Kawano and Scherpen2019). To obtain the MATLAB code, readers should refer to Model 1 in The MORwiki Community (2018).

The governing equation is the same as equation (32) in Regazzoni et al. (Reference Regazzoni, Dedè and Quarteroni2019):

(40)

$$ \left(\begin{array}{l}{\dot{v}}_1(t)=-2{v}_1(t)+{v}_2(t)+2-{e}^{40{v}_1(t)}-{e}^{40\left({v}_1(t)-{v}_2(t)\right)}+u(t)\\ {}{\dot{v}}_i(t)=-2{v}_i(t)+{v}_{i-1}(t)+{v}_{i+1}(t)+2+{e}^{40\left({v}_{i-1}(t)-{v}_i(t)\right)}-{e}^{40\left({v}_i(t)-{v}_{i+1}(t)\right)},\hskip0.5em \mathrm{for}\hskip0.5em i=2,3,\dots, N-1\\ {}{\dot{v}}_N(t)=-{v}_N(t)+{v}_{N-1}(t)-1+{e}^{40\left({v}_{N-1}(t)-{v}_N(t)\right)}.\end{array}\right. $$

We assume there are $ N=128 $ resistors in the circuit. Therefore, the size of the FOM is 128. The range of the input signal $ u(t) $ is $ \mathcal{M}=\left[0,1\right] $ . To prepare the training data, the system is integrate in the time span $ t\in \left(0,1\right] $ with forward Euler method, where the time step $ \delta t=2E-3 $ .

We create 5 IVPs with the SPS method to generate $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ and use $ {\boldsymbol{Y}}_{\mathrm{estimate}} $ to create the initial POD space with $ {N}_r=15 $ . While using the JSS method to prepare the data pool $ {I}_{\mathrm{all}} $ , we use $ \beta =0\% $ . There are 20,000 joint samples in $ {I}_{\mathrm{all}} $ .

We define $ \epsilon =3\% $ and $ \sigma =1\% $ and 2,944 additional samples are collected for the PAC validator. The investigated confidence level is $ {\eta}_{\mathrm{design}}=97\% $ .

4.2.1. AL for ANN

For both ANN-based ROMs, we use an MLP with [16, 64, 64, 15] neurons. The AL-ROM’s convergence is presented in Table 5. The AL algorithm converges while 700 training samples are used. According to the PAC scores in Table 6, the $ 97\% $ -confident ROM error of the AL-ROM is $ 1.00\% $ . The $ 97\% $ -confident ROM error of the ML-ROM is $ 3.07\% $ , which is lower than the AL-ROM. This result tells us that the AL-ROM is assessed to be generally more accurate.

Table 5. PAC scores in the AL algorithm.

Note. The ROM is constructed with POD + ANN. $ {\overline{\tau}}_{\mathrm{design}}^{\ast }=1\%. $

Table 6. $ 97\% $ -confident ROM error ( $ {\overline{\tau}}^{\ast } $ ) by another PAC validator.

Note. The ROMs are constructed with POD + ANN.

Then we test the ROMs with a specific test case. In the first test case, the input signal of the system is $ \mu (t)=0.5\left(\cos \left(2\pi t\right)+1\right) $ , which is the same as the test input used in Regazzoni et al. (Reference Regazzoni, Dedè and Quarteroni2019). We monitor the variable $ {v}_0(t) $ of the system and present the comparison between the FOM and ROM solution in Figure 16.

Figure 16. Nonlinear RC ladder: the FOM and ROM solution in the test case, where $ \mu (t)=0.5\left(\cos \left(2\pi t\right)+1\right) $ . The ROMs are constructed with POD + ANN.

4.2.2. AL for OpInf

The AL-ROM’s convergence is presented in Table 7. The AL algorithm converges while 50 training samples are used, which is much fewer than required by the ROM constructed with POD + ANN. In Table 8, the $ 97\% $ -confident ROM error of the AL-ROM and ML-ROM is $ 0.50 $ and $ 2.07\% $ , respectively. Based on the results, we expect the AL-ROM to perform generally better than the ML-ROM.

Table 7. PAC scores in the AL algorithm.

Note. The ROM is constructed with POD + OpInf. $ {\overline{\tau}}_{\mathrm{design}}^{\ast }=0.5\% $ .

Table 8. $ 97\% $ -confident ROM error ( $ {\overline{\tau}}^{\ast } $ ) by another PAC validator.

Note. The ROMs are constructed with POD + OpInf.

The same test case is used to further validate the ROMs. The corresponding results are presented in Figure 17. A similar comparison between the AL-ROM and ML-ROM is observed. In the test case, the OpInf-ROM constructed by the AL algorithm has almost the same transient response as the FOM.

Figure 17. Nonlinear RC ladder: the FOM and ROM solution in test case. The ROMs are constructed with POD + OpInf.

Through this numerical model, we see similar results to the linear model. To be specific, both types of the ROMs are benefited from the AL algorithm. Between them, the ANN-MOR method obtains more significant improvement as a purely data-driven method. The performance of the OpInf-ROM is also improved, but the difference is not as remarkable as the ANN-ROM.

4.3. 3-D vacuum furnace model

In Figure 18a, we present the 3-D simulation model of a vacuum radiation furnace (Hao et al., Reference Hao, Gu, Chen, Zhang and Zuo2008). The model was created and simulated using NX Simcenter 3D. The original FEM model has 20,209 elements. The dark gray part represents the outer case of the furnace. The red parts are heaters, for which there are six in total. The blue component between the heaters and the case is the protecting shell. The cubic area at the center is the workzone where material can be placed. Since the internal space of the furnace is a vacuum, during the industrial process, the material placed in the furnace will only be heated up by the thermal radiation from the heaters. The discrete governing equation of the system can be written as:

(41)

$$ \boldsymbol{E}\frac{\partial \boldsymbol{T}}{\partial t}=\boldsymbol{KT}+{\boldsymbol{A}}_{\mathrm{cv}}\left(\alpha \right)\boldsymbol{T}+{\boldsymbol{RT}}^4+\boldsymbol{Bu}, $$

where $ \boldsymbol{E} $ is the thermal capacity matrix, $ \boldsymbol{K} $ is the thermal conductivity matrix, $ {\boldsymbol{A}}_{\mathrm{cv}}\left(\alpha \right) $ is the heat convection matrix and $ \alpha $ is the convection coefficient, $ \boldsymbol{R} $ is the radiation matrix, and $ \boldsymbol{B} $ is the parameter-distribution matrix. The state (temperature) vector of this system is $ \boldsymbol{T}\in {\mathrm{\mathbb{R}}}^N $ where $ N=\mathrm{20,209} $ . We consider there are seven controllable parameters in the system: six heating power of the heaters ( $ \mathrm{kW}/{\mathrm{m}}^3 $ ) and the convection coefficient ( $ \mathrm{W}\;{\mathrm{m}}^{-2}{\mathrm{K}}^{-1} $ ) of the outer surface of the case, which are stored in the parameter vector $ \boldsymbol{u} $ . Therefore, we have the vector $ \boldsymbol{u}\in {\mathrm{\mathbb{R}}}^{N_u} $ with $ {N}_u=7 $ . The parameter space is predefined to be:

(42)

$$ {\displaystyle \begin{array}{c}\mathcal{M}=\left[0,{10}^4\right]\times \left[0,{10}^4\right]\times \left[0,{10}^4\right]\times \left[0,{10}^4\right]\\ {}\times \left[0,{10}^4\right]\times \left[0,{10}^4\right]\times \left[\mathrm{5,100}\right].\end{array}} $$

Figure 18. The vacuum furnace model.

Based on physical consideration, the time grid is from 0 to 45,000 s with 450 time steps.

For the AL algorithm, $ m=20 $ samples are taken from $ \mathcal{M} $ to estimate the boundaries of $ \mathcal{Y} $ . We also include two special parameter configurations:

(43)

$$ {\displaystyle \begin{array}{l}{\boldsymbol{\mu}}_{T_{\mathrm{min}}}=\left[\mathrm{0,0,0,0,0,0,100}\right],\\ {}{\boldsymbol{\mu}}_{T_{\mathrm{min}}}=\left[{10}^4,{10}^4,{10}^4,{10}^4,{10}^4,{10}^4,5\right].\end{array}} $$

The initial reduced space is constructed by $ {N}_r=22 $ POD basis. Then we create the joint space $ {\mathcal{J}}^{\ast }={\mathcal{Y}}^{\ast}\times \mathcal{M} $ , where $ \beta =10\% $ is used for loosening. We use random sampling to get 40,000 joint samples for $ {\boldsymbol{I}}_{\mathrm{all}} $ . We define $ \epsilon =3\% $ and $ \sigma =1\% $ and 2,944 additional samples are collected for the PAC validator. The investigated confidence level is $ {\eta}_{\mathrm{design}}=97\% $ . We pick $ {\overline{\tau}}_{\mathrm{design}}^{\ast }=1\% $ and $ \Delta {\overline{\tau}}_{\mathrm{tol}}^{\ast }=0.01\% $ . In conclusion, we are looking for a ROM whose $ 97\% $ -confident ROM error is $ 1\% $ .

4.3.1. AL for ANN

The AL-ROM’s convergence is presented in Table 9. We construct ROMs with our method (AL) and the conventional method (ML) with the same number of training samples (20,000). The ROM errors are $ 1.00\% $ (AL) and $ 13.07\% $ (ML) at the investigated confidence level ( $ 97\% $ ). Both ROMs have an MLP with [29, 80, 80, 22] neurons (Table 10).

Table 9. PAC scores in the AL algorithm.

Note. The ROM is constructed with POD + ANN. $ {\overline{\tau}}_{\mathrm{design}}^{\ast }=1\%. $

Table 10. $ 97\% $ -confident ROM error ( $ {\overline{\tau}}^{\ast } $ ) by another PAC validator.

Note. The ROMs are constructed with POD + ANN.

Two test scenarios are employed to show the ROMs’ performance in relatively realistic conditions. In the first test, all heaters are working with a 3-stage heating profile (Figure 19) and the convection coefficient is set to be $ 50\;\mathrm{W}\;{\mathrm{m}}^{-2}{\mathrm{K}}^{-1}\Big) $ .

Figure 19. The heat generation for the first test case. The strategy is applied to all six heaters.

In this test case, we monitor the temperature trajectories on a heater and in the workzone.

To show the error for the whole temperature field, we compute the absolute errors for all 20,209 elemental temperature individually. Four statistical measures of the absolute errors, that is, mean, standard deviation (std.), minimum and maximum, are used to reflect the error distribution at the final time point. The results are shown in Table 11.

Table 11. Statistical analysis for the error field ( $ \Delta T(t),t\in \left(0,45,000\right] $ ) for the 3-stage heating profile.

Note. The ROMs are constructed with POD + ANN.

In another test case, we use the same heating profile for four heaters and turn off two heaters (Figure 18b). This simulates a scenario where two heaters have failed during the industrial process. The existence of failed heaters will cause a nonuniform temperature field in the workzone. We call them cold side and hot side, respectively. In this test, we monitor the temperature trajectories of the cold side and the hot side using the ROMs. The corresponding comparison between ROM-predicted trajectories and FOM-predicted trajectories in two test cases is given in Figure 21 and Table 12.

Table 12. Statistical analysis for the error field ( $ \Delta T(t),t\in \left(0,45,000\right] $ ) for the heating profile with failure.

Note. The ROMs are constructed with POD + ANN.

As observed in both trajectory comparison in Figures 20 and 21, the ROM constructed by the AL algorithm has a better agreement with the reference result. Moreover, in Tables 11 and 12, all statistical measurement of the AL-ROM’s error fields are better than the ML-ROM’s. These facts tell us the proposed method significantly improves the performance of the ANN-ROM in this nonlinear thermal system.

Figure 20. Comparison between the NX solution and the ROMs’ prediction for the 3-stage heating profile. The ROMs are constructed with POD + ANN.

Figure 21. Comparison between NX solution and ROMs’ prediction for the heating profile with failure. The ROMs are constructed with POD + ANN.

4.3.2. AL for OpInf

In this section, the results of constructing the OpInf-ROM with the AL algorithm and the conventional approach are given. The confident ROM error of the OpInf-ROM successfully converges to $ 1.00\% $ after five iterations. The OpInf-ROM produced by the conventional approach has a confident ROM error of $ 7.54\% $ using the same amount of training data. This confident ROM error is lower than the confident ROM error of the ANN-ROM produced in the conventional way (Figures 22 and 23 and Tables 13–16).

Table 13. PAC scores in the AL algorithm.

Note. The ROM is constructed with POD + OpInf. $ {\overline{\tau}}_{\mathrm{design}}^{\ast }=1\% $ .

Table 14. $ 97\% $ -confident ROM error ( $ {\overline{\tau}}^{\ast } $ ) by another PAC validator.

Note. The ROMs are constructed with POD + OpInf.

Table 15. Statistical analysis for the error field ( $ \Delta T(t),t\in \left(0,45,000\right] $ ) for the 3-stage heating profile.

Note. The ROMs are constructed with POD + OpInf.

Table 16. Statistical analysis for the error field ( $ \Delta T(t),t\in \left(0,45,000\right] $ ) for the heating profile with failure.

Note. The ROMs are constructed with POD + OpInf.

Figure 22. Comparison between NX solution and ROMs’ prediction for the 3-stage heating profile. The ROMs are constructed with POD + OpInf.

Figure 23. Comparison between NX solution and ROMs’ prediction for the heating profile with failure. The ROMs are constructed with POD + OpInf.

Similar to the results in Section 4.1.2, in the case-dependent validation, we can only see limited improvement by employing the AL algorithm for the OpInf method. Besides the reasons mentioned in Section 4.1.2, we also do not expect to have an extremely high temperature in the use cases of the case-dependent validation for the vacuum furnace model. This means that the temperatures of the system under the validation inputs are still considered to be located in the sampled region of the DPS method. In this case, a ROM trained with the DPS-sampled data can still predict with high accuracy.

5. Summary and Conclusions

In this work, we propose an Active-Learning-based approach to support machine-learning-based (ML-based) MOR methods. It can be applied to the reduction of any problem of the form,

(44)

$$ \dot{\boldsymbol{y}}(t)=\boldsymbol{f}\left(\boldsymbol{y};\boldsymbol{\mu} \right), $$

where $ \boldsymbol{y}\in {\mathrm{\mathbb{R}}}^N $ is the state vector and the parameter vector $ \boldsymbol{\mu} \in {\mathrm{\mathbb{R}}}^{N_m} $ is time-dependent. It can also be applied to a wide variety of ML-based MOR methods including purely data-driven methods such as Deep Neural Network in Fresca and Manzoni (Reference Fresca and Manzoni2022), Gaussian process regression in Guo and Hesthaven (Reference Guo and Hesthaven2018)), Dynamic Mode Decomposition (Erichson et al., Reference Erichson, Mathelin, Kutz and Brunton2019), Residual Network in San et al. (Reference San, Maulik and Ahmed2019), as well as physics-informed methods such as OpInf (Peherstorfer and Willcox, Reference Peherstorfer and Willcox2016), physics-informed neural networks (PINNs) (Kim et al., Reference Kim, Choi, Widemann and Zohdi2022) or SINDy (Champion et al., Reference Champion, Lusch, Kutz and Brunton2019). The approach cannot however be used with recurrent ML models such as those in Kani and Elsheikh (Reference Kani and Elsheikh2017), Mohan and Gaitonde (Reference Mohan and Gaitonde2018), Maulik et al. (Reference Maulik, Mohan, Lusch, Madireddy, Balaprakash and Livescu2020), Wang et al. (Reference Wang, Ripamonti and Hesthaven2020), and Wu and Noels (Reference Wu and Noels2022).

One key ingredient is the generation of solution snapshots through the (active) sampling of a joint space consisting of a predefined parameter space and an estimated reduced solution space. This leads to a better coverage of these two spaces, which is otherwise difficult to achieve with simpler strategies. In addition, we present a case-independent validator constructed based on the PAC learning theory. Through numerical experiments, we find that our approach significantly improves the performance of purely data-driven MOR methods and that the PAC validator presents a good indicator of the ROM quality. However, the benefits of using these strategies are less clear when applied to the physics-informed method OpInf. We expect this to be also the case for other methods in this category, such as PINNs (Kim et al., Reference Kim, Choi, Widemann and Zohdi2022) or SINDy (Champion et al., Reference Champion, Lusch, Kutz and Brunton2019). By exploiting knowledge of the physical system, such methods are better at constructing ROMs that can generalize beyond the training data. Therefore the benefit of our improved sampling strategy is less marked.

Acknowledgments

We are grateful to Mr. Hicham Elmrini from Maya HTT for his guidance with the simulation software.

Author Contributions

Conceptualization: Q.Z., D.H., H.J.B., J.M.L.; Data curation: Q.Z.; Formal analysis: Q.Z., D.H., H.J.B., J.M.L.; Investigation: Q.Z.; Methodology: Q.Z., D.H., H.J.B., J.M.L.; Software: Q.Z.; Supervision: D.H., H.J.B., J.M.L.; Validation: Q.Z.; Visualization: Q.Z; Writing—original draft: Q.Z.; Writing—review and editing: D.H., H.J.B., J.M.L.

Competing Interests

The authors declare no competing interests exist.

Data Availability Statement

The data and the source code necessary for reproducing the results in Sections 4.1 and 4.2 will be made available under https://github.com/qinyu-zhuang/AL-MOR-paper 6 months after publication, to comply with internal policies for the publication of this information.

Ethics Statement

The research meets all ethical guidelines, including adherence to the legal requirements of the study country.

Funding Statement

This work received no specific grant from any funding agency, commercial or not-for-profit sectors.

Footnotes

This research article was awarded an Open Data badge for transparent practices. See the Data Availability Statement for details.

References

Alippi, C (2016) Intelligence for Embedded Systems: A Methodological Approach, 1st Edn. New York: Springer Publishing Company, Incorporated.Google Scholar

Antoulas, A (2005) Approximation of Large-Scale Dynamical Systems. Philadelphia, PA: Society for Industrial and Applied Mathematics.CrossRef Google Scholar

Batista, GEAPA, Prati, RC and Monard, MC (2004) A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explorations Newsletter 6(1), 20–29.CrossRef Google Scholar

Broomhead, DS and Lowe, D (1988) Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks. Technical Report, Royal Signals and Radar Establishment, Malvern, UK.Google Scholar

Bui-Thanh, T, Willcox, K and Ghattas, O (2008) Model reduction for large-scale systems with high-dimensional parametric input space. SIAM Journal on Scientific Computing 30(6), 3270–3288.CrossRef Google Scholar

Champion, K, Lusch, B, Kutz, JN and Brunton, SL (2019) Data-driven discovery of coordinates and governing equations. Proceedings of the National Academy of Sciences 116(45), 22445–22451.CrossRef Google Scholar PubMed

Chaturantabut, S and Sorensen, DC (2010) Nonlinear model reduction via discrete empirical interpolation. SIAM Journal on Scientific Computing 32(5), 2737–2764.CrossRef Google Scholar

Chen, Y (1999) Model Order Reduction for Nonlinear Systems. Master’s thesis, Massachusetts Institute of Technology, Cambrigde, MA.Google Scholar

Erichson, NB, Mathelin, L, Kutz, JN and Brunton, SL (2019) Randomized dynamic mode decomposition. SIAM Journal on Applied Dynamical Systems 18(4), 1867–1891.CrossRef Google Scholar

Fresca, S and Manzoni, A (2022) Pod-dl-rom: Enhancing deep learning-based reduced order models for nonlinear parametrized PDEs by proper orthogonal decomposition. Computer Methods in Applied Mechanics and Engineering 388, 114181.CrossRef Google Scholar

Glorot, X, Bordes, A and Bengio, Y (2011) Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, pp. 315–323. Fort Lauderdale, FL: PMLR.Google Scholar

Gu, C (2011) Model Order Reduction of Nonlinear Dynamical Systems. Berkeley, CA: University of California.Google Scholar

Guo, M and Hesthaven, J (2018) Reduced order modeling for nonlinear structural analysis using Gaussian process regression. Computer Methods in Applied Mechanics and Engineering 341, 807–826.CrossRef Google Scholar

Gyori, NG, Palombo, M, Clark, CA, Zhang, H and Alexander, DC (2022) Training data distribution significantly impacts the estimation of tissue microstructure with machine learning. Magnetic Resonance in Medicine 87(2), 932–947.CrossRef Google Scholar PubMed

Haasdonk, B (2017) Reduced basis methods for parametrized pdes–a tutorial introduction for stationary and instationary problems. Model Reduction and Approximation: Theory and Algorithms 15, 65.CrossRef Google Scholar

Haasdonk, B, Dihlmann, M and Ohlberger, M (2011) A training set and multiple bases generation approach for parameterized model reduction based on adaptive grids in parameter space. Mathematical and Computer Modelling of Dynamical Systems 17(4), 423–442.CrossRef Google Scholar

Habermann, C and Kindermann, F (2007) Multidimensional spline interpolation: Theory and applications. Computational Economics 30(2), 153–169.CrossRef Google Scholar

Hammersley, J and Handscomb, D (1964) Monte-Carlo Methods. London: Mathuen.CrossRef Google Scholar

Hao, X, Gu, J, Chen, N, Zhang, W and Zuo, X (2008) 3-d numerical analysis on heating process of loads within vacuum heat treatment furnace. Applied Thermal Engineering 28(14–15), 1925–1931.CrossRef Google Scholar

Hartmann, D, Herz, M and Wever, U (2018) Model order reduction a key technology for digital twins. In Reduced-Order Modeling (ROM) for Simulation and Optimization. Cham: Springer, pp. 167–179.CrossRef Google Scholar

Haussler, D (1990) Probably Approximately Correct Learning. Santa Cruz, CA: University of California, Santa Cruz, Computer Research Laboratory.Google Scholar

He, K, Zhang, X, Ren, S and Sun, J (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV: IEEE, pp. 770–778.Google Scholar

Heres, PJ (2005) Robust and Efficient Krylov Subspace Methods for Model Order Reduction. Phd Thesis in TU/e.CrossRef Google Scholar

Kani, JN and Elsheikh, AH (2017) DR-RNN: A deep residual recurrent neural network for model reduction. arXiv preprint arXiv:1709.00939.Google Scholar

Kawano, Y and Scherpen, J (2019) Empirical differential gramians for nonlinear model reduction. arXiv preprint arXiv:1902.09836.Google Scholar

Kim, Y, Choi, Y, Widemann, D and Zohdi, T (2022) A fast and accurate physics-informed neural network reduced order model with shallow masked autoencoder. Journal of Computational Physics 451, 110841.CrossRef Google Scholar

Krige, DG (1951) A statistical approach to some basic mine valuation problems on the witwatersrand. Journal of the Southern African Institute of Mining and Metallurgy 52(6), 119–139.Google Scholar

Kuether, RJ and Allen, MS (2016) Validation of nonlinear reduced order models with time integration targeted at nonlinear normal modes. Conference Proceedings of the Society for Experimental Mechanics Series 1, 363–375.CrossRef Google Scholar

Lappano, E, Naets, F, Desmet, W, Mundo, D and Nijman, E (2016) A greedy sampling approach for the projection basis construction in parametric model order reduction for structural dynamics models. In Proceedings of ISMA. Leuven, Belgium: KU Leuven, pp. 19–21.Google Scholar

Lu, K, Jin, Y, Chen, Y, Yang, Y, Hou, L, Zhang, Z, Li, Z and Fu, C (2019) Review for order reduction based on proper orthogonal decomposition and outlooks of applications in mechanical systems. Mechanical Systems and Signal Processing 123, 264–297.CrossRef Google Scholar

Maulik, R, Mohan, A, Lusch, B, Madireddy, S, Balaprakash, P and Livescu, D (2020) Time-series learning of latent-space dynamics for reduced-order model closure. Physica D: Nonlinear Phenomena 405, 132368.CrossRef Google Scholar

McQuarrie, SA, Huang, C and Willcox, KE (2021) Data-driven reduced-order models via regularised operator inference for a single-injector combustion process. Journal of the Royal Society of New Zealand 51(2), 194–211.CrossRef Google Scholar

Mohan, AT and Gaitonde, DV (2018) A deep learning based approach to reduced order modeling for turbulent flow control using lstm neural networks. arXiv preprint arXiv:1804.09269.Google Scholar

Pan, S and Duraisamy, K (2018) Long-time predictive modeling of nonlinear dynamical systems using neural networks. Complexity 2018, 4801012.CrossRef Google Scholar

Paszke, A, Gross, S, Chintala, S, Chanan, G, Yang, E, DeVito, Z, Lin, Z, Desmaison, A, Antiga, L and Lerer, A (2017) Automatic differentiation in pytorch. In NIPS 2017 Workshop on Autodiff. Long Beach, CA: NIPS.Google Scholar

Paul-Dubois-Taine, A and Amsallem, D (2015) An adaptive and efficient greedy procedure for the optimal training of parametric reduced-order models. International Journal for Numerical Methods in Engineering 102(5), 1262–1292.CrossRef Google Scholar

Peherstorfer, B and Willcox, K (2016) Data-driven operator inference for nonintrusive projection-based model reduction. Computer Methods in Applied Mechanics and Engineering 306, 196–215.CrossRef Google Scholar

Perez, R, Wang, XQ and Mignolet, MP (2014) Nonintrusive structural dynamic reduced order modeling for large deformations: Enhancements for complex structures. Journal of Computational and Nonlinear Dynamics 9(3), 031008.CrossRef Google Scholar

Prechelt, L (1998) Early stopping-but when? In Neural Networks: Tricks of the Trade. Berlin: Springer, pp. 55–69.CrossRef Google Scholar

Przekop, A, Guo, X and Rizzi, SA (2012) Alternative modal basis selection procedures for reduced-order nonlinear random response simulation. Journal of Sound and Vibration 331(17), 4005–4024.CrossRef Google Scholar

Qian, E, Kramer, B, Peherstorfer, B and Willcox, K (2020) Lift & learn: Physics-informed machine learning for large-scale nonlinear dynamical systems. Physica D: Nonlinear Phenomena 406, 132401.CrossRef Google Scholar

Rasheed, A, San, O and Kvamsdal, T (2019) Digital twin: Values, challenges and enablers. arXiv preprint arXiv:1910.01719.Google Scholar

Regazzoni, F, Dedè, L and Quarteroni, A (2019) Machine learning for fast and reliable solution of time-dependent differential equations. Journal of Computational Physics 397, 108852.CrossRef Google Scholar

Rewienski, MJ (2003) A Trajectory Piecewise-Linear Approach to Model Order Reduction of Nonlinear Dynamical Systems. PhD Thesis, Massachusetts Institute of Technology.CrossRef Google Scholar

San, O, Maulik, R and Ahmed, M (2019) An artificial neural network framework for reduced order modeling of transient flows. Communications in Nonlinear Science and Numerical Simulation 77, 271–287.CrossRef Google Scholar

Settles, B (2009) Active learning literature survey.Google Scholar

The MORwiki Community (2018) Nonlinear RC ladder. MORwiki – Model Order Reduction Wiki.Google Scholar

Volkwein, S (2013) Proper orthogonal decomposition: Theory and reduced-order modelling. Lecture Notes, University of Konstanz 4(4), 1–29.Google Scholar

Wang, Q, Ripamonti, N and Hesthaven, JS (2020) Recurrent neural network closure of parametric POD-galerkin reduced-order models based on the Mori-Zwanzig formalism. Journal of Computational Physics 410, 109402.CrossRef Google Scholar

Willcox, K and Peraire, J (2002) Balanced model reduction via the proper orthogonal decomposition. AIAA Journal 40(11), 2323–2330.CrossRef Google Scholar

Williams, CK and Rasmussen, CE (1996) Gaussian processes for regression.Google Scholar

Wu, L and Noels, L (2022) Recurrent neural networks (RNNS) with dimensionality reduction and break down in computational mechanics; Application to multi-scale localization step. Computer Methods in Applied Mechanics and Engineering 390, 114476.CrossRef Google Scholar

Xiao, D (2019) Error estimation of the parametric non-intrusive reduced order model using machine learning. Computer Methods in Applied Mechanics and Engineering 355, 513–534.CrossRef Google Scholar

Zhuang, Q, Lorenzi, JM, Bungartz, H-J and Hartmann, D (2021) Model order reduction based on runge-kutta neural networks. Data-Centric Engineering 2, e13.CrossRef Google Scholar

Figure 1. The sample distribution in the parameter space $ \mathcal{M} $.

Figure 2. The sample distribution in the solution space.

Figure 3. The workflow of the AL-based MOR.

Figure 4. The evolution of the joint space.

Figure 5. The flow charts of the preparation and iteration phase of the AL-based MOR. The line numbers in (b) match those in Algorithm 3.

Table 1. PAC scores in the AL algorithm.

Table 2. $ 97\% $-confident ROM error ($ {\overline{\tau}}^{\ast } $) by another PAC validator.

Figure 6. Time-dependent inputs to the 2-D thermal problem for Test 1.

Figure 7. Linear thermal block: the FOM and ROM solution in Test 1. The ROM is constructed with POD + ANN.

Figure 8. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 1. Left: AL. Right: ML. The ROMs are constructed with POD + ANN.

Figure 9. Linear thermal block: the FOM and ROM solution in Test 2. The ROMs are constructed with POD + ANN.

Figure 10. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 2. Left: AL. Right: ML. The ROMs are constructed with POD + ANN.

Table 3. PAC scores in the AL algorithm.

Table 4. $ 97\% $-confident ROM error ($ {\overline{\tau}}^{\ast } $) by another PAC validator.

Figure 12. Linear thermal block: the FOM and ROM solution in Test 1. The ROMs are constructed with POD + OpInf.

Figure 13. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 1. Left: AL. Right: ML. The ROMs are constructed with POD + OpInf.

Figure 14. The FOM and ROM solution in Test 2. The ROMs are constructed with POD + OpInf.

Figure 15. Comparison between two ROMs’ error (absolute) fields at $ t=2\mathrm{s} $ in Test 2. Left: AL. Right: ML. The ROMs are constructed with POD + OpInf.

Table 5. PAC scores in the AL algorithm.

Table 6. $ 97\% $-confident ROM error ($ {\overline{\tau}}^{\ast } $) by another PAC validator.

Figure 16. Nonlinear RC ladder: the FOM and ROM solution in the test case, where $ \mu (t)=0.5\left(\cos \left(2\pi t\right)+1\right) $. The ROMs are constructed with POD + ANN.