Nomenclature
- ADS: air data system
- ADHRS: air data, heading reference system
- AoA: angle-of-attack
- AoS: angle-of-sideslip
- ASSE: angle-of-attack and -sideslip estimator
- EMRAN: extended minimal resource-allocating network
- FCS: flight control systems
- GRBF: generalised radial basis function
- MLP: multilayer perceptron
- MRAN: minimal resource-allocating network
- NN: neural network
- RAN: resource-allocating network
- RBF: radial basis function
- TAS: true airspeed
- UAV: unmanned aerial vehicles
- UAS: unmanned aircraft systems
- VMS: vehicle management systems
1.0 Introduction
Redundancy is usually applied to safety-critical systems to avoid single points of failure and to comply with the applicable airworthiness regulations. Analytical redundancy [Reference Pouliezos and Stavrakakis1] can guarantee the same level of reliability without installing any additional (or redundant) physical systems. The whole system can also be monitored using analytical, or synthetic, sensors in order, for example, to accommodate possible failures [Reference Gertler2, Reference Taiebat and Sassani3].
A synthetic sensor implements a data fusion technique that can provide indirect measurements of flight data. The use of synthetic sensors can be beneficial to reduce the number of system units to be installed on the aircraft and to introduce the dissimilarity feature that is crucial to avoid common failure modes [Reference Eubank, Atkins and Ogura4, Reference Lu, Van Eykeren, Van Kampen and Chu5] and incorrect failure diagnosis from pilots or autonomous systems [6].
Modern control systems rely on vehicle management systems (VMS), or flight control systems (FCS), to autonomously control, navigate and complete the air vehicle’s mission. The VMS, or the FCS, encompasses several sub-systems in order to measure all the flight data necessary to perform the required tasks. A set of flight data is directly measured from the external environment by means of the air data system (ADS), which is based on physical probes and vanes installed on the aircraft fuselage so that they are immersed in the surrounding airflow. The external sensors can provide direct measurements of pressures, temperatures and flow angles.
Flow angles, angle-of-attack (AoA) and angle-of-sideslip (AoS), can be estimated using several synthetic sensors [Reference Dendy and Transier7–Reference Valasek, Harris, Pruchnicki, McCrink, Gregory and Sizoo16] that can be grouped into two main categories according to the approach used: model-based (e.g. Kalman filters) and data-based (e.g. neural networks). Both categories share common disadvantages: they are designed for the target aircraft and they need to be reconfigured for any change in aircraft configuration or flight regime. These drawbacks are overcome using a model-free approach.
In the present work, the adopted method for flow angle estimation aims to be independent from the target aircraft and avionics [Reference Lerro, Brandl and Gili17]. The proposed method, in fact, is based on a set of nonlinear equations derived from the theory of flight dynamics by means of fusing together several inputs: inertial accelerations, airspeed and angular rates. The nonlinear scheme, named ASSE, can be solved using iterative solvers that can lose accuracy in operative scenarios [Reference Lerro, Brandl and Gili18] dealing with actual signals that are affected by uncertainties [Reference Plataniotis and Venetsanopoulos19, Reference Grigorie and Botez20] due to several sources (electronics noise, structural vibrations, etc.).
The manufactured technological demonstrator encompasses several sensors, which provide direct measurements to the ASSE scheme and are characterised in Ref. [Reference Lerro, Gili and Pisani21]. The iterative method currently used to solve the ASSE scheme may lose accuracy in realistic applications. Therefore, alternative solvers, such as neural ones, are considered in order to mitigate the effect of realistic uncertainties on the input signals. The multi-layer perceptron (MLP) and generalised radial basis function (GRBF) neural networks (NNs) are proposed in this work as possible solvers of the ASSE scheme. Both neural solvers are pre-trained in order to guarantee deterministic behaviour. On the other hand, training the neural networks with flight data from a specific aircraft would hinder the generality of the method. The objective of the present work is to investigate whether it is possible to train the neural network as the ASSE solver using uncertainty-free inputs while preserving acceptable performance when using data corrupted by realistic uncertainties. With the proposed approach, the model-free synthetic sensor remains generic as it is fully decoupled from the target aircraft. In fact, modelling uncertainties, considering worst-case scenarios, would lead to pre-trained neural networks that are ready to be used as flow angle synthetic sensors on board any flying body at any flight regime without any further training sessions.
The ASSE scheme and the corresponding reliability criteria are introduced in Section 2. The characterisation study of the ASSE technological demonstrator is briefly reported in Section 3. The neural approach is justified in Section 4 along with the performance evaluation criteria and the description of the training and test datasets. Neural network trainings are documented in Section 5. Before concluding the work, results are presented in Section 6 comparing the three methods.
2.0 Nonlinear ASSE scheme
In this section, some crucial information is introduced about the necessary inputs of the ASSE scheme.
In Fig. 1, the inertial and body reference frames, ${{\mathcal{F}}_I} = \left\{ {{X_I},{Y_I},{Z_I}} \right\}$ and ${{\mathcal{F}}_B} = \left\{ {{X_B},{Y_B},{Z_B}} \right\}$ respectively, are illustrated. The subscript denotes the reference frame where the vector components are represented. Generally speaking, the inertial velocity ${{\textbf{v}}_I}$ can be expressed as a function of the relative velocity ${{\textbf{v}}_B}$ and the wind velocity ${{\textbf{w}}_I}$ as
where ${{\textbf{C}}_{B2I}} = {\textbf{C}}_{I2B}^T$ is the direction-cosine matrix to express vector components in the inertial reference frame from the body one [Reference Schmidt22]. The vector ${{\textbf{v}}_B}$ can be written as a function of TAS and the flow angles, $\alpha $ and $\beta $ , as
where ${V_\infty }$ is the magnitude of the true airspeed (TAS), ${\hat{\boldsymbol{i}}_B}$ , ${\hat{\boldsymbol{j}}_B}$ and ${\hat{\boldsymbol{k}}_B}$ are the three unit vectors defining ${{\mathcal{F}}_B}$ and ${\hat{\boldsymbol{i}}_{WB}}$ is the unit vector of ${{\textbf{v}}_B}$ whose direction is only related to $\alpha $ and $\beta $ .
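As an illustration of how ${{\textbf{v}}_B}$ depends only on TAS and the two flow angles, the following Python sketch assumes the common flight-dynamics convention for the unit vector ${\hat{\boldsymbol{i}}_{WB}}$ (components $\cos\alpha\cos\beta$, $\sin\beta$, $\sin\alpha\cos\beta$ along the body axes). The function names are illustrative and are not taken from the referenced works.

```python
import numpy as np


def body_velocity(tas, alpha, beta):
    """Relative velocity in the body frame from TAS [m/s] and flow angles [rad].

    Assumes the usual convention for the wind-axis unit vector i_WB.
    """
    i_wb = np.array([
        np.cos(alpha) * np.cos(beta),
        np.sin(beta),
        np.sin(alpha) * np.cos(beta),
    ])
    return tas * i_wb


def flow_angles(v_b):
    """Inverse relation: recover TAS, AoA and AoS from body-frame components."""
    u, v, w = v_b
    tas = np.linalg.norm(v_b)
    alpha = np.arctan2(w, u)   # angle-of-attack
    beta = np.arcsin(v / tas)  # angle-of-sideslip
    return tas, alpha, beta
```

A round trip such as `flow_angles(body_velocity(50.0, 0.09, -0.05))` should return the same TAS and angles, which is a quick consistency check on the assumed convention.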
From the definition of the skew-symmetric matrix of body angular rates ${\boldsymbol{\Omega}_B}$ [Reference Etkin and Reid23], the coordinate acceleration can be written as
It is worth noting that the coordinate acceleration vector ${{\textbf{a}}_B}$ can be calculated by means of removing the effect of the gravity acceleration from the inertial acceleration ${{\textbf{a}}_I}$ that is directly measured by inertial reference systems.
The fundamental equation of the ASSE scheme [Reference Lerro, Brandl and Gili17] can be introduced as
where t is the current time, $\tau $ is a generic previous time, and
and
It is worth underlining that in Equations (5) and (6) all parameters are measured between time $\tau $ and t, whereas in Equation (4) AoA and AoS only appear at current time t. Moreover, Equation (4) can be rewritten considering different starting times ${\tau _i}$ with $i \in \left[ {0,1, \ldots ,n} \right]$ and ${\tau _0} \equiv t$ . The latter process leads to the ASSE scheme of $n + 1$ nonlinear equations:
where $\alpha $ and $\beta $ (at current time t) are the only unknowns of the system. The wind acceleration ${{\dot{\textbf{w}}}}$ is considered negligible in the time interval and all other coefficients can be directly measured using dedicated systems (as described in Section 3). Therefore, the flow angle estimation of Equation (7) requires direct measurements of: (1) body angular rates p, q and r; (2) TAS and its time derivative; (3) coordinate acceleration vector ${{\textbf{a}}_B}$ .
2.1 ASSE solvers
The flow angle estimation approach of Equation (7) can be solved using the most suitable solver for nonlinear schemes. The Levenberg-Marquardt algorithm [Reference Marquardt24] is used in Ref. [Reference Lerro, Brandl and Gili17] to solve the system of two, three and four nonlinear equations. In Ref. [Reference Lerro, Brandl and Gili18], it is shown that the iterative solver is highly affected by uncertainties on the inputs and neural solvers can mitigate the propagation of input uncertainties on the flow angle estimations.
Three nonlinear equations are used to assemble the ASSE scheme in the present work. The same number of equations is used to build the input layer of the neural networks as described in Section 5.
2.2 ASSE reliability criteria
In Ref. [Reference Lerro25], the flight conditions where the ASSE scheme is reliable are defined as:
1. Vertical/lateral coordinate acceleration:
• ${a_Z} \gt {a_{thr}} = 0.5{\rm{\;m}}/{{\rm{s}}^2}$ for AoA estimation
• ${a_Y} \gt {a_{thr}} = 0.5{\rm{\;m}}/{{\rm{s}}^2}$ for AoS estimation
2. $\tilde D \gt {D_{thr}} = 0.2{\rm{\;}}{{\rm{m}}^4}/{{\rm{s}}^6}$
where $\tilde D = {l_{t,{{\dot{\textbf{w}}}} \approx 0}}{m_{\tau ,{{\dot{\textbf{w}}}} \approx 0}} - {m_{t,{{\dot{\textbf{w}}}} \approx 0}}{l_{\tau ,{{\dot{\textbf{w}}}} \approx 0}}$ . In order to guarantee the flow angle estimation during dynamic manoeuvres, the aforementioned criteria are considered satisfied only if they are verified for (at least) 100 consecutive samples, equivalent to 1 s at the output rate adopted in this work (as described in Section 3).
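The criteria above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes a causal interpretation of the "100 consecutive samples" rule (a sample is flagged reliable once the thresholds have held for the 100 most recent samples), it applies the acceleration threshold exactly as written (without an absolute value), and all function names are hypothetical.

```python
import numpy as np

A_THR = 0.5     # acceleration threshold [m/s^2]
D_THR = 0.2     # determinant threshold [m^4/s^6]
N_CONSEC = 100  # 1 s at the 100 Hz output rate


def determinant(l_t, m_t, l_tau, m_tau):
    """D-tilde = l_t * m_tau - m_t * l_tau (wind acceleration neglected)."""
    return l_t * m_tau - m_t * l_tau


def sustained(mask, n=N_CONSEC):
    """True at sample k only if mask held for the n most recent samples."""
    mask = np.asarray(mask, dtype=bool)
    out = np.zeros(mask.shape, dtype=bool)
    run = 0
    for k, ok in enumerate(mask):
        run = run + 1 if ok else 0
        out[k] = run >= n
    return out


def asse_reliable(acc, d_tilde=None, n=N_CONSEC):
    """Reliability flag for AoA (acc = a_Z) or AoS (acc = a_Y) estimation.

    If d_tilde is omitted, only the acceleration criterion is checked.
    """
    ok = np.asarray(acc, dtype=float) > A_THR
    if d_tilde is not None:
        ok &= np.asarray(d_tilde, dtype=float) > D_THR
    return sustained(ok, n)
```

Passing `d_tilde=None` restricts the check to the acceleration criterion alone.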
3.0 Uncertainty characterisation
A dedicated demonstrator of the ASSE technology, illustrated in Fig. 2, is designed and manufactured. It encompasses an air data, inertial and heading reference system (ADAHRS).
The technological demonstrator can provide all the required measurements for the ASSE scheme at 100 Hz. The demonstrator is interfaced with: (i) a Pitot probe able to sense the dynamic and absolute pressures and (ii) an outside ambient temperature sensor. All sensors are characterised in terms of global uncertainty under static and dynamic conditions [Reference Lerro, Gili and Pisani21]. The sensor’s uncertainty is modelled using a bias ( ${B_0}$ ) and a zero-mean white noise (WN) whose standard deviation $1\sigma $ is evaluated as a function of the current signal value. Therefore, the sensor uncertainty is modelled as $ \pm {B_0} + {\rm{WN}}$ , where ${\sigma _{WN}} = \dfrac{1}{2}\sqrt {\sigma _{WN,0}^2 + {\nu ^2}\sigma _{WN,1}^2} $ and $\nu $ is the current value of the signal. The uncertainty models used in the present work are collected in Table 1.
For example, the inertial acceleration along the X direction, ${a_X}$ , is corrupted using an unbiased white noise with $1\sigma $ calculated as ${a_{X,1\sigma }} = \dfrac{1}{2}\sqrt {{{0.007}^2} + a_X^2{{0.02}^2}} {\rm{\;m}}/{{\rm{s}}^2}$ . The TAS uncertainty, instead, is modelled with a bias ${B_{0,TAS}} = \pm 0.47{\rm{\;m}}/{\rm{s}}$ and a white noise with constant $1\sigma $ value, ${\rm{TA}}{{\rm{S}}_{1\sigma }} = 1.3 \times 10^{-3}{\rm{\;m}}/{\rm{s}}$ .
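A minimal Python sketch of this uncertainty model is given below. It implements the stated expression for ${\sigma _{WN}}$ and adds a bias plus Gaussian white noise to a nominal signal. The function names, the use of a Gaussian distribution for the white noise and the way the bias sign is passed in by the caller are assumptions made for illustration only.

```python
import numpy as np


def sigma_wn(nu, sigma0, sigma1):
    """Value-dependent 1-sigma: 0.5 * sqrt(sigma0^2 + nu^2 * sigma1^2)."""
    return 0.5 * np.sqrt(sigma0**2 + (nu * sigma1) ** 2)


def corrupt(signal, sigma0, sigma1=0.0, bias=0.0, rng=None):
    """Return signal + bias + zero-mean white noise (Gaussian assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    sigma = sigma_wn(signal, sigma0, sigma1)
    return signal + bias + rng.normal(0.0, 1.0, signal.shape) * sigma


# Example with the a_X coefficients quoted in the text (0.007 and 0.02)
a_x_nominal = np.linspace(-5.0, 5.0, 1000)
a_x_realistic = corrupt(a_x_nominal, sigma0=0.007, sigma1=0.02,
                        rng=np.random.default_rng(0))
```

With these coefficients the noise level grows from 0.0035 m/s² at zero acceleration to about 0.1 m/s² at 10 m/s², which shows how the second term dominates for large signal values.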
4.0 Selected neural approaches
It can be demonstrated that MLP [Reference Cybenko26] and GRBF [Reference Park and Sandberg27] NNs with a single hidden layer can approximate any continuous nonlinear input-output mapping function. However, the MLP and GRBF represent two different neural approaches as they belong to “global” and “local” approximators, respectively.
As far as a “global” neural network is concerned, the activation function is part of the ridge function class (e.g. Equation (A2)) and, therefore, the output of a single hidden unit (or neuron) is calculated as a function of all the available inputs. On the other hand, the GRBF activation functions are radial ones (e.g. Equation (B2)) and each centre is committed to a specific area of the input space. In fact, the output of a single hidden unit depends on the Euclidean norm (or distance) between the input vector and the neuron centre.
Although the relation between the MLP and GRBF is demonstrated in Ref. [Reference Maruyama, Girosi and Poggio28], the relative accuracy of the MLP and GRBF approaches cannot be established in advance. For example, in Ref. [Reference Haykin29], the MLP is shown to achieve the same accuracy as the RBF neural networks but using fewer hidden units.
As the present work deals with a physics-based method solved using “global” and “local” neural networks, the input layer is defined by the ASSE scheme. Moreover, as the proposed GRBF-NN is defined with a single hidden layer, the proposed MLP-NN is also based on a single hidden layer.
4.1 Training and test dataset
The definition of a suitable training dataset can be challenging. As far as model-free methods are concerned, a possible strategy is to generate coherent uncertainty-free data using a reliable flight simulator. According to the latter potential strategy, it would be possible to control an adequate mapping of the domain where the neural networks are defined. However, building a suitable training dataset is not addressed in this work and the simulated flight manoeuvres only define a training and test dataset suitable for the scope of the work.
Simulated data are labelled as “nominal data” because they are not affected by any uncertainties. Simulated data are also corrupted using uncertainty models defined according to the characterisation study of a dedicated technological demonstrator. These data are labelled as “realistic data”. Firstly, the proposed networks are trained using nominal data, whereas they are tested using both nominal and realistic data. Secondly, the networks are re-trained using input data affected by realistic uncertainties and they are tested using both nominal and realistic data.
Training and test data are obtained with flight simulations and input uncertainties are artificially added in post processing according to Section 3. The flight simulator is a six-degree of freedom nonlinear aircraft model with nonlinear aerodynamic and thrust models. The simulation is executed with 10 ms fixed time steps aiming to be coherent with the output rate of the dedicated demonstrator.
Training and test manoeuvres considered in this work are introduced in Ref. [Reference Lerro, Brandl and Gili18] and available in Ref. [Reference Lerro and Brandl30]. Three manoeuvres (Figs. 3(a)–(c)) belong to the training dataset, whereas the independent manoeuvre of Fig. 3(d) belongs to the test dataset conceived to evaluate the ability of the NNs to solve the ASSE scheme.
The training dataset is made up of: (1) a stall manoeuvre (Fig. 3(a)) to excite the AoA up to maximum values under quasi-steady state flight conditions; (2) an AoS sweep manoeuvre (Fig. 3(b)) to excite lateral-directional modes; (3) a 3-2-1-1 pitch manoeuvre (Fig. 3(c)) to excite longitudinal aircraft modes. The test manoeuvre, not related to the training dataset, is a pitch sweep superimposed on an angle-of-sideslip sweep manoeuvre to excite both longitudinal and lateral-directional aircraft modes at the same time (Fig. 3(d)).
The simulations only involve synchronised and noise-free signals. With the aim to preliminarily evaluate the ASSE performance with realistic input signals, the uncertainty models are used to corrupt the simulated flight data as introduced in Section 3.
The ASSE reliability criteria of Section 2.2 are applied to the test manoeuvre of Fig. 3(d) in order to limit the performance assessment to flight conditions that are reliable for ASSE applications. In Fig. 4, the shaded area represents the flight conditions where the acceleration criteria are not met, whereas the dotted areas indicate where the determinant criterion is not satisfied. For simplicity, only the acceleration criteria are considered in this work for the ASSE application. Therefore, the AoA will be evaluated in the time frame $\left[ {10.0{\rm{\;s}} \div 14.5{\rm{\;s}}} \right]$ , whereas the AoS in $\left[ {10.0{\rm{\;s}} \div 17.0{\rm{\;s}}} \right]$ . It is important to underline that the ASSE solution is always computed even when the aforementioned criteria are not met in order to show the flow angle estimations outside the reliability area.
4.2 Performance evaluation criteria
As metrological standards for synthetic sensors are under evaluation [31], some criteria are defined in this work to evaluate the flow angle estimation. The mean, maximum, $1\sigma $ , $2\sigma $ and $3\sigma $ errors are used to conduct the performance analysis. In the present work, $1\sigma $ , $2\sigma $ and $3\sigma $ are the values such that the probability ${\rm{Pr}}\!\left( { - \sigma \le X \le \sigma } \right) = 68.3{\rm{\% }}$ , ${\rm{Pr}}\!\left( { - 2\sigma \le X \le 2\sigma } \right) = 95.4{\rm{\% }}$ and ${\rm{Pr}}\!\left( { - 3\sigma \le X \le 3\sigma } \right) = 99.7{\rm{\% }}$ also in case the error does not have a normal distribution.
The rationale behind the latter strategy is to compare the maximum and $3\sigma $ errors in order to evaluate the presence of possible spike errors. The $1\sigma $ and $2\sigma $ errors are those commonly used to assess the sensor’s performance. From an operative standpoint, the maximum and the $2\sigma $ errors should be specified according to their target applications.
For example, if the synthetic flow angle sensor is used to display data on board or to monitor actual systems, maximum absolute errors up to 5° (or even 10°) can be accepted. For analytical redundancy of UAS systems, e.g. VMS or FCS applications, the performance requirements can instead be more demanding. The latter scenario is assumed to define the following performance requirements:
• maximum absolute error $ \lt 2^\circ $
• $2\sigma $ error $ \lt 1^\circ $
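The evaluation criteria of this section can be sketched in Python as follows. Since the $1\sigma $ , $2\sigma $ and $3\sigma $ values are defined through coverage probabilities rather than through a normal-distribution assumption, they are computed here as percentiles of the absolute error. Taking the mean of the signed error, and the function names themselves, are assumptions made for illustration.

```python
import numpy as np

# Coverage probabilities [%] used to define the 1, 2 and 3 sigma errors
COVERAGE = {"1sigma": 68.3, "2sigma": 95.4, "3sigma": 99.7}


def error_stats(estimate, reference):
    """Mean, maximum and percentile-based sigma errors (same unit as inputs)."""
    err = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    abs_err = np.abs(err)
    stats = {"mean": float(err.mean()), "max": float(abs_err.max())}
    for name, pct in COVERAGE.items():
        # Smallest bound s such that Pr(-s <= X <= s) reaches the coverage
        stats[name] = float(np.percentile(abs_err, pct))
    return stats


def meets_requirements(stats, max_limit=2.0, two_sigma_limit=1.0):
    """Check the requirements stated above (errors expressed in degrees)."""
    return stats["max"] < max_limit and stats["2sigma"] < two_sigma_limit
```

Comparing `stats["max"]` with `stats["3sigma"]` then gives a quick indication of spike errors: a maximum far above the $3\sigma $ value points to a few isolated outliers.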
5.0 Neural networks as ASSE solvers
As aforementioned, in the present work NNs are used as solvers of the ASSE scheme presented in Equation (7). The neural solvers are trained to learn the input-output nonlinear mapping between the ASSE inputs ( ${\textbf{m}}$ and n coefficients) and the output ( $\alpha $ and $\beta $ ). Exploiting the model-free feature of the ASSE scheme, the NNs can be trained independently from the target aircraft and, for the latter reason, using flight data obtained with a coherent data generator. Therefore, even though a particular flight simulator is used to generate training data, the trained NN is generic, i.e. independent from the aircraft model adopted in the simulations and applicable to any flying bodies.
The neural network input layer is shaped according to the ASSE scheme composed of three nonlinear equations. The latter choice is justified in [Reference Lerro, Brandl and Gili17] where the minimum set of equations is established. Considering Equation (7), the input layer, or vector, comprises the coefficients $\left({\left[ {{\textbf{m}}_t^T,{\textbf{m}}_{{\tau _1}}^T,{\textbf{m}}_{{\tau _2}}^T} \right]^T}\right)$ and known terms $\left({\left[ {{n_t},{n_{{\tau _1}}},{n_{{\tau _2}}}} \right]^T}\right)$ at the current time t and two previous time steps, ${\tau _1}$ and ${\tau _2}$ . As the demonstrator’s operating frequency is 100 Hz, ${\tau _1} = t - 10{\rm{\;ms}}$ and ${\tau _2} = t - 20{\rm{\;ms}}$ . Therefore, recalling Equation (7), the overall network response is a mapping ${f_{NN}}\,:\,{\mathbb{R}^{12}} \to {\mathbb{R}^2}$ for flow angle estimation at current time t that can be written as
where the neural networks of the present work have 12 inputs ( ${n_I}$ ), 2 outputs ( ${n_O}$ ) and the reference output is denoted as $\boldsymbol{y} = {\left[ {\alpha \!\left( t \right),\beta \!\left( t \right)} \right]^T}$ . The MLP and GRBF NNs are selected as examples of “global” and “local” approximators respectively. A schematic overview is presented in Fig. 5.
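The assembly of the 12-element input vector can be sketched in Python as follows, assuming that each ${\textbf{m}}$ is a three-component vector, that the coefficients are stacked first and the known terms last, and that samples are spaced by 10 ms. The ordering within the vector and the function names are illustrative assumptions.

```python
import numpy as np


def asse_input(m_hist, n_hist):
    """Stack [m_t, m_tau1, m_tau2] (3x3) and [n_t, n_tau1, n_tau2] into R^12."""
    m_hist = np.asarray(m_hist, dtype=float)
    n_hist = np.asarray(n_hist, dtype=float)
    if m_hist.shape != (3, 3) or n_hist.shape != (3,):
        raise ValueError("expected m_hist of shape (3, 3) and n_hist of shape (3,)")
    return np.concatenate([m_hist.ravel(), n_hist])


def sliding_inputs(m, n):
    """Build one input vector per sample from time series sampled at 100 Hz.

    m has shape (N, 3), n has shape (N,). Row k of the result uses the samples
    at t (index k + 2), tau1 = t - 10 ms and tau2 = t - 20 ms.
    """
    m = np.asarray(m, dtype=float)
    n = np.asarray(n, dtype=float)
    rows = [asse_input(m[[k, k - 1, k - 2]], n[[k, k - 1, k - 2]])
            for k in range(2, len(n))]
    return np.vstack(rows)
```

The first two samples of a record produce no input vector because the two previous time steps are not yet available.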
As all presented neural networks are pre-trained on conventional personal computers, they are expected to run on commercial hardware during their operative life, e.g. for drone applications.
5.1 MLP trade-off
As the number of MLP-NN’s neurons is not defined a priori, a trade-off analysis is necessary to define the number of hidden units as presented in Fig. 6. The $3\sigma $ error is used because it will provide information about the largest errors excluding possible spike errors.
Once the number of hidden units is set, the MLP-NN is re-trained 10 times starting from random network weights. Among the ten re-trainings, the network that minimises the mean quadratic sum of the AoA and AoS $3\sigma $ errors is selected.
In order to define the best MLP-NN architecture, eight different neural networks are considered with several numbers of neurons ( $\left[ {10,{\rm{\;}}15,{\rm{\;}}20,{\rm{\;}}25,{\rm{\;}}30,{\rm{\;}}35,{\rm{\;}}40,{\rm{\;}}45} \right]$ ) as represented in Fig. 6. As far as the MLP-NN is concerned, the best performance is achieved with 25 neurons (Fig. 6(a)). Conversely, the best performance is obtained with 30 hidden units for the MLP-NN-U (Fig. 6(b)). The latter configurations are used in this work.
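The selection among the re-trainings can be sketched in Python as follows. The sketch interprets the "mean quadratic sum" as the average of the squared AoA and AoS $3\sigma $ errors, which is an assumption, and the candidate values in the example are purely illustrative and are not results from this work.

```python
def selection_score(aoa_3sigma, aos_3sigma):
    """Mean of the squared AoA and AoS 3-sigma errors (assumed interpretation)."""
    return 0.5 * (aoa_3sigma**2 + aos_3sigma**2)


def select_best(candidates):
    """Pick the candidate with the lowest score.

    Each candidate is a tuple (label, aoa_3sigma, aos_3sigma).
    """
    return min(candidates, key=lambda c: selection_score(c[1], c[2]))


# Illustrative (made-up) 3-sigma errors in degrees for three re-trainings
runs = [("run0", 2.0, 1.0), ("run1", 1.5, 1.2), ("run2", 1.0, 2.0)]
best = select_best(runs)
```

The same score can be reused for the trade-off over the number of hidden units by treating each architecture as a candidate.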
5.2 GRBF trade-off
Adopting the minimal resource-allocating network (MRAN) growing algorithm, described in Appendix B, the neuron number is not defined a priori because the algorithm can populate, or prune, the single hidden layer following the logic introduced in Appendix C.
The maximum number of neurons is limited to ${K_{max}} = 100$ . The training evolution of the neuron number is reported in Fig. 7. The stability of the number of neurons below the upper limit ${K_{max}}$ is a solid indication to determine the convergence of the GRBF-NN. The training of the GRBF-NN (Fig. 7(a)) converged on 32 hidden units, whereas the GRBF-NN-U (Fig. 7(b)) on 30 hidden units. It can be noted that the final number of hidden units is comparable to those selected for the MLP-NNs in Section 5.1.
6.0 Results
Firstly, the ASSE scheme based on three nonlinear equations is solved using: (1) the iterative Levenberg-Marquardt algorithm [Reference Lerro, Brandl and Gili17], denoted as LM-3; (2) the MLP-NN trained with nominal data; (3) the GRBF-NN trained with nominal data. Additionally, the neural networks are re-trained using realistic data, leading to MLP-NN-U and GRBF-NN-U, and their performance is compared with that achieved with NNs trained with uncertainty-free data.
In this section, results only belong to the time intervals where the ASSE scheme’s reliability criteria are satisfied as described in Section 4.1.
6.1 Validation of neural networks trained with uncertainty-free input
The AoA and AoS estimations are plotted in Fig. 8 for the test manoeuvre, and the error statistics are presented in Table 2. From Fig. 8(a), it is clear that the LM-3 shows the best accuracy for flow angle estimation as both maximum and $2\sigma $ errors are acceptable. As far as the neural solvers are concerned, the GRBF-NN exhibits the largest $2\sigma $ errors with acceptable maximum errors, whereas the MLP-NN shows $2\sigma $ errors less than 20% beyond the defined thresholds but with very large maximum errors. The latter aspect suggests the presence of some spike errors. The presence of spike errors, however, does not represent a real issue because they can be managed using adequate filters. Therefore, as far as the simulated scenario is concerned, the best accuracies are achieved with the LM-3, whereas neither neural method is acceptable for analytical redundancy of UAS flow angle sensors considering the thresholds defined in Section 4.2.
The same solvers (LM-3, MLP-NN and GRBF-NN) are tested using realistic input signals that are obtained according to Section 3. Time histories of AoA and AoS estimations are reported in Fig. 8(c) and (d) respectively. Error statistics are presented in Table 2.
By comparing the $2\sigma $ errors obtained using nominal and realistic inputs for a particular solver, it is clear that the LM-3 and the MLP-NN are not satisfactory for use in a realistic scenario. In fact, their performance degrades significantly when the realistic input uncertainties are involved.
On the other hand, the GRBF-NN is only marginally affected by realistic inputs. In fact, the $2\sigma $ errors are comparable with those achieved with nominal inputs, even though the maximum errors increase significantly. However, the GRBF-NN cannot satisfy the aforementioned performance requirements as AoA and AoS $2\sigma $ errors are larger than defined limits. In order to improve the GRBF-NN accuracy, several actions can be adopted (e.g. retrain, optimise the training coefficients, etc.) but they are beyond the scope of the present work. To conclude the analysis, it is clear that the iterative solver cannot be adopted to solve the ASSE scheme of three nonlinear equations with realistic input signals.
6.2 Validation of neural networks trained with input uncertainties
This section deals with neural networks that are trained using the same training dataset as in Section 6.1 but corrupted with realistic uncertainties. Time histories of the test manoeuvre are reported in Fig. 9, whereas a quantitative error analysis is presented in Table 3. From a qualitative point of view, comparing Figs. 9(c) and (d) with Figs. 8(c) and (d), it can be noted that the MLP-NN-U is able to approximate the realistic test data only if trained with realistic inputs. In other words, aiming to solve the ASSE scheme, the MLP can be predictive in a realistic context only if trained in a similar scenario. Conversely, the GRBF does not show the same sensitivity to input uncertainties.
If error data are compared with those collected in Table 2, the GRBF-NN-U has an improved AoA estimation accuracy at the expense of the AoS accuracy. In fact, the GRBF-NN-U satisfies the defined requirements for AoA as both maximum and $2\sigma $ errors are below the thresholds. However, the GRBF-NN-U’s AoS performance is worse than that observed with the GRBF-NN and, therefore, the AoS estimation is still not suitable for the aforementioned scopes.
As far as the MLP is concerned, the use of training data corrupted with realistic uncertainties is beneficial to improve the performance with realistic data, even though the errors remain beyond the accuracy requirements.
Neither neural solver trained with realistic data is affected by the input test type (nominal or realistic). In fact, it can be noted in Table 3 that errors achieved with realistic inputs are comparable with those obtained with nominal inputs.
Although both neural solvers benefit from a training dataset corrupted with realistic uncertainty, their estimation accuracy is not adequate to comply with the performance requirements defined in Section 4.2 for UAS applications. Moreover, in order to assess the neural solver accuracy, training and test manoeuvres should cover a realistic flight envelope, e.g. a common flight envelope for UAS applications. This topic will be addressed in the training optimisation of the ASSE neural solver.
As a final observation, it is worth noting that the GRBF-NNs are more tolerant to input uncertainties even though they are trained with nominal inputs. This latter aspect is crucial when dealing with the ASSE scheme because it would give the chance to train the neural solver independently from the uncertainty models of the target avionics.
7.0 Conclusion
The ASSE scheme is a model-free approach suitable for developing synthetic sensors for flow angle estimations, e.g. for monitoring and redundancy purposes. The ASSE scheme is defined using three nonlinear equations and it is solved with: (i) the Levenberg-Marquardt algorithm; (ii) the multi-layer perceptron (“global”) neural network; (iii) the generalised radial basis function (“local”) neural network. In order to achieve deterministic solvers, neural networks are pre-trained. In operative scenarios, the iterative method is inadequate. At the same time, training neural networks with real flight data would hinder the generality of the ASSE scheme. Therefore, as a first step, the neural networks are trained with uncertainty-free data. Considering nominal test data, the iterative solver is able to satisfy the defined performance requirements, whereas the neural solvers do not show the same capability. On the other hand, dealing with test data corrupted with realistic uncertainty, the iterative solver and the global neural network are inadequate because they develop very large errors, whereas the local neural network is more tolerant to input uncertainties. In addition to the uncertainty-free training, both neural networks are re-trained with the same training dataset but corrupted with realistic uncertainties. The “global” neural network takes great advantage of the presence of uncertainties in the training dataset, whereas the “local” approximator only partially benefits from a realistic training dataset. Between the two neural approaches considered to solve the ASSE scheme, results show that the “local” neural network is more effective in a realistic environment even though it is trained with nominal data. The latter feature is crucial to maintain the generalisation of the synthetic flow angle sensors. In fact, as the ASSE scheme is model-free, i.e. independent from the aircraft application, the solver should also not depend on the target avionics. Therefore, in order to solve the ASSE scheme, the “local” approximator is the most suitable candidate to be used in operative scenarios.
Acknowledgements
The present work is an extended version of “Neural Network Techniques to Solve a Model-Free Scheme for Flow Angle Estimation”, A. Lerro, A. Brandl, P. Gili, presented at the International Conference on Unmanned Aircraft Systems ICUAS 2021. Data considered in this work are available in Ref. [Reference Lerro and Brandl30].
Declarations
• Funding: The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
• Conflict of interest/Competing interests: The authors have no relevant financial or non-financial interests to disclose.
• Ethics approval: not applicable
• Consent to participate: not applicable
• Consent for publication: not applicable
• Availability of data and materials: https://doi.org/10.5281/zenodo.5938860
• Code availability: not available
• Competing interests: The authors declare none
• Authors’ contributions: All authors contributed to the study conception. Design, coding, material preparation, data collection and analysis were performed by Angelo Lerro. The manuscript was prepared and reviewed by all authors. All authors read and approved the final manuscript.
8.0 Appendix
A. Multi-layer perceptron neural network
The mapping function of a conventional MLP-NN, represented in Fig. 5(a), can be expressed as
where:
• ${\rm{\Phi }}$ is the activation function
• ${\boldsymbol{b}_I} \in {\mathbb{R}^K}$ is the vector with the hidden layer’s bias terms
• ${\boldsymbol{b}_O} \in {\mathbb{R}^{{n_O}}}$ is the vector with the output layer’s bias terms
• $\boldsymbol{w} \in {\mathbb{R}^{K \times {n_O}}}$ is the matrix of the connection weights between the hidden units and the output layer
• $\boldsymbol{W} \in {\mathbb{R}^{{n_I} \times K}}$ is the matrix of the connection weights between the input layer and the hidden units
• K is the number of hidden units (or neurons)
The activation function ${\rm{\Phi }}$ is defined as:
where the input ${z_k}$ to the k-th hidden unit is the k-th element of the vector defined as $\boldsymbol{z} = \boldsymbol{Wx} + {\boldsymbol{b}_I}$ .
The batch Back-Propagation algorithm [Reference Beale, Hagan and Demuth32] is adopted to train the MLP-NNs. The latter choice makes the training outcome independent of the data order. With the training, the neural network parameters ( $\boldsymbol{W}$ , $\boldsymbol{w}$ , ${\boldsymbol{b}_I}$ and ${\boldsymbol{b}_O}$ ) are optimised to minimise the mean square error over the entire training dataset. The Levenberg-Marquardt algorithm is chosen to train the MLP-NN.
When the MLP-NN is trained using training data corrupted with realistic uncertainties, it is denoted as MLP-NN-U.
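As an illustration of the MLP mapping described in this appendix, the following NumPy sketch evaluates a single-hidden-layer network using the array shapes listed above. It is not the authors' implementation. The function name `mlp_forward` is hypothetical, the hyperbolic tangent is assumed for ${\rm{\Phi }}$ purely for illustration, a linear output layer is assumed, and the hidden-layer input is computed as $\boldsymbol{W}^T\boldsymbol{x} + {\boldsymbol{b}_I}$ so that the product is defined for $\boldsymbol{W} \in {\mathbb{R}^{{n_I} \times K}}$.

```python
import numpy as np

def mlp_forward(x, W, b_I, w, b_O, phi=np.tanh):
    """Single-hidden-layer MLP output for one input vector x.

    Shapes follow the appendix: x (n_I,), W (n_I, K), b_I (K,),
    w (K, n_O), b_O (n_O,). phi=np.tanh is an assumption.
    """
    z = W.T @ x + b_I      # inputs to the K hidden units
    h = phi(z)             # hidden-unit outputs
    return w.T @ h + b_O   # assumed linear output layer


# Toy example with n_I = 3 inputs, K = 4 hidden units, n_O = 1 output.
rng = np.random.default_rng(0)
n_I, K, n_O = 3, 4, 1
W = rng.normal(size=(n_I, K))
w = rng.normal(size=(K, n_O))
b_I = np.zeros(K)
b_O = np.array([0.5])
y = mlp_forward(np.array([0.1, -0.2, 0.3]), W, b_I, w, b_O)
```

With zero hidden biases and a zero input, the assumed activation returns zero for every hidden unit, so the output reduces to ${\boldsymbol{b}_O}$; this is a quick sanity check on the shapes.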
B. Generalised radial basis function
The other neural solver considered in the present work belongs to the family of GRBF-NNs, illustrated in Fig. 5(b). Even though the GRBF-NN can be successfully applied to on-line sequential learning [Reference Fabri and Kadirkamanathan33], sequential off-line training is adopted in this work, which makes the GRBF-NN sensitive to the order in which the data are presented during the training stage. For this reason, the training database is shuffled into a random order.
The selected radial basis function is the Gaussian function of Equation (B2), which provides a local representation of the function because its output diminishes exponentially moving away from its centre (or mean value). This locality is the most important distinction between the MLP and GRBF approaches.
In a GRBF-NN, the mapping function can be expressed as
where ${\boldsymbol{b}_0}$ is the vector of bias terms and ${\boldsymbol{b}_k}$ is the weight vector associated with the k-th neuron. The k-th activation function ${\phi _k}$ is defined as:
where $\boldsymbol{x}$ is the input vector and ${\boldsymbol\mu _k}$ is the centre vector of the k-th hidden unit. The hidden units are placed on a uniform ${n_I}$ -dimensional domain comprising the region of the input space.
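The local behaviour of the Gaussian units can be illustrated with a short NumPy sketch of the GRBF-NN output. It assumes the mapping is the bias vector plus the weighted sum of the hidden-unit activations, and it assumes the exponent scaling $\left\| {\boldsymbol{x} - {\boldsymbol\mu _k}} \right\|^2/\sigma _k^2$ that is common in the MRAN literature; the function name `grbf_forward` and the toy values are hypothetical.

```python
import numpy as np

def grbf_forward(x, centres, sigmas, weights, b0):
    """Output of a Gaussian RBF network for one input vector x.

    centres (K, n_I): centre vectors mu_k
    sigmas  (K,):     widths sigma_k
    weights (K, n_O): weight vectors b_k, one row per neuron
    b0      (n_O,):   bias vector b_0
    """
    d2 = np.sum((centres - x) ** 2, axis=1)  # squared distances to the centres
    phi = np.exp(-d2 / sigmas ** 2)          # assumed Gaussian scaling
    return b0 + phi @ weights


centres = np.array([[0.0, 0.0], [1.0, 1.0]])
sigmas = np.array([0.5, 0.5])
weights = np.array([[2.0], [-1.0]])
b0 = np.array([0.1])

y_at_centre = grbf_forward(np.array([0.0, 0.0]), centres, sigmas, weights, b0)
y_far_away = grbf_forward(np.array([10.0, 10.0]), centres, sigmas, weights, b0)
```

At the first centre the output is dominated by that neuron's weight, whereas far from all centres every activation vanishes and only the bias remains. An MLP hidden unit, by contrast, responds over an unbounded region of the input space.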
C. Description of the minimal resource-allocating network algorithm
The Resource-Allocating Network (RAN) growth algorithm [Reference Platt34] is a sequential learning technique that is based on three criteria
where ${\boldsymbol\mu _{nr}}$ is the closest of the K neurons to the current input $\boldsymbol{x}(n)$ and M represents the size of a sliding data window that covers a defined number of latest consecutive observations.
When all three growth criteria of Equation (C1) are satisfied, a new neuron $\left( {K + 1} \right)$ is added to the GRBF-NN; otherwise, only the network parameters of the GRBF-NN are updated according to Ref. [Reference Yingwei, Sundararajan and Saratchandran35]. When a new neuron is added, its centre ${\boldsymbol\mu _{K + 1}}$, variance ${\sigma _{K + 1}}$ and weights ${\boldsymbol{w}_{K + 1}}$ are initialised as
where $\kappa $ is the overlap factor that determines how much the responses of the added neuron and of its closest neighbour overlap in the input space.
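The growth logic can be sketched as follows, assuming the form of the criteria commonly given for RAN/MRAN in the literature: the current error norm exceeds $E1$, the distance from the nearest centre exceeds $E2$, and the root-mean-square error over the sliding window of M samples exceeds $E3$. The assignment of $E1$, $E2$ and $E3$ to these three tests, the initialisation of the new neuron (centre at the current input, width equal to $\kappa $ times the distance from the nearest centre, weights equal to the current error) and the name `growth_step` are assumptions made for illustration, not details confirmed by this appendix.

```python
from collections import deque

import numpy as np

def growth_step(x, e, centres, recent_errors, E1, E2, E3, kappa):
    """Check the three (assumed) growth criteria for the current sample.

    x: current input x(n); e: current output error vector;
    centres (K, n_I): existing centres, K >= 1;
    recent_errors: deque(maxlen=M) of error norms (sliding window).
    Returns (grow, new_neuron); new_neuron is None or a dict with the
    centre, width and weights of neuron K + 1.
    """
    recent_errors.append(float(np.linalg.norm(e)))
    d_nr = float(np.linalg.norm(centres - x, axis=1).min())  # nearest centre
    # RMS over the samples currently in the window (fewer than M at start).
    e_rms = float(np.sqrt(np.mean(np.square(recent_errors))))

    grow = (recent_errors[-1] > E1) and (d_nr > E2) and (e_rms > E3)
    if not grow:
        return False, None  # caller updates the existing parameters instead
    return True, {"centre": x.copy(), "sigma": kappa * d_nr, "weights": e.copy()}


window = deque(maxlen=100)
centres = np.array([[0.0, 0.0]])
params = dict(E1=0.15, E2=0.85, E3=0.20, kappa=0.85)

grow_far, neuron = growth_step(np.array([1.0, 0.0]), np.array([0.4]),
                               centres, window, **params)
grow_near, _ = growth_step(np.array([0.1, 0.0]), np.array([0.4]),
                           centres, window, **params)
```

In this toy case the first sample is both poorly predicted and far from the only centre, so a neuron is allocated; the second sample has the same error but lies close to an existing centre, so the network would only update its parameters.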
In order to avoid an excessive increase of the network size, a pruning strategy can also be applied; this modified training strategy is called minimal RAN (MRAN) [Reference Yingwei, Sundararajan and Saratchandran35]. The pruning criterion is defined as
where $\delta $ is a predefined threshold and $r_k^n$ is the normalised output of the k-th hidden unit [Reference Yingwei, Sundararajan and Saratchandran35]. When a neuron does not contribute actively to the network output (with a normalised contribution less than $\delta $) for M consecutive samples, it is considered useless and is therefore pruned.
Therefore, the number of neurons is not defined a priori; instead, the training algorithm populates the single hidden layer according to the growth and pruning logic described above.
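A minimal sketch of the pruning bookkeeping is given below. It assumes that each hidden unit's contribution is normalised by the largest contribution at the current sample, as is common in the MRAN literature, and that a per-neuron counter tracks how many consecutive samples fall below $\delta $. The function name `pruning_step` and the toy values are hypothetical, and a short window of three samples is used only to keep the example small.

```python
import numpy as np

def pruning_step(contrib_norms, low_counts, delta, M):
    """One pruning update.

    contrib_norms (K,): norm of each hidden unit's contribution to the
        current network output.
    low_counts (K,) int: consecutive samples each unit has been below delta.
    Returns (keep_mask, updated_low_counts).
    """
    peak = contrib_norms.max()
    # Normalised outputs r_k^n (assumed: divide by the largest contribution).
    r = contrib_norms / peak if peak > 0 else np.zeros_like(contrib_norms)
    low_counts = np.where(r < delta, low_counts + 1, 0)  # reset when active
    keep = low_counts < M
    return keep, low_counts


contrib = np.array([1.0, 0.5, 1e-6])  # third neuron is almost silent
counts = np.zeros(3, dtype=int)
history = []
for _ in range(3):
    keep, counts = pruning_step(contrib, counts, delta=1e-4, M=3)
    history.append(keep.copy())
```

The third neuron is flagged for removal only after its normalised contribution has stayed below the threshold for the full window; a single active sample resets its counter. The caller is expected to drop the flagged neurons and their counters from the network.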
In this work, training data are randomly shuffled and then used for the training. The resulting GRBF-NN has 32 neurons using the following training parameters:
-
• $E1 = 0.15$
-
• $E3 = 0.20$
-
• ${\epsilon _{max}} = 0.85$
-
• ${\epsilon _{min}} = 0.50$
-
• ${\gamma _{err}} = 0.997$
-
• $\kappa = 0.85$
-
• $M = 100$
-
• $\delta = 0.0001$
where $E2 = \max \left\{ {{\epsilon _{max}}\gamma _{err}^n,{\epsilon _{min}}} \right\}$ and n is the index of the current sample. The values used in this work are inspired by examples in the literature [Reference Yingwei, Sundararajan and Saratchandran35] and tuned by trial and error. The GRBF-NN-U is trained with the same parameters, except for $\kappa = 0.95$.
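The effect of the decaying threshold $E2$ can be checked with a few lines of Python, using the parameter values listed above and assuming that the decay factor in the expression for $E2$ is the listed ${\gamma _{err}}$. The function name `distance_threshold` is hypothetical.

```python
def distance_threshold(n, eps_max=0.85, eps_min=0.50, gamma=0.997):
    """E2 for sample index n: exponential decay with a floor at eps_min."""
    return max(eps_max * gamma ** n, eps_min)


# First sample index at which the floor eps_min becomes active.
n_floor = next(n for n in range(10_000) if distance_threshold(n) == 0.50)
```

With these values the threshold starts at 0.85 and reaches its floor of 0.50 at sample 177. Under this assumption, the coarse spacing between neurons is enforced only during the first couple of hundred samples, after which finer placement is allowed.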