Impact Statement
One of the main reasons that data-based structural health monitoring systems are not widespread in industry is that it is challenging to obtain information from damaged structures. One solution is transfer learning, where damage information from related structures is transferred to structures without damage information. However, in realistic industrial scenarios, it is unlikely that the damage states transferred from one structure are the same as the damage present in another. The situation is further complicated because real structures undergo varying environmental and operational conditions. This article tests the use of transfer learning for damage localisation using disparate types of damage from other structures under different environmental conditions, to evaluate its use in realistic industrial applications.
1. Introduction
In long-term, online, structural health monitoring (SHM) campaigns, the ability to automatically locate damage within a structure is vital. This increase in information, which aids smarter maintenance decisions, is why localisation is the second level of Rytter’s hierarchy of damage identification, directly following damage detection (Rytter, Reference Rytter1993; Farrar and Worden, Reference Farrar and Worden2012). In offshore wind farms, for example, significant savings on time and expenses can be made, if the most probable damage location is available prior to physically investigating the site for repair and maintenance. Moreover, knowledge of the location can help in the decision-making process regarding intervention, especially if the damage has occurred in a safety-critical location. In practice, collecting damage labels that describe locations of damage is expensive and often infeasible, making automatic damage localisation using a data-based approach to SHM rather challenging. In SHM, most data-driven damage localisation methods typically require a network of well-placed sensors close to the damage location in the structure of interest (Stubbs et al., Reference Stubbs, Kim and Farrar1995; Manson et al., Reference Manson, Worden and Allman2003; Chesné and Deraemaeker, Reference Chesné and Deraemaeker2013; Wernitz et al., Reference Wernitz, Chatzi, Hofmeister, Wolniak, Shen and Rolfes2022). This approach can be expensive and time-consuming to implement. This issue motivates population-based SHM (PBSHM), a field of SHM that aims to increase the available labeled data by leveraging information from a population of structures (Worden et al., Reference Worden, Bull, Gardner, Gosliga, Rogers, Cross, Papatheou, Lin and Dervilis2020; Gardner et al., Reference Gardner, Bull, Gosliga, Dervilis and Worden2021b).
Traditional machine learning methods rely on the assumption that training and testing data are drawn from the same underlying distribution (Murphy, Reference Murphy2012). However, differences between structures and their environmental and operational variations (EOVs) will lead to discrepancies in data. Thus, when using data from different structures, traditional machine-learning methods will likely have high generalization error. This issue motivates the application of transfer learning (Pan et al., Reference Pan, Tsang, Kwok and Yang2010), a field of machine learning that aims to use data from related source domains (structures), with a larger amount of labeled data, to improve performance in a target domain with sparse data. A method for reducing the impact of discrepancies between domains is known as domain adaptation (Pan et al., Reference Pan, Tsang, Kwok and Yang2010), which is a branch of transfer learning that aims to map the source and target domains to a shared space. Thus, supervised machine learning methods can be trained using source labels from the source structure and generalized to the target domain in order to undertake diagnostics or prognostics.
In the context of PBSHM, this is the first article to investigate the possibility of damage localisation of operational structures when the source and the target datasets contain different damage severities, and EOVs, via the use of domain adaptation. Here, operational structures refer to real structures in operation that are not under laboratory-controlled environments, and are exposed to the changing environment/weather. Consequently, the procedure investigated in this article can have a number of novel benefits for SHM. If applied across populations of structures, it can locate damage using disparate severities of damage from different structures. For example, one test structure within a population can be used as the source—where large damage states are introduced to explore their effect and collect damage labels—to transfer information to multiple operational target structures with different damage severities. If applied to a single structure, this methodology has the ability to identify and locate damages that have reoccurred in a structure in the same position with a different severity post-repair, for example.
A technique that has been successfully applied in SHM to address domain shift is joint distribution adaptation (JDA) (Long et al., Reference Long, Wang, Ding, Sun and Yu2013). JDA aims to minimize the distance between both the marginal and the conditional distributions in the shared latent space. Traditionally, this method uses naïve pseudo-labels to assess the conditional distributions in the unlabelled target domain. However, this process can produce poor target label estimates if the conditional distributions are very different. Recently, metric-informed joint distribution adaptation (M-JDA) was introduced in Gardner et al. (Reference Gardner, Bull, Dervilis and Worden2021a); it uses the Mahalanobis squared distance between the classes in the source and target domains to find better initial guesses for the pseudo-labels. This approach assumes that, for each damage class, the source cluster will be closest to the corresponding target cluster. In this article, the M-JDA method is applied to an operational mast structure within a population-based SHM framework, in order to locate severe damage states using information from minor damage states.
This is the first article in the context of PBSHM that explores the effectiveness of the domain adaptation methods for damage localisation under realistic conditions, in the presence of class imbalances—where the marginal distributions differ significantly between source and target domains. The aim is to understand and discuss solutions for class imbalances within domain adaptation, which are likely to occur in realistic damage localisation applications when considering time-varying datasets, EOVs, and so forth. Two types of class imbalances are investigated here. The first is concerned with partial domain adaptation where the available target classes are a subset of available source classes. The second is universal domain adaptation where the target structure contains more classes than the source structure. Class imbalances are a major cause of negative transfer, where the performance in the target domain is negatively affected. Negative transfer can occur if the data from the source and target structures are significantly dissimilar, and not related. Avoiding negative transfer is crucial in realistic transfer learning applications in SHM.
1.1. Original contributions
This article aims to use domain adaptation techniques in transfer learning for continuous damage localisation of operational structures, using a population-based SHM approach. This approach has the potential to extend the application of supervised machine learning methods for SHM to scenarios where labeled data are unavailable in the structure of interest. More specifically, the main contributions of the article are:
• A population-based approach is adopted to localize severe damage states in an unlabelled target structure/domain using knowledge of minor damage states from a labeled source structure/domain. A metric-informed joint distribution adaptation method is applied here.
• The localisation method is applied to time-varying data collected from an operational structure under natural excitation, and environmental and operational variations. The challenges of damage localisation when using a realistic dataset from an operational structure are discussed.
• The effect of class imbalance, and methods to address it, are explored in terms of partial and universal domain adaptation.
1.2. Related work
Standard SHM practices require a network of sensors placed in strategic positions in order to locate damage on a structure (Stubbs et al., Reference Stubbs, Kim and Farrar1995; Manson et al., Reference Manson, Worden and Allman2003; Chesné and Deraemaeker, Reference Chesné and Deraemaeker2013; Wernitz et al., Reference Wernitz, Chatzi, Hofmeister, Wolniak, Shen and Rolfes2022). Consequently, large numbers of well-placed sensors and large amounts of collected data are required for damage location. Population-based SHM (Worden et al., Reference Worden, Bull, Gardner, Gosliga, Rogers, Cross, Papatheou, Lin and Dervilis2020; Gardner et al., Reference Gardner, Bull, Gosliga, Dervilis and Worden2021b; Gosliga et al., Reference Gosliga, Gardner, Bull, Dervilis and Worden2021) was proposed to address such issues of SHM by transferring information across structures. The idea is to use labeled information from structures that have been studied in the past to make inferences about the health of other structures for which detailed knowledge may not be available. One of the most successful techniques used in PBSHM to aid this task is transfer learning.
Transfer learning (Pan et al., Reference Pan, Tsang, Kwok and Yang2010) has been successful in transferring knowledge across different domains (source and target) in a wide variety of applications such as bioinformatics, medicine, transportation, and recommender-system applications (Zhuang et al., Reference Zhuang, Qi, Duan, Xi, Zhu, Zhu, Xiong and He2020). Typically, statistical properties of the data in the source and target are compared (Pan et al., Reference Pan, Tsang, Kwok and Yang2010), which can be extended to also comparing the similarities in the geometric structure of the data (Long et al., Reference Long, Wang, Ding, Shen and Yang2013). For monitoring applications, the majority of work on transfer learning has focused on fine-tuning of neural networks (Cao et al., Reference Cao, Zhang and Tang2018; Dorafshan et al., Reference Dorafshan, Thomas and Maguire2018; Gao and Mosalam, Reference Gao and Mosalam2018; Zhu et al., Reference Zhu, Zhang, Qi and Lu2020) and domain adaptation (Michau and Fink, Reference Michau and Fink2019; Gardner et al., Reference Gardner, Liu and Worden2020; Bull et al., Reference Bull, Gardner, Dervilis, Papatheou, Haywood-Alexander, Mills and Worden2021; Xu and Noh, Reference Xu and Noh2021; Gardner et al., Reference Gardner, Bull, Gosliga, Poole, Dervilis and Worden2022). Since this article aims to leverage labeled source data where there are no labeled target data, domain adaptation is the most appropriate technology to investigate.
Domain adaptation, which is a branch of techniques within transfer learning, has gained popularity as a useful method to reduce the difference between a labeled source and an unlabelled target domain, in order to transfer information between them. It is used abundantly in visual applications (Ganin et al., Reference Ganin, Ustinova, Ajakan, Germain, Larochelle, Laviolette, Marchand and Lempitsky2016; Long et al., Reference Long, Zhu, Wang and Jordan2016; Csurka, Reference Csurka2017), natural language processing (Ben-David et al., Reference Ben-David, Blitzer, Crammer and Pereira2006; Blitzer et al., Reference Blitzer, Dredze and Pereira2007), fault detection/condition monitoring (Li et al., Reference Li, Zhang, Ding and Sun2019; Jiao et al., Reference Jiao, Zhao, Lin and Liang2020; Li et al., Reference Li, Song, Jia, Gao, Li and Qiu2020; Wang and Liu, Reference Wang and Liu2020; Ding et al., Reference Ding, Jia and Cao2021; Zhang and Li, Reference Zhang and Li2022; Li et al., Reference Li, Yu, Lei, Li and Yang2023), and PBSHM (Michau and Fink, Reference Michau and Fink2019; Gardner et al., Reference Gardner, Liu and Worden2020; Bull et al., Reference Bull, Gardner, Dervilis, Papatheou, Haywood-Alexander, Mills and Worden2021; Xu and Noh, Reference Xu and Noh2021; Gardner et al., Reference Gardner, Bull, Gosliga, Poole, Dervilis and Worden2022). In PBSHM, domain adaptation, in the form of transfer component analysis, was used to detect damage on a tailplane with incomplete data, by leveraging information from a heterogeneous population of other tailplanes (Bull et al., Reference Bull, Gardner, Dervilis, Papatheou, Haywood-Alexander, Mills and Worden2021).
Domain adaptation was applied in PBSHM to localize damage across a heterogeneous population of aircraft wings in Gardner et al. (Reference Gardner, Bull, Gosliga, Poole, Dervilis and Worden2022). The authors employed graph matching methods alongside domain adaptation techniques (specifically, balanced distribution adaptation; Wang et al., Reference Wang, Chen, Hao, Feng and Shen2017) in order to identify the most suitable location labels to transfer information across in an unsupervised manner, that is, without using target damage labels. The results showed that the maximum common subgraph (Gosliga et al., Reference Gosliga, Gardner, Bull, Dervilis and Worden2021) between the damage location of the two structures provided the best candidate features and classification performances.
In Tsialiamanis et al. (Reference Tsialiamanis, Wagg, Gardner, Dervilis and Worden2021), damage localisation was performed on the Gnat aircraft wing by fine-tuning neural networks within a PBSHM approach. In Gardner et al. (Reference Gardner, Bull, Dervilis and Worden2021a), the metric-informed JDA method was introduced to address the problem of repair and damage localisation on the Gnat aircraft wing using domain adaptation. The authors found that the domain shift between the healthy state and the repair state was larger than between the healthy state and damage. As a result, the M-JDA method was used to map damage states before and after repair in order to increase classification performance. By doing so, it was possible to locate damage of the post-repair target structure to the pre-repair source structure. This successful implementation of the M-JDA method used repeated measurements of the damage scenarios under laboratory environments, where the effect of class imbalance was not addressed. In the current article, the authors explore the suitability of the M-JDA method for continuous damage localisation when the damage states differ in severity, from a structure under natural excitation, and varying EOVs. This article also explores the effect of class imbalance when localizing damage via transfer learning.
To demonstrate the use of domain adaptation for damage localisation in SHM with disparate damage states, Section 2 presents the background of transfer learning, domain adaptation, joint distribution adaptation, and metric-informed joint distribution adaptation. In Section 3, the experimental dataset used in this work is introduced. Localizing severe damage states from minor damage states is explored in Section 4. A discussion on the effect of class imbalance on the localisation method, and possible solutions are then presented in Section 5. Finally, in Section 6, conclusions and future work are stated.
2. Transfer learning and domain adaptation
The aim of this article is to explore the use of transfer learning jointly with methods developed for population-based SHM to localize damage on a structure. In particular, metric-informed joint distribution adaptation (M-JDA) (Gardner et al., Reference Gardner, Bull, Dervilis and Worden2021a) is applied owing to its previous success in SHM. In this section, transfer learning is discussed followed by joint-distribution adaptation and M-JDA.
Transfer learning is a branch of machine learning that seeks to make inferences about data across domains (Pan et al., Reference Pan, Tsang, Kwok and Yang2010). In transfer learning, the aim is to improve the predictive function in a target domain using knowledge from a source domain; put simply, to infer information about the target by leveraging knowledge from the source domain. A domain is defined as $ \mathcal{D}=\left\{\mathcal{X},p(X)\right\} $ , containing a feature space $ \mathcal{X} $ and a marginal probability distribution $ p(X) $ , where $ X={\left\{{\boldsymbol{x}}_i\right\}}_{i=1}^N\in \mathcal{X} $ . The task within the domain is $ \mathcal{T}=\left\{\mathcal{Y},f\left(\cdot \right)\right\} $ . In this case, $ \mathcal{Y} $ is a label space and $ f\left(\cdot \right) $ is a predictive function or a conditional distribution $ p\left(\boldsymbol{y}|X\right) $ , learnt from a training set $ {\left\{{\boldsymbol{x}}_i,{y}_i\right\}}_{i=1}^N $ , where $ \boldsymbol{y}\in \mathcal{Y} $ . The assumption here is that the source and the target domains are not equal and/or that the source and target tasks are not equal.
In domain adaptation, the aim is to improve the target predictive function by learning a mapping that reduces the discrepancies between the source and target distributions. As such, supervised machine learning methods can be learnt using source information, and generalized to the target domain with limited or, as in this article, no labeled target data. Typically, it is assumed that the source and the target feature spaces and label spaces are equal, that is, $ {\mathcal{X}}_s={\mathcal{X}}_t $ and $ {\mathcal{Y}}_s={\mathcal{Y}}_t $ , but the marginal and conditional distributions are not, that is, $ p\left({X}_s\right)\ne p\left({X}_t\right) $ and $ p\left({Y}_s|{X}_s\right)\ne p\left({Y}_t|{X}_t\right) $ . In this article, homogeneous transfer is considered, where a homogeneous label space is assumed, that is, the set of possible classes in each domain is the same. Within the homogeneous assumption, however, it is possible for the domains to share only a subset of all the available classes.
Next, a method that enables the mapping of the marginal and conditional distributions between domains, named joint-distribution adaptation, is discussed (Long et al., Reference Long, Wang, Ding, Sun and Yu2013).
2.1. Joint distribution adaptation
First introduced in Long et al. (Reference Long, Wang, Ding, Sun and Yu2013), joint distribution adaptation (JDA) is a method that learns a mapping that projects the source and target data into a shared latent space, where a nonparametric distribution distance metric, the maximum mean discrepancy (MMD) (Gretton et al., Reference Gretton, Borgwardt, Rasch, Schölkopf and Smola2012), is minimized. The MMD is widely used in domain adaptation (Pan et al., Reference Pan, Tsang, Kwok and Yang2010; Wang et al., Reference Wang, Chen, Hao, Feng and Shen2017); its (squared) empirical estimate is given by,
$$ {\mathrm{MMD}}^2\left({X}_s,{X}_t\right)={\left\Vert \frac{1}{n_s}\sum_{i=1}^{n_s}\phi \left({\boldsymbol{x}}_i^s\right)-\frac{1}{n_t}\sum_{j=1}^{n_t}\phi \left({\boldsymbol{x}}_j^t\right)\right\Vert}_{\mathcal{H}}^2, $$
where $ \phi \left(\cdot \right) $ is a kernel mapping into a reproducing kernel Hilbert space $ \mathcal{H} $ , and $ {n}_s $ and $ {n}_t $ are the numbers of source and target samples. One limitation of typical DA algorithms is that, without knowledge of the labels, minimizing the conditional distribution distance is challenging. JDA aims to alleviate this issue by using pseudo-labels, such that the distance between class-conditional distributions $ p\left({X}_s|{\boldsymbol{y}}_s\right) $ and $ p\left({X}_t|{\hat{\boldsymbol{y}}}_t\right) $ , can be estimated. As such, the MMD between the marginal distributions (all data) and the class-conditional distributions (the MMD between data corresponding to a particular class), is minimized as a proxy for minimizing the joint distribution distance; this metric is called the joint-MMD (JMMD) and it is given by,
$$ \mathrm{JMMD}=\sum_{c=0}^C{\left\Vert \frac{1}{n_s^{(c)}}\sum_{{\boldsymbol{x}}_i^s\in {\mathcal{X}}_s^{(c)}}\phi \left({\boldsymbol{x}}_i^s\right)-\frac{1}{n_t^{(c)}}\sum_{{\boldsymbol{x}}_j^t\in {\mathcal{X}}_t^{(c)}}\phi \left({\boldsymbol{x}}_j^t\right)\right\Vert}_{\mathcal{H}}^2, $$
where $ {\mathcal{X}}_s^{(c)} $ and $ {\mathcal{X}}_t^{(c)} $ denote the samples from class $ c $ , with $ c\in \left\{1,\dots, C\right\} $ , for the source and target domain respectively, $ {n}_s^{(c)} $ and $ {n}_t^{(c)} $ are the numbers of samples for class $ c $ in the source and target, and $ c=0 $ represents the data from all classes, giving the MMD between the marginal distributions. For a more in-depth discussion of JDA, the interested reader is directed to Long et al. (Reference Long, Wang, Ding, Sun and Yu2013).
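As a rough numerical illustration of the quantities discussed above, the following Python sketch estimates the squared MMD through the kernel trick and sums it over the marginal term ( $ c=0 $ ) and the class-conditional terms to give a JMMD-style distance. It is a minimal sketch and not the implementation used in this article: the Gaussian kernel, its width `gamma`, the function names, and the synthetic data are all assumptions made for illustration. JDA additionally learns the projection that minimizes this distance, which is not shown.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mmd2(xs, xt, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    k_ss = rbf_kernel(xs, xs, gamma).mean()
    k_tt = rbf_kernel(xt, xt, gamma).mean()
    k_st = rbf_kernel(xs, xt, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st


def jmmd2(xs, ys, xt, yt_pseudo, classes, gamma=1.0):
    """Marginal term (c = 0) plus one class-conditional term per class."""
    total = mmd2(xs, xt, gamma)
    for c in classes:
        xs_c, xt_c = xs[ys == c], xt[yt_pseudo == c]
        if len(xs_c) == 0 or len(xt_c) == 0:
            continue  # class absent in one domain: term skipped (assumption)
        total += mmd2(xs_c, xt_c, gamma)
    return total


# Synthetic data (hypothetical): three features, three classes labeled 1..3.
rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, size=(200, 3))
ys = rng.integers(1, 4, size=200)
xt_same = rng.normal(0.0, 1.0, size=(200, 3))
xt_shift = xt_same + 2.0  # a simple domain shift

# ys stands in for the target pseudo-labels in this toy example.
print("MMD^2, same distribution:", mmd2(xs, xt_same))
print("MMD^2, shifted target:   ", mmd2(xs, xt_shift))
print("JMMD,  shifted target:   ", jmmd2(xs, ys, xt_shift, ys, [1, 2, 3]))
```

In this sketch, a class that is missing from either domain simply contributes no term; this is one possible convention and not necessarily the one adopted in JDA.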
2.2. Metric-informed joint distribution adaptation
The JDA method discussed in the previous section maps not only the marginal distributions, but also the class-conditional distributions, making it a powerful tool for transfer learning. However, as the target labels $ {\boldsymbol{y}}_t $ are not usually available for this formulation, the target conditional distribution $ p\left({\boldsymbol{y}}_t|{X}_t\right) $ is unknown, making it challenging to match the joint distributions between the source and target datasets. A method for obtaining target pseudo-labels $ {\hat{\boldsymbol{y}}}_t $ in a semi-supervised manner has been discussed in Long et al. (Reference Long, Wang, Ding, Sun and Yu2013), where a base classifier trained on source labels is used to predict labels in the unlabelled target domain. Gardner et al. (Reference Gardner, Bull, Dervilis and Worden2021a) built on this method by considering a metric-informed approach; a normalized distance metric, specifically the Mahalanobis squared distance (MSD), is used to find the most suitable target pseudo-labels, by assuming that for each class, the source cluster is closest to the corresponding target cluster. The MSD has been successfully used for outlier detection in SHM (Worden et al., Reference Worden, Manson and Fieller2000), where it is used to determine whether or not a new datapoint is similar to the normal-condition training data.
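To illustrate why naïve pseudo-labels can become unreliable, the Python sketch below trains a base classifier on labeled source data and uses it to label target data under a small and a large domain shift. It is only a sketch under stated assumptions: a 1-nearest-neighbour rule is assumed as the base classifier, and the class centres, noise level, and shifts are hypothetical values chosen for illustration.

```python
import numpy as np


def nearest_neighbour_pseudo_labels(xs, ys, xt):
    """Label each target row with the class of its nearest source row."""
    sq_dists = ((xt[:, None, :] - xs[None, :, :]) ** 2).sum(axis=2)
    return ys[np.argmin(sq_dists, axis=1)]


# Two hypothetical classes, four units apart along the first feature.
rng = np.random.default_rng(1)
centres = np.array([[0.0, 0.0], [4.0, 0.0]])
labels = np.repeat([0, 1], 100)
xs = centres[labels] + 0.5 * rng.standard_normal((200, 2))
xt = centres[labels] + 0.5 * rng.standard_normal((200, 2))

# Compare a mild shift with one as large as the class separation.
for shift in (0.2, 4.0):
    pseudo = nearest_neighbour_pseudo_labels(xs, labels, xt + [shift, 0.0])
    accuracy = np.mean(pseudo == labels)
    print(f"shift = {shift}: pseudo-label accuracy = {accuracy:.2f}")
```

When the shift is comparable to the separation between classes, target data from one class land on the source cluster of another, so the pseudo-labels are wrong for that class regardless of how well the classifier fits the source data.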
2.3. Pseudo-labeling using distance metrics
The method described in Gardner et al. (Reference Gardner, Bull, Dervilis and Worden2021a) for identifying target pseudo-labels using the MSD is as follows:
• Calculate the MSD for each class in the labeled source domain $ {D}_c^2\left({\boldsymbol{x}}_i^s\right)\forall c\in \left\{1,2,\dots, C\right\} $ by using a sample mean $ \hat{\boldsymbol{\mu}}\left({\mathcal{D}}_s^{(c)}\right) $ and covariance $ \hat{\Sigma}\left({\mathcal{D}}_s^{(c)}\right) $ per class. The MSD in Worden et al. (Reference Worden, Manson and Fieller2000) is defined as,
$$ {D}^2\left({\boldsymbol{x}}_i\right)={\left({\boldsymbol{x}}_i-\hat{\boldsymbol{\mu}}\right)}^{\mathrm{T}}{\hat{\Sigma}}^{-1}\left({\boldsymbol{x}}_i-\hat{\boldsymbol{\mu}}\right), $$
where $ {\boldsymbol{x}}_i $ is the current data point, and $ \hat{\boldsymbol{\mu}} $ and $ \hat{\Sigma} $ are the sample mean and covariance determined from $ X={\left\{{\boldsymbol{x}}_i\right\}}_{i=1}^N $ ( $ X $ is the source labeled data in this case).
• A threshold $ T $ is then calculated for each class using a Monte Carlo approach. The reader is referred to Worden et al. (Reference Worden, Manson and Fieller2000) for more detail.
• The MSD for each class is normalized using the threshold for each class, that is, $ {\overline{D}}_c^2\left({\boldsymbol{x}}_i\right)={D}_c^2\left({\boldsymbol{x}}_i\right)/{T}_c $ . All MSD values below the threshold are subsequently set to zero. The normalization allows for objective comparison between each class.
• A normalized MSD vector is now available for each class in the normalized MSD feature space $ {D}_C^2\in {\mathbb{R}}^{\left({n}_s+{n}_t\right)\times C}=\left\{{\overline{D}}_c^2\left({\boldsymbol{x}}_i\right)\right\}\forall c\in \left\{1,2,\dots, C\right\} $ .
• The $ i\mathrm{th} $ target instance is given the pseudo-label from the class with the minimum normalized MSD, that is, the $ c $ where $ {D}_c^2=\min \left({D}_C^2\right) $ .
For a description of the pseudo-labeling algorithm for metric-informed JDA, the reader is referred to Gardner et al. (Reference Gardner, Bull, Dervilis and Worden2021a).
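The steps listed above can be sketched in Python as follows. This is a simplified illustration under stated assumptions and not the algorithm of Gardner et al. (Reference Gardner, Bull, Dervilis and Worden2021a) as published: the function names are hypothetical, the Monte Carlo threshold is assumed to be a percentile of the largest MSD found in repeated draws of Gaussian data of the same size, and the well-separated synthetic clusters are chosen purely for demonstration.

```python
import numpy as np


def msd(x, mean, cov):
    """Mahalanobis squared distance of each row of x from (mean, cov)."""
    diff = x - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)


def monte_carlo_threshold(n_obs, n_dim, n_trials=1000, percentile=99.0, seed=0):
    """Percentile of the largest MSD over repeated Gaussian draws
    (an assumed form of the Monte Carlo threshold)."""
    rng = np.random.default_rng(seed)
    maxima = np.empty(n_trials)
    for k in range(n_trials):
        z = rng.standard_normal((n_obs, n_dim))
        maxima[k] = msd(z, z.mean(axis=0), np.cov(z, rowvar=False)).max()
    return np.percentile(maxima, percentile)


def msd_pseudo_labels(xs, ys, xt):
    """Give each target row the source class with the smallest
    threshold-normalised MSD."""
    classes = np.unique(ys)
    d_norm = np.empty((len(xt), len(classes)))
    for k, c in enumerate(classes):
        xs_c = xs[ys == c]
        mean_c, cov_c = xs_c.mean(axis=0), np.cov(xs_c, rowvar=False)
        t_c = monte_carlo_threshold(len(xs_c), xs.shape[1], n_trials=200)
        d_c = msd(xt, mean_c, cov_c) / t_c
        d_c[d_c < 1.0] = 0.0  # below the class threshold: set to zero
        d_norm[:, k] = d_c
    # Ties (e.g. zero for several classes) resolve to the first class.
    return classes[np.argmin(d_norm, axis=1)]


# Hypothetical source and target clusters; the target is mildly shifted.
rng = np.random.default_rng(3)
centres = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
labels = np.repeat([1, 2, 3], 100)
xs = centres[labels - 1] + rng.standard_normal((300, 3))
xt = centres[labels - 1] + 0.5 + rng.standard_normal((300, 3))

pseudo = msd_pseudo_labels(xs, labels, xt)
print("pseudo-label accuracy:", np.mean(pseudo == labels))
```

Because values below the threshold are set to zero, a target point that is an inlier for more than one source class produces a tie; the sketch resolves ties by taking the first class, which is an arbitrary choice made here for simplicity.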
The assumption for the above method is that the MSD calculated for each source class, when applied to the unlabelled target data, $ {D}_c^2\left({\boldsymbol{x}}_j^t\right) $ , will provide the smallest MSD values for target data from the same corresponding class. In this article, this assumption is used to locate damage on a structure where the damage in the source and target clusters may differ in severity but is otherwise similar. As a result, it is important to identify damage-sensitive features in both the source and target structures that are also sensitive to the damage location. In the next section, the experimental case study used in this article for damage location is introduced, with a detailed exploration of features used for transfer learning.
3. The experimental dataset from the LUMO structure
The Leibniz University test structure for MOnitoring (LUMO) is a steel lattice mast structure that includes reversible damage mechanisms (Wernitz et al., Reference Wernitz, Hofmeister, Jonscher, Grießmann and Rolfes2021, Reference Wernitz, Hofmeister, Jonscher, Grießmann and Rolfes2022). Located in Hannover, Germany, the structure provides a benchmark dataset for SHM, containing data pertaining to its dynamic, mechanical, and thermal behavior. Comprehensive documentation of the structure, as well as the open-source measurement data, can be found at https://data.uni-hannover.de/dataset/lumo. In this article, the data collected from the structure between September 2020 and July 2021 are explored.
The structure, which stands at 9 m in height and weighs 90 kg, has three tubular legs, 21 bracing levels, and short connections at each bracing level. Figure 1a presents a photograph of the structure. LUMO is equipped with 18 uni-axial accelerometers located across 9 measurement levels (ML), which measure its orthogonal deflection and the horizontal motion. Pairs of accelerometers are placed at each ML in the x- and y-directions. At the 10th measurement level (ML10), three strain gauges (one on each of the three legs) and a thermocouple also measure the strain and the temperature of the structure, respectively. At six locations throughout the structure, controlled damage states can be introduced using reversible damage mechanisms. Figure 1b presents a schematic that highlights the measurement levels, the reference axes, and the locations that contain the damage mechanisms. Figure 1c shows the damage mechanisms that are installed on the lowest damage level of the structure. An in-depth discussion of the damage states is given in Section 3.1.
As the structure is situated outside, it is excited by natural sources. As a result, the structure represents a realistic case for SHM. Given its location, the structure is exposed to, and affected by, environmental and operational variations (EOVs). These environmental variations can stem from seasonal and daily changes in temperature, changes in wind direction, ice build-up, and freezing temperatures, to name a few. Each time the reversible damage states are repaired, operational variations are also introduced to the structure, which are discussed in detail in Wickramarachchi et al. (Reference Wickramarachchi, Poole, Hübler, Jonsher, Hofmeister and Rolfes2023).
3.1. Damage states of the LUMO dataset
The main feature of the LUMO structure is the ability to introduce and repair damage states in a controlled manner, whilst the structure operates under realistic environments. The damage states essentially introduce stiffness (and possibly mass) changes to the structure by removing the damage mechanisms (Figure 1c) connected to the bracing supports. These mechanisms are present in six locations of the structure (DAM1–DAM6 in Figure 1b). At each of these levels, one, two, or three damage mechanisms can be removed.
The dataset used in this article contains measurements collected during six individual damage scenarios across three locations. At each of these three damage locations, a minor damage and a major/severe damage are introduced. A minor damage is when one damage mechanism is removed, and a major/severe damage is when all three damage mechanisms are removed. Table 1 presents the damage states, their locations, and the corresponding labels assigned in the dataset used in this work.
Note. Each damage state and each location is assigned a label. The damage states are listed in the order they were introduced to the structure.
As the structure has a triangular cross-section, removing just one damage mechanism from a brace level leads to an asymmetrical reduction in structural stiffness. As a result, discrepancies between the data from the accelerometers that capture the dynamic behavior of the structure are to be expected.
It is the aim of this article to locate damage states using domain adaptation when the labels of the target structure are assumed unavailable. Therefore, it is pivotal that damage-sensitive features are obtained from the measured data in order to make inferences about damage locations. In the next section, the feature selection process is discussed and the selected features are provided.
3.2. Feature selection
Feature selection is a vitally important step in any monitoring campaign in order to identify damage-sensitive features that provide useful information to the model. In this article, the eigenfrequencies and the mode shapes of the structure are considered as features, because they are sensitive to the damage states described in Table 1. The Bayesian operational modal analysis (BAYOMA) and mode tracking methods described in Au et al. (Reference Au, Zhang and Ni2013) and Jonscher et al. (Reference Jonscher, Hofmeister, Grießmann and Rolfes2023) are used to extract the natural frequencies from the acceleration signals. For a detailed description of the dynamics of the structure, the reader is referred to Wernitz et al. (Reference Wernitz, Hofmeister, Jonscher, Grießmann and Rolfes2022), where 15 modes (first 5 bending modes in the x-direction, first 5 bending modes in the y-direction, and first 5 torsional modes) were identified from the current sensor setup.
One of the main benefits of the LUMO dataset is the continuous nature of the monitoring campaign. The structure operates in an environment representative of realistic operational civil structures and is purely excited via natural sources. In order to replicate a realistic application of a continuous SHM campaign, the behavior of the natural frequencies throughout time is considered.
The dynamic behavior of the LUMO structure, given its dimensions and physical composition, is predisposed to producing closely spaced modes, that is, some of the natural frequencies occur very close to one another. In such cases, even the smallest system changes are particularly noticeable through a change in the alignment of the bending mode shape in the mode subspace (Jonscher et al., Reference Jonscher, Liesecke, Penner, Hofmeister, Grießmann and Rolfes2023). However, the alignment is also subject to the greatest uncertainty. Consequently, the mode tracking algorithm used to obtain the natural frequencies can, at times, misclassify the data and introduce large data scatter. This effect is especially prominent during damage (Jonscher et al., Reference Jonscher, Hofmeister, Grießmann and Rolfes2023; Wickramarachchi et al., Reference Wickramarachchi, Poole, Hübler, Jonsher, Hofmeister and Rolfes2023). As a result, a number of bending modes that are affected by these misclassifications are excluded from the analysis in order to avoid propagating these errors into later stages.
The torsional modes are also excluded from the input features in this work. Although the torsional modes are highly sensitive to damage of the LUMO structure, they are affected by the mode tracking misclassification, as a result of the sensor setup. Moreover, the use of torsional modes is not prevalent in industrial applications as they are not typically highly sensitive to changes within structures. Consequently, the torsional modes are not considered further, to ensure the industrial applicability of the damage localisation method developed here.
The bending modes in the x-direction are also excluded during the feature selection process because of their low sensitivity to the minor damage states. To demonstrate the sensitivities of the x- and y-direction eigenfrequencies to the minor damage states, principal component analysis (PCA) is used. Here, PCA is calculated on the natural frequencies during the time period where minor damages were introduced to the structure (20.04.2021 to 12.07.2021). Table 2 presents the principal component coefficients for the first two principal components (PC1 and PC2), which capture over 95.97% of the total variance. The results show that the bending modes in the y-direction have a much higher contribution to the variance compared to those in the x-direction. As the minor damage states are asymmetric (they are initiated by removing the damage mechanism between the second and third legs) in this dataset, the accelerometers in the y-direction are more sensitive than those in the x-direction. As a result, x-direction data are omitted from the dataset, purely from a feature-selection standpoint, given their reduced sensitivity to some of the damages.
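The kind of PCA comparison described above can be sketched in Python as follows. The sketch is illustrative only: the data are synthetic stand-ins for tracked natural frequencies (three hypothetical x-direction and three y-direction columns), the features are centred but not scaled, which is an assumption, and the function name is hypothetical.

```python
import numpy as np


def pca_coefficients(x, n_components=2):
    """Return PCA coefficients (loadings) and explained-variance ratios."""
    x_centred = x - x.mean(axis=0)
    _, sing_vals, vt = np.linalg.svd(x_centred, full_matrices=False)
    explained = sing_vals**2 / np.sum(sing_vals**2)
    return vt[:n_components].T, explained[:n_components]


# Synthetic frequencies: y-direction columns share a common variation
# (standing in for a damage effect); x-direction columns barely move.
rng = np.random.default_rng(2)
n_obs = 500
latent = rng.standard_normal((n_obs, 1))
freq_x = 0.01 * rng.standard_normal((n_obs, 3))
freq_y = latent @ np.array([[1.0, 0.8, 0.6]]) + 0.01 * rng.standard_normal((n_obs, 3))
features = np.hstack([freq_x, freq_y])  # columns: x1, x2, x3, y1, y2, y3

coeffs, explained = pca_coefficients(features)
print("explained variance ratios (PC1, PC2):", explained)
print("absolute PC1 coefficients:", np.abs(coeffs[:, 0]))
```

Features with large absolute coefficients on the leading components are the ones that drive most of the variance, which is the sense in which the coefficients in Table 2 are read here.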
Finally, considering the above, the y-direction bending modes one (B1-y), two (B2-y), and four (B4-y) are chosen as suitable features that encapsulate damage. These features are presented in Figure 2, where each datapoint represents 10 minutes of measurements. The sensitivity of these features to damage states (D1–D6) can be seen in this figure. In this article, the data pertaining to damage states D1–D6 are extracted manually from the chosen input features in Figure 2. It is also possible to automatically extract input data collected during damage by using a damage detection method, such as the one suggested in Wickramarachchi et al. (Reference Wickramarachchi, Poole, Hübler, Jonsher, Hofmeister and Rolfes2023).
During the months of January to March, large shifts in the natural frequencies are present. By studying the temperature of the structure and the surrounding area (see Footnote 1), it is found that these shifts correlate with freezing temperatures. As a result, it is assumed that a stiffening effect (SE) on the structure has caused these shifts, though inspections of the structure during this period were not conducted.
In the next section, domain-adaptation techniques are utilized to localize the severe damage states of the LUMO structure by considering the location of the minor damage states.
4. Damage localisation using M-JDA
Damage localisation, the second step in Rytter’s hierarchy (Rytter, Reference Rytter1993), aims to find the location of the damage present in the system. As discussed previously, damage localisation in SHM through domain adaptation has been attempted in past literature to address the problem of repair, on the Gnat aircraft wing (Gardner et al., Reference Gardner, Bull, Dervilis and Worden2021a). The methods in question have relied on repeated measurements of the same damage scenarios under laboratory environments, for knowledge transfer. Although these methods have proven to be successful, the scenarios in which they are attempted are unlikely in operational structures, that is, it is unlikely that the same damage will repeat in the same position over the life of a structure. A more realistic situation is that damage could reoccur in the same position with a different severity. For example, a large damage could follow an inadequate repair of a small damage. Consequently, this section explores damage localisation in the presence of damage states of different severity that appear in the same locations. Here, the metric-informed joint distribution adaptation (M-JDA) method is employed via a population-based approach, where in the first instance, the number of damage classes in the source domain is equal to the number of damage classes in the target domain.
In order to identify source and target domains for M-JDA, the LUMO mast is first viewed as a homogeneous population.
4.1. Viewing the LUMO structure as a population
To localize severe damage states by transferring knowledge from minor damage states, the LUMO structure is considered a homogeneous population with source and target domains. The source domain includes the data collected during the minor damage states. The target domain includes data collected during the severe damage states. Table 3 presents the way in which the data have been separated. The assumption here is that for the source domain, the class labels and the parameters are known, that is, the shape of each cluster, the data within each cluster, and the location label are known for damage states D4–D6. Within the target domain (D1–D3), the number of classes, as well as the class labels (locations and severity), are assumed to be unknown.
At this stage, statistical alignment techniques developed for domain adaptation are used on the source and target domains in order to align the lower-order statistics. The aim here is to reduce the distance between the source and target domains (and, therefore, the effects of any EOVs) to aid positive transfer. Specifically, normal condition alignment (NCA), explained in Poole et al. (Reference Poole, Gardner, Dervilis, Bull and Worden2022) and used in Wickramarachchi et al. (Reference Wickramarachchi, Poole, Hübler, Jonsher, Hofmeister and Rolfes2023) to address the repair problem of the LUMO structure, is used here. NCA is a domain adaptation method that is robust to class imbalance, making it a suitable technique in this work, where the number of possible classes in the target domain may be unknown. NCA uses affine transformations to align the first two statistical moments of normal condition data in order to reduce the distribution distance and improve the generalization of a source classifier. In this article, the first two statistical moments (extracted from the first 7 days) of the normal condition data, captured prior to each damage state, are used to standardize each normal-condition state and the following damage state. For example, from Figure 2, the statistics from the first 7 days of normal condition after damage state D1 are used to standardize the normal condition between D1 and D2 and damage state D2. This type of standardization can be extremely useful in PBSHM to reduce domain shifts initiated from any number of changes, including EOVs, across populations. The resulting features, post-standardization, are presented in Figure 3.
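The standardization step described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the article: the function name `normal_condition_align`, the feature-by-feature (diagonal) standardization, and the toy frequency values are all hypothetical.

```python
from statistics import mean, stdev

def normal_condition_align(reference, data):
    """Standardize `data` feature-by-feature using the mean and sample
    standard deviation of `reference` (normal-condition samples).

    Both arguments are lists of samples; each sample is a list of features.
    """
    columns = list(zip(*reference))           # one tuple per feature
    mu = [mean(col) for col in columns]       # first moment per feature
    sigma = [stdev(col) for col in columns]   # second moment per feature
    return [[(x - m) / s for x, m, s in zip(sample, mu, sigma)]
            for sample in data]

# Hypothetical natural frequencies (Hz) for two features.
# `normal_reference` stands in for the first 7 days of normal-condition data.
normal_reference = [[2.00, 5.00], [2.02, 5.04], [1.98, 4.96]]
damage_state = [[1.90, 5.00], [1.92, 5.02]]

aligned_normal = normal_condition_align(normal_reference, normal_reference)
aligned_damage = normal_condition_align(normal_reference, damage_state)
```

After this step the reference normal-condition data have zero mean and unit variance in every feature, so a damage cluster is expressed as an offset from the healthy state in units of normal-condition scatter.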
After standardization, the natural frequencies displayed in Figure 3 present interesting observations. Relative to the normal condition/healthy states (H), the clusters that represent damage state data (D1–D6) always appear in a different direction from the SE state. For example, when considering B1-y versus B2-y, the damage clusters are formed in the horizontal direction whereas the SE state moves vertically (and diagonally) from the H state. Given that the damage states introduce a stiffness reduction to the structure, this apparent change in direction of the SE state suggests that it is a result of a stiffness increase globally across the structure due to freezing weather. The statistical alignment techniques used here have made it possible to find a physical interpretation of the clusters and their position in the feature space, which can be extremely useful when attempting to localize damage.
From the features in Figure 3, the damage states described in Table 3 are then extracted for damage localisation. The behavior of these features used in the source and target structures can be studied in Figure 4a, where standardized data are presented. It is clear that there is a common pattern between the source and target structures, as the locations have an influence on the damage state data in the feature space. Physical interpretability of the feature space can be extremely helpful in transfer learning when labels for the target structure are unavailable (as assumed in this article and demonstrated in Figure 4b); the confidence in the mapping may be increased as the feature space makes sense in reality. Some physical knowledge could also be incorporated into the process in order to make decisions about feature choice.
The features used in this work have been carefully engineered in order to pursue a successful localisation technique whilst reducing the risk of negative transfer (Bull et al., Reference Bull, Gardner, Dervilis, Papatheou, Haywood-Alexander, Mills and Worden2021). In transfer learning, negative transfer refers to the scenario where the performance of the transfer learner is worse than if a learning model was conducted solely on the target domain (Gardner et al., Reference Gardner, Bull, Gosliga, Dervilis and Worden2021b). The choice of features is important in any machine learning technique, as the chosen features should reflect the system well, and be sensitive to the attributes that are being discriminated. In the presence of unsuitable features, negative transfer can occur even if the system in question is very similar (Pan et al., Reference Pan, Tsang, Kwok and Yang2010; Gardner et al., Reference Gardner, Bull, Dervilis and Worden2021a, Reference Gardner, Bull, Gosliga, Dervilis and Worden2021b). The features used in this work are a sensible choice, as they represent the natural frequencies of the structure in the direction of the largest deflection, which are known to be sensitive to the presence of damage. The normalization techniques used here transform the data in a way in which the physical interpretability is preserved.
Figure 4a also shows that the conditional and marginal distributions between the source and target clusters are different, that is, $ p\left({X}_s\right)\ne p\left({X}_t\right) $ and $ p\left({\boldsymbol{y}}_{\boldsymbol{s}}|{X}_s\right)\ne p\left({\boldsymbol{y}}_{\boldsymbol{t}}|{X}_t\right) $ . As the severity of the damage states is different in the source structure compared to the target structure, this observation is not surprising. However, the structure of the classes within each domain shows similarity. This structural information in the feature space is leveraged to aid transfer. The differences in the direction and distance between each pair of classes in the source and target domains motivate the use of joint distribution adaptation, which aims to match both the conditional and marginal distributions. Additionally, the metric-informed approach is well suited here given the similarity in the within-domain classes, as well as the differences in the conditional distributions; a very large difference in conditional distributions can cause negative transfer when using naïve pseudo-labeling (Gardner et al., Reference Gardner, Bull, Dervilis and Worden2021a).
4.2. Localizing severe damage states using M-JDA
Once the source and target domains are identified (Table 3) and suitable features are selected, the M-JDA approach can be applied to the LUMO dataset for damage localisation.
First, the location information is inserted into the source domain data by replacing their labels $ {\boldsymbol{y}}_{\boldsymbol{s}} $ . For example, damage cluster D4 is relabelled as L1. Then, the source and the target datasets in Table 3 are separated further into training and testing sets. The training data are used to obtain the estimates for $ {\boldsymbol{y}}_{\boldsymbol{s}} $ and $ W $ ; the training data from the source and target sets are used to infer the mapping of the domain adaptation, where the target training data are unlabelled (Gardner et al., Reference Gardner, Bull, Dervilis and Worden2021a). In this work, 300 samples are randomly chosen from each class for the training sets and the rest are used for testing.
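The per-class random split described above can be sketched as below. This is a hedged illustration only: the helper name `split_per_class`, the fixed seed, and the class sizes are hypothetical and are not taken from the LUMO dataset.

```python
import random

def split_per_class(labels, n_train, seed=0):
    """Randomly pick `n_train` sample indices of each class for training;
    all remaining indices form the test set."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx = []
    for indices in by_class.values():
        train_idx.extend(rng.sample(indices, n_train))
    train_set = set(train_idx)
    test_idx = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train_idx), test_idx

# Hypothetical location labels for three classes of unequal size.
labels = ["L1"] * 400 + ["L2"] * 383 + ["L3"] * 500
train_idx, test_idx = split_per_class(labels, n_train=300)
```

Drawing the same number of training samples from each class keeps the training set balanced even when the classes differ in size, and the unequal remainder ends up in the test set.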
The training and testing data from the source and the target domains are then normalized using the first and second moments of the corresponding source and target training sets, respectively. Then, the pseudo-labeling procedure described in Section 2.3 is undertaken on the training data. Figure 5 presents the (predicted) pseudo-labels obtained for the training sets. It is clear that the initial labels in the target training set match those of the source training set, suggesting that the metric-informed approach has proven helpful here.
Next, the JDA method presented in Section 2.1 is conducted on the training data to calculate $ W $ , which is used to find the transformed feature spaces for source and target training and testing data. Here, the Gaussian kernel $ K\left(x,y\right)=\exp \left(-\parallel x-y{\parallel}^2/\left(2{L}^2\right)\right) $ (where $ L $ is a scale parameter) is used to provide an embedding of the features into the RKHS to calculate the MMD in equation (1). The scale parameter, or width of the kernel, $ L $ is calculated here using the median heuristic (Gretton et al., Reference Gretton, Borgwardt, Rasch, Schölkopf and Smola2012; Garreau et al., Reference Garreau, Jitkrittum and Kanagawa2017), as it has been shown to be robust in numerous studies for PBSHM (Gardner et al., Reference Gardner, Liu and Worden2020, Reference Gardner, Bull, Gosliga, Poole, Dervilis and Worden2022; Poole et al., Reference Poole, Gardner, Dervilis, Bull and Worden2022). The Gaussian kernel is a characteristic kernel that is continuous and universal in the RKHS, making the MMD a metric on the space of Borel probability measures, that is, $ \mathrm{MMD}\left(p,q\right)=0 $ if, and only if, $ p=q $ (Fukumizu et al., Reference Fukumizu, Gretton, Sun and Schölkopf2007; Gretton et al., Reference Gretton, Sejdinovic, Strathmann, Balakrishnan, Pontil, Fukumizu and Sriperumbudur2012).
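To make the kernel and its width concrete, the sketch below evaluates the Gaussian kernel above, sets $ L $ with one common form of the median heuristic (the median pairwise Euclidean distance over the pooled data), and computes a biased empirical estimate of the squared MMD. The estimator, the function names, and the toy data are assumptions for illustration; the article's own MMD formulation is its equation (1), which is not reproduced here.

```python
import math
from statistics import median

def sq_dist(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def median_heuristic(points):
    """Kernel width L as the median pairwise Euclidean distance."""
    dists = [math.sqrt(sq_dist(points[i], points[j]))
             for i in range(len(points)) for j in range(i + 1, len(points))]
    return median(dists)

def gaussian_kernel(x, y, L):
    """K(x, y) = exp(-||x - y||^2 / (2 L^2))."""
    return math.exp(-sq_dist(x, y) / (2.0 * L ** 2))

def mmd2(X, Y, L):
    """Biased empirical estimate of the squared MMD between samples X and Y."""
    def mean_k(A, B):
        total = sum(gaussian_kernel(a, b, L) for a in A for b in B)
        return total / (len(A) * len(B))
    return mean_k(X, X) + mean_k(Y, Y) - 2.0 * mean_k(X, Y)

# Hypothetical two-dimensional source and target samples.
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
target = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]
L = median_heuristic(source + target)
shift = mmd2(source, target, L)   # positive when the samples differ
```

The estimate is zero when both samples are identical and grows as the two clouds separate, which is the quantity a JDA-style mapping tries to reduce.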
In order to classify the transformed data within and following the M-JDA approach, a K-nearest neighbor (KNN) classifier with 10-fold cross-validation is employed. As the JDA and M-JDA methods map the data between the source and target domains to reduce their Euclidean distances, the KNN classifier is an appropriate choice here, as its aim is to find the closest class in the training set. These classifiers are only trained on the source training data. They are then applied independently to the source testing, target training, and testing sets.
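The classification step can be illustrated with a minimal nearest-neighbor sketch, in which the classifier only ever sees labelled source data and is then applied to target data. The function name `knn_predict`, the choice `k=1`, and the toy coordinates are assumptions; the 10-fold cross-validation used in the article is omitted for brevity.

```python
from collections import Counter

def knn_predict(train_X, train_y, query, k=1):
    """Label `query` by majority vote among its k nearest training points
    (squared Euclidean distance)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(train_X[i], query)))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical transformed features: labelled source, unlabelled target.
source_X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0],
            [1.1, 0.9], [2.0, 0.0], [2.1, 0.1]]
source_y = ["L1", "L1", "L2", "L2", "L3", "L3"]
target_X = [[0.05, 0.05], [0.95, 1.05], [2.05, -0.05]]

# The classifier is fitted on source data only, then applied to the target.
target_pred = [knn_predict(source_X, source_y, x, k=1) for x in target_X]
```

Because the prediction depends only on distances to labelled source points, it works well only if the mapping has placed each target cluster close to the source cluster with the same location.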
4.3. Locating severe damage states: results
The results of severe damage localisation in the target domain when using minor damage states in the source domain are presented in Figures 5 and 6. Here, the regularization (or trade-off) parameter $ \mu $ is set to 0.1 (following the analysis in Pan et al., Reference Pan, Tsang, Kwok and Yang2010; Gardner et al., Reference Gardner, Bull, Dervilis and Worden2021a), and the number of transfer space dimensions $ k $ is set to 2 (as dimensionality reduction is expected from a feature space with a dimension of 3). For the Monte Carlo approach to obtain the pseudo-labels, 10,000 simulations are calculated to find each threshold with a 99% confidence bound.
In Figure 5, the confusion matrices between the true labels and the predicted labels obtained using the MSD metric show that the pseudo-labeling regime is robust during training.
In Figure 6, the left panel presents the training data mapping whilst the right panel displays the testing data mapping. It is clear that the M-JDA approach has successfully mapped the severe damage states to the minor damage states, thereby localizing the damage on the target structure, using information from the source. In comparison to classifying the normal-condition aligned features (Figure 4a), the M-JDA approach alongside NCA data has improved the classification accuracy of the target dataset significantly, as seen in Figure 7. These results demonstrate the advantage of using metric-informed labels over naïve pseudo labels.
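In order to evaluate the performance of the classifiers, the $ {\mathrm{F}}_1 $ scores and accuracy values are calculated. The $ {\mathrm{F}}_1 $ scores are calculated by,
$$ {\mathrm{F}}_1=\frac{2\mathrm{TP}}{2\mathrm{TP}+\mathrm{FP}+\mathrm{FN}}, $$
where TP are the true positives, FP are the false positives and FN are the false negatives. The macro $ {\mathrm{F}}_1 $ score is used to demonstrate the average for all classes. The accuracy is calculated by,
$$ \mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}, $$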
where TN are the true negatives. The classification performances of the training data and testing data mapping are presented in Table 4. The results show that the classifier has performed extremely well following the M-JDA approach when using normal-condition aligned data. For comparison, Table 5 presents the KNN classification performances when using non-aligned data in the M-JDA formulation. It is clear that the model is unable to generalize in the target domain because the feature space lacks physical interpretability in this scenario. These results indicate that a method such as NCA is required to reduce EOV differences between source and target structures, in order to attempt successful transfer between operational structures. Consequently, the combination of M-JDA and NCA has proven useful for SHM to locate damage using labels from disparate damage states and structures under different EOVs.
Note. Here, normal condition-aligned data are used in the M-JDA method.
Note. Here, minor damage states are in the source domain and severe damage states are in the target domain.
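As a worked illustration of the two performance measures, the sketch below computes the macro-averaged $ {\mathrm{F}}_1 $ score and the accuracy for a small set of hypothetical location labels. It assumes the standard per-class definition $ {\mathrm{F}}_1=2\mathrm{TP}/\left(2\mathrm{TP}+\mathrm{FP}+\mathrm{FN}\right) $ and, for the multi-class case, takes accuracy as the fraction of correctly labelled samples; the labels and function names are invented for the example.

```python
def per_class_counts(y_true, y_pred, cls):
    """True positives, false positives and false negatives for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return tp, fp, fn

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in classes:
        tp, fp, fn = per_class_counts(y_true, y_pred, cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical true and predicted damage locations.
y_true = ["L1", "L1", "L2", "L2", "L3", "L3"]
y_pred = ["L1", "L2", "L2", "L2", "L3", "L3"]

f1 = macro_f1(y_true, y_pred, ["L1", "L2", "L3"])
acc = accuracy(y_true, y_pred)
```

Here the per-class scores are 2/3, 4/5 and 1, so a single misplaced L1 sample lowers the macro average more than it lowers the accuracy, because every class contributes equally to the macro score.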
These results can be extremely useful for population-based SHM, where information pertaining to the type, location, or severity of damage may not be available from many structures within the population. By transferring knowledge from the source structure where labels are available, it is possible to learn more information about the target structure; in this case, the location of the damage is determined. The results here show that severe damage states can be located using smaller/minor damage states, as long as the data-based features follow a similar pattern. Localizing damage can be extremely useful when deploying engineers to repair structures, as they can immediately concentrate on an optimal location without having to inspect the entire structure.
The success of this method is attributed to the similarity in the damage mechanisms; both severe and minor damage states are caused by a local reduction of stiffness in the structure, which drives the dynamic response measured by the accelerometers. However, features collected from the accelerometers in the y-direction are more sensitive to the damage states compared to those facing the x-direction. This may be because the severe damage states in the LUMO structure introduce a symmetric reduction in stiffness (as all three bracing supports are affected), whereas the minor damage states introduce an asymmetric reduction in stiffness (given that only one bracing support is affected in a structure with a triangular cross-section). Therefore, it is possible that the sensitivity of the accelerometers to the damage may vary if a different bracing support (in the same location) is removed instead. This issue can be avoided by placing the accelerometers in the y-direction on a separate leg to the x-direction. Also, in structures with circular cross sections such as wind turbine towers, the issue of asymmetric damage states does not arise.
Given the natural frequencies of the source and target structures have been used here for damage localisation, drawbacks of traditional SHM localisation techniques are avoided for a structure of this scale; specifically, a network of well-placed sensors near the damage location of the target structure is not necessary for damage localisation in a PBSHM setting, when using damage sensitive natural frequencies. Consequently, the number of sensors within the target structure could be reduced when using this method, saving installation times and costs.
4.4. Influence of the training dataset size
As the transfer mappings are ascertained using training data, the influence of the training dataset size on the performance of the classifiers is scrutinized in Figure 8a,b. In these figures, the $ {\mathrm{F}}_1 $ scores and classification accuracies are presented for each training and testing phase in both the source and target domains. As mentioned previously, 300 datapoints from each class (each damage state) have been used for training in this article up to now. To test the influence of the training dataset size, this value is increased to 306 datapoints, which amounts to 80% of the data within the smallest class (the class with the fewest datapoints). Therefore, the total number of datapoints in the source training set is 918 (306 from each of the three classes), whilst the same number is used in the target training set. The lowest amount of training data used is 10% of the smallest class, which is 38 datapoints (or around 6 hr of data, because each datapoint represents 10 min). It is clear that the training dataset size has little influence on the performance of the classifiers when conducting metric-informed joint distribution adaptation; the classification performance values are all in the range of 99.5–100. This result demonstrates the importance of the feature selection process, as well-chosen features can reduce the burden on the methods that follow. This M-JDA method could therefore be implemented in operational structures where very little training data is available, as long as the chosen features show a similar behavior in both domains (in order to avoid negative transfer).
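The dataset sizes quoted above can be checked with a few lines of arithmetic. The variable names are hypothetical, and the implied size of the smallest class is inferred from the stated 80% figure rather than given in the text.

```python
MINUTES_PER_DATAPOINT = 10
N_CLASSES = 3

largest_per_class = 306    # stated as 80% of the smallest class
smallest_per_class = 38    # stated as 10% of the smallest class

# Largest training set: 306 datapoints from each of the three classes.
total_training_points = largest_per_class * N_CLASSES

# Duration covered by the smallest per-class training set, in hours.
hours_smallest = smallest_per_class * MINUTES_PER_DATAPOINT / 60

# Size of the smallest class implied by "306 is 80% of it" (an inference).
implied_smallest_class = largest_per_class / 0.8
```

This gives 918 training datapoints in total and roughly 6.3 hr of measurements for the smallest setting, consistent with the values quoted in the text.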
4.5. Transferring severe damage information to locate minor damage states
Unsurprisingly, the NCA + M-JDA method can be implemented in the opposite direction, where the location of the severe damage states (source) can be used to determine the location of the minor damage states (target). Table 6 presents the results from this case, where a high classification performance is achieved. Figure 9 presents a comparison of model performances between classifying the normal-condition aligned data and the M-JDA method in this case. The M-JDA method shows much higher classification performances, yet again. These results show that utilizing transfer learning methods in a PBSHM setting enables successful localisation of minor damage states in the LUMO structure. In comparison, when using standard SHM practices (analyzing the LUMO structure as a single structure), Kalman filter-based methods found it challenging to localize minor damage states (Wernitz, Reference Wernitz2022). In practice, the M-JDA method may be suitable if one test structure within a population is used as the source, where large damage states are introduced to explore its behavior and collect damage labels. Then, label information could be mapped onto structures in operation that present minor damage states (unlabelled target structures).
So far, the damage localisation method using domain adaptation techniques has relied heavily on balanced classes in the source and target domain, that is, the number of damage classes in the source domain is equal to the number of damage classes in the target domain. In the next section, the effect of class imbalance on damage localisation of the LUMO structure is investigated.
5. The effect of class imbalance on damage localisation
In the previous section, a successful method for localizing severe damage states from minor damage states is developed via the use of metric-informed joint distribution adaptation combined with normal condition alignment. The method mapped three damage states in the target domain on to three damage states that shared the same location in the source domain. In realistic applications of this method, however, the presence of class imbalances is likely—specifically, the extreme case where no data for a given class is available, that is, the number of damage classes in the source domain may not be equal to the number of damage classes in the target domain. Therefore, in this section, the effect of class imbalance on M-JDA and the classifier performance is evaluated, and methods for addressing issues with class imbalances are suggested. First, the imbalances of damage classes within the formulation are explored. Then, the class imbalances from environmental and operational variations, specifically, the stiffening effects of the LUMO structure due to freezing conditions are studied.
5.1. Damage class imbalance
In the first example explored, one damage class is withheld from either the source or the target dataset, so the problem deviates from the homogeneous label space assumption made by conventional domain adaptation. The idea here is to investigate (a) the effect of having a source structure that contains more classes than the target structure, a paradigm referred to as partial domain adaptation (Cao et al., Reference Cao, Ma, Long and Wang2018), where the available target classes consist of a subset of the available source classes, and (b) the effect of the target structure containing more classes than the source structure, an idea also known as universal domain adaptation (Zhuang et al., Reference Zhuang, Qi, Duan, Xi, Zhu, Zhu, Xiong and He2020). The same method of NCA + M-JDA and KNN classification from the previous section is applied here, with the only difference being the subset of classes used in the source and target domains. Table 7 presents the classification performances during a class imbalance. In this table, the results from six scenarios are presented alongside the results from the previous section where the classes are balanced. In scenarios one to three, one class is withheld from the target domain. For example, in the entry Scenario 1 that states Source (All), Target (L2 and L3), the data pertaining to L1 in the target domain are omitted from the mapping. As a result, the source domain contains more classes than the target domain (partial domain adaptation). In scenarios four to six, one location is excluded from the source domain, so that the target domain contains more classes than the source domain (universal domain adaptation).
Note. The locations (or classes) that are included in each M-JDA formulation are given in the second column. For example, the results for Scenario 1, Source (All), Target (L2 and L3) refer to the case where all the classes (L1–L3) are included in the source domain while only the L2 and L3 classes are included in the target domain. Macro-averaging of the $ {\mathrm{F}}_1 $ scores is used here as it treats all classes as equal, and therefore, is insensitive to class imbalances.
Some interesting observations can be made from the classification performances. Firstly, as expected, the performance of the classifiers reduces significantly in the presence of class imbalances. When a target class is excluded, the resulting $ {\mathrm{F}}_1 $ score for that class is zero.
When a class is withheld from the target domain, the metric-informed procedure finds pseudo-labels for the target data by considering all the classes in the source that are most similar. This process leads to mislabelling of the existing target domain classes, resulting in poor classification performances for some damage locations. The pseudo-labels assigned for Scenario 2 demonstrate this effect in Figure 10a. In Scenario 2, high classification performances are obtained as the pseudo-labeling procedure is not hindered by the removal of the L2 class in the target domain. The classification performances are high here because the excluded target damage class (L2) appears between L1 and L3 in the feature space (Figure 4a) as well as in the physical locations within the structure (Figure 1b). Consequently, the data in target classes L1 and L3 are more similar and closer in distance to the corresponding source classes. In scenarios one and three the opposite is true, leading to negative transfer.
Given that the true target labels are not available to determine the classification performances in reality, it is challenging to ensure that the assigned target labels are correct. This further supports the use of physically interpretable features, such as those used in this article, which hopefully provide some structure to the feature space that leads to a sensible M-JDA mapping.
In universal domain adaptation scenarios four to six (where there are fewer classes in the source domain than the target), it is not possible to obtain the $ {\mathrm{F}}_1 $ scores for a target class if it does not appear in the source domain. Here, the target pseudo-labels are only assigned from existing classes in the source domain, which may not always correspond to the same damage scenario, as seen in Figure 10b. As a result, the classification performances decrease in Table 7, as expected.
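The consequence of assigning pseudo-labels only from classes present in the source can be seen in a small sketch. For simplicity it uses the nearest source-class centroid under Euclidean distance, which is a stand-in assumption and not the MSD-based metric-informed procedure of the article; the function names and coordinates are hypothetical.

```python
def centroid(points):
    """Feature-wise mean of a list of samples."""
    return [sum(col) / len(col) for col in zip(*points)]

def pseudo_label(source_by_class, target_points):
    """Assign each target point the label of the nearest source-class centroid."""
    centroids = {label: centroid(pts) for label, pts in source_by_class.items()}
    labels = []
    for x in target_points:
        labels.append(min(
            centroids,
            key=lambda lab: sum((a - b) ** 2
                                for a, b in zip(x, centroids[lab]))))
    return labels

# Hypothetical source domain with L2 withheld; the target still contains it.
source_by_class = {"L1": [[0.0, 0.0], [0.2, 0.0]],
                   "L3": [[4.0, 0.0], [4.2, 0.0]]}
target_points = [[0.1, 0.1], [1.8, 0.0], [4.1, -0.1]]   # true: L1, L2, L3

assigned = pseudo_label(source_by_class, target_points)
```

The middle target point, which truly belongs to L2, is forced onto L1 because L2 does not exist in the source label set, so no score for L2 can be obtained and the L1 results are contaminated.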
In order to investigate the influence of the initial target pseudo-labels from the metric-informed approach on the JDA method when considering a class imbalance, Figures 11 and 12 are presented. In Figure 11, where partial domain adaptation is concerned, the M-JDA and JDA methods perform relatively similarly (except in Scenario 3, where the performance of the M-JDA is comparatively worse). However, in Figure 12, which presents the universal domain adaptation results, the JDA performance is much better than that of the M-JDA. This result suggests that the initial pseudo-labeling procedure using the metric-informed approach hinders the target pseudo-labeling procedure down the line. Figure 13a,b presents the JDA pseudo-labeling results for Scenarios 2 and 4 in order to provide a comparison to the pseudo-labeling of the M-JDA process in Figure 10a,b. The assumption “for each class, the source cluster is closest to the target cluster” made in the metric-informed method is not applicable in the class-imbalance problem, leading to the higher degree of negative transfer.
5.2. Methods to address class imbalance in damage localisation
In the presence of class imbalance, the risk of negative transfer can increase, as seen in the previous section. However, analysis in Figure 14 shows that if the number of classes in the source and target domains is balanced, the highest classification performance is always achieved when using the same classes in the source and target domains. Given that the number of classes in the source domain is known a priori, a leave-one-out method such as that used to obtain Figure 14 may be useful here. However, given that target labels are assumed to be unavailable, it is not possible to obtain the classification performance metrics in real time. Again, the physical interpretability of the feature space may be extremely useful here to provide more confidence in the results; knowledge relating to the patterns between the clusters (that are driven by physics) in each domain may be transferred across, leading to a better understanding of the mappings. For example, a Bayesian multi-task approach for fleet analysis described in Bull et al. (Reference Bull, Di Francesco, Dhada, Steinert, Lindgren, Parlikad, Duncan and Girolami2022) encodes the domain expertise and constrains the model assumptions/prior beliefs in order to better share knowledge across domains. A similar method could be helpful here.
In the literature, a number of other methods have been suggested to address class imbalance when the target labels are unavailable. In the field of image recognition using convolutional neural networks, researchers have developed a number of techniques for addressing class imbalance. For example, feature transfer learning (Yin et al., Reference Yin, Yu, Sohn, Liu and Chandraker2019) seeks to augment the feature space of under-represented subjects using common subjects, while large-scale fine-grained categorization (Cui et al., Reference Cui, Song, Sun, Howard and Belongie2018) employs the Earth mover's distance to estimate domain similarity. Adding auxiliary weights into the formulation to handle class imbalances is also explored, where class-specific auxiliary weights are placed into the MMD formulation in Yan et al. (Reference Yan, Ding, Li, Wang, Xu and Zuo2017), and into the samples in the source and target datasets in Al-Stouhi and Reddy (Reference Al-Stouhi and Reddy2016). In Khan et al. (Reference Khan, Abraham, Hon and Guan2019), a loss function that focuses on classes with low probability is defined, to classify biomedical images with a class imbalance.
When considering the JDA method explored in this article, the most suitable development from previous literature that addresses class imbalances is balanced distribution adaptation (BDA) (Wang et al., Reference Wang, Chen, Hao, Feng and Shen2017). BDA considers both marginal and conditional distributions just as JDA does, but does not give equal importance to both distributions. It does this by adaptively adjusting the importance of the distributions based on specific tasks. A weighted-BDA (W-BDA) method is also proposed in Wang et al. (Reference Wang, Chen, Hao, Feng and Shen2017) that adaptively varies the weight of each class to address class imbalance.
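In BDA, the equivalent distance between the joint distributions is described as,
$$ D\left({\mathcal{D}}_s,{\mathcal{D}}_t\right)\approx \left(1-\mu \right)D\left(p\left({X}_s\right),p\left({X}_t\right)\right)+\mu D\left(p\left({\boldsymbol{y}}_{\boldsymbol{s}}|{X}_s\right),p\left({\boldsymbol{y}}_{\boldsymbol{t}}|{X}_t\right)\right), $$
where $ \mu \in \left[0,1\right] $ is a balance factor that leverages the importance of distributions. The conditional distributions for the class imbalance problem are calculated using,
$$ {\left\Vert p\left({\boldsymbol{y}}_{\boldsymbol{s}}|{X}_s\right)-p\left({\boldsymbol{y}}_{\boldsymbol{t}}|{X}_t\right)\right\Vert}_{\mathcal{H}}^2={\left\Vert {\alpha}_sp\left({X}_s|{\boldsymbol{y}}_{\boldsymbol{s}}\right)-{\alpha}_tp\left({X}_t|{\boldsymbol{y}}_{\boldsymbol{t}}\right)\right\Vert}_{\mathcal{H}}^2, $$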
where $ {\alpha}_s $ and $ {\alpha}_t $ are weights that are approximated by the class priors of the respective domains. For a detailed explanation on how the marginal and conditional distributions are calculated in BDA and W-BDA, the reader is referred to Wang et al. (Reference Wang, Chen, Hao, Feng and Shen2017).
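As a rough illustration of weights approximated by class priors, the sketch below estimates empirical class priors from source labels and from target pseudo-labels. How these priors enter the W-BDA objective is described in Wang et al. and is not reproduced here; the helper name `class_priors`, the use of pseudo-labels for the target, and the class counts are assumptions made for this example.

```python
from collections import Counter

def class_priors(labels):
    """Empirical class priors: fraction of samples carrying each label."""
    counts = Counter(labels)
    n = len(labels)
    return {label: count / n for label, count in counts.items()}

# Hypothetical label sets: all three classes in the source, L1 missing
# from the target (as in a partial domain adaptation scenario).
source_labels = ["L1"] * 300 + ["L2"] * 300 + ["L3"] * 300
target_pseudo_labels = ["L2"] * 300 + ["L3"] * 300

alpha_s = class_priors(source_labels)          # each class has prior 1/3
alpha_t = class_priors(target_pseudo_labels)   # L2 and L3 have prior 1/2
```

The mismatch between the two sets of priors is what a class-weighted scheme tries to account for; a class that is absent from the target simply has no entry, and therefore no weight, on the target side.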
Figure 15 presents the BDA and W-BDA results when using $ \mu =0.5 $ for Scenario 1 from Table 7. The left figure shows the M-JDA mapping, the middle is the BDA mapping, and on the right is the W-BDA mapping. To reiterate, in Scenario 1, the source domain contains data from all classes, whereas the target domain contains only classes two and three. In these figures, the target classes have been given new labels B and C (equivalent to the source labels L2 and L3), as their location labels are assumed to be unavailable during the mapping.
The classification performance results in Table 7 suggest that, in Scenario 1, source class L3 should map well to target class C, while the other classes do not classify well. However, visual inspection of the M-JDA mappings in Figure 15 suggests that source class L1 is also similar to target class B (which should correspond to source class L2). This result is caused by the outlier-driven pseudo-labeling procedure. The BDA and W-BDA mappings, however, place source class L3 and target class C together and keep the other classes separated, providing a helpful solution to the class imbalance problem when target labels are unavailable.
These algorithms typically rely on minimizing the distance between the marginal distributions, which is part of the objective in M-JDA. However, where there is significant class imbalance, accurate estimation of the underlying marginal distributions is challenging, so the alignment is based on biased estimates and may lead to suboptimal mappings. An alternative solution is to rely on adapting the conditional distributions alone, as would be the case with BDA where $ \mu =1 $ , but this increases the risk of negative transfer because of noisy pseudo-labels (Poole et al., 2022). Weighting schemes have also been proposed to correct for class imbalance (Cao et al., 2018, 2019; Zhang et al., 2018), but these issues may further limit the application of domain adaptation to similar domains, as considered in this article. In Rombach et al. (2023), an approach for generating synthetic data to represent unseen or unavailable target classes is presented to address the partial domain adaptation problem. Although not considered here, this approach may be considered in future work on this dataset.
5.3. Class imbalance from EOVs
During long-term monitoring of structures, it is likely that environmental and operational variations (EOVs) will affect the collected data. Data affected by EOVs could present as abnormalities and should be dealt with appropriately. Automated outlier detection methods within SHM campaigns may identify data affected by EOVs as anomalous. Although detecting these abnormalities can be helpful to owners, as discussed in Wickramarachchi et al. (2023), methods to distinguish EOVs from damage are necessary. Without such discrimination, and without labels, it is entirely possible that the assumed SE cluster from Figure 3, for example, is misrepresented as damage and included in the formulation when conducting damage localisation.
As the stiffening effect of the structure is unlikely to be localized in the same way as the damage states, including the SE state can be detrimental to the M-JDA mapping and to the subsequent classifier performance (Table 8). As the methods used in this article focus on mapping unlabelled target data to labeled source data, the assumption is that information about the source data is known a priori. As a result, it is assumed that SE data would not be included in the source domain when attempting damage localisation. Clearly, it is also important to disregard the SE data from the target domain when attempting to localize damage.
Given the unavailability of labels in the target class, the input features (the bending modes) within the target class could be studied as a population during the feature selection process, in order to eliminate the SE data from analysis. By considering a distance metric such as the MMD introduced in Section 2.1, the dissimilarity of the SE cluster can be identified in a principled way, without damage labels. Figure 16 presents the MMD values between the classes in the target training data. A high MMD value (dark blue) across the population suggests dissimilarity. The SE state has a higher MMD than the other classes in the target training data because it behaves differently from the damage-state data: the stiffening effect increases the stiffness, whereas the damage states reduce it. The effect of these different behaviors is also evident in Figure 3. The information in Figure 16 can, therefore, aid feature selection at the start of the analysis chain to discount SE data from the transfer learning process.
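The population-level check described above can be sketched as a pairwise MMD matrix between groups of target data, from which the group that is most dissimilar on average is flagged. This is only an illustrative sketch under several assumptions: the groups are taken to come from a prior unsupervised clustering step that is not shown; a Gaussian (RBF) kernel with a hand-picked `gamma` is used, which may differ from the kernel used for Figure 16; and the function names `mmd2`, `pairwise_mmd`, and `most_dissimilar` are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from rounding


def mmd2(a, b, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return float(
        rbf_kernel(a, a, gamma).mean()
        + rbf_kernel(b, b, gamma).mean()
        - 2.0 * rbf_kernel(a, b, gamma).mean()
    )


def pairwise_mmd(groups, gamma=1.0):
    """Symmetric matrix of squared MMD values between named groups of samples."""
    names = list(groups)
    m = np.zeros((len(names), len(names)))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            m[i, j] = m[j, i] = mmd2(groups[names[i]], groups[names[j]], gamma)
    return names, m


def most_dissimilar(groups, gamma=1.0):
    """Name of the group with the largest mean MMD to all other groups."""
    names, m = pairwise_mmd(groups, gamma)
    mean_to_others = m.sum(axis=1) / (len(names) - 1)
    return names[int(np.argmax(mean_to_others))]
```

Here `groups` is a dictionary mapping a group name to an array of feature vectors, for example normalized natural frequencies. A group flagged by `most_dissimilar` is only a candidate for exclusion; whether it corresponds to an EOV such as the stiffening effect would still need to be confirmed from engineering knowledge of the structure.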
6. Conclusions and future work
This is the first article in the context of SHM to investigate the use of domain adaptation for damage localisation when there are discrepancies between the damage states in the labeled source structure and the unlabelled target structure, under natural excitations. Differences such as severity and type of damage between source and target domains are likely in realistic applications of transfer learning, outside of laboratory environments. This is also the first article in SHM to address the effects of EOVs on real, operational structures that may affect the damage localisation process in transfer learning.
For damage localisation, metric-informed joint distribution adaptation (M-JDA) was used to reduce the distance between the source and target domains with the help of informative pseudo-labels. This approach was able to leverage information from a mast structure (under natural excitations) with minor damage states in order to localize severe damage states, using only the natural frequencies as input features. The M-JDA method was applied alongside normal correlation alignment (NCA) in order to reduce the effect of disparities in the environments between the source and target datasets. The classification accuracy of the KNN classifier exceeded 99.97% in testing, where an overall $ {\mathrm{F}}_1 $ score of 1.0 was achieved. The M-JDA + NCA method is extremely helpful in identifying the location of damage in a target structure (that does not contain any damage labels), which can then be used to optimize any subsequent repair and maintenance procedures. By using a population-based approach, it may be possible to avoid installing the network of strategically placed sensors for damage localisation that is essential to traditional SHM methods.
Realistic applications of transfer learning are likely to encounter class imbalance, where the source and target structures contain different numbers of classes. Consequently, this article also tested the suitability of the M-JDA + NCA method when there is a class imbalance between the source and target domains. Classification performance remained high in some instances and reduced significantly in others. Given the unavailability of target labels in realistic implementations of this method, incorrect mappings can be detrimental to the decision-making process for intervention. Methods to address class imbalance have, therefore, been suggested here. In future work, a comprehensive method to address the issue of class imbalance will be studied.
Data availability statement
The data used for this work from the LUMO structure can be found at https://data.uni-hannover.de/dataset/lumo.
Acknowledgments
We gratefully acknowledge the financial support of the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—SFB-1463—434502799. C.T.W. would also like to thank the Mercator fellowship program as well as The Dynamics Research Group at the University of Sheffield for supporting this work.
Author contribution
Conceptualization: C.T.W.; Formal analysis: C.T.W.; Funding acquisition: R.R.; Investigation: C.T.W.; Methodology: C.T.W., P.G.; Resources: R.R.; Supervision: C.H., R.R.; Visualization: C.T.W.; Writing—original draft: C.T.W.; Writing—review and editing: P.G., J.P., C.J., C.H., R.R. All authors approved the final submitted draft.
Funding statement
This research was supported by grants from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—SFB-1463—434502799.
Competing interest
The authors declare none.
Ethical standard
The research meets all ethical guidelines, including adherence to the legal requirements of the study country.