1. Introduction
Himalayan glaciers have been showing wastage over the last few decades (Brun and others, Reference Brun, Berthier, Wagnon, Kääb and Treichler2017; Azam and others, Reference Azam2018; Bolch and others, Reference Bolch and Koji Fujita2019); consequently, their dynamics are gradually adjusting (Azam and others, Reference Azam2012; Dehecq and others, Reference Dehecq2019) to the current mass distribution and hypsometry. Reliable information on glacier ice thickness and volume is necessary to understand the status of these glaciers, determine future fresh water availability in the region and assess future sea level rise potential. Field-based ground penetrating radar (GPR) surveys (Hubbard and Glasser, Reference Hubbard and Glasser2005), one of the best available methods to estimate the ice thickness, have been used on glaciers worldwide (Pettersson and others, Reference Pettersson, Jansson and Holmlund2003; Pattyn and others, Reference Pattyn, Delcourt, Samyn, De Smedt and Nolan2009; Wagnon and others, Reference Wagnon2013). Due to rough terrain, harsh climatic conditions and high expedition costs in the Himalaya, GPR surveys have been conducted for ice-thickness estimates only for a few selected glaciers: Khumbu (Iwata and others, Reference Iwata, Watanabe and Fushimi1980), Dokriani (Gergan and others, Reference Gergan, Dobhal and Kaushik1999), Chhota Shigri (Azam and others, Reference Azam2012; Singh and others, Reference Singh, Rathore, Bahuguna, Ramnathan and Ajai2012), Mera (Wagnon and others, Reference Wagnon2013), Changri Nup (Vincent and others, Reference Vincent2016), Lirung (McCarthy and others, Reference McCarthy, Pritchard, Willis and King2017) and Satopanth (Mishra and others, Reference Mishra, Negi, Banerjee, Nainwal and Shankar2018) Glaciers.
1.1 Ice thickness and volume estimation
Glacier ice thickness and volume are generally estimated using mathematical models such as volume–area scaling (Meier and Bahr, Reference Meier, Bahr and Colbeck1996; Raper and Braithwaite, Reference Raper and Braithwaite2005; Frey and others, Reference Frey2014), ice flux method (Haeberli and Hoelzle, Reference Haeberli and Hoelzle1995; Farinotti and others, Reference Farinotti, Huss, Bauder, Funk and Truffer2009a, Reference Farinotti, Huss, Bauder and Funk2009b; Huss and Farinotti, Reference Huss and Farinotti2012), surface velocity–slope relationship (Gantayat and others, Reference Gantayat, Kulkarni and Srinivasan2014), inverse modelling of glacier ice flow (Raymond-Pralong and Gudmundsson, Reference Raymond-Pralong and Gudmundsson2011; Mosbeux and others, Reference Mosbeux, Gillet-Chaulet and Gagliardini2016) and artificial neural networks (ANNs) (Clarke and others, Reference Clarke, Berthier, Schoof and Jarosch2009; Haq and others, Reference Haq, Jain and Menon2014).
Due to its simplicity, volume–area scaling has been used extensively to estimate glacier ice thickness and volume (Singh, Reference Singh2011; Frey and others, Reference Frey2014; Bahr and others, Reference Bahr, Pfeffer and Kaser2015). Volume–area scaling is essentially a transposition because unknown parameters such as glacier depth are derived from glacier area (Lliboutry, Reference Lliboutry1987; Gantayat and others, Reference Gantayat, Kulkarni and Srinivasan2014). This method has generally been applied at the regional scale (Radić and Hock, Reference Radić and Hock2011; Bliss and others, Reference Bliss, Hock and Cogley2013; Frey and others, Reference Frey2014) and can be erroneous when applied to individual glaciers for which the scaling parameter have not been well-established (Agrawal and Tayal, Reference Agrawal and Tayal2013; Bahr and others, Reference Bahr, Pfeffer and Kaser2015). Another shortcoming of the volume–area scaling method is that it yields no useful information on subglacial topography, which is a necessary boundary condition for glacier dynamics models (Clarke and others, Reference Clarke, Berthier, Schoof and Jarosch2009). Paul and Linsbauer (Reference Paul and Linsbauer2012) derived the ice thickness of glaciers based on a numerical model of perfect-plasticity assumption developed by Nye (Reference Nye1952). In this approach, the glacier volume estimate is based on the surface extent and average slope of the glacier. These models provide an estimate of ice thickness but are prone to errors if their underlying assumptions are not fulfilled (Clarke and others, Reference Clarke, Berthier, Schoof and Jarosch2009). Farinotti and others (Reference Farinotti2017) applied 17 models to derive the ice thickness of 21 test glaciers around the world.
Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018) applied the Glabtop-2 model with optimization of the shape factor to derive the volume of Chhota Shigri Glacier. The majority of methods used in Farinotti and others (Reference Farinotti2017), as well as GlabTop-2 in Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018), are highly sensitive to the surface slope and shape factor (Rabatel and others, Reference Rabatel, Sanchez, Vincent and Six2018). Rabatel and others (Reference Rabatel, Sanchez, Vincent and Six2018) discussed the uncertainties caused by the surface slope and demonstrated that they can be as high as 50% of the thickness calculated by ice flux methods. Vashishth and others (Reference Vashisht, Pandey, Ramanathan, Tayal and Jackson2017) applied different methods including glacier flow mechanics, thickness–area scaling and volume–area scaling to compute the ice thickness of Chhota Shigri Glacier. Singh and others (Reference Singh, Rathore, Bahuguna, Ramnathan and Ajai2012) used the thickness-area scaling method and GPR data, whereas Dobhal and others (Reference Dobhal, Kumar and Mundepi1995) used residual Bouger values to measure the thickness of Chhota Shigri Glacier. The availability of these studies on Chhota Shigri Glacier provides an opportunity to compare our results with a variety of models.
1.2 Artificial neural network
An ANN is a mathematical or computational model that is inspired by the operation of biological neural networks (Abraham, Reference Abraham, Sydenham and Thorn2005; Eluyode and Akomolafe, Reference Eluyode and Akomolafe2013). ANNs have already been applied in cryospheric science to understand past and future glacier length variations (Steiner and others, Reference Steiner, Walter and Zumbühl2005; Zumbühl and others, Reference Zumbühl, Steiner and Nussbaumer2008), to predict snow cover in mountain ranges (Sauter and others, Reference Sauter, Schneider, Kilian and Moritz2009, Reference Sauter, Weitzenkamp and Schneider2010; Mishra and others, Reference Mishra, Tripathi and Babel2014), to simulate meltwater runoff from glaciers (Caiping and Yongjian, Reference Caiping and Yongjian2009), to classify glacier/snow cover surfaces (Bishop and others, Reference Bishop, Shroder and Hickman1999; Doberva and Klein, Reference Doberva and Klein2009; Czyzowska-Wisniewski and others, Reference Czyzowska-Wisniewski, van Leeuwen, Hirschboeck, Marsh and Wisniewski2015), to detect glacier and ice-shelf margins (Baumhoer and others, Reference Baumhoer, Dietz, Kneisel and Kuenzer2019; Mohajerani and others, Reference Mohajerani2019) and to simulate glacier surface mass balances (Bolibar and others, Reference Bolibar2020). An ANN model was used by Clarke and others (Reference Clarke, Berthier, Schoof and Jarosch2009) to estimate the sub-glacial topography and glacier volume of the Mount Waddington icefield, British Columbia, Canada. They validated the ANN-estimated ice thickness with ice thickness values obtained from a numerical ice dynamics model. In the Himalaya, the use of ANNs to estimate glacier ice thickness and volume estimation is rare. Haq and others (Reference Haq, Jain and Menon2014) applied an ANN model to derive ice thickness and volume of Gangotri Glacier in the central Himalaya. However, the estimated ice thickness on Gangotri Glacier could not be validated due to the unavailability of ice thickness data from the field.
1.3 ANN principles
A neural network is a collection of interconnected simple processing elements called neurons. These processing elements are connected with coefficients or weights, which constitute the neural network structure and are organized in layers. The behaviour of a neural network is determined by its architecture, transfer functions and learning rule (Agatonovic-Kustrin and Beresford, Reference Agatonovic-Kustrin and Beresford2000). Every connection of a neural network is assigned a weight that comes through training the ANN. Typically this weight represents the strength of the interconnections among neurons inside the neural network. The aim of weight initialization is to prevent layer activation outputs from exploding or vanishing during the course of a forward pass through a neural network. The weighted inputs are all summed up inside the artificial neuron. The neurons use transfer functions f such as transig, logsig and purelin to derive output from the sum of weighted functions and biases. The function logsig generates outputs between 0 and 1 as the neuron's net input ranges from negative to positive infinity (Fortuna and others, Reference Fortuna, Ferla and Imbruglia2002). Alternatively, multilayer networks may use the tan-sigmoid transfer function, tansig. Occasionally, the linear transfer function purelin is used in backpropagation networks (Demuth and Beale, Reference Demuth and Beale2000). If the final layer of a multilayer network has neurons and sigmoid functions, then the outputs of the network will have a small range.
The training of ANN architecture can be performed using a defined set of rules, also known as optimizers, such as the Widrow–Hoff or gradient descent or backpropagation. The Widrow–Hoff learning rule minimizes the mean square error and moves the decision boundaries as far as it can from the training patterns (Demuth and Beale, Reference Demuth and Beale2000). The term ‘backpropagation’ refers to the way that the gradient is computed for nonlinear multilayer networks (Demuth and Beale, Reference Demuth and Beale2000). It is a generalization of the Widrow–Hoff learning rule to multiple-layer networks and nonlinear differentiable transfer functions. There are different types of backpropagation training algorithms such as Levenberg–Marquardt, conjugate gradient and resilient backpropagation (De Villiers and Barnard, Reference De Villiers and Barnard1993). The Levenberg–Marquardt training algorithm works on loss functions that take the form of a sum of squared errors. The loss function is composed of an error and regularization. The error evaluates how a neural network fits the dataset and the regularization is used to prevent overfitting based on the effective complexity of the neural network. The Levenberg–Marquardt algorithm computes the gradient vector and the Jacobian matrix. It is an algorithm that trains a neural network 10–100 times faster than the usual gradient descent backpropagation method (Demuth and Beale, Reference Demuth and Beale2000). The reason for choosing Levenberg–Marquardt optimization in the current study is that it is more powerful and faster than other conventional gradient descent techniques (Hagan and Menhaj, Reference Hagan and Menhaj1994; Kişi and Uncuoğlu, Reference Kişi and Uncuoğlu2005).
The neural network architecture defines the network structure, including the number of hidden layers, the number of hidden nodes and the number of output nodes. The hidden layers provide the network with its ability to generalize. A network with one hidden layer can model any continuous function (Beale and Jackson, Reference Beale and Jackson1990). There are no a priori fixed criteria on the architecture of the ANN (Sheela and Deepa, Reference Sheela and Deepa2013). A trial-and-error procedure and network growing technique are generally applied to decide the number of hidden layers and number of neurons in each hidden layer.
1.4 Objectives
In the present work, we assessed the efficiency of ANN modelling for the estimation of ice thickness and volume of Chhota Shigri Glacier (western Himalaya, India), where GPR ice-thickness measurements are also available (Azam and others, Reference Azam2012). A major advantage of the proposed ANN model in this study is the limited input data requirements, which include only the glacier extent and a digital elevation model (DEM). As a first step, we developed an ANN model for ice thickness estimation on Chhota Shigri Glacier and then validated the output with field-derived ice thickness data from five GPR transverse cross sections. In the second step, a subset of field-derived observations of ice thickness was used as training data. The initial and trained models were then applied on five additional transverse cross sections and two longitudinal sections before deriving the final glacier volume.
2. Study area and data
Chhota Shigri Glacier (32.28° N, 77.58° E) is located in the Chandra Valley of the western Himalaya (Fig. 1). This glacier has a total length of 9 km with an area of 15.5 km2 (Azam and others, Reference Azam2016). Chhota Shigri is one of the most documented glaciers in the Himalayan region with studies covering its mass balance, energy balance, ice thickness and dynamics (Wagnon and others, Reference Wagnon2007; Azam and others, Reference Azam2012, Reference Azam2014a, Reference Azam2014b, Reference Azam2019; Singh and others, Reference Singh, Rathore, Bahuguna, Ramnathan and Ajai2012; Vincent and others, Reference Vincent2013; Frey and others, Reference Frey2014; Vashishth and others, Reference Vashisht, Pandey, Ramanathan, Tayal and Jackson2017; Ramsankaran and others, Reference Ramsankaran, Pandit and Azam2018; Mandal and others, Reference Mandal2020).
2.1 Digital elevation model
High-resolution topographic surface information is a prerequisite for accurate glacier volume estimates therefore resolution and quality of the DEM are crucial factors. Freely available DEMs such as Advanced Spaceborne Thermal Emission and Reflection Radiometer and the Shuttle Radar Topography Mission (SRTM) have a spatial resolution of 30 m. To satisfy the requirement of high-resolution topographic information in the present study, Cartosat-1 (2.5 m) stereopair imagery dated 3 April 2007 was used to generate the DEM at a spatial resolution and vertical accuracy of 10 m.
2.2 GPR observations
In October 2009, GPR measurements were conducted on Chhota Shigri Glacier to determine the ice thicknesses on five transverse cross sections between 4400 and 4900 m a.s.l. (cross sections 1–5 in Fig. 1) (Azam and others, Reference Azam2012). The GPR thickness points were available at 10 m steps for each cross section. Details of the GPR measurements can be found in Azam and others (Reference Azam2012).
2.3 Differential global positioning system (DGPS) observations
Ground control points (GCPs) used in DEM generation and testing were collected in 2009 and 2010 at Chhota Shigri Glacier. Details of the 2009 survey, where 200 GCPs were collected during the GPR survey, are available in Azam and others (Reference Azam2012). Four additional GCPs were collected in 2010 using a Trimble R-7 DGPS operated in post-processing kinematics mode. The estimated accuracy of the 2009 and 2010 GCPs is <1 m.
2.4 Transverse and longitudinal slopes and elevations
Extending the GPR transects onto surrounding ice-free glacier slopes, we extracted sidewall slopes (SC) and elevations (EC) from the DEM. Longitudinal elevations (EL) and slopes (SL) were also extracted from the DEM (Fig. 2).
3. Methodology
3.1 DEM generation
A DEM was generated from a Cartosat-1 stereo pair collected on 3 April 2007. The DEM was generated using the Leica Photogrammetry Suite (LPS) and a subset of GCPs collected in 2009 and 2010 was used to refine the rational polynomial coefficient model. Tie points were generated both implicitly and explicitly for a better distribution of points over the glacier, and a first-order polynomial triangulation was used to refine the model.
The Automatic Terrain Extraction with Dense Point Matching (eATE) tool was used to place the subset of GCPs and generate tie points on the stereo pair. The GCPs were used as full control points (a full control point has x, y and z coordinates in LPS), while the tie points were generated using an implicit image correlation algorithm in LPS. A tie point is the image coordinate position of an object appearing on two or more images and its x, y and z coordinates are calculated during the aerial triangulation process. A total of 78 tie points were generated, manually and automatically. The triangulation was performed after adding GCPs and tie points. The DEM was extracted with a cell size of 10 × 10 m.
A total of 160 DGPS points (October 2009) on five GPR cross sections (1–5) and four DGPS points (October 2010) were used to generate the DEM. A total of 40 points (eight points per profile) were reserved and used to validate the error of the DEM. While the Cartosat-1 DEM was generated from April 2007 imagery, the GCPs were collected in October 2009 and October 2010. The resulting glacier DEM was edited based on break lines and sidewall slopes in LPS. The altitude values of the generated Cartosat DEM (3 April 2007) showed lower values than the global positioning system points (2009) for cross sections 1–4, however, cross section 5 showed a slightly higher altitude (Table S1).
3.2 ANN training
Two additional sources of information were used in this work, compared to previous ANN studies (Clarke and others, Reference Clarke, Berthier, Schoof and Jarosch2009; Haq and others, Reference Haq, Jain and Menon2014). The first is the additional geomorphic assumption of inclusion of sidewall slopes in the ANN model and the second is the availability of GPR data to assess and validate the ANN results. According to Clarke and others (Reference Clarke, Berthier, Schoof and Jarosch2009), the areas that are currently ice-free were formerly ice-covered and are therefore subject to landscape erosion. Two geomorphic assumptions were used in the current investigation. First, there may be some relationship between the elevations of the longitudinal section of the de-glaciated frontal area and the glacier of the present time (Haq and others, Reference Haq, Jain and Menon2014). Slopes and elevations along the longitudinal profiles (SL and EL) were thus used as predictors in the ANN models. The second assumption is based on the extension of valley sidewall slopes beneath the current glacier surface. Valley slopes (SC) and elevations (EC) extended from transects 1–5 were also used as predictors in the ANN models. Clarke and others (Reference Clarke, Berthier, Schoof and Jarosch2009) used bedrock geometries from deglaciated valleys to train their ANN model and improve dynamically modelled ice thickness.
3.3 ANN architecture
The performance of a neural network depends to a significant extent on how well it has been trained. Initially, a single layer linear network with the Widrow–Hoff least-square learning rule (also known as delta or least mean squared rule) was attempted. The training and testing showed unstable results on vector data (columns of input data). A multilayer feed-forward approach and standard Levenberg–Marquardt backpropagation training algorithm (Levenberg, Reference Levenberg1944; Marquardt, Reference Marquardt1963) were used in Matlab 2016b.
The Levenberg–Marquardt method is an effective optimization algorithm and is preferred over steepest descent and Gauss–Newton methods in a wide variety of problems (Wilson and Mantooth, Reference Wilson, Mantooth, Wilson and Mantooth2013; Gavin, Reference Gavin2019). The Levenberg–Marquardt is a combination of the two optimization methods: the gradient descent method and the Gauss–Newton method. A major limitation of the Levenberg–Marquardt algorithm is its memory requirement. However, in the present study, the number of data points is low, therefore memory requirements were low.
In the ANN architecture, the design of the input and output layers is straightforward, however, the design of hidden layers is not. There is no straightforward way to determine good network architecture and the number of hidden layers. It varies with the number of training samples and the complexity of the problem. In the current study, several combinations of layers and transfer functions were tested on a network growing basis with a learning rate of 0.001. It typically starts with a small network and nodes are added until a suitably chosen measure stops decreasing (Gallant, Reference Gallant1986; Kwok and Yeung, Reference Kwok and Yeung1995; Haq and others, Reference Haq, Jain and Menon2014).
In this study, we use a three-layer architecture, in which the input layer had four neurons (4x), the first active layer had eight neurons with a tan-sigmoid (transig) transfer function (8T) and the last layer had a transig function (1T). The final selected ANN architecture notation was therefore 4x–8T–1T (Fig. 3). The assessment of different ANN architectures was based on a combined R value for training, validation and testing. This performance criterion was used to select the network architecture and the size of the training set.
3.4 ANN models for Chhota Shigri
Two ANN models were developed in the current study to estimate the ice thickness of Chhota Shigri Glacier. The first ANN model, which does not use observed ice depths in training, is abbreviated as ABT (ANN before training). The second ANN model includes observed ice depths in training and is abbreviated as AAT (ANN after training). The ABT model was developed using four inputs such as slope and elevations extracted from the DEM on valley sidewalls and longitudinal profiles (EC, SC, EL, and SL). The target or dependent values were extrapolated using equation (S1); please see the Supplementary material.
3.5 ANN assessment
ANN models were assessed using a k-fold cross-validation technique, multiple R values, and by comparing mean and maximum ice thicknesses. The evaluation of the developed ANN model was attempted based on k-fold cross-validation. This k-fold cross-validation technique was applied to split the dataset for better generalization and to avoid the overfitting problem and provide a stable estimate of the model error (Little and others, Reference Little, Varoquaux and Saeb2017). The first fold is kept for testing and the model is trained on k−1 folds. The process is repeated 10 times and each time different fold or a different group of data points are used for validation. Data points for training, testing and validation sets were taken randomly from all the available transects in both models and these points change depending on each fold.
3.5.1 Assessment statistics
The comparisons of observed and ANN estimated ice thicknesses along transverse cross sections were based on various statistics including multiple R-value of training, testing and, validation; max ice thickness and mean ice thickness.
3.5.2 Ice thickness interpolation
The ANN models produced estimates of bedrock depths at ten transverse cross sections, which were then converted into ice thicknesses by differencing with the surface elevation. We used kriging to interpolate the spatially irregular ice-thickness data over the whole glacier to obtain gridded estimates of ice thickness. Kriging is a stochastic interpolation method that offers an assessment of prediction errors. At the 10 m resolution of the DEM, five transverse cross sections (1–5), five additional transverse cross sections (A–E), and a longitudinal section (L1) give a total number of 1449 points for the interpolation. A workflow diagram (Fig. 4) further explains the two ANN models, AAT and ABT, adopted in this study.
3.6 Error analysis
ANN modelled ice thickness data are continuous since they are based on the DEM. GPR-based ice depth observations are discontinuous, and data are often missing at the end of transects where the terrain is rough and GPR signals are weak due to steep valley walls (Azam and others, Reference Azam2012). Comparisons of observed and modelled ice thicknesses were made with datasets where (a) only the observed values were considered and (b) missing ice thickness observations were interpolated from nearby observations. Errors in modelled ice thickness are based on root mean squared error (RMSE) and the mean bias error. There are four sources of error: (1) error in GPR thickness, (2) mean average bias of ABT and AAT, (3) error in kriging interpolation for ABT and for AAT, and (4) error in DEM. To calculate the total errors in the resulting ice thickness estimation and total volume calculation, quadratic addition was applied based on the above-mentioned four error sources (Vanlooy and others, Reference Vanlooy, Miege, Vandeberg and Forster2014).
4. Results
4.1 ANN models
Out of 25 training runs conducted with the ABT model, three were rejected due to low combined R-values. The performances computed from the average of all folds for different networks are given in Table 1. Among the 22 remaining runs, the best correlation was 0.90 with a mean bias error for ice thickness of 13 m and RMSE of 24 m (Table 1 and 2). Out of 25 training runs conducted with the AAT model, which includes 20 GPR measurements as inputs, four runs were rejected (Table 2). Among the 25 runs, the best correlation was 0.93, with a mean bias error of 6 m and a RMSE of 13 m. The assimilation of GPR data for ANN network training decreased the RMSE by 11 m (24–13 m) and the mean thickness error of the five transverse sections by 7 m (13–6 m). Levenberg–Marquardt algorithm outperformed the gradient descent method in the current study.
4.2 Estimated ice thickness and volume
The performance of the ABT and AAT models is shown using scatter plots (Figs 5 and 6, respectively). These plots represent model performance based on the R-value for training, validation, testing and overall (a combination of training, validation, and testing), respectively. The solid line represents the best fit linear regression between the outputs and targets. The targets are the data (measured thicknesses) and the outputs are the results of the ANN model. The ABT model gives an overall R of 0.90 whereas the AAT model gives an overall R of 0.93. Performance statistics were computed from the average of all folds for the 4x–8T–1T set. The performance of the training, validation, and test sets are similar because of the L1 regularization used. Most ice thickness errors fall between ± 10 m (Fig. 7). The spread of errors ranges between ± 29 m and ± 0.83 m for maximum and minimum values, respectively.
The modelled ice-thickness estimates were 104 and 109 m for ABT and AAT, respectively. The volume of Chhota Shigri Glacier was estimated to be 1.61 and 1.69 km3 for ABT and AAT models, respectively.
4.3 Ice thickness profiles
Interpolated ice thicknesses for Chhota Shigri Glacier were obtained from the selected ABT and AAT models, and differences between the two approaches were calculated (Fig. 8). On the boundaries of the glacier, the ice thickness ranges from 0 to 15 m, whereas the mid area of the glacier shows ice-thickness values ranging from 175 to 300 m. The altitudinal range between 4550 and 5000 m shows maximum ice thickness ranging from 200 to 300 m. The mean ice thickness of GPR points was 159 m whereas the mean thickness of the AAT model was estimated to be 152 m. The mean elevation of the DGPS survey (October 2009), mean altitude from the Cartosat-1 DEM, and the maximum and mean ice depth of GPR, ABT and AAT are shown in Table S1.
The ice thicknesses (bedrock topography) obtained from ABT, AAT, GlabTop (Frey and others, Reference Frey2014) and Glabtop-2 (Ramsankaran and others, Reference Ramsankaran, Pandit and Azam2018) models were compared with GPR thickness measurements for all five transverse sections (Fig. 9). The maximum ice thickness was observed at cross section 4 for both the AAT and ABT models but the minimum ice thickness given by the ABT model was slightly higher than the GPR value (Table S1). The mean surface elevation, maximum ice depth and mean ice depth for additional cross sections A–E are given in Table S2. A maximum ice thickness of 290 was estimated at cross section D using the AAT model. The ANN modelled ice-thickness distribution for sections A–E is plotted in Figure 10. The elevation vs distance along the longitudinal cross section (L1) of Chhota Shigri surface and ice thickness vs distance obtained using the ANN (AAT) model are shown in Figure 11.
Modelled ice thicknesses show varying levels of agreement with the observations at all five cross sections (Fig. 9). The AAT model showed lesser deviations towards the west side while the deviations were greater on the east side when compared to the GPR cross sections. The ABT model also showed greater deviations for all sections on the east side. The reason behind this shift may be the topographic pattern of the eastern sidewall of the glacier, given that the transverse section also included a portion of the glacier sidewall (300 m) in the input and training datasets. Another reason may be related to the interpolated values of GPR data at the sides due to the complexities of data collection.
To understand the relationship of ABT and AAT data points with GPR data points, the Pearson R was used. The values of R for ABT and AAT are given in Table S3. AAT performed better than ABT for cross sections 1–5, as suggested by the R values. ABT showed the highest correlation for cross section 3 whereas AAT showed the best correlation for cross section 2. The differences are probably due to the fact that AAT used partial GPR information of four points per profile randomly whereas ABT used only extrapolated values as targets. For both cases, the lowest correlation was observed for cross section 1. Large differences between observed and modelled ice thicknesses at transverse section 1 (Fig. 12), particularly at greater ice thicknesses, may be due to variations in surface slope, which is used as a predictor in the ANNs.
4.4 Error analysis
The overall RMSE of the DEM was estimated to be 10 m. While the Cartosat-1 DEM was generated from April 2007 imagery, the GCPs were collected in October 2009 and October 2010. The mean bias error for transverse cross section altitudes between DEM and GCPs was 4.6 m (Table S1). Based on the lower RMSE and mean bias errors, the temporal difference between DEM and GCPs can be ignored.
Comparisons of modelled and observed ice thicknesses that include interpolated GPR data yield RMSEs of 27 and 16 m for ABT and AAT models, respectively. Without the interpolated data, the RMSE decreased to 24 and 13 m for ABT and AAT models, respectively.
The errors in GPR thickness (±15 m), mean average bias of ABT (±13 m) and AAT (±6 m), error in kriging interpolation (±3.5 m for ABT and ±3 m for AAT) and error in DEM (±5 m) were calculated as quadratic addition (Vanlooy and others, Reference Vanlooy, Miege, Vandeberg and Forster2014). The total error for glacier ice thickness estimation using quadratic addition was ±21 m (ABT) and ±17 m (AAT). Therefore, mean ice-thickness estimates with errors were 104 ± 21 and 109 ± 17 m for ABT and AAT, respectively. The total error in ice thickness was converted to a total error in volume by multiplying by the area of the glacier (15.5 km2). The result has a total uncertainty in the ice volume of ±0.33 km3 (ABT) and ±0.26 km3 (AAT). Therefore, the volume of the Chhota Shigri Glacier is estimated to be 1.61 ± 0.33 and 1.69 ± 0.26 km3 for ABT and AAT models, respectively. The minor differences in the ice-thickness uncertainty between the GPR estimates (±15 m) reported by Azam and others (Reference Azam2012) and the present modelling estimates (±21 and ±17 m) indicate that the ANN model performance is acceptable.
5. Discussion
Field-based GPR surveys are one of the best available methods for ice-thickness measurements. However, due to rough terrain, harsh climatic conditions and high expedition costs, GPR surveys for ice-thickness estimates have been conducted on very few glaciers in the Himalaya. The motivation for this study is to estimate ice thickness for individual glaciers with or without GPR measurements. In the absence of validation data, the performance of the ANN model was difficult to assess in the case of Gangotri glacier (Haq and others, Reference Haq, Jain and Menon2014). Haq and others (Reference Haq, Jain and Menon2014) used the slope of sidewalls that extended 500 m out from the glacier to train an ANN, but the sidewalls were treated as a single entity.
5.1 Limitations
ANNs do not make use of glacier physics, so the modelling requires careful selection of parameters. One of the important issues faced during the present investigation was the selection of an optimal number of transverse sections. Ten transverse cross sections were used in this study, five of which were coincident with GPR measurements of ice thickness. More cross sections could potentially be added, but model stability and complex topography at higher elevations (>5200 m a.s.l.) limit the value of having additional cross sections.
Major limitations of ANN models include their black box nature, the possibility of overfitting, and the empirical nature of model development. Another shortcoming of the ANN approach is that it is computationally intensive and hence difficult to apply over large regions. Studies such as Frey and others (Reference Frey2014) and Farinotti and others (Reference Farinotti2020) applied their approaches on a regional scale. Regional-scale models require normalization of parameters such as shape factor to account for varying glacier shapes and topography. The present approach, calibrated for one glacier with in situ observations, is more likely to perform well on similar glaciers than a model trained at a regional scale.
5.2 Comparison with other studies
In this section, we compare our results with those of previously published studies on Chhota Shigri Glacier (Table 3). Our model has two outputs: ABT and AAT. One of the goals of the current study is to show the capability of ANN models to simulate ice thickness without calibration. Other studies, such as Frey and others (Reference Frey2014) and Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018), used GPR data from Chhota Shigri Glacier to train the GlabTop and the improved GlapTop2 models (Table 3, Table S4). The outputs from these models were compared with those of our trained model (AAT) to show the strength of ANN modelling. In the present study, the mean ice thickness for the entire Chhota Shigri Glacier is estimated to be 109 m, whereas Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018) estimated it to be 121 m. The ice thickness of five transverse cross sections (1–5) is plotted in Figure 7 along with the transverse cross sections generated by Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018) and Frey and others (Reference Frey2014) using the 10 m DEM from Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018). AAT and ABT showed a better match with the GPR data (Tables S4 and S5) compared to previous ice thickness estimates by Frey and others (Reference Frey2014) and Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018). The modelled average ice-thickness errors ranging from 6 to 21 m for ABT and from 1 to 13 m for AAT were lower than the mean bias error for ice-thickness ranging from 9 to 22 m estimated by Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018) at different transverse sections. The mean bias error for ice-thickness was also found to be lower than estimated by Frey and others (Reference Frey2014) (Table S4). The high errors in Frey and others (Reference Frey2014) were probably due to the relatively low-resolution DEM (SRTM 90 m) they used and the value of the shape factor (Ramsankaran and others, Reference Ramsankaran, Pandit and Azam2018). To address the spatial resolution issue in Frey and others (Reference Frey2014), we used a 10 m DEM (TandemX) developed by Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018) and re-estimated the ice thicknesses. It was observed that the mean bias error for ice-thickness for transverse cross sections was between −1 and 53 m. Additionally, the performance for each transverse cross section based on the RMSE is given in Table S5. Figure 7 shows that GlabTop results can have 100–150 m differences with GPR measurements at some locations. The ice volume of 1.69 ± 0.26 km3 for Chhota Shigri Glacier derived using the AAT model in the present study showed good agreement with the volume of 1.74 ± 0.25 km3 derived by Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018).
Vashisht and others (Reference Vashisht, Pandey, Ramanathan, Tayal and Jackson2017) used glacier flow mechanics, thickness-area scaling and volume-area scaling methods and estimated the volume of Chhota Shigri Glacier as 0.55, 1.40 and 7.15 km3, respectively. Interestingly, the estimated volume using the volume–area scaling method was 13 times higher than the volume estimated from glacier flow mechanics. Singh and others (Reference Singh, Rathore, Bahuguna, Ramnathan and Ajai2012) used the thickness-area scaling method and GPR data and computed a volume of 1.20 km3. Dobhal and others (Reference Dobhal, Kumar and Mundepi1995) used residual Bouger values to compute a volume of 0.64 km3.
5.3 Outlook
Worthwhile directions for future study would be to investigate the potential of deep learning networks such as convolution neural networks (CNNs) with more number of parameters such as surface velocities (e.g. Dehecq and others, Reference Dehecq2019) and mass balance (e.g. Wu and others, Reference Wu, Liu, Jiang, Xu and Wei2019) for more number of glaciers. The more the model is constrained with input predictors, the less dependency on data it will have. Another future scope of the present research will be to apply the ABT model on additional datasets for testing.
6. Conclusion
In this work, two ANN models were developed to estimate the glacier ice thickness at 10 transverse cross sections and two longitudinal cross sections on Chhota Shigri Glacier in western Himalaya, India. The model results were compared with the GPR-derived ice thicknesses on five transverse cross sections located between 4400 and 4900 m a.s.l. on Chhota Shigri Glacier. The results of the current investigation indicate that the ice thickness and volume of a glacier can be estimated fairly well using ANN models. Furthermore, an ANN model alone, without integrating ice thickness observations, was shown to provide a good estimate of the glacier ice-thickness distribution. The ABT estimated mean ice thickness and volume for Chhota Shigri Glacier are 104 ± 21 m and 1.61 ± 0.33 km3, respectively, while the AAT estimated mean ice thickness and volume for Chhota Shigri Glacier are 109 ± 17 m and 1.69 ± 0.26 km3, respectively.
The ABT and AAT estimates of ice thickness showed good agreement with the GPR measurements available for five cross sections. Mean bias errors for ice thickness along the measured profiles ranging from 6 to 21 m for ABT and from 1 to 13 m for AAT are lower than the average ice-thickness errors ranging from 9 to 22 m estimated by Ramsankaran and others (Reference Ramsankaran, Pandit and Azam2018) for all transverse cross sections. Based on the calculated RMSE values, ANN models trained with and without ice thickness observations provide reasonable estimates of ice thickness. The ANN architecture used in this study requires minimal input data that can be extracted for other locations. However, model transferability remains an open question, and future research should be directed towards CNNs and additional model inputs such as surface velocity (Dehecq and others, Reference Dehecq2019) and mass balance (e.g. Wu and others, Reference Wu, Liu, Jiang, Xu and Wei2019). The more the model is constrained with input predictors, the less dependency on data it will have.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/jog.2021.19
Acknowledgements
Mohd Anul Haq would like to thank the Deanship of Scientific Research at Majmaah University for supporting this work under Project No. R-2021-10. This work has also been supported by IFCPAR/CEFIPRA within project no. 3900-W1 and by the French GLACIOCLIM observation service. MFA acknowledges the research grant provided by the INSPIRE Faculty award (IFA-14-EAS-22) from the Department of Science and Technology (DST, India). The authors thank three anonymous reviewers for their constructive comments and suggestions. The authors would like to warmly thanks Scientific Editor Dr Joseph Shea for numerous editorial revisions and suggestions. The authors are highly thankful to Holger Frey for sharing 90 m resolution GlabTop model data and Raaj Ramsankaran and Ankur Pandit for sharing GlabTop2 model data. The authors are also thankful to Dr Munir Nayak and Dr Pottakkal George Jose for language improvement.