1. Introduction
Synthetic aperture radar (SAR) is the main data source for year-round, high-resolution sea ice monitoring and for the production of operational sea ice charts by national ice services around the world (Zakhvatkina and others, 2019). The most commonly used data format is wide-swath imagery such as Sentinel-1 (S1) data acquired in extra-wide swath (EW) mode, which is typically distributed at 40 × 40 m pixel spacing and covers a ground range of approximately 410 km. The resulting ice charts provide information that is crucial for navigational support and to ensure the safety of vessels in the Arctic. While operational ice charts are at present still based on manual analysis of SAR imagery by expert sea ice analysts, considerable progress has been made in the field of automated mapping of both sea ice concentration (SIC) and sea ice type/stage of development (SoD). This has resulted in a variety of published algorithms which can potentially increase automation in operational ice charting or ice type mapping for navigation support inside the pack ice (e.g. Ochilov and Clausi, 2012; Leigh and others, 2014; Zakhvatkina and others, 2017; Boulze and others, 2020; Malmgren-Hansen and others, 2021; Khaleghian and others, 2021; Pires de Lima and others, 2023). However, most of these algorithms are only used within academia and evaluation of their classification results is usually done in the traditional way, i.e. based on independent training and test sets (e.g. Murashkin and Frost, 2021; Stokholm and others, 2023).
Running an automated algorithm in the operational procedures at the ice services requires more thorough and representative “real-world” in-situ validation that must go hand-in-hand with further improvement of the algorithms to ensure that they can uphold or even improve the quality standards of manual image analysis by trained experts.
In this study, we take a step towards bridging the gap between research and operations in automated ice type mapping, using a research cruise conducted by the Centre for Integrated Remote Sensing and Forecasting for Arctic Operations (CIRFA-22 cruise) as an example for the application and validation of a supervised algorithm in a fully-automated processing chain. The main goals of this study can be summarized as follows:
1. Automatically classify sea ice types in the area of interest for the cruise and demonstrate that we can transfer classification results in near real-time (NRT) to the ship.
2. Validate the classification results in the field.
3. Assess which ice types can be mapped reliably based on manually selected validation regions and compare this traditional quantitative assessment to a qualitative evaluation based on in-situ observations.
It should be emphasized that the goal of our automated support for the CIRFA-22 cruise was not to reproduce operational ice charts but to enable fast and efficient navigation within the pack ice and close to the Greenland fast ice. While standard ice charts are very useful to safely navigate close to the ice edge, they do not always provide sufficient spatial detail for navigation within the ice. Once a vessel is in a region of high SIC, the best options available are (a) the direct transfer and interpretation of satellite imagery, (b) manual analysis with finer detail than the usual ice charts, or (c) automated products that provide information on the individual lead and floe scale level. While option (a) requires trained personnel on board the ship to interpret the SAR imagery, option (b) creates additional daily workload for the ice services’ sea ice analysts. Option (c), on the other hand, requires preparation work such as the setup of the processing and data transfer chain and the training of the algorithm before a cruise or operation. The technical part of this preparation work is of course directly transferable between different operations. However, the training of the algorithm and selection of ice types may depend on the region, time of year, and user requirements and abilities, such as the demands on the mobility within the pack ice or the icebreaker class of the vessel.
The remainder of this article is structured as follows: In the following section, we give an overview of the study area and environmental conditions during the cruise, followed by a description of the remote sensing data sets and in-situ observations used in this study. Afterwards, we describe our selection of training data for different ice types, the classification algorithm, data processing chain, and the setup for the NRT data transfer to the vessel. We then present the results and discuss them with respect to the main goals of the study stated above and in comparison to standard operational ice charts. Finally, we summarize our conclusions and outline recommendations for future work.
2. Study area and data sets
2.1. Study area
The CIRFA-22 cruise was conducted on the research vessel Kronprins Haakon (KPH) in April and May 2022. The main purpose of the cruise was “to perform measurements and make observations which allow for validation of information and forecast products resulting from CIRFA's work” (Dierking and others, 2022), making it an ideal test scenario for this study. The cruise started in Longyearbyen, Svalbard, on April 22nd and the ship spent approximately two weeks in the Belgica Bank area off the north-east Greenland coast before returning to Longyearbyen on May 9th (Fig. 1). The sea ice situation around Belgica Bank can be challenging for navigation at this time of the year. The ice cover typically consists of both level and heavily deformed landfast ice close to the Greenland coast, as well as drift ice at various SoD further east in Fram Strait (Hughes and others, 2011). More detailed information on the actual ice conditions in the area in 2022 can be found in the cruise report (Dierking and others, 2022) or in Eltoft and others (2023).
The local air temperature measured on board KPH was consistently cold during the entire cruise period, mostly between −10 and −15°C and never exceeding −5°C. This is in agreement with several sea ice mass balance buoys deployed during the cruise, which recorded temperatures rising above −5°C for the first time in the second half of May 2022. Surface melt was not observed at any time during the cruise. We therefore consider the entire cruise period as “winter conditions” and can hence apply a classifier that was trained for cold winter conditions and dry snow. The training selection is described in more detail in the Method section of this paper.
2.2. Satellite data
2.2.1. Sentinel-1
Our processing chain and classification algorithm (described in the Method section) are based entirely on S1 data. S1 operates at C-band frequency (5.405 GHz) in either single- or dual-polarization mode. All data is freely available and can be accessed for example through the Copernicus dataspace platform (https://dataspace.copernicus.eu/). Here we use S1 images acquired at dual polarization (HH and HV channels) in EW mode and work with the Level-1 product in ground-range detected format at medium resolution (GRDM). The EW GRDM product is provided at a pixel spacing of 40 × 40 m with an actual spatial resolution of approximately 93 × 87 m (Aulard-Macler, 2011). The full swath width of 410 km is divided into five sub-swaths EW1 to EW5, with incident angles (IA) ranging from 18.9° in the near-range to 47.0° in the far-range. The pixel values are multi-looked intensities with 18 looks in the first sub-swath EW1 and 12 looks in the remaining sub-swaths EW2 to EW5. The noise-equivalent sigma zero (NESZ) of the S1 EW GRDM product, also known as the system noise floor, decreases across the swath. While its maximum value is equal to −23.1 dB in sub-swath EW1, the NESZ is mostly in the range between −27 and −33 dB in sub-swaths EW2 to EW5 (Aulard-Macler, 2011).
Thanks to its fine spatial resolution at wide coverage, its all-day and all-weather imaging capability, and the free data availability, S1 wide-swath imagery is one of the most important data sources in operational ice charting.
2.2.2. Sentinel-2
In cloud-free conditions during daylight, optical sensors can provide valuable complementary information to the SAR data and thus aid in the interpretation of SAR signatures from different sea ice types. In this study, we use optical imagery acquired by Sentinel-2 (S2) to guide the selection of ice classes and training data. The S2 high-resolution multispectral instrument provides data at 13 spectral channels. For our visual interpretation in combination with S1 SAR data, we only use the visible channels (B4, B3, B2), which are provided at a pixel spacing of 10 × 10 m.
2.3. In-situ data
A large set of in-situ measurements was conducted during the cruise, including ship-based ice observations, on-ice measurements of physical snow and ice properties, drift observations using buoys with GPS sensors, and drone-based observations with both optical and radar sensors. Details about all the acquired data sets can be found in the official cruise report (Dierking and others, 2022) and in the online publications of the individual data sets.
For the work presented here, our in-situ validation of the classification results in the field is based on visual observations from the ship. These include the regular IceWatch (Hutchings and others, 2020) observations during the cruise (available at https://icewatch.met.no/cruises/130), as well as additional visual observations and photographs from the bridge and observation deck that were specifically timed to coincide with overlapping S1 image acquisitions. During the cruise, there were ten occasions at which KPH was located within the footprint of an S1 scene at the time of image acquisition (Fig. 1, Table 5 in Appendix A). Finally, most of the analysis presented here uses photographs that are taken each minute by a camera mounted in the crow's nest of KPH (hereafter denoted as “monkeytop camera”). Compared to the hand-held photographs, the monkeytop camera offers the advantage of a fixed imaging geometry and does not require manual operation.
3. Method
3.1. Training data selection
Before the start of the expedition, we used overlapping SAR (S1) and optical (S2) data to assess the typical sea ice situation in the cruise's target area around Belgica Bank during April and May. For this purpose, we studied ice charts and satellite images from the previous years (2020 and 2021) and from the months in the lead-up to the cruise in 2022 (February, March and first half of April). Based on this analysis, we identified four main ice types: Open Water/New Ice (OW), Young Ice (YI), Level Ice (LI), and Deformed Ice (DI) (Table 1). Figures 2 and 3 show examples of overlapping S1 and S2 images and indicate areas with the identified ice types. We used multiple such image pairs to select the classes (ice types) that we needed to separate in our automated processing chain during the cruise.
The choice of these four ice types is motivated by the goal of this project, which is to provide a product for navigational purposes inside the pack ice. As a consequence, we do not include a separate class for large open water areas. The OW class is specifically trained to identify small leads and openings (on the scale of tens or hundreds of meters up to a few kilometers) within large areas of high SIC. By our definition, this class includes entirely open leads as well as refrozen leads that may be covered by grease ice or very thin sheets of nilas in the earliest stages of sea ice formation. Adding an additional class for large areas of open water is possible, but since the cruise was planned to mostly operate close to the landfast ice far away from the marginal ice zone, we did not consider it necessary for the given task. An outlook on how to best include this additional class is given in the final section of this paper.
Furthermore, we separate between YI, LI, and DI. For the latter two we do not distinguish between first-year ice (FYI) and multi-year ice (MYI). While the separation of FYI and MYI is important for several applications, it is also a challenging task based on individual SAR intensities only. In most cases, the history of a particular ice floe or region needs to be considered to make a correct and unambiguous decision. Because of its overall strong backscatter and particularly its expected high backscatter in HV channel due to volume scattering (e.g. Onstott and Carsey, 1993; Komarov and Buehner, 2019; Lohse and others, 2019), most of the MYI will fall into the DI class. This class should be interpreted as an area that will be difficult to navigate through even with an icebreaker such as KPH, hence making the classification result with our selected ice types valuable for navigational purposes. Furthermore, as the DI areas are likely to be considerably thicker than the LI areas, classification results for the selected ice types can also be useful for data assimilation in numerical models for sea ice forecasts.
It should be noted here that the ice types used in this study are not fully consistent with the World Meteorological Organization (WMO) Sea Ice Nomenclature (WMO, 2014) definitions, and hence do not exactly match the SoD ice types provided in some operational ice charts. However, here we are interested in providing useful information for navigational support and route planning inside the pack ice. Separation of lead areas with OW from lead areas covered with thin nilas is not required for navigational purposes, as an icebreaker will travel equally easily through both. Since both of these classes are also difficult to distinguish because of their weak backscatter signatures caused by a smooth surface, it is reasonable to combine them into one class here. For the Norwegian icebreakers KPH and KV Svalbard, similar arguments hold for the separation of deformed FYI and MYI. To save time and fuel, both ships would avoid deformed FYI as much as MYI. However, this may be different for other cruises or operations on ships with a higher ice class. Hence, the class and training data selection should be tailored to the planned operations and the abilities of the involved vessels. Combining these classes (OW and new ice as well as deformed FYI and MYI) will of course lead to higher scores when evaluating classification results. This must be kept in mind when comparing classification scores to other algorithms that try to separate these ice types.
3.2. Classification algorithm
Based on the manually selected training regions for the ice types in Table 1, we trained a pixel-wise classification algorithm introduced by Lohse and others (2020) that uses both HH and HV intensity together with the IA to classify the S1 images. The method accounts for class-dependent differences in the variation of backscatter intensity with IA (Mäkynen and Karvonen, 2017; Guo and others, 2022), assuming a linear decrease of backscatter in decibels (dB) with increasing IA. This is achieved by using a two-dimensional Gaussian distribution with a linearly variable mean vector μ = a + b ⋅ Θ, where μ is the mean vector, Θ is the IA, a is the vector of intercepts for HH and HV, and b is a vector with the linear slopes for HH and HV. The concept of linear IA dependency of the input features can in principle be extended to include texture features. However, the decision whether or not to use texture features is a trade-off between computation time, spatial resolution of the results, and the gain in classification accuracy (CA). As shown in Lohse and others (2021), the algorithm requires large texture windows (>21 × 21 pixels) for a significant improvement of CA, which effectively decreases the spatial resolution of the resulting ice type maps. Furthermore, the largest improvement is found for the separation of OW and FYI or MYI. Since we require fine spatial resolution in this study and do not include a separate class for large open water areas, we use only the intensity channels as input features here. Figure 4 illustrates the per-class IA dependency of HH and HV backscatter after training the algorithm for the relevant region and the season of the cruise. We see that in the near-range the HV intensity of LI, YI, and OW is close to or partly below the system noise floor.
However, since the HH intensity is above the noise floor for all classes, we do not expect thermal noise to significantly affect the classification in this study. This is in good agreement with previous studies (e.g. Dierking, 2010).
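The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of Lohse and others (2020): the two classes, their intercepts, slopes and covariances are made-up placeholder values (not the trained parameters shown in Figure 4), and equal class priors are assumed.

```python
import math

# Hypothetical per-class parameters (NOT the trained values from this study):
# a = intercepts (HH, HV) in dB, b = slopes (HH, HV) in dB per degree of IA,
# cov = 2x2 covariance matrix of the (HH, HV) values around the mean, in dB^2.
CLASSES = {
    "OW": {"a": (-14.0, -27.0), "b": (-0.30, -0.05), "cov": ((2.0, 0.3), (0.3, 1.5))},
    "DI": {"a": (-4.0, -16.0), "b": (-0.20, -0.10), "cov": ((2.5, 0.5), (0.5, 2.0))},
}


def log_likelihood(x, ia, params):
    """Log-density of a 2-D Gaussian whose mean varies linearly with IA."""
    a, b, cov = params["a"], params["b"], params["cov"]
    # Mean vector mu = a + b * theta, evaluated at this pixel's incident angle.
    mu = (a[0] + b[0] * ia, a[1] + b[1] * ia)
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    # Squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (cov[1][1] * d0 * d0
            - (cov[0][1] + cov[1][0]) * d0 * d1
            + cov[0][0] * d1 * d1) / det
    return -0.5 * maha - 0.5 * math.log(det) - math.log(2.0 * math.pi)


def classify(x, ia, classes=CLASSES):
    """Return the label with the highest likelihood (equal priors assumed)."""
    return max(classes, key=lambda name: log_likelihood(x, ia, classes[name]))


# The same backscatter pair (HH, HV) in dB, evaluated at two incident angles.
pixel = (-16.0, -24.0)
label_near = classify(pixel, 20.0)
label_far = classify(pixel, 45.0)
```

With these placeholder values, the same pixel (HH = −16 dB, HV = −24 dB) is assigned to OW at an IA of 20° and to DI at 45°, which illustrates why a class-dependent slope matters across a wide swath.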
3.3. Processing chain and data transfer
We implemented a processing chain at the Norwegian Meteorological Institute (MET Norway) as part of the Norwegian Ice Service's (NIS) daily production that automatically downloads, pre-processes, classifies, and geocodes all S1 EW data covering an area of interest (AOI) for the cruise. The pre-processing includes the standard noise correction implemented in the Sentinel Application Platform (SNAP) as well as calibration of the data to normalized radar cross section σ⁰, followed by various levels of multi-looking (ML) with increasing window sizes, and finally the conversion of σ⁰ to dB. The pixel-wise classification result for each ML level was then geocoded to a suitable corresponding pixel size in polar stereographic projection (EPSG:3996), sub-set to two differently sized AOIs (red squares in Fig. 1), compressed, and uploaded to an FTP server that can be accessed from the ship. Table 2 gives an overview of the different processing settings for ML and pixel spacing after geocoding. Note that larger ML windows result in a smoother classification result that can hence be geocoded to coarser pixel spacing. The final file sizes are much smaller than the images at the original pixel spacing of 40 m (Table 2) and can be downloaded to the ship more easily. However, smoothing and re-sampling come at the cost of losing spatial detail. Processing the data with multiple ML levels allowed us to download the finest spatial resolution possible to the ship at any time, while considering the limited bandwidth and internet connection on board KPH. Sub-setting to two separate AOIs furthermore enabled us to download a coarser-resolution product for the larger AOI (400 × 400 km) and a finer-resolution product with more spatial detail for the smaller AOI (200 × 200 km).
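The order of the last two pre-processing steps (multi-looking first, dB conversion second) can be illustrated with a short Python sketch. It assumes non-overlapping block averaging and plain nested lists as the image container; the operational chain may use a sliding window and different data structures, and the function names are hypothetical.

```python
import math


def multilook(img, win):
    """Average linear-scale values over non-overlapping win x win blocks.

    img is a 2-D list of calibrated backscatter in linear units (not dB).
    Rows and columns that do not fill a complete block are dropped.
    """
    rows, cols = len(img) // win, len(img[0]) // win
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [img[r * win + i][c * win + j]
                     for i in range(win) for j in range(win)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def to_db(img):
    """Convert linear-scale values to decibels (values must be positive)."""
    return [[10.0 * math.log10(v) for v in row] for row in img]


# Toy 4 x 4 image in linear units, reduced to 2 x 2 by a 2 x 2 window.
image = [
    [1.0, 1.0, 100.0, 100.0],
    [1.0, 1.0, 100.0, 100.0],
    [10.0, 10.0, 0.1, 0.1],
    [10.0, 10.0, 0.1, 0.1],
]
image_db = to_db(multilook(image, 2))

# Averaging happens in linear units: a block mixing 0 dB and 20 dB pixels.
mixed_db = to_db(multilook([[1.0, 100.0], [1.0, 100.0]], 2))[0][0]
```

Because the averaging is done before the logarithm, a block that mixes 0 dB and 20 dB pixels ends up near 17 dB rather than at the 10 dB that averaging the dB values would give.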
4. Results
4.1. NRT data transfer
The fully automated processing chain at MET Norway and the data transfer to the vessel during the cruise worked well. All relevant images were successfully downloaded and processed, and classification results were available on KPH within 2 to 5 h after the image acquisition, which was sufficient for navigation support and decision making during the cruise. The sea ice information was available significantly faster compared to manually produced operational ice charts, which are at best issued once per day for the Belgica Bank area. Furthermore, our automated product contained more spatial detail than the standard sea ice charts, including for example the exact location of large floes or of leads that were favorable for efficient travel (see last subsection of this Results section).
4.2. Classification time series
During the cruise, we classified all S1 imagery acquired over the large AOI indicated in Figure 1 and transferred results to KPH in NRT. After the cruise, we extended the time series to cover the area for the entire time period from March 1st until May 31st 2022. Figure 5 shows selected examples of the S1 imagery and the corresponding classification results.
4.2.1. Visual inspection of classification results
A visual inspection of the SAR imagery and the classification results shows that most ice types are successfully identified by our algorithm. The stationary fast ice area is classified consistently over time and the results are independent of changes in imaging geometry such as IA, the radar look direction, or ascending and descending orbits of the satellite. Classification errors occur close to the Greenland coast in landfast LI areas with an atypical radar signature. The ice here presumably grows under protected conditions and forms a very smooth surface. This results in a low backscatter signal in both polarization channels which is easily confused with OW (Fig. 5d) and poses a known challenge for automated classification algorithms (e.g. Wang and others, 2023). At a sufficiently large distance from the coast, passive microwave radiometer (PMR) data could help to mitigate classification errors (Malmgren-Hansen and others, 2021), but in narrow fjords and close to the coast the spatial resolution of the PMR is too coarse. However, for our task of navigation support, these classification errors are not critical, as the affected regions are too far in the landfast ice to be reached by KPH.
LI and DI are generally mapped correctly by the classifier. While the landfast ice is largely classified as either LI (western part) or DI (eastern part), the drift ice entering the area from the north consists mostly of deformed floes, intersected by smaller areas of YI or LI. Two polynyas repeatedly opened up at the fast ice edge in the time period between March and May 2022 (Fig. 5e). As the overall ice drift in the polynyas was towards the south and temperatures were well below freezing point until the middle of May, the polynya areas were mostly covered by YI, with a smaller fraction of OW directly at the fast ice edge. While these patterns are often identified correctly by our algorithm, we also observe some classification errors of YI as LI. This is most likely caused by variations in the small-scale surface roughness of YI, due to the absence or the presence (and density) of frost flowers as well as finger rafting and the beginning of ridging.
4.2.2. Quantitative assessment of classification results
For a quantitative assessment of our algorithm, we evaluate the classification results over manually selected validation regions of interest (ROIs) and report per-class CA [%] in the form of a confusion matrix. We consider landfast and drift ice separately and include only images from the time period during the cruise. The selection of the validation ROIs is based on a comparison of S1 and optical data in combination with ship-based IceWatch and in-situ observations during the cruise. We therefore consider our manual selection to be reliable ground-truth data for a quantitative validation. For the landfast ice, we defined ten ROIs each for LI and DI. Each of the ROIs covers 8 × 8 pixels, corresponding to an area of 640 × 640 m. As the landfast ice did not undergo major changes during the cruise, we can use the same ROIs for all images. For the drift ice, we defined five ROIs per image for each class (OW, YI, LI, DI). Because some of the classes in the drift ice, in particular OW, often cover only small contiguous areas, we chose a smaller size of 5 × 5 pixels, corresponding to an area of 400 × 400 m. For a common evaluation of the different ML settings (Table 2) in our processing chain, we re-sampled the results from the 9 × 9 and 21 × 21 ML to 80 m pixel spacing.
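As a minimal sketch of this evaluation, the following Python snippet builds a row-normalized confusion matrix in percent from paired lists of true and predicted labels, so that the diagonal entry of each row is the per-class CA. The function name and the toy labels are illustrative assumptions and do not correspond to the validation data of this study.

```python
CLASSES = ["OW", "YI", "LI", "DI"]


def confusion_matrix_percent(true_labels, predicted_labels, classes=CLASSES):
    """Row-normalized confusion matrix in percent.

    Rows are true classes, columns are predicted classes, and the diagonal
    entry of each row is the per-class classification accuracy (CA).
    Rows without any validation pixels are left at zero.
    """
    counts = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    percent = {}
    for t in classes:
        total = sum(counts[t].values())
        percent[t] = {
            p: (100.0 * counts[t][p] / total if total else 0.0) for p in classes
        }
    return percent


# Toy example: four YI validation pixels and two DI validation pixels.
true_labels = ["YI", "YI", "YI", "YI", "DI", "DI"]
predicted = ["YI", "YI", "LI", "OW", "DI", "DI"]
matrix = confusion_matrix_percent(true_labels, predicted)
yi_accuracy = matrix["YI"]["YI"]
```

In this toy case, half of the YI pixels are labelled correctly, one quarter is confused with LI and one quarter with OW, while all DI pixels are correct.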
Tables 3 and 4 show the confusion matrices for the classification results from the validation ROIs over landfast ice and over drift ice, respectively. The results support our visual inspection of the classified images. OW, LI, and DI are all identified with high accuracy. YI proves to be the most challenging class and only achieves an accuracy of around 50 %. It is often incorrectly classified as LI, and sometimes as OW or DI. More smoothing with larger ML windows increases the CA for DI, both in the landfast and the drift ice region. For OW, YI, and LI this is only true when we increase the ML window from 3 × 3 to 9 × 9. The further increase to 21 × 21 results in a constant or lower accuracy for these classes.
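Table 3. Confusion matrix for the validation ROIs over landfast ice (table body not reproduced here). The landfast ice validation labels (True class) only contain LI and DI ice types. Values are given in percentage [%]. See Table 1 for ice classes.
Table 4. Confusion matrix for the validation ROIs over drift ice (table body not reproduced here). Values are given in percentage [%]. See Table 1 for ice classes.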
4.3. Comparison to ship-based in-situ photographs
For the in-situ validation of our algorithm, we compare the classification results to visual observations from the monkeytop camera of KPH at the time of image acquisition. There were in total ten occasions during the cruise at which KPH was in areas of high SIC and within the footprint of an S1 image. Figure 6 shows four representative examples of the monkeytop photographs together with 25 × 25 km close-ups of the coincident S1 images and the corresponding classification results, centered around the position of KPH. The comparison with in-situ data shows that large areas of both LI and DI are correctly classified (Figs. 6a, c). Both ice types are also mapped correctly in more heterogeneous regions with smaller floe sizes and a mixture of classes (Fig. 6d). Note that in-situ ice cores taken during the cruise (not shown here) indicate that the deformed ice areas do in fact contain a mixture of FYI and MYI, which is in agreement with our initial assumption during ice class and training data selection.
For YI, the comparison of in-situ observations and classification results reveals significant classification errors. The region around KPH in Figure 6b is mostly classified as LI, while the monkeytop photograph clearly shows YI. Additional visual observations and manual photographs (not shown here) confirm that the YI in this example has little small-scale (mm to dm) or large-scale (m) surface roughness, hence its backscatter signal is relatively weak. Further south in the same image, we find rougher YI with a stronger backscatter signature that is mapped correctly. These results are in good agreement with the significantly lower CA scores for the YI class.
4.4. Comparison to operational ice charts
Operational ice charts are the standard product issued by national ice services to support maritime navigation in the Arctic. For the Belgica Bank area at the time of the CIRFA-22 cruise, the NIS produced charts on weekdays showing SIC while the Greenland Ice Service at the Danish Meteorological Institute (DMI) produced weekly ice charts with total SIC, partial concentrations for different SoD, and floe size distributions. Figure 7 shows an example comparison of original SAR imagery, NIS and DMI ice charts, and our classification result. While the polygons in both ice charts generally reflect the large-scale (tens of kilometers) patterns in the SAR imagery and the classification result, the pixel-wise ice type labels from our algorithm provide much finer spatial information on the individual lead and floe scale level, which is not present in either of the ice charts. The two large polynya areas along the fast ice edge are identified in both ice charts. The NIS characterizes the polynyas as a mixture of Very Open Drift Ice and Close Drift Ice, while the DMI ice chart denotes them as Close Drift Ice with YI, nilas and thin FYI as the predominant SoD. Neither of the ice charts provides the precise location or orientation of refrozen leads or deformed ice floes.
5. Discussion
The automated ice type mapping and information transfer to the vessel were clearly successful. To the best of our knowledge, this is the first time that such results were actually sent into the field in NRT. While having high-resolution classification images available on KPH within a few hours after image acquisition was beneficial for navigation and route planning, the information was also used to guide scientific questions and decisions about the locations of ice stations and in-situ measurements during the cruise. An example case of how the SAR data and classified images were used for tactical decisions is given at the end of this section. Using the fully-automated processing chain, the classification results were available significantly faster than traditional ice charts and contained more detailed spatial information than the standard ice charts by the NIS, which only provide SIC for the AOI. The DMI ice charts contain some information on ice types (SoD and floe size) that could potentially be used for route planning, yet it is not provided with the same spatial detail as our classification product. Furthermore, the DMI charts for the AOI were issued only once per week at the time of the cruise. For the tactical navigation within the ice, the timeliness as well as the temporal and spatial resolution of our classification product is clearly preferable. It should be noted, however, that the setup for automated support during the cruise required a considerable amount of preparation work, most importantly the training of a pixel-wise classifier for the specific area and season of interest. While we were able to do this successfully for this demonstration example, it cannot necessarily be directly transferred to other ice regions or times of the year.
Furthermore, a different cruise or operation with another research vessel may have different requirements on the mapped ice types, both because of scientific research questions and because of the ice-breaking capabilities of the vessel.
The quantitative evaluation of the classification results shows that OW, LI, and DI are generally mapped well by the classifier, achieving maximum accuracies of 99.9 %, 94.3 % and 99.7 %, respectively. It is noteworthy that these maxima for the different classes are achieved at different ML levels. For DI, the CA improves steadily with increasing ML windows and achieves its maximum value for ML 21 × 21 (Tables 3 and 4). This indicates that classification errors of DI are largely caused by class-internal speckle variation. A larger number of looks reduces speckle and results in a tightened class distribution around the mean, hence reducing the number of misclassified pixels. For OW and LI, however, maximum CA is achieved at ML 9 × 9 (Table 4). This can be explained by two competing effects: On the one hand, similar to the DI case, a larger number of looks decreases the class-internal speckle and hence improves the CA. On the other hand, edge effects and mixing of classes in large ML windows can lead to classification errors close to the boundaries between two different classes. As the regions covered by OW or LI within our study AOI are often significantly smaller than the areas covered by DI, these boundary effects are more prominent for OW and LI. Note also that within the landfast ice, there are larger contiguous regions of LI, and the maximum CA of LI within the landfast area is achieved with ML 21 × 21 (Table 3).
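The effect of the number of looks on speckle can be illustrated with a small simulation. The sketch below assumes fully developed speckle and statistically independent looks, in which case an L-look intensity follows a gamma distribution with shape L and its relative standard deviation is 1/√L. Neighbouring GRDM pixels are spatially correlated, so the effective number of looks after ML is lower than the nominal product of 12 and the window area. The function name and sample size are arbitrary choices.

```python
import math
import random
import statistics


def simulate_cv(looks, n=20000, mean=1.0, seed=0):
    """Relative standard deviation of simulated L-look intensities.

    Samples are drawn from a gamma distribution with shape `looks` and
    scale `mean / looks`, so the expected value equals `mean`.
    """
    rng = random.Random(seed)
    samples = [rng.gammavariate(looks, mean / looks) for _ in range(n)]
    return statistics.pstdev(samples) / statistics.mean(samples)


# 12 looks (nominal value for EW2 to EW5) versus 12 x 9 = 108 looks, the
# idealized value after 3 x 3 averaging if all pixels were independent.
cv_12 = simulate_cv(12)
cv_108 = simulate_cv(108)
theory_12 = 1.0 / math.sqrt(12)
theory_108 = 1.0 / math.sqrt(108)
```

Under these idealized assumptions, going from 12 to 108 looks reduces the relative spread of the intensity by a factor of three, which tightens each class distribution around its mean as described above.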
YI is the most difficult ice type to classify and achieves the lowest score of all classes in this study. Its maximum CA is at 54.2 % with a ML window of 9 × 9. The main challenge for the classification of YI is its highly variable small-scale surface roughness with respect to the radar wavelength. For C-band, this small-scale roughness is on the order of millimeters to centimeters, which corresponds to changes of the surface caused for example by frost flowers or snow crusts (Isleifson and others, Reference Isleifson2013). While very smooth YI with low-backscatter is misclassified as LI (38.7 %) or OW (6.6 %), very rough YI with strong-backscatter can be misclassified as DI (0.6 %). These fractions of misclassified YI indicate that our initial selection of YI training data before the cruise was biased towards rough YI. One way to mitigate this issue could be to introduce a second YI class that is trained on smoother YI. While this should increase the YI CA, it would also increase the false positives and wrongly classify more LI as YI. This is unwanted, in particular in the landfast ice area. Separately trained classifiers for the landfast ice and the drift ice can be used to overcome this issue, but will require either a manual delineation or an automated detection of the fast ice edge. This is beyond the scope of the present study, but we are planning to investigate it in future work.
It should also be noted that the absolute CA numbers must be interpreted carefully, as they depend on the subjective selection of validation ROIs. This is a common problem in any traditional evaluation of a classifier that is based on a training and a test set. In this study, we therefore also qualitatively compare the classification results with in-situ observations during the cruise. Overall, the quantitative accuracy assessment is in good agreement with the qualitative comparison of in-situ observations and classification results. OW, LI, and DI are correctly classified whenever the monkeytop photograph shows the respective ice type (Fig. 6). Furthermore, the monkeytop photographs and IceWatch observations also confirm the challenges of YI classification, as for example the relatively smooth YI in Figure 6b is misclassified as LI.
The ten occasions during the CIRFA-22 cruise when KPH was located within an S1 footprint are not sufficient to use the in-situ observations for a quantitative evaluation of the classification results. Yet our qualitative comparison here shows the potential of such ship-based photographs to be used for the assessment of automated ice type classification. Manual IceWatch observations and photographs can be used in the same way as the monkeytop photographs. However, to facilitate the use of IceWatch data, the observations need to be aligned with the S1 acquisition schedule, which is not always feasible given the many different tasks that are carried out on a cruise. Hence, automatically acquired photographs provide a more practical solution. In the future, it would be beneficial to install additional monkeytop cameras looking not just to the front, but also to the sides of the ship. Especially in variable sea ice conditions, this is an easy way to increase the amount of validation data. In a large-scale study using monkeytop photographs for quantitative accuracy assessment, the photographs taken in the different directions at fixed geometries can then be warped onto a map and directly compared to classification results. This can also be useful to validate not only ice type classification, but also pixel-wise ice-water mapping in the marginal ice zone, such as the method introduced by Wang and others (Reference Wang, Lohse, Doulgeris and Eltoft2023).
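Identifying the occasions on which the ship was inside an image footprint can be automated. The sketch below is a minimal illustration and not the procedure used in this study: it assumes that footprints and ship positions are given in the same projected map coordinates (so that a planar point-in-polygon test is adequate), and the scene dictionaries, the track format, and the 10-minute time tolerance are hypothetical.

```python
from datetime import datetime, timedelta


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) to the right cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def position_at(track, time, max_gap=timedelta(minutes=10)):
    """Return the (x, y) fix closest in time, or None if none is close enough."""
    closest = min(track, key=lambda fix: abs(fix[0] - time))
    if abs(closest[0] - time) > max_gap:
        return None
    return closest[1], closest[2]


def scenes_containing_ship(scenes, track):
    """Names of scenes whose footprint contains the ship at acquisition time."""
    hits = []
    for scene in scenes:
        position = position_at(track, scene["time"])
        if position is not None and point_in_polygon(*position, scene["footprint"]):
            hits.append(scene["name"])
    return hits


# Hypothetical example data (times, coordinates and names are made up).
FOOTPRINT = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
TRACK = [
    (datetime(2022, 5, 3, 7, 0), 5.0, 5.0),
    (datetime(2022, 5, 4, 7, 0), 15.0, 5.0),
]
SCENES = [
    {"name": "scene_A", "time": datetime(2022, 5, 3, 7, 5), "footprint": FOOTPRINT},
    {"name": "scene_B", "time": datetime(2022, 5, 4, 7, 2), "footprint": FOOTPRINT},
    {"name": "scene_C", "time": datetime(2022, 5, 5, 7, 0), "footprint": FOOTPRINT},
]

if __name__ == "__main__":
    print(scenes_containing_ship(SCENES, TRACK))
```

For footprints given in geographic coordinates at high latitudes, the positions and footprints would first have to be transformed to a suitable map projection.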
5.1. Example of guided tactical decision-making during the cruise
In Figure 8 we show an illustrative example of how the classified imagery was used on board KPH to support tactical navigation during the cruise. The figure shows a time series of four SAR images over the small AOI between May 2nd and May 5th 2022. KPH's position is indicated by a red marker within each image and the ship track between the previous and the current image acquisition is shown by an orange line. Older ship tracks are displayed as gray lines.
On May 2nd, the scientific work within the “southern polynya” was finished and the cruise plan was to go north through an area of DI to reach a planned fast ice station adjacent to the northern tip of the “northern polynya”. At the time of image acquisition on May 3rd (Figs. 8b, f), KPH was traversing the northern polynya and making fast progress. Given the information from the imagery on this day (SAR imagery or classification result), it became clear that the polynya was kept open by a large DI floe (marked by a red ellipse) that blocked the smaller DI floes drifting in from the north. As the large floe slowly drifted southward, the goal of reaching the planned fast ice station with enough time left to conduct valuable work became unfeasible. Instead, various in-situ measurements were conducted within the northern polynya and on the drift ice just next to the large floe. The imagery on May 4th (Figs. 8c, g) indicated that the drift of the large floe had slowly turned towards the fast ice edge in the west, posing a risk of trapping KPH between the floe and the fast ice edge. This led to the decision to escape back around the large floe (see the ship track in Figs. 8d, h) before getting stuck. Without the high-resolution information available on board and the fine spatial detail provided by the pixel-wise classification result in comparison to the ice charts (Fig. 7), these considerations and decisions would not have been possible and much cruise time and fuel would likely have been wasted trying to follow the original cruise plan.
6. Conclusion
In this study, we have taken a step to bridge the gap between research and operations in automated ice type mapping. We have successfully demonstrated the application of a fully automated sea ice type classification workflow at MET Norway and transferred classification results in NRT to a vessel in the Arctic, providing detailed sea ice information at fine spatial resolution in support of tactical navigation and decision making. We have evaluated the classification results for individual ice types using validation ROIs that are confirmed by observations in the field. The results show that OW, LI, and DI are mapped with high accuracy, while YI remains challenging due to variable small-scale (mm to dm) and large-scale (m) surface roughness. Finally, we have used ship-based in-situ photographs to qualitatively evaluate the classification results. The comparison shows good agreement between observations and classification of OW, LI, and DI, while it also reflects the challenges for reliable identification of YI, caused by its small-scale surface roughness variability.
For the next steps of this work, we plan to classify ice types within the landfast ice and drift ice areas separately, using either a manually or automatically detected fast ice edge (Selyuzhenok and Demchev, Reference Selyuzhenok and Demchev2021; Wang and others, Reference Wang, Wang, Li, Ni and Liu2021). This will reduce classification errors within the fast ice and furthermore allow us to test the incorporation of multiple YI classes in the drift ice, without compromising the landfast ice type classification. While the reliable separation of ice types within the fast ice is only of minor importance for ship traffic, it can be critical at inhabited coasts where people move on fast ice for hunting, fishing, and access to islands (Segal and others, Reference Segal, Scharien, Duerden and Tam2020).
Furthermore, recall that we did not train a separate class for large open water areas. While it has been shown that the algorithm used in this study can classify OW using a combination of intensity features and image texture (Lohse and others, Reference Lohse, Doulgeris and Dierking2021), other publications indicate that convolutional neural networks (CNNs) perform ice-water separation faster and more reliably (e.g. Malmgren-Hansen and others, Reference Malmgren-Hansen2021; Stokholm and others, Reference Stokholm2022; Chen and others, Reference Chen2023; Wang and others, Reference Wang, Lohse, Doulgeris and Eltoft2023). Hence, in future work, the fine-resolution ice type classification presented in this study can be applied in regions that are identified as high SIC by a CNN.
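Such a combination could be as simple as masking the ice type map with a concentration map. The sketch below assumes that both products are available on the same pixel grid; the SIC threshold of 0.8, the output code for low-SIC pixels, and the function name are hypothetical choices for illustration and are not taken from this study.

```python
import numpy as np

# Hypothetical conventions, not taken from the study.
LOW_SIC = 255          # output code for pixels outside the high-SIC region
SIC_THRESHOLD = 0.8    # assumed minimum concentration (fraction, 0 to 1)


def restrict_to_high_sic(ice_type, sic, threshold=SIC_THRESHOLD):
    """Keep ice type labels only where the concentration map is high."""
    ice_type = np.asarray(ice_type)
    sic = np.asarray(sic, dtype=float)
    if ice_type.shape != sic.shape:
        raise ValueError("ice_type and sic must be on the same grid")
    # NaN concentrations compare as False and are therefore masked as well.
    return np.where(sic >= threshold, ice_type, LOW_SIC)
```

In practice, the two products would first have to be resampled to a common grid, and the threshold would need to be chosen in consultation with the users of the product.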
Finally, the monkeytop photographs can be warped onto the geocoded SAR classification results and thus allow large-scale qualitative evaluation of the retrieved ice types independent of manually selected validation ROIs. We expect that this “real-world” validation based on a comparison between automatically mapped ice types and in-situ observations will increase stakeholders’ trust in the automated products and thus facilitate the transition of algorithms from research into operations at the ice services.
Acknowledgements
This research was funded by CIRFA partners and the Research Council of Norway (RCN) (grant number 237906) and by ESA through ESA contract RFP/3-17845/22/NL/FF/ab. The presented work contains primary and altered Sentinel data products (©Copernicus data). We would like to thank the anonymous reviewers for their helpful comments and feedback. The suggested changes have significantly improved this manuscript. We also thank Wolfgang Dierking for his invaluable input during the writing process, which improved the quality of this manuscript.
Appendix A
Sentinel-1 data
For the classification time series, we used in total 277 S1 scenes that were acquired between March 1st and May 31st 2022 and intersect with the larger AOI shown in Figure 1. A complete list of images can be obtained using an automated search on the Copernicus dataspace platform or by contacting the corresponding author. Additional scenes from previous years and earlier months in 2022 were used for visual inspection and the selection of training data.
Table 5 lists the ten S1 scenes that were acquired while KPH was within the footprint of the image during the CIRFA-22 cruise and indicates if (and where) the scenes are shown in this publication.