Introduction
Mastoid surgery is the cornerstone in the surgical management of middle- and inner-ear disease. The temporal bone has unique complexity as the structures of interest are encased in bone and are not easily appreciable to most observing learners.Reference Sanna, Russo and Taibah1 Training in temporal bone surgery, therefore, needs repeated dissection and practice to understand the complex anatomy and the safe use of instruments.Reference George and De2 Traditional methods of surgical education focussed on the need for clinical experience to gain insight and to refine surgical skills: ‘The more you do, the more you know.’Reference Shaw and Shea3 However, modern ethicolegal discourse argues that the early stages of learning should take place outside the operating theatre until the trainee has gained appropriate operative knowledge and skills, and can manage basic technical issues while ensuring patient safety.Reference Kotsis and Chung4
Temporal bone dissection courses remain essential for acquiring surgical competency and deeper anatomical understanding.Reference Arnoldner, Lin and Chen5 Conventionally, training in mastoid surgery has relied on cadaveric dissection, a method long regarded as a benchmark for high quality training. However, cadaveric specimens are becoming more difficult to obtain due to the scarcity of bones, their cost, ethical issues, and risk of infection.Reference Naik, Naik and Bains6, Reference Bhutta7
Some of these challenges have prompted the exploration of novel training methods.Reference Musbahi, Aydin, Al Omran, Skilbeck and Ahmed8 There are now alternatives to cadaveric dissection that otolaryngologists can use to learn the anatomical and practical aspects of temporal bone dissection.Reference Wiet, Rastatter, Bapna, Packer, Stredney and Welling9 Simulation-based education utilising artificial temporal bone models is a promising approach to allow surgical skill enhancement through deliberate practice.Reference Bakhos10 The models are synthetic anatomical replicas produced at relatively low cost with potentially greater ease of acquisition.Reference Ke, Ma, Zhang and Sun11 They allow temporal bone dissection to be replicated and give the opportunity for trainee competency to evolve without risk to patients or time constraints.Reference Rajaratnam, Rahman and Dong12
Aim
The aim of this pilot study was to evaluate the face validity and content validity of artificial temporal bone dissection within the UK training structure. Face validity was assessed by determining whether artificial temporal bone dissection delivered a realistic experience of temporal bone surgery by comparing it with in vivo mastoid surgery and cadaveric bone dissection. Content validity was assessed by determining if operating on artificial temporal bones covered the necessary anatomical structures and surgical techniques for effective surgical training.
Ultimately, this study seeks to contribute to the advancement of surgical skill training by validating the usefulness and realism of artificial temporal bone dissection. This is a pilot study to help inform the design of potential randomised, controlled trials.
Method
Participants were otolaryngologists attending the University Hospital Birmingham ENT Dry Simulation Laboratory between March 2023 and July 2023. Participants attended temporal bone skills courses tailored to their experience. The courses used two artificial temporal bone models (Table 1), produced by PHACON (Atlanta, Georgia, USA) and MED-EL (Innsbruck, Austria). Allocation to these bones was based on availability of the resource at the time (convenience sampling). Therefore, unless otherwise specified, the artificial temporal bones are treated as a single intervention: the study assesses the concept of artificial temporal bone dissection in UK training rather than the individual bone products. At the conclusion of the course, participants completed questionnaires assessing the content and face validity of the temporal bone model they had used for their dissection.
Table 1. Constituent materials of dry bone models (reproduced with permission of manufacturers)

The cohort was subdivided based on their previous mastoidectomy experience: those who had performed more than 50 mastoidectomies were classed as experts, while those who had performed 50 or fewer were classed as non-experts. The grade of the participant and years of consultant experience (where applicable) were also recorded.
As per the Health Research Authority decision tool, formal ethical approval was not required. However, the project was locally reviewed by the University Hospital Birmingham research and development department and registered as a project without objections.
All participants were introduced to the artificial temporal bone dissection station, equipment, and facilities. In any one session, between 1 and 9 participants drilled simultaneously. Consultant otologists with experience in surgical simulation training provided guidance to participants at a maximum ratio of 3 participants to 1 faculty member. All participants were allotted 90 minutes of drilling time on the artificial temporal bones to complete their dissection. Cortical mastoidectomy and posterior tympanotomy included: opening the cortex over Macewen’s triangle, exposing and delineating the sinodural angle, thinning the posterior ear canal wall, identifying the short process of the incus and the lateral semicircular canal, delineating the vertical segment of the facial nerve and chorda tympani, and performing a posterior tympanotomy to visualise the round window niche. Some participants also went on to perform translabyrinthine dissection, middle-ear and cochlear implantation. These procedures were not validated in this study.
After completing the dissection, all participants completed a 22-item questionnaire, assessing face validity in comparison with cadaveric temporal bone dissection and in vivo mastoid surgery.Reference Reddy-Kolanu and Alderson13 Content validity (utility of the artificial temporal bone for training) was assessed by a further seven items.Reference Aussedat, Venail, Marx, Boullaud and Bakhos14 The items were adapted from previous literature used to validate digital temporal bone simulators.Reference Reddy-Kolanu and Alderson13, Reference Aussedat, Venail, Marx, Boullaud and Bakhos14
Data were collected using Microsoft Forms and Excel (Washington, USA). Statistical analysis of quantitative data was conducted using the Statistical Package for the Social Sciences (SPSS) version 29 (Chicago, USA). Each ordinal scale was converted to a numerical score (1–5) and totalled to provide a quantitative measure of both face and content validity.
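The scoring step described above can be illustrated with a minimal Python sketch. It is not the authors' SPSS procedure. The function name `score_responses`, the five response labels, and the example responses are all hypothetical, because the questionnaire wording is not reproduced in the text. Only the 1–5 conversion, the totalling, and the seven content-validity items come from the Method.

```python
from typing import Sequence


def score_responses(responses: Sequence[str], ordered_labels: Sequence[str]) -> int:
    """Convert ordinal responses to scores of 1..5 and return their total."""
    if len(ordered_labels) != 5:
        raise ValueError("expected a five-point ordinal scale")
    # Rank 1 is the least favourable label, rank 5 the most favourable.
    lookup = {label.lower(): rank for rank, label in enumerate(ordered_labels, start=1)}
    return sum(lookup[response.strip().lower()] for response in responses)


# Hypothetical five-point scale, ordered from least to most favourable.
CONTENT_LABELS = [
    "not valuable",
    "slightly valuable",
    "moderately valuable",
    "valuable",
    "extremely valuable",
]

# Hypothetical answers from one participant to the seven content-validity items.
example_responses = ["extremely valuable"] * 6 + ["valuable"]
example_total = score_responses(example_responses, CONTENT_LABELS)  # 6 * 5 + 4 = 34
```

Totalling in this way treats each item as equally weighted, which is what allows a single subscale score to be compared against a fixed maximum.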
Results
Thirty-three respondents completed the questionnaire. Demographics of respondents can be seen in Table 2. Item responses assessing face validity for both artificial temporal bones are displayed in Figures 1 and 2. Item responses for content validity are displayed in Figure 3.
Table 2. Demographics of respondents


Figure 1. Face validity assessing artificial temporal bone against cadaveric temporal bone (organised from factors favouring cadaveric to factors favouring artificial).

Figure 2. Face validity comparing artificial temporal bone with in vivo temporal bone (organised from most unrealistic to most realistic elements).

Figure 3. Items assessing content validity of the artificial models.
Total scores for face validity (parts 1 and 2) were normally distributed (Shapiro–Wilk test; p = 0.570 and p = 0.348, respectively), whilst content validity scores were not normally distributed (Shapiro–Wilk test; p < 0.001). Overall, content validity had a median score of 34.00 (interquartile range 32.00–35.00) of a maximum score of 35. Face validity part 1 compared artificial with cadaveric models and had a mean score of 28.00 (95 per cent CI 25.30–30.70). This 95 per cent CI crossed the midpoint score of 27.00, which represents equivalence, meaning that neither modality was considered significantly preferable to the other. The mean score for face validity part 2 was 45.76 (95 per cent CI 42.57–48.94) of a maximum of 65 (the higher the value, the more realistic compared to real human tissue). Table 3 demonstrates the difference in total scores in the three subscales between expert and non-expert surgeons.
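The arithmetic behind the reported midpoint and maximum scores can be sketched as follows. The item counts of 9 (part 1) and 13 (part 2) are inferred rather than stated: they are the counts that give a midpoint of 27 and a maximum of 65 on a 1–5 scale, and they sum to the 22 items reported in the Method. The helper names `scale_bounds` and `ci_excludes` are hypothetical, and the confidence limits are copied from the text rather than recomputed from raw data.

```python
def scale_bounds(n_items: int, low: int = 1, high: int = 5) -> tuple:
    """Return (minimum, midpoint, maximum) total score for n_items scored low..high."""
    return (n_items * low, n_items * (low + high) / 2, n_items * high)


def ci_excludes(ci_low: float, ci_high: float, value: float) -> bool:
    """True when value lies outside the confidence interval [ci_low, ci_high]."""
    return value < ci_low or value > ci_high


part1_bounds = scale_bounds(9)     # inferred item count; midpoint 27.0
part2_bounds = scale_bounds(13)    # inferred item count; maximum 65
content_bounds = scale_bounds(7)   # seven items stated in the Method; maximum 35

# Reported 95 per cent CI for face validity part 1: 25.30 to 30.70.
# The midpoint lies inside the interval, so the mean does not differ
# significantly from equivalence.
part1_differs_from_equivalence = ci_excludes(25.30, 30.70, part1_bounds[1])
```

Checking whether the interval contains the midpoint corresponds to a two-sided test against equivalence at the 5 per cent level; it does not by itself establish that the two modalities are equivalent.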
Table 3. Total scores by number of mastoids completed

* Part 1 refers to comparison with cadaveric bone in which a midpoint score of 27 would represent equality, < 27 favours cadaveric, > 27 favours artificial. **Part 2 refers to comparison to real patients (higher score, out of 65, represents greater realism). #Higher score, out of 35, represents greater education value.
When comparing scores based on the experience of consultants, there was no significant difference in any subscale total scores (face = Kruskal–Wallis test; part 1 p = 0.275; part 2 p = 0.059; content = ANOVA p = 0.132); however, there was a trend towards lower scores with greater years of experience.
When comparing MED-EL with PHACON artificial temporal bones, there was a significant difference in the proportion of experts and non-experts between the two groups (chi-square; p = 0.009), with MED-EL having a greater number of experts (13 vs 4). Despite this, there was no significant difference between the two artificial bones in the total scores for content validity (Mann–Whitney U test; p = 0.606) or for either of the face validity scales (independent t-test; part 1 p = 0.133; part 2 p = 0.105). Nor was there a significant difference in realism for any individual item in face validity part 2 (Mann–Whitney U test; p = 0.074–0.929). The only item close to significance was ‘Facial nerve/chorda position and appearance’, with the PHACON temporal bone trending towards greater realism (median 4.00, interquartile range 3.00–4.00 vs median 3.00, interquartile range 2.00–4.00). ‘Soft tissue’ also demonstrated no significant difference among the three different artificial temporal bone models (Kruskal–Wallis test p = 0.132). Additionally, neither model without soft tissue differed significantly in ‘soft tissue’ score from the PHACON model with soft tissue (pairwise comparison, adjusted by Bonferroni correction, p = 1.000 vs MED-EL and p = 0.729 vs standard PHACON).
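For readers unfamiliar with the Bonferroni adjustment mentioned above, a minimal sketch follows. It assumes the common form of the correction, in which each raw p-value is multiplied by the number of comparisons and capped at 1. The function name `bonferroni_adjust` and the raw p-values are hypothetical and are not taken from the study data.

```python
from typing import List, Sequence


def bonferroni_adjust(p_values: Sequence[float]) -> List[float]:
    """Multiply each p-value by the number of comparisons, capping the result at 1."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons among three models.
raw_p_values = [0.50, 0.02, 0.20]
adjusted_p_values = bonferroni_adjust(raw_p_values)
# The first value is capped at 1.0; the others are roughly 0.06 and 0.6.
```

Under this form of the correction, an adjusted p-value reported as 1.000 is a capped value, so it indicates only that the unadjusted p-value was at least one divided by the number of comparisons.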
Respondents were asked to rate at what stage of training (or equivalent non-training grade) they thought a model would be a useful training tool. The results are displayed in Figure 4.

Figure 4. Responses to item: At what stage of training (or equivalent non-training grade) do you think this model would be a useful training tool? CCT = Certificate of Completion of Training; ST = Speciality Training.
Discussion
In this study, we sought to assess the face validity and content validity of artificial temporal bone dissection for surgical training and skill development. Otolaryngology consultants, trainees, and post-Certificate of Completion of Training fellows were recruited to participate in the temporal bone dissection and its validation. Good levels of face and content validity were achieved in most domains.
Specifically, content validity (the usefulness of the technique as a training tool) was rated very highly across the board, with the median score close to the maximum possible score on this scale. All respondents considered artificial bones valuable or extremely valuable for learning anatomy, drilling, and hand-eye coordination. Nearly all participants also rated artificial temporal bones overall as an extremely valuable teaching tool. There was no difference in perceptions between experience levels: experienced trainers and trainees or relative novices rated the educational experience as equally valuable.
These results are consistent with previous work examining other methods of non-cadaveric temporal bone simulation.Reference Reddy-Kolanu and Alderson13, Reference Bone and Mowry15 The Voxel-Man TempoSurg virtual reality simulator excels in enabling repeated practice, offers ease in controlling difficulty levels, and effectively captures a wide range of clinical and pathological scenarios.Reference Reddy-Kolanu and Alderson13 Likewise, low-cost 3D-printed temporal bone models have been shown to be an effective addition to cadaveric temporal bones for training residents in cortical mastoidectomy.Reference Bone and Mowry15
Furthermore, our study population was able to provide some additional context to this. Early to middle grade surgical trainees were perceived to be those that would gain the most benefit from use of the artificial temporal bones (phase 1 and 2 surgical trainees). However, there was some appreciation that the models would be useful for all grades (from medical students to consultant otologists). Evidence suggests that expertise development, irrespective of career stage, requires repetitive rehearsal of procedure,Reference Malik, Varela, Park, Masood, Laeeq and Bhatti16 which may be where the value is being seen for these models. This view is supported by broader studies showing that training using simulation-based education remains valuable throughout a medical career.Reference Abas and Juma17 Artificial temporal bone dissection provides surgical trainees with a secure and controlled setting to refine their skills in intricate procedures, all without jeopardising patient well-being. This platform permits repetitive practice, the mastery of surgical techniques, and the acquisition of diverse skills before their application in actual surgical scenarios.Reference Ke, Ma, Zhang and Sun11 Moreover, this method of training allows greater scaffolding with comprehensive feedback and evaluation, thereby assisting in the progression of surgical proficiency.Reference Ioannou, Zhou, Wijewickrema, Piromchai, Copson and Kennedy18
Artificial temporal bone dissection was considered predominantly realistic in 11 of the 13 items of face validity compared to real human tissue. The only exceptions were soft tissue (which was felt to be highly unrealistic) and odour generation (which the majority rated as neither realistic nor unrealistic). The soft-tissue result could have been due to only a minority of the artificial bones used having a soft-tissue component to dissect (the 27.3 per cent that used PHACON with soft tissue). However, within this item there was also no significant difference in score between those with or without soft tissue, suggesting that the soft tissue on the PHACON model was rated as being as unrealistic as having no soft tissue. This is useful for planning mastoid surgery training because artificial models can play a significant role with regard to the temporal bone work, but soft-tissue work will need to continue to be learnt in vivo.Reference Wiet, Sørensen and Andersen19
Most participants considered the external contour of the bone and mastoid architecture extremely realistic. Therefore, from this perspective the artificial models appear to have excellent face validity. This is marginally reduced for items relating to the otic capsule and facial nerve, which suggests that, although these items were not rated as poorly as the soft tissue, training in procedures involving these parts of the temporal bone may need to be augmented with other modalities (e.g. cadaveric bone). Certainly for items relating to the cortical mastoidectomy, the models appear to have high face validity in comparison to human tissue.
Non-experts significantly favoured dissection of the artificial temporal bone over cadaveric bone (95 per cent confidence interval above the equivalence value of 27.0). The expert group had a greater preference for cadaveric bone, but their score was not significantly lower than the score of equivalence. Therefore, we can conclude that artificial bones are rated as at least equivalent to cadaveric bone for training overall. Artificial bone appears superior to cadaveric bone for providing repetition, enabling regular training, controlling the difficulty and being more amenable to different teaching methods.
There are two clear items that are exceptions for which cadaveric bone was clearly felt to be superior: feedback aiding learning and replication of performance of the same procedure on a real patient. It is easy to conceptualise why the latter would favour cadaveric, especially when considering the weakness in realism of the soft tissue that has been already highlighted. However, the former is less clear. This seems to suggest that the delivery of feedback in a cadaveric dissection setting is superior to the artificial. This may reflect the way our unit delivers artificial bone teaching and is an area that will need to be reviewed internally. Studies examining the application of artificial bones in the same setting as cadaveric bone would be useful to further assess this outcome to determine if this is related to the models or the setting. This might negate some of the potential benefits of using artificial bone (not requiring human-tissue-act-licensed premises). This item is also incongruous with other items that relate to delivery including providing a non-threatening environment, focussing on learning needs and having clear goals and outcomes, which all demonstrated relative equivalence between the two modalities. These responses are in line with previous studies suggesting that planned practice on cadaveric temporal bones can result in proficiency in both surgical skills and human anatomy.Reference Naik, Naik and Bains6, Reference Sudhakara Rao, Chandrasekhara Rao, Raja Lakshmi, Satish Chandra and Murthy20, Reference Irugu, Singh, Sikka, Bhinyaram and Sharma21
Virtual reality simulation and artificial temporal bone models each present distinct advantages and challenges.Reference Aussedat, Venail, Marx, Boullaud and Bakhos14, Reference Gawęcki, Węgrzyniak, Mickiewicz, Gawłowska, Talar and Wierzbicka22 Studies of virtual reality have demonstrated that this form of simulation offers an immersive and repeatable learning experience, albeit with limitations in providing realistic haptic feedback and with significant initial setup costs.Reference Aussedat, Venail, Marx, Boullaud and Bakhos14, Reference Gawęcki, Węgrzyniak, Mickiewicz, Gawłowska, Talar and Wierzbicka22 On the other hand, artificial models offer a tangible and realistic training environment, prioritising haptic feedback and hands-on experiences akin to real-life patient surgical scenarios. Trainees can practise with theatre equipment in a controlled setting without any risk to actual patients.
Reddy-Kolanu and AldersonReference Reddy-Kolanu and Alderson13 reported the face validity of virtual-reality simulation in an experienced cohort. This was deemed acceptable for anatomical structures but received lower scores for drill ergonomics and haptic feedback. It garnered less-favourable assessments for face validity concerning its role in senior training levels. However, virtual reality exhibited stronger performance in terms of content validity. Subjects perceived it as a valuable educational tool for identifying critical structures and relevant landmarks. The choice among these simulation-based training approaches hinges on various factors, including the specific training objectives, the trainee’s level of expertise, accessibility, and the preferences embedded within the training curriculum.Reference Compton, Agrawal, Ladak, Chan, Hoy and Nakoneshny23
When a comparison between expert and non-expert opinions was made in this study, it was observed that experts rated the face validity (realism and acceptability) significantly less favourably than non-experts. This may be because of their professional experience of drilling real bone. Although those with high levels of experience are likely to be experienced trainers, we cannot ignore the preference of novices/trainees, who are in their own right adult learners and will engage better with learning when they have autonomy over how and when they learn.Reference Milliren, Evans, Richmond and Dunn24 This, coupled with the identification of junior trainees as the most likely to find artificial temporal bone useful, suggests that artificial temporal bones are likely to hold a role in a multimodal training curriculum. This approach has been highlighted in otology training programmes elsewhere in Europe.Reference Knowles25
Neither experts nor non-experts were likely to favour one producer of artificial temporal bone over another, which implies that both hold potential and can serve as valid alternatives to traditional training in temporal bone dissection. The decision about which producer of artificial temporal bone to use must be based on cost and further validity assessments. Both bones can offer a high-quality environment for surgical training and further research can now be undertaken as to whether their use objectively enhances the learning curve for mastoid surgery training.
In this study, despite not attaining the highest level of face validity, the MED-EL and PHACON temporal bones both achieved high content validity for cortical mastoidectomy. This implies that an artificial temporal bone does not necessarily require the maximum level of realism to be a useful training tool.
• Studies show that artificial temporal bone models effectively mimic real surgery and cadaveric training, with participants finding them convenient and valuable
• These models potentially offer a low-cost, repeatable alternative, enhancing surgical skills without patient risk
• This study contributes to the growing body of evidence supporting both face and content validity of artificial temporal bones in dissection
• Artificial temporal bones provide high content validity, ease of access, and lower costs compared to cadaveric dissection
• Artificial temporal bone dissection is becoming a vital tool in otolaryngology training, with a promising future in curricula
Limitations
This study had some limitations. The experts were not equally distributed across both bone models, as discussed above. However, despite experts being more likely to rate the face validity less favourably, the unequal distribution does not appear to have disproportionately affected the results in this sample. The varied distribution of experts occurred because sponsored courses necessitated the use of a certain bone. For this reason, a randomised trial should be carried out to reduce bias and to assess differences between the models more rigorously.
Conclusion
This study demonstrated the utility of artificial temporal bone dissection in imitating in vivo mastoid surgery as well as cadaveric training. Participants appeared to judge that artificial temporal bone dissection was predominantly convenient as a learning tool. With high content validity, relative ease of access, and lower cost than cadaveric bone, artificial temporal bone dissection emerges as a potential tool in otolaryngology surgical training, with a likely future place in curricula. It should be noted that an artificial temporal bone cannot substitute for the cadaveric temporal bone in all aspects; rather, it enhances the conventional training path of aspiring surgeons.
Financial support
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Competing interests
The authors declare none.