Introduction
Medical and trauma resuscitation in rugged prehospital settings, such as combat zones and austere environments, presents multiple challenges to the practitioner. First responders must provide time-sensitive, critical interventions under precarious circumstances, commonly hampered by material and provider resource scarcity. Reference Anagnostou, Michas and Giannou1 As a result, clinicians may be required to perform sophisticated care beyond the traditional scope of practice for their level of training or certification. Reference Anagnostou, Michas and Giannou1–Reference Orkin, Venugopal and Curran3 Education and maintenance of proficiency in all potentially needed resuscitative procedures in austere environments is not feasible given the breadth of skills that could be needed and the rarity with which the interventions are performed. Reference Latman and Wooley4 Consequently, solutions are needed to provide austere field clinicians with tools to navigate rare but critical procedures.
For such circumstances, properly designed Just-in-Time Guidance (JITG) interventions may present a viable strategy by delivering immediately accessible support that accounts for an individual’s changing contextual and internal state. Reference Nahum-Shani, Smith and Spring5 These JITG strategies may assist clinicians in performing procedures in real-time by providing contextual, “by-the-hip” training that is adaptive to the needs of the target audience and allows them to simultaneously learn and perform the task of interest. Reference Kumar, Nilsen and Abernethy6,Reference Spruijt-Metz and Nilsen7 However, a medium for delivery of JITG that can be accommodated in the field environment is critical.
Augmented reality (AR) may present a tenable solution to field-initiated JITG for high-stakes, low-frequency procedures. Reference Munzer, Khan, Shipman and Mahajan8 This technology, which combines real and virtual content in an interactive, real-time, three-dimensional environment, is gaining recognition for its potential as a viable medium for medical training, practice guidance, and education. Reference Munzer, Khan, Shipman and Mahajan8,Reference Azuma9 Often harnessed through the use of a wearable device, such as a headset, AR software programs convey an interactive experience during which objects and surroundings present in the user’s real environment are enhanced by computer-generated perceptual information. Reference Munzer, Khan, Shipman and Mahajan8 Unlike virtual reality, during which the entirety of the user’s environment is simulated by software, AR technology is intended to enhance the existing environment and interact with objects in the user’s vicinity. This has the potential to alter how the user engages with their actual environment by providing additional insight or instruction about what the user is viewing and potentially providing guidance on how to interpret, manipulate, or interact in that environment. Reference Munzer, Khan, Shipman and Mahajan8,Reference Siu, Best, Kim, Oleynikov and Ritter10
While obstacles to the dissemination of AR technology, including expense, accessibility, and portability, historically hindered its emergence as a mainstream tool, significant advances have been made in its capabilities, and manufacturers are now producing products that are ergonomic, affordable, and high-resolution. Reference Van Krevelen and Poelman11 These advances have made AR platforms viable for many industries, including medicine. Reference Munzer, Khan, Shipman and Mahajan8,Reference Azuma12 Thus, AR has the potential to enhance clinical practice and procedural proficiency via the integration of auditory, visual, and tactile stimuli and the ability to integrate with simulation equipment or real patients. Reference Dhar, Rocks, Samarasinghe, Stephenson and Smith13–Reference Uruthiralingam, Rea and Rea15
Previous investigation suggests comparable or improved learning results when medical professionals utilize AR technology for training purposes as compared to traditional in-person education, Reference Aebersold, Voepel-Lewis and Cherara16–Reference Rochlen, Levine and Tait18 indicating that it has the potential to serve as an effective primary training tool or enhancement for medical professionals. Intuitively, these results may extend to austere environments by allowing practitioners to integrate virtual guidance and real-time patient information while uninterruptedly interacting with patients and providing care. Reference Broach, Hart and Griswold19 Using AR, first responders could receive step-by-step instructions for life-saving procedures in austere environments or critical access facilities, expediting the time from recognition of dangerous conditions to life-saving intervention. Reference Munzer, Khan, Shipman and Mahajan8 Such a strategy supports the performance of critical, time-sensitive tasks not normally encompassed by their traditional training in situations when a more advanced provider is not available. Reference Rojas-Muñoz, Lin and Sanchez-Tamayo20
Despite its potential as a tool for procedural guidance, literature on the efficacy of AR is limited. To date, few studies have compared performance or subject satisfaction between AR and traditional education for common critical field procedures, nor investigated its impact on learners of different skill levels. There is also a paucity of investigation pertaining to the feasibility of AR use for JITG, during which subjects use the technology to learn and perform procedures for the first time simultaneously. The objective of this study is to evaluate the feasibility and efficacy of AR-mediated JITG for the performance of critical, rugged field procedures by prehospital clinicians as compared to task performance after traditional education delivery. Performed in a highly controlled simulation environment, this investigation evaluated subject performance and time-to-task performance, as well as participant-reported usability and acceptability of the novel AR technology.
Methods
Study Design and Setting
This study utilized a randomized between-subjects design examining emergency medical technician-basic (EMT-B) and paramedic cohorts and was performed in a simulation center at a medical school affiliated with an urban academic tertiary care hospital. The study design and reporting were compliant with CONSORT guidelines for randomized controlled trials in simulation. Reference Cuschieri21 Recruitment began in March 2022, and all study procedures were complete by July 2022.
Selection of Participants
Two types of subjects were recruited: Advanced Life Support paramedics and Basic Life Support emergency medical technicians (EMT-B). Paramedic and EMT-B participants were recruited by an email distributed through local Emergency Medical Services (EMS) agencies and were eligible if they were 18 years or older and held no higher level of licensure or training beyond their EMS certifications. Non-English-speaking subjects, subjects under 18, and subjects unable to provide informed consent were excluded. Participation was voluntary, but all subjects received a small monetary reimbursement for participation. Study activities were performed over five separate days. This study was approved by the Institutional Review Board of the University of Massachusetts Chan Medical School (Worcester, Massachusetts USA; IRB Docket H00023537).
Intervention
The intervention focused on the performance of three common emergency medical tasks: bag-valve-mask (BVM) ventilation, needle chest decompression (Needle-d), and intraosseous (IO) line placement. These tasks were chosen because they are time-sensitive, potentially life-saving procedures that would feasibly be required in austere environments and require specific psychomotor skills. Additionally, they are procedures that would likely be familiar to paramedics but new to EMT-Bs. By local state guidelines, a paramedic would be expected to have competency in all three procedures; however, an EMT-B would be expected to be competent in performing BVM only. Subjects were randomized to one of two types of training conditions for BVM, IO, and Needle-d procedures: a control condition video training or the investigational AR JITG.
In the conventional “control training” activity, subjects observed a lecture video for all three tasks in which an experienced instructor performed and narrated each procedure with visual aids as they would in a traditional training session. After watching the training video, subjects performed the BVM, IO, and Needle-d tasks on simulation mannequins independently in a medium-fidelity simulation environment.
For the investigational arm, subjects utilized AR JITG software to train in the BVM, IO, and Needle-d tasks. The AR JITG software included an experimental, commercially sponsored prototype heads-up display implemented on the Microsoft HoloLens 2 (Microsoft Corp.; Redmond, Washington USA). Reference Keebler, Rosen, Sittig, Thomas and Salas22,Reference Luna, Quispe and Gonzalez23 The investigational AR-based interface was designed and developed using human factors and user-centered design principles, providing real-time training guidance on the targeted tasks and including both voice command and eye tracking/hand gesture interaction mechanisms (eg, say “next,” or use the hand gesture “Air tap” after looking at the “next” icon, to move to the next guidance step). A key design consideration was to provide sufficient guidance to complete procedures without drawing attention away from, or obscuring, the visual field of view during medical care. The prototype consists of a heads-up information display intended to provide step-wise guidance to the user through each stage of the chosen tasks. Figure 1 illustrates the first-person view of the AR training technology (for the BVM task), and Figure 2 illustrates the third-person view (for the IO task) and experimental set-up for the three tasks in the testing space.
For the AR JITG condition, subjects were oriented to the HoloLens hardware and AR JITG software prototype using an AR orientation application that instructed subjects on the use of key functionality without showing any JITG. Subsequently, subjects were instructed to utilize the “just-in-time” AR modules to perform BVM, IO, and Needle-d tasks in the same medium-fidelity environment as the control cohort.
Participants were randomized upon recruitment using a simple computer-generated randomization program to either the video-lecture control training or AR JITG condition. They were further randomized to the order in which they would approach the assigned tasks. Condition assignments were assigned to individual subjects prior to the day of the simulation event and provided to subjects upon arrival at the study site by members of the study team. Subjects utilized the AR orientation application or training videos once and attempted each task twice. Subject completion of each procedure (BVM, IO, and Needle-d) was video recorded for later in-depth scoring. After completion of each task, subjects completed a corresponding set of survey materials regarding the training they received to complete the tasks.
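The allocation procedure described above can be illustrated with a minimal sketch. The study reports only that a simple computer-generated randomization program was used; the specific approach below (shuffling a balanced list of condition labels within a cohort, then shuffling the task order for each subject) is an assumption chosen because it yields equal arm sizes, and the function name `allocate_cohort` and the condition labels are hypothetical.

```python
import random
from collections import Counter

# Task and condition labels are illustrative names, not taken from the study software.
TASKS = ("BVM", "IO", "Needle-d")
CONDITIONS = ("video_control", "AR_JITG")


def allocate_cohort(n_subjects, rng):
    """Assign each subject in one cohort a training condition and a task order.

    Assumption: arm sizes are kept equal by shuffling a balanced list of
    condition labels; the study does not describe its exact algorithm.
    """
    if n_subjects % len(CONDITIONS) != 0:
        raise ValueError("n_subjects must divide evenly across conditions")
    labels = list(CONDITIONS) * (n_subjects // len(CONDITIONS))
    rng.shuffle(labels)  # random assignment of condition to subject slot
    allocations = []
    for condition in labels:
        order = list(TASKS)
        rng.shuffle(order)  # random order in which the subject approaches the tasks
        allocations.append({"condition": condition, "task_order": order})
    return allocations


# Seeded only so that the illustration is reproducible.
cohort = allocate_cohort(30, random.Random(0))
print(Counter(a["condition"] for a in cohort))
```

Running the sketch once per cohort (EMT-B and paramedic) would produce a separate allocation list for each, which could be prepared before the simulation day and handed to subjects on arrival.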
Measurements and Outcomes
The sample size was chosen a priori based on project resource constraints and a review of guidelines for studies using similar methods (ie, comparative usability study) that suggested a sample size of between eight and 25 (varying depending on study complexity) for detection of usability issues and group differences. Reference Macefield24–Reference Nielsen and Landauer26
The primary outcome was subject task performance. Three key results were of interest in the between-conditions comparison for each group: performance on the tasks, including differences in performance between first and second attempts; time to complete the tasks; and self-reported usability measures. Four trained evaluators who were board-certified emergency physicians observed the procedures and reviewed supplementary video footage of subjects completing each simulation task. Task performance was scored according to a pre-determined validated rubric used for scoring practical examinations of EMS certification candidates developed by the National Registry of Emergency Medical Technicians (NREMT; Detroit, Michigan USA). 27,28 These rubrics awarded one point for each required psychomotor task correctly performed, such as setting up or operating a piece of equipment, rendering simulated care to a patient, or evaluating the impact of an action. Each task was scored by one evaluator. Evaluators were blinded to subject type but not to the training method used for the procedure, as the subject could be seen in video footage either wearing or not wearing the investigational headset. Scores were reported as a percentage of possible total points achieved. All data were transcribed into REDCap (Vanderbilt University; Nashville, Tennessee USA), a secure online electronic database. Reference Harris, Taylor and Minor29 Task completion time for each task attempt was also recorded as an additional measure of performance.
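As an illustration of the scoring arithmetic described above (one point per correctly performed rubric step, reported as a percentage of possible points), a minimal sketch follows. The function name `percent_score` and the four-step example are hypothetical and do not reproduce the NREMT rubric items.

```python
def percent_score(steps_performed):
    """Return the percentage of rubric steps performed correctly.

    steps_performed: one boolean per rubric step, True if the evaluator
    judged the step correctly performed. One point is awarded per step.
    """
    if not steps_performed:
        raise ValueError("rubric must contain at least one step")
    points = sum(1 for step in steps_performed if step)
    return 100.0 * points / len(steps_performed)


# Hypothetical four-step rubric with one missed step.
print(percent_score([True, True, False, True]))  # prints 75.0
```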
Secondary outcomes included subject-perceived usability and usefulness. After each task attempt, subjects were instructed to fill out a subjective usability and usefulness questionnaire pertaining to their experience with their assigned training type. They responded to four questions using a seven-point Likert scale: (1) Was the training you received conveyed to you in a usable way? (2) Was the training you received useful to you? (3) Would you opt to use this method of training again for future medical tasks? and (4) How well do you feel you performed the medical task you just performed?
Statistical Analysis
Demographic and outcomes data were reported descriptively. Task performance and usability and usefulness questionnaire data were compared between training conditions using unpaired t-tests. All statistical analyses were completed using JASP version 0.16.1 (University of Amsterdam; Amsterdam, The Netherlands).
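For readers unfamiliar with the unpaired comparison used here, the following is a minimal sketch of the test statistic, not a reproduction of the JASP analysis. It assumes the pooled-variance (Student) form of the test, which the text does not specify; the function name `unpaired_t` and the example scores are invented for illustration only.

```python
import math
import statistics


def unpaired_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for two independent samples.

    Assumption: equal-variance (pooled) Student's t-test; a Welch
    correction would use a different standard error and df.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("both samples have zero variance")
    return (mean_a - mean_b) / std_err, df


# Invented percentage scores, for illustration only (not study data).
control_scores = [80.0, 90.0, 85.0, 95.0]
jitg_scores = [70.0, 85.0, 75.0, 80.0]
t_stat, df = unpaired_t(control_scores, jitg_scores)
print(f"t = {t_stat:.3f}, df = {df}")
```

The P value would then be read from the t distribution with the returned degrees of freedom, a step that statistical packages such as JASP perform automatically.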
Results
In total, N = 60 subjects (n = 30 for each of EMT-B and paramedic cohorts) were enrolled. Fifteen subjects from each cohort were randomized to the AR JITG condition and fifteen to the video control conditions. Subject demographics are summarized in Table 1.
Abbreviation: EMT-B, emergency medical technician-basic.
Table 2 summarizes subject task performance and Table 3 compares task scores between the training conditions. During the first attempt of all three tasks, EMT-B subjects in the video control group achieved higher scores than EMT-B subjects utilizing the AR JITG technology. In the second attempt, there was no difference in score between training types for the BVM or IO tasks, but the video control group performed better on the Needle-d task. When averaged across attempts, there was no statistically significant score difference for the BVM or IO tasks, and the video control group scored better on the Needle-d task (22% score difference; P = .01). In the paramedic cohort, there was no difference in task performance between the training conditions in the BVM or Needle-d tasks, but the video control group had a higher score on both attempts of the IO procedure as well as when averaged across attempts (mean difference 15%; P = .044).
Abbreviations: EMT, emergency medical technician; BVM, bag-valve-mask; JITG, Just-in-Time Guidance; IO, intraosseous; Needle-d, needle decompression.
For EMT-B subjects, there was a significant performance difference between task attempts (collapsed across task types) for the AR JITG group, with a mean 10% improvement in performance score on the second attempt compared to the first attempt (P = .02). This effect was not present for the paramedic cohort in either training condition or for EMT-Bs in the control group.
The time to complete each task, broken down by task type, attempt number, and training condition, is summarized in Table 4. The BVM task was performed the fastest (mean time 1.53 minutes) followed by Needle-d (2.16 minutes) and IO (3.02 minutes). All task-type pairwise comparisons for task completion time were statistically significant (P < .001 for all comparisons). Collapsed across task types, the task completion time was faster in the video control condition compared to the AR JITG condition (mean difference 1.58 minutes; P < .001). Tasks were completed faster during the second attempt compared to the first attempt for all task and condition types (mean difference 0.78 minutes; P < .001). Tasks were also completed faster on the second attempt for AR JITG compared to the first attempt (mean difference 1.278 minutes; P < .001).
Subjects’ responses to the usability and usefulness questionnaire are summarized in Table 5. The only statistically significant difference between the control and JITG groups on any measure of usability or usefulness was among paramedics when rating the likelihood of using the training condition again; in this case, the control condition was statistically preferred (score rating difference 1.96 points; P = .02).
Abbreviations: EMT-B, emergency medical technician-basic; JITG, Just-in-Time Guidance.
Discussion
This investigation yielded preliminary data to suggest that AR JITG for the three procedures tested was feasible, resulted in participant task performance comparable to a traditional training modality, and exhibited sufficient acceptance by users. These results are promising in that they suggest that AR may be a potentially feasible Just-In-Time teaching modality for austere environments for some learners; however, the training platform evaluated in this study did exhibit significant limitations, indicating that additional technology and/or content development is likely required before such technology is ready for field use.
Performance on all tasks was initially better with traditional training as opposed to JITG. However, especially among less-experienced operators, this effect was eliminated during the second attempt at each task. This is potentially relevant because the less-experienced subjects, the EMT-Bs, more closely resemble the group of combat life-savers that may be deployed to austere environments and require immediate training on procedures that they are not accustomed to performing. These data suggest that when the operator is unfamiliar with the task before being exposed to the training condition, the differences between the control training condition and JITG are minimal, especially on the second attempt when the operator has used the technology once already on a previous attempt. These findings mirror a previous pilot study comparing EMT-B and lay learners, which showed comparable performance between EMTs and adults with no medical training when performing Needle-d and BVM tasks using AR JITG in a similar medium-fidelity environment. Reference O’Connor, Porter and Boardman30
This is an important finding as it suggests that JITG may be as effective as standard training and could be deployed in a more targeted manner at the point of use as opposed to training all potential users months ahead of time. In theory, JITG would not be subject to the knowledge erosion that would occur over time with prior training strategies. It has the added benefit of being hands-free and can be projected in the same visual field as the actual patient, which may better guide proceduralists in finding landmarks and pacing themselves through the steps of the procedure. The study conditions may have favored the control condition in that it asked operators to immediately use the skills that they had learned in video training. This would not likely be the case for personnel who might go months between initial training and skill use in a real-world scenario.
The trend towards task completion score improvement between attempts was not observed in the paramedic cohort, who were assumed to be independently proficient in the tested tasks. A negative score differential was observed in the paramedic JITG cohort compared to the video training for the IO procedure, with no difference in performance across the remaining procedure attempts or training types. These findings suggest that the AR technology may have in fact served as a distraction that caused deterioration in performance due to the cognitive and psychomotor burden of the JITG condition. It may be posited that additional JITG is not helpful in supporting tasks for which the operator is independently competent. The notion of distraction warrants additional work to determine if the hardware or the software may be causing distraction, and if so, how they can be improved. It also stands to reason that users may become less “distracted” as they grow accustomed to AR technology and use, and no further refinement may be necessary; however, this also warrants further investigation.
This training strategy may represent a potentially viable solution for real-time training in austere settings. However, questions specific to user acceptance of the modality and its efficacy in imparting proficiency in procedural skills on novice operators remain. In addition, AR solutions may be useful in task-shifting essential procedural interventions to available medical providers with varying skill levels, especially in settings that lack immediate access to Advanced Life Support personnel, such as austere environments and combat situations. The ability of AR to integrate real and virtual stimuli and to enable user interaction with the software and the real environment simultaneously makes it an intriguing modality for procedural guidance, and given these early but promising findings, further investigation of the use of AR JITG technology is justified. Reference Munzer, Khan, Shipman and Mahajan8,Reference Siu, Best, Kim, Oleynikov and Ritter10
Analysis of task completion times revealed that EMT-B subjects performed slower than paramedics, likely reflective of this cohort’s inexperience with two of the three tasks prior to the study. Task performance was also slower in the JITG cohort for all tasks, likely due to the necessary physical and cognitive adjustment to the AR hardware and software. However, this difference in task time decreased after the second attempt at each procedure, suggesting that once subjects adapted to the equipment, less time was lost when using the JITG.
Subject feedback also indicated that the JITG training was similarly usable and useful compared to the control group. This finding supports the feasibility of JITG implementation as it indicates acceptability among the target users. The paramedic cohort did report a lower likelihood to use AR technology, again suggesting the technology was burdensome when it did not facilitate new learning; this is concordant with the absence of task performance improvements in this cohort.
The aggregate findings of task completion scores, task completion times, and subject ratings preliminarily support the potential for the use of AR JITG in the austere setting and lay a foundation for future user-experience, implementation, and efficacy work on this topic. To further validate the efficacy of AR training, a larger randomized controlled study is needed. Additional procedures with varying levels of cognitive and tactile difficulty should be investigated to assess whether the performance of AR software depends on the type and complexity of the techniques being practiced. A qualitative investigation of user experience with the JITG technology is warranted to optimize its interface. Studies conducted in field and high-fidelity environments are also necessary to determine the usability and effectiveness of AR technologies in realistic practice settings, particularly in high-stimulus environments like austere or combat settings. Pragmatic operational considerations, including hardware durability and the feasibility of integrating it with the equipment already employed by the target population, must also be considered.
Limitations
This study had several limitations, including a small sample size, potentially leading to under-powered detection of comparative differences in subject or training-type usability ratings and task performance. Additionally, the investigation focused on only three procedures with limited cognitive and tactile skills required. All simulations were performed in a highly controlled medium-fidelity environment, which provided comfort and limited external stimuli. While subjects were randomized to control or AR JITG cohorts, evaluators were not blinded to the type of training each cohort received, as the AR device was visible during the evaluation. Additionally, subjects themselves were not blinded to their own training condition, which may have introduced bias to task performance and subject-reported outcomes.
Conclusion
Participants’ task performance using the experimental AR JITG platform evaluated in this study was similar to the performance of those using a traditional training modality. Additionally, participants reported that the experimental platform is acceptable to use in simulated practice. This investigation provides evidence that AR-mediated “Just-in-Time” guidance for select emergency medical procedures has the early potential to be feasible and efficacious as a training and guidance modality for prehospital clinicians.
Conflicts of interest/funding
Three authors (NM, SL, and CL) are employed by Charles River Analytics, which manufactures a product related to the subject matter of this manuscript. All other authors disclose no conflicts of interest or funding for this manuscript.
Acknowledgments
This material is based upon work supported by the United States Army Medical Research and Development Command (USAMRDC)/Telemedicine & Advanced Technology Research Center (TATRC) under Contract No. W81XWH20C0034. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the USAMRDC/TATRC.
Supplementary Materials
To view supplementary material for this article, please visit https://doi.org/10.1017/S1049023X24000372