Introduction
Smartphone applications are becoming increasingly common within the medical arena. Medical health applications offer huge potential for improving clinical practice and developing how we deliver healthcare to patients. Medical applications have been developed to test for hearing loss (Melo et al., ref. 1), although these are not widely utilised within the National Health Service (NHS) in the UK. Several reports have demonstrated that hearing thresholds obtained using automated audiometry are similar in reliability to those obtained by an audiologist using conventional manual pure tone audiometry (van Tonder et al., ref. 2).
The hearX Group developed hearTest™, the mobile hearing test application used in this study. HearTest is a CE-marked application that is validated and approved for testing air conduction thresholds (van Tonder et al., ref. 2).
Sudden sensorineural hearing loss (SNHL) is an ENT emergency that requires urgent and accurate audiological assessment in order to deliver timely and effective treatment (Chandrasekhar et al., ref. 3). In our practice and experience, it is not uncommon for patients presenting with sudden SNHL to have a delay in formal audiological testing. In order to reduce time to treatment, patients are regularly treated empirically with oral steroids before formal hearing testing takes place. Therefore, we aim to assess whether hearTest could be used within an NHS secondary care ENT setting to streamline assessment and management of people presenting with hearing loss.
Materials and methods
Study design
HAppENT has been designed as a multi-centre, prospective, non-randomised study across three teaching hospitals in Greater Manchester. Participants underwent hearing tests by conventional audiometry and the tablet-based hearTest. Participants and clinicians involved in the study were invited to complete a feedback questionnaire regarding the usability and feasibility of the hearTest. Results from the hearing tests were compared for accuracy.
Study population
Participants were recruited from the ENT clinic at three hospitals across Greater Manchester: Manchester Royal Infirmary, Wythenshawe Hospital and Fairfield Hospital. A member of the ENT team approached the identified patients about the study, and they were given a patient information leaflet. Patients were given time to consider their involvement in the study if they wished. Patients were then screened against the inclusion and exclusion criteria. Patients were included if they were 18 years old or above and clinically required conventional audiometry. Non-English speakers were included provided that a face-to-face translator was present in the clinic. Patients were excluded if they had an active ear infection, congenital or acquired otological deformities, loud noise exposure in the previous 48 hours (temporary threshold shift) or if they were unable to follow simple commands. Patients were required to provide written, informed consent. Participants provided qualitative feedback regarding the usability, practicality and feasibility of their experience using the hearTest hearing test.
At the end of the data collection period, all clinicians who had been involved in using hearTest were invited to be recruited into the study. Clinicians were given a clinician participation leaflet and were required to provide written, informed consent. Clinicians were invited to complete a feedback questionnaire. No demographic data were collected about clinicians.
Study intervention
Patients undertook both formal pure tone audiometry and hearTest, in either order. HearTest was carried out in a quiet room on the day of the clinic. HearTest can be purchased directly from its creators, the hearX Group. It is provided on a Samsung tablet and is linked to calibrated, over-ear headphones. HearTest is not available for purchase from application stores onto mobile telephones or tablets.
The calibrated headphones have an inbuilt microphone that detects the background noise of the test room. If the background noise exceeded the upper noise threshold, at any point during the test, the test would pause and alert the patient and clinician.
The patient was required to tap the screen when they could hear a series of pure tones of different frequencies. The tested frequencies were 0.5 kHz, 1 kHz, 2 kHz, 4 kHz and 8 kHz. The process was repeated for the left and right ears. If there was abnormal hearing (25 dB or worse in two consecutive frequencies), then tuning fork tests were carried out, and results were recorded on the patient pro forma. The tuning fork tests helped indicate if there was sensorineural, conductive or mixed hearing loss. Immediately following the hearTest, patients would complete a feedback questionnaire about the usability, practicality and feasibility of the tablet-based test.
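The rule that triggered tuning fork testing (25 dB or worse at two consecutive tested frequencies) can be written as a short check. The Python sketch below is illustrative only: the function name, the dictionary layout and the reading of "25 dB or worse" as a threshold of at least 25 dB are assumptions made here, not part of the hearTest software.

```python
# Frequencies tested in the study protocol, in ascending order (kHz).
FREQUENCIES_KHZ = [0.5, 1, 2, 4, 8]
ABNORMAL_DB = 25  # assumption: "25 dB or worse" means threshold >= 25 dB


def needs_tuning_fork(thresholds_db):
    """Return True if two consecutive tested frequencies are 25 dB or worse.

    thresholds_db maps frequency (kHz) to the air conduction threshold (dB)
    for one ear; this layout is hypothetical and chosen for illustration.
    """
    abnormal = [thresholds_db[f] >= ABNORMAL_DB for f in FREQUENCIES_KHZ]
    # Compare each frequency with its neighbour in the tested sequence.
    return any(a and b for a, b in zip(abnormal, abnormal[1:]))


# Example: 1 kHz and 2 kHz are both 25 dB or worse, so the rule is met.
left_ear = {0.5: 10, 1: 25, 2: 30, 4: 10, 8: 10}
print(needs_tuning_fork(left_ear))  # True
```

Under this reading, isolated raised thresholds at non-adjacent frequencies (for example 0.5 kHz and 2 kHz only) would not meet the criterion.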
Patients also underwent formal hearing testing by ‘gold-standard’, audiology-led, manual pure tone audiometry in a soundproof booth. It was preferable for the formal hearing test to be performed on the same day as the hearTest, but when this was not possible patients attended for it at a later date. Results of the formal pure tone audiometry were collected once performed and transcribed or attached onto the patient pro forma.
At the end of the recruitment period, each clinician who had recruited patients into the study was invited to also be recruited into the study. Clinicians were asked to complete a feedback questionnaire regarding the usability, practicality and feasibility of using the tablet-based hearing test in ENT clinics.
Study outcomes
The primary outcome measure was qualitative data from patients and clinicians regarding the usability, practicality and feasibility of their experience using the hearTest application-based hearing test. Patients undertook a questionnaire immediately following the hearTest. The questionnaire was adapted from the University of Pittsburgh mHealth App Usability Questionnaire for Standalone mHealth Apps Used by Patients (Appendix 1) (Zhou et al., ref. 4). The questionnaire is validated for feedback from patients using standalone medical health applications (Zhou et al., ref. 4). The questionnaire was adapted by removing questions that were not relevant to this particular medical health application. Responses were graded on a Likert scale, ranging from strongly disagree to strongly agree.
Clinicians undertook a questionnaire at the end of the study recruitment period. The questionnaire was adapted from the University of Pittsburgh mHealth App Usability Questionnaire for Standalone mHealth Apps Used by Healthcare Providers (Zhou et al., ref. 4). This questionnaire is validated for obtaining feedback from healthcare professionals using standalone medical health applications. Responses were graded on a Likert scale, ranging from strongly disagree to strongly agree. The clinician questionnaire also contained four non-validated, open questions regarding the practicality and feasibility of using a mobile hearing test in an ENT out-patient clinic setting.
The secondary outcome measure was quantitative data detailing the accuracy of the hearTest hearing test result in comparison with ‘gold standard’ conventional audiometry. Comparisons were drawn between the air conduction hearing threshold results from both of the tests. The frequencies that were tested in both tests, and were therefore used to draw comparison, were 0.5 kHz, 1 kHz, 2 kHz, 4 kHz and 8 kHz. The patients had both left and right ears tested, regardless of their presenting complaint. As the hearTest does not test bone conduction, it is not possible to ascertain the type of hearing loss using the test alone. Therefore, when the hearTest reported hearing loss (air conduction 25 dB or worse in two consecutive frequencies), a tuning fork test was carried out. Results of the tuning fork test were recorded on the patient pro forma and documented as left and right, normal hearing, sensorineural, conductive, mixed or unclear.
The hearTest application has the ability to amend the testing protocol. Prior to commencing the HAppENT study, we carried out a small feasibility study and patient participation groups. The feedback from these groups suggested that the hearTest test took too long. From the feedback, we adjusted the test protocol to reduce the length of time the test took, without impacting on the clinical importance of the test results. The frequencies tested were 0.5 kHz, 1 kHz, 2 kHz, 4 kHz and 8 kHz, excluding 250 Hz and 6 kHz. Each frequency was tested to the minimum value of 10 dB. The rationale for this was that 10 dB is considered normal hearing, so testing lower than this would offer no clinical benefit. This also contributed to reducing the length of time the application hearing test took to complete.
Sample size and statistical analysis
The target sample size was 100 patients, with an anticipated non-compliance attrition rate of 10 per cent, leaving 90 patients. Our initial estimate of sample size was chosen on practical grounds. Each patient recruited would have left- and right-sided results with which to compare the two methods of hearing testing. The number of patients eligible for recruitment who were approached, the number who consented, the number who declined and the number of incomplete tests were recorded.
The patient and clinician questionnaires were assessed for usability and feasibility of the hearTest. We followed the method recommended by the developers of the validated mHealth App Usability Questionnaire to interpret the results. To calculate the usability of the application, we totalled the responses to all statements and determined their average. The higher the overall average, the higher the usability of the application. Summary statistics (mean (standard deviation)) were provided for each question. The number (per cent) of unanswered responses was reported for each question.
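The scoring approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function name, the data layout (a dictionary of question labels to lists of 1 to 7 responses, with None for an unanswered item) and the example values are all hypothetical.

```python
from statistics import mean, stdev


def summarise_questionnaire(responses):
    """Summarise Likert responses (1-7) per question and overall.

    responses maps a question label to a list of answers, with None
    marking an unanswered item. Every question is assumed to have at
    least one answer.
    """
    per_question = {}
    answered_all = []
    for question, answers in responses.items():
        answered = [a for a in answers if a is not None]
        missing = len(answers) - len(answered)
        per_question[question] = {
            "mean": mean(answered),
            # Sample SD needs at least two answers.
            "sd": stdev(answered) if len(answered) > 1 else 0.0,
            "unanswered_n": missing,
            "unanswered_pct": 100 * missing / len(answers),
        }
        answered_all.extend(answered)
    # Overall usability: average of the responses to all statements.
    overall_mean = mean(answered_all)
    return per_question, overall_mean


# Made-up responses for two questions from three respondents.
example = {"Q1": [7, 6, 7], "Q2": [5, None, 7]}
per_question, overall = summarise_questionnaire(example)
print(overall)  # 6.4
```

The overall score here pools every answered item, which is one reading of "the average of the responses to all statements"; averaging the per-question means instead would give a slightly different figure when items are unanswered.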
Patients' hearing thresholds at each tested frequency (0.5 kHz, 1 kHz, 2 kHz, 4 kHz and 8 kHz) from the hearTest and conventional audiometry were compared to assess the accuracy of the application. Bland–Altman agreement plots, in which the differences between hearTest and formal audiology are plotted against their average, were used to assess the distribution and variability of paired differences. The two methods are said to be in reasonable agreement if the normally distributed differences within the agreement limits are of minimal clinical significance (less than 10 dB). For each frequency, the proportion of paired measurements in clinical agreement (less than 10 dB apart) and the 95 per cent confidence interval are reported.
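For readers unfamiliar with the Bland–Altman approach, the quantities behind each plot can be computed as follows. This Python sketch uses the conventional limits of agreement (mean difference plus or minus 1.96 standard deviations); the text does not state exactly how the limits were derived, so that choice, the function name and the example thresholds are assumptions.

```python
from statistics import mean, stdev


def bland_altman(heartest_db, audiometry_db):
    """Return plot points, bias and 95% limits of agreement.

    Each point is (average of the two methods, hearTest minus audiometry).
    The limits assume the paired differences are roughly normal.
    """
    diffs = [h - a for h, a in zip(heartest_db, audiometry_db)]
    avgs = [(h + a) / 2 for h, a in zip(heartest_db, audiometry_db)]
    bias = mean(diffs)   # mean difference between the methods
    sd = stdev(diffs)    # sample SD of the paired differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return list(zip(avgs, diffs)), bias, limits


# Made-up thresholds (dB) for four ears at a single frequency.
points, bias, limits = bland_altman([10, 15, 30, 45], [10, 20, 25, 45])
print(bias)  # 0
```

One such calculation would be run per tested frequency, with the points plotted as difference against average and the bias and limits drawn as horizontal lines.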
Results
Study population
Participants were recruited between October 2021 and December 2022. A total of 83 patients were approached to be recruited into the study (n = 83). Two patients declined participation in the study (n = 81). The remainder completed the hearTest hearing test and provided patient feedback (n = 81). Four patients did not attend their scheduled formal pure tone audiometry test (n = 77). Specifically, 52 of the 77 patients had formal pure tone audiometry on the same day as the hearTest hearing test (n = 52) and were included in the data comparison between the two hearing tests. One patient was excluded from this group as they had difficulty using the application and the hearTest application reported high risk of inaccuracy. The remaining 51 participants provided 102 hearTest results (left and right side) to compare with conventional audiometry.
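The participant flow above can be checked with simple arithmetic. The sketch below restates the counts given in the text; the function and key names are illustrative only.

```python
def participant_flow():
    """Recompute the participant counts reported in the text."""
    approached = 83
    consented = approached - 2            # two patients declined
    attended_audiometry = consented - 4   # four missed formal audiometry
    same_day = 52                         # both tests on the same day
    analysed = same_day - 1               # one excluded (high risk of inaccuracy)
    ear_results = analysed * 2            # left and right ears
    return {
        "consented": consented,
        "attended_audiometry": attended_audiometry,
        "analysed": analysed,
        "ear_results": ear_results,
    }


print(participant_flow())
```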
Baseline demographic data were collected for patient participants. The age range of patients was 18 to 81 years (median, 49 years). There were 46 women and 35 men. Six clinicians were recruited into the study and all six completed the clinician questionnaire. Baseline demographic data were not collected for the clinicians. No complications were reported during the study period.
Patient-reported usability of hearTest
The University of Pittsburgh ‘mHealth App Usability Questionnaire for Standalone mHealth Apps Used by Patients’ was completed by all participants (n = 81). There were no incomplete questionnaires. Each question was ranked on a Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree). The developers of the questionnaire recommended determining the mean of all of the responses to all statements to ascertain the usability of an application. The higher the overall average, the higher the usability of the application.
Analysis of patients’ questionnaires showed that 98.8 per cent found the application easy to use and 97.6 per cent found that the time taken to do the test was appropriate. A total of 98.7 per cent found the device comfortable to use with an overall patient satisfaction of 98.7 per cent (Table 1).
*1 = strongly disagree, 7 = strongly agree
Clinician reported usability of hearTest
The University of Pittsburgh mHealth App Usability Questionnaire for Standalone mHealth Apps Used by Healthcare Providers was completed by all clinicians who recruited into the study (n = 6). There were no incomplete questionnaires. Each question was ranked on a Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree). The developers of the questionnaire recommended determining the mean of all of the responses to all statements to ascertain the usability of an application. The higher the overall average, the higher the usability of the application.
Clinicians involved in the study all reported that the application was easy to use and set up. They all found the time taken to use the application during the clinic visit appropriate. Only 2.4 per cent found it difficult to recover if they made a mistake in using or setting up the application. Overall, clinician satisfaction with the application was 100 per cent, with all clinicians finding it helpful in their practice and willing to use it again (Table 2).
*1 = strongly disagree, 7 = strongly agree
Clinicians were also invited to answer some open questions regarding their use of the hearTest device. There were no concerns regarding background noise when carrying out the test in an out-patient ENT clinic room rather than the soundproofed booth required for manual audiometry. Clinicians agreed that the device was easy to set up and did not delay their clinic. However, depending on when the device was last used, the hearTest device requires a full shut-down and restart, which takes 300 seconds (5 minutes).
Accuracy of hearTest hearing test
The results of the hearTest were compared with the conventional manual audiometry results when both tests were performed on the same day (n = 52). One patient was excluded because the hearTest indicated high risk of inaccuracy despite repeating the hearing test protocol (n = 51). Patients that completed the two tests on different days were excluded from the comparison because of the risk of their hearing thresholds changing over time (e.g. if patients were given steroids for sensorineural hearing loss). Each patient completed a right and a left hearing test, totalling 102 hearTest tests to compare with formal audiometry. Comparisons were drawn between the hearTest and conventional audiometry air conduction hearing threshold results. The frequencies that were used to draw a comparison were 0.5 kHz, 1 kHz, 2 kHz, 4 kHz and 8 kHz.
For each tested frequency (0.5 kHz, 1 kHz, 2 kHz, 4 kHz and 8 kHz), the mean difference between hearTest and audiometry was calculated. The hearTest testing protocol used in the study tested to a minimum level of 10 dB. However, the test indicated when the true result could be lower than this, and this was represented by an arrow symbol on the audiogram result. Patients who had an arrow indicated on the hearTest sometimes had true threshold values of 0 dB or 5 dB when tested by formal audiometry. For the purposes of data interpretation, these results were treated as equivalent (i.e. difference = 0).
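The floor adjustment described above can be made explicit in code. In this Python sketch, `below_floor_flag` is a hypothetical boolean standing in for the arrow symbol on the hearTest audiogram, and the exact condition (hearTest at the 10 dB floor, flagged, with an audiometry threshold under 10 dB) is an interpretation of the text rather than a documented rule.

```python
FLOOR_DB = 10  # lowest level tested by the study's hearTest protocol


def paired_difference(heartest_db, audiometry_db, below_floor_flag):
    """Difference (hearTest minus audiometry) with the floor adjustment.

    below_floor_flag is a hypothetical stand-in for the arrow symbol that
    hearTest shows when the true threshold may be lower than 10 dB.
    """
    at_floor = heartest_db == FLOOR_DB and below_floor_flag
    if at_floor and audiometry_db < FLOOR_DB:
        return 0  # treated as equivalent for data interpretation
    return heartest_db - audiometry_db


print(paired_difference(10, 5, True))    # 0
print(paired_difference(30, 25, False))  # 5
```

Without this adjustment, an ear measured at 0 dB in the booth would register a 10 dB disagreement that reflects only the shortened test protocol.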
For each frequency, the percentage of measurements in clinical agreement (less than 10 dB apart) and its 95 per cent confidence interval are reported in Table 3. The frequencies 0.5 kHz to 4 kHz had an average clinical agreement rate ranging from 94.1 per cent to 96.1 per cent. However, 8 kHz had a much lower average clinical agreement rate of 71.3 per cent, making 8 kHz on the hearTest unreliable for the diagnosis of hearing loss.
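A clinical agreement rate and its confidence interval can be calculated along the following lines. The text does not say which interval method was used, so the Wilson score interval here is an assumption, as are the function name and the example differences. The strict "less than 10 dB" cut-off follows the wording above.

```python
from math import sqrt


def agreement_rate(diffs_db, limit_db=10, z=1.96):
    """Proportion of paired differences under limit_db, with a Wilson 95% CI."""
    n = len(diffs_db)
    p = sum(abs(d) < limit_db for d in diffs_db) / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, (centre - half, centre + half)


# Made-up differences (dB) for ten ears at one frequency.
rate, (low, high) = agreement_rate([0, 5, -5, 10, 0, 0, -15, 5, 0, 0])
print(rate)  # 0.8
```

Because thresholds are recorded in 5 dB steps, a strict cut-off counts differences of 0 dB and 5 dB as agreement; an inclusive cut-off would also count 10 dB and give higher rates.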
Bland–Altman agreement plots were used to depict the differences between the two tests against their average and assess the distribution and variability of paired differences (see Appendix 2). The two methods are said to be in reasonable agreement if the difference between the tests is within 10 dB (i.e. minimal clinical significance; Table 4).
*The hearTest testing protocol used in the study tested to a minimum level of 10 dB. HearTest indicated when the true result could be lower than this. These patients sometimes had true threshold values of 0 dB or 5 dB when tested by formal audiometry. For the purposes of data interpretation, these results were treated as equivalent (difference = 0).
For patients with normal hearing on formal audiometry (i.e. less than 25 dB), hearTest had a 97.12 per cent accuracy of diagnosing normal hearing across all tested frequencies (0.5 kHz to 8 kHz).
The hearTest assessed only air conduction thresholds; without bone conduction testing, it cannot independently diagnose conductive hearing loss. The study aimed to assess whether tuning fork tests provided an accurate enough method for diagnosing conductive or sensorineural hearing loss in patients with reduced air conduction thresholds on their hearTest. Only patients who had an air conduction hearing loss of 25 dB or worse across two consecutive frequencies on hearTest went on to have tuning fork tests by means of Rinne's and Weber's tests. For patients with a pure sensorineural hearing loss shown on manual audiometry, 41 of 44 tuning fork tests (93.2 per cent) were accurate in identifying this. For patients with a conductive or mixed component, Rinne's test was accurate in identifying this in 10 of 12 patients (83.3 per cent).
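The tuning fork accuracy figures above follow directly from the reported counts, as this short Python check shows (the helper name is illustrative).

```python
def percent(correct, total):
    """Percentage of correct classifications, rounded to one decimal place."""
    return round(100 * correct / total, 1)


print(percent(41, 44))  # 93.2 (pure sensorineural loss)
print(percent(10, 12))  # 83.3 (conductive or mixed component)
```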
Discussion
HearTest has been shown to be an acceptable method of testing for hearing loss by both patients and clinicians. A total of 100 per cent of adult patients agreed that they found the device comfortable to wear, and 98.8 per cent found the application easy to use and agreed that the time to take the test was appropriate. A total of 97.6 per cent of patients agreed they would use this method of testing again if it were offered to them. Similarly, 100 per cent of ENT clinicians agreed that the application was easy to use and could help improve the delivery of healthcare to their patients. Clinicians found that using a quiet clinic room was satisfactory for carrying out the hearTest (Table 5).
• Medical applications have been developed to test for hearing loss, although these are not widely utilised within the National Health Service (NHS) in the UK
• Hearing threshold values achieved using automated audiometry are similar in reliability to results obtained using conventional manual pure tone audiometry
• This study has proven that hearTest is an acceptable form of hearing assessment for patients and clinicians, with high levels of usability and practicality
• This study proposes that hearTest can be used within NHS ENT services to test for hearing loss when manual audiometry is not immediately available
• HearTest allows for serial testing to monitor the response to steroid treatment in cases of sudden sensorineural hearing loss
Within our sample, 0.5 kHz, 1 kHz, 2 kHz and 4 kHz had an average clinical agreement (±10 dB) rate of 95.1 per cent, making hearTest an accurate test to diagnose hearing loss at these frequencies. This is marginally higher than the hearX Group's published results, which quote 94.4 per cent in an adult population (Melo et al., ref. 1). However, in our sample 8 kHz had a much lower average clinical agreement rate of 71.3 per cent, making 8 kHz on hearTest unreliable for diagnosing the severity of hearing loss. Furthermore, hearTest has an average of 97.1 per cent accuracy of diagnosing normal hearing (less than 25 dB) across all of the tested frequencies.
Conclusion
HearTest is an accurate and acceptable form of hearing testing for both patients and clinicians. We propose that hearTest can be used within NHS ENT services as an adjunct to clinical examination when the gold-standard manual audiometry is not immediately available. Tuning fork tests should be used alongside history and examination to ascertain whether hearing loss is sensorineural or conductive. A normal hearing result using hearTest is extremely accurate and may negate the need for formal hearing assessment.
Within our department, hearTest has been integrated into the local sudden sensorineural hearing loss pathway. It provides rapid audiometric assessment on the day of presentation and can be used out of hours. HearTest can also be used for serial hearing testing to monitor the response to steroid treatment. Patients are all referred for formal manual audiometry as per the standard protocol.
Further research is being undertaken to assess the usability and practicality of hearTest within the NHS primary care setting and for screening for hearing loss in patients with dementia.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S002221512300138X.
Acknowledgements
The HAppENT study was supported by the Manchester NHS Foundation Trust Head and Neck Charity (grant number: 2020/102). Manchester NHS Foundation Trust Head and Neck Charity has played no role in the design or running of this study.
Competing interest
None declared.
Appendix 1.
(a) Patient-reported usability: mean, median and interquartile range of responses for each question of the mHealth App Usability Questionnaire for Standalone mHealth Apps Used by Patients
(b) Clinician-reported usability for responses to each question of the mHealth App Usability Questionnaire for Standalone mHealth Apps Used by Healthcare Providers
Appendix 2. Bland–Altman plots and agreement
(a) Bland–Altman plot and agreement limit – 0.5 kHz
(b) Bland–Altman plot and agreement limit – 1 kHz
(c) Bland–Altman plot and agreement limit – 2 kHz
(d) Bland–Altman plot and agreement limit – 4 kHz
(e) Bland–Altman plot and agreement limit – 8 kHz