INTRODUCTION
Modelling of infectious disease transmission has a long history in mathematical biology for assessing epidemiological phenomena [Reference Kermack and McKendrick1]. In recent years, it has become an element of public health decision-making on several occasions, to examine major risks such as HIV/AIDS epidemics, pandemic influenza or multi-resistant infections in hospitals [Reference Halloran and Lipsitch2, Reference Hethcote3]. However, perception of these models by other scientists working on these public health problems remains divided. Although modelling approaches have gained growing acceptance in recent years, as illustrated by the number of articles published in high-profile journals [Reference Anderson4–Reference Levin6], the use of mathematical models has also at times elicited scepticism or distrust [Reference Kitching, Thrusfield and Taylor7, Reference May8]. To the best of our knowledge, no published study has quantified the perceived growing impact of the modelling approach on the scientific community or provided in-depth information on the relationships between mathematical models and other relevant scientific fields.
Herein, we describe a method for investigating the developing status of mathematical modelling in epidemiology and apply it to antibiotic resistance research, as a case study. The reasons for this choice are twofold. First, mathematical modelling has addressed key issues in antibiotic resistance, e.g. analysis of treatment protocols for resistance prevention [Reference Bonhoeffer, Lipsitch and Levin9], assessment of control strategies in hospital settings [Reference Austin10], or prediction of future trends in the community [Reference McCormick11]. Second, although antibiotic resistance is now considered a major public health issue in all developed countries, it is still an emerging problem that probably has not reached its full impact. It has provided a pertinent case study, as we think that the evolution of mathematical models of antibiotic resistance reflects the dynamics of the entire field.
We conducted a quantitative analysis of 60 articles modelling antibiotic resistance that were published over the last 15 years. Adapting the method introduced by Hasbrouck et al. [Reference Hasbrouck12], we began by using these articles to investigate the relationship between mathematical modelling and other scientific approaches, in terms of input and output flows, identified respectively by the references listed in a given article and by the other articles citing it. We then examined possible temporal trends in the number of published models, to verify whether the modelling approach has indeed been gaining stature in recent years. Finally, we evaluated the citation impact of modelling articles by comparing them to other articles published in the same journal issues.
METHODS
Selection of modelling papers
Preliminary trials convinced us that no single search equation can automatically retrieve all mathematical models of antibiotic resistance. In particular, including general key words such as ‘antibiotic’ or ‘bacteria’ in the search equation missed a substantial number of models, as several were specifically concerned with a single antibiotic-bacterium pair (for instance, methicillin-resistant Staphylococcus aureus). We therefore decided to undertake a highly sensitive but obviously non-specific search of PubMed, using the search equation:
As of 1 July 2006, this search had retrieved over 6000 articles, among which 4663 had been published after 1990. We then conducted a systematic manual screening of this list, using the following criteria to select a modelling article for our analysis:
(1) The article had to report original results; no letters, editorials or reviews were considered, except when they also introduced a new mathematical model. This led to the exclusion of 295 articles.
(2) The article had to deal with bacterial resistance to antibiotics; in particular, articles on resistance to antiviral agents or cancer therapies and those on pest resistance to insecticides were rejected. Because our search equation retrieved all articles dealing with ‘resistance’ phenomena in general (e.g. including biomechanical phenomena), this was by far the most important criterion, and led us to discard 4022 of the remaining articles.
(3) The article had to use a mathematical approach; in particular, articles selected only because they used an animal model, a model organism or a molecular model were excluded. This led us to discard 88 of the remaining articles.
(4) Pharmacokinetic–pharmacodynamic (PKPD) models were excluded, as we considered them to have become, over the last few decades, a wholly independent field of research. This led us to discard 167 of the remaining articles.
(5) Articles based on the statistical analysis of data and the computation of relative risks or odds ratios, such as multiple linear- or logistic-regression analyses, were rejected, as we believed that these methods are now widely accepted and used outside the scope of mathematical modelling. This led us to discard 29 of the remaining articles.
Overall, the successive application of our criteria to the initial set of 4663 articles resulted in the selection of 60 articles [Reference Bonhoeffer, Lipsitch and Levin9–Reference McCormick11, Reference Massad, Lundberg and Yang13–Reference Wang and Lipsitch69] (Table 1) published in journals referenced in the Science Citation Index (SCI) database; two additional articles that met the criteria but were not referenced in the SCI database were excluded.
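The screening counts reported above can be checked with a short tally. The following is a minimal sketch that uses only the numbers given in the text; the variable names and criterion labels are ours.

```python
# Counts reported in the text for the manual screening of the PubMed results.
initial = 4663  # retrieved articles published after 1990

# Articles discarded at each successive criterion, in the order applied.
exclusions = {
    "(1) not original results": 295,
    "(2) not bacterial resistance to antibiotics": 4022,
    "(3) no mathematical approach": 88,
    "(4) PKPD models": 167,
    "(5) purely statistical analyses": 29,
}

remaining = initial
for criterion, n_excluded in exclusions.items():
    remaining -= n_excluded
    print(f"after criterion {criterion}: {remaining} articles remain")

not_in_sci = 2  # articles meeting all criteria but absent from the SCI database
selected = remaining - not_in_sci
print(f"selected for analysis: {selected}")
```

The tally leaves 62 articles after the five criteria and 60 once the two articles absent from the SCI database are removed, in agreement with the figures reported above.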
Retrieval of references and citations
Using the Web of Science® service by Thomson/Institute for Scientific Information (ISI) to search the SCI database for each of the 60 selected modelling articles, we obtained all the references listed in the retained article, herein referred to as ‘references’, as well as all the other articles citing the selected article, herein designated as ‘citations’.
Based on these collected data, we made a list of all the references given in the original 60 modelling articles; to avoid redundancy, any given article was mentioned only once in this list. The same methodology was applied to make a list of citations from all the articles citing any of the 60 originally selected modelling articles.
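As an illustration of how such a non-redundant list might be assembled, the sketch below merges per-article lists and keeps each record once. The function name and the sample identifiers are hypothetical; the text does not describe how records were matched in practice.

```python
def build_distinct_list(records_per_article):
    """Merge per-article record lists into one list in which each record appears once.

    `records_per_article` maps a modelling article to the list of records
    (references or citations) retrieved for it. First-seen order is preserved.
    """
    seen = set()
    distinct = []
    for records in records_per_article.values():
        for record in records:
            if record not in seen:
                seen.add(record)
                distinct.append(record)
    return distinct


# Hypothetical example: two modelling articles sharing one reference.
references = {
    "model_A": ["ref_1", "ref_2", "ref_3"],
    "model_B": ["ref_2", "ref_4"],
}
overall_references = build_distinct_list(references)
print(overall_references)
```

The same function applies unchanged to the citation lists, since both are simply collections of article identifiers grouped by modelling article.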
Assignment of references and citations to scientific classes
In the Journal Citation Reports (JCR) database, journals are sorted into subject categories, which provide general information on the scientific area of specialization of the articles they publish. We summarized this information by dividing these subject categories into the following six scientific classes: clinical medicine, biology, mathematics/statistics/informatics, other basic science (e.g. chemistry), epidemiology/public health, and multidisciplinary sciences (for general journals, like Science, Nature or Proceedings of the National Academy of Sciences of the USA). The allocation of JCR subject categories to our scientific classes is detailed in Table 2. We were then able to assign articles from our reference and citation lists to these six classes according to the journal in which they were published.
Applying a method similar to that devised by Hasbrouck et al. [Reference Hasbrouck12], we used this classification to compute inflows and outflows between mathematical modelling and other approaches from the six scientific classes.
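The class assignment and the resulting proportional flows can be sketched as follows. This is an illustration under stated assumptions: the journal-to-class lookup is a hypothetical stand-in for the JCR subject categories and Table 2, the two 'Hypothetical' journal names are invented, and the sample lists are not the study data.

```python
from collections import Counter

# The six scientific classes defined in the text.
SCIENTIFIC_CLASSES = [
    "clinical medicine",
    "biology",
    "mathematics/statistics/informatics",
    "other basic science",
    "epidemiology/public health",
    "multidisciplinary sciences",
]

# Hypothetical lookup; in the study, journals map to classes through their
# JCR subject categories (Table 2).
JOURNAL_TO_CLASS = {
    "Hypothetical Clinical Journal": "clinical medicine",
    "Hypothetical Biology Journal": "biology",
    "Science": "multidisciplinary sciences",
    "Nature": "multidisciplinary sciences",
}


def class_proportions(journals):
    """Return the share of articles falling in each of the six classes."""
    counts = Counter(JOURNAL_TO_CLASS[journal] for journal in journals)
    total = sum(counts.values())
    return {cls: counts[cls] / total for cls in SCIENTIFIC_CLASSES}


# Hypothetical journals of publication for the reference list (inflow)
# and for the citation list (outflow).
reference_journals = [
    "Hypothetical Clinical Journal",
    "Hypothetical Clinical Journal",
    "Hypothetical Biology Journal",
    "Nature",
]
citation_journals = [
    "Hypothetical Clinical Journal",
    "Hypothetical Biology Journal",
    "Hypothetical Biology Journal",
    "Science",
]

inflow = class_proportions(reference_journals)
outflow = class_proportions(citation_journals)
for cls in SCIENTIFIC_CLASSES:
    print(f"{cls}: inflow {inflow[cls]:.0%}, outflow {outflow[cls]:.0%}")
```

Classifying by journal rather than by article keeps the procedure mechanical, at the cost of the limitation acknowledged in the Discussion: the class describes the journal, not the content of the individual article.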
Assessment of the citation impact
Considering the lapse of time until citation, recently published articles are not appropriate for studying the scientific impact of models. We therefore restricted citation–impact assessment to the 37 modelling articles published in 2004 or earlier [Reference Bonhoeffer, Lipsitch and Levin9–Reference McCormick11, Reference Massad, Lundberg and Yang13–Reference Temime, Guillemot and Boelle46]. We compared the numbers of citations of each of these 37 modelling articles to those of all other articles published in the same issue of the same journal. Use of this comparison, instead of crude numbers of citations per modelling article, was meant to avoid a possible bias due to journal impact factors. For example, when examining article number 1 [Reference Massad, Lundberg and Yang13], we retrieved the number of citations of each article published in volume 33, issue 1 of the International Journal of Biomedical Computing. We were thus able to compare the number of citations of article number 1 with the mean and the median numbers of citations within this issue, and to compute the citation percentile within which article number 1 was situated.
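The within-issue comparison can be sketched as below. The citation counts are hypothetical, and the percentile convention (rank from the top divided by the number of articles in the issue, with ties given the best rank) is an assumption, since the text does not state how ties were handled.

```python
from statistics import mean, median


def citation_percentile(article_citations, issue_citations):
    """Top-down citation percentile of an article within its journal issue.

    `issue_citations` holds the citation counts of every article in the issue,
    including the article of interest. Rank 1 is the most-cited article, so a
    lower percentile means a higher citation impact.
    """
    rank = 1 + sum(1 for c in issue_citations if c > article_citations)
    return 100 * rank / len(issue_citations)


# Hypothetical issue of ten articles; the modelling article has 20 citations.
issue = [2, 5, 8, 12, 40, 3, 0, 7, 15, 20]
modelling_article = 20

print("issue mean:", mean(issue))
print("issue median:", median(issue))
print("percentile:", citation_percentile(modelling_article, issue))
```

Under this convention, an article below the 50% mark is cited more often than the median article of its issue, which matches the reading of percentiles used in the Results section.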
RESULTS
Description of the data
Modelling articles (Table 1) listed a mean of 36·6 references (median: 32 references per article); 70% of the modelling articles listed 20–50 references. The overall reference list, built as detailed in the Methods section, included 1373 distinct articles, published in 293 different journals.
Modelling articles were subsequently cited by a mean of 22·5 articles (median: seven citations per article); many had never or almost never been cited at the time we retrieved the information from the SCI database. The overall citation list, built as detailed in the Methods section, included 971 distinct articles, published in 321 different journals.
Interaction of modelling with other approaches
According to the journal in which they were published, the 971 citations and 1373 references of the 60 selected modelling articles were assigned to one of the six scientific classes defined above (Table 2). The resulting proportional scientific inflows and outflows to and from mathematical models are illustrated in Figure 1.
Among both inflows and outflows, the most frequent category was clinical medicine, with about 50% of the articles, followed by biology. All other categories represented less than 10% of either flow. Despite these similarities, the distribution of citations among the six scientific classes differed significantly from that of references (χ2 test, P<0·001). This difference might reflect, in particular, the slightly more frequent outflow towards biology (35% of citations were made in articles published in journals from the biology class) than inflow from biology (30% of references were to articles published in journals from the biology class). Another notable discrepancy concerns ‘multidisciplinary sciences’ journals, which account for a larger share of the references listed in modelling articles (8% of all references) than of the articles citing modelling studies (5% of all citations).
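A comparison of this kind can be sketched as a Pearson χ2 test of homogeneity on a 2 × 6 table of class counts. The counts below are invented for illustration and are not the study data. The code computes only the statistic and its degrees of freedom; a P value would additionally require the χ2 distribution, for instance from a statistics library.

```python
def chi_square_homogeneity(row_a, row_b):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table.

    `row_a` and `row_b` hold the counts per scientific class for two groups
    (for example references and citations).
    """
    if len(row_a) != len(row_b):
        raise ValueError("both rows must cover the same classes")
    total_a, total_b = sum(row_a), sum(row_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        if column_total == 0:
            raise ValueError("a class with no articles must be dropped first")
        for observed, row_total in ((a, total_a), (b, total_b)):
            # Expected count if both groups shared the same class distribution.
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = len(row_a) - 1
    return statistic, degrees_of_freedom


# Invented counts per class, in the order: clinical medicine, biology,
# mathematics/statistics/informatics, other basic science,
# epidemiology/public health, multidisciplinary sciences.
reference_counts = [500, 300, 50, 30, 40, 80]
citation_counts = [350, 245, 30, 20, 20, 35]

statistic, dof = chi_square_homogeneity(reference_counts, citation_counts)
print(f"chi-square = {statistic:.2f} with {dof} degrees of freedom")
```

With six classes and two groups, the test has five degrees of freedom.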
To analyse potential trends in the temporal evolution of the relationship between modelling and other scientific fields, we also studied the inflows and outflows separately for each year considered (1993–2006). The distributions of references and citations among the different scientific classes did not change markedly over the years.
Evolution of the number of published modelling articles
Figure 2a reports the number of articles from the list divided by the total number of publications included in the SCI database, both according to the year of publication. Although we searched for articles uniformly over the last 15 years, the annual number of published mathematical modelling articles on antibiotic resistance in our list increased progressively over time, even when normalized to the total number of publications included in the SCI database for the corresponding year to correct for overall publication volume (Spearman trend test, P<0·01).
Likewise, Figure 2b illustrates, as a function of publication year, the proportion of articles included in the SCI database which were retrieved by the search equation:
This search equation was at once too specific to capture all articles on antibiotic resistance and too sensitive to exclude unrelated articles. However, it provided us with a crude means of comparing the dynamics of antibiotic-resistance modelling with those of antibiotic-resistance research as a whole, all scientific approaches considered. Figure 2b indicates that the general attention accorded to antibiotic resistance has been rising steadily over the last 15 years. Even so, the ratio of antibiotic-resistance models to antibiotic-resistance-related articles has also been increasing (Spearman trend test, P<0·01); that is, the number of models has grown faster than the number of retrieved articles.
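The trend tests reported in this section rest on Spearman's rank correlation between publication year and a yearly ratio. The sketch below computes that coefficient in plain Python, with tied values given their average rank. The yearly counts are invented for illustration, and the P value is not computed here.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Invented yearly data: modelling articles and total indexed publications.
years = list(range(1993, 2007))
models = [1, 1, 2, 1, 3, 3, 4, 4, 5, 5, 6, 7, 8, 10]
totals = [700_000 + 20_000 * k for k in range(len(years))]
normalized = [m / t for m, t in zip(models, totals)]

print(f"Spearman rho = {spearman_rho(years, normalized):.3f}")
```

Because the coefficient depends only on ranks, it detects any monotonic increase over time without assuming that the increase is linear.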
Citation impact of modelling papers
The 37 modelling articles published before 2005 were cited in a mean of 35·9 subsequently published articles (median: 20 citations per article). Figure 3 shows the citation percentile distribution for all 37 articles, computed from a comparison with other articles published in the same issues of the same journals, as detailed in the Methods section.
Overall, 80% of the studied modelling articles were cited more often than the median of articles published in the same issues (i.e. they were among the top 50% most-cited articles from the issue). The mean computed citation percentile was 34%, meaning that, on average, modelling articles ranked at roughly the top third of cited articles in their journal issue. The median computed percentile was 28%, indicating that half of the 37 articles belonged to the top 28% (roughly the top quarter) of articles in their journal issue in terms of citations.
Factors associated with the high citation impact of modelling articles
We evaluated separately the five best cited modelling publications in our list, i.e. the five modelling publications that ranked the highest among articles published in the same issue (articles number 13, 16, 18, 21 and 28) [Reference Austin, Kristinsson and Anderson24, Reference Levin, Perrot and Walker26, Reference Bonten, Austin and Lipsitch28, Reference Grundmann31, Reference Bergstrom, Lo and Lipsitch37], to search for factors potentially associated with their high citation impact.
The overall distribution of inflows from the six scientific classes defined earlier differed significantly between these five frequently cited articles and the other 55 modelling articles (χ2 test, P<0·01). Although the majority of references listed in those five articles were also published in journals assigned to the clinical medicine and biology classes, these best cited articles referred to more medical-journal articles and to fewer hard-science-journal articles (in particular biology or mathematics/informatics) than other less cited modelling publications.
DISCUSSION
Herein, we quantified the developing status of mathematical modelling in antibiotic-resistance research, and investigated the relationship between the modelling approach and other scientific approaches in this field. Pertinently, most of the articles citing mathematical antibiotic-resistance models were published in clinical medicine and biology journals. This observation suggests a general interest in applied mathematical epidemiological models outside the fields of mathematics or epidemiology. Moreover, the number of published antibiotic-resistance models has increased progressively in recent years, suggesting heightened attention and acceptance accorded to these methods by the scientific community. This rising impact was further underlined by the high citation impact that we found for most modelling articles, compared to other articles published in the same journals.
This study has several limitations. First, the citation index that we used is, by definition, ever-changing, as a new report citing any article from our list can be published every day. We therefore had to decide on a final date to complete data retrieval, which was July 2006 for this study. It is highly probable that data obtained from the SCI database at some time other than this date will yield different numbers of citations of the modelling articles we considered. Nevertheless, our final date is still likely to provide sufficient data to get a relatively clear picture of the nature of these citations – especially as far as the distribution of various scientific classes among them is concerned. Furthermore, our analysis of the citation impact of modelling articles was made more robust by our assessment of this impact by comparison with other articles published in the same issues and journals as the retained articles, rather than by using an unadjusted number of citations.
The JCR subject categories, and a fortiori our derived scientific classes, characterize the journal that published the article, rather than the article itself. As a consequence, the information we analysed is at best incomplete. For example, mathematical modelling articles are often published in journals from the clinical medicine or epidemiology/public health classes, rather than those from the mathematics class, as might have been expected. It is therefore likely that some of the citations of our selected articles, which we had assigned to the clinical medicine class, are indeed the work of other mathematical modellers. However, it can be argued that the scientific category to which a journal belongs reflects the occupations of its readers more than the content of its articles. In this respect, our conclusion that models attract scientists outside the field of mathematical epidemiology remains relevant.
We chose to group subject categories into only six scientific classes for reasons of simplicity and interpretability of the results. However, although we tried to make this process as straightforward as possible, a few doubts remained regarding the classification of several subject categories. Among them, only one (‘Immunology’) would have a substantial impact on our results. Although we classified immunology as a biological science, it could also have been considered a medical speciality. Changing this decision would increase the inflows and outflows to and from clinical medicine at the expense of biology, but would not change our main conclusions.
Although it appears, based on international publications, that the modelling approach has gained a wider following in various epidemiological fields in recent years, no study to date has assessed this perception quantitatively. Herein, we established a framework that can be used to investigate quantitatively the status of mathematical modelling among other scientific approaches for any epidemiological field.
We believe that this popularity of mathematical modelling depends strongly on the dynamics of the public health problem in question. Indeed, modelling is most useful for emerging public health issues for which data are scarce, while a sound basis for decision-making is still required. On the other hand, when a public health problem has been recognized and studied for a certain length of time, the need for models becomes less pressing, as more information becomes available through direct observation. Therefore, it should probably be expected that the rapid increase in recent years of the number of published antibiotic-resistance models reported here will abate somewhat in years to come, as will the popularity (in terms of citations) of these models. It would be enlightening to apply the framework we developed to other important epidemiological issues and to verify how the major conclusions of this case study hold up in other contexts, with regard to their dynamics as acknowledged public health issues.
DECLARATION OF INTEREST
None.