
Sensitivity analysis for reporting bias on the time-dependent summary receiver operating characteristics curve in meta-analysis of prognosis studies with time-to-event outcomes

Published online by Cambridge University Press:  21 March 2025

Yi Zhou
Affiliation:
Beijing International Center for Mathematical Research, Peking University, Beijing, China Department of Biomedical Statistics, Graduate School of Medicine, Osaka University, Osaka, Japan Graduate School of Human Development and Environment, Kobe University, Kobe, Japan
Ao Huang
Affiliation:
Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
Satoshi Hattori*
Affiliation:
Department of Biomedical Statistics, Graduate School of Medicine, Osaka University, Osaka, Japan Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives (OTRI), Osaka University, Osaka, Japan
*
Corresponding author: Satoshi Hattori; Email: [email protected]

Abstract

In prognosis studies with time-to-event outcomes, the survival of groups with high/low biomarker expression is often estimated by the Kaplan–Meier method, and the difference between groups is measured by hazard ratios (HRs). Since the high/low expression is usually determined by study-specific cutoff values, synthesizing only HRs to summarize the prognostic capacity of a biomarker introduces heterogeneity into the meta-analysis. The time-dependent summary receiver operating characteristics (SROC) curve was proposed as a cutoff-free summary of prognostic capacity, extending the SROC curve used in meta-analysis of diagnostic studies. However, estimates of the time-dependent SROC curve may be threatened by reporting bias, in that studies with significant results, such as significant HRs, are more likely to be published and selected for meta-analyses. Under this conjecture, this paper proposes a sensitivity analysis method for quantifying and adjusting reporting bias on the time-dependent SROC curve. We model the publication process as determined by the significance of the HRs and introduce a sensitivity analysis method based on the conditional likelihood constrained by an expected proportion of published studies. Simulation studies showed that the proposed method could reduce reporting bias given a correctly specified marginal selection probability. The proposed method is illustrated with a real-world meta-analysis of Ki67 for breast cancer.

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology

Highlights

What is already known

  • Heterogeneous cutoff values across studies often occur in meta-analyses examining the association of a continuous biomarker with a binary outcome (diagnostic studies) or with a time-to-event outcome (prognosis studies).

  • For diagnostic meta-analysis, the summary receiver operating characteristics (SROC) curve and its area under the curve are widely recognized as a useful tool and used to handle the issue of heterogeneous cutoff values.

  • For prognostic meta-analysis with time-to-event outcomes, the time-dependent SROC method, which is an extension of the SROC method in diagnostic meta-analysis, has been developed by Hattori and Zhou (2016, Statistics in Medicine 35(26), 4746–4763) and is useful to address the issue of heterogeneous cutoff values.

  • In meta-analysis, large studies or studies with significant results are more likely to be published and available for meta-analysis; synthesizing only these studies may result in reporting bias, also known as publication bias or small-study effects.

  • Reporting bias is unavoidable and may also induce bias in the estimates of the time-dependent SROC method based on the bivariate normal model of Hattori and Zhou (2016, Statistics in Medicine 35(26), 4746–4763).

What is new

  • For prognosis studies with time-to-event outcomes, the selective publication of studies and the selective reporting of the Kaplan–Meier estimates can be influenced by the significance of the hazard ratios or, equivalently, the significance of the log-rank tests.

  • To model selective publication processes driven by the log-rank test, inference for a trivariate model has been established, which goes essentially beyond the sensitivity analysis method for reporting bias in meta-analysis of diagnostic studies.

  • The proposed model is constrained by the marginal selection probability, which serves as a sensitivity parameter; the impact of reporting bias on the time-dependent SROC method can be evaluated by varying the marginal selection probability within (0,1).

Potential impact for Research Synthesis Methods readers outside the authors’ field

  • We propose the first method for dealing with reporting bias on the time-dependent SROC method in meta-analysis of prognosis studies with time-to-event outcomes; users can thus evaluate the robustness of the estimates when assessing the prognostic capacity of a biomarker.

  • By incorporating the ability to address reporting bias, our method enhances the utility and applicability of the time-dependent SROC method of Hattori and Zhou (2016, Statistics in Medicine 35(26), 4746–4763) for meta-analysis of prognosis studies.

1 Introduction

Biomarkers have been playing critical roles in medical therapeutics and precision medicine, and many clinical studies aim to investigate the associations between biomarkers and subjects’ outcomes. Usually, diagnostic or prognostic capacity is measured by the associations between subjects’ biomarker expression values and their (binary) disease outcomes or time-to-event outcomes, respectively. Correspondingly, studies evaluating diagnostic or prognostic capacity of the biomarker of interest are referred to as diagnostic or prognosis studies, respectively. In this paper, we are interested in prognosis studies that evaluate the association between a continuous biomarker and time-to-event outcomes; although prognosis studies can have continuous or binary outcomes, we focus on studies with time-to-event outcomes.

In the analysis of a prognosis or diagnostic study with a continuous biomarker, subjects are often classified into high/low expression (or positive/negative) groups by a certain cutoff value. In a diagnostic study, the diagnostic capacity of the biomarker is usually represented by the pair of sensitivity and specificity estimated at a specified cutoff value.Reference Macaskill, Gatsonis, Deeks, Harbord and Takwoingi 1 More informative measures include the receiver operating characteristic (ROC) curve and the area under the curve (AUC), since they present the diagnostic capacity over the range of cutoff values. This ROC methodology, developed for diagnostic studies with binary outcomes, has been successfully extended to time-to-event outcomes. By formulating the problem as diagnosing whether a subject survives beyond a fixed time point, such as one year, the time-dependent sensitivity and specificity are defined. Then, the time-dependent ROC curve and its AUC are utilized for evaluating prognostic capacity on time-to-event outcomes.Reference Heagerty, Lumley and Pepe 2 , Reference Kamarudin, Cox and Kolamunnage-Dona 3

Meta-analysis is a useful tool to synthesize data from multiple studies and provides a more precise estimate of diagnostic or prognostic capacity. In meta-analysis of diagnostic studies, study-specific cutoff values induce correlation between the empirical sensitivity and specificity pairs among the collected studies. Additionally, they complicate the interpretation of meta-analytical results aggregated by standard meta-analysis techniques, such as the random-effects model. Thus, bivariate models are recommended to model the empirical sensitivity and specificity pairs and the summary ROC (SROC) curveReference Rutter and Gatsonis 4 Reference Haitao, Hongfei and Yijie 6 to show diagnostic capacity. The SROC curve presents the monotonic relationship between sensitivity and $1-\text {specificity}$ at all possible cutoff values, and the area under the SROC curve, namely the summary AUC (SAUC), gives a univariate summary of diagnostic capacity. Properties and extensions of the SROC curve have been discussed extensively in the statistical literature, and the SROC curve and the SAUC are widely used as the main results in meta-analysis of diagnostic studies.Reference Macaskill, Gatsonis, Deeks, Harbord and Takwoingi 1 In contrast, in clinical journals, almost all meta-analyses of prognosis studies aggregate the hazard ratios (HRs) using standard meta-analysis techniques without accounting for the study-specific cutoff values among studies.Reference Look, vWLJ and Duffy 7 Reference Pak, Cheon and HY 11 The major problem is that varying cutoff values contribute to large heterogeneity in the meta-analysis results.Reference Look, vWLJ and Duffy 7 , Reference De Azambuja, Cardoso and De Castro 8 A few papers have proposed methods to aggregate the HRs while accounting for heterogeneous cutoff values.Reference Riley, Elia, Malin, Hemming and Price 12 , Reference Sadashima, Hattori and Takahashi 13 However, these HR-based summary measures are hard to interpret. Motivated by the wide acceptance of the SROC curve in meta-analysis of diagnostic studies, a meta-analytic version of the time-dependent ROC curve was developed, that is, the time-dependent SROC curve, denoted by $\text {SROC}(t)$ . Compared with synthesizing the HRs, $\text {SROC}(t)$ provides a visual presentation of the overall prognostic capacity without depending on one or multiple specific cutoff values. Additionally, $\text {SROC}(t)$ synthesizes prognostic capacity at specific time points; thus, one can view the change in prognostic capacity over time. The area under $\text {SROC}(t)$ , denoted by $\text {SAUC}(t)$ , is useful to quantify prognostic capacity. By testing the null hypothesis ${H_0:\text {SAUC}(t)=0.5}$ , one can examine whether or not the biomarker has significant prognostic capacity. To make inference about $\text {SROC}(t)$ , Combescure et al.Reference Combescure, Daures and Foucher 14 employed a non-linear mixed model for the biomarker and time-to-event distributions. Hattori and ZhouReference Hattori and Zhou 15 proposed the bivariate normal model and the bivariate binomial model by extending the bivariate modelsReference Rutter and Gatsonis 4 , Reference Reitsma, Glas, Rutjes, Scholten, Bossuyt and Zwinderman 5 for meta-analysis of diagnostic studies. All these methods utilize the Kaplan–Meier (KM) estimates extracted from the literature.
Among the models estimating $\text {SROC}(t)$ , the bivariate normal model of Hattori and ZhouReference Hattori and Zhou 15 (hereinafter, the HZ model) appears to be the simplest in inference and practical implementation. As proposed, the pairs of empirical time-dependent sensitivity and specificity, with their variances, are estimated using the retrieved numbers of patients and the KM estimates at several time points. Then, the HZ model bivariately models the empirical time-dependent sensitivity and specificity pairs to estimate $\text {SROC}(t)$ . The detailed inference procedure is introduced in Section 3.

Despite the usefulness of meta-analyses, the validity of the synthesized results is often threatened by publication bias, also known as small-study effects. Publication bias is induced by selective publication, whereby large studies or studies with significant outcomes are more likely to be published and collected for meta-analysis; consequently, overlooking unpublished studies in meta-analysis can lead to biased estimates and overoptimistic conclusions.Reference Sutton, Song, Gilbody and Abrams 16 In univariate meta-analysis of intervention studies (e.g., randomized clinical trials), methods for assessing and adjusting publication bias have been intensively studied. Beyond the popular graphical methods (e.g., the funnel plot and the trim-and-fill method), sensitivity analysis methods with selection functions, including the Heckman-type selection functions,Reference Copas and Shi 17 , Reference Copas and Shi 18 the t-statistic based selection function,Reference Copas 19 and the worst-case analysis,Reference Copas and Jackson 20 provide more careful evaluations of the impact of publication bias. In meta-analysis of diagnostic studies, several selection function based methods have been proposed for dealing with publication bias on the estimate of the SROC curve. Most methods modeled the selective publication process by Heckman-type selection functions.Reference Hattori and Zhou 21 Reference Li, Fan, Liu and Liu 23 Recently, Zhou et al.Reference Zhou, Huang and Hattori 24 introduced the cutoff-dependent selection function, which is a probit model on the t-type statistic of the empirical sensitivity and specificity pairs. Specifically, the t-type statistic is defined as that of a linear combination of the logit-transformed empirical sensitivities and specificities. Thus, the cutoff-dependent selection function can model a variety of selective publication processes determined by the significance of sensitivity, specificity, or both. Since the scientific arguments in each diagnostic study are mainly based on cutoff-dependent quantities (e.g., sensitivity, specificity, and their derivatives), such cutoff-dependent selection functions are more appealing for modeling the selective publication.

Publication bias can also affect the estimate of $\text {SROC}(t)$ . Furthermore, $\text {SROC}(t)$ may suffer from an additional source of bias. As mentioned above, inference of $\text {SROC}(t)$ relies on the KM estimates extracted from the literature, where authors might include figures of KM estimates to highlight findings of substantial survival improvement. However, not all publications report figures of KM estimates. Exclusion of studies missing KM estimates may result in biased estimates of $\text {SROC}(t)$ , which is an issue of selective reporting.Reference Boutron, Page, Higgins, Altman, Lundh and Hróbjartsson 25 In this paper, we refer to the combined bias resulting from selective publication and selective reporting of KM estimates as reporting bias. We therefore aim to propose a sensitivity analysis method for assessing the impact of reporting bias on $\text {SROC}(t)$ estimated by the HZ model. Due to the similarity between the HZ model and the bivariate normal model in meta-analysis of diagnostic studies, an intuitive idea is to apply the method of Zhou et al.Reference Zhou, Huang and Hattori 24 to the HZ model. However, the publication mechanisms of diagnostic and prognosis studies differ; thus, the selection function defined in Zhou et al.Reference Zhou, Huang and Hattori 24 may not be appealing for modeling the publication mechanism of prognosis studies. In prognosis studies, the KM estimates of the high/low expression groups are often reported graphically with the P-values of the log-rank test, and the conclusions about the prognostic capacity of the biomarker are based on the HRs. Thus, the significance of the log-rank test, or the significance of the log-transformed HR (lnHR) between the two groups, is supposed to be a determinant of the publication of prognosis studies. Moreover, a publication mechanism modeled by a selection function of the test statistic is more interpretable. These considerations motivated us to propose a sensitivity analysis method that employs the test statistic based selection functionReference Zhou, Huang and Hattori 24 to address reporting bias in meta-analysis of prognosis studies. However, this development is challenging. In Zhou et al.,Reference Zhou, Huang and Hattori 24 the t-type statistic can be expressed by sensitivity and specificity, which are reported as outcomes in diagnostic studies. They successfully extended the likelihood-based sensitivity analysis method of CopasReference Copas 19 to the bivariate normal model for the logit-transformed sensitivity and specificity. In meta-analysis of prognosis studies, the pairs of time-dependent sensitivity and specificity are not observable outcomes, and it is difficult to re-express the log-rank statistic (equivalently, the t-statistic of the lnHR without considering covariates) as a linear function of the time-dependent sensitivity and specificity pairs. To overcome this, we propose a trivariate model for the logit-transformed time-dependent sensitivity and specificity and the lnHR. Based on the trivariate model, the conditional likelihood taking into account the selection function of the log-rank statistic is derived. The conditional likelihood is further constrained by the marginal selection probability. By specifying a plausible range of marginal selection probabilities, one can assess the minimal and maximal possible impact of reporting bias on $\text {SROC}(t)$ and $\text {SAUC}(t)$ .
This paper provides an idea for dealing with reporting bias in meta-analysis of studies with time-to-event outcomes.

The rest of this article is organized as follows. In Section 2, we introduce the issue of reporting bias using a motivating meta-analysis of Ki67. In Section 3, we describe the general data structure and review the HZ model without taking into account selective publication. In Section 4, we propose the trivariate model and the sensitivity analysis method for reporting bias in detail. In Section 5, we revisit the meta-analysis of Ki67 and evaluate the potential impact of reporting bias by the proposed method. In Section 6, simulation studies are conducted to evaluate the performance of the proposed method. In Section 7, we conclude this work with a discussion.

Table 1 Scenarios of distributions of biomarker and cutoff values used in simulation studies. e follows the standard logistic distribution

Figure 1 The funnel plot and the trim-and-fill method for detecting reporting bias in the meta-analysis of Ki67. The vertical black dashed lines are the integrated lnHRs without considering reporting bias. The central axes of the funnel plots are the adjusted lnHRs. The open circle points are the filled unpublished studies. The red and black points are the published studies, categorized by the P-values of the lnHRs. P for the open circle points in the legends indicates the estimated P-values of the imputed lnHRs of the filled studies.

2 Motivating example: reporting bias in meta-analysis of Ki67

De Azambuja et al.Reference De Azambuja, Cardoso and De Castro 8 conducted a meta-analysis to evaluate the prognostic capacity of Ki67. They synthesized the reported HRs that assessed the association between Ki67 and the survival outcomes of patients with early breast cancer. For the overall survival outcome, 38 studies reported the HRs (or equivalently, the log-transformed HRs, denoted by lnHRs) with the corresponding standard errors (SEs). The data are presented in Table 1 of De Azambuja et al.,Reference De Azambuja, Cardoso and De Castro 8 where the SEs can be derived from the 95% confidence intervals (CIs). The HR was defined as that of the high versus low Ki67 expression groups, and the high/low expressions were determined by various study-specific cutoff values, ranging from 0.035 to 0.286. These heterogeneous cutoff values contributed to some of the heterogeneity in this meta-analysis of the lnHRs.

The major concern, which is often overlooked, is reporting bias, the mixture of selective publication of studies and selective reporting of outcomes. In this meta-analysis, the 38 studies may have been selectively published from the population of studies due to their significant results. Since the selective publication process cannot be verified from the observed data, one straightforward way to explore the existence of potentially unpublished studies is the funnel plot of the lnHRs. In this meta-analysis, unpublished studies were imputed by the trim-and-fill method until the funnel plot of the lnHRs was symmetric. In Figure 1a, the funnel plot and the trim-and-fill method implied that some studies with insignificant lnHRs might be unpublished (the open circle points in the plot), and studies with insignificant lnHRs (i.e., P-values greater than 0.05) seemed vulnerable to non-publication. However, since the lnHRs were estimated depending on the study-specific cutoff values, the funnel plot of the lnHRs could be misleading and influenced by the heterogeneity resulting from the cutoff values. To mitigate the influence of heterogeneous cutoff values, we then applied the trim-and-fill method within strata of cutoff values. In the funnel plot of the subgroup with cutoff value equal to 0.1 (Figure 1b), the asymmetry suggested some potentially unpublished studies determined by the significance of the lnHRs, so reporting bias seemed more likely to occur in this subgroup. In contrast, in the subgroup with cutoff value equal to 0.2 (Figure 1c), selective publication was not detected.
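For readers who wish to reproduce this kind of diagnostic check, the funnel plot and trim-and-fill adjustment can be obtained with the metafor package in R. The sketch below is not the authors' code; it assumes a hypothetical data frame ki67 with columns lnHR and se_lnHR holding the extracted log hazard ratios and their SEs (subgroup analyses by cutoff value would simply subset this data frame first).

```r
# Minimal sketch (not the authors' code): funnel plot and trim-and-fill on the lnHRs.
# `ki67` is a hypothetical data frame with columns lnHR and se_lnHR.
library(metafor)

res <- rma(yi = lnHR, sei = se_lnHR, data = ki67, method = "REML")  # random-effects model
tf  <- trimfill(res)   # impute potentially unpublished studies
funnel(tf)             # filled (open) points indicate imputed studies
summary(tf)            # pooled lnHR after the trim-and-fill adjustment
```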

To evaluate the prognostic capacity independently of cutoff values, Hattori and ZhouReference Hattori and Zhou 15 re-analyzed this meta-analysis by their proposed HZ model. They extracted the KM estimates of the high/low expression groups at several time points and estimated $\text {SROC}(t)$ and $\text {SAUC}(t)$ at the third ( $t=3$ ) and the fifth ( $t=5$ ) follow-up years. The estimate of SAUC(3) was 0.649 (95% CI: 0.606, 0.690), and that of SAUC(5) was 0.646 (95% CI: 0.610, 0.680), indicating that Ki67 was still useful for discriminating patients with breast cancer at the third and fifth follow-up years. However, among the 38 studies, not all studies reported the KM curves. The estimates of SAUC(3) and SAUC(5) were based on the 23 and 21 studies that reported the KM estimates at the third and fifth years, respectively.

In summary, among the studies reporting the lnHRs, the trim-and-fill method detected some unpublished studies that were almost insignificant in the lnHRs (Figure 2). When using the HZ model for meta-analysis, fewer than 38 studies reporting the KM estimates were used for synthesis. These two sources of missingness could lead to reporting bias in the results. Thus, to evaluate the robustness of the estimates of SROC(t) and SAUC(t), reporting bias, including both selective publication of studies and selective reporting of the KM curves, should be treated with caution. Although the funnel plot and the trim-and-fill method could raise concerns about reporting bias in this meta-analysis, they are not useful for evaluating the potential impact of reporting bias on $\text {SROC}(t)$ and $\text {SAUC}(t)$ . To overcome this issue, we propose a sensitivity analysis method for evaluating the impact of reporting bias on $\text {SROC}(t)$ estimated by the HZ model in the later sections.

Figure 2 The funnel plot and the trim-and-fill method for detecting reporting bias in the meta-analysis of Ki67 based on the 23 studies reporting the KM curves. The vertical black dashed lines are the integrated lnHRs without considering reporting bias. The central axes of the funnel plots are the adjusted lnHRs. The open circle points are the filled unpublished studies. The red and black points are the published studies, categorized by the P-values of the lnHRs. P for the open circle points in the legends indicates the estimated P-values of the imputed lnHRs of the filled studies.

3 Bivariate normal model without considering reporting bias

3.1 Notations and data structure

Suppose that S prognosis studies are conducted to evaluate the association between the expression of a biomarker and subjects’ time-to-event outcomes. Among the S studies, N prognosis studies are published and used in the meta-analysis to summarize the prognostic capacity, while the data of the remaining $S-N$ studies are missing. In this section, we do not consider the existence of reporting bias; that is, the N published studies constitute the population of studies or a random sample from the S studies. Since we focus on the HZ model to synthesize the data of the published studies, we follow the notation in Hattori and ZhouReference Hattori and Zhou 15 to introduce the latent individual patient data (IPD) in each study and the observable data for meta-analysis as follows.

Suppose that each prognosis study $i\left (i = 1, 2, \dots , N\right )$ includes $n^{(i)}$ subjects, and the subjects are assumed to be random samples from the population of interest. For each subject, let $\tilde X$ be the baseline measurement of the biomarker, $\tilde T$ the failure time, $\tilde C$ the right-censoring time, and $\tilde Y = \mathrm {min}\left (\tilde T, \tilde C\right )$ the follow-up time. We use tildes to denote these individual-level variables, which cannot be extracted from the literature for meta-analysis. We assume that the distribution of $\tilde C$ is identical across the N studies, as well as $\tilde C {\perp \!\!\!\perp } \tilde X$ and $\tilde T {\perp \!\!\!\perp } \tilde C\mid \tilde X$ , where ${\perp \!\!\!\perp }$ indicates that variables are independent. These assumptions lead to $\tilde T {\perp \!\!\!\perp } \tilde C$ .

For each study i, let $v^{(i)}$ denote the study-specific cutoff value, which is not necessarily reported. This cutoff value separates subjects into the low expression group, denoted by $Z=0$ , if $\tilde X \le v^{(i)}$ or the high expression group ( $Z=1$ ) if $\tilde X> v^{(i)}$ . The survival functions of the low and high expression groups are respectively denoted and defined by

$$ \begin{align*} S_0^{(i)}\left(t\right)&=P\left(\tilde T> t \mid \tilde X \le v^{(i)}\right)\\ S_1^{(i)}\left(t\right)&=P\left(\tilde T > t\mid \tilde X > v^{(i)}\right). \end{align*} $$

The disease status at time t is defined by the counting process $D(t)=1$ if $\tilde T\le t$ or $D(t)=0$ if $\tilde T> t$ , indicating that a subject has an event before time t or survives after t, respectively. To facilitate understanding, we show the numbers of subjects by disease status and expression group in Table S1 in the Supplementary Material. Analogous to a diagnostic study, the sensitivity and specificity at time t are respectively denoted and defined by

$$ \begin{align*} \text{se}\left(x,t\right) &= P\left(\tilde X> x\mid \tilde T \le t\right)\\ \text{sp}\left(x,t\right) &= P\left(\tilde X \le x\mid\tilde T > t\right). \end{align*} $$

For meta-analysis, the following summarized data are required and can be extracted from the published literature. For study $i~(i=1,2,\dots , N)$ , let $n_0^{(i)}$ and $n_1^{(i)}$ denote the numbers of subjects in the low and high expression groups, respectively; the total number of subjects is $n^{(i)} = n_0^{(i)} + n_1^{(i)}$ . Let $\hat S_0^{(i)}\left (t\right )$ and $\hat S_1^{(i)}\left (t\right )$ denote the KM estimates of the low and high expression groups, respectively; they can be extracted from the reported plots of the KM curves at the partition of the time interval $\left [0,t\right ]$ : $0=t_0<t_1<\dots <t_K=t$ . Let $\hat \mu _{\text {lnHR}}^{(i)}$ denote the reported lnHR of the high versus low expression groups and $\hat s_{\text {lnHR}}^{(i)}$ the corresponding SE; both are estimated by the Cox model.Reference Cox 26 Among the N studies, the sample median of the follow-up time over all subjects can also be extracted from some studies. These medians of follow-up time are used to estimate the censoring distribution when estimating the SEs of $\hat {S}_0^{(i)}(t)$ and $\hat {S}_1^{(i)}(t)$ . (See Section 3.2 in Hattori and ZhouReference Hattori and Zhou 15 for more details.) We take the meta-analysis of Ki67 as an example to illustrate the data structure, with details explained in Section S1 of the Supplementary Material.

3.2 Bivariate normal model

In this section, we review the structure of the HZ model and the definitions of $\text {SROC}(t)$ and $\text {SAUC}(t)$ . The HZ model employs a bivariate normal-normal random-effects model for the empirical pairs of logit-transformed time-dependent sensitivities and specificities, which are estimated using the KM estimates from the published literature, allowing for heterogeneity from different sources (e.g., cutoff values, study designs, populations, etc.) in the time-dependent sensitivities and specificities across studies. It extends the model of Reitsma et al.,Reference Reitsma, Glas, Rutjes, Scholten, Bossuyt and Zwinderman 5 which has been widely used in meta-analysis of diagnostic studies.

According to Hattori and Zhou,Reference Hattori and Zhou 15 with the Bayes rule and simple algebraic manipulations, the true sensitivity and specificity at time t of study i can be respectively re-expressed by

(1) $$ \begin{align} \begin{aligned} \text{se}\left(v^{(i)}, t \right) &= \dfrac {\left\{1- S_1^{(i)}\left(t\right)\right\}q_1^{(i)}} {\left\{1- S_0^{(i)}\left(t\right)\right\}q_0^{(i)} + \left\{1- S_1^{(i)}\left(t\right)\right\}q_1^{(i)}}\\ \text{sp}\left(v^{(i)}, t\right) &= \dfrac { S_0^{(i)}\left(t\right)\cdot q_0^{(i)}} { S_0^{(i)}\left(t\right)\cdot q_0^{(i)} + S_1^{(i)}\left(t\right)\cdot q_1^{(i)}}, \end{aligned} \end{align} $$

where $q_0^{(i)} = P\left (\tilde X\le v^{(i)}\right )$ and $q_1^{(i)} = P\left (\tilde X>v^{(i)}\right )$ . With the data extracted from collected studies, $q_0^{(i)}$ and $q_1^{(i)}$ are estimated by $\hat q_0^{(i)} = n_0^{(i)}/n^{(i)}$ and $\hat q_1^{(i)} = n_1^{(i)}/n^{(i)}$ , respectively; the time-dependent sensitivity and specificity in equation (1) are consistently estimated by substituting $\left (\hat q_0^{(i)}, \hat q_1^{(i)}, \hat S_0^{(i)}, \hat S_1^{(i)}\right )$ for $\left (q_0^{(i)}, q_1^{(i)}, S_0^{(i)}, S_1^{(i)}\right )$ . The resulting consistent estimators are denoted by $\hat {\textrm {se}}\left (v^{(i)}, t \right )$ and $\hat {\textrm {sp}}\left (v^{(i)}, t \right )$ , respectively.
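As a small illustration of equation (1), the empirical time-dependent sensitivity and specificity can be computed from the extracted group sizes and KM estimates. The R sketch below uses hypothetical inputs (n0, n1, S0_t, S1_t) and is only meant to make the plug-in estimator concrete.

```r
# Minimal sketch of the plug-in estimator in equation (1); inputs are hypothetical.
# n0, n1: numbers of subjects in the low/high expression groups.
# S0_t, S1_t: KM survival estimates of the two groups at time t.
td_se_sp <- function(n0, n1, S0_t, S1_t) {
  q0 <- n0 / (n0 + n1)               # estimate of P(X <= v)
  q1 <- n1 / (n0 + n1)               # estimate of P(X > v)
  se <- (1 - S1_t) * q1 / ((1 - S0_t) * q0 + (1 - S1_t) * q1)
  sp <- S0_t * q0 / (S0_t * q0 + S1_t * q1)
  c(sensitivity = se, specificity = sp)
}

td_se_sp(n0 = 120, n1 = 80, S0_t = 0.85, S1_t = 0.70)
```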

The HZ model, following Reitsma et al.,Reference Reitsma, Glas, Rutjes, Scholten, Bossuyt and Zwinderman 5 employs a logit-transformation to map the time-dependent sensitivity and specificity to the real number line $(-\infty , \infty )$ . While other transformations are also applicable,Reference Doebler, Holling and Böhning 27 logit-transformation is widely accepted. Let $\boldsymbol \mu ^{(i)} = \left (\mu _{\text {se}}^{(i)},\mu _{\text {sp}}^{(i)}\right )^\top $ denote the true time-dependent sensitivity and specificity pair of study i on the logit scale, that is,

$$ \begin{align*} \begin{aligned} \mu_{\text{se}}^{(i)}=\text{logit}\left\{\text{se}\left(v^{(i)},t\right)\right\}\text{ and } \mu_{\text{sp}}^{(i)}=\text{logit}\left\{\text{sp}\left(v^{(i)},t\right)\right\}, \end{aligned} \end{align*} $$

where $\text {logit}(x)=\log x-\log (1-x)$ . At the between-study level, it is assumed that

(2) $$ \begin{align}{\boldsymbol \mu}^{(i)} \sim N_2 \left (\boldsymbol \mu, \boldsymbol \Omega \right ) \text{ with } \boldsymbol \Omega= \begin{bmatrix} \tau_{\text{se}}^2 & \tau_{\mathrm{se,sp}}\\ \tau_{\mathrm{se,sp}} & \tau_{\text{sp}}^2 \end{bmatrix}, \end{align} $$

where $N_2$ denotes the bivariate normal distribution, and $\boldsymbol \mu = \left (\mu _{\text {se}}, \mu _{\text {sp}}\right )^\top $ is the overall mean at time t across multiple prognosis studies; $\boldsymbol \Omega $ is the between-study variance-covariance matrix of ${\boldsymbol \mu }^{(i)}$ , where $\tau _{\text {se}}^2$ and $\tau _{\text {sp}}^2$ are the variance of $\mu _{\text {se}}^{(i)}$ and $\mu _{\text {sp}}^{(i)}$ , respectively, and $\tau _{\mathrm {se,sp}}$ the covariance between them.

Let $\hat {\boldsymbol \mu }^{(i)} = \left (\text {logit}\left \{\hat {\textrm {se}}\left (v^{(i)},t\right )\right \}, \text {logit}\left \{\hat {\textrm {sp}}\left (v^{(i)},t\right )\right \}\right )^\top $ denote the consistent estimates of $\boldsymbol \mu ^{(i)}$ in each study. Hattori and ZhouReference Hattori and Zhou 15 showed that $\hat {\boldsymbol \mu }^{(i)}$ has the following asymptotic distribution at the within-study level:

(3) $$ \begin{align} \hat{\boldsymbol \mu}^{(i)}\mid \boldsymbol \mu^{(i)} \sim N_2 \left (\boldsymbol \mu^{(i)}, \left.{ {\mathbf H}^{(i)}}\right/{n^{(i)}} \right ) \text{ with } {\mathbf H}^{(i)} = \begin{bmatrix} \left\{\sigma_{\text{se}}^{(i)}\right\}^2 & \sigma_{\mathrm{se,sp}}^{(i)}\\ \sigma_{\mathrm{se,sp}}^{(i)} & \left\{\sigma_{\text{sp}}^{(i)}\right\}^2 \end{bmatrix}, \end{align} $$

where ${\mathbf H}^{(i)}$ indicates the within-study asymptotic variance-covariance matrix. The detailed expression of ${\mathbf H}^{(i)}$ is presented in equation (S1) of the Supplementary Material. The parameter ${\mathbf H}^{(i)}$ in model (3) can be replaced with its consistent estimator $\hat {\mathbf H}^{(i)}$ estimated by the Greenwood formula with median follow-up time, which follows the convention that the within-study variance–covariance matrix is known in meta-analysis. (See Section 3.2 in Hattori and ZhouReference Hattori and Zhou 15 for more details.) Combining models (2) and (3) induces the HZ model:

(4) $$ \begin{align} \hat{\boldsymbol \mu}^{(i)} \sim N_2 \left (\boldsymbol \mu, \left. \boldsymbol \Omega + {\hat{\mathbf H}^{(i)}}\right/{n^{(i)}} \right ), \end{align} $$

where $(\boldsymbol \mu , \boldsymbol \Omega )$ are the unknown parameters at time t, and they can be estimated by the maximum likelihood (ML) method; we denote their ML estimators as $(\hat {\boldsymbol \mu }, \hat {\boldsymbol \Omega })$ .
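To make the estimation step concrete, the following R sketch maximizes the likelihood of model (4) with nlminb(), parametrizing $\boldsymbol \Omega$ by log standard deviations and a tanh-transformed correlation. It is a minimal sketch under the assumption that the logit-transformed pairs are stored in an N x 2 matrix y and the within-study matrices $\hat{\mathbf H}^{(i)}/n^{(i)}$ in a list H; it is not the authors' implementation.

```r
# Minimal sketch of ML estimation for the HZ model (4); not the authors' code.
# y: N x 2 matrix of logit-transformed (sensitivity, specificity) estimates.
# H: list of N within-study 2 x 2 covariance matrices, already divided by n_i.
negloglik_hz <- function(par, y, H) {
  mu  <- par[1:2]
  tau <- exp(par[3:4])                 # between-study SDs
  rho <- tanh(par[5])                  # between-study correlation in (-1, 1)
  Omega <- matrix(c(tau[1]^2, rho * tau[1] * tau[2],
                    rho * tau[1] * tau[2], tau[2]^2), 2, 2)
  nll <- 0
  for (i in seq_len(nrow(y))) {
    V   <- Omega + H[[i]]
    r   <- y[i, ] - mu
    nll <- nll + 0.5 * (log(det(V)) + drop(t(r) %*% solve(V, r)))
  }
  nll
}

fit_hz <- nlminb(c(0, 0, log(0.5), log(0.5), 0), negloglik_hz, y = y, H = H)
```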

Based on the HZ model, $\text {SROC}(t)$ is derived by taking the conditional expectation of $\mu _{\text {se}}^{(i)}$ given $\mu _{\text {sp}}^{(i)}$ in model (2). Letting x denote the false positive rate, that is, $1-\text {specificity}$ at time t, $\text {SROC}(t)$ is defined by the following time-dependent function:

(5) $$ \begin{align} \text{SROC}(t) = \text{SROC}\left(x, t; \boldsymbol \mu, \boldsymbol \Omega \right) = \text{logit}^{-1}\left[ \mu_{\text{se}} - \dfrac{\tau_{\mathrm{se,sp}}}{\tau_{\text{sp}}^2}\left\{ \text{logit}(x)+\mu_{\text{sp}}\right\}\right]. \end{align} $$

Accordingly, $\text {SAUC}(t)$ is defined by

(6) $$ \begin{align} \text{SAUC}(t) = \text{SAUC}\left(t; \boldsymbol \mu, \boldsymbol \Omega \right) = \int_0^1 \text{SROC}\left(x, t; \boldsymbol \mu, \boldsymbol \Omega\right) dx. \end{align} $$

$\text {SROC}(t)$ and $\text {SAUC}(t)$ can be estimated by replacing the unknown parameters $(\boldsymbol \mu , \boldsymbol \Omega )$ in (5) and (6) with their ML estimates $(\hat {\boldsymbol \mu }, \hat {\boldsymbol \Omega })$ .
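The SROC(t) curve and SAUC(t) in equations (5) and (6) are then simple functions of these estimates; a minimal R sketch, continuing the hypothetical objects from the previous sketch, is given below.

```r
# Minimal sketch of equations (5) and (6); plogis() is logit^{-1} and qlogis() is logit.
sroc <- function(x, mu, Omega) {
  plogis(mu[1] - Omega[1, 2] / Omega[2, 2] * (qlogis(x) + mu[2]))
}
sauc <- function(mu, Omega) {
  integrate(sroc, lower = 0, upper = 1, mu = mu, Omega = Omega)$value
}

# Example with the ML estimates from the previous sketch:
mu_hat    <- fit_hz$par[1:2]
tau_hat   <- exp(fit_hz$par[3:4]); rho_hat <- tanh(fit_hz$par[5])
Omega_hat <- matrix(c(tau_hat[1]^2, rho_hat * tau_hat[1] * tau_hat[2],
                      rho_hat * tau_hat[1] * tau_hat[2], tau_hat[2]^2), 2, 2)
sauc(mu_hat, Omega_hat)
```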

4 Sensitivity analysis for reporting bias

4.1 Trivariate model incorporating the lnHR

In intervention studies, whether or not a study is published is influenced by the significance of the result, such as the P-value of the t-statistic. Similarly, a prognosis study with time-to-event outcomes is more likely to be published when its log-rank test (or equivalently, the t-statistic of the lnHR without covariates) is significant, and vice versa. This selective publication causes the N published studies to be a biased sample from the S studies and may induce reporting bias in the estimates of $\text {SROC}(t)$ and $\text {SAUC}(t)$ . To quantify reporting bias in the estimates, we aim to extend the methods of CopasReference Copas 19 and Zhou et al.Reference Zhou, Huang and Hattori 24 and introduce a selection function of the log-rank test (equivalently, the t-statistic of the lnHR) to model the publication mechanism of prognosis studies; the inference on reporting bias is then made based on the conditional likelihood of the HZ model given the published studies. However, the HZ model (4) does not involve the lnHR. Thus, before introducing the proposed method, we need to expand the HZ model (4) to correlate the time-dependent sensitivity and specificity with the lnHR for constructing the conditional likelihood in a later section.

To distinguish these from the notation for the HZ model (4) in Section 3.2, we let $\hat {\mathbf y}^{(i)}=\left (\hat \mu _{\text {se}}^{(i)},\hat \mu _{\text {sp}}^{(i)}, \hat \mu _{\text {lnHR}}^{(i)}\right )^\top $ denote the consistent estimators of $\boldsymbol \theta ^{(i)} = \left ( \mu _{\text {se}}^{(i)},\mu _{\text {sp}}^{(i)}, \mu _{\text {lnHR}}^{(i)}\right )^\top $ at time t, where $\mu _{\text {lnHR}}^{(i)}$ denotes the lnHR and $\hat \mu _{\text {lnHR}}^{(i)}$ the empirical lnHR estimated by the Cox model from the published prognosis studies. At the between-study level, it is assumed that

(7) $$ \begin{align} {\boldsymbol\theta}^{(i)} \sim N_3\left ({\boldsymbol\theta}, \boldsymbol \Psi \right ) ~\text{with}~ \boldsymbol \Psi = \begin{bmatrix} \psi_{\text{se}}^2 & \psi_{\mathrm{se,sp}} & \psi_{\mathrm{se,lnHR}} \\ \psi_{\mathrm{se,sp}} & \psi_{\text{sp}}^2 & \psi_{\mathrm{sp,lnHR}} \\ \psi_{\mathrm{se,lnHR}} & \psi_{\mathrm{sp,lnHR}} & \psi_{\text{lnHR}}^2 \end{bmatrix}, \end{align} $$

where $\boldsymbol \theta = \left (\theta _{\text {se}}, \theta _{\text {sp}}, \theta _{\text {lnHR}}\right )^\top $ are the common means, and $\boldsymbol \Psi $ indicates the between-study variance-covariance matrix of ${\boldsymbol \theta }^{(i)}$ ; the diagonal elements are the corresponding variances of $\theta _{\text {se}}, \theta _{\text {sp}}$ , and $\theta _{\text {lnHR}}$ and the others are the covariances between each two of them.

At the within-study level, we can prove that

(8) $$ \begin{align} \hat{\mathbf y}^{(i)}\mid \boldsymbol\theta^{(i)} \sim N_3 \left ( {\boldsymbol\theta^{(i)}}, \left.{{\boldsymbol \Sigma}^{(i)}}\right/n^{(i)} \right ) \text{ with } {\boldsymbol \Sigma}^{(i)} = \begin{bmatrix} \mathbf H^{(i)} & \boldsymbol \Sigma_{12}^{(i)}\\ {\boldsymbol \Sigma_{12}^{(i)}}^\top & \left\{ \sigma_{\text{lnHR}}^{(i)}\right\}^2 \end{bmatrix}, \end{align} $$

where $N_3$ denotes the trivariate normal distribution, $\boldsymbol \Sigma ^{(i)}$ the within-study asymptotic variance–covariance matrix, and ${\mathbf H}^{(i)}$ the variance–covariance matrix in model (3); $\boldsymbol \Sigma _{12}^{(i)}=\left (\sigma _{\mathrm {se,lnHR}}^{(i)}, \sigma _{\mathrm {sp,lnHR}}^{(i)}\right )^\top $ , where $\sigma _{\mathrm {se,lnHR}}^{(i)}$ is the covariance between $\hat \mu _{\text {se}}^{(i)}$ and $\hat \mu _{\text {lnHR}}^{(i)}$ , $\sigma _{\mathrm {sp,lnHR}}^{(i)}$ the covariance between $\hat \mu _{\text {sp}}^{(i)}$ and $\hat \mu _{\text {lnHR}}^{(i)}$ , and $\left \{ \sigma _{\text {lnHR}}^{(i)}\right \}^2$ the variance of $\hat \mu _{\text {lnHR}}^{(i)}$ . The proof of the asymptotic distribution (8) of $\hat {\mathbf y}^{(i)}$ is presented in Section S2 of the Supplementary Material. In (8), $\boldsymbol \Sigma ^{(i)}$ can be replaced by its consistent estimator $\hat {\boldsymbol \Sigma }^{(i)}$ , following the convention in meta-analysis that the within-study variance–covariance matrix is known. Combining models (7) and (8), we derive the marginal distribution of $\hat {\mathbf y}^{(i)}$ :

(9) $$ \begin{align} \hat{\mathbf y}^{(i)} \sim N_3 \left (\boldsymbol \theta, \left. \boldsymbol \Psi + {\hat{\boldsymbol \Sigma}^{(i)}}\right/{n^{(i)}} \right ). \end{align} $$

4.2 Selection functions on the significance of the lnHR

As mentioned above, we conjecture that the N published studies are subject to selective publication, in that a study with a significant lnHR (or a small P-value of the log-rank test) is more likely to be published. The selective publication process is modeled by the probability of a study being selected given its t-statistic of the lnHR, denoted by $t_{\text {HR}}^{(i)}$ , as shown in the following selection function:

(10) $$ \begin{align} P\left(\text{select}\mid \hat{\mathbf y}^{(i)}, \hat {\boldsymbol \Sigma}^{(i)}\right) = a\left({\hat{\mathbf y}^{(i)}, \hat {\boldsymbol \Sigma}^{(i)}} \right)=a\left( {t_{\text{HR}}^{(i)}}\right), \end{align} $$

where the function $a(\cdot )$ is a non-decreasing function of ${t_{\text {HR}}^{(i)}}$ , that is, the t-statistic of the lnHR. To simplify the inference procedure, following Copas,Reference Copas 19 we employ the probit model for $a(\cdot )$ . Thus, equation (10) is defined by:

(11) $$ \begin{align} a\left({t_{\text{HR}}^{(i)}}\right)=\Phi\left(\alpha + \beta\cdot {t_{\text{HR}}^{(i)}}\right), \end{align} $$

where ${t_{\text {HR}}^{(i)}} = \hat \mu _{\text {lnHR}}^{(i)}\left /\hat s^{(i)}_{\text {lnHR}}\right .$ , the parameters $\alpha $ and $\beta $ control the probability of selective publication, and $\hat s^{(i)}_{\text {lnHR}}$ denotes the reported SE of the lnHR with $\hat s^{(i)}_{\text {lnHR}} = \left .\hat \sigma _{\text {lnHR}}^{(i)}\right /\sqrt {n^{(i)}}$ . The monotonic property of the probit model links two cases of random selective publication: (1) when $\beta =0$ and $\alpha =\Phi ^{-1}\left (p_0\right )$ , the probability of selective publication is independent of the t-statistic, and each study is randomly published from the population with selection probability $p_0$ ; (2) when $\beta \rightarrow \infty $ , each study is published with probability 1.

According to the definition of the probit model, the selection function (11) can be represented by

(12) $$ \begin{align} a\left({t_{\text{HR}}^{(i)}}\right) =P\left(z^{(i)} < \alpha + \beta\cdot {t_{\text{HR}}^{(i)}}\right)= \Phi\left\{ \alpha + \beta\cdot \left(\left.{\hat \mu^{(i)}_{\text{lnHR}}}\right/{\hat s^{(i)}_{\text{lnHR}}}\right) \right\}, \end{align} $$

where $z^{(i)}$ is the standard normal random variable independent of ${t_{\text {HR}}^{(i)}}$ . Based on the trivariate model (9), the marginal distribution of $\hat \mu ^{(i)}_{\text {lnHR}}$ is

$$ \begin{align*} \hat \mu^{(i)}_{\text{lnHR}}\sim N\left({\theta_{\text{lnHR}}}, \psi_{\text{lnHR}}^2 + \left\{\hat s^{(i)}_{\text{lnHR}}\right\}^2 \right). \end{align*} $$

Then, the distribution of ${t_{\text {HR}}^{(i)}}$ is derived by

$$ \begin{align*} N \left( \dfrac{\theta_{\text{lnHR}}}{\hat s^{(i)}_{\text{lnHR}}}, 1 + \left\{\dfrac{\psi_{\text{lnHR}}}{\hat s^{(i)}_{\text{lnHR}}}\right\}^2\right ), \end{align*} $$

and averaging the selection function $a\left ({t_{\text {HR}}^{(i)}}\right )$ in equation (12) over this distribution gives the following selection function:

(13) $$ \begin{align} P\left(\text{select}\mid {\hat {\boldsymbol \Sigma}^{(i)}}\right) = b\left(\hat{\boldsymbol \Sigma}^{(i)}\right) = \Phi\left\{ \dfrac{\alpha + \beta\cdot \left(\theta_{\text{lnHR}}\left/\hat s^{(i)}_{\text{lnHR}}\right.\right) } {\sqrt{1 + \beta^2\cdot \left\{1+ \left({\psi_{\text{lnHR}}}\left/{\hat s^{(i)}_{\text{lnHR}}}\right.\right)^2\right\}}} \right\}. \end{align} $$
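The selection functions (11) and (13) translate directly into code. The R sketch below is a minimal illustration; s_lnHR denotes the vector of reported SEs of the lnHRs, and the argument names are assumptions for illustration only.

```r
# Minimal sketch of the selection functions (11) and (13); argument names are illustrative.
a_fun <- function(t_HR, alpha, beta) {
  pnorm(alpha + beta * t_HR)                        # equation (11)
}
b_fun <- function(s_lnHR, alpha, beta, theta_lnHR, psi_lnHR) {
  pnorm((alpha + beta * theta_lnHR / s_lnHR) /      # equation (13)
        sqrt(1 + beta^2 * (1 + (psi_lnHR / s_lnHR)^2)))
}
```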

4.3 Likelihood based sensitivity analysis

We estimate the parameters in $\text {SROC}(t)$ and $\text {SAUC}(t)$ by maximizing the loglikelihood subject to a certain value of marginal selection probability, denoted by $P(\text {select})$ and defined by

(14) $$ \begin{align} P(\text{select}) = E_P\left\{a\left({t_{\text{HR}}^{(i)}}\right)\right\} = E_P\left\{b\left(\hat{\boldsymbol \Sigma}^{(i)}\right)\right\}. \end{align} $$

We regard $P(\text {select})=p$ as the sensitivity parameter, representing the expected proportion of published studies among the population of studies. With various values of p, the changes in the estimates of $\text {SROC}(t)$ and $\text {SAUC}(t)$ indicate the impact of reporting bias on them. In this section, we derive the loglikelihood function conditional on the published studies at a fixed value of p.

We let $f_P\left (\hat {\mathbf y}^{(i)}\mid \hat {\boldsymbol \Sigma }^{(i)}\right )$ denote the marginal distribution of $\hat {\mathbf y}^{(i)}$ and $f_P\left (\hat {\boldsymbol \Sigma }^{(i)}\right )$ the distribution of $\hat {\boldsymbol \Sigma }^{(i)}$ over the population of studies. In meta-analysis without taking into account reporting bias, the empirical data $\left (\hat {\mathbf y}^{(i)}, \hat {\boldsymbol \Sigma }^{(i)}\right )$ are regarded as a random sample from the population of studies with the joint distribution $f_P\left (\hat {\mathbf y}^{(i)}, \hat {\boldsymbol \Sigma }^{(i)}\right )$ . In the presence of selective publication, the published studies can be a biased sample from the population; we let $f_O$ denote the distribution of the selectively published samples. Given a fixed p, the distribution of $\hat {\boldsymbol \Sigma }^{(i)}$ in the published studies is derived by

$$ \begin{align*} f_O\left(\hat{\boldsymbol\Sigma}^{(i)}\right) = f\left(\left.\hat{\boldsymbol\Sigma}^{(i)}\right|\text{select}\right) = \dfrac{P\left(\text{select}\mid\hat{\boldsymbol\Sigma}^{(i)}\right)f_P\left(\hat{\boldsymbol\Sigma}^{(i)}\right)}{P\left(\text{select}\right)} = \dfrac{b\left(\hat{\boldsymbol \Sigma}^{(i)}\right)f_P\left(\hat{\boldsymbol\Sigma}^{(i)}\right)}{p}, \end{align*} $$

which gives

(15) $$ \begin{align} f_P\left(\hat{\boldsymbol\Sigma}^{(i)}\right)= p\cdot \left\{b\left( \hat{\boldsymbol \Sigma}^{(i)}\right)\right\}^{-1}f_O\left(\hat{\boldsymbol\Sigma}^{(i)}\right). \end{align} $$

The joint distribution of the empirical data is then derived by

$$ \begin{align*} f_O\left(\hat{\mathbf y}^{(i)}, \hat{\boldsymbol{\Sigma}}^{(i)}\right) & = f\left(\hat{\mathbf y}^{(i)}, \hat{\boldsymbol{\Sigma}}^{(i)}\mid \text{select}\right) \\ & = \dfrac{P\left(\text{select}\mid \hat{\mathbf y}^{(i)}, \hat{\boldsymbol{\Sigma}}^{(i)} \right) f_P\left(\hat{\mathbf y}^{(i)}, \hat{\boldsymbol{\Sigma}}^{(i)}\right)}{p} \\ & = \dfrac {a\left({t_{\text{HR}}^{(i)}}\right) f_P\left(\hat{\mathbf y}^{(i)} \mid \hat{\boldsymbol{\Sigma}}^{(i)}\right) f_O\left(\hat{\boldsymbol{\Sigma}}^{(i)}\right)} {b \left(\hat{\boldsymbol \Sigma}^{(i)}\right) }. \end{align*} $$

This joint distribution allows us to derive the loglikelihood of published studies, that is,

(16) $$ \begin{align} \begin{aligned} \ell_O\left(\boldsymbol\theta, \boldsymbol{\Psi}, \alpha, \beta\right) &= \log\prod_{i=1}^N f_O\left(\hat{\mathbf y}^{(i)}, \hat{\boldsymbol{\Sigma}}^{(i)}\right) \\ &= \sum_{i=1}^{N}\log f_P\left(\hat{\mathbf y}^{(i)}\left| \hat{\boldsymbol{\Sigma}}^{(i)}\right.\right) + \sum_{i=1}^{N}\log a\left({t_{\text{HR}}^{(i)}}\right) - \sum_{i=1}^{N}\log b\left(\hat{\boldsymbol{\Sigma}}^{(i)}\right) + c, \end{aligned} \end{align} $$

where the second and third terms are used for correcting reporting bias, and $c = \sum _{i=1}^{N}\log \left \{ {f_O\left (\hat {\boldsymbol {\Sigma }}^{(i)}\right )}\right \}$ is a constant. If studies are selected at random, that is, $\beta =0$ or $\beta \rightarrow \infty $ , these two terms cancel each other, and the likelihood reduces to that without accounting for reporting bias.

Note that, by integrating both sides of equation (15), we can derive

(17) $$ \begin{align} p = 1\left/E_O\left\{b\left(\hat{\boldsymbol{\Sigma}}^{(i)}\right)^{-1}\right\}\right. \simeq N\left/\sum_{i=1}^N b\left(\hat{\boldsymbol{\Sigma}}^{(i)}\right)^{-1}\right.. \end{align} $$

According to the definition of $b(\cdot )$ in (13), equation (17) is monotonic with respect to $\alpha $ ; thus, the parameter $\alpha $ can be expressed as a function of $(\boldsymbol \theta , \boldsymbol {\Psi }, \beta )$ given a value of p. We denote this by $\alpha _p = \alpha _p(\boldsymbol \theta , \boldsymbol {\Psi }, \beta )$ . By replacing $\alpha $ with $\alpha _p$ , we derive the conditional loglikelihood given the published studies from the loglikelihood (16):

(18) $$ \begin{align} \begin{aligned} &\ell_O\left(\boldsymbol\theta, \boldsymbol{\Psi}, \beta; p\right) \\ \propto& -\dfrac{1}{2}\sum_{i=1}^{N}\left\{ \left(\hat{\mathbf y}^{(i)} - \boldsymbol\theta\right)^\top \left(\boldsymbol{\Psi}+\left.\hat{\boldsymbol{\Sigma}}^{(i)}\right/n^{(i)}\right)^{-1} \left(\hat{\mathbf y}^{(i)}- \boldsymbol{\theta}\right) + \log\left|\boldsymbol{\Psi}+\left.\hat{\boldsymbol{\Sigma}}^{(i)}\right/n^{(i)}\right|\right \} \\ +& \sum_{i=1}^{N}\log \Phi\left\{ \alpha_p + \beta\cdot \left({\hat \mu^{(i)}_{\text{lnHR}}}\left/{\hat s^{(i)}_{\text{lnHR}}}\right.\right) \right\} \\ -& \sum_{i=1}^{N}\log \Phi\left\{ \dfrac{\alpha_p + \beta\cdot \left(\theta_{\text{lnHR}}\left/\hat s^{(i)}_{\text{lnHR}}\right.\right) } {\sqrt{1 + \beta^2\cdot \left\{1+ \left({\psi_{\text{lnHR}}}\left/{\hat s^{(i)}_{\text{lnHR}}}\right.\right)^2\right\}}} \right\}. \end{aligned} \end{align} $$

Unlike the loglikelihood (16), this loglikelihood, through $\alpha _p$ , is constrained by the marginal selection probability. The parameters $\left (\boldsymbol \theta , \boldsymbol {\Psi }, \beta \right )$ can be estimated by maximizing the conditional loglikelihood given a specified value of p. We denote these ML estimates by $(\hat {\boldsymbol \theta }, \hat {\boldsymbol \Psi }, \hat \beta )$ . With these MLEs, $\text {SROC}(t)$ in equation (5) and $\text {SAUC}(t)$ in equation (6) can be estimated accordingly, denoted by $\mathrm {SR}\hat {\mathrm O}\mathrm C(t) =\text {SROC}(x, t; \hat {\boldsymbol \theta }, \hat {\boldsymbol \Psi })$ and $\mathrm {SA}\hat {\mathrm U}\mathrm C(t) = \text {SAUC}(t; \hat {\boldsymbol \theta }, \hat {\boldsymbol \Psi })$ , respectively. The asymptotic normality of $(\hat {\boldsymbol \theta }, \hat {\boldsymbol \Psi }, \hat \beta )$ follows from the general theory of ML estimation under the assumption that the numbers of studies and subjects are large. The asymptotic variance-covariance matrix of $(\hat {\boldsymbol \theta }, \hat {\boldsymbol \Psi }, \hat \beta )$ can be consistently estimated by the inverse of the empirical Fisher information following ML theory. The two-sided CI for the SAUC can be constructed by the delta method, and the detailed derivation is presented in Section S3 of the Supplementary Material.
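A minimal R sketch of the constrained estimation is given below. It assumes an N x 3 matrix y3 of (logit-sensitivity, logit-specificity, lnHR) estimates, a list Sig of within-study 3 x 3 covariance matrices already divided by n_i, and the vector s_lnHR of reported SEs; b_fun() is the sketch from Section 4.2. The constraint (17) is solved for alpha_p by a root search, and the conditional loglikelihood (18) is maximized with nlminb(). This is an illustration under simplifying assumptions, not the authors' implementation.

```r
# Solve the constraint (17) for alpha given p: p * sum(1 / b_i) - N = 0 is monotone in alpha.
alpha_p <- function(p, beta, theta_lnHR, psi_lnHR, s_lnHR) {
  g <- function(alpha) {
    p * sum(1 / b_fun(s_lnHR, alpha, beta, theta_lnHR, psi_lnHR)) - length(s_lnHR)
  }
  uniroot(g, lower = -20, upper = 20, extendInt = "yes")$root
}

# Negative conditional loglikelihood (18) for a fixed sensitivity parameter p.
negloglik_p <- function(par, y3, Sig, s_lnHR, p) {
  theta <- par[1:3]
  L <- diag(exp(par[4:6]))                  # Cholesky factor of Psi
  L[lower.tri(L)] <- par[7:9]
  Psi  <- L %*% t(L)
  beta <- exp(par[10])                      # non-negative slope of the probit model
  ap   <- alpha_p(p, beta, theta[3], sqrt(Psi[3, 3]), s_lnHR)
  t_HR <- y3[, 3] / s_lnHR
  nll <- 0
  for (i in seq_len(nrow(y3))) {            # trivariate normal part (model (9))
    V   <- Psi + Sig[[i]]
    r   <- y3[i, ] - theta
    nll <- nll + 0.5 * (log(det(V)) + drop(t(r) %*% solve(V, r)))
  }
  nll <- nll - sum(pnorm(ap + beta * t_HR, log.p = TRUE)) +
         sum(log(b_fun(s_lnHR, ap, beta, theta[3], sqrt(Psi[3, 3]))))
  nll
}

start <- c(0, 0, 0, log(0.5), log(0.5), log(0.5), 0, 0, 0, log(1))
fit_p <- nlminb(start, negloglik_p, y3 = y3, Sig = Sig, s_lnHR = s_lnHR, p = 0.6)
```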

In practice, the true value of p is unknown. As a sensitivity analysis for reporting bias, the value of p should in general be taken within the range $(0,1)$ . It is recommended to specify a decreasing series for p, such as $p=0.9,0.8,\dots , 0.1$ , to thoroughly examine the changes in $\mathrm {SR}\hat {\mathrm O}\mathrm C(t)$ or $\mathrm {SA}\hat {\mathrm U}\mathrm C(t)$ . Specifying a large value, such as $p=0.9$ , implies that the published studies account for 90% of the population, so that only a small amount of reporting bias will be corrected. In contrast, specifying a small value, such as $p=0.1$ , implies that the published studies represent 10% of the population, so that possibly substantial reporting bias will be corrected by the proposed method. If the range of the probability of publication can be reasonably conjectured, it is preferable to specify a decreasing series for p within that narrower range; a minimal code sketch of such a sensitivity loop is given below. In the following section, we illustrate the practical implementation of the proposed method.
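As referenced above, a usage sketch of the sensitivity loop over p could look as follows, reusing the hypothetical objects and functions from the earlier sketches (negloglik_p(), sroc(), sauc()); the resulting curve of estimated SAUC(t) against p is the main output of the sensitivity analysis.

```r
# Usage sketch: vary the sensitivity parameter p and track SAUC(t).
p_grid <- seq(0.9, 0.1, by = -0.1)
sauc_by_p <- sapply(p_grid, function(p) {
  fit <- nlminb(start, negloglik_p, y3 = y3, Sig = Sig, s_lnHR = s_lnHR, p = p)
  theta <- fit$par[1:3]
  L <- diag(exp(fit$par[4:6])); L[lower.tri(L)] <- fit$par[7:9]
  Psi <- L %*% t(L)
  sauc(mu = theta[1:2], Omega = Psi[1:2, 1:2])   # SROC/SAUC use the (se, sp) block
})
plot(p_grid, sauc_by_p, type = "b",
     xlab = "Marginal selection probability p", ylab = "Estimated SAUC(t)")
```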

5 Application

We revisited the meta-analysis of Ki67Reference De Azambuja, Cardoso and De Castro 8 to evaluate the potential impact of reporting bias on $\mathrm {SR}\hat {\mathrm O}\mathrm C(t)$ and $\mathrm {SA}\hat {\mathrm U}\mathrm C(t)$ at $t=3$ and $t=5$ . As mentioned in Section 2, 38 studies reported the lnHRs for the overall survival outcome. Some potentially unpublished studies were detected for the lnHRs and the logit-transformed sensitivities by the funnel plots, as shown in Figure S1 in the Supplementary Material. At $t=3$ and $t=5$ , only 23 and 21 studies reported the KM estimates, respectively. To evaluate the impact of reporting bias on $\text {SROC}(t)$ and $\text {SAUC}(t)$ , we then adopted the proposed method. Considering that the number of population studies should be at least 38, the marginal selection probabilities $p=P(\text {select})$ were estimated to be no greater than $23/38\approx 0.61$ or $21/38\approx 0.55$ , respectively. Thus, we implemented the proposed sensitivity analysis for the estimates of $\text {SROC}(t)$ given $p=0.6, 0.4, 0.2$ . For $\text {SAUC}(t)$ , we conducted a thorough sensitivity analysis given $p=0.9, 0.8, \dots , 0.1$ . In the estimation, we used the absolute value of the t-statistic, $|{t_{\text {HR}}^{(i)}}|$ , in (11), indicating that the publication of studies was influenced by the two-tailed P-values of the lnHRs.
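Since this application drives selection by the absolute value of the t-statistic, the selection function sketch from Section 4.2 would be modified accordingly; a one-line illustrative variant is:

```r
# Two-sided variant of the probit selection function (11), driven by |t|,
# i.e., by the two-tailed P-value of the lnHR (illustrative sketch only).
a_fun_two_sided <- function(t_HR, alpha, beta) pnorm(alpha + beta * abs(t_HR))
```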

We regarded the estimates by the HZ model without taking into account reporting bias (when $p=1$ ) as the benchmark. The changes in $\mathrm {SR}\hat {\mathrm O}\mathrm C(t)$ at $t=3$ and $t=5$ are shown in Figures 3a and 3d, respectively, with the estimates of the corresponding parameters presented in Tables S4 and S5 in the Supplementary Material. Although the impact of reporting bias on $\mathrm {SR}\hat {\mathrm O}\mathrm C(t)$ was small, the estimated summary operating points (the diamond points) changed in different ways at $t=3$ and $t=5$ . At $t=3$ , with p decreasing to 0.2, the integrated sensitivity increased slightly, and the integrated specificity decreased (as $1-$ specificity increased). At $t=5$ , the integrated sensitivities were almost unchanged, while the integrated specificity decreased when p decreased to 0.2. These different changes in the integrated sensitivity and specificity indicated that, in this example, the selective publication determined by the significance of the lnHRs had more impact on the estimates of specificity than on those of sensitivity; thus, one should also be cautious about the inference on the integrated specificity.

Figure 3 The SRÔC(t) and SAÛC(t), and the probit selection function at $t=3, 5$ when $p=0.6, 0.4, 0.2$ in the Ki67 example. In panels (a) and (d), the circle points are the empirical $\mathrm {se}(x,t)$ and $1-\mathrm {sp}(x,t)$ pairs from the 23 prognosis studies; the diamond points are the estimated summary operating points, $\left (\mathrm {logit}^{-1} \hat{\mu} _{\mathrm {se}}, 1-\mathrm {logit}^{-1} \hat{\mu} _{\mathrm {sp}}\right )$ . Panels (b) and (e) show SAÛC(t) by the HZ model ( $p=1$ ) and the proposed method given $p=0.9, \dots , 0.1$ . In panels (c) and (f), the vertical lines at the top are the observed t-statistics from the 23 prognosis studies.

The HZ model estimated $\mathrm {SA}\hat {\mathrm U}\mathrm C(t)$ at $t=3$ and $t=5$ to be 0.649 (95% CI: 0.606, 0.690) and 0.646 (95% CI: 0.610, 0.680), respectively. Corresponding to the estimate of $\mathrm {SR}\hat {\mathrm O}\mathrm C(t)$ , when $p=0.6$ , that is, when about $(1-0.6)/0.6\times 23 \approx 15$ unpublished studies potentially existed, the estimated $\text {SAUC}(t)$ at $t=3$ and $t=5$ decreased to 0.638 (0.591, 0.682) and 0.632 (0.592, 0.677), respectively. In the worst case, when $p=0.1$ , $\mathrm {SA}\hat {\mathrm U}\mathrm C(t)$ at $t=3$ and $t=5$ decreased to 0.621 (0.526, 0.708) and 0.608 (0.525, 0.685), respectively. The changes in $\mathrm {SA}\hat {\mathrm U}\mathrm C(t)$ at $t=3$ and $t=5$ given $p=0.9, \dots , 0.1$ are shown in Figures 3b and 3e, respectively. Although $\text {SAUC}(t)$ decreased with decreasing p, the estimates of $\text {SAUC}(t)$ remained significantly different from 0.5. The estimated probit selection functions (equation 11) at $t=3$ and $t=5$ are presented in Figures 3c and 3f, respectively. The t-statistics of the published studies are shown as vertical lines, and most studies had high probabilities of being selected. The detailed estimates of $\text {SAUC}(t)$ in Figures 3b and 3e are presented in Table S6 in the Supplementary Material.

Although the prognostic capacity of Ki67 was not high, $\mathrm {SA}\hat {\mathrm U}\mathrm C(t)$ at $t=3$ and $t=5$ was estimated to be statistically significant by the HZ model. The sensitivity analysis supported the robustness of these estimates, indicating that $\mathrm {SA}\hat {\mathrm U}\mathrm C(t)$ was affected by reporting bias only to a small degree at the third ( $t=3$ ) and fifth ( $t=5$ ) years. With the sensitivity analysis, one can draw the robust conclusion that the prognostic capacity of the Ki67 antigen is not very high in patients with early breast cancer.

6 Simulation studies

Simulation studies were conducted to evaluate the performance of the proposed sensitivity analysis method. Following the data structure introduced in Section 3.1, we generated latent IPD for each prognosis study and used the observable data for meta-analysis. For each prognosis study i, we considered the following scenarios to generate the IPD.Reference Hattori and Zhou 15 We considered one moderate size of total subjects, with $n^{(i)}$ generated from the uniform distribution $U(50,150)$ , and one heterogeneous and large size, with $n^{(i)}\sim U(40, 300)$ . The failure time of each subject $\tilde T$ was generated from $\log \tilde T = 1+\epsilon $ , where $\epsilon $ followed the standard normal distribution. The potential censoring time $\tilde C$ was generated from the exponential distribution with hazard rate $\lambda = 0.2$ , denoted by $Exp(0.2)$ . Since the asymptotic variance–covariance matrices in equations (4) and (9) require the limiting variances of $\hat S_0(t)$ and $\hat S_1(t)$ , the distribution of $\tilde C$ had to be estimated. We assumed $\tilde C$ to follow an exponential distribution. Following the method of Hattori and Zhou,Reference Hattori and Zhou 15 we estimated $\lambda $ by using the empirical medians of the follow-up time. In the simulation studies, we were interested in estimating $\text {SAUC}(t)$ at $t=2$ , denoted by SAUC(2). Different distributions of biomarkers were considered for subjects with failure time $\tilde T\le 2$ and $\tilde T>2$ . Specifically, we considered five scenarios of the biomarker $\tilde X$ with normally distributed cutoff values having large or small variances; all the scenarios are summarized in Table 1.

With the IPD in study i, the sensitivity and specificity at $t=2$ were estimated according to equation (1), and $\hat \mu _{\text {lnHR}}^{(i)}$ was estimated by fitting the Cox model to the biomarker $\tilde X$. The asymptotic variance–covariance matrix in the trivariate model (7) was derived according to equations (S1), (S9), and (S10) in the Supplementary Material.
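As an illustration of the study-level lnHR computation, here is a minimal sketch using the R package survival, assuming the ipd data frame from the previous sketch; the time-dependent sensitivity and specificity of equation (1) are not reproduced here.

```r
library(survival)

# Cox model of the event time on the continuous biomarker x
fit <- coxph(Surv(time, status) ~ x, data = ipd)
mu_lnHR_i <- unname(coef(fit)["x"])          # study-specific lnHR estimate
se_lnHR_i <- sqrt(vcov(fit)["x", "x"])       # its standard error
c(lnHR = mu_lnHR_i, se = se_lnHR_i)
```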

For the meta-analysis, we considered settings in which 70% or 50% of the population studies were published, that is, the marginal selection probability $p=P(\text {select})$ was set to 0.7 or 0.5, respectively. We considered small, medium, and large numbers of published studies, namely, about 25, 50, and 140 or 150 published studies. In each selective publication process, we generated S population studies and then selected $N=S\times p$ studies according to the probit selection function (11). In the true probit selection function, $\beta $ was set to 5, and the values of $\alpha $ at $p=0.7$ or 0.5 were calculated from the definition of $P(\text {select})$ in equation (14). The corresponding values of $\alpha $ are presented in Table S7 in the Supplementary Material, and the detailed data-generating process is presented in Section S5.1 of the Supplementary Material. The publication process was repeated 1,000 times in each scenario.
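The selective publication step can be sketched as follows. The functional form below, $\Phi(\alpha+\beta t_i)$ with $t_i$ the t-statistic of the study-specific lnHR, is our reading of equation (11) and should be taken as an assumption; in addition, the sketch uses Bernoulli selection per study, whereas the actual simulation fixes exactly $N=S\times p$ published studies, and all numeric values are toy values.

```r
# Sketch of selecting published studies from S population studies, assuming a
# probit selection function of the form P(select | t_i) = pnorm(alpha + beta * t_i),
# where t_i = lnHR_i / SE(lnHR_i). Selection here is Bernoulli per study, whereas
# the simulation in the text selects exactly N = S * p studies.
set.seed(4)
S        <- 200
lnHR     <- rnorm(S, mean = 0, sd = 0.2)            # toy study-level lnHRs
se_lnHR  <- runif(S, 0.1, 0.3)                      # toy standard errors
t_stat   <- lnHR / se_lnHR
alpha    <- -2; beta <- 5                           # toy alpha; beta = 5 as in the text
prob_sel  <- pnorm(alpha + beta * t_stat)
published <- which(rbinom(S, 1, prob_sel) == 1)
length(published) / S                               # empirical marginal selection proportion
```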

In the absence of reporting bias, SAUC(2) was estimated by maximizing the likelihood of the HZ model based on the S population studies; these results are denoted by BNM $_P$ . SAUC(2) subject to reporting bias was estimated by the HZ model with the N published studies; these results are denoted by BNM $_O$ . The impact of reporting bias on SAUC(2) is indicated by the difference between the estimates of BNM $_P$ and BNM $_O$ . The proposed method was applied to the data of the N published studies, with results denoted by Proposed; the difference between the estimates of BNM $_P$ and Proposed indicates the bias of the proposed method. The value $p\approx N/S$ was used as the constraint when optimizing the parameters. In practice, p cannot be estimated from the published studies alone; in the simulation studies, however, $p\approx N/S$ could be computed because S was known, and this value was used as the specified value for evaluating the performance of the proposed method.

Table 2 Summary of estimates of SAUC(2) by the HZ model and the proposed method when censoring distribution is correctly specified and $p=0.7$

Note: B denotes the biomarker scenarios corresponding to Table 1; CR shows the convergence rate of the proposed method; estimates are summarized as median (first quartile, third quartile); RB denotes reporting bias; Bias denotes the bias of the proposed method.

Table 3 Summary of estimates of SAUC(2) by the HZ model and the proposed method when censoring distribution is correctly specified and $p=0.5$

Note: B denotes the biomarker scenarios corresponding to Table 1; CR shows the convergence rate of the proposed method; estimates are summarized as median (first quartile, third quartile); RB denotes reporting bias; Bias denotes the bias of the proposed method.

All statistical computing was conducted in R (R Development Core Team, Version 4.1.3). The ML estimates were obtained by Newton-type optimization using the R function nlminb(), and the lnHR was estimated by the Cox model using the R package survival [28]. The details of the optimization are presented in Section S5.2 of the Supplementary Material.
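The optimization pattern with nlminb() can be illustrated with a toy negative log-likelihood; the objective below is a simple normal-model stand-in, not the conditional likelihood of the proposed model.

```r
# Toy illustration of maximum likelihood via nlminb(): minimize a negative
# log-likelihood from user-supplied starting values; here the model is a
# normal distribution with unknown mean and log standard deviation.
set.seed(5)
y <- rnorm(50, mean = 1, sd = 2)
negloglik <- function(theta) -sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))
fit_opt <- nlminb(start = c(0, 0), objective = negloglik,
                  control = list(iter.max = 500, eval.max = 1000))
fit_opt$par          # ML estimates of (mean, log sd)
fit_opt$convergence  # 0 indicates successful convergence
```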

We first considered the situation in which the censoring distribution was correctly fitted by the exponential distribution, with the hazard rate estimated from the medians of the simulated follow-up times. The medians with the first and third quartiles of the estimated SAUC(2) in all scenarios when $p=0.7$ and $p=0.5$ are summarized in Tables 2 and 3, respectively. With the smaller $p=0.5$, reporting bias, shown as RB, increased in Table 3 compared with the case of $p=0.7$ in Table 2. The magnitude of reporting bias was greater in Biomarker Scenarios 3–6, indicating that the bias increased with increasing variances of the biomarkers. When the censoring distribution was correctly fitted in the estimation, the proposed method reduced reporting bias in all scenarios, especially when the number of studies (N) was large. When the number of patients was larger, both the bias of the proposed method and reporting bias decreased. In addition, we summarized the corresponding estimates of time-dependent sensitivity and specificity; the results are presented in Tables S8–S11 in the Supplementary Material. We observed larger reporting bias in the estimates of sensitivity than of specificity for Biomarker Scenarios 1–5, where smaller variances were set for the biomarkers given failure time $\tilde T\le 2$. Conversely, in Scenario 6, where larger variances were set for the biomarkers given failure time $\tilde T\le 2$, larger reporting bias was found in the estimates of specificity. Thus, the magnitude of reporting bias in the estimates of sensitivity or specificity varied with the data, and regardless of the scenario, the proposed method successfully reduced the bias in both sensitivity and specificity.

The convergence rates (CRs) of the proposed method are also presented; the CR was calculated as the proportion of repetitions, among 1,000, in which converged estimates were successfully obtained. Since the proposed method is based on the trivariate normal model and contains 11 parameters, it showed a comparatively low CR when the number of published studies was small. When the number of patients was large, reporting bias decreased, and the proposed method could still reduce reporting bias in the estimates.

We also considered situations in which the censoring distribution was misspecified in the proposed method, given $p=0.7$. We generated the true censoring distribution from the uniform distribution $U[1,4]$ and the lognormal distribution $LN(1.5\log 2, (\log 2)^2)$, while the proposed method still fitted the censoring distribution with an exponential distribution, that is, incorrectly. Under this misspecification, we evaluated the performance of the proposed method; the estimates are summarized in Tables S12 and S13 in the Supplementary Material. Even with the misspecified censoring distribution, the proposed method could reduce reporting bias, and the results agreed with those obtained when the censoring distribution was correctly fitted. However, the misspecification reduced the CRs of the estimation, especially when the number of studies was small or the number of patients was large.

Overall, the simulation studies showed that the proposed method could reduce reporting bias. However, since the proposed method needs to estimate many unknown parameters, unconverged estimates were sometimes obtained with nlminb(), especially when the number of published studies was small or the underlying censoring distribution was misfitted. With a sufficient number of published studies, the simulation studies showed that the proposed method attained a CR of around 70% or more when $\text {SAUC}(t)$ was better than moderate. In practice, we suggest trying several plausible sets of initial values, for example, values speculated to be close to the true values of some parameters.
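A minimal sketch of the multiple-starting-value strategy is given below, again using a toy objective as a stand-in for the proposed model's conditional likelihood.

```r
# Try several plausible initial values with nlminb() and keep the best converged fit.
set.seed(6)
y <- rnorm(50, mean = 1, sd = 2)
negloglik <- function(theta) -sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))

starts <- list(c(0, 0), c(1, 0.5), c(-1, 1))                     # candidate initial values
fits   <- lapply(starts, function(s) nlminb(start = s, objective = negloglik))
ok     <- Filter(function(f) f$convergence == 0, fits)           # keep converged fits only
best   <- ok[[which.min(vapply(ok, `[[`, numeric(1), "objective"))]]
best$par
```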

7 Discussion

Reporting bias is widely recognized as a major issue in various types of meta-analysis and may lead to over-optimistic conclusions. Thus, addressing reporting bias is a crucial part of meta-analysis and enhances the reliability of meta-analytical results. In recent decades, meta-analysis of prognosis studies has been gaining increasing interest in medical research. Riley et al. [29] discussed the importance of meta-analysis of prognosis studies and summarized several challenges specific to it. The choice of cutoff value is one of these issues: with a continuous biomarker, different choices of cutoff values lead to great heterogeneity in the meta-analysis and difficulty in interpreting the results [29]. However, in practice, almost all clinical reports of meta-analyses of prognostic studies simply apply standard meta-analysis techniques to aggregate the outcomes of the individual studies, such as the HRs, ignoring the heterogeneous cutoff values. The underdevelopment of relevant statistical methods may account for this. The time-dependent SROC curve has been proposed as a summary of prognostic capacity that is independent of cutoff values [14, 15]. Considering the wide acceptance of the SROC method in diagnostic meta-analysis, these $\text {SROC}(t)$-based methodologies deserve more emphasis and wider application in practice. Meanwhile, there is a growing need to develop corresponding methods addressing reporting bias in these meta-analytical methodologies. In general, our proposed method is expected to enhance the utility of the $\text {SROC}(t)$ method in evaluating the prognostic capacity of biomarkers and, moreover, to reduce possible reporting bias in the meta-analytical results. Our method is developed as a companion sensitivity analysis to the bivariate normal model of Hattori and Zhou [15], whose inference is much simpler than that of the method of Combescure et al. [14] and the bivariate binomial model of Hattori and Zhou [15]. Our proposal provides a means for researchers to draw more reliable conclusions about prognostic capacity from their meta-analyses.

While many methods have been proposed to deal with reporting bias in aggregated HRs, to the best of our knowledge, no methods exist for reporting bias in $\text {SROC}(t)$-based results. Our proposal is the first tool to evaluate the robustness of $\text {SROC}(t)$-based results against reporting bias. However, several limitations should be acknowledged. The proposed method relies on the trivariate normal distribution, which involves numerous parameters; nonconvergence may occur in some small meta-analyses, although the simulation studies indicated that meta-analyses with at least 25 studies achieved an acceptable convergence rate. Enhancing the estimation strategy remains an area for future improvement. On the other hand, our proposal employs the parametric probit selection function, which is theoretically sound [30] for modeling the selective publication process of prognostic studies and allows the derivation of a closed-form likelihood; however, the suitability of the probit model cannot be checked from the observed data. Recently, Zhou et al. [31] proposed non-parametric worst-case bounds to deal with reporting bias in SROC/SAUC estimation in diagnostic meta-analysis. Their method utilizes a class of nonparametric selection functions under the assumption that larger studies with possibly significant sensitivity or specificity are more likely to be published. The non-parametric worst-case bounds provide an alternative framework for addressing reporting bias and could potentially be adapted to the HZ model for prognostic meta-analysis in the future.

Our proposal puts forward an idea for modeling reporting bias in prognosis studies with time-to-event outcomes and may help the development of new statistical methods dealing with reporting bias in various estimates in meta-analysis of prognosis studies. The bivariate binomial model of Hattori and Zhou [15] is based on the exact likelihood for the numbers of surviving and failed subjects and may outperform the HZ model when the number of studies is small. Hattori and Zhou [32] developed a meta-analytic version of the concordance index for time-to-event outcomes. Our proposal is expected to be extensible to address reporting bias in these meta-analytical models.

Author contributions

Y.Z.: Conceptualization, formal analysis, methodology, writing - original draft, writing - review and editing. A.H.: Investigation, methodology, writing - review and editing. S.H.: Conceptualization, methodology, supervision, writing - review and editing.

Competing interest statement

The authors declare that no competing interests exist.

Data availability statement

R codes together with a sample application data set are available at https://github.com/meta2020/progmetasa-r.

Funding statement

This research was partly supported by Grant-in-Aid for Challenging Exploratory Research (16K12403) and for Scientific Research (16H06299, 18H03208) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/rsm.2025.14.

Footnotes

This article was awarded Open Data and Open Materials badges for transparent practices. See the Data availability statement for details.

References

1. Macaskill P, Gatsonis C, Deeks J, Harbord R, Takwoingi Y. Chapter 10: Analysing and presenting results. The Cochrane Collaboration. 2010. Accessed January 15, 2025. http://srdta.cochrane.org/.
2. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 2000;56(2):337–344.
3. Kamarudin AN, Cox T, Kolamunnage-Dona R. Time-dependent ROC curve analysis in medical research: current methods and applications. BMC Med. Res. Methodol. 2017;17(1):53.
4. Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat. Med. 2001;20(19):2865–2884.
5. Reitsma JB, Glas AS, Rutjes AW, Scholten RJ, Bossuyt PM, Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J. Clin. Epidemiol. 2005;58(10):982–990.
6. Chu H, Guo H, Zhou Y. Bivariate random effects meta-analysis of diagnostic studies using generalized linear mixed models. Med. Decis. Mak. 2010;30(4):499–508.
7. Look MP, van Putten WLJ, Duffy MJ, et al. Pooled analysis of prognostic impact of urokinase-type plasminogen activator and its inhibitor PAI-1 in 8377 breast cancer patients. JNCI J. Nat. Cancer Inst. 2002;94:116–128.
8. De Azambuja E, Cardoso F, De Castro G, et al. Ki-67 as prognostic marker in early breast cancer: a meta-analysis of published studies involving 12 155 patients. Br. J. Cancer 2007;96(10):1504–1513.
9. Callagy GM, Webber MJ, Pharoah PD, Caldas C. Meta-analysis confirms BCL2 is an independent prognostic marker in breast cancer. BMC Cancer 2008;8(1):153.
10. Na F, Wang J, Li C, Deng L, Xue J, Lu Y. Primary tumor standardized uptake value measured on F18-fluorodeoxyglucose positron emission tomography is of prediction value for survival and local control in non-small-cell lung cancer receiving radiotherapy: meta-analysis. J. Thorac. Oncol. 2014;9(6):834–842.
11. Pak K, Cheon GJ, Nam HY, et al. Prognostic value of metabolic tumor volume and total lesion glycolysis in head and neck cancer: a systematic review and meta-analysis. J. Nucl. Med. 2014;55(6):884–890.
12. Riley RD, Elia EG, Malin G, Hemming K, Price MP. Multivariate meta-analysis of prognostic factor studies with multiple cut-points and/or methods of measurement. Stat. Med. 2015;34(17):2481–2496.
13. Sadashima E, Hattori S, Takahashi K. Meta-analysis of prognostic studies for a biomarker with a study-specific cutoff value. Res. Synth. Methods 2016;7(4):402–419.
14. Combescure C, Daures J, Foucher Y. A literature-based approach to evaluate the predictive capacity of a marker using time-dependent summary receiver operating characteristics. Stat. Methods Med. Res. 2016;25(2):674–685.
15. Hattori S, Zhou XH. Time-dependent summary receiver operating characteristics for meta-analysis of prognostic studies. Stat. Med. 2016;35(26):4746–4763.
16. Sutton AJ, Song F, Gilbody SM, Abrams KR. Modelling publication bias in meta-analysis: a review. Stat. Methods Med. Res. 2000;9(5):421–445.
17. Copas JB, Shi JQ. Meta-analysis, funnel plots and sensitivity analysis. Biostatistics 2000;1(3):247–262.
18. Copas JB, Shi JQ. A sensitivity analysis for publication bias in systematic reviews. Stat. Methods Med. Res. 2001;10(4):251–265.
19. Copas JB. A likelihood-based sensitivity analysis for publication bias in meta-analysis. J. R. Stat. Soc. Ser. C Appl. Stat. 2013;62(1):47–66.
20. Copas JB, Jackson D. A bound for publication bias based on the fraction of unpublished studies. Biometrics 2004;60(1):146–153.
21. Hattori S, Zhou XH. Sensitivity analysis for publication bias in meta-analysis of diagnostic studies for a continuous biomarker. Stat. Med. 2018;37(3):327–342.
22. Piao J, Liu Y, Chen Y, Ning J. Copas-like selection model to correct publication bias in systematic review of diagnostic test studies. Stat. Methods Med. Res. 2019;28(10–11):2912–2923.
23. Li M, Fan Y, Liu Y, Liu Y. Diagnostic test meta-analysis by empirical likelihood under a Copas-like selection model. Metrika 2021;84(6):927–947.
24. Zhou Y, Huang A, Hattori S. A likelihood-based sensitivity analysis for publication bias on the summary receiver operating characteristic in meta-analysis of diagnostic test accuracy. Stat. Med. 2023;42(6):781–798.
25. Boutron I, Page MJ, Higgins JP, Altman DG, Lundh A, Hróbjartsson A. Considering risk of bias due to missing results. Cochrane. 2024. Accessed January 15, 2025. https://training.cochrane.org/handbook/current/chapter-07#section-7-7.
26. Cox DR. Regression models and life-tables. J. R. Stat. Soc. Ser. B 1972;34(2):187–202.
27. Doebler P, Holling H, Böhning D. A mixed model approach to meta-analysis of diagnostic studies with binary test outcome. Psychol. Methods 2012;17(3):418–436.
28. Therneau T. A package for survival analysis in R. R package version 3.7-0. 2024.
29. Riley RD, Moons KGM, Snell KIE, et al. A guide to systematic review and meta-analysis of prognostic factor studies. BMJ 2019;364:k4597.
30. Hedges LV. [Selection models and the file drawer problem]: comment. Stat. Sci. 1988;3(1):409–435.
31. Zhou Y, Huang A, Hattori S. Nonparametric worst-case bounds for publication bias on the summary receiver operating characteristic curve. Biometrics 2024;80(3):ujae080.
32. Hattori S, Zhou XH. Summary concordance index for meta-analysis of prognosis studies with a survival outcome. Stat. Med. 2021;40(24):5218–5236.

Table 1 Scenarios of distributions of biomarker and cutoff values used in simulation studies. e follows the standard logistic distribution


Figure 1 The funnel plot and the trim-and-fill method for detecting reporting bias in the meta-analysis of Ki67. The vertical black dashed lines are the integrated lnHRs without considering reporting bias. The central axes of the funnel plots are the adjusted lnHRs. The open circle points are the filled (imputed) unpublished studies. The red and black points are the published studies, categorized by the P-values of their lnHRs. P for the open circle points in the legends indicates the estimated P-values of the imputed lnHRs of the filled studies.


Figure 2 The funnel plot and the trim-and-fill method for detecting reporting bias in the meta-analysis of Ki67 based on the 23 studies reporting the KM curves. The vertical black dashed lines are the integrated lnHRs without considering reporting bias. The central axes of the funnel plots are the adjusted lnHRs. The open circle points are the filled (imputed) unpublished studies. The red and black points are the published studies, categorized by the P-values of their lnHRs. P for the open circle points in the legends indicates the estimated P-values of the imputed lnHRs of the filled studies.


Figure 3 The SRÔC(t) and SAÛC(t), and the probit selection function at $t=3, 5$ when $p=0.6, 0.4, 0.2$ in the Ki-67 example. In panels (a) and (d), the circle points are the empirical $\mathrm {se}(x,t)$ and $1-\mathrm {sp}(x,t)$ pairs from the 23 prognosis studies; the diamond points are the estimated summary operating points, $\left (\mathrm {logit}^{-1} \hat{\mu} _{\mathrm {se}}, 1-\mathrm {logit}^{-1} \hat{\mu} _{\mathrm {sp}}\right )$. Panels (b) and (e) show SAÛC(t) by the HZ model ($p=1$) and the proposed method given $p=0.9, \dots , 0.1$. In panels (c) and (f), the vertical lines at the top are the observed t-statistics from the 23 prognosis studies.

