Introduction
Bipolar disorder (BD) is characterized by fluctuations in mood state and is a leading cause of disability due to its cognitive and functional impact [Reference Grande, Berk, Birmaher and Vieta1]. Sex differences in BD have been reported in clinical outcomes, with BD-I showing equal prevalence between sexes and BD-II being more common in females [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2–Reference Diflorio and Jones4]. Females are at higher risk of depression, rapid cycling, hypomania, and a seasonal pattern [Reference Curtis3, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5–Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7] whereas males more frequently experience manic episodes and substance abuse [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5, Reference Suwalska and Łojko6, Reference Messer, Lammers, Müller-Siecheneder, Schmidt and Latifi8].
Besides clinical outcomes, differences in neurocognition (NC) between males and females have been found. These differences are mostly in line with those detected in control participants: in the general population, females have been reported to outperform males in verbal and facial memory, whereas males outperform females in spatial and motor processing [Reference Gur and Gur9, Reference Mendrek and Mancini-Marïe10]. Similarly, females with BD performed better in verbal learning and memory than males [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5, Reference Gogos, Joshua, Rossell, Fellow and Alfred11]. Moreover, Carrus et al. [Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5] reported worse immediate memory in males with BD compared with control males and did not observe the same pattern in females. Furthermore, males with BD outperformed females with BD in attention and working memory [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7, Reference Barrett, Kelly, Bell and King12]. Regarding processing speed, Solé et al. [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2] reported no differences between sexes, but Gogos et al. [Reference Gogos, Joshua, Rossell, Fellow and Alfred11] found better performance in female patients. Similarly, in semantic fluency, females with BD outperformed males [Reference Gogos, Joshua, Rossell, Fellow and Alfred11], although other studies found no differences [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7]. The data in Vaskinn et al. [Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13] and Gogos et al. [Reference Gogos, Joshua, Rossell, Fellow and Alfred11] suggest poorer NC performance in males compared to females, but the findings remain inconclusive. The discrepancies in the results could be explained by the different tests used to assess NC, small sample sizes, and differing clinical and sociodemographic characteristics between studies.
Deficits in NC have been associated with poor psychosocial functioning [Reference Vieta, Berk, Schulze, Carvalho, Suppes and Calabrese14], with verbal memory and executive function being the main predictors [Reference Bonnín, Martínez-Arán, Torrent, Pacchiarotti, Rosa and Franco15, Reference Baune and Malhi16]. Most studies have shown a better functioning profile in females than in males [Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13, Reference Sanchez-Moreno, Bonnin, González-Pinto, Amann, Solé and Balanzá-Martinez17]. In contrast, Solé et al. [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2] found no differences between sexes.
Nonetheless, results remain inconclusive, as mixed findings have been reported. As such, we conducted the present systematic review and meta-analysis to better understand these discrepancies. Understanding sex differences in cognitive functioning and functional outcomes in BD is critical for advancing both scientific knowledge and clinical practice. Characterizing how these outcomes differ between males and females could enable the development of personalized interventions for this population. By tailoring interventions to address sex-specific needs, clinicians could improve both cognitive and functional outcomes, ultimately reducing the burden of the disorder on individuals and their families. To the best of our knowledge, no other study has systematically reviewed the literature exploring sex differences in psychosocial functioning and NC in BD. Specifically, the aim of the present study was to conduct a systematic review and meta-analysis to examine whether males and females with BD differ in NC performance and psychosocial functioning. Two main hypotheses were formulated: differences will be found between males and females in (1) cognitive performance and (2) psychosocial functioning.
Methods
The present systematic review and meta-analysis were conducted following the PRISMA guidelines [Reference Page, Mckenzie, Bossuyt, Boutron, Hoffmann and Mulrow18] and had a registered protocol (PROSPERO-ID: CRD42022369013). The PRISMA checklist is reported in Supplementary Materials – Appendix 1.
Selection criteria
Eligibility criteria were based on the Population, Intervention, Comparison, Outcome (PICO) framework. The following inclusion criteria were used: (1) original articles published in a peer-reviewed journal; (2) including people with BD, according to any edition of the Diagnostic and Statistical Manual for Mental Disorders (DSM) [19–21], the International Classification of Diseases (ICD) [22], or the Research Diagnostic Criteria (RDC) [Reference Spitzer23]; (3) assessing and providing measures of global functioning or psychosocial functioning, self-rated or clinician-rated, or NC using validated measurement tools; and (4) comparing participants based on sex (i.e., females and males). Both observational (cross-sectional and longitudinal) and intervention studies were eligible for inclusion, but only baseline data were considered in the case of longitudinal and intervention studies. No language or age restrictions were applied. Studies were excluded if they were (1) reviews, (2) meta-analyses, (3) case reports, or (4) case series.
Search strategy
The Cochrane Library, EMBASE, PsycINFO, PubMed, Scopus, and Web of Science databases were systematically searched from inception until November 20, 2023 (search strings are available in Supplementary Materials – Appendix 2). The backward snowballing technique was used to identify any additional papers not found in the original search.
Procedure and data extraction
All retrieved studies were screened by title and abstract based on the previously defined inclusion and exclusion criteria and irrelevant studies were excluded. The remaining articles were then reviewed and examined at the full-text level.
Data extraction, when available, included: first author, year of publication, geographical region and country, study design, diagnostic criteria, diagnostic interview administered, study setting, total number of cases and controls (i.e., females and males), validated measurement tools used to assess outcomes, cognitive functioning measurement (specific cognitive domains evaluated, neuropsychological assessment implemented), psychosocial functioning measurement (functional evaluation and domains), type of outcome, mean and standard deviation (SD) of outcomes for females and males, mean age and SD of females and males, mean and SD of duration of BD illness for females and males, mean and SD of age of BD onset for females and males, % of BD-I among females and males, % of females and males with euthymic, depressed, hypomanic, manic, and mixed episodes, mean and SD of the number of total, depressive, and (hypo)manic episodes among females and males, % of females and males prescribed psychotropic medication, psychiatric and/or medical comorbidities in females and males, instrument used to measure depressive and (hypo)manic symptoms, and mean scores and SD obtained on symptom severity scales for females and males. If the data were not fully available in the published article, the corresponding authors were contacted up to two times to request the necessary data.
Specifically, to standardize the categorization of cognitive tests into cognitive domains, we based our approach on the International Society for Bipolar Disorders – Battery for Assessment of Neurocognition (ISBD-BANC) [Reference Yatham, Torres, Malhi, Frangou, Glahn and Bearden24]. An overall cognitive functioning category was added to capture general cognitive performance, reflecting global cognitive ability rather than isolated domains.
1) Attention/vigilance: RBANS attention/vigilance subtest – digit span and coding task [Reference Randolph, Tierney, Mohr and Chase25]; Wechsler Adult Intelligence Scale (WAIS-III) digit span subtest [Reference Wechsler26]; the Conners Continuous Performance Test (CPT-II) [Reference Conners and Sitarenios27]; Trail Making Test Form A [Reference Reitan28].
2) Processing speed: Delis–Kaplan Executive Function System (D-KEFS) [Reference Fine and Delis29] psychomotor speed Trail Making subtest, a modification of the classic test designed to isolate the psychomotor component [Reference Swanson30]; the Screen for Cognitive Impairment in Psychiatry (SCIP) processing speed subtest [Reference Schmid, Czekaj, Frick, Steinert, Purdon and Uhlmann31]; WAIS-III processing speed [Reference Wechsler26].
3) Executive/Working memory: Cambridge Neuropsychological Test Automated Battery (CANTAB) Spatial Working Memory Task (SWM) Strategy [Reference HLS, Rocha, Sabbá, Tomás, NVO and Anthony32]; D-KEFS executive functioning subtest [Reference Fine and Delis29]; Stockings of Cambridge (SOC) planning and problem-solving [Reference HLS, Rocha, Sabbá, Tomás, NVO and Anthony32]; N-back; Stroop – word and color test [Reference Golden and Freshwater33]; Wechsler Memory Scale (WMS-III) working memory sub-scale [Reference Wechsler26]; SCIP working memory subtest [Reference Schmid, Czekaj, Frick, Steinert, Purdon and Uhlmann31].
4) Verbal learning and memory: RBANS delayed verbal memory subtest [Reference Randolph, Tierney, Mohr and Chase25]; California Verbal Learning Test (CVLT-II) [Reference Dumont and Willis34] recall Trials 1–5; D-KEFS memory subtest [Reference Fine and Delis29]; RBANS – list and story learning subtest [Reference Randolph, Tierney, Mohr and Chase25]; WMS-III auditory delayed subtest [Reference Wechsler26]; SCIP delayed verbal learning subtest [Reference Schmid, Czekaj, Frick, Steinert, Purdon and Uhlmann31].
5) Visual learning and memory: RBANS figure recall subtest, visuo-spatial memory Spatial Recognition Memory (SRM) [Reference Randolph, Tierney, Mohr and Chase25]; RBANS – figure copy and line orientation task [Reference Randolph, Tierney, Mohr and Chase25]; WMS-III visual delayed [Reference Wechsler26]; Rey–Osterrieth complex figure (ROCF) copy and recall [Reference Osterrieth35].
6) Social cognition: face auditory ID; Pictures of Facial Affect (POFA) [Reference Ekman and Friesen36].
7) Language: RBANS – picture naming and semantic fluency tasks [Reference Randolph, Tierney, Mohr and Chase25].
8) Intelligence: Wechsler Abbreviated Scale of Intelligence (WASI) [Reference Wechsler37] and Wechsler Adult Intelligence Scale (WAIS-III) [Reference Wechsler26] full-scale IQ.
9) Overall cognitive functioning: RBANS [Reference Randolph, Tierney, Mohr and Chase25], D-KEFS [Reference Fine and Delis29], and SCIP [Reference Schmid, Czekaj, Frick, Steinert, Purdon and Uhlmann31] total scores.
When multiple cognitive measures were reported within a domain, the following strategies were applied to ensure consistency and comparability: (1) aggregation: if multiple measures originated from the same scale but no composite or total score was provided, aggregated scores were calculated as weighted averages of the raw scores, with weights based on sample sizes; and (2) selection: if multiple different measures were reported, the most suitable measure was selected based on its relevance, frequency of use in the literature, and comparability to other included studies.
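The aggregation strategy can be illustrated with a short sketch. The exact weighting formula is not given in the text, so the version below assumes that each measure's weight is simply the number of participants contributing to it; the function name and the example values are hypothetical.

```python
def weighted_mean(scores, sample_sizes):
    """Sample-size-weighted average of raw scores from the same scale.

    Assumption: the weight of each measure is the number of participants
    who completed it (the source only says "weights based on sample sizes").
    """
    if len(scores) != len(sample_sizes) or not scores:
        raise ValueError("scores and sample_sizes must be non-empty and equal in length")
    total_n = sum(sample_sizes)
    if total_n <= 0:
        raise ValueError("sample sizes must sum to a positive number")
    return sum(s * n for s, n in zip(scores, sample_sizes)) / total_n


# Hypothetical example: two subtests of one scale, reported without a composite score.
subtest_means = [10.0, 12.0]
subtest_ns = [30, 10]
aggregate = weighted_mean(subtest_means, subtest_ns)
print(f"Aggregated score: {aggregate:.2f}")
```

With equal sample sizes this reduces to a plain arithmetic mean, so the weighting only matters when subtests were completed by different numbers of participants.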
Three authors (MSN, DC, CV) independently conducted all described stages. When a consensus was not reached, discrepancies were resolved in a consensus meeting with two fellow authors (SA, CT).
Quality appraisal
The risk of bias was assessed independently by three authors (MSN, DC, CV), and disagreements were resolved by involving two senior authors (SA, CT). The Newcastle–Ottawa Scale (NOS) [Reference Stang38] was used, and the scores obtained were converted according to the “Agency for Healthcare Research and Quality” (AHRQ) standards, as done in Oliva et al. [Reference Oliva, De Prisco, Fico, Possidente, Fortea and Montejo39].
Statistical analyses
Statistical analyses were conducted using R version 4.1.2 (R Core Team, 2020), and separate meta-analyses for each outcome were performed via the metafor R package [Reference Viechtbauer and Viecthbauer40] using a random-effects model (restricted maximum-likelihood estimator) [Reference Harville41]. Standardized mean differences (SMD), expressed as Hedges’ g with 95% confidence intervals (CI), were used as effect sizes. Cochran’s Q [Reference Cochran42], τ², and I² were used to assess heterogeneity. Prediction intervals were also estimated [Reference Borenstein, Higgins, Hedges and Rothstein43]. If high heterogeneity was detected (Cochran’s Q p-value <0.10 or I² >50%), meta-regressions were conducted according to predefined predictors, including the mean age of females and males, the mean severity of depressive and (hypo)manic symptoms for females and males, and the percentage of females and males in treatment with psychotropic drugs, such as antidepressants, antipsychotics, lithium, or mood stabilizers. A leave-one-out sensitivity analysis, excluding one study at a time from the main analysis, was used to investigate each study’s influence on the overall effect size estimation. Publication bias was examined via funnel plots and Egger’s test [Reference Egger, Smith, Schneider and Minder44] when at least 10 studies were available.
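To make the effect size and heterogeneity statistics concrete, the following is a minimal Python sketch, not the analysis code used in this study (which relied on metafor in R with a restricted maximum-likelihood estimator). As a simplification, it swaps in the closed-form DerSimonian-Laird estimator for τ², so its numbers would differ somewhat from REML output. The function names, the sign convention (females minus males), and all input values are assumptions made for illustration.

```python
import math


def hedges_g(mean_f, sd_f, n_f, mean_m, sd_m, n_m):
    """Return Hedges' g (females minus males, an assumed convention) and its variance."""
    df = n_f + n_m - 2
    pooled_sd = math.sqrt(((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / df)
    d = (mean_f - mean_m) / pooled_sd
    j = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample correction factor
    var_d = (n_f + n_m) / (n_f * n_m) + d**2 / (2.0 * (n_f + n_m))
    return j * d, j**2 * var_d


def pool_random_effects(effects, variances):
    """Pool effects with a DerSimonian-Laird random-effects model (stand-in for REML)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    est = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"smd": est, "ci": (est - 1.96 * se, est + 1.96 * se),
            "Q": q, "tau2": tau2, "I2": i2}


# Hypothetical per-study summaries: (mean_f, sd_f, n_f, mean_m, sd_m, n_m).
studies = [(52.0, 9.0, 40, 48.0, 10.0, 35),
           (50.0, 8.0, 60, 49.0, 9.0, 50),
           (55.0, 11.0, 25, 50.0, 10.0, 30)]
gs, vs = zip(*(hedges_g(*s) for s in studies))
print(pool_random_effects(list(gs), list(vs)))
```

The returned I² can be compared against the 50% threshold described above to decide whether a meta-regression would be triggered.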
Results
The overall study selection process is shown in the PRISMA flowchart in Figure 1. A total of 13,073 articles were identified via a systematic search through electronic databases. Of these, 1798 duplicates were identified and removed, and 11,275 articles underwent title and abstract screening. After the exclusion of 11,238 irrelevant articles, 37 reports underwent full-text evaluation, and a total of 19 were excluded. As such, 18 studies were included in this systematic review [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5–Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7, Reference Gogos, Joshua, Rossell, Fellow and Alfred11, Reference Barrett, Kelly, Bell and King12, Reference Dittmann, Seemüller, Schwarz, Kleindienst, Stampfer and Zach45–Reference Blanken, Oudega, Almeida, Schouws, Orhan and Beunders53] and 17 [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5, Reference Suwalska and Łojko6, Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7, Reference Gogos, Joshua, Rossell, Fellow and Alfred11–Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13, Reference Gogos, Son, Rossell, Karantonis, Furlong and Felmingham46–Reference Navarra-Ventura, Vicent-Gil, Serra-Blasco, Massons, Crosas and Cobo48, Reference Sanchez-Autet, Arranz, Safont, Sierra, Garcia-Blanco and de la Fuente49, Reference Robb, Young, Cooke and Joffe50, Reference Blanken, Oudega, Almeida, Schouws, Orhan and Beunders53–Reference Xu, Xiang, Qiu, Teng, Li and Huang57] were included in the meta-analysis. A list of excluded studies with reasons for exclusion is available in Supplementary Materials – Appendix 3.

Figure 1. PRISMA flowchart, 2020 edition, adapted.
Morgan et al. [Reference Morgan, Mitchell and Jablensky51] was included in the systematic review due to its examination of sex-based differences in functioning among individuals with BD. However, the data were reported as percentages, rather than the continuous variables (means and standard deviations) required for our meta-analytic synthesis. Consequently, this study could not be integrated into the meta-analysis, as it lacked the necessary statistical measures for effect size estimation.
Study characteristics
Table 1 summarizes the relevant characteristics of the 20 included studies. The studies were published between 2005 and 2023 and included a total of 2286 patients with BD, of whom 1368 (59.8%) were females and 918 (40.2%) were males. The mean age of female participants was 41.5 (SD = 9.7), and the mean age of male participants was 41 (SD = 10). Nineteen of the included studies were cross-sectional [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5, Reference Navarra-Ventura, Vicent-Gil, Serra-Blasco, Massons, Crosas and Cobo48, Reference Sanchez-Autet, Arranz, Safont, Sierra, Garcia-Blanco and de la Fuente49, Reference Morgan, Mitchell and Jablensky51–Reference Xu, Xiang, Qiu, Teng, Li and Huang57, Reference Suwalska and Łojko6, Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7, Reference Gogos, Joshua, Rossell, Fellow and Alfred11–Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13, Reference Dittmann, Seemüller, Schwarz, Kleindienst, Stampfer and Zach45–Reference Mueser, Pratt, Bartels, Forester, Wolfe and Cather47] and one was prospective [Reference Robb, Young, Cooke and Joffe50].
Table 1. Characteristics of included studies

Abbreviations: BD, Bipolar disorder; HC, Healthy controls; SCZ, Schizophrenia; SA, Schizoaffective disorder; NC, Neurocognition; MD, Major depression; FAST, Functioning Assessment Short Test; GAF, Global Assessment of Functioning; MOS, Medical Outcome Survey; POFA, Pictures of Facial Affect; RMET, Reading the Mind in the Eyes Test; SFS, Social Functioning Scale; CVLT-II, California Verbal Learning Test II; D-KEFS, Delis–Kaplan Executive Function System; WAIS, Wechsler Adult Intelligence Scale; SCWT, Stroop Color and Word Test; TMT, Trail Making Test; RBANS, Repeatable Battery for the Assessment of Neuropsychological Status; COWAT, Controlled Oral Word Association Test; CPT-II, Continuous Performance Test-II; WMS-III, Logical Memory subtest of the Wechsler Memory Scale-III; ROCF, Rey–Osterrieth Complex Figure; WCST, Wisconsin Card Sorting Test; CANTAB, Cambridge Neuropsychological Test Automated Battery; SRM, Spatial Recognition Memory; PAL, Paired Associates Learning; SOC, Stockings of Cambridge; ID/ED, Intradimensional/Extradimensional attentional set shifting; TONI-3, Test of Nonverbal Intelligence-3; SAS, Social Adjustment Scale; LNST, Letter-Number Sequencing Test; HVLT-R, Hopkins Verbal Learning Test-Revised; BVMT-R, Brief Visuospatial Memory Test-Revised; SOFAS, Social and Occupational Functioning Assessment Scale; FSBD, Functionality Scale in Bipolar Disorder.
The overall quality of the included studies was good. The average quality rating of the included studies was 7.2 (SD = 1.4; range = 5–9) (see the agreed quality grades of each study in Table 1 and a report of each general score in the Supplementary material – Appendix 4).
Main analyses
The main results of the meta-analyses are reported in Table 2 and Figure 2. Significant differences were found in verbal learning and memory (SMD = 0.313; 95% CI = 0.135–0.49; p <0.001) and visual learning and memory (SMD = 0.263; 95% CI = 0.014–0.513; p = 0.039), with females outperforming males in both domains. No significant differences were found between females and males in either psychosocial functioning or any other NC outcome. Forest plots are reported in the Supplementary Materials – Appendix 5.
Table 2. Results of the meta-analyses in detail

Abbreviations: CIs – Confidence Intervals; I² – Higgins and Thompson’s I² estimate of the total heterogeneity; PIs – Prediction Intervals; Qp – p-value for the Cochran’s Q-test of (residual) heterogeneity; SMD – Standardized mean difference; tau² – between-study variance.
Note: Significant results are depicted in bold.

Figure 2. Differences in neurocognition and functioning between females (right) and males (left). Point size is proportional to the number of patients included in that specific comparison.
Meta-regression analyses
When comparing females and males with BD, none of the predefined predictors were significantly associated with the outcomes that were significant in the main analysis. Other results of meta-regressions can be consulted in Supplementary Materials – Appendix 6.
Sensitivity analysis
The following comparisons changed significance in the leave-one-out sensitivity analysis: (i) attention/vigilance became significant when Vaskinn et al. [Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13] was removed; (ii) overall cognitive functioning became significant when Mueser et al. [Reference Mueser, Pratt, Bartels, Forester, Wolfe and Cather47] was removed; (iii) visual learning and memory became non-significant when any one of the following studies was removed: Gogos et al. [Reference Gogos, Joshua, Rossell, Fellow and Alfred11], Tournikioti et al. [Reference Tournikioti, Ferentinos, Michopoulos, Dikeos, Soldatos and Douzenis54], Xu et al. [Reference Xu, Xiang, Qiu, Teng, Li and Huang57], Carrus et al. [Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5], and Gogos et al. [Reference Gogos, Son, Rossell, Karantonis, Furlong and Felmingham46]. Additional details on the sensitivity analyses are presented in the Supplementary Materials – Appendix 7.
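The logic of a leave-one-out analysis can be sketched as follows. This is an illustrative Python version under stated assumptions rather than the procedure run for this study: it pools with the DerSimonian-Laird estimator instead of REML, treats a result as significant when the 95% CI excludes zero, and uses hypothetical study labels, effect sizes, and variances.

```python
import math


def pool_dl(effects, variances):
    """DerSimonian-Laird random-effects estimate with a 95% CI (stand-in for REML)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    est = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    half = 1.96 * math.sqrt(1.0 / sum(w_star))
    return est, est - half, est + half


def leave_one_out(labels, effects, variances):
    """Re-pool the effect once per study, each time omitting that study."""
    effects, variances = list(effects), list(variances)
    if len(effects) < 2:
        raise ValueError("leave-one-out needs at least two studies")
    rows = []
    for i, label in enumerate(labels):
        est, lo, hi = pool_dl(effects[:i] + effects[i + 1:],
                              variances[:i] + variances[i + 1:])
        rows.append({"omitted": label, "smd": est, "ci": (lo, hi),
                     "significant": lo > 0 or hi < 0})  # CI excludes zero
    return rows


# Hypothetical inputs: study labels, Hedges' g values, and sampling variances.
labels = ["Study A", "Study B", "Study C", "Study D"]
g_values = [0.45, 0.10, 0.30, 0.25]
g_variances = [0.04, 0.02, 0.05, 0.03]
for row in leave_one_out(labels, g_values, g_variances):
    print(row["omitted"], round(row["smd"], 3), row["significant"])
```

A domain whose significance flips for most omitted studies, as reported here for visual learning and memory, indicates a pooled estimate that rests on a small and fragile evidence base.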
Publication bias
There was no evidence of publication bias (Supplementary Materials – Appendix 8).
Discussion
To the best of our knowledge, this is the first systematic review and meta-analysis investigating sex differences in NC and psychosocial functioning in people diagnosed with BD. Two core results were found. First, significant sex differences were identified in verbal and visual memory and learning, with females performing better than males. Second, no significant sex differences were found in psychosocial functioning, although females performed better in two cognitive domains. Overall, results are of clinical importance as specific NC sex differences could be addressed to reduce impairment in patients with BD. Conversely, results suggest that psychosocial functioning may not require a specific intervention based on sex.
Regarding NC, significant sex differences were found, with females performing better than males in verbal and visual memory and learning. Our findings are in line with previous studies that found sex differences in NC [Reference Gur and Gur9, Reference Mendrek and Mancini-Marïe10], as well as in other psychiatric populations [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5, Reference Gogos, Son, Rossell, Karantonis, Furlong and Felmingham46]. Nevertheless, these results do not allow causal inferences as to why these differences are observed. One potential explanation is that these specific sex differences are not unique to the context of mental illness, as they are also present in controls without mental illness [Reference Zhang and Swaab58]. Furthermore, a specific cognitive impairment can be present between patients and controls of one sex (e.g., males with BD vs. male healthy controls) and not be present in the opposite sex [Reference Zhang and Swaab58]. As such, we cannot conclude that the observed differences are unique to clinical populations, as they may have been present prior to illness onset or may even be due to sexual dimorphisms in brain structure [Reference Goldstein59]. In this context, we argue that studies including neuroimaging data could be important for clarifying the role of brain anatomy and function. This may also include studies comparing the general population, high-risk populations, and people with BD at different illness stages. Further, the observed sex differences were investigated via meta-regressions using female and male age as predictor variables. While no significant associations were found, three important factors must be considered. First, a higher number of females were included in the analyses. Second, heterogeneity in the measurement of cognitive domains may also explain the lack of consistency in results regarding sex differences. Third, the majority of comparisons included a very low number of studies, which may also have impacted these findings. Accordingly, we suggest that future research adopt a more homogeneous approach to measuring NC in samples that are more balanced in terms of sex, to better understand the complexity of sex differences in NC in BD.
Furthermore, the sensitivity analyses provided greater insight into the significant results. Interestingly, for the visual learning and memory domain, where performance was significantly better in females, only the exclusion of Solé et al. [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2] did not change the significance of the overall result. In contrast, excluding any of the other five studies rendered the result non-significant. Various factors could contribute to this finding. First, the sample size varies across studies [Reference De Prisco and Vieta60]; Solé et al. [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2] has the largest sample (n = 347) of euthymic patients with BD. Second, sample characteristics are heterogeneous, with some studies including only euthymic patients [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2], others symptomatic patients [Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5, Reference Xu, Xiang, Qiu, Teng, Li and Huang57], and the remainder a mixture of both [Reference Gogos, Son, Rossell, Karantonis, Furlong and Felmingham46, Reference Tournikioti, Ferentinos, Michopoulos, Dikeos, Soldatos and Douzenis54]. Mood state might be a major contributing factor to the differences across studies, as cognitive function tends to stabilize during euthymic phases, potentially leading to different results compared to studies with symptomatic patients. However, meta-regression analyses based on symptom severity did not change the overall results, suggesting that symptomatology alone is unlikely to explain the observed differences. Third, the illness stage also varied: for example, Xu et al. [Reference Xu, Xiang, Qiu, Teng, Li and Huang57] focused on the early stage of the disease, whereas Gogos et al. [Reference Gogos, Joshua, Rossell, Fellow and Alfred11] recruited chronic patients. Moreover, Gogos et al. [Reference Gogos, Son, Rossell, Karantonis, Furlong and Felmingham46] reported that their sample varied in terms of family history of BD, rapid cycling, and comorbid anxiety disorder and substance use issues. Accordingly, the varied sample sizes and characteristics may play a significant role in the changes observed in the sensitivity analysis. Fourth, it is crucial to consider the role of medication, as research has shown that it can have an impact on cognitive performance. Patients included in the present analysis were prescribed different patterns of medication (monotherapy vs. polypharmacy); some studies included patients prescribed various medications [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Carrus, Christodoulou, Hadjulis, Haldane, Galea and Koukopoulos5, Reference Gogos, Joshua, Rossell, Fellow and Alfred11, Reference Tournikioti, Ferentinos, Michopoulos, Dikeos, Soldatos and Douzenis54], while others had samples who were only partially medicated [Reference Gogos, Son, Rossell, Karantonis, Furlong and Felmingham46], and Xu et al. [Reference Xu, Xiang, Qiu, Teng, Li and Huang57] included non-medicated patients. Given that medication is an unavoidable confounder in clinical research [Reference Ilzarbe and Vieta61], it is pertinent to account for these differences across studies. Additionally, an important factor to consider in the study of sex differences is the menstrual cycle, together with reproductive aging status, which have been associated with worse cognitive performance depending on the phase of the cycle in which women are tested [Reference Postma, Winkel, Tuiten and van Honk62, Reference Metcalf, Duffy, Page and Novick63]. Of the six included studies, only Gogos et al. [Reference Gogos, Joshua, Rossell, Fellow and Alfred11] collected this information. Finally, each study used different assessments of NC, which most likely contributes to the changes in results in the sensitivity analysis. Overall, future studies should aim to include balanced samples and adopt a standardized approach to NC assessment, while also collecting data relevant to sex differences, to address limitations in the extant literature. Additionally, the identification of potential cultural variables could help to explain the sex differences.
In terms of psychosocial functioning, no significant sex differences were found. As such, our results are in line with the existing literature on other severe mental disorders such as schizophrenia [Reference Prat, Escandell, Garcia-Franco, Martín-Martínez, Tortades and Vilamala64]. However, these results do not support previous studies which highlighted NC and functional sex differences [Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13, Reference Sanchez-Autet, Arranz, Safont, Sierra, Garcia-Blanco and de la Fuente49]. The lack of consensus among studies on sex differences in functioning may partly arise from the clinical heterogeneity of BD subtypes and their associated polarity patterns. In the included studies, only three [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Sanchez-Autet, Arranz, Safont, Sierra, Garcia-Blanco and de la Fuente49, Reference Blanken, Oudega, Almeida, Schouws, Orhan and Beunders53] included both BD-I and BD-II while the remaining four [Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7, Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13, Reference Robb, Young, Cooke and Joffe50, Reference Yazla, Inanç and Bilici56] included BD-I only. For instance, BD-I, more evenly distributed across sexes, is often associated with manic episodes, whereas BD-II, more prevalent in females, is more linked to depressive episodes [Reference Diflorio and Jones4, Reference Schneck, Miklowitz, Calabrese, Allen, Thomas and Wisniewski65]. Similarly, men are more likely to present hypomanic polarity whereas females are likely to present depressive polarity [Reference Nivoli, Pacchiarotti, Rosa, Popovic, Murru and Valenti66, Reference Vega, Barbeito, de Azúa, Martínez-Cengotitabengoa, González-Ortega and Saenz67]. 
These differences in predominant polarity could influence psychosocial functioning and cognitive performance, complicating direct comparisons across studies with mixed samples. Further research with balanced and subtype-specific cohorts is needed to disentangle these effects. Moreover, heterogeneous methods of measuring psychosocial functioning were employed. Two studies [Reference Solé, Varo, Torrent, Montejo, Jiménez and del Mar Bonnin2, Reference Yazla, Inanç and Bilici56] used the Functioning Assessment Short Test (FAST) [Reference Rosa, Sánchez-Moreno, Martínez-Aran, Salamero, Torrent and Reinares68], one [Reference Vaskinn, Sundet, Simonsen, Hellvin, Melle and Andreassen13] the Social Functioning Scale (SFS) [Reference Birchwood, Smith, Cochrane, Wetton and Copestake69], and four [Reference Bücker, Popuri, Muralidharan, Kozicky, Baitz and Honer7, Reference Sanchez-Autet, Arranz, Safont, Sierra, Garcia-Blanco and de la Fuente49, Reference Robb, Young, Cooke and Joffe50, Reference Blanken, Oudega, Almeida, Schouws, Orhan and Beunders53] the Global Assessment of Functioning (GAF) [Reference Aas70]. This may explain the lack of significance observed in global psychosocial functioning and suggests that using scales that explore subdomains of functioning, such as the FAST, could be of clinical relevance, as they provide a more comprehensive assessment of a patient’s functional abilities. This approach allows clinicians to identify specific areas of impairment and tailor interventions accordingly, leading to more effective and targeted treatment strategies. Conversely, the GAF offers a single composite score which may fail to capture specific areas of strength or impairment, as it is more symptom-focused. Therefore, future research should aim to explore both BD subtypes with balanced samples, using standardized, consensus-based assessment batteries to measure functioning and neuropsychological performance.
This approach is essential before potential sex differences are disregarded, and it is particularly important given that sub-depressive symptoms, more frequent manic episodes, and higher rates of hospitalization are associated with functional impairment [Reference Bonnín, Martínez-Arán, Torrent, Pacchiarotti, Rosa and Franco15, Reference Sanchez-Moreno, Bonnin, González-Pinto, Amann, Solé and Balanzá-Martinez17]. It could include specific evaluation tools exploring subdomains to gain better insight into the impact of sex differences.
Overall, the findings suggest that female patients with BD show better performance in both verbal and visual learning and memory compared to males with BD. Identifying the particular cognitive domains affected can inform individualized therapeutic interventions. Regarding psychosocial functioning, no significant sex differences were found. Along the same lines, recent findings [Reference Serra-Navarro, Clougher, Solé, Sánchez-Moreno, González-Pinto and Jiménez71] also suggest that the benefits of functional remediation (FR) do not differ by sex, indicating that tailored approaches to psychosocial functioning may not be necessary. These results emphasize that both males and females benefit similarly from FR, supporting its general applicability. Nevertheless, the present findings must be considered in the context of the highlighted methodological challenges in research on NC and psychosocial functioning in this population. Identifying sex differences could promote preventative treatment options and offer psychotherapeutic methods to help patients reach cognitive and functional recovery, thus reducing the impact of illness on patients. Taken as a whole, adopting sex-informed approaches to treatment may facilitate targeted therapies that optimize cognitive performance, while also acknowledging shared pathways for psychosocial improvement. This strategy may ultimately help reduce the burden of BD on patients’ lives.
The present results must be considered in light of certain limitations. Firstly, heterogeneity was observed throughout the analyses conducted. We suggest this is due to the imbalance in sample sizes and the multiple different assessments used for NC and psychosocial functioning. Accordingly, we recommend a more homogeneous approach that aims to resolve these inconsistencies and address limitations in the present literature. Further, only a small number of studies provided information regarding mood state, which limits the overall generalizability of the results [Reference De Prisco and Vieta60]. Based on our findings, future research could significantly enhance the understanding of sex-specific factors in BD. This includes standardizing neurocognitive assessments to enable comparisons between studies, conducting longitudinal studies to examine the evolution of sex differences over time, investigating the impact of these differences on the effectiveness of treatment options, and exploring the biological and psychosocial mechanisms underlying these disparities. Such research could refine our ability to predict outcomes and develop more tailored and effective interventions.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1192/j.eurpsy.2025.27.
Data availability statement
Data are publicly available. Requests to see any data that are not included in the Article or the appendix should be directed to the corresponding author.
Acknowledgments
Eduard Vieta thanks the support of the Spanish Ministry of Science, Innovation and Universities [PI18/00805; PI21/00787] integrated into the Plan Nacional de I + D + I and co-financed by ISCIII-Subdirección General de Evaluación y el Fondo Europeo de Desarrollo Regional [FEDER]; CIBERSAM; and the Comissionat per a Universitats i Recerca del DIUE de la Generalitat de Catalunya to the Bipolar Disorders Group [2021 SGR 1358] and the project SLT006/17/00357, from PERIS 2016-2020 [Departament de Salut]. CERCA Programme/Generalitat de Catalunya. María Florencia Forte received the support of “Contratos predoctorales de formación en investigación en salud” [PFIS22] [FI22/00185] from the Instituto de Salud Carlos III [ISCIII] with European funds from the Recovery, Transformation and Resilience Plan, by virtue of the Resolution of the Directorate of the Carlos III Health Institute, O.A., M.P. of December 14, 2022, granting Predoctoral Research Training Contracts in Health [PFIS Contracts]. Funded by the European Union NextGenerationEU. Marina Garriga thanks the support of the Spanish Ministry of Science, Innovation and Universities [PI21/00340] integrated into the Plan Nacional de I + D + I and co-financed by ISCIII-Subdirección General de Evaluación y el Fondo Europeo de Desarrollo Regional [FEDER]; CIBERSAM; and the Comissionat per a Universitats i Recerca del DIUE de la Generalitat de Catalunya to the Bipolar Disorders Group [2021 SGR 1358]. CERCA Programme/Generalitat de Catalunya.
J. Antoni Ramos-Quiroga thanks the support of the CIBERSAM; Comissionat per a Universitats i Recerca del DIUE de la Generalitat de Catalunya to the Psychiatry, Mental Health and Addictions Group [2021 SGR 00840]. Anabel Martinez-Aran thanks the support of the Spanish Ministry of Science and Innovation [PI18/00789, PI21/00787] integrated into the Plan Nacional de I + D + I and cofinanced by ISCIII-Subdirección General de Evaluación and the Fondo Europeo de Desarrollo Regional [FEDER]; the ISCIII; the CIBER of Mental Health [CIBERSAM]; the Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement [2017 SGR 1365]; the CERCA Programme; and the Departament de Salut de la Generalitat de Catalunya for the Pla estratègic de recerca I innovació en salut [PERIS] grant SLT006/17/00177. Silvia Amoretti has been supported by Sara Borrell doctoral programme [CD20/00177] and M-AES mobility fellowship [MV22/00002], from the Instituto de Salud Carlos III [ISCIII], and co-funded by European Social Fund “Investing in your future.” This study was also supported by La Marató-TV3 Foundation grants 202234-32 [to S. Amoretti]; 202234-30 [to E. Vieta] and 202205-10 [to M. Bernardo] and PI24/00671, funded by the Instituto de Salud Carlos III and cofinanced by the European Union [FEDER] “Una manera de hacer Europa”. Carla Torrent has been supported through a “Miguel Servet” postdoctoral contract [CPI14/00175] and a Miguel Servet II contract [CPII19/00018] and thanks the support of the Spanish Ministry of Innovation and Science [PI17/01066, PI20/00344 and PI24/00407], funded by the Instituto de Salud Carlos III and cofinanced by the European Union [FEDER] “Una manera de hacer Europa.”
Financial support
This study has also been funded by Instituto de Salud Carlos III [ISCIII] through the project “PI21/00787” and co-funded by the European Union [FEDER] “Una manera de hacer Europa”.
Competing interest
Eduard Vieta has received grants and served as a consultant, advisor, or CME speaker for the following entities: AB-Biotics, AbbVie, Adamed, Alcediag, Angelini, Biogen, Beckley-Psytech, Biohaven, Boehringer-Ingelheim, Celon Pharma, Compass, Dainippon Sumitomo Pharma, Ethypharm, Ferrer, Gedeon Richter, GH Research, Glaxo-Smith Kline, HMNC, Idorsia, Johnson & Johnson, Lundbeck, Luye Pharma, Medincell, Merck, Newron, Novartis, Orion Corporation, Organon, Otsuka, Roche, Rovi, Sage, Sanofi-Aventis, Sunovion, Takeda, Teva, and Viatris, outside the submitted work. Marina Garriga has received honoraria/travel support from Ferrer, Janssen-Cilag, and Lundbeck, with no financial or other relationship relevant to the subject of this article. J. Antoni Ramos-Quiroga was on the speakers’ bureau and/or acted as consultant for Biogen, Idorsia, Casen-Recordati, Janssen-Cilag, Novartis, Takeda, Bial, Sincrolab, Neuraxpharm, Novartis, BMS, Medice, Rubió, Uriach, Technofarma and Raffo in the last 3 years. He also received travel awards [air tickets + hotel] for taking part in psychiatric meetings from Idorsia, Janssen-Cilag, Rubió, Takeda, Bial and Medice. The Department of Psychiatry chaired by him received unrestricted educational and research support from the following companies in the last 3 years: Exeltis, Idorsia, Janssen-Cilag, Neuraxpharm, Oryzon, Roche, Probitas and Rubió. Miquel Bernardo has been a consultant for, received grant/research support and honoraria from, and been on the speakers/advisory board of ABBiotics, Adamed, Angelini, Casen Recordati, Janssen-Cilag, Menarini, Rovi and Takeda. Silvia Amoretti has been a consultant to and/or has received honoraria/grants from Otsuka-Lundbeck, with no financial or other relationship relevant to the subject of this article. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Ethical statement
An ethics statement is not applicable because this study is based exclusively on already published data. Accordingly, written informed consent was not required.
Additional Information
Maria Serra-Navarro and Derek Clougher are the co-first authors. PROSPERO Registration: # CRD42022369013.