How much conversation content is actually social: human conversational behaviour revisited

Anna Szala; Sławomir Wacewicz; Marek Placiński; Aleksandra Ewa Poniewierska; Arkadiusz Schmeichel; Michal Mikolaj Stefanczyk; Przemysław Żywiczyński; Robin I.M. Dunbar

doi:10.1017/langcog.2024.54

Published online by Cambridge University Press: 09 January 2025

Anna Szala*: Affiliation:
Center for Language Evolution Studies, Nicolaus Copernicus University in Toruń, Toruń, Poland; Hirszfeld Institute of Immunology and Experimental Therapy, Polish Academy of Sciences, Wrocław, Poland
Sławomir Wacewicz: Affiliation:
Center for Language Evolution Studies, Nicolaus Copernicus University in Toruń, Toruń, Poland
Marek Placiński: Affiliation:
Center for Language Evolution Studies, Nicolaus Copernicus University in Toruń, Toruń, Poland
Aleksandra Ewa Poniewierska: Affiliation:
Center for Language Evolution Studies, Nicolaus Copernicus University in Toruń, Toruń, Poland
Arkadiusz Schmeichel: Affiliation:
Center for Language Evolution Studies, Nicolaus Copernicus University in Toruń, Toruń, Poland
Michal Mikolaj Stefanczyk: Affiliation:
Institute of Psychology, University of Wrocław, Wrocław, Poland
Przemysław Żywiczyński: Affiliation:
Center for Language Evolution Studies, Nicolaus Copernicus University in Toruń, Toruń, Poland
Robin I.M. Dunbar: Affiliation:
Department of Experimental Psychology, University of Oxford, Oxford, UK
*: Corresponding author: Anna Szala; Email: [email protected]

Abstract

Our study explores aspects of human conversation within the framework of evolutionary psychology, focusing on the proportion of ‘social’ to ‘non-social’ content in casual conversation. Building upon the seminal study by Dunbar et al. (1997, Human Nature, 8, 231–246), which posited that two-thirds of conversation gravitates around social matters, our findings indicate an even larger portion: approximately 85% of conversation is of a social nature. Additionally, we provide a nuanced categorisation of ‘social’ rooted in the principles of evolutionary psychology. Similarly to Dunbar et al.’s findings, our results support theories of human evolution that highlight the importance of social interactions and of the exchange of social information across various contexts.

Keywords

conversation analysis; Dunbar; evolutionary psychology; language evolution; social discourse

Type: Article
Information: Language and Cognition, Volume 17, 2025, e11

DOI: https://doi.org/10.1017/langcog.2024.54
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (http://creativecommons.org/licenses/by-nc/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1. Introduction

How much of what we say to each other is of a social nature? In a pioneering study, Dunbar et al. (1997) estimated that ‘gossip’ – understood loosely as conversation about social and personal topics – accounted for about two-thirds of time spent on conversation. This result has been extremely consequential: in addition to its considerable popular impact, it was instrumental in motivating some of the most influential theories of the evolution of the human brain and cognition (e.g. the social brain theory – Dunbar, 1998a, 1998b, 2009) and an influential theory in the field of language evolution (the ‘gossip’ theory of language origins – Dunbar, 1998a, p. 199). However, it is not clear that this ‘two-thirds’ estimate rests on an evidential basis strong enough to support its theoretical importance. The original study’s authors relied on a small number of conversations (N = 45), collected exclusively in open public environments, between a sample of participants with a limited demographic and geographical distribution. In what follows, we revisit Dunbar et al.’s question about the proportion of conversational time spent on exchanging social information, drawing on recent linguistic-analytic resources and providing a deeper discussion of what type of language use should count as ‘social’.

1.1. The social evolution of human cognition and language

Humans are exceptional among primates in several respects, but among the most salient are the ability to use language and the large size of our brains. The human brain is over three times larger than the brain of a primate of similar body size (Marino, 1998), which comes with several types of costs. In terms of energy expenditure, the brain accounts for up to 50% of the basal metabolic rate during childhood (Armstrong, 1983; Milton, 1988) and ca. 20% in adulthood, still higher than for other organs (Aiello & Dunbar, 1993), and the brain’s high susceptibility to oxygen deprivation necessitates a stable energy supply (Byrne, 2000). Developmentally, the very large postnatal growth required for the brain to reach adult size makes humans unusually helpless at birth and exceptionally dependent on parental or alloparental care. There are also allometric costs related to the much larger head in proportion to the rest of the body in humans, both in neonates (e.g. complications during labour) and in adults (e.g. complicating balance in bipedal walking – Abitbol, 1993). In sum, we would expect such traits to be selected against unless their high costs trade off against equally weighty fitness advantages; in the case of human brain size, these are typically interpreted in the context of cognitive specialisation (Byrne, 2000).

Historically, theories of human cognitive evolution leading to high encephalisation highlighted ecological challenges, such as tool use (Vaesen, 2012), social technology transmission (Nicol, 1995; see Shilton, 2019, for review) and food acquisition benefits (DeLouize et al., 2017). Spatial cognition (Epstein et al., 2017), hunting (Speth, 2010) and extractive foraging (Bickerton, 1998; see also Byrne, 2000) were also proposed as potential drivers of increased brain size in the hominin lineage. Since the advent of the Machiavellian intelligence hypothesis (Whiten & Byrne, 1988), these ecological explanations have been gradually complemented by a social perspective, in which the social environment was as important as the physical environment in shaping primate cognitive evolution. The literature also indicates differences in social communication: men speak more and louder during arguments (Kimble & Musgrove, 1988), while women use gossip more for competition (Buss & Dedden, 1990) and engage in more affiliative speech, especially with children (Leaper & Ayres, 2007; Leaper & Smith, 2004). Hyde and Linn’s (1988) meta-analysis suggests women might have slightly higher verbal ability.

The research by Dunbar (see, e.g., Dunbar, 2009) has been particularly important in spearheading this point of view. The influential social brain hypothesis (e.g. Dunbar, 2010) holds that the selection pressures for the increased neocortex size in hominins reflected the high cognitive demands of dealing with increasingly complex social groups. This is because successful survival and reproduction in large and intricately organised primate groups require the capacity to understand and navigate fluid social hierarchies, form and break short- and long-term alliances and coalitions and keep a balance between cooperation and conflict. Since the amount of relevant social information grows exponentially with group size, its efficient processing places high computational demands on the brain, particularly its neocortical areas. The requirements of dealing with social (rather than ecological) information would thus have been the main driver of the evolution of human cognition. This includes the ability to use language, which might have evolved as an all-purpose social tool to transmit social information (Dahmardeh & Dunbar, 2017; Mesoudi et al., 2006; Pleyer, 2023; Redhead & Dunbar, 2013; Wacewicz, 2015; see also Dunbar & Shultz, 2023; Shultz & Dunbar, 2022).

1.2. The study by Dunbar, Marriott, and Duncan (1997)

Dunbar et al. (1997) analysed the content of 45 casual conversations held in public settings by manually recording the general topic of speakers’ utterances. The content was analysed speaker by speaker. The authors used scan sampling, which generally provides accurate estimates of the amount of time devoted to an activity. They sampled the topic at a specific moment in time and repeated the process every 30 seconds. The topics were classified into several thematic categories, with a subset of them later grouped as concerning social topics. While groundbreaking in using linguistic analysis to inform evolutionary considerations, and despite being further confirmed in another study that showed that the proportion of time devoted to social topics was 76% (Dahmardeh & Dunbar, 2017), the study had several limitations related to its dataset and approach.
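The logic of scan sampling can be illustrated with a minimal sketch. It assumes a conversation is available as a timeline of non-overlapping segments, each with a start time, an end time and a topic; the function name `scan_sample`, the segment format and the example topics are hypothetical and are not taken from Dunbar et al. (1997).

```python
from collections import Counter
from typing import Dict, List, Tuple

Segment = Tuple[float, float, str]  # (start in seconds, end in seconds, topic)


def scan_sample(segments: List[Segment], interval: float = 30.0) -> Dict[str, float]:
    """Estimate the share of time per topic by checking the topic every `interval` seconds."""
    if not segments:
        return {}
    end_of_talk = max(end for _, end, _ in segments)
    counts: Counter = Counter()
    n = 0
    while n * interval < end_of_talk:
        t = n * interval
        # Record the topic of the segment that covers the sampling instant, if any.
        for start, end, topic in segments:
            if start <= t < end:
                counts[topic] += 1
                break
        n += 1
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()} if total else {}


# Hypothetical two-minute conversation: 90 s on a social topic, 30 s on work.
example = [(0.0, 90.0, "social"), (90.0, 120.0, "work")]
estimate = scan_sample(example)
```

With a fixed interval, the proportion of sampling instants that fall on a topic approximates the proportion of time devoted to it, which is why the method needs no record of exact utterance durations.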

From today’s perspective, some limitations of the study by Dunbar et al. (1997) derive from the size and composition of the sample. While 45 conversations are similar to other self-collected corpora of casual conversations of the time (e.g. Gumperz & Tannen, 1979; Tannen & Wallat, 1983) and may be sufficient for qualitative analyses (cf. Saville-Troike, 2003), quantitative analyses now tend to rely on much larger datasets. The advent of the Digital Age at the beginning of the 21st century allowed researchers to collect rich conversational data in various formats (text, audio and video) and use computational tools to quantify these data (Yeomans et al., 2023). Furthermore, the sample was heavily biased towards young urban adults with academic backgrounds: over three-quarters of the conversations were held by students in two university cafeterias (in Liverpool and London, UK). Finally, the conversations were acquired in open public environments, which might not be representative of the content of conversations held in completely private settings (cf. Heritage & Stivers, 2012, for the role of socioeconomic variables in conversation analysis).

The design of the study implied several logistic challenges. First, as surreptitious eavesdroppers, the coders might not have heard the full content of at least some conversations, and second, they had to identify and record the topics in real time, simultaneously listening to the conversations. Dunbar and colleagues acknowledge that the process of assigning a topic to a subject area could be difficult and that, in some cases, the annotators had to rely on their interpretation of the speaker’s intentions to make a single choice. Notably, these classifications were based on a single coder’s intuition at one of three locations (Dunbar et al., 1997, p. 236), which may prompt questions about the reliability of the results reported in the study (Luginbühl et al., 2021). While Dunbar et al.’s data suggest high coder agreement, it did not involve the now-standard procedure of achieving consensus agreement among at least two coders or measuring inter-rater agreement for at least part of the coded material (Stolarova et al., 2014).

Finally, and most importantly for the current analysis, the study by Dunbar et al. (1997) had a much broader focus than classifying the content of talk into social vs. non-social. For this reason, it did not provide an explicit operationalisation of these two categories or the difference between them; instead, the amount of speaking time taken up by social topics was a sum of the time devoted to lower-level topics that were jointly grouped as ‘social’. The authors’ original classification had fourteen subject areas, which were later reclassified into ten major categories: personal relationships, personal experiences, future social activity, future non-social activity, sport/leisure, culture/art/music, politics, religion/morals/ethics, work/academic and technical/instructional. Each subject area contained several topics; e.g., the category ‘personal relationships’ included personal experiences arising from social events, social relationships and actual behaviour in social situations, as well as the emotional experiences involved. The authors explain that they ‘delineated the topics a priori to reflect functionally relevant categories’, i.e. subject categories based on their functions (1997: 235).

1.3. What language use counts as ‘social’?

‘Social’ is an everyday language word with a very broad range of meanings, which makes ‘talking about social matters’ challenging to operationalise. In this study, our approach is directly guided by an evolutionary perspective: we investigate current patterns of language use because we take them to be informative on the nature of human-evolved cognition. This is in line with the basic tenets of evolutionary psychology (Buss, 2019; Tooby & Cosmides, 1992) and aligns with research in environmental aesthetics (Kaplan, 1992; Orians & Heerwagen, 1992), Darwinian literary theory (Carroll, 1995, 2011) or evolutionary psychology of the media (Szlendak & Kozłowski, 2008; Tooby & Cosmides, 2001), all of which assume that the types of content that humans find interesting reflect main adaptive challenges, i.e. recurrent adaptive problems of our ancestors. Similar to Dunbar et al. (1997), we assume that the broad categories of content that the conversants find worthwhile speaking about reflect the relative importance of the social and ecological challenges in our evolutionary past.

In line with the above, we were interested in the proportion of linguistic information conveyed in conversation that pertains to social versus non-social matters. Similar to Dunbar et al. (1997), our focus was not on the organisation of the talk, for example the development and sequencing of topics (Heritage, 2012). Rather, we treated utterances as blocks of static text as if they were single-voice documents (a standalone document created by one person; Yeomans et al., 2023) and identified their content as either social or non-social (see below). Accordingly, we labelled instances of language use as ‘social’ when their content was related to managing or navigating social situations rather than non-social situations (cf. Section 2). So construed, ‘social content’ is a very broad category that encompasses all utterances related to one’s social connections within a group, containing information primarily relevant to and useful for managing one’s social relations. Conversely, ‘non-social’ content relates more closely to other aspects of human life and may be more useful in contexts outside of human relationships.¹

In our study, we specified three categories with increasing social distance to the individuals being discussed in the content of the conversation (see Table 1). The first category of ‘Social’ concerns individuals participating in the conversation. Speaking about oneself might not intuitively be considered to count as ‘social’ in the sense that it may not directly involve other people. Indeed, in our preliminary studies (e.g. Szala et al., 2022), contra Dunbar et al. (1997), we decided to exclude the subcategory of ‘personal experiences’ as not exemplifying the category ‘social’. However, the present grounding of our categories in terms of types of content domains related to this content’s evolutionary relevance made us reconsider this decision. From this perspective, self-disclosure plays a key role in building social relationships, both through self-promotion and as an invitation for the other speaker to do likewise.² More broadly, sharing information about oneself belongs to the content domain of ‘information about people’ and constitutes a key mechanism for building one’s own reputation. When accompanied by others, people tend to share information about the intensity of their emotional reactions (Stefanczyk et al., 2022) or sexual desires (Stefanczyk, 2024) that deviate from reality for the sake of their image in the eyes of others. Furthermore, people vary in what traits they decide to highlight when put in the context of a romantic date (Stefanczyk et al., 2024). Thus, what and when we disclose about ourselves may be considered a tool we use to navigate social interactions successfully.

Table 1. Coding instructions and examples from the database per line of text for five categories

The second category of ‘Social’ concerns sharing information about people not physically present during the conversation but belonging to the extended social circle of at least one of the conversants. The ‘extended social circle’ stands for the range of people with whom we have or potentially could have some form of relationship: people whom we either actually know personally or could realistically get to know personally. In other words, my ‘extended social circle’ encompasses people with whom I can realistically have future non-transient social interactions, and information on those people can potentially be useful in guiding those interactions (cf. also Krems et al., 2016, on forming mental models of absent individuals).

We distinguish this category from the previous one for two reasons. First, this category is instrumental in extending the reputation-building of individuals beyond the interacting dyad. It provides a powerful source of information on third parties that is alternative or complementary to one’s own first-hand experience of interactions. Access to this additional, non-direct source of information is quite fundamental for indirect reciprocity (Nowak & Sigmund, 2005), whereby individual A helps individual B even without B directly reciprocating the help; instead, A can reliably count on being ‘helped back’ by other individuals in the social network. Indirect reciprocity is one of the key mechanisms implicated in the origin of human cooperative norms (including moral norms), which are at the core of a number of approaches to the evolution of human cooperation (e.g. Ohtsuki & Iwasa, 2006). Second, this discrimination aligns well with the most common understanding of the term ‘gossip’, i.e. ‘informal and evaluative talk (…) about another member (…) who is not present’ (Kurland & Pelled, 2000), or to put it simply, a private transmission between ‘A and B talking about C’ (Hannerz, 1967).

The third category of ‘Social’ concerns instances of language use that contain information on people outside of the conversants’ extended social circle; that is, individuals with whom one is highly unlikely (or impossible) to have social interactions in the foreseeable future. As such, this type of information – a reputation of people that we will not meet anyway – does not have direct social value in the specific sense described above. However, exchanging social information, such as details about the lives of movie stars, likely engages the same information-processing mechanisms that we use when processing information about members of our own social circle; indeed, this is a standard explanation of the popularity of celebrities suggested in evolutionary media studies (e.g. Szlendak & Kozłowski, 2008). Also, information on unknown and unknowable individuals still contains explicit or implicit general advice on handling social situations. This is again visible in evolutionarily inspired research on the content of oral and written literature (e.g. Carroll, 2012; Gottschall, 2005; Saunders, 2015), which has been argued to have fitness-enhancing value in the sense of transmitting information useful in dealing with recurring adaptive challenges.

To further refine our analysis, we divided Category 3 into subcategories 3A and 3B to differentiate between information specifically related to people’s social lives (3A) and general information about people (3B). This distinction was influenced by our preliminary study (Szala et al., 2022), and it allows us to capture the difference between these two approaches to coding ‘social’: broadly, any information about people, and narrowly, information specifically related to their social lives. By comparing 3 versus 3A or 3B, we can identify the proportion of social information that is general versus that which is socially specific within this category, which may reflect this proportion across all our ‘social’ categories.

2. Materials and methods

The study has been preregistered (see https://osf.io/kjf4e/).

2.1. Corpus

In our study, we used Spokes (Pęzik, 2012, 2014), a corpus of Polish informal conversations (N = 669; >2.6 million word tokens). The corpus was based on live recordings of casual speech, which were recorded in private as well as public places, with speakers from a variety of Polish demographic backgrounds, such as age (range 1–99; M = 34.71, SD = 17.54; Mdn = 27; Mo = 23) and education (from no education to higher education). The corpus consisted of 171,126 lines of text from female speakers and 101,339 lines of text from male speakers, as well as 2489 lines not tagged for the speaker’s sex (0.9%). At the moment of recording, some speakers were unaware that their conversations were being recorded, providing their consent and demographic data only later. Because of this, the corpus is especially useful for studying naturally occurring interactions. The recordings were then transcribed to a text form using ELAN (see Wittenburg et al., 2006) and exported to the freely available open database at http://spokes.clarin-pl.eu/. Pęzik explains that the corpus was manually divided into lines; the aim was to mark alternating statements in a conversation (personal communication, 9 June 2023); hence, individual lines differ in the number of word tokens.

2.2. Sampling

Our goal was to obtain a strong representation of speakers of all ages, genders and education levels. However, to increase the quality of our sample, we excluded files for two reasons. First, we excluded the files that contained at least one line with more than 150 word tokens since such lines tended to express content falling within several categories. Second, we excluded the files that had 50 or fewer lines of text, as this made it very difficult to establish the context of the conversation and, therefore, correctly classify the content. When using those exclusion criteria, we ended up with a sample of N = 535 conversations and N = 274,954 lines of data (80% of the full corpus).
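The two exclusion criteria can be expressed as a simple filter. The sketch below is only an illustration: it assumes each conversation file is available as a list of text lines, and it counts word tokens by splitting on whitespace, which may differ from the corpus tokenisation. The names `keep_conversation` and `filter_corpus` are hypothetical and do not come from the published analysis code.

```python
from typing import Dict, List

MAX_TOKENS_PER_LINE = 150  # a file is excluded if any line has more tokens than this
MIN_LINES_EXCLUSIVE = 50   # a file is excluded if it has this many lines or fewer


def keep_conversation(lines: List[str]) -> bool:
    """Return True if a conversation file passes both exclusion criteria."""
    if len(lines) <= MIN_LINES_EXCLUSIVE:
        return False
    # Assumption: whitespace splitting stands in for the corpus's own tokenisation.
    return all(len(line.split()) <= MAX_TOKENS_PER_LINE for line in lines)


def filter_corpus(corpus: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only the conversation files that pass the criteria."""
    return {name: lines for name, lines in corpus.items() if keep_conversation(lines)}


# Hypothetical toy corpus with one file per outcome.
toy_corpus = {
    "too_short": ["tak"] * 50,
    "ok": ["tak nie"] * 51,
    "long_line": ["tak"] * 51 + [" ".join(["x"] * 151)],
}
kept = filter_corpus(toy_corpus)
```

Note that both criteria operate on whole files: a single overlong line removes the entire conversation, not just that line.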

2.3. Coding

The coding scheme was inspired by a similar study by Szala et al. (2022) that utilised fragments of the same database and was informed by extensive qualitative analysis and group discussions about the conversations and the training set. Each line in the final dataset was coded by two annotators (native Polish speakers), who were provided with instructions and received a brief coding manual for reference (see Supplementary Material, Appendix A), along with individual training in annotation for this particular study. There were ten coders in total; two coders were randomly assigned to every text. They worked independently, blind to each other’s coding.

The annotators coded a dataset of 535 conversations comprising a total of 274,954 lines. The mean number of lines in a conversation was x͂ = 513.93 (SD = 524.7), and the mean length of a line of text was M = 8.29 (SD = 8.58) word tokens (the high standard deviations are explained by the fact that some conversations and lines were much longer than others). Lines that could not be individually identified as expressing any topic (e.g. short lines such as ‘yeah’ or ‘all right’) were interpreted within the context they belonged to and coded accordingly. For the coding instructions and examples, see Table 1. Again, note that in the light of our approach expounded in 1.3 above, our operationalisation of ‘social’ was very broad, effectively subsuming any lines at all about people involved in the interaction (Category 1), people known to the interlocutors (Category 2) or people that the interlocutors know of indirectly (Category 3).³

3. Results

Statistical descriptions were prepared in the Python programming language, whereas models were fitted using the R programming language (R Core Team) and several of its libraries: ggplot2 (Wickham, 2016) for data visualisation, tidyverse (Wickham et al., 2019) for data wrangling, irr (Gamer et al., 2019) for estimating Cohen’s kappa and lme4 (Bates et al., 2015) for mixed-effects modelling. The open-access database and the relevant code used for the analysis can be found at https://osf.io/mqs5k/.

In the first analysis, we compared the broad categories ‘non-social’ (all lines coded as 0) with ‘social’ (all lines coded as 1, 2, 3A or 3B). The lines that the two annotators coded differently were discarded – that is, for this analysis, we removed the lines rated as 0 by one annotator and 1, 2, 3A or 3B by the other. The annotators agreed on 71% of our dataset, meaning that we arrived at a database comprising 197,621 lines of data (x͂ = 366.56 lines, SD = 403.42, and x͂ = 9.09, SD = 9.22 word tokens).
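The filtering step for the first analysis can be sketched as follows. This is an illustrative reconstruction, not the authors’ code: it assumes each line carries a pair of codes (one per annotator) drawn from the labels 0, 1, 2, 3A and 3B, and the function names are hypothetical. A plain-Python Cohen’s kappa is included only to show what the statistic measures; the study itself reports using the R package irr for this.

```python
from collections import Counter
from typing import List, Tuple

SOCIAL_CODES = {"1", "2", "3A", "3B"}  # code "0" denotes non-social content


def to_binary(code: str) -> str:
    """Collapse the five codes into the broad social / non-social distinction."""
    return "social" if code in SOCIAL_CODES else "non-social"


def agreed_lines(pairs: List[Tuple[str, str]]) -> List[str]:
    """Keep only lines on which both annotators agree at the binary level."""
    return [to_binary(a) for a, b in pairs if to_binary(a) == to_binary(b)]


def cohen_kappa(pairs: List[Tuple[str, str]]) -> float:
    """Cohen's kappa for two annotators on the binary labels."""
    n = len(pairs)
    a_labels = [to_binary(a) for a, _ in pairs]
    b_labels = [to_binary(b) for _, b in pairs]
    p_observed = sum(x == y for x, y in zip(a_labels, b_labels)) / n
    a_freq, b_freq = Counter(a_labels), Counter(b_labels)
    # Chance agreement: product of the two annotators' marginal label proportions.
    p_expected = sum(a_freq[k] * b_freq[k] for k in a_freq) / (n * n)
    if p_expected == 1.0:
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for four lines: (annotator A, annotator B).
codes = [("0", "0"), ("1", "2"), ("0", "1"), ("3A", "3B")]
retained = agreed_lines(codes)
kappa = cohen_kappa(codes)
```

In this toy example, the pairs (1, 2) and (3A, 3B) count as agreement because both codes are social; only the (0, 1) line is discarded. An analysis of the individual social categories would instead require the two codes to match exactly.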

In our second analysis, the ‘social’ categories, i.e. Social–1, Social–2, Social–3A and Social–3B, were treated separately. Again, the lines that the two annotators coded differently were discarded from this analysis. Here, the annotators agreed on 59% of all lines of data, which accounts for 164,374 lines (x͂ = 304.7434 lines, SD = 342.13, and x͂ = 8.91 word tokens, SD = 9.07).

Most lines in the sample were produced by 20- to 30-year-old adults with higher education (139,575 out of a total of 274,954, 50% of the entire dataset). The corpus data itself make no distinction between speakers who have no education because they are too young to have finished elementary school and those who have no education because, despite being old enough, they did not graduate from any school. To avoid making arbitrary choices, we analysed the age structure and education of the speakers in the dataset. The analysis revealed that there is an educational gap between the ages of 12 and 29. All individuals within this age range have received some education. Consequently, in our analysis, we differentiate between children with no education (below the age of 12) and adults with no education because they never finished any school (29 years and older). In Supplementary Material, Appendix B, we describe the part of the dataset that was removed from the analysis.
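The rule for separating the two kinds of ‘no education’ speakers can be written down compactly. The sketch below is a hypothetical illustration: the label strings, the value `"none"` for missing education and the function name `education_group` are assumptions, not identifiers from the corpus or from the published analysis code.

```python
def education_group(age: int, education: str) -> str:
    """Assign a speaker to an education group, splitting 'no education' by age."""
    if education != "none":
        return education
    if age < 12:
        return "child, no education"
    if age >= 29:
        return "adult, no education"
    # Per the age and education structure described above, no speaker aged 12 to 28
    # lacks education, so this branch is only a safeguard.
    return "unclassified"


# Hypothetical speakers.
groups = [
    education_group(7, "none"),
    education_group(64, "none"),
    education_group(25, "higher"),
]
```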

The question of ‘How much language use is social?’ can potentially be answered in two main ways. Because information about utterance duration is not available in our database, we can either count the total number of lines expressing specific content or count how many words individuals uttered expressing that content. The former informs us about the overall distribution of content, whereas the latter constitutes a proxy of the duration of an utterance. In the initial stage, we compare the results from both types of operationalisations. For both operationalisations, we decided to perform a two-level analysis.
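The difference between the two operationalisations can be made concrete with a short sketch. It assumes each coded line is a pair of text and label, and it counts word tokens by whitespace splitting, which is an assumption rather than the corpus tokenisation; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

CodedLine = Tuple[str, str]  # (text of the line, content label)


def proportions_by_lines(coded: List[CodedLine]) -> Dict[str, float]:
    """Share of lines per label: every line counts once, whatever its length."""
    counts = Counter(label for _, label in coded)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def proportions_by_tokens(coded: List[CodedLine]) -> Dict[str, float]:
    """Share of word tokens per label: a proxy for the duration of utterances."""
    counts: Counter = Counter()
    for text, label in coded:
        counts[label] += len(text.split())  # assumption: whitespace tokenisation
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical coded lines: one long social line and one short non-social line.
sample = [("a b c", "social"), ("d", "non-social")]
by_lines = proportions_by_lines(sample)
by_tokens = proportions_by_tokens(sample)
```

In the toy sample the two measures diverge (one half versus three quarters social) because the social line is longer, which is why the two operationalisations are worth comparing before one is chosen for reporting.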

The results of the analyses indicate that, overall, the proportion of social content in conversation was substantially larger than non-social content; within that former category, personal content (Social–1) constituted a majority of content, followed by personal non-participating (Social–2), i.e. discussions about individuals that are part of one’s extended social circle, but that are not present during the conversation (see Tables 2–5 for exact values). The overall pattern of results is the same for the two operationalisations: social content is more frequent than non-social content (see Tables 2 and 4); Social–1 is the most frequent among all categories, followed by Social–2, Non-social and the two Social–3 categories (see Tables 3 and 5). In the remainder of our analyses, we report the results for word tokens.

Table 2. The overall distribution of social vs. non-social content (lines of text)

Table 3. Distribution of social content categories vs. non-social content (lines of text)

Table 4. The overall distribution of social vs. non-social content (word tokens)

Table 5. Distribution of social content categories vs. non-social content (word tokens)

In the next step, we analysed the distribution of content between the sexes. The number of observations totalled 1,148,287 word tokens for females and 632,795 for males when considering social vs. non-social content in general. Females engage in conversing about social content more often than males (91% vs. 86%). When individual categories are considered (927,410 observations for females and 521,594 for males), males appear to express non-personal social content (category Social–3B) more often than females (6% vs. 1.5% of word tokens). This category covers social content that does not carry information about the social lives of the individuals under discussion (e.g. ‘papież na biało chodzi’ [Eng. the Pope wears white]) (Figure 1).

Figure 1. (A) The overall distribution of social topics per sex. (B) Distribution of individual social topic categories per sex.

In the next step, we analysed the proportion of content types between education groups. The proportions of content discussed by education groups can be seen in Figure 2.

The pattern of results here resembles the general trend. All education groups devote most of their conversations to social content (see Figure 2A). There are some exceptions when individual categories of social content are considered. Adults with no education and with primary education engage in social content the least (72% and 66% of word tokens, respectively). In addition, adults without any education were found to devote more word tokens to Social–2 than to Social–1 content (40% vs. 28% of word tokens). Similarly, the high school group also stands out in terms of engaging in the Social–2 content type (42% of word tokens). Higher education and vocational groups resemble the general trend (Figure 3).

Figure 2. (A) The overall distribution of social content per education level. (B) Distribution of individual social content categories per education level.

Figure 3. (A) The overall distribution of social content per age group. (B) Distribution of individual social content categories per age group.

The results indicate that, overall, every age group devotes more word tokens to social content than to non-social content. The age groups that engage in social content least are children under the age of 10 (73% of word tokens expressing social content) and adults between ages 30 and 39 and over 40 years of age (both 84% of word tokens expressing social content). We can also see that, overall, the number of words devoted to social content increases with age: whereas children devote the fewest words to social content (73% of all words), teenagers devote nearly a third of that number of words to non-social content and spend 90% of all word tokens on social content. This trend persists until the ages of 30–39 and over 40, when the number of words devoted to non-social content plateaus. Another factor that differs between age groups is the number of words devoted to Social–2 content. Whereas children under the age of 10 focus on personal content (Social–1), older individuals start focusing on individuals not participating in the conversation (Social–2 content). The group that by far focuses most on Social–2 content is the 25–29 age group, where the number of words surpasses that of Social–1 content (45% vs. 40%). This is also true of the age group of people over 40 years of age (41% to Social–2 and 33% to Social–1 content).

We also fitted random effects linear models to our data to verify the descriptive statistics reported above. For the differences between sexes, we fitted a mixed effects regression model with utterance length as the outcome variable, speaker sex (male or female) and content type (social content or non-social content) with an interaction term as predictor variables, and conversation identifier as a random effect (to control for the fact that the observations within a single conversation are not independent) (Table 6).
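In conventional notation, a model of this kind can be written as follows. The symbols are ours, and two details are assumptions rather than statements from the source: that the two binary predictors are entered as 0/1 indicators, and that the random effect of conversation is a random intercept. Here y_ij is the length of utterance i in conversation j.

```latex
y_{ij} = \beta_0
       + \beta_1\,\mathrm{sex}_{ij}
       + \beta_2\,\mathrm{content}_{ij}
       + \beta_3\,(\mathrm{sex}_{ij} \times \mathrm{content}_{ij})
       + u_j + \varepsilon_{ij},
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

The term u_j absorbs the part of utterance length that is shared by all lines within one conversation, which is how the non-independence of observations mentioned above is taken into account.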

Table 6. A mixed-effects linear model with sex as the predictor

The model results indicate that in comparison with women, men spend more time discussing social content than non-social content (β = 2.75, p = 0.001) and that there is no significant difference between women and men in terms of how much time they spend discussing social (β = 0.02, p = 0.86) or non-social (β = 0.004, p = 0.97) content.

We estimated the relationship between education level and time spent on discussing content by fitting a mixed effects linear regression model to our data with utterance length as the outcome variable and education and type of social content as the predictor variables with an interaction term. We also included the conversation identifier as a random predictor. Education level was deviation coded so that each estimate shows how a given level of the predictor differs from the grand mean (Table 7).
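Deviation coding can be illustrated with a minimal sketch. It covers only the simplest case, a single categorical predictor with no interaction and no random effect, in which the intercept equals the unweighted mean of the group means and each coefficient equals a group's mean minus that grand mean. The group labels and values are invented, and the function name is hypothetical; estimates from the mixed model reported here are model-based and need not match such raw differences.

```python
from statistics import mean


def deviation_contrasts(levels):
    """Deviation (sum-to-zero) coding: one column per level except the last.

    Each listed level gets 1 in its own column and 0 elsewhere; the omitted
    last level gets -1 in every column.
    """
    k = len(levels)
    rows = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            rows[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            rows[level] = [-1] * (k - 1)
    return rows


# Invented group means (mean utterance length in words), for illustration only.
group_means = {"primary": 6.0, "vocational": 8.0, "higher": 10.0}

contrasts = deviation_contrasts(list(group_means))
grand_mean = mean(list(group_means.values()))  # unweighted mean of group means
# Each coefficient is that level's mean minus the grand mean.
coefficients = {level: m - grand_mean for level, m in group_means.items()}

# Rebuild every group mean from the intercept and the k - 1 estimated
# coefficients (the last level has no coefficient of its own).
estimated = list(coefficients.values())[:-1]
fitted = {
    level: grand_mean + sum(c * b for c, b in zip(row, estimated))
    for level, row in contrasts.items()
}

print(contrasts)
print(grand_mean, coefficients)
print(fitted)
```

The reconstruction at the end shows why the omitted level needs no separate coefficient: its deviation is the negative sum of the others.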

Table 7. A mixed-effects linear model with education as the predictor

The model indicates that individuals with higher education (the intercept) spend more time discussing non-social content (β = 6.94, p < 0.05) and social content (β = 2.43, p < 0.05) than the grand mean. This most likely results from their representation in the corpus, as higher education individuals constitute most of our sample. The model also suggests that children without education spend less time discussing social content than the grand mean (β = -0.87, p < 0.05). Furthermore, we do not see statistical significance for adults without any education, or those with high school or vocational education, and their time spent discussing social content compared with the grand mean. Additionally, the model indicates that adults with high school education spend less time discussing non-social content than the grand mean (β = -2.16, p < 0.05). This suggests that although they do not spend as much time discussing social content as other groups (for instance, the group with higher education), they still spend most of their time on social content relative to non-social content.

Finally, we turn our attention to age as the predictor variable. Similarly to the education model, we deviation-coded the age predictor so that the model estimates are the deviation of a group relative to the grand mean. The model had an interaction term between its two predictors, that is age group and social content, and the conversation identifier as a random effect (Table 8).

Table 8. A mixed-effects linear model with age as a predictor

The intercept is the age group between 20 and 24 years of age and non-social content. Overall, we find that people between 20 and 25, 26 and 29 years of age and over 40 spend more time discussing social content than the grand mean. Children under the age of 10, teenagers between 10 and 19 years of age and adults between 30 and 39 years of age spend less time than the grand mean.

4. Discussion

Our study provides a comprehensive examination of the proportion of ‘social’ vs. ‘non-social’ content in casual conversation, offering valuable insights into the dynamics of human communication and its evolutionary implications. Dunbar et al.’s (1997) original study estimated that conversations revolving around social and personal content accounted for approximately two-thirds of the time spent in conversation. Building upon Dunbar et al.’s (1997) pioneering research, our study classifies an even greater proportion of conversation time, i.e. roughly 85%, as dedicated to discussing social content. This aligns with the results of previous studies: for example, Dahmardeh and Dunbar (2017) estimated the proportion of social topics in Iranian conversations at 76%, and a qualitative comparison of 174 conversations among the Ju/’hoan (!Kung), hunter-gatherers from southern Africa, supplemented by 68 translated texts, suggests that their conversations centre on how economic matters and gossip regulate social relations (Wiessner, 2014). Despite the differences between the methodological approaches, this stable pattern of results underscores the robustness of the prevalence of social discourse in human communication across different contexts and populations, which in turn points to the evolutionary importance of social interactions (see also Mesoudi et al., 2006; Redhead & Dunbar, 2013).

We employed a host of methodological solutions made possible by corpus tools; in particular, we relied on a coding practice that facilitated several discussions and deliberations between the coders to mitigate individual biases and ensure a more robust and reliable coding process. We also highlighted the strict dependence of the proportion of ‘social’ to ‘non-social’ content on the operationalisation of ‘social content’. Our key decision here was to understand ‘social’ within an evolutionary context, focusing on the primary domain of applicability of information gained through conversation (see 1.3). For example, information about myself is socially useful through impression management, and information about others is useful for updating my knowledge base on them, leading to better predictions of their behaviour. Secondarily, any information about the social lives of people outside of our social circle can be useful for learning vicariously about the consequences of social decisions. We, however, stress that alternative operationalisations of ‘social’ will invariably lead to different results; indeed, in our own preliminary study (Szala et al., 2022), leaving out the category ‘personal experiences’ led to only 50.9% of conversation content being classified as ‘social’.

Contrasting our findings with the original study by Dunbar et al. (1997) allows us to examine the evolutionary implications of our results. The alignment of our findings with those of Dunbar and colleagues reinforces the notion that social interactions and discussions about social topics play a fundamental role in our evolutionary history. The emphasis on social bonding, cooperation and information exchange is particularly consistent with the social brain theory, which posits that our large brains have evolved to meet the demands of complex social relationships.

5. Conclusions

Before attempting to go from our numerical results to more general conclusions, we must once again emphasise a crucial, if obvious, caveat: the proportion of ‘social’ to ‘non-social’ content in conversation depends mostly on how ‘social content’ is defined. This apparently self-evident reservation should not be overlooked in interpretations. In this study, in line with our evolutionarily motivated research question, we defined ‘social content’ very inclusively (see 1.3). When considering our results in different contexts, researchers should be cautious, as the prevalence and nature of social discourse may vary.

Our study points to several directions for further research. In particular, it could be productive to increase the geographical, typological and cultural diversity of the source material (i.e. languages and their speakers), to consider a broader variety of inclusion and exclusion criteria and to include additional socio-demographic variables. Finally, it could be helpful to employ a mixed-methods approach combining quantitative analyses with qualitative analyses, which can provide a more nuanced picture of the contextual factors behind discussing social content.

In summary, our study underscores the importance of precisely defining and operationalising ‘social content’ in conversation analysis. By doing so and noting its alignment with the study by Dunbar et al. (1997), we contribute to a deeper understanding of the evolutionary significance of social discourse in human communication.

Supplementary material

The supplementary material for this article can be found at http://doi.org/10.1017/langcog.2024.54.

Data availability statement

No new primary data were acquired in this study. The study was preregistered at https://osf.io/kjf4e/. Database and the scripts (Python and R) used: https://osf.io/mqs5k/.

Acknowledgements

This work was supported by the Polish National Science Centre under Grant UMO-2019/34/E/HS2/00248.

Competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

¹ An important if obvious caveat is that categorising on such a broad level inevitably involves some degree of fuzziness. As one example, discussing any non-social topic – such as when teaching a partner how to make tools – may still incorporate sharing some social information, and ‘sharing personal experience’ has been shown to improve the effectiveness of toolmaking (Tilston et al., 2022). As another example, phatic communication is quintessentially social in its function but not necessarily social in its content and as a result is often not classified as ‘communication on social topics’.

² We are grateful to two anonymous reviewers who independently raised this point in their report on our preliminary study (Szala et al., 2022).

³ We are grateful to an anonymous reviewer for pointing this out.

References

Abitbol, M. (1993). Quadrupedalism and the acquisition of bipedalism in human children. Gait & Posture, 1(4), 189–195. https://doi.org/10.1016/0966-6362(93)90045-3

Aiello, L. C., & Dunbar, R. I. M. (1993). Neocortex size, group size, and the evolution of language. Current Anthropology, 34(2), 184–193. https://doi.org/10.1086/204160

Armstrong, E. (1983). Relative brain size and metabolism in mammals. Science, 220(4603), 1302–1304. https://doi.org/10.1126/science.6407108

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01

Bickerton, D. (1998). Catastrophic evolution: The case for a single step from protolanguage to full human language. In Hurford, J. R., Studdert-Kennedy, M., & Knight, C. (Eds.), Approaches to the evolution of language: Social and cognitive bases. Cambridge University Press.

Buss, D. M. (2019). Evolutionary psychology: The new science of the mind (6th ed.). Routledge.

Buss, D. M., & Dedden, L. A. (1990). Derogation of competitors. Journal of Social and Personal Relationships, 7(3), 395–422. https://doi.org/10.1177/0265407590073006

Byrne, R. W. (2000). Evolution of primate cognition. Cognitive Science, 24(3), 543–570. https://doi.org/10.1207/s15516709cog2403_8

Carroll, J. (1995). Evolution and literary theory. Human Nature, 6(2), 119–134. https://doi.org/10.1007/BF02734174

Carroll, J. (2011). Reading human nature: Literary Darwinism in theory and practice. State University of New York Press.

Carroll, J. (Ed.) (2012). Graphing Jane Austen: The evolutionary basis of literary meaning. Palgrave Macmillan.

Dahmardeh, M., & Dunbar, R. I. M. (2017). What shall we talk about in Farsi? Content of everyday conversations in Iran. Human Nature, 28(4), 423–433. https://doi.org/10.1007/s12110-017-9300-4

DeLouize, A. M., Coolidge, F. L., & Wynn, T. (2017). Dopaminergic systems expansion and the advent of Homo erectus. Quaternary International, 427, 245–252. https://doi.org/10.1016/j.quaint.2015.10.123

Dunbar, R. I. M. (1998a). Grooming, gossip, and the evolution of language (1st Harvard University Press paperback ed.). Harvard University Press.

Dunbar, R. I. M. (1998b). The social brain hypothesis. Evolutionary Anthropology: Issues, News, and Reviews, 6(5), 178–190. https://doi.org/10.1002/(SICI)1520-6505(1998)6:5<178::AID-EVAN5>3.0.CO;2-8

Dunbar, R. I. M. (2009). The social brain hypothesis and its implications for social evolution. Annals of Human Biology, 36(5), 562–572. https://doi.org/10.1080/03014460902960289

Dunbar, R. I. M. (2010). How many friends does one person need? Dunbar’s number and other evolutionary quirks. Harvard University Press.

Dunbar, R. I. M., Marriott, A., & Duncan, N. D. C. (1997). Human conversational behavior. Human Nature, 8(3), 231–246. https://doi.org/10.1007/BF02912493

Dunbar, R. I. M., & Shultz, S. (2023). Four errors and a fallacy: Pitfalls for the unwary in comparative brain analyses. Biological Reviews, 98(4), 1278–1309. https://doi.org/10.1111/brv.12953

Epstein, R. A., Patai, E. Z., Julian, J. B., & Spiers, H. J. (2017). The cognitive map in humans: Spatial navigation and beyond. Nature Neuroscience, 20(11), 1504–1513. https://doi.org/10.1038/nn.4656

Gamer, M., Lemon, J., & Singh, I. F. P. (2019). Irr: Various coefficients of interrater reliability and agreement (Version 0.84.1) [Computer software]. https://cran.r-project.org/web/packages/irr/index.html

Gottschall, J. (2005). The heroine with a thousand faces: Universal trends in the characterization of female folk tale protagonists. Evolutionary Psychology, 3(1), 147470490500300. https://doi.org/10.1177/147470490500300108

Gumperz, J. J., & Tannen, D. (1979). Individual and social differences in language use. In Individual differences in language ability and language behavior (pp. 305–325). Elsevier. https://doi.org/10.1016/B978-0-12-255950-1.50024-X

Hannerz, U. (1967). Gossip, networks and culture in a black American ghetto. Ethnos, 32(1–4), 35–60. https://doi.org/10.1080/00141844.1967.9980988

Heritage, J. (2012). Epistemics in conversation. In Sidnell, J., & Stivers, T. (Eds.), The handbook of conversation analysis (1st ed., pp. 370–394). Wiley. https://doi.org/10.1002/9781118325001.ch18

Heritage, J., & Stivers, T. (2012). Conversation analysis and sociology. In Sidnell, J., & Stivers, T. (Eds.), The handbook of conversation analysis (1st ed., pp. 657–673). Wiley. https://doi.org/10.1002/9781118325001.ch32

Hyde, J. S., & Linn, M. C. (1988). Gender differences in verbal ability: A meta-analysis. Psychological Bulletin, 104(1), 53–69. https://doi.org/10.1037/0033-2909.104.1.53

Kaplan, S. (1992). Environmental preference in a knowledge-seeking, knowledge-using organism. In Barkow, J. H., Cosmides, L., & Tooby, J. (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 581–598). Oxford University Press.

Kimble, C. E., & Musgrove, J. I. (1988). Dominance in arguing mixed-sex dyads: Visual dominance patterns, talking time, and speech loudness. Journal of Research in Personality, 22(1), 1–16. https://doi.org/10.1016/0092-6566(88)90021-9

Krems, J. A., Dunbar, R. I. M., & Neuberg, S. L. (2016). Something to talk about: Are conversation sizes constrained by mental modeling abilities? Evolution and Human Behavior, 37(6), 423–428. https://doi.org/10.1016/j.evolhumbehav.2016.05.005

Kurland, N. B., & Pelled, L. H. (2000). Passing the word: Toward a model of gossip and power in the workplace. The Academy of Management Review, 25(2), 428. https://doi.org/10.2307/259023

Leaper, C., & Ayres, M. M. (2007). A meta-analytic review of gender variations in adults’ language use: Talkativeness, affiliative speech, and assertive speech. Personality and Social Psychology Review, 11(4), 328–363. https://doi.org/10.1177/1088868307302221

Leaper, C., & Smith, T. E. (2004). A meta-analytic review of gender variations in children’s language use: Talkativeness, affiliative speech, and assertive speech. Developmental Psychology, 40(6), 993–1027. https://doi.org/10.1037/0012-1649.40.6.993

Luginbühl, M., Mundwiler, M. V., Kreuz, J., Müller-Feldmeth, D., & Hauser, (2021). Quantitative and qualitative approaches in conversation analysis: Methodological reflections on a study of argumentative group discussions. Gesprächsforschung - Online-Zeitschrift zur verbalen Interaktion, 22, 179–236. Available online at: http://www.gespraechsforschung-online.de/fileadmin/dateien/heft2021/ga-luginbuehl.pdf

Marino, L. (1998). A comparison of encephalization between odontocete cetaceans and anthropoid primates. Brain, Behavior and Evolution, 51(4), 230–238. https://doi.org/10.1159/000006540

Mesoudi, A., Whiten, A., & Dunbar, R. (2006). A bias for social information in human cultural transmission. British Journal of Psychology, 97(3), 405–423. https://doi.org/10.1348/000712605X85871

Milton, K. (1988). Foraging behaviour and the evolution of intellect in monkeys, apes and humans. In Byrne, R. W. & Whiten, A. (Eds.), Machiavellian intelligence: Social expertise and the evolution of intellect in monkeys, apes and humans (pp. 285–305). Clarendon Press.

Nicol, C. J. (1995). The social transmission of information and behaviour. Applied Animal Behaviour Science, 44(2–4), 79–98. https://doi.org/10.1016/0168-1591(95)00607-T

Nowak, M. A., & Sigmund, K. (2005). Evolution of indirect reciprocity. Nature, 437(7063), 1291–1298. https://doi.org/10.1038/nature04131

Ohtsuki, H., & Iwasa, Y. (2006). The leading eight: Social norms that can maintain cooperation by indirect reciprocity. Journal of Theoretical Biology, 239(4), 435–444. https://doi.org/10.1016/j.jtbi.2005.08.008

Orians, G. H., & Heerwagen, J. H. (1992). Evolved responses to landscapes. In Barkow, J. H., Cosmides, L., & Tooby, J. (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 555–579). Oxford University Press.

Pęzik, P. (2012). Jezyk mówiony w NKJP [Spoken language in National Corpus of Polish Language]. In Przepiórkowski, A., Bańko, M., Górski, R., & Lewandowska-Tomaszczyk, B. (Eds.), Narodowy Korpus Jezyka Polskiego [National Corpus of Polish Language] (pp. 37–47). Wydawnictwo Naukowe PWN.

Pęzik, P. (2014). Spokes – a search and exploration service for conversational corpus data. In CLARIN 2014: Soesterberg, The Netherlands. Linköping Electronic Conference Proceedings. www.ep.liu.se/ecp/116/009/ecp15116009.pdf

Pleyer, M. (2023). The role of interactional and cognitive mechanisms in the evolution of (proto)language(s). Lingua, 282, 103458. https://doi.org/10.1016/j.lingua.2022.103458

Redhead, G., & Dunbar, R. I. M. (2013). The functions of language: An experimental study. Evolutionary Psychology, 11(4), 845–854. https://doi.org/10.1177/147470491301100409

Saunders, J. P. (2015). Darwinian literary analysis of sexuality. In Shackelford, T. K. & Hansen, R. D. (Eds.), The evolution of sexuality (pp. 29–55). Springer International Publishing. https://doi.org/10.1007/978-3-319-09384-0_2

Saville-Troike, M. (2003). A sociolinguistic perspective. In Lange, D. L. & Paige, M. (Eds.), Culture as the core: Perspective on culture in second language education. Information Age Publishing.

Shilton, D. (2019). Is language necessary for the social transmission of lithic technology? Journal of Language Evolution, 4(2), 124–133. https://doi.org/10.1093/jole/lzz004

Shultz, S., & Dunbar, R. I. M. (2022). Socioecological complexity in primate groups and its cognitive correlates. Philosophical Transactions of the Royal Society B: Biological Sciences, 377(1860), 20210296. https://doi.org/10.1098/rstb.2021.0296

Speth, J. D. (2010). The paleoanthropology and archaeology of big-game hunting. Springer. https://doi.org/10.1007/978-1-4419-6733-6

Stefanczyk, M. M. (2024). People declare lowered levels of sociosexual desire in the presence of an attractive audience. Archives of Sexual Behavior, 53(3), 879–887. https://doi.org/10.1007/s10508-023-02753-w

Stefanczyk, M. M., Conroy-Beam, D., Ujma, B., Walter, K. V., Zborowska, Z., & Sorokowska, A. (2024). Disgust in the mating context – Choosing the best and the least bad self-presentation option in a date simulation game. Telematics and Informatics, 92, 102159. https://doi.org/10.1016/j.tele.2024.102159

Stefanczyk, M. M., Lizak, K., Kowal, M., & Sorokowska, A. (2022). “May I present you: My disgust!” – Declared disgust sensitivity in the presence of attractive models. British Journal of Psychology, 113(3), 739–757. https://doi.org/10.1111/bjop.12556

Stolarova, M., Wolf, C., Rinker, T., & Brielmann, A. (2014). How to assess and compare inter-rater reliability, agreement and correlation of ratings: An exemplary analysis of mother-father and parent-teacher expressive vocabulary rating pairs. Frontiers in Psychology, 5, 1–13. https://doi.org/10.3389/fpsyg.2014.00509

Szala, A., Placiński, M., Poniewierska, A., Szczepańska, A., & Wacewicz, S. (2022). How much language use is actually on social topics? [Application/pdf]. In Ravignani, A., Asano, R., Valente, D., Ferretti, F., Hartmann, S., Hayashi, M., Jadoul, Y., Martins, M., Oseki, Y., Rodrigues, E. D., Vasileva, O., & Wacewicz, S. (Eds.), The evolution of language: Proceedings of the joint conference on language evolution (JCoLE) (Version 2, pp. 705–707). Joint Conference on Language Evolution (JCoLE). https://pure.mpg.de/pubman/item/item_3398549

Szala, A., Placiński, M., Zywiczynski, P., Poniewierska, A., Schmeichel, A., & Wacewicz, S. (2023). How much language use is actually on social topics: Human conversational behavior revisited. https://doi.org/10.17605/OSF.IO/KJF4E

Szlendak, T., & Kozłowski, T. (2008). Naga małpa przed telewizorem: Popkultura w świetle psychologii ewolucyjnej [Naked ape in front of the TV. Pop Culture in the light of evolutionary psychology]. Wydawnictwa Akademickie i Profesjonalne.

Tannen, D., & Wallat, C. (1983). Doctor/mother/child communication: Linguistic analysis of a pediatric interaction. In Fisher, S. & Todd, A. D. (Eds.), The social organization of doctor-patient communication. Washington, DC: Center for Applied Linguistics.

Tilston, O., Bangerter, A., & Tylén, K. (2022). Teaching, sharing experience, and innovation in cultural transmission. Journal of Language Evolution, 7(1), 81–94. https://doi.org/10.1093/jole/lzac007

Tooby, J., & Cosmides, L. (1992). The psychological foundations of culture. In Barkow, J. H., Cosmides, L., & Tooby, J. (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 19–136). Oxford University Press.

Tooby, J., & Cosmides, L. (2001). Does beauty build adapted minds? Toward an evolutionary theory of aesthetics, fiction and the arts. SubStance, 30(1/2), 6. https://doi.org/10.2307/3685502

Vaesen, K. (2012). The cognitive bases of human tool use. Behavioral and Brain Sciences, 35(4), 203–218. https://doi.org/10.1017/S0140525X11001452

Wacewicz, S. (2015). The shades of social. A discussion of “The social origins of language”. In Dor, D., Knight, C., & Lewis, J. (Eds.), Theoria et Historia Scientiarum (Vol. 11, p. 191). https://doi.org/10.12775/ths-2014-011

Whiten, A., & Byrne, R. W. (1988). The Machiavellian intelligence hypotheses: Editorial. In Byrne, R. W. & Whiten, A. (Eds.), Machiavellian intelligence: Social expertise and the evolution of intellect in monkeys, apes, and humans (pp. 1–9). Clarendon Press/Oxford University Press.

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis (Vol. 2016, 2nd ed.). Springer International Publishing. https://doi.org/10.1007/978-3-319-24277-4

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T., Miller, E., Bache, S., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., … Yutani, H. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Wiessner, P. W. (2014). Embers of society: Firelight talk among the Ju/’hoansi Bushmen. Proceedings of the National Academy of Sciences, 111(39), 14027–14035. https://doi.org/10.1073/pnas.1404212111

Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloeties, H. (2006). ELAN: A professional framework for multimodality research. In 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1556–1559). European Language Resources Association.

Yeomans, M., Boland, F. K., Collins, H. K., Abi-Esber, N., & Brooks, A. W. (2023). A practical guide to conversation research: How to study what people say to each other. Advances in Methods and Practices in Psychological Science, 6(4), 25152459231183919. https://doi.org/10.1177/25152459231183919
