Diversity, networks, and innovation: A text analytic approach to measuring expertise diversity

Alina Lungeanu; Ryan Whalen; Y. Jasmine Wu; Leslie A. DeChurch; Noshir S. Contractor

doi:10.1017/nws.2022.34

Published online by Cambridge University Press: 15 December 2022

Alina Lungeanu*: Northwestern University, Evanston, IL, USA
Ryan Whalen: The University of Hong Kong, Pokfulam, Hong Kong
Y. Jasmine Wu: Northwestern University, Evanston, IL, USA
Leslie A. DeChurch: Northwestern University, Evanston, IL, USA
Noshir S. Contractor: Northwestern University, Evanston, IL, USA
*Corresponding author. Email: [email protected]

Abstract

Despite the importance of diverse expertise in helping solve difficult interdisciplinary problems, measuring it is challenging and often relies on proxy measures and presumptive correlates of actual knowledge and experience. To address this challenge, we propose a text-based measure that uses researchers’ prior work to estimate their substantive expertise. These expertise estimates are then used to measure team-level expertise diversity by determining similarity or dissimilarity in members’ prior knowledge and skills. Using this measure on 2.8 million team-invented patents granted by the US Patent Office, we show evidence of trends in expertise diversity over time and across team sizes, as well as its relationship with the quality and impact of a team’s innovation output.

Keywords

inventor networks; network science; text analytics; innovation; patent records; team science; team expertise diversity

Type: Research Article
Information: Network Science, Volume 11, Special Issue 1: Scientific Networks, March 2023, pp. 36-64

DOI: https://doi.org/10.1017/nws.2022.34
Copyright: © The Author(s), 2022. Published by Cambridge University Press

1. Introduction

Innovation in science and technology is increasingly the province of teams (Wuchty et al., 2007). As knowledge becomes ever more specialized, teams are needed to tackle complex problems with solutions requiring insight from multiple domains (Jones, 2009). However, teams are fundamentally social entities. Realizing the benefits that diverse teams have in solving hard problems requires combining the often-disparate knowledge of each team member.

Although scientific team composition and diversity have been ongoing areas of policy concern and research focus, there are few established methods that can be used to measure and analyze team expertise diversity accurately and at scale. In this paper, we help address the need for improved measurement of team expertise diversity by proposing a new and more precise measure. To do so, we leverage researchers’ collaboration networks together with the text of researchers’ output to identify patterns in the way scientific teams relate, extend, integrate, and juxtapose the breadth and depth of their prior knowledge. Specifically, we create measures of expertise¹ and diversity that provide insight into not only which research areas individual team members have expertise in but also the degree to which those areas of expertise are similar or dissimilar to one another. To operationalize our measure, we draw on United States Patent and Trademark Office (USPTO) patenting data, which provide networks and textual insight into applied scientific and technical research teams. These data include the full text of more than 6 million patents granted in the USA since 1976, as well as data on the citations between them, and the inventors and teams responsible for them.

Using this rich source of textual data on researchers’ output as well as information about their collaboration networks, we develop new methodological techniques to analyze scientific teams and the networks that underlie them. Using this approach to measure expertise diversity, we found that team expertise diversity steadily increased between 1976 and 1996, and that it has remained relatively constant since then. We also found that the marginal increase in expertise diversity shrinks as team members are added, and that, on average, expertise diversity appears to plateau at about eight team members. Finally, our novel measure of team expertise diversity is a reliable predictor of team innovation, specifically innovation atypicality and success. Our study helps guide future research by providing both novel empirical insight into expertise diversity and methodological approaches to understanding the evolution of scientific networks over time.

2. Team expertise diversity

2.1 Team expertise diversity and innovation

While collaboration has long been important to scientific practice (Cummings & Kiesler, 2005; Finholt & Olson, 1997), recent research has amplified that importance in two ways: first, by providing evidence that the most impactful work is created by teams (Börner et al., 2010; Wuchty et al., 2007), and second, by showing that teams exhibiting diversity in knowledge are especially likely to produce highly innovative works (Uzzi et al., 2013). For these two reasons, cross-boundary team science has become a coveted undertaking in academic institutions, research funding agencies, and private firms.

Despite their increasing popularity (Whalen, 2018), the formation and maintenance of cross-boundary teams face several inherent challenges (Cummings & Kiesler, 2008). Take for example a simple cross-boundary arrangement where expertise diversity arises because team members are from different disciplines and thus have varied knowledge and beliefs. Although such interdisciplinary teams often struggle to find common ground (Hall et al., 2018; McCorcle, 1982; Wagner et al., 2011), they are nonetheless the epitome of cross-boundary collaboration. Indeed, there is an abundance of research demonstrating that people shy away from connecting with those who are different (Byrne & Griffitt, 1973; McPherson et al., 2001; Montoya & Horton, 2013).

A team’s diversity can be conceptualized along many dimensions. However, when the output of concern is the degree of the team’s innovation, one of the most commonly used diversity dimensions is the members’ diverse expertise. Expertise is the “specialized skills and knowledge that people bring to the team’s task” (Faraj & Sproull, 2000, p. 1555). Indeed, as Bruns (2013) states, innovation requires cross-domain or cross-functional collaboration and thus unique capabilities that can only be developed by bringing together diverse specializations. Collaboration among diversely specialized parties is appropriate for tasks requiring unique types of knowledge that one party could not develop alone (Cummings & Kiesler, 2007). Recent research in team science has argued that teams not only need to span scientific specialties in their search for novel ideas but also be effective at combining knowledge (Uzzi et al., 2013). For example, models of creativity suggest that innovation is spurred through boundary-spanning combinations that spark new insights. While new ideas tend overwhelmingly to be found through boundary-spanning combinations rather than within one’s field of expertise, the ever-expanding size and complexity of scientific knowledge push scientists to specialize narrowly and thus make it harder for them to search for knowledge in unfamiliar expertise domains (Fleming, 2001; Jones, 2009; Schilling & Green, 2011).

While studies in the diversity literature strongly suggest that innovation results from the increased range of knowledge, skills, and perspectives that a diverse team confers (O’Reilly III et al., 1998), the literature has tended to use ever-expanding definitions of expertise and varieties of proxies to capture the expertise diversity construct. These range from “surface-level” characteristics such as age, gender, and race (Harrison et al., 2002; Jehn et al., 1999) to more “deep-level” characteristics meant to capture cognitive diversity (Taylor & Greve, 2006) or to bibliometric measures that proxy expertise diversity indirectly by inferring rather than directly observing collaboration across fields (Mukherjee et al., 2017; Uzzi et al., 2013). Of course, such an expansion of terms and operationalizations comes with a drawback. Specifically, the relationships that these various measures of expertise or knowledge diversity have with scientific innovation are not all congruent (Harrison & Klein, 2007; Horwitz & Horwitz, 2007; Joshi & Roh, 2009), suggesting that proxies for team expertise diversity may not all be conceptually relevant to the outcome variable of scientific innovation. We discuss these measures and the challenges they prompt below.

2.2 Team expertise diversity constructs

Team size is often used as a proxy for team expertise diversity, under the assumption that a larger number of members increases the likelihood that some of those members will have different cognitive strategies and career experiences. In turn, these traits are thought to lead to variation in knowledge and problem-solving approaches (Taylor & Greve, 2006; West & Anderson, 1996). However, it is by no means true that all large teams exhibit high expertise diversity, nor that small teams do not.

In the context of cross-boundary scientific teams, team diversity has also been inferred by using members’ career stage or years in the scientific community. The presence of a balanced mix of younger and more seasoned researchers should result in the expression of bold ideas (Horwitz, 2005) while also enabling the evaluation of those ideas and advocacy for their adoption by the community of researchers. In interdisciplinary scientific teams with a mix of young and senior researchers, the research appetite of younger researchers benefits from the research prowess, experience, resources, and prestige of more senior researchers (Hinnant et al., 2012). Relatedly, there is evidence that age diversity (rather than years in the scientific community) brings a wider range of perspectives and experiences that improve team decision quality (Cox & Blake, 1991; Horwitz, 2005; Pelled, 1996). In both cases, however, such proxies for diversity are unlikely to correlate well with diversity in expertise, because teams exhibiting a high degree of career-stage diversity are often still composed of members from the same scientific area.

Expertise diversity has also been proxied using the degree of experience that team members have in working with one another (Taylor & Greve, 2006). Teams with many prior collaborative projects are more likely to develop standardized operation practices, which result in higher performance and quality of outputs (Gilson et al., 2005). These repeat collaborations result in cohesiveness and predictability (Guimera et al., 2005). Scholars have argued that cohesiveness results in an uninterrupted exchange of ideas (Coleman, 1988; Hansen, 1999; Reagans & McEvily, 2003), particularly for tacit and complex knowledge, and in an increase in team performance. On the other hand, partnering with other scientists that have been trained differently, work in different areas, or use different techniques in their work may offer the greatest innovation potential, whereas cohesiveness can suffer due to a high degree of information redundancy within the team. Thus, despite arguments that cohesive teams benefit because information flows more easily amongst team members, cohesiveness can also impede a team’s creativity and ability to innovate.

Structural diversity, defined as variation in members’ organizational affiliations, roles, or geographical locations (Van den Bulte & Moenaert, 1998), has also been used as a proxy for expertise or knowledge diversity. Specifically, it has been argued that teams whose members are geographically dispersed are likely to be exposed to different information and knowledge because of individuals’ embeddedness in different social networks (Cummings, 2004; Monge et al., 1985). However, because geographically distant team members may have similar areas of expertise, such a proxy for diversity is unlikely to correlate well with diversity in expertise.

Finally, in studies examining scientific collaboration, network research has produced several expertise diversity measures using citation patterns. For example, Lungeanu et al. (2014) showed that scientists working with other scientists they have cited in the past are less likely to produce innovative research due to a focus on conventionality rather than novelty. Furthermore, Uzzi et al. (2013) showed that atypical combinations of citations suggest that ideas from two entirely different domains are seeding a new idea. This might be more likely to happen when scientists from different domains, who are versed in different kinds of literature, put their heads together to solve a scientific puzzle (Lungeanu & Contractor, 2015).

While the studies outlined above have advanced our understanding of the links between diversity in expertise and scientific innovation, they also share two limitations. First, because they are built on the assumption that heterogeneity in individual attributes (rather than an individual’s actual knowledge) is an accurate representation of expertise diversity, these proxy expertise diversity measures offer only crude approximations of the actual knowledge heterogeneity within diverse teams. Second, while one can argue that using proxy measures is appropriate absent the tools for accurate measurement of a given construct, expertise diversity proxies are unable to capture individual expertise in a nuanced manner, and in particular, struggle to estimate the degree of similarity or dissimilarity of knowledge held by the individuals to whom these diversity measures apply. For example, consider measures of expertise diversity that rely on citation patterns to infer whether a scientific innovation results from combining general ideas from different domains. In general, this approach is unable to accurately represent the specific expertise held by members of the scientific team. A measure of expertise diversity that can account for both the general domain of expertise and the specific expertise of a scientific team’s members can overcome many of the hurdles encountered in accurately representing the knowledge held by individual team members.

2.3 Studying team expertise diversity

In this paper, we address the need for the improved measurement of expertise diversity by leveraging the time-tested methodology used in mapping the network of prior collaborations combined with natural language processing techniques often used in corpus linguistics (Pollach, 2012). Combining the analysis of collaboration networks with the textual analysis of researchers’ output allows us to identify patterns in the way scientific teams are relating, extending, integrating, and juxtaposing the breadth and depth of their prior knowledge. This novel measure of team expertise diversity provides insight into not only which research areas individual team members have expertise in but also the degree to which those areas of expertise are similar or dissimilar. To do so, we leverage a large patent database that provides not just metadata about patents but also the text describing the outputs generated by teams.

In recent decades, science and innovation policy have advocated for increased interdisciplinarity and diversity in research projects (National Academy of Sciences et al., 2005). Meanwhile, there has been a simultaneous increase in the size of research teams, and in the research impact that these large teams generate (Milojević, 2014; Wuchty et al., 2007). However, it remains unclear precisely how these two trends interact with the degree to which the teams that increasingly generate interdisciplinary science are themselves composed of members with diverse areas of expertise. To provide insight into this, we examine trends in team expertise diversity over time and across team sizes through the following two-part research question:

RQ1(a): How does team expertise diversity vary over time?

RQ1(b): How does team expertise diversity vary across different team sizes?

2.3.1 Relation with known constructs of team diversity and coherence

Our proposed measure of team expertise diversity is based on an analysis of individual team members’ prior work and language analysis techniques applied to their actual research output (i.e., their patents). While the measure of expertise diversity is both a novel and perhaps more accurate representation of the expertise diversity of the team, we are also mindful that the relevance of the new measure is partially conditioned by the extent of its conceptual relatedness to proxies used in the past. Therefore, we also examine the correlation between team expertise diversity and proxies that reflect team characteristics that are either similar or dissimilar to the diversity construct.

In terms of measures that are similar to our proposed metric, we are interested in those that reflect the homogeneity of attributes. For example, team coherence describes factors that stabilize the team, provide members with predictability in their teammates, and hasten the development of needed shared cognitive properties of teams (DeChurch & Mesmer-Magnus, 2010; Hinds et al., 2000). Prior work identifies two ways team design fosters coherence: familiarity and homophily. Team familiarity is the extent to which members have worked together previously (Littlepage et al., 1997). Previous research finds that familiarity through prior collaboration predicts the success of project teams (Harrison et al., 2003), software teams (Espinosa et al., 2007; Huckman et al., 2009), movie-making teams (Cattani et al., 2013), and sports teams (Mukherjee et al., 2019; Sieweke & Zhao, 2015). Homophily is another mechanism supporting high-quality team interaction (Hinds et al., 2000; Reagans et al., 2004). The psychological mechanisms of familiarity and homophily are similar: heightened trust, reduced coordination costs, and transactive memory systems allowing team members to efficiently source one another’s expertise (Littlepage et al., 1997).

In terms of measures that are dissimilar to our proposed metric, we are interested in those that reflect heterogeneity in attributes. For example, team diversity represents the distribution of differences among members of a team with respect to a common attribute (Harrison & Klein, 2007). Team diversity is a team-level construct that considers members’ attributes in relation to one another. The literature on team diversity has been subject to at least five meta-analyses (Bell, 2007; Bell et al., 2011; Horwitz & Horwitz, 2007; Stahl et al., 2010; Webber & Donahue, 2001). An important distinction is that between “surface” and “deep” level diversity. Surface-level diversity refers to differences in demographic variables. In scientific teams, team demographic diversity is the degree to which team members are different in terms of surface-level, visible, background characteristics such as career stage, gender, institutional affiliation, and national affiliation. In contrast, deep-level diversity, also called functional diversity, refers to differences based on ideas, values, or information. In scientific teams, team expertise diversity captures the degree of variation between team members in their areas of expertise. Research on team diversity shows that deep-level diversity has stronger positive effects on team outcomes than does surface diversity (Bell, 2007; Horwitz & Horwitz, 2007; Webber & Donahue, 2001). We formalize our second inquiry as follows:

RQ2(a): Does team expertise diversity correlate with known measures of team diversity and coherence?

RQ2(b): Can we predict team expertise diversity by examining known measures of team diversity and coherence?

2.3.2 Effect on the atypicality and the impact of a team’s output

Establishing an improved measure of team expertise diversity and a method for its operationalization is relevant to the extent that it can be used to predict team innovation and the quality of that innovation. We examine these in terms of a team’s innovation atypicality (i.e., atypical knowledge combination) and success (i.e., citation rates).

Recent research demonstrates that a scientific development’s impact is partially a function of the degree to which it mixes infrequently combined knowledge inputs. This has been demonstrated in scientific journal articles that combine infrequently combined sets of sources (Lee et al., 2015; Uzzi et al., 2013), in the combinations of chemicals that researchers choose to focus on (Shi et al., 2015), and in the way technologies are combined in patented inventions (Fleming, 2001).

Although it seems clear that the diversity of knowledge that teams combine is an important factor in determining their likelihood of producing high-impact research output, it is less clear how the diversity of the team itself might influence its tendency to do so. There is reason to believe that teams with greater expertise diversity will benefit and be more likely to produce research that combines atypically combined inputs. Teams with diverse task experience will have greater access to diverse knowledge (Hong & Page, 2004; Lee et al., 2015; Taylor & Greve, 2006), perhaps making it easier for them to recombine their diverse areas of expertise in ways not usually done. On the other hand, diverse expertise is not without cost. Teams exhibiting a high degree of expertise diversity may face greater challenges in communicating with one another and effectively collaborating (Cummings & Kiesler, 2007). This leads us to ask our third research question:

RQ3(a): Does team expertise diversity correlate with the atypicality of a team’s output?

There is a variety of work suggesting that team expertise diversity correlates favorably with the impact of the team’s output. For example, studies have found that teams with more authors produce more highly cited work than solo authors or smaller teams (Wuchty et al., 2007). However, while teams, and larger teams in particular, produce work that is cited more, they also produce less of that work (Cummings et al., 2013; Leahey et al., 2017). Other findings show that scientists have predictable tendencies in how they form teams: they add members up to a point, they tend to repeat prior collaborations, and they add newcomers to the team (Guimera et al., 2005). Also, the atypicality and diversity embodied in the work determine, at least in part, the work’s eventual impact. For example, citing recent work, along with a few very old citations, is associated with a paper having a higher impact (Mukherjee et al., 2017). Similarly, citing studies that have rarely been cited together before, along with studies that were frequently cited together in the past, is also associated with high impact (Uzzi et al., 2013). Together, these findings suggest that teams with members who each have different expertise are more open and receptive to boundary-crossing ideas in general. In the context of our study, we then ask:

RQ3(b): Does team expertise diversity correlate with the impact of a team’s output?

3. Method

3.1 Dataset: USPTO database

We turn to the patent record to provide evidence of researcher expertise and team composition. Patents are drafted to describe the claimed invention both to disclose that invention to the public and to put others on notice about the bounds of the intellectual property claimed by the patent owner. As such, the text contained in patents generally includes an extensive description of the invention. This description can be used to estimate the expertise of those who created the invention. For example, if an inventor collaborates on a patent describing an invention for a new type of cancer drug, we can assume that she has a degree of expertise in cancer pharmacology. To estimate researcher expertise, we used patent text data from the USPTO.² This includes data on all patents granted from 1976 to mid-2018 and contains the textual descriptions of the inventions, the language that precisely describes what is being claimed by the patent, as well as metadata about the inventors and the technological categorization that the USPTO assigned to the invention. Each patent is assigned to one of the eight main scientific areas of the cooperative patent classification (CPC) scheme, from “A-Human necessity” to “H-Electricity.” Additionally, each patent can be assigned to a ninth area, “Y-New technological developments.” In our study, we included all 2,781,797 patents that are co-invented by 3,833,204 inventors as well as their 6,704,707 prior inventions.

3.2 Team expertise diversity metric

To estimate the expertise diversity of a team, we need some way to measure how similar or dissimilar the members’ areas of expertise are. Those teams with relatively little difference in their members’ areas of expertise have less expertise diversity than teams made up of members with widely divergent expertise. To do this, we can compare the text of the team members’ inventions to estimate how similar or dissimilar their areas of expertise, as demonstrated by their inventing histories, are to one another.

There are a wide variety of text similarity measures of varying degrees of sophistication. Most of them involve representing a document in a vector space, where each document is represented by a set of coordinates in some n-dimensional space. This can be done using the large and sparse vectors created by relatively simple approaches like a bag-of-words or TF-IDF method, or the reduced dimensional vectors produced by models like LSI, LDA, or Doc2Vec (Milojević, 2015; Milojević et al., 2011). Other approaches such as BERT leverage pre-trained models built on large input datasets. Here, we use Doc2Vec because it allows us to train our own model suited to the idiosyncrasies of patent text and because the resulting reduced-dimension vector representation of the text allows us to more easily and accurately perform operations on multiple vectors (Le & Mikolov, 2014; Mikolov et al., 2013). Doc2Vec is an extension of the Word2Vec model, which itself uses a three-layer neural network to predict words based on their context. Doc2Vec extends this by adding document-level nodes in addition to the word nodes used in Word2Vec, which allows one to embed entire documents in the vector space estimated by the model. To produce the model used below, we use the text from both the description and independent claims of all of the utility patents in our dataset.³ This model allows us to embed each patent in 300-dimensional space.
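To make the idea of representing documents in a vector space concrete, the following is a minimal sketch of the simpler sparse TF-IDF approach mentioned above. It is not the Doc2Vec model used in this study. The three one-line texts, the whitespace tokenizer, the unsmoothed log(N/df) weighting, and the function names `tfidf_vectors` and `cosine_similarity` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document.

    Illustrative only: whitespace tokenization and unsmoothed
    idf = log(N / df), so a term found in every document gets weight 0.
    """
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(
        term for tokens in tokenized.values() for term in set(tokens)
    )
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors (assumes neither is all zeros)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)


# Hypothetical one-line stand-ins for patent texts.
docs = {
    "cup_lid": "coffee cup lid sealing mechanism",
    "cup_heater": "coffee cup heating device",
    "nano": "nanomaterial coating deposition process",
}
vectors = tfidf_vectors(docs)
sim_cups = cosine_similarity(vectors["cup_lid"], vectors["cup_heater"])
sim_mixed = cosine_similarity(vectors["cup_lid"], vectors["nano"])
```

In this toy corpus the two cup texts share vocabulary and so have positive similarity, while the cup and nanomaterial texts share no terms at all. A sparse representation of this kind can only register exact word overlap, which is one reason a learned, dense embedding may be preferable for patent text.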

Determining team expertise diversity first requires identifying each team member’s expertise and subsequently comparing them to one another. To do this, we first identify each of the patent’s inventors, and for each inventor each of his or her previous inventions.⁴ We then calculate each inventor’s “average expertise” by taking the mean of the model embeddings for each inventor’s previous inventions. These can be thought of as the location in the model space that represents that inventor’s “average prior invention.”
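The averaging step can be sketched as a component-wise mean over an inventor’s prior-patent embeddings. This is a minimal illustration and not the authors’ code: the function name `average_expertise` is hypothetical, and the 3-dimensional vectors are made-up stand-ins for the 300-dimensional embeddings described above.

```python
from statistics import fmean


def average_expertise(prior_embeddings):
    """Component-wise mean of an inventor's prior-patent embeddings."""
    if not prior_embeddings:
        raise ValueError("inventor has no prior patents to average")
    # zip(*...) groups the first components together, then the second, etc.
    return [fmean(component) for component in zip(*prior_embeddings)]


# Two hypothetical prior patents, embedded in a toy 3-dimensional space.
prior = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
expertise = average_expertise(prior)  # [2.0, 1.0, 1.0]
```

Because the mean weights every prior patent equally, an inventor with many patents in one area and a single patent elsewhere will sit close to the first area in the embedding space.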

Using the average expertise vectors for each inventor, we then calculate the pairwise distances between each team member by taking the cosine distance between their vectors. This allows us to determine how “close” or “distant” from one another each inventor’s prior inventing experience is. Inventors who have previously worked on very similar inventions will have a low distance between their average expertise vectors, whereas those who have diverse inventing experience will have a higher distance. As an example, consider a collaborative invention for a new style of coffee cup with a nanomaterial coating. Inventor A has prior experience patenting a coffee cup lid sealing mechanism, B has prior experience patenting a coffee cup heating device, and C has prior patents covering nanomaterials. We average the text embeddings for each inventor’s prior patents to determine their general areas of experience and then take the cosine distance between these three points to determine how similar or dissimilar they are from one another. In this situation, because of their cup inventing histories, the embedding vectors of inventors A and B will be quite similar to one another, and the largest degrees of distance between inventors will be between A-C or B-C.
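The coffee cup example can be sketched numerically as follows. The three expertise vectors are invented for illustration (a real model would supply 300-dimensional averages), and the names `cosine_distance` and `pairwise` are assumptions rather than anything taken from the study.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 minus the cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Made-up average-expertise vectors: A and B have cup-related histories,
# C has a nanomaterials history.
expertise = {
    "A": [0.9, 0.1, 0.0],
    "B": [0.8, 0.2, 0.1],
    "C": [0.1, 0.0, 0.9],
}

# One distance per unordered pair of team members.
pairwise = {
    (i, j): cosine_distance(expertise[i], expertise[j])
    for i, j in combinations(sorted(expertise), 2)
}
```

With these toy numbers the A-B distance is close to zero, while the A-C and B-C distances are much larger, mirroring the pattern described in the example.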

We take the maximum pairwise distance between inventors to be a team’s expertise diversity. This represents the degree to which the team brings together at least two individuals with diverse inventing histories. Teams featuring individuals with dissimilar inventing histories are likely to have distinct knowledge and are thus more likely both to face the coordination challenges and perhaps to enjoy the output benefits of what we refer to as “expertise diversity” above. In team task taxonomic terms, one can consider the process of inventing to be a “conjunctive task” that requires input from all team members (Steiner, 1972). Indeed, it is a legal requirement that each inventor listed on a patent must have made a non-trivial and inventive contribution. The conjunctive nature of the inventing task requires team members to navigate the differences that may arise due to their unique training, experience, and knowledge. As such, the maximum distance between team members’ expertise areas (rather than a measure of central tendency like the median or mean) reflects the extent to which the team needed to span or “conjoin” divergent subject areas and is used as our measure of expertise diversity.
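The coffee cup example can be turned into a small worked sketch of the measure: pairwise cosine distances between average expertise vectors, followed by the maximum. The vectors below are invented toy values in three dimensions, and the function names are hypothetical; they are meant only to show the shape of the calculation.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def team_expertise_diversity(expertise_by_inventor):
    """Maximum pairwise cosine distance among the team's expertise vectors."""
    pairs = combinations(expertise_by_inventor.values(), 2)
    return max(cosine_distance(u, v) for u, v in pairs)


# Toy average expertise vectors: A and B (cups) are close, C (nanomaterials) is far.
team = {
    "A": [0.9, 0.1, 0.0],
    "B": [0.8, 0.2, 0.0],
    "C": [0.0, 0.1, 0.9],
}
diversity = team_expertise_diversity(team)
```

In this toy team the A-B distance is close to 0 while the A-C distance is close to 1, so the team’s diversity score is driven by the A-C pair. The function needs at least two inventors with an expertise vector; with fewer, `max` raises a `ValueError`.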

Table 1 presents a graphical representation of computing team expertise diversity for a team of size 3.

Table 1. Computing team inter-personal expertise diversity

3.3 Team output measures

We measure team output atypicality using the network non-obviousness score (NNOS; Pedraza-Fariña & Whalen, 2020). NNOS measures the degree to which a team’s research output combines scientific or technical areas that are rarely combined. Research suggests that combining rarely combined fields can lead to a higher chance of producing breakthrough innovation (Uzzi et al., 2013), but that making atypical combinations is difficult to do effectively (Fleming, 2001). Here, we rely on the CPC data to measure the degree to which inventions represent typical or atypical combinations of technical areas. We first take each patent’s classification at the subgroup level. These include categorizations such as Manufacture of Dairy Products (A01J) or Nitrogenous Fertilizers (C05C). We then use the history of granted patents to calculate the probability of observing each pair of CPC subclassifications, and take the lowest probability as that patent’s NNOS because it represents the degree to which that patent combines rarely combined technical fields (Pedraza-Fariña & Whalen, 2020). To ease comparisons with our other metrics, we use 1-NNOS (that is, 1 minus the probability of observing the two least-frequently combined CPC subgroups) so that a higher score represents a more unlikely combination.⁵
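As a rough illustration of the 1-NNOS logic, the sketch below estimates pair probabilities from a toy patent history and scores two patents. The estimator used here (the share of historical patents that contain both codes) is an assumption made for illustration; the exact probability calculation is defined in Pedraza-Fariña & Whalen (2020) and may differ. The function names and the toy history are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_probabilities(history):
    """Assumed estimator: share of historical patents containing each code pair."""
    counts = Counter()
    for codes in history:
        counts.update(combinations(sorted(set(codes)), 2))
    total = len(history)
    return {pair: n / total for pair, n in counts.items()}


def atypicality(codes, probs):
    """Return 1 - NNOS, or None when the patent has fewer than two codes."""
    pairs = list(combinations(sorted(set(codes)), 2))
    if not pairs:
        return None  # undefined for single-classification patents
    nnos = min(probs.get(pair, 0.0) for pair in pairs)
    return 1.0 - nnos


# Toy classification history (codes used as labels only).
history = [
    ["A01J", "A23C"],
    ["A01J", "A23C"],
    ["A01J", "C05C"],
    ["C05C", "C05G"],
]
probs = pair_probabilities(history)
common_pair_score = atypicality(["A01J", "A23C"], probs)  # pair seen in 2 of 4
rare_pair_score = atypicality(["A01J", "C05C"], probs)    # pair seen in 1 of 4
```

The rarer combination receives the higher score, matching the intent that a higher 1-NNOS value represents a more unlikely combination.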

We measure team output impact using the number of citations. In the context of scientific research, measuring research impact using the number of citations a paper has received is a frequently used method to evaluate team success (e.g., Aksnes, 2006; Biscaro & Giupponi, 2014; Ebadi & Schiffauerova, 2015; Lee et al., 2015; Uzzi et al., 2013; Wuchty et al., 2007). In this study, we use the number of citations received within 8 years of patenting, and we take the natural logarithm to account for the skewed distribution. Because this measure requires an 8-year citation window, it is computed only for team-invented patents published between 1976 and 2010, for a total of 809,985 patents.
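A minimal sketch of this impact measure follows. Two simplifications are assumptions rather than details taken from the text: the window is computed on calendar years instead of exact dates, and `math.log1p` (the logarithm of one plus the count) is used so that patents with zero citations remain defined. The function name is hypothetical.

```python
import math


def log_citation_impact(grant_year, citing_years, window=8):
    """Log-scaled count of citations received within `window` years of grant."""
    in_window = sum(1 for year in citing_years if 0 <= year - grant_year <= window)
    # Assumption: log(1 + n) so that zero-citation patents are defined.
    return math.log1p(in_window)


# A hypothetical patent granted in 2000 and cited in the years listed.
impact = log_citation_impact(2000, [2001, 2003, 2008, 2009, 2015])
```

Only the citations from 2001, 2003, and 2008 fall inside the window, so the score is the logarithm of 1 + 3.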

3.4 Team diversity and coherence measures

3.4.1 Team size

We define team size as a simple count of the number of members in a team.

3.4.2 Surface-level diversity measures

We used the Blau (1977) index to calculate gender diversity. A high index score indicates greater gender diversity among team members. Next, we computed team experience diversity in two ways. The first measures experience as the number of years a team member has been patenting, and the second measures experience as the number of patents each team member had previously published. In both cases, we used the standard deviation of the experience measure across team members as the team’s patenting experience diversity.
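Both surface-level measures are simple to compute. The sketch below uses the standard Blau index, 1 minus the sum of squared category proportions, and a standard deviation over team members. Whether the population or the sample standard deviation was used is not stated above, so the choice of `pstdev` here is an assumption, as are the toy team values.

```python
from collections import Counter
from statistics import pstdev


def blau_index(categories):
    """Blau index: 1 minus the sum of squared category proportions."""
    n = len(categories)
    return 1.0 - sum((count / n) ** 2 for count in Counter(categories).values())


# Hypothetical four-person team.
genders = ["F", "M", "M", "M"]
years_patenting = [2, 2, 6, 6]

gender_diversity = blau_index(genders)
# Assumption: population standard deviation across team members.
experience_diversity = pstdev(years_patenting)
```

A team of one gender scores 0 on the Blau index, and a two-category team reaches the maximum of 0.5 when evenly split.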

3.4.3 Structural diversity measures

The multiple organizations measure is a dummy variable computed based on the number of organizations assigned to a patent: “1” for multiple organizations and “0” for one organization. Multiple organizations assignment suggests that team members are affiliated with different organizations. Geographical proximity was computed using the “distance” function from the Geopy Python library. The function computes the pairwise geographic distances based on the latitude and longitude of team members’ addresses, which we then averaged across the team to represent the geographic distances between team members.
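To make the averaging of pairwise geographic distances concrete, the sketch below swaps in the haversine great-circle formula written in plain Python instead of calling Geopy, so that it has no external dependencies. Geopy’s own distance calculation may give slightly different values, and the function names and coordinates here are hypothetical.

```python
import math
from itertools import combinations
from statistics import fmean

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_pairwise_distance_km(locations):
    """Average distance over all pairs of team member locations."""
    return fmean([haversine_km(p, q) for p, q in combinations(locations, 2)])


# Three hypothetical team members located on the equator.
team_locations = [(0.0, 0.0), (0.0, 90.0), (0.0, 180.0)]
team_distance = mean_pairwise_distance_km(team_locations)
```

For these three points the pairwise distances are a quarter, a half, and a quarter of the Earth’s circumference, so the team average is one third of the circumference.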

3.4.4 Network-level diversity measures

We also built team diversity measures starting from the co-inventor network. First, the repeated incumbents measure represents the extent to which team members have worked together previously. To calculate it, we compute the ratio of members who have previously collaborated on a team with at least one other current member (Guimera et al., 2005). Second, the external collaborators measure represents the “outreach” of the members by computing the total number of unique collaborators of team members from prior inventions. We then used the natural logarithm to scale it, because the distribution of the raw count is skewed and the results would otherwise be heavily influenced by a few inventors who have an extremely high number of collaborators.
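The two network-level measures can be sketched from a mapping of each inventor to the set of people they co-invented with before the focal patent. Two details below are assumptions for illustration: current team members are excluded from the external collaborator count, and `math.log1p` is used so that a count of zero is defined. The function names and the toy data are hypothetical.

```python
import math


def repeated_incumbents_ratio(team, prior_collaborators):
    """Share of members who previously worked with at least one other member."""
    team_set = set(team)
    repeated = sum(
        1 for member in team
        if prior_collaborators.get(member, set()) & (team_set - {member})
    )
    return repeated / len(team)


def log_external_collaborators(team, prior_collaborators):
    """Log-scaled number of unique prior collaborators outside the team."""
    team_set = set(team)
    external = set()
    for member in team:
        external |= prior_collaborators.get(member, set()) - team_set
    # Assumption: log(1 + n) keeps teams with no outside collaborators defined.
    return math.log1p(len(external))


# Toy prior co-inventor sets.
prior_collaborators = {
    "ana": {"ben", "dee"},
    "ben": {"ana", "eli"},
    "cal": {"dee"},
}
team = ["ana", "ben", "cal"]
ratio = repeated_incumbents_ratio(team, prior_collaborators)
external_log = log_external_collaborators(team, prior_collaborators)
```

In this toy team, two of the three members have worked together before, and the team reaches two unique outside collaborators.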

3.5 Analytical approach

To study the level of team expertise diversity and its trend over time and across team sizes, we built null models to set a baseline. Starting from the observed team patents, we generated 2.8 million random teams with similar attributes. Specifically, we randomly rewired the ties within and between teams while holding constant the number of teams in a year, the number of inventors, the number of patents, the distribution of inventors per patent, and the distribution of the number of patents per inventor (Lungeanu et al., 2018). Therefore, the null model preserves the same number of teams and team sizes as our observed sample.
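One common way to rewire while preserving both distributions is a swap on the inventor-patent graph: two patents exchange one inventor each, which keeps every team size and every inventor’s patent count unchanged. The sketch below shows that idea under stated assumptions; it is not necessarily the procedure used in Lungeanu et al. (2018), and the function names and toy teams are hypothetical. To hold the number of teams per year constant, one would apply it separately to the patents of each year.

```python
import random
from collections import Counter


def rewire_teams(teams, n_swaps, seed=0):
    """Swap inventors between patents, preserving team sizes and patent counts."""
    rng = random.Random(seed)
    rewired = {patent: set(members) for patent, members in teams.items()}
    patents = sorted(rewired)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        p1, p2 = rng.sample(patents, 2)
        a = rng.choice(sorted(rewired[p1]))
        b = rng.choice(sorted(rewired[p2]))
        # Skip swaps that would list the same inventor twice on one patent.
        if a in rewired[p2] or b in rewired[p1]:
            continue
        rewired[p1].remove(a)
        rewired[p1].add(b)
        rewired[p2].remove(b)
        rewired[p2].add(a)
        done += 1
    return rewired


def patents_per_inventor(teams):
    """Count how many patents each inventor appears on."""
    return Counter(inv for members in teams.values() for inv in members)


observed = {"P1": {"a", "b"}, "P2": {"c", "d", "e"}, "P3": {"a", "f"}}
null_teams = rewire_teams(observed, n_swaps=10, seed=42)
```

Because each accepted swap removes one inventor from a patent and adds another, the simulated teams differ in membership but not in size, and no inventor gains or loses patents.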

For each of the 2.8 million simulated teams, we used our newly developed measure to calculate their expertise diversity. Next, we used ordinary least squares (OLS) regression to compare the team expertise diversity generated from the simulated teams with the observed teams. To test whether team expertise diversity varies over time, we used the interaction term between year and model type (i.e., observed versus null). A significant interaction term indicates that team expertise diversity over time is different in the observed model compared to the null model. Similarly, to test whether team expertise diversity varies across team sizes we used the interaction term between team size and model type.
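The time-trend comparison can be written as a regression with an interaction term. The notation below is an illustrative assumption rather than the authors’ exact specification: $\text{Observed}_i$ is a hypothetical dummy variable equal to 1 for observed teams and 0 for simulated teams, and any additional controls are omitted.

```latex
\text{Diversity}_i = \beta_0 + \beta_1\,\text{Year}_i + \beta_2\,\text{Observed}_i
  + \beta_3\,(\text{Year}_i \times \text{Observed}_i) + \varepsilon_i
```

Under this form, $\beta_1$ is the slope over time for simulated teams and $\beta_1 + \beta_3$ is the slope for observed teams, so a significant $\beta_3$ indicates that the two trends differ. The team size comparison has the same structure, with team size in place of year.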

Next, we used OLS regression to investigate two relationships: that between known constructs of diversity and coherence and team expertise diversity, and that between team expertise diversity and team outcomes. In our analyses, we controlled for the patent year and CPC section (i.e., technical area).

4. Results

4.1 Descriptive results

Table 2 provides descriptive statistics about the team expertise diversity metric. For the 2,781,797 patent teams, the team expertise diversity metric (M = 0.21, SD = 0.12) ranges from 0 to 0.571. Most of the patents (N = 2,290,874) were issued between 1997 and 2018, and their average team expertise diversity was 0.214. Compared to the average team expertise diversity (M = 0.196, SD = 0.124) of patent teams between 1976 and 1996, this shows an increase over time. Across team sizes, the average team expertise diversity is 0.209 for the 2,731,419 teams with eight or fewer members, and 0.297 for the 50,378 teams with more than eight members. This shows a slow increase in expertise diversity as team size grows.

Table 2. Team expertise diversity: descriptive statistics

Figure 1. Team expertise diversity over time.

Table 3. Effect of time on team inter-personal diversity: observed vs. simulated

All variables, except the constant, report standardized beta coefficients; standard errors in parentheses; two-tail model; *p < 0.05, **p < 0.01, ***p < 0.001.

Figure 2. Team expertise diversity over time: observed model vs. null model.

4.2 Team expertise diversity: Trends over time and across team sizes

Our first set of analyses examines the trends in team expertise diversity across years and team sizes. RQ1(a) asked how team expertise diversity varies over time. Figure 1 presents the trend of expertise diversity over time. Overall, the expertise of team members became more diverse from 1976 until 1996. Since 1996 team expertise diversity has remained relatively constant with a slight increase. However, starting in 2016 team expertise diversity began to decline. To test whether the change in team expertise diversity over time is statistically significant, we compared the observed team expertise with the null-modeled team expertise. Generally, as the null models rewire the network from the complete set of inventors, it contributes to a high variation of patenting history between members of the simulated teams. In other words, simulated teams are always more diverse than observed teams in terms of expertise because the simulation joins inventors at random. However, the change in team expertise diversity over time is different in the observed model compared to the null model. Table 3 and Figure 2 present the comparison of the change in team expertise diversity over years between observed and simulated teams. Table 3 presents the effect of years on team expertise diversity split into the two time periods that we noted above exhibit quite different trends: 1976–1995 and 1996–2018. Model 1 shows that team expertise diversity increases until 1995 (β = 0.061, p < 0.001) while model 3 shows that team expertise diversity decreases after 1995 (β = −0.038, p < 0.001). Model 2 and model 4 contain the interaction term between year and model type (i.e., observed vs null). The interaction terms are positive and significant showing that the difference in slopes between the two models is significant (model 2: β = 33.239, p < 0.001; model 4: β = 12.922, p < 0.001).

Figure 3. Team expertise diversity across team sizes.

Figure 4. Team expertise diversity across team sizes: observed model vs. null model.

RQ1(b) asked how team expertise diversity varies as team size grows. Figure 3 shows the trend of team expertise diversity across team sizes. As expected, team expertise diversity increases as the size of the team increases. However, the marginal increase in expertise diversity is minimal as teams grow beyond eight members. To test the significance of this effect, Figure 4 further presents the effect of an increase in team size on team expertise diversity in the observed versus the simulated teams. The results of interaction effects between team size and expertise diversity show that beyond the team size of eight, the effect of adding a new member on team expertise diversity no longer exists. Table 4 presents the effect of team sizes on team expertise diversity across two team size groups: team size smaller than or equal to eight and team size larger than eight. Model 1 shows that team expertise diversity increases when the team size is smaller than or equal to eight (β = 0.275, p < 0.001), and model 3 also shows that team expertise diversity increases when the team size is larger than eight (β = 0.060, p < 0.001). Model 2 and model 4 contain the interaction term between team size and model type (i.e., observed vs null). The interaction term is positive and significant in model 2 (β = 0.071, p < 0.001) but is significantly negative (β = −0.138, p < 0.001) in model 4 where the team size is larger than eight, showing that the difference in slopes between the two models is significant.

Table 4. Effect of team size on team inter-personal diversity: observed vs. simulated

All variables, except the constant, report standardized beta coefficients; standard errors in parentheses; two-tail model; *p < 0.05, **p < 0.01, ***p < 0.001.

4.3 Team diversity in patent networks: Team expertise diversity vs known constructs

Our second set of analyses examines how the team expertise diversity metric correlates with established measures of team diversity and coherence—such as surface-level, structural-level, or network-level diversity measures (RQ2a) and whether we can predict team expertise diversity by examining known measures of team diversity and coherence (RQ2b).

Table 5 presents the correlation matrix between the team expertise diversity construct and prior diversity and coherence measures. Among all the known measures, the ratio of repeated incumbents (M = 0.71, SD = 0.40) is the most negatively correlated (ρ = −0.537) with team expertise diversity, and team size (M = 3.48, SD = 1.71) is the most positively correlated (ρ = 0.305) with it. Table 6 presents the results of linear regression with technical area fixed effects and team expertise diversity as the dependent variable. Model 1 (R² = 0.104) includes team size, year, and technical area fixed effects. Models 2, 3, and 4 add to model 1 the surface-level, structural-level, and network-level diversity constructs, respectively. Finally, model 5 contains all diversity constructs. Consistent with the correlation results, model 4 (R² = 0.404), which contains the network-level constructs, shows the largest improvement in fit compared to model 1 (ΔR² = 0.334). To test for a curvilinear relationship of time and team size with expertise diversity, model 6 adds quadratic terms of year and team size. The results show a negative coefficient for the quadratic team size term (β = −0.2415, p < 0.001). This implies a curvilinear, non-monotonic relationship between team size and expertise diversity: expertise diversity increases at first as team size increases, but then decreases for large teams. Likewise, the results indicate a similar curvilinear relationship between year and expertise diversity (β = −0.1458, p < 0.001). This suggests that expertise diversity increases at first as time progresses, but then decreases in more recent years.

Table 5. Correlation between team expertise diversity construct and known diversity constructs

N = 2,781,797.

Table 6. OLS regression predicting team expertise diversity

All variables, except the constant, report standardized beta coefficients; standard errors in parentheses; two-tail model; *p < 0.05, **p < 0.01, ***p < 0.001.

In sum, the correlation and regression results show that team size, team experience diversity in terms of patenting years, number of external collaborators, gender diversity, geographical distance, and affiliation with multiple organizations are all positively correlated with team expertise diversity. The repeated incumbent ratio and team experience diversity in terms of the number of prior patents are negatively correlated with team expertise diversity.

4.4 Team expertise diversity: Effect on teams’ output

Our final set of analyses examines how team expertise diversity is related to the team’s output. Table 7 contains the descriptive statistics and the correlation matrix between team expertise diversity, the known diversity measures, and output-based measures—patent impact (M = 1.61, SD = 1.08) and patent atypicality (M = 0.94, SD = 0.09) in around 1.5 million patenting teams.

Table 7. Team outcome: descriptive statistics

N = 1,508,238; ^aN = 809,985 (Patent impact measure is computed for patents published between 1976 and 2010).

Table 8. Team expertise diversity predicting team output’s atypicality

All variables, except the constant, report standardized beta coefficients; standard errors in parentheses; two-tail model; *p < 0.05, **p < 0.01, ***p < 0.001.

4.4.1 Patent atypicality

RQ3(a) asks how team expertise diversity relates to the atypicality of a team’s output. Table 8 presents the OLS regression models of team expertise diversity as a predictor of patents’ atypicality. Models 1–5 report the individual effects of known diversity constructs and team expertise diversity on patent atypicality, and model 6 reports the full model. As shown in model 6, team expertise diversity positively predicts the atypicality of the patent (β = 0.0215, p < 0.001). Model 7 adds interaction terms between expertise diversity and the team size squared and year squared. Notably, the results indicate that expertise diversity is moderated by team size squared (β = −0.0282, p < 0.001) when predicting patent atypicality. The linear effect for expertise diversity predicting patent atypicality is significantly more positive for larger teams than for smaller teams. For teams with either low or high expertise diversity, the simple slopes of the regression curves have positive values for team size. The results also indicate that expertise diversity is moderated by year squared (β = 0.1632, p < 0.001) when predicting patent atypicality. When teams have low expertise diversity, the simple slope of the regression curve has a positive value for earlier patents and a negative value for more recent patents. When teams have high expertise diversity, the simple slope of the regression curve predicting patent atypicality has a negative value for earlier patents and a positive value for more recent patents.

4.4.2 Patent impact

RQ3(b) asks how team expertise diversity relates to the impact of a team’s output. Table 9 presents the OLS regression models including team expertise diversity as a predictor of patent forward citations. Models 1–6 report the individual effects of known diversity constructs and team expertise diversity on patents’ forward citations, and model 7 reports the full model. As shown in model 7, the team expertise diversity measure positively predicts patents’ impact (β = 0.0359, p < 0.001). Model 8 adds interaction terms between expertise diversity and the quadratic terms of team size and year. Notably, the results indicate that expertise diversity is moderated by team size squared (β = −0.0653, p < 0.001) when predicting patent impact. As shown in Figure 5(a), when teams have low expertise diversity, the simple slope of the regression curve has a positive value for small team size and a negative value for large team size. As shown in Figure 5(b), when teams have high expertise diversity, the simple slope of the regression curve predicting patent impact has a positive value for both small and large teams. The results also indicate that expertise diversity is moderated by year squared (β = −0.2207, p < 0.001) when predicting patent impact. The regression curves follow the same pattern for teams with low and high expertise diversity: the slopes of the regression curves predicting patent impact have positive values for earlier patents and negative values for more recent patents.

Figure 5. Interaction effect between team expertise diversity and team size squared predicting team output’s impact. (a) Team expertise diversity = M − 1SD. (b) Team expertise diversity = M + 1SD.

Table 9. Team expertise diversity predicting team output’s impact

All variables, except the constant, report standardized beta coefficients; standard errors in parentheses; two-tail model; *p < 0.05, **p < 0.01, ***p < 0.001.

5. Discussion

We began this study by noting the large and still growing body of research linking team diversity and innovation and the ever-expanding definitions of expertise and the wide variety of proxies used to capture the expertise diversity construct. The relevance of establishing an accurate method and measure of team expertise diversity is highlighted by both the important relationship that team diversity has with scientific progress and the proliferation of literature relying on diversity proxies. This study leverages an innovative text analytic approach to measure team expertise, capturing the breadth and depth of inventors’ prior knowledge. Using this metric, we analyze 40 years of patent data to discover trends in team composition over time. We make three contributions to the literature at the intersection of social networks, diversity research, and innovation studies.

Our first contribution is a methodological one: we create measures of expertise and diversity that provide insight into not only which research areas individual team members have expertise in but also the degree to which those areas of expertise are similar or dissimilar. In this way, we contribute to research on expertise diversity that suffers from two methodological limitations. Specifically, proxies for expertise diversity offer only crude approximations of the actual knowledge heterogeneity within teams and are unable to capture the nuances of individual expertise as related to the degree of similarity or dissimilarity of knowledge among team members.

Using this measure, we identify patterns in scientific team expertise by using researchers’ collaboration networks and the text describing their patented inventions as found in more than 6 million patents granted in the USA since 1976. This allows us to estimate team members’ particular areas of expertise and how they relate to those of other team members. We organized our inquiry with three research questions guiding our theoretical and empirical analysis. Our answers to these questions, both individually and in combination, highlight the relevance of this methodological contribution to research that sits at the intersection of team diversity, networks, and innovation. Specifically, we set out to examine (a) the evolution of teams with reference to expertise diversity since 1976, (b) the extent to which the new measure is correlated with established proxies of team diversity, and (c) the extent to which team expertise diversity, operationalized using our novel measure, predicts team innovation and impact.

Our second contribution is to observe how team expertise diversity evolves over time and varies by team size. This is important because the complexity of science necessary to achieve innovation has increased over time (at least since our first observation in 1976), potentially requiring a higher mixture of specialized expertise. We show that team expertise diversity steadily increased between 1976 and 1996, and that it has subsequently remained relatively constant thereafter. This finding is congruent with a belief that innovation is increasingly the province of teams due to the concurrent increase in both problem complexity and knowledge specialization. Effectively tackling these complex problems requires teams of diverse experts.

We also found decreasing gains in marginal expertise diversity as team members are added, and that, on average, expertise diversity plateaus at about eight team members. That is to say, as teams grow in size each new member is likely to add some new dimension of expertise diversity until the team has about eight members at which point new team members’ backgrounds are almost entirely duplicative of other team members. This could be because those inventions that require such large teams are in relatively narrow technical fields that do not benefit as much from expertise diversity, or alternately it could be related to the coordination challenges faced by particularly large and diverse teams.

Our third contribution refers to the extent to which our novel measure of team expertise diversity is a reliable predictor of team innovation, specifically innovation atypicality (i.e., atypical knowledge combination) and success (i.e., citation rates). We first examined the extent to which our measure correlates with known proxies that reflect both oppositional and compatible characteristics of diversity and the extent to which these proxies are sound predictors of team expertise diversity. We found that team expertise diversity is most strongly correlated with network-level diversity measures, an expected finding given that the measure is affected by team members’ prior collaborations. Although the new measure correlates with network-based diversity measures, it is a stronger predictor of both the atypicality and the impact of the team output. One reason could be that this new measure is based not only on the prior relations but also on the actual output produced during those collaborations and thus is better at capturing team members’ expertise and identifying actual expertise diversity.

5.1 Limitations and directions for future research

Our study develops a new metric to measure team expertise diversity and presents results that indicate its relationship with other measures of diversity and coherence. However, there are some limitations that need to be acknowledged. First, we only examined teams who have invented utility patents granted by the USPTO. As previous research (Chan et al., 2020; Singh & Fleming, 2010) shows, teams are more likely than solo inventors to create impactful innovations in utility patents, but this advantage disappears for design patents. This suggests that teams producing different types of outputs may have varying internal dynamics that further affect the ideation process. Future research could use a similar approach to analyze other scientific outputs (e.g., design patents, journal articles) to help us better understand the variation between teams in different domains. Future work could also use similar methods to analyze how single inventors change their areas of expertise over time, and how those changes relate to success and scientific productivity.

A second limitation is that the metric developed in this study does not capture the expertise of inventors without prior inventions. This is because our proposed measure uses each inventor’s previous inventions to estimate his or her area of expertise. When an inventor makes his or her first invention, there is no history to draw from and thus we are unable to compare their previous inventing history with those of their fellow team members.

Another limitation arises from the lack of detail patents convey about the relative contributions of each team member. Our method makes a simplifying assumption that one can estimate a researcher’s prior expertise by treating all their prior works (collaborative or not) as providing equal information about their expertise. There is a limitation here, in that we cannot know with certainty what they contributed to joint works.

Using the maximum pairwise distance between team members to represent team expertise diversity is another limitation of our method. This approach only captures the most diverse dyad in the team. For example, suppose team A has members with expertise in agriculture, electricity, and chemistry, while team B has two members in agriculture and one in electricity. Our metric could indicate that the two teams have the same value for team expertise diversity because the most diverse dyad in each team spans an equally large distance. Future research should explore how different operationalizations of this diversity measure affect its accuracy.
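The limitation can be seen in a toy calculation. The sketch below places agriculture, electricity, and chemistry on three orthogonal axes, an idealized assumption, and compares the maximum pairwise cosine distance with the mean pairwise distance. The mean is shown only as one possible alternative operationalization for contrast, and all names and vectors are hypothetical.

```python
import math
from itertools import combinations
from statistics import fmean


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_distances(vectors):
    """Cosine distances for every pair of team members."""
    return [cosine_distance(u, v) for u, v in combinations(vectors, 2)]


# Idealized, mutually orthogonal expertise areas.
agriculture, electricity, chemistry = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

team_a = pairwise_distances([agriculture, electricity, chemistry])
team_b = pairwise_distances([agriculture, agriculture, electricity])

max_a, max_b = max(team_a), max(team_b)
mean_a, mean_b = fmean(team_a), fmean(team_b)
```

Both teams receive the same maximum distance, while the mean pairwise distance is lower for team B because two of its members share an expertise area.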

A further limitation concerns the team output measures. Here, we explored the relationship between expertise diversity and the degree to which a patent is classified into rarely combined technical areas and how frequently it is cited by future patents. Future work could expand upon this both by exploring other output factors—such as measures of diffusion and impact incubation periods—and by examining whether there are optimal levels of expertise diversity to maximize research impact.

Finally, we did not incorporate intrapersonal expertise diversity into our analysis. Studies have shown that knowledge integration takes place both between and within individuals (Miller & Mansilla, 2004; Whalen, 2018). In other words, individuals are not simply experts in a single narrowly focused area but rather have varying degrees of knowledge from different domains. In our measure, we used the centroid of prior inventions’ vectors to represent inventors’ expertise. This takes into account all of an inventor’s prior patents but does not account for the variance among them. It would be valuable for further research to include intrapersonal expertise diversity as an independent metric and examine how it affects expertise diversity at the team level.

6. Conclusion

Innovation in science and technology requires teams to tackle complex problems with diverse sets of knowledge. Using an innovative text analytic approach, we created a measure of expertise that captures the substantive focus of inventors’ prior knowledge. Applying this measure at the team level, we constructed a measure of team expertise diversity that provides insight into not only which research areas individual team members have expertise in but also the degree to which those areas of expertise are similar or dissimilar to one another. We reveal that team expertise diversity correlates to varying degrees with many alternate diversity constructs, and that it is a reliable predictor of team innovation atypicality (i.e., atypical knowledge combination) and its success (i.e., citation rates). These methods and findings contribute to research on innovation, social networks, and expertise diversity.

Funding

This work was supported by the National Science Foundation under award #1856090 and by the National Institutes of Health under award #R01GM1374100. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the National Institutes of Health.

Competing interests

None.

Footnotes

Guest Editor (Special Issue on Scientific Networks): Dmitry Zaytsev

1 We use the terms “expertise” and “knowledge” interchangeably.

2 Full details on how the data were collected and processed are available in Whalen et al. (2020), which also provides data sharing details.

3 We use the Gensim Python library, training the model over 12 epochs, ignoring character case, and using default word downsampling.

4 This method requires inventors to have some prior patenting experience in order to estimate their area of expertise. Because of this, the first time an inventor is listed on a patent they will not be included in the team’s expertise diversity calculations.

5 Because this measure requires multiple CPC subgroup classifications, it is undefined for any patents with a single subgroup classification. Thus, some of the analyses below use the subset of the patent data for which NNOS is defined.

References

Aksnes, D. W. (2006). Citation rates and perceptions of scientific contribution. Journal of the American Society for Information Science and Technology, 57(2), 169–185.

Bell, S. T. (2007). Deep-level composition variables as predictors of team performance: A meta-analysis. Journal of Applied Psychology, 92(3), 595–615.

Bell, S. T., Villado, A. J., Lukasik, M. A., Belau, L., & Briggs, A. L. (2011). Getting specific about demographic diversity variable and team performance relationships: A meta-analysis. Journal of Management, 37(3), 709–743.

Biscaro, C., & Giupponi, C. (2014). Co-authorship and bibliographic coupling network effects on citations. PLoS One, 9(6), e99502. doi: 10.1371/journal.pone.0099502.

Blau, P. M. (1977). Inequality and heterogeneity: A primitive theory of social structure (Vol. 7). New York: Free Press.

Börner, K., Contractor, N., Falk-Krzesinski, H. J., Fiore, S. M., Hall, K. L., Keyton, J., … Uzzi, B. (2010). A multi-level systems perspective for the science of team science. Science Translational Medicine, 2(49), 49cm24. doi: 10.1126/scitranslmed.3001399.

Bruns, H. C. (2013). Working alone together: Coordination in collaboration across domains of expertise. Academy of Management Journal, 56(1), 62–83.

Byrne, D., & Griffitt, W. (1973). Interpersonal attraction. Annual Review of Psychology, 24(1), 317–336. doi: 10.1146/annurev.ps.24.020173.001533.

Cattani, G., Ferriani, S., Mariani, M. M., & Mengoli, S. (2013). Tackling the “Galácticos” effect: Team familiarity and the performance of star-studded projects. Industrial and Corporate Change, 22(6), 1629–1662.

Chan, T. H., Mihm, J., & Sosa, M. (2020). Revisiting the role of collaboration in creating breakthrough inventions. Manufacturing & Service Operations Management, 23(5), 1005–1331.

Coleman, J. S. (1988). Social capital in the creation of human capital. American Journal of Sociology, 94, S95–S120. doi: 10.1086/228943.

Cox, T. H., & Blake, S. (1991). Managing cultural diversity: Implications for organizational competitiveness. Academy of Management Perspectives, 5(3), 45–56. doi: 10.5465/ame.1991.4274465.

Cummings, J. N. (2004). Work groups, structural diversity, and knowledge sharing in a global organization. Management Science, 50(3), 352–364.

Cummings, J. N., & Kiesler, S. (2005). Collaborative research across disciplinary and organizational boundaries. Social Studies of Science, 35(5), 703–722.

Cummings, J. N., & Kiesler, S. (2007). Coordination costs and project outcomes in multi-university collaborations. Research Policy, 36(10), 1620–1634.

Cummings, J. N., & Kiesler, S. (2008). Who collaborates successfully? Prior experience reduces collaboration barriers in distributed interdisciplinary research. In Proceedings of the 2008 ACM conference on Computer supported cooperative work.

Cummings, J. N., Kiesler, S., Bosagh Zadeh, R., & Balakrishnan, A. D. (2013). Group heterogeneity increases the risks of large group size: A longitudinal study of productivity in research groups. Psychological Science, 24(6), 880–890.

DeChurch, L. A., & Mesmer-Magnus, J. R. (2010). The cognitive underpinnings of effective teamwork: A meta-analysis. Journal of Applied Psychology, 95(1), 32–53.

Ebadi, A., & Schiffauerova, A. (2015). How to receive more funding for your research? Get connected to the right people! PLoS One, 10(7), e0133061.

Espinosa, J. A., Slaughter, S. A., Kraut, R. E., & Herbsleb, J. D. (2007). Familiarity, complexity, and team performance in geographically distributed software development. Organization Science, 18(4), 613–630.

Faraj, S., & Sproull, L. (2000). Coordinating expertise in software development teams. Management Science, 46(12), 1554–1568.

Finholt, T. A., & Olson, G. M. (1997). From laboratories to collaboratories: A new organizational form for scientific collaboration. Psychological Science, 8(1), 28–36.

Fleming, L. (2001). Recombinant uncertainty in technological search. Management Science, 47(1), 117–132.

Gilson, L. L., Mathieu, J. E., Shalley, C. E., & Ruddy, T. M. (2005). Creativity and standardization: Complementary or conflicting drivers of team effectiveness? Academy of Management Journal, 48(3), 521–531.

Guimera, R., Uzzi, B., Spiro, J., & Amaral, L. A. N. (2005). Team assembly mechanisms determine collaboration network structure and team performance. Science, 308(5722), 697–702.

Hall, K. L., Vogel, A. L., Huang, G. C., Serrano, K. J., Rice, E. L., Tsakraklides, S. P., & Fiore, S. M. (2018). The science of team science: A review of the empirical evidence and research gaps on collaboration in science. American Psychologist, 73(4), 532–548.

Hansen, M. T. (1999). The search-transfer problem: The role of weak ties in sharing knowledge across organization subunits. Administrative Science Quarterly, 44(1), 82–111.

Harrison, D. A., & Klein, K. J. (2007). What’s the difference? Diversity constructs as separation, variety, or disparity in organizations. Academy of Management Review, 32(4), 1199–1228.

Harrison, D. A., Mohammed, S., McGrath, J. E., Florey, A. T., & Vanderstoep, S. W. (2003). Time matters in team performance: Effects of member familiarity, entrainment, and task discontinuity on speed and quality. Personnel Psychology, 56(3), 633–669.

Harrison, D. A., Price, K. H., Gavin, J. H., & Florey, A. T. (2002). Time, teams, and task performance: Changing effects of surface- and deep-level diversity on group functioning. Academy of Management Journal, 45(5), 1029–1045.

Hinds, P. J., Carley, K. M., Krackhardt, D., & Wholey, D. (2000). Choosing work group members: Balancing similarity, competence, and familiarity. Organizational Behavior and Human Decision Processes, 81(2), 226–251.

Hinnant, C. C., Stvilia, B., Wu, S., Worrall, A., Burnett, G., Burnett, K., … Marty, P. F. (2012). Author-team diversity and the impact of scientific publications: Evidence from physics research at a national science lab. Library & Information Science Research, 34(4), 249–257.

Hong, L., & Page, S. E. (2004). Groups of diverse problem solvers can outperform groups of high-ability problem solvers. Proceedings of the National Academy of Sciences, 101(46), 16385–16389.

Horwitz, S. K. (2005). The compositional impact of team diversity on performance: Theoretical considerations. Human Resource Development Review, 4(2), 219–245.

Horwitz, S. K., & Horwitz, I. B. (2007). The effects of team diversity on team outcomes: A meta-analytic review of team demography. Journal of Management, 33(6), 987–1015.

Huckman, R. S., Staats, B. R., & Upton, D. M. (2009). Team familiarity, role experience, and performance: Evidence from Indian software services. Management Science, 55(1), 85–100.

Jehn, K. A., Northcraft, G. B., & Neale, M. A. (1999). Why differences make a difference: A field study of diversity, conflict and performance in workgroups. Administrative Science Quarterly, 44(4), 741–763.

Jones, B. F. (2009). The burden of knowledge and the “death of the renaissance man”: Is innovation getting harder? The Review of Economic Studies, 76(1), 283–317.

Joshi, A., & Roh, H. (2009). The role of context in work team diversity research: A meta-analytic review. Academy of Management Journal, 52(3), 599–627.

Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. In International conference on machine learning.

Leahey, E., Beckman, C. M., & Stanko, T. L. (2017). Prominent but less productive: The impact of interdisciplinarity on scientists’ research. Administrative Science Quarterly, 62(1), 105–139.

Lee, Y.-N., Walsh, J. P., & Wang, J. (2015). Creativity in scientific teams: Unpacking novelty and impact. Research Policy, 44(3), 684–697.

Littlepage, G., Robison, W., & Reddington, K. (1997). Effects of task experience and group experience on group performance, member ability, and recognition of expertise. Organizational Behavior and Human Decision Processes, 69(2), 133–147.

Lungeanu, A., Carter, D. R., DeChurch, L. A., & Contractor, N. S. (2018). How team interlock ecosystems shape the assembly of scientific teams: A hypergraph approach. Communication Methods and Measures, 12(2-3), 174–198.

Lungeanu, A., & Contractor, N. S. (2015). The effects of diversity and network ties on innovations: The emergence of a new scientific field. American Behavioral Scientist, 59(5), 548–564.

Lungeanu, A., Huang, Y., & Contractor, N. S. (2014). Understanding the assembly of interdisciplinary teams and its impact on performance. Journal of Informetrics, 8(1), 59–70.

McCorcle, M. D. (1982). Critical issues in the functioning of interdisciplinary groups. Small Group Behavior, 13(3), 291–310.

McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415–444.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems.

Miller, M., & Mansilla, V. B. (2004). Thinking across perspectives and disciplines. In Goodwork Project Report Series. Harvard University Cambridge, MA.

Milojević, S. (2014). Principles of scientific research team formation and evolution. Proceedings of the National Academy of Sciences, 111(11), 3984–3989.

Milojević, S. (2015). Quantifying the cognitive extent of science. Journal of Informetrics, 9(4), 962–973.

Milojević, S., Sugimoto, C. R., Yan, E., & Ding, Y. (2011). The cognitive structure of library and information science: Analysis of article title words. Journal of the American Society for Information Science and Technology, 62(10), 1933–1953.

Monge, P. R., Rothman, L. W., Eisenberg, E. M., Miller, K. I., & Kirste, K. K. (1985). The dynamics of organizational proximity. Management Science, 31(9), 1129–1141.

Montoya, R. M., & Horton, R. S. (2013). A meta-analytic investigation of the processes underlying the similarity-attraction effect. Journal of Social and Personal Relationships, 30(1), 64–94.

Mukherjee, S., Huang, Y., Neidhardt, J., Uzzi, B., & Contractor, N. (2019). Prior shared success predicts victory in team competitions. Nature Human Behaviour, 3(1), 74–81.

Mukherjee, S., Romero, D. M., Jones, B., & Uzzi, B. (2017). The nearly universal link between the age of past knowledge and tomorrow’s breakthroughs in science and technology: The hotspot. Science Advances, 3(4), e1601315.

National Academy of Sciences, National Academy of Engineering, & Institute of Medicine (2005). Facilitating interdisciplinary research. Washington, DC: National Academies Press.

O’Reilly, C. A. III, Williams, K. Y., & Barsade, S. (1998). Group demography and innovation: Does diversity help? In Composition. New York: Elsevier.

Pedraza-Fariña, L. G., & Whalen, R. (2020). A network theory of patentability. The University of Chicago Law Review, 87(1), 63–144.

Pelled, L. H. (1996). Demographic diversity, conflict, and work group outcomes: An intervening process theory. Organization Science, 7(6), 615–631.

Pollach, I. (2012). Taming textual data: The contribution of corpus linguistics to computer-aided text analysis. Organizational Research Methods, 15(2), 263–287.

Reagans, R., & McEvily, B. (2003). Network structure and knowledge transfer: The effects of cohesion and range. Administrative Science Quarterly, 48(2), 240–267. doi: 10.2307/3556658.

Reagans, R., Zuckerman, E., & McEvily, B. (2004). How to make the team: Social networks vs. demography as criteria for designing effective teams. Administrative Science Quarterly, 49(1), 101–133.

Schilling, M. A., & Green, E. (2011). Recombinant search and breakthrough idea generation: An analysis of high impact papers in the social sciences. Research Policy, 40(10), 1321–1331.

Shi, F., Foster, J. G., & Evans, J. A. (2015). Weaving the fabric of science: Dynamic network models of science’s unfolding structure. Social Networks, 43, 73–85.

Sieweke, J., & Zhao, B. (2015). The impact of team familiarity and team leader experience on team coordination errors: A panel analysis of professional basketball teams. Journal of Organizational Behavior, 36(3), 382–402.

Singh, J., & Fleming, L. (2010). Lone inventors as sources of breakthroughs: Myth or reality? Management Science, 56(1), 41–56.

Stahl, G. K., Maznevski, M. L., Voigt, A., & Jonsen, K. (2010). Unraveling the effects of cultural diversity in teams: A meta-analysis of research on multicultural work groups. Journal of International Business Studies, 41(4), 690–709.

Steiner, I. D. (1972). Group process and productivity. New York: Academic Press.

Taylor, A., & Greve, H. R. (2006). Superman or the fantastic four? Knowledge combination and experience in innovative teams. Academy of Management Journal, 49(4), 723–740.

Uzzi, B., Mukherjee, S., Stringer, M., & Jones, B. (2013). Atypical combinations and scientific impact. Science, 342(6157), 468–472.

Van den Bulte, C., & Moenaert, R. K. (1998). The effects of R&D team co-location on communication patterns among R&D, marketing, and manufacturing. Management Science, 44(11-part-2), S1–S18.

Wagner, C. S., Roessner, J. D., Bobb, K., Klein, J. T., Boyack, K. W., Keyton, J., … Börner, K. (2011). Approaches to understanding and measuring interdisciplinary scientific research (IDR): A review of the literature. Journal of Informetrics, 5(1), 14–26.

Webber, S. S., & Donahue, L. M. (2001). Impact of highly and less job-related diversity on work group cohesion and performance: A meta-analysis. Journal of Management, 27(2), 141–162.

West, M. A., & Anderson, N. R. (1996). Innovation in top management teams. Journal of Applied Psychology, 81(6), 680–693.

Whalen, R. (2018). Boundary spanning innovation and the patent system: Interdisciplinary challenges for a specialized examination system. Research Policy, 47(7), 1334–1343.

Whalen, R., Lungeanu, A., DeChurch, L., & Contractor, N. (2020). Patent similarity data and innovation metrics. Journal of Empirical Legal Studies, 17(3), 615–639.

Wuchty, S., Jones, B. F., & Uzzi, B. (2007). The increasing dominance of teams in production of knowledge. Science, 316(5827), 1036–1039.