Cigarette smoking is the leading cause of preventable death in the United States and has been associated with considerable economic, social, and personal costs. Annually, tobacco use costs the nation an estimated $193 billion, inclusive of lost productivity and direct health care expenditures (Centers for Disease Control and Prevention & Office on Smoking and Health, 2008). Yet, 19% of all US adults, or approximately 43.8 million people, smoke cigarettes (Centers for Disease Control and Prevention, 2012). Of these adult smokers, 70% began smoking regularly by the age of 18 (US Department of Health and Human Services, 1994).
Despite notable declines in cigarette smoking over the past 40 years, smoking behavior among adolescents remains a major public health concern. Every day, about 3,900 children under the age of 18 try their first cigarette. Of these children, an estimated 950 will become new, regular daily smokers (Substance Abuse and Mental Health Services Administration, 2008); approximately half will die as a result of nicotine addiction and other smoking-related causes (Centers for Disease Control and Prevention & Office on Smoking and Health, 2008). Twin studies have suggested that both genetic and environmental factors contribute to smoking behavior. However, many of these twin studies investigated the influences of genes and the environment on cigarette use among adults, so less is known about the genetic and environmental influences on cigarette use in adolescents.
Early twin studies investigating the genetic and environmental influences on smoking behavior in adolescents analyzed various stages of smoking behavior, such as initiation, progression, dependence, and addiction, separately (Koopmans et al., 1997; 1999; Lyons et al., 2008; Madden et al., 1999). These studies found that the initiation of tobacco use in adolescence was primarily explained by shared environmental factors (Koopmans et al., 1997; Slomowski et al., 2005), while genetic factors contributed more to individual differences in other smoking behaviors, such as daily quantity of cigarettes smoked (Koopmans et al., 1997; Kendler et al., 1999) or smoking progression, which has an estimated heritability of 0.80 (Koopmans et al., 1997; Maes et al., 1999; Vink et al., 2005). Furthermore, population-based twin studies provide evidence that genetic influences come to play a larger role in smoking behavior by late adolescence, when the etiological structure of smoking initiation closely resembles that of adult samples (Karp et al., 2005; Kendler et al., 2008).
Among adult samples, heritability estimates for smoking initiation range from 0.32 to 0.78, making it a moderately heritable trait (Broms et al., 2006; Edwards et al., 1995; Heath et al., 1993; 1999; Kendler et al., 1999; Madden et al., 1999; True et al., 1997). On average, estimates are higher in women relative to men (Heath et al., 2002; Li et al., 2003; Madden et al., 1999; Zavos et al., 2012), suggesting that the heritability of smoking initiation may differ by gender. However, this finding has not been replicated across all studies (Kendler et al., 1999).
As a consequence of analyzing smoking behavioral factors separately, we lack information on whether any overlap exists across stages (Broms et al., 2006). Although we know from adult studies that utilize bivariate and trivariate analyses that significant genetic and environmental covariance exists between initiation and dependence (Gillespie et al., 2009; Kendler et al., 1999; Madden et al., 1999; Maes et al., 2004), it remains unclear whether the genetic and environmental factors influencing the relationship between smoking initiation and progression in adulthood are the same across adolescence into early adulthood (Fowler et al., 2007; Koopmans et al., 1999; Zavos et al., 2012). We also do not know if qualitative and quantitative sex differences found in adult samples exist in adolescent samples (Kendler et al., 2005; Zavos et al., 2012).
Thus, this study seeks to answer these questions by examining the relationship between smoking initiation and current quantity smoked from adolescence to early adulthood, determining if qualitative and quantitative sex differences exist in this relationship, and estimating the contributions of genetic and environmental factors to smoking initiation and current quantity smoked in this younger age group.
Materials and Methods
Data were obtained from the Virginia Twin Study of Adolescent Behavioral Development (VTSABD) and its young adult follow-up, Transitions to Substance Abuse (TSA). The VTSABD is a multi-wave, cohort-sequential prospective study of adolescent psychopathology and its risk factors in over 1,400 Caucasian juvenile twin pairs aged 8–17 years and their parents (Meyer et al., 1996); greater detail about the ascertained sample has been provided elsewhere (Hewitt et al., 1997). To be included in the present study, individual twins had to have responded to questions regarding smoking initiation and current quantity smoked. The total sample size of this study was 2,804 twins (including 632 MZ male twins, 829 MZ female twins, 367 DZ male twins, 389 DZ female twins, and 587 DZ opposite-sex twins). Data for the 22–32-year age group (N = 1,074) were obtained from one wave of the TSA, to which all participants of earlier waves of the VTSABD were invited.
Data from each of the five waves of the VTSABD were merged and then re-categorized into age groups (i.e., 12–13 years, 14–15 years, and 16–17 years) to ensure an adequate sample size. However, since there was only one assessment during the age period from 22–32 years, subdividing the TSA sample by age was not warranted. Two main variables of interest were re-coded across each of these age groups: one measuring whether twins had ever smoked at least one whole cigarette and another measuring the current quantity of cigarettes smoked daily. The ‘ever smoke’ variable was binary, coded as 0 for those who had never smoked at least one whole cigarette and 1 for those who had indicated that they had ever smoked at least one whole cigarette. If respondents indicated that they had ‘ever smoked’ in a given age group (e.g., 14–15 years), they were given a value of 1 for ‘ever smoke’ in that age group and every subsequent age group (e.g., 14–15 years, 16–17 years, and 22–32 years). Otherwise, if the respondents indicated that they had not ‘ever smoked’ across all age groups, they were given a value of 0 for ‘ever smoke’. To measure current quantity smoked, respondents had to indicate the number of cigarettes smoked daily in the past three months. Free responses were coded into three categories. These categories indicated: zero cigarettes smoked daily (‘non-current smoker’), one–five cigarettes smoked daily (‘current, light smoker’), and five or more cigarettes smoked daily (‘current, heavy smoker’). Only responses where twins indicated that they had smoked before under the ‘ever smoke’ variable were included in the quantity of cigarette use variable. Otherwise, responses for individuals who had indicated that they had never tried cigarettes were coded as missing for the quantity of cigarette use variable.
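The recoding rules described above can be sketched in code. This is a minimal illustration rather than the authors' actual procedure: the function names, the `AGE_GROUPS` list, and the input layout are hypothetical. Two assumptions are made where the text is silent or ambiguous: an age group with no assessment is left missing unless an earlier report of smoking exists, and a response of exactly five cigarettes (which the category definitions place in both the light and heavy groups) is assigned to the heavy category.

```python
# Age groups in chronological order, as described in the text.
AGE_GROUPS = ["12-13", "14-15", "16-17", "22-32"]


def carry_forward_ever_smoke(reports):
    """Code 'ever smoke' per age group, carrying a positive report forward.

    `reports` maps an age group to True/False (hypothetical input layout);
    age groups without an assessment are absent from the dict.
    Returns a dict mapping every age group to 1, 0, or None (missing).
    """
    coded = {}
    ever = False
    for group in AGE_GROUPS:
        answer = reports.get(group)  # None when the twin was not assessed
        if answer:
            ever = True
        if ever:
            coded[group] = 1          # this and every later group is 1
        elif answer is None:
            coded[group] = None       # assumption: unassessed stays missing
        else:
            coded[group] = 0
    return coded


def code_quantity(ever_smoke, cigarettes_per_day):
    """Code current quantity smoked into 0/1/2, or None if not applicable.

    0 = non-current smoker, 1 = current light smoker, 2 = current heavy smoker.
    Twins who never smoked are coded as missing, as in the text.
    """
    if ever_smoke != 1 or cigarettes_per_day is None:
        return None
    if cigarettes_per_day == 0:
        return 0
    if cigarettes_per_day < 5:
        return 1
    return 2  # assumption: exactly five counts as heavy


if __name__ == "__main__":
    print(carry_forward_ever_smoke({"12-13": False, "14-15": True}))
    print(code_quantity(1, 3))
```

Coding never-smokers as missing, rather than as zero, is what makes the quantity variable contingent on initiation in the models that follow.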
Descriptive Statistics
Prevalence estimates for smoking initiation and quantity are reported using percentages.
Genetic Analyses
All data analyses were conducted using the open-source structural equation modeling software OpenMx (Boker et al., 2011; Neale et al., 2003). Due to the inadequate sample size for smoking quantity in 12–13-year-olds, only univariate genetic analysis on smoking initiation was conducted in this age group. Causal-common-contingent (CCC) models were fit, individually, for smoking initiation and smoking quantity across all other age groups (i.e., 14–15 years and 16–17 years in the VTSABD, and 22–32 years in the TSA).
Using the CCC model originally developed by Kendler and colleagues (1999), smoking behavior was conceptualized as a two-stage process incorporating initiation and current quantity smoked. This model was chosen because it allows for estimating the relative magnitude of the contributions of genetic and environmental factors to smoking liability, as well as for testing the strength of the association between initiation and current quantity smoked stages for smoking via a beta pathway between the two stages (Agrawal et al., 2005; Fowler et al., 2007; Kendler et al., 1999; Maes et al., 2004; Neale et al., 2006).
The significance of an estimated beta pathway between the two stages is used to assess whether the two stages are independent or correlated processes. Specifically, if an estimated beta coefficient is found to be not significant, the liabilities for initiation and current quantity smoked are said to be independent of one another, implying that smoking initiation and current quantity smoked have separate genetic and environmental risk factors. Otherwise, if the estimated beta coefficient is significant, the liabilities for smoking initiation and current quantity smoked are said to share genetic and environmental risk factors. In this case, the beta coefficient provides an estimate of the strength of the association between smoking initiation and current quantity smoked. The greater the estimated beta coefficient, the stronger the association between smoking initiation and current quantity smoked (i.e., a beta coefficient of zero suggests that the two stages do not share genetic and environmental risk factors, while a beta coefficient of one suggests that the genetic and environmental risk factors for these two stages are identical). The estimated 95% confidence intervals around the beta coefficient give further information regarding the degree of overlap between the two stages. Again, lower limits approaching zero (or below) support independent liabilities, and upper limits approaching one support identical liabilities.
Using this model also allows for the direct estimation of additive genetic effects (a2), shared/common environmental effects (c2), and unique environmental effects (e2) on both smoking initiation and current quantity smoked. However, since current quantity smoked is modeled conditionally upon smoking initiation, the genetic and environmental influences unique to current quantity smoked are estimated after those on initiation are taken into account. Thus, the proportion of variance in current quantity smoked explained by the respective influences on initiation can be calculated by multiplying them by the squared beta coefficient. The proportion of the variance in liability to current quantity smoked that is explained by genetic factors is the sum of the proportion of variance in initiation explained by genetic factors multiplied by the squared beta parameter and the proportion of variance explained by unique genetic factors contributing only to the current quantity smoked stage, with the same principle applied for environmental factors.
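The arithmetic in this paragraph can be illustrated with a short sketch. It is not the OpenMx model itself, only the variance bookkeeping described above: each source's share of the variance in current quantity smoked is the squared beta times that source's share in initiation, plus the component specific to quantity. The function name and all component values in the example are hypothetical; only the beta of 0.48 is taken from the young adult result reported later in the text.

```python
def quantity_variance_components(init, specific, beta):
    """Combine initiation and quantity-specific variance components.

    `init` and `specific` are dicts with keys 'a2', 'c2', 'e2':
      init     - proportions of variance in liability to initiation
      specific - variance components unique to current quantity smoked
    `beta` is the path from initiation liability to quantity liability.

    Returns, for each source, the proportion of variance in quantity
    liability it explains: beta**2 * init[k] + specific[k], rescaled so
    that the three proportions sum to one.
    """
    combined = {k: beta ** 2 * init[k] + specific[k] for k in ("a2", "c2", "e2")}
    total = sum(combined.values())
    return {k: v / total for k, v in combined.items()}


if __name__ == "__main__":
    # Hypothetical AE estimates, chosen only for illustration.
    init = {"a2": 0.70, "c2": 0.00, "e2": 0.30}
    specific = {"a2": 0.50, "c2": 0.00, "e2": 0.2696}
    beta = 0.48  # squared beta is about 0.23

    for source, share in quantity_variance_components(init, specific, beta).items():
        print(f"{source}: {share:.3f}")
```

With a beta of zero the quantity components reduce to the specific components alone, and as the specific components shrink toward zero the two stages share the same variance decomposition, which mirrors the interpretation of beta given above.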
Nested models were fitted to test specific hypotheses about the nature of association between the two stages of smoking initiation and current quantity smoked. More explicitly, to determine whether qualitative sex differences exist in the relationship between smoking initiation and current quantity smoked, we tested the significance of the genetic and shared environmental correlations between male and female factors. A model constraining the correlation between males and females to one, suggesting that the same factors contribute to male and female smoking behavior, was compared to a model that freely estimated correlations between male and female factors, suggesting that different factors contribute to male and female smoking behavior. This was done separately to test whether the same genes or same environmental factors contribute to the liability of smoking initiation and current quantity smoked in males and females.
Quantitative sex differences were tested for simultaneously to answer the question of whether genetic and environmental factors explain the same proportion of the liability of smoking initiation and current quantity smoked in males and females. To test for quantitative sex differences, a model equating all parameters (i.e., genetic, shared environmental, and unique environmental factors, but not thresholds) for males and females was compared to one allowing for free estimation of parameters for males and females separately. If the model equating parameters between males and females fit the data best, it was concluded that quantitative sex differences did not exist. This process was repeated for each age group.
Following these tests for qualitative and quantitative sex differences, other alternative models were fitted to the data. Specifically, nested models were created to test if there is a direct relationship between smoking initiation and current quantity smoked and whether genetic or common environmental factors could be dropped from initiation and current quantity smoked stages. Where the beta pathway could be dropped from the model without significant loss to goodness-of-fit to the data, it was determined that smoking initiation and current quantity smoked had independent liabilities. Alternatively, when dropping the beta pathway led to significant loss to goodness-of-fit, smoking initiation and current quantity smoked were said to have shared liabilities. Regardless of whether this finding was significant, we moved on to test whether we could drop genetic or shared environmental factors from either initiation or current quantity smoked. Where genetic or environmental factors could not be dropped without significant loss to goodness-of-fit, the factor was said to contribute significantly to the smoking phenotype.
Nested models were compared using likelihood ratio chi-square (LRC) statistics, in which the degrees of freedom equal the difference between the degrees of freedom of the full and nested submodels. The LRC is calculated as the difference between the -2 log likelihood (-2LL) of a comparison model and the -2LL of a reduced nested model (Neyman & Pearson, 1928; Vuong, 1989). Where the LRC comparing the two models is non-significant, the reduced model is selected as the better fitting model. Akaike information criterion (AIC) was also used as an index of model fit, as well as an index of parsimony (Akaike, 1987; Williams, 1994).
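The comparison procedure can be sketched as follows. This is an illustrative implementation using only the Python standard library, not the OpenMx routine used in the study. The function names and the -2LL values in the example are hypothetical. The degrees of freedom are computed here as the difference in the number of estimated parameters (EP), which is equivalent to the difference in model degrees of freedom when both models are fit to the same data. The AIC is shown in one common form, -2LL plus twice the number of parameters; software may report a variant that differs by a constant.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in x.
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


def likelihood_ratio_test(m2ll_full, ep_full, m2ll_reduced, ep_reduced):
    """Return (LRC, df, p) for a reduced model nested in a full model."""
    lrc = m2ll_reduced - m2ll_full
    df = ep_full - ep_reduced
    return lrc, df, chi2_sf(lrc, df)


def aic(m2ll, ep):
    """AIC in the -2LL + 2 * (number of parameters) form."""
    return m2ll + 2 * ep


if __name__ == "__main__":
    # Hypothetical fit statistics: dropping one path from a 9-parameter model.
    lrc, df, p = likelihood_ratio_test(4000.0, 9, 4001.0, 8)
    print(f"LRC = {lrc:.2f}, df = {df}, p = {p:.3f}")
    print("AIC full:", aic(4000.0, 9), "AIC reduced:", aic(4001.0, 8))
```

In this hypothetical case the test is non-significant and the reduced model also has the lower AIC, so the reduced model would be retained under both criteria.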
Smoking Prevalence
At age 12–13 years, 10.4% of the total sample had indicated that they had ever smoked. This increased to 27.4% by age 14–15 years, 46.6% by age 16–17 years, and 79.1% by age 22–32 years. Across all age groups, most respondents indicated that they were not current smokers (i.e., indicated that in the past 3 months they smoked zero cigarettes daily). Although the majority (approximately 71%) of adolescents who tried smoking did not become ‘current, heavy smokers’, the proportion of ‘current, light smokers’ and ‘current, heavy smokers’ did increase consistently from the younger to the older age groups (see Table 1).
TABLE 1 Smoking Initiation and Progression Prevalence of Sample

Qualitative and Quantitative Sex Differences
Genetic analyses indicated that no significant qualitative or quantitative sex differences existed in the contribution of genetic or environmental factors to liability of smoking initiation and current quantity smoked, or in the relationship between smoking initiation and current quantity smoked, for any of the age groups in this sample (see Table 2). More specifically, the same genes and environmental factors contributed to the liability of smoking initiation and current quantity smoked in males and females, and genetic and environmental contributions could be equated across the sexes at ages 14–15, 16–17, and 22–32. (Ages 12–13 were not included in these analyses due to inadequate sample size, and ages 22–32 were combined to ensure adequate sample size for analyses.)
TABLE 2 Model Fit Statistics from CCC Models

Note: EP indicates the number of estimated parameters in the model.
Relationships between Smoking Initiation and Current Quantity Smoked
The relationship between smoking initiation and current quantity smoked could not be assessed for ages 12–13 years, due to inadequate sample size for the smoking quantity variable. Instead, univariate genetic analysis was conducted on the smoking initiation variable. The best fitting model for this age group did not include additive genetic factors, suggesting that common environmental (71.7%; 95% CI: 58.7%, 81.8%) and unique environmental factors (28.2%; 95% CI: 18.2%, 41.2%) best explained the variance in smoking initiation at age 12–13 years.
Across ages 14–15 and 16–17 years, dropping the beta parameter from the CCC model did not result in significantly worse model fit. This implied that smoking initiation and smoking quantity had independent liabilities in these age groups. The best fitting models were not the same across these age groups, however. For age 14–15 years, the best fitting models were an ACE model for smoking initiation and a CE model for current quantity smoked, suggesting that genetic (53.5%) and environmental factors (shared: 28.6%; unique: 17.8%) contributed to smoking initiation, while only environmental factors contributed to current quantity smoked (shared: 84.8%; unique: 15.2%). For age 16–17 years, the best fitting model for smoking initiation was an AE model, while a CE model still fitted the data best for current quantity smoked, suggesting that genetic (84.8%) and unique (15.2%) environmental factors contributed to smoking initiation, while environmental factors (shared: 88.7%; unique: 11.3%) contributed to current quantity smoked (see Figure 1).

FIGURE 1 Best fitting CCC models and variance component estimates.
For ages 22–32 years, the beta parameter between the initiation and current quantity smoked stages was significant, and the best fitting model was an AE model for both initiation and current quantity smoked. This suggested that smoking initiation and current quantity smoked shared liabilities to a moderate extent (β = 0.48) and were no longer independent, as they had been in the earlier age groups. Additionally, genetic and unique environmental factors contributed to both smoking initiation and current quantity smoked, but shared environmental factors no longer exerted a significant impact on liability to smoking. Thus, of the genetic variance in liability to current quantity smoked, approximately 77.3% was specific to current quantity smoked and 23.0% was shared with smoking initiation. In other words, mostly different genetic factors contributed to the liabilities of smoking initiation and current quantity smoked across adolescence, but in young adulthood, there was some overlap between the factors influencing initiation and current quantity smoked.
No qualitative or quantitative differences were found between males and females regarding the genetic and environmental influences on individual differences in smoking initiation and current quantity smoked across adolescence into early adulthood, lending support to similar findings in other studies (Kendler et al., 1999; Koopmans et al., 1999). However, at age 22–32 years, when testing for qualitative sex differences, models constraining the genetic correlation to one, indicating the same genes influence smoking initiation and current quantity smoked in males and females, fitted the data only slightly better than models that allowed for the free estimation of the genetic correlation. Thus, it is possible that qualitative sex differences do exist in later adulthood and that we did not have the power to detect them in the current sample. This might explain why other studies utilizing adult samples have found qualitative sex differences in the genetic and environmental influences on smoking behavior (Heath et al., 1993; Kendler et al., 2005; Madden et al., 1999; Zavos et al., 2012).
Unfortunately, due to sample size constraints, we were unable to determine whether genetic or environmental factors contributed more significantly during the earliest ages of adolescence (ages 12–13 years). However, we did find that different factors contribute to smoking initiation and current quantity smoked across mid-adolescence into early adulthood. More specifically, smoking initiation and current quantity smoked seemed to have independent liabilities until adulthood, when liabilities were shared. Genetic, shared environmental, and unique environmental factors were found to contribute significantly to smoking initiation during early adolescence (i.e., ages 14–15 years), whereas only genetic and unique environmental factors contributed significantly during later adolescence (i.e., ages 16–17 years) and adulthood (i.e., ages 22–32 years). Shared environmental influences may be more important for 14–15-year-olds relative to older age groups because they experience greater limitations on the access to and availability of cigarettes. Although 14–15-year-olds and 16–17-year-olds experience the same legal age restriction on the purchasing of cigarettes, the 14–15-year-olds might still have a harder time gaining access to cigarettes among their peer groups if they have fewer friends who are of the legal age to buy cigarettes.
Additionally, genetic influences were not found to contribute significantly to smoking initiation until later adolescence into adulthood (beginning at age 14–15 years), much in the same way other studies suggest (Kendler et al., 2008; Koopmans et al., 1997; Slomowski et al., 2005). However, contrary to other findings, which found greater genetic influence on heavier/problem substance use, we found that genetic factors do not contribute significantly to the variance in current quantity smoked until young adulthood (i.e., ages 22–32 years). Interestingly, it is also during this time that the liabilities of smoking initiation and quantity smoked are no longer independent of one another, but rather correlated. Again, this might be a function of access to and availability of cigarettes. As access to and availability of cigarettes increase, the expression of genetic predispositions towards increased smoking frequency and potential addiction may also increase, following initiation. Alternatively, a recent estimate of quantity smoked, rather than an estimate from the heaviest period of use, may be less stable and less representative for adolescents than for adults, so our choice of measure could have influenced the estimates of the variance components.
Limitations and Strengths
Like all other studies, the present study has its limitations. Due to low prevalence of smoking behavior among early adolescents in this sample, the power of the current study was limited. This was apparent when we found that only a univariate genetic analysis could be conducted on the smoking initiation variable among 12–13-year olds as there were too many missing values for the current quantity smoked variable and consequently, a CCC model could not be fit. It is also possible that using self-reported data underestimated the prevalence for smoking behaviors, as a result of social desirability bias, which could have also influenced genetic analysis. Furthermore, this study is not generalizable to all populations, as the sample included only Caucasians.
Despite these shortcomings, the present study does include both males and females. It is also one of only a few studies investigating the relationship between smoking initiation and current quantity smoked within an adolescent sample and adds to the literature by investigating this relationship across various age groups. Future studies could include the use of measures related to smoking progression, other than current quantity smoked to investigate their effects on the relationship between smoking initiation and current quantity smoked. It would also be interesting to see if the same relationships are found among other adolescent datasets, using different populations than the one described in the present study and if these relationships are affected by the addition of environmental covariates, such as parental monitoring or peer influences.
This work was supported by the National Institutes of Health's National Center for Advancing Translational Science (E.K.D., award number UL1TR000058) and the National Institutes of Health's National Institute on Drug Abuse (E.K.D., H.H.M., project number 1R01DA025109-01A2: Developmental Genetic Epidemiology of Smoking).